  1. Mar 03, 2025
    • drop_monitor: fix incorrect initialization order · 10347064
      Gavrilov Ilia authored and Frieder Schrempf committed
      
      commit 07b598c0e6f06a0f254c88dafb4ad50f8a8c6eea upstream.
      
      Syzkaller reports the following bug:
      
      BUG: spinlock bad magic on CPU#1, syz-executor.0/7995
       lock: 0xffff88805303f3e0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
      CPU: 1 PID: 7995 Comm: syz-executor.0 Tainted: G            E     5.10.209+ #1
      Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x119/0x179 lib/dump_stack.c:118
       debug_spin_lock_before kernel/locking/spinlock_debug.c:83 [inline]
       do_raw_spin_lock+0x1f6/0x270 kernel/locking/spinlock_debug.c:112
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:117 [inline]
       _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
       reset_per_cpu_data+0xe6/0x240 [drop_monitor]
       net_dm_cmd_trace+0x43d/0x17a0 [drop_monitor]
       genl_family_rcv_msg_doit+0x22f/0x330 net/netlink/genetlink.c:739
       genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
       genl_rcv_msg+0x341/0x5a0 net/netlink/genetlink.c:800
       netlink_rcv_skb+0x14d/0x440 net/netlink/af_netlink.c:2497
       genl_rcv+0x29/0x40 net/netlink/genetlink.c:811
       netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
       netlink_unicast+0x54b/0x800 net/netlink/af_netlink.c:1348
       netlink_sendmsg+0x914/0xe00 net/netlink/af_netlink.c:1916
       sock_sendmsg_nosec net/socket.c:651 [inline]
       __sock_sendmsg+0x157/0x190 net/socket.c:663
       ____sys_sendmsg+0x712/0x870 net/socket.c:2378
       ___sys_sendmsg+0xf8/0x170 net/socket.c:2432
       __sys_sendmsg+0xea/0x1b0 net/socket.c:2461
       do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x62/0xc7
      RIP: 0033:0x7f3f9815aee9
      Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f3f972bf0c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f3f9826d050 RCX: 00007f3f9815aee9
      RDX: 0000000020000000 RSI: 0000000020001300 RDI: 0000000000000007
      RBP: 00007f3f981b63bd R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 000000000000006e R14: 00007f3f9826d050 R15: 00007ffe01ee6768
      
      If drop_monitor is built as a kernel module, syzkaller may have time
      to send a netlink NET_DM_CMD_START message during the module loading.
      This will call the net_dm_monitor_start() function that uses
      a spinlock that has not yet been initialized.
      
      To fix this, let's place resource initialization above the registration
      of a generic netlink family.
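
The ordering problem described above can be illustrated with a small user-space sketch. This is not the drop_monitor code: the `demo_*` names, the magic value, and the simulated "racing request" are all assumptions made for illustration. It only shows why a handler that becomes reachable at registration time must not depend on state that is initialized afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative magic value; stands in for the debug check that reported "bad magic". */
#define DEMO_LOCK_MAGIC 0xdead4eadu

struct demo_lock {
	unsigned int magic;	/* stays 0 until demo_lock_init() runs */
	bool held;
};

struct demo_family {
	int (*start)(void);	/* handler reachable as soon as the family is registered */
};

static struct demo_lock demo_cpu_lock;		/* zero-initialized, like data before init */
static const struct demo_family *registered;	/* NULL until registration */

static void demo_lock_init(struct demo_lock *l)
{
	assert(l != NULL);
	l->magic = DEMO_LOCK_MAGIC;
	l->held = false;
}

/* Returns 0 on success, -1 if the lock was never initialized. */
static int demo_lock_acquire(struct demo_lock *l)
{
	if (l->magic != DEMO_LOCK_MAGIC)
		return -1;
	l->held = true;
	return 0;
}

static void demo_lock_release(struct demo_lock *l)
{
	l->held = false;
}

static int demo_start(void)
{
	if (demo_lock_acquire(&demo_cpu_lock))
		return -1;
	demo_lock_release(&demo_cpu_lock);
	return 0;
}

static const struct demo_family demo_family_ops = { .start = demo_start };

static void demo_register(const struct demo_family *f)
{
	registered = f;
}

/* Simulates a user request; -2 means no family is registered yet. */
static int demo_user_request(void)
{
	return registered ? registered->start() : -2;
}

/* Buggy order: register first, a request sneaks in, then initialize. */
static int demo_init_buggy(void)
{
	int seen;

	demo_cpu_lock.magic = 0;
	registered = NULL;
	demo_register(&demo_family_ops);
	seen = demo_user_request();	/* request arrives before init */
	demo_lock_init(&demo_cpu_lock);
	return seen;
}

/* Fixed order: initialize resources, then make the handler reachable. */
static int demo_init_fixed(void)
{
	demo_cpu_lock.magic = 0;
	registered = NULL;
	demo_lock_init(&demo_cpu_lock);
	demo_register(&demo_family_ops);
	return demo_user_request();
}
```

In this sketch `demo_init_buggy()` reports the uninitialized lock to the early request, while `demo_init_fixed()` cannot expose it because registration is the last step.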
      
      Found by InfoTeCS on behalf of Linux Verification Center
      (linuxtesting.org) with Syzkaller.
      
      Fixes: 9a8afc8d ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol")
      Cc: stable@vger.kernel.org
      Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Link: https://patch.msgid.link/20250213152054.2785669-1-Ilia.Gavrilov@infotecs.ru
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. Oct 02, 2024
    • move asm/unaligned.h to linux/unaligned.h · 5f60d5f6
      Al Viro authored
      asm/unaligned.h is always an include of asm-generic/unaligned.h;
      might as well move that thing to linux/unaligned.h and include
      that - there's nothing arch-specific in that header.
      
      auto-generated by the following:
      
      for i in `git grep -l -w asm/unaligned.h`; do
      	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
      done
      for i in `git grep -l -w asm-generic/unaligned.h`; do
      	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
      done
      git mv include/asm-generic/unaligned.h include/linux/unaligned.h
      git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
      sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
      sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
  3. Jun 19, 2024
  4. Apr 15, 2024
    • drop_monitor: replace spin_lock by raw_spin_lock · f1e197a6
      Wander Lairson Costa authored
      
      trace_drop_common() is called with preemption disabled, and it acquires
      a spin_lock. This is problematic for RT kernels because spin_locks are
      sleeping locks in this configuration, which causes the following splat:
      
      BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
      in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 449, name: rcuc/47
      preempt_count: 1, expected: 0
      RCU nest depth: 2, expected: 2
      5 locks held by rcuc/47/449:
       #0: ff1100086ec30a60 ((softirq_ctrl.lock)){+.+.}-{2:2}, at: __local_bh_disable_ip+0x105/0x210
       #1: ffffffffb394a280 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0xbf/0x130
       #2: ffffffffb394a280 (rcu_read_lock){....}-{1:2}, at: __local_bh_disable_ip+0x11c/0x210
       #3: ffffffffb394a160 (rcu_callback){....}-{0:0}, at: rcu_do_batch+0x360/0xc70
       #4: ff1100086ee07520 (&data->lock){+.+.}-{2:2}, at: trace_drop_common.constprop.0+0xb5/0x290
      irq event stamp: 139909
      hardirqs last  enabled at (139908): [<ffffffffb1df2b33>] _raw_spin_unlock_irqrestore+0x63/0x80
      hardirqs last disabled at (139909): [<ffffffffb19bd03d>] trace_drop_common.constprop.0+0x26d/0x290
      softirqs last  enabled at (139892): [<ffffffffb07a1083>] __local_bh_enable_ip+0x103/0x170
      softirqs last disabled at (139898): [<ffffffffb0909b33>] rcu_cpu_kthread+0x93/0x1f0
      Preemption disabled at:
      [<ffffffffb1de786b>] rt_mutex_slowunlock+0xab/0x2e0
      CPU: 47 PID: 449 Comm: rcuc/47 Not tainted 6.9.0-rc2-rt1+ #7
      Hardware name: Dell Inc. PowerEdge R650/0Y2G81, BIOS 1.6.5 04/15/2022
      Call Trace:
       <TASK>
       dump_stack_lvl+0x8c/0xd0
       dump_stack+0x14/0x20
       __might_resched+0x21e/0x2f0
       rt_spin_lock+0x5e/0x130
       ? trace_drop_common.constprop.0+0xb5/0x290
       ? skb_queue_purge_reason.part.0+0x1bf/0x230
       trace_drop_common.constprop.0+0xb5/0x290
       ? preempt_count_sub+0x1c/0xd0
       ? _raw_spin_unlock_irqrestore+0x4a/0x80
       ? __pfx_trace_drop_common.constprop.0+0x10/0x10
       ? rt_mutex_slowunlock+0x26a/0x2e0
       ? skb_queue_purge_reason.part.0+0x1bf/0x230
       ? __pfx_rt_mutex_slowunlock+0x10/0x10
       ? skb_queue_purge_reason.part.0+0x1bf/0x230
       trace_kfree_skb_hit+0x15/0x20
       trace_kfree_skb+0xe9/0x150
       kfree_skb_reason+0x7b/0x110
       skb_queue_purge_reason.part.0+0x1bf/0x230
       ? __pfx_skb_queue_purge_reason.part.0+0x10/0x10
       ? mark_lock.part.0+0x8a/0x520
      ...
      
      trace_drop_common() also disables interrupts, but this is a minor issue
      because we could easily replace it with a local_lock.
      
      Replace the spin_lock with raw_spin_lock to avoid sleeping in atomic
      context.
      
      Signed-off-by: Wander Lairson Costa <wander@redhat.com>
      Reported-by: Hu Chunyu <chuhu@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. Dec 29, 2023
  6. Dec 07, 2023
    • drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group · e0378187
      Ido Schimmel authored
      
      The "NET_DM" generic netlink family notifies drop locations over the
      "events" multicast group. This is problematic since by default generic
      netlink allows non-root users to listen to these notifications.
      
      Fix by adding a new field to the generic netlink multicast group
      structure that when set prevents non-root users or root without the
      'CAP_SYS_ADMIN' capability (in the user namespace owning the network
      namespace) from joining the group. Set this field for the "events"
      group. Use 'CAP_SYS_ADMIN' rather than 'CAP_NET_ADMIN' because of the
      nature of the information that is shared over this group.
      
      Note that the capability check in this case will always be performed
      against the initial user namespace since the family is not netns aware
      and only operates in the initial network namespace.
      
      A new field is added to the structure rather than using the "flags"
      field because the existing field uses uAPI flags and it is inappropriate
      to add a new uAPI flag for an internal kernel check. In net-next we can
      rework the "flags" field to use internal flags and fold the new field
      into it. But for now, in order to reduce the amount of changes, add a
      new field.
      
      Since the information can only be consumed by root, mark the control
      plane operations that start and stop the tracing as root-only using the
      'GENL_ADMIN_PERM' flag.
      
      Tested using [1].
      
      Before:
      
       # capsh -- -c ./dm_repo
       # capsh --drop=cap_sys_admin -- -c ./dm_repo
      
      After:
      
       # capsh -- -c ./dm_repo
       # capsh --drop=cap_sys_admin -- -c ./dm_repo
       Failed to join "events" multicast group
      
      [1]
       $ cat dm.c
       #include <stdio.h>
       #include <netlink/genl/ctrl.h>
       #include <netlink/genl/genl.h>
       #include <netlink/socket.h>
      
       int main(int argc, char **argv)
       {
       	struct nl_sock *sk;
       	int grp, err;
      
       	sk = nl_socket_alloc();
       	if (!sk) {
       		fprintf(stderr, "Failed to allocate socket\n");
       		return -1;
       	}
      
       	err = genl_connect(sk);
       	if (err) {
       		fprintf(stderr, "Failed to connect socket\n");
       		return err;
       	}
      
       	grp = genl_ctrl_resolve_grp(sk, "NET_DM", "events");
       	if (grp < 0) {
       		fprintf(stderr,
       			"Failed to resolve \"events\" multicast group\n");
       		return grp;
       	}
      
       	err = nl_socket_add_memberships(sk, grp, NFNLGRP_NONE);
       	if (err) {
       		fprintf(stderr, "Failed to join \"events\" multicast group\n");
       		return err;
       	}
      
       	return 0;
       }
       $ gcc -I/usr/include/libnl3 -lnl-3 -lnl-genl-3 -o dm_repo dm.c
      
      Fixes: 9a8afc8d ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol")
      Reported-by: "The UK's National Cyber Security Centre (NCSC)" <security@ncsc.gov.uk>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Link: https://lore.kernel.org/r/20231206213102.1824398-3-idosch@nvidia.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  7. Apr 21, 2023
  8. Nov 07, 2022
    • genetlink: introduce split op representation · 20b0b53a
      Jakub Kicinski authored
      
      We currently have two forms of operations - small ops and "full" ops
      (or just ops). The former does not have pointers for some of the less
      commonly used features (namely dump start/done and policy).
      
      The "full" ops, however, still don't contain all the necessary
      information. In particular the policy is per command ID, while
      do and dump often accept different attributes. It's also not
      possible to define different pre_doit and post_doit callbacks
      for different commands within the family.
      
      At the same time a lot of commands do not support dumping and
      therefore all the dump-related information is wasted space.
      
      Create a new command representation which can hold info about
      a do implementation or a dump implementation, but not both at
      the same time.
      
      Use this new representation on the command execution path
      (genl_family_rcv_msg) as we either run a do or a dump and
      don't have to create a "full" op there.
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. Oct 29, 2022
  10. Aug 29, 2022
    • genetlink: start to validate reserved header bytes · 9c5d03d3
      Jakub Kicinski authored
      
      We had historically not checked that genlmsghdr.reserved
      is 0 on input which prevents us from using those precious
      bytes in the future.
      
      One use case would be to extend the cmd field, which is
      currently just 8 bits wide and 256 is not a lot of commands
      for some core families.
      
      To make sure that new families do the right thing by default
      put the onus of opting out of validation on existing families.
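
The check described above can be sketched in user space as follows. The struct mirrors the cmd/version/reserved layout of the generic netlink header, but `struct demo_family` and its `skip_reserved_check` flag are hypothetical stand-ins for whatever opt-out mechanism a family uses; they are not the kernel's field names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the generic netlink header layout: cmd, version, reserved. */
struct demo_genlmsghdr {
	uint8_t cmd;
	uint8_t version;
	uint16_t reserved;
};

/* Hypothetical per-family opt-out, set only by pre-existing families. */
struct demo_family {
	bool skip_reserved_check;
};

/*
 * Returns 0 if the header is acceptable, -1 if the reserved bytes are
 * non-zero and the family has not opted out of validation.
 */
static int demo_validate_hdr(const struct demo_family *fam,
			     const struct demo_genlmsghdr *hdr)
{
	assert(fam != NULL && hdr != NULL);

	if (!fam->skip_reserved_check && hdr->reserved != 0)
		return -1;
	return 0;
}
```

Making validation the default and the opt-out explicit means a new family gets the strict behaviour without doing anything.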
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel)
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. Aug 23, 2022
  12. Jun 10, 2022
  13. Jun 07, 2022
    • net: skb: use auto-generation to convert skb drop reason to string · ec43908d
      Menglong Dong authored
      
      It is annoying to add new skb drop reasons to 'enum skb_drop_reason'
      and TRACE_SKB_DROP_REASON in trace/event/skb.h, and it's easy to forget
      to add the new reasons we added to TRACE_SKB_DROP_REASON.
      
      TRACE_SKB_DROP_REASON is used to convert drop reason of type number
      to string. For now, the string we passed to user space is exactly the
      same as the name in 'enum skb_drop_reason' with a 'SKB_DROP_REASON_'
      prefix. Therefore, we can use 'auto-generation' to generate these
      drop reasons to string at build time.
      
      The new source 'dropreason_str.c' will be auto generated during build
      time, which contains the string array
      'const char * const drop_reasons[]'.
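
The goal of keeping the enum and the string table in sync from a single list can be sketched with a different technique, an X-macro, instead of a generated source file. This is not how the commit does it (it generates 'dropreason_str.c' at build time); the `DEMO_*` names and the three sample reasons are assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

/* Single list of reasons; both the enum and the strings expand from it. */
#define DEMO_DROP_REASONS(X)	\
	X(NOT_SPECIFIED)	\
	X(NO_SOCKET)		\
	X(PKT_TOO_SMALL)

enum demo_drop_reason {
#define X(name) DEMO_DROP_REASON_##name,
	DEMO_DROP_REASONS(X)
#undef X
	DEMO_DROP_REASON_MAX
};

/* String table: the enum name without its DEMO_DROP_REASON_ prefix. */
static const char * const demo_drop_reasons[] = {
#define X(name) [DEMO_DROP_REASON_##name] = #name,
	DEMO_DROP_REASONS(X)
#undef X
};

/* Compile-time check that no reason is missing a string. */
static_assert(sizeof(demo_drop_reasons) / sizeof(demo_drop_reasons[0]) ==
	      DEMO_DROP_REASON_MAX, "string table out of sync with enum");

static const char *demo_reason_str(unsigned int reason)
{
	if (reason >= DEMO_DROP_REASON_MAX)
		return "UNKNOWN";
	return demo_drop_reasons[reason];
}
```

Adding a reason then means adding one line to the list; the string cannot be forgotten because it is derived from the same entry.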
      
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
  14. May 16, 2022
  15. Feb 23, 2022
    • drop_monitor: remove quadratic behavior · b26ef81c
      Eric Dumazet authored
      
      drop_monitor is using an unique list on which all netdevices in
      the host have an element, regardless of their netns.
      
      This scales poorly, not only at device unregister time (what I
      caught during my netns dismantle stress tests), but also at packet
      processing time whenever trace_napi_poll_hit() is called.
      
      If the intent was to avoid adding one pointer in 'struct net_device'
      then surely we prefer O(1) behavior.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. Feb 11, 2022
    • drop_monitor: fix data-race in dropmon_net_event / trace_napi_poll_hit · dcd54265
      Eric Dumazet authored
      
      trace_napi_poll_hit() is reading stat->dev while another thread can write
      on it from dropmon_net_event()
      
      Use READ_ONCE()/WRITE_ONCE() here, RCU rules are properly enforced already,
      we only have to take care of load/store tearing.
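
A rough user-space analogue of the idea, using C11 relaxed atomics in place of the kernel's READ_ONCE()/WRITE_ONCE() macros. The `demo_*` types and functions are hypothetical; the sketch only shows the pattern of storing the pointer in one untorn write and loading it once into a local before using it. It does not model the RCU lifetime rules the commit says are already in place.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_dev {
	int ifindex;
};

struct demo_stat {
	_Atomic(struct demo_dev *) dev;	/* written by one thread, read by another */
};

static void demo_stat_init(struct demo_stat *stat, struct demo_dev *dev)
{
	assert(stat != NULL);
	atomic_init(&stat->dev, dev);
}

/* Writer side (e.g. on device unregister): clear the pointer in a single store. */
static void demo_stat_clear_dev(struct demo_stat *stat)
{
	atomic_store_explicit(&stat->dev, (struct demo_dev *)NULL,
			      memory_order_relaxed);
}

/*
 * Reader side: load the pointer exactly once and only use the local copy,
 * so the check and the comparison see the same value.
 */
static int demo_stat_matches(struct demo_stat *stat, const struct demo_dev *dev)
{
	struct demo_dev *cur = atomic_load_explicit(&stat->dev,
						    memory_order_relaxed);

	return cur != NULL && cur == dev;
}
```

Relaxed ordering is chosen here because, as in the commit, only load/store tearing is being addressed, not ordering against other memory accesses.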
      
      BUG: KCSAN: data-race in dropmon_net_event / trace_napi_poll_hit
      
      write to 0xffff88816f3ab9c0 of 8 bytes by task 20260 on cpu 1:
       dropmon_net_event+0xb8/0x2b0 net/core/drop_monitor.c:1579
       notifier_call_chain kernel/notifier.c:84 [inline]
       raw_notifier_call_chain+0x53/0xb0 kernel/notifier.c:392
       call_netdevice_notifiers_info net/core/dev.c:1919 [inline]
       call_netdevice_notifiers_extack net/core/dev.c:1931 [inline]
       call_netdevice_notifiers net/core/dev.c:1945 [inline]
       unregister_netdevice_many+0x867/0xfb0 net/core/dev.c:10415
       ip_tunnel_delete_nets+0x24a/0x280 net/ipv4/ip_tunnel.c:1123
       vti_exit_batch_net+0x2a/0x30 net/ipv4/ip_vti.c:515
       ops_exit_list net/core/net_namespace.c:173 [inline]
       cleanup_net+0x4dc/0x8d0 net/core/net_namespace.c:597
       process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
       worker_thread+0x616/0xa70 kernel/workqueue.c:2454
       kthread+0x1bf/0x1e0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30
      
      read to 0xffff88816f3ab9c0 of 8 bytes by interrupt on cpu 0:
       trace_napi_poll_hit+0x89/0x1c0 net/core/drop_monitor.c:292
       trace_napi_poll include/trace/events/napi.h:14 [inline]
       __napi_poll+0x36b/0x3f0 net/core/dev.c:6366
       napi_poll net/core/dev.c:6432 [inline]
       net_rx_action+0x29e/0x650 net/core/dev.c:6519
       __do_softirq+0x158/0x2de kernel/softirq.c:558
       do_softirq+0xb1/0xf0 kernel/softirq.c:459
       __local_bh_enable_ip+0x68/0x70 kernel/softirq.c:383
       __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline]
       _raw_spin_unlock_bh+0x33/0x40 kernel/locking/spinlock.c:210
       spin_unlock_bh include/linux/spinlock.h:394 [inline]
       ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline]
       wg_packet_decrypt_worker+0x73c/0x780 drivers/net/wireguard/receive.c:506
       process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
       worker_thread+0x616/0xa70 kernel/workqueue.c:2454
       kthread+0x1bf/0x1e0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30
      
      value changed: 0xffff88815883e000 -> 0x0000000000000000
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 26435 Comm: kworker/0:1 Not tainted 5.17.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: wg-crypt-wg2 wg_packet_decrypt_worker
      
      Fixes: 4ea7e386 ("dropmon: add ability to detect when hardware dropsrxpackets")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. Feb 10, 2022
  18. Jan 10, 2022
    • net: skb: introduce kfree_skb_reason() · c504e5c2
      Menglong Dong authored
      
      Introduce the interface kfree_skb_reason(), which is able to pass
      the reason why the skb is dropped to 'kfree_skb' tracepoint.
      
      Add the 'reason' field to 'trace_kfree_skb', so that users can get
      more detailed information about abnormal skbs with 'drop_monitor' or
      eBPF.
      
      All drop reasons are defined in the enum 'skb_drop_reason', and
      they will be print as string in 'kfree_skb' tracepoint in format
      of 'reason: XXX'.
      
      ( Maybe the reasons should be defined in a uapi header file, so that
      user space can use them? )
      
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  19. Dec 07, 2021
  20. Aug 05, 2021
  21. Mar 19, 2021
  22. Mar 10, 2021
    • drop_monitor: Perform cleanup upon probe registration failure · 9398e9c0
      Ido Schimmel authored
      
      In the rare case that drop_monitor fails to register its probe on the
      'napi_poll' tracepoint, it will not deactivate its hysteresis timer as
      part of the error path. If the hysteresis timer was armed by the shortly
      lived 'kfree_skb' probe and user space retries to initiate tracing, a
      warning will be emitted for trying to initialize an active object [1].
      
      Fix this by properly undoing all the operations that were done prior to
      probe registration, in both software and hardware code paths.
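
The "undo everything done before the failing step" pattern can be sketched with the usual goto-based unwinding. This is a simplified user-space model, not the drop_monitor error path: the `demo_*` state variables and the `fail_second_probe` switch are assumptions used to simulate a failed probe registration.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated resources; non-zero means "currently set up". */
static int demo_timer_armed;
static int demo_probe_a;
static int demo_probe_b;

/* Registers a probe, or fails without side effects when should_fail is set. */
static int demo_register_probe(int *slot, int should_fail)
{
	assert(slot != NULL);
	if (should_fail)
		return -1;
	*slot = 1;
	return 0;
}

/*
 * Enables tracing. On any failure, every step completed so far is undone
 * in reverse order, so a later retry starts from a clean state.
 */
static int demo_trace_on(int fail_second_probe)
{
	int rc;

	demo_timer_armed = 1;		/* step 1: set up timer and work item */

	rc = demo_register_probe(&demo_probe_a, 0);
	if (rc)
		goto err_timer;

	rc = demo_register_probe(&demo_probe_b, fail_second_probe);
	if (rc)
		goto err_unregister_a;

	return 0;

err_unregister_a:
	demo_probe_a = 0;		/* undo step 2 */
err_timer:
	demo_timer_armed = 0;		/* undo step 1, so the timer is not left active */
	return rc;
}
```

Without the final label, a failed second registration would leave the timer armed, which is the state that triggers the warning on the next attempt.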
      
      Note that syzkaller managed to fail probe registration by injecting a
      slab allocation failure [2].
      
      [1]
      ODEBUG: init active (active state 0) object type: timer_list hint: sched_send_work+0x0/0x60 include/linux/list.h:135
      WARNING: CPU: 1 PID: 8649 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
      Modules linked in:
      CPU: 1 PID: 8649 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
      [...]
      Call Trace:
       __debug_object_init+0x524/0xd10 lib/debugobjects.c:588
       debug_timer_init kernel/time/timer.c:722 [inline]
       debug_init kernel/time/timer.c:770 [inline]
       init_timer_key+0x2d/0x340 kernel/time/timer.c:814
       net_dm_trace_on_set net/core/drop_monitor.c:1111 [inline]
       set_all_monitor_traces net/core/drop_monitor.c:1188 [inline]
       net_dm_monitor_start net/core/drop_monitor.c:1295 [inline]
       net_dm_cmd_trace+0x720/0x1220 net/core/drop_monitor.c:1339
       genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739
       genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
       genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
       genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
       netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
       netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
       netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:672
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2348
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2402
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      [2]
       FAULT_INJECTION: forcing a failure.
       name failslab, interval 1, probability 0, space 0, times 1
       CPU: 1 PID: 8645 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
       Call Trace:
        dump_stack+0xfa/0x151
        should_fail.cold+0x5/0xa
        should_failslab+0x5/0x10
        __kmalloc+0x72/0x3f0
        tracepoint_add_func+0x378/0x990
        tracepoint_probe_register+0x9c/0xe0
        net_dm_cmd_trace+0x7fc/0x1220
        genl_family_rcv_msg_doit+0x228/0x320
        genl_rcv_msg+0x328/0x580
        netlink_rcv_skb+0x153/0x420
        genl_rcv+0x24/0x40
        netlink_unicast+0x533/0x7d0
        netlink_sendmsg+0x856/0xd90
        sock_sendmsg+0xcf/0x120
        ____sys_sendmsg+0x6e8/0x810
        ___sys_sendmsg+0xf3/0x170
        __sys_sendmsg+0xe5/0x1b0
        do_syscall_64+0x2d/0x70
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: 70c69274 ("drop_monitor: Initialize timer and work item upon tracing enable")
      Fixes: 8ee2267a ("drop_monitor: Convert to using devlink tracepoint")
      Reported-by: <syzbot+779559d6503f3a56213d@syzkaller.appspotmail.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. Oct 03, 2020
  24. Oct 01, 2020
  25. Aug 23, 2020
  26. Jun 21, 2020
  27. May 01, 2020
  28. Feb 28, 2020
  29. Feb 25, 2020
  30. Feb 07, 2020
    • drop_monitor: Do not cancel uninitialized work item · dfa7f709
      Ido Schimmel authored
      
      Drop monitor uses a work item that takes care of constructing and
      sending netlink notifications to user space. In case drop monitor never
      started to monitor, then the work item is uninitialized and not
      associated with a function.
      
      Therefore, a stop command from user space results in canceling an
      uninitialized work item which leads to the following warning [1].
      
      Fix this by not processing a stop command if drop monitor is not
      currently monitoring.
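
A minimal sketch of the guard, assuming a simple on/off state. The `demo_*` names, the return values, and the `cancel_calls` counter are illustrative only; the point is that the stop path returns early when monitoring never started, so the cancel step never sees an uninitialized work item.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_state {
	DEMO_TRACE_OFF,
	DEMO_TRACE_ON,
};

struct demo_monitor {
	enum demo_state state;
	bool work_initialized;	/* set only when monitoring starts */
	int cancel_calls;	/* counts how often the work item was cancelled */
};

/* Returns 0 on success, -1 if monitoring is already running. */
static int demo_monitor_start(struct demo_monitor *m)
{
	assert(m != NULL);
	if (m->state == DEMO_TRACE_ON)
		return -1;
	m->work_initialized = true;
	m->state = DEMO_TRACE_ON;
	return 0;
}

/* Returns 0 on success, -1 if there is nothing to stop. */
static int demo_monitor_stop(struct demo_monitor *m)
{
	assert(m != NULL);
	if (m->state == DEMO_TRACE_OFF)
		return -1;		/* never started: do not touch the work item */

	assert(m->work_initialized);	/* safe to cancel from here on */
	m->cancel_calls++;
	m->state = DEMO_TRACE_OFF;
	return 0;
}
```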
      
      [1]
      [   31.735402] ------------[ cut here ]------------
      [   31.736470] WARNING: CPU: 0 PID: 143 at kernel/workqueue.c:3032 __flush_work+0x89f/0x9f0
      ...
      [   31.738120] CPU: 0 PID: 143 Comm: dwdump Not tainted 5.5.0-custom-09491-g16d4077796b8 #727
      [   31.741968] RIP: 0010:__flush_work+0x89f/0x9f0
      ...
      [   31.760526] Call Trace:
      [   31.771689]  __cancel_work_timer+0x2a6/0x3b0
      [   31.776809]  net_dm_cmd_trace+0x300/0xef0
      [   31.777549]  genl_rcv_msg+0x5c6/0xd50
      [   31.781005]  netlink_rcv_skb+0x13b/0x3a0
      [   31.784114]  genl_rcv+0x29/0x40
      [   31.784720]  netlink_unicast+0x49f/0x6a0
      [   31.787148]  netlink_sendmsg+0x7cf/0xc80
      [   31.790426]  ____sys_sendmsg+0x620/0x770
      [   31.793458]  ___sys_sendmsg+0xfd/0x170
      [   31.802216]  __sys_sendmsg+0xdf/0x1a0
      [   31.806195]  do_syscall_64+0xa0/0x540
      [   31.806885]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 8e94c3bc ("drop_monitor: Allow user to start monitoring hardware drops")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  31. Jan 30, 2020
  32. Sep 16, 2019
  33. Aug 23, 2019
  34. Aug 17, 2019
    • drop_monitor: Allow user to start monitoring hardware drops · 8e94c3bc
      Ido Schimmel authored
      
      Drop monitor has start and stop commands, but so far these were only
      used to start and stop monitoring of software drops.
      
      Now that drop monitor can also monitor hardware drops, we should allow
      the user to control these as well.
      
      Do that by adding SW and HW flags to these commands. If no flag is
      specified, then only start / stop monitoring software drops. This is
      done in order to maintain backward-compatibility with existing user
      space applications.
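
The defaulting rule can be written down as a small helper. The `DEMO_MONITOR_*` bit values and the function name are hypothetical; the sketch only captures the rule stated above, that a command carrying neither flag is treated as a software-only request.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MONITOR_SW 0x1u	/* illustrative bit for software drops */
#define DEMO_MONITOR_HW 0x2u	/* illustrative bit for hardware drops */

/*
 * Maps the flags present in a start/stop command to the set of drop
 * sources it applies to. No flag at all means software drops only,
 * which is what older user space expects.
 */
static unsigned int demo_resolve_targets(bool sw_flag, bool hw_flag)
{
	unsigned int targets = 0;

	if (sw_flag)
		targets |= DEMO_MONITOR_SW;
	if (hw_flag)
		targets |= DEMO_MONITOR_HW;
	if (!targets)
		targets = DEMO_MONITOR_SW;	/* backward-compatible default */

	assert(targets != 0);
	return targets;
}
```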
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • drop_monitor: Add support for summary alert mode for hardware drops · d40e1deb
      Ido Schimmel authored
      
      In summary alert mode a notification is sent with a list of recent drop
      reasons and a count of how many packets were dropped due to this reason.
      
      To avoid expensive operations in the context in which packets are
      dropped, each CPU holds an array whose number of entries is the maximum
      number of drop reasons that can be encoded in the netlink notification.
      Each entry stores the drop reason and a count. When a packet is dropped
      the array is traversed and a new entry is created or the count of an
      existing entry is incremented.
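
The per-CPU bookkeeping described above can be sketched as a fixed-size array that is scanned on every drop. The `demo_*` names and the capacity of four entries are assumptions; in particular, what happens when the array is full is a choice made for this sketch (the drop is simply not recorded), not something the commit message specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_ENTRIES 4	/* illustrative cap on entries per notification */

struct demo_entry {
	const char *reason;	/* drop reason name */
	unsigned int count;	/* packets dropped for this reason */
};

struct demo_summary {
	struct demo_entry entries[DEMO_MAX_ENTRIES];
	unsigned int used;	/* number of valid entries */
};

/*
 * Records one drop. Returns true if an existing entry was incremented or
 * a new one was created, false if the array is full and the reason is new.
 */
static bool demo_summary_hit(struct demo_summary *s, const char *reason)
{
	unsigned int i;

	assert(s != NULL && reason != NULL);

	for (i = 0; i < s->used; i++) {
		if (strcmp(s->entries[i].reason, reason) == 0) {
			s->entries[i].count++;
			return true;
		}
	}

	if (s->used == DEMO_MAX_ENTRIES)
		return false;

	s->entries[s->used].reason = reason;
	s->entries[s->used].count = 1;
	s->used++;
	return true;
}
```

The scan is linear, but the array is bounded by the notification size, so the cost per drop stays small and no allocation happens in the drop path.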
      
      Later, in process context, the array is replaced with a newly allocated
      copy and the old array is encoded in a netlink notification. To avoid
      breaking user space, the notification includes the ancillary header,
      which is 'struct net_dm_alert_msg' with number of entries set to '0'.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>