  1. May 13, 2024
    • dma: xilinx_dpdma: Fix locking · b3899fc1
      Sean Anderson authored and Frieder Schrempf committed
      
      [ Upstream commit 244296cc ]
      
      There are several places where either chan->lock or chan->vchan.lock was
      not held. Add appropriate locking. This fixes lockdep warnings like
      
      [   31.077578] ------------[ cut here ]------------
      [   31.077831] WARNING: CPU: 2 PID: 40 at drivers/dma/xilinx/xilinx_dpdma.c:834 xilinx_dpdma_chan_queue_transfer+0x274/0x5e0
      [   31.077953] Modules linked in:
      [   31.078019] CPU: 2 PID: 40 Comm: kworker/u12:1 Not tainted 6.6.20+ #98
      [   31.078102] Hardware name: xlnx,zynqmp (DT)
      [   31.078169] Workqueue: events_unbound deferred_probe_work_func
      [   31.078272] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [   31.078377] pc : xilinx_dpdma_chan_queue_transfer+0x274/0x5e0
      [   31.078473] lr : xilinx_dpdma_chan_queue_transfer+0x270/0x5e0
      [   31.078550] sp : ffffffc083bb2e10
      [   31.078590] x29: ffffffc083bb2e10 x28: 0000000000000000 x27: ffffff880165a168
      [   31.078754] x26: ffffff880164e920 x25: ffffff880164eab8 x24: ffffff880164d480
      [   31.078920] x23: ffffff880165a148 x22: ffffff880164e988 x21: 0000000000000000
      [   31.079132] x20: ffffffc082aa3000 x19: ffffff880164e880 x18: 0000000000000000
      [   31.079295] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
      [   31.079453] x14: 0000000000000000 x13: ffffff8802263dc0 x12: 0000000000000001
      [   31.079613] x11: 0001ffc083bb2e34 x10: 0001ff880164e98f x9 : 0001ffc082aa3def
      [   31.079824] x8 : 0001ffc082aa3dec x7 : 0000000000000000 x6 : 0000000000000516
      [   31.079982] x5 : ffffffc7f8d43000 x4 : ffffff88003c9c40 x3 : ffffffffffffffff
      [   31.080147] x2 : ffffffc7f8d43000 x1 : 00000000000000c0 x0 : 0000000000000000
      [   31.080307] Call trace:
      [   31.080340]  xilinx_dpdma_chan_queue_transfer+0x274/0x5e0
      [   31.080518]  xilinx_dpdma_issue_pending+0x11c/0x120
      [   31.080595]  zynqmp_disp_layer_update+0x180/0x3ac
      [   31.080712]  zynqmp_dpsub_plane_atomic_update+0x11c/0x21c
      [   31.080825]  drm_atomic_helper_commit_planes+0x20c/0x684
      [   31.080951]  drm_atomic_helper_commit_tail+0x5c/0xb0
      [   31.081139]  commit_tail+0x234/0x294
      [   31.081246]  drm_atomic_helper_commit+0x1f8/0x210
      [   31.081363]  drm_atomic_commit+0x100/0x140
      [   31.081477]  drm_client_modeset_commit_atomic+0x318/0x384
      [   31.081634]  drm_client_modeset_commit_locked+0x8c/0x24c
      [   31.081725]  drm_client_modeset_commit+0x34/0x5c
      [   31.081812]  __drm_fb_helper_restore_fbdev_mode_unlocked+0x104/0x168
      [   31.081899]  drm_fb_helper_set_par+0x50/0x70
      [   31.081971]  fbcon_init+0x538/0xc48
      [   31.082047]  visual_init+0x16c/0x23c
      [   31.082207]  do_bind_con_driver.isra.0+0x2d0/0x634
      [   31.082320]  do_take_over_console+0x24c/0x33c
      [   31.082429]  do_fbcon_takeover+0xbc/0x1b0
      [   31.082503]  fbcon_fb_registered+0x2d0/0x34c
      [   31.082663]  register_framebuffer+0x27c/0x38c
      [   31.082767]  __drm_fb_helper_initial_config_and_unlock+0x5c0/0x91c
      [   31.082939]  drm_fb_helper_initial_config+0x50/0x74
      [   31.083012]  drm_fbdev_dma_client_hotplug+0xb8/0x108
      [   31.083115]  drm_client_register+0xa0/0xf4
      [   31.083195]  drm_fbdev_dma_setup+0xb0/0x1cc
      [   31.083293]  zynqmp_dpsub_drm_init+0x45c/0x4e0
      [   31.083431]  zynqmp_dpsub_probe+0x444/0x5e0
      [   31.083616]  platform_probe+0x8c/0x13c
      [   31.083713]  really_probe+0x258/0x59c
      [   31.083793]  __driver_probe_device+0xc4/0x224
      [   31.083878]  driver_probe_device+0x70/0x1c0
      [   31.083961]  __device_attach_driver+0x108/0x1e0
      [   31.084052]  bus_for_each_drv+0x9c/0x100
      [   31.084125]  __device_attach+0x100/0x298
      [   31.084207]  device_initial_probe+0x14/0x20
      [   31.084292]  bus_probe_device+0xd8/0xdc
      [   31.084368]  deferred_probe_work_func+0x11c/0x180
      [   31.084451]  process_one_work+0x3ac/0x988
      [   31.084643]  worker_thread+0x398/0x694
      [   31.084752]  kthread+0x1bc/0x1c0
      [   31.084848]  ret_from_fork+0x10/0x20
      [   31.084932] irq event stamp: 64549
      [   31.084970] hardirqs last  enabled at (64548): [<ffffffc081adf35c>] _raw_spin_unlock_irqrestore+0x80/0x90
      [   31.085157] hardirqs last disabled at (64549): [<ffffffc081adf010>] _raw_spin_lock_irqsave+0xc0/0xdc
      [   31.085277] softirqs last  enabled at (64503): [<ffffffc08001071c>] __do_softirq+0x47c/0x500
      [   31.085390] softirqs last disabled at (64498): [<ffffffc080017134>] ____do_softirq+0x10/0x1c
      [   31.085501] ---[ end trace 0000000000000000 ]---
      
      Fixes: 7cbb0c63 ("dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver")
      Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
      Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
      Link: https://lore.kernel.org/r/20240308210034.3634938-2-sean.anderson@linux.dev
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • idma64: Don't try to serve interrupts when device is powered off · 96aef037
      Andy Shevchenko authored and Frieder Schrempf committed
      
      [ Upstream commit 9140ce47 ]
      
      When the iDMA 64-bit device is powered off, the IRQ status register
      reads as all 1s. This never happens in normal operation and signals
      that the device is simply powered off. Don't try to serve interrupts
      that are not ours.
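
      A minimal sketch of such a check in plain C, not the driver's code.
      The helper name irq_status_is_ours(), the 32-bit register width and
      the extra "nothing pending" case for a zero status are assumptions
      made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decide whether a raw IRQ status value should be served. */
static bool irq_status_is_ours(uint32_t status)
{
	/* All ones: the pattern described above for a powered-off device. */
	if (status == UINT32_MAX)
		return false;

	/* Zero (assumption): nothing is pending, so there is nothing to serve. */
	return status != 0;
}
```

      An interrupt handler built on this idea would return early, without
      touching the device, whenever the helper reports false.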
      
      Fixes: 667dfed9 ("dmaengine: add a driver for Intel integrated DMA 64-bit")
      Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
      Closes: https://lore.kernel.org/r/700bbb84-90e1-4505-8ff0-3f17ea8bc631@gmail.com
      Tested-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Link: https://lore.kernel.org/r/20240321120453.1360138-1-andriy.shevchenko@linux.intel.com
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • dmaengine: owl: fix register access functions · dd4234fe
      Arnd Bergmann authored and Frieder Schrempf committed
      
      [ Upstream commit 43c633ef ]
      
      When building with 'make W=1', clang notices that the computed register
      values are never actually written back but instead the wrong variable
      is set:
      
      drivers/dma/owl-dma.c:244:6: error: variable 'regval' set but not used [-Werror,-Wunused-but-set-variable]
        244 |         u32 regval;
            |             ^
      drivers/dma/owl-dma.c:268:6: error: variable 'regval' set but not used [-Werror,-Wunused-but-set-variable]
        268 |         u32 regval;
            |             ^
      
      Change these to what was most likely intended.
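
      The class of bug can be illustrated with a generic read-modify-write
      helper. This is only a sketch: reg_update() and its pointer-based
      "register" are hypothetical stand-ins, not the owl-dma functions.
      The point is that the variable holding the value that was read is
      the one that gets modified and written back.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register update: set or clear the bits in 'mask'. */
static void reg_update(volatile uint32_t *reg, uint32_t mask, bool set)
{
	uint32_t regval = *reg;   /* read the current register value */

	if (set)
		regval |= mask;   /* modify the value that was read ... */
	else
		regval &= ~mask;

	*reg = regval;            /* ... and write that same variable back */
}
```

      If the result were instead stored in a different variable than the
      one written back, the compiler could report the unused one as "set
      but not used", which is the kind of warning quoted above.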
      
      Fixes: 47e20577 ("dmaengine: Add Actions Semi Owl family S900 DMA driver")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Peter Korsgaard <peter@korsgaard.com>
      Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Link: https://lore.kernel.org/r/20240322132116.906475-1-arnd@kernel.org
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tcp: Fix NEW_SYN_RECV handling in inet_twsk_purge() · 4c9e872c
      Eric Dumazet authored and Frieder Schrempf committed
      commit 1c4e97dd upstream.
      
      inet_twsk_purge() uses rcu to find TIME_WAIT and NEW_SYN_RECV
      objects to purge.
      
      These objects use SLAB_TYPESAFE_BY_RCU semantics and need special
      care. We need to use refcount_inc_not_zero(&sk->sk_refcnt).
      
      Reuse the existing correct logic I wrote for TIME_WAIT,
      because both structures have common locations for
      sk_state, sk_family, and netns pointer.
      
      If after the refcount_inc_not_zero() the object fields no longer match
      the keys, use sock_gen_put(sk) to release the refcount.
      
      Then we can call inet_twsk_deschedule_put() for TIME_WAIT,
      inet_csk_reqsk_queue_drop_and_put() for NEW_SYN_RECV sockets,
      with BH disabled.
      
      Then we need to restart the loop because we had to drop rcu_read_lock().
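
      The "pin, then recheck" pattern described above can be sketched in
      userspace C11. Everything here is hypothetical: struct obj,
      ref_inc_not_zero(), ref_put() and try_get_matching() only mimic
      the roles of refcount_inc_not_zero() and sock_gen_put(), and a
      single integer key stands in for sk_state, sk_family and the netns
      pointer.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object whose memory may be reused while we look at it. */
struct obj {
	atomic_int refcnt;
	int key;
};

/* Take a reference only if the count is still non-zero. */
static bool ref_inc_not_zero(atomic_int *r)
{
	int old = atomic_load(r);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(r, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference (freeing on zero is omitted in this sketch). */
static void ref_put(atomic_int *r)
{
	atomic_fetch_sub(r, 1);
}

/* Return true if a reference is now held on an object that still matches 'key'. */
static bool try_get_matching(struct obj *o, int key)
{
	if (!ref_inc_not_zero(&o->refcnt))
		return false;           /* object is already being freed */

	if (o->key != key) {            /* recheck the fields after pinning */
		ref_put(&o->refcnt);    /* not ours any more: release it */
		return false;
	}
	return true;
}
```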
      
      Fixes: 740ea3c4 ("tcp: Clean up kernel listener's reqsk in inet_twsk_purge()")
      Link: https://lore.kernel.org/netdev/CANn89iLvFuuihCtt9PME2uS1WJATnf5fKjDToa1WzVnRzHnPfg@mail.gmail.com/T/#u
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20240308200122.64357-2-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      [shaozhengchao: resolved conflicts in 5.10]
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: Clean up kernel listener's reqsk in inet_twsk_purge() · 1ad839c7
      Kuniyuki Iwashima authored and Frieder Schrempf committed
      commit 740ea3c4 upstream.
      
      Eric Dumazet reported a use-after-free related to the per-netns ehash
      series. [0]
      
      When we create a TCP socket from userspace, the socket always holds a
      refcnt of the netns.  This guarantees that a reqsk timer is always fired
      before netns dismantle.  Each reqsk has a refcnt of its listener, so the
      listener is not freed before the reqsk, and the net is not freed before
      the listener as well.
      
      OTOH, when in-kernel users create a TCP socket, it might not hold a refcnt
      of its netns.  Thus, a reqsk timer can be fired after the netns dismantle
      and access freed per-netns ehash.
      
      To avoid the use-after-free, we need to clean up TCP_NEW_SYN_RECV sockets
      in inet_twsk_purge() if the netns uses a per-netns ehash.
      
      [0]: https://lore.kernel.org/netdev/CANn89iLXMup0dRD_Ov79Xt8N9FM0XdhCHEN05sf3eLwxKweM6w@mail.gmail.com/
      
      
      
      BUG: KASAN: use-after-free in tcp_or_dccp_get_hashinfo include/net/inet_hashtables.h:181 [inline]
      BUG: KASAN: use-after-free in reqsk_queue_unlink+0x320/0x350 net/ipv4/inet_connection_sock.c:913
      Read of size 8 at addr ffff88807545bd80 by task syz-executor.2/8301
      
      CPU: 1 PID: 8301 Comm: syz-executor.2 Not tainted 6.0.0-syzkaller-02757-gaf7d23f9d96a #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
      Call Trace:
      <IRQ>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
      print_address_description mm/kasan/report.c:317 [inline]
      print_report.cold+0x2ba/0x719 mm/kasan/report.c:433
      kasan_report+0xb1/0x1e0 mm/kasan/report.c:495
      tcp_or_dccp_get_hashinfo include/net/inet_hashtables.h:181 [inline]
      reqsk_queue_unlink+0x320/0x350 net/ipv4/inet_connection_sock.c:913
      inet_csk_reqsk_queue_drop net/ipv4/inet_connection_sock.c:927 [inline]
      inet_csk_reqsk_queue_drop_and_put net/ipv4/inet_connection_sock.c:939 [inline]
      reqsk_timer_handler+0x724/0x1160 net/ipv4/inet_connection_sock.c:1053
      call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474
      expire_timers kernel/time/timer.c:1519 [inline]
      __run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790
      __run_timers kernel/time/timer.c:1768 [inline]
      run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803
      __do_softirq+0x1d0/0x9c8 kernel/softirq.c:571
      invoke_softirq kernel/softirq.c:445 [inline]
      __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
      irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
      sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1107
      </IRQ>
      
      Fixes: d1e5e640 ("tcp: Introduce optional per-netns ehash.")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reported-by: Eric Dumazet <edumazet@google.com>
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20221012145036.74960-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      [shaozhengchao: resolved conflicts in 5.10]
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: diskonchip: work around ubsan link failure · 97004a94
      Arnd Bergmann authored and Frieder Schrempf committed
      
      commit 21c9fb61 upstream.
      
      I ran into a randconfig build failure with UBSAN using gcc-13.2:
      
      arm-linux-gnueabi-ld: error: unplaced orphan section `.bss..Lubsan_data31' from `drivers/mtd/nand/raw/diskonchip.o'
      
      I'm not entirely sure what is going on here, but I suspect this has something
      to do with the check for the end of the doc_locations[] array that contains
      an (unsigned long)0xffffffff element, which is compared against the signed
      (int)0xffffffff. If this is the case, we should get a runtime check for
      undefined behavior, but we instead get an unexpected build-time error.
      
      I would have expected this to work fine on 32-bit architectures despite the
      signed integer overflow, though on 64-bit architectures this likely won't
      ever work.
      
      Changing the condition to instead check for the size of the array makes the
      code safe everywhere and avoids the ubsan check that leads to the link
      error. The loop code goes back to before 2.6.12.
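
      A sketch of a loop bounded by the array size rather than by an
      end-marker element. The table contents and the names probe_addrs
      and find_probe_addr() are hypothetical and are not taken from
      diskonchip.c; ARRAY_SIZE is defined locally to keep the example
      self-contained.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical probe addresses, illustrative values only. */
static const unsigned long probe_addrs[] = { 0xc8000UL, 0xca000UL, 0xcc000UL };

/* Return the index of 'addr' in the table, or -1 if it is not listed. */
static int find_probe_addr(unsigned long addr)
{
	size_t i;

	/* Bounded by the array size, so no sentinel value is ever compared. */
	for (i = 0; i < ARRAY_SIZE(probe_addrs); i++) {
		if (probe_addrs[i] == addr)
			return (int)i;
	}
	return -1;
}
```

      With this shape the table needs no 0xffffffff terminator, so the
      signed versus unsigned comparison discussed above does not arise.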
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20240405143015.717429-1-arnd@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • stackdepot: respect __GFP_NOLOCKDEP allocation flag · ae1d1d89
      Andrey Ryabinin authored and Frieder Schrempf committed
      commit 6fe60465 upstream.
      
      If stack_depot_save_flags() allocates memory, it always drops the
      __GFP_NOLOCKDEP flag.  So when KASAN tries to track a __GFP_NOLOCKDEP
      allocation, we may end up with a lockdep splat like below:
      
       ======================================================
       WARNING: possible circular locking dependency detected
       6.9.0-rc3+ #49 Not tainted
       ------------------------------------------------------
       kswapd0/149 is trying to acquire lock:
       ffff88811346a920 (&xfs_nondir_ilock_class){++++}-{4:4}, at: xfs_reclaim_inode+0x3ac/0x590 [xfs]
      
       but task is already holding lock:
       ffffffff8bb33100 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x5d9/0xad0
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
       -> #1 (fs_reclaim){+.+.}-{0:0}:
              __lock_acquire+0x7da/0x1030
              lock_acquire+0x15d/0x400
              fs_reclaim_acquire+0xb5/0x100
              prepare_alloc_pages.constprop.0+0xc5/0x230
              __alloc_pages+0x12a/0x3f0
              alloc_pages_mpol+0x175/0x340
              stack_depot_save_flags+0x4c5/0x510
              kasan_save_stack+0x30/0x40
              kasan_save_track+0x10/0x30
              __kasan_slab_alloc+0x83/0x90
              kmem_cache_alloc+0x15e/0x4a0
              __alloc_object+0x35/0x370
              __create_object+0x22/0x90
              __kmalloc_node_track_caller+0x477/0x5b0
              krealloc+0x5f/0x110
              xfs_iext_insert_raw+0x4b2/0x6e0 [xfs]
              xfs_iext_insert+0x2e/0x130 [xfs]
              xfs_iread_bmbt_block+0x1a9/0x4d0 [xfs]
              xfs_btree_visit_block+0xfb/0x290 [xfs]
              xfs_btree_visit_blocks+0x215/0x2c0 [xfs]
              xfs_iread_extents+0x1a2/0x2e0 [xfs]
              xfs_buffered_write_iomap_begin+0x376/0x10a0 [xfs]
              iomap_iter+0x1d1/0x2d0
              iomap_file_buffered_write+0x120/0x1a0
              xfs_file_buffered_write+0x128/0x4b0 [xfs]
              vfs_write+0x675/0x890
              ksys_write+0xc3/0x160
              do_syscall_64+0x94/0x170
              entry_SYSCALL_64_after_hwframe+0x71/0x79
      
      Always preserve __GFP_NOLOCKDEP to fix this.
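
      The idea of preserving one flag while filtering the rest can be
      sketched as a bit mask. The FLAG_* values and filter_flags() are
      made up for illustration; the real GFP flag values and the
      stackdepot code differ.

```c
/* Hypothetical flag bits; the real GFP values differ. */
#define FLAG_ATOMIC    0x1u
#define FLAG_KERNEL    0x2u
#define FLAG_NOLOCKDEP 0x4u
#define FLAG_OTHER     0x8u

/* Keep only the flags that the internal allocation is allowed to inherit. */
static unsigned int filter_flags(unsigned int flags)
{
	/* Including FLAG_NOLOCKDEP in the mask is what preserves it. */
	return flags & (FLAG_ATOMIC | FLAG_KERNEL | FLAG_NOLOCKDEP);
}
```

      Leaving a flag out of such a mask silently drops it from every
      nested allocation, which matches the symptom described above.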
      
      Link: https://lkml.kernel.org/r/20240418141133.22950-1-ryabinin.a.a@gmail.com
      Fixes: cd11016e ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
      Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Reported-by: Xiubo Li <xiubli@redhat.com>
      Closes: https://lore.kernel.org/all/a0caa289-ca02-48eb-9bf2-d86fd47b71f4@redhat.com/
      Reported-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
      Closes: https://lore.kernel.org/all/f9ff999a-e170-b66b-7caf-293f2b147ac2@opensource.wdc.com/
      Suggested-by: Dave Chinner <david@fromorbit.com>
      Tested-by: Xiubo Li <xiubli@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: b44: set pause params only when interface is up · e1967f8b
      Peter Münster authored and Frieder Schrempf committed
      commit e3eb7dd4 upstream.
      
      b44_free_rings() accesses b44::rx_buffers (and ::tx_buffers)
      unconditionally, but these buffers are only valid while the
      device is up (they get allocated in b44_open() and deallocated
      again in b44_close()). At any other time they are just NULL pointers.
      
      So if you try to change the pause params while the network interface
      is disabled/administratively down (which netifd likely tries to do),
      everything explodes.
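
      A sketch of the guard implied by the subject line, using a toy
      device model instead of the b44 driver. struct nic, nic_reinit()
      and nic_set_pause() are hypothetical names; the "running" field
      plays the role of the interface being up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy device: rx_buffers is only valid while 'running' is true. */
struct nic {
	bool running;
	int *rx_buffers;
	bool pause_rx;
	int reinit_count;
};

/* Touches the ring buffers, so it must only run while the device is up. */
static void nic_reinit(struct nic *n)
{
	n->rx_buffers[0] = 0;
	n->reinit_count++;
}

/* Record the new setting always; reprogram the hardware only when up. */
static int nic_set_pause(struct nic *n, bool rx)
{
	n->pause_rx = rx;
	if (n->running)
		nic_reinit(n);
	return 0;
}
```

      When the device is down, the setting is merely stored and would
      take effect the next time the interface is brought up.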
      
      Link: https://github.com/openwrt/openwrt/issues/13789
      Fixes: 1da177e4 (Linux-2.6.12-rc2)
      Cc: stable@vger.kernel.org
      Reported-by: Peter Münster <pm@a16n.net>
      Suggested-by: Jonas Gorski <jonas.gorski@gmail.com>
      Signed-off-by: Vaclav Svoboda <svoboda@neng.cz>
      Tested-by: Peter Münster <pm@a16n.net>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Peter Münster <pm@a16n.net>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/87y192oolj.fsf@a16n.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ethernet: Add helper for assigning packet type when dest address does not match device address · 227110bc
      Rahul Rameshbabu authored and Frieder Schrempf committed
      
      commit 6e159fd6 upstream.
      
      Enable reuse of logic in eth_type_trans for determining packet type.
      
      Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
      Cc: stable@vger.kernel.org
      Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Link: https://lore.kernel.org/r/20240423181319.115860-3-rrameshbabu@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • irqchip/gic-v3-its: Prevent double free on error · 8a8217a4
      Guanrui Huang authored and Frieder Schrempf committed
      
      commit c26591af upstream.
      
      The error handling path in its_vpe_irq_domain_alloc() causes a double free
      when its_vpe_init() fails after successfully allocating at least one
      interrupt. This happens because its_vpe_irq_domain_free() frees the
      interrupts along with the area bitmap and the vprop_page and
      its_vpe_irq_domain_alloc() subsequently frees the area bitmap and the
      vprop_page again.
      
      Fix this by unconditionally invoking its_vpe_irq_domain_free() which
      handles all cases correctly and by removing the bitmap/vprop_page freeing
      from its_vpe_irq_domain_alloc().
      
      [ tglx: Massaged change log ]
      
      Fixes: 7d75bbb4 ("irqchip/gic-v3-its: Add VPE irq domain allocation/teardown")
      Signed-off-by: Guanrui Huang <guanrui.huang@linux.alibaba.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Marc Zyngier <maz@kernel.org>
      Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20240418061053.96803-2-guanrui.huang@linux.alibaba.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/amdgpu: Fix leak when GPU memory allocation fails · 56cfdbe6
      Mukul Joshi authored and Frieder Schrempf committed
      
      commit 25e9227c upstream.
      
      Free the sync object if the memory allocation fails for any
      reason.
      
      Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/amdgpu/sdma5.2: use legacy HDP flush for SDMA2/3 · a7bfdc4c
      Alex Deucher authored and Frieder Schrempf committed
      
      commit 9792b7cc upstream.
      
      This avoids a potential conflict with firmwares with the newer
      HDP flush mechanism.
      
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: dts: rockchip: enable internal pull-up for Q7_THRM# on RK3399 Puma · b9a76fed
      Iskander Amara authored and Frieder Schrempf committed
      
      commit 0ac417b8 upstream.
      
      The Q7_THRM# pin is connected to a diode on the module which is used
      as a level shifter, and the pin has a pull-down enabled by
      default. We need to configure it to internal pull-up; otherwise,
      whenever the pin is configured as INPUT and we try to
      control it externally, the value will always remain zero.
      
      Signed-off-by: Iskander Amara <iskander.amara@theobroma-systems.com>
      Fixes: 2c66fc34 ("arm64: dts: rockchip: add RK3399-Q7 (Puma) SoM")
      Reviewed-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20240308085243.69903-1-iskander.amara@theobroma-systems.com
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cpu: Re-enable CPU mitigations by default for !X86 architectures · 8ed69da5
      Sean Christopherson authored and Frieder Schrempf committed
      commit fe42754b upstream.
      
      Rename x86's SPECULATION_MITIGATIONS to CPU_MITIGATIONS, define it in
      generic code, and force it on for all architectures except x86.  A
      recent commit to turn
      mitigations off by default if SPECULATION_MITIGATIONS=n kinda sorta
      missed that "cpu_mitigations" is completely generic, whereas
      SPECULATION_MITIGATIONS is x86-specific.
      
      Rename x86's SPECULATION_MITIGATIONS instead of keeping both and have it
      select CPU_MITIGATIONS, as having two configs for the same thing is
      unnecessary and confusing.  This will also allow x86 to use the knob to
      manage mitigations that aren't strictly related to speculative
      execution.
      
      Use another Kconfig to communicate to common code that CPU_MITIGATIONS
      is already defined instead of having x86's menu depend on the common
      CPU_MITIGATIONS.  This allows keeping a single point of contact for all
      of x86's mitigations, and it's not clear that other architectures *want*
      to allow disabling mitigations at compile-time.
      
      Fixes: f337a6a2 ("x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n")
      Closes: https://lkml.kernel.org/r/20240413115324.53303a68%40canb.auug.org.au
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Reported-by: Michael Ellerman <mpe@ellerman.id.au>
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20240420000556.2645001-2-seanjc@google.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • btrfs: fix information leak in btrfs_ioctl_logical_to_ino() · 1aac3f1d
      Johannes Thumshirn authored and Frieder Schrempf committed
      
      commit 2f7ef5bb upstream.
      
      Syzbot reported the following information leak in
      btrfs_ioctl_logical_to_ino():
      
        BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
        BUG: KMSAN: kernel-infoleak in _copy_to_user+0xbc/0x110 lib/usercopy.c:40
         instrument_copy_to_user include/linux/instrumented.h:114 [inline]
         _copy_to_user+0xbc/0x110 lib/usercopy.c:40
         copy_to_user include/linux/uaccess.h:191 [inline]
         btrfs_ioctl_logical_to_ino+0x440/0x750 fs/btrfs/ioctl.c:3499
         btrfs_ioctl+0x714/0x1260
         vfs_ioctl fs/ioctl.c:51 [inline]
         __do_sys_ioctl fs/ioctl.c:904 [inline]
         __se_sys_ioctl+0x261/0x450 fs/ioctl.c:890
         __x64_sys_ioctl+0x96/0xe0 fs/ioctl.c:890
         x64_sys_call+0x1883/0x3b50 arch/x86/include/generated/asm/syscalls_64.h:17
         do_syscall_x64 arch/x86/entry/common.c:52 [inline]
         do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
         entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
        Uninit was created at:
         __kmalloc_large_node+0x231/0x370 mm/slub.c:3921
         __do_kmalloc_node mm/slub.c:3954 [inline]
         __kmalloc_node+0xb07/0x1060 mm/slub.c:3973
         kmalloc_node include/linux/slab.h:648 [inline]
         kvmalloc_node+0xc0/0x2d0 mm/util.c:634
         kvmalloc include/linux/slab.h:766 [inline]
         init_data_container+0x49/0x1e0 fs/btrfs/backref.c:2779
         btrfs_ioctl_logical_to_ino+0x17c/0x750 fs/btrfs/ioctl.c:3480
         btrfs_ioctl+0x714/0x1260
         vfs_ioctl fs/ioctl.c:51 [inline]
         __do_sys_ioctl fs/ioctl.c:904 [inline]
         __se_sys_ioctl+0x261/0x450 fs/ioctl.c:890
         __x64_sys_ioctl+0x96/0xe0 fs/ioctl.c:890
         x64_sys_call+0x1883/0x3b50 arch/x86/include/generated/asm/syscalls_64.h:17
         do_syscall_x64 arch/x86/entry/common.c:52 [inline]
         do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
         entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
        Bytes 40-65535 of 65536 are uninitialized
        Memory access of size 65536 starts at ffff888045a40000
      
      This happens, because we're copying a 'struct btrfs_data_container' back
      to user-space. This btrfs_data_container is allocated in
      'init_data_container()' via kvmalloc(), which does not zero-fill the
      memory.
      
      Fix this by using kvzalloc() which zeroes out the memory on allocation.
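
      A userspace analogue of the fix, using calloc() in place of
      kvzalloc(). struct container and container_alloc() are hypothetical
      and only loosely modelled on the description above; the point is
      that every byte later copied out has been initialised, including
      the part of the buffer that is never filled in.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container: a small header followed by a variable payload. */
struct container {
	uint32_t bytes_left;
	uint32_t elem_cnt;
	uint64_t val[];
};

/* Allocate 'total' bytes, zero-filled, and initialise the header. */
static struct container *container_alloc(size_t total)
{
	struct container *c;

	if (total < sizeof(*c))
		return NULL;

	c = calloc(1, total);   /* zeroed, unlike malloc() */
	if (!c)
		return NULL;

	c->bytes_left = (uint32_t)(total - sizeof(*c));
	return c;
}
```

      With a non-zeroing allocator, only the header fields would be
      defined and the remaining bytes would carry whatever the allocator
      handed back.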
      
      CC: stable@vger.kernel.org # 4.14+
      Reported-by: <syzbot+510a1abbb8116eeb341d@syzkaller.appspotmail.com>
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Johannes Thumshirn <Johannes.thumshirn@wdc.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Bluetooth: btusb: Add Realtek RTL8852BE support ID 0x0bda:0x4853 · c7c33f68
      WangYuli authored and Frieder Schrempf committed
      
      commit d1a5a7ee upstream.
      
      Add the support ID (0x0bda, 0x4853) to the usb_device_id table for
      Realtek RTL8852BE.
      
      Without this change the device utilizes an obsolete version of
      the firmware that is encoded in it rather than the updated Realtek
      firmware and config files from the firmware directory. The latter
      files implement many new features.
      
      The device table is as follows:
      
      T: Bus=03 Lev=01 Prnt=01 Port=09 Cnt=03 Dev#= 4 Spd=12 MxCh= 0
      D: Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
      P: Vendor=0bda ProdID=4853 Rev= 0.00
      S: Manufacturer=Realtek
      S: Product=Bluetooth Radio
      S: SerialNumber=00e04c000001
      C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA
      I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms
      E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
      E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms
      E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms
      I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms
      E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms
      I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms
      E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms
      I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms
      E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms
      I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms
      E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms
      I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms
      E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: WangYuli <wangyuli@uniontech.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7c33f68
    • Bluetooth: Fix type of len in {l2cap,sco}_sock_getsockopt_old() · 962f6ed8
      Nathan Chancellor authored and Frieder Schrempf committed
      commit 9bf4e919 upstream.
      
      After an innocuous optimization change in LLVM main (19.0.0), x86_64
      allmodconfig (which enables CONFIG_KCSAN / -fsanitize=thread) fails to
      build due to the checks in check_copy_size():
      
        In file included from net/bluetooth/sco.c:27:
        In file included from include/linux/module.h:13:
        In file included from include/linux/stat.h:19:
        In file included from include/linux/time.h:60:
        In file included from include/linux/time32.h:13:
        In file included from include/linux/timex.h:67:
        In file included from arch/x86/include/asm/timex.h:6:
        In file included from arch/x86/include/asm/tsc.h:10:
        In file included from arch/x86/include/asm/msr.h:15:
        In file included from include/linux/percpu.h:7:
        In file included from include/linux/smp.h:118:
        include/linux/thread_info.h:244:4: error: call to '__bad_copy_from'
        declared with 'error' attribute: copy source size is too small
          244 |                         __bad_copy_from();
              |                         ^
      
      The same exact error occurs in l2cap_sock.c. The copy_to_user()
      statements that are failing come from l2cap_sock_getsockopt_old() and
      sco_sock_getsockopt_old(). This does not occur with GCC with or without
      KCSAN or Clang without KCSAN enabled.
      
      len is defined as an 'int' because it is assigned from
      '__user int *optlen'. However, it is clamped against the result of
      sizeof(), which has a type of 'size_t' ('unsigned long' for 64-bit
      platforms). Because min() requires compatible types, the clamp is done
      with min_t(), which casts both len and the result of sizeof() to
      'unsigned int', meaning len changes sign and the result of sizeof()
      is truncated. From there, len is passed to copy_to_user(), which has a
      third parameter type of 'unsigned long', so it is widened and changes
      sign again. This excessive casting in combination with the KCSAN
      instrumentation causes LLVM to fail to eliminate the __bad_copy_from()
      call, failing the build.
      
      The official recommendation from LLVM developers is to consistently use
      long types for all size variables to avoid the unnecessary casting in
      the first place. Change the type of len to size_t in both
      l2cap_sock_getsockopt_old() and sco_sock_getsockopt_old(). This clears
      up the error while allowing min_t() to be replaced with min(), resulting
      in simpler code with no casts and fewer implicit conversions. While len
      is a different type than optlen now, it should result in no functional
      change because the result of sizeof() will clamp all values of optlen in
      the same manner as before.
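
      The conversions described above can be sketched in plain userspace C.
      This is only an illustration under stated assumptions: struct
      demo_opts and both helpers are hypothetical, and the ternary
      expressions stand in for the kernel's min_t() and min() macros.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Hypothetical stand-in for the options struct copied to user space. */
      struct demo_opts {
      	unsigned short mtu;
      	unsigned char mode;
      };

      /* Old pattern: an int len clamped through casts to unsigned int. */
      static int demo_clamp_old(int optlen)
      {
      	int len = optlen;
      	unsigned int a = (unsigned int)len;                      /* sign change */
      	unsigned int b = (unsigned int)sizeof(struct demo_opts); /* narrowing */

      	len = (int)(a < b ? a : b);
      	return len;
      }

      /* New pattern: a size_t len compared directly with sizeof(). */
      static size_t demo_clamp_new(int optlen)
      {
      	size_t len = (size_t)optlen; /* stand-in for reading optlen into a size_t */
      	size_t max = sizeof(struct demo_opts);

      	return len < max ? len : max;
      }
      ```

      Both helpers clamp a negative or oversized optlen to
      sizeof(struct demo_opts), which matches the "no functional change"
      claim; the new variant simply gets there without the intermediate
      casts.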
      
      Cc: stable@vger.kernel.org
      Closes: https://github.com/ClangBuiltLinux/linux/issues/2007
      Link: https://github.com/llvm/llvm-project/issues/85647
      
      
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Justin Stitt <justinstitt@google.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      962f6ed8
    • PM / devfreq: Fix buffer overflow in trans_stat_show · fc3bcc03
      Christian Marangi authored and Frieder Schrempf committed
      commit 08e23d05 upstream.
      
      Fix buffer overflow in trans_stat_show().
      
      Convert the plain snprintf() calls to the safer scnprintf(), bounded
      by PAGE_SIZE.

      Add a check for exceeding PAGE_SIZE and exit the loop early when it
      triggers. Also add a warning at the end that PAGE_SIZE was exceeded
      and that the stats are disabled.
      
      Return -EFBIG in the case where we don't have enough space to write the
      full transition table.
      
      Also document in the ABI that this function can return -EFBIG error.
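
      A minimal userspace sketch of the pattern described above, assuming a
      small fixed buffer in place of PAGE_SIZE. demo_scnprintf() is a
      hypothetical stand-in that mimics the kernel helper's return value
      (characters actually stored), and demo_show() is an illustrative
      loop, not the driver's trans_stat_show().

      ```c
      #include <assert.h>
      #include <errno.h>
      #include <stdarg.h>
      #include <stdio.h>
      #include <string.h>

      /*
       * Userspace stand-in for scnprintf(): returns the number of characters
       * actually stored (excluding the NUL), never more than size - 1.
       */
      static int demo_scnprintf(char *buf, size_t size, const char *fmt, ...)
      {
      	va_list args;
      	int n;

      	if (size == 0)
      		return 0;
      	va_start(args, fmt);
      	n = vsnprintf(buf, size, fmt, args);
      	va_end(args);
      	if (n < 0)
      		return 0;
      	return (size_t)n >= size ? (int)(size - 1) : n;
      }

      /* Write `rows` lines into buf; return the length or -EFBIG when full. */
      static int demo_show(char *buf, size_t size, int rows)
      {
      	int len = 0;
      	int i;

      	assert(size > 0);
      	for (i = 0; i < rows; i++) {
      		if ((size_t)len >= size - 1)
      			break; /* buffer is full: stop early */
      		len += demo_scnprintf(buf + len, size - len, "row %d\n", i);
      	}
      	/* Conservative: a completely full buffer is reported as too small. */
      	if ((size_t)len >= size - 1)
      		return -EFBIG;
      	return len;
      }
      ```

      Because the running length can never exceed size - 1, the
      "buf + len" pointer always stays inside the buffer, which is the
      property plain snprintf() does not give when its return value is
      accumulated.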
      
      Link: https://lore.kernel.org/all/20231024183016.14648-2-ansuelsmth@gmail.com/
      Cc: stable@vger.kernel.org
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218041
      
      
      Fixes: e552bbaf ("PM / devfreq: Add sysfs node for representing frequency transition information.")
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc3bcc03
    • tracing: Increase PERF_MAX_TRACE_SIZE to handle Sentinel1 and docker together · 86ec0a85
      Robin H. Johnson authored and Frieder Schrempf committed
      commit e531e90b upstream.
      
      Running endpoint security solutions like Sentinel1 that use perf-based
      tracing heavily leads to this repeated dump complaining about dockerd.
      The default value of 2048 is nowhere near large enough.
      
      Using the prior patch "tracing: show size of requested buffer", we get
      "perf buffer not large enough, wanted 6644, have 6144", after repeated
      up-sizing (I did 2/4/6/8K). With 8K, the problem doesn't occur at all,
      so below is the trace for 6K.
      
      I'm wondering if this value should be selectable at boot time, but this
      is a good starting point.
      
      ```
      ------------[ cut here ]------------
      perf buffer not large enough, wanted 6644, have 6144
      WARNING: CPU: 1 PID: 4997 at kernel/trace/trace_event_perf.c:402 perf_trace_buf_alloc+0x8c/0xa0
      Modules linked in: [..]
      CPU: 1 PID: 4997 Comm: sh Tainted: G                T 5.13.13-x86_64-00039-gb3959163488e #63
      Hardware name: LENOVO 20KH002JUS/20KH002JUS, BIOS N23ET66W (1.41 ) 09/02/2019
      RIP: 0010:perf_trace_buf_alloc+0x8c/0xa0
      Code: 80 3d 43 97 d0 01 00 74 07 31 c0 5b 5d 41 5c c3 ba 00 18 00 00 89 ee 48 c7 c7 00 82 7d 91 c6 05 25 97 d0 01 01 e8 22 ee bc 00 <0f> 0b 31 c0 eb db 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 55 89
      RSP: 0018:ffffb922026b7d58 EFLAGS: 00010282
      RAX: 0000000000000000 RBX: ffff9da5ee012000 RCX: 0000000000000027
      RDX: ffff9da881657828 RSI: 0000000000000001 RDI: ffff9da881657820
      RBP: 00000000000019f4 R08: 0000000000000000 R09: ffffb922026b7b80
      R10: ffffb922026b7b78 R11: ffffffff91dda688 R12: 000000000000000f
      R13: ffff9da5ee012108 R14: ffff9da8816570a0 R15: ffffb922026b7e30
      FS:  00007f420db1a080(0000) GS:ffff9da881640000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000060 CR3: 00000002504a8006 CR4: 00000000003706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       kprobe_perf_func+0x11e/0x270
       ? do_execveat_common.isra.0+0x1/0x1c0
       ? do_execveat_common.isra.0+0x5/0x1c0
       kprobe_ftrace_handler+0x10e/0x1d0
       0xffffffffc03aa0c8
       ? do_execveat_common.isra.0+0x1/0x1c0
       do_execveat_common.isra.0+0x5/0x1c0
       __x64_sys_execve+0x33/0x40
       do_syscall_64+0x6b/0xc0
       ? do_syscall_64+0x11/0xc0
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f420dc1db37
      Code: ff ff 76 e7 f7 d8 64 41 89 00 eb df 0f 1f 80 00 00 00 00 f7 d8 64 41 89 00 eb dc 0f 1f 84 00 00 00 00 00 b8 3b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 01 43 0f 00 f7 d8 64 89 01 48
      RSP: 002b:00007ffd4e8b4e38 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f420dc1db37
      RDX: 0000564338d1e740 RSI: 0000564338d32d50 RDI: 0000564338d28f00
      RBP: 0000564338d28f00 R08: 0000564338d32d50 R09: 0000000000000020
      R10: 00000000000001b6 R11: 0000000000000246 R12: 0000564338d28f00
      R13: 0000564338d32d50 R14: 0000564338d1e740 R15: 0000564338d28c60
      ---[ end trace 83ab3e8e16275e49 ]---
      ```
      
      Link: https://lkml.kernel.org/r/20210831043723.13481-2-robbat2@gentoo.org
      
      
      
      Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      86ec0a85
    • tracing: Show size of requested perf buffer · 361f44bb
      Robin H. Johnson authored and Frieder Schrempf committed
      commit a90afe8d upstream.
      
      If the perf buffer isn't large enough, provide a hint about how large it
      needs to be for whatever is running.
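
      A small userspace sketch of such a size check, assuming a
      hypothetical helper demo_check_size(); the message text mirrors the
      "wanted ..., have ..." warning quoted in this series, while the
      function itself is illustrative and not the kernel's
      perf_trace_buf_alloc().

      ```c
      #include <assert.h>
      #include <stdio.h>
      #include <string.h>

      /*
       * Hypothetical helper: return 0 if `size` fits into a buffer of `max`
       * bytes, otherwise write a warning naming both numbers and return -1.
       */
      static int demo_check_size(int size, int max, char *msg, size_t msg_len)
      {
      	if (size <= max)
      		return 0;
      	snprintf(msg, msg_len,
      		 "perf buffer not large enough, wanted %d, have %d",
      		 size, max);
      	return -1;
      }
      ```

      Reporting the requested size next to the limit tells the reader how
      far the limit would have to be raised, instead of only that it was
      hit.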
      
      Link: https://lkml.kernel.org/r/20210831043723.13481-1-robbat2@gentoo.org
      
      
      
      Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
      361f44bb
    • net/mlx5e: Fix a race in command alloc flow · a9a3f551
      Shifeng Li authored and Frieder Schrempf committed
      
      commit 8f5100da upstream.
      
      Fix a cmd->ent use-after-free caused by a race on the command entry.
      The race occurs when one command releases its last refcount and frees
      its index and entry while another process running the command flush
      flow takes a refcount on the same command entry. The process handling
      the command flush may see this command as needing to be flushed if the
      other process has allocated an ent->idx but has not yet set ent in
      cmd->ent_arr in cmd_work_handler(). Fix it by moving the assignment of
      cmd->ent_arr inside the spin lock.
      
      [70013.081955] BUG: KASAN: use-after-free in mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
      [70013.081967] Write of size 4 at addr ffff88880b1510b4 by task kworker/26:1/1433361
      [70013.081968]
      [70013.082028] Workqueue: events aer_isr
      [70013.082053] Call Trace:
      [70013.082067]  dump_stack+0x8b/0xbb
      [70013.082086]  print_address_description+0x6a/0x270
      [70013.082102]  kasan_report+0x179/0x2c0
      [70013.082173]  mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
      [70013.082267]  mlx5_cmd_flush+0x80/0x180 [mlx5_core]
      [70013.082304]  mlx5_enter_error_state+0x106/0x1d0 [mlx5_core]
      [70013.082338]  mlx5_try_fast_unload+0x2ea/0x4d0 [mlx5_core]
      [70013.082377]  remove_one+0x200/0x2b0 [mlx5_core]
      [70013.082409]  pci_device_remove+0xf3/0x280
      [70013.082439]  device_release_driver_internal+0x1c3/0x470
      [70013.082453]  pci_stop_bus_device+0x109/0x160
      [70013.082468]  pci_stop_and_remove_bus_device+0xe/0x20
      [70013.082485]  pcie_do_fatal_recovery+0x167/0x550
      [70013.082493]  aer_isr+0x7d2/0x960
      [70013.082543]  process_one_work+0x65f/0x12d0
      [70013.082556]  worker_thread+0x87/0xb50
      [70013.082571]  kthread+0x2e9/0x3a0
      [70013.082592]  ret_from_fork+0x1f/0x40
      
      The logical relationship of this error is as follows:
      
                   aer_recover_work              |          ent->work
      -------------------------------------------+------------------------------
      aer_recover_work_func                      |
      |- pcie_do_recovery                        |
        |- report_error_detected                 |
          |- mlx5_pci_err_detected               |cmd_work_handler
            |- mlx5_enter_error_state            |  |- cmd_alloc_index
              |- enter_error_state               |    |- lock cmd->alloc_lock
                |- mlx5_cmd_flush                |    |- clear_bit
                  |- mlx5_cmd_trigger_completions|    |- unlock cmd->alloc_lock
                    |- lock cmd->alloc_lock      |
                    |- vector = ~dev->cmd.vars.bitmask
                    |- for_each_set_bit          |
                      |- cmd_ent_get(cmd->ent_arr[i]) (UAF)
                    |- unlock cmd->alloc_lock    |  |- cmd->ent_arr[ent->idx]=ent
      
      The cmd->ent_arr[ent->idx] assignment and the bit clearing are not
      protected by the cmd->alloc_lock in cmd_work_handler().
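
      The idea of the fix can be sketched in userspace C: claim the index
      and publish the entry inside one critical section, so the flush side
      never sees a busy slot without its entry. Everything here is a
      hypothetical stand-in (a C11 atomic_flag spin lock instead of
      cmd->alloc_lock, demo_* names instead of the mlx5 structures) and
      only models the ordering, not the driver.

      ```c
      #include <assert.h>
      #include <stdatomic.h>
      #include <stddef.h>

      #define DEMO_MAX_CMDS 8

      struct demo_ent {
      	int idx;
      	int refcnt;
      };

      struct demo_cmd {
      	atomic_flag alloc_lock;                  /* stand-in for cmd->alloc_lock */
      	unsigned long bitmask;                   /* set bit = free slot */
      	struct demo_ent *ent_arr[DEMO_MAX_CMDS]; /* stand-in for cmd->ent_arr */
      };

      static void demo_cmd_init(struct demo_cmd *cmd)
      {
      	int i;

      	atomic_flag_clear(&cmd->alloc_lock);
      	cmd->bitmask = (1UL << DEMO_MAX_CMDS) - 1;
      	for (i = 0; i < DEMO_MAX_CMDS; i++)
      		cmd->ent_arr[i] = NULL;
      }

      static void demo_lock(struct demo_cmd *cmd)
      {
      	while (atomic_flag_test_and_set_explicit(&cmd->alloc_lock,
      						 memory_order_acquire))
      		; /* spin */
      }

      static void demo_unlock(struct demo_cmd *cmd)
      {
      	atomic_flag_clear_explicit(&cmd->alloc_lock, memory_order_release);
      }

      /* Claim a free slot and publish the entry in one critical section. */
      static int demo_alloc_index(struct demo_cmd *cmd, struct demo_ent *ent)
      {
      	int i, ret = -1;

      	demo_lock(cmd);
      	for (i = 0; i < DEMO_MAX_CMDS; i++) {
      		if (cmd->bitmask & (1UL << i)) {
      			cmd->bitmask &= ~(1UL << i);
      			ent->idx = i;
      			cmd->ent_arr[i] = ent; /* visible before unlock */
      			ret = i;
      			break;
      		}
      	}
      	demo_unlock(cmd);
      	return ret;
      }

      /* Flush side: every slot marked busy must already have its entry. */
      static int demo_flush(struct demo_cmd *cmd)
      {
      	unsigned long vector;
      	int i, n = 0;

      	demo_lock(cmd);
      	vector = ~cmd->bitmask & ((1UL << DEMO_MAX_CMDS) - 1);
      	for (i = 0; i < DEMO_MAX_CMDS; i++) {
      		if (vector & (1UL << i)) {
      			assert(cmd->ent_arr[i] != NULL);
      			cmd->ent_arr[i]->refcnt++;
      			n++;
      		}
      	}
      	demo_unlock(cmd);
      	return n;
      }
      ```

      If the ent_arr assignment were done after demo_unlock(), a
      concurrent demo_flush() could run in the gap and find a busy bit
      whose slot is still empty or stale, which is the window shown in the
      diagram above.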
      
      Fixes: 50b2412b ("net/mlx5: Avoid possible free of command entry while timeout comp handler")
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a9a3f551
    • Revert "crypto: api - Disallow identical driver names" · 4bd33586
      Greg Kroah-Hartman authored and Frieder Schrempf committed
      This reverts commit 462c383e which is
      commit 27016f75 upstream.
      
      It is reported to cause problems in older kernels due to some crypto
      drivers having the same name, so revert it here to fix the problems.
      
      Link: https://lore.kernel.org/r/aceda6e2-cefb-4146-aef8-ff4bafa56e56@roeck-us.net
      
      
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Ovidiu Panait <ovidiu.panait@windriver.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4bd33586
    • serial: mxs-auart: add spinlock around changing cts state · 29aac7e0
      Emil Kronborg authored and Frieder Schrempf committed
      
      [ Upstream commit 54c4ec5f ]
      
      The uart_handle_cts_change() function in serial_core expects the caller
      to hold uport->lock. For example, I have seen the kernel splat below
      when the Bluetooth driver is loaded on an i.MX28 board.
      
          [   85.119255] ------------[ cut here ]------------
          [   85.124413] WARNING: CPU: 0 PID: 27 at /drivers/tty/serial/serial_core.c:3453 uart_handle_cts_change+0xb4/0xec
          [   85.134694] Modules linked in: hci_uart bluetooth ecdh_generic ecc wlcore_sdio configfs
          [   85.143314] CPU: 0 PID: 27 Comm: kworker/u3:0 Not tainted 6.6.3-00021-gd62a2f068f92 #1
          [   85.151396] Hardware name: Freescale MXS (Device Tree)
          [   85.156679] Workqueue: hci0 hci_power_on [bluetooth]
          (...)
          [   85.191765]  uart_handle_cts_change from mxs_auart_irq_handle+0x380/0x3f4
          [   85.198787]  mxs_auart_irq_handle from __handle_irq_event_percpu+0x88/0x210
          (...)
      
      Cc: stable@vger.kernel.org
      Fixes: 4d90bb14 ("serial: core: Document and assert lock requirements for irq helpers")
      Reviewed-by: Frank Li <Frank.Li@nxp.com>
      Signed-off-by: Emil Kronborg <emil.kronborg@protonmail.com>
      Link: https://lore.kernel.org/r/20240320121530.11348-1-emil.kronborg@protonmail.com

      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      29aac7e0
    • serial: core: Provide port lock wrappers · 53313b5b
      Thomas Gleixner authored and Frieder Schrempf committed
      
      [ Upstream commit b0af4bcb ]
      
      When a serial port is used for kernel console output, all
      modifications to the UART registers which are done from other contexts,
      e.g. getty or termios, are interference points for the kernel console.

      So far this has been ignored and the printk output is based on the
      principle of hope. The rework of the console infrastructure, which aims
      to support threaded and atomic consoles, requires marking sections which
      modify the UART registers as unsafe. This allows the atomic write function
      to make informed decisions and eventually to restore operational state. It
      also makes it possible to prevent the regular UART code from modifying
      UART registers while printk output is in progress.
      
      All modifications of UART registers are guarded by the UART port lock,
      which provides an obvious synchronization point with the console
      infrastructure.
      
      Provide wrapper functions for spin_[un]lock*(port->lock) invocations so
      that the console mechanics can be applied later on in a single place
      instead of copying the same logic all over the drivers.
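
      The wrapper idea can be sketched in userspace C as follows. The
      demo_* names, the atomic_flag spin lock and the lock_ops counter are
      hypothetical stand-ins; the counter merely marks the single place
      where extra bookkeeping could be hooked in later.

      ```c
      #include <assert.h>
      #include <stdatomic.h>

      struct demo_port {
      	atomic_flag lock;      /* stand-in for the port spinlock */
      	unsigned int reg;      /* stand-in for a UART register */
      	unsigned int lock_ops; /* counts trips through the wrapper */
      };

      static void demo_port_init(struct demo_port *p)
      {
      	atomic_flag_clear(&p->lock);
      	p->reg = 0;
      	p->lock_ops = 0;
      }

      /*
       * Every acquisition goes through this one function, so additional
       * logic can be added here without touching any caller.
       */
      static void demo_port_lock(struct demo_port *p)
      {
      	while (atomic_flag_test_and_set_explicit(&p->lock,
      						 memory_order_acquire))
      		; /* spin */
      	p->lock_ops++;
      }

      static void demo_port_unlock(struct demo_port *p)
      {
      	atomic_flag_clear_explicit(&p->lock, memory_order_release);
      }

      /* A "driver" path that modifies the register only under the port lock. */
      static void demo_set_reg(struct demo_port *p, unsigned int val)
      {
      	demo_port_lock(p);
      	p->reg = val;
      	demo_port_unlock(p);
      }
      ```

      Callers such as demo_set_reg() never touch the lock directly, so a
      later change to the locking rules only needs to edit the two wrapper
      functions.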
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Signed-off-by: John Ogness <john.ogness@linutronix.de>
      Link: https://lore.kernel.org/r/20230914183831.587273-2-john.ogness@linutronix.de

      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Stable-dep-of: 54c4ec5f ("serial: mxs-auart: add spinlock around changing cts state")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      53313b5b
    • af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc(). · d8908b93
      Kuniyuki Iwashima authored and Frieder Schrempf committed
      
      [ Upstream commit 1971d13f ]
      
      syzbot reported a lockdep splat regarding unix_gc_lock and
      unix_state_lock().
      
      One is called from recvmsg() for a connected socket, and the other
      is called from GC for a TCP_LISTEN socket.

      So, the splat is a false positive.

      Let's add a dedicated lock class for the latter to suppress the splat.

      Note that this change is not necessary for net-next.git as the issue
      only applies to the old GC implementation.
      
      [0]:
      WARNING: possible circular locking dependency detected
      6.9.0-rc5-syzkaller-00007-g4d2008430ce8 #0 Not tainted
       -----------------------------------------------------
      kworker/u8:1/11 is trying to acquire lock:
      ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
      ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: __unix_gc+0x40e/0xf70 net/unix/garbage.c:302
      
      but task is already holding lock:
      ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
      ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
       -> #1 (unix_gc_lock){+.+.}-{2:2}:
             lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
             __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
             _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
             spin_lock include/linux/spinlock.h:351 [inline]
             unix_notinflight+0x13d/0x390 net/unix/garbage.c:140
             unix_detach_fds net/unix/af_unix.c:1819 [inline]
             unix_destruct_scm+0x221/0x350 net/unix/af_unix.c:1876
             skb_release_head_state+0x100/0x250 net/core/skbuff.c:1188
             skb_release_all net/core/skbuff.c:1200 [inline]
             __kfree_skb net/core/skbuff.c:1216 [inline]
             kfree_skb_reason+0x16d/0x3b0 net/core/skbuff.c:1252
             kfree_skb include/linux/skbuff.h:1262 [inline]
             manage_oob net/unix/af_unix.c:2672 [inline]
             unix_stream_read_generic+0x1125/0x2700 net/unix/af_unix.c:2749
             unix_stream_splice_read+0x239/0x320 net/unix/af_unix.c:2981
             do_splice_read fs/splice.c:985 [inline]
             splice_file_to_pipe+0x299/0x500 fs/splice.c:1295
             do_splice+0xf2d/0x1880 fs/splice.c:1379
             __do_splice fs/splice.c:1436 [inline]
             __do_sys_splice fs/splice.c:1652 [inline]
             __se_sys_splice+0x331/0x4a0 fs/splice.c:1634
             do_syscall_x64 arch/x86/entry/common.c:52 [inline]
             do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
             entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
       -> #0 (&u->lock){+.+.}-{2:2}:
             check_prev_add kernel/locking/lockdep.c:3134 [inline]
             check_prevs_add kernel/locking/lockdep.c:3253 [inline]
             validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
             __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
             lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
             __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
             _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
             spin_lock include/linux/spinlock.h:351 [inline]
             __unix_gc+0x40e/0xf70 net/unix/garbage.c:302
             process_one_work kernel/workqueue.c:3254 [inline]
             process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335
             worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
             kthread+0x2f0/0x390 kernel/kthread.c:388
             ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
             ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(unix_gc_lock);
                                     lock(&u->lock);
                                     lock(unix_gc_lock);
        lock(&u->lock);
      
       *** DEADLOCK ***
      
      3 locks held by kworker/u8:1/11:
       #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3229 [inline]
       #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_scheduled_works+0x8e0/0x17c0 kernel/workqueue.c:3335
       #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3230 [inline]
       #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_scheduled_works+0x91b/0x17c0 kernel/workqueue.c:3335
       #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
       #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261
      
      stack backtrace:
      CPU: 0 PID: 11 Comm: kworker/u8:1 Not tainted 6.9.0-rc5-syzkaller-00007-g4d2008430ce8 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
      Workqueue: events_unbound __unix_gc
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
       check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
       __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
       spin_lock include/linux/spinlock.h:351 [inline]
       __unix_gc+0x40e/0xf70 net/unix/garbage.c:302
       process_one_work kernel/workqueue.c:3254 [inline]
       process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335
       worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
       kthread+0x2f0/0x390 kernel/kthread.c:388
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
       </TASK>
      
      Fixes: 47d8ac01 ("af_unix: Fix garbage collector racing against connect()")
      Reported-and-tested-by: <syzbot+fa379358c28cc87cc307@syzkaller.appspotmail.com>
      Closes: https://syzkaller.appspot.com/bug?extid=fa379358c28cc87cc307

      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20240424170443.9832-1-kuniyu@amazon.com

      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d8908b93
    • net: ethernet: ti: am65-cpts: Fix PTPv1 message type on TX packets · ee961a24
      Jason Reeder authored and Frieder Schrempf committed
      
      [ Upstream commit 1b9e743e ]
      
      The CPTS, by design, captures the messageType (Sync, Delay_Req, etc.)
      field from the second nibble of the PTP header which is defined in the
      PTPv2 (1588-2008) specification. In the PTPv1 (1588-2002) specification
      the first two bytes of the PTP header are defined as the versionType
      which is always 0x0001. This means that any PTPv1 packets that are
      tagged for TX timestamping by the CPTS will have their messageType set
      to 0x0 which corresponds to a Sync message type. This causes issues
      when a PTPv1 stack is expecting a Delay_Req (messageType: 0x1)
      timestamp that never appears.
      
      Fix this by checking if the ptp_class of the timestamped TX packet is
      PTP_CLASS_V1 and then matching the PTP sequence ID to the stored
      sequence ID in the skb->cb data structure. If the sequence IDs match
      and the packet is of type PTPv1 then there is a chance that the
      messageType has been incorrectly stored by the CPTS so overwrite the
      messageType stored by the CPTS with the messageType from the skb->cb
      data structure. This allows the PTPv1 stack to receive TX timestamps
      for Delay_Req packets which are necessary to lock onto a PTP Leader.
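
      The selection rule described above can be sketched as a small pure
      function. The struct layouts, the DEMO_PTP_CLASS_* values and the
      helper name are hypothetical and only model the decision, not the
      am65-cpts driver code.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Hypothetical class values, for illustration only. */
      #define DEMO_PTP_CLASS_V1 0x01u
      #define DEMO_PTP_CLASS_V2 0x02u

      /* What the stack stored for the packet it asked a timestamp for. */
      struct demo_skb_cb {
      	unsigned int ptp_class;
      	uint16_t seqid;
      	uint8_t msgtype;
      };

      /* What the timestamping hardware reported for the TX event. */
      struct demo_cpts_event {
      	uint16_t seqid;
      	uint8_t msgtype;
      };

      /*
       * For a PTPv1 packet with a matching sequence ID, trust the message
       * type stored with the packet; otherwise keep the hardware's value.
       */
      static uint8_t demo_fixup_msgtype(const struct demo_cpts_event *ev,
      				  const struct demo_skb_cb *cb)
      {
      	if (cb->ptp_class == DEMO_PTP_CLASS_V1 && ev->seqid == cb->seqid)
      		return cb->msgtype;
      	return ev->msgtype;
      }
      ```

      With a PTPv1 Delay_Req (messageType 0x1) whose event was captured
      as 0x0, the helper returns 0x1, so the timestamp can be matched to
      the packet the stack is waiting for.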
      
      Signed-off-by: Jason Reeder <jreeder@ti.com>
      Signed-off-by: Ravi Gunasekaran <r-gunasekaran@ti.com>
      Tested-by: Ed Trexel <ed.trexel@hp.com>
      Fixes: f6bd5952 ("net: ethernet: ti: introduce am654 common platform time sync driver")
      Link: https://lore.kernel.org/r/20240424071626.32558-1-r-gunasekaran@ti.com

      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ee961a24
    • iavf: Fix TC config comparison with existing adapter TC config · 6dc1abf8
      Sudheer Mogilappagari authored and Frieder Schrempf committed
      
      [ Upstream commit 54976cf5 ]
      
      The same number of TCs doesn't imply that the underlying TC configs
      are the same. The configs could differ in the number of queues in
      each TC. Add a utility function to determine whether the TC configs
      are the same.
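
      A minimal sketch of such a comparison, assuming a hypothetical
      struct demo_tc_cfg that holds only the TC count and the per-TC queue
      counts; the driver's real structures and utility function are not
      reproduced here.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      #define DEMO_MAX_TC 8

      /* Hypothetical, reduced view of a TC configuration. */
      struct demo_tc_cfg {
      	unsigned int num_tc;
      	unsigned int count[DEMO_MAX_TC]; /* queues in each TC */
      };

      /* Two configs match only if the TC count and every queue count match. */
      static bool demo_tc_cfg_same(const struct demo_tc_cfg *a,
      			     const struct demo_tc_cfg *b)
      {
      	unsigned int i;

      	if (a->num_tc != b->num_tc)
      		return false;
      	for (i = 0; i < a->num_tc && i < DEMO_MAX_TC; i++) {
      		if (a->count[i] != b->count[i])
      			return false;
      	}
      	return true;
      }
      ```

      Two configs with 2 TCs each but queue splits of 4/4 and 2/6 compare
      as different here, whereas a check on num_tc alone would treat them
      as identical.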
      
      Fixes: d5b33d02 ("i40evf: add ndo_setup_tc callback to i40evf")
      Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
      Tested-by: Mineri Bhange <minerix.bhange@intel.com> (A Contingent Worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20240423182723.740401-4-anthony.l.nguyen@intel.com

      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6dc1abf8
    • i40e: Report MFS in decimal base instead of hex · 388a759b
      Erwan Velu authored and Frieder Schrempf committed
      
      [ Upstream commit ef3c3131 ]
      
      If the MFS is set below the default (0x2600), a warning message like
      the following is reported:
      
      	MFS for port 1 has been set below the default: 600
      
      This message is a bit confusing, as the number shown here (600) is in
      fact a hexadecimal number: 0x600 = 1536.

      Without an explicit "0x" prefix, the message reads as if the MFS were
      set to 600 bytes.

      MFS values, like MTUs, are usually expressed in decimal.

      This commit reports both the current and default MFS values in decimal
      so it's less confusing for end users.

      A typical warning message looks like the following:
      
      	MFS for port 1 (1536) has been set below the default (9728)
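
      The difference between the two messages comes down to the format
      specifier. The userspace sketch below reproduces both strings with
      snprintf(); the helper names and the DEMO_DEFAULT_MFS macro are
      hypothetical, while the values 0x600 and 0x2600 are the ones quoted
      above.

      ```c
      #include <assert.h>
      #include <stdio.h>
      #include <string.h>

      #define DEMO_DEFAULT_MFS 0x2600u /* default named in the commit message */

      /* Old style: the value is printed with %x and no "0x" prefix. */
      static void demo_warn_hex(char *buf, size_t len, int port, unsigned int mfs)
      {
      	snprintf(buf, len,
      		 "MFS for port %d has been set below the default: %x",
      		 port, mfs);
      }

      /* New style: current and default values are both printed in decimal. */
      static void demo_warn_dec(char *buf, size_t len, int port, unsigned int mfs)
      {
      	snprintf(buf, len,
      		 "MFS for port %d (%u) has been set below the default (%u)",
      		 port, mfs, DEMO_DEFAULT_MFS);
      }
      ```

      Called with port 1 and an MFS of 0x600, the first helper yields the
      ambiguous "600" and the second yields "1536" next to the default
      "9728".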
      
      Signed-off-by: Erwan Velu <e.velu@criteo.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Tested-by: Tony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Fixes: 3a2c6ced ("i40e: Add a check to see if MFS is set")
      Link: https://lore.kernel.org/r/20240423182723.740401-3-anthony.l.nguyen@intel.com

      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      388a759b
    • i40e: Do not use WQ_MEM_RECLAIM flag for workqueue · af3e208a
      Sindhu Devale authored and Frieder Schrempf committed
      
      [ Upstream commit 2cc7d150 ]
      
      Issue reported by a customer during SRIOV testing (call trace below):
      when both the i40e and the i40iw drivers are loaded, a warning
      in check_flush_dependency is triggered. This seems to be because
      the i40e driver workqueue is allocated with the WQ_MEM_RECLAIM
      flag, and the i40iw one is not.

      A similar error was encountered on ice too, and it was fixed by
      removing the flag. Do the same for i40e.
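
      The condition behind this warning can be sketched as a simplified
      predicate. The struct, the flag value and the helper are
      hypothetical, and the kernel's actual check covers more cases than
      this reduced model.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      #define DEMO_WQ_MEM_RECLAIM 0x1u /* hypothetical flag value */

      struct demo_wq {
      	const char *name;
      	unsigned int flags;
      };

      /*
       * Simplified rule: a workqueue marked as reclaim-safe should not wait
       * on (flush) a workqueue that lacks the same guarantee.
       */
      static bool demo_flush_would_warn(const struct demo_wq *flusher,
      				  const struct demo_wq *target)
      {
      	return (flusher->flags & DEMO_WQ_MEM_RECLAIM) &&
      	       !(target->flags & DEMO_WQ_MEM_RECLAIM);
      }
      ```

      In this model, dropping the flag from the flushing workqueue makes
      the predicate false, which corresponds to the approach taken for
      i40e here.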
      
      [Feb 9 09:08] ------------[ cut here ]------------
      [  +0.000004] workqueue: WQ_MEM_RECLAIM i40e:i40e_service_task [i40e] is
      flushing !WQ_MEM_RECLAIM infiniband:0x0
      [  +0.000060] WARNING: CPU: 0 PID: 937 at kernel/workqueue.c:2966
      check_flush_dependency+0x10b/0x120
      [  +0.000007] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq
      snd_timer snd_seq_device snd soundcore nls_utf8 cifs cifs_arc4
      nls_ucs2_utils rdma_cm iw_cm ib_cm cifs_md4 dns_resolver netfs qrtr
      rfkill sunrpc vfat fat intel_rapl_msr intel_rapl_common irdma
      intel_uncore_frequency intel_uncore_frequency_common ice ipmi_ssif
      isst_if_common skx_edac nfit libnvdimm x86_pkg_temp_thermal
      intel_powerclamp gnss coretemp ib_uverbs rapl intel_cstate ib_core
      iTCO_wdt iTCO_vendor_support acpi_ipmi mei_me ipmi_si intel_uncore
      ioatdma i2c_i801 joydev pcspkr mei ipmi_devintf lpc_ich
      intel_pch_thermal i2c_smbus ipmi_msghandler acpi_power_meter acpi_pad
      xfs libcrc32c ast sd_mod drm_shmem_helper t10_pi drm_kms_helper sg ixgbe
      drm i40e ahci crct10dif_pclmul libahci crc32_pclmul igb crc32c_intel
      libata ghash_clmulni_intel i2c_algo_bit mdio dca wmi dm_mirror
      dm_region_hash dm_log dm_mod fuse
      [  +0.000050] CPU: 0 PID: 937 Comm: kworker/0:3 Kdump: loaded Not
      tainted 6.8.0-rc2-Feb-net_dev-Qiueue-00279-gbd43c5687e05 #1
      [  +0.000003] Hardware name: Intel Corporation S2600BPB/S2600BPB, BIOS
      SE5C620.86B.02.01.0013.121520200651 12/15/2020
      [  +0.000001] Workqueue: i40e i40e_service_task [i40e]
      [  +0.000024] RIP: 0010:check_flush_dependency+0x10b/0x120
      [  +0.000003] Code: ff 49 8b 54 24 18 48 8d 8b b0 00 00 00 49 89 e8 48
      81 c6 b0 00 00 00 48 c7 c7 b0 97 fa 9f c6 05 8a cc 1f 02 01 e8 35 b3 fd
      ff <0f> 0b e9 10 ff ff ff 80 3d 78 cc 1f 02 00 75 94 e9 46 ff ff ff 90
      [  +0.000002] RSP: 0018:ffffbd294976bcf8 EFLAGS: 00010282
      [  +0.000002] RAX: 0000000000000000 RBX: ffff94d4c483c000 RCX:
      0000000000000027
      [  +0.000001] RDX: ffff94d47f620bc8 RSI: 0000000000000001 RDI:
      ffff94d47f620bc0
      [  +0.000001] RBP: 0000000000000000 R08: 0000000000000000 R09:
      00000000ffff7fff
      [  +0.000001] R10: ffffbd294976bb98 R11: ffffffffa0be65e8 R12:
      ffff94c5451ea180
      [  +0.000001] R13: ffff94c5ab5e8000 R14: ffff94c5c20b6e05 R15:
      ffff94c5f1330ab0
      [  +0.000001] FS:  0000000000000000(0000) GS:ffff94d47f600000(0000)
      knlGS:0000000000000000
      [  +0.000002] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  +0.000001] CR2: 00007f9e6f1fca70 CR3: 0000000038e20004 CR4:
      00000000007706f0
      [  +0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [  +0.000001] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
      0000000000000400
      [  +0.000001] PKRU: 55555554
      [  +0.000001] Call Trace:
      [  +0.000001]  <TASK>
      [  +0.000002]  ? __warn+0x80/0x130
      [  +0.000003]  ? check_flush_dependency+0x10b/0x120
      [  +0.000002]  ? report_bug+0x195/0x1a0
      [  +0.000005]  ? handle_bug+0x3c/0x70
      [  +0.000003]  ? exc_invalid_op+0x14/0x70
      [  +0.000002]  ? asm_exc_invalid_op+0x16/0x20
      [  +0.000006]  ? check_flush_dependency+0x10b/0x120
      [  +0.000002]  ? check_flush_dependency+0x10b/0x120
      [  +0.000002]  __flush_workqueue+0x126/0x3f0
      [  +0.000015]  ib_cache_cleanup_one+0x1c/0xe0 [ib_core]
      [  +0.000056]  __ib_unregister_device+0x6a/0xb0 [ib_core]
      [  +0.000023]  ib_unregister_device_and_put+0x34/0x50 [ib_core]
      [  +0.000020]  i40iw_close+0x4b/0x90 [irdma]
      [  +0.000022]  i40e_notify_client_of_netdev_close+0x54/0xc0 [i40e]
      [  +0.000035]  i40e_service_task+0x126/0x190 [i40e]
      [  +0.000024]  process_one_work+0x174/0x340
      [  +0.000003]  worker_thread+0x27e/0x390
      [  +0.000001]  ? __pfx_worker_thread+0x10/0x10
      [  +0.000002]  kthread+0xdf/0x110
      [  +0.000002]  ? __pfx_kthread+0x10/0x10
      [  +0.000002]  ret_from_fork+0x2d/0x50
      [  +0.000003]  ? __pfx_kthread+0x10/0x10
      [  +0.000001]  ret_from_fork_asm+0x1b/0x30
      [  +0.000004]  </TASK>
      [  +0.000001] ---[ end trace 0000000000000000 ]---
      
      Fixes: 4d5957cb ("i40e: remove WQ_UNBOUND and the task limit of our workqueue")
      Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
      Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
      Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Tested-by: Robert Ganzynkowicz <robert.ganzynkowicz@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20240423182723.740401-2-anthony.l.nguyen@intel.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      af3e208a
    • netfilter: nf_tables: honor table dormant flag from netdev release event path · cbe3aa40
      Pablo Neira Ayuso authored and Frieder Schrempf committed
      
      [ Upstream commit 8e30abc9 ]
      
      Check for the table dormant flag; otherwise the netdev release event
      path tries to unregister an already unregistered hook.
      
      [524854.857999] ------------[ cut here ]------------
      [524854.858010] WARNING: CPU: 0 PID: 3386599 at net/netfilter/core.c:501 __nf_unregister_net_hook+0x21a/0x260
      [...]
      [524854.858848] CPU: 0 PID: 3386599 Comm: kworker/u32:2 Not tainted 6.9.0-rc3+ #365
      [524854.858869] Workqueue: netns cleanup_net
      [524854.858886] RIP: 0010:__nf_unregister_net_hook+0x21a/0x260
      [524854.858903] Code: 24 e8 aa 73 83 ff 48 63 43 1c 83 f8 01 0f 85 3d ff ff ff e8 98 d1 f0 ff 48 8b 3c 24 e8 8f 73 83 ff 48 63 43 1c e9 26 ff ff ff <0f> 0b 48 83 c4 18 48 c7 c7 00 68 e9 82 5b 5d 41 5c 41 5d 41 5e 41
      [524854.858914] RSP: 0018:ffff8881e36d79e0 EFLAGS: 00010246
      [524854.858926] RAX: 0000000000000000 RBX: ffff8881339ae790 RCX: ffffffff81ba524a
      [524854.858936] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff8881c8a16438
      [524854.858945] RBP: ffff8881c8a16438 R08: 0000000000000001 R09: ffffed103c6daf34
      [524854.858954] R10: ffff8881e36d79a7 R11: 0000000000000000 R12: 0000000000000005
      [524854.858962] R13: ffff8881c8a16000 R14: 0000000000000000 R15: ffff8881351b5a00
      [524854.858971] FS:  0000000000000000(0000) GS:ffff888390800000(0000) knlGS:0000000000000000
      [524854.858982] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [524854.858991] CR2: 00007fc9be0f16f4 CR3: 00000001437cc004 CR4: 00000000001706f0
      [524854.859000] Call Trace:
      [524854.859006]  <TASK>
      [524854.859013]  ? __warn+0x9f/0x1a0
      [524854.859027]  ? __nf_unregister_net_hook+0x21a/0x260
      [524854.859044]  ? report_bug+0x1b1/0x1e0
      [524854.859060]  ? handle_bug+0x3c/0x70
      [524854.859071]  ? exc_invalid_op+0x17/0x40
      [524854.859083]  ? asm_exc_invalid_op+0x1a/0x20
      [524854.859100]  ? __nf_unregister_net_hook+0x6a/0x260
      [524854.859116]  ? __nf_unregister_net_hook+0x21a/0x260
      [524854.859135]  nf_tables_netdev_event+0x337/0x390 [nf_tables]
      [524854.859304]  ? __pfx_nf_tables_netdev_event+0x10/0x10 [nf_tables]
      [524854.859461]  ? packet_notifier+0xb3/0x360
      [524854.859476]  ? _raw_spin_unlock_irqrestore+0x11/0x40
      [524854.859489]  ? dcbnl_netdevice_event+0x35/0x140
      [524854.859507]  ? __pfx_nf_tables_netdev_event+0x10/0x10 [nf_tables]
      [524854.859661]  notifier_call_chain+0x7d/0x140
      [524854.859677]  unregister_netdevice_many_notify+0x5e1/0xae0
      
      Fixes: d54725cd ("netfilter: nf_tables: support for multiple devices per netdev hook")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      cbe3aa40
    • mlxsw: spectrum_acl_tcam: Fix memory leak when canceling rehash work · 2150b4ab
      Ido Schimmel authored and Frieder Schrempf committed
      
      [ Upstream commit fb4e2b70 ]
      
      The rehash delayed work is rescheduled with a delay if the number of
      credits at the end of the work is not negative, as supposedly this means
      that the migration ended. Otherwise, it is rescheduled immediately.
      
      After "mlxsw: spectrum_acl_tcam: Fix possible use-after-free during
      rehash" the above is no longer accurate as a non-negative number of
      credits is no longer indicative of the migration being done. It can also
      happen if the work encountered an error in which case the migration will
      resume the next time the work is scheduled.
      
      The significance of the above is that it is possible for the work to be
      pending and associated with hints that were allocated when the migration
      started. This leads to the hints being leaked [1] when the work is
      canceled while pending as part of ACL region dismantle.
      
      Fix by freeing the hints if hints are associated with a work that was
      canceled while pending.
      
      Blame the original commit since the reliance on not having a pending
      work associated with hints is fragile.
      
      [1]
      unreferenced object 0xffff88810e7c3000 (size 256):
        comm "kworker/0:16", pid 176, jiffies 4295460353
        hex dump (first 32 bytes):
          00 30 95 11 81 88 ff ff 61 00 00 00 00 00 00 80  .0......a.......
          00 00 61 00 40 00 00 00 00 00 00 00 04 00 00 00  ..a.@...........
        backtrace (crc 2544ddb9):
          [<00000000cf8cfab3>] kmalloc_trace+0x23f/0x2a0
          [<000000004d9a1ad9>] objagg_hints_get+0x42/0x390
          [<000000000b143cf3>] mlxsw_sp_acl_erp_rehash_hints_get+0xca/0x400
          [<0000000059bdb60a>] mlxsw_sp_acl_tcam_vregion_rehash_work+0x868/0x1160
          [<00000000e81fd734>] process_one_work+0x59c/0xf20
          [<00000000ceee9e81>] worker_thread+0x799/0x12c0
          [<00000000bda6fe39>] kthread+0x246/0x300
          [<0000000070056d23>] ret_from_fork+0x34/0x70
          [<00000000dea2b93e>] ret_from_fork_asm+0x1a/0x30
      
      Fixes: c9c9af91 ("mlxsw: spectrum_acl: Allow to interrupt/continue rehash work")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Tested-by: Alexander Zubkov <green@qrator.net>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/0cc12ebb07c4d4c41a1265ee2c28b392ff997a86.1713797103.git.petrm@nvidia.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2150b4ab
    • mlxsw: spectrum_acl_tcam: Fix incorrect list API usage · c8f9d16c
      Ido Schimmel authored and Frieder Schrempf committed
      
      [ Upstream commit b377add0 ]
      
      Both the function that migrates all the chunks within a region and the
      function that migrates all the entries within a chunk call
      list_first_entry() on the respective lists without checking that the
      lists are not empty. This is incorrect usage of the API, which leads to
      the following warning [1].
      
      Fix by returning if the lists are empty as there is nothing to migrate
      in this case.
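
The emptiness check described above can be sketched in user space as follows. This is a simplified stand-in for the kernel list API, not the driver code: `struct entry`, `list_append` and `first_entry_or_null` are names assumed for this example. On an empty circular list the head's `next` pointer refers back to the head itself, so converting it to a containing structure yields a pointer to something that is not an entry.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static int list_is_empty(const struct list_head *head)
{
    return head->next == head;
}

static void list_append(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Hypothetical payload type; 'node' links it into a list. */
struct entry {
    int id;
    struct list_head node;
};

/*
 * Return the first entry, or NULL for an empty list. Without the emptiness
 * check, head->next would point back at the head itself, and the pointer
 * arithmetic below would produce a bogus 'struct entry' pointer.
 */
static struct entry *first_entry_or_null(struct list_head *head)
{
    assert(head != NULL);
    if (list_is_empty(head))
        return NULL;
    return (struct entry *)((char *)head->next - offsetof(struct entry, node));
}
```

A caller that has nothing to migrate can then return early when the helper yields NULL, which mirrors the "return if the lists are empty" approach in the fix.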
      
      [1]
      WARNING: CPU: 0 PID: 6437 at drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c:1266 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x1f1/0>
      Modules linked in:
      CPU: 0 PID: 6437 Comm: kworker/0:37 Not tainted 6.9.0-rc3-custom-00883-g94a65f079ef6 #39
      Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
      Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
      RIP: 0010:mlxsw_sp_acl_tcam_vchunk_migrate_all+0x1f1/0x2c0
      [...]
      Call Trace:
       <TASK>
       mlxsw_sp_acl_tcam_vregion_rehash_work+0x6c/0x4a0
       process_one_work+0x151/0x370
       worker_thread+0x2cb/0x3e0
       kthread+0xd0/0x100
       ret_from_fork+0x34/0x50
       ret_from_fork_asm+0x1a/0x30
       </TASK>
      
      Fixes: 6f9579d4 ("mlxsw: spectrum_acl: Remember where to continue rehash migration")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Tested-by: Alexander Zubkov <green@qrator.net>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/4628e9a22d1d84818e28310abbbc498e7bc31bc9.1713797103.git.petrm@nvidia.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c8f9d16c
    • mlxsw: spectrum_acl_tcam: Fix warning during rehash · a8b286cf
      Ido Schimmel authored and Frieder Schrempf committed
      
      [ Upstream commit 743edc85 ]
      
      As previously explained, the rehash delayed work migrates filters from
      one region to another. This is done by iterating over all chunks (all
      the filters with the same priority) in the region and in each chunk
      iterating over all the filters.
      
      When the work runs out of credits it stores the current chunk and entry
      as markers in the per-work context so that it would know where to resume
      the migration from the next time the work is scheduled.
      
      Upon error, the chunk marker is reset to NULL, but without resetting the
      entry markers despite being relative to it. This can result in migration
      being resumed from an entry that does not belong to the chunk being
      migrated. In turn, this will eventually lead to a chunk being iterated
      over as if it is an entry. Because of how the two structures happen to
      be defined, this does not lead to KASAN splats, but to warnings such as
      [1].
      
      Fix by creating a helper that resets all the markers and calling it from
      all the places that currently only reset the chunk marker. For good
      measure, also call it when starting a completely new rehash. Add a
      warning to avoid future cases.
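
The idea of resetting all markers together can be sketched as below. The structure and field names (`rehash_ctx`, `current_chunk`, `start_entry`, `stop_entry`) and the helper name are assumptions made for this example and are not taken from the driver source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-work context holding the resume markers. */
struct rehash_ctx {
    void *current_chunk; /* chunk being migrated, NULL if none */
    void *start_entry;   /* entry to resume from, relative to current_chunk */
    void *stop_entry;    /* entry to stop at, relative to current_chunk */
};

/* Reset every marker together so no entry marker outlives its chunk. */
static void rehash_ctx_reset(struct rehash_ctx *ctx)
{
    assert(ctx != NULL);
    ctx->current_chunk = NULL;
    ctx->start_entry = NULL;
    ctx->stop_entry = NULL;
}
```

Routing every reset through one helper means an error path cannot clear the chunk marker while leaving a stale entry marker behind.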
      
      [1]
      WARNING: CPU: 7 PID: 1076 at drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c:407 mlxsw_afk_encode+0x242/0x2f0
      Modules linked in:
      CPU: 7 PID: 1076 Comm: kworker/7:24 Tainted: G        W          6.9.0-rc3-custom-00880-g29e61d91b77b #29
      Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
      Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
      RIP: 0010:mlxsw_afk_encode+0x242/0x2f0
      [...]
      Call Trace:
       <TASK>
       mlxsw_sp_acl_atcam_entry_add+0xd9/0x3c0
       mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
       mlxsw_sp_acl_tcam_vchunk_migrate_all+0x109/0x290
       mlxsw_sp_acl_tcam_vregion_rehash_work+0x6c/0x470
       process_one_work+0x151/0x370
       worker_thread+0x2cb/0x3e0
       kthread+0xd0/0x100
       ret_from_fork+0x34/0x50
       </TASK>
      
      Fixes: 6f9579d4 ("mlxsw: spectrum_acl: Remember where to continue rehash migration")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Tested-by: Alexander Zubkov <green@qrator.net>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/cc17eed86b41dd829d39b07906fec074a9ce580e.1713797103.git.petrm@nvidia.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a8b286cf
    • mlxsw: spectrum_acl_tcam: Fix memory leak during rehash · f885f8f0
      Ido Schimmel authored and Frieder Schrempf committed
      
      [ Upstream commit 8ca3f7a7 ]
      
      The rehash delayed work migrates filters from one region to another.
      This is done by iterating over all chunks (all the filters with the same
      priority) in the region and in each chunk iterating over all the
      filters.
      
      If the migration fails, the code tries to migrate the filters back to
      the old region. However, the rollback itself can also fail in which case
      another migration will be erroneously performed. Besides the fact that
      this ping pong is not a very good idea, it also creates a problem.
      
      Each virtual chunk references two chunks: The currently used one
      ('vchunk->chunk') and a backup ('vchunk->chunk2'). During migration the
      first holds the chunk we want to migrate filters to and the second holds
      the chunk we are migrating filters from.
      
      The code currently assumes - but does not verify - that the backup chunk
      does not exist (NULL) if the currently used chunk does not reference the
      target region. This assumption breaks when we are trying to roll back a
      rollback, resulting in the backup chunk being overwritten and leaked
      [1].
      
      Fix by not rolling back a failed rollback and add a warning to avoid
      future cases.
      
      [1]
      WARNING: CPU: 5 PID: 1063 at lib/parman.c:291 parman_destroy+0x17/0x20
      Modules linked in:
      CPU: 5 PID: 1063 Comm: kworker/5:11 Tainted: G        W          6.9.0-rc2-custom-00784-gc6a05c468a0b #14
      Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
      Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
      RIP: 0010:parman_destroy+0x17/0x20
      [...]
      Call Trace:
       <TASK>
       mlxsw_sp_acl_atcam_region_fini+0x19/0x60
       mlxsw_sp_acl_tcam_region_destroy+0x49/0xf0
       mlxsw_sp_acl_tcam_vregion_rehash_work+0x1f1/0x470
       process_one_work+0x151/0x370
       worker_thread+0x2cb/0x3e0
       kthread+0xd0/0x100
       ret_from_fork+0x34/0x50
       ret_from_fork_asm+0x1a/0x30
       </TASK>
      
      Fixes: 84350051 ("mlxsw: spectrum_acl: Do rollback as another call to mlxsw_sp_acl_tcam_vchunk_migrate_all()")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Tested-by: Alexander Zubkov <green@qrator.net>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/d5edd4f4503934186ae5cfe268503b16345b4e0f.1713797103.git.petrm@nvidia.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f885f8f0
    • mlxsw: spectrum_acl_tcam: Rate limit error message · 50415ced
      Ido Schimmel authored and Frieder Schrempf committed
      
      [ Upstream commit 5bcf9255 ]
      
      In the rare cases when the device resources are exhausted it is likely
      that the rehash delayed work will fail. An error message will be printed
      whenever this happens which can be overwhelming considering the fact
      that the work is per-region and that there can be hundreds of regions.
      
      Fix by rate limiting the error message.
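
To illustrate what rate limiting a message means, here is a small user-space sketch of a windowed limiter that allows at most `burst` messages per `interval`. It is not the kernel's rate-limit implementation; the type, the function name and the caller-supplied `now` timestamp are assumptions made to keep the example deterministic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limiter state: at most 'burst' messages per 'interval'. */
struct ratelimit {
    long interval;     /* window length, in the caller's time unit */
    int burst;         /* messages allowed per window */
    long window_start; /* start time of the current window */
    int printed;       /* messages already allowed in this window */
};

/* Return 1 if a message may be printed at time 'now', 0 to suppress it. */
static int ratelimit_allow(struct ratelimit *rl, long now)
{
    assert(rl != NULL);
    if (now - rl->window_start >= rl->interval) {
        /* A new window begins: forget what was printed before. */
        rl->window_start = now;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return 1;
    }
    return 0;
}
```

With hundreds of regions failing at once, a limiter of this kind lets the first few errors through and drops the rest until the window expires.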
      
      Fixes: e5e7962e ("mlxsw: spectrum_acl: Implement region migration according to hints")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Tested-by: Alexander Zubkov <green@qrator.net>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/c510763b2ebd25e7990d80183feff91cde593145.1713797103.git.petrm@nvidia.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      50415ced
    • mlxsw: spectrum_acl_tcam: Fix possible use-after-free during rehash · dbbc295a
      Ido Schimmel authored and Frieder Schrempf committed
      
      [ Upstream commit 54225988 ]
      
      The rehash delayed work migrates filters from one region to another
      according to the number of available credits.
      
      The region being migrated from is destroyed at the end of the work if
      the number of credits is non-negative, as the assumption is that this is
      indicative of migration being complete. This assumption is incorrect as
      a non-negative number of credits can also be the result of a failed
      migration.
      
      The destruction of a region that still has filters referencing it can
      result in a use-after-free [1].
      
      Fix by not destroying the region if migration failed.
      
      [1]
      BUG: KASAN: slab-use-after-free in mlxsw_sp_acl_ctcam_region_entry_remove+0x21d/0x230
      Read of size 8 at addr ffff8881735319e8 by task kworker/0:31/3858
      
      CPU: 0 PID: 3858 Comm: kworker/0:31 Tainted: G        W          6.9.0-rc2-custom-00782-gf2275c2157d8 #5
      Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
      Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
      Call Trace:
       <TASK>
       dump_stack_lvl+0xc6/0x120
       print_report+0xce/0x670
       kasan_report+0xd7/0x110
       mlxsw_sp_acl_ctcam_region_entry_remove+0x21d/0x230
       mlxsw_sp_acl_ctcam_entry_del+0x2e/0x70
       mlxsw_sp_acl_atcam_entry_del+0x81/0x210
       mlxsw_sp_acl_tcam_vchunk_migrate_all+0x3cd/0xb50
       mlxsw_sp_acl_tcam_vregion_rehash_work+0x157/0x1300
       process_one_work+0x8eb/0x19b0
       worker_thread+0x6c9/0xf70
       kthread+0x2c9/0x3b0
       ret_from_fork+0x4d/0x80
       ret_from_fork_asm+0x1a/0x30
       </TASK>
      
      Allocated by task 174:
       kasan_save_stack+0x33/0x60
       kasan_save_track+0x14/0x30
       __kasan_kmalloc+0x8f/0xa0
       __kmalloc+0x19c/0x360
       mlxsw_sp_acl_tcam_region_create+0xdf/0x9c0
       mlxsw_sp_acl_tcam_vregion_rehash_work+0x954/0x1300
       process_one_work+0x8eb/0x19b0
       worker_thread+0x6c9/0xf70
       kthread+0x2c9/0x3b0
       ret_from_fork+0x4d/0x80
       ret_from_fork_asm+0x1a/0x30
      
      Freed by task 7:
       kasan_save_stack+0x33/0x60
       kasan_save_track+0x14/0x30
       kasan_save_free_info+0x3b/0x60
       poison_slab_object+0x102/0x170
       __kasan_slab_free+0x14/0x30
       kfree+0xc1/0x290
       mlxsw_sp_acl_tcam_region_destroy+0x272/0x310
       mlxsw_sp_acl_tcam_vregion_rehash_work+0x731/0x1300
       process_one_work+0x8eb/0x19b0
       worker_thread+0x6c9/0xf70
       kthread+0x2c9/0x3b0
       ret_from_fork+0x4d/0x80
       ret_from_fork_asm+0x1a/0x30
      
      Fixes: c9c9af91 ("mlxsw: spectrum_acl: Allow to interrupt/continue rehash work")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Tested-by: Alexander Zubkov <green@qrator.net>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/3e412b5659ec2310c5c615760dfe5eac18dd7ebd.1713797103.git.petrm@nvidia.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      dbbc295a
    • mlxsw: spectrum_acl_tcam: Fix possible use-after-free during activity update · 2832e02a
      Ido Schimmel authored and Frieder Schrempf committed
      
      [ Upstream commit 79b5b4b1 ]
      
      The rule activity update delayed work periodically traverses the list of
      configured rules and queries their activity from the device.
      
      As part of this task it accesses the entry pointed to by 'ventry->entry',
      but this entry can be changed concurrently by the rehash delayed work,
      leading to a use-after-free [1].
      
      Fix by closing the race and performing the activity query under the
      'vregion->lock' mutex.
      
      [1]
      BUG: KASAN: slab-use-after-free in mlxsw_sp_acl_tcam_flower_rule_activity_get+0x121/0x140
      Read of size 8 at addr ffff8881054ed808 by task kworker/0:18/181
      
      CPU: 0 PID: 181 Comm: kworker/0:18 Not tainted 6.9.0-rc2-custom-00781-gd5ab772d32f7 #2
      Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
      Workqueue: mlxsw_core mlxsw_sp_acl_rule_activity_update_work
      Call Trace:
       <TASK>
       dump_stack_lvl+0xc6/0x120
       print_report+0xce/0x670
       kasan_report+0xd7/0x110
       mlxsw_sp_acl_tcam_flower_rule_activity_get+0x121/0x140
       mlxsw_sp_acl_rule_activity_update_work+0x219/0x400
       process_one_work+0x8eb/0x19b0
       worker_thread+0x6c9/0xf70
       kthread+0x2c9/0x3b0
       ret_from_fork+0x4d/0x80
       ret_from_fork_asm+0x1a/0x30
       </TASK>
      
      Allocated by task 1039:
       kasan_save_stack+0x33/0x60
       kasan_save_track+0x14/0x30
       __kasan_kmalloc+0x8f/0xa0
       __kmalloc+0x19c/0x360
       mlxsw_sp_acl_tcam_entry_create+0x7b/0x1f0
       mlxsw_sp_acl_tcam_vchunk_migrate_all+0x30d/0xb50
       mlxsw_sp_acl_tcam_vregion_rehash_work+0x157/0x1300
       process_one_work+0x8eb/0x19b0
       worker_thread+0x6c9/0xf70
       kthread+0x2c9/0x3b0
       ret_from_fork+0x4d/0x80
       ret_from_fork_asm+0x1a/0x30
      
      Freed by task 1039:
       kasan_save_stack+0x33/0x60
       kasan_save_track+0x14/0x30
       kasan_save_free_info+0x3b/0x60
       poison_slab_object+0x102/0x170
       __kasan_slab_free+0x14/0x30
       kfree+0xc1/0x290
       mlxsw_sp_acl_tcam_vchunk_migrate_all+0x3d7/0xb50
       mlxsw_sp_acl_tcam_vregion_rehash_work+0x157/0x1300
       process_one_work+0x8eb/0x19b0
       worker_thread+0x6c9/0xf70
       kthread+0x2c9/0x3b0
       ret_from_fork+0x4d/0x80
       ret_from_fork_asm+0x1a/0x30
      
      Fixes: 2bffc532 ("mlxsw: spectrum_acl: Don't take mutex in mlxsw_sp_acl_tcam_vregion_rehash_work()")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Tested-by: Alexander Zubkov <green@qrator.net>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/1fcce0a60b231ebeb2515d91022284ba7b4ffe7a.1713797103.git.petrm@nvidia.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2832e02a
    • mlxsw: spectrum_acl_tcam: Fix race during rehash delayed work · 68813e27
      Ido Schimmel authored and Frieder Schrempf committed
      
      [ Upstream commit d90cfe20 ]
      
      The purpose of the rehash delayed work is to reduce the number of masks
      (eRPs) used by an ACL region as the eRP bank is a global and limited
      resource.
      
      This is done in three steps:
      
      1. Creating a new set of masks and a new ACL region which will use the
         new masks and to which the existing filters will be migrated. The
         new region is assigned to 'vregion->region' and the region from which
         the filters are migrated is assigned to 'vregion->region2'.
      
      2. Migrating all the filters from the old region to the new region.
      
      3. Destroying the old region and setting 'vregion->region2' to NULL.
      
      Only the second step is performed under the 'vregion->lock' mutex,
      although its comment says that among other things it "Protects
      consistency of region, region2 pointers".
      
      This is problematic as the first step can race with filter insertion
      from user space that uses 'vregion->region', but under the mutex.
      
      Fix by holding the mutex across the entirety of the delayed work and not
      only during the second step.
      
      Fixes: 2bffc532 ("mlxsw: spectrum_acl: Don't take mutex in mlxsw_sp_acl_tcam_vregion_rehash_work()")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Tested-by: Alexander Zubkov <green@qrator.net>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/1ec1d54edf2bad0a369e6b4fa030aba64e1f124b.1713797103.git.petrm@nvidia.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      68813e27
    • net: openvswitch: Fix Use-After-Free in ovs_ct_exit · c38565cc
      Hyunwoo Kim authored and Frieder Schrempf committed
      
      [ Upstream commit 5ea7b72d ]
      
      Since kfree_rcu, which is called in the hlist_for_each_entry_rcu traversal
      of ovs_ct_limit_exit, is not part of the RCU read critical section, it
      is possible that the RCU grace period will pass during the traversal and
      the key will be freed.
      
      To prevent this, it should be changed to hlist_for_each_entry_safe.
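
The "safe" iteration pattern can be illustrated in user space with a plain singly linked list. This sketch uses `malloc`/`free` instead of RCU, and the `node`, `push` and `free_all_safe` names are assumptions made for the example. The point it shows is that the successor pointer is saved before the current node is released, so the loop never reads from freed memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node standing in for a per-zone limit entry. */
struct node {
    int key;
    struct node *next;
};

/* Push a new node at the front; returns the new head (or the old one
 * unchanged if allocation fails). */
static struct node *push(struct node *head, int key)
{
    struct node *n = malloc(sizeof(*n));

    if (n == NULL)
        return head;
    n->key = key;
    n->next = head;
    return n;
}

/*
 * Free every node while iterating. 'next' is read before 'cur' is freed,
 * which is what the *_safe list iterators do on the caller's behalf.
 * Returns the number of nodes freed.
 */
static int free_all_safe(struct node *head)
{
    struct node *cur = head;
    struct node *next;
    int freed = 0;

    while (cur != NULL) {
        next = cur->next; /* save the successor first */
        free(cur);
        cur = next;
        freed++;
    }
    return freed;
}
```

A loop that instead evaluated `cur->next` after `free(cur)` would dereference released memory, which is the user-space analogue of the problem described above.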
      
      Fixes: 11efd5cb ("openvswitch: Support conntrack zone limit")
      Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Aaron Conole <aconole@redhat.com>
      Link: https://lore.kernel.org/r/ZiYvzQN/Ry5oeFQW@v4bel-B760M-AORUS-ELITE-AX
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c38565cc
    • ipvs: Fix checksumming on GSO of SCTP packets · 1ae2b2f6
      Ismael Luceno authored and Frieder Schrempf committed
      
      [ Upstream commit e10d3ba4 ]
      
      It was observed in the wild that pairs of consecutive packets would leave
      the IPVS with the same wrong checksum, and the issue only went away when
      disabling GSO.
      
      IPVS needs to avoid computing the SCTP checksum when using GSO.
      
      Fixes: 90017acc ("sctp: Add GSO support")
      Co-developed-by: Firo Yang <firo.yang@suse.com>
      Signed-off-by: Ismael Luceno <iluceno@suse.de>
      Tested-by: Andreas Taschner <andreas.taschner@suse.com>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      1ae2b2f6