  1. Mar 03, 2025
    • arm64: dts: rockchip: Fix lcdpwr_en pin for Cool Pi GenBook · 7a3899c2
      Andy Yan authored and Frieder Schrempf committed
      
      [ Upstream commit a1d939055a22be06d8c12bf53afb258b9d38575f ]
      
      According to the schematic, the lcdpwr_en pin is GPIO0_C4,
      not GPIO1_C4.
      
      Fixes: 4a8c1161 ("arm64: dts: rockchip: Add support for rk3588 based Cool Pi CM5 GenBook")
      Signed-off-by: Andy Yan <andyshrk@163.com>
      Link: https://lore.kernel.org/r/20250113104825.2390427-1-andyshrk@163.com
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: Fix deadlock when freeing cgroup storage · bb302797
      Abel Wu authored and Frieder Schrempf committed
      
      [ Upstream commit c78f4afbd962f43a3989f45f3ca04300252b19b5 ]
      
      The following commit
      bc235cdb ("bpf: Prevent deadlock from recursive bpf_task_storage_[get|delete]")
      first introduced deadlock prevention for fentry/fexit programs attaching
      on bpf_task_storage helpers. That commit also employed the logic in map
      free path in its v6 version.
      
      Later, bpf_cgrp_storage was introduced in
      c4bcfb38 ("bpf: Implement cgroup storage available to non-cgroup-attached bpf progs")
      and faces the same issue as bpf_task_storage. Instead of its busy
      counter, NULL was passed to bpf_local_storage_map_free(), which opened
      a window for deadlock:
      
      	<TASK>
      		(acquiring local_storage->lock)
      	_raw_spin_lock_irqsave+0x3d/0x50
      	bpf_local_storage_update+0xd1/0x460
      	bpf_cgrp_storage_get+0x109/0x130
      	bpf_prog_a4d4a370ba857314_cgrp_ptr+0x139/0x170
      	? __bpf_prog_enter_recur+0x16/0x80
      	bpf_trampoline_6442485186+0x43/0xa4
      	cgroup_storage_ptr+0x9/0x20
      		(holding local_storage->lock)
      	bpf_selem_unlink_storage_nolock.constprop.0+0x135/0x160
      	bpf_selem_unlink_storage+0x6f/0x110
      	bpf_local_storage_map_free+0xa2/0x110
      	bpf_map_free_deferred+0x5b/0x90
      	process_one_work+0x17c/0x390
      	worker_thread+0x251/0x360
      	kthread+0xd2/0x100
      	ret_from_fork+0x34/0x50
      	ret_from_fork_asm+0x1a/0x30
      	</TASK>
      
      Progs:
       - A: SEC("fentry/cgroup_storage_ptr")
         - cgid (BPF_MAP_TYPE_HASH)
      	Record the id of the cgroup the current task belongs
      	to in this hash map, using the address of the cgroup
      	as the map key.
         - cgrpa (BPF_MAP_TYPE_CGRP_STORAGE)
      	If current task is a kworker, lookup the above hash
      	map using function parameter @owner as the key to get
      	its corresponding cgroup id which is then used to get
      	a trusted pointer to the cgroup through
      	bpf_cgroup_from_id(). This trusted pointer can then
      	be passed to bpf_cgrp_storage_get() to finally trigger
      	the deadlock issue.
       - B: SEC("tp_btf/sys_enter")
         - cgrpb (BPF_MAP_TYPE_CGRP_STORAGE)
      	The only purpose of this prog is to fill Prog A's
      	hash map by calling bpf_cgrp_storage_get() for as
      	many userspace tasks as possible.
      
      Steps to reproduce:
       - Run A;
       - while (true) { Run B; Destroy B; }
      
      Fix this issue by passing its busy counter to the free procedure so
      it can be properly incremented before storage/smap locking.
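
The role of the busy counter can be illustrated with a small user-space model. This is only a sketch in Python, not kernel code: `Storage`, `helper_get`, and `map_free` are hypothetical names, a plain integer stands in for the per-CPU busy counter, and the string results are labels invented for the model. The helper refuses to run when the counter is already raised, so a traced function that fires while the free path holds the lock cannot try to take the same lock again.

```python
import threading


class Storage:
    """Toy stand-in for a local storage protected by a non-recursive lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.busy = 0  # models the per-CPU busy counter (assumption: one CPU)


def helper_get(storage):
    """Models a storage helper called from a tracing program."""
    if storage.busy:
        # Recursion detected: bail out instead of touching the lock.
        return "BUSY"
    storage.busy += 1
    try:
        # A non-blocking acquire lets the model report a would-be deadlock
        # instead of hanging.
        if not storage.lock.acquire(blocking=False):
            return "DEADLOCK"
        storage.lock.release()
        return "OK"
    finally:
        storage.busy -= 1


def map_free(storage, use_busy_counter):
    """Models the map free path; a tracing program fires while the lock is held."""
    if use_busy_counter:
        storage.busy += 1  # raised before taking the lock, as in the fix
    try:
        with storage.lock:
            return helper_get(storage)  # the fentry program runs here
    finally:
        if use_busy_counter:
            storage.busy -= 1
```

With `use_busy_counter=False` the model reports the would-be deadlock; with `use_busy_counter=True` the nested helper call backs off.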
      
      Fixes: c4bcfb38 ("bpf: Implement cgroup storage available to non-cgroup-attached bpf progs")
      Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
      Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
      Link: https://lore.kernel.org/r/20241221061018.37717-1-wuyun.abel@bytedance.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: Disable non stream socket for strparser · baba7cfa
      Jiayuan Chen authored and Frieder Schrempf committed
      
      [ Upstream commit 5459cce6bf49e72ee29be21865869c2ac42419f5 ]
      
      Currently, only TCP supports strparser, but sockmap doesn't intercept
      non-TCP connections to attach strparser. For example, with UDP, although
      the read/write handlers are replaced, strparser is not executed due to
      the lack of a read_sock operation.
      
      Furthermore, in udp_bpf_recvmsg(), it checks whether the psock has data,
      and if not, it falls back to the native UDP read interface, making
      UDP + strparser appear to read correctly. According to its commit history,
      this behavior is unexpected.
      
      Moreover, since UDP lacks the concept of streams, we intercept it directly.
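
The kind of check described here can be sketched as a user-space model. The function name `can_attach_strparser` is hypothetical, and returning `-EOPNOTSUPP` is an assumption of this sketch rather than something stated in the message above.

```python
import errno
import socket


def can_attach_strparser(sock_type, proto):
    """Return 0 if a stream parser may be attached, else a negative errno.

    Assumption for this sketch: only TCP stream sockets qualify, and the
    rejection code is -EOPNOTSUPP.
    """
    if sock_type == socket.SOCK_STREAM and proto == socket.IPPROTO_TCP:
        return 0
    return -errno.EOPNOTSUPP
```

A UDP socket (`SOCK_DGRAM`, `IPPROTO_UDP`) is rejected up front in this model instead of silently falling back to the native read path.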
      
      Fixes: 1fa1fe8f ("bpf, sockmap: Test shutdown() correctly exits epoll and recv()=0")
      Signed-off-by: Jiayuan Chen <mrpre@163.com>
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
      Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://patch.msgid.link/20250122100917.49845-4-mrpre@163.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: Fix wrong copied_seq calculation · 234a783b
      Jiayuan Chen authored and Frieder Schrempf committed

      [ Upstream commit 36b62df5683c315ba58c950f1a9c771c796c30ec ]
      
      'sk->copied_seq' was updated in the tcp_eat_skb() function when the action
      of a BPF program was SK_REDIRECT. For other actions, like SK_PASS, the
      update logic for 'sk->copied_seq' was moved to tcp_bpf_recvmsg_parser()
      to ensure the accuracy of the 'fionread' feature.
      
      It works for a single stream_verdict scenario, as it also modified
      sk_data_ready->sk_psock_verdict_data_ready->tcp_read_skb
      to remove updating 'sk->copied_seq'.
      
      However, for programs where both stream_parser and stream_verdict are
      active (strparser purpose), tcp_read_sock() was used instead of
      tcp_read_skb() (sk_data_ready->strp_data_ready->tcp_read_sock).
      tcp_read_sock() now still updates 'sk->copied_seq', leading to duplicate
      updates.
      
      In summary, for strparser + SK_PASS, copied_seq is redundantly calculated
      in both tcp_read_sock() and tcp_bpf_recvmsg_parser().
      
      The issue causes incorrect copied_seq calculations, which prevent
      correct data reads from the recv() interface in user-land.
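
The double update can be reduced to simple arithmetic. The following is a toy model, not kernel code: `receive` and its parameters are hypothetical, and a single SK_PASS segment is assumed.

```python
def receive(segment_len, parser_active, noack):
    """Return how far copied_seq advances for one SK_PASS segment.

    parser_active: a stream_parser is attached, so the read_sock path runs.
    noack: the read_sock path leaves copied_seq alone (the fixed behavior).
    """
    copied_seq = 0
    if parser_active and not noack:
        copied_seq += segment_len  # first update, in the read_sock path
    copied_seq += segment_len      # update made when user space calls recv()
    return copied_seq
```

For a 100-byte segment the model advances copied_seq by 200 with a parser attached and no noack variant, and by 100 once the read_sock path stops updating it.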
      
      We do not want to add new proto_ops to implement a new version of
      tcp_read_sock, as this would introduce code complexity [1].
      
      We could have added noack and copied_seq to desc, and then called
      ops->read_sock. However, unfortunately, other modules didn’t fully
      initialize desc to zero. So, for now, we are directly calling
      tcp_read_sock_noack() in tcp_bpf.c.
      
      [1]: https://lore.kernel.org/bpf/20241218053408.437295-1-mrpre@163.com
      
      
      
      Fixes: e5c6de5f ("bpf, sockmap: Incorrectly handling copied_seq")
      Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Jiayuan Chen <mrpre@163.com>
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://patch.msgid.link/20250122100917.49845-3-mrpre@163.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • strparser: Add read_sock callback · a82c6016
      Jiayuan Chen authored and Frieder Schrempf committed
      
      [ Upstream commit 0532a79efd68a4d9686b0385e4993af4b130ff82 ]
      
      Added a new read_sock handler, allowing users to customize read operations
      instead of relying on the native socket's read_sock.
      
      Signed-off-by: Jiayuan Chen <mrpre@163.com>
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://patch.msgid.link/20250122100917.49845-2-mrpre@163.com
      Stable-dep-of: 36b62df5683c ("bpf: Fix wrong copied_seq calculation")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: avoid holding freeze_mutex during mmap operation · bc347129
      Andrii Nakryiko authored and Frieder Schrempf committed

      [ Upstream commit bc27c52eea189e8f7492d40739b7746d67b65beb ]
      
      We use map->freeze_mutex to prevent races between map_freeze() and
      memory mapping BPF map contents with writable permissions. The way we
      naively do this means we'll hold freeze_mutex for the entire duration of all
      the mm and VMA manipulations, which is completely unnecessary. This can
      potentially also lead to deadlocks, as reported by syzbot in [0].
      
      So, instead, hold freeze_mutex only during writeability checks, bump
      (proactively) "write active" count for the map, unlock the mutex and
      proceed with mmap logic. And only if something went wrong during mmap
      logic, then undo that "write active" counter increment.
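
The locking pattern described above can be sketched as a user-space model. `Map`, `map_mmap`, and the `do_mmap` callable are hypothetical names for this sketch; a Python integer stands in for the atomic "write active" counter, and `-EPERM` for a frozen map is an assumption of the model.

```python
import errno
import threading


class Map:
    """Toy BPF map with a freeze mutex and a "write active" count."""

    def __init__(self):
        self.freeze_mutex = threading.Lock()
        self.frozen = False
        self.writecnt = 0


def map_mmap(bpf_map, writable, do_mmap):
    """do_mmap is a hypothetical callable returning 0 or a negative errno."""
    with bpf_map.freeze_mutex:
        # The mutex covers only the writeability check and the counter bump.
        if writable:
            if bpf_map.frozen:
                return -errno.EPERM
            bpf_map.writecnt += 1  # proactive increment
    err = do_mmap()  # mm/VMA work runs without the mutex held
    if err and writable:
        bpf_map.writecnt -= 1  # undo the increment only on failure
    return err
```

The key property of the sketch is that `do_mmap` never runs with `freeze_mutex` held, so nothing done during the mapping step can nest inside that mutex.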
      
        [0] https://lore.kernel.org/bpf/678dcbc9.050a0220.303755.0066.GAE@google.com/
      
      
      
      Fixes: fc970227 ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY")
      Reported-by: <syzbot+4dc041c686b7c816a71e@syzkaller.appspotmail.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20250129012246.1515826-2-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic · d4fcc6a7
      Andrii Nakryiko authored and Frieder Schrempf committed
      
      [ Upstream commit 98671a0fd1f14e4a518ee06b19037c20014900eb ]
      
      For all BPF maps we ensure that VM_MAYWRITE is cleared when
      memory-mapping BPF map contents as initially read-only VMA. This is
      because in some cases BPF verifier relies on the underlying data to not
      be modified afterwards by user space, so once something is mapped
      read-only, it shouldn't be re-mmap'ed as read-write.
      
      As such, it's not necessary to check VM_MAYWRITE in bpf_map_mmap() and
      map->ops->map_mmap() callbacks: VM_WRITE should be consistently set for
      read-write mappings, and if VM_WRITE is not set, there is no way for
      user space to upgrade read-only mapping to read-write one.
      
      This patch cleans up this VM_WRITE vs VM_MAYWRITE handling within
      bpf_map_mmap(), which is an entry point for any BPF map mmap()-ing
      logic. We also drop unnecessary sanitization of VM_MAYWRITE in BPF
      ringbuf's map_mmap() callback implementation, as it is already performed
      by common code in bpf_map_mmap().
      
      Note, though, that in bpf_map_mmap_{open,close}() callbacks we can't
      drop VM_MAYWRITE use, because it's possible (and is outside of
      subsystem's control) to have initially read-write memory mapping, which
      is subsequently dropped to read-only by user space through mprotect().
      In such case, from BPF verifier POV it's read-write data throughout the
      lifetime of BPF map, and is counted as "active writer".
      
      But its VMAs will start out as VM_WRITE|VM_MAYWRITE, then mprotect() can
      change it to just VM_MAYWRITE (and no VM_WRITE), so when it's finally
      munmap()'ed and bpf_map_mmap_close() is called, vm_flags will be just
      VM_MAYWRITE, but we still need to decrement active writer count with
      bpf_map_write_active_dec() as it's still considered to be a read-write
      mapping by the rest of BPF subsystem.
      
      Similar reasoning applies to bpf_map_mmap_open(), which is called
      whenever mmap(), munmap(), and/or mprotect() forces mm subsystem to
      split original VMA into multiple discontiguous VMAs.
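
The flag lifecycle described above can be traced with a small model. This is a user-space sketch, not the kernel implementation: `Map`, `mmap_writable`, `mprotect_readonly`, and `mmap_close` are hypothetical names, and only the two flag bits discussed here are modeled.

```python
VM_WRITE = 0x02      # mapping is currently writable
VM_MAYWRITE = 0x20   # mapping may be made writable via mprotect()


class Map:
    def __init__(self):
        self.writecnt = 0  # models the active writer count


def mmap_writable(bpf_map):
    """Initial read-write mapping: both flags set, one active writer."""
    bpf_map.writecnt += 1
    return VM_WRITE | VM_MAYWRITE


def mprotect_readonly(vm_flags):
    """User space drops write permission; VM_MAYWRITE stays set."""
    return vm_flags & ~VM_WRITE


def mmap_close(bpf_map, vm_flags, check=VM_MAYWRITE):
    """Drop the writer count if the VMA was ever a read-write mapping."""
    if vm_flags & check:
        bpf_map.writecnt -= 1
```

Passing `check=VM_WRITE` in this model shows why the close callback cannot test VM_WRITE: after `mprotect_readonly()` the bit is gone and the writer count would never be dropped.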
      
      Memory-mapping handling is a bit tricky, yes.
      
      Cc: Jann Horn <jannh@google.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Shakeel Butt <shakeel.butt@linux.dev>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20250129012246.1515826-1-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Stable-dep-of: bc27c52eea18 ("bpf: avoid holding freeze_mutex during mmap operation")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf, test_run: Fix use-after-free issue in eth_skb_pkt_type() · d8a69b59
      Shigeru Yoshida authored and Frieder Schrempf committed
      
      [ Upstream commit 6b3d638ca897e099fa99bd6d02189d3176f80a47 ]
      
      KMSAN reported a use-after-free issue in eth_skb_pkt_type()[1]. The
      cause of the issue was that eth_skb_pkt_type() accessed skb's data
      that didn't contain an Ethernet header. This occurs when
      bpf_prog_test_run_xdp() passes an invalid value as the user_size
      argument to bpf_test_init().
      
      Fix this by returning an error when user_size is less than ETH_HLEN in
      bpf_test_init(). Additionally, remove the check for "if (user_size >
      size)" as it is unnecessary.
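
The lower-bound check can be sketched as follows. This is a user-space model: `check_user_size` is a hypothetical name, only the ETH_HLEN bound from the description is modeled, and returning `-EINVAL` is an assumption of the sketch.

```python
import errno

ETH_HLEN = 14  # length of an Ethernet header in bytes


def check_user_size(user_size):
    """Reject inputs too short to contain an Ethernet header.

    Assumption for this sketch: the error code is -EINVAL.
    """
    if user_size < ETH_HLEN:
        return -errno.EINVAL
    return 0
```

With this bound in place, any code that later reads the Ethernet header can rely on at least 14 bytes of caller-supplied data being present.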
      
      [1]
      BUG: KMSAN: use-after-free in eth_skb_pkt_type include/linux/etherdevice.h:627 [inline]
      BUG: KMSAN: use-after-free in eth_type_trans+0x4ee/0x980 net/ethernet/eth.c:165
       eth_skb_pkt_type include/linux/etherdevice.h:627 [inline]
       eth_type_trans+0x4ee/0x980 net/ethernet/eth.c:165
       __xdp_build_skb_from_frame+0x5a8/0xa50 net/core/xdp.c:635
       xdp_recv_frames net/bpf/test_run.c:272 [inline]
       xdp_test_run_batch net/bpf/test_run.c:361 [inline]
       bpf_test_run_xdp_live+0x2954/0x3330 net/bpf/test_run.c:390
       bpf_prog_test_run_xdp+0x148e/0x1b10 net/bpf/test_run.c:1318
       bpf_prog_test_run+0x5b7/0xa30 kernel/bpf/syscall.c:4371
       __sys_bpf+0x6a6/0xe20 kernel/bpf/syscall.c:5777
       __do_sys_bpf kernel/bpf/syscall.c:5866 [inline]
       __se_sys_bpf kernel/bpf/syscall.c:5864 [inline]
       __x64_sys_bpf+0xa4/0xf0 kernel/bpf/syscall.c:5864
       x64_sys_call+0x2ea0/0x3d90 arch/x86/include/generated/asm/syscalls_64.h:322
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xd9/0x1d0 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      Uninit was created at:
       free_pages_prepare mm/page_alloc.c:1056 [inline]
       free_unref_page+0x156/0x1320 mm/page_alloc.c:2657
       __free_pages+0xa3/0x1b0 mm/page_alloc.c:4838
       bpf_ringbuf_free kernel/bpf/ringbuf.c:226 [inline]
       ringbuf_map_free+0xff/0x1e0 kernel/bpf/ringbuf.c:235
       bpf_map_free kernel/bpf/syscall.c:838 [inline]
       bpf_map_free_deferred+0x17c/0x310 kernel/bpf/syscall.c:862
       process_one_work kernel/workqueue.c:3229 [inline]
       process_scheduled_works+0xa2b/0x1b60 kernel/workqueue.c:3310
       worker_thread+0xedf/0x1550 kernel/workqueue.c:3391
       kthread+0x535/0x6b0 kernel/kthread.c:389
       ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      
      CPU: 1 UID: 0 PID: 17276 Comm: syz.1.16450 Not tainted 6.12.0-05490-g9bb88c659673 #8
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
      
      Fixes: be3d72a2 ("bpf: move user_size out of bpf_test_init")
      Reported-by: syzkaller <syzkaller@googlegroups.com>
      Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
      Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
      Acked-by: Stanislav Fomichev <sdf@fomichev.me>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://patch.msgid.link/20250121150643.671650-1-syoshida@redhat.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: allow small head cache usage with large MAX_SKB_FRAGS values · 0aa856c7
      Paolo Abeni authored and Frieder Schrempf committed
      
      [ Upstream commit 14ad6ed30a10afbe91b0749d6378285f4225d482 ]
      
      Sabrina reported the following splat:
      
          WARNING: CPU: 0 PID: 1 at net/core/dev.c:6935 netif_napi_add_weight_locked+0x8f2/0xba0
          Modules linked in:
          CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0-rc1-net-00092-g011b03359038 #996
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
          RIP: 0010:netif_napi_add_weight_locked+0x8f2/0xba0
          Code: e8 c3 e6 6a fe 48 83 c4 28 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc c7 44 24 10 ff ff ff ff e9 8f fb ff ff e8 9e e6 6a fe <0f> 0b e9 d3 fe ff ff e8 92 e6 6a fe 48 8b 04 24 be ff ff ff ff 48
          RSP: 0000:ffffc9000001fc60 EFLAGS: 00010293
          RAX: 0000000000000000 RBX: ffff88806ce48128 RCX: 1ffff11001664b9e
          RDX: ffff888008f00040 RSI: ffffffff8317ca42 RDI: ffff88800b325cb6
          RBP: ffff88800b325c40 R08: 0000000000000001 R09: ffffed100167502c
          R10: ffff88800b3a8163 R11: 0000000000000000 R12: ffff88800ac1c168
          R13: ffff88800ac1c168 R14: ffff88800ac1c168 R15: 0000000000000007
          FS:  0000000000000000(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: ffff888008201000 CR3: 0000000004c94001 CR4: 0000000000370ef0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
          <TASK>
          gro_cells_init+0x1ba/0x270
          xfrm_input_init+0x4b/0x2a0
          xfrm_init+0x38/0x50
          ip_rt_init+0x2d7/0x350
          ip_init+0xf/0x20
          inet_init+0x406/0x590
          do_one_initcall+0x9d/0x2e0
          do_initcalls+0x23b/0x280
          kernel_init_freeable+0x445/0x490
          kernel_init+0x20/0x1d0
          ret_from_fork+0x46/0x80
          ret_from_fork_asm+0x1a/0x30
          </TASK>
          irq event stamp: 584330
          hardirqs last  enabled at (584338): [<ffffffff8168bf87>] __up_console_sem+0x77/0xb0
          hardirqs last disabled at (584345): [<ffffffff8168bf6c>] __up_console_sem+0x5c/0xb0
          softirqs last  enabled at (583242): [<ffffffff833ee96d>] netlink_insert+0x14d/0x470
          softirqs last disabled at (583754): [<ffffffff8317c8cd>] netif_napi_add_weight_locked+0x77d/0xba0
      
      on a kernel built with MAX_SKB_FRAGS=45, where SKB_WITH_OVERHEAD(1024)
      is smaller than GRO_MAX_HEAD.
      
      Such a build additionally contains the revert of the single page frag cache,
      so that napi_get_frags() ends up using the page frag allocator, triggering
      the splat.
      
      Note that the underlying issue is independent of the mentioned
      revert; address it by ensuring that the small head cache fits both TCP
      and GRO allocations, and by updating napi_alloc_skb() and __netdev_alloc_skb()
      to select kmalloc() usage for any allocation fitting such a cache.
      
      Reported-by: Sabrina Dubroca <sd@queasysnail.net>
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Fixes: 3948b059 ("net: introduce a config option to tweak MAX_SKB_FRAGS")
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tcp: drop secpath at the same time as we currently drop dst · 8a11c500
      Sabrina Dubroca authored and Frieder Schrempf committed
      
      [ Upstream commit 9b6412e6979f6f9e0632075f8f008937b5cd4efd ]
      
      Xiumei reported hitting the WARN in xfrm6_tunnel_net_exit while
      running tests that boil down to:
       - create a pair of netns
       - run a basic TCP test over ipcomp6
       - delete the pair of netns
      
      The xfrm_state found on spi_byaddr was not deleted at the time we
      delete the netns, because we still have a reference on it. This
      lingering reference comes from a secpath (which holds a ref on the
      xfrm_state), which is still attached to an skb. This skb is not
      leaked, it ends up on sk_receive_queue and then gets defer-free'd by
      skb_attempt_defer_free.
      
      The problem happens when we defer freeing an skb (push it on one CPU's
      defer_list), and don't flush that list before the netns is deleted. In
      that case, we still have a reference on the xfrm_state that we don't
      expect at this point.
      
      We already drop the skb's dst in the TCP receive path when it's no
      longer needed, so let's also drop the secpath. At this point,
      tcp_filter has already called into the LSM hooks that may require the
      secpath, so it should not be needed anymore. However, in some of those
      places, the MPTCP extension has just been attached to the skb, so we
      cannot simply drop all extensions.
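
The selective drop can be pictured with a toy model. `Skb` and `drop_dst_and_secpath` are hypothetical names for this sketch, and a dictionary stands in for the skb extension area.

```python
class Skb:
    """Toy packet buffer with a route reference and named extensions."""

    def __init__(self, dst=None, extensions=None):
        self.dst = dst
        self.extensions = dict(extensions or {})


def drop_dst_and_secpath(skb):
    """Release the dst and only the secpath extension."""
    skb.dst = None
    # Remove just the secpath; other extensions (for example MPTCP) must
    # survive because they may have been attached moments earlier.
    skb.extensions.pop("sec_path", None)
```

After the call, a buffer that carried both a secpath and an MPTCP extension keeps only the latter, so no xfrm_state reference lingers on a deferred-free list in this model.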
      
      Fixes: 68822bdf ("net: generalize skb freeing deferral to per-cpu lists")
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://patch.msgid.link/5055ba8f8f72bdcb602faa299faca73c280b7735.1739743613.git.sd@queasysnail.net
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: axienet: Set mac_managed_pm · 6e49e63d
      Nick Hu authored and Frieder Schrempf committed
      
      [ Upstream commit a370295367b55662a32a4be92565fe72a5aa79bb ]
      
      The external PHY will undergo a soft reset twice during the resume process
      when it wakes up from suspend. The first reset occurs when the axienet
      driver calls phylink_of_phy_connect(), and the second occurs when
      mdio_bus_phy_resume() invokes phy_init_hw(). The second soft reset of the
      external PHY does not reinitialize the internal PHY, which causes issues
      with the internal PHY, resulting in the PHY link being down. To prevent
      this, setting the mac_managed_pm flag skips the mdio_bus_phy_resume()
      function.
      
      Fixes: a129b41f ("Revert "net: phy: dp83867: perform soft reset and retain established link"")
      Signed-off-by: Nick Hu <nick.hu@sifive.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Link: https://patch.msgid.link/20250217055843.19799-1-nick.hu@sifive.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arp: switch to dev_getbyhwaddr() in arp_req_set_public() · cc42cd0f
      Breno Leitao authored and Frieder Schrempf committed
      
      [ Upstream commit 4eae0ee0f1e6256d0b0b9dd6e72f1d9cf8f72e08 ]
      
      The arp_req_set_public() function is called with the rtnl lock held,
      which provides enough synchronization protection. This makes the RCU
      variant of dev_getbyhwaddr() unnecessary. Switch to using the simpler
      dev_getbyhwaddr() function since we already have the required rtnl
      locking.
      
      This change helps maintain consistency in the networking code by using
      the appropriate helper function for the existing locking context.
      Since we're not holding the RCU read lock in arp_req_set_public(),
      the existing code could trigger false positive locking warnings.
      
      Fixes: 941666c2 ("net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()")
      Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Link: https://patch.msgid.link/20250218-arm_fix_selftest-v5-2-d3d6892db9e1@debian.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: Add non-RCU dev_getbyhwaddr() helper · 09280a93
      Breno Leitao authored and Frieder Schrempf committed

      [ Upstream commit 4b5a28b38c4a0106c64416a1b2042405166b26ce ]
      
      Add dedicated helper for finding devices by hardware address when
      holding rtnl_lock, similar to existing dev_getbyhwaddr_rcu(). This prevents
      PROVE_LOCKING warnings when rtnl_lock is held but RCU read lock is not.
      
      Extract common address comparison logic into dev_addr_cmp().
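
The split between a shared comparison and two lookup variants can be sketched in user space. This is a model, not the kernel API: `NetDevice` is a hypothetical stand-in for a network device, and the boolean lock arguments are an assumption of the sketch that replaces the kernel's lock assertions.

```python
from dataclasses import dataclass

ARPHRD_ETHER = 1  # hardware type for Ethernet


@dataclass
class NetDevice:
    name: str
    type: int
    dev_addr: bytes


def dev_addr_cmp(dev, dev_type, ha):
    """Shared comparison: same hardware type and same address bytes."""
    return dev.type == dev_type and dev.dev_addr == ha[:len(dev.dev_addr)]


def dev_getbyhwaddr(devices, dev_type, ha, rtnl_held):
    """Lookup for callers that hold rtnl_lock (modeled by a boolean)."""
    if not rtnl_held:
        raise RuntimeError("rtnl_lock must be held")
    return next((d for d in devices if dev_addr_cmp(d, dev_type, ha)), None)


def dev_getbyhwaddr_rcu(devices, dev_type, ha, rcu_read_locked):
    """Lookup for callers inside an RCU read-side section."""
    if not rcu_read_locked:
        raise RuntimeError("RCU read lock must be held")
    return next((d for d in devices if dev_addr_cmp(d, dev_type, ha)), None)
```

Each variant states its own locking requirement while the matching rule lives in one place, which is the shape of the refactoring described above.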
      
      The context for this change can be found in the following
      discussion:
      
      Link: https://lore.kernel.org/all/20250206-scarlet-ermine-of-improvement-1fcac5@leitao/
      
      
      
      Cc: kuniyu@amazon.com
      Cc: ushankar@purestorage.com
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://patch.msgid.link/20250218-arm_fix_selftest-v5-1-d3d6892db9e1@debian.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Stable-dep-of: 4eae0ee0f1e6 ("arp: switch to dev_getbyhwaddr() in arp_req_set_public()")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • flow_dissector: Fix port range key handling in BPF conversion · ce6a62ab
      Cong Wang authored and Frieder Schrempf committed
      
      [ Upstream commit 69ab34f705fbfabcace64b5d53bb7a4450fac875 ]
      
      Fix how port range keys are handled in __skb_flow_bpf_to_target() by:
      - Separating PORTS and PORTS_RANGE key handling
      - Using correct key_ports_range structure for range keys
      - Properly initializing both key types independently
      
      This ensures port range information is correctly stored in its dedicated
      structure rather than incorrectly using the regular ports key structure.
      
      Fixes: 59fb9b62 ("flow_dissector: Fix to use new variables for port ranges in bpf hook")
      Reported-by: Qiang Zhang <dtzq01@gmail.com>
      Closes: https://lore.kernel.org/netdev/CAPx+-5uvFxkhkz4=j_Xuwkezjn9U6kzKTD5jz4tZ9msSJ0fOJA@mail.gmail.com/
      Cc: Yoshiki Komachi <komachi.yoshiki@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Link: https://patch.msgid.link/20250218043210.732959-4-xiyou.wangcong@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • flow_dissector: Fix handling of mixed port and port-range keys · 2a24f3a3
      Cong Wang authored and Frieder Schrempf committed
      
      [ Upstream commit 3e5796862c692ea608d96f0a1437f9290f44953a ]
      
      This patch fixes a bug in TC flower filter where rules combining a
      specific destination port with a source port range weren't working
      correctly.
      
      The specific case was when users tried to configure rules like:
      
      tc filter add dev ens38 ingress protocol ip flower ip_proto udp \
      dst_port 5000 src_port 2000-3000 action drop
      
      The root cause was in the flow dissector code. While both
      FLOW_DISSECTOR_KEY_PORTS and FLOW_DISSECTOR_KEY_PORTS_RANGE flags
      were being set correctly in the classifier, the __skb_flow_dissect_ports()
      function was only populating one of them: whichever came first in
      the enum check. This meant that when the code needed both a specific
      port and a port range, one of them would be left as 0, causing the
      filter to not match packets as expected.
      
      Fix it by removing the either/or logic and instead checking and
      populating both key types independently when they're in use.
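
The difference between the either/or logic and the independent checks can be shown with a small model. The function names and the dictionary used as the dissection target are hypothetical; only the control flow mirrors the description above.

```python
KEY_PORTS = "ports"
KEY_PORTS_RANGE = "ports_range"


def dissect_ports_either_or(used_keys, src, dst):
    """Old behavior in this model: only the first matching key is filled."""
    out = {}
    if KEY_PORTS in used_keys:
        out[KEY_PORTS] = (src, dst)
    elif KEY_PORTS_RANGE in used_keys:
        out[KEY_PORTS_RANGE] = (src, dst)
    return out


def dissect_ports_independent(used_keys, src, dst):
    """Fixed behavior in this model: every key in use is filled."""
    out = {}
    if KEY_PORTS in used_keys:
        out[KEY_PORTS] = (src, dst)
    if KEY_PORTS_RANGE in used_keys:
        out[KEY_PORTS_RANGE] = (src, dst)
    return out
```

For a rule that uses both keys, such as an exact dst_port with a src_port range, the either/or variant leaves the range key unset, so the range comparison has nothing to match against.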
      
      Fixes: 8ffb055b ("cls_flower: Fix the behavior using port ranges with hw-offload")
      Reported-by: Qiang Zhang <dtzq01@gmail.com>
      Closes: https://lore.kernel.org/netdev/CAPx+-5uvFxkhkz4=j_Xuwkezjn9U6kzKTD5jz4tZ9msSJ0fOJA@mail.gmail.com/
      Cc: Yoshiki Komachi <komachi.yoshiki@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Link: https://patch.msgid.link/20250218043210.732959-2-xiyou.wangcong@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • geneve: Suppress list corruption splat in geneve_destroy_tunnels(). · 870d403d
      Kuniyuki Iwashima authored and Frieder Schrempf committed
      
      [ Upstream commit 62fab6eef61f245dc8797e3a6a5b890ef40e8628 ]
      
      As explained in the previous patch, iterating for_each_netdev() and
      gn->geneve_list during ->exit_batch_rtnl() could trigger ->dellink()
      twice for the same device.
      
      If CONFIG_DEBUG_LIST is enabled, we will see a list_del() corruption
      splat in the 2nd call of geneve_dellink().
      
      Let's remove for_each_netdev() in geneve_destroy_tunnels() and delegate
      that part to default_device_exit_batch().
      
      Fixes: 9593172d93b9 ("geneve: Fix use-after-free in geneve_find_dev().")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://patch.msgid.link/20250217203705.40342-3-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • gtp: Suppress list corruption splat in gtp_net_exit_batch_rtnl(). · 822fd26a
      Kuniyuki Iwashima authored and Frieder Schrempf committed
      
      [ Upstream commit 4ccacf86491d33d2486b62d4d44864d7101b299d ]
      
      Brad Spengler reported the list_del() corruption splat in
      gtp_net_exit_batch_rtnl(). [0]
      
      Commit eb28fd76c0a0 ("gtp: Destroy device along with udp socket's netns
      dismantle.") added the for_each_netdev() loop in gtp_net_exit_batch_rtnl()
      to destroy devices in each netns as done in geneve and ip tunnels.
      
      However, this could trigger ->dellink() twice for the same device during
      ->exit_batch_rtnl().
      
      Say we have two netns A & B and gtp device B that resides in netns B but
      whose UDP socket is in netns A.
      
        1. cleanup_net() processes netns A and then B.
      
        2. gtp_net_exit_batch_rtnl() finds the device B while iterating
           netns A's gn->gtp_dev_list and calls ->dellink().
      
        [ device B is not yet unlinked from netns B
          as unregister_netdevice_many() has not been called. ]
      
        3. gtp_net_exit_batch_rtnl() finds the device B while iterating
           netns B's for_each_netdev() and calls ->dellink().
      
      gtp_dellink() cleans up the device's hash table, unlinks the dev from
      gn->gtp_dev_list, and calls unregister_netdevice_queue().
      
      Basically, calling gtp_dellink() multiple times is fine unless
      CONFIG_DEBUG_LIST is enabled.
      
      Let's remove for_each_netdev() in gtp_net_exit_batch_rtnl() and
      delegate the destruction to default_device_exit_batch() as done
      in bareudp.
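      
      The sequence above can be sketched as a small user-space model. This
      is not the driver code: the class and function names (Netns, GtpDev,
      exit_batch_old, exit_batch_new) are made up for illustration, Python
      lists stand in for the kernel lists, and an exception stands in for
      the CONFIG_DEBUG_LIST report.

```python
class ListCorruption(Exception):
    """Stands in for the list_del() corruption report under CONFIG_DEBUG_LIST."""


class Netns:
    def __init__(self, name):
        self.name = name
        self.netdevs = []       # devices residing in this netns (for_each_netdev)
        self.gtp_dev_list = []  # devices whose UDP socket lives in this netns


class GtpDev:
    def __init__(self, name, dev_net, sock_net):
        self.name = name
        self.sock_net = sock_net
        dev_net.netdevs.append(self)
        sock_net.gtp_dev_list.append(self)


def dellink(dev, kill_list):
    """Unlink dev from its gtp_dev_list and queue it for unregistration."""
    if dev not in dev.sock_net.gtp_dev_list:
        raise ListCorruption(f"{dev.name} unlinked twice")
    dev.sock_net.gtp_dev_list.remove(dev)
    kill_list.append(dev)


def exit_batch_old(dying_nets):
    """Model of the loop before the patch: both iterations call dellink()."""
    kill_list = []
    for net in dying_nets:
        for dev in net.netdevs:             # the loop the patch removes
            dellink(dev, kill_list)
        for dev in list(net.gtp_dev_list):
            dellink(dev, kill_list)
    return kill_list


def exit_batch_new(dying_nets):
    """Model of the loop after the patch: only the per-netns gtp list is walked."""
    kill_list = []
    for net in dying_nets:
        for dev in list(net.gtp_dev_list):
            dellink(dev, kill_list)
    return kill_list


def scenario():
    """Device B resides in netns B, but its UDP socket is in netns A."""
    net_a, net_b = Netns("A"), Netns("B")
    GtpDev("gtpB", dev_net=net_b, sock_net=net_a)
    return [net_a, net_b]  # cleanup order: A, then B
```

      In this model, exit_batch_old(scenario()) raises ListCorruption when
      netns B is processed, while exit_batch_new(scenario()) queues the
      device exactly once. Devices that merely reside in a dying netns are
      left to default_device_exit_batch(), which the model does not cover.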
      
      [0]:
      list_del corruption, ffff8880aaa62c00->next (autoslab_size_M_dev_P_net_core_dev_11127_8_1328_8_S_4096_A_64_n_139+0xc00/0x1000 [slab object]) is LIST_POISON1 (ffffffffffffff02) (prev is 0xffffffffffffff04)
      kernel BUG at lib/list_debug.c:58!
      Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 UID: 0 PID: 1804 Comm: kworker/u8:7 Tainted: G                T   6.12.13-grsec-full-20250211091339 #1
      Tainted: [T]=RANDSTRUCT
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
      Workqueue: netns cleanup_net
      RIP: 0010:[<ffffffff84947381>] __list_del_entry_valid_or_report+0x141/0x200 lib/list_debug.c:58
      Code: c2 76 91 31 c0 e8 9f b1 f7 fc 0f 0b 4d 89 f0 48 c7 c1 02 ff ff ff 48 89 ea 48 89 ee 48 c7 c7 e0 c2 76 91 31 c0 e8 7f b1 f7 fc <0f> 0b 4d 89 e8 48 c7 c1 04 ff ff ff 48 89 ea 48 89 ee 48 c7 c7 60
      RSP: 0018:fffffe8040b4fbd0 EFLAGS: 00010283
      RAX: 00000000000000cc RBX: dffffc0000000000 RCX: ffffffff818c4054
      RDX: ffffffff84947381 RSI: ffffffff818d1512 RDI: 0000000000000000
      RBP: ffff8880aaa62c00 R08: 0000000000000001 R09: fffffbd008169f32
      R10: fffffe8040b4f997 R11: 0000000000000001 R12: a1988d84f24943e4
      R13: ffffffffffffff02 R14: ffffffffffffff04 R15: ffff8880aaa62c08
      RBX: kasan shadow of 0x0
      RCX: __wake_up_klogd.part.0+0x74/0xe0 kernel/printk/printk.c:4554
      RDX: __list_del_entry_valid_or_report+0x141/0x200 lib/list_debug.c:58
      RSI: vprintk+0x72/0x100 kernel/printk/printk_safe.c:71
      RBP: autoslab_size_M_dev_P_net_core_dev_11127_8_1328_8_S_4096_A_64_n_139+0xc00/0x1000 [slab object]
      RSP: process kstack fffffe8040b4fbd0+0x7bd0/0x8000 [kworker/u8:7+netns 1804 ]
      R09: kasan shadow of process kstack fffffe8040b4f990+0x7990/0x8000 [kworker/u8:7+netns 1804 ]
      R10: process kstack fffffe8040b4f997+0x7997/0x8000 [kworker/u8:7+netns 1804 ]
      R15: autoslab_size_M_dev_P_net_core_dev_11127_8_1328_8_S_4096_A_64_n_139+0xc08/0x1000 [slab object]
      FS:  0000000000000000(0000) GS:ffff888116000000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000748f5372c000 CR3: 0000000015408000 CR4: 00000000003406f0 shadow CR4: 00000000003406f0
      Stack:
       0000000000000000 ffffffff8a0c35e7 ffffffff8a0c3603 ffff8880aaa62c00
       ffff8880aaa62c00 0000000000000004 ffff88811145311c 0000000000000005
       0000000000000001 ffff8880aaa62000 fffffe8040b4fd40 ffffffff8a0c360d
      Call Trace:
       <TASK>
       [<ffffffff8a0c360d>] __list_del_entry_valid include/linux/list.h:131 [inline] fffffe8040b4fc28
       [<ffffffff8a0c360d>] __list_del_entry include/linux/list.h:248 [inline] fffffe8040b4fc28
       [<ffffffff8a0c360d>] list_del include/linux/list.h:262 [inline] fffffe8040b4fc28
       [<ffffffff8a0c360d>] gtp_dellink+0x16d/0x360 drivers/net/gtp.c:1557 fffffe8040b4fc28
       [<ffffffff8a0d0404>] gtp_net_exit_batch_rtnl+0x124/0x2c0 drivers/net/gtp.c:2495 fffffe8040b4fc88
       [<ffffffff8e705b24>] cleanup_net+0x5a4/0xbe0 net/core/net_namespace.c:635 fffffe8040b4fcd0
       [<ffffffff81754c97>] process_one_work+0xbd7/0x2160 kernel/workqueue.c:3326 fffffe8040b4fd88
       [<ffffffff81757195>] process_scheduled_works kernel/workqueue.c:3407 [inline] fffffe8040b4fec0
       [<ffffffff81757195>] worker_thread+0x6b5/0xfa0 kernel/workqueue.c:3488 fffffe8040b4fec0
       [<ffffffff817782a0>] kthread+0x360/0x4c0 kernel/kthread.c:397 fffffe8040b4ff78
       [<ffffffff814d8594>] ret_from_fork+0x74/0xe0 arch/x86/kernel/process.c:172 fffffe8040b4ffb8
       [<ffffffff8110f509>] ret_from_fork_asm+0x29/0xc0 arch/x86/entry/entry_64.S:399 fffffe8040b4ffe8
       </TASK>
      Modules linked in:
      
      Fixes: eb28fd76c0a0 ("gtp: Destroy device along with udp socket's netns dismantle.")
      Reported-by: Brad Spengler <spender@grsecurity.net>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://patch.msgid.link/20250217203705.40342-2-kuniyu@amazon.com
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      822fd26a
    • net: pse-pd: pd692x0: Fix power limit retrieval · 31d1b918
      Kory Maincent authored and Frieder Schrempf committed
      
      [ Upstream commit f6093c5ec74d5cc495f89bd359253d9c738d04d9 ]
      
      Fix incorrect data offset read in the pd692x0_pi_get_pw_limit callback.
      The issue was previously unnoticed as it was only used by the regulator
      API and not thoroughly tested, since the PSE is mainly controlled via
      ethtool.
      
      The function became actively used by ethtool after commit 3e9dbfec4998
      ("net: pse-pd: Split ethtool_get_status into multiple callbacks"),
      which led to the discovery of this issue.
      
      Fix it by using the correct data offset.
      
      Fixes: a87e699c ("net: pse-pd: pd692x0: Enhance with new current limit and voltage read callbacks")
      Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
      Link: https://patch.msgid.link/20250217134812.1925345-1-kory.maincent@bootlin.com
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      31d1b918
    • net: pse-pd: Use power limit at driver side instead of current limit · e3d3dad2
      Kory Maincent authored and Frieder Schrempf committed
      
      [ Upstream commit e0a5e2bba38aa61a900934b45d6e846e0a6d7524 ]
      
      The regulator framework uses current limits, but the PSE standard and
      known PSE controllers rely on power limits. Instead of converting
      current to power within each driver, perform the conversion in the PSE
      core. This avoids redundancy in driver implementation and aligns better
      with the standard, simplifying driver development.
      
      At the same time, remove the _pse_ethtool_get_status() function, which is
      no longer needed.
      
      Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Stable-dep-of: f6093c5ec74d ("net: pse-pd: pd692x0: Fix power limit retrieval")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e3d3dad2
    • net: pse-pd: Avoid setting max_uA in regulator constraints · d1d8799f
      Kory Maincent authored and Frieder Schrempf committed
      
      [ Upstream commit 675d0e3cacc3ae7c29294a5f6a820187f862ad8b ]
      
      Setting the max_uA constraint in the regulator API imposes a current
      limit during the regulator registration process. This behavior conflicts
      with preserving the maximum PI power budget configuration across reboots.
      
      Instead, compare the desired current limit to MAX_PI_CURRENT in the
      pse_pi_set_current_limit() function to ensure proper handling of the
      power budget.
      
      Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Stable-dep-of: f6093c5ec74d ("net: pse-pd: pd692x0: Fix power limit retrieval")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d1d8799f
    • tcp: adjust rcvq_space after updating scaling ratio · 85dbccb0
      Jakub Kicinski authored and Frieder Schrempf committed
      
      [ Upstream commit f5da7c45188eea71394bf445655cae2df88a7788 ]
      
      Since the commit under Fixes, we set the window clamp in accordance
      with the newly measured rcvbuf scaling_ratio. If the scaling_ratio
      decreased significantly, we may put ourselves in a situation
      where windows become smaller than rcvq_space, preventing
      tcp_rcv_space_adjust() from increasing rcvbuf.
      
      The significant decrease of scaling_ratio is far more likely
      since commit 697a6c8c ("tcp: increase the default TCP scaling ratio"),
      which increased the "default" scaling ratio from ~30% to 50%.
      
      Hitting the bad condition depends a lot on TCP tuning and the
      drivers at play. One of Meta's workloads hits it reliably
      under the following conditions:
       - default rcvbuf of 125k
       - sender MTU 1500, receiver MTU 5000
       - driver settles on scaling_ratio of 78 for the config above.
      Initial rcvq_space gets calculated as TCP_INIT_CWND * tp->advmss
      (10 * 5k = 50k). Once we find out the true scaling ratio and
      MSS we clamp the windows to 38k. Triggering the condition also
      depends on the message sequence of this workload. I can't repro
      the problem with simple iperf or TCP_RR-style tests.
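      
      The numbers above can be checked with a few lines of arithmetic. This
      is a sketch under assumptions: it takes scaling_ratio to be a fraction
      of 2**8 (so 78 is roughly 30%), uses the rounded "5k" advmss from the
      text, and win_from_space() is a stand-in written for this example, not
      the kernel helper itself.

```python
TCP_RMEM_TO_WIN_SCALE = 8  # assumption: scaling_ratio is expressed in 1/256 units
TCP_INIT_CWND = 10


def win_from_space(space, scaling_ratio):
    """Share of the receive buffer usable as window (illustrative helper)."""
    return (space * scaling_ratio) >> TCP_RMEM_TO_WIN_SCALE


rcvbuf = 125_000       # default rcvbuf of 125k
advmss = 5_000         # rounded "5k" from the commit message
scaling_ratio = 78     # ratio the driver settles on

initial_rcvq_space = TCP_INIT_CWND * advmss             # 10 * 5k
window_clamp = win_from_space(rcvbuf, scaling_ratio)    # about 38k

# The clamp ends up below rcvq_space, which is the bad condition described.
clamp_below_rcvq_space = window_clamp < initial_rcvq_space
```

      With these inputs initial_rcvq_space is 50000 and window_clamp is
      38085, so the window can no longer grow past rcvq_space until
      rcvq_space itself is adjusted.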
      
      Fixes: a2cbb160 ("tcp: Update window clamping condition")
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Neal Cardwell <ncardwell@google.com>
      Link: https://patch.msgid.link/20250217232905.3162187-1-kuba@kernel.org
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      85dbccb0
    • vsock/bpf: Warn on socket without transport · d5b03da0
      Michal Luczaj authored and Frieder Schrempf committed
      
      [ Upstream commit 857ae05549ee2542317e7084ecaa5f8536634dd9 ]
      
      In the spirit of commit 91751e248256 ("vsock: prevent null-ptr-deref in
      vsock_*[has_data|has_space]"), harden the "impossible" cases with a
      warning.
      
      Fixes: 634f1a71 ("vsock: support sockmap")
      Signed-off-by: Michal Luczaj <mhal@rbox.co>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d5b03da0
    • sockmap, vsock: For connectible sockets allow only connected · 84d6ba9a
      Michal Luczaj authored and Frieder Schrempf committed
      
      [ Upstream commit 8fb5bb169d17cdd12c2dcc2e96830ed487d77a0f ]
      
      sockmap expects all vsocks to have a transport assigned, which is expressed
      in vsock_proto::psock_update_sk_prot(). However, there is an edge case
      where an unconnected (connectible) socket may lose its previously assigned
      transport. This is handled with a NULL check in the vsock/BPF recv path.
      
      Another design detail is that listening vsocks are not supposed to have any
      transport assigned at all, which implies they are not supported by
      sockmap. But this is complicated by the fact that a socket, before
      switching to TCP_LISTEN, may have had some transport assigned during a
      failed connect() attempt. Hence, we may end up with a listening vsock in a
      sockmap, which blows up quickly:
      
      KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127]
      CPU: 7 UID: 0 PID: 56 Comm: kworker/7:0 Not tainted 6.14.0-rc1+
      Workqueue: vsock-loopback vsock_loopback_work
      RIP: 0010:vsock_read_skb+0x4b/0x90
      Call Trace:
       sk_psock_verdict_data_ready+0xa4/0x2e0
       virtio_transport_recv_pkt+0x1ca8/0x2acc
       vsock_loopback_work+0x27d/0x3f0
       process_one_work+0x846/0x1420
       worker_thread+0x5b3/0xf80
       kthread+0x35a/0x700
       ret_from_fork+0x2d/0x70
       ret_from_fork_asm+0x1a/0x30
      
      For connectible sockets, instead of relying solely on the state of
      vsk->transport, tell sockmap to only allow those representing established
      connections. This aligns with the behaviour for AF_INET and AF_UNIX.
      
      Fixes: 634f1a71 ("vsock: support sockmap")
      Signed-off-by: Michal Luczaj <mhal@rbox.co>
      Acked-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      84d6ba9a
    • ibmvnic: Don't reference skb after sending to VIOS · 5e3218f9
      Nick Child authored and Frieder Schrempf committed
      
      [ Upstream commit bdf5d13aa05ec314d4385b31ac974d6c7e0997c9 ]
      
      Previously, after successfully flushing the xmit buffer to VIOS,
      the tx_bytes stat was incremented by the length of the skb.
      
      It is invalid to access the skb memory after sending the buffer to
      the VIOS because, at any point after sending, the VIOS can trigger
      an interrupt to free this memory. A race between reading skb->len
      and freeing the skb is possible (especially during LPM) and will
      result in use-after-free:
       ==================================================================
       BUG: KASAN: slab-use-after-free in ibmvnic_xmit+0x75c/0x1808 [ibmvnic]
       Read of size 4 at addr c00000024eb48a70 by task hxecom/14495
       <...>
       Call Trace:
       [c000000118f66cf0] [c0000000018cba6c] dump_stack_lvl+0x84/0xe8 (unreliable)
       [c000000118f66d20] [c0000000006f0080] print_report+0x1a8/0x7f0
       [c000000118f66df0] [c0000000006f08f0] kasan_report+0x128/0x1f8
       [c000000118f66f00] [c0000000006f2868] __asan_load4+0xac/0xe0
       [c000000118f66f20] [c0080000046eac84] ibmvnic_xmit+0x75c/0x1808 [ibmvnic]
       [c000000118f67340] [c0000000014be168] dev_hard_start_xmit+0x150/0x358
       <...>
       Freed by task 0:
       kasan_save_stack+0x34/0x68
       kasan_save_track+0x2c/0x50
       kasan_save_free_info+0x64/0x108
       __kasan_mempool_poison_object+0x148/0x2d4
       napi_skb_cache_put+0x5c/0x194
       net_tx_action+0x154/0x5b8
       handle_softirqs+0x20c/0x60c
       do_softirq_own_stack+0x6c/0x88
       <...>
       The buggy address belongs to the object at c00000024eb48a00 which
        belongs to the cache skbuff_head_cache of size 224
      ==================================================================
      
      Fixes: 032c5e82 ("Driver for IBM System i/p VNIC protocol")
      Signed-off-by: Nick Child <nnac123@linux.ibm.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/20250214155233.235559-1-nnac123@linux.ibm.com
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5e3218f9
    • ibmvnic: Add stat for tx direct vs tx batched · b22fac74
      Nick Child authored and Frieder Schrempf committed
      
      [ Upstream commit 2ee73c54a615b74d2e7ee6f20844fd3ba63fc485 ]
      
      Allow tracking of packets sent with send_subcrq direct vs
      indirect. `ethtool -S <dev>` will now provide a counter
      of the number of uses of each xmit method. This metric will
      be useful in performance debugging.
      
      Signed-off-by: Nick Child <nnac123@linux.ibm.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/20241001163531.1803152-1-nnac123@linux.ibm.com
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Stable-dep-of: bdf5d13aa05e ("ibmvnic: Don't reference skb after sending to VIOS")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b22fac74
    • s390/ism: add release function for struct device · fb15c604
      Julian Ruess authored and Frieder Schrempf committed
      
      [ Upstream commit 915e34d5ad35a6a9e56113f852ade4a730fb88f0 ]
      
      According to device_release() in drivers/base/core.c,
      a device without a release function is a broken device
      and must be fixed.
      
      The current code directly frees the device after calling device_add()
      without waiting for other kernel parts to release their references.
      Thus, a reference could still be held to a struct device,
      e.g., by sysfs, leading to potential use-after-free
      issues if a proper release function is not set.
      
      Fixes: 8c81ba20 ("net/smc: De-tangle ism and smc device initialization")
      Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
      Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
      Signed-off-by: Julian Ruess <julianr@linux.ibm.com>
      Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/20250214120137.563409-1-wintera@linux.ibm.com
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      fb15c604
    • ALSA: seq: Drop UMP events when no UMP-conversion is set · d3798a6f
      Takashi Iwai authored and Frieder Schrempf committed
      
      [ Upstream commit e77aa4b2eaa7fb31b2a7a50214ecb946b2a8b0f6 ]
      
      When a destination client is a user client in the legacy MIDI mode and
      it sets the no-UMP-conversion flag, all UMP events are currently
      still passed as-is.  But this may confuse user space, because the
      event packet size is different from the legacy mode.
      
      Since we cannot handle UMP events in user clients unless they are running
      in the UMP client mode, we should filter out those events instead of
      accepting them blindly.  This patch addresses it by slightly adjusting the
      conditions for UMP event handling at the event delivery time.
      
      Fixes: 329ffe11 ("ALSA: seq: Allow suppressing UMP conversions")
      Link: https://lore.kernel.org/b77a2cd6-7b59-4eb0-a8db-22d507d3af5f@gmail.com
      Link: https://patch.msgid.link/20250217170034.21930-1-tiwai@suse.de
      
      
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d3798a6f
    • net/sched: cls_api: fix error handling causing NULL dereference · c3db075e
      Pierre Riteau authored and Frieder Schrempf committed
      
      [ Upstream commit 071ed42cff4fcdd89025d966d48eabef59913bf2 ]
      
      tcf_exts_miss_cookie_base_alloc() calls xa_alloc_cyclic() which can
      return 1 if the allocation succeeded after wrapping. This was treated as
      an error, with value 1 returned to caller tcf_exts_init_ex() which sets
      exts->actions to NULL and returns 1 to caller fl_change().
      
      fl_change() treats err == 1 as success, calling tcf_exts_validate_ex()
      which calls tcf_action_init() with exts->actions as argument, where it
      is dereferenced.
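      
      The return-value mix-up can be modelled in a few lines. This is an
      illustrative sketch, not the kernel code: the function names are
      hypothetical, None stands in for a NULL exts->actions, and the "fixed"
      variant simply assumes that only negative values are errors.

```python
ENOMEM = 12


class NullDeref(Exception):
    """Stands in for the NULL pointer dereference in the trace."""


def exts_init_buggy(alloc_ret):
    """Treats any non-zero allocator result as failure, including 1."""
    if alloc_ret:
        return alloc_ret, None   # actions left NULL, 1 propagated to the caller
    return 0, []


def exts_init_fixed(alloc_ret):
    """Treats only negative values as failure (assumed shape of the fix)."""
    if alloc_ret < 0:
        return alloc_ret, None
    return 0, []


def change_filter(exts_init, alloc_ret):
    """Caller that, like fl_change() in the text, only bails out on err < 0."""
    err, actions = exts_init(alloc_ret)
    if err < 0:
        return err
    if actions is None:          # validation step would dereference NULL here
        raise NullDeref("actions is NULL")
    return 0
```

      Passing alloc_ret=1 (allocation succeeded after wrapping) to
      change_filter() raises NullDeref with exts_init_buggy and returns 0
      with exts_init_fixed; a real failure such as -ENOMEM is returned
      unchanged by both.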
      
      Example trace:
      
      BUG: kernel NULL pointer dereference, address: 0000000000000000
      CPU: 114 PID: 16151 Comm: handler114 Kdump: loaded Not tainted 5.14.0-503.16.1.el9_5.x86_64 #1
      RIP: 0010:tcf_action_init+0x1f8/0x2c0
      Call Trace:
       tcf_action_init+0x1f8/0x2c0
       tcf_exts_validate_ex+0x175/0x190
       fl_change+0x537/0x1120 [cls_flower]
      
      Fixes: 80cd22c3 ("net/sched: cls_api: Support hardware miss to tc action")
      Signed-off-by: Pierre Riteau <pierre@stackhpc.com>
      Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Link: https://patch.msgid.link/20250213223610.320278-1-pierre@stackhpc.com
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c3db075e
    • ALSA: hda/cirrus: Correct the full scale volume set logic · 3c8c4804
      Vitaly Rodionov authored and Frieder Schrempf committed
      
      [ Upstream commit 08b613b9e2ba431db3bd15cb68ca72472a50ef5c ]
      
      This patch corrects the full-scale volume setting logic. On certain
      platforms, the full-scale volume bit is required. The current logic
      mistakenly sets this bit and incorrectly clears reserved bit 0, causing
      the headphone output to be muted.
      
      Fixes: 342b6b61 ("ALSA: hda/cs8409: Fix Full Scale Volume setting for all variants")
      Signed-off-by: Vitaly Rodionov <vitalyr@opensource.cirrus.com>
      Link: https://patch.msgid.link/20250214210736.30814-1-vitalyr@opensource.cirrus.com
      
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3c8c4804
    • geneve: Fix use-after-free in geneve_find_dev(). · 2932bad2
      Kuniyuki Iwashima authored and Frieder Schrempf committed
      
      [ Upstream commit 9593172d93b9f91c362baec4643003dc29802929 ]
      
      syzkaller reported a use-after-free in geneve_find_dev() [0]
      without repro.
      
      geneve_configure() links struct geneve_dev.next to
      net_generic(net, geneve_net_id)->geneve_list.
      
      The net here could differ from dev_net(dev) if IFLA_NET_NS_PID,
      IFLA_NET_NS_FD, or IFLA_TARGET_NETNSID is set.
      
      When dev_net(dev) is dismantled, geneve_exit_batch_rtnl() finally
      calls unregister_netdevice_queue() for each dev in the netns,
      and later the dev is freed.
      
      However, its geneve_dev.next is still linked to the backend UDP
      socket netns.
      
      Then, use-after-free will occur when another geneve dev is created
      in the netns.
      
      Let's call geneve_dellink() instead in geneve_destroy_tunnels().
      
      [0]:
      BUG: KASAN: slab-use-after-free in geneve_find_dev drivers/net/geneve.c:1295 [inline]
      BUG: KASAN: slab-use-after-free in geneve_configure+0x234/0x858 drivers/net/geneve.c:1343
      Read of size 2 at addr ffff000054d6ee24 by task syz.1.4029/13441
      
      CPU: 1 UID: 0 PID: 13441 Comm: syz.1.4029 Not tainted 6.13.0-g0ad9617c78ac #24 dc35ca22c79fb82e8e7bc5c9c9adafea898b1e3d
      Hardware name: linux,dummy-virt (DT)
      Call trace:
       show_stack+0x38/0x50 arch/arm64/kernel/stacktrace.c:466 (C)
       __dump_stack lib/dump_stack.c:94 [inline]
       dump_stack_lvl+0xbc/0x108 lib/dump_stack.c:120
       print_address_description mm/kasan/report.c:378 [inline]
       print_report+0x16c/0x6f0 mm/kasan/report.c:489
       kasan_report+0xc0/0x120 mm/kasan/report.c:602
       __asan_report_load2_noabort+0x20/0x30 mm/kasan/report_generic.c:379
       geneve_find_dev drivers/net/geneve.c:1295 [inline]
       geneve_configure+0x234/0x858 drivers/net/geneve.c:1343
       geneve_newlink+0xb8/0x128 drivers/net/geneve.c:1634
       rtnl_newlink_create+0x23c/0x868 net/core/rtnetlink.c:3795
       __rtnl_newlink net/core/rtnetlink.c:3906 [inline]
       rtnl_newlink+0x1054/0x1630 net/core/rtnetlink.c:4021
       rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6911
       netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2543
       rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6938
       netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
       netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1348
       netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1892
       sock_sendmsg_nosec net/socket.c:713 [inline]
       __sock_sendmsg net/socket.c:728 [inline]
       ____sys_sendmsg+0x410/0x6f8 net/socket.c:2568
       ___sys_sendmsg+0x178/0x1d8 net/socket.c:2622
       __sys_sendmsg net/socket.c:2654 [inline]
       __do_sys_sendmsg net/socket.c:2659 [inline]
       __se_sys_sendmsg net/socket.c:2657 [inline]
       __arm64_sys_sendmsg+0x12c/0x1c8 net/socket.c:2657
       __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
       invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49
       el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132
       do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151
       el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744
       el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762
       el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600
      
      Allocated by task 13247:
       kasan_save_stack mm/kasan/common.c:47 [inline]
       kasan_save_track+0x30/0x68 mm/kasan/common.c:68
       kasan_save_alloc_info+0x44/0x58 mm/kasan/generic.c:568
       poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
       __kasan_kmalloc+0x84/0xa0 mm/kasan/common.c:394
       kasan_kmalloc include/linux/kasan.h:260 [inline]
       __do_kmalloc_node mm/slub.c:4298 [inline]
       __kmalloc_node_noprof+0x2a0/0x560 mm/slub.c:4304
       __kvmalloc_node_noprof+0x9c/0x230 mm/util.c:645
       alloc_netdev_mqs+0xb8/0x11a0 net/core/dev.c:11470
       rtnl_create_link+0x2b8/0xb50 net/core/rtnetlink.c:3604
       rtnl_newlink_create+0x19c/0x868 net/core/rtnetlink.c:3780
       __rtnl_newlink net/core/rtnetlink.c:3906 [inline]
       rtnl_newlink+0x1054/0x1630 net/core/rtnetlink.c:4021
       rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6911
       netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2543
       rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6938
       netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
       netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1348
       netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1892
       sock_sendmsg_nosec net/socket.c:713 [inline]
       __sock_sendmsg net/socket.c:728 [inline]
       ____sys_sendmsg+0x410/0x6f8 net/socket.c:2568
       ___sys_sendmsg+0x178/0x1d8 net/socket.c:2622
       __sys_sendmsg net/socket.c:2654 [inline]
       __do_sys_sendmsg net/socket.c:2659 [inline]
       __se_sys_sendmsg net/socket.c:2657 [inline]
       __arm64_sys_sendmsg+0x12c/0x1c8 net/socket.c:2657
       __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
       invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49
       el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132
       do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151
       el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744
       el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762
       el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600
      
      Freed by task 45:
       kasan_save_stack mm/kasan/common.c:47 [inline]
       kasan_save_track+0x30/0x68 mm/kasan/common.c:68
       kasan_save_free_info+0x58/0x70 mm/kasan/generic.c:582
       poison_slab_object mm/kasan/common.c:247 [inline]
       __kasan_slab_free+0x48/0x68 mm/kasan/common.c:264
       kasan_slab_free include/linux/kasan.h:233 [inline]
       slab_free_hook mm/slub.c:2353 [inline]
       slab_free mm/slub.c:4613 [inline]
       kfree+0x140/0x420 mm/slub.c:4761
       kvfree+0x4c/0x68 mm/util.c:688
       netdev_release+0x94/0xc8 net/core/net-sysfs.c:2065
       device_release+0x98/0x1c0
       kobject_cleanup lib/kobject.c:689 [inline]
       kobject_release lib/kobject.c:720 [inline]
       kref_put include/linux/kref.h:65 [inline]
       kobject_put+0x2b0/0x438 lib/kobject.c:737
       netdev_run_todo+0xe5c/0xfc8 net/core/dev.c:11185
       rtnl_unlock+0x20/0x38 net/core/rtnetlink.c:151
       cleanup_net+0x4fc/0x8c0 net/core/net_namespace.c:648
       process_one_work+0x700/0x1398 kernel/workqueue.c:3236
       process_scheduled_works kernel/workqueue.c:3317 [inline]
       worker_thread+0x8c4/0xe10 kernel/workqueue.c:3398
       kthread+0x4bc/0x608 kernel/kthread.c:464
       ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862
      
      The buggy address belongs to the object at ffff000054d6e000
       which belongs to the cache kmalloc-cg-4k of size 4096
      The buggy address is located 3620 bytes inside of
       freed 4096-byte region [ffff000054d6e000, ffff000054d6f000)
      
      The buggy address belongs to the physical page:
      page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x94d68
      head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      memcg:ffff000016276181
      flags: 0x3fffe0000000040(head|node=0|zone=0|lastcpupid=0x1ffff)
      page_type: f5(slab)
      raw: 03fffe0000000040 ffff0000c000f500 dead000000000122 0000000000000000
      raw: 0000000000000000 0000000000040004 00000001f5000000 ffff000016276181
      head: 03fffe0000000040 ffff0000c000f500 dead000000000122 0000000000000000
      head: 0000000000000000 0000000000040004 00000001f5000000 ffff000016276181
      head: 03fffe0000000003 fffffdffc1535a01 ffffffffffffffff 0000000000000000
      head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff000054d6ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff000054d6ed80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff000054d6ee00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                     ^
       ffff000054d6ee80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff000054d6ef00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: 2d07dc79 ("geneve: add initial netdev driver for GENEVE tunnels")
      Reported-by: syzkaller <syzkaller@googlegroups.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://patch.msgid.link/20250213043354.91368-1-kuniyu@amazon.com
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2932bad2
    • vsock/virtio: fix variables initialization during resuming · 600f36bd
      Junnan Wu authored and Frieder Schrempf committed
      
      [ Upstream commit 55eff109e76a14e5ed10c8c3c3978d20a35e2a4d ]
      
      When executing suspend to RAM twice in a row,
      `rx_buf_nr` and `rx_buf_max_nr` increase to three times vq->num_free.
      Then, after virtqueue_get_buf() returns buffers and `rx_buf_nr` is
      decreased in virtio_transport_rx_work(),
      the condition to refill the rx buffers
      (rx_buf_nr < rx_buf_max_nr / 2) will never be met.
      
      This is because `rx_buf_nr` and `rx_buf_max_nr`
      are initialized only in virtio_vsock_probe(),
      but they should be reset whenever virtqueues are recreated,
      like after a suspend/resume.
      
      Move the `rx_buf_nr` and `rx_buf_max_nr` initialization into
      virtio_vsock_vqs_init(), so we are sure that they are properly
      initialized every time we initialize the virtqueues, either when we
      load the driver or after a suspend/resume.
      
      To prevent erroneous atomic load operations on the `queued_replies`
      in the virtio_transport_send_pkt_work() function
      which may disrupt the scheduling of vsock->rx_work
      when transmitting reply-required socket packets,
      this atomic variable must undergo synchronized initialization
      alongside the preceding two variables after a suspend/resume.
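      
      The stuck refill condition can be illustrated with a small user-space
      model. This is only a sketch under stated assumptions: `struct rx_state`,
      `rx_fill()`, `vqs_init_buggy()`, `vqs_init_fixed()` and
      `needs_refill_after()` are hypothetical names, not the driver's code.
      Only the counter names and the `rx_buf_nr < rx_buf_max_nr / 2` test are
      taken from the description above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two counters named in the commit message. */
struct rx_state {
    int rx_buf_nr;      /* buffers the driver believes are queued */
    int rx_buf_max_nr;  /* high-water mark of rx_buf_nr */
};

/* Model of filling the RX virtqueue with num_free fresh buffers. */
static void rx_fill(struct rx_state *s, int num_free)
{
    s->rx_buf_nr += num_free;
    if (s->rx_buf_nr > s->rx_buf_max_nr)
        s->rx_buf_max_nr = s->rx_buf_nr;
}

/* Before the fix: counters survive virtqueue re-creation. */
static void vqs_init_buggy(struct rx_state *s, int num_free)
{
    rx_fill(s, num_free);
}

/* After the fix: counters are reset every time the virtqueues are set up. */
static void vqs_init_fixed(struct rx_state *s, int num_free)
{
    s->rx_buf_nr = 0;
    s->rx_buf_max_nr = 0;
    rx_fill(s, num_free);
}

/* Consume n buffers, then evaluate the refill condition. */
static bool needs_refill_after(struct rx_state *s, int n)
{
    s->rx_buf_nr -= n;
    return s->rx_buf_nr < s->rx_buf_max_nr / 2;
}
```

      In this model, one probe plus two resumes with 128 free slots leaves the
      buggy counters at 384. Even after all 128 real buffers are consumed, 256
      is not below 192, so no refill is ever requested.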
      
      Fixes: bd50c5dc ("vsock/virtio: add support for device suspend/resume")
      Link: https://lore.kernel.org/virtualization/20250207052033.2222629-1-junnan01.wu@samsung.com/
      
      
      Co-developed-by: Ying Gao <ying01.gao@samsung.com>
      Signed-off-by: Ying Gao <ying01.gao@samsung.com>
      Signed-off-by: Junnan Wu <junnan01.wu@samsung.com>
      Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Link: https://patch.msgid.link/20250214012200.1883896-1-junnan01.wu@samsung.com
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      600f36bd
    • ASoC: imx-audmix: remove cpu_mclk which is from cpu dai device · a0d74394
      Shengjiu Wang authored and Frieder Schrempf committed
      
      [ Upstream commit 571b69f2f9b1ec7cf7d0e9b79e52115a87a869c4 ]
      
      When probe deferral happens, the following error may occur:
      
      platform 59820000.sai: Resources present before probing
      
      The cpu_mclk clock is from the cpu dai device, if it is not released,
      then the cpu dai device probe will fail for the second time.
      
      The cpu_mclk is used to get rate for rate constraint, rate constraint
      may be specific for each platform, which is not necessary for machine
      driver, so remove it.
      
      Fixes: b86ef536 ("ASoC: fsl: Add Audio Mixer machine driver")
      Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
      Link: https://patch.msgid.link/20250213070518.547375-1-shengjiu.wang@nxp.com
      
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a0d74394
    • powerpc/code-patching: Fix KASAN hit by not flagging text patching area as VM_ALLOC · 287e6028
      Christophe Leroy authored and Frieder Schrempf committed
      
      [ Upstream commit d262a192d38e527faa5984629aabda2e0d1c4f54 ]
      
      Erhard reported the following KASAN hit while booting his PowerMac G4
      with a KASAN-enabled kernel 6.13-rc6:
      
        BUG: KASAN: vmalloc-out-of-bounds in copy_to_kernel_nofault+0xd8/0x1c8
        Write of size 8 at addr f1000000 by task chronyd/1293
      
        CPU: 0 UID: 123 PID: 1293 Comm: chronyd Tainted: G        W          6.13.0-rc6-PMacG4 #2
        Tainted: [W]=WARN
        Hardware name: PowerMac3,6 7455 0x80010303 PowerMac
        Call Trace:
        [c2437590] [c1631a84] dump_stack_lvl+0x70/0x8c (unreliable)
        [c24375b0] [c0504998] print_report+0xdc/0x504
        [c2437610] [c050475c] kasan_report+0xf8/0x108
        [c2437690] [c0505a3c] kasan_check_range+0x24/0x18c
        [c24376a0] [c03fb5e4] copy_to_kernel_nofault+0xd8/0x1c8
        [c24376c0] [c004c014] patch_instructions+0x15c/0x16c
        [c2437710] [c00731a8] bpf_arch_text_copy+0x60/0x7c
        [c2437730] [c0281168] bpf_jit_binary_pack_finalize+0x50/0xac
        [c2437750] [c0073cf4] bpf_int_jit_compile+0xb30/0xdec
        [c2437880] [c0280394] bpf_prog_select_runtime+0x15c/0x478
        [c24378d0] [c1263428] bpf_prepare_filter+0xbf8/0xc14
        [c2437990] [c12677ec] bpf_prog_create_from_user+0x258/0x2b4
        [c24379d0] [c027111c] do_seccomp+0x3dc/0x1890
        [c2437ac0] [c001d8e0] system_call_exception+0x2dc/0x420
        [c2437f30] [c00281ac] ret_from_syscall+0x0/0x2c
        --- interrupt: c00 at 0x5a1274
        NIP:  005a1274 LR: 006a3b3c CTR: 005296c8
        REGS: c2437f40 TRAP: 0c00   Tainted: G        W           (6.13.0-rc6-PMacG4)
        MSR:  0200f932 <VEC,EE,PR,FP,ME,IR,DR,RI>  CR: 24004422  XER: 00000000
      
        GPR00: 00000166 af8f3fa0 a7ee3540 00000001 00000000 013b6500 005a5858 0200f932
        GPR08: 00000000 00001fe9 013d5fc8 005296c8 2822244c 00b2fcd8 00000000 af8f4b57
        GPR16: 00000000 00000001 00000000 00000000 00000000 00000001 00000000 00000002
        GPR24: 00afdbb0 00000000 00000000 00000000 006e0004 013ce060 006e7c1c 00000001
        NIP [005a1274] 0x5a1274
        LR [006a3b3c] 0x6a3b3c
        --- interrupt: c00
      
        The buggy address belongs to the virtual mapping at
         [f1000000, f1002000) created by:
         text_area_cpu_up+0x20/0x190
      
        The buggy address belongs to the physical page:
        page: refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x76e30
        flags: 0x80000000(zone=2)
        raw: 80000000 00000000 00000122 00000000 00000000 00000000 ffffffff 00000001
        raw: 00000000
        page dumped because: kasan: bad access detected
      
        Memory state around the buggy address:
         f0ffff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         f0ffff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        >f1000000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
                   ^
         f1000080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
         f1000100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
        ==================================================================
      
      f8 corresponds to KASAN_VMALLOC_INVALID which means the area is not
      initialised hence not supposed to be used yet.
      
      Powerpc text patching infrastructure allocates a virtual memory area
      using get_vm_area() and flags it as VM_ALLOC. But that flag is meant
      to be used for vmalloc() and vmalloc() allocated memory is not
      supposed to be used before a call to __vmalloc_node_range() which is
      never called for that area.
      
      That went undetected until commit e4137f08816b ("mm, kasan, kmsan:
      instrument copy_from/to_kernel_nofault").
      
      The area allocated by text_area_cpu_up() is not vmalloc memory, it is
      mapped directly on demand when needed by map_kernel_page(). There is
      no VM flag corresponding to such usage, so just pass no flag. That way
      the area will be unpoisoned and usable immediately.
      
      Reported-by: Erhard Furtner <erhard_f@mailbox.org>
      Closes: https://lore.kernel.org/all/20250112135832.57c92322@yea/
      
      Fixes: 37bc3e5f ("powerpc/lib/code-patching: Use alternate map for patch_instruction()")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
      Link: https://patch.msgid.link/06621423da339b374f48c0886e3a5db18e896be8.1739342693.git.christophe.leroy@csgroup.eu
      
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      287e6028
    • ALSA: hda/realtek: Fixup ALC225 depop procedure · 87ba79cc
      Kailang Yang authored and Frieder Schrempf committed
      
      [ Upstream commit 174448badb4409491bfba2e6b46f7aa078741c5e ]
      
      Headset MIC will not function when power_save=0.
      
      Fixes: 1fd50509fe14 ("ALSA: hda/realtek: Update ALC225 depop procedure")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=219743
      
      
      Signed-off-by: Kailang Yang <kailang@realtek.com>
      Link: https://lore.kernel.org/0474a095ab0044d0939ec4bf4362423d@realtek.com
      
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      87ba79cc
    • powerpc/64s: Rewrite __real_pte() and __rpte_to_hidx() as static inline · bc2ebfcc
      Christophe Leroy authored and Frieder Schrempf committed
      
      [ Upstream commit 61bcc752d1b81fde3cae454ff20c1d3c359df500 ]
      
      Rewrite __real_pte() and __rpte_to_hidx() as static inline in order to
      avoid the following warnings/errors when building with 4k page size:
      
      	  CC      arch/powerpc/mm/book3s64/hash_tlb.o
      	arch/powerpc/mm/book3s64/hash_tlb.c: In function 'hpte_need_flush':
      	arch/powerpc/mm/book3s64/hash_tlb.c:49:16: error: variable 'offset' set but not used [-Werror=unused-but-set-variable]
      	   49 |         int i, offset;
      	      |                ^~~~~~
      
      	  CC      arch/powerpc/mm/book3s64/hash_native.o
      	arch/powerpc/mm/book3s64/hash_native.c: In function 'native_flush_hash_range':
      	arch/powerpc/mm/book3s64/hash_native.c:782:29: error: variable 'index' set but not used [-Werror=unused-but-set-variable]
      	  782 |         unsigned long hash, index, hidx, shift, slot;
      	      |                             ^~~~~
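      
      The warning pattern can be reproduced in miniature. The sketch below is
      hypothetical: `HIDX_SHIFT`, `RPTE_TO_HIDX_MACRO`, `rpte_to_hidx_inline()`
      and `flush_slot()` are made-up names and values for illustration, not the
      kernel definitions. It shows why a macro that drops an argument leaves the
      caller's variable "set but not used", while a static inline function
      taking the same argument counts as a use.

```c
#define HIDX_SHIFT 12  /* hypothetical shift value, for illustration only */

/* Macro form: 'index' never appears in the expansion, so a variable the
 * caller assigns and passes here is never actually read. */
#define RPTE_TO_HIDX_MACRO(rpte, index) ((rpte) >> HIDX_SHIFT)

/* Static inline form: 'index' is a real parameter, so passing a variable
 * reads it, even though the body ignores the value. */
static inline unsigned long rpte_to_hidx_inline(unsigned long rpte, int index)
{
    (void)index;  /* unused in this illustrative configuration */
    return rpte >> HIDX_SHIFT;
}

static unsigned long flush_slot(unsigned long rpte)
{
    int index;

    index = 3;
    /* Swapping in RPTE_TO_HIDX_MACRO(rpte, index) here would leave 'index'
     * assigned but never read, which GCC reports under
     * -Wunused-but-set-variable. */
    return rpte_to_hidx_inline(rpte, index);
}
```

      Both forms compute the same value; only the inline version keeps the
      caller's variable "used" from the compiler's point of view.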
      
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202501081741.AYFwybsq-lkp@intel.com/
      
      Fixes: ff31e105 ("powerpc/mm/hash64: Store the slot information at the right offset for hugetlb")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
      Link: https://patch.msgid.link/e0d340a5b7bd478ecbf245d826e6ab2778b74e06.1736706263.git.christophe.leroy@csgroup.eu
      
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      bc2ebfcc
    • powerpc/code-patching: Disable KASAN report during patching via temporary mm · b6b55e03
      Christophe Leroy authored and Frieder Schrempf committed
      
      [ Upstream commit dc9c5166c3cb044f8a001e397195242fd6796eee ]
      
      Erhard reports the following KASAN hit on Talos II (power9) with kernel 6.13:
      
      [   12.028126] ==================================================================
      [   12.028198] BUG: KASAN: user-memory-access in copy_to_kernel_nofault+0x8c/0x1a0
      [   12.028260] Write of size 8 at addr 0000187e458f2000 by task systemd/1
      
      [   12.028346] CPU: 87 UID: 0 PID: 1 Comm: systemd Tainted: G                T  6.13.0-P9-dirty #3
      [   12.028408] Tainted: [T]=RANDSTRUCT
      [   12.028446] Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
      [   12.028500] Call Trace:
      [   12.028536] [c000000008dbf3b0] [c000000001656a48] dump_stack_lvl+0xbc/0x110 (unreliable)
      [   12.028609] [c000000008dbf3f0] [c0000000006e2fc8] print_report+0x6b0/0x708
      [   12.028666] [c000000008dbf4e0] [c0000000006e2454] kasan_report+0x164/0x300
      [   12.028725] [c000000008dbf600] [c0000000006e54d4] kasan_check_range+0x314/0x370
      [   12.028784] [c000000008dbf640] [c0000000006e6310] __kasan_check_write+0x20/0x40
      [   12.028842] [c000000008dbf660] [c000000000578e8c] copy_to_kernel_nofault+0x8c/0x1a0
      [   12.028902] [c000000008dbf6a0] [c0000000000acfe4] __patch_instructions+0x194/0x210
      [   12.028965] [c000000008dbf6e0] [c0000000000ade80] patch_instructions+0x150/0x590
      [   12.029026] [c000000008dbf7c0] [c0000000001159bc] bpf_arch_text_copy+0x6c/0xe0
      [   12.029085] [c000000008dbf800] [c000000000424250] bpf_jit_binary_pack_finalize+0x40/0xc0
      [   12.029147] [c000000008dbf830] [c000000000115dec] bpf_int_jit_compile+0x3bc/0x930
      [   12.029206] [c000000008dbf990] [c000000000423720] bpf_prog_select_runtime+0x1f0/0x280
      [   12.029266] [c000000008dbfa00] [c000000000434b18] bpf_prog_load+0xbb8/0x1370
      [   12.029324] [c000000008dbfb70] [c000000000436ebc] __sys_bpf+0x5ac/0x2e00
      [   12.029379] [c000000008dbfd00] [c00000000043a228] sys_bpf+0x28/0x40
      [   12.029435] [c000000008dbfd20] [c000000000038eb4] system_call_exception+0x334/0x610
      [   12.029497] [c000000008dbfe50] [c00000000000c270] system_call_vectored_common+0xf0/0x280
      [   12.029561] --- interrupt: 3000 at 0x3fff82f5cfa8
      [   12.029608] NIP:  00003fff82f5cfa8 LR: 00003fff82f5cfa8 CTR: 0000000000000000
      [   12.029660] REGS: c000000008dbfe80 TRAP: 3000   Tainted: G                T   (6.13.0-P9-dirty)
      [   12.029735] MSR:  900000000280f032 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI>  CR: 42004848  XER: 00000000
      [   12.029855] IRQMASK: 0
                     GPR00: 0000000000000169 00003fffdcf789a0 00003fff83067100 0000000000000005
                     GPR04: 00003fffdcf78a98 0000000000000090 0000000000000000 0000000000000008
                     GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
                     GPR12: 0000000000000000 00003fff836ff7e0 c000000000010678 0000000000000000
                     GPR16: 0000000000000000 0000000000000000 00003fffdcf78f28 00003fffdcf78f90
                     GPR20: 0000000000000000 0000000000000000 0000000000000000 00003fffdcf78f80
                     GPR24: 00003fffdcf78f70 00003fffdcf78d10 00003fff835c7239 00003fffdcf78bd8
                     GPR28: 00003fffdcf78a98 0000000000000000 0000000000000000 000000011f547580
      [   12.030316] NIP [00003fff82f5cfa8] 0x3fff82f5cfa8
      [   12.030361] LR [00003fff82f5cfa8] 0x3fff82f5cfa8
      [   12.030405] --- interrupt: 3000
      [   12.030444] ==================================================================
      
      Commit c28c15b6 ("powerpc/code-patching: Use temporary mm for
      Radix MMU") is inspired by x86 but, unlike x86, it doesn't disable
      KASAN reports during patching. This wasn't a problem at the beginning
      because __patch_mem() is not instrumented.
      
      Commit 465cabc9 ("powerpc/code-patching: introduce
      patch_instructions()") uses copy_to_kernel_nofault() to copy several
      instructions at once. But when using temporary mm the destination is
      not regular kernel memory but a kind of kernel-like memory located
      in user address space. Because it is not in kernel address space it is
      not covered by KASAN shadow memory. Since commit e4137f08816b ("mm,
      kasan, kmsan: instrument copy_from/to_kernel_nofault") KASAN reports
      bad accesses from copy_to_kernel_nofault(). Here a bad access to user
      memory is reported because KASAN detects the lack of shadow memory and
      the address is below TASK_SIZE.
      
      Do like x86 in commit b3fd8e83 ("x86/alternatives: Use temporary
      mm for text poking") and disable KASAN reports during patching when
      using temporary mm.
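      
      The idea of suppressing reports around the patching write can be modelled
      in user space. Everything below is a hypothetical sketch, loosely modelled
      on a nesting disable counter: `MODEL_TASK_SIZE`, `model_kasan_disable()`,
      `model_kasan_enable()`, `model_checked_write()` and `patch_via_temp_mm()`
      are illustrative names, not kernel functions, and the address limit is an
      arbitrary assumption.

```c
#include <stdbool.h>

/* Assumed boundary between "user" and "kernel" addresses in this model. */
#define MODEL_TASK_SIZE 0x0010000000000000ULL

static int model_kasan_depth;  /* non-zero: reports are suppressed */
static int model_reports;      /* number of bad-access reports raised */

static void model_kasan_disable(void) { model_kasan_depth++; }
static void model_kasan_enable(void)  { model_kasan_depth--; }

/* Addresses below the limit have no shadow memory in this model. */
static bool model_has_shadow(unsigned long long addr)
{
    return addr >= MODEL_TASK_SIZE;
}

/* Stand-in for an instrumented write: report unless suppressed. */
static void model_checked_write(unsigned long long addr)
{
    if (!model_has_shadow(addr) && model_kasan_depth == 0)
        model_reports++;
}

/* Patching through a temporary mm: the destination lives in the user
 * address range, so reports are disabled around the write. */
static void patch_via_temp_mm(unsigned long long patch_addr)
{
    model_kasan_disable();
    model_checked_write(patch_addr);
    model_kasan_enable();
}
```

      In the model, a direct checked write to the address from the report above
      raises one report, while the same write wrapped by `patch_via_temp_mm()`
      raises none and leaves the depth counter balanced.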
      
      Reported-by: Erhard Furtner <erhard_f@mailbox.org>
      Close: https://lore.kernel.org/all/20250201151435.48400261@yea/
      
      Fixes: 465cabc9 ("powerpc/code-patching: introduce patch_instructions()")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
      Link: https://patch.msgid.link/1c05b2a1b02ad75b981cfc45927e0b4a90441046.1738577687.git.christophe.leroy@csgroup.eu
      
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b6b55e03
    • ASoC: SOF: ipc4-topology: Harden loops for looking up ALH copiers · 00d1e406
      Peter Ujfalusi authored and Frieder Schrempf committed
      
      [ Upstream commit 6fd60136d256b3b948333ebdb3835f41a95ab7ef ]
      
      Other, non-DAI copier widgets could have the same stream name (sname) as
      the ALH copier. In that case copier->data is NULL and no alh_data is
      attached, which could lead to a NULL pointer dereference.
      We could check for this NULL pointer in sof_ipc4_prepare_copier_module()
      and avoid the crash, but a similar loop in sof_ipc4_widget_setup_comp_dai()
      would miscalculate the ALH device count, causing broken audio.
      
      The correct fix is to harden the matching logic by making sure that
      1. the widget is a DAI widget, so dai = w->private is valid
      2. the dai (and thus the copier) is an ALH copier
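      
      The two checks can be sketched as a stand-alone matching loop. The types
      and names below (`struct widget`, `struct dai`, `is_dai_widget()`,
      `count_alh_copiers()`, the enum values) are hypothetical simplifications
      for illustration and do not mirror the SOF data structures.

```c
#include <stddef.h>
#include <string.h>

enum widget_kind { WIDGET_DAI_IN, WIDGET_DAI_OUT, WIDGET_OTHER };
enum dai_kind { DAI_ALH, DAI_SSP };

struct dai {
    enum dai_kind type;
};

struct widget {
    enum widget_kind kind;
    const char *sname;
    void *private_data;  /* a struct dai only when the widget is a DAI */
};

static int is_dai_widget(const struct widget *w)
{
    return w->kind == WIDGET_DAI_IN || w->kind == WIDGET_DAI_OUT;
}

/* Count ALH copiers sharing a stream name, skipping every widget that
 * merely has the same sname but is not an ALH DAI. */
static int count_alh_copiers(const struct widget *widgets, size_t n,
                             const char *sname)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        const struct widget *w = &widgets[i];
        const struct dai *dai;

        if (strcmp(w->sname, sname) != 0)
            continue;

        /* Check 1: only DAI widgets carry a valid dai pointer. */
        if (!is_dai_widget(w))
            continue;

        /* Check 2: the DAI must be of the ALH type. */
        dai = w->private_data;
        if (!dai || dai->type != DAI_ALH)
            continue;

        count++;
    }
    return count;
}
```

      With a non-DAI widget and an SSP DAI sharing the stream name of one ALH
      DAI, the loop counts exactly one device and never dereferences the NULL
      private pointer of the non-DAI widget.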
      
      Fixes: a150345a ("ASoC: SOF: ipc4-topology: add SoundWire/ALH aggregation support")
      Reported-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
      Link: https://github.com/thesofproject/sof/pull/9652
      
      Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
      Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
      Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
      Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
      Link: https://patch.msgid.link/20250206084642.14988-1-peter.ujfalusi@linux.intel.com
      
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      00d1e406
    • ASoC: rockchip: i2s-tdm: fix shift config for SND_SOC_DAIFMT_DSP_[AB] · b4095d1e
      John Keeping authored and Frieder Schrempf committed
      
      [ Upstream commit 6b24e67b4056ba83b1e95e005b7e50fdb1cc6cf4 ]
      
      Commit 2f45a4e2 ("ASoC: rockchip: i2s_tdm: Fixup config for
      SND_SOC_DAIFMT_DSP_A/B") applied a partial change to fix the
      configuration for DSP A and DSP B formats.
      
      The shift control also needs updating to set the correct offset for
      frame data compared to LRCK.  Set the correct values.
      
      Fixes: 081068fd ("ASoC: rockchip: add support for i2s-tdm controller")
      Signed-off-by: John Keeping <jkeeping@inmusicbrands.com>
      Link: https://patch.msgid.link/20250204161311.2117240-1-jkeeping@inmusicbrands.com
      
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b4095d1e
    • sched_ext: Fix migration disabled handling in targeted dispatches · d4688ad3
      Tejun Heo authored and Frieder Schrempf committed
      
      [ Upstream commit 32966821574cd2917bd60f2554f435fe527f4702 ]
      
      A dispatch operation that can target a specific local DSQ -
      scx_bpf_dsq_move_to_local() or scx_bpf_dsq_move() - checks whether the task
      can be migrated to the target CPU using task_can_run_on_remote_rq(). If the
      task can't be migrated to the targeted CPU, it is bounced through a global
      DSQ.
      
      task_can_run_on_remote_rq() assumes that the task is on a CPU that's
      different from the targeted CPU but the callers don't uphold the
      assumption and may call the function when the task is already on the target
      CPU. When such a task has migration disabled, task_can_run_on_remote_rq()
      ends up incorrectly returning %false, unnecessarily bouncing the task to a
      global DSQ.
      
      Fix it by updating the callers to only call task_can_run_on_remote_rq() when
      the task is on a different CPU than the target CPU. As this is a bit subtle,
      for clarity and documentation:
      
      - Make task_can_run_on_remote_rq() trigger SCHED_WARN_ON() if the task is on
        the same CPU as the target CPU.
      
      - is_migration_disabled() test in task_can_run_on_remote_rq() cannot trigger
        if the task is on a different CPU than the target CPU as the preceding
        task_allowed_on_cpu() test should fail beforehand. Convert the test into
        SCHED_WARN_ON().
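      
      The caller-side rule can be sketched with a simplified user-space model.
      The names here (`struct task`, `can_run_on_remote()`,
      `dispatch_to_local()`) and the bitmask representation of allowed CPUs are
      hypothetical and only illustrate the control flow described above, not
      the sched_ext implementation.

```c
#include <stdbool.h>

struct task {
    int cpu;                     /* CPU the task is currently on */
    bool migration_disabled;
    unsigned long allowed_mask;  /* bit n set: task may run on CPU n */
};

/* Only meaningful when the task is NOT already on target_cpu. */
static bool can_run_on_remote(const struct task *p, int target_cpu)
{
    if (!(p->allowed_mask & (1UL << target_cpu)))
        return false;
    if (p->migration_disabled)
        return false;
    return true;
}

/* Fixed caller: skip the remote check when no migration is needed.
 * Returns false when the task must be bounced through a global queue. */
static bool dispatch_to_local(const struct task *p, int target_cpu)
{
    if (p->cpu == target_cpu)
        return true;
    return can_run_on_remote(p, target_cpu);
}
```

      For a migration-disabled task already on CPU 2, calling
      `can_run_on_remote()` directly with target 2 returns false in this model,
      which is the spurious bounce; `dispatch_to_local()` avoids it by checking
      the current CPU first.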
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Fixes: 4c30f5ce ("sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()")
      Fixes: 0366017e ("sched_ext: Use task_can_run_on_remote_rq() test in dispatch_to_local_dsq()")
      Cc: stable@vger.kernel.org # v6.12+
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d4688ad3
    • sched_ext: Factor out move_task_between_dsqs() from scx_dispatch_from_dsq() · 237d1031
      Tejun Heo authored and Frieder Schrempf committed
      
      [ Upstream commit 8427acb6b5861d205abca7afa656a897bbae34b7 ]
      
      Pure reorganization. No functional changes.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Stable-dep-of: 32966821574c ("sched_ext: Fix migration disabled handling in targeted dispatches")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      237d1031