linux

mirror of https://github.com/torvalds/linux.git synced 2025-08-15 14:11:42 +02:00

Author	SHA1	Message	Date
Arnd Bergmann	0c952efa0d	arm64: tegra: Device tree changes for v6.17-rc1 This contains an extra patch that drops numa-node-id properties that were added to the Tegra264 DT files by mistake. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAmiLmBAACgkQ3SOs138+ s6GQKhAAljOUD4Tx1vrP+tKofoQskPVnKVyzo5wUImAhFxkk0eX7CFQb4YFS0cBo IgjOUKZIcAP2WJT35HcipmXVMsBEHvw63I9/Ktw1F5sgk17jxmrbMTBuWLLt+bbB BGZPy4+o64hT/OJnpdEMI+IUhhodcpvKhKoxMDagLujwOGzTzEnwzaWnv4V2V9qr dcV1//PFvCW8pvn175Pplz6U2TeASJ7Nff575we/p9iE/h/oXXSCXeG82ma9udwg 9+SSuY238tXhhr6uBPItcKD/TqwByFj5bD7DdgTMMP7kQ7flygxLP9ZVTcPOW9Ww t7ag4+9w0srXIKxd4CAarpiZj6TYCEU9GvDF53GH+ZoY1miFvdNAh04wn5sCFdrN ASZ2THIaU8hjDjX6hE+bbUuCuIYcXX6G0CH0z/oj7QJAWyrmN4gIjz7f65160QI/ IDZDkcelLNd57Vv0Z4mdh3SUeKYGzjtIw3oaZjHZIzbFGnyIE/TiXolQXbxYek8k Jz59bjhQFIunSJLFSHPRqm58OsMqpvcdbbHfwwf1+HsTIamJtpSK1xduwe2uGS1c lyQ/2VpYKkIwoWFw/QbbQM6HTkJ7vKP2JPqh2IWtPXadF28HbYtbq9+Yra79v7/E vMIEG2K32uzwWjUp2mgV20KfiauDSLcga6V4zRJfqWvCSVs5rN0= =2IBm -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmiWYyQACgkQmmx57+YA GNmTwBAAgVbyKlBrq09jS10z4GW5GNleDstxZdsPiTQpXnKS6nMEw8FOYyi3ra9y FRwY4yROd1yz5oQEFh4l0wWpuj/jAD5MaCQSL/xEKrC14INm4VBfp/SbuKZa5bf0 45I1fx3ME1VUgpOWzx9DLoW7Nj3BWxHZbAzdAdy8kaiMzDonUMCC+cL8HnWDdQ5Y 7HWrVjlQeWDAxtOLM2jFjPB9kGMVcWgCycjRFCP+00E2Uw+JXlNj8e0qjqLxR/my zFskqkl8cnddK3/4HEGkyTI3BrEH9YUyxLHRpvMK0/xtj5ytVDgANeRaj86B7AUJ owMlA/XK2hlpYhGlinsdgZgj+JWVveFQh7dAuOCqTxq/Bgr5EgXW3MOCFy7W9N40 XD8n/yVXtYuq1GxBv2in6uPo1s2s+QGdC7qRDJbTl3wMgZUs1LsatZgpk0QMZ4fk UDTj2WiVp3b4/zkqvAybY9iUIq7M4av5OJejciyjVyCBFGFIRKriv/QvfkRDLRcl RR9Xo3AgLHEGSkNeMHJhHNmALVwwXm2ZDSsSgwm1C0VcKgwQ8XBDD3yVTCJ5GBI8 Lkn7jce66+IzbAIc0J1b6b3Crkmm49PH4SfIbuupzGq92z4CnnEc0mXYothmU64A iZVI6eUfUNA3d+k1aV8mI3Pj9j+K0Ffx1lVe49l24LrTye5Rz0g= =OFV+ -----END PGP SIGNATURE----- Merge tag 'tegra-for-6.17-arm64-dt-v3' of https://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/fixes arm64: tegra: Device tree changes for v6.17-rc1 This contains an extra patch that drops numa-node-id properties that were added to the Tegra264 DT files by mistake. * tag 'tegra-for-6.17-arm64-dt-v3' of https://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux: arm64: tegra: Remove numa-node-id properties arm64: tegra: Add p3971-0089+p3834-0008 support arm64: tegra: Add memory controller on Tegra264 arm64: tegra: Add Tegra264 support dt-bindings: memory: tegra: Add Tegra264 support Link: https://lore.kernel.org/r/20250731162920.3329820-1-thierry.reding@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2025-08-08 22:50:44 +02:00
Haiyang Zhang	33caa208db	hv_netvsc: Fix panic during namespace deletion with VF The existing code move the VF NIC to new namespace when NETDEV_REGISTER is received on netvsc NIC. During deletion of the namespace, default_device_exit_batch() >> default_device_exit_net() is called. When netvsc NIC is moved back and registered to the default namespace, it automatically brings VF NIC back to the default namespace. This will cause the default_device_exit_net() >> for_each_netdev_safe loop unable to detect the list end, and hit NULL ptr: [ 231.449420] mana 7870:00:00.0 enP30832s1: Moved VF to namespace with: eth0 [ 231.449656] BUG: kernel NULL pointer dereference, address: 0000000000000010 [ 231.450246] #PF: supervisor read access in kernel mode [ 231.450579] #PF: error_code(0x0000) - not-present page [ 231.450916] PGD 17b8a8067 P4D 0 [ 231.451163] Oops: Oops: 0000 [#1] SMP NOPTI [ 231.451450] CPU: 82 UID: 0 PID: 1394 Comm: kworker/u768:1 Not tainted 6.16.0-rc4+ #3 VOLUNTARY [ 231.452042] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/21/2024 [ 231.452692] Workqueue: netns cleanup_net [ 231.452947] RIP: 0010:default_device_exit_batch+0x16c/0x3f0 [ 231.453326] Code: c0 0c f5 b3 e8 d5 db fe ff 48 85 c0 74 15 48 c7 c2 f8 fd ca b2 be 10 00 00 00 48 8d 7d c0 e8 7b 77 25 00 49 8b 86 28 01 00 00 <48> 8b 50 10 4c 8b 2a 4c 8d 62 f0 49 83 ed 10 4c 39 e0 0f 84 d6 00 [ 231.454294] RSP: 0018:ff75fc7c9bf9fd00 EFLAGS: 00010246 [ 231.454610] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 61c8864680b583eb [ 231.455094] RDX: ff1fa9f71462d800 RSI: ff75fc7c9bf9fd38 RDI: 0000000030766564 [ 231.455686] RBP: ff75fc7c9bf9fd78 R08: 0000000000000000 R09: 0000000000000000 [ 231.456126] R10: 0000000000000001 R11: 0000000000000004 R12: ff1fa9f70088e340 [ 231.456621] R13: ff1fa9f70088e340 R14: ffffffffb3f50c20 R15: ff1fa9f7103e6340 [ 231.457161] FS: 0000000000000000(0000) GS:ff1faa6783a08000(0000) knlGS:0000000000000000 [ 231.457707] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 231.458031] CR2: 0000000000000010 CR3: 0000000179ab2006 CR4: 0000000000b73ef0 [ 231.458434] Call Trace: [ 231.458600] <TASK> [ 231.458777] ops_undo_list+0x100/0x220 [ 231.459015] cleanup_net+0x1b8/0x300 [ 231.459285] process_one_work+0x184/0x340 To fix it, move the ns change to a workqueue, and take rtnl_lock to avoid changing the netdev list when default_device_exit_net() is using it. Cc: stable@vger.kernel.org Fixes: `4c262801ea` ("hv_netvsc: Fix VF namespace also in synthetic NIC NETDEV_REGISTER event") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://patch.msgid.link/1754511711-11188-1-git-send-email-haiyangz@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 13:24:16 -07:00
Stanislav Fomichev	c642379608	hamradio: ignore ops-locked netdevs Syzkaller managed to trigger lock dependency in xsk_notify via register_netdevice. As discussed in [0], using register_netdevice in the notifiers is problematic so skip adding hamradio for ops-locked devices. xsk_notifier+0x89/0x230 net/xdp/xsk.c:1664 notifier_call_chain+0x1b6/0x3e0 kernel/notifier.c:85 call_netdevice_notifiers_extack net/core/dev.c:2267 [inline] call_netdevice_notifiers net/core/dev.c:2281 [inline] unregister_netdevice_many_notify+0x14d7/0x1ff0 net/core/dev.c:12156 unregister_netdevice_many net/core/dev.c:12219 [inline] unregister_netdevice_queue+0x33c/0x380 net/core/dev.c:12063 register_netdevice+0x1689/0x1ae0 net/core/dev.c:11241 bpq_new_device drivers/net/hamradio/bpqether.c:481 [inline] bpq_device_event+0x491/0x600 drivers/net/hamradio/bpqether.c:523 notifier_call_chain+0x1b6/0x3e0 kernel/notifier.c:85 call_netdevice_notifiers_extack net/core/dev.c:2267 [inline] call_netdevice_notifiers net/core/dev.c:2281 [inline] __dev_notify_flags+0x18d/0x2e0 net/core/dev.c:-1 netif_change_flags+0xe8/0x1a0 net/core/dev.c:9608 dev_change_flags+0x130/0x260 net/core/dev_api.c:68 devinet_ioctl+0xbb4/0x1b50 net/ipv4/devinet.c:1200 inet_ioctl+0x3c0/0x4c0 net/ipv4/af_inet.c:1001 0: https://lore.kernel.org/netdev/20250625140357.6203d0af@kernel.org/ Fixes: `4c975fd700` ("net: hold instance lock during NETDEV_REGISTER/UP") Suggested-by: Jakub Kicinski <kuba@kernel.org> Reported-by: syzbot+e6300f66a999a6612477@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=e6300f66a999a6612477 Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250806213726.1383379-2-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 13:22:28 -07:00
Stanislav Fomichev	53898ebabe	net: lapbether: ignore ops-locked netdevs Syzkaller managed to trigger lock dependency in xsk_notify via register_netdevice. As discussed in [0], using register_netdevice in the notifiers is problematic so skip adding lapbeth for ops-locked devices. xsk_notifier+0xa4/0x280 net/xdp/xsk.c:1645 notifier_call_chain+0xbc/0x410 kernel/notifier.c:85 call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:2230 call_netdevice_notifiers_extack net/core/dev.c:2268 [inline] call_netdevice_notifiers net/core/dev.c:2282 [inline] unregister_netdevice_many_notify+0xf9d/0x2700 net/core/dev.c:12077 unregister_netdevice_many net/core/dev.c:12140 [inline] unregister_netdevice_queue+0x305/0x3f0 net/core/dev.c:11984 register_netdevice+0x18f1/0x2270 net/core/dev.c:11149 lapbeth_new_device drivers/net/wan/lapbether.c:420 [inline] lapbeth_device_event+0x5b1/0xbe0 drivers/net/wan/lapbether.c:462 notifier_call_chain+0xbc/0x410 kernel/notifier.c:85 call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:2230 call_netdevice_notifiers_extack net/core/dev.c:2268 [inline] call_netdevice_notifiers net/core/dev.c:2282 [inline] __dev_notify_flags+0x12c/0x2e0 net/core/dev.c:9497 netif_change_flags+0x108/0x160 net/core/dev.c:9526 dev_change_flags+0xba/0x250 net/core/dev_api.c:68 devinet_ioctl+0x11d5/0x1f50 net/ipv4/devinet.c:1200 inet_ioctl+0x3a7/0x3f0 net/ipv4/af_inet.c:1001 0: https://lore.kernel.org/netdev/20250625140357.6203d0af@kernel.org/ Fixes: `4c975fd700` ("net: hold instance lock during NETDEV_REGISTER/UP") Suggested-by: Jakub Kicinski <kuba@kernel.org> Reported-by: syzbot+e67ea9c235b13b4f0020@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=e67ea9c235b13b4f0020 Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250806213726.1383379-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 13:22:28 -07:00
Tristram Ha	829f45f9d9	net: dsa: microchip: Fix KSZ8863 reset problem ksz8873_valid_regs[] was added for register access for KSZ8863/KSZ8873 switches, but the reset register is not in the list so ksz8_reset_switch() does not take any effect. Replace regmap_update_bits() using ksz_regmap_8 with ksz_rmw8() so that an error message will be given if the register is not defined. A side effect of not resetting the switch is the static MAC table is not cleared. Further additions to the table will show write error as there are only 8 entries in the table. Fixes: `d0dec33330` ("net: dsa: microchip: Add register access control for KSZ8873 chip") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20250807005453.8306-1-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 13:18:18 -07:00
Xin Long	fd60d8a086	sctp: linearize cloned gso packets in sctp_rcv A cloned head skb still shares these frag skbs in fraglist with the original head skb. It's not safe to access these frag skbs. syzbot reported two use-of-uninitialized-memory bugs caused by this: BUG: KMSAN: uninit-value in sctp_inq_pop+0x15b7/0x1920 net/sctp/inqueue.c:211 sctp_inq_pop+0x15b7/0x1920 net/sctp/inqueue.c:211 sctp_assoc_bh_rcv+0x1a7/0xc50 net/sctp/associola.c:998 sctp_inq_push+0x2ef/0x380 net/sctp/inqueue.c:88 sctp_backlog_rcv+0x397/0xdb0 net/sctp/input.c:331 sk_backlog_rcv+0x13b/0x420 include/net/sock.h:1122 __release_sock+0x1da/0x330 net/core/sock.c:3106 release_sock+0x6b/0x250 net/core/sock.c:3660 sctp_wait_for_connect+0x487/0x820 net/sctp/socket.c:9360 sctp_sendmsg_to_asoc+0x1ec1/0x1f00 net/sctp/socket.c:1885 sctp_sendmsg+0x32b9/0x4a80 net/sctp/socket.c:2031 inet_sendmsg+0x25a/0x280 net/ipv4/af_inet.c:851 sock_sendmsg_nosec net/socket.c:718 [inline] and BUG: KMSAN: uninit-value in sctp_assoc_bh_rcv+0x34e/0xbc0 net/sctp/associola.c:987 sctp_assoc_bh_rcv+0x34e/0xbc0 net/sctp/associola.c:987 sctp_inq_push+0x2a3/0x350 net/sctp/inqueue.c:88 sctp_backlog_rcv+0x3c7/0xda0 net/sctp/input.c:331 sk_backlog_rcv+0x142/0x420 include/net/sock.h:1148 __release_sock+0x1d3/0x330 net/core/sock.c:3213 release_sock+0x6b/0x270 net/core/sock.c:3767 sctp_wait_for_connect+0x458/0x820 net/sctp/socket.c:9367 sctp_sendmsg_to_asoc+0x223a/0x2260 net/sctp/socket.c:1886 sctp_sendmsg+0x3910/0x49f0 net/sctp/socket.c:2032 inet_sendmsg+0x269/0x2a0 net/ipv4/af_inet.c:851 sock_sendmsg_nosec net/socket.c:712 [inline] This patch fixes it by linearizing cloned gso packets in sctp_rcv(). Fixes: `90017accff` ("sctp: Add GSO support") Reported-by: syzbot+773e51afe420baaf0e2b@syzkaller.appspotmail.com Reported-by: syzbot+70a42f45e76bede082be@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://patch.msgid.link/dd7dc337b99876d4132d0961f776913719f7d225.1754595611.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 13:08:06 -07:00
Zhang Rui	44207567fa	tools/power turbostat: Fix bogus SysWatt for forked program Similar to delta_cpu(), delta_platform() is called in turbostat main loop. This ensures accurate SysWatt readings in periodic monitoring mode $ sudo turbostat -S -q --show power -i 1 CoreTmp PkgTmp PkgWatt CorWatt GFXWatt RAMWatt PKG_% RAM_% SysWatt 60 61 6.21 1.13 0.16 0.00 0.00 0.00 13.07 58 61 6.00 1.07 0.18 0.00 0.00 0.00 12.75 58 61 5.74 1.05 0.17 0.00 0.00 0.00 12.22 58 60 6.27 1.11 0.24 0.00 0.00 0.00 13.55 However, delta_platform() is missing for forked program and causes bogus SysWatt reporting, $ sudo turbostat -S -q --show power sleep 1 1.004736 sec CoreTmp PkgTmp PkgWatt CorWatt GFXWatt RAMWatt PKG_% RAM_% SysWatt 57 58 6.05 1.02 0.16 0.00 0.00 0.00 0.03 Add missing delta_platform() for forked program. Fixes: `e5f687b89b` ("tools/power turbostat: Add RAPL psys as a built-in counter") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-08 16:05:54 -04:00
Calvin Owens	d34fe509f5	tools/power turbostat: Handle cap_get_proc() ENOSYS Kernels configured with CONFIG_MULTIUSER=n have no cap_get_proc(). Check for ENOSYS to recognize this case, and continue on to attempt to access the requested MSRs (such as temperature). Signed-off-by: Calvin Owens <calvin@wbinvd.org> Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-08 16:05:53 -04:00
Calvin Owens	6ea0ec1b95	tools/power turbostat: Fix build with musl turbostat.c: In function 'parse_int_file': turbostat.c:5567:19: error: 'PATH_MAX' undeclared (first use in this function) 5567 \| char path[PATH_MAX]; \| ^~~~~~~~ turbostat.c: In function 'probe_graphics': turbostat.c:6787:19: error: 'PATH_MAX' undeclared (first use in this function) 6787 \| char path[PATH_MAX]; \| ^~~~~~~~ Signed-off-by: Calvin Owens <calvin@wbinvd.org> Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-08 16:05:53 -04:00
Len Brown	d44c40e4e3	tools/power turbostat: verify arguments to params --show and --hide $ sudo turbostat --quiet --show junk turbostat: Counter 'junk' can not be added. Previously, invalid arguments to --show and --hide were silently ignored Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2025-08-08 16:05:27 -04:00
Budimir Markovic	aba0c94f61	vsock: Do not allow binding to VMADDR_PORT_ANY It is possible for a vsock to autobind to VMADDR_PORT_ANY. This can cause a use-after-free when a connection is made to the bound socket. The socket returned by accept() also has port VMADDR_PORT_ANY but is not on the list of unbound sockets. Binding it will result in an extra refcount decrement similar to the one fixed in `fcdd2242c0` (vsock: Keep the binding until socket destruction). Modify the check in __vsock_bind_connectible() to also prevent binding to VMADDR_PORT_ANY. Fixes: `d021c34405` ("VSOCK: Introduce VM Sockets") Reported-by: Budimir Markovic <markovicbudimir@gmail.com> Signed-off-by: Budimir Markovic <markovicbudimir@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20250807041811.678-1-markovicbudimir@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 12:55:00 -07:00
Alok Tiwari	5f1d1d14db	net: ti: icss-iep: Fix incorrect type for return value in extts_enable() The variable ret in icss_iep_extts_enable() was incorrectly declared as u32, while the function returns int and may return negative error codes. This will cause sign extension issues and incorrect error propagation. Update ret to be int to fix error handling. This change corrects the declaration to avoid potential type mismatch. Fixes: `c1e0230eea` ("net: ti: icss-iep: Add IEP driver") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250805142323.1949406-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 12:54:54 -07:00
Jakub Kicinski	64fdaa94bf	net: page_pool: allow enabling recycling late, fix false positive warning Page pool can have pages "directly" (locklessly) recycled to it, if the NAPI that owns the page pool is scheduled to run on the same CPU. To make this safe we check that the NAPI is disabled while we destroy the page pool. In most cases NAPI and page pool lifetimes are tied together so this happens naturally. The queue API expects the following order of calls: -> mem_alloc alloc new pp -> stop napi_disable -> start napi_enable -> mem_free free old pp Here we allocate the page pool in ->mem_alloc and free in ->mem_free. But the NAPIs are only stopped between ->stop and ->start. We created page_pool_disable_direct_recycling() to safely shut down the recycling in ->stop. This way the page_pool_destroy() call in ->mem_free doesn't have to worry about recycling any more. Unfortunately, the page_pool_disable_direct_recycling() is not enough to deal with failures which necessitate freeing the _new_ page pool. If we hit a failure in ->mem_alloc or ->stop the new page pool has to be freed while the NAPI is active (assuming driver attaches the page pool to an existing NAPI instance and doesn't reallocate NAPIs). Freeing the new page pool is technically safe because it hasn't been used for any packets, yet, so there can be no recycling. But the check in napi_assert_will_not_race() has no way of knowing that. We could check if page pool is empty but that'd make the check much less likely to trigger during development. Add page_pool_enable_direct_recycling(), pairing with page_pool_disable_direct_recycling(). It will allow us to create the new page pools in "disabled" state and only enable recycling when we know the reconfig operation will not fail. Coincidentally it will also let us re-enable the recycling for the old pool, if the reconfig failed: -> mem_alloc (new) -> stop (old) # disables direct recycling for old -> start (new) # fail!! -> start (old) # go back to old pp but direct recycling is lost :( -> mem_free (new) The new helper is idempotent to make the life easier for drivers, which can operate in HDS mode and support zero-copy Rx. The driver can call the helper twice whether there are two pools or it has multiple references to a single pool. Fixes: `40eca00ae6` ("bnxt_en: unlink page pool when stopping Rx queue") Tested-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250805003654.2944974-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 12:54:42 -07:00
MD Danish Anwar	06feac1540	net: ti: icssg-prueth: Fix emac link speed handling When link settings are changed emac->speed is populated by emac_adjust_link(). The link speed and other settings are then written into the DRAM. However if both ports are brought down after this and brought up again or if the operating mode is changed and a firmware reload is needed, the DRAM is cleared by icssg_config(). As a result the link settings are lost. Fix this by calling emac_adjust_link() after icssg_config(). This re populates the settings in the DRAM after a new firmware load. Fixes: `9facce84f4` ("net: ti: icssg-prueth: Fix firmware load sequence.") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Message-ID: <20250805173812.2183161-1-danishanwar@ti.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 12:06:12 -07:00
Jakub Kicinski	2182153cfd	Merge branch 'there-are-some-bugfix-for-hibmcge-ethernet-driver' Jijie Shao says: ==================== There are some bugfix for hibmcge ethernet driver This patch set is intended to fix several issues for hibmcge driver: 1. Holding the rtnl_lock in pci_error_handlers->reset_prepare() may lead to a deadlock issue. 2. A division by zero issue caused by debugfs when the port is down. 3. A probabilistic false positive issue with np_link_fail. v2: https://lore.kernel.org/20250805181446.3deaceb9@kernel.org v1: https://lore.kernel.org/20250731134749.4090041-1-shaojijie@huawei.com ==================== Link: https://patch.msgid.link/20250806102758.3632674-1-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 11:48:51 -07:00
Jijie Shao	62c50180ff	net: hibmcge: fix the np_link_fail error reporting issue Currently, after modifying device port mode, the np_link_ok state is immediately checked. At this point, the device may not yet ready, leading to the querying of an intermediate state. This patch will poll to check if np_link is ok after modifying device port mode, and only report np_link_fail upon timeout. Fixes: `e0306637e8` ("net: hibmcge: Add support for mac link exception handling feature") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 11:48:49 -07:00
Jijie Shao	7004b26f0b	net: hibmcge: fix the division by zero issue When the network port is down, the queue is released, and ring->len is 0. In debugfs, hbg_get_queue_used_num() will be called, which may lead to a division by zero issue. This patch adds a check, if ring->len is 0, hbg_get_queue_used_num() directly returns 0. Fixes: `40735e7543` ("net: hibmcge: Implement .ndo_start_xmit function") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 11:48:49 -07:00
Jijie Shao	c875503a9b	net: hibmcge: fix rtnl deadlock issue Currently, the hibmcge netdev acquires the rtnl_lock in pci_error_handlers.reset_prepare() and releases it in pci_error_handlers.reset_done(). However, in the PCI framework: pci_reset_bus - __pci_reset_slot - pci_slot_save_and_disable_locked - pci_dev_save_and_disable - err_handler->reset_prepare(dev); In pci_slot_save_and_disable_locked(): list_for_each_entry(dev, &slot->bus->devices, bus_list) { if (!dev->slot \|\| dev->slot!= slot) continue; pci_dev_save_and_disable(dev); if (dev->subordinate) pci_bus_save_and_disable_locked(dev->subordinate); } This will iterate through all devices under the current bus and execute err_handler->reset_prepare(), causing two devices of the hibmcge driver to sequentially request the rtnl_lock, leading to a deadlock. Since the driver now executes netif_device_detach() before the reset process, it will not concurrently with other netdev APIs, so there is no need to hold the rtnl_lock now. Therefore, this patch removes the rtnl_lock during the reset process and adjusts the position of HBG_NIC_STATE_RESETTING to ensure that multiple resets are not executed concurrently. Fixes: `3f5a61f6d5` ("net: hibmcge: Add reset supported in this module") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 11:48:49 -07:00
Jakub Kicinski	f6a2a31043	netfilter pull request 25-08-07 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEjF9xRqF1emXiQiqU1w0aZmrPKyEFAmiUi/oACgkQ1w0aZmrP KyGk+w/9F3+glmsB4XIaCMVdQVIe0SQizHCRnkG2tbMhdbUwnjfwot1VwWW8/vZU 69FffpqEjHzzhQtHsv5gRJdxwZluhsB73HvgmqevnugjQ8dfNQ8/KNcOaEwh/PPL y3iqffx0V8JLzbHRiPxGpIeFo2xeAofpMMrl+wZG+e2hnqUkV/DJFz0jcaK21eHa fhpAnBBS8HHCeS3NI2rdAvOTY91680RKGvN0RsqfUZKraRe8ObFxTIwIfR5enL1g 42tvGThvW3o1eOj09hFZi+hmvkvp3LKycrevJ1XyS2iWT36lfpbLK79Ydq6U6OZc MojxEpza/SHog+Cnj8uKbZNHfhbiRbt7Z4uFIiW2Wrv/oKNF6tBAtA3VdBz8ayEu ahnZ7ap6n/e2SfUNf03YONf+UjweHBAg5ifqOfGOdSbBgwb/b35guheLT377vG97 UDbHL6vkn/d/nV+t+gErDiF0YKfui97cB/iz2F3ux7+P0FtEVoVRTnp49beoWIO3 XNXDVF36jTm+a4owwZIBYXXUjCQcUuEjEBNyQGsc28ffUCG5nZ+HAYP+h9odgTZT e6ajqI1ymuKSHbh3oRBtEkCXx/MIImsWdfgKcHvQGPt7WV0oGKq6ZTh5TmsrfdTC dao1BY6lErKQbbZcuoZD5j915vvvDTjLi33PATH9rcY2gdyudgQ= =b5UN -----END PGP SIGNATURE----- Merge tag 'nf-25-08-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Reinstantiate Florian Westphal as a Netfilter maintainer. 2) Depend on both NETFILTER_XTABLES and NETFILTER_XTABLES_LEGACY, from Arnd Bergmann. 3) Use id to annotate last conntrack/expectation visited to resume netlink dump, patches from Florian Westphal. 4) Fix bogus element in nft_pipapo avx2 lookup, introduced in the last nf-next batch of updates, also from Florian. 5) Return 0 instead of recycling ret variable in nf_conntrack_log_invalid_sysctl(), introduced in the last nf-next batch of updates, from Dan Carpenter. 6) Fix WARN_ON_ONCE triggered by syzbot with larger cgroup level in nft_socket. * tag 'nf-25-08-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_socket: remove WARN_ON_ONCE with huge level value netfilter: conntrack: clean up returns in nf_conntrack_log_invalid_sysctl() netfilter: nft_set_pipapo: don't return bogus extension pointer netfilter: ctnetlink: remove refcounting in expectation dumpers netfilter: ctnetlink: fix refcount leak on table dump netfilter: add back NETFILTER_XTABLES dependencies MAINTAINERS: resurrect my netfilter maintainer entry ==================== Link: https://patch.msgid.link/20250807112948.1400523-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-08 11:45:14 -07:00
Jens Axboe	33503c083f	io_uring/memmap: cast nr_pages to size_t before shifting If the allocated size exceeds UINT_MAX, then it's necessary to cast the mr->nr_pages value to size_t to prevent it from overflowing. In practice this isn't much of a concern as the required memory size will have been validated upfront, and accounted to the user. And > 4GB sizes will be necessary to make the lack of a cast a problem, which greatly exceeds normal user locked_vm settings that are generally in the kb to mb range. However, if root is used, then accounting isn't done, and then it's possible to hit this issue. Link: https://lore.kernel.org/all/6895b298.050a0220.7f033.0059.GAE@google.com/ Cc: stable@vger.kernel.org Reported-by: syzbot+23727438116feb13df15@syzkaller.appspotmail.com Fixes: `087f997870` ("io_uring/memmap: implement mmap for regions") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-08-08 06:35:14 -06:00
Steffen Klassert	d8369183a0	Merge branch 'xfrm: some fixes for GSO with SW crypto' Sabrina Dubroca says: ==================== This series fixes a few issues with GSO. Some recent patches made the incorrect assumption that GSO is only used by offload. The first two patches in this series restore the old behavior. The final patch is in the UDP GSO code, but fixes an issue with IPsec that is currently masked by the lack of GSO for SW crypto. With GSO, VXLAN over IPsec doesn't get checksummed. ==================== Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2025-08-08 10:44:23 +02:00
Adam Young	5378bdf6a6	mailbox/pcc: support mailbox management of the shared buffer Define a new, optional, callback that allows the driver to specify how the return data buffer is allocated. If that callback is set, mailbox/pcc.c is now responsible for reading from and writing to the PCC shared buffer. This also allows for proper checks of the Commnand complete flag between the PCC sender and receiver. For Type 4 channels, initialize the command complete flag prior to accepting messages. Since the mailbox does not know what memory allocation scheme to use for response messages, the client now has an optional callback that allows it to allocate the buffer for a response message. When an outbound message is written to the buffer, the mailbox checks for the flag indicating the client wants an tx complete notification via IRQ. Upon receipt of the interrupt It will pair it with the outgoing message. The expected use is to free the kernel memory buffer for the previous outgoing message. Signed-off-by: Adam Young <admiyo@os.amperecomputing.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>	2025-08-07 23:49:56 -05:00
Linus Torvalds	3781648824	Previous releases - regressions: - netlink: avoid infinite retry looping in netlink_unicast() Previous releases - always broken: - packet: fix a race in packet_set_ring() and packet_notifier() - ipv6: reject malicious packets in ipv6_gso_segment() - sched: mqprio: fix stack out-of-bounds write in tc entry parsing - net: drop UFO packets (injected via virtio) in udp_rcv_segment() - eth: mlx5: correctly set gso_segs when LRO is used, avoid false positive checksum validation errors - netpoll: prevent hanging NAPI when netcons gets enabled - phy: mscc: fix parsing of unicast frames for PTP timestamping - number of device tree / OF reference leak fixes Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmiUvMcACgkQMUZtbf5S IrucuA//bQGZdQkpRo/2zWDFS4wN7hV8PkR+kI0F4rUyNQjFDyUOrccYnhHl8VRn DfnmVC9oiWxwuW2QgrgH1KKSw9dxhDPVmhLezAHaZv9XAPPgPO/Yb2Dr3oKHF+yK +/QW57FN8rNg8mHUzMS/26Y+rH6OubTaf3MKcsT2uZhuIXKHScOUedmr6EeEkp/L c32MpWfcVC7M2jjvCH+HO4k4RawWaB8W93iMUrLMSVI1oE3Tsjyhl5Cv9TBixh7g 3KZW98qVqMXKBQr0QTF4LBR0IWgKOS4KMVJqrgQ3CZszbDWbmBsFfi8olr/AS0Nt vgk6hTd8vVHXa+sOvdFpDbocdjXBq6vf5bDd9p57bt3JyxFMfFUWbG1eDj8bUpMI YuKAhQg9mTr6R8DaLwqANY8zrSET2FiYbNkUsP9TD8++q2j34U5/a468cxymfjJ/ 90vSrBGUpwxYPhx2B+Slu1SZJ1g0NbUL34jrKBuSNloVoZDRuzq2zd310BigG/35 T5CTr5OH+USlFjL6KpgGHnygiMQB/h2WgEjWbFoZHnIb+exVKCgS6IiNBVV3b2OK Nv6iVyFKPdKJ8qxeaiYMWA16wG8buOytlBn/1hYKLBvyVKAAx9Ib4ao2TK9w/1Vs 50sX21h+Awj2a/17sXlSXA+RVdeBcAzHdsGWHr1ql31htn2Xl10= =nwy7 -----END PGP SIGNATURE----- Merge tag 'net-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: Previous releases - regressions: - netlink: avoid infinite retry looping in netlink_unicast() Previous releases - always broken: - packet: fix a race in packet_set_ring() and packet_notifier() - ipv6: reject malicious packets in ipv6_gso_segment() - sched: mqprio: fix stack out-of-bounds write in tc entry parsing - net: drop UFO packets (injected via virtio) in udp_rcv_segment() - eth: mlx5: correctly set gso_segs when LRO is used, avoid false positive checksum validation errors - netpoll: prevent hanging NAPI when netcons gets enabled - phy: mscc: fix parsing of unicast frames for PTP timestamping - a number of device tree / OF reference leak fixes" * tag 'net-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits) pptp: fix pptp_xmit() error path net: ti: icssg-prueth: Fix skb handling for XDP_PASS net: Update threaded state in napi config in netif_set_threaded selftests: netdevsim: Xfail nexthop test on slow machines eth: fbnic: Lock the tx_dropped update eth: fbnic: Fix tx_dropped reporting eth: fbnic: remove the debugging trick of super high page bias net: ftgmac100: fix potential NULL pointer access in ftgmac100_phy_disconnect dt-bindings: net: Replace bouncing Alexandru Tachici emails dpll: zl3073x: ZL3073X_I2C and ZL3073X_SPI should depend on NET net/sched: mqprio: fix stack out-of-bounds write in tc entry parsing Revert "net: mdio_bus: Use devm for getting reset GPIO" selftests: net: packetdrill: xfail all problems on slow machines net/packet: fix a race in packet_set_ring() and packet_notifier() benet: fix BUG when creating VFs net: airoha: npu: Add missing MODULE_FIRMWARE macros net: devmem: fix DMA direction on unmapping ipa: fix compile-testing with qcom-mdt=m eth: fbnic: unlink NAPIs from queues on error to open net: Add locking to protect skb->dev access in ip_output ...	2025-08-08 07:03:25 +03:00
Linus Torvalds	bec077162b	more s390 updates for 6.17 merge window - Support MMIO read/write tracing - Enable THP swapping and THP migration - Unmask SLCF bit ("stateless command filtering") introduced with CEX8 cards, so that user space applications like lszcrypt could evaluate and list this feature - Fix the value of high_memory variable, so it considers possible tailing offline memory blocks - Make vmem_pte_alloc() consistent and always allocate memory of PAGE_SIZE for page tables. This ensures a page table occupies the whole page, as the rest of the code assumes - Fix kernel image end address in the decompressor debug output - Fix a typo in debug_sprintf_format_fn() comment -----BEGIN PGP SIGNATURE----- iI0EABYKADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCaJSgORccYWdvcmRlZXZA bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8OccAQDfJpNc8OWiL2Vo6kaXCo8e/q7X p+k3Je7LQsaJjR2cPgD/Zhg8Uw2Z+HVdSF2uPioe0ftcwDFo8HJ+9hgyN848kQA= =YdPI -----END PGP SIGNATURE----- Merge tag 's390-6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Alexander Gordeev: - Support MMIO read/write tracing - Enable THP swapping and THP migration - Unmask SLCF bit ("stateless command filtering") introduced with CEX8 cards, so that user space applications like lszcrypt could evaluate and list this feature - Fix the value of high_memory variable, so it considers possible tailing offline memory blocks - Make vmem_pte_alloc() consistent and always allocate memory of PAGE_SIZE for page tables. This ensures a page table occupies the whole page, as the rest of the code assumes - Fix kernel image end address in the decompressor debug output - Fix a typo in debug_sprintf_format_fn() comment * tag 's390-6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/debug: Fix typo in debug_sprintf_format_fn() comment s390/boot: Fix startup debugging log s390/mm: Allocate page table with PAGE_SIZE granularity s390/mm: Enable THP_SWAP and THP_MIGRATION s390: Support CONFIG_TRACE_MMIO_ACCESS s390/mm: Set high_memory at the end of the identity mapping s390/ap: Unmask SLCF bit in card and queue ap functions sysfs	2025-08-08 06:56:55 +03:00
Linus Torvalds	b1e06c19ab	vhost: bugfix A single fix for a regression in vhost. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmiR15cPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpglAH/jo+KgoEiNwZhBtFri5dGv8oW3MYjrMyJniC z+FXA4/1WrJzgeH14Bx2ZBJyEOrnqxbEoft5lAMTQH7tIgwMNCrMyza9jpNEtcSx VvYA45Uy94HnWKFGDnJzGgfKqWqlAJltJfdZeoUO97iYBvbsRuTwnYAQBl8ec2F8 vit0jhs7y6GF1pi+etyoSXrzfgTnh39XdhKczRoarzTYQQl+ZvxTsUR87lOlcjXc HbGGwXOyQlETfCynlehx1c6bf9tXL6SQXzXOzI7cSqcMq3omi8Xhwp0L8WcqDA8h YcSb8GXeGDkcawQmtK6Wld5TxtRjIRSwcS+VmXdp7o61RPddt64= =QSMh -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull vhost fix from Michael Tsirkin: "A single fix for a regression in vhost" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: initialize vq->nheads properly	2025-08-08 06:54:23 +03:00
Linus Torvalds	ffe8ac927d	drm fixes for 6.17-rc1 i915: - DP LPFS fixes xe: - SRIOV: PF fixes and removal of need of module param - Fix driver unbind around Devcoredump - Mark xe driver as BROKEN if kernel page size is not 4kB amdgpu: - GC 9.5.0 fixes - SMU fix - DCE 6 DC fixes - mmhub client ID fixes - VRR fix - Backlight fix - UserQ fix - Legacy reset fix - Misc fixes amdkfd: - CRIU fix - Debugfs fix -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmiVQYoACgkQDHTzWXnE hr4OVg/+OT59WwFrqodc79YRnGkbMllYfhhsC+hgczRWS1WV6GCp5yYSZVqWjs5I H44mM/rUEywXGf5m6lNx9GCCLF+ASgTAosVY/+cSDDpdB+Chl5CSYMbwmmlaS86a yzOxbNz7LC+gxrD9ZSmFxaWOop789ISjRhiafAWeKcJeVIo/tuW7rX2wJsN+LdNo ioUnNR4UHPKhdk7SQhm6ho3izL02e3H72fK/jcUj2FMehqJRLW6ilNTSntJ+bwp4 WmtSHSHTfeFximemIiU1l11cilwpAkWqmDjR9ZBW/F6psBU3FX3eOP4MaKL1Io2X 0gKXcnrz8XSLlZ46rKwHLj3IB3cNuGd2jO/rTcHDTQWGhrjPgn5fZnorOsfMRlTU ySaEeVsgCupQyiB1hOYFMECse+tqzV++05Bu/BL3NpYn6GMikghPkianAKlwpwxS lVDeOTmC9WX/jD9FeW8a69nsGZJoT3oSp9Ro2xjqlo1bwm1FVmUlxKgXSRIdHofg oZ8PGGDQGM5hf69Y/iLNFSPC+Z22n8cDrA0in92Gek965/h2HHkttQVWEMsaVc/x gkxb6rWv6ZMeLMrRY4ZQFd8JaOFfEzrc7aWemknEmF0HIdyJJUu2zWiUwm2MgpI/ tshUMJdgoJaPoTinHFJeID8HrdzgFy9owHnKUne9nNNkPFjskV4= =rAxo -----END PGP SIGNATURE----- Merge tag 'drm-next-2025-08-08' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "This is the fixes that built up in the merge window, mostly amdgpu and xe with one i915 display fix, seems like things are pretty good for rc1. i915: - DP LPFS fixes xe: - SRIOV: PF fixes and removal of need of module param - Fix driver unbind around Devcoredump - Mark xe driver as BROKEN if kernel page size is not 4kB amdgpu: - GC 9.5.0 fixes - SMU fix - DCE 6 DC fixes - mmhub client ID fixes - VRR fix - Backlight fix - UserQ fix - Legacy reset fix - Misc fixes amdkfd: - CRIU fix - Debugfs fix" * tag 'drm-next-2025-08-08' of https://gitlab.freedesktop.org/drm/kernel: (28 commits) drm/amdgpu: add missing vram lost check for LEGACY RESET drm/amdgpu/discovery: fix fw based ip discovery drm/amdkfd: Destroy KFD debugfs after destroy KFD wq amdgpu/amdgpu_discovery: increase timeout limit for IFWI init drm/amdgpu: Update SDMA firmware version check for user queue support drm/amdgpu: Add NULL check for asic_funcs drm/amd/display: Revert "drm/amd/display: Fix AMDGPU_MAX_BL_LEVEL value" drm/amd/display: fix a Null pointer dereference vulnerability drm/amd/display: Add primary plane to commits for correct VRR handling drm/amdgpu: update mmhub 3.3 client id mappings drm/amdgpu: update mmhub 3.0.1 client id mappings drm/amdgpu: Retain job->vm in amdgpu_job_prepare_job drm/amd/display: Fix DCE 6.0 and 6.4 PLL programming. drm/amd/display: Don't overwrite dce60_clk_mgr drm/amdkfd: Fix checkpoint-restore on multi-xcc drm/amd: Restore cached manual clock settings during resume drm/amd: Restore cached power limit during resume drm/amdgpu: Update external revid for GC v9.5.0 drm/amdgpu: Update supported modes for GC v9.5.0 Mark xe driver as BROKEN if kernel page size is not 4kB ...	2025-08-08 06:48:14 +03:00
Linus Torvalds	2939a792c4	fbdev fixes for 6.17-rc1: - Revert a patch which broke VGA console. - Fix an out-of-bounds access bug which may happen during console resizing when a console is mapped to a frame buffer. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCaJRydwAKCRD3ErUQojoP Xx6JAP929Q1PbZAbYSc/99dkxmKmfQk7/g7daI5Rl9YHWvvALgD/WFM/kWWIcFFe T4Z8+v3fa3+vez8w4jH5k9/IY2YU4w8= =Zr8P -----END PGP SIGNATURE----- Merge tag 'fbdev-for-6.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev fixes for 6.17-rc1: - Revert a patch which broke VGA console - Fix an out-of-bounds access bug which may happen during console resizing when a console is mapped to a frame buffer * tag 'fbdev-for-6.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: Revert "vgacon: Add check for vc_origin address range in vgacon_scroll()" fbdev: Fix vmalloc out-of-bounds write in fast_imageblit	2025-08-08 06:43:20 +03:00
Linus Torvalds	83affacd18	LoongArch changes for v6.17 1, Complete KSave registers definition; 2, Support the mem=<size> kernel parameter; 3, Support BPF dynamic modification & trampoline; 4, Add MMC/SDIO controller nodes in dts; 5, Some bug fixes and other small changes. -----BEGIN PGP SIGNATURE----- iQJJBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmiR8aUWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImesNJD/jFKBJ05ryQgz2mKERx6THyAvGz bWf1Fx3EcfJ0J/3cd8yIhP316dy6L+FElUNPJixz9OuWk/3LBDNzQb9F/QbBkNtg Ho2jaE3q90OXiZ/hwNmK2aRYP5Pi2YcAmIWpz65opj8SRTYq5i0mp+El7LzvQ+JO Y2Ghp13S5cQC7boL0iJf0jhYcf0PW7kpzIJKbZhHYT3vWsGA+gRBufanAuBPUo07 zypaXcm54fni5n6ucOElTF8WZVoouFOeF7KW0tkL12ufkVyMnRfnu+tJgG9ohtjK EtrisF3scgC/pg9cTWC706C85FjNz3/+mi0AbQpH2NjJczmoeQwQfnAYxbeIwCJD C6wIY1V5IBVDCU5dTCqEOMvfaCjl0AzvZxn3NUh+OUyf3y9pENpnI1JHkyj4z0A4 qDsFI+4W+XpvF/EAeDGCU/TlYEmEqEDZ+Y/eQkLT4s4pGIWjvX3vMLyugyE0nTR8 y6BbyvJXmG5As/TQEGmvNaBnpDgCsxJWRtnuiUS9eAqhwGFpCdO5aZ3R7Pumke4l Meyn9TIUEsNZ20Fdvjo+Asa8FTtkV4OLkMPSXGb3ag8sNtnJSRl0PpXYZa7CZKa6 6I4kmO8130TKOqwdxAlEdU9JnznFEMxEkoNdOuQKEHnGPUQbCVf9Zz5gg+vM1vxN Vhlmdjm3bSFcKNAY =SVHt -----END PGP SIGNATURE----- Merge tag 'loongarch-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Complete KSave registers definition - Support the mem=<size> kernel parameter - Support BPF dynamic modification & trampoline - Add MMC/SDIO controller nodes in dts - Some bug fixes and other small changes * tag 'loongarch-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: vDSO: Remove -nostdlib complier flag LoongArch: dts: Add eMMC/SDIO controller support to Loongson-2K2000 LoongArch: dts: Add SDIO controller support to Loongson-2K1000 LoongArch: dts: Add SDIO controller support to Loongson-2K0500 LoongArch: BPF: Set bpf_jit_bypass_spec_v1/v4() LoongArch: BPF: Fix the tailcall hierarchy LoongArch: BPF: Fix jump offset calculation in tailcall LoongArch: BPF: Add struct ops support for trampoline LoongArch: BPF: Add basic bpf trampoline support LoongArch: BPF: Add dynamic code modification support LoongArch: BPF: Rename and refactor validate_code() LoongArch: Add larch_insn_gen_{beq,bne} helpers LoongArch: Don't use %pK through printk() in unwinder LoongArch: Avoid in-place string operation on FDT content LoongArch: Support mem=<size> kernel parameter LoongArch: Make relocate_new_kernel_size be a .quad value LoongArch: Complete KSave registers definition	2025-08-08 06:36:48 +03:00
Thorsten Blum	8e7d178d06	smb: server: Fix extension string in ksmbd_extract_shortname() In ksmbd_extract_shortname(), strscpy() is incorrectly called with the length of the source string (excluding the NUL terminator) rather than the size of the destination buffer. This results in "__" being copied to 'extension' rather than "___" (two underscores instead of three). Use the destination buffer size instead to ensure that the string "___" (three underscores) is copied correctly. Cc: stable@vger.kernel.org Fixes: `e2f34481b2` ("cifsd: add server-side procedures for SMB3") Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-08-07 18:23:12 -05:00
Namjae Jeon	e6bb919397	ksmbd: limit repeated connections from clients with the same IP Repeated connections from clients with the same IP address may exhaust the max connections and prevent other normal client connections. This patch limit repeated connections from clients with the same IP. Reported-by: tianshuo han <hantianshuo233@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-08-07 18:22:58 -05:00
Dave Airlie	64c6275194	amd-drm-fixes-6.17-2025-08-07: amdgpu: - GC 9.5.0 fixes - SMU fix - DCE 6 DC fixes - mmhub client ID fixes - VRR fix - Backlight fix - UserQ fix - Legacy reset fix - Misc fixes amdkfd: - CRIU fix - Debugfs fix -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCaJSm+gAKCRC93/aFa7yZ 2NSoAQDVoZmhTGtXnQQAEwI6dmoMN9IKH6IOyGgOVYvUFbrpGwEAk3ogg25LhIDy VSakxfy88aKUq974GLpeejBvAi0DvgU= =hnJZ -----END PGP SIGNATURE----- Merge tag 'amd-drm-fixes-6.17-2025-08-07' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-fixes-6.17-2025-08-07: amdgpu: - GC 9.5.0 fixes - SMU fix - DCE 6 DC fixes - mmhub client ID fixes - VRR fix - Backlight fix - UserQ fix - Legacy reset fix - Misc fixes amdkfd: - CRIU fix - Debugfs fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250807132030.1168068-1-alexander.deucher@amd.com	2025-08-08 08:17:13 +10:00
Dave Airlie	10acca927f	- SRIOV: PF fixes and removal of need of module param (Michal) - Fix driver unbind around Devcoredump (Bala) - Mark xe driver as BROKEN if kernel page size is not 4kB (Simon) -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmiTVrYACgkQ+mJfZA7r E8p+0Qf9F/eM9v6Nb7vavC5OR79KveZMhQNqr2acOYKTLexraa+euYYSED0T0kya gXEUL8nRA69JjSQDZXmGZj9FhcDrA0NtTWLbsqb5HqjyjkFvStp6QZ+xAmJPYfgf AHxIleG19cmXhSzzMzjB6dSQp2tiqZcvBJ6XQ0DaaBPwHjGTMABeG1Kz9Tpjqz8S 9ayOdTzmgBkRanzM1TDoPZCo/lJ5mIvg/pNQQg6cPlqtf1A+pVxkxCWUZN+xdtYb 7rFvdbZongT3KzHkVd0u6h253InvgCeAIJTdujleHu2fW0KkA0yrAH+CY0q4/SAz X+IV57Aq9Nz7+F+0CDh7GB13eNqSvw== =cSQi -----END PGP SIGNATURE----- Merge tag 'drm-xe-next-fixes-2025-08-06' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next - SRIOV: PF fixes and removal of need of module param (Michal) - Fix driver unbind around Devcoredump (Bala) - Mark xe driver as BROKEN if kernel page size is not 4kB (Simon) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aJNXnIAp2Cq-2pZj@intel.com	2025-08-08 05:50:11 +10:00
Stefan Metzmacher	dfe6f14aed	smb: client: only use a single wait_queue to monitor smbdirect connection status There's no need for separate conn_wait and disconn_wait queues. This will simplify the move to common code, the server code already a single wait_queue for this. Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-08-07 12:40:11 -05:00
Stefan Metzmacher	550a194c59	smb: client: don't call init_waitqueue_head(&info->conn_wait) twice in _smbd_get_connection It is already called long before we may hit this cleanup code path. Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-08-07 12:40:11 -05:00
Stefan Metzmacher	7613997457	smb: client: improve logging in smbd_conn_upcall() Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-08-07 12:40:11 -05:00
Stefan Metzmacher	03537826f7	smb: client: return an error if rdma_connect does not return within 5 seconds This matches the timeout for tcp connections. Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Fixes: `f198186aa9` ("CIFS: SMBD: Establish SMB Direct connection") Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-08-07 12:40:11 -05:00
Nam Cao	d5c647b08e	PCI: vmd: Fix wrong kfree() in vmd_msi_free() vmd_msi_alloc() allocates struct vmd_irq and stashes it into irq_data->chip_data associated with the VMD's interrupt domain. vmd_msi_free() extracts the pointer by calling irq_get_chip_data() and frees it. irq_get_chip_data() returns the chip_data associated with the top interrupt domain. This worked in the past because VMD's interrupt domain was the top domain. But `d7d8ab87e3` ("PCI: vmd: Switch to msi_create_parent_irq_domain()") changed the interrupt domain hierarchy so VMD's interrupt domain is not the top domain anymore. irq_get_chip_data() now returns the chip_data at the MSI devices' interrupt domains. It is therefore broken for vmd_msi_free() to kfree() this chip_data. Fix by extracting the chip_data associated with the VMD's interrupt domain. Fixes: `d7d8ab87e3` ("PCI: vmd: Switch to msi_create_parent_irq_domain()") Reported-by: Kenneth Crudup <kenny@panix.com> Closes: https://lore.kernel.org/linux-pci/dfa40e48-8840-4e61-9fda-25cdb3ad81c1@panix.com/ Reported-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Closes: https://lore.kernel.org/linux-pci/ed53280ed15d1140700b96cca2734bf327ee92539e5eb68e80f5bbbf0f01@linux.gnuweeb.org/ Tested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Tested-by: Kenneth Crudup <kenny@panix.com> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20250807081051.2253962-1-namcao@linutronix.de	2025-08-07 11:30:12 -05:00
Alexei Starovoitov	0e260fc798	Merge branch 'perf-s390-regression-move-uid-filtering-to-bpf-filters' Ilya Leoshkevich says: ==================== perf/s390: Regression: Move uid filtering to BPF filters v4: https://lore.kernel.org/bpf/20250806114227.14617-1-iii@linux.ibm.com/ v4 -> v5: Fix a typo in the commit message (Yonghong). v3: https://lore.kernel.org/bpf/20250805130346.1225535-1-iii@linux.ibm.com/ v3 -> v4: Rename the new field to dont_enable (Alexei, Eduard). Switch the Fixes: tag in patch 2 (Alexander, Thomas). Fix typos in the cover letter (Thomas). v2: https://lore.kernel.org/bpf/20250728144340.711196-1-tmricht@linux.ibm.com/ v2 -> v3: Use no_ioctl_enable in perf. v1: https://lore.kernel.org/bpf/20250725093405.3629253-1-tmricht@linux.ibm.com/ v1 -> v2: Introduce no_ioctl_enable (Jiri). Hi, This series fixes a regression caused by moving UID filtering to BPF. The regression affects all events that support auxiliary data, most notably, "cycles" events on s390, but also PT events on Intel. The symptom is missing events when UID filtering is enabled. Patch 1 introduces a new option for the bpf_program__attach_perf_event_opts() function. Patch 2 makes use of it in perf, and also contains a lot of technical details of why exactly the problem is occurring. Thanks to Thomas Richter for the investigation and the initial version of this fix, and to Jiri Olsa for suggestions. Best regards, Ilya ==================== Link: https://patch.msgid.link/20250806162417.19666-1-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-08-07 09:04:34 -07:00
Ilya Leoshkevich	5e2ac8e857	perf bpf-filter: Enable events manually On s390, and, in general, on all platforms where the respective event supports auxiliary data gathering, the command: # ./perf record -u 0 -aB --synth=no -- ./perf test -w thloop [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data ] # ./perf report --stats \| grep SAMPLE # does not generate samples in the perf.data file. On x86 the command: # sudo perf record -e intel_pt// -u 0 ls is broken too. Looking at the sequence of calls in 'perf record' reveals this behavior: 1. The event 'cycles' is created and enabled: record__open() +-> evlist__apply_filters() +-> perf_bpf_filter__prepare() +-> bpf_program.attach_perf_event() +-> bpf_program.attach_perf_event_opts() +-> __GI___ioctl(..., PERF_EVENT_IOC_ENABLE, ...) The event 'cycles' is enabled and active now. However the event's ring-buffer to store the samples generated by hardware is not allocated yet. 2. The event's fd is mmap()ed to create the ring buffer: record__open() +-> record__mmap() +-> record__mmap_evlist() +-> evlist__mmap_ex() +-> perf_evlist__mmap_ops() +-> mmap_per_cpu() +-> mmap_per_evsel() +-> mmap__mmap() +-> perf_mmap__mmap() +-> mmap() This allocates the ring buffer for the event 'cycles'. With mmap() the kernel creates the ring buffer: perf_mmap(): kernel function to create the event's ring \| buffer to save the sampled data. \| +-> ring_buffer_attach(): Allocates memory for ring buffer. \| The PMU has auxiliary data setup function. The \| has_aux(event) condition is true and the PMU's \| stop() is called to stop sampling. It is not \| restarted: \| \| if (has_aux(event)) \| perf_event_stop(event, 0); \| +-> cpumsf_pmu_stop(): Hardware sampling is stopped. No samples are generated and saved anymore. 3. After the event 'cycles' has been mapped, the event is enabled a second time in: __cmd_record() +-> evlist__enable() +-> __evlist__enable() +-> evsel__enable_cpu() +-> perf_evsel__enable_cpu() +-> perf_evsel__run_ioctl() +-> perf_evsel__ioctl() +-> __GI___ioctl(., PERF_EVENT_IOC_ENABLE, .) The second ioctl(fd, PERF_EVENT_IOC_ENABLE, 0); is just a NOP in this case. The first invocation in (1.) sets the event::state to PERF_EVENT_STATE_ACTIVE. The kernel functions perf_ioctl() +-> _perf_ioctl() +-> _perf_event_enable() +-> __perf_event_enable() return immediately because event::state is already set to PERF_EVENT_STATE_ACTIVE. This happens on s390, because the event 'cycles' offers the possibility to save auxilary data. The PMU callbacks setup_aux() and free_aux() are defined. Without both callback functions, cpumsf_pmu_stop() is not invoked and sampling continues. To remedy this, remove the first invocation of ioctl(..., PERF_EVENT_IOC_ENABLE, ...). in step (1.) Create the event in step (1.) and enable it in step (3.) after the ring buffer has been mapped. Output after: # ./perf record -aB --synth=no -u 0 -- ./perf test -w thloop 2 [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.876 MB perf.data ] # ./perf report --stats \| grep SAMPLE SAMPLE events: 16200 (99.5%) SAMPLE events: 16200 # The software event succeeded both before and after the patch: # ./perf record -e cpu-clock -aB --synth=no -u 0 -- \ ./perf test -w thloop 2 [ perf record: Woken up 7 times to write data ] [ perf record: Captured and wrote 2.870 MB perf.data ] # ./perf report --stats \| grep SAMPLE SAMPLE events: 53506 (99.8%) SAMPLE events: 53506 # Fixes: `b4c658d4d6` ("perf target: Remove uid from target") Suggested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Co-developed-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250806162417.19666-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-08-07 09:03:44 -07:00
Ilya Leoshkevich	9474e27a24	libbpf: Add the ability to suppress perf event enablement Automatically enabling a perf event after attaching a BPF prog to it is not always desirable. Add a new "dont_enable" field to struct bpf_perf_event_opts. While introducing "enable" instead would be nicer in that it would avoid a double negation in the implementation, it would make DECLARE_LIBBPF_OPTS() less efficient. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Suggested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Co-developed-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250806162417.19666-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-08-07 09:01:41 -07:00
Boris Burkov	7b63259618	btrfs: fix iteration bug in __qgroup_excl_accounting() __qgroup_excl_accounting() uses the qgroup iterator machinery to update the account of one qgroups usage for all its parent hierarchy, when we either add or remove a relation and have only exclusive usage. However, there is a small bug there: we loop with an extra iteration temporary qgroup called `cur` but never actually refer to that in the body of the loop. As a result, we redundantly account the same usage to the first qgroup in the list. This can be reproduced in the following way: mkfs.btrfs -f -O squota <dev> mount <dev> <mnt> btrfs subvol create <mnt>/sv dd if=/dev/zero of=<mnt>/sv/f bs=1M count=1 sync btrfs qgroup create 1/100 <mnt> btrfs qgroup create 2/200 <mnt> btrfs qgroup assign 1/100 2/200 <mnt> btrfs qgroup assign 0/256 1/100 <mnt> btrfs qgroup show <mnt> and the broken result is (note the 2MiB on 1/100 and 0Mib on 2/100): Qgroupid Referenced Exclusive Path -------- ---------- --------- ---- 0/5 16.00KiB 16.00KiB <toplevel> 0/256 1.02MiB 1.02MiB sv Qgroupid Referenced Exclusive Path -------- ---------- --------- ---- 0/5 16.00KiB 16.00KiB <toplevel> 0/256 1.02MiB 1.02MiB sv 1/100 2.03MiB 2.03MiB 2/100<1 member qgroup> 2/100 0.00B 0.00B <0 member qgroups> With this fix, which simply re-uses `qgroup` as the iteration variable, we see the expected result: Qgroupid Referenced Exclusive Path -------- ---------- --------- ---- 0/5 16.00KiB 16.00KiB <toplevel> 0/256 1.02MiB 1.02MiB sv Qgroupid Referenced Exclusive Path -------- ---------- --------- ---- 0/5 16.00KiB 16.00KiB <toplevel> 0/256 1.02MiB 1.02MiB sv 1/100 1.02MiB 1.02MiB 2/100<1 member qgroup> 2/100 1.02MiB 1.02MiB <0 member qgroups> The existing fstests did not exercise two layer inheritance so this bug was missed. I intend to add that testing there, as well. Fixes: `a0bdc04b07` ("btrfs: qgroup: use qgroup_iterator in __qgroup_excl_accounting()") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>	2025-08-07 17:07:16 +02:00
Naohiro Aota	3a931e9b39	btrfs: zoned: do not select metadata BG as finish target We call btrfs_zone_finish_one_bg() to zone finish one block group and make room to activate another block group. Currently, we can choose a metadata block group as a target. But, as we reserve an active metadata block group, we no longer want to select a metadata block group. So, skip it in the loop. CC: stable@vger.kernel.org # 6.6+ Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-08-07 17:07:16 +02:00
Qu Wenruo	4289b494ac	btrfs: do not allow relocation of partially dropped subvolumes [BUG] There is an internal report that balance triggered transaction abort, with the following call trace: item 85 key (594509824 169 0) itemoff 12599 itemsize 33 extent refs 1 gen 197740 flags 2 ref#0: tree block backref root 7 item 86 key (594558976 169 0) itemoff 12566 itemsize 33 extent refs 1 gen 197522 flags 2 ref#0: tree block backref root 7 ... BTRFS error (device loop0): extent item not found for insert, bytenr 594526208 num_bytes 16384 parent 449921024 root_objectid 934 owner 1 offset 0 BTRFS error (device loop0): failed to run delayed ref for logical 594526208 num_bytes 16384 type 182 action 1 ref_mod 1: -117 ------------[ cut here ]------------ BTRFS: Transaction aborted (error -117) WARNING: CPU: 1 PID: 6963 at ../fs/btrfs/extent-tree.c:2168 btrfs_run_delayed_refs+0xfa/0x110 [btrfs] And btrfs check doesn't report anything wrong related to the extent tree. [CAUSE] The cause is a little complex, firstly the extent tree indeed doesn't have the backref for 594526208. The extent tree only have the following two backrefs around that bytenr on-disk: item 65 key (594509824 METADATA_ITEM 0) itemoff 13880 itemsize 33 refs 1 gen 197740 flags TREE_BLOCK tree block skinny level 0 (176 0x7) tree block backref root CSUM_TREE item 66 key (594558976 METADATA_ITEM 0) itemoff 13847 itemsize 33 refs 1 gen 197522 flags TREE_BLOCK tree block skinny level 0 (176 0x7) tree block backref root CSUM_TREE But the such missing backref item is not an corruption on disk, as the offending delayed ref belongs to subvolume 934, and that subvolume is being dropped: item 0 key (934 ROOT_ITEM 198229) itemoff 15844 itemsize 439 generation 198229 root_dirid 256 bytenr 10741039104 byte_limit 0 bytes_used 345571328 last_snapshot 198229 flags 0x1000000000001(RDONLY) refs 0 drop_progress key (206324 EXTENT_DATA 2711650304) drop_level 2 level 2 generation_v2 198229 And that offending tree block 594526208 is inside the dropped range of that subvolume. That explains why there is no backref item for that bytenr and why btrfs check is not reporting anything wrong. But this also shows another problem, as btrfs will do all the orphan subvolume cleanup at a read-write mount. So half-dropped subvolume should not exist after an RW mount, and balance itself is also exclusive to subvolume cleanup, meaning we shouldn't hit a subvolume half-dropped during relocation. The root cause is, there is no orphan item for this subvolume. In fact there are 5 subvolumes from around 2021 that have the same problem. It looks like the original report has some older kernels running, and caused those zombie subvolumes. Thankfully upstream commit `8d488a8c7b` ("btrfs: fix subvolume/snapshot deletion not triggered on mount") has long fixed the bug. [ENHANCEMENT] For repairing such old fs, btrfs-progs will be enhanced. Considering how delayed the problem will show up (at run delayed ref time) and at that time we have to abort transaction already, it is too late. Instead here we reject any half-dropped subvolume for reloc tree at the earliest time, preventing confusion and extra time wasted on debugging similar bugs. CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-08-07 17:07:15 +02:00
Filipe Manana	fc5799986f	btrfs: error on missing block group when unaccounting log tree extent buffers Currently we only log an error message if we can't find the block group for a log tree extent buffer when unaccounting it (while freeing a log tree). A missing block group means something is seriously wrong and we end up leaking space from the metadata space info. So return -ENOENT in case we don't find the block group. CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-08-07 17:07:15 +02:00
Qu Wenruo	deaf895212	btrfs: fix wrong length parameter for btrfs_cleanup_ordered_extents() Inside nocow_one_range(), if the checksum cloning for data reloc inode failed, we call btrfs_cleanup_ordered_extents() to cleanup the just allocated ordered extents. But unlike extent_clear_unlock_delalloc(), btrfs_cleanup_ordered_extents() requires a length, not an inclusive end bytenr. This can be problematic, as the @end is normally way larger than @len. This means btrfs_cleanup_ordered_extents() can be called on folios out of the correct range, and if the out-of-range folio is under writeback, we can incorrectly clear the ordered flag of the folio, and trigger the DEBUG_WARN() inside btrfs_writepage_cow_fixup(). Fix the wrong parameter with correct length instead. Fixes: `94f6c5c17e` ("btrfs: move ordered extent cleanup to where they are allocated") CC: stable@vger.kernel.org # 6.15+ Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-08-07 17:07:15 +02:00
Qu Wenruo	15fc0bec88	btrfs: make btrfs_cleanup_ordered_extents() support large folios When hitting a large folio, btrfs_cleanup_ordered_extents() will get the same large folio multiple times, and clearing the same range again and again. Thankfully this is not causing anything wrong, just inefficiency. This is caused by the fact that we're iterating folios using the old page index, thus can hit the same large folio again and again. Enhance it by increasing @index to the index of the folio end, and only increase @index by 1 if we failed to grab a folio. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-08-07 17:07:15 +02:00
Leo Martins	ad580dfa38	btrfs: fix subpage deadlock in try_release_subpage_extent_buffer() There is a potential deadlock that can happen in try_release_subpage_extent_buffer() because the irq-safe xarray spin lock fs_info->buffer_tree is being acquired before the irq-unsafe eb->refs_lock. This leads to the potential race: // T1 (random eb->refs user) // T2 (release folio) spin_lock(&eb->refs_lock); // interrupt end_bbio_meta_write() btrfs_meta_folio_clear_writeback() btree_release_folio() folio_test_writeback() //false try_release_extent_buffer() try_release_subpage_extent_buffer() xa_lock_irq(&fs_info->buffer_tree) spin_lock(&eb->refs_lock); // blocked; held by T1 buffer_tree_clear_mark() xas_lock_irqsave() // blocked; held by T2 I believe that the spin lock can safely be replaced by an rcu_read_lock. The xa_for_each loop does not need the spin lock as it's already internally protected by the rcu_read_lock. The extent buffer is also protected by the rcu_read_lock so it won't be freed before we take the eb->refs_lock and check the ref count. The rcu_read_lock is taken and released every iteration, just like the spin lock, which means we're not protected against concurrent insertions into the xarray. This is fine because we rely on folio->private to detect if there are any ebs remaining in the folio. There is already some precedent for this with find_extent_buffer_nolock, which loads an extent buffer from the xarray with only rcu_read_lock. lockdep warning: ===================================================== WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected 6.16.0-0_fbk701_debug_rc0_123_g4c06e63b9203 #1 Tainted: G E N ----------------------------------------------------- kswapd0/66 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: ffff000011ffd600 (&eb->refs_lock){+.+.}-{3:3}, at: try_release_extent_buffer+0x18c/0x560 and this task is already holding: ffff0000c1d91b88 (&buffer_xa_class){-.-.}-{3:3}, at: try_release_extent_buffer+0x13c/0x560 which would create a new lock dependency: (&buffer_xa_class){-.-.}-{3:3} -> (&eb->refs_lock){+.+.}-{3:3} but this new dependency connects a HARDIRQ-irq-safe lock: (&buffer_xa_class){-.-.}-{3:3} ... which became HARDIRQ-irq-safe at: lock_acquire+0x178/0x358 _raw_spin_lock_irqsave+0x60/0x88 buffer_tree_clear_mark+0xc4/0x160 end_bbio_meta_write+0x238/0x398 btrfs_bio_end_io+0x1f8/0x330 btrfs_orig_write_end_io+0x1c4/0x2c0 bio_endio+0x63c/0x678 blk_update_request+0x1c4/0xa00 blk_mq_end_request+0x54/0x88 virtblk_request_done+0x124/0x1d0 blk_mq_complete_request+0x84/0xa0 virtblk_done+0x130/0x238 vring_interrupt+0x130/0x288 __handle_irq_event_percpu+0x1e8/0x708 handle_irq_event+0x98/0x1b0 handle_fasteoi_irq+0x264/0x7c0 generic_handle_domain_irq+0xa4/0x108 gic_handle_irq+0x7c/0x1a0 do_interrupt_handler+0xe4/0x148 el1_interrupt+0x30/0x50 el1h_64_irq_handler+0x14/0x20 el1h_64_irq+0x6c/0x70 _raw_spin_unlock_irq+0x38/0x70 __run_timer_base+0xdc/0x5e0 run_timer_softirq+0xa0/0x138 handle_softirqs.llvm.13542289750107964195+0x32c/0xbd0 ____do_softirq.llvm.17674514681856217165+0x18/0x28 call_on_irq_stack+0x24/0x30 __irq_exit_rcu+0x164/0x430 irq_exit_rcu+0x18/0x88 el1_interrupt+0x34/0x50 el1h_64_irq_handler+0x14/0x20 el1h_64_irq+0x6c/0x70 arch_local_irq_enable+0x4/0x8 do_idle+0x1a0/0x3b8 cpu_startup_entry+0x60/0x80 rest_init+0x204/0x228 start_kernel+0x394/0x3f0 __primary_switched+0x8c/0x8958 to a HARDIRQ-irq-unsafe lock: (&eb->refs_lock){+.+.}-{3:3} ... which became HARDIRQ-irq-unsafe at: ... lock_acquire+0x178/0x358 _raw_spin_lock+0x4c/0x68 free_extent_buffer_stale+0x2c/0x170 btrfs_read_sys_array+0x1b0/0x338 open_ctree+0xeb0/0x1df8 btrfs_get_tree+0xb60/0x1110 vfs_get_tree+0x8c/0x250 fc_mount+0x20/0x98 btrfs_get_tree+0x4a4/0x1110 vfs_get_tree+0x8c/0x250 do_new_mount+0x1e0/0x6c0 path_mount+0x4ec/0xa58 __arm64_sys_mount+0x370/0x490 invoke_syscall+0x6c/0x208 el0_svc_common+0x14c/0x1b8 do_el0_svc+0x4c/0x60 el0_svc+0x4c/0x160 el0t_64_sync_handler+0x70/0x100 el0t_64_sync+0x168/0x170 other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&eb->refs_lock); local_irq_disable(); lock(&buffer_xa_class); lock(&eb->refs_lock); <Interrupt> lock(&buffer_xa_class); * DEADLOCK * 2 locks held by kswapd0/66: #0: ffff800085506e40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xe8/0xe50 #1: ffff0000c1d91b88 (&buffer_xa_class){-.-.}-{3:3}, at: try_release_extent_buffer+0x13c/0x560 Link: https://www.kernel.org/doc/Documentation/locking/lockdep-design.rst#:~:text=Multi%2Dlock%20dependency%20rules%3A Fixes: `19d7f65f03` ("btrfs: convert the buffer_radix to an xarray") CC: stable@vger.kernel.org # 6.16+ Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Leo Martins <loemra.dev@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-08-07 17:07:15 +02:00
Eric Dumazet	ae633388ca	pptp: fix pptp_xmit() error path I accidentally added a bug in pptp_xmit() that syzbot caught for us. Only call ip_rt_put() if a route has been allocated. BUG: unable to handle page fault for address: ffffffffffffffdb PGD df3b067 P4D df3b067 PUD df3d067 PMD 0 Oops: Oops: 0002 [#1] SMP KASAN PTI CPU: 1 UID: 0 PID: 6346 Comm: syz.0.336 Not tainted 6.16.0-next-20250804-syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025 RIP: 0010:arch_atomic_add_return arch/x86/include/asm/atomic.h:85 [inline] RIP: 0010:raw_atomic_sub_return_release include/linux/atomic/atomic-arch-fallback.h:846 [inline] RIP: 0010:atomic_sub_return_release include/linux/atomic/atomic-instrumented.h:327 [inline] RIP: 0010:__rcuref_put include/linux/rcuref.h:109 [inline] RIP: 0010:rcuref_put+0x172/0x210 include/linux/rcuref.h:173 Call Trace: <TASK> dst_release+0x24/0x1b0 net/core/dst.c:167 ip_rt_put include/net/route.h:285 [inline] pptp_xmit+0x14b/0x1a90 drivers/net/ppp/pptp.c:267 __ppp_channel_push+0xf2/0x1c0 drivers/net/ppp/ppp_generic.c:2166 ppp_channel_push+0x123/0x660 drivers/net/ppp/ppp_generic.c:2198 ppp_write+0x2b0/0x400 drivers/net/ppp/ppp_generic.c:544 vfs_write+0x27b/0xb30 fs/read_write.c:684 ksys_write+0x145/0x250 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: `de9c4861fb` ("pptp: ensure minimal skb length in pptp_xmit()") Reported-by: syzbot+27d7cfbc93457e472e00@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/689095a5.050a0220.1fc43d.0009.GAE@google.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250807142146.2877060-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-08-07 07:47:04 -07:00
Yu Kuai	45fa9f97e6	lib/sbitmap: make sbitmap_get_shallow() internal Because it's only used in sbitmap.c Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20250807032413.1469456-3-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-08-07 06:30:17 -06:00
Yu Kuai	42e6c6ce03	lib/sbitmap: convert shallow_depth from one word to the whole sbitmap Currently elevators will record internal 'async_depth' to throttle asynchronous requests, and they both calculate shallow_dpeth based on sb->shift, with the respect that sb->shift is the available tags in one word. However, sb->shift is not the availbale tags in the last word, see __map_depth: if (index == sb->map_nr - 1) return sb->depth - (index << sb->shift); For consequence, if the last word is used, more tags can be get than expected, for example, assume nr_requests=256 and there are four words, in the worst case if user set nr_requests=32, then the first word is the last word, and still use bits per word, which is 64, to calculate async_depth is wrong. One the ohter hand, due to cgroup qos, bfq can allow only one request to be allocated, and set shallow_dpeth=1 will still allow the number of words request to be allocated. Fix this problems by using shallow_depth to the whole sbitmap instead of per word, also change kyber, mq-deadline and bfq to follow this, a new helper __map_depth_with_shallow() is introduced to calculate available bits in each word. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250807032413.1469456-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-08-07 06:30:17 -06:00

1 2 3 4 5 ...

1381809 commits