mailweb.openeuler.org
Manage this list

Keyboard Shortcuts

Thread View

  • j: Next unread message
  • k: Previous unread message
  • j a: Jump to all threads
  • j l: Jump to MailingList overview

Kernel

Threads by month
  • ----- 2025 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2024 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2023 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2022 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2021 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2020 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2019 -----
  • December
kernel@openeuler.org

  • 41 participants
  • 21841 discussions
[PATCH OLK-5.10] net/smc: fix UAF on smcsk after smc_listen_out()
by Wang Liang 09 Dec '25

09 Dec '25
From: "D. Wythe" <alibuda(a)linux.alibaba.com> mainline inclusion from mainline-v6.17-rc3 commit d9cef55ed49117bd63695446fb84b4b91815c0b4 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ICWO2I CVE: CVE-2025-38734 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… -------------------------------- BPF CI testing report a UAF issue: [ 16.446633] BUG: kernel NULL pointer dereference, address: 000000000000003 0 [ 16.447134] #PF: supervisor read access in kernel mod e [ 16.447516] #PF: error_code(0x0000) - not-present pag e [ 16.447878] PGD 0 P4D 0 [ 16.448063] Oops: Oops: 0000 [#1] PREEMPT SMP NOPT I [ 16.448409] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:1 Tainted: G OE 6.13.0-rc3-g89e8a75fda73-dirty #4 2 [ 16.449124] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODUL E [ 16.449502] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/201 4 [ 16.450201] Workqueue: smc_hs_wq smc_listen_wor k [ 16.450531] RIP: 0010:smc_listen_work+0xc02/0x159 0 [ 16.452158] RSP: 0018:ffffb5ab40053d98 EFLAGS: 0001024 6 [ 16.452526] RAX: 0000000000000001 RBX: 0000000000000002 RCX: 000000000000030 0 [ 16.452994] RDX: 0000000000000280 RSI: 00003513840053f0 RDI: 000000000000000 0 [ 16.453492] RBP: ffffa097808e3800 R08: ffffa09782dba1e0 R09: 000000000000000 5 [ 16.453987] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0978274640 0 [ 16.454497] R13: 0000000000000000 R14: 0000000000000000 R15: ffffa09782d4092 0 [ 16.454996] FS: 0000000000000000(0000) GS:ffffa097bbc00000(0000) knlGS:000000000000000 0 [ 16.455557] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003 3 [ 16.455961] CR2: 0000000000000030 CR3: 0000000102788004 CR4: 0000000000770ef 0 [ 16.456459] PKRU: 5555555 4 [ 16.456654] Call Trace : [ 16.456832] <TASK > [ 16.456989] ? __die+0x23/0x7 0 [ 16.457215] ? page_fault_oops+0x180/0x4c 0 [ 16.457508] ? __lock_acquire+0x3e6/0x249 0 [ 16.457801] ? exc_page_fault+0x68/0x20 0 [ 16.458080] ? asm_exc_page_fault+0x26/0x3 0 [ 16.458389] ? smc_listen_work+0xc02/0x159 0 [ 16.458689] ? smc_listen_work+0xc02/0x159 0 [ 16.458987] ? lock_is_held_type+0x8f/0x10 0 [ 16.459284] process_one_work+0x1ea/0x6d 0 [ 16.459570] worker_thread+0x1c3/0x38 0 [ 16.459839] ? __pfx_worker_thread+0x10/0x1 0 [ 16.460144] kthread+0xe0/0x11 0 [ 16.460372] ? __pfx_kthread+0x10/0x1 0 [ 16.460640] ret_from_fork+0x31/0x5 0 [ 16.460896] ? __pfx_kthread+0x10/0x1 0 [ 16.461166] ret_from_fork_asm+0x1a/0x3 0 [ 16.461453] </TASK > [ 16.461616] Modules linked in: bpf_testmod(OE) [last unloaded: bpf_testmod(OE) ] [ 16.462134] CR2: 000000000000003 0 [ 16.462380] ---[ end trace 0000000000000000 ]--- [ 16.462710] RIP: 0010:smc_listen_work+0xc02/0x1590 The direct cause of this issue is that after smc_listen_out_connected(), newclcsock->sk may be NULL since it will releases the smcsk. Therefore, if the application closes the socket immediately after accept, newclcsock->sk can be NULL. A possible execution order could be as follows: smc_listen_work | userspace ----------------------------------------------------------------- lock_sock(sk) | smc_listen_out_connected() | | \- smc_listen_out | | | \- release_sock | | |- sk->sk_data_ready() | | fd = accept(); | close(fd); | \- socket->sk = NULL; /* newclcsock->sk is NULL now */ SMC_STAT_SERV_SUCC_INC(sock_net(newclcsock->sk)) Since smc_listen_out_connected() will not fail, simply swapping the order of the code can easily fix this issue. Fixes: 3b2dec2603d5 ("net/smc: restructure client and server code in af_smc") Signed-off-by: D. Wythe <alibuda(a)linux.alibaba.com> Reviewed-by: Guangguan Wang <guangguan.wang(a)linux.alibaba.com> Reviewed-by: Alexandra Winter <wintera(a)linux.ibm.com> Reviewed-by: Dust Li <dust.li(a)linux.alibaba.com> Link: https://patch.msgid.link/20250818054618.41615-1-alibuda@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> Conflicts: net/smc/af_smc.c [conflicts due to not merge 194730a9beb5 ("net/smc: Make SMC statistics network namespace aware")] Signed-off-by: Wang Liang <wangliang74(a)huawei.com> --- net/smc/af_smc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 9cd508208a0d..9d4c58b09505 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2492,8 +2492,9 @@ static void smc_listen_work(struct work_struct *work) goto out_decl; } - smc_listen_out_connected(new_smc); SMC_STAT_SERV_SUCC_INC(ini); + /* smc_listen_out() will release smcsk */ + smc_listen_out_connected(new_smc); goto out_free; out_unlock: -- 2.34.1
2 1
0 0
[PATCH OLK-5.10] i40e: remove read access to debugfs files
by Wang Liang 09 Dec '25

09 Dec '25
From: Jacob Keller <jacob.e.keller(a)intel.com> mainline inclusion from mainline-v6.17-rc5 commit 9fcdb1c3c4ba134434694c001dbff343f1ffa319 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ID0R34 CVE: CVE-2025-39901 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… -------------------------------- The 'command' and 'netdev_ops' debugfs files are a legacy debugging interface supported by the i40e driver since its early days by commit 02e9c290814c ("i40e: debugfs interface"). Both of these debugfs files provide a read handler which is mostly useless, and which is implemented with questionable logic. They both use a static 256 byte buffer which is initialized to the empty string. In the case of the 'command' file this buffer is literally never used and simply wastes space. In the case of the 'netdev_ops' file, the last command written is saved here. On read, the files contents are presented as the name of the device followed by a colon and then the contents of their respective static buffer. For 'command' this will always be "<device>: ". For 'netdev_ops', this will be "<device>: <last command written>". But note the buffer is shared between all devices operated by this module. At best, it is mostly meaningless information, and at worse it could be accessed simultaneously as there doesn't appear to be any locking mechanism. We have also recently received multiple reports for both read functions about their use of snprintf and potential overflow that could result in reading arbitrary kernel memory. For the 'command' file, this is definitely impossible, since the static buffer is always zero and never written to. For the 'netdev_ops' file, it does appear to be possible, if the user carefully crafts the command input, it will be copied into the buffer, which could be large enough to cause snprintf to truncate, which then causes the copy_to_user to read beyond the length of the buffer allocated by kzalloc. A minimal fix would be to replace snprintf() with scnprintf() which would cap the return to the number of bytes written, preventing an overflow. A more involved fix would be to drop the mostly useless static buffers, saving 512 bytes and modifying the read functions to stop needing those as input. Instead, lets just completely drop the read access to these files. These are debug interfaces exposed as part of debugfs, and I don't believe that dropping read access will break any script, as the provided output is pretty useless. You can find the netdev name through other more standard interfaces, and the 'netdev_ops' interface can easily result in garbage if you issue simultaneous writes to multiple devices at once. In order to properly remove the i40e_dbg_netdev_ops_buf, we need to refactor its write function to avoid using the static buffer. Instead, use the same logic as the i40e_dbg_command_write, with an allocated buffer. Update the code to use this instead of the static buffer, and ensure we free the buffer on exit. This fixes simultaneous writes to 'netdev_ops' on multiple devices, and allows us to remove the now unused static buffer along with removing the read access. Fixes: 02e9c290814c ("i40e: debugfs interface") Reported-by: Kunwu Chan <chentao(a)kylinos.cn> Closes: https://lore.kernel.org/intel-wired-lan/20231208031950.47410-1-chentao@kyli… Reported-by: Wang Haoran <haoranwangsec(a)gmail.com> Closes: https://lore.kernel.org/all/CANZ3JQRRiOdtfQJoP9QM=6LS1Jto8PGBGw6y7-TL=BcnzH… Reported-by: Amir Mohammad Jahangirzad <a.jahangirzad(a)gmail.com> Closes: https://lore.kernel.org/all/20250722115017.206969-1-a.jahangirzad@gmail.com/ Signed-off-by: Jacob Keller <jacob.e.keller(a)intel.com> Reviewed-by: Dawid Osuchowski <dawid.osuchowski(a)linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Reviewed-by: Kunwu Chan <kunwu.chan(a)linux.dev> Tested-by: Rinitha S <sx.rinitha(a)intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com> Conflicts: drivers/net/ethernet/intel/i40e/i40e_debugfs.c [conflicts due to not merge 43f4466ca91d ("i40e: Add helper to access main VSI")] Signed-off-by: Wang Liang <wangliang74(a)huawei.com> --- .../net/ethernet/intel/i40e/i40e_debugfs.c | 121 +++--------------- 1 file changed, 19 insertions(+), 102 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c index 8bcf5902babf..d5c7f07db071 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c +++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c @@ -57,47 +57,6 @@ static struct i40e_veb *i40e_dbg_find_veb(struct i40e_pf *pf, int seid) * setup, adding or removing filters, or other things. Many of * these will be useful for some forms of unit testing. **************************************************************/ -static char i40e_dbg_command_buf[256] = ""; - -/** - * i40e_dbg_command_read - read for command datum - * @filp: the opened file - * @buffer: where to write the data for the user to read - * @count: the size of the user's buffer - * @ppos: file position offset - **/ -static ssize_t i40e_dbg_command_read(struct file *filp, char __user *buffer, - size_t count, loff_t *ppos) -{ - struct i40e_pf *pf = filp->private_data; - int bytes_not_copied; - int buf_size = 256; - char *buf; - int len; - - /* don't allow partial reads */ - if (*ppos != 0) - return 0; - if (count < buf_size) - return -ENOSPC; - - buf = kzalloc(buf_size, GFP_KERNEL); - if (!buf) - return -ENOSPC; - - len = snprintf(buf, buf_size, "%s: %s\n", - pf->vsi[pf->lan_vsi]->netdev->name, - i40e_dbg_command_buf); - - bytes_not_copied = copy_to_user(buffer, buf, len); - kfree(buf); - - if (bytes_not_copied) - return -EFAULT; - - *ppos = len; - return len; -} static char *i40e_filter_state_string[] = { "INVALID", @@ -1635,7 +1594,6 @@ static ssize_t i40e_dbg_command_write(struct file *filp, static const struct file_operations i40e_dbg_command_fops = { .owner = THIS_MODULE, .open = simple_open, - .read = i40e_dbg_command_read, .write = i40e_dbg_command_write, }; @@ -1644,47 +1602,6 @@ static const struct file_operations i40e_dbg_command_fops = { * The netdev_ops entry in debugfs is for giving the driver commands * to be executed from the netdev operations. **************************************************************/ -static char i40e_dbg_netdev_ops_buf[256] = ""; - -/** - * i40e_dbg_netdev_ops - read for netdev_ops datum - * @filp: the opened file - * @buffer: where to write the data for the user to read - * @count: the size of the user's buffer - * @ppos: file position offset - **/ -static ssize_t i40e_dbg_netdev_ops_read(struct file *filp, char __user *buffer, - size_t count, loff_t *ppos) -{ - struct i40e_pf *pf = filp->private_data; - int bytes_not_copied; - int buf_size = 256; - char *buf; - int len; - - /* don't allow partal reads */ - if (*ppos != 0) - return 0; - if (count < buf_size) - return -ENOSPC; - - buf = kzalloc(buf_size, GFP_KERNEL); - if (!buf) - return -ENOSPC; - - len = snprintf(buf, buf_size, "%s: %s\n", - pf->vsi[pf->lan_vsi]->netdev->name, - i40e_dbg_netdev_ops_buf); - - bytes_not_copied = copy_to_user(buffer, buf, len); - kfree(buf); - - if (bytes_not_copied) - return -EFAULT; - - *ppos = len; - return len; -} /** * i40e_dbg_netdev_ops_write - write into netdev_ops datum @@ -1698,35 +1615,36 @@ static ssize_t i40e_dbg_netdev_ops_write(struct file *filp, size_t count, loff_t *ppos) { struct i40e_pf *pf = filp->private_data; + char *cmd_buf, *buf_tmp; int bytes_not_copied; struct i40e_vsi *vsi; - char *buf_tmp; int vsi_seid; int i, cnt; /* don't allow partial writes */ if (*ppos != 0) return 0; - if (count >= sizeof(i40e_dbg_netdev_ops_buf)) - return -ENOSPC; - memset(i40e_dbg_netdev_ops_buf, 0, sizeof(i40e_dbg_netdev_ops_buf)); - bytes_not_copied = copy_from_user(i40e_dbg_netdev_ops_buf, - buffer, count); - if (bytes_not_copied) + cmd_buf = kzalloc(count + 1, GFP_KERNEL); + if (!cmd_buf) + return count; + bytes_not_copied = copy_from_user(cmd_buf, buffer, count); + if (bytes_not_copied) { + kfree(cmd_buf); return -EFAULT; - i40e_dbg_netdev_ops_buf[count] = '\0'; + } + cmd_buf[count] = '\0'; - buf_tmp = strchr(i40e_dbg_netdev_ops_buf, '\n'); + buf_tmp = strchr(cmd_buf, '\n'); if (buf_tmp) { *buf_tmp = '\0'; - count = buf_tmp - i40e_dbg_netdev_ops_buf + 1; + count = buf_tmp - cmd_buf + 1; } - if (strncmp(i40e_dbg_netdev_ops_buf, "change_mtu", 10) == 0) { + if (strncmp(cmd_buf, "change_mtu", 10) == 0) { int mtu; - cnt = sscanf(&i40e_dbg_netdev_ops_buf[11], "%i %i", + cnt = sscanf(&cmd_buf[11], "%i %i", &vsi_seid, &mtu); if (cnt != 2) { dev_info(&pf->pdev->dev, "change_mtu <vsi_seid> <mtu>\n"); @@ -1748,8 +1666,8 @@ static ssize_t i40e_dbg_netdev_ops_write(struct file *filp, dev_info(&pf->pdev->dev, "Could not acquire RTNL - please try again\n"); } - } else if (strncmp(i40e_dbg_netdev_ops_buf, "set_rx_mode", 11) == 0) { - cnt = sscanf(&i40e_dbg_netdev_ops_buf[11], "%i", &vsi_seid); + } else if (strncmp(cmd_buf, "set_rx_mode", 11) == 0) { + cnt = sscanf(&cmd_buf[11], "%i", &vsi_seid); if (cnt != 1) { dev_info(&pf->pdev->dev, "set_rx_mode <vsi_seid>\n"); goto netdev_ops_write_done; @@ -1769,8 +1687,8 @@ static ssize_t i40e_dbg_netdev_ops_write(struct file *filp, dev_info(&pf->pdev->dev, "Could not acquire RTNL - please try again\n"); } - } else if (strncmp(i40e_dbg_netdev_ops_buf, "napi", 4) == 0) { - cnt = sscanf(&i40e_dbg_netdev_ops_buf[4], "%i", &vsi_seid); + } else if (strncmp(cmd_buf, "napi", 4) == 0) { + cnt = sscanf(&cmd_buf[4], "%i", &vsi_seid); if (cnt != 1) { dev_info(&pf->pdev->dev, "napi <vsi_seid>\n"); goto netdev_ops_write_done; @@ -1788,21 +1706,20 @@ static ssize_t i40e_dbg_netdev_ops_write(struct file *filp, dev_info(&pf->pdev->dev, "napi called\n"); } } else { - dev_info(&pf->pdev->dev, "unknown command '%s'\n", - i40e_dbg_netdev_ops_buf); + dev_info(&pf->pdev->dev, "unknown command '%s'\n", cmd_buf); dev_info(&pf->pdev->dev, "available commands\n"); dev_info(&pf->pdev->dev, " change_mtu <vsi_seid> <mtu>\n"); dev_info(&pf->pdev->dev, " set_rx_mode <vsi_seid>\n"); dev_info(&pf->pdev->dev, " napi <vsi_seid>\n"); } netdev_ops_write_done: + kfree(cmd_buf); return count; } static const struct file_operations i40e_dbg_netdev_ops_fops = { .owner = THIS_MODULE, .open = simple_open, - .read = i40e_dbg_netdev_ops_read, .write = i40e_dbg_netdev_ops_write, }; -- 2.34.1
2 1
0 0
[PATCH OLK-5.10] ipvs: Defer ip_vs_ftp unregister during netns cleanup
by Wang Liang 09 Dec '25

09 Dec '25
From: Slavin Liu <slavin452(a)gmail.com> stable inclusion from stable-v5.10.246 commit 421b1ae1574dfdda68b835c15ac4921ec0030182 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ID3FOO CVE: CVE-2025-40018 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- [ Upstream commit 134121bfd99a06d44ef5ba15a9beb075297c0821 ] On the netns cleanup path, __ip_vs_ftp_exit() may unregister ip_vs_ftp before connections with valid cp->app pointers are flushed, leading to a use-after-free. Fix this by introducing a global `exiting_module` flag, set to true in ip_vs_ftp_exit() before unregistering the pernet subsystem. In __ip_vs_ftp_exit(), skip ip_vs_ftp unregister if called during netns cleanup (when exiting_module is false) and defer it to __ip_vs_cleanup_batch(), which unregisters all apps after all connections are flushed. If called during module exit, unregister ip_vs_ftp immediately. Fixes: 61b1ab4583e2 ("IPVS: netns, add basic init per netns.") Suggested-by: Julian Anastasov <ja(a)ssi.bg> Signed-off-by: Slavin Liu <slavin452(a)gmail.com> Signed-off-by: Julian Anastasov <ja(a)ssi.bg> Signed-off-by: Florian Westphal <fw(a)strlen.de> Signed-off-by: Sasha Levin <sashal(a)kernel.org> Signed-off-by: Wang Liang <wangliang74(a)huawei.com> --- net/netfilter/ipvs/ip_vs_ftp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c index cf925906f59b..67d0d4f1f0db 100644 --- a/net/netfilter/ipvs/ip_vs_ftp.c +++ b/net/netfilter/ipvs/ip_vs_ftp.c @@ -53,6 +53,7 @@ enum { IP_VS_FTP_EPSV, }; +static bool exiting_module; /* * List of ports (up to IP_VS_APP_MAX_PORTS) to be handled by helper * First port is set to the default port. @@ -607,7 +608,7 @@ static void __ip_vs_ftp_exit(struct net *net) { struct netns_ipvs *ipvs = net_ipvs(net); - if (!ipvs) + if (!ipvs || !exiting_module) return; unregister_ip_vs_app(ipvs, &ip_vs_ftp); @@ -629,6 +630,7 @@ static int __init ip_vs_ftp_init(void) */ static void __exit ip_vs_ftp_exit(void) { + exiting_module = true; unregister_pernet_subsys(&ip_vs_ftp_ops); /* rcu_barrier() is called by netns */ } -- 2.34.1
2 1
0 0
[PATCH OLK-5.10] net/packet: fix a race in packet_set_ring() and packet_notifier()
by Wang Liang 09 Dec '25

09 Dec '25
From: Quang Le <quanglex97(a)gmail.com> stable inclusion from stable-v5.10.241 commit 7da733f117533e9b2ebbd530a22ae4028713955c category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ICUC4S CVE: CVE-2025-38617 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- commit 01d3c8417b9c1b884a8a981a3b886da556512f36 upstream. When packet_set_ring() releases po->bind_lock, another thread can run packet_notifier() and process an NETDEV_UP event. This race and the fix are both similar to that of commit 15fe076edea7 ("net/packet: fix a race in packet_bind() and packet_notifier()"). There too the packet_notifier NETDEV_UP event managed to run while a po->bind_lock critical section had to be temporarily released. And the fix was similarly to temporarily set po->num to zero to keep the socket unhooked until the lock is retaken. The po->bind_lock in packet_set_ring and packet_notifier precede the introduction of git history. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable(a)vger.kernel.org Signed-off-by: Quang Le <quanglex97(a)gmail.com> Signed-off-by: Willem de Bruijn <willemb(a)google.com> Link: https://patch.msgid.link/20250801175423.2970334-1-willemdebruijn.kernel@gma… Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: Wang Liang <wangliang74(a)huawei.com> --- net/packet/af_packet.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index ae57d3e5df32..ba7c192342d9 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -4446,10 +4446,10 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u, spin_lock(&po->bind_lock); was_running = po->running; num = po->num; - if (was_running) { - WRITE_ONCE(po->num, 0); + WRITE_ONCE(po->num, 0); + if (was_running) __unregister_prot_hook(sk, false); - } + spin_unlock(&po->bind_lock); synchronize_net(); @@ -4481,10 +4481,10 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u, mutex_unlock(&po->pg_vec_lock); spin_lock(&po->bind_lock); - if (was_running) { - WRITE_ONCE(po->num, num); + WRITE_ONCE(po->num, num); + if (was_running) register_prot_hook(sk); - } + spin_unlock(&po->bind_lock); if (pg_vec && (po->tp_version > TPACKET_V2)) { /* Because we don't support block-based V3 on tx-ring */ -- 2.34.1
2 1
0 0
[openeuler:OLK-6.6 3493/3493] mm/dynamic_pool.c:118:55: sparse: sparse: incorrect type in initializer (different address spaces)
by kernel test robot 09 Dec '25

09 Dec '25
tree: https://gitee.com/openeuler/kernel.git OLK-6.6 head: 0189df7df2ef7f7dbbabf7ff8398e45adf306e24 commit: 8822f3476adadcf84c7cb3db0d1a0a39a6fdc398 [3493/3493] mm/dynamic_pool: introduce per-memcg memory pool config: x86_64-randconfig-r122-20251209 (https://download.01.org/0day-ci/archive/20251209/202512091456.3c7HGfWF-lkp@…) compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251209/202512091456.3c7HGfWF-lkp@…) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp(a)intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202512091456.3c7HGfWF-lkp@intel.com/ sparse warnings: (new ones prefixed by >>) >> mm/dynamic_pool.c:118:55: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected struct cgroup_subsys_state *css @@ got struct cgroup_subsys_state [noderef] __rcu * @@ mm/dynamic_pool.c:118:55: sparse: expected struct cgroup_subsys_state *css mm/dynamic_pool.c:118:55: sparse: got struct cgroup_subsys_state [noderef] __rcu * vim +118 mm/dynamic_pool.c 115 116 int dynamic_pool_destroy(struct cgroup *cgrp, bool *clear_css_online) 117 { > 118 struct cgroup_subsys_state *css = cgrp->subsys[memory_cgrp_id]; 119 struct mem_cgroup *memcg = mem_cgroup_from_css(css); 120 struct dynamic_pool *dpool; 121 int ret = 0; 122 123 if (!dpool_enabled || !memcg) 124 return 0; 125 126 mutex_lock(&dpool_mutex); 127 dpool = dpool_get_from_memcg(memcg); 128 if (!dpool) 129 goto unlock; 130 131 if (dpool->memcg != memcg) { 132 memcg->dpool = NULL; 133 goto put; 134 } 135 136 /* A offline dpool is not allowed for allocation */ 137 dpool->online = false; 138 139 memcg->dpool = NULL; 140 141 /* Release the initial reference count */ 142 dpool_put(dpool); 143 144 /* 145 * Since dpool is destroyed and the memcg will be freed then, 146 * clear CSS_ONLINE immediately to prevent race with create. 147 */ 148 if (cgrp->self.flags & CSS_ONLINE) { 149 cgrp->self.flags &= ~CSS_ONLINE; 150 *clear_css_online = true; 151 } 152 153 put: 154 dpool_put(dpool); 155 unlock: 156 mutex_unlock(&dpool_mutex); 157 158 return ret; 159 } 160 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
1 0
0 0
[PATCH OLK-5.10] vsock: Do not allow binding to VMADDR_PORT_ANY
by Wang Liang 09 Dec '25

09 Dec '25
From: Budimir Markovic <markovicbudimir(a)gmail.com> stable inclusion from stable-v5.10.241 commit d1a5b1964cef42727668ac0d8532dae4f8c19386 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ICUC4U CVE: CVE-2025-38618 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- commit aba0c94f61ec05315fa7815d21aefa4c87f6a9f4 upstream. It is possible for a vsock to autobind to VMADDR_PORT_ANY. This can cause a use-after-free when a connection is made to the bound socket. The socket returned by accept() also has port VMADDR_PORT_ANY but is not on the list of unbound sockets. Binding it will result in an extra refcount decrement similar to the one fixed in fcdd2242c023 (vsock: Keep the binding until socket destruction). Modify the check in __vsock_bind_connectible() to also prevent binding to VMADDR_PORT_ANY. Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Reported-by: Budimir Markovic <markovicbudimir(a)gmail.com> Signed-off-by: Budimir Markovic <markovicbudimir(a)gmail.com> Reviewed-by: Stefano Garzarella <sgarzare(a)redhat.com> Link: https://patch.msgid.link/20250807041811.678-1-markovicbudimir@gmail.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: Wang Liang <wangliang74(a)huawei.com> --- net/vmw_vsock/af_vsock.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 3d24457ddf15..5e68c0705d3f 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -612,7 +612,8 @@ static int __vsock_bind_stream(struct vsock_sock *vsk, unsigned int i; for (i = 0; i < MAX_PORT_RETRIES; i++) { - if (port <= LAST_RESERVED_PORT) + if (port == VMADDR_PORT_ANY || + port <= LAST_RESERVED_PORT) port = LAST_RESERVED_PORT + 1; new_addr.svm_port = port++; -- 2.34.1
2 1
0 0
[PATCH OLK-5.10] ice: Fix a null pointer dereference in ice_copy_and_init_pkg()
by Wang Liang 09 Dec '25

09 Dec '25
From: Haoxiang Li <haoxiang_li2024(a)163.com> stable inclusion from stable-v5.10.241 commit 435462f8ab2b9c5340a5414ce02f70117d0cfede category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ICUCA2 CVE: CVE-2025-38664 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- [ Upstream commit 4ff12d82dac119b4b99b5a78b5af3bf2474c0a36 ] Add check for the return value of devm_kmemdup() to prevent potential null pointer dereference. Fixes: c76488109616 ("ice: Implement Dynamic Device Personalization (DDP) download") Cc: stable(a)vger.kernel.org Signed-off-by: Haoxiang Li <haoxiang_li2024(a)163.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski(a)linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov(a)intel.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Tested-by: Rinitha S <sx.rinitha(a)intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com> [ applied the patch to ice_flex_pipe.c instead of ice_ddp.c ] Signed-off-by: Sasha Levin <sashal(a)kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: Wang Liang <wangliang74(a)huawei.com> --- drivers/net/ethernet/intel/ice/ice_flex_pipe.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c index a81be917f653..da2906720c63 100644 --- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c +++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c @@ -1449,6 +1449,8 @@ enum ice_status ice_copy_and_init_pkg(struct ice_hw *hw, const u8 *buf, u32 len) return ICE_ERR_PARAM; buf_copy = devm_kmemdup(ice_hw_to_dev(hw), buf, len, GFP_KERNEL); + if (!buf_copy) + return ICE_ERR_NO_MEMORY; status = ice_init_pkg(hw, buf_copy, len); if (status) { -- 2.34.1
2 1
0 0
[PATCH OLK-5.10] ipv6: reject malicious packets in ipv6_gso_segment()
by Wang Liang 09 Dec '25

09 Dec '25
From: Eric Dumazet <edumazet(a)google.com> stable inclusion from stable-v5.10.241 commit 3f638e0b28bde7c3354a0df938ab3a96739455d1 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/ICU6G7 CVE: CVE-2025-38572 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- [ Upstream commit d45cf1e7d7180256e17c9ce88e32e8061a7887fe ] syzbot was able to craft a packet with very long IPv6 extension headers leading to an overflow of skb->transport_header. This 16bit field has a limited range. Add skb_reset_transport_header_careful() helper and use it from ipv6_gso_segment() WARNING: CPU: 0 PID: 5871 at ./include/linux/skbuff.h:3032 skb_reset_transport_header include/linux/skbuff.h:3032 [inline] WARNING: CPU: 0 PID: 5871 at ./include/linux/skbuff.h:3032 ipv6_gso_segment+0x15e2/0x21e0 net/ipv6/ip6_offload.c:151 Modules linked in: CPU: 0 UID: 0 PID: 5871 Comm: syz-executor211 Not tainted 6.16.0-rc6-syzkaller-g7abc678e3084 #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025 RIP: 0010:skb_reset_transport_header include/linux/skbuff.h:3032 [inline] RIP: 0010:ipv6_gso_segment+0x15e2/0x21e0 net/ipv6/ip6_offload.c:151 Call Trace: <TASK> skb_mac_gso_segment+0x31c/0x640 net/core/gso.c:53 nsh_gso_segment+0x54a/0xe10 net/nsh/nsh.c:110 skb_mac_gso_segment+0x31c/0x640 net/core/gso.c:53 __skb_gso_segment+0x342/0x510 net/core/gso.c:124 skb_gso_segment include/net/gso.h:83 [inline] validate_xmit_skb+0x857/0x11b0 net/core/dev.c:3950 validate_xmit_skb_list+0x84/0x120 net/core/dev.c:4000 sch_direct_xmit+0xd3/0x4b0 net/sched/sch_generic.c:329 __dev_xmit_skb net/core/dev.c:4102 [inline] __dev_queue_xmit+0x17b6/0x3a70 net/core/dev.c:4679 Fixes: d1da932ed4ec ("ipv6: Separate ipv6 offload support") Reported-by: syzbot+af43e647fd835acc02df(a)syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/688a1a05.050a0220.5d226.0008.GAE@google.com/… Signed-off-by: Eric Dumazet <edumazet(a)google.com> Reviewed-by: Dawid Osuchowski <dawid.osuchowski(a)linux.intel.com> Reviewed-by: Willem de Bruijn <willemb(a)google.com> Link: https://patch.msgid.link/20250730131738.3385939-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> Signed-off-by: Sasha Levin <sashal(a)kernel.org> Signed-off-by: Wang Liang <wangliang74(a)huawei.com> --- include/linux/skbuff.h | 23 +++++++++++++++++++++++ net/ipv6/ip6_offload.c | 4 +++- 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 4a7a2ff7aec3..6e8dfec0a2a9 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -2594,6 +2594,29 @@ static inline void skb_reset_transport_header(struct sk_buff *skb) skb->transport_header = skb->data - skb->head; } +/** + * skb_reset_transport_header_careful - conditionally reset transport header + * @skb: buffer to alter + * + * Hardened version of skb_reset_transport_header(). + * + * Returns: true if the operation was a success. + */ +static inline bool __must_check +skb_reset_transport_header_careful(struct sk_buff *skb) +{ + long offset = skb->data - skb->head; + + if (unlikely(offset != (typeof(skb->transport_header))offset)) + return false; + + if (unlikely(offset == (typeof(skb->transport_header))~0U)) + return false; + + skb->transport_header = offset; + return true; +} + static inline void skb_set_transport_header(struct sk_buff *skb, const int offset) { diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c index 15c8eef1ef44..71b5a7c3df09 100644 --- a/net/ipv6/ip6_offload.c +++ b/net/ipv6/ip6_offload.c @@ -111,7 +111,9 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb, ops = rcu_dereference(inet6_offloads[proto]); if (likely(ops && ops->callbacks.gso_segment)) { - skb_reset_transport_header(skb); + if (!skb_reset_transport_header_careful(skb)) + goto out; + segs = ops->callbacks.gso_segment(skb, features); if (!segs) skb->network_header = skb_mac_header(skb) + nhoff - skb->head; -- 2.34.1
2 1
0 0
[PATCH OLK-5.10] arm64: kdump: Skip kmemleak scan reserved memory for kdump
by Zeng Heng 09 Dec '25

09 Dec '25
From: Chen Wandun <chenwandun(a)huawei.com> mainline inclusion from mainline-v5.15-rc1 commit 85f58eb1889826b9745737718723a80b639e0fbd category: bugfix bugzilla: 188440 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… -------------------------------- Trying to boot with kdump + kmemleak, command will result in a crash: "echo scan > /sys/kernel/debug/kmemleak" crashkernel reserved: 0x0000000007c00000 - 0x0000000027c00000 (512 MB) Kernel command line: BOOT_IMAGE=(hd1,gpt2)/vmlinuz-5.14.0-rc5-next-20210809+ root=/dev/mapper/ao-root ro rd.lvm.lv=ao/root rd.lvm.lv=ao/swap crashkernel=512M Unable to handle kernel paging request at virtual address ffff000007c00000 Mem abort info: ESR = 0x96000007 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x07: level 3 translation fault Data abort info: ISV = 0, ISS = 0x00000007 CM = 0, WnR = 0 swapper pgtable: 64k pages, 48-bit VAs, pgdp=00002024f0d80000 [ffff000007c00000] pgd=1800205ffffd0003, p4d=1800205ffffd0003, pud=1800205ffffd0003, pmd=1800205ffffc0003, pte=0068000007c00f06 Internal error: Oops: 96000007 [#1] SMP pstate: 804000c9 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : scan_block+0x98/0x230 lr : scan_block+0x94/0x230 sp : ffff80008d6cfb70 x29: ffff80008d6cfb70 x28: 0000000000000000 x27: 0000000000000000 x26: 00000000000000c0 x25: 0000000000000001 x24: 0000000000000000 x23: ffffa88a6b18b398 x22: ffff000007c00ff9 x21: ffffa88a6ac7fc40 x20: ffffa88a6af6a830 x19: ffff000007c00000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: ffffffffffffffff x14: ffffffff00000000 x13: ffffffffffffffff x12: 0000000000000020 x11: 0000000000000000 x10: 0000000001080000 x9 : ffffa88a6951c77c x8 : ffffa88a6a893988 x7 : ffff203ff6cfb3c0 x6 : ffffa88a6a52b3c0 x5 : ffff203ff6cfb3c0 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 0000000000000001 x1 : ffff20226cb56a40 x0 : 0000000000000000 Call trace: scan_block+0x98/0x230 scan_gray_list+0x120/0x270 kmemleak_scan+0x3a0/0x648 kmemleak_write+0x3ac/0x4c8 full_proxy_write+0x6c/0xa0 vfs_write+0xc8/0x2b8 ksys_write+0x70/0xf8 __arm64_sys_write+0x24/0x30 invoke_syscall+0x4c/0x110 el0_svc_common+0x9c/0x190 do_el0_svc+0x30/0x98 el0_svc+0x28/0xd8 el0t_64_sync_handler+0x90/0xb8 el0t_64_sync+0x180/0x184 The reserved memory for kdump will be looked up by kmemleak, this area will be set invalid when kdump service is bring up. That will result in crash when kmemleak scan this area. Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private") Signed-off-by: Chen Wandun <chenwandun(a)huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com> Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com> Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com> Link: https://lore.kernel.org/r/20210910064844.3827813-1-chenwandun@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com> conflict: arch/arm64/mm/init.c Signed-off-by: ZhangPeng <zhangpeng362(a)huawei.com> Signed-off-by: Zeng Heng <zengheng4(a)huawei.com> --- kernel/crash_core.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/kernel/crash_core.c b/kernel/crash_core.c index d1402585bf84..368d8d1dd03c 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -9,6 +9,7 @@ #include <linux/vmalloc.h> #include <linux/memblock.h> #include <linux/swiotlb.h> +#include <linux/kmemleak.h> #ifdef CONFIG_KEXEC_CORE #include <asm/kexec.h> @@ -556,6 +557,14 @@ void __init reserve_crashkernel(void) (unsigned long)(crash_base >> 20), (unsigned long)(total_mem >> 20)); +#ifdef CONFIG_ARM64 + /* + * The crashkernel memory will be removed from the kernel linear + * map. Inform kmemleak so that it won't try to access it. + */ + kmemleak_ignore_phys(crash_base); +#endif + crashk_res.start = crash_base; crashk_res.end = crash_base + crash_size - 1; } -- 2.25.1
2 1
0 0
[openeuler:OLK-6.6 3493/3493] mm/memblock.c:1409:20: warning: no previous prototype for function 'memblock_alloc_range_nid_flags'
by kernel test robot 09 Dec '25

09 Dec '25
Hi Ma, FYI, the error/warning still remains. tree: https://gitee.com/openeuler/kernel.git OLK-6.6 head: d86ef4e1e96aa04c46e92dca30f98d069c1ad15a commit: 64018b291c1f49622c4b23b303364d760306d662 [3493/3493] mm/memblock: Introduce ability to alloc memory from specify memory region config: x86_64-randconfig-r122-20251209 (https://download.01.org/0day-ci/archive/20251209/202512091204.Q8mSD8MC-lkp@…) compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251209/202512091204.Q8mSD8MC-lkp@…) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp(a)intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202512091204.Q8mSD8MC-lkp@intel.com/ All warnings (new ones prefixed by >>): In file included from mm/memblock.c:18: In file included from include/linux/memblock.h:12: In file included from include/linux/mm.h:2181: include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion] 522 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_" | ~~~~~~~~~~~ ^ ~~~ >> mm/memblock.c:1409:20: warning: no previous prototype for function 'memblock_alloc_range_nid_flags' [-Wmissing-prototypes] 1409 | phys_addr_t __init memblock_alloc_range_nid_flags(phys_addr_t size, | ^ mm/memblock.c:1409:1: note: declare 'static' if the function is not intended to be used outside of this translation unit 1409 | phys_addr_t __init memblock_alloc_range_nid_flags(phys_addr_t size, | ^ | static 2 warnings generated. -- >> mm/memblock.c:1579: warning: expecting prototype for memblock_alloc_internal(). Prototype was for __memblock_alloc_internal() instead vim +/memblock_alloc_range_nid_flags +1409 mm/memblock.c 1386 1387 /** 1388 * memblock_alloc_range_nid_flags - allocate boot memory block with specify flag 1389 * @size: size of memory block to be allocated in bytes 1390 * @align: alignment of the region and block's size 1391 * @start: the lower bound of the memory region to allocate (phys address) 1392 * @end: the upper bound of the memory region to allocate (phys address) 1393 * @nid: nid of the free area to find, %NUMA_NO_NODE for any node 1394 * @exact_nid: control the allocation fall back to other nodes 1395 * @flags: alloc memory from specify memblock flag 1396 * 1397 * The allocation is performed from memory region limited by 1398 * memblock.current_limit if @end == %MEMBLOCK_ALLOC_ACCESSIBLE. 1399 * 1400 * If the specified node can not hold the requested memory and @exact_nid 1401 * is false, the allocation falls back to any node in the system. 1402 * 1403 * In addition, function sets the min_count to 0 using kmemleak_alloc_phys for 1404 * allocated boot memory block, so that it is never reported as leaks. 1405 * 1406 * Return: 1407 * Physical address of allocated memory block on success, %0 on failure. 1408 */ > 1409 phys_addr_t __init memblock_alloc_range_nid_flags(phys_addr_t size, 1410 phys_addr_t align, phys_addr_t start, 1411 phys_addr_t end, int nid, 1412 bool exact_nid, 1413 enum memblock_flags flags) 1414 { 1415 phys_addr_t found; 1416 1417 if (WARN_ONCE(nid == MAX_NUMNODES, "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n")) 1418 nid = NUMA_NO_NODE; 1419 1420 if (!align) { 1421 /* Can't use WARNs this early in boot on powerpc */ 1422 dump_stack(); 1423 align = SMP_CACHE_BYTES; 1424 } 1425 1426 again: 1427 found = memblock_find_in_range_node(size, align, start, end, nid, 1428 flags); 1429 if (found && !memblock_reserve(found, size)) 1430 goto done; 1431 1432 if (nid != NUMA_NO_NODE && !exact_nid) { 1433 found = memblock_find_in_range_node(size, align, start, 1434 end, NUMA_NO_NODE, 1435 flags); 1436 if (found && !memblock_reserve(found, size)) 1437 goto done; 1438 } 1439 1440 if (flags & MEMBLOCK_MIRROR) { 1441 flags &= ~MEMBLOCK_MIRROR; 1442 pr_warn_ratelimited("Could not allocate %pap bytes of mirrored memory\n", 1443 &size); 1444 goto again; 1445 } 1446 1447 return 0; 1448 1449 done: 1450 /* 1451 * Skip kmemleak for those places like kasan_init() and 1452 * early_pgtable_alloc() due to high volume. 1453 */ 1454 if (end != MEMBLOCK_ALLOC_NOLEAKTRACE) 1455 /* 1456 * Memblock allocated blocks are never reported as 1457 * leaks. This is because many of these blocks are 1458 * only referred via the physical address which is 1459 * not looked up by kmemleak. 1460 */ 1461 kmemleak_alloc_phys(found, size, 0); 1462 1463 /* 1464 * Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, 1465 * require memory to be accepted before it can be used by the 1466 * guest. 1467 * 1468 * Accept the memory of the allocated buffer. 1469 */ 1470 accept_memory(found, found + size); 1471 1472 return found; 1473 } 1474 1475 /** 1476 * memblock_alloc_range_nid - allocate boot memory block 1477 * @size: size of memory block to be allocated in bytes 1478 * @align: alignment of the region and block's size 1479 * @start: the lower bound of the memory region to allocate (phys address) 1480 * @end: the upper bound of the memory region to allocate (phys address) 1481 * @nid: nid of the free area to find, %NUMA_NO_NODE for any node 1482 * @exact_nid: control the allocation fall back to other nodes 1483 * 1484 * The allocation is performed from memory region limited by 1485 * memblock.current_limit if @end == %MEMBLOCK_ALLOC_ACCESSIBLE. 1486 * 1487 * If the specified node can not hold the requested memory and @exact_nid 1488 * is false, the allocation falls back to any node in the system. 1489 * 1490 * For systems with memory mirroring, the allocation is attempted first 1491 * from the regions with mirroring enabled and then retried from any 1492 * memory region. 1493 * 1494 * In addition, function using kmemleak_alloc_phys for allocated boot 1495 * memory block, it is never reported as leaks. 1496 * 1497 * Return: 1498 * Physical address of allocated memory block on success, %0 on failure. 1499 */ 1500 phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size, 1501 phys_addr_t align, phys_addr_t start, 1502 phys_addr_t end, int nid, 1503 bool exact_nid) 1504 { 1505 return memblock_alloc_range_nid_flags(size, align, start, end, nid, 1506 exact_nid, 1507 choose_memblock_flags()); 1508 } 1509 1510 /** 1511 * memblock_phys_alloc_range - allocate a memory block inside specified range 1512 * @size: size of memory block to be allocated in bytes 1513 * @align: alignment of the region and block's size 1514 * @start: the lower bound of the memory region to allocate (physical address) 1515 * @end: the upper bound of the memory region to allocate (physical address) 1516 * 1517 * Allocate @size bytes in the between @start and @end. 1518 * 1519 * Return: physical address of the allocated memory block on success, 1520 * %0 on failure. 1521 */ 1522 phys_addr_t __init memblock_phys_alloc_range(phys_addr_t size, 1523 phys_addr_t align, 1524 phys_addr_t start, 1525 phys_addr_t end) 1526 { 1527 memblock_dbg("%s: %llu bytes align=0x%llx from=%pa max_addr=%pa %pS\n", 1528 __func__, (u64)size, (u64)align, &start, &end, 1529 (void *)_RET_IP_); 1530 return memblock_alloc_range_nid(size, align, start, end, NUMA_NO_NODE, 1531 false); 1532 } 1533 1534 /** 1535 * memblock_phys_alloc_try_nid - allocate a memory block from specified NUMA node 1536 * @size: size of memory block to be allocated in bytes 1537 * @align: alignment of the region and block's size 1538 * @nid: nid of the free area to find, %NUMA_NO_NODE for any node 1539 * 1540 * Allocates memory block from the specified NUMA node. If the node 1541 * has no available memory, attempts to allocated from any node in the 1542 * system. 1543 * 1544 * Return: physical address of the allocated memory block on success, 1545 * %0 on failure. 1546 */ 1547 phys_addr_t __init memblock_phys_alloc_try_nid(phys_addr_t size, phys_addr_t align, int nid) 1548 { 1549 return memblock_alloc_range_nid(size, align, 0, 1550 MEMBLOCK_ALLOC_ACCESSIBLE, nid, false); 1551 } 1552 1553 /** 1554 * memblock_alloc_internal - allocate boot memory block 1555 * @size: size of memory block to be allocated in bytes 1556 * @align: alignment of the region and block's size 1557 * @min_addr: the lower bound of the memory region to allocate (phys address) 1558 * @max_addr: the upper bound of the memory region to allocate (phys address) 1559 * @nid: nid of the free area to find, %NUMA_NO_NODE for any node 1560 * @exact_nid: control the allocation fall back to other nodes 1561 * @flags: alloc memory from specify memblock flag 1562 * 1563 * Allocates memory block using memblock_alloc_range_nid() and 1564 * converts the returned physical address to virtual. 1565 * 1566 * The @min_addr limit is dropped if it can not be satisfied and the allocation 1567 * will fall back to memory below @min_addr. Other constraints, such 1568 * as node and mirrored memory will be handled again in 1569 * memblock_alloc_range_nid(). 1570 * 1571 * Return: 1572 * Virtual address of allocated memory block on success, NULL on failure. 1573 */ 1574 static void * __init __memblock_alloc_internal( 1575 phys_addr_t size, phys_addr_t align, 1576 phys_addr_t min_addr, phys_addr_t max_addr, 1577 int nid, bool exact_nid, 1578 enum memblock_flags flags) > 1579 { 1580 phys_addr_t alloc; 1581 1582 /* 1583 * Detect any accidental use of these APIs after slab is ready, as at 1584 * this moment memblock may be deinitialized already and its 1585 * internal data may be destroyed (after execution of memblock_free_all) 1586 */ 1587 if (WARN_ON_ONCE(slab_is_available())) 1588 return kzalloc_node(size, GFP_NOWAIT, nid); 1589 1590 if (max_addr > memblock.current_limit) 1591 max_addr = memblock.current_limit; 1592 1593 alloc = memblock_alloc_range_nid_flags(size, align, min_addr, max_addr, nid, 1594 exact_nid, flags); 1595 1596 /* retry allocation without lower limit */ 1597 if (!alloc && min_addr) 1598 alloc = memblock_alloc_range_nid_flags(size, align, 0, max_addr, nid, 1599 exact_nid, flags); 1600 1601 if (!alloc) 1602 return NULL; 1603 1604 return phys_to_virt(alloc); 1605 } 1606 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
1 0
0 0
  • ← Newer
  • 1
  • ...
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • ...
  • 2185
  • Older →

HyperKitty Powered by HyperKitty