hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I8XVM8
CVE: NA
------------------------------------
When I do some aer-inject and sysfs remove stress tests, I got the
following use-after-free Calltrace:
==================================================================
BUG: KASAN: use-after-free in pci_stop_bus_device+0x174/0x178
Read of size 8 at addr fffffc3e2e402218 by task bash/26311
CPU: 38 PID: 26311 Comm: bash Tainted: G W 4.19.105+ #82
Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V5.B161.01 06/10/2021
Call trace:
dump_backtrace+0x0/0x360
show_stack+0x24/0x30
dump_stack+0x130/0x164
print_address_description+0x68/0x278
kasan_report+0x204/0x330
__asan_report_load8_noabort+0x30/0x40
pci_stop_bus_device+0x174/0x178
pci_stop_and_remove_bus_device_locked+0x24/0x40
remove_store+0x1c8/0x1e0
dev_attr_store+0x60/0x80
sysfs_kf_write+0x104/0x170
kernfs_fop_write+0x23c/0x430
__vfs_write+0xec/0x4e0
vfs_write+0x12c/0x3d0
ksys_write+0xe8/0x208
__arm64_sys_write+0x70/0xa0
el0_svc_common+0x10c/0x450
el0_svc_handler+0x50/0xc0
el0_svc+0x10/0x14
Allocated by task 684:
kasan_kmalloc+0xe0/0x190
kmem_cache_alloc_trace+0x110/0x240
pci_alloc_dev+0x4c/0x110
pci_scan_single_device+0x100/0x218
pci_scan_slot+0x8c/0x2d8
pci_scan_child_bus_extend+0x90/0x628
pci_scan_child_bus+0x24/0x30
pci_scan_bridge_extend+0x3b8/0xb28
pci_scan_child_bus_extend+0x350/0x628
pci_rescan_bus+0x24/0x48
pcie_do_fatal_recovery+0x390/0x4b0
handle_error_source+0x124/0x158
aer_isr+0x5a0/0x800
process_one_work+0x598/0x1250
worker_thread+0x384/0xf08
kthread+0x2a4/0x320
ret_from_fork+0x10/0x18
Freed by task 685:
__kasan_slab_free+0x120/0x228
kasan_slab_free+0x10/0x18
kfree+0x88/0x218
pci_release_dev+0xb4/0xd8
device_release+0x6c/0x1c0
kobject_put+0x12c/0x400
put_device+0x24/0x30
pci_dev_put+0x24/0x30
handle_error_source+0x12c/0x158
aer_isr+0x5a0/0x800
process_one_work+0x598/0x1250
worker_thread+0x384/0xf08
kthread+0x2a4/0x320
ret_from_fork+0x10/0x18
The buggy address belongs to the object at fffffc3e2e402200
which belongs to the cache kmalloc-4096 of size 4096
The buggy address is located 24 bytes inside of
4096-byte region [fffffc3e2e402200, fffffc3e2e403200)
The buggy address belongs to the page:
page:ffff7ff0f8b90000 count:1 mapcount:0 mapping:ffffdc365f016e00 index:0x0 compound_mapcount: 0
flags: 0x6ffffe0000008100(slab|head)
raw: 6ffffe0000008100 ffff7f70d83aae00 0000000300000003 ffffdc365f016e00
raw: 0000000000000000 0000000080070007 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
fffffc3e2e402100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
fffffc3e2e402180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>fffffc3e2e402200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
fffffc3e2e402280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fffffc3e2e402300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
It is caused by the following race condition:
CPU0 CPU1
remove_store() aer_isr()
device_remove_file_self() handle_error_source()
pci_stop_and_remove_bus_device_locked pcie_do_fatal_recovery()
(blocked) pci_lock_rescan_remove() #CPU1 acquire the lock
pci_stop_and_remove_bus_device()
pci_unlock_rescan_remove() #CPU1 release the lock
pci_lock_rescan_remove() #CPU0 acquire the lock
pci_dev_put() #free pci_dev
pci_stop_and_remove_bus_device()
pci_stop_bus_device() #use-after-free
pci_unlock_rescan_remove()
An AER interrupt is triggered on CPU1. CPU1 starts to process it. A work
'aer_isr()' is scheduled on CPU1. It calling into
pcie_do_fatal_recovery(), and aquire lock 'pci_rescan_remove_lock'.
Before it removes the sysfs corresponding to the error pci device, a
sysfs remove operation is executed on CPU0. CPU0 use
device_remove_file_self() to remove the sysfs directory and wait for the
lock to be released. After CPU1 finish pci_stop_and_remove_bus_device(),
it release the lock and free the 'pci_dev' in pci_dev_put(). CPU0 acquire
the lock and access the 'pci_dev'. Then a use-after-free is triggered.
To fix this issue, we increase the reference count in remove_store()
before remove the device and decrease the reference count in the end.
Signed-off-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
Conflicts:
drivers/pci/pci-sysfs.c
Signed-off-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
---
drivers/pci/pci-sysfs.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 3317b9354716..e3373cdc5244 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -483,12 +483,17 @@ static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
unsigned long val;
+ struct pci_dev *pdev = to_pci_dev(dev);
if (kstrtoul(buf, 0, &val) < 0)
return -EINVAL;
- if (val && device_remove_file_self(dev, attr))
- pci_stop_and_remove_bus_device_locked(to_pci_dev(dev));
+ if (val) {
+ pci_dev_get(pdev);
+ if (device_remove_file_self(dev, attr))
+ pci_stop_and_remove_bus_device_locked(pdev);
+ pci_dev_put(pdev);
+ }
return count;
}
static DEVICE_ATTR_IGNORE_LOCKDEP(remove, 0220, NULL,
--
2.20.1
From: Dan Carpenter <dan.carpenter(a)linaro.org>
mainline inclusion
from mainline-v6.7-rc2
commit c301f0981fdd3fd1ffac6836b423c4d7a8e0eb63
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I8WQRG
CVE: CVE-2024-0607
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
The problem is in nft_byteorder_eval() where we are iterating through a
loop and writing to dst[0], dst[1], dst[2] and so on... On each
iteration we are writing 8 bytes. But dst[] is an array of u32 so each
element only has space for 4 bytes. That means that every iteration
overwrites part of the previous element.
I spotted this bug while reviewing commit caf3ef7468f7 ("netfilter:
nf_tables: prevent OOB access in nft_byteorder_eval") which is a related
issue. I think that the reason we have not detected this bug in testing
is that most of time we only write one element.
Fixes: ce1e7989d989 ("netfilter: nft_byteorder: provide 64bit le/be conversion")
Signed-off-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
Conflicts:
include/net/netfilter/nf_tables.h
net/netfilter/nft_byteorder.c
net/netfilter/nft_meta.c
---
net/netfilter/nft_byteorder.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nft_byteorder.c b/net/netfilter/nft_byteorder.c
index 46a8f894717c..ba65e1e8732b 100644
--- a/net/netfilter/nft_byteorder.c
+++ b/net/netfilter/nft_byteorder.c
@@ -41,19 +41,20 @@ static void nft_byteorder_eval(const struct nft_expr *expr,
switch (priv->size) {
case 8: {
+ u64 *dst64 = (void *)dst;
u64 src64;
switch (priv->op) {
case NFT_BYTEORDER_NTOH:
for (i = 0; i < priv->len / 8; i++) {
src64 = get_unaligned((u64 *)&src[i]);
- put_unaligned_be64(src64, &dst[i]);
+ put_unaligned_be64(src64, &dst64[i]);
}
break;
case NFT_BYTEORDER_HTON:
for (i = 0; i < priv->len / 8; i++) {
src64 = get_unaligned_be64(&src[i]);
- put_unaligned(src64, (u64 *)&dst[i]);
+ put_unaligned(src64, (u64 *)&dst64[i]);
}
break;
}
--
2.34.1
From: Dan Carpenter <dan.carpenter(a)linaro.org>
mainline inclusion
from mainline-v6.7-rc2
commit c301f0981fdd3fd1ffac6836b423c4d7a8e0eb63
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I8WQRG
CVE: CVE-2024-0607
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
The problem is in nft_byteorder_eval() where we are iterating through a
loop and writing to dst[0], dst[1], dst[2] and so on... On each
iteration we are writing 8 bytes. But dst[] is an array of u32 so each
element only has space for 4 bytes. That means that every iteration
overwrites part of the previous element.
I spotted this bug while reviewing commit caf3ef7468f7 ("netfilter:
nf_tables: prevent OOB access in nft_byteorder_eval") which is a related
issue. I think that the reason we have not detected this bug in testing
is that most of time we only write one element.
Fixes: ce1e7989d989 ("netfilter: nft_byteorder: provide 64bit le/be conversion")
Signed-off-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
Conflicts:
include/net/netfilter/nf_tables.h
net/netfilter/nft_byteorder.c
---
include/net/netfilter/nf_tables.h | 4 ++--
net/netfilter/nft_byteorder.c | 5 +++--
net/netfilter/nft_meta.c | 2 +-
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index ea893a6d9b36..7550328080bf 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -132,9 +132,9 @@ static inline u16 nft_reg_load16(const u32 *sreg)
return *(u16 *)sreg;
}
-static inline void nft_reg_store64(u32 *dreg, u64 val)
+static inline void nft_reg_store64(u64 *dreg, u64 val)
{
- put_unaligned(val, (u64 *)dreg);
+ put_unaligned(val, dreg);
}
static inline u64 nft_reg_load64(const u32 *sreg)
diff --git a/net/netfilter/nft_byteorder.c b/net/netfilter/nft_byteorder.c
index 7b0b8fecb220..9d250bd60bb8 100644
--- a/net/netfilter/nft_byteorder.c
+++ b/net/netfilter/nft_byteorder.c
@@ -38,20 +38,21 @@ void nft_byteorder_eval(const struct nft_expr *expr,
switch (priv->size) {
case 8: {
+ u64 *dst64 = (void *)dst;
u64 src64;
switch (priv->op) {
case NFT_BYTEORDER_NTOH:
for (i = 0; i < priv->len / 8; i++) {
src64 = nft_reg_load64(&src[i]);
- nft_reg_store64(&dst[i], be64_to_cpu(src64));
+ nft_reg_store64(&dst64[i], be64_to_cpu(src64));
}
break;
case NFT_BYTEORDER_HTON:
for (i = 0; i < priv->len / 8; i++) {
src64 = (__force __u64)
cpu_to_be64(nft_reg_load64(&src[i]));
- nft_reg_store64(&dst[i], src64);
+ nft_reg_store64(&dst64[i], src64);
}
break;
}
diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
index 44d9b38e5f90..cb5bb0e21b66 100644
--- a/net/netfilter/nft_meta.c
+++ b/net/netfilter/nft_meta.c
@@ -63,7 +63,7 @@ nft_meta_get_eval_time(enum nft_meta_keys key,
{
switch (key) {
case NFT_META_TIME_NS:
- nft_reg_store64(dest, ktime_get_real_ns());
+ nft_reg_store64((u64 *)dest, ktime_get_real_ns());
break;
case NFT_META_TIME_DAY:
nft_reg_store8(dest, nft_meta_weekday());
--
2.34.1
A crash kernel boot failure issue is reported in openeuler 5.10 series kernel(in
x86 virtual machine with Hi1822 virtual nic). Finally the issue was narrowed
down to irq vector conflict.
Backport following patchset from kernel maillist to fix this issue.
https://lore.kernel.org/linux-pci/20181018183721.27467-3-gpiccoli@canonical…
Guilherme G. Piccoli (2):
x86/PCI: Export find_cap() to be used in early PCI code
x86/quirks: Add parameter to clear MSIs early on boot
.../admin-guide/kernel-parameters.txt | 6 ++++
arch/x86/include/asm/pci-direct.h | 2 ++
arch/x86/kernel/aperture_64.c | 30 ++---------------
arch/x86/kernel/early-quirks.c | 32 +++++++++++++++++++
arch/x86/pci/common.c | 4 +++
arch/x86/pci/early.c | 25 +++++++++++++++
6 files changed, 71 insertions(+), 28 deletions(-)
--
2.20.1