From: Finn Thain <fthain(a)linux-m68k.org>
stable inclusion
from stable-v5.10.216
commit ab86cf6f8d24e63e9aca23da5108af1aa5483928
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9L5KR
CVE: CVE-2024-26999
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 1be3226445362bfbf461c92a5bcdb1723f2e4907 upstream.
The mitigation was intended to stop the irq completely. That may be
better than a hard lock-up but it turns out that you get a crash anyway
if you're using pmac_zilog as a serial console:
ttyPZ0: pmz: rx irq flood !
BUG: spinlock recursion on CPU#0, swapper/0
That's because the pr_err() call in pmz_receive_chars() results in
pmz_console_write() attempting to lock a spinlock already locked in
pmz_interrupt(). With CONFIG_DEBUG_SPINLOCK=y, this produces a fatal
BUG splat. The spinlock in question is the one in struct uart_port.
Even when it's not fatal, the serial port rx function ceases to work.
Also, the iteration limit doesn't play nicely with QEMU, as can be
seen in the bug report linked below.
A web search for other reports of the error message "pmz: rx irq flood"
didn't produce anything. So I don't think this code is needed any more.
Remove it.
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)kernel.org>
Cc: Naveen N. Rao <naveen.n.rao(a)linux.ibm.com>
Cc: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Cc: stable(a)kernel.org
Cc: linux-m68k(a)lists.linux-m68k.org
Link: https://github.com/vivier/qemu-m68k/issues/44
Link: https://lore.kernel.org/all/1078874617.9746.36.camel@gaston/
Acked-by: Michael Ellerman <mpe(a)ellerman.id.au>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Finn Thain <fthain(a)linux-m68k.org>
Link: https://lore.kernel.org/r/e853cf2c762f23101cd2ddec0cc0c2be0e72685f.17125682…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yi Yang <yiyang13(a)huawei.com>
---
drivers/tty/serial/pmac_zilog.c | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/drivers/tty/serial/pmac_zilog.c b/drivers/tty/serial/pmac_zilog.c
index d6aef8a1f0a4..1d0717fc3729 100644
--- a/drivers/tty/serial/pmac_zilog.c
+++ b/drivers/tty/serial/pmac_zilog.c
@@ -217,7 +217,6 @@ static bool pmz_receive_chars(struct uart_pmac_port *uap)
{
struct tty_port *port;
unsigned char ch, r1, drop, flag;
- int loops = 0;
/* Sanity check, make sure the old bug is no longer happening */
if (uap->port.state == NULL) {
@@ -298,24 +297,11 @@ static bool pmz_receive_chars(struct uart_pmac_port *uap)
if (r1 & Rx_OVR)
tty_insert_flip_char(port, 0, TTY_OVERRUN);
next_char:
- /* We can get stuck in an infinite loop getting char 0 when the
- * line is in a wrong HW state, we break that here.
- * When that happens, I disable the receive side of the driver.
- * Note that what I've been experiencing is a real irq loop where
- * I'm getting flooded regardless of the actual port speed.
- * Something strange is going on with the HW
- */
- if ((++loops) > 1000)
- goto flood;
ch = read_zsreg(uap, R0);
if (!(ch & Rx_CH_AV))
break;
}
- return true;
- flood:
- pmz_interrupt_control(uap, 0);
- pmz_error("pmz: rx irq flood !\n");
return true;
}
--
2.25.1
From: Finn Thain <fthain(a)linux-m68k.org>
stable inclusion
from stable-v4.19.313
commit 69a02273e288011b521ee7c1f3ab2c23fda633ce
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9L5KR
CVE: CVE-2024-26999
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 1be3226445362bfbf461c92a5bcdb1723f2e4907 upstream.
The mitigation was intended to stop the irq completely. That may be
better than a hard lock-up but it turns out that you get a crash anyway
if you're using pmac_zilog as a serial console:
ttyPZ0: pmz: rx irq flood !
BUG: spinlock recursion on CPU#0, swapper/0
That's because the pr_err() call in pmz_receive_chars() results in
pmz_console_write() attempting to lock a spinlock already locked in
pmz_interrupt(). With CONFIG_DEBUG_SPINLOCK=y, this produces a fatal
BUG splat. The spinlock in question is the one in struct uart_port.
Even when it's not fatal, the serial port rx function ceases to work.
Also, the iteration limit doesn't play nicely with QEMU, as can be
seen in the bug report linked below.
A web search for other reports of the error message "pmz: rx irq flood"
didn't produce anything. So I don't think this code is needed any more.
Remove it.
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)kernel.org>
Cc: Naveen N. Rao <naveen.n.rao(a)linux.ibm.com>
Cc: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Cc: stable(a)kernel.org
Cc: linux-m68k(a)lists.linux-m68k.org
Link: https://github.com/vivier/qemu-m68k/issues/44
Link: https://lore.kernel.org/all/1078874617.9746.36.camel@gaston/
Acked-by: Michael Ellerman <mpe(a)ellerman.id.au>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Finn Thain <fthain(a)linux-m68k.org>
Link: https://lore.kernel.org/r/e853cf2c762f23101cd2ddec0cc0c2be0e72685f.17125682…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yi Yang <yiyang13(a)huawei.com>
---
drivers/tty/serial/pmac_zilog.c | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/drivers/tty/serial/pmac_zilog.c b/drivers/tty/serial/pmac_zilog.c
index 3d21790d961e..2cddcf74f702 100644
--- a/drivers/tty/serial/pmac_zilog.c
+++ b/drivers/tty/serial/pmac_zilog.c
@@ -220,7 +220,6 @@ static bool pmz_receive_chars(struct uart_pmac_port *uap)
{
struct tty_port *port;
unsigned char ch, r1, drop, error, flag;
- int loops = 0;
/* Sanity check, make sure the old bug is no longer happening */
if (uap->port.state == NULL) {
@@ -303,24 +302,11 @@ static bool pmz_receive_chars(struct uart_pmac_port *uap)
if (r1 & Rx_OVR)
tty_insert_flip_char(port, 0, TTY_OVERRUN);
next_char:
- /* We can get stuck in an infinite loop getting char 0 when the
- * line is in a wrong HW state, we break that here.
- * When that happens, I disable the receive side of the driver.
- * Note that what I've been experiencing is a real irq loop where
- * I'm getting flooded regardless of the actual port speed.
- * Something strange is going on with the HW
- */
- if ((++loops) > 1000)
- goto flood;
ch = read_zsreg(uap, R0);
if (!(ch & Rx_CH_AV))
break;
}
- return true;
- flood:
- pmz_interrupt_control(uap, 0);
- pmz_error("pmz: rx irq flood !\n");
return true;
}
--
2.25.1
From: "Guilherme G. Piccoli" <gpiccoli(a)igalia.com>
mainline inclusion
from mainline-v6.9-rc2
commit f23a4d6e07570826fe95023ca1aa96a011fa9f84
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9HJP4
CVE: CVE-2024-26935
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Commit fc663711b944 ("scsi: core: Remove the /proc/scsi/${proc_name}
directory earlier") fixed a bug related to modules loading/unloading, by
adding a call to scsi_proc_hostdir_rm() on scsi_remove_host(). But that led
to a potential duplicate call to the hostdir_rm() routine, since it's also
called from scsi_host_dev_release(). That triggered a regression report,
which was then fixed by commit be03df3d4bfe ("scsi: core: Fix a procfs host
directory removal regression"). The fix just dropped the hostdir_rm() call
from dev_release().
But it happens that this proc directory is created on scsi_host_alloc(),
and that function "pairs" with scsi_host_dev_release(), while
scsi_remove_host() pairs with scsi_add_host(). In other words, it seems the
reason for removing the proc directory on dev_release() was meant to cover
cases in which a SCSI host structure was allocated, but the call to
scsi_add_host() didn't happen. And that pattern happens to exist in some
error paths, for example.
Syzkaller causes that by using USB raw gadget device, error'ing on
usb-storage driver, at usb_stor_probe2(). By checking that path, we can see
that the BadDevice label leads to a scsi_host_put() after a SCSI host
allocation, but there's no call to scsi_add_host() in such path. That leads
to messages like this in dmesg (and a leak of the SCSI host proc
structure):
usb-storage 4-1:87.51: USB Mass Storage device detected
proc_dir_entry 'scsi/usb-storage' already registered
WARNING: CPU: 1 PID: 3519 at fs/proc/generic.c:377 proc_register+0x347/0x4e0 fs/proc/generic.c:376
The proper fix seems to still call scsi_proc_hostdir_rm() on dev_release(),
but guard that with the state check for SHOST_CREATED; there is even a
comment in scsi_host_dev_release() detailing that: such conditional is
meant for cases where the SCSI host was allocated but there was no calls to
{add,remove}_host(), like the usb-storage case.
This is what we propose here and with that, the error path of usb-storage
does not trigger the warning anymore.
Reported-by: syzbot+c645abf505ed21f931b5(a)syzkaller.appspotmail.com
Fixes: be03df3d4bfe ("scsi: core: Fix a procfs host directory removal regression")
Cc: stable(a)vger.kernel.org
Cc: Bart Van Assche <bvanassche(a)acm.org>
Cc: John Garry <john.g.garry(a)oracle.com>
Cc: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com>
Link: https://lore.kernel.org/r/20240313113006.2834799-1-gpiccoli@igalia.com
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
---
drivers/scsi/hosts.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index d7f51b84f3c7..445f4a220df3 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -353,12 +353,13 @@ static void scsi_host_dev_release(struct device *dev)
if (shost->shost_state == SHOST_CREATED) {
/*
- * Free the shost_dev device name here if scsi_host_alloc()
- * and scsi_host_put() have been called but neither
+ * Free the shost_dev device name and remove the proc host dir
+ * here if scsi_host_{alloc,put}() have been called but neither
* scsi_host_add() nor scsi_remove_host() has been called.
* This avoids that the memory allocated for the shost_dev
- * name is leaked.
+ * name as well as the proc dir structure are leaked.
*/
+ scsi_proc_hostdir_rm(shost->hostt);
kfree(dev_name(&shost->shost_dev));
}
--
2.31.1
From: Tasos Sahanidis <tasos(a)tasossah.com>
mainline inclusion
from mainline-v6.0-rc5
commit d29f59051d3a07b81281b2df2b8c9dfe4716067f
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9LKE1
CVE: CVE-2022-48702
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
The voice allocator sometimes begins allocating from near the end of the
array and then wraps around, however snd_emu10k1_pcm_channel_alloc()
accesses the newly allocated voices as if it never wrapped around.
This results in out of bounds access if the first voice has a high enough
index so that first_voice + requested_voice_count > NUM_G (64).
The more voices are requested, the more likely it is for this to occur.
This was initially discovered using PipeWire, however it can be reproduced
by calling aplay multiple times with 16 channels:
aplay -r 48000 -D plughw:CARD=Live,DEV=3 -c 16 /dev/zero
UBSAN: array-index-out-of-bounds in sound/pci/emu10k1/emupcm.c:127:40
index 65 is out of range for type 'snd_emu10k1_voice [64]'
CPU: 1 PID: 31977 Comm: aplay Tainted: G W IOE 6.0.0-rc2-emu10k1+ #7
Hardware name: ASUSTEK COMPUTER INC P5W DH Deluxe/P5W DH Deluxe, BIOS 3002 07/22/2010
Call Trace:
<TASK>
dump_stack_lvl+0x49/0x63
dump_stack+0x10/0x16
ubsan_epilogue+0x9/0x3f
__ubsan_handle_out_of_bounds.cold+0x44/0x49
snd_emu10k1_playback_hw_params+0x3bc/0x420 [snd_emu10k1]
snd_pcm_hw_params+0x29f/0x600 [snd_pcm]
snd_pcm_common_ioctl+0x188/0x1410 [snd_pcm]
? exit_to_user_mode_prepare+0x35/0x170
? do_syscall_64+0x69/0x90
? syscall_exit_to_user_mode+0x26/0x50
? do_syscall_64+0x69/0x90
? exit_to_user_mode_prepare+0x35/0x170
snd_pcm_ioctl+0x27/0x40 [snd_pcm]
__x64_sys_ioctl+0x95/0xd0
do_syscall_64+0x5c/0x90
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Signed-off-by: Tasos Sahanidis <tasos(a)tasossah.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/3707dcab-320a-62ff-63c0-73fc201ef756@tasossah.com
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5(a)huawei.com>
---
sound/pci/emu10k1/emupcm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/pci/emu10k1/emupcm.c b/sound/pci/emu10k1/emupcm.c
index 9f2b6097f486..623776b13f8d 100644
--- a/sound/pci/emu10k1/emupcm.c
+++ b/sound/pci/emu10k1/emupcm.c
@@ -137,7 +137,7 @@ static int snd_emu10k1_pcm_channel_alloc(struct snd_emu10k1_pcm * epcm, int voic
epcm->voices[0]->epcm = epcm;
if (voices > 1) {
for (i = 1; i < voices; i++) {
- epcm->voices[i] = &epcm->emu->voices[epcm->voices[0]->number + i];
+ epcm->voices[i] = &epcm->emu->voices[(epcm->voices[0]->number + i) % NUM_G];
epcm->voices[i]->epcm = epcm;
}
}
--
2.34.1
From: Ravi Bangoria <ravi.bangoria(a)amd.com>
mainline inclusion
from mainline-v6.1-rc6
commit baa014b9543c8e5e94f5d15b66abfe60750b8284
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9MMNG
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
amd_pmu_enable_all() does:
if (!test_bit(idx, cpuc->active_mask))
continue;
amd_pmu_enable_event(cpuc->events[idx]);
A perf NMI of another event can come between these two steps. Perf NMI
handler internally disables and enables _all_ events, including the one
which nmi-intercepted amd_pmu_enable_all() was in process of enabling.
If that unintentionally enabled event has very low sampling period and
causes immediate successive NMI, causing the event to be throttled,
cpuc->events[idx] and cpuc->active_mask gets cleared by x86_pmu_stop().
This will result in amd_pmu_enable_event() getting called with event=NULL
when amd_pmu_enable_all() resumes after handling the NMIs. This causes a
kernel crash:
BUG: kernel NULL pointer dereference, address: 0000000000000198
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
[...]
Call Trace:
<TASK>
amd_pmu_enable_all+0x68/0xb0
ctx_resched+0xd9/0x150
event_function+0xb8/0x130
? hrtimer_start_range_ns+0x141/0x4a0
? perf_duration_warn+0x30/0x30
remote_function+0x4d/0x60
__flush_smp_call_function_queue+0xc4/0x500
flush_smp_call_function_queue+0x11d/0x1b0
do_idle+0x18f/0x2d0
cpu_startup_entry+0x19/0x20
start_secondary+0x121/0x160
secondary_startup_64_no_verify+0xe5/0xeb
</TASK>
amd_pmu_disable_all()/amd_pmu_enable_all() calls inside perf NMI handler
were recently added as part of BRS enablement but I'm not sure whether
we really need them. We can just disable BRS in the beginning and enable
it back while returning from NMI. This will solve the issue by not
enabling those events whose active_masks are set but are not yet enabled
in hw pmu.
Fixes: ada543459cab ("perf/x86/amd: Add AMD Fam19h Branch Sampling support")
Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria(a)amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Link: https://lkml.kernel.org/r/20221114044029.373-1-ravi.bangoria@amd.com
Signed-off-by: Luo Gengkun <luogengkun2(a)huawei.com>
---
arch/x86/events/amd/core.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 2c082abb930b..3127519281f5 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -900,8 +900,7 @@ static int amd_pmu_handle_irq(struct pt_regs *regs)
pmu_enabled = cpuc->enabled;
cpuc->enabled = 0;
- /* stop everything (includes BRS) */
- amd_pmu_disable_all();
+ amd_brs_disable_all();
/* Drain BRS is in use (could be inactive) */
if (cpuc->lbr_users)
@@ -912,7 +911,7 @@ static int amd_pmu_handle_irq(struct pt_regs *regs)
cpuc->enabled = pmu_enabled;
if (pmu_enabled)
- amd_pmu_enable_all(0);
+ amd_brs_enable_all();
return amd_pmu_adjust_nmi_window(handled);
}
--
2.34.1
mainline inclusion
from mainline-v6.9-rc1
commit 03f12122b20b6e6028e9ed69030a49f9cffcbb75
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9IVH5
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
'open_mutex' of gendisk is used to protect open/close block devices. But
in bd_link_disk_holder(), it is used to protect the creation of symlink
between holding disk and slave bdev, which introduces some issues.
When bd_link_disk_holder() is called, the driver is usually in the process
of initialization/modification and may suspend submitting io. At this
time, any io hold 'open_mutex', such as scanning partitions, can cause
deadlocks. For example, in raid:
T1 T2
bdev_open_by_dev
lock open_mutex [1]
...
efi_partition
...
md_submit_bio
md_ioctl mddev_syspend
-> suspend all io
md_add_new_disk
bind_rdev_to_array
bd_link_disk_holder
try lock open_mutex [2]
md_handle_request
-> wait mddev_resume
T1 scan partition, T2 add a new device to raid. T1 waits for T2 to resume
mddev, but T2 waits for open_mutex held by T1. Deadlock occurs.
Fix it by introducing a local mutex 'blk_holder_mutex' to replace
'open_mutex'.
Fixes: 1b0a2d950ee2 ("md: use new apis to suspend array for ioctls involed array reconfiguration")
Reported-by: mgperkow(a)gmail.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218459
Signed-off-by: Li Nan <linan122(a)huawei.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Yu Kuai <yukuai3(a)huawei.com>
Link: https://lore.kernel.org/r/20240221090122.1281868-1-linan666@huaweicloud.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
block/holder.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/block/holder.c b/block/holder.c
index 37d18c13d958..791091a7eac2 100644
--- a/block/holder.c
+++ b/block/holder.c
@@ -8,6 +8,8 @@ struct bd_holder_disk {
int refcnt;
};
+static DEFINE_MUTEX(blk_holder_mutex);
+
static struct bd_holder_disk *bd_find_holder_disk(struct block_device *bdev,
struct gendisk *disk)
{
@@ -80,7 +82,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
kobject_get(bdev->bd_holder_dir);
mutex_unlock(&bdev->bd_disk->open_mutex);
- mutex_lock(&disk->open_mutex);
+ mutex_lock(&blk_holder_mutex);
WARN_ON_ONCE(!bdev->bd_holder);
holder = bd_find_holder_disk(bdev, disk);
@@ -108,7 +110,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
goto out_del_symlink;
list_add(&holder->list, &disk->slave_bdevs);
- mutex_unlock(&disk->open_mutex);
+ mutex_unlock(&blk_holder_mutex);
return 0;
out_del_symlink:
@@ -116,7 +118,7 @@ int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
out_free_holder:
kfree(holder);
out_unlock:
- mutex_unlock(&disk->open_mutex);
+ mutex_unlock(&blk_holder_mutex);
if (ret)
kobject_put(bdev->bd_holder_dir);
return ret;
@@ -140,7 +142,7 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
if (WARN_ON_ONCE(!disk->slave_dir))
return;
- mutex_lock(&disk->open_mutex);
+ mutex_lock(&blk_holder_mutex);
holder = bd_find_holder_disk(bdev, disk);
if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
del_symlink(disk->slave_dir, bdev_kobj(bdev));
@@ -149,6 +151,6 @@ void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
list_del_init(&holder->list);
kfree(holder);
}
- mutex_unlock(&disk->open_mutex);
+ mutex_unlock(&blk_holder_mutex);
}
EXPORT_SYMBOL_GPL(bd_unlink_disk_holder);
--
2.39.2
From: "Guilherme G. Piccoli" <gpiccoli(a)igalia.com>
stable inclusion
from stable-v5.10.215
commit 5c2386ba80e779a92ec3bb64ccadbedd88f779b1
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9HJP4
CVE: CVE-2024-26935
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit f23a4d6e07570826fe95023ca1aa96a011fa9f84 upstream.
Commit fc663711b944 ("scsi: core: Remove the /proc/scsi/${proc_name}
directory earlier") fixed a bug related to modules loading/unloading, by
adding a call to scsi_proc_hostdir_rm() on scsi_remove_host(). But that led
to a potential duplicate call to the hostdir_rm() routine, since it's also
called from scsi_host_dev_release(). That triggered a regression report,
which was then fixed by commit be03df3d4bfe ("scsi: core: Fix a procfs host
directory removal regression"). The fix just dropped the hostdir_rm() call
from dev_release().
But it happens that this proc directory is created on scsi_host_alloc(),
and that function "pairs" with scsi_host_dev_release(), while
scsi_remove_host() pairs with scsi_add_host(). In other words, it seems the
reason for removing the proc directory on dev_release() was meant to cover
cases in which a SCSI host structure was allocated, but the call to
scsi_add_host() didn't happen. And that pattern happens to exist in some
error paths, for example.
Syzkaller causes that by using USB raw gadget device, error'ing on
usb-storage driver, at usb_stor_probe2(). By checking that path, we can see
that the BadDevice label leads to a scsi_host_put() after a SCSI host
allocation, but there's no call to scsi_add_host() in such path. That leads
to messages like this in dmesg (and a leak of the SCSI host proc
structure):
usb-storage 4-1:87.51: USB Mass Storage device detected
proc_dir_entry 'scsi/usb-storage' already registered
WARNING: CPU: 1 PID: 3519 at fs/proc/generic.c:377 proc_register+0x347/0x4e0 fs/proc/generic.c:376
The proper fix seems to still call scsi_proc_hostdir_rm() on dev_release(),
but guard that with the state check for SHOST_CREATED; there is even a
comment in scsi_host_dev_release() detailing that: such conditional is
meant for cases where the SCSI host was allocated but there was no calls to
{add,remove}_host(), like the usb-storage case.
This is what we propose here and with that, the error path of usb-storage
does not trigger the warning anymore.
Reported-by: syzbot+c645abf505ed21f931b5(a)syzkaller.appspotmail.com
Fixes: be03df3d4bfe ("scsi: core: Fix a procfs host directory removal regression")
Cc: stable(a)vger.kernel.org
Cc: Bart Van Assche <bvanassche(a)acm.org>
Cc: John Garry <john.g.garry(a)oracle.com>
Cc: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com>
Link: https://lore.kernel.org/r/20240313113006.2834799-1-gpiccoli@igalia.com
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
---
v1->v2:
Add CVE num in commit message.
drivers/scsi/hosts.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 5bac4936c647..3feaa61f948f 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -334,12 +334,13 @@ static void scsi_host_dev_release(struct device *dev)
if (shost->shost_state == SHOST_CREATED) {
/*
- * Free the shost_dev device name here if scsi_host_alloc()
- * and scsi_host_put() have been called but neither
+ * Free the shost_dev device name and remove the proc host dir
+ * here if scsi_host_{alloc,put}() have been called but neither
* scsi_host_add() nor scsi_host_remove() has been called.
* This avoids that the memory allocated for the shost_dev
- * name is leaked.
+ * name as well as the proc dir structure are leaked.
*/
+ scsi_proc_hostdir_rm(shost->hostt);
kfree(dev_name(&shost->shost_dev));
}
--
2.31.1