From: Ido Schimmel idosch@nvidia.com
mainline inclusion from mainline-v6.3-rc6 commit c484fcc058bada604d7e4e5228d4affb646ddbc2 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I718K0 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
---------------------------
When a net device is put administratively up, its 'IFF_UP' flag is set (if not set already) and a 'NETDEV_UP' notification is emitted, which causes the 8021q driver to add VLAN ID 0 on the device. The reverse happens when a net device is put administratively down.
When changing the type of a bond to Ethernet, its 'IFF_UP' flag is incorrectly cleared, resulting in the kernel skipping the above process and VLAN ID 0 being leaked [1].
Fix by restoring the flag when changing the type to Ethernet, in a similar fashion to the restoration of the 'IFF_SLAVE' flag.
The issue can be reproduced using the script in [2], with example out before and after the fix in [3].
[1] unreferenced object 0xffff888103479900 (size 256): comm "ip", pid 329, jiffies 4294775225 (age 28.561s) hex dump (first 32 bytes): 00 a0 0c 15 81 88 ff ff 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0 [<ffffffff8406426c>] vlan_vid_add+0x30c/0x790 [<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0 [<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0 [<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150 [<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0 [<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180 [<ffffffff8379af36>] do_setlink+0xb96/0x4060 [<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0 [<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0 [<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00 [<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440 [<ffffffff839a738f>] netlink_unicast+0x53f/0x810 [<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90 [<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70 [<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0 unreferenced object 0xffff88810f6a83e0 (size 32): comm "ip", pid 329, jiffies 4294775225 (age 28.561s) hex dump (first 32 bytes): a0 99 47 03 81 88 ff ff a0 99 47 03 81 88 ff ff ..G.......G..... 81 00 00 00 01 00 00 00 cc cc cc cc cc cc cc cc ................ backtrace: [<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0 [<ffffffff84064369>] vlan_vid_add+0x409/0x790 [<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0 [<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0 [<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150 [<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0 [<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180 [<ffffffff8379af36>] do_setlink+0xb96/0x4060 [<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0 [<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0 [<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00 [<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440 [<ffffffff839a738f>] netlink_unicast+0x53f/0x810 [<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90 [<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70 [<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0
[2] ip link add name t-nlmon type nlmon ip link add name t-dummy type dummy ip link add name t-bond type bond mode active-backup
ip link set dev t-bond up ip link set dev t-nlmon master t-bond ip link set dev t-nlmon nomaster ip link show dev t-bond ip link set dev t-dummy master t-bond ip link show dev t-bond
ip link del dev t-bond ip link del dev t-dummy ip link del dev t-nlmon
[3] Before:
12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000 link/netlink 12: t-bond: <BROADCAST,MULTICAST,MASTER,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 46:57:39:a4:46:a2 brd ff:ff:ff:ff:ff:ff
After:
12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000 link/netlink 12: t-bond: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 66:48:7b:74:b6:8a brd ff:ff:ff:ff:ff:ff
Fixes: e36b9d16c6a6 ("bonding: clean muticast addresses when device changes type") Fixes: 75c78500ddad ("bonding: remap muticast addresses without using dev_close() and dev_open()") Fixes: 9ec7eb60dcbc ("bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change") Reported-by: Mirsad Goran Todorovac mirsad.todorovac@alu.unizg.hr Link: https://lore.kernel.org/netdev/78a8a03b-6070-3e6b-5042-f848dab16fb8@alu.uniz... Tested-by: Mirsad Goran Todorovac mirsad.todorovac@alu.unizg.hr Signed-off-by: Ido Schimmel idosch@nvidia.com Acked-by: Jay Vosburgh jay.vosburgh@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Dong Chenchen dongchenchen2@huawei.com Reviewed-by: Liu Jian liujian56@huawei.com Reviewed-by: Yue Haibing yuehaibing@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- drivers/net/bonding/bond_main.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 08939a35a187..fe28cbec4a4c 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1393,14 +1393,15 @@ void bond_lower_state_changed(struct slave *slave)
/* The bonding driver uses ether_setup() to convert a master bond device * to ARPHRD_ETHER, that resets the target netdevice's flags so we always - * have to restore the IFF_MASTER flag, and only restore IFF_SLAVE if it was set + * have to restore the IFF_MASTER flag, and only restore IFF_SLAVE and IFF_UP + * if they were set */ static void bond_ether_setup(struct net_device *bond_dev) { - unsigned int slave_flag = bond_dev->flags & IFF_SLAVE; + unsigned int flags = bond_dev->flags & (IFF_SLAVE | IFF_UP);
ether_setup(bond_dev); - bond_dev->flags |= IFF_MASTER | slave_flag; + bond_dev->flags |= IFF_MASTER | flags; bond_dev->priv_flags &= ~IFF_TX_SKB_SHARING; }
From: Li Nan linan122@huawei.com
hulk inclusion category: bugfix bugzilla: 188553, https://gitee.com/openeuler/kernel/issues/I6TNFX CVE: NA
--------------------------------
If we want to remove a device, first we delete it from mddev->disks list, then init rdev->del_work to put it (see unbind_rdev_from_array()).
flush_rdev_wq() traverses mddev->disks to check if there is any pending rdev->del_work, if so, flush it. Howerver, rdev will not be in the list of mddev->disks if rdev->del_work exists, and flush_workqueue() will never be executed.
Replace it with flush_workqueue() to ensure del_work has been completed when adding devices.
Fixes: cc1ffe61c026 ("md: add new workqueue for delete rdev") Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- drivers/md/md.c | 18 ++---------------- 1 file changed, 2 insertions(+), 16 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index baa874823930..5be396690abd 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -4399,20 +4399,6 @@ null_show(struct mddev *mddev, char *page) return -EINVAL; }
-/* need to ensure rdev_delayed_delete() has completed */ -static void flush_rdev_wq(struct mddev *mddev) -{ - struct md_rdev *rdev; - - rcu_read_lock(); - rdev_for_each_rcu(rdev, mddev) - if (work_pending(&rdev->del_work)) { - flush_workqueue(md_rdev_misc_wq); - break; - } - rcu_read_unlock(); -} - static ssize_t new_dev_store(struct mddev *mddev, const char *buf, size_t len) { @@ -4440,7 +4426,7 @@ new_dev_store(struct mddev *mddev, const char *buf, size_t len) minor != MINOR(dev)) return -EOVERFLOW;
- flush_rdev_wq(mddev); + flush_workqueue(md_rdev_misc_wq); err = mddev_lock(mddev); if (err) return err; @@ -7348,7 +7334,7 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode, }
if (cmd == ADD_NEW_DISK || cmd == HOT_ADD_DISK) - flush_rdev_wq(mddev); + flush_workqueue(md_rdev_misc_wq);
if (cmd == HOT_REMOVE_DISK) /* need to ensure recovery thread has run */
From: Li Nan linan122@huawei.com
hulk inclusion category: bugfix bugzilla: 188553, https://gitee.com/openeuler/kernel/issues/I6TNFX CVE: NA
--------------------------------
rdev->del_work has not been queued to md_rdev_misc_wq and flush_workqueue will not flush it if tow threads add and remove same device. sysfs might WARN duplicate filename as below.
//T1 //T2 mdadm write super add success remove unbind_rdev_from_array
md_ioctl flush_workqueue INIT_WORK queue_work md_add_new_disk duplicate filename dev-xxx
Check if there is any kobj with the same name, and return busy if true.
Fixes: 5792a2856a63 ("md: avoid a deadlock when removing a device from an md array via sysfs") Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- drivers/md/md.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index 5be396690abd..5b06e1c81af5 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -2255,8 +2255,9 @@ EXPORT_SYMBOL(md_integrity_add_rdev);
static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev) { - char b[BDEVNAME_SIZE]; + char b[BDEVNAME_SIZE + 4]; struct kobject *ko; + struct kernfs_node *sysfs_rdev; int err;
/* prevent duplicates */ @@ -2307,13 +2308,22 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev) mdname(mddev), mddev->max_disks); return -EBUSY; } - bdevname(rdev->bdev,b); + memcpy(b, "dev-", 4); + bdevname(rdev->bdev, b + 4); strreplace(b, '/', '!');
rdev->mddev = mddev; pr_debug("md: bind<%s>\n", b);
- if ((err = kobject_add(&rdev->kobj, &mddev->kobj, "dev-%s", b))) + sysfs_rdev = sysfs_get_dirent_safe(mddev->kobj.sd, b); + if (sysfs_rdev) { + sysfs_put(sysfs_rdev); + err = -EBUSY; + goto fail; + } + + err = kobject_add(&rdev->kobj, &mddev->kobj, b); + if (err) goto fail;
ko = &part_to_dev(rdev->bdev->bd_part)->kobj; @@ -2334,7 +2344,7 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev) return 0;
fail: - pr_warn("md: failed to register dev-%s for %s\n", + pr_warn("md: failed to register %s for %s\n", b, mdname(mddev)); return err; }
From: Li Nan linan122@huawei.com
hulk inclusion category: bugfix bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6ZG5B CVE: NA
--------------------------------
bb->changed and unacked_exist is set and badblocks_update_acked() is involked even if no badblocks changes in badblocks_set(). Only update them when badblocks changes.
Fixes: 9e0e252a048b ("badblocks: Add core badblock management code") Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- block/badblocks.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/block/badblocks.c b/block/badblocks.c index 91f7bcf979d3..81df68ecde3a 100644 --- a/block/badblocks.c +++ b/block/badblocks.c @@ -173,7 +173,7 @@ int badblocks_set(struct badblocks *bb, sector_t s, int sectors, { u64 *p; int lo, hi; - int rv = 0; + int rv = 0, changed = 0; unsigned long flags;
if (bb->shift < 0) @@ -238,6 +238,7 @@ int badblocks_set(struct badblocks *bb, sector_t s, int sectors, s = a + BB_MAX_LEN; } sectors = e - s; + changed = 1; } } if (sectors && hi < bb->count) { @@ -268,6 +269,7 @@ int badblocks_set(struct badblocks *bb, sector_t s, int sectors, sectors = e - s; lo = hi; hi++; + changed = 1; } } if (sectors == 0 && hi < bb->count) { @@ -286,6 +288,7 @@ int badblocks_set(struct badblocks *bb, sector_t s, int sectors, memmove(p + hi, p + hi + 1, (bb->count - hi - 1) * 8); bb->count--; + changed = 1; } } while (sectors) { @@ -308,14 +311,17 @@ int badblocks_set(struct badblocks *bb, sector_t s, int sectors, p[hi] = BB_MAKE(s, this_sectors, acknowledged); sectors -= this_sectors; s += this_sectors; + changed = 1; } }
- bb->changed = 1; - if (!acknowledged) - bb->unacked_exist = 1; - else - badblocks_update_acked(bb); + if (changed) { + bb->changed = changed; + if (!acknowledged) + bb->unacked_exist = 1; + else + badblocks_update_acked(bb); + } write_sequnlock_irqrestore(&bb->lock, flags);
return rv;
From: Li Nan linan122@huawei.com
hulk inclusion category: bugfix bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6ZG5B CVE: NA
--------------------------------
Order of badblocks will be reversed if we set a large area at once. 'hi' remains unchanged while adding continuous badblocks is wrong, the next setting is greater than 'hi', it should be added to the next position. Let 'hi' +1 each cycle.
# echo 0 2048 > bad_blocks # cat bad_blocks 1536 512 1024 512 512 512 0 512
Fixes: 9e0e252a048b ("badblocks: Add core badblock management code") Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- block/badblocks.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/block/badblocks.c b/block/badblocks.c index 81df68ecde3a..7ccb9f2c412d 100644 --- a/block/badblocks.c +++ b/block/badblocks.c @@ -311,6 +311,7 @@ int badblocks_set(struct badblocks *bb, sector_t s, int sectors, p[hi] = BB_MAKE(s, this_sectors, acknowledged); sectors -= this_sectors; s += this_sectors; + hi++; changed = 1; } }
From: Li Nan linan122@huawei.com
hulk inclusion category: bugfix bugzilla: 188569, https://gitee.com/openeuler/kernel/issues/I6ZG5B CVE: NA
--------------------------------
badblocks will loss if we set it as below:
# echo 1 1 > bad_blocks # echo 3 1 > bad_blocks # echo 1 5 > bad_blocks # cat bad_blocks 1 3
we will combine badblocks if there is an intersection between p[lo] and p[hi] in badblocks_set(). The end of new badblocks is p[hi]'s end now. but p[lo] may cross p[hi] and new end should be the larger of p[lo] and p[hi]. lo: |------------------------| hi: |--------|
Fixes: 9e0e252a048b ("badblocks: Add core badblock management code") Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- block/badblocks.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/block/badblocks.c b/block/badblocks.c index 7ccb9f2c412d..3256f642785a 100644 --- a/block/badblocks.c +++ b/block/badblocks.c @@ -275,16 +275,14 @@ int badblocks_set(struct badblocks *bb, sector_t s, int sectors, if (sectors == 0 && hi < bb->count) { /* we might be able to combine lo and hi */ /* Note: 's' is at the end of 'lo' */ - sector_t a = BB_OFFSET(p[hi]); - int lolen = BB_LEN(p[lo]); - int hilen = BB_LEN(p[hi]); - int newlen = lolen + hilen - (s - a); + sector_t a = BB_OFFSET(p[lo]); + int newlen = max((u64)s, BB_OFFSET(p[hi]) + BB_LEN(p[hi])) - a;
- if (s >= a && newlen < BB_MAX_LEN) { + if (s >= BB_OFFSET(p[hi]) && newlen < BB_MAX_LEN) { /* yes, we can combine them */ int ack = BB_ACK(p[lo]) && BB_ACK(p[hi]);
- p[lo] = BB_MAKE(BB_OFFSET(p[lo]), newlen, ack); + p[lo] = BB_MAKE(a, newlen, ack); memmove(p + hi, p + hi + 1, (bb->count - hi - 1) * 8); bb->count--;
From: zhangyue zhangyue1@kylinos.cn
mainline inclusion from mainline-v5.16-rc5 commit 07641b5f32f6991758b08da9b1f4173feeb64f2a category: bugfix bugzilla: 188015, https://gitee.com/openeuler/kernel/issues/I6OERX CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=...
--------------------------------
In driver/md/md.c, if the function autorun_array() is called, the problem of double free may occur.
In function autorun_array(), when the function do_md_run() returns an error, the function do_md_stop() will be called.
The function do_md_run() called function md_run(), but in function md_run(), the pointer mddev->private may be freed.
The function do_md_stop() called the function __md_stop(), but in function __md_stop(), the pointer mddev->private also will be freed without judging null.
At this time, the pointer mddev->private will be double free, so it needs to be judged null or not.
Signed-off-by: zhangyue zhangyue1@kylinos.cn Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- drivers/md/md.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index 5b06e1c81af5..d5690cb1e811 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -6051,7 +6051,8 @@ static void __md_stop(struct mddev *mddev) spin_lock(&mddev->lock); mddev->pers = NULL; spin_unlock(&mddev->lock); - pers->free(mddev, mddev->private); + if (mddev->private) + pers->free(mddev, mddev->private); mddev->private = NULL; if (pers->sync_request && mddev->to_remove == NULL) mddev->to_remove = &md_redundancy_group;
From: Eric Dumazet edumazet@google.com
mainline inclusion from mainline-v5.18-rc1 commit 7d959f6e978cbbca90e26a192cc39480e977182f category: bugfix bugzilla: 188015, https://gitee.com/openeuler/kernel/issues/I6OERX CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=...
--------------------------------
Calling mdelay(1000) from process context, even while a reboot is in progress, does not make sense.
Using msleep() allows other threads to make progress.
Signed-off-by: Eric Dumazet edumazet@google.com Cc: linux-raid@vger.kernel.org Signed-off-by: Song Liu song@kernel.org Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- drivers/md/md.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index d5690cb1e811..5f784b4e27a3 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -9288,7 +9288,7 @@ static int md_notify_reboot(struct notifier_block *this, * driver, we do want to have a safe RAID driver ... */ if (need_delay) - mdelay(1000*1); + msleep(1000);
return NOTIFY_DONE; }
From: Christoph Hellwig hch@lst.de
mainline inclusion from mainline-v5.16-rc1 commit 94f3cd7d832c28681f1dea54b4dd8606e5e2bc75 category: bugfix bugzilla: 188015, https://gitee.com/openeuler/kernel/issues/I6OERX CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=...
--------------------------------
disks_mutex is intended to serialize md_alloc. Extended it to also cover the kobject_uevent call and getting the sysfs dirent to help reducing error handling complexity.
Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk
Conflict: drivers/md/md.c
Signed-off-by: Li Nan linan122@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yongqiang Liu liuyongqiang13@huawei.com --- drivers/md/md.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index 5f784b4e27a3..5a5e1f1fdb52 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -5536,12 +5536,12 @@ static int md_alloc(dev_t dev, char *name) sysfs_create_group(&mddev->kobj, &md_bitmap_group)) pr_debug("pointless warning\n"); abort: - mutex_unlock(&disks_mutex); if (!error && mddev->kobj.sd) { kobject_uevent(&mddev->kobj, KOBJ_ADD); mddev->sysfs_state = sysfs_get_dirent_safe(mddev->kobj.sd, "array_state"); mddev->sysfs_level = sysfs_get_dirent_safe(mddev->kobj.sd, "level"); } + mutex_unlock(&disks_mutex); mddev_put(mddev); return error; }