From: Li Nan <linan122@huawei.com>

hulk inclusion
category: bugfix
bugzilla: 188804, https://gitee.com/openeuler/kernel/issues/I78YIS
CVE: NA
--------------------------------
When a new disk is added to a raid10 array, raid10_add_disk() traverses
conf->mirrors from the start and uses the first mirror that matches either
of the following:
  1. mirror->rdev is set to WantReplacement and has no replacement:
     set the new disk as mirror->replacement.
  2. The mirror has no rdev: set the new disk as mirror->rdev.
Consider the following array (sda is set to WantReplacement):

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync set-A   /dev/sda
       -       0        0        1      removed
       2       8       32        2      active sync set-A   /dev/sdc
       3       8       48        3      active sync set-B   /dev/sdd
When 'mdadm --add' is used to add a new disk to this array, the new disk
becomes sda's replacement instead of being added to the removed position,
which is confusing for users. In addition, once recovery of the new disk
succeeds, sda is set to Faulty.

Prioritizing the 'removed' mirror is a better choice. In the above
scenario the new disk fills the removed slot; otherwise the behavior is
the same as before, except that sda is not removed. Until other disks are
added, continuing to use sda is more reliable.
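The selection order described above can be sketched in user space as
follows. This is a simplified model, not kernel code: struct slot_model,
enum add_kind, pick_slot_old() and pick_slot_new() are hypothetical names
used only for illustration, and the model ignores the recovery_disabled
check and the saved_raid_disk starting point that raid10_add_disk() also
handles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct raid10_info; not the kernel type. */
struct slot_model {
	bool has_rdev;         /* models p->rdev != NULL */
	bool want_replacement; /* models WantReplacement set on p->rdev */
	bool has_replacement;  /* models p->replacement != NULL */
};

enum add_kind { ADD_NONE, ADD_AS_RDEV, ADD_AS_REPLACEMENT };

/* Old order: the first mirror matching either rule wins. */
static enum add_kind pick_slot_old(const struct slot_model *m, int n, int *slot)
{
	assert(slot != NULL);
	for (int i = 0; i < n; i++) {
		if (m[i].has_rdev) {
			if (!m[i].want_replacement || m[i].has_replacement)
				continue;
			*slot = i;
			return ADD_AS_REPLACEMENT;
		}
		*slot = i;
		return ADD_AS_RDEV;
	}
	*slot = -1;
	return ADD_NONE;
}

/*
 * New order: remember the first replacement candidate, keep scanning for
 * an empty mirror, and fall back to the candidate only if none is found.
 */
static enum add_kind pick_slot_new(const struct slot_model *m, int n, int *slot)
{
	int repl_slot = -1;

	assert(slot != NULL);
	for (int i = 0; i < n; i++) {
		if (m[i].has_rdev) {
			if (m[i].want_replacement && !m[i].has_replacement &&
			    repl_slot < 0)
				repl_slot = i;
			continue;
		}
		*slot = i;
		return ADD_AS_RDEV;
	}
	*slot = repl_slot;
	return repl_slot >= 0 ? ADD_AS_REPLACEMENT : ADD_NONE;
}
```

For the example array (slot 0 is WantReplacement with no replacement,
slot 1 is empty), pick_slot_old() selects slot 0 as a replacement, while
pick_slot_new() selects slot 1 as a regular rdev.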
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
---
 drivers/md/raid10.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 190e6f18d0e2..2c41b201cfb4 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1760,9 +1760,10 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 {
 	struct r10conf *conf = mddev->private;
 	int err = -EEXIST;
-	int mirror;
+	int mirror, repl_slot = -1;
 	int first = 0;
 	int last = conf->geo.raid_disks - 1;
+	struct raid10_info *p;
 
 	if (mddev->recovery_cp < MaxSector)
 		/* only hot-add to in-sync arrays, as recovery is
@@ -1785,24 +1786,14 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 	else
 		mirror = first;
 	for ( ; mirror <= last ; mirror++) {
-		struct raid10_info *p = &conf->mirrors[mirror];
+		p = &conf->mirrors[mirror];
 		if (p->recovery_disabled == mddev->recovery_disabled)
 			continue;
 		if (p->rdev) {
-			if (!test_bit(WantReplacement, &p->rdev->flags) ||
-			    p->replacement != NULL)
-				continue;
-			clear_bit(In_sync, &rdev->flags);
-			set_bit(Replacement, &rdev->flags);
-			clear_bit(WantRemove, &rdev->flags);
-			rdev->raid_disk = mirror;
-			err = 0;
-			if (mddev->gendisk)
-				disk_stack_limits(mddev->gendisk, rdev->bdev,
-						  rdev->data_offset << 9);
-			conf->fullsync = 1;
-			rcu_assign_pointer(p->replacement, rdev);
-			break;
+			if (test_bit(WantReplacement, &p->rdev->flags) &&
+			    p->replacement == NULL && repl_slot < 0)
+				repl_slot = mirror;
+			continue;
 		}
 
 		if (mddev->gendisk)
@@ -1819,6 +1810,21 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 		rcu_assign_pointer(p->rdev, rdev);
 		break;
 	}
+
+	if (err && repl_slot >= 0) {
+		p = &conf->mirrors[repl_slot];
+		clear_bit(In_sync, &rdev->flags);
+		set_bit(Replacement, &rdev->flags);
+		clear_bit(WantRemove, &rdev->flags);
+		rdev->raid_disk = repl_slot;
+		err = 0;
+		if (mddev->gendisk)
+			disk_stack_limits(mddev->gendisk, rdev->bdev,
+					  rdev->data_offset << 9);
+		conf->fullsync = 1;
+		rcu_assign_pointer(p->replacement, rdev);
+	}
+
 	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
 		blk_queue_flag_set(QUEUE_FLAG_DISCARD, mddev->queue);