From: Logan Gunthorpe logang@deltatee.com
mainline inclusion from mainline-v6.0-rc1 commit b368856aab02c8fcaabb809aad401b2cf96504f2 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6JN1I CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
The 07layouts test in mdadm fails on some systems. The failure presents itself as the backup file not being removed before the next layout is grown into:
mdadm: /dev/md0: cannot create backup file /tmp/md-test-backup: File exists
This is because the background mdadm process, which is responsible for cleaning up this backup file gets into an infinite loop waiting for the reshape to start. mdadm checks the mdstat file if a reshape is going and, if it is not, it waits for an event on the file or times out in 5 seconds. On faster machines, the reshape may complete before the 5 seconds times out, and thus the background mdadm process loops waiting for a reshape to start that has already occurred.
mdadm reads the mdstat file to start, but mdstat does not report that the reshape has begun, even though it has indeed begun. So the mdstat_wait() call (in mdadm) which polls on the mdstat file won't ever return until timing out.
The reason mdstat reports the reshape has started is due to an issue in status_resync(). recovery_active is subtracted from curr_resync which will result in a value of zero for the first chunk of reshaped data, and the resulting read will report no reshape in progress.
To fix this, if "resync - recovery_active" is an overloaded value, force the value to be MD_RESYNC_ACTIVE so the code reports a resync in progress.
Signed-off-by: Logan Gunthorpe logang@deltatee.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Song Liu song@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- drivers/md/md.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index 447b9da3aeb9..a6fa4872faf3 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -8073,10 +8073,20 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev) if (test_bit(MD_RECOVERY_DONE, &mddev->recovery)) /* Still cleaning up */ resync = max_sectors; - } else if (resync > max_sectors) + } else if (resync > max_sectors) { resync = max_sectors; - else + } else { resync -= atomic_read(&mddev->recovery_active); + if (resync < MD_RESYNC_ACTIVE) { + /* + * Resync has started, but the subtraction has + * yielded one of the special values. Force it + * to active to ensure the status reports an + * active resync. + */ + resync = MD_RESYNC_ACTIVE; + } + }
if (resync == MD_RESYNC_NONE) { if (test_bit(MD_RESYNCING_REMOTE, &mddev->recovery)) {