From: "Darrick J. Wong" darrick.wong@oracle.com
[ Upstream commit eb0efe5063bb10bcb653e4f8e92a74719c03a347 ]
The data fork scrubber calls filemap_write_and_wait to flush dirty pages and delalloc reservations out to disk prior to checking the data fork's extent mappings. Unfortunately, this means that scrub can consume the EIO/ENOSPC errors that would otherwise have stayed around in the address space until (we hope) the writer application calls fsync to persist data and collect errors. The end result is that programs that wrote to a file might never see the error code and proceed as if nothing were wrong.
xfs_scrub is not in a position to notify file writers about the writeback failure, and it's only here to check metadata, not file contents. Therefore, if writeback fails, we should stuff the error code back into the address space so that an fsync by the writer application can pick that up.
Fixes: 99d9d8d05da2 ("xfs: scrub inode block mappings") Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Reviewed-by: Brian Foster bfoster@redhat.com Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- fs/xfs/scrub/bmap.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c index e1d11f3223e36..f84a58e523bc8 100644 --- a/fs/xfs/scrub/bmap.c +++ b/fs/xfs/scrub/bmap.c @@ -53,9 +53,27 @@ xchk_setup_inode_bmap( */ if (S_ISREG(VFS_I(sc->ip)->i_mode) && sc->sm->sm_type == XFS_SCRUB_TYPE_BMBTD) { + struct address_space *mapping = VFS_I(sc->ip)->i_mapping; + inode_dio_wait(VFS_I(sc->ip)); - error = filemap_write_and_wait(VFS_I(sc->ip)->i_mapping); - if (error) + + /* + * Try to flush all incore state to disk before we examine the + * space mappings for the data fork. Leave accumulated errors + * in the mapping for the writer threads to consume. + * + * On ENOSPC or EIO writeback errors, we continue into the + * extent mapping checks because write failures do not + * necessarily imply anything about the correctness of the file + * metadata. The metadata and the file data could be on + * completely separate devices; a media failure might only + * affect a subset of the disk, etc. We can handle delalloc + * extents in the scrubber, so leaving them in memory is fine. + */ + error = filemap_fdatawrite(mapping); + if (!error) + error = filemap_fdatawait_keep_errors(mapping); + if (error && (error != -ENOSPC && error != -EIO)) goto out; }