From: Zhang Yi <yi.zhang@huawei.com>
maillist inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I9VTE3
CVE: NA
Reference: https://lore.kernel.org/linux-fsdevel/20240529095206.2568162-1-yi.zhang@huaw...
--------------------------------
When truncating down a realtime file to an unaligned size on a filesystem whose sb_rextsize is bigger than one block, xfs_truncate_page() only zeroes out the tail of the EOF block. This can expose stale data since commit '943bc0882ceb ("iomap: don't increase i_size if it's not a write operation")'.
If we truncate a file that contains a large enough written extent:
     |<    rtext    >|<    rtext    >|
  ...WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
        ^ (new EOF)                  ^ old EOF
Since we only zero out the tail of the EOF block, and xfs_itruncate_extents() unmaps only whole aligned extents, the file ends up in this state:
     |<    rtext    >|
  ...WWWzWWWWWWWWWWWWW
        ^ new EOF
Then, if we do an extending write like this, the blocks in the previous tail extent become stale:
     |<    rtext    >|
  ...WWWzSSSSSSSSSSSSS..........WWWWWWWWWWWWWWWWW
        ^ old EOF               ^ append start  ^ new EOF
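
The stale region (the S blocks above) can be worked out with a little arithmetic. What follows is a minimal userspace sketch, not kernel code: struct byte_range, round_up_to() and stale_range() are hypothetical names introduced only for illustration, and the model assumes that zeroing stops at the end of the EOF block while unmapping starts at the next rt extent boundary, as described above.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open byte range [start, end). Hypothetical helper type. */
struct byte_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Round x up to the next multiple of unit. Division is used instead of
 * a bit mask so the result is also correct when unit is not a power of two.
 */
uint64_t round_up_to(uint64_t x, uint64_t unit)
{
	assert(unit != 0);
	return ((x + unit - 1) / unit) * unit;
}

/*
 * Model of the pre-fix truncate (an assumption based on the description
 * above): bytes from new_eof to the end of its block are zeroed, and
 * only whole rt extents past the one holding new_eof are unmapped.
 * Everything in between keeps its old contents.
 */
struct byte_range stale_range(uint64_t new_eof, uint64_t block_bytes,
			      uint64_t rtext_bytes)
{
	struct byte_range r;

	r.start = round_up_to(new_eof, block_bytes);	/* zeroing stops here */
	r.end = round_up_to(new_eof, rtext_bytes);	/* unmapping starts here */
	return r;
}
```

For example, with 4096-byte blocks and an rt extent of 4 blocks, stale_range(5000, 4096, 16384) yields [8192, 16384): two blocks of old data that a later extending write makes visible again.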
Fix this by zeroing out the tail allocation unit, and also make sure xfs_itruncate_extents() unmaps rtextsize-aligned extents.
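
To illustrate the intended effect, here is a minimal userspace sketch of zeroing up to the end of the allocation unit rather than the end of the block. It is a simplified model under stated assumptions, not the kernel implementation: struct model_inode, model_alloc_unitsize() and zero_end() are hypothetical stand-ins, and the blocklog field stands in for the superblock's log2 block size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields the helper reads. */
struct model_inode {
	bool is_realtime;	/* models XFS_IS_REALTIME_INODE() */
	uint32_t rextsize;	/* models sb_rextsize, in blocks */
	uint8_t blocklog;	/* models log2 of the block size */
};

/* Allocation unit in bytes: one block, or one rt extent for rt files. */
uint64_t model_alloc_unitsize(const struct model_inode *ip)
{
	uint64_t blocks = 1;

	if (ip->is_realtime)
		blocks = ip->rextsize;

	return blocks << ip->blocklog;
}

/*
 * First byte past the range to zero when truncating down to new_eof:
 * new_eof rounded up to the allocation unit. Division-based rounding
 * keeps this correct for unit sizes that are not a power of two.
 */
uint64_t zero_end(const struct model_inode *ip, uint64_t new_eof)
{
	uint64_t unit = model_alloc_unitsize(ip);

	assert(unit != 0);
	return ((new_eof + unit - 1) / unit) * unit;
}
```

With 4096-byte blocks (blocklog 12) and rextsize 4, truncating a realtime file to 5000 bytes zeroes up to byte 16384 in this model, so no old data is left inside the retained extent; for a regular file the range still ends at 8192.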
Fixes: 943bc0882ceb ("iomap: don't increase i_size if it's not a write operation")
Reported-by: Chandan Babu R <chandanbabu@kernel.org>
Link: https://lore.kernel.org/linux-xfs/0b92a215-9d9b-3788-4504-a520778953c2@huawe...
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Long Li <leo.lilong@huawei.com>
---
 fs/xfs/xfs_inode.c | 13 +++++++++++++
 fs/xfs/xfs_inode.h |  1 +
 fs/xfs/xfs_iops.c  |  2 +-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 5fafc8f419cc..b7c1d3731747 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3985,3 +3985,16 @@ xfs_iunlock2_io_mmap(
 	if (!same_inode)
 		inode_unlock(VFS_I(ip1));
 }
+
+/* Returns the size of fundamental allocation unit for a file, in bytes. */
+unsigned int
+xfs_inode_alloc_unitsize(
+	struct xfs_inode *ip)
+{
+	unsigned int blocks = 1;
+
+	if (XFS_IS_REALTIME_INODE(ip))
+		blocks = ip->i_mount->m_sb.sb_rextsize;
+
+	return XFS_FSB_TO_B(ip->i_mount, blocks);
+}
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index b5b97be319e6..818f7622d851 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -571,5 +571,6 @@ void xfs_end_io(struct work_struct *work);
 
 int xfs_ilock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2);
 void xfs_iunlock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2);
+unsigned int xfs_inode_alloc_unitsize(struct xfs_inode *ip);
 
 #endif /* __XFS_INODE_H__ */
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 673f066d3ad4..97569fb5f196 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -848,7 +848,7 @@ xfs_setattr_size(
 
 	write_back = newsize > ip->i_d.di_size && oldsize != ip->i_d.di_size;
 	if (newsize < oldsize) {
-		unsigned int blocksize = i_blocksize(inode);
+		unsigned int blocksize = xfs_inode_alloc_unitsize(ip);
 
 		/*
 		 * iomap won't detect a dirty page over an unwritten block (or a