From: yangerkun <yangerkun@huawei.com>

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I8TRWW
CVE: NA
--------------------------------
Two fixups of the same AG can happen within a single transaction. AGFL blocks consumed after the first fixup can make the second fixup fail, which is unintended behavior and leads to an XFS shutdown [1][2].

Gao Xiang described one solution: reserve more blocks during the first fixup. That approach has several logic errors:
- We may first see postallocs as 1 and then as 0, which can trigger pointless AGFL filling or shortening.
- The case above (postallocs equal to 1 on the first fixup and 0 on the second) is an example where the AGFL needs to be shortened, but xfs_alloc_fix_freelist can only free AGFL blocks after the free space check succeeds. Besides, filling or shortening the AGFL does not change fdblocks, so we can end up in a state where fdblocks (or resblocks) shows space available but the AG fixup rejects us, and XFS can shut down here too.
- Once postallocs equals 1, it can also change the logic of xfs_alloc_ag_max_usable, which changes block allocation behavior (found by checking each AG's freeblocks after fallocating a huge file).
- Once postallocs equals 1, we reserve 2 * xfs_alloc_min_freelist(), but this is sometimes not enough once the bnobt/cntbt grow and the second fixup needs a larger reserve.
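The first problem in the list can be seen from the formula this patch removes, need = xfs_alloc_min_freelist(mp, pag) * (1 + args->postallocs). The sketch below is a standalone model of that arithmetic only, not kernel code: old_agfl_target() is a hypothetical helper, and the min_freelist argument stands in for whatever xfs_alloc_min_freelist() would return.

```c
/*
 * Standalone model of the AGFL target computed by the code this patch
 * replaces. min_freelist is an assumed stand-in for the return value of
 * xfs_alloc_min_freelist(); old_agfl_target() is a hypothetical name.
 */
unsigned int old_agfl_target(unsigned int min_freelist,
			     unsigned int postallocs)
{
	/* Mirrors: need = xfs_alloc_min_freelist(mp, pag) * (1 + args->postallocs) */
	return min_freelist * (1 + postallocs);
}
```

With the same min_freelist, a first fixup with postallocs = 1 yields twice the target of a second fixup with postallocs = 0, so the AGFL would be filled for the first call and then become a candidate for shortening on the second.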
This patch fixes all of the bugs above by using m_alloc_maxlevels (plus m_rmap_maxlevels when the rmap btree is enabled) to reserve more blocks, and adapts xfs_alloc_set_aside/xfs_alloc_ag_max_usable to account for the extra reservation. The extra blocks are only reserved; the AGFL is not filled or shortened to match that reservation.

[1] https://www.spinics.net/lists/linux-xfs/msg66440.html
[2] https://lore.kernel.org/linux-xfs/20221228133204.4021519-1-guoxuenan@huawei....
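As a rough illustration of the reservation arithmetic in this patch, here is a standalone C model. It is a sketch under assumptions, not the kernel implementation: struct mount_model and the function names fixup_aside(), set_aside() and minfree() are hypothetical stand-ins for struct xfs_mount, xfs_ag_fixup_aside(), xfs_alloc_set_aside() and the minfree computation in xfs_alloc_fix_freelist().

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for struct xfs_mount. */
struct mount_model {
	unsigned int alloc_maxlevels;	/* models mp->m_alloc_maxlevels */
	unsigned int rmap_maxlevels;	/* models mp->m_rmap_maxlevels */
	bool has_rmapbt;		/* models xfs_has_rmapbt(mp) */
	unsigned int agcount;		/* models mp->m_sb.sb_agcount */
};

#define ALLOCBT_AGFL_RESERVE	4	/* same value as XFS_ALLOCBT_AGFL_RESERVE */

/* Extra per-AG blocks reserved so that a second fixup can succeed. */
unsigned int fixup_aside(const struct mount_model *mp)
{
	unsigned int ret = 2 * mp->alloc_maxlevels;

	if (mp->has_rmapbt)
		ret += mp->rmap_maxlevels;
	return ret;
}

/* Filesystem-wide set-aside, grown by the per-AG fixup reservation. */
unsigned int set_aside(const struct mount_model *mp)
{
	return mp->agcount * (ALLOCBT_AGFL_RESERVE + 4 + fixup_aside(mp));
}

/*
 * Free-space threshold for the availability check. The AGFL target
 * ("need") is passed in unchanged; only the threshold grows when
 * postallocs is set.
 */
unsigned int minfree(const struct mount_model *mp, unsigned int need,
		     unsigned int postallocs)
{
	return postallocs ? need + fixup_aside(mp) : need;
}
```

The point of keeping need and minfree separate is that the AGFL fill/shrink target no longer depends on postallocs; only the free space check does.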
Fixes: 53f85096f93e ("xfs: account extra freespace btree splits for multiple allocations")
Signed-off-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Long Li <leo.lilong@huawei.com>
---
 fs/xfs/libxfs/xfs_alloc.c | 39 ++++++++++++++++++++++++++++++++++-----
 fs/xfs/xfs_mount.c        |  9 +++++++++
 2 files changed, 43 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 392d787ceb22..2931486dcbd7 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -92,6 +92,25 @@ xfs_prealloc_blocks(
  */
 #define XFS_ALLOCBT_AGFL_RESERVE	4
 
+/*
+ * Twice fixup for the same ag may happen within exact one tp, and the consume
+ * of agfl after first fixup may trigger second fixup's failure, then xfs will
+ * shutdown. To avoid that, we reserve blocks which can satisfy the second
+ * fixup.
+ */
+xfs_extlen_t
+xfs_ag_fixup_aside(
+	struct xfs_mount	*mp)
+{
+	xfs_extlen_t ret;
+
+	ret = 2 * mp->m_alloc_maxlevels;
+	if (xfs_has_rmapbt(mp))
+		ret += mp->m_rmap_maxlevels;
+
+	return ret;
+}
+
 /*
  * Compute the number of blocks that we set aside to guarantee the ability to
  * refill the AGFL and handle a full bmap btree split.
@@ -114,7 +133,8 @@ unsigned int
 xfs_alloc_set_aside(
 	struct xfs_mount	*mp)
 {
-	return mp->m_sb.sb_agcount * (XFS_ALLOCBT_AGFL_RESERVE + 4);
+	return mp->m_sb.sb_agcount * (XFS_ALLOCBT_AGFL_RESERVE +
+			4 + xfs_ag_fixup_aside(mp));
 }
 
 /*
@@ -147,6 +167,8 @@ xfs_alloc_ag_max_usable(
 	if (xfs_has_reflink(mp))
 		blocks++;		/* refcount root block */
 
+	blocks += xfs_ag_fixup_aside(mp);
+
 	return mp->m_sb.sb_agblocks - blocks;
 }
 
@@ -2618,6 +2640,7 @@ xfs_alloc_fix_freelist(
 	struct xfs_alloc_arg	targs;	/* local allocation arguments */
 	xfs_agblock_t		bno;	/* freelist block */
 	xfs_extlen_t		need;	/* total blocks needed in freelist */
+	xfs_extlen_t		minfree;
 	int			error = 0;
 
 	/* deferred ops (AGFL block frees) require permanent transactions */
@@ -2650,8 +2673,11 @@ xfs_alloc_fix_freelist(
 	 * blocks to perform multiple allocations from a single AG and
 	 * transaction if needed.
 	 */
-	need = xfs_alloc_min_freelist(mp, pag) * (1 + args->postallocs);
-	if (!xfs_alloc_space_available(args, need, alloc_flags |
+	minfree = need = xfs_alloc_min_freelist(mp, pag);
+	if (args->postallocs)
+		minfree += xfs_ag_fixup_aside(mp);
+
+	if (!xfs_alloc_space_available(args, minfree, alloc_flags |
 			XFS_ALLOC_FLAG_CHECK))
 		goto out_agbp_relse;
 
@@ -2674,8 +2700,11 @@ xfs_alloc_fix_freelist(
 		xfs_agfl_reset(tp, agbp, pag);
 
 	/* If there isn't enough total space or single-extent, reject it. */
-	need = xfs_alloc_min_freelist(mp, pag) * (1 + args->postallocs);
-	if (!xfs_alloc_space_available(args, need, alloc_flags))
+	minfree = need = xfs_alloc_min_freelist(mp, pag);
+	if (args->postallocs)
+		minfree += xfs_ag_fixup_aside(mp);
+
+	if (!xfs_alloc_space_available(args, minfree, alloc_flags))
 		goto out_agbp_relse;
 
 #ifdef DEBUG
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 0a0fd19573d8..78c72d0aa0a6 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -693,6 +693,15 @@ xfs_mountfs(
 
 	xfs_agbtree_compute_maxlevels(mp);
 
+	/*
+	 * We now need m_ag_maxlevels/m_rmap_maxlevels to initialize
+	 * m_alloc_set_aside/m_ag_max_usable. And when we first do the
+	 * init in xfs_sb_mount_common, m_alloc_set_aside/m_ag_max_usable
+	 * still equals to 0. Redo it now.
+	 */
+	mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
+	mp->m_ag_max_usable = xfs_alloc_ag_max_usable(mp);
+
 	/*
 	 * Check if sb_agblocks is aligned at stripe boundary. If sb_agblocks
 	 * is NOT aligned turn off m_dalign since allocator alignment is within