From: "Darrick J. Wong" <darrick.wong(a)oracle.com>
mainline inclusion
from mainline-v5.11-rc4
commit 6da1b4b1ab36d80a3994fd4811c8381de10af604
category: bugfix
bugzilla: 185867 https://gitee.com/openeuler/kernel/issues/I4KIAO
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
When overlayfs is running on top of xfs and the user unlinks a file in
the overlay, overlayfs will create a whiteout inode and ask xfs to
"rename" the whiteout file atop the one being unlinked. If the file
being unlinked loses its one nlink, we then have to put the inode on the
unlinked list.
This requires us to grab the AGI buffer of the whiteout inode to take it
off the unlinked list (which is where whiteouts are created) and to grab
the AGI buffer of the file being deleted. If the whiteout was created
in a higher numbered AG than the file being deleted, we'll lock the AGIs
in the wrong order and deadlock.
Therefore, grab all the AGI locks we think we'll need ahead of time, and
in order of increasing AG number per the locking rules.
Reported-by: wenli xie <wlxie7296(a)gmail.com>
Fixes: 93597ae8dac0 ("xfs: Fix deadlock between AGI and AGF when target_ip exists in xfs_rename()")
Signed-off-by: Darrick J. Wong <darrick.wong(a)oracle.com>
Reviewed-by: Brian Foster <bfoster(a)redhat.com>
Signed-off-by: Guo Xuenan <guoxuenan(a)huawei.com>
Reviewed-by: Lihong Kou <koulihong(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
fs/xfs/libxfs/xfs_dir2.h | 2 --
fs/xfs/libxfs/xfs_dir2_sf.c | 2 +-
fs/xfs/xfs_inode.c | 42 ++++++++++++++++++++++---------------
3 files changed, 26 insertions(+), 20 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_dir2.h b/fs/xfs/libxfs/xfs_dir2.h
index e55378640b05..d03e6098ded9 100644
--- a/fs/xfs/libxfs/xfs_dir2.h
+++ b/fs/xfs/libxfs/xfs_dir2.h
@@ -47,8 +47,6 @@ extern int xfs_dir_lookup(struct xfs_trans *tp, struct xfs_inode *dp,
extern int xfs_dir_removename(struct xfs_trans *tp, struct xfs_inode *dp,
struct xfs_name *name, xfs_ino_t ino,
xfs_extlen_t tot);
-extern bool xfs_dir2_sf_replace_needblock(struct xfs_inode *dp,
- xfs_ino_t inum);
extern int xfs_dir_replace(struct xfs_trans *tp, struct xfs_inode *dp,
struct xfs_name *name, xfs_ino_t inum,
xfs_extlen_t tot);
diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
index 2463b5d73447..8c4f76bba88b 100644
--- a/fs/xfs/libxfs/xfs_dir2_sf.c
+++ b/fs/xfs/libxfs/xfs_dir2_sf.c
@@ -1018,7 +1018,7 @@ xfs_dir2_sf_removename(
/*
* Check whether the sf dir replace operation need more blocks.
*/
-bool
+static bool
xfs_dir2_sf_replace_needblock(
struct xfs_inode *dp,
xfs_ino_t inum)
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index af54224ebbe7..b72dd3f67ca7 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3212,7 +3212,7 @@ xfs_rename(
struct xfs_trans *tp;
struct xfs_inode *wip = NULL; /* whiteout inode */
struct xfs_inode *inodes[__XFS_SORT_INODES];
- struct xfs_buf *agibp;
+ int i;
int num_inodes = __XFS_SORT_INODES;
bool new_parent = (src_dp != target_dp);
bool src_is_directory = S_ISDIR(VFS_I(src_ip)->i_mode);
@@ -3325,6 +3325,30 @@ xfs_rename(
}
}
+ /*
+ * Lock the AGI buffers we need to handle bumping the nlink of the
+ * whiteout inode off the unlinked list and to handle dropping the
+ * nlink of the target inode. Per locking order rules, do this in
+ * increasing AG order and before directory block allocation tries to
+ * grab AGFs because we grab AGIs before AGFs.
+ *
+ * The (vfs) caller must ensure that if src is a directory then
+ * target_ip is either null or an empty directory.
+ */
+ for (i = 0; i < num_inodes && inodes[i] != NULL; i++) {
+ if (inodes[i] == wip ||
+ (inodes[i] == target_ip &&
+ (VFS_I(target_ip)->i_nlink == 1 || src_is_directory))) {
+ struct xfs_buf *bp;
+ xfs_agnumber_t agno;
+
+ agno = XFS_INO_TO_AGNO(mp, inodes[i]->i_ino);
+ error = xfs_read_agi(mp, tp, agno, &bp);
+ if (error)
+ goto out_trans_cancel;
+ }
+ }
+
/*
* Directory entry creation below may acquire the AGF. Remove
* the whiteout from the unlinked list first to preserve correct
@@ -3377,22 +3401,6 @@ xfs_rename(
* In case there is already an entry with the same
* name at the destination directory, remove it first.
*/
-
- /*
- * Check whether the replace operation will need to allocate
- * blocks. This happens when the shortform directory lacks
- * space and we have to convert it to a block format directory.
- * When more blocks are necessary, we must lock the AGI first
- * to preserve locking order (AGI -> AGF).
- */
- if (xfs_dir2_sf_replace_needblock(target_dp, src_ip->i_ino)) {
- error = xfs_read_agi(mp, tp,
- XFS_INO_TO_AGNO(mp, target_ip->i_ino),
- &agibp);
- if (error)
- goto out_trans_cancel;
- }
-
error = xfs_dir_replace(tp, target_dp, target_name,
src_ip->i_ino, spaceres);
if (error)
--
2.20.1
From: Yang Yingliang <yangyingliang(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4VSGH
CVE: NA
--------------------------------
This reverts commit c6d2a109d90440e9e5e927a740128a35acf9d0b5.
I got the following messages when booting kernel:
EFI stub: Booting Linux Kernel...
EFI stub: EFI_RNG_PROTOCOL unavailable, KASLR will be disabled
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services and installing virtual address map...
...
[ 0.000000] CPU features: kernel page table isolation forced ON by KASLR
[ 0.000000] CPU features: detected: Kernel page table isolation (KPTI)
[ 3.393380] KASLR disabled due to lack of seed
KPTI is forced on by KASLR, but in fact KASLR is not enabled, it's
because kaslr_offset() returns non-zero in kaslr_requires_kpti().
To avoid this problem, when efi kaslr is disabled, make image
MIN_KIMG_ALIGN align which is used to get KASLR offset in
primary_entry(), so kaslr_offset() will returns 0.
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi(a)huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
drivers/firmware/efi/libstub/arm64-stub.c | 28 ++++++++++++-----------
1 file changed, 15 insertions(+), 13 deletions(-)
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index c1b57dfb1277..aa796324fd62 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -79,6 +79,18 @@ static bool check_image_region(u64 base, u64 size)
return ret;
}
+/*
+ * Although relocatable kernels can fix up the misalignment with respect to
+ * MIN_KIMG_ALIGN, the resulting virtual text addresses are subtly out of
+ * sync with those recorded in the vmlinux when kaslr is disabled but the
+ * image required relocation anyway. Therefore retain 2M alignment unless
+ * KASLR is in use.
+ */
+static u64 min_kimg_align(void)
+{
+ return efi_nokaslr ? MIN_KIMG_ALIGN : EFI_KIMG_ALIGN;
+}
+
efi_status_t handle_kernel_image(unsigned long *image_addr,
unsigned long *image_size,
unsigned long *reserve_addr,
@@ -89,16 +101,6 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
unsigned long kernel_size, kernel_memsize = 0;
u32 phys_seed = 0;
- /*
- * Although relocatable kernels can fix up the misalignment with
- * respect to MIN_KIMG_ALIGN, the resulting virtual text addresses are
- * subtly out of sync with those recorded in the vmlinux when kaslr is
- * disabled but the image required relocation anyway. Therefore retain
- * 2M alignment if KASLR was explicitly disabled, even if it was not
- * going to be activated to begin with.
- */
- u64 min_kimg_align = efi_nokaslr ? MIN_KIMG_ALIGN : EFI_KIMG_ALIGN;
-
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
if (!efi_nokaslr) {
status = efi_get_random_bytes(sizeof(phys_seed),
@@ -132,7 +134,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
* If KASLR is enabled, and we have some randomness available,
* locate the kernel at a randomized offset in physical memory.
*/
- status = efi_random_alloc(*reserve_size, min_kimg_align,
+ status = efi_random_alloc(*reserve_size, min_kimg_align(),
reserve_addr, phys_seed);
} else {
status = EFI_OUT_OF_RESOURCES;
@@ -141,7 +143,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
if (status != EFI_SUCCESS) {
if (!check_image_region((u64)_text, kernel_memsize)) {
efi_err("FIRMWARE BUG: Image BSS overlaps adjacent EFI memory region\n");
- } else if (IS_ALIGNED((u64)_text, min_kimg_align)) {
+ } else if (IS_ALIGNED((u64)_text, min_kimg_align())) {
/*
* Just execute from wherever we were loaded by the
* UEFI PE/COFF loader if the alignment is suitable.
@@ -152,7 +154,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
}
status = efi_allocate_pages_aligned(*reserve_size, reserve_addr,
- ULONG_MAX, min_kimg_align);
+ ULONG_MAX, min_kimg_align());
if (status != EFI_SUCCESS) {
efi_err("Failed to relocate kernel\n");
--
2.20.1
From: Liu Yuntao <liuyuntao10(a)huawei.com>
mainline inclusion
from mainline-v5.17-rc6
commit e79ce9832316e09529b212a21278d68240ccbf1f
category: bugfix
bugzilla: 186043
CVE: NA
-------------------------------------------------
When we specify a large number for node in hugepages parameter, it may
be parsed to another number due to truncation in this statement:
node = tmp;
For example, add following parameter in command line:
hugepagesz=1G hugepages=4294967297:5
and kernel will allocate 5 hugepages for node 1 instead of ignoring it.
I move the validation check earlier to fix this issue, and slightly
simplifies the condition here.
Link: https://lkml.kernel.org/r/20220209134018.8242-1-liuyuntao10@huawei.com
Fixes: b5389086ad7be0 ("hugetlbfs: extend the definition of hugepages parameter to support node allocation")
Signed-off-by: Liu Yuntao <liuyuntao10(a)huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
mm/hugetlb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f7e41390f3d8b..68cec97bbd066 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3151,10 +3151,10 @@ static int __init hugetlb_nrpages_setup(char *s)
pr_warn("HugeTLB: architecture can't support node specific alloc, ignoring!\n");
return 0;
}
+ if (tmp >= nr_online_nodes)
+ goto invalid;
node = tmp;
p += count + 1;
- if (node < 0 || node >= nr_online_nodes)
- goto invalid;
/* Parse hugepages */
if (sscanf(p, "%lu%n", &tmp, &count) != 1)
goto invalid;
--
2.25.1