From: Chris Down chris@chrisdown.name
mainline inclusion from mainline-v5.9-rc1 commit ea3271f7196c65ae5d3e1c7b3f733892c017dbd6 category: bugfix bugzilla: NA CVE: NA ---------------------------
The default is still set to inode32 for backwards compatibility, but system administrators can opt in to the new 64-bit inode numbers by either:
1. Passing inode64 on the command line when mounting, or 2. Configuring the kernel with CONFIG_TMPFS_INODE64=y
The inode64 and inode32 names are used based on existing precedent from XFS.
[hughd@google.com: Kconfig fixes] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008011928010.13320@eggly.anvils
Signed-off-by: Chris Down chris@chrisdown.name Signed-off-by: Hugh Dickins hughd@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Amir Goldstein amir73il@gmail.com Acked-by: Hugh Dickins hughd@google.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Matthew Wilcox willy@infradead.org Cc: Jeff Layton jlayton@kernel.org Cc: Johannes Weiner hannes@cmpxchg.org Cc: Tejun Heo tj@kernel.org Link: http://lkml.kernel.org/r/8b23758d0c66b5e2263e08baf9c4b6a7565cbd8f.1594661218... Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Reviewed-by: zhangyi (F) yi.zhang@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com --- Documentation/filesystems/tmpfs.txt | 17 ++++++++ fs/Kconfig | 21 ++++++++++ include/linux/shmem_fs.h | 1 + mm/shmem.c | 61 ++++++++++++++++++++++++++--- 4 files changed, 95 insertions(+), 5 deletions(-)
diff --git a/Documentation/filesystems/tmpfs.txt b/Documentation/filesystems/tmpfs.txt index d06e9a59a9f4..c72536fa78ae 100644 --- a/Documentation/filesystems/tmpfs.txt +++ b/Documentation/filesystems/tmpfs.txt @@ -135,6 +135,21 @@ gid: The group id These options do not have any effect on remount. You can change these parameters with chmod(1), chown(1) and chgrp(1) on a mounted filesystem.
+tmpfs has a mount option to select whether it will wrap at 32- or 64-bit inode +numbers: + +======= ======================== +inode64 Use 64-bit inode numbers +inode32 Use 32-bit inode numbers +======= ======================== + +On a 32-bit kernel, inode32 is implicit, and inode64 is refused at mount time. +On a 64-bit kernel, CONFIG_TMPFS_INODE64 sets the default. inode64 avoids the +possibility of multiple files with the same inode number on a single device; +but risks glibc failing with EOVERFLOW once 33-bit inode numbers are reached - +if a long-lived tmpfs is accessed by 32-bit applications so ancient that +opening a file larger than 2GiB fails with EINVAL. +
So 'mount -t tmpfs -o size=10G,nr_inodes=10k,mode=700 tmpfs /mytmpfs' will give you tmpfs instance on /mytmpfs which can allocate 10GB @@ -147,3 +162,5 @@ Updated: Hugh Dickins, 4 June 2007 Updated: KOSAKI Motohiro, 16 Mar 2010 +Updated: + Chris Down, 13 July 2020 diff --git a/fs/Kconfig b/fs/Kconfig index 3fa013a39981..2d9d472d8ba8 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -190,6 +190,27 @@ config TMPFS_XATTR
If unsure, say N.
+config TMPFS_INODE64 + bool "Use 64-bit ino_t by default in tmpfs" + depends on TMPFS && 64BIT + default n + help + tmpfs has historically used only inode numbers as wide as an unsigned + int. In some cases this can cause wraparound, potentially resulting + in multiple files with the same inode number on a single device. This + option makes tmpfs use the full width of ino_t by default, without + needing to specify the inode64 option when mounting. + + But if a long-lived tmpfs is to be accessed by 32-bit applications so + ancient that opening a file larger than 2GiB fails with EINVAL, then + the INODE64 config option and inode64 mount option risk operations + failing with EOVERFLOW once 33-bit inode numbers are reached. + + To override this configured default, use the inode32 or inode64 + option when mounting. + + If unsure, say N. + config HUGETLBFS bool "HugeTLB file system support" depends on X86 || IA64 || SPARC64 || (S390 && 64BIT) || \ diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h index dd67c6e59cea..401afcf81efe 100644 --- a/include/linux/shmem_fs.h +++ b/include/linux/shmem_fs.h @@ -34,6 +34,7 @@ struct shmem_sb_info { unsigned char huge; /* Whether to try for hugepages */ kuid_t uid; /* Mount uid for root directory */ kgid_t gid; /* Mount gid for root directory */ + bool full_inums; /* If i_ino should be uint or ino_t */ ino_t next_ino; /* The next per-sb inode number to use */ ino_t __percpu *ino_batch; /* The next per-cpu inode number to use */ struct mempolicy *mpol; /* default memory policy for mappings */ diff --git a/mm/shmem.c b/mm/shmem.c index 25205b46b053..ad5ca9ed943e 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -293,12 +293,17 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop) ino = sbinfo->next_ino++; if (unlikely(is_zero_ino(ino))) ino = sbinfo->next_ino++; - if (unlikely(ino > UINT_MAX)) { + if (unlikely(!sbinfo->full_inums && + ino > UINT_MAX)) { /* * Emulate get_next_ino uint wraparound for * compatibility */ - ino = 1; + if (IS_ENABLED(CONFIG_64BIT)) + pr_warn("%s: inode number overflow on device %d, consider using inode64 mount option\n", + __func__, MINOR(sb->s_dev)); + sbinfo->next_ino = 1; + ino = sbinfo->next_ino++; } *inop = ino; } @@ -311,6 +316,10 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop) * unknown contexts. As such, use a per-cpu batched allocator * which doesn't require the per-sb stat_lock unless we are at * the batch boundary. + * + * We don't need to worry about inode{32,64} since SB_KERNMOUNT + * shmem mounts are not exposed to userspace, so we don't need + * to worry about things like glibc compatibility. */ ino_t *next_ino; next_ino = per_cpu_ptr(sbinfo->ino_batch, get_cpu()); @@ -3427,9 +3436,12 @@ static int shmem_parse_options(char *options, struct shmem_sb_info *sbinfo, if ((value = strchr(this_char,'=')) != NULL) { *value++ = 0; } else { - pr_err("tmpfs: No value for mount option '%s'\n", - this_char); - goto error; + if (strcmp(this_char,"inode32") && + strcmp(this_char,"inode64")) { + pr_err("tmpfs: No value for mount option '%s'\n", + this_char); + goto error; + } }
if (!strcmp(this_char,"size")) { @@ -3495,6 +3507,14 @@ static int shmem_parse_options(char *options, struct shmem_sb_info *sbinfo, if (mpol_parse_str(value, &mpol)) goto bad_val; #endif + } else if (!strcmp(this_char,"inode32")) { + sbinfo->full_inums = false; + } else if (!strcmp(this_char,"inode64")) { + if (sizeof(ino_t) < 8) { + pr_err("tmpfs: Cannot use inode64 with <64bit inums in kernel\n"); + goto error; + } + sbinfo->full_inums = true; } else { pr_err("tmpfs: Bad mount option %s\n", this_char); goto error; @@ -3539,8 +3559,15 @@ static int shmem_remount_fs(struct super_block *sb, int *flags, char *data) if (config.max_inodes && !sbinfo->max_inodes) goto out;
+ if (sbinfo->full_inums && !config.full_inums && + sbinfo->next_ino > UINT_MAX) { + pr_err("tmpfs: Current inum too high to switch to 32-bit inums\n"); + goto out; + } + error = 0; sbinfo->huge = config.huge; + sbinfo->full_inums = config.full_inums; sbinfo->max_blocks = config.max_blocks; sbinfo->max_inodes = config.max_inodes; sbinfo->free_inodes = config.max_inodes - inodes; @@ -3574,6 +3601,29 @@ static int shmem_show_options(struct seq_file *seq, struct dentry *root) if (!gid_eq(sbinfo->gid, GLOBAL_ROOT_GID)) seq_printf(seq, ",gid=%u", from_kgid_munged(&init_user_ns, sbinfo->gid)); + + /* + * Showing inode{64,32} might be useful even if it's the system default, + * since then people don't have to resort to checking both here and + * /proc/config.gz to confirm 64-bit inums were successfully applied + * (which may not even exist if IKCONFIG_PROC isn't enabled). + * + * We hide it when inode64 isn't the default and we are using 32-bit + * inodes, since that probably just means the feature isn't even under + * consideration. + * + * As such: + * + * +-----------------+-----------------+ + * | TMPFS_INODE64=y | TMPFS_INODE64=n | + * +------------------+-----------------+-----------------+ + * | full_inums=true | show | show | + * | full_inums=false | show | hide | + * +------------------+-----------------+-----------------+ + * + */ + if (IS_ENABLED(CONFIG_TMPFS_INODE64) || sbinfo->full_inums) + seq_printf(seq, ",inode%d", (sbinfo->full_inums ? 64 : 32)); #ifdef CONFIG_TRANSPARENT_HUGE_PAGECACHE /* Rightly or wrongly, show huge mount option unmasked by shmem_huge */ if (sbinfo->huge) @@ -3622,6 +3672,7 @@ int shmem_fill_super(struct super_block *sb, void *data, int silent) if (!(sb->s_flags & SB_KERNMOUNT)) { sbinfo->max_blocks = shmem_default_max_blocks(); sbinfo->max_inodes = shmem_default_max_inodes(); + sbinfo->full_inums = IS_ENABLED(CONFIG_TMPFS_INODE64); if (shmem_parse_options(data, sbinfo, false)) { err = -EINVAL; goto failed;