* [PATCH v7 0/2] tmpfs: inode: Reduce risk of inum overflow @ 2020-07-13 17:28 Chris Down 2020-07-13 17:28 ` [PATCH v7 1/2] tmpfs: Per-superblock i_ino support Chris Down 2020-07-13 17:28 ` [PATCH v7 2/2] tmpfs: Support 64-bit inums per-sb Chris Down 0 siblings, 2 replies; 8+ messages in thread From: Chris Down @ 2020-07-13 17:28 UTC (permalink / raw) To: Andrew Morton Cc: Hugh Dickins, Al Viro, Matthew Wilcox, Amir Goldstein, Jeff Layton, Johannes Weiner, Tejun Heo, linux-mm, linux-fsdevel, linux-kernel, kernel-team In Facebook production we are seeing heavy i_ino wraparounds on tmpfs. On affected tiers, in excess of 10% of hosts show multiple files with different content and the same inode number, with some servers even having as many as 150 duplicated inode numbers with differing file content. This causes actual, tangible problems in production. For example, we have complaints from those working on remote caches that their application is reporting cache corruptions because it uses (device, inodenum) to establish the identity of a particular cache object, but because it's not unique any more, the application refuses to continue and reports cache corruption. Even worse, sometimes applications may not even detect the corruption but may continue anyway, causing phantom and hard to debug behaviour. In general, userspace applications expect that (device, inodenum) should be enough to be uniquely point to one inode, which seems fair enough. One might also need to check the generation, but in this case: 1. That's not currently exposed to userspace (ioctl(...FS_IOC_GETVERSION...) returns ENOTTY on tmpfs); 2. Even with generation, there shouldn't be two live inodes with the same inode number on one device. In order to mitigate this, we take a two-pronged approach: 1. Moving inum generation from being global to per-sb for tmpfs. This itself allows some reduction in i_ino churn. This works on both 64- and 32- bit machines. 2. Adding inode{64,32} for tmpfs. This fix is supported on machines with 64-bit ino_t only: we allow users to mount tmpfs with a new inode64 option that uses the full width of ino_t, or CONFIG_TMPFS_INODE64. You can see how this compares to previous related patches which didn't implement this per-superblock: - https://patchwork.kernel.org/patch/11254001/ - https://patchwork.kernel.org/patch/11023915/ Changes since v6: - Fix misalignment in percpu batching, thanks Matthew. Chris Down (2): tmpfs: Per-superblock i_ino support tmpfs: Support 64-bit inums per-sb Documentation/filesystems/tmpfs.rst | 11 +++ fs/Kconfig | 15 ++++ include/linux/fs.h | 15 ++++ include/linux/shmem_fs.h | 3 + mm/shmem.c | 127 ++++++++++++++++++++++++++-- 5 files changed, 166 insertions(+), 5 deletions(-) -- 2.27.0 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v7 1/2] tmpfs: Per-superblock i_ino support 2020-07-13 17:28 [PATCH v7 0/2] tmpfs: inode: Reduce risk of inum overflow Chris Down @ 2020-07-13 17:28 ` Chris Down 2020-08-01 19:22 ` Hugh Dickins 2020-07-13 17:28 ` [PATCH v7 2/2] tmpfs: Support 64-bit inums per-sb Chris Down 1 sibling, 1 reply; 8+ messages in thread From: Chris Down @ 2020-07-13 17:28 UTC (permalink / raw) To: Andrew Morton Cc: Hugh Dickins, Al Viro, Matthew Wilcox, Amir Goldstein, Jeff Layton, Johannes Weiner, Tejun Heo, linux-mm, linux-fsdevel, linux-kernel, kernel-team get_next_ino has a number of problems: - It uses and returns a uint, which is susceptible to become overflowed if a lot of volatile inodes that use get_next_ino are created. - It's global, with no specificity per-sb or even per-filesystem. This means it's not that difficult to cause inode number wraparounds on a single device, which can result in having multiple distinct inodes with the same inode number. This patch adds a per-superblock counter that mitigates the second case. This design also allows us to later have a specific i_ino size per-device, for example, allowing users to choose whether to use 32- or 64-bit inodes for each tmpfs mount. This is implemented in the next commit. For internal shmem mounts which may be less tolerant to spinlock delays, we implement a percpu batching scheme which only takes the stat_lock at each batch boundary. Signed-off-by: Chris Down <chris@chrisdown.name> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: linux-mm@kvack.org Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: kernel-team@fb.com --- include/linux/fs.h | 15 +++++++++ include/linux/shmem_fs.h | 2 ++ mm/shmem.c | 66 +++++++++++++++++++++++++++++++++++++--- 3 files changed, 78 insertions(+), 5 deletions(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index f15848899945..b70b334f8e16 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2961,6 +2961,21 @@ extern void discard_new_inode(struct inode *); extern unsigned int get_next_ino(void); extern void evict_inodes(struct super_block *sb); +/* + * Userspace may rely on the the inode number being non-zero. For example, glibc + * simply ignores files with zero i_ino in unlink() and other places. + * + * As an additional complication, if userspace was compiled with + * _FILE_OFFSET_BITS=32 on a 64-bit kernel we'll only end up reading out the + * lower 32 bits, so we need to check that those aren't zero explicitly. With + * _FILE_OFFSET_BITS=64, this may cause some harmless false-negatives, but + * better safe than sorry. + */ +static inline bool is_zero_ino(ino_t ino) +{ + return (u32)ino == 0; +} + extern void __iget(struct inode * inode); extern void iget_failed(struct inode *); extern void clear_inode(struct inode *); diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h index 7a35a6901221..eb628696ec66 100644 --- a/include/linux/shmem_fs.h +++ b/include/linux/shmem_fs.h @@ -36,6 +36,8 @@ struct shmem_sb_info { unsigned char huge; /* Whether to try for hugepages */ kuid_t uid; /* Mount uid for root directory */ kgid_t gid; /* Mount gid for root directory */ + ino_t next_ino; /* The next per-sb inode number to use */ + ino_t __percpu *ino_batch; /* The next per-cpu inode number to use */ struct mempolicy *mpol; /* default memory policy for mappings */ spinlock_t shrinklist_lock; /* Protects shrinklist */ struct list_head shrinklist; /* List of shinkable inodes */ diff --git a/mm/shmem.c b/mm/shmem.c index a0dbe62f8042..0ae250b4da28 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -260,18 +260,67 @@ bool vma_is_shmem(struct vm_area_struct *vma) static LIST_HEAD(shmem_swaplist); static DEFINE_MUTEX(shmem_swaplist_mutex); -static int shmem_reserve_inode(struct super_block *sb) +/* + * shmem_reserve_inode() performs bookkeeping to reserve a shmem inode, and + * produces a novel ino for the newly allocated inode. + * + * It may also be called when making a hard link to permit the space needed by + * each dentry. However, in that case, no new inode number is needed since that + * internally draws from another pool of inode numbers (currently global + * get_next_ino()). This case is indicated by passing NULL as inop. + */ +#define SHMEM_INO_BATCH 1024 +static int shmem_reserve_inode(struct super_block *sb, ino_t *inop) { struct shmem_sb_info *sbinfo = SHMEM_SB(sb); - if (sbinfo->max_inodes) { + ino_t ino; + + if (!(sb->s_flags & SB_KERNMOUNT)) { spin_lock(&sbinfo->stat_lock); if (!sbinfo->free_inodes) { spin_unlock(&sbinfo->stat_lock); return -ENOSPC; } sbinfo->free_inodes--; + if (inop) { + ino = sbinfo->next_ino++; + if (unlikely(is_zero_ino(ino))) + ino = sbinfo->next_ino++; + if (unlikely(ino > UINT_MAX)) { + /* + * Emulate get_next_ino uint wraparound for + * compatibility + */ + ino = 1; + } + *inop = ino; + } spin_unlock(&sbinfo->stat_lock); + } else if (inop) { + /* + * __shmem_file_setup, one of our callers, is lock-free: it + * doesn't hold stat_lock in shmem_reserve_inode since + * max_inodes is always 0, and is called from potentially + * unknown contexts. As such, use a per-cpu batched allocator + * which doesn't require the per-sb stat_lock unless we are at + * the batch boundary. + */ + ino_t *next_ino; + next_ino = per_cpu_ptr(sbinfo->ino_batch, get_cpu()); + ino = *next_ino; + if (unlikely(ino % SHMEM_INO_BATCH == 0)) { + spin_lock(&sbinfo->stat_lock); + ino = sbinfo->next_ino; + sbinfo->next_ino += SHMEM_INO_BATCH; + spin_unlock(&sbinfo->stat_lock); + if (unlikely(is_zero_ino(ino))) + ino++; + } + *inop = ino; + *next_ino = ++ino; + put_cpu(); } + return 0; } @@ -2222,13 +2271,14 @@ static struct inode *shmem_get_inode(struct super_block *sb, const struct inode struct inode *inode; struct shmem_inode_info *info; struct shmem_sb_info *sbinfo = SHMEM_SB(sb); + ino_t ino; - if (shmem_reserve_inode(sb)) + if (shmem_reserve_inode(sb, &ino)) return NULL; inode = new_inode(sb); if (inode) { - inode->i_ino = get_next_ino(); + inode->i_ino = ino; inode_init_owner(inode, dir, mode); inode->i_blocks = 0; inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode); @@ -2932,7 +2982,7 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentr * first link must skip that, to get the accounting right. */ if (inode->i_nlink) { - ret = shmem_reserve_inode(inode->i_sb); + ret = shmem_reserve_inode(inode->i_sb, NULL); if (ret) goto out; } @@ -3584,6 +3634,7 @@ static void shmem_put_super(struct super_block *sb) { struct shmem_sb_info *sbinfo = SHMEM_SB(sb); + free_percpu(sbinfo->ino_batch); percpu_counter_destroy(&sbinfo->used_blocks); mpol_put(sbinfo->mpol); kfree(sbinfo); @@ -3626,6 +3677,11 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc) #endif sbinfo->max_blocks = ctx->blocks; sbinfo->free_inodes = sbinfo->max_inodes = ctx->inodes; + if (sb->s_flags & SB_KERNMOUNT) { + sbinfo->ino_batch = alloc_percpu(ino_t); + if (!sbinfo->ino_batch) + goto failed; + } sbinfo->uid = ctx->uid; sbinfo->gid = ctx->gid; sbinfo->mode = ctx->mode; -- 2.27.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v7 1/2] tmpfs: Per-superblock i_ino support 2020-07-13 17:28 ` [PATCH v7 1/2] tmpfs: Per-superblock i_ino support Chris Down @ 2020-08-01 19:22 ` Hugh Dickins 0 siblings, 0 replies; 8+ messages in thread From: Hugh Dickins @ 2020-08-01 19:22 UTC (permalink / raw) To: Chris Down Cc: Andrew Morton, Hugh Dickins, Al Viro, Matthew Wilcox, Amir Goldstein, Jeff Layton, Johannes Weiner, Tejun Heo, linux-mm, linux-fsdevel, linux-kernel, kernel-team On Mon, 13 Jul 2020, Chris Down wrote: > get_next_ino has a number of problems: > > - It uses and returns a uint, which is susceptible to become overflowed > if a lot of volatile inodes that use get_next_ino are created. > - It's global, with no specificity per-sb or even per-filesystem. This > means it's not that difficult to cause inode number wraparounds on a > single device, which can result in having multiple distinct inodes > with the same inode number. > > This patch adds a per-superblock counter that mitigates the second case. > This design also allows us to later have a specific i_ino size > per-device, for example, allowing users to choose whether to use 32- or > 64-bit inodes for each tmpfs mount. This is implemented in the next > commit. > > For internal shmem mounts which may be less tolerant to spinlock delays, > we implement a percpu batching scheme which only takes the stat_lock at > each batch boundary. > > Signed-off-by: Chris Down <chris@chrisdown.name> > Cc: Amir Goldstein <amir73il@gmail.com> > Cc: Hugh Dickins <hughd@google.com> Acked-by: Hugh Dickins <hughd@google.com> Thanks for coming back and completing this, Chris. Some comments below, nothing to detract from that Ack, more notes to myself: things I might change slightly when I get back here later on. I'm glad to see Andrew pulled your 0/2 text into this 1/2. > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Al Viro <viro@zeniv.linux.org.uk> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: Jeff Layton <jlayton@kernel.org> > Cc: Johannes Weiner <hannes@cmpxchg.org> > Cc: Tejun Heo <tj@kernel.org> > Cc: linux-mm@kvack.org > Cc: linux-fsdevel@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Cc: kernel-team@fb.com > --- > include/linux/fs.h | 15 +++++++++ > include/linux/shmem_fs.h | 2 ++ > mm/shmem.c | 66 +++++++++++++++++++++++++++++++++++++--- > 3 files changed, 78 insertions(+), 5 deletions(-) > > diff --git a/include/linux/fs.h b/include/linux/fs.h > index f15848899945..b70b334f8e16 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -2961,6 +2961,21 @@ extern void discard_new_inode(struct inode *); > extern unsigned int get_next_ino(void); > extern void evict_inodes(struct super_block *sb); > > +/* > + * Userspace may rely on the the inode number being non-zero. For example, glibc > + * simply ignores files with zero i_ino in unlink() and other places. > + * > + * As an additional complication, if userspace was compiled with > + * _FILE_OFFSET_BITS=32 on a 64-bit kernel we'll only end up reading out the > + * lower 32 bits, so we need to check that those aren't zero explicitly. With > + * _FILE_OFFSET_BITS=64, this may cause some harmless false-negatives, but > + * better safe than sorry. > + */ > +static inline bool is_zero_ino(ino_t ino) > +{ > + return (u32)ino == 0; > +} Hmm, okay. The value of this unnecessary, and inaccurately named, wrapper is in the great comment above it: which then suffers a bit from being hidden away in a header file. I'd understand its placing better if you had also changed get_next_ino() to use it. And I have another reason for wondering whether this function is the right thing to abstract out: 1 is also somewhat special, being the ino of the root. It seems a little unfortunate, when we recycle through the 32-bit space, to reuse the one ino that is certain to be still in use (maybe all the others get "rm -rf"ed every day). But I haven't yet decided whether that's worth bothering about at all. > + > extern void __iget(struct inode * inode); > extern void iget_failed(struct inode *); > extern void clear_inode(struct inode *); > diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h > index 7a35a6901221..eb628696ec66 100644 > --- a/include/linux/shmem_fs.h > +++ b/include/linux/shmem_fs.h > @@ -36,6 +36,8 @@ struct shmem_sb_info { > unsigned char huge; /* Whether to try for hugepages */ > kuid_t uid; /* Mount uid for root directory */ > kgid_t gid; /* Mount gid for root directory */ > + ino_t next_ino; /* The next per-sb inode number to use */ > + ino_t __percpu *ino_batch; /* The next per-cpu inode number to use */ > struct mempolicy *mpol; /* default memory policy for mappings */ > spinlock_t shrinklist_lock; /* Protects shrinklist */ > struct list_head shrinklist; /* List of shinkable inodes */ > diff --git a/mm/shmem.c b/mm/shmem.c > index a0dbe62f8042..0ae250b4da28 100644 > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -260,18 +260,67 @@ bool vma_is_shmem(struct vm_area_struct *vma) > static LIST_HEAD(shmem_swaplist); > static DEFINE_MUTEX(shmem_swaplist_mutex); > > -static int shmem_reserve_inode(struct super_block *sb) > +/* > + * shmem_reserve_inode() performs bookkeeping to reserve a shmem inode, and > + * produces a novel ino for the newly allocated inode. > + * > + * It may also be called when making a hard link to permit the space needed by > + * each dentry. However, in that case, no new inode number is needed since that > + * internally draws from another pool of inode numbers (currently global > + * get_next_ino()). This case is indicated by passing NULL as inop. > + */ > +#define SHMEM_INO_BATCH 1024 > +static int shmem_reserve_inode(struct super_block *sb, ino_t *inop) > { > struct shmem_sb_info *sbinfo = SHMEM_SB(sb); > - if (sbinfo->max_inodes) { > + ino_t ino; > + > + if (!(sb->s_flags & SB_KERNMOUNT)) { I was disappointed to find that this is still decided by SB_KERNMOUNT instead of max_inodes, as we had discussed previously. But it may just be me who sees that as a regression (taking stat_lock when minimizing stat_lock was the mounter's choice), so I'll accept responsibility to follow that up later (and in no great hurry). And I may discover why you didn't make the change - remount with different options might well turn out to be a pain, and may entail some compromises. > spin_lock(&sbinfo->stat_lock); > if (!sbinfo->free_inodes) { > spin_unlock(&sbinfo->stat_lock); > return -ENOSPC; > } > sbinfo->free_inodes--; > + if (inop) { > + ino = sbinfo->next_ino++; > + if (unlikely(is_zero_ino(ino))) > + ino = sbinfo->next_ino++; > + if (unlikely(ino > UINT_MAX)) { > + /* > + * Emulate get_next_ino uint wraparound for > + * compatibility > + */ > + ino = 1; > + } > + *inop = ino; > + } > spin_unlock(&sbinfo->stat_lock); > + } else if (inop) { > + /* > + * __shmem_file_setup, one of our callers, is lock-free: it > + * doesn't hold stat_lock in shmem_reserve_inode since > + * max_inodes is always 0, and is called from potentially > + * unknown contexts. As such, use a per-cpu batched allocator > + * which doesn't require the per-sb stat_lock unless we are at > + * the batch boundary. > + */ > + ino_t *next_ino; > + next_ino = per_cpu_ptr(sbinfo->ino_batch, get_cpu()); > + ino = *next_ino; > + if (unlikely(ino % SHMEM_INO_BATCH == 0)) { > + spin_lock(&sbinfo->stat_lock); > + ino = sbinfo->next_ino; > + sbinfo->next_ino += SHMEM_INO_BATCH; > + spin_unlock(&sbinfo->stat_lock); > + if (unlikely(is_zero_ino(ino))) > + ino++; > + } > + *inop = ino; > + *next_ino = ++ino; > + put_cpu(); > } > + > return 0; > } > > @@ -2222,13 +2271,14 @@ static struct inode *shmem_get_inode(struct super_block *sb, const struct inode > struct inode *inode; > struct shmem_inode_info *info; > struct shmem_sb_info *sbinfo = SHMEM_SB(sb); > + ino_t ino; > > - if (shmem_reserve_inode(sb)) > + if (shmem_reserve_inode(sb, &ino)) > return NULL; > > inode = new_inode(sb); > if (inode) { > - inode->i_ino = get_next_ino(); > + inode->i_ino = ino; > inode_init_owner(inode, dir, mode); > inode->i_blocks = 0; > inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode); > @@ -2932,7 +2982,7 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentr > * first link must skip that, to get the accounting right. > */ > if (inode->i_nlink) { > - ret = shmem_reserve_inode(inode->i_sb); > + ret = shmem_reserve_inode(inode->i_sb, NULL); > if (ret) > goto out; > } > @@ -3584,6 +3634,7 @@ static void shmem_put_super(struct super_block *sb) > { > struct shmem_sb_info *sbinfo = SHMEM_SB(sb); > > + free_percpu(sbinfo->ino_batch); > percpu_counter_destroy(&sbinfo->used_blocks); > mpol_put(sbinfo->mpol); > kfree(sbinfo); > @@ -3626,6 +3677,11 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc) > #endif > sbinfo->max_blocks = ctx->blocks; > sbinfo->free_inodes = sbinfo->max_inodes = ctx->inodes; > + if (sb->s_flags & SB_KERNMOUNT) { > + sbinfo->ino_batch = alloc_percpu(ino_t); > + if (!sbinfo->ino_batch) > + goto failed; > + } > sbinfo->uid = ctx->uid; > sbinfo->gid = ctx->gid; > sbinfo->mode = ctx->mode; > -- > 2.27.0 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v7 2/2] tmpfs: Support 64-bit inums per-sb 2020-07-13 17:28 [PATCH v7 0/2] tmpfs: inode: Reduce risk of inum overflow Chris Down 2020-07-13 17:28 ` [PATCH v7 1/2] tmpfs: Per-superblock i_ino support Chris Down @ 2020-07-13 17:28 ` Chris Down 2020-08-01 20:41 ` Hugh Dickins 1 sibling, 1 reply; 8+ messages in thread From: Chris Down @ 2020-07-13 17:28 UTC (permalink / raw) To: Andrew Morton Cc: Hugh Dickins, Al Viro, Matthew Wilcox, Amir Goldstein, Jeff Layton, Johannes Weiner, Tejun Heo, linux-mm, linux-fsdevel, linux-kernel, kernel-team The default is still set to inode32 for backwards compatibility, but system administrators can opt in to the new 64-bit inode numbers by either: 1. Passing inode64 on the command line when mounting, or 2. Configuring the kernel with CONFIG_TMPFS_INODE64=y The inode64 and inode32 names are used based on existing precedent from XFS. Signed-off-by: Chris Down <chris@chrisdown.name> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: linux-mm@kvack.org Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: kernel-team@fb.com --- Documentation/filesystems/tmpfs.rst | 11 +++++ fs/Kconfig | 15 +++++++ include/linux/shmem_fs.h | 1 + mm/shmem.c | 65 ++++++++++++++++++++++++++++- 4 files changed, 90 insertions(+), 2 deletions(-) diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst index 4e95929301a5..47b84ddaa8bb 100644 --- a/Documentation/filesystems/tmpfs.rst +++ b/Documentation/filesystems/tmpfs.rst @@ -150,6 +150,15 @@ These options do not have any effect on remount. You can change these parameters with chmod(1), chown(1) and chgrp(1) on a mounted filesystem. +tmpfs has a mount option to select whether it will wrap at 32- or 64-bit inode +numbers: + +inode64 Use 64-bit inode numbers +inode32 Use 32-bit inode numbers + +On 64-bit, the default is set by CONFIG_TMPFS_INODE64. On 32-bit, inode64 is +not legal and will produce an error at mount time. + So 'mount -t tmpfs -o size=10G,nr_inodes=10k,mode=700 tmpfs /mytmpfs' will give you tmpfs instance on /mytmpfs which can allocate 10GB RAM/SWAP in 10240 inodes and it is only accessible by root. @@ -161,3 +170,5 @@ RAM/SWAP in 10240 inodes and it is only accessible by root. Hugh Dickins, 4 June 2007 :Updated: KOSAKI Motohiro, 16 Mar 2010 +Updated: + Chris Down, 13 July 2020 diff --git a/fs/Kconfig b/fs/Kconfig index ff257b81fde5..64d530ba42f6 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -229,6 +229,21 @@ config TMPFS_XATTR If unsure, say N. +config TMPFS_INODE64 + bool "Use 64-bit ino_t by default in tmpfs" + depends on TMPFS && 64BIT + default n + help + tmpfs has historically used only inode numbers as wide as an unsigned + int. In some cases this can cause wraparound, potentially resulting in + multiple files with the same inode number on a single device. This option + makes tmpfs use the full width of ino_t by default, similarly to the + inode64 mount option. + + To override this default, use the inode32 or inode64 mount options. + + If unsure, say N. + config HUGETLBFS bool "HugeTLB file system support" depends on X86 || IA64 || SPARC64 || (S390 && 64BIT) || \ diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h index eb628696ec66..a5a5d1d4d7b1 100644 --- a/include/linux/shmem_fs.h +++ b/include/linux/shmem_fs.h @@ -36,6 +36,7 @@ struct shmem_sb_info { unsigned char huge; /* Whether to try for hugepages */ kuid_t uid; /* Mount uid for root directory */ kgid_t gid; /* Mount gid for root directory */ + bool full_inums; /* If i_ino should be uint or ino_t */ ino_t next_ino; /* The next per-sb inode number to use */ ino_t __percpu *ino_batch; /* The next per-cpu inode number to use */ struct mempolicy *mpol; /* default memory policy for mappings */ diff --git a/mm/shmem.c b/mm/shmem.c index 0ae250b4da28..f3126ad7ba3d 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -114,11 +114,13 @@ struct shmem_options { kuid_t uid; kgid_t gid; umode_t mode; + bool full_inums; int huge; int seen; #define SHMEM_SEEN_BLOCKS 1 #define SHMEM_SEEN_INODES 2 #define SHMEM_SEEN_HUGE 4 +#define SHMEM_SEEN_INUMS 8 }; #ifdef CONFIG_TMPFS @@ -286,12 +288,17 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop) ino = sbinfo->next_ino++; if (unlikely(is_zero_ino(ino))) ino = sbinfo->next_ino++; - if (unlikely(ino > UINT_MAX)) { + if (unlikely(!sbinfo->full_inums && + ino > UINT_MAX)) { /* * Emulate get_next_ino uint wraparound for * compatibility */ - ino = 1; + if (IS_ENABLED(CONFIG_64BIT)) + pr_warn("%s: inode number overflow on device %d, consider using inode64 mount option\n", + __func__, MINOR(sb->s_dev)); + sbinfo->next_ino = 1; + ino = sbinfo->next_ino++; } *inop = ino; } @@ -304,6 +311,10 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop) * unknown contexts. As such, use a per-cpu batched allocator * which doesn't require the per-sb stat_lock unless we are at * the batch boundary. + * + * We don't need to worry about inode{32,64} since SB_KERNMOUNT + * shmem mounts are not exposed to userspace, so we don't need + * to worry about things like glibc compatibility. */ ino_t *next_ino; next_ino = per_cpu_ptr(sbinfo->ino_batch, get_cpu()); @@ -3397,6 +3408,8 @@ enum shmem_param { Opt_nr_inodes, Opt_size, Opt_uid, + Opt_inode32, + Opt_inode64, }; static const struct constant_table shmem_param_enums_huge[] = { @@ -3416,6 +3429,8 @@ const struct fs_parameter_spec shmem_fs_parameters[] = { fsparam_string("nr_inodes", Opt_nr_inodes), fsparam_string("size", Opt_size), fsparam_u32 ("uid", Opt_uid), + fsparam_flag ("inode32", Opt_inode32), + fsparam_flag ("inode64", Opt_inode64), {} }; @@ -3487,6 +3502,18 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param) break; } goto unsupported_parameter; + case Opt_inode32: + ctx->full_inums = false; + ctx->seen |= SHMEM_SEEN_INUMS; + break; + case Opt_inode64: + if (sizeof(ino_t) < 8) { + return invalfc(fc, + "Cannot use inode64 with <64bit inums in kernel\n"); + } + ctx->full_inums = true; + ctx->seen |= SHMEM_SEEN_INUMS; + break; } return 0; @@ -3578,8 +3605,16 @@ static int shmem_reconfigure(struct fs_context *fc) } } + if ((ctx->seen & SHMEM_SEEN_INUMS) && !ctx->full_inums && + sbinfo->next_ino > UINT_MAX) { + err = "Current inum too high to switch to 32-bit inums"; + goto out; + } + if (ctx->seen & SHMEM_SEEN_HUGE) sbinfo->huge = ctx->huge; + if (ctx->seen & SHMEM_SEEN_INUMS) + sbinfo->full_inums = ctx->full_inums; if (ctx->seen & SHMEM_SEEN_BLOCKS) sbinfo->max_blocks = ctx->blocks; if (ctx->seen & SHMEM_SEEN_INODES) { @@ -3619,6 +3654,29 @@ static int shmem_show_options(struct seq_file *seq, struct dentry *root) if (!gid_eq(sbinfo->gid, GLOBAL_ROOT_GID)) seq_printf(seq, ",gid=%u", from_kgid_munged(&init_user_ns, sbinfo->gid)); + + /* + * Showing inode{64,32} might be useful even if it's the system default, + * since then people don't have to resort to checking both here and + * /proc/config.gz to confirm 64-bit inums were successfully applied + * (which may not even exist if IKCONFIG_PROC isn't enabled). + * + * We hide it when inode64 isn't the default and we are using 32-bit + * inodes, since that probably just means the feature isn't even under + * consideration. + * + * As such: + * + * +-----------------+-----------------+ + * | TMPFS_INODE64=y | TMPFS_INODE64=n | + * +------------------+-----------------+-----------------+ + * | full_inums=true | show | show | + * | full_inums=false | show | hide | + * +------------------+-----------------+-----------------+ + * + */ + if (IS_ENABLED(CONFIG_TMPFS_INODE64) || sbinfo->full_inums) + seq_printf(seq, ",inode%d", (sbinfo->full_inums ? 64 : 32)); #ifdef CONFIG_TRANSPARENT_HUGEPAGE /* Rightly or wrongly, show huge mount option unmasked by shmem_huge */ if (sbinfo->huge) @@ -3667,6 +3725,8 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc) ctx->blocks = shmem_default_max_blocks(); if (!(ctx->seen & SHMEM_SEEN_INODES)) ctx->inodes = shmem_default_max_inodes(); + if (!(ctx->seen & SHMEM_SEEN_INUMS)) + ctx->full_inums = IS_ENABLED(CONFIG_TMPFS_INODE64); } else { sb->s_flags |= SB_NOUSER; } @@ -3684,6 +3744,7 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc) } sbinfo->uid = ctx->uid; sbinfo->gid = ctx->gid; + sbinfo->full_inums = ctx->full_inums; sbinfo->mode = ctx->mode; sbinfo->huge = ctx->huge; sbinfo->mpol = ctx->mpol; -- 2.27.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v7 2/2] tmpfs: Support 64-bit inums per-sb 2020-07-13 17:28 ` [PATCH v7 2/2] tmpfs: Support 64-bit inums per-sb Chris Down @ 2020-08-01 20:41 ` Hugh Dickins 2020-08-02 2:37 ` [PATCH mmotm] tmpfs: support 64-bit inums per-sb fix Hugh Dickins 0 siblings, 1 reply; 8+ messages in thread From: Hugh Dickins @ 2020-08-01 20:41 UTC (permalink / raw) To: Chris Down Cc: Andrew Morton, Hugh Dickins, Al Viro, Matthew Wilcox, Amir Goldstein, Jeff Layton, Johannes Weiner, Tejun Heo, linux-mm, linux-fsdevel, linux-kernel, kernel-team On Mon, 13 Jul 2020, Chris Down wrote: > The default is still set to inode32 for backwards compatibility, but > system administrators can opt in to the new 64-bit inode numbers by > either: > > 1. Passing inode64 on the command line when mounting, or > 2. Configuring the kernel with CONFIG_TMPFS_INODE64=y > > The inode64 and inode32 names are used based on existing precedent from > XFS. > > Signed-off-by: Chris Down <chris@chrisdown.name> > Reviewed-by: Amir Goldstein <amir73il@gmail.com> > Cc: Hugh Dickins <hughd@google.com> Acked-by: Hugh Dickins <hughd@google.com> Again, thanks, and comments below, but nothing to override tha Ack. > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Al Viro <viro@zeniv.linux.org.uk> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: Jeff Layton <jlayton@kernel.org> > Cc: Johannes Weiner <hannes@cmpxchg.org> > Cc: Tejun Heo <tj@kernel.org> > Cc: linux-mm@kvack.org > Cc: linux-fsdevel@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Cc: kernel-team@fb.com > --- > Documentation/filesystems/tmpfs.rst | 11 +++++ > fs/Kconfig | 15 +++++++ > include/linux/shmem_fs.h | 1 + > mm/shmem.c | 65 ++++++++++++++++++++++++++++- > 4 files changed, 90 insertions(+), 2 deletions(-) > > diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst > index 4e95929301a5..47b84ddaa8bb 100644 > --- a/Documentation/filesystems/tmpfs.rst > +++ b/Documentation/filesystems/tmpfs.rst > @@ -150,6 +150,15 @@ These options do not have any effect on remount. You can change these > parameters with chmod(1), chown(1) and chgrp(1) on a mounted filesystem. > > > +tmpfs has a mount option to select whether it will wrap at 32- or 64-bit inode > +numbers: > + > +inode64 Use 64-bit inode numbers > +inode32 Use 32-bit inode numbers > + > +On 64-bit, the default is set by CONFIG_TMPFS_INODE64. On 32-bit, inode64 is > +not legal and will produce an error at mount time. > + I did originally want to move this up higher, between the discussion of nr_inodes and mpol; but its placing here follows where you placed it in /proc/mounts, and I end up agreeing with that, so let's leave where you've placed it. But this is a description of a new pair of mount options, nothing to do with the paragraph below, so one more blank line is needed. Except that tmpfs.txt has now been replaced by tmpfs.rst, with some ====ing too. We'd better add that. I'm unfamiliar with, and not prepared for .rst - getting Doc right has become a job for specialists. I'll send a -fix.patch of how how I think it should look to Andrew and Randy Dunlap, hoping Randy can confirm whether it ends up right. > So 'mount -t tmpfs -o size=10G,nr_inodes=10k,mode=700 tmpfs /mytmpfs' > will give you tmpfs instance on /mytmpfs which can allocate 10GB > RAM/SWAP in 10240 inodes and it is only accessible by root. > @@ -161,3 +170,5 @@ RAM/SWAP in 10240 inodes and it is only accessible by root. > Hugh Dickins, 4 June 2007 > :Updated: > KOSAKI Motohiro, 16 Mar 2010 > +Updated: I bet that should say ":Updated:" now. > + Chris Down, 13 July 2020 > diff --git a/fs/Kconfig b/fs/Kconfig > index ff257b81fde5..64d530ba42f6 100644 > --- a/fs/Kconfig > +++ b/fs/Kconfig > @@ -229,6 +229,21 @@ config TMPFS_XATTR > > If unsure, say N. > > +config TMPFS_INODE64 > + bool "Use 64-bit ino_t by default in tmpfs" > + depends on TMPFS && 64BIT > + default n Okay: I haven't looked up what our last position was on which way the default should be, but I do recall that we were reluctantly realizing that we couldn't safely be as radical as we had hoped. As to "default n" being the default and unnecessary: I don't mind seeing it there explicitly, let's give the work to whoever wants to delete that line. > + help > + tmpfs has historically used only inode numbers as wide as an unsigned > + int. In some cases this can cause wraparound, potentially resulting in > + multiple files with the same inode number on a single device. This option > + makes tmpfs use the full width of ino_t by default, similarly to the > + inode64 mount option. I've just realized that this, and the inode64 Documentation, make no mention of why you might not want to enable it: it sounds like such a good thing, with the documentation on why not in include/linux/fs.h. I'd better add some text to these in the -fix.patch. > + > + To override this default, use the inode32 or inode64 mount options. > + > + If unsure, say N. > + > config HUGETLBFS > bool "HugeTLB file system support" > depends on X86 || IA64 || SPARC64 || (S390 && 64BIT) || \ > diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h > index eb628696ec66..a5a5d1d4d7b1 100644 > --- a/include/linux/shmem_fs.h > +++ b/include/linux/shmem_fs.h > @@ -36,6 +36,7 @@ struct shmem_sb_info { > unsigned char huge; /* Whether to try for hugepages */ > kuid_t uid; /* Mount uid for root directory */ > kgid_t gid; /* Mount gid for root directory */ > + bool full_inums; /* If i_ino should be uint or ino_t */ Fine, but don't be surprised if I change that eventually, maybe I'll make it a bit in a bitmask: I've nothing against bools generally, but like to know the size of my structure fields, and never sure what size a bool ends up with. > ino_t next_ino; /* The next per-sb inode number to use */ > ino_t __percpu *ino_batch; /* The next per-cpu inode number to use */ > struct mempolicy *mpol; /* default memory policy for mappings */ > diff --git a/mm/shmem.c b/mm/shmem.c > index 0ae250b4da28..f3126ad7ba3d 100644 > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -114,11 +114,13 @@ struct shmem_options { > kuid_t uid; > kgid_t gid; > umode_t mode; > + bool full_inums; > int huge; > int seen; > #define SHMEM_SEEN_BLOCKS 1 > #define SHMEM_SEEN_INODES 2 > #define SHMEM_SEEN_HUGE 4 > +#define SHMEM_SEEN_INUMS 8 > }; > > #ifdef CONFIG_TMPFS > @@ -286,12 +288,17 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop) > ino = sbinfo->next_ino++; > if (unlikely(is_zero_ino(ino))) > ino = sbinfo->next_ino++; > - if (unlikely(ino > UINT_MAX)) { > + if (unlikely(!sbinfo->full_inums && > + ino > UINT_MAX)) { And don't be surprised if I put that on one line one day :) > /* > * Emulate get_next_ino uint wraparound for > * compatibility > */ > - ino = 1; > + if (IS_ENABLED(CONFIG_64BIT)) > + pr_warn("%s: inode number overflow on device %d, consider using inode64 mount option\n", > + __func__, MINOR(sb->s_dev)); I like that you've added a warning; but it comes too late for a remount, doesn't it? We could do a countdown - "Only 1 billion more inode numbers until overflow!" - that would be fun :) But a multiplicity of warnings would be irritating to some and unnoticed by others. I don't know. > + sbinfo->next_ino = 1; > + ino = sbinfo->next_ino++; > } > *inop = ino; > } > @@ -304,6 +311,10 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop) > * unknown contexts. As such, use a per-cpu batched allocator > * which doesn't require the per-sb stat_lock unless we are at > * the batch boundary. > + * > + * We don't need to worry about inode{32,64} since SB_KERNMOUNT > + * shmem mounts are not exposed to userspace, so we don't need > + * to worry about things like glibc compatibility. > */ > ino_t *next_ino; > next_ino = per_cpu_ptr(sbinfo->ino_batch, get_cpu()); > @@ -3397,6 +3408,8 @@ enum shmem_param { > Opt_nr_inodes, > Opt_size, > Opt_uid, > + Opt_inode32, > + Opt_inode64, > }; > > static const struct constant_table shmem_param_enums_huge[] = { > @@ -3416,6 +3429,8 @@ const struct fs_parameter_spec shmem_fs_parameters[] = { > fsparam_string("nr_inodes", Opt_nr_inodes), > fsparam_string("size", Opt_size), > fsparam_u32 ("uid", Opt_uid), > + fsparam_flag ("inode32", Opt_inode32), > + fsparam_flag ("inode64", Opt_inode64), > {} > }; > > @@ -3487,6 +3502,18 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param) > break; > } > goto unsupported_parameter; > + case Opt_inode32: > + ctx->full_inums = false; > + ctx->seen |= SHMEM_SEEN_INUMS; > + break; > + case Opt_inode64: > + if (sizeof(ino_t) < 8) { > + return invalfc(fc, > + "Cannot use inode64 with <64bit inums in kernel\n"); > + } > + ctx->full_inums = true; > + ctx->seen |= SHMEM_SEEN_INUMS; > + break; > } > return 0; > > @@ -3578,8 +3605,16 @@ static int shmem_reconfigure(struct fs_context *fc) > } > } > > + if ((ctx->seen & SHMEM_SEEN_INUMS) && !ctx->full_inums && > + sbinfo->next_ino > UINT_MAX) { > + err = "Current inum too high to switch to 32-bit inums"; > + goto out; > + } > + > if (ctx->seen & SHMEM_SEEN_HUGE) > sbinfo->huge = ctx->huge; > + if (ctx->seen & SHMEM_SEEN_INUMS) > + sbinfo->full_inums = ctx->full_inums; > if (ctx->seen & SHMEM_SEEN_BLOCKS) > sbinfo->max_blocks = ctx->blocks; > if (ctx->seen & SHMEM_SEEN_INODES) { > @@ -3619,6 +3654,29 @@ static int shmem_show_options(struct seq_file *seq, struct dentry *root) > if (!gid_eq(sbinfo->gid, GLOBAL_ROOT_GID)) > seq_printf(seq, ",gid=%u", > from_kgid_munged(&init_user_ns, sbinfo->gid)); > + > + /* > + * Showing inode{64,32} might be useful even if it's the system default, > + * since then people don't have to resort to checking both here and > + * /proc/config.gz to confirm 64-bit inums were successfully applied > + * (which may not even exist if IKCONFIG_PROC isn't enabled). > + * > + * We hide it when inode64 isn't the default and we are using 32-bit > + * inodes, since that probably just means the feature isn't even under > + * consideration. > + * > + * As such: > + * > + * +-----------------+-----------------+ > + * | TMPFS_INODE64=y | TMPFS_INODE64=n | > + * +------------------+-----------------+-----------------+ > + * | full_inums=true | show | show | > + * | full_inums=false | show | hide | > + * +------------------+-----------------+-----------------+ > + * > + */ Thanks for spelling this out: it may be the most contentious part of the patch. I don't disagree with your decision, but I can imagine it causing annoyance. Let's start off with it as you decided, but be ready to be persuaded differently. I have a growing inclination towards the 64-bit kernel showing ",inode32" but never ",inode64". > + if (IS_ENABLED(CONFIG_TMPFS_INODE64) || sbinfo->full_inums) > + seq_printf(seq, ",inode%d", (sbinfo->full_inums ? 64 : 32)); > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > /* Rightly or wrongly, show huge mount option unmasked by shmem_huge */ > if (sbinfo->huge) > @@ -3667,6 +3725,8 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc) > ctx->blocks = shmem_default_max_blocks(); > if (!(ctx->seen & SHMEM_SEEN_INODES)) > ctx->inodes = shmem_default_max_inodes(); > + if (!(ctx->seen & SHMEM_SEEN_INUMS)) > + ctx->full_inums = IS_ENABLED(CONFIG_TMPFS_INODE64); > } else { > sb->s_flags |= SB_NOUSER; > } > @@ -3684,6 +3744,7 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc) > } > sbinfo->uid = ctx->uid; > sbinfo->gid = ctx->gid; > + sbinfo->full_inums = ctx->full_inums; > sbinfo->mode = ctx->mode; > sbinfo->huge = ctx->huge; > sbinfo->mpol = ctx->mpol; > -- > 2.27.0 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH mmotm] tmpfs: support 64-bit inums per-sb fix 2020-08-01 20:41 ` Hugh Dickins @ 2020-08-02 2:37 ` Hugh Dickins 2020-08-02 3:05 ` Randy Dunlap 0 siblings, 1 reply; 8+ messages in thread From: Hugh Dickins @ 2020-08-02 2:37 UTC (permalink / raw) To: Andrew Morton, Chris Down, Randy Dunlap Cc: Al Viro, Matthew Wilcox, Amir Goldstein, Hugh Dickins, Jeff Layton, Johannes Weiner, Tejun Heo, linux-mm, linux-fsdevel, linux-kernel, kernel-team Expanded Chris's Documentation and Kconfig help on tmpfs inode64. TMPFS_INODE64 still there, still default N, but writing down its very limited limitation does make me wonder again if we want the option. Signed-off-by: Hugh Dickins <hughd@google.com> --- Andrew, please fold into tmpfs-support-64-bit-inums-per-sb.patch later. Randy, you're very active on Documentation and linux-next: may I ask you please to try applying this patch to latest, and see if tmpfs.rst comes out looking right to you? I'm an old dog still stuck in the days of tmpfs.txt, hoping to avoid new tricks for a while. Thanks! (Bonus points if you can explain what the "::" on line 122 is about. I started out reading Documentation/doc-guide/sphinx.rst, but... got diverted. Perhaps I should ask Mauro or Jon, but turning for help first to you.) Documentation/filesystems/tmpfs.rst | 13 ++++++++++--- fs/Kconfig | 16 +++++++++++----- 2 files changed, 21 insertions(+), 8 deletions(-) --- mmotm/Documentation/filesystems/tmpfs.rst 2020-07-27 18:54:51.116524795 -0700 +++ linux/Documentation/filesystems/tmpfs.rst 2020-08-01 18:37:07.719713987 -0700 @@ -153,11 +153,18 @@ parameters with chmod(1), chown(1) and c tmpfs has a mount option to select whether it will wrap at 32- or 64-bit inode numbers: +======= ======================== inode64 Use 64-bit inode numbers inode32 Use 32-bit inode numbers +======= ======================== + +On a 32-bit kernel, inode32 is implicit, and inode64 is refused at mount time. +On a 64-bit kernel, CONFIG_TMPFS_INODE64 sets the default. inode64 avoids the +possibility of multiple files with the same inode number on a single device; +but risks glibc failing with EOVERFLOW once 33-bit inode numbers are reached - +if a long-lived tmpfs is accessed by 32-bit applications so ancient that +opening a file larger than 2GiB fails with EINVAL. -On 64-bit, the default is set by CONFIG_TMPFS_INODE64. On 32-bit, inode64 is -not legal and will produce an error at mount time. So 'mount -t tmpfs -o size=10G,nr_inodes=10k,mode=700 tmpfs /mytmpfs' will give you tmpfs instance on /mytmpfs which can allocate 10GB @@ -170,5 +177,5 @@ RAM/SWAP in 10240 inodes and it is only Hugh Dickins, 4 June 2007 :Updated: KOSAKI Motohiro, 16 Mar 2010 -Updated: +:Updated: Chris Down, 13 July 2020 --- mmotm/fs/Kconfig 2020-07-27 18:54:59.384550639 -0700 +++ linux/fs/Kconfig 2020-08-01 18:11:33.749236321 -0700 @@ -223,12 +223,18 @@ config TMPFS_INODE64 default n help tmpfs has historically used only inode numbers as wide as an unsigned - int. In some cases this can cause wraparound, potentially resulting in - multiple files with the same inode number on a single device. This option - makes tmpfs use the full width of ino_t by default, similarly to the - inode64 mount option. + int. In some cases this can cause wraparound, potentially resulting + in multiple files with the same inode number on a single device. This + option makes tmpfs use the full width of ino_t by default, without + needing to specify the inode64 option when mounting. - To override this default, use the inode32 or inode64 mount options. + But if a long-lived tmpfs is to be accessed by 32-bit applications so + ancient that opening a file larger than 2GiB fails with EINVAL, then + the INODE64 config option and inode64 mount option risk operations + failing with EOVERFLOW once 33-bit inode numbers are reached. + + To override this configured default, use the inode32 or inode64 + option when mounting. If unsure, say N. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH mmotm] tmpfs: support 64-bit inums per-sb fix 2020-08-02 2:37 ` [PATCH mmotm] tmpfs: support 64-bit inums per-sb fix Hugh Dickins @ 2020-08-02 3:05 ` Randy Dunlap 2020-08-02 5:25 ` Hugh Dickins 0 siblings, 1 reply; 8+ messages in thread From: Randy Dunlap @ 2020-08-02 3:05 UTC (permalink / raw) To: Hugh Dickins, Andrew Morton, Chris Down Cc: Al Viro, Matthew Wilcox, Amir Goldstein, Jeff Layton, Johannes Weiner, Tejun Heo, linux-mm, linux-fsdevel, linux-kernel, kernel-team On 8/1/20 7:37 PM, Hugh Dickins wrote: > Expanded Chris's Documentation and Kconfig help on tmpfs inode64. > TMPFS_INODE64 still there, still default N, but writing down its very > limited limitation does make me wonder again if we want the option. > > Signed-off-by: Hugh Dickins <hughd@google.com> > --- > Andrew, please fold into tmpfs-support-64-bit-inums-per-sb.patch later. > > Randy, you're very active on Documentation and linux-next: may I ask you > please to try applying this patch to latest, and see if tmpfs.rst comes > out looking right to you? I'm an old dog still stuck in the days of Hi Hugh, It looks fine. > tmpfs.txt, hoping to avoid new tricks for a while. Thanks! (Bonus > points if you can explain what the "::" on line 122 is about. I started > out reading Documentation/doc-guide/sphinx.rst, but... got diverted. > Perhaps I should ask Mauro or Jon, but turning for help first to you.) That's the correct file. Around line 216, it says: * For inserting fixed width text blocks (for code examples, use case examples, etc.), use ``::`` for anything that doesn't really benefit from syntax highlighting, especially short snippets. Use ``.. code-block:: <language>`` for longer code blocks that benefit from highlighting. For a short snippet of code embedded in the text, use \`\`. so it's just for a (short) code example block, fixed font... > > Documentation/filesystems/tmpfs.rst | 13 ++++++++++--- > fs/Kconfig | 16 +++++++++++----- > 2 files changed, 21 insertions(+), 8 deletions(-) > > --- mmotm/Documentation/filesystems/tmpfs.rst 2020-07-27 18:54:51.116524795 -0700 > +++ linux/Documentation/filesystems/tmpfs.rst 2020-08-01 18:37:07.719713987 -0700 cheers. -- ~Randy ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH mmotm] tmpfs: support 64-bit inums per-sb fix 2020-08-02 3:05 ` Randy Dunlap @ 2020-08-02 5:25 ` Hugh Dickins 0 siblings, 0 replies; 8+ messages in thread From: Hugh Dickins @ 2020-08-02 5:25 UTC (permalink / raw) To: Randy Dunlap Cc: Hugh Dickins, Andrew Morton, Chris Down, Al Viro, Matthew Wilcox, Amir Goldstein, Jeff Layton, Johannes Weiner, Tejun Heo, linux-mm, linux-fsdevel, linux-kernel, kernel-team On Sat, 1 Aug 2020, Randy Dunlap wrote: > On 8/1/20 7:37 PM, Hugh Dickins wrote: > > Expanded Chris's Documentation and Kconfig help on tmpfs inode64. > > TMPFS_INODE64 still there, still default N, but writing down its very > > limited limitation does make me wonder again if we want the option. > > > > Signed-off-by: Hugh Dickins <hughd@google.com> > > --- > > Andrew, please fold into tmpfs-support-64-bit-inums-per-sb.patch later. > > > > Randy, you're very active on Documentation and linux-next: may I ask you > > please to try applying this patch to latest, and see if tmpfs.rst comes > > out looking right to you? I'm an old dog still stuck in the days of > > Hi Hugh, > It looks fine. Thank you so much, Randy. > > > tmpfs.txt, hoping to avoid new tricks for a while. Thanks! (Bonus > > points if you can explain what the "::" on line 122 is about. I started > > out reading Documentation/doc-guide/sphinx.rst, but... got diverted. > > Perhaps I should ask Mauro or Jon, but turning for help first to you.) > > That's the correct file. Around line 216, it says: > > * For inserting fixed width text blocks (for code examples, use case > examples, etc.), use ``::`` for anything that doesn't really benefit > from syntax highlighting, especially short snippets. Use > ``.. code-block:: <language>`` for longer code blocks that benefit > from highlighting. For a short snippet of code embedded in the text, use \`\`. > > > so it's just for a (short) code example block, fixed font... Bonus points awarded, thanks...ish. I'll have to look around for more examples of where that's done, and I think it'll only make real sense to me, when I'm further along, producing the proper output, then seeing how bad something looks without the "::". Thanks again, Hugh ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2020-08-02 5:25 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-07-13 17:28 [PATCH v7 0/2] tmpfs: inode: Reduce risk of inum overflow Chris Down 2020-07-13 17:28 ` [PATCH v7 1/2] tmpfs: Per-superblock i_ino support Chris Down 2020-08-01 19:22 ` Hugh Dickins 2020-07-13 17:28 ` [PATCH v7 2/2] tmpfs: Support 64-bit inums per-sb Chris Down 2020-08-01 20:41 ` Hugh Dickins 2020-08-02 2:37 ` [PATCH mmotm] tmpfs: support 64-bit inums per-sb fix Hugh Dickins 2020-08-02 3:05 ` Randy Dunlap 2020-08-02 5:25 ` Hugh Dickins
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).