From: Amir Goldstein <amir73il@gmail.com>
To: Hugh Dickins <hughd@google.com>
Cc: Chris Down <chris@chrisdown.name>,
"zhengbin (A)" <zhengbin13@huawei.com>,
"J. R. Okajima" <hooanon05g@gmail.com>,
Linux MM <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
Al Viro <viro@zeniv.linux.org.uk>,
Matthew Wilcox <willy@infradead.org>,
Jeff Layton <jlayton@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
kernel-team@fb.com
Subject: Re: [PATCH v5 1/2] tmpfs: Add per-superblock i_ino support
Date: Wed, 8 Jan 2020 14:51:38 +0200
Message-ID: <CAOQ4uxi2kAnX4+iy2=XXEXDB=zhsCcd8smAh7DvSGbR34k7eww@mail.gmail.com>
In-Reply-To: <alpine.LSU.2.11.2001080206390.1884@eggly.anvils>
On Wed, Jan 8, 2020 at 12:59 PM Hugh Dickins <hughd@google.com> wrote:
>
> On Tue, 7 Jan 2020, Amir Goldstein wrote:
> >
> > I vote in favor of the best of both patches, to result in a simpler outcome:
> > 1. use inop approach from Hugh's patch
> > 2. use get_next_ino() instead of per-sb ino_batch for kern_mount
> >
> > Hugh, do you have any evidence or suspicion that shmem_file_setup()
> > could be contributing to depleting the global get_next_ino pool?
>
> Depends on what the system is used for: on one system, it will make
> very little use of that pool, on another it will be one of the major
> depleters of the pool. Generally, it would be kinder to the other
> users of the pool (those that might also care about ino duplication)
> if shmem were to cede all use of it; but a bigger patch, yes.
>
> > Do you have an objection to the combination above?
>
> Objection would be too strong: I'm uncomfortable with it, and not
> tempted to replace our internal implementation by one reverting to
> use get_next_ino(); but not as uncomfortable as I am with holding
> up progress on the issue.
>
> Uncomfortable because of the "depletion" you mention. Uncomfortable
> because half the reason for ever offering the unlimited "nr_inodes=0"
> mount option was to avoid stat_lock overhead (though, for all I know,
> maybe nobody anywhere uses that option now - and I'll be surprised if
> 0-day uses it and reports any slowdown).
>
> Also uncomfortable because your (excellent, and I'd never thought
> about it that way) argument that the shm_mnt objects are internal
> and unstatable (that may not resemble how you expressed it, I've not
> gone back to search the mails to find where you made the point), is
> not entirely correct nowadays - memfd_create()d objects come from
> that same shm_mnt, and they can be fstat()ed. What is the
> likelihood that anything would care about duplicated inos of
> memfd_create()d objects? Fairly low, I think we'll agree; and
> we can probably also say that their inos are an odd implementation
> detail that no memfd_create() user should depend upon anyway. But
> I mention it in case it changes your own feeling about the shm_mnt.
>
I have no strong feeling either way. I just didn't know how heavily it
is used. IMO you have provided sufficient arguments to go with your
version of the patch.
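
Just to spell out what I mean by the per-sb direction, a minimal
sketch (function and field names here are illustrative, not the exact
patch; free-inode accounting and the 32-bit wraparound question are
left out):

static int shmem_reserve_inode(struct super_block *sb, ino_t *inop)
{
	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);

	/*
	 * Per-sb counter under the existing stat_lock, instead of
	 * drawing from the global get_next_ino() pool.
	 */
	spin_lock(&sbinfo->stat_lock);
	if (inop)
		*inop = ++sbinfo->next_ino;
	spin_unlock(&sbinfo->stat_lock);
	return 0;
}
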
> > > Not-Yet-Signed-off-by: Hugh Dickins <hughd@google.com>
> > >
> > > 1) Probably needs minor corrections for the 32-bit kernel: I wasn't
> > > worrying about that at the time, and expect some "unsigned long"s
> > > I didn't need to change for the 64-bit kernel actually need to be
> > > "u64"s or "ino_t"s now.
> > > 2) The "return 1" from shmem_encode_fh() would nowadays be written as
> > > "return FILEID_INO32_GEN" (and overlayfs takes particular interest
> > > in that fileid); yet it already put the high 32 bits of the ino into
> > > the fh: probably needs a different fileid once high 32 bits are set.
> >
> > Nice spotting, but this really isn't a problem for overlayfs.
> > I agree that it would be nice to follow the same practice as xfs with
> > XFS_FILEID_TYPE_64FLAG, but as one can see from the private
> > flag, this is by no means a standard of any sort.
> >
> > As the name_to_handle_at() man page says:
> > "Other than the use of the handle_bytes field, the caller should treat the
> > file_handle structure as an opaque data type: the handle_type and f_handle
> > fields are needed only by a subsequent call to open_by_handle_at()."
> >
> > It is true that one of my initial versions was checking the value returned
> > from encode_fh, but Miklos pointed out to me that this value is arbitrary
> > and we shouldn't rely on it.
> >
> > In current overlayfs, the value FILEID_INO32_GEN is only used internally
> > to indicate the case where the filesystem has no encode_fh op (see the
> > comment above ovl_can_decode_fh()). IOW, only the special case of encoding
> > by the default export_encode_fh() is considered proof of 32-bit inodes.
> > So tmpfs has never been considered a 32-bit-inode filesystem by
> > overlayfs.
>
> Thanks, you know far more about encode_fh() than I ever shall, so for
> now I'll take your word for it that we don't need to make any change
> there for this 64-bit ino support. One day I should look back, go
> through the encode_fh() callsites, and decide what that "return 1"
> really ought to say.
>
TBH, I meant to say that the return 1 shouldn't matter functionally,
but it would be much nicer to change it to FILEID_INO64_GEN, define
that as 0x81 in include/linux/exportfs.h, and actually order the gen
word after the inode64 words.
But that is independent of the per-sb ino change, because the layout
of the file handle has always been [gen,inode64] and never really
matched the standard layout implied by FILEID_INO32_GEN.
I may send a patch later on to change that.
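
To make that concrete, roughly what I have in mind (FILEID_INO64_GEN
is only proposed here, not something exportfs.h defines today, and
the unhashed-inode handling that shmem_encode_fh() already does is
omitted):

#define FILEID_INO64_GEN	0x81	/* proposed */

static int shmem_encode_fh(struct inode *inode, __u32 *fh, int *len,
			   struct inode *parent)
{
	if (*len < 3) {
		*len = 3;
		return FILEID_INVALID;
	}

	/* [ino64 low, ino64 high, gen] rather than today's [gen, ino64] */
	fh[0] = inode->i_ino;
	fh[1] = ((__u64)inode->i_ino) >> 32;
	fh[2] = inode->i_generation;
	*len = 3;
	return FILEID_INO64_GEN;
}
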
Thanks,
Amir.