public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] fix bitmap corruption on close_range() with CLOSE_RANGE_UNSHARE
Date: Tue, 6 Aug 2024 02:02:17 +0100	[thread overview]
Message-ID: <20240806010217.GL5334@ZenIV> (raw)
In-Reply-To: <CAHk-=wjb1pGkNuaJOyJf9Uois648to5NJNLXHk5ELFTB_HL0PA@mail.gmail.com>

On Mon, Aug 05, 2024 at 05:04:05PM -0700, Linus Torvalds wrote:
> On Mon, 5 Aug 2024 at 16:45, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > So... do we really need that indirect?  The problem would be
> > seeing ->max_fds update before that of the array pointers.
> 
> The reason is simply so that we can have one single allocation.
> 
> In fact, quite often, it's zero allocations when you can use the
> 'struct fdtable fdtab' that is embedded in 'struct files_struct'.

More to the point, we use arrays embedded into files_struct.

>But
> the 'struct fdtable' was a convenient way to allocate all those
> bitmaps _and_ the 'struct file *' array all together.

I don't think so - IIRC, it was introduced when we added RCU'd
file lookup.  Let me check...  Yep; badf16621c1f "[PATCH] files:
break up files struct", with RCU support as rationale.  Followed
by ab2af1f50050 "[PATCH] files: files struct with RCU".

Before those commits ->max_fds and ->fd used to live in
files_struct and fget() (OK, fcheck_files()) had been taking
->files_lock, so that concurrent expand_files() would not
screw us over.

The problem was not just the need to delay freeing old ->fd
array; that could be dealt with easily enough.  Think what
would've happened if fcheck_files() ended up fetching
new value of ->max_fds and old value of ->fd, which pointed
to pre-expansion array.  Indirection allowed to update
both in one store.

The thing is, ->max_fds for successive ->fdt is monotonously
increasing.  So a lockless reader seeing the old size is
fine with the new table - we just need to prevent the opposite.

Would rcu_assign_pointer of pointers + smp_store_release of max_fds on expand
(all under ->files_lock, etc.) paired with
smp_load_acquire of max_fds + rcu_dereference of ->fd on file lookup side
be enough, or do we need an explicit smp_wmb/smp_rmb in there?

> And yes, I think it's entirely a historical artifact of how that thing
> grew to be. Long long long ago there was no secondary allocation at
> all, and MAX_OPEN was fixed at 20.

Yes - but that had changed way before 2005 when those patches went in.
Separate allocation of bitmaps is 2.3.12pre7 and separate allocation of
->fd is even older - 2.1.90pre1.  NR_OPEN had been 1024 at that point;
earlier history is
	0.97: 20 -> 32
	0.98.4: 32 -> 256
	2.1.26: 256 -> 1024

  reply	other threads:[~2024-08-06  1:02 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-03 22:50 [PATCH] fix bitmap corruption on close_range() with CLOSE_RANGE_UNSHARE Al Viro
2024-08-03 23:06 ` Al Viro
2024-08-03 23:51 ` Linus Torvalds
2024-08-04  0:05   ` Linus Torvalds
2024-08-04  0:34   ` Al Viro
2024-08-04  3:42     ` Linus Torvalds
2024-08-04  3:47     ` Al Viro
2024-08-04  4:17       ` Al Viro
2024-08-04 15:18       ` Linus Torvalds
2024-08-04 21:13         ` Al Viro
2024-08-05 23:44           ` Al Viro
2024-08-06  0:04             ` Linus Torvalds
2024-08-06  1:02               ` Al Viro [this message]
2024-08-06  8:41                 ` Christian Brauner
2024-08-06 16:32                   ` Al Viro
2024-08-06 17:01                     ` Linus Torvalds
2024-08-05  7:22 ` Christian Brauner
2024-08-05 18:54   ` Al Viro
2024-08-06  9:11     ` Christian Brauner
2024-08-05  9:48 ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240806010217.GL5334@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=brauner@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox