All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] fix bitmap corruption on close_range() with CLOSE_RANGE_UNSHARE
Date: Tue, 6 Aug 2024 02:02:17 +0100	[thread overview]
Message-ID: <20240806010217.GL5334@ZenIV> (raw)
In-Reply-To: <CAHk-=wjb1pGkNuaJOyJf9Uois648to5NJNLXHk5ELFTB_HL0PA@mail.gmail.com>

On Mon, Aug 05, 2024 at 05:04:05PM -0700, Linus Torvalds wrote:
> On Mon, 5 Aug 2024 at 16:45, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > So... do we really need that indirect?  The problem would be
> > seeing ->max_fds update before that of the array pointers.
> 
> The reason is simply so that we can have one single allocation.
> 
> In fact, quite often, it's zero allocations when you can use the
> 'struct fdtable fdtab' that is embedded in 'struct files_struct'.

More to the point, we use arrays embedded into files_struct.

>But
> the 'struct fdtable' was a convenient way to allocate all those
> bitmaps _and_ the 'struct file *' array all together.

I don't think so - IIRC, it was introduced when we added RCU'd
file lookup.  Let me check...  Yep; badf16621c1f "[PATCH] files:
break up files struct", with RCU support as rationale.  Followed
by ab2af1f50050 "[PATCH] files: files struct with RCU".

Before those commits ->max_fds and ->fd used to live in
files_struct and fget() (OK, fcheck_files()) had been taking
->files_lock, so that concurrent expand_files() would not
screw us over.

The problem was not just the need to delay freeing old ->fd
array; that could be dealt with easily enough.  Think what
would've happened if fcheck_files() ended up fetching
new value of ->max_fds and old value of ->fd, which pointed
to pre-expansion array.  Indirection allowed to update
both in one store.

The thing is, ->max_fds for successive ->fdt is monotonously
increasing.  So a lockless reader seeing the old size is
fine with the new table - we just need to prevent the opposite.

Would rcu_assign_pointer of pointers + smp_store_release of max_fds on expand
(all under ->files_lock, etc.) paired with
smp_load_acquire of max_fds + rcu_dereference of ->fd on file lookup side
be enough, or do we need an explicit smp_wmb/smp_rmb in there?

> And yes, I think it's entirely a historical artifact of how that thing
> grew to be. Long long long ago there was no secondary allocation at
> all, and MAX_OPEN was fixed at 20.

Yes - but that had changed way before 2005 when those patches went in.
Separate allocation of bitmaps is 2.3.12pre7 and separate allocation of
->fd is even older - 2.1.90pre1.  NR_OPEN had been 1024 at that point;
earlier history is
	0.97: 20 -> 32
	0.98.4: 32 -> 256
	2.1.26: 256 -> 1024

  reply	other threads:[~2024-08-06  1:02 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-03 22:50 [PATCH] fix bitmap corruption on close_range() with CLOSE_RANGE_UNSHARE Al Viro
2024-08-03 23:06 ` Al Viro
2024-08-03 23:51 ` Linus Torvalds
2024-08-04  0:05   ` Linus Torvalds
2024-08-04  0:34   ` Al Viro
2024-08-04  3:42     ` Linus Torvalds
2024-08-04  3:47     ` Al Viro
2024-08-04  4:17       ` Al Viro
2024-08-04 15:18       ` Linus Torvalds
2024-08-04 21:13         ` Al Viro
2024-08-05 23:44           ` Al Viro
2024-08-06  0:04             ` Linus Torvalds
2024-08-06  1:02               ` Al Viro [this message]
2024-08-06  8:41                 ` Christian Brauner
2024-08-06 16:32                   ` Al Viro
2024-08-06 17:01                     ` Linus Torvalds
2024-08-05  7:22 ` Christian Brauner
2024-08-05 18:54   ` Al Viro
2024-08-06  9:11     ` Christian Brauner
2024-08-05  9:48 ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240806010217.GL5334@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=brauner@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.