linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>, Christoph Hellwig <hch@infradead.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v3 3/4] super: wait for nascent superblocks
Date: Mon, 21 Aug 2023 18:02:24 +0200	[thread overview]
Message-ID: <20230821160224.ees2jmlddqtupvsc@quack3> (raw)
In-Reply-To: <20230821155237.d4luoqrzhnlffbti@quack3>

On Mon 21-08-23 17:52:37, Jan Kara wrote:
> On Fri 18-08-23 16:00:50, Christian Brauner wrote:
> > Recent patches experiment with making it possible to allocate a new
> > superblock before opening the relevant block device. Naturally this has
> > intricate side-effects that we get to learn about while developing this.
> > 
> > Superblock allocators such as sget{_fc}() return with s_umount of the
> > new superblock held and lock ordering currently requires that block
> > level locks such as bdev_lock and open_mutex rank above s_umount.
> > 
> > Before aca740cecbe5 ("fs: open block device after superblock creation")
> > ordering was guaranteed to be correct as block devices were opened prior
> > to superblock allocation and thus s_umount wasn't held. But now s_umount
> > must be dropped before opening block devices to avoid locking
> > violations.
> > 
> > This has consequences. The main one being that iterators over
> > @super_blocks and @fs_supers that grab a temporary reference to the
> > superblock can now also grab s_umount before the caller has managed to
> > open block devices and called fill_super(). So whereas before such
> > iterators or concurrent mounts would have simply slept on s_umount until
> > SB_BORN was set or the superblock was discard due to initalization
> > failure they can now needlessly spin through sget{_fc}().
> > 
> > If the caller is sleeping on bdev_lock or open_mutex one caller waiting
> > on SB_BORN will always spin somewhere and potentially this can go on for
> > quite a while.
> > 
> > It should be possible to drop s_umount while allowing iterators to wait
> > on a nascent superblock to either be born or discarded. This patch
> > implements a wait_var_event() mechanism allowing iterators to sleep
> > until they are woken when the superblock is born or discarded.
> > 
> > This also allows us to avoid relooping through @fs_supers and
> > @super_blocks if a superblock isn't yet born or dying.
> > 
> > Link: aca740cecbe5 ("fs: open block device after superblock creation")
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> ...
> > +/* wake waiters */
> > +#define SUPER_WAKE_FLAGS (SB_BORN | SB_DYING)
> > +static void super_wake(struct super_block *sb, unsigned int flag)
> > +{
> > +	WARN_ON_ONCE((flag & ~SUPER_WAKE_FLAGS));
> > +	WARN_ON_ONCE(hweight32(flag & SUPER_WAKE_FLAGS) > 1);
> > +
> > +	/*
> > +	 * Pairs with smp_load_acquire() in super_lock() and
> > +	 * ensures that @flag is set before we wake anyone and ensures
> > +	 * that checking whether waitqueue is active isn't hoisted
> > +	 * before the store of @flag.
> > +	 */
> > +	sb->s_flags |= flag;
> > +	smp_mb();
> > +	wake_up_var(&sb->s_flags);
> 
> I think we misunderstood here. I believe we need:
> 
> 	/*
> 	 * Pairs with smp_load_acquire() in super_lock() to make sure
> 	 * all initializations in the superblock are seen by the user
> 	 * seeing SB_BORN sent.
> 	 */
> 	smp_store_release(&sb->s_flags, sb->s_flags | flag);
> 	/*
> 	 * Pairs with the barrier in prepare_to_wait_event() to make sure
> 	 * ___wait_var_event() either sees SB_BORN set or
> 	 * waitqueue_active() check in wake_up_var() sees the waiter
> 	 */
> 	smp_rmb();
> 	wake_up_var(&sb->s_flags);
> 
> or we need something equivalent with stronger barriers. Like:
> 
> 	smp_wmb();
> 	sb->s_flags |= flag;
> 	smp_rmb();
> 	wake_up_var(&sb->s_flags);

Ah, in both of the above cases we actually need smp_mb() (as you properly
chose) instead of smp_rmb() I wrote above because we need to establish
write-vs-read ordering.

BTW, if you pick one of these two schemes, feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

If you decide for something else, I'd like to see the result first...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  parent reply	other threads:[~2023-08-21 16:02 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-18 14:00 [PATCH v3 0/4] super: allow waiting without s_umount held Christian Brauner
2023-08-18 14:00 ` [PATCH v3 1/4] super: use locking helpers Christian Brauner
2023-08-18 14:00 ` [PATCH v3 2/4] super: make locking naming consistent Christian Brauner
2023-08-18 14:00 ` [PATCH v3 3/4] super: wait for nascent superblocks Christian Brauner
2023-08-18 14:57   ` Matthew Wilcox
2023-08-18 15:21     ` Christian Brauner
2023-08-21 15:52   ` Jan Kara
2023-08-21 15:56     ` Christian Brauner
2023-08-21 16:04       ` Jan Kara
2023-08-21 16:02     ` Jan Kara [this message]
2023-08-21 16:08       ` Christian Brauner
2023-08-18 14:00 ` [PATCH v3 4/4] super: wait until we passed kill super Christian Brauner
2023-08-21 15:57   ` Jan Kara
2023-08-22 12:36 ` [PATCH v3 0/4] super: allow waiting without s_umount held Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230821160224.ees2jmlddqtupvsc@quack3 \
    --to=jack@suse.cz \
    --cc=brauner@kernel.org \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).