public inbox for linux-fsdevel@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Jan Kara <jack@suse.cz>
Cc: Christian Brauner <brauner@kernel.org>,
	linux-fsdevel@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
	David Howells <dhowells@redhat.com>
Subject: Re: [PATCH] fs: Provide helpers for manipulating sb->s_readonly_remount
Date: Tue, 20 Jun 2023 09:16:10 +1000
Message-ID: <ZJDhuldMQRvYGRSh@dread.disaster.area>
In-Reply-To: <20230619110526.3tothvlcww6cgfup@quack3>

On Mon, Jun 19, 2023 at 01:05:26PM +0200, Jan Kara wrote:
> On Sat 17-06-23 09:33:42, Dave Chinner wrote:
> > On Fri, Jun 16, 2023 at 06:38:27PM +0200, Jan Kara wrote:
> > > Provide helpers to set and clear sb->s_readonly_remount including
> > > appropriate memory barriers. Also use this opportunity to document what
> > > the barriers pair with and why they are needed.
> > > 
> > > Suggested-by: Dave Chinner <david@fromorbit.com>
> > > Signed-off-by: Jan Kara <jack@suse.cz>
> > 
> > The helper conversion looks fine so from that perspective the patch
> > looks good.
> > 
> > However, I'm not sure the use of memory barriers is correct.
> 
> AFAICS, the barriers are correct but my documentation was not ;)
> Christian's reply has all the details but maybe let me attempt a bit more
> targeted reply here.

*nod*

> 
> > IIUC, we want mnt_is_readonly() to return true whenever
> > s_readonly_remount is set. Is that the behaviour we are trying to
> > achieve for both ro->rw and rw->ro transitions?
> 
> Yes. But what matters is the ordering of s_readonly_remount checking wrt
> other flags. See below.
> 
> > > ---
> > >  fs/internal.h      | 26 ++++++++++++++++++++++++++
> > >  fs/namespace.c     | 10 ++++------
> > >  fs/super.c         | 17 ++++++-----------
> > >  include/linux/fs.h |  2 +-
> > >  4 files changed, 37 insertions(+), 18 deletions(-)
> > > 
> > > diff --git a/fs/internal.h b/fs/internal.h
> > > index bd3b2810a36b..01bff3f6db79 100644
> > > --- a/fs/internal.h
> > > +++ b/fs/internal.h
> > > @@ -120,6 +120,32 @@ void put_super(struct super_block *sb);
> > >  extern bool mount_capable(struct fs_context *);
> > >  int sb_init_dio_done_wq(struct super_block *sb);
> > >  
> > > +/*
> > > + * Prepare superblock for changing its read-only state (i.e., either remount
> > > + * read-write superblock read-only or vice versa). After this function returns
> > > + * mnt_is_readonly() will return true for any mount of the superblock if its
> > > + * caller is able to observe any changes done by the remount. This holds until
> > > + * sb_end_ro_state_change() is called.
> > > + */
> > > +static inline void sb_start_ro_state_change(struct super_block *sb)
> > > +{
> > > +	WRITE_ONCE(sb->s_readonly_remount, 1);
> > > +	/* The barrier pairs with the barrier in mnt_is_readonly() */
> > > +	smp_wmb();
> > > +}
> > 
> > I'm not sure how this wmb pairs with the memory barrier in
> > mnt_is_readonly() to provide the correct behavior. The barrier in
> > mnt_is_readonly() happens after it checks s_readonly_remount, so
> > the s_readonly_remount in mnt_is_readonly is not ordered in any way
> > against this barrier.
> > 
> > The barrier in mnt_is_readonly() ensures that the loads of SB_RDONLY
> > and MNT_READONLY are ordered after the load of s_readonly_remount,
> > but we don't change those flags until a long way after
> > s_readonly_remount is set.
> 
> You are correct. I've reread the code and the ordering that matters is
> __mnt_want_write() on the read side and reconfigure_super() on the write
> side. In particular for RW->RO transition we must make sure that: If
> __mnt_want_write() does not see MNT_WRITE_HOLD set, it will see
> s_readonly_remount set. There is another set of barriers in those functions
> that makes sure sb_prepare_remount_readonly() sees incremented mnt_writers
> if __mnt_want_write() did not see MNT_WRITE_HOLD set, but that's a
> different story.

Yup, as I said to Christian, there is nothing in the old or new code
that even hints at an interaction with MNT_WRITE_HOLD or
__mnt_want_write() here. I couldn't make that jump from reading the
code, and so the memory barrier placement made no sense at all.

> Hence the barrier in sb_start_ro_state_change() pairs with the
> smp_rmb() barrier in __mnt_want_write() before the
> mnt_is_readonly() check at the end of the function. I'll fix my
> patch, thanks for the correction.

Please also update the mnt_[un]hold_writers() and __mnt_want_write()
documentation to point at the new sb_start/end_ro_state_change
helpers, as all the memory barriers in this code are tightly
coupled.
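
To make the coupling concrete, here is a rough userspace model of the
pairing described above, written with C11 atomics. It is only a sketch
under stated assumptions, not the kernel code: every model_* name is
hypothetical, the release and acquire fences merely stand in for
smp_wmb() and smp_rmb() (they are somewhat stronger), and the
sequentially consistent increment stands in for the writer count
increment followed by smp_mb().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the kernel objects discussed above. */
struct model_sb {
	atomic_int s_readonly_remount;	/* models sb->s_readonly_remount */
	atomic_bool rdonly;		/* models SB_RDONLY */
};

struct model_mnt {
	struct model_sb *sb;
	atomic_bool write_hold;		/* models MNT_WRITE_HOLD */
	atomic_int writers;		/* models the per-mount writer count */
};

/* Write side: models sb_start_ro_state_change() from the patch. */
static void model_sb_start_ro_state_change(struct model_sb *sb)
{
	atomic_store_explicit(&sb->s_readonly_remount, 1, memory_order_relaxed);
	/*
	 * Stand-in for smp_wmb(): the store above is ordered before the
	 * later clearing of write_hold. Pairs with the acquire fence in
	 * model_mnt_want_write().
	 */
	atomic_thread_fence(memory_order_release);
}

/* Write side: models dropping MNT_WRITE_HOLD once the state change started. */
static void model_mnt_unhold_writers(struct model_mnt *mnt)
{
	atomic_store_explicit(&mnt->write_hold, false, memory_order_relaxed);
}

/* Read side: models mnt_is_readonly(). */
static bool model_mnt_is_readonly(struct model_mnt *mnt)
{
	if (atomic_load_explicit(&mnt->sb->s_readonly_remount,
				 memory_order_relaxed))
		return true;
	/* Stand-in for smp_rmb(): load rdonly after s_readonly_remount. */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&mnt->sb->rdonly, memory_order_relaxed);
}

/* Read side: models the shape of __mnt_want_write(). */
static int model_mnt_want_write(struct model_mnt *mnt)
{
	assert(mnt && mnt->sb);

	/* Seq-cst RMW: stands in for the increment followed by smp_mb(). */
	atomic_fetch_add(&mnt->writers, 1);

	/* Wait for a concurrent remount to drop its hold on writers. */
	while (atomic_load_explicit(&mnt->write_hold, memory_order_relaxed))
		;

	/*
	 * Stand-in for smp_rmb(): having seen write_hold clear, we must
	 * also see s_readonly_remount set by the write side.
	 */
	atomic_thread_fence(memory_order_acquire);

	if (model_mnt_is_readonly(mnt)) {
		atomic_fetch_sub(&mnt->writers, 1);
		return -EROFS;
	}
	return 0;
}
```

The point of the model is the fence-to-fence pairing: a reader that
observes write_hold cleared is guaranteed to observe
s_readonly_remount set, which is the RW->RO property described above.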

Thanks!

-Dave.
-- 
Dave Chinner
david@fromorbit.com
