Linux block layer
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	 Matthew Wilcox <willy@infradead.org>,
	linux-block@vger.kernel.org
Subject: Re: [PATCH 1/2] block: handle BLK_OPEN_RESTRICT_WRITES correctly
Date: Tue, 26 Mar 2024 16:47:07 +0100	[thread overview]
Message-ID: <20240326-langanhaltend-anbot-1c1e74cc198f@brauner> (raw)
In-Reply-To: <20240326133107.bnjx2rjf5l6yijgz@quack3>

On Tue, Mar 26, 2024 at 02:31:07PM +0100, Jan Kara wrote:
> On Tue 26-03-24 14:17:20, Christian Brauner wrote:
> > On Tue, Mar 26, 2024 at 01:57:15PM +0100, Jan Kara wrote:
> > > On Sat 23-03-24 17:11:19, Christian Brauner wrote:
> > > > Last kernel release we introduce CONFIG_BLK_DEV_WRITE_MOUNTED. By
> > > > default this option is set. When it is set the long-standing behavior
> > > > of being able to write to mounted block devices is enabled.
> > > > 
> > > > But in order to guard against unintended corruption by writing to the
> > > > block device buffer cache CONFIG_BLK_DEV_WRITE_MOUNTED can be turned
> > > > off. In that case it isn't possible to write to mounted block devices
> > > > anymore.
> > > > 
> > > > A filesystem may open its block devices with BLK_OPEN_RESTRICT_WRITES
> > > > which disallows concurrent BLK_OPEN_WRITE access. When we still had the
> > > > bdev handle around we could recognize BLK_OPEN_RESTRICT_WRITES because
> > > > the mode was passed around. Since we managed to get rid of the bdev
> > > > handle we changed that logic to recognize BLK_OPEN_RESTRICT_WRITES based
> > > > on whether the file was opened writable and writes to that block device
> > > > are blocked. That logic doesn't work because we do allow
> > > > BLK_OPEN_RESTRICT_WRITES to be specified without BLK_OPEN_WRITE.
> > > > 
> > > > So fix the detection logic. Use O_EXCL as an indicator that
> > > > BLK_OPEN_RESTRICT_WRITES has been requested. We do the exact same thing
> > > > for pidfds where O_EXCL means that this is a pidfd that refers to a
> > > > thread. For userspace open paths O_EXCL will never be retained but for
> > > > internal opens where we open files that are never installed into a file
> > > > descriptor table this is fine.
> > > > 
> > > > Note that BLK_OPEN_RESTRICT_WRITES is an internal only flag that cannot
> > > > directly be raised by userspace. It is implicitly raised during
> > > > mounting.
> > > > 
> > > > Passes xftests and blktests with CONFIG_BLK_DEV_WRITE_MOUNTED set and
> > > > unset.
> > > > 
> > > > Fixes: 321de651fa56 ("block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access")
> > > > Reported-by: Matthew Wilcox <willy@infradead.org>
> > > > Link: https://lore.kernel.org/r/ZfyyEwu9Uq5Pgb94@casper.infradead.org
> > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > 
> > > The fix looks correct but admittedly it looks a bit hacky. I'd prefer
> > > storing the needed information in some other flag, preferably one that does
> > > not already have a special meaning with block devices. But FMODE_ space is
> > > exhausted and don't see another easy solution. So I guess:
> > 
> > Admittedly, it's not pretty but we're abusing O_EXCL already for block
> > devices. We do have FMODE_* available. We could add
> > FMODE_WRITE_RESTRICTED because we have two bits left.
> 
> Yeah, I'm mostly afraid that a few years down the road someone comes and
> adds code thinking that file->f_flags & O_EXCL means this is exclusive bdev
> open. Which will be kind of natural assumption but subtly wrong... So to
> make the code more robust, I'd prefer burning a fmode flag for this rather
> than abusing O_EXCL.

Ok, done and resend. Thanks for the feedback! I was rather focussed on
not using a bit but I agree it's likely the right thing to do!

  parent reply	other threads:[~2024-03-26 15:47 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-23 14:54 [PATCH] block: handle BLK_OPEN_RESTRICT_WRITES correctly Christian Brauner
2024-03-23 15:59 ` Christian Brauner
2024-03-23 16:11 ` [PATCH 1/2] " Christian Brauner
2024-03-23 16:11   ` [PATCH 2/2] [RFC]: block: count BLK_OPEN_RESTRICT_WRITES openers Christian Brauner
2024-03-26 13:24     ` Jan Kara
2024-03-25 11:51   ` [PATCH 1/2] block: handle BLK_OPEN_RESTRICT_WRITES correctly Yu Kuai
2024-03-25 12:04     ` Christian Brauner
2024-03-25 13:52       ` Yu Kuai
2024-03-25 13:54     ` Christian Brauner
2024-03-26  1:32       ` Yu Kuai
2024-03-26 12:57   ` Jan Kara
2024-03-26 13:17     ` Christian Brauner
2024-03-26 13:31       ` Jan Kara
2024-03-26 15:46         ` [PATCH v2] " Christian Brauner
2024-03-26 17:25           ` Christoph Hellwig
2024-03-26 22:42           ` Jan Kara
2024-03-26 15:47         ` Christian Brauner [this message]
2024-03-27 12:01   ` [PATCH 1/2] " Christian Brauner
2024-03-29  4:56   ` Matthew Wilcox
2024-03-29 12:10     ` Christian Brauner
2024-03-29 15:11       ` Christian Brauner
2024-03-29 15:24         ` Christian Brauner
2024-04-03  6:04       ` Christian Brauner
2024-04-03 19:22         ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240326-langanhaltend-anbot-1c1e74cc198f@brauner \
    --to=brauner@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-block@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox