Linux block layer
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	 Matthew Wilcox <willy@infradead.org>,
	linux-block@vger.kernel.org
Subject: Re: [PATCH 1/2] block: handle BLK_OPEN_RESTRICT_WRITES correctly
Date: Tue, 26 Mar 2024 14:17:20 +0100	[thread overview]
Message-ID: <20240326-eidesstattlich-abwahl-ddfb1c75c05e@brauner> (raw)
In-Reply-To: <20240326125715.i23eo6sh5223tdmc@quack3>

On Tue, Mar 26, 2024 at 01:57:15PM +0100, Jan Kara wrote:
> On Sat 23-03-24 17:11:19, Christian Brauner wrote:
> > Last kernel release we introduce CONFIG_BLK_DEV_WRITE_MOUNTED. By
> > default this option is set. When it is set the long-standing behavior
> > of being able to write to mounted block devices is enabled.
> > 
> > But in order to guard against unintended corruption by writing to the
> > block device buffer cache CONFIG_BLK_DEV_WRITE_MOUNTED can be turned
> > off. In that case it isn't possible to write to mounted block devices
> > anymore.
> > 
> > A filesystem may open its block devices with BLK_OPEN_RESTRICT_WRITES
> > which disallows concurrent BLK_OPEN_WRITE access. When we still had the
> > bdev handle around we could recognize BLK_OPEN_RESTRICT_WRITES because
> > the mode was passed around. Since we managed to get rid of the bdev
> > handle we changed that logic to recognize BLK_OPEN_RESTRICT_WRITES based
> > on whether the file was opened writable and writes to that block device
> > are blocked. That logic doesn't work because we do allow
> > BLK_OPEN_RESTRICT_WRITES to be specified without BLK_OPEN_WRITE.
> > 
> > So fix the detection logic. Use O_EXCL as an indicator that
> > BLK_OPEN_RESTRICT_WRITES has been requested. We do the exact same thing
> > for pidfds where O_EXCL means that this is a pidfd that refers to a
> > thread. For userspace open paths O_EXCL will never be retained but for
> > internal opens where we open files that are never installed into a file
> > descriptor table this is fine.
> > 
> > Note that BLK_OPEN_RESTRICT_WRITES is an internal only flag that cannot
> > directly be raised by userspace. It is implicitly raised during
> > mounting.
> > 
> > Passes xftests and blktests with CONFIG_BLK_DEV_WRITE_MOUNTED set and
> > unset.
> > 
> > Fixes: 321de651fa56 ("block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access")
> > Reported-by: Matthew Wilcox <willy@infradead.org>
> > Link: https://lore.kernel.org/r/ZfyyEwu9Uq5Pgb94@casper.infradead.org
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> 
> The fix looks correct but admittedly it looks a bit hacky. I'd prefer
> storing the needed information in some other flag, preferably one that does
> not already have a special meaning with block devices. But FMODE_ space is
> exhausted and don't see another easy solution. So I guess:

Admittedly, it's not pretty but we're abusing O_EXCL already for block
devices. We do have FMODE_* available. We could add
FMODE_WRITE_RESTRICTED because we have two bits left.

One other solution apart from FMODE_* I had come up with was even
nastier which would've been using the struct fd approach of using the
two available bits in the pointer. But that doesn't work because we have
stuff like dm that passes in strings which are byte aligned. And it's
arguably uglier.

  reply	other threads:[~2024-03-26 13:17 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-23 14:54 [PATCH] block: handle BLK_OPEN_RESTRICT_WRITES correctly Christian Brauner
2024-03-23 15:59 ` Christian Brauner
2024-03-23 16:11 ` [PATCH 1/2] " Christian Brauner
2024-03-23 16:11   ` [PATCH 2/2] [RFC]: block: count BLK_OPEN_RESTRICT_WRITES openers Christian Brauner
2024-03-26 13:24     ` Jan Kara
2024-03-25 11:51   ` [PATCH 1/2] block: handle BLK_OPEN_RESTRICT_WRITES correctly Yu Kuai
2024-03-25 12:04     ` Christian Brauner
2024-03-25 13:52       ` Yu Kuai
2024-03-25 13:54     ` Christian Brauner
2024-03-26  1:32       ` Yu Kuai
2024-03-26 12:57   ` Jan Kara
2024-03-26 13:17     ` Christian Brauner [this message]
2024-03-26 13:31       ` Jan Kara
2024-03-26 15:46         ` [PATCH v2] " Christian Brauner
2024-03-26 17:25           ` Christoph Hellwig
2024-03-26 22:42           ` Jan Kara
2024-03-26 15:47         ` [PATCH 1/2] " Christian Brauner
2024-03-27 12:01   ` Christian Brauner
2024-03-29  4:56   ` Matthew Wilcox
2024-03-29 12:10     ` Christian Brauner
2024-03-29 15:11       ` Christian Brauner
2024-03-29 15:24         ` Christian Brauner
2024-04-03  6:04       ` Christian Brauner
2024-04-03 19:22         ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240326-eidesstattlich-abwahl-ddfb1c75c05e@brauner \
    --to=brauner@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-block@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox