From: Al Viro <viro@zeniv.linux.org.uk>
To: Jan Kara <jack@suse.cz>
Cc: Yu Kuai <yukuai1@huaweicloud.com>, Christoph Hellwig <hch@lst.de>,
brauner@kernel.org, axboe@kernel.dk,
linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
yi.zhang@huawei.com, yangerkun@huawei.com,
"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [RFC v4 linux-next 19/19] fs & block: remove bdev->bd_inode
Date: Fri, 22 Mar 2024 14:57:28 +0000 [thread overview]
Message-ID: <20240322145728.GN538574@ZenIV> (raw)
In-Reply-To: <20240322131030.pxbvtubien2t27zw@quack3>
On Fri, Mar 22, 2024 at 02:10:30PM +0100, Jan Kara wrote:
> > End result:
> >
> > * old ->i_private leaked (already grabbed by your get_bdev_file())
> > * ->bd_openers at 1 (after your bdev_open() gets through)
> > * ->i_private left NULL.
> >
> > Christoph, could we please get rid of that atomic_t nonsense?
> > It only confuses people into brainos like that. It really
> > needs ->open_mutex for any kind of atomicity.
>
> Well, there are a couple of places where we end up reading bd_openers
> without ->open_mutex. Sure these places are racy wrt other opens / closes
> so they need to be careful but we want to make sure we read back at least
> some sane value which is not guaranteed with normal int and compiler
> possily playing weird tricks when updating it. But yes, we could convert
> the atomic_t to using READ_ONCE + WRITE_ONCE in appropriate places to avoid
> these issues and make it more obvious bd_openers are not really handled in
> an atomic way.
What WRITE_ONE()? We really shouldn't modify it without ->open_mutex; do
we ever do that? In current mainline:
in blkdev_get_whole(), both callers under ->open_mutex:
block/bdev.c:671: if (!atomic_read(&bdev->bd_openers))
block/bdev.c:675: atomic_inc(&bdev->bd_openers);
in blkdev_put_whole(), the sole caller under ->open_mutex:
block_mutex/bdev.c:681: if (atomic_dec_and_test(&bdev->bd_openers))
in blkdev_get_part(), both callers under ->open_mutex:
block/bdev.c:700: if (!atomic_read(&part->bd_openers)) {
block/bdev.c:704: atomic_inc(&part->bd_openers);
in blkdev_put_whole(), the sole caller under ->open_mutex:
block/bdev.c:741: if (atomic_dec_and_test(&part->bd_openers)) {
in bdev_release(), a deliberately racy reader, commented as such:
block/bdev.c:1032: if (atomic_read(&bdev->bd_openers) == 1)
in sync_bdevs(), under ->open_mutex:
block/bdev.c:1163: if (!atomic_read(&bdev->bd_openers)) {
in bdev_del_partition(), under ->open_mutex:
block/partitions/core.c:460: if (atomic_read(&part->bd_openers))
and finally, in disk_openers(), a racy reader:
include/linux/blkdev.h:231: return atomic_read(&disk->part0->bd_openers);
So that's two READ_ONCE() and a bunch of reads and writes under ->open_mutex.
Callers of disk_openers() need to be careful and looking through those...
Some of them are under ->open_mutex (either explicitly, or as e.g. lo_release()
called only via bdev ->release(), which comes only under ->open_mutex), but
four of them are not:
arch/um/drivers/ubd_kern.c:1023: if (disk_openers(ubd_dev->disk))
in ubd_remove(). Racy, possibly a bug. AFAICS, it's accessible through UML
console and there's nothing to stop it from racing with open().
drivers/block/loop.c:1245: if (disk_openers(lo->lo_disk) > 1) {
in loop_clr_fd(). Under loop's private lock, but that's likely to
be a race - ->bd_openers updates are not under that. Note that
there's no ->open() for /dev/loop, BTW...
drivers/block/loop.c:2161: if (lo->lo_state != Lo_unbound || disk_openers(lo->lo_disk) > 0) {
in loop_control_remove(). Similar to the previous one, except that
it's done out of band *and* it doesn't have the "autoclean" logics
to work around udev, lovingly described in the comment before the
call in loop_clr_fd().
drivers/block/nbd.c:1279: if (disk_openers(nbd->disk) > 1)
in nbd_bdev_reset(). Under nbd private mutex (->config_lock),
so there's some exclusion with nbd_open(), but ->bd_openers change
comes outside of that. Might or might not be a bug - I need to wake
up properly to look through that.
next prev parent reply other threads:[~2024-03-22 14:57 UTC|newest]
Thread overview: 97+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-22 12:45 [RFC v4 linux-next 00/19] fs & block: remove bdev->bd_inode Yu Kuai
2024-02-22 12:45 ` [RFC v4 linux-next 01/19] block: move two helpers into bdev.c Yu Kuai
2024-03-15 14:31 ` Jan Kara
2024-03-17 21:19 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 02/19] block: remove sync_blockdev_nowait() Yu Kuai
2024-03-15 14:34 ` Jan Kara
2024-03-17 21:19 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 03/19] block: remove sync_blockdev_range() Yu Kuai
2024-03-15 14:37 ` Jan Kara
2024-03-17 21:21 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 04/19] block: prevent direct access of bd_inode Yu Kuai
2024-03-15 14:44 ` Jan Kara
2024-03-17 21:23 ` Christoph Hellwig
2024-03-22 5:44 ` Al Viro
2024-02-22 12:45 ` [RFC v4 linux-next 05/19] bcachefs: remove dead function bdev_sectors() Yu Kuai
2024-03-15 14:42 ` Jan Kara
2024-03-17 21:23 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 06/19] cramfs: prevent direct access of bd_inode Yu Kuai
2024-03-15 14:44 ` Jan Kara
2024-03-17 21:23 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 07/19] erofs: " Yu Kuai
2024-03-15 14:45 ` Jan Kara
2024-03-17 21:24 ` Christoph Hellwig
2024-03-18 2:39 ` Gao Xiang
2024-02-22 12:45 ` [RFC v4 linux-next 08/19] nilfs2: " Yu Kuai
2024-03-15 14:49 ` Jan Kara
2024-03-17 21:24 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 09/19] gfs2: " Yu Kuai
2024-03-15 14:54 ` Jan Kara
2024-03-17 21:24 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 10/19] s390/dasd: use bdev api in dasd_format() Yu Kuai
2024-03-15 14:55 ` Jan Kara
2024-03-17 21:25 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 11/19] btrfs: prevent direct access of bd_inode Yu Kuai
2024-03-15 15:09 ` Jan Kara
2024-03-17 21:25 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 12/19] ext4: remove block_device_ejected() Yu Kuai
2024-02-22 12:45 ` [RFC v4 linux-next 13/19] ext4: prevent direct access of bd_inode Yu Kuai
2024-03-15 14:58 ` Jan Kara
2024-03-17 21:25 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 14/19] jbd2: " Yu Kuai
2024-03-15 15:06 ` Jan Kara
2024-03-17 21:26 ` Christoph Hellwig
2024-03-18 1:10 ` Yu Kuai
2024-02-22 12:45 ` [RFC v4 linux-next 15/19] bcache: " Yu Kuai
2024-03-15 15:11 ` Jan Kara
2024-03-17 21:34 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 16/19] block2mtd: " Yu Kuai
2024-03-15 15:12 ` Jan Kara
2024-03-17 21:36 ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 17/19] dm-vdo: " Yu Kuai
2024-02-28 13:41 ` Christoph Hellwig
2024-03-18 9:11 ` Jan Kara
2024-03-18 9:19 ` Jan Kara
2024-03-18 13:38 ` Yu Kuai
2024-03-19 2:00 ` Matthew Sakai
2024-02-22 12:45 ` [RFC v4 linux-next 18/19] scsi: factor out a helper bdev_read_folio() from scsi_bios_ptable() Yu Kuai
2024-03-17 21:36 ` Christoph Hellwig
2024-03-18 1:12 ` Yu Kuai
2024-03-18 9:22 ` Jan Kara
2024-02-22 12:45 ` [RFC v4 linux-next 19/19] fs & block: remove bdev->bd_inode Yu Kuai
2024-03-17 21:38 ` Christoph Hellwig
2024-03-18 1:26 ` Yu Kuai
2024-03-18 1:32 ` Christoph Hellwig
2024-03-18 1:51 ` Yu Kuai
2024-03-18 7:19 ` Yu Kuai
2024-03-18 10:07 ` Christian Brauner
2024-03-18 10:29 ` Christian Brauner
2024-03-18 10:46 ` Christian Brauner
2024-03-18 11:57 ` Yu Kuai
2024-03-18 23:35 ` Christoph Hellwig
2024-03-18 23:22 ` Christoph Hellwig
2024-03-19 8:26 ` Yu Kuai
2024-03-21 11:27 ` Jan Kara
2024-03-21 12:15 ` Yu Kuai
2024-03-22 6:37 ` Al Viro
2024-03-22 6:39 ` Al Viro
2024-03-22 6:52 ` Yu Kuai
2024-03-22 12:57 ` Jan Kara
2024-03-22 13:57 ` Christian Brauner
2024-03-22 15:43 ` Al Viro
2024-03-22 16:16 ` Al Viro
2024-03-22 6:33 ` Al Viro
2024-03-22 7:09 ` Yu Kuai
2024-03-22 16:01 ` Al Viro
2024-03-22 13:10 ` Jan Kara
2024-03-22 14:57 ` Al Viro [this message]
2024-03-25 1:06 ` Christoph Hellwig
2024-02-28 13:42 ` [RFC v4 linux-next 00/19] " Christoph Hellwig
2024-03-15 12:08 ` Yu Kuai
2024-03-15 13:54 ` Christian Brauner
2024-03-16 2:49 ` Yu Kuai
2024-03-18 9:39 ` Christian Brauner
2024-03-19 1:18 ` Yu Kuai
2024-03-19 1:43 ` Yu Kuai
2024-03-19 2:13 ` Matthew Sakai
2024-03-19 2:27 ` Yu Kuai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240322145728.GN538574@ZenIV \
--to=viro@zeniv.linux.org.uk \
--cc=axboe@kernel.dk \
--cc=brauner@kernel.org \
--cc=hch@lst.de \
--cc=jack@suse.cz \
--cc=linux-block@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=yangerkun@huawei.com \
--cc=yi.zhang@huawei.com \
--cc=yukuai1@huaweicloud.com \
--cc=yukuai3@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).