linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Jan Kara <jack@suse.cz>
Cc: Yu Kuai <yukuai1@huaweicloud.com>, Christoph Hellwig <hch@lst.de>,
	brauner@kernel.org, axboe@kernel.dk,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	yi.zhang@huawei.com, yangerkun@huawei.com,
	"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [RFC v4 linux-next 19/19] fs & block: remove bdev->bd_inode
Date: Fri, 22 Mar 2024 14:57:28 +0000	[thread overview]
Message-ID: <20240322145728.GN538574@ZenIV> (raw)
In-Reply-To: <20240322131030.pxbvtubien2t27zw@quack3>

On Fri, Mar 22, 2024 at 02:10:30PM +0100, Jan Kara wrote:
> > End result:
> > 
> > * old ->i_private leaked (already grabbed by your get_bdev_file())
> > * ->bd_openers at 1 (after your bdev_open() gets through)
> > * ->i_private left NULL.
> > 
> > Christoph, could we please get rid of that atomic_t nonsense?
> > It only confuses people into brainos like that.  It really
> > needs ->open_mutex for any kind of atomicity.
> 
> Well, there are a couple of places where we end up reading bd_openers
> without ->open_mutex. Sure these places are racy wrt other opens / closes
> so they need to be careful but we want to make sure we read back at least
> some sane value which is not guaranteed with normal int and compiler
> possily playing weird tricks when updating it. But yes, we could convert
> the atomic_t to using READ_ONCE + WRITE_ONCE in appropriate places to avoid
> these issues and make it more obvious bd_openers are not really handled in
> an atomic way.

What WRITE_ONE()?  We really shouldn't modify it without ->open_mutex; do
we ever do that?  In current mainline:

in blkdev_get_whole(), both callers under ->open_mutex:
block/bdev.c:671:       if (!atomic_read(&bdev->bd_openers))
block/bdev.c:675:       atomic_inc(&bdev->bd_openers);

in blkdev_put_whole(), the sole caller under ->open_mutex:
block_mutex/bdev.c:681:       if (atomic_dec_and_test(&bdev->bd_openers))

in blkdev_get_part(), both callers under ->open_mutex:
block/bdev.c:700:       if (!atomic_read(&part->bd_openers)) {
block/bdev.c:704:       atomic_inc(&part->bd_openers);

in blkdev_put_whole(), the sole caller under ->open_mutex:
block/bdev.c:741:       if (atomic_dec_and_test(&part->bd_openers)) {

in bdev_release(), a deliberately racy reader, commented as such:
block/bdev.c:1032:      if (atomic_read(&bdev->bd_openers) == 1)

in sync_bdevs(), under ->open_mutex:
block/bdev.c:1163:              if (!atomic_read(&bdev->bd_openers)) {

in bdev_del_partition(), under ->open_mutex:
block/partitions/core.c:460:    if (atomic_read(&part->bd_openers))

and finally, in disk_openers(), a racy reader:
include/linux/blkdev.h:231:     return atomic_read(&disk->part0->bd_openers);

So that's two READ_ONCE() and a bunch of reads and writes under ->open_mutex.
Callers of disk_openers() need to be careful and looking through those...
Some of them are under ->open_mutex (either explicitly, or as e.g. lo_release()
called only via bdev ->release(), which comes only under ->open_mutex), but
four of them are not:

arch/um/drivers/ubd_kern.c:1023:                if (disk_openers(ubd_dev->disk))
in ubd_remove().  Racy, possibly a bug.  AFAICS, it's accessible through UML
console and there's nothing to stop it from racing with open().

drivers/block/loop.c:1245:      if (disk_openers(lo->lo_disk) > 1) {
in loop_clr_fd().  Under loop's private lock, but that's likely to
be a race - ->bd_openers updates are not under that.  Note that
there's no ->open() for /dev/loop, BTW...

drivers/block/loop.c:2161:      if (lo->lo_state != Lo_unbound || disk_openers(lo->lo_disk) > 0) {
in loop_control_remove().  Similar to the previous one, except that
it's done out of band *and* it doesn't have the "autoclean" logics
to work around udev, lovingly described in the comment before the
call in loop_clr_fd().

drivers/block/nbd.c:1279:       if (disk_openers(nbd->disk) > 1)
in nbd_bdev_reset().  Under nbd private mutex (->config_lock),
so there's some exclusion with nbd_open(), but ->bd_openers change
comes outside of that.  Might or might not be a bug - I need to wake
up properly to look through that.

  reply	other threads:[~2024-03-22 14:57 UTC|newest]

Thread overview: 97+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-22 12:45 [RFC v4 linux-next 00/19] fs & block: remove bdev->bd_inode Yu Kuai
2024-02-22 12:45 ` [RFC v4 linux-next 01/19] block: move two helpers into bdev.c Yu Kuai
2024-03-15 14:31   ` Jan Kara
2024-03-17 21:19   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 02/19] block: remove sync_blockdev_nowait() Yu Kuai
2024-03-15 14:34   ` Jan Kara
2024-03-17 21:19   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 03/19] block: remove sync_blockdev_range() Yu Kuai
2024-03-15 14:37   ` Jan Kara
2024-03-17 21:21   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 04/19] block: prevent direct access of bd_inode Yu Kuai
2024-03-15 14:44   ` Jan Kara
2024-03-17 21:23   ` Christoph Hellwig
2024-03-22  5:44   ` Al Viro
2024-02-22 12:45 ` [RFC v4 linux-next 05/19] bcachefs: remove dead function bdev_sectors() Yu Kuai
2024-03-15 14:42   ` Jan Kara
2024-03-17 21:23   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 06/19] cramfs: prevent direct access of bd_inode Yu Kuai
2024-03-15 14:44   ` Jan Kara
2024-03-17 21:23   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 07/19] erofs: " Yu Kuai
2024-03-15 14:45   ` Jan Kara
2024-03-17 21:24   ` Christoph Hellwig
2024-03-18  2:39   ` Gao Xiang
2024-02-22 12:45 ` [RFC v4 linux-next 08/19] nilfs2: " Yu Kuai
2024-03-15 14:49   ` Jan Kara
2024-03-17 21:24   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 09/19] gfs2: " Yu Kuai
2024-03-15 14:54   ` Jan Kara
2024-03-17 21:24   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 10/19] s390/dasd: use bdev api in dasd_format() Yu Kuai
2024-03-15 14:55   ` Jan Kara
2024-03-17 21:25   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 11/19] btrfs: prevent direct access of bd_inode Yu Kuai
2024-03-15 15:09   ` Jan Kara
2024-03-17 21:25   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 12/19] ext4: remove block_device_ejected() Yu Kuai
2024-02-22 12:45 ` [RFC v4 linux-next 13/19] ext4: prevent direct access of bd_inode Yu Kuai
2024-03-15 14:58   ` Jan Kara
2024-03-17 21:25   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 14/19] jbd2: " Yu Kuai
2024-03-15 15:06   ` Jan Kara
2024-03-17 21:26   ` Christoph Hellwig
2024-03-18  1:10     ` Yu Kuai
2024-02-22 12:45 ` [RFC v4 linux-next 15/19] bcache: " Yu Kuai
2024-03-15 15:11   ` Jan Kara
2024-03-17 21:34   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 16/19] block2mtd: " Yu Kuai
2024-03-15 15:12   ` Jan Kara
2024-03-17 21:36   ` Christoph Hellwig
2024-02-22 12:45 ` [RFC v4 linux-next 17/19] dm-vdo: " Yu Kuai
2024-02-28 13:41   ` Christoph Hellwig
2024-03-18  9:11     ` Jan Kara
2024-03-18  9:19   ` Jan Kara
2024-03-18 13:38     ` Yu Kuai
2024-03-19  2:00       ` Matthew Sakai
2024-02-22 12:45 ` [RFC v4 linux-next 18/19] scsi: factor out a helper bdev_read_folio() from scsi_bios_ptable() Yu Kuai
2024-03-17 21:36   ` Christoph Hellwig
2024-03-18  1:12     ` Yu Kuai
2024-03-18  9:22   ` Jan Kara
2024-02-22 12:45 ` [RFC v4 linux-next 19/19] fs & block: remove bdev->bd_inode Yu Kuai
2024-03-17 21:38   ` Christoph Hellwig
2024-03-18  1:26     ` Yu Kuai
2024-03-18  1:32       ` Christoph Hellwig
2024-03-18  1:51         ` Yu Kuai
2024-03-18  7:19           ` Yu Kuai
2024-03-18 10:07             ` Christian Brauner
2024-03-18 10:29               ` Christian Brauner
2024-03-18 10:46                 ` Christian Brauner
2024-03-18 11:57                   ` Yu Kuai
2024-03-18 23:35                 ` Christoph Hellwig
2024-03-18 23:22             ` Christoph Hellwig
2024-03-19  8:26               ` Yu Kuai
2024-03-21 11:27                 ` Jan Kara
2024-03-21 12:15                   ` Yu Kuai
2024-03-22  6:37                     ` Al Viro
2024-03-22  6:39                       ` Al Viro
2024-03-22  6:52                         ` Yu Kuai
2024-03-22 12:57                           ` Jan Kara
2024-03-22 13:57                             ` Christian Brauner
2024-03-22 15:43                           ` Al Viro
2024-03-22 16:16                             ` Al Viro
2024-03-22  6:33                 ` Al Viro
2024-03-22  7:09                   ` Yu Kuai
2024-03-22 16:01                     ` Al Viro
2024-03-22 13:10                   ` Jan Kara
2024-03-22 14:57                     ` Al Viro [this message]
2024-03-25  1:06                       ` Christoph Hellwig
2024-02-28 13:42 ` [RFC v4 linux-next 00/19] " Christoph Hellwig
2024-03-15 12:08 ` Yu Kuai
2024-03-15 13:54   ` Christian Brauner
2024-03-16  2:49     ` Yu Kuai
2024-03-18  9:39       ` Christian Brauner
2024-03-19  1:18         ` Yu Kuai
2024-03-19  1:43           ` Yu Kuai
2024-03-19  2:13             ` Matthew Sakai
2024-03-19  2:27               ` Yu Kuai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240322145728.GN538574@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=yangerkun@huawei.com \
    --cc=yi.zhang@huawei.com \
    --cc=yukuai1@huaweicloud.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).