* Re: Should seed device be allowed to be mounted multiple times?
From: Qu Wenruo @ 2025-08-05 0:52 UTC
To: Anand Jain, Qu Wenruo, Qu Wenruo, linux-btrfs, David Sterba,
linux-fsdevel@vger.kernel.org
On 2025/8/5 10:06, Anand Jain wrote:
>
>
>>> Thanks for the comments.
>>> Our seed block device use-case doesn’t fall under the kind of risk that
>>> BLK_OPEN_RESTRICT_WRITES is meant to guard against—it’s not a typical
>>> multi-FS RW setup. Seed devices are readonly, so it might be reasonable
>>> to handle this at the block layer—or maybe it’s not feasible.
>
>
>> Read-only doesn't prevent the device from being removed suddenly.
>
> I don't see how this is related to the BLK_OPEN_RESTRICT_WRITES flag.
> Can you clarify?
It's not related to that flag. I'm talking about fs_bdev_mark_dead()
and the remaining three callbacks.
Those callbacks all depend on the bdev holder to grab a super block.
Thus a block device should not, and cannot, have multiple super blocks.
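To illustrate the point about the holder, here is a minimal userspace sketch. It uses toy types rather than the kernel's structures, and every `toy_` name is hypothetical. It only shows a device event being routed through a single holder pointer to exactly one super block.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; not the kernel's definitions. */
struct toy_super_block {
	const char *label;
	int shut_down;		/* set once the fs was told its device died */
};

struct toy_bdev {
	void *holder;		/* a single slot: one owner per device */
};

/*
 * Loosely modelled on the idea behind fs_bdev_mark_dead(): the block
 * layer only has the bdev, so the fs is found via the holder pointer.
 * Returns 0 if a super block was notified, -1 if there is no holder.
 */
static int toy_mark_dead(struct toy_bdev *bdev)
{
	struct toy_super_block *sb;

	assert(bdev != NULL);
	sb = bdev->holder;
	if (sb == NULL)
		return -1;
	sb->shut_down = 1;
	return 0;
}
```

With two toy super blocks sitting on the same device, only the one stored in `holder` is notified; the other never learns that the device went away.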
>
> ------
> /* open is exclusive wrt all other BLK_OPEN_WRITE opens to the device */
> #define BLK_OPEN_RESTRICT_WRITES ((__force blk_mode_t)(1 << 5))
> ------
>
>> You still didn't know that the whole fs_holder_ops is based on the
>> assumption that one block device should only belong to one mounted fs.
>
> You're missing the point: after a sprout, Btrfs internally becomes a new
> filesystem with a new FSID. Some may call it insane—but it's different,
> useful, and it works.
I totally know that. It's you who doesn't understand how the bdev holder
works, and who isn't willing to spend any time reading the details of
bdev_open().
Just search for @holder inside that function: even without the
BLK_OPEN_RESTRICT_WRITES flag, the open will still fail at bd_may_claim()
due to the holder (super block) mismatch.
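The holder check can be pictured with a small userspace model. This is a sketch under stated assumptions, not the kernel's code: the `toy_` names are hypothetical, and reporting the mismatch as `-EBUSY` is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model of a block device's ownership state (not the kernel struct). */
struct toy_bdev {
	const void *holder;	/* current owner, NULL if unclaimed */
	int holders;		/* number of successful claims */
};

/* A claim is allowed if the device is free or already ours. */
static int toy_may_claim(const struct toy_bdev *bdev, const void *holder)
{
	assert(bdev != NULL && holder != NULL);
	if (bdev->holder == NULL || bdev->holder == holder)
		return 0;
	return -EBUSY;		/* assumed error code for a holder mismatch */
}

/* Record the claim only after the holder check has passed. */
static int toy_claim(struct toy_bdev *bdev, const void *holder)
{
	int ret = toy_may_claim(bdev, holder);

	if (ret)
		return ret;
	bdev->holder = holder;
	bdev->holders++;
	return 0;
}
```

In this model a second claim by the same holder succeeds, while a claim by a different holder is refused regardless of any write-restriction flag, because no flag is consulted at all.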
>
> During that transition, fs_holder (or equivalent) needs to be updated to
> reflect the change. If that's not currently possible, we may need to add
> support for it.
>
> The problem is that fs_holder_ops still sees it as a seed device, which
> is risky—we don’t know what else could break if the FSID change isn’t
> properly handled.
Nope, it's super simple: you just cannot have a block device with
two different holders.
>
>> And I see that assumption completely valid.
>>
>> I don't see any reason why anyone sane would want to mount the sprouted
>> fs and the seed device at the same time.
>
> Neither of us has data on how it’s being used.
Just read all the other filesystems' code.
Either they pass the super block as the bdev holder, so that the fs can
easily be grabbed from the bdev through bdev_super_lock(), or, as bcachefs
does, they do a similar thing without using the existing helpers.
> And as I’ve hinted, it
> does violate kABI from a technical standpoint.
>> If the use case is to sprout a fs based on the seed device multiple
>> times, it's still possible, just unmount the sprout fs before mounting
>> the seed device again.
>
> In a datacenter environment, unmounting isn’t always a viable option.
If you're already mounting the fs, why can't you unmount it?
If you're talking about rootfs, it's no deal breaker: just remove the
seed device from the sprout fs, then mount the seed device again.
>
> Now that there’s a regression and a feature has been broken, let’s not
> shift the discussion to whether that feature was useful. I prefer to
> keep things technical—not personal—and I expect respectful communication
> to be mutual, not taken for granted.
I have explained the technical details enough. If you are not willing to
understand, sure, call it whatever you want.
>
> Btrfs has some unique behaviors, and it’s possible we’ll need changes in
> the block layer or fs_holder_ops. That still needs to be figured out.
Unique doesn't mean correct or sane.
And the seed device is nothing special. If you don't want to accept that one
mounted block device should only belong to one mounted fs, sure, go ahead
and see what everyone else thinks.
>
> Thanks, Anand
>
* Re: Should seed device be allowed to be mounted multiple times?
From: Christian Brauner @ 2025-08-05 12:43 UTC
To: Qu Wenruo, Anand Jain
Cc: Josef Bacik, Qu Wenruo, Qu Wenruo, linux-btrfs, David Sterba,
linux-fsdevel@vger.kernel.org
On Tue, Aug 05, 2025 at 10:22:49AM +0930, Qu Wenruo wrote:
>
>
> On 2025/8/5 10:06, Anand Jain wrote:
> >
> >
> > > > Thanks for the comments.
> > > > Our seed block device use-case doesn’t fall under the kind of risk that
> > > > BLK_OPEN_RESTRICT_WRITES is meant to guard against—it’s not a typical
> > > > multi-FS RW setup. Seed devices are readonly, so it might be reasonable
> > > > to handle this at the block layer—or maybe it’s not feasible.
> >
> >
> > > Read-only doesn't prevent the device from being removed suddenly.
> >
> > I don't see how this is related to the BLK_OPEN_RESTRICT_WRITES flag.
> > Can you clarify?
>
> It's not related to that flag. I'm talking about fs_bdev_mark_dead()
> and the remaining three callbacks.
>
> Those callbacks all depend on the bdev holder to grab a super block.
>
> Thus a block device should not, and cannot, have multiple super blocks.
I'm pretty sure you can't just break the seed device sharing use-case
without causing a lot of regressions...
If you know what the seed devices are then you can change the code to
simply use the btrfs filesystem type as the holder without any holder
operations but just for seed devices. Then seed devices can be opened
by/shared with any btrfs filesystem.
The only restriction is that you cannot use a device as a seed device
that another btrfs filesystem uses as a non-seed device because then it
will be fully owned by the other btrfs filesystem. But Josef tells me
you can only use it as a seed device anyway.
IOW, if you have a concept of shareable devices between different btrfs
filesystems then it's fine to reflect that in the code. If really needed
you can later add custom block holder ops for seed devices so you can
e.g., iterate through all filesystems that share the device.
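As a rough illustration of the idea above, the following userspace sketch picks the holder per device: a shared, filesystem-type-wide holder for seed devices and the individual super block for everything else. All `toy_` names are hypothetical, the `-EBUSY` return is an assumption, and this is not a proposal for the actual btrfs code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's types. */
struct toy_fs_type { const char *name; };
struct toy_super_block { const char *label; };
struct toy_bdev { const void *holder; int holders; };

/* One object shared by every mounted instance of the filesystem type. */
static struct toy_fs_type toy_btrfs_type = { "btrfs" };

/* Seed devices get the shared holder, other devices the super block. */
static const void *toy_pick_holder(const struct toy_super_block *sb,
				   int is_seed)
{
	return is_seed ? (const void *)&toy_btrfs_type : (const void *)sb;
}

/* Open succeeds only if the device is free or held by the same holder. */
static int toy_open(struct toy_bdev *bdev, const struct toy_super_block *sb,
		    int is_seed)
{
	const void *holder;

	assert(bdev != NULL && sb != NULL);
	holder = toy_pick_holder(sb, is_seed);
	if (bdev->holder != NULL && bdev->holder != holder)
		return -EBUSY;	/* assumed error code for a holder mismatch */
	bdev->holder = holder;
	bdev->holders++;
	return 0;
}
```

In this model any number of filesystems can open the same device as a seed, while a device that one filesystem already owns as a regular device cannot be opened as a seed by another, which matches the restriction described above.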
* Re: Should seed device be allowed to be mounted multiple times?
From: Qu Wenruo @ 2025-08-05 22:20 UTC
To: Christian Brauner, Anand Jain
Cc: Josef Bacik, Qu Wenruo, Qu Wenruo, linux-btrfs, David Sterba,
linux-fsdevel@vger.kernel.org
On 2025/8/5 22:13, Christian Brauner wrote:
> On Tue, Aug 05, 2025 at 10:22:49AM +0930, Qu Wenruo wrote:
>>
>>
>> On 2025/8/5 10:06, Anand Jain wrote:
>>>
>>>
>>>>> Thanks for the comments.
>>>>> Our seed block device use-case doesn’t fall under the kind of risk that
>>>>> BLK_OPEN_RESTRICT_WRITES is meant to guard against—it’s not a typical
>>>>> multi-FS RW setup. Seed devices are readonly, so it might be reasonable
>>>>> to handle this at the block layer—or maybe it’s not feasible.
>>>
>>>
>>>> Read-only doesn't prevent the device from being removed suddenly.
>>>
>>> I don't see how this is related to the BLK_OPEN_RESTRICT_WRITES flag.
>>> Can you clarify?
>>
>> It's not related to that flag. I'm talking about fs_bdev_mark_dead()
>> and the remaining three callbacks.
>>
>> Those callbacks all depend on the bdev holder to grab a super block.
>>
>> Thus a block device should not, and cannot, have multiple super blocks.
>
> I'm pretty sure you can't just break the seed device sharing use-case
> without causing a lot of regressions...
It doesn't affect that much: we can still share the same seed device
among all the different sprout fses, but only one of them can be mounted
at a time.
And even with that limitation, it won't affect most (or any) real-world
use cases.
Even in the most complex case, where a seed device backs the rootfs and we
want to sprout the rootfs again, one can just remove the seed device from
the current rootfs and then mount the seed device again.
>
> If you know what the seed devices are then you can change the code to
> simply use the btrfs filesystem type as the holder without any holder
> operations but just for seed devices. Then seed devices can be opened
> by/shared with any btrfs filesystem.
But we will lose all the bdev related events.
We still want to sync/freeze/thaw the real sprouted fs in the end.
>
> The only restriction is that you cannot use a device as a seed device
> that another btrfs filesystem uses as a non-seed device because then it
> will be fully owned by the other btrfs filesystem. But Josef tells me
> you can only use it as a seed device anyway.
>
> IOW, if you have a concept of shareable devices between different btrfs
> filesystems then it's fine to reflect that in the code. If really needed
> you can later add custom block holder ops for seed devices so you can
> e.g., iterate through all filesystems that share the device.
Sure, it's possible, with a lot of extra code to look up which filesystems
the seed device belongs to, plus an extra proxy for all the bdev events.
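For a sense of what such a proxy involves, here is a minimal userspace sketch. It assumes a fixed-size list of users per seed device; every `toy_` name and the `TOY_MAX_USERS` limit are hypothetical. A single device event is fanned out to every filesystem sharing the seed device.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_USERS 8		/* arbitrary limit for the sketch */

/* Toy stand-ins; not the kernel's or btrfs's structures. */
struct toy_super_block {
	const char *label;
	int frozen;
};

struct toy_seed_dev {
	struct toy_super_block *users[TOY_MAX_USERS];
	size_t nr_users;
};

/* Register a filesystem as a user of the shared seed device. */
static int toy_seed_attach(struct toy_seed_dev *seed,
			   struct toy_super_block *sb)
{
	assert(seed != NULL && sb != NULL);
	if (seed->nr_users == TOY_MAX_USERS)
		return -1;
	seed->users[seed->nr_users++] = sb;
	return 0;
}

/* Proxy: forward one freeze event to every fs sharing the device. */
static size_t toy_seed_freeze(struct toy_seed_dev *seed)
{
	size_t i;

	assert(seed != NULL);
	for (i = 0; i < seed->nr_users; i++)
		seed->users[i]->frozen = 1;
	return seed->nr_users;	/* number of filesystems notified */
}
```

Even this toy version needs its own bookkeeping for attach (and, in a real implementation, detach and locking), which is the extra code referred to above.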
But I'd say the seed device was not well specified from the very
beginning, which has resulted in a lot of "creative" but impractical
use cases.
Yes, this will cause some regressions, but I'd prefer sounder
and simpler logic for the whole seed device, with minimal impact on the
most common existing use cases.
Thanks,
Qu
* Re: Should seed device be allowed to be mounted multiple times?
From: Christian Brauner @ 2025-08-08 14:14 UTC
To: Qu Wenruo
Cc: Anand Jain, Josef Bacik, Qu Wenruo, Qu Wenruo, linux-btrfs,
David Sterba, linux-fsdevel@vger.kernel.org
On Wed, Aug 06, 2025 at 07:50:06AM +0930, Qu Wenruo wrote:
>
>
> On 2025/8/5 22:13, Christian Brauner wrote:
> > On Tue, Aug 05, 2025 at 10:22:49AM +0930, Qu Wenruo wrote:
> > >
> > >
> > > On 2025/8/5 10:06, Anand Jain wrote:
> > > >
> > > >
> > > > > > Thanks for the comments.
> > > > > > Our seed block device use-case doesn’t fall under the kind of risk that
> > > > > > BLK_OPEN_RESTRICT_WRITES is meant to guard against—it’s not a typical
> > > > > > multi-FS RW setup. Seed devices are readonly, so it might be reasonable
> > > > > > to handle this at the block layer—or maybe it’s not feasible.
> > > >
> > > >
> > > > > Read-only doesn't prevent the device from being removed suddenly.
> > > >
> > > > I don't see how this is related to the BLK_OPEN_RESTRICT_WRITES flag.
> > > > Can you clarify?
> > >
> > > It's not related to that flag. I'm talking about fs_bdev_mark_dead()
> > > and the remaining three callbacks.
> > >
> > > Those callbacks all depend on the bdev holder to grab a super block.
> > >
> > > Thus a block device should not, and cannot, have multiple super blocks.
> >
> > I'm pretty sure you can't just break the seed device sharing use-case
> > without causing a lot of regressions...
>
> It doesn't affect that much: we can still share the same seed device among
> all the different sprout fses, but only one of them can be mounted at a
> time.
>
> And even with that limitation, it won't affect most (or any) real-world use
> cases.
>
> Even in the most complex case, where a seed device backs the rootfs and we
> want to sprout the rootfs again, one can just remove the seed device from
> the current rootfs and then mount the seed device again.
>
> >
> > If you know what the seed devices are then you can change the code to
> > simply use the btrfs filesystem type as the holder without any holder
> > operations but just for seed devices. Then seed devices can be opened
> > by/shared with any btrfs filesystem.
>
> But we will lose all the bdev related events.
>
> We still want to sync/freeze/thaw the real sprouted fs in the end.
>
> >
> > The only restriction is that you cannot use a device as a seed device
> > that another btrfs filesystem uses as a non-seed device because then it
> > will be fully owned by the other btrfs filesystem. But Josef tells me
> > you can only use it as a seed device anyway.
> >
> > IOW, if you have a concept of shareable devices between different btrfs
> > filesystems then it's fine to reflect that in the code. If really needed
> > you can later add custom block holder ops for seed devices so you can
> > e.g., iterate through all filesystems that share the device.
>
> Sure, it's possible, with a lot of extra code to look up which filesystems
> the seed device belongs to, plus an extra proxy for all the bdev events.
>
>
> But I'd say the seed device was not well specified from the very
> beginning, which has resulted in a lot of "creative" but impractical use
> cases.
>
> Yes, this will cause some regressions, but I'd prefer sounder and
> simpler logic for the whole seed device, with minimal impact on the most
> common existing use cases.
Ok, I'm not in a position to argue this effectively. If you think you can
reasonably get away with this regression, so be it. But if this ends up
in a total revert of the conversion even though we'd have an alternative
solution, I'm not going to be happy...