* LOOP_CONFIGURE uevents
[not found] ` <20231020120436.jgxdlawibpfuprnz@quack3>
@ 2023-10-23 14:08 ` Christian Brauner
2023-10-24 7:06 ` Christoph Hellwig
[not found] ` <20231023-ausgraben-berichten-d747aa50d876@brauner>
1 sibling, 1 reply; 6+ messages in thread
From: Christian Brauner @ 2023-10-23 14:08 UTC (permalink / raw)
To: Jan Kara, Christoph Hellwig; +Cc: linux-fsdevel, linux-block
> > And one final question:
> >
> > dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 1);
> > disk_force_media_change(lo->lo_disk);
> > /* more stuff */
> > dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 0);
> >
> > What exactly does that achieve? Does it just delay the delivery of the
> > uevent after the disk sequence number was changed in
> > disk_force_media_change()? Because it doesn't seem to actually prevent
> > uevent generation.
>
> Well, if you grep for dev_get_uevent_suppress() you'll notice there is
> exactly *one* place looking at it - the generation of ADD event when adding
> a partition bdev. I'm not sure what's the rationale behind this
> functionality.
I looked at dev_set_uevent_suppress() before and what it does is that it
fully prevents the generation of uevents for the kobject. It doesn't
just hold them back like the comments "uncork" in loop_change_fd() and
loop_configure() suggest:
static inline void dev_set_uevent_suppress(struct device *dev, int val)
{
dev->kobj.uevent_suppress = val;
}
and then
int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
char *envp_ext[])
{
[...]
/* skip the event, if uevent_suppress is set*/
if (kobj->uevent_suppress) {
pr_debug("kobject: '%s' (%p): %s: uevent_suppress "
"caused the event to drop!\n",
kobject_name(kobj), kobj, __func__);
return 0;
}
So commit 498ef5c777d9 ("loop: suppress uevents while reconfiguring the device")
tried to fix a problem where uevents were generated for LOOP_SET_FD
before LOOP_SET_STATUS* was called.
That broke LOOP_CONFIGURE because LOOP_CONFIGURE is supposed to be
LOOP_SET_FD + LOOP_SET_STATUS in one shot.
Then commit bb430b694226 ("loop: LOOP_CONFIGURE: send uevents for partitions")
fixed that by moving loop_reread_partitions() out of the uevent
suppression.
No you get uevents if you trigger a partition rescan but only if there
are actually partitions. What you never get however is a media change
event even though we do increment the disk sequence number and attach an
image to the loop device.
This seems problematic because we leave userspace unable to listen for
attaching images to a loop device. Shouldn't we regenerate the media
change event after we're done setting up the device and before the
partition rescan for LOOP_CONFIGURE?
^ permalink raw reply [flat|nested] 6+ messages in thread
* loop change deprecation bdev->bd_holder_lock
[not found] ` <20231023-ausgraben-berichten-d747aa50d876@brauner>
@ 2023-10-23 15:35 ` Christian Brauner
2023-10-24 7:03 ` Christoph Hellwig
0 siblings, 1 reply; 6+ messages in thread
From: Christian Brauner @ 2023-10-23 15:35 UTC (permalink / raw)
To: Jan Kara, Christoph Hellwig; +Cc: linux-fsdevel, linux-block, Jens Axboe
On Mon, Oct 23, 2023 at 09:40:44AM +0200, Christian Brauner wrote:
> > I agree that it breaks the original usecase for LOOP_CHANGE_FD. I'd say it
> > also shows nobody is likely using LOOP_CHANGE_FD anymore. Maybe time to try
> > deprecating it?
>
> Yeah, why don't we try that. In it's current form the concept isn't very
> useful and arguably is broken which no one has really noticed for years.
>
> * codesearch.debian.net has zero users
> * codesearch on github has zero users
> * cs.android.com has zero users
>
> Only definitions, strace, ltp, and stress-ng that sort of test this
> functionality. I say we try to deprecate this.
I just realized that if we're able to deprecate LOOP_CHANGE_FD we remove
one of the most problematic/weird cases for partitions and filesystems.
So right now, we can attach a filesystem image to a loop device and then
have partitions be claimed by different filesystems:
* create an image with two partitions
* format first partition as xfs, second as ext4
sudo losetup -f --show --read-only -P img1
sudo mount /dev/loop0p1 /mnt1
sudo mount /dev/loop0p2 /mnt2
What happens here is all very wrong afaict. When we issue, e.g., a
change fd event on the first partition:
sudo ./loop_change_fd /dev/loop0p1 img2
we call disk_force_media_change() but that only works on disk->part0
which means that we don't even cleanly shutdown the filesystem on the
partition we're trying to mess around with.
Later on we then completely fall on our faces when we fail to remove the
partitions because they're still in active use.
So I guess, ideally, we'd really force removal of the partitions and
shut down the filesystem but we can't easily do that because of locking
requirements where we can't acquire s_umount beneath open_mutex. Plus,
this wouldn't fit with the original semantics.
There's probably ways around this but I don't think that's actually
worth it for LOOP_CHANGE_FD. The places where forced removal of
partitions really matters is del_gendisk(). And there we've got it
working correctly and are able to actually remove partitions that still
have active users.
For now, we should give up any pretense that disk_force_media_change()
does anything useful for loop change fd and simply remove it completely.
It's either useless, or it breaks the original semantics of loop change
fd although I don't think anyone's ever used it the way I described
above.
So?
From 0d9b5c4963791f0c3ba8529ca56233be87122c59 Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Mon, 23 Oct 2023 17:24:56 +0200
Subject: [PATCH] loop: drop disk_force_media_change()
The disk_force_media_change() call is currently only useful for
filesystems and in that case it's not very useful. Consider attaching a
filesystem image to a loop device and then have partitions be claimed by
different filesystems:
* create an image with two partitions
* format first partition as xfs, second as ext4
sudo losetup -f --show --read-only -P img1
sudo mount /dev/loop0p1 /mnt1
sudo mount /dev/loop0p2 /mnt2
When we issue e.g., a loop change fd event on the first partition:
sudo ./loop_change_fd /dev/loop0p1 img2
we call disk_force_media_change() but that only works on disk->part0
which means that we don't even cleanly shutdown the filesystem on the
partition we're trying to mess around with.
Later on we then completely fall on our faces when we fail to remove the
partitions because they're still in active use.
Give up any pretense that disk_force_media_change() does anything useful
for loop change fd and simply remove it completely. It's either useless,
or it's meaningless for the original semantics of loop change fd. Anyone
who uses this is on their own anyway.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
drivers/block/loop.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 9f2d412fc560..87c98e35abac 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -603,7 +603,8 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
goto out_err;
/* and ... switch */
- disk_force_media_change(lo->lo_disk);
+ invalidate_bdev(lo->lo_disk->part0);
+ set_bit(GD_NEED_PART_SCAN, &lo->lo_disk->state);
blk_mq_freeze_queue(lo->lo_queue);
mapping_set_gfp_mask(old_file->f_mapping, lo->old_gfp_mask);
lo->lo_backing_file = file;
--
2.34.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: loop change deprecation bdev->bd_holder_lock
2023-10-23 15:35 ` loop change deprecation bdev->bd_holder_lock Christian Brauner
@ 2023-10-24 7:03 ` Christoph Hellwig
2023-10-24 8:44 ` loop change deprecation Christian Brauner
0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2023-10-24 7:03 UTC (permalink / raw)
To: Christian Brauner
Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, linux-block,
Jens Axboe
On Mon, Oct 23, 2023 at 05:35:25PM +0200, Christian Brauner wrote:
> I just realized that if we're able to deprecate LOOP_CHANGE_FD we remove
> one of the most problematic/weird cases for partitions and filesystems.
> change fd event on the first partition:
>
> sudo ./loop_change_fd /dev/loop0p1 img2
>
> we call disk_force_media_change() but that only works on disk->part0
> which means that we don't even cleanly shutdown the filesystem on the
> partition we're trying to mess around with.
Yes, disk_force_media_change has that general problem back from the
early Linux days (it had a different name back then, though). I think
it is because traditionally removable media in Linux never had
partitions, e.g. the CDROM drivers typically only allocated a single
minor number so they could not be scanned. But that has changed because
the interfaces got used for different use cases, and we also had
dynamic majors for a long time that now allow partitions. And there
are real use cases even for traditional removable media, e.g. MacOS
CDROMs traditionally did have partitions.
> For now, we should give up any pretense that disk_force_media_change()
> does anything useful for loop change fd and simply remove it completely.
> It's either useless, or it breaks the original semantics of loop change
> fd although I don't think anyone's ever used it the way I described
> above.
Maybe we can just drop the CHANGE_FD ioctl and see if anyone screams?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: LOOP_CONFIGURE uevents
2023-10-23 14:08 ` LOOP_CONFIGURE uevents Christian Brauner
@ 2023-10-24 7:06 ` Christoph Hellwig
2023-10-24 8:42 ` Christian Brauner
0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2023-10-24 7:06 UTC (permalink / raw)
To: Christian Brauner; +Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, linux-block
On Mon, Oct 23, 2023 at 04:08:21PM +0200, Christian Brauner wrote:
> No you get uevents if you trigger a partition rescan but only if there
> are actually partitions. What you never get however is a media change
> event even though we do increment the disk sequence number and attach an
> image to the loop device.
>
> This seems problematic because we leave userspace unable to listen for
> attaching images to a loop device. Shouldn't we regenerate the media
> change event after we're done setting up the device and before the
> partition rescan for LOOP_CONFIGURE?
Maybe. I think people mostly didn't care about the even anyway, but
about the changing sequence number to check that the content hasn't
changed.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: LOOP_CONFIGURE uevents
2023-10-24 7:06 ` Christoph Hellwig
@ 2023-10-24 8:42 ` Christian Brauner
0 siblings, 0 replies; 6+ messages in thread
From: Christian Brauner @ 2023-10-24 8:42 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jan Kara, linux-fsdevel, linux-block
On Tue, Oct 24, 2023 at 12:06:26AM -0700, Christoph Hellwig wrote:
> On Mon, Oct 23, 2023 at 04:08:21PM +0200, Christian Brauner wrote:
> > No you get uevents if you trigger a partition rescan but only if there
> > are actually partitions. What you never get however is a media change
> > event even though we do increment the disk sequence number and attach an
> > image to the loop device.
> >
> > This seems problematic because we leave userspace unable to listen for
> > attaching images to a loop device. Shouldn't we regenerate the media
> > change event after we're done setting up the device and before the
> > partition rescan for LOOP_CONFIGURE?
>
> Maybe. I think people mostly didn't care about the even anyway, but
> about the changing sequence number to check that the content hasn't
> changed.
Yes, exactly. The core is the changed sequence number but you don't get
that notification if you don't have any partitions and you request a
partition rescan from LOOP_CONFIGURE.
So I think we should always send the media change event for the sequence
number independent of the partition notification.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: loop change deprecation
2023-10-24 7:03 ` Christoph Hellwig
@ 2023-10-24 8:44 ` Christian Brauner
0 siblings, 0 replies; 6+ messages in thread
From: Christian Brauner @ 2023-10-24 8:44 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jan Kara, linux-fsdevel, linux-block, Jens Axboe
(Sorry for the broken "Subject:" btw in the first mail.)
On Tue, Oct 24, 2023 at 12:03:25AM -0700, Christoph Hellwig wrote:
> On Mon, Oct 23, 2023 at 05:35:25PM +0200, Christian Brauner wrote:
> > I just realized that if we're able to deprecate LOOP_CHANGE_FD we remove
> > one of the most problematic/weird cases for partitions and filesystems.
>
> > change fd event on the first partition:
> >
> > sudo ./loop_change_fd /dev/loop0p1 img2
> >
> > we call disk_force_media_change() but that only works on disk->part0
> > which means that we don't even cleanly shutdown the filesystem on the
> > partition we're trying to mess around with.
>
> Yes, disk_force_media_change has that general problem back from the
> early Linux days (it had a different name back then, though). I think
> it is because traditionally removable media in Linux never had
> partitions, e.g. the CDROM drivers typically only allocated a single
> minor number so they could not be scanned. But that has changed because
> the interfaces got used for different use cases, and we also had
> dynamic majors for a long time that now allow partitions. And there
> are real use cases even for traditional removable media, e.g. MacOS
> CDROMs traditionally did have partitions.
>
> > For now, we should give up any pretense that disk_force_media_change()
> > does anything useful for loop change fd and simply remove it completely.
> > It's either useless, or it breaks the original semantics of loop change
> > fd although I don't think anyone's ever used it the way I described
> > above.
>
> Maybe we can just drop the CHANGE_FD ioctl and see if anyone screams?
Yes, I suggested that in the prior mail. I think we should do that.
We'd need changes to LTP and blktests but there are no active users in
either codesearch.debian, cs.github, or cs.android.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-10-24 8:44 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20231018152924.3858-1-jack@suse.cz>
[not found] ` <20231019-galopp-zeltdach-b14b7727f269@brauner>
[not found] ` <ZTExy7YTFtToAOOx@infradead.org>
[not found] ` <20231020-enthusiasmus-vielsagend-463a7c821bf3@brauner>
[not found] ` <20231020120436.jgxdlawibpfuprnz@quack3>
2023-10-23 14:08 ` LOOP_CONFIGURE uevents Christian Brauner
2023-10-24 7:06 ` Christoph Hellwig
2023-10-24 8:42 ` Christian Brauner
[not found] ` <20231023-ausgraben-berichten-d747aa50d876@brauner>
2023-10-23 15:35 ` loop change deprecation bdev->bd_holder_lock Christian Brauner
2023-10-24 7:03 ` Christoph Hellwig
2023-10-24 8:44 ` loop change deprecation Christian Brauner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).