linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Revert commit 310ca162d77
@ 2019-03-20 12:58 Jan Kara
  2019-03-20 15:12 ` Greg KH
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Jan Kara @ 2019-03-20 12:58 UTC (permalink / raw)
  To: stable; +Cc: Jens Axboe, linux-block

Hello,

commit 310ca162d77 "block/loop: Use global lock for ioctl() operation." has
been pushed to multiple stable trees. This patch is a part of larger series
that overhauls the locking inside loopback device upstream and for 4.4,
4.9, and 4.14 stable trees only this patch from the series is applied. Our
testing now has shown [1] that the patch alone makes present deadlocks
inside loopback driver more likely (the openqa test in our infrastructure
didn't hit the deadlock before whereas with the new kernel it hits it
reliably every time). So I would suggest we revert 310ca162d77 from 4.4,
4.9, and 4.14 kernels.

Another option would be to backport other locking fixes for the loop
device but honestly I don't think that's a stable material - never heard
of real users hitting problems, only syzkaller could, and we are still
fixing up some small glitches resulting from that rework...

								Honza

[1] https://bugzilla.suse.com/show_bug.cgi?id=1129739

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Revert commit 310ca162d77
  2019-03-20 12:58 Revert commit 310ca162d77 Jan Kara
@ 2019-03-20 15:12 ` Greg KH
  2019-03-20 15:16   ` Greg KH
  2019-03-20 15:16 ` Greg KH
  2019-04-29 12:05 ` Salvatore Bonaccorso
  2 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2019-03-20 15:12 UTC (permalink / raw)
  To: Jan Kara; +Cc: stable, Jens Axboe, linux-block

On Wed, Mar 20, 2019 at 01:58:06PM +0100, Jan Kara wrote:
> Hello,
> 
> commit 310ca162d77 "block/loop: Use global lock for ioctl() operation." has
> been pushed to multiple stable trees. This patch is a part of larger series
> that overhauls the locking inside loopback device upstream and for 4.4,
> 4.9, and 4.14 stable trees only this patch from the series is applied. Our
> testing now has shown [1] that the patch alone makes present deadlocks
> inside loopback driver more likely (the openqa test in our infrastructure
> didn't hit the deadlock before whereas with the new kernel it hits it
> reliably every time). So I would suggest we revert 310ca162d77 from 4.4,
> 4.9, and 4.14 kernels.

Ugh, ok.

> Another option would be to backport other locking fixes for the loop
> device but honestly I don't think that's a stable material - never heard
> of real users hitting problems, only syzkaller could, and we are still
> fixing up some small glitches resulting from that rework...

I tried to backport a number of the loop fixes, and odds are I am the
one that grabbed this.  There are other loop locking fixes in the stable
releases, I don't know if I can just revert this one, let me check...

And yes, I did get some loop bugreports from the Android kernel team as
they are using loop devices in large numbers on the new Android release
for something.  So I think they have already backported a number of
these fixes to their trees, which made me want to push these out to more
people.

Also, given that syzbot has a reproducer for 310ca162d77, are we worst
off if we revert it?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Revert commit 310ca162d77
  2019-03-20 15:12 ` Greg KH
@ 2019-03-20 15:16   ` Greg KH
  2019-03-21 10:41     ` Jan Kara
  0 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2019-03-20 15:16 UTC (permalink / raw)
  To: Jan Kara; +Cc: stable, Jens Axboe, linux-block

On Wed, Mar 20, 2019 at 04:12:31PM +0100, Greg KH wrote:
> On Wed, Mar 20, 2019 at 01:58:06PM +0100, Jan Kara wrote:
> > Hello,
> > 
> > commit 310ca162d77 "block/loop: Use global lock for ioctl() operation." has
> > been pushed to multiple stable trees. This patch is a part of larger series
> > that overhauls the locking inside loopback device upstream and for 4.4,
> > 4.9, and 4.14 stable trees only this patch from the series is applied. Our
> > testing now has shown [1] that the patch alone makes present deadlocks
> > inside loopback driver more likely (the openqa test in our infrastructure
> > didn't hit the deadlock before whereas with the new kernel it hits it
> > reliably every time). So I would suggest we revert 310ca162d77 from 4.4,
> > 4.9, and 4.14 kernels.
> 
> Ugh, ok.
> 
> > Another option would be to backport other locking fixes for the loop
> > device but honestly I don't think that's a stable material - never heard
> > of real users hitting problems, only syzkaller could, and we are still
> > fixing up some small glitches resulting from that rework...
> 
> I tried to backport a number of the loop fixes, and odds are I am the
> one that grabbed this.  There are other loop locking fixes in the stable
> releases, I don't know if I can just revert this one, let me check...

Nope, does not revert cleanly due to the other loop fixes I added after
this one.

So either I back out _all_ of the loop patches, or we leave what we have
now, or we add the proper fixes to get things working.  I'd like to have
this all work properly, so why not just have me backport the needed
fixes that are upstream?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Revert commit 310ca162d77
  2019-03-20 12:58 Revert commit 310ca162d77 Jan Kara
  2019-03-20 15:12 ` Greg KH
@ 2019-03-20 15:16 ` Greg KH
  2019-03-21 10:26   ` Jan Kara
  2019-04-29 12:05 ` Salvatore Bonaccorso
  2 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2019-03-20 15:16 UTC (permalink / raw)
  To: Jan Kara; +Cc: stable, Jens Axboe, linux-block

On Wed, Mar 20, 2019 at 01:58:06PM +0100, Jan Kara wrote:
> Hello,
> 
> commit 310ca162d77 "block/loop: Use global lock for ioctl() operation." has
> been pushed to multiple stable trees. This patch is a part of larger series
> that overhauls the locking inside loopback device upstream and for 4.4,
> 4.9, and 4.14 stable trees only this patch from the series is applied. Our
> testing now has shown [1] that the patch alone makes present deadlocks
> inside loopback driver more likely (the openqa test in our infrastructure
> didn't hit the deadlock before whereas with the new kernel it hits it
> reliably every time). So I would suggest we revert 310ca162d77 from 4.4,
> 4.9, and 4.14 kernels.
> 
> Another option would be to backport other locking fixes for the loop
> device but honestly I don't think that's a stable material - never heard
> of real users hitting problems, only syzkaller could, and we are still
> fixing up some small glitches resulting from that rework...
> 
> 								Honza
> 
> [1] https://bugzilla.suse.com/show_bug.cgi?id=1129739

Note, I do not have access to that bug report :(


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Revert commit 310ca162d77
  2019-03-20 15:16 ` Greg KH
@ 2019-03-21 10:26   ` Jan Kara
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2019-03-21 10:26 UTC (permalink / raw)
  To: Greg KH; +Cc: Jan Kara, stable, Jens Axboe, linux-block

On Wed 20-03-19 16:16:28, Greg KH wrote:
> On Wed, Mar 20, 2019 at 01:58:06PM +0100, Jan Kara wrote:
> > Hello,
> > 
> > commit 310ca162d77 "block/loop: Use global lock for ioctl() operation." has
> > been pushed to multiple stable trees. This patch is a part of larger series
> > that overhauls the locking inside loopback device upstream and for 4.4,
> > 4.9, and 4.14 stable trees only this patch from the series is applied. Our
> > testing now has shown [1] that the patch alone makes present deadlocks
> > inside loopback driver more likely (the openqa test in our infrastructure
> > didn't hit the deadlock before whereas with the new kernel it hits it
> > reliably every time). So I would suggest we revert 310ca162d77 from 4.4,
> > 4.9, and 4.14 kernels.
> > 
> > Another option would be to backport other locking fixes for the loop
> > device but honestly I don't think that's a stable material - never heard
> > of real users hitting problems, only syzkaller could, and we are still
> > fixing up some small glitches resulting from that rework...
> > 
> > 								Honza
> > 
> > [1] https://bugzilla.suse.com/show_bug.cgi?id=1129739
> 
> Note, I do not have access to that bug report :(

Sorry, sometimes I hate how paranoid our bugzilla is. I've mirrorred the
relevant parts to:

https://bugzilla.kernel.org/show_bug.cgi?id=202985

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Revert commit 310ca162d77
  2019-03-20 15:16   ` Greg KH
@ 2019-03-21 10:41     ` Jan Kara
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2019-03-21 10:41 UTC (permalink / raw)
  To: Greg KH; +Cc: Jan Kara, stable, Jens Axboe, linux-block

On Wed 20-03-19 16:16:01, Greg KH wrote:
> On Wed, Mar 20, 2019 at 04:12:31PM +0100, Greg KH wrote:
> > On Wed, Mar 20, 2019 at 01:58:06PM +0100, Jan Kara wrote:
> > > Hello,
> > > 
> > > commit 310ca162d77 "block/loop: Use global lock for ioctl() operation." has
> > > been pushed to multiple stable trees. This patch is a part of larger series
> > > that overhauls the locking inside loopback device upstream and for 4.4,
> > > 4.9, and 4.14 stable trees only this patch from the series is applied. Our
> > > testing now has shown [1] that the patch alone makes present deadlocks
> > > inside loopback driver more likely (the openqa test in our infrastructure
> > > didn't hit the deadlock before whereas with the new kernel it hits it
> > > reliably every time). So I would suggest we revert 310ca162d77 from 4.4,
> > > 4.9, and 4.14 kernels.
> > 
> > Ugh, ok.
> > 
> > > Another option would be to backport other locking fixes for the loop
> > > device but honestly I don't think that's a stable material - never heard
> > > of real users hitting problems, only syzkaller could, and we are still
> > > fixing up some small glitches resulting from that rework...
> > 
> > I tried to backport a number of the loop fixes, and odds are I am the
> > one that grabbed this.  There are other loop locking fixes in the stable
> > releases, I don't know if I can just revert this one, let me check...
> 
> Nope, does not revert cleanly due to the other loop fixes I added after
> this one.
> 
> So either I back out _all_ of the loop patches, or we leave what we have
> now, or we add the proper fixes to get things working.  I'd like to have
> this all work properly, so why not just have me backport the needed
> fixes that are upstream?

Sure, that is another option. Be sure to include recent fixups:

40853d6fc619 "loop: do not print warn message if partition scan is
successful"
758a58d0bc67 "loop: set GENHD_FL_NO_PART_SCAN after blkdev_reread_part()"

otherwise you need commits:

b1ab5fa309e6 "block/loop: Don't grab "struct file" for vfs_getattr()
operation."
...
628bd8594709 "loop: Fix double mutex_unlock(&loop_ctl_mutex) in
loop_control_ioctl()"

to drivers/block/loop.c. It's just that my experience tells me that when
there were 15 original patches in the series and then 3 followup fixups,
the series is not usually worth the risk for something nobody hit besides
fuzzers...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Revert commit 310ca162d77
  2019-03-20 12:58 Revert commit 310ca162d77 Jan Kara
  2019-03-20 15:12 ` Greg KH
  2019-03-20 15:16 ` Greg KH
@ 2019-04-29 12:05 ` Salvatore Bonaccorso
  2019-04-29 14:04   ` Greg KH
  2 siblings, 1 reply; 8+ messages in thread
From: Salvatore Bonaccorso @ 2019-04-29 12:05 UTC (permalink / raw)
  To: Jan Kara; +Cc: stable, Jens Axboe, linux-block, Greg KH, 928125, Brad Barnett

Hi Jan, hi Greg,

On Wed, Mar 20, 2019 at 01:58:06PM +0100, Jan Kara wrote:
> Hello,
> 
> commit 310ca162d77 "block/loop: Use global lock for ioctl() operation." has
> been pushed to multiple stable trees. This patch is a part of larger series
> that overhauls the locking inside loopback device upstream and for 4.4,
> 4.9, and 4.14 stable trees only this patch from the series is applied. Our
> testing now has shown [1] that the patch alone makes present deadlocks
> inside loopback driver more likely (the openqa test in our infrastructure
> didn't hit the deadlock before whereas with the new kernel it hits it
> reliably every time). So I would suggest we revert 310ca162d77 from 4.4,
> 4.9, and 4.14 kernels.

A user in Debian reported [1], providing the following testcase which showed up
after the recent update to 4.9.168-1 in Debian stretch (based on upstream
v4.9.168) as follows:

	dd if=/dev/zero of=/tmp/ff1.raw bs=1G seek=8 count=0
	sync
	sleep 1
	parted /tmp/ff1.raw mklabel msdos
	parted -s /tmp/ff1.raw mkpart primary linux-swap 1 100
	parted -s -- /tmp/ff1.raw mkpart primary ext2 101 -1
	parted -s -- /tmp/ff1.raw set 2 boot on
	sleep 5
	losetup -Pf /tmp/ff1.raw --show

I have verified that the same happens with v4.9.171 where the mentioned commit
was not reverted, and bisecting of the testcase showed it was introduced with
3ae3d167f5ec2c7bb5fcd12b7772cfadc93b2305 (v4.9.152~9) (which is the backport of
310ca162d77 for 4.9).

Reverting 3ae3d167f5ec2c7bb5fcd12b7772cfadc93b2305 on top of v4.9.171 worked
and fixed the respective issue.

Can this commit in meanwhile be reverted or is there further ongoing work in
integrating the followup fixes as mentioned in
https://lore.kernel.org/stable/20190321104110.GF29086@quack2.suse.cz/ .

Regards,
Salvatore

 [1] https://bugs.debian.org/928125

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Revert commit 310ca162d77
  2019-04-29 12:05 ` Salvatore Bonaccorso
@ 2019-04-29 14:04   ` Greg KH
  0 siblings, 0 replies; 8+ messages in thread
From: Greg KH @ 2019-04-29 14:04 UTC (permalink / raw)
  To: Jan Kara, stable, Jens Axboe, linux-block, 928125, Brad Barnett

On Mon, Apr 29, 2019 at 02:05:42PM +0200, Salvatore Bonaccorso wrote:
> Hi Jan, hi Greg,
> 
> On Wed, Mar 20, 2019 at 01:58:06PM +0100, Jan Kara wrote:
> > Hello,
> > 
> > commit 310ca162d77 "block/loop: Use global lock for ioctl() operation." has
> > been pushed to multiple stable trees. This patch is a part of larger series
> > that overhauls the locking inside loopback device upstream and for 4.4,
> > 4.9, and 4.14 stable trees only this patch from the series is applied. Our
> > testing now has shown [1] that the patch alone makes present deadlocks
> > inside loopback driver more likely (the openqa test in our infrastructure
> > didn't hit the deadlock before whereas with the new kernel it hits it
> > reliably every time). So I would suggest we revert 310ca162d77 from 4.4,
> > 4.9, and 4.14 kernels.
> 
> A user in Debian reported [1], providing the following testcase which showed up
> after the recent update to 4.9.168-1 in Debian stretch (based on upstream
> v4.9.168) as follows:
> 
> 	dd if=/dev/zero of=/tmp/ff1.raw bs=1G seek=8 count=0
> 	sync
> 	sleep 1
> 	parted /tmp/ff1.raw mklabel msdos
> 	parted -s /tmp/ff1.raw mkpart primary linux-swap 1 100
> 	parted -s -- /tmp/ff1.raw mkpart primary ext2 101 -1
> 	parted -s -- /tmp/ff1.raw set 2 boot on
> 	sleep 5
> 	losetup -Pf /tmp/ff1.raw --show
> 
> I have verified that the same happens with v4.9.171 where the mentioned commit
> was not reverted, and bisecting of the testcase showed it was introduced with
> 3ae3d167f5ec2c7bb5fcd12b7772cfadc93b2305 (v4.9.152~9) (which is the backport of
> 310ca162d77 for 4.9).
> 
> Reverting 3ae3d167f5ec2c7bb5fcd12b7772cfadc93b2305 on top of v4.9.171 worked
> and fixed the respective issue.
> 
> Can this commit in meanwhile be reverted or is there further ongoing work in
> integrating the followup fixes as mentioned in
> https://lore.kernel.org/stable/20190321104110.GF29086@quack2.suse.cz/ .

Sorry for the delay here.  No, I didn't find any time for the followup
stuff here, and Jan is right, this should just be dropped.

I've now reverted it from 3.18.y, 4.4.y, 4.9.y, and 4.14.y.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-04-29 14:04 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-03-20 12:58 Revert commit 310ca162d77 Jan Kara
2019-03-20 15:12 ` Greg KH
2019-03-20 15:16   ` Greg KH
2019-03-21 10:41     ` Jan Kara
2019-03-20 15:16 ` Greg KH
2019-03-21 10:26   ` Jan Kara
2019-04-29 12:05 ` Salvatore Bonaccorso
2019-04-29 14:04   ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).