* Re: INFO: task hung in sync_blockdev
From: Andi Kleen @ 2018-02-07 15:52 UTC (permalink / raw)
To: syzbot
Cc: akpm, aryabinin, jack, jlayton, linux-kernel, linux-mm, mgorman,
    mingo, rgoldwyn, syzkaller-bugs, linux-fsdevel

> #0:  (&bdev->bd_mutex){+.+.}, at: [<0000000040269370>]
>      __blkdev_put+0xbc/0x7f0 fs/block_dev.c:1757
> 1 lock held by blkid/19199:
> #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
>      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> #1:  (&ldata->atomic_read_lock){+.+.}, at: [<0000000033edf9f2>]
>      n_tty_read+0x2ef/0x1a00 drivers/tty/n_tty.c:2131
> 1 lock held by syz-executor5/19330:
> #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
>      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> 1 lock held by syz-executor5/19331:
> #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
>      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439

It seems multiple processes deadlocked on the bd_mutex.
Unfortunately there's no backtrace for the lock acquisitions,
so it's hard to see the exact sequence.

It seems lockdep is already active, so it's likely not just an
ordering violation, but something else.

-Andi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body
to majordomo@kvack.org.  For more info on Linux MM, see:
http://www.linux-mm.org/ .  Don't email: dont@kvack.org
* Re: INFO: task hung in sync_blockdev
From: Jan Kara @ 2018-02-08  9:28 UTC (permalink / raw)
To: Andi Kleen
Cc: syzbot, akpm, aryabinin, jack, jlayton, linux-kernel, linux-mm,
    mgorman, mingo, rgoldwyn, syzkaller-bugs, linux-fsdevel

On Wed 07-02-18 07:52:29, Andi Kleen wrote:
> > #0:  (&bdev->bd_mutex){+.+.}, at: [<0000000040269370>]
> >      __blkdev_put+0xbc/0x7f0 fs/block_dev.c:1757
> > 1 lock held by blkid/19199:
> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> > #1:  (&ldata->atomic_read_lock){+.+.}, at: [<0000000033edf9f2>]
> >      n_tty_read+0x2ef/0x1a00 drivers/tty/n_tty.c:2131
> > 1 lock held by syz-executor5/19330:
> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> > 1 lock held by syz-executor5/19331:
> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
>
> It seems multiple processes deadlocked on the bd_mutex.
> Unfortunately there's no backtrace for the lock acquisitions,
> so it's hard to see the exact sequence.

Well, everything in the report points to a situation where some IO was
submitted to the block device and never completed (more exactly, it took
longer than those 120s to complete that IO). It would need more digging
into the syzkaller program to find out what kind of device that was and
possibly why the IO took so long to complete...

								Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
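[The "120s" above is the kernel's hung-task watchdog timeout, exposed as the sysctl kernel.hung_task_timeout_secs. A minimal sketch of how a test harness could read the effective value; the fallback default and error handling here are illustrative assumptions, not something from this thread:]

```python
from pathlib import Path

def hung_task_timeout(default=120):
    """Read kernel.hung_task_timeout_secs: the number of seconds a task may
    stay in uninterruptible (D) sleep before the hung-task watchdog reports
    it.  Falls back to the usual default when the knob is absent (e.g. in a
    container) or unreadable."""
    knob = Path("/proc/sys/kernel/hung_task_timeout_secs")
    try:
        return int(knob.read_text())
    except (OSError, ValueError):
        return default
```

[A value of 0 means the watchdog is disabled; setting kernel.hung_task_panic=1 additionally panics the machine on the first hung task, which is useful when a crashdump should be captured.]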
* Re: INFO: task hung in sync_blockdev
From: Dmitry Vyukov @ 2018-02-08 13:28 UTC (permalink / raw)
To: Jan Kara
Cc: Andi Kleen, syzbot, Andrew Morton, Andrey Ryabinin, jlayton, LKML,
    Linux-MM, Mel Gorman, Ingo Molnar, rgoldwyn, syzkaller-bugs,
    linux-fsdevel

On Thu, Feb 8, 2018 at 10:28 AM, Jan Kara <jack@suse.cz> wrote:
> On Wed 07-02-18 07:52:29, Andi Kleen wrote:
>> > #0:  (&bdev->bd_mutex){+.+.}, at: [<0000000040269370>]
>> >      __blkdev_put+0xbc/0x7f0 fs/block_dev.c:1757
>> > 1 lock held by blkid/19199:
>> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
>> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
>> > #1:  (&ldata->atomic_read_lock){+.+.}, at: [<0000000033edf9f2>]
>> >      n_tty_read+0x2ef/0x1a00 drivers/tty/n_tty.c:2131
>> > 1 lock held by syz-executor5/19330:
>> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
>> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
>> > 1 lock held by syz-executor5/19331:
>> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
>> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
>>
>> It seems multiple processes deadlocked on the bd_mutex.
>> Unfortunately there's no backtrace for the lock acquisitions,
>> so it's hard to see the exact sequence.
>
> Well, all in the report points to a situation where some IO was submitted
> to the block device and never completed (more exactly it took longer than
> those 120s to complete that IO). It would need more digging into the
> syzkaller program to find out what kind of device that was and possibly why
> the IO took so long to complete...

Would a traceback of all task stacks help in this case?

What I've seen in several "task hung" reports is that the CPU traceback
is not showing anything useful. So perhaps it should be changed to a
task traceback? Or would that not help either?
* Re: INFO: task hung in sync_blockdev
From: Jan Kara @ 2018-02-08 14:08 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: Jan Kara, Andi Kleen, syzbot, Andrew Morton, Andrey Ryabinin,
    jlayton, LKML, Linux-MM, Mel Gorman, Ingo Molnar, rgoldwyn,
    syzkaller-bugs, linux-fsdevel

On Thu 08-02-18 14:28:08, Dmitry Vyukov wrote:
> On Thu, Feb 8, 2018 at 10:28 AM, Jan Kara <jack@suse.cz> wrote:
> > On Wed 07-02-18 07:52:29, Andi Kleen wrote:
> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<0000000040269370>]
> >> >      __blkdev_put+0xbc/0x7f0 fs/block_dev.c:1757
> >> > 1 lock held by blkid/19199:
> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> >> > #1:  (&ldata->atomic_read_lock){+.+.}, at: [<0000000033edf9f2>]
> >> >      n_tty_read+0x2ef/0x1a00 drivers/tty/n_tty.c:2131
> >> > 1 lock held by syz-executor5/19330:
> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> >> > 1 lock held by syz-executor5/19331:
> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> >>
> >> It seems multiple processes deadlocked on the bd_mutex.
> >> Unfortunately there's no backtrace for the lock acquisitions,
> >> so it's hard to see the exact sequence.
> >
> > Well, all in the report points to a situation where some IO was submitted
> > to the block device and never completed (more exactly it took longer than
> > those 120s to complete that IO). It would need more digging into the
> > syzkaller program to find out what kind of device that was and possibly why
> > the IO took so long to complete...
>
> Would a traceback of all task stacks help in this case?
> What I've seen in several "task hung" reports is that the CPU
> traceback is not showing anything useful. So perhaps it should be
> changed to task traceback? Or it would not help either?

A task stack traceback for all tasks (usually only tasks in D state -
i.e. sysrq-w - are enough, actually) would definitely help for debugging
deadlocks on sleeping locks.

For this particular case I'm not sure whether it would help, since it is
quite possible the IO is just sitting in some queue never getting
processed, due to some racing syzkaller process tearing down the device
at the wrong moment or something like that... Such a case is very
difficult to debug without a full kernel crashdump of the hung kernel
(or a reproducer, for that matter), and even with that it is usually
rather time consuming. But for the deadlocks which do occur more
frequently it would probably be worth the time, so it would be nice if
such an option was eventually available.

								Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
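[sysrq-w, mentioned above, dumps the stacks of tasks in uninterruptible (D) sleep. A rough userspace approximation — a sketch for illustration, not syzbot's actual tooling — is to scan /proc for D-state tasks:]

```python
import os

def d_state_tasks():
    """Return (pid, comm) pairs for tasks currently in uninterruptible
    (D) sleep -- the tasks sysrq-w would print stacks for.  Their kernel
    stacks can then be read from /proc/<pid>/stack (root only)."""
    hung = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/stat") as f:
                stat = f.read()
        except OSError:
            continue  # task exited while we were scanning
        # comm may itself contain spaces or parentheses, so split on the
        # last ") " -- the state field is the first token after it
        head, sep, tail = stat.rpartition(") ")
        if not sep or not tail.split():
            continue
        if tail.split(None, 1)[0] == "D":
            hung.append((int(pid), head.split("(", 1)[1]))
    return hung
```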
* Re: INFO: task hung in sync_blockdev
From: Dmitry Vyukov @ 2018-02-08 14:18 UTC (permalink / raw)
To: Jan Kara
Cc: Andi Kleen, syzbot, Andrew Morton, Andrey Ryabinin, jlayton, LKML,
    Linux-MM, Mel Gorman, Ingo Molnar, rgoldwyn, syzkaller-bugs,
    linux-fsdevel

On Thu, Feb 8, 2018 at 3:08 PM, Jan Kara <jack@suse.cz> wrote:
> On Thu 08-02-18 14:28:08, Dmitry Vyukov wrote:
>> On Thu, Feb 8, 2018 at 10:28 AM, Jan Kara <jack@suse.cz> wrote:
>> > On Wed 07-02-18 07:52:29, Andi Kleen wrote:
>> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<0000000040269370>]
>> >> >      __blkdev_put+0xbc/0x7f0 fs/block_dev.c:1757
>> >> > 1 lock held by blkid/19199:
>> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
>> >> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
>> >> > #1:  (&ldata->atomic_read_lock){+.+.}, at: [<0000000033edf9f2>]
>> >> >      n_tty_read+0x2ef/0x1a00 drivers/tty/n_tty.c:2131
>> >> > 1 lock held by syz-executor5/19330:
>> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
>> >> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
>> >> > 1 lock held by syz-executor5/19331:
>> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
>> >> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
>> >>
>> >> It seems multiple processes deadlocked on the bd_mutex.
>> >> Unfortunately there's no backtrace for the lock acquisitions,
>> >> so it's hard to see the exact sequence.
>> >
>> > Well, all in the report points to a situation where some IO was submitted
>> > to the block device and never completed (more exactly it took longer than
>> > those 120s to complete that IO). It would need more digging into the
>> > syzkaller program to find out what kind of device that was and possibly why
>> > the IO took so long to complete...
>>
>> Would a traceback of all task stacks help in this case?
>> What I've seen in several "task hung" reports is that the CPU
>> traceback is not showing anything useful. So perhaps it should be
>> changed to task traceback? Or it would not help either?
>
> Task stack traceback for all tasks (usually only tasks in D state - i.e.
> sysrq-w - are enough actually) would definitely help for debugging
> deadlocks on sleeping locks. For this particular case I'm not sure if it
> would help or not since it is quite possible the IO is just sitting in some
> queue never getting processed

That's what I was afraid of.

> due to some racing syzkaller process tearing
> down the device in the wrong moment or something like that... Such case is
> very difficult to debug without full kernel crashdump of the hung kernel
> (or a reproducer for that matter) and even with that it is usually rather
> time consuming. But for the deadlocks which do occur more frequently it
> would be probably worth the time so it would be nice if such option was
> eventually available.

By "full kernel crashdump" you mean the kdump thing, or something else?
* Re: INFO: task hung in sync_blockdev
From: Jan Kara @ 2018-02-08 16:18 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: Jan Kara, Andi Kleen, syzbot, Andrew Morton, Andrey Ryabinin,
    jlayton, LKML, Linux-MM, Mel Gorman, Ingo Molnar, rgoldwyn,
    syzkaller-bugs, linux-fsdevel

On Thu 08-02-18 15:18:11, Dmitry Vyukov wrote:
> On Thu, Feb 8, 2018 at 3:08 PM, Jan Kara <jack@suse.cz> wrote:
> > On Thu 08-02-18 14:28:08, Dmitry Vyukov wrote:
> >> On Thu, Feb 8, 2018 at 10:28 AM, Jan Kara <jack@suse.cz> wrote:
> >> > On Wed 07-02-18 07:52:29, Andi Kleen wrote:
> >> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<0000000040269370>]
> >> >> >      __blkdev_put+0xbc/0x7f0 fs/block_dev.c:1757
> >> >> > 1 lock held by blkid/19199:
> >> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >> >> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> >> >> > #1:  (&ldata->atomic_read_lock){+.+.}, at: [<0000000033edf9f2>]
> >> >> >      n_tty_read+0x2ef/0x1a00 drivers/tty/n_tty.c:2131
> >> >> > 1 lock held by syz-executor5/19330:
> >> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >> >> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> >> >> > 1 lock held by syz-executor5/19331:
> >> >> > #0:  (&bdev->bd_mutex){+.+.}, at: [<00000000b4dcaa18>]
> >> >> >      __blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
> >> >>
> >> >> It seems multiple processes deadlocked on the bd_mutex.
> >> >> Unfortunately there's no backtrace for the lock acquisitions,
> >> >> so it's hard to see the exact sequence.
> >> >
> >> > Well, all in the report points to a situation where some IO was submitted
> >> > to the block device and never completed (more exactly it took longer than
> >> > those 120s to complete that IO). It would need more digging into the
> >> > syzkaller program to find out what kind of device that was and possibly why
> >> > the IO took so long to complete...
> >>
> >> Would a traceback of all task stacks help in this case?
> >> What I've seen in several "task hung" reports is that the CPU
> >> traceback is not showing anything useful. So perhaps it should be
> >> changed to task traceback? Or it would not help either?
> >
> > Task stack traceback for all tasks (usually only tasks in D state - i.e.
> > sysrq-w - are enough actually) would definitely help for debugging
> > deadlocks on sleeping locks. For this particular case I'm not sure if it
> > would help or not since it is quite possible the IO is just sitting in some
> > queue never getting processed
>
> That's what I was afraid of.
>
> > due to some racing syzkaller process tearing
> > down the device in the wrong moment or something like that... Such case is
> > very difficult to debug without full kernel crashdump of the hung kernel
> > (or a reproducer for that matter) and even with that it is usually rather
> > time consuming. But for the deadlocks which do occur more frequently it
> > would be probably worth the time so it would be nice if such option was
> > eventually available.
>
> By "full kernel crashdump" you mean kdump thing, or something else?

Yes, the kdump thing (for a KVM guest you can also grab the memory dump
from the host in a simpler way, and it should be usable with the crash
utility AFAIK).

								Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: INFO: task hung in sync_blockdev
From: Andrey Ryabinin @ 2018-02-08 16:23 UTC (permalink / raw)
To: Jan Kara, Dmitry Vyukov
Cc: Andi Kleen, syzbot, Andrew Morton, jlayton, LKML, Linux-MM,
    Mel Gorman, Ingo Molnar, rgoldwyn, syzkaller-bugs, linux-fsdevel

On 02/08/2018 07:18 PM, Jan Kara wrote:
>> By "full kernel crashdump" you mean kdump thing, or something else?
>
> Yes, the kdump thing (for KVM guest you can grab the memory dump also from
> the host in a simplier way and it should be usable with the crash utility
> AFAIK).

In the QEMU monitor, the 'dump-guest-memory' command:

    (qemu) help dump-guest-memory
    dump-guest-memory [-p] [-d] [-z|-l|-s] filename [begin length] -- dump guest memory into file 'filename'.
                -p: do paging to get guest's memory mapping.
                -d: return immediately (do not wait for completion).
                -z: dump in kdump-compressed format, with zlib compression.
                -l: dump in kdump-compressed format, with lzo compression.
                -s: dump in kdump-compressed format, with snappy compression.
                begin: the starting physical address.
                length: the memory size, in bytes
* Re: INFO: task hung in sync_blockdev
From: Dmitry Vyukov @ 2018-02-08 17:17 UTC (permalink / raw)
To: Andrey Ryabinin
Cc: Jan Kara, Andi Kleen, syzbot, Andrew Morton, jlayton, LKML,
    Linux-MM, Mel Gorman, Ingo Molnar, rgoldwyn, syzkaller-bugs,
    linux-fsdevel

On Thu, Feb 8, 2018 at 5:23 PM, Andrey Ryabinin <aryabinin@virtuozzo.com> wrote:
> On 02/08/2018 07:18 PM, Jan Kara wrote:
>>> By "full kernel crashdump" you mean kdump thing, or something else?
>>
>> Yes, the kdump thing (for KVM guest you can grab the memory dump also from
>> the host in a simplier way and it should be usable with the crash utility
>> AFAIK).
>
> In QEMU monitor 'dump-guest-memory' command:
>
> (qemu) help dump-guest-memory
> dump-guest-memory [-p] [-d] [-z|-l|-s] filename [begin length] -- dump guest memory into file 'filename'.
>             -p: do paging to get guest's memory mapping.
>             -d: return immediately (do not wait for completion).
>             -z: dump in kdump-compressed format, with zlib compression.
>             -l: dump in kdump-compressed format, with lzo compression.
>             -s: dump in kdump-compressed format, with snappy compression.
>             begin: the starting physical address.
>             length: the memory size, in bytes

Nice!

Do you know straight away if it's scriptable/automatable? Or do I just
send some magic sequence of bytes representing ^A+C, dump-guest-memory,
\n to the stdin pipe?

Unfortunately, syzbot uses GCE VMs for testing, and there does not seem
to be such a feature on GCE...
* Re: INFO: task hung in sync_blockdev
From: Andrey Ryabinin @ 2018-02-09 15:31 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: Jan Kara, Andi Kleen, syzbot, Andrew Morton, jlayton, LKML,
    Linux-MM, Mel Gorman, Ingo Molnar, rgoldwyn, syzkaller-bugs,
    linux-fsdevel

On 02/08/2018 08:17 PM, Dmitry Vyukov wrote:
> On Thu, Feb 8, 2018 at 5:23 PM, Andrey Ryabinin <aryabinin@virtuozzo.com> wrote:
>> On 02/08/2018 07:18 PM, Jan Kara wrote:
>>>> By "full kernel crashdump" you mean kdump thing, or something else?
>>>
>>> Yes, the kdump thing (for KVM guest you can grab the memory dump also from
>>> the host in a simplier way and it should be usable with the crash utility
>>> AFAIK).
>>
>> In QEMU monitor 'dump-guest-memory' command:
>>
>> (qemu) help dump-guest-memory
>> dump-guest-memory [-p] [-d] [-z|-l|-s] filename [begin length] -- dump guest memory into file 'filename'.
>>             -p: do paging to get guest's memory mapping.
>>             -d: return immediately (do not wait for completion).
>>             -z: dump in kdump-compressed format, with zlib compression.
>>             -l: dump in kdump-compressed format, with lzo compression.
>>             -s: dump in kdump-compressed format, with snappy compression.
>>             begin: the starting physical address.
>>             length: the memory size, in bytes
>
> Nice!
> Do you know straight away if it's scriptable/automatable? Or do I just
> send some magic sequence of bytes representing ^A+C,
> dump-guest-memory, \n to stdin pipe?

I wouldn't do it via stdin. You can set up the monitor on any chardev
you like and send commands there once you know the guest has panicked.
Look at the -mon and -chardev QEMU options.

> Unfortunately, syzbot uses GCE VMs for testing, and there does not
> seem to be such feature on GCE...

Well, you still have kdump.
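[Besides the human monitor, the same command is exposed through QMP, QEMU's JSON control protocol, which is the natural fit for automation. A sketch, under the assumption that QEMU was started with something like `-qmp unix:/tmp/qmp.sock,server,nowait` — the socket path is illustrative, not from this thread:]

```python
import json
import socket

def build_dump_cmd(out_path, paging=False):
    """QMP form of dump-guest-memory; 'paging' and 'protocol' are its
    two required arguments."""
    return {"execute": "dump-guest-memory",
            "arguments": {"paging": paging, "protocol": "file:" + out_path}}

def qmp_dump(sock_path, out_path):
    """Connect to a QMP unix socket, negotiate capabilities, and trigger
    a guest memory dump into out_path."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        f = s.makefile("rw")
        json.loads(f.readline())                  # QMP greeting banner
        f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
        f.flush()
        json.loads(f.readline())                  # {"return": {}}
        f.write(json.dumps(build_dump_cmd(out_path)) + "\n")
        f.flush()
        return json.loads(f.readline())
```

[The resulting file can then be inspected with the crash utility against the matching vmlinux.]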
* Re: INFO: task hung in sync_blockdev
From: Andi Kleen @ 2018-02-08 14:49 UTC (permalink / raw)
To: Jan Kara
Cc: syzbot, akpm, aryabinin, jlayton, linux-kernel, linux-mm, mgorman,
    mingo, rgoldwyn, syzkaller-bugs, linux-fsdevel

> > It seems multiple processes deadlocked on the bd_mutex.
> > Unfortunately there's no backtrace for the lock acquisitions,
> > so it's hard to see the exact sequence.
>
> Well, all in the report points to a situation where some IO was submitted
> to the block device and never completed (more exactly it took longer than
> those 120s to complete that IO). It would need more digging into the

Are you sure? I didn't think outstanding IO would take bd_mutex.

-Andi
* Re: INFO: task hung in sync_blockdev
From: Jan Kara @ 2018-02-08 16:20 UTC (permalink / raw)
To: Andi Kleen
Cc: Jan Kara, syzbot, akpm, aryabinin, jlayton, linux-kernel, linux-mm,
    mgorman, mingo, rgoldwyn, syzkaller-bugs, linux-fsdevel

On Thu 08-02-18 06:49:18, Andi Kleen wrote:
> > > It seems multiple processes deadlocked on the bd_mutex.
> > > Unfortunately there's no backtrace for the lock acquisitions,
> > > so it's hard to see the exact sequence.
> >
> > Well, all in the report points to a situation where some IO was submitted
> > to the block device and never completed (more exactly it took longer than
> > those 120s to complete that IO). It would need more digging into the
>
> Are you sure? I didn't think outstanding IO would take bd_mutex.

The stack trace is:

  schedule+0xf5/0x430 kernel/sched/core.c:3480
  io_schedule+0x1c/0x70 kernel/sched/core.c:5096
  wait_on_page_bit_common+0x4b3/0x770 mm/filemap.c:1099
  wait_on_page_bit mm/filemap.c:1132 [inline]
  wait_on_page_writeback include/linux/pagemap.h:546 [inline]
  __filemap_fdatawait_range+0x282/0x430 mm/filemap.c:533
  filemap_fdatawait_range mm/filemap.c:558 [inline]
  filemap_fdatawait include/linux/fs.h:2590 [inline]
  filemap_write_and_wait+0x7a/0xd0 mm/filemap.c:624
  __sync_blockdev fs/block_dev.c:448 [inline]
  sync_blockdev.part.29+0x50/0x70 fs/block_dev.c:457
  sync_blockdev fs/block_dev.c:444 [inline]
  __blkdev_put+0x18b/0x7f0 fs/block_dev.c:1763
  blkdev_put+0x85/0x4f0 fs/block_dev.c:1835
  blkdev_close+0x8b/0xb0 fs/block_dev.c:1842
  __fput+0x327/0x7e0 fs/file_table.c:209
  ____fput+0x15/0x20 fs/file_table.c:243

So we are waiting for PageWriteback on some page. And bd_mutex is
grabbed by this process in __blkdev_put() before calling
sync_blockdev().

								Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR