From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org,
	xfs@oss.sgi.com
Subject: Re: [4.8 hang] xfstests generic/361 hangs on dax enabled filesystems
Date: Wed, 3 Aug 2016 11:11:27 -0600
Message-ID: <20160803171127.GA15876@linux.intel.com>
In-Reply-To: <20160803003354.GP16044@dastard>

On Wed, Aug 03, 2016 at 10:33:54AM +1000, Dave Chinner wrote:
> Hi folks,
> 
> Just hit a reproducible hang in generic/361. Essentially this on
> an 8GB pmem device:
> 
> mkfs.xfs -f /dev/pmem1
> mount -o dax /dev/pmem1 /mnt/scratch
> xfs_io -f -c "truncate 1g" test.img
> losetup -f --show /mnt/scratch/test.img
> mkfs.xfs -f /dev/loop0
> 
> And the mkfs.xfs command hangs with a discard that never completes:
> 
> [  243.413918] INFO: task mkfs.xfs:5708 blocked for more than 120 seconds.
> [  243.415678]       Not tainted 4.7.0-dgc+ #862
> [  243.416772] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  243.418769] mkfs.xfs        D ffff880835143c18 13848  5708   5441 0x00000000
> [  243.420620]  ffff880835143c18 ffff880835143c20 ffff88083a244780 ffff8808358ba3c0
> [  243.422636]  ffff88023aa20000 ffff880835144000 7fffffffffffffff 7fffffffffffffff
> [  243.424586]  ffff8808358ba3c0 00000000024000c0 ffff880835143c30 ffffffff81e5e38c
> [  243.426466] Call Trace:
> [  243.427050]  [<ffffffff81e5e38c>] schedule+0x3c/0x90
> [  243.428224]  [<ffffffff81e62be5>] schedule_timeout+0x265/0x330
> [  243.429563]  [<ffffffff8109f125>] ? kvm_clock_read+0x25/0x40
> [  243.430896]  [<ffffffff8109f149>] ? kvm_clock_get_cycles+0x9/0x10
> [  243.432360]  [<ffffffff81125edc>] ? ktime_get+0x3c/0xb0
> [  243.433556]  [<ffffffff81e5db54>] io_schedule_timeout+0xa4/0x110
> [  243.434932]  [<ffffffff81e5eed6>] wait_for_completion_io+0xd6/0x110
> [  243.436297]  [<ffffffff810decd0>] ? wake_up_q+0x70/0x70
> [  243.437436]  [<ffffffff817d6f06>] submit_bio_wait+0x56/0x70
> [  243.438671]  [<ffffffff817e851a>] blkdev_issue_discard+0x6a/0xb0
> [  243.439980]  [<ffffffff810dab69>] ? __might_sleep+0x49/0x80
> [  243.441182]  [<ffffffff817eea87>] blk_ioctl_discard+0x97/0xb0
> [  243.442370]  [<ffffffff817ef7bb>] blkdev_ioctl+0x7eb/0x9a0
> [  243.443485]  [<ffffffff81236a1d>] block_ioctl+0x3d/0x50
> [  243.444552]  [<ffffffff812100df>] do_vfs_ioctl+0x8f/0x670
> [  243.445630]  [<ffffffff81002434>] ? exit_to_usermode_loop+0x94/0xb0
> [  243.446902]  [<ffffffff81210739>] SyS_ioctl+0x79/0x90
> [  243.447927]  [<ffffffff81002bc5>] ? syscall_return_slowpath+0xf5/0x190
> [  243.449236]  [<ffffffff81e63d32>] entry_SYSCALL_64_fastpath+0x1a/0xa4
> 
> This only reproduces when the underlying filesystem is mounted with
> -o dax, so there is a bad interaction with loop devices and DAX
> occurring somewhere. generic/361 is a recent test (committed June 14)
> so this probably hasn't actually been tested until now.
> 
> I haven't got time to look at this right now, hence the report.
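
For anyone reading the trace above: the blocked path is SyS_ioctl ->
blkdev_ioctl -> blk_ioctl_discard -> blkdev_issue_discard -> submit_bio_wait,
i.e. a discard ioctl on the loop device that never completes. Below is a
minimal Python sketch of issuing that kind of request from userspace. It is
not mkfs.xfs's code; the helper names are made up for illustration, the
whole-device range is an assumption, and the request numbers are the x86-64
values of BLKDISCARD and BLKGETSIZE64 from <linux/fs.h>.

```python
import fcntl
import os
import struct

# ioctl request numbers from <linux/fs.h> (values as computed on x86-64):
# BLKDISCARD = _IO(0x12, 119), BLKGETSIZE64 = _IOR(0x12, 114, size_t).
BLKDISCARD = 0x1277
BLKGETSIZE64 = 0x80081272


def discard_range(start, length):
    """Pack the byte range BLKDISCARD expects: two u64s, start then length."""
    return struct.pack("=QQ", start, length)


def device_size(fd):
    """Return the size of the block device in bytes via BLKGETSIZE64."""
    buf = fcntl.ioctl(fd, BLKGETSIZE64, b"\0" * 8)
    return struct.unpack("=Q", buf)[0]


def discard_whole_device(path):
    """Discard every block on `path` (hypothetical helper).

    DESTRUCTIVE: needs root and a scratch device such as a loop device.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        size = device_size(fd)
        # On an affected kernel this is the call that would block, matching
        # the blk_ioctl_discard frame in the trace.
        fcntl.ioctl(fd, BLKDISCARD, discard_range(0, size))
        return size
    finally:
        os.close(fd)
```

Calling discard_whole_device("/dev/loop0") on the setup from the report
should exercise the same ioctl path without going through mkfs.xfs, which
may help separate the discard itself from anything else mkfs does.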

Cool, thanks for the report.  I've reproduced this with linux/master, and the
test passes with v4.7.

Running a bisect...
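
A rough sketch of how the reproducer could be scripted so that a hang is
reported as a result instead of blocking the shell, which is handy as the
good/bad check while bisecting. This is only an illustration in Python, not
the generic/361 test itself. The function names, the 150 second timeout
(picked to sit above the 120 second hung-task warning), and the use of the
full /mnt/scratch/test.img path for the xfs_io step are assumptions.

```python
import subprocess


def setup_commands(pmem_dev, mount_point, image):
    """Commands from the report that prepare the DAX filesystem and image."""
    return [
        ["mkfs.xfs", "-f", pmem_dev],
        ["mount", "-o", "dax", pmem_dev, mount_point],
        ["xfs_io", "-f", "-c", "truncate 1g", image],
    ]


def finishes_within(argv, timeout_s):
    """Run argv; return True if it exits in time, False if still running.

    A task stuck in uninterruptible sleep (state D) cannot be killed, so on
    timeout the child is left alone instead of being killed and waited for.
    """
    proc = subprocess.Popen(argv)
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False
    return True


def reproduce(pmem_dev="/dev/pmem1", mount_point="/mnt/scratch", timeout_s=150):
    """Return True if mkfs.xfs on the loop device hangs.

    DESTRUCTIVE: reformats pmem_dev. Needs root.
    """
    image = mount_point + "/test.img"
    for argv in setup_commands(pmem_dev, mount_point, image):
        subprocess.run(argv, check=True)
    # losetup --show prints the loop device it picked (/dev/loop0 in the report).
    loop_dev = subprocess.run(
        ["losetup", "-f", "--show", image],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    return not finishes_within(["mkfs.xfs", "-f", loop_dev], timeout_s)
```

Popen.wait() with a timeout is used rather than subprocess.run(timeout=...)
because run() kills the child and then waits for it, and that wait would
itself never return if the child is stuck in D state.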