From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Ross Zwisler <ross.zwisler@linux.intel.com>,
Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
linux-mm@kvack.org, Dan Williams <dan.j.williams@intel.com>,
linux-nvdimm@lists.01.org, Matthew Wilcox <willy@linux.intel.com>
Subject: Re: [RFC v3] [PATCH 0/18] DAX page fault locking
Date: Tue, 10 May 2016 16:39:37 -0600
Message-ID: <20160510223937.GA10222@linux.intel.com>
In-Reply-To: <20160510203003.GA5314@linux.intel.com>
On Tue, May 10, 2016 at 02:30:03PM -0600, Ross Zwisler wrote:
> On Tue, May 10, 2016 at 05:28:14PM +0200, Jan Kara wrote:
> > On Mon 09-05-16 11:38:28, Jan Kara wrote:
> > Somehow, I'm not able to reproduce the warnings... Anyway, I think I see
> > what's going on. Can you check whether the warning goes away when you
> > change the condition at the end of page_cache_tree_delete() to:
> >
> > if (!dax_mapping(mapping) && !workingset_node_pages(node) &&
> > list_empty(&node->private_list)) {
>
> Yep, this took care of both of the issues that I reported. I'll restart my
> testing with this in my baseline, but as of this fix I don't have any more
> open testing issues. :)
Well, looks like I spoke too soon. The two tests that were failing for me are
now passing, but I can still create what looks like a related failure using
XFS, DAX, and the two xfstests generic/231 and generic/232 run back-to-back.
Here's the shell output:
# ./check generic/231 generic/232
FSTYP -- xfs (debug)
PLATFORM -- Linux/x86_64 alara 4.6.0-rc5jan_testing_2+
MKFS_OPTIONS -- -f -bsize=4096 /dev/pmem0p2
MOUNT_OPTIONS -- -o dax -o context=system_u:object_r:nfs_t:s0 /dev/pmem0p2 /mnt/xfstests_scratch
generic/231 88s ... 88s
generic/232 2s ..../check: line 543: 9105 Segmentation fault ./$seq > $tmp.rawout 2>&1
[failed, exit status 139] - output mismatch (see /root/xfstests/results//generic/232.out.bad)
--- tests/generic/232.out 2015-10-02 10:19:36.806795894 -0600
+++ /root/xfstests/results//generic/232.out.bad 2016-05-10 16:17:54.805637876 -0600
@@ -3,5 +3,3 @@
Testing fsstress
seed = S
-Comparing user usage
-Comparing group usage
...
(Run 'diff -u tests/generic/232.out /root/xfstests/results//generic/232.out.bad' to see the entire diff)
and the serial log:
run fstests generic/232 at 2016-05-10 16:17:53
XFS (pmem0p2): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
XFS (pmem0p2): Mounting V5 Filesystem
XFS (pmem0p2): Ending clean mount
XFS (pmem0p2): Quotacheck needed: Please wait.
XFS (pmem0p2): Quotacheck: Done.
------------[ cut here ]------------
kernel BUG at mm/workingset.c:423!
invalid opcode: 0000 [#1] SMP
Modules linked in: nd_pmem nd_btt nd_e820 libnvdimm
CPU: 1 PID: 9105 Comm: 232 Not tainted 4.6.0-rc5jan_testing_2+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
task: ffff8801f4eb98c0 ti: ffff88040e5f0000 task.ti: ffff88040e5f0000
RIP: 0010:[<ffffffff81207b93>] [<ffffffff81207b93>] shadow_lru_isolate+0x183/0x1a0
RSP: 0018:ffff88040e5f3be8 EFLAGS: 00010006
RAX: ffff880401f68270 RBX: ffff880401f68260 RCX: ffff880401f68470
RDX: 000000000000006c RSI: 0000000000000000 RDI: ffff880410b2bd80
RBP: ffff88040e5f3c10 R08: 0000000000000008 R09: 0000000000000000
R10: ffff8800b59eb840 R11: 0000000000000080 R12: ffff880410b2bd80
R13: ffff8800b59eb828 R14: ffff8800b59eb810 R15: ffff880410b2bdc8
FS: 00007fb73c58c700(0000) GS:ffff88041a200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb73c5a4000 CR3: 000000040e139000 CR4: 00000000000006e0
Stack:
ffff880410b2bd80 ffff880410b2bdc8 ffff88040e5f3d18 ffff880410b2bdc8
ffff880401f68260 ffff88040e5f3c60 ffffffff81206c7f 0000000000000000
0000000000000000 ffffffff81207a10 ffff88040e5f3d10 0000000000000000
Call Trace:
[<ffffffff81206c7f>] __list_lru_walk_one.isra.3+0x9f/0x150 mm/list_lru.c:223
[<ffffffff81206d53>] list_lru_walk_one+0x23/0x30 mm/list_lru.c:263
[< inline >] list_lru_shrink_walk include/linux/list_lru.h:170
[<ffffffff81207bea>] scan_shadow_nodes+0x3a/0x50 mm/workingset.c:457
[< inline >] do_shrink_slab mm/vmscan.c:344
[<ffffffff811ea37e>] shrink_slab.part.40+0x1fe/0x420 mm/vmscan.c:442
[<ffffffff811ea5c9>] shrink_slab+0x29/0x30 mm/vmscan.c:406
[<ffffffff811ec831>] drop_slab_node+0x31/0x60 mm/vmscan.c:460
[<ffffffff811ec89f>] drop_slab+0x3f/0x70 mm/vmscan.c:471
[<ffffffff812d8c39>] drop_caches_sysctl_handler+0x69/0xb0 fs/drop_caches.c:58
[<ffffffff812f2937>] proc_sys_call_handler+0xe7/0x100 fs/proc/proc_sysctl.c:543
[<ffffffff812f2964>] proc_sys_write+0x14/0x20 fs/proc/proc_sysctl.c:561
[<ffffffff81269aa7>] __vfs_write+0x37/0x120 fs/read_write.c:529
[<ffffffff8126a3fc>] vfs_write+0xac/0x1a0 fs/read_write.c:578
[< inline >] SYSC_write fs/read_write.c:625
[<ffffffff8126b8d8>] SyS_write+0x58/0xd0 fs/read_write.c:617
[<ffffffff81a92a3c>] entry_SYSCALL_64_fastpath+0x1f/0xbd arch/x86/entry/entry_64.S:207
Code: 66 90 66 66 90 e8 4e 53 88 00 fa 66 66 90 66 66 90 e8 52 5c ef ff 4c 89 e7 e8 ba a1 88 00 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 66 66 66 66 66 66 2e
RIP [<ffffffff81207b93>] shadow_lru_isolate+0x183/0x1a0 mm/workingset.c:448
RSP <ffff88040e5f3be8>
---[ end trace c4ff9bc94605ec45 ]---
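The call trace above enters through proc_sys_write() and
drop_caches_sysctl_handler() into drop_slab(), so the BUG fires while the slab
shrinkers run in response to a write to /proc/sys/vm/drop_caches. A minimal
sketch of that trigger follows. The helper name and its optional path argument
are hypothetical, and the value 2 (reclaimable slab objects only) is an
assumption chosen to isolate the shrinker path; the test itself may write a
different value, such as 3.

```shell
#!/bin/sh
# Hypothetical helper, not part of xfstests: ask the kernel to drop
# reclaimable slab objects, which runs the registered slab shrinkers
# (the trace above reaches scan_shadow_nodes() this way).
# The optional argument overrides the sysctl path, which is only useful
# for dry runs against an ordinary file.
drop_slab_caches() {
    target=${1:-/proc/sys/vm/drop_caches}
    # Flush dirty data first so that more objects are reclaimable.
    sync
    # 1 = page cache, 2 = reclaimable slab objects, 3 = both.
    echo 2 > "$target"
}
```

Writing to the real sysctl requires root. Calling drop_slab_caches with no
argument right after the fsstress phase of generic/232 might be enough to hit
the same path, but that is untested.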
This was against a tree with your most recent fix. The full tree can be found
here:
https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=jan_testing
This reproduces on only about half of the runs of these tests on my system.
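Since the failure is intermittent, a small loop can help estimate how often it
reproduces. This is a sketch only: repro_rate is a hypothetical helper, and it
assumes that ./check exits with a non-zero status when a test fails. After a
kernel BUG the machine may not be in a usable state for further runs, so a
reboot between iterations may be needed in practice.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and report how many runs failed.
# Usage: repro_rate N command [args...]
repro_rate() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Any non-zero exit status counts as a failed run.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$runs runs failed"
}

# Assumed invocation from the xfstests directory, matching the command above:
# repro_rate 10 ./check generic/231 generic/232
```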
Thanks,
- Ross