* Re: parallel lookups on NFS
@ 2016-04-24 19:18 ` Al Viro
0 siblings, 0 replies; 25+ messages in thread
From: Al Viro @ 2016-04-24 19:18 UTC (permalink / raw)
To: Jeff Layton; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust, Linus Torvalds
On Sun, Apr 24, 2016 at 08:46:15AM -0400, Jeff Layton wrote:
> > Suggestions?  Right now my local tree has nfs_lookup() and
> > nfs_readdir() run with directory locked shared.  And they are still
> > serialized by the damn ->silly_count ;-/
> >
>
> Hmm...well, most of that was added in commit 565277f63c61. Looking at
> the bug referenced in that commit log, I think that the main thing we
> want to ensure is that rmdir waits until all of the sillydeletes for
> files in its directory have finished.
>
> But...there's also some code to ensure that if a lookup races in while
> we're doing the sillydelete that we transfer all of the dentry info to
> the new alias. That's the messy part.
It's a bit more - AFAICS, it also wants to prevent lookup coming after we'd
started with building an unlink request and getting serviced before that
unlink.
> The new infrastructure for parallel lookups might make it simpler
> actually. When we go to do the sillydelete, could we add the dentry to
> the "lookup in progress" hash that you're adding as part of the
> parallel lookup infrastructure? Then the tasks doing lookups could find
> it and just wait on the sillydelete to complete. After the sillydelete,
> we'd turn the thing into a negative dentry and then wake up the waiters
> (if any). That does make the whole dentry teardown a bit more complex
> though.
Umm... A lot more complex, unfortunately - if anything, I would allocate
a _new_ dentry at sillyrename time and use it pretty much instead of
your nfs_unlinkdata. Unhashed, negative and pinned down. And insert it
into in-lookup hash only when we get around to actual delayed unlink.
First of all, dentries in in-lookup hash *must* be negative before they can
be inserted there. That wouldn't be so much PITA per se, but we also have
a problem with the sucker possibly being on a shrink list and only the owner
of the shrink list can remove it from there. So this kind of reuse would be
hard to arrange.
Do we want to end up with a negative hashed dentry after that unlink, though?
Sure, we know it's negative, but will anyone really try to look it up
afterwards? IOW, is it anything but a hash pollution? What's more,
unlike in-lookup hash, there are assumptions about the primary one; namely,
directory-modifying operation can be certain that nobody else will be adding
entries to hash behind its back, positive or negative.
I'm not at all sure that NFS doesn't rely upon this. The in-lookup hash
has no such warranties (and still very few users, of course), so we can
afford adding stuff to it without bothering with locking the parent directory.
_IF_ we don't try to add anything to primary hash, we can bloody well use it
without ->i_rwsem on the parent.
AFAICS, ->args is redundant if you have the sucker with the right name/parent.
So's ->dir; the rest is probably still needed, so a shrunk nfs_unlinkdata
would still be needed, more's the pity... Well, we can point ->d_fsdata of
the replacement dentry to it.
And yes, it means allocation in two pieces instead of just one when we hit
sillyrename. Seeing that it's doing at least two RPC roundtrips right in
nfs_sillyrename(), I think that overhead isn't a serious worry.
What we get out of that is fully parallel lookup/readdir/sillyunlink - all
exclusion is on per-name basis (nfs_prime_dcache() vs. nfs_lookup() vs.
nfs_do_call_unlink()). It will require a bit of care in atomic_open(),
though...
I'll play with that a bit and see what can be done...
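A toy model may make the per-name exclusion concrete. The sketch below is plain userspace C, not kernel code: the table, the integer parent ids and the names name_begin()/name_end() are all made up for illustration, and where the kernel would put the second caller to sleep this model merely reports a conflict. It only shows that two operations collide when they target the same name in the same directory, and proceed independently otherwise.

```c
#include <string.h>

#define MAX_INFLIGHT 8
#define MAX_NAME     64

/* One in-flight operation (lookup, readdir priming or delayed unlink). */
struct inflight {
    int  used;
    int  parent;              /* hypothetical directory id */
    char name[MAX_NAME];
};

static struct inflight inflight_table[MAX_INFLIGHT];

static struct inflight *inflight_find(int parent, const char *name)
{
    for (int i = 0; i < MAX_INFLIGHT; i++) {
        struct inflight *e = &inflight_table[i];
        if (e->used && e->parent == parent && strcmp(e->name, name) == 0)
            return e;
    }
    return NULL;
}

/*
 * Claim (parent, name).  Returns 1 if the caller now owns the name,
 * 0 if somebody else has it in flight (the kernel would wait here),
 * -1 if the name is too long or the toy table is full.
 */
static int name_begin(int parent, const char *name)
{
    if (strlen(name) >= MAX_NAME)
        return -1;
    if (inflight_find(parent, name))
        return 0;
    for (int i = 0; i < MAX_INFLIGHT; i++) {
        struct inflight *e = &inflight_table[i];
        if (!e->used) {
            e->used = 1;
            e->parent = parent;
            strcpy(e->name, name);
            return 1;
        }
    }
    return -1;
}

/* Release (parent, name); a real implementation would wake waiters here. */
static void name_end(int parent, const char *name)
{
    struct inflight *e = inflight_find(parent, name);
    if (e)
        e->used = 0;
}
```

In this model name_begin(1, "foo") and name_begin(1, "bar") both succeed, while a second name_begin(1, "foo") reports a conflict until name_end(1, "foo") has run. Nothing here takes a per-directory lock, which is the point of the scheme described above.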
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: parallel lookups on NFS
2016-04-24 19:18 ` Al Viro
(?)
@ 2016-04-24 20:51 ` Jeff Layton
-1 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2016-04-24 20:51 UTC (permalink / raw)
To: Al Viro; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust, Linus Torvalds
On Sun, 2016-04-24 at 20:18 +0100, Al Viro wrote:
> On Sun, Apr 24, 2016 at 08:46:15AM -0400, Jeff Layton wrote:
> >
> > >
> > > Suggestions? Right now my local tree has nfs_lookup() and
> > > nfs_readdir() run with directory locked shared. And they are still
> > > serialized by the damn ->silly_count ;-/
> > >
> > Hmm...well, most of that was added in commit 565277f63c61. Looking at
> > the bug referenced in that commit log, I think that the main thing we
> > want to ensure is that rmdir waits until all of the sillydeletes for
> > files in its directory have finished.
> >
> > But...there's also some code to ensure that if a lookup races in while
> > we're doing the sillydelete that we transfer all of the dentry info to
> > the new alias. That's the messy part.
> It's a bit more - AFAICS, it also wants to prevent lookup coming after we'd
> started with building an unlink request and getting serviced before that
> unlink.
>
> >
> > The new infrastructure for parallel lookups might make it simpler
> > actually. When we go to do the sillydelete, could we add the dentry to
> > the "lookup in progress" hash that you're adding as part of the
> > parallel lookup infrastructure? Then the tasks doing lookups could find
> > it and just wait on the sillydelete to complete. After the sillydelete,
> > we'd turn the thing into a negative dentry and then wake up the waiters
> > (if any). That does make the whole dentry teardown a bit more complex
> > though.
> Umm... A lot more complex, unfortunately - if anything, I would allocate
> a _new_ dentry at sillyrename time and use it pretty much instead of
> your nfs_unlinkdata. Unhashed, negative and pinned down. And insert it
> into in-lookup hash only when we get around to actual delayed unlink.
>
> First of all, dentries in in-lookup hash *must* be negative before they can
> be inserted there. That wouldn't be so much PITA per se, but we also have
> a problem with the sucker possibly being on a shrink list and only the owner
> of the shrink list can remove it from there. So this kind of reuse would be
> hard to arrange.
>
> Do we want to end up with a negative hashed dentry after that unlink, though?
> Sure, we know it's negative, but will anyone really try to look it up
> afterwards? IOW, is it anything but a hash pollution? What's more,
> unlike in-lookup hash, there are assumptions about the primary one; namely,
> directory-modifying operation can be certain that nobody else will be adding
> entries to hash behind its back, positive or negative.
>
You may be right. One problematic scenario is something like this:
A READDIR is issued that ends up racing with a sillydelete. The READDIR response comes in first and we start processing entries. Eventually it hits the dentry that's being sillydeleted and blocks on it. The dentry is deleted on the server so we create a negative dentry and wake up the READDIR. It then reinstantiates the dentry as positive.
Hmm...is that really a problem though? Maybe not since we'll just issue the RMDIR to the server at that point and the directory _would_ be empty there. Alternately we'd probably just end up flushing the dentry when we notice that the dir has changed.
In any case, there probably is little benefit to keeping the dentry around. If d_drop'ing it makes things simpler, then I've no objection.
> I'm not at all sure that NFS doesn't rely upon this. The in-lookup hash
> has no such warranties (and still very few users, of course), so we can
> afford adding stuff to it without bothering with locking the parent directory.
> _IF_ we don't try to add anything to primary hash, we can bloody well use it
> without ->i_rwsem on the parent.
>
> AFAICS, ->args is redundant if you have the sucker with the right name/parent.
> So's ->dir; the rest is probably still needed, so a shrunk nfs_unlinkdata
> would still be needed, more's the pity... Well, we can point ->d_fsdata of
> the replacement dentry to it.
>
> And yes, it means allocation in two pieces instead of just one when we hit
> sillyrename. Seeing that it's doing at least two RPC roundtrips right in
> nfs_sillyrename(), I think that overhead isn't a serious worry.
>
Agreed.
> What we get out of that is fully parallel lookup/readdir/sillyunlink - all
> exclusion is on per-name basis (nfs_prime_dcache() vs. nfs_lookup() vs.
> nfs_do_call_unlink()). It will require a bit of care in atomic_open(),
> though...
>
> I'll play with that a bit and see what can be done...
--
Jeff Layton <jlayton@poochiereds.net>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: parallel lookups on NFS
2016-04-24 19:18 ` Al Viro
(?)
(?)
@ 2016-04-29 7:58 ` Al Viro
2016-04-30 13:15 ` Jeff Layton
-1 siblings, 1 reply; 25+ messages in thread
From: Al Viro @ 2016-04-29 7:58 UTC (permalink / raw)
To: Jeff Layton; +Cc: linux-nfs, linux-fsdevel, Trond Myklebust, Linus Torvalds
On Sun, Apr 24, 2016 at 08:18:35PM +0100, Al Viro wrote:
> What we get out of that is fully parallel lookup/readdir/sillyunlink - all
> exclusion is on per-name basis (nfs_prime_dcache() vs. nfs_lookup() vs.
> nfs_do_call_unlink()). It will require a bit of care in atomic_open(),
> though...
>
> I'll play with that a bit and see what can be done...
OK, a bunch of atomic_open cleanups (moderately tested) +
almost untested sillyunlink patch are in vfs.git#untested.nfs.
It ought to make lookups (and readdir, and !O_CREAT case of atomic_open)
on NFS really execute in parallel. Folks, please hit that sucker with
NFS torture tests. In particular, the stuff mentioned in commit
565277f6 would be interesting to try.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: parallel lookups on NFS
2016-04-29 7:58 ` Al Viro
@ 2016-04-30 13:15 ` Jeff Layton
0 siblings, 0 replies; 25+ messages in thread
From: Jeff Layton @ 2016-04-30 13:15 UTC (permalink / raw)
To: Al Viro
Cc: linux-nfs, linux-fsdevel, Trond Myklebust, Linus Torvalds,
Anna Schumaker
On Fri, 2016-04-29 at 08:58 +0100, Al Viro wrote:
> On Sun, Apr 24, 2016 at 08:18:35PM +0100, Al Viro wrote:
>
> >
> > What we get out of that is fully parallel
> > lookup/readdir/sillyunlink - all
> > exclusion is on per-name basis (nfs_prime_dcache() vs. nfs_lookup()
> > vs.
> > nfs_do_call_unlink()). It will require a bit of care in
> > atomic_open(),
> > though...
> >
> > I'll play with that a bit and see what can be done...
> OK, a bunch of atomic_open cleanups (moderately tested) +
> almost untested sillyunlink patch are in vfs.git#untested.nfs.
>
> It ought to make lookups (and readdir, and !O_CREAT case of
> atomic_open)
> on NFS really execute in parallel. Folks, please hit that sucker
> with
> NFS torture tests. In particular, the stuff mentioned in commit
> 565277f6 would be interesting to try.
I pulled down the branch and built it, and then ran the cthon special
tests 100 times in a loop, and ran "ls -l" on the test directory in a
loop at the same time. On pass 42, I hit this:
[ 1168.630763] general protection fault: 0000 [#1] SMP
[ 1168.631617] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xfs snd_hda_codec_generic snd_hda_intel snd_hda_codec libcrc32c snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd nfsd joydev ppdev soundcore acpi_cpufreq virtio_net pcspkr i2c_piix4 tpm_tis tpm parport_pc parport virtio_balloon floppy pvpanic nfs_acl lockd auth_rpcgss grace sunrpc qxl drm_kms_helper ttm drm virtio_console virtio_blk virtio_pci virtio_ring virtio serio_raw ata_generic pata_acpi
[ 1168.638448] CPU: 3 PID: 1850 Comm: op_ren Not tainted 4.6.0-rc1+ #25
[ 1168.639413] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 1168.640146] task: ffff880035cf5400 ti: ffff8800d064c000 task.ti: ffff8800d064c000
[ 1168.641107] RIP: 0010:[<ffffffff811f6488>] [<ffffffff811f6488>] kmem_cache_alloc+0x78/0x160
[ 1168.642292] RSP: 0018:ffff8800d064fa90 EFLAGS: 00010246
[ 1168.642978] RAX: 73747365746e7572 RBX: 0000000000000894 RCX: 0000000000000020
[ 1168.643920] RDX: 0000000000318271 RSI: 00000000024080c0 RDI: 000000000001a440
[ 1168.644862] RBP: ffff8800d064fac0 R08: ffff88021fd9a440 R09: ffff880035b82400
[ 1168.645794] R10: 0000000000000000 R11: ffff8800d064fb70 R12: 00000000024080c0
[ 1168.646762] R13: ffffffff81317667 R14: ffff880217001d00 R15: 73747365746e7572
[ 1168.647650] FS: 00007f0cb8295700(0000) GS:ffff88021fd80000(0000) knlGS:0000000000000000
[ 1168.648639] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1168.649330] CR2: 0000000000401090 CR3: 0000000035f9a000 CR4: 00000000000006e0
[ 1168.650239] Stack:
[ 1168.650498] 00ff8800026084c0 0000000000000894 ffff880035b82400 ffff8800d064fd14
[ 1168.651509] ffff8800d0650000 ffff880201ca5e38 ffff8800d064fae0 ffffffff81317667
[ 1168.652506] ffffffff81c9b140 ffff880035b82400 ffff8800d064fb00 ffffffff8130ef03
[ 1168.653494] Call Trace:
[ 1168.653889] [<ffffffff81317667>] selinux_file_alloc_security+0x37/0x60
[ 1168.654728] [<ffffffff8130ef03>] security_file_alloc+0x33/0x50
[ 1168.655447] [<ffffffff812117da>] get_empty_filp+0x9a/0x1c0
[ 1168.656231] [<ffffffff81399d96>] ? copy_to_iter+0x1b6/0x260
[ 1168.656999] [<ffffffff8121d75e>] path_openat+0x2e/0x1660
[ 1168.657645] [<ffffffff81103133>] ? current_fs_time+0x23/0x30
[ 1168.658311] [<ffffffff81399d96>] ? copy_to_iter+0x1b6/0x260
[ 1168.658999] [<ffffffff81103133>] ? current_fs_time+0x23/0x30
[ 1168.659742] [<ffffffff8122bea3>] ? touch_atime+0x23/0xa0
[ 1168.660435] [<ffffffff8121fe3e>] do_filp_open+0x7e/0xe0
[ 1168.661072] [<ffffffff8120e8d7>] ? __vfs_read+0xa7/0xd0
[ 1168.661792] [<ffffffff8120e8d7>] ? __vfs_read+0xa7/0xd0
[ 1168.662410] [<ffffffff811f6444>] ? kmem_cache_alloc+0x34/0x160
[ 1168.663130] [<ffffffff81214d94>] do_open_execat+0x64/0x150
[ 1168.664100] [<ffffffff8121524b>] open_exec+0x2b/0x50
[ 1168.664949] [<ffffffff8126302a>] load_elf_binary+0x29a/0x1670
[ 1168.665880] [<ffffffff811c43d4>] ? get_user_pages_remote+0x54/0x60
[ 1168.666843] [<ffffffff81215fac>] ? copy_strings.isra.30+0x25c/0x370
[ 1168.667812] [<ffffffff8121595e>] search_binary_handler+0x9e/0x1d0
[ 1168.668753] [<ffffffff8121714c>] do_execveat_common.isra.41+0x4fc/0x6d0
[ 1168.669753] [<ffffffff812175ba>] SyS_execve+0x3a/0x50
[ 1168.670560] [<ffffffff81003cb2>] do_syscall_64+0x62/0x110
[ 1168.671384] [<ffffffff8174ae21>] entry_SYSCALL64_slow_path+0x25/0x25
[ 1168.672305] Code: 49 83 78 10 00 4d 8b 38 0f 84 bd 00 00 00 4d 85 ff 0f 84 b4 00 00 00 49 63 46 20 49 8b 3e 4c 01 f8 40 f6 c7 0f 0f 85 cf 00 00 00 <48> 8b 18 48 8d 4a 01 4c 89 f8 65 48 0f c7 0f 0f 94 c0 84 c0 74
[ 1168.676071] RIP [<ffffffff811f6488>] kmem_cache_alloc+0x78/0x160
[ 1168.677008] RSP <ffff8800d064fa90>
[ 1168.679699] general protection fault: 0000 [#2]
kmem_cache corruption maybe?
(gdb) list *(kmem_cache_alloc+0x78)
0xffffffff811f6488 is in kmem_cache_alloc (mm/slub.c:245).
240 * Core slab cache functions
241 *******************************************************************/
242
243 static inline void *get_freepointer(struct kmem_cache *s, void *object)
244 {
245 return *(void **)(object + s->offset);
246 }
247
248 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
249 {
From /sys/kernel/slab (after rebooting and rerunning the cthon tests once):
lrwxrwxrwx. 1 root root 0 Apr 30 09:10 selinux_file_security -> :t-0000016
[root@rawhide slab]# ls -l | grep :t-0000016
lrwxrwxrwx. 1 root root 0 Apr 30 09:10 kmalloc-16 -> :t-0000016
lrwxrwxrwx. 1 root root 0 Apr 30 09:10 selinux_file_security -> :t-0000016
drwxr-xr-x. 2 root root 0 Apr 30 09:11 :t-0000016
Hard to tell if this is related to your changes but it certainly did
happen in the open codepath. I'll see if I can reproduce it again.
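For readers less familiar with SLUB, here is a rough userspace sketch of why a stray write into a freed object tends to blow up later, in an unrelated allocation. It assumes the usual scheme in which each free object stores a pointer to the next free one at s->offset (taken to be 0 here). All toy_* names, the 16-byte object size and the 4-object slab are invented for the example; this is not the kernel's code and it says nothing about what actually corrupted the cache in the oops above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OBJ_SIZE 16
#define NR_OBJS  4

struct toy_cache {
    unsigned char slab[NR_OBJS][OBJ_SIZE];
    void  *freelist;   /* first free object, or NULL */
    size_t offset;     /* where a free object keeps its next-free pointer */
};

/* Same idea as get_freepointer(): read the link stored inside the object. */
static void *toy_get_freepointer(struct toy_cache *s, void *object)
{
    void *next;
    memcpy(&next, (unsigned char *)object + s->offset, sizeof(next));
    return next;
}

static void toy_set_freepointer(struct toy_cache *s, void *object, void *next)
{
    memcpy((unsigned char *)object + s->offset, &next, sizeof(next));
}

static void toy_cache_init(struct toy_cache *s)
{
    s->offset = 0;
    s->freelist = NULL;
    /* Chain all objects together, lowest address first. */
    for (int i = NR_OBJS - 1; i >= 0; i--) {
        toy_set_freepointer(s, s->slab[i], s->freelist);
        s->freelist = s->slab[i];
    }
}

/* Pop the head; the new head is whatever the popped object claims is next. */
static void *toy_alloc(struct toy_cache *s)
{
    void *object = s->freelist;
    if (object)
        s->freelist = toy_get_freepointer(s, object);
    return object;
}

static void toy_free(struct toy_cache *s, void *object)
{
    toy_set_freepointer(s, object, s->freelist);
    s->freelist = object;
}

/* NULL, or an aligned object inside the slab?  Purely a debugging check. */
static int toy_pointer_sane(struct toy_cache *s, void *p)
{
    uintptr_t base = (uintptr_t)s->slab;
    uintptr_t addr = (uintptr_t)p;

    if (!p)
        return 1;
    return addr >= base && addr < base + sizeof(s->slab) &&
           (addr - base) % OBJ_SIZE == 0;
}
```

If an object is freed and then written through a stale pointer, the next toy_alloc() still succeeds but loads the written bytes as the new list head. The allocation after that is the one that dereferences the garbage, which would match a fault in get_freepointer() in a caller that has nothing to do with the original bug.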
--
Jeff Layton <jlayton@poochiereds.net>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: parallel lookups on NFS
2016-04-30 13:15 ` Jeff Layton
(?)
@ 2016-04-30 13:22 ` Jeff Layton
2016-04-30 14:22 ` Al Viro
-1 siblings, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2016-04-30 13:22 UTC (permalink / raw)
To: Al Viro
Cc: linux-nfs, linux-fsdevel, Trond Myklebust, Linus Torvalds,
Anna Schumaker
On Sat, 2016-04-30 at 09:15 -0400, Jeff Layton wrote:
> On Fri, 2016-04-29 at 08:58 +0100, Al Viro wrote:
> >
> > On Sun, Apr 24, 2016 at 08:18:35PM +0100, Al Viro wrote:
> >
> > >
> > >
> > > What we get out of that is fully parallel
> > > lookup/readdir/sillyunlink - all
> > > exclusion is on per-name basis (nfs_prime_dcache() vs.
> > > nfs_lookup()
> > > vs.
> > > nfs_do_call_unlink()). It will require a bit of care in
> > > atomic_open(),
> > > though...
> > >
> > > I'll play with that a bit and see what can be done...
> > OK, a bunch of atomic_open cleanups (moderately tested) +
> > almost untested sillyunlink patch are in vfs.git#untested.nfs.
> >
> > It ought to make lookups (and readdir, and !O_CREAT case of
> > atomic_open)
> > on NFS really execute in parallel. Folks, please hit that sucker
> > with
> > NFS torture tests. In particular, the stuff mentioned in commit
> > 565277f6 would be interesting to try.
>
> I pulled down the branch and built it, and then ran the cthon special
> tests 100 times in a loop, and ran "ls -l" on the test directory in a
> loop at the same time. On pass 42, I hit this:
>
> [ 1168.630763] general protection fault: 0000 [#1] SMP
> [ 1168.631617] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xfs snd_hda_codec_generic snd_hda_intel snd_hda_codec libcrc32c snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd nfsd joydev ppdev soundcore acpi_cpufreq virtio_net pcspkr i2c_piix4 tpm_tis tpm parport_pc parport virtio_balloon floppy pvpanic nfs_acl lockd auth_rpcgss grace sunrpc qxl drm_kms_helper ttm drm virtio_console virtio_blk virtio_pci virtio_ring virtio serio_raw ata_generic pata_acpi
> [ 1168.638448] CPU: 3 PID: 1850 Comm: op_ren Not tainted 4.6.0-rc1+ #25
> [ 1168.639413] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [ 1168.640146] task: ffff880035cf5400 ti: ffff8800d064c000 task.ti: ffff8800d064c000
> [ 1168.641107] RIP: 0010:[<ffffffff811f6488>] [<ffffffff811f6488>] kmem_cache_alloc+0x78/0x160
> [ 1168.642292] RSP: 0018:ffff8800d064fa90 EFLAGS: 00010246
> [ 1168.642978] RAX: 73747365746e7572 RBX: 0000000000000894 RCX: 0000000000000020
> [ 1168.643920] RDX: 0000000000318271 RSI: 00000000024080c0 RDI: 000000000001a440
> [ 1168.644862] RBP: ffff8800d064fac0 R08: ffff88021fd9a440 R09: ffff880035b82400
> [ 1168.645794] R10: 0000000000000000 R11: ffff8800d064fb70 R12: 00000000024080c0
> [ 1168.646762] R13: ffffffff81317667 R14: ffff880217001d00 R15: 73747365746e7572
> [ 1168.647650] FS: 00007f0cb8295700(0000) GS:ffff88021fd80000(0000) knlGS:0000000000000000
> [ 1168.648639] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1168.649330] CR2: 0000000000401090 CR3: 0000000035f9a000 CR4: 00000000000006e0
> [ 1168.650239] Stack:
> [ 1168.650498] 00ff8800026084c0 0000000000000894 ffff880035b82400 ffff8800d064fd14
> [ 1168.651509] ffff8800d0650000 ffff880201ca5e38 ffff8800d064fae0 ffffffff81317667
> [ 1168.652506] ffffffff81c9b140 ffff880035b82400 ffff8800d064fb00 ffffffff8130ef03
> [ 1168.653494] Call Trace:
> [ 1168.653889] [<ffffffff81317667>] selinux_file_alloc_security+0x37/0x60
> [ 1168.654728] [<ffffffff8130ef03>] security_file_alloc+0x33/0x50
> [ 1168.655447] [<ffffffff812117da>] get_empty_filp+0x9a/0x1c0
> [ 1168.656231] [<ffffffff81399d96>] ? copy_to_iter+0x1b6/0x260
> [ 1168.656999] [<ffffffff8121d75e>] path_openat+0x2e/0x1660
> [ 1168.657645] [<ffffffff81103133>] ? current_fs_time+0x23/0x30
> [ 1168.658311] [<ffffffff81399d96>] ? copy_to_iter+0x1b6/0x260
> [ 1168.658999] [<ffffffff81103133>] ? current_fs_time+0x23/0x30
> [ 1168.659742] [<ffffffff8122bea3>] ? touch_atime+0x23/0xa0
> [ 1168.660435] [<ffffffff8121fe3e>] do_filp_open+0x7e/0xe0
> [ 1168.661072] [<ffffffff8120e8d7>] ? __vfs_read+0xa7/0xd0
> [ 1168.661792] [<ffffffff8120e8d7>] ? __vfs_read+0xa7/0xd0
> [ 1168.662410] [<ffffffff811f6444>] ? kmem_cache_alloc+0x34/0x160
> [ 1168.663130] [<ffffffff81214d94>] do_open_execat+0x64/0x150
> [ 1168.664100] [<ffffffff8121524b>] open_exec+0x2b/0x50
> [ 1168.664949] [<ffffffff8126302a>] load_elf_binary+0x29a/0x1670
> [ 1168.665880] [<ffffffff811c43d4>] ? get_user_pages_remote+0x54/0x60
> [ 1168.666843] [<ffffffff81215fac>] ? copy_strings.isra.30+0x25c/0x370
> [ 1168.667812] [<ffffffff8121595e>] search_binary_handler+0x9e/0x1d0
> [ 1168.668753] [<ffffffff8121714c>] do_execveat_common.isra.41+0x4fc/0x6d0
> [ 1168.669753] [<ffffffff812175ba>] SyS_execve+0x3a/0x50
> [ 1168.670560] [<ffffffff81003cb2>] do_syscall_64+0x62/0x110
> [ 1168.671384] [<ffffffff8174ae21>] entry_SYSCALL64_slow_path+0x25/0x25
> [ 1168.672305] Code: 49 83 78 10 00 4d 8b 38 0f 84 bd 00 00 00 4d 85 ff 0f 84 b4 00 00 00 49 63 46 20 49 8b 3e 4c 01 f8 40 f6 c7 0f 0f 85 cf 00 00 00 <48> 8b 18 48 8d 4a 01 4c 89 f8 65 48 0f c7 0f 0f 94 c0 84 c0 74
> [ 1168.676071] RIP [<ffffffff811f6488>] kmem_cache_alloc+0x78/0x160
> [ 1168.677008] RSP <ffff8800d064fa90>
> [ 1168.679699] general protection fault: 0000 [#2]
>
>
> kmem_cache corruption maybe?
>
> (gdb) list *(kmem_cache_alloc+0x78)
> 0xffffffff811f6488 is in kmem_cache_alloc (mm/slub.c:245).
> 240 * Core slab cache functions
> 241 *******************************************************************/
> 242
> 243 static inline void *get_freepointer(struct kmem_cache *s, void *object)
> 244 {
> 245 return *(void **)(object + s->offset);
> 246 }
> 247
> 248 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
> 249 {
>
>
> From /sys/kernel/slab (after rebooting and rerunning the cthon tests once):
>
> lrwxrwxrwx. 1 root root 0 Apr 30 09:10 selinux_file_security -> :t-0000016
>
> [root@rawhide slab]# ls -l | grep :t-0000016
> lrwxrwxrwx. 1 root root 0 Apr 30 09:10 kmalloc-16 -> :t-0000016
> lrwxrwxrwx. 1 root root 0 Apr 30 09:10 selinux_file_security -> :t-0000016
> drwxr-xr-x. 2 root root 0 Apr 30 09:11 :t-0000016
>
>
> Hard to tell if this is related to your changes but it certainly did
> happen in the open codepath. I'll see if I can reproduce it again.
>
Second oops on second attempt. Different codepath this time:
[ 548.093261] general protection fault: 0000 [#1] SMP
[ 548.094023] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xfs snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device libcrc32c snd_pcm snd_timer snd acpi_cpufreq ppdev tpm_tis soundcore tpm nfsd parport_pc parport virtio_net floppy virtio_balloon joydev pvpanic i2c_piix4 pcspkr nfs_acl lockd auth_rpcgss grace sunrpc qxl drm_kms_helper ttm virtio_console virtio_blk drm serio_raw virtio_pci virtio_ring ata_generic virtio pata_acpi
[ 548.100968] CPU: 2 PID: 18173 Comm: ls Not tainted 4.6.0-rc1+ #25
[ 548.101799] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 548.102586] task: ffff8800d0275400 ti: ffff880202934000 task.ti: ffff880202934000
[ 548.103610] RIP: 0010:[<ffffffff811f9a06>] [<ffffffff811f9a06>] __kmalloc_track_caller+0x96/0x1b0
[ 548.104793] RSP: 0018:ffff880202937ad0 EFLAGS: 00010246
[ 548.105488] RAX: ff656c6966676962 RBX: 0000000000000006 RCX: 0000000000000000
[ 548.106412] RDX: 0000000000009f9f RSI: 0000000000000000 RDI: 000000000001a420
[ 548.107335] RBP: ffff880202937b00 R08: ffff88021fd1a420 R09: ff656c6966676962
[ 548.108236] R10: 0000000000000001 R11: ffff880212756240 R12: 00000000024000c0
[ 548.109375] R13: 0000000000000006 R14: ffffffffa03d9697 R15: ffff880217001e00
[ 548.110308] FS: 00007fd7839e2800(0000) GS:ffff88021fd00000(0000) knlGS:0000000000000000
[ 548.111378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 548.112136] CR2: 00007ffe35047f3c CR3: 0000000202816000 CR4: 00000000000006e0
[ 548.113064] Stack:
[ 548.113333] ffff880201065548 0000000000000006 ffff88020222e314 0000000000000000
[ 548.114338] ffff880202f563e8 ffff880202937cb0 ffff880202937b20 ffffffff811b5e50
[ 548.115335] ffff880202f56000 ffff880202937df0 ffff880202937c38 ffffffffa03d9697
[ 548.116342] Call Trace:
[ 548.116665] [<ffffffff811b5e50>] kmemdup+0x20/0x50
[ 548.117295] [<ffffffffa03d9697>] nfs_readdir_page_filler+0x277/0x5f0 [nfs]
[ 548.118173] [<ffffffffa03d9c0f>] nfs_readdir_xdr_to_array+0x1ff/0x330 [nfs]
[ 548.119058] [<ffffffffa03d9d64>] nfs_readdir_filler+0x24/0xb0 [nfs]
[ 548.119856] [<ffffffff8119772e>] ? add_to_page_cache_lru+0x6e/0xc0
[ 548.120625] [<ffffffff81197b4f>] do_read_cache_page+0x13f/0x2c0
[ 548.121397] [<ffffffffa03d9d40>] ? nfs_readdir_xdr_to_array+0x330/0x330 [nfs]
[ 548.122296] [<ffffffff81197ce9>] read_cache_page+0x19/0x20
[ 548.122995] [<ffffffffa03d9f63>] nfs_readdir+0x173/0x5f0 [nfs]
[ 548.123711] [<ffffffff81457a5a>] ? tty_write+0x1ca/0x2f0
[ 548.124568] [<ffffffffa0427cb0>] ? nfs4_xdr_dec_fsinfo+0x70/0x70 [nfsv4]
[ 548.125372] [<ffffffff812232bb>] iterate_dir+0x16b/0x1a0
[ 548.126039] [<ffffffff812236e8>] SyS_getdents+0x88/0x100
[ 548.126686] [<ffffffff812232f0>] ? iterate_dir+0x1a0/0x1a0
[ 548.127651] [<ffffffff81003cb2>] do_syscall_64+0x62/0x110
[ 548.128614] [<ffffffff8174ae21>] entry_SYSCALL64_slow_path+0x25/0x25
[ 548.129667] Code: 49 83 78 10 00 4d 8b 08 0f 84 bf 00 00 00 4d 85 c9 0f 84 b6 00 00 00 49 63 47 20 49 8b 3f 4c 01 c8 40 f6 c7 0f 0f 85 fa 00 00 00 <48> 8b 18 48 8d 4a 01 4c 89 c8 65 48 0f c7 0f 0f 94 c0 84 c0 74
[ 548.133600] RIP [<ffffffff811f9a06>] __kmalloc_track_caller+0x96/0x1b0
[ 548.134659] RSP <ffff880202937ad0>
[ 548.135353] ---[ end trace 77ac2ecac8d76afe ]---
...but looks like same problem:
(gdb) list *(__kmalloc_track_caller+0x96)
0xffffffff811f9a06 is in __kmalloc_track_caller (mm/slub.c:245).
240 * Core slab cache functions
241 *******************************************************************/
242
243 static inline void *get_freepointer(struct kmem_cache *s, void *object)
244 {
245 return *(void **)(object + s->offset);
246 }
247
248 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
249 {
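One detail that may help narrow this down: the faulting pointers in both oopses read as ASCII when their bytes are taken in memory order (little-endian, as on x86-64). The helper below is only a small decoding aid; reg_to_ascii() is a made-up name and non-printable bytes are shown as '.'.

```c
#include <stdint.h>

/*
 * Render the 8 bytes of a 64-bit register value in the order they would
 * sit in memory on a little-endian machine.  Non-printable bytes become '.'.
 */
static void reg_to_ascii(uint64_t reg, char out[9])
{
    for (int i = 0; i < 8; i++) {
        unsigned char c = (unsigned char)(reg >> (8 * i));
        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[8] = '\0';
}
```

Applied to RAX/R15 from the first oops (0x73747365746e7572) this gives "runtests", and applied to RAX/R9 from the second (0xff656c6966676962) it gives "bigfile" followed by one non-printable byte. Both look like file names, which would be consistent with a name string being written into an already freed 16-byte object, though that is an inference and not something established in this thread. Neither value is a canonical x86-64 address, which would explain why the dereference shows up as a general protection fault rather than a page fault.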
--
Jeff Layton <jlayton@poochiereds.net>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: parallel lookups on NFS
2016-04-30 13:22 ` Jeff Layton
@ 2016-04-30 14:22 ` Al Viro
0 siblings, 0 replies; 25+ messages in thread
From: Al Viro @ 2016-04-30 14:22 UTC (permalink / raw)
To: Jeff Layton
Cc: linux-nfs, linux-fsdevel, Trond Myklebust, Linus Torvalds,
Anna Schumaker
On Sat, Apr 30, 2016 at 09:22:56AM -0400, Jeff Layton wrote:
> ...but looks like same problem:
>
> (gdb) list *(__kmalloc_track_caller+0x96)
> 0xffffffff811f9a06 is in __kmalloc_track_caller (mm/slub.c:245).
> 240 * Core slab cache functions
> 241 *******************************************************************/
> 242
> 243 static inline void *get_freepointer(struct kmem_cache *s, void *object)
> 244 {
> 245 return *(void **)(object + s->offset);
> 246 }
> 247
> 248 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
> 249 {
Joy... Does that happen without the last commit as well? I realize that
memory corruption could well have been introduced earlier and changes in
the last commit had only increased the odds, but...
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: parallel lookups on NFS
2016-04-30 14:22 ` Al Viro
(?)
@ 2016-04-30 14:43 ` Jeff Layton
2016-04-30 18:58 ` Al Viro
-1 siblings, 1 reply; 25+ messages in thread
From: Jeff Layton @ 2016-04-30 14:43 UTC (permalink / raw)
To: Al Viro
Cc: linux-nfs, linux-fsdevel, Trond Myklebust, Linus Torvalds,
Anna Schumaker
On Sat, 2016-04-30 at 15:22 +0100, Al Viro wrote:
> On Sat, Apr 30, 2016 at 09:22:56AM -0400, Jeff Layton wrote:
> >
> > ...but looks like same problem:
> >
> > (gdb) list *(__kmalloc_track_caller+0x96)
> > 0xffffffff811f9a06 is in __kmalloc_track_caller (mm/slub.c:245).
> > 240 * Core slab cache functions
> > 241 *******************************************************************/
> > 242
> > 243 static inline void *get_freepointer(struct kmem_cache *s, void *object)
> > 244 {
> > 245 return *(void **)(object + s->offset);
> > 246 }
> > 247
> > 248 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
> > 249 {
> Joy... Does that happen without the last commit as well? I realize that
> memory corruption could well have been introduced earlier and changes in
> the last commit had only increased the odds, but...
Not exactly, but the test seems to have deadlocked without the last
patch in play. Here's the ls command:
[jlayton@rawhide ~]$ cat /proc/1425/stack
[<ffffffffa03d6eec>] nfs_block_sillyrename+0x5c/0xa0 [nfs]
[<ffffffffa03c8ef8>] nfs_readdir+0xf8/0x620 [nfs]
[<ffffffff812232bb>] iterate_dir+0x16b/0x1a0
[<ffffffff812236e8>] SyS_getdents+0x88/0x100
[<ffffffff81003cb2>] do_syscall_64+0x62/0x110
[<ffffffff8174ae21>] return_from_SYSCALL_64+0x0/0x6a
[<ffffffffffffffff>] 0xffffffffffffffff
...and here is the nfsidem command:
[jlayton@rawhide ~]$ cat /proc/1295/stack
[<ffffffff813953b7>] call_rwsem_down_write_failed+0x17/0x30
[<ffffffff8121f65b>] filename_create+0x6b/0x150
[<ffffffff812204e4>] SyS_mkdir+0x44/0xe0
[<ffffffff81003cb2>] do_syscall_64+0x62/0x110
[<ffffffff8174ae21>] return_from_SYSCALL_64+0x0/0x6a
[<ffffffffffffffff>] 0xffffffffffffffff
I'll have to take off here in a bit so I won't be able to help much
until later, but all I was doing was running the cthon special tests
like so:
$ ./server -p /export -s -N 100 tlielax
That makes a directory called "rawhide.test" (since the client's
hostname is "rawhide") and runs its tests in there. Then I ran this in
a different shell:
$ while true; do ls -l /mnt/tlielax/rawhide.test ; done
Probably I should run this on a stock kernel just to see if there are
preexisting problems...
--
Jeff Layton <jlayton@poochiereds.net>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: parallel lookups on NFS
2016-04-30 14:43 ` Jeff Layton
@ 2016-04-30 18:58 ` Al Viro
0 siblings, 0 replies; 25+ messages in thread
From: Al Viro @ 2016-04-30 18:58 UTC (permalink / raw)
To: Jeff Layton
Cc: linux-nfs, linux-fsdevel, Trond Myklebust, Linus Torvalds,
Anna Schumaker
On Sat, Apr 30, 2016 at 10:43:34AM -0400, Jeff Layton wrote:
> Not exactly, but the test seems to have deadlocked without the last
> patch in play. Here's the ls command:
>
> [jlayton@rawhide ~]$ cat /proc/1425/stack
> [<ffffffffa03d6eec>] nfs_block_sillyrename+0x5c/0xa0 [nfs]
> [<ffffffffa03c8ef8>] nfs_readdir+0xf8/0x620 [nfs]
> [<ffffffff812232bb>] iterate_dir+0x16b/0x1a0
> [<ffffffff812236e8>] SyS_getdents+0x88/0x100
> [<ffffffff81003cb2>] do_syscall_64+0x62/0x110
> [<ffffffff8174ae21>] return_from_SYSCALL_64+0x0/0x6a
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> ...and here is the nfsidem command:
>
> [jlayton@rawhide ~]$ cat /proc/1295/stack
> [<ffffffff813953b7>] call_rwsem_down_write_failed+0x17/0x30
> [<ffffffff8121f65b>] filename_create+0x6b/0x150
> [<ffffffff812204e4>] SyS_mkdir+0x44/0xe0
> [<ffffffff81003cb2>] do_syscall_64+0x62/0x110
> [<ffffffff8174ae21>] return_from_SYSCALL_64+0x0/0x6a
> [<ffffffffffffffff>] 0xffffffffffffffff
>
>
> I'll have to take off here in a bit so I won't be able to help much
> until later, but all I was doing was running the cthon special tests
> like so:
>
> $ ./server -p /export -s -N 100 tlielax
>
> That makes a directory called "rawhide.test" (since the client's
> hostname is "rawhide") and runs its tests in there. Then I ran this in
> a different shell:
>
> $ while true; do ls -l /mnt/tlielax/rawhide.test ; done
>
> Probably I should run this on a stock kernel just to see if there are
> preexisting problems...
FWIW, I could reproduce that (and I really wonder WTF is going on - looks
like nfs_async_unlink_release() getting lost somehow), but not the memory
corruption with the last commit... What .config are you using?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: parallel lookups on NFS
2016-04-30 18:58 ` Al Viro
(?)
@ 2016-04-30 19:29 ` Al Viro
[not found] ` <1462048765.10011.44.camel@poochiereds.net>
-1 siblings, 1 reply; 25+ messages in thread
From: Al Viro @ 2016-04-30 19:29 UTC (permalink / raw)
To: Jeff Layton
Cc: linux-nfs, linux-fsdevel, Trond Myklebust, Linus Torvalds,
Anna Schumaker
On Sat, Apr 30, 2016 at 07:58:36PM +0100, Al Viro wrote:
> FWIW, I could reproduce that (and I really wonder WTF is going on - looks
> like nfs_async_unlink_release() getting lost somehow), but not the memory
> corruption with the last commit... What .config are you using?
OK, I do understand the deadlock. nfs_unblock_sillyrename() doesn't issue
a wakeup. If you have lookup/lookup colliding on nfs_block_sillyrename()
*and* no pending delayed unlinks in that directory, you are stuck. Grr...
I'll see if the obvious fix helps and push it into the queue if it does.
It doesn't explain the memory corruptor, though. Could you post your .config?
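
The missed wakeup described above can be sketched in userspace. This is only an illustration with pthreads, not the kernel code: struct dir_state, block_sillyrename(), unblock_sillyrename() and run_two_lookups() are hypothetical names, and the sketch assumes ->silly_count behaves as a simple free/taken flag (1 = free, 0 = taken). The point is the broadcast in the unblock path; without it, a second lookup sleeping in the block path is woken only if some other event (in the kernel case, a delayed unlink completing) happens to wake the queue.

```c
#include <assert.h>
#include <pthread.h>

/* Userspace stand-in for per-directory state; NOT the kernel's struct nfs_inode. */
struct dir_state {
	pthread_mutex_t lock;
	pthread_cond_t waitqueue;
	int silly_count;	/* assumed meaning: 1 = free, 0 = taken */
};

struct lookup_args {
	struct dir_state *dir;
	int rounds;
};

static void dir_state_init(struct dir_state *d)
{
	pthread_mutex_init(&d->lock, NULL);
	pthread_cond_init(&d->waitqueue, NULL);
	d->silly_count = 1;
}

/* Analogue of the block side: sleep until the count is 1, then take it. */
static void block_sillyrename(struct dir_state *d)
{
	pthread_mutex_lock(&d->lock);
	while (d->silly_count != 1)
		pthread_cond_wait(&d->waitqueue, &d->lock);
	assert(d->silly_count == 1);
	d->silly_count = 0;
	pthread_mutex_unlock(&d->lock);
}

/* Analogue of the unblock side, with the wakeup added. */
static void unblock_sillyrename(struct dir_state *d)
{
	pthread_mutex_lock(&d->lock);
	d->silly_count = 1;
	/* Dropping this line leaves a colliding waiter asleep with nobody to wake it. */
	pthread_cond_broadcast(&d->waitqueue);
	pthread_mutex_unlock(&d->lock);
}

/* One simulated lookup: repeatedly take and release the count. */
static void *lookup_thread(void *arg)
{
	struct lookup_args *a = arg;

	for (int i = 0; i < a->rounds; i++) {
		block_sillyrename(a->dir);
		unblock_sillyrename(a->dir);
	}
	return NULL;
}

/*
 * Run two colliding lookups on one directory. Returns the final
 * silly_count, or -1 if a thread could not be created.
 */
int run_two_lookups(int rounds)
{
	struct dir_state dir;
	struct lookup_args args = { &dir, rounds };
	pthread_t t[2];
	int created = 0;
	int result;

	dir_state_init(&dir);
	while (created < 2 &&
	       pthread_create(&t[created], NULL, lookup_thread, &args) == 0)
		created++;
	for (int i = 0; i < created; i++)
		pthread_join(t[i], NULL);

	result = (created == 2) ? dir.silly_count : -1;
	pthread_cond_destroy(&dir.waitqueue);
	pthread_mutex_destroy(&dir.lock);
	return result;
}
```

Built with -pthread and called as run_two_lookups(1000) from a main(), both threads should finish and leave the count free again. A broadcast rather than a single signal is used here so that every sleeper rechecks the count; whether the real fix needs that is not something this sketch settles.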
^ permalink raw reply [flat|nested] 25+ messages in thread