linux-fsdevel.vger.kernel.org archive mirror
* [BUG] kernel BUG at fs/dcache.c:899
@ 2018-07-13 15:33 Peter Geis
  2018-07-13 16:06 ` Al Viro
From: Peter Geis @ 2018-07-13 15:33 UTC
  To: viro; +Cc: linux-fsdevel

Good Morning,

I have been trying to track down a bug that has been causing my Tegra3 
device to reboot while compiling.
I finally managed to catch the offender; the details are below.
The offending code is a BUG_ON that triggers in dget_parent():
	rcu_read_unlock();
	BUG_ON(!ret->d_lockref.count);
	ret->d_lockref.count++;

Thanks,
Peter Geis

[ 6399.677492] ------------[ cut here ]------------
[ 6399.682136] kernel BUG at fs/dcache.c:899!
[ 6399.686228] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[ 6399.692072] Modules linked in: cpufreq_userspace cpufreq_powersave cpufreq_conservative snd_soc_tegr4
[ 6399.732443] CPU: 1 PID: 18614 Comm: fixdep Not tainted 4.18.0-rc4-next-20180710-00059-g52ccdc95b9c6 2
[ 6399.741927] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[ 6399.748220] PC is at dget_parent+0xac/0xb0
[ 6399.752315] LR is at dget_parent+0x78/0xb0
[ 6399.756416] pc : [<c033a5d4>]    lr : [<c033a5a0>]    psr: 600d0013
[ 6399.762675] sp : d52abcf0  ip : d52abcf0  fp : d52abd0c
[ 6399.767908] r10: c1004dc8  r9 : c12ed6c0  r8 : c12ed6c0
[ 6399.773144] r7 : ccf55518  r6 : ccf554c8  r5 : ccf55550  r4 : ccf554c8
[ 6399.779662] r3 : 00000000  r2 : 0000000b  r1 : 00000000  r0 : 00000000
[ 6399.786193] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[ 6399.793330] Control: 10c5387d  Table: ab03c04a  DAC: 00000051
[ 6399.799092] Process fixdep (pid: 18614, stack limit = 0x51488b23)
[ 6399.805203] Stack: (0xd52abcf0 to 0xd52ac000)
[ 6399.809558] bce0:                                     c12ed6c0 ed626e30 ed626e30 ecc6c000
[ 6399.817746] bd00: d52abd34 d52abd10 c037e92c c033a534 d52abd48 d52abd60 00000002 c1004dc8
[ 6399.825934] bd20: ecc6d800 ed626e30 d52abdc4 d52abd38 c03cc0fc c037e8f8 e3b07ce7 00000009
[ 6399.834123] bd40: d52abd9c d52abd50 c032b938 c032c4c0 ed624ee8 00000081 00000000 2f2f2f2f
[ 6399.842290] bd60: 00000000 e3b07ce7 d52abd9c d52abd78 c032c4c0 c032c2e4 d52abe90 c032b630
[ 6399.850482] bd80: d52abe98 a833ed19 d52abdc8 c001402d c12ed6c8 aaf19945 c12ed6c8 c12ed6c0
[ 6399.858668] bda0: ed626e30 c12ed6c8 c03cc0a0 00020000 c12ed6c0 c1004dc8 d52abdec d52abdc8
[ 6399.866857] bdc0: c031ab54 c03cc0ac d52abe90 00000000 00000000 00000000 00020000 c12ed6c0
[ 6399.875053] bde0: d52abdfc d52abdf0 c031bec0 c031a990 d52abe8c d52abe00 c032f3dc c031be90
[ 6399.883251] be00: efdc7e90 c12ed780 d52abe94 d52abe18 c02a2d8c c033f1bc c0d18774 aaf19945
[ 6399.891420] be20: 00000165 006000c0 ffffe000 00000041 c1004dc8 00000000 00000004 ccf554c8
[ 6399.899606] be40: 00000002 ed626e30 ee7a8b10 ccf55550 00000165 00000000 00000165 aaf19945
[ 6399.907795] be60: c0324ac8 c1004dc8 c1004dc8 d52abe90 d52abf50 00000001 d52aa000 00000142
[ 6399.915984] be80: d52abf44 d52abe90 c033169c c032f0b0 ee7a8b10 ccf55550 a833ed19 0000000f
[ 6399.924174] bea0: c001402d d52abeb0 00000000 c05590a4 ed626e30 00000101 00000002 00005084
[ 6399.932342] bec0: 00000000 00000000 00000000 d52abed0 ffffff9c 00000003 e751d800 c0a0da8c
[ 6399.940532] bee0: d52abefc d52abef0 c0a0da8c c0161d1c d52abf34 d52abf00 c0340ea8 c0a0da64
[ 6399.948726] bf00: c0014000 00000000 00020000 00020000 ffffff9c ffffff9c c0014000 aaf19945
[ 6399.956920] bf20: d52aa000 00000003 c1004dc8 ffffff9c c0014000 fffff000 d52abf94 d52abf48
[ 6399.965105] bf40: c031c20c c0331624 d52abf64 d52abf58 00020000 c0160000 00000004 00000100
[ 6399.973290] bf60: 00000001 aaf19945 c0101204 b6fc2000 b6fee968 b6fee968 00000142 c0101204
[ 6399.981458] bf80: d52aa000 00000142 d52abfa4 d52abf98 c031c2d8 c031c088 00000000 d52abfa8
[ 6399.989644] bfa0: c0101000 c031c2c8 b6fc2000 b6fee968 ffffff9c 00a76b09 00020000 00000000
[ 6399.997844] bfc0: b6fc2000 b6fee968 b6fee968 00000142 00a76b34 0000002c 00000020 00a78f50
[ 6400.006037] bfe0: 00000142 becd9210 b6f5a709 b6ee5206 200d0030 ffffff9c 00000000 00000000
[ 6400.014257] [<c033a5d4>] (dget_parent) from [<c037e92c>] (fscrypt_file_open+0x40/0xe0)
[ 6400.022183] [<c037e92c>] (fscrypt_file_open) from [<c03cc0fc>] (ext4_file_open+0x5c/0x1dc)
[ 6400.030520] [<c03cc0fc>] (ext4_file_open) from [<c031ab54>] (do_dentry_open.constprop.4+0x1d0/0x3d0)
[ 6400.039715] [<c031ab54>] (do_dentry_open.constprop.4) from [<c031bec0>] (vfs_open+0x3c/0x40)
[ 6400.048200] [<c031bec0>] (vfs_open) from [<c032f3dc>] (path_openat+0x338/0x139c)
[ 6400.055624] [<c032f3dc>] (path_openat) from [<c033169c>] (do_filp_open+0x84/0xf0)
[ 6400.063138] [<c033169c>] (do_filp_open) from [<c031c20c>] (do_sys_open+0x190/0x214)
[ 6400.070825] [<c031c20c>] (do_sys_open) from [<c031c2d8>] (sys_openat+0x1c/0x20)
[ 6400.078202] [<c031c2d8>] (sys_openat) from [<c0101000>] (ret_fast_syscall+0x0/0x54)
[ 6400.085890] Exception stack(0xd52abfa8 to 0xd52abff0)
[ 6400.090970] bfa0:                   b6fc2000 b6fee968 ffffff9c 00a76b09 00020000 00000000
[ 6400.099150] bfc0: b6fc2000 b6fee968 b6fee968 00000142 00a76b34 0000002c 00000020 00a78f50
[ 6400.107313] bfe0: 00000142 becd9210 b6f5a709 b6ee5206
[ 6400.112385] Code: e1a00007 eb1b4d22 ebf9dba1 eaffffe9 (e7f001f2)
[ 6400.118529] ---[ end trace 8a2c7cc18f454ac3 ]---
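
The psr value in the register dump can be cross-checked by hand against the "Flags:" line of the oops. The following is a minimal user-space sketch, not kernel code: the names `struct psr_fields`, `decode_psr` and `psr_flags_string` are hypothetical, and the bit positions used are the standard ARM CPSR layout (N/Z/C/V in bits 31..28, I in bit 7, F in bit 6, T in bit 5, mode in bits 4..0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the CPSR fields that the oops summarises. */
struct psr_fields {
	bool n, z, c, v;     /* condition flags, bits 31..28 */
	bool irq_masked;     /* I bit (bit 7): set means IRQs are off */
	bool fiq_masked;     /* F bit (bit 6): set means FIQs are off */
	bool thumb;          /* T bit (bit 5): set means Thumb state */
	unsigned int mode;   /* M[4:0]: 0x13 is SVC mode */
};

/* Split a raw 32-bit psr value into the fields above. */
static struct psr_fields decode_psr(uint32_t psr)
{
	struct psr_fields f;

	f.n = (psr >> 31) & 1;
	f.z = (psr >> 30) & 1;
	f.c = (psr >> 29) & 1;
	f.v = (psr >> 28) & 1;
	f.irq_masked = (psr >> 7) & 1;
	f.fiq_masked = (psr >> 6) & 1;
	f.thumb = (psr >> 5) & 1;
	f.mode = psr & 0x1f;
	return f;
}

/* Render the condition flags in the oops style: upper case when set. */
static void psr_flags_string(uint32_t psr, char out[5])
{
	struct psr_fields f = decode_psr(psr);

	assert(out != NULL);
	out[0] = f.n ? 'N' : 'n';
	out[1] = f.z ? 'Z' : 'z';
	out[2] = f.c ? 'C' : 'c';
	out[3] = f.v ? 'V' : 'v';
	out[4] = '\0';
}
```

For the value in this report, 0x600d0013, the sketch gives flags "nZCv", mode 0x13 (SVC), IRQs and FIQs unmasked, and the T bit clear, which agrees with the Flags line of the oops.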


* Re: [BUG] kernel BUG at fs/dcache.c:899
  2018-07-13 15:33 [BUG] kernel BUG at fs/dcache.c:899 Peter Geis
@ 2018-07-13 16:06 ` Al Viro
From: Al Viro @ 2018-07-13 16:06 UTC
  To: Peter Geis; +Cc: linux-fsdevel

On Fri, Jul 13, 2018 at 11:33:37AM -0400, Peter Geis wrote:
> Good Morning,
> 
> I have been trying to track down a bug that has been causing my Tegra3
> device to reboot while compiling.
> I finally managed to catch the offender; the details are below.
> The offending code is a BUG_ON that triggers in dget_parent():
> 	rcu_read_unlock();
> 	BUG_ON(!ret->d_lockref.count);
> 	ret->d_lockref.count++;

Interesting...  We call that while holding a reference to dentry (we'd better).
That code is
        rcu_read_lock();
        ret = dentry->d_parent;
ret won't get freed until after rcu_read_unlock, so spin_lock is safe here
        spin_lock(&ret->d_lock);
        if (unlikely(ret != dentry->d_parent)) {
                spin_unlock(&ret->d_lock);
                rcu_read_unlock();
                goto repeat;
        }
Since we got through that, we have observed dentry->d_parent == ret with
ret->d_lock held.
        rcu_read_unlock();
        BUG_ON(!ret->d_lockref.count);

Now, hitting that BUG_ON means that dentry->d_parent is *not* equal to ret anymore -
otherwise ret would remain pinned.  The only place that changes
->d_parent of a live dentry is __d_move() - no other assignments
exist.  __d_move() is done under rename_lock - it's globally serialized.
And it grabs ->d_lock on all parents involved before modifying ->d_parent
of anything, so the observed condition (ret == dentry->d_parent, ret->d_lock
held by us) can't change until we drop ret->d_lock...

Which kernel had that been?  It looks like either memory corruption (anywhere)
or a call made with dentry itself getting killed right under you.
Reference to ->d_parent is not dropped until after the last reference to
dentry goes away, so...

Could you slap
	if (WARN_ON(!ret->d_lockref.count))
		printk(KERN_ERR "child: %px[%ld], parent: %px:%px\n",
			dentry, (long)dentry->d_lockref.count, dentry->d_parent, ret);
right before that rcu_read_unlock() and see if you can trigger that?
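
For readers following the argument, the lock-then-recheck pattern and the requested diagnostic can be modelled in user space. This is only a sketch under stated assumptions, not the kernel implementation: `struct node`, `node_init` and `get_parent_model` are hypothetical names, a pthread mutex stands in for ->d_lock, RCU is not modelled at all (nodes are assumed to outlive the call), and the `__atomic_load_n` builtin is a GCC/Clang extension.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for struct dentry (not the kernel type). */
struct node {
	pthread_mutex_t lock;   /* models ->d_lock */
	long count;             /* models ->d_lockref.count */
	struct node *parent;    /* models ->d_parent */
};

static void node_init(struct node *n, struct node *parent, long count)
{
	pthread_mutex_init(&n->lock, NULL);
	n->count = count;
	n->parent = parent;
}

/*
 * Model of the quoted slow path: sample the parent, take its lock, then
 * re-check that it is still the parent before touching its refcount.
 * Returns NULL in the situation where the kernel code would hit BUG_ON().
 */
static struct node *get_parent_model(struct node *child)
{
	struct node *ret;

	assert(child != NULL && child->parent != NULL);

	for (;;) {
		ret = __atomic_load_n(&child->parent, __ATOMIC_ACQUIRE);
		pthread_mutex_lock(&ret->lock);
		if (ret == __atomic_load_n(&child->parent, __ATOMIC_ACQUIRE))
			break;                     /* stable while we hold ret->lock */
		pthread_mutex_unlock(&ret->lock);  /* parent changed: retry */
	}

	if (ret->count == 0) {
		/* Same fields as the suggested printk: child, its count, both parents. */
		fprintf(stderr, "child: %p[%ld], parent: %p:%p\n",
			(void *)child, child->count,
			(void *)child->parent, (void *)ret);
		pthread_mutex_unlock(&ret->lock);
		return NULL;
	}

	ret->count++;
	pthread_mutex_unlock(&ret->lock);
	return ret;
}
```

In this model a live child always holds one reference on its parent, so the zero-count branch is only reachable if that invariant has already been broken elsewhere, which is the point of the argument above.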
