Linux PARISC architecture development
From: Helge Deller <deller@gmx.de>
To: John David Anglin <dave.anglin@bell.net>,
	Hillf Danton <hdanton@sina.com>
Cc: linux-kernel@vger.kernel.org, linux-parisc@vger.kernel.org
Subject: Re: WARNING: CPU: 1 PID: 14735 at fs/dcache.c:365 dentry_free+0x100/0x128
Date: Tue, 19 Jul 2022 23:25:07 +0200
Message-ID: <76e47f90-bcc2-caab-50c5-6bff7fdc5c1d@gmx.de>
In-Reply-To: <7d53692b-6ac8-e1bd-4d0d-7e97aa01b18d@bell.net>

On 7/19/22 22:59, John David Anglin wrote:
> Hi Helge,
>
> I hit this warning with the patch below building ghc on mx3210:

As I wrote, I haven't hit it yet on my buildd server, but that could
just have been luck...
Hillf, should we check whether this second hunk triggers?

--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -616,6 +618,7 @@ static void __dentry_kill(struct dentry
 		dentry->d_flags |= DCACHE_MAY_FREE;
 		can_free = false;
 	}
+	BUG_ON(!hlist_unhashed(&dentry->d_u.d_alias));
 	spin_unlock(&dentry->d_lock);
 	if (likely(can_free))
 		dentry_free(dentry);
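
For readers following the hunk above: a minimal userspace sketch (not kernel
code) of why a check on d_u.d_alias can fire for a dentry that was never put
on an inode alias list. It relies on the point in Hillf's note quoted further
down, that d_alias is unioned with d_in_lookup_hash. The list node types are
simplified stand-ins, and toy_dentry, toy_add_in_lookup, toy_lookup_done and
toy_alias_check_fires are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel list node types (assumption:
 * both are a next pointer plus a back-pointer, as modelled here). */
struct hlist_node    { struct hlist_node *next, **pprev; };
struct hlist_bl_node { struct hlist_bl_node *next, **pprev; };

/* Toy model of the d_u union: both members share the same storage. */
struct toy_dentry {
	union {
		struct hlist_node    d_alias;          /* inode alias list */
		struct hlist_bl_node d_in_lookup_hash; /* in-lookup chain  */
	} d_u;
};

/* Same test the added BUG_ON relies on: a node is unhashed when pprev is NULL. */
static bool hlist_unhashed(const struct hlist_node *h)
{
	return h->pprev == NULL;
}

/* Hypothetical helper: link the dentry at the head of an in-lookup chain. */
static void toy_add_in_lookup(struct toy_dentry *d, struct hlist_bl_node **head)
{
	struct hlist_bl_node *n = &d->d_u.d_in_lookup_hash;

	n->next = *head;
	if (*head)
		(*head)->pprev = &n->next;
	n->pprev = head;
	*head = n;
}

/* Hypothetical helper: unlink from the in-lookup chain and clear the
 * shared storage, modelling the "lookup done" step. */
static void toy_lookup_done(struct toy_dentry *d)
{
	struct hlist_bl_node *n = &d->d_u.d_in_lookup_hash;

	assert(n->pprev != NULL); /* must currently be on a chain */
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->next = NULL;
	n->pprev = NULL;
}

/* True when a check of the form !hlist_unhashed(&d_u.d_alias) would trip.
 * Reads d_alias through the union after d_in_lookup_hash was written. */
static bool toy_alias_check_fires(const struct toy_dentry *d)
{
	return !hlist_unhashed(&d->d_u.d_alias);
}
```

In this model, a zero-initialised toy_dentry passes the check, fails it after
toy_add_in_lookup(), and passes it again after toy_lookup_done(). That is the
toy analogue of a dentry reaching the free path while still marked in-lookup.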

Helge


> mx3210 login: ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 32654 at fs/dcache.c:365 dentry_free+0xfc/0x108
> Modules linked in: binfmt_misc ext2 ext4 crc16 mbcache jbd2 ipmi_watchdog sg ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler fuse nfsd ip_tables x_tables ipv6 autofs4 xfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod t10_pi ses enclosure scsi_transport_sas crc64_rocksoft crc64 uas usb_storage sr_mod cdrom ohci_pci sym53c8xx pata_cmd64x ehci_pci ohci_hcd libata scsi_transport_spi ehci_hcd tg3 scsi_mod usbcore scsi_common usb_common
> CPU: 2 PID: 32654 Comm: cc1 Not tainted 5.18.12+ #2
> Hardware name: 9000/800/rp3440
>
>      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
> PSW: 00001000000001000110100000001111 Not tainted
> r00-03  000000000804680f 00000040ce7fc880 00000000404f2b74 00000040ce7fc920
> r04-07  0000000040be4940 000000410f6cd630 00000001413e4068 000000410f6cd688
> r08-11  0000000040fd2e60 0000000040bc5020 0000000040c2c940 00000000000800e0
> r12-15  0000000040c2c940 0000000000000001 0000000040c2c940 000000410f6cd688
> r16-19  00000001f9fe105d 00000040ce7fc1f8 000000000000002f 000000000a0c1000
> r20-23  000000000800000f 000000000800000f 000000410f6cd639 000000000800000f
> r24-27  0000000000000000 0000000000000385 000000410f6cd630 0000000040be4940
> r28-31  0000000041104530 00000040ce7fc8f0 00000040ce7fc9a0 0000000000000000
> sr00-03  0000000000a03800 0000000000000000 0000000000000000 0000000000a03800
> sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000
>
> IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000404f18bc 00000000404f18c0
>  IIR: 03ffe01f    ISR: 0000000010350000  IOR: 00000239ff3fc928
>  CPU:        2   CR30: 00000040cadd1380 CR31: ffffffffffffffff
>  ORIG_R28: 00000040ce7fcb70
>  IAOQ[0]: dentry_free+0xfc/0x108
>  IAOQ[1]: dentry_free+0x100/0x108
>  RP(r2): __dentry_kill+0x2bc/0x338
> Backtrace:
>  [<00000000404f2b74>] __dentry_kill+0x2bc/0x338
>  [<00000000404f37b8>] dentry_kill+0xb0/0x318
>  [<00000000404f3d08>] dput+0x2e8/0x328
>  [<00000000404dd7dc>] step_into+0x344/0x390
>  [<00000000404dda4c>] walk_component+0xa4/0x310
>  [<00000000404df234>] link_path_walk.part.0+0x2ec/0x4b0
>  [<00000000404e0000>] path_openat+0xe8/0x348
>  [<00000000404e2c58>] do_filp_open+0x98/0x178
>  [<00000000404babe8>] do_sys_openat2+0x148/0x288
>  [<00000000404bb41c>] compat_sys_openat+0x54/0x98
>  [<0000000040203e30>] syscall_exit+0x0/0x10
>
> ---[ end trace 0000000000000000 ]---
> watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [cc1:32657]
>
> Regards,
> Dave
>
> On 2022-07-19 12:32 p.m., Helge Deller wrote:
>> Hello Hillf,
>>
>> On 7/17/22 13:36, Hillf Danton wrote:
>>> On Sun, 17 Jul 2022 11:42:48 +0200
>>>> I used WARN_ON() instead of BUG_ON().
>>>> With that, both triggered, first the first one, then the second one.
>>>> Full log is here:
>>>> http://dellerweb.de/testcases/minicom.dcache.crash.6-warn
>>> Given that the first BUG_ON triggered, and the dentry at that moment is
>>> supposed not to be an alias, see if it is still in lookup with d_lock held.
>>> That is the step before de-unioning d_alias with d_in_lookup_hash.
>>>
>>> On the other hand, if only the second one triggered, we should track
>>> DCACHE_DENTRY_KILLED instead, on the assumption that the killed dentry was
>>> used again after releasing the d_lock surrounding the first one.
>> The machine has now been up for 2 days without any issues, while it had pretty
>> much the same load as when it was crashing earlier.
>> So, in summary I'd assume that your patch below fixes the issue.
>>
>> I'm now rebooting the machine with a new kernel, where I just changed
>>     if (unlikely(d_in_lookup(dentry)))
>> to
>>     if (WARN_ON_ONCE(d_in_lookup(dentry)))
>> in order to see if this really triggered.
>>
>> Anyway, I think your patch is good so far.
>> Would that be the final patch, or should I test some others?
>>
>> Thanks!
>> Helge
>>
>>> --- a/fs/dcache.c
>>> +++ b/fs/dcache.c
>>> @@ -605,8 +605,12 @@ static void __dentry_kill(struct dentry
>>>           spin_unlock(&parent->d_lock);
>>>       if (dentry->d_inode)
>>>           dentry_unlink_inode(dentry);
>>> -    else
>>> +    else {
>>> +        if (unlikely(d_in_lookup(dentry))) {
>>> +            __d_lookup_done(dentry);
>>> +        }
>>>           spin_unlock(&dentry->d_lock);
>>> +    }
>>>       this_cpu_dec(nr_dentry);
>>>       if (dentry->d_op && dentry->d_op->d_release)
>>>           dentry->d_op->d_release(dentry);
>
>



Thread overview: 19+ messages
     [not found] <20220709090756.2384-1-hdanton@sina.com>
2022-07-15  8:18 ` WARNING: CPU: 1 PID: 14735 at fs/dcache.c:365 dentry_free+0x100/0x128 Helge Deller
     [not found] ` <20220715133300.1297-1-hdanton@sina.com>
2022-07-16  5:27   ` Helge Deller
2022-07-17  9:42     ` Helge Deller
     [not found]     ` <20220717113634.1552-1-hdanton@sina.com>
2022-07-19 16:32       ` Helge Deller
2022-07-19 20:59         ` John David Anglin
2022-07-19 21:25           ` Helge Deller [this message]
2022-07-20  2:00             ` Al Viro
2022-07-20  2:22         ` Al Viro
2022-07-20  2:31     ` Al Viro
2022-07-20  2:33       ` Al Viro
2022-07-20  3:29     ` Al Viro
2022-07-20  6:53       ` Helge Deller
2022-07-20  7:07         ` Al Viro
2022-07-20  9:21           ` Helge Deller
     [not found]           ` <20220720110032.1787-1-hdanton@sina.com>
2022-07-20 17:06             ` Al Viro
2022-07-20 23:15               ` Sam James
2022-07-21  3:54                 ` Helge Deller
2022-07-30 20:21                   ` Helge Deller
2022-07-09  5:33 Helge Deller
