From: Jan Glauber
To: Alexander Viro
CC: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Will Deacon
Subject: dcache_readdir NULL inode oops
Date: Fri, 9 Nov 2018 14:37:51 +0000
Message-ID: <20181109143744.GA12128@hc>

Hi Al,

I'm seeing the following oops reproducibly with an upstream kernel on arm64
(ThunderX2):

[ 5428.795719] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[ 5428.813838] Mem abort info:
[ 5428.820721]   ESR = 0x96000006
[ 5428.828476]   Exception class = DABT (current EL), IL = 32 bits
[ 5428.841590]   SET = 0, FnV = 0
[ 5428.848939]   EA = 0, S1PTW = 0
[ 5428.855941] Data abort info:
[ 5428.862422]   ISV = 0, ISS = 0x00000006
[ 5428.870787]   CM = 0, WnR = 0
[ 5428.877359] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000052f9e034
[ 5428.891098] [0000000000000040] pgd=0000007ebb0d6003, pud=0000007ed3073003, pmd=0000000000000000
[ 5428.909251] Internal error: Oops: 96000006 [#1] SMP
[ 5428.919122] Modules linked in: xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter ipmi_ssif ip_tables x_tables ipv6 crc32_ce bnx2x crct10dif_ce igb nvme nvme_core i2c_algo_bit mdio gpio_xlp i2c_xlp9xx
[ 5428.972724] CPU: 45 PID: 220018 Comm: stress-ng-dev Not tainted 4.19.0-jang+ #45
[ 5428.987664] Hardware name: To be filled by O.E.M. Saber/To be filled by O.E.M., BIOS 0ACKL018 03/30/2018
[ 5429.006819] pstate: 60400009 (nZCv daif +PAN -UAO)
[ 5429.016567] pc : dcache_readdir+0xfc/0x1a8
[ 5429.024903] lr : dcache_readdir+0x134/0x1a8
[ 5429.033376] sp : ffff00002d553d70
[ 5429.040101] x29: ffff00002d553d70 x28: ffff807db4988000
[ 5429.050892] x27: 0000000000000000 x26: 0000000000000000
[ 5429.061679] x25: 0000000056000000 x24: ffff8024577106c0
[ 5429.072457] x23: 0000000000000000 x22: ffff80267b92a480
[ 5429.083248] x21: ffff80267b92a520 x20: ffff8024575e5e00
[ 5429.094029] x19: ffff00002d553e40 x18: 0000000000000000
[ 5429.104805] x17: 0000000000000000 x16: 0000000000000000
[ 5429.115553] x15: 0000000000000000 x14: 0000000000000000
[ 5429.126332] x13: 0000000000000000 x12: 0000000000000000
[ 5429.137096] x11: 0000000000000000 x10: ffff80266b398228
[ 5429.147849] x9 : ffff80266b398000 x8 : 0000000000007e4e
[ 5429.158580] x7 : 0000000000000000 x6 : ffff00000830d190
[ 5429.169362] x5 : 0000000000000000 x4 : ffff00000d7506a8
[ 5429.180123] x3 : 0000000000000002 x2 : 0000000000000002
[ 5429.190890] x1 : ffff8024575e5e38 x0 : ffff00002d553e40
[ 5429.201715] Process stress-ng-dev (pid: 220018, stack limit = 0x000000009437ac28)
[ 5429.216828] Call trace:
[ 5429.221855]  dcache_readdir+0xfc/0x1a8
[ 5429.229459]  iterate_dir+0x8c/0x1a0
[ 5429.236561]  ksys_getdents64+0xa4/0x188
[ 5429.244357]  __arm64_sys_getdents64+0x28/0x38
[ 5429.253201]  el0_svc_handler+0x7c/0x100
[ 5429.260989]  el0_svc+0x8/0xc
[ 5429.266878] Code: a9429681 aa1303e0 b9402682 a9400e66 (f94020a4)
[ 5429.279192] ---[ end trace 5c1e28c07cf016c5 ]---

It happens after 1-3 hours of running 'stress-ng --dev 128'. This testcase
does a scandir of /dev and then calls random stuff like ioctl, lseek,
open/close etc. on the entries. I assume no files are deleted under /dev
during the testcase.

The NULL pointer is the inode pointer of next. The next dentry->d_flags is
DCACHE_RCUACCESS when this happens.

Any hints on how to further debug this?

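For anyone following the trace: the faulting instruction is the word in parentheses on the Code: line, f94020a4. Below is a minimal sketch of decoding it, assuming it is an AArch64 "LDR Xt, [Xn, #imm]" (64-bit, unsigned immediate offset). The struct and function names are made up for illustration and are not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an AArch64 "LDR Xt, [Xn, #imm]" (64-bit, unsigned offset). */
struct ldr_x_uimm {
    unsigned rt;      /* destination register number */
    unsigned rn;      /* base register number */
    unsigned offset;  /* byte offset: imm12 scaled by 8 */
};

/*
 * Hypothetical helper: returns true and fills *out if insn matches the
 * 64-bit LDR (immediate, unsigned offset) encoding, false otherwise.
 */
static bool decode_ldr_x_uimm(uint32_t insn, struct ldr_x_uimm *out)
{
    /* Bits 31:22 must be 1111100101 for this encoding. */
    if ((insn & 0xffc00000u) != 0xf9400000u)
        return false;

    out->rt = insn & 0x1f;                     /* bits 4:0   */
    out->rn = (insn >> 5) & 0x1f;              /* bits 9:5   */
    out->offset = ((insn >> 10) & 0xfff) * 8;  /* bits 21:10 */
    return true;
}
```

Fed 0xf94020a4, this gives rt = 4, rn = 5, offset = 64, i.e. "ldr x4, [x5, #64]". x5 is 0 in the register dump, so the load address is 0x40, which matches the faulting virtual address. Which struct inode member sits at offset 0x40 depends on the kernel configuration, so treat any mapping to a specific field as an assumption to be checked against the actual build.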
--Jan
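
The access pattern described above (scandir of a directory, then open/lseek/close on each entry) can be sketched roughly as follows. This is not the stress-ng code: poke_dir is a hypothetical helper, it runs a single pass in a single process (the report uses 128 workers), and it leaves out the ioctl calls because issuing arbitrary ioctls is not something to do casually.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* scandir(), openat(), O_DIRECTORY */
#endif
#include <dirent.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: list `path` with scandir(), then open, lseek and
 * close each entry. Returns the number of entries visited (not counting
 * "." and ".."), or -1 if the directory cannot be listed.
 */
static int poke_dir(const char *path)
{
    struct dirent **names;
    int visited = 0;
    int n = scandir(path, &names, NULL, alphasort);

    if (n < 0)
        return -1;

    /* Open entries relative to the directory fd; failures are ignored. */
    int dfd = open(path, O_RDONLY | O_DIRECTORY);

    for (int i = 0; i < n; i++) {
        const char *name = names[i]->d_name;

        if (strcmp(name, ".") != 0 && strcmp(name, "..") != 0) {
            visited++;
            if (dfd >= 0) {
                /* O_NONBLOCK so FIFOs and ttys do not block the open. */
                int fd = openat(dfd, name, O_RDONLY | O_NONBLOCK);
                if (fd >= 0) {
                    (void)lseek(fd, 0, SEEK_SET);
                    close(fd);
                }
            }
        }
        free(names[i]);
    }
    free(names);
    if (dfd >= 0)
        close(dfd);
    return visited;
}
```

Calling poke_dir("/dev") in a loop from many processes would approximate the load, but opening device nodes can have side effects, so that is only sensible on a disposable test machine.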