* commit c7314d74fcb0
@ 2014-01-19 20:50 Arnaud Ebalard
2014-01-19 21:49 ` Al Viro
0 siblings, 1 reply; 3+ messages in thread
From: Arnaud Ebalard @ 2014-01-19 20:50 UTC (permalink / raw)
To: Alexander Viro; +Cc: linux-fsdevel
Hi Al,
While doing md5sum on files available on a NFSv3 share, the following
came up on the NAS (an ARM-based ReadyNAS 102) acting as NFS server and
running 3.13.0-rc8:
[ 3832.280073] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 3832.288188] pgd = c0004000
[ 3832.290909] [00000000] *pgd=00000000
[ 3832.294505] Internal error: Oops: 17 [#1] ARM
[ 3832.298868] Modules linked in:
[ 3832.301942] CPU: 0 PID: 359 Comm: kworker/0:1 Not tainted 3.13.0-rc8.rn102-00076-g7d0d46da750a-dirty #62
[ 3832.311452] Workqueue: events delayed_fput
[ 3832.315565] task: df88d340 ti: df940000 task.ti: df940000
[ 3832.320973] PC is at __fput+0x24/0x1f0
[ 3832.324729] LR is at __fput+0x1c/0x1f0
[ 3832.328485] pc : [<c008b8c8>] lr : [<c008b8c0>] psr: 60000013
[ 3832.328485] sp : df941ed0 ip : c07726c0 fp : 00000009
[ 3832.339981] r10: 00000000 r9 : 00000000 r8 : 00000000
[ 3832.345214] r7 : 00000000 r6 : c152b900 r5 : 00000000 r4 : 00000000
[ 3832.351751] r3 : 0000021d r2 : 00000000 r1 : c152be40 r0 : 00000000
[ 3832.358290] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
[ 3832.365611] Control: 10c5387d Table: 1eaa8019 DAC: 00000015
[ 3832.371366] Process kworker/0:1 (pid: 359, stack limit = 0xdf940238)
[ 3832.377729] Stack: (0xdf941ed0 to 0xdf942000)
[ 3832.382095] 1ec0: 00000000 00000000 00000000 c152be40
[ 3832.390289] 1ee0: c077569c c07726c0 df940000 c0fb0300 00000000 00000000 00000009 c008bac0
[ 3832.398482] 1f00: df90ff80 c002d6cc c0772ff0 df940000 df941f3c c0553fa8 df88d340 df90ff80
[ 3832.406676] 1f20: c07726c0 c07726d0 df940000 df90ff98 c07a5b55 00000001 00000009 c002dc98
[ 3832.414869] 1f40: df941f6c c0553fa8 df88d340 00000000 df913480 df90ff80 c002db84 00000000
[ 3832.423063] 1f60: 00000000 00000000 00000000 c0032848 d3effdfa 00000000 eff1757f df90ff80
[ 3832.431256] 1f80: 00000000 df941f84 df941f84 00000000 df941f90 df941f90 df941fac df913480
[ 3832.439449] 1fa0: c003278c 00000000 00000000 c000e218 00000000 00000000 00000000 00000000
[ 3832.447642] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 3832.455836] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 f727f277 47f34ef7
[ 3832.464036] [<c008b8c8>] (__fput+0x24/0x1f0) from [<c008bac0>] (delayed_fput+0x2c/0x3c)
[ 3832.472069] [<c008bac0>] (delayed_fput+0x2c/0x3c) from [<c002d6cc>] (process_one_work+0xf4/0x32c)
[ 3832.480963] [<c002d6cc>] (process_one_work+0xf4/0x32c) from [<c002dc98>] (worker_thread+0x114/0x34c)
[ 3832.490120] [<c002dc98>] (worker_thread+0x114/0x34c) from [<c0032848>] (kthread+0xbc/0xd8)
[ 3832.498407] [<c0032848>] (kthread+0xbc/0xd8) from [<c000e218>] (ret_from_fork+0x14/0x3c)
[ 3832.506514] Code: e5904010 eb132240 e5965010 e5963020 (e1d520b0)
[ 3832.512633] ---[ end trace a00affa6c4ade951 ]---
[ 3832.517258] Kernel panic - not syncing: Fatal exception
I took a quick look at what is at __fput+0x24 and I think the issue
happens in the inlined fsnotify_close() at the beginning of __fput():
static void __fput(struct file *file)
{
struct dentry *dentry = file->f_path.dentry;
struct vfsmount *mnt = file->f_path.mnt;
struct inode *inode = file->f_inode;
might_sleep();
fsnotify_close(file);
/*
* The function eventpoll_release() should be the first called
I came to the conclusion that it gets a file which is NULL and dies when
trying to get 'mode' below:
static inline void fsnotify_close(struct file *file)
{
struct path *path = &file->f_path;
struct inode *inode = file_inode(file);
fmode_t mode = file->f_mode;
__u32 mask = (mode & FMODE_WRITE) ? FS_CLOSE_WRITE : FS_CLOSE_NOWRITE;
Anyway, I may be wrong on the analysis and can provide the image if
you want to take a look. I then took a look at which commit may have
caused the bug and considered 4f5e65a1cc90 (fput: turn "list_head
delayed_fput_list" into llist_head) and then c7314d74fcb0 ("nfsd
regression since delayed fput()") but they are quite old and I guess
someone else would have hit it already if one was the cause. Bottom
line: I hope the backtrace may give you an idea.
As a side note, regarding the performance I get with the NAS (GbE port,
1.2GHz ARMv7 single core SoC, 512MB of RAM, NFSv3 over TCP w/ 65k/65k,
fs is ext4 over lvm over software RAID1 on 2 WD30EFRX disks),
reading a single large file stored on the NAS is done on average at a
speed of 50MB/s. The same file is served by nginx at 104MB/s and by
apache at 65MB/s.
Cheers,
a+
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: commit c7314d74fcb0 2014-01-19 20:50 commit c7314d74fcb0 Arnaud Ebalard @ 2014-01-19 21:49 ` Al Viro 2014-01-20 17:56 ` Arnaud Ebalard 0 siblings, 1 reply; 3+ messages in thread From: Al Viro @ 2014-01-19 21:49 UTC (permalink / raw) To: Arnaud Ebalard; +Cc: linux-fsdevel On Sun, Jan 19, 2014 at 09:50:51PM +0100, Arnaud Ebalard wrote: > Hi Al, > > While doing md5sum on files available on a NFSv3 share, the following > came up on the NAS (an ARM-based ReadyNAS 102) acting as NFS server and > running 3.13.0-rc8: > > [ 3832.280073] Unable to handle kernel NULL pointer dereference at virtual address 00000000 > [ 3832.288188] pgd = c0004000 > [ 3832.290909] [00000000] *pgd=00000000 > [ 3832.294505] Internal error: Oops: 17 [#1] ARM > [ 3832.298868] Modules linked in: > [ 3832.301942] CPU: 0 PID: 359 Comm: kworker/0:1 Not tainted 3.13.0-rc8.rn102-00076-g7d0d46da750a-dirty #62 > [ 3832.311452] Workqueue: events delayed_fput > [ 3832.315565] task: df88d340 ti: df940000 task.ti: df940000 > [ 3832.320973] PC is at __fput+0x24/0x1f0 > [ 3832.324729] LR is at __fput+0x1c/0x1f0 > [ 3832.328485] pc : [<c008b8c8>] lr : [<c008b8c0>] psr: 60000013 > [ 3832.328485] sp : df941ed0 ip : c07726c0 fp : 00000009 > [ 3832.339981] r10: 00000000 r9 : 00000000 r8 : 00000000 > [ 3832.345214] r7 : 00000000 r6 : c152b900 r5 : 00000000 r4 : 00000000 > [ 3832.351751] r3 : 0000021d r2 : 00000000 r1 : c152be40 r0 : 00000000 > [ 3832.358290] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel > [ 3832.365611] Control: 10c5387d Table: 1eaa8019 DAC: 00000015 > [ 3832.371366] Process kworker/0:1 (pid: 359, stack limit = 0xdf940238) > [ 3832.377729] Stack: (0xdf941ed0 to 0xdf942000) > [ 3832.382095] 1ec0: 00000000 00000000 00000000 c152be40 > [ 3832.390289] 1ee0: c077569c c07726c0 df940000 c0fb0300 00000000 00000000 00000009 c008bac0 > [ 3832.398482] 1f00: df90ff80 c002d6cc c0772ff0 df940000 df941f3c c0553fa8 df88d340 df90ff80 > [ 3832.406676] 1f20: c07726c0 c07726d0 df940000 df90ff98 c07a5b55 00000001 00000009 c002dc98 > [ 3832.414869] 1f40: df941f6c c0553fa8 df88d340 00000000 df913480 df90ff80 c002db84 00000000 > [ 3832.423063] 1f60: 00000000 00000000 00000000 c0032848 d3effdfa 00000000 eff1757f df90ff80 > [ 3832.431256] 1f80: 00000000 df941f84 df941f84 00000000 df941f90 df941f90 df941fac df913480 > [ 3832.439449] 1fa0: c003278c 00000000 00000000 c000e218 00000000 00000000 00000000 00000000 > [ 3832.447642] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > [ 3832.455836] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 f727f277 47f34ef7 > [ 3832.464036] [<c008b8c8>] (__fput+0x24/0x1f0) from [<c008bac0>] (delayed_fput+0x2c/0x3c) > [ 3832.472069] [<c008bac0>] (delayed_fput+0x2c/0x3c) from [<c002d6cc>] (process_one_work+0xf4/0x32c) > [ 3832.480963] [<c002d6cc>] (process_one_work+0xf4/0x32c) from [<c002dc98>] (worker_thread+0x114/0x34c) > [ 3832.490120] [<c002dc98>] (worker_thread+0x114/0x34c) from [<c0032848>] (kthread+0xbc/0xd8) > [ 3832.498407] [<c0032848>] (kthread+0xbc/0xd8) from [<c000e218>] (ret_from_fork+0x14/0x3c) > [ 3832.506514] Code: e5904010 eb132240 e5965010 e5963020 (e1d520b0) > [ 3832.512633] ---[ end trace a00affa6c4ade951 ]--- > [ 3832.517258] Kernel panic - not syncing: Fatal exception > > I took a quick look at what is at __fput+0x24 and I think the issue > happens in the inlined fsnotify_close() at the beginning of __fput(): > > static void __fput(struct file *file) > { > struct dentry *dentry = file->f_path.dentry; > struct vfsmount *mnt = file->f_path.mnt; > struct inode *inode = file->f_inode; > > might_sleep(); > > fsnotify_close(file); > /* > * The function eventpoll_release() should be the first called > > I came to the conclusion that it gets a file which is NULL and dies when > trying to get 'mode' below: > > static inline void fsnotify_close(struct file *file) > { > struct path *path = &file->f_path; > struct inode *inode = file_inode(file); > fmode_t mode = file->f_mode; > __u32 mask = (mode & FMODE_WRITE) ? FS_CLOSE_WRITE : FS_CLOSE_NOWRITE; > > > Anyway, I may be wrong on the analysis and can provide the image if > you want to take a look. I then took a look at which commit may have > caused the bug and considered 4f5e65a1cc90 (fput: turn "list_head > delayed_fput_list" into llist_head) and then c7314d74fcb0 ("nfsd > regression since delayed fput()") but they are quite old and I guess > someone else would have hit it already if one was the cause. Bottom > line: I hope the backtrace may give you an idea. The whole image would be an overkill, but System.map and disassembly of __fput would be useful... The thing is, delayed_fput() does this: for (; node; node = next) { next = llist_next(node); __fput(llist_entry(node, struct file, f_u.fu_llist)); } and llist_entry() here is just a cast - f_u.fu_list is at offset zero. So to get NULL passed to __fput() here you'd need node == NULL. Even unmapped address that has escaped the loop condition would've oopsed before reaching __fput() - we *do* fetch node->next (i.e. file->f_u.fu_list.next) before going into __fput(); that isn't going to be reordered away. Besides, f_mode is quite a bit into struct file and dereferencing has happened at address 0, unless I'm misreading that oops... ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: commit c7314d74fcb0 2014-01-19 21:49 ` Al Viro @ 2014-01-20 17:56 ` Arnaud Ebalard 0 siblings, 0 replies; 3+ messages in thread From: Arnaud Ebalard @ 2014-01-20 17:56 UTC (permalink / raw) To: Al Viro; +Cc: linux-fsdevel Hi Al, Al Viro <viro@ZenIV.linux.org.uk> writes: > The whole image would be an overkill, but System.map and disassembly of > __fput would be useful... The thing is, delayed_fput() does this: > for (; node; node = next) { > next = llist_next(node); > __fput(llist_entry(node, struct file, f_u.fu_llist)); > } > and llist_entry() here is just a cast - f_u.fu_list is at offset zero. > So to get NULL passed to __fput() here you'd need node == NULL. Even > unmapped address that has escaped the loop condition would've oopsed > before reaching __fput() - we *do* fetch node->next (i.e. > file->f_u.fu_list.next) before going into __fput(); that isn't going > to be reordered away. > > Besides, f_mode is quite a bit into struct file and dereferencing has > happened at address 0, unless I'm misreading that oops... No, I guess you are right. System.map file is available here: http://natisbad.org/System.map. A disassembly of __fput() and delayed_fput() is inlined below. Note: kernel is compiled with LOADADDR set to 0x8000. __fput() (via arm-linux-gnueabi-objdump -S -EL -D -b binary -m arm --start-address=0x000838a4 --stop-address=0x83a94 Image): 000838a4 <.data+0x838a4>: 838a4: e92d4ff0 push {r4, r5, r6, r7, r8, r9, sl, fp, lr} 838a8: e1a06000 mov r6, r0 838ac: e24dd00c sub sp, sp, #12 838b0: e590700c ldr r7, [r0, #12] 838b4: e5908008 ldr r8, [r0, #8] 838b8: e5904010 ldr r4, [r0, #16] 838bc: eb132240 bl 0x54c1c4 838c0: e5965010 ldr r5, [r6, #16] 838c4: e5963020 ldr r3, [r6, #32] 838c8: e1d520b0 ldrh r2, [r5] 838cc: e3130002 tst r3, #2 838d0: e2022a0f and r2, r2, #61440 ; 0xf000 838d4: 03a0a010 moveq sl, #16 838d8: 13a0a008 movne sl, #8 838dc: e3520901 cmp r2, #16384 ; 0x4000 838e0: 038aa101 orreq sl, sl, #1073741824 ; 0x40000000 838e4: e2139401 ands r9, r3, #16777216 ; 0x1000000 838e8: 0a000046 beq 0x83a08 838ec: e5962074 ldr r2, [r6, #116] ; 0x74 838f0: e2863074 add r3, r6, #116 ; 0x74 838f4: e1520003 cmp r2, r3 838f8: 1a000058 bne 0x83a60 838fc: e1a00006 mov r0, r6 83900: eb00f5b3 bl 0xc0fd4 83904: e596301c ldr r3, [r6, #28] 83908: e3130a02 tst r3, #8192 ; 0x2000 8390c: e5963014 ldr r3, [r6, #20] 83910: 1a000049 bne 0x83a3c 83914: e5933034 ldr r3, [r3, #52] ; 0x34 83918: e3530000 cmp r3, #0 8391c: 0a000002 beq 0x8392c 83920: e1a00004 mov r0, r4 83924: e1a01006 mov r1, r6 83928: e12fff33 blx r3 8392c: e1d430b0 ldrh r3, [r4] 83930: e2033a0f and r3, r3, #61440 ; 0xf000 83934: e3530a02 cmp r3, #8192 ; 0x2000 83938: 0a00004b beq 0x83a6c 8393c: e5963014 ldr r3, [r6, #20] 83940: e3530000 cmp r3, #0 83944: 0a000001 beq 0x83950 83948: e5930000 ldr r0, [r3] 8394c: ebff1733 bl 0x49620 83950: e5960030 ldr r0, [r6, #48] ; 0x30 83954: ebfe8f2a bl 0x27604 83958: e5963020 ldr r3, [r6, #32] 8395c: e3130002 tst r3, #2 83960: 1a000013 bne 0x839b4 83964: e3a03000 mov r3, #0 83968: e586300c str r3, [r6, #12] 8396c: e5863008 str r3, [r6, #8] 83970: e5863010 str r3, [r6, #16] 83974: e59f1110 ldr r1, [pc, #272] ; 0x83a8c 83978: e3e04000 mvn r4, #0 8397c: e3e05000 mvn r5, #0 83980: e1c120d0 ldrd r2, [r1] 83984: e0922004 adds r2, r2, r4 83988: e0a33005 adc r3, r3, r5 8398c: e1c120f0 strd r2, [r1] 83990: e59f10f8 ldr r1, [pc, #248] ; 0x83a90 83994: e1a00006 mov r0, r6 83998: ebfee9d7 bl 0x3e0fc 8399c: e1a00007 mov r0, r7 839a0: eb004756 bl 0x95700 839a4: e1a00008 mov r0, r8 839a8: e28dd00c add sp, sp, #12 839ac: e8bd4ff0 pop {r4, r5, r6, r7, r8, r9, sl, fp, lr} 839b0: ea005ea3 b 0x9b444 839b4: e596300c ldr r3, [r6, #12] 839b8: e5960008 ldr r0, [r6, #8] 839bc: e5933028 ldr r3, [r3, #40] ; 0x28 839c0: e28320b0 add r2, r3, #176 ; 0xb0 839c4: f5d2f000 pld [r2] 839c8: e1921f9f ldrex r1, [r2] 839cc: e2411001 sub r1, r1, #1 839d0: e182cf91 strex ip, r1, [r2] 839d4: e33c0000 teq ip, #0 839d8: 1afffffa bne 0x839c8 839dc: e1d330b0 ldrh r3, [r3] 839e0: e2032a0b and r2, r3, #45056 ; 0xb000 839e4: e3520a02 cmp r2, #8192 ; 0x2000 839e8: 0affffdd beq 0x83964 839ec: e2033a0f and r3, r3, #61440 ; 0xf000 839f0: e3530a01 cmp r3, #4096 ; 0x1000 839f4: 0affffda beq 0x83964 839f8: e3530903 cmp r3, #49152 ; 0xc000 839fc: 0affffd8 beq 0x83964 83a00: eb005fdc bl 0x9b978 83a04: eaffffd6 b 0x83964 83a08: e286b008 add fp, r6, #8 83a0c: e596100c ldr r1, [r6, #12] 83a10: e1a0200a mov r2, sl 83a14: e1a0000b mov r0, fp 83a18: eb00ca6e bl 0xb63d8 83a1c: e58d9000 str r9, [sp] 83a20: e58d9004 str r9, [sp, #4] 83a24: e1a00005 mov r0, r5 83a28: e1a0100a mov r1, sl 83a2c: e1a0200b mov r2, fp 83a30: e3a03001 mov r3, #1 83a34: eb00c98c bl 0xb606c 83a38: eaffffab b 0x838ec 83a3c: e593c040 ldr ip, [r3, #64] ; 0x40 83a40: e35c0000 cmp ip, #0 83a44: 0affffb2 beq 0x83914 83a48: e3e00000 mvn r0, #0 83a4c: e1a01006 mov r1, r6 83a50: e3a02000 mov r2, #0 83a54: e12fff3c blx ip 83a58: e5963014 ldr r3, [r6, #20] 83a5c: eaffffac b 0x83914 83a60: e1a00006 mov r0, r6 83a64: eb00d816 bl 0xb9ac4 83a68: eaffffa3 b 0x838fc 83a6c: e5940118 ldr r0, [r4, #280] ; 0x118 83a70: e3500000 cmp r0, #0 83a74: 0affffb0 beq 0x8393c 83a78: e5963020 ldr r3, [r6, #32] 83a7c: e3130901 tst r3, #16384 ; 0x4000 83a80: 1affffad bne 0x8393c 83a84: eb0008bd bl 0x85d80 83a88: eaffffab b 0x8393c 83a8c: c07eb4c0 rsbsgt fp, lr, r0, asr #9 83a90: c008b860 andgt fp, r8, r0, ror #16 delayed_fput() (via arm-linux-gnueabi-objdump -S -EL -D -b binary -m arm --start-address=0x83a94 --stop-address=0x83ad0 Image): 00083a94 <.data+0x83a94>: 83a94: e92d4010 push {r4, lr} 83a98: e59f202c ldr r2, [pc, #44] ; 0x83acc 83a9c: e3a03000 mov r3, #0 83aa0: e1920f9f ldrex r0, [r2] 83aa4: e1821f93 strex r1, r3, [r2] 83aa8: e3310000 teq r1, #0 83aac: 1afffffb bne 0x83aa0 83ab0: e3500000 cmp r0, #0 83ab4: 08bd8010 popeq {r4, pc} 83ab8: e5904000 ldr r4, [r0] 83abc: ebffff78 bl 0x838a4 83ac0: e2540000 subs r0, r4, #0 83ac4: 1afffffb bne 0x83ab8 83ac8: e8bd8010 pop {r4, pc} 83acc: c07eb4c8 rsbsgt fp, lr, r8, asr #9 Cheers, a+ ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2014-01-20 17:57 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-01-19 20:50 commit c7314d74fcb0 Arnaud Ebalard 2014-01-19 21:49 ` Al Viro 2014-01-20 17:56 ` Arnaud Ebalard
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).