linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* commit c7314d74fcb0
@ 2014-01-19 20:50 Arnaud Ebalard
  2014-01-19 21:49 ` Al Viro
  0 siblings, 1 reply; 3+ messages in thread
From: Arnaud Ebalard @ 2014-01-19 20:50 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-fsdevel

Hi Al,

While doing md5sum on files available on a NFSv3 share, the following
came up on the NAS (an ARM-based ReadyNAS 102) acting as NFS server and
running 3.13.0-rc8:

[ 3832.280073] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 3832.288188] pgd = c0004000
[ 3832.290909] [00000000] *pgd=00000000
[ 3832.294505] Internal error: Oops: 17 [#1] ARM
[ 3832.298868] Modules linked in:
[ 3832.301942] CPU: 0 PID: 359 Comm: kworker/0:1 Not tainted 3.13.0-rc8.rn102-00076-g7d0d46da750a-dirty #62
[ 3832.311452] Workqueue: events delayed_fput
[ 3832.315565] task: df88d340 ti: df940000 task.ti: df940000
[ 3832.320973] PC is at __fput+0x24/0x1f0
[ 3832.324729] LR is at __fput+0x1c/0x1f0
[ 3832.328485] pc : [<c008b8c8>]    lr : [<c008b8c0>]    psr: 60000013
[ 3832.328485] sp : df941ed0  ip : c07726c0  fp : 00000009
[ 3832.339981] r10: 00000000  r9 : 00000000  r8 : 00000000
[ 3832.345214] r7 : 00000000  r6 : c152b900  r5 : 00000000  r4 : 00000000
[ 3832.351751] r3 : 0000021d  r2 : 00000000  r1 : c152be40  r0 : 00000000
[ 3832.358290] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM Segment kernel
[ 3832.365611] Control: 10c5387d  Table: 1eaa8019  DAC: 00000015
[ 3832.371366] Process kworker/0:1 (pid: 359, stack limit = 0xdf940238)
[ 3832.377729] Stack: (0xdf941ed0 to 0xdf942000)
[ 3832.382095] 1ec0:                                     00000000 00000000 00000000 c152be40
[ 3832.390289] 1ee0: c077569c c07726c0 df940000 c0fb0300 00000000 00000000 00000009 c008bac0
[ 3832.398482] 1f00: df90ff80 c002d6cc c0772ff0 df940000 df941f3c c0553fa8 df88d340 df90ff80
[ 3832.406676] 1f20: c07726c0 c07726d0 df940000 df90ff98 c07a5b55 00000001 00000009 c002dc98
[ 3832.414869] 1f40: df941f6c c0553fa8 df88d340 00000000 df913480 df90ff80 c002db84 00000000
[ 3832.423063] 1f60: 00000000 00000000 00000000 c0032848 d3effdfa 00000000 eff1757f df90ff80
[ 3832.431256] 1f80: 00000000 df941f84 df941f84 00000000 df941f90 df941f90 df941fac df913480
[ 3832.439449] 1fa0: c003278c 00000000 00000000 c000e218 00000000 00000000 00000000 00000000
[ 3832.447642] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 3832.455836] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 f727f277 47f34ef7
[ 3832.464036] [<c008b8c8>] (__fput+0x24/0x1f0) from [<c008bac0>] (delayed_fput+0x2c/0x3c)
[ 3832.472069] [<c008bac0>] (delayed_fput+0x2c/0x3c) from [<c002d6cc>] (process_one_work+0xf4/0x32c)
[ 3832.480963] [<c002d6cc>] (process_one_work+0xf4/0x32c) from [<c002dc98>] (worker_thread+0x114/0x34c)
[ 3832.490120] [<c002dc98>] (worker_thread+0x114/0x34c) from [<c0032848>] (kthread+0xbc/0xd8)
[ 3832.498407] [<c0032848>] (kthread+0xbc/0xd8) from [<c000e218>] (ret_from_fork+0x14/0x3c)
[ 3832.506514] Code: e5904010 eb132240 e5965010 e5963020 (e1d520b0) 
[ 3832.512633] ---[ end trace a00affa6c4ade951 ]---
[ 3832.517258] Kernel panic - not syncing: Fatal exception

I took a quick look at what is at __fput+0x24 and I think the issue
happens in the inlined fsnotify_close() at the beginning of __fput():

static void __fput(struct file *file)
{
	struct dentry *dentry = file->f_path.dentry;
	struct vfsmount *mnt = file->f_path.mnt;
	struct inode *inode = file->f_inode;

	might_sleep();

	fsnotify_close(file);
	/*
	 * The function eventpoll_release() should be the first called

I came to the conclusion that it gets a file which is NULL and dies when
trying to get 'mode' below:

static inline void fsnotify_close(struct file *file)
{
	struct path *path = &file->f_path;
	struct inode *inode = file_inode(file);
	fmode_t mode = file->f_mode;
	__u32 mask = (mode & FMODE_WRITE) ? FS_CLOSE_WRITE : FS_CLOSE_NOWRITE;


Anyway, I may be wrong on the analysis and can provide the image if
you want to take a look. I then took a look at which commit may have
caused the bug and considered 4f5e65a1cc90 (fput: turn "list_head
delayed_fput_list" into llist_head) and then c7314d74fcb0 ("nfsd
regression since delayed fput()") but they are quite old and I guess
someone else would have hit it already if one was the cause. Bottom
line: I hope the backtrace may give you an idea.

As a side note, regarding the performance I get with the NAS (GbE port,
1.2GHz ARMv7 single core SoC, 512MB of RAM, NFSv3 over TCP w/ 65k/65k,
fs is ext4 over lvm over software RAID1 on 2 WD30EFRX disks),
reading a single large file stored on the NAS is done on average at a
speed of 50MB/s. The same file is served by nginx at 104MB/s and by
apache at 65MB/s.

Cheers,

a+






^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: commit c7314d74fcb0
  2014-01-19 20:50 commit c7314d74fcb0 Arnaud Ebalard
@ 2014-01-19 21:49 ` Al Viro
  2014-01-20 17:56   ` Arnaud Ebalard
  0 siblings, 1 reply; 3+ messages in thread
From: Al Viro @ 2014-01-19 21:49 UTC (permalink / raw)
  To: Arnaud Ebalard; +Cc: linux-fsdevel

On Sun, Jan 19, 2014 at 09:50:51PM +0100, Arnaud Ebalard wrote:
> Hi Al,
> 
> While doing md5sum on files available on a NFSv3 share, the following
> came up on the NAS (an ARM-based ReadyNAS 102) acting as NFS server and
> running 3.13.0-rc8:
> 
> [ 3832.280073] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> [ 3832.288188] pgd = c0004000
> [ 3832.290909] [00000000] *pgd=00000000
> [ 3832.294505] Internal error: Oops: 17 [#1] ARM
> [ 3832.298868] Modules linked in:
> [ 3832.301942] CPU: 0 PID: 359 Comm: kworker/0:1 Not tainted 3.13.0-rc8.rn102-00076-g7d0d46da750a-dirty #62
> [ 3832.311452] Workqueue: events delayed_fput
> [ 3832.315565] task: df88d340 ti: df940000 task.ti: df940000
> [ 3832.320973] PC is at __fput+0x24/0x1f0
> [ 3832.324729] LR is at __fput+0x1c/0x1f0
> [ 3832.328485] pc : [<c008b8c8>]    lr : [<c008b8c0>]    psr: 60000013
> [ 3832.328485] sp : df941ed0  ip : c07726c0  fp : 00000009
> [ 3832.339981] r10: 00000000  r9 : 00000000  r8 : 00000000
> [ 3832.345214] r7 : 00000000  r6 : c152b900  r5 : 00000000  r4 : 00000000
> [ 3832.351751] r3 : 0000021d  r2 : 00000000  r1 : c152be40  r0 : 00000000
> [ 3832.358290] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM Segment kernel
> [ 3832.365611] Control: 10c5387d  Table: 1eaa8019  DAC: 00000015
> [ 3832.371366] Process kworker/0:1 (pid: 359, stack limit = 0xdf940238)
> [ 3832.377729] Stack: (0xdf941ed0 to 0xdf942000)
> [ 3832.382095] 1ec0:                                     00000000 00000000 00000000 c152be40
> [ 3832.390289] 1ee0: c077569c c07726c0 df940000 c0fb0300 00000000 00000000 00000009 c008bac0
> [ 3832.398482] 1f00: df90ff80 c002d6cc c0772ff0 df940000 df941f3c c0553fa8 df88d340 df90ff80
> [ 3832.406676] 1f20: c07726c0 c07726d0 df940000 df90ff98 c07a5b55 00000001 00000009 c002dc98
> [ 3832.414869] 1f40: df941f6c c0553fa8 df88d340 00000000 df913480 df90ff80 c002db84 00000000
> [ 3832.423063] 1f60: 00000000 00000000 00000000 c0032848 d3effdfa 00000000 eff1757f df90ff80
> [ 3832.431256] 1f80: 00000000 df941f84 df941f84 00000000 df941f90 df941f90 df941fac df913480
> [ 3832.439449] 1fa0: c003278c 00000000 00000000 c000e218 00000000 00000000 00000000 00000000
> [ 3832.447642] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 3832.455836] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 f727f277 47f34ef7
> [ 3832.464036] [<c008b8c8>] (__fput+0x24/0x1f0) from [<c008bac0>] (delayed_fput+0x2c/0x3c)
> [ 3832.472069] [<c008bac0>] (delayed_fput+0x2c/0x3c) from [<c002d6cc>] (process_one_work+0xf4/0x32c)
> [ 3832.480963] [<c002d6cc>] (process_one_work+0xf4/0x32c) from [<c002dc98>] (worker_thread+0x114/0x34c)
> [ 3832.490120] [<c002dc98>] (worker_thread+0x114/0x34c) from [<c0032848>] (kthread+0xbc/0xd8)
> [ 3832.498407] [<c0032848>] (kthread+0xbc/0xd8) from [<c000e218>] (ret_from_fork+0x14/0x3c)
> [ 3832.506514] Code: e5904010 eb132240 e5965010 e5963020 (e1d520b0) 
> [ 3832.512633] ---[ end trace a00affa6c4ade951 ]---
> [ 3832.517258] Kernel panic - not syncing: Fatal exception
> 
> I took a quick look at what is at __fput+0x24 and I think the issue
> happens in the inlined fsnotify_close() at the beginning of __fput():
> 
> static void __fput(struct file *file)
> {
> 	struct dentry *dentry = file->f_path.dentry;
> 	struct vfsmount *mnt = file->f_path.mnt;
> 	struct inode *inode = file->f_inode;
> 
> 	might_sleep();
> 
> 	fsnotify_close(file);
> 	/*
> 	 * The function eventpoll_release() should be the first called
> 
> I came to the conclusion that it gets a file which is NULL and dies when
> trying to get 'mode' below:
> 
> static inline void fsnotify_close(struct file *file)
> {
> 	struct path *path = &file->f_path;
> 	struct inode *inode = file_inode(file);
> 	fmode_t mode = file->f_mode;
> 	__u32 mask = (mode & FMODE_WRITE) ? FS_CLOSE_WRITE : FS_CLOSE_NOWRITE;
> 
> 
> Anyway, I may be wrong on the analysis and can provide the image if
> you want to take a look. I then took a look at which commit may have
> caused the bug and considered 4f5e65a1cc90 (fput: turn "list_head
> delayed_fput_list" into llist_head) and then c7314d74fcb0 ("nfsd
> regression since delayed fput()") but they are quite old and I guess
> someone else would have hit it already if one was the cause. Bottom
> line: I hope the backtrace may give you an idea.

The whole image would be an overkill, but System.map and disassembly of
__fput would be useful...  The thing is, delayed_fput() does this:
        for (; node; node = next) {
                next = llist_next(node);
                __fput(llist_entry(node, struct file, f_u.fu_llist));
        }
and llist_entry() here is just a cast - f_u.fu_list is at offset zero.
So to get NULL passed to __fput() here you'd need node == NULL.  Even
unmapped address that has escaped the loop condition would've oopsed
before reaching __fput() - we *do* fetch node->next (i.e.
file->f_u.fu_list.next) before going into __fput(); that isn't going
to be reordered away.

Besides, f_mode is quite a bit into struct file and dereferencing has
happened at address 0, unless I'm misreading that oops...

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: commit c7314d74fcb0
  2014-01-19 21:49 ` Al Viro
@ 2014-01-20 17:56   ` Arnaud Ebalard
  0 siblings, 0 replies; 3+ messages in thread
From: Arnaud Ebalard @ 2014-01-20 17:56 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel

Hi Al,

Al Viro <viro@ZenIV.linux.org.uk> writes:

> The whole image would be an overkill, but System.map and disassembly of
> __fput would be useful...  The thing is, delayed_fput() does this:
>         for (; node; node = next) {
>                 next = llist_next(node);
>                 __fput(llist_entry(node, struct file, f_u.fu_llist));
>         }
> and llist_entry() here is just a cast - f_u.fu_list is at offset zero.
> So to get NULL passed to __fput() here you'd need node == NULL.  Even
> unmapped address that has escaped the loop condition would've oopsed
> before reaching __fput() - we *do* fetch node->next (i.e.
> file->f_u.fu_list.next) before going into __fput(); that isn't going
> to be reordered away.
>
> Besides, f_mode is quite a bit into struct file and dereferencing has
> happened at address 0, unless I'm misreading that oops...

No, I guess you are right.

System.map file is available here: http://natisbad.org/System.map. A
disassembly of __fput() and delayed_fput() is inlined below. Note:
kernel is compiled with LOADADDR set to 0x8000.

__fput() (via arm-linux-gnueabi-objdump -S -EL -D -b binary -m arm
--start-address=0x000838a4 --stop-address=0x83a94 Image):

000838a4 <.data+0x838a4>:
   838a4:	e92d4ff0 	push	{r4, r5, r6, r7, r8, r9, sl, fp, lr}
   838a8:	e1a06000 	mov	r6, r0
   838ac:	e24dd00c 	sub	sp, sp, #12
   838b0:	e590700c 	ldr	r7, [r0, #12]
   838b4:	e5908008 	ldr	r8, [r0, #8]
   838b8:	e5904010 	ldr	r4, [r0, #16]
   838bc:	eb132240 	bl	0x54c1c4
   838c0:	e5965010 	ldr	r5, [r6, #16]
   838c4:	e5963020 	ldr	r3, [r6, #32]
   838c8:	e1d520b0 	ldrh	r2, [r5]
   838cc:	e3130002 	tst	r3, #2
   838d0:	e2022a0f 	and	r2, r2, #61440	; 0xf000
   838d4:	03a0a010 	moveq	sl, #16
   838d8:	13a0a008 	movne	sl, #8
   838dc:	e3520901 	cmp	r2, #16384	; 0x4000
   838e0:	038aa101 	orreq	sl, sl, #1073741824	; 0x40000000
   838e4:	e2139401 	ands	r9, r3, #16777216	; 0x1000000
   838e8:	0a000046 	beq	0x83a08
   838ec:	e5962074 	ldr	r2, [r6, #116]	; 0x74
   838f0:	e2863074 	add	r3, r6, #116	; 0x74
   838f4:	e1520003 	cmp	r2, r3
   838f8:	1a000058 	bne	0x83a60
   838fc:	e1a00006 	mov	r0, r6
   83900:	eb00f5b3 	bl	0xc0fd4
   83904:	e596301c 	ldr	r3, [r6, #28]
   83908:	e3130a02 	tst	r3, #8192	; 0x2000
   8390c:	e5963014 	ldr	r3, [r6, #20]
   83910:	1a000049 	bne	0x83a3c
   83914:	e5933034 	ldr	r3, [r3, #52]	; 0x34
   83918:	e3530000 	cmp	r3, #0
   8391c:	0a000002 	beq	0x8392c
   83920:	e1a00004 	mov	r0, r4
   83924:	e1a01006 	mov	r1, r6
   83928:	e12fff33 	blx	r3
   8392c:	e1d430b0 	ldrh	r3, [r4]
   83930:	e2033a0f 	and	r3, r3, #61440	; 0xf000
   83934:	e3530a02 	cmp	r3, #8192	; 0x2000
   83938:	0a00004b 	beq	0x83a6c
   8393c:	e5963014 	ldr	r3, [r6, #20]
   83940:	e3530000 	cmp	r3, #0
   83944:	0a000001 	beq	0x83950
   83948:	e5930000 	ldr	r0, [r3]
   8394c:	ebff1733 	bl	0x49620
   83950:	e5960030 	ldr	r0, [r6, #48]	; 0x30
   83954:	ebfe8f2a 	bl	0x27604
   83958:	e5963020 	ldr	r3, [r6, #32]
   8395c:	e3130002 	tst	r3, #2
   83960:	1a000013 	bne	0x839b4
   83964:	e3a03000 	mov	r3, #0
   83968:	e586300c 	str	r3, [r6, #12]
   8396c:	e5863008 	str	r3, [r6, #8]
   83970:	e5863010 	str	r3, [r6, #16]
   83974:	e59f1110 	ldr	r1, [pc, #272]	; 0x83a8c
   83978:	e3e04000 	mvn	r4, #0
   8397c:	e3e05000 	mvn	r5, #0
   83980:	e1c120d0 	ldrd	r2, [r1]
   83984:	e0922004 	adds	r2, r2, r4
   83988:	e0a33005 	adc	r3, r3, r5
   8398c:	e1c120f0 	strd	r2, [r1]
   83990:	e59f10f8 	ldr	r1, [pc, #248]	; 0x83a90
   83994:	e1a00006 	mov	r0, r6
   83998:	ebfee9d7 	bl	0x3e0fc
   8399c:	e1a00007 	mov	r0, r7
   839a0:	eb004756 	bl	0x95700
   839a4:	e1a00008 	mov	r0, r8
   839a8:	e28dd00c 	add	sp, sp, #12
   839ac:	e8bd4ff0 	pop	{r4, r5, r6, r7, r8, r9, sl, fp, lr}
   839b0:	ea005ea3 	b	0x9b444
   839b4:	e596300c 	ldr	r3, [r6, #12]
   839b8:	e5960008 	ldr	r0, [r6, #8]
   839bc:	e5933028 	ldr	r3, [r3, #40]	; 0x28
   839c0:	e28320b0 	add	r2, r3, #176	; 0xb0
   839c4:	f5d2f000 	pld	[r2]
   839c8:	e1921f9f 	ldrex	r1, [r2]
   839cc:	e2411001 	sub	r1, r1, #1
   839d0:	e182cf91 	strex	ip, r1, [r2]
   839d4:	e33c0000 	teq	ip, #0
   839d8:	1afffffa 	bne	0x839c8
   839dc:	e1d330b0 	ldrh	r3, [r3]
   839e0:	e2032a0b 	and	r2, r3, #45056	; 0xb000
   839e4:	e3520a02 	cmp	r2, #8192	; 0x2000
   839e8:	0affffdd 	beq	0x83964
   839ec:	e2033a0f 	and	r3, r3, #61440	; 0xf000
   839f0:	e3530a01 	cmp	r3, #4096	; 0x1000
   839f4:	0affffda 	beq	0x83964
   839f8:	e3530903 	cmp	r3, #49152	; 0xc000
   839fc:	0affffd8 	beq	0x83964
   83a00:	eb005fdc 	bl	0x9b978
   83a04:	eaffffd6 	b	0x83964
   83a08:	e286b008 	add	fp, r6, #8
   83a0c:	e596100c 	ldr	r1, [r6, #12]
   83a10:	e1a0200a 	mov	r2, sl
   83a14:	e1a0000b 	mov	r0, fp
   83a18:	eb00ca6e 	bl	0xb63d8
   83a1c:	e58d9000 	str	r9, [sp]
   83a20:	e58d9004 	str	r9, [sp, #4]
   83a24:	e1a00005 	mov	r0, r5
   83a28:	e1a0100a 	mov	r1, sl
   83a2c:	e1a0200b 	mov	r2, fp
   83a30:	e3a03001 	mov	r3, #1
   83a34:	eb00c98c 	bl	0xb606c
   83a38:	eaffffab 	b	0x838ec
   83a3c:	e593c040 	ldr	ip, [r3, #64]	; 0x40
   83a40:	e35c0000 	cmp	ip, #0
   83a44:	0affffb2 	beq	0x83914
   83a48:	e3e00000 	mvn	r0, #0
   83a4c:	e1a01006 	mov	r1, r6
   83a50:	e3a02000 	mov	r2, #0
   83a54:	e12fff3c 	blx	ip
   83a58:	e5963014 	ldr	r3, [r6, #20]
   83a5c:	eaffffac 	b	0x83914
   83a60:	e1a00006 	mov	r0, r6
   83a64:	eb00d816 	bl	0xb9ac4
   83a68:	eaffffa3 	b	0x838fc
   83a6c:	e5940118 	ldr	r0, [r4, #280]	; 0x118
   83a70:	e3500000 	cmp	r0, #0
   83a74:	0affffb0 	beq	0x8393c
   83a78:	e5963020 	ldr	r3, [r6, #32]
   83a7c:	e3130901 	tst	r3, #16384	; 0x4000
   83a80:	1affffad 	bne	0x8393c
   83a84:	eb0008bd 	bl	0x85d80
   83a88:	eaffffab 	b	0x8393c
   83a8c:	c07eb4c0 	rsbsgt	fp, lr, r0, asr #9
   83a90:	c008b860 	andgt	fp, r8, r0, ror #16

delayed_fput() (via arm-linux-gnueabi-objdump -S -EL -D -b binary -m arm
--start-address=0x83a94 --stop-address=0x83ad0 Image):

00083a94 <.data+0x83a94>:
   83a94:	e92d4010 	push	{r4, lr}
   83a98:	e59f202c 	ldr	r2, [pc, #44]	; 0x83acc
   83a9c:	e3a03000 	mov	r3, #0
   83aa0:	e1920f9f 	ldrex	r0, [r2]
   83aa4:	e1821f93 	strex	r1, r3, [r2]
   83aa8:	e3310000 	teq	r1, #0
   83aac:	1afffffb 	bne	0x83aa0
   83ab0:	e3500000 	cmp	r0, #0
   83ab4:	08bd8010 	popeq	{r4, pc}
   83ab8:	e5904000 	ldr	r4, [r0]
   83abc:	ebffff78 	bl	0x838a4
   83ac0:	e2540000 	subs	r0, r4, #0
   83ac4:	1afffffb 	bne	0x83ab8
   83ac8:	e8bd8010 	pop	{r4, pc}
   83acc:	c07eb4c8 	rsbsgt	fp, lr, r8, asr #9

Cheers,

a+

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2014-01-20 17:57 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-01-19 20:50 commit c7314d74fcb0 Arnaud Ebalard
2014-01-19 21:49 ` Al Viro
2014-01-20 17:56   ` Arnaud Ebalard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).