public inbox for linux-kernel@vger.kernel.org
* NFS write OOPS with 2.6.29.2
@ 2009-05-03 16:03 Holger Kiehl
  2009-05-05  6:14 ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Holger Kiehl @ 2009-05-03 16:03 UTC (permalink / raw)
  To: linux-kernel

Hello

With plain kernel 2.6.29.2 I get the following OOPS (several of them) when
writing lots of small files on the client system:

    May  3 18:48:34 obelix kernel: ------------[ cut here ]------------
    May  3 18:48:34 obelix kernel: kernel BUG at fs/nfs/write.c:252!
    May  3 18:48:34 obelix kernel: invalid opcode: 0000 [#1] SMP
    May  3 18:48:34 obelix kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:1c.2/0000:11:00.0/0000:12:00.0/irq
    May  3 18:48:34 obelix kernel: CPU 1
    May  3 18:48:34 obelix kernel: Modules linked in: nfs lockd nfs_acl auth_rpcgss coretemp smsc47m1 ipmi_si ipmi_msghandler sunrpc binfmt_misc usbhid i5000_edac i2c_i801 uhci_hcd sg i2c_core i5k_amb ehci_hcd usbcore [last unloaded: scsi_wait_scan]
    May  3 18:48:34 obelix kernel: Pid: 8328, comm: sf_loc Not tainted 2.6.29.2 #2 PRIMERGY RX300 S4
    May  3 18:48:34 obelix kernel: RIP: 0010:[<ffffffffa010effc>]  [<ffffffffa010effc>] nfs_do_writepage+0xfa/0x196 [nfs]
    May  3 18:48:34 obelix kernel: RSP: 0018:ffff8807e619bb28  EFLAGS: 00010286
    May  3 18:48:34 obelix kernel: RAX: 0000000000000001 RBX: ffffe2001ba1dcc0 RCX: 0000000000000015
    May  3 18:48:34 obelix kernel: RDX: 0000000000000000 RSI: 0000000000600020 RDI: ffff8807ec10e950
    May  3 18:48:34 obelix kernel: RBP: ffff8807e619bb58 R08: ffff88082cf52540 R09: ffff88083cc74440
    May  3 18:48:34 obelix kernel: R10: 0000000000000000 R11: 00000000fffffffa R12: ffff88082cf52540
    May  3 18:48:34 obelix kernel: R13: ffff8807ec10ea9c R14: ffffe2001ba1dcc0 R15: ffff8807ec10e9e8
    May  3 18:48:34 obelix kernel: FS:  00007f03010276f0(0000) GS:ffff88083cca3e40(0000) knlGS:0000000000000000
    May  3 18:48:34 obelix kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    May  3 18:48:34 obelix kernel: CR2: 00007fb127e21508 CR3: 000000082cf94000 CR4: 00000000000406e0
    May  3 18:48:34 obelix kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    May  3 18:48:34 obelix kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    May  3 18:48:34 obelix kernel: Process sf_loc (pid: 8328, threadinfo ffff8807e619a000, task ffff88082cf84840)
    May  3 18:48:34 obelix kernel: Stack:
    May  3 18:48:34 obelix kernel: ffff8807e619bcb8 ffffe2001ba1dcc0 ffffe2001ba1dcc0 0000000000000000
    May  3 18:48:34 obelix kernel: 0000000000000001 0000000000000000 ffff8807e619bb78 ffffffffa010f50a
    May  3 18:48:34 obelix kernel: ffffe2001ba1dcc0 ffff8807e619bd68 ffff8807e619bca8 ffffffff8026f964
    May  3 18:48:34 obelix kernel: Call Trace:
    May  3 18:48:34 obelix kernel: [<ffffffffa010f50a>] nfs_writepages_callback+0xf/0x20 [nfs]
    May  3 18:48:34 obelix kernel: [<ffffffff8026f964>] write_cache_pages+0x246/0x389
    May  3 18:48:34 obelix kernel: [<ffffffffa010f4fb>] ? nfs_writepages_callback+0x0/0x20 [nfs]
    May  3 18:48:34 obelix kernel: [<ffffffff80268c20>] ? find_get_pages_tag+0x3e/0xd9
    May  3 18:48:34 obelix kernel: [<ffffffffa010f4d1>] nfs_writepages+0xb0/0xda [nfs]
    May  3 18:48:34 obelix kernel: [<ffffffffa01107f1>] ? nfs_flush_one+0x0/0xd9 [nfs]
    May  3 18:48:34 obelix kernel: [<ffffffffa01107f1>] ? nfs_flush_one+0x0/0xd9 [nfs]
    May  3 18:48:34 obelix kernel: [<ffffffffa01105e5>] __nfs_write_mapping+0x19/0x50 [nfs]
    May  3 18:48:34 obelix kernel: [<ffffffffa0110676>] nfs_write_mapping+0x5a/0x7e [nfs]
    May  3 18:48:34 obelix kernel: [<ffffffffa01106c3>] nfs_wb_all+0x12/0x14 [nfs]
    May  3 18:48:34 obelix kernel: [<ffffffffa0106096>] nfs_sync_mapping+0x34/0x38 [nfs]
    May  3 18:48:34 obelix kernel: [<ffffffffa0103a7e>] do_setlk+0x89/0xb0 [nfs]
    May  3 18:48:34 obelix kernel: [<ffffffffa0103cc7>] nfs_lock+0x18a/0x19b [nfs]
    May  3 18:48:34 obelix kernel: [<ffffffff802c1978>] vfs_lock_file+0x1e/0x35
    May  3 18:48:34 obelix kernel: [<ffffffff802c1b7a>] fcntl_setlk+0x13e/0x278
    May  3 18:48:34 obelix kernel: [<ffffffff802a0649>] sys_fcntl+0x2bc/0x33a
    May  3 18:48:34 obelix kernel: [<ffffffff8020b51b>] system_call_fastpath+0x16/0x1b
    May  3 18:48:34 obelix kernel: Code: 00 4c 89 e7 e8 ba cb ff ff 4c 89 e7 89 c3 e8 05 cc ff ff 85 db 74 a7 e9 82 00 00 00 41 f6 44 24 40 02 74 0b 41 fe 87 b4 00 00 00 <0f> 0b eb fe 4c 89 f7 e8 2d 03 16 e0 85 c0 75 49 49 8b 46 18 ba
    May  3 18:48:34 obelix kernel: RIP  [<ffffffffa010effc>] nfs_do_writepage+0xfa/0x196 [nfs]
    May  3 18:48:34 obelix kernel: RSP <ffff8807e619bb28>
    May  3 18:48:34 obelix kernel: ---[ end trace 1d4f513ef96df0b8 ]---
    May  3 18:48:36 obelix kernel: kernel BUG at fs/nfs/write.c:252!
    May  3 18:48:36 obelix kernel: invalid opcode: 0000 [#2] SMP
    May  3 18:48:36 obelix kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:1c.2/0000:11:00.0/0000:12:00.0/irq
    May  3 18:48:36 obelix kernel: CPU 1
    May  3 18:48:36 obelix kernel: Modules linked in: nfs lockd nfs_acl auth_rpcgss coretemp smsc47m1 ipmi_si ipmi_msghandler sunrpc binfmt_misc usbhid i5000_edac i2c_i801 uhci_hcd sg i2c_core i5k_amb ehci_hcd usbcore [last unloaded: scsi_wait_scan]
    May  3 18:48:36 obelix kernel: Pid: 8350, comm: sf_loc Tainted: G      D    2.6.29.2 #2 PRIMERGY RX300 S4
    May  3 18:48:36 obelix kernel: RIP: 0010:[<ffffffffa010effc>]  [<ffffffffa010effc>] nfs_do_writepage+0xfa/0x196 [nfs]
    May  3 18:48:36 obelix kernel: RSP: 0018:ffff8807e62e7b28  EFLAGS: 00010202
    May  3 18:48:36 obelix kernel: RAX: 0000000000000001 RBX: ffffe2001b9ef5e0 RCX: 0000000000000015
    May  3 18:48:36 obelix kernel: RDX: 0000000000000000 RSI: 0000000000600020 RDI: ffff8807ec065950
    May  3 18:48:36 obelix kernel: RBP: ffff8807e62e7b58 R08: ffff880827de9d40 R09: ffff8807e4858920
    May  3 18:48:36 obelix kernel: R10: 0000000000000000 R11: 00000000fffffffa R12: ffff880827de9d40
    May  3 18:48:36 obelix kernel: R13: ffff8807ec065a9c R14: ffffe2001b9ef5e0 R15: ffff8807ec0659e8
    May  3 18:48:36 obelix kernel: FS:  00007f6eba6616f0(0000) GS:ffff88083cca3e40(0000) knlGS:0000000000000000
    May  3 18:48:36 obelix kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    May  3 18:48:36 obelix kernel: CR2: 00007fc02394c638 CR3: 0000000827cf3000 CR4: 00000000000406e0
    May  3 18:48:36 obelix kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    May  3 18:48:36 obelix kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    May  3 18:48:36 obelix kernel: Process sf_loc (pid: 8350, threadinfo ffff8807e62e6000, task ffff8807e6012140)
    May  3 18:48:36 obelix kernel: Stack:
    May  3 18:48:36 obelix kernel: ffff8807e62e7cb8 ffffe2001b9ef5e0 ffffe2001b9ef5e0 0000000000000000
    May  3 18:48:36 obelix kernel: 0000000000000001 0000000000000000 ffff8807e62e7b78 ffffffffa010f50a
    May  3 18:48:36 obelix kernel: ffffe2001b9ef5e0 ffff8807e62e7d68 ffff8807e62e7ca8 ffffffff8026f964
    May  3 18:48:36 obelix kernel: Call Trace:
    May  3 18:48:36 obelix kernel: [<ffffffffa010f50a>] nfs_writepages_callback+0xf/0x20 [nfs]
    May  3 18:48:36 obelix kernel: [<ffffffff8026f964>] write_cache_pages+0x246/0x389
    May  3 18:48:36 obelix kernel: [<ffffffffa010f4fb>] ? nfs_writepages_callback+0x0/0x20 [nfs]
    May  3 18:48:36 obelix kernel: [<ffffffff80268c20>] ? find_get_pages_tag+0x3e/0xd9
    May  3 18:48:36 obelix kernel: [<ffffffffa010f4d1>] nfs_writepages+0xb0/0xda [nfs]
    May  3 18:48:36 obelix kernel: [<ffffffffa01107f1>] ? nfs_flush_one+0x0/0xd9 [nfs]
    May  3 18:48:36 obelix kernel: [<ffffffffa01107f1>] ? nfs_flush_one+0x0/0xd9 [nfs]
    May  3 18:48:36 obelix kernel: [<ffffffffa01105e5>] __nfs_write_mapping+0x19/0x50 [nfs]
    May  3 18:48:36 obelix kernel: [<ffffffffa0110676>] nfs_write_mapping+0x5a/0x7e [nfs]
    May  3 18:48:36 obelix kernel: [<ffffffffa01106c3>] nfs_wb_all+0x12/0x14 [nfs]
    May  3 18:48:36 obelix kernel: [<ffffffffa0106096>] nfs_sync_mapping+0x34/0x38 [nfs]
    May  3 18:48:36 obelix kernel: [<ffffffffa0103a7e>] do_setlk+0x89/0xb0 [nfs]
    May  3 18:48:36 obelix kernel: [<ffffffffa0103cc7>] nfs_lock+0x18a/0x19b [nfs]
    May  3 18:48:36 obelix kernel: [<ffffffff802c1978>] vfs_lock_file+0x1e/0x35
    May  3 18:48:36 obelix kernel: [<ffffffff802c1b7a>] fcntl_setlk+0x13e/0x278
    May  3 18:48:36 obelix kernel: [<ffffffff802a0649>] sys_fcntl+0x2bc/0x33a
    May  3 18:48:36 obelix kernel: [<ffffffff8020b51b>] system_call_fastpath+0x16/0x1b
    May  3 18:48:36 obelix kernel: Code: 00 4c 89 e7 e8 ba cb ff ff 4c 89 e7 89 c3 e8 05 cc ff ff 85 db 74 a7 e9 82 00 00 00 41 f6 44 24 40 02 74 0b 41 fe 87 b4 00 00 00 <0f> 0b eb fe 4c 89 f7 e8 2d 03 16 e0 85 c0 75 49 49 8b 46 18 ba
    May  3 18:48:36 obelix kernel: RIP  [<ffffffffa010effc>] nfs_do_writepage+0xfa/0x196 [nfs]
    May  3 18:48:36 obelix kernel: RSP <ffff8807e62e7b28>
    May  3 18:48:36 obelix kernel: ---[ end trace 1d4f513ef96df0b9 ]---

System has 2 CPUs (8 cores) and 32 GB of RAM. If more information is needed, please
ask. A lot of processes are hanging in D state. This is a test system,
so I can try any suggestions or patches.

Thanks,
Holger
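
Reading the call trace above from the bottom up, the oops is reached from fcntl(): sys_fcntl -> fcntl_setlk -> nfs_lock -> do_setlk -> nfs_sync_mapping -> nfs_wb_all, so setting a lock on an NFS file flushes its dirty pages first. The following is only a minimal userspace sketch of that syscall pattern (write a small file, then take a POSIX lock on it). The function name write_then_lock() is hypothetical, it is an assumption that sf_loc does something comparable, and this is not a confirmed reproducer for the BUG.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a small payload to `path`, then take a
 * POSIX write lock on the whole file with F_SETLK.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
int write_then_lock(const char *path, const char *payload)
{
    assert(path != NULL && payload != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    /* Dirty the page cache; a short write is treated as a failure here. */
    size_t len = strlen(payload);
    if (write(fd, payload, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* l_len == 0 means "lock to end of file", i.e. the whole file. */
    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;

    /* On an NFS mount this is the call that enters nfs_lock() in the trace. */
    int rc = fcntl(fd, F_SETLK, &fl);

    close(fd); /* closing the descriptor also drops the lock */
    return rc == -1 ? -1 : 0;
}
```

Calling write_then_lock() in a loop over many small files on the NFS mount would approximate the "lots of small files" workload described in the report.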


* Re: NFS write OOPS with 2.6.29.2
  2009-05-03 16:03 NFS write OOPS with 2.6.29.2 Holger Kiehl
@ 2009-05-05  6:14 ` Andrew Morton
  2009-05-09 19:16   ` Holger Kiehl
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2009-05-05  6:14 UTC (permalink / raw)
  To: Holger Kiehl; +Cc: linux-kernel, linux-nfs


(cc linux-nfs)

On Sun, 3 May 2009 16:03:38 +0000 (GMT) Holger Kiehl <Holger.Kiehl@dwd.de> wrote:

> Hello
> 
> With plain kernel 2.6.29.2 I get the following OOPS (several of them) when
> writing lots of small files on the client system:
> 
>     May  3 18:48:34 obelix kernel: ------------[ cut here ]------------
>     May  3 18:48:34 obelix kernel: kernel BUG at fs/nfs/write.c:252!

I think this is a well-known bug, and fixes should be available in 2.6.29.3?

>     May  3 18:48:34 obelix kernel: invalid opcode: 0000 [#1] SMP
>     May  3 18:48:34 obelix kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:1c.2/0000:11:00.0/0000:12:00.0/irq
>     May  3 18:48:34 obelix kernel: CPU 1
>     May  3 18:48:34 obelix kernel: Modules linked in: nfs lockd nfs_acl auth_rpcgss coretemp smsc47m1 ipmi_si ipmi_msghandler sunrpc binfmt_misc usbhid i5000_edac i2c_i801 uhci_hcd sg i2c_core i5k_amb ehci_hcd usbcore [last unloaded: scsi_wait_scan]
>     May  3 18:48:34 obelix kernel: Pid: 8328, comm: sf_loc Not tainted 2.6.29.2 #2 PRIMERGY RX300 S4
>     May  3 18:48:34 obelix kernel: RIP: 0010:[<ffffffffa010effc>]  [<ffffffffa010effc>] nfs_do_writepage+0xfa/0x196 [nfs]
>     May  3 18:48:34 obelix kernel: RSP: 0018:ffff8807e619bb28  EFLAGS: 00010286
>     May  3 18:48:34 obelix kernel: RAX: 0000000000000001 RBX: ffffe2001ba1dcc0 RCX: 0000000000000015
>     May  3 18:48:34 obelix kernel: RDX: 0000000000000000 RSI: 0000000000600020 RDI: ffff8807ec10e950
>     May  3 18:48:34 obelix kernel: RBP: ffff8807e619bb58 R08: ffff88082cf52540 R09: ffff88083cc74440
>     May  3 18:48:34 obelix kernel: R10: 0000000000000000 R11: 00000000fffffffa R12: ffff88082cf52540
>     May  3 18:48:34 obelix kernel: R13: ffff8807ec10ea9c R14: ffffe2001ba1dcc0 R15: ffff8807ec10e9e8
>     May  3 18:48:34 obelix kernel: FS:  00007f03010276f0(0000) GS:ffff88083cca3e40(0000) knlGS:0000000000000000
>     May  3 18:48:34 obelix kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>     May  3 18:48:34 obelix kernel: CR2: 00007fb127e21508 CR3: 000000082cf94000 CR4: 00000000000406e0
>     May  3 18:48:34 obelix kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>     May  3 18:48:34 obelix kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>     May  3 18:48:34 obelix kernel: Process sf_loc (pid: 8328, threadinfo ffff8807e619a000, task ffff88082cf84840)
>     May  3 18:48:34 obelix kernel: Stack:
>     May  3 18:48:34 obelix kernel: ffff8807e619bcb8 ffffe2001ba1dcc0 ffffe2001ba1dcc0 0000000000000000
>     May  3 18:48:34 obelix kernel: 0000000000000001 0000000000000000 ffff8807e619bb78 ffffffffa010f50a
>     May  3 18:48:34 obelix kernel: ffffe2001ba1dcc0 ffff8807e619bd68 ffff8807e619bca8 ffffffff8026f964
>     May  3 18:48:34 obelix kernel: Call Trace:
>     May  3 18:48:34 obelix kernel: [<ffffffffa010f50a>] nfs_writepages_callback+0xf/0x20 [nfs]
>     May  3 18:48:34 obelix kernel: [<ffffffff8026f964>] write_cache_pages+0x246/0x389
>     May  3 18:48:34 obelix kernel: [<ffffffffa010f4fb>] ? nfs_writepages_callback+0x0/0x20 [nfs]
>     May  3 18:48:34 obelix kernel: [<ffffffff80268c20>] ? find_get_pages_tag+0x3e/0xd9
>     May  3 18:48:34 obelix kernel: [<ffffffffa010f4d1>] nfs_writepages+0xb0/0xda [nfs]
>     May  3 18:48:34 obelix kernel: [<ffffffffa01107f1>] ? nfs_flush_one+0x0/0xd9 [nfs]
>     May  3 18:48:34 obelix kernel: [<ffffffffa01107f1>] ? nfs_flush_one+0x0/0xd9 [nfs]
>     May  3 18:48:34 obelix kernel: [<ffffffffa01105e5>] __nfs_write_mapping+0x19/0x50 [nfs]
>     May  3 18:48:34 obelix kernel: [<ffffffffa0110676>] nfs_write_mapping+0x5a/0x7e [nfs]
>     May  3 18:48:34 obelix kernel: [<ffffffffa01106c3>] nfs_wb_all+0x12/0x14 [nfs]
>     May  3 18:48:34 obelix kernel: [<ffffffffa0106096>] nfs_sync_mapping+0x34/0x38 [nfs]
>     May  3 18:48:34 obelix kernel: [<ffffffffa0103a7e>] do_setlk+0x89/0xb0 [nfs]
>     May  3 18:48:34 obelix kernel: [<ffffffffa0103cc7>] nfs_lock+0x18a/0x19b [nfs]
>     May  3 18:48:34 obelix kernel: [<ffffffff802c1978>] vfs_lock_file+0x1e/0x35
>     May  3 18:48:34 obelix kernel: [<ffffffff802c1b7a>] fcntl_setlk+0x13e/0x278
>     May  3 18:48:34 obelix kernel: [<ffffffff802a0649>] sys_fcntl+0x2bc/0x33a
>     May  3 18:48:34 obelix kernel: [<ffffffff8020b51b>] system_call_fastpath+0x16/0x1b
>     May  3 18:48:34 obelix kernel: Code: 00 4c 89 e7 e8 ba cb ff ff 4c 89 e7 89 c3 e8 05 cc ff ff 85 db 74 a7 e9 82 00 00 00 41 f6 44 24 40 02 74 0b 41 fe 87 b4 00 00 00 <0f> 0b eb fe 4c 89 f7 e8 2d 03 16 e0 85 c0 75 49 49 8b 46 18 ba
>     May  3 18:48:34 obelix kernel: RIP  [<ffffffffa010effc>] nfs_do_writepage+0xfa/0x196 [nfs]
> ...
>


* Re: NFS write OOPS with 2.6.29.2
  2009-05-05  6:14 ` Andrew Morton
@ 2009-05-09 19:16   ` Holger Kiehl
  2009-05-10  4:17     ` Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Holger Kiehl @ 2009-05-09 19:16 UTC (permalink / raw)
  To: linux-kernel, linux-nfs; +Cc: Andrew Morton

On Mon, 4 May 2009, Andrew Morton wrote:

>
> (cc linux-nfs)
>
> On Sun, 3 May 2009 16:03:38 +0000 (GMT) Holger Kiehl <Holger.Kiehl@dwd.de> wrote:
>
>> Hello
>>
>> With plain kernel 2.6.29.2 I get the following OOPS (several of them) when
>> writing lots of small files on the client system:
>>
>>     May  3 18:48:34 obelix kernel: ------------[ cut here ]------------
>>     May  3 18:48:34 obelix kernel: kernel BUG at fs/nfs/write.c:252!
>
> I think this is a well-known bug, and fixes should be available in 2.6.29.3?
>
Thanks for this information. I just tried 2.6.29.3 and it still oopses.
Are there any patches I can try?

Thanks,
Holger


* Re: NFS write OOPS with 2.6.29.2
  2009-05-09 19:16   ` Holger Kiehl
@ 2009-05-10  4:17     ` Trond Myklebust
  2009-05-11  9:24       ` Holger Kiehl
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2009-05-10  4:17 UTC (permalink / raw)
  To: Holger Kiehl; +Cc: linux-kernel, linux-nfs, Andrew Morton


On Sat, 2009-05-09 at 19:16 +0000, Holger Kiehl wrote:
> On Mon, 4 May 2009, Andrew Morton wrote:
> 
> >
> > (cc linux-nfs)
> >
> > On Sun, 3 May 2009 16:03:38 +0000 (GMT) Holger Kiehl <Holger.Kiehl@dwd.de> wrote:
> >
> >> Hello
> >>
> >> With plain kernel 2.6.29.2 I get the following OOPS (several of them) when
> >> writing lots of small files on the client system:
> >>
> >>     May  3 18:48:34 obelix kernel: ------------[ cut here ]------------
> >>     May  3 18:48:34 obelix kernel: kernel BUG at fs/nfs/write.c:252!
> >
> > I think this is a well-known bug, and fixes should be available in 2.6.29.3?
> >
> Thanks for this information. I just tried 2.6.29.3 and it still oopses.
> Are there any patches I can try?

The attached backports against 2.6.29 are untested, but they are known
to compile at least. Could you give them a try?

Cheers
  Trond


[-- Attachment #2: patches-fix_nfs_mmap.tar.bz2 --]
[-- Type: application/x-bzip-compressed-tar, Size: 8301 bytes --]


* Re: NFS write OOPS with 2.6.29.2
  2009-05-10  4:17     ` Trond Myklebust
@ 2009-05-11  9:24       ` Holger Kiehl
  2009-05-11 12:16         ` Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Holger Kiehl @ 2009-05-11  9:24 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-kernel, linux-nfs, Andrew Morton

On Sun, 10 May 2009, Trond Myklebust wrote:

> On Sat, 2009-05-09 at 19:16 +0000, Holger Kiehl wrote:
>> On Mon, 4 May 2009, Andrew Morton wrote:
>>
>>>
>>> (cc linux-nfs)
>>>
>>> On Sun, 3 May 2009 16:03:38 +0000 (GMT) Holger Kiehl <Holger.Kiehl@dwd.de> wrote:
>>>
>>>> Hello
>>>>
>>>> With plain kernel 2.6.29.2 I get the following OOPS (several of them) when
>>>> writing lots of small files on the client system:
>>>>
>>>>     May  3 18:48:34 obelix kernel: ------------[ cut here ]------------
>>>>     May  3 18:48:34 obelix kernel: kernel BUG at fs/nfs/write.c:252!
>>>
>>> I think this is a well-known bug, and fixes should be available in 2.6.29.3?
>>>
>> Thanks for this information. I just tried 2.6.29.3 and it still oopses.
>> Are there any patches I can try?
>
> The attached backports against 2.6.29 are untested, but they are known
> to compile at least. Could you give them a try?
>
Thanks. They do compile but when there is a mmap on the NFS drive the
program gets a SIGBUS:

    unlink("/home/afdbench/afd2/fifodir/AFD_ACTIVE") = 0
    close(3)                                = 0
    open("/home/afdbench/afd2/fifodir/AFD_ACTIVE", O_RDWR|O_CREAT|O_TRUNC|O_CLOEXEC, 0600) = 3
    lseek(3, 78, SEEK_SET)                  = 78
    write(3, "\377"..., 1)                  = 1
    mmap(NULL, 78, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f8045cdb000
    --- SIGBUS (Bus error) @ 0 (0) ---

This was with 2.6.29.3 plus the patches you sent me.

Holger
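
For anyone who wants to replay the strace sequence above outside the application, here is a minimal sketch in C. It performs the same unlink/open/lseek/write/mmap calls and then stores one byte through the mapping, which is the access that takes the write fault. The function name replay_mmap_store() and the final store are assumptions added for illustration; the strace output does not show which access raised the SIGBUS.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper replaying the traced sequence against `path`:
 * recreate the file, seek to `map_len`, write one 0xff byte, map the
 * first `map_len` bytes MAP_SHARED, then store a byte via the mapping.
 * Returns the byte read back through the mapping, or -1 on error.
 */
int replay_mmap_store(const char *path, size_t map_len)
{
    assert(path != NULL && map_len > 0);

    unlink(path); /* the file may not exist yet; that error is ignored */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC | O_CLOEXEC, 0600);
    if (fd == -1)
        return -1;

    /* Extend the file to map_len + 1 bytes, as in the trace (78 + 1). */
    unsigned char last = 0xff;
    if (lseek(fd, (off_t)map_len, SEEK_SET) == (off_t)-1 ||
        write(fd, &last, 1) != 1) {
        close(fd);
        return -1;
    }

    unsigned char *p = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* First store into the shared mapping: this takes the write fault. */
    p[0] = 0x5a;
    int seen = p[0];

    munmap(p, map_len);
    close(fd);
    return seen;
}
```

On a working kernel, replay_mmap_store(path, 78) should return 0x5a; with the behaviour reported above, the process would instead be killed by SIGBUS at the store.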


* Re: NFS write OOPS with 2.6.29.2
  2009-05-11  9:24       ` Holger Kiehl
@ 2009-05-11 12:16         ` Trond Myklebust
  2009-05-11 12:45           ` Holger Kiehl
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2009-05-11 12:16 UTC (permalink / raw)
  To: Holger Kiehl; +Cc: linux-kernel, linux-nfs, Andrew Morton

On Mon, 2009-05-11 at 09:24 +0000, Holger Kiehl wrote:
> On Sun, 10 May 2009, Trond Myklebust wrote:
> 
> > On Sat, 2009-05-09 at 19:16 +0000, Holger Kiehl wrote:
> >> On Mon, 4 May 2009, Andrew Morton wrote:
> >>
> >>>
> >>> (cc linux-nfs)
> >>>
> >>> On Sun, 3 May 2009 16:03:38 +0000 (GMT) Holger Kiehl <Holger.Kiehl@dwd.de> wrote:
> >>>
> >>>> Hello
> >>>>
> >>>> With plain kernel 2.6.29.2 I get the following OOPS (several of them) when
> >>>> writing lots of small files on the client system:
> >>>>
> >>>>     May  3 18:48:34 obelix kernel: ------------[ cut here ]------------
> >>>>     May  3 18:48:34 obelix kernel: kernel BUG at fs/nfs/write.c:252!
> >>>
> >>> I think this is a well-known bug, and fixes should be available in 2.6.29.3?
> >>>
> >> Thanks for this information. I just tried 2.6.29.3 and it still oopses.
> >> Are there any patches I can try?
> >
> > The attached backports against 2.6.29 are untested, but they are known
> > to compile at least. Could you give them a try?
> >
> Thanks. They do compile but when there is a mmap on the NFS drive the
> program gets a SIGBUS:
> 
>     unlink("/home/afdbench/afd2/fifodir/AFD_ACTIVE") = 0
>     close(3)                                = 0
>     open("/home/afdbench/afd2/fifodir/AFD_ACTIVE", O_RDWR|O_CREAT|O_TRUNC|O_CLOEXEC, 0600) = 3
>     lseek(3, 78, SEEK_SET)                  = 78
>     write(3, "\377"..., 1)                  = 1
>     mmap(NULL, 78, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f8045cdb000
>     --- SIGBUS (Bus error) @ 0 (0) ---
> 
> This was with 2.6.29.3 plus the patches you sent me.
> 
> Holger

Oh, duh... You need this little patchlet too.

Sorry...

Cheers
  Trond
------------------------------------------------------------------
From 2b2ec7554cf7ec5e4412f89a5af6abe8ce950700 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Tue, 7 Apr 2009 14:02:53 -0700
Subject: [PATCH] NFS: Fix the return value in nfs_page_mkwrite()

Commit c2ec175c39f62949438354f603f4aa170846aabb ("mm: page_mkwrite
change prototype to match fault") exposed a bug in the NFS
implementation of page_mkwrite.  We should be returning 0 on success...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 fs/nfs/file.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 3523b89..5a97bcf 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -516,8 +516,6 @@ static int nfs_vm_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 		goto out_unlock;
 
 	ret = nfs_updatepage(filp, page, 0, pagelen);
-	if (ret == 0)
-		ret = pagelen;
 out_unlock:
 	unlock_page(page);
 	if (ret)
-- 
1.6.0.4
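
To make the effect of the two removed lines concrete, here is a small userspace model of the control flow, not kernel code. The diff context stops at "if (ret)"; the sketch assumes that the branch maps any non-zero value to a SIGBUS fault code, which would be consistent with the SIGBUS reported earlier in the thread. SKETCH_VM_FAULT_SIGBUS, sketch_updatepage(), mkwrite_before() and mkwrite_after() are all hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-in for the kernel's VM_FAULT_SIGBUS flag. */
#define SKETCH_VM_FAULT_SIGBUS 0x0002

/* Stand-in for nfs_updatepage(): 0 on success, negative errno on failure. */
static int sketch_updatepage(int fail)
{
    return fail ? -EIO : 0;
}

/* Shape of the function before the patch: success is turned into pagelen. */
int mkwrite_before(unsigned int pagelen, int fail)
{
    assert(pagelen > 0);
    int ret = sketch_updatepage(fail);
    if (ret == 0)
        ret = (int)pagelen;            /* the two lines the patch removes */
    if (ret)                           /* assumed: non-zero means failure */
        ret = SKETCH_VM_FAULT_SIGBUS;
    return ret;
}

/* Shape of the function after the patch: success stays 0. */
int mkwrite_after(unsigned int pagelen, int fail)
{
    assert(pagelen > 0);
    int ret = sketch_updatepage(fail);
    if (ret)                           /* only real errors reach this branch */
        ret = SKETCH_VM_FAULT_SIGBUS;
    return ret;
}
```

Under these assumptions, mkwrite_before(78, 0) reports a fault even though the update succeeded, while mkwrite_after(78, 0) returns 0 and still reports a fault when the update itself fails.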





* Re: NFS write OOPS with 2.6.29.2
  2009-05-11 12:16         ` Trond Myklebust
@ 2009-05-11 12:45           ` Holger Kiehl
  2009-05-11 12:57             ` Trond Myklebust
  0 siblings, 1 reply; 9+ messages in thread
From: Holger Kiehl @ 2009-05-11 12:45 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-kernel, linux-nfs, Andrew Morton

On Mon, 11 May 2009, Trond Myklebust wrote:

> On Mon, 2009-05-11 at 09:24 +0000, Holger Kiehl wrote:
>> On Sun, 10 May 2009, Trond Myklebust wrote:
>>
>>> On Sat, 2009-05-09 at 19:16 +0000, Holger Kiehl wrote:
>>>> On Mon, 4 May 2009, Andrew Morton wrote:
>>>>
>>>>>
>>>>> (cc linux-nfs)
>>>>>
>>>>> On Sun, 3 May 2009 16:03:38 +0000 (GMT) Holger Kiehl <Holger.Kiehl@dwd.de> wrote:
>>>>>
>>>>>> Hello
>>>>>>
>>>>>> With plain kernel 2.6.29.2 I get the following OOPS (several of them) when
>>>>>> writing lots of small files on the client system:
>>>>>>
>>>>>>     May  3 18:48:34 obelix kernel: ------------[ cut here ]------------
>>>>>>     May  3 18:48:34 obelix kernel: kernel BUG at fs/nfs/write.c:252!
>>>>>
>>>>> I think this is a well-known bug, and fixes should be available in 2.6.29.3?
>>>>>
>>>> Thanks for this information. I just tried 2.6.29.3 and it still oopses.
>>>> Are there any patches I can try?
>>>
>>> The attached backports against 2.6.29 are untested, but they are known
>>> to compile at least. Could you give them a try?
>>>
>> Thanks. They do compile but when there is a mmap on the NFS drive the
>> program gets a SIGBUS:
>>
>>     unlink("/home/afdbench/afd2/fifodir/AFD_ACTIVE") = 0
>>     close(3)                                = 0
>>     open("/home/afdbench/afd2/fifodir/AFD_ACTIVE", O_RDWR|O_CREAT|O_TRUNC|O_CLOEXEC, 0600) = 3
>>     lseek(3, 78, SEEK_SET)                  = 78
>>     write(3, "\377"..., 1)                  = 1
>>     mmap(NULL, 78, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f8045cdb000
>>     --- SIGBUS (Bus error) @ 0 (0) ---
>>
>> This was with 2.6.29.3 plus the patches you sent me.
>>
>> Holger
>
> Oh, duh... You need this little patchlet too.
>
Thanks! Now it works. One more problem I have is with splice(). My
application reports the following error:

    splice() error : Invalid argument

When it is called as follows:

     if ((bytes_written = splice(fd_pipe[0], NULL, to_fd,
                                 NULL, bytes_read,
                                 SPLICE_F_MOVE | SPLICE_F_MORE)) == -1)

Or may I not use splice() over NFS?

Thanks,
Holger


* Re: NFS write OOPS with 2.6.29.2
  2009-05-11 12:45           ` Holger Kiehl
@ 2009-05-11 12:57             ` Trond Myklebust
  2009-05-13 10:50               ` Suresh Jayaraman
  0 siblings, 1 reply; 9+ messages in thread
From: Trond Myklebust @ 2009-05-11 12:57 UTC (permalink / raw)
  To: Holger Kiehl; +Cc: linux-kernel, linux-nfs, Andrew Morton

On Mon, 2009-05-11 at 12:45 +0000, Holger Kiehl wrote:
> On Mon, 11 May 2009, Trond Myklebust wrote:
> 
> > On Mon, 2009-05-11 at 09:24 +0000, Holger Kiehl wrote:
> >> On Sun, 10 May 2009, Trond Myklebust wrote:
> >>
> >>> On Sat, 2009-05-09 at 19:16 +0000, Holger Kiehl wrote:
> >>>> On Mon, 4 May 2009, Andrew Morton wrote:
> >>>>
> >>>>>
> >>>>> (cc linux-nfs)
> >>>>>
> >>>>> On Sun, 3 May 2009 16:03:38 +0000 (GMT) Holger Kiehl <Holger.Kiehl@dwd.de> wrote:
> >>>>>
> >>>>>> Hello
> >>>>>>
> >>>>>> With plain kernel 2.6.29.2 I get the following OOPS (several of them) when
> >>>>>> writing lots of small files on the client system:
> >>>>>>
> >>>>>>     May  3 18:48:34 obelix kernel: ------------[ cut here ]------------
> >>>>>>     May  3 18:48:34 obelix kernel: kernel BUG at fs/nfs/write.c:252!
> >>>>>
> >>>>> I think this is a well-known bug, and fixes should be available in 2.6.29.3?
> >>>>>
> >>>> Thanks for this information. I just tried 2.6.29.3 and it still oopses.
> >>>> Are there any patches I can try?
> >>>
> >>> The attached backports against 2.6.29 are untested, but they are known
> >>> to compile at least. Could you give them a try?
> >>>
> >> Thanks. They do compile but when there is a mmap on the NFS drive the
> >> program gets a SIGBUS:
> >>
> >>     unlink("/home/afdbench/afd2/fifodir/AFD_ACTIVE") = 0
> >>     close(3)                                = 0
> >>     open("/home/afdbench/afd2/fifodir/AFD_ACTIVE", O_RDWR|O_CREAT|O_TRUNC|O_CLOEXEC, 0600) = 3
> >>     lseek(3, 78, SEEK_SET)                  = 78
> >>     write(3, "\377"..., 1)                  = 1
> >>     mmap(NULL, 78, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f8045cdb000
> >>     --- SIGBUS (Bus error) @ 0 (0) ---
> >>
> >> This was with 2.6.29.3 plus the patches you sent me.
> >>
> >> Holger
> >
> > Oh, duh... You need this little patchlet too.
> >
> Thanks! Now it works. One more problem I have is with splice(). My
> application reports the following error:
> 
>     splice() error : Invalid argument
> 
> When it is called as follows:
> 
>      if ((bytes_written = splice(fd_pipe[0], NULL, to_fd,
>                                  NULL, bytes_read,
>                                  SPLICE_F_MOVE | SPLICE_F_MORE)) == -1)
> 
> Or may I not use splice() over NFS?

The read part is there, but the write part is still missing (just an
oversight - implementing it is pretty trivial). I'm planning on fixing
that for 2.6.31.

Cheers
  Trond
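
Until the write side is available, an application can keep working by treating EINVAL from splice() as "not supported for this target" and falling back to read()/write(). The following is a minimal sketch of that approach; the helper name pipe_to_fd() and the 4096-byte bounce buffer are assumptions for illustration, not part of Holger's application.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: move up to `len` bytes from the read end of a
 * pipe to `out_fd`. splice() is tried first; if it fails with EINVAL,
 * the data is copied through a user-space buffer instead.
 * Returns the number of bytes moved (possibly fewer than `len`),
 * 0 at end of input, or -1 on error.
 */
ssize_t pipe_to_fd(int pipe_rd, int out_fd, size_t len)
{
    assert(pipe_rd >= 0 && out_fd >= 0);

    ssize_t n = splice(pipe_rd, NULL, out_fd, NULL, len,
                       SPLICE_F_MOVE | SPLICE_F_MORE);
    if (n != -1 || errno != EINVAL)
        return n; /* success, or an error the fallback cannot fix */

    /* Fallback path: one bounded read, then write everything we got. */
    char buf[4096];
    size_t want = len < sizeof(buf) ? len : sizeof(buf);
    ssize_t got = read(pipe_rd, buf, want);
    if (got <= 0)
        return got;

    ssize_t done = 0;
    while (done < got) {
        ssize_t w = write(out_fd, buf + done, (size_t)(got - done));
        if (w == -1) {
            if (errno == EINTR)
                continue; /* interrupted before writing anything; retry */
            return -1;
        }
        done += w;
    }
    return done;
}
```

Like splice() itself, the helper may move fewer bytes than requested, so callers should loop until the expected byte count has been transferred.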



* Re: NFS write OOPS with 2.6.29.2
  2009-05-11 12:57             ` Trond Myklebust
@ 2009-05-13 10:50               ` Suresh Jayaraman
  0 siblings, 0 replies; 9+ messages in thread
From: Suresh Jayaraman @ 2009-05-13 10:50 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Holger Kiehl, linux-kernel, linux-nfs, Andrew Morton

Trond Myklebust wrote:
> On Mon, 2009-05-11 at 12:45 +0000, Holger Kiehl wrote:
>> On Mon, 11 May 2009, Trond Myklebust wrote:
>>>
>> Thanks! Now it works. One more problem I have is with splice(). My
>> application reports the following error:
>>
>>     splice() error : Invalid argument
>>
>> When it is called as follows:
>>
>>      if ((bytes_written = splice(fd_pipe[0], NULL, to_fd,
>>                                  NULL, bytes_read,
>>                                  SPLICE_F_MOVE | SPLICE_F_MORE)) == -1)
>>
>> Or may I not use splice() over NFS?
> 
> The read part is there, but the write part is still missing (just an
> oversight - implementing it is pretty trivial). I'm planning on fixing
> that for 2.6.31.
> 

Did my latest respin look OK?

http://lkml.org/lkml/2009/4/22/70


Thanks,

-- 
Suresh Jayaraman
