* [PATCH] Fix typo on __rpc_purge_upcall
@ 2005-11-21 19:51 Vince Busam
2005-11-21 19:55 ` Trond Myklebust
0 siblings, 1 reply; 14+ messages in thread
From: Vince Busam @ 2005-11-21 19:51 UTC (permalink / raw)
To: nfs
I posted this last week. Here's an official style patch.
Vince
------------------------------------------------------------------
Fix an obvious typo that would cause a NULL pointer dereference.
Signed-off-by: Vince Busam <vbusam@google.com>
---
--- linux-2.6.13.4/net/sunrpc/rpc_pipe.c.orig 2005-11-16 16:48:00.000000000 -0800
+++ linux-2.6.13.4/net/sunrpc/rpc_pipe.c 2005-11-16 16:52:23.000000000 -0800
@@ -51,7 +51,7 @@ __rpc_purge_upcall(struct inode *inode,
 		rpci->ops->destroy_msg(msg);
 	}
 	while (!list_empty(&rpci->in_upcall)) {
-		msg = list_entry(rpci->pipe.next, struct rpc_pipe_msg, list);
+		msg = list_entry(rpci->in_upcall.next, struct rpc_pipe_msg, list);
 		list_del_init(&msg->list);
 		msg->errno = err;
 		rpci->ops->destroy_msg(msg);
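The point of the one-line fix is that the entry taken inside the loop has to come from the same list head that the loop condition tests. A minimal userspace sketch of that rule follows. It is not the kernel code: struct pipe_msg, struct pipe_inode, purge_list(), purge_upcall() and demo_purge() are hypothetical stand-ins, and the list helpers are simplified re-implementations written for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

/* Hypothetical, cut-down versions of rpc_pipe_msg and rpc_inode. */
struct pipe_msg { struct list_head list; int err; };
struct pipe_inode { struct list_head pipe, in_upcall; int destroyed; };

/* Drain one list: the entry comes from the same head the loop tests. */
static void purge_list(struct pipe_inode *pi, struct list_head *head, int err)
{
	while (!list_empty(head)) {
		struct pipe_msg *msg = list_entry(head->next, struct pipe_msg, list);

		list_del_init(&msg->list);
		msg->err = err;
		pi->destroyed++;	/* stands in for ops->destroy_msg(msg) */
	}
}

static void purge_upcall(struct pipe_inode *pi, int err)
{
	purge_list(pi, &pi->pipe, err);
	purge_list(pi, &pi->in_upcall, err);
}

/* Queue two messages on pipe and one on in_upcall, purge, report the count. */
static int demo_purge(void)
{
	struct pipe_inode pi = { .destroyed = 0 };
	struct pipe_msg m[3];

	init_list_head(&pi.pipe);
	init_list_head(&pi.in_upcall);
	list_add_tail(&m[0].list, &pi.pipe);
	list_add_tail(&m[1].list, &pi.pipe);
	list_add_tail(&m[2].list, &pi.in_upcall);
	purge_upcall(&pi, -EPIPE);
	assert(list_empty(&pi.pipe) && list_empty(&pi.in_upcall));
	return pi.destroyed;
}
```

Passing the head in as a parameter removes the chance of testing one list and unlinking from another, which is the shape the later __rpc_purge_list() helper in this thread also takes.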
-------------------------------------------------------
This SF.Net email is sponsored by the JBoss Inc. Get Certified Today
Register for a JBoss Training Course. Free Certification Exam
for All Training Attendees Through End of 2005. For more info visit:
http://ads.osdn.com/?ad_id=7628&alloc_id=16845&op=click
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-11-21 19:51 [PATCH] Fix typo on __rpc_purge_upcall Vince Busam
@ 2005-11-21 19:55 ` Trond Myklebust
2005-11-21 21:51 ` Vince Busam
0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2005-11-21 19:55 UTC (permalink / raw)
To: Vince Busam; +Cc: nfs
On Mon, 2005-11-21 at 11:51 -0800, Vince Busam wrote:
> I posted this last week. Here's an official style patch.
I've already put a fix into the latest NFS_ALL. See
http://client.linux-nfs.org/Linux-2.6.x/2.6.15-rc2/linux-2.6.15-06-rpc_pipe_fix_cleanup.dif
Thanks!
Trond
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-11-21 19:55 ` Trond Myklebust
@ 2005-11-21 21:51 ` Vince Busam
2005-11-21 22:34 ` Trond Myklebust
0 siblings, 1 reply; 14+ messages in thread
From: Vince Busam @ 2005-11-21 21:51 UTC (permalink / raw)
To: Trond Myklebust; +Cc: nfs
Trond Myklebust wrote:
> On Mon, 2005-11-21 at 11:51 -0800, Vince Busam wrote:
>
> http://client.linux-nfs.org/Linux-2.6.x/2.6.15-rc2/linux-2.6.15-06-rpc_pipe_fix_cleanup.dif
>
That looks good to me. After testing this fix for a week, I haven't gotten an Oops, but
the system still locks up. The only relevant log message is about an upcall timing out.
Nov 20 00:19:00 dig kernel: RPC: AUTH_GSS upcall timed out.
Nov 20 00:19:00 dig kernel: Please check user daemon is running!
Vince
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-11-21 21:51 ` Vince Busam
@ 2005-11-21 22:34 ` Trond Myklebust
2005-11-21 22:59 ` Vince Busam
0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2005-11-21 22:34 UTC (permalink / raw)
To: Vince Busam; +Cc: nfs
On Mon, 2005-11-21 at 13:51 -0800, Vince Busam wrote:
> Trond Myklebust wrote:
> > On Mon, 2005-11-21 at 11:51 -0800, Vince Busam wrote:
> >
> > http://client.linux-nfs.org/Linux-2.6.x/2.6.15-rc2/linux-2.6.15-06-rpc_pipe_fix_cleanup.dif
> >
>
> That looks good to me. After testing this fix for a week, I haven't gotten an Oops, but
> the system still locks up. The only relevant log message is about an upcall timing out.
>
> Nov 20 00:19:00 dig kernel: RPC: AUTH_GSS upcall timed out.
> Nov 20 00:19:00 dig kernel: Please check user daemon is running!
What kernel is this? There was a patch from Steve that caused this type
of behaviour in some 2.6.14 CITI_ALL patches. That patch has since been
removed.
Cheers,
Trond
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-11-21 22:34 ` Trond Myklebust
@ 2005-11-21 22:59 ` Vince Busam
2005-11-21 23:07 ` Trond Myklebust
0 siblings, 1 reply; 14+ messages in thread
From: Vince Busam @ 2005-11-21 22:59 UTC (permalink / raw)
To: Trond Myklebust; +Cc: nfs
Trond Myklebust wrote:
> On Mon, 2005-11-21 at 13:51 -0800, Vince Busam wrote:
>
>>Trond Myklebust wrote:
>>
>>>On Mon, 2005-11-21 at 11:51 -0800, Vince Busam wrote:
>>>
>>>http://client.linux-nfs.org/Linux-2.6.x/2.6.15-rc2/linux-2.6.15-06-rpc_pipe_fix_cleanup.dif
>>>
>>
>>That looks good to me. After testing this fix for a week, I haven't gotten an Oops, but
>>the system still locks up. The only relevant log message is about an upcall timing out.
>>
>>Nov 20 00:19:00 dig kernel: RPC: AUTH_GSS upcall timed out.
>>Nov 20 00:19:00 dig kernel: Please check user daemon is running!
>
>
> What kernel is this? There was a patch from Steve that caused this type
> of behaviour in some 2.6.14 CITI_ALL patches. That patch has since been
> removed.
This is 2.6.13.4, with the __rpc_purge_upcall patch, linux-2.6.13-CITI_NFS4_ALL-1.dif, and
an ugly patch that I don't remember why I'm using.
--- linux-2.6.8/net/sunrpc/auth_gss/auth_gss.c 2004-08-13 22:36:57.000000000 -0700
+++ linux-2.6.8-new/net/sunrpc/auth_gss/auth_gss.c 2004-08-24 14:44:40.887239458 -0700
@@ -515,6 +515,8 @@
 
 	clnt = rpci->private;
 	auth = clnt->cl_auth;
+	if (auth == NULL)
+		return;
 	gss_auth = container_of(auth, struct gss_auth, rpc_auth);
 	spin_lock(&gss_auth->lock);
 	while (!list_empty(&gss_auth->upcalls)) {
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-11-21 22:59 ` Vince Busam
@ 2005-11-21 23:07 ` Trond Myklebust
2005-11-28 18:16 ` Vince Busam
0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2005-11-21 23:07 UTC (permalink / raw)
To: Vince Busam; +Cc: nfs
On Mon, 2005-11-21 at 14:59 -0800, Vince Busam wrote:
> Trond Myklebust wrote:
> > On Mon, 2005-11-21 at 13:51 -0800, Vince Busam wrote:
> >
> >>Trond Myklebust wrote:
> >>
> >>>On Mon, 2005-11-21 at 11:51 -0800, Vince Busam wrote:
> >>>
> >>>http://client.linux-nfs.org/Linux-2.6.x/2.6.15-rc2/linux-2.6.15-06-rpc_pipe_fix_cleanup.dif
> >>>
> >>
> >>That looks good to me. After testing this fix for a week, I haven't gotten an Oops, but
> >>the system still locks up. The only relevant log message is about an upcall timing out.
> >>
> >>Nov 20 00:19:00 dig kernel: RPC: AUTH_GSS upcall timed out.
> >>Nov 20 00:19:00 dig kernel: Please check user daemon is running!
> >
> >
> > What kernel is this? There was a patch from Steve that caused this type
> > of behaviour in some 2.6.14 CITI_ALL patches. That patch has since been
> > removed.
>
> This is 2.6.13.4, with the __rpc_purge_upcall patch, linux-2.6.13-CITI_NFS4_ALL-1.dif, and
> an ugly patch that I don't remember why I'm using.
>
> --- linux-2.6.8/net/sunrpc/auth_gss/auth_gss.c 2004-08-13 22:36:57.000000000 -0700
> +++ linux-2.6.8-new/net/sunrpc/auth_gss/auth_gss.c 2004-08-24 14:44:40.887239458 -0700
> @@ -515,6 +515,8 @@
>
> 	clnt = rpci->private;
> 	auth = clnt->cl_auth;
> +	if (auth == NULL)
> +		return;
> 	gss_auth = container_of(auth, struct gss_auth, rpc_auth);
> 	spin_lock(&gss_auth->lock);
> 	while (!list_empty(&gss_auth->upcalls)) {
Could you revert that patch, and just add the one from
http://client.linux-nfs.org/Linux-2.6.x/2.6.14/linux-2.6.14-88-rpcsec_gss_fix.dif
That should bring you up to the rpc_pipefs from 2.6.14.
Cheers
Trond
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-11-21 23:07 ` Trond Myklebust
@ 2005-11-28 18:16 ` Vince Busam
2005-11-28 18:52 ` Trond Myklebust
0 siblings, 1 reply; 14+ messages in thread
From: Vince Busam @ 2005-11-28 18:16 UTC (permalink / raw)
To: Trond Myklebust; +Cc: nfs
Trond Myklebust wrote:
>
> Could you revert that patch, and just add the one from
>
> http://client.linux-nfs.org/Linux-2.6.x/2.6.14/linux-2.6.14-88-rpcsec_gss_fix.dif
>
I got an Oops I haven't seen before. (2.6.13.4 + linux-2.6.13-CITI_NFS4_ALL-1.dif +
linux-2.6.14-88-rpcsec_gss_fix.dif + linux-2.6.15-06-rpc_pipe_fix_cleanup.dif)
Nov 26 00:05:36 dig kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Nov 26 00:05:36 dig kernel: printing eip:
Nov 26 00:05:36 dig kernel: f8ad94ad
Nov 26 00:05:36 dig kernel: *pde = 00000000
Nov 26 00:05:36 dig kernel: Oops: 0002 [#1]
Nov 26 00:05:36 dig kernel: PREEMPT SMP
Nov 26 00:05:36 dig kernel: Modules linked in: des binfmt_misc cpufreq_userspace cpufreq_ondemand cpufreq_powersave autofs4 video button battery container ac nfs lockd af_packet tg3 snd_intel8x0 snd_ac97_codec ata_piix libata snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep snd soundcore pwc videodev v4l2_common uhci_hcd pci_hotplug intel_agp floppy pcspkr rtc sd_mod tsdev usbhid usb_storage scsi_mod evdev md_mod dm_mod nvidia agpgart psmouse mousedev parport_pc lp parport ide_cd cdrom rpcsec_gss_krb5 auth_rpcgss sunrpc ehci_hcd usbcore ext3 jbd mbcache ide_disk ide_generic via82cxxx trm290 triflex slc90e66 sis5513 siimage serverworks sc1200 rz1000 piix pdc202xx_old opti621 ns87415 hpt366 hpt34x generic cy82c693 cs5530 cs5520 cmd64x atiixp amd74xx alim15x3 aec62xx pdc202xx_new ide_core unix thermal processor fan
Nov 26 00:05:36 dig kernel: CPU: 0
Nov 26 00:05:36 dig kernel: EIP: 0060:[<f8ad94ad>] Tainted: P VLI
Nov 26 00:05:36 dig kernel: EFLAGS: 00010287 (2.6.13.4-gg5vb5)
Nov 26 00:05:36 dig kernel: EIP is at rpc_pipe_read+0xad/0x130 [sunrpc]
Nov 26 00:05:36 dig kernel: eax: 00000000 ebx: f5470b08 ecx: f5e1a88c edx: 00000000
Nov 26 00:05:36 dig kernel: esi: f5e1a700 edi: f55e3c80 ebp: 00000000 esp: f5b97f4c
Nov 26 00:05:36 dig kernel: ds: 007b es: 007b ss: 0068
Nov 26 00:05:36 dig kernel: Process rpc.gssd (pid: 7243, threadinfo=f5b96000 task=c22ba540)
Nov 26 00:05:36 dig kernel: Stack: e9a3f00c c0305200 e9a3f008 e9a3f008 00000004 f55e3c80 bff5dab4 00000000
Nov 26 00:05:36 dig kernel: c0165a03 f55e3c80 bff5dab4 00000004 f5b97fa4 f55e3c80 fffffff7 00000004
Nov 26 00:05:36 dig kernel: f5b96000 c0165df1 f55e3c80 bff5dab4 00000004 f5b97fa4 00000000 00000000
Nov 26 00:05:36 dig kernel: Call Trace:
Nov 26 00:05:36 dig kernel: [<c0165a03>] vfs_read+0xf3/0x1b0
Nov 26 00:05:36 dig kernel: [<c0165df1>] sys_read+0x51/0x80
Nov 26 00:05:36 dig kernel: [<c010316b>] sysenter_past_esp+0x54/0x75
Nov 26 00:05:36 dig kernel: Code: 24 14 8b 7c 24 18 8b 6c 24 1c 83 c4 20 c3 8b 96 84 01 00 00 8d 86 84 01 00 00 39 c2 74 d0 89 d3 8b 52 04 8b 03 8d 8e 8c 01 00 00 <89> 02 89 50 04 8b 86 8c 01 00 00 89 58 04 89 03 89 4b 04 8b 86
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-11-28 18:16 ` Vince Busam
@ 2005-11-28 18:52 ` Trond Myklebust
2005-12-05 21:03 ` Vince Busam
0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2005-11-28 18:52 UTC (permalink / raw)
To: Vince Busam; +Cc: nfs
On Mon, 2005-11-28 at 10:16 -0800, Vince Busam wrote:
> Trond Myklebust wrote:
> >
> > Could you revert that patch, and just add the one from
> >
> > http://client.linux-nfs.org/Linux-2.6.x/2.6.14/linux-2.6.14-88-rpcsec_gss_fix.dif
> >
>
> I got an Oops I haven't seen before. (2.6.13.4 + linux-2.6.13-CITI_NFS4_ALL-1.dif +
> linux-2.6.14-88-rpcsec_gss_fix.dif + linux-2.6.15-06-rpc_pipe_fix_cleanup.dif)
>
> Nov 26 00:05:36 dig kernel: Unable to handle kernel NULL pointer dereference at
> virtual address 00000000
> Nov 26 00:05:36 dig kernel: printing eip:
> Nov 26 00:05:36 dig kernel: f8ad94ad
> Nov 26 00:05:36 dig kernel: *pde = 00000000
> Nov 26 00:05:36 dig kernel: Oops: 0002 [#1]
> Nov 26 00:05:36 dig kernel: PREEMPT SMP
> Nov 26 00:05:36 dig kernel: Modules linked in: des binfmt_misc cpufreq_userspace
> cpufreq_ondemand cpufreq_powersave autofs4 video button battery container ac nfs lockd
> af_packet tg3 snd_intel8x0 snd_ac97_codec ata_piix libata snd_usb_audio
> snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_usb_lib snd_rawmidi
> snd_seq_device snd_hwdep snd soundcore pwc videodev v4l2_common uhci_hcd pci_hotplug
> intel_agp floppy pcspkr rtc sd_mod tsdev usbhid usb_storage scsi_mod evdev md_mod dm_mod
> nvidia agpgart psmouse mousedev parport_pc lp parport ide_cd cdrom rpcsec_gss_krb5
> auth_rpcgss sunrpc ehci_hcd usbcore ext3 jbd mbcache ide_disk ide_generic via82cxxx trm290
> triflex slc90e66 sis5513 siimage serverworks sc1200 rz1000 piix pdc202xx_old opti621
> ns87415 hpt366 hpt34x generic cy82c693 cs5530 cs5520 cmd64x atiixp amd74xx alim15x3
> aec62xx pdc202xx_new ide_core unix thermal processor fan
> Nov 26 00:05:36 dig kernel: CPU: 0
> Nov 26 00:05:36 dig kernel: EIP: 0060:[<f8ad94ad>] Tainted: P VLI
> Nov 26 00:05:36 dig kernel: EFLAGS: 00010287 (2.6.13.4-gg5vb5)
> Nov 26 00:05:36 dig kernel: EIP is at rpc_pipe_read+0xad/0x130 [sunrpc]
> Nov 26 00:05:36 dig kernel: eax: 00000000 ebx: f5470b08 ecx: f5e1a88c edx: 00000000
> Nov 26 00:05:36 dig kernel: esi: f5e1a700 edi: f55e3c80 ebp: 00000000 esp: f5b97f4c
> Nov 26 00:05:36 dig kernel: ds: 007b es: 007b ss: 0068
> Nov 26 00:05:36 dig kernel: Process rpc.gssd (pid: 7243, threadinfo=f5b96000 task=c22ba540)
> Nov 26 00:05:36 dig kernel: Stack: e9a3f00c c0305200 e9a3f008 e9a3f008 00000004
> f55e3c80 bff5dab4 00000000
> Nov 26 00:05:36 dig kernel: c0165a03 f55e3c80 bff5dab4 00000004 f5b97fa4 f55e3c80 fffffff7
> 00000004
> Nov 26 00:05:36 dig kernel: f5b96000 c0165df1 f55e3c80 bff5dab4 00000004 f5b97fa4 00000000
> 00000000
> Nov 26 00:05:36 dig kernel: Call Trace:
> Nov 26 00:05:36 dig kernel: [<c0165a03>] vfs_read+0xf3/0x1b0
> Nov 26 00:05:36 dig kernel: [<c0165df1>] sys_read+0x51/0x80
> Nov 26 00:05:36 dig kernel: [<c010316b>] sysenter_past_esp+0x54/0x75
> Nov 26 00:05:36 dig kernel: Code: 24 14 8b 7c 24 18 8b 6c 24 1c 83 c4 20 c3 8b 96 84 01 00
> 00 8d 86 84 01 00 00 39 c2 74 d0 89 d3 8b 52 04 8b 03 8d 8e 8c 01 00
> 00 <89> 02 89 50 04 8b 86 8c 01 00 00 89 58 04 89 03 89 4b 04 8b 86
Argh... Yep. Looks like the "fix" to ensure that we purge
rpci->in_upcall was wrong. Does the following patch fix it?
Cheers,
Trond
-------
SUNRPC: Remove redundant list rpci->in_upcall.
The elements on rpci->in_upcall are tracked by the filp->private_data,
which will ensure that they get released when the file is closed.
Note that early purging of the elements on that list was responsible for
a potential Oops...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---
 include/linux/sunrpc/rpc_pipe_fs.h | 1 -
 net/sunrpc/rpc_pipe.c | 5 +----
 2 files changed, 1 insertions(+), 5 deletions(-)
diff --git a/include/linux/sunrpc/rpc_pipe_fs.h b/include/linux/sunrpc/rpc_pipe_fs.h
index 6392934..ee353f2 100644
--- a/include/linux/sunrpc/rpc_pipe_fs.h
+++ b/include/linux/sunrpc/rpc_pipe_fs.h
@@ -22,7 +22,6 @@ struct rpc_inode {
 	struct inode vfs_inode;
 	void *private;
 	struct list_head pipe;
-	struct list_head in_upcall;
 	int pipelen;
 	int nreaders;
 	int nwriters;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index e3b242d..eb240b6 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -38,7 +38,7 @@ static kmem_cache_t *rpc_inode_cachep __
 
 #define RPC_UPCALL_TIMEOUT (30*HZ)
 
-static void
+static inline void
 __rpc_purge_list(struct rpc_inode *rpci, struct list_head *head, int err)
 {
 	struct rpc_pipe_msg *msg;
@@ -59,7 +59,6 @@ __rpc_purge_upcall(struct inode *inode,
 	struct rpc_inode *rpci = RPC_I(inode);
 
 	__rpc_purge_list(rpci, &rpci->pipe, err);
-	__rpc_purge_list(rpci, &rpci->in_upcall, err);
 	rpci->pipelen = 0;
 	wake_up(&rpci->waitq);
 }
@@ -210,7 +209,6 @@ rpc_pipe_read(struct file *filp, char __
 			msg = list_entry(rpci->pipe.next,
 					struct rpc_pipe_msg,
 					list);
-			list_move(&msg->list, &rpci->in_upcall);
 			rpci->pipelen -= msg->len;
 			filp->private_data = msg;
 			msg->copied = 0;
@@ -814,7 +812,6 @@ init_once(void * foo, kmem_cache_t * cac
 		rpci->private = NULL;
 		rpci->nreaders = 0;
 		rpci->nwriters = 0;
-		INIT_LIST_HEAD(&rpci->in_upcall);
 		INIT_LIST_HEAD(&rpci->pipe);
 		rpci->pipelen = 0;
 		init_waitqueue_head(&rpci->waitq);
^ permalink raw reply related [flat|nested] 14+ messages in thread
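The commit message above argues that a message already handed to a reader is tracked through filp->private_data, so purging it from a second place tears it down early. The following is a deliberately simplified model of that single-owner rule, not the kernel code: struct upcall_msg, struct reader, reader_take(), reader_release() and demo_single_owner() are hypothetical names, and a counter stands in for actually freeing memory so the sketch stays safe to run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a message should be destroyed exactly once. */
struct upcall_msg {
	bool destroyed;
	int destroy_calls;
};

struct reader {				/* stands in for struct file */
	struct upcall_msg *private_data;
};

static void destroy_msg(struct upcall_msg *msg)
{
	msg->destroyed = true;
	msg->destroy_calls++;
}

/* The reader takes ownership of the message it is about to copy out. */
static void reader_take(struct reader *r, struct upcall_msg *msg)
{
	assert(r->private_data == NULL);
	r->private_data = msg;
}

/* Release path: the one place an in-flight message is torn down. */
static void reader_release(struct reader *r)
{
	struct upcall_msg *msg = r->private_data;

	if (msg != NULL) {
		destroy_msg(msg);
		r->private_data = NULL;
	}
}

/* Returns how many times the in-flight message ended up destroyed. */
static int demo_single_owner(bool purge_early)
{
	struct upcall_msg msg = { false, 0 };
	struct reader r = { NULL };

	reader_take(&r, &msg);
	if (purge_early)
		destroy_msg(&msg);	/* a second owner purging the same message */
	reader_release(&r);
	return msg.destroy_calls;
}
```

With purge_early set, the counter reaches 2. In real code the second teardown would operate on memory that has already been released, which is the kind of failure the commit message describes as a potential Oops.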
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-11-28 18:52 ` Trond Myklebust
@ 2005-12-05 21:03 ` Vince Busam
2005-12-12 18:57 ` Vince Busam
0 siblings, 1 reply; 14+ messages in thread
From: Vince Busam @ 2005-12-05 21:03 UTC (permalink / raw)
To: Trond Myklebust; +Cc: nfs
Trond Myklebust wrote:
>
> Argh... Yep. Looks like the "fix" to ensure that we purge
> rpci->in_upcall was wrong. Does the following patch fix it?
I got another oops in __rpc_purge_upcall, which looks like this after applying the patches.
Looks like rpci must have been NULL, but I'll defer to the experts here.
static void
__rpc_purge_upcall(struct inode *inode, int err)
{
	struct rpc_inode *rpci = RPC_I(inode);

	__rpc_purge_list(rpci, &rpci->pipe, err);
	rpci->pipelen = 0;
	wake_up(&rpci->waitq);
}
Dec 4 13:09:59 block kernel: RPC: AUTH_GSS upcall timed out.
Dec 4 13:09:59 block kernel: Please check user daemon is running!
Dec 4 13:10:12 block kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000004
Dec 4 13:10:12 block kernel: printing eip:
Dec 4 13:10:12 block kernel: f8a98d55
Dec 4 13:10:12 block kernel: *pde = 00000000
Dec 4 13:10:12 block kernel: Oops: 0002 [#1]
Dec 4 13:10:12 block kernel: PREEMPT SMP
Dec 4 13:10:12 block kernel: Modules linked in: des tsdev usbhid vmnet vmmon binfmt_misc cpufreq_userspace cpufreq_ondemand cpufreq_powersave autofs4 video button battery container ac capability commoncap nfs lockd af_packet tg3 generic piix snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc uhci_hcd pci_hotplug floppy pcspkr rtc md_mod evdev dm_mod nvidia agpgart psmouse mousedev parport_pc lp parport ide_generic ide_disk ide_cd cdrom ide_core rpcsec_gss_krb5 auth_rpcgss sunrpc ehci_hcd usbcore ext3 jbd mbcache ahci sd_mod ata_piix libata scsi_mod unix thermal processor fan
Dec 4 13:10:12 block kernel: CPU: 1
Dec 4 13:10:12 block kernel: EIP: 0060:[<f8a98d55>] Tainted: P VLI
Dec 4 13:10:12 block kernel: EFLAGS: 00010202 (2.6.13.4-gg5vb7)
Dec 4 13:10:12 block kernel: EIP is at __rpc_purge_upcall+0x35/0x80 [sunrpc]
Dec 4 13:10:12 block kernel: eax: 00000000 ebx: c2bcec84 ecx: d16e1688 edx: 00000000
Dec 4 13:10:12 block kernel: esi: c2bceb00 edi: f88b5ce0 ebp: ffffffe0 esp: eea1bf30
Dec 4 13:10:12 block kernel: ds: 007b es: 007b ss: 0068
Dec 4 13:10:12 block kernel: Process rpc.gssd (pid: 5833, threadinfo=eea1a000 task=ef353020)
Dec 4 13:10:12 block kernel: Stack: d16e1680 c2bceb00 cf453380 c2bceb00 c2bceb00 f8a990cb c2bceb00 ffffffe0
Dec 4 13:10:12 block kernel: 00000008 cf453380 eea94800 c01675fa c2bceb00 cf453380 00000000 00000000
Dec 4 13:10:12 block kernel: d16a28c0 cf453380 ef02b300 00000000 cf453380 c0165906 cf453380 ef02b300
Dec 4 13:10:12 block kernel: Call Trace:
Dec 4 13:10:12 block kernel: [<f8a990cb>] rpc_pipe_release+0xcb/0xf0 [sunrpc]
Dec 4 13:10:12 block kernel: [<c01675fa>] __fput+0x18a/0x1d0
Dec 4 13:10:12 block kernel: [<c0165906>] filp_close+0x46/0x90
Dec 4 13:10:12 block kernel: [<c01659ba>] sys_close+0x6a/0xa0
Dec 4 13:10:12 block kernel: [<c010316b>] sysenter_past_esp+0x54/0x75
Dec 4 13:10:12 block kernel: Code: 18 8b 6c 24 1c 8b 86 ac 01 00 00 8d 9e 84 01 00 00 8b 78 0c 8b 86 84 01 00 00 39 d8 74 25 89 c1 8d b6 00 00 00 00 8b 51 04 8b 01 <89> 50 04 89 02 89 49 04 89 09 89 69 14 89 0c 24 ff d7 8b 0b 39
Dec 5 10:59:31 block kernel: x55/0xb0
Vince
^ permalink raw reply [flat|nested] 14+ messages in thread
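One possible reading of the register dump above, offered as an assumption rather than a diagnosis confirmed in the thread: the bytes around the faulting instruction (8b 51 04, 8b 01, <89> 50 04, 89 02) appear to decode to the usual list unlink sequence, loading prev and next from the entry in ecx and then storing prev into next->prev. With eax (next) equal to 0 on a 32-bit kernel, that store targets address 4, which matches the reported fault address. The sketch below only computes that address without dereferencing anything; list_del_first_store() and demo_zeroed_entry() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the kernel's struct list_head: next first, then prev. */
struct list_head { struct list_head *next, *prev; };

/*
 * Address that the "next->prev = prev" store of an unlink would write to
 * for a given entry, computed without touching that memory.
 */
static uintptr_t list_del_first_store(const struct list_head *entry)
{
	assert(entry != NULL);
	return (uintptr_t)entry->next + offsetof(struct list_head, prev);
}

/* An entry whose links are zero rather than initialised or linked. */
static uintptr_t demo_zeroed_entry(void)
{
	struct list_head entry = { NULL, NULL };

	return list_del_first_store(&entry);
}
```

The result equals the size of one pointer: 4 on the 32-bit x86 kernel in the log, 8 on a typical 64-bit build. Under this reading the list entry itself held NULL links, which is a different condition from rpci being NULL.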
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-12-05 21:03 ` Vince Busam
@ 2005-12-12 18:57 ` Vince Busam
2005-12-12 19:18 ` Trond Myklebust
0 siblings, 1 reply; 14+ messages in thread
From: Vince Busam @ 2005-12-12 18:57 UTC (permalink / raw)
To: nfs
I applied this patch from 2.6.15-rc5, and got the following oops. I really wish I could
reproduce this faster, but it still only happens over the weekend when my credentials have
expired. Letting them expire during the week doesn't reproduce it.
--- e3b242daf53c64506f9ba77937a94bb544bcefe6
+++ c76ea221798caf96666ef99ac3ce5c1694c832b7
@@ -59,7 +59,6 @@ __rpc_purge_upcall(struct inode *inode,
 	struct rpc_inode *rpci = RPC_I(inode);
 
 	__rpc_purge_list(rpci, &rpci->pipe, err);
-	__rpc_purge_list(rpci, &rpci->in_upcall, err);
 	rpci->pipelen = 0;
 	wake_up(&rpci->waitq);
 }
@@ -119,6 +118,7 @@ rpc_close_pipes(struct inode *inode)
 	down(&inode->i_sem);
 	if (rpci->ops != NULL) {
 		rpci->nreaders = 0;
+		__rpc_purge_list(rpci, &rpci->in_upcall, -EPIPE);
 		__rpc_purge_upcall(inode, -EPIPE);
 		rpci->nwriters = 0;
 		if (rpci->ops->release_pipe)
Dec 11 13:53:28 block kernel: RPC: AUTH_GSS upcall timed out.
Dec 11 13:53:28 block kernel: Please check user daemon is running!
Dec 11 13:53:43 block kernel: RPC: AUTH_GSS upcall timed out.
Dec 11 13:53:43 block kernel: Please check user daemon is running!
Dec 11 13:53:43 block kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000004
Dec 11 13:53:43 block kernel: printing eip:
Dec 11 13:53:43 block kernel: f8ad1d55
Dec 11 13:53:43 block kernel: *pde = 00000000
Dec 11 13:53:43 block kernel: Oops: 0002 [#1]
Dec 11 13:53:43 block kernel: PREEMPT SMP
Dec 11 13:53:43 block kernel: Modules linked in: ext2 loop des binfmt_misc cpufreq_userspace cpufreq_ondemand cpufreq_powersave autofs4 video button battery container ac capability commoncap nfs lockd af_packet tg3 generic piix snd_intel8x0 snd_usb_audio snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep snd soundcore snd_page_alloc pwc videodev v4l2_common uhci_hcd pci_hotplug floppy pcspkr rtc tsdev usbhid evdev md_mod dm_mod nvidia agpgart psmouse mousedev parport_pc lp parport ide_generic ide_disk ide_cd cdrom rpcsec_gss_krb5 auth_rpcgss sunrpc ehci_hcd ext3 jbd mbcache ahci sd_mod ata_piix libata usb_storage usbcore scsi_mod ide_core unix thermal processor fan
Dec 11 13:53:43 block kernel: CPU: 1
Dec 11 13:53:43 block kernel: EIP: 0060:[<f8ad1d55>] Tainted: P VLI
Dec 11 13:53:43 block kernel: EFLAGS: 00010287 (2.6.13.4-gg5vb8)
Dec 11 13:53:43 block kernel: EIP is at __rpc_purge_list+0x35/0x60 [sunrpc]
Dec 11 13:53:43 block kernel: eax: 00000000 ebx: ebcdc684 ecx: ea628908 edx: 00000000
Dec 11 13:53:43 block kernel: esi: f890ece0 edi: ffffffe0 ebp: ebcdc500 esp: ebac7f1c
Dec 11 13:53:43 block kernel: ds: 007b es: 007b ss: 0068
Dec 11 13:53:43 block kernel: Process rpc.gssd (pid: 7196, threadinfo=ebac6000 task=dfe61540)
Dec 11 13:53:43 block kernel: Stack: ea628900 ebcdc500 ffffffe0 ebcdc500 f8ad1dad ebcdc500 ebcdc684 ffffffe0
Dec 11 13:53:43 block kernel: ebcdc500 ea20ea80 f8ad213b ebcdc500 ffffffe0 00000008 ea20ea80 ebcdaf00
Dec 11 13:53:43 block kernel: c01675fa ebcdc500 ea20ea80 00000000 00000000 ebba9d40 ea20ea80 dfb06080
Dec 11 13:53:43 block kernel: Call Trace:
Dec 11 13:53:43 block kernel: [<f8ad1dad>] __rpc_purge_upcall+0x2d/0x80 [sunrpc]
Dec 11 13:53:43 block kernel: [<f8ad213b>] rpc_pipe_release+0xcb/0xf0 [sunrpc]
Dec 11 13:53:43 block kernel: [<c01675fa>] __fput+0x18a/0x1d0
Dec 11 13:53:43 block kernel: [<c0165906>] filp_close+0x46/0x90
Dec 11 13:53:43 block kernel: [<c01659ba>] sys_close+0x6a/0xa0
Dec 11 13:53:43 block kernel: [<c010316b>] sysenter_past_esp+0x54/0x75
Dec 11 13:53:43 block kernel: Code: 8b 44 24 14 8b 7c 24 1c 8b 0b 8b 80 b4 01 00 00 39 d9 8b 70 0c 74 2c eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 8b 51 04 8b 01 <89> 50 04 89 02 89 49 04 89 09 89 79 14 89 0c 24 ff d6 8b 0b 39
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-12-12 18:57 ` Vince Busam
@ 2005-12-12 19:18 ` Trond Myklebust
2005-12-12 20:33 ` Vince Busam
0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2005-12-12 19:18 UTC (permalink / raw)
To: Vince Busam; +Cc: nfs
On Mon, 2005-12-12 at 10:57 -0800, Vince Busam wrote:
> I applied this patch from 2.6.15-rc5, and got the following oops. I really wish I could
> reproduce this faster, but it still only happens over the weekend when my credentials have
> expired. Letting them expire during the week doesn't reproduce it.
Could you send us the contents of rpc_close_pipes() and
rpc_pipe_release()?
I cannot see how rpc_pipe_release can be calling __rpc_purge_upcall with
a null entry for rpci->ops: the inode->i_sem should be protecting it
from changing.
Cheers,
Trond
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-12-12 19:18 ` Trond Myklebust
@ 2005-12-12 20:33 ` Vince Busam
2005-12-12 23:51 ` Trond Myklebust
0 siblings, 1 reply; 14+ messages in thread
From: Vince Busam @ 2005-12-12 20:33 UTC (permalink / raw)
To: Trond Myklebust; +Cc: nfs
Trond Myklebust wrote:
> On Mon, 2005-12-12 at 10:57 -0800, Vince Busam wrote:
>
>>I applied this patch from 2.6.15-rc5, and got the following oops. I really wish I could
>>reproduce this faster, but it still only happens over the weekend when my credentials have
>>expired. Letting them expire during the week doesn't reproduce it.
>
>
> Could you send us the contents of rpc_close_pipes() and
> rpc_pipe_release()?
>
> I cannot see how rpc_pipe_release can be calling __rpc_purge_upcall with
> a null entry for rpci->ops: the inode->i_sem should be protecting it
> from changing.
static void
rpc_close_pipes(struct inode *inode)
{
	struct rpc_inode *rpci = RPC_I(inode);

	cancel_delayed_work(&rpci->queue_timeout);
	flush_scheduled_work();
	down(&inode->i_sem);
	if (rpci->ops != NULL) {
		rpci->nreaders = 0;
		__rpc_purge_list(rpci, &rpci->in_upcall, -EPIPE);
		__rpc_purge_upcall(inode, -EPIPE);
		rpci->nwriters = 0;
		if (rpci->ops->release_pipe)
			rpci->ops->release_pipe(inode);
		rpci->ops = NULL;
	}
	rpc_inode_setowner(inode, NULL);
	up(&inode->i_sem);
}

static int
rpc_pipe_release(struct inode *inode, struct file *filp)
{
	struct rpc_inode *rpci = RPC_I(filp->f_dentry->d_inode);
	struct rpc_pipe_msg *msg;

	down(&inode->i_sem);
	if (rpci->ops == NULL)
		goto out;
	msg = (struct rpc_pipe_msg *)filp->private_data;
	if (msg != NULL) {
		msg->errno = -EPIPE;
		list_del_init(&msg->list);
		rpci->ops->destroy_msg(msg);
	}
	if (filp->f_mode & FMODE_WRITE)
		rpci->nwriters --;
	if (filp->f_mode & FMODE_READ)
		rpci->nreaders --;
	if (!rpci->nreaders)
		__rpc_purge_upcall(inode, -EPIPE);
	if (rpci->ops->release_pipe)
		rpci->ops->release_pipe(inode);
out:
	up(&inode->i_sem);
	return 0;
}
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-12-12 20:33 ` Vince Busam
@ 2005-12-12 23:51 ` Trond Myklebust
2006-01-05 22:30 ` Vince Busam
0 siblings, 1 reply; 14+ messages in thread
From: Trond Myklebust @ 2005-12-12 23:51 UTC (permalink / raw)
To: Vince Busam; +Cc: nfs

[-- Attachment #1: Type: text/plain, Size: 324 bytes --]

On Mon, 2005-12-12 at 12:33 -0800, Vince Busam wrote:
> Trond Myklebust wrote:
> >
> > Could you send us the contents of rpc_close_pipes() and
> > rpc_pipe_release()?
>
>

Hmm.... Looks correct. The only potential races I can see should be
fixed by the following patch. Can you apply and then try again?

Cheers,
Trond

[-- Attachment #2: linux-2.6.15-37-fix_rpc_pipefs_race.dif --]
[-- Type: text/plain, Size: 1387 bytes --]

SUNRPC: Fix a potential race in rpc_pipefs.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 net/sunrpc/rpc_pipe.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index c76ea22..511647e 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -70,8 +70,11 @@ rpc_timeout_upcall_queue(void *data)
 	struct inode *inode = &rpci->vfs_inode;
 
 	down(&inode->i_sem);
+	if (rpci->ops == NULL)
+		goto out;
 	if (rpci->nreaders == 0 && !list_empty(&rpci->pipe))
 		__rpc_purge_upcall(inode, -ETIMEDOUT);
+out:
 	up(&inode->i_sem);
 }
 
@@ -113,8 +116,6 @@ rpc_close_pipes(struct inode *inode)
 {
 	struct rpc_inode *rpci = RPC_I(inode);
 
-	cancel_delayed_work(&rpci->queue_timeout);
-	flush_scheduled_work();
 	down(&inode->i_sem);
 	if (rpci->ops != NULL) {
 		rpci->nreaders = 0;
@@ -127,6 +128,8 @@ rpc_close_pipes(struct inode *inode)
 	}
 	rpc_inode_setowner(inode, NULL);
 	up(&inode->i_sem);
+	cancel_delayed_work(&rpci->queue_timeout);
+	flush_scheduled_work();
 }
 
 static struct inode *
@@ -166,7 +169,7 @@ rpc_pipe_open(struct inode *inode, struc
 static int
 rpc_pipe_release(struct inode *inode, struct file *filp)
 {
-	struct rpc_inode *rpci = RPC_I(filp->f_dentry->d_inode);
+	struct rpc_inode *rpci = RPC_I(inode);
 	struct rpc_pipe_msg *msg;
 
 	down(&inode->i_sem);

^ permalink raw reply related [flat|nested] 14+ messages in thread
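
The ordering that the patch above introduces can be modelled in user space. The following is only a rough sketch, not kernel code: `pipe_sketch`, `pipe_ops_sketch`, `purge_locked`, `timeout_handler_sketch` and `close_pipes_sketch` are hypothetical names, a pthread mutex stands in for `inode->i_sem`, and the step that clears `ops` during close is an assumption, since the hunks shown do not include it.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the pipe's ops table. */
struct pipe_ops_sketch { int unused; };

/* Hypothetical stand-in for struct rpc_inode. */
struct pipe_sketch {
	pthread_mutex_t lock;              /* plays the role of inode->i_sem */
	const struct pipe_ops_sketch *ops; /* NULL once the pipe is closed */
	int nreaders;
	int queued;                        /* messages waiting in the upcall queue */
	int purged;                        /* messages discarded so far */
};

/* Discard everything that is queued; the caller must hold the lock. */
static void purge_locked(struct pipe_sketch *p)
{
	p->purged += p->queued;
	p->queued = 0;
}

/* Models the patched timeout handler: do nothing if the pipe is closed. */
static void timeout_handler_sketch(struct pipe_sketch *p)
{
	pthread_mutex_lock(&p->lock);
	if (p->ops == NULL)
		goto out;
	if (p->nreaders == 0 && p->queued > 0)
		purge_locked(p);
out:
	pthread_mutex_unlock(&p->lock);
}

/* Models the patched close path and its ordering. */
static void close_pipes_sketch(struct pipe_sketch *p)
{
	pthread_mutex_lock(&p->lock);
	if (p->ops != NULL) {
		p->nreaders = 0;
		purge_locked(p);
		p->ops = NULL; /* assumption: this step is not visible in the hunk */
	}
	pthread_mutex_unlock(&p->lock);
	/* The patch cancels and flushes the delayed work here, after unlocking. */
}
```

In this model, a timeout handler that runs after `close_pipes_sketch()` takes the lock, sees `ops == NULL`, and leaves the queue alone, which is the behaviour the added `goto out` appears intended to guarantee.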
* Re: [PATCH] Fix typo on __rpc_purge_upcall
2005-12-12 23:51 ` Trond Myklebust
@ 2006-01-05 22:30 ` Vince Busam
0 siblings, 0 replies; 14+ messages in thread
From: Vince Busam @ 2006-01-05 22:30 UTC (permalink / raw)
To: Trond Myklebust; +Cc: nfs

Trond Myklebust wrote:
> On Mon, 2005-12-12 at 12:33 -0800, Vince Busam wrote:
>
>>Trond Myklebust wrote:
>>
>>>Could you send us the contents of rpc_close_pipes() and
>>>rpc_pipe_release()?
>>>
>
>
> Hmm.... Looks correct. The only potential races I can see should be
> fixed by the following patch. Can you apply and then try again?
>

I still got an oops after applying that patch (it still takes a long
time to occur; this one happened over the break, with expired
credentials).

Dec 24 01:07:43 block kernel: RPC: AUTH_GSS upcall timed out.
Dec 24 01:07:43 block kernel: Please check user daemon is running!
Dec 24 01:07:45 block kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000004
Dec 24 01:07:45 block kernel: printing eip:
Dec 24 01:07:45 block kernel: f8ad1d4b
Dec 24 01:07:45 block kernel: *pde = 00000000
Dec 24 01:07:45 block kernel: Oops: 0002 [#1]
Dec 24 01:07:45 block kernel: PREEMPT SMP
Dec 24 01:07:45 block kernel: Modules linked in: des binfmt_misc cpufreq_userspace cpufreq_ondemand cpufreq_powersave autofs4 video button battery container ac capability commoncap nfs lockd af_packet tg3 generic piix snd_intel8x0 snd_ac97_codec snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep snd soundcore pwc videodev v4l2_common uhci_hcd pci_hotplug floppy pcspkr rtc tsdev evdev usbhid md_mod dm_mod nvidia agpgart psmouse mousedev parport_pc lp parport ide_generic ide_disk ide_cd cdrom rpcsec_gss_krb5 auth_rpcgss sunrpc ehci_hcd ext3 jbd mbcache ahci sd_mod ata_piix libata usb_storage usbcore scsi_mod ide_core unix thermal processor fan
Dec 24 01:07:45 block kernel: CPU: 1
Dec 24 01:07:45 block kernel: EIP: 0060:[<f8ad1d4b>] Tainted: P VLI
Dec 24 01:07:45 block kernel: EFLAGS: 00010286 (2.6.13.4-gg5vb9)
Dec 24 01:07:45 block kernel: EIP is at __rpc_purge_list+0x2b/0xc0 [sunrpc]
Dec 24 01:07:45 block kernel: eax: 00000000 ebx: c877de88 ecx: c877dea0 edx: 00000000
Dec 24 01:07:45 block kernel: esi: ec69e684 edi: f890ece0 ebp: ffffffe0 esp: ebd4ff14
Dec 24 01:07:45 block kernel: ds: 007b es: 007b ss: 0068
Dec 24 01:07:45 block kernel: Process rpc.gssd (pid: 7410, threadinfo=ebd4e000 task=ec48c540)
Dec 24 01:07:45 block kernel: Stack: c877de80 00000002 d646d440 ec69e500 ffffffe0 ec68fa00 ec69e500 f8ad1e15
Dec 24 01:07:45 block kernel: ec69e500 ec69e684 ffffffe0 ec69e500 e584bc80 f8ad21be ec69e500 ffffffe0
Dec 24 01:07:45 block kernel: 00000008 e584bc80 c01675fa ec69e500 e584bc80 00000000 00000000 ebe667a0
Dec 24 01:07:45 block kernel: Call Trace:
Dec 24 01:07:45 block kernel: [<f8ad1e15>] __rpc_purge_upcall+0x35/0xb0 [sunrpc]
Dec 24 01:07:45 block kernel: [<f8ad21be>] rpc_pipe_release+0xae/0xd0 [sunrpc]
Dec 24 01:07:45 block kernel: [<c01675fa>] __fput+0x18a/0x1d0
Dec 24 01:07:45 block kernel: [<c0165906>] filp_close+0x46/0x90
Dec 24 01:07:45 block kernel: [<c01659ba>] sys_close+0x6a/0xa0
Dec 24 01:07:45 block kernel: [<c010316b>] sysenter_past_esp+0x54/0x75
Dec 24 01:07:45 block kernel: Code: 55 57 56 53 83 ec 0c 8b 5c 24 20 8b 74 24 24 8b 6c 24 28 85 db 74 78 85 f6 74 54 8b 83 b4 01 00 00 8b 78 0c eb 17 8b 53 04 8b 03 <89> 50 04 89 02 89 5b 04 89 1b 89 6b 14 89 1c 24 ff d7 8b 1e 39

After disassembling the code, it appears this is happening in
list_del_init(&msg->list) in __rpc_purge_list(), in the first line of
the inlined function __list_del().

Vince

^ permalink raw reply [flat|nested] 14+ messages in thread
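
To illustrate why a fault at virtual address 00000004 fits the first line of `__list_del()`, here is a minimal user-space sketch of the list primitives. It is a model under stated assumptions, not the kernel source: `list_node`, `node_init`, `node_add_tail`, `node_del`, `node_del_init` and `first_store_address` are hypothetical names, and the only thing borrowed from the kernel is the two-pointer layout of a list node.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as a kernel list node: next pointer first, then prev. */
struct list_node {
	struct list_node *next;
	struct list_node *prev;
};

/* An initialised node points at itself in both directions. */
static void node_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* Insert entry just before head, i.e. at the tail of the list. */
static void node_add_tail(struct list_node *entry, struct list_node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink: the first store writes through next, the second through prev. */
static void node_del(struct list_node *prev, struct list_node *next)
{
	next->prev = prev; /* faults here if next is NULL */
	prev->next = next;
}

/* Unlink an entry and leave it self-linked, like list_del_init(). */
static void node_del_init(struct list_node *entry)
{
	node_del(entry->prev, entry->next);
	node_init(entry);
}

/* Address the first store in node_del() would write for this entry. */
static uintptr_t first_store_address(const struct list_node *entry)
{
	return (uintptr_t)entry->next + offsetof(struct list_node, prev);
}
```

With 4-byte pointers, as on the 32-bit machine in the log, `offsetof(struct list_node, prev)` is 4, so an entry whose `next` is NULL would make the first store target address 4. That is consistent with `eax: 00000000`, the write fault reported as `Oops: 0002`, and the marked instruction bytes `<89> 50 04` (a store of `edx` to `eax + 4`). The sketch shows only where the store lands; it says nothing about how `next` came to be NULL.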
end of thread, other threads:[~2006-01-05 22:30 UTC | newest]

Thread overview: 14+ messages
2005-11-21 19:51 [PATCH] Fix typo on __rpc_purge_upcall Vince Busam
2005-11-21 19:55 ` Trond Myklebust
2005-11-21 21:51 ` Vince Busam
2005-11-21 22:34 ` Trond Myklebust
2005-11-21 22:59 ` Vince Busam
2005-11-21 23:07 ` Trond Myklebust
2005-11-28 18:16 ` Vince Busam
2005-11-28 18:52 ` Trond Myklebust
2005-12-05 21:03 ` Vince Busam
2005-12-12 18:57 ` Vince Busam
2005-12-12 19:18 ` Trond Myklebust
2005-12-12 20:33 ` Vince Busam
2005-12-12 23:51 ` Trond Myklebust
2006-01-05 22:30 ` Vince Busam