* NFS broken in latest 2.6.37-rcX
From: Guy Martin @ 2010-12-22 21:34 UTC
  To: linux-parisc


Hi all,


It seems that NFS got broken recently.
I've bisected this to babddc72a9468884ce1a23db3c3d54b0afa299f0.

Both NFS version 2 and 3 are affected. I haven't tested NFS 4.

I've been able to reproduce with both 32bit and 64bit kernels using
gcc-4.5.1 with the fix for PR46915 included.

To reproduce, simply mount an NFS share and try to list the files with
ls.

When listing the directory at the commit I mentioned, the code seems
to loop. The network traffic goes high and you keep seeing the same
packets flowing.

In current HEAD, the behavior is slightly different. The process uses
100% CPU and you either get a kernel panic or, if you are lucky,
something like "memory exhausted".


I'm not sure how to troubleshoot this further.

Any ideas?


  Guy


* Re: NFS broken in latest 2.6.37-rcX
From: Carlos O'Donell @ 2010-12-22 21:53 UTC
  To: Guy Martin; +Cc: linux-parisc

On Wed, Dec 22, 2010 at 4:34 PM, Guy Martin <gmsoft@tuxicoman.be> wrote:
> I'm not sure how to troubleshoot this further.
>
> Any ideas?

You need to understand the failure mode.

I would do two things:

(a) Revert the patch on HEAD and see if it works. This is usually
convincing proof that something is broken and the patch interacts
badly with our arch.

(b) Put printks in the code to see where it's looping and under what
conditions.

Once you have a better grasp of the failure mode you can contact the
author of the patch and tell them about the breakage, CC linux-parisc,
and ask for help.

That would be my plan of attack.

Cheers,
Carlos.


* Re: NFS broken in latest 2.6.37-rcX
From: James Bottomley @ 2011-01-05 17:36 UTC
  To: Carlos O'Donell; +Cc: Guy Martin, linux-parisc

On Wed, 2010-12-22 at 16:53 -0500, Carlos O'Donell wrote:
> You need to understand the failure mode.
> 
> I would do two things:
> 
> (a) Revert the patch on HEAD and see if it works. This is usually
> convincing proof that something is broken and the patch interacts
> badly with our arch.
> 
> (b) Put printks in the code to see where it's looping and under what
> conditions.
> 
> Once you have a better grasp of the failure mode you can contact the
> author of the patch and tell them about the breakage, CC linux-parisc,
> and ask for help.
> 
> That would be my plan of attack.

ARM ran into this as well:

http://marc.info/?t=129372938100002

It looks like an inequivalent aliasing problem: the data is written
through the kernel direct mapping and read back through vmalloc space.

James




* Re: NFS broken in latest 2.6.37-rcX
From: Guy Martin @ 2011-01-05 18:31 UTC
  To: James Bottomley; +Cc: Carlos O'Donell, linux-parisc


James,

Thanks for pointing me to the thread.

I reached a similar conclusion. After adding lots of printk calls, I
could see that the page the skb buffer is copied into is either not
updated or only partially updated when xdr_page_filler reads it.

Cheers,
  Guy

On Wed, 05 Jan 2011 11:36:01 -0600
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> ARM ran into this as well:
> 
> http://marc.info/?t=129372938100002
> 
> It looks like an inequivalent aliasing problem: the data is written
> through the kernel direct mapping and read back through vmalloc space.
> 
> James
> 
> 



* Re: NFS broken in latest 2.6.37-rcX
From: James Bottomley @ 2011-01-05 18:47 UTC
  To: Guy Martin; +Cc: Carlos O'Donell, linux-parisc

On Wed, 2011-01-05 at 19:31 +0100, Guy Martin wrote:
> James,
> 
> Thanks for pointing me to the thread.
> 
> I reached a similar conclusion. After adding lots of printk calls, I
> could see that the page the skb buffer is copied into is either not
> updated or only partially updated when xdr_page_filler reads it.

So does this fix it?

James

---

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 996dd89..37d7347 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1783,6 +1783,7 @@ static int nfs_symlink(struct inode *dir, struct dentry *dentry, const char *sym
 	memcpy(kaddr, symname, pathlen);
 	if (pathlen < PAGE_SIZE)
 		memset(kaddr + pathlen, 0, PAGE_SIZE - pathlen);
+	flush_kernel_dcache_page(page);
 	kunmap_atomic(kaddr, KM_USER0);
 
 	error = NFS_PROTO(dir)->symlink(dir, dentry, page, pathlen, &attr);
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 4435e5e..9a6bfea 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -205,6 +205,7 @@ static void nfs4_setup_readdir(u64 cookie, __be32 *verifier, struct dentry *dent
 
 	readdir->pgbase = (char *)p - (char *)start;
 	readdir->count -= readdir->pgbase;
+	flush_kernel_dcache_page(*readdir->pages);
 	kunmap_atomic(start, KM_USER0);
 }
 


