Linux NFS development
From: "Holger Hoffstätte" <holger.hoffstaette@googlemail.com>
To: linux-nfs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Subject: Re: 3.18.1: broken directory with one file too many
Date: Thu, 18 Dec 2014 15:35:45 +0000 (UTC)
Message-ID: <pan.2014.12.18.15.35.45@googlemail.com>
In-Reply-To: 20141218144856.GA18179@fieldses.org

On Thu, 18 Dec 2014 09:48:56 -0500, J. Bruce Fields wrote:

> On a quick skim, the server's READDIR responses look correct.  The entry
> btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> is returned in frame 53 (with complete reassembled reply displayed by
> wireshark in frame 63).
> 
> You could double-check for me--just run "wireshark nfs-server.pcap",
> look for packets labeled "Reply ... READDIR", and expand out the READDIR
> op and directory listing.  I don't see anything obviously wrong.

That's what I can see in Wireshark as well (#53 as part of the "20 
reassembled segments"). As I said in my followup I don't think there is 
anything wrong with that particular file since removing others "fixed" 
the problem. That's why I suspected NIC/TCP buggery, and since my kernels 
usually have a bunch of patches (the ones in that repo) I wanted to try 
vanilla 3.18.0/1 as well as 3.14.27 first.

>> Meanwhile I'll try older/plain (unpatched) kernels. So far reverting
>> the client to vanilla 3.18.1 or 3.14.27 has not helped..
> 
> I'm a little unclear: when you said "All this is on freshly baked
> 3.18.1", are you describing the client, or the server, or both?

That was on both. As I wrote in the followups, I've now also tried
downgrading the clients first (didn't help), and then finally found that
3.14.27 on the server (both with and without my patches) repeatably
works, regardless of client. Right now I have 3.18.1 on the clients and
3.14.27 on the server, and that works fine.

I never noticed any other problems when first testing 3.18, which is why
I switched over all machines; it has been working really well so far.
No other networking problems, and I use NFS all day long. If there really 
was NIC packet corruption, NFS dropped requests or general page cache 
borkage then I think I would have noticed something much earlier.

Maybe you can try to reproduce? Try git clone
https://github.com/hhoffstaette/kernel-patches
and rewind to rev e7b720ef, after which I first noticed the problem.
Then look at the 3.14 directory over NFS.
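
For anyone trying the reproduction steps above, one way to check whether
the directory is affected is to compare the listing seen locally on the
server with the listing seen through the NFS mount on the client. The
following is only a rough sketch: the function name and both paths are
hypothetical, and it assumes the exported directory and the NFS mount are
both reachable from wherever the script runs.

```python
import os
import sys


def compare_listings(export_dir, nfs_dir):
    """Compare entry names in two directories.

    Returns (missing_on_client, extra_on_client) as sorted lists:
    names present only in export_dir, and names present only in nfs_dir.
    """
    server_names = set(os.listdir(export_dir))
    client_names = set(os.listdir(nfs_dir))
    return sorted(server_names - client_names), sorted(client_names - server_names)


if __name__ == "__main__":
    # Hypothetical usage: compare.py /srv/export/kernel-patches/3.14 /mnt/nfs/kernel-patches/3.14
    if len(sys.argv) != 3:
        sys.exit("usage: compare.py EXPORT_DIR NFS_DIR")
    missing, extra = compare_listings(sys.argv[1], sys.argv[2])
    for name in missing:
        print("missing on client:", name)
    for name in extra:
        print("only on client:", name)
    # Non-zero exit status when the two listings differ.
    sys.exit(1 if (missing or extra) else 0)
```

If the two listings differ only while the server runs 3.18.1, that would
be consistent with what I am seeing here; it does not by itself say where
the entries get lost.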

Let me know if there is anything else I can try!

regards,
Holger


