Linux NFS development
From: "J. Bruce Fields" <bfields@fieldses.org>
To: "Holger Hoffstätte" <holger.hoffstaette@googlemail.com>
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: 3.18.1: broken directory with one file too many
Date: Wed, 17 Dec 2014 16:22:00 -0500
Message-ID: <20141217212159.GA11517@fieldses.org>
In-Reply-To: <pan.2014.12.16.22.19.18@googlemail.com>

On Tue, Dec 16, 2014 at 10:19:18PM +0000, Holger Hoffstätte wrote:
> (please CC: for followups)
> 
> I just spent two hours trying to untangle a *weird* bug that I have not
> seen before. It might be new to 3.18.x but I don't know for sure.
> Apologies in advance for the long prelude but I figured I need to
> describe the problem scenario as precisely as possible.
> 
> All this is on freshly baked 3.18.1 with Gentoo userland; the exported
> filesystem is ext4.
> 
> On my NFS server I work with a git repo:
> 
> holger>git clone ../work/kernel-patches.git
> Cloning into 'kernel-patches'...
> done.
> holger>cd kernel-patches 
> holger>git status
> On branch master
> Your branch is up-to-date with 'origin/master'.
> nothing to commit, working directory clean
> holger>ll
> total 92K
> drwxr-xr-x 2 holger users  72K Dec 16 22:41 3.14/
> drwxr-xr-x 2 holger users  16K Dec 16 22:41 3.18/
> -rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
> holger>
> 
> Looking fine!
> 
> On my NFS client this directory is automounted via NFS:
> 
> holger>mount | grep home
> tux:/home/holger on /mnt/tux/holger type nfs (rw,noatime,tcp,sloppy,vers=4,addr=192.168.100.222,clientaddr=192.168.100.128)
> 
> This has worked for ages and never caused any problems.
> 
> Let's see how my git repo is doing:
>  
> holger>cd /mnt/tux/holger/Projects/kernel-patches 
> holger>ll
> total 92K
> drwxr-xr-x 2 holger users  72K Dec 16 22:41 3.14/
> drwxr-xr-x 2 holger users  16K Dec 16 22:41 3.18/
> -rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
> holger>git status
> On branch master
> Your branch is up-to-date with 'origin/master'.
> Untracked files:
>   (use "git add <file>..." to include in what will be committed)
> 
> 	3.14/btrfs-20
> 
> nothing added to commit but untracked files present (use "git add" to track)
> 
> ..wait, what? There is no such file "btrfs-20" !
> 
> holger>ll 3.14 | head
> ls: cannot access 3.14/btrfs-20: No such file or directory
> total 4.5M
> -rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
> -rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
> -rw-r--r-- 1 holger users  41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
> -rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
> -rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
> -rw-r--r-- 1 holger users  13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
> -????????? ? ?      ?        ?            ? btrfs-20
> -rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
> -rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
> holger>
> 
> There is a "rogue" file messing up the directory?!
> 
> This used to work until I added a specific file, so..
> 
> holger>ll 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch 
> -rw-r--r-- 1 holger users 2.3K Dec 16 22:41 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> 
> holger>stat 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
>   File: ‘3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch’
>   Size: 2306      	Blocks: 8          IO Block: 1048576 regular file
> Device: 18h/24d	Inode: 22544856    Links: 1
> Access: (0644/-rw-r--r--)  Uid: ( 1000/  holger)   Gid: (  100/   users)
> Access: 2014-12-16 22:41:36.515665610 +0100
> Modify: 2014-12-16 22:41:36.515665610 +0100
> Change: 2014-12-16 22:41:36.515665610 +0100
>  Birth: -
> 
> Looks fine..maybe try moving it to the parent?
> 
> holger>mv 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch .
> holger>ll
> total 96K
> drwxr-xr-x 2 holger users  72K Dec 16 22:44 3.14/
> drwxr-xr-x 2 holger users  16K Dec 16 22:41 3.18/
> -rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
> -rw-r--r-- 1 holger users 2.3K Dec 16 22:41 btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> 
> holger>git status
> On branch master
> Your branch is up-to-date with 'origin/master'.
> Changes not staged for commit:
>   (use "git add/rm <file>..." to update what will be committed)
>   (use "git checkout -- <file>..." to discard changes in working directory)
> 
> 	deleted:    3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> 
> Untracked files:
>   (use "git add <file>..." to include in what will be committed)
> 
> 	btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
> 
> no changes added to commit (use "git add" and/or "git commit -a")
> 
> holger>ll 3.14 | head                                                                  
> total 4.5M
> -rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
> -rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
> -rw-r--r-- 1 holger users  41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
> -rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
> -rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
> -rw-r--r-- 1 holger users  13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
> -rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
> -rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
> -rw-r--r-- 1 holger users 5.2K Dec 16 22:41 btrfs-20140130-fix-32-64-bit-problem-with-BTRFS_SET_RECEIVED_SUBVOL-ioctl.patch
> 
> I can move it back into 3.14/ and the directory is messed up again.
> All this is reproducible, in different export directories.
> 
> Any ideas what this might be? A direntry hash collision maybe?
> There is a large number of files starting with btrfs-2014xxyy-.. but with
> the typical kernel patch names (some quite long), so that would be pretty
> bad. Also everything works locally on ext4 without problems, so I suspect
> it's an isolated NFS problem.

That doesn't sound familiar.  A network trace showing the READDIR would
be really useful.  Since this is so reproducible, I think that should be
possible.  So do something like:

	move the problem file into 3.14/
	tcpdump -s0 -w tmp.pcap -i <relevant interface>
	ls the directory on the client.
	kill tcpdump
	send us tmp.pcap and/or take a look at it with wireshark and see
	what the READDIR response looks like.
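
The steps above could be scripted roughly as follows. This is only a
sketch: the names IFACE, DIR, PCAP and DELAY and their defaults are
placeholders chosen for illustration, not values from this thread, and
tcpdump normally needs root. If the client answers the ls from its
cache, no READDIR will appear in the trace, so the directory may need
to be looked at freshly (for example after a remount).

```shell
#!/bin/sh
# Sketch only: all four variables are illustrative placeholders.
IFACE="${IFACE:-eth0}"      # client interface facing the NFS server (assumed name)
DIR="${DIR:-.}"             # the affected directory on the NFS mount
PCAP="${PCAP:-tmp.pcap}"    # capture file to send or open in wireshark
DELAY="${DELAY:-2}"         # seconds to let tcpdump start up and drain

capture_readdir() {
    # -s0 captures whole packets so the READDIR reply is not truncated.
    tcpdump -s0 -w "$PCAP" -i "$IFACE" &
    pid=$!
    sleep "$DELAY"
    # Listing the directory is what should put the READDIR on the wire;
    # errors about the bogus entry are expected, so they are discarded.
    ls -l "$DIR" >/dev/null 2>&1
    sleep "$DELAY"
    # Stop the capture and reap the background process.
    kill "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null
    echo "capture written to $PCAP"
}
```

With the problem file moved back into 3.14/, one would set IFACE and
DIR for the client in question and then call capture_readdir as root.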

--b.
