From: "Holger Hoffstätte" <holger.hoffstaette@googlemail.com>
To: linux-nfs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Subject: 3.18.1: broken directory with one file too many
Date: Tue, 16 Dec 2014 22:19:18 +0000 (UTC)
Message-ID: <pan.2014.12.16.22.19.18@googlemail.com>

(please CC: for followups)

I just spent two hours trying to untangle a *weird* bug that I have not
seen before. It might be new to 3.18.x but I don't know for sure.
Apologies in advance for the long prelude but I figured I need to
describe the problem scenario as precisely as possible.

All this is on freshly baked 3.18.1 with Gentoo userland; the exported
filesystem is ext4.

On my NFS server I work with a git repo:

holger>git clone ../work/kernel-patches.git
Cloning into 'kernel-patches'...
done.
holger>cd kernel-patches 
holger>git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean
holger>ll
total 92K
drwxr-xr-x 2 holger users  72K Dec 16 22:41 3.14/
drwxr-xr-x 2 holger users  16K Dec 16 22:41 3.18/
-rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
holger>

Looking fine!

On my NFS client this directory is automounted via NFS:

holger>mount | grep home
tux:/home/holger on /mnt/tux/holger type nfs (rw,noatime,tcp,sloppy,vers=4,addr=192.168.100.222,clientaddr=192.168.100.128)

This has worked for ages and never caused any problems.

Let's see how my git repo is doing:
 
holger>cd /mnt/tux/holger/Projects/kernel-patches 
holger>ll
total 92K
drwxr-xr-x 2 holger users  72K Dec 16 22:41 3.14/
drwxr-xr-x 2 holger users  16K Dec 16 22:41 3.18/
-rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
holger>git status
On branch master
Your branch is up-to-date with 'origin/master'.
Untracked files:
  (use "git add <file>..." to include in what will be committed)

	3.14/btrfs-20

nothing added to commit but untracked files present (use "git add" to track)

...wait, what? There is no such file "btrfs-20"!

holger>ll 3.14 | head
ls: cannot access 3.14/btrfs-20: No such file or directory
total 4.5M
-rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
-rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
-rw-r--r-- 1 holger users  41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
-rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
-rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
-rw-r--r-- 1 holger users  13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
-????????? ? ?      ?        ?            ? btrfs-20
-rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
-rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
holger>

There is a "rogue" file messing up the directory?!
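
The symptom above (a name that the directory listing returns but that
cannot be stat'ed) can be checked for without eyeballing ls output. The
following is a minimal sketch, assuming Python 3 is available on the
client; the function names are made up for illustration and are not part
of any NFS tooling.

```python
import os


def unstattable(directory, names):
    """Return the names in `names` that lstat() cannot find in `directory`."""
    missing = []
    for name in names:
        try:
            os.lstat(os.path.join(directory, name))
        except FileNotFoundError:
            # The name was listed, but looking it up fails with ENOENT,
            # which is what ls renders as a row of question marks.
            missing.append(name)
    return missing


def find_phantom_entries(directory):
    """List `directory` and report entries that cannot be stat'ed."""
    return unstattable(directory, sorted(os.listdir(directory)))
```

Run on the client against the 3.14/ directory, find_phantom_entries("3.14")
would be expected to report the "btrfs-20" entry while the problem is
present, and an empty list otherwise. It only detects the symptom; it says
nothing about the cause.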

This used to work until I added a specific file, so...

holger>ll 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch 
-rw-r--r-- 1 holger users 2.3K Dec 16 22:41 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

holger>stat 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch
  File: ‘3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch’
  Size: 2306      	Blocks: 8          IO Block: 1048576 regular file
Device: 18h/24d	Inode: 22544856    Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/  holger)   Gid: (  100/   users)
Access: 2014-12-16 22:41:36.515665610 +0100
Modify: 2014-12-16 22:41:36.515665610 +0100
Change: 2014-12-16 22:41:36.515665610 +0100
 Birth: -

Looks fine... maybe try moving it to the parent?

holger>mv 3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch .
holger>ll
total 96K
drwxr-xr-x 2 holger users  72K Dec 16 22:44 3.14/
drwxr-xr-x 2 holger users  16K Dec 16 22:41 3.18/
-rw-r--r-- 1 holger users 2.4K Dec 16 22:41 README.md
-rw-r--r-- 1 holger users 2.3K Dec 16 22:41 btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

holger>git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	deleted:    3.14/btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch

no changes added to commit (use "git add" and/or "git commit -a")

holger>ll 3.14 | head                                                                  
total 4.5M
-rw-r--r-- 1 holger users 3.3K Dec 16 22:41 bfq-v7r6-001-block-cgroups-kconfig-build-bits-for-BFQ-v7r6-3.14.patch
-rw-r--r-- 1 holger users 219K Dec 16 22:41 bfq-v7r6-002-block-introduce-the-BFQ-v7r6-I-O-sched-for-3.14.patch
-rw-r--r-- 1 holger users  41K Dec 16 22:41 bfq-v7r6-003-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r6-for-3.14.0.patch
-rw-r--r-- 1 holger users 237K Dec 16 22:41 bfs-454-001-sched-bfs.patch
-rw-r--r-- 1 holger users 5.2K Dec 16 22:41 bfs-454-002-cpu-topology.patch
-rw-r--r-- 1 holger users  13K Dec 16 22:41 bfs-454-003-smtnice-v6.patch
-rw-r--r-- 1 holger users 7.8K Dec 16 22:41 btrfs-20140114-don't-mix-the-ordered-extents-of-all-files-together-during-logging-the-inodes.patch
-rw-r--r-- 1 holger users 1.2K Dec 16 22:41 btrfs-20140130-add-missing-error-check-in-incremental-send.patch
-rw-r--r-- 1 holger users 5.2K Dec 16 22:41 btrfs-20140130-fix-32-64-bit-problem-with-BTRFS_SET_RECEIVED_SUBVOL-ioctl.patch

I can move it back into 3.14/ and the directory is messed up again.
All this is reproducible, in different export directories.

Any ideas what this might be? A directory entry hash collision, maybe?
There is a large number of files starting with btrfs-2014xxyy-..., but
they have the typical kernel patch names (some quite long), so a collision
would be pretty bad. Also, everything works locally on ext4 without
problems, so I suspect it is an NFS-specific problem.
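
To pin down exactly how the client's view differs from the server's, the
two name lists can be compared directly. This is a rough sketch, assuming
the names of the same directory have been collected on each machine (for
example with os.listdir) and brought together in one place; the function
name and the sample data are illustrative only, with the sample names
taken from the listings earlier in this message.

```python
def compare_listings(server_names, client_names):
    """Return names that appear on only one side of the export."""
    server, client = set(server_names), set(client_names)
    return {
        "only_on_client": sorted(client - server),
        "only_on_server": sorted(server - client),
    }


# Illustrative sample: two real patch names plus the rogue client entry.
sample_server = [
    "btrfs-20140130-add-missing-error-check-in-incremental-send.patch",
    "btrfs-20141216-fix-a-warning-of-qgroup-account-on-shared-extents.patch",
]
sample_client = sample_server + ["btrfs-20"]

sample_diff = compare_listings(sample_server, sample_client)
```

An entry under "only_on_client" points at a name the client made up or
truncated; an entry under "only_on_server" would mean the client's listing
is also dropping files.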

Things that did not work - same results every time:
- dropping caches on client & server
- restarting NFS on client & server

Any suggestions welcome; I'll gladly test patches.

thanks
Holger


Thread overview: 22+ messages
2014-12-16 22:19 Holger Hoffstätte [this message]
2014-12-16 22:19 ` 3.18.1: broken directory with one file too many Holger Hoffstätte
2014-12-17 21:22 ` J. Bruce Fields
2014-12-18 12:22   ` Holger Hoffstätte
2014-12-18 12:51     ` Holger Hoffstätte
2014-12-18 12:59       ` Holger Hoffstätte
2014-12-18 14:48     ` J. Bruce Fields
2014-12-18 14:58       ` Benjamin Coddington
2014-12-18 15:19         ` J. Bruce Fields
2014-12-18 15:42           ` Holger Hoffstätte
2014-12-18 16:32             ` J. Bruce Fields
2014-12-18 16:42               ` Holger Hoffstätte
2014-12-18 17:06                 ` J. Bruce Fields
2014-12-18 19:44                   ` Holger Hoffstätte
2014-12-20 18:02                     ` J. Bruce Fields
2014-12-20 18:50                       ` Holger Hoffstätte
2015-01-07  0:25                       ` Holger Hoffstätte
2015-01-07 18:21                         ` J. Bruce Fields
2015-01-07 20:06                           ` [PATCH] nfsd4: tweak rd_dircount accounting J. Bruce Fields
2014-12-18 17:18           ` 3.18.1: broken directory with one file too many J. Bruce Fields
2014-12-18 15:35       ` Holger Hoffstätte
2014-12-18 16:30         ` J. Bruce Fields
