From: Dave Chinner <david@fromorbit.com>
To: "Rantala, Tommi T. (Nokia - FI/Espoo)" <tommi.t.rantala@nokia.com>
Cc: "darrick.wong@oracle.com" <darrick.wong@oracle.com>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"hch@lst.de" <hch@lst.de>
Subject: Re: 5.5 XFS getdents regression?
Date: Wed, 11 Mar 2020 09:14:06 +1100
Message-ID: <20200310221406.GO10776@dread.disaster.area>
In-Reply-To: <72c5fd8e9a23dde619f70f21b8100752ec63e1d2.camel@nokia.com>

On Tue, Mar 10, 2020 at 08:45:58AM +0000, Rantala, Tommi T. (Nokia - FI/Espoo) wrote:
> Hello,
> 
> One of my GitLab CI jobs stopped working after upgrading server
> 5.4.18-100.fc30.x86_64 -> 5.5.7-100.fc30.x86_64.
> (tested 5.5.8-100.fc30.x86_64 too, no change)
> The server is fedora30 with XFS rootfs.
> The problem reproduces always, and takes only couple minutes to run.
> 
> The CI job fails in the beginning when doing "git clean" in docker
> container, and failing to rmdir some directory:
> "warning: failed to remove
> .vendor/pkg/mod/golang.org/x/net@v0.0.0-20200114155413-6afb5195e5aa/internal/socket:
> Directory not empty"
> 
> Quick google search finds some other people reporting similar problems
> with 5.5.0:
> https://gitlab.com/gitlab-org/gitlab-runner/issues/3185

Which appears to be caused by multiple gitlab processes modifying
the directory at the same time. i.e. something is adding an entry to
the directory at the same time something is trying to rm -rf it.
That's a race condition, and would lead to the exact symptoms you
see here, depending on where in the directory the new entry is
added.

> Collected some data with strace, and it seems that getdents is not
> returning all entries:
> 
> 5.4 getdents64() returns 52+50+1+0 entries 
> => all files in directory are deleted and rmdir() is OK
> 
> 5.5 getdents64() returns 52+50+0+0 entries
> => rmdir() fails with ENOTEMPTY

Yup, that's a classic userspace TOCTOU race.

Remember, getdents() is effectively a sequential walk through the
directory data - subsequent calls start at the offset (cookie) where
the previous one left off. New entries can be added between
getdents() syscalls.

If that new entry is put at the tail of the directory, then the last
getdents() call will return that entry rather than none because it
was placed at an offset in the directory that the getdents() sweep
has not yet reached, and hence will be found by a future getdents()
call in the sweep.


However, if there is a hole in the directory structure before the
current getdents cookie offset, a new entry can be added in that
hole. i.e. at an offset in the directory that getdents has already
passed over. That dirent will never be reported by the current
getdents() sequence - a directory rewind and re-read is required to
find it. i.e. there's an inherent userspace TOCTOU race condition in
'rm -rf' operations.
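
To make the two cases above concrete, here is a minimal toy model in
Python. It is only a sketch under stated assumptions: the directory is a
flat list of slots, None marks a hole, and the cookie is just a list
index. ToyDir and sweep_with_concurrent_add are hypothetical names for
illustration, and none of this reflects the real XFS directory format.

```python
# Toy model only: a directory is a flat list of slots, None marks a hole.
# The getdents "cookie" is simply the index where the next call resumes.

class ToyDir:
    def __init__(self, names):
        self.slots = list(names)

    def getdents(self, cookie, count):
        """Scan forward from cookie; return (entries, new_cookie)."""
        entries = []
        pos = cookie
        while pos < len(self.slots) and len(entries) < count:
            if self.slots[pos] is not None:
                entries.append(self.slots[pos])
            pos += 1
        return entries, pos

    def add(self, name):
        """Reuse the first hole if there is one, else append at the tail."""
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = name
                return i
        self.slots.append(name)
        return len(self.slots) - 1


def sweep_with_concurrent_add(d, new_name, batch=2):
    """Walk the directory in batches; another writer adds new_name
    right after the first getdents() call returns."""
    seen = []
    cookie = 0
    added = False
    while True:
        entries, cookie = d.getdents(cookie, batch)
        if not entries:
            break
        seen.extend(entries)
        if not added:
            d.add(new_name)
            added = True
    return seen


# No hole: the new entry lands at the tail, ahead of the cookie.
print(sweep_with_concurrent_add(ToyDir(["a", "b", "c", "d"]), "new"))

# Hole at index 1: the new entry lands behind the cookie and is missed.
print(sweep_with_concurrent_add(ToyDir(["a", None, "b", "c", "d"]), "new"))
```

In the first run the sweep reports "new" because it was appended at an
offset the walk had not reached yet. In the second run "new" fills the
hole behind the cookie, so the sweep ends without reporting it and a
following rmdir() would see a non-empty directory.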

IOWs, this is exactly what you'd expect to see when there are
concurrent userspace modifications to a directory that is currently
being read. Hence you need to rule out application and userspace
level issues before looking for filesystem level problems.
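
For illustration, a userspace removal loop that tolerates this race
could rewind and re-read the directory whenever rmdir() reports
ENOTEMPTY. The Python sketch below is an assumption about how such a
loop might look, not what git or gitlab-runner actually do;
remove_tree_with_retry and max_passes are hypothetical names, and it
assumes Linux errno behaviour.

```python
import errno
import os


def remove_tree_with_retry(path, max_passes=5):
    """Delete everything under path, re-reading the directory on each pass.

    Returns True once rmdir() succeeds, False if the directory is still
    not empty after max_passes passes.
    """
    for _ in range(max_passes):
        # Opening the directory again is the "rewind and re-read" step:
        # the walk restarts from the beginning of the directory.
        with os.scandir(path) as it:
            entries = list(it)
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                remove_tree_with_retry(entry.path, max_passes)
            else:
                os.unlink(entry.path)
        try:
            os.rmdir(path)
            return True
        except OSError as exc:
            if exc.errno != errno.ENOTEMPTY:
                raise
            # An entry appeared behind the sweep; go around again.
    return False
```

This only narrows the window. If another process keeps adding entries,
the loop gives up after max_passes, and it does not handle entries that
vanish between the scan and the unlink.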

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
