From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Filipe Manana <fdmanana@gmail.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>,
Martin Raiber <martin@urbackup.org>
Subject: Re: Ghost directory entries (previously "Unable to remove directory entry")
Date: Sat, 1 Jan 2022 19:04:18 -0500 [thread overview]
Message-ID: <YdDsAl0bx6DLZT+i@hungrycats.org> (raw)
In-Reply-To: <20211120193627.GE17148@hungrycats.org>
[-- Attachment #1: Type: text/plain, Size: 3971 bytes --]
On Sat, Nov 20, 2021 at 02:36:27PM -0500, Zygo Blaxell wrote:
> On Sat, Nov 20, 2021 at 05:13:39PM +0000, Filipe Manana wrote:
[...]
> > > > I just tried that loop using that case, and several variations
> > > > (different names, different directories, etc), I still couldn't
> > > > reproduce and I don't see how that alone could lead to any issue.
> > > >
> > > > I think there must be something else happening in parallel besides that loop.
> > >
> > > Definitely something else is required, the loop by itself won't reproduce
> > > the issue.
> > >
> > > I don't know what the other thing is. Candidates are: balance or
> > > snapshot delete pushing commit times above watchdog threshold, hitting
> > > a BUG_ON, filesystem ENOSPC, kernel out of memory (i.e. events that
> > > would normally force a host to reboot).
I've been able to reproduce it 3 times on 5.15.6 with the attached
test program. It runs 16 threads that try to create, write to, sometimes
fsync(!), and rename files from a pool of 256 fixed names. Run the script
in an empty directory, load the machine up so that transaction commits
take a long time to run, crash it, and about a third of the time there
will be dirents without inodes behind them, sometimes several hundred
of each file name.
After collecting a few more reports on IRC I found that there's two
sets of behaviors: on kernels before 5.14, fsync() correlates strongly
with the issue, while on 5.14 and later, fsync() correlates negatively.
I examined the applications that were using the directories where the
ghost dirents appeared, and divided them into a group of apps that call
fsync() and those that don't, and then matched those apps up with incident
reports. So there might be two distinct bugs leading to a similar result
(or it was always one bug but in 5.14 something changed the mechanics).
In the repro program I call fsync() on half the files and not on the
other half of the files, so it should work on kernels before and after
5.14.
Some more reports of the issue have come in over IRC and in GNOME
bug reports. They are reporting it on 5.10 which I haven't been able
to reproduce. I don't think there's anything special about 5.10's code,
but it's the kernel most people will be running now, so "a thing that
happens on all kernel versions before 5.14" will be reported much more
often on 5.10 than any other kernel at this time.
I have one case where a machine rebooted from 5.14 to 5.10 and the issue
was found after the reboot. The app involved is non-fsyncing, which
would suggest the problem occurred while running 5.14 and not during log
replay on 5.10. That data seems to fit with what we understand about
the problem, i.e. that it's putting junk in the log tree that is getting
replayed to create the ghost dirents at mount time after the crash.
> > If a transaction abort happened (fs turned into RO), than it's very
> > likely to have been fixed recently in 5.16-rc1 by:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9a35fc9542fa6c220d69987612b88c54cba2bc33
> >
> > The code paths fixed are triggered during a rename of a previously
> > fsynced file, and they are precisely about deleting directory items.
> >
> > IIRC, the problem Martin reported some months ago, he had transaction aborts.
>
> OK, if I can cherry-pick or backport that, I can test it in a lot more
> machines than I can run misc-next on. With a multi-year gap between
> incidents, it will be some time before I can confirm any fix works.
>
> If this commit is in a -rc kernel anyway (and presumably heading to
> -stable?) then the reports on IRC should come to a stop on their own,
> no action required.
I haven't been able to reproduce this on misc-next (which has the
patch below), or 5.16-rc kernels, or any earlier kernel that I've
tested with the patch backported.
Maybe we should put this commit in -stable? 5.15 is still alive and
definitely affected.
> Thanks!
>
[-- Attachment #2: repro-ghost-dirent.cc --]
[-- Type: text/x-c++src, Size: 2154 bytes --]
#include <iostream>
#include <list>
#include <memory>
#include <random>
#include <sstream>
#include <string>
#include <thread>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
using namespace std;
int
main(int argc, char **argv)
{
(void)argc;
(void)argv;
const size_t thread_count = 16;
random_device rd;
uniform_int_distribution<default_random_engine::result_type> uid(
numeric_limits<default_random_engine::result_type>::min(),
numeric_limits<default_random_engine::result_type>::max()
);
default_random_engine generator(uid(rd));
list<shared_ptr<thread>> threads;
ostringstream data_str;
for (size_t x = 0; x < 1024; ++x) {
data_str << x << endl;
}
const string data = data_str.str();
for (size_t x = 0; x < thread_count; ++x) {
const size_t base = x * thread_count;
threads.push_back(make_shared<thread>([x, &data, base, &generator]() {
uniform_int_distribution<size_t> name_dist(base, base + thread_count - 1);
uniform_int_distribution<size_t> size_dist(1, data.size());
char thread_name[32];
sprintf(thread_name, "%zu ", x);
while (true) {
char name[32];
const size_t file_num = name_dist(generator);
sprintf(name, "file_%08zx", file_num);
char tmp_name[1024];
sprintf(tmp_name, "%s.tmp", name);
const int fd = open(tmp_name, O_WRONLY | O_CREAT | O_NOFOLLOW | O_EXCL, 0777);
if (fd < 0) {
cerr << "open " << tmp_name << " failed: " << strerror(errno) << endl;
}
const int rv = write(fd, data.data(), size_dist(generator));
if (rv < 1) {
cerr << "write " << tmp_name << " failed: " << strerror(errno) << endl;
}
if (file_num & 1) {
const int fv = fsync(fd);
if (fv) {
cerr << "fsync " << tmp_name << " failed: " << strerror(errno) << endl;
}
}
if (close(fd)) {
cerr << "close " << tmp_name << " failed: " << strerror(errno) << endl;
}
if (rename(tmp_name, name)) {
cerr << "rename " << tmp_name << " -> " << name << " failed: " << strerror(errno) << endl;
}
cerr << thread_name;
}
}));
}
for (auto i : threads) {
i->join();
}
return EXIT_SUCCESS;
}
prev parent reply other threads:[~2022-01-02 0:04 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-18 21:09 Ghost directory entries (previously "Unable to remove directory entry") Zygo Blaxell
2021-11-18 23:04 ` Filipe Manana
2021-11-19 3:20 ` Zygo Blaxell
2021-11-19 9:25 ` Filipe Manana
2021-11-20 15:04 ` Zygo Blaxell
2021-11-20 15:15 ` Martin Raiber
2021-11-20 16:03 ` Zygo Blaxell
2021-11-20 17:13 ` Filipe Manana
2021-11-20 19:36 ` Zygo Blaxell
2022-01-02 0:04 ` Zygo Blaxell [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YdDsAl0bx6DLZT+i@hungrycats.org \
--to=ce3g8jdj@umail.furryterror.org \
--cc=fdmanana@gmail.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=martin@urbackup.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox