From: Andrew Morton <akpm@linux-foundation.org>
To: Jesper Krogh <jesper@krogh.cc>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Many open/close on same files yeilds "No such file or directory".
Date: Thu, 8 May 2008 22:36:35 -0700
Message-ID: <20080508223635.523b8fa7.akpm@linux-foundation.org>
In-Reply-To: <4823DFA6.9010504@krogh.cc>
On Fri, 09 May 2008 07:22:46 +0200 Jesper Krogh <jesper@krogh.cc> wrote:
> Hi.
>
> A week has passed, the problem still persists, and I have done a fair
> amount of testing.
>
> I have now been able to reproduce it on 2 different servers: a
> dual-core, 8-CPU Sun X4600 (amd64) and a 2-CPU single-core Sun V20Z.
>
> It has been reproduced against 4 different disk arrays, one of them being
> the IDE-SCSI array (we have 2 of them) mentioned before. Another one is
> an iSCSI SAN. And lately, when trying to move off the "trouble systems", we
> hit the bug on an FC-SAN.
>
> I believe that we can rule out hardware now.
>
> I have reproduced it on both "old" ext3 volumes and freshly created
> ones.
>
> I haven't been able to reproduce it on the internal system disks of
> the servers, but thinking a bit more about the setup, it seems that
> the filesystem needs to have a significant load on its
> filesystem structures (not data) to trigger the bug. The typical
> environment where I found it was serving (the same) ~5GB of files off the
> NFS server to 48 dual-core machines; this basically never hits the
> actual disk (with 32GB of RAM in the machine). In this setup
> a single process could hit the bug on the server (the problem could
> also be seen over NFS from the clients).
>
> It is worth keeping in mind that the problem is "rare". Thus an
> application that only opens and reads a single file every 10 minutes
> or so will probably never hit it.
>
> When I alter the test script so that it just tries again to open the
> file, it succeeds on the second pass.
>
> From my perspective this definitely has something to do with how the
> OS caches/invalidates the filesystem structures it has in memory (or
> somewhere near that).
>
> I still haven't been able to produce a small pack-and-ship test that
> triggers the problem on every system. But suggestions are welcome on
> how to write a piece of code that tries to hit the problem, based on
> my descriptions.
It's weird.
> My feeling is that the script below may reveal the bug on any "busy"
> volume, where busy is lots of activity in the OS-cache of the volume,
> not on the actual drives.
By this do you mean that there has to be a lot of other activity on the
system to reproduce it? Stuff which is turning over memory?
Because one possibility is that the cached dentry got reclaimed by memory
pressure and we have some race bug which causes us to think that the file
doesn't exist.
(That still shouldn't happen because the dentry should be marked
recently-accessed, but perhaps the underlying inode gets reclaimed or
something. Grasping at straws here.)
> All suggestions are welcome.
>
> I have tried 4 different kernel versions:
> 2.6.20, 2.6.22, 2.6.24, 2.6.25.2
>
> Jesper
>
> Jesper Krogh wrote:
> > Hi list.
> >
> > I have a "fairly" reproducible problem. When a program opens and closes
> > the same file many times, it eventually ends up with a "no such file or
> > directory". Test program that can reproduce the problem on my setup:
> >
> > root@hest:~# cat test-file-c.c
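> > #include <stdlib.h>
> > #include <stdio.h>
> > #include <fcntl.h>
> > #include <unistd.h>
> >
> > int main(int argc, char *argv[]) {
> >     unsigned long i = 0;   /* number of successful opens so far */
> >     int fh;
> >     char *filename;
> >
> >     if (argc < 2) {
> >         fprintf(stderr, "usage: %s <file>\n", argv[0]);
> >         exit(1);
> >     }
> >     filename = argv[1];
> >
> >     /* Open and close the same file until open() fails. */
> >     while (1) {
> >         fh = open(filename, O_RDONLY);
> >         if (fh == -1) {
> >             perror("open");   /* report errno before anything can change it */
> >             printf("Failed to open %s\n", filename);
> >             printf("Open number: %lu\n", i);
> >             exit(10);
> >         }
> >         close(fh);
> >         i++;
> >     }
> >
> >     exit(0);   /* not reached */
> > }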
gee.
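Jesper notes above that when the test script simply tries again, the open succeeds on the second pass. One way to capture that in the C test program might look like the sketch below. This is only an illustration, not code from the thread: the helper names open_retry_enoent() and count_transient_failures(), the retry limit of 1, and the bounded iteration count are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` read-only and, if open() fails with
 * ENOENT, retry up to `max_retries` times. The number of retries that
 * were needed is stored in *retries. Returns the fd, or -1 with errno
 * set by the last failed open().
 */
int open_retry_enoent(const char *path, int max_retries, int *retries)
{
    assert(path != NULL && retries != NULL);
    *retries = 0;
    for (;;) {
        int fd = open(path, O_RDONLY);
        if (fd != -1)
            return fd;
        if (errno != ENOENT || *retries >= max_retries)
            return -1;
        (*retries)++;
    }
}

/*
 * Hypothetical driver: open and close `path` `iterations` times.
 * Returns how many opens only succeeded after a retry (the "transient"
 * failures described in the report), or -1 if an open failed even
 * after retrying.
 */
long count_transient_failures(const char *path, unsigned long iterations)
{
    long transient = 0;

    for (unsigned long i = 0; i < iterations; i++) {
        int retries;
        int fd = open_retry_enoent(path, 1, &retries);

        if (fd == -1) {
            fprintf(stderr, "open number %lu of %s failed: %s\n",
                    i, path, strerror(errno));
            return -1;
        }
        if (retries > 0) {
            transient++;
            fprintf(stderr, "open number %lu succeeded only on retry\n", i);
        }
        close(fd);
    }
    return transient;
}
```

Calling count_transient_failures(argv[1], n) from a small main() with a large n, on one of the affected volumes, would log each open that failed once and then succeeded, instead of exiting at the first failure. A non-zero count would be consistent with a transient lookup failure rather than the file really being gone.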