From: Andrew Morton <akpm@linux-foundation.org>
To: Jesper Krogh <jesper@krogh.cc>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Many open/close on same files yeilds "No such file or directory".
Date: Thu, 1 May 2008 22:39:38 -0700
Message-ID: <20080501223938.921f7cd2.akpm@linux-foundation.org>
In-Reply-To: <4819E316.7000607@krogh.cc>
On Thu, 01 May 2008 17:34:46 +0200 Jesper Krogh <jesper@krogh.cc> wrote:
> Hi list.
>
> I have a "fairly" reproducible problem. When a program opens and closes
> the same file many times, it eventually ends up with a "no such file or
> directory". Test program that can reproduce the problem on my setup:
>
> root@hest:~# cat test-file-c.c
> #include <stdlib.h>
> #include <stdio.h>
> #include <fcntl.h>
> #include <unistd.h>
>
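> int main(int argc, char *argv[]) {
>         unsigned long i = 0;
>         int fh;
>         char *filename;
>
>         /* Require the file to test as the only argument. */
>         if (argc < 2) {
>                 fprintf(stderr, "usage: %s <file>\n", argv[0]);
>                 exit(1);
>         }
>         filename = argv[1];
>
>         /* Open and close the same file until open() fails. */
>         while (1) {
>                 fh = open(filename, O_RDONLY);
>                 if (fh == -1) {
>                         printf("Failed to open %s\n", filename);
>                         /* i is unsigned long, so print it with %lu */
>                         printf("Open number: %lu\n", i);
>                         exit(10);
>                 }
>                 close(fh);
>                 i++;
>         }
>
>         exit(0);
> }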
> root@hest:~# ./test-file-c /z/bio/databases/online/index/index-by-accno
> Failed to open /z/bio/databases/online/index/index-by-accno
> Open number: 61785000
> root@hest:~# ./test-file-c /z/bio/databases/online/index/index-by-accno
> Failed to open /z/bio/databases/online/index/index-by-accno
> Open number: 120929685
> (The problem is not isolated to a single file on the filesystem).
>
What an amazing bug.
> strace on the program reveals that the system does indeed return "No such
> file or directory" to the program.
>
> This is run on Ubuntu Gutsy (vendor kernel): 2.6.22-14-server on a
> 4.5TB ext3 filesystem on an LVM volume. The volume was created on
> Dapper (2 releases back) and has simply been carried along through the
> upgrades.
The test program is (almost) all in RAM and won't care about the hardware.
> I cannot reproduce it on other disks attached to the same server or on
> other servers attached to similar disksystems.
hmm.
I guess it would be interesting to remount that filesystem with `noatime'
to eliminate the last bit of I/O and block-related code.
> The filesystem was taken offline yesterday for a forced fsck and it was
> found to be clean.
>
> The disk array is a quite old Fibrenetix FX1200 with 12 PATA disks
> in RAID5 (with hot spare), exposed to the OS as 3 SCSI disks of
> 2+2+0.5TB and assembled with LVM afterwards. The SCSI controller is a:
> 05:08.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X
> Fusion-MPT Dual Ultra320 SCSI (rev c1)
>
> What suggestions do you have to solve this problem?
>
> I'm about to mkfs.ext3 the volume and spool it back in from the backup,
> but somehow I'm not convinced that it will solve the problem at all.
> It may just be a hardware problem, but dmesg doesn't show anything.
>
> We actually got the problem from a perl-script, but this seems to be the
> minimal program that reproduces the problem.
I'd suspect that after 1e8 loops your CPU got too hot and started to
misbehave.
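
One way to narrow this down would be to have the test loop record the
errno of the failing open() and retry the open straight away: a retry
that succeeds points at a transient lookup failure rather than a file
that is really gone. What follows is only a minimal sketch under that
assumption; the names open_close_loop, report_open_loop and
struct open_loop_result are made up for illustration and appear nowhere
in the original test program.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Result of one stress run; all names here are illustrative. */
struct open_loop_result {
        unsigned long iterations;  /* successful open/close pairs */
        int saved_errno;           /* errno of the failed open(), 0 if none */
        int retry_ok;              /* 1 if an immediate retry succeeded */
};

/*
 * Open and close `path` up to `max_iter` times. Stop at the first
 * failed open(), save its errno, and retry the open exactly once.
 */
struct open_loop_result open_close_loop(const char *path,
                                        unsigned long max_iter)
{
        struct open_loop_result r = { 0, 0, 0 };

        while (r.iterations < max_iter) {
                int fd = open(path, O_RDONLY);

                if (fd == -1) {
                        r.saved_errno = errno;
                        /* Immediate retry: is the failure transient? */
                        fd = open(path, O_RDONLY);
                        if (fd != -1) {
                                r.retry_ok = 1;
                                close(fd);
                        }
                        break;
                }
                close(fd);
                r.iterations++;
        }
        return r;
}

/* Print a one-line summary of a run to stderr. */
void report_open_loop(const char *path, struct open_loop_result r)
{
        if (r.saved_errno == 0) {
                fprintf(stderr, "%s: %lu opens, no failure\n",
                        path, r.iterations);
                return;
        }
        fprintf(stderr, "%s: open after %lu successes failed: %s; retry %s\n",
                path, r.iterations, strerror(r.saved_errno),
                r.retry_ok ? "succeeded" : "also failed");
}
```

A small main() could call open_close_loop(argv[1], ULONG_MAX) (with
<limits.h> included) and pass the result to report_open_loop(). A saved
errno of ENOENT together with a successful retry would match the
intermittent failure described in the report.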