From: Andrew Morton <akpm@linux-foundation.org>
To: Jesper Krogh <jesper@krogh.cc>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Many open/close on same files yeilds "No such file or directory".
Date: Thu, 1 May 2008 22:39:38 -0700
Message-ID: <20080501223938.921f7cd2.akpm@linux-foundation.org>
In-Reply-To: <4819E316.7000607@krogh.cc>
On Thu, 01 May 2008 17:34:46 +0200 Jesper Krogh <jesper@krogh.cc> wrote:
> Hi list.
>
> I have a "fairly" reproducible problem. When a program opens and closes
> the same file many times, it eventually ends up with a "no such file or
> directory". Test program that can reproduce the problem on my setup:
>
> root@hest:~# cat test-file-c.c
> #include <stdlib.h>
> #include <stdio.h>
> #include <fcntl.h>
> #include <unistd.h>
>
> int main(int argc, char *argv[]) {
>     unsigned long i = 0;
>     int fh;
>     char *filename;
>
>     filename = argv[1];
>
>     /* Open and close the same file until open() fails. */
>     while (1) {
>         fh = open(filename, O_RDONLY);
>         if (fh == -1) {
>             printf("Failed to open %s\n", filename);
>             printf("Open number: %lu\n", i);
>             exit(10);
>         }
>         close(fh);
>         i++;
>     }
>
>     exit(0);
> }
> root@hest:~# ./test-file-c /z/bio/databases/online/index/index-by-accno
> Failed to open /z/bio/databases/online/index/index-by-accno
> Open number: 61785000
> root@hest:~# ./test-file-c /z/bio/databases/online/index/index-by-accno
> Failed to open /z/bio/databases/online/index/index-by-accno
> Open number: 120929685
> (The problem is not isolated to a single file on the filesystem).
>
What an amazing bug.
> strace on the program reveals that the system indeed returns a "No such
> file or directory" to the program.
>
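The quoted test program prints only the iteration count, so the error code had to be read from strace. A minimal sketch of a variant that captures errno directly is below. The helper name count_opens and the max_iters bound are assumptions added for illustration and are not part of the original report.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original report): open and close
 * `path` up to `max_iters` times. Returns the number of successful
 * opens. If an open() fails, the errno of that call is stored in
 * *err_out; otherwise *err_out is set to 0.
 */
unsigned long count_opens(const char *path, unsigned long max_iters,
                          int *err_out)
{
    unsigned long i;

    *err_out = 0;
    for (i = 0; i < max_iters; i++) {
        int fd = open(path, O_RDONLY);
        if (fd == -1) {
            *err_out = errno;  /* save before another call can overwrite it */
            break;
        }
        close(fd);
    }
    return i;
}
```

A caller could pass the saved value to strerror() to print the reason next to the iteration count, which would show ENOENT without needing strace.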
> This is run on an Ubuntu Gutsy (vendor kernel): 2.6.22-14-server on a
> 4.5TB ext3 filesystem on an LVM volume. The volume was created on
> dapper (2 releases back) and has simply been carried along through the
> upgrades.
The test program is (almost) all in RAM and won't care about the hardware.
> I cannot reproduce it on other disks attached to the same server or on
> other servers attached to similar disksystems.
hmm.
I guess it would be interesting to remount that filesystem with `noatime'
to eliminate the last bit of I/O and block-related code.
> The filesystem was taken offline yesterday for a forced fsck and it was
> found to be clean.
>
> The disk array is a quite old Fibrenetix FX1200 with 12 PATA disks
> in raid5 (with hotspare) exposed to the OS as 3 SCSI disks of
> 2+2+0.5TB assembled with LVM afterwards. The SCSI controller is a:
> 05:08.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X
> Fusion-MPT Dual Ultra320 SCSI (rev c1)
>
> What suggestions do you have to solve this problem?
>
> I'm about to mkfs.ext3 the volume and spool it back in from the backup,
> but somehow I'm not convinced that it will solve the problem at all.
> It may just be a hardware problem, but dmesg doesn't report anything.
>
> We actually got the problem from a perl-script, but this seems to be the
> minimal program that reproduces the problem.
I'd suspect that after 1e8 loops your CPU got too hot and started to
misbehave.