From: "Theodore Ts'o" <tytso@mit.edu>
To: Timo Sirainen <tss@iki.fi>
Cc: linux-kernel@vger.kernel.org
Subject: Re: readdir loses renamed files
Date: Mon, 25 Oct 2004 08:37:23 -0400
Message-ID: <20041025123722.GA5107@thunk.org>
In-Reply-To: <431547F9-2624-11D9-8AC3-000393CC2E90@iki.fi>

On Mon, Oct 25, 2004 at 04:21:57AM +0300, Timo Sirainen wrote:
> I'd have thought this had already been asked many times before, but 
> google didn't show me anything.
> 
> My problem is that mails in a large maildir get temporarily lost. This 
> happens because readdir() never returns a file which was just rename()d 
> by another process. Either new or the old name would have been fine, 
> but it's not returned at all.
> 
> Is there a chance this could get fixed? Every OS/filesystem I've tested 
> so far has had the same problem, so I'll have to implement some extra 
> locking anyway (so much for maildir being lockless), but it would be 
> nice to have at least one OS where it works without the extra locking 
> overhead.

In some cases the file won't just get lost; the old and new name can
both be returned.  For example, assume a simple non-tree, linked-list
implementation of a directory, such as can be found in Solaris's ufs,
BSD 4.3's FFS, Linux's ext2 and minix filesystems, and many others.
Suppose the directory is tightly packed (i.e., no gaps), with the
directory entry "foo" at the beginning of the file, and readdir() has
already returned that "foo" entry when some other application renames
it to "Supercalifragilisticexpialidocious".  The new name will not fit
in the old name's directory location, so it will be placed at the end
of the directory, where it will be returned by readdir() a second
time.
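
The scenario above can be sketched with a toy model.  This is not code
from any real filesystem; the names toy_dir, toy_slot, toy_add,
toy_rename, toy_readdir and toy_demo_hits are all made up for
illustration.  The model assumes each slot can hold only a name as long
as the one it was created with, so a longer name has to be appended at
the end:

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SLOTS 8
#define TOY_NAME_MAX  64

/* Hypothetical packed directory: one slot per entry, sized for its name. */
struct toy_slot {
    size_t capacity;            /* bytes the slot can hold for a name */
    char   name[TOY_NAME_MAX];  /* empty string marks a freed slot */
    int    inode;
};

struct toy_dir {
    struct toy_slot slots[TOY_MAX_SLOTS];
    size_t used;                /* slots handed out so far, freed ones included */
};

/* Append a new entry at the end of the directory. */
int toy_add(struct toy_dir *d, const char *name, int inode)
{
    size_t len = strlen(name);
    if (len == 0 || len >= TOY_NAME_MAX || d->used == TOY_MAX_SLOTS)
        return -1;
    struct toy_slot *s = &d->slots[d->used++];
    s->capacity = len;
    strcpy(s->name, name);
    s->inode = inode;
    return 0;
}

/* Rename in place if the new name fits; otherwise append and free the old slot. */
int toy_rename(struct toy_dir *d, const char *old_name, const char *new_name)
{
    for (size_t i = 0; i < d->used; i++) {
        struct toy_slot *s = &d->slots[i];
        if (strcmp(s->name, old_name) != 0)
            continue;
        if (strlen(new_name) <= s->capacity) {
            strcpy(s->name, new_name);
            return 0;
        }
        if (toy_add(d, new_name, s->inode) != 0)
            return -1;
        s->name[0] = '\0';
        return 0;
    }
    return -1;
}

/* Return the next live entry at or after *cursor, or NULL at the end. */
const struct toy_slot *toy_readdir(const struct toy_dir *d, size_t *cursor)
{
    while (*cursor < d->used) {
        const struct toy_slot *s = &d->slots[(*cursor)++];
        if (s->name[0] != '\0')
            return s;
    }
    return NULL;
}

/* Count how often inode 1 is returned when it is renamed in mid-scan. */
int toy_demo_hits(void)
{
    struct toy_dir d = { .used = 0 };
    size_t cursor = 0;
    int hits = 0;
    const struct toy_slot *s;

    toy_add(&d, "foo", 1);
    toy_add(&d, "bar", 2);
    toy_add(&d, "baz", 3);

    s = toy_readdir(&d, &cursor);           /* reader sees "foo" */
    if (s != NULL && s->inode == 1)
        hits++;

    /* Another process renames "foo"; the long name lands at the end. */
    toy_rename(&d, "foo", "Supercalifragilisticexpialidocious");

    while ((s = toy_readdir(&d, &cursor)) != NULL)
        if (s->inode == 1)
            hits++;
    return hits;
}
```

In this model toy_demo_hits() returns 2: the reader sees the same file
once as "foo" and once under its new name.  The opposite outcome falls
out of the same model if the new entry lands in a slot the reader has
already passed.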

This is not a bug; the POSIX specification explicitly allows this
behavior.  If a file is renamed during a readdir() session of a
directory, it is undefined whether neither, either, or both of the
new and old filenames will be returned.

And that's because there's no good way to do this without trashing the
performance of the system, especially when most applications don't
care.  (Do you really want your entire system running significantly
slower, penalizing all other applications on your system, just because
of one stupid/badly-written application?)  In order to do this, the
kernel would have to atomically snapshot the directory --- even one
which might be several megabytes in length, and store a copy of it in
memory, while the application calls readdir().  Several processes
could perform a denial-of-service attack by starting to call
readdir(), and then stopping.  This would end up locking up huge
amounts of non-pageable system memory, and cause the system to come
down in a hurry.
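
Since the kernel won't snapshot the directory, an application that
cares has to cope in userspace.  As a minimal sketch, assuming no two
names in the directory are hard links to the same file, the "both
names returned" case can be filtered by remembering the d_ino values
already seen.  The function name count_unique_entries and the MAX_SEEN
limit are made up for this example:

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define MAX_SEEN 4096   /* arbitrary cap for this sketch */

/*
 * Count directory entries, treating repeated inode numbers as one file.
 * Returns the count, or -1 on error or if the cap is exceeded.
 */
long count_unique_entries(const char *path)
{
    ino_t seen[MAX_SEEN];
    size_t nseen = 0;
    struct dirent *de;
    int err;

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    for (;;) {
        errno = 0;              /* lets us tell end-of-directory from an error */
        de = readdir(dir);
        if (de == NULL)
            break;
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        int dup = 0;
        for (size_t i = 0; i < nseen; i++) {
            if (seen[i] == de->d_ino) {
                dup = 1;        /* same file already returned under another name */
                break;
            }
        }
        if (dup)
            continue;
        if (nseen == MAX_SEEN) {
            closedir(dir);
            return -1;
        }
        seen[nseen++] = de->d_ino;
    }

    err = errno;
    closedir(dir);
    return err != 0 ? -1 : (long)nseen;
}
```

This only handles duplicates.  A file that is returned under neither
name cannot be detected from a single pass, so the application would
still need to rescan or do its own locking.  The linear search is
quadratic in the number of entries; a large directory would want a
hash set instead.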

							- Ted
