linux-fsdevel.vger.kernel.org archive mirror
From: Phillip Susi <psusi@cfl.rr.com>
To: Jamie Lokier <jamie@shareable.org>
Cc: linux-fsdevel@vger.kernel.org,
	Linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: readahead on directories
Date: Fri, 23 Apr 2010 00:13:15 -0400
Message-ID: <1271995995.2855.48.camel@faldara>
In-Reply-To: <20100422224327.GE13951@shareable.org>

On Thu, 2010-04-22 at 23:43 +0100, Jamie Lokier wrote:
> No, that is not the reason.  pwrite needs the mutex too.

Which mutex and what for?

> Now you are describing using threads in the blocking cases.  (Work
> queues, thread pools, same thing.)  Earlier you were saying threads
> are the wrong approach....  Good, good :-)

Sure, in some cases, just not ALL.  If you can't control whether or not
the call blocks then you HAVE to use threads.  If you can be sure it
won't block most of the time, then most of the time you don't need any
other threads, and when you finally do, you need very few.
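
A minimal userspace sketch of that idea, assuming an operation that can
be asked not to block: try it inline first, and only hand it to a helper
thread when it reports that it would have blocked.  The names struct op,
submit_op, demo_src and demo_read are hypothetical and exist only for
this illustration; they are not any kernel or libc interface.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical operation: run(arg, 0) must never block and returns -EAGAIN
 * if it cannot finish without blocking; run(arg, 1) is allowed to block. */
struct op {
    int (*run)(void *arg, int may_block);
    void *arg;
    int result;
};

/* Helper-thread body: only reached when the fast path would have blocked. */
static void *op_worker(void *p)
{
    struct op *op = p;
    op->result = op->run(op->arg, 1);
    return NULL;
}

/* Returns 0 if the operation completed inline, 1 if it was handed to a
 * helper thread (the caller must pthread_join(*helper) before reading
 * op->result), or -1 if the helper thread could not be created. */
int submit_op(struct op *op, pthread_t *helper)
{
    assert(op != NULL && op->run != NULL && helper != NULL);
    op->result = op->run(op->arg, 0);
    if (op->result != -EAGAIN)
        return 0;
    return pthread_create(helper, NULL, op_worker, op) == 0 ? 1 : -1;
}

/* Stand-in for a read that blocks only when the data is not cached. */
struct demo_src {
    int cached;
    int len;
};

int demo_read(void *arg, int may_block)
{
    struct demo_src *src = arg;
    if (!src->cached && !may_block)
        return -EAGAIN;   /* would have to wait for the disk */
    return src->len;      /* pretend the read completed */
}
```

The sketch starts one thread per blocked operation to stay short; a small
pool would serve the same purpose when blocking is rare.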

> A big problem with it, apart from having to change lots of places in
> all the filesystems, is that the work-queues run with the wrong
> security and I/O context.  Network filesystems break permissions, quotas
> break, ionice doesn't work, etc.  It's obviously fixable but more
> involved than just putting a read request on a work queue.

Hrm... good point.

> Fine-grained locking isn't the same thing as using non-sleepable locks.

Yes, it is not the same, but non-sleepable locks can ONLY be used with
fine-grained locking.  The two reasons to use a mutex instead of a spin
lock are that you can sleep while holding it, and that holding it for an
extended period of time isn't a problem.
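
As a rough userspace analogy only (POSIX threads, not the kernel's
spinlock_t and struct mutex), the distinction might look like the sketch
below.  The spin lock guards a few instructions that can never sleep; the
mutex guards a section that calls write(), which may block.  The types
hit_counter and log_sink and their functions are hypothetical names made
up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Spin lock: fine-grained, held for a handful of instructions, never
 * across anything that could sleep. */
struct hit_counter {
    pthread_spinlock_t lock;
    unsigned long hits;
};

int hit_counter_init(struct hit_counter *c)
{
    assert(c != NULL);
    c->hits = 0;
    return pthread_spin_init(&c->lock, PTHREAD_PROCESS_PRIVATE);
}

void hit_counter_bump(struct hit_counter *c)
{
    pthread_spin_lock(&c->lock);
    c->hits++;                      /* short and non-blocking */
    pthread_spin_unlock(&c->lock);
}

/* Mutex: waiters sleep instead of spinning, so it is acceptable to hold
 * it across a call that may block for a long time. */
struct log_sink {
    pthread_mutex_t lock;
    int fd;
};

int log_sink_init(struct log_sink *s, int fd)
{
    assert(s != NULL);
    s->fd = fd;
    return pthread_mutex_init(&s->lock, NULL);
}

ssize_t log_sink_write(struct log_sink *s, const void *buf, size_t len)
{
    ssize_t n;

    pthread_mutex_lock(&s->lock);
    n = write(s->fd, buf, len);     /* may block while the lock is held */
    pthread_mutex_unlock(&s->lock);
    return n;
}
```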

> So is read().  And then the calling application usually exits, because
> there's nothing else it can do usefully.  Same if aio_read() ever returns ENOMEM.
> 
> That way lies an application getting ENOMEM often and having to retry
> aio_read in a loop, probably a busy one, which isn't how the interface
> is supposed to work, and is not efficient either.

Simply retrying in a loop would be very stupid.  The programs using aio
are neither simple nor stupid, so they would take more appropriate
action.  For example, a server might decide it already has enough data
in the pipe and forget about asking for more until the queues empty, or
it might decide to drop that client, which would free up some more
memory, or it might decide it has some cache it can free up.  Something
like readahead could decide that if there isn't enough memory left then
it has no business trying to read any more, and exit.  Any of these is
preferable to waiting for something else to free up enough memory to
continue.
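
One way to picture those choices is a small policy function that maps a
failed submission to an action instead of retrying blindly.  This is a
sketch under stated assumptions: struct server_state, enum
pressure_action and next_action are hypothetical, the order in which the
options are tried is arbitrary, and err stands for the errno left by a
failed submission.  POSIX aio_read() reports resource exhaustion as
EAGAIN; ENOMEM is included because it is the case being discussed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum pressure_action {
    ACT_PROCEED,         /* submission succeeded, carry on */
    ACT_WAIT_FOR_DRAIN,  /* enough data in flight; ask again once queues empty */
    ACT_FREE_CACHE,      /* release some of our own cache, then try again */
    ACT_DROP_CLIENT,     /* shed this client to free memory */
    ACT_FAIL             /* nothing sensible left to do (readahead: just exit) */
};

/* Hypothetical per-client bookkeeping a server might keep. */
struct server_state {
    size_t queued_bytes;   /* data already queued for this client */
    size_t enough_bytes;   /* level at which more reads are not urgent */
    size_t cache_bytes;    /* cache the server could give up */
    int client_droppable;  /* nonzero if dropping the client is acceptable */
};

enum pressure_action next_action(int err, const struct server_state *s)
{
    assert(s != NULL);
    if (err == 0)
        return ACT_PROCEED;
    if (err != ENOMEM && err != EAGAIN)
        return ACT_FAIL;               /* not a memory-pressure error */
    if (s->queued_bytes >= s->enough_bytes)
        return ACT_WAIT_FOR_DRAIN;     /* no need for more data yet */
    if (s->cache_bytes > 0)
        return ACT_FREE_CACHE;
    if (s->client_droppable)
        return ACT_DROP_CLIENT;
    return ACT_FAIL;
}
```

None of the branches loops or sleeps; each returns a decision the caller
can act on from its normal event loop.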

