linux-fsdevel.vger.kernel.org archive mirror
From: Jens Axboe <jens.axboe@oracle.com>
To: Jamie Lokier <jamie@shareable.org>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: get_fs_excl/put_fs_excl/has_fs_excl
Date: Fri, 24 Apr 2009 07:58:37 +0200
Message-ID: <20090424055837.GX4593@kernel.dk>
In-Reply-To: <20090423212314.GK13326@shareable.org>

On Thu, Apr 23 2009, Jamie Lokier wrote:
> Jens Axboe wrote:
> > The intent was to add some sort of notification mechanism from the file
> > system to inform the IO scheduler (and others?) that this process is now
> > holding a file system wide resource. So if you have a low priority
> > process getting access to such a resource, you want to boost its
> > priority to avoid higher priority apps getting stuck behind it. Sort of a
> > poor man's priority inheritance.
> 
> Very closely related to this: I'm building something where I want one
> particular task to have absolute higher I/O priority than all other
> tasks.  No problem, use the lovely RT I/O priority facility.
> 
> But if that task needs access to a buffer or page which is already
> undergoing I/O started by another task - what happens?  I'd like the
> _I/O_ priority to be boosted in that case, so that the high priority
> task does not have to wait on a long queue of low priority I/Os.
> 
> E.g. this happens when the high priority task reads from a file, and a
> low priority task has already initiated readahead for that file.  It's
> a particular problem if the low priority task's I/O is queued behind a
> lot of other low priority I/O.
> 
> That can be avoided by just not reading the same files :-)  But more
> subtly, the high priority task may find itself waiting on metadata
> blocks which overlap metadata blocks from I/O in low priority tasks.
> The application can't easily avoid this.
> 
> So I'd like operations which wait for I/O to complete to compare the
> task's I/O priority with the I/O request already queued, and boost the
> request priority if it's lower, moving it forward in the elevator if
> necessary.
> 
> All this to guarantee a high I/O priority task has a maximum response
> time no matter what low priority I/O is doing.  Even O_DIRECT has to
> read metadata sometimes...

So presumably both the RT and normal task end up doing lock_page() on
the same page. Then __wait_on_bit_lock() uses
prepare_to_wait_exclusive() on the wait queue, which does FIFO ordering
of tasks. When IO completes, the first waiter is woken up. If the wait
queue was sorted by process priority, then lock_page() would honor the
task priority and make sure that the highest prio task got woken first.
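
As a rough illustration of the difference (a userspace sketch, not kernel
code): the struct and function names below are made up for this example, and
"lower prio value means higher priority" is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a wait queue, not wait_queue_head_t. */
struct waiter {
	int prio;		/* assumption: lower value = higher priority */
	const char *name;
	struct waiter *next;
};

struct wait_queue {
	struct waiter *head;
};

/* FIFO ordering: a new waiter always goes to the tail. */
static void wq_add_fifo(struct wait_queue *q, struct waiter *w)
{
	struct waiter **pp = &q->head;

	assert(q && w);
	while (*pp)
		pp = &(*pp)->next;
	w->next = NULL;
	*pp = w;
}

/*
 * Priority ordering: insert in front of the first waiter with a strictly
 * larger prio value. Waiters of equal priority stay in FIFO order.
 */
static void wq_add_sorted(struct wait_queue *q, struct waiter *w)
{
	struct waiter **pp = &q->head;

	assert(q && w);
	while (*pp && (*pp)->prio <= w->prio)
		pp = &(*pp)->next;
	w->next = *pp;
	*pp = w;
}

/* Wake one waiter: whoever is at the head of the queue. */
static struct waiter *wq_wake_one(struct wait_queue *q)
{
	struct waiter *w = q->head;

	if (w) {
		q->head = w->next;
		w->next = NULL;
	}
	return w;
}
```

With wq_add_fifo() a normal task that queued first is woken before an RT task
that queued later; with wq_add_sorted() the RT task is at the head regardless
of arrival order.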

> It seems if I/O priority boosting were implemented like this, that
> might solve the superblock priority thing too, without needing
> filesystem changes and generically for all metadata?

It's a different situation: one is waiting for some resource (the page)
to become available by being read in, so it's waiting for IO. The other
is holding some shared resource and then performing IO, potentially
waiting for that IO. In the latter case, the RT (or just higher)
priority task can't get access to the shared resource, so we can't do
much more than simply expedite the IO of the lower priority task. The
former case COULD be solved with prioritized wait queues.
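
To make the latter case concrete, here is a minimal userspace sketch of
"expedite the IO of the holder". It is only a model: struct task_model, the
model_* helpers and effective_ioprio() are hypothetical names that loosely
mirror the get/put/has naming in the subject line, and it again assumes that
a lower value means higher I/O priority.

```c
#include <assert.h>

/* Hypothetical model of a task, not the kernel's task_struct. */
struct task_model {
	int ioprio;	/* base I/O priority; assumption: lower = higher */
	int fs_excl;	/* nesting count of fs-wide resources held */
};

/* Mark the task as holding a file system wide resource. */
static void model_get_fs_excl(struct task_model *t)
{
	t->fs_excl++;
}

/* Drop one such resource; must pair with model_get_fs_excl(). */
static void model_put_fs_excl(struct task_model *t)
{
	assert(t->fs_excl > 0);
	t->fs_excl--;
}

static int model_has_fs_excl(const struct task_model *t)
{
	return t->fs_excl > 0;
}

/*
 * Priority an IO scheduler would use for this task's requests: while the
 * task holds an fs-wide resource, never worse than 'boost'. A task that is
 * already at or above 'boost' is left alone.
 */
static int effective_ioprio(const struct task_model *t, int boost)
{
	if (model_has_fs_excl(t) && t->ioprio > boost)
		return boost;
	return t->ioprio;
}
```

The boost only lasts while the count is non-zero, so once the holder drops
the resource its requests fall back to their base priority.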

-- 
Jens Axboe

