From: Jamie Lokier <jamie@shareable.org>
To: Theodore Tso <tytso@mit.edu>, Jens Axboe <jens.axboe@oracle.com>,
    Christoph Hellwig <hch@lst.de>,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-ext4@vger.kernel.org
Subject: Re: get_fs_excl/put_fs_excl/has_fs_excl
Date: Mon, 27 Apr 2009 15:47:42 +0100
Message-ID: <20090427144742.GC4885@shareable.org>
In-Reply-To: <20090427113356.GC9059@mit.edu>

Theodore Tso wrote:
> *) Do we only care about processes whose I/O priority is below the
> default? (i.e., either in the idle class, or in a low-priority
> best-effort class) What if the concern is a real-time process
> which is being blocked by a default I/O priority process taking its
> time while holding some fs-wide resource?
>
> If the answer to the previous question is no, it becomes more
> reasonable to consider bumping the submission priority of the process
> in question to the highest priority "best-effort" level. After
> all, if this truly is a "filesystem-wide" resource, then no one is
> going to make forward progress relating to this block device unless
> and until the filesystem-wide lock is resolved. Also, if we don't
> allow this situation to return to userspace, presumably the
> kernel-code involved will only be writing to the block-device in
> question. (This might not be entirely true in the case of the
> sendfile(2) syscall, but currently we can only read from
> filesystems with sendfile, and so presumably a filesystem would
> never call get_fs_excl while servicing a sendfile request.)
>
> *) Is implementing the bulk of this in the cfq scheduler really the
> best place to do this? To explore something completely different,
> what if the filesystem simply explicitly set I/O priority levels in
> its block I/O submissions, and provided optional callback functions
> which could be used by the page writeback routines to determine the
> appropriate I/O priority level that should be used given a
> particular filesystem and inode number. (That actually could be
> used to provide another cool function --- we could expose to
> userspace the concept that particular inode should always have its
> I/O go out with a higher priority, perhaps via chattr flag.)
>
> Basically, the argument here is that we already have the
> appropriate mechanism for ordering I/O requests, which is I/O
> priority mechanism, and the policy really needs to be set by the
> filesystem --- and it might be far more than just "do we have a
> filesystem-wide exclusive lock" or not.

Personally, I'm interested in the following:

   - A process with RT I/O priority and RT CPU priority is reading
     a series of files from disk.  It should be very reliable at this.

   - Other normal I/O priority and normal CPU priority processes are
     reading and writing the disk.

I would like the first process to have a guaranteed minimum I/O
performance: it should continuously make progress, even when it needs
to read some file metadata which overlaps a page affected by the other
processes.  I don't mind all the interference from disk head seeks and
so on, but I would like the I/O that the first process depends on to
have RT I/O priority - including when it's waiting on I/O initiated by
another process and the normal I/O priority queue is full.

So, I'm not exactly sure, but I think what I need for that is:

   - I/O priority boosting (re-queuing in the elevator) to fix the
     inversion when waiting on I/O which was previously queued with
     normal I/O priority, and

   - Task priority boosting when waiting on a filesystem resource
     which is held by a normal priority task.
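
To make the first item concrete, here is a rough toy model in Python.
It is only a sketch of the re-queuing idea, not kernel code and not how
cfq works: ToyElevator, submit(), boost(), dispatch() and the RT/BE
constants are all names invented for this illustration.  A pending
request that an RT task turns out to depend on is re-queued at RT
priority, keeping its original arrival order.

```python
import heapq
import itertools

# Lower value dispatches first.  These two levels are assumptions of
# this sketch, standing in for the real-time and best-effort classes.
RT, BE = 0, 1


class ToyElevator:
    """Toy request queue ordered by (priority, arrival order)."""

    def __init__(self):
        self._heap = []                # entries: [prio, seq, request_id]
        self._entries = {}             # request_id -> live heap entry
        self._seq = itertools.count()  # arrival order breaks ties

    def submit(self, request_id, prio):
        entry = [prio, next(self._seq), request_id]
        self._entries[request_id] = entry
        heapq.heappush(self._heap, entry)

    def boost(self, request_id, prio):
        """Re-queue a still-pending request at a more urgent priority."""
        entry = self._entries.get(request_id)
        if entry is None or entry[0] <= prio:
            return False               # already dispatched, or already as urgent
        entry[2] = None                # mark the old entry dead (lazy deletion)
        new_entry = [prio, entry[1], request_id]  # keep original arrival order
        self._entries[request_id] = new_entry
        heapq.heappush(self._heap, new_entry)
        return True

    def dispatch(self):
        """Pop the most urgent live request, or None if nothing is pending."""
        while self._heap:
            _prio, _seq, request_id = heapq.heappop(self._heap)
            if request_id is not None:
                del self._entries[request_id]
                return request_id
        return None


elv = ToyElevator()
elv.submit("be-write-1", BE)
elv.submit("be-metadata-read", BE)
elv.submit("rt-data-read", RT)
# The RT task discovers it is waiting on the metadata page:
elv.boost("be-metadata-read", RT)
order = [elv.dispatch() for _ in range(3)]
```

In this model the boosted metadata read is dispatched ahead of the
unrelated normal-priority write, which is the inversion fix I have in
mind.  A real elevator would also have to deal with merged requests and
requests already handed to the device, which this ignores.
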

(I'm not sure if generic task priority boosting is already addressed
to some extent in the RT-PREEMPT Linux tree.)
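
The second item is ordinary priority inheritance.  Again as a hedged
toy model only: Task, FsResource and the numeric priorities are
hypothetical names for this sketch, it covers a single resource with no
chains of locks, and it is not a description of what any kernel tree
actually does.

```python
class Task:
    def __init__(self, name, prio):
        self.name = name
        self.base_prio = prio  # lower number = more urgent (assumption)
        self.prio = prio       # effective priority, may be boosted


class FsResource:
    """Toy filesystem-wide resource with priority inheritance."""

    def __init__(self):
        self.holder = None
        self.waiters = []

    def acquire(self, task):
        """Return True if the task now holds the resource, False if it waits."""
        if self.holder is None:
            self.holder = task
            return True
        self.waiters.append(task)
        # Boost the holder so it runs at least as urgently as its waiter.
        if task.prio < self.holder.prio:
            self.holder.prio = task.prio
        return False

    def release(self):
        """Drop the inherited boost and hand over to the most urgent waiter."""
        self.holder.prio = self.holder.base_prio
        self.waiters.sort(key=lambda t: t.prio)  # stable: FIFO among equals
        self.holder = self.waiters.pop(0) if self.waiters else None
        return self.holder


rt_reader = Task("rt-reader", 0)
be_writer = Task("be-writer", 4)
res = FsResource()
res.acquire(be_writer)            # normal-priority task takes the resource
res.acquire(rt_reader)            # RT task has to wait, so the holder is boosted
boosted_prio = be_writer.prio     # holder's effective priority while boosted
next_holder = res.release()       # boost is dropped, RT task takes over
```

While the RT task waits, the normal-priority holder runs at the RT
task's priority, and it falls back to its own priority on release.
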
-- Jamie