From: Trond Myklebust <trond.myklebust@primarydata.com>
To: NeilBrown <neilb@suse.com>
Cc: Anna Schumaker <anna.schumaker@netapp.com>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] SUNRPC: restore fair scheduling to priority queues.
Date: Sat, 26 Dec 2015 19:33:34 -0500
Message-ID: <CAHQdGtTekchy7+6phSJikhPYk_FW4G6fuFEHy8WPn_t2iRUvEA@mail.gmail.com>
In-Reply-To: <874mfjay1l.fsf@notabene.neil.brown.name>

On Tue, Dec 15, 2015 at 10:10 PM, NeilBrown <neilb@suse.com> wrote:
> On Wed, Dec 16 2015, Trond Myklebust wrote:
>
>> On Tue, Dec 15, 2015 at 6:44 PM, NeilBrown <neilb@suse.com> wrote:
>>>
>>> Commit: c05eecf63610 ("SUNRPC: Don't allow low priority tasks to pre-empt higher priority ones")
>>>
>>> removed the 'fair scheduling' feature from SUNRPC priority queues.
>>> This feature caused problems for some queues (send queue and session slot queue)
>>> but is still needed for others, particularly the tcp slot queue.
>>>
>>> Without fairness, reads (priority 1) can starve background writes
>>> (priority 0) so a streaming read can cause writeback to block
>>> indefinitely. This is not easy to measure with default settings as
>>> the current slot table size is much larger than the read-ahead size.
>>> However if the slot-table size is reduced (seen when backporting to
>>> older kernels with a limited size) the problem is easily demonstrated.
>>>
>>> This patch conditionally restores fair scheduling. It is now the
>>> default unless rpc_sleep_on_priority() is called directly. Then the
>>> queue switches to strict priority observance.
>>>
>>> As that function is called for both the send queue and the session
>>> slot queue and not for any others, this has exactly the desired
>>> effect.
>>>
>>> The "count" field that was removed by the previous patch is restored.
>>> A value of '255' means "strict priority queuing, no fair queuing".
>>> Any other value is a count of owners to be processed before switching
>>> to a different priority level, just like before.
<snip>
>> Are we sure there is value in keeping FLUSH_LOWPRI for background writes?
>
> There is currently also FLUSH_HIGHPRI for "for_reclaim" writes.
> Should they be allowed to starve reads?
>
> If you treated all reads and writes the same, then I can't see value in
> restoring fair scheduling. If there is any difference, then I suspect
> we do need the fairness.

I disagree. Reclaiming memory should always be able to pre-empt
"interactive" features such as read. Everything goes down the toilet
when we force the kernel into situations where it needs to swap.

Cheers
  Trond
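
The "count" field described in the quoted patch summary can be illustrated
with a small user-space model. This is only a sketch under stated
assumptions, not the SUNRPC code: the names prio_queue, pq_init, pq_add,
pq_next, NR_PRIO and FAIR_BATCH are hypothetical, tasks are reduced to
per-level counters, and the budget counts individual tasks rather than
owners. Only the 255 sentinel and the two modes (strict priority versus
fair rotation between levels) come from the description above.

```c
#include <assert.h>

#define NR_PRIO      3    /* 0 = low, 1 = normal, 2 = high (illustrative levels) */
#define COUNT_STRICT 255  /* sentinel from the patch description: no fair queuing */
#define FAIR_BATCH   2    /* hypothetical budget per level, must stay below 255 */

struct prio_queue {
    int pending[NR_PRIO]; /* number of waiting tasks at each level */
    int cur;              /* level currently being served (fair mode only) */
    unsigned char count;  /* COUNT_STRICT, or dequeues left before switching */
};

static void pq_init(struct prio_queue *q, int strict)
{
    for (int i = 0; i < NR_PRIO; i++)
        q->pending[i] = 0;
    q->cur = NR_PRIO - 1;
    q->count = strict ? COUNT_STRICT : FAIR_BATCH;
}

static void pq_add(struct prio_queue *q, int prio)
{
    assert(prio >= 0 && prio < NR_PRIO);
    q->pending[prio]++;
}

/* Returns the priority of the dequeued task, or -1 if nothing is waiting. */
static int pq_next(struct prio_queue *q)
{
    if (q->count == COUNT_STRICT) {
        /* Strict mode: always serve the highest non-empty level. */
        for (int p = NR_PRIO - 1; p >= 0; p--) {
            if (q->pending[p] > 0) {
                q->pending[p]--;
                return p;
            }
        }
        return -1;
    }

    /* Fair mode: stay on the current level while budget remains. */
    if (q->pending[q->cur] > 0 && q->count > 0) {
        q->pending[q->cur]--;
        q->count--;
        return q->cur;
    }

    /*
     * Budget spent or level empty: move to the next lower non-empty
     * level, wrapping around to the top. The last iteration revisits
     * the current level so it can continue if it is the only one busy.
     */
    for (int i = 1; i <= NR_PRIO; i++) {
        int p = (q->cur - i + NR_PRIO) % NR_PRIO;

        if (q->pending[p] > 0) {
            q->cur = p;
            q->count = FAIR_BATCH - 1; /* this dequeue uses one unit */
            q->pending[p]--;
            return p;
        }
    }
    return -1;
}
```

In this model, four queued reads (level 1) and one background write
(level 0) are served in the order read, read, write, read, read in fair
mode. In strict mode the write is served only after every read has gone,
so a steady stream of new reads would hold it back indefinitely, which is
the starvation case described in the quoted text.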