From: Ingo Molnar <mingo@elte.hu>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
David Miller <davem@davemloft.net>,
wenji@fnal.gov, akpm@osdl.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
Date: Thu, 30 Nov 2006 11:32:40 +0100 [thread overview]
Message-ID: <20061130103240.GA25733@elte.hu> (raw)
In-Reply-To: <20061130102205.GA20654@2ka.mipt.ru>
* Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
> > David's line of thinking for a solution sounds better to me. This
> > patch does not prevent the process from being preempted (for
> > potentially a long time), by any means.
>
> It steals timeslices from other processes to complete tcp_recvmsg()
> task, and only when it does it for too long, it will be preempted.
> Processing backlog queue on behalf of need_resched() will break
> fairness too - processing itself can take a lot of time, so process
> can be scheduled away in that part too.
correct - it's just the wrong thing to do. The '10% performance win'
that was measured was against _9 other tasks who contended for the same
CPU resource_. I.e. it's /not/ an absolute 'performance win' AFAICS,
it's a simple shift in CPU cycles away from the other 9 tasks and
towards the task that does TCP receive.
Note that even without the change the TCP receiving task is already
getting a disproportionate share of cycles due to softirq processing!
Under a load of 10.0 it went from 500 mbits to 74 mbits, while the
'fair' share would be 50 mbits. So the TCP receiver /already/ has an
unfair advantage. The patch only deepends that unfairness.
The solution is really simple and needs no kernel change at all: if you
want the TCP receiver to get a larger share of timeslices then either
renice it to -20 or renice the other tasks to +19.
The other disadvantage, even ignoring that it's the wrong thing to do,
is the crudeness of preempt_disable() that i mentioned in the other
post:
---------->
independently of the issue at hand, in general the explicit use of
preempt_disable() in non-infrastructure code is quite a heavy tool. Its
effects are heavy and global: it disables /all/ preemption (even on
PREEMPT_RT). Furthermore, when preempt_disable() is used for per-CPU
data structures then [unlike for example to a spin-lock] the connection
between the 'data' and the 'lock' is not explicit - causing all kinds of
grief when trying to convert such code to a different preemption model.
(such as PREEMPT_RT :-)
So my plan is to remove all "open-coded" use of preempt_disable() [and
raw use of local_irq_save/restore] from the kernel and replace it with
some facility that connects data and lock. (Note that this will not
result in any actual changes on the instruction level because internally
every such facility still maps to preempt_disable() on non-PREEMPT_RT
kernels, so on non-PREEMPT_RT kernels such code will still be the same
as before.)
Ingo
next prev parent reply other threads:[~2006-11-30 10:32 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-11-30 1:56 [patch 1/4] - Potential performance bottleneck for Linxu TCP Wenji Wu
2006-11-30 2:19 ` David Miller
2006-11-30 6:17 ` Ingo Molnar
2006-11-30 6:30 ` David Miller
2006-11-30 6:47 ` Ingo Molnar
2006-11-30 7:12 ` David Miller
2006-11-30 7:35 ` Ingo Molnar
2006-11-30 9:52 ` Evgeniy Polyakov
2006-11-30 10:07 ` Nick Piggin
2006-11-30 10:22 ` Evgeniy Polyakov
2006-11-30 10:32 ` Ingo Molnar [this message]
2006-11-30 17:04 ` Wenji Wu
2006-11-30 20:20 ` Ingo Molnar
2006-11-30 20:58 ` Wenji Wu
2006-11-30 20:22 ` David Miller
2006-11-30 20:30 ` Ingo Molnar
2006-11-30 20:38 ` David Miller
2006-11-30 20:49 ` Ingo Molnar
2006-11-30 20:54 ` Ingo Molnar
2006-11-30 20:55 ` David Miller
2006-11-30 20:14 ` David Miller
2006-11-30 20:42 ` Wenji Wu
2006-12-01 9:53 ` Evgeniy Polyakov
2006-12-01 23:18 ` David Miller
2006-11-30 6:56 ` Ingo Molnar
2006-11-30 16:08 ` Wenji Wu
2006-11-30 20:06 ` David Miller
2006-11-30 9:33 ` Christoph Hellwig
2006-11-30 16:51 ` Lee Revell
-- strict thread matches above, loose matches on Subject: below --
2006-11-30 2:02 Wenji Wu
2006-11-30 6:19 ` Ingo Molnar
2006-11-29 23:27 [Changelog] " Wenji Wu
2006-11-29 23:28 ` [patch 1/4] " Wenji Wu
2006-11-30 0:53 ` David Miller
2006-11-30 1:08 ` Andrew Morton
2006-11-30 1:13 ` David Miller
2006-11-30 6:04 ` Mike Galbraith
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20061130103240.GA25733@elte.hu \
--to=mingo@elte.hu \
--cc=akpm@osdl.org \
--cc=davem@davemloft.net \
--cc=johnpol@2ka.mipt.ru \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=nickpiggin@yahoo.com.au \
--cc=wenji@fnal.gov \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).