From: Li Yu <raise.sail@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Changli Gao <xiaosuo@gmail.com>,
Linux Netdev List <netdev@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
davidel@xmailserver.org
Subject: Re: [RFC] Introduce to batch variants of accept() and epoll_ctl() syscall
Date: Mon, 09 Jul 2012 11:36:38 +0800
Message-ID: <4FFA51C6.3070306@gmail.com>
In-Reply-To: <4FF6B20E.7000402@gmail.com>
On 2012-07-06 17:38, Li Yu wrote:
> On 2012-06-15 16:51, Eric Dumazet wrote:
>> On Fri, 2012-06-15 at 13:37 +0800, Li Yu wrote:
>>
>>> Of course, I think that implementing them should not be hard work :)
>>>
>>> Hmm, I really do not know whether it is necessary to introduce a new
>>> syscall here. An alternative is to add a new socket option to handle
>>> such batching, so applications can also detect whether the kernel has
>>> this extended ability with a simple getsockopt() call.
>>>
>>> Anyway, I am going to try to write a prototype first.
>>
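The detection idea in the quoted paragraph above can be sketched roughly as follows. This is only a minimal sketch: no batch-accept socket option exists in the thread or in mainline, so HYPOTHETICAL_SO_BATCH_ACCEPT and its value are placeholders, and socket_option_supported() is an illustrative helper name. The sketch assumes the kernel reports ENOPROTOOPT for a SOL_SOCKET option it does not know, which is the usual Linux behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical option number, used only to illustrate the probe.
 * It is NOT a real socket option. */
#define HYPOTHETICAL_SO_BATCH_ACCEPT 0x7fff

/* Probe whether the kernel knows a given socket option.
 * Returns 1 if getsockopt() succeeds, 0 if the kernel reports
 * ENOPROTOOPT (option unknown), and -1 on any other error. */
int socket_option_supported(int fd, int level, int optname)
{
    int val = 0;
    socklen_t len = sizeof(val);

    assert(fd >= 0);
    if (getsockopt(fd, level, optname, &val, &len) == 0)
        return 1;
    return errno == ENOPROTOOPT ? 0 : -1;
}
```

An application would call socket_option_supported(fd, SOL_SOCKET, HYPOTHETICAL_SO_BATCH_ACCEPT) once at startup and fall back to the plain accept() path when it returns 0.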
>> Before that, could you post the result of "perf top", or "perf
>> record ...;perf report"
>>
>
> Sorry, I only just found time to write a benchmark that reproduces this
> problem on my test bed. Below are the results of "perf record -g -C 0".
> The kernel is 3.4.0:
>
> Events: 7K cycles
> + 54.87% swapper [kernel.kallsyms] [k] poll_idle
> - 3.10% :22984 [kernel.kallsyms] [k] _raw_spin_lock
> - _raw_spin_lock
> - 64.62% sch_direct_xmit
> dev_queue_xmit
> ip_finish_output
> ip_output
> - ip_local_out
> + 49.48% ip_queue_xmit
> + 37.48% ip_build_and_send_pkt
> + 13.04% ip_send_skb
>
> I cannot reproduce exactly the same high CPU usage in my test
> environment, but top shows a similar ratio of sys% to si% on
> one CPU:
>
> Tasks: 125 total, 2 running, 123 sleeping, 0 stopped, 0 zombie
> Cpu0 : 1.0%us, 30.7%sy, 0.0%ni, 18.8%id, 0.0%wa, 0.0%hi, 49.5%si, 0.0%st
>
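The sys% and si% figures that top prints can also be derived from two samples of a per-CPU line in /proc/stat. A minimal sketch, assuming the standard field order (user, nice, system, idle, iowait, irq, softirq, steal); struct cpu_ticks, parse_cpu_line() and sys_si_percent() are illustrative names, not part of any tool mentioned in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Tick counters from one "cpuN ..." line of /proc/stat. */
struct cpu_ticks {
    unsigned long long user, nice, system, idle, iowait, irq, softirq, steal;
};

/* Parse one per-CPU line. Returns 0 on success, -1 on a short line. */
int parse_cpu_line(const char *line, struct cpu_ticks *t)
{
    char name[16];

    assert(line && t);
    int n = sscanf(line, "%15s %llu %llu %llu %llu %llu %llu %llu %llu",
                   name, &t->user, &t->nice, &t->system, &t->idle,
                   &t->iowait, &t->irq, &t->softirq, &t->steal);
    return n == 9 ? 0 : -1;
}

/* Percentage of ticks spent in system and softirq time between
 * an earlier sample a and a later sample b. */
void sys_si_percent(const struct cpu_ticks *a, const struct cpu_ticks *b,
                    double *sys, double *si)
{
    unsigned long long total =
        (b->user - a->user) + (b->nice - a->nice) +
        (b->system - a->system) + (b->idle - a->idle) +
        (b->iowait - a->iowait) + (b->irq - a->irq) +
        (b->softirq - a->softirq) + (b->steal - a->steal);

    if (total == 0) {          /* no ticks elapsed between samples */
        *sys = *si = 0.0;
        return;
    }
    *sys = 100.0 * (double)(b->system - a->system) / (double)total;
    *si  = 100.0 * (double)(b->softirq - a->softirq) / (double)total;
}
```

Sampling the cpu0 line twice, about a second apart, and feeding both samples to sys_si_percent() gives the two numbers whose ratio is discussed here.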
> Well, it seems I must acknowledge that I was wrong here. However,
> I recall that I did encounter this before, in another benchmark of
> small-packet performance.
>
> I guess this is because the TX softirq and the syscall context contend
> for the same lock in sch_direct_xmit(). Is this right?
>
Hmm, do we have some means of reducing the lock contention here?
> thanks
>
> Yu
>
>>> top shows that the kernel is the main CPU hog. The test is simple:
>>> just an accept() -> epoll_ctl(ADD) loop. The ratio of CPU utilization
>>> sys% to si% is about 2:5.
>>
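For readers without the benchmark source, the accept() -> epoll_ctl(ADD) loop described above might look like the sketch below. The original benchmark was not posted in this message, so this is an assumption about its shape: make_listener() and accept_and_register() are illustrative names, and the non-blocking loopback listener is just one convenient setup.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a non-blocking TCP listener on 127.0.0.1 with an ephemeral port.
 * Returns the listening fd, or -1 on failure. */
int make_listener(void)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* let the kernel pick a port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 128) < 0 ||
        fcntl(fd, F_SETFL, O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Drain the accept queue: one accept() and one epoll_ctl(ADD) per
 * connection. Returns the number of connections registered, or -1
 * on an unexpected error. */
int accept_and_register(int epfd, int listen_fd)
{
    int count = 0;

    assert(epfd >= 0 && listen_fd >= 0);
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return count;                /* queue is empty */
            if (errno == EINTR || errno == ECONNABORTED)
                continue;                    /* transient, retry */
            return -1;
        }
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
            close(fd);
            return -1;
        }
        count++;
    }
}
```

Each new connection costs two syscalls here, which is the per-connection overhead the batch variants in the RFC aim to reduce.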
>> This ratio is not meaningful if we don't know where the time is spent.
>>
>>
>> I doubt epoll_ctl(ADD) is a problem here...
>>
>> If it is, batching the fds won't speed things up anyway...
>>
>> I believe accept() is the problem here, because it contends with the
>> softirq processing the tcp session handshake.
>>
>>
>>
>>
>
>
Thread overview: 8+ messages
2012-06-15 4:13 [RFC] Introduce to batch variants of accept() and epoll_ctl() syscall Li Yu
2012-06-15 4:29 ` Changli Gao
2012-06-15 5:37 ` Li Yu
2012-06-15 8:51 ` Eric Dumazet
2012-06-18 23:27 ` Andi Kleen
2012-07-06 9:38 ` Li Yu
2012-07-09 3:36 ` Li Yu [this message]
2012-06-15 8:35 ` David Laight