* vethpair creation performance, 3.14 versus 4.2.0
@ 2015-08-31 19:48 Rick Jones
2015-08-31 21:29 ` David Ahern
2015-08-31 23:04 ` Eric Dumazet
0 siblings, 2 replies; 4+ messages in thread
From: Rick Jones @ 2015-08-31 19:48 UTC (permalink / raw)
To: Raghavendra K T, netdev
On 08/29/2015 10:59 PM, Raghavendra K T wrote:
> Please note that similar overhead was also reported while creating
> veth pairs https://lkml.org/lkml/2013/3/19/556
That got me curious, so I took the veth pair creation script from there,
and started running it out to 10K pairs, comparing a 3.14.44 kernel with
a 4.2.0-rc4+ from net-next and then net-next after pulling to get the
snmp stat aggregation perf change (4.2.0-rc8+).
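For anyone who wants to reproduce without digging up the old thread, a minimal
sketch of that sort of loop follows. It is my own reconstruction, not the
script from the lkml link above (the numbers in the spreadsheet come from the
original script). It lets the kernel pick the veth%d names, which is what
exercises the name-allocation path showing up in the profile further down.
Needs root and iproute2.

/*
 * Minimal sketch of a veth-pair creation loop -- a reconstruction, not
 * the script from the lkml link above.  It lets the kernel pick the
 * veth%d names (no explicit name on the ip command), which is what
 * exercises the name-allocation path, and prints elapsed time every
 * 1000 pairs.  Needs root and iproute2.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char **argv)
{
    int npairs = (argc > 1) ? atoi(argv[1]) : 10000;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 1; i <= npairs; i++) {
        /* no name given -> kernel allocates "veth%d" for both ends */
        if (system("ip link add type veth") != 0) {
            fprintf(stderr, "pair %d failed\n", i);
            return 1;
        }
        if (i % 1000 == 0) {
            clock_gettime(CLOCK_MONOTONIC, &t1);
            printf("%5d pairs: %.1f seconds\n", i,
                   (t1.tv_sec - t0.tv_sec) +
                   (t1.tv_nsec - t0.tv_nsec) / 1e9);
        }
    }
    return 0;
}

Driving the adds over a single netlink socket would be faster than forking
ip(8) for every pair, but shelling out keeps the sketch short.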
Indeed, the 4.2.0-rc8+ kernel with the change was faster than the
4.2.0-rc4+ kernel without it, but both were slower than the 3.14.44 kernel.
I've put a spreadsheet with the results at:
ftp://ftp.netperf.org/vethpair/vethpair_compare.ods
A perf top for the 4.2.0-rc8+ kernel from the net-next tree looks like
this out around 10K pairs:
PerfTop: 11155 irqs/sec kernel:94.2% exact: 0.0% [4000Hz cycles], (all, 32 CPUs)
-------------------------------------------------------------------------------
23.44% [kernel] [k] vsscanf
7.32% [kernel] [k] mutex_spin_on_owner.isra.4
5.63% [kernel] [k] __memcpy
5.27% [kernel] [k] __dev_alloc_name
3.46% [kernel] [k] format_decode
3.44% [kernel] [k] vsnprintf
3.16% [kernel] [k] acpi_os_write_port
2.71% [kernel] [k] number.isra.13
1.50% [kernel] [k] strncmp
1.21% [kernel] [k] _parse_integer
0.93% [kernel] [k] filemap_map_pages
0.82% [kernel] [k] put_dec_trunc8
0.82% [kernel] [k] unmap_single_vma
0.78% [kernel] [k] native_queued_spin_lock_slowpath
0.71% [kernel] [k] menu_select
0.65% [kernel] [k] clear_page
0.64% [kernel] [k] _raw_spin_lock
0.62% [kernel] [k] page_fault
0.60% [kernel] [k] find_busiest_group
0.53% [kernel] [k] snprintf
0.52% [kernel] [k] int_sqrt
0.46% [kernel] [k] simple_strtoull
0.44% [kernel] [k] page_remove_rmap
My attempts to get a call-graph have been met with very limited success.
Even though I've installed the dbg package from "make deb-pkg" the
symbol resolution doesn't seem to be working.
happy benchmarking,
rick jones
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: vethpair creation performance, 3.14 versus 4.2.0
2015-08-31 19:48 vethpair creation performance, 3.14 versus 4.2.0 Rick Jones
@ 2015-08-31 21:29 ` David Ahern
2015-08-31 21:31 ` Rick Jones
2015-08-31 23:04 ` Eric Dumazet
1 sibling, 1 reply; 4+ messages in thread
From: David Ahern @ 2015-08-31 21:29 UTC (permalink / raw)
To: Rick Jones, Raghavendra K T, netdev
On 8/31/15 1:48 PM, Rick Jones wrote:
> My attempts to get a call-graph have been met with very limited success.
> Even though I've installed the dbg package from "make deb-pkg" the
> symbol resolution doesn't seem to be working.
Looks like Debian does not enable frame pointers by default:
$ grep FRAME /boot/config-3.2.0-4-amd64
...
# CONFIG_FRAME_POINTER is not set
Similar result for jessie.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: vethpair creation performance, 3.14 versus 4.2.0
2015-08-31 21:29 ` David Ahern
@ 2015-08-31 21:31 ` Rick Jones
0 siblings, 0 replies; 4+ messages in thread
From: Rick Jones @ 2015-08-31 21:31 UTC (permalink / raw)
To: David Ahern, Raghavendra K T, netdev
On 08/31/2015 02:29 PM, David Ahern wrote:
> On 8/31/15 1:48 PM, Rick Jones wrote:
>> My attempts to get a call-graph have been met with very limited success.
>> Even though I've installed the dbg package from "make deb-pkg" the
>> symbol resolution doesn't seem to be working.
>
> Looks like Debian does not enable frame pointers by default:
>
> $ grep FRAME /boot/config-3.2.0-4-amd64
> ...
> # CONFIG_FRAME_POINTER is not set
>
> Similar result for jessie.
And indeed, my config file has a Debian lineage.
rick
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: vethpair creation performance, 3.14 versus 4.2.0
2015-08-31 19:48 vethpair creation performance, 3.14 versus 4.2.0 Rick Jones
2015-08-31 21:29 ` David Ahern
@ 2015-08-31 23:04 ` Eric Dumazet
1 sibling, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2015-08-31 23:04 UTC (permalink / raw)
To: Rick Jones; +Cc: Raghavendra K T, netdev
On Mon, 2015-08-31 at 12:48 -0700, Rick Jones wrote:
> On 08/29/2015 10:59 PM, Raghavendra K T wrote:
> > Please note that similar overhead was also reported while creating
> > veth pairs https://lkml.org/lkml/2013/3/19/556
>
>
> That got me curious, so I took the veth pair creation script from there,
> and started running it out to 10K pairs, comparing a 3.14.44 kernel with
> a 4.2.0-rc4+ from net-next and then net-next after pulling to get the
> snmp stat aggregation perf change (4.2.0-rc8+).
>
> Indeed, the 4.2.0-rc8+ kernel with the change was faster than the
> 4.2.0-rc4+ kernel without it, but both were slower than the 3.14.44 kernel.
>
> I've put a spreadsheet with the results at:
>
> ftp://ftp.netperf.org/vethpair/vethpair_compare.ods
>
> A perf top for the 4.2.0-rc8+ kernel from the net-next tree looks like
> this out around 10K pairs:
>
> PerfTop: 11155 irqs/sec kernel:94.2% exact: 0.0% [4000Hz cycles], (all, 32 CPUs)
> -------------------------------------------------------------------------------
>
> 23.44% [kernel] [k] vsscanf
> 7.32% [kernel] [k] mutex_spin_on_owner.isra.4
> 5.63% [kernel] [k] __memcpy
> 5.27% [kernel] [k] __dev_alloc_name
> 3.46% [kernel] [k] format_decode
> 3.44% [kernel] [k] vsnprintf
> 3.16% [kernel] [k] acpi_os_write_port
> 2.71% [kernel] [k] number.isra.13
> 1.50% [kernel] [k] strncmp
> 1.21% [kernel] [k] _parse_integer
> 0.93% [kernel] [k] filemap_map_pages
> 0.82% [kernel] [k] put_dec_trunc8
> 0.82% [kernel] [k] unmap_single_vma
> 0.78% [kernel] [k] native_queued_spin_lock_slowpath
> 0.71% [kernel] [k] menu_select
> 0.65% [kernel] [k] clear_page
> 0.64% [kernel] [k] _raw_spin_lock
> 0.62% [kernel] [k] page_fault
> 0.60% [kernel] [k] find_busiest_group
> 0.53% [kernel] [k] snprintf
> 0.52% [kernel] [k] int_sqrt
> 0.46% [kernel] [k] simple_strtoull
> 0.44% [kernel] [k] page_remove_rmap
>
> My attempts to get a call-graph have been met with very limited success.
> Even though I've installed the dbg package from "make deb-pkg" the
> symbol resolution doesn't seem to be working.
Well, you do not need a call graph to spot the well-known issue with
__dev_alloc_name(), which has O(N) behavior.
If we really need to be fast here, and keep eth%d or veth%d names with a
guarantee of the lowest numbers, we would need an IDR.
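To spell out the O(N) point: __dev_alloc_name() walks every device in the
namespace, sscanf()ing each name against the format to mark used suffixes in
a page-sized bitmap before taking the first free bit, so with 10K pairs (20K
devices) every new pair pays that full walk. That is where the vsscanf time
at the top of the profile comes from, and an IDR could return the lowest free
number without the scan. A rough userspace sketch of the pattern (not the
kernel code, just an illustration):

/*
 * Userspace sketch of the allocation pattern -- illustrative only, not
 * the kernel code.  Each "veth%d" allocation scans every existing name
 * with sscanf() to mark used suffixes, then takes the first free one,
 * so the per-device cost grows with the device count and creating N
 * pairs is O(N^2) overall.  It is deliberately slow to run: nearly all
 * of its time goes into sscanf(), much like the profile above.
 */
#include <stdio.h>
#include <string.h>

#define MAX_IFS 32768                   /* stands in for the page-sized bitmap */

static char names[MAX_IFS][16];         /* pretend list of existing devices */
static int nifs;

/* pick the lowest unused "veth%d" suffix, __dev_alloc_name()-style */
static int alloc_suffix(void)
{
    static unsigned char used[MAX_IFS];
    int id;

    memset(used, 0, sizeof(used));
    for (int i = 0; i < nifs; i++)      /* O(N) walk over all devices... */
        if (sscanf(names[i], "veth%d", &id) == 1 &&  /* ...one sscanf per name */
            id >= 0 && id < MAX_IFS)
            used[id] = 1;
    for (id = 0; id < MAX_IFS; id++)    /* first free suffix */
        if (!used[id])
            return id;
    return -1;
}

int main(void)
{
    for (int i = 0; i < 20000; i++)     /* 10K pairs -> 20K devices */
        snprintf(names[nifs++], sizeof(names[0]), "veth%d", alloc_suffix());
    printf("%d devices, last name %s\n", nifs, names[nifs - 1]);
    return 0;
}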
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread
Thread overview: 4+ messages
2015-08-31 19:48 vethpair creation performance, 3.14 versus 4.2.0 Rick Jones
2015-08-31 21:29 ` David Ahern
2015-08-31 21:31 ` Rick Jones
2015-08-31 23:04 ` Eric Dumazet