From: Vlad Buslov <vladbu@mellanox.com>
To: Dennis Zhou <dennis@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>, Tejun Heo <tj@kernel.org>,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
Yevgeny Kliteynik <kliteyn@mellanox.com>,
Yossef Efraim <yossefe@mellanox.com>,
Maor Gottlieb <maorg@mellanox.com>
Subject: Re: tc filter insertion rate degradation
Date: Tue, 29 Jan 2019 19:22:10 +0000
Message-ID: <vbfzhrj9smb.fsf@mellanox.com>
In-Reply-To: <20190124172126.GA66944@dennisz-mbp.dhcp.thefacebook.com>
On Thu 24 Jan 2019 at 17:21, Dennis Zhou <dennis@kernel.org> wrote:
> Hi Vlad and Eric,
>
> On Tue, Jan 22, 2019 at 09:33:10AM -0800, Eric Dumazet wrote:
>> On Mon, Jan 21, 2019 at 3:24 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>> >
>> > Hi Eric,
>> >
>> > I've been investigating significant tc filter insertion rate degradation
>> > and it seems it is caused by your commit 001c96db0181 ("net: align
>> > gnet_stats_basic_cpu struct"). With this commit insertion rate is
>> > reduced from ~65k rules/sec to ~43k rules/sec when inserting 1m rules
>> > from file in tc batch mode on my machine.
>> >
>> > Tc perf profile indicates that pcpu allocator now consumes 2x CPU:
>> >
>> > 1) Before:
>> >
>> > Samples: 63K of event 'cycles:ppp', Event count (approx.): 48796480071
>> >   Children      Self  Co  Shared Object      Symbol
>> > +   21.19%     3.38%  tc  [kernel.vmlinux]   [k] pcpu_alloc
>> > +    3.45%     0.25%  tc  [kernel.vmlinux]   [k] pcpu_alloc_area
>> >
>> > 2) After:
>> >
>> > Samples: 92K of event 'cycles:ppp', Event count (approx.): 71446806550
>> >   Children      Self  Co  Shared Object      Symbol
>> > +   44.67%     3.99%  tc  [kernel.vmlinux]   [k] pcpu_alloc
>> > +   19.25%     0.22%  tc  [kernel.vmlinux]   [k] pcpu_alloc_area
>> >
>> > It seems that it takes much more work for pcpu allocator to perform
>> > allocation with new stricter alignment requirements. Not sure if it is
>> > expected behavior or not in this case.
>> >
>> > Regards,
>> > Vlad
>
> Would you mind sharing a little more information with me:
> 1) output before and after a run of /sys/kernel/debug/percpu_stats
Hi Dennis,
Some of these files are quite large, so I have put them on Dropbox.
Output before:
Percpu Memory Statistics
Allocation Info:
----------------------------------------
unit_size : 262144
static_size : 139160
reserved_size : 8192
dyn_size : 28776
atom_size : 2097152
alloc_size : 2097152
Global Stats:
----------------------------------------
nr_alloc : 3343
nr_dealloc : 752
nr_cur_alloc : 2591
nr_max_alloc : 2598
nr_chunks : 3
nr_max_chunks : 3
min_alloc_size : 4
max_alloc_size : 8208
empty_pop_pages : 3
Per Chunk Stats:
----------------------------------------
Chunk: <- Reserved Chunk
nr_alloc : 5
max_alloc_size : 320
empty_pop_pages : 0
first_bit : 1002
free_bytes : 7448
contig_bytes : 7424
sum_frag : 24
max_frag : 24
cur_min_alloc : 16
cur_med_alloc : 64
cur_max_alloc : 320
Chunk: <- First Chunk
nr_alloc : 479
max_alloc_size : 8208
empty_pop_pages : 0
first_bit : 8192
free_bytes : 0
contig_bytes : 0
sum_frag : 0
max_frag : 0
cur_min_alloc : 4
cur_med_alloc : 24
cur_max_alloc : 8208
Chunk:
nr_alloc : 1925
max_alloc_size : 8208
empty_pop_pages : 0
first_bit : 63102
free_bytes : 852
contig_bytes : 12
sum_frag : 852
max_frag : 12
cur_min_alloc : 4
cur_med_alloc : 8
cur_max_alloc : 8208
Chunk:
nr_alloc : 182
max_alloc_size : 936
empty_pop_pages : 3
first_bit : 21
free_bytes : 256452
contig_bytes : 255120
sum_frag : 1332
max_frag : 368
cur_min_alloc : 8
cur_med_alloc : 20
cur_max_alloc : 320
After: https://www.dropbox.com/s/unyzhx4vgo2x30e/stats_after?dl=0
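
In case it helps with comparing the two dumps, here is a minimal Python
sketch that turns percpu_stats text into per-section dictionaries and pulls
out the fragmentation-related fields for each chunk. It assumes the
"key : value" layout shown above; the function names parse_percpu_stats and
chunk_summary are hypothetical helpers, not part of any kernel tooling.

```python
import re

# Assumption: every statistic is a "name : integer" line, section titles end
# with ":" and each chunk starts with a line beginning with "Chunk:".
KV_RE = re.compile(r"^(\w+)\s*:\s*(-?\d+)$")


def parse_percpu_stats(text):
    """Return a list of (section_title, {key: int}) in file order."""
    sections = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"-"}:
            continue  # skip blank lines and the dashed separators
        match = KV_RE.match(line)
        if match:
            if current is None:  # values seen before any section title
                current = ("", {})
                sections.append(current)
            current[1][match.group(1)] = int(match.group(2))
        elif line.endswith(":") or line.startswith("Chunk:"):
            current = (line, {})
            sections.append(current)
        # anything else (e.g. the "Percpu Memory Statistics" banner) is ignored
    return sections


def chunk_summary(sections):
    """Per chunk: (nr_alloc, free_bytes, contig_bytes, sum_frag)."""
    fields = ("nr_alloc", "free_bytes", "contig_bytes", "sum_frag")
    return [
        tuple(values.get(name) for name in fields)
        for title, values in sections
        if title.startswith("Chunk:")
    ]
```

Running chunk_summary() on both the "before" and "after" files and comparing
the tuples side by side should show which chunks differ in free and
contiguous space between the two runs.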
> 2) a full perf output
https://www.dropbox.com/s/isfcxca3npn5slx/perf.data?dl=0
> 3) a reproducer
$ sudo tc -b add.0
Example batch file: https://www.dropbox.com/s/ey7cbl5nwu5p0tg/add.0?dl=0
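
For anyone who wants a rules/sec number out of the reproducer, a rough
timing wrapper could look like the sketch below. It only relies on the
"tc -b <file>" invocation shown above; the helper names are hypothetical,
and it assumes one tc command per non-empty line of the batch file. It has
to run with the same privileges as the tc command itself.

```python
import subprocess
import sys
import time


def count_rules(path):
    """Count non-empty lines, assuming one tc command per line."""
    with open(path) as batch:
        return sum(1 for line in batch if line.strip())


def rules_per_second(n_rules, elapsed_s):
    """Average insertion rate over the whole batch."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return n_rules / elapsed_s


def time_batch(path, cmd=("tc", "-b")):
    """Run `tc -b <path>` and return (rules, seconds, rules/sec)."""
    n_rules = count_rules(path)
    start = time.perf_counter()
    subprocess.run([*cmd, path], check=True)  # raises if tc fails
    elapsed = time.perf_counter() - start
    return n_rules, elapsed, rules_per_second(n_rules, elapsed)


if __name__ == "__main__" and len(sys.argv) > 1:
    rules, seconds, rate = time_batch(sys.argv[1])
    print(f"{rules} rules in {seconds:.2f}s ({rate:.0f} rules/sec)")
```

Note that this measures wall-clock time for the whole tc process, so it
includes file parsing in userspace as well as the kernel-side insertion.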
Thanks,
Vlad