* question about the performance of 'btrfs send'
@ 2022-10-15 12:35 Wang Yugui
2022-10-17 13:02 ` David Sterba
0 siblings, 1 reply; 2+ messages in thread
From: Wang Yugui @ 2022-10-15 12:35 UTC (permalink / raw)
To: linux-btrfs
Hi,
a question about the performance of 'btrfs send'.
The output speed of 'btrfs send' is about 700MiB/s in the 3 cases.
1) kernel 5.15.73 + 'btrfs send --proto 1'
2) kernel: 6.0.1(with btrfs-devel misc-6.1) + 'btrfs send --proto 1'
3) kernel: 6.0.1(with btrfs-devel misc-6.1) + 'btrfs send --proto 2'
btrfs-progs: 6.0
the output of 'perf report':
    Overhead  Command  Shared Object      Symbol
*1    40.63%  btrfs    [kernel.kallsyms]  [k] __crc32c_le
*2     9.97%  btrfs    [kernel.kallsyms]  [k] memcpy_erms
*3     9.25%  btrfs    [kernel.kallsyms]  [k] send_extent_data
*4     5.40%  btrfs    [kernel.kallsyms]  [k] asm_exc_nmi
*5     2.73%  btrfs    [kernel.kallsyms]  [k] __alloc_pages
       1.14%  btrfs    [kernel.kallsyms]  [k] __rmqueue_pcplist
       0.92%  btrfs    [kernel.kallsyms]  [k] bad_range
       0.88%  btrfs    [kernel.kallsyms]  [k] get_page_from_freelist
What I expected:
the above *1) __crc32c_le would take >60%, and the output speed would be > 1GiB/s.
The *1) __crc32c_le is a necessary operation, and its speed
seems OK: 2GB/s * 40% = 800MB/s, which is close to the observed 700MiB/s.
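
The estimate above can be written out as a small calculation. This is only a sketch under stated assumptions: the 2GB/s single-core crc32c rate is the figure assumed in this message, not a measurement, and the function name estimated_throughput is made up for illustration. It also assumes the send path is effectively single-threaded, so the overall rate is the checksum rate scaled by the share of CPU time spent checksumming.

```python
GB = 10**9      # decimal gigabyte, as in "2GB/s"
MiB = 2**20     # binary mebibyte, as in "700MiB/s"
GiB = 2**30


def estimated_throughput(crc_rate, crc_cpu_share):
    """Estimate overall bytes/s of a single-threaded pipeline.

    crc_rate:      bytes/s the checksum alone can process (assumed value).
    crc_cpu_share: fraction of total CPU time spent in the checksum.

    Each byte costs 1/crc_rate seconds of checksum time, and that is
    crc_cpu_share of the total time per byte, so the total rate is
    crc_rate * crc_cpu_share.
    """
    return crc_rate * crc_cpu_share


# Share taken from the perf report above (40.63% in __crc32c_le).
observed_case = estimated_throughput(2 * GB, 0.4063)
# The hoped-for case: the checksum takes 60% of the CPU time.
expected_case = estimated_throughput(2 * GB, 0.60)

print(f"observed share: {observed_case / MiB:.0f} MiB/s")
print(f"60% share:      {expected_case / GiB:.2f} GiB/s")
```

Under these assumptions the 40% share lands in the same range as the measured 700MiB/s, and a 60% share would be needed to pass 1GiB/s.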
Question:
The above *3) is difficult to understand. Any advice?
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2022/10/15
* Re: question about the performance of 'btrfs send'
2022-10-15 12:35 question about the performance of 'btrfs send' Wang Yugui
@ 2022-10-17 13:02 ` David Sterba
0 siblings, 0 replies; 2+ messages in thread
From: David Sterba @ 2022-10-17 13:02 UTC (permalink / raw)
To: Wang Yugui; +Cc: linux-btrfs
On Sat, Oct 15, 2022 at 08:35:01PM +0800, Wang Yugui wrote:
> Hi,
>
> a question about the performance of 'btrfs send'.
>
> The output speed of 'btrfs send' is about 700MiB/s in the 3 cases.
> 1) kernel 5.15.73 + 'btrfs send --proto 1'
> 2) kernel: 6.0.1(with btrfs-devel misc-6.1) + 'btrfs send --proto 1'
> 3) kernel: 6.0.1(with btrfs-devel misc-6.1) + 'btrfs send --proto 2'
> btrfs-progs: 6.0
>
> the output of 'perf report':
>     Overhead  Command  Shared Object      Symbol
> *1    40.63%  btrfs    [kernel.kallsyms]  [k] __crc32c_le
> *2     9.97%  btrfs    [kernel.kallsyms]  [k] memcpy_erms
> *3     9.25%  btrfs    [kernel.kallsyms]  [k] send_extent_data
> *4     5.40%  btrfs    [kernel.kallsyms]  [k] asm_exc_nmi
> *5     2.73%  btrfs    [kernel.kallsyms]  [k] __alloc_pages
>        1.14%  btrfs    [kernel.kallsyms]  [k] __rmqueue_pcplist
>        0.92%  btrfs    [kernel.kallsyms]  [k] bad_range
>        0.88%  btrfs    [kernel.kallsyms]  [k] get_page_from_freelist
>
> What I expected:
> the above *1) __crc32c_le would take >60%, and the output speed would be > 1GiB/s.
> The *1) __crc32c_le is a necessary operation, and its speed
> seems OK: 2GB/s * 40% = 800MB/s, which is close to the observed 700MiB/s.
>
> Question:
> The above *3) is difficult to understand. Any advice?
The perf report does not include IO, right? It's only CPU time spent.
That it's accounted only for send_extent_data would also mean there's
some function inlining involved so it does not point exactly where the
time is spent. I'd say it's the main loop around send_write that emits
the commands and works with memory data.
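
To make the CPU-time point concrete, here is a small Python sketch (not related to the btrfs code itself) showing how time spent waiting does not show up as CPU time. The sleep stands in for waiting on IO and the loop stands in for checksumming and copying; the function name run_workload and the durations are arbitrary choices for illustration.

```python
import time


def run_workload(wait_seconds, busy_iterations):
    """Return (wall_seconds, cpu_seconds) for a wait followed by a busy loop."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time used by this process

    time.sleep(wait_seconds)           # stand-in for blocking on IO

    total = 0
    for i in range(busy_iterations):   # stand-in for CPU-bound work
        total += i * i

    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


wall, cpu = run_workload(wait_seconds=0.2, busy_iterations=100_000)
print(f"wall time: {wall:.3f}s, CPU time: {cpu:.3f}s")
```

The wall time includes the wait while the CPU time does not, which is the same reason a CPU profile can show a checksum function as dominant even when part of the elapsed time is spent waiting elsewhere.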
What could be suboptimal is the call get_cur_path in send_write that
rebuilds the path each time even though it's for the same file.
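
As an illustration of the get_cur_path point, here is a toy model in Python. It is not the kernel code: the Sender class, its build_path and cur_path methods, and the inode table are all hypothetical, and the sketch ignores everything that would make caching harder in practice (renames, orphans, invalidation). It only shows how rebuilding the path for every write command compares with reusing it while the inode stays the same.

```python
class Sender:
    """Toy sender that emits one write command per extent chunk."""

    def __init__(self, parents, cache_paths):
        # parents maps inode number -> (parent inode or None, name)
        self.parents = parents
        self.cache_paths = cache_paths
        self.cached_ino = None
        self.cached_path = None
        self.path_builds = 0  # how many times the path was rebuilt

    def build_path(self, ino):
        """Walk up the parent links and join the names (the costly part)."""
        self.path_builds += 1
        parts = []
        while ino is not None:
            parent, name = self.parents[ino]
            parts.append(name)
            ino = parent
        return "/".join(reversed(parts))

    def cur_path(self, ino):
        """Return the path, reusing the last one if the inode is unchanged."""
        if self.cache_paths and ino == self.cached_ino:
            return self.cached_path
        path = self.build_path(ino)
        self.cached_ino, self.cached_path = ino, path
        return path

    def send_write(self, ino, offset, length):
        return ("write", self.cur_path(ino), offset, length)


# Hypothetical layout: dir/big.img, sent as 1000 chunks of 48KiB.
parents = {2: (None, "dir"), 3: (2, "big.img")}
chunk = 48 * 1024

naive = Sender(parents, cache_paths=False)
cached = Sender(parents, cache_paths=True)
for sender in (naive, cached):
    for i in range(1000):
        sender.send_write(3, i * chunk, chunk)

print("path builds without cache:", naive.path_builds)
print("path builds with cache:   ", cached.path_builds)
```

In this model the uncached sender rebuilds the path once per write command and the cached one rebuilds it once per file, while both emit identical commands.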