qemu-nbd performance regression in bd2cd4a4

qemu-devel.nongnu.org archive mirror

* qemu-nbd performance regression in bd2cd4a4
@ 2023-04-06 10:55 Lukáš Doktor
  2023-04-06 11:20 ` Florian Westphal
  2023-04-06 15:07 ` Eric Blake
  0 siblings, 2 replies; 4+ messages in thread
From: Lukáš Doktor @ 2023-04-06 10:55 UTC (permalink / raw)
  To: qemu-devel, Florian Westphal, Eric Blake, Kevin Wolf



Hello Florian, folks,

my CI caught a ~5% regression (in 60s runs; with 240s runs it was about 10%) in qemu-nbd performance, bisected multiple times to bd2cd4a441ded163b62371790876f28a9b834317, in fio when reading with 4k blocks. Note that other scenarios (reads using 1024k blocks, writes using either 4k or 1024k blocks) were not affected. Is this expected?

Bisect status:

    # status: waiting for both good and bad commits
    # good: [60ca584b8af0de525656f959991a440f8c191f12] Merge tag 'pull-for-8.0-220323-1' of https://gitlab.com/stsquad/qemu into staging
    git bisect good 60ca584b8af0de525656f959991a440f8c191f12
    # status: waiting for bad commit, 1 good commit known
    # bad: [4584e76c9ae0c03a562d4c9726fe7811ea3628c8] Merge tag 'pull-loongarch-20230404' of https://gitlab.com/gaosong/qemu into staging
    git bisect bad 4584e76c9ae0c03a562d4c9726fe7811ea3628c8
    # bad: [3b555b51156279f8dd9184c85b7af920b9f4cb9e] Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging
    git bisect bad 3b555b51156279f8dd9184c85b7af920b9f4cb9e
    # bad: [d8fbf9aa85aed64450907580a1d70583f097e9df] block/export: Fix graph locking in blk_get_geometry() call
    git bisect bad d8fbf9aa85aed64450907580a1d70583f097e9df
    # good: [8635a3a153da3a6712c4ee249c2bf3513cbfdbf7] Revert "docs/about/deprecated: Deprecate 32-bit arm hosts for system emulation"
    git bisect good 8635a3a153da3a6712c4ee249c2bf3513cbfdbf7
    # good: [d82e2e76358dec42ba42b7e54bdc7ae61493fc9a] Merge tag 'pull-xen-20230324' of https://xenbits.xen.org/git-http/people/aperard/qemu-dm into staging
    git bisect good d82e2e76358dec42ba42b7e54bdc7ae61493fc9a
    # bad: [bd2cd4a441ded163b62371790876f28a9b834317] nbd/server: push pending frames after sending reply
    git bisect bad bd2cd4a441ded163b62371790876f28a9b834317
    # good: [e3debd5e7d0ce031356024878a0a18b9d109354a] Merge tag 'pull-request-2023-03-24' of https://gitlab.com/thuth/qemu into staging
    git bisect good e3debd5e7d0ce031356024878a0a18b9d109354a
    # first bad commit: [bd2cd4a441ded163b62371790876f28a9b834317] nbd/server: push pending frames after sending reply
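
A bisect like the one above can be automated with `git bisect run`, which treats exit code 0 as good, 125 as "skip this commit", and other codes from 1 to 127 as bad. The following is only a minimal sketch of the decision step, not how runperf does it: the script name, the `classify` helper, and the 2.5% noise margin are all hypothetical, and a real wrapper would still have to build qemu and run fio to obtain the IOPS figures.

```python
import sys

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def classify(baseline_iops, measured_iops, tolerance=0.025):
    """Map a benchmark result to a `git bisect run` exit code.

    `tolerance` is a hypothetical noise margin (2.5% here).
    """
    if baseline_iops <= 0 or measured_iops <= 0:
        return SKIP  # build or benchmark failure: cannot judge this commit
    drop = 1.0 - measured_iops / baseline_iops
    return BAD if drop > tolerance else GOOD


if __name__ == "__main__":
    # Hypothetical usage: judge.py BASELINE_IOPS MEASURED_IOPS
    sys.exit(classify(float(sys.argv[1]), float(sys.argv[2])))
```

With a regression as small as ~5%, the margin has to sit well above run-to-run noise, which is presumably why the bisect was repeated several times.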

fio-nbd export:

    mkdir -p /var/lib/runperf/runperf-nbd/
    dd bs=1M count=256 if=/dev/urandom of='/var/lib/runperf/runperf-nbd//disk.img'
    qemu-nbd -t -k /var/lib/runperf/runperf-nbd//socket -f raw /var/lib/runperf/runperf-nbd//disk.img

Fio job (executed via pbench, let me know if you need simplified steps with fio only):

    cat > /var/lib/runperf/runperf-nbd/nbd.fio << \MrGg1N
    # To use fio to test nbdkit:
    #
    # nbdkit -U - memory size=256M --run 'export unixsocket; fio examples/nbd.fio'
    #
    # To use fio to test qemu-nbd:
    #
    # rm -f /tmp/disk.img /tmp/socket
    # truncate -s 256M /tmp/disk.img
    # export target=/tmp/socket
    # qemu-nbd -t -k $target -f raw /tmp/disk.img &
    # fio examples/nbd.fio
    # killall qemu-nbd
    
    [global]
    bs = $@
    runtime = 30
    ioengine = nbd
    iodepth = 32
    direct = 1
    sync = 0
    time_based = 1
    clocksource = gettimeofday
    ramp_time = 5
    write_bw_log = fio
    write_iops_log = fio
    write_lat_log = fio
    log_avg_msec = 1000
    write_hist_log = fio
    log_hist_msec = 10000
    # log_hist_coarseness = 4 # 76 bins
    
    rw = $@
    uri=nbd+unix:///?socket=/var/lib/runperf/runperf-nbd/socket
    # Starting from nbdkit 1.14 the following will work:
    #uri=${uri}
    
    [job0]
    offset=0
    
    [job1]
    offset=64m
    
    [job2]
    offset=128m
    
    [job3]
    offset=192m
    
    MrGg1N
    benchmark_bin=/usr/local/bin/fio pbench-fio  --block-sizes=4 --job-file=/var/lib/runperf/runperf-nbd/nbd.fio --numjobs=4 --ramptime=10 --runtime=30 --samples=1 --test-types=read --clients=virtlab506.virt.lab.eng.bos.redhat.com

Regards,
Lukáš

[-- Attachment #1.1.2: report-annonym.html --]
[-- Type: text/html, Size: 3058586 bytes --]



* Re: qemu-nbd performance regression in bd2cd4a4
  2023-04-06 10:55 qemu-nbd performance regression in bd2cd4a4 Lukáš Doktor
@ 2023-04-06 11:20 ` Florian Westphal
  2023-04-06 15:07 ` Eric Blake
  1 sibling, 0 replies; 4+ messages in thread
From: Florian Westphal @ 2023-04-06 11:20 UTC (permalink / raw)
  To: Lukáš Doktor
  Cc: qemu-devel, Florian Westphal, Eric Blake, Kevin Wolf

Lukáš Doktor <ldoktor@redhat.com> wrote:
> Fio job (executed via pbench, let me know if you need simplified steps with fio only):

That would be useful, thanks.

I suspect a revert would be pointless, because the upcoming TCP_NODELAY
change will increase the small-packet rate too.



* Re: qemu-nbd performance regression in bd2cd4a4
  2023-04-06 10:55 qemu-nbd performance regression in bd2cd4a4 Lukáš Doktor
  2023-04-06 11:20 ` Florian Westphal
@ 2023-04-06 15:07 ` Eric Blake
  2023-04-12  8:49   ` Lukáš Doktor
  1 sibling, 1 reply; 4+ messages in thread
From: Eric Blake @ 2023-04-06 15:07 UTC (permalink / raw)
  To: Lukáš Doktor; +Cc: qemu-devel, Florian Westphal, Kevin Wolf

On Thu, Apr 06, 2023 at 12:55:38PM +0200, Lukáš Doktor wrote:
> Hello Florian, folks,
> 
> my CI caught a ~5% regression (in 60s runs; with 240s runs it was about 10%) in qemu-nbd performance, bisected multiple times to bd2cd4a441ded163b62371790876f28a9b834317, in fio when reading with 4k blocks. Note that other scenarios (reads using 1024k blocks, writes using either 4k or 1024k blocks) were not affected. Is this expected?

Large operations (1024k blocks) are dominated by the transaction
itself, and not the network overhead.  Small operations (4k reads)
used to benefit from TCP batching (introduces latency, but less
network overhead), but we intentionally started corking things
(decreases latency, but now the network is prone to send smaller
packets which means more network overhead).  So a slight decrease in
performance for only small size traffic is not surprising.  I'm not
sure if anything can be done about it in the short term, because the
benefits in the other direction (an order-of-magnitude improvement for
TLS traffic) by being transactional instead of batching outweigh the
network overhead of small transactions, and most clients are going to
do more than just minimum-size reads.
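
The reasoning above can be illustrated with a toy cost model. The sketch below assumes, purely for illustration, that each reply costs a fixed per-reply overhead plus a per-byte transfer cost, and that the change adds a small fixed cost to every reply. All numbers are hypothetical, chosen only to show the shape of the effect; they are not measurements from this thread.

```python
def relative_slowdown(block_size, per_reply_us, per_byte_us, extra_per_reply_us):
    """Fractional throughput loss when every reply gains a fixed extra cost.

    All cost parameters are hypothetical, in microseconds.
    """
    before = per_reply_us + per_byte_us * block_size
    after = before + extra_per_reply_us
    # Throughput is inversely proportional to the time spent per reply.
    return 1.0 - before / after


# Hypothetical costs: 20 us fixed per reply, 0.001 us per byte, and
# 2 us of added per-reply overhead (extra syscalls, smaller packets).
small = relative_slowdown(4 * 1024, 20.0, 0.001, 2.0)
large = relative_slowdown(1024 * 1024, 20.0, 0.001, 2.0)
print(f"4k reads:    {small:.2%} slower")
print(f"1024k reads: {large:.2%} slower")
```

Under these assumptions the same added per-reply cost is a visible fraction of a 4k read but almost vanishes for a 1024k read, which matches the pattern reported.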

However, commit bd2cd4a44 does mention a potential future optimization
of not uncorking if there is an easy way to detect if another reply in
the queue will be sent shortly.  Also, distinct actions for corking
and uncorking cost extra system calls; it may be possible to utilize
MSG_MORE on the existing data syscall paths instead of having to
separately cork/uncork, which in turn could still mark message
transaction boundaries with less overhead.
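
The difference between explicit cork/uncork and MSG_MORE can be sketched as follows. This is a minimal Linux-only illustration using Python's socket module over a loopback TCP connection, not QEMU's actual code; the function names and the 16-byte placeholder header are assumptions made for the example.

```python
import socket


def send_reply_corked(conn, header, payload):
    """Cork, send header and payload, then uncork to push the frames out.

    Costs two setsockopt() calls on top of the send() calls.
    """
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
    conn.sendall(header)
    conn.sendall(payload)
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 0)


def send_reply_msg_more(conn, header, payload):
    """Flag the header as 'more data follows'; no setsockopt() is needed."""
    conn.sendall(header, socket.MSG_MORE)
    conn.sendall(payload)  # no MSG_MORE: this marks the end of the reply


def recv_exact(sock, length):
    """Read exactly `length` bytes from a stream socket."""
    chunks = []
    while length > 0:
        chunk = sock.recv(length)
        if not chunk:
            raise ConnectionError("peer closed early")
        chunks.append(chunk)
        length -= len(chunk)
    return b"".join(chunks)


def demo():
    header = b"H" * 16        # placeholder standing in for a reply header
    payload = b"\xab" * 4096  # one 4k read result
    results = {}
    with socket.create_server(("127.0.0.1", 0)) as listener:
        with socket.create_connection(listener.getsockname()) as client:
            conn, _ = listener.accept()
            with conn:
                for name, send in (("cork", send_reply_corked),
                                   ("msg_more", send_reply_msg_more)):
                    send(conn, header, payload)
                    data = recv_exact(client, len(header) + len(payload))
                    results[name] = data == header + payload
    return results


if __name__ == "__main__":
    print(demo())
```

Both variants deliver the same bytes; the MSG_MORE variant reaches the same reply boundary with two fewer system calls per reply, which is the saving suggested above.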

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


* Re: qemu-nbd performance regression in bd2cd4a4
  2023-04-06 15:07 ` Eric Blake
@ 2023-04-12  8:49   ` Lukáš Doktor
  0 siblings, 0 replies; 4+ messages in thread
From: Lukáš Doktor @ 2023-04-12  8:49 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, Florian Westphal, Kevin Wolf



I see, let me mark it as an "expected" regression, and hopefully I'll detect the optimizations if they are ever implemented. Thank you for the explanation.

Regards,
Lukáš


