From: Pavel Begunkov <asml.silence@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, io-uring@vger.kernel.org
Subject: Re: [PATCHSET 0/3] Improve MSG_RING SINGLE_ISSUER performance
Date: Wed, 29 May 2024 03:08:33 +0100
Message-ID: <377bad85-032d-4906-9142-d7be5cae9dcb@gmail.com>
In-Reply-To: <18a96f04-bb30-4bd8-82ca-e72f1c954dac@kernel.dk>

On 5/29/24 02:35, Jens Axboe wrote:
> On 5/28/24 5:04 PM, Jens Axboe wrote:
>> On 5/28/24 12:31 PM, Jens Axboe wrote:
>>> I suspect a bug in the previous patches, because this is what the
>>> forward port looks like. First, for reference, the current results:
>>
>> Got it sorted, and pinned sender and receiver on CPUs to avoid the
>> variation. It looks like this with the task_work approach that I sent
>> out as v1:
>>
>> Latencies for: Sender
>>      percentiles (nsec):
>>       |  1.0000th=[ 2160],  5.0000th=[ 2672], 10.0000th=[ 2768],
>>       | 20.0000th=[ 3568], 30.0000th=[ 3568], 40.0000th=[ 3600],
>>       | 50.0000th=[ 3600], 60.0000th=[ 3600], 70.0000th=[ 3632],
>>       | 80.0000th=[ 3632], 90.0000th=[ 3664], 95.0000th=[ 3696],
>>       | 99.0000th=[ 4832], 99.5000th=[15168], 99.9000th=[16192],
>>       | 99.9500th=[16320], 99.9900th=[18304]
>> Latencies for: Receiver
>>      percentiles (nsec):
>>       |  1.0000th=[ 1528],  5.0000th=[ 1576], 10.0000th=[ 1656],
>>       | 20.0000th=[ 2040], 30.0000th=[ 2064], 40.0000th=[ 2064],
>>       | 50.0000th=[ 2064], 60.0000th=[ 2064], 70.0000th=[ 2096],
>>       | 80.0000th=[ 2096], 90.0000th=[ 2128], 95.0000th=[ 2160],
>>       | 99.0000th=[ 3472], 99.5000th=[14784], 99.9000th=[15168],
>>       | 99.9500th=[15424], 99.9900th=[17280]
>>
>> and here's the exact same test run on the current patches:
>>
>> Latencies for: Sender
>>      percentiles (nsec):
>>       |  1.0000th=[  362],  5.0000th=[  362], 10.0000th=[  370],
>>       | 20.0000th=[  370], 30.0000th=[  370], 40.0000th=[  370],
>>       | 50.0000th=[  374], 60.0000th=[  382], 70.0000th=[  382],
>>       | 80.0000th=[  382], 90.0000th=[  382], 95.0000th=[  390],
>>       | 99.0000th=[  402], 99.5000th=[  430], 99.9000th=[  900],
>>       | 99.9500th=[  972], 99.9900th=[ 1432]
>> Latencies for: Receiver
>>      percentiles (nsec):
>>       |  1.0000th=[ 1528],  5.0000th=[ 1544], 10.0000th=[ 1560],
>>       | 20.0000th=[ 1576], 30.0000th=[ 1592], 40.0000th=[ 1592],
>>       | 50.0000th=[ 1592], 60.0000th=[ 1608], 70.0000th=[ 1608],
>>       | 80.0000th=[ 1640], 90.0000th=[ 1672], 95.0000th=[ 1688],
>>       | 99.0000th=[ 1848], 99.5000th=[ 2128], 99.9000th=[14272],
>>       | 99.9500th=[14784], 99.9900th=[73216]
>>
>> I'll try and augment the test app to do proper rated submissions, so I
>> can ramp up the rates a bit and see what happens.
> 
> And the final one, with the rated sends sorted out. One key observation
> is that v1 trails the current edition, it just can't keep up as the rate
> is increased. If we cap the rate at what should be 33K messages per
> second, v1 gets ~28K messages and has the following latency profile (for
> a 3 second run)
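
For anyone following along, capping the rate comes down to pacing sends against fixed deadlines. A minimal sketch in Python, not the actual test app: `interval_ns`, `paced_send`, and the `send` callback are hypothetical names, and the busy-wait loop is only an assumption about how a low-latency sender might pace itself.

```python
import time


def interval_ns(rate_per_sec: int) -> int:
    """Nanoseconds between sends for a target message rate."""
    return 1_000_000_000 // rate_per_sec


def paced_send(send, rate_per_sec, duration_sec, clock=time.perf_counter_ns):
    """Call send() at most rate_per_sec times per second for duration_sec.

    Deadlines are absolute (start + k * step), so one late send does not
    push back the ones after it. Returns how many sends completed before
    the window closed.
    """
    step = interval_ns(rate_per_sec)
    start = clock()
    end = start + int(duration_sec * 1_000_000_000)
    deadline = start
    sent = 0
    while True:
        now = clock()
        if now >= end:
            break  # measurement window is over
        if now < deadline:
            continue  # busy-wait until the next send is due
        send()
        sent += 1
        deadline += step
    return sent
```

A sender that is slower than the interval simply completes fewer sends before the window closes, which would be consistent with the v1 shortfall quoted above (~28K against a 33K cap).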

Do you see where the receiver latency comes from? The wakeups are
quite similar in nature, assuming it's all wait(nr=1) and the CPUs
are not 100% consumed. Does the hop back spoil the scheduling timing?


> Latencies for: Receiver (msg=83863)
>      percentiles (nsec):
>       |  1.0000th=[  1208],  5.0000th=[  1336], 10.0000th=[  1400],
>       | 20.0000th=[  1768], 30.0000th=[  1912], 40.0000th=[  1976],
>       | 50.0000th=[  2040], 60.0000th=[  2160], 70.0000th=[  2256],
>       | 80.0000th=[  2480], 90.0000th=[  2736], 95.0000th=[  3024],
>       | 99.0000th=[  4080], 99.5000th=[  4896], 99.9000th=[  9664],
>       | 99.9500th=[ 17024], 99.9900th=[218112]
> Latencies for: Sender (msg=83863)
>      percentiles (nsec):
>       |  1.0000th=[  1928],  5.0000th=[  2064], 10.0000th=[  2160],
>       | 20.0000th=[  2608], 30.0000th=[  2672], 40.0000th=[  2736],
>       | 50.0000th=[  2864], 60.0000th=[  2960], 70.0000th=[  3152],
>       | 80.0000th=[  3408], 90.0000th=[  4128], 95.0000th=[  4576],
>       | 99.0000th=[  5920], 99.5000th=[  6752], 99.9000th=[ 13376],
>       | 99.9500th=[ 22912], 99.9900th=[261120]
> 
> and the current edition does:
> 
> Latencies for: Sender (msg=94488)
>      percentiles (nsec):
>       |  1.0000th=[  181],  5.0000th=[  191], 10.0000th=[  201],
>       | 20.0000th=[  215], 30.0000th=[  225], 40.0000th=[  235],
>       | 50.0000th=[  262], 60.0000th=[  306], 70.0000th=[  430],
>       | 80.0000th=[ 1004], 90.0000th=[ 2480], 95.0000th=[ 3632],
>       | 99.0000th=[ 8096], 99.5000th=[12352], 99.9000th=[18048],
>       | 99.9500th=[19584], 99.9900th=[23680]
> Latencies for: Receiver (msg=94488)
>      percentiles (nsec):
>       |  1.0000th=[  342],  5.0000th=[  398], 10.0000th=[  482],
>       | 20.0000th=[  652], 30.0000th=[  812], 40.0000th=[  972],
>       | 50.0000th=[ 1240], 60.0000th=[ 1640], 70.0000th=[ 1944],
>       | 80.0000th=[ 2448], 90.0000th=[ 3248], 95.0000th=[ 5216],
>       | 99.0000th=[10304], 99.5000th=[12352], 99.9000th=[18048],
>       | 99.9500th=[19840], 99.9900th=[23168]
> 
> If we cap it where v1 keeps up, at 13K messages per second, v1 does:
> 
> Latencies for: Receiver (msg=38820)
>      percentiles (nsec):
>       |  1.0000th=[ 1160],  5.0000th=[ 1256], 10.0000th=[ 1352],
>       | 20.0000th=[ 1688], 30.0000th=[ 1928], 40.0000th=[ 1976],
>       | 50.0000th=[ 2064], 60.0000th=[ 2384], 70.0000th=[ 2480],
>       | 80.0000th=[ 2768], 90.0000th=[ 3280], 95.0000th=[ 3472],
>       | 99.0000th=[ 4192], 99.5000th=[ 4512], 99.9000th=[ 6624],
>       | 99.9500th=[ 8768], 99.9900th=[14272]
> Latencies for: Sender (msg=38820)
>      percentiles (nsec):
>       |  1.0000th=[ 1848],  5.0000th=[ 1928], 10.0000th=[ 2040],
>       | 20.0000th=[ 2608], 30.0000th=[ 2640], 40.0000th=[ 2736],
>       | 50.0000th=[ 3024], 60.0000th=[ 3120], 70.0000th=[ 3376],
>       | 80.0000th=[ 3824], 90.0000th=[ 4512], 95.0000th=[ 4768],
>       | 99.0000th=[ 5536], 99.5000th=[ 6048], 99.9000th=[ 9024],
>       | 99.9500th=[10304], 99.9900th=[23424]
> 
> and v2 does:
> 
> Latencies for: Sender (msg=39005)
>      percentiles (nsec):
>       |  1.0000th=[  191],  5.0000th=[  211], 10.0000th=[  262],
>       | 20.0000th=[  342], 30.0000th=[  382], 40.0000th=[  402],
>       | 50.0000th=[  450], 60.0000th=[  532], 70.0000th=[ 1080],
>       | 80.0000th=[ 1848], 90.0000th=[ 4768], 95.0000th=[10944],
>       | 99.0000th=[16512], 99.5000th=[18304], 99.9000th=[22400],
>       | 99.9500th=[26496], 99.9900th=[41728]
> Latencies for: Receiver (msg=39005)
>      percentiles (nsec):
>       |  1.0000th=[  410],  5.0000th=[  604], 10.0000th=[  700],
>       | 20.0000th=[  900], 30.0000th=[ 1128], 40.0000th=[ 1320],
>       | 50.0000th=[ 1672], 60.0000th=[ 2256], 70.0000th=[ 2736],
>       | 80.0000th=[ 3760], 90.0000th=[ 5408], 95.0000th=[11072],
>       | 99.0000th=[18304], 99.5000th=[20096], 99.9000th=[24704],
>       | 99.9500th=[27520], 99.9900th=[35584]
> 
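
As a sanity check on the numbers above, the achieved rates follow from the msg= counts divided by the run time. A small Python sketch; the 3 second duration is stated only for the 33K run, so applying it to the 13K runs is my assumption, and the `runs` table and `achieved_rate` helper are just illustrative names.

```python
# msg= counts copied from the quoted latency reports.
RUN_SECONDS = 3  # stated for the 33K run; assumed for the 13K runs

runs = {
    ("33K cap", "v1"): 83863,
    ("33K cap", "current"): 94488,
    ("13K cap", "v1"): 38820,
    ("13K cap", "v2"): 39005,
}


def achieved_rate(messages: int, seconds: float = RUN_SECONDS) -> float:
    """Messages per second actually delivered during the run."""
    return messages / seconds


for (cap, version), messages in runs.items():
    print(f"{cap:8s} {version:8s} {achieved_rate(messages):8.0f} msg/s")
```

Under that assumption, v1 lands near 28K msg/s against the 33K cap while the current edition reaches about 31.5K, and both versions sit at roughly 13K in the lower-rate runs.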

-- 
Pavel Begunkov
