Re: [PATCH 2/4] io_uring: move locking inside overflow posting

Linux io-uring development
 help / color / mirror / Atom feed

From: Jens Axboe <axboe@kernel.dk>
To: Pavel Begunkov <asml.silence@gmail.com>, io-uring@vger.kernel.org
Subject: Re: [PATCH 2/4] io_uring: move locking inside overflow posting
Date: Wed, 14 May 2025 14:05:47 -0600	[thread overview]
Message-ID: <1644225f-36c9-4abf-8da3-cc853cdab0e8@kernel.dk> (raw)
In-Reply-To: <a788a22f-0efd-4b78-94b5-c096b38c0e6c@gmail.com>

On 5/14/25 2:00 PM, Pavel Begunkov wrote:
> On 5/14/25 20:25, Jens Axboe wrote:
>> On 5/14/25 11:18 AM, Pavel Begunkov wrote:
>>> On 5/14/25 17:42, Jens Axboe wrote:
>>>> On 5/14/25 2:07 AM, Pavel Begunkov wrote:
>>> ...>> +static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, bool locked,
>>>>> +                     u64 user_data, s32 res, u32 cflags,
>>>>> +                     u64 extra1, u64 extra2)
>>>>> +{
>>>>> +    bool queued;
>>>>> +
>>>>> +    if (locked) {
>>>>> +        queued = __io_cqring_event_overflow(ctx, user_data, res, cflags,
>>>>> +                            extra1, extra2);
>>>>> +    } else {
>>>>> +        spin_lock(&ctx->completion_lock);
>>>>> +        queued = __io_cqring_event_overflow(ctx, user_data, res, cflags,
>>>>> +                            extra1, extra2);
>>>>> +        spin_unlock(&ctx->completion_lock);
>>>>> +    }
>>>>> +    return queued;
>>>>> +}
>>>>
>>>> Really not a fan of passing in locking state and having a split helper
>>>> like that.
>>>
>>> I'd agree in general, but it's the same thing that's already in
>>> __io_submit_flush_completions(), just with a named argument
>>> instead of backflipping on ctx->lockless_cq.
>>
>> And I honestly _greatly_ prefer that to passing in some random bool
>> where you have to go find the function to even see what it does...
> 
> I see, it's much worse than having it somewhat abstracted,
> trying to be more reliable and self checking to prevent bugs,
> I give up on trying to do anything with it.

I'm just saying that I don't like 'bool locked' kind of arguments. We do
have some of them, but I always find it confusing. I do think the
__io_submit_flush_completions() approach is more robust.

Maybe have an alloc helper for the cqe and pass that in instead? Then we
could get rid of adding 'locked' AND 5 other arguments?

>>>> It's also pretty unwieldy with 7 arguments.
>>>
>>> It's 6 vs 7, not much difference, and the real problem here is
>>> passing all cqe parts as separate arguments, which this series
>>> doesn't touch.
>>
>> It's 6 you're passing on, it's 7 for the overflow helper.
> 
> I don't get what you're saying.

The helper you add, io_cqring_event_overflow(), takes 7 arguments. And
yes most of that is cqe arguments, the bool locked means we're now at 7
rather than 7 arguments. Not the end of the world, but both of them are
pretty horrid in that sense.

>>>> Overall, why do we care about atomic vs non-atomic allocations for
>>>> overflows? If you're hitting overflow, you've messed up... And if it's
>>>
>>> Not really, it's not far fetched to overflow with multishots unless
>>> you're overallocating CQ quite a bit, especially if traffic is not so
>>> homogeneous and has spikes. Expensive and should be minimised b/c of
>>> that, but it's still reasonably possible in a well structured app.
>>
>> It's still possible, but it should be a fairly rare occurence after all.
> 
> That's what I'm saying. And handling that "rare" well is the difference
> between having reliable software and terminally screwing userspace.

As I wrote just a bit below, I don't disagree on getting rid of
GFP_ATOMIC at all, just stating that it'll be a bit more jitter. But
that's still better than a potentially less reliable allocation for
overflow. And the jitter part should hopefully be rare, as overflows
should not be a common occurence.

>>>> about cost, gfp atomic alloc will be predictable, vs GFP_KERNEL which
>>>> can be a lot more involved.
>>>
>>> Not the cost but reliability. If overflowing fails, the userspace can't
>>> reasonably do anything to recover, so either terminate or leak. And
>>> it's taking the atomic reserves from those who actually need it
>>> like irq.
>>
>> I don't really disagree on that part, GFP_ATOMIC is only really used
>> because of the locking, as this series deals with. Just saying that
>> it'll make overflow jitter more pronounced, potentially, though not
> 
> It's like saying that the allocation makes heavier so let's just
> drop the CQE on the floor. It should do reclaim only when the
> situation with memory is already bad enough and you're risking
> failing atomic allocations (even if not exactly that).

Right, like I said I agree on that part...

-- 
Jens Axboe

next prev parent reply	other threads:[~2025-05-14 20:05 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-14  8:07 [PATCH 0/4] overflow completion enhancements Pavel Begunkov
2025-05-14  8:07 ` [PATCH 1/4] io_uring: open code io_req_cqe_overflow() Pavel Begunkov
2025-05-14  8:07 ` [PATCH 2/4] io_uring: move locking inside overflow posting Pavel Begunkov
2025-05-14 16:42   ` Jens Axboe
2025-05-14 17:18     ` Pavel Begunkov
2025-05-14 19:25       ` Jens Axboe
2025-05-14 20:00         ` Pavel Begunkov
2025-05-14 20:05           ` Jens Axboe [this message]
2025-05-14 21:52             ` Jens Axboe
2025-05-15 11:04               ` Pavel Begunkov
2025-05-14  8:07 ` [PATCH 3/4] io_uring: alloc overflow entry before locking Pavel Begunkov
2025-05-14  8:07 ` [PATCH 4/4] io_uring: add lockdep warning for overflow posting Pavel Begunkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1644225f-36c9-4abf-8da3-cc853cdab0e8@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=asml.silence@gmail.com \
    --cc=io-uring@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox