From: Hao Xu <hao.xu@linux.dev>
To: Pavel Begunkov <asml.silence@gmail.com>, io-uring@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH v3] io_uring: switch cancel_hash to use per entry spinlock
Date: Sat, 11 Jun 2022 01:40:59 +0800 [thread overview]
Message-ID: <c4fdf822-6514-0147-fb13-fd3d64c0ada3@linux.dev> (raw)
In-Reply-To: <39bbb017-7f7e-39bd-f209-f3526f30c21d@gmail.com>
On 6/11/22 00:10, Pavel Begunkov wrote:
> On 6/10/22 16:45, Hao Xu wrote:
>> On 6/10/22 18:21, Pavel Begunkov wrote:
>>> On 6/8/22 12:12, Hao Xu wrote:
>>>> From: Hao Xu <howeyxu@tencent.com>
>>>>
>>>> Add a new io_hash_bucket structure so that each bucket in cancel_hash
>>>> has separate spinlock. Use per entry lock for cancel_hash, this removes
>>>> some completion lock invocation and remove contension between different
>>>> cancel_hash entries.
>>>>
>>>> Signed-off-by: Hao Xu <howeyxu@tencent.com>
>>>> ---
>>>>
>>>> v1->v2:
>>>> - Add per entry lock for poll/apoll task work code which was missed
>>>> in v1
>>>> - add an member in io_kiocb to track req's indice in cancel_hash
>>>>
>>>> v2->v3:
>>>> - make struct io_hash_bucket align with cacheline to avoid cacheline
>>>> false sharing.
>>>> - re-calculate hash value when deleting an entry from cancel_hash.
>>>> (cannot leverage struct io_poll to store the indice since it's
>>>> already 64 Bytes)
>>>>
>>>> io_uring/cancel.c | 14 +++++++--
>>>> io_uring/cancel.h | 6 ++++
>>>> io_uring/fdinfo.c | 9 ++++--
>>>> io_uring/io_uring.c | 8 +++--
>>>> io_uring/io_uring_types.h | 2 +-
>>>> io_uring/poll.c | 64
>>>> +++++++++++++++++++++------------------
>>>> 6 files changed, 65 insertions(+), 38 deletions(-)
>>>>
>>>> diff --git a/io_uring/cancel.c b/io_uring/cancel.c
>>>> index 83cceb52d82d..bced5d6b9294 100644
>>>> --- a/io_uring/cancel.c
>>>> +++ b/io_uring/cancel.c
>>>> @@ -93,14 +93,14 @@ int io_try_cancel(struct io_kiocb *req, struct
>>>> io_cancel_data *cd)
>>>> if (!ret)
>>>> return 0;
>>>> - spin_lock(&ctx->completion_lock);
>>>> ret = io_poll_cancel(ctx, cd);
>>>> if (ret != -ENOENT)
>>>> goto out;
>>>> + spin_lock(&ctx->completion_lock);
>>>> if (!(cd->flags & IORING_ASYNC_CANCEL_FD))
>>>> ret = io_timeout_cancel(ctx, cd);
>>>> -out:
>>>> spin_unlock(&ctx->completion_lock);
>>>> +out:
>>>> return ret;
>>>> }
>>>> @@ -192,3 +192,13 @@ int io_async_cancel(struct io_kiocb *req,
>>>> unsigned int issue_flags)
>>>> io_req_set_res(req, ret, 0);
>>>> return IOU_OK;
>>>> }
>>>> +
>>>> +inline void init_hash_table(struct io_hash_bucket *hash_table,
>>>> unsigned size)
>>>
>>> Not inline, it can break builds
>>>
>>>> diff --git a/io_uring/cancel.h b/io_uring/cancel.h
>>>> index 4f35d8696325..b57d6706f84d 100644
>>>> --- a/io_uring/cancel.h
>>>> +++ b/io_uring/cancel.h
>>>> @@ -4,3 +4,9 @@ int io_async_cancel_prep(struct io_kiocb *req, const
>>>> struct io_uring_sqe *sqe);
>>>> int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags);
>>>> int io_try_cancel(struct io_kiocb *req, struct io_cancel_data *cd);
>>>> +inline void init_hash_table(struct io_hash_bucket *hash_table,
>>>> unsigned size);
>>>
>>> And this inline as well
>>>
>>>> diff --git a/io_uring/poll.c b/io_uring/poll.c
>>>> index 0df5eca93b16..515f1727e3c6 100644
>>>> --- a/io_uring/poll.c
>>>> +++ b/io_uring/poll.c
>>> [...]
>>>> static struct io_kiocb *io_poll_find(struct io_ring_ctx *ctx, bool
>>>> poll_only,
>>>> struct io_cancel_data *cd)
>>>> - __must_hold(&ctx->completion_lock)
>>>> {
>>>> - struct hlist_head *list;
>>>> struct io_kiocb *req;
>>>> - list = &ctx->cancel_hash[hash_long(cd->data,
>>>> ctx->cancel_hash_bits)];
>>>> - hlist_for_each_entry(req, list, hash_node) {
>>>> + u32 index = hash_long(cd->data, ctx->cancel_hash_bits);
>>>> + struct io_hash_bucket *hb = &ctx->cancel_hash[index];
>>>> +
>>>> + spin_lock(&hb->lock);
>>>> + hlist_for_each_entry(req, &hb->list, hash_node) {
>>>> if (cd->data != req->cqe.user_data)
>>>> continue;
>>>> if (poll_only && req->opcode != IORING_OP_POLL_ADD)
>>>> @@ -569,47 +577,48 @@ static struct io_kiocb *io_poll_find(struct
>>>> io_ring_ctx *ctx, bool poll_only,
>>>> continue;
>>>> req->work.cancel_seq = cd->seq;
>>>> }
>>>> + spin_unlock(&hb->lock);
>>>
>>> The problem here is that after you unlock, nothing keeps the
>>> request alive. Before it was more like
>>>
>>> lock(completion_lock);
>>> req = poll_find();
>>> cancel_poll(req);
>>> unlock(completion_lock);
>>>
>>> and was relying on all of this happening under ->completion_lock.
>>> Now following io_poll_disarm() and/or io_poll_cancel_req() race.
>>> Same with io_poll_file_find().
>>
>> Looks we have to add completion_lock back for cancellation path.
>
> It was relying on completion_lock only because it was guarding
> the hashing, so now find+cancel should happen under the per
> bucket spins, i.e.
>
> lock(buckets[index].lock);
> req = poll_find();
> cancel_poll(req);
> unlock(buckets[index].lock);
>
> A bit trickier to code but doable.
Ah, seems I misunderstood your words, which I'm clear with now.
Yea, it's a bit odd. I'll think about this issue before taking this
solution tomorrow.
Btw, I saw a req->refcount set for poll_add, seems it is not necessary?
(I haven't check it carefully yet)
>
next prev parent reply other threads:[~2022-06-10 17:41 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-08 11:12 [PATCH v3] io_uring: switch cancel_hash to use per entry spinlock Hao Xu
2022-06-08 11:27 ` Pavel Begunkov
2022-06-10 10:21 ` Pavel Begunkov
2022-06-10 15:45 ` Hao Xu
2022-06-10 16:10 ` Pavel Begunkov
2022-06-10 17:40 ` Hao Xu [this message]
2022-06-10 17:45 ` Pavel Begunkov
2022-06-11 4:34 ` Hao Xu
2022-06-11 13:18 ` Pavel Begunkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c4fdf822-6514-0147-fb13-fd3d64c0ada3@linux.dev \
--to=hao.xu@linux.dev \
--cc=asml.silence@gmail.com \
--cc=axboe@kernel.dk \
--cc=io-uring@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.