Linux io-uring development
 help / color / mirror / Atom feed
From: Pavel Begunkov <asml.silence@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, hexue <xue01.he@samsung.com>
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v8] io_uring: releasing CPU resources when polling
Date: Thu, 24 Oct 2024 15:26:46 +0100	[thread overview]
Message-ID: <f60116a5-8c35-4389-bbb6-7bf6deaf71c6@gmail.com> (raw)
In-Reply-To: <9bc8f8c4-3415-48bb-9bd1-0996f2ef6669@kernel.dk>

On 10/24/24 15:18, Jens Axboe wrote:
> On 10/23/24 8:38 PM, hexue wrote:
>> On 9/25/2024 12:12, Pavel Begunkov wrote:
...
>> When the number of threads exceeds the number of CPU cores,the
>> database throughput does not increase significantly. However,
>> hybrid polling can releasing some CPU resources during the polling
>> process, so that part of the CPU time can be used for frequent
>> data processing and other operations, which speeds up the reading
>> process, thereby improving throughput and optimizaing database
>> performance.I tried different compression strategies and got
>> results similar to the above table.(~30% throughput improvement)
>>
>> As more database applications adapt to the io_uring engine, I think
>> the application of hybrid poll may have potential in some scenarios.
> 
> Thanks for posting some numbers on that part, that's useful. I do
> think the feature is useful as well, but I still have some issues
> with the implementation. Below is an incremental patch on top of
> yours to resolve some of those, potentially. Issues:
> 
> 1) The patch still reads a bit like a hack, in that it doesn't seem to
>     care about following the current style. This reads a bit lazy/sloppy
>     or unfinished. I've fixed that up.
> 
> 2) Appropriate member and function naming.
> 
> 3) Same as above, it doesn't care about proper placement of additions to
>     structs. Again this is a bit lazy and wasteful, attention should be
>     paid to where additions are placed to not needlessly bloat
>     structures, or place members in cache unfortunate locations. For
>     example, available_time is just placed at the end of io_ring_ctx,
>     why? It's a submission side member, and there's room with other
>     related members. Not only is the placement now where you'd want it to
>     be, memory wise, it also doesn't add 8 bytes to io_uring_ctx.
> 
> 4) Like the above, the io_kiocb bloat is, by far, the worst. Seems to me
>     that we can share space with the polling hash_node. This obviously
>     needs careful vetting, haven't done that yet. IOPOLL setups should
>     not be using poll at all. This needs extra checking. The poll_state
>     can go with cancel_seq_set, as there's a hole there any. And just
>     like that, rather than add 24b to io_kiocb, it doesn't take any extra
>     space at all.
> 
> 5) HY_POLL is a terrible name. It's associated with IOPOLL, and so let's
>     please use a name related to that. And require IOPOLL being set with
>     HYBRID_IOPOLL, as it's really a variant of that. Makes it clear that
>     HYBRID_IOPOLL is really just a mode of operation for IOPOLL, and it
>     can't exist without that.
> 
> Please take a look at this incremental and test it, and then post a v9
> that looks a lot more finished. Caveat - I haven't tested this one at
> all. Thanks!
> 
> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
> index c79ee9fe86d4..6cf6a45835e5 100644
> --- a/include/linux/io_uring_types.h
> +++ b/include/linux/io_uring_types.h
> @@ -238,6 +238,8 @@ struct io_ring_ctx {
>   		struct io_rings		*rings;
>   		struct percpu_ref	refs;
>   
> +		u64			poll_available_time;
> +
>   		clockid_t		clockid;
>   		enum tk_offsets		clock_offset;
>   
> @@ -433,9 +435,6 @@ struct io_ring_ctx {
>   	struct page			**sqe_pages;
>   
>   	struct page			**cq_wait_page;
> -
> -	/* for io_uring hybrid poll*/
> -	u64			available_time;
>   };
>   
>   struct io_tw_state {
> @@ -647,9 +646,22 @@ struct io_kiocb {
>   
>   	atomic_t			refs;
>   	bool				cancel_seq_set;
> +	bool				poll_state;

As mentioned briefly before, that can be just a req->flags flag

>   	struct io_task_work		io_task_work;
> -	/* for polled requests, i.e. IORING_OP_POLL_ADD and async armed poll */
> -	struct hlist_node		hash_node;
> +	union {
> +		/*
> +		 * for polled requests, i.e. IORING_OP_POLL_ADD and async armed
> +		 * poll
> +		 */
> +		struct hlist_node	hash_node;
> +		/*
> +		 * For IOPOLL setup queues, with hybrid polling
> +		 */
> +		struct {
> +			u64		iopoll_start;
> +			u64		iopoll_end;

And IIRC it doesn't need to store the end as it's used immediately
after it's set in the same function.

> +		};
> +	};
>   	/* internal polling, see IORING_FEAT_FAST_POLL */
>   	struct async_poll		*apoll;
>   	/* opcode allocated if it needs to store data for async defer */
> @@ -665,10 +677,6 @@ struct io_kiocb {
>   		u64			extra1;
>   		u64			extra2;
>   	} big_cqe;
> -    /* for io_uring hybrid iopoll */
> -	bool		poll_state;
> -	u64			iopoll_start;
> -	u64			iopoll_end;
>   };
>   
...

-- 
Pavel Begunkov

  reply	other threads:[~2024-10-24 14:26 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20240918021016epcas5p14d6e771ee39bee5dddf253c39b84110c@epcas5p1.samsung.com>
2024-09-18  2:10 ` [PATCH v8 RESEND] io_uring: releasing CPU resources when polling hexue
2024-09-25  8:29   ` [PATCH V8] " hexue
2024-09-25 12:12     ` Pavel Begunkov
2024-10-24  2:38       ` Re: [PATCH v8] " hexue
2024-10-24 14:18         ` Jens Axboe
2024-10-24 14:26           ` Pavel Begunkov [this message]
2024-10-24 14:40             ` Jens Axboe
2024-10-24 14:49               ` Pavel Begunkov
2024-10-24 14:49                 ` Jens Axboe
2024-10-25  6:10                   ` hexue
     [not found] <CGME20240819025446epcas5p43dde8e55435917ecd0175400b0b7cc62@epcas5p4.samsung.com>
2024-08-19  2:54 ` [PATCH V8] " hexue
     [not found] <CGME20240812015950epcas5p26c0eb254964abfe8a7ff05de0ead05f8@epcas5p2.samsung.com>
2024-08-12  1:59 ` [PATCH v8] " hexue

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f60116a5-8c35-4389-bbb6-7bf6deaf71c6@gmail.com \
    --to=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=xue01.he@samsung.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox