From: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
talal-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org
Subject: Re: [PATCH V4 libibverbs 2/7] Add member functions to poll an extended CQ
Date: Sun, 29 May 2016 18:38:41 -0400 [thread overview]
Message-ID: <574B6F71.9060808@redhat.com> (raw)
In-Reply-To: <1464533475-18949-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
[-- Attachment #1.1: Type: text/plain, Size: 4169 bytes --]
On 5/29/2016 10:51 AM, Yishai Hadas wrote:
> From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>
> Currently, ibv_poll_cq is one stop shop for polling all the
> completion's attribute. The current implementation has a few
> implications:
> 1. The vendor's completion format is transformed to an agnostic IB
> format (copying data which might has been used directly).
> 2. Adding new completion attributes means wasting more time in copies.
> 3. Extensible functions require wasting cycles on logic which
> determines whether a function is known and implemented in the
> provider.
>
> In order to solve these problems, we introduce a new poll_cq mechanism
> for extensible CQs. Polling is done in batches, where every batch
> could contain more than one completion. A batch is started with
> calling to start_poll function. This already fetches a completion
> that the user can now query using the various query functions.
> Advancing to the next completion is done using next_poll.
> After querying all completions (or when the user wants to stop
> fetching completions in the current batch), end_poll is called.
>
> As 'wr_id' and 'status' are mostly used, they can be accessed directly
> and there is no function call for.
So I've been busy with kernel stuff, but I don't really like this patch.
I'm not really enamored with the entire poll_ex API.
First, I don't like the start_poll/poll/end_poll setup. I would rather
we do something like using refcounting on the WCs. Maybe something like
returning an array of pointers to WCs where each WC already has a ref
count taken, and when the user is done, they call ibv_release_wc() (or
something like that) on one or more of these WCs, at which point they
are freed and the hardware can do with them as they wish.
Then, I've been thinking about how to get this data to the user, and
again, I don't really like where we are headed. I was thinking about
changing it so that we create an enum like so:
enum wc_elements {
WC_WR_ID,
WC_WR_STATUS,
etc.
};
Then we define each cq struct to have a new array called offsets:
struct cq {
unsigned char offsets[64]; /* One cache line, should be 64byte aligned */
....
} __align__(64);
When a cq is created by the driver, it is responsible for filling out
the offsets array (can be static if your driver only has one CQE format,
but if it changes with different options, it allows for that).
Essentially, it is a mapping from the enum of WC items to the position
in the hardware's CQE. Then you define something like this in the verbs
header file:
static inline u8 __ibv_get_wc_u8(struct cq *cq, struct wc *wc, unsigned
char wc_element)
{
return *(wc + cq->offsets[wc_element]);
}
and one for the other data sizes too, then something like this:
define ibv_get_wc_element(cq, wc, wc_element, data) \
{ \
switch(wc_element) { \
/* All char size elements */ \
case WC_WR_STATUS: \
data = __ibv_get_wc_u8(cq, wc, wc_element); \
You get the idea. The compiler will optimize the above down to static
code at each location. And you could simplify things a bit by having
another static array that keeps the element sizes in it, and then run
the switch on that.
The long and short of this method is that accessing CQE struct elements
becomes essentially a load of a variable offset from the cq->offsets[]
item, which should stay hot in cache, followed by a lea of the cqe
address and the offset.
I would prefer having to have an individual routine to access each element.
I would prefer to avoid all the calls if possible.
And I want something that can be optimized away by the compiler as much
as possible.
The above accomplishes that and has the room to support up to 64
different WC struct elements, with a maximum offset of 256 bytes (much
larger than any WC today).
What do people think of this idea? If no one shoots holes in it, I will
try to code something up this upcoming week (although I can't promise
it, I currently have two sick children at home and that prevents me from
getting much coding done).
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]
next prev parent reply other threads:[~2016-05-29 22:38 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-29 14:51 [PATCH V4 libibverbs 0/7] Completion timestamping Yishai Hadas
[not found] ` <1464533475-18949-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-29 14:51 ` [PATCH V4 libibverbs 1/7] Add support for extended creating CQ verb Yishai Hadas
2016-05-29 14:51 ` [PATCH V4 libibverbs 2/7] Add member functions to poll an extended CQ Yishai Hadas
[not found] ` <1464533475-18949-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-29 22:38 ` Doug Ledford [this message]
[not found] ` <574B6F71.9060808-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-05-29 23:30 ` Jason Gunthorpe
[not found] ` <20160529233009.GA12420-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-05-29 23:47 ` Doug Ledford
[not found] ` <8F7BC9E2-75EC-413B-BEBE-11450225AF06-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-05-30 1:35 ` Jason Gunthorpe
[not found] ` <20160530013507.GA19230-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-05-30 3:39 ` Doug Ledford
2016-05-31 17:46 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373AB05CA6D-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2016-05-31 18:06 ` Jason Gunthorpe
[not found] ` <20160531180608.GA21834-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-05-31 19:48 ` Hefty, Sean
2016-05-30 7:47 ` Matan Barak
[not found] ` <4958edf4-7296-26c9-4cbe-8fab45be11a3-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-30 10:01 ` Yishai Hadas
[not found] ` <4e8befc4-aec5-a17d-24ce-40ff97d345da@redhat.com>
[not found] ` <4e8befc4-aec5-a17d-24ce-40ff97d345da-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-05-30 13:50 ` Doug Ledford
[not found] ` <8708a378-4c48-df98-86a4-d210bbe690b5-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-05-30 15:01 ` Matan Barak (External)
[not found] ` <ecdbec76-31cd-74e1-25b4-7d60c3fa2af0-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-30 17:05 ` Jason Gunthorpe
2016-05-29 14:51 ` [PATCH V4 libibverbs 3/7] Add timestamp_mask and hca_core_clock to ibv_query_device_ex Yishai Hadas
2016-05-29 14:51 ` [PATCH V4 libibverbs 4/7] Add completion timestamp to poll_cq Yishai Hadas
2016-05-29 14:51 ` [PATCH V4 libibverbs 5/7] Create a single threaded CQ Yishai Hadas
2016-05-29 14:51 ` [PATCH V4 libibverbs 6/7] Add a verb that queries real time values from the HCA Yishai Hadas
2016-05-29 14:51 ` [PATCH V4 libibverbs 7/7] Add timestamp support in rc_pingpong Yishai Hadas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=574B6F71.9060808@redhat.com \
--to=dledford-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
--cc=jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org \
--cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=talal-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
--cc=yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox