netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: Adit Ranadive <aditr@vmware.com>
Cc: dledford@redhat.com, linux-rdma@vger.kernel.org,
	pv-drivers@vmware.com, netdev@vger.kernel.org,
	linux-pci@vger.kernel.org, jhansen@vmware.com,
	asarwade@vmware.com, georgezhang@vmware.com, bryantan@vmware.com
Subject: Re: [PATCH v5 02/16] IB/pvrdma: Add user-level shared functions
Date: Mon, 26 Sep 2016 09:13:38 +0300	[thread overview]
Message-ID: <20160926061338.GH4088@leon.nu> (raw)
In-Reply-To: <9c9f3668-ff2b-f421-2270-3193c0f62cc9@vmware.com>

[-- Attachment #1: Type: text/plain, Size: 9529 bytes --]

On Sun, Sep 25, 2016 at 09:22:11PM -0700, Adit Ranadive wrote:
> On Sun, Sep 25 2016 at 10:26:24AM +0300, Leon Romanovsky wrote:
> > > On Sat, Sep 24, 2016 at 04:21:26PM -0700, Adit Ranadive wrote:
> > > We share some common structures with the user-level driver. This patch adds
> > > those structures and shared functions to traverse the QP/CQ rings.
>
> <...>
>
> > > +
> > > +#include <linux/types.h>
> > > +
> > > +#define PVRDMA_UVERBS_ABI_VERSION	3
> > > +#define PVRDMA_BOARD_ID			1
> > > +#define PVRDMA_REV_ID			1
> > >
> > > Please don't add defines which you are not using in the library and the
> > > two above are not in use.
> > >
>
> I'll move these to the pvrdma.h file.
>
> <...>
>
> > > diff --git a/include/uapi/rdma/pvrdma-uapi.h b/include/uapi/rdma/pvrdma-uapi.h
> > > new file mode 100644
> > > index 0000000..430d8a5
>
> <...>
>
> > > +
> > > +#ifndef __PVRDMA_UAPI_H__
> > > +#define __PVRDMA_UAPI_H__
> > > +
> > > +#include <linux/types.h>
> > > +
> > > +#define PVRDMA_VERSION 17
> > >
> > > What do you plan to do with this VERSION?
> > > How is it related to ABI?
> > >
>
> Not related. This is only for the driver to know which APIs to support.
> For example, an older driver would still be able to work with a newer
> device. I can move this to pvrdma.h as well.
>
> To be honest, I thought I can move this file into the uapi folder since
> the structures here are shared with the user-level library. Based on
> your comments in this thread and the other ones, I think it makes sense
> to move this file back to the pvrdma driver folder and rename it
> (pvrdma_wqe.h?) to avoid confusion. There might still be some duplicate
> code (especially the UAR offsets and WQE structs) here and in our
> user-level library.
>
> Let me know if that makes sense.
>
> > > +
> > > +#define PVRDMA_UAR_HANDLE_MASK	0x00FFFFFF	/* Bottom 24 bits. */
> > > +#define PVRDMA_UAR_QP_OFFSET	0		/* Offset of QP doorbell. */
> > > +#define PVRDMA_UAR_QP_SEND	BIT(30)		/* Send bit. */
> > > +#define PVRDMA_UAR_QP_RECV	BIT(31)		/* Recv bit. */
> > > +#define PVRDMA_UAR_CQ_OFFSET	4		/* Offset of CQ doorbell. */
> > > +#define PVRDMA_UAR_CQ_ARM_SOL	BIT(29)		/* Arm solicited bit. */
> > > +#define PVRDMA_UAR_CQ_ARM	BIT(30)		/* Arm bit. */
> > > +#define PVRDMA_UAR_CQ_POLL	BIT(31)		/* Poll bit. */
> > > +#define PVRDMA_INVALID_IDX	-1		/* Invalid index. */
> > >
> > > +
> > > +/* PVRDMA atomic compare and swap */
> > > +struct pvrdma_exp_cmp_swap {
> > >
> > > _EXP_ looks very similar to MLNX_OFED naming convention.
> > >
>
> Yes, the operation was based on that. Any concerns?
> I can rename this and the one below.

Yes, please.
The common practice in IB subsystem is to use _ex_ notation for such
extended structures.

>
> > > +	__u64 swap_val;
> > > +	__u64 compare_val;
> > > +	__u64 swap_mask;
> > > +	__u64 compare_mask;
> > > +};
> > > +
> > > +/* PVRDMA atomic fetch and add */
> > > +struct pvrdma_exp_fetch_add {
> > >
> > > The same as above.
> > >
> > > +	__u64 add_val;
> > > +	__u64 field_boundary;
> > > +};
> > > +
> > > +/* PVRDMA address vector. */
> > > +struct pvrdma_av {
> > > +	__u32 port_pd;
> > > +	__u32 sl_tclass_flowlabel;
> > > +	__u8 dgid[16];
> > > +	__u8 src_path_bits;
> > > +	__u8 gid_index;
> > > +	__u8 stat_rate;
> > > +	__u8 hop_limit;
> > > +	__u8 dmac[6];
> > > +	__u8 reserved[6];
> > > +};
> > > +
> > > +/* PVRDMA scatter/gather entry */
> > > +struct pvrdma_sge {
> > > +	__u64   addr;
> > > +	__u32   length;
> > > +	__u32   lkey;
> > > +};
> > > +
> > > +/* PVRDMA receive queue work request */
> > > +struct pvrdma_rq_wqe_hdr {
> > > +	__u64 wr_id;		/* wr id */
> > > +	__u32 num_sge;		/* size of s/g array */
> > > +	__u32 total_len;	/* reserved */
> > > +};
> > > +/* Use pvrdma_sge (ib_sge) for receive queue s/g array elements. */
> > > +
> > > +/* PVRDMA send queue work request */
> > > +struct pvrdma_sq_wqe_hdr {
> > > +	__u64 wr_id;		/* wr id */
> > > +	__u32 num_sge;		/* size of s/g array */
> > > +	__u32 total_len;	/* reserved */
> > > +	__u32 opcode;		/* operation type */
> > > +	__u32 send_flags;	/* wr flags */
> > > +	union {
> > > +		__u32 imm_data;
> > > +		__u32 invalidate_rkey;
> > > +	} ex;
> > > +	__u32 reserved;
> > > +	union {
> > > +		struct {
> > > +			__u64 remote_addr;
> > > +			__u32 rkey;
> > > +			__u8 reserved[4];
> > > +		} rdma;
> > > +		struct {
> > > +			__u64 remote_addr;
> > > +			__u64 compare_add;
> > > +			__u64 swap;
> > > +			__u32 rkey;
> > > +			__u32 reserved;
> > > +		} atomic;
> > > +		struct {
> > > +			__u64 remote_addr;
> > > +			__u32 log_arg_sz;
> > > +			__u32 rkey;
> > > +			union {
> > > +				struct pvrdma_exp_cmp_swap  cmp_swap;
> > > +				struct pvrdma_exp_fetch_add fetch_add;
> > > +			} wr_data;
> > > +		} masked_atomics;
> > > +		struct {
> > > +			__u64 iova_start;
> > > +			__u64 pl_pdir_dma;
> > > +			__u32 page_shift;
> > > +			__u32 page_list_len;
> > > +			__u32 length;
> > > +			__u32 access_flags;
> > > +			__u32 rkey;
> > > +		} fast_reg;
> > > +		struct {
> > > +			__u32 remote_qpn;
> > > +			__u32 remote_qkey;
> > > +			struct pvrdma_av av;
> > > +		} ud;
> > > +	} wr;
> > > +};
> > >
> > > No, I have half-baked patch series which refactors this structure in kernel.
> > > There is no need to put this structure in UAPI.
> > >
>
> This is specific to our device.. We do need to enqueue the WQE in this format
> for the device to recognize it. This is the same format that the user-level
> library will put the WQE in. As I said above, we can move this to the main
> pvrdma driver directory if you prefer.

This is different implementations between kernel and user space.
We don't want to bring user space limitations to kernel.
Take a look here:
http://lxr.free-electrons.com/source/include/rdma/ib_verbs.h#L1192

>
> > > +/* Use pvrdma_sge (ib_sge) for send queue s/g array elements. */
> > > +
> > > +/* Completion queue element. */
> > > +struct pvrdma_cqe {
> > > +	__u64 wr_id;
> > > +	__u64 qp;
> > > +	__u32 opcode;
> > > +	__u32 status;
> > > +	__u32 byte_len;
> > > +	__u32 imm_data;
> > > +	__u32 src_qp;
> > > +	__u32 wc_flags;
> > > +	__u32 vendor_err;
> > > +	__u16 pkey_index;
> > > +	__u16 slid;
> > > +	__u8 sl;
> > > +	__u8 dlid_path_bits;
> > > +	__u8 port_num;
> > > +	__u8 smac[6];
> > > +	__u8 reserved2[7]; /* Pad to next power of 2 (64). */
> > > +};
> > > +
> > > +struct pvrdma_ring {
> > > +	atomic_t prod_tail;	/* Producer tail. */
> > > +	atomic_t cons_head;	/* Consumer head. */
> > > +};
> > > +
> > > +struct pvrdma_ring_state {
> > > +	struct pvrdma_ring tx;	/* Tx ring. */
> > > +	struct pvrdma_ring rx;	/* Rx ring. */
> > > +};
> > > +
> > > +static inline int pvrdma_idx_valid(__u32 idx, __u32 max_elems)
> > > +{
> > > +	/* Generates fewer instructions than a less-than. */
> > > +	return (idx & ~((max_elems << 1) - 1)) == 0;
> > > +}
> > > +
> > > +static inline __s32 pvrdma_idx(atomic_t *var, __u32 max_elems)
> > > +{
> > > +	const unsigned int idx = atomic_read(var);
> > > +
> > > +	if (pvrdma_idx_valid(idx, max_elems))
> > > +		return idx & (max_elems - 1);
> > > +	return PVRDMA_INVALID_IDX;
> > > +}
> > > +
> > > +static inline void pvrdma_idx_ring_inc(atomic_t *var, __u32 max_elems)
> > > +{
> > > +	__u32 idx = atomic_read(var) + 1;	/* Increment. */
> > >
> > > It is definitely different atomic_read than you expect. From my grep
> > > searches on my machine, linux kernel doesn't export in standard headers
> > > the atomic_* functions and C has their implementation of that functions.
> > >
>
> This would probably change for the user-level library, so no need have this file
> in UAPI.
>
> > > +
> > > +	idx &= (max_elems << 1) - 1;		/* Modulo size, flip gen. */
> > > +	atomic_set(var, idx);
> > > +}
> > > +
> > > +static inline __s32 pvrdma_idx_ring_has_space(const struct pvrdma_ring *r,
> > > +					      __u32 max_elems, __u32 *out_tail)
> > > +{
> > > +	const __u32 tail = atomic_read(&r->prod_tail);
> > > +	const __u32 head = atomic_read(&r->cons_head);
> > > +
> > > +	if (pvrdma_idx_valid(tail, max_elems) &&
> > > +	    pvrdma_idx_valid(head, max_elems)) {
> > > +		*out_tail = tail & (max_elems - 1);
> > > +		return tail != (head ^ max_elems);
> > > +	}
> > > +	return PVRDMA_INVALID_IDX;
> > > +}
> > > +
> > > +static inline __s32 pvrdma_idx_ring_has_data(const struct pvrdma_ring *r,
> > > +					     __u32 max_elems, __u32 *out_head)
> > > +{
> > > +	const __u32 tail = atomic_read(&r->prod_tail);
> > > +	const __u32 head = atomic_read(&r->cons_head);
> > > +
> > > +	if (pvrdma_idx_valid(tail, max_elems) &&
> > > +	    pvrdma_idx_valid(head, max_elems)) {
> > > +		*out_head = head & (max_elems - 1);
> > > +		return tail != head;
> > > +	}
> > > +	return PVRDMA_INVALID_IDX;
> > > +}
> > > +
> > > +static inline bool pvrdma_idx_ring_is_valid_idx(const struct pvrdma_ring *r,
> > > +						__u32 max_elems, __u32 *idx)
> > > +{
> > > +	const __u32 tail = atomic_read(&r->prod_tail);
> > > +	const __u32 head = atomic_read(&r->cons_head);
> > > +
> > > +	if (pvrdma_idx_valid(tail, max_elems) &&
> > > +	    pvrdma_idx_valid(head, max_elems) &&
> > > +	    pvrdma_idx_valid(*idx, max_elems)) {
> > > +		if (tail > head && (*idx < tail && *idx >= head))
> > > +			return true;
> > > +		else if (head > tail && (*idx >= head || *idx < tail))
> > > +			return true;
> > > +	}
> > > +	return false;
> > > +}
> > > +
> > > +#endif /* __PVRDMA_UAPI_H__ */
> > >
> > > I suggest completely remove this file from UAPI headers folder.
> > >
>
> I can move this back to the pvrdma driver folder.

Yes, please.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

  reply	other threads:[~2016-09-26  6:13 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-24 23:21 [PATCH v5 00/16] Add Paravirtual RDMA Driver Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 02/16] IB/pvrdma: Add user-level shared functions Adit Ranadive
     [not found]   ` <b52db7f59d69089f7b1d53311574143c4da8252a.1474759181.git.aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
2016-09-25  7:26     ` Leon Romanovsky
2016-09-25 12:29       ` Leon Romanovsky
     [not found]       ` <20160925072624.GV4088-2ukJVAZIZ/Y@public.gmane.org>
2016-09-26  4:22         ` Adit Ranadive
2016-09-26  6:13           ` Leon Romanovsky [this message]
     [not found]             ` <20160926061338.GH4088-2ukJVAZIZ/Y@public.gmane.org>
2016-09-26 17:33               ` Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 03/16] IB/pvrdma: Add virtual device RDMA structures Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 04/16] IB/pvrdma: Add the paravirtual RDMA device specification Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 05/16] IB/pvrdma: Add functions for Verbs support Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 06/16] IB/pvrdma: Add paravirtual rdma device Adit Ranadive
     [not found] ` <cover.1474759181.git.aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
2016-09-24 23:21   ` [PATCH v5 01/16] vmxnet3: Move PCI Id to pci_ids.h Adit Ranadive
2016-09-24 23:21   ` [PATCH v5 07/16] IB/pvrdma: Add helper functions Adit Ranadive
2016-09-24 23:21   ` [PATCH v5 09/16] IB/pvrdma: Add support for Completion Queues Adit Ranadive
2016-09-24 23:21   ` [PATCH v5 10/16] IB/pvrdma: Add UAR support Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 08/16] IB/pvrdma: Add device command support Adit Ranadive
     [not found]   ` <47e16986707ddbe0ffa15795ef9ae60d15bdd728.1474759181.git.aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
2016-09-26  7:16     ` Yuval Shaia
     [not found]       ` <20160926071601.GA6352-Hxa29pjIrERMGUUWBy6pNA@public.gmane.org>
2016-09-26 18:06         ` Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 11/16] IB/pvrdma: Add support for memory regions Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 12/16] IB/pvrdma: Add Queue Pair support Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 13/16] IB/pvrdma: Add the main driver module for PVRDMA Adit Ranadive
     [not found]   ` <8db982c37a0a96b37df02826564474a216922f25.1474759181.git.aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
2016-09-25  7:57     ` Leon Romanovsky
2016-09-26  5:10       ` Adit Ranadive
     [not found]         ` <da4a09fc-bef1-60ea-f9af-7c505a9af6d6-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
2016-09-26  6:03           ` Leon Romanovsky
2016-09-26  7:27     ` Yuval Shaia
2016-09-26 18:15       ` Adit Ranadive
2016-09-27  9:21         ` David Laight
2016-09-27 18:50           ` Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 14/16] IB/pvrdma: Add Kconfig and Makefile Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 15/16] IB: Add PVRDMA driver Adit Ranadive
2016-09-24 23:21 ` [PATCH v5 16/16] MAINTAINERS: Update for " Adit Ranadive
2016-09-25  7:30   ` Leon Romanovsky
     [not found]     ` <20160925073010.GW4088-2ukJVAZIZ/Y@public.gmane.org>
2016-09-26  5:22       ` Adit Ranadive
2016-09-26  5:56         ` Leon Romanovsky
2016-09-25  7:03 ` [PATCH v5 00/16] Add Paravirtual RDMA Driver Leon Romanovsky
2016-09-26  5:25   ` Adit Ranadive
2016-09-26  5:57     ` Leon Romanovsky
     [not found]     ` <9f65ab5c-d8c2-e8e3-9334-5d1865a20dc9-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
2016-09-26 16:51       ` Jason Gunthorpe
2016-09-26 20:40         ` Adit Ranadive
     [not found]           ` <b4e0acd9-e1a2-f6ac-0afa-c0fd62dd62f0-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
2016-09-26 21:07             ` Jason Gunthorpe
2016-09-26 21:16               ` Adit Ranadive
2016-09-26 22:42 ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160926061338.GH4088@leon.nu \
    --to=leon@kernel.org \
    --cc=aditr@vmware.com \
    --cc=asarwade@vmware.com \
    --cc=bryantan@vmware.com \
    --cc=dledford@redhat.com \
    --cc=georgezhang@vmware.com \
    --cc=jhansen@vmware.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pv-drivers@vmware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).