public inbox for linux-rdma@vger.kernel.org
From: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: "Wan, Kaike" <kaike.wan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"Hefty,
	Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	"Weiny, Ira" <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Jason Gunthorpe
	<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>,
	"Hal Rosenstock
	(hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org)"
	<hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>,
	Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: Re: [RFC] IB/sa: Route SA pathrecord query through netlink
Date: Thu, 21 May 2015 13:21:14 -0400	[thread overview]
Message-ID: <1432228874.28905.35.camel@redhat.com> (raw)
In-Reply-To: <3F128C9216C9B84BB6ED23EF16290AFB0CAB2E96-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>


On Thu, 2015-05-21 at 13:52 +0000, Wan, Kaike wrote:
> In our previous posting to the mailing list, we proposed sending a MAD request from the kernel (more
> specifically, from the ib_sa module) to a user space application (ibacm in this case) through netlink.
> The user space application will send back the response. This simple scheme can achieve the goal 
> of a local SA cache in user space.
> 
> The format of the request and response is diagrammed below:
> 
>   ------------------
>   | netlink header |
>   ------------------
>   |     MAD        |
>   ------------------
> 
> The kernel requests a pathrecord, and the user application finds it in its local cache and sends
> it to the kernel. If the netlink request fails, the kernel will send the request to SA through the
> normal IB path (ib_mad -> hca driver -> wire).
> 
> Jason pointed out that this message format was limited to a lower-stack format (MAD) and that its use
> could not be readily extended to upper layer modules like rdma_cm. After lengthy discussions, we
> came up with a modified scheme, as described below.
> 
> The general format of the request and response will be the same:
> 
>   ------------------
>   | netlink header |
>   ------------------
>   |  Data header   |
>   ------------------
>   |      Data      |
>   ------------------
> 
> The data header contains information about the type of request/response, the status (for response),
> the type (format) of the data, the total length of the data header + data, and a flags field about
> the request/response or data.
> 
> Based on the type of the data, the data section may be in a different format: a string holding the host
> name to resolve, an IPv4/IPv6 address, a pathrecord, a user pathrecord (struct ib_user_path_rec),
> or simply a MAD (like our posted patches), etc. Essentially it can be of any format based on the 
> data type. The key is to document the format so that the kernel and user space can communicate 
> correctly.
> 
> The details are described below:
> 
> #define IB_NL_VERSION		0x01
> 
> #define IB_NL_OP_MASK		0x0F
> #define IB_NL_OP_RESOLVE	0x01
> #define IB_NL_OP_QUERY_PATH	0x02
> #define IB_NL_OP_SET_TIMEOUT	0x03
> #define IB_NL_OP_ACK		0x80

If OP_ACK is one bit, why isn't the OP_MASK 0x7f?

> #define IB_NL_STATUS_SUCCESS	0x0000
> #define IB_NL_STATUS_ENODATA	0x0001

Do we need 16 bits for a bool?  In fact, couldn't this actually be
switched so that the return of the message uses OP_SUCCESS instead of
OP_ACK?

In other words, instead of two items here, couldn't the ACK bit be
dropped entirely and replaced with SUCCESS?  Then when the user app
returns the netlink packet, if the op on return is equal to the op on
send, it failed; if it's op | SUCCESS, it succeeded.
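
A minimal sketch of that check, assuming a hypothetical IB_NL_OP_SUCCESS
bit sitting where IB_NL_OP_ACK is today and the mask widened to 0x7f.
The helper name ib_nl_reply_ok() is made up for illustration and is not
part of the posted proposal:

```c
#include <stdbool.h>
#include <stdint.h>

#define IB_NL_OP_MASK        0x7f  /* assumed: every bit except the top one */
#define IB_NL_OP_RESOLVE     0x01
#define IB_NL_OP_QUERY_PATH  0x02
#define IB_NL_OP_SET_TIMEOUT 0x03
#define IB_NL_OP_SUCCESS     0x80  /* hypothetical replacement for IB_NL_OP_ACK */

/*
 * Hypothetical helper: true only if the reply carries the same op as
 * the request and has the SUCCESS bit set.  A reply whose opcode is
 * identical to the request (no SUCCESS bit) means the lookup failed.
 */
static bool ib_nl_reply_ok(uint8_t sent_op, uint8_t reply_op)
{
	if ((reply_op & IB_NL_OP_MASK) != (sent_op & IB_NL_OP_MASK))
		return false;	/* reply is for some other op */
	return (reply_op & IB_NL_OP_SUCCESS) != 0;
}
```

With this, the 16-bit status field has nothing left to carry.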

> #define IB_NL_DATA_TYPE_INVALID			0x0000
> #define IB_NL_DATA_TYPE_NAME			0x0001
> #define IB_NL_DATA_TYPE_ADDRESS_IP		0x0002
> #define IB_NL_DATA_TYPE_ADDRESS_IP6		0x0003
> #define IB_NL_DATA_TYPE_PATH_RECORD		0x0004
> #define IB_NL_DATA_TYPE_USER_PATH_REC		0x0005
> #define IB_NL_DATA_TYPE_MAD			0x0006
> 
> #define IB_NL_FLAGS_PATH_GMP			1
> #define IB_NL_FLAGS_PATH_PRIMARY		(1<<1)
> #define IB_NL_FLAGS_PATH_ALTERNATE		(1<<2)
> #define IB_NL_FLAGS_PATH_OUTBOUND		(1<<3)
> #define IB_NL_FLAGS_PATH_INBOUND		(1<<4)
> #define IB_NL_FLAGS_PATH_INBOUND_REVERSE 	(1<<5)
> #define IB_NL_FLAGS_PATH_BIDIRECTIONAL		(IB_NL_FLAGS_PATH_OUTBOUND | IB_NL_FLAGS_PATH_INBOUND_REVERSE)
> #define IB_NL_FLAGS_QUERY_SA			(1<<31)
> #define IB_NL_FLAGS_NODELAY			(1<<30)

Please keep these in numerical order; don't put <<31 and below it <<30.
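
Something like the following ordering, as a sketch.  Two things here
are my assumptions rather than anything in the posted defines: the
shifts are written as unsigned (1 << 31 on a signed int is undefined
behavior in C), and BIDIRECTIONAL is spelled with the IB_NL_FLAGS_
names so that it refers to macros defined in this list:

```c
/* Path flags, lowest bit first. */
#define IB_NL_FLAGS_PATH_GMP             (1U << 0)
#define IB_NL_FLAGS_PATH_PRIMARY         (1U << 1)
#define IB_NL_FLAGS_PATH_ALTERNATE       (1U << 2)
#define IB_NL_FLAGS_PATH_OUTBOUND        (1U << 3)
#define IB_NL_FLAGS_PATH_INBOUND         (1U << 4)
#define IB_NL_FLAGS_PATH_INBOUND_REVERSE (1U << 5)
#define IB_NL_FLAGS_PATH_BIDIRECTIONAL   (IB_NL_FLAGS_PATH_OUTBOUND | \
					  IB_NL_FLAGS_PATH_INBOUND_REVERSE)

/* Request flags at the top of the word, still in ascending order. */
#define IB_NL_FLAGS_NODELAY              (1U << 30)
#define IB_NL_FLAGS_QUERY_SA             (1U << 31)
```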

> struct ib_nl_data_hdr {
> 	__u8	version;
> 	__u8	opcode;
> 	__u16	status;
Drop status because we fold it into opcode
> 	__u16	type;
> 	__u16	reserved;
Drop reserved because we don't need alignment any more
> 	__u32	flags;
Flags is the only field likely to use up its bits quickly, and we would
want to make this header an even 128 bits in length, so add a
__u32 reserved; here.  That's more likely to be useful than the current
layout since we are likely to run out of flags before anything else.
> 	__u32	length;
> };
> 
> struct ib_nl_data {
> 	struct ib_nl_data_hdr		hdr;
> 	__u8				data[0];
> };
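
Putting the three header comments above together, a sketch of the
layout I have in mind.  The stdint types stand in for the kernel's
__u8/__u16/__u32 only so that the fragment compiles in user space, and
the size check is my addition, not part of the proposal:

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the trimmed header: status and the 16-bit reserved are gone. */
struct ib_nl_data_hdr {
	uint8_t  version;
	uint8_t  opcode;    /* op plus the success bit, no separate status */
	uint16_t type;      /* one of the IB_NL_DATA_TYPE_* values */
	uint32_t flags;
	uint32_t reserved;  /* spare word, e.g. for future flags */
	uint32_t length;    /* data header + data, as in the original */
};

/* 1 + 1 + 2 + 4 + 4 + 4 bytes = 16 bytes = 128 bits, with no padding. */
_Static_assert(sizeof(struct ib_nl_data_hdr) == 16,
	       "ib_nl_data_hdr should be exactly 128 bits");
```

Every field is naturally aligned, so no compiler padding is needed to
reach the 16-byte size.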
> 
> 
> These defines and structures can be added to file include/uapi/rdma/rdma_netlink.h (replace with
> RDMA_NL prefix) or contained in a separate file (include/uapi/rdma/ib_netlink.h ???).
> 
> Please share your thoughts.

I think an extensible netlink framework here is the right way to go,
certainly better than the one-shot method you had first.

> Kaike
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



