From mboxrd@z Thu Jan 1 00:00:00 1970
From: Doug Ledford
Subject: Re: [RFC] rdma/uverbs: Sketch for an ioctl framework
Date: Wed, 25 May 2016 14:06:38 -0400
Message-ID: <5745E9AE.6020700@redhat.com>
References: <1828884A29C6694DAF28B7E6B8A82373AB04FB7F@ORSMSX109.amr.corp.intel.com>
 <11b6d9c1-0b20-f929-c896-ca084fe18192@redhat.com>
 <20160524214137.GA6760@obsidianresearch.com>
In-Reply-To: <20160524214137.GA6760-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Jason Gunthorpe
Cc: Liran Liss, Hefty Sean, "linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)"
List-Id: linux-rdma@vger.kernel.org

On 5/24/2016 5:41 PM, Jason Gunthorpe wrote:
> On Tue, May 24, 2016 at 01:57:54PM -0400, Doug Ledford wrote:
>
>> Sean's proposal does away with the rigid nature of the current verbs 1.0
>> API and makes it much more flexible on a per-driver basis. This doesn't
>> address the end user API issues, but it at least cleans up the user
>> driver <-> kernel driver API so that one vendor's driver is not forced
>> to carry around all the baggage needed for every other vendor's drivers.
>
> I'm not sure what you are reading, but to me both proposals look very
> similar. They are both based on the generic object/action/method sort
> of model I talked about in an earlier thread.

They are similar in initial expression, but not in intent. Sean's is
not concerned about preserving struct ib_qp (just as an example) as it
stands, while Mellanox's patchset is all about passing around the same
objects via a different interface. Even though they encode objects in
netlink attributes, they are still expected to be the same basic objects
we use today (and this is how they minimize the driver impact).

> The main differences seem to boil down to the data marshalling and the
> dispatching style for the kernel side..

Data marshalling in Sean's case also entails data content changes with a
modest reorganization of what it entails for an item to be a core item
(Sean can correct me if I'm wrong here).

> Sean hasn't explored how to encode the actual method arguments, while
> Mellanox's has a fairly well developed scheme with the netlink
> encoding and sgl result list thingy.

You are correct that Sean's patch has very little in the way of argument
validation. However, I'm not entirely sure that Sean intended the core
to do that sort of validation; he may have intended the drivers to do
their own on the passed-through data. The Mellanox patches do so, but
at the expense of netlink, which many people on this list find painful
to read.

>> Under that model, each vendor only carries what they need. It would
>> then be libibverbs' responsibility to take that driver specific data
>> and
>
> Either patchset could go in this direction. This is a basic question
> we need to decide on.

And this is my central point, which I tried to make in my previous email:
there are multiple trains of thought on where this will end up, and
simply switching from write to ioctl is only part of the overall big
picture. There should only be one API break in this entire process, so
we need to make sure that any other possible API breakers are included
in the initial change to the ioctl interface.

> I'm starting to think the basic thrust of the Mellanox series (provide
> an easy compatibility with our userspace) is a sane approach. The
> other options look like far too much work to use as a starting point.

I could not care less about this argument. When you have to break an
API, you do what you have to do to do the job right. Doing things the
right way may turn out to be the easy way, but the argument would be
because it's the right thing to do, not the easy thing to do.

> That doesn't mean we can't decide to move in a direct-only direction -
> the uAPI needs to have enough extension points to allow for that. Such
> work should happen incrementally, and mainly target new uAPIs.

This is arguable. If we know we want to go basically direct only in the
future, then preserving the existing layer in the ioctl API eventually
becomes a burden. It would be better to go direct only from the
beginning. This needs to be settled.

>> and not also the user visible libibverbs API at the same time. If all
>> we want to talk about is verbs 1.0 over ioctl, then yes, we can do that.
>> But not if we truly want to discuss a verbs 2.0 API. And I have yet to
>> gather from the discussions I hear from people that we are in fact
>> decided on pursuing a verbs 1.0 over ioctl API instead of considering a
>> verbs 2.0 API.
>
> You are the only person I've heard who wants to restructure the
> libibverbs interface at the same time..

That's not entirely true. My vision for how we might start altering the
libibverbs interface is already being done (although with a slightly
different implementation than I had in mind) in the timestamp patches.
What I want to see us do in libibverbs is to make extensions start
following a new pattern instead of the one they have traditionally
followed.

But the reason I bring this up is because we need to be thinking of the
end to end data transfers when we are thinking about the API break and
any changes we might make. I'm thinking about possible changes in
libibverbs, Sean is thinking about libfabric/psm2/hfi1. If we end up
just doing a behind the scenes switch from write to ioctl with no
changing of data structures or command flow or anything else, then we
can ignore the end to end picture because it won't change significantly.
But if we do other things too, then I want other people to keep things
like this in mind; it's fundamental to good design in a case like this.