From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Nyström
Subject: Re: [ovs-dev] [PATCH RFC] dpif-netdev: Add support Intel DPDK based ports.
Date: Thu, 13 Mar 2014 08:37:50 +0100
Message-ID: <5321604E.50808@gmail.com>
References: <1390873715-26714-1-git-send-email-pshelar@nicira.com>
 <52E7D13B.9020404@redhat.com> <52E8B88A.1070104@redhat.com>
 <52E8D772.9070302@6wind.com> <52E8E2AB.1080600@redhat.com>
 <52E92DA6.9070704@6wind.com> <52E936D9.4010207@redhat.com>
 <00ef01cf1d33$5e509270$1af1b750$@com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org,
 dev-VfR2kkLFssw@public.gmane.org,
 dpdk-ovs-y27Ovi1pjclAfugRpC6u6w@public.gmane.org
To: François-Frédéric Ozog, 'Thomas Graf', 'Vincent JARDIN'
In-Reply-To: <00ef01cf1d33$5e509270$1af1b750$@com>
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces-VfR2kkLFssw@public.gmane.org
Sender: "dev"

On 2014-01-29 21:47, François-Frédéric Ozog wrote:
>>> First and easy answer: it is open source, so anyone can recompile. So,
>>> what's the issue?
>>
>> I'm talking from a pure distribution perspective here: Requiring to
>> recompile all DPDK based applications to distribute a bugfix or to add
>> support for a new PMD is not ideal.
>
>>
>> So ideally OVS would have the possibility to link against the shared
>> library long term.
>
> I agree that distribution of DPDK apps is not covered properly at present.
> Identifying the proper scheme requires a specific analysis based on the
> constraints of the Telecom/Cloud/Networking markets.
>
> In the telecom world, if you fix the underlying framework of an app, you
> will still have to validate the solution, i.e. app/framework. In addition, the
> idea of shared libraries introduces the implied requirement to validate apps
> against diverse versions of DPDK shared libraries. This translates into
> development and support costs.
>
> I also expect many DPDK applications to tackle core networking features,
> with sub-microsecond packet handling delays and even lower than 200ns
> (NAT64...). The lazy binding based on ELF PLT represents quite a cost, not
> to mention that optimization stops at shared library boundaries (gcc
> whole program optimization can be very effective...). Microsoft DLL linkage
> is an order of magnitude faster. If Linux was to provide that, I would
> probably revise my judgment. (I haven't checked Linux dynamic linking
> implementation for some time so my understanding of Linux dynamic linking
> may be outdated).
>
>
>>
>>> I get lost: do you mean ABI + API toward the PMDs or towards the
>>> applications using the librte ?
>>
>> Towards the PMDs is more straight forward at first so it seems logical to
>> focus on that first.
>
> I don't think it is so straight forward. Many recent cards such as Chelsio
> and Myricom have a very different "packet memory layout" that does not fit
> so easily into the current DPDK architecture.
>
> 1) "traditional" architecture: the driver reserves X buffers and provides the
> card with descriptors of those buffers. Each packet is DMA'ed into exactly
> one buffer. Typically you have 2K buffers, a 64 byte packet consumes exactly
> one buffer
>
> 2) "alternative" new architecture: the driver reserves a memory zone, say
> 4MB, without any structure, and provides a single zone description and a
> ring buffer to the card. (there are no individual buffer descriptors any more).
> The card fills the memory zone with packets, one next to the other and
> specifies where the packets are by updating the supplied ring. Out of the
> many issues fitting this scheme into DPDK, you cannot free a single mbuf:
> you have to maintain a ref count to the memory zone so that, when all mbufs
> have been "released", the memory zone can be freed.
> That's quite a stretch from the current paradigm.
>
> Apart from this aspect, managing RSS is too tied to Intel's flow director
> concepts and cannot directly accommodate smarter or dumber RSS mechanisms.
>
> That said, I fully agree PMD API should be revisited.

Hi,
Sorry for jumping in late.
Perhaps you are already aware of OpenDataPlane, which can use DPDK as
its south bound NIC interface.

>
> Cordially,
>
> François-Frédéric
>
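
For readers following the quoted point 2), the zone-based layout and its
reference counting can be modelled in a few lines. This is only an
illustrative sketch in Python, not DPDK code: the Zone and PacketRef classes
and all of their methods are hypothetical names invented here, and the 4 MB
size is simply the example figure from the quoted text. It assumes the driver
holds one reference of its own while the card may still write into the zone.

```python
"""Toy model of a ref-counted packet memory zone (hypothetical, not a DPDK API)."""


class Zone:
    """One unstructured zone that the card fills with back-to-back packets."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.write_off = 0   # next free byte in the zone
        self.refcnt = 1      # assumption: the driver holds one reference itself
        self.freed = False

    def receive(self, payload):
        """Model the card writing a packet and reporting (offset, length)."""
        end = self.write_off + len(payload)
        if end > len(self.buf):
            raise MemoryError("zone full")
        self.buf[self.write_off:end] = payload
        pkt = PacketRef(self, self.write_off, len(payload))
        self.write_off = end
        self.refcnt += 1     # every outstanding packet pins the whole zone
        return pkt

    def retire(self):
        """The driver stops handing this zone to the card and drops its reference."""
        self._put()

    def _put(self):
        self.refcnt -= 1
        if self.refcnt == 0:
            self.freed = True  # only now can the zone memory be reclaimed


class PacketRef:
    """A view of one packet inside a zone; it cannot be freed on its own."""

    def __init__(self, zone, off, length):
        self.zone = zone
        self.off = off
        self.length = length
        self.released = False

    def data(self):
        return bytes(self.zone.buf[self.off:self.off + self.length])

    def release(self):
        if not self.released:  # make double release harmless in this model
            self.released = True
            self.zone._put()


if __name__ == "__main__":
    zone = Zone(4 * 1024 * 1024)
    first = zone.receive(b"\x00" * 64)
    second = zone.receive(b"\x01" * 128)
    zone.retire()
    first.release()
    print("freed after first release:", zone.freed)
    second.release()
    print("freed after second release:", zone.freed)
```

The point of the model is that releasing one packet never returns memory by
itself: the zone is reclaimed only after the driver has retired it and every
packet view has been released, which is the difference from the
one-buffer-per-packet scheme in point 1).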