From: Matthew Hall
Subject: Re: tcpdump support in DPDK 2.3
Date: Wed, 16 Dec 2015 18:38:24 -0500
Message-ID: <20151216233824.GA23052@mhcomputing.net>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC358AF776@smartserver.smartshare.dk>
To: Morten Brørup
Cc: dev@dpdk.org
List-Id: patches and discussions about DPDK

On Wed, Dec 16, 2015 at 11:45:46PM +0100, Morten Brørup wrote:
> Matthew presented a very important point a few hours ago: We don't need
> tcpdump support for debugging the application in a lab; we already have
> plenty of other tools for debugging what we are developing. We need
> tcpdump support for debugging network issues in a production network.

+1

> In my "hardened network appliance" world, a solution designed purely for
> legacy applications (tcpdump, Wireshark etc.) is useless because the
> network technician doesn't have access to these applications on the
> appliance.

Maybe that's true on one particular system. But I've used a whole ton of
systems, including appliances, where this was not true.
I really do want to find a way to support them, but based on my recent
discussions with Alex Nasonov, who wrote bpfjit, I don't think it is
possible without really tearing apart libpcap. So for now the only good
hope is Wireshark's extcap support.

> While a PC system running a DPDK based application might have plenty of
> spare lcores for filtering, the SmartShare appliances are already using
> all lcores for dedicated purposes, so the runtime filtering has to be
> done by the IO lcores (otherwise we would have to rehash everything and
> reallocate some lcores for mirroring, which I strongly oppose). Our
> non-DPDK firmware has also always been filtering directly in the fast
> path.

The shared-process stuff and weird leftover-lcore stuff seems far too
complex to me, whether or not there are any spare lcores. To me it seems
easier if I just call some function and hand it mbufs: it quickly checks
them against a linked list of active filters if any are present, or does
nothing and returns if no filter is active.

> If the filter is so complex that it unexpectedly degrades the normal
> traffic forwarding performance

If bpfjit is used, I think it is very hard to affect the performance
much, unless you do something incredibly crazy.

> Although it is generally considered bad design if a system's behavior
> (or performance) changes unexpectedly when debugging features are being
> used,

I think we can keep the behavior change quite small using something like
what I described.

> Other companies might prefer to keep their fast path performance
> unaffected and dedicate/reallocate some lcores for filtering.

It always starts out unaffected... then goes back to accepting a bit of
slowness when people are forced to re-learn how bad it is with no
debugging. I have seen it again and again in many companies.
Hence my proposal for efficient lightweight debugging support from the
beginning.

> 1. BPF filtering (... a DPDK variant of bpfjit),

+1

> 2. scalable packet queueing for the mirrored packets (probably multi
> producer, single or multi consumer)

I hate queueing. It always reduces the maximum possible throughput,
because queueing is inefficient. It is better just to put packets where
they need to go immediately (run to completion), while the mbufs are
already prefetched.

> Then the DPDK application can take care of interfacing to the attached
> application and outputting the mirrored packets to the appropriate
> destination

Too complicated. Pcap and extcap should just work by default.

> A note about packet ordering: Mirrored packets belonging to different
> flows are probably out of order because of RSS, where multiple lcores
> contribute to the mirror output.

Where I worry is weird configurations where a single flow can land on
more than one core. But I think most users try not to do this.
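To be concrete about the inline check I keep describing, here is a rough
sketch. Everything in it is hypothetical: the names, the stub `struct mbuf`
(standing in for `struct rte_mbuf`), and the function-pointer "match"
(which in practice would be a bpfjit-compiled classifier) are all mine,
not anything that exists in DPDK today.

```c
/* Hypothetical sketch only -- not DPDK API. "struct mbuf" is a stub
 * standing in for struct rte_mbuf, and the match callback stands in for
 * a bpfjit-compiled BPF classifier. */
#include <stddef.h>
#include <stdbool.h>

struct mbuf {                              /* stub for struct rte_mbuf */
    const unsigned char *data;
    size_t len;
};

/* One active capture filter: a match predicate plus a mirror action. */
struct dbg_filter {
    bool (*match)(const struct mbuf *m);   /* e.g. bpfjit-compiled */
    void (*mirror)(const struct mbuf *m);  /* copy the packet out now */
    struct dbg_filter *next;
};

/* NULL whenever no debugging is active. */
static struct dbg_filter *active_filters;

/* Called from the IO lcore fast path, run to completion: when no filter
 * is installed this is a single pointer test; otherwise walk the short
 * linked list and mirror matches in place, with no queueing. */
static inline void dbg_filter_check(const struct mbuf *m)
{
    for (struct dbg_filter *f = active_filters; f != NULL; f = f->next)
        if (f->match(m))
            f->mirror(m);
}
```

The point of the design is that the no-filter case costs one branch per
packet, and the mirrored copy happens while the mbuf is still hot in
cache, instead of bouncing it through a ring to another lcore.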