From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jiri Pirko Subject: Re: [patch net-next v3 02/17] net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del Date: Wed, 26 Nov 2014 12:40:35 +0100 Message-ID: <20141126114035.GM1875@nanopsycho.orion> References: <1416911328-10979-3-git-send-email-jiri@resnulli.us> <5474A25C.3080505@mojatatu.com> <5474A7EE.8000300@intel.com> <5474ABF0.60901@mojatatu.com> <5474AE9B.6000500@intel.com> <5474B353.10802@mojatatu.com> <547546C5.3060207@mojatatu.com> <5475B952.2080500@mojatatu.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Scott Feldman , John Fastabend , Netdev , "David S. Miller" , "nhorman@tuxdriver.com" , Andy Gospodarek , Thomas Graf , "dborkman@redhat.com" , "ogerlitz@mellanox.com" , "jesse@nicira.com" , "pshelar@nicira.com" , "azhou@nicira.com" , "ben@decadent.org.uk" , "stephen@networkplumber.org" , "Kirsher, Jeffrey T" , "vyasevic@redhat.com" , Cong Wang , Eric Dumazet , Florian Fainelli , Roopa Prabhu , John Linville Return-path: Received: from mail-wi0-f182.google.com ([209.85.212.182]:42647 "EHLO mail-wi0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751570AbaKZLki (ORCPT ); Wed, 26 Nov 2014 06:40:38 -0500 Received: by mail-wi0-f182.google.com with SMTP id h11so4620026wiw.15 for ; Wed, 26 Nov 2014 03:40:36 -0800 (PST) Content-Disposition: inline In-Reply-To: <5475B952.2080500@mojatatu.com> Sender: netdev-owner@vger.kernel.org List-ID: Wed, Nov 26, 2014 at 12:28:18PM CET, jhs@mojatatu.com wrote: >On 11/25/14 22:59, Scott Feldman wrote: >>On Tue, Nov 25, 2014 at 5:19 PM, Jamal Hadi Salim wrote: >>>On 11/25/14 21:36, Scott Feldman wrote: > >>> >>>>>Ok, guess i am gonna have to go stare at the code some more. >>>>>I thought we returned one of the error codes? >>>>>A bitmask would work for a single entry - because you have two >>>>>options add to h/ware and/or s/ware. So response is easy to encode. >>>>>But if i have 1000 and they are sparsely populated (think an indexed >>>>>table and i have indices 1, 23, 45, etc), then a bitmask would be >>>>>hard to use. >>>> >>>> >>>>I'm confused by this discussion. >>> >>> >>>This is about the policy which states "install as many as you can, dont >>>worry about failures". In such a case, how do you tell user space back >>>"oh, btw you know your request #1, #23, and 45 went ok, but nothing else >>>worked". A simple return code wont work. You could return a code to >>>say "some worked". At which case user space could dump and find out only >>>#1, #23 and #45 worked. >> >>You request for what? That's my confusion. > >Scott, you are gonna make do this all over again?;-> >The summary is there are three possible policies that could be >identified by the user asking for a kernel operation. >One use case example was to send a bunch of (for example) >create/updates and request that the kernel should not abort on a >failure of a single one but to keep going and create/update as many >as possible. Is that part clear? I know it is not what you do, >but there are use cases for that (Read John's response). >Now assuming someone wants this and some entries failed; >how do you tell user space back what was actually updated vs not? >You could return a code which says "partial success". >Forget whether the table is keyed or indexed but if you wanted >to return more detailed info you would return an array/vector of some >sort with status code per entry. Something netlink cant do. >Is that a better description? Sure this is something that is reasonable to request. But that would require a major changes to userspace api. At this moment, when we are using the existing api, I would leave this out for phase 1. Let this be resolved later as a separate work. Does that make sense? > >>Are you trying to install >>FDB entry into both SW and HW at same time? > > >What is wrong with installing on both hardware and software? The >point was to identify what kind of policies could be requested by >the user; but even for the bridge why is it bad that i ask for >both master&self? >It is something I can do today with none of these patches. > >>And then do a bunch in a >>batch? I'm saying use MASTER for SW and SELF for HW in two steps, > >But that would be enforcing your policy on me. > >>if >>you want FDB entry installed in both Sw and HW. Check your return >>code each step. Batch all to HW first, then batch all that PASSED to >>SW. I don't even know really why you're trying to install to both HW >>and SW. Install it to HW and be done. fdb_dump will set HW entries >>via SELF. >> > >First off: bad performance, but your call to do it that way >(just please please dont enforce it on me;->) > >Lets take the hardware batching you mentioned above and see if >i can help to clarify in the third policy choice (continue-on-failure). >Lets say you have a keyed table such as the fdb table is. >You send 10 entries to be created/added in hardware. #3 and #5 failed >because you made a mistake and sent them with the same key. #9 and #10 >failed because the hardware doesnt have any more space. >we didnt stop and go back for #3 and #5 because the user told >us to continue and do the rest when we fail. And s/he did that because >she wanted to put as many entries in hardware as possible without >necessarily needing to know how much space exists. > > >>Ah, Jamal, look again at patches 13-17/17 in last v3 set. That was a >>big steaming snickerdoodle just for you! Now you can push policy >>knobs down to port driver and or bridge to fine tune what ever you >>want. You'll find knobs for learning, flooding, learning sync to hw, >>etc. I thought you even ACKed some of these. > >I think it almost there. >What you are missing is the policy decision to only sync when i >say so. Having an ndo_ops is a necessity but i dont want the driver >to decide for me just because it can ;-> >Telling hardware to learn is instructing it to self update its entries >based on source lookup failure. That is distinctly different from >telling to sync to the kernel. So if you add that knob we are in good >shape. > >cheers, >jamal > >>a) above knob is 14/17 >>patch, b) above is using existing learning knob on bridge, c) above I >>don't get...no point in syncing that direction. >> >