* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. [not found] <200412270715.iBR7Fffe026855@hera.kernel.org> @ 2004-12-27 12:16 ` Thomas Graf 2004-12-28 13:20 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-27 12:16 UTC (permalink / raw) To: David S. Miller; +Cc: netdev > ChangeSet 1.2055.37.1, 2004/11/17 16:08:01-08:00, util@deuroconsult.ro > > [PKT_SCHED]: Allow using nfmark as key in U32 classifier. > > Signed-off-by: Catalin(ux aka Dino) BOIE <catab at umbrella.ro> > Signed-off-by: David S. Miller <davem@davemloft.net> I must have missed this one. This should have been implemented in the metadata action module to make it available to all classifiers. We should really stop to add more stuff to specific classifiers which have to be removed once we have metamatch. I've made a proposal on paper just need some more time to cook up a patch. > +#ifdef CONFIG_CLS_U32_MARK > + if (tb[TCA_U32_MARK-1]) { > + if (RTA_PAYLOAD(tb[TCA_U32_MARK-1]) < sizeof(struct tc_u32_mark)) > + return -EINVAL; > + mark = RTA_DATA(tb[TCA_U32_MARK-1]); > + memcpy(&n->mark, mark, sizeof(struct tc_u32_mark)); > + n->mark.success = 0; > + } > +#endif This should have gone into u32_set_params ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-27 12:16 ` [PKT_SCHED]: Allow using nfmark as key in U32 classifier Thomas Graf @ 2004-12-28 13:20 ` jamal 2004-12-28 13:40 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2004-12-28 13:20 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Mon, 2004-12-27 at 07:16, Thomas Graf wrote: > > ChangeSet 1.2055.37.1, 2004/11/17 16:08:01-08:00, util@deuroconsult.ro > > > > [PKT_SCHED]: Allow using nfmark as key in U32 classifier. > > > > Signed-off-by: Catalin(ux aka Dino) BOIE <catab at umbrella.ro> > > Signed-off-by: David S. Miller <davem@davemloft.net> > > I must have missed this one. This should have been implemented in the > metadata action module to make it available to all classifiers. You mean meta match. > We > should really stop to add more stuff to specific classifiers which have > to be removed once we have metamatch. I've made a proposal on paper > just need some more time to cook up a patch. I have cycles now. Are you working on this or should i invest some of my cycles in it? Lets fix this once and for all. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 13:20 ` jamal @ 2004-12-28 13:40 ` Thomas Graf 2004-12-28 13:59 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-28 13:40 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104240053.1100.53.camel@jzny.localdomain> 2004-12-28 08:20 > On Mon, 2004-12-27 at 07:16, Thomas Graf wrote: > > > ChangeSet 1.2055.37.1, 2004/11/17 16:08:01-08:00, util@deuroconsult.ro > > > > > > [PKT_SCHED]: Allow using nfmark as key in U32 classifier. > > > > > > Signed-off-by: Catalin(ux aka Dino) BOIE <catab at umbrella.ro> > > > Signed-off-by: David S. Miller <davem@davemloft.net> > > > > I must have missed this one. This should have been implemented in the > > metadata action module to make it available to all classifiers. > > You mean meta match. Yes, I was thinking about implementing it as action, any objections? > > We > > should really stop to add more stuff to specific classifiers which have > > to be removed once we have metamatch. I've made a proposal on paper > > just need some more time to cook up a patch. > > I have cycles now. Are you working on this or should i invest some of my > cycles in it? Lets fix this once and for all. I don't care, I'm still testing the patchset to make classifer changes consistent again. A few big changes to tcindex/route classifier need extensive testing. My thoughts on meta match: A match consists of a header, lvalue, rvalue and stats. The header contains the handle, the requested operand (eq,lt,gt) and a flag to invert the meaning of the match. A value consists of a type and data where type specifies the actual metadata to be used. A few upper bits of the type specify the kind, i.e. wether it's numeric or a bytestring. Only values with matching upper bits can be compared. Additionally a mask and shift operator can be configured. Why so complicated? Comparing two kernel meta values can useful, realdev == indev when classyfing on tunnels, comparing backlogs, etc. Device matches should be made available as both numeric and bytestring match to have a fastpath when ifindex is stable and a slower path which is more flexible to changes. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 13:40 ` Thomas Graf @ 2004-12-28 13:59 ` jamal 2004-12-28 14:50 ` Thomas Graf 2004-12-28 16:11 ` Thomas Graf 0 siblings, 2 replies; 42+ messages in thread From: jamal @ 2004-12-28 13:59 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Tue, 2004-12-28 at 08:40, Thomas Graf wrote: > * jamal <1104240053.1100.53.camel@jzny.localdomain> 2004-12-28 08:20 > > You mean meta match. > > Yes, I was thinking about implementing it as action, any objections? > If you implement the mothership match changes we discussed then it should go as a match (as opposed to action). As an action its yet another deferal for later cleanup. So my preference is to get the changes we discussed then this meta match. I could whip out meta action for setting values if you are gonna work on the match piece. > > > We > > > should really stop to add more stuff to specific classifiers which have > > > to be removed once we have metamatch. I've made a proposal on paper > > > just need some more time to cook up a patch. > > > > I have cycles now. Are you working on this or should i invest some of my > > cycles in it? Lets fix this once and for all. > > I don't care, I'm still testing the patchset to make classifer changes > consistent again. A few big changes to tcindex/route classifier need > extensive testing. > If you have started working on it i would prefer you make the changes. BTW, I am talking about the top level match changes we discussed. > My thoughts on meta match: > > A match consists of a header, lvalue, rvalue and stats. The header > contains the handle, the requested operand (eq,lt,gt) and a flag to > invert the meaning of the match. A value consists of a type and data > where type specifies the actual metadata to be used. A few upper bits > of the type specify the kind, i.e. wether it's numeric or a > bytestring. Only values with matching upper bits can be compared. > Additionally a mask and shift operator can be configured. Why so > complicated? Comparing two kernel meta values can useful, realdev > == indev when classyfing on tunnels, comparing backlogs, etc. Device > matches should be made available as both numeric and bytestring match > to have a fastpath when ifindex is stable and a slower path which is > more flexible to changes. Sounds reasonable at the high level. I am not sure i got the stats part. Can you write out the BNF. Heres what i was thinking for meta action: tc <OPERATION> action metaset [index ID] METAVARS OPERATION:= ADD|DEL|GET|DUMP METAVARS:= <[INDEV] [FWMARK] [TCINDEX] [PRIO] [CLASSID]> INDEV = indev <devname> FWMARK:= fwmark <value> TCINDEX := tcindex <value> PRIO := prio <value> CLASSID := classid <class> I notice you throw in some mask and shift operator - not sure if i could make use of it here. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 13:59 ` jamal @ 2004-12-28 14:50 ` Thomas Graf 2004-12-28 15:55 ` jamal 2004-12-28 16:11 ` Thomas Graf 1 sibling, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-28 14:50 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104242397.1090.94.camel@jzny.localdomain> 2004-12-28 08:59 > On Tue, 2004-12-28 at 08:40, Thomas Graf wrote: > > * jamal <1104240053.1100.53.camel@jzny.localdomain> 2004-12-28 08:20 > > > > You mean meta match. > > > > Yes, I was thinking about implementing it as action, any objections? > > > > > If you implement the mothership match changes we discussed then it > should go as a match (as opposed to action). As an action its yet > another deferal for later cleanup. I'm getting more and more careful because we already suffer from various limitations by design of underlying layers. I agree that the best way would be to make it a generic match but we will end up implementing logic expressions code for every layer over and over. I have to think a little more about it, here's an up-to-date brain dump: Classifier extensions should no longer be configured over classifier specific TLV types but rather be part of a nested TLV. The extesions should be changeable directly without going through the classifier changing code, i.e. RTM_NEWFEXT/RTM_DELFEXT or something alike. It should be possible to create logic relations between extensions like match indev = "eth0" or (nfmark gt 10 or nfmark lt 4). Doesn't sound too bad but we're actually just implementing things on top of classifiers that should actually be on the same level. > So my preference is to get the changes we discussed then this meta > match. > I could whip out meta action for setting values if you are gonna work > on the match piece. I was thinking of combining these by simply introducing an ASSIGN operator so we don't have redundant code. We could make a generic metadata api so netfilter could make use of it. > Sounds reasonable at the high level. I am not sure i got the stats part. Simple hit/success counters per match to be returned as separate TLV. > Can you write out the BNF. Heres what i was thinking for meta action: tc <OPERATION> meta [NOT] VALUE OPERATOR VALUE VALUE ::= { METAVARS | number | pattern } OPERATOR ::= { = | > | < | assign } where: typeof(METAVAR) for every value pair must be equal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 14:50 ` Thomas Graf @ 2004-12-28 15:55 ` jamal 0 siblings, 0 replies; 42+ messages in thread From: jamal @ 2004-12-28 15:55 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Tue, 2004-12-28 at 09:50, Thomas Graf wrote: > * jamal <1104242397.1090.94.camel@jzny.localdomain> 2004-12-28 08:59 > > If you implement the mothership match changes we discussed then it > > should go as a match (as opposed to action). As an action its yet > > another deferal for later cleanup. > > I'm getting more and more careful because we already suffer from > various limitations by design of underlying layers. I agree that the > best way would be to make it a generic match but we will end up > implementing logic expressions code for every layer over and over. Thats the point of this discussion ;-> We need to figure a way to do it with minimal damage ;-> I actually dont think its such a bad idea to change all classifiers this once as long as the backward compat applies. > I have to think a little more about it, here's an up-to-date brain dump: > Classifier extensions should no longer be configured over classifier > specific TLV types but rather be part of a nested TLV. The extesions > should be changeable directly without going through the classifier > changing code, i.e. RTM_NEWFEXT/RTM_DELFEXT or something alike. Sensible. Also enforce that it gets configured when configuring the filter - this way binding can happen at installation and not lookup time. Now we need to find a clever way to do little coding .. I will give it some thinking in the background. Clearly this has to be interleaved with filtering as opposed to be at the end of filtering as actions are. > It > should be possible to create logic relations between extensions like > match indev = "eth0" or (nfmark gt 10 or nfmark lt 4). > Ok, so several things from your text in requirements/desires department: - Reusing existing classifiers is very valuable at that extension level. We actually could do it today with continue/reclassify except it will be faster if you could just point to the next classifier rule from the current classification. Not sure if this makes sense.. - The idea of continue/reclassify is also valuable for match extensions. - match extensions desire minimalist API. Need to write one page of code approach .. - I like the logical relationships you have. > Doesn't sound too bad but we're actually just implementing things > on top of classifiers that should actually be on the same level. > Not in the case of the 2 page of code for extra match. > > So my preference is to get the changes we discussed then this meta > > match. > > I could whip out meta action for setting values if you are gonna work > > on the match piece. > > I was thinking of combining these by simply introducing an ASSIGN > operator so we don't have redundant code. That is still a hack. > We could make a generic > metadata api so netfilter could make use of it. > I can see tc using netfilter but not the other way round without a lot of complexity. > > Sounds reasonable at the high level. I am not sure i got the stats part. > > Simple hit/success counters per match to be returned as separate TLV. > Yes, but how do you use them to match? > > Can you write out the BNF. Heres what i was thinking for meta action: > > tc <OPERATION> meta [NOT] VALUE OPERATOR VALUE hopefully within a filter since these are not really standalone. as in filter ... extmatch meta ... > VALUE ::= { METAVARS | number | pattern } > OPERATOR ::= { = | > | < | assign } > > where: typeof(METAVAR) for every value pair must be equal so how would tcindex fit here? cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 13:59 ` jamal 2004-12-28 14:50 ` Thomas Graf @ 2004-12-28 16:11 ` Thomas Graf 2004-12-28 16:36 ` jamal 1 sibling, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-28 16:11 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev > If you implement the mothership match changes we discussed then it > should go as a match (as opposed to action). As an action its yet > another deferal for later cleanup. What about this... We introduce a new API tcf_exts which holds all the things on top of a classifier. There might be several instaces per classifier, i.e. one per filter for u32 and fw and one per hash bucket for route/tcindex, etc. The classifier no longer knows about action/police/meta/... you name it but rather forwards the TLV TCA_..._EXTS to the extensions API. Backward compatibility is provided in the API (very simple). The extension infrastructure builds a tree to implement logic expressions where each node can be one of the supported types. The action code would be transformed to an extension which would mean that there could be multiple action chains. Example: cls -+ meta indev=ppp0 | \ <and> meta assign nfmark=2 | \ <and> action:gact redirect <or> | \ meta nfmark=0x20 \ <and> action:mirred Every extension node has a unique handle in the namespace of the tree. Deletion of a node results in deletion of all sub nodes. Configuration must go via change routine of classifier or via tp->get() and some generic way to retrieve extension handle from classifier. Thoughts? ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 16:11 ` Thomas Graf @ 2004-12-28 16:36 ` jamal 2004-12-28 16:51 ` jamal 2004-12-28 19:26 ` Thomas Graf 0 siblings, 2 replies; 42+ messages in thread From: jamal @ 2004-12-28 16:36 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Tue, 2004-12-28 at 11:11, Thomas Graf wrote: > > If you implement the mothership match changes we discussed then it > > should go as a match (as opposed to action). As an action its yet > > another deferal for later cleanup. > > What about this... > > We introduce a new API tcf_exts which holds all the things on top of > a classifier. There might be several instaces per classifier, i.e. one > per filter for u32 and fw and one per hash bucket for route/tcindex, > etc. > > The classifier no longer knows about action/police/meta/... you name it > but rather forwards the TLV TCA_..._EXTS to the extensions API. Backward > compatibility is provided in the API (very simple). > I think this sounds cleaner but is major surgery. The other issue i have with it is semantically i see classification followed by actions. Classification may have extended classification within it. Actions may also have extended actions within them. The majority of the surgery is going to be in ensuring that you can mix and match actions, extended filters and extended actions. > The extension infrastructure builds a tree to implement logic > expressions where each node can be one of the supported types. > > The action code would be transformed to an extension which would > mean that there could be multiple action chains. Example: > > cls -+ meta indev=ppp0 > | \ <and> meta assign nfmark=2 > | \ <and> action:gact redirect > <or> | > \ meta nfmark=0x20 > \ <and> action:mirred > > Every extension node has a unique handle in the namespace of the tree. > Deletion of a node results in deletion of all sub nodes. > > Configuration must go via change routine of classifier or via tp->get() > and some generic way to retrieve extension handle from classifier. > > Thoughts? Note my concerns above - we are talking major splicing! Heres another approach: The classifier is blind while executing those actions. Data that needs to be embeded within the classifier is: struct {extmatch type:extmatch void_data}. extmatch_classify(extmatchdatastruct,skb) is a generic call which does a lookup on the type and calls the proper callback. Callbacks return standard classifier ret codes. So an indev matcher will take the skb and compare against the indev data stored in struct->void_data. Only other call i can see needed is a registration function. extended matchers register a callback and type. user space stuff is easy. Now with above i dont see how to fit your logical experssions - but its a simple change and fits the requirement of writting the one page extended matcher. The same thought could be extended to actions. Sounds too easy unless i am intoxicated with the double-doubles i have been conmsuming last few hours;-> thoughts? cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 16:36 ` jamal @ 2004-12-28 16:51 ` jamal 2004-12-28 19:26 ` Thomas Graf 1 sibling, 0 replies; 42+ messages in thread From: jamal @ 2004-12-28 16:51 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Tue, 2004-12-28 at 11:36, jamal wrote: I have to go out for a few hours, for completion sake: - initialization happens like you said with extended matches TLV which result in building one of those extended match datastruct bound on the filter. The binding part is easy, the hard part is how you interleaf u32 matches for example vs indev. ** Also i see your point that changing all the classifiers is painful, but doing it this once so the next written classifier is easy is worth the effort in my opinion. cheers, jamal > Heres another approach: > The classifier is blind while executing those actions. > Data that needs to be embeded within the classifier is: > struct {extmatch type:extmatch void_data}. > extmatch_classify(extmatchdatastruct,skb) is a generic call which does a > lookup on the type and calls the proper callback. Callbacks return > standard classifier ret codes. > So an indev matcher will take the skb and compare against the indev data > stored in struct->void_data. > Only other call i can see needed is a registration function. extended > matchers register a callback and type. > user space stuff is easy. > Now with above i dont see how to fit your logical experssions - but its > a simple change and fits the requirement of writting the one page > extended matcher. The same thought could be extended to actions. > Sounds too easy unless i am intoxicated with the double-doubles i have > been conmsuming last few hours;-> > > thoughts? > > cheers, > jamal > > > > > ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 16:36 ` jamal 2004-12-28 16:51 ` jamal @ 2004-12-28 19:26 ` Thomas Graf 2004-12-28 21:14 ` jamal 1 sibling, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-28 19:26 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104251817.1090.164.camel@jzny.localdomain> 2004-12-28 11:36 > I think this sounds cleaner but is major surgery. The other issue i have > with it is semantically i see classification followed by actions. > Classification may have extended classification within it. Actions may > also have extended actions within them. Hmm... right, let's just leave the action as-is, we don't gain much anyway. Nevertheless, actions should be part of the extensions API but we can safely put them at the end of the extensions match routine so it always comes last. So basically we end up with 1) cls specific matches 2) generic matches including logical expressions 3) action We can always change it later without touching a single classifier. > The classifier is blind while executing those actions. > Data that needs to be embeded within the classifier is: > struct {extmatch type:extmatch void_data}. No, I'd really like to avoid having this but instead put a tcf_exts struct into it which holds all the data needed by the generic part. This also includes the tc_action pointer. This way we can get rid of all action/policer related ifdefs in the classifiers, reduce the chances of mistakes and don't need to touch classifier when changing something in the generic part. > extmatch_classify(extmatchdatastruct,skb) is a generic call which does a > lookup on the type and calls the proper callback. Callbacks return > standard classifier ret codes. Exactly, I called it tcf_exts_match with the following return definition: <0: error/unmatched filter, classifier must stop matching and move on to next filter or return mismatch if at the end. 0: Normal match, classifier must do whathever it would to if a filter matches >0: Classifier return code (TC_ACT_*), classifier must stop immediately and pass on the return code. So basically I export this API: tcf_exts_validate validates the input data and initalizes the action, it must not change any attributes. Validated data is stored in a temporary tcf_exts structure provided by the classifier (local variable) tcf_exts_change Applies the changes from tcf_exts_validate to the classifier, must not fail under any circumstance. Classifier provides both, the destination pointer (hold in classifier data) and the temporary buffer from tcf_exts_valiate. tcf_exts_match Runs all the matches and applies action. tcf_exts_destroy Destroys a complete extensions configuration tcf_exts_dump Dumps extensions into given skb. tcf_exts_dump_stats Dumps statistics into given skb. tcf_exts_is_predicative Returns 1 if a predicative extension is present, i.e. an extension which might cause further actions and thus overrule the regular tcf_result. In other words, returns 1 if a TC_ACT_* my be returned. tcf_exts_available Returns 1 if at least 1 extension is present. How does the exntesions API know about the classifier specifc TLV types? Every classifier maintains a map which holds the types, this is used in _validate, _dump and _dump_stats. > - initialization happens like you said with extended matches TLV which > result in building one of those extended match datastruct bound on the > filter. This is the part I'm unsure about. I want to keep it simple but also powerful. The action part is clear and will be untouched. The generic match part is more difficult, we either make userspace transfer the complete tree every time or we introduce commands like in the classifier. Second is probably better but a little bit more work. The binding part is easy, the hard part is how you interleaf u32 > matches for example vs indev. Value TLV: TCA_META_VALUE_TYPE, struct tcf_meta_type TCA_META_VALUE_DATA, variable struct tcf_meta_type { __u16 kind; __u16 len; }; Where kind: enum { /* numberic types */ TCF_META_I_NUMBER = 0x100, TCF_META_I_NFMARK = 0x101, TCF_META_I_TCINDEX = 0x102, /* variable length types */ TCF_META_V_PATTERN = 0x200, TCF_META_V_INDEV = 0x201, }; A matching of the upper 4 bits means the values can be compared. Of course userspace should check for this so we only have to put in a BUG_ON assertion. > ** Also i see your point that changing all the classifiers is painful, > but doing it this once so the next written classifier is easy is worth > the effort in my opinion. Truly agreed, I did this work already and I'm now testing it. We can easly put the generic match on top of it and all we have to do is add TCA_XXX_EXTS for every classifier and add it to the map. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 19:26 ` Thomas Graf @ 2004-12-28 21:14 ` jamal 2004-12-28 22:10 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2004-12-28 21:14 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Tue, 2004-12-28 at 14:26, Thomas Graf wrote: > * jamal <1104251817.1090.164.camel@jzny.localdomain> 2004-12-28 11:36 > Hmm... right, let's just leave the action as-is, we don't gain much > anyway. Nevertheless, actions should be part of the extensions API > but we can safely put them at the end of the extensions match routine > so it always comes last. So basically we end up with > > 1) cls specific matches > 2) generic matches including logical expressions > 3) action Add 4) action extensions as well. > We can always change it later without touching a single classifier. > > > The classifier is blind while executing those actions. > > Data that needs to be embeded within the classifier is: > > struct {extmatch type:extmatch void_data}. > > No, I'd really like to avoid having this but instead put a tcf_exts > struct into it which holds all the data needed by the generic part. > This also includes the tc_action pointer. This way we can get rid > of all action/policer related ifdefs in the classifiers, reduce the > chances of mistakes and don't need to touch classifier when changing > something in the generic part. Whatever you had before is fine for action/policer - with intent to kill policer eventually. What i objected to is the indev and any other thing that has to do with classification helping - thats not where it should fit. Take u32 for example: The fit for match extensions is really at the key level not at a layer above. We need a sel2 which has new keys (which is easy because thats transported in a TLV). > > extmatch_classify(extmatchdatastruct,skb) is a generic call which does a > > lookup on the type and calls the proper callback. Callbacks return > > standard classifier ret codes. > > Exactly, I called it tcf_exts_match with the following return definition: > <0: error/unmatched filter, classifier must stop matching and move on to > next filter or return mismatch if at the end. > 0: Normal match, classifier must do whathever it would to if a filter matches > >0: Classifier return code (TC_ACT_*), classifier must stop immediately > and pass on the return code. > Why not reuse what already exists in terms of classifier/filter return codes? They are pretty sufficient and cover all the cases. > So basically I export this API: > > tcf_exts_validate > validates the input data and initalizes the action, it must not > change any attributes. Validated data is stored in a temporary > tcf_exts structure provided by the classifier (local variable) > > tcf_exts_change > Applies the changes from tcf_exts_validate to the classifier, must > not fail under any circumstance. Classifier provides both, the > destination pointer (hold in classifier data) and the temporary > buffer from tcf_exts_valiate. > > tcf_exts_match > Runs all the matches and applies action. > > tcf_exts_destroy > Destroys a complete extensions configuration > > tcf_exts_dump > Dumps extensions into given skb. > > tcf_exts_dump_stats > Dumps statistics into given skb. > > tcf_exts_is_predicative > Returns 1 if a predicative extension is present, i.e. an extension > which might cause further actions and thus overrule the regular > tcf_result. In other words, returns 1 if a TC_ACT_* my be returned. > > tcf_exts_available > Returns 1 if at least 1 extension is present. > > How does the exntesions API know about the classifier specifc TLV > types? Every classifier maintains a map which holds the types, this > is used in _validate, _dump and _dump_stats. > Hrm, so someone writting the one page extension now has to fill in all these functions? If yes this defeats the purpose of the exercise to create a single page of code. The user should really have to write only a match function and call a registration function to hook up into the framework. To be clear: struct tcf_ematch_ops { struct tcf_ematch_ops *next; char kind[IFNAMSIZ]; u32 type; int (*classify)(struct sk_buff*, struct tcf_ematch_data); } A global list of these is kept somewhere and a register/unregister manipulation exists. And the mythical one page code that needs writting: --------------------------------------------------- int myveryownmatch(struct sk_buff *skb, struct tc_action tcf_ematch_data *e) { return TCF_OK; } struct tcf_ematch_ops_ops my_tcf_ematch_ops = { .next = NULL, .name = "mymatch", .type = TCF_MY_MATCHID, .walk = myveryownmatch }; static int __init mymatch_init_module(void) { ret = tcf_register_ematch(&my_tcf_ematch_ops); // check here, ematch may already be registered etc } static void __exit mymatch_cleanup_module(void) { tcf_unregister_ematch(&my_tcf_ematch_ops); } module_init(mymatch_init_module); module_exit(mymatch_cleanup_module); ---------------- If what you describe above is internal - accessible via classifier then fine (other than tcf_exts_match) - lathough it looks excessive. I dont see these things calling actions. They are interleaved between matches. At completion of matches/filtering then you call the action code. > > - initialization happens like you said with extended matches TLV which > > result in building one of those extended match datastruct bound on the > > filter. > > This is the part I'm unsure about. I want to keep it simple but also > powerful. The action part is clear and will be untouched. The generic > match part is more difficult, we either make userspace transfer the > complete tree every time or we introduce commands like in the > classifier. Second is probably better but a little bit more work. Whats wrong with extended TLVs you mentioned earlier? match u32 .. ematch indev ... match u32 ... ematch meta tcindex .. the ematches are essentially TLVs on their own and are owned by the classifier. The classifier doesnt know whats in them. It just calls generic code to execute them. > > The binding part is easy, the hard part is how you interleaf u32 > > matches for example vs indev. > > Value TLV: > TCA_META_VALUE_TYPE, struct tcf_meta_type > TCA_META_VALUE_DATA, variable > > struct tcf_meta_type > { > __u16 kind; > __u16 len; > }; > > Where kind: > > enum > { > /* numberic types */ > TCF_META_I_NUMBER = 0x100, > TCF_META_I_NFMARK = 0x101, > TCF_META_I_TCINDEX = 0x102, > > /* variable length types */ > TCF_META_V_PATTERN = 0x200, > TCF_META_V_INDEV = 0x201, > }; > > A matching of the upper 4 bits means the values can be > compared. Of course userspace should check for this so > we only have to put in a BUG_ON assertion. I think you are only refering to one ematch kind above --> for metadata. What i talked about is arbitrary (example i could put a quick hack to grep strings without writting a full classifier). Essentially what you have fits just fine - you may need two ids; one for IDing as meta match and other as tcindex etc. The second one can be hidden. > > ** Also i see your point that changing all the classifiers is painful, > > but doing it this once so the next written classifier is easy is worth > > the effort in my opinion. > > Truly agreed, I did this work already and I'm now testing it. We can > easly put the generic match on top of it and all we have to do is add > TCA_XXX_EXTS for every classifier and add it to the map. > Refer to what i talked about above. The extension are little helpers i.e they cant live on their own - they need classifiers. Just review and sync essentially. The action extensions as well fit the same way. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 21:14 ` jamal @ 2004-12-28 22:10 ` Thomas Graf 2004-12-28 23:06 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-28 22:10 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104268498.1090.254.camel@jzny.localdomain> 2004-12-28 16:14 > Whatever you had before is fine for action/policer - with intent to kill > policer eventually. I left it in for now but I see no reason why to do so actually. Old iproute2 binaries should do just fine with the action backward compatibility code? > What i objected to is the indev and any other thing that has to do with > classification helping - thats not where it should fit. > Take u32 for example: The fit for match extensions is really at the key > level not at a layer above. > We need a sel2 which has new keys (which is easy because thats > transported in a TLV). Take a look at http://people.suug.ch/~tgr/patches/queue/03_tcf_exts_u32.diff The extensions are on the same level as the selector. The patchset still has errors in the patches for route and tcindex since it's non-trivial to adapt them to allow changing parameter on-the-fly. The rest is tested and works perfectly fine. I can create a subset or we can just take the first few patches for now and do the development on u32/fw and port it later. > Why not reuse what already exists in terms of classifier/filter return > codes? They are pretty sufficient and cover all the cases. I do reuse them. TC_ACT_* from include/linux/pkt_cls.h > Hrm, so someone writting the one page extension now has to fill in all > these functions? No, that's just how the classifier accesses the extensions API. > [ematch api] Exactly, this would be API visible to the matches. > If what you describe above is internal - accessible via classifier then > fine (other than tcf_exts_match) - lathough it looks excessive. The validate/change split is needed to implement consistent changes in classifiers. The current way causes corruption in classifer data whenever an action configuration fails. > I dont see these things calling actions. They are interleaved between > matches. At completion of matches/filtering then you call the action > code. Right, tcf_exts_match calls the generic matches and at the very end the action. > Whats wrong with extended TLVs you mentioned earlier? > > match u32 .. > ematch indev ... > match u32 ... > ematch meta tcindex .. > > the ematches are essentially TLVs on their own and are owned by > the classifier. The classifier doesnt know whats in them. It just > calls generic code to execute them. They should go into TCA_XXX_EXTS as embeded TLVs. The problem is not how to do it but rather how far to go. Do we want userspace to be able to delete a single generic match? Do we want to only allow replacing all matches? We will hit the limit of skbs at some point if we keep on encapsulating. ;-> > I think you are only refering to one ematch kind above --> for metadata. Correct. This would be the generic match for metadata. > What i talked about is arbitrary (example i could put a quick hack to > grep strings without writting a full classifier). Essentially what you > have fits just fine - you may need two ids; one for IDing as meta match > and other as tcindex etc. The second one can be hidden. I don't get this. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 22:10 ` Thomas Graf @ 2004-12-28 23:06 ` jamal 2004-12-28 23:19 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2004-12-28 23:06 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Tue, 2004-12-28 at 17:10, Thomas Graf wrote: > * jamal <1104268498.1090.254.camel@jzny.localdomain> 2004-12-28 16:14 > > Whatever you had before is fine for action/policer - with intent to kill > > policer eventually. > > I left it in for now but I see no reason why to do so actually. Old > iproute2 binaries should do just fine with the action backward > compatibility code? > Its maintainance work. Nothing it provides isnt provided by new policer. > > What i objected to is the indev and any other thing that has to do with > > classification helping - thats not where it should fit. > > Take u32 for example: The fit for match extensions is really at the key > > level not at a layer above. > > We need a sel2 which has new keys (which is easy because thats > > transported in a TLV). > > Take a look at http://people.suug.ch/~tgr/patches/queue/03_tcf_exts_u32.diff > > The extensions are on the same level as the selector. Ok, i see we may be talking about two separate things:-> that patch is fine for policer/action (I noticed you removed indev - good thing). It is not proper spot for the matches and infact should go in as a separate patch altogether (relation is very minimal). For the matches, the checks are going to be per key _not_ at the selector level; i.e: struct tc_newu32_key { __u32 mask; __u32 val; int off; int offmask; pointer to extendedmatches here. }; Since these keys are packed in a selector and the selector is what gets transported from/to user space we need a selector2 which packs these new keys instead. Makes sense? i.e need a TCA_U32_SEL2 where the extended matches are stored. Backward compatibility: New TC transports them in addition to TCA_U32_SEL and old kernels ignore them. old TC doesnt send them so new kernels can handle them just fine. Beauty of TLVs. Your check in the classifier is if (matched) { if (NULL != key.pointertoematch) { ret = call the generic ematch function if ret == OK continue with next match else failed } } u32_change receives the extended matches and populates them accordingly. There is no need for a dump fucntion to exist for them. They get shipped back the same way they came in - user space knows how to dump them. a key.pointertoematch could be infact a llist of these items. So the struct looks like: struct tcf_ematch_info { struct tcf_ematch_info *next; __u32 type void *data; may need a datalen here for dumping back to user space. }; Makes sense? Back to what i said earlier i can now write a single page of code to scan for word "Thomas" if i get a match on TCP port 25 for all IP addresses... i.e metadata is a subset of all this. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 23:06 ` jamal @ 2004-12-28 23:19 ` Thomas Graf 2004-12-28 23:39 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-28 23:19 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104275197.1100.276.camel@jzny.localdomain> 2004-12-28 18:06 > Its maintainance work. Nothing it provides isnt provided by > new policer. I'll remove it. > It is not proper spot for the matches and infact > should go in as a separate patch altogether (relation is very minimal). > > For the matches, the checks are going to be per key _not_ at the > selector level; i.e: > > struct tc_newu32_key > { > __u32 mask; > __u32 val; > int off; > int offmask; > pointer to extendedmatches here. > }; > > Since these keys are packed in a selector and the selector is what gets > transported from/to user space we need a selector2 which packs these new > keys instead. Makes sense? i.e need a TCA_U32_SEL2 where the extended > matches are stored. Why? I don't get that. Generic matches must only be considered if all keys of u32 match. u32 keys are just ANDed matches if one fails we can directly declare the classifier as unmatched. The only thing we would gain is that we could add multiple generic matches but with lack of real logical expressions. I'd rather implemnt some simple form of logical expression in the generic part so all classifiers can benfit. > Makes sense? Not for me. ;-> > Back to what i said earlier i can now write a single page of code > to scan for word "Thomas" if i get a match on TCP port 25 for all IP > addresses... i.e metadata is a subset of all this. Agreed, and smart as you are you simply take the Knuth-Morris-Pratt code out of my EGP patch to get some performance. ;-> ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 23:19 ` Thomas Graf @ 2004-12-28 23:39 ` jamal 2004-12-29 0:09 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2004-12-28 23:39 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Tue, 2004-12-28 at 18:19, Thomas Graf wrote: > * jamal <1104275197.1100.276.camel@jzny.localdomain> 2004-12-28 18:06 > > Its maintainance work. Nothing it provides isnt provided by > > new policer. > > I'll remove it. > Well one of your goals was to reduce the ifdef cluter. That goal is achieved with that goal. So I would keep it as is for now but kill it in the future. I am hoping we are talking about the same thing here - the old policer ;-> > > Since these keys are packed in a selector and the selector is what gets > > transported from/to user space we need a selector2 which packs these new > > keys instead. Makes sense? i.e need a TCA_U32_SEL2 where the extended > > matches are stored. > > Why? I don't get that. Generic matches must only be considered if all > keys of u32 match. u32 keys are just ANDed matches if one fails we can > directly declare the classifier as unmatched. yes - and that still applies here but you can now interleaf - as i mentioned earlier: match u32 .. ematch string "Thomas" ... match u32 ... ematch meta tcindex .. I dont wanna go into details of whether we could actually make the new keys do more than just strict AND from left to right - but you can see the potential to "fix" this if we are defining a new key ;-> The same applies for interleafing of actions and eactions. > The only thing we would > gain is that we could add multiple generic matches but with lack of real > logical expressions. I'd rather implemnt some simple form of logical > expression in the generic part so all classifiers can benfit. Ok, the logical expressions are the tricky part. But refer to what i am saying above. You still need to be backward compatible. But for the new keys you could go onto the adventorous side. I havent given the logical expressions much thought but i will in the background > > Makes sense? > > Not for me. ;-> > > > Back to what i said earlier i can now write a single page of code > > to scan for word "Thomas" if i get a match on TCP port 25 for all IP > > addresses... i.e metadata is a subset of all this. > > Agreed, and smart as you are you simply take the Knuth-Morris-Pratt > code out of my EGP patch to get some performance. ;-> Of course;-> Ive worked long enough in Linux to appreciate the definition of TheLinuxWay(tm) ;-> I also get to inherit your bugs this way. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-28 23:39 ` jamal @ 2004-12-29 0:09 ` Thomas Graf 2004-12-29 1:13 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-29 0:09 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104277165.1100.291.camel@jzny.localdomain> 2004-12-28 18:39 > On Tue, 2004-12-28 at 18:19, Thomas Graf wrote: > > Why? I don't get that. Generic matches must only be considered if all > > keys of u32 match. u32 keys are just ANDed matches if one fails we can > > directly declare the classifier as unmatched. > > yes - and that still applies here but you can now interleaf - as i > mentioned earlier: > > match u32 .. > ematch string "Thomas" ... > match u32 ... > ematch meta tcindex .. Yes but the only avantage of this is that a u32 match can be made dependant on a ematch. Is this really worth special handling? It requires special handling not needed for any of the other classifiers. I understand your point but don't agree at the moment. I might change my mind tomorrow ;-> > I dont wanna go into details of whether we could actually make the new > keys do more than just strict AND from left to right - but you can see > the potential to "fix" this if we are defining a new key ;-> We should rather do it on cls_api level, unfortunantely it's not that simple but the current status of having one classifier kind per prio and no way to interconnect them must be changed somewhen. > Ok, the logical expressions are the tricky part. But refer to what i am > saying above. You still need to be backward compatible. But for the new > keys you could go onto the adventorous side. I havent given the logical > expressions much thought but i will in the background Implementing logical expressions directly into u32 would be bad but we could have u32 hold a expression tree rather than the ematch directly which means you could do match u32 .. (ematch meta nfmark .. or string "...") match u32 .. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-29 0:09 ` Thomas Graf @ 2004-12-29 1:13 ` jamal 2004-12-29 12:48 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2004-12-29 1:13 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Tue, 2004-12-28 at 19:09, Thomas Graf wrote: > * jamal <1104277165.1100.291.camel@jzny.localdomain> 2004-12-28 18:39 > > On Tue, 2004-12-28 at 18:19, Thomas Graf wrote: > > match u32 .. > > ematch string "Thomas" ... > > match u32 ... > > ematch meta tcindex .. > > Yes but the only avantage of this is that a u32 match can be > made dependant on a ematch. Is this really worth special > handling? It requires special handling not needed for any > of the other classifiers. > > I understand your point but don't agree at the moment. I > might change my mind tomorrow ;-> no problem ;-> I think the effort is the same as in doing it the other way - only result is more sophisticated. I havent investigated the other classifiers - u32 tends to be more complex, so solving it on u32 solves all the others typically. > > I dont wanna go into details of whether we could actually make the new > > keys do more than just strict AND from left to right - but you can see > > the potential to "fix" this if we are defining a new key ;-> > > We should rather do it on cls_api level, unfortunantely it's not that > simple but the current status of having one classifier kind per prio and > no way to interconnect them must be changed somewhen. > Remember two levels: 1) the classifier logical expressions (u32 and rsvp for example) - those belong to cls api. if u32 match .. ... AND rsvp ... OR route ... evaluation is left to right with some brute logical OR and ANDs via the continue and reclassify codes. 2) This issue is at a the single classifier/filter level, so its fair to push that into the classifier logic. an extended match cannot live by itself. Its parasitic on a real classifier - so the scope MUST be restricted to classifier as a matter of fact. > > Ok, the logical expressions are the tricky part. But refer to what i am > > saying above. You still need to be backward compatible. But for the new > > keys you could go onto the adventorous side. I havent given the logical > > expressions much thought but i will in the background > > Implementing logical expressions directly into u32 would be bad but > we could have u32 hold a expression tree rather than the ematch > directly which means you could do > > match u32 .. > (ematch meta nfmark .. or string "...") > match u32 .. > Or you could have: match u32 OR (ematch meta nfmark .. or string "...") match u32 .. Recall, the opportunity to do more in terms of logical expressions within u32 exists because we can introduce more interesting keys ... Anyways, I am going to introduce extended actions. I need to write a few stupid little actions not worth the whole API (such as a checksum validator/recomputer which i need to follow the pedit action for stateless NAT). It would use exactly teh same interleaving logic as what weve discussed. I dont know where you and Patrick are with the code but it think it would be safe for me to work off 2610-bk1. Tommorow i am working on some e1000 stuff - but day after i should be able to touch the action code. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-29 1:13 ` jamal @ 2004-12-29 12:48 ` Thomas Graf 2004-12-29 14:20 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-29 12:48 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104282811.1090.314.camel@jzny.localdomain> 2004-12-28 20:13 > no problem ;-> I think the effort is the same as in doing it the other > way - only result is more sophisticated. I havent investigated the other > classifiers - u32 tends to be more complex, so solving it on u32 solves > all the others typically. OK, I've changed my mind after some thinking. It's little bit more work but it's worth it. The ematch TLV contains an array of ematches, every ematch contains the logic relation to the next (2 bits) and a flag to invert the meaning (1 bit). A special ematch containing an index exists to implement precdence. A AND (B1 OR B2) AND C AND D ------->-PUSH------- -->-- / -->-- \ -->-- / \ / / \ \ / \ +-------+-------+-------+-------+-------+--------+ | A AND | B AND | C AND | D END | B1 OR | B2 END | +-------+-------+-------+-------+-------+--------+ \ / --------<-POP--------- A simple check that a jump index is never smaller than the current index (excluding backward jumps via stack) catches endless loop and avoids the use of a ttl. > Remember two levels: > 1) the classifier logical expressions (u32 and rsvp for example) - those > belong to cls api. > > if u32 match .. ... AND rsvp ... OR route ... > evaluation is left to right with some brute logical OR and ANDs via the > continue and reclassify codes. > > 2) This issue is at a the single classifier/filter level, so its fair to > push that into the classifier logic. an extended match cannot live by > itself. Its parasitic on a real classifier - so the scope MUST be > restricted to classifier as a matter of fact. Absolutely agreed, I did a context switch without prior warning. ;-> > Or you could have: > match u32 OR > (ematch meta nfmark .. or string "...") > match u32 .. > > Recall, the opportunity to do more in terms of logical expressions within u32 > exists because we can introduce more interesting keys ... We could reuse the 8 unused bits after nkeys in tc_u32sel too. iproute2 sets them to 0 so we can simply use them without anyone noticing. > I dont know where you and Patrick are with the code but it think it > would be safe for me to work off 2610-bk1. Tommorow i am working on some > e1000 stuff - but day after i should be able to touch the action code. I'm not touching the action bits but Patrick is as far as I know. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-29 12:48 ` Thomas Graf @ 2004-12-29 14:20 ` jamal 2004-12-29 15:01 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2004-12-29 14:20 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev, Patrick McHardy On Wed, 2004-12-29 at 07:48, Thomas Graf wrote: > * jamal <1104282811.1090.314.camel@jzny.localdomain> 2004-12-28 20:13 .. > The ematch TLV contains an array of ematches, every > ematch contains the logic relation to the next (2 bits) and a flag to > invert the meaning (1 bit). A special ematch containing an index exists > to implement precdence. > > A AND (B1 OR B2) AND C AND D > > > ------->-PUSH------- > -->-- / -->-- \ -->-- > / \ / / \ \ / \ > +-------+-------+-------+-------+-------+--------+ > | A AND | B AND | C AND | D END | B1 OR | B2 END | > +-------+-------+-------+-------+-------+--------+ > \ / > --------<-POP--------- > > A simple check that a jump index is never smaller than the current > index (excluding backward jumps via stack) catches endless loop > and avoids the use of a ttl. > Sounds good. Given the opportunity: I think we need to put those flags as well for the u32 keys(and other classifiers) so we can have similar logic? Also in the case of u32 (to take this opportunity) i would like to stash state inot a 16 bit ID to help in pretty printing the matches. So if we have an extra 32 bits for flags:ID probably 8 bits for your need for flags, 16 bits for private Id and maybe another 8 bit for something else like versioning... Thoughts? > > Recall, the opportunity to do more in terms of logical expressions within u32 > > exists because we can introduce more interesting keys ... > > We could reuse the 8 unused bits after nkeys in tc_u32sel too. iproute2 sets > them to 0 so we can simply use them without anyone noticing. I would recommend just introducing the extra 32 bits per key. So much easier. > > I dont know where you and Patrick are with the code but it think it > > would be safe for me to work off 2610-bk1. Tommorow i am working on some > > e1000 stuff - but day after i should be able to touch the action code. > > I'm not touching the action bits but Patrick is as far as I know. Ok, dont wanna conflict with work hes doing - so i will wait until tommorow to see what hes upto. CCed him. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-29 14:20 ` jamal @ 2004-12-29 15:01 ` Thomas Graf 2004-12-29 15:53 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-29 15:01 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev, Patrick McHardy * jamal <1104330054.1089.329.camel@jzny.localdomain> 2004-12-29 09:20 > Given the opportunity: I think we need to put those flags as well for > the u32 keys(and other classifiers) so we can have similar logic? Sounds reasonable and easy to do if we introduce a new selector TLV. Speaking of the other classifiers: fw: Currently a list of ORed matches, nfmark transported via handle. We could change it to transfer the nfmark via a TLV and implement the same logic as in u32 (simple). The problem is mainly how to guarantee backwards compatibility, the handle would no longer tell about the nfmark being matched. OTOH, fw is no longer needed once we have metadata match. Adding a "always-true" classifier with ematch extension will completely replace it (except for the old path with netfilter disabled). tcindex: No changes required and partly replaced with metadata match but not completely. It would still be perfectly fine to map dscp values to classids. route4: Also partly replaced with metadata match but we would lose the exellent fast paths. We can leave it as-is and have metadata match for more complex matches (slow) and route4 for simple but fast uses. rsvp: Could theoretically be replaced with new u32 and extensive use of continue/reclassify but that's quite difficult. It's very specialized (and currently quite vulnerable to pskbs) and the use of it is clearly towards fast flow redirection. No need to change this. So, the conclusion for me is that we can focus on u32 and new classifiers and eventually make fw obsolete in the future. > Also in the case of u32 (to take this opportunity) i would like to stash > state inot a 16 bit ID to help in pretty printing the matches. Not sure what you mean. Which "state"? > So if we have an extra 32 bits for flags:ID probably 8 bits for your > need for flags, 16 bits for private Id and maybe another 8 bit for > something else like versioning... Basically what I need is 3 bits for logic relations and at least 8 for precedence index or 4 bits and reuse one of the already existing fields unused when key is used as container for sub keys. So, 8 bits will suit me very well. +---+---+-----+ | I | C | R | +---+---+-----+ I := 1 Match is inverted 0 Match is straight C := 1 Container Key 0 Normal Key R := 0 0 Last Key 0 1 AND 1 0 OR While we're at it we should increase nkeys to 16bit. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-29 15:01 ` Thomas Graf @ 2004-12-29 15:53 ` jamal 2004-12-30 17:43 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2004-12-29 15:53 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Wed, 2004-12-29 at 10:01, Thomas Graf wrote: > * jamal <1104330054.1089.329.camel@jzny.localdomain> 2004-12-29 09:20 > > Given the opportunity: I think we need to put those flags as well for > > the u32 keys(and other classifiers) so we can have similar logic? > > Sounds reasonable and easy to do if we introduce a new selector TLV. Shouldnt be and issue and both backward and forward compatible. > Speaking of the other classifiers: > > fw: Currently a list of ORed matches, nfmark transported via handle. > We could change it to transfer the nfmark via a TLV and implement > the same logic as in u32 (simple). The problem is mainly how to > guarantee backwards compatibility, the handle would no longer > tell about the nfmark being matched. OTOH, fw is no longer needed > once we have metadata match. Adding a "always-true" classifier with > ematch extension will completely replace it (except for the old > path with netfilter disabled). > I would suspect we would end killing fwmark or it will get deprecated for lack of use. So probably safer to leave it alone. > tcindex: No changes required and partly replaced with metadata match > but not completely. It would still be perfectly fine to map > dscp values to classids. > This is a tricky one since it has those speacial cases. > route4: Also partly replaced with metadata match but we would lose > the exellent fast paths. We can leave it as-is and have metadata > match for more complex matches (slow) and route4 for simple but > fast uses. > nod. > rsvp: Could theoretically be replaced with new u32 and extensive use > of continue/reclassify but that's quite difficult. It's very > specialized (and currently quite vulnerable to pskbs) and the use > of it is clearly towards fast flow redirection. No need to change > this. > > So, the conclusion for me is that we can focus on u32 and new > classifiers and eventually make fw obsolete in the future. > Geez, I should have read to the end first ;-> So i agree with you. > > Also in the case of u32 (to take this opportunity) i would like to stash > > state inot a 16 bit ID to help in pretty printing the matches. > > Not sure what you mean. Which "state"? > One example of what state you could store: In the case where i enter something readable in english, the display back is raw; example: match ip src 10.0.0.210/32 gets displayed as: match 0a0000d2/ffffffff at 12 And a lot of times its tricky to find exactly what "at 12" means. If i store some ID that would tell me "IP" when i dump then i can pretty print it in english in user space using ip_print(). > > So if we have an extra 32 bits for flags:ID probably 8 bits for your > > need for flags, 16 bits for private Id and maybe another 8 bit for > > something else like versioning... > > Basically what I need is 3 bits for logic relations and at least 8 > for precedence index or 4 bits and reuse one of the already existing > fields unused when key is used as container for sub keys. So, 8 bits > will suit me very well. > > +---+---+-----+ > | I | C | R | > +---+---+-----+ > > I := 1 Match is inverted > 0 Match is straight > > C := 1 Container Key > 0 Normal Key > > R := 0 0 Last Key > 0 1 AND > 1 0 OR > > While we're at it we should increase nkeys to 16bit. > Sounds good to me since we have a new sel. It may endup being tricky to be both fast and backward compat; but we'll see what fun awaits when you start coding. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-29 15:53 ` jamal @ 2004-12-30 17:43 ` Thomas Graf 2004-12-31 4:58 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-30 17:43 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104335620.1025.22.camel@jzny.localdomain> 2004-12-29 10:53 > > > Also in the case of u32 (to take this opportunity) i would like to stash > > > state inot a 16 bit ID to help in pretty printing the matches. > > > > Not sure what you mean. Which "state"? > > > > One example of what state you could store: > In the case where i enter something readable in english, the display > back is raw; > example: > match ip src 10.0.0.210/32 > gets displayed as: match 0a0000d2/ffffffff at 12 > And a lot of times its tricky to find exactly what "at 12" means. > > If i store some ID that would tell me "IP" when i dump then i can pretty > print it in english in user space using ip_print(). Understood, we could store a map in userspace mapping those IDs to pretty english match descriptions. I think avoiding to hardcode those ids but rather just hold it for userspace is the best thing. OTOH, if we give unique ids to matches we can use it instead of a separate ID. Unique in terms of parent classid + filter handle + match handle must be unique per interface. Thoughts? > Sounds good to me since we have a new sel. > It may endup being tricky to be both fast and backward compat; but we'll > see what fun awaits when you start coding. I started developing concrete thoughts. The current u32 match could be made a generic match just like meta which would give us a u32 w/o hashtables on a always-true classifier. The problem arises with exactly those hashtables. u32 uses the same selector for this and furthermore even defines stuff like hoff and hmask for this purpose. I have to read up again to understand the hashing in full detail but this is the only issue I see that we might face. What I have in mind is, something like u32 but much simpler, w/o the overhead of creating additional filters for hashing etc. Basically this can be the always-true classifier which just implements the generic matches tree. And have the existing u32 with the generic matches added when hashing is required. I planned to write these 2 additional generic matches so far. It's pretty simple because I can just take over the code from EGP. KMP: Knuth-Morris-Pratt text search algorithm NByte: Compares any number of bytes against a pattern, very useful for comparing IPv6 addresses instead of creating 4 ANDed u32 matches. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-30 17:43 ` Thomas Graf @ 2004-12-31 4:58 ` jamal 2004-12-31 11:08 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2004-12-31 4:58 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Thu, 2004-12-30 at 12:43, Thomas Graf wrote: > * jamal <1104335620.1025.22.camel@jzny.localdomain> 2004-12-29 10:53 > > If i store some ID that would tell me "IP" when i dump then i can pretty > > print it in english in user space using ip_print(). > > Understood, we could store a map in userspace mapping those IDs to > pretty english match descriptions. I think avoiding to hardcode those > ids but rather just hold it for userspace is the best thing. We may be in sync: I was thinking of just teaching tc to stash something there that it understands on how to translate. Thinking about it now, this may not be sufficient: perhaps we need a few bits in the selector to identify the owner who installed the rule to begin with. Then it would be safe to interpret the meaning of the ID (by the same app). Did you say there were some unused bits in the selector? > OTOH, if > we give unique ids to matches we can use it instead of a separate ID. Note: The above id i was talking about is "opaque" i.e the real meaning of it is only known to the user space app that installed it (unloess you want to define things in kernel headers) > Unique in terms of parent classid + filter handle + match handle must > be unique per interface. Thoughts? I think all you need really is to say "this match starts at IP" i.e such a definition is global. handles per rule already exist - and you can actually specify them when installing a rule. Are those insufficient? > > Sounds good to me since we have a new sel. > > It may endup being tricky to be both fast and backward compat; but we'll > > see what fun awaits when you start coding. > > I started developing concrete thoughts. The current u32 match could be > made a generic match just like meta which would give us a u32 w/o hashtables > on a always-true classifier. The problem arises with exactly those > hashtables. u32 uses the same selector for this and furthermore even defines > stuff like hoff and hmask for this purpose. I have to read up again to > understand the hashing in full detail but this is the only issue I see that > we might face. > Why not make the always-true to be an extended match? actually a u32 match of 0 0 is always true. Those hashes are quiet tricky/flexible; i would rather we clone u32 and call it something else then speacilize it. > What I have in mind is, something like u32 but much simpler, w/o the > overhead of creating additional filters for hashing etc. Basically > this can be the always-true classifier which just implements the > generic matches tree. And have the existing u32 with the generic > matches added when hashing is required. > Whip anothe classifier is my opinion. Or write extended matches. > I planned to write these 2 additional generic matches so far. It's > pretty simple because I can just take over the code from EGP. > > KMP: Knuth-Morris-Pratt text search algorithm > NByte: Compares any number of bytes against a pattern, very useful > for comparing IPv6 addresses instead of creating 4 ANDed > u32 matches. Both sound very appealing. You plan to do them as extended matches, correct? KMP can be used for something like virus scanning? does it maintain state? cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-31 4:58 ` jamal @ 2004-12-31 11:08 ` Thomas Graf 2004-12-31 14:59 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-31 11:08 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104469111.1049.219.camel@jzny.localdomain> 2004-12-30 23:58 > On Thu, 2004-12-30 at 12:43, Thomas Graf wrote: > > * jamal <1104335620.1025.22.camel@jzny.localdomain> 2004-12-29 10:53 > > > > If i store some ID that would tell me "IP" when i dump then i can pretty > > > print it in english in user space using ip_print(). > > > > Understood, we could store a map in userspace mapping those IDs to > > pretty english match descriptions. I think avoiding to hardcode those > > ids but rather just hold it for userspace is the best thing. > > We may be in sync: > I was thinking of just teaching tc to stash something there that it > understands on how to translate. Thinking about it now, this may not > be sufficient: perhaps we need a few bits in the selector to identify > the owner who installed the rule to begin with. Then it would be safe to > interpret the meaning of the ID (by the same app). Did you say there > were some unused bits in the selector? Right, but why not do this in userspace by having a global map somewhere in a file? A u32 config could have been modified by multiple pids and it would get really messy to store a pid for every possible changeable item. > I think all you need really is to say "this match starts at IP" i.e such > a definition is global. > handles per rule already exist - and you can actually specify them when > installing a rule. Are those insufficient? Those are absolutely sufficient. I was thinking of giving a match a 16bit ID which can be used for both, identifying and mapping, i.e: __u8 kind; /* match type, for lookup in matchers table */ __u8 flags; /* Invert Flag + Relations */ __u16 handle; /* must be unique per selector, may be autogenerated */ I want to have those matches be as small as possible, so no nested TLVs but rather this u32 + matcher specific data form a TLV together. A selector consists of a TLV array of such matches. The first TLV, type=1 becomes a header with the possibility to transfer classifier specific options (such as hash table configuration for u32). > Why not make the always-true to be an extended match? actually a u32 > match of 0 0 is always true. Those hashes are quiet tricky/flexible; > i would rather we clone u32 and call it something else then speacilize > it. Agreed, I don't want to change u32 but I want to introduce ematches in u32 as well so we can benefit from the hashing but for those who don't need hashing u32 is already bloat so we can do a simple always-true classifier which does nothing more than evaluating the ematches. I want to have the u32 match be a ematch as well so the always-true classifier would become a u32 alternative but without the hashing overhead. > Both sound very appealing. You plan to do them as extended matches, > correct? Excatly. > KMP can be used for something like virus scanning? does it > maintain state? It requires the following parameters: - start offset - end offset - pattern - prefix table and then will simply start at `start` and scans until `end` looking for pattern with the help of the prefix table. Again, I'm not sure what you mean by state ;-> ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-31 11:08 ` Thomas Graf @ 2004-12-31 14:59 ` jamal 2004-12-31 15:39 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2004-12-31 14:59 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Fri, 2004-12-31 at 06:08, Thomas Graf wrote: > * jamal <1104469111.1049.219.camel@jzny.localdomain> 2004-12-30 23:58 > > perhaps we need a few bits in the selector to identify > > the owner who installed the rule to begin with. Then it would be safe to > > interpret the meaning of the ID (by the same app). Did you say there > > were some unused bits in the selector? > > Right, but why not do this in userspace by having a global map > somewhere in a file? A u32 config could have been modified by > multiple pids and it would get really messy to store a pid for > every possible changeable item. The "PID" is per app - same as it is for rtm where the different known routing apps like zebra are given an ID and certain IDs are left blank for general use. So if you run multiple tcs they will all have the same ID. The way the id is used in the rtm historically has been for allowing multiple route protocols to install route entries (this way for example OSPF does/nt announce RIP routes etc). In our case this would mean: "this rule was installed by tc - only it knows what the opaque value 10 means". In that case tc would be responsible to decode 10 which would mean to it "this is an ip match therefore use ip_print() to pretty print" and 10 is global to tc only and in that case stored in some header file belonging to tc. > > I think all you need really is to say "this match starts at IP" i.e such > > a definition is global. > > handles per rule already exist - and you can actually specify them when > > installing a rule. Are those insufficient? > > Those are absolutely sufficient. I was thinking of giving a match a > 16bit ID which can be used for both, identifying and mapping, i.e: > > __u8 kind; /* match type, for lookup in matchers table */ 255 possible matchers max? > __u8 flags; /* Invert Flag + Relations */ > __u16 handle; /* must be unique per selector, may be autogenerated */ Ok this is the one used to store the opaque IDs - unique per app so may be the same across multiple selectors. Probably steal a few bits from here and use them in nkind. > I want to have those matches be as small as possible, so no nested > TLVs but rather this u32 + matcher specific data form a TLV together. > > A selector consists of a TLV array of such matches. The first TLV, > type=1 becomes a header with the possibility to transfer classifier > specific options (such as hash table configuration for u32). Maybe i misunderstood you. You are going to have: SEL2 | | +--match1 | | | + -- extended match1 . . . . . . . +---- extnded matchn | +--- matchn Why does the first one have to be speacial? One of the mistakes in u32 is the tight packing of the matches. ie the match1-n above are packet together. If they were put in TLVs probably only new thing that will be needed is MATCH2 TLV. No harm in transporting an extra 32 bits for a TLV - its not like you are going to have a million matches and need to save space. So i would suggest everything under a TLV (SEL2->MATCH(ES)->EMATCH(ES)) > > Why not make the always-true to be an extended match? actually a u32 > > match of 0 0 is always true. Those hashes are quiet tricky/flexible; > > i would rather we clone u32 and call it something else then speacilize > > it. > > Agreed, I don't want to change u32 but I want to introduce ematches > in u32 as well so we can benefit from the hashing but for those who > don't need hashing u32 is already bloat so we can do a simple > always-true classifier which does nothing more than evaluating the > ematches. I want to have the u32 match be a ematch as well so the > always-true classifier would become a u32 alternative but without > the hashing overhead. Sounds fine to me. so a u32 match 00 emathes here .. > It requires the following parameters: > - start offset > - end offset > - pattern > - prefix table > > and then will simply start at `start` and scans until `end` looking > for pattern with the help of the prefix table. Again, I'm not sure what > you mean by state ;-> A virus would span several packets. So the state/knowldge of whether something is a virus is spread across several packets. That knowledge typically needs to be accumulated before making a call. Is this thing capable of remembering? cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-31 14:59 ` jamal @ 2004-12-31 15:39 ` Thomas Graf 2004-12-31 16:44 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2004-12-31 15:39 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104505142.1048.262.camel@jzny.localdomain> 2004-12-31 09:59 > On Fri, 2004-12-31 at 06:08, Thomas Graf wrote: > > * jamal <1104469111.1049.219.camel@jzny.localdomain> 2004-12-30 23:58 > > > > perhaps we need a few bits in the selector to identify > > > the owner who installed the rule to begin with. Then it would be safe to > > > interpret the meaning of the ID (by the same app). Did you say there > > > were some unused bits in the selector? > > > > Right, but why not do this in userspace by having a global map > > somewhere in a file? A u32 config could have been modified by > > multiple pids and it would get really messy to store a pid for > > every possible changeable item. > > The "PID" is per app - same as it is for rtm where the different known > routing apps like zebra are given an ID and certain IDs are left blank > for general use. So if you run multiple tcs they will all have the same > ID. The way the id is used in the rtm historically has been for allowing > multiple route protocols to install route entries (this way for example > OSPF does/nt announce RIP routes etc). In our case this would mean: > "this rule was installed by tc - only it knows what the opaque value > 10 means". In that case tc would be responsible to decode 10 which would > mean to it "this is an ip match therefore use ip_print() to pretty > print" and 10 is global to tc only and in that case stored in some > header file belonging to tc. Exactly so we don't need any PIDs stored. /etc/iproute2/tc_matches. > > __u8 kind; /* match type, for lookup in matchers table */ > > 255 possible matchers max? 256 not enough? > > __u8 flags; /* Invert Flag + Relations */ > > __u16 handle; /* must be unique per selector, may be autogenerated */ > > Ok this is the one used to store the opaque IDs - unique per app > so may be the same across multiple selectors. No, it must be unique, we will return EINVAL if it isn't. Most apps will set it to 0 which will autogenerate it with a generate && !get just like in u32. > Probably steal a few bits from here and use them in nkind. Fair, we can also steal a few bits from flags although I'd like to keep at least 3 empty. > Maybe i misunderstood you. You are going to have: > > SEL2 > | > | > +--match1 > | | > | + -- extended match1 > . . > . . > . . > . +---- extnded matchn > | > +--- matchn > > > Why does the first one have to be speacial? No, I'm going to have everything be ematches. > One of the mistakes in u32 is the tight packing of the matches. ie the > match1-n above are packet together. If they were put in TLVs probably > only new thing that will be needed is MATCH2 TLV. > No harm in transporting an extra 32 bits for a TLV - its not like you > are going to have a million matches and need to save space. > So i would suggest everything under a TLV (SEL2->MATCH(ES)->EMATCH(ES)) SEL2: TLV (TCA_U32_SEL2) +-------------------+ | Selector header | +-------------------+ | Match 1 TLV | +-------------------+ | ... | +-------------------+ | Match N TLV | +-------------------+ where Match TLV: +--------------------+ | Match Header (u32) | +- - - - - - - - - - + | Match Data | +--------------------+ Where Header: u8 kind; u8 flags; u16 handle; (or some more for kind and less for handle) where kind := { 0 | match type } handle := { 0 | unique } where kind == 0 means that the match is a container and data contains a u32 pointing into the match array. where kind != 0 means a match to be looked up in the matcher ops table. with the following matches: - u32: {u8|u16|u32} at offset match - meta - kmp - nbyte - ... In case SEL1 TLV is provided we simply create a flat index with no containers and all AND relations. Which means we can do: Sel2 \__match u32 at 2 16 0xff | + and match meta nfmark 2 | + and container \__match u32 at 4 11 0xf0 | + or nbyte "::1" at 12 Thoughts? > A virus would span several packets. So the state/knowldge of whether > something is a virus is spread across several packets. That knowledge > typically needs to be accumulated before making a call. Is this thing > capable of remembering? Not and I would rather do it outside in a separate ematch to match stateful information. I have to think some more about this though. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-31 15:39 ` Thomas Graf @ 2004-12-31 16:44 ` jamal 2004-12-31 17:32 ` jamal 2004-12-31 18:11 ` Thomas Graf 0 siblings, 2 replies; 42+ messages in thread From: jamal @ 2004-12-31 16:44 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Fri, 2004-12-31 at 10:39, Thomas Graf wrote: > * jamal <1104505142.1048.262.camel@jzny.localdomain> 2004-12-31 09:59 > Exactly so we don't need any PIDs stored. /etc/iproute2/tc_matches. Sure - that would be a fine place to store the IDs. Although i am not sure you want to call it matches - rather its u32 program identifiers. > > > __u8 kind; /* match type, for lookup in matchers table */ > > > > 255 possible matchers max? > > 256 not enough? I dont know, possible given how easy it would be to add a new match that it wont be sufficient. > > > __u8 flags; /* Invert Flag + Relations */ > > > __u16 handle; /* must be unique per selector, may be autogenerated */ > > > > Ok this is the one used to store the opaque IDs - unique per app > > so may be the same across multiple selectors. > > No, it must be unique, we will return EINVAL if it isn't. Most > apps will set it to 0 which will autogenerate it with a > generate && !get just like in u32. > We may be talking about different things - please double check. The misunderstanding seems to be on scoping: PID is global and the opaque ID is per match. IOW, theres a hierachy: A program(such as tc) installs a filter rule - we need to be able to identify the program - this is the PID. Unique across all Linux. A filter rule constitutes one or more matches. Different programs may install different u32 rules. For most users its a single program - tc. Each program that installs a rule will need to be identified by the PID. Main purpose is so that it can decode what the second level number means. This second level number is what i said was opaque. Its meaning is per app. So as an example PID 0x10 identifies application tc and opaque code 0x20 for tc translates to mean "thats an ip match, so no need to dump raw data - just dump it in english using ip_print()". 0x10 belongs to the selector; 0x20 is per match. 0x21 could mean "this is a TCP match with options" etc The ematches on the hand are purely decodable via user space without needing the opaque numbers - the kind/type serves these just fine. > > Probably steal a few bits from here and use them in nkind. > > Fair, we can also steal a few bits from flags although I'd like to > keep at least 3 empty. the 16 bits for match handle sounds a little too generous; still 4 bits or so from there should be fine. > SEL2: > > TLV (TCA_U32_SEL2) > +-------------------+ > | Selector header | > +-------------------+ > | Match 1 TLV | > +-------------------+ > | ... | > +-------------------+ > | Match N TLV | > +-------------------+ nod. > where Match TLV: > > +--------------------+ > | Match Header (u32) | > +- - - - - - - - - - + > | Match Data | > +--------------------+ > > Where Header: > u8 kind; > u8 flags; > u16 handle; > > (or some more for kind and less for handle) nod. > where > > kind := { 0 | match type } > handle := { 0 | unique } > > where kind == 0 means that the match is a container and data > contains a u32 pointing into the match array. > So essentiall good old u32 here? > where kind != 0 means a match to be looked up in the matcher > ops table. > > with the following matches: > - u32: {u8|u16|u32} at offset match > - meta > - kmp > - nbyte > - ... Everything is almost the same as to what i was saying - except i see the u32 again there; isnt kind = 0 covering this? > In case SEL1 TLV is provided we simply create a flat > index with no containers and all AND relations. No choice there. > Which means we can do: > > Sel2 > \__match u32 at 2 16 0xff > | > + and match meta nfmark 2 > | > + and container > \__match u32 at 4 11 0xf0 > | > + or nbyte "::1" at 12 > > Thoughts? > Ok, I may have understood what you are talking about in the second u32 where kind !=0. I think we are in sync. Go nuts. > > A virus would span several packets. So the state/knowldge of whether > > something is a virus is spread across several packets. That knowledge > > typically needs to be accumulated before making a call. Is this thing > > capable of remembering? > > Not and I would rather do it outside in a separate ematch to match > stateful information. I have to think some more about this though. Its a non-trivial problem. Its ok to defer it for now but keep it in the back of your mind. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-31 16:44 ` jamal @ 2004-12-31 17:32 ` jamal 2004-12-31 18:11 ` Thomas Graf 1 sibling, 0 replies; 42+ messages in thread From: jamal @ 2004-12-31 17:32 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Fri, 2004-12-31 at 11:44, jamal wrote: > > with the following matches: One thing i just remembered: You need to know the length of the matches and their data in order to store them. This is why i was earlier preaching putting them in TLVs. Some things dont need the datalen like u32 - however i suspect most will. So either need a length somewhere in the header or use TLVs for the ematches in which you can make T=kind - still have 32 bit inside body but reserve bits not used for flags for future use. Thoughts?. BTW, for deleting - should not allow deleting one of N matches. Deletion should be at selector level. Replace can be used to pretend a single match was deleted. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-31 16:44 ` jamal 2004-12-31 17:32 ` jamal @ 2004-12-31 18:11 ` Thomas Graf 2004-12-31 18:19 ` Thomas Graf 2004-12-31 20:51 ` jamal 1 sibling, 2 replies; 42+ messages in thread From: Thomas Graf @ 2004-12-31 18:11 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104511494.1048.303.camel@jzny.localdomain> 2004-12-31 11:44 > We may be talking about different things - please double check. > The misunderstanding seems to be on scoping: PID is global and > the opaque ID is per match. > IOW, theres a hierachy: > A program(such as tc) installs a filter rule - we need to be able to > identify the program - this is the PID. Unique across all Linux. > A filter rule constitutes one or more matches. Different programs may > install different u32 rules. For most users its a single program - tc. > Each program that installs a rule will need to be identified by the PID. > Main purpose is so that it can decode what the second level number > means. This second level number is what i said was opaque. Its meaning > is per app. > So as an example PID 0x10 identifies application tc and opaque code > 0x20 for tc translates to mean "thats an ip match, so no need to dump > raw data - just dump it in english using ip_print()". > 0x10 belongs to the selector; 0x20 is per match. 0x21 could mean "this > is a TCP match with options" etc > The ematches on the hand are purely decodable via user space without > needing the opaque numbers - the kind/type serves these just fine. Agreed I just don't get the reason for the PID. tc is usually called as a new process instance when dumping. For me there are two possible options, we can either introduce a ID system where an ID is assigned to a match string in either a config file or a header file or we can have tc write id -> desc maps to a global file somewhere where id means match id + u32 handle + parent + dev. The first is probably the better way. We could extend the match header to 64bit: u16 handle u16 matchID u16 kind u8 flags u8 pad > Its a non-trivial problem. Its ok to defer it for now but keep it in the > back of your mind. Agreed. * jamal <1104514372.1047.326.camel@jzny.localdomain> 2004-12-31 12:32 > One thing i just remembered: You need to know the length of the matches > and their data in order to store them. This is why i was earlier > preaching putting them in TLVs. Some things dont need the datalen > like u32 - however i suspect most will. It might not have been obvious, but every match is indeed in its own TLV. The part I don't want to use own TLVs is to separate the match header and match data. Match header is always the same size and match data can be aligned as well. We need len attributes for things like meta indev match, nbyte and kmp though. A Nbyte config TLV would look like: TCA_EMATCH +-------------------------+ | struct tcf_ematch_hdr | | - - - - - - - - - - - --| | ematch data | +-------------------------+ where ematch data contains nested TLVs such as TCA_EMATCH_NBYTE_HDR header, contains length of pattern + possibily more TCA_EMATCH_NBYTE_START lower limit of searching range (offset) TCA_EMATCH_NBYTE_END upper limit of searching range (offset) TCA_EMATCH_NBYTE_PATTERN searching pattern, u8 [] The length in the header is required because we can't use L from TCA_EMATCH_NBYTE_PATTERN since it might be padded. Same would go for meta: TCA_EMATCH_META_HDR TCA_EMATCH_META_LVALUE TCA_EMATCH_MEtA_RVALUE If needed we can put in match specific stats via a _STATS TLV. > So either need a length somewhere in the header or use TLVs for the > ematches in which you can make T=kind - still have 32 bit inside body > but reserve bits not used for flags for future use. Thoughts?. I thought about the following ordering in the selctor TLV: T=1 generic selector header T=2 classifier specific selector header (u32 hashsing stuff goes here) T=3 ematch 1 T=N ematch N > BTW, for deleting - should not allow deleting one of N matches. Deletion > should be at selector level. Replace can be used to pretend a single > match was deleted. Agreed. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-31 18:11 ` Thomas Graf @ 2004-12-31 18:19 ` Thomas Graf 2004-12-31 20:51 ` jamal 1 sibling, 0 replies; 42+ messages in thread From: Thomas Graf @ 2004-12-31 18:19 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * Thomas Graf <20041231181153.GP32419@postel.suug.ch> 2004-12-31 19:11 > T=1 generic selector header > T=2 classifier specific selector header (u32 hashsing stuff goes here) > T=3 ematch 1 > T=N ematch N Of course this should have been: T=N ematch N-2 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-31 18:11 ` Thomas Graf 2004-12-31 18:19 ` Thomas Graf @ 2004-12-31 20:51 ` jamal 2005-01-01 12:10 ` Thomas Graf 2005-01-01 18:32 ` Thomas Graf 1 sibling, 2 replies; 42+ messages in thread From: jamal @ 2004-12-31 20:51 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Fri, 2004-12-31 at 13:11, Thomas Graf wrote: > * jamal <1104511494.1048.303.camel@jzny.localdomain> 2004-12-31 11:44 > Agreed I just don't get the reason for the PID. tc is usually called as > a new process instance when dumping. Indeed. A new instance of tc should be able to delete or understand what an old instance with different process ID installed. The P in pid here stands for "program" not "process". Looking at it from another angle it is the "owner" of that rule. I gave an example of routes as a comparison: Example: [root@jzny root]# ip r ls 10.1.0.25 via 10.0.0.90 dev eth0 proto zebra 10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.9 127.0.0.0/8 dev lo scope link default via 10.0.0.1 dev eth0 [root@jzny root]# See the "proto" field? Same logic - if tc installed those rules that should read "tc". Same thinking for new u32 display with sel2. More impaortantly though: If the u32 rule was installed by tc, then it can understand what the code/opaqueid _in the match_ means. If i knew how tc used those opaque values then i too in my program could intepret them when i dump even though i didnt install the rule. > For me there are two possible > options, we can either introduce a ID system where an ID is assigned > to a match string in either a config file or a header file or we can > have tc write id -> desc maps to a global file somewhere where id > means match id + u32 handle + parent + dev. The first is probably > the better way. We could extend the match header to 64bit: > > u16 handle > u16 matchID > u16 kind > u8 flags > u8 pad We need to know who installed the rule so we can intepret what the ID in the match is. Unless you see a desire that, in order to understand all this, we need to also know on which device and which parent adds towards reaching that goal then I am afraid this will overcomplicate things. Theres probably other things you could gain from as well by having all those fields; you dont need them for this simple case. > > Its a non-trivial problem. Its ok to defer it for now but keep it in the > > back of your mind. > > Agreed. Lets review at a future date though. > * jamal <1104514372.1047.326.camel@jzny.localdomain> 2004-12-31 12:32 > > One thing i just remembered: You need to know the length of the matches > > and their data in order to store them. This is why i was earlier > > preaching putting them in TLVs. Some things dont need the datalen > > like u32 - however i suspect most will. > > It might not have been obvious, but every match is indeed in its > own TLV. Ok, cool. To recall: > TLV (TCA_U32_SEL2) > +-------------------+ > | Selector header | > +-------------------+ > | Match 1 TLV | > +-------------------+ > | ... | > +-------------------+ > | Match N TLV | > +-------------------+ You may actually need those Ts enumerated as if they are array indices. Look at the way i transfer actions using "order" > The part I don't want to use own TLVs is to separate the > match header and match data. Match header is always the same size > and match data can be aligned as well. We need len attributes for > things like meta indev match, nbyte and kmp though. A Nbyte config > TLV would look like: > > TCA_EMATCH > +-------------------------+ > | struct tcf_ematch_hdr | > | - - - - - - - - - - - --| > | ematch data | > +-------------------------+ I was more worried about the matches not being TLVs. So this looks good. > where ematch data contains nested TLVs such as > > TCA_EMATCH_NBYTE_HDR header, contains length of pattern + possibily more > TCA_EMATCH_NBYTE_START lower limit of searching range (offset) > TCA_EMATCH_NBYTE_END upper limit of searching range (offset) > TCA_EMATCH_NBYTE_PATTERN searching pattern, u8 [] > > The length in the header is required because we can't use > L from TCA_EMATCH_NBYTE_PATTERN since it might be padded. > My view was length is also a common field. Theres also another reason why you want length viewable in a dumb way: --> you dont really wanna force people to write dumpers for these ematchers (goal: keep this interface as simple as it can be); i.e dont need any pretty formater in the kernel. If you have a length then you can reconstruct the TCA_EMATCH easily without caring about the content. This is the path i started taking in eactions. Refer to my notes i sent earlier on the mythical one page ematch/eaction. If someone wants funky stuff - write a classifier. > Same would go for meta: > > TCA_EMATCH_META_HDR > TCA_EMATCH_META_LVALUE > TCA_EMATCH_MEtA_RVALUE > > If needed we can put in match specific stats via a _STATS TLV. Stats are the other thing that adds complexity to the API. If you can make it optional then that would be best - I was thinking to not even have it in. > > So either need a length somewhere in the header or use TLVs for the > > ematches in which you can make T=kind - still have 32 bit inside body > > but reserve bits not used for flags for future use. Thoughts?. > > I thought about the following ordering in the selctor TLV: > > T=1 generic selector header > > T=2 classifier specific selector header (u32 hashsing stuff goes here) > T=3 ematch 1 > T=N ematch N I thought we already agreed on the layout: SEL2- which may nest E/MATCHEs TLVs. Sel2 not being very different from original selector. May be i didnt follow. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-31 20:51 ` jamal @ 2005-01-01 12:10 ` Thomas Graf 2005-01-01 23:29 ` jamal 2005-01-01 18:32 ` Thomas Graf 1 sibling, 1 reply; 42+ messages in thread From: Thomas Graf @ 2005-01-01 12:10 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104526311.1047.379.camel@jzny.localdomain> 2004-12-31 15:51 > On Fri, 2004-12-31 at 13:11, Thomas Graf wrote: > > * jamal <1104511494.1048.303.camel@jzny.localdomain> 2004-12-31 11:44 > > > Agreed I just don't get the reason for the PID. tc is usually called as > > a new process instance when dumping. > > Indeed. A new instance of tc should be able to delete or understand > what an old instance with different process ID installed. > The P in pid here stands for "program" not "process". Looking at it from > another angle it is the "owner" of that rule. Ahh, now this makes sense. Sorry, I misunderstood you all the time. > > u16 handle > > u16 matchID > > u16 kind > > u8 flags > > u8 pad > > We need to know who installed the rule so we can intepret what the ID in > the match is. Yes but this should go into the selector header. > You may actually need those Ts enumerated as if they are array > indices. Look at the way i transfer actions using "order" Right, order would be N-2. I don't see any reason for storing it, I didn't had need for it in EGP and I used exactly the same techniques as in the action code. > My view was length is also a common field. Theres also another reason > why you want length viewable in a dumb way: > --> you dont really wanna force people to write dumpers for these > ematchers (goal: keep this interface as simple as it can be); i.e dont > need any pretty formater in the kernel. > If you have a length then you can reconstruct the TCA_EMATCH easily > without caring about the content. This is the path i started taking in > eactions. Refer to my notes i sent earlier on the mythical one page > ematch/eaction. > If someone wants funky stuff - write a classifier. Very simple ematches will only require a struct for configuration so the dumping is not more than 5 lines of code. The length can be calculated via RTA_PAYLOAD(ematch_tlv) - sizeof(ematch_hdr). This of course requires the struct to be aligned to RTA_ALIGN but that's generally not a problem at all. I understand your concern but I also want to allow a little bit more complicated ematches such as KMP or later a very simple regular expression implementation. Here's some code from the simple_cmp key of EGP giving a good idea how a simple ematch will look like: static int sc_match(struct egp_cls *cls, struct egp_key *k) { struct egp_key_sc *sc = k->data; u32 lvalue = cls->ops->read(cls, &sc->left); u32 rvalue = cls->ops->read(cls, &sc->right); switch (sc->op) { #define SC(a, b) case EGP_OP_##a: return b SC(NONE, 0); SC(EQ, lvalue == rvalue); SC(NE, lvalue != rvalue); SC(LT, lvalue < rvalue); SC(LE, lvalue <= rvalue); SC(GT, lvalue > rvalue); SC(GE, lvalue >= rvalue); #undef SC } BUG(); return 0; } static int sc_validate(struct egp_config *conf, struct egp_ops *ops, void *d, size_t len) { int err; struct egp_key_sc *sc = d; if (len != sizeof(*sc) || sc->op > EGP_OP_MAX) return -EINVAL; return 0; } static int sc_replace(struct egp_cls *cls, struct egp_key *k, void *d, size_t len) { k->data = kmalloc(len, GFP_KERNEL); if (NULL == k->data) return -ENOBUFS; memcpy(k->data, d, len); return 0; } static int sc_dump(struct egp_cls *cls, struct sk_buff *skb, struct egp_key *k) { struct egp_key_sc *sc = k->data; RTA_PUT(skb, TCA_EGP_KEY_DATA, sizeof(*sc), sc); return 0; rtattr_failure: return -1; } static void sc_free_data(struct egp_cls *cls, struct egp_key *k) { if (k->data) kfree(k->data); } OTOH, on more complex ematches data might be nested TLVs with optional parameters. etc. > Stats are the other thing that adds complexity to the API. If you can > make it optional then that would be best - I was thinking to not even > have it in. It's probably enough to allow optional generic hits/success stats per match. > I thought we already agreed on the layout: > SEL2- which may nest E/MATCHEs TLVs. Sel2 not being very different from > original selector. May be i didnt follow. You did follow but I made the existing u32 match a ematch as well. Things like the program ID goes into the selector header T=1 and classifier specific selector bits such as the hashing bits goes into T=2. Thinking of it it's probably cleaner to put things like hmark, hoff into its own TLV. So the selector TLV contains the selector header at T=1 and nested ematch TLVs at T=2..T=N. I think we're mostly in sync so I'll start working on it and we can review again. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2005-01-01 12:10 ` Thomas Graf @ 2005-01-01 23:29 ` jamal 2005-01-02 0:06 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2005-01-01 23:29 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Sat, 2005-01-01 at 07:10, Thomas Graf wrote: > * jamal <1104526311.1047.379.camel@jzny.localdomain> 2004-12-31 15:51 > > We need to know who installed the rule so we can intepret what the ID in > > the match is. > > Yes but this should go into the selector header. > Agreed - PID needs to go into selector. The other ID is per match. > > You may actually need those Ts enumerated as if they are array > > indices. Look at the way i transfer actions using "order" > > Right, order would be N-2. I don't see any reason for storing it, I > didn't had need for it in EGP and I used exactly the same techniques > as in the action code. > If youve done it on EGP then go ahead and use that scheme since you are comfortable with it. > > My view was length is also a common field. Theres also another reason > > why you want length viewable in a dumb way: > > --> you dont really wanna force people to write dumpers for these > > ematchers (goal: keep this interface as simple as it can be); i.e dont > > need any pretty formater in the kernel. > > If you have a length then you can reconstruct the TCA_EMATCH easily > > without caring about the content. This is the path i started taking in > > eactions. Refer to my notes i sent earlier on the mythical one page > > ematch/eaction. > > If someone wants funky stuff - write a classifier. > > Very simple ematches will only require a struct for configuration > so the dumping is not more than 5 lines of code. The length can be > calculated via RTA_PAYLOAD(ematch_tlv) - sizeof(ematch_hdr). This > of course requires the struct to be aligned to RTA_ALIGN but > that's generally not a problem at all. Does the ematch API include a dump()? I dont think it should - thats the point i was making. Should be simple. > I understand your concern but I also want to allow a little bit > more complicated ematches such as KMP or later a very simple > regular expression implementation. > > Here's some code from the simple_cmp key of EGP giving a good > idea how a simple ematch will look like: > > static int > sc_match(struct egp_cls *cls, struct egp_key *k) > { > struct egp_key_sc *sc = k->data; > u32 lvalue = cls->ops->read(cls, &sc->left); > u32 rvalue = cls->ops->read(cls, &sc->right); > > switch (sc->op) { > #define SC(a, b) case EGP_OP_##a: return b > SC(NONE, 0); > SC(EQ, lvalue == rvalue); > SC(NE, lvalue != rvalue); > SC(LT, lvalue < rvalue); > SC(LE, lvalue <= rvalue); > SC(GT, lvalue > rvalue); > SC(GE, lvalue >= rvalue); > #undef SC > } > > BUG(); > return 0; > } > nice and valid for API. > static int > sc_validate(struct egp_config *conf, struct egp_ops *ops, void *d, size_t len) > { > int err; > struct egp_key_sc *sc = d; > > if (len != sizeof(*sc) || sc->op > EGP_OP_MAX) > return -EINVAL; > return 0; > } > Even this is too much for a simple ematch. Validation should happen in user space or maybe at the mother clasifier. maybe a maxsize,minsize attribute is needed in the ematch struct. > static int > sc_replace(struct egp_cls *cls, struct egp_key *k, void *d, size_t len) > { > k->data = kmalloc(len, GFP_KERNEL); > if (NULL == k->data) > return -ENOBUFS; > memcpy(k->data, d, len); > return 0; > } Equivalent of this should be done by the mother classifier. All it needs to know is the length (and no other details). And the length is known from the L in TLV. > static int > sc_dump(struct egp_cls *cls, struct sk_buff *skb, struct egp_key *k) > { > struct egp_key_sc *sc = k->data; > RTA_PUT(skb, TCA_EGP_KEY_DATA, sizeof(*sc), sc); > return 0; > > rtattr_failure: > return -1; > } > Again if you store length, you dont need this. Mother classifier can do it. > static void > sc_free_data(struct egp_cls *cls, struct egp_key *k) > { > if (k->data) > kfree(k->data); > } Dont need this either; mother classifier can handle it. > OTOH, on more complex ematches data might be nested TLVs with > optional parameters. etc. > > > Stats are the other thing that adds complexity to the API. If you can > > make it optional then that would be best - I was thinking to not even > > have it in. > > It's probably enough to allow optional generic hits/success stats > per match. Even at the moment classifiers dont have stats. If you want stats you could add a simple gact accept action. Note also: think of the 100K rules being added and amount of RAM used; > > I thought we already agreed on the layout: > > SEL2- which may nest E/MATCHEs TLVs. Sel2 not being very different from > > original selector. May be i didnt follow. > > You did follow but I made the existing u32 match a ematch as well. > Things like the program ID goes into the selector header T=1 and > classifier specific selector bits such as the hashing bits goes > into T=2. Thinking of it it's probably cleaner to put things like > hmark, hoff into its own TLV. So the selector TLV contains the > selector header at T=1 and nested ematch TLVs at T=2..T=N. > Note that the ematch is supposed to be a very very simple thing... Something a fireman who has implemented a iptables target can whip in an hour. Keep in mind that design goal. Non trivial coding needed or poor usability is the major problem with tc in general. Avoid that theme. An ematch _needs_ a mother classifier such as u32. u32 has a very nice and very flexible layout - it can be trained to build any sort of tree. Maybe the first step should be to not even have u32 as an ematch .. > I think we're mostly in sync so I'll start working on it and > we can review again. Maybe i will wait for the code. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2005-01-01 23:29 ` jamal @ 2005-01-02 0:06 ` Thomas Graf 2005-01-03 14:36 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2005-01-02 0:06 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104622164.1048.444.camel@jzny.localdomain> 2005-01-01 18:29 > Does the ematch API include a dump()? I dont think it should - thats the > point i was making. Should be simple. Yes, although simple ematches are not required to implement dump. > > [validate] > Even this is too much for a simple ematch. Validation should happen in > user space or maybe at the mother clasifier. maybe a maxsize,minsize > attribute is needed in the ematch struct. > > [replace] > Equivalent of this should be done by the mother classifier. > All it needs to know is the length (and no other details). And the > length is known from the L in TLV. I merged validate/replace into change which takes a unsigned long for storage and a data/len parameter. It's up to the ematch what he does with it, data may contain a u32 directly and the ematch might save it in the unsigned long or a ematch may allocate a structure. Why are you focused on hiding so much? I'd rather try to make it simple but still allow more complex ematches to exist. Have a look at http://people.suug.ch/~tgr/tmp/ematch.diff I broke the API down to: change match destroy (optional, only for complex ematches) dump (optional, only for complex ematches) > Even at the moment classifiers dont have stats. If you want stats > you could add a simple gact accept action. > Note also: think of the 100K rules being added and amount of RAM used; Agreed, I didn't add them so far, it's up to the ematch whether it wants to maintain stats or not. > Note that the ematch is supposed to be a very very simple thing... > Something a fireman who has implemented a iptables target can whip in an > hour. Keep in mind that design goal. Non trivial coding needed or poor > usability is the major problem with tc in general. Avoid that theme. > An ematch _needs_ a mother classifier such as u32. u32 has a very nice > and very flexible layout - it can be trained to build any sort of tree. > Maybe the first step should be to not even have u32 as an ematch .. I understand your point but I want to at least be able to implement some more complex stuff. Hiding too much can be bad too. Having only 2 functions to implement is really easy, the rest can be done the LinuxWay(tm) ;-> Have a look at the code and tell me what you think. Here's an example ematch, I find this quite simple already. static in foo_change(struct tcf_proto *tp, void *data, int len, struct tcf_ematch *m) { if (len != sizeof(u32)) return -EINVAL; m->data = *(u32*)data; return 0; } static int foo_match(struct sk_buff *skb, struct tcf_ematch *m) { return skb->security == m->data; } static struct tcf_ematch_ops foo_ops = { .kind = 111, .change = foo_change, .match = foo_match, .owner = THIS_MODULE, .link = LIST_HEAD_INIT(foo_ops.link) } static int __init foo_init(void) { return tcf_ematch_register(&foo_ops); } static void __exit foo_exit(void) { return tcf_ematch_unregister(&foo_ops); } module_init(init_foo); module_exit(exit_foo); ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2005-01-02 0:06 ` Thomas Graf @ 2005-01-03 14:36 ` jamal 2005-01-03 15:02 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2005-01-03 14:36 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Sat, 2005-01-01 at 19:06, Thomas Graf wrote: > * jamal <1104622164.1048.444.camel@jzny.localdomain> 2005-01-01 18:29 > > Does the ematch API include a dump()? I dont think it should - thats the > > point i was making. Should be simple. > > Yes, although simple ematches are not required to implement dump. ok. I realize its optional - but i wouldnt even give the writter of ematch the opportunity to write one. Want something more complex? write a classifier. > > > [validate] > > Even this is too much for a simple ematch. Validation should happen in > > user space or maybe at the mother clasifier. maybe a maxsize,minsize > > attribute is needed in the ematch struct. > > > [replace] > > Equivalent of this should be done by the mother classifier. > > All it needs to know is the length (and no other details). And the > > length is known from the L in TLV. > > I merged validate/replace into change which takes a unsigned long > for storage and a data/len parameter. It's up to the ematch what he > does with it, data may contain a u32 directly and the ematch might > save it in the unsigned long or a ematch may allocate a structure. Again allowing for this may be overkill. Just send the same structure the ematch needs in exactly the same form it needs it and you dont need this. > Why are you focused on hiding so much? I'd rather try to make it > simple but still allow more complex ematches to exist. > > Have a look at http://people.suug.ch/~tgr/tmp/ematch.diff Thanks, give me a few hours then i will look. > I broke the API down to: > > change > match > destroy (optional, only for complex ematches) > dump (optional, only for complex ematches) > Other than match, anything really should be optional for simplicity sake. > > Note that the ematch is supposed to be a very very simple thing... > > Something a fireman who has implemented a iptables target can whip in an > > hour. Keep in mind that design goal. Non trivial coding needed or poor > > usability is the major problem with tc in general. Avoid that theme. > > An ematch _needs_ a mother classifier such as u32. u32 has a very nice > > and very flexible layout - it can be trained to build any sort of tree. > > Maybe the first step should be to not even have u32 as an ematch .. > > I understand your point but I want to at least be able to implement > some more complex stuff. Hiding too much can be bad too. Having only 2 > functions to implement is really easy, the rest can be done the LinuxWay(tm) ;-> > True - but the goals of this ematch is to be simple ;-> > Have a look at the code and tell me what you think. > > Here's an example ematch, I find this quite simple already. > > static in foo_change(struct tcf_proto *tp, void *data, int len, struct > tcf_ematch *m) > { > if (len != sizeof(u32)) > return -EINVAL; > m->data = *(u32*)data; > return 0; > } > I still say not needed ;-> > static int foo_match(struct sk_buff *skb, struct tcf_ematch *m) > { > return skb->security == m->data; > } makes sense. > static struct tcf_ematch_ops foo_ops = { > .kind = 111, > .change = foo_change, > .match = foo_match, > .owner = THIS_MODULE, > .link = LIST_HEAD_INIT(foo_ops.link) > } whats the .link for? > static int __init foo_init(void) > { > return tcf_ematch_register(&foo_ops); > } > > static void __exit foo_exit(void) > { > return tcf_ematch_unregister(&foo_ops); > } > > module_init(init_foo); > module_exit(exit_foo); All looks good. Give me a few hours, first day of working week will slow me down. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2005-01-03 14:36 ` jamal @ 2005-01-03 15:02 ` Thomas Graf 2005-01-03 15:55 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2005-01-03 15:02 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104763012.1047.524.camel@jzny.localdomain> 2005-01-03 09:36 > On Sat, 2005-01-01 at 19:06, Thomas Graf wrote: > > * jamal <1104622164.1048.444.camel@jzny.localdomain> 2005-01-01 18:29 > > > Does the ematch API include a dump()? I dont think it should - thats the > > > point i was making. Should be simple. > > > > Yes, although simple ematches are not required to implement dump. > > ok. I realize its optional - but i wouldnt even give the writter of > ematch the opportunity to write one. Want something more complex? write > a classifier. A classifier is at least 300 lines and you lose the ability to use the logic relations. > Again allowing for this may be overkill. Just send the same structure > the ematch needs in exactly the same form it needs it and you dont need > this. Compromise: If change/dump is not provided the api allocates and memcpy's itself resptively dumps m->data. Simple ematches don't have to care and can simple access m->data, more complex ematches can implement their own change/dump. Does that sound beter? > whats the .link for? I use list.h to chain ematch_ops and it's better to have it initialized. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2005-01-03 15:02 ` Thomas Graf @ 2005-01-03 15:55 ` jamal 2005-01-03 16:26 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2005-01-03 15:55 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Mon, 2005-01-03 at 10:02, Thomas Graf wrote: > > ok. I realize its optional - but i wouldnt even give the writter of > > ematch the opportunity to write one. Want something more complex? write > > a classifier. > > A classifier is at least 300 lines and you lose the ability to use the > logic relations. Sure, but simpler-than-classifier is where we started ;-> I think in time we should revamp this a little more, but classifiers we have today already we just cant get rid of them now. Over time, I agree we should revamp. > > Again allowing for this may be overkill. Just send the same structure > > the ematch needs in exactly the same form it needs it and you dont need > > this. > > Compromise: If change/dump is not provided the api allocates and > memcpy's itself resptively dumps m->data. Simple ematches don't have to > care and can simple access m->data, more complex ematches can implement > their own change/dump. Does that sound beter? Yes, indeed. A copy instead of reference is owned by the classifier. To me its a valid compromise. Dont want those matches to be shared in any case. Also dont want user to know shit about TLVs. Neither in the kernel, nor in user space. Simplicty. Sorry, havent looked at the code yet. Is the patch you posted the same as you have on the url earlier? > > whats the .link for? > > I use list.h to chain ematch_ops and it's better to have it initialized. I relaized that - but should the user know anything about this? cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2005-01-03 15:55 ` jamal @ 2005-01-03 16:26 ` Thomas Graf 0 siblings, 0 replies; 42+ messages in thread From: Thomas Graf @ 2005-01-03 16:26 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104767726.1049.591.camel@jzny.localdomain> 2005-01-03 10:55 > Sorry, havent looked at the code yet. Is the patch you posted the same > as you have on the url earlier? Yes, with a few minor fixes and I made it depend on CONFIG_NET_CLS_EMATCH and added nil macros when it's not defined. Those macros need some more work though. TODO: - KConfig doc - make change optional - dump m->data if no dump implementation is provided The question I'm yet unsure is whether to strictly return 1/0 or allow classifier return codes. Given I'm done with the above, a minimal ematch will look like: static int sec_match(struct sk_buff *skb, struct tcf_ematch *m) { return *(u32 *) m->data == skb->security; } static struct tcf_ematch_ops sec_ops = { .kind = 11, .match = sec_match, .owner = THIS_MODULE } ... init/exit calling tcf_em_(un_register ... What do you think about checking RTA_PAYLOAD of the ematch data for < sizeof(unsigned long) and assign it to m->data without allocation? Would save us nonse allocations. The yet unused 8bit in tcf_ematch_hdr would hold a flag to indicate that m->data contains the data directly. > I relaized that - but should the user know anything about this? Actually no but initializing it in tcf_em_register wouldn't serve any purpose so we can either remove it or leave it there. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2004-12-31 20:51 ` jamal 2005-01-01 12:10 ` Thomas Graf @ 2005-01-01 18:32 ` Thomas Graf 2005-01-01 23:42 ` jamal 1 sibling, 1 reply; 42+ messages in thread From: Thomas Graf @ 2005-01-01 18:32 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104526311.1047.379.camel@jzny.localdomain> 2004-12-31 15:51 > > T=1 generic selector header > > > > T=2 classifier specific selector header (u32 hashsing stuff goes here) > > T=3 ematch 1 > > T=N ematch N > > I thought we already agreed on the layout: > SEL2- which may nest E/MATCHEs TLVs. Sel2 not being very different from > original selector. May be i didnt follow. OK, I changed my mind while implementing it and a selector now looks like this: Selector TLV: +----------------------------+ | TCA_EMATCH_TREE_HDR | +----------------------------+ | TCA_EMATCH_TREE_LIST | | +------------------------+ | | | T=1 Match 1 | | | +------------------------+ | | | T=2 Match 2 | | | +------------------------+ | | | T=N Match N | | | +------------------------+ | +----------------------------+ So we can put more into the selector if needed without breaking compatibility. TCA_EMATCH_TREE_HDR currently contains `nmatches' specifying N and progid holding the PID you talked about. The match TLVs must have a continous numbering because I don't want to define limits as done in the action code. I'll post an RFC patch tomorrow implementing the API and a simple ematch. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2005-01-01 18:32 ` Thomas Graf @ 2005-01-01 23:42 ` jamal 2005-01-02 0:13 ` Thomas Graf 0 siblings, 1 reply; 42+ messages in thread From: jamal @ 2005-01-01 23:42 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Sat, 2005-01-01 at 13:32, Thomas Graf wrote: > OK, I changed my mind while implementing it and a selector now looks > like this: > > Selector TLV: > > +----------------------------+ > | TCA_EMATCH_TREE_HDR | > +----------------------------+ > | TCA_EMATCH_TREE_LIST | > | +------------------------+ | > | | T=1 Match 1 | | > | +------------------------+ | > | | T=2 Match 2 | | > | +------------------------+ | > | | T=N Match N | | > | +------------------------+ | > +----------------------------+ > what happened to the good old SEL TLV (which i believe we called SEL2 now); or maybe thats what contains this TLV? > So we can put more into the selector if needed without breaking > compatibility. TCA_EMATCH_TREE_HDR currently contains `nmatches' > specifying N and progid holding the PID you talked about. Ok, so i think you may be saying the old selector stays intact then (sans the matches)? Why do you need to specify "nmatches". You know exactly where each one starts and ends (from the TLVs). What is TCA_EMATCH_TREE_LIST for? Looks like another TLV nesting. Not needed, you just plumb the T=1,..T=N right after the header. > The match TLVs must have a continous numbering because I don't > want to define limits as done in the action code. I think the way you have it is fine - and believe it is the way the action code has it for the list. > I'll post an RFC patch tomorrow implementing the API and a > simple ematch. Nice. I have started implementing the eaction code but too obsessed with this other thing i am working on - hopefully i will get to it before my vacation expires. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2005-01-01 23:42 ` jamal @ 2005-01-02 0:13 ` Thomas Graf 2005-01-03 14:39 ` jamal 0 siblings, 1 reply; 42+ messages in thread From: Thomas Graf @ 2005-01-02 0:13 UTC (permalink / raw) To: jamal; +Cc: David S. Miller, netdev * jamal <1104622934.1047.460.camel@jzny.localdomain> 2005-01-01 18:42 > what happened to the good old SEL TLV (which i believe we called SEL2 > now); or maybe thats what contains this TLV? Please look at the patch I posted in the other post. I think we missudnerstand each other. > > So we can put more into the selector if needed without breaking > > compatibility. TCA_EMATCH_TREE_HDR currently contains `nmatches' > > specifying N and progid holding the PID you talked about. > > Ok, so i think you may be saying the old selector stays intact then > (sans the matches)? Right, old selector TLV statys as-is. Although I have to look u32 closely before I can make final statements. > Why do you need to specify "nmatches". It's mainly a shortcut to validate precedence jumps so I can avoid traversing the RTA chain twice. It could be avoided but is quite handy to speed things up and also acts for validation purposes to check consistency of the match list. > What is TCA_EMATCH_TREE_LIST for? Looks like another TLV nesting. Not > needed, you just plumb the T=1,..T=N right after the header. No, what if we need some more stuff in the selector TLV? We can't modify the header TLV w/o breaking backwards compatibility. Adding this addtional nesting allows to simply add stuff after TREE_LIST TLV. > I think the way you have it is fine - and believe it is the way the > action code has it for the list. You're using a maximum prio aren't you? I use a RTA_OK() loop supporting unlimited number of matches without the need to allocate rtattr pointer array. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 2005-01-02 0:13 ` Thomas Graf @ 2005-01-03 14:39 ` jamal 0 siblings, 0 replies; 42+ messages in thread From: jamal @ 2005-01-03 14:39 UTC (permalink / raw) To: Thomas Graf; +Cc: David S. Miller, netdev On Sat, 2005-01-01 at 19:13, Thomas Graf wrote: > * jamal <1104622934.1047.460.camel@jzny.localdomain> 2005-01-01 18:42 > > what happened to the good old SEL TLV (which i believe we called SEL2 > > now); or maybe thats what contains this TLV? > > Please look at the patch I posted in the other post. I think > we missudnerstand each other. Will do. > > Why do you need to specify "nmatches". > > It's mainly a shortcut to validate precedence jumps so I > can avoid traversing the RTA chain twice. It could be > avoided but is quite handy to speed things up and > also acts for validation purposes to check consistency of > the match list. Ok. > > What is TCA_EMATCH_TREE_LIST for? Looks like another TLV nesting. Not > > needed, you just plumb the T=1,..T=N right after the header. > > No, what if we need some more stuff in the selector TLV? We can't > modify the header TLV w/o breaking backwards compatibility. Adding > this addtional nesting allows to simply add stuff after TREE_LIST > TLV. Good point. > > I think the way you have it is fine - and believe it is the way the > > action code has it for the list. > > You're using a maximum prio aren't you? I use a RTA_OK() loop > supporting unlimited number of matches without the need to > allocate rtattr pointer array. Sounds reasonable; Note in my case, you probably want a limit on how many actions you chain in one policy - I would suggest you do the same as well. In other words even if you do it that way, have a limit check/setting somewhere. cheers, jamal ^ permalink raw reply [flat|nested] 42+ messages in thread
end of thread, other threads:[~2005-01-03 16:26 UTC | newest]
Thread overview: 42+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <200412270715.iBR7Fffe026855@hera.kernel.org>
2004-12-27 12:16 ` [PKT_SCHED]: Allow using nfmark as key in U32 classifier Thomas Graf
2004-12-28 13:20 ` jamal
2004-12-28 13:40 ` Thomas Graf
2004-12-28 13:59 ` jamal
2004-12-28 14:50 ` Thomas Graf
2004-12-28 15:55 ` jamal
2004-12-28 16:11 ` Thomas Graf
2004-12-28 16:36 ` jamal
2004-12-28 16:51 ` jamal
2004-12-28 19:26 ` Thomas Graf
2004-12-28 21:14 ` jamal
2004-12-28 22:10 ` Thomas Graf
2004-12-28 23:06 ` jamal
2004-12-28 23:19 ` Thomas Graf
2004-12-28 23:39 ` jamal
2004-12-29 0:09 ` Thomas Graf
2004-12-29 1:13 ` jamal
2004-12-29 12:48 ` Thomas Graf
2004-12-29 14:20 ` jamal
2004-12-29 15:01 ` Thomas Graf
2004-12-29 15:53 ` jamal
2004-12-30 17:43 ` Thomas Graf
2004-12-31 4:58 ` jamal
2004-12-31 11:08 ` Thomas Graf
2004-12-31 14:59 ` jamal
2004-12-31 15:39 ` Thomas Graf
2004-12-31 16:44 ` jamal
2004-12-31 17:32 ` jamal
2004-12-31 18:11 ` Thomas Graf
2004-12-31 18:19 ` Thomas Graf
2004-12-31 20:51 ` jamal
2005-01-01 12:10 ` Thomas Graf
2005-01-01 23:29 ` jamal
2005-01-02 0:06 ` Thomas Graf
2005-01-03 14:36 ` jamal
2005-01-03 15:02 ` Thomas Graf
2005-01-03 15:55 ` jamal
2005-01-03 16:26 ` Thomas Graf
2005-01-01 18:32 ` Thomas Graf
2005-01-01 23:42 ` jamal
2005-01-02 0:13 ` Thomas Graf
2005-01-03 14:39 ` jamal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).