netdev.vger.kernel.org archive mirror
* (Lack of) specification for RX n-tuple filtering
@ 2010-07-22 21:02 Ben Hutchings
  2010-07-22 21:50 ` Dimitris Michailidis
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Hutchings @ 2010-07-22 21:02 UTC (permalink / raw)
  To: Peter Waskiewicz; +Cc: netdev, David Miller

The n-tuple filtering facility is half-baked at present.  There is an
interface to add filters but none to remove them!  And ETHTOOL_GRXNTUPLE
is not at all symmetric with ETHTOOL_SRXNTUPLE (which I complained about
at the time it was added, to no avail).

An ETHTOOL_RESET command with flag ETH_RESET_FILTER set could be defined
to clear all the filters, but that's a big hammer to use, and I think
that in general drivers should push the same configuration back to the
hardware after resetting it for whatever reason.

So far as I can work out, ixgbe clears all the filters when the filter
table fills up.  Is that true?  Is this really the intended behaviour of
manually set filters?

I also see this in the ixgbe implementation:

	/*
	 * Program the relevant mask registers.  If src/dst_port or src/dst_addr
	 * are zero, then assume a full mask for that field.  Also assume that
	 * a VLAN of 0 is unspecified, so mask that out as well.  L4type
	 * cannot be masked out in this implementation.
	 *
	 * This also assumes IPv4 only.  IPv6 masking isn't supported at this
	 * point in time.
	 */

An IPv4 address of 0 is certainly valid, so this isn't a good rule.  And
in any case, such a rule should be specified *with the interface*, in
<linux/ethtool.h>, not the implementation.

This also implies that 'mask' specifies bits to be ignored, not bits to
be matched.  That also was not specified.
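
To make the ambiguity concrete, here is a minimal sketch of the two possible readings of 'mask'. The helper names are hypothetical and appear in neither <linux/ethtool.h> nor any driver; the point is only that the same (value, mask) pair gives opposite answers under the two conventions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bits set in 'mask' are ignored
 * (the reading that the ixgbe comment implies). */
static int match_ignoring_mask_bits(uint32_t field, uint32_t value, uint32_t mask)
{
    return ((field ^ value) & ~mask) == 0;
}

/* Hypothetical helper: bits set in 'mask' are the ones compared. */
static int match_on_mask_bits(uint32_t field, uint32_t value, uint32_t mask)
{
    return ((field ^ value) & mask) == 0;
}

/* Same inputs, opposite results: 192.168.10.200 against 192.168.10.0
 * with mask 0x000000ff. */
static void mask_polarity_selfcheck(void)
{
    assert(match_ignoring_mask_bits(0xC0A80AC8u, 0xC0A80A00u, 0x000000ffu));
    assert(!match_on_mask_bits(0xC0A80AC8u, 0xC0A80A00u, 0x000000ffu));
}
```

Whichever convention is chosen, it has to be stated with the interface, since a caller cannot infer it from the structure layout.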

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: (Lack of) specification for RX n-tuple filtering
  2010-07-22 21:02 (Lack of) specification for RX n-tuple filtering Ben Hutchings
@ 2010-07-22 21:50 ` Dimitris Michailidis
  2010-09-07 14:43   ` Ben Hutchings
  0 siblings, 1 reply; 18+ messages in thread
From: Dimitris Michailidis @ 2010-07-22 21:50 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Peter Waskiewicz, netdev, David Miller

Ben Hutchings wrote:
> The n-tuple filtering facility is half-baked at present.  There is an
> interface to add filters but none to remove them!  And ETHTOOL_GRXNTUPLE
> is not at all symmetric with ETHTOOL_SRXNTUPLE (which I complained about
> at the time it was added, to no avail).

It's a bit worse than that.  Currently one can only append filters, not 
insert at a given position, as ethtool_rx_ntuple doesn't have an index 
field.  For devices that use TCAMs, where position matters, it's quite an 
obstacle.  It also means one cannot modify an existing filter by specifying 
a new filter for the same index.
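
A toy model of the difference, as a sketch only: the table, its size and both functions are hypothetical and do not correspond to any existing ethtool structure. It contrasts an append-only operation with one that takes an explicit index and can therefore also overwrite an existing entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TABLE_SIZE 8 /* hypothetical table depth */

struct toy_rule {
    int in_use;
    uint16_t dst_port;
    int action;
};

struct toy_table {
    struct toy_rule slot[TOY_TABLE_SIZE];
};

/* Append-only: the rule lands in the first free slot, the caller
 * cannot choose the position.  Returns the slot used, or -1 if full. */
static int toy_append(struct toy_table *t, uint16_t dst_port, int action)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < TOY_TABLE_SIZE; i++) {
        if (!t->slot[i].in_use) {
            t->slot[i].in_use = 1;
            t->slot[i].dst_port = dst_port;
            t->slot[i].action = action;
            return (int)i;
        }
    }
    return -1;
}

/* Indexed: the caller names the slot, so the same call can insert at a
 * chosen position or replace whatever rule is already there. */
static int toy_set_at(struct toy_table *t, size_t index,
                      uint16_t dst_port, int action)
{
    assert(t != NULL);
    if (index >= TOY_TABLE_SIZE)
        return -1;
    t->slot[index].in_use = 1;
    t->slot[index].dst_port = dst_port;
    t->slot[index].action = action;
    return (int)index;
}
```

With only toy_append() available there is no way to change the action of the rule in slot 0 short of clearing the whole table.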


> 
> An ETHTOOL_RESET command with flag ETH_RESET_FILTER set could be defined
> to clear all the filters, but that's a big hammer to use, and I think
> that in general drivers should push the same configuration back to the
> hardware after resetting it for whatever reason.
> 
> So far as I can work out, ixgbe clears all the filters when the filter
> table fills up.  Is that true?  Is this really the intended behaviour of
> manually set filters?
> 
> I also see this in the ixgbe implementation:
> 
> 	/*
> 	 * Program the relevant mask registers.  If src/dst_port or src/dst_addr
> 	 * are zero, then assume a full mask for that field.  Also assume that
> 	 * a VLAN of 0 is unspecified, so mask that out as well.  L4type
> 	 * cannot be masked out in this implementation.
> 	 *
> 	 * This also assumes IPv4 only.  IPv6 masking isn't supported at this
> 	 * point in time.
> 	 */
> 
> An IPv4 address of 0 is certainly valid, so this isn't a good rule.  And
> in any case, such a rule should be specified *with the interface*, in
> <linux/ethtool.h>, not the implementation.
> 
> This also implies that 'mask' specifies bits to be ignored, not bits to
> be matched.  That also was not specified.
> 
> Ben.
> 



* Re: (Lack of) specification for RX n-tuple filtering
  2010-07-22 21:50 ` Dimitris Michailidis
@ 2010-09-07 14:43   ` Ben Hutchings
  2010-12-08 16:24     ` Vladislav Zolotarov
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Hutchings @ 2010-09-07 14:43 UTC (permalink / raw)
  To: Dimitris Michailidis; +Cc: Peter Waskiewicz, netdev, David Miller

On Thu, 2010-07-22 at 14:50 -0700, Dimitris Michailidis wrote:
> Ben Hutchings wrote:
> > The n-tuple filtering facility is half-baked at present.  There is an
> > interface to add filters but none to remove them!  And ETHTOOL_GRXNTUPLE
> > is not at all symmetric with ETHTOOL_SRXNTUPLE (which I complained about
> > at the time it was added, to no avail).
> 
> It's a bit worse than that.  Currently one can only append filters, not 
> insert at a given position, as ethtool_rx_ntuple doesn't have an index 
> field.  For devices that use TCAMs, where position matters, it's quite an 
> obstacle.  It also means one cannot modify an existing filter by specifying 
> a new filter for the same index.

It looks like drivers for devices that use TCAMs should implement the
RXNFC interface instead.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: (Lack of) specification for RX n-tuple filtering
  2010-09-07 14:43   ` Ben Hutchings
@ 2010-12-08 16:24     ` Vladislav Zolotarov
  2010-12-08 16:39       ` David Miller
  2010-12-08 17:22       ` Ben Hutchings
  0 siblings, 2 replies; 18+ messages in thread
From: Vladislav Zolotarov @ 2010-12-08 16:24 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Dimitris Michailidis, Peter Waskiewicz, netdev@vger.kernel.org,
	David Miller


> > It's a bit worse than that.  Currently one can only append filters, not
> > insert at a given position, as ethtool_rx_ntuple doesn't have an index
> > field.  For devices that use TCAMs, where position matters, it's quite an
> > obstacle.  It also means one cannot modify an existing filter by specifying
> > a new filter for the same index.
> 
> It looks like drivers for devices that use TCAMs should implement the
> RXNFC interface instead.
> 

Ben, from the ethtool man page it sounds like the RXNFC option defines the
way the RSS hash should be calculated, while SRXNTUPLE is meant to control
the destination Rx queue for a stream specified by a filter/filters. The
semantics for specifying the stream are also quite different. For
instance, how do u define a rule to drop all packets with source IP
address 192.168.10.200 by means of RXNFC? With SRXNTUPLE it's
straightforward. So, if I understood the semantics of both interfaces
correctly, there is a very limited range of functionality where they may
replace one another. Pls., correct me if I'm wrong.

I also agree with Dimitris: what we have here is an offload of some
Netfilter functionality to HW. Regardless of the HW implementation (TCAM
or not), if it's allowed to configure more than one rule for the same
protocol, the ordering of filtering rules is important: for instance, if u
change the order in which the rules in the example below are applied, the
result of the filtering for traffic with both VLAN 4 and destination port
3000 will be different.

ethtool -U ethX flow-type tcp4 vlan 4 action 0
ethtool -U ethX flow-type tcp4 dst-port 3000 action 3

By the way, it's also unclear from the ethtool man page whether it's
allowed to configure more than one rule for the same protocol. If it's not,
then the above example is void... ;) However, if we want to define a proper
filtering interface, I think we shouldn't prevent a driver from
supporting a set of rules for the same protocol, while still allowing it
not to.

So, I think that attaching an index to each rule could be a good idea -
this would allow us both inserting rules at the desired positions in the
filtering rule table and editing the existing rules.

It's also unclear what the relation between RXNFC and SRXNTUPLE is. The
latter may in general override the decision made based on the hash result.
So, it sounds like the SRXNTUPLE rules should be applied before the RSS
logic, and RSS should be applied to a frame only if no rule matched.
Do I get it right?

Pls., comment.

thanks,
vlad




* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 16:24     ` Vladislav Zolotarov
@ 2010-12-08 16:39       ` David Miller
  2010-12-08 17:29         ` Ben Hutchings
  2010-12-08 17:31         ` Vladislav Zolotarov
  2010-12-08 17:22       ` Ben Hutchings
  1 sibling, 2 replies; 18+ messages in thread
From: David Miller @ 2010-12-08 16:39 UTC (permalink / raw)
  To: vladz; +Cc: bhutchings, dm, peter.p.waskiewicz.jr, netdev

From: "Vladislav Zolotarov" <vladz@broadcom.com>
Date: Wed, 8 Dec 2010 18:24:03 +0200

> I also agree with Dimitris: what we have here is an offload of some
> Netfilter functionality to HW. Regardless the HW implementation (TCAM or
> not) if it's allowed to configure more than one rule for the same
> protocol the ordering of filtering rules is important: for instance if u
> change the order of applying the rules in the example below the result
> of the filtering for the traffic with both VLAN 4 and destination port
> 3000 will be different.

It's not the same, this whole ordering thing you expect in netfilter
land is simply not present in these hardware implementations.

The hardware does a parallel TCAM match lookup, and whatever matches
is used.

Some hardware does link-level protocol lookups first, then L3/L4 later
in the RX path right before computing the hash and selecting an RX
queue.

There really is no ordering available, so let's not pretend it can be
used "just like" netfilter rules.

As per the difference between the various ethtool facilities, this
just represents the fact that what's available to offload differs
per device.  The best we can do is encapsulate commonality as best
as we can, but each interface essentially represents what one
major chipset provides.


* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 16:24     ` Vladislav Zolotarov
  2010-12-08 16:39       ` David Miller
@ 2010-12-08 17:22       ` Ben Hutchings
  2010-12-08 18:39         ` Vladislav Zolotarov
  2010-12-08 18:54         ` Dimitris Michailidis
  1 sibling, 2 replies; 18+ messages in thread
From: Ben Hutchings @ 2010-12-08 17:22 UTC (permalink / raw)
  To: Vladislav Zolotarov
  Cc: Dimitris Michailidis, Peter Waskiewicz, netdev@vger.kernel.org,
	David Miller

On Wed, 2010-12-08 at 18:24 +0200, Vladislav Zolotarov wrote:
> > > It's a bit worse than that.  Currently one can only append filters, not
> > > insert at a given position, as ethtool_rx_ntuple doesn't have an index
> > > field.  For devices that use TCAMs, where position matters, it's quite an
> > > obstacle.  It also means one cannot modify an existing filter by specifying
> > > a new filter for the same index.
> > 
> > It looks like drivers for devices that use TCAMs should implement the
> > RXNFC interface instead.
> > 
> 
> Ben, from ethtool manpage it sounds like RXNFC option defines the way
> the RSS hash should be calculated, while SRXNTUPLE is meant to control
> the destination Rx queue for a stream specified by a filter/filters.

By 'RXNFC interface' I mean ETHTOOL_{G,S}RXCLS* and not
ETHTOOL_{G,S}RXFH which wrongly share (part of) the same structure.

> The
> semantics for a specification of the stream is also quite different. For
> instance, how do u define a rule to drop all packets with source IP
> address 192.168.10.200 by means of RXNFC?

Something like this, I think:

struct ethtool_rxnfc insert_rule = {
	.cmd = ETHTOOL_SRXCLSRLINS,
	.flow_type = IP_USER_SPEC,
	.fs = {
		.flow_type = IP_USER_SPEC,
		.h_u.usr_ip4_spec = {
			.ip4src = inet_aton("192.168.10.200"),
			.ip_ver = ETH_RX_NFC_IP4
		},
		.m_u.usr_ip4_spec = {
			.ip4dst = 0xffffffff,
			.l4_4_bytes = 0xffffffff,
			.tos = 0xff,
			.proto = 0xff
		},
		.ring_cookie = RX_CLS_FLOW_DISC,
		.location = 0,
	}
};
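
A compilable variant of the sketch above, under a few assumptions that are not in the original: build_drop_rule() is a hypothetical helper, inet_addr() stands in for inet_aton() (which returns a status, not the address), and IP_USER_FLOW is used as the flow type because that is the constant I believe <linux/ethtool.h> defines. Since the polarity of the mask is not documented, it is left as a parameter here.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <linux/ethtool.h>

/* Hypothetical helper: fill 'nfc' with a rule that discards all IPv4
 * packets from 'src_ip'.  If mask_bits_are_ignored is non-zero, set
 * mask bits mean "don't care"; otherwise they mean "compare".
 * Returns 0 on success, -1 if src_ip is not a dotted-quad address. */
static int build_drop_rule(struct ethtool_rxnfc *nfc, const char *src_ip,
                           int mask_bits_are_ignored)
{
    struct ethtool_usrip4_spec *h, *m;
    int ignore = mask_bits_are_ignored ? 0xff : 0x00;
    int match = mask_bits_are_ignored ? 0x00 : 0xff;
    in_addr_t addr;

    assert(nfc != NULL && src_ip != NULL);

    addr = inet_addr(src_ip); /* already in network byte order */
    if (addr == INADDR_NONE)
        return -1;

    memset(nfc, 0, sizeof(*nfc));
    nfc->cmd = ETHTOOL_SRXCLSRLINS;
    nfc->flow_type = IP_USER_FLOW;
    nfc->fs.flow_type = IP_USER_FLOW;

    h = &nfc->fs.h_u.usr_ip4_spec;
    h->ip4src = addr;
    h->ip_ver = ETH_RX_NFC_IP4;

    /* Start with every field "don't care", then mark the two fields
     * that the rule actually matches on. */
    m = &nfc->fs.m_u.usr_ip4_spec;
    memset(m, ignore, sizeof(*m));
    memset(&m->ip4src, match, sizeof(m->ip4src));
    m->ip_ver = (__u8)match;

    nfc->fs.ring_cookie = RX_CLS_FLOW_DISC; /* discard matching packets */
    nfc->fs.location = 0;
    return 0;
}
```

The filled structure would then be passed to the driver through the SIOCETHTOOL ioctl; that step is omitted here because it needs a real device.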

[...]
> I also agree with Dimitris: what we have here is an offload of some
> Netfilter functionality to HW. Regardless the HW implementation (TCAM or
> not) if it's allowed to configure more than one rule for the same
> protocol the ordering of filtering rules is important: for instance if u
> change the order of applying the rules in the example below the result
> of the filtering for the traffic with both VLAN 4 and destination port
> 3000 will be different.

Our hardware (and, I suspect, the ixgbe hardware) has hash tables for
specific types of matching.  There is some control of precedence between
different types of match, but that's all.

> ethtool -U ethX flow-type tcp4 vlan 4 action 0
> ethtool -U ethX flow-type tcp4 dst-port 3000 action 3
> 
> By the way it's also unclear from the ethtool man page if it's allowed
> to configure more than one rule for the same protocol. If it's not then
> the above example is void... ;)

It's allowed, but precedence is unspecified.

> However, if we want to define a proper
> filtering interface I think we shouldn't restrict the driver
> implementation from defining a set of rules for the same protocol,
> allowing not to though.
> 
> So, I think that attaching an index to each rule could be a good idea -
> this would allow us both inserting rules at the desired positions in the
> filtering rule table and editing the existing rules.

This really sounds like the RXNFC interface.

> It's also unclear what is the relation between RXNFC and SRXNTUPLE. The
> last in general may override the decision made based on the hash result.
> So, it sounds like applying rules of SRXNTUPLE should come before
> applying the RSS logic and only if there was no match RSS should be
> applied to that frame. Do I get it right?

That's right.
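
As a rough sketch of that precedence (hypothetical types and function, exact-match on source address only): explicit steering filters are consulted first, and the RSS indirection table is used only when none of them matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical steering filter: exact match on the source address. */
struct toy_steer_filter {
    uint32_t src_ip;
    int queue;
};

/* Pick the RX queue for a packet: a matching filter wins, otherwise
 * fall back to the RSS indirection table indexed by the hash. */
static int toy_select_queue(const struct toy_steer_filter *f, size_t n,
                            uint32_t src_ip, uint32_t rss_hash,
                            const uint8_t *indir, size_t indir_len)
{
    size_t i;

    assert(indir != NULL && indir_len > 0);
    for (i = 0; i < n; i++) {
        if (f[i].src_ip == src_ip)
            return f[i].queue;
    }
    return indir[rss_hash % indir_len];
}
```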

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 16:39       ` David Miller
@ 2010-12-08 17:29         ` Ben Hutchings
  2010-12-08 17:31           ` David Miller
  2010-12-09 10:31           ` Vladislav Zolotarov
  2010-12-08 17:31         ` Vladislav Zolotarov
  1 sibling, 2 replies; 18+ messages in thread
From: Ben Hutchings @ 2010-12-08 17:29 UTC (permalink / raw)
  To: David Miller; +Cc: vladz, dm, peter.p.waskiewicz.jr, netdev

On Wed, 2010-12-08 at 08:39 -0800, David Miller wrote:
> From: "Vladislav Zolotarov" <vladz@broadcom.com>
> Date: Wed, 8 Dec 2010 18:24:03 +0200
> 
> > I also agree with Dimitris: what we have here is an offload of some
> > Netfilter functionality to HW. Regardless the HW implementation (TCAM or
> > not) if it's allowed to configure more than one rule for the same
> > protocol the ordering of filtering rules is important: for instance if u
> > change the order of applying the rules in the example below the result
> > of the filtering for the traffic with both VLAN 4 and destination port
> > 3000 will be different.
> 
> It's not the same, this whole ordering thing you expect in netfilter
> land is simply not present in these hardware implementations.
> 
> The hardware does a parallel TCAM match lookup, and whatever matches
> is used.

I think the match with the lowest index wins, which is why it's possible
to specify the rule's index (location) with ETHTOOL_SRXCLSRLINS and why
Peter defined new commands without that for use with the ixgbe driver.

> Some hardware does link-level protocol lookups first, then L3/L4 later
> in the RX path right before computing the hash and selecting an RX
> queue.
>
> There really is no ordering available, so let's not pretend it can be
> used "just like" netfilter rules.
> 
> As per the difference between the various ethtool facilities, this
> just represents the fact that whats available to offload differs
> per device.  The best we can do is encapsulate commonality as best
> as we can, but each interface essentially represents what one
> major chipset provides.

I think the interfaces are actually somewhat more flexible than any of
the current implementations.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 17:29         ` Ben Hutchings
@ 2010-12-08 17:31           ` David Miller
  2010-12-09 10:31           ` Vladislav Zolotarov
  1 sibling, 0 replies; 18+ messages in thread
From: David Miller @ 2010-12-08 17:31 UTC (permalink / raw)
  To: bhutchings; +Cc: vladz, dm, peter.p.waskiewicz.jr, netdev

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Wed, 08 Dec 2010 17:29:24 +0000

> I think the interfaces are actually somewhat more flexible than any of
> the current implementations.

Yeah you're probably right.



* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 16:39       ` David Miller
  2010-12-08 17:29         ` Ben Hutchings
@ 2010-12-08 17:31         ` Vladislav Zolotarov
  1 sibling, 0 replies; 18+ messages in thread
From: Vladislav Zolotarov @ 2010-12-08 17:31 UTC (permalink / raw)
  To: David Miller
  Cc: bhutchings@solarflare.com, dm@chelsio.com,
	peter.p.waskiewicz.jr@intel.com, netdev@vger.kernel.org

> > I also agree with Dimitris: what we have here is an offload of some
> > Netfilter functionality to HW. Regardless the HW implementation (TCAM or
> > not) if it's allowed to configure more than one rule for the same
> > protocol the ordering of filtering rules is important: for instance if u
> > change the order of applying the rules in the example below the result
> > of the filtering for the traffic with both VLAN 4 and destination port
> > 3000 will be different.
> 
> It's not the same, this whole ordering thing you expect in netfilter
> land is simply not present in these hardware implementations.
> 
> The hardware does a parallel TCAM match lookup, and whatever matches
> is used.

So, u say that within the scope of a single protocol all rules form a set
whose ordering is vendor specific, and the same configuration of
n-tuple rules may generate different results for the same traffic on
NICs from different vendors? Don't u think it's confusing from the user's
point of view? ;)






* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 17:22       ` Ben Hutchings
@ 2010-12-08 18:39         ` Vladislav Zolotarov
  2010-12-08 19:02           ` Ben Hutchings
  2010-12-08 18:54         ` Dimitris Michailidis
  1 sibling, 1 reply; 18+ messages in thread
From: Vladislav Zolotarov @ 2010-12-08 18:39 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Dimitris Michailidis, Peter Waskiewicz, netdev@vger.kernel.org,
	David Miller


> > The
> > semantics for a specification of the stream is also quite different. For
> > instance, how do u define a rule to drop all packets with source IP
> > address 192.168.10.200 by means of RXNFC?
> 
> Something like this, I think:
> 
> struct ethtool_rxnfc insert_rule = {
> 	.cmd = ETHTOOL_SRXCLSRLINS,
> 	.flow_type = IP_USER_SPEC,
> 	.fs = {
> 		.flow_type = IP_USER_SPEC,
> 		.h_u.usr_ip4_spec = {
> 			.ip4src = inet_aton("192.168.10.200"),
> 			.ip_ver = ETH_RX_NFC_IP4
> 		},
> 		.m_u.usr_ip4_spec = {
> 			.ip4dst = 0xffffffff,
> 			.l4_4_bytes = 0xffffffff,
> 			.tos = 0xff,
> 			.proto = 0xff
> 		},
> 		.ring_cookie = RX_CLS_FLOW_DISC,
> 		.location = 0,
> 	}
> };
> 

Aha. Ok. From the remarks in the upstream ethtool.h I see now that
ethtool_rxnfc has quite wide configuration possibilities (including the
above). I missed it before. ;)

Ben, could u, pls., explain to me then what's the difference between
defining the rule as u wrote above on top of the -N option (nfc) and
defining a rule doing the same thing on top of the -U (n-tuple) option,
and when I as a user should prefer one option to the other? Are they
expected to be implemented differently from the FW/HW perspective?

thanks,
vlad

P.S. I see that ethtool.h from the 2.6.36 tree already has the
ethtool_rxnfc that would allow such a filtering definition; however, from
the man page of the 2.6.36 version of the ethtool package it's unclear
what the command line for such a configuration should be. Is it supported
by the current ethtool version, or am I missing something in the man
page?




* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 17:22       ` Ben Hutchings
  2010-12-08 18:39         ` Vladislav Zolotarov
@ 2010-12-08 18:54         ` Dimitris Michailidis
  2010-12-08 19:14           ` Ben Hutchings
  1 sibling, 1 reply; 18+ messages in thread
From: Dimitris Michailidis @ 2010-12-08 18:54 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Vladislav Zolotarov, Peter Waskiewicz, netdev@vger.kernel.org,
	David Miller

Ben Hutchings wrote:
> On Wed, 2010-12-08 at 18:24 +0200, Vladislav Zolotarov wrote:
>>>> It's a bit worse than that.  Currently one can only append filters, not
>>>> insert at a given position, as ethtool_rx_ntuple doesn't have an index
>>>> field.  For devices that use TCAMs, where position matters, it's quite an
>>>> obstacle.  It also means one cannot modify an existing filter by specifying
>>>> a new filter for the same index.
>>> It looks like drivers for devices that use TCAMs should implement the
>>> RXNFC interface instead.
>>>
>> Ben, from ethtool manpage it sounds like RXNFC option defines the way
>> the RSS hash should be calculated, while SRXNTUPLE is meant to control
>> the destination Rx queue for a stream specified by a filter/filters.
> 
> By 'RXNFC interface' I mean ETHTOOL_{G,S}RXCLS* and not
> ETHTOOL_{G,S}RXFH which wrongly share (part of) the same structure..
> 
>> The
>> semantics for a specification of the stream is also quite different. For
>> instance, how do u define a rule to drop all packets with source IP
>> address 192.168.10.200 by means of RXNFC?
> 
> Something like this, I think:
> 
> struct ethtool_rxnfc insert_rule = {
> 	.cmd = ETHTOOL_SRXCLSRLINS,
> 	.flow_type = IP_USER_SPEC,
> 	.fs = {
> 		.flow_type = IP_USER_SPEC,
> 		.h_u.usr_ip4_spec = {
> 			.ip4src = inet_aton("192.168.10.200"),
> 			.ip_ver = ETH_RX_NFC_IP4
> 		},
> 		.m_u.usr_ip4_spec = {
> 			.ip4dst = 0xffffffff,
> 			.l4_4_bytes = 0xffffffff,
> 			.tos = 0xff,
> 			.proto = 0xff
> 		},
> 		.ring_cookie = RX_CLS_FLOW_DISC,
> 		.location = 0,
> 	}
> };

I think the mask would be 0 for don't care fields and 1 for care, so

	.m_u.usr_ip4_spec.ip4src = htonl(0xffffffff)
	.m_u.usr_ip4_spec.ip4dst = htonl(0)
etc

There's a lot of overlap between SRXCLSRLINS and SRXNTUPLE and neither is a 
superset.  SRXCLSRLINS has the advantage of specifying position but 
SRXNTUPLE includes vlan and a device-specific field that are handy.

Also for reporting rules GRXNTUPLE is more flexible than GRXCLSRULE as it 
lets the driver specify the information it reports.  In fact I've been 
thinking of using SRXCLSRLINS and GRXNTUPLE for cxgb4 but haven't gotten 
over the ugliness of that yet.

> 
> [...]
>> I also agree with Dimitris: what we have here is an offload of some
>> Netfilter functionality to HW. Regardless the HW implementation (TCAM or
>> not) if it's allowed to configure more than one rule for the same
>> protocol the ordering of filtering rules is important: for instance if u
>> change the order of applying the rules in the example below the result
>> of the filtering for the traffic with both VLAN 4 and destination port
>> 3000 will be different.
> 
> Our hardware (and, I suspect, the ixgbe hardware) has hash tables for
> specific types of matching.  There is some control of precedence between
> different types of match, but that's all.
> 
>> ethtool -U ethX flow-type tcp4 vlan 4 action 0
>> ethtool -U ethX flow-type tcp4 dst-port 3000 action 3
>>
>> By the way it's also unclear from the ethtool man page if it's allowed
>> to configure more than one rule for the same protocol. If it's not then
>> the above example is void... ;)
> 
> It's allowed, but precedence is unspecified.
> 
>> However, if we want to define a proper
>> filtering interface I think we shouldn't restrict the driver
>> implementation from defining a set of rules for the same protocol,
>> allowing not to though.
>>
>> So, I think that attaching an index to each rule could be a good idea -
>> this would allow us both inserting rules at the desired positions in the
>> filtering rule table and editing the existing rules.
> 
> This really sounds like the RXNFC interface.
> 
>> It's also unclear what is the relation between RXNFC and SRXNTUPLE. The
>> last in general may override the decision made based on the hash result.
>> So, it sounds like applying rules of SRXNTUPLE should come before
>> applying the RSS logic and only if there was no match RSS should be
>> applied to that frame. Do I get it right?
> 
> That's right.

It can be more involved than this.  Our HW allows a rule to select a 
different part of the RSS table so you get a filter hit and still do RSS 
afterwards if you want.  Current ethtool interfaces do not support this, 
basically it would be a different action for either SRXNTUPLE or SRXCLSRLINS.

> 
> Ben.
> 



* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 18:39         ` Vladislav Zolotarov
@ 2010-12-08 19:02           ` Ben Hutchings
  2010-12-08 19:10             ` Vladislav Zolotarov
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Hutchings @ 2010-12-08 19:02 UTC (permalink / raw)
  To: Vladislav Zolotarov
  Cc: Dimitris Michailidis, Peter Waskiewicz, netdev@vger.kernel.org,
	David Miller

On Wed, 2010-12-08 at 20:39 +0200, Vladislav Zolotarov wrote:
> > > The
> > > semantics for a specification of the stream is also quite different. For
> > > instance, how do u define a rule to drop all packets with source IP
> > > address 192.168.10.200 by means of RXNFC?
> > 
> > Something like this, I think:
> > 
> > struct ethtool_rxnfc insert_rule = {
> > 	.cmd = ETHTOOL_SRXCLSRLINS,
> > 	.flow_type = IP_USER_SPEC,
> > 	.fs = {
> > 		.flow_type = IP_USER_SPEC,
> > 		.h_u.usr_ip4_spec = {
> > 			.ip4src = inet_aton("192.168.10.200"),
> > 			.ip_ver = ETH_RX_NFC_IP4
> > 		},
> > 		.m_u.usr_ip4_spec = {
> > 			.ip4dst = 0xffffffff,
> > 			.l4_4_bytes = 0xffffffff,
> > 			.tos = 0xff,
> > 			.proto = 0xff
> > 		},
> > 		.ring_cookie = RX_CLS_FLOW_DISC,
> > 		.location = 0,
> > 	}
> > };
> > 
> 
> Aha. Ok. From the remarks in the upstream ethtool.h I see now that
> ethtool_rxnfc has quite wide configuration possibilities (including the
> above). I missed it before. ;)
> 
> Ben, could u, pls., explain me then what's the difference between
> defining the rule as u wrote above on top of -N option (nfc) and
> defining the rule doing the same thing on top on -U (n-tuple) option and
> when I as a user should prefer one option to another? Are they expected
> to be implemented differently from FW/HW perspective?

The -N option modifies the hash function for all flows of a specific
type (using ETHTOOL_SRXFH) whereas the -U option steers a specific flow
or set of flows (using ETHTOOL_SRXNTUPLE).  The implementation of the -U
option could potentially be made to fallback to ETHTOOL_SRXCLSRLINS if
vlan_tag and user_def are not specified.

> thanks,
> vlad
> 
> P.S. I see that ethtool.h from the 2.6.36 tree already has the
> ethtool_rxnfc that would allow such a filtering definition however from
> the man page of the 2.6.36 version of the ethtool package it's unclear
> what should be a command line for such a configuration. Is it supported
> with the current ethtool version or maybe I'm missing something in a man
> page?

It's not supported.  Santwona Behera implemented the kernel side of this
but so far as I know he never sent any patches for ethtool.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 19:02           ` Ben Hutchings
@ 2010-12-08 19:10             ` Vladislav Zolotarov
  2010-12-08 19:14               ` Ben Hutchings
  0 siblings, 1 reply; 18+ messages in thread
From: Vladislav Zolotarov @ 2010-12-08 19:10 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Dimitris Michailidis, Peter Waskiewicz, netdev@vger.kernel.org,
	David Miller


> The implementation of the -U
> option could potentially be made to fallback to ETHTOOL_SRXCLSRLINS if
> vlan_tag and user_def are not specified.
> 

Having said that, don't u think that it could be more user friendly to
extend the ETHTOOL_SRXCLSRLINS interface to handle vlan_tag and
user_def and drop the n-tuple interface altogether?

thanks,
vlad 




* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 18:54         ` Dimitris Michailidis
@ 2010-12-08 19:14           ` Ben Hutchings
  2010-12-08 19:26             ` Dimitris Michailidis
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Hutchings @ 2010-12-08 19:14 UTC (permalink / raw)
  To: Dimitris Michailidis
  Cc: Vladislav Zolotarov, Peter Waskiewicz, netdev@vger.kernel.org,
	David Miller

On Wed, 2010-12-08 at 10:54 -0800, Dimitris Michailidis wrote:
> Ben Hutchings wrote:
> > On Wed, 2010-12-08 at 18:24 +0200, Vladislav Zolotarov wrote:
> >>>> It's a bit worse than that.  Currently one can only append filters, not
> >>>> insert at a given position, as ethtool_rx_ntuple doesn't have an index
> >>>> field.  For devices that use TCAMs, where position matters, it's quite an
> >>>> obstacle.  It also means one cannot modify an existing filter by specifying
> >>>> a new filter for the same index.
> >>> It looks like drivers for devices that use TCAMs should implement the
> >>> RXNFC interface instead.
> >>>
> >> Ben, from ethtool manpage it sounds like RXNFC option defines the way
> >> the RSS hash should be calculated, while SRXNTUPLE is meant to control
> >> the destination Rx queue for a stream specified by a filter/filters.
> > 
> > By 'RXNFC interface' I mean ETHTOOL_{G,S}RXCLS* and not
> > ETHTOOL_{G,S}RXFH which wrongly share (part of) the same structure..
> > 
> >> The
> >> semantics for a specification of the stream is also quite different. For
> >> instance, how do u define a rule to drop all packets with source IP
> >> address 192.168.10.200 by means of RXNFC?
> > 
> > Something like this, I think:
> > 
> > struct ethtool_rxnfc insert_rule = {
> > 	.cmd = ETHTOOL_SRXCLSRLINS,
> > 	.flow_type = IP_USER_SPEC,
> > 	.fs = {
> > 		.flow_type = IP_USER_SPEC,
> > 		.h_u.usr_ip4_spec = {
> > 			.ip4src = inet_aton("192.168.10.200"),
> > 			.ip_ver = ETH_RX_NFC_IP4
> > 		},
> > 		.m_u.usr_ip4_spec = {
> > 			.ip4dst = 0xffffffff,
> > 			.l4_4_bytes = 0xffffffff,
> > 			.tos = 0xff,
> > 			.proto = 0xff
> > 		},
> > 		.ring_cookie = RX_CLS_FLOW_DISC,
> > 		.location = 0,
> > 	}
> > };
> 
> I think the mask would be 0 for don't care fields and 1 for care, so
> 
> 	.m_u.usr_ip4_spec.ip4src = htonl(0xffffffff)
> 	.m_u.usr_ip4_spec.ip4dst = htonl(0)
> etc

That is definitely the opposite of what ixgbe and sfc do for
ethtool_ntuple_rx_flow_spec, and I believe it is the opposite of what
niu does for ethtool_rx_flow_spec.

[...]
> >> It's also unclear what is the relation between RXNFC and SRXNTUPLE. The
> >> last in general may override the decision made based on the hash result.
> >> So, it sounds like applying rules of SRXNTUPLE should come before
> >> applying the RSS logic and only if there was no match RSS should be
> >> applied to that frame. Do I get it right?
> > 
> > That's right.
> 
> It can be more involved than this.  Our HW allows a rule to select a 
> different part of the RSS table so you get a filter hit and still do RSS 
> afterwards if you want.  Current ethtool interfaces do not support this, 
> basically it would be a different action for either SRXNTUPLE or SRXCLSRLINS.

So does the rule specify an offset added to the output of the RSS hash
and indirection table, or can it also select a different indirection
table?  Our current hardware also has a filter flag for the former
behaviour...  There are still plenty of bits to spare in 'action' and
'ring_cookie' so perhaps we could define a flag for this?

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 19:10             ` Vladislav Zolotarov
@ 2010-12-08 19:14               ` Ben Hutchings
  2010-12-08 19:39                 ` Ben Hutchings
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Hutchings @ 2010-12-08 19:14 UTC (permalink / raw)
  To: Vladislav Zolotarov
  Cc: Dimitris Michailidis, Peter Waskiewicz, netdev@vger.kernel.org,
	David Miller

On Wed, 2010-12-08 at 21:10 +0200, Vladislav Zolotarov wrote:
> > The implementation of the -U
> > option could potentially be made to fallback to ETHTOOL_SRXCLSRLINS if
> > vlan_tag and user_def are not specified.
> > 
> 
> Having said that, don't you think that it could be more user friendly to
> extend the ETHTOOL_SRXCLSRLINS interface to handle the vlan_tag and
> user_def and drop the n-tuple interface altogether?

No, we can't remove userland interfaces.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 19:14           ` Ben Hutchings
@ 2010-12-08 19:26             ` Dimitris Michailidis
  0 siblings, 0 replies; 18+ messages in thread
From: Dimitris Michailidis @ 2010-12-08 19:26 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Vladislav Zolotarov, Peter Waskiewicz, netdev@vger.kernel.org,
	David Miller

Ben Hutchings wrote:
> On Wed, 2010-12-08 at 10:54 -0800, Dimitris Michailidis wrote:
>> Ben Hutchings wrote:
>>> On Wed, 2010-12-08 at 18:24 +0200, Vladislav Zolotarov wrote:
>>>>>> It's a bit worse than that.  Currently one can only append filters, not
>>>>>> insert at a given position, as ethtool_rx_ntuple doesn't have an index
>>>>>> field.  For devices that use TCAMs, where position matters, it's quite an
>>>>>> obstacle.  It also means one cannot modify an existing filter by specifying
>>>>>> a new filter for the same index.
>>>>> It looks like drivers for devices that use TCAMs should implement the
>>>>> RXNFC interface instead.
>>>>>
>>>> Ben, from ethtool manpage it sounds like RXNFC option defines the way
>>>> the RSS hash should be calculated, while SRXNTUPLE is meant to control
>>>> the destination Rx queue for a stream specified by a filter/filters.
>>> By 'RXNFC interface' I mean ETHTOOL_{G,S}RXCLS* and not
>>> ETHTOOL_{G,S}RXFH which wrongly share (part of) the same structure.
>>>
>>>> The
>>>> semantics for a specification of the stream is also quite different. For
>>>> instance, how do you define a rule to drop all packets with source IP
>>>> address 192.168.10.200 by means of RXNFC?
>>> Something like this, I think:
>>>
>>> struct ethtool_rxnfc insert_rule = {
>>> 	.cmd = ETHTOOL_SRXCLSRLINS,
>>> 	.flow_type = IP_USER_SPEC,
>>> 	.fs = {
>>> 		.flow_type = IP_USER_SPEC,
>>> 		.h_u.usr_ip4_spec = {
>>> 			.ip4src = inet_aton("192.168.10.200"),
>>> 			.ip_ver = ETH_RX_NFC_IP4
>>> 		},
>>> 		.m_u.usr_ip4_spec = {
>>> 			.ip4dst = 0xffffffff,
>>> 			.l4_4_bytes = 0xffffffff,
>>> 			.tos = 0xff,
>>> 			.proto = 0xff
>>> 		},
>>> 		.ring_cookie = RX_CLS_FLOW_DISC,
>>> 		.location = 0,
>>> 	}
>>> };
>> I think the mask would be 0 for don't care fields and 1 for care, so
>>
>> 	.m_u.usr_ip4_spec.ip4src = htonl(0xffffffff)
>> 	.m_u.usr_ip4_spec.ip4dst = htonl(0)
>> etc
> 
> That is definitely the opposite of what ixgbe and sfc do for
> ethtool_ntuple_rx_flow_spec, and I believe it is the opposite of what
> niu does for ethtool_rx_flow_spec.

These are the values as our HW at least wants them.  The care bits are 1 in 
the mask.  It's not a huge deal, the driver can complement the masks.
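
As a rough sketch of what "complement the masks" could mean in a driver:
the helper names below are hypothetical and not taken from any existing
driver, and the values are treated as host-order 32-bit words for
simplicity.  It assumes the interface hands the driver masks in which set
bits mean "ignore", while the hardware wants set bits to mean "compare".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: turn a mask whose set bits mean "ignore this bit"
 * into one whose set bits mean "compare this bit". */
static inline uint32_t ignore_mask_to_care_mask(uint32_t ignore_mask)
{
	return ~ignore_mask;
}

/* Match test as hardware with care-bit masks would perform it:
 * only the bits set in care_mask take part in the comparison. */
static inline int field_matches(uint32_t pkt, uint32_t value,
				uint32_t care_mask)
{
	return ((pkt ^ value) & care_mask) == 0;
}

/* Usage: source address fully compared, destination fully ignored. */
static inline void mask_example(void)
{
	uint32_t src = 0xC0A80AC8u;	/* 192.168.10.200 */

	assert(field_matches(src, src, ignore_mask_to_care_mask(0)));
	assert(field_matches(0x0A000001u, 0,
			     ignore_mask_to_care_mask(0xffffffffu)));
}
```

Whichever polarity the interface settles on, the conversion is a single
bitwise NOT per field, so the cost of picking "the other" convention for a
given device is small.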

> 
> [...]
>>>> It's also unclear what is the relation between RXNFC and SRXNTUPLE. The
>>>> last in general may override the decision made based on the hash result.
>>>> So, it sounds like applying rules of SRXNTUPLE should come before
>>>> applying the RSS logic and only if there was no match RSS should be
>>>> applied to that frame. Do I get it right?
>>> That's right.
>> It can be more involved than this.  Our HW allows a rule to select a 
>> different part of the RSS table so you get a filter hit and still do RSS 
>> afterwards if you want.  Current ethtool interfaces do not support this, 
>> basically it would be a different action for either SRXNTUPLE or SRXCLSRLINS.
> 
> So does the rule specify an offset added to the output of the RSS hash
> and indirection table, or can it also select a different indirection
> table?  Our current hardware also has a filter flag for the former
> behaviour...  There are still plenty of bits to spare in 'action' and
> 'ring_cookie' so perhaps we could define a flag for this?

You can partition the indirection table and then a rule can specify that 
matching packets should consult region X of the table.  The hash value is 
not altered, just the part of the overall table it indexes into.
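
To make that concrete, here is a minimal sketch of a region-based lookup.
The table size, the struct and the function name are all assumptions made
for illustration; they do not describe any particular device or any
existing ethtool structure.

```c
#include <assert.h>
#include <stdint.h>

#define RSS_TABLE_SIZE 128	/* assumed size, for illustration only */

/* Hypothetical description of one region of the indirection table. */
struct rss_region {
	uint16_t base;	/* index of the first entry in the region */
	uint16_t size;	/* number of entries, must be non-zero */
};

/* Return the RX queue for a packet whose matching rule selected 'region'.
 * The hash value is used unchanged; only the slice of the table that it
 * indexes into depends on the rule. */
static inline uint8_t rss_queue_for(const uint8_t *table,
				    struct rss_region region, uint32_t hash)
{
	assert(region.size != 0);
	assert(region.base + region.size <= RSS_TABLE_SIZE);
	return table[region.base + hash % region.size];
}
```

A rule action would then only need to carry a region identifier, which is
the kind of thing a spare flag or a few bits in 'ring_cookie' might encode.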

> 
> Ben.
> 



* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 19:14               ` Ben Hutchings
@ 2010-12-08 19:39                 ` Ben Hutchings
  0 siblings, 0 replies; 18+ messages in thread
From: Ben Hutchings @ 2010-12-08 19:39 UTC (permalink / raw)
  To: Vladislav Zolotarov
  Cc: Dimitris Michailidis, Peter Waskiewicz, netdev@vger.kernel.org,
	David Miller

On Wed, 2010-12-08 at 19:14 +0000, Ben Hutchings wrote:
> On Wed, 2010-12-08 at 21:10 +0200, Vladislav Zolotarov wrote:
> > > The implementation of the -U
> > > option could potentially be made to fallback to ETHTOOL_SRXCLSRLINS if
> > > vlan_tag and user_def are not specified.
> > > 
> > 
> > Having said that, don't you think that it could be more user friendly to
> > extend the ETHTOOL_SRXCLSRLINS interface to handle the vlan_tag and
> > user_def and drop the n-tuple interface altogether?
> 
> No, we can't remove userland interfaces.

Having said that, this particular interface is fairly broken...

$ cat test.c
#include <stddef.h>
#include <stdio.h>

#include <linux/ethtool.h>

int main(void)
{
    printf("%zd\n", offsetof(struct ethtool_rx_flow_spec, ring_cookie));
    return 0;
}
$ cc -m64 -Wall test.c
$ ./a.out 
152
$ cc -m32 -Wall test.c
$ ./a.out 
148
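
The difference is consistent with the 64-bit ring_cookie member being
4-byte aligned within a structure on i386 and 8-byte aligned on x86-64.
A simplified stand-in shows the effect and one possible way to make the
layout identical on both; the struct names are made up, the field sizes
are chosen only to reproduce the offsets above, and the alignment
attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for ethtool_rx_flow_spec, not the real definition:
 * 148 bytes of 32-bit aligned data followed by a 64-bit field. */
struct demo_spec {
	uint32_t flow_type;
	uint8_t  h_u[72];
	uint8_t  m_u[72];
	uint64_t ring_cookie;	/* offset 148 on i386, 152 on x86-64 */
	uint32_t location;
};

/* Same fields, with the 64-bit member forced to 8-byte alignment so that
 * every ABI inserts the same 4 bytes of padding before it. */
struct demo_spec_fixed {
	uint32_t flow_type;
	uint8_t  h_u[72];
	uint8_t  m_u[72];
	uint64_t ring_cookie __attribute__((aligned(8)));
	uint32_t location;
};

static inline void check_layout(void)
{
	/* Holds for both -m32 and -m64 builds. */
	assert(offsetof(struct demo_spec_fixed, ring_cookie) == 152);
}
```

Changing the alignment of the existing structure would itself alter the
32-bit layout, so this only illustrates the cause, not a drop-in fix.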

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: (Lack of) specification for RX n-tuple filtering
  2010-12-08 17:29         ` Ben Hutchings
  2010-12-08 17:31           ` David Miller
@ 2010-12-09 10:31           ` Vladislav Zolotarov
  1 sibling, 0 replies; 18+ messages in thread
From: Vladislav Zolotarov @ 2010-12-09 10:31 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, dm@chelsio.com, peter.p.waskiewicz.jr@intel.com,
	netdev@vger.kernel.org

> > It's not the same, this whole ordering thing you expect in netfilter
> > land is simply not present in these hardware implementations.
> > 
> > The hardware does a parallel TCAM match lookup, and whatever matches
> > is used.
> 
> I think the match with the lowest index wins, which is why it's possible
> to specify the rule's index (location) with ETHTOOL_SRXCLSRLINS and why
> Peter defined new commands without that for use with the ixgbe driver.
> 

Ben, practically speaking, with the current ethtool userspace
implementation there seems to be no way to specify the CAM index of a
rule through the n-tuple interface, is there? So the choice of index is
left to the vendor, which makes the resulting behaviour uncertain.
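
To illustrate why the index matters, here is a small software model of a
lookup in which the lowest-indexed matching rule wins, as Ben suggests
above.  This is only an assumption about the hardware behaviour, and the
struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule: bits set in care_mask are compared against value. */
struct tcam_rule {
	int      valid;
	uint32_t value;
	uint32_t care_mask;
	int      action;
};

/* Return the action of the lowest-indexed matching rule, or -1 if no
 * rule matches.  Real hardware compares all entries in parallel; the
 * loop only models the priority order among multiple hits. */
static inline int tcam_lookup(const struct tcam_rule *rules, size_t n,
			      uint32_t pkt)
{
	assert(rules != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (rules[i].valid &&
		    ((pkt ^ rules[i].value) & rules[i].care_mask) == 0)
			return rules[i].action;
	}
	return -1;
}
```

With a host rule for 192.168.10.200 at index 0 and a rule for
192.168.10.0/24 at index 1, the host rule takes effect; swap the two and
the /24 rule shadows it.  If the driver picks the index, the user cannot
tell which of the two outcomes they will get.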

And I guess that's exactly what Dimitris meant when talking about the
index: he said "a rule index", you say "a CAM index", but generally we
are talking about the same thing. You are referring to
ETHTOOL_SRXCLSRLINS, but it has no userspace interface yet and it's
unclear when it will, while n-tuple is already there. We can't remove
existing userspace interfaces, I agree. Then let's not add interfaces
that interfere with the existing ones. This immediately implies that
ETHTOOL_SRXCLSRLINS shall never see the light of day in userland as a
separate interface, and the n-tuple user interface should be properly
extended to implement the missing ETHTOOL_SRXCLSRLINS functionality.

Please comment.

thanks,
vlad




Thread overview: 18+ messages
2010-07-22 21:02 (Lack of) specification for RX n-tuple filtering Ben Hutchings
2010-07-22 21:50 ` Dimitris Michailidis
2010-09-07 14:43   ` Ben Hutchings
2010-12-08 16:24     ` Vladislav Zolotarov
2010-12-08 16:39       ` David Miller
2010-12-08 17:29         ` Ben Hutchings
2010-12-08 17:31           ` David Miller
2010-12-09 10:31           ` Vladislav Zolotarov
2010-12-08 17:31         ` Vladislav Zolotarov
2010-12-08 17:22       ` Ben Hutchings
2010-12-08 18:39         ` Vladislav Zolotarov
2010-12-08 19:02           ` Ben Hutchings
2010-12-08 19:10             ` Vladislav Zolotarov
2010-12-08 19:14               ` Ben Hutchings
2010-12-08 19:39                 ` Ben Hutchings
2010-12-08 18:54         ` Dimitris Michailidis
2010-12-08 19:14           ` Ben Hutchings
2010-12-08 19:26             ` Dimitris Michailidis
