netdev.vger.kernel.org archive mirror
* VJ Channel API - driver level (PATCH)
@ 2006-05-02 22:53 Alex Aizman
  2006-05-02 23:00 ` Stephen Hemminger
  0 siblings, 1 reply; 19+ messages in thread
From: Alex Aizman @ 2006-05-02 22:53 UTC (permalink / raw)
  To: netdev

Signed-off-by: Alex Aizman <alex@xxxxxxxxxx>

Hacked netdevice.h to support multiple channels.

--- netdevice-orig.h	2006-03-04 10:01:38.000000000 -0800
+++ netdevice-channel.h	2006-03-09 10:17:11.419955200 -0800
@@ -246,6 +246,147 @@
 
 extern int __init netdev_boot_setup(char *str);
 
+#ifdef CONFIG_NET_CHANNELS
+/***************  NETDEVICE HW CHANNELS data structures *****************/
+/**
+ * enum netdev_hwchannel_rx_flow_e - Hardware receive channel "flow" types.
+ * @HWCH_RX_FLOW_NONE: does not filter rx packets.
+ * @HWCH_RX_FLOW_MACADDR: filters based upon the rx mac address
+ * @HWCH_RX_FLOW_VLAN_ID: filters based upon the rx vlan id tag
+ * @HWCH_RX_FLOW_VLAN_QOS: filters based upon the vlan qos field
+ * @HWCH_RX_FLOW_PORT: filters based upon the tcp or udp receive port number
+ * @HWCH_RX_FLOW_L4_HASH: filters based upon a hash of the tcp session id
+ * @HWCH_RX_FLOW_L4_SPDM: filters based upon a hash of the four-tuple of the
+ * following: source ip, source port, destination ip, destination port
+ *
+ * An rx channel is bound to a specific device.  When one of these enums is
+ * used, only traffic of the requested type is filtered onto the channel's
+ * queue.  By default we use HWCH_RX_FLOW_NONE, as we usually want all
+ * traffic from the device.
+ **/
+typedef enum netdev_hwchannel_rx_flow_e  {
+	HWCH_RX_FLOW_NONE,
+	HWCH_RX_FLOW_MACADDR,
+	HWCH_RX_FLOW_VLAN_ID,
+	HWCH_RX_FLOW_VLAN_QOS,
+	HWCH_RX_FLOW_PORT,
+	HWCH_RX_FLOW_L4_HASH,
+	HWCH_RX_FLOW_L4_SPDM,
+} netdev_hwchan_rx_flow_e;
+
+/**
+ * enum netdev_hwchannel_priority_e - Hardware channel priorities.
+ * @HWCH_PRIORITY_NONE: no priority, process as fast as possible
+ * @HWCH_PRIORITY_LOWEST: process all other channels first
+ * @HWCH_PRIORITY_LOW: channel with low priority
+ * @HWCH_PRIORITY_MEDIUM: channel with medium priority
+ * @HWCH_PRIORITY_HIGH: channel with high priority
+ * @HWCH_PRIORITY_HIGHEST: process this channel before all others
+ *
+ * Channel priorities can be set on both tx and rx channels.  A default
+ * priority of HWCH_PRIORITY_NONE means all channels are considered equal by
+ * the hardware.  If a priority is set then HWCH_PRIORITY_HIGHEST is treated
+ * first and HWCH_PRIORITY_LOWEST is treated last.
+ * 
+ **/
+typedef enum netdev_hwchannel_priority_e {
+	HWCH_PRIORITY_NONE,
+	HWCH_PRIORITY_LOWEST,
+	HWCH_PRIORITY_LOW,
+	HWCH_PRIORITY_MEDIUM,
+	HWCH_PRIORITY_HIGH,
+	HWCH_PRIORITY_HIGHEST,
+} netdev_hwchan_priority_e;
+
+/**
+ * struct netdev_rx_flow - Uniquely identifies the traffic flow for a given
+ * rx channel.
+ * @type: specifies what type of traffic flow this channel will use.  This
+ * also specifies which of the fields in the union will be examined.
+ * @macaddr: if type is set to HWCH_RX_FLOW_MACADDR this field will be used
+ * to only accept traffic from this mac address
+ * @vlan_id: if type is set to HWCH_RX_FLOW_VLAN_ID this field will be used
+ * to only accept traffic from packets with this vlan id
+ * @vlan_qos: if type is set to HWCH_RX_FLOW_VLAN_QOS this field will be used
+ * to only accept traffic from packets with this vlan qos tag
+ * @port: if type is set to HWCH_RX_FLOW_PORT this field will be used to only
+ * accept traffic from tcp or udp packets with this destination port number
+ * @session_id: if type is set to HWCH_RX_FLOW_L4_HASH this field will be used
+ * to only allow tcp traffic from this specific session id
+ * @l4_4tuple: if type is set to HWCH_RX_FLOW_L4_SPDM this struct will be
+ * used to only accept traffic which has the correct four-tuple consisting of:
+ * source ip, source port, destination ip, destination port
+ *
+ * Receive channels can be set to shape the types of traffic placed upon them.
+ * This interface allows one to determine how to shape incoming traffic on
+ * a specified channel.  For example, setting
+ * netdev_rx_flow.type = HWCH_RX_FLOW_PORT and
+ * netdev_rx_flow.rx_flow_val.port = 3260 identifies all standard iSCSI
+ * traffic.  The API call bind_rx_hwchannel() is used to take the contents of
+ * struct netdev_rx_flow and apply it to a given rx channel.
+ **/
+struct netdev_rx_flow {
+	netdev_hwchan_rx_flow_e	type;
+	union rx_flow_val {
+		unsigned char macaddr[MAX_ADDR_LEN];/* HWCH_RX_FLOW_MACADDR  */
+		unsigned short vlan_id;		    /* HWCH_RX_FLOW_VLAN_ID  */
+		unsigned char  vlan_qos;	    /* HWCH_RX_FLOW_VLAN_QOS */
+		unsigned short port;		    /* HWCH_RX_FLOW_PORT     */
+		unsigned int session_id;	    /* HWCH_RX_FLOW_L4_HASH */
+		struct {			    /* HWCH_RX_FLOW_L4_SPDM */
+			uint32_t	src_ip;
+			unsigned short	src_port;
+			uint32_t	dst_ip;
+			unsigned short	dst_port;
+		} l4_4tuple;
+	} rx_flow_val;
+};
+
+/**
+ * (*netif_rx_hwchannel_cb) - function to be used as an rx channel's callback.
+ * @skb: system network buffer to place next available packet from the channel
+ * @kernel_channelh: opaque kernel-side handle of the rx channel being processed
+ * @flow: specifies the type of traffic flow to examine for the given hardware
+ * rx channel
+ *
+ * By using this callback an application can harvest traffic from a specific
+ * hardware rx channel, bypassing the kernel stack.  This callback is optional.
+ **/
+typedef int (*netif_rx_hwchannel_cb) (struct sk_buff *skb,
+				      void *kernel_channelh,
+				      struct netdev_rx_flow flow);
+
+/**************************************************************************
+ * 2. Kernel-provided Calls (optional).
+ * ----------------------------------
+ * "Channelized" alterations of the corresponding netif_() callbacks.
+ *************************************************************************/
+
+/**
+ * netif_rx_hwchannel - gives a received buffer from an rx channel to the
+ * Linux stack.
+ * @skb: system network buffer to place next available packet from the channel
+ * @kernel_channelh: hardware rx channel handle to process
+ * @flow: specifies the type of traffic flow to examine for the given hardware
+ * rx channel
+ *
+ * Post a buffer received on a given Rx hardware channel to the stack.
+ * Used only if the channel callback is _not_ specified,
+ * see open_rx_hwchannel().
+ **/
+extern int netif_rx_hwchannel (struct sk_buff *skb,
+			       void *kernel_channelh,
+			       struct netdev_rx_flow flow);
+
+/*
+ * Start/Stop/Wakeup all traffic (flows) using a given Tx channel.
+ * The channel must be a Tx channel, that is, it must have been opened with
+ * open_tx_hwchannel().
+ */
+extern void netif_start_queue_tx_hwchannel (void *kernel_channelh);
+extern void netif_stop_queue_tx_hwchannel (void *kernel_channelh);
+extern void netif_wakeup_queue_tx_hwchannel (void *kernel_channelh);
+extern int netif_queue_stopped_tx_hwchannel (void *kernel_channelh);
+#endif /* CONFIG_NET_CHANNELS */
+
 /*
  *	The DEVICE structure.
  *	Actually, this whole structure is a big mistake.  It mixes I/O
@@ -502,6 +643,144 @@
 
 	/* class/net/name entry */
 	struct class_device	class_dev;
+
+#ifdef CONFIG_NET_CHANNELS
+	/***************** NET DEVICE HW CHANNELS ******************
+	 *  1. Low-level calls.
+	 *  Each exposes a uni-directional hardware-supported channel:
+	 *  hw_channelh.  Provided by a multi-channel-capable driver and
+	 *  adapter.
+	 *****************************************************************/
+
+	/**
+	 * (*open_tx_hwchannel) - open a channel to be used for transmit
+	 * @dev: net_device structure to associate with this channel
+	 * @priority: the channel priority as to how it is processed relative
+	 * to other channels
+	 * @burst_size: size of the channel, number of descriptors, etc.
+	 * @kernel_channelh: opaque kernel-side handle to associate with this
+	 * channel
+	 * @hw_channelh: pointer to a user handle for a given hardware channel
+	 *
+	 * Opens a transmit channel and binds it to the hw_channelh parameter.
+	 * A hardware channel cannot be opened twice and cannot be used prior
+	 * to opening.  Should any error occur, hw_channelh will point to
+	 * NULL.
+	 **/
+	int (*open_tx_hwchannel) (struct net_device *dev,
+				  netdev_hwchan_priority_e priority,
+				  int burst_size,
+				  void *kernel_channelh,
+				  void **hw_channelh);
+
+	/**
+	 * (*open_rx_hwchannel) - open a channel to be used for receive
+	 * @dev: net_device structure to associate with this channel
+	 * @flow_type: the receive channel flow type to associate with this
+	 * hardware channel
+	 * @priority: the channel priority as to how it is processed relative
+	 * to other channels
+	 * @cpu: cpu on which this channel's receive completions are processed
+	 * @burst_size: size of the channel, number of descriptors, etc.
+	 * @callback: optional receive callback for this channel, see
+	 * netif_rx_hwchannel_cb
+	 * @kernel_channelh: opaque kernel-side handle to associate with this
+	 * channel
+	 * @hw_channelh: pointer to a user handle for a given hardware channel
+	 *
+	 * Opens a receive channel and binds it to the hw_channelh parameter.
+	 * A hardware channel cannot be opened twice and cannot be used prior
+	 * to opening.  Should any error occur, hw_channelh will point to
+	 * NULL.
+	 * Note: open_rx_hwchannel is optional.  If the callback is not 
+	 * specified the driver will use the regular netif_ API.
+	 **/
+	int (*open_rx_hwchannel) (struct net_device *dev,
+				  netdev_hwchan_rx_flow_e flow_type,
+				  netdev_hwchan_priority_e priority,
+				  int cpu,
+				  int burst_size,
+				  netif_rx_hwchannel_cb	callback,
+				  void *kernel_channelh, 
+				  void **hw_channelh);
+
+	/**
+	 * (*close_hwchannel) - close a hardware channel.
+	 * @hw_channelh: specifies the particular hardware channel to close.
+	 *
+	 * Closes down a hardware channel that has been opened.  This applies
+	 * to either a transmit or receive channel.  Note in order to close
+	 * down a channel link, both the transmit and receive channels have 
+	 * to be closed separately.
+	 **/
+	int (*close_hwchannel) (void *hw_channelh);
+
+	/**
+	 * (*hard_start_xmit_hwchannel) - post a buffer to a transmit channel
+	 * @skb: network packet to add to the transmit channel
+	 * @hw_channelh: transmit channel to append the packet
+	 *
+	 * Posts a packet into the next available slot on the given transmit
+	 * channel.  This call will fail if hw_channelh is not open or is 
+	 * not a transmit channel.  Note that unlike receive there is no API
+	 * to bind transmit traffic to a given channel.
+	 **/
+	int (*hard_start_xmit_hwchannel) (struct sk_buff *skb,
+					  void *hw_channelh);
+
+	/**
+	 * (*bind_rx_hwchannel) - binds a given receive traffic flow to a 
+	 * receive channel
+	 * @flow: specific receive flow pattern to match traffic against
+	 * @hw_channelh: receive hardware channel onto which traffic matching
+	 * the given flow is steered
+	 *
+	 * Given a specific flow this function will bind that flow to the 
+	 * named receive hardware channel.  The relationship between receive
+	 * flows and receive hardware channels is one-to-many.  This means
+	 * several flows can be bound to the same receive hardware channel.
+	 * The function call will fail if the channel is not a receive channel,
+	 * if the channel is not opened, or if the specified flow does not
+	 * correspond with the channel type applied to the channel during
+	 * channel open.
+	 **/
+	int (*bind_rx_hwchannel) (struct netdev_rx_flow flow, void *hw_channelh);
+
+	/**
+	 * (*unbind_rx_hwchannel) - unbinds a given receive traffic flow from a
+	 * receive channel
+	 * @flow: specific receive flow pattern previously bound
+	 * @hw_channelh: receive hardware channel from which the given flow is
+	 * unbound
+	 *
+	 * Unbinds a certain flow from a receive hardware channel.  The
+	 * function will fail if the channel is not open, the channel is not
+	 * a receive channel, or if the given flow has not been previously
+	 * bound to the receive channel.
+	 **/
+	int (*unbind_rx_hwchannel) (struct netdev_rx_flow flow, void *hw_channelh);
+
+	/**
+	 * (*poll_hwchannel) - completion handler executed in a polling mode
+	 * @budget: number of completions "budgeted" for processing in this
+	 * iteration
+	 * @hw_channelh: hardware channel to poll
+	 *
+	 * Polls a hardware channel and reaps at most *budget completions.
+	 * This works for both transmit and receive channels.  For transmit
+	 * this function cleans up packets marked as completed by the
+	 * hardware.  For receive, packets are either passed up the stack via
+	 * the netif_ interface or delivered through the channel receive
+	 * callback if one was given.
+	 **/
+	int (*poll_hwchannel) (int *budget, void *hw_channelh);
+
+	/**
+	 * (*get_stats_hwchannel) - obtain statistics about a hardware channel
+	 * @hw_channelh: specific transmit or receive hardware channel to 
+	 * query for statistics
+	 *
+	 * Returns the standard statistics about a given transmit or receive
+	 * hardware channel.  The statistics are stored in the usual 
+	 * struct net_device_stats format.
+	 **/
+	struct net_device_stats* (*get_stats_hwchannel) (void *hw_channelh);
+
+#endif /* CONFIG_NET_CHANNELS */
 };
 
 #define	NETDEV_ALIGN		32



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VJ Channel API - driver level (PATCH)
  2006-05-02 22:53 VJ Channel API - driver level (PATCH) Alex Aizman
@ 2006-05-02 23:00 ` Stephen Hemminger
  2006-05-03  6:47   ` David S. Miller
  0 siblings, 1 reply; 19+ messages in thread
From: Stephen Hemminger @ 2006-05-02 23:00 UTC (permalink / raw)
  To: Alex Aizman; +Cc: netdev

On Tue, 02 May 2006 15:53:50 -0700
Alex Aizman <alex@neterion.com> wrote:

> Signed-off-by: Alex Aizman <alex@xxxxxxxxxx>
> 
> Hacked netdevice.h to support multiple channels.
> 
> --- netdevice-orig.h	2006-03-04 10:01:38.000000000 -0800
> +++ netdevice-channel.h	2006-03-09 10:17:11.419955200 -0800
> @@ -246,6 +246,147 @@
>  
>  extern int __init netdev_boot_setup(char *str);
>  
> +#ifdef CONFIG_NET_CHANNELS
> +/***************  NETDEVICE HW CHANNELS data structures *****************/
> +/**
> + * enum netdev_hwchannel_rx_flow_e - Hardware receive channel "flow" types.
> + * @HWCH_RX_FLOW_NONE: does not filter rx packets.
> + * @HWCH_RX_FLOW_MACADDR: filters based upon the rx mac address
> + * @HWCH_RX_FLOW_VLAN_ID: filters based upon the rx vlan id tag
> + * @HWCH_RX_FLOW_VLAN_QOS: fikters based upon the vlan qos field
> + * @HWCH_RX_FLOW_PORT: filters based upon the tcp or udp receive port number
> + * @HWCH_RX_FLOW_L4_HASH: filters based upon a hash of the tcp session id
> + * @HWCH_RX_FLOW_L4_SPDM: filters based upon a hash of the four-tuple of the
> + * following: source ip, source port, destination ip, destinaton port
> + *
> + * A rx is bound to a specific device.  When one of thsese enums is used, 
> + * traffic is filtered onto the queue of only the requested type.  By default
> + * we use HWCH_RX_FLOW_NONE as we usually want all traffic from this device.
> + **/
> +typedef enum netdev_hwchannel_rx_flow_e  {
> +	HWCH_RX_FLOW_NONE,
> +	HWCH_RX_FLOW_MACADDR,
> +	HWCH_RX_FLOW_VLAN_ID,
> +	HWCH_RX_FLOW_VLAN_QOS,
> +	HWCH_RX_FLOW_PORT,
> +	HWCH_RX_FLOW_L4_HASH,
> +	HWCH_RX_FLOW_L4_SPDM,
> +} netdev_hwchan_rx_flow_e;
> +

No, not a typedef.  also pls use shorter names.


* Re: VJ Channel API - driver level (PATCH)
  2006-05-02 23:00 ` Stephen Hemminger
@ 2006-05-03  6:47   ` David S. Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David S. Miller @ 2006-05-03  6:47 UTC (permalink / raw)
  To: shemminger; +Cc: alex, netdev


I don't think we should be defining driver APIs when we haven't even
figured out how the core of it would even work yet.

A key part of this is the netfilter bits, that will require
non-trivial flow identification, a hash will simply not be enough, and
it will not be allowed to not support the netfilter bits properly
since everyone will have netfilter enabled in one way or another.


* RE: VJ Channel API - driver level (PATCH)
@ 2006-05-03 13:56 Leonid Grossman
  2006-05-03 20:23 ` David S. Miller
  0 siblings, 1 reply; 19+ messages in thread
From: Leonid Grossman @ 2006-05-03 13:56 UTC (permalink / raw)
  To: David S. Miller, shemminger; +Cc: alex, netdev

 

> -----Original Message-----
> From: netdev-owner@vger.kernel.org 
> [mailto:netdev-owner@vger.kernel.org] On Behalf Of David S. Miller
> Sent: Tuesday, May 02, 2006 11:48 PM
> To: shemminger@osdl.org
> Cc: alex@neterion.com; netdev@vger.kernel.org
> Subject: Re: VJ Channel API - driver level (PATCH)
> 
> 
> I don't think we should be defining driver APIs when we 
> haven't even figured out how the core of it would even work yet.
> 
> A key part of this is the netfilter bits, that will require 
> non-trivial flow identification, a hash will simply not be 
> enough, and it will not be allowed to not support the 
> netfilter bits properly since everyone will have netfilter 
> enabled in one way or another.

Hi Dave,

Do you have suggestions on potential hardware assists/offloads for
netfilter?

I suppose some of it can be worthwhile, although in general it may be too
complex to implement - especially above 1 Gig.

I'd expect high end NIC ASICs to implement rx steering based upon some
sort of hash (for load balancing), as well as explicit "1:1" steering
between a sw channel and a hw channel. Both options for channel
configuration are present in the driver interface.
If netfilter assists can be done in hardware, I agree the driver
interface will need to add support for these - otherwise, netfilter
processing will stay above the driver.




* RE: VJ Channel API - driver level (PATCH)
@ 2006-05-03 15:56 Caitlin Bestler
  2006-05-03 18:07 ` Evgeniy Polyakov
  0 siblings, 1 reply; 19+ messages in thread
From: Caitlin Bestler @ 2006-05-03 15:56 UTC (permalink / raw)
  To: Leonid Grossman, David S. Miller, shemminger; +Cc: alex, netdev

netdev-owner@vger.kernel.org wrote:
>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org
>> [mailto:netdev-owner@vger.kernel.org] On Behalf Of David S. Miller
>> Sent: Tuesday, May 02, 2006 11:48 PM
>> To: shemminger@osdl.org
>> Cc: alex@neterion.com; netdev@vger.kernel.org
>> Subject: Re: VJ Channel API - driver level (PATCH)
>> 
>> 
>> I don't think we should be defining driver APIs when we haven't even
>> figured out how the core of it would even work yet.
>> 
>> A key part of this is the netfilter bits, that will require
>> non-trivial flow identification, a hash will simply not be enough,
>> and it will not be allowed to not support the netfilter bits properly
>> since everyone will have netfilter enabled in one way or another.
> 
> Hi Dave,
> 
> Do you have suggestions on potential hardware
> assists/offloads for netfilter?
> 
> I suppose some of it can be worthwhile, although in general
> may be too complex to implement - especially above 1 Gig.
> 
> I'd expect high end NIC ASICs to implement rx steering based
> upon some sort of hash (for load balancing), as well as
> explicit "1:1" steering between a sw channel and a hw
> channel. Both options for channel configuration are present
> in the driver interface.
> If netfilter assists can be done in hardware, I agree the
> driver interface will need to add support for these -
> otherwise, netfilter processing will stay above the driver.
> 
> 

Even if the hardware cannot fully implement netfilter rules
there is still value in having an interface that documents 
exactly how much filtering a given piece of hardware can do.
There is no point in having the kernel repeat packet classifications
that have already been done by the NIC.



* RE: VJ Channel API - driver level (PATCH)
@ 2006-05-03 17:52 Caitlin Bestler
  0 siblings, 0 replies; 19+ messages in thread
From: Caitlin Bestler @ 2006-05-03 17:52 UTC (permalink / raw)
  To: Alex Aizman, netdev

Are you proposing a mechanism for the consuming end of a tx
channel to support a large number of channels, or are you
assuming that the number of tx channels will be small enough
that simply polling them in priority order is adequate?



* Re: VJ Channel API - driver level (PATCH)
  2006-05-03 15:56 Caitlin Bestler
@ 2006-05-03 18:07 ` Evgeniy Polyakov
  2006-05-03 18:45   ` YOSHIFUJI Hideaki / 吉藤英明
  2006-05-03 20:35   ` David S. Miller
  0 siblings, 2 replies; 19+ messages in thread
From: Evgeniy Polyakov @ 2006-05-03 18:07 UTC (permalink / raw)
  To: Caitlin Bestler
  Cc: Leonid Grossman, David S. Miller, shemminger, alex, netdev

On Wed, May 03, 2006 at 08:56:23AM -0700, Caitlin Bestler (caitlinb@broadcom.com) wrote:
> > I'd expect high end NIC ASICs to implement rx steering based
> > upon some sort of hash (for load balancing), as well as
> > explicit "1:1" steering between a sw channel and a hw
> > channel. Both options for channel configuration are present
> > in the driver interface.
> > If netfilter assists can be done in hardware, I agree the
> > driver interface will need to add support for these -
> > otherwise, netfilter processing will stay above the driver.
> > 
> > 
> 
> Even if the hardware cannot fully implement netfilter rules
> there is still value in having an interface that documents 
> exactly how much filtering a given piece of hardware can do.
> There is no point in having the kernel repeat packet classifications
> that have already been done by the NIC.

Please do not suppose that the vj channel must rely on underlying hardware.
The new interface MUST work better, or at least not worse, than existing skb
queueing for the majority of users, and I doubt many of them have
netfilter-capable hardware.
What hardware can provide is only a hint to the software, not rules.
The best would be ipv4/ipv6 hashing, and I think it is enough.

-- 
	Evgeniy Polyakov


* RE: VJ Channel API - driver level (PATCH)
@ 2006-05-03 18:12 Caitlin Bestler
  2006-05-03 18:49 ` Stephen Hemminger
  0 siblings, 1 reply; 19+ messages in thread
From: Caitlin Bestler @ 2006-05-03 18:12 UTC (permalink / raw)
  To: Evgeniy Polyakov
  Cc: Leonid Grossman, David S. Miller, shemminger, alex, netdev

Evgeniy Polyakov wrote:
> On Wed, May 03, 2006 at 08:56:23AM -0700, Caitlin Bestler
> (caitlinb@broadcom.com) wrote:
>>> I'd expect high end NIC ASICs to implement rx steering based upon
>>> some sort of hash (for load balancing), as well as explicit "1:1"
>>> steering between a sw channel and a hw channel. Both options for
>>> channel configuration are present in the driver interface.
>>> If netfilter assists can be done in hardware, I agree the driver
>>> interface will need to add support for these - otherwise, netfilter
>>> processing will stay above the driver.
>>> 
>>> 
>> 
>> Even if the hardware cannot fully implement netfilter rules there is
>> still value in having an interface that documents exactly how much
>> filtering a given piece of hardware can do.
>> There is no point in having the kernel repeat packet classifications
>> that have already been done by the NIC.
> 
> Please do not suppose that vj channel must rely on
> underlaying hardware.
> New interface MUST work better or at least not worse than
> existing skb queueing for majority of users, and I doubt
> users with netfilter capable hardware are there.
> It is only some hint to the SW, not rules, that hardware can provide.
> The best would be ipv4/ipv6 hashing, and I think it is enough.

I agree. I was just stating that *if* there is direct hardware 
support then the software should be enabled to skip 
redundant checks. What I'm suggesting is really the
equivalent of knowing whether the hardware generates
or checks CRCs and TCP checksums. Don't mandate
the feature, just have the option to avoid redundant work.



* Re: VJ Channel API - driver level (PATCH)
  2006-05-03 18:07 ` Evgeniy Polyakov
@ 2006-05-03 18:45   ` YOSHIFUJI Hideaki / 吉藤英明
  2006-05-03 20:35   ` David S. Miller
  1 sibling, 0 replies; 19+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2006-05-03 18:45 UTC (permalink / raw)
  To: netdev
  Cc: johnpol, caitlinb, Leonid.Grossman, davem, shemminger, alex,
	yoshfuji

In article <20060503180740.GA14506@2ka.mipt.ru> (at Wed, 3 May 2006 22:07:40 +0400), Evgeniy Polyakov <johnpol@2ka.mipt.ru> says:

> > Even if the hardware cannot fully implement netfilter rules
> > there is still value in having an interface that documents 
> > exactly how much filtering a given piece of hardware can do.
> > There is no point in having the kernel repeat packet classifications
> > that have already been done by the NIC.
> 
> Please do not suppose that vj channel must rely on underlaying hardware.
> New interface MUST work better or at least not worse than existing skb
> queueing for majority of users, and I doubt users with netfilter capable
> hardware are there.
> It is only some hint to the SW, not rules, that hardware can provide.
> The best would be ipv4/ipv6 hashing, and I think it is enough.

And I believe that, if a packet contains any ipv6 extension header(s),
including routing header, fragmentation header, etc.,
we should process it in the kernel as we do now.

Regards,

--yoshfuji


* Re: VJ Channel API - driver level (PATCH)
  2006-05-03 18:12 Caitlin Bestler
@ 2006-05-03 18:49 ` Stephen Hemminger
  0 siblings, 0 replies; 19+ messages in thread
From: Stephen Hemminger @ 2006-05-03 18:49 UTC (permalink / raw)
  To: Caitlin Bestler
  Cc: Evgeniy Polyakov, Leonid Grossman, David S. Miller, alex, netdev

On Wed, 3 May 2006 11:12:15 -0700
"Caitlin Bestler" <caitlinb@broadcom.com> wrote:

> Evgeniy Polyakov wrote:
> > On Wed, May 03, 2006 at 08:56:23AM -0700, Caitlin Bestler
> > (caitlinb@broadcom.com) wrote:
> >>> I'd expect high end NIC ASICs to implement rx steering based upon
> >>> some sort of hash (for load balancing), as well as explicit "1:1"
> >>> steering between a sw channel and a hw channel. Both options for
> >>> channel configuration are present in the driver interface.
> >>> If netfilter assists can be done in hardware, I agree the driver
> >>> interface will need to add support for these - otherwise, netfilter
> >>> processing will stay above the driver.
> >>> 
> >>> 
> >> 
> >> Even if the hardware cannot fully implement netfilter rules there is
> >> still value in having an interface that documents exactly how much
> >> filtering a given piece of hardware can do.
> >> There is no point in having the kernel repeat packet classifications
> >> that have already been done by the NIC.
> > 
> > Please do not suppose that vj channel must rely on
> > underlaying hardware.
> > New interface MUST work better or at least not worse than
> > existing skb queueing for majority of users, and I doubt
> > users with netfilter capable hardware are there.
> > It is only some hint to the SW, not rules, that hardware can provide.
> > The best would be ipv4/ipv6 hashing, and I think it is enough.
> 
> I agree. I was just stating that *if* there is direct hardware 
> support then the software should be enabled to skip 
> redundant checks. What I'm suggesting is really the
> equivalent of knowing whether the hardware generates
> or checks CRCs and TCP checksums. Don't mandate
> the feature, just have the option to avoid redundant work.
> 

Also, like multicast filtering, you need to allow for the partial-match
case.  If hardware can do some of the work, it helps.


* Re: VJ Channel API - driver level (PATCH)
  2006-05-03 13:56 Leonid Grossman
@ 2006-05-03 20:23 ` David S. Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David S. Miller @ 2006-05-03 20:23 UTC (permalink / raw)
  To: Leonid.Grossman; +Cc: shemminger, alex, netdev

From: "Leonid Grossman" <Leonid.Grossman@neterion.com>
Date: Wed, 3 May 2006 09:56:18 -0400

> Do you have suggestions on potential hardware assists/offloads for
> netfilter?

We don't know yet what things will look like, that's why we
shouldn't be defining APIs and I cannot give any such advice
yet.


* Re: VJ Channel API - driver level (PATCH)
  2006-05-03 18:07 ` Evgeniy Polyakov
  2006-05-03 18:45   ` YOSHIFUJI Hideaki / 吉藤英明
@ 2006-05-03 20:35   ` David S. Miller
  1 sibling, 0 replies; 19+ messages in thread
From: David S. Miller @ 2006-05-03 20:35 UTC (permalink / raw)
  To: johnpol; +Cc: caitlinb, Leonid.Grossman, shemminger, alex, netdev

From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Wed, 3 May 2006 22:07:40 +0400

> On Wed, May 03, 2006 at 08:56:23AM -0700, Caitlin Bestler (caitlinb@broadcom.com) wrote:
> > > I'd expect high end NIC ASICs to implement rx steering based
> > > upon some sort of hash (for load balancing), as well as
> > > explicit "1:1" steering between a sw channel and a hw
> > > channel. Both options for channel configuration are present
> > > in the driver interface.
> > > If netfilter assists can be done in hardware, I agree the
> > > driver interface will need to add support for these -
> > > otherwise, netfilter processing will stay above the driver.
> > > 
> > > 
> > 
> > Even if the hardware cannot fully implement netfilter rules
> > there is still value in having an interface that documents 
> > exactly how much filtering a given piece of hardware can do.
> > There is no point in having the kernel repeat packet classifications
> > that have already been done by the NIC.
> 
> Please do not suppose that vj channel must rely on underlaying hardware.

I am not.  I am just saying that it is futile to build hardware that
cannot handle netfilter at least to some extent, because this will
result in HW net channels being disabled for 99% of real users, which
makes the hardware just a waste.


* RE: VJ Channel API - driver level (PATCH)
@ 2006-05-03 20:40 Caitlin Bestler
  2006-05-04 22:49 ` Alex Aizman
  0 siblings, 1 reply; 19+ messages in thread
From: Caitlin Bestler @ 2006-05-03 20:40 UTC (permalink / raw)
  To: David S. Miller, johnpol; +Cc: Leonid.Grossman, shemminger, alex, netdev

David S. Miller wrote:
> From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
> Date: Wed, 3 May 2006 22:07:40 +0400
> 
>> On Wed, May 03, 2006 at 08:56:23AM -0700, Caitlin Bestler
> (caitlinb@broadcom.com) wrote:
>>>> I'd expect high end NIC ASICs to implement rx steering based upon
>>>> some sort of hash (for load balancing), as well as explicit "1:1"
>>>> steering between a sw channel and a hw channel. Both options for
>>>> channel configuration are present in the driver interface.
>>>> If netfilter assists can be done in hardware, I agree the driver
>>>> interface will need to add support for these - otherwise,
>>>> netfilter processing will stay above the driver.
>>>> 
>>>> 
>>> 
>>> Even if the hardware cannot fully implement netfilter rules there is
>>> still value in having an interface that documents exactly how much
>>> filtering a given piece of hardware can do.
>>> There is no point in having the kernel repeat packet classifications
>>> that have already been done by the NIC.
>> 
>> Please do not suppose that vj channel must rely on underlaying
>> hardware. 
> 
> I am not.  I am just saying that it is futile to build
> hardware that cannot handle netfilter at least to some
> extent, because this will result in HW net channels being
> disabled for %99 of real users which makes the hardware just a waste.

Or netfilter being disabled, which would be just as bad or worse.
The kernel and hardware need to co-operate so that users are not
asked to make artificial choices between performance and security.




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VJ Channel API - driver level (PATCH)
  2006-05-03 20:40 Caitlin Bestler
@ 2006-05-04 22:49 ` Alex Aizman
  2006-05-04 23:04   ` David S. Miller
  0 siblings, 1 reply; 19+ messages in thread
From: Alex Aizman @ 2006-05-04 22:49 UTC (permalink / raw)
  To: Caitlin Bestler
  Cc: David S. Miller, johnpol, Leonid.Grossman, shemminger, netdev

So, what are the requirements? The hardware already parses L2, L3, L4 headers, 
and for the future generation we could add to the set of already supported 
steering/filtering criteria. Having some discussion on the essential vs. 
optional requirements seems like the right thing at this point.

On one hand, this describes what's available:

http://www.spinics.net/lists/netdev/msg04001.html

OTOH, and this is just my opinion - it'd be unrealistic to expect a general
purpose NIC to offload the entire netfilter. On the third hand, one could
think of IPsec and/or NAT, and of what happens then to the hardware-supported
filtering. And so on.

There's also a question of relative importance specifically for the Data 
Center environment.

Anyway, discussion would help.

Caitlin Bestler wrote:
> David S. Miller wrote:
>> From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
>> Date: Wed, 3 May 2006 22:07:40 +0400
>>
>>> On Wed, May 03, 2006 at 08:56:23AM -0700, Caitlin Bestler
>> (caitlinb@broadcom.com) wrote:
>>>>> I'd expect high end NIC ASICs to implement rx steering based upon
>>>>> some sort of hash (for load balancing), as well as explicit "1:1"
>>>>> steering between a sw channel and a hw channel. Both options for
>>>>> channel configuration are present in the driver interface.
>>>>> If netfilter assists can be done in hardware, I agree the driver
>>>>> interface will need to add support for these - otherwise,
>>>>> netfilter processing will stay above the driver.
>>>>>
>>>>>
>>>> Even if the hardware cannot fully implement netfilter rules there is
>>>> still value in having an interface that documents exactly how much
>>>> filtering a given piece of hardware can do.
>>>> There is no point in having the kernel repeat packet classifications
>>>> that have already been done by the NIC.
>>> Please do not suppose that vj channel must rely on underlying
>>> hardware. 
>> I am not.  I am just saying that it is futile to build
>> hardware that cannot handle netfilter at least to some
>> extent, because this will result in HW net channels being
>> disabled for 99% of real users, which makes the hardware just a waste.
> 
> Or netfilter being disabled, which would be just as bad or worse.
> The kernel and hardware need to co-operate so that users are not
> asked to make artificial choices between performance and security.
> 
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VJ Channel API - driver level (PATCH)
  2006-05-04 22:49 ` Alex Aizman
@ 2006-05-04 23:04   ` David S. Miller
  2006-05-05  9:36     ` Evgeniy Polyakov
  0 siblings, 1 reply; 19+ messages in thread
From: David S. Miller @ 2006-05-04 23:04 UTC (permalink / raw)
  To: alex; +Cc: caitlinb, johnpol, Leonid.Grossman, shemminger, netdev

From: Alex Aizman <alex@neterion.com>
Date: Thu, 04 May 2006 15:49:11 -0700

> So, what are the requirements?

I will say it a 10th time, "we simply don't know yet."

Please be patient and let us design the net channel infrastructure
properly, then we can think clearly about how hardware might support
things.

Hardware folks are jumping the gun and it's very annoying and takes
precious time away from thinking about and working on the
implementation.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VJ Channel API - driver level (PATCH)
  2006-05-04 23:04   ` David S. Miller
@ 2006-05-05  9:36     ` Evgeniy Polyakov
  2006-05-06  0:35       ` David S. Miller
  0 siblings, 1 reply; 19+ messages in thread
From: Evgeniy Polyakov @ 2006-05-05  9:36 UTC (permalink / raw)
  To: David S. Miller; +Cc: alex, caitlinb, Leonid.Grossman, shemminger, netdev

On Thu, May 04, 2006 at 04:04:32PM -0700, David S. Miller (davem@davemloft.net) wrote:
> Hardware folks are jumping the gun and it's very annoying and takes
> precious time away from thinking about and working on the
> implementation.

Hardware folks could also create their own implementation and show the
community whether their approach is good or not.

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VJ Channel API - driver level (PATCH)
  2006-05-05  9:36     ` Evgeniy Polyakov
@ 2006-05-06  0:35       ` David S. Miller
  2006-05-06  8:42         ` Evgeniy Polyakov
  0 siblings, 1 reply; 19+ messages in thread
From: David S. Miller @ 2006-05-06  0:35 UTC (permalink / raw)
  To: johnpol; +Cc: alex, caitlinb, Leonid.Grossman, shemminger, netdev

From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Fri, 5 May 2006 13:36:56 +0400

> Hardware folks could also create their own implementation and show
> the community whether their approach is good or not.

Designing hardware for non-existent software infrastructure is
risky business :-)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VJ Channel API - driver level (PATCH)
  2006-05-06  0:35       ` David S. Miller
@ 2006-05-06  8:42         ` Evgeniy Polyakov
  2006-05-06  8:57           ` Evgeniy Polyakov
  0 siblings, 1 reply; 19+ messages in thread
From: Evgeniy Polyakov @ 2006-05-06  8:42 UTC (permalink / raw)
  To: David S. Miller; +Cc: alex, caitlinb, Leonid.Grossman, shemminger, netdev

On Fri, May 05, 2006 at 05:35:33PM -0700, David S. Miller (davem@davemloft.net) wrote:
> From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
> Date: Fri, 5 May 2006 13:36:56 +0400
> 
> > Hardware folks could also create their own implementation and show
> > the community whether their approach is good or not.
> 
> Designing hardware for non-existent software infrastructure is
> risky business :-)

There are companies that do TOE and crypto accelerators without
support even from Windows :)

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VJ Channel API - driver level (PATCH)
  2006-05-06  8:42         ` Evgeniy Polyakov
@ 2006-05-06  8:57           ` Evgeniy Polyakov
  0 siblings, 0 replies; 19+ messages in thread
From: Evgeniy Polyakov @ 2006-05-06  8:57 UTC (permalink / raw)
  To: David S. Miller; +Cc: alex, caitlinb, Leonid.Grossman, shemminger, netdev

On Sat, May 06, 2006 at 12:42:38PM +0400, Evgeniy Polyakov (johnpol@2ka.mipt.ru) wrote:
> On Fri, May 05, 2006 at 05:35:33PM -0700, David S. Miller (davem@davemloft.net) wrote:
> > From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
> > Date: Fri, 5 May 2006 13:36:56 +0400
> > 
> > > Hardware folks could also create their own implementation and show
> > > the community whether their approach is good or not.
> > 
> > Designing hardware for non-existent software infrastructure is
> > risky business :-)
> 
> There are companies that do TOE and crypto accelerators without
> support even from Windows :)

And actually most of them have research departments that do new and
interesting development work.

It is Open Source after all - just do everything you want and show the
community that it is useful. Neterion did that right with UFO and LRO.

IBM started to do it with netchannels.

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2006-05-06  8:57 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-05-02 22:53 VJ Channel API - driver level (PATCH) Alex Aizman
2006-05-02 23:00 ` Stephen Hemminger
2006-05-03  6:47   ` David S. Miller
  -- strict thread matches above, loose matches on Subject: below --
2006-05-03 13:56 Leonid Grossman
2006-05-03 20:23 ` David S. Miller
2006-05-03 15:56 Caitlin Bestler
2006-05-03 18:07 ` Evgeniy Polyakov
2006-05-03 18:45   ` YOSHIFUJI Hideaki / 吉藤英明
2006-05-03 20:35   ` David S. Miller
2006-05-03 17:52 Caitlin Bestler
2006-05-03 18:12 Caitlin Bestler
2006-05-03 18:49 ` Stephen Hemminger
2006-05-03 20:40 Caitlin Bestler
2006-05-04 22:49 ` Alex Aizman
2006-05-04 23:04   ` David S. Miller
2006-05-05  9:36     ` Evgeniy Polyakov
2006-05-06  0:35       ` David S. Miller
2006-05-06  8:42         ` Evgeniy Polyakov
2006-05-06  8:57           ` Evgeniy Polyakov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).