* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
From: David Miller @ 2018-02-27 17:38 UTC
To: ecree; +Cc: linux-net-drivers, netdev, linville
Edward, none of these postings are making it to the list.
The problem is that there are syntax errors in your email headers.
Any time a person's name contains a special character like ".",
that entire string must be enclosed in double quotes.
This is the case for "John W. Linville", so please add proper
quotes around such names and resend your patch series.
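For example (the address here is illustrative only):
    To: "John W. Linville" <linville@example.org>
Without the quotes, the "." in the display name is not valid
RFC 5322 syntax.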
Thank you.
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
From: Edward Cree @ 2018-02-27 17:55 UTC
To: David Miller; +Cc: linux-net-drivers, netdev, linville
On 27/02/18 17:38, David Miller wrote:
> The problem is that there are syntax errors in your email headers.
>
> Any time a person's name contains a special character like ".",
> that entire string must be enclosed in double quotes.
>
> This is the case for "John W. Linville", so please add proper
> quotes around such names and resend your patch series.
Thank you for spotting this! I looked at the headers and failed
to notice anything wrong with them.
I'm surprised that git-imap-send doesn't check for this...
Will resend with that fixed.
* [PATCH RESEND net-next 0/2] ntuple filters with RSS
From: Edward Cree @ 2018-02-27 17:59 UTC
To: linux-net-drivers, David Miller; +Cc: netdev, John W. Linville
This series introduces the ability to mark an ethtool steering filter to use
RSS spreading, and the ability to create and configure multiple RSS contexts
with different indirection tables, hash keys, and hash fields.
An implementation for the sfc driver (for 7000-series and later SFC NICs) is
included in patch 2/2.
The anticipated use case of this feature is steering traffic destined
for a container (or virtual machine) to the subset of CPUs to which the
container's processes (or the VM's vCPUs) are bound, while retaining
the scalability of RSS spreading as seen from inside the container.
The use of both a base queue number (ring_cookie) and an indirection
table is intended to allow reuse of a single RSS context to target
multiple sets of CPUs. For instance, if an 8-core system is hosting
three containers on CPUs [1,2], [3,4] and [6,7], then a single RSS
context with an equal-weight [0,1] indirection table could be used to
target all three containers by setting ring_cookie to 1, 3 and 6 on the
respective filters.
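As a concrete model of the delivery arithmetic (an illustrative sketch
of the semantics described above, not code from the driver):

static unsigned int rx_queue_for(unsigned int flow_hash,
                                 const unsigned int *indir,
                                 unsigned int indir_len,
                                 unsigned int ring_cookie)
{
        /* The indirection-table entry selected by the flow hash is
         * added to the filter's base queue number.
         */
        return ring_cookie + indir[flow_hash % indir_len];
}

With the equal-weight [0,1] table, the filter with ring_cookie 3 thus
spreads its matches across queues 3 and 4, which (with one RX queue per
CPU) are the second container's CPUs.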
Edward Cree (2):
net: ethtool: extend RXNFC API to support RSS spreading of filter
matches
sfc: support RSS spreading of ethtool ntuple filters
drivers/net/ethernet/sfc/ef10.c | 273 ++++++++++++++++++++++------------
drivers/net/ethernet/sfc/efx.c | 65 +++++++-
drivers/net/ethernet/sfc/efx.h | 12 +-
drivers/net/ethernet/sfc/ethtool.c | 153 ++++++++++++++++---
drivers/net/ethernet/sfc/farch.c | 11 +-
drivers/net/ethernet/sfc/filter.h | 7 +-
drivers/net/ethernet/sfc/net_driver.h | 44 +++++-
drivers/net/ethernet/sfc/nic.h | 2 -
drivers/net/ethernet/sfc/siena.c | 26 ++--
include/linux/ethtool.h | 5 +
include/uapi/linux/ethtool.h | 32 +++-
net/core/ethtool.c | 64 ++++++--
12 files changed, 523 insertions(+), 171 deletions(-)
* [PATCH net-next 1/2] net: ethtool: extend RXNFC API to support RSS spreading of filter matches
From: Edward Cree @ 2018-02-27 18:02 UTC
To: linux-net-drivers, David Miller; +Cc: netdev, John W. Linville
We use a two-step process to configure a filter with RSS spreading.
First, an RSS context is allocated and configured using ETHTOOL_SRSSH,
which returns an identifier (rss_context). That identifier is then
passed to subsequent invocations of ETHTOOL_SRXCLSRLINS, to specify
that the offset from the RSS indirection table lookup should be added
to the base queue number (ring_cookie) when delivering the packet.
Drivers for devices which can only use the indirection table entry
directly (rather than adding it to a base queue number) should reject
rule insertions combining RSS with a nonzero ring_cookie.
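As an illustration, the two steps might look as follows from user-space
(a hypothetical minimal sketch: it assumes uapi headers containing the
new ETH_RXFH_CONTEXT_ALLOC and FLOW_RSS definitions, uses a placeholder
interface name and address, and elides most error handling):

#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <netinet/in.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

int main(void)
{
        struct ethtool_rxnfc nfc = { 0 };
        struct ethtool_rxfh rxfh = { 0 };
        struct ifreq ifr = { 0 };
        int fd = socket(AF_INET, SOCK_DGRAM, 0);

        if (fd < 0)
                return 1;
        strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);

        /* Step 1: allocate and configure an RSS context, keeping the
         * driver's default indirection table and hash key.
         */
        rxfh.cmd = ETHTOOL_SRSSH;
        rxfh.rss_context = ETH_RXFH_CONTEXT_ALLOC;
        rxfh.indir_size = ETH_RXFH_INDIR_NO_CHANGE;
        rxfh.hfunc = ETH_RSS_HASH_TOP;
        ifr.ifr_data = (void *)&rxfh;
        if (ioctl(fd, SIOCETHTOOL, &ifr) < 0)
                return 1;
        /* The new context's ID has been written back into rss_context. */

        /* Step 2: insert a filter that RSS-spreads matching traffic
         * using that context, on top of base queue 3.
         */
        nfc.cmd = ETHTOOL_SRXCLSRLINS;
        nfc.rss_context = rxfh.rss_context;
        nfc.fs.flow_type = TCP_V4_FLOW | FLOW_RSS;
        nfc.fs.h_u.tcp_ip4_spec.ip4dst = htonl(0xc0000201); /* 192.0.2.1 */
        nfc.fs.m_u.tcp_ip4_spec.ip4dst = htonl(0xffffffff);
        nfc.fs.ring_cookie = 3; /* base queue number */
        nfc.fs.location = RX_CLS_LOC_ANY;
        ifr.ifr_data = (void *)&nfc;
        return ioctl(fd, SIOCETHTOOL, &ifr) < 0;
}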
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
include/linux/ethtool.h | 5 ++++
include/uapi/linux/ethtool.h | 32 +++++++++++++++++-----
net/core/ethtool.c | 64 +++++++++++++++++++++++++++++++++-----------
3 files changed, 80 insertions(+), 21 deletions(-)
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 2ec41a7eb54f..ebe41811ed34 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -371,6 +371,11 @@ struct ethtool_ops {
u8 *hfunc);
int (*set_rxfh)(struct net_device *, const u32 *indir,
const u8 *key, const u8 hfunc);
+ int (*get_rxfh_context)(struct net_device *, u32 *indir, u8 *key,
+ u8 *hfunc, u32 rss_context);
+ int (*set_rxfh_context)(struct net_device *, const u32 *indir,
+ const u8 *key, const u8 hfunc,
+ u32 *rss_context, bool delete);
void (*get_channels)(struct net_device *, struct ethtool_channels *);
int (*set_channels)(struct net_device *, struct ethtool_channels *);
int (*get_dump_flag)(struct net_device *, struct ethtool_dump *);
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 44a0b675a6bc..20da156aaf64 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -914,12 +914,15 @@ static inline __u64 ethtool_get_flow_spec_ring_vf(__u64 ring_cookie)
* @flow_type: Type of flow to be affected, e.g. %TCP_V4_FLOW
* @data: Command-dependent value
* @fs: Flow classification rule
+ * @rss_context: RSS context to be affected
* @rule_cnt: Number of rules to be affected
* @rule_locs: Array of used rule locations
*
* For %ETHTOOL_GRXFH and %ETHTOOL_SRXFH, @data is a bitmask indicating
* the fields included in the flow hash, e.g. %RXH_IP_SRC. The following
- * structure fields must not be used.
+ * structure fields must not be used, except that if @flow_type includes
+ * the %FLOW_RSS flag, then @rss_context determines which RSS context to
+ * act on.
*
* For %ETHTOOL_GRXRINGS, @data is set to the number of RX rings/queues
* on return.
@@ -931,7 +934,9 @@ static inline __u64 ethtool_get_flow_spec_ring_vf(__u64 ring_cookie)
* set in @data then special location values should not be used.
*
* For %ETHTOOL_GRXCLSRULE, @fs.@location specifies the location of an
- * existing rule on entry and @fs contains the rule on return.
+ * existing rule on entry and @fs contains the rule on return; if
+ * @fs.@flow_type includes the %FLOW_RSS flag, then @rss_context is
+ * filled with the RSS context ID associated with the rule.
*
* For %ETHTOOL_GRXCLSRLALL, @rule_cnt specifies the array size of the
* user buffer for @rule_locs on entry. On return, @data is the size
@@ -942,7 +947,11 @@ static inline __u64 ethtool_get_flow_spec_ring_vf(__u64 ring_cookie)
* For %ETHTOOL_SRXCLSRLINS, @fs specifies the rule to add or update.
* @fs.@location either specifies the location to use or is a special
* location value with %RX_CLS_LOC_SPECIAL flag set. On return,
- * @fs.@location is the actual rule location.
+ * @fs.@location is the actual rule location. If @fs.@flow_type
+ * includes the %FLOW_RSS flag, @rss_context is the RSS context ID to
+ * use for flow spreading traffic which matches this rule. The value
+ * from the rxfh indirection table will be added to @fs.@ring_cookie
+ * to choose which ring to deliver to.
*
* For %ETHTOOL_SRXCLSRLDEL, @fs.@location specifies the location of an
* existing rule on entry.
@@ -963,7 +972,10 @@ struct ethtool_rxnfc {
__u32 flow_type;
__u64 data;
struct ethtool_rx_flow_spec fs;
- __u32 rule_cnt;
+ union {
+ __u32 rule_cnt;
+ __u32 rss_context;
+ };
__u32 rule_locs[0];
};
@@ -990,7 +1002,11 @@ struct ethtool_rxfh_indir {
/**
* struct ethtool_rxfh - command to get/set RX flow hash indir or/and hash key.
* @cmd: Specific command number - %ETHTOOL_GRSSH or %ETHTOOL_SRSSH
- * @rss_context: RSS context identifier.
+ * @rss_context: RSS context identifier. Context 0 is the default for normal
+ * traffic; other contexts can be referenced as the destination for RX flow
+ * classification rules. %ETH_RXFH_CONTEXT_ALLOC is used with command
+ * %ETHTOOL_SRSSH to allocate a new RSS context; on return this field will
+ * contain the ID of the newly allocated context.
* @indir_size: On entry, the array size of the user buffer for the
* indirection table, which may be zero, or (for %ETHTOOL_SRSSH),
* %ETH_RXFH_INDIR_NO_CHANGE. On return from %ETHTOOL_GRSSH,
@@ -1009,7 +1025,8 @@ struct ethtool_rxfh_indir {
* size should be returned. For %ETHTOOL_SRSSH, an @indir_size of
* %ETH_RXFH_INDIR_NO_CHANGE means that indir table setting is not requested
* and a @indir_size of zero means the indir table should be reset to default
- * values. An hfunc of zero means that hash function setting is not requested.
+ * values (if @rss_context == 0) or that the RSS context should be deleted.
+ * An hfunc of zero means that hash function setting is not requested.
*/
struct ethtool_rxfh {
__u32 cmd;
@@ -1021,6 +1038,7 @@ struct ethtool_rxfh {
__u32 rsvd32;
__u32 rss_config[0];
};
+#define ETH_RXFH_CONTEXT_ALLOC 0xffffffff
#define ETH_RXFH_INDIR_NO_CHANGE 0xffffffff
/**
@@ -1635,6 +1653,8 @@ static inline int ethtool_validate_duplex(__u8 duplex)
/* Flag to enable additional fields in struct ethtool_rx_flow_spec */
#define FLOW_EXT 0x80000000
#define FLOW_MAC_EXT 0x40000000
+/* Flag to enable RSS spreading of traffic matching rule (nfc only) */
+#define FLOW_RSS 0x20000000
/* L3-L4 network traffic flow hash options */
#define RXH_L2DA (1 << 1)
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 107b122c8969..7c8685e47351 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -1028,6 +1028,15 @@ static noinline_for_stack int ethtool_get_rxnfc(struct net_device *dev,
if (copy_from_user(&info, useraddr, info_size))
return -EFAULT;
+ /* If FLOW_RSS was requested then user-space must be using the
+ * new definition, as FLOW_RSS is newer.
+ */
+ if (cmd == ETHTOOL_GRXFH && info.flow_type & FLOW_RSS) {
+ info_size = sizeof(info);
+ if (copy_from_user(&info, useraddr, info_size))
+ return -EFAULT;
+ }
+
if (info.cmd == ETHTOOL_GRXCLSRLALL) {
if (info.rule_cnt > 0) {
if (info.rule_cnt <= KMALLOC_MAX_SIZE / sizeof(u32))
@@ -1257,9 +1266,11 @@ static noinline_for_stack int ethtool_get_rxfh(struct net_device *dev,
user_key_size = rxfh.key_size;
/* Check that reserved fields are 0 for now */
- if (rxfh.rss_context || rxfh.rsvd8[0] || rxfh.rsvd8[1] ||
- rxfh.rsvd8[2] || rxfh.rsvd32)
+ if (rxfh.rsvd8[0] || rxfh.rsvd8[1] || rxfh.rsvd8[2] || rxfh.rsvd32)
return -EINVAL;
+ /* Most drivers don't handle rss_context, check it's 0 as well */
+ if (rxfh.rss_context && !ops->get_rxfh_context)
+ return -EOPNOTSUPP;
rxfh.indir_size = dev_indir_size;
rxfh.key_size = dev_key_size;
@@ -1282,7 +1293,12 @@ static noinline_for_stack int ethtool_get_rxfh(struct net_device *dev,
if (user_key_size)
hkey = rss_config + indir_bytes;
- ret = dev->ethtool_ops->get_rxfh(dev, indir, hkey, &dev_hfunc);
+ if (rxfh.rss_context)
+ ret = dev->ethtool_ops->get_rxfh_context(dev, indir, hkey,
+ &dev_hfunc,
+ rxfh.rss_context);
+ else
+ ret = dev->ethtool_ops->get_rxfh(dev, indir, hkey, &dev_hfunc);
if (ret)
goto out;
@@ -1312,6 +1328,7 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
u8 *hkey = NULL;
u8 *rss_config;
u32 rss_cfg_offset = offsetof(struct ethtool_rxfh, rss_config[0]);
+ bool delete = false;
if (!ops->get_rxnfc || !ops->set_rxfh)
return -EOPNOTSUPP;
@@ -1325,9 +1342,11 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
return -EFAULT;
/* Check that reserved fields are 0 for now */
- if (rxfh.rss_context || rxfh.rsvd8[0] || rxfh.rsvd8[1] ||
- rxfh.rsvd8[2] || rxfh.rsvd32)
+ if (rxfh.rsvd8[0] || rxfh.rsvd8[1] || rxfh.rsvd8[2] || rxfh.rsvd32)
return -EINVAL;
+ /* Most drivers don't handle rss_context, check it's 0 as well */
+ if (rxfh.rss_context && !ops->set_rxfh_context)
+ return -EOPNOTSUPP;
/* If either indir, hash key or function is valid, proceed further.
* Must request at least one change: indir size, hash key or function.
@@ -1352,7 +1371,8 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
if (ret)
goto out;
- /* rxfh.indir_size == 0 means reset the indir table to default.
+ /* rxfh.indir_size == 0 means reset the indir table to default (master
+ * context) or delete the context (other RSS contexts).
* rxfh.indir_size == ETH_RXFH_INDIR_NO_CHANGE means leave it unchanged.
*/
if (rxfh.indir_size &&
@@ -1365,9 +1385,13 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
if (ret)
goto out;
} else if (rxfh.indir_size == 0) {
- indir = (u32 *)rss_config;
- for (i = 0; i < dev_indir_size; i++)
- indir[i] = ethtool_rxfh_indir_default(i, rx_rings.data);
+ if (rxfh.rss_context == 0) {
+ indir = (u32 *)rss_config;
+ for (i = 0; i < dev_indir_size; i++)
+ indir[i] = ethtool_rxfh_indir_default(i, rx_rings.data);
+ } else {
+ delete = true;
+ }
}
if (rxfh.key_size) {
@@ -1380,15 +1404,25 @@ static noinline_for_stack int ethtool_set_rxfh(struct net_device *dev,
}
}
- ret = ops->set_rxfh(dev, indir, hkey, rxfh.hfunc);
+ if (rxfh.rss_context)
+ ret = ops->set_rxfh_context(dev, indir, hkey, rxfh.hfunc,
+ &rxfh.rss_context, delete);
+ else
+ ret = ops->set_rxfh(dev, indir, hkey, rxfh.hfunc);
if (ret)
goto out;
- /* indicate whether rxfh was set to default */
- if (rxfh.indir_size == 0)
- dev->priv_flags &= ~IFF_RXFH_CONFIGURED;
- else if (rxfh.indir_size != ETH_RXFH_INDIR_NO_CHANGE)
- dev->priv_flags |= IFF_RXFH_CONFIGURED;
+ if (copy_to_user(useraddr + offsetof(struct ethtool_rxfh, rss_context),
+ &rxfh.rss_context, sizeof(rxfh.rss_context)))
+ ret = -EFAULT;
+
+ if (!rxfh.rss_context) {
+ /* indicate whether rxfh was set to default */
+ if (rxfh.indir_size == 0)
+ dev->priv_flags &= ~IFF_RXFH_CONFIGURED;
+ else if (rxfh.indir_size != ETH_RXFH_INDIR_NO_CHANGE)
+ dev->priv_flags |= IFF_RXFH_CONFIGURED;
+ }
out:
kfree(rss_config);
* [PATCH net-next 2/2] sfc: support RSS spreading of ethtool ntuple filters
From: Edward Cree @ 2018-02-27 18:03 UTC
To: linux-net-drivers, David Miller; +Cc: netdev, John W. Linville
Use a linked list to associate user-facing context IDs with FW-facing
context IDs (since the latter can change after an MC reset).
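Schematically, each list entry pairs a stable user-visible ID with the
volatile firmware handle (an illustrative model; the real structure is
struct efx_rss_context in this patch):

struct rss_ctx_model {
        struct list_head list;  /* on efx->rss_context.list */
        u32 user_id;            /* stable; what ethtool callers see */
        u32 fw_id;              /* firmware handle; re-acquired on reset */
};

After an MC reset, the restore path walks the list and re-allocates
each firmware context, so fw_id changes but user_id (and any filters
referring to it) remains valid.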
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
drivers/net/ethernet/sfc/ef10.c | 273 ++++++++++++++++++++++------------
drivers/net/ethernet/sfc/efx.c | 65 +++++++-
drivers/net/ethernet/sfc/efx.h | 12 +-
drivers/net/ethernet/sfc/ethtool.c | 153 ++++++++++++++++---
drivers/net/ethernet/sfc/farch.c | 11 +-
drivers/net/ethernet/sfc/filter.h | 7 +-
drivers/net/ethernet/sfc/net_driver.h | 44 +++++-
drivers/net/ethernet/sfc/nic.h | 2 -
drivers/net/ethernet/sfc/siena.c | 26 ++--
9 files changed, 443 insertions(+), 150 deletions(-)
diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 75fbf58e421c..30d69bac6b8f 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -28,9 +28,6 @@ enum {
EFX_EF10_TEST = 1,
EFX_EF10_REFILL,
};
-
-/* The reserved RSS context value */
-#define EFX_EF10_RSS_CONTEXT_INVALID 0xffffffff
/* The maximum size of a shared RSS context */
/* TODO: this should really be from the mcdi protocol export */
#define EFX_EF10_MAX_SHARED_RSS_CONTEXT_SIZE 64UL
@@ -697,7 +694,7 @@ static int efx_ef10_probe(struct efx_nic *efx)
}
nic_data->warm_boot_count = rc;
- nic_data->rx_rss_context = EFX_EF10_RSS_CONTEXT_INVALID;
+ efx->rss_context.context_id = EFX_EF10_RSS_CONTEXT_INVALID;
nic_data->vport_id = EVB_PORT_ID_ASSIGNED;
@@ -1489,8 +1486,8 @@ static int efx_ef10_init_nic(struct efx_nic *efx)
}
/* don't fail init if RSS setup doesn't work */
- rc = efx->type->rx_push_rss_config(efx, false, efx->rx_indir_table, NULL);
- efx->rss_active = (rc == 0);
+ rc = efx->type->rx_push_rss_config(efx, false,
+ efx->rss_context.rx_indir_table, NULL);
return 0;
}
@@ -1507,7 +1504,7 @@ static void efx_ef10_reset_mc_allocations(struct efx_nic *efx)
nic_data->must_restore_filters = true;
nic_data->must_restore_piobufs = true;
efx_ef10_forget_old_piobufs(efx);
- nic_data->rx_rss_context = EFX_EF10_RSS_CONTEXT_INVALID;
+ efx->rss_context.context_id = EFX_EF10_RSS_CONTEXT_INVALID;
/* Driver-created vswitches and vports must be re-created */
nic_data->must_probe_vswitching = true;
@@ -2703,27 +2700,30 @@ static int efx_ef10_get_rss_flags(struct efx_nic *efx, u32 context, u32 *flags)
* Defaults are 4-tuple for TCP and 2-tuple for UDP and other-IP, so we
* just need to set the UDP ports flags (for both IP versions).
*/
-static void efx_ef10_set_rss_flags(struct efx_nic *efx, u32 context)
+static void efx_ef10_set_rss_flags(struct efx_nic *efx,
+ struct efx_rss_context *ctx)
{
MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_SET_FLAGS_IN_LEN);
u32 flags;
BUILD_BUG_ON(MC_CMD_RSS_CONTEXT_SET_FLAGS_OUT_LEN != 0);
- if (efx_ef10_get_rss_flags(efx, context, &flags) != 0)
+ if (efx_ef10_get_rss_flags(efx, ctx->context_id, &flags) != 0)
return;
- MCDI_SET_DWORD(inbuf, RSS_CONTEXT_SET_FLAGS_IN_RSS_CONTEXT_ID, context);
+ MCDI_SET_DWORD(inbuf, RSS_CONTEXT_SET_FLAGS_IN_RSS_CONTEXT_ID,
+ ctx->context_id);
flags |= RSS_MODE_HASH_PORTS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_UDP_IPV4_RSS_MODE_LBN;
flags |= RSS_MODE_HASH_PORTS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_UDP_IPV6_RSS_MODE_LBN;
MCDI_SET_DWORD(inbuf, RSS_CONTEXT_SET_FLAGS_IN_FLAGS, flags);
if (!efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_SET_FLAGS, inbuf, sizeof(inbuf),
NULL, 0, NULL))
/* Succeeded, so UDP 4-tuple is now enabled */
- efx->rx_hash_udp_4tuple = true;
+ ctx->rx_hash_udp_4tuple = true;
}
-static int efx_ef10_alloc_rss_context(struct efx_nic *efx, u32 *context,
- bool exclusive, unsigned *context_size)
+static int efx_ef10_alloc_rss_context(struct efx_nic *efx, bool exclusive,
+ struct efx_rss_context *ctx,
+ unsigned *context_size)
{
MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_ALLOC_IN_LEN);
MCDI_DECLARE_BUF(outbuf, MC_CMD_RSS_CONTEXT_ALLOC_OUT_LEN);
@@ -2739,7 +2739,7 @@ static int efx_ef10_alloc_rss_context(struct efx_nic *efx, u32 *context,
EFX_EF10_MAX_SHARED_RSS_CONTEXT_SIZE);
if (!exclusive && rss_spread == 1) {
- *context = EFX_EF10_RSS_CONTEXT_INVALID;
+ ctx->context_id = EFX_EF10_RSS_CONTEXT_INVALID;
if (context_size)
*context_size = 1;
return 0;
@@ -2762,29 +2762,26 @@ static int efx_ef10_alloc_rss_context(struct efx_nic *efx, u32 *context,
if (outlen < MC_CMD_RSS_CONTEXT_ALLOC_OUT_LEN)
return -EIO;
- *context = MCDI_DWORD(outbuf, RSS_CONTEXT_ALLOC_OUT_RSS_CONTEXT_ID);
+ ctx->context_id = MCDI_DWORD(outbuf, RSS_CONTEXT_ALLOC_OUT_RSS_CONTEXT_ID);
if (context_size)
*context_size = rss_spread;
if (nic_data->datapath_caps &
1 << MC_CMD_GET_CAPABILITIES_OUT_ADDITIONAL_RSS_MODES_LBN)
- efx_ef10_set_rss_flags(efx, *context);
+ efx_ef10_set_rss_flags(efx, ctx);
return 0;
}
-static void efx_ef10_free_rss_context(struct efx_nic *efx, u32 context)
+static int efx_ef10_free_rss_context(struct efx_nic *efx, u32 context)
{
MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_FREE_IN_LEN);
- int rc;
MCDI_SET_DWORD(inbuf, RSS_CONTEXT_FREE_IN_RSS_CONTEXT_ID,
context);
-
- rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_FREE, inbuf, sizeof(inbuf),
+ return efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_FREE, inbuf, sizeof(inbuf),
NULL, 0, NULL);
- WARN_ON(rc != 0);
}
static int efx_ef10_populate_rss_table(struct efx_nic *efx, u32 context,
@@ -2796,15 +2793,15 @@ static int efx_ef10_populate_rss_table(struct efx_nic *efx, u32 context,
MCDI_SET_DWORD(tablebuf, RSS_CONTEXT_SET_TABLE_IN_RSS_CONTEXT_ID,
context);
- BUILD_BUG_ON(ARRAY_SIZE(efx->rx_indir_table) !=
+ BUILD_BUG_ON(ARRAY_SIZE(efx->rss_context.rx_indir_table) !=
MC_CMD_RSS_CONTEXT_SET_TABLE_IN_INDIRECTION_TABLE_LEN);
- /* This iterates over the length of efx->rx_indir_table, but copies
- * bytes from rx_indir_table. That's because the latter is a pointer
- * rather than an array, but should have the same length.
- * The efx->rx_hash_key loop below is similar.
+ /* This iterates over the length of efx->rss_context.rx_indir_table, but
+ * copies bytes from rx_indir_table. That's because the latter is a
+ * pointer rather than an array, but should have the same length.
+ * The efx->rss_context.rx_hash_key loop below is similar.
*/
- for (i = 0; i < ARRAY_SIZE(efx->rx_indir_table); ++i)
+ for (i = 0; i < ARRAY_SIZE(efx->rss_context.rx_indir_table); ++i)
MCDI_PTR(tablebuf,
RSS_CONTEXT_SET_TABLE_IN_INDIRECTION_TABLE)[i] =
(u8) rx_indir_table[i];
@@ -2816,9 +2813,9 @@ static int efx_ef10_populate_rss_table(struct efx_nic *efx, u32 context,
MCDI_SET_DWORD(keybuf, RSS_CONTEXT_SET_KEY_IN_RSS_CONTEXT_ID,
context);
- BUILD_BUG_ON(ARRAY_SIZE(efx->rx_hash_key) !=
+ BUILD_BUG_ON(ARRAY_SIZE(efx->rss_context.rx_hash_key) !=
MC_CMD_RSS_CONTEXT_SET_KEY_IN_TOEPLITZ_KEY_LEN);
- for (i = 0; i < ARRAY_SIZE(efx->rx_hash_key); ++i)
+ for (i = 0; i < ARRAY_SIZE(efx->rss_context.rx_hash_key); ++i)
MCDI_PTR(keybuf, RSS_CONTEXT_SET_KEY_IN_TOEPLITZ_KEY)[i] = key[i];
return efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_SET_KEY, keybuf,
@@ -2827,27 +2824,27 @@ static int efx_ef10_populate_rss_table(struct efx_nic *efx, u32 context,
static void efx_ef10_rx_free_indir_table(struct efx_nic *efx)
{
- struct efx_ef10_nic_data *nic_data = efx->nic_data;
+ int rc;
- if (nic_data->rx_rss_context != EFX_EF10_RSS_CONTEXT_INVALID)
- efx_ef10_free_rss_context(efx, nic_data->rx_rss_context);
- nic_data->rx_rss_context = EFX_EF10_RSS_CONTEXT_INVALID;
+ if (efx->rss_context.context_id != EFX_EF10_RSS_CONTEXT_INVALID) {
+ rc = efx_ef10_free_rss_context(efx, efx->rss_context.context_id);
+ WARN_ON(rc != 0);
+ }
+ efx->rss_context.context_id = EFX_EF10_RSS_CONTEXT_INVALID;
}
static int efx_ef10_rx_push_shared_rss_config(struct efx_nic *efx,
unsigned *context_size)
{
- u32 new_rx_rss_context;
struct efx_ef10_nic_data *nic_data = efx->nic_data;
- int rc = efx_ef10_alloc_rss_context(efx, &new_rx_rss_context,
- false, context_size);
+ int rc = efx_ef10_alloc_rss_context(efx, false, &efx->rss_context,
+ context_size);
if (rc != 0)
return rc;
- nic_data->rx_rss_context = new_rx_rss_context;
nic_data->rx_rss_context_exclusive = false;
- efx_set_default_rx_indir_table(efx);
+ efx_set_default_rx_indir_table(efx, &efx->rss_context);
return 0;
}
@@ -2855,50 +2852,79 @@ static int efx_ef10_rx_push_exclusive_rss_config(struct efx_nic *efx,
const u32 *rx_indir_table,
const u8 *key)
{
+ u32 old_rx_rss_context = efx->rss_context.context_id;
struct efx_ef10_nic_data *nic_data = efx->nic_data;
int rc;
- u32 new_rx_rss_context;
- if (nic_data->rx_rss_context == EFX_EF10_RSS_CONTEXT_INVALID ||
+ if (efx->rss_context.context_id == EFX_EF10_RSS_CONTEXT_INVALID ||
!nic_data->rx_rss_context_exclusive) {
- rc = efx_ef10_alloc_rss_context(efx, &new_rx_rss_context,
- true, NULL);
+ rc = efx_ef10_alloc_rss_context(efx, true, &efx->rss_context,
+ NULL);
if (rc == -EOPNOTSUPP)
return rc;
else if (rc != 0)
goto fail1;
- } else {
- new_rx_rss_context = nic_data->rx_rss_context;
}
- rc = efx_ef10_populate_rss_table(efx, new_rx_rss_context,
+ rc = efx_ef10_populate_rss_table(efx, efx->rss_context.context_id,
rx_indir_table, key);
if (rc != 0)
goto fail2;
- if (nic_data->rx_rss_context != new_rx_rss_context)
- efx_ef10_rx_free_indir_table(efx);
- nic_data->rx_rss_context = new_rx_rss_context;
+ if (efx->rss_context.context_id != old_rx_rss_context &&
+ old_rx_rss_context != EFX_EF10_RSS_CONTEXT_INVALID)
+ WARN_ON(efx_ef10_free_rss_context(efx, old_rx_rss_context) != 0);
nic_data->rx_rss_context_exclusive = true;
- if (rx_indir_table != efx->rx_indir_table)
- memcpy(efx->rx_indir_table, rx_indir_table,
- sizeof(efx->rx_indir_table));
- if (key != efx->rx_hash_key)
- memcpy(efx->rx_hash_key, key, efx->type->rx_hash_key_size);
+ if (rx_indir_table != efx->rss_context.rx_indir_table)
+ memcpy(efx->rss_context.rx_indir_table, rx_indir_table,
+ sizeof(efx->rss_context.rx_indir_table));
+ if (key != efx->rss_context.rx_hash_key)
+ memcpy(efx->rss_context.rx_hash_key, key,
+ efx->type->rx_hash_key_size);
return 0;
fail2:
- if (new_rx_rss_context != nic_data->rx_rss_context)
- efx_ef10_free_rss_context(efx, new_rx_rss_context);
+ if (old_rx_rss_context != efx->rss_context.context_id) {
+ WARN_ON(efx_ef10_free_rss_context(efx, efx->rss_context.context_id) != 0);
+ efx->rss_context.context_id = old_rx_rss_context;
+ }
fail1:
netif_err(efx, hw, efx->net_dev, "%s: failed rc=%d\n", __func__, rc);
return rc;
}
-static int efx_ef10_rx_pull_rss_config(struct efx_nic *efx)
+static int efx_ef10_rx_push_rss_context_config(struct efx_nic *efx,
+ struct efx_rss_context *ctx,
+ const u32 *rx_indir_table,
+ const u8 *key)
+{
+ int rc;
+
+ if (ctx->context_id == EFX_EF10_RSS_CONTEXT_INVALID) {
+ rc = efx_ef10_alloc_rss_context(efx, true, ctx, NULL);
+ if (rc)
+ return rc;
+ }
+
+ if (!rx_indir_table) /* Delete this context */
+ return efx_ef10_free_rss_context(efx, ctx->context_id);
+
+ rc = efx_ef10_populate_rss_table(efx, ctx->context_id,
+ rx_indir_table, key);
+ if (rc)
+ return rc;
+
+ memcpy(ctx->rx_indir_table, rx_indir_table,
+ sizeof(efx->rss_context.rx_indir_table));
+ memcpy(ctx->rx_hash_key, key, efx->type->rx_hash_key_size);
+
+ return 0;
+}
+
+static int efx_ef10_rx_pull_rss_context_config(struct efx_nic *efx,
+ struct efx_rss_context *ctx)
{
- struct efx_ef10_nic_data *nic_data = efx->nic_data;
MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_GET_TABLE_IN_LEN);
MCDI_DECLARE_BUF(tablebuf, MC_CMD_RSS_CONTEXT_GET_TABLE_OUT_LEN);
MCDI_DECLARE_BUF(keybuf, MC_CMD_RSS_CONTEXT_GET_KEY_OUT_LEN);
@@ -2908,12 +2934,12 @@ static int efx_ef10_rx_pull_rss_config(struct efx_nic *efx)
BUILD_BUG_ON(MC_CMD_RSS_CONTEXT_GET_TABLE_IN_LEN !=
MC_CMD_RSS_CONTEXT_GET_KEY_IN_LEN);
- if (nic_data->rx_rss_context == EFX_EF10_RSS_CONTEXT_INVALID)
+ if (ctx->context_id == EFX_EF10_RSS_CONTEXT_INVALID)
return -ENOENT;
MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_TABLE_IN_RSS_CONTEXT_ID,
- nic_data->rx_rss_context);
- BUILD_BUG_ON(ARRAY_SIZE(efx->rx_indir_table) !=
+ ctx->context_id);
+ BUILD_BUG_ON(ARRAY_SIZE(ctx->rx_indir_table) !=
MC_CMD_RSS_CONTEXT_GET_TABLE_OUT_INDIRECTION_TABLE_LEN);
rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_GET_TABLE, inbuf, sizeof(inbuf),
tablebuf, sizeof(tablebuf), &outlen);
@@ -2923,13 +2949,13 @@ static int efx_ef10_rx_pull_rss_config(struct efx_nic *efx)
if (WARN_ON(outlen != MC_CMD_RSS_CONTEXT_GET_TABLE_OUT_LEN))
return -EIO;
- for (i = 0; i < ARRAY_SIZE(efx->rx_indir_table); i++)
- efx->rx_indir_table[i] = MCDI_PTR(tablebuf,
+ for (i = 0; i < ARRAY_SIZE(ctx->rx_indir_table); i++)
+ ctx->rx_indir_table[i] = MCDI_PTR(tablebuf,
RSS_CONTEXT_GET_TABLE_OUT_INDIRECTION_TABLE)[i];
MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_KEY_IN_RSS_CONTEXT_ID,
- nic_data->rx_rss_context);
- BUILD_BUG_ON(ARRAY_SIZE(efx->rx_hash_key) !=
+ ctx->context_id);
+ BUILD_BUG_ON(ARRAY_SIZE(ctx->rx_hash_key) !=
MC_CMD_RSS_CONTEXT_SET_KEY_IN_TOEPLITZ_KEY_LEN);
rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_GET_KEY, inbuf, sizeof(inbuf),
keybuf, sizeof(keybuf), &outlen);
@@ -2939,13 +2965,38 @@ static int efx_ef10_rx_pull_rss_config(struct efx_nic *efx)
if (WARN_ON(outlen != MC_CMD_RSS_CONTEXT_GET_KEY_OUT_LEN))
return -EIO;
- for (i = 0; i < ARRAY_SIZE(efx->rx_hash_key); ++i)
- efx->rx_hash_key[i] = MCDI_PTR(
+ for (i = 0; i < ARRAY_SIZE(ctx->rx_hash_key); ++i)
+ ctx->rx_hash_key[i] = MCDI_PTR(
keybuf, RSS_CONTEXT_GET_KEY_OUT_TOEPLITZ_KEY)[i];
return 0;
}
+static int efx_ef10_rx_pull_rss_config(struct efx_nic *efx)
+{
+ return efx_ef10_rx_pull_rss_context_config(efx, &efx->rss_context);
+}
+
+static void efx_ef10_rx_restore_rss_contexts(struct efx_nic *efx)
+{
+ struct efx_rss_context *ctx;
+ int rc;
+
+ list_for_each_entry(ctx, &efx->rss_context.list, list) {
+ /* previous NIC RSS context is gone */
+ ctx->context_id = EFX_EF10_RSS_CONTEXT_INVALID;
+ /* so try to allocate a new one */
+ rc = efx_ef10_rx_push_rss_context_config(efx, ctx,
+ ctx->rx_indir_table,
+ ctx->rx_hash_key);
+ if (rc)
+ netif_warn(efx, probe, efx->net_dev,
+ "failed to restore RSS context %u, rc=%d"
+ "; RSS filters may fail to be applied\n",
+ ctx->user_id, rc);
+ }
+}
+
static int efx_ef10_pf_rx_push_rss_config(struct efx_nic *efx, bool user,
const u32 *rx_indir_table,
const u8 *key)
@@ -2956,7 +3007,7 @@ static int efx_ef10_pf_rx_push_rss_config(struct efx_nic *efx, bool user,
return 0;
if (!key)
- key = efx->rx_hash_key;
+ key = efx->rss_context.rx_hash_key;
rc = efx_ef10_rx_push_exclusive_rss_config(efx, rx_indir_table, key);
@@ -2965,7 +3016,8 @@ static int efx_ef10_pf_rx_push_rss_config(struct efx_nic *efx, bool user,
bool mismatch = false;
size_t i;
- for (i = 0; i < ARRAY_SIZE(efx->rx_indir_table) && !mismatch;
+ for (i = 0;
+ i < ARRAY_SIZE(efx->rss_context.rx_indir_table) && !mismatch;
i++)
mismatch = rx_indir_table[i] !=
ethtool_rxfh_indir_default(i, efx->rss_spread);
@@ -3000,11 +3052,9 @@ static int efx_ef10_vf_rx_push_rss_config(struct efx_nic *efx, bool user,
const u8 *key
__attribute__ ((unused)))
{
- struct efx_ef10_nic_data *nic_data = efx->nic_data;
-
if (user)
return -EOPNOTSUPP;
- if (nic_data->rx_rss_context != EFX_EF10_RSS_CONTEXT_INVALID)
+ if (efx->rss_context.context_id != EFX_EF10_RSS_CONTEXT_INVALID)
return 0;
return efx_ef10_rx_push_shared_rss_config(efx, NULL);
}
@@ -4109,6 +4159,7 @@ efx_ef10_filter_push_prep_set_match_fields(struct efx_nic *efx,
static void efx_ef10_filter_push_prep(struct efx_nic *efx,
const struct efx_filter_spec *spec,
efx_dword_t *inbuf, u64 handle,
+ struct efx_rss_context *ctx,
bool replacing)
{
struct efx_ef10_nic_data *nic_data = efx->nic_data;
@@ -4116,11 +4167,16 @@ static void efx_ef10_filter_push_prep(struct efx_nic *efx,
memset(inbuf, 0, MC_CMD_FILTER_OP_EXT_IN_LEN);
- /* Remove RSS flag if we don't have an RSS context. */
- if (flags & EFX_FILTER_FLAG_RX_RSS &&
- spec->rss_context == EFX_FILTER_RSS_CONTEXT_DEFAULT &&
- nic_data->rx_rss_context == EFX_EF10_RSS_CONTEXT_INVALID)
- flags &= ~EFX_FILTER_FLAG_RX_RSS;
+ /* If RSS filter, caller better have given us an RSS context */
+ if (flags & EFX_FILTER_FLAG_RX_RSS) {
+ /* We don't have the ability to return an error, so we'll just
+ * log a warning and disable RSS for the filter.
+ */
+ if (WARN_ON_ONCE(!ctx))
+ flags &= ~EFX_FILTER_FLAG_RX_RSS;
+ else if (WARN_ON_ONCE(ctx->context_id == EFX_EF10_RSS_CONTEXT_INVALID))
+ flags &= ~EFX_FILTER_FLAG_RX_RSS;
+ }
if (replacing) {
MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP,
@@ -4146,21 +4202,18 @@ static void efx_ef10_filter_push_prep(struct efx_nic *efx,
MC_CMD_FILTER_OP_IN_RX_MODE_RSS :
MC_CMD_FILTER_OP_IN_RX_MODE_SIMPLE);
if (flags & EFX_FILTER_FLAG_RX_RSS)
- MCDI_SET_DWORD(inbuf, FILTER_OP_IN_RX_CONTEXT,
- spec->rss_context !=
- EFX_FILTER_RSS_CONTEXT_DEFAULT ?
- spec->rss_context : nic_data->rx_rss_context);
+ MCDI_SET_DWORD(inbuf, FILTER_OP_IN_RX_CONTEXT, ctx->context_id);
}
static int efx_ef10_filter_push(struct efx_nic *efx,
- const struct efx_filter_spec *spec,
- u64 *handle, bool replacing)
+ const struct efx_filter_spec *spec, u64 *handle,
+ struct efx_rss_context *ctx, bool replacing)
{
MCDI_DECLARE_BUF(inbuf, MC_CMD_FILTER_OP_EXT_IN_LEN);
MCDI_DECLARE_BUF(outbuf, MC_CMD_FILTER_OP_EXT_OUT_LEN);
int rc;
- efx_ef10_filter_push_prep(efx, spec, inbuf, *handle, replacing);
+ efx_ef10_filter_push_prep(efx, spec, inbuf, *handle, ctx, replacing);
rc = efx_mcdi_rpc(efx, MC_CMD_FILTER_OP, inbuf, sizeof(inbuf),
outbuf, sizeof(outbuf), NULL);
if (rc == 0)
@@ -4253,6 +4306,7 @@ static s32 efx_ef10_filter_insert(struct efx_nic *efx,
DECLARE_BITMAP(mc_rem_map, EFX_EF10_FILTER_SEARCH_LIMIT);
struct efx_filter_spec *saved_spec;
unsigned int match_pri, hash;
+ struct efx_rss_context *ctx;
unsigned int priv_flags;
bool replacing = false;
int ins_index = -1;
@@ -4275,6 +4329,18 @@ static s32 efx_ef10_filter_insert(struct efx_nic *efx,
if (is_mc_recip)
bitmap_zero(mc_rem_map, EFX_EF10_FILTER_SEARCH_LIMIT);
+ if (spec->flags & EFX_FILTER_FLAG_RX_RSS) {
+ if (spec->rss_context)
+ ctx = efx_find_rss_context_entry(spec->rss_context,
+ &efx->rss_context.list);
+ else
+ ctx = &efx->rss_context;
+ if (!ctx)
+ return -ENOENT;
+ if (ctx->context_id == EFX_EF10_RSS_CONTEXT_INVALID)
+ return -EOPNOTSUPP;
+ }
+
/* Find any existing filters with the same match tuple or
* else a free slot to insert at. If any of them are busy,
* we have to wait and retry.
@@ -4390,7 +4456,7 @@ static s32 efx_ef10_filter_insert(struct efx_nic *efx,
spin_unlock_bh(&efx->filter_lock);
rc = efx_ef10_filter_push(efx, spec, &table->entry[ins_index].handle,
- replacing);
+ ctx, replacing);
/* Finalise the software table entry */
spin_lock_bh(&efx->filter_lock);
@@ -4534,12 +4600,13 @@ static int efx_ef10_filter_remove_internal(struct efx_nic *efx,
new_spec.priority = EFX_FILTER_PRI_AUTO;
new_spec.flags = (EFX_FILTER_FLAG_RX |
- (efx_rss_enabled(efx) ?
+ (efx_rss_active(&efx->rss_context) ?
EFX_FILTER_FLAG_RX_RSS : 0));
new_spec.dmaq_id = 0;
- new_spec.rss_context = EFX_FILTER_RSS_CONTEXT_DEFAULT;
+ new_spec.rss_context = 0;
rc = efx_ef10_filter_push(efx, &new_spec,
&table->entry[filter_idx].handle,
+ &efx->rss_context,
true);
spin_lock_bh(&efx->filter_lock);
@@ -4783,7 +4850,8 @@ static s32 efx_ef10_filter_rfs_insert(struct efx_nic *efx,
cookie = replacing << 31 | ins_index << 16 | spec->dmaq_id;
efx_ef10_filter_push_prep(efx, spec, inbuf,
- table->entry[ins_index].handle, replacing);
+ table->entry[ins_index].handle, NULL,
+ replacing);
efx_mcdi_rpc_async(efx, MC_CMD_FILTER_OP, inbuf, sizeof(inbuf),
MC_CMD_FILTER_OP_OUT_LEN,
efx_ef10_filter_rfs_insert_complete, cookie);
@@ -5104,6 +5172,7 @@ static void efx_ef10_filter_table_restore(struct efx_nic *efx)
unsigned int invalid_filters = 0, failed = 0;
struct efx_ef10_filter_vlan *vlan;
struct efx_filter_spec *spec;
+ struct efx_rss_context *ctx;
unsigned int filter_idx;
u32 mcdi_flags;
int match_pri;
@@ -5133,17 +5202,34 @@ static void efx_ef10_filter_table_restore(struct efx_nic *efx)
invalid_filters++;
goto not_restored;
}
- if (spec->rss_context != EFX_FILTER_RSS_CONTEXT_DEFAULT &&
- spec->rss_context != nic_data->rx_rss_context)
- netif_warn(efx, drv, efx->net_dev,
- "Warning: unable to restore a filter with specific RSS context.\n");
+ if (spec->rss_context)
+ ctx = efx_find_rss_context_entry(spec->rss_context,
+ &efx->rss_context.list);
+ else
+ ctx = &efx->rss_context;
+ if (spec->flags & EFX_FILTER_FLAG_RX_RSS) {
+ if (!ctx) {
+ netif_warn(efx, drv, efx->net_dev,
+ "Warning: unable to restore a filter with nonexistent RSS context %u.\n",
+ spec->rss_context);
+ invalid_filters++;
+ goto not_restored;
+ }
+ if (ctx->context_id == EFX_EF10_RSS_CONTEXT_INVALID) {
+ netif_warn(efx, drv, efx->net_dev,
+ "Warning: unable to restore a filter with RSS context %u as it was not created.\n",
+ spec->rss_context);
+ invalid_filters++;
+ goto not_restored;
+ }
+ }
table->entry[filter_idx].spec |= EFX_EF10_FILTER_FLAG_BUSY;
spin_unlock_bh(&efx->filter_lock);
rc = efx_ef10_filter_push(efx, spec,
&table->entry[filter_idx].handle,
- false);
+ ctx, false);
if (rc)
failed++;
spin_lock_bh(&efx->filter_lock);
@@ -6784,6 +6870,9 @@ const struct efx_nic_type efx_hunt_a0_nic_type = {
.tx_limit_len = efx_ef10_tx_limit_len,
.rx_push_rss_config = efx_ef10_pf_rx_push_rss_config,
.rx_pull_rss_config = efx_ef10_rx_pull_rss_config,
+ .rx_push_rss_context_config = efx_ef10_rx_push_rss_context_config,
+ .rx_pull_rss_context_config = efx_ef10_rx_pull_rss_context_config,
+ .rx_restore_rss_contexts = efx_ef10_rx_restore_rss_contexts,
.rx_probe = efx_ef10_rx_probe,
.rx_init = efx_ef10_rx_init,
.rx_remove = efx_ef10_rx_remove,
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 16757cfc5b29..7321a4cf6f4d 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -1353,12 +1353,13 @@ static void efx_fini_io(struct efx_nic *efx)
pci_disable_device(efx->pci_dev);
}
-void efx_set_default_rx_indir_table(struct efx_nic *efx)
+void efx_set_default_rx_indir_table(struct efx_nic *efx,
+ struct efx_rss_context *ctx)
{
size_t i;
- for (i = 0; i < ARRAY_SIZE(efx->rx_indir_table); i++)
- efx->rx_indir_table[i] =
+ for (i = 0; i < ARRAY_SIZE(ctx->rx_indir_table); i++)
+ ctx->rx_indir_table[i] =
ethtool_rxfh_indir_default(i, efx->rss_spread);
}
@@ -1739,9 +1740,9 @@ static int efx_probe_nic(struct efx_nic *efx)
} while (rc == -EAGAIN);
if (efx->n_channels > 1)
- netdev_rss_key_fill(&efx->rx_hash_key,
- sizeof(efx->rx_hash_key));
- efx_set_default_rx_indir_table(efx);
+ netdev_rss_key_fill(efx->rss_context.rx_hash_key,
+ sizeof(efx->rss_context.rx_hash_key));
+ efx_set_default_rx_indir_table(efx, &efx->rss_context);
netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels);
netif_set_real_num_rx_queues(efx->net_dev, efx->n_rx_channels);
@@ -2700,6 +2701,8 @@ int efx_reset_up(struct efx_nic *efx, enum reset_type method, bool ok)
" VFs may not function\n", rc);
#endif
+ if (efx->type->rx_restore_rss_contexts)
+ efx->type->rx_restore_rss_contexts(efx);
down_read(&efx->filter_sem);
efx_restore_filters(efx);
up_read(&efx->filter_sem);
@@ -3003,6 +3006,7 @@ static int efx_init_struct(struct efx_nic *efx,
efx->type->rx_hash_offset - efx->type->rx_prefix_size;
efx->rx_packet_ts_offset =
efx->type->rx_ts_offset - efx->type->rx_prefix_size;
+ INIT_LIST_HEAD(&efx->rss_context.list);
spin_lock_init(&efx->stats_lock);
efx->vi_stride = EFX_DEFAULT_VI_STRIDE;
efx->num_mac_stats = MC_CMD_MAC_NSTATS;
@@ -3072,6 +3076,55 @@ void efx_update_sw_stats(struct efx_nic *efx, u64 *stats)
stats[GENERIC_STAT_rx_noskb_drops] = atomic_read(&efx->n_rx_noskb_drops);
}
+/* RSS contexts. We're using linked lists and crappy O(n) algorithms, because
+ * (a) this is an infrequent control-plane operation and (b) n is small (max 64)
+ */
+struct efx_rss_context *efx_alloc_rss_context_entry(struct list_head *head)
+{
+ struct efx_rss_context *ctx, *new;
+ u32 id = 1; /* Don't use zero, that refers to the master RSS context */
+
+ /* Search for first gap in the numbering */
+ list_for_each_entry(ctx, head, list) {
+ if (ctx->user_id != id)
+ break;
+ id++;
+ /* Check for wrap. If this happens, we have nearly 2^32
+ * allocated RSS contexts, which seems unlikely.
+ */
+ if (WARN_ON_ONCE(!id))
+ return NULL;
+ }
+
+ /* Create the new entry */
+ new = kmalloc(sizeof(struct efx_rss_context), GFP_KERNEL);
+ if (!new)
+ return NULL;
+ new->context_id = EFX_EF10_RSS_CONTEXT_INVALID;
+ new->rx_hash_udp_4tuple = false;
+
+ /* Insert the new entry into the gap */
+ new->user_id = id;
+ list_add_tail(&new->list, &ctx->list);
+ return new;
+}
+
+struct efx_rss_context *efx_find_rss_context_entry(u32 id, struct list_head *head)
+{
+ struct efx_rss_context *ctx;
+
+ list_for_each_entry(ctx, head, list)
+ if (ctx->user_id == id)
+ return ctx;
+ return NULL;
+}
+
+void efx_free_rss_context_entry(struct efx_rss_context *ctx)
+{
+ list_del(&ctx->list);
+ kfree(ctx);
+}
+
/**************************************************************************
*
* PCI interface
diff --git a/drivers/net/ethernet/sfc/efx.h b/drivers/net/ethernet/sfc/efx.h
index 0cddc5ad77b1..3429ae3f3b08 100644
--- a/drivers/net/ethernet/sfc/efx.h
+++ b/drivers/net/ethernet/sfc/efx.h
@@ -34,7 +34,8 @@ extern unsigned int efx_piobuf_size;
extern bool efx_separate_tx_channels;
/* RX */
-void efx_set_default_rx_indir_table(struct efx_nic *efx);
+void efx_set_default_rx_indir_table(struct efx_nic *efx,
+ struct efx_rss_context *ctx);
void efx_rx_config_page_split(struct efx_nic *efx);
int efx_probe_rx_queue(struct efx_rx_queue *rx_queue);
void efx_remove_rx_queue(struct efx_rx_queue *rx_queue);
@@ -182,6 +183,15 @@ static inline void efx_filter_rfs_expire(struct efx_channel *channel) {}
#endif
bool efx_filter_is_mc_recipient(const struct efx_filter_spec *spec);
+/* RSS contexts */
+struct efx_rss_context *efx_alloc_rss_context_entry(struct list_head *list);
+struct efx_rss_context *efx_find_rss_context_entry(u32 id, struct list_head *list);
+void efx_free_rss_context_entry(struct efx_rss_context *ctx);
+static inline bool efx_rss_active(struct efx_rss_context *ctx)
+{
+ return ctx->context_id != EFX_EF10_RSS_CONTEXT_INVALID;
+}
+
/* Channels */
int efx_channel_dummy_op_int(struct efx_channel *channel);
void efx_channel_dummy_op_void(struct efx_channel *channel);
diff --git a/drivers/net/ethernet/sfc/ethtool.c b/drivers/net/ethernet/sfc/ethtool.c
index 4db2dc2bf52f..64049e71e6e7 100644
--- a/drivers/net/ethernet/sfc/ethtool.c
+++ b/drivers/net/ethernet/sfc/ethtool.c
@@ -808,7 +808,8 @@ static inline void ip6_fill_mask(__be32 *mask)
}
static int efx_ethtool_get_class_rule(struct efx_nic *efx,
- struct ethtool_rx_flow_spec *rule)
+ struct ethtool_rx_flow_spec *rule,
+ u32 *rss_context)
{
struct ethtool_tcpip4_spec *ip_entry = &rule->h_u.tcp_ip4_spec;
struct ethtool_tcpip4_spec *ip_mask = &rule->m_u.tcp_ip4_spec;
@@ -964,6 +965,11 @@ static int efx_ethtool_get_class_rule(struct efx_nic *efx,
rule->m_ext.vlan_tci = htons(0xfff);
}
+ if (spec.flags & EFX_FILTER_FLAG_RX_RSS) {
+ rule->flow_type |= FLOW_RSS;
+ *rss_context = spec.rss_context;
+ }
+
return rc;
}
@@ -972,6 +978,8 @@ efx_ethtool_get_rxnfc(struct net_device *net_dev,
struct ethtool_rxnfc *info, u32 *rule_locs)
{
struct efx_nic *efx = netdev_priv(net_dev);
+ u32 rss_context = 0;
+ s32 rc;
switch (info->cmd) {
case ETHTOOL_GRXRINGS:
@@ -979,12 +987,20 @@ efx_ethtool_get_rxnfc(struct net_device *net_dev,
return 0;
case ETHTOOL_GRXFH: {
+ struct efx_rss_context *ctx = &efx->rss_context;
+
+ if (info->flow_type & FLOW_RSS && info->rss_context) {
+ ctx = efx_find_rss_context_entry(info->rss_context,
+ &efx->rss_context.list);
+ if (!ctx)
+ return -ENOENT;
+ }
info->data = 0;
- if (!efx->rss_active) /* No RSS */
+ if (!efx_rss_active(ctx)) /* No RSS */
return 0;
- switch (info->flow_type) {
+ switch (info->flow_type & ~FLOW_RSS) {
case UDP_V4_FLOW:
- if (efx->rx_hash_udp_4tuple)
+ if (ctx->rx_hash_udp_4tuple)
/* fall through */
case TCP_V4_FLOW:
info->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
@@ -995,7 +1011,7 @@ efx_ethtool_get_rxnfc(struct net_device *net_dev,
info->data |= RXH_IP_SRC | RXH_IP_DST;
break;
case UDP_V6_FLOW:
- if (efx->rx_hash_udp_4tuple)
+ if (ctx->rx_hash_udp_4tuple)
/* fall through */
case TCP_V6_FLOW:
info->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
@@ -1023,10 +1039,14 @@ efx_ethtool_get_rxnfc(struct net_device *net_dev,
case ETHTOOL_GRXCLSRULE:
if (efx_filter_get_rx_id_limit(efx) == 0)
return -EOPNOTSUPP;
- return efx_ethtool_get_class_rule(efx, &info->fs);
+ rc = efx_ethtool_get_class_rule(efx, &info->fs, &rss_context);
+ if (rc < 0)
+ return rc;
+ if (info->fs.flow_type & FLOW_RSS)
+ info->rss_context = rss_context;
+ return 0;
- case ETHTOOL_GRXCLSRLALL: {
- s32 rc;
+ case ETHTOOL_GRXCLSRLALL:
info->data = efx_filter_get_rx_id_limit(efx);
if (info->data == 0)
return -EOPNOTSUPP;
@@ -1036,7 +1056,6 @@ efx_ethtool_get_rxnfc(struct net_device *net_dev,
return rc;
info->rule_cnt = rc;
return 0;
- }
default:
return -EOPNOTSUPP;
@@ -1054,7 +1073,8 @@ static inline bool ip6_mask_is_empty(__be32 mask[4])
}
static int efx_ethtool_set_class_rule(struct efx_nic *efx,
- struct ethtool_rx_flow_spec *rule)
+ struct ethtool_rx_flow_spec *rule,
+ u32 rss_context)
{
struct ethtool_tcpip4_spec *ip_entry = &rule->h_u.tcp_ip4_spec;
struct ethtool_tcpip4_spec *ip_mask = &rule->m_u.tcp_ip4_spec;
@@ -1066,6 +1086,7 @@ static int efx_ethtool_set_class_rule(struct efx_nic *efx,
struct ethtool_usrip6_spec *uip6_mask = &rule->m_u.usr_ip6_spec;
struct ethhdr *mac_entry = &rule->h_u.ether_spec;
struct ethhdr *mac_mask = &rule->m_u.ether_spec;
+ enum efx_filter_flags flags = 0;
struct efx_filter_spec spec;
int rc;
@@ -1084,12 +1105,19 @@ static int efx_ethtool_set_class_rule(struct efx_nic *efx,
rule->m_ext.data[1]))
return -EINVAL;
- efx_filter_init_rx(&spec, EFX_FILTER_PRI_MANUAL,
- efx->rx_scatter ? EFX_FILTER_FLAG_RX_SCATTER : 0,
+ if (efx->rx_scatter)
+ flags |= EFX_FILTER_FLAG_RX_SCATTER;
+ if (rule->flow_type & FLOW_RSS)
+ flags |= EFX_FILTER_FLAG_RX_RSS;
+
+ efx_filter_init_rx(&spec, EFX_FILTER_PRI_MANUAL, flags,
(rule->ring_cookie == RX_CLS_FLOW_DISC) ?
EFX_FILTER_RX_DMAQ_ID_DROP : rule->ring_cookie);
- switch (rule->flow_type & ~FLOW_EXT) {
+ if (rule->flow_type & FLOW_RSS)
+ spec.rss_context = rss_context;
+
+ switch (rule->flow_type & ~(FLOW_EXT | FLOW_RSS)) {
case TCP_V4_FLOW:
case UDP_V4_FLOW:
spec.match_flags = (EFX_FILTER_MATCH_ETHER_TYPE |
@@ -1265,7 +1293,8 @@ static int efx_ethtool_set_rxnfc(struct net_device *net_dev,
switch (info->cmd) {
case ETHTOOL_SRXCLSRLINS:
- return efx_ethtool_set_class_rule(efx, &info->fs);
+ return efx_ethtool_set_class_rule(efx, &info->fs,
+ info->rss_context);
case ETHTOOL_SRXCLSRLDEL:
return efx_filter_remove_id_safe(efx, EFX_FILTER_PRI_MANUAL,
@@ -1280,7 +1309,9 @@ static u32 efx_ethtool_get_rxfh_indir_size(struct net_device *net_dev)
{
struct efx_nic *efx = netdev_priv(net_dev);
- return (efx->n_rx_channels == 1) ? 0 : ARRAY_SIZE(efx->rx_indir_table);
+ if (efx->n_rx_channels == 1)
+ return 0;
+ return ARRAY_SIZE(efx->rss_context.rx_indir_table);
}
static u32 efx_ethtool_get_rxfh_key_size(struct net_device *net_dev)
@@ -1303,9 +1334,11 @@ static int efx_ethtool_get_rxfh(struct net_device *net_dev, u32 *indir, u8 *key,
if (hfunc)
*hfunc = ETH_RSS_HASH_TOP;
if (indir)
- memcpy(indir, efx->rx_indir_table, sizeof(efx->rx_indir_table));
+ memcpy(indir, efx->rss_context.rx_indir_table,
+ sizeof(efx->rss_context.rx_indir_table));
if (key)
- memcpy(key, efx->rx_hash_key, efx->type->rx_hash_key_size);
+ memcpy(key, efx->rss_context.rx_hash_key,
+ efx->type->rx_hash_key_size);
return 0;
}
@@ -1321,13 +1354,93 @@ static int efx_ethtool_set_rxfh(struct net_device *net_dev, const u32 *indir,
return 0;
if (!key)
- key = efx->rx_hash_key;
+ key = efx->rss_context.rx_hash_key;
if (!indir)
- indir = efx->rx_indir_table;
+ indir = efx->rss_context.rx_indir_table;
return efx->type->rx_push_rss_config(efx, true, indir, key);
}
+static int efx_ethtool_get_rxfh_context(struct net_device *net_dev, u32 *indir,
+ u8 *key, u8 *hfunc, u32 rss_context)
+{
+ struct efx_nic *efx = netdev_priv(net_dev);
+ struct efx_rss_context *ctx;
+ int rc;
+
+ if (!efx->type->rx_pull_rss_context_config)
+ return -EOPNOTSUPP;
+ ctx = efx_find_rss_context_entry(rss_context, &efx->rss_context.list);
+ if (!ctx)
+ return -ENOENT;
+ rc = efx->type->rx_pull_rss_context_config(efx, ctx);
+ if (rc)
+ return rc;
+
+ if (hfunc)
+ *hfunc = ETH_RSS_HASH_TOP;
+ if (indir)
+ memcpy(indir, ctx->rx_indir_table, sizeof(ctx->rx_indir_table));
+ if (key)
+ memcpy(key, ctx->rx_hash_key, efx->type->rx_hash_key_size);
+ return 0;
+}
+
+static int efx_ethtool_set_rxfh_context(struct net_device *net_dev,
+ const u32 *indir, const u8 *key,
+ const u8 hfunc, u32 *rss_context,
+ bool delete)
+{
+ struct efx_nic *efx = netdev_priv(net_dev);
+ struct efx_rss_context *ctx;
+ bool allocated = false;
+ int rc;
+
+ if (!efx->type->rx_push_rss_context_config)
+ return -EOPNOTSUPP;
+ /* Hash function is Toeplitz, cannot be changed */
+ if (hfunc != ETH_RSS_HASH_NO_CHANGE && hfunc != ETH_RSS_HASH_TOP)
+ return -EOPNOTSUPP;
+ if (*rss_context == ETH_RXFH_CONTEXT_ALLOC) {
+ if (delete)
+ /* alloc + delete == Nothing to do */
+ return -EINVAL;
+ ctx = efx_alloc_rss_context_entry(&efx->rss_context.list);
+ if (!ctx)
+ return -ENOMEM;
+ ctx->context_id = EFX_EF10_RSS_CONTEXT_INVALID;
+ /* Initialise indir table and key to defaults */
+ efx_set_default_rx_indir_table(efx, ctx);
+ netdev_rss_key_fill(ctx->rx_hash_key, sizeof(ctx->rx_hash_key));
+ allocated = true;
+ } else {
+ ctx = efx_find_rss_context_entry(*rss_context,
+ &efx->rss_context.list);
+ if (!ctx)
+ return -ENOENT;
+ }
+
+ if (delete) {
+ /* delete this context */
+ rc = efx->type->rx_push_rss_context_config(efx, ctx, NULL, NULL);
+ if (!rc)
+ efx_free_rss_context_entry(ctx);
+ return rc;
+ }
+
+ if (!key)
+ key = ctx->rx_hash_key;
+ if (!indir)
+ indir = ctx->rx_indir_table;
+
+ rc = efx->type->rx_push_rss_context_config(efx, ctx, indir, key);
+ if (rc && allocated)
+ efx_free_rss_context_entry(ctx);
+ else
+ *rss_context = ctx->user_id;
+ return rc;
+}
+
static int efx_ethtool_get_ts_info(struct net_device *net_dev,
struct ethtool_ts_info *ts_info)
{
@@ -1403,6 +1516,8 @@ const struct ethtool_ops efx_ethtool_ops = {
.get_rxfh_key_size = efx_ethtool_get_rxfh_key_size,
.get_rxfh = efx_ethtool_get_rxfh,
.set_rxfh = efx_ethtool_set_rxfh,
+ .get_rxfh_context = efx_ethtool_get_rxfh_context,
+ .set_rxfh_context = efx_ethtool_set_rxfh_context,
.get_ts_info = efx_ethtool_get_ts_info,
.get_module_info = efx_ethtool_get_module_info,
.get_module_eeprom = efx_ethtool_get_module_eeprom,
diff --git a/drivers/net/ethernet/sfc/farch.c b/drivers/net/ethernet/sfc/farch.c
index 266b9bee1f3a..ad001e77d554 100644
--- a/drivers/net/ethernet/sfc/farch.c
+++ b/drivers/net/ethernet/sfc/farch.c
@@ -1630,12 +1630,12 @@ void efx_farch_rx_push_indir_table(struct efx_nic *efx)
size_t i = 0;
efx_dword_t dword;
- BUILD_BUG_ON(ARRAY_SIZE(efx->rx_indir_table) !=
+ BUILD_BUG_ON(ARRAY_SIZE(efx->rss_context.rx_indir_table) !=
FR_BZ_RX_INDIRECTION_TBL_ROWS);
for (i = 0; i < FR_BZ_RX_INDIRECTION_TBL_ROWS; i++) {
EFX_POPULATE_DWORD_1(dword, FRF_BZ_IT_QUEUE,
- efx->rx_indir_table[i]);
+ efx->rss_context.rx_indir_table[i]);
efx_writed(efx, &dword,
FR_BZ_RX_INDIRECTION_TBL +
FR_BZ_RX_INDIRECTION_TBL_STEP * i);
@@ -1647,14 +1647,14 @@ void efx_farch_rx_pull_indir_table(struct efx_nic *efx)
size_t i = 0;
efx_dword_t dword;
- BUILD_BUG_ON(ARRAY_SIZE(efx->rx_indir_table) !=
+ BUILD_BUG_ON(ARRAY_SIZE(efx->rss_context.rx_indir_table) !=
FR_BZ_RX_INDIRECTION_TBL_ROWS);
for (i = 0; i < FR_BZ_RX_INDIRECTION_TBL_ROWS; i++) {
efx_readd(efx, &dword,
FR_BZ_RX_INDIRECTION_TBL +
FR_BZ_RX_INDIRECTION_TBL_STEP * i);
- efx->rx_indir_table[i] = EFX_DWORD_FIELD(dword, FRF_BZ_IT_QUEUE);
+ efx->rss_context.rx_indir_table[i] = EFX_DWORD_FIELD(dword, FRF_BZ_IT_QUEUE);
}
}
@@ -2032,8 +2032,7 @@ efx_farch_filter_from_gen_spec(struct efx_farch_filter_spec *spec,
{
bool is_full = false;
- if ((gen_spec->flags & EFX_FILTER_FLAG_RX_RSS) &&
- gen_spec->rss_context != EFX_FILTER_RSS_CONTEXT_DEFAULT)
+ if ((gen_spec->flags & EFX_FILTER_FLAG_RX_RSS) && gen_spec->rss_context)
return -EINVAL;
spec->priority = gen_spec->priority;
diff --git a/drivers/net/ethernet/sfc/filter.h b/drivers/net/ethernet/sfc/filter.h
index 8189a1cd973f..59021ad6d98d 100644
--- a/drivers/net/ethernet/sfc/filter.h
+++ b/drivers/net/ethernet/sfc/filter.h
@@ -125,7 +125,9 @@ enum efx_encap_type {
* @match_flags: Match type flags, from &enum efx_filter_match_flags
* @priority: Priority of the filter, from &enum efx_filter_priority
* @flags: Miscellaneous flags, from &enum efx_filter_flags
- * @rss_context: RSS context to use, if %EFX_FILTER_FLAG_RX_RSS is set
+ * @rss_context: RSS context to use, if %EFX_FILTER_FLAG_RX_RSS is set. This
+ * is a user_id (with 0 meaning the driver/default RSS context), not an
+ * MCFW context_id.
* @dmaq_id: Source/target queue index, or %EFX_FILTER_RX_DMAQ_ID_DROP for
* an RX drop filter
* @outer_vid: Outer VLAN ID to match, if %EFX_FILTER_MATCH_OUTER_VID is set
@@ -173,7 +175,6 @@ struct efx_filter_spec {
};
enum {
- EFX_FILTER_RSS_CONTEXT_DEFAULT = 0xffffffff,
EFX_FILTER_RX_DMAQ_ID_DROP = 0xfff
};
@@ -185,7 +186,7 @@ static inline void efx_filter_init_rx(struct efx_filter_spec *spec,
memset(spec, 0, sizeof(*spec));
spec->priority = priority;
spec->flags = EFX_FILTER_FLAG_RX | flags;
- spec->rss_context = EFX_FILTER_RSS_CONTEXT_DEFAULT;
+ spec->rss_context = 0;
spec->dmaq_id = rxq_id;
}
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index d20a8660ee48..203d64c88de5 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -704,6 +704,28 @@ union efx_multicast_hash {
struct vfdi_status;
+/* The reserved RSS context value */
+#define EFX_EF10_RSS_CONTEXT_INVALID 0xffffffff
+/**
+ * struct efx_rss_context - A user-defined RSS context for filtering
+ * @list: node of linked list on which this struct is stored
+ * @context_id: the RSS_CONTEXT_ID returned by MC firmware, or
+ * %EFX_EF10_RSS_CONTEXT_INVALID if this context is not present on the NIC.
+ * For Siena, 0 if RSS is active, else %EFX_EF10_RSS_CONTEXT_INVALID.
+ * @user_id: the rss_context ID exposed to userspace over ethtool.
+ * @rx_hash_udp_4tuple: UDP 4-tuple hashing enabled
+ * @rx_hash_key: Toeplitz hash key for this RSS context
+ * @rx_indir_table: Indirection table for this RSS context
+ */
+struct efx_rss_context {
+ struct list_head list;
+ u32 context_id;
+ u32 user_id;
+ bool rx_hash_udp_4tuple;
+ u8 rx_hash_key[40];
+ u32 rx_indir_table[128];
+};
+
/**
* struct efx_nic - an Efx NIC
* @name: Device name (net device name or bus id before net device registered)
@@ -764,11 +786,9 @@ struct vfdi_status;
* (valid only for NICs that set %EFX_RX_PKT_PREFIX_LEN; always negative)
* @rx_packet_ts_offset: Offset of timestamp from start of packet data
* (valid only if channel->sync_timestamps_enabled; always negative)
- * @rx_hash_key: Toeplitz hash key for RSS
- * @rx_indir_table: Indirection table for RSS
* @rx_scatter: Scatter mode enabled for receives
- * @rss_active: RSS enabled on hardware
- * @rx_hash_udp_4tuple: UDP 4-tuple hashing enabled
+ * @rss_context: Main RSS context. Its @list member is the head of the list of
+ * RSS contexts created by user requests
* @int_error_count: Number of internal errors seen recently
* @int_error_expire: Time at which error count will be expired
* @irq_soft_enabled: Are IRQs soft-enabled? If not, IRQ handler will
@@ -909,11 +929,8 @@ struct efx_nic {
int rx_packet_hash_offset;
int rx_packet_len_offset;
int rx_packet_ts_offset;
- u8 rx_hash_key[40];
- u32 rx_indir_table[128];
bool rx_scatter;
- bool rss_active;
- bool rx_hash_udp_4tuple;
+ struct efx_rss_context rss_context;
unsigned int_error_count;
unsigned long int_error_expire;
@@ -1099,6 +1116,10 @@ struct efx_udp_tunnel {
* @tx_write: Write TX descriptors and doorbell
* @rx_push_rss_config: Write RSS hash key and indirection table to the NIC
* @rx_pull_rss_config: Read RSS hash key and indirection table back from the NIC
+ * @rx_push_rss_context_config: Write RSS hash key and indirection table for
+ * user RSS context to the NIC
+ * @rx_pull_rss_context_config: Read RSS hash key and indirection table for user
+ * RSS context back from the NIC
* @rx_probe: Allocate resources for RX queue
* @rx_init: Initialise RX queue on the NIC
* @rx_remove: Free resources for RX queue
@@ -1237,6 +1258,13 @@ struct efx_nic_type {
int (*rx_push_rss_config)(struct efx_nic *efx, bool user,
const u32 *rx_indir_table, const u8 *key);
int (*rx_pull_rss_config)(struct efx_nic *efx);
+ int (*rx_push_rss_context_config)(struct efx_nic *efx,
+ struct efx_rss_context *ctx,
+ const u32 *rx_indir_table,
+ const u8 *key);
+ int (*rx_pull_rss_context_config)(struct efx_nic *efx,
+ struct efx_rss_context *ctx);
+ void (*rx_restore_rss_contexts)(struct efx_nic *efx);
int (*rx_probe)(struct efx_rx_queue *rx_queue);
void (*rx_init)(struct efx_rx_queue *rx_queue);
void (*rx_remove)(struct efx_rx_queue *rx_queue);
diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index 6549fc685a48..d080a414e8f2 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -374,7 +374,6 @@ enum {
* @piobuf_size: size of a single PIO buffer
* @must_restore_piobufs: Flag: PIO buffers have yet to be restored after MC
* reboot
- * @rx_rss_context: Firmware handle for our RSS context
* @rx_rss_context_exclusive: Whether our RSS context is exclusive or shared
* @stats: Hardware statistics
* @workaround_35388: Flag: firmware supports workaround for bug 35388
@@ -415,7 +414,6 @@ struct efx_ef10_nic_data {
unsigned int piobuf_handle[EF10_TX_PIOBUF_COUNT];
u16 piobuf_size;
bool must_restore_piobufs;
- u32 rx_rss_context;
bool rx_rss_context_exclusive;
u64 stats[EF10_STAT_COUNT];
bool workaround_35388;
diff --git a/drivers/net/ethernet/sfc/siena.c b/drivers/net/ethernet/sfc/siena.c
index ae8645ae4492..18aab25234ba 100644
--- a/drivers/net/ethernet/sfc/siena.c
+++ b/drivers/net/ethernet/sfc/siena.c
@@ -350,11 +350,11 @@ static int siena_rx_pull_rss_config(struct efx_nic *efx)
* siena_rx_push_rss_config, below)
*/
efx_reado(efx, &temp, FR_CZ_RX_RSS_IPV6_REG1);
- memcpy(efx->rx_hash_key, &temp, sizeof(temp));
+ memcpy(efx->rss_context.rx_hash_key, &temp, sizeof(temp));
efx_reado(efx, &temp, FR_CZ_RX_RSS_IPV6_REG2);
- memcpy(efx->rx_hash_key + sizeof(temp), &temp, sizeof(temp));
+ memcpy(efx->rss_context.rx_hash_key + sizeof(temp), &temp, sizeof(temp));
efx_reado(efx, &temp, FR_CZ_RX_RSS_IPV6_REG3);
- memcpy(efx->rx_hash_key + 2 * sizeof(temp), &temp,
+ memcpy(efx->rss_context.rx_hash_key + 2 * sizeof(temp), &temp,
FRF_CZ_RX_RSS_IPV6_TKEY_HI_WIDTH / 8);
efx_farch_rx_pull_indir_table(efx);
return 0;
@@ -367,26 +367,26 @@ static int siena_rx_push_rss_config(struct efx_nic *efx, bool user,
/* Set hash key for IPv4 */
if (key)
- memcpy(efx->rx_hash_key, key, sizeof(temp));
- memcpy(&temp, efx->rx_hash_key, sizeof(temp));
+ memcpy(efx->rss_context.rx_hash_key, key, sizeof(temp));
+ memcpy(&temp, efx->rss_context.rx_hash_key, sizeof(temp));
efx_writeo(efx, &temp, FR_BZ_RX_RSS_TKEY);
/* Enable IPv6 RSS */
- BUILD_BUG_ON(sizeof(efx->rx_hash_key) <
+ BUILD_BUG_ON(sizeof(efx->rss_context.rx_hash_key) <
2 * sizeof(temp) + FRF_CZ_RX_RSS_IPV6_TKEY_HI_WIDTH / 8 ||
FRF_CZ_RX_RSS_IPV6_TKEY_HI_LBN != 0);
- memcpy(&temp, efx->rx_hash_key, sizeof(temp));
+ memcpy(&temp, efx->rss_context.rx_hash_key, sizeof(temp));
efx_writeo(efx, &temp, FR_CZ_RX_RSS_IPV6_REG1);
- memcpy(&temp, efx->rx_hash_key + sizeof(temp), sizeof(temp));
+ memcpy(&temp, efx->rss_context.rx_hash_key + sizeof(temp), sizeof(temp));
efx_writeo(efx, &temp, FR_CZ_RX_RSS_IPV6_REG2);
EFX_POPULATE_OWORD_2(temp, FRF_CZ_RX_RSS_IPV6_THASH_ENABLE, 1,
FRF_CZ_RX_RSS_IPV6_IP_THASH_ENABLE, 1);
- memcpy(&temp, efx->rx_hash_key + 2 * sizeof(temp),
+ memcpy(&temp, efx->rss_context.rx_hash_key + 2 * sizeof(temp),
FRF_CZ_RX_RSS_IPV6_TKEY_HI_WIDTH / 8);
efx_writeo(efx, &temp, FR_CZ_RX_RSS_IPV6_REG3);
- memcpy(efx->rx_indir_table, rx_indir_table,
- sizeof(efx->rx_indir_table));
+ memcpy(efx->rss_context.rx_indir_table, rx_indir_table,
+ sizeof(efx->rss_context.rx_indir_table));
efx_farch_rx_push_indir_table(efx);
return 0;
@@ -432,8 +432,8 @@ static int siena_init_nic(struct efx_nic *efx)
EFX_RX_USR_BUF_SIZE >> 5);
efx_writeo(efx, &temp, FR_AZ_RX_CFG);
- siena_rx_push_rss_config(efx, false, efx->rx_indir_table, NULL);
- efx->rss_active = true;
+ siena_rx_push_rss_config(efx, false, efx->rss_context.rx_indir_table, NULL);
+ efx->rss_context.context_id = 0; /* indicates RSS is active */
/* Enable event logging */
rc = efx_mcdi_log_ctrl(efx, true, false, 0);
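As a sketch of how the per-context state above gets used: user-created
contexts live on the list rooted at efx->rss_context.list and are looked
up by their ethtool-visible user_id. The series' actual helper is
efx_find_rss_context_entry(); the following (assuming <linux/list.h> and
the struct efx_rss_context above) is an illustration, not the real
implementation:

static struct efx_rss_context *find_rss_context(u32 id,
						struct list_head *head)
{
	struct efx_rss_context *ctx;

	/* walk the contexts hanging off efx->rss_context.list */
	list_for_each_entry(ctx, head, list)
		if (ctx->user_id == id)
			return ctx;
	return NULL;	/* no context with this user_id */
}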
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-02-27 17:55 ` Edward Cree
@ 2018-02-27 19:28 ` John W. Linville
0 siblings, 0 replies; 18+ messages in thread
From: John W. Linville @ 2018-02-27 19:28 UTC (permalink / raw)
To: Edward Cree; +Cc: David Miller, linux-net-drivers, netdev
On Tue, Feb 27, 2018 at 05:55:51PM +0000, Edward Cree wrote:
> On 27/02/18 17:38, David Miller wrote:
> > The problem is there are syntax errors in your email headers.
> >
> > Any time a person's name contains a special character like ".",
> > that entire string must be enclosed in double quotes.
> >
> > This is the case for "John W. Linville" so please add proper
> > quotes around such names and resend your patch series again.
> Thank you for spotting this!  I looked at the headers and failed
> to notice anything wrong with them.
> I'm surprised that git-imap-send doesn't check for this...
>
> Will resend with that fixed.
Haha, sorry for indirectly causing this issue! If it helps, you can
leave off the "W." -- I'll still know it's for me... :-)
John
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-02-27 17:59 [PATCH RESEND net-next 0/2] ntuple filters with RSS Edward Cree
2018-02-27 18:02 ` [PATCH net-next 1/2] net: ethtool: extend RXNFC API to support RSS spreading of filter matches Edward Cree
2018-02-27 18:03 ` [PATCH net-next 2/2] sfc: support RSS spreading of ethtool ntuple filters Edward Cree
@ 2018-02-27 23:47 ` Jakub Kicinski
2018-02-28 1:24 ` Alexander Duyck
2018-03-01 18:36 ` David Miller
3 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2018-02-27 23:47 UTC (permalink / raw)
To: Edward Cree
Cc: linux-net-drivers, David Miller, netdev, John W. Linville,
Or Gerlitz, Alexander Duyck
On Tue, 27 Feb 2018 17:59:12 +0000, Edward Cree wrote:
> This series introduces the ability to mark an ethtool steering filter to use
> RSS spreading, and the ability to create and configure multiple RSS contexts
> with different indirection tables, hash keys, and hash fields.
> An implementation for the sfc driver (for 7000-series and later SFC NICs) is
> included in patch 2/2.
>
> The anticipated use case of this feature is for steering traffic destined for
> a container (or virtual machine) to the subset of CPUs on which processes in
> the container (or the VM's vCPUs) are bound, while retaining the scalability
> of RSS spreading from the viewpoint inside the container.
> The use of both a base queue number (ring_cookie) and indirection table is
> intended to allow re-use of a single RSS context to target multiple sets of
> CPUs. For instance, if an 8-core system is hosting three containers on CPUs
> [1,2], [3,4] and [6,7], then a single RSS context with an equal-weight [0,1]
> indirection table could be used to target all three containers by setting
> ring_cookie to 1, 3 and 6 on the respective filters.
Please, let's stop extending ethtool_rx_flow APIs. I bit my tongue
when Intel was adding their "redirection to VF" based on ethtool ntuples
and look now they're adding the same functionality with flower :| And
wonder how to handle two interfaces doing the same thing.
On the use case itself, I wonder how much sense that makes. Can your
hardware not tag the packet as well so you could then mux it to
something like macvlan offload?
CC: Alex, Or
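To make the arithmetic in the cover letter quoted above concrete: the RSS
hash selects an entry from the context's indirection table, and the
filter's ring_cookie offsets that entry onto the target CPU set. A
minimal sketch, with illustrative names rather than actual sfc driver
symbols:

#include <stddef.h>
#include <stdint.h>

/* Queue selection as described in the cover letter: the hash picks an
 * indirection-table entry and ring_cookie supplies the base queue.
 */
static unsigned int pick_rx_queue(uint32_t hash, uint64_t ring_cookie,
				  const uint32_t *indir, size_t indir_len)
{
	return (unsigned int)(ring_cookie + indir[hash % indir_len]);
}

With indir = {0, 1} and ring_cookie set to 1, 3 and 6 on the respective
filters, matching traffic spreads over queues {1,2}, {3,4} and {6,7}:
one shared context serving all three containers.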
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-02-27 23:47 ` [PATCH RESEND net-next 0/2] ntuple filters with RSS Jakub Kicinski
@ 2018-02-28 1:24 ` Alexander Duyck
2018-03-02 15:24 ` Edward Cree
0 siblings, 1 reply; 18+ messages in thread
From: Alexander Duyck @ 2018-02-28 1:24 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Edward Cree, linux-net-drivers, David Miller, netdev,
John W. Linville, Or Gerlitz, Alexander Duyck
On Tue, Feb 27, 2018 at 3:47 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Tue, 27 Feb 2018 17:59:12 +0000, Edward Cree wrote:
>> This series introduces the ability to mark an ethtool steering filter to use
>> RSS spreading, and the ability to create and configure multiple RSS contexts
>> with different indirection tables, hash keys, and hash fields.
>> An implementation for the sfc driver (for 7000-series and later SFC NICs) is
>> included in patch 2/2.
>>
>> The anticipated use case of this feature is for steering traffic destined for
>> a container (or virtual machine) to the subset of CPUs on which processes in
>> the container (or the VM's vCPUs) are bound, while retaining the scalability
>> of RSS spreading from the viewpoint inside the container.
>> The use of both a base queue number (ring_cookie) and indirection table is
>> intended to allow re-use of a single RSS context to target multiple sets of
>> CPUs. For instance, if an 8-core system is hosting three containers on CPUs
>> [1,2], [3,4] and [6,7], then a single RSS context with an equal-weight [0,1]
>> indirection table could be used to target all three containers by setting
>> ring_cookie to 1, 3 and 6 on the respective filters.
>
> Please, let's stop extending ethtool_rx_flow APIs. I bit my tongue
> when Intel was adding their "redirection to VF" based on ethtool ntuples
> and look now they're adding the same functionality with flower :| And
> wonder how to handle two interfaces doing the same thing.
>
> On the use case itself, I wonder how much sense that makes. Can your
> hardware not tag the packet as well so you could then mux it to
> something like macvlan offload?
>
> CC: Alex, Or
We did something like this for i40e. Basically we required creating
the queue groups using mqprio to keep them symmetric on Tx and Rx, and
then allowed for TC ingress filters to redirect traffic to those queue
groups.
- Alex
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-02-27 17:59 [PATCH RESEND net-next 0/2] ntuple filters with RSS Edward Cree
` (2 preceding siblings ...)
2018-02-27 23:47 ` [PATCH RESEND net-next 0/2] ntuple filters with RSS Jakub Kicinski
@ 2018-03-01 18:36 ` David Miller
2018-03-02 16:01 ` Edward Cree
3 siblings, 1 reply; 18+ messages in thread
From: David Miller @ 2018-03-01 18:36 UTC (permalink / raw)
To: ecree; +Cc: linux-net-drivers, netdev, linville
From: Edward Cree <ecree@solarflare.com>
Date: Tue, 27 Feb 2018 17:59:12 +0000
> This series introduces the ability to mark an ethtool steering filter to use
> RSS spreading, and the ability to create and configure multiple RSS contexts
> with different indirection tables, hash keys, and hash fields.
> An implementation for the sfc driver (for 7000-series and later SFC NICs) is
> included in patch 2/2.
>
> The anticipated use case of this feature is for steering traffic destined for
> a container (or virtual machine) to the subset of CPUs on which processes in
> the container (or the VM's vCPUs) are bound, while retaining the scalability
> of RSS spreading from the viewpoint inside the container.
> The use of both a base queue number (ring_cookie) and indirection table is
> intended to allow re-use of a single RSS context to target multiple sets of
> CPUs. For instance, if an 8-core system is hosting three containers on CPUs
> [1,2], [3,4] and [6,7], then a single RSS context with an equal-weight [0,1]
> indirection table could be used to target all three containers by setting
> ring_cookie to 1, 3 and 6 on the respective filters.
We really should have the ethtool interfaces under deep freeze until we
convert it to netlink or similar.
Second, this is a real hackish way to extend ethtool with new
semantics. A structure changes layout based upon a flag bit setting
in an earlier member? Yikes...
Lastly, there has been feedback asking how practical and useful this
facility actually is, and you must address that.
* Re: [PATCH net-next 2/2] sfc: support RSS spreading of ethtool ntuple filters
2018-02-27 18:03 ` [PATCH net-next 2/2] sfc: support RSS spreading of ethtool ntuple filters Edward Cree
@ 2018-03-01 20:51 ` kbuild test robot
0 siblings, 0 replies; 18+ messages in thread
From: kbuild test robot @ 2018-03-01 20:51 UTC (permalink / raw)
To: Edward Cree
Cc: kbuild-all, linux-net-drivers, David Miller, netdev,
John W. Linville
[-- Attachment #1: Type: text/plain, Size: 17927 bytes --]
Hi Edward,
I love your patch! Perhaps something to improve:
[auto build test WARNING on net-next/master]
url: https://github.com/0day-ci/linux/commits/Edward-Cree/ntuple-filters-with-RSS/20180302-031011
config: x86_64-rhel (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings
All warnings (new ones prefixed by >>):
drivers/net//ethernet/sfc/ef10.c: In function 'efx_ef10_filter_insert':
>> drivers/net//ethernet/sfc/ef10.c:4458:5: warning: 'ctx' may be used uninitialized in this function [-Wmaybe-uninitialized]
rc = efx_ef10_filter_push(efx, spec, &table->entry[ins_index].handle,
~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ctx, replacing);
~~~~~~~~~~~~~~~
vim +/ctx +4458 drivers/net//ethernet/sfc/ef10.c
8127d661 Ben Hutchings 2013-08-29 4300
8127d661 Ben Hutchings 2013-08-29 4301 static s32 efx_ef10_filter_insert(struct efx_nic *efx,
8127d661 Ben Hutchings 2013-08-29 4302 struct efx_filter_spec *spec,
8127d661 Ben Hutchings 2013-08-29 4303 bool replace_equal)
8127d661 Ben Hutchings 2013-08-29 4304 {
8127d661 Ben Hutchings 2013-08-29 4305 struct efx_ef10_filter_table *table = efx->filter_state;
8127d661 Ben Hutchings 2013-08-29 4306 DECLARE_BITMAP(mc_rem_map, EFX_EF10_FILTER_SEARCH_LIMIT);
8127d661 Ben Hutchings 2013-08-29 4307 struct efx_filter_spec *saved_spec;
8127d661 Ben Hutchings 2013-08-29 4308 unsigned int match_pri, hash;
87dec16f Edward Cree 2018-02-27 4309 struct efx_rss_context *ctx;
8127d661 Ben Hutchings 2013-08-29 4310 unsigned int priv_flags;
8127d661 Ben Hutchings 2013-08-29 4311 bool replacing = false;
8127d661 Ben Hutchings 2013-08-29 4312 int ins_index = -1;
8127d661 Ben Hutchings 2013-08-29 4313 DEFINE_WAIT(wait);
8127d661 Ben Hutchings 2013-08-29 4314 bool is_mc_recip;
8127d661 Ben Hutchings 2013-08-29 4315 s32 rc;
8127d661 Ben Hutchings 2013-08-29 4316
8127d661 Ben Hutchings 2013-08-29 4317 /* For now, only support RX filters */
8127d661 Ben Hutchings 2013-08-29 4318 if ((spec->flags & (EFX_FILTER_FLAG_RX | EFX_FILTER_FLAG_TX)) !=
8127d661 Ben Hutchings 2013-08-29 4319 EFX_FILTER_FLAG_RX)
8127d661 Ben Hutchings 2013-08-29 4320 return -EINVAL;
8127d661 Ben Hutchings 2013-08-29 4321
7ac0dd9d Andrew Rybchenko 2016-06-15 4322 rc = efx_ef10_filter_pri(table, spec);
8127d661 Ben Hutchings 2013-08-29 4323 if (rc < 0)
8127d661 Ben Hutchings 2013-08-29 4324 return rc;
8127d661 Ben Hutchings 2013-08-29 4325 match_pri = rc;
8127d661 Ben Hutchings 2013-08-29 4326
8127d661 Ben Hutchings 2013-08-29 4327 hash = efx_ef10_filter_hash(spec);
8127d661 Ben Hutchings 2013-08-29 4328 is_mc_recip = efx_filter_is_mc_recipient(spec);
8127d661 Ben Hutchings 2013-08-29 4329 if (is_mc_recip)
8127d661 Ben Hutchings 2013-08-29 4330 bitmap_zero(mc_rem_map, EFX_EF10_FILTER_SEARCH_LIMIT);
8127d661 Ben Hutchings 2013-08-29 4331
87dec16f Edward Cree 2018-02-27 4332 if (spec->flags & EFX_FILTER_FLAG_RX_RSS) {
87dec16f Edward Cree 2018-02-27 4333 if (spec->rss_context)
87dec16f Edward Cree 2018-02-27 4334 ctx = efx_find_rss_context_entry(spec->rss_context,
87dec16f Edward Cree 2018-02-27 4335 &efx->rss_context.list);
87dec16f Edward Cree 2018-02-27 4336 else
87dec16f Edward Cree 2018-02-27 4337 ctx = &efx->rss_context;
87dec16f Edward Cree 2018-02-27 4338 if (!ctx)
87dec16f Edward Cree 2018-02-27 4339 return -ENOENT;
87dec16f Edward Cree 2018-02-27 4340 if (ctx->context_id == EFX_EF10_RSS_CONTEXT_INVALID)
87dec16f Edward Cree 2018-02-27 4341 return -EOPNOTSUPP;
87dec16f Edward Cree 2018-02-27 4342 }
87dec16f Edward Cree 2018-02-27 4343
8127d661 Ben Hutchings 2013-08-29 4344 /* Find any existing filters with the same match tuple or
8127d661 Ben Hutchings 2013-08-29 4345 * else a free slot to insert at. If any of them are busy,
8127d661 Ben Hutchings 2013-08-29 4346 * we have to wait and retry.
8127d661 Ben Hutchings 2013-08-29 4347 */
8127d661 Ben Hutchings 2013-08-29 4348 for (;;) {
8127d661 Ben Hutchings 2013-08-29 4349 unsigned int depth = 1;
8127d661 Ben Hutchings 2013-08-29 4350 unsigned int i;
8127d661 Ben Hutchings 2013-08-29 4351
8127d661 Ben Hutchings 2013-08-29 4352 spin_lock_bh(&efx->filter_lock);
8127d661 Ben Hutchings 2013-08-29 4353
8127d661 Ben Hutchings 2013-08-29 4354 for (;;) {
8127d661 Ben Hutchings 2013-08-29 4355 i = (hash + depth) & (HUNT_FILTER_TBL_ROWS - 1);
8127d661 Ben Hutchings 2013-08-29 4356 saved_spec = efx_ef10_filter_entry_spec(table, i);
8127d661 Ben Hutchings 2013-08-29 4357
8127d661 Ben Hutchings 2013-08-29 4358 if (!saved_spec) {
8127d661 Ben Hutchings 2013-08-29 4359 if (ins_index < 0)
8127d661 Ben Hutchings 2013-08-29 4360 ins_index = i;
8127d661 Ben Hutchings 2013-08-29 4361 } else if (efx_ef10_filter_equal(spec, saved_spec)) {
8127d661 Ben Hutchings 2013-08-29 4362 if (table->entry[i].spec &
8127d661 Ben Hutchings 2013-08-29 4363 EFX_EF10_FILTER_FLAG_BUSY)
8127d661 Ben Hutchings 2013-08-29 4364 break;
8127d661 Ben Hutchings 2013-08-29 4365 if (spec->priority < saved_spec->priority &&
7665d1ab Ben Hutchings 2013-11-21 4366 spec->priority != EFX_FILTER_PRI_AUTO) {
8127d661 Ben Hutchings 2013-08-29 4367 rc = -EPERM;
8127d661 Ben Hutchings 2013-08-29 4368 goto out_unlock;
8127d661 Ben Hutchings 2013-08-29 4369 }
8127d661 Ben Hutchings 2013-08-29 4370 if (!is_mc_recip) {
8127d661 Ben Hutchings 2013-08-29 4371 /* This is the only one */
8127d661 Ben Hutchings 2013-08-29 4372 if (spec->priority ==
8127d661 Ben Hutchings 2013-08-29 4373 saved_spec->priority &&
8127d661 Ben Hutchings 2013-08-29 4374 !replace_equal) {
8127d661 Ben Hutchings 2013-08-29 4375 rc = -EEXIST;
8127d661 Ben Hutchings 2013-08-29 4376 goto out_unlock;
8127d661 Ben Hutchings 2013-08-29 4377 }
8127d661 Ben Hutchings 2013-08-29 4378 ins_index = i;
8127d661 Ben Hutchings 2013-08-29 4379 goto found;
8127d661 Ben Hutchings 2013-08-29 4380 } else if (spec->priority >
8127d661 Ben Hutchings 2013-08-29 4381 saved_spec->priority ||
8127d661 Ben Hutchings 2013-08-29 4382 (spec->priority ==
8127d661 Ben Hutchings 2013-08-29 4383 saved_spec->priority &&
8127d661 Ben Hutchings 2013-08-29 4384 replace_equal)) {
8127d661 Ben Hutchings 2013-08-29 4385 if (ins_index < 0)
8127d661 Ben Hutchings 2013-08-29 4386 ins_index = i;
8127d661 Ben Hutchings 2013-08-29 4387 else
8127d661 Ben Hutchings 2013-08-29 4388 __set_bit(depth, mc_rem_map);
8127d661 Ben Hutchings 2013-08-29 4389 }
8127d661 Ben Hutchings 2013-08-29 4390 }
8127d661 Ben Hutchings 2013-08-29 4391
8127d661 Ben Hutchings 2013-08-29 4392 /* Once we reach the maximum search depth, use
8127d661 Ben Hutchings 2013-08-29 4393 * the first suitable slot or return -EBUSY if
8127d661 Ben Hutchings 2013-08-29 4394 * there was none
8127d661 Ben Hutchings 2013-08-29 4395 */
8127d661 Ben Hutchings 2013-08-29 4396 if (depth == EFX_EF10_FILTER_SEARCH_LIMIT) {
8127d661 Ben Hutchings 2013-08-29 4397 if (ins_index < 0) {
8127d661 Ben Hutchings 2013-08-29 4398 rc = -EBUSY;
8127d661 Ben Hutchings 2013-08-29 4399 goto out_unlock;
8127d661 Ben Hutchings 2013-08-29 4400 }
8127d661 Ben Hutchings 2013-08-29 4401 goto found;
8127d661 Ben Hutchings 2013-08-29 4402 }
8127d661 Ben Hutchings 2013-08-29 4403
8127d661 Ben Hutchings 2013-08-29 4404 ++depth;
8127d661 Ben Hutchings 2013-08-29 4405 }
8127d661 Ben Hutchings 2013-08-29 4406
8127d661 Ben Hutchings 2013-08-29 4407 prepare_to_wait(&table->waitq, &wait, TASK_UNINTERRUPTIBLE);
8127d661 Ben Hutchings 2013-08-29 4408 spin_unlock_bh(&efx->filter_lock);
8127d661 Ben Hutchings 2013-08-29 4409 schedule();
8127d661 Ben Hutchings 2013-08-29 4410 }
8127d661 Ben Hutchings 2013-08-29 4411
8127d661 Ben Hutchings 2013-08-29 4412 found:
8127d661 Ben Hutchings 2013-08-29 4413 /* Create a software table entry if necessary, and mark it
8127d661 Ben Hutchings 2013-08-29 4414 * busy. We might yet fail to insert, but any attempt to
8127d661 Ben Hutchings 2013-08-29 4415 * insert a conflicting filter while we're waiting for the
8127d661 Ben Hutchings 2013-08-29 4416 * firmware must find the busy entry.
8127d661 Ben Hutchings 2013-08-29 4417 */
8127d661 Ben Hutchings 2013-08-29 4418 saved_spec = efx_ef10_filter_entry_spec(table, ins_index);
8127d661 Ben Hutchings 2013-08-29 4419 if (saved_spec) {
7665d1ab Ben Hutchings 2013-11-21 4420 if (spec->priority == EFX_FILTER_PRI_AUTO &&
7665d1ab Ben Hutchings 2013-11-21 4421 saved_spec->priority >= EFX_FILTER_PRI_AUTO) {
8127d661 Ben Hutchings 2013-08-29 4422 /* Just make sure it won't be removed */
7665d1ab Ben Hutchings 2013-11-21 4423 if (saved_spec->priority > EFX_FILTER_PRI_AUTO)
7665d1ab Ben Hutchings 2013-11-21 4424 saved_spec->flags |= EFX_FILTER_FLAG_RX_OVER_AUTO;
8127d661 Ben Hutchings 2013-08-29 4425 table->entry[ins_index].spec &=
b59e6ef8 Ben Hutchings 2013-11-21 4426 ~EFX_EF10_FILTER_FLAG_AUTO_OLD;
8127d661 Ben Hutchings 2013-08-29 4427 rc = ins_index;
8127d661 Ben Hutchings 2013-08-29 4428 goto out_unlock;
8127d661 Ben Hutchings 2013-08-29 4429 }
8127d661 Ben Hutchings 2013-08-29 4430 replacing = true;
8127d661 Ben Hutchings 2013-08-29 4431 priv_flags = efx_ef10_filter_entry_flags(table, ins_index);
8127d661 Ben Hutchings 2013-08-29 4432 } else {
8127d661 Ben Hutchings 2013-08-29 4433 saved_spec = kmalloc(sizeof(*spec), GFP_ATOMIC);
8127d661 Ben Hutchings 2013-08-29 4434 if (!saved_spec) {
8127d661 Ben Hutchings 2013-08-29 4435 rc = -ENOMEM;
8127d661 Ben Hutchings 2013-08-29 4436 goto out_unlock;
8127d661 Ben Hutchings 2013-08-29 4437 }
8127d661 Ben Hutchings 2013-08-29 4438 *saved_spec = *spec;
8127d661 Ben Hutchings 2013-08-29 4439 priv_flags = 0;
8127d661 Ben Hutchings 2013-08-29 4440 }
8127d661 Ben Hutchings 2013-08-29 4441 efx_ef10_filter_set_entry(table, ins_index, saved_spec,
8127d661 Ben Hutchings 2013-08-29 4442 priv_flags | EFX_EF10_FILTER_FLAG_BUSY);
8127d661 Ben Hutchings 2013-08-29 4443
8127d661 Ben Hutchings 2013-08-29 4444 /* Mark lower-priority multicast recipients busy prior to removal */
8127d661 Ben Hutchings 2013-08-29 4445 if (is_mc_recip) {
8127d661 Ben Hutchings 2013-08-29 4446 unsigned int depth, i;
8127d661 Ben Hutchings 2013-08-29 4447
8127d661 Ben Hutchings 2013-08-29 4448 for (depth = 0; depth < EFX_EF10_FILTER_SEARCH_LIMIT; depth++) {
8127d661 Ben Hutchings 2013-08-29 4449 i = (hash + depth) & (HUNT_FILTER_TBL_ROWS - 1);
8127d661 Ben Hutchings 2013-08-29 4450 if (test_bit(depth, mc_rem_map))
8127d661 Ben Hutchings 2013-08-29 4451 table->entry[i].spec |=
8127d661 Ben Hutchings 2013-08-29 4452 EFX_EF10_FILTER_FLAG_BUSY;
8127d661 Ben Hutchings 2013-08-29 4453 }
8127d661 Ben Hutchings 2013-08-29 4454 }
8127d661 Ben Hutchings 2013-08-29 4455
8127d661 Ben Hutchings 2013-08-29 4456 spin_unlock_bh(&efx->filter_lock);
8127d661 Ben Hutchings 2013-08-29 4457
8127d661 Ben Hutchings 2013-08-29 @4458 rc = efx_ef10_filter_push(efx, spec, &table->entry[ins_index].handle,
87dec16f Edward Cree 2018-02-27 4459 ctx, replacing);
8127d661 Ben Hutchings 2013-08-29 4460
8127d661 Ben Hutchings 2013-08-29 4461 /* Finalise the software table entry */
8127d661 Ben Hutchings 2013-08-29 4462 spin_lock_bh(&efx->filter_lock);
8127d661 Ben Hutchings 2013-08-29 4463 if (rc == 0) {
8127d661 Ben Hutchings 2013-08-29 4464 if (replacing) {
8127d661 Ben Hutchings 2013-08-29 4465 /* Update the fields that may differ */
7665d1ab Ben Hutchings 2013-11-21 4466 if (saved_spec->priority == EFX_FILTER_PRI_AUTO)
7665d1ab Ben Hutchings 2013-11-21 4467 saved_spec->flags |=
7665d1ab Ben Hutchings 2013-11-21 4468 EFX_FILTER_FLAG_RX_OVER_AUTO;
8127d661 Ben Hutchings 2013-08-29 4469 saved_spec->priority = spec->priority;
7665d1ab Ben Hutchings 2013-11-21 4470 saved_spec->flags &= EFX_FILTER_FLAG_RX_OVER_AUTO;
8127d661 Ben Hutchings 2013-08-29 4471 saved_spec->flags |= spec->flags;
8127d661 Ben Hutchings 2013-08-29 4472 saved_spec->rss_context = spec->rss_context;
8127d661 Ben Hutchings 2013-08-29 4473 saved_spec->dmaq_id = spec->dmaq_id;
8127d661 Ben Hutchings 2013-08-29 4474 }
8127d661 Ben Hutchings 2013-08-29 4475 } else if (!replacing) {
8127d661 Ben Hutchings 2013-08-29 4476 kfree(saved_spec);
8127d661 Ben Hutchings 2013-08-29 4477 saved_spec = NULL;
8127d661 Ben Hutchings 2013-08-29 4478 }
8127d661 Ben Hutchings 2013-08-29 4479 efx_ef10_filter_set_entry(table, ins_index, saved_spec, priv_flags);
8127d661 Ben Hutchings 2013-08-29 4480
8127d661 Ben Hutchings 2013-08-29 4481 /* Remove and finalise entries for lower-priority multicast
8127d661 Ben Hutchings 2013-08-29 4482 * recipients
8127d661 Ben Hutchings 2013-08-29 4483 */
8127d661 Ben Hutchings 2013-08-29 4484 if (is_mc_recip) {
bb53f4d4 Martin Habets 2017-06-22 4485 MCDI_DECLARE_BUF(inbuf, MC_CMD_FILTER_OP_EXT_IN_LEN);
8127d661 Ben Hutchings 2013-08-29 4486 unsigned int depth, i;
8127d661 Ben Hutchings 2013-08-29 4487
8127d661 Ben Hutchings 2013-08-29 4488 memset(inbuf, 0, sizeof(inbuf));
8127d661 Ben Hutchings 2013-08-29 4489
8127d661 Ben Hutchings 2013-08-29 4490 for (depth = 0; depth < EFX_EF10_FILTER_SEARCH_LIMIT; depth++) {
8127d661 Ben Hutchings 2013-08-29 4491 if (!test_bit(depth, mc_rem_map))
8127d661 Ben Hutchings 2013-08-29 4492 continue;
8127d661 Ben Hutchings 2013-08-29 4493
8127d661 Ben Hutchings 2013-08-29 4494 i = (hash + depth) & (HUNT_FILTER_TBL_ROWS - 1);
8127d661 Ben Hutchings 2013-08-29 4495 saved_spec = efx_ef10_filter_entry_spec(table, i);
8127d661 Ben Hutchings 2013-08-29 4496 priv_flags = efx_ef10_filter_entry_flags(table, i);
8127d661 Ben Hutchings 2013-08-29 4497
8127d661 Ben Hutchings 2013-08-29 4498 if (rc == 0) {
8127d661 Ben Hutchings 2013-08-29 4499 spin_unlock_bh(&efx->filter_lock);
8127d661 Ben Hutchings 2013-08-29 4500 MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP,
8127d661 Ben Hutchings 2013-08-29 4501 MC_CMD_FILTER_OP_IN_OP_UNSUBSCRIBE);
8127d661 Ben Hutchings 2013-08-29 4502 MCDI_SET_QWORD(inbuf, FILTER_OP_IN_HANDLE,
8127d661 Ben Hutchings 2013-08-29 4503 table->entry[i].handle);
8127d661 Ben Hutchings 2013-08-29 4504 rc = efx_mcdi_rpc(efx, MC_CMD_FILTER_OP,
8127d661 Ben Hutchings 2013-08-29 4505 inbuf, sizeof(inbuf),
8127d661 Ben Hutchings 2013-08-29 4506 NULL, 0, NULL);
8127d661 Ben Hutchings 2013-08-29 4507 spin_lock_bh(&efx->filter_lock);
8127d661 Ben Hutchings 2013-08-29 4508 }
8127d661 Ben Hutchings 2013-08-29 4509
8127d661 Ben Hutchings 2013-08-29 4510 if (rc == 0) {
8127d661 Ben Hutchings 2013-08-29 4511 kfree(saved_spec);
8127d661 Ben Hutchings 2013-08-29 4512 saved_spec = NULL;
8127d661 Ben Hutchings 2013-08-29 4513 priv_flags = 0;
8127d661 Ben Hutchings 2013-08-29 4514 } else {
8127d661 Ben Hutchings 2013-08-29 4515 priv_flags &= ~EFX_EF10_FILTER_FLAG_BUSY;
8127d661 Ben Hutchings 2013-08-29 4516 }
8127d661 Ben Hutchings 2013-08-29 4517 efx_ef10_filter_set_entry(table, i, saved_spec,
8127d661 Ben Hutchings 2013-08-29 4518 priv_flags);
8127d661 Ben Hutchings 2013-08-29 4519 }
8127d661 Ben Hutchings 2013-08-29 4520 }
8127d661 Ben Hutchings 2013-08-29 4521
8127d661 Ben Hutchings 2013-08-29 4522 /* If successful, return the inserted filter ID */
8127d661 Ben Hutchings 2013-08-29 4523 if (rc == 0)
0ccb998b Jon Cooper 2017-02-17 4524 rc = efx_ef10_make_filter_id(match_pri, ins_index);
8127d661 Ben Hutchings 2013-08-29 4525
8127d661 Ben Hutchings 2013-08-29 4526 wake_up_all(&table->waitq);
8127d661 Ben Hutchings 2013-08-29 4527 out_unlock:
8127d661 Ben Hutchings 2013-08-29 4528 spin_unlock_bh(&efx->filter_lock);
8127d661 Ben Hutchings 2013-08-29 4529 finish_wait(&table->waitq, &wait);
8127d661 Ben Hutchings 2013-08-29 4530 return rc;
8127d661 Ben Hutchings 2013-08-29 4531 }
8127d661 Ben Hutchings 2013-08-29 4532
:::::: The code at line 4458 was first introduced by commit
:::::: 8127d661e77f5ec410093bce411f540afa34593f sfc: Add support for Solarflare SFC9100 family
:::::: TO: Ben Hutchings <bhutchings@solarflare.com>
:::::: CC: Ben Hutchings <bhutchings@solarflare.com>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 40831 bytes --]
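The warning above does look like a false positive: ctx is assigned on
every path where EFX_FILTER_FLAG_RX_RSS is set (the check at line 4332),
and efx_ef10_filter_push() presumably only dereferences it for RSS
filters, but gcc cannot prove that across the flag check. The
conventional way to silence this class of warning, sketched here rather
than taken from any actual follow-up, is an explicit initialisation at
the declaration on line 4309:

	struct efx_rss_context *ctx = NULL;	/* only set/used for RSS filters */

That gives efx_ef10_filter_push() a well-defined pointer even on the
non-RSS paths that never look at it.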
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-02-28 1:24 ` Alexander Duyck
@ 2018-03-02 15:24 ` Edward Cree
2018-03-02 18:55 ` Jakub Kicinski
0 siblings, 1 reply; 18+ messages in thread
From: Edward Cree @ 2018-03-02 15:24 UTC (permalink / raw)
To: Alexander Duyck, Jakub Kicinski
Cc: linux-net-drivers, David Miller, netdev, John W. Linville,
Or Gerlitz, Alexander Duyck
On Tue, Feb 27, 2018 at 3:47 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> Please, let's stop extending ethtool_rx_flow APIs. I bit my tongue
> when Intel was adding their "redirection to VF" based on ethtool ntuples
> and look now they're adding the same functionality with flower :| And
> wonder how to handle two interfaces doing the same thing.
Since sfc only supports ethtool NFC interfaces (we have no flower support,
and I also wonder how one is to support both of those interfaces without
producing an ugly mess), I'd much rather put this in ethtool than have to
implement all of flower just so we can have this extension.
I guess part of the question is, which other drivers besides us would want
to implement something like this, and what are their requirements?
> On the use case itself, I wonder how much sense that makes. Can your
> hardware not tag the packet as well so you could then mux it to
> something like macvlan offload?
In practice the only way our hardware can "tag the packet" is by the
selection of RX queue. So you could for instance give a container its
own RX queues (rather than just using the existing RX queues on the
appropriate CPUs), and maybe in future hook those queues up to l2fwd
offload somehow.
But that seems like a separate job (offloading the macvlan switching) to
what this series is about (making the RX processing happen on the right
CPUs). Is software macvlan switching really noticeably slow, anyway?
Besides, more powerful filtering than just MAC addr might be needed, if,
for instance, the container network is encapsulated. In that case
something like a UDP 4-tuple filter might be necessary (or, indeed, a
filter looking at the VNID (VxLAN TNI) - which our hardware can do but
ethtool doesn't currently have a way to specify). AFAICT l2-fwd-offload
can only be used for straight MAC addr, not for overlay networks like
VxLAN or FOU? At least, existing ndo_dfwd_add_station() implementations
don't seem to check that dev is a macvlan... Does it even support
VLAN filters? fm10k implementation doesn't seem to.
Anyway, like I say, filtering traffic onto its own queues seems to be
orthogonal, or at least separate, to binding those queues into an
upperdev for demux offload.
On 28/02/18 01:24, Alexander Duyck wrote:
> We did something like this for i40e. Basically we required creating
> the queue groups using mqprio to keep them symmetric on Tx and Rx, and
> then allowed for TC ingress filters to redirect traffic to those queue
> groups.
>
> - Alex
If we're not doing macvlan offload, I'm not sure what, if anything, the
TX side would buy us. So for now it seems to make sense for TX just to
use the TXQ associated with the CPU from which the TX originates, which
I believe already happens automatically.
-Ed
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-03-01 18:36 ` David Miller
@ 2018-03-02 16:01 ` Edward Cree
2018-03-02 17:49 ` David Riddoch
2018-03-07 15:24 ` David Miller
0 siblings, 2 replies; 18+ messages in thread
From: Edward Cree @ 2018-03-02 16:01 UTC (permalink / raw)
To: David Miller; +Cc: linux-net-drivers, netdev, linville
On 01/03/18 18:36, David Miller wrote:
> We really should have the ethtool interfaces under deep freeze until we
> convert it to netlink or similar.
> Second, this is a real hackish way to extend ethtool with new
> semantics. A structure changes layout based upon a flag bit setting
> in an earlier member? Yikes...
Yeah, while I'm reasonably confident it's ABI-compatible (presence of that
flag in the past should always have led to drivers complaining they didn't
recognise it), and it is somewhat similar to the existing FLOW_EXT flag,
it is indeed rather ugly. This is the only way I could see to do it
without adding a whole new command number, which I felt might also be
contentious (see: deep freeze) but is probably a better approach.
> Lastly, there has been feedback asking how practical and useful this
> facility actually is, and you must address that.
According to our marketing folks, there is end-user demand for this feature
or something like it. I didn't see any arguments why this isn't useful,
just that other things might be useful too. (Also, sorry it took me so
long to address their feedback, but I had to do a bit of background
reading before I could understand what Jakub was suggesting.)
-Ed
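Concretely, the union-and-flag-bit mechanism being debated looks roughly
like this (a sketch reconstructed from the discussion and the description
of patch 1/2, not a verbatim copy of the final UAPI):

struct ethtool_rxnfc {
	__u32				cmd;
	__u32				flow_type;
	__u64				data;
	struct ethtool_rx_flow_spec	fs;
	union {
		__u32			rule_cnt;
		/* reinterpreted as the RSS context ID when the new
		 * FLOW_RSS flag is set in fs.flow_type
		 */
		__u32			rss_context;
	};
	__u32				rule_locs[0];
};

This is what draws the "structure changes layout based upon a flag bit
setting in an earlier member" objection: the meaning of the word after
fs depends on a flag carried inside fs itself.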
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-03-02 16:01 ` Edward Cree
@ 2018-03-02 17:49 ` David Riddoch
2018-03-07 15:24 ` David Miller
1 sibling, 0 replies; 18+ messages in thread
From: David Riddoch @ 2018-03-02 17:49 UTC (permalink / raw)
To: Edward Cree, David Miller; +Cc: linux-net-drivers, netdev, linville
>> Lastly, there has been feedback asking how practical and useful this
>> facility actually is, and you must address that.
> According to our marketing folks, there is end-user demand for this feature
> or something like it.
The main benefit comes on numa systems, when you have high throughput
applications or containers on multiple numa nodes. Using RSS without
steering gives poor efficiency because traffic is often not received on
the same node as the application. With flow steering to a single queue
you can get a bottleneck, as all traffic for a TCP/UDP port or container
goes to one core. ARFS doesn't scale to large numbers of flows.
This feature allows the admin to ensure packets are received on the same
numa node as the application (improving efficiency) and avoids the
single core bottleneck.
David
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-03-02 15:24 ` Edward Cree
@ 2018-03-02 18:55 ` Jakub Kicinski
2018-03-02 23:24 ` Alexander Duyck
0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2018-03-02 18:55 UTC (permalink / raw)
To: Edward Cree
Cc: Alexander Duyck, linux-net-drivers, David Miller, netdev,
John W. Linville, Or Gerlitz, Alexander Duyck
On Fri, 2 Mar 2018 15:24:29 +0000, Edward Cree wrote:
> On Tue, Feb 27, 2018 at 3:47 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
>
> > Please, let's stop extending ethtool_rx_flow APIs. I bit my tongue
> > when Intel was adding their "redirection to VF" based on ethtool ntuples
> > and look now they're adding the same functionality with flower :| And
> > wonder how to handle two interfaces doing the same thing.
> Since sfc only supports ethtool NFC interfaces (we have no flower support,
> and I also wonder how one is to support both of those interfaces without
> producing an ugly mess), I'd much rather put this in ethtool than have to
> implement all of flower just so we can have this extension.
"Just this one extension" is exactly the attitude that can lead to
messy APIs :(
> I guess part of the question is, which other drivers besides us would want
> to implement something like this, and what are their requirements?
I think every vendor is trying to come up with ways to make their HW
work with containers better these days.
> > On the use case itself, I wonder how much sense that makes. Can your
> > hardware not tag the packet as well so you could then mux it to
> > something like macvlan offload?
> In practice the only way our hardware can "tag the packet" is by the
> selection of RX queue. So you could for instance give a container its
> own RX queues (rather than just using the existing RX queues on the
> appropriate CPUs), and maybe in future hook those queues up to l2fwd
> offload somehow.
> But that seems like a separate job (offloading the macvlan switching) to
> what this series is about (making the RX processing happen on the right
> CPUs). Is software macvlan switching really noticeably slow, anyway?
OK, thanks for clarifying.
> Besides, more powerful filtering than just MAC addr might be needed, if,
> for instance, the container network is encapsulated. In that case
> something like a UDP 4-tuple filter might be necessary (or, indeed, a
> filter looking at the VNID (VxLAN TNI) - which our hardware can do but
> ethtool doesn't currently have a way to specify). AFAICT l2-fwd-offload
> can only be used for straight MAC addr, not for overlay networks like
> VxLAN or FOU? At least, existing ndo_dfwd_add_station() implementations
> don't seem to check that dev is a macvlan... Does it even support
> VLAN filters? fm10k implementation doesn't seem to.
Exactly! One can come up with many protocol combinations which flower
already has APIs for... ethtool is not the place for it.
> Anyway, like I say, filtering traffic onto its own queues seems to be
> orthogonal, or at least separate, to binding those queues into an
> upperdev for demux offload.
It is; I was just trying to broaden the scope to more capable HW so we
design APIs that would serve all.
> On 28/02/18 01:24, Alexander Duyck wrote:
>
> > We did something like this for i40e. Basically we required creating
> > the queue groups using mqprio to keep them symmetric on Tx and Rx, and
> > then allowed for TC ingress filters to redirect traffic to those queue
> > groups.
> >
> > - Alex
> If we're not doing macvlan offload, I'm not sure what, if anything, the
> TX side would buy us. So for now it seems to make sense for TX just to
> use the TXQ associated with the CPU from which the TX originates, which
> I believe already happens automatically.
I don't think that's what Alex was referring to. Please see
commit e284fc280473 ("i40e: Add and delete cloud filter") for
instance :)
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-03-02 18:55 ` Jakub Kicinski
@ 2018-03-02 23:24 ` Alexander Duyck
0 siblings, 0 replies; 18+ messages in thread
From: Alexander Duyck @ 2018-03-02 23:24 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Edward Cree, linux-net-drivers, David Miller, netdev,
John W. Linville, Or Gerlitz, Alexander Duyck
On Fri, Mar 2, 2018 at 10:55 AM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Fri, 2 Mar 2018 15:24:29 +0000, Edward Cree wrote:
>> On Tue, Feb 27, 2018 at 3:47 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
>>
>> > Please, let's stop extending ethtool_rx_flow APIs. I bit my tongue
>> > when Intel was adding their "redirection to VF" based on ethtool ntuples
>> > and look now they're adding the same functionality with flower :| And
>> > wonder how to handle two interfaces doing the same thing.
>> Since sfc only supports ethtool NFC interfaces (we have no flower support,
>> and I also wonder how one is to support both of those interfaces without
>> producing an ugly mess), I'd much rather put this in ethtool than have to
>> implement all of flower just so we can have this extension.
>
> "Just this one extension" is exactly the attitude that can lead to
> messy APIs :(
>
>> I guess part of the question is, which other drivers besides us would want
>> to implement something like this, and what are their requirements?
>
> I think every vendor is trying to come up with ways to make their HW
> work with containers better these days.
>
>> > On the use case itself, I wonder how much sense that makes. Can your
>> > hardware not tag the packet as well so you could then mux it to
>> > something like macvlan offload?
>> In practice the only way our hardware can "tag the packet" is by the
>> selection of RX queue. So you could for instance give a container its
>> own RX queues (rather than just using the existing RX queues on the
>> appropriate CPUs), and maybe in future hook those queues up to l2fwd
>> offload somehow.
>> But that seems like a separate job (offloading the macvlan switching) to
>> what this series is about (making the RX processing happen on the right
>> CPUs). Is software macvlan switching really noticeably slow, anyway?
>
> OK, thanks for clarifying.
>
>> Besides, more powerful filtering than just MAC addr might be needed, if,
>> for instance, the container network is encapsulated. In that case
>> something like a UDP 4-tuple filter might be necessary (or, indeed, a
>> filter looking at the VNID (VxLAN TNI) - which our hardware can do but
>> ethtool doesn't currently have a way to specify). AFAICT l2-fwd-offload
>> can only be used for straight MAC addr, not for overlay networks like
>> VxLAN or FOU? At least, existing ndo_dfwd_add_station() implementations
>> don't seem to check that dev is a macvlan... Does it even support
>> VLAN filters? fm10k implementation doesn't seem to.
>
> Exactly! One can come up with many protocol combinations which flower
> already has APIs for... ethtool is not the place for it.
>
>> Anyway, like I say, filtering traffic onto its own queues seems to be
>> orthogonal, or at least separate, to binding those queues into an
>> upperdev for demux offload.
>
> It is, I was just trying to broaden the scope to more capable HW so we
> design APIs that would serve all.
>
>> On 28/02/18 01:24, Alexander Duyck wrote:
>>
>> > We did something like this for i40e. Basically we required creating
>> > the queue groups using mqprio to keep them symmetric on Tx and Rx, and
>> > then allowed for TC ingress filters to redirect traffic to those queue
>> > groups.
>> >
>> > - Alex
>> If we're not doing macvlan offload, I'm not sure what, if anything, the
>> TX side would buy us. So for now it seems to make sense for TX just to
>> use the TXQ associated with the CPU from which the TX originates, which
>> I believe already happens automatically.
>
> I don't think that's what Alex was referring to. Please see
> commit e284fc280473 ("i40e: Add and delete cloud filter") for
> instance :)
Right. And as far as the Tx queue association goes, right now we are
basing things off of skb->priority, which is easily controlled via
cgroups. So in theory you could associate a given set of cgroups with a
specific set of Tx queues using this approach.
Most of the filtering that Jakub pointed out is applied to the Rx side
to make sure the packets come in on the right queue set.
- Alex
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-03-02 16:01 ` Edward Cree
2018-03-02 17:49 ` David Riddoch
@ 2018-03-07 15:24 ` David Miller
2018-03-07 15:40 ` Edward Cree
1 sibling, 1 reply; 18+ messages in thread
From: David Miller @ 2018-03-07 15:24 UTC (permalink / raw)
To: ecree; +Cc: linux-net-drivers, netdev, linville
From: Edward Cree <ecree@solarflare.com>
Date: Fri, 2 Mar 2018 16:01:47 +0000
> On 01/03/18 18:36, David Miller wrote:
>> We really should have the ethtool interfaces under deep freeze until we
>> convert it to netlink or similar.
>> Second, this is a real hackish way to extend ethtool with new
>> semantics.  A structure changes layout based upon a flag bit setting
>> in an earlier member?  Yikes...
> Yeah, while I'm reasonably confident it's ABI-compatible (presence of that
> flag in the past should always have led to drivers complaining they didn't
> recognise it), and it is somewhat similar to the existing FLOW_EXT flag,
> it is indeed rather ugly.  This is the only way I could see to do it
> without adding a whole new command number, which I felt might also be
> contentious (see: deep freeze) but is probably a better approach.
>
>> Lastly, there has been feedback asking how practical and useful this
>> facility actually is, and you must address that.
> According to our marketing folks, there is end-user demand for this feature
> or something like it.  I didn't see any arguments why this isn't useful,
> just that other things might be useful too.  (Also, sorry it took me so
> long to address their feedback, but I had to do a bit of background
> reading before I could understand what Jakub was suggesting.)
Ok.
Since nobody is really working on the ethtool --> devlink/netlink conversion,
it really isn't reasonable for me to block useful changes like yours.
So please resubmit this series and I will apply it.
Thanks.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-03-07 15:24 ` David Miller
@ 2018-03-07 15:40 ` Edward Cree
2018-03-07 20:55 ` David Miller
0 siblings, 1 reply; 18+ messages in thread
From: Edward Cree @ 2018-03-07 15:40 UTC (permalink / raw)
To: David Miller; +Cc: linux-net-drivers, netdev, linville
On 07/03/18 15:24, David Miller wrote:
> Ok.
>
> Since nobody is really working on the ethtool --> devlink/netlink conversion,
> it really isn't reasonable for me to block useful changes like yours.
>
> So please resubmit this series and I will apply it.
>
> Thanks.
Ok, thanks. Should I stick with the hackish union-and-flag-bit, or define a
new ethtool command number for the extended command?
* Re: [PATCH RESEND net-next 0/2] ntuple filters with RSS
2018-03-07 15:40 ` Edward Cree
@ 2018-03-07 20:55 ` David Miller
0 siblings, 0 replies; 18+ messages in thread
From: David Miller @ 2018-03-07 20:55 UTC (permalink / raw)
To: ecree; +Cc: linux-net-drivers, netdev, linville
From: Edward Cree <ecree@solarflare.com>
Date: Wed, 7 Mar 2018 15:40:39 +0000
> On 07/03/18 15:24, David Miller wrote:
>> Ok.
>>
>> Since nobody is really working on the ethtool --> devlink/netlink conversion,
>> it really isn't reasonable for me to block useful changes like yours.
>>
>> So please resubmit this series and I will apply it.
>>
>> Thanks.
> Ok, thanks.  Should I stick with the hackish union-and-flag-bit, or define a
> new ethtool command number for the extended command?
I'd say stick with the union-and-flag-bit hack.
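For completeness, inserting a filter through the retained interface
would look roughly like the following from userspace. This is a sketch:
it assumes the FLOW_RSS flag and rss_context member proposed in patch
1/2 land in linux/ethtool.h under those names, and it elides error
handling.

#include <string.h>
#include <arpa/inet.h>
#include <sys/ioctl.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>
#include <net/if.h>

/* Steer TCP/IPv4 traffic with destination port 8080 into RSS context 1,
 * spreading over that context's indirection table from base queue 3.
 * Call with an AF_INET socket and ifr_name already filled in.
 */
static int insert_rss_filter(int sock, struct ifreq *ifr)
{
	struct ethtool_rxnfc nfc;

	memset(&nfc, 0, sizeof(nfc));
	nfc.cmd = ETHTOOL_SRXCLSRLINS;
	nfc.fs.flow_type = TCP_V4_FLOW | FLOW_RSS;
	nfc.fs.h_u.tcp_ip4_spec.pdst = htons(8080);
	nfc.fs.m_u.tcp_ip4_spec.pdst = 0xffff;	/* match the whole port */
	nfc.fs.ring_cookie = 3;			/* base RX queue */
	nfc.fs.location = RX_CLS_LOC_ANY;	/* let the driver pick a slot */
	nfc.rss_context = 1;	/* created beforehand via ETHTOOL_SRSSH */
	ifr->ifr_data = (void *)&nfc;
	return ioctl(sock, SIOCETHTOOL, ifr);
}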