* Re: [RFC PATCH 00/11] udp: full early demux for unconnected sockets
From: Eric Dumazet @ 2017-09-22 21:58 UTC (permalink / raw)
To: Paolo Abeni
Cc: netdev, David S. Miller, Pablo Neira Ayuso, Florian Westphal,
Eric Dumazet, Hannes Frederic Sowa
In-Reply-To: <cover.1506114055.git.pabeni@redhat.com>
On Fri, 2017-09-22 at 23:06 +0200, Paolo Abeni wrote:
> This series refactors the UDP early demux code so that:
>
> * full socket lookup is performed for unicast packets
> * a sk is grabbed even for unconnected socket match
> * a dst cache is used even in such scenario
>
> To perform these tasks a couple of facilities are added:
>
> * noref socket references, scoped inside the current RCU section, to be
> explicitly cleared before leaving such section
> * a dst cache inside the inet and inet6 local addresses tables, caching the
> related local dst entry
>
> The measured performance gain under small packet UDP flood is as follows:
>
> ingress NIC    vanilla    patched    delta
> rx queues      (kpps)     (kpps)     (%)
> [ipv4]
> 1              2177       2414       10
> 2              2527       2892       14
> 3              3050       3733       22
This is a clear sign your program is not using the latest SO_REUSEPORT +
[ec]BPF filter [1]:
	return socket[RX_QUEUE# | or CPU#];
If udp_sink uses SO_REUSEPORT with no extra hint, socket selection is
based on a lazy hash, meaning that you do not have proper siloing:
	return socket[hash(skb)];
Multiple CPUs can then:
- compete on grabbing the same socket refcount
- compete on grabbing the receive queue lock
- compete for releasing the lock and the socket refcount
- free skbs on different CPUs than the ones that allocated them.
You are adding complexity to the kernel because you are using a
sub-optimal user space program, favoring false sharing.
First solve the false sharing issue.
Performance with 2 rx queues should be almost twice the performance with
1 rx queue.
Then we can see if the gains you claim are still applicable.
Thanks
PS: Wei Wan is about to release the IPV6 changes so that the big
differences you showed are going to disappear soon.
Refs [1]
tools/testing/selftests/net/reuseport_bpf.c
6a5ef90c58daada158ba16ba330558efc3471491 Merge branch 'faster-soreuseport'
3ca8e4029969d40ab90e3f1ecd83ab1cadd60fbb soreuseport: BPF selection functional test
538950a1b7527a0a52ccd9337e3fcd304f027f13 soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF
e32ea7e747271a0abcd37e265005e97cc81d9df5 soreuseport: fast reuseport UDP socket selection
ef456144da8ef507c8cf504284b6042e9201a05c soreuseport: define reuseport groups
* [PATCH net-next] hv_netvsc: Fix the real number of queues of non-vRSS cases
From: Haiyang Zhang @ 2017-09-22 22:31 UTC (permalink / raw)
To: davem, netdev; +Cc: haiyangz, kys, olaf, vkuznets, linux-kernel
From: Haiyang Zhang <haiyangz@microsoft.com>
For older hosts without multi-channel (vRSS) support, and some error
cases, we still need to set the real number of queues to one.
This patch adds this missing setting.
Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
drivers/net/hyperv/netvsc_drv.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index d4902ee5f260..68eac12fbf75 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1929,6 +1929,12 @@ static int netvsc_probe(struct hv_device *dev,
/* We always need headroom for rndis header */
net->needed_headroom = RNDIS_AND_PPI_SIZE;
+ /* Initialize the number of queues to be 1, we may change it if more
+ * channels are offered later.
+ */
+ netif_set_real_num_tx_queues(net, 1);
+ netif_set_real_num_rx_queues(net, 1);
+
/* Notify the netvsc driver of the new device */
memset(&device_info, 0, sizeof(device_info));
device_info.ring_size = ring_size;
--
2.14.1
* [PATCH net] net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
From: Willem de Bruijn @ 2017-09-22 22:51 UTC (permalink / raw)
To: netdev; +Cc: davem, Willem de Bruijn
Zerocopy skbs frags are copied when the skb is looped to a local sock.
Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
to skb_orphan_frags to deliver_skb and __netif_receive_skb.
With msg_zerocopy, these skbs can also exist in the tx path and thus
loop from dev_queue_xmit_nit. This already calls deliver_skb in its
loop. But it does not orphan before a separate pt_prev->func().
Add the missing skb_orphan_frags_rx.
Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
net/core/dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 9a2254f9802f..3f5b26ff4f74 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1948,7 +1948,7 @@ void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
goto again;
}
out_unlock:
- if (pt_prev)
+ if (pt_prev && !skb_orphan_frags_rx(skb2, GFP_ATOMIC))
pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
rcu_read_unlock();
}
--
2.14.1.821.g8fa685d3b7-goog
* [PATCH net-next v3 0/2] net: dsa: port enabling
From: Vivien Didelot @ 2017-09-22 23:01 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, kernel, David S. Miller, Florian Fainelli,
Andrew Lunn, Vivien Didelot
This patchset makes slave open and close symmetrical and provides
helpers for enabling or disabling a given DSA port.
Changes in v3:
- save the phy_device change for a future patchset
Changes in v2:
- do not remove the phy argument from port enable/disable
Vivien Didelot (2):
net: dsa: make slave close symmetrical to open
net: dsa: add port enable and disable helpers
net/dsa/dsa_priv.h | 3 ++-
net/dsa/port.c | 31 ++++++++++++++++++++++++++++++-
net/dsa/slave.c | 21 ++++++---------------
3 files changed, 38 insertions(+), 17 deletions(-)
--
2.14.1
* [PATCH net-next v3 1/2] net: dsa: make slave close symmetrical to open
From: Vivien Didelot @ 2017-09-22 23:01 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, kernel, David S. Miller, Florian Fainelli,
Andrew Lunn, Vivien Didelot
In-Reply-To: <20170922230156.19521-1-vivien.didelot@savoirfairelinux.com>
The DSA slave open function configures the unicast MAC addresses on the
master device, enables the switch port, changes its STP state, then starts
the PHY device.
Make the close function symmetric, by first stopping the PHY device,
then changing the STP state, disabling the switch port and restoring the
master device.
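The ordering argument can be illustrated with a small user-space model. This is only a sketch: the step names and the trace structure are invented for the example and do not correspond to kernel symbols. It records the bring-up steps in the order the commit message lists them and checks that tear-down undoes them in exactly the reverse order.

```c
#include <stddef.h>

/* Hypothetical labels for the four stages named in the commit message. */
enum step { STEP_MASTER_ADDR, STEP_PORT, STEP_STP, STEP_PHY, STEP_MAX };

struct trace {
	enum step order[STEP_MAX];
	size_t len;
};

static void record(struct trace *t, enum step s)
{
	if (t->len < STEP_MAX)
		t->order[t->len++] = s;
}

/* Bring-up: master addresses, switch port, STP state, PHY. */
void model_open(struct trace *t)
{
	t->len = 0;
	record(t, STEP_MASTER_ADDR);
	record(t, STEP_PORT);
	record(t, STEP_STP);
	record(t, STEP_PHY);
}

/* Tear-down: the same stages, undone in the opposite order. */
void model_close(struct trace *t)
{
	t->len = 0;
	record(t, STEP_PHY);
	record(t, STEP_STP);
	record(t, STEP_PORT);
	record(t, STEP_MASTER_ADDR);
}

/* Returns 1 when `down` is exactly `up` reversed, 0 otherwise. */
int is_mirror(const struct trace *up, const struct trace *down)
{
	size_t i;

	if (up->len != down->len)
		return 0;
	for (i = 0; i < up->len; i++)
		if (up->order[i] != down->order[down->len - 1 - i])
			return 0;
	return 1;
}
```

Before this patch the close path would not pass such a check, since the port was disabled and the STP state changed only after the master device had been restored.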
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---
net/dsa/slave.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 02ace7d462c4..c2bb48579032 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -133,6 +133,11 @@ static int dsa_slave_close(struct net_device *dev)
if (p->phy)
phy_stop(p->phy);
+ dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
+
+ if (ds->ops->port_disable)
+ ds->ops->port_disable(ds, p->dp->index, p->phy);
+
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
if (dev->flags & IFF_ALLMULTI)
@@ -143,11 +148,6 @@ static int dsa_slave_close(struct net_device *dev)
if (!ether_addr_equal(dev->dev_addr, master->dev_addr))
dev_uc_del(master, dev->dev_addr);
- if (ds->ops->port_disable)
- ds->ops->port_disable(ds, p->dp->index, p->phy);
-
- dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
-
return 0;
}
--
2.14.1
* [PATCH net-next v3 2/2] net: dsa: add port enable and disable helpers
From: Vivien Didelot @ 2017-09-22 23:01 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, kernel, David S. Miller, Florian Fainelli,
Andrew Lunn, Vivien Didelot
In-Reply-To: <20170922230156.19521-1-vivien.didelot@savoirfairelinux.com>
Provide dsa_port_enable and dsa_port_disable helpers to respectively
enable and disable a switch port. This makes the dsa_port_set_state_now
helper static.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---
net/dsa/dsa_priv.h | 3 ++-
net/dsa/port.c | 31 ++++++++++++++++++++++++++++++-
net/dsa/slave.c | 19 +++++--------------
3 files changed, 37 insertions(+), 16 deletions(-)
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 9803952a5b40..0298a0f6a349 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -117,7 +117,8 @@ void dsa_master_ethtool_restore(struct net_device *dev);
/* port.c */
int dsa_port_set_state(struct dsa_port *dp, u8 state,
struct switchdev_trans *trans);
-void dsa_port_set_state_now(struct dsa_port *dp, u8 state);
+int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy);
+void dsa_port_disable(struct dsa_port *dp, struct phy_device *phy);
int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br);
void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br);
int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 76d43a82d397..72c8dbd3d3f2 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -56,7 +56,7 @@ int dsa_port_set_state(struct dsa_port *dp, u8 state,
return 0;
}
-void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
+static void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
{
int err;
@@ -65,6 +65,35 @@ void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
pr_err("DSA: failed to set STP state %u (%d)\n", state, err);
}
+int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy)
+{
+ u8 stp_state = dp->bridge_dev ? BR_STATE_BLOCKING : BR_STATE_FORWARDING;
+ struct dsa_switch *ds = dp->ds;
+ int port = dp->index;
+ int err;
+
+ if (ds->ops->port_enable) {
+ err = ds->ops->port_enable(ds, port, phy);
+ if (err)
+ return err;
+ }
+
+ dsa_port_set_state_now(dp, stp_state);
+
+ return 0;
+}
+
+void dsa_port_disable(struct dsa_port *dp, struct phy_device *phy)
+{
+ struct dsa_switch *ds = dp->ds;
+ int port = dp->index;
+
+ dsa_port_set_state_now(dp, BR_STATE_DISABLED);
+
+ if (ds->ops->port_disable)
+ ds->ops->port_disable(ds, port, phy);
+}
+
int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
{
struct dsa_notifier_bridge_info info = {
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index c2bb48579032..bd51ef56ec5b 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -73,9 +73,7 @@ static int dsa_slave_open(struct net_device *dev)
{
struct dsa_slave_priv *p = netdev_priv(dev);
struct dsa_port *dp = p->dp;
- struct dsa_switch *ds = dp->ds;
struct net_device *master = dsa_master_netdev(p);
- u8 stp_state = dp->bridge_dev ? BR_STATE_BLOCKING : BR_STATE_FORWARDING;
int err;
if (!(master->flags & IFF_UP))
@@ -98,13 +96,9 @@ static int dsa_slave_open(struct net_device *dev)
goto clear_allmulti;
}
- if (ds->ops->port_enable) {
- err = ds->ops->port_enable(ds, p->dp->index, p->phy);
- if (err)
- goto clear_promisc;
- }
-
- dsa_port_set_state_now(p->dp, stp_state);
+ err = dsa_port_enable(dp, p->phy);
+ if (err)
+ goto clear_promisc;
if (p->phy)
phy_start(p->phy);
@@ -128,15 +122,12 @@ static int dsa_slave_close(struct net_device *dev)
{
struct dsa_slave_priv *p = netdev_priv(dev);
struct net_device *master = dsa_master_netdev(p);
- struct dsa_switch *ds = p->dp->ds;
+ struct dsa_port *dp = p->dp;
if (p->phy)
phy_stop(p->phy);
- dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
-
- if (ds->ops->port_disable)
- ds->ops->port_disable(ds, p->dp->index, p->phy);
+ dsa_port_disable(dp, p->phy);
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
--
2.14.1
* Re: [PATCH net] net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
From: Eric Dumazet @ 2017-09-22 23:04 UTC (permalink / raw)
To: Willem de Bruijn; +Cc: netdev, davem
In-Reply-To: <20170922225141.126435-1-willemb@google.com>
On Fri, 2017-09-22 at 18:51 -0400, Willem de Bruijn wrote:
> Zerocopy skbs frags are copied when the skb is looped to a local sock.
> Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
> to skb_orphan_frags to deliver_skb and __netif_receive_skb.
>
> With msg_zerocopy, these skbs can also exist in the tx path and thus
> loop from dev_queue_xmit_nit. This already calls deliver_skb in its
> loop. But it does not orphan before a separate pt_prev->func().
>
> Add the missing skb_orphan_frags_rx.
>
> Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
> net/core/dev.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9a2254f9802f..3f5b26ff4f74 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1948,7 +1948,7 @@ void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
> goto again;
> }
> out_unlock:
> - if (pt_prev)
> + if (pt_prev && !skb_orphan_frags_rx(skb2, GFP_ATOMIC))
> pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
Don't you need to kfree_skb(skb2) in case of failure ?
> rcu_read_unlock();
> }
* Re: [PATCH net] net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
From: Willem de Bruijn @ 2017-09-22 23:15 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Willem de Bruijn, Network Development, David Miller
In-Reply-To: <1506121446.29839.177.camel@edumazet-glaptop3.roam.corp.google.com>
On Fri, Sep 22, 2017 at 7:04 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2017-09-22 at 18:51 -0400, Willem de Bruijn wrote:
>> Zerocopy skbs frags are copied when the skb is looped to a local sock.
>> Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
>> to skb_orphan_frags to deliver_skb and __netif_receive_skb.
>>
>> With msg_zerocopy, these skbs can also exist in the tx path and thus
>> loop from dev_queue_xmit_nit. This already calls deliver_skb in its
>> loop. But it does not orphan before a separate pt_prev->func().
>>
>> Add the missing skb_orphan_frags_rx.
>>
>> Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
>> Signed-off-by: Willem de Bruijn <willemb@google.com>
>> ---
>> net/core/dev.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 9a2254f9802f..3f5b26ff4f74 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -1948,7 +1948,7 @@ void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
>> goto again;
>> }
>> out_unlock:
>> - if (pt_prev)
>> + if (pt_prev && !skb_orphan_frags_rx(skb2, GFP_ATOMIC))
>> pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
>
> Don't you need to kfree_skb(skb2) in case of failure ?
Oh, yes, of course! :/ Will fix right away.
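The ownership problem raised in the review can be shown with a tiny user-space model. Everything here is hypothetical (struct buf, handler, deliver_v1, deliver_v2 are stand-ins, not kernel code): the buffer plays the role of skb2, the handler plays the role of pt_prev->func() and consumes the buffer, and prepare_failed stands for a non-zero return from skb_orphan_frags_rx().

```c
/* Stand-in for the cloned skb; `freed` records whether it was released. */
struct buf {
	int freed;
};

void buf_free(struct buf *b)
{
	b->freed = 1;
}

/* Stand-in for the packet handler: it takes ownership and releases. */
static void handler(struct buf *b)
{
	buf_free(b);
}

/* v1 shape: the handler is skipped on failure and nobody frees the buffer. */
void deliver_v1(struct buf *b, int prepare_failed)
{
	if (!prepare_failed)
		handler(b);
}

/* v2 shape: either the handler consumes the buffer or we free it here. */
void deliver_v2(struct buf *b, int prepare_failed)
{
	if (!prepare_failed)
		handler(b);
	else
		buf_free(b);
}
```

In this model deliver_v1() leaves the buffer unreleased whenever the preparation step fails, which is the leak pointed out above, while deliver_v2() releases it on both paths.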
* [PATCH net v2] net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
From: Willem de Bruijn @ 2017-09-22 23:42 UTC (permalink / raw)
To: netdev; +Cc: davem, Willem de Bruijn
Zerocopy skbs frags are copied when the skb is looped to a local sock.
Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
to skb_orphan_frags to deliver_skb and __netif_receive_skb for this.
With msg_zerocopy, these skbs can also exist in the tx path and thus
loop from dev_queue_xmit_nit. This already calls deliver_skb in its
loop. But it does not orphan before a separate pt_prev->func().
Add the missing skb_orphan_frags_rx.
Changes
v1->v2: handle skb_orphan_frags_rx failure
Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
net/core/dev.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 9a2254f9802f..588b473194a8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1948,8 +1948,12 @@ void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
goto again;
}
out_unlock:
- if (pt_prev)
- pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
+ if (pt_prev) {
+ if (!skb_orphan_frags_rx(skb2, GFP_ATOMIC))
+ pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
+ else
+ kfree_skb(skb2);
+ }
rcu_read_unlock();
}
EXPORT_SYMBOL_GPL(dev_queue_xmit_nit);
--
2.14.1.821.g8fa685d3b7-goog
* Re: [PATCH net v2] net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
From: Eric Dumazet @ 2017-09-22 23:50 UTC (permalink / raw)
To: Willem de Bruijn; +Cc: netdev, davem
In-Reply-To: <20170922234237.43174-1-willemb@google.com>
On Fri, 2017-09-22 at 19:42 -0400, Willem de Bruijn wrote:
> Zerocopy skbs frags are copied when the skb is looped to a local sock.
> Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
> to skb_orphan_frags to deliver_skb and __netif_receive_skb for this.
>
> With msg_zerocopy, these skbs can also exist in the tx path and thus
> loop from dev_queue_xmit_nit. This already calls deliver_skb in its
> loop. But it does not orphan before a separate pt_prev->func().
>
> Add the missing skb_orphan_frags_rx.
>
> Changes
> v1->v2: handle skb_orphan_frags_rx failure
>
> Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
> net/core/dev.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
Reviewed-by: Eric Dumazet <edumazet@google.com>
* [PATCH net-next 0/3] liquidio: firmware loading
From: Felix Manlunas @ 2017-09-23 0:12 UTC (permalink / raw)
To: davem
Cc: netdev, raghu.vatsavayi, derek.chickles, satananda.burla,
ricardo.farrington
From: Rick Farrington <ricardo.farrington@cavium.com>
1. Allow host driver parameter to override auto-loaded firmware (in flash).
2. Verify version of firmware that is auto-loaded from flash.
3. Change value of fw_type module parameter to reflect default firmware
image name that is loaded by host driver (in /sys/module/liquidio/...)
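Patch 1/3 tracks a per-adapter firmware state and relies on compare-and-exchange so that only the first device of an adapter loads firmware. A user-space sketch of that election, using C11 atomics instead of the kernel's atomic_t, could look like the following. The enum values are copied from the patch; cmpxchg_state, claim_fw_load and finish_fw_load are names invented for this example, and the two flag parameters stand in for cn23xx_fw_loaded() and fw_type_is_auto().

```c
#include <stdatomic.h>

enum lio_fw_state {
	FW_IS_PRELOADED = 0,
	FW_NEEDS_TO_BE_LOADED = 1,
	FW_IS_BEING_LOADED = 2,
	FW_HAS_BEEN_LOADED = 3,
};

/*
 * Mimic the return convention of the kernel's atomic_cmpxchg(): return
 * the value that was observed in *v, whether or not the swap happened.
 */
static int cmpxchg_state(atomic_int *v, int old_state, int new_state)
{
	int expected = old_state;

	atomic_compare_exchange_strong(v, &expected, new_state);
	return expected;
}

/*
 * Returns 1 if the caller is the instance that must load firmware.
 * fw_running and fw_type_auto are plain flags in this model.
 */
int claim_fw_load(atomic_int *adapter_state, int fw_running, int fw_type_auto)
{
	/* Firmware already running and no override: mark it as preloaded. */
	if (fw_running && fw_type_auto)
		cmpxchg_state(adapter_state, FW_NEEDS_TO_BE_LOADED,
			      FW_IS_PRELOADED);

	/* Only the caller that still sees NEEDS_TO_BE_LOADED wins. */
	return cmpxchg_state(adapter_state, FW_NEEDS_TO_BE_LOADED,
			     FW_IS_BEING_LOADED) == FW_NEEDS_TO_BE_LOADED;
}

/* Called by the winning instance once the load has completed. */
void finish_fw_load(atomic_int *adapter_state)
{
	atomic_store(adapter_state, FW_HAS_BEEN_LOADED);
}
```

With fw_type left at "auto" and firmware already running from flash, no instance claims the load; with any other fw_type the first instance claims it and later instances on the same adapter see FW_IS_BEING_LOADED or FW_HAS_BEEN_LOADED and skip it.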
drivers/net/ethernet/cavium/liquidio/lio_main.c | 90 +++++++++++++++-------
.../net/ethernet/cavium/liquidio/liquidio_image.h | 1 +
.../net/ethernet/cavium/liquidio/octeon_device.c | 11 ++-
.../net/ethernet/cavium/liquidio/octeon_device.h | 10 +++
4 files changed, 84 insertions(+), 28 deletions(-)
--
1.8.3.1
* [PATCH net-next 1/3] liquidio: allow override of firmware present in flash
From: Felix Manlunas @ 2017-09-23 0:12 UTC (permalink / raw)
To: davem
Cc: netdev, raghu.vatsavayi, derek.chickles, satananda.burla,
ricardo.farrington
In-Reply-To: <20170923001206.GA1458@felix-thinkpad.cavium.com>
From: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
---
drivers/net/ethernet/cavium/liquidio/lio_main.c | 68 ++++++++++++++--------
.../net/ethernet/cavium/liquidio/liquidio_image.h | 1 +
.../net/ethernet/cavium/liquidio/octeon_device.c | 11 +++-
.../net/ethernet/cavium/liquidio/octeon_device.h | 10 ++++
4 files changed, 64 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index e7f5494..ce08f71 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -59,9 +59,9 @@
module_param(debug, int, 0644);
MODULE_PARM_DESC(debug, "NETIF_MSG debug bits");
-static char fw_type[LIO_MAX_FW_TYPE_LEN] = LIO_FW_NAME_TYPE_NIC;
+static char fw_type[LIO_MAX_FW_TYPE_LEN] = LIO_FW_NAME_TYPE_AUTO;
module_param_string(fw_type, fw_type, sizeof(fw_type), 0444);
-MODULE_PARM_DESC(fw_type, "Type of firmware to be loaded. Default \"nic\". Use \"none\" to load firmware from flash.");
+MODULE_PARM_DESC(fw_type, "Type of firmware to be loaded (default is \"auto\"), which uses firmware in flash, if present, else loads \"nic\".");
static u32 console_bitmask;
module_param(console_bitmask, int, 0644);
@@ -1115,10 +1115,10 @@ static int liquidio_watchdog(void *param)
return 0;
}
-static bool fw_type_is_none(void)
+static bool fw_type_is_auto(void)
{
- return strncmp(fw_type, LIO_FW_NAME_TYPE_NONE,
- sizeof(LIO_FW_NAME_TYPE_NONE)) == 0;
+ return strncmp(fw_type, LIO_FW_NAME_TYPE_AUTO,
+ sizeof(LIO_FW_NAME_TYPE_AUTO)) == 0;
}
/**
@@ -1302,7 +1302,7 @@ static void octeon_destroy_resources(struct octeon_device *oct)
* Implementation note: only soft-reset the device
* if it is a CN6XXX OR the LAST CN23XX device.
*/
- if (fw_type_is_none())
+ if (atomic_read(oct->adapter_fw_state) == FW_IS_PRELOADED)
octeon_pci_flr(oct);
else if (OCTEON_CN6XXX(oct) || !refcount)
oct->fn_list.soft_reset(oct);
@@ -1934,7 +1934,7 @@ static int load_firmware(struct octeon_device *oct)
char fw_name[LIO_MAX_FW_FILENAME_LEN];
char *tmp_fw_type;
- if (fw_type[0] == '\0')
+ if (fw_type_is_auto())
tmp_fw_type = LIO_FW_NAME_TYPE_NIC;
else
tmp_fw_type = fw_type;
@@ -3882,9 +3882,9 @@ static void nic_starter(struct work_struct *work)
static int octeon_device_init(struct octeon_device *octeon_dev)
{
int j, ret;
- int fw_loaded = 0;
char bootcmd[] = "\n";
char *dbg_enb = NULL;
+ enum lio_fw_state fw_state;
struct octeon_device_priv *oct_priv =
(struct octeon_device_priv *)octeon_dev->priv;
atomic_set(&octeon_dev->status, OCT_DEV_BEGIN_STATE);
@@ -3916,24 +3916,40 @@ static int octeon_device_init(struct octeon_device *octeon_dev)
octeon_dev->app_mode = CVM_DRV_INVALID_APP;
- if (OCTEON_CN23XX_PF(octeon_dev)) {
- if (!cn23xx_fw_loaded(octeon_dev) && !fw_type_is_none()) {
- fw_loaded = 0;
- /* Do a soft reset of the Octeon device. */
- if (octeon_dev->fn_list.soft_reset(octeon_dev))
- return 1;
- /* things might have changed */
- if (!cn23xx_fw_loaded(octeon_dev))
- fw_loaded = 0;
- else
- fw_loaded = 1;
- } else {
- fw_loaded = 1;
- }
- } else if (octeon_dev->fn_list.soft_reset(octeon_dev)) {
- return 1;
+ /* CN23XX supports preloaded firmware if the following is true:
+ *
+ * The adapter indicates that firmware is currently running AND
+ * 'fw_type' is 'auto'.
+ *
+ * (default state is NEEDS_TO_BE_LOADED, override it if appropriate).
+ */
+ if (OCTEON_CN23XX_PF(octeon_dev) &&
+ cn23xx_fw_loaded(octeon_dev) && fw_type_is_auto()) {
+ atomic_cmpxchg(octeon_dev->adapter_fw_state,
+ FW_NEEDS_TO_BE_LOADED, FW_IS_PRELOADED);
}
+ /* If loading firmware, only first device of adapter needs to do so. */
+ fw_state = atomic_cmpxchg(octeon_dev->adapter_fw_state,
+ FW_NEEDS_TO_BE_LOADED,
+ FW_IS_BEING_LOADED);
+
+ /* Here, [local variable] 'fw_state' is set to one of:
+ *
+ * FW_IS_PRELOADED: No firmware is to be loaded (see above)
+ * FW_NEEDS_TO_BE_LOADED: The driver's first instance will load
+ * firmware to the adapter.
+ * FW_IS_BEING_LOADED: The driver's second instance will not load
+ * firmware to the adapter.
+ */
+
+ /* Prior to f/w load, perform a soft reset of the Octeon device;
+ * if error resetting, return w/error.
+ */
+ if (fw_state == FW_NEEDS_TO_BE_LOADED)
+ if (octeon_dev->fn_list.soft_reset(octeon_dev))
+ return 1;
+
/* Initialize the dispatch mechanism used to push packets arriving on
* Octeon Output queues.
*/
@@ -4063,7 +4079,7 @@ static int octeon_device_init(struct octeon_device *octeon_dev)
atomic_set(&octeon_dev->status, OCT_DEV_IO_QUEUES_DONE);
- if ((!OCTEON_CN23XX_PF(octeon_dev)) || !fw_loaded) {
+ if (fw_state == FW_NEEDS_TO_BE_LOADED) {
dev_dbg(&octeon_dev->pci_dev->dev, "Waiting for DDR initialization...\n");
if (!ddr_timeout) {
dev_info(&octeon_dev->pci_dev->dev,
@@ -4125,6 +4141,8 @@ static int octeon_device_init(struct octeon_device *octeon_dev)
dev_err(&octeon_dev->pci_dev->dev, "Could not load firmware to board\n");
return 1;
}
+
+ atomic_set(octeon_dev->adapter_fw_state, FW_HAS_BEEN_LOADED);
}
handshake[octeon_dev->octeon_id].init_ok = 1;
diff --git a/drivers/net/ethernet/cavium/liquidio/liquidio_image.h b/drivers/net/ethernet/cavium/liquidio/liquidio_image.h
index 78a3685..5bf5e87 100644
--- a/drivers/net/ethernet/cavium/liquidio/liquidio_image.h
+++ b/drivers/net/ethernet/cavium/liquidio/liquidio_image.h
@@ -24,6 +24,7 @@
#define LIO_FW_BASE_NAME "lio_"
#define LIO_FW_NAME_SUFFIX ".bin"
#define LIO_FW_NAME_TYPE_NIC "nic"
+#define LIO_FW_NAME_TYPE_AUTO "auto"
#define LIO_FW_NAME_TYPE_NONE "none"
#define LIO_MAX_FIRMWARE_VERSION_LEN 16
diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_device.c b/drivers/net/ethernet/cavium/liquidio/octeon_device.c
index 29d53b1..e4aa339 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_device.c
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_device.c
@@ -541,6 +541,7 @@
static struct octeon_device *octeon_device[MAX_OCTEON_DEVICES];
static atomic_t adapter_refcounts[MAX_OCTEON_DEVICES];
+static atomic_t adapter_fw_states[MAX_OCTEON_DEVICES];
static u32 octeon_device_count;
/* locks device array (i.e. octeon_device[]) */
@@ -770,6 +771,10 @@ int octeon_register_device(struct octeon_device *oct,
oct->adapter_refcount = &adapter_refcounts[oct->octeon_id];
atomic_set(oct->adapter_refcount, 0);
+ /* Like the reference count, the f/w state is shared 'per-adapter' */
+ oct->adapter_fw_state = &adapter_fw_states[oct->octeon_id];
+ atomic_set(oct->adapter_fw_state, FW_NEEDS_TO_BE_LOADED);
+
spin_lock(&octeon_devices_lock);
for (idx = (int)oct->octeon_id - 1; idx >= 0; idx--) {
if (!octeon_device[idx]) {
@@ -780,11 +785,15 @@ int octeon_register_device(struct octeon_device *oct,
atomic_inc(oct->adapter_refcount);
return 1; /* here, refcount is guaranteed to be 1 */
}
- /* if another device is at same bus/dev, use its refcounter */
+ /* If another device is at same bus/dev, use its refcounter
+ * (and f/w state variable).
+ */
if ((octeon_device[idx]->loc.bus == bus) &&
(octeon_device[idx]->loc.dev == dev)) {
oct->adapter_refcount =
octeon_device[idx]->adapter_refcount;
+ oct->adapter_fw_state =
+ octeon_device[idx]->adapter_fw_state;
break;
}
}
diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_device.h b/drivers/net/ethernet/cavium/liquidio/octeon_device.h
index 894af19..33d19c4 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_device.h
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_device.h
@@ -50,6 +50,13 @@ enum octeon_pci_swap_mode {
OCTEON_PCI_32BIT_LW_SWAP = 3
};
+enum lio_fw_state {
+ FW_IS_PRELOADED = 0,
+ FW_NEEDS_TO_BE_LOADED = 1,
+ FW_IS_BEING_LOADED = 2,
+ FW_HAS_BEEN_LOADED = 3,
+};
+
enum {
OCTEON_CONFIG_TYPE_DEFAULT = 0,
NUM_OCTEON_CONFS,
@@ -557,6 +564,9 @@ struct octeon_device {
} loc;
atomic_t *adapter_refcount; /* reference count of adapter */
+
+ atomic_t *adapter_fw_state; /* per-adapter, lio_fw_state */
+
bool ptp_enable;
};
--
1.8.3.1
* [PATCH net-next 2/3] liquidio: verify firmware version when auto-loaded from flash.
From: Felix Manlunas @ 2017-09-23 0:12 UTC (permalink / raw)
To: davem
Cc: netdev, raghu.vatsavayi, derek.chickles, satananda.burla,
ricardo.farrington
In-Reply-To: <20170923001206.GA1458@felix-thinkpad.cavium.com>
From: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
---
drivers/net/ethernet/cavium/liquidio/lio_main.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index ce08f71..a3c9867 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -3303,7 +3303,7 @@ static int setup_nic_devices(struct octeon_device *octeon_dev)
{
struct lio *lio = NULL;
struct net_device *netdev;
- u8 mac[6], i, j;
+ u8 mac[6], i, j, *fw_ver;
struct octeon_soft_command *sc;
struct liquidio_if_cfg_context *ctx;
struct liquidio_if_cfg_resp *resp;
@@ -3414,6 +3414,22 @@ static int setup_nic_devices(struct octeon_device *octeon_dev)
goto setup_nic_dev_fail;
}
+ /* Verify f/w version (in case of 'auto' loading from flash) */
+ fw_ver = octeon_dev->fw_info.liquidio_firmware_version;
+ if (memcmp(LIQUIDIO_BASE_VERSION,
+ fw_ver,
+ strlen(LIQUIDIO_BASE_VERSION))) {
+ dev_err(&octeon_dev->pci_dev->dev,
+ "Unmatched firmware version. Expected %s.x, got %s.\n",
+ LIQUIDIO_BASE_VERSION, fw_ver);
+ goto setup_nic_dev_fail;
+ } else if (atomic_read(octeon_dev->adapter_fw_state) ==
+ FW_IS_PRELOADED) {
+ dev_info(&octeon_dev->pci_dev->dev,
+ "Using auto-loaded firmware version %s.\n",
+ fw_ver);
+ }
+
octeon_swap_8B_data((u64 *)(&resp->cfg_info),
(sizeof(struct liquidio_if_cfg_info)) >> 3);
--
1.8.3.1
* [PATCH net-next 3/3] liquidio: update module parameter fw_type to reflect firmware type loaded
From: Felix Manlunas @ 2017-09-23 0:12 UTC (permalink / raw)
To: davem
Cc: netdev, raghu.vatsavayi, derek.chickles, satananda.burla,
ricardo.farrington
In-Reply-To: <20170923001206.GA1458@felix-thinkpad.cavium.com>
From: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
---
drivers/net/ethernet/cavium/liquidio/lio_main.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index a3c9867..963803b 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -1934,10 +1934,12 @@ static int load_firmware(struct octeon_device *oct)
char fw_name[LIO_MAX_FW_FILENAME_LEN];
char *tmp_fw_type;
- if (fw_type_is_auto())
+ if (fw_type_is_auto()) {
tmp_fw_type = LIO_FW_NAME_TYPE_NIC;
- else
+ strncpy(fw_type, tmp_fw_type, sizeof(fw_type));
+ } else {
tmp_fw_type = fw_type;
+ }
sprintf(fw_name, "%s%s%s_%s%s", LIO_FW_DIR, LIO_FW_BASE_NAME,
octeon_get_conf(oct)->card_name, tmp_fw_type,
--
1.8.3.1
* [PATCH 0/3] fix reuseaddr regression
From: Josef Bacik @ 2017-09-23 0:20 UTC (permalink / raw)
To: davem, netdev, kernel-team, linux-kernel
I introduced a regression when reworking the fastreuse port stuff: it allows
bind conflicts to occur once a reuseaddr successfully opens on an existing tb.
The root cause is that I reversed an if statement, so we set up the tb as if
it had no owners when it actually did, which obviously is not correct.
Dave, could you please queue these changes up for -stable? I've run them
through the net tests and added another test to check for this problem
specifically.
Thanks,
Josef
* [PATCH 1/3] net: set tb->fast_sk_family
From: Josef Bacik @ 2017-09-23 0:20 UTC (permalink / raw)
To: davem, netdev, kernel-team, linux-kernel; +Cc: Josef Bacik
In-Reply-To: <1506126008-9148-1-git-send-email-josef@toxicpanda.com>
From: Josef Bacik <jbacik@fb.com>
We need to set the tb->fast_sk_family properly so we can use the proper
comparison function for all subsequent reuseport bind requests.
Fixes: 637bc8bbe6c0 ("inet: reset tb->fastreuseport when adding a reuseport sk")
Reported-and-tested-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
---
net/ipv4/inet_connection_sock.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index b9c64b40a83a..f87f4805e244 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -328,6 +328,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
tb->fastuid = uid;
tb->fast_rcv_saddr = sk->sk_rcv_saddr;
tb->fast_ipv6_only = ipv6_only_sock(sk);
+ tb->fast_sk_family = sk->sk_family;
#if IS_ENABLED(CONFIG_IPV6)
tb->fast_v6_rcv_saddr = sk->sk_v6_rcv_saddr;
#endif
@@ -354,6 +355,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
tb->fastuid = uid;
tb->fast_rcv_saddr = sk->sk_rcv_saddr;
tb->fast_ipv6_only = ipv6_only_sock(sk);
+ tb->fast_sk_family = sk->sk_family;
#if IS_ENABLED(CONFIG_IPV6)
tb->fast_v6_rcv_saddr = sk->sk_v6_rcv_saddr;
#endif
--
2.7.4
* [PATCH 2/3] net: use inet6_rcv_saddr to compare sockets
From: Josef Bacik @ 2017-09-23 0:20 UTC (permalink / raw)
To: davem, netdev, kernel-team, linux-kernel; +Cc: Josef Bacik
In-Reply-To: <1506126008-9148-1-git-send-email-josef@toxicpanda.com>
From: Josef Bacik <jbacik@fb.com>
In ipv6_rcv_saddr_equal() we need to use inet6_rcv_saddr(sk) for the
ipv6 compare with the fast socket information to make sure we're doing
the proper comparisons.
Fixes: 637bc8bbe6c0 ("inet: reset tb->fastreuseport when adding a reuseport sk")
Reported-and-tested-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
---
net/ipv4/inet_connection_sock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index f87f4805e244..a1bf30438bc5 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -266,7 +266,7 @@ static inline int sk_reuseport_match(struct inet_bind_bucket *tb,
#if IS_ENABLED(CONFIG_IPV6)
if (tb->fast_sk_family == AF_INET6)
return ipv6_rcv_saddr_equal(&tb->fast_v6_rcv_saddr,
- &sk->sk_v6_rcv_saddr,
+ inet6_rcv_saddr(sk),
tb->fast_rcv_saddr,
sk->sk_rcv_saddr,
tb->fast_ipv6_only,
--
2.7.4
^ permalink raw reply related
* [PATCH 3/3] inet: fix improper empty comparison
From: Josef Bacik @ 2017-09-23 0:20 UTC (permalink / raw)
To: davem, netdev, kernel-team, linux-kernel; +Cc: Josef Bacik
In-Reply-To: <1506126008-9148-1-git-send-email-josef@toxicpanda.com>
From: Josef Bacik <jbacik@fb.com>
When doing my reuseport rework I screwed up and changed a
if (hlist_empty(&tb->owners))
to
if (!hlist_empty(&tb->owners))
This is obviously bad as all of the reuseport/reuse logic was reversed,
which caused weird problems like allowing an ipv4 bind conflict if we
opened an ipv4 only socket on a port followed by an ipv6 only socket on
the same port.
Fixes: b9470c27607b ("inet: kill smallest_size and smallest_port")
Reported-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
---
net/ipv4/inet_connection_sock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index a1bf30438bc5..c039c937ba90 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -321,7 +321,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
goto fail_unlock;
}
success:
- if (!hlist_empty(&tb->owners)) {
+ if (hlist_empty(&tb->owners)) {
tb->fastreuse = reuse;
if (sk->sk_reuseport) {
tb->fastreuseport = FASTREUSEPORT_ANY;
--
2.7.4
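
A minimal user-space sketch of the corrected test, for illustration only.
struct reuse_bucket and reuse_bucket_add_owner() are hypothetical names, the
owner list is reduced to a counter, and the else branch is an assumption about
the surrounding code that is not visible in the hunk above. The point is that
the cached flag is seeded only by the first owner; later owners may clear it
but never set it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a bind bucket: owner count plus cached flag. */
struct reuse_bucket {
	int owners;	/* sockets already bound to this port */
	bool fastreuse;	/* cached: every owner so far allowed address reuse */
};

/* Model of the "success:" step with the corrected emptiness test. */
static void reuse_bucket_add_owner(struct reuse_bucket *tb, bool reuse)
{
	if (tb->owners == 0) {
		/* First owner: seed the cache from this socket. */
		tb->fastreuse = reuse;
	} else if (!reuse) {
		/* Assumed behaviour: a later non-reuse owner clears the cache. */
		tb->fastreuse = false;
	}
	tb->owners++;
}
```

With the inverted test, the second call in a sequence such as "bind without
reuse, then bind with reuse" would overwrite the cache with true, which is the
kind of bind conflict the commit message describes.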
^ permalink raw reply related
* Re: [PATCH 0/3] fix reuseaddr regression
From: Josef Bacik @ 2017-09-23 0:28 UTC (permalink / raw)
To: David Miller; +Cc: josef, netdev, linux-kernel, crobinso, labbott, kernel-team
In-Reply-To: <20170919.135056.44228457394918392.davem@davemloft.net>
On Tue, Sep 19, 2017 at 01:50:56PM -0700, David Miller wrote:
> From: josef@toxicpanda.com
> Date: Mon, 18 Sep 2017 12:28:54 -0400
>
> > I introduced a regression when reworking the fastreuse port stuff that allows
> > bind conflicts to occur once a reuseaddr socket successfully opens on an
> > existing tb. The root cause is I reversed an if statement which caused us to
> > set the tb as if there were no owners on the socket if there were, which
> > obviously is not correct.
> >
> > Dave I have follow up patches that will add a selftest for this case and I ran
> > the other reuseport related tests as well. These need to go in pretty quickly
> > as it breaks kvm, I've marked them for stable. Sorry for the regression,
>
> First, please fix your "From: " field so that it actually has your full
> name rather than just your email address. This matters when I apply
> your patches.
>
> Second, remove the stable CC:. For networking changes, you simply ask
> me to queue the changes up for -stable.
>
Sorry Dave, I've fixed my git email settings and I dropped the stable cc and sent
a new round. Didn't see this until just now, my bad.
Josef
^ permalink raw reply
* [PATCH net-next] liquidio: pass date and time info to NIC firmware
From: Felix Manlunas @ 2017-09-23 0:35 UTC (permalink / raw)
To: davem
Cc: netdev, raghu.vatsavayi, derek.chickles, satananda.burla,
manish.awasthi, veerasenareddy.burru
From: Veerasenareddy Burru <veerasenareddy.burru@cavium.com>
Signed-off-by: Veerasenareddy Burru <veerasenareddy.burru@cavium.com>
Signed-off-by: Manish Awasthi <manish.awasthi@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
---
.../net/ethernet/cavium/liquidio/octeon_console.c | 28 +++++++++++++++++++---
1 file changed, 25 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_console.c b/drivers/net/ethernet/cavium/liquidio/octeon_console.c
index ec3dd69..eda799b 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_console.c
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_console.c
@@ -803,15 +803,19 @@ static int octeon_console_read(struct octeon_device *oct, u32 console_num,
}
#define FBUF_SIZE (4 * 1024 * 1024)
+#define MAX_DATE_SIZE 30
int octeon_download_firmware(struct octeon_device *oct, const u8 *data,
size_t size)
{
- int ret = 0;
+ struct octeon_firmware_file_header *h;
+ char date[MAX_DATE_SIZE];
+ struct timeval time;
u32 crc32_result;
+ struct tm tm_val;
u64 load_addr;
u32 image_len;
- struct octeon_firmware_file_header *h;
+ int ret = 0;
u32 i, rem;
if (size < sizeof(struct octeon_firmware_file_header)) {
@@ -890,11 +894,29 @@ int octeon_download_firmware(struct octeon_device *oct, const u8 *data,
load_addr += size;
}
}
+
+ /* Get time of the day */
+ do_gettimeofday(&time);
+ time_to_tm(time.tv_sec, (-sys_tz.tz_minuteswest) * 60, &tm_val);
+ ret = snprintf(date, MAX_DATE_SIZE,
+ " date=%04ld.%02d.%02d-%02d:%02d:%02d",
+ tm_val.tm_year + 1900, tm_val.tm_mon + 1, tm_val.tm_mday,
+ tm_val.tm_hour, tm_val.tm_min, tm_val.tm_sec);
+ if ((sizeof(h->bootcmd) - strnlen(h->bootcmd, sizeof(h->bootcmd))) <
+ ret) {
+ dev_err(&oct->pci_dev->dev, "Boot command buffer too small\n");
+ return -EINVAL;
+ }
+ strncat(h->bootcmd, date,
+ sizeof(h->bootcmd) - strnlen(h->bootcmd, sizeof(h->bootcmd)));
+
dev_info(&oct->pci_dev->dev, "Writing boot command: %s\n",
h->bootcmd);
/* Invoke the bootcmd */
ret = octeon_console_send_cmd(oct, h->bootcmd, 50);
+ if (ret)
+ dev_info(&oct->pci_dev->dev, "Boot command send failed\n");
- return 0;
+ return ret;
}
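
The date-tag step can be sketched in user space as follows. This is an
illustration under stated assumptions, not the driver code: append_date() is a
hypothetical helper, gmtime_r() stands in for do_gettimeofday()/time_to_tm(),
and the time is taken as UTC rather than adjusted by sys_tz. The sketch
deliberately reserves one byte for the terminating NUL, because strncat()
always writes a NUL after the bytes it copies.

```c
#define _POSIX_C_SOURCE 200809L	/* gmtime_r(), strnlen() */

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define MAX_DATE_SIZE 30

/*
 * Append " date=YYYY.MM.DD-HH:MM:SS" to the NUL-terminated string in cmd,
 * whose buffer holds cap bytes. Returns 0 on success, -1 if the tag cannot
 * be built or does not fit; cmd is left unchanged on failure.
 */
static int append_date(char *cmd, size_t cap, time_t now)
{
	char date[MAX_DATE_SIZE];
	struct tm tm_val;
	size_t used;
	int n;

	if (!gmtime_r(&now, &tm_val))
		return -1;

	n = snprintf(date, sizeof(date),
		     " date=%04d.%02d.%02d-%02d:%02d:%02d",
		     tm_val.tm_year + 1900, tm_val.tm_mon + 1, tm_val.tm_mday,
		     tm_val.tm_hour, tm_val.tm_min, tm_val.tm_sec);
	if (n < 0 || (size_t)n >= sizeof(date))
		return -1;	/* formatting failed or was truncated */

	used = strnlen(cmd, cap);
	if (used >= cap || cap - used <= (size_t)n)
		return -1;	/* no room for the tag plus its NUL */

	memcpy(cmd + used, date, (size_t)n + 1);	/* copies the NUL too */
	return 0;
}
```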
^ permalink raw reply related
* Re: [PATCH net-next 10/10] net: hns3: Add mqprio support when interacting with network stack
From: Yunsheng Lin @ 2017-09-23 0:47 UTC (permalink / raw)
To: Jiri Pirko
Cc: davem@davemloft.net, huangdaode, xuwei (O), Liguozhu (Kenneth),
Zhuangyuzeng (Yisen), Gabriele Paoloni, John Garry, Linuxarm,
Salil Mehta, lipeng (Y), netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20170922160322.GB2005@nanopsycho.orion>
Hi, Jiri
On 2017/9/23 0:03, Jiri Pirko wrote:
> Fri, Sep 22, 2017 at 04:11:51PM CEST, linyunsheng@huawei.com wrote:
>> Hi, Jiri
>>
>>>> - if (!tc) {
>>>> + if (if_running) {
>>>> + (void)hns3_nic_net_stop(netdev);
>>>> + msleep(100);
>>>> + }
>>>> +
>>>> + ret = (kinfo->dcb_ops && kinfo->dcb_ops->setup_tc) ?
>>>> + kinfo->dcb_ops->setup_tc(h, tc, prio_tc) : -EOPNOTSUPP;
>>
>>> This is most odd. Why do you call dcb_ops from ndo_setup_tc callback?
>>> Why are you mixing this together? prio->tc mapping can be done
>>> directly in dcbnl
>>
>> Here is what we do in dcb_ops->setup_tc:
>> Firstly, if the current tc num is different from the tc num
>> that the user provides, then we set up the queues for each
>> tc.
>>
>> Secondly, we tell hardware the pri to tc mapping that
>> the stack is using. In rx direction, our hardware needs
>> that mapping to put different packets into different tc
>> queues according to the priority of the packet, then
>> rss decides which specific queue in the tc the
>> packet should go to.
>>
>> By mixing, I suppose you meant why we need the
>> pri to tc information?
>
> by mixing, I mean what I wrote. You are calling dcb_ops callback from
> ndo_setup_tc callback. So you are mixing DCBNL subsystem and TC
> subsystem. Why? Why do you need sch_mqprio? Why DCBNL is not enough for
> all?
When using lldptool, dcbnl is involved.
But when using tc qdisc, dcbnl is not involved. Below is the key
call graph in the kernel when the tc qdisc cmd is executed.
cmd:
tc qdisc add dev eth0 root handle 1:0 mqprio num_tc 4 map 1 2 3 3 1 3 1 1 hw 1
call graph:
rtnetlink_rcv_msg -> tc_modify_qdisc -> qdisc_create -> mqprio_init ->
hns3_nic_setup_tc
When hns3_nic_setup_tc is called, we need to know the number of tcs and the
prio_tc mapping from the tc_mqprio_qopt, which is passed as a parameter to
the ndo_setup_tc function. dcb_ops is our hardware specific method to set
up the tc related parameters in the hardware, so this is why we call the
dcb_ops callback in the ndo_setup_tc callback.
I hope this will answer your question, thanks for your time.
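
The two pieces of information mentioned above (tc count and prio to tc map)
can be modelled in a few lines of user-space C. This is a sketch only:
struct model_mqprio_opt and both helpers are hypothetical and are not the
kernel's struct tc_mqprio_qopt or any hns3 function. The table size of eight
simply matches the eight map entries in the command shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NUM_PRIO 8	/* eight map entries, as in the tc command above */

/* Hypothetical model of the options handed to the driver. */
struct model_mqprio_opt {
	uint8_t num_tc;				/* number of traffic classes */
	uint8_t prio_tc_map[MODEL_NUM_PRIO];	/* priority -> traffic class */
};

/* Every priority must map to a traffic class below num_tc. */
static bool model_map_valid(const struct model_mqprio_opt *opt)
{
	for (int i = 0; i < MODEL_NUM_PRIO; i++) {
		if (opt->prio_tc_map[i] >= opt->num_tc)
			return false;
	}
	return true;
}

/* Look up the traffic class for a priority; -1 if out of range. */
static int model_prio_to_tc(const struct model_mqprio_opt *opt,
			    unsigned int prio)
{
	if (prio >= MODEL_NUM_PRIO)
		return -1;
	return opt->prio_tc_map[prio];
}
```

For the command above (num_tc 4, map 1 2 3 3 1 3 1 1), priority 0 lands in
tc 1 and priority 2 in tc 3; the same map would be rejected by this model if
num_tc were 3.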
>
>
>
>> I hope I did not misunderstand your question, thanks
>> for your time reviewing.
>
> .
>
^ permalink raw reply
* Re: [PATCH net-next] virtio-net: correctly set xdp_xmit for mergeable buffer
From: David Miller @ 2017-09-23 1:16 UTC (permalink / raw)
To: jasowang; +Cc: mst, virtualization, netdev, linux-kernel, john.fastabend
In-Reply-To: <1506062338-3617-1-git-send-email-jasowang@redhat.com>
From: Jason Wang <jasowang@redhat.com>
Date: Fri, 22 Sep 2017 14:38:58 +0800
> We should set xdp_xmit only when xdp_do_redirect() succeeds.
>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
Applied, thanks Jason.
^ permalink raw reply
* Re: tools: selftests: psock_tpacket: skip un-supported tpacket_v3 test
From: David Miller @ 2017-09-23 1:20 UTC (permalink / raw)
To: orson.zhai
Cc: shuah, milosz.wasilewski, sumit.semwal, netdev, linux-kselftest
In-Reply-To: <20170922101717.11933-1-orson.zhai@linaro.org>
From: Orson Zhai <orson.zhai@linaro.org>
Date: Fri, 22 Sep 2017 18:17:17 +0800
> The TPACKET_V3 test of PACKET_TX_RING will fail with kernel versions
> lower than v4.11. Support for the tx ring was added with commit id
> <7f953ab2ba46: af_packet: TX_RING support for TPACKET_V3> on Jan. 3,
> 2017.
>
> So skip this test item instead of reporting a failure on old kernels.
>
> Signed-off-by: Orson Zhai <orson.zhai@linaro.org>
The whole point is to make sure the kernel in which the selftest
code is present functions properly.
There are many tests in selftests that only work on recent kernels.
I'm not applying this, sorry.
^ permalink raw reply
* Re: [PATCH 0/5] use setup_timer() helper function.
From: David Miller @ 2017-09-23 1:22 UTC (permalink / raw)
To: allen.lkml; +Cc: netdev, sameo
In-Reply-To: <1506077902-1796-1-git-send-email-allen.lkml@gmail.com>
From: Allen Pais <allen.lkml@gmail.com>
Date: Fri, 22 Sep 2017 16:28:17 +0530
> This series uses setup_timer() helper function. The series
> addresses the files under net/*.
There was a recent change to the nfc code in net-next which causes
your patches to not apply.
Please respin against net-next, thanks.
^ permalink raw reply
* Re: [PATCH] net: stmmac: Meet alignment requirements for DMA
From: David Miller @ 2017-09-23 1:26 UTC (permalink / raw)
To: matt.redfearn; +Cc: netdev, alexandre.torgue, peppe.cavallaro, linux-kernel
In-Reply-To: <1506078833-14002-1-git-send-email-matt.redfearn@imgtec.com>
From: Matt Redfearn <matt.redfearn@imgtec.com>
Date: Fri, 22 Sep 2017 12:13:53 +0100
> According to Documentation/DMA-API.txt:
> Warnings: Memory coherency operates at a granularity called the cache
> line width. In order for memory mapped by this API to operate
> correctly, the mapped region must begin exactly on a cache line
> boundary and end exactly on one (to prevent two separately mapped
> regions from sharing a single cache line). Since the cache line size
> may not be known at compile time, the API will not enforce this
> requirement. Therefore, it is recommended that driver writers who
> don't take special care to determine the cache line size at run time
> only map virtual regions that begin and end on page boundaries (which
> are guaranteed also to be cache line boundaries).
This is ridiculous. You're misreading what this document is trying
to explain.
As long as you use the dma_{map,unmap}_single() and sync to/from
device interfaces properly, the cacheline issues will be handled properly
and the cpu and the device will see proper up-to-date memory contents.
It is completely ridiculous to require every driver to stash away two
sets of pointers for every packet, and to DMA map the headroom of the SKB,
which is wasteful.
I'm not applying this, fix this problem properly, thanks.
^ permalink raw reply