* [RFC v3 -net 1/2] virtio: Start feature MTU support
From: Aaron Conole @ 2016-04-01 19:32 UTC (permalink / raw)
To: netdev, Michael S. Tsirkin, virtualization, linux-kernel,
Paolo Abeni, Sergei Shtylyov, Pankaj Gupta
In-Reply-To: <1459539136-13948-1-git-send-email-aconole@redhat.com>
This commit adds the feature bit and associated mtu device entry for the
virtio network device. Future commits will make use of these bits to support
negotiated MTU.
Signed-off-by: Aaron Conole <aconole@bytheb.org>
---
v2,v3:
* No change
include/uapi/linux/virtio_net.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index ec32293..41a6a01 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -55,6 +55,7 @@
#define VIRTIO_NET_F_MQ 22 /* Device supports Receive Flow
* Steering */
#define VIRTIO_NET_F_CTRL_MAC_ADDR 23 /* Set MAC address */
+#define VIRTIO_NET_F_MTU 25 /* Device supports Default MTU Negotiation */
#ifndef VIRTIO_NET_NO_LEGACY
#define VIRTIO_NET_F_GSO 6 /* Host handles pkts w/ any GSO type */
@@ -73,6 +74,8 @@ struct virtio_net_config {
* Legal values are between 1 and 0x8000
*/
__u16 max_virtqueue_pairs;
+ /* Default maximum transmit unit advice */
+ __u16 mtu;
} __attribute__((packed));
/*
--
2.5.5
^ permalink raw reply related
* [RFC v3 -next 2/2] virtio_net: Read the advised MTU
From: Aaron Conole @ 2016-04-01 19:32 UTC (permalink / raw)
To: netdev, Michael S. Tsirkin, virtualization, linux-kernel,
Paolo Abeni, Sergei Shtylyov, Pankaj Gupta
In-Reply-To: <1459539136-13948-1-git-send-email-aconole@redhat.com>
This patch checks the feature bit for the VIRTIO_NET_F_MTU feature. If it
exists, read the advised MTU and use it.
No proper error handling is provided for the case where a user changes the
negotiated MTU. A future commit will add proper error handling. Instead, a
warning is emitted if the guest changes the device MTU after previously
being given advice.
Signed-off-by: Aaron Conole <aconole@bytheb.org>
---
v2:
* Whitespace cleanup in the last hunk
* Code style change around the pr_warn
* Additional test for mtu change before printing warning
v3:
* removed the mtu change warning
drivers/net/virtio_net.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 49d84e5..2308083 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1450,6 +1450,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
{
+ struct virtnet_info *vi = netdev_priv(dev);
if (new_mtu < MIN_MTU || new_mtu > MAX_MTU)
return -EINVAL;
dev->mtu = new_mtu;
@@ -1896,6 +1897,12 @@ static int virtnet_probe(struct virtio_device *vdev)
if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
vi->has_cvq = true;
+ if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) {
+ dev->mtu = virtio_cread16(vdev,
+ offsetof(struct virtio_net_config,
+ mtu));
+ }
+
if (vi->any_header_sg)
dev->needed_headroom = vi->hdr_len;
@@ -2081,6 +2088,7 @@ static unsigned int features[] = {
VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ,
VIRTIO_NET_F_CTRL_MAC_ADDR,
VIRTIO_F_ANY_LAYOUT,
+ VIRTIO_NET_F_MTU,
};
static struct virtio_driver virtio_net_driver = {
--
2.5.5
^ permalink raw reply related
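The probe-time logic of patch 2/2 can be modelled in user space as below. This is only a sketch: `struct fake_vdev`, `fake_has_feature()`, `fake_probe_mtu()` and the `DEFAULT_MTU` fallback are hypothetical stand-ins for the virtio core, not kernel API. Only the feature bit number (25) is taken from patch 1/2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_MTU 25   /* feature bit number from patch 1/2 */
#define DEFAULT_MTU      1500 /* assumed fallback when no advice is given */

/* Hypothetical stand-in for a virtio device: feature bits plus config space. */
struct fake_vdev {
	uint64_t features; /* negotiated feature bits */
	uint16_t cfg_mtu;  /* models the new virtio_net_config.mtu field */
};

static bool fake_has_feature(const struct fake_vdev *vdev, unsigned int bit)
{
	return (vdev->features >> bit) & 1;
}

/* Use the advised MTU only when the feature bit was negotiated. */
static unsigned int fake_probe_mtu(const struct fake_vdev *vdev)
{
	if (fake_has_feature(vdev, VIRTIO_NET_F_MTU))
		return vdev->cfg_mtu;
	return DEFAULT_MTU;
}
```

With the bit set and `cfg_mtu` at 9000, `fake_probe_mtu()` returns 9000; without the bit the config field is never consulted.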
* Re: [PATCH] RDS: sync congestion map updating
From: santosh shilimkar @ 2016-04-01 19:47 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: Wengang Wang, leon-2ukJVAZIZ/Y, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <56FC927E.9090404-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
(cc-ing netdev)
On 3/30/2016 7:59 PM, Wengang Wang wrote:
>
>
> On 2016-03-31 09:51, Wengang Wang wrote:
>>
>>
>> On 2016-03-31 01:16, santosh shilimkar wrote:
>>> Hi Wengang,
>>>
>>> On 3/30/2016 9:19 AM, Leon Romanovsky wrote:
>>>> On Wed, Mar 30, 2016 at 05:08:22PM +0800, Wengang Wang wrote:
>>>>> Problem is found that some among a lot of parallel RDS communications
>>>>> hang. In my test ten or so among 33 communications hang. The send
>>>>> requests got -ENOBUF error meaning the peer socket (port) is congested.
>>>>> But meanwhile, peer socket (port) is not congested.
>>>>>
>>>>> The congestion map updating can happen in two paths: one is in the
>>>>> rds_recvmsg path and the other is when it receives packets from the
>>>>> hardware. There is no synchronization when updating the congestion map.
>>>>> So a bit operation (clearing) in the rds_recvmsg path can be skipped by
>>>>> another bit operation (setting) in the hardware packet receiving path.
>>>>>
>
> To be more detailed. Here, the two paths (user calls recvmsg and
> hardware receives data) are for different rds socks. thus the
> rds_sock->rs_recv_lock is not helpful to sync the updating on congestion
> map.
>
For archive purposes, let me try to conclude the thread. I synced
with Wengang off-list and we came up with the fix below. I was under
the impression that __set_bit_le() was the atomic version. After fixing
it as in the patch at the end of this email, the bug is addressed.
I will probably send this as a fix for stable as well.
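To illustrate the difference between the two families of bit helpers, here is a user-space sketch using C11 atomics. The function names are hypothetical and the little-endian bit numbering handled by the kernel's *_le helpers is ignored; this is not the kernel implementation, only a model of "plain read-modify-write" versus "single atomic read-modify-write" on a shared word.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define BITS_PER_WORD 64

/* Non-atomic: separate load, OR and store. If another thread updates the
 * same word between the load and the store, that update is lost. */
static void plain_set_bit(unsigned int nr, uint64_t *map)
{
	map[nr / BITS_PER_WORD] |= UINT64_C(1) << (nr % BITS_PER_WORD);
}

/* Atomic: one indivisible read-modify-write on the containing word. */
static void atomic_set_bit(unsigned int nr, _Atomic uint64_t *map)
{
	atomic_fetch_or(&map[nr / BITS_PER_WORD],
			UINT64_C(1) << (nr % BITS_PER_WORD));
}

static void atomic_clear_bit(unsigned int nr, _Atomic uint64_t *map)
{
	atomic_fetch_and(&map[nr / BITS_PER_WORD],
			 ~(UINT64_C(1) << (nr % BITS_PER_WORD)));
}

static int atomic_test_bit(unsigned int nr, _Atomic uint64_t *map)
{
	return (atomic_load(&map[nr / BITS_PER_WORD]) >> (nr % BITS_PER_WORD)) & 1;
}
```

Two ports that land in the same word (for example bits 3 and 5) are exactly the case where a plain update from one socket can overwrite a concurrent update from another.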
From 5614b61f6fdcd6ae0c04e50b97efd13201762294 Mon Sep 17 00:00:00 2001
From: Santosh Shilimkar <santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Date: Wed, 30 Mar 2016 23:26:47 -0700
Subject: [PATCH] RDS: Fix the atomicity for congestion map update
Two different threads with different rds sockets may be in
rds_recv_rcvbuf_delta() via receive path. If their ports
both map to the same word in the congestion map, then
using non-atomic ops to update it could cause the map to
be incorrect. Let's use atomics to avoid such an issue.
Full credit to Wengang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> for
finding the issue, analysing it and also pointing out
the offending code with a spin-lock-based fix.
Signed-off-by: Wengang Wang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
net/rds/cong.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/rds/cong.c b/net/rds/cong.c
index e6144b8..6641bcf 100644
--- a/net/rds/cong.c
+++ b/net/rds/cong.c
@@ -299,7 +299,7 @@ void rds_cong_set_bit(struct rds_cong_map *map, __be16 port)
i = be16_to_cpu(port) / RDS_CONG_MAP_PAGE_BITS;
off = be16_to_cpu(port) % RDS_CONG_MAP_PAGE_BITS;
- __set_bit_le(off, (void *)map->m_page_addrs[i]);
+ set_bit_le(off, (void *)map->m_page_addrs[i]);
}
void rds_cong_clear_bit(struct rds_cong_map *map, __be16 port)
@@ -313,7 +313,7 @@ void rds_cong_clear_bit(struct rds_cong_map *map, __be16 port)
i = be16_to_cpu(port) / RDS_CONG_MAP_PAGE_BITS;
off = be16_to_cpu(port) % RDS_CONG_MAP_PAGE_BITS;
- __clear_bit_le(off, (void *)map->m_page_addrs[i]);
+ clear_bit_le(off, (void *)map->m_page_addrs[i]);
}
static int rds_cong_test_bit(struct rds_cong_map *map, __be16 port)
--
1.7.1
^ permalink raw reply related
* Re: [PATCH v3 net-next] net: ipv4: Consider failed nexthops in multipath routes
From: Julian Anastasov @ 2016-04-01 19:51 UTC (permalink / raw)
To: David Ahern; +Cc: netdev
In-Reply-To: <1459523824-29828-1-git-send-email-dsa@cumulusnetworks.com>
Hello,
On Fri, 1 Apr 2016, David Ahern wrote:
> v3
> - Julian comments: changed use of dead in documentation to failed,
> init state to NUD_REACHABLE which simplifies fib_good_nh, use of
> nh_dev for neighbor lookup, fallback to first entry which is what
> current logic does
>
> v2
> - use rcu locking to avoid refcnts per Eric's suggestion
> - only consider neighbor info for nh_scope == RT_SCOPE_LINK per Julian's
> comment
> - drop the 'state == NUD_REACHABLE' from the state check since it is
> part of NUD_VALID (comment from Julian)
> - wrapped the use of the neigh in a sysctl
>
> Documentation/networking/ip-sysctl.txt | 10 ++++++++++
> include/net/netns/ipv4.h | 3 +++
> net/ipv4/fib_semantics.c | 32 ++++++++++++++++++++++++++++----
> net/ipv4/sysctl_net_ipv4.c | 11 +++++++++++
> 4 files changed, 52 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> index d97268e8ff10..e08abf96824a 100644
> --- a/net/ipv4/fib_semantics.c
> +++ b/net/ipv4/fib_semantics.c
> @@ -1559,21 +1559,45 @@ int fib_sync_up(struct net_device *dev, unsigned int nh_flags)
> }
>
> #ifdef CONFIG_IP_ROUTE_MULTIPATH
> +static bool fib_good_nh(const struct fib_nh *nh)
> +{
> + int state = NUD_REACHABLE;
> +
> + if (nh->nh_scope == RT_SCOPE_LINK) {
> + struct neighbour *n = NULL;
NULL is not needed anymore.
> +
> + rcu_read_lock_bh();
> +
> + n = __neigh_lookup_noref(&arp_tbl, &nh->nh_gw, nh->nh_dev);
> + if (n)
> + state = n->nud_state;
> +
> + rcu_read_unlock_bh();
> + }
> +
> + return !!(state & NUD_VALID);
> +}
>
> void fib_select_multipath(struct fib_result *res, int hash)
> {
> struct fib_info *fi = res->fi;
> + struct net *net = fi->fib_net;
> + unsigned char first_nhsel = 0;
Looking at fib_table_lookup(), res->nh_sel is not 0
in all cases. I personally don't like that we do not
fall back properly, but to make this logic more correct we
can use something like this:
bool first = false;
>
> for_nexthops(fi) {
> if (hash > atomic_read(&nh->nh_upper_bound))
> continue;
>
> - res->nh_sel = nhsel;
> - return;
> + if (!net->ipv4.sysctl_fib_multipath_use_neigh ||
> + fib_good_nh(nh)) {
> + res->nh_sel = nhsel;
> + return;
> + }
> + if (!first_nhsel)
> + first_nhsel = nhsel;
if (!first) {
res->nh_sel = nhsel;
first = true;
}
> } endfor_nexthops(fi);
>
> - /* Race condition: route has just become dead. */
> - res->nh_sel = 0;
> + res->nh_sel = first_nhsel;
And then this is not needed anymore. Even setting
to 0 was not needed because 0 is not better than the current
nh_sel when both are DEAD/LINKDOWN.
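The selection loop with the suggested "first" flag can be modelled in user space roughly as follows. This is a sketch under assumptions: `struct fake_nh` and `select_nexthop()` are hypothetical, `neigh_valid` stands in for the result of fib_good_nh(), and `cur_sel` models the value res->nh_sel already holds on entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one nexthop in a multipath route. */
struct fake_nh {
	int upper_bound;  /* models nh_upper_bound */
	bool neigh_valid; /* models the result of fib_good_nh() */
};

/* Return the selected nexthop index. cur_sel is returned unchanged when
 * no nexthop covers the hash at all. */
static int select_nexthop(const struct fake_nh *nhs, int n, int hash,
			  bool use_neigh, int cur_sel)
{
	bool first = false;
	int sel = cur_sel;

	for (int i = 0; i < n; i++) {
		if (hash > nhs[i].upper_bound)
			continue;

		/* Preferred: neighbour check disabled, or neighbour is valid. */
		if (!use_neigh || nhs[i].neigh_valid)
			return i;

		/* Remember the first hash match as the fallback. */
		if (!first) {
			sel = i;
			first = true;
		}
	}
	return sel;
}
```

Because the fallback is recorded inside the loop, no assignment is needed after it, and an index of 0 is handled like any other.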
Regards
^ permalink raw reply
* [PATCH v2 -next] net/core/dev: Warn on a too-short GRO frame
From: Aaron Conole @ 2016-04-01 19:58 UTC (permalink / raw)
To: netdev, Joe Perches
From: Aaron Conole <aconole@bytheb.org>
When signaling that a GRO frame is ready to be processed, the network stack
correctly checks length and aborts processing when a frame is less than 14
bytes. However, such a condition is really indicative of a broken driver,
and should be loudly signaled, rather than silently dropped as the case is
today.
Convert the condition to use net_warn_ratelimited() to ensure the stack
loudly complains about such broken drivers.
Signed-off-by: Aaron Conole <aconole@bytheb.org>
---
v2:
* Convert from WARN_ON to net_warn_ratelimited
net/core/dev.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index b9bcbe7..1be269e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4663,6 +4663,8 @@ static struct sk_buff *napi_frags_skb(struct napi_struct *napi)
if (unlikely(skb_gro_header_hard(skb, hlen))) {
eth = skb_gro_header_slow(skb, hlen, 0);
if (unlikely(!eth)) {
+ net_warn_ratelimited("%s: dropping impossible skb\n",
+ __func__);
napi_reuse_skb(napi, skb);
return NULL;
}
--
2.5.5
^ permalink raw reply related
* Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864
From: Alexander Duyck @ 2016-04-01 19:58 UTC (permalink / raw)
To: David Miller
Cc: Eric Dumazet, Alex Duyck, Herbert Xu, Tom Herbert, Jesse Gross,
Eric Dumazet, Netdev
In-Reply-To: <20160401.152405.915323132719949585.davem@davemloft.net>
On Fri, Apr 1, 2016 at 12:24 PM, David Miller <davem@davemloft.net> wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 01 Apr 2016 11:49:03 -0700
>
>> For example, TCP stack tracks per socket ID generation, even if it
>> sends DF=1 frames. Damn useful for tcpdump analysis and drop
>> inference.
>
> Thanks for mentioning this, I never considered this use case.
RFC 6864 is pretty explicit about this: the IPv4 ID is used only for
fragmentation. https://tools.ietf.org/html/rfc6864#section-4.1
The goal with this change is to try and keep most of the existing
behavior intact without violating this rule. I would think the
sequence number should give you the ability to infer a drop in the
case of TCP. In the case of UDP tunnels we are now getting a bit more
data since we were ignoring the outer IP header ID before.
>> With your change, the resulting GRO packet would propagate the ID of
>> first frag. Most GSO/GSO engines will then provide a ID sequence,
>> which might not match original packets.
Right. But that is only in the case where the IP IDs did not already
increment or were left uninitialized meaning the transmitter was
probably already following RFC 6864 and chose a fixed value. Odds are
in such a case we end up improving the performance if anything as
there are plenty of legacy systems out there that still require the
IPv4 ID increment in order to get LRO/GRO.
>> I do not particularly care, but it is worth mentioning that GRO+TSO
>> would not be idempotent anymore.
In the patch I mentioned we had already broken that. I'm basically
just going through and fixing the cases for tunnels where we were
doing the outer header wrong while at the same time relaxing the
requirements for the inner header if DF is set. I'll probably add
some documentation to the Documentation folder about it as well. I'm
currently in the process of writing up documentation for GSO and GSO
partial for the upcoming patchset. I can pretty easily throw in a few
comments about GRO as well.
> Our eventual plan was to start emitting zero in the ID field for
> outgoing TCP datagrams with DF set, since the issue that caused us to
> generate incrementing IDs in the first place (buggy Microsoft SLHC
> compression) we decided is not relevant and important enough to
> accommodate any more.
For the GSO partial stuff I was probably just going to have the IP ID
on the inner headers lurch forward in chunks equal to gso_segs when we
are doing the segmentation. I didn't want to use a fixed value just
because that would likely make it easy to identify Linux devices being
a bump in the wire. I figure if there are already sources that
weren't updating IP ID for their segmentation offloads then if we just
take that approach odds are we will blend in with the other devices
and be more difficult to single out.
Another reason for doing it this way is that different devices are
going to have different behaviors with GSO partial. In the case of
the i40e driver it recognizes both inner and outer network headers so
it can increment both correctly. In the case of igb and ixgbe they
only can support the outer header so the inner IP ID value would be
lurching by gso_size every time we move from one GSO frame to the
next.
> So outside of your TCP behavior analysis case, there isn't a
> compelling argument to keeping that code around any more, rather than
> just put zero in the ID field.
>
> I suppose we could keep the counter code around and allow it to be
> enabled using a sysctl or socket option, but how strongly do you
> really feel about this?
I'm not suggesting we drop the counter code for transmit. What RFC
6864 says is "Originating sources MAY set the IPv4 ID field of atomic
datagrams to any value."
For transmit we can leave the IP ID code as is. For receive we should
not be snooping into the IP ID for any frames that have the DF bit set
as devices that have adopted RFC 6864 on their transmit path will end
up causing issues.
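As a rough illustration of that receive-side rule, the check could look like the sketch below. The helper name and its signature are hypothetical and this is not the GRO code itself. It also simplifies RFC 6864's definition of an atomic datagram (DF set, MF clear, fragment offset zero) down to testing the DF bit only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IP_DF 0x4000 /* Don't Fragment flag in frag_off (host byte order) */

/* Return true when the IPv4 ID should prevent merging this packet into
 * an existing flow. With DF set the ID is ignored entirely. */
static bool ip_id_blocks_merge(uint16_t frag_off, uint16_t id,
			       uint16_t expected_id)
{
	if (frag_off & IP_DF)
		return false; /* sender may have put any value in the ID */

	return id != expected_id; /* without DF, require the expected ID */
}
```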
- Alex
^ permalink raw reply
* Re: [Odd commit author id merge via netdev]
From: Johannes Berg @ 2016-04-01 20:01 UTC (permalink / raw)
To: santosh shilimkar, netdev, David S. Miller
In-Reply-To: <56FEB50E.4060004@oracle.com>
On Fri, 2016-04-01 at 10:51 -0700, santosh shilimkar wrote:
> Hi Dave,
>
> I noticed something odd while checking the recent
> commits of mine in kernel.org tree made it via netdev.
>
> Don't know if its patchwork tool doing this.
> Usual author line in my git objects :
> Author: Santosh Shilimkar <email-id>
>
> But the commits going via your tree seems to be like below..
> Author: email-id <email-id>
>
> Few more examples of the commits end of the email. Can this
> be fixed for future commits ? The git objects you pulled from
> my tree directly have right author format where as ones which
> are picked from patchworks seems to be odd.
>
Patchwork does store this info somehow and re-use it, quite possibly
from the very first patch you ever sent. I think this bug was *just*
fixed in patchwork, but it'll probably be a while until that fix lands.
However, you can go and create a patchwork account with the real name,
associate it with all the email addresses you use and then I think
it'll pick it up. Not entirely sure though, you'll have to test it.
johannes
^ permalink raw reply
* [PATCH] ip6_tunnel: set rtnl_link_ops before calling register_netdevice
From: Thadeu Lima de Souza Cascardo @ 2016-04-01 20:17 UTC (permalink / raw)
To: netdev
When creating an ip6tnl tunnel with ip tunnel, rtnl_link_ops is not set
before ip6_tnl_create2 is called. When register_netdevice is called, there
is no linkinfo attribute in the NEWLINK message because of that.
Setting rtnl_link_ops before calling register_netdevice fixes that.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
---
net/ipv6/ip6_tunnel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index eb2ac4b..1f20345 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -252,12 +252,12 @@ static int ip6_tnl_create2(struct net_device *dev)
t = netdev_priv(dev);
+ dev->rtnl_link_ops = &ip6_link_ops;
err = register_netdevice(dev);
if (err < 0)
goto out;
strcpy(t->parms.name, dev->name);
- dev->rtnl_link_ops = &ip6_link_ops;
dev_hold(dev);
ip6_tnl_link(ip6n, t);
--
2.5.0
^ permalink raw reply related
* Re: [PATCH v2 -next] net/core/dev: Warn on a too-short GRO frame
From: Eric Dumazet @ 2016-04-01 20:29 UTC (permalink / raw)
To: Aaron Conole; +Cc: netdev, Joe Perches
In-Reply-To: <1459540695-24404-1-git-send-email-aconole@redhat.com>
On Fri, 2016-04-01 at 15:58 -0400, Aaron Conole wrote:
> From: Aaron Conole <aconole@bytheb.org>
>
> When signaling that a GRO frame is ready to be processed, the network stack
> correctly checks length and aborts processing when a frame is less than 14
> bytes. However, such a condition is really indicative of a broken driver,
> and should be loudly signaled, rather than silently dropped as the case is
> today.
>
> Convert the condition to use net_warn_ratelimited() to ensure the stack
> loudly complains about such broken drivers.
>
> Signed-off-by: Aaron Conole <aconole@bytheb.org>
> ---
Shouldn't we give a hint of device name ?
(available in napi->dev->name )
^ permalink raw reply
* [PATCH v4 net-next 00/15] MTU/buffer reconfig changes
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
Hi!
Sorry it takes me so long to iterate this.
Previous series included some not entirely related patches;
this one is cut down. The main issue I'm trying to solve here
is that .ndo_change_mtu() in the nfpvf driver does a full
close/open to reallocate buffers, which, if open fails,
can leave the device basically closed even though
the interface is started. As suggested by you I try to move
towards a paradigm where the resources are allocated first
and the MTU change is only done once I'm certain (almost)
nothing can fail. Almost, because I need to communicate
with FW and that can always time out.
Patch 1 fixes a small issue. The next 10 patches reorganize things
so that I can easily allocate new rings and sets of buffers
while the device is running. Patches 13 and 15 reshape the
.ndo_change_mtu() and ethtool's ring-resize operation into
the desired form.
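The prepare/commit idea can be sketched in user space as follows. Everything here is hypothetical (`struct fake_dev`, `fake_change_mtu()`, the 64-byte overhead and the failure-injection flag); it only shows the ordering: allocate the new resources first, and touch the live device state only after nothing can fail any more.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device state: current MTU and one RX buffer sized for it. */
struct fake_dev {
	int mtu;
	size_t bufsz;
	void *rx_buf;
};

/* simulate_alloc_failure is a test hook standing in for a real -ENOMEM. */
static int fake_change_mtu(struct fake_dev *dev, int new_mtu,
			   int simulate_alloc_failure)
{
	size_t new_bufsz = (size_t)new_mtu + 64; /* 64: assumed header overhead */
	void *new_buf;

	/* Prepare: allocate everything the new configuration needs. */
	new_buf = simulate_alloc_failure ? NULL : malloc(new_bufsz);
	if (!new_buf)
		return -ENOMEM; /* device state is untouched and stays usable */

	/* Commit: swap in the new resources, then release the old ones. */
	free(dev->rx_buf);
	dev->rx_buf = new_buf;
	dev->bufsz = new_bufsz;
	dev->mtu = new_mtu;
	return 0;
}
```

A failed call leaves `mtu`, `bufsz` and `rx_buf` exactly as they were, which is the property the close/open approach lacks.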
Jakub Kicinski (15):
nfp: correct RX buffer length calculation
nfp: move link state interrupt request/free calls
nfp: break up nfp_net_{alloc|free}_rings
nfp: make *x_ring_init do all the init
nfp: allocate ring SW structs dynamically
nfp: cleanup tx ring flush and rename to reset
nfp: reorganize initial filling of RX rings
nfp: preallocate RX buffers early in .ndo_open
nfp: move filling ring information to FW config
nfp: slice .ndo_open() and .ndo_stop() up
nfp: sync ring state during FW reconfiguration
nfp: propagate list buffer size in struct rx_ring
nfp: convert .ndo_change_mtu() to prepare/commit paradigm
nfp: pass ring count as function parameter
nfp: allow ring size reconfiguration at runtime
drivers/net/ethernet/netronome/nfp/nfp_net.h | 10 +-
.../net/ethernet/netronome/nfp/nfp_net_common.c | 905 ++++++++++++++-------
.../net/ethernet/netronome/nfp/nfp_net_ethtool.c | 30 +-
3 files changed, 617 insertions(+), 328 deletions(-)
--
1.9.1
^ permalink raw reply
* [PATCH v4 net-next 01/15] nfp: correct RX buffer length calculation
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
When calculating the RX buffer length we need to account for
up to 2 VLAN tags and up to 8 MPLS labels. Rounding up to 1k
is a relic of a distant past and can be removed. While at
it, also remove a trivial print statement.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 43c618bafdb6..307c02c4ba4a 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -61,6 +61,7 @@
#include <linux/ktime.h>
+#include <net/mpls.h>
#include <net/vxlan.h>
#include "nfp_net_ctrl.h"
@@ -1911,9 +1912,6 @@ static void nfp_net_set_rx_mode(struct net_device *netdev)
static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
{
struct nfp_net *nn = netdev_priv(netdev);
- u32 tmp;
-
- nn_dbg(nn, "New MTU = %d\n", new_mtu);
if (new_mtu < 68 || new_mtu > nn->max_mtu) {
nn_err(nn, "New MTU (%d) is not valid\n", new_mtu);
@@ -1921,10 +1919,8 @@ static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
}
netdev->mtu = new_mtu;
-
- /* Freelist buffer size rounded up to the nearest 1K */
- tmp = new_mtu + ETH_HLEN + VLAN_HLEN + NFP_NET_MAX_PREPEND;
- nn->fl_bufsz = roundup(tmp, 1024);
+ nn->fl_bufsz = NFP_NET_MAX_PREPEND + ETH_HLEN + VLAN_HLEN * 2 +
+ MPLS_HLEN * 8 + new_mtu;
/* restart if running */
if (netif_running(netdev)) {
--
1.9.1
^ permalink raw reply related
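For a feel of the numbers in patch 01/15, the old and new buffer-size formulas can be compared in a small user-space sketch. The header lengths are the usual Ethernet, VLAN and MPLS sizes; the value 64 for NFP_NET_MAX_PREPEND is an assumption made for this example, and the function names are hypothetical.

```c
#include <assert.h>

#define ETH_HLEN            14 /* Ethernet header */
#define VLAN_HLEN           4  /* one VLAN tag */
#define MPLS_HLEN           4  /* one MPLS label */
#define NFP_NET_MAX_PREPEND 64 /* assumed value for this example */

/* Old formula: one VLAN tag, then round up to the next multiple of 1024. */
static unsigned int fl_bufsz_old(unsigned int mtu)
{
	unsigned int tmp = mtu + ETH_HLEN + VLAN_HLEN + NFP_NET_MAX_PREPEND;

	return (tmp + 1023) / 1024 * 1024;
}

/* New formula: two VLAN tags and eight MPLS labels, no rounding. */
static unsigned int fl_bufsz_new(unsigned int mtu)
{
	return NFP_NET_MAX_PREPEND + ETH_HLEN + VLAN_HLEN * 2 +
	       MPLS_HLEN * 8 + mtu;
}
```

Under these assumptions an MTU of 1500 gives 2048 bytes with the old formula and 1618 with the new one; for 9000 the results are 9216 and 9118.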
* [PATCH v4 net-next 02/15] nfp: move link state interrupt request/free calls
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
We need to be able to disable the link state interrupt when
the device is brought down. We used to just free the IRQ
at the beginning of .ndo_stop(). As we now move towards
more ordered .ndo_open()/.ndo_stop() paths LSC allocation
should be placed in the "allocate resource" section.
Since the IRQ can't be freed early in .ndo_stop(), it is
disabled instead.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
.../net/ethernet/netronome/nfp/nfp_net_common.c | 23 +++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 307c02c4ba4a..9ce04b179fac 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1730,10 +1730,16 @@ static int nfp_net_netdev_open(struct net_device *netdev)
NFP_NET_IRQ_EXN_IDX, nn->exn_handler);
if (err)
return err;
+ err = nfp_net_aux_irq_request(nn, NFP_NET_CFG_LSC, "%s-lsc",
+ nn->lsc_name, sizeof(nn->lsc_name),
+ NFP_NET_IRQ_LSC_IDX, nn->lsc_handler);
+ if (err)
+ goto err_free_exn;
+ disable_irq(nn->irq_entries[NFP_NET_CFG_LSC].vector);
err = nfp_net_alloc_rings(nn);
if (err)
- goto err_free_exn;
+ goto err_free_lsc;
err = netif_set_real_num_tx_queues(netdev, nn->num_tx_rings);
if (err)
@@ -1813,19 +1819,11 @@ static int nfp_net_netdev_open(struct net_device *netdev)
netif_tx_wake_all_queues(netdev);
- err = nfp_net_aux_irq_request(nn, NFP_NET_CFG_LSC, "%s-lsc",
- nn->lsc_name, sizeof(nn->lsc_name),
- NFP_NET_IRQ_LSC_IDX, nn->lsc_handler);
- if (err)
- goto err_stop_tx;
+ enable_irq(nn->irq_entries[NFP_NET_CFG_LSC].vector);
nfp_net_read_link_status(nn);
return 0;
-err_stop_tx:
- netif_tx_disable(netdev);
- for (r = 0; r < nn->num_r_vecs; r++)
- nfp_net_tx_flush(nn->r_vecs[r].tx_ring);
err_disable_napi:
while (r--) {
napi_disable(&nn->r_vecs[r].napi);
@@ -1835,6 +1833,8 @@ err_clear_config:
nfp_net_clear_config_and_disable(nn);
err_free_rings:
nfp_net_free_rings(nn);
+err_free_lsc:
+ nfp_net_aux_irq_free(nn, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
err_free_exn:
nfp_net_aux_irq_free(nn, NFP_NET_CFG_EXN, NFP_NET_IRQ_EXN_IDX);
return err;
@@ -1856,7 +1856,7 @@ static int nfp_net_netdev_close(struct net_device *netdev)
/* Step 1: Disable RX and TX rings from the Linux kernel perspective
*/
- nfp_net_aux_irq_free(nn, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
+ disable_irq(nn->irq_entries[NFP_NET_CFG_LSC].vector);
netif_carrier_off(netdev);
nn->link_up = false;
@@ -1877,6 +1877,7 @@ static int nfp_net_netdev_close(struct net_device *netdev)
}
nfp_net_free_rings(nn);
+ nfp_net_aux_irq_free(nn, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
nfp_net_aux_irq_free(nn, NFP_NET_CFG_EXN, NFP_NET_IRQ_EXN_IDX);
nn_dbg(nn, "%s down", netdev->name);
--
1.9.1
^ permalink raw reply related
* [PATCH v4 net-next 04/15] nfp: make *x_ring_init do all the init
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
The nfp_net_[rt]x_ring_init functions used to be called from the probe
path only and some of their functionality was spilled to the
call site. In order to reuse them for ring reconfiguration
we need them to do all the init.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
.../net/ethernet/netronome/nfp/nfp_net_common.c | 28 ++++++++++++++--------
1 file changed, 18 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index b435d15ef8d6..7233471af7a8 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -348,12 +348,18 @@ static irqreturn_t nfp_net_irq_exn(int irq, void *data)
/**
* nfp_net_tx_ring_init() - Fill in the boilerplate for a TX ring
* @tx_ring: TX ring structure
+ * @r_vec: IRQ vector servicing this ring
+ * @idx: Ring index
*/
-static void nfp_net_tx_ring_init(struct nfp_net_tx_ring *tx_ring)
+static void
+nfp_net_tx_ring_init(struct nfp_net_tx_ring *tx_ring,
+ struct nfp_net_r_vector *r_vec, unsigned int idx)
{
- struct nfp_net_r_vector *r_vec = tx_ring->r_vec;
struct nfp_net *nn = r_vec->nfp_net;
+ tx_ring->idx = idx;
+ tx_ring->r_vec = r_vec;
+
tx_ring->qcidx = tx_ring->idx * nn->stride_tx;
tx_ring->qcp_q = nn->tx_bar + NFP_QCP_QUEUE_OFF(tx_ring->qcidx);
}
@@ -361,12 +367,18 @@ static void nfp_net_tx_ring_init(struct nfp_net_tx_ring *tx_ring)
/**
* nfp_net_rx_ring_init() - Fill in the boilerplate for a RX ring
* @rx_ring: RX ring structure
+ * @r_vec: IRQ vector servicing this ring
+ * @idx: Ring index
*/
-static void nfp_net_rx_ring_init(struct nfp_net_rx_ring *rx_ring)
+static void
+nfp_net_rx_ring_init(struct nfp_net_rx_ring *rx_ring,
+ struct nfp_net_r_vector *r_vec, unsigned int idx)
{
- struct nfp_net_r_vector *r_vec = rx_ring->r_vec;
struct nfp_net *nn = r_vec->nfp_net;
+ rx_ring->idx = idx;
+ rx_ring->r_vec = r_vec;
+
rx_ring->fl_qcidx = rx_ring->idx * nn->stride_rx;
rx_ring->rx_qcidx = rx_ring->fl_qcidx + (nn->stride_rx - 1);
@@ -404,14 +416,10 @@ static void nfp_net_irqs_assign(struct net_device *netdev)
cpumask_set_cpu(r, &r_vec->affinity_mask);
r_vec->tx_ring = &nn->tx_rings[r];
- nn->tx_rings[r].idx = r;
- nn->tx_rings[r].r_vec = r_vec;
- nfp_net_tx_ring_init(r_vec->tx_ring);
+ nfp_net_tx_ring_init(r_vec->tx_ring, r_vec, r);
r_vec->rx_ring = &nn->rx_rings[r];
- nn->rx_rings[r].idx = r;
- nn->rx_rings[r].r_vec = r_vec;
- nfp_net_rx_ring_init(r_vec->rx_ring);
+ nfp_net_rx_ring_init(r_vec->rx_ring, r_vec, r);
}
}
--
1.9.1
^ permalink raw reply related
* [PATCH v4 net-next 03/15] nfp: break up nfp_net_{alloc|free}_rings
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
nfp_net_{alloc|free}_rings contained a strange mix of allocations
and vector initialization. Remove it, declare vector init as
a separate function and handle allocations explicitly.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
.../net/ethernet/netronome/nfp/nfp_net_common.c | 126 ++++++++-------------
1 file changed, 47 insertions(+), 79 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 9ce04b179fac..b435d15ef8d6 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1489,91 +1489,40 @@ err_alloc:
return -ENOMEM;
}
-static void __nfp_net_free_rings(struct nfp_net *nn, unsigned int n_free)
+static int
+nfp_net_prepare_vector(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
+ int idx)
{
- struct nfp_net_r_vector *r_vec;
- struct msix_entry *entry;
+ struct msix_entry *entry = &nn->irq_entries[r_vec->irq_idx];
+ int err;
- while (n_free--) {
- r_vec = &nn->r_vecs[n_free];
- entry = &nn->irq_entries[r_vec->irq_idx];
+ snprintf(r_vec->name, sizeof(r_vec->name),
+ "%s-rxtx-%d", nn->netdev->name, idx);
+ err = request_irq(entry->vector, r_vec->handler, 0, r_vec->name, r_vec);
+ if (err) {
+ nn_err(nn, "Error requesting IRQ %d\n", entry->vector);
+ return err;
+ }
- nfp_net_rx_ring_free(r_vec->rx_ring);
- nfp_net_tx_ring_free(r_vec->tx_ring);
+ /* Setup NAPI */
+ netif_napi_add(nn->netdev, &r_vec->napi,
+ nfp_net_poll, NAPI_POLL_WEIGHT);
- irq_set_affinity_hint(entry->vector, NULL);
- free_irq(entry->vector, r_vec);
+ irq_set_affinity_hint(entry->vector, &r_vec->affinity_mask);
- netif_napi_del(&r_vec->napi);
- }
-}
+ nn_dbg(nn, "RV%02d: irq=%03d/%03d\n", idx, entry->vector, entry->entry);
-/**
- * nfp_net_free_rings() - Free all ring resources
- * @nn: NFP Net device to reconfigure
- */
-static void nfp_net_free_rings(struct nfp_net *nn)
-{
- __nfp_net_free_rings(nn, nn->num_r_vecs);
+ return 0;
}
-/**
- * nfp_net_alloc_rings() - Allocate resources for RX and TX rings
- * @nn: NFP Net device to reconfigure
- *
- * Return: 0 on success or negative errno on error.
- */
-static int nfp_net_alloc_rings(struct nfp_net *nn)
+static void
+nfp_net_cleanup_vector(struct nfp_net *nn, struct nfp_net_r_vector *r_vec)
{
- struct nfp_net_r_vector *r_vec;
- struct msix_entry *entry;
- int err;
- int r;
+ struct msix_entry *entry = &nn->irq_entries[r_vec->irq_idx];
- for (r = 0; r < nn->num_r_vecs; r++) {
- r_vec = &nn->r_vecs[r];
- entry = &nn->irq_entries[r_vec->irq_idx];
-
- /* Setup NAPI */
- netif_napi_add(nn->netdev, &r_vec->napi,
- nfp_net_poll, NAPI_POLL_WEIGHT);
-
- snprintf(r_vec->name, sizeof(r_vec->name),
- "%s-rxtx-%d", nn->netdev->name, r);
- err = request_irq(entry->vector, r_vec->handler, 0,
- r_vec->name, r_vec);
- if (err) {
- nn_dbg(nn, "Error requesting IRQ %d\n", entry->vector);
- goto err_napi_del;
- }
-
- irq_set_affinity_hint(entry->vector, &r_vec->affinity_mask);
-
- nn_dbg(nn, "RV%02d: irq=%03d/%03d\n",
- r, entry->vector, entry->entry);
-
- /* Allocate TX ring resources */
- err = nfp_net_tx_ring_alloc(r_vec->tx_ring);
- if (err)
- goto err_free_irq;
-
- /* Allocate RX ring resources */
- err = nfp_net_rx_ring_alloc(r_vec->rx_ring);
- if (err)
- goto err_free_tx;
- }
-
- return 0;
-
-err_free_tx:
- nfp_net_tx_ring_free(r_vec->tx_ring);
-err_free_irq:
irq_set_affinity_hint(entry->vector, NULL);
- free_irq(entry->vector, r_vec);
-err_napi_del:
netif_napi_del(&r_vec->napi);
- __nfp_net_free_rings(nn, r);
- return err;
+ free_irq(entry->vector, r_vec);
}
/**
@@ -1737,9 +1686,19 @@ static int nfp_net_netdev_open(struct net_device *netdev)
goto err_free_exn;
disable_irq(nn->irq_entries[NFP_NET_CFG_LSC].vector);
- err = nfp_net_alloc_rings(nn);
- if (err)
- goto err_free_lsc;
+ for (r = 0; r < nn->num_r_vecs; r++) {
+ err = nfp_net_prepare_vector(nn, &nn->r_vecs[r], r);
+ if (err)
+ goto err_free_prev_vecs;
+
+ err = nfp_net_tx_ring_alloc(nn->r_vecs[r].tx_ring);
+ if (err)
+ goto err_cleanup_vec_p;
+
+ err = nfp_net_rx_ring_alloc(nn->r_vecs[r].rx_ring);
+ if (err)
+ goto err_free_tx_ring_p;
+ }
err = netif_set_real_num_tx_queues(netdev, nn->num_tx_rings);
if (err)
@@ -1832,8 +1791,15 @@ err_disable_napi:
err_clear_config:
nfp_net_clear_config_and_disable(nn);
err_free_rings:
- nfp_net_free_rings(nn);
-err_free_lsc:
+ r = nn->num_r_vecs;
+err_free_prev_vecs:
+ while (r--) {
+ nfp_net_rx_ring_free(nn->r_vecs[r].rx_ring);
+err_free_tx_ring_p:
+ nfp_net_tx_ring_free(nn->r_vecs[r].tx_ring);
+err_cleanup_vec_p:
+ nfp_net_cleanup_vector(nn, &nn->r_vecs[r]);
+ }
nfp_net_aux_irq_free(nn, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
err_free_exn:
nfp_net_aux_irq_free(nn, NFP_NET_CFG_EXN, NFP_NET_IRQ_EXN_IDX);
@@ -1874,9 +1840,11 @@ static int nfp_net_netdev_close(struct net_device *netdev)
for (r = 0; r < nn->num_r_vecs; r++) {
nfp_net_rx_flush(nn->r_vecs[r].rx_ring);
nfp_net_tx_flush(nn->r_vecs[r].tx_ring);
+ nfp_net_rx_ring_free(nn->r_vecs[r].rx_ring);
+ nfp_net_tx_ring_free(nn->r_vecs[r].tx_ring);
+ nfp_net_cleanup_vector(nn, &nn->r_vecs[r]);
}
- nfp_net_free_rings(nn);
nfp_net_aux_irq_free(nn, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
nfp_net_aux_irq_free(nn, NFP_NET_CFG_EXN, NFP_NET_IRQ_EXN_IDX);
--
1.9.1
* [PATCH v4 net-next 05/15] nfp: allocate ring SW structs dynamically
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
To be able to switch rings more easily on config changes, allocate
them dynamically, separately from the nfp_net structure.
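For readers outside the kernel, the idea can be sketched in user space. This is only an illustration under stated assumptions: the demo_* names are hypothetical, and calloc()/free() stand in for kcalloc()/kfree(). Unlike the hunk below, the sketch returns -ENOMEM explicitly on each failure path.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's ring structures. */
struct demo_tx_ring { unsigned int idx; };
struct demo_rx_ring { unsigned int idx; };

struct demo_dev {
	unsigned int num_tx_rings;
	unsigned int num_rx_rings;
	struct demo_tx_ring *tx_rings;	/* sized at open time, not compile time */
	struct demo_rx_ring *rx_rings;
};

/* Allocate both ring arrays; undo the first allocation if the second fails. */
static int demo_rings_alloc(struct demo_dev *dev)
{
	dev->rx_rings = calloc(dev->num_rx_rings, sizeof(*dev->rx_rings));
	if (!dev->rx_rings)
		return -ENOMEM;

	dev->tx_rings = calloc(dev->num_tx_rings, sizeof(*dev->tx_rings));
	if (!dev->tx_rings) {
		free(dev->rx_rings);
		dev->rx_rings = NULL;
		return -ENOMEM;
	}

	return 0;
}

/* Release both arrays and clear the pointers so a later open starts clean. */
static void demo_rings_free(struct demo_dev *dev)
{
	free(dev->tx_rings);
	free(dev->rx_rings);
	dev->tx_rings = NULL;
	dev->rx_rings = NULL;
}
```

Because the arrays hang off pointers, a replacement set of rings could be allocated and swapped in without touching the containing structure, which is presumably what "switch rings more easily" refers to.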
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
drivers/net/ethernet/netronome/nfp/nfp_net.h | 6 ++---
.../net/ethernet/netronome/nfp/nfp_net_common.c | 28 +++++++++++++++++-----
2 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index ab264e1bccd0..0a87571a7d9c 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -472,6 +472,9 @@ struct nfp_net {
u32 rx_offset;
+ struct nfp_net_tx_ring *tx_rings;
+ struct nfp_net_rx_ring *rx_rings;
+
#ifdef CONFIG_PCI_IOV
unsigned int num_vfs;
struct vf_data_storage *vfinfo;
@@ -504,9 +507,6 @@ struct nfp_net {
int txd_cnt;
int rxd_cnt;
- struct nfp_net_tx_ring tx_rings[NFP_NET_MAX_TX_RINGS];
- struct nfp_net_rx_ring rx_rings[NFP_NET_MAX_RX_RINGS];
-
u8 num_irqs;
u8 num_r_vecs;
struct nfp_net_r_vector r_vecs[NFP_NET_MAX_TX_RINGS];
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 7233471af7a8..8f7e2e044811 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -414,12 +414,6 @@ static void nfp_net_irqs_assign(struct net_device *netdev)
r_vec->irq_idx = NFP_NET_NON_Q_VECTORS + r;
cpumask_set_cpu(r, &r_vec->affinity_mask);
-
- r_vec->tx_ring = &nn->tx_rings[r];
- nfp_net_tx_ring_init(r_vec->tx_ring, r_vec, r);
-
- r_vec->rx_ring = &nn->rx_rings[r];
- nfp_net_rx_ring_init(r_vec->rx_ring, r_vec, r);
}
}
@@ -1504,6 +1498,12 @@ nfp_net_prepare_vector(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
struct msix_entry *entry = &nn->irq_entries[r_vec->irq_idx];
int err;
+ r_vec->tx_ring = &nn->tx_rings[idx];
+ nfp_net_tx_ring_init(r_vec->tx_ring, r_vec, idx);
+
+ r_vec->rx_ring = &nn->rx_rings[idx];
+ nfp_net_rx_ring_init(r_vec->rx_ring, r_vec, idx);
+
snprintf(r_vec->name, sizeof(r_vec->name),
"%s-rxtx-%d", nn->netdev->name, idx);
err = request_irq(entry->vector, r_vec->handler, 0, r_vec->name, r_vec);
@@ -1694,6 +1694,15 @@ static int nfp_net_netdev_open(struct net_device *netdev)
goto err_free_exn;
disable_irq(nn->irq_entries[NFP_NET_CFG_LSC].vector);
+ nn->rx_rings = kcalloc(nn->num_rx_rings, sizeof(*nn->rx_rings),
+ GFP_KERNEL);
+ if (!nn->rx_rings)
+ goto err_free_lsc;
+ nn->tx_rings = kcalloc(nn->num_tx_rings, sizeof(*nn->tx_rings),
+ GFP_KERNEL);
+ if (!nn->tx_rings)
+ goto err_free_rx_rings;
+
for (r = 0; r < nn->num_r_vecs; r++) {
err = nfp_net_prepare_vector(nn, &nn->r_vecs[r], r);
if (err)
@@ -1808,6 +1817,10 @@ err_free_tx_ring_p:
err_cleanup_vec_p:
nfp_net_cleanup_vector(nn, &nn->r_vecs[r]);
}
+ kfree(nn->tx_rings);
+err_free_rx_rings:
+ kfree(nn->rx_rings);
+err_free_lsc:
nfp_net_aux_irq_free(nn, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
err_free_exn:
nfp_net_aux_irq_free(nn, NFP_NET_CFG_EXN, NFP_NET_IRQ_EXN_IDX);
@@ -1853,6 +1866,9 @@ static int nfp_net_netdev_close(struct net_device *netdev)
nfp_net_cleanup_vector(nn, &nn->r_vecs[r]);
}
+ kfree(nn->rx_rings);
+ kfree(nn->tx_rings);
+
nfp_net_aux_irq_free(nn, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
nfp_net_aux_irq_free(nn, NFP_NET_CFG_EXN, NFP_NET_IRQ_EXN_IDX);
--
1.9.1
* [PATCH v4 net-next 06/15] nfp: cleanup tx ring flush and rename to reset
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
Since we never used flush without freeing the ring later,
the functionality of the two operations is mixed.
Rename flush to ring reset and move into it everything
that has to be done after the FW ring state is cleared.
While at it, do some clean-ups.
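The reset described above can be sketched in user space as follows. This is a simplified illustration, not the driver code: the demo_* names are hypothetical, free() stands in for the DMA unmap plus dev_kfree_skb_any(), and the ring size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified TX ring: one heap buffer per slot. */
struct demo_tx_buf { void *data; };

struct demo_tx_ring {
	unsigned int cnt;	/* number of slots (power of two assumed) */
	unsigned int wr_p;	/* free-running write counter */
	unsigned int rd_p;	/* free-running read counter */
	struct demo_tx_buf *bufs;
};

/*
 * Free every buffer still queued between rd_p and wr_p, then put the
 * software pointers back to their initial state. Returns the number of
 * buffers that were dropped.
 */
static unsigned int demo_tx_ring_reset(struct demo_tx_ring *ring)
{
	unsigned int freed = 0;

	assert(ring->cnt > 0);

	while (ring->rd_p != ring->wr_p) {
		unsigned int idx = ring->rd_p % ring->cnt;

		free(ring->bufs[idx].data);
		ring->bufs[idx].data = NULL;
		ring->rd_p++;
		freed++;
	}

	/* Pointer zeroing lives in reset, not in the free routine. */
	ring->wr_p = 0;
	ring->rd_p = 0;

	return freed;
}
```

Keeping the pointer zeroing in the reset routine means the free routine only has to release memory, which matches the split this patch makes.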
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
.../net/ethernet/netronome/nfp/nfp_net_common.c | 81 ++++++++++------------
1 file changed, 37 insertions(+), 44 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 8f7e2e044811..9a027a3cfe02 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -868,61 +868,59 @@ static void nfp_net_tx_complete(struct nfp_net_tx_ring *tx_ring)
}
/**
- * nfp_net_tx_flush() - Free any untransmitted buffers currently on the TX ring
- * @tx_ring: TX ring structure
+ * nfp_net_tx_ring_reset() - Free any untransmitted buffers and reset pointers
+ * @nn: NFP Net device
+ * @tx_ring: TX ring structure
*
* Assumes that the device is stopped
*/
-static void nfp_net_tx_flush(struct nfp_net_tx_ring *tx_ring)
+static void
+nfp_net_tx_ring_reset(struct nfp_net *nn, struct nfp_net_tx_ring *tx_ring)
{
- struct nfp_net_r_vector *r_vec = tx_ring->r_vec;
- struct nfp_net *nn = r_vec->nfp_net;
- struct pci_dev *pdev = nn->pdev;
const struct skb_frag_struct *frag;
struct netdev_queue *nd_q;
- struct sk_buff *skb;
- int nr_frags;
- int fidx;
- int idx;
+ struct pci_dev *pdev = nn->pdev;
while (tx_ring->rd_p != tx_ring->wr_p) {
- idx = tx_ring->rd_p % tx_ring->cnt;
+ int nr_frags, fidx, idx;
+ struct sk_buff *skb;
+ idx = tx_ring->rd_p % tx_ring->cnt;
skb = tx_ring->txbufs[idx].skb;
- if (skb) {
- nr_frags = skb_shinfo(skb)->nr_frags;
- fidx = tx_ring->txbufs[idx].fidx;
-
- if (fidx == -1) {
- /* unmap head */
- dma_unmap_single(&pdev->dev,
- tx_ring->txbufs[idx].dma_addr,
- skb_headlen(skb),
- DMA_TO_DEVICE);
- } else {
- /* unmap fragment */
- frag = &skb_shinfo(skb)->frags[fidx];
- dma_unmap_page(&pdev->dev,
- tx_ring->txbufs[idx].dma_addr,
- skb_frag_size(frag),
- DMA_TO_DEVICE);
- }
-
- /* check for last gather fragment */
- if (fidx == nr_frags - 1)
- dev_kfree_skb_any(skb);
-
- tx_ring->txbufs[idx].dma_addr = 0;
- tx_ring->txbufs[idx].skb = NULL;
- tx_ring->txbufs[idx].fidx = -2;
+ nr_frags = skb_shinfo(skb)->nr_frags;
+ fidx = tx_ring->txbufs[idx].fidx;
+
+ if (fidx == -1) {
+ /* unmap head */
+ dma_unmap_single(&pdev->dev,
+ tx_ring->txbufs[idx].dma_addr,
+ skb_headlen(skb), DMA_TO_DEVICE);
+ } else {
+ /* unmap fragment */
+ frag = &skb_shinfo(skb)->frags[fidx];
+ dma_unmap_page(&pdev->dev,
+ tx_ring->txbufs[idx].dma_addr,
+ skb_frag_size(frag), DMA_TO_DEVICE);
}
- memset(&tx_ring->txds[idx], 0, sizeof(tx_ring->txds[idx]));
+ /* check for last gather fragment */
+ if (fidx == nr_frags - 1)
+ dev_kfree_skb_any(skb);
+
+ tx_ring->txbufs[idx].dma_addr = 0;
+ tx_ring->txbufs[idx].skb = NULL;
+ tx_ring->txbufs[idx].fidx = -2;
tx_ring->qcp_rd_p++;
tx_ring->rd_p++;
}
+ memset(tx_ring->txds, 0, sizeof(*tx_ring->txds) * tx_ring->cnt);
+ tx_ring->wr_p = 0;
+ tx_ring->rd_p = 0;
+ tx_ring->qcp_rd_p = 0;
+ tx_ring->wr_ptr_add = 0;
+
nd_q = netdev_get_tx_queue(nn->netdev, tx_ring->idx);
netdev_tx_reset_queue(nd_q);
}
@@ -1363,11 +1361,6 @@ static void nfp_net_tx_ring_free(struct nfp_net_tx_ring *tx_ring)
tx_ring->txds, tx_ring->dma);
tx_ring->cnt = 0;
- tx_ring->wr_p = 0;
- tx_ring->rd_p = 0;
- tx_ring->qcp_rd_p = 0;
- tx_ring->wr_ptr_add = 0;
-
tx_ring->txbufs = NULL;
tx_ring->txds = NULL;
tx_ring->dma = 0;
@@ -1860,7 +1853,7 @@ static int nfp_net_netdev_close(struct net_device *netdev)
*/
for (r = 0; r < nn->num_r_vecs; r++) {
nfp_net_rx_flush(nn->r_vecs[r].rx_ring);
- nfp_net_tx_flush(nn->r_vecs[r].tx_ring);
+ nfp_net_tx_ring_reset(nn, nn->r_vecs[r].tx_ring);
nfp_net_rx_ring_free(nn->r_vecs[r].rx_ring);
nfp_net_tx_ring_free(nn->r_vecs[r].tx_ring);
nfp_net_cleanup_vector(nn, &nn->r_vecs[r]);
--
1.9.1
* [PATCH v4 net-next 07/15] nfp: reorganize initial filling of RX rings
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
Separate allocation of buffers from giving them to FW.
Thanks to this it will be possible to move allocation
earlier on the .ndo_open() path and reuse buffers during
runtime reconfiguration.
Similar to the TX side, clean up the spill of functionality
from flush to freeing the ring. Unlike on the TX side,
RX ring reset does not free buffers from the ring.
Ring reset means only that FW pointers are zeroed and
buffers on the ring must be placed in [0, cnt - 1)
positions.
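The geometry requirement can be illustrated with a small user-space sketch. The demo_* names are hypothetical and the buffers are plain pointers. It assumes, as the patch does, that the ring holds cnt - 1 buffers and that the single empty slot sits at wr_p % cnt when the device is stopped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified RX ring: one buffer pointer per slot. */
struct demo_rx_buf { void *skb; };

struct demo_rx_ring {
	unsigned int cnt;	/* number of slots */
	unsigned int wr_p;	/* free-running write counter */
	unsigned int rd_p;	/* free-running read counter */
	struct demo_rx_buf *bufs;
};

/*
 * Restore the layout expected after a disable: buffers occupy slots
 * [0, cnt - 1) and the last slot is empty. No buffer is freed here.
 */
static void demo_rx_ring_reset(struct demo_rx_ring *ring)
{
	unsigned int wr_idx, last_idx;

	assert(ring->cnt > 0);

	/* Move the empty entry to the end of the list. */
	wr_idx = ring->wr_p % ring->cnt;
	last_idx = ring->cnt - 1;
	ring->bufs[wr_idx].skb = ring->bufs[last_idx].skb;
	ring->bufs[last_idx].skb = NULL;

	ring->wr_p = 0;
	ring->rd_p = 0;
}
```

After the reset, a later fill step can hand slots 0 to cnt - 2 back to the device in order without allocating anything new.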
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
.../net/ethernet/netronome/nfp/nfp_net_common.c | 119 ++++++++++++++-------
1 file changed, 78 insertions(+), 41 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 9a027a3cfe02..b057102769f9 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1021,62 +1021,100 @@ static void nfp_net_rx_give_one(struct nfp_net_rx_ring *rx_ring,
}
/**
- * nfp_net_rx_flush() - Free any buffers currently on the RX ring
- * @rx_ring: RX ring to remove buffers from
+ * nfp_net_rx_ring_reset() - Reflect in SW state of freelist after disable
+ * @rx_ring: RX ring structure
*
- * Assumes that the device is stopped
+ * Warning: Do *not* call if ring buffers were never put on the FW freelist
+ * (i.e. device was not enabled)!
*/
-static void nfp_net_rx_flush(struct nfp_net_rx_ring *rx_ring)
+static void nfp_net_rx_ring_reset(struct nfp_net_rx_ring *rx_ring)
{
- struct nfp_net *nn = rx_ring->r_vec->nfp_net;
- struct pci_dev *pdev = nn->pdev;
- int idx;
+ unsigned int wr_idx, last_idx;
- while (rx_ring->rd_p != rx_ring->wr_p) {
- idx = rx_ring->rd_p % rx_ring->cnt;
+ /* Move the empty entry to the end of the list */
+ wr_idx = rx_ring->wr_p % rx_ring->cnt;
+ last_idx = rx_ring->cnt - 1;
+ rx_ring->rxbufs[wr_idx].dma_addr = rx_ring->rxbufs[last_idx].dma_addr;
+ rx_ring->rxbufs[wr_idx].skb = rx_ring->rxbufs[last_idx].skb;
+ rx_ring->rxbufs[last_idx].dma_addr = 0;
+ rx_ring->rxbufs[last_idx].skb = NULL;
- if (rx_ring->rxbufs[idx].skb) {
- dma_unmap_single(&pdev->dev,
- rx_ring->rxbufs[idx].dma_addr,
- nn->fl_bufsz, DMA_FROM_DEVICE);
- dev_kfree_skb_any(rx_ring->rxbufs[idx].skb);
- rx_ring->rxbufs[idx].dma_addr = 0;
- rx_ring->rxbufs[idx].skb = NULL;
- }
+ memset(rx_ring->rxds, 0, sizeof(*rx_ring->rxds) * rx_ring->cnt);
+ rx_ring->wr_p = 0;
+ rx_ring->rd_p = 0;
+ rx_ring->wr_ptr_add = 0;
+}
- memset(&rx_ring->rxds[idx], 0, sizeof(rx_ring->rxds[idx]));
+/**
+ * nfp_net_rx_ring_bufs_free() - Free any buffers currently on the RX ring
+ * @nn: NFP Net device
+ * @rx_ring: RX ring to remove buffers from
+ *
+ * Assumes that the device is stopped and buffers are in [0, ring->cnt - 1)
+ * entries. After device is disabled nfp_net_rx_ring_reset() must be called
+ * to restore required ring geometry.
+ */
+static void
+nfp_net_rx_ring_bufs_free(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring)
+{
+ struct pci_dev *pdev = nn->pdev;
+ unsigned int i;
- rx_ring->rd_p++;
+ for (i = 0; i < rx_ring->cnt - 1; i++) {
+ /* NULL skb can only happen when initial filling of the ring
+ * fails to allocate enough buffers and calls here to free
+ * already allocated ones.
+ */
+ if (!rx_ring->rxbufs[i].skb)
+ continue;
+
+ dma_unmap_single(&pdev->dev, rx_ring->rxbufs[i].dma_addr,
+ nn->fl_bufsz, DMA_FROM_DEVICE);
+ dev_kfree_skb_any(rx_ring->rxbufs[i].skb);
+ rx_ring->rxbufs[i].dma_addr = 0;
+ rx_ring->rxbufs[i].skb = NULL;
}
}
/**
- * nfp_net_rx_fill_freelist() - Attempt filling freelist with RX buffers
- * @rx_ring: RX ring to fill
- *
- * Try to fill as many buffers as possible into freelist. Return
- * number of buffers added.
- *
- * Return: Number of freelist buffers added.
+ * nfp_net_rx_ring_bufs_alloc() - Fill RX ring with buffers (don't give to FW)
+ * @nn: NFP Net device
+ * @rx_ring: RX ring to fill with buffers
*/
-static int nfp_net_rx_fill_freelist(struct nfp_net_rx_ring *rx_ring)
+static int
+nfp_net_rx_ring_bufs_alloc(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring)
{
- struct sk_buff *skb;
- dma_addr_t dma_addr;
+ struct nfp_net_rx_buf *rxbufs;
+ unsigned int i;
+
+ rxbufs = rx_ring->rxbufs;
- while (nfp_net_rx_space(rx_ring)) {
- skb = nfp_net_rx_alloc_one(rx_ring, &dma_addr);
- if (!skb) {
- nfp_net_rx_flush(rx_ring);
+ for (i = 0; i < rx_ring->cnt - 1; i++) {
+ rxbufs[i].skb =
+ nfp_net_rx_alloc_one(rx_ring, &rxbufs[i].dma_addr);
+ if (!rxbufs[i].skb) {
+ nfp_net_rx_ring_bufs_free(nn, rx_ring);
return -ENOMEM;
}
- nfp_net_rx_give_one(rx_ring, skb, dma_addr);
}
return 0;
}
/**
+ * nfp_net_rx_ring_fill_freelist() - Give buffers from the ring to FW
+ * @rx_ring: RX ring to fill
+ */
+static void nfp_net_rx_ring_fill_freelist(struct nfp_net_rx_ring *rx_ring)
+{
+ unsigned int i;
+
+ for (i = 0; i < rx_ring->cnt - 1; i++)
+ nfp_net_rx_give_one(rx_ring, rx_ring->rxbufs[i].skb,
+ rx_ring->rxbufs[i].dma_addr);
+}
+
+/**
* nfp_net_rx_csum_has_errors() - group check if rxd has any csum errors
* @flags: RX descriptor flags field in CPU byte order
*/
@@ -1432,10 +1470,6 @@ static void nfp_net_rx_ring_free(struct nfp_net_rx_ring *rx_ring)
rx_ring->rxds, rx_ring->dma);
rx_ring->cnt = 0;
- rx_ring->wr_p = 0;
- rx_ring->rd_p = 0;
- rx_ring->wr_ptr_add = 0;
-
rx_ring->rxbufs = NULL;
rx_ring->rxds = NULL;
rx_ring->dma = 0;
@@ -1642,12 +1676,13 @@ static int nfp_net_start_vec(struct nfp_net *nn, struct nfp_net_r_vector *r_vec)
disable_irq(irq_vec);
- err = nfp_net_rx_fill_freelist(r_vec->rx_ring);
+ err = nfp_net_rx_ring_bufs_alloc(r_vec->nfp_net, r_vec->rx_ring);
if (err) {
nn_err(nn, "RV%02d: couldn't allocate enough buffers\n",
r_vec->irq_idx);
goto out;
}
+ nfp_net_rx_ring_fill_freelist(r_vec->rx_ring);
napi_enable(&r_vec->napi);
out:
@@ -1796,7 +1831,8 @@ static int nfp_net_netdev_open(struct net_device *netdev)
err_disable_napi:
while (r--) {
napi_disable(&nn->r_vecs[r].napi);
- nfp_net_rx_flush(nn->r_vecs[r].rx_ring);
+ nfp_net_rx_ring_reset(nn->r_vecs[r].rx_ring);
+ nfp_net_rx_ring_bufs_free(nn, nn->r_vecs[r].rx_ring);
}
err_clear_config:
nfp_net_clear_config_and_disable(nn);
@@ -1852,7 +1888,8 @@ static int nfp_net_netdev_close(struct net_device *netdev)
/* Step 3: Free resources
*/
for (r = 0; r < nn->num_r_vecs; r++) {
- nfp_net_rx_flush(nn->r_vecs[r].rx_ring);
+ nfp_net_rx_ring_reset(nn->r_vecs[r].rx_ring);
+ nfp_net_rx_ring_bufs_free(nn, nn->r_vecs[r].rx_ring);
nfp_net_tx_ring_reset(nn, nn->r_vecs[r].tx_ring);
nfp_net_rx_ring_free(nn->r_vecs[r].rx_ring);
nfp_net_tx_ring_free(nn->r_vecs[r].tx_ring);
--
1.9.1
* [PATCH v4 net-next 08/15] nfp: preallocate RX buffers early in .ndo_open
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
We want .ndo_open() to have the following structure:
- allocate resources;
- configure HW/FW;
- enable the device from stack perspective.
Therefore filling RX rings needs to be moved to the beginning
of .ndo_open().
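A minimal sketch of that three-phase shape, assuming hypothetical demo_* helpers and test knobs (none of these names exist in the driver):

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state; the fail_* fields are knobs for the sketch. */
struct demo_dev {
	bool fail_alloc;
	bool fail_config;
	bool resources;
	bool configured;
	bool running;
};

/* Phase 1: everything that can fail with -ENOMEM happens here. */
static int demo_alloc(struct demo_dev *d)
{
	if (d->fail_alloc)
		return -ENOMEM;
	d->resources = true;
	return 0;
}

static void demo_free(struct demo_dev *d)
{
	d->resources = false;
}

/* Phase 2: program the device; may fail, but allocates nothing. */
static int demo_configure(struct demo_dev *d)
{
	if (d->fail_config)
		return -EIO;
	d->configured = true;
	return 0;
}

/* Phase 3: enable from the stack's perspective; cannot fail. */
static void demo_start(struct demo_dev *d)
{
	d->running = true;
}

static int demo_open(struct demo_dev *d)
{
	int err;

	err = demo_alloc(d);
	if (err)
		return err;

	err = demo_configure(d);
	if (err)
		goto err_free;

	demo_start(d);
	return 0;

err_free:
	demo_free(d);
	return err;
}
```

With buffer allocation moved into the first phase, the last phase has no error path, which is why nfp_net_start_vec() can become void in the hunks below.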
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
.../net/ethernet/netronome/nfp/nfp_net_common.c | 34 +++++++---------------
1 file changed, 11 insertions(+), 23 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index b057102769f9..c04706cd7d51 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1667,28 +1667,19 @@ static void nfp_net_clear_config_and_disable(struct nfp_net *nn)
* @nn: NFP Net device structure
* @r_vec: Ring vector to be started
*/
-static int nfp_net_start_vec(struct nfp_net *nn, struct nfp_net_r_vector *r_vec)
+static void
+nfp_net_start_vec(struct nfp_net *nn, struct nfp_net_r_vector *r_vec)
{
unsigned int irq_vec;
- int err = 0;
irq_vec = nn->irq_entries[r_vec->irq_idx].vector;
disable_irq(irq_vec);
- err = nfp_net_rx_ring_bufs_alloc(r_vec->nfp_net, r_vec->rx_ring);
- if (err) {
- nn_err(nn, "RV%02d: couldn't allocate enough buffers\n",
- r_vec->irq_idx);
- goto out;
- }
nfp_net_rx_ring_fill_freelist(r_vec->rx_ring);
-
napi_enable(&r_vec->napi);
-out:
- enable_irq(irq_vec);
- return err;
+ enable_irq(irq_vec);
}
static int nfp_net_netdev_open(struct net_device *netdev)
@@ -1743,6 +1734,10 @@ static int nfp_net_netdev_open(struct net_device *netdev)
err = nfp_net_rx_ring_alloc(nn->r_vecs[r].rx_ring);
if (err)
goto err_free_tx_ring_p;
+
+ err = nfp_net_rx_ring_bufs_alloc(nn, nn->r_vecs[r].rx_ring);
+ if (err)
+ goto err_flush_rx_ring_p;
}
err = netif_set_real_num_tx_queues(netdev, nn->num_tx_rings);
@@ -1815,11 +1810,8 @@ static int nfp_net_netdev_open(struct net_device *netdev)
* - enable all TX queues
* - set link state
*/
- for (r = 0; r < nn->num_r_vecs; r++) {
- err = nfp_net_start_vec(nn, &nn->r_vecs[r]);
- if (err)
- goto err_disable_napi;
- }
+ for (r = 0; r < nn->num_r_vecs; r++)
+ nfp_net_start_vec(nn, &nn->r_vecs[r]);
netif_tx_wake_all_queues(netdev);
@@ -1828,18 +1820,14 @@ static int nfp_net_netdev_open(struct net_device *netdev)
return 0;
-err_disable_napi:
- while (r--) {
- napi_disable(&nn->r_vecs[r].napi);
- nfp_net_rx_ring_reset(nn->r_vecs[r].rx_ring);
- nfp_net_rx_ring_bufs_free(nn, nn->r_vecs[r].rx_ring);
- }
err_clear_config:
nfp_net_clear_config_and_disable(nn);
err_free_rings:
r = nn->num_r_vecs;
err_free_prev_vecs:
while (r--) {
+ nfp_net_rx_ring_bufs_free(nn, nn->r_vecs[r].rx_ring);
+err_flush_rx_ring_p:
nfp_net_rx_ring_free(nn->r_vecs[r].rx_ring);
err_free_tx_ring_p:
nfp_net_tx_ring_free(nn->r_vecs[r].tx_ring);
--
1.9.1
* [PATCH v4 net-next 09/15] nfp: move filling ring information to FW config
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
nfp_net_[rt]x_ring_{alloc,free} should only allocate or free
ring resources without touching the device. Move setting
parameters in the BAR to separate functions. This will make
it possible to reuse alloc/free functions to allocate new
rings while the device is running.
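The split can be pictured with a fake register block in user space. This is a sketch under assumptions: struct demo_bar and the demo_* helpers are hypothetical stand-ins for the control BAR and the nn_writeq()/nn_writeb() accessors, and only the address and size fields are modelled.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_RINGS 8

/* Ring resources as produced by an allocation step (no device access). */
struct demo_ring {
	uint64_t dma;		/* bus address of the descriptor memory */
	unsigned int cnt;	/* number of descriptors, power of two */
};

/* Hypothetical stand-in for the per-ring registers in the control BAR. */
struct demo_bar {
	uint64_t addr[DEMO_MAX_RINGS];
	uint8_t sz[DEMO_MAX_RINGS];
};

/* Integer log2 for a non-zero value, like the kernel's ilog2(). */
static unsigned int demo_ilog2(unsigned int v)
{
	unsigned int r = 0;

	assert(v > 0);
	while (v >>= 1)
		r++;
	return r;
}

/* Device-facing step: publish an already allocated ring. */
static void demo_write_ring_data(struct demo_bar *bar,
				 const struct demo_ring *ring,
				 unsigned int idx)
{
	assert(idx < DEMO_MAX_RINGS);
	bar->addr[idx] = ring->dma;
	bar->sz[idx] = (uint8_t)demo_ilog2(ring->cnt);
}

/* Device-facing step: withdraw the ring without freeing its memory. */
static void demo_clear_ring_data(struct demo_bar *bar, unsigned int idx)
{
	assert(idx < DEMO_MAX_RINGS);
	bar->addr[idx] = 0;
	bar->sz[idx] = 0;
}
```

Since struct demo_ring never touches struct demo_bar during allocation, a second ring can be prepared while the first is still published.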
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
.../net/ethernet/netronome/nfp/nfp_net_common.c | 50 ++++++++++++++--------
1 file changed, 32 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index c04706cd7d51..f504de12ed2a 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1388,10 +1388,6 @@ static void nfp_net_tx_ring_free(struct nfp_net_tx_ring *tx_ring)
struct nfp_net *nn = r_vec->nfp_net;
struct pci_dev *pdev = nn->pdev;
- nn_writeq(nn, NFP_NET_CFG_TXR_ADDR(tx_ring->idx), 0);
- nn_writeb(nn, NFP_NET_CFG_TXR_SZ(tx_ring->idx), 0);
- nn_writeb(nn, NFP_NET_CFG_TXR_VEC(tx_ring->idx), 0);
-
kfree(tx_ring->txbufs);
if (tx_ring->txds)
@@ -1431,11 +1427,6 @@ static int nfp_net_tx_ring_alloc(struct nfp_net_tx_ring *tx_ring)
if (!tx_ring->txbufs)
goto err_alloc;
- /* Write the DMA address, size and MSI-X info to the device */
- nn_writeq(nn, NFP_NET_CFG_TXR_ADDR(tx_ring->idx), tx_ring->dma);
- nn_writeb(nn, NFP_NET_CFG_TXR_SZ(tx_ring->idx), ilog2(tx_ring->cnt));
- nn_writeb(nn, NFP_NET_CFG_TXR_VEC(tx_ring->idx), r_vec->irq_idx);
-
netif_set_xps_queue(nn->netdev, &r_vec->affinity_mask, tx_ring->idx);
nn_dbg(nn, "TxQ%02d: QCidx=%02d cnt=%d dma=%#llx host=%p\n",
@@ -1459,10 +1450,6 @@ static void nfp_net_rx_ring_free(struct nfp_net_rx_ring *rx_ring)
struct nfp_net *nn = r_vec->nfp_net;
struct pci_dev *pdev = nn->pdev;
- nn_writeq(nn, NFP_NET_CFG_RXR_ADDR(rx_ring->idx), 0);
- nn_writeb(nn, NFP_NET_CFG_RXR_SZ(rx_ring->idx), 0);
- nn_writeb(nn, NFP_NET_CFG_RXR_VEC(rx_ring->idx), 0);
-
kfree(rx_ring->rxbufs);
if (rx_ring->rxds)
@@ -1502,11 +1489,6 @@ static int nfp_net_rx_ring_alloc(struct nfp_net_rx_ring *rx_ring)
if (!rx_ring->rxbufs)
goto err_alloc;
- /* Write the DMA address, size and MSI-X info to the device */
- nn_writeq(nn, NFP_NET_CFG_RXR_ADDR(rx_ring->idx), rx_ring->dma);
- nn_writeb(nn, NFP_NET_CFG_RXR_SZ(rx_ring->idx), ilog2(rx_ring->cnt));
- nn_writeb(nn, NFP_NET_CFG_RXR_VEC(rx_ring->idx), r_vec->irq_idx);
-
nn_dbg(nn, "RxQ%02d: FlQCidx=%02d RxQCidx=%02d cnt=%d dma=%#llx host=%p\n",
rx_ring->idx, rx_ring->fl_qcidx, rx_ring->rx_qcidx,
rx_ring->cnt, (unsigned long long)rx_ring->dma, rx_ring->rxds);
@@ -1631,6 +1613,17 @@ static void nfp_net_write_mac_addr(struct nfp_net *nn, const u8 *mac)
get_unaligned_be16(nn->netdev->dev_addr + 4) << 16);
}
+static void nfp_net_vec_clear_ring_data(struct nfp_net *nn, unsigned int idx)
+{
+ nn_writeq(nn, NFP_NET_CFG_RXR_ADDR(idx), 0);
+ nn_writeb(nn, NFP_NET_CFG_RXR_SZ(idx), 0);
+ nn_writeb(nn, NFP_NET_CFG_RXR_VEC(idx), 0);
+
+ nn_writeq(nn, NFP_NET_CFG_TXR_ADDR(idx), 0);
+ nn_writeb(nn, NFP_NET_CFG_TXR_SZ(idx), 0);
+ nn_writeb(nn, NFP_NET_CFG_TXR_VEC(idx), 0);
+}
+
/**
* nfp_net_clear_config_and_disable() - Clear control BAR and disable NFP
* @nn: NFP Net device to reconfigure
@@ -1638,6 +1631,7 @@ static void nfp_net_write_mac_addr(struct nfp_net *nn, const u8 *mac)
static void nfp_net_clear_config_and_disable(struct nfp_net *nn)
{
u32 new_ctrl, update;
+ unsigned int r;
int err;
new_ctrl = nn->ctrl;
@@ -1659,9 +1653,26 @@ static void nfp_net_clear_config_and_disable(struct nfp_net *nn)
return;
}
+ for (r = 0; r < nn->num_r_vecs; r++)
+ nfp_net_vec_clear_ring_data(nn, r);
+
nn->ctrl = new_ctrl;
}
+static void
+nfp_net_vec_write_ring_data(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
+ unsigned int idx)
+{
+ /* Write the DMA address, size and MSI-X info to the device */
+ nn_writeq(nn, NFP_NET_CFG_RXR_ADDR(idx), r_vec->rx_ring->dma);
+ nn_writeb(nn, NFP_NET_CFG_RXR_SZ(idx), ilog2(r_vec->rx_ring->cnt));
+ nn_writeb(nn, NFP_NET_CFG_RXR_VEC(idx), r_vec->irq_idx);
+
+ nn_writeq(nn, NFP_NET_CFG_TXR_ADDR(idx), r_vec->tx_ring->dma);
+ nn_writeb(nn, NFP_NET_CFG_TXR_SZ(idx), ilog2(r_vec->tx_ring->cnt));
+ nn_writeb(nn, NFP_NET_CFG_TXR_VEC(idx), r_vec->irq_idx);
+}
+
/**
* nfp_net_start_vec() - Start ring vector
* @nn: NFP Net device structure
@@ -1769,6 +1780,9 @@ static int nfp_net_netdev_open(struct net_device *netdev)
* - Set the Freelist buffer size
* - Enable the FW
*/
+ for (r = 0; r < nn->num_r_vecs; r++)
+ nfp_net_vec_write_ring_data(nn, &nn->r_vecs[r], r);
+
nn_writeq(nn, NFP_NET_CFG_TXRS_ENABLE, nn->num_tx_rings == 64 ?
0xffffffffffffffffULL : ((u64)1 << nn->num_tx_rings) - 1);
--
1.9.1
* [PATCH v4 net-next 10/15] nfp: slice .ndo_open() and .ndo_stop() up
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
Divide .ndo_open() and .ndo_stop() into logical, callable
chunks. No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
.../net/ethernet/netronome/nfp/nfp_net_common.c | 218 +++++++++++++--------
1 file changed, 136 insertions(+), 82 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index f504de12ed2a..f171a7da8931 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1673,6 +1673,82 @@ nfp_net_vec_write_ring_data(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
nn_writeb(nn, NFP_NET_CFG_TXR_VEC(idx), r_vec->irq_idx);
}
+static int __nfp_net_set_config_and_enable(struct nfp_net *nn)
+{
+ u32 new_ctrl, update = 0;
+ unsigned int r;
+ int err;
+
+ new_ctrl = nn->ctrl;
+
+ if (nn->cap & NFP_NET_CFG_CTRL_RSS) {
+ nfp_net_rss_write_key(nn);
+ nfp_net_rss_write_itbl(nn);
+ nn_writel(nn, NFP_NET_CFG_RSS_CTRL, nn->rss_cfg);
+ update |= NFP_NET_CFG_UPDATE_RSS;
+ }
+
+ if (nn->cap & NFP_NET_CFG_CTRL_IRQMOD) {
+ nfp_net_coalesce_write_cfg(nn);
+
+ new_ctrl |= NFP_NET_CFG_CTRL_IRQMOD;
+ update |= NFP_NET_CFG_UPDATE_IRQMOD;
+ }
+
+ for (r = 0; r < nn->num_r_vecs; r++)
+ nfp_net_vec_write_ring_data(nn, &nn->r_vecs[r], r);
+
+ nn_writeq(nn, NFP_NET_CFG_TXRS_ENABLE, nn->num_tx_rings == 64 ?
+ 0xffffffffffffffffULL : ((u64)1 << nn->num_tx_rings) - 1);
+
+ nn_writeq(nn, NFP_NET_CFG_RXRS_ENABLE, nn->num_rx_rings == 64 ?
+ 0xffffffffffffffffULL : ((u64)1 << nn->num_rx_rings) - 1);
+
+ nfp_net_write_mac_addr(nn, nn->netdev->dev_addr);
+
+ nn_writel(nn, NFP_NET_CFG_MTU, nn->netdev->mtu);
+ nn_writel(nn, NFP_NET_CFG_FLBUFSZ, nn->fl_bufsz);
+
+ /* Enable device */
+ new_ctrl |= NFP_NET_CFG_CTRL_ENABLE;
+ update |= NFP_NET_CFG_UPDATE_GEN;
+ update |= NFP_NET_CFG_UPDATE_MSIX;
+ update |= NFP_NET_CFG_UPDATE_RING;
+ if (nn->cap & NFP_NET_CFG_CTRL_RINGCFG)
+ new_ctrl |= NFP_NET_CFG_CTRL_RINGCFG;
+
+ nn_writel(nn, NFP_NET_CFG_CTRL, new_ctrl);
+ err = nfp_net_reconfig(nn, update);
+
+ nn->ctrl = new_ctrl;
+
+ /* Since reconfiguration requests while NFP is down are ignored we
+ * have to wipe the entire VXLAN configuration and reinitialize it.
+ */
+ if (nn->ctrl & NFP_NET_CFG_CTRL_VXLAN) {
+ memset(&nn->vxlan_ports, 0, sizeof(nn->vxlan_ports));
+ memset(&nn->vxlan_usecnt, 0, sizeof(nn->vxlan_usecnt));
+ vxlan_get_rx_port(nn->netdev);
+ }
+
+ return err;
+}
+
+/**
+ * nfp_net_set_config_and_enable() - Write control BAR and enable NFP
+ * @nn: NFP Net device to reconfigure
+ */
+static int nfp_net_set_config_and_enable(struct nfp_net *nn)
+{
+ int err;
+
+ err = __nfp_net_set_config_and_enable(nn);
+ if (err)
+ nfp_net_clear_config_and_disable(nn);
+
+ return err;
+}
+
/**
* nfp_net_start_vec() - Start ring vector
* @nn: NFP Net device structure
@@ -1693,20 +1769,33 @@ nfp_net_start_vec(struct nfp_net *nn, struct nfp_net_r_vector *r_vec)
enable_irq(irq_vec);
}
+/**
+ * nfp_net_open_stack() - Start the device from stack's perspective
+ * @nn: NFP Net device to reconfigure
+ */
+static void nfp_net_open_stack(struct nfp_net *nn)
+{
+ unsigned int r;
+
+ for (r = 0; r < nn->num_r_vecs; r++)
+ nfp_net_start_vec(nn, &nn->r_vecs[r]);
+
+ netif_tx_wake_all_queues(nn->netdev);
+
+ enable_irq(nn->irq_entries[NFP_NET_CFG_LSC].vector);
+ nfp_net_read_link_status(nn);
+}
+
static int nfp_net_netdev_open(struct net_device *netdev)
{
struct nfp_net *nn = netdev_priv(netdev);
int err, r;
- u32 update = 0;
- u32 new_ctrl;
if (nn->ctrl & NFP_NET_CFG_CTRL_ENABLE) {
nn_err(nn, "Dev is already enabled: 0x%08x\n", nn->ctrl);
return -EBUSY;
}
- new_ctrl = nn->ctrl;
-
/* Step 1: Allocate resources for rings and the like
* - Request interrupts
* - Allocate RX and TX ring resources
@@ -1759,20 +1848,6 @@ static int nfp_net_netdev_open(struct net_device *netdev)
if (err)
goto err_free_rings;
- if (nn->cap & NFP_NET_CFG_CTRL_RSS) {
- nfp_net_rss_write_key(nn);
- nfp_net_rss_write_itbl(nn);
- nn_writel(nn, NFP_NET_CFG_RSS_CTRL, nn->rss_cfg);
- update |= NFP_NET_CFG_UPDATE_RSS;
- }
-
- if (nn->cap & NFP_NET_CFG_CTRL_IRQMOD) {
- nfp_net_coalesce_write_cfg(nn);
-
- new_ctrl |= NFP_NET_CFG_CTRL_IRQMOD;
- update |= NFP_NET_CFG_UPDATE_IRQMOD;
- }
-
/* Step 2: Configure the NFP
* - Enable rings from 0 to tx_rings/rx_rings - 1.
* - Write MAC address (in case it changed)
@@ -1780,43 +1855,9 @@ static int nfp_net_netdev_open(struct net_device *netdev)
* - Set the Freelist buffer size
* - Enable the FW
*/
- for (r = 0; r < nn->num_r_vecs; r++)
- nfp_net_vec_write_ring_data(nn, &nn->r_vecs[r], r);
-
- nn_writeq(nn, NFP_NET_CFG_TXRS_ENABLE, nn->num_tx_rings == 64 ?
- 0xffffffffffffffffULL : ((u64)1 << nn->num_tx_rings) - 1);
-
- nn_writeq(nn, NFP_NET_CFG_RXRS_ENABLE, nn->num_rx_rings == 64 ?
- 0xffffffffffffffffULL : ((u64)1 << nn->num_rx_rings) - 1);
-
- nfp_net_write_mac_addr(nn, netdev->dev_addr);
-
- nn_writel(nn, NFP_NET_CFG_MTU, netdev->mtu);
- nn_writel(nn, NFP_NET_CFG_FLBUFSZ, nn->fl_bufsz);
-
- /* Enable device */
- new_ctrl |= NFP_NET_CFG_CTRL_ENABLE;
- update |= NFP_NET_CFG_UPDATE_GEN;
- update |= NFP_NET_CFG_UPDATE_MSIX;
- update |= NFP_NET_CFG_UPDATE_RING;
- if (nn->cap & NFP_NET_CFG_CTRL_RINGCFG)
- new_ctrl |= NFP_NET_CFG_CTRL_RINGCFG;
-
- nn_writel(nn, NFP_NET_CFG_CTRL, new_ctrl);
- err = nfp_net_reconfig(nn, update);
+ err = nfp_net_set_config_and_enable(nn);
if (err)
- goto err_clear_config;
-
- nn->ctrl = new_ctrl;
-
- /* Since reconfiguration requests while NFP is down are ignored we
- * have to wipe the entire VXLAN configuration and reinitialize it.
- */
- if (nn->ctrl & NFP_NET_CFG_CTRL_VXLAN) {
- memset(&nn->vxlan_ports, 0, sizeof(nn->vxlan_ports));
- memset(&nn->vxlan_usecnt, 0, sizeof(nn->vxlan_usecnt));
- vxlan_get_rx_port(netdev);
- }
+ goto err_free_rings;
/* Step 3: Enable for kernel
* - put some freelist descriptors on each RX ring
@@ -1824,18 +1865,10 @@ static int nfp_net_netdev_open(struct net_device *netdev)
* - enable all TX queues
* - set link state
*/
- for (r = 0; r < nn->num_r_vecs; r++)
- nfp_net_start_vec(nn, &nn->r_vecs[r]);
-
- netif_tx_wake_all_queues(netdev);
-
- enable_irq(nn->irq_entries[NFP_NET_CFG_LSC].vector);
- nfp_net_read_link_status(nn);
+ nfp_net_open_stack(nn);
return 0;
-err_clear_config:
- nfp_net_clear_config_and_disable(nn);
err_free_rings:
r = nn->num_r_vecs;
err_free_prev_vecs:
@@ -1859,36 +1892,31 @@ err_free_exn:
}
/**
- * nfp_net_netdev_close() - Called when the device is downed
- * @netdev: netdev structure
+ * nfp_net_close_stack() - Quiesce the stack (part of close)
+ * @nn: NFP Net device to reconfigure
*/
-static int nfp_net_netdev_close(struct net_device *netdev)
+static void nfp_net_close_stack(struct nfp_net *nn)
{
- struct nfp_net *nn = netdev_priv(netdev);
- int r;
-
- if (!(nn->ctrl & NFP_NET_CFG_CTRL_ENABLE)) {
- nn_err(nn, "Dev is not up: 0x%08x\n", nn->ctrl);
- return 0;
- }
+ unsigned int r;
- /* Step 1: Disable RX and TX rings from the Linux kernel perspective
- */
disable_irq(nn->irq_entries[NFP_NET_CFG_LSC].vector);
- netif_carrier_off(netdev);
+ netif_carrier_off(nn->netdev);
nn->link_up = false;
for (r = 0; r < nn->num_r_vecs; r++)
napi_disable(&nn->r_vecs[r].napi);
- netif_tx_disable(netdev);
+ netif_tx_disable(nn->netdev);
+}
- /* Step 2: Tell NFP
- */
- nfp_net_clear_config_and_disable(nn);
+/**
+ * nfp_net_close_free_all() - Free all runtime resources
+ * @nn: NFP Net device to reconfigure
+ */
+static void nfp_net_close_free_all(struct nfp_net *nn)
+{
+ unsigned int r;
- /* Step 3: Free resources
- */
for (r = 0; r < nn->num_r_vecs; r++) {
nfp_net_rx_ring_reset(nn->r_vecs[r].rx_ring);
nfp_net_rx_ring_bufs_free(nn, nn->r_vecs[r].rx_ring);
@@ -1903,6 +1931,32 @@ static int nfp_net_netdev_close(struct net_device *netdev)
nfp_net_aux_irq_free(nn, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
nfp_net_aux_irq_free(nn, NFP_NET_CFG_EXN, NFP_NET_IRQ_EXN_IDX);
+}
+
+/**
+ * nfp_net_netdev_close() - Called when the device is downed
+ * @netdev: netdev structure
+ */
+static int nfp_net_netdev_close(struct net_device *netdev)
+{
+ struct nfp_net *nn = netdev_priv(netdev);
+
+ if (!(nn->ctrl & NFP_NET_CFG_CTRL_ENABLE)) {
+ nn_err(nn, "Dev is not up: 0x%08x\n", nn->ctrl);
+ return 0;
+ }
+
+ /* Step 1: Disable RX and TX rings from the Linux kernel perspective
+ */
+ nfp_net_close_stack(nn);
+
+ /* Step 2: Tell NFP
+ */
+ nfp_net_clear_config_and_disable(nn);
+
+ /* Step 3: Free resources
+ */
+ nfp_net_close_free_all(nn);
nn_dbg(nn, "%s down", netdev->name);
return 0;
--
1.9.1
* [PATCH v4 net-next 11/15] nfp: sync ring state during FW reconfiguration
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
FW reconfiguration in .ndo_open()/.ndo_stop() should reset/restore
queue state. Since IRQs need to be disabled while the RX rings are
being filled, disable_irq() has to move from .ndo_open() all the way
up to IRQ allocation.
nfp_net_start_vec() becomes trivial now, so it is inlined.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
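A toy user-space model of the ordering described above, offered only as a
sketch: the vector starts with its IRQ masked right after allocation, the
freelist is filled while it is still masked, and only the open/close stack
helpers touch the mask afterwards. Every name here (struct demo_vec,
demo_prepare_vector() and so on) is hypothetical, and a plain counter stands
in for the nesting of disable_irq()/enable_irq().

```c
#include <assert.h>

/* Hypothetical stand-in for one ring vector. */
struct demo_vec {
	int irq_depth;      /* > 0 means masked; models nested disable_irq() */
	int napi_on;        /* models napi_enable()/napi_disable() */
	int freelist_full;  /* models a filled RX freelist */
};

/* Models IRQ allocation: request the IRQ, then mask it immediately. */
static void demo_prepare_vector(struct demo_vec *v)
{
	v->irq_depth = 1;
	v->napi_on = 0;
	v->freelist_full = 0;
}

/* Models FW enable: the freelist may only be filled with the IRQ masked. */
static int demo_fw_enable(struct demo_vec *v)
{
	if (v->irq_depth == 0)
		return -1;  /* would race with the interrupt handler */
	v->freelist_full = 1;
	return 0;
}

/* Models FW disable: ring state is reset. */
static void demo_fw_disable(struct demo_vec *v)
{
	v->freelist_full = 0;
}

/* Models the open-stack helper: NAPI on first, then unmask. */
static void demo_open_stack(struct demo_vec *v)
{
	v->napi_on = 1;
	v->irq_depth--;
}

/* Models the close-stack helper: mask first, then NAPI off. */
static void demo_close_stack(struct demo_vec *v)
{
	v->irq_depth++;
	v->napi_on = 0;
}
```

With this ordering a close/disable/enable/open cycle leaves the mask count
balanced, which is what lets the same helpers be reused for runtime
reconfiguration.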
.../net/ethernet/netronome/nfp/nfp_net_common.c | 45 ++++++++--------------
1 file changed, 16 insertions(+), 29 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index f171a7da8931..2878ac021eda 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1520,6 +1520,7 @@ nfp_net_prepare_vector(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
nn_err(nn, "Error requesting IRQ %d\n", entry->vector);
return err;
}
+ disable_irq(entry->vector);
/* Setup NAPI */
netif_napi_add(nn->netdev, &r_vec->napi,
@@ -1648,13 +1649,14 @@ static void nfp_net_clear_config_and_disable(struct nfp_net *nn)
nn_writel(nn, NFP_NET_CFG_CTRL, new_ctrl);
err = nfp_net_reconfig(nn, update);
- if (err) {
+ if (err)
nn_err(nn, "Could not disable device: %d\n", err);
- return;
- }
- for (r = 0; r < nn->num_r_vecs; r++)
+ for (r = 0; r < nn->num_r_vecs; r++) {
+ nfp_net_rx_ring_reset(nn->r_vecs[r].rx_ring);
+ nfp_net_tx_ring_reset(nn, nn->r_vecs[r].tx_ring);
nfp_net_vec_clear_ring_data(nn, r);
+ }
nn->ctrl = new_ctrl;
}
@@ -1722,6 +1724,9 @@ static int __nfp_net_set_config_and_enable(struct nfp_net *nn)
nn->ctrl = new_ctrl;
+ for (r = 0; r < nn->num_r_vecs; r++)
+ nfp_net_rx_ring_fill_freelist(nn->r_vecs[r].rx_ring);
+
/* Since reconfiguration requests while NFP is down are ignored we
* have to wipe the entire VXLAN configuration and reinitialize it.
*/
@@ -1750,26 +1755,6 @@ static int nfp_net_set_config_and_enable(struct nfp_net *nn)
}
/**
- * nfp_net_start_vec() - Start ring vector
- * @nn: NFP Net device structure
- * @r_vec: Ring vector to be started
- */
-static void
-nfp_net_start_vec(struct nfp_net *nn, struct nfp_net_r_vector *r_vec)
-{
- unsigned int irq_vec;
-
- irq_vec = nn->irq_entries[r_vec->irq_idx].vector;
-
- disable_irq(irq_vec);
-
- nfp_net_rx_ring_fill_freelist(r_vec->rx_ring);
- napi_enable(&r_vec->napi);
-
- enable_irq(irq_vec);
-}
-
-/**
* nfp_net_open_stack() - Start the device from stack's perspective
* @nn: NFP Net device to reconfigure
*/
@@ -1777,8 +1762,10 @@ static void nfp_net_open_stack(struct nfp_net *nn)
{
unsigned int r;
- for (r = 0; r < nn->num_r_vecs; r++)
- nfp_net_start_vec(nn, &nn->r_vecs[r]);
+ for (r = 0; r < nn->num_r_vecs; r++) {
+ napi_enable(&nn->r_vecs[r].napi);
+ enable_irq(nn->irq_entries[nn->r_vecs[r].irq_idx].vector);
+ }
netif_tx_wake_all_queues(nn->netdev);
@@ -1903,8 +1890,10 @@ static void nfp_net_close_stack(struct nfp_net *nn)
netif_carrier_off(nn->netdev);
nn->link_up = false;
- for (r = 0; r < nn->num_r_vecs; r++)
+ for (r = 0; r < nn->num_r_vecs; r++) {
+ disable_irq(nn->irq_entries[nn->r_vecs[r].irq_idx].vector);
napi_disable(&nn->r_vecs[r].napi);
+ }
netif_tx_disable(nn->netdev);
}
@@ -1918,9 +1907,7 @@ static void nfp_net_close_free_all(struct nfp_net *nn)
unsigned int r;
for (r = 0; r < nn->num_r_vecs; r++) {
- nfp_net_rx_ring_reset(nn->r_vecs[r].rx_ring);
nfp_net_rx_ring_bufs_free(nn, nn->r_vecs[r].rx_ring);
- nfp_net_tx_ring_reset(nn, nn->r_vecs[r].tx_ring);
nfp_net_rx_ring_free(nn->r_vecs[r].rx_ring);
nfp_net_tx_ring_free(nn->r_vecs[r].tx_ring);
nfp_net_cleanup_vector(nn, &nn->r_vecs[r]);
--
1.9.1
* [PATCH v4 net-next 12/15] nfp: propagate list buffer size in struct rx_ring
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
Free list buffer size needs to be propagated to a few functions
as a parameter and added to struct nfp_net_rx_ring, since some
of the functions will soon be reused to manage rings with
buffers of a size different from nn->fl_bufsz.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
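A minimal user-space sketch of the idea, not the driver code: the ring
remembers the buffer size it was filled with, so the free path stays correct
even if the device-wide size has changed in the meantime. All names
(struct demo_ring, demo_ring_fill(), demo_ring_drain()) are hypothetical, and
malloc()/free() plus a byte counter stand in for skb allocation and DMA
mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nfp_net_rx_ring. */
struct demo_ring {
	unsigned int cnt;     /* number of buffer slots */
	unsigned int bufsz;   /* size each buffer was allocated with */
	void **bufs;
	size_t bytes_mapped;  /* bookkeeping standing in for DMA mappings */
};

/* Release every buffer using the size stored in the ring itself. */
static void demo_ring_drain(struct demo_ring *ring)
{
	unsigned int i;

	for (i = 0; i < ring->cnt; i++) {
		if (!ring->bufs[i])
			continue;
		free(ring->bufs[i]);
		ring->bufs[i] = NULL;
		ring->bytes_mapped -= ring->bufsz;
	}
	free(ring->bufs);
	ring->bufs = NULL;
	ring->cnt = 0;
}

/* Fill the ring; the caller chooses the buffer size explicitly. */
static int demo_ring_fill(struct demo_ring *ring, unsigned int cnt,
			  unsigned int bufsz)
{
	unsigned int i;

	ring->bufs = calloc(cnt, sizeof(*ring->bufs));
	if (!ring->bufs)
		return -1;
	ring->cnt = cnt;
	ring->bufsz = bufsz;
	ring->bytes_mapped = 0;

	for (i = 0; i < cnt; i++) {
		ring->bufs[i] = malloc(bufsz);
		if (!ring->bufs[i]) {
			demo_ring_drain(ring);
			return -1;
		}
		ring->bytes_mapped += bufsz;
	}
	return 0;
}
```

Keeping the size next to the buffers means two rings with different buffer
sizes can coexist, which is the situation a later patch in the series relies
on.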
drivers/net/ethernet/netronome/nfp/nfp_net.h | 3 +++
.../net/ethernet/netronome/nfp/nfp_net_common.c | 24 ++++++++++++++--------
2 files changed, 19 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 0a87571a7d9c..1e08c9cf3ee0 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -298,6 +298,8 @@ struct nfp_net_rx_buf {
* @rxds: Virtual address of FL/RX ring in host memory
* @dma: DMA address of the FL/RX ring
* @size: Size, in bytes, of the FL/RX ring (needed to free)
+ * @bufsz: Buffer allocation size for convenience of management routines
+ * (NOTE: this is in second cache line, do not use on fast path!)
*/
struct nfp_net_rx_ring {
struct nfp_net_r_vector *r_vec;
@@ -319,6 +321,7 @@ struct nfp_net_rx_ring {
dma_addr_t dma;
unsigned int size;
+ unsigned int bufsz;
} ____cacheline_aligned;
/**
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 2878ac021eda..eeabc33fe13d 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -958,25 +958,27 @@ static inline int nfp_net_rx_space(struct nfp_net_rx_ring *rx_ring)
* nfp_net_rx_alloc_one() - Allocate and map skb for RX
* @rx_ring: RX ring structure of the skb
* @dma_addr: Pointer to storage for DMA address (output param)
+ * @fl_bufsz: size of freelist buffers
*
* This function will allcate a new skb, map it for DMA.
*
* Return: allocated skb or NULL on failure.
*/
static struct sk_buff *
-nfp_net_rx_alloc_one(struct nfp_net_rx_ring *rx_ring, dma_addr_t *dma_addr)
+nfp_net_rx_alloc_one(struct nfp_net_rx_ring *rx_ring, dma_addr_t *dma_addr,
+ unsigned int fl_bufsz)
{
struct nfp_net *nn = rx_ring->r_vec->nfp_net;
struct sk_buff *skb;
- skb = netdev_alloc_skb(nn->netdev, nn->fl_bufsz);
+ skb = netdev_alloc_skb(nn->netdev, fl_bufsz);
if (!skb) {
nn_warn_ratelimit(nn, "Failed to alloc receive SKB\n");
return NULL;
}
*dma_addr = dma_map_single(&nn->pdev->dev, skb->data,
- nn->fl_bufsz, DMA_FROM_DEVICE);
+ fl_bufsz, DMA_FROM_DEVICE);
if (dma_mapping_error(&nn->pdev->dev, *dma_addr)) {
dev_kfree_skb_any(skb);
nn_warn_ratelimit(nn, "Failed to map DMA RX buffer\n");
@@ -1069,7 +1071,7 @@ nfp_net_rx_ring_bufs_free(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring)
continue;
dma_unmap_single(&pdev->dev, rx_ring->rxbufs[i].dma_addr,
- nn->fl_bufsz, DMA_FROM_DEVICE);
+ rx_ring->bufsz, DMA_FROM_DEVICE);
dev_kfree_skb_any(rx_ring->rxbufs[i].skb);
rx_ring->rxbufs[i].dma_addr = 0;
rx_ring->rxbufs[i].skb = NULL;
@@ -1091,7 +1093,8 @@ nfp_net_rx_ring_bufs_alloc(struct nfp_net *nn, struct nfp_net_rx_ring *rx_ring)
for (i = 0; i < rx_ring->cnt - 1; i++) {
rxbufs[i].skb =
- nfp_net_rx_alloc_one(rx_ring, &rxbufs[i].dma_addr);
+ nfp_net_rx_alloc_one(rx_ring, &rxbufs[i].dma_addr,
+ rx_ring->bufsz);
if (!rxbufs[i].skb) {
nfp_net_rx_ring_bufs_free(nn, rx_ring);
return -ENOMEM;
@@ -1279,7 +1282,8 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
skb = rx_ring->rxbufs[idx].skb;
- new_skb = nfp_net_rx_alloc_one(rx_ring, &new_dma_addr);
+ new_skb = nfp_net_rx_alloc_one(rx_ring, &new_dma_addr,
+ nn->fl_bufsz);
if (!new_skb) {
nfp_net_rx_give_one(rx_ring, rx_ring->rxbufs[idx].skb,
rx_ring->rxbufs[idx].dma_addr);
@@ -1466,10 +1470,12 @@ static void nfp_net_rx_ring_free(struct nfp_net_rx_ring *rx_ring)
/**
* nfp_net_rx_ring_alloc() - Allocate resource for a RX ring
* @rx_ring: RX ring to allocate
+ * @fl_bufsz: Size of buffers to allocate
*
* Return: 0 on success, negative errno otherwise.
*/
-static int nfp_net_rx_ring_alloc(struct nfp_net_rx_ring *rx_ring)
+static int
+nfp_net_rx_ring_alloc(struct nfp_net_rx_ring *rx_ring, unsigned int fl_bufsz)
{
struct nfp_net_r_vector *r_vec = rx_ring->r_vec;
struct nfp_net *nn = r_vec->nfp_net;
@@ -1477,6 +1483,7 @@ static int nfp_net_rx_ring_alloc(struct nfp_net_rx_ring *rx_ring)
int sz;
rx_ring->cnt = nn->rxd_cnt;
+ rx_ring->bufsz = fl_bufsz;
rx_ring->size = sizeof(*rx_ring->rxds) * rx_ring->cnt;
rx_ring->rxds = dma_zalloc_coherent(&pdev->dev, rx_ring->size,
@@ -1818,7 +1825,8 @@ static int nfp_net_netdev_open(struct net_device *netdev)
if (err)
goto err_cleanup_vec_p;
- err = nfp_net_rx_ring_alloc(nn->r_vecs[r].rx_ring);
+ err = nfp_net_rx_ring_alloc(nn->r_vecs[r].rx_ring,
+ nn->fl_bufsz);
if (err)
goto err_free_tx_ring_p;
--
1.9.1
* [PATCH v4 net-next 13/15] nfp: convert .ndo_change_mtu() to prepare/commit paradigm
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
When changing MTU on a running device, first allocate new rings
and buffers, and only once that succeeds proceed with changing
the MTU.
Allocation of new rings is not strictly necessary for this
operation; it is done to keep the code simple and because the
size of the extra ring memory is quite small compared to
the size of the buffers.
The operation can still fail midway through if FW communication
times out. In that case we retry with the old MTU (and old rings).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
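The prepare/commit flow described above can be sketched in user space roughly
as follows. This is an illustration under stated assumptions rather than the
driver logic: struct demo_dev, demo_fw_enable() and the fw_max_mtu knob are
hypothetical, a small int array stands in for the ring set, and the simulated
firmware simply rejects any MTU above fw_max_mtu.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_NUM_RINGS 4

/* Hypothetical stand-in for the device state. */
struct demo_dev {
	int *rings;               /* currently active ring set */
	unsigned int mtu;
	unsigned int fw_max_mtu;  /* test knob for the simulated firmware */
};

/* Simulated firmware enable: fails when the MTU is too large. */
static int demo_fw_enable(const struct demo_dev *dev)
{
	return dev->mtu > dev->fw_max_mtu ? -1 : 0;
}

/* Install a ring set and hand back the one that was active before. */
static int *demo_rings_swap(struct demo_dev *dev, int *rings)
{
	int *old = dev->rings;

	dev->rings = rings;
	return old;
}

static int demo_change_mtu(struct demo_dev *dev, unsigned int new_mtu)
{
	unsigned int old_mtu = dev->mtu;
	int *tmp;
	int err;

	/* Prepare: allocate before touching the running configuration. */
	tmp = calloc(DEMO_NUM_RINGS, sizeof(*tmp));
	if (!tmp)
		return -1;  /* nothing has changed yet */

	/* Commit: after the swap, tmp holds the old rings. */
	tmp = demo_rings_swap(dev, tmp);
	dev->mtu = new_mtu;

	err = demo_fw_enable(dev);
	if (err) {
		/* Roll back: after this swap, tmp holds the new rings. */
		tmp = demo_rings_swap(dev, tmp);
		dev->mtu = old_mtu;

		/* As in the patch, the value returned is that of the retry. */
		err = demo_fw_enable(dev);
	}

	/* Whichever set ended up unused is released. */
	free(tmp);
	return err;
}
```

Because the swap helper returns the set it displaced, the same temporary
pointer always refers to "the rings not in use", so a single free at the end
covers both the success path and the rollback path.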
.../net/ethernet/netronome/nfp/nfp_net_common.c | 110 +++++++++++++++++++--
1 file changed, 103 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index eeabc33fe13d..33001ce1d8bf 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1507,6 +1507,64 @@ err_alloc:
return -ENOMEM;
}
+static struct nfp_net_rx_ring *
+nfp_net_shadow_rx_rings_prepare(struct nfp_net *nn, unsigned int fl_bufsz)
+{
+ struct nfp_net_rx_ring *rings;
+ unsigned int r;
+
+ rings = kcalloc(nn->num_rx_rings, sizeof(*rings), GFP_KERNEL);
+ if (!rings)
+ return NULL;
+
+ for (r = 0; r < nn->num_rx_rings; r++) {
+ nfp_net_rx_ring_init(&rings[r], nn->rx_rings[r].r_vec, r);
+
+ if (nfp_net_rx_ring_alloc(&rings[r], fl_bufsz))
+ goto err_free_prev;
+
+ if (nfp_net_rx_ring_bufs_alloc(nn, &rings[r]))
+ goto err_free_ring;
+ }
+
+ return rings;
+
+err_free_prev:
+ while (r--) {
+ nfp_net_rx_ring_bufs_free(nn, &rings[r]);
+err_free_ring:
+ nfp_net_rx_ring_free(&rings[r]);
+ }
+ kfree(rings);
+ return NULL;
+}
+
+static struct nfp_net_rx_ring *
+nfp_net_shadow_rx_rings_swap(struct nfp_net *nn, struct nfp_net_rx_ring *rings)
+{
+ struct nfp_net_rx_ring *old = nn->rx_rings;
+ unsigned int r;
+
+ for (r = 0; r < nn->num_rx_rings; r++)
+ old[r].r_vec->rx_ring = &rings[r];
+
+ nn->rx_rings = rings;
+ return old;
+}
+
+static void
+nfp_net_shadow_rx_rings_free(struct nfp_net *nn, struct nfp_net_rx_ring *rings)
+{
+ unsigned int r;
+
+ for (r = 0; r < nn->num_r_vecs; r++) {
+ nfp_net_rx_ring_bufs_free(nn, &rings[r]);
+ nfp_net_rx_ring_free(&rings[r]);
+ }
+
+ kfree(rings);
+}
+
static int
nfp_net_prepare_vector(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
int idx)
@@ -1985,24 +2043,62 @@ static void nfp_net_set_rx_mode(struct net_device *netdev)
static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
{
+ unsigned int old_mtu, old_fl_bufsz, new_fl_bufsz;
struct nfp_net *nn = netdev_priv(netdev);
+ struct nfp_net_rx_ring *tmp_rings;
+ int err;
if (new_mtu < 68 || new_mtu > nn->max_mtu) {
nn_err(nn, "New MTU (%d) is not valid\n", new_mtu);
return -EINVAL;
}
- netdev->mtu = new_mtu;
- nn->fl_bufsz = NFP_NET_MAX_PREPEND + ETH_HLEN + VLAN_HLEN * 2 +
+ old_mtu = netdev->mtu;
+ old_fl_bufsz = nn->fl_bufsz;
+ new_fl_bufsz = NFP_NET_MAX_PREPEND + ETH_HLEN + VLAN_HLEN * 2 +
MPLS_HLEN * 8 + new_mtu;
- /* restart if running */
- if (netif_running(netdev)) {
- nfp_net_netdev_close(netdev);
- nfp_net_netdev_open(netdev);
+ if (!netif_running(netdev)) {
+ netdev->mtu = new_mtu;
+ nn->fl_bufsz = new_fl_bufsz;
+ return 0;
}
- return 0;
+ /* Prepare new rings */
+ tmp_rings = nfp_net_shadow_rx_rings_prepare(nn, new_fl_bufsz);
+ if (!tmp_rings)
+ return -ENOMEM;
+
+ /* Stop device, swap in new rings, try to start the firmware */
+ nfp_net_close_stack(nn);
+ nfp_net_clear_config_and_disable(nn);
+
+ tmp_rings = nfp_net_shadow_rx_rings_swap(nn, tmp_rings);
+
+ netdev->mtu = new_mtu;
+ nn->fl_bufsz = new_fl_bufsz;
+
+ err = nfp_net_set_config_and_enable(nn);
+ if (err) {
+ const int err_new = err;
+
+ /* Try with old configuration and old rings */
+ tmp_rings = nfp_net_shadow_rx_rings_swap(nn, tmp_rings);
+
+ netdev->mtu = old_mtu;
+ nn->fl_bufsz = old_fl_bufsz;
+
+ err = __nfp_net_set_config_and_enable(nn);
+ if (err)
+ nn_err(nn, "Can't restore MTU - FW communication failed (%d,%d)\n",
+ err_new, err);
+ }
+
+ nfp_net_shadow_rx_rings_free(nn, tmp_rings);
+
+ nfp_net_open_stack(nn);
+
+ return err;
}
static struct rtnl_link_stats64 *nfp_net_stat64(struct net_device *netdev,
--
1.9.1
* [PATCH v4 net-next 14/15] nfp: pass ring count as function parameter
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
Soon ring resize will call these functions with values
different from the current configuration, so we need to
explicitly pass the ring count as a parameter.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
.../net/ethernet/netronome/nfp/nfp_net_common.c | 23 +++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 33001ce1d8bf..631168a1971e 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1408,17 +1408,18 @@ static void nfp_net_tx_ring_free(struct nfp_net_tx_ring *tx_ring)
/**
* nfp_net_tx_ring_alloc() - Allocate resource for a TX ring
* @tx_ring: TX Ring structure to allocate
+ * @cnt: Ring buffer count
*
* Return: 0 on success, negative errno otherwise.
*/
-static int nfp_net_tx_ring_alloc(struct nfp_net_tx_ring *tx_ring)
+static int nfp_net_tx_ring_alloc(struct nfp_net_tx_ring *tx_ring, u32 cnt)
{
struct nfp_net_r_vector *r_vec = tx_ring->r_vec;
struct nfp_net *nn = r_vec->nfp_net;
struct pci_dev *pdev = nn->pdev;
int sz;
- tx_ring->cnt = nn->txd_cnt;
+ tx_ring->cnt = cnt;
tx_ring->size = sizeof(*tx_ring->txds) * tx_ring->cnt;
tx_ring->txds = dma_zalloc_coherent(&pdev->dev, tx_ring->size,
@@ -1471,18 +1472,20 @@ static void nfp_net_rx_ring_free(struct nfp_net_rx_ring *rx_ring)
* nfp_net_rx_ring_alloc() - Allocate resource for a RX ring
* @rx_ring: RX ring to allocate
* @fl_bufsz: Size of buffers to allocate
+ * @cnt: Ring buffer count
*
* Return: 0 on success, negative errno otherwise.
*/
static int
-nfp_net_rx_ring_alloc(struct nfp_net_rx_ring *rx_ring, unsigned int fl_bufsz)
+nfp_net_rx_ring_alloc(struct nfp_net_rx_ring *rx_ring, unsigned int fl_bufsz,
+ u32 cnt)
{
struct nfp_net_r_vector *r_vec = rx_ring->r_vec;
struct nfp_net *nn = r_vec->nfp_net;
struct pci_dev *pdev = nn->pdev;
int sz;
- rx_ring->cnt = nn->rxd_cnt;
+ rx_ring->cnt = cnt;
rx_ring->bufsz = fl_bufsz;
rx_ring->size = sizeof(*rx_ring->rxds) * rx_ring->cnt;
@@ -1508,7 +1511,8 @@ err_alloc:
}
static struct nfp_net_rx_ring *
-nfp_net_shadow_rx_rings_prepare(struct nfp_net *nn, unsigned int fl_bufsz)
+nfp_net_shadow_rx_rings_prepare(struct nfp_net *nn, unsigned int fl_bufsz,
+ u32 buf_cnt)
{
struct nfp_net_rx_ring *rings;
unsigned int r;
@@ -1520,7 +1524,7 @@ nfp_net_shadow_rx_rings_prepare(struct nfp_net *nn, unsigned int fl_bufsz)
for (r = 0; r < nn->num_rx_rings; r++) {
nfp_net_rx_ring_init(&rings[r], nn->rx_rings[r].r_vec, r);
- if (nfp_net_rx_ring_alloc(&rings[r], fl_bufsz))
+ if (nfp_net_rx_ring_alloc(&rings[r], fl_bufsz, buf_cnt))
goto err_free_prev;
if (nfp_net_rx_ring_bufs_alloc(nn, &rings[r]))
@@ -1879,12 +1883,12 @@ static int nfp_net_netdev_open(struct net_device *netdev)
if (err)
goto err_free_prev_vecs;
- err = nfp_net_tx_ring_alloc(nn->r_vecs[r].tx_ring);
+ err = nfp_net_tx_ring_alloc(nn->r_vecs[r].tx_ring, nn->txd_cnt);
if (err)
goto err_cleanup_vec_p;
err = nfp_net_rx_ring_alloc(nn->r_vecs[r].rx_ring,
- nn->fl_bufsz);
+ nn->fl_bufsz, nn->rxd_cnt);
if (err)
goto err_free_tx_ring_p;
@@ -2065,7 +2069,8 @@ static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
}
/* Prepare new rings */
- tmp_rings = nfp_net_shadow_rx_rings_prepare(nn, new_fl_bufsz);
+ tmp_rings = nfp_net_shadow_rx_rings_prepare(nn, new_fl_bufsz,
+ nn->rxd_cnt);
if (!tmp_rings)
return -ENOMEM;
--
1.9.1
* [PATCH v4 net-next 15/15] nfp: allow ring size reconfiguration at runtime
From: Jakub Kicinski @ 2016-04-01 21:06 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski
In-Reply-To: <1459544811-24879-1-git-send-email-jakub.kicinski@netronome.com>
Since many of the required changes have already been made for
changing MTU at runtime, let's reuse them for ring size changes
as well.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
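One behavioural change in the ethtool hunk of this patch is that requested
ring sizes are still rounded up to a power of two, but out-of-range results
are now rejected instead of being clamped. A minimal user-space sketch of that
check follows. The limits DEMO_MIN_DESCS and DEMO_MAX_DESCS are assumed values
chosen for illustration (the real NFP_NET_MIN/MAX_*_DESCS constants are not
shown in this patch), and demo_roundup_pow_of_two() is a hypothetical stand-in
for the kernel's roundup_pow_of_two().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed limits, for illustration only. */
#define DEMO_MIN_DESCS 256u
#define DEMO_MAX_DESCS (256u * 1024u)

/* Smallest power of two >= n; returns 0 if it does not fit in 32 bits. */
static uint32_t demo_roundup_pow_of_two(uint32_t n)
{
	uint32_t p = 1;

	if (n > 0x80000000u)
		return 0;
	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Round the request up, then reject it if it falls outside the supported
 * range. Returns 0 and stores the rounded count on success, -1 otherwise.
 */
static int demo_check_ring_size(uint32_t requested, uint32_t *out)
{
	uint32_t cnt = demo_roundup_pow_of_two(requested);

	if (cnt < DEMO_MIN_DESCS || cnt > DEMO_MAX_DESCS)
		return -1;

	*out = cnt;
	return 0;
}
```

Rejecting rather than clamping means a caller never ends up with a ring size
that silently differs from the range it asked for by more than the
power-of-two rounding.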
drivers/net/ethernet/netronome/nfp/nfp_net.h | 1 +
.../net/ethernet/netronome/nfp/nfp_net_common.c | 126 +++++++++++++++++++++
.../net/ethernet/netronome/nfp/nfp_net_ethtool.c | 30 ++---
3 files changed, 136 insertions(+), 21 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 1e08c9cf3ee0..90ad6264e62c 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -724,6 +724,7 @@ void nfp_net_rss_write_key(struct nfp_net *nn);
void nfp_net_coalesce_write_cfg(struct nfp_net *nn);
int nfp_net_irqs_alloc(struct nfp_net *nn);
void nfp_net_irqs_disable(struct nfp_net *nn);
+int nfp_net_set_ring_size(struct nfp_net *nn, u32 rxd_cnt, u32 txd_cnt);
#ifdef CONFIG_NFP_NET_DEBUG
void nfp_net_debugfs_create(void);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 631168a1971e..57f330a4736e 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1445,6 +1445,59 @@ err_alloc:
return -ENOMEM;
}
+static struct nfp_net_tx_ring *
+nfp_net_shadow_tx_rings_prepare(struct nfp_net *nn, u32 buf_cnt)
+{
+ struct nfp_net_tx_ring *rings;
+ unsigned int r;
+
+ rings = kcalloc(nn->num_tx_rings, sizeof(*rings), GFP_KERNEL);
+ if (!rings)
+ return NULL;
+
+ for (r = 0; r < nn->num_tx_rings; r++) {
+ nfp_net_tx_ring_init(&rings[r], nn->tx_rings[r].r_vec, r);
+
+ if (nfp_net_tx_ring_alloc(&rings[r], buf_cnt))
+ goto err_free_prev;
+ }
+
+ return rings;
+
+err_free_prev:
+ while (r--)
+ nfp_net_tx_ring_free(&rings[r]);
+ kfree(rings);
+ return NULL;
+}
+
+static struct nfp_net_tx_ring *
+nfp_net_shadow_tx_rings_swap(struct nfp_net *nn, struct nfp_net_tx_ring *rings)
+{
+ struct nfp_net_tx_ring *old = nn->tx_rings;
+ unsigned int r;
+
+ for (r = 0; r < nn->num_tx_rings; r++)
+ old[r].r_vec->tx_ring = &rings[r];
+
+ nn->tx_rings = rings;
+ return old;
+}
+
+static void
+nfp_net_shadow_tx_rings_free(struct nfp_net *nn, struct nfp_net_tx_ring *rings)
+{
+ unsigned int r;
+
+ if (!rings)
+ return;
+
+ for (r = 0; r < nn->num_tx_rings; r++)
+ nfp_net_tx_ring_free(&rings[r]);
+
+ kfree(rings);
+}
+
/**
* nfp_net_rx_ring_free() - Free resources allocated to a RX ring
* @rx_ring: RX ring to free
@@ -1561,6 +1614,9 @@ nfp_net_shadow_rx_rings_free(struct nfp_net *nn, struct nfp_net_rx_ring *rings)
{
unsigned int r;
+ if (!rings)
+ return;
+
for (r = 0; r < nn->num_r_vecs; r++) {
nfp_net_rx_ring_bufs_free(nn, &rings[r]);
nfp_net_rx_ring_free(&rings[r]);
@@ -2106,6 +2162,76 @@ static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
return err;
}
+int nfp_net_set_ring_size(struct nfp_net *nn, u32 rxd_cnt, u32 txd_cnt)
+{
+ struct nfp_net_tx_ring *tx_rings = NULL;
+ struct nfp_net_rx_ring *rx_rings = NULL;
+ u32 old_rxd_cnt, old_txd_cnt;
+ int err;
+
+ if (!netif_running(nn->netdev)) {
+ nn->rxd_cnt = rxd_cnt;
+ nn->txd_cnt = txd_cnt;
+ return 0;
+ }
+
+ old_rxd_cnt = nn->rxd_cnt;
+ old_txd_cnt = nn->txd_cnt;
+
+ /* Prepare new rings */
+ if (nn->rxd_cnt != rxd_cnt) {
+ rx_rings = nfp_net_shadow_rx_rings_prepare(nn, nn->fl_bufsz,
+ rxd_cnt);
+ if (!rx_rings)
+ return -ENOMEM;
+ }
+ if (nn->txd_cnt != txd_cnt) {
+ tx_rings = nfp_net_shadow_tx_rings_prepare(nn, txd_cnt);
+ if (!tx_rings) {
+ nfp_net_shadow_rx_rings_free(nn, rx_rings);
+ return -ENOMEM;
+ }
+ }
+
+ /* Stop device, swap in new rings, try to start the firmware */
+ nfp_net_close_stack(nn);
+ nfp_net_clear_config_and_disable(nn);
+
+ if (rx_rings)
+ rx_rings = nfp_net_shadow_rx_rings_swap(nn, rx_rings);
+ if (tx_rings)
+ tx_rings = nfp_net_shadow_tx_rings_swap(nn, tx_rings);
+
+ nn->rxd_cnt = rxd_cnt;
+ nn->txd_cnt = txd_cnt;
+
+ err = nfp_net_set_config_and_enable(nn);
+ if (err) {
+ const int err_new = err;
+
+ /* Try with old configuration and old rings */
+ if (rx_rings)
+ rx_rings = nfp_net_shadow_rx_rings_swap(nn, rx_rings);
+ if (tx_rings)
+ tx_rings = nfp_net_shadow_tx_rings_swap(nn, tx_rings);
+
+ nn->rxd_cnt = old_rxd_cnt;
+ nn->txd_cnt = old_txd_cnt;
+
+ err = __nfp_net_set_config_and_enable(nn);
+ if (err)
+ nn_err(nn, "Can't restore ring config - FW communication failed (%d,%d)\n",
+ err_new, err);
+ }
+
+ nfp_net_shadow_rx_rings_free(nn, rx_rings);
+ nfp_net_shadow_tx_rings_free(nn, tx_rings);
+
+ nfp_net_open_stack(nn);
+
+ return err;
+}
+
static struct rtnl_link_stats64 *nfp_net_stat64(struct net_device *netdev,
struct rtnl_link_stats64 *stats)
{
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
index 9a4084a68db5..ccfef1f17627 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
@@ -153,37 +153,25 @@ static int nfp_net_set_ringparam(struct net_device *netdev,
struct nfp_net *nn = netdev_priv(netdev);
u32 rxd_cnt, txd_cnt;
- if (netif_running(netdev)) {
- /* Some NIC drivers allow reconfiguration on the fly,
- * some down the interface, change and then up it
- * again. For now we don't allow changes when the
- * device is up.
- */
- nn_warn(nn, "Can't change rings while device is up\n");
- return -EBUSY;
- }
-
/* We don't have separate queues/rings for small/large frames. */
if (ring->rx_mini_pending || ring->rx_jumbo_pending)
return -EINVAL;
/* Round up to supported values */
rxd_cnt = roundup_pow_of_two(ring->rx_pending);
- rxd_cnt = max_t(u32, rxd_cnt, NFP_NET_MIN_RX_DESCS);
- rxd_cnt = min_t(u32, rxd_cnt, NFP_NET_MAX_RX_DESCS);
-
txd_cnt = roundup_pow_of_two(ring->tx_pending);
- txd_cnt = max_t(u32, txd_cnt, NFP_NET_MIN_TX_DESCS);
- txd_cnt = min_t(u32, txd_cnt, NFP_NET_MAX_TX_DESCS);
- if (nn->rxd_cnt != rxd_cnt || nn->txd_cnt != txd_cnt)
- nn_dbg(nn, "Change ring size: RxQ %u->%u, TxQ %u->%u\n",
- nn->rxd_cnt, rxd_cnt, nn->txd_cnt, txd_cnt);
+ if (rxd_cnt < NFP_NET_MIN_RX_DESCS || rxd_cnt > NFP_NET_MAX_RX_DESCS ||
+ txd_cnt < NFP_NET_MIN_TX_DESCS || txd_cnt > NFP_NET_MAX_TX_DESCS)
+ return -EINVAL;
- nn->rxd_cnt = rxd_cnt;
- nn->txd_cnt = txd_cnt;
+ if (nn->rxd_cnt == rxd_cnt && nn->txd_cnt == txd_cnt)
+ return 0;
- return 0;
+ nn_dbg(nn, "Change ring size: RxQ %u->%u, TxQ %u->%u\n",
+ nn->rxd_cnt, rxd_cnt, nn->txd_cnt, txd_cnt);
+
+ return nfp_net_set_ring_size(nn, rxd_cnt, txd_cnt);
}
static void nfp_net_get_strings(struct net_device *netdev,
--
1.9.1