* Re: [PATCH v3 0/5] Introduce variable length mdev alias
From: Cornelia Huck @ 2019-09-11 16:29 UTC
To: Parav Pandit
Cc: Alex Williamson, Jiri Pirko, kwankhede@nvidia.com,
davem@davemloft.net, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <AM0PR05MB48668DFF8E816F0D2D3041BFD1B10@AM0PR05MB4866.eurprd05.prod.outlook.com>
On Wed, 11 Sep 2019 15:30:40 +0000
Parav Pandit <parav@mellanox.com> wrote:
> Hi Alex,
>
> > -----Original Message-----
> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Wednesday, September 11, 2019 8:56 AM
> > To: Parav Pandit <parav@mellanox.com>
> > Cc: Jiri Pirko <jiri@mellanox.com>; kwankhede@nvidia.com;
> > cohuck@redhat.com; davem@davemloft.net; kvm@vger.kernel.org; linux-
> > kernel@vger.kernel.org; netdev@vger.kernel.org
> > Subject: Re: [PATCH v3 0/5] Introduce variable length mdev alias
> >
> > On Mon, 9 Sep 2019 20:42:32 +0000
> > Parav Pandit <parav@mellanox.com> wrote:
> >
> > > Hi Alex,
> > >
> > > > -----Original Message-----
> > > > From: Parav Pandit <parav@mellanox.com>
> > > > Sent: Sunday, September 1, 2019 11:25 PM
> > > > To: alex.williamson@redhat.com; Jiri Pirko <jiri@mellanox.com>;
> > > > kwankhede@nvidia.com; cohuck@redhat.com; davem@davemloft.net
> > > > Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > > netdev@vger.kernel.org; Parav Pandit <parav@mellanox.com>
> > > > Subject: [PATCH v3 0/5] Introduce variable length mdev alias
> > > >
> > > > To have consistent naming for the netdevice of a mdev and to have
> > > > consistent naming of the devlink port [1] of a mdev, which is formed
> > > > using phys_port_name of the devlink port, current UUID is not usable
> > > > because UUID is too long.
> > > >
> > > > UUID in string format is 36-characters long and in binary 128-bit.
> > > > Both formats are not able to fit within 15 characters limit of netdev
> > name.
> > > >
> > > > It is desired to have mdev device naming consistent using UUID.
> > > > So that widely used user space framework such as ovs [2] can make
> > > > use of mdev representor in similar way as PCIe SR-IOV VF and PF
> > representors.
> > > >
> > > > Hence,
> > > > (a) mdev alias is created which is derived using sha1 from the mdev
> > name.
> > > > (b) Vendor driver describes how long an alias should be for the
> > > > child mdev created for a given parent.
> > > > (c) Mdev aliases are unique at system level.
> > > > (d) alias is created optionally whenever parent requested.
> > > > This ensures that non networking mdev parents can function without
> > > > alias creation overhead.
> > > >
> > > > This design is discussed at [3].
> > > >
> > > > An example systemd/udev extension will have,
> > > >
> > > > 1. netdev name created using mdev alias available in sysfs.
> > > >
> > > > mdev UUID=83b8f4f2-509f-382f-3c1e-e6bfe0fa1001
> > > > mdev 12 character alias=cd5b146a80a5
> > > >
> > > > netdev name of this mdev = enmcd5b146a80a5 Here en = Ethernet link m
> > > > = mediated device
> > > >
> > > > 2. devlink port phys_port_name created using mdev alias.
> > > > devlink phys_port_name=pcd5b146a80a5
> > > >
> > > > This patchset enables mdev core to maintain unique alias for a mdev.
> > > >
> > > > Patch-1 Introduces mdev alias using sha1.
> > > > Patch-2 Ensures that mdev alias is unique in a system.
> > > > > Patch-3 Exposes mdev alias in a sysfs hierarchy, update Documentation
> > > > Patch-4 Introduces mdev_alias() API.
> > > > Patch-5 Extends mtty driver to optionally provide alias generation.
> > > > This also enables to test UUID based sha1 collision and trigger
> > > > error handling for duplicate sha1 results.
> > > >
> > > > [1] http://man7.org/linux/man-pages/man8/devlink-port.8.html
> > > > [2] https://docs.openstack.org/os-vif/latest/user/plugins/ovs.html
> > > > [3] https://patchwork.kernel.org/cover/11084231/
> > > >
> > > > ---
> > > > Changelog:
> > > > v2->v3:
> > > > - Addressed comment from Yunsheng Lin
> > > > - Changed strcmp() ==0 to !strcmp()
> > > > > - Addressed comment from Cornelia Huck
> > > > > - Merged sysfs Documentation patch with sysfs patch
> > > > - Added more description for alias return value
> > >
> > > > Did you get a chance to review this updated series?
> > > > I addressed Cornelia's and your comments.
> > > I do not think allocating alias memory twice, once for comparison and
> > > once for storing is good idea or moving alias generation logic inside
> > > the mdev_list_lock(). So I didn't address that suggestion of Cornelia.
> >
> > Sorry, I'm at LPC this week. I agree, I don't think the double allocation is
> > necessary, I thought the comment was sufficient to clarify null'ing the
> > variable. It's awkward, but seems correct.
Not hot about it, but no real complaints.
However, please give me some more time, as I'm at LPC as well.
> >
> > I'm not sure what we do with this patch series though, has the real
> > consumer of this even been proposed? It feels optimistic to include at this
> > point. We've used the sample driver as a placeholder in the past for
> > mdev_uuid(), but we arrived at that via a conversion rather than explicitly
> > adding the API. Please let me know where the consumer patches stand,
> > perhaps it would make more sense for them to go in together rather than
> > risk adding an unused API. Thanks,
> >
> Given that the consumer patch series is relatively large (15+ patches), I was considering merging this one as a pre-series to it.
> It's OK to combine this with the consumer patch series.
> But I wanted to have it reviewed beforehand, so that there is less churn in the actual consumer series, which is more mlx5_core and devlink/netdev centric.
> So if you can add a Reviewed-by, it will be easier to combine with the consumer series.
>
> And if we merge it with the consumer series, it will come through Dave Miller's tree instead of your tree.
> Would that work for you?
It would be easier to see what to do here if we could see the consumer
for this. If those patches are fine, we could maybe queue this series
via both trees?
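
For readers following along, the naming scheme from the quoted cover letter
can be sketched roughly as below. This only illustrates the length
arithmetic ("enm" + a 12-character alias fits the 15-character netdev name
limit, a 36-character UUID string does not). build_name() and
NETDEV_NAME_MAX are hypothetical names made up for this sketch; they are not
part of the mdev or netdev API, and the SHA-1 derivation of the alias itself
is not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* IFNAMSIZ in Linux is 16: 15 visible characters plus the trailing NUL. */
#define NETDEV_NAME_MAX 15

/*
 * Build "<prefix><alias>" into out.
 * Returns 0 on success, -1 if the result would exceed max_len visible
 * characters or would not fit in the buffer.
 * Hypothetical helper for illustration only.
 */
static int build_name(char *out, size_t out_size, const char *prefix,
		      const char *alias, size_t max_len)
{
	size_t need = strlen(prefix) + strlen(alias);

	assert(out != NULL && out_size > 0);
	if (need > max_len || need >= out_size)
		return -1;
	snprintf(out, out_size, "%s%s", prefix, alias);
	return 0;
}
```

Called with prefix "enm" and alias "cd5b146a80a5" this yields
"enmcd5b146a80a5", matching the example in the cover letter; prefix "p"
gives the phys_port_name form "pcd5b146a80a5".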
* RE: [PATCH v3 0/5] Introduce variable length mdev alias
From: Parav Pandit @ 2019-09-11 16:38 UTC
To: Parav Pandit, Alex Williamson
Cc: Jiri Pirko, kwankhede@nvidia.com, cohuck@redhat.com,
davem@davemloft.net, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <AM0PR05MB48668DFF8E816F0D2D3041BFD1B10@AM0PR05MB4866.eurprd05.prod.outlook.com>
> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org <linux-kernel-
> owner@vger.kernel.org> On Behalf Of Parav Pandit
> Sent: Wednesday, September 11, 2019 10:31 AM
> To: Alex Williamson <alex.williamson@redhat.com>
> Cc: Jiri Pirko <jiri@mellanox.com>; kwankhede@nvidia.com;
> cohuck@redhat.com; davem@davemloft.net; kvm@vger.kernel.org; linux-
> kernel@vger.kernel.org; netdev@vger.kernel.org
> Subject: RE: [PATCH v3 0/5] Introduce variable length mdev alias
>
> Hi Alex,
>
> > -----Original Message-----
> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Wednesday, September 11, 2019 8:56 AM
> > To: Parav Pandit <parav@mellanox.com>
> > Cc: Jiri Pirko <jiri@mellanox.com>; kwankhede@nvidia.com;
> > cohuck@redhat.com; davem@davemloft.net; kvm@vger.kernel.org; linux-
> > kernel@vger.kernel.org; netdev@vger.kernel.org
> > Subject: Re: [PATCH v3 0/5] Introduce variable length mdev alias
> >
> > On Mon, 9 Sep 2019 20:42:32 +0000
> > Parav Pandit <parav@mellanox.com> wrote:
> >
> > > Hi Alex,
> > >
> > > > -----Original Message-----
> > > > From: Parav Pandit <parav@mellanox.com>
> > > > Sent: Sunday, September 1, 2019 11:25 PM
> > > > To: alex.williamson@redhat.com; Jiri Pirko <jiri@mellanox.com>;
> > > > kwankhede@nvidia.com; cohuck@redhat.com; davem@davemloft.net
> > > > Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > > netdev@vger.kernel.org; Parav Pandit <parav@mellanox.com>
> > > > Subject: [PATCH v3 0/5] Introduce variable length mdev alias
> > > >
> > > > To have consistent naming for the netdevice of a mdev and to have
> > > > consistent naming of the devlink port [1] of a mdev, which is
> > > > formed using phys_port_name of the devlink port, current UUID is
> > > > not usable because UUID is too long.
> > > >
> > > > UUID in string format is 36-characters long and in binary 128-bit.
> > > > Both formats are not able to fit within 15 characters limit of
> > > > netdev
> > name.
> > > >
> > > > It is desired to have mdev device naming consistent using UUID.
> > > > So that widely used user space framework such as ovs [2] can make
> > > > use of mdev representor in similar way as PCIe SR-IOV VF and PF
> > representors.
> > > >
> > > > Hence,
> > > > (a) mdev alias is created which is derived using sha1 from the
> > > > mdev
> > name.
> > > > (b) Vendor driver describes how long an alias should be for the
> > > > child mdev created for a given parent.
> > > > (c) Mdev aliases are unique at system level.
> > > > (d) alias is created optionally whenever parent requested.
> > > > This ensures that non networking mdev parents can function without
> > > > alias creation overhead.
> > > >
> > > > This design is discussed at [3].
> > > >
> > > > An example systemd/udev extension will have,
> > > >
> > > > 1. netdev name created using mdev alias available in sysfs.
> > > >
> > > > mdev UUID=83b8f4f2-509f-382f-3c1e-e6bfe0fa1001
> > > > mdev 12 character alias=cd5b146a80a5
> > > >
> > > > netdev name of this mdev = enmcd5b146a80a5 Here en = Ethernet link
> > > > m = mediated device
> > > >
> > > > 2. devlink port phys_port_name created using mdev alias.
> > > > devlink phys_port_name=pcd5b146a80a5
> > > >
> > > > This patchset enables mdev core to maintain unique alias for a mdev.
> > > >
> > > > Patch-1 Introduces mdev alias using sha1.
> > > > Patch-2 Ensures that mdev alias is unique in a system.
> > > > > Patch-3 Exposes mdev alias in a sysfs hierarchy, update
> > > > > Documentation
> > > > Patch-4 Introduces mdev_alias() API.
> > > > Patch-5 Extends mtty driver to optionally provide alias generation.
> > > > This also enables to test UUID based sha1 collision and trigger
> > > > error handling for duplicate sha1 results.
> > > >
> > > > [1] http://man7.org/linux/man-pages/man8/devlink-port.8.html
> > > > [2] https://docs.openstack.org/os-vif/latest/user/plugins/ovs.html
> > > > [3] https://patchwork.kernel.org/cover/11084231/
> > > >
> > > > ---
> > > > Changelog:
> > > > v2->v3:
> > > > - Addressed comment from Yunsheng Lin
> > > > - Changed strcmp() ==0 to !strcmp()
> > > > > - Addressed comment from Cornelia Huck
> > > > > - Merged sysfs Documentation patch with sysfs patch
> > > > - Added more description for alias return value
> > >
> > > > Did you get a chance to review this updated series?
> > > > I addressed Cornelia's and your comments.
> > > I do not think allocating alias memory twice, once for comparison
> > > and once for storing is good idea or moving alias generation logic
> > > inside the mdev_list_lock(). So I didn't address that suggestion of
> Cornelia.
> >
> > Sorry, I'm at LPC this week. I agree, I don't think the double
> > allocation is necessary, I thought the comment was sufficient to
> > clarify null'ing the variable. It's awkward, but seems correct.
> >
> > I'm not sure what we do with this patch series though, has the real
> > consumer of this even been proposed?
Jiri already acked using mdev_alias() to generate phys_port_name several days ago in the discussion at [1].
After that thread [1] concluded, I proceeded with mdev_alias().
The mlx5_core patches are not yet on the netdev mailing list, but we all agree to use mdev_alias() for devlink phys_port_name generation.
So we have collective agreement on how to proceed.
I probably wasn't clear enough about this in my previous reply, so I am adding the link here.
[1] https://patchwork.kernel.org/cover/11084231/#22838955
> It feels optimistic to include
> > at this point. We've used the sample driver as a placeholder in the
> > past for mdev_uuid(), but we arrived at that via a conversion rather
> > than explicitly adding the API. Please let me know where the consumer
> > patches stand, perhaps it would make more sense for them to go in
> > together rather than risk adding an unused API. Thanks,
> >
> Given that the consumer patch series is relatively large (15+ patches), I
> was considering merging this one as a pre-series to it.
> It's OK to combine this with the consumer patch series.
> But I wanted to have it reviewed beforehand, so that there is less churn in
> the actual consumer series, which is more mlx5_core and devlink/netdev centric.
> So if you can add a Reviewed-by, it will be easier to combine with the
> consumer series.
>
> And if we merge it with the consumer series, it will come through Dave
> Miller's tree instead of your tree.
> Would that work for you?
* [net 0/2][pull request] Intel Wired LAN Driver Updates 2019-09-11
From: Jeff Kirsher @ 2019-09-11 16:49 UTC
To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann
This series contains fixes to ixgbe.
Alex fixes up the adaptive ITR scheme for ixgbe which could result in a
value that was either 0 or something less than 10 which was causing
issues with hardware features, like RSC, that do not function well with
ITR values that low.
Ilya Maximets fixes the ixgbe driver to limit the number of transmit
descriptors to clean by the number of transmit descriptors used in the
transmit ring, so that the driver does not try to "double" clean the
same descriptors.
The following are changes since commit f4b752a6b2708bfdf7fbe8a241082c8104f4ce05:
mlx4: fix spelling mistake "veify" -> "verify"
and are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue 10GbE
Alexander Duyck (1):
ixgbe: Prevent u8 wrapping of ITR value to something less than 10us
Ilya Maximets (1):
ixgbe: fix double clean of Tx descriptors with xdp
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++-
drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 29 +++++++------------
2 files changed, 14 insertions(+), 19 deletions(-)
--
2.21.0
* [net 2/2] ixgbe: fix double clean of Tx descriptors with xdp
From: Jeff Kirsher @ 2019-09-11 16:49 UTC
To: davem
Cc: Ilya Maximets, netdev, nhorman, sassmann, stable, William Tu,
Eelco Chaudron, Jeff Kirsher
In-Reply-To: <20190911164955.10644-1-jeffrey.t.kirsher@intel.com>
From: Ilya Maximets <i.maximets@samsung.com>
Tx code doesn't clear the descriptors' status after cleaning.
So, if the budget is larger than number of used elems in a ring, some
descriptors will be accounted twice and xsk_umem_complete_tx will move
prod_tail far beyond the prod_head breaking the completion queue ring.
Fix that by limiting the number of descriptors to clean by the number
of used descriptors in the Tx ring.
'ixgbe_clean_xdp_tx_irq()' function refactored to look more like
'ixgbe_xsk_clean_tx_ring()' since we're allowed to directly use
'next_to_clean' and 'next_to_use' indexes.
CC: stable@vger.kernel.org
Fixes: 8221c5eba8c1 ("ixgbe: add AF_XDP zero-copy Tx support")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 29 ++++++++------------
1 file changed, 11 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index 6b609553329f..a3b6d8c89127 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -633,19 +633,17 @@ static void ixgbe_clean_xdp_tx_buffer(struct ixgbe_ring *tx_ring,
bool ixgbe_clean_xdp_tx_irq(struct ixgbe_q_vector *q_vector,
struct ixgbe_ring *tx_ring, int napi_budget)
{
+ u16 ntc = tx_ring->next_to_clean, ntu = tx_ring->next_to_use;
unsigned int total_packets = 0, total_bytes = 0;
- u32 i = tx_ring->next_to_clean, xsk_frames = 0;
- unsigned int budget = q_vector->tx.work_limit;
struct xdp_umem *umem = tx_ring->xsk_umem;
union ixgbe_adv_tx_desc *tx_desc;
struct ixgbe_tx_buffer *tx_bi;
- bool xmit_done;
+ u32 xsk_frames = 0;
- tx_bi = &tx_ring->tx_buffer_info[i];
- tx_desc = IXGBE_TX_DESC(tx_ring, i);
- i -= tx_ring->count;
+ tx_bi = &tx_ring->tx_buffer_info[ntc];
+ tx_desc = IXGBE_TX_DESC(tx_ring, ntc);
- do {
+ while (ntc != ntu) {
if (!(tx_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD)))
break;
@@ -661,22 +659,18 @@ bool ixgbe_clean_xdp_tx_irq(struct ixgbe_q_vector *q_vector,
tx_bi++;
tx_desc++;
- i++;
- if (unlikely(!i)) {
- i -= tx_ring->count;
+ ntc++;
+ if (unlikely(ntc == tx_ring->count)) {
+ ntc = 0;
tx_bi = tx_ring->tx_buffer_info;
tx_desc = IXGBE_TX_DESC(tx_ring, 0);
}
/* issue prefetch for next Tx descriptor */
prefetch(tx_desc);
+ }
- /* update budget accounting */
- budget--;
- } while (likely(budget));
-
- i += tx_ring->count;
- tx_ring->next_to_clean = i;
+ tx_ring->next_to_clean = ntc;
u64_stats_update_begin(&tx_ring->syncp);
tx_ring->stats.bytes += total_bytes;
@@ -688,8 +682,7 @@ bool ixgbe_clean_xdp_tx_irq(struct ixgbe_q_vector *q_vector,
if (xsk_frames)
xsk_umem_complete_tx(umem, xsk_frames);
- xmit_done = ixgbe_xmit_zc(tx_ring, q_vector->tx.work_limit);
- return budget > 0 && xmit_done;
+ return ixgbe_xmit_zc(tx_ring, q_vector->tx.work_limit);
}
int ixgbe_xsk_async_xmit(struct net_device *dev, u32 qid)
--
2.21.0
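
The double-clean problem described in the commit message above can be
sketched in a small userspace model. This is not driver code: toy_ring,
clean_by_budget() and clean_until_ntu() are hypothetical names for this
illustration, and the "done" flag merely stands in for the DD status bit,
which (per the commit message) stays set after cleaning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_COUNT 8

struct toy_ring {
	bool done[RING_COUNT];	/* stand-in for the DD status bit */
	uint16_t next_to_clean;
	uint16_t next_to_use;
};

/* Old pattern: stop only on budget or on a descriptor without DD. */
static unsigned int clean_by_budget(struct toy_ring *r, unsigned int budget)
{
	unsigned int cleaned = 0;
	uint16_t i = r->next_to_clean;

	assert(budget > 0);
	do {
		if (!r->done[i])
			break;
		/* the status flag is left set, so stale entries still match */
		cleaned++;
		i = (uint16_t)((i + 1) % RING_COUNT);
	} while (--budget);

	r->next_to_clean = i;
	return cleaned;
}

/* New pattern: never walk past next_to_use. */
static unsigned int clean_until_ntu(struct toy_ring *r)
{
	unsigned int cleaned = 0;
	uint16_t ntc = r->next_to_clean;

	while (ntc != r->next_to_use) {
		if (!r->done[ntc])
			break;
		cleaned++;
		if (++ntc == RING_COUNT)
			ntc = 0;
	}

	r->next_to_clean = ntc;
	return cleaned;
}
```

With every slot still flagged done, next_to_clean = 2, next_to_use = 5 and a
budget of 6, the budget-bounded loop reports 6 completions although only 3
descriptors were in flight; the ntc/ntu-bounded loop reports 3 and stops at
next_to_use.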
* [net 1/2] ixgbe: Prevent u8 wrapping of ITR value to something less than 10us
From: Jeff Kirsher @ 2019-09-11 16:49 UTC
To: davem
Cc: Alexander Duyck, netdev, nhorman, sassmann, stable,
Gregg Leventhal, Andrew Bowers, Jeff Kirsher
In-Reply-To: <20190911164955.10644-1-jeffrey.t.kirsher@intel.com>
From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
There were a couple of cases where the ITR value generated via the adaptive
ITR scheme could exceed 126. This resulted in the value becoming either 0
or something less than 10. Switching back and forth between a value less
than 10 and a value greater than 10 can cause issues, as certain hardware
features such as RSC do not function well when the ITR value has dropped
that low.
CC: stable@vger.kernel.org
Fixes: b4ded8327fea ("ixgbe: Update adaptive ITR algorithm")
Reported-by: Gregg Leventhal <gleventhal@janestreet.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 7882148abb43..77ca9005dc41 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2621,7 +2621,7 @@ static void ixgbe_update_itr(struct ixgbe_q_vector *q_vector,
/* 16K ints/sec to 9.2K ints/sec */
avg_wire_size *= 15;
avg_wire_size += 11452;
- } else if (avg_wire_size <= 1980) {
+ } else if (avg_wire_size < 1968) {
/* 9.2K ints/sec to 8K ints/sec */
avg_wire_size *= 5;
avg_wire_size += 22420;
@@ -2654,6 +2654,8 @@ static void ixgbe_update_itr(struct ixgbe_q_vector *q_vector,
case IXGBE_LINK_SPEED_2_5GB_FULL:
case IXGBE_LINK_SPEED_1GB_FULL:
case IXGBE_LINK_SPEED_10_FULL:
+ if (avg_wire_size > 8064)
+ avg_wire_size = 8064;
itr += DIV_ROUND_UP(avg_wire_size,
IXGBE_ITR_ADAPTIVE_MIN_INC * 64) *
IXGBE_ITR_ADAPTIVE_MIN_INC;
--
2.21.0
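
The effect of the 8064 clamp in the second hunk above can be checked with a
little arithmetic. The sketch below is a userspace approximation, not the
driver code: itr_increment() is a hypothetical helper, and the value 2 for
the minimum increment is an assumption (please check
IXGBE_ITR_ADAPTIVE_MIN_INC in ixgbe.h in your tree). The 126 limit is the
one quoted in the commit message.

```c
#include <assert.h>

/* Assumed value of IXGBE_ITR_ADAPTIVE_MIN_INC; verify against ixgbe.h. */
#define ITR_ADAPTIVE_MIN_INC	2
/* Upper limit mentioned in the commit message. */
#define ITR_ADAPTIVE_MAX_USECS	126

#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/*
 * ITR increment for the branch shown in the second hunk (divisor
 * MIN_INC * 64). With clamp != 0 the average wire size is capped at
 * 8064 first, as the patch does.
 */
static unsigned int itr_increment(unsigned int avg_wire_size, int clamp)
{
	assert(clamp == 0 || clamp == 1);
	if (clamp && avg_wire_size > 8064)
		avg_wire_size = 8064;
	return DIV_ROUND_UP(avg_wire_size, ITR_ADAPTIVE_MIN_INC * 64) *
	       ITR_ADAPTIVE_MIN_INC;
}
```

Under that assumption, 8064 / 128 = 63 and 63 * 2 = 126, so the clamp keeps
the increment at or below the limit, while an unclamped 8065 already rounds
up to 128.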
* [net-next v2 01/13] ixgbe: fix memory leaks
From: Jeff Kirsher @ 2019-09-11 16:50 UTC
To: davem; +Cc: Wenwen Wang, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Wenwen Wang <wenwen@cs.uga.edu>
In ixgbe_configure_clsu32(), 'jump', 'input', and 'mask' are allocated
through kzalloc() respectively in a for loop body. Then,
ixgbe_clsu32_build_input() is invoked to build the input. If this process
fails, next iteration of the for loop will be executed. However, the
allocated 'jump', 'input', and 'mask' are not deallocated on this execution
path, leading to memory leaks.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 99df595abfba..95c0827dfd4c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -9490,6 +9490,10 @@ static int ixgbe_configure_clsu32(struct ixgbe_adapter *adapter,
jump->mat = nexthdr[i].jump;
adapter->jump_tables[link_uhtid] = jump;
break;
+ } else {
+ kfree(mask);
+ kfree(input);
+ kfree(jump);
}
}
return 0;
--
2.21.0
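
The leak pattern fixed above (allocate per iteration, fail, move on to the
next iteration without freeing) can be modelled in userspace as follows.
All names here (toy_alloc, toy_free, toy_configure, toy_entry, live_allocs)
are hypothetical and exist only so the leak becomes countable; the build
step is simulated by letting the first 'failures' iterations fail.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs;	/* toy counter so a leak is observable */

static void *toy_alloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		live_allocs++;
	return p;
}

static void toy_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

struct toy_entry {
	void *jump, *input, *mask;
};

/*
 * The first 'failures' iterations simulate a failing build step, the
 * next one succeeds and hands the buffers to the caller via 'out'.
 * With free_on_failure == 0 the failed iterations leak, as before the
 * patch; with 1 they are released, as after it.
 */
static int toy_configure(struct toy_entry *out, int failures,
			 int free_on_failure)
{
	int i;

	assert(out != NULL);
	for (i = 0; i <= failures; i++) {
		void *jump = toy_alloc(16);
		void *input = toy_alloc(16);
		void *mask = toy_alloc(16);

		if (!jump || !input || !mask) {
			toy_free(mask);
			toy_free(input);
			toy_free(jump);
			return -1;
		}
		if (i == failures) {
			/* build step succeeded: caller now owns the buffers */
			out->jump = jump;
			out->input = input;
			out->mask = mask;
			break;
		} else if (free_on_failure) {
			toy_free(mask);
			toy_free(input);
			toy_free(jump);
		}
	}
	return 0;
}
```

With two failing iterations, the unpatched variant ends with 9 live
allocations (6 leaked plus the 3 handed to the caller), the patched variant
with only the 3 that the caller owns.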
* [net-next v2 08/13] i40e: Fix message for other card without FEC.
From: Jeff Kirsher @ 2019-09-11 16:50 UTC
To: davem
Cc: Czeslaw Zagorski, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Czeslaw Zagorski <czeslawx.zagorski@intel.com>
When the variables "req_fec", "fec", and "an" are empty,
dmesg shows a log line with "Requested FEC: , Negotiated FEC: , Autoneg:".
Add a separate link-up dmesg message for cards without FEC.
Signed-off-by: Czeslaw Zagorski <czeslawx.zagorski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 700f38ec8e91..6031223eafab 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -6594,11 +6594,15 @@ void i40e_print_link_message(struct i40e_vsi *vsi, bool isup)
else
req_fec = "CL74 FC-FEC/BASE-R";
}
+ netdev_info(vsi->netdev,
+ "NIC Link is Up, %sbps Full Duplex, Requested FEC: %s, Negotiated FEC: %s, Autoneg: %s, Flow Control: %s\n",
+ speed, req_fec, fec, an, fc);
+ } else {
+ netdev_info(vsi->netdev,
+ "NIC Link is Up, %sbps Full Duplex, Flow Control: %s\n",
+ speed, fc);
}
- netdev_info(vsi->netdev,
- "NIC Link is Up, %sbps Full Duplex, Requested FEC: %s, Negotiated FEC: %s, Autoneg: %s, Flow Control: %s\n",
- speed, req_fec, fec, an, fc);
}
/**
--
2.21.0
* [net-next v2 07/13] i40e: fix missed "Negotiated" string in i40e_print_link_message()
From: Jeff Kirsher @ 2019-09-11 16:50 UTC
To: davem
Cc: Aleksandr Loktionov, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
The "Negotiated" string in i40e_print_link_message() function was missed.
This string has been added to the dmesg and small refactoring done removing
common substrings and unifying link status message format.
Without this patch it was not clear that FEC is related to negotiated FEC.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 3e2e465f43f9..700f38ec8e91 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -6569,19 +6569,19 @@ void i40e_print_link_message(struct i40e_vsi *vsi, bool isup)
}
if (pf->hw.phy.link_info.link_speed == I40E_LINK_SPEED_25GB) {
- req_fec = ", Requested FEC: None";
- fec = ", FEC: None";
- an = ", Autoneg: False";
+ req_fec = "None";
+ fec = "None";
+ an = "False";
if (pf->hw.phy.link_info.an_info & I40E_AQ_AN_COMPLETED)
- an = ", Autoneg: True";
+ an = "True";
if (pf->hw.phy.link_info.fec_info &
I40E_AQ_CONFIG_FEC_KR_ENA)
- fec = ", FEC: CL74 FC-FEC/BASE-R";
+ fec = "CL74 FC-FEC/BASE-R";
else if (pf->hw.phy.link_info.fec_info &
I40E_AQ_CONFIG_FEC_RS_ENA)
- fec = ", FEC: CL108 RS-FEC";
+ fec = "CL108 RS-FEC";
/* 'CL108 RS-FEC' should be displayed when RS is requested, or
* both RS and FC are requested
@@ -6590,13 +6590,14 @@ void i40e_print_link_message(struct i40e_vsi *vsi, bool isup)
(I40E_AQ_REQUEST_FEC_KR | I40E_AQ_REQUEST_FEC_RS)) {
if (vsi->back->hw.phy.link_info.req_fec_info &
I40E_AQ_REQUEST_FEC_RS)
- req_fec = ", Requested FEC: CL108 RS-FEC";
+ req_fec = "CL108 RS-FEC";
else
- req_fec = ", Requested FEC: CL74 FC-FEC/BASE-R";
+ req_fec = "CL74 FC-FEC/BASE-R";
}
}
- netdev_info(vsi->netdev, "NIC Link is Up, %sbps Full Duplex%s%s%s, Flow Control: %s\n",
+ netdev_info(vsi->netdev,
+ "NIC Link is Up, %sbps Full Duplex, Requested FEC: %s, Negotiated FEC: %s, Autoneg: %s, Flow Control: %s\n",
speed, req_fec, fec, an, fc);
}
--
2.21.0
* [net-next v2 06/13] i40e: mark additional missing bits as reserved
From: Jeff Kirsher @ 2019-09-11 16:50 UTC
To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
Mark bits 0xD through 0xF for the command flags of a cloud filter as
reserved. These bits are not yet defined and are considered as reserved
in the data sheet.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h b/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
index 7ff768761659..530613f31527 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
@@ -1394,6 +1394,9 @@ struct i40e_aqc_cloud_filters_element_data {
#define I40E_AQC_ADD_CLOUD_FILTER_IMAC 0x000A
#define I40E_AQC_ADD_CLOUD_FILTER_OMAC_TEN_ID_IMAC 0x000B
#define I40E_AQC_ADD_CLOUD_FILTER_IIP 0x000C
+/* 0x000D reserved */
+/* 0x000E reserved */
+/* 0x000F reserved */
/* 0x0010 to 0x0017 is for custom filters */
#define I40E_AQC_ADD_CLOUD_FILTER_IP_PORT 0x0010 /* Dest IP + L4 Port */
#define I40E_AQC_ADD_CLOUD_FILTER_MAC_PORT 0x0011 /* Dest MAC + L4 Port */
--
2.21.0
* [net-next v2 13/13] i40e: fix potential RX buffer starvation for AF_XDP
From: Jeff Kirsher @ 2019-09-11 16:50 UTC
To: davem
Cc: Magnus Karlsson, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Magnus Karlsson <magnus.karlsson@intel.com>
When the RX rings are created they are also populated with buffers
so that packets can be received. Usually these are kernel buffers,
but for AF_XDP in zero-copy mode, these are user-space buffers and
in this case the application might not have sent down any buffers
to the driver at this point. And if no buffers are allocated at ring
creation time, no packets can be received and no interrupts will be
generated so the NAPI poll function that allocates buffers to the
rings will never get executed.
To rectify this, we kick the NAPI context of any queue with an
attached AF_XDP zero-copy socket in two places in the code. Once
after an XDP program has loaded and once after the umem is registered.
This takes care of both cases: XDP program gets loaded first then AF_XDP
socket is created, and the reverse, AF_XDP socket is created first,
then XDP program is loaded.
Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_xsk.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 0373bc6c7e61..feb5bd54d840 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -157,6 +157,11 @@ static int i40e_xsk_umem_disable(struct i40e_vsi *vsi, u16 qid)
err = i40e_queue_pair_enable(vsi, qid);
if (err)
return err;
+
+ /* Kick start the NAPI context so that receiving will start */
+ err = i40e_xsk_wakeup(vsi->netdev, qid, XDP_WAKEUP_RX);
+ if (err)
+ return err;
}
return 0;
--
2.21.0
* [net-next v2 12/13] net/ixgbevf: make array api static const, makes object smaller
From: Jeff Kirsher @ 2019-09-11 16:50 UTC
To: davem
Cc: Colin Ian King, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Colin Ian King <colin.king@canonical.com>
Don't populate the array API on the stack but instead make it
static const. Makes the object code smaller by 58 bytes.
Before:
text data bss dec hex filename
82969 9763 256 92988 16b3c ixgbevf/ixgbevf_main.o
After:
text data bss dec hex filename
82815 9859 256 92930 16b02 ixgbevf/ixgbevf_main.o
(gcc version 9.2.1, amd64)
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 75e849a64db7..75e93ce2ed99 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -2261,12 +2261,14 @@ static void ixgbevf_init_last_counter_stats(struct ixgbevf_adapter *adapter)
static void ixgbevf_negotiate_api(struct ixgbevf_adapter *adapter)
{
struct ixgbe_hw *hw = &adapter->hw;
- int api[] = { ixgbe_mbox_api_14,
- ixgbe_mbox_api_13,
- ixgbe_mbox_api_12,
- ixgbe_mbox_api_11,
- ixgbe_mbox_api_10,
- ixgbe_mbox_api_unknown };
+ static const int api[] = {
+ ixgbe_mbox_api_14,
+ ixgbe_mbox_api_13,
+ ixgbe_mbox_api_12,
+ ixgbe_mbox_api_11,
+ ixgbe_mbox_api_10,
+ ixgbe_mbox_api_unknown
+ };
int err, idx = 0;
spin_lock_bh(&adapter->mbx_lock);
--
2.21.0
* [net-next v2 11/13] iavf: fix MAC address setting for VFs when filter is rejected
From: Jeff Kirsher @ 2019-09-11 16:50 UTC
To: davem
Cc: Stefan Assmann, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Stefan Assmann <sassmann@kpanic.de>
Currently iavf unconditionally applies MAC address change requests. This
puts the VF into a state where it is no longer able to pass traffic if
the PF rejects a MAC filter change for the VF.
A typical scenario for a rejected MAC filter is for an untrusted VF to
request to change the MAC address when an administratively set MAC is
present.
To keep iavf working in this scenario the MAC filter handling in iavf
needs to act on the PF reply regarding the MAC filter change. In the
case of an ack the new MAC address gets set, whereas in the case of a
nack the previous MAC address needs to stay in place.
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/iavf/iavf_main.c | 1 -
drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 7 +++++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 07f5541a0f01..8f310e520b06 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -804,7 +804,6 @@ static int iavf_set_mac(struct net_device *netdev, void *p)
if (f) {
ether_addr_copy(hw->mac.addr, addr->sa_data);
- ether_addr_copy(netdev->dev_addr, adapter->hw.mac.addr);
}
return (f == NULL) ? -ENOMEM : 0;
diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
index d49d58a6de80..c46770eba320 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
@@ -1252,6 +1252,8 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
case VIRTCHNL_OP_ADD_ETH_ADDR:
dev_err(&adapter->pdev->dev, "Failed to add MAC filter, error %s\n",
iavf_stat_str(&adapter->hw, v_retval));
+ /* restore administratively set MAC address */
+ ether_addr_copy(adapter->hw.mac.addr, netdev->dev_addr);
break;
case VIRTCHNL_OP_DEL_VLAN:
dev_err(&adapter->pdev->dev, "Failed to delete VLAN filter, error %s\n",
@@ -1319,6 +1321,11 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
}
}
switch (v_opcode) {
+ case VIRTCHNL_OP_ADD_ETH_ADDR: {
+ if (!ether_addr_equal(netdev->dev_addr, adapter->hw.mac.addr))
+ ether_addr_copy(netdev->dev_addr, adapter->hw.mac.addr);
+ }
+ break;
case VIRTCHNL_OP_GET_STATS: {
struct iavf_eth_stats *stats =
(struct iavf_eth_stats *)msg;
--
2.21.0
^ permalink raw reply related
* [net-next v2 10/13] i40e: clear __I40E_VIRTCHNL_OP_PENDING on invalid min Tx rate
From: Jeff Kirsher @ 2019-09-11 16:50 UTC (permalink / raw)
To: davem
Cc: Stefan Assmann, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Stefan Assmann <sassmann@kpanic.de>
In the case of an invalid min Tx rate being requested
i40e_ndo_set_vf_bw() immediately returns -EINVAL instead of releasing
__I40E_VIRTCHNL_OP_PENDING first.
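A minimal userspace sketch of the pattern, assuming a boolean op_pending
field as a stand-in for the __I40E_VIRTCHNL_OP_PENDING bit; struct pf_model
and set_vf_bw_model() are hypothetical names used only for illustration:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PF state; op_pending models the
 * __I40E_VIRTCHNL_OP_PENDING bit.
 */
struct pf_model {
	bool op_pending;
};

static int set_vf_bw_model(struct pf_model *pf, int min_tx_rate)
{
	int ret = 0;

	if (pf->op_pending)
		return -EAGAIN;	/* flag owned by someone else, nothing to release */
	pf->op_pending = true;

	if (min_tx_rate) {
		ret = -EINVAL;
		goto error;	/* not a bare return: the flag must be cleared */
	}

	/* ... the real work would go here ... */

error:
	pf->op_pending = false;
	return ret;
}
```

With a bare return in the min_tx_rate branch, op_pending would stay set and
every later call in this model would fail with -EAGAIN.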
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index f8aa4deceb5e..3d2440838822 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -4263,7 +4263,8 @@ int i40e_ndo_set_vf_bw(struct net_device *netdev, int vf_id, int min_tx_rate,
if (min_tx_rate) {
dev_err(&pf->pdev->dev, "Invalid min tx rate (%d) (greater than 0) specified for VF %d.\n",
min_tx_rate, vf_id);
- return -EINVAL;
+ ret = -EINVAL;
+ goto error;
}
vf = &pf->vf[vf_id];
--
2.21.0
^ permalink raw reply related
* [net-next v2 09/13] i40e: use BIT macro to specify the cloud filter field flags
From: Jeff Kirsher @ 2019-09-11 16:50 UTC (permalink / raw)
To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
The macros used to specify the cloud filter fields are intended to be
individual bits. Declare them using the BIT() macro to make their
intention a little more clear.
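As a quick check that the conversion is value-preserving, here is a small
userspace sketch. The BIT() definition is a local stand-in for the kernel
macro, and the CLOUD_FIELD_* names are shortened, hypothetical copies of the
driver macros; is_single_bit() is a helper invented for this example.

```c
/* Local stand-in for the kernel's BIT() macro (assumption for this sketch). */
#define BIT(nr) (1UL << (nr))

/* Shortened, hypothetical copies of the I40E_CLOUD_FIELD_* flags. */
#define CLOUD_FIELD_OMAC	BIT(0)
#define CLOUD_FIELD_IMAC	BIT(1)
#define CLOUD_FIELD_IVLAN	BIT(2)
#define CLOUD_FIELD_TEN_ID	BIT(3)
#define CLOUD_FIELD_IIP		BIT(4)

/* Compile-time checks: each flag keeps its previous hex value. */
_Static_assert(CLOUD_FIELD_OMAC == 0x01, "OMAC value changed");
_Static_assert(CLOUD_FIELD_IMAC == 0x02, "IMAC value changed");
_Static_assert(CLOUD_FIELD_IVLAN == 0x04, "IVLAN value changed");
_Static_assert(CLOUD_FIELD_TEN_ID == 0x08, "TEN_ID value changed");
_Static_assert(CLOUD_FIELD_IIP == 0x10, "IIP value changed");

/* Returns nonzero if exactly one bit is set in v. */
static int is_single_bit(unsigned long v)
{
	return v != 0 && (v & (v - 1)) == 0;
}
```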
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index f1a1bd324b50..2af9f6308f84 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -243,11 +243,11 @@ struct i40e_fdir_filter {
u32 fd_id;
};
-#define I40E_CLOUD_FIELD_OMAC 0x01
-#define I40E_CLOUD_FIELD_IMAC 0x02
-#define I40E_CLOUD_FIELD_IVLAN 0x04
-#define I40E_CLOUD_FIELD_TEN_ID 0x08
-#define I40E_CLOUD_FIELD_IIP 0x10
+#define I40E_CLOUD_FIELD_OMAC BIT(0)
+#define I40E_CLOUD_FIELD_IMAC BIT(1)
+#define I40E_CLOUD_FIELD_IVLAN BIT(2)
+#define I40E_CLOUD_FIELD_TEN_ID BIT(3)
+#define I40E_CLOUD_FIELD_IIP BIT(4)
#define I40E_CLOUD_FILTER_FLAGS_OMAC I40E_CLOUD_FIELD_OMAC
#define I40E_CLOUD_FILTER_FLAGS_IMAC I40E_CLOUD_FIELD_IMAC
--
2.21.0
^ permalink raw reply related
* [net-next v2 03/13] ixgbe: use skb_get_queue_mapping in tx path
From: Jeff Kirsher @ 2019-09-11 16:50 UTC (permalink / raw)
To: davem; +Cc: Tonghao Zhang, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Use the common api, and don't access queue_mapping directly.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 95c0827dfd4c..dc034f4e8cf6 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -8748,7 +8748,7 @@ static netdev_tx_t __ixgbe_xmit_frame(struct sk_buff *skb,
if (skb_put_padto(skb, 17))
return NETDEV_TX_OK;
- tx_ring = ring ? ring : adapter->tx_ring[skb->queue_mapping];
+ tx_ring = ring ? ring : adapter->tx_ring[skb_get_queue_mapping(skb)];
if (unlikely(test_bit(__IXGBE_TX_DISABLED, &tx_ring->state)))
return NETDEV_TX_BUSY;
--
2.21.0
^ permalink raw reply related
* [net-next v2 05/13] i40e: remove I40E_AQC_ADD_CLOUD_FILTER_OIP
From: Jeff Kirsher @ 2019-09-11 16:50 UTC (permalink / raw)
To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
The bit 0x0001 used in the cloud filters adminq command is reserved, and
is not actually a valid type.
The Linux driver has never used this type, and it's not clear if any
driver ever has.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h b/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
index 21cccec328e3..7ff768761659 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
@@ -1382,7 +1382,7 @@ struct i40e_aqc_cloud_filters_element_data {
#define I40E_AQC_ADD_CLOUD_FILTER_MASK (0x3F << \
I40E_AQC_ADD_CLOUD_FILTER_SHIFT)
/* 0x0000 reserved */
-#define I40E_AQC_ADD_CLOUD_FILTER_OIP 0x0001
+/* 0x0001 reserved */
/* 0x0002 reserved */
#define I40E_AQC_ADD_CLOUD_FILTER_IMAC_IVLAN 0x0003
#define I40E_AQC_ADD_CLOUD_FILTER_IMAC_IVLAN_TEN_ID 0x0004
--
2.21.0
^ permalink raw reply related
* [net-next v2 04/13] i40e: use ktime_get_real_ts64 instead of ktime_to_timespec64
From: Jeff Kirsher @ 2019-09-11 16:50 UTC (permalink / raw)
To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
Remove a call to ktime_to_timespec64 by calling ktime_get_real_ts64
directly.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_ptp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ptp.c b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
index 11394a52e21c..9bf1ad4319f5 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ptp.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
@@ -725,7 +725,7 @@ static long i40e_ptp_create_clock(struct i40e_pf *pf)
pf->tstamp_config.tx_type = HWTSTAMP_TX_OFF;
/* Set the previous "reset" time to the current Kernel clock time */
- pf->ptp_prev_hw_time = ktime_to_timespec64(ktime_get_real());
+ ktime_get_real_ts64(&pf->ptp_prev_hw_time);
pf->ptp_reset_start = ktime_get();
return 0;
--
2.21.0
^ permalink raw reply related
* [net-next v2 02/13] i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask
From: Jeff Kirsher @ 2019-09-11 16:50 UTC (permalink / raw)
To: davem
Cc: Stefan Assmann, netdev, nhorman, sassmann, stable, Andrew Bowers,
Jeff Kirsher
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Stefan Assmann <sassmann@kpanic.de>
While testing VF spawn/destroy the following panic occurred.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000029
[...]
Workqueue: i40e i40e_service_task [i40e]
RIP: 0010:i40e_sync_vsi_filters+0x6fd/0xc60 [i40e]
[...]
Call Trace:
? __switch_to_asm+0x35/0x70
? __switch_to_asm+0x41/0x70
? __switch_to_asm+0x35/0x70
? _cond_resched+0x15/0x30
i40e_sync_filters_subtask+0x56/0x70 [i40e]
i40e_service_task+0x382/0x11b0 [i40e]
? __switch_to_asm+0x41/0x70
? __switch_to_asm+0x41/0x70
process_one_work+0x1a7/0x3b0
worker_thread+0x30/0x390
? create_worker+0x1a0/0x1a0
kthread+0x112/0x130
? kthread_bind+0x30/0x30
ret_from_fork+0x35/0x40
Investigation revealed a race where pf->vf[vsi->vf_id].trusted may get
accessed by the watchdog via i40e_sync_filters_subtask() although
i40e_free_vfs() already free'd pf->vf.
To avoid this the call to i40e_sync_vsi_filters() in
i40e_sync_filters_subtask() needs to be guarded by __I40E_VF_DISABLE,
which is also used by i40e_free_vfs().
Note: put the __I40E_VF_DISABLE check after the
__I40E_MACVLAN_SYNC_PENDING check as the latter is more likely to
trigger.
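The guard can be modelled in userspace with C11 atomics. This is a sketch
under assumptions, not the driver code: SYNC_PENDING and VF_DISABLE stand in
for the two state bits, and test_and_set(), test_and_clear() and
sync_filters_model() are hypothetical helpers mirroring the kernel bitops.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { SYNC_PENDING = 1u << 0, VF_DISABLE = 1u << 1 };

/* Userspace analogue of test_and_set_bit(): set the bit, return old value. */
static bool test_and_set(atomic_uint *state, unsigned int bit)
{
	return atomic_fetch_or(state, bit) & bit;
}

/* Userspace analogue of test_and_clear_bit(). */
static bool test_and_clear(atomic_uint *state, unsigned int bit)
{
	return atomic_fetch_and(state, ~bit) & bit;
}

/* Returns true if the sync ran, false if it was skipped or deferred. */
static bool sync_filters_model(atomic_uint *state)
{
	if (!test_and_clear(state, SYNC_PENDING))
		return false;
	if (test_and_set(state, VF_DISABLE)) {
		/* VFs are being torn down: re-arm the request and retry later. */
		atomic_fetch_or(state, SYNC_PENDING);
		return false;
	}
	/* ... safe to touch per-VF data here ... */
	atomic_fetch_and(state, ~(unsigned int)VF_DISABLE);
	return true;
}
```

Re-arming SYNC_PENDING on the deferred path means the request is not lost;
it is simply picked up on a later pass once VF_DISABLE has been released.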
CC: stable@vger.kernel.org
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index e9f2f276bf27..3e2e465f43f9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -2592,6 +2592,10 @@ static void i40e_sync_filters_subtask(struct i40e_pf *pf)
return;
if (!test_and_clear_bit(__I40E_MACVLAN_SYNC_PENDING, pf->state))
return;
+ if (test_and_set_bit(__I40E_VF_DISABLE, pf->state)) {
+ set_bit(__I40E_MACVLAN_SYNC_PENDING, pf->state);
+ return;
+ }
for (v = 0; v < pf->num_alloc_vsi; v++) {
if (pf->vsi[v] &&
@@ -2606,6 +2610,7 @@ static void i40e_sync_filters_subtask(struct i40e_pf *pf)
}
}
}
+ clear_bit(__I40E_VF_DISABLE, pf->state);
}
/**
--
2.21.0
^ permalink raw reply related
* [net-next v2 00/13][pull request] Intel Wired LAN Driver Updates 2019-09-11
From: Jeff Kirsher @ 2019-09-11 16:50 UTC (permalink / raw)
To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann
This series contains updates to i40e, ixgbe/vf and iavf.
Wenwen Wang fixes a potential memory leak where 3 allocated variables
are not properly cleaned up on failure for ixgbe.
Stefan Assmann fixes a potential kernel panic found when repeatedly
spawning and destroying VFs in i40e when a NULL pointer is dereferenced
due to a race condition. He also fixes up the i40e driver to clear the
__I40E_VIRTCHNL_OP_PENDING bit before returning after an invalid
minimum transmit rate is requested, and updates the iavf driver to only
apply the MAC address change when the PF ACKs the requested change.
Tonghao Zhang updates ixgbe to use the skb_get_queue_mapping() API call
instead of the driver accessing the queue mapping directly.
Jake updates i40e to use ktime_get_real_ts64() instead of
ktime_to_timespec64(). He removes the define for bit 0x0001 for cloud
filters, since it is a reserved bit and not a valid type, and adds
code comments to clearly state which bits are reserved and should not be
used or defined for the cloud filter adminq command. He also makes it
clear that the macros used to specify the cloud filter fields are
individual bits by declaring them with the BIT() macro.
Aleksandr fixes up the print_link_message() to include the "negotiated"
FEC status for i40e.
Czeslaw also adds additional log message for devices without FEC in the
print_link_message() for i40e.
Colin Ian King reduces the object code size by making the array API
static constant.
Magnus fixes a potential receive buffer starvation issue for AF_XDP by
kicking the NAPI context of any queue with an attached AF_XDP zero-copy
socket.
v2: Removed patch 11 from the original series (Alex Duyck's ITR fix),
so that it can be sent to the net tree.
The following are changes since commit c1609946b8b6485e1d405663004867ea9e92178a:
Merge branch 'qed-Fix-series'
and are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 40GbE
Aleksandr Loktionov (1):
i40e: fix missed "Negotiated" string in i40e_print_link_message()
Colin Ian King (1):
net/ixgbevf: make array api static const, makes object smaller
Czeslaw Zagorski (1):
i40e: Fix message for other card without FEC.
Jacob Keller (4):
i40e: use ktime_get_real_ts64 instead of ktime_to_timespec64
i40e: remove I40E_AQC_ADD_CLOUD_FILTER_OIP
i40e: mark additional missing bits as reserved
i40e: use BIT macro to specify the cloud filter field flags
Magnus Karlsson (1):
i40e: fix potential RX buffer starvation for AF_XDP
Stefan Assmann (3):
i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask
i40e: clear __I40E_VIRTCHNL_OP_PENDING on invalid min Tx rate
iavf: fix MAC address setting for VFs when filter is rejected
Tonghao Zhang (1):
ixgbe: use skb_get_queue_mapping in tx path
Wenwen Wang (1):
ixgbe: fix memory leaks
drivers/net/ethernet/intel/i40e/i40e.h | 10 +++----
.../net/ethernet/intel/i40e/i40e_adminq_cmd.h | 5 +++-
drivers/net/ethernet/intel/i40e/i40e_main.c | 30 ++++++++++++-------
drivers/net/ethernet/intel/i40e/i40e_ptp.c | 2 +-
.../ethernet/intel/i40e/i40e_virtchnl_pf.c | 3 +-
drivers/net/ethernet/intel/i40e/i40e_xsk.c | 5 ++++
drivers/net/ethernet/intel/iavf/iavf_main.c | 1 -
.../net/ethernet/intel/iavf/iavf_virtchnl.c | 7 +++++
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 +++-
.../net/ethernet/intel/ixgbevf/ixgbevf_main.c | 14 +++++----
10 files changed, 57 insertions(+), 26 deletions(-)
--
2.21.0
^ permalink raw reply
* Re: [PATCH 04/11] net: phylink: switch to using fwnode_gpiod_get_index()
From: Andy Shevchenko @ 2019-09-11 16:52 UTC (permalink / raw)
To: Russell King - ARM Linux admin
Cc: Dmitry Torokhov, Linus Walleij, Mika Westerberg, linux-kernel,
linux-gpio, Andrew Lunn, David S. Miller, Florian Fainelli,
Heiner Kallweit, netdev
In-Reply-To: <20190911101016.GW13294@shell.armlinux.org.uk>
On Wed, Sep 11, 2019 at 11:10:16AM +0100, Russell King - ARM Linux admin wrote:
> On Wed, Sep 11, 2019 at 02:55:11AM -0700, Dmitry Torokhov wrote:
> > On Wed, Sep 11, 2019 at 10:49:29AM +0100, Russell King - ARM Linux admin wrote:
> > > On Wed, Sep 11, 2019 at 12:46:19PM +0300, Andy Shevchenko wrote:
> > > > On Wed, Sep 11, 2019 at 10:39:14AM +0100, Russell King - ARM Linux admin wrote:
> > > > > On Wed, Sep 11, 2019 at 12:25:14PM +0300, Andy Shevchenko wrote:
> > > > > > On Wed, Sep 11, 2019 at 12:52:08AM -0700, Dmitry Torokhov wrote:
> > > > > > > Instead of fwnode_get_named_gpiod() that I plan to hide away, let's use
> > > > > > > the new fwnode_gpiod_get_index() that mimics gpiod_get_index(), but
> > > > > > > works with arbitrary firmware node.
> > > > > >
> > > > > > I'm wondering if it's possible to step forward and replace
> > > > > > fwnode_get_gpiod_index by gpiod_get() / gpiod_get_index() here and
> > > > > > in other cases in this series.
> > > > >
> > > > > No, those require a struct device, but we have none. There are network
> > > > > drivers where there is a struct device for the network complex, but only
> > > > > DT nodes for the individual network interfaces. So no, gpiod_* really
> > > > > doesn't work.
> > > >
> > > > In the following patch the node is derived from struct device. So, I believe
> > > > some cases can be handled differently.
> Referring back to my comment, notice that I said we have none for the
> phylink case, so it's not possible there.
>
> I'm not sure why Andy replied the way he did, unless he mis-read my
> comment.
It is the first patch that makes the change. My reply was mostly addressed to
Dmitry, and your comment clarifies the case with this patch, thanks!
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply
* Re: VRF Issue Since kernel 5
From: David Ahern @ 2019-09-11 16:53 UTC (permalink / raw)
To: Alexis Bauvin, Gowen; +Cc: netdev@vger.kernel.org
In-Reply-To: <9E920DE7-9CC9-493C-A1D2-957FE1AED897@online.net>
On 9/9/19 10:28 AM, Alexis Bauvin wrote:
> Also, your `unreachable default metric 4278198272` route looks odd to me.
>
New recommendation from FRR group. See
https://www.kernel.org/doc/Documentation/networking/vrf.txt and search
for 4278198272
^ permalink raw reply
* Re: VRF Issue Since kernel 5
From: David Ahern @ 2019-09-11 17:02 UTC (permalink / raw)
To: Gowen, netdev@vger.kernel.org
In-Reply-To: <CWLP265MB1554308A1373D9ECE68CB854FDB70@CWLP265MB1554.GBRP265.PROD.OUTLOOK.COM>
At LPC this week and just now getting a chance to process the data you sent.
On 9/9/19 8:46 AM, Gowen wrote:
> the production traffic is all in the 10.0.0.0/8 network (eth1 global VRF) except for a few subnets (DNS) which are routed out eth0 (mgmt-vrf)
>
>
> Admin@NETM06:~$ ip route show
> default via 10.24.12.1 dev eth0
> 10.0.0.0/8 via 10.24.12.1 dev eth1
> 10.24.12.0/24 dev eth1 proto kernel scope link src 10.24.12.9
> 10.24.65.0/24 via 10.24.12.1 dev eth0
> 10.25.65.0/24 via 10.24.12.1 dev eth0
> 10.26.0.0/21 via 10.24.12.1 dev eth0
> 10.26.64.0/21 via 10.24.12.1 dev eth0
Interesting route table. This is the default VRF, but you have routes leaking
through eth0, which is in mgmt-vrf.
>
>
> Admin@NETM06:~$ ip route show vrf mgmt-vrf
> default via 10.24.12.1 dev eth0
> unreachable default metric 4278198272
> 10.24.12.0/24 dev eth0 proto kernel scope link src 10.24.12.10
> 10.24.65.0/24 via 10.24.12.1 dev eth0
> 10.25.65.0/24 via 10.24.12.1 dev eth0
> 10.26.0.0/21 via 10.24.12.1 dev eth0
> 10.26.64.0/21 via 10.24.12.1 dev eth0
The DNS servers are 10.24.65.203 or 10.24.64.203 which you want to go
out mgmt-vrf. correct?
10.24.65.203 should hit the route "10.24.65.0/24 via 10.24.12.1 dev
eth0" for both default VRF and mgmt-vrf.
10.24.64.203 will NOT hit a route leak entry, so it will traverse the VRF
associated with the context of the command (mgmt-vrf or default). Is
that intentional? (verify with: `ip ro get 10.24.64.203 fibmatch` and
`ip ro get 10.24.64.203 vrf mgmt-vrf fibmatch`)
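To illustrate the point, here is a rough userspace sketch of longest-prefix
matching over the default-VRF table quoted above. It is a simplified model
only: it ignores metrics, FIB rules and everything else the kernel does, and
struct route, IP4() and lookup_dev() are hypothetical names for this example.

```c
#include <stddef.h>
#include <stdint.h>

struct route {
	uint32_t prefix;	/* network address, host byte order */
	int len;		/* prefix length, 0..32 */
	const char *dev;	/* egress device */
};

#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Default-VRF table, copied from the `ip route show` output above. */
static const struct route default_vrf[] = {
	{ IP4(0, 0, 0, 0),      0, "eth0" },
	{ IP4(10, 0, 0, 0),     8, "eth1" },
	{ IP4(10, 24, 12, 0),  24, "eth1" },
	{ IP4(10, 24, 65, 0),  24, "eth0" },
	{ IP4(10, 25, 65, 0),  24, "eth0" },
	{ IP4(10, 26, 0, 0),   21, "eth0" },
	{ IP4(10, 26, 64, 0),  21, "eth0" },
};

/* Return the device of the most specific matching route, or NULL. */
static const char *lookup_dev(const struct route *tbl, size_t n, uint32_t dst)
{
	const struct route *best = NULL;

	for (size_t i = 0; i < n; i++) {
		/* A /0 route matches everything; avoid shifting by 32. */
		uint32_t mask = tbl[i].len ? ~(uint32_t)0 << (32 - tbl[i].len) : 0;

		if ((dst & mask) != tbl[i].prefix)
			continue;
		if (!best || tbl[i].len > best->len)
			best = &tbl[i];
	}
	return best ? best->dev : NULL;
}
```

In this model 10.24.65.203 matches the leaked 10.24.65.0/24 entry via eth0,
while 10.24.64.203 only matches 10.0.0.0/8 and so leaves via eth1.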
>
>
>
> The strange activity occurs when I enter the command “sudo apt update” as I can resolve the DNS request (10.24.65.203 or 10.24.64.203, verified with tcpdump) out eth0 but for the actual update traffic there is no activity:
>
>
> sudo tcpdump -i eth0 '(host 10.24.65.203 or host 10.25.65.203) and port 53' -n
> <OUTPUT OMITTED FOR BREVITY>
> 10:06:05.268735 IP 10.24.12.10.39963 > 10.24.65.203.53: 48798+ [1au] A? security.ubuntu.com. (48)
> <OUTPUT OMITTED FOR BREVITY>
> 10:06:05.284403 IP 10.24.65.203.53 > 10.24.12.10.39963: 48798 13/0/1 A 91.189.91.23, A 91.189.88.24, A 91.189.91.26, A 91.189.88.162, A 91.189.88.149, A 91.189.91.24, A 91.189.88.173, A 91.189.88.177, A 91.189.88.31, A 91.189.91.14, A 91.189.88.176, A 91.189.88.175, A 91.189.88.174 (256)
>
>
>
> You can see that the update traffic is returned but is not accepted by the stack and a RST is sent
>
>
> Admin@NETM06:~$ sudo tcpdump -i eth0 '(not host 168.63.129.16 and port 80)' -n
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
> 10:17:12.690658 IP 10.24.12.10.40216 > 91.189.88.175.80: Flags [S], seq 2279624826, win 64240, options [mss 1460,sackOK,TS val 2029365856 ecr 0,nop,wscale 7], length 0
> 10:17:12.691929 IP 10.24.12.10.52362 > 91.189.95.83.80: Flags [S], seq 1465797256, win 64240, options [mss 1460,sackOK,TS val 3833463674 ecr 0,nop,wscale 7], length 0
> 10:17:12.696270 IP 91.189.88.175.80 > 10.24.12.10.40216: Flags [S.], seq 968450722, ack 2279624827, win 28960, options [mss 1418,sackOK,TS val 81957103 ecr 2029365856,nop,wscale 7], length 0
> 10:17:12.696301 IP 10.24.12.10.40216 > 91.189.88.175.80: Flags [R], seq 2279624827, win 0, length 0
> 10:17:12.697884 IP 91.189.95.83.80 > 10.24.12.10.52362: Flags [S.], seq 4148330738, ack 1465797257, win 28960, options [mss 1418,sackOK,TS val 2257624414 ecr 3833463674,nop,wscale 8], length 0
> 10:17:12.697909 IP 10.24.12.10.52362 > 91.189.95.83.80: Flags [R], seq 1465797257, win 0, length 0
>
>
>
>
> I can emulate the DNS lookup using netcat in the vrf:
>
>
> sudo ip vrf exec mgmt-vrf nc -u 10.24.65.203 53
>
`ip vrf exec mgmt-vrf <COMMAND>` means that every IPv4 and IPv6 socket
opened by <COMMAND> is automatically bound to mgmt-vrf which causes
route lookups to hit the mgmt-vrf table.
Just running <COMMAND> (without binding to any vrf) means no socket is
bound to anything unless the command does a bind. In that case the
routing lookups determine which egress device is used.
When the response comes back, if the ingress interface is in a VRF then the
socket lookup wants to match on a device.
Now, a later response shows this for DNS lookups:
isc-worker0000 20261 [000] 2215.013849: fib:fib_table_lookup: table
10 oif 0 iif 0 proto 0 0.0.0.0/0 -> 127.0.0.1/0 tos 0 scope 0 flags 0
==> dev eth0 gw 10.24.12.1 src 10.24.12.10 err 0
isc-worker0000 20261 [000] 2215.013915: fib:fib_table_lookup: table
10 oif 4 iif 1 proto 17 0.0.0.0/52138 -> 127.0.0.53/53 tos 0 scope 0
flags 4 ==> dev eth0 gw 10.24.12.1 src 10.24.12.10 err 0
isc-worker0000 20261 [000] 2220.014006: fib:fib_table_lookup: table
10 oif 4 iif 1 proto 17 0.0.0.0/52138 -> 127.0.0.53/53 tos 0 scope 0
flags 4 ==> dev eth0 gw 10.24.12.1 src 10.24.12.10 err 0
which suggests your process is passing off the DNS lookup to a local
process (isc-worker) and it hits the default route for mgmt-vrf when it
is trying to connect to a localhost address.
For mgmt-vrf I suggest always adding 127.0.0.1/8 to the mgmt vrf device
(and ::1/128 for IPv6 starting with 5.x kernels - I forget the exact
kernel version).
That might solve your problem; it might not.
(BTW: Cumulus uses fib rules for DNS servers to force DNS packets out
the mgmt-vrf interface.)
^ permalink raw reply
* Re: [PATCH 00/11] Add support for software nodes to gpiolib
From: Andy Shevchenko @ 2019-09-11 17:13 UTC (permalink / raw)
To: Dmitry Torokhov
Cc: Linus Walleij, Mika Westerberg, linux-kernel, linux-gpio,
Andrew Lunn, Andrzej Hajda, Bartosz Golaszewski, Daniel Vetter,
David Airlie, David S. Miller, Florian Fainelli, Heiner Kallweit,
Jernej Skrabec, Jonas Karlman, Laurent Pinchart, Neil Armstrong,
Russell King, dri-devel, linux-acpi, netdev
In-Reply-To: <20190911075215.78047-1-dmitry.torokhov@gmail.com>
On Wed, Sep 11, 2019 at 12:52:04AM -0700, Dmitry Torokhov wrote:
> This series attempts to add support for software nodes to gpiolib, using
> software node references that were introduced recently. This allows us
> to convert more drivers to the generic device properties and drop
> support for custom platform data:
>
> static const struct software_node gpio_bank_b_node = {
> 	.name = "B",
> };
>
> static const struct property_entry simone_key_enter_props[] = {
> 	PROPERTY_ENTRY_U32("linux,code", KEY_ENTER),
> 	PROPERTY_ENTRY_STRING("label", "enter"),
> 	PROPERTY_ENTRY_REF("gpios", &gpio_bank_b_node, 123, GPIO_ACTIVE_LOW),
> 	{ }
> };
>
> If we agree in principle, I would like to have the very first 3 patches
> in an immutable branch off maybe -rc8 so that it can be pulled into
> individual subsystems so that patches switching various drivers to
> fwnode_gpiod_get_index() could be applied.
FWIW,
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
for patches 1-8 after addressing minor issues.
I'll review the rest later on.
>
> Thanks,
> Dmitry
>
> Dmitry Torokhov (11):
> gpiolib: of: add a fallback for wlf,reset GPIO name
> gpiolib: introduce devm_fwnode_gpiod_get_index()
> gpiolib: introduce fwnode_gpiod_get_index()
> net: phylink: switch to using fwnode_gpiod_get_index()
> net: mdio: switch to using fwnode_gpiod_get_index()
> drm/bridge: ti-tfp410: switch to using fwnode_gpiod_get_index()
> gpliolib: make fwnode_get_named_gpiod() static
> gpiolib: of: tease apart of_find_gpio()
> gpiolib: of: tease apart acpi_find_gpio()
> gpiolib: consolidate fwnode GPIO lookups
> gpiolib: add support for software nodes
>
> drivers/gpio/Makefile | 1 +
> drivers/gpio/gpiolib-acpi.c | 153 ++++++++++++++----------
> drivers/gpio/gpiolib-acpi.h | 21 ++--
> drivers/gpio/gpiolib-devres.c | 33 ++----
> drivers/gpio/gpiolib-of.c | 159 ++++++++++++++-----------
> drivers/gpio/gpiolib-of.h | 26 ++--
> drivers/gpio/gpiolib-swnode.c | 92 +++++++++++++++
> drivers/gpio/gpiolib-swnode.h | 13 ++
> drivers/gpio/gpiolib.c | 184 ++++++++++++++++-------------
> drivers/gpu/drm/bridge/ti-tfp410.c | 4 +-
> drivers/net/phy/mdio_bus.c | 4 +-
> drivers/net/phy/phylink.c | 4 +-
> include/linux/gpio/consumer.h | 53 ++++++---
> 13 files changed, 471 insertions(+), 276 deletions(-)
> create mode 100644 drivers/gpio/gpiolib-swnode.c
> create mode 100644 drivers/gpio/gpiolib-swnode.h
>
> --
> 2.23.0.162.g0b9fbb3734-goog
>
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply
* [PATCH bpf-next 1/3] i40e: fix xdp handle calculations
From: Ciara Loftus @ 2019-09-11 17:24 UTC (permalink / raw)
To: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon
Cc: bruce.richardson, bpf, intel-wired-lan, kevin.laatz, Ciara Loftus
Commit 4c5d9a7fa149 ("i40e: fix xdp handle calculations") reintroduced
the addition of the umem headroom to the xdp handle in the i40e_zca_free,
i40e_alloc_buffer_slow_zc and i40e_alloc_buffer_zc functions. However,
the headroom is already added to the handle in the function i40e_run_xdp_zc.
This commit removes the latter addition and fixes the case where the
headroom is non-zero.
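The double counting can be shown with a small arithmetic sketch. It assumes,
as the text above states, that the handle already carries the headroom when
it reaches the run function, and it models the offset adjustment as plain
addition. adjust_offset(), handle_before() and handle_after() are
hypothetical helpers, not driver or xsk functions.

```c
#include <stdint.h>

/* Simplified model: adjusting a handle is plain addition (assumption). */
static uint64_t adjust_offset(uint64_t handle, uint64_t offset)
{
	return handle + offset;
}

/* Before the fix: the headroom is folded into the offset a second time. */
static uint64_t handle_before(uint64_t handle, uint64_t headroom,
			      uint64_t data_off)
{
	uint64_t offset = headroom;

	offset += data_off;
	return adjust_offset(handle, offset);
}

/* After the fix: only the data offset is applied. */
static uint64_t handle_after(uint64_t handle, uint64_t data_off)
{
	return adjust_offset(handle, data_off);
}
```

With a zero headroom both variants agree, which is consistent with the
problem only showing up when the headroom is non-zero.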
Fixes: 4c5d9a7fa149 ("i40e: fix xdp handle calculations")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 0373bc6c7e61..5f285ba1f1f9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -192,7 +192,7 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
{
struct xdp_umem *umem = rx_ring->xsk_umem;
int err, result = I40E_XDP_PASS;
- u64 offset = umem->headroom;
+ u64 offset;
struct i40e_ring *xdp_ring;
struct bpf_prog *xdp_prog;
u32 act;
@@ -203,7 +203,7 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
*/
xdp_prog = READ_ONCE(rx_ring->xdp_prog);
act = bpf_prog_run_xdp(xdp_prog, xdp);
- offset += xdp->data - xdp->data_hard_start;
+ offset = xdp->data - xdp->data_hard_start;
xdp->handle = xsk_umem_adjust_offset(umem, xdp->handle, offset);
--
2.17.1
^ permalink raw reply related
* [PATCH bpf-next 2/3] ixgbe: fix xdp handle calculations
From: Ciara Loftus @ 2019-09-11 17:24 UTC (permalink / raw)
To: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon
Cc: bruce.richardson, bpf, intel-wired-lan, kevin.laatz, Ciara Loftus
In-Reply-To: <20190911172435.21042-1-ciara.loftus@intel.com>
Commit 7cbbf9f1fa23 ("ixgbe: fix xdp handle calculations") reintroduced
the addition of the umem headroom to the xdp handle in the ixgbe_zca_free,
ixgbe_alloc_buffer_slow_zc and ixgbe_alloc_buffer_zc functions. However,
the headroom is already added to the handle in the function
ixgbe_run_xdp_zc. This commit removes the latter addition and fixes the
case where the headroom is non-zero.
Fixes: 7cbbf9f1fa23 ("ixgbe: fix xdp handle calculations")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index ad802a8909e0..5ed8b5a257cf 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -145,7 +145,7 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter,
{
struct xdp_umem *umem = rx_ring->xsk_umem;
int err, result = IXGBE_XDP_PASS;
- u64 offset = umem->headroom;
+ u64 offset;
struct bpf_prog *xdp_prog;
struct xdp_frame *xdpf;
u32 act;
@@ -153,7 +153,7 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter,
rcu_read_lock();
xdp_prog = READ_ONCE(rx_ring->xdp_prog);
act = bpf_prog_run_xdp(xdp_prog, xdp);
- offset += xdp->data - xdp->data_hard_start;
+ offset = xdp->data - xdp->data_hard_start;
xdp->handle = xsk_umem_adjust_offset(umem, xdp->handle, offset);
--
2.17.1
^ permalink raw reply related