Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] net: usb: pegasus: fix improper read if get_registers() fail
From: David Miller @ 2019-08-01 22:18 UTC (permalink / raw)
  To: kda; +Cc: petkan, netdev
In-Reply-To: <20190730131357.30697-1-dkirjanov@suse.com>

From: Denis Kirjanov <kda@linux-powerpc.org>
Date: Tue, 30 Jul 2019 15:13:57 +0200

> get_registers() may fail with -ENOMEM and in this
> case we can read a garbage from the status variable tmp.
> 
> Reported-by: syzbot+3499a83b2d062ae409d4@syzkaller.appspotmail.com
> Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>

Applied, thank you.

^ permalink raw reply

* Re: [net-next 1/1] tipc: reduce risk of wakeup queue starvation
From: David Miller @ 2019-08-01 22:19 UTC (permalink / raw)
  To: jon.maloy
  Cc: netdev, gordan.mihaljevic, tung.q.nguyen, hoang.h.le, canh.d.luu,
	ying.xue, tipc-discussion
In-Reply-To: <1564496598-5080-1-git-send-email-jon.maloy@ericsson.com>

From: Jon Maloy <jon.maloy@ericsson.com>
Date: Tue, 30 Jul 2019 16:23:18 +0200

> In commit 365ad353c256 ("tipc: reduce risk of user starvation during
> link congestion") we allowed senders to add exactly one list of extra
> buffers to the link backlog queues during link congestion (aka
> "oversubscription"). However, the criteria for when to stop adding
> wakeup messages to the input queue when the overload abates is
> inaccurate, and may cause starvation problems during very high load.
> 
> Currently, we stop adding wakeup messages after 10 total failed attempts
> where we find that there is no space left in the backlog queue for a
> certain importance level. The counter for this is accumulated across all
> levels, which may lead the algorithm to leave the loop prematurely,
> although there may still be plenty of space available at some levels.
> The result is sometimes that messages near the wakeup queue tail are not
> added to the input queue as they should be.
> 
> We now introduce a more exact algorithm, where we keep adding wakeup
> messages to a level as long as the backlog queue has free slots for
> the corresponding level, and stop at the moment there are no more such
> slots or when there are no more wakeup messages to dequeue.
> 
> Fixes: 365ad35 ("tipc: reduce risk of user starvation during link congestion")
> Reported-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
> Acked-by: Ying Xue <ying.xue@windriver.com>
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>

Applied, thank you.

^ permalink raw reply

* [net-next 05/11] fm10k: reduce the scope of the q_idx local variable
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190801222548.15975-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Reduce the scope of the q_idx local variable in the fm10k_cache_ring_qos
function.

This was detected by cppcheck and resolves the following style warning
produced by that tool:

[fm10k_main.c:2016]: (style) The scope of the variable 'q_idx' can be
reduced.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index 9ffff7886085..9e6bddff7625 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright(c) 2013 - 2018 Intel Corporation. */
+/* Copyright(c) 2013 - 2019 Intel Corporation. */
 
 #include <linux/types.h>
 #include <linux/module.h>
@@ -17,7 +17,7 @@ const char fm10k_driver_version[] = DRV_VERSION;
 char fm10k_driver_name[] = "fm10k";
 static const char fm10k_driver_string[] = DRV_SUMMARY;
 static const char fm10k_copyright[] =
-	"Copyright(c) 2013 - 2018 Intel Corporation.";
+	"Copyright(c) 2013 - 2019 Intel Corporation.";
 
 MODULE_AUTHOR("Intel Corporation, <linux.nics@intel.com>");
 MODULE_DESCRIPTION(DRV_SUMMARY);
@@ -1870,7 +1870,7 @@ static int fm10k_init_msix_capability(struct fm10k_intfc *interface)
 static bool fm10k_cache_ring_qos(struct fm10k_intfc *interface)
 {
 	struct net_device *dev = interface->netdev;
-	int pc, offset, rss_i, i, q_idx;
+	int pc, offset, rss_i, i;
 	u16 pc_stride = interface->ring_feature[RING_F_QOS].mask + 1;
 	u8 num_pcs = netdev_get_num_tc(dev);
 
@@ -1880,7 +1880,8 @@ static bool fm10k_cache_ring_qos(struct fm10k_intfc *interface)
 	rss_i = interface->ring_feature[RING_F_RSS].indices;
 
 	for (pc = 0, offset = 0; pc < num_pcs; pc++, offset += rss_i) {
-		q_idx = pc;
+		int q_idx = pc;
+
 		for (i = 0; i < rss_i; i++) {
 			interface->tx_ring[offset + i]->reg_idx = q_idx;
 			interface->tx_ring[offset + i]->qos_pc = pc;
-- 
2.21.0


^ permalink raw reply related

* [net-next 06/11] fm10k: reduce the scope of the tx_buffer variable
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190801222548.15975-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

The tx_buffer local variable in the function fm10k_clean_tx_ring is not
used except inside a smaller block scope. Reduce the scope to its point
of use.

This was detected by cppcheck and resolves the following style warning
produced by that tool:

[fm10k_netdev.c:179]: (style) The scope of the variable 'tx_buffer' can
be reduced.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index 538a8467f434..c73fb38be678 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -169,7 +169,6 @@ void fm10k_unmap_and_free_tx_resource(struct fm10k_ring *ring,
  **/
 static void fm10k_clean_tx_ring(struct fm10k_ring *tx_ring)
 {
-	struct fm10k_tx_buffer *tx_buffer;
 	unsigned long size;
 	u16 i;
 
@@ -179,7 +178,8 @@ static void fm10k_clean_tx_ring(struct fm10k_ring *tx_ring)
 
 	/* Free all the Tx ring sk_buffs */
 	for (i = 0; i < tx_ring->count; i++) {
-		tx_buffer = &tx_ring->tx_buffer[i];
+		struct fm10k_tx_buffer *tx_buffer = &tx_ring->tx_buffer[i];
+
 		fm10k_unmap_and_free_tx_resource(tx_ring, tx_buffer);
 	}
 
-- 
2.21.0


^ permalink raw reply related

* [net-next 00/11][pull request] 100GbE Intel Wired LAN Driver Updates 2019-08-01
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann

This series for fm10k, by Jake Keller, reduces the scope of local variables
where possible.

The following are changes since commit a8e600e2184f45c40025cbe4d7e8893b69378a9f:
  Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 100GbE

Jacob Keller (11):
  fm10k: reduce scope of the err variable
  fm10k: reduce scope of *p local variable
  fm10k: reduce the scope of qv local variable
  fm10k: reduce the scope of local err variable
  fm10k: reduce the scope of the q_idx local variable
  fm10k: reduce the scope of the tx_buffer variable
  fm10k: reduce the scope of the err variable
  fm10k: reduce the scope of the local i variable
  fm10k: reduce the scope of the local msg variable
  fm10k: reduce the scope of the result local variable
  fm10k: reduce scope of the ring variable

 drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c   | 6 +++---
 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c | 9 ++++-----
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c     | 5 +++--
 drivers/net/ethernet/intel/fm10k/fm10k_main.c    | 9 +++++----
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c     | 6 ++++--
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  | 8 ++++----
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c     | 9 +++++----
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c      | 3 +--
 8 files changed, 29 insertions(+), 26 deletions(-)

-- 
2.21.0


^ permalink raw reply

* [net-next 07/11] fm10k: reduce the scope of the err variable
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190801222548.15975-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Reduce the scope of the local variable err in the fm10k_detach_subtask
function.

This was detected by cppcheck and resolves the following warning
produced by that tool:

[fm10k_pci.c:403]: (style) The scope of the variable 'err' can be reduced.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 7bfc8a5b6f55..9421832c2480 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright(c) 2013 - 2018 Intel Corporation. */
+/* Copyright(c) 2013 - 2019 Intel Corporation. */
 
 #include <linux/module.h>
 #include <linux/interrupt.h>
@@ -344,7 +344,6 @@ static void fm10k_detach_subtask(struct fm10k_intfc *interface)
 	struct net_device *netdev = interface->netdev;
 	u32 __iomem *hw_addr;
 	u32 value;
-	int err;
 
 	/* do nothing if netdev is still present or hw_addr is set */
 	if (netif_device_present(netdev) || interface->hw.hw_addr)
@@ -362,6 +361,8 @@ static void fm10k_detach_subtask(struct fm10k_intfc *interface)
 	hw_addr = READ_ONCE(interface->uc_addr);
 	value = readl(hw_addr);
 	if (~value) {
+		int err;
+
 		/* Make sure the reset was initiated because we detached,
 		 * otherwise we might race with a different reset flow.
 		 */
-- 
2.21.0


^ permalink raw reply related

* [net-next 02/11] fm10k: reduce scope of *p local variable
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190801222548.15975-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Reduce the scope of the char *p local variable to only the block where
it is used.

This was detected by cppcheck and resolves the following style warning
produced by that tool:

[fm10k_ethtool.c:229]: (style) The scope of the variable 'p' can be
reduced.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
index 4895dd83dd08..7b9440c0aee1 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright(c) 2013 - 2018 Intel Corporation. */
+/* Copyright(c) 2013 - 2019 Intel Corporation. */
 
 #include <linux/vmalloc.h>
 
@@ -222,7 +222,6 @@ static void __fm10k_add_ethtool_stats(u64 **data, void *pointer,
 				      const unsigned int size)
 {
 	unsigned int i;
-	char *p;
 
 	if (!pointer) {
 		/* memory is not zero allocated so we have to clear it */
@@ -232,7 +231,7 @@ static void __fm10k_add_ethtool_stats(u64 **data, void *pointer,
 	}
 
 	for (i = 0; i < size; i++) {
-		p = (char *)pointer + stats[i].stat_offset;
+		char *p = (char *)pointer + stats[i].stat_offset;
 
 		switch (stats[i].sizeof_stat) {
 		case sizeof(u64):
-- 
2.21.0


^ permalink raw reply related

* [net-next 01/11] fm10k: reduce scope of the err variable
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190801222548.15975-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Reduce the scope of the err local variable in the fm10k_dcbnl_ieee_setets
function.

This was detected using cppcheck, and resolves the following style
warning:

[fm10k_dcbnl.c:37]: (style) The scope of the variable 'err' can be reduced.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c b/drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c
index 20768ac7f17e..c45315472245 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_dcbnl.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright(c) 2013 - 2018 Intel Corporation. */
+/* Copyright(c) 2013 - 2019 Intel Corporation. */
 
 #include "fm10k.h"
 
@@ -36,7 +36,7 @@ static int fm10k_dcbnl_ieee_getets(struct net_device *dev, struct ieee_ets *ets)
 static int fm10k_dcbnl_ieee_setets(struct net_device *dev, struct ieee_ets *ets)
 {
 	u8 num_tc = 0;
-	int i, err;
+	int i;
 
 	/* verify type and determine num_tcs needed */
 	for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
@@ -57,7 +57,7 @@ static int fm10k_dcbnl_ieee_setets(struct net_device *dev, struct ieee_ets *ets)
 
 	/* update TC hardware mapping if necessary */
 	if (num_tc != netdev_get_num_tc(dev)) {
-		err = fm10k_setup_tc(dev, num_tc);
+		int err = fm10k_setup_tc(dev, num_tc);
 		if (err)
 			return err;
 	}
-- 
2.21.0


^ permalink raw reply related

* [net-next 08/11] fm10k: reduce the scope of the local i variable
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190801222548.15975-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Reduce the scope of the local loop variable in the
fm10k_check_hang_subtask function.

This was detected by cppcheck and resolves the following warning
produced by that tool:

[driver/fm10k_pci.c:852]: (style) The scope of the variable 'i' can be
reduced.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 9421832c2480..9522e9f8f8b8 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -698,8 +698,6 @@ static void fm10k_watchdog_subtask(struct fm10k_intfc *interface)
  */
 static void fm10k_check_hang_subtask(struct fm10k_intfc *interface)
 {
-	int i;
-
 	/* If we're down or resetting, just bail */
 	if (test_bit(__FM10K_DOWN, interface->state) ||
 	    test_bit(__FM10K_RESETTING, interface->state))
@@ -711,6 +709,8 @@ static void fm10k_check_hang_subtask(struct fm10k_intfc *interface)
 	interface->next_tx_hang_check = jiffies + (2 * HZ);
 
 	if (netif_carrier_ok(interface->netdev)) {
+		int i;
+
 		/* Force detection of hung controller */
 		for (i = 0; i < interface->num_tx_queues; i++)
 			set_check_for_tx_hang(interface->tx_ring[i]);
-- 
2.21.0


^ permalink raw reply related

* [net-next 09/11] fm10k: reduce the scope of the local msg variable
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190801222548.15975-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

The msg variable in the fm10k_mbx_validate_msg_size and
fm10k_sm_mbx_transmit functions is only used within the do {} loop
scope. Reduce its scope only to where it is used.

This was detected by cppcheck, and resolves the following warnings
produced by that tool:

[fm10k_mbx.c:299]: (style) The scope of the variable 'msg' can be reduced.
[fm10k_mbx.c:2004]: (style) The scope of the variable 'msg' can be reduced.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_mbx.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
index 21021fe4f1c3..aece335b41f8 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_mbx.c
@@ -297,13 +297,14 @@ static u16 fm10k_mbx_validate_msg_size(struct fm10k_mbx_info *mbx, u16 len)
 {
 	struct fm10k_mbx_fifo *fifo = &mbx->rx;
 	u16 total_len = 0, msg_len;
-	u32 *msg;
 
 	/* length should include previous amounts pushed */
 	len += mbx->pushed;
 
 	/* offset in message is based off of current message size */
 	do {
+		u32 *msg;
+
 		msg = fifo->buffer + fm10k_fifo_tail_offset(fifo, total_len);
 		msg_len = FM10K_TLV_DWORD_LEN(*msg);
 		total_len += msg_len;
@@ -1920,7 +1921,6 @@ static void fm10k_sm_mbx_transmit(struct fm10k_hw *hw,
 	/* reduce length by 1 to convert to a mask */
 	u16 mbmem_len = mbx->mbmem_len - 1;
 	u16 tail_len, len = 0;
-	u32 *msg;
 
 	/* push head behind tail */
 	if (mbx->tail < head)
@@ -1930,6 +1930,8 @@ static void fm10k_sm_mbx_transmit(struct fm10k_hw *hw,
 
 	/* determine msg aligned offset for end of buffer */
 	do {
+		u32 *msg;
+
 		msg = fifo->buffer + fm10k_fifo_head_offset(fifo, len);
 		tail_len = len;
 		len += FM10K_TLV_DWORD_LEN(*msg);
-- 
2.21.0


^ permalink raw reply related

* [net-next 10/11] fm10k: reduce the scope of the result local variable
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190801222548.15975-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Reduce the scope of the result local variable in the
fm10k_iov_msg_lport_state_pf function.

This was detected by cppcheck and resolves the following warning
produced by that tool:

[fm10k_pf.c:1435]: (style) The scope of the variable 'result' can be
reduced.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_pf.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
index cb4d02629b86..e85b2f2eef05 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pf.c
@@ -1352,7 +1352,6 @@ s32 fm10k_iov_msg_lport_state_pf(struct fm10k_hw *hw, u32 **results,
 				 struct fm10k_mbx_info *mbx)
 {
 	struct fm10k_vf_info *vf_info = (struct fm10k_vf_info *)mbx;
-	u32 *result;
 	s32 err = 0;
 	u32 msg[2];
 	u8 mode = 0;
@@ -1362,7 +1361,7 @@ s32 fm10k_iov_msg_lport_state_pf(struct fm10k_hw *hw, u32 **results,
 		return FM10K_ERR_PARAM;
 
 	if (!!results[FM10K_LPORT_STATE_MSG_XCAST_MODE]) {
-		result = results[FM10K_LPORT_STATE_MSG_XCAST_MODE];
+		u32 *result = results[FM10K_LPORT_STATE_MSG_XCAST_MODE];
 
 		/* XCAST mode update requested */
 		err = fm10k_tlv_attr_get_u8(result, &mode);
-- 
2.21.0


^ permalink raw reply related

* [net-next 11/11] fm10k: reduce scope of the ring variable
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190801222548.15975-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Reduce the scope of the ring local variable in the fm10k_assign_l2_accel
function.

This was detected by cppcheck and resolves the following warning
produced by that tool:

[fm10k_netdev.c:1447]: (style) The scope of the variable 'ring' can be
reduced.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
index c73fb38be678..259da075093f 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
@@ -1444,11 +1444,11 @@ static int __fm10k_setup_tc(struct net_device *dev, enum tc_setup_type type,
 static void fm10k_assign_l2_accel(struct fm10k_intfc *interface,
 				  struct fm10k_l2_accel *l2_accel)
 {
-	struct fm10k_ring *ring;
 	int i;
 
 	for (i = 0; i < interface->num_rx_queues; i++) {
-		ring = interface->rx_ring[i];
+		struct fm10k_ring *ring = interface->rx_ring[i];
+
 		rcu_assign_pointer(ring->l2_accel, l2_accel);
 	}
 
-- 
2.21.0


^ permalink raw reply related

* [net-next 03/11] fm10k: reduce the scope of qv local variable
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190801222548.15975-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Reduce the scope of the qv vector pointer local variable in the
fm10k_set_coalesce function.

This was detected by cppcheck and resolves the following style warning
produced by that tool:

[fm10k_ethtool.c:658]: (style) The scope of the variable 'qv' can be
reduced.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
index 7b9440c0aee1..1f7e4a8f4557 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c
@@ -650,7 +650,6 @@ static int fm10k_set_coalesce(struct net_device *dev,
 			      struct ethtool_coalesce *ec)
 {
 	struct fm10k_intfc *interface = netdev_priv(dev);
-	struct fm10k_q_vector *qv;
 	u16 tx_itr, rx_itr;
 	int i;
 
@@ -676,7 +675,8 @@ static int fm10k_set_coalesce(struct net_device *dev,
 
 	/* update q_vectors */
 	for (i = 0; i < interface->num_q_vectors; i++) {
-		qv = interface->q_vector[i];
+		struct fm10k_q_vector *qv = interface->q_vector[i];
+
 		qv->tx.itr = tx_itr;
 		qv->rx.itr = rx_itr;
 	}
-- 
2.21.0


^ permalink raw reply related

* [net-next 04/11] fm10k: reduce the scope of local err variable
From: Jeff Kirsher @ 2019-08-01 22:25 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190801222548.15975-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Reduce the scope of the local err variable in the fm10k_iov_alloc_data
function.

This was detected by cppcheck and resolves the following style warning
produced by that tool:

[fm10k_iov.c:426]: (style) The scope of the variable 'err' can be reduced.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_iov.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
index 8de77155f2e7..afe1fafd2447 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright(c) 2013 - 2018 Intel Corporation. */
+/* Copyright(c) 2013 - 2019 Intel Corporation. */
 
 #include "fm10k.h"
 #include "fm10k_vf.h"
@@ -426,7 +426,7 @@ static s32 fm10k_iov_alloc_data(struct pci_dev *pdev, int num_vfs)
 	struct fm10k_iov_data *iov_data = interface->iov_data;
 	struct fm10k_hw *hw = &interface->hw;
 	size_t size;
-	int i, err;
+	int i;
 
 	/* return error if iov_data is already populated */
 	if (iov_data)
@@ -452,6 +452,7 @@ static s32 fm10k_iov_alloc_data(struct pci_dev *pdev, int num_vfs)
 	/* loop through vf_info structures initializing each entry */
 	for (i = 0; i < num_vfs; i++) {
 		struct fm10k_vf_info *vf_info = &iov_data->vf_info[i];
+		int err;
 
 		/* Record VF VSI value */
 		vf_info->vsi = i + 1;
-- 
2.21.0


^ permalink raw reply related

* linux-next: Fixes tag needs some work in the net-next tree
From: Stephen Rothwell @ 2019-08-01 22:54 UTC (permalink / raw)
  To: David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List, Jon Maloy

[-- Attachment #1: Type: text/plain, Size: 432 bytes --]

Hi all,

In commit

  7c5b42055964 ("tipc: reduce risk of wakeup queue starvation")

Fixes tag

  Fixes: 365ad35 ("tipc: reduce risk of user starvation during link congestion")

has these problem(s):

  - SHA1 should be at least 12 digits long
    Can be fixed by setting core.abbrev to 12 (or more) or (for git v2.11
    or later) just making sure it is not set (or set to "auto").

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* [PATCH net] gve: Fix case where desc_cnt and data_cnt can get out of sync
From: Catherine Sullivan @ 2019-08-01 23:07 UTC (permalink / raw)
  To: netdev; +Cc: Catherine Sullivan, Sagi Shahar

desc_cnt and data_cnt should always be equal. In the case of a dropped
packet desc_cnt was still getting updated (correctly), data_cnt
was not. To eliminate this bug and prevent it from recurring this
patch combines them into one ring level cnt.

Signed-off-by: Catherine Sullivan <csully@google.com>
Reviewed-by: Sagi Shahar <sagis@google.com>
---
 drivers/net/ethernet/google/gve/gve.h         |  8 ++---
 drivers/net/ethernet/google/gve/gve_ethtool.c |  4 +--
 drivers/net/ethernet/google/gve/gve_rx.c      | 34 ++++++++-----------
 3 files changed, 20 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
index 92372dc43be8..ebc37e256922 100644
--- a/drivers/net/ethernet/google/gve/gve.h
+++ b/drivers/net/ethernet/google/gve/gve.h
@@ -31,9 +31,6 @@
 struct gve_rx_desc_queue {
 	struct gve_rx_desc *desc_ring; /* the descriptor ring */
 	dma_addr_t bus; /* the bus for the desc_ring */
-	u32 cnt; /* free-running total number of completed packets */
-	u32 fill_cnt; /* free-running total number of descriptors posted */
-	u32 mask; /* masks the cnt to the size of the ring */
 	u8 seqno; /* the next expected seqno for this desc*/
 };
 
@@ -60,8 +57,6 @@ struct gve_rx_data_queue {
 	dma_addr_t data_bus; /* dma mapping of the slots */
 	struct gve_rx_slot_page_info *page_info; /* page info of the buffers */
 	struct gve_queue_page_list *qpl; /* qpl assigned to this queue */
-	u32 mask; /* masks the cnt to the size of the ring */
-	u32 cnt; /* free-running total number of completed packets */
 };
 
 struct gve_priv;
@@ -73,6 +68,9 @@ struct gve_rx_ring {
 	struct gve_rx_data_queue data;
 	u64 rbytes; /* free-running bytes received */
 	u64 rpackets; /* free-running packets received */
+	u32 cnt; /* free-running total number of completed packets */
+	u32 fill_cnt; /* free-running total number of descs and buffs posted */
+	u32 mask; /* masks the cnt and fill_cnt to the size of the ring */
 	u32 q_num; /* queue index */
 	u32 ntfy_id; /* notification block index */
 	struct gve_queue_resources *q_resources; /* head and tail pointer idx */
diff --git a/drivers/net/ethernet/google/gve/gve_ethtool.c b/drivers/net/ethernet/google/gve/gve_ethtool.c
index 26540b856541..d8fa816f4473 100644
--- a/drivers/net/ethernet/google/gve/gve_ethtool.c
+++ b/drivers/net/ethernet/google/gve/gve_ethtool.c
@@ -138,8 +138,8 @@ gve_get_ethtool_stats(struct net_device *netdev,
 		for (ring = 0; ring < priv->rx_cfg.num_queues; ring++) {
 			struct gve_rx_ring *rx = &priv->rx[ring];
 
-			data[i++] = rx->desc.cnt;
-			data[i++] = rx->desc.fill_cnt;
+			data[i++] = rx->cnt;
+			data[i++] = rx->fill_cnt;
 		}
 	} else {
 		i += priv->rx_cfg.num_queues * NUM_GVE_RX_CNTS;
diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c
index 1914b8350da7..59564ac99d2a 100644
--- a/drivers/net/ethernet/google/gve/gve_rx.c
+++ b/drivers/net/ethernet/google/gve/gve_rx.c
@@ -37,7 +37,7 @@ static void gve_rx_free_ring(struct gve_priv *priv, int idx)
 	rx->data.qpl = NULL;
 	kvfree(rx->data.page_info);
 
-	slots = rx->data.mask + 1;
+	slots = rx->mask + 1;
 	bytes = sizeof(*rx->data.data_ring) * slots;
 	dma_free_coherent(dev, bytes, rx->data.data_ring,
 			  rx->data.data_bus);
@@ -64,7 +64,7 @@ static int gve_prefill_rx_pages(struct gve_rx_ring *rx)
 	/* Allocate one page per Rx queue slot. Each page is split into two
 	 * packet buffers, when possible we "page flip" between the two.
 	 */
-	slots = rx->data.mask + 1;
+	slots = rx->mask + 1;
 
 	rx->data.page_info = kvzalloc(slots *
 				      sizeof(*rx->data.page_info), GFP_KERNEL);
@@ -111,7 +111,7 @@ static int gve_rx_alloc_ring(struct gve_priv *priv, int idx)
 	rx->q_num = idx;
 
 	slots = priv->rx_pages_per_qpl;
-	rx->data.mask = slots - 1;
+	rx->mask = slots - 1;
 
 	/* alloc rx data ring */
 	bytes = sizeof(*rx->data.data_ring) * slots;
@@ -125,7 +125,7 @@ static int gve_rx_alloc_ring(struct gve_priv *priv, int idx)
 		err = -ENOMEM;
 		goto abort_with_slots;
 	}
-	rx->desc.fill_cnt = filled_pages;
+	rx->fill_cnt = filled_pages;
 	/* Ensure data ring slots (packet buffers) are visible. */
 	dma_wmb();
 
@@ -156,8 +156,8 @@ static int gve_rx_alloc_ring(struct gve_priv *priv, int idx)
 		err = -ENOMEM;
 		goto abort_with_q_resources;
 	}
-	rx->desc.mask = slots - 1;
-	rx->desc.cnt = 0;
+	rx->mask = slots - 1;
+	rx->cnt = 0;
 	rx->desc.seqno = 1;
 	gve_rx_add_to_block(priv, idx);
 
@@ -213,7 +213,7 @@ void gve_rx_write_doorbell(struct gve_priv *priv, struct gve_rx_ring *rx)
 {
 	u32 db_idx = be32_to_cpu(rx->q_resources->db_index);
 
-	iowrite32be(rx->desc.fill_cnt, &priv->db_bar2[db_idx]);
+	iowrite32be(rx->fill_cnt, &priv->db_bar2[db_idx]);
 }
 
 static enum pkt_hash_types gve_rss_type(__be16 pkt_flags)
@@ -273,7 +273,7 @@ static void gve_rx_flip_buff(struct gve_rx_slot_page_info *page_info,
 }
 
 static bool gve_rx(struct gve_rx_ring *rx, struct gve_rx_desc *rx_desc,
-		   netdev_features_t feat)
+		   netdev_features_t feat, u32 idx)
 {
 	struct gve_rx_slot_page_info *page_info;
 	struct gve_priv *priv = rx->gve;
@@ -282,14 +282,12 @@ static bool gve_rx(struct gve_rx_ring *rx, struct gve_rx_desc *rx_desc,
 	struct sk_buff *skb;
 	int pagecount;
 	u16 len;
-	u32 idx;
 
 	/* drop this packet */
 	if (unlikely(rx_desc->flags_seq & GVE_RXF_ERR))
 		return true;
 
 	len = be16_to_cpu(rx_desc->len) - GVE_RX_PAD;
-	idx = rx->data.cnt & rx->data.mask;
 	page_info = &rx->data.page_info[idx];
 
 	/* gvnic can only receive into registered segments. If the buffer
@@ -340,8 +338,6 @@ static bool gve_rx(struct gve_rx_ring *rx, struct gve_rx_desc *rx_desc,
 	if (!skb)
 		return true;
 
-	rx->data.cnt++;
-
 	if (likely(feat & NETIF_F_RXCSUM)) {
 		/* NIC passes up the partial sum */
 		if (rx_desc->csum)
@@ -370,7 +366,7 @@ static bool gve_rx_work_pending(struct gve_rx_ring *rx)
 	__be16 flags_seq;
 	u32 next_idx;
 
-	next_idx = rx->desc.cnt & rx->desc.mask;
+	next_idx = rx->cnt & rx->mask;
 	desc = rx->desc.desc_ring + next_idx;
 
 	flags_seq = desc->flags_seq;
@@ -385,8 +381,8 @@ bool gve_clean_rx_done(struct gve_rx_ring *rx, int budget,
 {
 	struct gve_priv *priv = rx->gve;
 	struct gve_rx_desc *desc;
-	u32 cnt = rx->desc.cnt;
-	u32 idx = cnt & rx->desc.mask;
+	u32 cnt = rx->cnt;
+	u32 idx = cnt & rx->mask;
 	u32 work_done = 0;
 	u64 bytes = 0;
 
@@ -401,10 +397,10 @@ bool gve_clean_rx_done(struct gve_rx_ring *rx, int budget,
 			   rx->q_num, GVE_SEQNO(desc->flags_seq),
 			   rx->desc.seqno);
 		bytes += be16_to_cpu(desc->len) - GVE_RX_PAD;
-		if (!gve_rx(rx, desc, feat))
+		if (!gve_rx(rx, desc, feat, idx))
 			gve_schedule_reset(priv);
 		cnt++;
-		idx = cnt & rx->desc.mask;
+		idx = cnt & rx->mask;
 		desc = rx->desc.desc_ring + idx;
 		rx->desc.seqno = gve_next_seqno(rx->desc.seqno);
 		work_done++;
@@ -417,8 +413,8 @@ bool gve_clean_rx_done(struct gve_rx_ring *rx, int budget,
 	rx->rpackets += work_done;
 	rx->rbytes += bytes;
 	u64_stats_update_end(&rx->statss);
-	rx->desc.cnt = cnt;
-	rx->desc.fill_cnt += work_done;
+	rx->cnt = cnt;
+	rx->fill_cnt += work_done;
 
 	/* restock desc ring slots */
 	dma_wmb();	/* Ensure descs are visible before ringing doorbell */
-- 
2.22.0.770.g0f2c4a37fd-goog


^ permalink raw reply related

* Re: [PATCH net-next v5 5/6] flow_offload: support get flow_block immediately
From: Jakub Kicinski @ 2019-08-01 23:11 UTC (permalink / raw)
  To: wenxu; +Cc: jiri, pablo, fw, netfilter-devel, netdev, John Hurley
In-Reply-To: <1564628627-10021-6-git-send-email-wenxu@ucloud.cn>

On Thu,  1 Aug 2019 11:03:46 +0800, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> The new flow-indr-block can't get the tcf_block
> directly. It provide a callback list to find the flow_block immediately
> when the device register and contain a ingress block.
> 
> Signed-off-by: wenxu <wenxu@ucloud.cn>

First of all thanks for splitting the series up into more patches, 
it is easier to follow the logic now!

> @@ -328,6 +348,7 @@ struct flow_indr_block_dev {
>  
>  	INIT_LIST_HEAD(&indr_dev->cb_list);
>  	indr_dev->dev = dev;
> +	flow_get_default_block(indr_dev);
>  	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
>  				   flow_indr_setup_block_ht_params)) {
>  		kfree(indr_dev);

I wonder if it's still practical to keep the block information in the
indr_dev structure at all. The way this used to work was:

[hash table of devices]     --------------
                 |         |    netdev    |
                 |         |    refcnt    |
  indir_dev[tun0]|  ------ | cached block | ---- [ TC block ]
                 |         |   callbacks  | .
                 |          --------------   \__ [cb, cb_priv, cb_ident]
                 |                               [cb, cb_priv, cb_ident]
                 |          --------------
                 |         |    netdev    |
                 |         |    refcnt    |
  indir_dev[tun1]|  ------ | cached block | ---- [ TC block ]
                 |         |   callbacks  |.
-----------------           --------------   \__ [cb, cb_priv, cb_ident]
                                                 [cb, cb_priv, cb_ident]

In the example above we have two tunnels tun0 and tun1, each one has a
indr_dev structure allocated, and for each one of them two drivers
registered for callbacks (hence the callbacks list has two entries).

We used to cache the TC block in the indr_dev structure, but now that
there are multiple subsytems using the indr_dev we either have to have
a list of cached blocks (with entries for each subsystem) or just always
iterate over the subsystems :(

After all the same device may have both a TC block and a NFT block.

I think always iterating would be easier:

The indr_dev struct would no longer have the block pointer, instead
when new driver registers for the callback instead of:

	if (indr_dev->ing_cmd_cb)
		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
				     indr_block_cb->cb, indr_block_cb->cb_priv,
				     FLOW_BLOCK_BIND);

We'd have something like the loop in flow_get_default_block():

	for each (subsystem)
		subsystem->handle_new_indir_cb(indr_dev, cb);

And then per-subsystem logic would actually call the cb. Or:

	for each (subsystem)
		block = get_default_block(indir_dev)
		indr_dev->ing_cmd_cb(...)

I hope this makes sense.

Also please double check nft unload logic has an RCU synchronization
point, I'm not 100% confident rcu_barrier() implies synchronize_rcu().
Perhaps someone more knowledgeable can chime in :)

^ permalink raw reply

* Re: [PATCH 2/2] cnic: Use refcount_t for refcount
From: Michael Chan @ 2019-08-01 23:15 UTC (permalink / raw)
  To: Chuhong Yuan; +Cc: David S . Miller, Netdev, open list
In-Reply-To: <CANhBUQ1J8hXBZv4x3pJhG_08ZS1zR=9Uj2EUta2sgtyND_QKPw@mail.gmail.com>

On Wed, Jul 31, 2019 at 7:22 PM Chuhong Yuan <hslester96@gmail.com> wrote:
>
> Michael Chan <michael.chan@broadcom.com> 于2019年8月1日周四 上午1:58写道：
> >
> > On Wed, Jul 31, 2019 at 5:22 AM Chuhong Yuan <hslester96@gmail.com> wrote:
> >
> > >  static void cnic_ctx_wr(struct cnic_dev *dev, u32 cid_addr, u32 off, u32 val)
> > > @@ -494,7 +494,7 @@ int cnic_register_driver(int ulp_type, struct cnic_ulp_ops *ulp_ops)
> > >         }
> > >         read_unlock(&cnic_dev_lock);
> > >
> > > -       atomic_set(&ulp_ops->ref_count, 0);
> > > +       refcount_set(&ulp_ops->ref_count, 0);
> > >         rcu_assign_pointer(cnic_ulp_tbl[ulp_type], ulp_ops);
> > >         mutex_unlock(&cnic_lock);
> > >
> >
> > Willem's comment applies here too.  The driver needs to be modified to
> > count from 1 to use refcount_t.
> >
> > Thanks.
>
> I have revised this problem but find the other two refcounts -
> cnic_dev::ref_count
> and cnic_sock::ref_count have no set.
> I am not sure where to initialize them to 1.
>
> Besides, should ulp_ops->ref_count be set to 0 when unregistered?

I will send a patch to fix up the initialization of all the atomic ref
counts.  After that, you can add your patch to convert to refcount_t.

^ permalink raw reply

* Re: [v2,0/2] tools: bpftool: add net attach/detach command to attach XDP prog
From: Jakub Kicinski @ 2019-08-01 23:21 UTC (permalink / raw)
  To: Daniel T. Lee; +Cc: Daniel Borkmann, Alexei Starovoitov, netdev
In-Reply-To: <20190801081133.13200-1-danieltimlee@gmail.com>

On Thu,  1 Aug 2019 17:11:31 +0900, Daniel T. Lee wrote:
> Currently, bpftool net only supports dumping progs attached on the
> interface. To attach XDP prog on interface, user must use other tool
> (eg. iproute2). By this patch, with `bpftool net attach/detach`, user
> can attach/detach XDP prog on interface.
> 
>     $ ./bpftool prog
>     ...
>     208: xdp  name xdp_prog1  tag ad822e38b629553f  gpl
>       loaded_at 2019-07-28T18:03:11+0900  uid 0
>     ...
>     $ ./bpftool net attach id 208 xdpdrv enp6s0np1
>     $ ./bpftool net
>     xdp:
>     enp6s0np1(5) driver id 208
>     ...
>     $ ./bpftool net detach xdpdrv enp6s0np1
>     $ ./bpftool net
>     xdp:
>     ...
> 
> While this patch only contains support for XDP, through `net
> attach/detach`, bpftool can further support other prog attach types.
> 
> XDP attach/detach tested on Mellanox ConnectX-4 and Netronome Agilio.

Please provide documentation for man pages, and bash completions.

^ permalink raw reply

* Re: [v2,1/2] tools: bpftool: add net attach command to attach XDP on interface
From: Jakub Kicinski @ 2019-08-01 23:36 UTC (permalink / raw)
  To: Daniel T. Lee; +Cc: Daniel Borkmann, Alexei Starovoitov, netdev
In-Reply-To: <20190801081133.13200-2-danieltimlee@gmail.com>

On Thu,  1 Aug 2019 17:11:32 +0900, Daniel T. Lee wrote:
> By this commit, using `bpftool net attach`, user can attach XDP prog on
> interface. New type of enum 'net_attach_type' has been made, as stated at
> cover-letter, the meaning of 'attach' is, prog will be attached on interface.
> 
> BPF prog will be attached through libbpf 'bpf_set_link_xdp_fd'.
> 
> Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
> ---
> Changes in v2:
>   - command 'load' changed to 'attach' for the consistency
>   - 'NET_ATTACH_TYPE_XDP_DRIVE' changed to 'NET_ATTACH_TYPE_XDP_DRIVER'
> 
>  tools/bpf/bpftool/net.c | 107 +++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 106 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
> index 67e99c56bc88..f3b57660b303 100644
> --- a/tools/bpf/bpftool/net.c
> +++ b/tools/bpf/bpftool/net.c
> @@ -55,6 +55,35 @@ struct bpf_attach_info {
>  	__u32 flow_dissector_id;
>  };
>  
> +enum net_attach_type {
> +	NET_ATTACH_TYPE_XDP,
> +	NET_ATTACH_TYPE_XDP_GENERIC,
> +	NET_ATTACH_TYPE_XDP_DRIVER,
> +	NET_ATTACH_TYPE_XDP_OFFLOAD,
> +	__MAX_NET_ATTACH_TYPE
> +};
> +
> +static const char * const attach_type_strings[] = {
> +	[NET_ATTACH_TYPE_XDP] = "xdp",
> +	[NET_ATTACH_TYPE_XDP_GENERIC] = "xdpgeneric",
> +	[NET_ATTACH_TYPE_XDP_DRIVER] = "xdpdrv",
> +	[NET_ATTACH_TYPE_XDP_OFFLOAD] = "xdpoffload",
> +	[__MAX_NET_ATTACH_TYPE] = NULL,

Not sure if the terminator is necessary,
ARRAY_SIZE(attach_type_strings) should suffice?

> +};
> +
> +static enum net_attach_type parse_attach_type(const char *str)
> +{
> +	enum net_attach_type type;
> +
> +	for (type = 0; type < __MAX_NET_ATTACH_TYPE; type++) {
> +		if (attach_type_strings[type] &&
> +		   is_prefix(str, attach_type_strings[type]))
> +			return type;
> +	}
> +
> +	return __MAX_NET_ATTACH_TYPE;
> +}
> +
>  static int dump_link_nlmsg(void *cookie, void *msg, struct nlattr **tb)
>  {
>  	struct bpf_netdev_t *netinfo = cookie;
> @@ -223,6 +252,77 @@ static int query_flow_dissector(struct bpf_attach_info *attach_info)
>  	return 0;
>  }
>  
> +static int parse_attach_args(int argc, char **argv, int *progfd,
> +			     enum net_attach_type *attach_type, int *ifindex)
> +{
> +	if (!REQ_ARGS(3))
> +		return -EINVAL;
> +
> +	*progfd = prog_parse_fd(&argc, &argv);
> +	if (*progfd < 0)
> +		return *progfd;
> +
> +	*attach_type = parse_attach_type(*argv);
> +	if (*attach_type == __MAX_NET_ATTACH_TYPE) {
> +		p_err("invalid net attach/detach type");
> +		return -EINVAL;

You should close the progfd on error paths.

> +	}

Hm. I'm not too sure about the ordering of arguments, type should
probably be right after attach.

If we ever add tc attach support or some other hook, that's more
fundamental part of the command than the program. So I think:

bpftool net attach xdp id xyz dev ethN

> +	NEXT_ARG();
> +	if (!REQ_ARGS(1))
> +		return -EINVAL;

Error message needed here.

> +	*ifindex = if_nametoindex(*argv);
> +	if (!*ifindex) {
> +		p_err("Invalid ifname");

"ifname" is not mentioned in help, it'd be best to keep this error
message consistent with bpftool prog load.

> +		return -EINVAL;
> +	}

Please require the dev keyword before the interface name.
That'll make it feel closer to prog load syntax.

> +	return 0;
> +}
> +
> +static int do_attach_detach_xdp(int *progfd, enum net_attach_type *attach_type,
> +				int *ifindex)
> +{
> +	__u32 flags;
> +	int err;
> +
> +	flags = XDP_FLAGS_UPDATE_IF_NOEXIST;

Please add this as an option so that user can decide whether overwrite
is allowed or not.

> +	if (*attach_type == NET_ATTACH_TYPE_XDP_GENERIC)
> +		flags |= XDP_FLAGS_SKB_MODE;
> +	if (*attach_type == NET_ATTACH_TYPE_XDP_DRIVER)
> +		flags |= XDP_FLAGS_DRV_MODE;
> +	if (*attach_type == NET_ATTACH_TYPE_XDP_OFFLOAD)
> +		flags |= XDP_FLAGS_HW_MODE;
> +
> +	err = bpf_set_link_xdp_fd(*ifindex, *progfd, flags);
> +
> +	return err;

no need for the err variable here.

> +}
> +
> +static int do_attach(int argc, char **argv)
> +{
> +	enum net_attach_type attach_type;
> +	int err, progfd, ifindex;
> +
> +	err = parse_attach_args(argc, argv, &progfd, &attach_type, &ifindex);
> +	if (err)
> +		return err;

Probably not the best idea to move this out into a helper.

> +	if (is_prefix("xdp", attach_type_strings[attach_type]))
> +		err = do_attach_detach_xdp(&progfd, &attach_type, &ifindex);

Hm. We either need an error to be reported if it's not xdp or since we
only accept XDP now perhaps the if() is superfluous?

> +	if (err < 0) {
> +		p_err("link set %s failed", attach_type_strings[attach_type]);

"link set"?  So you are familiar with iproute2 syntax! :)

> +		return -1;
> +	}
> +
> +	if (json_output)
> +		jsonw_null(json_wtr);
> +
> +	return 0;
> +}
> +
>  static int do_show(int argc, char **argv)
>  {
>  	struct bpf_attach_info attach_info = {};
> @@ -305,13 +405,17 @@ static int do_help(int argc, char **argv)
>  
>  	fprintf(stderr,
>  		"Usage: %s %s { show | list } [dev <devname>]\n"
> +		"       %s %s attach PROG LOAD_TYPE <devname>\n"
>  		"       %s %s help\n"
> +		"\n"
> +		"       " HELP_SPEC_PROGRAM "\n"
> +		"       LOAD_TYPE := { xdp | xdpgeneric | xdpdrv | xdpoffload }\n"

ATTACH_TYPE now?

Perhaps a new line before the "Note"?

>  		"Note: Only xdp and tc attachments are supported now.\n"
>  		"      For progs attached to cgroups, use \"bpftool cgroup\"\n"
>  		"      to dump program attachments. For program types\n"
>  		"      sk_{filter,skb,msg,reuseport} and lwt/seg6, please\n"
>  		"      consult iproute2.\n",
> -		bin_name, argv[-2], bin_name, argv[-2]);
> +		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2]);
>  
>  	return 0;
>  }
> @@ -319,6 +423,7 @@ static int do_help(int argc, char **argv)
>  static const struct cmd cmds[] = {
>  	{ "show",	do_show },
>  	{ "list",	do_show },
> +	{ "attach",	do_attach },
>  	{ "help",	do_help },
>  	{ 0 }
>  };


^ permalink raw reply

* Re: [PATCH] net: sched: use temporary variable for actions indexes
From: Cong Wang @ 2019-08-01 23:41 UTC (permalink / raw)
  To: dmitrolin
  Cc: Linux Kernel Network Developers, David Miller, Jiri Pirko,
	Jamal Hadi Salim, Vlad Buslov
In-Reply-To: <1564664571-31508-1-git-send-email-dmitrolin@mellanox.com>

On Thu, Aug 1, 2019 at 6:03 AM <dmitrolin@mellanox.com> wrote:
>
> From: Dmytro Linkin <dmitrolin@mellanox.com>
>
> Currently init call of all actions (except ipt) init their 'parm'
> structure as a direct pointer to nla data in skb. This leads to race
> condition when some of the filter actions were initialized successfully
> (and were assigned with idr action index that was written directly
> into nla data), but then were deleted and retried (due to following
> action module missing or classifier-initiated retry), in which case
> action init code tries to insert action to idr with index that was
> assigned on previous iteration. During retry the index can be reused
> by another action that was inserted concurrently, which causes
> unintended action sharing between filters.
> To fix described race condition, save action idr index to temporary
> stack-allocated variable instead on nla data.
>
> Fixes: 0190c1d452a9 ("net: sched: atomically check-allocate action")
> Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>


Acked-by: Cong Wang <xiyou.wangcong@gmail.com>

This is a sad side-effect we have to deal with for this retry logic,
we have to restore all global status in each retry loop. :-(

Thanks.

^ permalink raw reply

* [PATCH v5 1/3] mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
From: john.hubbard @ 2019-08-01 23:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Björn Töpel, Boaz Harrosh,
	Christoph Hellwig, Daniel Vetter, Dan Williams, Dave Chinner,
	David Airlie, David S . Miller, Ilya Dryomov, Jan Kara,
	Jason Gunthorpe, Jens Axboe, Jérôme Glisse,
	Johannes Thumshirn, Magnus Karlsson, Matthew Wilcox,
	Miklos Szeredi, Ming Lei, Sage Weil, Santosh Shilimkar, Yan Zheng,
	netdev, dri-devel, linux-mm, linux-rdma, bpf, LKML, John Hubbard,
	Ira Weiny
In-Reply-To: <20190801234735.2149-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

Provide more capable variation of put_user_pages_dirty_lock(),
and delete put_user_pages_dirty(). This is based on the
following:

1. Lots of call sites become simpler if a bool is passed
into put_user_page*(), instead of making the call site
choose which put_user_page*() variant to call.

2. Christoph Hellwig's observation that set_page_dirty_lock()
is usually correct, and set_page_dirty() is usually a
bug, or at least questionable, within a put_user_page*()
calling chain.

This leads to the following API choices:

    * put_user_pages_dirty_lock(page, npages, make_dirty)

    * There is no put_user_pages_dirty(). You have to
      hand code that, in the rare case that it's
      required.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/infiniband/core/umem.c             |   5 +-
 drivers/infiniband/hw/hfi1/user_pages.c    |   5 +-
 drivers/infiniband/hw/qib/qib_user_pages.c |  13 +--
 drivers/infiniband/hw/usnic/usnic_uiom.c   |   5 +-
 drivers/infiniband/sw/siw/siw_mem.c        |  18 +---
 include/linux/mm.h                         |   5 +-
 mm/gup.c                                   | 115 +++++++++------------
 7 files changed, 60 insertions(+), 106 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 08da840ed7ee..965cf9dea71a 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -54,10 +54,7 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
 
 	for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->sg_nents, 0) {
 		page = sg_page_iter_page(&sg_iter);
-		if (umem->writable && dirty)
-			put_user_pages_dirty_lock(&page, 1);
-		else
-			put_user_page(page);
+		put_user_pages_dirty_lock(&page, 1, umem->writable && dirty);
 	}
 
 	sg_free_table(&umem->sg_head);
diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
index b89a9b9aef7a..469acb961fbd 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -118,10 +118,7 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np
 void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
 			     size_t npages, bool dirty)
 {
-	if (dirty)
-		put_user_pages_dirty_lock(p, npages);
-	else
-		put_user_pages(p, npages);
+	put_user_pages_dirty_lock(p, npages, dirty);
 
 	if (mm) { /* during close after signal, mm can be NULL */
 		atomic64_sub(npages, &mm->pinned_vm);
diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c
index bfbfbb7e0ff4..26c1fb8d45cc 100644
--- a/drivers/infiniband/hw/qib/qib_user_pages.c
+++ b/drivers/infiniband/hw/qib/qib_user_pages.c
@@ -37,15 +37,6 @@
 
 #include "qib.h"
 
-static void __qib_release_user_pages(struct page **p, size_t num_pages,
-				     int dirty)
-{
-	if (dirty)
-		put_user_pages_dirty_lock(p, num_pages);
-	else
-		put_user_pages(p, num_pages);
-}
-
 /**
  * qib_map_page - a safety wrapper around pci_map_page()
  *
@@ -124,7 +115,7 @@ int qib_get_user_pages(unsigned long start_page, size_t num_pages,
 
 	return 0;
 bail_release:
-	__qib_release_user_pages(p, got, 0);
+	put_user_pages_dirty_lock(p, got, false);
 bail:
 	atomic64_sub(num_pages, &current->mm->pinned_vm);
 	return ret;
@@ -132,7 +123,7 @@ int qib_get_user_pages(unsigned long start_page, size_t num_pages,
 
 void qib_release_user_pages(struct page **p, size_t num_pages)
 {
-	__qib_release_user_pages(p, num_pages, 1);
+	put_user_pages_dirty_lock(p, num_pages, true);
 
 	/* during close after signal, mm can be NULL */
 	if (current->mm)
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 0b0237d41613..62e6ffa9ad78 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -75,10 +75,7 @@ static void usnic_uiom_put_pages(struct list_head *chunk_list, int dirty)
 		for_each_sg(chunk->page_list, sg, chunk->nents, i) {
 			page = sg_page(sg);
 			pa = sg_phys(sg);
-			if (dirty)
-				put_user_pages_dirty_lock(&page, 1);
-			else
-				put_user_page(page);
+			put_user_pages_dirty_lock(&page, 1, dirty);
 			usnic_dbg("pa: %pa\n", &pa);
 		}
 		kfree(chunk);
diff --git a/drivers/infiniband/sw/siw/siw_mem.c b/drivers/infiniband/sw/siw/siw_mem.c
index 67171c82b0c4..2284966e4499 100644
--- a/drivers/infiniband/sw/siw/siw_mem.c
+++ b/drivers/infiniband/sw/siw/siw_mem.c
@@ -60,20 +60,6 @@ struct siw_mem *siw_mem_id2obj(struct siw_device *sdev, int stag_index)
 	return NULL;
 }
 
-static void siw_free_plist(struct siw_page_chunk *chunk, int num_pages,
-			   bool dirty)
-{
-	struct page **p = chunk->plist;
-
-	while (num_pages--) {
-		if (!PageDirty(*p) && dirty)
-			put_user_pages_dirty_lock(p, 1);
-		else
-			put_user_page(*p);
-		p++;
-	}
-}
-
 void siw_umem_release(struct siw_umem *umem, bool dirty)
 {
 	struct mm_struct *mm_s = umem->owning_mm;
@@ -82,8 +68,8 @@ void siw_umem_release(struct siw_umem *umem, bool dirty)
 	for (i = 0; num_pages; i++) {
 		int to_free = min_t(int, PAGES_PER_CHUNK, num_pages);
 
-		siw_free_plist(&umem->page_chunk[i], to_free,
-			       umem->writable && dirty);
+		put_user_pages_dirty_lock(&umem->page_chunk[i], to_free,
+					  umem->writable && dirty);
 		kfree(umem->page_chunk[i].plist);
 		num_pages -= to_free;
 	}
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0334ca97c584..9759b6a24420 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1057,8 +1057,9 @@ static inline void put_user_page(struct page *page)
 	put_page(page);
 }
 
-void put_user_pages_dirty(struct page **pages, unsigned long npages);
-void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
+void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
+			       bool make_dirty);
+
 void put_user_pages(struct page **pages, unsigned long npages);
 
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
diff --git a/mm/gup.c b/mm/gup.c
index 98f13ab37bac..7fefd7ab02c4 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -29,85 +29,70 @@ struct follow_page_context {
 	unsigned int page_mask;
 };
 
-typedef int (*set_dirty_func_t)(struct page *page);
-
-static void __put_user_pages_dirty(struct page **pages,
-				   unsigned long npages,
-				   set_dirty_func_t sdf)
-{
-	unsigned long index;
-
-	for (index = 0; index < npages; index++) {
-		struct page *page = compound_head(pages[index]);
-
-		/*
-		 * Checking PageDirty at this point may race with
-		 * clear_page_dirty_for_io(), but that's OK. Two key cases:
-		 *
-		 * 1) This code sees the page as already dirty, so it skips
-		 * the call to sdf(). That could happen because
-		 * clear_page_dirty_for_io() called page_mkclean(),
-		 * followed by set_page_dirty(). However, now the page is
-		 * going to get written back, which meets the original
-		 * intention of setting it dirty, so all is well:
-		 * clear_page_dirty_for_io() goes on to call
-		 * TestClearPageDirty(), and write the page back.
-		 *
-		 * 2) This code sees the page as clean, so it calls sdf().
-		 * The page stays dirty, despite being written back, so it
-		 * gets written back again in the next writeback cycle.
-		 * This is harmless.
-		 */
-		if (!PageDirty(page))
-			sdf(page);
-
-		put_user_page(page);
-	}
-}
-
 /**
- * put_user_pages_dirty() - release and dirty an array of gup-pinned pages
- * @pages:  array of pages to be marked dirty and released.
+ * put_user_pages_dirty_lock() - release and optionally dirty gup-pinned pages
+ * @pages:  array of pages to be maybe marked dirty, and definitely released.
  * @npages: number of pages in the @pages array.
+ * @make_dirty: whether to mark the pages dirty
  *
  * "gup-pinned page" refers to a page that has had one of the get_user_pages()
  * variants called on that page.
  *
  * For each page in the @pages array, make that page (or its head page, if a
- * compound page) dirty, if it was previously listed as clean. Then, release
- * the page using put_user_page().
+ * compound page) dirty, if @make_dirty is true, and if the page was previously
+ * listed as clean. In any case, releases all pages using put_user_page(),
+ * possibly via put_user_pages(), for the non-dirty case.
  *
  * Please see the put_user_page() documentation for details.
  *
- * set_page_dirty(), which does not lock the page, is used here.
- * Therefore, it is the caller's responsibility to ensure that this is
- * safe. If not, then put_user_pages_dirty_lock() should be called instead.
+ * set_page_dirty_lock() is used internally. If instead, set_page_dirty() is
+ * required, then the caller should a) verify that this is really correct,
+ * because _lock() is usually required, and b) hand code it:
+ * set_page_dirty_lock(), put_user_page().
  *
  */
-void put_user_pages_dirty(struct page **pages, unsigned long npages)
+void put_user_pages_dirty_lock(struct page **pages, unsigned long npages,
+			       bool make_dirty)
 {
-	__put_user_pages_dirty(pages, npages, set_page_dirty);
-}
-EXPORT_SYMBOL(put_user_pages_dirty);
+	unsigned long index;
 
-/**
- * put_user_pages_dirty_lock() - release and dirty an array of gup-pinned pages
- * @pages:  array of pages to be marked dirty and released.
- * @npages: number of pages in the @pages array.
- *
- * For each page in the @pages array, make that page (or its head page, if a
- * compound page) dirty, if it was previously listed as clean. Then, release
- * the page using put_user_page().
- *
- * Please see the put_user_page() documentation for details.
- *
- * This is just like put_user_pages_dirty(), except that it invokes
- * set_page_dirty_lock(), instead of set_page_dirty().
- *
- */
-void put_user_pages_dirty_lock(struct page **pages, unsigned long npages)
-{
-	__put_user_pages_dirty(pages, npages, set_page_dirty_lock);
+	/*
+	 * TODO: this can be optimized for huge pages: if a series of pages is
+	 * physically contiguous and part of the same compound page, then a
+	 * single operation to the head page should suffice.
+	 */
+
+	if (!make_dirty) {
+		put_user_pages(pages, npages);
+		return;
+	}
+
+	for (index = 0; index < npages; index++) {
+		struct page *page = compound_head(pages[index]);
+		/*
+		 * Checking PageDirty at this point may race with
+		 * clear_page_dirty_for_io(), but that's OK. Two key
+		 * cases:
+		 *
+		 * 1) This code sees the page as already dirty, so it
+		 * skips the call to set_page_dirty(). That could happen
+		 * because clear_page_dirty_for_io() called
+		 * page_mkclean(), followed by set_page_dirty().
+		 * However, now the page is going to get written back,
+		 * which meets the original intention of setting it
+		 * dirty, so all is well: clear_page_dirty_for_io() goes
+		 * on to call TestClearPageDirty(), and write the page
+		 * back.
+		 *
+		 * 2) This code sees the page as clean, so it calls
+		 * set_page_dirty(). The page stays dirty, despite being
+		 * written back, so it gets written back again in the
+		 * next writeback cycle. This is harmless.
+		 */
+		if (!PageDirty(page))
+			set_page_dirty_lock(page);
+		put_user_page(page);
+	}
 }
 EXPORT_SYMBOL(put_user_pages_dirty_lock);
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH v5 3/3] net/xdp: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-01 23:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Björn Töpel, Boaz Harrosh,
	Christoph Hellwig, Daniel Vetter, Dan Williams, Dave Chinner,
	David Airlie, David S . Miller, Ilya Dryomov, Jan Kara,
	Jason Gunthorpe, Jens Axboe, Jérôme Glisse,
	Johannes Thumshirn, Magnus Karlsson, Matthew Wilcox,
	Miklos Szeredi, Ming Lei, Sage Weil, Santosh Shilimkar, Yan Zheng,
	netdev, dri-devel, linux-mm, linux-rdma, bpf, LKML, John Hubbard
In-Reply-To: <20190801234735.2149-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Acked-by: Björn Töpel <bjorn.topel@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 net/xdp/xdp_umem.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
index 83de74ca729a..17c4b3d3dc34 100644
--- a/net/xdp/xdp_umem.c
+++ b/net/xdp/xdp_umem.c
@@ -166,14 +166,7 @@ void xdp_umem_clear_dev(struct xdp_umem *umem)
 
 static void xdp_umem_unpin_pages(struct xdp_umem *umem)
 {
-	unsigned int i;
-
-	for (i = 0; i < umem->npgs; i++) {
-		struct page *page = umem->pgs[i];
-
-		set_page_dirty_lock(page);
-		put_page(page);
-	}
+	put_user_pages_dirty_lock(umem->pgs, umem->npgs, true);
 
 	kfree(umem->pgs);
 	umem->pgs = NULL;
-- 
2.22.0


^ permalink raw reply related

* [PATCH v5 2/3] drivers/gpu/drm/via: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-01 23:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Björn Töpel, Boaz Harrosh,
	Christoph Hellwig, Daniel Vetter, Dan Williams, Dave Chinner,
	David Airlie, David S . Miller, Ilya Dryomov, Jan Kara,
	Jason Gunthorpe, Jens Axboe, Jérôme Glisse,
	Johannes Thumshirn, Magnus Karlsson, Matthew Wilcox,
	Miklos Szeredi, Ming Lei, Sage Weil, Santosh Shilimkar, Yan Zheng,
	netdev, dri-devel, linux-mm, linux-rdma, bpf, LKML, John Hubbard
In-Reply-To: <20190801234735.2149-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Also reverse the order of a comparison, in order to placate
checkpatch.pl.

Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/drm/via/via_dmablit.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/via/via_dmablit.c b/drivers/gpu/drm/via/via_dmablit.c
index 062067438f1d..b5b5bf0ba65e 100644
--- a/drivers/gpu/drm/via/via_dmablit.c
+++ b/drivers/gpu/drm/via/via_dmablit.c
@@ -171,7 +171,6 @@ via_map_blit_for_device(struct pci_dev *pdev,
 static void
 via_free_sg_info(struct pci_dev *pdev, drm_via_sg_info_t *vsg)
 {
-	struct page *page;
 	int i;
 
 	switch (vsg->state) {
@@ -186,13 +185,8 @@ via_free_sg_info(struct pci_dev *pdev, drm_via_sg_info_t *vsg)
 		kfree(vsg->desc_pages);
 		/* fall through */
 	case dr_via_pages_locked:
-		for (i = 0; i < vsg->num_pages; ++i) {
-			if (NULL != (page = vsg->pages[i])) {
-				if (!PageReserved(page) && (DMA_FROM_DEVICE == vsg->direction))
-					SetPageDirty(page);
-				put_page(page);
-			}
-		}
+		put_user_pages_dirty_lock(vsg->pages, vsg->num_pages,
+					  (vsg->direction == DMA_FROM_DEVICE));
 		/* fall through */
 	case dr_via_pages_alloc:
 		vfree(vsg->pages);
-- 
2.22.0


^ permalink raw reply related

* [PATCH v5 0/3]  mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
From: john.hubbard @ 2019-08-01 23:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Björn Töpel, Boaz Harrosh,
	Christoph Hellwig, Daniel Vetter, Dan Williams, Dave Chinner,
	David Airlie, David S . Miller, Ilya Dryomov, Jan Kara,
	Jason Gunthorpe, Jens Axboe, Jérôme Glisse,
	Johannes Thumshirn, Magnus Karlsson, Matthew Wilcox,
	Miklos Szeredi, Ming Lei, Sage Weil, Santosh Shilimkar, Yan Zheng,
	netdev, dri-devel, linux-mm, linux-rdma, bpf, LKML, John Hubbard

From: John Hubbard <jhubbard@nvidia.com>

Changes since v4:

* Christophe Hellwig's review applied: deleted siw_free_plist() and
  __qib_release_user_pages(), now that put_user_pages_dirty_lock() does
  what those routines were doing.

* Applied Bjorn's ACK for net/xdp, and Christophe's Reviewed-by for patch
  #1.

Changes since v3:

* Fixed an unused variable warning in siw_mem.c

Changes since v2:

* Critical bug fix: remove a stray "break;" from the new routine.

Changes since v1:

* Instead of providing __put_user_pages(), add an argument to
  put_user_pages_dirty_lock(), and delete put_user_pages_dirty().
  This is based on the following points:

    1. Lots of call sites become simpler if a bool is passed
    into put_user_page*(), instead of making the call site
    choose which put_user_page*() variant to call.

    2. Christoph Hellwig's observation that set_page_dirty_lock()
    is usually correct, and set_page_dirty() is usually a
    bug, or at least questionable, within a put_user_page*()
    calling chain.

* Added the Infiniband driver back to the patch series, because it is
  a caller of put_user_pages_dirty_lock().

Unchanged parts from the v1 cover letter (except for the diffstat):

Notes about the remaining patches to come:

There are about 50+ patches in my tree [2], and I'll be sending out the
remaining ones in a few more groups:

    * The block/bio related changes (Jerome mostly wrote those, but I've
      had to move stuff around extensively, and add a little code)

    * mm/ changes

    * other subsystem patches

    * an RFC that shows the current state of the tracking patch set. That
      can only be applied after all call sites are converted, but it's
      good to get an early look at it.

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").



John Hubbard (3):
  mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
  drivers/gpu/drm/via: convert put_page() to put_user_page*()
  net/xdp: convert put_page() to put_user_page*()

 drivers/gpu/drm/via/via_dmablit.c          |  10 +-
 drivers/infiniband/core/umem.c             |   5 +-
 drivers/infiniband/hw/hfi1/user_pages.c    |   5 +-
 drivers/infiniband/hw/qib/qib_user_pages.c |  13 +--
 drivers/infiniband/hw/usnic/usnic_uiom.c   |   5 +-
 drivers/infiniband/sw/siw/siw_mem.c        |  18 +---
 include/linux/mm.h                         |   5 +-
 mm/gup.c                                   | 115 +++++++++------------
 net/xdp/xdp_umem.c                         |   9 +-
 9 files changed, 63 insertions(+), 122 deletions(-)

-- 
2.22.0


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox