Netdev List
 help / color / mirror / Atom feed
* Re: Reminder: 3 open syzbot bugs in vhost subsystem
From: Eric Biggers @ 2019-07-24  3:13 UTC (permalink / raw)
  To: Jason Wang
  Cc: kvm, virtualization, netdev, Michael S. Tsirkin, linux-kernel,
	syzkaller-bugs
In-Reply-To: <fabf96ac-e472-c7fd-07ff-486fe03e6433@redhat.com>

On Wed, Jul 24, 2019 at 11:05:14AM +0800, Jason Wang wrote:
> > --------------------------------------------------------------------------------
> > Title:              KASAN: use-after-free Write in tlb_finish_mmu
> > Last occurred:      5 days ago
> > Reported:           4 days ago
> > Branches:           Mainline
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=d57b94f89e48c85ef7d95acc208209ea4bdc10de
> > Original thread:    https://lkml.kernel.org/lkml/00000000000045e7a1058e02458a@google.com/T/#u
> > 
> > This bug has a syzkaller reproducer only.
> > 
> > This bug was bisected to:
> > 
> > 	commit 7f466032dc9e5a61217f22ea34b2df932786bbfc
> > 	Author: Jason Wang <jasowang@redhat.com>
> > 	Date:   Fri May 24 08:12:18 2019 +0000
> > 
> > 	  vhost: access vq metadata through kernel virtual address
> > 
> > No one has replied to the original thread for this bug yet.
> > 
> > If you fix this bug, please add the following tag to the commit:
> >      Reported-by: syzbot+8267e9af795434ffadad@syzkaller.appspotmail.com
> > 
> > If you send any email or patch for this bug, please reply to the original
> > thread.  For the git send-email command to use, or tips on how to reply if the
> > thread isn't in your mailbox, see the "Reply instructions" at
> > https://lkml.kernel.org/r/00000000000045e7a1058e02458a@google.com
> > 
> > --------------------------------------------------------------------------------
> > Title:              KASAN: use-after-free Read in finish_task_switch (2)
> > Last occurred:      5 days ago
> > Reported:           4 days ago
> > Branches:           Mainline
> > Dashboard link:     https://syzkaller.appspot.com/bug?id=9a98fcad6c8bd31f5c3afbdc6c75de9f082c0ffa
> > Original thread:    https://lkml.kernel.org/lkml/000000000000490679058e0245ee@google.com/T/#u
> > 
> > This bug has a syzkaller reproducer only.
> > 
> > This bug was bisected to:
> > 
> > 	commit 7f466032dc9e5a61217f22ea34b2df932786bbfc
> > 	Author: Jason Wang <jasowang@redhat.com>
> > 	Date:   Fri May 24 08:12:18 2019 +0000
> > 
> > 	  vhost: access vq metadata through kernel virtual address
> > 
> > No one has replied to the original thread for this bug yet.
> 
> 
> Hi:
> 
> We believe above two bugs are duplicated with the report "WARNING in
> __mmdrop". Can I just dup them with
> 
> #syz dup "WARNING in __mmdrop"
> 
> (If yes, just wonder how syzbot differ bugs, technically, several different
> bug can hit the same warning).
> 

Yes, please mark them as duplicates; see https://goo.gl/tpsmEJ#status for
correct syntax.  You need to send the command to the syzbot email address
specific to each bug.  Easiest way is to reply to the original threads.

- Eric

^ permalink raw reply

* [PATCH net-next 00/11] net: hns3: some code optimizations & bugfixes & features
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Huazhong Tan

This patch-set includes code optimizations, bugfixes and features for
the HNS3 ethernet controller driver.

[patch 1/11] checks reset status before setting channel.

[patch 2/11] adds a NULL pointer checking.

[patch 3/11] removes reset level upgrading when current reset fails.

[patch 4/11] fixes a bug related to IRQ vector number initialization.

[patch 5/11] fixes a GFP flags errors when holding spin_lock.

[patch 6/11] modifies firmware version format.

[patch 7/11] adds some print information.

[patch 8/11 - 9/11] adds two code optimizations about interrupt handler
and work task.

[patch 10/11] adds support for using order 1 pages with a 4K buffer.

[patch 11/11] adds a detection about the reason of IMP errors.

Guangbin Huang (1):
  net: hns3: add a check for get_reset_level

Huazhong Tan (1):
  net: hns3: remove upgrade reset level when reset fail

Jian Shen (1):
  net: hns3: add reset checking before set channels

Weihang Li (1):
  net: hns3: add support for handling IMP error

Yonglong Liu (2):
  net: hns3: fix mis-counting IRQ vector numbers issue
  net: hns3: adds debug messages to identify eth down cause

Yufeng Mo (2):
  net: hns3: change GFP flag during lock period
  net: hns3: modify firmware version display format

Yunsheng Lin (3):
  net: hns3: add interrupt affinity support for misc interrupt
  net: hns3: make hclge_service use delayed workqueue
  net: hns3: Add support for using order 1 pages with a 4K buffer

 drivers/net/ethernet/hisilicon/hns3/hnae3.h        |  10 ++
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c    |  39 ++++-
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h    |  15 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c |  41 ++++-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c |  10 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c |  14 ++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c |  37 ++++-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h |   4 +
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    | 165 ++++++++++++++-------
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |  16 +-
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c   |  11 +-
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c  |  12 +-
 12 files changed, 298 insertions(+), 76 deletions(-)

-- 
2.7.4


^ permalink raw reply

* [PATCH net-next 09/11] net: hns3: make hclge_service use delayed workqueue
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Yunsheng Lin, Huazhong Tan
In-Reply-To: <1563938327-9865-1-git-send-email-tanhuazhong@huawei.com>

From: Yunsheng Lin <linyunsheng@huawei.com>

Use delayed work instead of using timers to trigger the
hclge_serive.

Simplify the code with one less middle function and in order
to support misc irq affinity.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    | 60 ++++++++--------------
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |  3 +-
 2 files changed, 22 insertions(+), 41 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index fe45986..45acbc9 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2524,9 +2524,13 @@ static void hclge_task_schedule(struct hclge_dev *hdev)
 {
 	if (!test_bit(HCLGE_STATE_DOWN, &hdev->state) &&
 	    !test_bit(HCLGE_STATE_REMOVING, &hdev->state) &&
-	    !test_and_set_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state))
-		queue_work_on(cpumask_first(&hdev->affinity_mask), system_wq,
-			      &hdev->service_task);
+	    !test_and_set_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state)) {
+		hdev->hw_stats.stats_timer++;
+		hdev->fd_arfs_expire_timer++;
+		mod_delayed_work_on(cpumask_first(&hdev->affinity_mask),
+				    system_wq, &hdev->service_task,
+				    round_jiffies_relative(HZ));
+	}
 }
 
 static int hclge_get_mac_link_status(struct hclge_dev *hdev)
@@ -2741,25 +2745,6 @@ static int hclge_get_status(struct hnae3_handle *handle)
 	return hdev->hw.mac.link;
 }
 
-static void hclge_service_timer(struct timer_list *t)
-{
-	struct hclge_dev *hdev = from_timer(hdev, t, service_timer);
-
-	mod_timer(&hdev->service_timer, jiffies + HZ);
-	hdev->hw_stats.stats_timer++;
-	hdev->fd_arfs_expire_timer++;
-	hclge_task_schedule(hdev);
-}
-
-static void hclge_service_complete(struct hclge_dev *hdev)
-{
-	WARN_ON(!test_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state));
-
-	/* Flush memory before next watchdog */
-	smp_mb__before_atomic();
-	clear_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state);
-}
-
 static u32 hclge_check_event_cause(struct hclge_dev *hdev, u32 *clearval)
 {
 	u32 rst_src_reg, cmdq_src_reg, msix_src_reg;
@@ -2937,9 +2922,6 @@ static void hclge_irq_affinity_notify(struct irq_affinity_notify *notify,
 					      affinity_notify);
 
 	cpumask_copy(&hdev->affinity_mask, mask);
-	del_timer_sync(&hdev->service_timer);
-	hdev->service_timer.expires = jiffies + HZ;
-	add_timer_on(&hdev->service_timer, cpumask_first(&hdev->affinity_mask));
 }
 
 static void hclge_irq_affinity_release(struct kref *ref)
@@ -3639,7 +3621,9 @@ static void hclge_update_vport_alive(struct hclge_dev *hdev)
 static void hclge_service_task(struct work_struct *work)
 {
 	struct hclge_dev *hdev =
-		container_of(work, struct hclge_dev, service_task);
+		container_of(work, struct hclge_dev, service_task.work);
+
+	clear_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state);
 
 	if (hdev->hw_stats.stats_timer >= HCLGE_STATS_TIMER_INTERVAL) {
 		hclge_update_stats_for_all(hdev);
@@ -3654,7 +3638,8 @@ static void hclge_service_task(struct work_struct *work)
 		hclge_rfs_filter_expire(hdev);
 		hdev->fd_arfs_expire_timer = 0;
 	}
-	hclge_service_complete(hdev);
+
+	hclge_task_schedule(hdev);
 }
 
 struct hclge_vport *hclge_get_vport(struct hnae3_handle *handle)
@@ -6193,13 +6178,13 @@ static void hclge_set_timer_task(struct hnae3_handle *handle, bool enable)
 	struct hclge_dev *hdev = vport->back;
 
 	if (enable) {
-		del_timer_sync(&hdev->service_timer);
-		hdev->service_timer.expires = jiffies + HZ;
-		add_timer_on(&hdev->service_timer,
-			     cpumask_first(&hdev->affinity_mask));
+		hclge_task_schedule(hdev);
 	} else {
-		del_timer_sync(&hdev->service_timer);
-		cancel_work_sync(&hdev->service_task);
+		/* Set the DOWN flag here to disable the service to be
+		 * scheduled again
+		 */
+		set_bit(HCLGE_STATE_DOWN, &hdev->state);
+		cancel_delayed_work_sync(&hdev->service_task);
 		clear_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state);
 	}
 }
@@ -8638,12 +8623,10 @@ static void hclge_state_uninit(struct hclge_dev *hdev)
 	set_bit(HCLGE_STATE_DOWN, &hdev->state);
 	set_bit(HCLGE_STATE_REMOVING, &hdev->state);
 
-	if (hdev->service_timer.function)
-		del_timer_sync(&hdev->service_timer);
 	if (hdev->reset_timer.function)
 		del_timer_sync(&hdev->reset_timer);
-	if (hdev->service_task.func)
-		cancel_work_sync(&hdev->service_task);
+	if (hdev->service_task.work.func)
+		cancel_delayed_work_sync(&hdev->service_task);
 	if (hdev->rst_service_task.func)
 		cancel_work_sync(&hdev->rst_service_task);
 	if (hdev->mbx_service_task.func)
@@ -8848,9 +8831,8 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
 
 	hclge_dcb_ops_set(hdev);
 
-	timer_setup(&hdev->service_timer, hclge_service_timer, 0);
 	timer_setup(&hdev->reset_timer, hclge_reset_timer, 0);
-	INIT_WORK(&hdev->service_task, hclge_service_task);
+	INIT_DELAYED_WORK(&hdev->service_task, hclge_service_task);
 	INIT_WORK(&hdev->rst_service_task, hclge_reset_service_task);
 	INIT_WORK(&hdev->mbx_service_task, hclge_mailbox_service_task);
 
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
index 14df23c..688e425 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
@@ -806,9 +806,8 @@ struct hclge_dev {
 	u16 adminq_work_limit; /* Num of admin receive queue desc to process */
 	unsigned long service_timer_period;
 	unsigned long service_timer_previous;
-	struct timer_list service_timer;
 	struct timer_list reset_timer;
-	struct work_struct service_task;
+	struct delayed_work service_task;
 	struct work_struct rst_service_task;
 	struct work_struct mbx_service_task;
 
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 06/11] net: hns3: modify firmware version display format
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Yufeng Mo, Peng Li, Huazhong Tan
In-Reply-To: <1563938327-9865-1-git-send-email-tanhuazhong@huawei.com>

From: Yufeng Mo <moyufeng@huawei.com>

This patch modifies firmware version display format in
hclge(vf)_cmd_init() and hns3_get_drvinfo(). Also, adds
some optimizations for firmware version display format.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.h              |  9 +++++++++
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c       | 15 +++++++++++++--
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c   | 10 +++++++++-
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c | 11 +++++++++--
 4 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index 48c7b70..a4624db 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -179,6 +179,15 @@ struct hnae3_vector_info {
 #define HNAE3_RING_GL_RX 0
 #define HNAE3_RING_GL_TX 1
 
+#define HNAE3_FW_VERSION_BYTE3_SHIFT	24
+#define HNAE3_FW_VERSION_BYTE3_MASK	GENMASK(31, 24)
+#define HNAE3_FW_VERSION_BYTE2_SHIFT	16
+#define HNAE3_FW_VERSION_BYTE2_MASK	GENMASK(23, 16)
+#define HNAE3_FW_VERSION_BYTE1_SHIFT	8
+#define HNAE3_FW_VERSION_BYTE1_MASK	GENMASK(15, 8)
+#define HNAE3_FW_VERSION_BYTE0_SHIFT	0
+#define HNAE3_FW_VERSION_BYTE0_MASK	GENMASK(7, 0)
+
 struct hnae3_ring_chain_node {
 	struct hnae3_ring_chain_node *next;
 	u32 tqp_index;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index 5bff98a..e71c92b 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -527,6 +527,7 @@ static void hns3_get_drvinfo(struct net_device *netdev,
 {
 	struct hns3_nic_priv *priv = netdev_priv(netdev);
 	struct hnae3_handle *h = priv->ae_handle;
+	u32 fw_version;
 
 	if (!h->ae_algo->ops->get_fw_version) {
 		netdev_err(netdev, "could not get fw version!\n");
@@ -545,8 +546,18 @@ static void hns3_get_drvinfo(struct net_device *netdev,
 		sizeof(drvinfo->bus_info));
 	drvinfo->bus_info[ETHTOOL_BUSINFO_LEN - 1] = '\0';
 
-	snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), "0x%08x",
-		 priv->ae_handle->ae_algo->ops->get_fw_version(h));
+	fw_version = priv->ae_handle->ae_algo->ops->get_fw_version(h);
+
+	snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+		 "%lu.%lu.%lu.%lu",
+		 hnae3_get_field(fw_version, HNAE3_FW_VERSION_BYTE3_MASK,
+				 HNAE3_FW_VERSION_BYTE3_SHIFT),
+		 hnae3_get_field(fw_version, HNAE3_FW_VERSION_BYTE2_MASK,
+				 HNAE3_FW_VERSION_BYTE2_SHIFT),
+		 hnae3_get_field(fw_version, HNAE3_FW_VERSION_BYTE1_MASK,
+				 HNAE3_FW_VERSION_BYTE1_SHIFT),
+		 hnae3_get_field(fw_version, HNAE3_FW_VERSION_BYTE0_MASK,
+				 HNAE3_FW_VERSION_BYTE0_SHIFT));
 }
 
 static u32 hns3_get_link(struct net_device *netdev)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
index 22f6acd..c2320bf 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
@@ -419,7 +419,15 @@ int hclge_cmd_init(struct hclge_dev *hdev)
 	}
 	hdev->fw_version = version;
 
-	dev_info(&hdev->pdev->dev, "The firmware version is %08x\n", version);
+	pr_info_once("The firmware version is %lu.%lu.%lu.%lu\n",
+		     hnae3_get_field(version, HNAE3_FW_VERSION_BYTE3_MASK,
+				     HNAE3_FW_VERSION_BYTE3_SHIFT),
+		     hnae3_get_field(version, HNAE3_FW_VERSION_BYTE2_MASK,
+				     HNAE3_FW_VERSION_BYTE2_SHIFT),
+		     hnae3_get_field(version, HNAE3_FW_VERSION_BYTE1_MASK,
+				     HNAE3_FW_VERSION_BYTE1_SHIFT),
+		     hnae3_get_field(version, HNAE3_FW_VERSION_BYTE0_MASK,
+				     HNAE3_FW_VERSION_BYTE0_SHIFT));
 
 	return 0;
 
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c
index 652b796..004125b 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c
@@ -405,8 +405,15 @@ int hclgevf_cmd_init(struct hclgevf_dev *hdev)
 	}
 	hdev->fw_version = version;
 
-	dev_info(&hdev->pdev->dev, "The firmware version is %08x\n", version);
-
+	pr_info_once("The firmware version is %lu.%lu.%lu.%lu\n",
+		     hnae3_get_field(version, HNAE3_FW_VERSION_BYTE3_MASK,
+				     HNAE3_FW_VERSION_BYTE3_SHIFT),
+		     hnae3_get_field(version, HNAE3_FW_VERSION_BYTE2_MASK,
+				     HNAE3_FW_VERSION_BYTE2_SHIFT),
+		     hnae3_get_field(version, HNAE3_FW_VERSION_BYTE1_MASK,
+				     HNAE3_FW_VERSION_BYTE1_SHIFT),
+		     hnae3_get_field(version, HNAE3_FW_VERSION_BYTE0_MASK,
+				     HNAE3_FW_VERSION_BYTE0_SHIFT));
 	return 0;
 
 err_cmd_init:
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 08/11] net: hns3: add interrupt affinity support for misc interrupt
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Yunsheng Lin, Peng Li, Huazhong Tan
In-Reply-To: <1563938327-9865-1-git-send-email-tanhuazhong@huawei.com>

From: Yunsheng Lin <linyunsheng@huawei.com>

The misc interrupt is used to schedule the reset and mailbox
subtask, and a 1 sec timer is used to schedule the service
subtask, which does periodic work.

This patch sets the above three subtask's affinity using the
misc interrupt' affinity.

Also this patch setups a affinity notify for misc interrupt to
allow user to change the above three subtask's affinity.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    | 59 ++++++++++++++++++++--
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |  4 ++
 2 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index f345095..fe45986 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -1270,6 +1270,12 @@ static int hclge_configure(struct hclge_dev *hdev)
 
 	hclge_init_kdump_kernel_config(hdev);
 
+	/* Set the init affinity based on pci func number */
+	i = cpumask_weight(cpumask_of_node(dev_to_node(&hdev->pdev->dev)));
+	i = i ? PCI_FUNC(hdev->pdev->devfn) % i : 0;
+	cpumask_set_cpu(cpumask_local_spread(i, dev_to_node(&hdev->pdev->dev)),
+			&hdev->affinity_mask);
+
 	return ret;
 }
 
@@ -2502,14 +2508,16 @@ static void hclge_mbx_task_schedule(struct hclge_dev *hdev)
 {
 	if (!test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state) &&
 	    !test_and_set_bit(HCLGE_STATE_MBX_SERVICE_SCHED, &hdev->state))
-		schedule_work(&hdev->mbx_service_task);
+		queue_work_on(cpumask_first(&hdev->affinity_mask), system_wq,
+			      &hdev->mbx_service_task);
 }
 
 static void hclge_reset_task_schedule(struct hclge_dev *hdev)
 {
 	if (!test_bit(HCLGE_STATE_REMOVING, &hdev->state) &&
 	    !test_and_set_bit(HCLGE_STATE_RST_SERVICE_SCHED, &hdev->state))
-		schedule_work(&hdev->rst_service_task);
+		queue_work_on(cpumask_first(&hdev->affinity_mask), system_wq,
+			      &hdev->rst_service_task);
 }
 
 static void hclge_task_schedule(struct hclge_dev *hdev)
@@ -2517,7 +2525,8 @@ static void hclge_task_schedule(struct hclge_dev *hdev)
 	if (!test_bit(HCLGE_STATE_DOWN, &hdev->state) &&
 	    !test_bit(HCLGE_STATE_REMOVING, &hdev->state) &&
 	    !test_and_set_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state))
-		(void)schedule_work(&hdev->service_task);
+		queue_work_on(cpumask_first(&hdev->affinity_mask), system_wq,
+			      &hdev->service_task);
 }
 
 static int hclge_get_mac_link_status(struct hclge_dev *hdev)
@@ -2921,6 +2930,39 @@ static void hclge_get_misc_vector(struct hclge_dev *hdev)
 	hdev->num_msi_used += 1;
 }
 
+static void hclge_irq_affinity_notify(struct irq_affinity_notify *notify,
+				      const cpumask_t *mask)
+{
+	struct hclge_dev *hdev = container_of(notify, struct hclge_dev,
+					      affinity_notify);
+
+	cpumask_copy(&hdev->affinity_mask, mask);
+	del_timer_sync(&hdev->service_timer);
+	hdev->service_timer.expires = jiffies + HZ;
+	add_timer_on(&hdev->service_timer, cpumask_first(&hdev->affinity_mask));
+}
+
+static void hclge_irq_affinity_release(struct kref *ref)
+{
+}
+
+static void hclge_misc_affinity_setup(struct hclge_dev *hdev)
+{
+	irq_set_affinity_hint(hdev->misc_vector.vector_irq,
+			      &hdev->affinity_mask);
+
+	hdev->affinity_notify.notify = hclge_irq_affinity_notify;
+	hdev->affinity_notify.release = hclge_irq_affinity_release;
+	irq_set_affinity_notifier(hdev->misc_vector.vector_irq,
+				  &hdev->affinity_notify);
+}
+
+static void hclge_misc_affinity_teardown(struct hclge_dev *hdev)
+{
+	irq_set_affinity_notifier(hdev->misc_vector.vector_irq, NULL);
+	irq_set_affinity_hint(hdev->misc_vector.vector_irq, NULL);
+}
+
 static int hclge_misc_irq_init(struct hclge_dev *hdev)
 {
 	int ret;
@@ -6151,7 +6193,10 @@ static void hclge_set_timer_task(struct hnae3_handle *handle, bool enable)
 	struct hclge_dev *hdev = vport->back;
 
 	if (enable) {
-		mod_timer(&hdev->service_timer, jiffies + HZ);
+		del_timer_sync(&hdev->service_timer);
+		hdev->service_timer.expires = jiffies + HZ;
+		add_timer_on(&hdev->service_timer,
+			     cpumask_first(&hdev->affinity_mask));
 	} else {
 		del_timer_sync(&hdev->service_timer);
 		cancel_work_sync(&hdev->service_task);
@@ -8809,6 +8854,11 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
 	INIT_WORK(&hdev->rst_service_task, hclge_reset_service_task);
 	INIT_WORK(&hdev->mbx_service_task, hclge_mailbox_service_task);
 
+	/* Setup affinity after service timer setup because add_timer_on
+	 * is called in affinity notify.
+	 */
+	hclge_misc_affinity_setup(hdev);
+
 	hclge_clear_all_event_cause(hdev);
 	hclge_clear_resetting_state(hdev);
 
@@ -8970,6 +9020,7 @@ static void hclge_uninit_ae_dev(struct hnae3_ae_dev *ae_dev)
 	struct hclge_dev *hdev = ae_dev->priv;
 	struct hclge_mac *mac = &hdev->hw.mac;
 
+	hclge_misc_affinity_teardown(hdev);
 	hclge_state_uninit(hdev);
 
 	if (mac->phydev)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
index 6a12285..14df23c 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
@@ -864,6 +864,10 @@ struct hclge_dev {
 
 	DECLARE_KFIFO(mac_tnl_log, struct hclge_mac_tnl_stats,
 		      HCLGE_MAC_TNL_LOG_SIZE);
+
+	/* affinity mask and notify for misc interrupt */
+	cpumask_t affinity_mask;
+	struct irq_affinity_notify affinity_notify;
 };
 
 /* VPort level vlan tag configuration for TX direction */
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 03/11] net: hns3: remove upgrade reset level when reset fail
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Huazhong Tan
In-Reply-To: <1563938327-9865-1-git-send-email-tanhuazhong@huawei.com>

Currently, hclge_reset_err_handle() will assert a global reset
when the failing count is smaller than MAX_RESET_FAIL_CNT, which
will affect other running functions.

So this patch removes this upgrading, and uses re-scheduling reset
task to do it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
---
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    | 28 +++++++---------------
 1 file changed, 8 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 3fde5471..3c64d70 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3305,7 +3305,7 @@ static int hclge_reset_prepare_wait(struct hclge_dev *hdev)
 	return ret;
 }
 
-static bool hclge_reset_err_handle(struct hclge_dev *hdev, bool is_timeout)
+static bool hclge_reset_err_handle(struct hclge_dev *hdev)
 {
 #define MAX_RESET_FAIL_CNT 5
 
@@ -3322,20 +3322,11 @@ static bool hclge_reset_err_handle(struct hclge_dev *hdev, bool is_timeout)
 		return false;
 	} else if (hdev->reset_fail_cnt < MAX_RESET_FAIL_CNT) {
 		hdev->reset_fail_cnt++;
-		if (is_timeout) {
-			set_bit(hdev->reset_type, &hdev->reset_pending);
-			dev_info(&hdev->pdev->dev,
-				 "re-schedule to wait for hw reset done\n");
-			return true;
-		}
-
-		dev_info(&hdev->pdev->dev, "Upgrade reset level\n");
-		hclge_clear_reset_cause(hdev);
-		set_bit(HNAE3_GLOBAL_RESET, &hdev->default_reset_request);
-		mod_timer(&hdev->reset_timer,
-			  jiffies + HCLGE_RESET_INTERVAL);
-
-		return false;
+		set_bit(hdev->reset_type, &hdev->reset_pending);
+		dev_info(&hdev->pdev->dev,
+			 "re-schedule reset task(%d)\n",
+			 hdev->reset_fail_cnt);
+		return true;
 	}
 
 	hclge_clear_reset_cause(hdev);
@@ -3382,7 +3373,6 @@ static int hclge_reset_stack(struct hclge_dev *hdev)
 static void hclge_reset(struct hclge_dev *hdev)
 {
 	struct hnae3_ae_dev *ae_dev = pci_get_drvdata(hdev->pdev);
-	bool is_timeout = false;
 	int ret;
 
 	/* Initialize ae_dev reset status as well, in case enet layer wants to
@@ -3410,10 +3400,8 @@ static void hclge_reset(struct hclge_dev *hdev)
 	if (ret)
 		goto err_reset;
 
-	if (hclge_reset_wait(hdev)) {
-		is_timeout = true;
+	if (hclge_reset_wait(hdev))
 		goto err_reset;
-	}
 
 	hdev->rst_stats.hw_reset_done_cnt++;
 
@@ -3465,7 +3453,7 @@ static void hclge_reset(struct hclge_dev *hdev)
 err_reset_lock:
 	rtnl_unlock();
 err_reset:
-	if (hclge_reset_err_handle(hdev, is_timeout))
+	if (hclge_reset_err_handle(hdev))
 		hclge_reset_task_schedule(hdev);
 }
 
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 01/11] net: hns3: add reset checking before set channels
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Jian Shen, Huazhong Tan
In-Reply-To: <1563938327-9865-1-git-send-email-tanhuazhong@huawei.com>

From: Jian Shen <shenjian15@huawei.com>

hns3_set_channels() should check the resetting status firstly,
since the device will reinitialize when resetting. If the
reset has not completed, the hns3_set_channels() may access
invalid memory.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 69f7ef8..08af782 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -4378,6 +4378,9 @@ int hns3_set_channels(struct net_device *netdev,
 	u16 org_tqp_num;
 	int ret;
 
+	if (hns3_nic_resetting(netdev))
+		return -EBUSY;
+
 	if (ch->rx_count || ch->tx_count)
 		return -EINVAL;
 
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 02/11] net: hns3: add a check for get_reset_level
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Guangbin Huang, Huazhong Tan
In-Reply-To: <1563938327-9865-1-git-send-email-tanhuazhong@huawei.com>

From: Guangbin Huang <huangguangbin@huawei.com>

For some cases, ops->get_reset_level may not be implemented, so we
should check whether it is NULL before calling get_reset_level.

Signed-off-by: Guangbin Huang <huangguangbin@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 08af782..4d58c53 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -1963,7 +1963,7 @@ static pci_ers_result_t hns3_slot_reset(struct pci_dev *pdev)
 
 	ops = ae_dev->ops;
 	/* request the reset */
-	if (ops->reset_event) {
+	if (ops->reset_event && ops->get_reset_level) {
 		if (ae_dev->hw_err_reset_req) {
 			reset_type = ops->get_reset_level(ae_dev,
 						&ae_dev->hw_err_reset_req);
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 10/11] net: hns3: Add support for using order 1 pages with a 4K buffer
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Yunsheng Lin, Huazhong Tan
In-Reply-To: <1563938327-9865-1-git-send-email-tanhuazhong@huawei.com>

From: Yunsheng Lin <linyunsheng@huawei.com>

Hardware supports 0.5K, 1K, 2K, 4K RX buffer size, the
RX buffer can not be reused because the hns3_page_order
return 0 when page size and RX buffer size are both 4096.

So this patch changes the hns3_page_order to return 1 when
RX buffer is greater than half of the page size and page size
is less the 8192, and dev_alloc_pages has already been used
to allocate the compound page for RX buffer.

This patch also changes hnae3_* to hns3_* for page order
and RX buffer size calculation because they are used in
hns3 module.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 10 +++++-----
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 15 ++++++++++++---
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index cff5d59..56af4be 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -2086,7 +2086,7 @@ static void hns3_set_default_feature(struct net_device *netdev)
 static int hns3_alloc_buffer(struct hns3_enet_ring *ring,
 			     struct hns3_desc_cb *cb)
 {
-	unsigned int order = hnae3_page_order(ring);
+	unsigned int order = hns3_page_order(ring);
 	struct page *p;
 
 	p = dev_alloc_pages(order);
@@ -2097,7 +2097,7 @@ static int hns3_alloc_buffer(struct hns3_enet_ring *ring,
 	cb->page_offset = 0;
 	cb->reuse_flag = 0;
 	cb->buf  = page_address(p);
-	cb->length = hnae3_page_size(ring);
+	cb->length = hns3_page_size(ring);
 	cb->type = DESC_TYPE_PAGE;
 
 	return 0;
@@ -2400,7 +2400,7 @@ static void hns3_nic_reuse_page(struct sk_buff *skb, int i,
 {
 	struct hns3_desc *desc = &ring->desc[ring->next_to_clean];
 	int size = le16_to_cpu(desc->rx.size);
-	u32 truesize = hnae3_buf_size(ring);
+	u32 truesize = hns3_buf_size(ring);
 
 	skb_add_rx_frag(skb, i, desc_cb->priv, desc_cb->page_offset + pull_len,
 			size - pull_len, truesize);
@@ -2415,7 +2415,7 @@ static void hns3_nic_reuse_page(struct sk_buff *skb, int i,
 	/* Move offset up to the next cache line */
 	desc_cb->page_offset += truesize;
 
-	if (desc_cb->page_offset + truesize <= hnae3_page_size(ring)) {
+	if (desc_cb->page_offset + truesize <= hns3_page_size(ring)) {
 		desc_cb->reuse_flag = 1;
 		/* Bump ref count on page before it is given */
 		get_page(desc_cb->priv);
@@ -2697,7 +2697,7 @@ static int hns3_add_frag(struct hns3_enet_ring *ring, struct hns3_desc *desc,
 		}
 
 		if (ring->tail_skb) {
-			head_skb->truesize += hnae3_buf_size(ring);
+			head_skb->truesize += hns3_buf_size(ring);
 			head_skb->data_len += le16_to_cpu(desc->rx.size);
 			head_skb->len += le16_to_cpu(desc->rx.size);
 			skb = ring->tail_skb;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index 848b866..1a17856 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -608,9 +608,18 @@ static inline bool hns3_nic_resetting(struct net_device *netdev)
 
 #define tx_ring_data(priv, idx) ((priv)->ring_data[idx])
 
-#define hnae3_buf_size(_ring) ((_ring)->buf_size)
-#define hnae3_page_order(_ring) (get_order(hnae3_buf_size(_ring)))
-#define hnae3_page_size(_ring) (PAGE_SIZE << (u32)hnae3_page_order(_ring))
+#define hns3_buf_size(_ring) ((_ring)->buf_size)
+
+static inline unsigned int hns3_page_order(struct hns3_enet_ring *ring)
+{
+#if (PAGE_SIZE < 8192)
+	if (ring->buf_size > (PAGE_SIZE / 2))
+		return 1;
+#endif
+	return 0;
+}
+
+#define hns3_page_size(_ring) (PAGE_SIZE << hns3_page_order(_ring))
 
 /* iterator for handling rings in ring group */
 #define hns3_for_each_ring(pos, head) \
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 05/11] net: hns3: change GFP flag during lock period
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Yufeng Mo, lipeng 00277521, Huazhong Tan
In-Reply-To: <1563938327-9865-1-git-send-email-tanhuazhong@huawei.com>

From: Yufeng Mo <moyufeng@huawei.com>

When allocating memory, the GFP_KERNEL cannot be used during the
spin_lock period. This is because it may cause scheduling when holding
spin_lock. This patch changes GFP flag to GFP_ATOMIC in this case.

Fixes: dd74f815dd41 ("net: hns3: Add support for rule add/delete for flow director")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: lipeng 00277521 <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index a59d13f..f345095 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -5799,7 +5799,7 @@ static int hclge_add_fd_entry_by_arfs(struct hnae3_handle *handle, u16 queue_id,
 			return -ENOSPC;
 		}
 
-		rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+		rule = kzalloc(sizeof(*rule), GFP_ATOMIC);
 		if (!rule) {
 			spin_unlock_bh(&hdev->fd_rule_lock);
 
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 07/11] net: hns3: adds debug messages to identify eth down cause
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Yonglong Liu, Peng Li, Huazhong Tan
In-Reply-To: <1563938327-9865-1-git-send-email-tanhuazhong@huawei.com>

From: Yonglong Liu <liuyonglong@huawei.com>

Some times just see the eth interface have been down/up via
dmesg, but can not know why the eth down. So adds some debug
messages to identify the cause for this.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c    | 24 ++++++++++++++++++++
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 26 ++++++++++++++++++++++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c | 14 ++++++++++++
 3 files changed, 64 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 4d58c53..cff5d59 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -459,6 +459,10 @@ static int hns3_nic_net_open(struct net_device *netdev)
 		h->ae_algo->ops->set_timer_task(priv->ae_handle, true);
 
 	hns3_config_xps(priv);
+
+	if (netif_msg_ifup(h))
+		netdev_info(netdev, "net open\n");
+
 	return 0;
 }
 
@@ -519,6 +523,9 @@ static int hns3_nic_net_stop(struct net_device *netdev)
 	if (test_and_set_bit(HNS3_NIC_STATE_DOWN, &priv->state))
 		return 0;
 
+	if (netif_msg_ifdown(h))
+		netdev_info(netdev, "net stop\n");
+
 	if (h->ae_algo->ops->set_timer_task)
 		h->ae_algo->ops->set_timer_task(priv->ae_handle, false);
 
@@ -1550,6 +1557,9 @@ static int hns3_setup_tc(struct net_device *netdev, void *type_data)
 	h = hns3_get_handle(netdev);
 	kinfo = &h->kinfo;
 
+	if (netif_msg_ifdown(h))
+		netdev_info(netdev, "setup tc: num_tc=%d\n", tc);
+
 	return (kinfo->dcb_ops && kinfo->dcb_ops->setup_tc) ?
 		kinfo->dcb_ops->setup_tc(h, tc, prio_tc) : -EOPNOTSUPP;
 }
@@ -1593,6 +1603,11 @@ static int hns3_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan,
 	struct hnae3_handle *h = hns3_get_handle(netdev);
 	int ret = -EIO;
 
+	if (netif_msg_ifdown(h))
+		netdev_info(netdev,
+			    "set vf vlan: vf=%d, vlan=%d, qos=%d, vlan_proto=%d\n",
+			    vf, vlan, qos, vlan_proto);
+
 	if (h->ae_algo->ops->set_vf_vlan_filter)
 		ret = h->ae_algo->ops->set_vf_vlan_filter(h, vf, vlan,
 							  qos, vlan_proto);
@@ -1611,6 +1626,10 @@ static int hns3_nic_change_mtu(struct net_device *netdev, int new_mtu)
 	if (!h->ae_algo->ops->set_mtu)
 		return -EOPNOTSUPP;
 
+	if (netif_msg_ifdown(h))
+		netdev_info(netdev, "change mtu from %d to %d\n",
+			    netdev->mtu, new_mtu);
+
 	ret = h->ae_algo->ops->set_mtu(h, new_mtu);
 	if (ret)
 		netdev_err(netdev, "failed to change MTU in hardware %d\n",
@@ -4395,6 +4414,11 @@ int hns3_set_channels(struct net_device *netdev,
 	if (kinfo->rss_size == new_tqp_num)
 		return 0;
 
+	if (netif_msg_ifdown(h))
+		netdev_info(netdev,
+			    "set channels: tqp_num=%d, rxfh=%d\n",
+			    new_tqp_num, rxfh_configured);
+
 	ret = hns3_reset_notify(h, HNAE3_DOWN_CLIENT);
 	if (ret)
 		return ret;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index e71c92b..edb9845 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -311,6 +311,9 @@ static void hns3_self_test(struct net_device *ndev,
 	if (eth_test->flags != ETH_TEST_FL_OFFLINE)
 		return;
 
+	if (netif_msg_ifdown(h))
+		netdev_info(ndev, "self test start\n");
+
 	st_param[HNAE3_LOOP_APP][0] = HNAE3_LOOP_APP;
 	st_param[HNAE3_LOOP_APP][1] =
 			h->flags & HNAE3_SUPPORT_APP_LOOPBACK;
@@ -374,6 +377,9 @@ static void hns3_self_test(struct net_device *ndev,
 
 	if (if_running)
 		ndev->netdev_ops->ndo_open(ndev);
+
+	if (netif_msg_ifdown(h))
+		netdev_info(ndev, "self test end\n");
 }
 
 static int hns3_get_sset_count(struct net_device *netdev, int stringset)
@@ -604,6 +610,11 @@ static int hns3_set_pauseparam(struct net_device *netdev,
 {
 	struct hnae3_handle *h = hns3_get_handle(netdev);
 
+	if (netif_msg_ifdown(h))
+		netdev_info(netdev,
+			    "set pauseparam: autoneg=%d, rx:%d, tx:%d\n",
+			    param->autoneg, param->rx_pause, param->tx_pause);
+
 	if (h->ae_algo->ops->set_pauseparam)
 		return h->ae_algo->ops->set_pauseparam(h, param->autoneg,
 						       param->rx_pause,
@@ -743,6 +754,13 @@ static int hns3_set_link_ksettings(struct net_device *netdev,
 	if (cmd->base.speed == SPEED_1000 && cmd->base.duplex == DUPLEX_HALF)
 		return -EINVAL;
 
+	if (netif_msg_ifdown(handle))
+		netdev_info(netdev,
+			    "set link(%s): autoneg=%d, speed=%d, duplex=%d\n",
+			    netdev->phydev ? "phy" : "mac",
+			    cmd->base.autoneg, cmd->base.speed,
+			    cmd->base.duplex);
+
 	/* Only support ksettings_set for netdev with phy attached for now */
 	if (netdev->phydev)
 		return phy_ethtool_ksettings_set(netdev->phydev, cmd);
@@ -984,6 +1002,10 @@ static int hns3_nway_reset(struct net_device *netdev)
 		return -EINVAL;
 	}
 
+	if (netif_msg_ifdown(handle))
+		netdev_info(netdev, "nway reset (using %s)\n",
+			    phy ? "phy" : "mac");
+
 	if (phy)
 		return genphy_restart_aneg(phy);
 
@@ -1308,6 +1330,10 @@ static int hns3_set_fecparam(struct net_device *netdev,
 	if (!ops->set_fec)
 		return -EOPNOTSUPP;
 	fec_mode = eth_to_loc_fec(fec->fec);
+
+	if (netif_msg_ifdown(handle))
+		netdev_info(netdev, "set fecparam: mode=%d\n", fec_mode);
+
 	return ops->set_fec(handle, fec_mode);
 }
 
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
index bac4ce1..133e7c6 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
@@ -201,6 +201,7 @@ static int hclge_client_setup_tc(struct hclge_dev *hdev)
 static int hclge_ieee_setets(struct hnae3_handle *h, struct ieee_ets *ets)
 {
 	struct hclge_vport *vport = hclge_get_vport(h);
+	struct net_device *netdev = h->kinfo.netdev;
 	struct hclge_dev *hdev = vport->back;
 	bool map_changed = false;
 	u8 num_tc = 0;
@@ -215,6 +216,9 @@ static int hclge_ieee_setets(struct hnae3_handle *h, struct ieee_ets *ets)
 		return ret;
 
 	if (map_changed) {
+		if (netif_msg_ifdown(h))
+			netdev_info(netdev, "set ets\n");
+
 		ret = hclge_notify_client(hdev, HNAE3_DOWN_CLIENT);
 		if (ret)
 			return ret;
@@ -300,6 +304,7 @@ static int hclge_ieee_getpfc(struct hnae3_handle *h, struct ieee_pfc *pfc)
 static int hclge_ieee_setpfc(struct hnae3_handle *h, struct ieee_pfc *pfc)
 {
 	struct hclge_vport *vport = hclge_get_vport(h);
+	struct net_device *netdev = h->kinfo.netdev;
 	struct hclge_dev *hdev = vport->back;
 	u8 i, j, pfc_map, *prio_tc;
 
@@ -325,6 +330,11 @@ static int hclge_ieee_setpfc(struct hnae3_handle *h, struct ieee_pfc *pfc)
 	hdev->tm_info.hw_pfc_map = pfc_map;
 	hdev->tm_info.pfc_en = pfc->pfc_en;
 
+	if (netif_msg_ifdown(h))
+		netdev_info(netdev,
+			    "set pfc: pfc_en=%d, pfc_map=%d, num_tc=%d\n",
+			    pfc->pfc_en, pfc_map, hdev->tm_info.num_tc);
+
 	hclge_tm_pfc_info_update(hdev);
 
 	return hclge_pause_setup_hw(hdev, false);
@@ -345,8 +355,12 @@ static u8 hclge_getdcbx(struct hnae3_handle *h)
 static u8 hclge_setdcbx(struct hnae3_handle *h, u8 mode)
 {
 	struct hclge_vport *vport = hclge_get_vport(h);
+	struct net_device *netdev = h->kinfo.netdev;
 	struct hclge_dev *hdev = vport->back;
 
+	if (netif_msg_drv(h))
+		netdev_info(netdev, "set dcbx: mode=%d\n", mode);
+
 	/* No support for LLD_MANAGED modes or CEE */
 	if ((mode & DCB_CAP_DCBX_LLD_MANAGED) ||
 	    (mode & DCB_CAP_DCBX_VER_CEE) ||
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 04/11] net: hns3: fix mis-counting IRQ vector numbers issue
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Yonglong Liu, Peng Li, Huazhong Tan
In-Reply-To: <1563938327-9865-1-git-send-email-tanhuazhong@huawei.com>

From: Yonglong Liu <liuyonglong@huawei.com>

The num_msi_left means the vector numbers of NIC, but if the
PF supported RoCE, it contains the vector numbers of NIC and
RoCE(Not expected).

This may cause interrupts lost in some case, because of the
NIC module used the vector resources which belongs to RoCE.

This patch corrects the value of num_msi_left to be equals to
the vector numbers of NIC, and adjust the default tqp numbers
according to the value of num_msi_left.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c   |  5 ++++-
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 12 ++++++++++--
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 3c64d70..a59d13f 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -1470,13 +1470,16 @@ static int hclge_vport_setup(struct hclge_vport *vport, u16 num_tqps)
 {
 	struct hnae3_handle *nic = &vport->nic;
 	struct hclge_dev *hdev = vport->back;
+	u16 alloc_tqps;
 	int ret;
 
 	nic->pdev = hdev->pdev;
 	nic->ae_algo = &ae_algo;
 	nic->numa_node_mask = hdev->numa_node_mask;
 
-	ret = hclge_knic_setup(vport, num_tqps,
+	alloc_tqps = min_t(u16, hdev->roce_base_msix_offset - 1, num_tqps);
+
+	ret = hclge_knic_setup(vport, alloc_tqps,
 			       hdev->num_tx_desc, hdev->num_rx_desc);
 	if (ret)
 		dev_err(&hdev->pdev->dev, "knic setup failed %d\n", ret);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index a13a0e1..db84782 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -287,6 +287,14 @@ static int hclgevf_get_queue_info(struct hclgevf_dev *hdev)
 	memcpy(&hdev->rss_size_max, &resp_msg[2], sizeof(u16));
 	memcpy(&hdev->rx_buf_len, &resp_msg[4], sizeof(u16));
 
+	/* if irq is not enough, let tqps have the same value of irqs,
+	 * to make sure one irq just bind to one tqp, this can improve
+	 * the performance
+	 */
+	hdev->num_tqps = min_t(u16, hdev->roce_base_msix_offset - 1,
+			       hdev->num_tqps);
+	hdev->rss_size_max = min_t(u16, hdev->rss_size_max, hdev->num_tqps);
+
 	return 0;
 }
 
@@ -2208,7 +2216,7 @@ static int hclgevf_init_msi(struct hclgevf_dev *hdev)
 	int vectors;
 	int i;
 
-	if (hnae3_get_bit(hdev->ae_dev->flag, HNAE3_DEV_SUPPORT_ROCE_B))
+	if (hnae3_dev_roce_supported(hdev))
 		vectors = pci_alloc_irq_vectors(pdev,
 						hdev->roce_base_msix_offset + 1,
 						hdev->num_msi,
@@ -2495,7 +2503,7 @@ static int hclgevf_query_vf_resource(struct hclgevf_dev *hdev)
 
 	req = (struct hclgevf_query_res_cmd *)desc.data;
 
-	if (hnae3_get_bit(hdev->ae_dev->flag, HNAE3_DEV_SUPPORT_ROCE_B)) {
+	if (hnae3_dev_roce_supported(hdev)) {
 		hdev->roce_base_msix_offset =
 		hnae3_get_field(__le16_to_cpu(req->msixcap_localid_ba_rocee),
 				HCLGEVF_MSIX_OFT_ROCEE_M,
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next 11/11] net: hns3: add support for handling IMP error
From: Huazhong Tan @ 2019-07-24  3:18 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	Weihang Li, Peng Li, Huazhong Tan
In-Reply-To: <1563938327-9865-1-git-send-email-tanhuazhong@huawei.com>

From: Weihang Li <liweihang@hisilicon.com>

When IMP goes errors, the hardware reports a RAS to the driver,
the driver record this kind of error. Then a IMP reset will happen,
the driver checks the reason and takes the corresponding action
when doing IMP reset.

So this patch adds imp_err_state field to the struct hclge_dev
to record the error type, and handle_imp_error ops to handle it.

Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.h        |  1 +
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c | 37 ++++++++++++++++++++--
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h |  4 +++
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    | 29 +++++++++++++++++
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |  9 ++++++
 5 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index a4624db..3a1d6cc 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -515,6 +515,7 @@ struct hnae3_ae_ops {
 	int (*mac_connect_phy)(struct hnae3_handle *handle);
 	void (*mac_disconnect_phy)(struct hnae3_handle *handle);
 	void (*restore_vlan_table)(struct hnae3_handle *handle);
+	void (*handle_imp_error)(struct hnae3_handle *handle);
 };
 
 struct hnae3_dcb_ops {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
index 0a72438..25df66d 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
@@ -683,6 +683,28 @@ static int hclge_cmd_query_error(struct hclge_dev *hdev,
 	return ret;
 }
 
+static int hclge_check_imp_poison_err(struct hclge_dev *hdev)
+{
+	struct device *dev = &hdev->pdev->dev;
+	int ras_status;
+	int ret = false;
+
+	ras_status = hclge_read_dev(&hdev->hw, HCLGE_PF_OTHER_INT_REG);
+	if (ras_status & HCLGE_RAS_IMP_RD_POISON_MASK) {
+		set_bit(HCLGE_IMP_RD_POISON, &hdev->imp_err_state);
+		/* This error will be handle by IMP reset */
+		dev_info(dev, "IMP RD poison detected!\n");
+		ret = true;
+	} else if (ras_status & HCLGE_RAS_IMP_CMDQ_ERR_MASK) {
+		set_bit(HCLGE_IMP_CMDQ_ERROR, &hdev->imp_err_state);
+		/* This error will be handle by IMP reset */
+		dev_info(dev, "IMP CMDQ error detected!\n");
+		ret = true;
+	}
+
+	return ret;
+}
+
 static int hclge_clear_mac_tnl_int(struct hclge_dev *hdev)
 {
 	struct hclge_desc desc;
@@ -1321,10 +1343,12 @@ static int hclge_handle_pf_ras_error(struct hclge_dev *hdev,
 	/* log PPU(RCB) errors */
 	desc_data = (__le32 *)&desc[3];
 	status = le32_to_cpu(*desc_data) & HCLGE_PPU_PF_INT_RAS_MASK;
-	if (status)
+	if (status) {
 		hclge_log_error(dev, "PPU_PF_ABNORMAL_INT_ST0",
 				&hclge_ppu_pf_abnormal_int[0], status,
 				&ae_dev->hw_err_reset_req);
+		hdev->ppu_poison_ras_err = true;
+	}
 
 	/* clear all PF RAS errors */
 	hclge_cmd_reuse_desc(&desc[0], false);
@@ -1632,6 +1656,7 @@ pci_ers_result_t hclge_handle_hw_ras_error(struct hnae3_ae_dev *ae_dev)
 	struct hclge_dev *hdev = ae_dev->priv;
 	struct device *dev = &hdev->pdev->dev;
 	u32 status;
+	int ret;
 
 	if (!test_bit(HCLGE_STATE_SERVICE_INITED, &hdev->state)) {
 		dev_err(dev,
@@ -1639,6 +1664,9 @@ pci_ers_result_t hclge_handle_hw_ras_error(struct hnae3_ae_dev *ae_dev)
 		return PCI_ERS_RESULT_NONE;
 	}
 
+	if (hclge_check_imp_poison_err(hdev))
+		return PCI_ERS_RESULT_RECOVERED;
+
 	status = hclge_read_dev(&hdev->hw, HCLGE_RAS_PF_OTHER_INT_STS_REG);
 
 	if (status & HCLGE_RAS_REG_NFE_MASK ||
@@ -1652,7 +1680,12 @@ pci_ers_result_t hclge_handle_hw_ras_error(struct hnae3_ae_dev *ae_dev)
 		dev_warn(dev,
 			 "HNS Non-Fatal RAS error(status=0x%x) identified\n",
 			 status);
-		hclge_handle_all_ras_errors(hdev);
+		ret = hclge_handle_all_ras_errors(hdev);
+		if (ret) {
+			ret = hclge_check_imp_poison_err(hdev);
+			if (ret)
+				return PCI_ERS_RESULT_RECOVERED;
+		}
 	}
 
 	/* Handling Non-fatal Rocee RAS errors */
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
index 7ea8bb2..4839fc4 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
@@ -12,6 +12,10 @@
 #define HCLGE_PF_MSIX_INT_MIN_BD_NUM	4
 
 #define HCLGE_RAS_PF_OTHER_INT_STS_REG   0x20B00
+#define HCLGE_RAS_IMP_RD_POISON_MASK \
+	BIT(HCLGE_VECTOR0_IMP_RD_POISON_B)
+#define HCLGE_RAS_IMP_CMDQ_ERR_MASK \
+	BIT(HCLGE_VECTOR0_IMP_CMDQ_ERR_B)
 #define HCLGE_RAS_REG_NFE_MASK   0xFF00
 #define HCLGE_RAS_REG_ROCEE_ERR_MASK   0x3000000
 
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 45acbc9..36a2b65 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3281,6 +3281,7 @@ static int hclge_reset_prepare_wait(struct hclge_dev *hdev)
 {
 #define HCLGE_RESET_SYNC_TIME 100
 
+	struct hnae3_handle *handle = &hdev->vport[0].nic;
 	u32 reg_val;
 	int ret = 0;
 
@@ -3315,6 +3316,8 @@ static int hclge_reset_prepare_wait(struct hclge_dev *hdev)
 		hdev->rst_stats.flr_rst_cnt++;
 		break;
 	case HNAE3_IMP_RESET:
+		if (handle && handle->ae_algo->ops->handle_imp_error)
+			handle->ae_algo->ops->handle_imp_error(handle);
 		reg_val = hclge_read_dev(&hdev->hw, HCLGE_PF_OTHER_INT_REG);
 		hclge_write_dev(&hdev->hw, HCLGE_PF_OTHER_INT_REG,
 				BIT(HCLGE_VECTOR0_IMP_RESET_INT_B) | reg_val);
@@ -3517,6 +3520,9 @@ static void hclge_reset_event(struct pci_dev *pdev, struct hnae3_handle *handle)
 	else if (time_after(jiffies, (hdev->last_reset_time + 4 * 5 * HZ)))
 		hdev->reset_level = HNAE3_FUNC_RESET;
 
+	if (hdev->ppu_poison_ras_err)
+		hdev->ppu_poison_ras_err = false;
+
 	dev_info(&hdev->pdev->dev, "received reset event , reset type is %d",
 		 hdev->reset_level);
 
@@ -3545,6 +3551,27 @@ static void hclge_reset_timer(struct timer_list *t)
 	hclge_reset_event(hdev->pdev, NULL);
 }
 
+void hclge_handle_imp_error(struct hnae3_handle *handle)
+{
+	struct hclge_vport *vport = hclge_get_vport(handle);
+	struct hclge_dev *hdev = vport->back;
+	u32 reg_val;
+
+	if (test_and_clear_bit(HCLGE_IMP_RD_POISON, &hdev->imp_err_state)) {
+		dev_err(&hdev->pdev->dev, "Detected IMP RD poison!\n");
+		reg_val = hclge_read_dev(&hdev->hw, HCLGE_PF_OTHER_INT_REG) &
+			  ~BIT(HCLGE_VECTOR0_IMP_RD_POISON_B);
+		hclge_write_dev(&hdev->hw, HCLGE_PF_OTHER_INT_REG, reg_val);
+	}
+
+	if (test_and_clear_bit(HCLGE_IMP_CMDQ_ERROR, &hdev->imp_err_state)) {
+		dev_err(&hdev->pdev->dev, "Detected IMP CMDQ error!\n");
+		reg_val = hclge_read_dev(&hdev->hw, HCLGE_PF_OTHER_INT_REG) &
+			  ~BIT(HCLGE_VECTOR0_IMP_CMDQ_ERR_B);
+		hclge_write_dev(&hdev->hw, HCLGE_PF_OTHER_INT_REG, reg_val);
+	}
+}
+
 static void hclge_reset_subtask(struct hclge_dev *hdev)
 {
 	struct hnae3_ae_dev *ae_dev = pci_get_drvdata(hdev->pdev);
@@ -3560,6 +3587,7 @@ static void hclge_reset_subtask(struct hclge_dev *hdev)
 	 */
 	hdev->last_reset_time = jiffies;
 	hdev->reset_type = hclge_get_reset_level(ae_dev, &hdev->reset_pending);
+
 	if (hdev->reset_type != HNAE3_NONE_RESET)
 		hclge_reset(hdev);
 
@@ -9516,6 +9544,7 @@ static const struct hnae3_ae_ops hclge_ops = {
 	.mac_connect_phy = hclge_mac_connect_phy,
 	.mac_disconnect_phy = hclge_mac_disconnect_phy,
 	.restore_vlan_table = hclge_restore_vlan_table,
+	.handle_imp_error = hclge_handle_imp_error,
 };
 
 static struct hnae3_ae_algo ae_algo = {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
index 688e425..7b7ba30 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
@@ -178,6 +178,8 @@ enum HLCGE_PORT_TYPE {
 #define HCLGE_VECTOR0_RX_CMDQ_INT_B	1
 
 #define HCLGE_VECTOR0_IMP_RESET_INT_B	1
+#define HCLGE_VECTOR0_IMP_CMDQ_ERR_B	4
+#define HCLGE_VECTOR0_IMP_RD_POISON_B	5
 
 #define HCLGE_MAC_DEFAULT_FRAME \
 	(ETH_HLEN + ETH_FCS_LEN + 2 * VLAN_HLEN + ETH_DATA_LEN)
@@ -676,6 +678,11 @@ enum HCLGE_MAC_ADDR_TYPE {
 	HCLGE_MAC_ADDR_MC
 };
 
+enum HCLGE_IMP_ERR_TYPE {
+	HCLGE_IMP_RD_POISON,
+	HCLGE_IMP_CMDQ_ERROR,
+};
+
 struct hclge_vport_vlan_cfg {
 	struct list_head node;
 	int hd_tbl_status;
@@ -777,6 +784,8 @@ struct hclge_dev {
 	u8 tc_num_last_time;
 	enum hclge_fc_mode fc_mode_last_time;
 	u8 support_sfp_query;
+	bool ppu_poison_ras_err;
+	unsigned long imp_err_state;
 
 #define HCLGE_FLAG_TC_BASE_SCH_MODE		1
 #define HCLGE_FLAG_VNET_BASE_SCH_MODE		2
-- 
2.7.4


^ permalink raw reply related

* [PATCH] net: dsa: Check existence of .port_mdb_add callback before calling it
From: Chen-Yu Tsai @ 2019-07-24  3:47 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S . Miller
  Cc: Chen-Yu Tsai, netdev, linux-kernel

From: Chen-Yu Tsai <wens@csie.org>

With the recent addition of commit 75dad2520fc3 ("net: dsa: b53: Disable
all ports on setup"), users of b53 (BCM53125 on Lamobo R1 in my case)
are forced to use the dsa subsystem to enable the switch, instead of
having it in the default transparent "forward-to-all" mode.

The b53 driver does not support mdb bitmap functions. However the dsa
layer does not check for the existence of the .port_mdb_add callback
before actually using it. This results in a NULL pointer dereference,
as shown in the kernel oops below.

The other functions seem to be properly guarded. Do the same for
.port_mdb_add in dsa_switch_mdb_add_bitmap() as well.

b53 is not the only driver that doesn't support mdb bitmap functions.
Others include bcm_sf2, dsa_loop, lantiq_gswip, mt7530, mv88e6060,
qca8k, realtek-smi, and vitesse-vsc73xx.

    8<--- cut here ---
    Unable to handle kernel NULL pointer dereference at virtual address 00000000
    pgd = (ptrval)
    [00000000] *pgd=00000000
    Internal error: Oops: 80000005 [#1] SMP ARM
    Modules linked in: rtl8xxxu rtl8192cu rtl_usb rtl8192c_common rtlwifi mac80211 cfg80211
    CPU: 1 PID: 134 Comm: kworker/1:2 Not tainted 5.3.0-rc1-00247-gd3519030752a #1
    Hardware name: Allwinner sun7i (A20) Family
    Workqueue: events switchdev_deferred_process_work
    PC is at 0x0
    LR is at dsa_switch_event+0x570/0x620
    pc : [<00000000>]    lr : [<c08533ec>]    psr: 80070013
    sp : ee871db8  ip : 00000000  fp : ee98d0a4
    r10: 0000000c  r9 : 00000008  r8 : ee89f710
    r7 : ee98d040  r6 : ee98d088  r5 : c0f04c48  r4 : ee98d04c
    r3 : 00000000  r2 : ee89f710  r1 : 00000008  r0 : ee98d040
    Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
    Control: 10c5387d  Table: 6deb406a  DAC: 00000051
    Process kworker/1:2 (pid: 134, stack limit = 0x(ptrval))
    Stack: (0xee871db8 to 0xee872000)
    1da0:                                                       ee871e14 103ace2d
    1dc0: 00000000 ffffffff 00000000 ee871e14 00000005 00000000 c08524a0 00000000
    1de0: ffffe000 c014bdfc c0f04c48 ee871e98 c0f04c48 ee9e5000 c0851120 c014bef0
    1e00: 00000000 b643aea2 ee9b4068 c08509a8 ee2bf940 ee89f710 ee871ecb 00000000
    1e20: 00000008 103ace2d 00000000 c087e248 ee29c868 103ace2d 00000001 ffffffff
    1e40: 00000000 ee871e98 00000006 00000000 c0fb2a50 c087e2d0 ffffffff c08523c4
    1e60: ffffffff c014bdfc 00000006 c0fad2d0 ee871e98 ee89f710 00000000 c014c500
    1e80: 00000000 ee89f3c0 c0f04c48 00000000 ee9e5000 c087dfb4 ee9e5000 00000000
    1ea0: ee89f710 ee871ecb 00000001 103ace2d 00000000 c0f04c48 00000000 c087e0a8
    1ec0: 00000000 efd9a3e0 0089f3c0 103ace2d ee89f700 ee89f710 ee9e5000 00000122
    1ee0: 00000100 c087e130 ee89f700 c0fad2c8 c1003ef0 c087de4c 2e928000 c0fad2ec
    1f00: c0fad2ec ee839580 ef7a62c0 ef7a9400 00000000 c087def8 c0fad2ec c01447dc
    1f20: ef315640 ef7a62c0 00000008 ee839580 ee839594 ef7a62c0 00000008 c0f03d00
    1f40: ef7a62d8 ef7a62c0 ffffe000 c0145b84 ffffe000 c0fb2420 c0bfaa8c 00000000
    1f60: ffffe000 ee84b600 ee84b5c0 00000000 ee870000 ee839580 c0145b40 ef0e5ea4
    1f80: ee84b61c c014a6f8 00000001 ee84b5c0 c014a5b0 00000000 00000000 00000000
    1fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
    1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
    [<c08533ec>] (dsa_switch_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
    [<c014bdfc>] (notifier_call_chain) from [<c014bef0>] (raw_notifier_call_chain+0x18/0x20)
    [<c014bef0>] (raw_notifier_call_chain) from [<c08509a8>] (dsa_port_mdb_add+0x48/0x74)
    [<c08509a8>] (dsa_port_mdb_add) from [<c087e248>] (__switchdev_handle_port_obj_add+0x54/0xd4)
    [<c087e248>] (__switchdev_handle_port_obj_add) from [<c087e2d0>] (switchdev_handle_port_obj_add+0x8/0x14)
    [<c087e2d0>] (switchdev_handle_port_obj_add) from [<c08523c4>] (dsa_slave_switchdev_blocking_event+0x94/0xa4)
    [<c08523c4>] (dsa_slave_switchdev_blocking_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
    [<c014bdfc>] (notifier_call_chain) from [<c014c500>] (blocking_notifier_call_chain+0x50/0x68)
    [<c014c500>] (blocking_notifier_call_chain) from [<c087dfb4>] (switchdev_port_obj_notify+0x44/0xa8)
    [<c087dfb4>] (switchdev_port_obj_notify) from [<c087e0a8>] (switchdev_port_obj_add_now+0x90/0x104)
    [<c087e0a8>] (switchdev_port_obj_add_now) from [<c087e130>] (switchdev_port_obj_add_deferred+0x14/0x5c)
    [<c087e130>] (switchdev_port_obj_add_deferred) from [<c087de4c>] (switchdev_deferred_process+0x64/0x104)
    [<c087de4c>] (switchdev_deferred_process) from [<c087def8>] (switchdev_deferred_process_work+0xc/0x14)
    [<c087def8>] (switchdev_deferred_process_work) from [<c01447dc>] (process_one_work+0x218/0x50c)
    [<c01447dc>] (process_one_work) from [<c0145b84>] (worker_thread+0x44/0x5bc)
    [<c0145b84>] (worker_thread) from [<c014a6f8>] (kthread+0x148/0x150)
    [<c014a6f8>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
    Exception stack(0xee871fb0 to 0xee871ff8)
    1fa0:                                     00000000 00000000 00000000 00000000
    1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
    Code: bad PC value
    ---[ end trace 1292c61abd17b130 ]---

    [<c08533ec>] (dsa_switch_event) from [<c014bdfc>] (notifier_call_chain+0x48/0x84)
    corresponds to

	$ arm-linux-gnueabihf-addr2line -C -i -e vmlinux c08533ec

	linux/net/dsa/switch.c:156
	linux/net/dsa/switch.c:178
	linux/net/dsa/switch.c:328

Fixes: e6db98db8a95 ("net: dsa: add switch mdb bitmap functions")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
---

This piece of code was moved around in release 4.16, and probably
existed for even longer in some other form. Seems kind of odd that
I'm running into it just now. Maybe there's some other issue I'm
missing, or that it's specific to b53 only? Either way I get the
feeling that I missing something and this is only papering over it.

---
 net/dsa/switch.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 4ec5b7f85d51..13c7ef53e0de 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -147,14 +147,19 @@ dsa_switch_mdb_prepare_bitmap(struct dsa_switch *ds,
 	return 0;
 }
 
-static void dsa_switch_mdb_add_bitmap(struct dsa_switch *ds,
-				      const struct switchdev_obj_port_mdb *mdb,
-				      const unsigned long *bitmap)
+static int dsa_switch_mdb_add_bitmap(struct dsa_switch *ds,
+				     const struct switchdev_obj_port_mdb *mdb,
+				     const unsigned long *bitmap)
 {
 	int port;
 
+	if (!ds->ops->port_mdb_add)
+		return -EOPNOTSUPP;
+
 	for_each_set_bit(port, bitmap, ds->num_ports)
 		ds->ops->port_mdb_add(ds, port, mdb);
+
+	return 0;
 }
 
 static int dsa_switch_mdb_add(struct dsa_switch *ds,
@@ -175,9 +180,7 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
 	if (switchdev_trans_ph_prepare(trans))
 		return dsa_switch_mdb_prepare_bitmap(ds, mdb, ds->bitmap);
 
-	dsa_switch_mdb_add_bitmap(ds, mdb, ds->bitmap);
-
-	return 0;
+	return dsa_switch_mdb_add_bitmap(ds, mdb, ds->bitmap);
 }
 
 static int dsa_switch_mdb_del(struct dsa_switch *ds,
-- 
2.20.1


^ permalink raw reply related

* Re: Reminder: 3 open syzbot bugs in "net/ax25" subsystem
From: Joe @ 2019-07-24  3:42 UTC (permalink / raw)
  To: linux-hams, netdev, Ralf Baechle, David S. Miller, linux-kernel,
	syzkaller-bugs
In-Reply-To: <20190724024049.GC643@sol.localdomain>

Hi Eric, How do I get off of this thread? When I try to unsubscribe it 
tells me I'm not a member of the group.

Thanks in advance,

Joe G.

On 7/23/19 7:40 PM, Eric Biggers wrote:
> [This email was generated by a script.  Let me know if you have any suggestions
> to make it better, or if you want it re-generated with the latest status.]
>
> Of the currently open syzbot reports against the upstream kernel, I've manually
> marked 3 of them as possibly being bugs in the "net/ax25" subsystem.  I've
> listed these reports below, sorted by an algorithm that tries to list first the
> reports most likely to be still valid, important, and actionable.
>
> If you believe a bug is no longer valid, please close the syzbot report by
> sending a '#syz fix', '#syz dup', or '#syz invalid' command in reply to the
> original thread, as explained at https://goo.gl/tpsmEJ#status
>
> If you believe I misattributed a bug to the "net/ax25" subsystem, please let me
> know, and if possible forward the report to the correct people or mailing list.
>
> Here are the bugs:
>
> --------------------------------------------------------------------------------
> Title:              general protection fault in ax25_send_frame
> Last occurred:      0 days ago
> Reported:           204 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=1cdd5b120f129364fc8e9b2b027826cf99fa696e
> Original thread:    https://lkml.kernel.org/lkml/0000000000009ea37c057e58d787@google.com/T/#u
>
> Unfortunately, this bug does not have a reproducer.
>
> No one replied to the original thread for this bug.
>
> If you fix this bug, please add the following tag to the commit:
>      Reported-by: syzbot+e0b81535a27b8be39502@syzkaller.appspotmail.com
>
> If you send any email or patch for this bug, please consider replying to the
> original thread.  For the git send-email command to use, or tips on how to reply
> if the thread isn't in your mailbox, see the "Reply instructions" at
> https://lkml.kernel.org/r/0000000000009ea37c057e58d787@google.com
>
> --------------------------------------------------------------------------------
> Title:              KASAN: stack-out-of-bounds Write in ax25_getname
> Last occurred:      90 days ago
> Reported:           206 days ago
> Branches:           Mainline and others
> Dashboard link:     https://syzkaller.appspot.com/bug?id=fb195f91dc044978c1b186f1288b1eff61edcc20
> Original thread:    https://lkml.kernel.org/lkml/000000000000ed4120057e2df0c6@google.com/T/#u
>
> This bug has a syzkaller reproducer only.
>
> No one replied to the original thread for this bug.
>
> If you fix this bug, please add the following tag to the commit:
>      Reported-by: syzbot+6a29097222b4d3b8617c@syzkaller.appspotmail.com
>
> If you send any email or patch for this bug, please consider replying to the
> original thread.  For the git send-email command to use, or tips on how to reply
> if the thread isn't in your mailbox, see the "Reply instructions" at
> https://lkml.kernel.org/r/000000000000ed4120057e2df0c6@google.com
>
> --------------------------------------------------------------------------------
> Title:              inconsistent lock state in ax25_std_heartbeat_expiry
> Last occurred:      122 days ago
> Reported:           120 days ago
> Branches:           net
> Dashboard link:     https://syzkaller.appspot.com/bug?id=9086a8eac930890b2730d6441093bd478e32913f
> Original thread:    https://lkml.kernel.org/lkml/0000000000001b07250584efbee3@google.com/T/#u
>
> Unfortunately, this bug does not have a reproducer.
>
> The original thread for this bug received 2 replies; the last was 119 days ago.
>
> If you fix this bug, please add the following tag to the commit:
>      Reported-by: syzbot+e350b81e95a6a214da8a@syzkaller.appspotmail.com
>
> If you send any email or patch for this bug, please consider replying to the
> original thread.  For the git send-email command to use, or tips on how to reply
> if the thread isn't in your mailbox, see the "Reply instructions" at
> https://lkml.kernel.org/r/0000000000001b07250584efbee3@google.com
>

^ permalink raw reply

* Reminder: 3 open syzbot bugs in isdn subsystem
From: Eric Biggers @ 2019-07-24  3:57 UTC (permalink / raw)
  To: netdev, Karsten Keil; +Cc: linux-kernel, syzkaller-bugs

[This email was generated by a script.  Let me know if you have any suggestions
to make it better, or if you want it re-generated with the latest status.]

Of the currently open syzbot reports against the upstream kernel, I've manually
marked 3 of them as possibly being bugs in the isdn subsystem.  I've listed
these reports below, sorted by an algorithm that tries to list first the reports
most likely to be still valid, important, and actionable.

Of these 3 bugs, 1 was seen in mainline in the last week.

If you believe a bug is no longer valid, please close the syzbot report by
sending a '#syz fix', '#syz dup', or '#syz invalid' command in reply to the
original thread, as explained at https://goo.gl/tpsmEJ#status

If you believe I misattributed a bug to the isdn subsystem, please let me know,
and if possible forward the report to the correct people or mailing list.

Here are the bugs:

--------------------------------------------------------------------------------
Title:              KMSAN: uninit-value in capi_write
Last occurred:      1 day ago
Reported:           162 days ago
Branches:           Mainline (with KMSAN patches)
Dashboard link:     https://syzkaller.appspot.com/bug?id=1075c8d832330e876cad2c8c6872323f5211bc31
Original thread:    https://lkml.kernel.org/lkml/0000000000000e35f00581a579cd@google.com/T/#u

This bug has a C reproducer.

No one replied to the original thread for this bug.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+0849c524d9c634f5ae66@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/0000000000000e35f00581a579cd@google.com

--------------------------------------------------------------------------------
Title:              KASAN: use-after-free Read in dev_expire_timer
Last occurred:      168 days ago
Reported:           202 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=a520c896ffb4fbc8539cda4c0bb6fe48995defd7
Original thread:    https://lkml.kernel.org/lkml/000000000000a317f3057e766a0a@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

No one replied to the original thread for this bug.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+7ea31f54ecacb8f68330@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/000000000000a317f3057e766a0a@google.com

--------------------------------------------------------------------------------
Title:              BUG: spinlock bad magic in flush_workqueue_prep_pwqs
Last occurred:      144 days ago
Reported:           143 days ago
Branches:           Mainline
Dashboard link:     https://syzkaller.appspot.com/bug?id=cbd7a891bd8268f9eb68dfd1f3551670b4ccb412
Original thread:    https://lkml.kernel.org/lkml/0000000000009e31600583172454@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

No one replied to the original thread for this bug.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+130f0c441448a93a166b@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/0000000000009e31600583172454@google.com


^ permalink raw reply

* [PATCH 03/12] block: bio_release_pages: use flags arg instead of bool
From: john.hubbard @ 2019-07-24  4:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization, John Hubbard,
	Christoph Hellwig, Jérôme Glisse, Minwoo Im
In-Reply-To: <20190724042518.14363-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

In commit d241a95f3514 ("block: optionally mark pages dirty in
bio_release_pages"), new "bool mark_dirty" argument was added to
bio_release_pages.

In upcoming work, another bool argument (to indicate that the pages came
from get_user_pages) is going to be added. That's one bool too many,
because it's not desirable have calls of the form:

    foo(true, false, true, etc);

In order to prepare for that, change the argument from a bool, to a
typesafe (enum-based) flags argument.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Minwoo Im <minwoo.im.dev@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 block/bio.c         | 12 ++++++------
 fs/block_dev.c      |  4 ++--
 fs/direct-io.c      |  2 +-
 include/linux/bio.h | 13 ++++++++++++-
 4 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 299a0e7651ec..7675e2de509d 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -833,7 +833,7 @@ int bio_add_page(struct bio *bio, struct page *page,
 }
 EXPORT_SYMBOL(bio_add_page);
 
-void bio_release_pages(struct bio *bio, bool mark_dirty)
+void bio_release_pages(struct bio *bio, enum bio_rp_flags_t flags)
 {
 	struct bvec_iter_all iter_all;
 	struct bio_vec *bvec;
@@ -842,7 +842,7 @@ void bio_release_pages(struct bio *bio, bool mark_dirty)
 		return;
 
 	bio_for_each_segment_all(bvec, bio, iter_all) {
-		if (mark_dirty && !PageCompound(bvec->bv_page))
+		if ((flags & BIO_RP_MARK_DIRTY) && !PageCompound(bvec->bv_page))
 			set_page_dirty_lock(bvec->bv_page);
 		put_page(bvec->bv_page);
 	}
@@ -1421,7 +1421,7 @@ struct bio *bio_map_user_iov(struct request_queue *q,
 	return bio;
 
  out_unmap:
-	bio_release_pages(bio, false);
+	bio_release_pages(bio, BIO_RP_NORMAL);
 	bio_put(bio);
 	return ERR_PTR(ret);
 }
@@ -1437,7 +1437,7 @@ struct bio *bio_map_user_iov(struct request_queue *q,
  */
 void bio_unmap_user(struct bio *bio)
 {
-	bio_release_pages(bio, bio_data_dir(bio) == READ);
+	bio_release_pages(bio, bio_rp_dirty_flag(bio_data_dir(bio) == READ));
 	bio_put(bio);
 	bio_put(bio);
 }
@@ -1683,7 +1683,7 @@ static void bio_dirty_fn(struct work_struct *work)
 	while ((bio = next) != NULL) {
 		next = bio->bi_private;
 
-		bio_release_pages(bio, true);
+		bio_release_pages(bio, BIO_RP_MARK_DIRTY);
 		bio_put(bio);
 	}
 }
@@ -1699,7 +1699,7 @@ void bio_check_pages_dirty(struct bio *bio)
 			goto defer;
 	}
 
-	bio_release_pages(bio, false);
+	bio_release_pages(bio, BIO_RP_NORMAL);
 	bio_put(bio);
 	return;
 defer:
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 4707dfff991b..9fe6616f8788 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -259,7 +259,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 	}
 	__set_current_state(TASK_RUNNING);
 
-	bio_release_pages(&bio, should_dirty);
+	bio_release_pages(&bio, bio_rp_dirty_flag(should_dirty));
 	if (unlikely(bio.bi_status))
 		ret = blk_status_to_errno(bio.bi_status);
 
@@ -329,7 +329,7 @@ static void blkdev_bio_end_io(struct bio *bio)
 	if (should_dirty) {
 		bio_check_pages_dirty(bio);
 	} else {
-		bio_release_pages(bio, false);
+		bio_release_pages(bio, BIO_RP_NORMAL);
 		bio_put(bio);
 	}
 }
diff --git a/fs/direct-io.c b/fs/direct-io.c
index ae196784f487..423ef431ddda 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -551,7 +551,7 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
 	if (dio->is_async && should_dirty) {
 		bio_check_pages_dirty(bio);	/* transfers ownership */
 	} else {
-		bio_release_pages(bio, should_dirty);
+		bio_release_pages(bio, bio_rp_dirty_flag(should_dirty));
 		bio_put(bio);
 	}
 	return err;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 3cdb84cdc488..2715e55679c1 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -440,7 +440,18 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
 void __bio_add_page(struct bio *bio, struct page *page,
 		unsigned int len, unsigned int off);
 int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
-void bio_release_pages(struct bio *bio, bool mark_dirty);
+
+enum bio_rp_flags_t {
+	BIO_RP_NORMAL		= 0,
+	BIO_RP_MARK_DIRTY	= 1,
+};
+
+static inline enum bio_rp_flags_t bio_rp_dirty_flag(bool mark_dirty)
+{
+	return mark_dirty ? BIO_RP_MARK_DIRTY : BIO_RP_NORMAL;
+}
+
+void bio_release_pages(struct bio *bio, enum bio_rp_flags_t flags);
 struct rq_map_data;
 extern struct bio *bio_map_user_iov(struct request_queue *,
 				    struct iov_iter *, gfp_t);
-- 
2.22.0


^ permalink raw reply related

* [PATCH 05/12] block_dev: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-07-24  4:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization,
	Jérôme Glisse, John Hubbard, Jan Kara, Dan Williams,
	Johannes Thumshirn, Ming Lei, Dave Chinner, Boaz Harrosh
In-Reply-To: <20190724042518.14363-1-jhubbard@nvidia.com>

From: Jérôme Glisse <jglisse@redhat.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Changes from Jérôme's original patch:

* reworked to be compatible with recent bio_release_pages() changes.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-block@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
---
 block/bio.c         | 13 +++++++++++++
 fs/block_dev.c      | 22 +++++++++++++++++-----
 include/linux/bio.h |  8 ++++++++
 3 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 74f9eba2583b..3b9f66e64bc1 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1746,6 +1746,19 @@ void bio_check_pages_dirty(struct bio *bio)
 	__bio_check_pages_dirty(bio, false);
 }
 
+enum bio_rp_flags_t bio_rp_flags(struct iov_iter *iter, bool mark_dirty)
+{
+	enum bio_rp_flags_t flags = BIO_RP_NORMAL;
+
+	if (mark_dirty)
+		flags |= BIO_RP_MARK_DIRTY;
+
+	if (iov_iter_get_pages_use_gup(iter))
+		flags |= BIO_RP_FROM_GUP;
+
+	return flags;
+}
+
 void update_io_ticks(struct hd_struct *part, unsigned long now)
 {
 	unsigned long stamp;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 9fe6616f8788..d53abaf31e54 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -259,7 +259,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 	}
 	__set_current_state(TASK_RUNNING);
 
-	bio_release_pages(&bio, bio_rp_dirty_flag(should_dirty));
+	bio_release_pages(&bio, bio_rp_flags(iter, should_dirty));
 	if (unlikely(bio.bi_status))
 		ret = blk_status_to_errno(bio.bi_status);
 
@@ -295,7 +295,7 @@ static int blkdev_iopoll(struct kiocb *kiocb, bool wait)
 	return blk_poll(q, READ_ONCE(kiocb->ki_cookie), wait);
 }
 
-static void blkdev_bio_end_io(struct bio *bio)
+static void _blkdev_bio_end_io(struct bio *bio, bool from_gup)
 {
 	struct blkdev_dio *dio = bio->bi_private;
 	bool should_dirty = dio->should_dirty;
@@ -327,13 +327,23 @@ static void blkdev_bio_end_io(struct bio *bio)
 	}
 
 	if (should_dirty) {
-		bio_check_pages_dirty(bio);
+		__bio_check_pages_dirty(bio, from_gup);
 	} else {
-		bio_release_pages(bio, BIO_RP_NORMAL);
+		bio_release_pages(bio, bio_rp_gup_flag(from_gup));
 		bio_put(bio);
 	}
 }
 
+static void blkdev_bio_end_io(struct bio *bio)
+{
+	_blkdev_bio_end_io(bio, false);
+}
+
+static void blkdev_bio_from_gup_end_io(struct bio *bio)
+{
+	_blkdev_bio_end_io(bio, true);
+}
+
 static ssize_t
 __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 {
@@ -380,7 +390,9 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 		bio->bi_iter.bi_sector = pos >> 9;
 		bio->bi_write_hint = iocb->ki_hint;
 		bio->bi_private = dio;
-		bio->bi_end_io = blkdev_bio_end_io;
+		bio->bi_end_io = iov_iter_get_pages_use_gup(iter) ?
+				 blkdev_bio_from_gup_end_io :
+				 blkdev_bio_end_io;
 		bio->bi_ioprio = iocb->ki_ioprio;
 
 		ret = bio_iov_iter_get_pages(bio, iter);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index d68a40c2c9d4..b9460d1a4679 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -452,6 +452,13 @@ static inline enum bio_rp_flags_t bio_rp_dirty_flag(bool mark_dirty)
 	return mark_dirty ? BIO_RP_MARK_DIRTY : BIO_RP_NORMAL;
 }
 
+static inline enum bio_rp_flags_t bio_rp_gup_flag(bool from_gup)
+{
+	return from_gup ? BIO_RP_FROM_GUP : BIO_RP_NORMAL;
+}
+
+enum bio_rp_flags_t bio_rp_flags(struct iov_iter *iter, bool mark_dirty);
+
 void bio_release_pages(struct bio *bio, enum bio_rp_flags_t flags);
 struct rq_map_data;
 extern struct bio *bio_map_user_iov(struct request_queue *,
@@ -463,6 +470,7 @@ extern struct bio *bio_copy_kern(struct request_queue *, void *, unsigned int,
 				 gfp_t, int);
 extern void bio_set_pages_dirty(struct bio *bio);
 extern void bio_check_pages_dirty(struct bio *bio);
+void __bio_check_pages_dirty(struct bio *bio, bool from_gup);
 
 void generic_start_io_acct(struct request_queue *q, int op,
 				unsigned long sectors, struct hd_struct *part);
-- 
2.22.0


^ permalink raw reply related

* [PATCH 12/12] fs/ceph: fix a build warning: returning a value from void function
From: john.hubbard @ 2019-07-24  4:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization, John Hubbard
In-Reply-To: <20190724042518.14363-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

Trivial build warning fix: don't return a value from a function
whose type is "void".

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 fs/ceph/debugfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
index 2eb88ed22993..fa14c8e8761d 100644
--- a/fs/ceph/debugfs.c
+++ b/fs/ceph/debugfs.c
@@ -294,7 +294,7 @@ void ceph_fs_debugfs_init(struct ceph_fs_client *fsc)
 
 void ceph_fs_debugfs_init(struct ceph_fs_client *fsc)
 {
-	return 0;
+	return;
 }
 
 void ceph_fs_debugfs_cleanup(struct ceph_fs_client *fsc)
-- 
2.22.0


^ permalink raw reply related

* [PATCH 08/12] fs/cifs: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-07-24  4:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization,
	Jérôme Glisse, John Hubbard, Jan Kara, Dan Williams,
	Johannes Thumshirn, Ming Lei, Dave Chinner, Boaz Harrosh,
	Steve French
In-Reply-To: <20190724042518.14363-1-jhubbard@nvidia.com>

From: Jérôme Glisse <jglisse@redhat.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-block@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-cifs@vger.kernel.org
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Steve French <sfrench@samba.org>
---
 fs/cifs/cifsglob.h |  3 +++
 fs/cifs/file.c     | 22 +++++++++++++++++-----
 fs/cifs/misc.c     | 19 +++++++++++++++----
 3 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index fe610e7e3670..e95cb82bfa50 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1283,6 +1283,7 @@ struct cifs_aio_ctx {
 	 * If yes, iter is a copy of the user passed iov_iter
 	 */
 	bool			direct_io;
+	bool			from_gup;
 };
 
 struct cifs_readdata;
@@ -1317,6 +1318,7 @@ struct cifs_readdata {
 	struct cifs_credits		credits;
 	unsigned int			nr_pages;
 	struct page			**pages;
+	bool				from_gup;
 };
 
 struct cifs_writedata;
@@ -1343,6 +1345,7 @@ struct cifs_writedata {
 	struct cifs_credits		credits;
 	unsigned int			nr_pages;
 	struct page			**pages;
+	bool				from_gup;
 };
 
 /*
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 97090693d182..84fa7e0a578f 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2571,8 +2571,13 @@ cifs_uncached_writedata_release(struct kref *refcount)
 					struct cifs_writedata, refcount);
 
 	kref_put(&wdata->ctx->refcount, cifs_aio_ctx_release);
-	for (i = 0; i < wdata->nr_pages; i++)
-		put_page(wdata->pages[i]);
+	if (wdata->from_gup) {
+		for (i = 0; i < wdata->nr_pages; i++)
+			put_user_page(wdata->pages[i]);
+	} else {
+		for (i = 0; i < wdata->nr_pages; i++)
+			put_page(wdata->pages[i]);
+	}
 	cifs_writedata_release(refcount);
 }
 
@@ -2781,7 +2786,7 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
 				break;
 			}
 
-
+			wdata->from_gup = iov_iter_get_pages_use_gup(from);
 			wdata->page_offset = start;
 			wdata->tailsz =
 				nr_pages > 1 ?
@@ -2797,6 +2802,7 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
 				add_credits_and_wake_if(server, credits, 0);
 				break;
 			}
+			wdata->from_gup = false;
 
 			rc = cifs_write_allocate_pages(wdata->pages, nr_pages);
 			if (rc) {
@@ -3238,8 +3244,12 @@ cifs_uncached_readdata_release(struct kref *refcount)
 	unsigned int i;
 
 	kref_put(&rdata->ctx->refcount, cifs_aio_ctx_release);
-	for (i = 0; i < rdata->nr_pages; i++) {
-		put_page(rdata->pages[i]);
+	if (rdata->from_gup) {
+		for (i = 0; i < rdata->nr_pages; i++)
+			put_user_page(rdata->pages[i]);
+	} else {
+		for (i = 0; i < rdata->nr_pages; i++)
+			put_page(rdata->pages[i]);
 	}
 	cifs_readdata_release(refcount);
 }
@@ -3502,6 +3512,7 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
 				break;
 			}
 
+			rdata->from_gup = iov_iter_get_pages_use_gup(&direct_iov);
 			npages = (cur_len + start + PAGE_SIZE-1) / PAGE_SIZE;
 			rdata->page_offset = start;
 			rdata->tailsz = npages > 1 ?
@@ -3519,6 +3530,7 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
 				rc = -ENOMEM;
 				break;
 			}
+			rdata->from_gup = false;
 
 			rc = cifs_read_allocate_pages(rdata, npages);
 			if (rc) {
diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
index f383877a6511..5a04c34fea05 100644
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@ -822,10 +822,18 @@ cifs_aio_ctx_release(struct kref *refcount)
 	if (ctx->bv) {
 		unsigned i;
 
-		for (i = 0; i < ctx->npages; i++) {
-			if (ctx->should_dirty)
-				set_page_dirty(ctx->bv[i].bv_page);
-			put_page(ctx->bv[i].bv_page);
+		if (ctx->from_gup) {
+			for (i = 0; i < ctx->npages; i++) {
+				if (ctx->should_dirty)
+					set_page_dirty(ctx->bv[i].bv_page);
+				put_user_page(ctx->bv[i].bv_page);
+			}
+		} else {
+			for (i = 0; i < ctx->npages; i++) {
+				if (ctx->should_dirty)
+					set_page_dirty(ctx->bv[i].bv_page);
+				put_page(ctx->bv[i].bv_page);
+			}
 		}
 		kvfree(ctx->bv);
 	}
@@ -881,6 +889,9 @@ setup_aio_ctx_iter(struct cifs_aio_ctx *ctx, struct iov_iter *iter, int rw)
 
 	saved_len = count;
 
+	/* This is only use by cifs_aio_ctx_release() */
+	ctx->from_gup = iov_iter_get_pages_use_gup(iter);
+
 	while (count && npages < max_pages) {
 		rc = iov_iter_get_pages(iter, pages, count, max_pages, &start);
 		if (rc < 0) {
-- 
2.22.0


^ permalink raw reply related

* [PATCH 11/12] 9p/net: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-07-24  4:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization,
	Jérôme Glisse, John Hubbard, Jan Kara, Dan Williams,
	Johannes Thumshirn, Ming Lei, Dave Chinner, Boaz Harrosh
In-Reply-To: <20190724042518.14363-1-jhubbard@nvidia.com>

From: Jérôme Glisse <jglisse@redhat.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-block@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: v9fs-developer@lists.sourceforge.net
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: Dominique Martinet <asmadeus@codewreck.org>
---
 net/9p/trans_common.c | 14 ++++++++++----
 net/9p/trans_common.h |  3 ++-
 net/9p/trans_virtio.c | 18 +++++++++++++-----
 3 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/net/9p/trans_common.c b/net/9p/trans_common.c
index 3dff68f05fb9..e5c359c369a6 100644
--- a/net/9p/trans_common.c
+++ b/net/9p/trans_common.c
@@ -19,12 +19,18 @@
 /**
  *  p9_release_pages - Release pages after the transaction.
  */
-void p9_release_pages(struct page **pages, int nr_pages)
+void p9_release_pages(struct page **pages, int nr_pages, bool from_gup)
 {
 	int i;
 
-	for (i = 0; i < nr_pages; i++)
-		if (pages[i])
-			put_page(pages[i]);
+	if (from_gup) {
+		for (i = 0; i < nr_pages; i++)
+			if (pages[i])
+				put_user_page(pages[i]);
+	} else {
+		for (i = 0; i < nr_pages; i++)
+			if (pages[i])
+				put_page(pages[i]);
+	}
 }
 EXPORT_SYMBOL(p9_release_pages);
diff --git a/net/9p/trans_common.h b/net/9p/trans_common.h
index c43babb3f635..dcf025867314 100644
--- a/net/9p/trans_common.h
+++ b/net/9p/trans_common.h
@@ -12,4 +12,5 @@
  *
  */
 
-void p9_release_pages(struct page **, int);
+void p9_release_pages(struct page **pages, int nr_pages, bool from_gup);
+
diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index a3cd90a74012..3714ca5ecdc2 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -306,11 +306,14 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
 			       struct iov_iter *data,
 			       int count,
 			       size_t *offs,
-			       int *need_drop)
+			       int *need_drop,
+			       bool *from_gup)
 {
 	int nr_pages;
 	int err;
 
+	*from_gup = false;
+
 	if (!iov_iter_count(data))
 		return 0;
 
@@ -332,6 +335,7 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
 		*need_drop = 1;
 		nr_pages = DIV_ROUND_UP(n + *offs, PAGE_SIZE);
 		atomic_add(nr_pages, &vp_pinned);
+		*from_gup = iov_iter_get_pages_use_gup(data);
 		return n;
 	} else {
 		/* kernel buffer, no need to pin pages */
@@ -397,13 +401,15 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	size_t offs;
 	int need_drop = 0;
 	int kicked = 0;
+	bool in_from_gup, out_from_gup;
 
 	p9_debug(P9_DEBUG_TRANS, "virtio request\n");
 
 	if (uodata) {
 		__le32 sz;
 		int n = p9_get_mapped_pages(chan, &out_pages, uodata,
-					    outlen, &offs, &need_drop);
+					    outlen, &offs, &need_drop,
+					    &out_from_gup);
 		if (n < 0) {
 			err = n;
 			goto err_out;
@@ -422,7 +428,8 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 		memcpy(&req->tc.sdata[0], &sz, sizeof(sz));
 	} else if (uidata) {
 		int n = p9_get_mapped_pages(chan, &in_pages, uidata,
-					    inlen, &offs, &need_drop);
+					    inlen, &offs, &need_drop,
+					    &in_from_gup);
 		if (n < 0) {
 			err = n;
 			goto err_out;
@@ -504,11 +511,12 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 err_out:
 	if (need_drop) {
 		if (in_pages) {
-			p9_release_pages(in_pages, in_nr_pages);
+			p9_release_pages(in_pages, in_nr_pages, in_from_gup);
 			atomic_sub(in_nr_pages, &vp_pinned);
 		}
 		if (out_pages) {
-			p9_release_pages(out_pages, out_nr_pages);
+			p9_release_pages(out_pages, out_nr_pages,
+					 out_from_gup);
 			atomic_sub(out_nr_pages, &vp_pinned);
 		}
 		/* wakeup anybody waiting for slots to pin pages */
-- 
2.22.0


^ permalink raw reply related

* [PATCH 10/12] fs/ceph: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-07-24  4:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization,
	Jérôme Glisse, John Hubbard, Jan Kara, Dan Williams,
	Johannes Thumshirn, Ming Lei, Dave Chinner, Boaz Harrosh,
	Yan, Zheng, Sage Weil, Ilya Dryomov
In-Reply-To: <20190724042518.14363-1-jhubbard@nvidia.com>

From: Jérôme Glisse <jglisse@redhat.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Changes from Jérôme's original patch:

* Use the enhanced put_user_pages_dirty_lock().

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-block@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: ceph-devel@vger.kernel.org
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: "Yan, Zheng" <zyan@redhat.com>
Cc: Sage Weil <sage@redhat.com>
Cc: Ilya Dryomov <idryomov@gmail.com>
---
 fs/ceph/file.c | 62 ++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 48 insertions(+), 14 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 685a03cc4b77..c628a1f96978 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -158,18 +158,26 @@ static ssize_t iter_get_bvecs_alloc(struct iov_iter *iter, size_t maxsize,
 	return bytes;
 }
 
-static void put_bvecs(struct bio_vec *bvecs, int num_bvecs, bool should_dirty)
+static void put_bvecs(struct bio_vec *bv, int num_bvecs, bool should_dirty,
+		      bool from_gup)
 {
 	int i;
 
+
 	for (i = 0; i < num_bvecs; i++) {
-		if (bvecs[i].bv_page) {
+		if (!bv[i].bv_page)
+			continue;
+
+		if (from_gup) {
+			put_user_pages_dirty_lock(&bv[i].bv_page, 1,
+						  should_dirty);
+		} else {
 			if (should_dirty)
-				set_page_dirty_lock(bvecs[i].bv_page);
-			put_page(bvecs[i].bv_page);
+				set_page_dirty_lock(bv[i].bv_page);
+			put_page(bv[i].bv_page);
 		}
 	}
-	kvfree(bvecs);
+	kvfree(bv);
 }
 
 /*
@@ -730,6 +738,7 @@ struct ceph_aio_work {
 };
 
 static void ceph_aio_retry_work(struct work_struct *work);
+static void ceph_aio_from_gup_retry_work(struct work_struct *work);
 
 static void ceph_aio_complete(struct inode *inode,
 			      struct ceph_aio_request *aio_req)
@@ -774,7 +783,7 @@ static void ceph_aio_complete(struct inode *inode,
 	kfree(aio_req);
 }
 
-static void ceph_aio_complete_req(struct ceph_osd_request *req)
+static void _ceph_aio_complete_req(struct ceph_osd_request *req, bool from_gup)
 {
 	int rc = req->r_result;
 	struct inode *inode = req->r_inode;
@@ -793,7 +802,9 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
 
 		aio_work = kmalloc(sizeof(*aio_work), GFP_NOFS);
 		if (aio_work) {
-			INIT_WORK(&aio_work->work, ceph_aio_retry_work);
+			INIT_WORK(&aio_work->work, from_gup ?
+				  ceph_aio_from_gup_retry_work :
+				  ceph_aio_retry_work);
 			aio_work->req = req;
 			queue_work(ceph_inode_to_client(inode)->inode_wq,
 				   &aio_work->work);
@@ -830,7 +841,7 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
 	}
 
 	put_bvecs(osd_data->bvec_pos.bvecs, osd_data->num_bvecs,
-		  aio_req->should_dirty);
+		  aio_req->should_dirty, from_gup);
 	ceph_osdc_put_request(req);
 
 	if (rc < 0)
@@ -840,7 +851,17 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
 	return;
 }
 
-static void ceph_aio_retry_work(struct work_struct *work)
+static void ceph_aio_complete_req(struct ceph_osd_request *req)
+{
+	_ceph_aio_complete_req(req, false);
+}
+
+static void ceph_aio_from_gup_complete_req(struct ceph_osd_request *req)
+{
+	_ceph_aio_complete_req(req, true);
+}
+
+static void _ceph_aio_retry_work(struct work_struct *work, bool from_gup)
 {
 	struct ceph_aio_work *aio_work =
 		container_of(work, struct ceph_aio_work, work);
@@ -891,7 +912,8 @@ static void ceph_aio_retry_work(struct work_struct *work)
 
 	ceph_osdc_put_request(orig_req);
 
-	req->r_callback = ceph_aio_complete_req;
+	req->r_callback = from_gup ? ceph_aio_from_gup_complete_req :
+			  ceph_aio_complete_req;
 	req->r_inode = inode;
 	req->r_priv = aio_req;
 
@@ -899,13 +921,23 @@ static void ceph_aio_retry_work(struct work_struct *work)
 out:
 	if (ret < 0) {
 		req->r_result = ret;
-		ceph_aio_complete_req(req);
+		_ceph_aio_complete_req(req, from_gup);
 	}
 
 	ceph_put_snap_context(snapc);
 	kfree(aio_work);
 }
 
+static void ceph_aio_retry_work(struct work_struct *work)
+{
+	_ceph_aio_retry_work(work, false);
+}
+
+static void ceph_aio_from_gup_retry_work(struct work_struct *work)
+{
+	_ceph_aio_retry_work(work, true);
+}
+
 static ssize_t
 ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 		       struct ceph_snap_context *snapc,
@@ -927,6 +959,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 	loff_t pos = iocb->ki_pos;
 	bool write = iov_iter_rw(iter) == WRITE;
 	bool should_dirty = !write && iter_is_iovec(iter);
+	bool from_gup = iov_iter_get_pages_use_gup(iter);
 
 	if (write && ceph_snap(file_inode(file)) != CEPH_NOSNAP)
 		return -EROFS;
@@ -1023,7 +1056,8 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 			aio_req->num_reqs++;
 			atomic_inc(&aio_req->pending_reqs);
 
-			req->r_callback = ceph_aio_complete_req;
+			req->r_callback = !from_gup ? ceph_aio_complete_req :
+					  ceph_aio_from_gup_complete_req;
 			req->r_inode = inode;
 			req->r_priv = aio_req;
 			list_add_tail(&req->r_private_item, &aio_req->osd_reqs);
@@ -1054,7 +1088,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 				len = ret;
 		}
 
-		put_bvecs(bvecs, num_pages, should_dirty);
+		put_bvecs(bvecs, num_pages, should_dirty, from_gup);
 		ceph_osdc_put_request(req);
 		if (ret < 0)
 			break;
@@ -1093,7 +1127,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 							      req, false);
 			if (ret < 0) {
 				req->r_result = ret;
-				ceph_aio_complete_req(req);
+				_ceph_aio_complete_req(req, from_gup);
 			}
 		}
 		return -EIOCBQUEUED;
-- 
2.22.0


^ permalink raw reply related

* [PATCH 09/12] fs/fuse: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-07-24  4:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization,
	Jérôme Glisse, John Hubbard, Jan Kara, Dan Williams,
	Johannes Thumshirn, Ming Lei, Dave Chinner, Boaz Harrosh
In-Reply-To: <20190724042518.14363-1-jhubbard@nvidia.com>

From: Jérôme Glisse <jglisse@redhat.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Changes from Jérôme's original patch:

* Use the enhanced put_user_pages_dirty_lock().

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-block@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
---
 fs/fuse/dev.c  | 22 +++++++++++++++++----
 fs/fuse/file.c | 53 +++++++++++++++++++++++++++++++++++++-------------
 2 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index ea8237513dfa..8ef65c9cd3f6 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -780,6 +780,7 @@ struct fuse_copy_state {
 	unsigned len;
 	unsigned offset;
 	unsigned move_pages:1;
+	bool from_gup;
 };
 
 static void fuse_copy_init(struct fuse_copy_state *cs, int write,
@@ -800,13 +801,22 @@ static void fuse_copy_finish(struct fuse_copy_state *cs)
 			buf->len = PAGE_SIZE - cs->len;
 		cs->currbuf = NULL;
 	} else if (cs->pg) {
-		if (cs->write) {
-			flush_dcache_page(cs->pg);
-			set_page_dirty_lock(cs->pg);
+		if (cs->from_gup) {
+			if (cs->write) {
+				flush_dcache_page(cs->pg);
+				put_user_pages_dirty_lock(&cs->pg, 1, true);
+			} else
+				put_user_page(cs->pg);
+		} else {
+			if (cs->write) {
+				flush_dcache_page(cs->pg);
+				set_page_dirty_lock(cs->pg);
+			}
+			put_page(cs->pg);
 		}
-		put_page(cs->pg);
 	}
 	cs->pg = NULL;
+	cs->from_gup = false;
 }
 
 /*
@@ -834,6 +844,7 @@ static int fuse_copy_fill(struct fuse_copy_state *cs)
 			BUG_ON(!cs->nr_segs);
 			cs->currbuf = buf;
 			cs->pg = buf->page;
+			cs->from_gup = false;
 			cs->offset = buf->offset;
 			cs->len = buf->len;
 			cs->pipebufs++;
@@ -851,6 +862,7 @@ static int fuse_copy_fill(struct fuse_copy_state *cs)
 			buf->len = 0;
 
 			cs->currbuf = buf;
+			cs->from_gup = false;
 			cs->pg = page;
 			cs->offset = 0;
 			cs->len = PAGE_SIZE;
@@ -866,6 +878,7 @@ static int fuse_copy_fill(struct fuse_copy_state *cs)
 		cs->len = err;
 		cs->offset = off;
 		cs->pg = page;
+		cs->from_gup = iov_iter_get_pages_use_gup(cs->iter);
 		iov_iter_advance(cs->iter, err);
 	}
 
@@ -1000,6 +1013,7 @@ static int fuse_try_move_page(struct fuse_copy_state *cs, struct page **pagep)
 	unlock_page(newpage);
 out_fallback:
 	cs->pg = buf->page;
+	cs->from_gup = false;
 	cs->offset = buf->offset;
 
 	err = lock_request(cs->req);
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 5ae2828beb00..c34c22ac5b22 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -543,12 +543,20 @@ void fuse_read_fill(struct fuse_req *req, struct file *file, loff_t pos,
 	req->out.args[0].size = count;
 }
 
-static void fuse_release_user_pages(struct fuse_req *req, bool should_dirty)
+static void fuse_release_user_pages(struct fuse_req *req, bool should_dirty,
+				    bool from_gup)
 {
 	unsigned i;
 
+	if (from_gup) {
+		put_user_pages_dirty_lock(req->pages, req->num_pages,
+					  should_dirty);
+		return;
+	}
+
 	for (i = 0; i < req->num_pages; i++) {
 		struct page *page = req->pages[i];
+
 		if (should_dirty)
 			set_page_dirty_lock(page);
 		put_page(page);
@@ -621,12 +629,13 @@ static void fuse_aio_complete(struct fuse_io_priv *io, int err, ssize_t pos)
 	kref_put(&io->refcnt, fuse_io_release);
 }
 
-static void fuse_aio_complete_req(struct fuse_conn *fc, struct fuse_req *req)
+static void _fuse_aio_complete_req(struct fuse_conn *fc, struct fuse_req *req,
+				   bool from_gup)
 {
 	struct fuse_io_priv *io = req->io;
 	ssize_t pos = -1;
 
-	fuse_release_user_pages(req, io->should_dirty);
+	fuse_release_user_pages(req, io->should_dirty, from_gup);
 
 	if (io->write) {
 		if (req->misc.write.in.size != req->misc.write.out.size)
@@ -641,8 +650,18 @@ static void fuse_aio_complete_req(struct fuse_conn *fc, struct fuse_req *req)
 	fuse_aio_complete(io, req->out.h.error, pos);
 }
 
+static void fuse_aio_from_gup_complete_req(struct fuse_conn *fc, struct fuse_req *req)
+{
+	_fuse_aio_complete_req(fc, req, true);
+}
+
+static void fuse_aio_complete_req(struct fuse_conn *fc, struct fuse_req *req)
+{
+	_fuse_aio_complete_req(fc, req, false);
+}
+
 static size_t fuse_async_req_send(struct fuse_conn *fc, struct fuse_req *req,
-		size_t num_bytes, struct fuse_io_priv *io)
+		size_t num_bytes, struct fuse_io_priv *io, bool from_gup)
 {
 	spin_lock(&io->lock);
 	kref_get(&io->refcnt);
@@ -651,7 +670,8 @@ static size_t fuse_async_req_send(struct fuse_conn *fc, struct fuse_req *req,
 	spin_unlock(&io->lock);
 
 	req->io = io;
-	req->end = fuse_aio_complete_req;
+	req->end = from_gup ? fuse_aio_from_gup_complete_req :
+		   fuse_aio_complete_req;
 
 	__fuse_get_request(req);
 	fuse_request_send_background(fc, req);
@@ -660,7 +680,8 @@ static size_t fuse_async_req_send(struct fuse_conn *fc, struct fuse_req *req,
 }
 
 static size_t fuse_send_read(struct fuse_req *req, struct fuse_io_priv *io,
-			     loff_t pos, size_t count, fl_owner_t owner)
+			     loff_t pos, size_t count, fl_owner_t owner,
+			     bool from_gup)
 {
 	struct file *file = io->iocb->ki_filp;
 	struct fuse_file *ff = file->private_data;
@@ -675,7 +696,7 @@ static size_t fuse_send_read(struct fuse_req *req, struct fuse_io_priv *io,
 	}
 
 	if (io->async)
-		return fuse_async_req_send(fc, req, count, io);
+		return fuse_async_req_send(fc, req, count, io, from_gup);
 
 	fuse_request_send(fc, req);
 	return req->out.args[0].size;
@@ -755,7 +776,7 @@ static int fuse_do_readpage(struct file *file, struct page *page)
 	req->page_descs[0].length = count;
 	init_sync_kiocb(&iocb, file);
 	io = (struct fuse_io_priv) FUSE_IO_PRIV_SYNC(&iocb);
-	num_read = fuse_send_read(req, &io, pos, count, NULL);
+	num_read = fuse_send_read(req, &io, pos, count, NULL, false);
 	err = req->out.h.error;
 
 	if (!err) {
@@ -976,7 +997,8 @@ static void fuse_write_fill(struct fuse_req *req, struct fuse_file *ff,
 }
 
 static size_t fuse_send_write(struct fuse_req *req, struct fuse_io_priv *io,
-			      loff_t pos, size_t count, fl_owner_t owner)
+			      loff_t pos, size_t count, fl_owner_t owner,
+			      bool from_gup)
 {
 	struct kiocb *iocb = io->iocb;
 	struct file *file = iocb->ki_filp;
@@ -996,7 +1018,7 @@ static size_t fuse_send_write(struct fuse_req *req, struct fuse_io_priv *io,
 	}
 
 	if (io->async)
-		return fuse_async_req_send(fc, req, count, io);
+		return fuse_async_req_send(fc, req, count, io, from_gup);
 
 	fuse_request_send(fc, req);
 	return req->misc.write.out.size;
@@ -1031,7 +1053,7 @@ static size_t fuse_send_write_pages(struct fuse_req *req, struct kiocb *iocb,
 	for (i = 0; i < req->num_pages; i++)
 		fuse_wait_on_page_writeback(inode, req->pages[i]->index);
 
-	res = fuse_send_write(req, &io, pos, count, NULL);
+	res = fuse_send_write(req, &io, pos, count, NULL, false);
 
 	offset = req->page_descs[0].offset;
 	count = res;
@@ -1351,6 +1373,7 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
 	ssize_t res = 0;
 	struct fuse_req *req;
 	int err = 0;
+	bool from_gup = iov_iter_get_pages_use_gup(iter);
 
 	if (io->async)
 		req = fuse_get_req_for_background(fc, iov_iter_npages(iter,
@@ -1384,13 +1407,15 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
 				inarg = &req->misc.write.in;
 				inarg->write_flags |= FUSE_WRITE_KILL_PRIV;
 			}
-			nres = fuse_send_write(req, io, pos, nbytes, owner);
+			nres = fuse_send_write(req, io, pos, nbytes, owner,
+					       from_gup);
 		} else {
-			nres = fuse_send_read(req, io, pos, nbytes, owner);
+			nres = fuse_send_read(req, io, pos, nbytes, owner,
+					      from_gup);
 		}
 
 		if (!io->async)
-			fuse_release_user_pages(req, io->should_dirty);
+			fuse_release_user_pages(req, io->should_dirty, from_gup);
 		if (req->out.h.error) {
 			err = req->out.h.error;
 			break;
-- 
2.22.0


^ permalink raw reply related

* [PATCH 07/12] vhost-scsi: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-07-24  4:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization,
	Jérôme Glisse, John Hubbard, Jan Kara, Dan Williams,
	Johannes Thumshirn, Ming Lei, Dave Chinner, Boaz Harrosh,
	Paolo Bonzini, Stefan Hajnoczi
In-Reply-To: <20190724042518.14363-1-jhubbard@nvidia.com>

From: Jérôme Glisse <jglisse@redhat.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Changes from Jérôme's original patch:

* Changed a WARN_ON to a BUG_ON.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Cc: virtualization@lists.linux-foundation.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-block@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
---
 drivers/vhost/scsi.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index a9caf1bc3c3e..282565ab5e3f 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -329,11 +329,11 @@ static void vhost_scsi_release_cmd(struct se_cmd *se_cmd)
 
 	if (tv_cmd->tvc_sgl_count) {
 		for (i = 0; i < tv_cmd->tvc_sgl_count; i++)
-			put_page(sg_page(&tv_cmd->tvc_sgl[i]));
+			put_user_page(sg_page(&tv_cmd->tvc_sgl[i]));
 	}
 	if (tv_cmd->tvc_prot_sgl_count) {
 		for (i = 0; i < tv_cmd->tvc_prot_sgl_count; i++)
-			put_page(sg_page(&tv_cmd->tvc_prot_sgl[i]));
+			put_user_page(sg_page(&tv_cmd->tvc_prot_sgl[i]));
 	}
 
 	vhost_scsi_put_inflight(tv_cmd->inflight);
@@ -630,6 +630,13 @@ vhost_scsi_map_to_sgl(struct vhost_scsi_cmd *cmd,
 	size_t offset;
 	unsigned int npages = 0;
 
+	/*
+	 * Here in all cases we should have an IOVEC which use GUP. If that is
+	 * not the case then we will wrongly call put_user_page() and the page
+	 * refcount will go wrong (this is in vhost_scsi_release_cmd())
+	 */
+	WARN_ON(!iov_iter_get_pages_use_gup(iter));
+
 	bytes = iov_iter_get_pages(iter, pages, LONG_MAX,
 				VHOST_SCSI_PREALLOC_UPAGES, &offset);
 	/* No pages were pinned */
@@ -681,7 +688,7 @@ vhost_scsi_iov_to_sgl(struct vhost_scsi_cmd *cmd, bool write,
 			while (p < sg) {
 				struct page *page = sg_page(p++);
 				if (page)
-					put_page(page);
+					put_user_page(page);
 			}
 			return ret;
 		}
-- 
2.22.0


^ permalink raw reply related

* [PATCH 06/12] fs/nfs: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-07-24  4:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization,
	Jérôme Glisse, John Hubbard, Jan Kara, Dan Williams,
	Johannes Thumshirn, Ming Lei, Dave Chinner, Boaz Harrosh
In-Reply-To: <20190724042518.14363-1-jhubbard@nvidia.com>

From: Jérôme Glisse <jglisse@redhat.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-block@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-nfs@vger.kernel.org
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna.schumaker@netapp.com>
---
 fs/nfs/direct.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 0cb442406168..35f30fe2900f 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -512,7 +512,10 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
 			pos += req_len;
 			dreq->bytes_left -= req_len;
 		}
-		nfs_direct_release_pages(pagevec, npages);
+		if (iov_iter_get_pages_use_gup(iter))
+			put_user_pages(pagevec, npages);
+		else
+			nfs_direct_release_pages(pagevec, npages);
 		kvfree(pagevec);
 		if (result < 0)
 			break;
@@ -935,7 +938,10 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
 			pos += req_len;
 			dreq->bytes_left -= req_len;
 		}
-		nfs_direct_release_pages(pagevec, npages);
+		if (iov_iter_get_pages_use_gup(iter))
+			put_user_pages(pagevec, npages);
+		else
+			nfs_direct_release_pages(pagevec, npages);
 		kvfree(pagevec);
 		if (result < 0)
 			break;
-- 
2.22.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox