Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH v2 18/34] fbdev/pvr2fb: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Bartlomiej Zolnierkiewicz, Kees Cook,
	Al Viro, Bhumika Goyal, Arvind Yadav
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Bhumika Goyal <bhumirks@gmail.com>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-fbdev@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/video/fbdev/pvr2fb.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/video/fbdev/pvr2fb.c b/drivers/video/fbdev/pvr2fb.c
index 7ff4b6b84282..0e4f9aa6444d 100644
--- a/drivers/video/fbdev/pvr2fb.c
+++ b/drivers/video/fbdev/pvr2fb.c
@@ -700,8 +700,7 @@ static ssize_t pvr2fb_write(struct fb_info *info, const char *buf,
 	ret = count;
 
 out_unmap:
-	for (i = 0; i < nr_pages; i++)
-		put_page(pages[i]);
+	put_user_pages(pages, nr_pages);
 
 	kfree(pages);
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 20/34] xen: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Juergen Gross, Boris Ostrovsky
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

This also handles pages[i] == NULL cases, thanks to an approach
that is actually written by Juergen Gross.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>

Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel@lists.xenproject.org
---

Hi Juergen,

Say, this is *exactly* what you proposed in your gup.patch, so
I've speculatively added your Signed-off-by above, but need your
approval before that's final. Let me know please...

thanks,
John Hubbard


 drivers/xen/privcmd.c | 32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index c6070e70dd73..c7d0763ca8c2 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -582,10 +582,11 @@ static long privcmd_ioctl_mmap_batch(
 
 static int lock_pages(
 	struct privcmd_dm_op_buf kbufs[], unsigned int num,
-	struct page *pages[], unsigned int nr_pages)
+	struct page *pages[], unsigned int *nr_pages)
 {
-	unsigned int i;
+	unsigned int i, free = *nr_pages;
 
+	*nr_pages = 0;
 	for (i = 0; i < num; i++) {
 		unsigned int requested;
 		int pinned;
@@ -593,35 +594,22 @@ static int lock_pages(
 		requested = DIV_ROUND_UP(
 			offset_in_page(kbufs[i].uptr) + kbufs[i].size,
 			PAGE_SIZE);
-		if (requested > nr_pages)
+		if (requested > free)
 			return -ENOSPC;
 
 		pinned = get_user_pages_fast(
 			(unsigned long) kbufs[i].uptr,
-			requested, FOLL_WRITE, pages);
+			requested, FOLL_WRITE, pages + *nr_pages);
 		if (pinned < 0)
 			return pinned;
 
-		nr_pages -= pinned;
-		pages += pinned;
+		free -= pinned;
+		*nr_pages += pinned;
 	}
 
 	return 0;
 }
 
-static void unlock_pages(struct page *pages[], unsigned int nr_pages)
-{
-	unsigned int i;
-
-	if (!pages)
-		return;
-
-	for (i = 0; i < nr_pages; i++) {
-		if (pages[i])
-			put_page(pages[i]);
-	}
-}
-
 static long privcmd_ioctl_dm_op(struct file *file, void __user *udata)
 {
 	struct privcmd_data *data = file->private_data;
@@ -681,11 +669,12 @@ static long privcmd_ioctl_dm_op(struct file *file, void __user *udata)
 
 	xbufs = kcalloc(kdata.num, sizeof(*xbufs), GFP_KERNEL);
 	if (!xbufs) {
+		nr_pages = 0;
 		rc = -ENOMEM;
 		goto out;
 	}
 
-	rc = lock_pages(kbufs, kdata.num, pages, nr_pages);
+	rc = lock_pages(kbufs, kdata.num, pages, &nr_pages);
 	if (rc)
 		goto out;
 
@@ -699,7 +688,8 @@ static long privcmd_ioctl_dm_op(struct file *file, void __user *udata)
 	xen_preemptible_hcall_end();
 
 out:
-	unlock_pages(pages, nr_pages);
+	if (pages)
+		put_user_pages(pages, nr_pages);
 	kfree(xbufs);
 	kfree(pages);
 	kfree(kbufs);
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 19/34] fsl_hypervisor: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Al Viro, Kees Cook, Rob Herring
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

This changes the release code slightly, because each page slot in the
page_list[] array is no longer checked for NULL. However, that check
was wrong anyway, because the get_user_pages() pattern of usage here
never allowed for NULL entries within a range of pinned pages.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/virt/fsl_hypervisor.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/virt/fsl_hypervisor.c b/drivers/virt/fsl_hypervisor.c
index 93d5bebf9572..a8f78d572c45 100644
--- a/drivers/virt/fsl_hypervisor.c
+++ b/drivers/virt/fsl_hypervisor.c
@@ -292,11 +292,8 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy __user *p)
 		virt_to_phys(sg_list), num_pages);
 
 exit:
-	if (pages) {
-		for (i = 0; i < num_pages; i++)
-			if (pages[i])
-				put_page(pages[i]);
-	}
+	if (pages)
+		put_user_pages(pages, num_pages);
 
 	kfree(sg_list_unaligned);
 	kfree(pages);
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 17/34] vfio: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Alex Williamson
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Note that this effectively changes the code's behavior in
qp_release_pages(): it now ultimately calls set_page_dirty_lock(),
instead of set_page_dirty(). This is probably more accurate.

As Christoph Hellwig put it, "set_page_dirty() is only safe if we are
dealing with a file backed page where we have reference on the inode it
hangs off." [1]

[1] https://lore.kernel.org/r/20190723153640.GB720@lst.de

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: kvm@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 054391f30fa8..5a5461a14299 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -320,9 +320,9 @@ static int put_pfn(unsigned long pfn, int prot)
 {
 	if (!is_invalid_reserved_pfn(pfn)) {
 		struct page *page = pfn_to_page(pfn);
-		if (prot & IOMMU_WRITE)
-			SetPageDirty(page);
-		put_page(page);
+		bool dirty = prot & IOMMU_WRITE;
+
+		put_user_pages_dirty_lock(&page, 1, dirty);
 		return 1;
 	}
 	return 0;
@@ -356,7 +356,7 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
 		 */
 		if (ret > 0 && vma_is_fsdax(vmas[0])) {
 			ret = -EOPNOTSUPP;
-			put_page(page[0]);
+			put_user_page(page[0]);
 		}
 	}
 	up_read(&mm->mmap_sem);
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 15/34] staging/vc04_services: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Greg Kroah-Hartman, Eric Anholt,
	Stefan Wahren, Mihaela Muraru, Suniel Mahesh, Al Viro,
	Sidong Yang, Kishore KP
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Cc: Eric Anholt <eric@anholt.net>
Cc: Stefan Wahren <stefan.wahren@i2se.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mihaela Muraru <mihaela.muraru21@gmail.com>
Cc: Suniel Mahesh <sunil.m@techveda.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Sidong Yang <realwakka@gmail.com>
Cc: Kishore KP <kishore.p@techveda.org>
Cc: linux-rpi-kernel@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: devel@driverdev.osuosl.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 .../vc04_services/interface/vchiq_arm/vchiq_2835_arm.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
index 61c69f353cdb..ec92b4c50e95 100644
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
+++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
@@ -336,10 +336,7 @@ cleanup_pagelistinfo(struct vchiq_pagelist_info *pagelistinfo)
 	}
 
 	if (pagelistinfo->pages_need_release) {
-		unsigned int i;
-
-		for (i = 0; i < pagelistinfo->num_pages; i++)
-			put_page(pagelistinfo->pages[i]);
+		put_user_pages(pagelistinfo->pages, pagelistinfo->num_pages);
 	}
 
 	dma_free_coherent(g_dev, pagelistinfo->pagelist_buffer_size,
@@ -454,10 +451,7 @@ create_pagelist(char __user *buf, size_t count, unsigned short type)
 				       __func__, actual_pages, num_pages);
 
 			/* This is probably due to the process being killed */
-			while (actual_pages > 0) {
-				actual_pages--;
-				put_page(pages[actual_pages]);
-			}
+			put_user_pages(pages, actual_pages);
 			cleanup_pagelistinfo(pagelistinfo);
 			return NULL;
 		}
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 13/34] rapidio: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Matt Porter, Alexandre Bounine, Al Viro,
	Logan Gunthorpe, Christophe JAILLET, Ioan Nicu, Kees Cook,
	Tvrtko Ursulin
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Ioan Nicu <ioan.nicu.ext@nokia.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/rapidio/devices/rio_mport_cdev.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
index 8155f59ece38..0e8ea0e5a89e 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -572,14 +572,12 @@ static void dma_req_free(struct kref *ref)
 	struct mport_dma_req *req = container_of(ref, struct mport_dma_req,
 			refcount);
 	struct mport_cdev_priv *priv = req->priv;
-	unsigned int i;
 
 	dma_unmap_sg(req->dmach->device->dev,
 		     req->sgt.sgl, req->sgt.nents, req->dir);
 	sg_free_table(&req->sgt);
 	if (req->page_list) {
-		for (i = 0; i < req->nr_pages; i++)
-			put_page(req->page_list[i]);
+		put_user_pages(req->page_list, req->nr_pages);
 		kfree(req->page_list);
 	}
 
@@ -815,7 +813,7 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
 	struct mport_dma_req *req;
 	struct mport_dev *md = priv->md;
 	struct dma_chan *chan;
-	int i, ret;
+	int ret;
 	int nents;
 
 	if (xfer->length == 0)
@@ -946,8 +944,7 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
 
 err_pg:
 	if (!req->page_list) {
-		for (i = 0; i < nr_pages; i++)
-			put_page(page_list[i]);
+		put_user_pages(page_list, nr_pages);
 		kfree(page_list);
 	}
 err_req:
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 12/34] vmci: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Arnd Bergmann, Al Viro,
	Gustavo A . R . Silva, Kees Cook
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Note that this effectively changes the code's behavior in
qp_release_pages(): it now ultimately calls set_page_dirty_lock(),
instead of set_page_dirty(). This is probably more accurate.

As Christoph Hellwig put it, "set_page_dirty() is only safe if we are
dealing with a file backed page where we have reference on the inode it
hangs off." [1]

[1] https://lore.kernel.org/r/20190723153640.GB720@lst.de

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/misc/vmw_vmci/vmci_context.c    |  2 +-
 drivers/misc/vmw_vmci/vmci_queue_pair.c | 11 ++---------
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_context.c b/drivers/misc/vmw_vmci/vmci_context.c
index 16695366ec92..9daa52ee63b7 100644
--- a/drivers/misc/vmw_vmci/vmci_context.c
+++ b/drivers/misc/vmw_vmci/vmci_context.c
@@ -587,7 +587,7 @@ void vmci_ctx_unset_notify(struct vmci_ctx *context)
 
 	if (notify_page) {
 		kunmap(notify_page);
-		put_page(notify_page);
+		put_user_page(notify_page);
 	}
 }
 
diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c
index 8531ae781195..e5434551d0ef 100644
--- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
+++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
@@ -626,15 +626,8 @@ static void qp_release_queue_mutex(struct vmci_queue *queue)
 static void qp_release_pages(struct page **pages,
 			     u64 num_pages, bool dirty)
 {
-	int i;
-
-	for (i = 0; i < num_pages; i++) {
-		if (dirty)
-			set_page_dirty(pages[i]);
-
-		put_page(pages[i]);
-		pages[i] = NULL;
-	}
+	put_user_pages_dirty_lock(pages, num_pages, dirty);
+	memset(pages, 0, num_pages * sizeof(struct page *));
 }
 
 /*
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 14/34] oradax: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, David S . Miller, Jonathan Helman,
	Rob Gardner, Andy Shevchenko, Jonathan Corbet, Wei Yongjun,
	Mauro Carvalho Chehab
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: David S. Miller <davem@davemloft.net>
Cc: Jonathan Helman <jonathan.helman@oracle.com>
Cc: Rob Gardner <rob.gardner@oracle.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: sparclinux@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/sbus/char/oradax.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/sbus/char/oradax.c b/drivers/sbus/char/oradax.c
index 8af216287a84..029e619992fc 100644
--- a/drivers/sbus/char/oradax.c
+++ b/drivers/sbus/char/oradax.c
@@ -412,7 +412,7 @@ static void dax_unlock_pages(struct dax_ctx *ctx, int ccb_index, int nelem)
 				dax_dbg("freeing page %p", p);
 				if (j == OUT)
 					set_page_dirty(p);
-				put_page(p);
+				put_user_page(p);
 				ctx->pages[i][j] = NULL;
 			}
 		}
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 11/34] scif: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Sudeep Dutt, Ashutosh Dixit,
	Arnd Bergmann, Joerg Roedel, Robin Murphy, Zhen Lei
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/misc/mic/scif/scif_rma.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c
index 01e27682ea30..d84ed9466920 100644
--- a/drivers/misc/mic/scif/scif_rma.c
+++ b/drivers/misc/mic/scif/scif_rma.c
@@ -113,13 +113,14 @@ static int scif_destroy_pinned_pages(struct scif_pinned_pages *pin)
 	int writeable = pin->prot & SCIF_PROT_WRITE;
 	int kernel = SCIF_MAP_KERNEL & pin->map_flags;
 
-	for (j = 0; j < pin->nr_pages; j++) {
-		if (pin->pages[j] && !kernel) {
+	if (kernel) {
+		for (j = 0; j < pin->nr_pages; j++) {
 			if (writeable)
-				SetPageDirty(pin->pages[j]);
+				set_page_dirty_lock(pin->pages[j]);
 			put_page(pin->pages[j]);
 		}
-	}
+	} else
+		put_user_pages_dirty_lock(pin->pages, pin->nr_pages, writeable);
 
 	scif_free(pin->pages,
 		  pin->nr_pages * sizeof(*pin->pages));
@@ -1385,11 +1386,9 @@ int __scif_pin_pages(void *addr, size_t len, int *out_prot,
 				if (ulimit)
 					__scif_dec_pinned_vm_lock(mm, nr_pages);
 				/* Roll back any pinned pages */
-				for (i = 0; i < pinned_pages->nr_pages; i++) {
-					if (pinned_pages->pages[i])
-						put_page(
-						pinned_pages->pages[i]);
-				}
+				put_user_pages(pinned_pages->pages,
+					       pinned_pages->nr_pages);
+
 				prot &= ~SCIF_PROT_WRITE;
 				try_upgrade = false;
 				goto retry;
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 08/34] media/ivtv: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Andy Walls, Mauro Carvalho Chehab
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Andy Walls <awalls@md.metrocast.net>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-media@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/media/pci/ivtv/ivtv-udma.c | 14 ++++----------
 drivers/media/pci/ivtv/ivtv-yuv.c  | 11 +++--------
 2 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/drivers/media/pci/ivtv/ivtv-udma.c b/drivers/media/pci/ivtv/ivtv-udma.c
index 5f8883031c9c..7c7f33c2412b 100644
--- a/drivers/media/pci/ivtv/ivtv-udma.c
+++ b/drivers/media/pci/ivtv/ivtv-udma.c
@@ -92,7 +92,7 @@ int ivtv_udma_setup(struct ivtv *itv, unsigned long ivtv_dest_addr,
 {
 	struct ivtv_dma_page_info user_dma;
 	struct ivtv_user_dma *dma = &itv->udma;
-	int i, err;
+	int err;
 
 	IVTV_DEBUG_DMA("ivtv_udma_setup, dst: 0x%08x\n", (unsigned int)ivtv_dest_addr);
 
@@ -119,8 +119,7 @@ int ivtv_udma_setup(struct ivtv *itv, unsigned long ivtv_dest_addr,
 		IVTV_DEBUG_WARN("failed to map user pages, returned %d instead of %d\n",
 			   err, user_dma.page_count);
 		if (err >= 0) {
-			for (i = 0; i < err; i++)
-				put_page(dma->map[i]);
+			put_user_pages(dma->map, err);
 			return -EINVAL;
 		}
 		return err;
@@ -130,9 +129,7 @@ int ivtv_udma_setup(struct ivtv *itv, unsigned long ivtv_dest_addr,
 
 	/* Fill SG List with new values */
 	if (ivtv_udma_fill_sg_list(dma, &user_dma, 0) < 0) {
-		for (i = 0; i < dma->page_count; i++) {
-			put_page(dma->map[i]);
-		}
+		put_user_pages(dma->map, dma->page_count);
 		dma->page_count = 0;
 		return -ENOMEM;
 	}
@@ -153,7 +150,6 @@ int ivtv_udma_setup(struct ivtv *itv, unsigned long ivtv_dest_addr,
 void ivtv_udma_unmap(struct ivtv *itv)
 {
 	struct ivtv_user_dma *dma = &itv->udma;
-	int i;
 
 	IVTV_DEBUG_INFO("ivtv_unmap_user_dma\n");
 
@@ -170,9 +166,7 @@ void ivtv_udma_unmap(struct ivtv *itv)
 	ivtv_udma_sync_for_cpu(itv);
 
 	/* Release User Pages */
-	for (i = 0; i < dma->page_count; i++) {
-		put_page(dma->map[i]);
-	}
+	put_user_pages(dma->map, dma->page_count);
 	dma->page_count = 0;
 }
 
diff --git a/drivers/media/pci/ivtv/ivtv-yuv.c b/drivers/media/pci/ivtv/ivtv-yuv.c
index cd2fe2d444c0..2c61a11d391d 100644
--- a/drivers/media/pci/ivtv/ivtv-yuv.c
+++ b/drivers/media/pci/ivtv/ivtv-yuv.c
@@ -30,7 +30,6 @@ static int ivtv_yuv_prep_user_dma(struct ivtv *itv, struct ivtv_user_dma *dma,
 	struct yuv_playback_info *yi = &itv->yuv_info;
 	u8 frame = yi->draw_frame;
 	struct yuv_frame_info *f = &yi->new_frame_info[frame];
-	int i;
 	int y_pages, uv_pages;
 	unsigned long y_buffer_offset, uv_buffer_offset;
 	int y_decode_height, uv_decode_height, y_size;
@@ -81,8 +80,7 @@ static int ivtv_yuv_prep_user_dma(struct ivtv *itv, struct ivtv_user_dma *dma,
 				 uv_pages, uv_dma.page_count);
 
 			if (uv_pages >= 0) {
-				for (i = 0; i < uv_pages; i++)
-					put_page(dma->map[y_pages + i]);
+				put_user_pages(&dma->map[y_pages], uv_pages);
 				rc = -EFAULT;
 			} else {
 				rc = uv_pages;
@@ -93,8 +91,7 @@ static int ivtv_yuv_prep_user_dma(struct ivtv *itv, struct ivtv_user_dma *dma,
 				 y_pages, y_dma.page_count);
 		}
 		if (y_pages >= 0) {
-			for (i = 0; i < y_pages; i++)
-				put_page(dma->map[i]);
+			put_user_pages(dma->map, y_pages);
 			/*
 			 * Inherit the -EFAULT from rc's
 			 * initialization, but allow it to be
@@ -112,9 +109,7 @@ static int ivtv_yuv_prep_user_dma(struct ivtv *itv, struct ivtv_user_dma *dma,
 	/* Fill & map SG List */
 	if (ivtv_udma_fill_sg_list (dma, &uv_dma, ivtv_udma_fill_sg_list (dma, &y_dma, 0)) < 0) {
 		IVTV_DEBUG_WARN("could not allocate bounce buffers for highmem userspace buffers\n");
-		for (i = 0; i < dma->page_count; i++) {
-			put_page(dma->map[i]);
-		}
+		put_user_pages(dma->map, dma->page_count);
 		dma->page_count = 0;
 		return -ENOMEM;
 	}
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 07/34] drm/radeon: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Alex Deucher, Christian König,
	David Zhou, David Airlie
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: David (ChunMing) Zhou <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index fb3696bc616d..4c9943fa10df 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -540,7 +540,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 	kfree(ttm->sg);
 
 release_pages:
-	release_pages(ttm->pages, pinned);
+	put_user_pages(ttm->pages, pinned);
 	return r;
 }
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 10/34] genwqe: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Greg Kroah-Hartman, Frank Haverkamp,
	Guilherme G . Piccoli, Arnd Bergmann
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

This changes the release code slightly, because each page slot in the
page_list[] array is no longer checked for NULL. However, that check
was wrong anyway, because the get_user_pages() pattern of usage here
never allowed for NULL entries within a range of pinned pages.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Cc: Frank Haverkamp <haver@linux.vnet.ibm.com>
Cc: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/misc/genwqe/card_utils.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/drivers/misc/genwqe/card_utils.c b/drivers/misc/genwqe/card_utils.c
index 2e1c4d2905e8..2a888f31d2c5 100644
--- a/drivers/misc/genwqe/card_utils.c
+++ b/drivers/misc/genwqe/card_utils.c
@@ -517,24 +517,13 @@ int genwqe_free_sync_sgl(struct genwqe_dev *cd, struct genwqe_sgl *sgl)
 /**
  * genwqe_free_user_pages() - Give pinned pages back
  *
- * Documentation of get_user_pages is in mm/gup.c:
- *
- * If the page is written to, set_page_dirty (or set_page_dirty_lock,
- * as appropriate) must be called after the page is finished with, and
- * before put_page is called.
+ * The pages may have been written to, so we call put_user_pages_dirty_lock(),
+ * rather than put_user_pages().
  */
 static int genwqe_free_user_pages(struct page **page_list,
 			unsigned int nr_pages, int dirty)
 {
-	unsigned int i;
-
-	for (i = 0; i < nr_pages; i++) {
-		if (page_list[i] != NULL) {
-			if (dirty)
-				set_page_dirty_lock(page_list[i]);
-			put_page(page_list[i]);
-		}
-	}
+	put_user_pages_dirty_lock(page_list, nr_pages, dirty);
 	return 0;
 }
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 09/34] media/v4l2-core/mm: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Mauro Carvalho Chehab, Kees Cook,
	Hans Verkuil, Sakari Ailus, Robin Murphy, Souptick Joarder
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-media@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/media/v4l2-core/videobuf-dma-sg.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf-dma-sg.c b/drivers/media/v4l2-core/videobuf-dma-sg.c
index 66a6c6c236a7..d6eeb437ec19 100644
--- a/drivers/media/v4l2-core/videobuf-dma-sg.c
+++ b/drivers/media/v4l2-core/videobuf-dma-sg.c
@@ -349,8 +349,7 @@ int videobuf_dma_free(struct videobuf_dmabuf *dma)
 	BUG_ON(dma->sglen);
 
 	if (dma->pages) {
-		for (i = 0; i < dma->nr_pages; i++)
-			put_page(dma->pages[i]);
+		put_user_pages(dma->pages, dma->nr_pages);
 		kfree(dma->pages);
 		dma->pages = NULL;
 	}
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 06/34] drm/i915: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

This is a merge-able version of the fix, because it restricts
itself to put_user_page() and put_user_pages(), both of which
have not changed their APIs. Later, i915_gem_userptr_put_pages()
can be simplified to use put_user_pages_dirty_lock().

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 2caa594322bc..76dda2923cf1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -527,7 +527,7 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
 	}
 	mutex_unlock(&obj->mm.lock);
 
-	release_pages(pvec, pinned);
+	put_user_pages(pvec, pinned);
 	kvfree(pvec);
 
 	i915_gem_object_put(obj);
@@ -640,7 +640,7 @@ static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 		__i915_gem_userptr_set_active(obj, true);
 
 	if (IS_ERR(pages))
-		release_pages(pvec, pinned);
+		put_user_pages(pvec, pinned);
 	kvfree(pvec);
 
 	return PTR_ERR_OR_ZERO(pages);
@@ -675,7 +675,7 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 			set_page_dirty_lock(page);
 
 		mark_page_accessed(page);
-		put_page(page);
+		put_user_page(page);
 	}
 	obj->mm.dirty = false;
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH v2 04/34] x86/kvm: convert put_page() to put_user_page*()
From: john.hubbard @ 2019-08-04 22:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Joerg Roedel, Paolo Bonzini,
	Radim Krčmář, Thomas Gleixner, Ingo Molnar,
	H . Peter Anvin
In-Reply-To: <20190804224915.28669-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 arch/x86/kvm/svm.c  | 4 ++--
 virt/kvm/kvm_main.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7eafc6907861..ff93c923ed36 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1827,7 +1827,7 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
 
 err:
 	if (npinned > 0)
-		release_pages(pages, npinned);
+		put_user_pages(pages, npinned);
 
 	kvfree(pages);
 	return NULL;
@@ -1838,7 +1838,7 @@ static void sev_unpin_memory(struct kvm *kvm, struct page **pages,
 {
 	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
 
-	release_pages(pages, npages);
+	put_user_pages(pages, npages);
 	kvfree(pages);
 	sev->pages_locked -= npages;
 }
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 887f3b0c2b60..4b6a596ea8e9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1499,7 +1499,7 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 
 		if (__get_user_pages_fast(addr, 1, 1, &wpage) == 1) {
 			*writable = true;
-			put_page(page);
+			put_user_page(page);
 			page = wpage;
 		}
 	}
@@ -1831,7 +1831,7 @@ EXPORT_SYMBOL_GPL(kvm_release_page_clean);
 void kvm_release_pfn_clean(kvm_pfn_t pfn)
 {
 	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn))
-		put_page(pfn_to_page(pfn));
+		put_user_page(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_release_pfn_clean);
 
-- 
2.22.0


^ permalink raw reply related

* Re: [PATCH 1/1] bpf: introduce new helper udp_flow_src_port
From: Y Song @ 2019-08-04 23:04 UTC (permalink / raw)
  To: Farid Zakaria; +Cc: Alexei Starovoitov, Daniel Borkmann, netdev, bpf
In-Reply-To: <CACCo2j=RAua1E0d6E+tVoOG=q1sSLuZpLqx32dY4mmhYNtDzvg@mail.gmail.com>

On Sun, Aug 4, 2019 at 1:43 PM Farid Zakaria <farid.m.zakaria@gmail.com> wrote:
>
> * re-sending as I've sent previously as HTML ... sorry *
>
> First off, thank you for taking the time to review this patch.

You are welcome. Also, just let you know typically people do
interleaved reply instead of
top reply.

>
> It's not clear to me the backport you'd like for libbpf, I have been following
> the documentation outlined in
> https://www.kernel.org/doc/html/latest/bpf/index.html
> I will hold off on changes to this patch as you've asked for it going forward.

The document at the above link does not specify how patches are structured.
What I suggest is what people typically do here for each review and cherry-pick
in different cases. There are no functionality changes in your patch. Just
break it into three commits instead one. The cover letter is still needed.
There are more examples in here:
https://lore.kernel.org/bpf/

>
> This patch is inspired by a MPLSoUDP (https://tools.ietf.org/html/rfc7510)]
> kernel module that was ported to eBPF -- our implementation is slightly
> modified for our custom use case.
>
> The Linux kernel provides a single abstraction for the src port for
> UDP tunneling
> via udp_flow_src_port. If it's improved eBPF filters would benefit if
> the call is the same.
>
> Exposing this function to eBPF programs would maintain feature parity
> with other kernel
> tunneling implementations.

Thanks for providing the detailed information above. Such use case
information should be in the commit message to answer the question
like why we need this feature.

If I understand correctly, you try to do MPLS over UDP in bpf program,
right? Your want to the UDP source port to be the same as the one
computed by kernel? Approximation is possible but it may introduce
different behavior at routing side (ECMP) and you do not want that to happen.

Your test case only tries to modify a tcp packet source port and checks it
is indeed changed. It would be good if your test case can be a little bit
closer to your use case.

You can submit v2 of the patch set with 3 commits, more detailed
commit messages and possibly modified test cases, so we can
continue to review.

>
> Farid Zakaria
>
> Farid Zakaria
>
>
>
> On Sun, Aug 4, 2019 at 1:41 PM Farid Zakaria <farid.m.zakaria@gmail.com> wrote:
> >
> > First off, thank you for taking the time to review this patch.
> >
> > It's not clear to me the backport you'd like for libbpf, I have been following
> > the documentation outlined in https://www.kernel.org/doc/html/latest/bpf/index.html
> > I will hold off on changes to this patch as you've asked for it going forward.
> >
> > This patch is inspired by a MPLSoUDP (https://tools.ietf.org/html/rfc7510)]
> > kernel module that was ported to eBPF -- our implementation is slightly
> > modified for our custom use case.
> >
> > The Linux kernel provides a single abstraction for the src port for UDP tunneling
> > via udp_flow_src_port. If it's improved eBPF filters would benefit if the call is the same.
> >
> > Exposing this function to eBPF programs would maintain feature parity with other kernel
> > tunneling implementations.
> >
> > Cheers,
> > Farid Zakaria
> >
> >
> >
> > On Sat, Aug 3, 2019 at 11:52 PM Y Song <ys114321@gmail.com> wrote:
> >>
> >> On Sat, Aug 3, 2019 at 8:29 PM Farid Zakaria <farid.m.zakaria@gmail.com> wrote:
> >> >
> >> > Foo over UDP uses UDP encapsulation to add additional entropy
> >> > into the packets so that they get beter distribution across EMCP
> >> > routes.
> >> >
> >> > Expose udp_flow_src_port as a bpf helper so that tunnel filters
> >> > can benefit from the helper.
> >> >
> >> > Signed-off-by: Farid Zakaria <farid.m.zakaria@gmail.com>
> >> > ---
> >> >  include/uapi/linux/bpf.h                      | 21 +++++++--
> >> >  net/core/filter.c                             | 20 ++++++++
> >> >  tools/include/uapi/linux/bpf.h                | 21 +++++++--
> >> >  tools/testing/selftests/bpf/bpf_helpers.h     |  2 +
> >> >  .../bpf/prog_tests/udp_flow_src_port.c        | 28 +++++++++++
> >> >  .../bpf/progs/test_udp_flow_src_port_kern.c   | 47 +++++++++++++++++++
> >> >  6 files changed, 131 insertions(+), 8 deletions(-)
> >> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/udp_flow_src_port.c
> >> >  create mode 100644 tools/testing/selftests/bpf/progs/test_udp_flow_src_port_kern.c
> >>
> >> First, for each review, backport and sync with libbpf repo, in the future,
> >> could you break the patch to two patches?
> >>    1. kernel changes (net/core/filter.c, include/uapi/linux/bpf.h)
> >>    2. tools/include/uapi/linux/bpf.h
> >>    3. tools/testing/ changes
> >>
> >> Second, could you explain why existing __sk_buff->hash not enough?
> >> there are corner cases where if __sk_buff->hash is 0 and the kernel did some
> >> additional hashing, but maybe you can approximate in bpf program?
> >> For case, min >= max, I suppose you can get min/max port values
> >> from the user space for a particular net device and then calculate
> >> the hash in the bpf program?
> >> What I want to know if how much accuracy you will lose if you just
> >> use __sk_buff->hash and do approximation in bpf program.
> >>
> >> >
> >> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> >> > index 4393bd4b2419..90e814153dec 100644
> >> > --- a/include/uapi/linux/bpf.h
> >> > +++ b/include/uapi/linux/bpf.h
> >> > @@ -2545,9 +2545,21 @@ union bpf_attr {
> >> >   *             *th* points to the start of the TCP header, while *th_len*
> >> >   *             contains **sizeof**\ (**struct tcphdr**).
> >> >   *
> >> > - *     Return
> >> > - *             0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
> >> > - *             error otherwise.
> >> > + *  Return
> >> > + *      0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
> >> > + *      error otherwise.
> >> > + *
> >> > + * int bpf_udp_flow_src_port(struct sk_buff *skb, int min, int max, int use_eth)
> >> > + *  Description
> >> > + *      It's common to implement tunnelling inside a UDP protocol to provide
> >> > + *      additional randomness to the packet. The destination port of the UDP
> >> > + *      header indicates the inner packet type whereas the source port is used
> >> > + *      for additional entropy.
> >> > + *
> >> > + *  Return
> >> > + *      An obfuscated hash of the packet that falls within the
> >> > + *      min & max port range.
> >> > + *      If min >= max, the default port range is used
> >> >   *
> >> >   * int bpf_sysctl_get_name(struct bpf_sysctl *ctx, char *buf, size_t buf_len, u64 flags)
> >> >   *     Description
> >> > @@ -2853,7 +2865,8 @@ union bpf_attr {
> >> >         FN(sk_storage_get),             \
> >> >         FN(sk_storage_delete),          \
> >> >         FN(send_signal),                \
> >> > -       FN(tcp_gen_syncookie),
> >> > +       FN(tcp_gen_syncookie),  \
> >> > +       FN(udp_flow_src_port),
> >> >
> >> >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >> >   * function eBPF program intends to call
> >> > diff --git a/net/core/filter.c b/net/core/filter.c
> >> > index 5a2707918629..fdf0ebb8c2c8 100644
> >> > --- a/net/core/filter.c
> >> > +++ b/net/core/filter.c
> >> > @@ -2341,6 +2341,24 @@ static const struct bpf_func_proto bpf_msg_pull_data_proto = {
> >> >         .arg4_type      = ARG_ANYTHING,
> >> >  };
> >> >
> >> > +BPF_CALL_4(bpf_udp_flow_src_port, struct sk_buff *, skb, int, min,
> >> > +          int, max, int, use_eth)
> >> > +{
> >> > +       struct net *net = dev_net(skb->dev);
> >> > +
> >> > +       return udp_flow_src_port(net, skb, min, max, use_eth);
> >> > +}
> >> > +
> >> [...]

^ permalink raw reply

* Re: Slowness forming TIPC cluster with explicit node addresses
From: Chris Packham @ 2019-08-04 23:04 UTC (permalink / raw)
  To: jon.maloy@ericsson.com, tipc-discussion@lists.sourceforge.net
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <CH2PR15MB3575BF6FC4001C19B8A789559ADB0@CH2PR15MB3575.namprd15.prod.outlook.com>

On Sun, 2019-08-04 at 21:53 +0000, Jon Maloy wrote:
> 
> > 
> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org>
> > On
> > Behalf Of Chris Packham
> > Sent: 2-Aug-19 01:11
> > To: Jon Maloy <jon.maloy@ericsson.com>; tipc-
> > discussion@lists.sourceforge.net
> > Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: Re: Slowness forming TIPC cluster with explicit node
> > addresses
> > 
> > On Mon, 2019-07-29 at 09:04 +1200, Chris Packham wrote:
> > > 
> > > On Fri, 2019-07-26 at 13:31 +0000, Jon Maloy wrote:
> > > > 
> > > > 
> > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > -----Original Message-----
> > > > > From: netdev-owner@vger.kernel.org <netdev-
> > owner@vger.kernel.org>
> > > 
> > > > 
> > > > > 
> > > > > On Behalf Of Chris Packham
> > > > > Sent: 25-Jul-19 19:37
> > > > > To: tipc-discussion@lists.sourceforge.net
> > > > > Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> > > > > Subject: Slowness forming TIPC cluster with explicit node
> > > > > addresses
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > I'm having problems forming a TIPC cluster between 2 nodes.
> > > > > 
> > > > > This is the basic steps I'm going through on each node.
> > > > > 
> > > > > modprobe tipc
> > > > > ip link set eth2 up
> > > > > tipc node set addr 1.1.5 # or 1.1.6 tipc bearer enable media
> > > > > eth
> > > > > dev eth0
> > > > eth2, I assume...
> > > > 
> > > Yes sorry I keep switching between between Ethernet ports for
> > > testing
> > > so I hand edited the email.
> > > 
> > > > 
> > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > Then to confirm if the cluster is formed I use tipc link list
> > > > > 
> > > > > [root@node-5 ~]# tipc link list
> > > > > broadcast-link: up
> > > > > ...
> > > > > 
> > > > > Looking at tcpdump the two nodes are sending packets
> > > > > 
> > > > > 22:30:05.782320 TIPC v2.0 1.1.5 > 0.0.0, headerlength 60
> > > > > bytes,
> > > > > MessageSize
> > > > > 76 bytes, Neighbor Detection Protocol internal, messageType
> > > > > Link
> > > > > request
> > > > > 22:30:05.863555 TIPC v2.0 1.1.6 > 0.0.0, headerlength 60
> > > > > bytes,
> > > > > MessageSize
> > > > > 76 bytes, Neighbor Detection Protocol internal, messageType
> > > > > Link
> > > > > request
> > > > > 
> > > > > Eventually (after a few minutes) the link does come up
> > > > > 
> > > > > [root@node-6 ~]# tipc link list
> > > > > broadcast-link: up
> > > > > 1001006:eth2-1001005:eth2: up
> > > > > 
> > > > > [root@node-5 ~]# tipc link list
> > > > > broadcast-link: up
> > > > > 1001005:eth2-1001006:eth2: up
> > > > > 
> > > > > When I remove the "tipc node set addr" things seem to kick
> > > > > into
> > > > > life straight away
> > > > > 
> > > > > [root@node-5 ~]# tipc link list
> > > > > broadcast-link: up
> > > > > 0050b61bd2aa:eth2-0050b61e6dfa:eth2: up
> > > > > 
> > > > > So there appears to be some difference in behaviour between
> > > > > having
> > > > > an explicit node address and using the default. Unfortunately
> > > > > our
> > > > > application relies on setting the node addresses.
> > > > I do this many times a day, without any problems. If there
> > > > would be
> > > > any time difference, I would expect the 'auto configurable'
> > > > version
> > > > to be slower, because it involves a DAD step.
> > > > Are you sure you don't have any other nodes running in your
> > > > system?
> > > > 
> > > > ///jon
> > > > 
> > > Nope the two nodes are connected back to back. Does the number of
> > > Ethernet interfaces make a difference? As you can see I've got 3
> > > on
> > > each node. One is completely disconnected, one is for booting
> > > over
> > > TFTP
> > >  (only used by U-boot) and the other is the USB Ethernet I'm
> > > using for
> > > testing.
> > > 
> > So I can still reproduce this on nodes that only have one network
> > interface and
> > are the only things connected.
> > 
> > I did find one thing that helps
> > 
> > diff --git a/net/tipc/discover.c b/net/tipc/discover.c index
> > c138d68e8a69..49921dad404a 100644
> > --- a/net/tipc/discover.c
> > +++ b/net/tipc/discover.c
> > @@ -358,10 +358,10 @@ int tipc_disc_create(struct net *net, struct
> > tipc_bearer *b,
> >         tipc_disc_init_msg(net, d->skb, DSC_REQ_MSG, b);
> > 
> >         /* Do we need an address trial period first ? */
> > -       if (!tipc_own_addr(net)) {
> > +//     if (!tipc_own_addr(net)) {
> >                 tn->addr_trial_end = jiffies +
> > msecs_to_jiffies(1000);
> >                 msg_set_type(buf_msg(d->skb), DSC_TRIAL_MSG);
> > -       }
> > +//     }
> >         memcpy(&d->dest, dest, sizeof(*dest));
> >         d->net = net;
> >         d->bearer_id = b->identity;
> > 
> > I think because with pre-configured addresses the duplicate address
> > detection
> > is skipped the shorter init phase is skipped. Would is make sense
> > to
> > unconditionally do the trial step? Or is there some better way to
> > get things to
> > transition with pre-assigned addresses.
>
> I am on vacation until the end of next-week, so I can't give you any
> good analysis right now.

Thanks for taking the time to respond.

> To do the trial step doesn’t make much sense to me, -it would only
> delay the setup unnecessarily (but with only 1 second).
> Can you check the initial value of addr_trial_end when there a pre-
> configured address?

I had the same thought. For both my devices 'addr_trial_end = 0' so I
think tipc_disc_addr_trial_msg should end up with trial == false

> 
> ///jon
> 

^ permalink raw reply

* Re: [PATCH 31/34] nfs: convert put_page() to put_user_page*()
From: Calum Mackay @ 2019-08-04 23:28 UTC (permalink / raw)
  To: John Hubbard, john.hubbard, Andrew Morton
  Cc: calum.mackay, Christoph Hellwig, Dan Williams, Dave Chinner,
	Dave Hansen, Ira Weiny, Jan Kara, Jason Gunthorpe,
	Jérôme Glisse, LKML, amd-gfx, ceph-devel, devel, devel,
	dri-devel, intel-gfx, kvm, linux-arm-kernel, linux-block,
	linux-crypto, linux-fbdev, linux-fsdevel, linux-media, linux-mm,
	linux-nfs, linux-rdma, linux-rpi-kernel, linux-xfs, netdev,
	rds-devel, sparclinux, x86, xen-devel, Trond Myklebust,
	Anna Schumaker
In-Reply-To: <db136399-ed87-56ea-bd6e-e5d29b145eda@nvidia.com>

On 03/08/2019 2:41 am, John Hubbard wrote:
> On 8/2/19 6:27 PM, Calum Mackay wrote:
>> On 02/08/2019 3:20 am, john.hubbard@gmail.com wrote:
> ...
>> Since it's static, and only called twice, might it be better to change its two callers [nfs_direct_{read,write}_schedule_iovec()] to call put_user_pages() directly, and remove nfs_direct_release_pages() entirely?
>>
>> thanks,
>> calum.
>>
>>
>>>      void nfs_init_cinfo_from_dreq(struct nfs_commit_info *cinfo,
>>>
>   
> Hi Calum,
> 
> Absolutely! Is it OK to add your reviewed-by, with the following incremental
> patch made to this one?

Thanks John; looks good.

Reviewed-by: Calum Mackay <calum.mackay@oracle.com>

> 
> diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
> index b00b89dda3c5..c0c1b9f2c069 100644
> --- a/fs/nfs/direct.c
> +++ b/fs/nfs/direct.c
> @@ -276,11 +276,6 @@ ssize_t nfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>          return nfs_file_direct_write(iocb, iter);
>   }
>   
> -static void nfs_direct_release_pages(struct page **pages, unsigned int npages)
> -{
> -       put_user_pages(pages, npages);
> -}
> -
>   void nfs_init_cinfo_from_dreq(struct nfs_commit_info *cinfo,
>                                struct nfs_direct_req *dreq)
>   {
> @@ -510,7 +505,7 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
>                          pos += req_len;
>                          dreq->bytes_left -= req_len;
>                  }
> -               nfs_direct_release_pages(pagevec, npages);
> +               put_user_pages(pagevec, npages);
>                  kvfree(pagevec);
>                  if (result < 0)
>                          break;
> @@ -933,7 +928,7 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
>                          pos += req_len;
>                          dreq->bytes_left -= req_len;
>                  }
> -               nfs_direct_release_pages(pagevec, npages);
> +               put_user_pages(pagevec, npages);
>                  kvfree(pagevec);
>                  if (result < 0)
>                          break;
> 
> 
> 
> thanks,
> 

^ permalink raw reply

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Andy Lutomirski @ 2019-08-05  0:08 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Song Liu, Kees Cook, Networking, bpf, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Lorenz Bauer, Jann Horn, Greg KH,
	Linux API, LSM List
In-Reply-To: <CALCETrU7NbBnXXsw1B+DvTkfTVRBFWXuJ8cZERCCNvdFG6KqRw@mail.gmail.com>

On Sun, Aug 4, 2019 at 3:16 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> On Fri, Aug 2, 2019 at 12:22 AM Song Liu <songliubraving@fb.com> wrote:
> >
> > Hi Andy,
> >
>  >> I actually agree CAP_BPF_ADMIN makes sense. The hard part is to make
> > >> existing tools (setcap, getcap, etc.) and libraries aware of the new CAP.
> > >
> > > It's been done before -- it's not that hard.  IMO the main tricky bit
> > > would be try be somewhat careful about defining exactly what
> > > CAP_BPF_ADMIN does.
> >
> > Agreed. I think defining CAP_BPF_ADMIN could be a good topic for the
> > Plumbers conference.
> >
> > OTOH, I don't think we have to wait for CAP_BPF_ADMIN to allow daemons
> > like systemd to do sys_bpf() without root.
>
> I don't understand the use case here.  Are you talking about systemd
> --user?  As far as I know, a user is expected to be able to fully
> control their systemd --user process, so giving it unrestricted bpf
> access is very close to giving it superuser access, and this doesn't
> sound like a good idea.  I think that, if systemd --user needs bpf(),
> it either needs real unprivileged bpf() or it needs a privileged
> helper (SUID or a daemon) to intermediate this access.
>
> >
> > >
> > >>> I don't see why you need to invent a whole new mechanism for this.
> > >>> The entire cgroup ecosystem outside bpf() does just fine using the
> > >>> write permission on files in cgroupfs to control access.  Why can't
> > >>> bpf() do the same thing?
> > >>
> > >> It is easier to use write permission for BPF_PROG_ATTACH. But it is
> > >> not easy to do the same for other bpf commands: BPF_PROG_LOAD and
> > >> BPF_MAP_*. A lot of these commands don't have target concept. Maybe
> > >> we should have target concept for all these commands. But that is a
> > >> much bigger project. OTOH, "all or nothing" model allows all these
> > >> commands at once.
> > >
> > > For BPF_PROG_LOAD, I admit I've never understood why permission is
> > > required at all.  I think that CAP_SYS_ADMIN or similar should be
> > > needed to get is_priv in the verifier, but I think that should mainly
> > > be useful for tracing, and that requires lots of privilege anyway.
> > > BPF_MAP_* is probably the trickiest part.  One solution would be some
> > > kind of bpffs, but I'm sure other solutions are possible.
> >
> > Improving permission management of cgroup_bpf is another good topic to
> > discuss. However, it is also an overkill for current use case.
> >
>
> I looked at the code some more, and I don't think this is so hard
> after all.  As I understand it, all of the map..by_id stuff is, to
> some extent, deprecated in favor of persistent maps.  As I see it, the
> map..by_id calls should require privilege forever, although I can
> imagine ways to scope that privilege to a namespace if the maps
> themselves were to be scoped to a namespace.
>
> Instead, unprivileged tools would use the persistent map interface
> roughly like this:
>
> $ bpftool map create /sys/fs/bpf/my_dir/filename type hash key 8 value
> 8 entries 64 name mapname
>
> This would require that the caller have either CAP_DAC_OVERRIDE or
> that the caller have permission to create files in /sys/fs/bpf/my_dir
> (using the same rules as for any filesystem), and the resulting map
> would end up owned by the creating user and have mode 0600 (or maybe
> 0666, or maybe a new bpf_attr parameter) modified by umask.  Then all
> the various capable() checks that are currently involved in accessing
> a persistent map would instead check FMODE_READ or FMODE_WRITE on the
> map file as appropriate.
>
> Half of this stuff already works.  I just set my system up like this:
>
> $ ls -l /sys/fs/bpf
> total 0
> drwxr-xr-x. 3 luto luto 0 Aug  4 15:10 luto
>
> $ mkdir /sys/fs/bpf/luto/test
>
> $ ls -l /sys/fs/bpf/luto
> total 0
> drwxrwxr-x. 2 luto luto 0 Aug  4 15:10 test
>
> I bet that making the bpf() syscalls work appropriately in this
> context without privilege would only be a couple of hours of work.
> The hard work, creating bpffs and making it function, is already done
> :)
>
> P.S. The docs for bpftool create are less than fantastic.  The
> complete lack of any error message at all when the syscall returns
> -EACCES is also not fantastic.

This isn't remotely finished, but I spent a bit of time fiddling with this:

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=bpf/perms

What do you think?  (It's obviously not done.  It doesn't compile, and
I haven't gotten to the permissions needed to do map operations.  I
also haven't touched the capable() checks.)

^ permalink raw reply

* Re: [PATCH] Documentation: virt: Fix broken reference to virt tree's index
From: Matthew Wilcox @ 2019-08-05  0:10 UTC (permalink / raw)
  To: Sheriff Esseson
  Cc: skhan, linux-kernel-mentees, Jonathan Corbet, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song,
	open list:DOCUMENTATION, open list, open list:RISC-V ARCHITECTURE,
	open list:BPF (Safe dynamic programs and tools),
	open list:BPF (Safe dynamic programs and tools)
In-Reply-To: <20190804154635.GA18475@localhost>

On Sun, Aug 04, 2019 at 04:46:35PM +0100, Sheriff Esseson wrote:
> Fix broken reference to virt/index.rst.
> 
> Sequel to: 2f5947dfcaec ("Documentation: move Documentation/virtual to
> Documentation/virt")

'Sequel to'?  Do you mean 'Fixes'?

> Reported-by: Sphinx

Reported-by is used for people.  See
Documentation/process/submitting-patches.rst section 13.


^ permalink raw reply

* Re: [PATCH v2 31/34] fs/nfs: convert put_page() to put_user_page*()
From: Calum Mackay @ 2019-08-05  0:26 UTC (permalink / raw)
  To: john.hubbard, Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, John Hubbard, Trond Myklebust, Anna Schumaker
In-Reply-To: <20190804224915.28669-32-jhubbard@nvidia.com>

On 04/08/2019 11:49 pm, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> For pages that were retained via get_user_pages*(), release those pages
> via the new put_user_page*() routines, instead of via put_page() or
> release_pages().
> 
> This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
> ("mm: introduce put_user_page*(), placeholder versions").
> 
> Cc: Calum Mackay <calum.mackay@oracle.com>
> Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
> Cc: Anna Schumaker <anna.schumaker@netapp.com>
> Cc: linux-nfs@vger.kernel.org
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>   fs/nfs/direct.c | 11 ++---------
>   1 file changed, 2 insertions(+), 9 deletions(-)

Reviewed-by: Calum Mackay <calum.mackay@oracle.com>


> diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
> index 0cb442406168..c0c1b9f2c069 100644
> --- a/fs/nfs/direct.c
> +++ b/fs/nfs/direct.c
> @@ -276,13 +276,6 @@ ssize_t nfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>   	return nfs_file_direct_write(iocb, iter);
>   }
>   
> -static void nfs_direct_release_pages(struct page **pages, unsigned int npages)
> -{
> -	unsigned int i;
> -	for (i = 0; i < npages; i++)
> -		put_page(pages[i]);
> -}
> -
>   void nfs_init_cinfo_from_dreq(struct nfs_commit_info *cinfo,
>   			      struct nfs_direct_req *dreq)
>   {
> @@ -512,7 +505,7 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
>   			pos += req_len;
>   			dreq->bytes_left -= req_len;
>   		}
> -		nfs_direct_release_pages(pagevec, npages);
> +		put_user_pages(pagevec, npages);
>   		kvfree(pagevec);
>   		if (result < 0)
>   			break;
> @@ -935,7 +928,7 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
>   			pos += req_len;
>   			dreq->bytes_left -= req_len;
>   		}
> -		nfs_direct_release_pages(pagevec, npages);
> +		put_user_pages(pagevec, npages);
>   		kvfree(pagevec);
>   		if (result < 0)
>   			break;
> 

^ permalink raw reply

* [PATCH] tools: bpftool: fix reading from /proc/config.gz
From: Peter Wu @ 2019-08-05  0:15 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, Stanislav Fomichev, Jakub Kicinski, Quentin Monnet

/proc/config has never existed as far as I can see, but /proc/config.gz
is present on Arch Linux. Execute an external gunzip program to avoid
linking to zlib and rework the option scanning code since a pipe is not
seekable. This also fixes a file handle leak on some error paths.

Fixes: 4567b983f78c ("tools: bpftool: add probes for kernel configuration options")
Cc: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
---
 tools/bpf/bpftool/feature.c | 92 +++++++++++++++++++++----------------
 1 file changed, 52 insertions(+), 40 deletions(-)

diff --git a/tools/bpf/bpftool/feature.c b/tools/bpf/bpftool/feature.c
index d672d9086fff..e9e10f582047 100644
--- a/tools/bpf/bpftool/feature.c
+++ b/tools/bpf/bpftool/feature.c
@@ -284,34 +284,34 @@ static void probe_jit_limit(void)
 	}
 }
 
-static char *get_kernel_config_option(FILE *fd, const char *option)
+static bool get_kernel_config_option(FILE *fd, char **buf_p, size_t *n_p,
+				     char **value)
 {
-	size_t line_n = 0, optlen = strlen(option);
-	char *res, *strval, *line = NULL;
-	ssize_t n;
+	char *sep;
+	ssize_t linelen;
 
-	rewind(fd);
-	while ((n = getline(&line, &line_n, fd)) > 0) {
-		if (strncmp(line, option, optlen))
+	while ((linelen = getline(buf_p, n_p, fd)) > 0) {
+		char *line = *buf_p;
+		if (strncmp(line, "CONFIG_", 7))
 			continue;
-		/* Check we have at least '=', value, and '\n' */
-		if (strlen(line) < optlen + 3)
-			continue;
-		if (*(line + optlen) != '=')
+
+		sep = memchr(line, '=', linelen);
+		if (!sep)
 			continue;
 
 		/* Trim ending '\n' */
-		line[strlen(line) - 1] = '\0';
+		line[linelen - 1] = '\0';
+
+		/* Split on '=' and ensure that a value is present. */
+		*sep = '\0';
+		if (!sep[1])
+			continue;
 
-		/* Copy and return config option value */
-		strval = line + optlen + 1;
-		res = strdup(strval);
-		free(line);
-		return res;
+		*value = sep + 1;
+		return true;
 	}
-	free(line);
 
-	return NULL;
+	return false;
 }
 
 static void probe_kernel_image_config(void)
@@ -386,31 +386,34 @@ static void probe_kernel_image_config(void)
 		/* test_bpf module for BPF tests */
 		"CONFIG_TEST_BPF",
 	};
+	char *values[ARRAY_SIZE(options)] = { };
 	char *value, *buf = NULL;
 	struct utsname utsn;
 	char path[PATH_MAX];
 	size_t i, n;
 	ssize_t ret;
-	FILE *fd;
+	FILE *fd = NULL;
+	bool is_pipe = false;
 
 	if (uname(&utsn))
-		goto no_config;
+		goto end_parse;
 
 	snprintf(path, sizeof(path), "/boot/config-%s", utsn.release);
 
 	fd = fopen(path, "r");
 	if (!fd && errno == ENOENT) {
-		/* Some distributions put the config file at /proc/config, give
-		 * it a try.
-		 * Sometimes it is also at /proc/config.gz but we do not try
-		 * this one for now, it would require linking against libz.
+		/* Some distributions build with CONFIG_IKCONFIG=y and put the
+		 * config file at /proc/config.gz. We try to invoke an external
+		 * gzip program to avoid linking to libz.
+		 * Hide stderr to avoid interference with the JSON output.
 		 */
-		fd = fopen("/proc/config", "r");
+		fd = popen("gunzip -c /proc/config.gz 2>/dev/null", "r");
+		is_pipe = true;
 	}
 	if (!fd) {
 		p_info("skipping kernel config, can't open file: %s",
 		       strerror(errno));
-		goto no_config;
+		goto end_parse;
 	}
 	/* Sanity checks */
 	ret = getline(&buf, &n, fd);
@@ -418,27 +421,36 @@ static void probe_kernel_image_config(void)
 	if (!buf || !ret) {
 		p_info("skipping kernel config, can't read from file: %s",
 		       strerror(errno));
-		free(buf);
-		goto no_config;
+		goto end_parse;
 	}
 	if (strcmp(buf, "# Automatically generated file; DO NOT EDIT.\n")) {
 		p_info("skipping kernel config, can't find correct file");
-		free(buf);
-		goto no_config;
+		goto end_parse;
+	}
+
+	while (get_kernel_config_option(fd, &buf, &n, &value)) {
+		for (i = 0; i < ARRAY_SIZE(options); i++) {
+			if (values[i] || strcmp(buf, options[i]))
+				continue;
+
+			values[i] = strdup(value);
+		}
+	}
+
+end_parse:
+	if (fd) {
+		if (is_pipe) {
+			if (pclose(fd))
+				p_info("failed to read /proc/config.gz");
+		} else
+			fclose(fd);
 	}
 	free(buf);
 
 	for (i = 0; i < ARRAY_SIZE(options); i++) {
-		value = get_kernel_config_option(fd, options[i]);
-		print_kernel_option(options[i], value);
-		free(value);
+		print_kernel_option(options[i], values[i]);
+		free(values[i]);
 	}
-	fclose(fd);
-	return;
-
-no_config:
-	for (i = 0; i < ARRAY_SIZE(options); i++)
-		print_kernel_option(options[i], NULL);
 }
 
 static bool probe_bpf_syscall(const char *define_prefix)
-- 
2.22.0


^ permalink raw reply related

* Re: [PATCH net-next] net: can: Fix compiling warning
From: maowenan @ 2019-08-05  1:13 UTC (permalink / raw)
  To: Sergei Shtylyov, socketcan, davem, netdev; +Cc: linux-kernel, kernel-janitors
In-Reply-To: <133b3357-e0a4-64c8-40b7-02d386e12cef@cogentembedded.com>



On 2019/8/2 16:59, Sergei Shtylyov wrote:
> Hello!
> 
> On 02.08.2019 6:36, Mao Wenan wrote:
> 
>> There are two warings in net/can, fix them by setting bcm_sock_no_ioctlcmd
> 
>    Warnings. :-)

Thanks, I will send v2.

> 
>> and raw_sock_no_ioctlcmd as static.
>>
>> net/can/bcm.c:1683:5: warning: symbol 'bcm_sock_no_ioctlcmd' was not declared. Should it be static?
>> net/can/raw.c:840:5: warning: symbol 'raw_sock_no_ioctlcmd' was not declared. Should it be static?
>>
>> Fixes: 473d924d7d46 ("can: fix ioctl function removal")
>>
>> Signed-off-by: Mao Wenan <maowenan@huawei.com>
> [...]
> 
> MBR, Sergei
> 
> 


^ permalink raw reply

* [PATCH net-next] net: can: Fix compiling warning
From: Mao Wenan @ 2019-08-05  1:22 UTC (permalink / raw)
  To: socketcan, davem, netdev; +Cc: linux-kernel, kernel-janitors, Mao Wenan

There are two warnings in net/can, fix them by setting bcm_sock_no_ioctlcmd
and raw_sock_no_ioctlcmd as static.

net/can/bcm.c:1683:5: warning: symbol 'bcm_sock_no_ioctlcmd' was not declared. Should it be static?
net/can/raw.c:840:5: warning: symbol 'raw_sock_no_ioctlcmd' was not declared. Should it be static?

Fixes: 473d924d7d46 ("can: fix ioctl function removal")

Signed-off-by: Mao Wenan <maowenan@huawei.com>
---
 v1->v2: change patch description typo error, 'warings' to 'warnings'.
 net/can/bcm.c | 2 +-
 net/can/raw.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/can/bcm.c b/net/can/bcm.c
index bf1d0bbecec8..b8a32b4ac368 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1680,7 +1680,7 @@ static int bcm_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	return size;
 }
 
-int bcm_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
+static int bcm_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
 			 unsigned long arg)
 {
 	/* no ioctls for socket layer -> hand it down to NIC layer */
diff --git a/net/can/raw.c b/net/can/raw.c
index da386f1fa815..a01848ff9b12 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -837,7 +837,7 @@ static int raw_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	return size;
 }
 
-int raw_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
+static int raw_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
 			 unsigned long arg)
 {
 	/* no ioctls for socket layer -> hand it down to NIC layer */
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v2] net: can: Fix compiling warning
From: Mao Wenan @ 2019-08-05  1:26 UTC (permalink / raw)
  To: socketcan, davem, netdev; +Cc: linux-kernel, kernel-janitors, Mao Wenan
In-Reply-To: <1d9e329a-eafc-6c32-ee2a-df3b15231a2a@huawei.com>

There are two warnings in net/can, fix them by setting bcm_sock_no_ioctlcmd
and raw_sock_no_ioctlcmd as static.

net/can/bcm.c:1683:5: warning: symbol 'bcm_sock_no_ioctlcmd' was not declared. Should it be static?
net/can/raw.c:840:5: warning: symbol 'raw_sock_no_ioctlcmd' was not declared. Should it be static?

Fixes: 473d924d7d46 ("can: fix ioctl function removal")
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
---
 v1->v2: change patch description typo error, 'warings' to 'warnings'.
 net/can/bcm.c | 2 +-
 net/can/raw.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/can/bcm.c b/net/can/bcm.c
index bf1d0bbecec8..b8a32b4ac368 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1680,7 +1680,7 @@ static int bcm_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	return size;
 }
 
-int bcm_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
+static int bcm_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
 			 unsigned long arg)
 {
 	/* no ioctls for socket layer -> hand it down to NIC layer */
diff --git a/net/can/raw.c b/net/can/raw.c
index da386f1fa815..a01848ff9b12 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -837,7 +837,7 @@ static int raw_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	return size;
 }
 
-int raw_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
+static int raw_sock_no_ioctlcmd(struct socket *sock, unsigned int cmd,
 			 unsigned long arg)
 {
 	/* no ioctls for socket layer -> hand it down to NIC layer */
-- 
2.20.1


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox