All of lore.kernel.org

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Buildroot] [PATCH 21/27] toolchain: introduce BR2_TOOLCHAIN_HAS_GCC_BUG_90620
From: Thomas Petazzoni @ 2019-06-20 16:14 UTC (permalink / raw)
  To: buildroot
In-Reply-To: <20190614210346.121013-22-giulio.benetti@micronovasrl.com>

On Fri, 14 Jun 2019 23:03:40 +0200
Giulio Benetti <giulio.benetti@micronovasrl.com> wrote:

> GCC fails on building haproxy for the Microblaze Arch:
> http://autobuild.buildroot.org/results/64706f96db793777de9d3ec63b0a47d776cf33fd/build-end.log
> 
> Originally reported for gpsd:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90620
> 
> Fixed on Gcc 8.x but regressed in Gcc 9.x.
> 
> Signed-off-by: Giulio Benetti <giulio.benetti@micronovasrl.com>
> ---
>  toolchain/Config.in | 8 ++++++++
>  1 file changed, 8 insertions(+)

Applied to master, thanks.

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply

* [RFC PATCH 06/28] block: Support dma-direct bios in bio_advance_iter()
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

Dma-direct bio iterators need to be advanced using a similar
dvec_iter_advance helper.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 include/linux/bio.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 8180309123d7..e212e5958a75 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -134,6 +134,8 @@ static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
 
 	if (bio_no_advance_iter(bio))
 		iter->bi_size -= bytes;
+	else if (op_is_dma_direct(bio->bi_opf))
+		dvec_iter_advance(bio->bi_dma_vec, iter, bytes);
 	else
 		bvec_iter_advance(bio->bi_io_vec, iter, bytes);
 		/* TODO: It is reasonable to complete bio with error here. */
-- 
2.20.1


^ permalink raw reply related

* [RFC PATCH 08/28] block: Introduce dmavec_phys_mergeable()
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

Introduce a new helper which is an analog of biovec_phys_mergeable()
for dma-direct vectors.

This also provides a common helper vec_phys_mergeable() for use in
code that's general to both bio_vecs and dma_vecs.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 block/blk.h | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/block/blk.h b/block/blk.h
index 7814aa207153..4142383eed7a 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -66,20 +66,36 @@ static inline void blk_queue_enter_live(struct request_queue *q)
 	percpu_ref_get(&q->q_usage_counter);
 }
 
+static inline bool vec_phys_mergeable(struct request_queue *q,
+				      unsigned long addr1, unsigned int len1,
+				      unsigned long addr2, unsigned int len2)
+{
+	unsigned long mask = queue_segment_boundary(q);
+
+	if (addr1 + len1 != addr2)
+		return false;
+	if ((addr1 | mask) != ((addr2 + len2 - 1) | mask))
+		return false;
+	return true;
+}
+
 static inline bool biovec_phys_mergeable(struct request_queue *q,
 		struct bio_vec *vec1, struct bio_vec *vec2)
 {
-	unsigned long mask = queue_segment_boundary(q);
 	phys_addr_t addr1 = page_to_phys(vec1->bv_page) + vec1->bv_offset;
 	phys_addr_t addr2 = page_to_phys(vec2->bv_page) + vec2->bv_offset;
 
-	if (addr1 + vec1->bv_len != addr2)
-		return false;
 	if (xen_domain() && !xen_biovec_phys_mergeable(vec1, vec2->bv_page))
 		return false;
-	if ((addr1 | mask) != ((addr2 + vec2->bv_len - 1) | mask))
-		return false;
-	return true;
+
+	return vec_phys_mergeable(q, addr1, vec1->bv_len, addr2, vec2->bv_len);
+}
+
+static inline bool dmavec_phys_mergeable(struct request_queue *q,
+		struct dma_vec *vec1, struct dma_vec *vec2)
+{
+	return vec_phys_mergeable(q, vec1->dv_addr, vec1->dv_len,
+				  vec2->dv_addr, vec2->dv_len);
 }
 
 static inline bool __bvec_gap_to_prev(struct request_queue *q,
-- 
2.20.1


^ permalink raw reply related

* [RFC PATCH 12/28] block: Create helper for bvec_should_split()
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

In order to support dma-direct bios, blk_bio_segment_split() will
need to operate on both bio_vecs and dma_vecs. In order to do
this, the code inside bio_for_each_bvec() is moved into a generic
helper called bvec_should_split().

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 block/blk-merge.c | 86 +++++++++++++++++++++++++----------------------
 1 file changed, 46 insertions(+), 40 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 414e61a714bf..d9e89c0ad40d 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -212,6 +212,48 @@ struct blk_segment_split_ctx {
 	const unsigned max_segs;
 };
 
+static bool bvec_should_split(struct request_queue *q, struct bio_vec *bv,
+			      struct blk_segment_split_ctx *ctx)
+{
+	/*
+	 * If the queue doesn't support SG gaps and adding this
+	 * offset would create a gap, disallow it.
+	 */
+	if (ctx->prv_valid && bvec_gap_to_prev(q, &ctx->bvprv, bv->bv_offset))
+		return true;
+
+	if (ctx->sectors + (bv->bv_len >> 9) > ctx->max_sectors) {
+		/*
+		 * Consider this a new segment if we're splitting in
+		 * the middle of this vector.
+		 */
+		if (ctx->nsegs < ctx->max_segs &&
+		    ctx->sectors < ctx->max_sectors) {
+			/* split in the middle of bvec */
+			bv->bv_len = (ctx->max_sectors - ctx->sectors) << 9;
+			bvec_split_segs(q, bv, &ctx->nsegs,
+					&ctx->sectors, ctx->max_segs);
+		}
+		return true;
+	}
+
+	if (ctx->nsegs == ctx->max_segs)
+		return true;
+
+	ctx->bvprv = *bv;
+	ctx->prv_valid = true;
+
+	if (bv->bv_offset + bv->bv_len <= PAGE_SIZE) {
+		ctx->nsegs++;
+		ctx->sectors += bv->bv_len >> 9;
+	} else if (bvec_split_segs(q, bv, &ctx->nsegs, &ctx->sectors,
+				   ctx->max_segs)) {
+		return true;
+	}
+
+	return false;
+}
+
 static struct bio *blk_bio_segment_split(struct request_queue *q,
 					 struct bio *bio,
 					 struct bio_set *bs,
@@ -219,7 +261,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 {
 	struct bio_vec bv;
 	struct bvec_iter iter;
-	bool do_split = true;
+	bool do_split = false;
 	struct bio *new = NULL;
 
 	struct blk_segment_split_ctx ctx = {
@@ -228,47 +270,11 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 	};
 
 	bio_for_each_bvec(bv, bio, iter) {
-		/*
-		 * If the queue doesn't support SG gaps and adding this
-		 * offset would create a gap, disallow it.
-		 */
-		if (ctx.prv_valid && bvec_gap_to_prev(q, &ctx.bvprv,
-						      bv.bv_offset))
-			goto split;
-
-		if (ctx.sectors + (bv.bv_len >> 9) > ctx.max_sectors) {
-			/*
-			 * Consider this a new segment if we're splitting in
-			 * the middle of this vector.
-			 */
-			if (ctx.nsegs < ctx.max_segs &&
-			    ctx.sectors < ctx.max_sectors) {
-				/* split in the middle of bvec */
-				bv.bv_len =
-					(ctx.max_sectors - ctx.sectors) << 9;
-				bvec_split_segs(q, &bv, &ctx.nsegs,
-						&ctx.sectors, ctx.max_segs);
-			}
-			goto split;
-		}
-
-		if (ctx.nsegs == ctx.max_segs)
-			goto split;
-
-		ctx.bvprv = bv;
-		ctx.prv_valid = true;
-
-		if (bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
-			ctx.nsegs++;
-			ctx.sectors += bv.bv_len >> 9;
-		} else if (bvec_split_segs(q, &bv, &ctx.nsegs, &ctx.sectors,
-				ctx.max_segs)) {
-			goto split;
-		}
+		do_split = bvec_should_split(q, &bv, &ctx);
+		if (do_split)
+			break;
 	}
 
-	do_split = false;
-split:
 	*segs = ctx.nsegs;
 
 	if (do_split) {
-- 
2.20.1


^ permalink raw reply related

* [RFC PATCH 23/28] nvme-pci: Remove support for PCI_P2PDMA requests
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

These requests have been superseded by dma-direct requests and are
therefore no longer needed.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/nvme/host/core.c |  2 --
 drivers/nvme/host/nvme.h |  3 +--
 drivers/nvme/host/pci.c  | 27 ++++++++++-----------------
 3 files changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 8e876417c44b..63d132c478b4 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3257,8 +3257,6 @@ static int nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	}
 
 	blk_queue_flag_set(QUEUE_FLAG_NONROT, ns->queue);
-	if (ctrl->ops->flags & NVME_F_PCI_P2PDMA)
-		blk_queue_flag_set(QUEUE_FLAG_PCI_P2PDMA, ns->queue);
 	if (ctrl->ops->flags & NVME_F_DMA_DIRECT)
 		blk_queue_flag_set(QUEUE_FLAG_DMA_DIRECT, ns->queue);
 
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index f1dddc95c6a8..d103cecc14dd 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -361,8 +361,7 @@ struct nvme_ctrl_ops {
 	unsigned int flags;
 #define NVME_F_FABRICS			(1 << 0)
 #define NVME_F_METADATA_SUPPORTED	(1 << 1)
-#define NVME_F_PCI_P2PDMA		(1 << 2)
-#define NVME_F_DMA_DIRECT		(1 << 3)
+#define NVME_F_DMA_DIRECT		(1 << 2)
 	int (*reg_read32)(struct nvme_ctrl *ctrl, u32 off, u32 *val);
 	int (*reg_write32)(struct nvme_ctrl *ctrl, u32 off, u32 val);
 	int (*reg_read64)(struct nvme_ctrl *ctrl, u32 off, u64 *val);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 5957f3a4f261..7f806e76230a 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -564,9 +564,8 @@ static void nvme_unmap_data(struct nvme_dev *dev, struct request *req)
 
 	WARN_ON_ONCE(!iod->nents);
 
-	/* P2PDMA requests do not need to be unmapped */
-	if (!is_pci_p2pdma_page(sg_page(iod->sg)) &&
-	    !blk_rq_is_dma_direct(req))
+	/* DMA direct requests do not need to be unmapped */
+	if (!blk_rq_is_dma_direct(req))
 		dma_unmap_sg(dev->dev, iod->sg, iod->nents, rq_dma_dir(req));
 
 
@@ -828,16 +827,14 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 	if (blk_rq_nr_phys_segments(req) == 1 && !blk_rq_is_dma_direct(req)) {
 		struct bio_vec bv = req_bvec(req);
 
-		if (!is_pci_p2pdma_page(bv.bv_page)) {
-			if (bv.bv_offset + bv.bv_len <= dev->ctrl.page_size * 2)
-				return nvme_setup_prp_simple(dev, req,
-							     &cmnd->rw, &bv);
+		if (bv.bv_offset + bv.bv_len <= dev->ctrl.page_size * 2)
+			return nvme_setup_prp_simple(dev, req,
+						     &cmnd->rw, &bv);
 
-			if (iod->nvmeq->qid &&
-			    dev->ctrl.sgls & ((1 << 0) | (1 << 1)))
-				return nvme_setup_sgl_simple(dev, req,
-							     &cmnd->rw, &bv);
-		}
+		if (iod->nvmeq->qid &&
+		    dev->ctrl.sgls & ((1 << 0) | (1 << 1)))
+			return nvme_setup_sgl_simple(dev, req,
+						     &cmnd->rw, &bv);
 	}
 
 	iod->dma_len = 0;
@@ -849,10 +846,7 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 	if (!iod->nents)
 		goto out;
 
-	if (is_pci_p2pdma_page(sg_page(iod->sg)))
-		nr_mapped = pci_p2pdma_map_sg(dev->dev, iod->sg, iod->nents,
-					      rq_dma_dir(req));
-	else if (blk_rq_is_dma_direct(req))
+	if (blk_rq_is_dma_direct(req))
 		nr_mapped = iod->nents;
 	else
 		nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents,
@@ -2642,7 +2636,6 @@ static const struct nvme_ctrl_ops nvme_pci_ctrl_ops = {
 	.name			= "pcie",
 	.module			= THIS_MODULE,
 	.flags			= NVME_F_METADATA_SUPPORTED |
-				  NVME_F_PCI_P2PDMA |
 				  NVME_F_DMA_DIRECT,
 	.reg_read32		= nvme_pci_reg_read32,
 	.reg_write32		= nvme_pci_reg_write32,
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH net] af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET
From: Neil Horman @ 2019-06-20 16:14 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, Matteo Croce, David S. Miller
In-Reply-To: <CAF=yD-KFZBS7PpvvBkHS5jQdjRr4tWpeHmb7=9QPmvD-RTcpYw@mail.gmail.com>

On Thu, Jun 20, 2019 at 11:16:13AM -0400, Willem de Bruijn wrote:
> On Thu, Jun 20, 2019 at 10:24 AM Neil Horman <nhorman@tuxdriver.com> wrote:
> >
> > On Thu, Jun 20, 2019 at 09:41:30AM -0400, Willem de Bruijn wrote:
> > > On Wed, Jun 19, 2019 at 4:26 PM Neil Horman <nhorman@tuxdriver.com> wrote:
> > > >
> > > > When an application is run that:
> > > > a) Sets its scheduler to be SCHED_FIFO
> > > > and
> > > > b) Opens a memory mapped AF_PACKET socket, and sends frames with the
> > > > MSG_DONTWAIT flag cleared, its possible for the application to hang
> > > > forever in the kernel.  This occurs because when waiting, the code in
> > > > tpacket_snd calls schedule, which under normal circumstances allows
> > > > other tasks to run, including ksoftirqd, which in some cases is
> > > > responsible for freeing the transmitted skb (which in AF_PACKET calls a
> > > > destructor that flips the status bit of the transmitted frame back to
> > > > available, allowing the transmitting task to complete).
> > > >
> > > > However, when the calling application is SCHED_FIFO, its priority is
> > > > such that the schedule call immediately places the task back on the cpu,
> > > > preventing ksoftirqd from freeing the skb, which in turn prevents the
> > > > transmitting task from detecting that the transmission is complete.
> > > >
> > > > We can fix this by converting the schedule call to a completion
> > > > mechanism.  By using a completion queue, we force the calling task, when
> > > > it detects there are no more frames to send, to schedule itself off the
> > > > cpu until such time as the last transmitted skb is freed, allowing
> > > > forward progress to be made.
> > > >
> > > > Tested by myself and the reporter, with good results
> > > >
> > > > Appies to the net tree
> > > >
> > > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > > > Reported-by: Matteo Croce <mcroce@redhat.com>
> > > > CC: "David S. Miller" <davem@davemloft.net>
> > > > ---
> > >
> > > This is a complex change for a narrow configuration. Isn't a
> > > SCHED_FIFO process preempting ksoftirqd a potential problem for other
> > > networking workloads as well? And the right configuration to always
> > > increase ksoftirqd priority when increasing another process's
> > > priority? Also, even when ksoftirqd kicks in, isn't some progress
> > > still made on the local_bh_enable reached from schedule()?
> > >
> >
> > A few questions here to answer:
> 
> Thanks for the detailed explanation.
> 
Gladly.

> > Regarding other protocols having this problem, thats not the case, because non
> > packet sockets honor the SK_SNDTIMEO option here (i.e. they sleep for a period
> > of time specified by the SNDTIMEO option if MSG_DONTWAIT isn't set.  We could
> > certainly do that, but the current implementation doesn't (opting instead to
> > wait indefinately until the respective packet(s) have transmitted or errored
> > out), and I wanted to maintain that behavior.  If there is consensus that packet
> > sockets should honor SNDTIMEO, then I can certainly do that.
> >
> > As for progress made by calling local_bh_enable, My read of the code doesn't
> > have the scheduler calling local_bh_enable at all.  Instead schedule uses
> > preempt_disable/preempt_enable_no_resched() to gain exlcusive access to the cpu,
> > which ignores pending softirqs on re-enablement.
> 
> Ah, I'm mistaken there, then.
> 
> >  Perhaps that needs to change,
> > but I'm averse to making scheduler changes for this (the aforementioned concern
> > about complex changes for a narrow use case)
> >
> > Regarding raising the priority of ksoftirqd, that could be a solution, but the
> > priority would need to be raised to a high priority SCHED_FIFO parameter, and
> > that gets back to making complex changes for a narrow problem domain
> >
> > As for the comlexity of the of the solution, I think this is, given your
> > comments the least complex and intrusive change to solve the given problem.
> 
> Could it be simpler to ensure do_softirq() gets run here? That would
> allow progress for this case.
> 
I'm not sure.  On the surface, we certainly could do it, but inserting a call to
do_softirq, either directly, or indirectly through some other mechanism seems
like a non-obvious fix, and may lead to confusion down the road.  I'm hesitant
to pursue such a soultion without some evidence it would make a better solution.

> >  We
> > need to find a way to force the calling task off the cpu while the asynchronous
> > operations in the transmit path complete, and we can do that this way, or by
> > honoring SK_SNDTIMEO.  I'm fine with doing the latter, but I didn't want to
> > alter the current protocol behavior without consensus on that.
> 
> In general SCHED_FIFO is dangerous with regard to stalling other
> progress, incl. ksoftirqd. But it does appear that this packet socket
> case is special inside networking in calling schedule() directly here.
> 
> If converting that, should it convert to logic more akin to other
> sockets, like sock_wait_for_wmem? I haven't had a chance to read up on
> the pros and cons of completion here yet, sorry. Didn't want to delay
> responding until after I get a chance.
> 
That would be the solution described above (i.e. honoring SK_SNDTIMEO.
Basically you call sock_send_waittimeo, which returns a timeout value, or 0 if
MSG_DONTWAIT is set), then you block for that period of time waiting for
transmit completion.  I'm happy to implement that solution, but I'd like to get
some clarity as to if there is a reason we don't currently honor that socket
option now before I change the behavior that way.

Dave, do you have any insight into AF_PACKET history as to why we would ignore
the send timeout socket option here?

Best
Neil


^ permalink raw reply

* [RFC PATCH 27/28] PCI/P2PDMA: Remove struct pages that back P2PDMA memory
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

There are no more users of the struct pages that back P2P memory,
so convert the devm_memremap_pages() call to devm_memremap() to remove
them.

The percpu_ref and completion are retained in struct p2pdma to track
when there is no memory allocated out of the genpool and it is safe
to free it.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/pci/p2pdma.c | 107 +++++++++++++------------------------------
 1 file changed, 33 insertions(+), 74 deletions(-)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 9b82e13f802c..83d93911f792 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -22,10 +22,6 @@
 struct pci_p2pdma {
 	struct gen_pool *pool;
 	bool p2pmem_published;
-};
-
-struct p2pdma_pagemap {
-	struct dev_pagemap pgmap;
 	struct percpu_ref ref;
 	struct completion ref_done;
 };
@@ -78,29 +74,12 @@ static const struct attribute_group p2pmem_group = {
 	.name = "p2pmem",
 };
 
-static struct p2pdma_pagemap *to_p2p_pgmap(struct percpu_ref *ref)
-{
-	return container_of(ref, struct p2pdma_pagemap, ref);
-}
-
 static void pci_p2pdma_percpu_release(struct percpu_ref *ref)
 {
-	struct p2pdma_pagemap *p2p_pgmap = to_p2p_pgmap(ref);
+	struct pci_p2pdma *p2pdma =
+		container_of(ref, struct pci_p2pdma, ref);
 
-	complete(&p2p_pgmap->ref_done);
-}
-
-static void pci_p2pdma_percpu_kill(struct percpu_ref *ref)
-{
-	percpu_ref_kill(ref);
-}
-
-static void pci_p2pdma_percpu_cleanup(struct percpu_ref *ref)
-{
-	struct p2pdma_pagemap *p2p_pgmap = to_p2p_pgmap(ref);
-
-	wait_for_completion(&p2p_pgmap->ref_done);
-	percpu_ref_exit(&p2p_pgmap->ref);
+	complete(&p2pdma->ref_done);
 }
 
 static void pci_p2pdma_release(void *data)
@@ -111,6 +90,10 @@ static void pci_p2pdma_release(void *data)
 	if (!p2pdma)
 		return;
 
+	percpu_ref_kill(&p2pdma->ref);
+	wait_for_completion(&p2pdma->ref_done);
+	percpu_ref_exit(&p2pdma->ref);
+
 	/* Flush and disable pci_alloc_p2p_mem() */
 	pdev->p2pdma = NULL;
 	synchronize_rcu();
@@ -128,10 +111,17 @@ static int pci_p2pdma_setup(struct pci_dev *pdev)
 	if (!p2p)
 		return -ENOMEM;
 
+	init_completion(&p2p->ref_done);
+
 	p2p->pool = gen_pool_create(PAGE_SHIFT, dev_to_node(&pdev->dev));
 	if (!p2p->pool)
 		goto out;
 
+	error = percpu_ref_init(&p2p->ref, pci_p2pdma_percpu_release, 0,
+				GFP_KERNEL);
+	if (error)
+		goto out_pool_destroy;
+
 	error = devm_add_action_or_reset(&pdev->dev, pci_p2pdma_release, pdev);
 	if (error)
 		goto out_pool_destroy;
@@ -165,8 +155,7 @@ static int pci_p2pdma_setup(struct pci_dev *pdev)
 int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
 			    u64 offset)
 {
-	struct p2pdma_pagemap *p2p_pgmap;
-	struct dev_pagemap *pgmap;
+	struct resource res;
 	void *addr;
 	int error;
 
@@ -188,50 +177,26 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
 			return error;
 	}
 
-	p2p_pgmap = devm_kzalloc(&pdev->dev, sizeof(*p2p_pgmap), GFP_KERNEL);
-	if (!p2p_pgmap)
-		return -ENOMEM;
+	res.start = pci_resource_start(pdev, bar) + offset;
+	res.end = res.start + size - 1;
+	res.flags = pci_resource_flags(pdev, bar);
 
-	init_completion(&p2p_pgmap->ref_done);
-	error = percpu_ref_init(&p2p_pgmap->ref,
-			pci_p2pdma_percpu_release, 0, GFP_KERNEL);
-	if (error)
-		goto pgmap_free;
-
-	pgmap = &p2p_pgmap->pgmap;
-
-	pgmap->res.start = pci_resource_start(pdev, bar) + offset;
-	pgmap->res.end = pgmap->res.start + size - 1;
-	pgmap->res.flags = pci_resource_flags(pdev, bar);
-	pgmap->ref = &p2p_pgmap->ref;
-	pgmap->type = MEMORY_DEVICE_PCI_P2PDMA;
-	pgmap->pci_p2pdma_bus_offset = pci_bus_address(pdev, bar) -
-		pci_resource_start(pdev, bar);
-	pgmap->kill = pci_p2pdma_percpu_kill;
-	pgmap->cleanup = pci_p2pdma_percpu_cleanup;
-
-	addr = devm_memremap_pages(&pdev->dev, pgmap);
-	if (IS_ERR(addr)) {
-		error = PTR_ERR(addr);
-		goto pgmap_free;
-	}
+	addr = devm_memremap(&pdev->dev, res.start, size, MEMREMAP_WC);
+	if (IS_ERR(addr))
+		return PTR_ERR(addr);
 
-	error = gen_pool_add_owner(pdev->p2pdma->pool, (unsigned long)addr,
-			pci_bus_address(pdev, bar) + offset,
-			resource_size(&pgmap->res), dev_to_node(&pdev->dev),
-			&p2p_pgmap->ref);
+	error = gen_pool_add_virt(pdev->p2pdma->pool, (unsigned long)addr,
+				  pci_bus_address(pdev, bar) + offset, size,
+				  dev_to_node(&pdev->dev));
 	if (error)
 		goto pages_free;
 
-	pci_info(pdev, "added peer-to-peer DMA memory %pR\n",
-		 &pgmap->res);
+	pci_info(pdev, "added peer-to-peer DMA memory %pR\n", &res);
 
 	return 0;
 
 pages_free:
-	devm_memunmap_pages(&pdev->dev, pgmap);
-pgmap_free:
-	devm_kfree(&pdev->dev, p2p_pgmap);
+	devm_memunmap(&pdev->dev, addr);
 	return error;
 }
 EXPORT_SYMBOL_GPL(pci_p2pdma_add_resource);
@@ -601,7 +566,6 @@ EXPORT_SYMBOL_GPL(pci_p2pmem_find_many);
 void *pci_alloc_p2pmem(struct pci_dev *pdev, size_t size)
 {
 	void *ret = NULL;
-	struct percpu_ref *ref;
 
 	/*
 	 * Pairs with synchronize_rcu() in pci_p2pdma_release() to
@@ -612,16 +576,13 @@ void *pci_alloc_p2pmem(struct pci_dev *pdev, size_t size)
 	if (unlikely(!pdev->p2pdma))
 		goto out;
 
-	ret = (void *)gen_pool_alloc_owner(pdev->p2pdma->pool, size,
-			(void **) &ref);
-	if (!ret)
+	if (unlikely(!percpu_ref_tryget_live(&pdev->p2pdma->ref)))
 		goto out;
 
-	if (unlikely(!percpu_ref_tryget_live(ref))) {
-		gen_pool_free(pdev->p2pdma->pool, (unsigned long) ret, size);
-		ret = NULL;
-		goto out;
-	}
+	ret = (void *)gen_pool_alloc(pdev->p2pdma->pool, size);
+	if (!ret)
+		percpu_ref_put(&pdev->p2pdma->ref);
+
 out:
 	rcu_read_unlock();
 	return ret;
@@ -636,11 +597,9 @@ EXPORT_SYMBOL_GPL(pci_alloc_p2pmem);
  */
 void pci_free_p2pmem(struct pci_dev *pdev, void *addr, size_t size)
 {
-	struct percpu_ref *ref;
+	gen_pool_free(pdev->p2pdma->pool, (uintptr_t)addr, size);
 
-	gen_pool_free_owner(pdev->p2pdma->pool, (uintptr_t)addr, size,
-			(void **) &ref);
-	percpu_ref_put(ref);
+	percpu_ref_put(&pdev->p2pdma->ref);
 }
 EXPORT_SYMBOL_GPL(pci_free_p2pmem);
 
-- 
2.20.1


^ permalink raw reply related

* [RFC PATCH 10/28] block: Create generic vec_split_segs() from bvec_split_segs()
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

bvec_split_segs() only requires the address and length of the
vector. In order to generalize it to work with dma_vecs, we just
take the address and length directly instead of the bio_vec.

The function is renamed to vec_split_segs() and a helper is added
to avoid having to adjust the existing callsites.

Note: the new bvec_split_segs() helper will be removed in a subsequent
patch.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 block/blk-merge.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 17713d7d98d5..3581c7ac3c1b 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -158,13 +158,13 @@ static unsigned get_max_segment_size(struct request_queue *q,
 }
 
 /*
- * Split the bvec @bv into segments, and update all kinds of
- * variables.
+ * Split the an address/offset and length into segments, and
+ * update all kinds of variables.
  */
-static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
-		unsigned *nsegs, unsigned *sectors, unsigned max_segs)
+static bool vec_split_segs(struct request_queue *q, unsigned offset,
+			   unsigned len, unsigned *nsegs, unsigned *sectors,
+			   unsigned max_segs)
 {
-	unsigned len = bv->bv_len;
 	unsigned total_len = 0;
 	unsigned new_nsegs = 0, seg_size = 0;
 
@@ -173,14 +173,14 @@ static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
 	 * current bvec has to be splitted as multiple segments.
 	 */
 	while (len && new_nsegs + *nsegs < max_segs) {
-		seg_size = get_max_segment_size(q, bv->bv_offset + total_len);
+		seg_size = get_max_segment_size(q, offset + total_len);
 		seg_size = min(seg_size, len);
 
 		new_nsegs++;
 		total_len += seg_size;
 		len -= seg_size;
 
-		if ((bv->bv_offset + total_len) & queue_virt_boundary(q))
+		if ((offset + total_len) & queue_virt_boundary(q))
 			break;
 	}
 
@@ -194,6 +194,13 @@ static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
 	return !!len;
 }
 
+static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
+		unsigned *nsegs, unsigned *sectors, unsigned max_segs)
+{
+	return vec_split_segs(q, bv->bv_offset, bv->bv_len, nsegs,
+			      sectors, max_segs);
+}
+
 static struct bio *blk_bio_segment_split(struct request_queue *q,
 					 struct bio *bio,
 					 struct bio_set *bs,
-- 
2.20.1


^ permalink raw reply related

* [RFC PATCH 09/28] block: Introduce vec_gap_to_prev()
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

Introduce vec_gap_to_prev() which is a more general
form of bvec_gap_to_prev().

In order to support splitting dma_vecs we will need to do a similar
calcualtion using the DMA address and length.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 block/blk.h | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/block/blk.h b/block/blk.h
index 4142383eed7a..c5512fefe703 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -98,11 +98,19 @@ static inline bool dmavec_phys_mergeable(struct request_queue *q,
 				  vec2->dv_addr, vec2->dv_len);
 }
 
+static inline bool __vec_gap_to_prev(struct request_queue *q,
+		unsigned int prv_offset, unsigned int prv_len,
+		unsigned int nxt_offset)
+{
+	return (nxt_offset & queue_virt_boundary(q)) ||
+		((prv_offset + prv_len) & queue_virt_boundary(q));
+}
+
 static inline bool __bvec_gap_to_prev(struct request_queue *q,
 		struct bio_vec *bprv, unsigned int offset)
 {
-	return (offset & queue_virt_boundary(q)) ||
-		((bprv->bv_offset + bprv->bv_len) & queue_virt_boundary(q));
+	return __vec_gap_to_prev(q, bprv->bv_offset, bprv->bv_len,
+				 offset);
 }
 
 /*
@@ -117,6 +125,15 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
 	return __bvec_gap_to_prev(q, bprv, offset);
 }
 
+static inline bool vec_gap_to_prev(struct request_queue *q,
+		unsigned int prv_offset, unsigned int prv_len,
+		unsigned int nxt_offset)
+{
+	if (!queue_virt_boundary(q))
+		return false;
+	return __vec_gap_to_prev(q, prv_offset, prv_len, nxt_offset);
+}
+
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 void blk_flush_integrity(void);
 bool __bio_integrity_endio(struct bio *);
-- 
2.20.1


^ permalink raw reply related

* Re: [igt-dev] [PATCH v2] lib/i915: gem_engine_topology: get eb flags from engine's class:instance
From: Tvrtko Ursulin @ 2019-06-20 16:14 UTC (permalink / raw)
  To: Andi Shyti, IGT dev; +Cc: Andi Shyti
In-Reply-To: <20190620131453.10203-1-andi.shyti@intel.com>


On 20/06/2019 14:14, Andi Shyti wrote:
> The execution buffer flag value has now the engine index as it is
> mapped in the context. Retrieve the mapped index by interrogating
> the driver starting from the class/instance tuple.
> 
> A "gem_context_get_eb_flags_ci" helper allows to avoid declaring
> a "struct i915_engine_class_instance" for the purpose.
> 
> Return -EINVAL if the engine is not mapped in the given context.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Andi Shyti <andi.shyti@intel.com>
> Cc: Ramalingam C <ramalingam.c@intel.com>
> ---
> V1 --> V2 changelog:
> --------------------
> - refactor the code to avoid initializing the context just for
>    the purpose of getting the execution buffer flag (thanks
>    Tvrtko)
> 
>   lib/i915/gem_engine_topology.c | 31 +++++++++++++++++++++++++++++++
>   lib/i915/gem_engine_topology.h |  6 ++++++
>   2 files changed, 37 insertions(+)
> 
> diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
> index fdd1b951672b..fd5be3491b89 100644
> --- a/lib/i915/gem_engine_topology.c
> +++ b/lib/i915/gem_engine_topology.c
> @@ -270,6 +270,37 @@ int gem_context_lookup_engine(int fd, uint64_t engine, uint32_t ctx_id,
>   	return 0;
>   }
>   
> +int gem_context_get_eb_flags(int fd, uint32_t ctx_id,
> +			     struct i915_engine_class_instance *ci)
> +{
> +	DEFINE_CONTEXT_ENGINES_PARAM(engines, param, ctx_id, GEM_MAX_ENGINES);
> +
> +	/* legacy kernels */
> +	if (gem_topology_get_param(fd, &param)) {
> +		const struct intel_execution_engine2 *e;
> +
> +		__for_each_static_engine(e)
> +			if (e->class == ci->engine_class &&
> +			    e->instance == ci->engine_instance)
> +				return e->flags;
> +
> +		return -EINVAL;
> +	}
> +
> +	/* context has no engine mapped */
> +	if (!param.size)
> +		return -EINVAL;
> +
> +	/* engine map lookup */
> +	for (int i = 0; i < param.size; i++)
> +		if (engines.engines[i].engine_class == ci->engine_class &&
> +		    engines.engines[i].engine_instance == ci->engine_instance)
> +			return i;
> +
> +	/* engine is not mapped in the given context */
> +	return -EINVAL;
> +}

Looks good to me.. ;)

> +
>   void gem_context_set_all_engines(int fd, uint32_t ctx)
>   {
>   	DEFINE_CONTEXT_ENGINES_PARAM(engines, param, ctx, GEM_MAX_ENGINES);
> diff --git a/lib/i915/gem_engine_topology.h b/lib/i915/gem_engine_topology.h
> index 2415fd1e379b..57b5473bbd5a 100644
> --- a/lib/i915/gem_engine_topology.h
> +++ b/lib/i915/gem_engine_topology.h
> @@ -53,6 +53,12 @@ int gem_context_lookup_engine(int fd, uint64_t engine, uint32_t ctx_id,
>   
>   void gem_context_set_all_engines(int fd, uint32_t ctx);
>   
> +int gem_context_get_eb_flags(int fd, uint32_t ctx_id,
> +			     struct i915_engine_class_instance *ci);
> +
> +#define gem_context_get_eb_flags_ci(f, c, ...) \
> +	gem_context_get_eb_flags(f, c, &((struct i915_engine_class_instance){__VA_ARGS__}))
> +

Hah this is some trick. I assume this allows:

eb.flags = gem_context_get_eb_flags(fd, ctx, ..._RENDER, 0);

?

What if too few or too many parameters are given? I'm in two minds but 
can't argue it is very to be able to do this in IGT.

>   #define __for_each_static_engine(e__) \
>   	for ((e__) = intel_execution_engines2; (e__)->name; (e__)++)
>   
> 

Can you extend the series with a patch which converts the problematic 
subtests in perf_pmu to use this helper? Or even merge into this patch, 
I don't mind. Would have some moral grounds to r-b it then. ;)

Regards,

Tvrtko
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply

* [RFC PATCH 13/28] block: Generalize bvec_should_split()
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

bvec_should_split() will need to also operate on dma_vecs so
generalize it to take an offset and length instead of a bio_vec.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 block/blk-merge.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index d9e89c0ad40d..32653fca53ce 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -206,23 +206,25 @@ struct blk_segment_split_ctx {
 	unsigned sectors;
 
 	bool prv_valid;
-	struct bio_vec bvprv;
+	unsigned prv_offset;
+	unsigned prv_len;
 
 	const unsigned max_sectors;
 	const unsigned max_segs;
 };
 
-static bool bvec_should_split(struct request_queue *q, struct bio_vec *bv,
-			      struct blk_segment_split_ctx *ctx)
+static bool vec_should_split(struct request_queue *q, unsigned offset,
+			     unsigned len, struct blk_segment_split_ctx *ctx)
 {
 	/*
 	 * If the queue doesn't support SG gaps and adding this
 	 * offset would create a gap, disallow it.
 	 */
-	if (ctx->prv_valid && bvec_gap_to_prev(q, &ctx->bvprv, bv->bv_offset))
+	if (ctx->prv_valid &&
+	    vec_gap_to_prev(q, ctx->prv_offset, ctx->prv_len, offset))
 		return true;
 
-	if (ctx->sectors + (bv->bv_len >> 9) > ctx->max_sectors) {
+	if (ctx->sectors + (len >> 9) > ctx->max_sectors) {
 		/*
 		 * Consider this a new segment if we're splitting in
 		 * the middle of this vector.
@@ -230,9 +232,9 @@ static bool bvec_should_split(struct request_queue *q, struct bio_vec *bv,
 		if (ctx->nsegs < ctx->max_segs &&
 		    ctx->sectors < ctx->max_sectors) {
 			/* split in the middle of bvec */
-			bv->bv_len = (ctx->max_sectors - ctx->sectors) << 9;
-			bvec_split_segs(q, bv, &ctx->nsegs,
-					&ctx->sectors, ctx->max_segs);
+			len = (ctx->max_sectors - ctx->sectors) << 9;
+			vec_split_segs(q, offset, len, &ctx->nsegs,
+				       &ctx->sectors, ctx->max_segs);
 		}
 		return true;
 	}
@@ -240,14 +242,15 @@ static bool bvec_should_split(struct request_queue *q, struct bio_vec *bv,
 	if (ctx->nsegs == ctx->max_segs)
 		return true;
 
-	ctx->bvprv = *bv;
+	ctx->prv_offset = offset;
+	ctx->prv_len = len;
 	ctx->prv_valid = true;
 
-	if (bv->bv_offset + bv->bv_len <= PAGE_SIZE) {
+	if (offset + len <= PAGE_SIZE) {
 		ctx->nsegs++;
-		ctx->sectors += bv->bv_len >> 9;
-	} else if (bvec_split_segs(q, bv, &ctx->nsegs, &ctx->sectors,
-				   ctx->max_segs)) {
+		ctx->sectors += len >> 9;
+	} else if (vec_split_segs(q, offset, len, &ctx->nsegs, &ctx->sectors,
+				  ctx->max_segs)) {
 		return true;
 	}
 
@@ -270,7 +273,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 	};
 
 	bio_for_each_bvec(bv, bio, iter) {
-		do_split = bvec_should_split(q, &bv, &ctx);
+		do_split = vec_should_split(q, bv.bv_offset, bv.bv_len, &ctx);
 		if (do_split)
 			break;
 	}
-- 
2.20.1


^ permalink raw reply related

* [RFC PATCH 05/28] block: Skip dma-direct bios in bio_integrity_prep()
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

The block layer will not be able to handle integrity for dma-direct
bios seeing it does not have access to the underlying data.

If users of dma-direct require integrity, they will have to handle it
in the layer creating the bios. This is left as future work should
somebody care about handling such a case.

Thus, bio_integrity_prep() should ignore dma-direct bios.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 block/bio-integrity.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 4db620849515..10fdf456fcd8 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -221,6 +221,10 @@ bool bio_integrity_prep(struct bio *bio)
 	if (bio_integrity(bio))
 		return true;
 
+	/* The block layer cannot handle integrity for dma-direct bios */
+	if (bio_is_dma_direct(bio))
+		return true;
+
 	if (bio_data_dir(bio) == READ) {
 		if (!bi->profile->verify_fn ||
 		    !(bi->flags & BLK_INTEGRITY_VERIFY))
-- 
2.20.1


^ permalink raw reply related

* [Buildroot] [git commit] support/download/git: fix formatting of error message
From: Thomas Petazzoni @ 2019-06-20 16:14 UTC (permalink / raw)
  To: buildroot

commit: https://git.buildroot.net/buildroot/commit/?id=8dd1a41630fff72638b7942c926c2f50095ab0d6
branch: https://git.buildroot.net/buildroot/commit/?id=refs/heads/master

'.' should be at the end of the sentence, not the beginning of a new
line.

Signed-off-by: John Keeping <john@metanate.com>
Cc: Yann E. MORIN <yann.morin.1998@free.fr>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
---
 support/download/git | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/support/download/git b/support/download/git
index 17ca04eb98..075f665bbf 100755
--- a/support/download/git
+++ b/support/download/git
@@ -130,7 +130,7 @@ fi
 # scratch won't help, so we don't want to trash the repository for a
 # missing commit. We just exit without going through the ERR trap.
 if ! _git rev-parse --quiet --verify "'${cset}^{commit}'" >/dev/null 2>&1; then
-    printf "Commit '%s' does not exist in this repository\n." "${cset}"
+    printf "Commit '%s' does not exist in this repository.\n" "${cset}"
     exit 1
 fi
 

^ permalink raw reply related

* [RFC PATCH 17/28] block: Introduce queue flag to indicate support for dma-direct bios
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

Queues will need to advertise support to accept dma-direct requests.

The existing PCI P2P support which will be replaced by this and thus
the P2P flag will be dropped in a subsequent patch.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 include/linux/blkdev.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ce70d5dded5f..a5b856324276 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -616,6 +616,7 @@ struct request_queue {
 #define QUEUE_FLAG_SCSI_PASSTHROUGH 23	/* queue supports SCSI commands */
 #define QUEUE_FLAG_QUIESCED	24	/* queue has been quiesced */
 #define QUEUE_FLAG_PCI_P2PDMA	25	/* device supports PCI p2p requests */
+#define QUEUE_FLAG_DMA_DIRECT	26	/* device supports dma-addr requests */
 
 #define QUEUE_FLAG_MQ_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_SAME_COMP))
@@ -642,6 +643,8 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 	test_bit(QUEUE_FLAG_SCSI_PASSTHROUGH, &(q)->queue_flags)
 #define blk_queue_pci_p2pdma(q)	\
 	test_bit(QUEUE_FLAG_PCI_P2PDMA, &(q)->queue_flags)
+#define blk_queue_dma_direct(q)	\
+	test_bit(QUEUE_FLAG_DMA_DIRECT, &(q)->queue_flags)
 
 #define blk_noretry_request(rq) \
 	((rq)->cmd_flags & (REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT| \
-- 
2.20.1


^ permalink raw reply related

* [RFC PATCH 01/28] block: Introduce DMA direct request type
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

A DMA direct request allows passing DMA addresses directly through
the block layer, instead of struct pages. This allows the calling
layer to take care of the mapping and unmapping and also creates
a path to doing peer-to-peer transactions without using struct pages.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 include/linux/blk_types.h |  9 ++++++++-
 include/linux/blkdev.h    | 10 ++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 95202f80676c..f3cabfdb6774 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -322,6 +322,7 @@ enum req_flag_bits {
 	__REQ_NOUNMAP,		/* do not free blocks when zeroing */
 
 	__REQ_HIPRI,
+	__REQ_DMA_DIRECT,	/* DMA address direct request */
 
 	/* for driver use */
 	__REQ_DRV,
@@ -345,6 +346,7 @@ enum req_flag_bits {
 #define REQ_NOWAIT		(1ULL << __REQ_NOWAIT)
 #define REQ_NOUNMAP		(1ULL << __REQ_NOUNMAP)
 #define REQ_HIPRI		(1ULL << __REQ_HIPRI)
+#define REQ_DMA_DIRECT		(1ULL << __REQ_DMA_DIRECT)
 
 #define REQ_DRV			(1ULL << __REQ_DRV)
 #define REQ_SWAP		(1ULL << __REQ_SWAP)
@@ -353,7 +355,7 @@ enum req_flag_bits {
 	(REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)
 
 #define REQ_NOMERGE_FLAGS \
-	(REQ_NOMERGE | REQ_PREFLUSH | REQ_FUA)
+	(REQ_NOMERGE | REQ_PREFLUSH | REQ_FUA | REQ_DMA_DIRECT)
 
 enum stat_group {
 	STAT_READ,
@@ -412,6 +414,11 @@ static inline int op_stat_group(unsigned int op)
 	return op_is_write(op);
 }
 
+static inline int op_is_dma_direct(unsigned int op)
+{
+	return op & REQ_DMA_DIRECT;
+}
+
 typedef unsigned int blk_qc_t;
 #define BLK_QC_T_NONE		-1U
 #define BLK_QC_T_SHIFT		16
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 592669bcc536..ce70d5dded5f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -271,6 +271,16 @@ static inline bool bio_is_passthrough(struct bio *bio)
 	return blk_op_is_scsi(op) || blk_op_is_private(op);
 }
 
+static inline bool bio_is_dma_direct(struct bio *bio)
+{
+	return op_is_dma_direct(bio->bi_opf);
+}
+
+static inline bool blk_rq_is_dma_direct(struct request *rq)
+{
+	return op_is_dma_direct(rq->cmd_flags);
+}
+
 static inline unsigned short req_get_ioprio(struct request *req)
 {
 	return req->ioprio;
-- 
2.20.1


^ permalink raw reply related

* [RFC PATCH 03/28] block: Warn on mis-use of dma-direct bios
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

This is a result of an audit of users of 'bi_io_vec'. A number of
warnings and blocking conditions are added to ensure dma-direct bios
are not incorrectly accessing the 'bi_io_vec' when they should access
the 'bi_dma_vec'. These are largely just protecting against mis-uses
in future development so depending on taste and public opinion some
or all of these checks may not be necessary.

A few other issues with dma-direct bios will be tackled in subsequent
patches.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 block/bio.c      | 33 +++++++++++++++++++++++++++++++++
 block/blk-core.c |  3 +++
 2 files changed, 36 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index 683cbb40f051..6998fceddd36 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -525,6 +525,9 @@ void zero_fill_bio_iter(struct bio *bio, struct bvec_iter start)
 	struct bio_vec bv;
 	struct bvec_iter iter;
 
+	if (WARN_ON_ONCE(bio_is_dma_direct(bio)))
+		return;
+
 	__bio_for_each_segment(bv, bio, iter, start) {
 		char *data = bvec_kmap_irq(&bv, &flags);
 		memset(data, 0, bv.bv_len);
@@ -707,6 +710,8 @@ static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 	 */
 	if (unlikely(bio_flagged(bio, BIO_CLONED)))
 		return 0;
+	if (unlikely(bio_is_dma_direct(bio)))
+		return 0;
 
 	if (((bio->bi_iter.bi_size + len) >> 9) > queue_max_hw_sectors(q))
 		return 0;
@@ -783,6 +788,8 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
 {
 	if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
 		return false;
+	if (WARN_ON_ONCE(bio_is_dma_direct(bio)))
+		return false;
 
 	if (bio->bi_vcnt > 0) {
 		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
@@ -814,6 +821,7 @@ void __bio_add_page(struct bio *bio, struct page *page,
 
 	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));
 	WARN_ON_ONCE(bio_full(bio));
+	WARN_ON_ONCE(bio_is_dma_direct(bio));
 
 	bv->bv_page = page;
 	bv->bv_offset = off;
@@ -851,6 +859,8 @@ static void bio_get_pages(struct bio *bio)
 	struct bvec_iter_all iter_all;
 	struct bio_vec *bvec;
 
+	WARN_ON_ONCE(bio_is_dma_direct(bio));
+
 	bio_for_each_segment_all(bvec, bio, iter_all)
 		get_page(bvec->bv_page);
 }
@@ -956,6 +966,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 
 	if (WARN_ON_ONCE(bio->bi_vcnt))
 		return -EINVAL;
+	if (WARN_ON_ONCE(bio_is_dma_direct(bio)))
+		return -EINVAL;
 
 	do {
 		if (is_bvec)
@@ -1029,6 +1041,9 @@ void bio_copy_data_iter(struct bio *dst, struct bvec_iter *dst_iter,
 	void *src_p, *dst_p;
 	unsigned bytes;
 
+	if (WARN_ON_ONCE(bio_is_dma_direct(src) || bio_is_dma_direct(dst)))
+		return;
+
 	while (src_iter->bi_size && dst_iter->bi_size) {
 		src_bv = bio_iter_iovec(src, *src_iter);
 		dst_bv = bio_iter_iovec(dst, *dst_iter);
@@ -1143,6 +1158,9 @@ static int bio_copy_from_iter(struct bio *bio, struct iov_iter *iter)
 	struct bio_vec *bvec;
 	struct bvec_iter_all iter_all;
 
+	if (WARN_ON_ONCE(bio_is_dma_direct(bio)))
+		return -EINVAL;
+
 	bio_for_each_segment_all(bvec, bio, iter_all) {
 		ssize_t ret;
 
@@ -1174,6 +1192,9 @@ static int bio_copy_to_iter(struct bio *bio, struct iov_iter iter)
 	struct bio_vec *bvec;
 	struct bvec_iter_all iter_all;
 
+	if (WARN_ON_ONCE(bio_is_dma_direct(bio)))
+		return -EINVAL;
+
 	bio_for_each_segment_all(bvec, bio, iter_all) {
 		ssize_t ret;
 
@@ -1197,6 +1218,9 @@ void bio_free_pages(struct bio *bio)
 	struct bio_vec *bvec;
 	struct bvec_iter_all iter_all;
 
+	if (WARN_ON_ONCE(bio_is_dma_direct(bio)))
+		return;
+
 	bio_for_each_segment_all(bvec, bio, iter_all)
 		__free_page(bvec->bv_page);
 }
@@ -1653,6 +1677,9 @@ void bio_set_pages_dirty(struct bio *bio)
 	struct bio_vec *bvec;
 	struct bvec_iter_all iter_all;
 
+	if (unlikely(bio_is_dma_direct(bio)))
+		return;
+
 	bio_for_each_segment_all(bvec, bio, iter_all) {
 		if (!PageCompound(bvec->bv_page))
 			set_page_dirty_lock(bvec->bv_page);
@@ -1704,6 +1731,9 @@ void bio_check_pages_dirty(struct bio *bio)
 	unsigned long flags;
 	struct bvec_iter_all iter_all;
 
+	if (unlikely(bio_is_dma_direct(bio)))
+		return;
+
 	bio_for_each_segment_all(bvec, bio, iter_all) {
 		if (!PageDirty(bvec->bv_page) && !PageCompound(bvec->bv_page))
 			goto defer;
@@ -1777,6 +1807,9 @@ void bio_flush_dcache_pages(struct bio *bi)
 	struct bio_vec bvec;
 	struct bvec_iter iter;
 
+	if (unlikely(bio_is_dma_direct(bi)))
+		return;
+
 	bio_for_each_segment(bvec, bi, iter)
 		flush_dcache_page(bvec.bv_page);
 }
diff --git a/block/blk-core.c b/block/blk-core.c
index 8340f69670d8..ea152d54c7ce 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1467,6 +1467,9 @@ void rq_flush_dcache_pages(struct request *rq)
 	struct req_iterator iter;
 	struct bio_vec bvec;
 
+	if (unlikely(blk_rq_is_dma_direct(rq)))
+		return;
+
 	rq_for_each_segment(bvec, rq, iter)
 		flush_dcache_page(bvec.bv_page);
 }
-- 
2.20.1


^ permalink raw reply related

* [Buildroot] [PATCH] download/git: fix formatting of error message
From: Thomas Petazzoni @ 2019-06-20 16:15 UTC (permalink / raw)
  To: buildroot
In-Reply-To: <20190619150526.31746-1-john@metanate.com>

On Wed, 19 Jun 2019 16:05:26 +0100
John Keeping <john@metanate.com> wrote:

> '.' should be at the end of the sentence, not the beginning of a new
> line.
> 
> Signed-off-by: John Keeping <john@metanate.com>
> Cc: Yann E. MORIN <yann.morin.1998@free.fr>
> ---
>  support/download/git | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied to master, thanks.

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply

* [RFC PATCH 02/28] block: Add dma_vec structure
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

The dma_vec structure is similar to the bio_vec structure except
it only stores DMA addresses instead of the struct page address.

struct bios will be able to make use of dma_vecs with a union and,
therefore, we need to ensure that struct dma_vec is no larger
than struct bvec, as they will share the allocated memory.

dma_vecs can make the same use of the bvec_iter structure
to iterate through the vectors.

This will be used for passing DMA addresses directly through the block
layer. I expect something like struct dma_vec will also be used in
Christoph's work to improve the dma_mapping layer and remove sgls.
At some point, these would use the same structure.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 include/linux/bio.h       | 12 +++++++++++
 include/linux/blk_types.h |  5 ++++-
 include/linux/bvec.h      | 43 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 0f23b5682640..8180309123d7 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -28,6 +28,8 @@
 
 #define bio_iter_iovec(bio, iter)				\
 	bvec_iter_bvec((bio)->bi_io_vec, (iter))
+#define bio_iter_dma_vec(bio, iter)				\
+	bvec_iter_dvec((bio)->bi_dma_vec, (iter))
 
 #define bio_iter_page(bio, iter)				\
 	bvec_iter_page((bio)->bi_io_vec, (iter))
@@ -39,6 +41,7 @@
 #define bio_page(bio)		bio_iter_page((bio), (bio)->bi_iter)
 #define bio_offset(bio)		bio_iter_offset((bio), (bio)->bi_iter)
 #define bio_iovec(bio)		bio_iter_iovec((bio), (bio)->bi_iter)
+#define bio_dma_vec(bio)	bio_iter_dma_vec((bio), (bio)->bi_iter)
 
 #define bio_multiple_segments(bio)				\
 	((bio)->bi_iter.bi_size != bio_iovec(bio).bv_len)
@@ -155,6 +158,15 @@ static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
 #define bio_for_each_bvec(bvl, bio, iter)			\
 	__bio_for_each_bvec(bvl, bio, iter, (bio)->bi_iter)
 
+#define __bio_for_each_dvec(dvl, bio, iter, start)		\
+	for (iter = (start);						\
+	     (iter).bi_size &&						\
+		((dvl = bvec_iter_dvec((bio)->bi_dma_vec, (iter))), 1); \
+	     dvec_iter_advance((bio)->bi_dma_vec, &(iter), (dvl).dv_len))
+
+#define bio_for_each_dvec(dvl, bio, iter)			\
+	__bio_for_each_dvec(dvl, bio, iter, (bio)->bi_iter)
+
 #define bio_iter_last(bvec, iter) ((iter).bi_size == (bvec).bv_len)
 
 static inline unsigned bio_segments(struct bio *bio)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index f3cabfdb6774..7f76ea73b77d 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -191,7 +191,10 @@ struct bio {
 
 	atomic_t		__bi_cnt;	/* pin count */
 
-	struct bio_vec		*bi_io_vec;	/* the actual vec list */
+	union {
+		struct bio_vec	*bi_io_vec;	/* the actual vec list */
+		struct dma_vec	*bi_dma_vec;	/* for dma direct bios*/
+	};
 
 	struct bio_set		*bi_pool;
 
diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index a032f01e928c..f680e96132ef 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -21,6 +21,11 @@ struct bio_vec {
 	unsigned int	bv_offset;
 };
 
+struct dma_vec {
+	dma_addr_t	dv_addr;
+	unsigned int	dv_len;
+};
+
 struct bvec_iter {
 	sector_t		bi_sector;	/* device address in 512 byte
 						   sectors */
@@ -84,6 +89,18 @@ struct bvec_iter_all {
 	.bv_offset	= bvec_iter_offset((bvec), (iter)),	\
 })
 
+#define bvec_iter_dvec_addr(dvec, iter)	\
+	(__bvec_iter_bvec((dvec), (iter))->dv_addr + (iter).bi_bvec_done)
+#define bvec_iter_dvec_len(dvec, iter)	\
+	min((iter).bi_size,					\
+	    __bvec_iter_bvec((dvec), (iter))->dv_len - (iter).bi_bvec_done)
+
+#define bvec_iter_dvec(dvec, iter)				\
+((struct dma_vec) {						\
+	.dv_addr	= bvec_iter_dvec_addr((dvec), (iter)),	\
+	.dv_len		= bvec_iter_dvec_len((dvec), (iter)),	\
+})
+
 static inline bool bvec_iter_advance(const struct bio_vec *bv,
 		struct bvec_iter *iter, unsigned bytes)
 {
@@ -110,6 +127,32 @@ static inline bool bvec_iter_advance(const struct bio_vec *bv,
 	return true;
 }
 
+static inline bool dvec_iter_advance(const struct dma_vec *dv,
+		struct bvec_iter *iter, unsigned bytes)
+{
+	if (WARN_ONCE(bytes > iter->bi_size,
+		      "Attempted to advance past end of dvec iter\n")) {
+		iter->bi_size = 0;
+		return false;
+	}
+
+	while (bytes) {
+		const struct dma_vec *cur = dv + iter->bi_idx;
+		unsigned len = min3(bytes, iter->bi_size,
+				    cur->dv_len - iter->bi_bvec_done);
+
+		bytes -= len;
+		iter->bi_size -= len;
+		iter->bi_bvec_done += len;
+
+		if (iter->bi_bvec_done == cur->dv_len) {
+			iter->bi_bvec_done = 0;
+			iter->bi_idx++;
+		}
+	}
+	return true;
+}
+
 #define for_each_bvec(bvl, bio_vec, iter, start)			\
 	for (iter = (start);						\
 	     (iter).bi_size &&						\
-- 
2.20.1


^ permalink raw reply related

* [RFC PATCH 00/28] Removing struct page from P2PDMA
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe

For eons there has been a debate over whether or not to use
struct pages for peer-to-peer DMA transactions. Pro-pagers have
argued that struct pages are necessary for interacting with
existing code like scatterlists or the bio_vecs. Anti-pagers
assert that the tracking of the memory is unecessary and
allocating the pages is a waste of memory. Both viewpoints are
valid, however developers working on GPUs and RDMA tend to be
able to do away with struct pages relatively easily compared to
those wanting to work with NVMe devices through the block layer.
So it would be of great value to be able to universally do P2PDMA
transactions without the use of struct pages.

Previously, there have been multiple attempts[1][2] to replace
struct page usage with pfn_t but this has been unpopular seeing
it creates dangerous edge cases where unsuspecting code might
run accross pfn_t's they are not ready for.

Currently, we have P2PDMA using struct pages through the block layer
and the dangerous cases are avoided by using a queue flag that
indicates support for the special pages.

This RFC proposes a new solution: allow the block layer to take
DMA addresses directly for queues that indicate support. This will
provide a more general path for doing P2PDMA-like requests and will
allow us to remove the struct pages that back P2PDMA memory thus paving
the way to build a more uniform P2PDMA ecosystem.

This is a fairly long patch set but most of the patches are quite
small. Patches 1 through 18 introduce the concept of a dma_vec that
is similar to a bio_vec (except it takes dma_addr_t's instead of pages
and offsets) as well as a special dma-direct bio/request. Most of these
patches just prevent the new type of bio from being mis-used and
also support splitting and mapping them in the same way that struct
page bios can be operated on. Patches 19 through 22 modify the existing
P2PDMA support in nvme-pci, ib-core and nvmet to use DMA addresses
directly. Patches 23 through 25 remove the P2PDMA specific
code from the block layer and ib-core. Finally, patches 26 through 28
remove the struct pages from the PCI P2PDMA code.

This RFC is based on v5.2-rc5 and a git branch is available here:

https://github.com/sbates130272/linux-p2pmem.git dma_direct_rfc1

[1] https://lwn.net/Articles/647404/
[2] https://lore.kernel.org/lkml/1495662147-18277-1-git-send-email-logang@deltatee.com/

--

Logan Gunthorpe (28):
  block: Introduce DMA direct request type
  block: Add dma_vec structure
  block: Warn on mis-use of dma-direct bios
  block: Never bounce dma-direct bios
  block: Skip dma-direct bios in bio_integrity_prep()
  block: Support dma-direct bios in bio_advance_iter()
  block: Use dma_vec length in bio_cur_bytes() for dma-direct bios
  block: Introduce dmavec_phys_mergeable()
  block: Introduce vec_gap_to_prev()
  block: Create generic vec_split_segs() from bvec_split_segs()
  block: Create blk_segment_split_ctx
  block: Create helper for bvec_should_split()
  block: Generalize bvec_should_split()
  block: Support splitting dma-direct bios
  block: Support counting dma-direct bio segments
  block: Implement mapping dma-direct requests to SGs in blk_rq_map_sg()
  block: Introduce queue flag to indicate support for dma-direct bios
  block: Introduce bio_add_dma_addr()
  nvme-pci: Support dma-direct bios
  IB/core: Introduce API for initializing a RW ctx from a DMA address
  nvmet: Split nvmet_bdev_execute_rw() into a helper function
  nvmet: Use DMA addresses instead of struct pages for P2P
  nvme-pci: Remove support for PCI_P2PDMA requests
  block: Remove PCI_P2PDMA queue flag
  IB/core: Remove P2PDMA mapping support in rdma_rw_ctx
  PCI/P2PDMA: Remove SGL helpers
  PCI/P2PDMA: Remove struct pages that back P2PDMA memory
  memremap: Remove PCI P2PDMA page memory type

 Documentation/driver-api/pci/p2pdma.rst |   9 +-
 block/bio-integrity.c                   |   4 +
 block/bio.c                             |  71 +++++++
 block/blk-core.c                        |   3 +
 block/blk-merge.c                       | 256 ++++++++++++++++++------
 block/blk.h                             |  49 ++++-
 block/bounce.c                          |   8 +
 drivers/infiniband/core/rw.c            |  85 ++++++--
 drivers/nvme/host/core.c                |   4 +-
 drivers/nvme/host/nvme.h                |   2 +-
 drivers/nvme/host/pci.c                 |  29 ++-
 drivers/nvme/target/core.c              |  12 +-
 drivers/nvme/target/io-cmd-bdev.c       |  82 +++++---
 drivers/nvme/target/nvmet.h             |   5 +-
 drivers/nvme/target/rdma.c              |  43 +++-
 drivers/pci/p2pdma.c                    | 202 +++----------------
 include/linux/bio.h                     |  32 ++-
 include/linux/blk_types.h               |  14 +-
 include/linux/blkdev.h                  |  16 +-
 include/linux/bvec.h                    |  43 ++++
 include/linux/memremap.h                |   5 -
 include/linux/mm.h                      |  13 --
 include/linux/pci-p2pdma.h              |  19 --
 include/rdma/rw.h                       |   6 +
 24 files changed, 648 insertions(+), 364 deletions(-)

--
2.20.1

^ permalink raw reply

* [RFC PATCH 04/28] block: Never bounce dma-direct bios
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

It is expected the creator of the dma-direct bio will ensure the
target device can access the DMA address it's creating bios for.
It's also not possible to bounce a dma-direct bio seeing the block
layer doesn't have any way to access the underlying data behind
the DMA address.

Thus, never bounce dma-direct bios.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 block/bounce.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/block/bounce.c b/block/bounce.c
index f8ed677a1bf7..17e020a40cca 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -367,6 +367,14 @@ void blk_queue_bounce(struct request_queue *q, struct bio **bio_orig)
 	if (!bio_has_data(*bio_orig))
 		return;
 
+	/*
+	 * For DMA direct bios, Upper layers are expected to ensure
+	 * the device in question can access the DMA addresses. So
+	 * it never makes sense to bounce a DMA direct bio.
+	 */
+	if (bio_is_dma_direct(*bio_orig))
+		return;
+
 	/*
 	 * for non-isa bounce case, just check if the bounce pfn is equal
 	 * to or bigger than the highest pfn in the system -- in that case,
-- 
2.20.1


^ permalink raw reply related

* [RFC PATCH 15/28] block: Support counting dma-direct bio segments
From: Logan Gunthorpe @ 2019-06-20 16:12 UTC (permalink / raw)
  To: linux-kernel, linux-block, linux-nvme, linux-pci, linux-rdma
  Cc: Jens Axboe, Christoph Hellwig, Bjorn Helgaas, Dan Williams,
	Sagi Grimberg, Keith Busch, Jason Gunthorpe, Stephen Bates,
	Logan Gunthorpe
In-Reply-To: <20190620161240.22738-1-logang@deltatee.com>

Change __blk_recalc_rq_segments() to loop through dma_vecs when
appropriate. It calls vec_split_segs() for each dma_vec or bio_vec.

Once this is done the bvec_split_segs() helper is no longer used.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 block/blk-merge.c | 41 ++++++++++++++++++++++++++++++-----------
 1 file changed, 30 insertions(+), 11 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index c4c016f994f6..a7a5453987f9 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -194,13 +194,6 @@ static bool vec_split_segs(struct request_queue *q, unsigned offset,
 	return !!len;
 }
 
-static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
-		unsigned *nsegs, unsigned *sectors, unsigned max_segs)
-{
-	return vec_split_segs(q, bv->bv_offset, bv->bv_len, nsegs,
-			      sectors, max_segs);
-}
-
 struct blk_segment_split_ctx {
 	unsigned nsegs;
 	unsigned sectors;
@@ -366,12 +359,36 @@ void blk_queue_split(struct request_queue *q, struct bio **bio)
 }
 EXPORT_SYMBOL(blk_queue_split);
 
+static unsigned int bio_calc_segs(struct request_queue *q, struct bio *bio)
+{
+	unsigned int nsegs = 0;
+	struct bvec_iter iter;
+	struct bio_vec bv;
+
+	bio_for_each_bvec(bv, bio, iter)
+		vec_split_segs(q, bv.bv_offset, bv.bv_len, &nsegs,
+			       NULL, UINT_MAX);
+
+	return nsegs;
+}
+
+static unsigned int bio_dma_calc_segs(struct request_queue *q, struct bio *bio)
+{
+	unsigned int nsegs = 0;
+	struct bvec_iter iter;
+	struct dma_vec dv;
+
+	bio_for_each_dvec(dv, bio, iter)
+		vec_split_segs(q, dv.dv_addr, dv.dv_len, &nsegs,
+			       NULL, UINT_MAX);
+
+	return nsegs;
+}
+
 static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 					     struct bio *bio)
 {
 	unsigned int nr_phys_segs = 0;
-	struct bvec_iter iter;
-	struct bio_vec bv;
 
 	if (!bio)
 		return 0;
@@ -386,8 +403,10 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 	}
 
 	for_each_bio(bio) {
-		bio_for_each_bvec(bv, bio, iter)
-			bvec_split_segs(q, &bv, &nr_phys_segs, NULL, UINT_MAX);
+		if (bio_is_dma_direct(bio))
+			nr_phys_segs += bio_calc_segs(q, bio);
+		else
+			nr_phys_segs += bio_dma_calc_segs(q, bio);
 	}
 
 	return nr_phys_segs;
-- 
2.20.1


^ permalink raw reply related

* Re: Audit and fix all misuse of NLA_STRING: STATUS
From: Kees Cook @ 2019-06-20 16:15 UTC (permalink / raw)
  To: Romain Perier; +Cc: Kernel Hardening
In-Reply-To: <CABgxDoLSzkVJ7Vh8mLiZySz6uS+VEu+GUxRqX8EWHKQDyz2fSg@mail.gmail.com>

On Tue, Jun 18, 2019 at 07:56:42PM +0200, Romain Perier wrote:
> Hi !
> 
> Here a first review, you can get the complete list here:
> 
> https://salsa.debian.org/rperier-guest/linux-tree/raw/next/STATUS

Cool! You identified three issues:

net/netfilter/nfnetlink_cthelper.c:
	NF_CT_HELPER_NAME_LEN is used instead of NF_CT_EXP_POLICY_NAME_LEN

net/netfilter/ipset/ip_set_list_set.c:
	IPSET_ATTR_NAME and IPSET_ATTR_NAMEREF both have a len of
	IPSET_MAXNAMELEN for a string of size IPSET_MAXNAMELEN

net/openvswitch/conntrack.c:
	maxlen of NF_CT_HELPER_NAME_LEN with a string of size
	NF_CT_HELPER_NAME_LEN. maxlen of CTNL_TIMEOUT_NAME_MAX with a
	string of size CTNL_TIMEOUT_NAME_MAX

I haven't looked closely at this myself yet, but I think the next step
would be to write patches for each of these. And while doing that, have
an eye toward thinking about how each case could be made more robust in
the future to avoid these kinds of flaws returning.

Nice!

-- 
Kees Cook

^ permalink raw reply

* [PATCH] dmaengine: tegra210-adma: fix transfer failure
From: Sameer Pujar @ 2019-06-20 16:15 UTC (permalink / raw)
  To: vkoul, dan.j.williams
  Cc: thierry.reding, jonathanh, ldewangan, dmaengine, linux-tegra,
	linux-kernel, Sameer Pujar

From Tegra186 onwards OUTSTANDING_REQUESTS field is added in channel
configuration register (bits 7:4). ADMA allows a maximum of 8 reads
to source and that many writes to target memory be outstanding at any
given point of time. If this field is not programmed, DMA transfers
fail to happen.

Thus added 'ch_pending_req' member in chip data structure and the
same is populated with maximum allowed pending requests. Since the
field is not applicable to Tegra210, mentioned bit fields are unused
and hence the member is initialized with 0.

Fixes: 433de642a76c ("dmaengine: tegra210-adma: add support for Tegra186/Tegra194")

Signed-off-by: Sameer Pujar <spujar@nvidia.com>
---
 drivers/dma/tegra210-adma.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/dma/tegra210-adma.c b/drivers/dma/tegra210-adma.c
index 17ea4dd99..8d291cf 100644
--- a/drivers/dma/tegra210-adma.c
+++ b/drivers/dma/tegra210-adma.c
@@ -96,6 +96,7 @@ struct tegra_adma;
  * @ch_req_tx_shift: Register offset for AHUB transmit channel select.
  * @ch_req_rx_shift: Register offset for AHUB receive channel select.
  * @ch_base_offset: Register offset of DMA channel registers.
+ * @ch_pending_req: Outstaning DMA requests for a channel.
  * @ch_fifo_ctrl: Default value for channel FIFO CTRL register.
  * @ch_req_mask: Mask for Tx or Rx channel select.
  * @ch_req_max: Maximum number of Tx or Rx channels available.
@@ -109,6 +110,7 @@ struct tegra_adma_chip_data {
 	unsigned int ch_req_tx_shift;
 	unsigned int ch_req_rx_shift;
 	unsigned int ch_base_offset;
+	unsigned int ch_pending_req;
 	unsigned int ch_fifo_ctrl;
 	unsigned int ch_req_mask;
 	unsigned int ch_req_max;
@@ -613,6 +615,7 @@ static int tegra_adma_set_xfer_params(struct tegra_adma_chan *tdc,
 			 ADMA_CH_CTRL_FLOWCTRL_EN;
 	ch_regs->config |= cdata->adma_get_burst_config(burst_size);
 	ch_regs->config |= ADMA_CH_CONFIG_WEIGHT_FOR_WRR(1);
+	ch_regs->config |= cdata->ch_pending_req;
 	ch_regs->fifo_ctrl = cdata->ch_fifo_ctrl;
 	ch_regs->tc = desc->period_len & ADMA_CH_TC_COUNT_MASK;
 
@@ -797,6 +800,7 @@ static const struct tegra_adma_chip_data tegra210_chip_data = {
 	.ch_req_tx_shift	= 28,
 	.ch_req_rx_shift	= 24,
 	.ch_base_offset		= 0,
+	.ch_pending_req		= 0,
 	.ch_fifo_ctrl		= TEGRA210_FIFO_CTRL_DEFAULT,
 	.ch_req_mask		= 0xf,
 	.ch_req_max		= 10,
@@ -811,6 +815,7 @@ static const struct tegra_adma_chip_data tegra186_chip_data = {
 	.ch_req_tx_shift	= 27,
 	.ch_req_rx_shift	= 22,
 	.ch_base_offset		= 0x10000,
+	.ch_pending_req		= (8 << 4),
 	.ch_fifo_ctrl		= TEGRA186_FIFO_CTRL_DEFAULT,
 	.ch_req_mask		= 0x1f,
 	.ch_req_max		= 20,
-- 
2.7.4

^ permalink raw reply related

* [PATCH] dmaengine: tegra210-adma: fix transfer failure
From: Sameer Pujar @ 2019-06-20 16:15 UTC (permalink / raw)
  To: vkoul, dan.j.williams
  Cc: thierry.reding, jonathanh, ldewangan, dmaengine, linux-tegra,
	linux-kernel, Sameer Pujar

From Tegra186 onwards OUTSTANDING_REQUESTS field is added in channel
configuration register (bits 7:4). ADMA allows a maximum of 8 reads
to source and that many writes to target memory be outstanding at any
given point of time. If this field is not programmed, DMA transfers
fail to happen.

Thus added 'ch_pending_req' member in chip data structure and the
same is populated with maximum allowed pending requests. Since the
field is not applicable to Tegra210, mentioned bit fields are unused
and hence the member is initialized with 0.

Fixes: 433de642a76c ("dmaengine: tegra210-adma: add support for Tegra186/Tegra194")

Signed-off-by: Sameer Pujar <spujar@nvidia.com>
---
 drivers/dma/tegra210-adma.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/dma/tegra210-adma.c b/drivers/dma/tegra210-adma.c
index 17ea4dd99..8d291cf 100644
--- a/drivers/dma/tegra210-adma.c
+++ b/drivers/dma/tegra210-adma.c
@@ -96,6 +96,7 @@ struct tegra_adma;
  * @ch_req_tx_shift: Register offset for AHUB transmit channel select.
  * @ch_req_rx_shift: Register offset for AHUB receive channel select.
  * @ch_base_offset: Register offset of DMA channel registers.
+ * @ch_pending_req: Outstaning DMA requests for a channel.
  * @ch_fifo_ctrl: Default value for channel FIFO CTRL register.
  * @ch_req_mask: Mask for Tx or Rx channel select.
  * @ch_req_max: Maximum number of Tx or Rx channels available.
@@ -109,6 +110,7 @@ struct tegra_adma_chip_data {
 	unsigned int ch_req_tx_shift;
 	unsigned int ch_req_rx_shift;
 	unsigned int ch_base_offset;
+	unsigned int ch_pending_req;
 	unsigned int ch_fifo_ctrl;
 	unsigned int ch_req_mask;
 	unsigned int ch_req_max;
@@ -613,6 +615,7 @@ static int tegra_adma_set_xfer_params(struct tegra_adma_chan *tdc,
 			 ADMA_CH_CTRL_FLOWCTRL_EN;
 	ch_regs->config |= cdata->adma_get_burst_config(burst_size);
 	ch_regs->config |= ADMA_CH_CONFIG_WEIGHT_FOR_WRR(1);
+	ch_regs->config |= cdata->ch_pending_req;
 	ch_regs->fifo_ctrl = cdata->ch_fifo_ctrl;
 	ch_regs->tc = desc->period_len & ADMA_CH_TC_COUNT_MASK;
 
@@ -797,6 +800,7 @@ static const struct tegra_adma_chip_data tegra210_chip_data = {
 	.ch_req_tx_shift	= 28,
 	.ch_req_rx_shift	= 24,
 	.ch_base_offset		= 0,
+	.ch_pending_req		= 0,
 	.ch_fifo_ctrl		= TEGRA210_FIFO_CTRL_DEFAULT,
 	.ch_req_mask		= 0xf,
 	.ch_req_max		= 10,
@@ -811,6 +815,7 @@ static const struct tegra_adma_chip_data tegra186_chip_data = {
 	.ch_req_tx_shift	= 27,
 	.ch_req_rx_shift	= 22,
 	.ch_base_offset		= 0x10000,
+	.ch_pending_req		= (8 << 4),
 	.ch_fifo_ctrl		= TEGRA186_FIFO_CTRL_DEFAULT,
 	.ch_req_mask		= 0x1f,
 	.ch_req_max		= 20,
-- 
2.7.4


^ permalink raw reply related

* Re: [PATCH 3/3] nl80211: Include wiphy address setup in NEW_WIPHY
From: Denis Kenzior @ 2019-06-20 16:16 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-wireless
In-Reply-To: <ec1ca95a5789f9001e89e603633b20316d687721.camel@sipsolutions.net>

Hi Johannes,

On 06/20/2019 01:58 AM, Johannes Berg wrote:
> Didn't really review all of this yet, but
> 
>   	switch (state->split_start) {
>>   	case 0:
>> +		if (nla_put(msg, NL80211_ATTR_MAC, ETH_ALEN,
>> +			    rdev->wiphy.perm_addr))
>> +			goto nla_put_failure;
> 
> We generally can't add anything to any of the cases before the split was
> allowed, for compatibility with old userspace.

Can you educate me here? Is it because the non-split dump messages would 
grow too large?  But then non-dumps aren't split, so I still don't get 
how anyone can be broken by this (that isn't already broken in the first 
place).

Anyhow, What is the cut off point?  It didn't seem worthwhile to send 
yet-another-message for ~60 bytes of data, but if you want me to add it 
as a separate message, no problem.

Regards,
-Denis

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.