All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	linux-rdma@vger.kernel.org, iommu@lists.linux-foundation.org,
	Logan Gunthorpe <logang@deltatee.com>,
	linux-pci@vger.kernel.org
Subject: Re: [PATCH 2/5] RDMA/core: remove use of dma_virt_ops
Date: Wed, 4 Nov 2020 11:52:55 -0400	[thread overview]
Message-ID: <20201104155255.GR36674@ziepe.ca> (raw)
In-Reply-To: <20201104140108.GA5674@lst.de>

On Wed, Nov 04, 2020 at 03:01:08PM +0100, Christoph Hellwig wrote:

> > Sigh. I think the proper fix is to replace addr/length with a
> > scatterlist pointer in the struct ib_sge, then have SW drivers
> > directly use the page pointer properly.
> 
> The proper fix is to move the DMA mapping into the RDMA core, yes.
> And as you said it will be hard.  But I don't think scatterlists
> are the right interface.  IMHO we can keep re-use the existing
> struct ib_sge:
> 
> struct ib_ge {
> 	u64     addr;
> 	u32     length;
> 	u32     lkey;
> };

Gah, right, this is all about local_dma_lkey..
 
> with the difference that if lkey is not a MR, addr is the physical
> address of the memory, not a dma_addr_t or virtual address.

It could work, I think a resonable ULP API would be to have some

 rdma_fill_ib_sge_from_sgl()
 rdma_map_sge_single()
 etc etc

ie instead of wrappering the DMA API as-is we have a new API that
directly builds the ib_sge. It always fills the local_dma_lkey from
the pd, so it knows it is doing DMA from local kernel memory.

Logically SW devices then have a local_dma_lkey MR that has an IOVA of
the CPU physical address space, not the DMA address space as HW
devices have. The ib_sge builders can know this detail and fill in
addr from either a cpu phyical or a dma map.

The SW device has to translate the addr/length in CPU space to
something else. It actually makes reasonable sense architecturally.

This is actually much less horrible than I thought..

Convert all ULPs to one of these new APIs, searching for
local_dma_lkey will find all places. This will replace a whole lot of
calls to ib DMA API wrapper functions. Searching for local_dma_lkey
will find all users. Drivers already work with sge.addr == CPU
address, so no driver change

Then to kill the dma_ops wrappers the remaining users should all be
connected to map_mr_sg. In this case we want a no-op dma map and fix
the three map_mr_sg's to use the page side of the sgl, not the DMA
side

Not as horrible as I imagined at first, actually..

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

WARNING: multiple messages have this Message-ID (diff)
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	Logan Gunthorpe <logang@deltatee.com>,
	linux-rdma@vger.kernel.org, linux-pci@vger.kernel.org,
	iommu@lists.linux-foundation.org
Subject: Re: [PATCH 2/5] RDMA/core: remove use of dma_virt_ops
Date: Wed, 4 Nov 2020 11:52:55 -0400	[thread overview]
Message-ID: <20201104155255.GR36674@ziepe.ca> (raw)
In-Reply-To: <20201104140108.GA5674@lst.de>

On Wed, Nov 04, 2020 at 03:01:08PM +0100, Christoph Hellwig wrote:

> > Sigh. I think the proper fix is to replace addr/length with a
> > scatterlist pointer in the struct ib_sge, then have SW drivers
> > directly use the page pointer properly.
> 
> The proper fix is to move the DMA mapping into the RDMA core, yes.
> And as you said it will be hard.  But I don't think scatterlists
> are the right interface.  IMHO we can keep re-use the existing
> struct ib_sge:
> 
> struct ib_ge {
> 	u64     addr;
> 	u32     length;
> 	u32     lkey;
> };

Gah, right, this is all about local_dma_lkey..
 
> with the difference that if lkey is not a MR, addr is the physical
> address of the memory, not a dma_addr_t or virtual address.

It could work, I think a resonable ULP API would be to have some

 rdma_fill_ib_sge_from_sgl()
 rdma_map_sge_single()
 etc etc

ie instead of wrappering the DMA API as-is we have a new API that
directly builds the ib_sge. It always fills the local_dma_lkey from
the pd, so it knows it is doing DMA from local kernel memory.

Logically SW devices then have a local_dma_lkey MR that has an IOVA of
the CPU physical address space, not the DMA address space as HW
devices have. The ib_sge builders can know this detail and fill in
addr from either a cpu phyical or a dma map.

The SW device has to translate the addr/length in CPU space to
something else. It actually makes reasonable sense architecturally.

This is actually much less horrible than I thought..

Convert all ULPs to one of these new APIs, searching for
local_dma_lkey will find all places. This will replace a whole lot of
calls to ib DMA API wrapper functions. Searching for local_dma_lkey
will find all users. Drivers already work with sge.addr == CPU
address, so no driver change

Then to kill the dma_ops wrappers the remaining users should all be
connected to map_mr_sg. In this case we want a no-op dma map and fix
the three map_mr_sg's to use the page side of the sgl, not the DMA
side

Not as horrible as I imagined at first, actually..

Jason

  parent reply	other threads:[~2020-11-04 15:53 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-04  9:50 remove dma_virt_ops Christoph Hellwig
2020-11-04  9:50 ` Christoph Hellwig
2020-11-04  9:50 ` [PATCH 1/5] RDMA/core: remove ib_dma_{alloc,free}_coherent Christoph Hellwig
2020-11-04  9:50   ` Christoph Hellwig
2020-11-04  9:50 ` [PATCH 2/5] RDMA/core: remove use of dma_virt_ops Christoph Hellwig
2020-11-04  9:50   ` Christoph Hellwig
2020-11-04 13:42   ` Jason Gunthorpe
2020-11-04 13:42     ` Jason Gunthorpe
2020-11-04 14:01     ` Christoph Hellwig
2020-11-04 14:01       ` Christoph Hellwig
2020-11-04 15:09       ` Bernard Metzler
2020-11-04 15:09         ` Bernard Metzler
2020-11-04 18:10         ` Jason Gunthorpe
2020-11-04 18:10           ` Jason Gunthorpe
2020-11-04 15:52       ` Jason Gunthorpe [this message]
2020-11-04 15:52         ` Jason Gunthorpe
2020-11-04 16:31         ` Christoph Hellwig
2020-11-04 16:31           ` Christoph Hellwig
2020-11-04 18:09           ` Jason Gunthorpe
2020-11-04 18:09             ` Jason Gunthorpe
2020-11-04  9:50 ` [PATCH 3/5] PCI/p2p: remove the DMA_VIRT_OPS hacks Christoph Hellwig
2020-11-04  9:50   ` Christoph Hellwig
2020-11-04 16:53   ` Bjorn Helgaas
2020-11-04 16:53     ` Bjorn Helgaas
2020-11-04 17:00   ` Logan Gunthorpe
2020-11-04 17:00     ` Logan Gunthorpe
2020-11-04  9:50 ` [PATCH 4/5] PCI/p2p: cleanup up __pci_p2pdma_map_sg a bit Christoph Hellwig
2020-11-04  9:50   ` Christoph Hellwig
2020-11-04 16:54   ` Bjorn Helgaas
2020-11-04 16:54     ` Bjorn Helgaas
2020-11-04  9:50 ` [PATCH 5/5] dma-mapping: remove dma_virt_ops Christoph Hellwig
2020-11-04  9:50   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201104155255.GR36674@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=bhelgaas@google.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux-foundation.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=logang@deltatee.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.