From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Kenneth Lee" <liguozhu@hisilicon.com>,
"Leon Romanovsky" <leon@kernel.org>,
"Kenneth Lee" <nek.in.cn@gmail.com>,
"Tim Sell" <timothy.sell@unisys.com>,
linux-doc@vger.kernel.org,
"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
"Zaibo Xu" <xuzaibo@huawei.com>,
zhangfei.gao@foxmail.com, linuxarm@huawei.com,
haojian.zhuang@linaro.org, "Christoph Lameter" <cl@linux.com>,
"Hao Fang" <fanghao11@huawei.com>,
"Gavin Schenk" <g.schenk@eckelmann.de>,
"RDMA mailing list" <linux-rdma@vger.kernel.org>,
"Zhou Wang" <wangzhou1@hisilicon.com>,
"Doug Ledford" <dledford@redhat.com>,
"Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>,
"David Kershner" <david.kershner@unisys.com>,
"Johan Hovold" <johan@kernel.org>,
"Cyrille Pitchen" <cyrille.pitchen@free-electrons.com>,
"Sagar Dharia" <sdharia@codeaurora.org>,
"Jens Axboe" <axboe@kernel.dk>,
guodong.xu@linaro.org, linux-netdev <netdev@vger.kernel.org>,
"Randy Dunlap" <rdunlap@infradead.org>,
linux-kernel@vger.kernel.org, "Vinod Koul" <vkoul@kernel.org>,
linux-crypto@vger.kernel.org,
"Philippe Ombredanne" <pombredanne@nexb.com>,
"Sanyog Kale" <sanyog.r.kale@intel.com>,
"David S. Miller" <davem@davemloft.net>,
linux-accelerators@lists.ozlabs.org,
"Jean-Philippe Brucker" <jean-philippe.brucker@arm.com>,
iommu@lists.linux-foundation.org
Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
Date: Tue, 20 Nov 2018 09:16:50 +0000
Message-ID: <20181120091650.0000419a@huawei.com>
In-Reply-To: <20181120032939.GR4890@ziepe.ca>
+CC Jean-Philippe and the iommu list.
On Mon, 19 Nov 2018 20:29:39 -0700
Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Tue, Nov 20, 2018 at 11:07:02AM +0800, Kenneth Lee wrote:
> > On Mon, Nov 19, 2018 at 11:49:54AM -0700, Jason Gunthorpe wrote:
> > > Date: Mon, 19 Nov 2018 11:49:54 -0700
> > > From: Jason Gunthorpe <jgg@ziepe.ca>
> > > To: Kenneth Lee <liguozhu@hisilicon.com>
> > > CC: Leon Romanovsky <leon@kernel.org>, Kenneth Lee <nek.in.cn@gmail.com>,
> > > Tim Sell <timothy.sell@unisys.com>, linux-doc@vger.kernel.org, Alexander
> > > Shishkin <alexander.shishkin@linux.intel.com>, Zaibo Xu
> > > <xuzaibo@huawei.com>, zhangfei.gao@foxmail.com, linuxarm@huawei.com,
> > > haojian.zhuang@linaro.org, Christoph Lameter <cl@linux.com>, Hao Fang
> > > <fanghao11@huawei.com>, Gavin Schenk <g.schenk@eckelmann.de>, RDMA mailing
> > > list <linux-rdma@vger.kernel.org>, Zhou Wang <wangzhou1@hisilicon.com>,
> > > Doug Ledford <dledford@redhat.com>, Uwe Kleine-König
> > > <u.kleine-koenig@pengutronix.de>, David Kershner
> > > <david.kershner@unisys.com>, Johan Hovold <johan@kernel.org>, Cyrille
> > > Pitchen <cyrille.pitchen@free-electrons.com>, Sagar Dharia
> > > <sdharia@codeaurora.org>, Jens Axboe <axboe@kernel.dk>,
> > > guodong.xu@linaro.org, linux-netdev <netdev@vger.kernel.org>, Randy Dunlap
> > > <rdunlap@infradead.org>, linux-kernel@vger.kernel.org, Vinod Koul
> > > <vkoul@kernel.org>, linux-crypto@vger.kernel.org, Philippe Ombredanne
> > > <pombredanne@nexb.com>, Sanyog Kale <sanyog.r.kale@intel.com>, "David S.
> > > Miller" <davem@davemloft.net>, linux-accelerators@lists.ozlabs.org
> > > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > > User-Agent: Mutt/1.9.4 (2018-02-28)
> > > Message-ID: <20181119184954.GB4890@ziepe.ca>
> > >
> > > On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> > >
> > > > If the hardware cannot share the page table with the CPU, we then need to have
> > > > some way to change the device page table. This is what happens in ODP. It
> > > > invalidates the page table in the device upon the mmu_notifier callback. But this cannot
> > > > solve the COW problem: if user process A shares a page P with the device, and A
> > > > forks a new process B, and A continues to write to the page, then by COW
> > > > process B will keep page P, while A will get a new page P'. But you have
> > > > no way to let the device know it should use P' rather than P.
> > >
> > > Is this true? I thought mmu_notifiers covered all these cases.
> > >
> > > The mmu_notifier for A should fire if B causes the physical address of
> > > A's pages to change via COW.
> > >
> > > And this causes the device page tables to re-synchronize.
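
To make the re-synchronization Jason describes concrete, here is a toy
user-space model. Every name in it is made up for illustration and none of
it is a kernel API. It assumes the callback does fire on a COW break, which
is exactly the point under discussion.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 8

/* Toy stand-in for the CPU page table of process A (not a kernel type). */
struct toy_cpu_pgtable {
	unsigned long pfn[TOY_NR_PAGES];
};

/* Toy stand-in for a device-side translation table (not a kernel type). */
struct toy_dev_pgtable {
	unsigned long pfn[TOY_NR_PAGES];
	bool valid[TOY_NR_PAGES];
};

/* What a notifier-style callback would do: drop stale entries in [start, end). */
static void toy_invalidate_range(struct toy_dev_pgtable *dev,
				 size_t start, size_t end)
{
	for (size_t i = start; i < end && i < TOY_NR_PAGES; i++)
		dev->valid[i] = false;
}

/* Device access: on a miss, refill the entry from the CPU table. */
static unsigned long toy_dev_translate(struct toy_dev_pgtable *dev,
				       const struct toy_cpu_pgtable *cpu,
				       size_t idx)
{
	if (!dev->valid[idx]) {
		dev->pfn[idx] = cpu->pfn[idx];
		dev->valid[idx] = true;
	}
	return dev->pfn[idx];
}

/* COW break in A: invalidate the device entry, then move the CPU mapping. */
static void toy_cow_break(struct toy_cpu_pgtable *cpu,
			  struct toy_dev_pgtable *dev,
			  size_t idx, unsigned long new_pfn)
{
	toy_invalidate_range(dev, idx, idx + 1);
	cpu->pfn[idx] = new_pfn;
}
```

In this model the next device access after toy_cow_break() misses and picks
up the new frame. Without the invalidate call the device would keep using
the old one, which is the failure Kenneth is worried about.
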
> >
> > I don't see such code. The current do_cow_fault() implementation has nothing to
> > do with mmu_notifier.
>
> Well, that sure sounds like it would be a bug in mmu_notifiers..
>
> But considering Jean's SVA stuff seems based on mmu notifiers, I have
> a hard time believing that it has any different behavior from RDMA's
> ODP, and if it does have different behavior, then it is probably just
> a bug in the ODP implementation.
>
> > > > In WarpDrive/uacce, we make this simple. If you support IOMMU and it supports
> > > > SVM/SVA, everything will be fine, just like ODP implicit mode. And you don't need
> > > > to write any code for that, because it has been done by the IOMMU framework. If it
> > >
> > > Looks like the IOMMU code uses mmu_notifier, so it is identical to
> > > IB's ODP. The only difference is that IB tends to have the IOMMU page
> > > table in the device, not in the CPU.
> > >
> > > The only case I know of that is different is the new-fangled CAPI
> > > stuff where the IOMMU can directly use the CPU's page table and the
> > > IOMMU page table (in device or CPU) is eliminated.
> >
> > Yes. We are not focusing on the current implementation. As mentioned in the
> > cover letter, we are expecting Jean-Philippe's SVA patches:
> > git://linux-arm.org/linux-jpb.
>
> This SVA stuff does not look comparable to CAPI as it still requires
> maintaining separate IOMMU page tables.
>
> Also, those patches from Jean have a lot of references to
> mmu_notifiers (ie look at iommu_mmu_notifier).
>
> Are you really sure it is actually any different at all?
>
> > > Anyhow, I don't think a single instance of hardware should justify an
> > > entire new subsystem. Subsystems are hard to make and without multiple
> > > hardware examples there is no way to expect that it would cover any
> > > future use cases.
> >
> > Yes. That's our first expectation. We can keep it with our driver. But
> > there is no user driver support for any accelerator in the mainline kernel; even the
> > well-known QuickAssist has to be maintained out of tree. So we are trying to see if
> > people are interested in working together to solve the problem.
>
> Well, you should come with patches ack'ed by these other groups.
>
> > > If all your driver needs is to mmap some PCI bar space, route
> > > interrupts and do DMA mapping then mediated VFIO is probably a good
> > > choice.
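
For reference, the "mmap some PCI bar space" part looks roughly like this
from user space. It is a sketch only, assuming a vfio-pci style device
whose fd has already been obtained through the usual container and group
setup (omitted here). map_bar0() is a hypothetical helper name; the ioctl,
structure and flags are the ones in <linux/vfio.h>.

```c
#include <linux/vfio.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Hypothetical helper: map BAR0 of an already-opened VFIO device fd.
 * Returns the mapping and stores its length in *len, or NULL on failure.
 */
static void *map_bar0(int device_fd, size_t *len)
{
	struct vfio_region_info info = {
		.argsz = sizeof(info),
		.index = VFIO_PCI_BAR0_REGION_INDEX,
	};
	void *bar;

	/* Ask VFIO where the region lives within the device fd. */
	if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &info) < 0)
		return NULL;

	/* Not every region can be mapped; some are read/write only. */
	if (!(info.flags & VFIO_REGION_INFO_FLAG_MMAP) || info.size == 0)
		return NULL;

	bar = mmap(NULL, info.size, PROT_READ | PROT_WRITE, MAP_SHARED,
		   device_fd, (off_t)info.offset);
	if (bar == MAP_FAILED)
		return NULL;

	*len = info.size;
	return bar;
}
```

Interrupt routing and DMA mapping go through separate ioctls and are not
shown.
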
> >
> > Yes. That is what was done in our RFCv1/v2. But we accepted Jerome's opinion and
> > tried not to add complexity to the mm subsystem.
>
> Why would a mediated VFIO driver touch the mm subsystem? Sounds like
> you don't have a VFIO driver if it needs to do stuff like that...
>
> > > If it needs to do a bunch of other stuff, not related to PCI bar
> > > space, interrupts and DMA mapping (ie special code for compression,
> > > crypto, AI, whatever) then you should probably do what Jerome said and
> > > make a drivers/char/hisilicon_foo_bar.c that exposes just what your
> > > hardware does.
> >
> > Yes. If no other accelerator driver writer is interested, that is the
> > expectation :)
>
> I don't think it matters what other drivers do.
>
> If your driver does not need any other kernel code then VFIO is
> sensible. In this kind of world you will probably have a RDMA-like
> userspace driver that can bring this to a common user space API, even
> if one driver uses VFIO and a different driver uses something else.
>
> > You create some connections (queues) to the NIC, RSA, and AI engines. Then you get
> > data directly from the NIC and pass the pointer to the RSA engine for decryption. The
> > CPU then finishes some data handling or other operation and passes the result to the AI
> > engine for CNN calculation... This will need a place to maintain the same
> > address space by some means.
>
> How is this any different from what we have today?
>
> SVA is not something even remotely new, IB has been doing various
> versions of it for 20 years.
>
> Jason