From: "Song Bao Hua (Barry Song)" <song.bao.hua@hisilicon.com>
To: Jason Gunthorpe <jgg@ziepe.ca>, "Wangzhou (B)" <wangzhou1@hisilicon.com>
Cc: "chensihang (A)" <chensihang1@hisilicon.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	Zhangfei Gao <zhangfei.gao@linaro.org>,
	"Liguozhu (Kenneth)" <liguozhu@hisilicon.com>,
	"linux-accelerators@lists.ozlabs.org"
	<linux-accelerators@lists.ozlabs.org>
Subject: RE: [RFC PATCH v2] uacce: Add uacce_ctrl misc device
Date: Mon, 25 Jan 2021 22:21:14 +0000
Message-ID: <96b655ade2534a65974a378bb68383ee@hisilicon.com>
In-Reply-To: <20210125154717.GW4605@ziepe.ca>



> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgg@ziepe.ca]
> Sent: Tuesday, January 26, 2021 4:47 AM
> To: Wangzhou (B) <wangzhou1@hisilicon.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>; Arnd Bergmann
> <arnd@arndb.de>; Zhangfei Gao <zhangfei.gao@linaro.org>;
> linux-accelerators@lists.ozlabs.org; linux-kernel@vger.kernel.org;
> iommu@lists.linux-foundation.org; linux-mm@kvack.org; Song Bao Hua (Barry Song)
> <song.bao.hua@hisilicon.com>; Liguozhu (Kenneth) <liguozhu@hisilicon.com>;
> chensihang (A) <chensihang1@hisilicon.com>
> Subject: Re: [RFC PATCH v2] uacce: Add uacce_ctrl misc device
> 
> On Mon, Jan 25, 2021 at 04:34:56PM +0800, Zhou Wang wrote:
> 
> > +static int uacce_pin_page(struct uacce_pin_container *priv,
> > +			  struct uacce_pin_address *addr)
> > +{
> > +	unsigned int flags = FOLL_FORCE | FOLL_WRITE;
> > +	unsigned long first, last, nr_pages;
> > +	struct page **pages;
> > +	struct pin_pages *p;
> > +	int ret;
> > +
> > +	first = (addr->addr & PAGE_MASK) >> PAGE_SHIFT;
> > +	last = ((addr->addr + addr->size - 1) & PAGE_MASK) >> PAGE_SHIFT;
> > +	nr_pages = last - first + 1;
> > +
> > +	pages = vmalloc(nr_pages * sizeof(struct page *));
> > +	if (!pages)
> > +		return -ENOMEM;
> > +
> > +	p = kzalloc(sizeof(*p), GFP_KERNEL);
> > +	if (!p) {
> > +		ret = -ENOMEM;
> > +		goto free;
> > +	}
> > +
> > +	ret = pin_user_pages_fast(addr->addr & PAGE_MASK, nr_pages,
> > +				  flags | FOLL_LONGTERM, pages);
> 
> This needs to copy the RLIMIT_MEMLOCK and can_do_mlock() stuff from
> other places, like ib_umem_get
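
As a rough illustration of the accounting being asked for here:
ib_umem_get compares the number of pages it is about to pin against
RLIMIT_MEMLOCK before pinning. Below is a minimal user-space sketch of
that kind of check, not the kernel code itself. span_pages() repeats the
first/last arithmetic from the patch; within_memlock_limit() and its
already_pinned argument are hypothetical names used only for this
example, and the CAP_IPC_LOCK bypass is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Number of pages touched by [addr, addr + size), for size > 0.
 * Same first/last arithmetic as uacce_pin_page() in the patch.
 */
static unsigned long span_pages(uintptr_t addr, size_t size,
				unsigned long page_size)
{
	uintptr_t first = addr / page_size;
	uintptr_t last = (addr + size - 1) / page_size;

	return (unsigned long)(last - first + 1);
}

/*
 * Hypothetical helper: returns 1 if pinning nr_pages on top of
 * already_pinned pages stays within RLIMIT_MEMLOCK, 0 if it would
 * exceed the limit, and -1 if the limit cannot be read.
 */
static int within_memlock_limit(unsigned long already_pinned,
				unsigned long nr_pages)
{
	struct rlimit rl;
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned long total = already_pinned + nr_pages;

	if (page_size <= 0 || getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return -1;
	if (total < already_pinned)	/* unsigned overflow */
		return 0;
	if (rl.rlim_cur == RLIM_INFINITY)
		return 1;

	/* The limit is in bytes; compare in units of pages. */
	return total <= rl.rlim_cur / (rlim_t)page_size;
}
```

In the driver, the equivalent check would run before
pin_user_pages_fast(), and the pinned-page count would be dropped again
on unpin.
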
> 
> > +	ret = xa_err(xa_store(&priv->array, p->first, p, GFP_KERNEL));
> 
> And this is really weird, I don't think it makes sense to make handles
> for DMA based on the starting VA.
> 
> > +static int uacce_unpin_page(struct uacce_pin_container *priv,
> > +			    struct uacce_pin_address *addr)
> > +{
> > +	unsigned long first, last, nr_pages;
> > +	struct pin_pages *p;
> > +
> > +	first = (addr->addr & PAGE_MASK) >> PAGE_SHIFT;
> > +	last = ((addr->addr + addr->size - 1) & PAGE_MASK) >> PAGE_SHIFT;
> > +	nr_pages = last - first + 1;
> > +
> > +	/* find pin_pages */
> > +	p = xa_load(&priv->array, first);
> > +	if (!p)
> > +		return -ENODEV;
> > +
> > +	if (p->nr_pages != nr_pages)
> > +		return -EINVAL;
> > +
> > +	/* unpin */
> > +	unpin_user_pages(p->pages, p->nr_pages);
> 
> And unpinning without guaranteeing there is no ongoing DMA is really
> weird

In the SVA case, the kernel has no idea whether accelerators are
accessing the memory, so I would assume SVA has a way to prevent
the pages from being migrated or released. Otherwise, SVA would
crash easily on a system under high memory pressure.

Anyway, this is a problem worth investigating further.

> 
> Are you abusing this in conjunction with a SVA scheme just to prevent
> page motion? Why wasn't mlock good enough?

Page migration won't cause any malfunction in the SVA case, as an IO
page fault will get a valid page again. It is only a performance issue:
an IO page fault has a larger latency than a usual page fault, and
would be 3-80x slower than a page fault [1].

mlock, while certainly able to prevent pages from being swapped out,
won't be able to stop pages from moving due to:
* memory compaction in alloc_pages()
* making huge pages
* NUMA balancing
* memory compaction in CMA
etc.
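
To make the distinction concrete, here is a minimal user-space sketch of
what mlock() does give you: the page is faulted in and stays resident,
which mincore() can confirm. It says nothing about which physical page
backs the address, so any of the cases listed above can still move it.
The helper name mlock_one_page_resident() is made up for this example.

```c
#define _DEFAULT_SOURCE		/* for mincore() and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one anonymous page, mlock() it, and ask mincore() whether it is
 * resident. Returns 1 if resident, 0 if not, and -1 on any failure
 * (for example, mlock() refused because of RLIMIT_MEMLOCK).
 */
static int mlock_one_page_resident(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char vec = 0;
	unsigned char *buf;
	int ret = -1;

	if (page_size <= 0)
		return -1;

	buf = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	if (mlock(buf, (size_t)page_size) == 0) {
		buf[0] = 1;	/* touch the locked page */
		/* Bit 0 of vec reports residency only, not physical location. */
		if (mincore(buf, (size_t)page_size, &vec) == 0)
			ret = vec & 1;
		munlock(buf, (size_t)page_size);
	}

	munmap(buf, (size_t)page_size);
	return ret;
}
```
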

[1] https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7482091&tag=1
> 
> Jason

Thanks
Barry


