From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Jerome Glisse <jglisse@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
Ingo Molnar <mingo@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>, Ingo Molnar <mingo@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Logan Gunthorpe <logang@deltatee.com>,
Kirill Shutemov <kirill.shutemov@linux.intel.com>
Subject: Re: [PATCH v2] mm, zone_device: replace {get, put}_zone_device_page() with a single reference
Date: Tue, 2 May 2017 14:37:46 +0300
Message-ID: <20170502113746.5ybuix3lnvlk7kxt@node.shutemov.name>
In-Reply-To: <20170501135545.GA16772@redhat.com>

On Mon, May 01, 2017 at 09:55:48AM -0400, Jerome Glisse wrote:
> On Mon, May 01, 2017 at 01:23:59PM +0300, Kirill A. Shutemov wrote:
> > On Sun, Apr 30, 2017 at 07:14:24PM -0400, Jerome Glisse wrote:
> > > On Sat, Apr 29, 2017 at 01:17:26PM +0300, Kirill A. Shutemov wrote:
> > > > On Fri, Apr 28, 2017 at 03:33:07PM -0400, Jerome Glisse wrote:
> > > > > On Fri, Apr 28, 2017 at 12:22:24PM -0700, Dan Williams wrote:
> > > > > > Are you sure about needing to hook the 2 -> 1 transition? Could we
> > > > > > change ZONE_DEVICE pages to not have an elevated reference count when
> > > > > > they are created so you can keep the HMM references out of the mm hot
> > > > > > path?
> > > > >
> > > > > 100% sure on that :) I need to callback into driver for 2->1 transition
> > > > > no way around that. If we change ZONE_DEVICE to not have an elevated
> > > > > reference count that you need to make a lot more change to mm so that
> > > > > ZONE_DEVICE is never use as fallback for memory allocation. Also need
> > > > > to make change to be sure that ZONE_DEVICE page never endup in one of
> > > > > the path that try to put them back on lru. There is a lot of place that
> > > > > would need to be updated and it would be highly intrusive and add a
> > > > > lot of special cases to other hot code path.
> > > >
> > > > Could you explain more on where the requirement comes from or point me to
> > > > where I can read about this.
> > > >
> > >
> > > HMM ZONE_DEVICE pages are use like other pages (anonymous or file back page)
> > > in _any_ vma. So i need to know when a page is freed ie either as result of
> > > unmap, exit or migration or anything that would free the memory. For zone
> > > device a page is free once its refcount reach 1 so i need to catch refcount
> > > transition from 2->1
> >
> > What if we would rework zone device to have pages with refcount 0 at
> > start?
>
> That is a _lot_ of work from top of my head because it would need changes
> to a lot of places and likely more hot code path that simply adding some-
> thing to put_page() note that i only need something in put_page() i do not
> need anything in the get page path. Is adding a conditional branch for
> HMM pages in put_page() that much of a problem ?

Well, it gets inlined everywhere. Removing zone_device code from
get_page() and put_page() saved a non-trivial ~140k in vmlinux for
allyesconfig.

Re-introducing part of this bloat would be unfortunate.

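
For readers following the thread, the branch under discussion can be
modelled in userspace roughly as below. This is only a sketch under
stated assumptions, not the kernel code or the HMM patch: struct page
here is a toy type, and model_get_page(), model_put_page(), the
page_free hook and count_free() are hypothetical names made up for
illustration. It shows a device page whose base refcount is 1, so the
2 -> 1 transition is the point where a driver callback would fire.

```c
/* Userspace model only: names mirror the discussion, not the kernel's API. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct page;

/* Hypothetical per-device hook, invoked when a device page becomes idle. */
typedef void (*page_free_fn)(struct page *page, void *data);

struct page {
    atomic_int refcount;      /* device pages start at 1, i.e. "free" */
    bool is_device;           /* stands in for a zone-device check */
    page_free_fn page_free;   /* driver callback, may be NULL */
    void *data;               /* opaque driver cookie */
};

static void model_get_page(struct page *page)
{
    /* No device-specific branch is needed on the get side. */
    atomic_fetch_add(&page->refcount, 1);
}

/* Returns true when this put made the page free. */
static bool model_put_page(struct page *page)
{
    int old = atomic_fetch_sub(&page->refcount, 1);

    if (page->is_device) {
        /* The extra branch under debate: 2 -> 1 frees a device page. */
        if (old == 2) {
            if (page->page_free)
                page->page_free(page, page->data);
            return true;
        }
        return false;
    }
    /* Regular page: 1 -> 0 is the free transition. */
    return old == 1;
}

/* Example driver callback: counts how often it was invoked. */
static void count_free(struct page *page, void *data)
{
    (void)page;
    ++*(int *)data;
}

/* Small driver: returns how many times the device callback fired. */
static int demo(void)
{
    int freed = 0;
    struct page dev = { .is_device = true, .page_free = count_free,
                        .data = &freed };

    atomic_init(&dev.refcount, 1);   /* elevated base reference */
    model_get_page(&dev);            /* mapped: 1 -> 2 */
    model_get_page(&dev);            /* transient reference: 2 -> 3 */
    assert(!model_put_page(&dev));   /* 3 -> 2, still in use */
    assert(model_put_page(&dev));    /* 2 -> 1, callback fires */
    return freed;
}
```

In this model the device-specific test sits only in the put path, which
matches the claim that nothing is needed in get_page(). The cost being
argued about is that the real put_page() is inlined at every call site,
so even one extra branch is duplicated throughout the kernel image.
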
> > > This is the only way i can inform the device that the page is now free. See
> > >
> > > https://cgit.freedesktop.org/~glisse/linux/commit/?h=hmm-v21&id=52da8fe1a088b87b5321319add79e43b8372ed7d
> > >
> > > There is _no_ way around that.
> >
> > I'm still not convinced that it's impossible.
> >
> > Could you describe lifecycle for pages in case of HMM?
>
> Process malloc something, end it over to some function in the program
> that use the GPU that function call GPU API (OpenCL, CUDA, ...) that
> trigger a migration to device memory.
>
> So in the kernel you get a migration like any existing migration,
> original page is unmap, if refcount is all ok (no pin) then a device
> page is allocated and thing are migrated to device memory.
>
> What happen after is unknown. Either userspace/kernel driver decide
> to migrate back to system memory, either there is an munmap, either
> there is a CPU page fault, ... So from that point on the device page
> as the exact same life as a regular page.
>
> Above i describe the migrate case, but you can also have new memory
> allocation that directly allocate device memory. For instance if the
> GPU do a page fault on an address that isn't back by anything then
> we can directly allocate a device page. No migration involve in that
> case.
>
> HMM pages are like any other pages in most respect. Exception are:
> - no GUP

Hm. How do you exclude GUP? And why is it required?

--
Kirill A. Shutemov