Re: [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jerome Glisse <jglisse@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Bob Liu <liubo95@huawei.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, John Hubbard <jhubbard@nvidia.com>,
	David Nellans <dnellans@nvidia.com>,
	Balbir Singh <bsingharora@gmail.com>,
	Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5
Date: Fri, 21 Jul 2017 11:22:04 -0400	[thread overview]
Message-ID: <20170721152203.GB3202@redhat.com> (raw)
In-Reply-To: <CAPcyv4jJraGPW214xJ+wU3G=88UUP45YiA6hV5_NvNZSNB4qGA@mail.gmail.com>

On Thu, Jul 20, 2017 at 08:48:20PM -0700, Dan Williams wrote:
> On Thu, Jul 20, 2017 at 6:41 PM, Jerome Glisse <jglisse@redhat.com> wrote:
> > On Fri, Jul 21, 2017 at 09:15:29AM +0800, Bob Liu wrote:
> >> On 2017/7/20 23:03, Jerome Glisse wrote:
> >> > On Wed, Jul 19, 2017 at 05:09:04PM +0800, Bob Liu wrote:
> >> >> On 2017/7/19 10:25, Jerome Glisse wrote:
> >> >>> On Wed, Jul 19, 2017 at 09:46:10AM +0800, Bob Liu wrote:
> >> >>>> On 2017/7/18 23:38, Jerome Glisse wrote:
> >> >>>>> On Tue, Jul 18, 2017 at 11:26:51AM +0800, Bob Liu wrote:
> >> >>>>>> On 2017/7/14 5:15, Jerome Glisse wrote:
> >
> > [...]
> >
> >> >> Then it's more like replace the numa node solution(CDM) with ZONE_DEVICE
> >> >> (type MEMORY_DEVICE_PUBLIC). But the problem is the same, e.g how to make
> >> >> sure the device memory say HBM won't be occupied by normal CPU allocation.
> >> >> Things will be more complex if there are multi GPU connected by nvlink
> >> >> (also cache coherent) in a system, each GPU has their own HBM.
> >> >>
> >> >> How to decide allocate physical memory from local HBM/DDR or remote HBM/
> >> >> DDR?
> >> >>
> >> >> If using numa(CDM) approach there are NUMA mempolicy and autonuma mechanism
> >> >> at least.
> >> >
> >> > NUMA is not as easy as you think. First like i said we want the device
> >> > memory to be isolated from most existing mm mechanism. Because memory
> >> > is unreliable and also because device might need to be able to evict
> >> > memory to make contiguous physical memory allocation for graphics.
> >> >
> >>
> >> Right, but we need isolation any way.
> >> For hmm-cdm, the isolation is not adding device memory to lru list, and many
> >> if (is_device_public_page(page)) ...
> >>
> >> But how to evict device memory?
> >
> > What you mean by evict ? Device driver can evict whenever they see the need
> > to do so. CPU page fault will evict too. Process exit or munmap() will free
> > the device memory.
> >
> > Are you refering to evict in the sense of memory reclaim under pressure ?
> >
> > So the way it flows for memory pressure is that if device driver want to
> > make room it can evict stuff to system memory and if there is not enough
> > system memory than thing get reclaim as usual before device driver can
> > make progress on device memory reclaim.
> >
> >
> >> > Second device driver are not integrated that closely within mm and the
> >> > scheduler kernel code to allow to efficiently plug in device access
> >> > notification to page (ie to update struct page so that numa worker
> >> > thread can migrate memory base on accurate informations).
> >> >
> >> > Third it can be hard to decide who win between CPU and device access
> >> > when it comes to updating thing like last CPU id.
> >> >
> >> > Fourth there is no such thing like device id ie equivalent of CPU id.
> >> > If we were to add something the CPU id field in flags of struct page
> >> > would not be big enough so this can have repercusion on struct page
> >> > size. This is not an easy sell.
> >> >
> >> > They are other issues i can't think of right now. I think for now it
> >>
> >> My opinion is most of the issues are the same no matter use CDM or HMM-CDM.
> >> I just care about a more complete solution no matter CDM,HMM-CDM or other ways.
> >> HMM or HMM-CDM depends on device driver, but haven't see a public/full driver to
> >> demonstrate the whole solution works fine.
> >
> > I am working with NVidia close source driver team to make sure that it works
> > well for them. I am also working on nouveau open source driver for same NVidia
> > hardware thought it will be of less use as what is missing there is a solid
> > open source userspace to leverage this. Nonetheless open source driver are in
> > the work.
> 
> Can you point to the nouveau patches? I still find these HMM patches
> un-reviewable without an upstream consumer.

I am still working on those, i hope i will be able to post them in 3 weeks or so.

Cheers,
Jerome

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)

From: Jerome Glisse <jglisse@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Bob Liu <liubo95@huawei.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, John Hubbard <jhubbard@nvidia.com>,
	David Nellans <dnellans@nvidia.com>,
	Balbir Singh <bsingharora@gmail.com>,
	Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5
Date: Fri, 21 Jul 2017 11:22:04 -0400	[thread overview]
Message-ID: <20170721152203.GB3202@redhat.com> (raw)
In-Reply-To: <CAPcyv4jJraGPW214xJ+wU3G=88UUP45YiA6hV5_NvNZSNB4qGA@mail.gmail.com>

On Thu, Jul 20, 2017 at 08:48:20PM -0700, Dan Williams wrote:
> On Thu, Jul 20, 2017 at 6:41 PM, Jerome Glisse <jglisse@redhat.com> wrote:
> > On Fri, Jul 21, 2017 at 09:15:29AM +0800, Bob Liu wrote:
> >> On 2017/7/20 23:03, Jerome Glisse wrote:
> >> > On Wed, Jul 19, 2017 at 05:09:04PM +0800, Bob Liu wrote:
> >> >> On 2017/7/19 10:25, Jerome Glisse wrote:
> >> >>> On Wed, Jul 19, 2017 at 09:46:10AM +0800, Bob Liu wrote:
> >> >>>> On 2017/7/18 23:38, Jerome Glisse wrote:
> >> >>>>> On Tue, Jul 18, 2017 at 11:26:51AM +0800, Bob Liu wrote:
> >> >>>>>> On 2017/7/14 5:15, Jérôme Glisse wrote:
> >
> > [...]
> >
> >> >> Then it's more like replace the numa node solution(CDM) with ZONE_DEVICE
> >> >> (type MEMORY_DEVICE_PUBLIC). But the problem is the same, e.g how to make
> >> >> sure the device memory say HBM won't be occupied by normal CPU allocation.
> >> >> Things will be more complex if there are multi GPU connected by nvlink
> >> >> (also cache coherent) in a system, each GPU has their own HBM.
> >> >>
> >> >> How to decide allocate physical memory from local HBM/DDR or remote HBM/
> >> >> DDR?
> >> >>
> >> >> If using numa(CDM) approach there are NUMA mempolicy and autonuma mechanism
> >> >> at least.
> >> >
> >> > NUMA is not as easy as you think. First like i said we want the device
> >> > memory to be isolated from most existing mm mechanism. Because memory
> >> > is unreliable and also because device might need to be able to evict
> >> > memory to make contiguous physical memory allocation for graphics.
> >> >
> >>
> >> Right, but we need isolation any way.
> >> For hmm-cdm, the isolation is not adding device memory to lru list, and many
> >> if (is_device_public_page(page)) ...
> >>
> >> But how to evict device memory?
> >
> > What you mean by evict ? Device driver can evict whenever they see the need
> > to do so. CPU page fault will evict too. Process exit or munmap() will free
> > the device memory.
> >
> > Are you refering to evict in the sense of memory reclaim under pressure ?
> >
> > So the way it flows for memory pressure is that if device driver want to
> > make room it can evict stuff to system memory and if there is not enough
> > system memory than thing get reclaim as usual before device driver can
> > make progress on device memory reclaim.
> >
> >
> >> > Second device driver are not integrated that closely within mm and the
> >> > scheduler kernel code to allow to efficiently plug in device access
> >> > notification to page (ie to update struct page so that numa worker
> >> > thread can migrate memory base on accurate informations).
> >> >
> >> > Third it can be hard to decide who win between CPU and device access
> >> > when it comes to updating thing like last CPU id.
> >> >
> >> > Fourth there is no such thing like device id ie equivalent of CPU id.
> >> > If we were to add something the CPU id field in flags of struct page
> >> > would not be big enough so this can have repercusion on struct page
> >> > size. This is not an easy sell.
> >> >
> >> > They are other issues i can't think of right now. I think for now it
> >>
> >> My opinion is most of the issues are the same no matter use CDM or HMM-CDM.
> >> I just care about a more complete solution no matter CDM,HMM-CDM or other ways.
> >> HMM or HMM-CDM depends on device driver, but haven't see a public/full driver to
> >> demonstrate the whole solution works fine.
> >
> > I am working with NVidia close source driver team to make sure that it works
> > well for them. I am also working on nouveau open source driver for same NVidia
> > hardware thought it will be of less use as what is missing there is a solid
> > open source userspace to leverage this. Nonetheless open source driver are in
> > the work.
> 
> Can you point to the nouveau patches? I still find these HMM patches
> un-reviewable without an upstream consumer.

I am still working on those, i hope i will be able to post them in 3 weeks or so.

Cheers,
Jérôme

next prev parent reply	other threads:[~2017-07-21 15:22 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-13 21:15 [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5 Jérôme Glisse
2017-07-13 21:15 ` Jérôme Glisse
2017-07-13 21:15 ` [PATCH 1/6] mm/zone-device: rename DEVICE_PUBLIC to DEVICE_HOST Jérôme Glisse
2017-07-13 21:15   ` Jérôme Glisse
2017-07-17  9:09   ` Balbir Singh
2017-07-17  9:09     ` Balbir Singh
2017-07-13 21:15 ` [PATCH 2/6] mm/device-public-memory: device memory cache coherent with CPU v4 Jérôme Glisse
2017-07-13 21:15   ` Jérôme Glisse
2017-07-13 23:01   ` Balbir Singh
2017-07-13 23:01     ` Balbir Singh
2017-07-13 21:15 ` [PATCH 3/6] mm/hmm: add new helper to hotplug CDM memory region v3 Jérôme Glisse
2017-07-13 21:15   ` Jérôme Glisse
2017-07-13 21:15 ` [PATCH 4/6] mm/memcontrol: allow to uncharge page without using page->lru field Jérôme Glisse
2017-07-13 21:15   ` Jérôme Glisse
2017-07-17  9:10   ` Balbir Singh
2017-07-17  9:10     ` Balbir Singh
2017-07-17  9:10     ` Balbir Singh
2017-07-13 21:15 ` [PATCH 5/6] mm/memcontrol: support MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_PUBLIC v3 Jérôme Glisse
2017-07-13 21:15   ` Jérôme Glisse
2017-07-13 21:15   ` Jérôme Glisse
2017-07-17  9:15   ` Balbir Singh
2017-07-17  9:15     ` Balbir Singh
2017-07-17  9:15     ` Balbir Singh
2017-07-13 21:15 ` [PATCH 6/6] mm/hmm: documents how device memory is accounted in rss and memcg Jérôme Glisse
2017-07-13 21:15   ` Jérôme Glisse
2017-07-14 13:26   ` Michal Hocko
2017-07-14 13:26     ` Michal Hocko
2017-07-18  3:26 ` [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5 Bob Liu
2017-07-18  3:26   ` Bob Liu
2017-07-18 15:38   ` Jerome Glisse
2017-07-18 15:38     ` Jerome Glisse
2017-07-19  1:46     ` Bob Liu
2017-07-19  1:46       ` Bob Liu
2017-07-19  2:25       ` Jerome Glisse
2017-07-19  2:25         ` Jerome Glisse
2017-07-19  9:09         ` Bob Liu
2017-07-19  9:09           ` Bob Liu
2017-07-20 15:03           ` Jerome Glisse
2017-07-20 15:03             ` Jerome Glisse
2017-07-21  1:15             ` Bob Liu
2017-07-21  1:15               ` Bob Liu
2017-07-21  1:41               ` Jerome Glisse
2017-07-21  1:41                 ` Jerome Glisse
2017-07-21  2:10                 ` Bob Liu
2017-07-21  2:10                   ` Bob Liu
2017-07-21 12:01                   ` Bob Liu
2017-07-21 12:01                     ` Bob Liu
2017-07-21 15:21                     ` Jerome Glisse
2017-07-21 15:21                       ` Jerome Glisse
2017-07-21  3:48                 ` Dan Williams
2017-07-21  3:48                   ` Dan Williams
2017-07-21 15:22                   ` Jerome Glisse [this message]
2017-07-21 15:22                     ` Jerome Glisse
2017-09-05 19:36                   ` Jerome Glisse
2017-09-05 19:36                     ` Jerome Glisse
2017-09-09 23:22                     ` Bob Liu
2017-09-09 23:22                       ` Bob Liu
2017-09-11 23:36                       ` Jerome Glisse
2017-09-11 23:36                         ` Jerome Glisse
2017-09-12  1:02                         ` Bob Liu
2017-09-12  1:02                           ` Bob Liu
2017-09-12 16:17                           ` Jerome Glisse
2017-09-12 16:17                             ` Jerome Glisse
2017-09-26  9:56                         ` Bob Liu
2017-09-26  9:56                           ` Bob Liu
2017-09-26 16:16                           ` Jerome Glisse
2017-09-26 16:16                             ` Jerome Glisse
2017-09-30  2:57                             ` Bob Liu
2017-09-30  2:57                               ` Bob Liu
2017-09-30 22:49                               ` Jerome Glisse
2017-09-30 22:49                                 ` Jerome Glisse
2017-10-11 13:15                                 ` Bob Liu
2017-10-11 13:15                                   ` Bob Liu
2017-10-12 15:37                                   ` Jerome Glisse
2017-10-12 15:37                                     ` Jerome Glisse
2017-11-16  2:10                                     ` chet l
2017-11-16  2:10                                       ` chet l
2017-11-16  2:44                                       ` Jerome Glisse
2017-11-16  2:44                                         ` Jerome Glisse
2017-11-16  3:23                                         ` chetan L
2017-11-16  3:23                                           ` chetan L
2017-11-16  3:29                                           ` chetan L
2017-11-16  3:29                                             ` chetan L
2017-11-16 21:29                                             ` Jerome Glisse
2017-11-16 21:29                                               ` Jerome Glisse
2017-11-16 22:41                                               ` chetan L
2017-11-16 22:41                                                 ` chetan L
2017-11-16 23:11                                                 ` Jerome Glisse
2017-11-16 23:11                                                   ` Jerome Glisse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170721152203.GB3202@redhat.com \
    --to=jglisse@redhat.com \
    --cc=bsingharora@gmail.com \
    --cc=dan.j.williams@intel.com \
    --cc=dnellans@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=liubo95@huawei.com \
    --cc=mhocko@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.