From: Jerome Glisse <jglisse@redhat.com>
To: "Figo.zhang" <figo1802@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, John Hubbard <jhubbard@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	David Nellans <dnellans@nvidia.com>
Subject: Re: [HMM 00/16] HMM (Heterogeneous Memory Management) v19
Date: Thu, 6 Apr 2017 00:59:50 -0400	[thread overview]
Message-ID: <20170406045950.GA12362@redhat.com> (raw)
In-Reply-To: <CAF7GXvptCfV89rAi=j1cy1df12039GDpq_DHOyx+_xk0FjBDPg@mail.gmail.com>

On Thu, Apr 06, 2017 at 11:22:12AM +0800, Figo.zhang wrote:

[...]

> > Heterogeneous Memory Management (HMM) (description and justification)
> >
> > Today device drivers expose a dedicated memory allocation API through
> > their device file, often relying on a combination of IOCTL and mmap
> > calls. The device can only access and use memory allocated through this
> > API. This effectively splits the program address space into objects
> > allocated for the device and usable by it, and other regular memory
> > (malloc, mmap of a file, shared memory, ...) only accessible by the CPU
> > (or in a very limited way by a device, through pinned memory).
> >
> > Allowing different isolated components of a program to use a device thus
> > requires duplicating the input data structures using the device memory
> > allocator. This is reasonable for simple data structures (arrays, grids,
> > images, ...) but it gets extremely complex with advanced data structures
> > (lists, trees, graphs, ...) that rely on a web of memory pointers. This
> > is becoming a serious limitation on the kind of workloads that can be
> > offloaded to devices like GPUs.
> >
> 
> How is this handled by the current GPU software stack? By maintaining a
> complex middleware framework/HAL?

Yes, you still need a framework like OpenCL or CUDA. There is work under
way to leverage the GPU directly from languages like C++, so I expect
that the HAL will be hidden more and more from a larger group of
programmers. Note that I still expect some programmers will want to
program closer to the hardware to extract every bit of performance they
can.

For OpenCL you need HMM to implement what is described as the
fine-grained system SVM memory model (see the OpenCL 2.0 or later
specification).
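
To make that concrete, here is a minimal host-side sketch of fine-grained
system SVM (my illustration, not code from the patchset): the kernel
argument is a plain malloc()ed pointer passed with
clSetKernelArgSVMPointer(), with no clSVMAlloc() and no explicit copies.
It assumes an OpenCL 2.x platform whose first GPU reports
CL_DEVICE_SVM_FINE_GRAIN_SYSTEM support, and error checking is omitted
for brevity.

#define CL_TARGET_OPENCL_VERSION 200
#include <CL/cl.h>
#include <stdio.h>
#include <stdlib.h>

static const char *src =
    "__kernel void scale(__global float *data, float f) {"
    "    size_t i = get_global_id(0);"
    "    data[i] *= f;"
    "}";

int main(void)
{
    cl_platform_id platform;
    cl_device_id dev;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueueWithProperties(ctx, dev, NULL, NULL);
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, "-cl-std=CL2.0", NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "scale", NULL);

    /* Plain malloc()ed memory: with fine-grained system SVM the GPU can
     * dereference this pointer directly, which is the sharing model HMM
     * lets a driver back. */
    size_t n = 1 << 20;
    float *data = malloc(n * sizeof(*data));
    for (size_t i = 0; i < n; i++)
        data[i] = (float)i;

    float factor = 2.0f;
    clSetKernelArgSVMPointer(k, 0, data);   /* pass the raw host pointer */
    clSetKernelArg(k, 1, sizeof(factor), &factor);
    clEnqueueNDRangeKernel(q, k, 1, NULL, &n, NULL, 0, NULL, NULL);
    clFinish(q);

    printf("data[42] = %f\n", data[42]);    /* CPU reads the result in place */
    free(data);
    return 0;
}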

> > New industry standards like C++, OpenCL, and CUDA are pushing to remove
> > this barrier. This requires a shared address space between the GPU
> > device and the CPU so that the GPU can access any memory of a process
> > (while still obeying memory protections like read-only).
> 
> Can the GPU access the whole process's VMAs, or only those VMAs whose
> backing system memory has been migrated into the GPU page table?

The whole process's VMAs; memory does not need to be migrated to device
memory. Migration is an optional feature that is necessary for
performance, but the GPU can access system memory just fine.
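
As an illustration of what that buys you (again my sketch, not patchset
code): a pointer-rich structure built with the regular allocator can be
handed to the device as-is. gpu_sum_list() below is a hypothetical
stand-in for whatever launch API a runtime would expose; it is stubbed
on the CPU here only so the sketch compiles and runs.

#include <stdio.h>
#include <stdlib.h>

struct node {
    struct node *next;
    int value;
};

/* Hypothetical device launch: with a shared address space the GPU would
 * walk the very same pointers the CPU uses.  CPU stub for illustration. */
static long gpu_sum_list(const struct node *head)
{
    long sum = 0;
    for (; head; head = head->next)
        sum += head->value;
    return sum;
}

int main(void)
{
    struct node *head = NULL;

    /* Build the list with plain malloc(); no device-specific allocator,
     * no duplication of the structure into device objects. */
    for (int i = 0; i < 1000; i++) {
        struct node *n = malloc(sizeof(*n));
        n->value = i;
        n->next = head;
        head = n;
    }

    /* The device dereferences head, head->next, ... directly; pages are
     * faulted in (and optionally migrated to device memory) by HMM. */
    printf("sum = %ld\n", gpu_sum_list(head));

    while (head) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
    return 0;
}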

[...]

> > When a page backing an address of a process is migrated to device
> > memory, the CPU page table entry is set to a new, specific swap entry.
> > A CPU access to such an address triggers a migration back to system
> > memory, just as if the page had been swapped out to disk. HMM also
> > blocks anyone from pinning a ZONE_DEVICE page so that it can always be
> > migrated back to system memory if the CPU accesses it. Conversely, HMM
> > does not migrate to device memory any page that is pinned in system
> > memory.
> >
> 
> Is the purpose of migrating system pages to the device so that the
> device can read system memory?
> If the CPU/programs want to read the device data, do they need to
> pin/map the device memory into the process address space?
> If multiple applications want to read the same device memory region
> concurrently, how is that done?

The purpose of migrating to device memory is to leverage the device
memory bandwidth. PCIe bandwidth is about 32GB/s, while device memory
bandwidth is between 256GB/s and 1TB/s; device memory also has lower
latency.

The CPU cannot access device memory. Strictly speaking it can, in a
limited way over PCIe, but that would violate the memory model
programmers get for regular system memory, hence for all intents and
purposes it is better to say that the CPU cannot access any of the
device memory.

Shared VMAs will just work, so if a VMA is shared between two processes
then both processes can access the same memory. All the semantics that
are valid on the CPU are also valid on the GPU. Nothing changes there.
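
For example, the usual CPU-side sharing setup is all there is to it. The
small sketch below (purely illustrative, no GPU code) creates a
MAP_SHARED anonymous mapping visible to parent and child; under HMM a
device mirroring either process would see the same page at the same
address, with the same protections.

#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* Anonymous shared mapping: one VMA, visible to parent and child. */
    int *shared = mmap(NULL, sizeof(*shared), PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return 1;

    *shared = 0;
    pid_t pid = fork();
    if (pid == 0) {
        *shared = 42;          /* child writes through the shared VMA */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    printf("parent sees %d\n", *shared);   /* prints 42 */

    /* A GPU mirroring either process would read/write the same page
     * through the same address, nothing changes in the semantics. */
    munmap(shared, sizeof(*shared));
    return 0;
}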


> It would be better to have a graph showing how the CPU and GPU share
> the address space.

I am not good at making ASCII graphs, nor would I know how to graph
this. Any valid address on the CPU is valid on the GPU, that's it
really. Migration to device memory is orthogonal to the shared address
space.

Cheers,
Jérôme


