From: Valmiki <valmikibow@gmail.com>
To: Ralph Campbell <rcampbell@nvidia.com>, linux-mm@kvack.org
Cc: jglisse@redhat.com
Subject: Re: Regarding HMM
Date: Sun, 23 Aug 2020 18:38:16 +0530
Message-ID: <6b768b7d-e754-ebea-8467-005c38db6dd9@gmail.com>
In-Reply-To: <3482c2c7-6827-77f7-a581-69af8adc73c3@nvidia.com>


On 18-08-2020 10:36 pm, Ralph Campbell wrote:
> 
> On 8/18/20 12:15 AM, Valmiki wrote:
>> Hi All,
>>
>> I'm trying to understand heterogeneous memory management, and I have
>> the following questions.
>>
>> If HMM is being used, do we no longer need the DMA controller on the
>> device for memory transfers?
>> Without DMA, if software is managing page faults and migrations, will
>> there be any performance impact?
>>
>> Is HMM targeted at specific use cases where the device has no DMA
>> controller?
>>
>> Regards,
>> Valmiki
>>
> 
> "HMM" comprises two APIs that are independent of each other.
> 
> hmm_range_fault() gets the physical address of a system-resident memory
> page so that a device can map it, without pinning the page in the usual
> way (I/O normally pins a page by raising its reference count). The
> device driver has to handle invalidation callbacks to remove the device
> mapping. This lets the device access the page without moving it.
> 
> migrate_vma_setup(), migrate_vma_pages(), and migrate_vma_finalize() are
> used by the device driver to migrate data to device private memory. After
> migration, the system memory is freed and the CPU page table holds an
> invalid PTE that points to the device private struct page (similar to a
> swap PTE). If the CPU process faults on that address, there is a callback
> to the driver to migrate the data back to system memory. This is where
> device DMA engines can be used to copy data between system memory and
> device private memory.
> 
> The use case for the above is running code such as OpenCL on GPUs and
> CPUs using the same virtual addresses, without having to call special
> memory allocators. In other words, just use mmap() and malloc() and not
> clSVMAlloc().
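
To make the allocator point concrete, here is a minimal userspace sketch of
the CPU side only. It assumes nothing about any particular driver: the
function names are hypothetical, and the step where a pointer would be handed
to the device is only marked by a comment, since that call depends on the
runtime in use.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Stand-in for CPU work: write 0..n-1 into buf and return the sum. */
uint64_t fill_and_sum(uint32_t *buf, size_t n)
{
    uint64_t sum = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < n; i++) {
        buf[i] = (uint32_t)i;
        sum += buf[i];
    }
    return sum;
}

/*
 * Allocate one buffer with malloc() and one with anonymous mmap(),
 * the two ordinary allocators named in the mail above.
 * Returns 0 on success, -1 if either allocation fails.
 */
int demo_plain_allocations(size_t n, uint64_t *heap_sum, uint64_t *map_sum)
{
    size_t len = n * sizeof(uint32_t);
    uint32_t *heap = malloc(len);
    uint32_t *anon;

    if (heap == NULL)
        return -1;

    anon = mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (anon == MAP_FAILED) {
        free(heap);
        return -1;
    }

    *heap_sum = fill_and_sum(heap, n);
    *map_sum = fill_and_sum(anon, n);

    /*
     * Assumption: with an HMM-capable driver, `heap` and `anon` could be
     * passed to the device at this point using these same virtual
     * addresses. No device call is made in this sketch.
     */

    munmap(anon, len);
    free(heap);
    return 0;
}
```

Calling demo_plain_allocations(1024, &a, &b) from a main() fills both buffers
through ordinary pointers; neither allocation involves a device-specific
allocator.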
> 
> There is a performance consideration here. If the GPU accesses data in
> system memory over PCIe, it gets much less bandwidth than when accessing
> local GPU memory. If the data is to be accessed many times, it can be
> more efficient to migrate it to local GPU memory. If the data is only
> accessed a few times, then it is probably more efficient to map system
> memory.
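
The trade-off described above can be sketched as a rough cost model. This is
only an illustration under stated assumptions, not a measurement: the
function names are hypothetical, the model assumes one copy to the device and
one copy back over the same bus, and it folds fault handling and page table
updates into a single fixed overhead.

```c
#include <assert.h>

/* Time for the GPU to read the buffer `passes` times directly from
 * system memory over the bus (no migration). */
double remote_seconds(double bytes, double passes, double pcie_bps)
{
    assert(pcie_bps > 0.0);
    return passes * bytes / pcie_bps;
}

/* Time to migrate the buffer to device memory, read it `passes` times
 * locally, and migrate it back. Assumed model: both copies cross the
 * bus once, plus a fixed software overhead in seconds. */
double migrate_seconds(double bytes, double passes, double pcie_bps,
                       double local_bps, double fixed_overhead_s)
{
    double copies, local;

    assert(pcie_bps > 0.0 && local_bps > 0.0);
    copies = 2.0 * bytes / pcie_bps;       /* to device and back */
    local = passes * bytes / local_bps;    /* accesses in device memory */
    return copies + fixed_overhead_s + local;
}

/* Returns 1 if migrating is faster than mapping system memory under
 * this model, 0 otherwise. */
int migration_pays_off(double bytes, double passes, double pcie_bps,
                       double local_bps, double fixed_overhead_s)
{
    return migrate_seconds(bytes, passes, pcie_bps, local_bps,
                           fixed_overhead_s)
           < remote_seconds(bytes, passes, pcie_bps);
}
```

With made-up figures of a 1 GB buffer, a 10 GB/s bus, 100 GB/s local memory,
and 50 ms of overhead, the model favours mapping system memory for 2 passes
and migrating for 3 or more. Real break-even points depend on the hardware
and access pattern.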
Thanks, Ralph, for the clarification.

