From: Simon Jeons <simon.jeons@gmail.com>
To: Jerome Glisse <j.glisse@gmail.com>
Cc: Shachar Raindel <raindel@mellanox.com>,
lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
Andrea Arcangeli <aarcange@redhat.com>,
Roland Dreier <roland@purestorage.com>,
Haggai Eran <haggaie@mellanox.com>,
Or Gerlitz <ogerlitz@mellanox.com>,
Sagi Grimberg <sagig@mellanox.com>,
Liran Liss <liranl@mellanox.com>
Subject: Re: [LSF/MM TOPIC] Hardware initiated paging of user process pages, hardware access to the CPU page tables of user processes
Date: Tue, 16 Apr 2013 15:03:23 +0800
Message-ID: <516CF7BB.3050301@gmail.com>
In-Reply-To: <CAH3drwbjQa2Xms30b8J_oEUw7Eikcno-7Xqf=7=da3LHWXvkKA@mail.gmail.com>
Hi Jerome,
On 02/08/2013 11:21 PM, Jerome Glisse wrote:
> On Fri, Feb 8, 2013 at 6:18 AM, Shachar Raindel <raindel@mellanox.com> wrote:
>> Hi,
>>
>> We would like to present a reference implementation for safely sharing
>> memory pages from user space with the hardware, without pinning.
>>
>> We will be happy to hear the community feedback on our prototype
>> implementation, and suggestions for future improvements.
>>
>> We would also like to discuss adding features to the core MM subsystem to
>> assist hardware access to user memory without pinning.
>>
>> Following is a longer motivation and explanation on the technology
>> presented:
>>
>> Many application developers would like to be able to communicate
>> directly with the hardware from userspace.
>>
>> Use cases for that include high-performance networking APIs such as
>> InfiniBand, RoCE and iWARP, and interfacing with GPUs.
>>
>> Currently, if the user space application wants to share system memory with
>> the hardware device, the kernel component must pin the memory pages in RAM,
>> using get_user_pages.
>>
>> This is a hurdle, as it usually makes large portions of the application memory
>> unmovable. This pinning also makes the user space development model very
>> complicated: one needs to register memory before using it for communication
>> with the hardware.
>>
>> We use the mmu-notifiers [1] mechanism to inform the hardware when the
>> mapping of a page is changed. If the hardware tries to access a page which
>> is not yet mapped for the hardware, it requests a resolution for the page
>> address from the kernel.
>>
>> This mechanism allows the hardware to access the entire address space of the
>> user application, without pinning even a single page.
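
The mechanism described above can be sketched as a toy userspace model in C.
This is only an illustration under loud assumptions: every name here
(toy_notifier_invalidate, toy_kernel_resolve, toy_device_access and so on) is
hypothetical, nothing below is a kernel API, and the "page table" is just an
array. It shows the two halves of the idea: the kernel side invalidates the
device's copy of a mapping before changing it, and the device asks the kernel
to resolve an address only when it misses.

```c
#include <stddef.h>

#define NPAGES   8
#define NO_FRAME (-1)

/* Toy "CPU page table": virtual page number -> frame, or NO_FRAME. */
static int cpu_pt[NPAGES];
/* Toy device-side secondary TLB, a lazily filled mirror of cpu_pt. */
static int dev_tlb[NPAGES];
static int dev_faults;   /* number of resolution requests from the device */
static int next_frame;   /* trivial frame allocator for the toy kernel */

static void toy_init(void)
{
    for (size_t i = 0; i < NPAGES; i++) {
        cpu_pt[i] = NO_FRAME;
        dev_tlb[i] = NO_FRAME;
    }
    dev_faults = 0;
    next_frame = 0;
}

/* Stand-in for an MMU-notifier callback: drop the device's stale entry. */
static void toy_notifier_invalidate(size_t vpn)
{
    dev_tlb[vpn] = NO_FRAME;
}

/* Kernel changes a mapping (e.g. swap-out): notify first, then update. */
static void toy_kernel_remap(size_t vpn, int new_frame)
{
    toy_notifier_invalidate(vpn);
    cpu_pt[vpn] = new_frame;
}

/* Kernel resolves a device fault, paging the page in if needed. */
static int toy_kernel_resolve(size_t vpn)
{
    if (cpu_pt[vpn] == NO_FRAME)
        cpu_pt[vpn] = next_frame++;
    return cpu_pt[vpn];
}

/* Device access: no pinning; a miss triggers a resolution request. */
static int toy_device_access(size_t vpn)
{
    if (dev_tlb[vpn] == NO_FRAME) {
        dev_faults++;
        dev_tlb[vpn] = toy_kernel_resolve(vpn);
    }
    return dev_tlb[vpn];
}

/* Walk through: fault, hit, invalidation, fault again. */
static int toy_demo(void)
{
    toy_init();
    toy_device_access(3);            /* miss: device fault, page resolved */
    toy_device_access(3);            /* hit: no kernel involvement */
    toy_kernel_remap(3, NO_FRAME);   /* "swap-out": device entry dropped */
    toy_device_access(3);            /* miss again: resolved to a new frame */
    return dev_faults;
}
```

In this model the device never holds a reference that keeps a page in place;
correctness depends only on the invalidation reaching the device before the
mapping changes, which is the ordering toy_kernel_remap() imitates.
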
>>
>> We would like to use the LSF/MM forum opportunity to discuss open issues we
>> have for further development, such as:
>>
>> -Allowing the hardware to perform a page table walk, similar to
>> get_user_pages_fast, to resolve user pages that are already in RAM.
get_user_pages_fast just takes a reference on the page rather than populating
the PTE in the page table, correct? Then how can the GPU driver use the IOMMU to
access the page?
>>
>> -Batching page eviction by various kernel subsystems (swapper, page-cache)
>> to reduce the amount of communication needed with the hardware in such
>> events
>>
>> -Hinting from the hardware to the MM regarding page fetches which are
>> speculative, similarly to prefetching done by the page-cache
>>
>> -Page-in notifications from the kernel to the driver, such that we can keep
>> our secondary TLB in sync with the kernel page table without incurring page
>> faults.
>>
>> -Allowed and banned actions while in an MMU notifier callback. We have
>> already done some work on making the MMU notifiers sleepable [2], but there
>> might be additional limitations, which we would like to discuss.
>>
>> -Hinting from the MMU notifiers as to the reason for the notification - for
>> example we would like to react differently if a page was moved by NUMA
>> migration vs. a page being swapped out.
>>
>> [1] http://lwn.net/Articles/266320/
>>
>> [2] http://comments.gmane.org/gmane.linux.kernel.mm/85002
>>
>> Thanks,
>>
>> --Shachar
> As a GPU driver developer I can say that this is something we want to
> do in the very near future. Also, I think we would like another
> capability:
>
> - a hint to the mm on memory ranges that are best not evicted (it is easier
> for the driver to know what is hot and going to see activity)
>
> Dunno how big the change to the page eviction path would need to be.
>
> Cheers,
> Jerome
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>