From: Jerome Glisse <j.glisse@gmail.com>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Rik van Riel <riel@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
linux-mm <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Mel Gorman <mgorman@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
Linda Wang <lwang@redhat.com>, Kevin E Martin <kem@redhat.com>,
Jerome Glisse <jglisse@redhat.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Johannes Weiner <jweiner@redhat.com>,
Larry Woodman <lwoodman@redhat.com>,
Dave Airlie <airlied@redhat.com>, Jeff Law <law@redhat.com>,
Brendan Conoboy <blc@redhat.com>,
Joe Donohue <jdonohue@redhat.com>,
Duncan Poole <dpoole@nvidia.com>,
Sherry Cheung <SCheung@nvidia.com>,
Subhash Gutti <sgutti@nvidia.com>,
John Hubbard <jhubbard@nvidia.com>,
Mark Hairgrove <mhairgrove@nvidia.com>,
Lucien Dunning <ldunni
Subject: Re: [RFC] Heterogeneous memory management (mirror process address space on a device mmu).
Date: Tue, 6 May 2014 14:26:47 -0400
Message-ID: <20140506182645.GH6731@gmail.com>
In-Reply-To: <201405061817.s46IHFlD026027@mail.zytor.com>

On Tue, May 06, 2014 at 11:02:33AM -0700, H. Peter Anvin wrote:
>
> Nothing wrong with device-side memory, but not having it accessible by
> the CPU seems fundamentally broken from the point of view of unified
> memory addressing.

Unified memory addressing does not imply that the CPU and GPU work on the
same set of data at the same time. So having part of the address space
accessible only by the GPU while the GPU is actively working on it makes
sense. The GPU then gets low latency (no PCIe bus) and enormous bandwidth,
and thus performs the computation a lot faster.

Note that my patchset handles CPU page faults while the data is inside GPU
memory and migrates it back to system memory. So from the CPU's point of
view it is just as if things were in some kind of swap device, except that
the swap device is actually doing some useful computation.
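
To make the swap-device analogy concrete, here is a minimal user-space
sketch of the idea (this is not code from the patchset, and every name in
it is made up): CPU access to a page is revoked while a pretend "device"
owns the data, and a SIGSEGV handler plays the role of the HMM fault path,
migrating the data back to system memory on the first CPU touch.

#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static long page_sz;
static char *mirror;       /* page the fake "device" currently owns */
static char *device_copy;  /* the data while it lives on the "device" */

/* Stand-in for the HMM CPU fault path: on the first CPU touch of a
 * device-resident page, migrate the data back and restore access.
 * (Signal-handler safety rules are bent here for brevity.) */
static void on_fault(int sig, siginfo_t *si, void *ctx)
{
	char *page = (char *)((uintptr_t)si->si_addr &
			      ~(uintptr_t)(page_sz - 1));

	mprotect(page, page_sz, PROT_READ | PROT_WRITE);
	memcpy(page, device_copy, page_sz); /* "migrate" back to system RAM */
}

int main(void)
{
	struct sigaction sa = { .sa_flags = SA_SIGINFO };

	page_sz = sysconf(_SC_PAGESIZE);
	device_copy = malloc(page_sz);
	sigemptyset(&sa.sa_mask);
	sa.sa_sigaction = on_fault;
	sigaction(SIGSEGV, &sa, NULL);

	mirror = mmap(NULL, page_sz, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	/* Hand the page to the "device": snapshot it, revoke CPU access. */
	strcpy(mirror, "result computed on the device");
	memcpy(device_copy, mirror, page_sz);
	mprotect(mirror, page_sz, PROT_NONE);

	/* This read faults; the handler migrates the data back first. */
	printf("%s\n", mirror);
	return 0;
}

The final printf faults once, the handler pulls the bytes back, and the
access then succeeds, which is exactly the behavior described above, just
without a real device or any kernel involvement.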

Also, on the cache coherency front: cache coherency has a cost, a very
high cost. This is why even on an APU (where the GPU and CPU sit on the
same die and their MMUs have a privileged link, think today's AMD APUs or
next year's Intel Skylake) you have two memory links, one cache coherent
with the CPU and one that is not. The non-coherent link is way faster, and
my patchset is also intended to help take advantage of that second link
(http://developer.amd.com/wordpress/media/2013/06/1004_final.pdf).
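
The same trade-off already exists in the kernel's DMA API, so here is a
rough kernel-flavored sketch of it (the helper below is hypothetical; the
dma_* calls are the real API): a coherent allocation pays the coherency
cost on every access, while a streaming mapping over plain cached memory
is faster but needs explicit ownership hand-offs, which is the same
price/performance split as the APU's two links.

#include <linux/dma-mapping.h>

/* Hypothetical helper contrasting the two models described above. */
static int coherent_vs_streaming(struct device *dev, void *data, size_t len)
{
	dma_addr_t handle;
	void *buf;

	/* Coherent path: CPU and device always agree on the contents,
	 * but every access pays the coherency cost. Good for low-rate,
	 * shared control structures. */
	buf = dma_alloc_coherent(dev, len, &handle, GFP_KERNEL);
	if (!buf)
		return -ENOMEM;
	/* ... exchange control structures with the device here ... */
	dma_free_coherent(dev, len, buf, handle);

	/* Streaming (non-coherent) path: plain cached memory, much
	 * faster for bulk data, but the CPU must explicitly hand
	 * ownership to the device and back. */
	handle = dma_map_single(dev, data, len, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, handle))
		return -EIO;
	/* ... device crunches the bulk data over the fast link ... */
	dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);
	return 0;
}
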
Cheers,
Jérôme
>
> On May 6, 2014 9:54:08 AM PDT, Jerome Glisse <j.glisse@gmail.com> wrote:
> >On Tue, May 06, 2014 at 12:47:16PM -0400, Rik van Riel wrote:
> >> On 05/06/2014 12:34 PM, Linus Torvalds wrote:
> >> > On Tue, May 6, 2014 at 9:30 AM, Rik van Riel <riel@redhat.com>
> >wrote:
> >> >>
> >> >> The GPU runs a lot faster when using video memory, instead
> >> >> of system memory, on the other side of the PCIe bus.
> >> >
> >> > The nineties called, and they want their old broken model back.
> >> >
> >> > Get with the times. No high-performance future GPU will ever run
> >> > behind the PCIe bus. We still have a few straggling historical
> >> > artifacts, but everybody knows where the future is headed.
> >> >
> >> > They are already cache-coherent because flushing caches etc was too
> >> > damn expensive. They're getting more so.
> >>
> >> I suppose that VRAM could simply be turned into a very high
> >> capacity CPU cache for the GPU, for the case where people
> >> want/need an add-on card.
> >>
> >> With a few hundred MB of "CPU cache" on the video card, we
> >> could offload processing to the GPU very easily, without
> >> having to worry about multiple address or page table formats
> >> on the CPU side.
> >>
> >> A new generation of GPU hardware seems to come out every
> >> six months or so, so I guess we could live with TLB
> >> invalidations to the first generations of hardware being
> >> comically slow :)
> >>
> >
> >I do not want to speak for any GPU manufacturer but I think it is safe
> >to say that there will be dedicated GPU memory for a long time. It is
> >not going anywhere soon, and it is a lot more than a few hundred MB;
> >think several GB. If you think about 4k and 8k screens you are really
> >going to want at least 8GB on a desktop computer, and for compute you
> >will likely see 16GB or 32GB as common sizes.
> >
> >Again I stress that there is nothing on the horizon that leads me to
> >believe that regular memory attached to the CPU will ever come close
> >to the bandwidth of memory attached to the GPU. The latter is already
> >more than 10 times faster, and as far as I know the gap will grow even
> >more in the next generation.
> >
> >So memory dedicated to the GPU should not be dismissed as something
> >that is vanishing; quite the contrary, it should be acknowledged as
> >something that is here to stay for a lot longer, afaict.
> >
> >Cheers,
> >Jérôme
>
> --
> Sent from my mobile phone. Please pardon brevity and lack of formatting.