From: Jerome Glisse
Subject: Re: [RFC] Heterogeneous memory management (mirror process address space on a device mmu).
Date: Tue, 6 May 2014 14:26:47 -0400

On Tue, May 06, 2014 at 11:02:33AM -0700, H. Peter Anvin wrote:
> Nothing wrong with device-side memory, but not having it accessible by
> the CPU seems fundamentally brown from the point of view of unified
> memory addressing.

Unified memory addressing does not imply that the CPU and the GPU work on
the same set of data at the same time. So having part of the address space
accessible only by the GPU while it is actively working on it makes sense.
The GPU then gets low latency (no PCIe bus) and enormous bandwidth, and
thus performs the computation a lot faster.

Note that my patchset handles CPU page faults while the data is inside GPU
memory and migrates it back to system memory. So from the CPU's point of
view it is just as if things were in some kind of swap device, except that
the swap device is actually doing some useful computation.
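To make that fault path concrete, here is a minimal sketch of the idea.
This is conceptual code only, not the actual patchset; every name in it
(struct dev_page, alloc_system_page, device_copy_to_system) is made up
for illustration. It just shows the swap-in-like flow: on a CPU fault
against data that currently lives in device memory, copy the page back
into a freshly allocated system page and hand that page to the CPU side.

/*
 * Conceptual sketch only, NOT the patchset code. All names here are
 * hypothetical, made up to illustrate the migrate-on-CPU-fault idea.
 */
#include <stdbool.h>

struct dev_page {
	void *device_addr;	/* where the data lives while the GPU owns it */
	void *system_addr;	/* system page it migrates back into */
	bool  on_device;	/* true while the data sits in device memory */
};

/* Hypothetical hooks the device driver would provide. */
extern void *alloc_system_page(void);
extern int device_copy_to_system(void *device_addr, void *system_addr);

/*
 * Called from the CPU fault path when the faulting address maps to data
 * that currently resides in device memory. Same model as swap-in: bring
 * the page back, then let the fault complete against system memory.
 */
int cpu_fault_migrate_back(struct dev_page *p)
{
	void *sys;

	if (!p->on_device)
		return 0;		/* already in system memory */

	sys = alloc_system_page();
	if (!sys)
		return -1;		/* no memory, caller retries the fault */

	if (device_copy_to_system(p->device_addr, sys))
		return -1;		/* copy failed, data stays on device */

	p->system_addr = sys;
	p->on_device = false;	/* CPU page table can now point at sys */
	return 0;
}

The real code of course also has to update page tables, take the right
locks and invalidate the device mapping, but the flow is the same as
swapping a page back in.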
Also on the cache coherency front: cache coherency has a cost, a very high
cost. This is why even on an APU (where the GPU and the CPU are on the same
die and the GPU and CPU MMUs have a privileged link, think today's AMD APUs
or next year's Intel Skylake) you have two memory links, one cache coherent
with the CPU and another one that is not. That second link is way faster,
and my patchset is also intended to help take advantage of it
(http://developer.amd.com/wordpress/media/2013/06/1004_final.pdf).

Cheers,
Jérôme

>
> On May 6, 2014 9:54:08 AM PDT, Jerome Glisse wrote:
> >On Tue, May 06, 2014 at 12:47:16PM -0400, Rik van Riel wrote:
> >> On 05/06/2014 12:34 PM, Linus Torvalds wrote:
> >> > On Tue, May 6, 2014 at 9:30 AM, Rik van Riel wrote:
> >> >>
> >> >> The GPU runs a lot faster when using video memory, instead
> >> >> of system memory, on the other side of the PCIe bus.
> >> >
> >> > The nineties called, and they want their old broken model back.
> >> >
> >> > Get with the times. No high-performance future GPU will ever run
> >> > behind the PCIe bus. We still have a few straggling historical
> >> > artifacts, but everybody knows where the future is headed.
> >> >
> >> > They are already cache-coherent because flushing caches etc was too
> >> > damn expensive. They're getting more so.
> >>
> >> I suppose that VRAM could simply be turned into a very high
> >> capacity CPU cache for the GPU, for the case where people
> >> want/need an add-on card.
> >>
> >> With a few hundred MB of "CPU cache" on the video card, we
> >> could offload processing to the GPU very easily, without
> >> having to worry about multiple address or page table formats
> >> on the CPU side.
> >>
> >> A new generation of GPU hardware seems to come out every
> >> six months or so, so I guess we could live with TLB
> >> invalidations to the first generations of hardware being
> >> comically slow :)
> >>
> >
> >I do not want to speak for any GPU manufacturer, but I think it is safe
> >to say that there will be dedicated GPU memory for a long time. It is
> >not going anywhere soon and it is a lot more than a few hundred MB,
> >think several GB. If you think about 4k and 8k screens you're really
> >gonna want 8GB at least on a desktop computer, and for compute you will
> >likely see 16GB or 32GB as common sizes.
> >
> >Again I stress that there is nothing on the horizon that lets me believe
> >that regular memory attached to the CPU will ever come close to the
> >bandwidth of memory attached to the GPU. It is already more than 10
> >times faster on the GPU, and as far as I know the gap will grow even
> >more in the next generation.
> >
> >So dedicated GPU memory should not be dismissed as something that is
> >vanishing; quite the contrary, it should be acknowledged as something
> >that is here to stay a lot longer, afaict.
> >
> >Cheers,
> >Jérôme
>
> --
> Sent from my mobile phone. Please pardon brevity and lack of formatting.