From: tgh
Subject: Re: Re: NUMA and SMP
Date: Wed, 21 Mar 2007 09:08:34 +0800
Message-ID: <46008592.5070603@ncic.ac.cn>
References: <907625E08839C4409CE5768403633E0B018E1879@sefsexmb1.amd.com> <8790346913e7b2e96fdc58199e039895@xensource.com> <45FFDD32.8030607@ncic.ac.cn> <1174398691.5642.43.camel@lapbode42.lrr.in.tum.de>
In-Reply-To: <1174398691.5642.43.camel@lapbode42.lrr.in.tum.de>
To: Daniel Stodden
Cc: Xen Developers
List-Id: xen-devel@lists.xenproject.org

Thank you for your reply.

Daniel Stodden wrote:
> On Tue, 2007-03-20 at 21:10 +0800, tgh wrote:
>> I am puzzled, what is the page migration?
>> Thank you in advance
>
> NUMA is clear? NUMA distributes main memory across multiple memory
> interfaces.
>
> This used to be a feature reserved for high-end multiprocessor
> architectures, but in servers it is becoming sort of a commodity these
> days, in part due to AMD multiprocessor systems being NUMA systems.
> AMD64 processors carry an integrated memory controller. So, if you
> buy an SMP machine with AMD processors today, you'd find each slice of
> the total memory being connected to a different processor inside.
>
> Note that this doesn't break the 'symmetric' in 'SMP': it still remains
> a global, flat physical address space. The processors have interconnects
> by which memory can be read from remote processors as well, and will do
> so transparently to system and application software.

That is, an SMP machine with AMD64 processors is NUMA in its hardware
architecture, but still SMP as seen by the system software. Is that right?
Thank you in advance

> [The alternative is rather the 'classic' model: Multiple processors
> interconnected making SMP, but a single memory interface in a single
> northbridge (Intel would call it the "MCH") connecting to the front-side
> bus, connecting all processors to main memory. Obviously, that
> single memory interface will easily become a bottleneck if all
> processors try to access memory simultaneously.]
>
> NUMA *may* help here: accessing local memory is very fast. Accessing
> remote memory is still pretty fast, but not as fast as it could be:
> hence 'NUMA' - non-uniform memory access.
>
> So, in order to take advantage of such a memory topology, memory data
> would ideally always be at the CPU where the processing happens. But
> processes (or domains, regarding xen) may migrate between different
> processors. Whether this happens depends on scheduling decisions.
> There's a cost involved in migration itself, so schedulers will do it
> ideally only if it really-makes-sense(TM).
>
> In order to keep a NUMA system happy, pages once allocated could be
> moved as well, to where the current CPU is. This is page migration.
> As you may imagine, it is even more costly, and unfortunately completely
> useless if cpu migration needs to happen on a regular basis. Therefore
> it's difficult to get it right. Getting it right depends on how much the
> scheduler and memory management know about where the memory asked for
> will be needed -- in advance. This is the hardest part: Most software
> won't tell, because the programming models employed today do not even
> recognize the fact that it may matter. Even if they did, in many
> cases it would be difficult to predict at all.
>
> regards,
> daniel
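
To make the trade-off described above concrete (page migration only pays
off if the CPU stays put long enough), here is a minimal sketch of a toy
cost model. Everything in it is an assumption for illustration: the
latency figures are made up, not measured, and the function names are
hypothetical and not part of Xen or any real scheduler.

```python
# Toy cost model for NUMA page migration.
# All numbers are hypothetical and only illustrate the reasoning.

LOCAL_NS = 100      # assumed latency of one access to local memory
REMOTE_NS = 150     # assumed latency of one access to remote memory
MIGRATE_NS = 5000   # assumed one-time cost of moving one page


def migration_pays_off(accesses_before_next_cpu_move: int) -> bool:
    """Return True if moving the page to the local node is cheaper than
    leaving it remote, for the given number of accesses that happen
    before the process (or domain) is moved to another CPU again."""
    stay_remote = accesses_before_next_cpu_move * REMOTE_NS
    migrate = MIGRATE_NS + accesses_before_next_cpu_move * LOCAL_NS
    return migrate < stay_remote


def break_even_accesses() -> int:
    """Smallest number of accesses for which migration wins.

    MIGRATE + n * LOCAL < n * REMOTE  <=>  n > MIGRATE / (REMOTE - LOCAL)
    """
    return MIGRATE_NS // (REMOTE_NS - LOCAL_NS) + 1


if __name__ == "__main__":
    print("break-even accesses:", break_even_accesses())
    for n in (10, 100, 101, 1000):
        print(n, "accesses -> migrate page?", migration_pays_off(n))
```

With these assumed numbers a page must be touched more than 100 times
before the next CPU migration for the move to be worthwhile. If the
scheduler moves the CPU more often than that, the page migration cost is
never recovered, which is the "completely useless" case in the quoted
text. The hard part in practice is that the access count is not known in
advance.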