From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:38113) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLurv-0002yW-H7 for qemu-devel@nongnu.org; Thu, 12 Feb 2015 09:35:55 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YLurp-0004z4-Lr for qemu-devel@nongnu.org; Thu, 12 Feb 2015 09:35:51 -0500
Received: from greensocs.com ([193.104.36.180]:10691) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLurp-0004yo-8x for qemu-devel@nongnu.org; Thu, 12 Feb 2015 09:35:45 -0500
From: Mark Burton
Content-Type: multipart/alternative; boundary="Apple-Mail=_33927468-793C-4514-8B25-B24573F2D3F4"
Message-Id:
Date: Thu, 12 Feb 2015 15:35:39 +0100
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
Subject: [Qemu-devel] Help on TLB Flush
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Peter Maydell , qemu-devel , mttcg@listserver.greensocs.com

--Apple-Mail=_33927468-793C-4514-8B25-B24573F2D3F4
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=utf-8

TLB Flush:

We have spent a few days on this issue, and still haven’t resolved the best path.

Our solution seems to work, most of the time, but we still have some strange issues - so I want to check that what we are proposing has a chance of working.

Our plan is to allow all CPU’s to continue. Potentially one CPU will want to write to the TLBs. Subsequent to the write, it requests a TLB Flush. We are proposing to implement this by signalling all other CPU’s to exit (and requesting they flush before re-starting). In other words, this would happen asynchronously.

This means - there is a theoretical period of time when one CPU is writing to the TLBs while other CPU’s are executing.
Our belief is that this has to be handled by software anyway, and this should not be an issue from Qemu’s point of view.
The alternative would be to force all other CPU’s to exit before writing the TLB’s - this is both expensive and very painful to organise (as we get into horrid deadlocks whichever way we turn)…

We’d appreciate some thoughts on this...

Cheers

Mark.

+44 (0)20 7100 3485 x 210
+33 (0)5 33 52 01 77x 210

+33 (0)603762104
mark.burton

--Apple-Mail=_33927468-793C-4514-8B25-B24573F2D3F4
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset=utf-8

--Apple-Mail=_33927468-793C-4514-8B25-B24573F2D3F4--

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:39816) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLv1O-000800-Tn for qemu-devel@nongnu.org; Thu, 12 Feb 2015 09:45:40 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YLv1J-00081g-Tx for qemu-devel@nongnu.org; Thu, 12 Feb 2015 09:45:38 -0500
Received: from cantor2.suse.de ([195.135.220.15]:36976 helo=mx2.suse.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLv1J-00080z-MN for qemu-devel@nongnu.org; Thu, 12 Feb 2015 09:45:33 -0500
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.1 \(1993\))
From: Alexander Graf
In-Reply-To:
Date: Thu, 12 Feb 2015 15:45:29 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de>
References:
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Mark Burton
Cc: mttcg@listserver.greensocs.com, Peter Maydell , qemu-devel

> On 12.02.2015, at 15:35, Mark Burton wrote:
>
>
> TLB Flush:
>
> We have spent a few days on this issue, and still haven’t resolved the best path.
>
> Our solution seems to work, most of the time, but we still have some strange issues - so I want to check that what we are proposing has a chance of working.
>
>
> Our plan is to allow all CPU’s to continue. Potentially one CPU will want to write to the TLBs. Subsequent to the write, it requests a TLB Flush.

Local or global? For local TLB flushes you don't notify the other CPUs at all. For global ones, the semantics of the call usually dictate atomicity.

> We are proposing to implement this by signalling all other CPU’s to exit (and requesting they flush before re-starting). In other words, this would happen asynchronously.

For global flushes, give them a pointer payload along with the flush request and tell all cpus to increment it atomically. In your main thread, wait until *ptr == nKickedCpus.

FWIW TLBs are always CPU local. When there's a "global TLB flush" instruction, it pretty much does stall the CPU, notifies the others to also flush their TLBs, waits and then continues.

If this really does become a performance bottleneck (which I doubt it does, almost nobody except x86 does global flushes), you can also do some nasty hacky tricks, such as (atomically) change the valid bit in remote CPUs TLB entries. But really only do this as a last resort if the clean version doesn't perform well.


Alex

> This means - there is a theoretical period of time when one CPU is writing to the TLBs while other CPU’s are executing. Our belief is that this has to be handled by software anyway, and this should not be an issue from Qemu’s point of view.
> The alternative would be to force all other CPU’s to exit before writing the TLB’s - this is both expensive and very painful to organise (as we get into horrid deadlocks whichever way we turn)…
>
> We’d appreciate some thoughts on this...
>
> Cheers
>
> Mark.
>
>
>
> +44 (0)20 7100 3485 x 210
> +33 (0)5 33 52 01 77x 210
>
> +33 (0)603762104
> mark.burton
>
>

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42694) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvE8-0007Lj-DZ for qemu-devel@nongnu.org; Thu, 12 Feb 2015 09:58:49 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YLvE0-0004Z3-Qg for qemu-devel@nongnu.org; Thu, 12 Feb 2015 09:58:48 -0500
Received: from mail-lb0-f179.google.com ([209.85.217.179]:43053) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvE0-0004Yv-Jt for qemu-devel@nongnu.org; Thu, 12 Feb 2015 09:58:40 -0500
Received: by mail-lb0-f179.google.com with SMTP id w7so9980002lbi.10 for ; Thu, 12 Feb 2015 06:58:39 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de>
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de>
From: Peter Maydell
Date: Thu, 12 Feb 2015 14:58:19 +0000
Message-ID:
Content-Type: text/plain; charset=UTF-8
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Alexander Graf
Cc: mttcg@listserver.greensocs.com, Mark Burton , qemu-devel

On 12 February 2015 at 14:45, Alexander Graf wrote:
> almost nobody except x86 does global flushes

All ARM TLB maintenance operations have both "this CPU only"
and "all TLBs in the Inner Shareable domain" [that's ARM-speak
for "every CPU core in the cluster"] variants (the latter
being the TLB *IS operations). Looking at Linux's
arch/arm64/mm/tlb.S and arch/arm64/include/asm/tlbflush.h
most of the operations defined there use the IS variants.

-- PMM

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:43088) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvGg-0008Vc-Ns for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:01:31 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YLvGd-0005QI-Es for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:01:26 -0500
Received: from mail-lb0-f169.google.com ([209.85.217.169]:64649) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvGd-0005Q5-7U for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:01:23 -0500
Received: by mail-lb0-f169.google.com with SMTP id p9so10023257lbv.0 for ; Thu, 12 Feb 2015 07:01:22 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de>
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de>
From: Peter Maydell
Date: Thu, 12 Feb 2015 15:01:02 +0000
Message-ID:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Alexander Graf
Cc: mttcg@listserver.greensocs.com, Mark Burton , qemu-devel

On 12 February 2015 at 14:45, Alexander Graf wrote:
>
>> On 12.02.2015, at 15:35, Mark Burton wrote:
>> We are proposing to implement this by signalling all other CPU’s
>> to exit (and requesting they flush before re-starting). In other
>> words, this would happen asynchronously.
>
> For global flushes, give them a pointer payload along with the flush
> request and tell all cpus to increment it atomically. In your main
> thread, wait until *ptr == nKickedCpus.

I bet this will not be the only situation where you want to
do an "get all other CPUs to do $something and wait til they
have done so" kind of operation, so some lightweight but generic
infrastructure for doing that would not be a bad plan.
(Similarly "get all other CPUs to stop, then I can do $something
and let the others continue".)

-- PMM

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:44562) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvNX-0007dE-AM for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:08:32 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YLvNU-0007eQ-4A for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:08:31 -0500
Received: from greensocs.com ([193.104.36.180]:43120) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvNT-0007eF-Pu for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:08:28 -0500
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
From: Mark Burton
In-Reply-To:
Date: Thu, 12 Feb 2015 16:08:24 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <45883806-2296-486F-A0DC-D8A0A74F85B9@greensocs.com>
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de>
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Peter Maydell
Cc: mttcg@listserver.greensocs.com, Alexander Graf , qemu-devel

> On 12 Feb 2015, at 16:01, Peter Maydell wrote:
>
> On 12 February 2015 at 14:45, Alexander Graf wrote:
>>
>>> On 12.02.2015, at 15:35, Mark Burton wrote:
>>> We are proposing to implement this by signalling all other CPU’s
>>> to exit (and requesting they flush before re-starting). In other
>>> words, this would happen asynchronously.
>>
>> For global flushes, give them a pointer payload along with the flush
>> request and tell all cpus to increment it atomically. In your main
>> thread, wait until *ptr == nKickedCpus.
>
> I bet this will not be the only situation where you want to
> do an "get all other CPUs to do $something and wait til they
> have done so" kind of operation, so some lightweight but generic
> infrastructure for doing that would not be a bad plan. (Similarly
> "get all other CPUs to stop, then I can do $something and let
> the others continue”.)

We tried this - we ended up in knots.
We had 2 CPU’s trying to flush at about the same time, both waiting for the other.
We had CPU’s trying to get the global mutex to finish what they were doing, while being told to flush,
We had CPU’s in the global mutex trying to do something that would cause a flush… etc....
We had spaghetti with extra Bolognese sauce…

We eventually concluded, yes - in an infinite universe everything is possible, but if we could simply do this ‘asynchronously’ then our lives would be a LOT easier.
e.g. - ask all CPU’s to “exit and do something” is easy - wait for them to do that is a whole other problem…

Our question is - do we need this ‘sync’ (before the flush), or can we actually allow CPU’s to flush themselves asynchronously….

Cheers

Mark.

>
> -- PMM

+44 (0)20 7100 3485 x 210
+33 (0)5 33 52 01 77x 210

+33 (0)603762104
mark.burton

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:44975) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvQP-0001n0-12 for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:11:30 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YLvQL-0008Kl-OZ for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:11:28 -0500
Received: from greensocs.com ([193.104.36.180]:54852) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvQL-0008Ke-Fb for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:11:25 -0500
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
From: Mark Burton
In-Reply-To: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de>
Date: Thu, 12 Feb 2015 16:11:23 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id:
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de>
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Alexander Graf
Cc: mttcg@listserver.greensocs.com, Peter Maydell , qemu-devel

OK - Alex - your implication is that it has to be atomic, we need the sync… :-(

I have a horrid feeling that the atomicity of global flush can’t be causing the (almost, but not quite reproducible) errors we’re seeing - but… anyway ;-)

Cheers
Mark.

> On 12 Feb 2015, at 15:45, Alexander Graf wrote:
>
>
>> On 12.02.2015, at 15:35, Mark Burton wrote:
>>
>>
>> TLB Flush:
>>
>> We have spent a few days on this issue, and still haven’t resolved the best path.
>>
>> Our solution seems to work, most of the time, but we still have some strange issues - so I want to check that what we are proposing has a chance of working.
>>
>>
>> Our plan is to allow all CPU’s to continue.
>> Potentially one CPU will want to write to the TLBs. Subsequent to the write, it requests a TLB Flush.
>
> Local or global? For local TLB flushes you don't notify the other CPUs at all. For global ones, the semantics of the call usually dictate atomicity.
>
>> We are proposing to implement this by signalling all other CPU’s to exit (and requesting they flush before re-starting). In other words, this would happen asynchronously.
>
> For global flushes, give them a pointer payload along with the flush request and tell all cpus to increment it atomically. In your main thread, wait until *ptr == nKickedCpus.
>
> FWIW TLBs are always CPU local. When there's a "global TLB flush" instruction, it pretty much does stall the CPU, notifies the others to also flush their TLBs, waits and then continues.
>
> If this really does become a performance bottleneck (which I doubt it does, almost nobody except x86 does global flushes), you can also do some nasty hacky tricks, such as (atomically) change the valid bit in remote CPUs TLB entries. But really only do this as a last resort if the clean version doesn't perform well.
>
>
> Alex
>
>> This means - there is a theoretical period of time when one CPU is writing to the TLBs while other CPU’s are executing. Our belief is that this has to be handled by software anyway, and this should not be an issue from Qemu’s point of view.
>> The alternative would be to force all other CPU’s to exit before writing the TLB’s - this is both expensive and very painful to organise (as we get into horrid deadlocks whichever way we turn)…
>>
>> We’d appreciate some thoughts on this...
>>
>> Cheers
>>
>> Mark.
>>
>>
>>
>> +44 (0)20 7100 3485 x 210
>> +33 (0)5 33 52 01 77x 210
>>
>> +33 (0)603762104
>> mark.burton
>>
>>
>

+44 (0)20 7100 3485 x 210
+33 (0)5 33 52 01 77x 210

+33 (0)603762104
mark.burton

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:47010) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvYX-0008U7-OX for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:19:59 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YLvYU-0002sA-9d for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:19:53 -0500
Received: from cantor2.suse.de ([195.135.220.15]:40722 helo=mx2.suse.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvYU-0002rk-49 for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:19:50 -0500
Message-ID: <54DCC494.2010400@suse.de>
Date: Thu, 12 Feb 2015 16:19:48 +0100
From: Alexander Graf
MIME-Version: 1.0
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <45883806-2296-486F-A0DC-D8A0A74F85B9@greensocs.com>
In-Reply-To: <45883806-2296-486F-A0DC-D8A0A74F85B9@greensocs.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Mark Burton , Peter Maydell
Cc: mttcg@listserver.greensocs.com, qemu-devel

On 12.02.15 16:08, Mark Burton wrote:
>
>> On 12 Feb 2015, at 16:01, Peter Maydell wrote:
>>
>> On 12 February 2015 at 14:45, Alexander Graf wrote:
>>>
>>>> On 12.02.2015, at 15:35, Mark Burton wrote:
>>>> We are proposing to implement this by signalling all other CPU’s
>>>> to exit (and requesting they flush before re-starting). In other
>>>> words, this would happen asynchronously.
>>>
>>> For global flushes, give them a pointer payload along with the flush
>>> request and tell all cpus to increment it atomically. In your main
>>> thread, wait until *ptr == nKickedCpus.
>>
>> I bet this will not be the only situation where you want to
>> do an "get all other CPUs to do $something and wait til they
>> have done so" kind of operation, so some lightweight but generic
>> infrastructure for doing that would not be a bad plan. (Similarly
>> "get all other CPUs to stop, then I can do $something and let
>> the others continue”.)
>
> We tried this - we ended up in knots.
> We had 2 CPU’s trying to flush at about the same time, both waiting for the other.
> We had CPU’s trying to get the global mutex to finish what they were doing, while being told to flush,
> We had CPU’s in the global mutex trying to do something that would cause a flush… etc....
> We had spaghetti with extra Bolognese sauce…
>
> We eventually concluded, yes - in an infinite universe everything is possible, but if we could simply do this ‘asynchronously’ then our lives would be a LOT easier.
> e.g. - ask all CPU’s to “exit and do something” is easy - wait for them to do that is a whole other problem…
>
> Our question is - do we need this ‘sync’ (before the flush), or can we actually allow CPU’s to flush themselves asynchronously….

The respective target architecture specs will tell you. And I very much
doubt that it is ok in most cases.

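
For illustration only, the synchronous variant discussed in this thread (each kicked CPU flushes its own TLB, then atomically increments a shared counter, and the requester waits until the counter equals the number of kicked CPUs) might be sketched in C11 roughly as below. `VCpu`, `FlushRequest` and all function names are hypothetical and are not QEMU APIs; the `tlb_valid` flag merely stands in for a real software TLB.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-vCPU state; NOT QEMU's CPUState. */
typedef struct {
    bool tlb_valid;      /* stand-in for the vCPU's software TLB */
} VCpu;

/* Payload handed to every kicked vCPU along with the flush request. */
typedef struct {
    atomic_int acked;    /* vCPUs that have completed their flush */
    int n_kicked;        /* vCPUs that were asked to flush */
} FlushRequest;

void flush_request_init(FlushRequest *req, int n_kicked)
{
    assert(n_kicked >= 0);
    atomic_init(&req->acked, 0);
    req->n_kicked = n_kicked;
}

/* Run by each kicked vCPU once it has left its execution loop. */
void vcpu_handle_flush(VCpu *cpu, FlushRequest *req)
{
    cpu->tlb_valid = false;  /* flush the local TLB first ... */
    /* ... then publish completion; release pairs with the acquire below. */
    atomic_fetch_add_explicit(&req->acked, 1, memory_order_release);
}

/* Polled by the requester: true once *ptr == nKickedCpus. */
bool flush_request_complete(FlushRequest *req)
{
    return atomic_load_explicit(&req->acked, memory_order_acquire)
           == req->n_kicked;
}
```

A requester would kick the other vCPUs and then wait until `flush_request_complete()` returns true. While waiting it would presumably still have to service flush requests aimed at itself, otherwise two CPUs flushing at about the same time can end up waiting for each other, as described in the quoted message.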
Alex

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49774) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvjX-0007OB-VH for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:31:19 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YLvjT-0000di-UG for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:31:15 -0500
Received: from mx1.redhat.com ([209.132.183.28]:47270) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvjT-0000cE-Ms for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:31:11 -0500
Date: Thu, 12 Feb 2015 15:31:02 +0000
From: "Dr. David Alan Gilbert"
Message-ID: <20150212153102.GB15127@work-vm>
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <45883806-2296-486F-A0DC-D8A0A74F85B9@greensocs.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <45883806-2296-486F-A0DC-D8A0A74F85B9@greensocs.com>
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Mark Burton
Cc: mttcg@greensocs.com, Peter Maydell , Alexander Graf , qemu-devel

* Mark Burton (mark.burton@greensocs.com) wrote:
>
> > On 12 Feb 2015, at 16:01, Peter Maydell wrote:
> >
> > On 12 February 2015 at 14:45, Alexander Graf wrote:
> >>
> >>> On 12.02.2015, at 15:35, Mark Burton wrote:
> >>> We are proposing to implement this by signalling all other CPU’s
> >>> to exit (and requesting they flush before re-starting). In other
> >>> words, this would happen asynchronously.
> >>
> >> For global flushes, give them a pointer payload along with the flush
> >> request and tell all cpus to increment it atomically. In your main
> >> thread, wait until *ptr == nKickedCpus.
> >
> > I bet this will not be the only situation where you want to
> > do an "get all other CPUs to do $something and wait til they
> > have done so" kind of operation, so some lightweight but generic
> > infrastructure for doing that would not be a bad plan. (Similarly
> > "get all other CPUs to stop, then I can do $something and let
> > the others continue”.)
>
> We tried this - we ended up in knots.
> We had 2 CPU’s trying to flush at about the same time, both waiting for the other.
> We had CPU’s trying to get the global mutex to finish what they were doing, while being told to flush,
> We had CPU’s in the global mutex trying to do something that would cause a flush… etc....
> We had spaghetti with extra Bolognese sauce…

This is the hard problem of multithreaded emulation.
You've always got to let CPUs get back to a point where you can
invalidate a mapping/page quickly.

Thus you've also got to be very careful about where any CPU might
get into a loop or take another lock that would stop another CPU
causing an invalidate. Either that or you need a way of somehow
breaking locks or recovering from the situation.

> We eventually concluded, yes - in an infinite universe everything is possible, but if we could simply do this ‘asynchronously’ then our lives would be a LOT easier.
> e.g. - ask all CPU’s to “exit and do something” is easy - wait for them to do that is a whole other problem…

Which is why you've got to bound how long it might take
those CPUs to get back to you, and optimise out cases where
it's not really needed later.

> Our question is - do we need this ‘sync’ (before the flush), or can we actually allow CPU’s to flush themselves asynchronously….

Always assume the worst.

Dave

>
> Cheers
>
> Mark.
>
>
>
> >
> > -- PMM
>
>
> +44 (0)20 7100 3485 x 210
> +33 (0)5 33 52 01 77x 210
>
> +33 (0)603762104
> mark.burton
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51438) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvqp-0002k3-8W for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:38:48 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YLvqj-0004xH-MS for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:38:46 -0500
Received: from cantor2.suse.de ([195.135.220.15]:42077 helo=mx2.suse.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLvqj-0004x2-FX for qemu-devel@nongnu.org; Thu, 12 Feb 2015 10:38:41 -0500
Message-ID: <54DCC8FF.7000609@suse.de>
Date: Thu, 12 Feb 2015 16:38:39 +0100
From: Alexander Graf
MIME-Version: 1.0
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de>
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Peter Maydell
Cc: mttcg@listserver.greensocs.com, Mark Burton , qemu-devel

On 12.02.15 15:58, Peter Maydell wrote:
> On 12 February 2015 at 14:45, Alexander Graf wrote:
>> almost nobody except x86 does global flushes
>
> All ARM TLB maintenance operations have both "this CPU only"
> and "all TLBs in the Inner Shareable domain" [that's ARM-speak
> for "every CPU core in the cluster"] variants (the latter
> being the TLB *IS operations). Looking at Linux's
> arch/arm64/mm/tlb.S and arch/arm64/include/asm/tlbflush.h
> most of the operations defined there use the IS variants.

Wow, did anyone benchmark this? I know that PPC switched away from
global flushes and instead tracks the CPUs a task was running on to
limit the scope of CPUs that need to flush.

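
As a rough illustration of the idea mentioned above (remember which CPUs an address space has run on, and only ask those to flush), here is a minimal C sketch. The `AddrSpace` type, the 64-CPU limit and both function names are assumptions made up for the example; it does not reflect how Linux on PPC or QEMU actually implement this, and a real version would need atomic updates to the mask.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 64

/* Hypothetical address-space descriptor. */
typedef struct {
    uint64_t cpus_seen;  /* bit i set: CPU i may hold TLB entries for us */
} AddrSpace;

/* Record that this address space was scheduled on the given CPU. */
void addr_space_ran_on(AddrSpace *as, unsigned cpu)
{
    assert(cpu < MAX_CPUS);
    as->cpus_seen |= UINT64_C(1) << cpu;
}

/*
 * Collect the remote CPUs that need a flush request, skipping the
 * requester itself (it flushes locally). Returns how many ids were
 * written to out[].
 */
unsigned cpus_to_flush(const AddrSpace *as, unsigned self,
                       unsigned out[MAX_CPUS])
{
    unsigned n = 0;
    for (unsigned cpu = 0; cpu < MAX_CPUS; cpu++) {
        if (cpu != self && ((as->cpus_seen >> cpu) & 1)) {
            out[n++] = cpu;
        }
    }
    return n;
}
```

Only the CPUs returned by `cpus_to_flush()` would then be signalled, which narrows the set a requester has to wait for but does not remove the need to wait.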
Alex From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56280) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLwE9-0007j2-Ee for qemu-devel@nongnu.org; Thu, 12 Feb 2015 11:02:54 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YLwE4-0008CD-Ax for qemu-devel@nongnu.org; Thu, 12 Feb 2015 11:02:53 -0500 Received: from greensocs.com ([193.104.36.180]:19674) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLwE4-0008C0-4D for qemu-devel@nongnu.org; Thu, 12 Feb 2015 11:02:48 -0500 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) From: Mark Burton In-Reply-To: <54DCC8FF.7000609@suse.de> Date: Thu, 12 Feb 2015 17:02:44 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <54DCC8FF.7000609@suse.de> Subject: Re: [Qemu-devel] Help on TLB Flush List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Alexander Graf Cc: mttcg@listserver.greensocs.com, Peter Maydell , qemu-devel > On 12 Feb 2015, at 16:38, Alexander Graf wrote: >=20 >=20 >=20 > On 12.02.15 15:58, Peter Maydell wrote: >> On 12 February 2015 at 14:45, Alexander Graf wrote: >>> almost nobody except x86 does global flushes >>=20 >> All ARM TLB maintenance operations have both "this CPU only" >> and "all TLBs in the Inner Shareable domain" [that's ARM-speak >> for "every CPU core in the cluster"] variants (the latter >> being the TLB *IS operations). Looking at Linux's >> arch/arm64/mm/tlb.S and arch/arm64/include/asm/tlbflush.h >> most of the operations defined there use the IS variants. >=20 > Wow, did anyone benchmark this? I know that PPC switched away from > global flushes and instead tracks the CPUs a task was running on to > limit the scope of CPUs that need to flush. 
Doesn=E2=80=99t that mean you have to signal a specific CPU to cause it = to flush itself=E2=80=A6. Isn=E2=80=99t that in itself expensive? Do you = have to organise some sort of atomicity yourself around that too? Cheers Mark. >=20 >=20 > Alex +44 (0)20 7100 3485 x 210 +33 (0)5 33 52 01 77x 210 +33 (0)603762104 mark.burton From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:46775) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLykZ-0007v9-Pp for qemu-devel@nongnu.org; Thu, 12 Feb 2015 13:44:36 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YLykU-0002WF-P7 for qemu-devel@nongnu.org; Thu, 12 Feb 2015 13:44:31 -0500 Received: from greensocs.com ([193.104.36.180]:41252) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YLykU-0002Vt-FB for qemu-devel@nongnu.org; Thu, 12 Feb 2015 13:44:26 -0500 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) From: Mark Burton In-Reply-To: <20150212153102.GB15127@work-vm> Date: Thu, 12 Feb 2015 19:44:16 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <45883806-2296-486F-A0DC-D8A0A74F85B9@greensocs.com> <20150212153102.GB15127@work-vm> Subject: Re: [Qemu-devel] Help on TLB Flush List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Dr. David Alan Gilbert" Cc: mttcg@greensocs.com, Peter Maydell , Alexander Graf , qemu-devel > On 12 Feb 2015, at 16:31, Dr. David Alan Gilbert = wrote: >=20 > * Mark Burton (mark.burton@greensocs.com) wrote: >>=20 >>> On 12 Feb 2015, at 16:01, Peter Maydell = wrote: >>>=20 >>> On 12 February 2015 at 14:45, Alexander Graf wrote: >>>>=20 >>>>> On 12.02.2015, at 15:35, Mark Burton = wrote: >>>>> We are proposing to implement this by signalling all other CPU???s >>>>> to exit (and requesting they flush before re-starting). 
In other >>>>> words, this would happen asynchronously. >>>>=20 >>>> For global flushes, give them a pointer payload along with the = flush >>>> request and tell all cpus to increment it atomically. In your main >>>> thread, wait until *ptr =3D=3D nKickedCpus. >>>=20 >>> I bet this will not be the only situation where you want to >>> do an "get all other CPUs to do $something and wait til they >>> have done so" kind of operation, so some lightweight but generic >>> infrastructure for doing that would not be a bad plan. (Similarly >>> "get all other CPUs to stop, then I can do $something and let >>> the others continue???.) >>=20 >> We tried this - we ended up in knots. >> We had 2 CPU???s trying to flush at about the same time, both waiting = for the other. >> We had CPU???s trying to get the global mutex to finish what they = were doing, while being told to flush,=20 >> We had CPU???s in the global mutex trying to do something that would = cause a flush??? etc.... >> We had spaghetti with extra Bolognese sauce??? >=20 > This is the hard problem of multithreaded emulation. > You've always got to let CPUs get back to a point where you can > invalidate a mapping/page quickly. >=20 > Thus you've also got to be very careful about where any CPU might > get into a loop or take another lock that would stop another CPU > causing an invalidate. Either that or you need a way of somehow > breaking locks or recovering from the situation. Indeed -=20 for now - we=E2=80=99re building something which will likely be less = than ideal. Once we have some sort of evidence that it works, and = (hopefully) more reliably than the approach we have right now, then we = come up with a more elegant scheme. >=20 >> We eventually concluded, yes - in an infinite universe everything is = possible, but if we could simply do this ???asynchronously??? then our = lives would be a LOT easier. >> e.g. - ask all CPU???s to ???exit and do something??? 
is easy - wait for them to do that is a whole other problem…
>
> Which is why you've got to bound how long it might take
> those CPUs to get back to you, and optimise out cases where
> it's not really needed later.
>
>> Our question is - do we need this ‘sync’ (before the flush), or can we actually allow CPU’s to flush themselves asynchronously….
>
> Always assume the worst.

:-)

Cheers
Mark.

>
> Dave
>
>>
>> Cheers
>>
>> Mark.
>>
>>
>>
>>>
>>> -- PMM
>>
>>
>> +44 (0)20 7100 3485 x 210
>> +33 (0)5 33 52 01 77x 210
>>
>> +33 (0)603762104
>> mark.burton
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

+44 (0)20 7100 3485 x 210
+33 (0)5 33 52 01 77x 210

+33 (0)603762104
mark.burton

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:36271) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YM1lZ-0003eA-SS for qemu-devel@nongnu.org; Thu, 12 Feb 2015 16:57:47 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YM1lW-00081x-LF for qemu-devel@nongnu.org; Thu, 12 Feb 2015 16:57:45 -0500
Received: from mail-la0-f42.google.com ([209.85.215.42]:38687) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YM1lW-00081Y-Eg for qemu-devel@nongnu.org; Thu, 12 Feb 2015 16:57:42 -0500
Received: by lamq1 with SMTP id q1so12812755lam.5 for ; Thu, 12 Feb 2015 13:57:41 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <54DCC494.2010400@suse.de>
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <45883806-2296-486F-A0DC-D8A0A74F85B9@greensocs.com> <54DCC494.2010400@suse.de>
From: Peter Maydell
Date: Thu, 12 Feb 2015 21:57:21 +0000
Message-ID:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
To: Alexander Graf
Cc: mttcg@listserver.greensocs.com, Mark
Burton , qemu-devel

On 12 February 2015 at 15:19, Alexander Graf wrote:
> On 12.02.15 16:08, Mark Burton wrote:
>> Our question is - do we need this ‘sync’ (before the flush),
>> or can we actually allow CPU’s to flush themselves asynchronously….
>
> The respective target architecture specs will tell you. And I very much
> doubt that it is ok in most cases.

For ARM note that TLB maintenance operations do not have to complete synchronously. They can be reordered relative to other TLB maintenance ops or to loads or stores (by this CPU or by other CPUs if this is a global invalidate). The only requirement is that if the CPU that did the TLB maintenance op executes a DMB (barrier) then the TLB op must finish before the barrier completes execution. So you could split the "kick off TLB invalidate" and "make sure all CPUs are done" phases if you wanted. [cf v8 ARM ARM rev A.e section D4.7.2 and in particular the subsection on "ordering and completion".]

This only applies to ARM guests, of course. ("Other CPU architectures are available."
:-))

-- PMM

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37951) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YM1qY-0006as-Gu for qemu-devel@nongnu.org; Thu, 12 Feb 2015 17:02:55 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YM1qV-0001s3-Bc for qemu-devel@nongnu.org; Thu, 12 Feb 2015 17:02:54 -0500
Received: from mail-lb0-f175.google.com ([209.85.217.175]:41976) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YM1qV-0001rk-4Q for qemu-devel@nongnu.org; Thu, 12 Feb 2015 17:02:51 -0500
Received: by mail-lb0-f175.google.com with SMTP id n10so12089435lbv.6 for ; Thu, 12 Feb 2015 14:02:50 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <54DCC8FF.7000609@suse.de>
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <54DCC8FF.7000609@suse.de>
From: Peter Maydell
Date: Thu, 12 Feb 2015 22:02:29 +0000
Message-ID:
Content-Type: text/plain; charset=UTF-8
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
To: Alexander Graf
Cc: mttcg@listserver.greensocs.com, Mark Burton , qemu-devel

On 12 February 2015 at 15:38, Alexander Graf wrote:
> On 12.02.15 15:58, Peter Maydell wrote:
>> All ARM TLB maintenance operations have both "this CPU only"
>> and "all TLBs in the Inner Shareable domain" [that's ARM-speak
>> for "every CPU core in the cluster"] variants (the latter
>> being the TLB *IS operations). Looking at Linux's
>> arch/arm64/mm/tlb.S and arch/arm64/include/asm/tlbflush.h
>> most of the operations defined there use the IS variants.
>
> Wow, did anyone benchmark this? I know that PPC switched away from
> global flushes and instead tracks the CPUs a task was running on to
> limit the scope of CPUs that need to flush.

That would be a valid implementation.
The CPU has to behave as the spec says it must, but there's no reason you couldn't implement "flush by ASID for all TLBs" via some implementation specific tracking of ASID use per CPU to limit which cores you sent the flush request to, if you thought that was a better way to do it.

-- PMM

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:41002) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YM1xq-0008Dj-Hf for qemu-devel@nongnu.org; Thu, 12 Feb 2015 17:10:31 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YM1xn-0005uv-0K for qemu-devel@nongnu.org; Thu, 12 Feb 2015 17:10:26 -0500
Received: from roura.ac.upc.es ([147.83.33.10]:52971) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YM1xm-0005ue-LS for qemu-devel@nongnu.org; Thu, 12 Feb 2015 17:10:22 -0500
From: Lluís Vilanova
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <54DCC8FF.7000609@suse.de>
Date: Thu, 12 Feb 2015 23:10:18 +0100
In-Reply-To: (Mark Burton's message of "Thu, 12 Feb 2015 17:02:44 +0100")
Message-ID: <87oaoy233p.fsf@fimbulvetr.bsc.es>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
To: Mark Burton
Cc: mttcg@greensocs.com, Peter Maydell , Alexander Graf , qemu-devel

Mark Burton writes:
>> On 12 Feb 2015, at 16:38, Alexander Graf wrote:
>>
>>
>>
>> On 12.02.15 15:58, Peter Maydell wrote:
>>> On 12 February 2015 at 14:45, Alexander Graf wrote:
>>>> almost nobody except x86 does global flushes
>>>
>>> All ARM TLB maintenance operations have both "this CPU only"
>>> and "all TLBs in the Inner Shareable domain" [that's ARM-speak
>>> for "every CPU core in the cluster"] variants (the latter
>>> being the TLB *IS operations).
Looking at Linux's
>>> arch/arm64/mm/tlb.S and arch/arm64/include/asm/tlbflush.h
>>> most of the operations defined there use the IS variants.
>>
>> Wow, did anyone benchmark this? I know that PPC switched away from
>> global flushes and instead tracks the CPUs a task was running on to
>> limit the scope of CPUs that need to flush.
> Doesn’t that mean you have to signal a specific CPU to cause it to flush itself…. Isn’t that in itself expensive? Do you have to organise some sort of atomicity yourself around that too?

Yup. AFAIR, Linux in x86-64 queues a request to a per-CPU request list, and uses IPIs to signal these types of operations to the target CPU:

http://lxr.free-electrons.com/source/kernel/smp.c?v=2.6.32#L386

Waiting for completion is implemented on top by incrementing some counter from each CPU, and waiting for it to have the correct final value.

If something were implemented on these lines, it could be used as a generic cross-CPU event messaging infrastructure (plus some interrupt bit in the CPU structure that TCG would check to break away from guest code; I believe something similar is already being used - icount? -).

PS: To be honest, I still don't know which TLBs we're talking about here, and which cases trigger these TLB flush operations.

Cheers,
Lluis

--
"And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer."
-- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:58916) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMAU1-0008Pl-5q for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:16:17 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YMATw-0006CW-Gy for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:16:13 -0500
Received: from greensocs.com ([193.104.36.180]:49229) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMATw-0006CR-4r for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:16:08 -0500
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
From: Mark Burton
In-Reply-To: <87oaoy233p.fsf@fimbulvetr.bsc.es>
Date: Fri, 13 Feb 2015 08:16:04 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id:
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <54DCC8FF.7000609@suse.de> <87oaoy233p.fsf@fimbulvetr.bsc.es>
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
To: Lluís Vilanova
Cc: mttcg@greensocs.com, Peter Maydell , Alexander Graf , qemu-devel

Up top - thanks Peter, I think you may have given us an idea!

> On 12 Feb 2015, at 23:10, Lluís Vilanova wrote:
>
> Mark Burton writes:
>
>>> On 12 Feb 2015, at 16:38, Alexander Graf wrote:
>>>
>>>
>>>
>>> On 12.02.15 15:58, Peter Maydell wrote:
>>>> On 12 February 2015 at 14:45, Alexander Graf wrote:
>>>>> almost nobody except x86 does global flushes
>>>>
>>>> All ARM TLB maintenance operations have both "this CPU only"
>>>> and "all TLBs in the Inner Shareable domain" [that's ARM-speak
>>>> for "every CPU core in the cluster"] variants (the latter
>>>> being the TLB *IS operations).
Looking at Linux's
>>>> arch/arm64/mm/tlb.S and arch/arm64/include/asm/tlbflush.h
>>>> most of the operations defined there use the IS variants.
>>>
>>> Wow, did anyone benchmark this? I know that PPC switched away from
>>> global flushes and instead tracks the CPUs a task was running on to
>>> limit the scope of CPUs that need to flush.
>
>> Doesn’t that mean you have to signal a specific CPU to cause it to flush itself…. Isn’t that in itself expensive? Do you have to organise some sort of atomicity yourself around that too?
>
> Yup. AFAIR, Linux in x86-64 queues a request to a per-CPU request list, and uses
> IPIs to signal these types of operations to the target CPU:
>
> http://lxr.free-electrons.com/source/kernel/smp.c?v=2.6.32#L386
>
> Waiting for completion is implemented on top by incrementing some counter from
> each CPU, and waiting for it to have the correct final value.

If the kernel is doing this - then effectively - for X86, each CPU only flushes its own TLB (from the perspective of Qemu) - correct?
(in which case, for Qemu itself - for x86) - we don’t need to implement a global flush, and hence we don’t need to build the mechanism to sync ?

If I understand correctly then - the processor that causes some pain is the ARM that has (and uses) global flush, but the mitigating factor is that those flushes can be asynchronous so long as they complete before a memory barrier….

Cheers
Mark.

>
> If something were implemented on these lines, it could be used as a generic
> cross-CPU event messaging infrastructure (plus some interrupt bit in the CPU
> structure that TCG would check to break away from guest code; I believe
> something similar is already being used - icount? -).
>
> PS: To be honest, I still don't know which TLBs we're talking about here, and
> which cases trigger these TLB flush operations.
>
>
> Cheers,
> Lluis
>
> --
> "And it's much the same thing with knowledge, for whenever you learn
> something new, the whole world becomes that much richer."
> -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
> Tollbooth

+44 (0)20 7100 3485 x 210
+33 (0)5 33 52 01 77x 210

+33 (0)603762104
mark.burton

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60193) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMAct-0002Zf-F9 for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:25:28 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YMAco-0000WP-OE for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:25:23 -0500
Received: from mail-lb0-f181.google.com ([209.85.217.181]:57905) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMAco-0000VY-9r for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:25:18 -0500
Received: by mail-lb0-f181.google.com with SMTP id b6so13894639lbj.12 for ; Thu, 12 Feb 2015 23:25:17 -0800 (PST)
MIME-Version: 1.0
In-Reply-To:
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <54DCC8FF.7000609@suse.de> <87oaoy233p.fsf@fimbulvetr.bsc.es>
From: Peter Maydell
Date: Fri, 13 Feb 2015 07:24:57 +0000
Message-ID:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
To: Mark Burton
Cc: qemu-devel , mttcg@greensocs.com, Lluís Vilanova , Alexander Graf

On 13 February 2015 at 07:16, Mark Burton wrote:
> If the kernel is doing this - then effectively - for X86, each CPU only
> flushes its own TLB (from the perspective of Qemu) - correct?
> (in which case, for Qemu itself - for x86) - we don’t need to implement
> a global flush, and hence we don’t need to build the mechanism to sync ?
The semantics you need are "flush the QEMU TLB for CPU X" (where X may not be the CPU you're running on). This is what tlb_flush() does: it takes a CPU argument to act on. (Ditto tlb_flush_page, etc.) We then use that to implement the target's required semantics (eg in ARM the tlbiall_is_write() function is handled by iterating through all CPUs and calling tlb_flush on them).

If you don't want the pain of checking the semantics of every backend and figuring out a new set of primitives to implement, then what you need to do is continue to provide the guarantees the current tlb_flush function does: when it returns then the CPU it's supposed to have acted on has definitely done so.

You can try and be cleverer if you want to, but personally I would recommend keeping the scope of your work simple where you can.

-- PMM

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:33877) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMAok-0007DD-0F for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:37:39 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YMAof-0004pV-AC for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:37:37 -0500
Received: from greensocs.com ([193.104.36.180]:10485) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMAoe-0004pK-To for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:37:33 -0500
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
From: Mark Burton
In-Reply-To:
Date: Fri, 13 Feb 2015 08:37:28 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id:
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <54DCC8FF.7000609@suse.de> <87oaoy233p.fsf@fimbulvetr.bsc.es>
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
To: Peter Maydell
Cc: qemu-devel , mttcg@greensocs.com, Lluís Vilanova , Alexander Graf

> On 13 Feb 2015,
at 08:24, Peter Maydell wrote:
>
> On 13 February 2015 at 07:16, Mark Burton wrote:
>> If the kernel is doing this - then effectively - for X86, each CPU only
>> flushes its own TLB (from the perspective of Qemu) - correct?
>> (in which case, for Qemu itself - for x86) - we don’t need to implement
>> a global flush, and hence we don’t need to build the mechanism to sync ?
> The semantics you need are "flush the QEMU TLB for CPU X" (where
> X may not be the CPU you're running on). This is what tlb_flush()
> does: it takes a CPU argument to act on. (Ditto tlb_flush_page, etc.)
> We then use that to implement the target's required semantics
> (eg in ARM the tlbiall_is_write() function is handled by iterating
> through all CPUs and calling tlb_flush on them).

What Lluis implied seemed to be that the kernel arranged to signal the CPU that would flush. Hence, (for X86), we would only ever flush our own TLB.

>
> If you don't want the pain of checking the semantics of every
> backend and figuring out a new set of primitives to implement,
> then what you need to do is continue to provide the guarantees
> the current tlb_flush function does: when it returns then the
> CPU it's supposed to have acted on has definitely done so.
>
> You can try and be cleverer if you want to, but personally
> I would recommend keeping the scope of your work simple
> where you can.

Yes - though keeping it simple (silly) seems to have some complexities in this case, which is why we are trying to reduce the guarantees that tlb_flush() provides.

At present - the ‘foreach cpu, tlb_flush()’ is effectively atomic, as no other CPU will be executing at the same time.
Adding multi-thread, we can already say - this ‘atomicity’ isn’t strictly required. As you say, the only thing tlb_flush needs to guarantee is that the CPU concerned has flushed.
- that already helps.
And I agree with you that this is the right place to take tlb_flush().

Of course, when only the current CPU is flushed things are much simpler (and already handled)...

For our immediate concern, in the interests of getting the thing working and making sure we’ve turned over all the stones, on ARM - it MAY help us to check that the flush has happened ‘in the next memory barrier’….
- I don’t know if that will help us or not, and - even if it does, I agree with you, it would be more messy than it need be.
However, in the interests of making sure that there are no other issues - we may ‘hack’ something before we put in place a more elegant solution….
(right now, we have some mutex issues, shifting the sync to the barrier MAY help us avoid that…. To Be Seen…. and anyway - it would only be a temporary fix).

Cheers
Mark.

>
> -- PMM

+44 (0)20 7100 3485 x 210
+33 (0)5 33 52 01 77x 210

+33 (0)603762104
mark.burton

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51820) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMCeM-0002WA-ED for qemu-devel@nongnu.org; Fri, 13 Feb 2015 04:35:03 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YMCeI-0007Qr-CX for qemu-devel@nongnu.org; Fri, 13 Feb 2015 04:35:02 -0500
Received: from mx1.redhat.com ([209.132.183.28]:45457) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMCeI-0007Qn-55 for qemu-devel@nongnu.org; Fri, 13 Feb 2015 04:34:58 -0500
Message-ID: <54DDC537.5070003@redhat.com>
Date: Fri, 13 Feb 2015 10:34:47 +0100
From: Paolo Bonzini
MIME-Version: 1.0
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <45883806-2296-486F-A0DC-D8A0A74F85B9@greensocs.com> <54DCC494.2010400@suse.de>
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id:
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
To: Peter Maydell , Alexander Graf
Cc: mttcg@greensocs.com, Mark Burton , qemu-devel

On 12/02/2015 22:57, Peter Maydell wrote:
> The only
> requirement is that if the CPU that did the TLB maintenance
> op executes a DMB (barrier) then the TLB op must finish
> before the barrier completes execution. So you could split
> the "kick off TLB invalidate" and "make sure all CPUs
> are done" phases if you wanted. [cf v8 ARM ARM rev A.e
> section D4.7.2 and in particular the subsection on
> "ordering and completion".]

You can just make DMB start a new translation block. Then when the TLB flush helpers call cpu_exit() or cpu_interrupt() the flush request is serviced.

Paolo

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52090) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMCh5-0004Eb-BD for qemu-devel@nongnu.org; Fri, 13 Feb 2015 04:37:52 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YMCh2-0000CF-3Y for qemu-devel@nongnu.org; Fri, 13 Feb 2015 04:37:51 -0500
Received: from greensocs.com ([193.104.36.180]:19567) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMCh1-0000C5-Q6 for qemu-devel@nongnu.org; Fri, 13 Feb 2015 04:37:48 -0500
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
From: Mark Burton
In-Reply-To: <54DDC537.5070003@redhat.com>
Date: Fri, 13 Feb 2015 10:37:43 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id:
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <45883806-2296-486F-A0DC-D8A0A74F85B9@greensocs.com> <54DCC494.2010400@suse.de> <54DDC537.5070003@redhat.com>
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
To: Paolo Bonzini
Cc: mttcg@greensocs.com, Peter Maydell , Alexander Graf , qemu-devel

the memory barrier is on the cpu
requesting the flush isn’t it (not on the CPU that is being flushed)?

Cheers
Mark.

> On 13 Feb 2015, at 10:34, Paolo Bonzini wrote:
>
>
>
> On 12/02/2015 22:57, Peter Maydell wrote:
>> The only
>> requirement is that if the CPU that did the TLB maintenance
>> op executes a DMB (barrier) then the TLB op must finish
>> before the barrier completes execution. So you could split
>> the "kick off TLB invalidate" and "make sure all CPUs
>> are done" phases if you wanted. [cf v8 ARM ARM rev A.e
>> section D4.7.2 and in particular the subsection on
>> "ordering and completion".]
>
> You can just make DMB start a new translation block. Then when the TLB
> flush helpers call cpu_exit() or cpu_interrupt() the flush request is
> serviced.
>
> Paolo

+44 (0)20 7100 3485 x 210
+33 (0)5 33 52 01 77x 210

+33 (0)603762104
mark.burton

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53859) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMCsA-0008OR-U7 for qemu-devel@nongnu.org; Fri, 13 Feb 2015 04:49:19 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YMCs7-0003JB-OW for qemu-devel@nongnu.org; Fri, 13 Feb 2015 04:49:18 -0500
Received: from mx1.redhat.com ([209.132.183.28]:40907) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMCs7-0003Ix-He for qemu-devel@nongnu.org; Fri, 13 Feb 2015 04:49:15 -0500
Message-ID: <54DDC890.4050801@redhat.com>
Date: Fri, 13 Feb 2015 10:49:04 +0100
From: Paolo Bonzini
MIME-Version: 1.0
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <45883806-2296-486F-A0DC-D8A0A74F85B9@greensocs.com> <54DCC494.2010400@suse.de> <54DDC537.5070003@redhat.com>
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
To: Mark Burton
Cc: mttcg@greensocs.com,
Peter Maydell , Alexander Graf , qemu-devel

On 13/02/2015 10:37, Mark Burton wrote:
> the memory barrier is on the cpu requesting the flush isn’t it (not
> on the CPU that is being flushed)?

Oops, I misread Peter's explanation.

In that case, perhaps DMB can be treated in a similar way as WFI, using cpu->halted. Queueing work on other CPUs can be done with async_run_on_cpu, which exits the idle loop in qemu_tcg_wait_io_event (this avoids the deadlocks). Checking that other CPUs have flushed the TLBs can be done in cpu_has_work ("always return false if cpu->halted == true and there are outstanding TLB requests").

Paolo

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:40112) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMGKM-0007Et-Ow for qemu-devel@nongnu.org; Fri, 13 Feb 2015 08:30:40 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YMGKI-0004Xo-7N for qemu-devel@nongnu.org; Fri, 13 Feb 2015 08:30:38 -0500
Received: from roura.ac.upc.es ([147.83.33.10]:47983) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMGKH-0004Xa-S5 for qemu-devel@nongnu.org; Fri, 13 Feb 2015 08:30:34 -0500
From: Lluís Vilanova
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <54DCC8FF.7000609@suse.de> <87oaoy233p.fsf@fimbulvetr.bsc.es>
Date: Fri, 13 Feb 2015 14:30:29 +0100
In-Reply-To: (Mark Burton's message of "Fri, 13 Feb 2015 08:37:28 +0100")
Message-ID: <87lhk2uefe.fsf@fimbulvetr.bsc.es>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] Help on TLB Flush
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
To: Mark Burton
Cc: mttcg@greensocs.com, Peter Maydell , Alexander Graf , qemu-devel

Mark Burton writes:
>> On 13 Feb 2015, at 08:24, Peter Maydell wrote:
>>
>> On 13 February 2015 at 07:16, Mark Burton wrote:
>>> If the
kernel is doing this - then effectively - for X86, each CPU only
>>> flushes its own TLB (from the perspective of Qemu) - correct?
>>> (in which case, for Qemu itself - for x86) - we don’t need to implement
>>> a global flush, and hence we don’t need to build the mechanism to sync ?
>> The semantics you need are "flush the QEMU TLB for CPU X" (where
>> X may not be the CPU you're running on). This is what tlb_flush()
>> does: it takes a CPU argument to act on. (Ditto tlb_flush_page, etc.)
>> We then use that to implement the target's required semantics
>> (eg in ARM the tlbiall_is_write() function is handled by iterating
>> through all CPUs and calling tlb_flush on them).
> What Lluis implied seemed to be that the kernel arranged to signal the CPU that would flush. Hence, (for X86), we would only ever flush our own TLB.

That's correct.

[...]
> For our immediate concern, in the interests of getting the thing working and
> making sure we’ve turned over all the stones, on ARM - it MAY help us to check
> that the flush has happened ‘in the next memory barrier’….
> - I don’t know if that will help us or not, and - even if it does, I agree with you, it would be more messy than it need be.
> However, in the interests of making sure that there are no other issues - we may ‘hack’ something before we put in place a more elegant solution….
> (right now, we have some mutex issues, shifting the sync to the barrier MAY help us avoid that…. To Be Seen…. and anyway - it would only be a temporary fix).

But you shouldn't assume that everyone either uses x86's semantics (aka, each CPU gets an IPI), or the ARM semantics you described where the global TLB flush instruction has asynchronous effects. First, in ARM you still have to ensure other CPUs did what you asked them to (whenever the arch manual says you must do so).
Second, it seems like ARM does not always behave in the way you described:

http://lxr.free-electrons.com/source/arch/arm/kernel/smp.c?v=2.6.32#L630

Granted, this is just the same behaviour as x86, but no one guarantees you that some other operation in any of the multiple architectures supported by QEMU will never need a synchronous instruction with global effects.

I understand the pressure of getting something running and working from that, but I think that having a framework for asynchronous cross-CPU messaging would be rather useful in the future. That can be then complemented with a mechanism to wait for these asynchronous messages. You can achieve any desired behaviour by composing these two.

Cheers,
Lluis

--
"And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer."
-- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:40444) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMGMO-00008G-UD for qemu-devel@nongnu.org; Fri, 13 Feb 2015 08:32:49 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YMGMK-0005Pj-PM for qemu-devel@nongnu.org; Fri, 13 Feb 2015 08:32:44 -0500
Received: from greensocs.com ([193.104.36.180]:31978) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YMGMK-0005PY-BX for qemu-devel@nongnu.org; Fri, 13 Feb 2015 08:32:40 -0500
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
From: Mark Burton
In-Reply-To: <87lhk2uefe.fsf@fimbulvetr.bsc.es>
Date: Fri, 13 Feb 2015 14:32:35 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <2CFCCA1B-1AAE-4DB2-BF96-66AC40BCEA2F@greensocs.com>
References: <31B94C07-29A9-4595-95ED-FA860B527BD8@suse.de> <54DCC8FF.7000609@suse.de> <87oaoy233p.fsf@fimbulvetr.bsc.es> <87lhk2uefe.fsf@fimbulvetr.bsc.es>
Subject:
Re: [Qemu-devel] Help on TLB Flush
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
To: Lluís Vilanova
Cc: mttcg@greensocs.com, Peter Maydell , Alexander Graf , qemu-devel

Agreed

Cheers
Mark.

> On 13 Feb 2015, at 14:30, Lluís Vilanova wrote:
>
> Mark Burton writes:
>
>>> On 13 Feb 2015, at 08:24, Peter Maydell wrote:
>>>
>>> On 13 February 2015 at 07:16, Mark Burton wrote:
>>>> If the kernel is doing this - then effectively - for X86, each CPU only
>>>> flushes its own TLB (from the perspective of Qemu) - correct?
>>>> (in which case, for Qemu itself - for x86) - we don’t need to implement
>>>> a global flush, and hence we don’t need to build the mechanism to sync ?
>
>>> The semantics you need are "flush the QEMU TLB for CPU X" (where
>>> X may not be the CPU you're running on). This is what tlb_flush()
>>> does: it takes a CPU argument to act on. (Ditto tlb_flush_page, etc.)
>>> We then use that to implement the target's required semantics
>>> (eg in ARM the tlbiall_is_write() function is handled by iterating
>>> through all CPUs and calling tlb_flush on them).
>
>> What Lluis implied seemed to be that the kernel arranged to signal the CPU that would flush. Hence, (for X86), we would only ever flush our own TLB.
>
> That's correct.
>
> [...]
>> For our immediate concern, in the interests of getting the thing working and
>> making sure we’ve turned over all the stones, on ARM - it MAY help us to check
>> that the flush has happened ‘in the next memory barrier’….
>> - I don’t know if that will help us or not, and - even if it does, I agree with you, it would be more messy than it need be.
>> However, in the interests of making sure that there are no other issues - we may ‘hack’ something before we put in place a more elegant solution….
>> (right now, we have some mutex issues, shifting the sync to the barrier MAY help us avoid that…. To Be Seen…. and anyway - it would only be a temporary fix).
>
> But you shouldn't assume that everyone either uses x86's semantics (aka, each
> CPU gets an IPI), or the ARM semantics you described where the global TLB flush
> instruction has asynchronous effects. First, in ARM you still have to ensure
> other CPUs did what you asked them to (whenever the arch manual says you must do
> so). Second, it seems like ARM does not always behave in the way you described:
>
> http://lxr.free-electrons.com/source/arch/arm/kernel/smp.c?v=2.6.32#L630
>
> Granted, this is just the same behaviour as x86, but no one guarantees you that
> some other operation in any of the multiple architectures supported by QEMU will
> never need a synchronous instruction with global effects.
>
> I understand the pressure of getting something running and working from that, but I
> think that having a framework for asynchronous cross-CPU messaging would be
> rather useful in the future. That can be then complemented with a mechanism to
> wait for these asynchronous messages. You can achieve any desired behaviour by
> composing these two.
>
>
> Cheers,
> Lluis
>
> --
> "And it's much the same thing with knowledge, for whenever you learn
> something new, the whole world becomes that much richer."
> -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
> Tollbooth

+44 (0)20 7100 3485 x 210
+33 (0)5 33 52 01 77x 210

+33 (0)603762104
mark.burton
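
Editorial sketch (not part of any message above): the composition suggested in this thread, asynchronous cross-CPU messages plus a separate mechanism to wait for them, can be illustrated with a small toy model. It is written in Python rather than QEMU's C purely to keep it short, and every name in it (VCpu, FlushRequest, kick_off_flush, barrier) is hypothetical; none of them are QEMU APIs. It assumes each vCPU is a thread that services a per-CPU request list, and it splits the flush into a "kick off" phase that never blocks and a "make sure all CPUs are done" phase that the requester runs only at its barrier.

```python
import queue
import threading


class VCpu(threading.Thread):
    """Hypothetical stand-in for a vCPU thread (not QEMU's CPUState)."""

    def __init__(self, index):
        super().__init__(daemon=True)
        self.index = index
        self.tlb = {0x1000: 0x8000}   # toy software TLB: virtual page -> physical page
        self.work = queue.Queue()     # per-CPU request list

    def run(self):
        while True:
            job = self.work.get()     # models leaving guest code to service queued work
            if job is None:           # None is the shutdown marker
                break
            job(self)


class FlushRequest:
    """Completion counter: each kicked CPU increments it once it has flushed."""

    def __init__(self, n_kicked):
        self.n_kicked = n_kicked
        self.done = 0
        self.cond = threading.Condition()

    def complete(self):
        with self.cond:
            self.done += 1
            self.cond.notify_all()

    def wait(self, timeout=None):
        # Returns True once done == n_kicked, False if the timeout expires first.
        with self.cond:
            return self.cond.wait_for(lambda: self.done == self.n_kicked, timeout)


def kick_off_flush(requester, cpus):
    """Phase 1: flush our own TLB and queue a flush on every other CPU, without waiting."""
    others = [cpu for cpu in cpus if cpu is not requester]
    req = FlushRequest(len(others))

    def do_flush(cpu):
        cpu.tlb.clear()
        req.complete()

    requester.tlb.clear()
    for cpu in others:
        cpu.work.put(do_flush)
    return req


def barrier(req, timeout=5.0):
    """Phase 2: run at the requester's barrier; wait until every kicked CPU has flushed."""
    return req.wait(timeout)
```

A requester would call `req = kick_off_flush(me, cpus)` when the guest issues the global invalidate, carry on executing, and call `barrier(req)` only when it reaches its barrier instruction. The sketch deliberately does not solve the mutual-wait problem described earlier in the thread: a CPU that blocks in `barrier()` from inside its own job stops servicing its queue, so two simultaneous requesters would each wait for the other until the timeout expires.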