From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: fsl booke MM vs. SMP questions
From: Benjamin Herrenschmidt
To: ppc-dev
Cc: Kumar Gala, Paul Mackerras
Date: Mon, 21 May 2007 17:06:55 +1000
Message-Id: <1179731215.32247.659.camel@localhost.localdomain>
Content-Type: text/plain
Mime-Version: 1.0
List-Id: Linux on PowerPC Developers Mail List

Hi Folks!

I see that the fsl booke code has some #ifdef CONFIG_SMP bits here and
there, so I suppose SMP implementations of these chips exist, right?
However, I'm having serious trouble figuring out how the TLB management
is made SMP-safe.

I've spotted at least two main issues at this point. (There's at least
one more if there is HW threading, that is, a TLB shared between
logical processors, but I'll ignore that for now since I don't think
such a thing exists ... yet.)

- How do you guys shield PTE flushing against TLB misses on another
  CPU? That is, how do you prevent (if you do) the following scenario:

      cpu 0                          cpu 1

      tlb miss
        load PTE value
                                     pte_clear (or similar):
                                       write 0 to PTE (or replace it)
                                       tlbivax (tlbie)
        tlbwe

  That scenario, as you can see, leaves you with a stale entry in the
  TLB, which will ultimately lead to all sorts of unpleasant/random
  behaviours.

  If the answer is "oops ... we don't", then let's try to find a way
  out of that, since I may have a similar issue in a not too distant
  future :-) and I'm trying to find a -fast- way to deal with it
  without bloating the fast path. My main problem is that I want to
  avoid taking a spinlock or an equivalent atomic operation in the
  fast TLB reload path (which would solve the problem), since
  lwarx/stwcx. are generally really slow (hundreds of cycles on some
  processors).

- I see that your TLB miss handler uses a non-atomic store to write
  the _PAGE_ACCESSED bit back to the PTE. Don't you have a similar
  race where something does:

      cpu 0                          cpu 1

      tlb miss
        load PTE value
                                     pte_clear (or similar):
                                       write 0 to PTE (or replace it)
        write back PTE with _PAGE_ACCESSED
        tlbwe

  This is an extension of the previous race, but it's a different
  problem, so I've listed it separately. In this case, the problem is
  worse: not only do you have a stale TLB entry, you have -also-
  corrupted the Linux PTE by writing the old value back into it.

  At this point, I'm afraid you may have no choice but to go atomic,
  which means paying the cost of lwarx/stwcx. on TLB misses. Though if
  you have a solution to the first problem, you can avoid the atomic
  operation in the second case whenever _PAGE_ACCESSED is already set.
  If not, you might have to use a _PAGE_BUSY bit as a per-PTE lock,
  similar to what the 64-bit code uses, or use mmu_hash_lock... unless
  you come up with a great idea or some HW black magic that makes the
  problem go away.

In any case, I'm curious how you have solved, or intend to solve, this,
since as I said above, I might be in a similar situation soon and am
trying to keep the TLB miss handler as fast as humanly possible.

Cheers,
Ben.
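
To make the first race concrete, here is a minimal userspace C model
of one possible lock-free answer. This is a sketch, not the actual fsl
booke code: tlb_write_entry()/tlb_invalidate_entry() are hypothetical
stand-ins for tlbwe/tlbivax, and the PTE bit value is assumed. The
idea is that the miss handler re-reads the PTE after the tlbwe and
backs the entry out if it changed, while the invalidating side makes
its store to the PTE visible before issuing tlbivax, so one side
always observes the other.

    #include <stdatomic.h>
    #include <stdint.h>

    typedef uint32_t pte_t;

    #define _PAGE_PRESENT 0x001         /* assumed bit value */

    /* Hypothetical stand-ins for the tlbwe/tlbivax instructions;
     * no-ops in this userspace model. */
    static void tlb_write_entry(uintptr_t va, pte_t pte)
    { (void)va; (void)pte; }
    static void tlb_invalidate_entry(uintptr_t va)
    { (void)va; }

    /* Miss handler: install the entry, then re-check the PTE and
     * back the entry out if it changed underneath us. */
    static void tlb_miss(_Atomic pte_t *ptep, uintptr_t va)
    {
        pte_t pte = atomic_load(ptep);

        if (!(pte & _PAGE_PRESENT))
            return;                     /* normal page fault path */

        tlb_write_entry(va, pte);

        /* If the invalidator ran in between, its store to the PTE
         * is visible by now (it orders the store before tlbivax),
         * so we notice and remove the entry we just installed. */
        if (atomic_load(ptep) != pte)
            tlb_invalidate_entry(va);
    }

    /* Invalidation side: clear the PTE, make the store visible,
     * then shoot down any copy already in a TLB.  On real hardware
     * this would be: store; sync; tlbivax; tlbsync; sync. */
    static void pte_clear_and_flush(_Atomic pte_t *ptep, uintptr_t va)
    {
        atomic_store(ptep, 0);          /* seq_cst store */
        tlb_invalidate_entry(va);
    }

The sync/tlbsync placement is the hard part on real hardware; the
model simply leans on sequentially consistent atomics.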
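
For the second race, here is the kind of lwarx/stwcx. loop the mail
alludes to, modeled with a C compare-and-swap. As suggested above, the
atomic path is only taken when _PAGE_ACCESSED is not yet set, so most
misses pay nothing beyond a plain load. The bit values are again
assumed, for illustration only.

    #include <stdatomic.h>
    #include <stdint.h>

    typedef uint32_t pte_t;

    #define _PAGE_PRESENT  0x001        /* assumed bit values */
    #define _PAGE_ACCESSED 0x100

    /* Set _PAGE_ACCESSED atomically; the C equivalent of a
     * lwarx / or / stwcx. / bne- loop.  Returns the PTE value to
     * load into the TLB, or 0 to fall back to the fault path. */
    static pte_t pte_mkaccessed(_Atomic pte_t *ptep)
    {
        pte_t old = atomic_load(ptep);

        for (;;) {
            if (!(old & _PAGE_PRESENT))
                return 0;            /* cleared under us: bail out */
            if (old & _PAGE_ACCESSED)
                return old;          /* fast path: plain load only */
            /* The CAS fails if anyone cleared or replaced the PTE
             * between our load and our store (exactly the window
             * in the second diagram) and reloads 'old'. */
            if (atomic_compare_exchange_weak(ptep, &old,
                                             old | _PAGE_ACCESSED))
                return old | _PAGE_ACCESSED;
        }
    }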
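
Finally, a sketch of the _PAGE_BUSY alternative mentioned at the end:
a per-PTE bit lock taken with the same compare-and-swap primitive,
under the assumption that a spare software bit exists in the PTE. It
serializes the miss handler against pte_clear() without a global lock,
at the price of an atomic operation on every miss.

    #include <stdatomic.h>
    #include <stdint.h>

    typedef uint32_t pte_t;

    #define _PAGE_BUSY 0x200   /* assumed spare SW bit in the PTE */

    /* Per-PTE bit lock, in the spirit of the 64-bit code: spin
     * until we are the ones who atomically set _PAGE_BUSY. */
    static pte_t pte_lock(_Atomic pte_t *ptep)
    {
        pte_t old = atomic_load(ptep);

        for (;;) {
            while (old & _PAGE_BUSY)    /* held: wait and reload */
                old = atomic_load(ptep);
            if (atomic_compare_exchange_weak(ptep, &old,
                                             old | _PAGE_BUSY))
                return old;             /* PTE as seen, sans BUSY */
            /* CAS failure reloaded 'old'; retry. */
        }
    }

    static void pte_unlock(_Atomic pte_t *ptep, pte_t newval)
    {
        atomic_store(ptep, newval & ~_PAGE_BUSY);
    }

If pte_clear() takes the same lock before writing 0, the
_PAGE_ACCESSED writeback in the second diagram can no longer resurrect
a dead PTE.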