public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Olof Johansson <olof@lixom.net>
To: akpm@osdl.org
Cc: linuxppc64-dev@ozlabs.org, anton@samba.org, paulus@samba.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH] PPC64: Remove hot busy-wait loop in __hash_page
Date: Fri, 29 Apr 2005 02:07:52 -0500	[thread overview]
Message-ID: <20050429070752.GA30785@lixom.net> (raw)

Hi,

It turns out that our current __hash_page code will do a very hot
busy-wait loop waiting on _PAGE_BUSY to be cleared. It even does
ldarx/stdcx in the loop, which will bounce reservations around like crazy
if there's more than one CPU spinning on the same PTE (or even another
PTE in the same reservation granule). The end result is that each fault
takes longer when there's contention, which in turn increases the chance
of another thread hitting the same fault and also piling up. Not pretty.

There's two options here:
1. Do an out-of-line busy loop a'la spinlocks with just loads (no
   reserves)
2. Just bail and refault if needed.

(2) makes sense here: If the PTE is busy, chances are it's in flux anyway
and the other code path making a change might just be ready to hash it.

This fixes a stampede seen on a large-ish system where a multithreaded
HPC app faults in the same text pages on several cpus at the same time.


Signed-off-by: Olof Johansson <olof@lixom.net>


Index: 2.6/arch/ppc64/mm/hash_low.S
===================================================================
--- 2.6.orig/arch/ppc64/mm/hash_low.S	2005-04-24 17:49:43.000000000 -0500
+++ 2.6/arch/ppc64/mm/hash_low.S	2005-04-28 17:19:04.000000000 -0500
@@ -85,7 +85,10 @@ _GLOBAL(__hash_page)
 	bne-	htab_wrong_access
 	/* Check if PTE is busy */
 	andi.	r0,r31,_PAGE_BUSY
-	bne-	1b
+	/* If so, just bail out and refault if needed. Someone else
+	 * is changing this PTE anyway and might hash it.
+	 */
+	bne-	bail_ok
 	/* Prepare new PTE value (turn access RW into DIRTY, then
 	 * add BUSY,HASHPTE and ACCESSED)
 	 */
@@ -215,6 +218,10 @@ _GLOBAL(htab_call_hpte_remove)
 	/* Try all again */
 	b	htab_insert_pte	
 
+bail_ok:
+	li	r3,0
+	b	bail
+
 htab_pte_insert_ok:
 	/* Insert slot number & secondary bit in PTE */
 	rldimi	r30,r3,12,63-15

                 reply	other threads:[~2005-04-29  7:08 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20050429070752.GA30785@lixom.net \
    --to=olof@lixom.net \
    --cc=akpm@osdl.org \
    --cc=anton@samba.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc64-dev@ozlabs.org \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox