From: Andrea Arcangeli <aarcange@redhat.com>
To: Jan Beulich <JBeulich@novell.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>,
	Andi Kleen <andi@firstfloor.org>, Hugh Dickins <hughd@google.com>,
	Jeremy Fitzhardinge <jeremy@goop.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Xen-devel@lists.xensource.com" <Xen-devel@lists.xensource.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Johannes Weiner <jweiner@redhat.com>,
	Larry Woodman <lwoodman@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH] fix pgd_lock deadlock
Date: Thu, 24 Feb 2011 05:22:22 +0100
Message-ID: <20110224042222.GG31195@random.random>
In-Reply-To: <4D63D4CD020000780003320A@vpn.id2.novell.com>

On Tue, Feb 22, 2011 at 02:22:53PM +0000, Jan Beulich wrote:
> >>> On 22.02.11 at 14:49, Andrea Arcangeli <aarcange@redhat.com> wrote:
> > On Tue, Feb 22, 2011 at 07:48:54AM +0000, Jan Beulich wrote:
> >> A possible alternative would be to acquire the page table lock
> >> in vmalloc_sync_all() only in the Xen case (perhaps by storing
> >> NULL into page->index in pgd_set_mm() when not running on
> >> Xen). This is utilizing the fact that there aren't (supposed to
> >> be - for non-pvops this is definitely the case) any TLB flush IPIs
> >> under Xen, and hence the race you're trying to fix doesn't
> >> exist there (while non-Xen doesn't need the extra locking).
> > 
> > That's sure ok with me. Can we use a global runtime to check if the
> > guest is running under Xen paravirt, instead of passing that info
> > through page->something?
> 
> If everyone's okay with putting a couple of "if (xen_pv_domain())"
> into mm/fault.c - sure. I would have thought that this wouldn't be
> liked, hence the suggestion to make this depend on seeing the
> backlink be non-NULL.

What about this? The page->index logic gets optimized away at
compile time with XEN=n.

I'll postpone removing _irqsave from pgd_lock, since it is no longer a
bugfix.
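
To illustrate the compile-time point, here is a rough userspace sketch,
not kernel code. It assumes that with XEN=n xen_pv_domain() expands to a
constant 0, so the pgd_page_get_mm() macro folds to NULL and every
"if (mm)" branch is dead code. All sketch_* names and SKETCH_CONFIG_XEN
are hypothetical stand-ins, and the "lock" is a plain flag rather than a
real spinlock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel config switch; 0 models a XEN=n build. */
#ifndef SKETCH_CONFIG_XEN
#define SKETCH_CONFIG_XEN 0
#endif

#if SKETCH_CONFIG_XEN
static int sketch_xen_pv = 1;                  /* runtime check when Xen is built in */
#define sketch_xen_pv_domain() (sketch_xen_pv)
#else
#define sketch_xen_pv_domain() 0               /* compile-time constant when XEN=n */
#endif

struct sketch_mm {
	int page_table_lock;   /* 1 while "held"; a flag, not a real spinlock */
	int lock_count;        /* how many times the lock was taken */
};

struct sketch_page {
	uintptr_t index;       /* backlink to the owning mm, only set on Xen PV */
};

/* Same shape as the macro in the patch: NULL unless running as Xen PV. */
#define sketch_pgd_page_get_mm(page) \
	((struct sketch_mm *)(sketch_xen_pv_domain() ? (page)->index : 0))

static void sketch_pgd_set_mm(struct sketch_page *page, struct sketch_mm *mm)
{
	if (sketch_xen_pv_domain())
		page->index = (uintptr_t)mm;
}

/* Returns 1 if the per-mm lock was taken, 0 if it was skipped. */
static int sketch_sync_one(struct sketch_page *page)
{
	struct sketch_mm *mm = sketch_pgd_page_get_mm(page);

	if (mm) {
		assert(!mm->page_table_lock);  /* must not already be held */
		mm->page_table_lock = 1;
		mm->lock_count++;
	}
	/* ... the page table update would happen here ... */
	if (mm)
		mm->page_table_lock = 0;

	return mm != NULL;
}

int sketch_run_demo(void)
{
	struct sketch_mm mm = { 0, 0 };
	struct sketch_page page = { 0 };

	sketch_pgd_set_mm(&page, &mm);
	return sketch_sync_one(&page);
}
```

Built as-is, sketch_run_demo() returns 0 because the lock path is never
entered. Built with -DSKETCH_CONFIG_XEN=1 it returns 1, which models the
Xen PV case where the backlink is stored and the lock is taken.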

===
Subject: xen: stop taking the page_table_lock with irq disabled

From: Andrea Arcangeli <aarcange@redhat.com>

Taking the page_table_lock with irqs disabled is forbidden: if there is
contention, the IPIs (for TLB flushes) sent with the page_table_lock held will
never run, leading to a deadlock.

Only Xen needs the page_table_lock here, and Xen does not need IPI TLB
flushes, hence the deadlock does not exist for Xen.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 arch/x86/include/asm/pgtable.h |    5 +++--
 arch/x86/mm/fault.c            |   10 ++++++----
 arch/x86/mm/init_64.c          |   10 ++++++----
 arch/x86/mm/pgtable.c          |    9 +++------
 4 files changed, 18 insertions(+), 16 deletions(-)

--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -235,14 +235,16 @@ void vmalloc_sync_all(void)
 
 		spin_lock_irqsave(&pgd_lock, flags);
 		list_for_each_entry(page, &pgd_list, lru) {
-			spinlock_t *pgt_lock;
+			struct mm_struct *mm;
 			pmd_t *ret;
 
-			pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
+			mm = pgd_page_get_mm(page);
 
-			spin_lock(pgt_lock);
+			if (mm)
+				spin_lock(&mm->page_table_lock);
 			ret = vmalloc_sync_one(page_address(page), address);
-			spin_unlock(pgt_lock);
+			if (mm)
+				spin_unlock(&mm->page_table_lock);
 
 			if (!ret)
 				break;
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -114,11 +114,12 @@ void sync_global_pgds(unsigned long star
 		spin_lock_irqsave(&pgd_lock, flags);
 		list_for_each_entry(page, &pgd_list, lru) {
 			pgd_t *pgd;
-			spinlock_t *pgt_lock;
+			struct mm_struct *mm;
 
 			pgd = (pgd_t *)page_address(page) + pgd_index(address);
-			pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
-			spin_lock(pgt_lock);
+			mm = pgd_page_get_mm(page);
+			if (mm)
+				spin_lock(&mm->page_table_lock);
 
 			if (pgd_none(*pgd))
 				set_pgd(pgd, *pgd_ref);
@@ -126,7 +127,8 @@ void sync_global_pgds(unsigned long star
 				BUG_ON(pgd_page_vaddr(*pgd)
 				       != pgd_page_vaddr(*pgd_ref));
 
-			spin_unlock(pgt_lock);
+			if (mm)
+				spin_unlock(&mm->page_table_lock);
 		}
 		spin_unlock_irqrestore(&pgd_lock, flags);
 	}
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -4,6 +4,7 @@
 #include <asm/pgtable.h>
 #include <asm/tlb.h>
 #include <asm/fixmap.h>
+#include <xen/xen.h>
 
 #define PGALLOC_GFP GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO
 
@@ -91,12 +92,8 @@ static inline void pgd_list_del(pgd_t *p
 static void pgd_set_mm(pgd_t *pgd, struct mm_struct *mm)
 {
 	BUILD_BUG_ON(sizeof(virt_to_page(pgd)->index) < sizeof(mm));
-	virt_to_page(pgd)->index = (pgoff_t)mm;
-}
-
-struct mm_struct *pgd_page_get_mm(struct page *page)
-{
-	return (struct mm_struct *)page->index;
+	if (xen_pv_domain())
+		virt_to_page(pgd)->index = (pgoff_t)mm;
 }
 
 static void pgd_ctor(struct mm_struct *mm, pgd_t *pgd)
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -28,8 +28,6 @@ extern unsigned long empty_zero_page[PAG
 extern spinlock_t pgd_lock;
 extern struct list_head pgd_list;
 
-extern struct mm_struct *pgd_page_get_mm(struct page *page);
-
 #ifdef CONFIG_PARAVIRT
 #include <asm/paravirt.h>
 #else  /* !CONFIG_PARAVIRT */
@@ -83,6 +81,9 @@ extern struct mm_struct *pgd_page_get_mm
 
 #endif	/* CONFIG_PARAVIRT */
 
+#define pgd_page_get_mm(__page) \
+	((struct mm_struct *)(xen_pv_domain() ? (__page)->index : 0))
+
 /*
  * The following only work if pte_present() is true.
  * Undefined behaviour if not..
