From: Andrea Arcangeli <aarcange@redhat.com>
To: Jan Beulich <JBeulich@novell.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>,
Andi Kleen <andi@firstfloor.org>, Hugh Dickins <hughd@google.com>,
Jeremy Fitzhardinge <jeremy@goop.org>,
the arch/x86 maintainers <x86@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
"Xen-devel@lists.xensource.com" <Xen-devel@lists.xensource.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Johannes Weiner <jweiner@redhat.com>,
Larry Woodman <lwoodman@redhat.com>,
Rik van Riel <riel@redhat.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH] fix pgd_lock deadlock
Date: Thu, 24 Feb 2011 05:22:22 +0100
Message-ID: <20110224042222.GG31195@random.random>
In-Reply-To: <4D63D4CD020000780003320A@vpn.id2.novell.com>
On Tue, Feb 22, 2011 at 02:22:53PM +0000, Jan Beulich wrote:
> >>> On 22.02.11 at 14:49, Andrea Arcangeli <aarcange@redhat.com> wrote:
> > On Tue, Feb 22, 2011 at 07:48:54AM +0000, Jan Beulich wrote:
> >> A possible alternative would be to acquire the page table lock
> >> in vmalloc_sync_all() only in the Xen case (perhaps by storing
> >> NULL into page->index in pgd_set_mm() when not running on
> >> Xen). This is utilizing the fact that there aren't (supposed to
> >> be - for non-pvops this is definitely the case) any TLB flush IPIs
> >> under Xen, and hence the race you're trying to fix doesn't
> >> exist there (while non-Xen doesn't need the extra locking).
> >
> > That's sure ok with me. Can we use a global runtime to check if the
> > guest is running under Xen paravirt, instead of passing that info
> > through page->something?
>
> If everyone's okay with putting a couple of "if (xen_pv_domain())"
> into mm/fault.c - sure. I would have thought that this wouldn't be
> liked, hence the suggestion to make this depend on seeing the
> backlink be non-NULL.
What about this? The page->index logic gets optimized away at
compile time with CONFIG_XEN=n.
I'll delay the removal of _irqsave from pgd_lock, as it's no longer a
bugfix.
===
Subject: xen: stop taking the page_table_lock with irq disabled
From: Andrea Arcangeli <aarcange@redhat.com>
It's forbidden to take the page_table_lock with irqs disabled: if there's
contention, the IPIs (for TLB flushes) sent with the page_table_lock held will
never run, leading to a deadlock.
Only Xen needs the page_table_lock here, and Xen doesn't need IPI TLB flushes,
hence the deadlock doesn't exist for Xen.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
arch/x86/include/asm/pgtable.h | 5 +++--
arch/x86/mm/fault.c | 10 ++++++----
arch/x86/mm/init_64.c | 10 ++++++----
arch/x86/mm/pgtable.c | 9 +++------
4 files changed, 18 insertions(+), 16 deletions(-)
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -235,14 +235,16 @@ void vmalloc_sync_all(void)
spin_lock_irqsave(&pgd_lock, flags);
list_for_each_entry(page, &pgd_list, lru) {
- spinlock_t *pgt_lock;
+ struct mm_struct *mm;
pmd_t *ret;
- pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
+ mm = pgd_page_get_mm(page);
- spin_lock(pgt_lock);
+ if (mm)
+ spin_lock(&mm->page_table_lock);
ret = vmalloc_sync_one(page_address(page), address);
- spin_unlock(pgt_lock);
+ if (mm)
+ spin_unlock(&mm->page_table_lock);
if (!ret)
break;
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -114,11 +114,12 @@ void sync_global_pgds(unsigned long star
spin_lock_irqsave(&pgd_lock, flags);
list_for_each_entry(page, &pgd_list, lru) {
pgd_t *pgd;
- spinlock_t *pgt_lock;
+ struct mm_struct *mm;
pgd = (pgd_t *)page_address(page) + pgd_index(address);
- pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
- spin_lock(pgt_lock);
+ mm = pgd_page_get_mm(page);
+ if (mm)
+ spin_lock(&mm->page_table_lock);
if (pgd_none(*pgd))
set_pgd(pgd, *pgd_ref);
@@ -126,7 +127,8 @@ void sync_global_pgds(unsigned long star
BUG_ON(pgd_page_vaddr(*pgd)
!= pgd_page_vaddr(*pgd_ref));
- spin_unlock(pgt_lock);
+ if (mm)
+ spin_unlock(&mm->page_table_lock);
}
spin_unlock_irqrestore(&pgd_lock, flags);
}
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -4,6 +4,7 @@
#include <asm/pgtable.h>
#include <asm/tlb.h>
#include <asm/fixmap.h>
+#include <xen/xen.h>
#define PGALLOC_GFP GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO
@@ -91,12 +92,8 @@ static inline void pgd_list_del(pgd_t *p
static void pgd_set_mm(pgd_t *pgd, struct mm_struct *mm)
{
BUILD_BUG_ON(sizeof(virt_to_page(pgd)->index) < sizeof(mm));
- virt_to_page(pgd)->index = (pgoff_t)mm;
-}
-
-struct mm_struct *pgd_page_get_mm(struct page *page)
-{
- return (struct mm_struct *)page->index;
+ if (xen_pv_domain())
+ virt_to_page(pgd)->index = (pgoff_t)mm;
}
static void pgd_ctor(struct mm_struct *mm, pgd_t *pgd)
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -28,8 +28,6 @@ extern unsigned long empty_zero_page[PAG
extern spinlock_t pgd_lock;
extern struct list_head pgd_list;
-extern struct mm_struct *pgd_page_get_mm(struct page *page);
-
#ifdef CONFIG_PARAVIRT
#include <asm/paravirt.h>
#else /* !CONFIG_PARAVIRT */
@@ -83,6 +81,9 @@ extern struct mm_struct *pgd_page_get_mm
#endif /* CONFIG_PARAVIRT */
+#define pgd_page_get_mm(__page) \
+ ((struct mm_struct *)(xen_pv_domain() ? (__page)->index : 0))
+
/*
* The following only work if pte_present() is true.
* Undefined behaviour if not..
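
The conditional locking in the patch above can be modelled in user space.
What follows is only a sketch under stated assumptions, not kernel code:
MODEL_XEN, model_xen_pv, the held/lock_count fields and sync_one_pgd() are
hypothetical stand-ins invented for illustration, and the "lock" is a plain
flag rather than a spinlock. It shows two things: pgd_page_get_mm() yields
NULL unless xen_pv_domain() is true, so the lock is skipped on non-Xen, and
when xen_pv_domain() is a constant 0 the compiler is free to discard the
page->index read entirely.

```c
#include <stddef.h>

/* Assumption: MODEL_XEN stands in for CONFIG_XEN. Build with -DMODEL_XEN=0
 * to make xen_pv_domain() a constant 0, as on a kernel without Xen. */
#ifndef MODEL_XEN
#define MODEL_XEN 1
#endif

#if MODEL_XEN
static int model_xen_pv;		/* hypothetical runtime "running on Xen PV" flag */
#define xen_pv_domain()	(model_xen_pv)
#else
#define xen_pv_domain()	(0)
#endif

/* Heavily reduced stand-ins for the kernel structures. */
struct mm_struct {
	int held;			/* 1 while the modelled page_table_lock is held */
	int lock_count;			/* how many times the lock was taken */
};

struct page {
	unsigned long index;		/* backlink to the mm, only set on Xen PV */
};

/* Same shape as the macro added to pgtable.h by the patch. */
#define pgd_page_get_mm(__page) \
	((struct mm_struct *)(xen_pv_domain() ? (__page)->index : 0))

/* Mirrors the patched pgd_set_mm(): record the backlink only on Xen PV. */
static void pgd_set_mm(struct page *page, struct mm_struct *mm)
{
	if (xen_pv_domain())
		page->index = (unsigned long)mm;
}

/* Mirrors one loop iteration of vmalloc_sync_all()/sync_global_pgds():
 * take the per-mm lock only when a backlink exists.
 * Returns 1 if the lock was taken, 0 if it was skipped. */
static int sync_one_pgd(struct page *page)
{
	struct mm_struct *mm = pgd_page_get_mm(page);
	int locked = 0;

	if (mm) {
		mm->held = 1;
		mm->lock_count++;
		locked = 1;
	}

	/* ... the pgd entry would be synced here ... */

	if (mm)
		mm->held = 0;

	return locked;
}
```

With model_xen_pv left at 0, pgd_set_mm() stores nothing and sync_one_pgd()
never touches the lock, which corresponds to the non-Xen path that avoids
the deadlock described in the changelog.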