linux-kernel.vger.kernel.org archive mirror
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: the arch/x86 maintainers <x86@kernel.org>,
	"Xen-devel@lists.xensource.com" <Xen-devel@lists.xensource.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Jan Beulich <JBeulich@novell.com>
Subject: Re: [PATCH] x86: hold mm->page_table_lock while doing vmalloc_sync
Date: Thu, 21 Oct 2010 14:06:59 -0700
Message-ID: <4CC0AB73.8060609@goop.org>
In-Reply-To: <4CB76E8B.2090309@goop.org>

Ping?  Have you had any thoughts about possible x86-64 problems with this?

Thanks,
    J

On 10/14/2010 01:56 PM, Jeremy Fitzhardinge wrote:
>
> Take mm->page_table_lock while syncing the vmalloc region.  This prevents
> a race with the Xen pagetable pin/unpin code, which expects that the
> page_table_lock is already held.  If this race occurs, then Xen can see
> an inconsistent page type (a page can either be read/write or a pagetable
> page, and pin/unpin converts it between them), which will cause either
> the pin or the set_p[gm]d to fail; either will crash the kernel.
>
> vmalloc_sync_all() should be called rarely, so this extra use of
> page_table_lock should not interfere with its normal users.
>
> The mm pointer is stashed in the pgd page's index field, as that won't
> be otherwise used for pgd pages.
>
> Bug reported by Ian Campbell <ian.cambell@eu.citrix.com>
> Derived from a patch by Jan Beulich <jbeulich@novell.com>
>
> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
>
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index a34c785..422b363 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -28,6 +28,8 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
>  extern spinlock_t pgd_lock;
>  extern struct list_head pgd_list;
>  
> +extern struct mm_struct *pgd_page_get_mm(struct page *page);
> +
>  #ifdef CONFIG_PARAVIRT
>  #include <asm/paravirt.h>
>  #else  /* !CONFIG_PARAVIRT */
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 4c4508e..b7f9ae1 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -229,7 +229,16 @@ void vmalloc_sync_all(void)
>  
>  		spin_lock_irqsave(&pgd_lock, flags);
>  		list_for_each_entry(page, &pgd_list, lru) {
> -			if (!vmalloc_sync_one(page_address(page), address))
> +			spinlock_t *pgt_lock;
> +			int ret;
> +
> +			pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
> +
> +			spin_lock(pgt_lock);
> +			ret = vmalloc_sync_one(page_address(page), address);
> +			spin_unlock(pgt_lock);
> +
> +			if (!ret)
>  				break;
>  		}
>  		spin_unlock_irqrestore(&pgd_lock, flags);
> @@ -341,11 +350,19 @@ void vmalloc_sync_all(void)
>  		spin_lock_irqsave(&pgd_lock, flags);
>  		list_for_each_entry(page, &pgd_list, lru) {
>  			pgd_t *pgd;
> +			spinlock_t *pgt_lock;
> +
>  			pgd = (pgd_t *)page_address(page) + pgd_index(address);
> +
> +			pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
> +			spin_lock(pgt_lock);
> +
>  			if (pgd_none(*pgd))
>  				set_pgd(pgd, *pgd_ref);
>  			else
>  				BUG_ON(pgd_page_vaddr(*pgd) != pgd_page_vaddr(*pgd_ref));
> +
> +			spin_unlock(pgt_lock);
>  		}
>  		spin_unlock_irqrestore(&pgd_lock, flags);
>  	}
> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index 5c4ee42..c70e57d 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -87,7 +87,19 @@ static inline void pgd_list_del(pgd_t *pgd)
>  #define UNSHARED_PTRS_PER_PGD				\
>  	(SHARED_KERNEL_PMD ? KERNEL_PGD_BOUNDARY : PTRS_PER_PGD)
>  
> -static void pgd_ctor(pgd_t *pgd)
> +
> +static void pgd_set_mm(pgd_t *pgd, struct mm_struct *mm)
> +{
> +	BUILD_BUG_ON(sizeof(virt_to_page(pgd)->index) < sizeof(mm));
> +	virt_to_page(pgd)->index = (pgoff_t)mm;
> +}
> +
> +struct mm_struct *pgd_page_get_mm(struct page *page)
> +{
> +	return (struct mm_struct *)page->index;
> +}
> +
> +static void pgd_ctor(struct mm_struct *mm, pgd_t *pgd)
>  {
>  	/* If the pgd points to a shared pagetable level (either the
>  	   ptes in non-PAE, or shared PMD in PAE), then just copy the
> @@ -105,8 +117,10 @@ static void pgd_ctor(pgd_t *pgd)
>  	}
>  
>  	/* list required to sync kernel mapping updates */
> -	if (!SHARED_KERNEL_PMD)
> +	if (!SHARED_KERNEL_PMD) {
> +		pgd_set_mm(pgd, mm);
>  		pgd_list_add(pgd);
> +	}
>  }
>  
>  static void pgd_dtor(pgd_t *pgd)
> @@ -272,7 +286,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
>  	 */
>  	spin_lock_irqsave(&pgd_lock, flags);
>  
> -	pgd_ctor(pgd);
> +	pgd_ctor(mm, pgd);
>  	pgd_prepopulate_pmd(mm, pgd, pmds);
>  
>  	spin_unlock_irqrestore(&pgd_lock, flags);
>
>
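
For readers following the locking scheme in the quoted patch, here is a
minimal user-space sketch of the same pattern: an owner pointer is stashed
in an otherwise-unused field of a page-like struct, and the owner's lock is
taken inside the global list lock while each entry is synced.  Everything
named demo_* is hypothetical and stands in for the kernel's struct page,
struct mm_struct, pgd_lock and pgd_list; pthread mutexes replace spinlocks,
and the "sync" is just a counter.  It is an illustration under those
assumptions, not the kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mm_struct. */
struct demo_mm {
	pthread_mutex_t page_table_lock;
	int synced;		/* number of syncs done under the lock */
};

/* Hypothetical stand-in for struct page. */
struct demo_page {
	uintptr_t index;	/* otherwise unused here; holds the owner */
	struct demo_page *next;	/* simple singly linked "pgd_list" */
};

static pthread_mutex_t demo_pgd_lock = PTHREAD_MUTEX_INITIALIZER;
static struct demo_page *demo_pgd_list;

static void demo_mm_init(struct demo_mm *mm)
{
	pthread_mutex_init(&mm->page_table_lock, NULL);
	mm->synced = 0;
}

/* Analogue of pgd_set_mm(): record the owner in the index field. */
static void demo_page_set_mm(struct demo_page *page, struct demo_mm *mm)
{
	_Static_assert(sizeof(page->index) >= sizeof(mm),
		       "index field too small to hold a pointer");
	page->index = (uintptr_t)mm;
}

/* Analogue of pgd_page_get_mm(): recover the owner from the page. */
static struct demo_mm *demo_page_get_mm(struct demo_page *page)
{
	return (struct demo_mm *)page->index;
}

/* Owner is set before the page becomes visible on the list. */
static void demo_list_add(struct demo_page *page, struct demo_mm *mm)
{
	pthread_mutex_lock(&demo_pgd_lock);
	demo_page_set_mm(page, mm);
	page->next = demo_pgd_list;
	demo_pgd_list = page;
	pthread_mutex_unlock(&demo_pgd_lock);
}

/* Walk the list under the global lock; take each owner's lock inside it. */
static int demo_sync_all(void)
{
	int visited = 0;

	pthread_mutex_lock(&demo_pgd_lock);
	for (struct demo_page *p = demo_pgd_list; p; p = p->next) {
		struct demo_mm *mm = demo_page_get_mm(p);

		pthread_mutex_lock(&mm->page_table_lock);
		mm->synced++;	/* placeholder for the real sync work */
		pthread_mutex_unlock(&mm->page_table_lock);
		visited++;
	}
	pthread_mutex_unlock(&demo_pgd_lock);
	return visited;
}
```

Note that the sketch fixes one lock order (list lock outside, owner lock
inside); any other path taking both locks would need to use the same order.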
