From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Jan Beulich <jbeulich@novell.com>
Cc: Ingo Molnar <mingo@elte.hu>, linux-kernel@vger.kernel.org
Subject: Re: operation ordering during pgd_alloc/pgd_free
Date: Thu, 05 Jun 2008 10:27:19 +0100
Message-ID: <4847B177.7070501@goop.org>
In-Reply-To: <4847C54B.76E4.0078.0@novell.com>
Jan Beulich wrote:
> At present, pgd_ctor() adds a new pgd to pgd_list solely based on
> !SHARED_KERNEL_PMD. For PAE && !SHARED_KERNEL_PMD (i.e. Xen)
> this doesn't seem correct, as the pgd is still empty, which will confuse
> vmalloc_sync_all(). So in this case, list insertion should only happen at
> the end of pgd_prepopulate_pmd().
>
How does vmalloc_sync_all() get confused?
> Likewise, pgd_free() calls pgd_mop_up_pmds() *before* pgd_dtor(),
> with the former zeroing pgd entries as it goes and only the latter
> removing the pgd from the list. Just as above this can confuse
> vmalloc_sync_all(), so here I would think that the two calls should just
> be swapped. However, if they get swapped, careful inspection of the
> interaction with save/restore will be needed -
Yes, I specifically wanted the pgd to be on the list from before it
gains its first entry until after it has lost its last one, so that no
pmds escape visibility from xen_mm_pin_all(). (Note to self: add a
memory barrier so that the list insertion is complete before any pmd
entry is installed, and the list removal only happens after every pmd
entry has been removed.)
> XenSource's Linux tree
> has a comment specifically to that effect:
>
> /*
>  * After this the pgd should not be pinned for the duration of this
>  * function's execution. We should never sleep and thus never race:
>  *  1. User pmds will not become write-protected under our feet due
>  *     to a concurrent mm_pin_all().
>  *  2. The machine addresses in PGD entries will not become invalid
>  *     due to a concurrent save/restore.
>  */
>
> Since that tree doesn't support preemption, this is perhaps fine, but
> likely going to cause problems in the (preemptable) pv-ops code.
>
I don't think so. When saving with preemption enabled, the save path
first puts all processes in the freezer before entering
stop_machine_run(); a process in the middle of constructing a pagetable
should have finished by the time it can be frozen.
But I think there's a problem *without* preemption.
pgd_prepopulate_pmd() allocates new pmds with GFP_KERNEL, and so it can
block, which undermines the precondition of the comment you quote. This
allows an unlisted and unpinned pgd to be missed at save time. I could
just use the freezer unconditionally, but there was some concern about
how much time it would take on a busy system.
Alternatively, a different ordering would fix it:
1. preallocate - but don't install - the pmds
2. take pgd_lock
3. install pmds into pgd
4. insert pgd onto list
5. release pgd_lock
Holding pgd_lock will prevent both vmalloc_sync_all() and
xen_mm_pin_all() from being able to visit the pgd while it is in its
transitional state.
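A rough userspace model of that ordering, just to make the five steps
concrete. Everything in it is a stand-in rather than kernel code:
struct pgd, struct pmd, pgd_publish() and count_partial_pgds() are
hypothetical names, a pthread mutex plays the role of pgd_lock, and
calloc() plays the role of a GFP_KERNEL allocation that may block.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define PTRS_PER_PGD 4	/* PAE top level has 4 entries */

/* Hypothetical stand-ins for the real pagetable types. */
struct pmd { int dummy; };
struct pgd {
	struct pmd *entry[PTRS_PER_PGD];
	struct pgd *next;		/* link in the model pgd_list */
};

static pthread_mutex_t pgd_lock = PTHREAD_MUTEX_INITIALIZER;
static struct pgd *pgd_list;

/* Populate a zeroed pgd and list it; returns 0, or -1 on allocation failure. */
static int pgd_publish(struct pgd *pgd)
{
	struct pmd *pmds[PTRS_PER_PGD];
	int i;

	/* 1. preallocate the pmds; this may block, so no lock is held yet */
	for (i = 0; i < PTRS_PER_PGD; i++) {
		pmds[i] = calloc(1, sizeof(*pmds[i]));
		if (!pmds[i]) {
			while (i--)
				free(pmds[i]);
			return -1;
		}
	}

	pthread_mutex_lock(&pgd_lock);		/* 2. take pgd_lock */
	for (i = 0; i < PTRS_PER_PGD; i++)	/* 3. install pmds into pgd */
		pgd->entry[i] = pmds[i];
	pgd->next = pgd_list;			/* 4. insert pgd onto list */
	pgd_list = pgd;
	pthread_mutex_unlock(&pgd_lock);	/* 5. release pgd_lock */
	return 0;
}

/*
 * A list walker in the role of vmalloc_sync_all() or xen_mm_pin_all():
 * counts listed pgds that still have an empty slot.
 */
static int count_partial_pgds(void)
{
	struct pgd *p;
	int i, partial = 0;

	pthread_mutex_lock(&pgd_lock);
	for (p = pgd_list; p; p = p->next) {
		for (i = 0; i < PTRS_PER_PGD; i++) {
			if (!p->entry[i]) {
				partial++;
				break;
			}
		}
	}
	pthread_mutex_unlock(&pgd_lock);
	return partial;
}
```

Because the only blocking step happens before the lock is taken, a
walker that holds pgd_lock either does not see the pgd at all or sees
it fully populated; in this model count_partial_pgds() has no way to
observe the in-between state.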
> The issue with vmalloc_sync_all() would even go unnoticed, since the
> patch to unify the pgd_list mechanism with x86-64 removed the
> BUG_ON() that was meant to trigger on issues like this.
>
Is there an inherent reason vmalloc_sync_all() can't deal with a
partially constructed pgd? Couldn't it just skip such a pgd, as if it
weren't (or rather, weren't yet) on the list? In fact, that looks like
what it does now.
Thanks for looking at this.
J