From: Keir Fraser <keir.xen@gmail.com>
To: Jan Beulich <JBeulich@suse.com>, xen-devel <xen-devel@lists.xen.org>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: [PATCH] x86: fix map_domain_page() last resort fallback
Date: Wed, 12 Jun 2013 18:27:50 +0100
Message-ID: <CDDE6E26.54D3B%keir.xen@gmail.com>
In-Reply-To: <51B8B70D02000078000DDACB@nat28.tlf.novell.com>
On 12/06/2013 16:59, "Jan Beulich" <JBeulich@suse.com> wrote:
> Guests with vCPU count not divisible by 4 have unused bits in the last
> word of their inuse bitmap, and the garbage collection code therefore
> would be misled into believing that some entries were actually recoverable
> for use.
>
> Also use an earlier established local variable in mapcache_vcpu_init()
> instead of re-calculating the value (noticed while investigating the
> generally better option of setting those overhanging bits once during
> setup - this didn't work out in a simple enough fashion because the
> mapping getting established there isn't in the current address space,
> and hence the bitmap isn't directly accessible there).
>
> Reported-by: Konrad Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Whilst I can't argue against this as the obvious bugfix to the existing
code, I personally object to clawing back hash-table entries at all. The
size of the per-vcpu hashtable is small, and it should be perfectly possible
to reserve enough extra entries in the mapcache that an entry can always be
allocated, even when every vcpu's maphash buckets are in use.

Perhaps this is the right fix for 4.3 at this point, but in that case I am
quite inclined to simplify this down after 4.3, sidestepping the whole
issue.

 -- Keir
> --- a/xen/arch/x86/domain_page.c
> +++ b/xen/arch/x86/domain_page.c
> @@ -111,16 +111,17 @@ void *map_domain_page(unsigned long mfn)
>      idx = find_next_zero_bit(dcache->inuse, dcache->entries, dcache->cursor);
>      if ( unlikely(idx >= dcache->entries) )
>      {
> -        unsigned long accum = 0;
> +        unsigned long accum = 0, prev = 0;
>
>          /* /First/, clean the garbage map and update the inuse list. */
>          for ( i = 0; i < BITS_TO_LONGS(dcache->entries); i++ )
>          {
> +            accum |= prev;
>              dcache->inuse[i] &= ~xchg(&dcache->garbage[i], 0);
> -            accum |= ~dcache->inuse[i];
> +            prev = ~dcache->inuse[i];
>          }
>
> -        if ( accum )
> +        if ( accum | (prev & BITMAP_LAST_WORD_MASK(dcache->entries)) )
>              idx = find_first_zero_bit(dcache->inuse, dcache->entries);
>          else
>          {
> @@ -280,8 +281,7 @@ int mapcache_vcpu_init(struct vcpu *v)
>      if ( ents > dcache->entries )
>      {
>          /* Populate page tables. */
> -        int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START,
> -                                          d->max_vcpus * MAPCACHE_VCPU_ENTRIES,
> +        int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, ents,
>                                            NIL(l1_pgentry_t *), NULL);
>
>          /* Populate bit maps. */
>
>
>