* [PATCH] x86: fix map_domain_page() last resort fallback
From: Jan Beulich @ 2013-06-12 15:59 UTC
To: xen-devel; +Cc: George Dunlap, Keir Fraser, Konrad Rzeszutek Wilk
Guests with a vCPU count not divisible by 4 have unused bits in the last
word of their inuse bitmap, and the garbage collection code would therefore
be misled into believing that some entries were actually recoverable
for use.
Also use an earlier established local variable in mapcache_vcpu_init()
instead of re-calculating the value (noticed while investigating the
generally better option of setting those overhanging bits once during
setup - this didn't work out in a simple enough fashion because the
mapping getting established there isn't in the current address space,
and hence the bitmap isn't directly accessible there).
Reported-by: Konrad Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -111,16 +111,17 @@ void *map_domain_page(unsigned long mfn)
     idx = find_next_zero_bit(dcache->inuse, dcache->entries, dcache->cursor);
     if ( unlikely(idx >= dcache->entries) )
     {
-        unsigned long accum = 0;
+        unsigned long accum = 0, prev = 0;
 
         /* /First/, clean the garbage map and update the inuse list. */
         for ( i = 0; i < BITS_TO_LONGS(dcache->entries); i++ )
         {
+            accum |= prev;
             dcache->inuse[i] &= ~xchg(&dcache->garbage[i], 0);
-            accum |= ~dcache->inuse[i];
+            prev = ~dcache->inuse[i];
         }
 
-        if ( accum )
+        if ( accum | (prev & BITMAP_LAST_WORD_MASK(dcache->entries)) )
             idx = find_first_zero_bit(dcache->inuse, dcache->entries);
         else
         {
@@ -280,8 +281,7 @@ int mapcache_vcpu_init(struct vcpu *v)
     if ( ents > dcache->entries )
     {
         /* Populate page tables. */
-        int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START,
-                                          d->max_vcpus * MAPCACHE_VCPU_ENTRIES,
+        int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, ents,
                                           NIL(l1_pgentry_t *), NULL);
 
         /* Populate bit maps. */
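To see concretely why the unused trailing bits defeat the old "if ( accum )"
test, consider this minimal standalone sketch (illustration only, not Xen code;
the 64-bit word size and the figure of 16 mapcache entries per vCPU are assumed
here):

/* Standalone illustration of the last-word masking -- not Xen code. */
#include <stdio.h>

#define BITS_PER_LONG 64
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)
/* Mask covering only the valid low-order bits of a bitmap's last word. */
#define BITMAP_LAST_WORD_MASK(n) \
    (((n) % BITS_PER_LONG) ? ((1UL << ((n) % BITS_PER_LONG)) - 1) : ~0UL)

int main(void)
{
    /* Assume 3 vCPUs with 16 mapcache entries each: 48 valid bits, so the
     * single 64-bit inuse word has 16 unused, always-zero trailing bits. */
    unsigned long entries = 48;
    unsigned long inuse[BITS_TO_LONGS(48)] = { (1UL << 48) - 1 }; /* all busy */

    /* Old test: ~inuse also covers the 16 overhanging bits, so it is non-zero
     * and the caller wrongly concludes garbage collection freed an entry. */
    printf("old test:   %s\n", ~inuse[0] ? "recoverable (wrong)" : "none");

    /* Fixed test: only bits below 'entries' count in the last word. */
    printf("fixed test: %s\n",
           (~inuse[0] & BITMAP_LAST_WORD_MASK(entries)) ? "recoverable"
                                                        : "none (correct)");
    return 0;
}

The patch obtains the same effect inside the existing loop by deferring each
word's complement into prev and folding it into accum one iteration late: after
the loop only the final word remains in prev, and only that word gets masked,
while all earlier, fully populated words still feed accum unmasked.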
* Re: [PATCH] x86: fix map_domain_page() last resort fallback
From: Keir Fraser @ 2013-06-12 17:27 UTC
To: Jan Beulich, xen-devel; +Cc: George Dunlap, Konrad Rzeszutek Wilk
On 12/06/2013 16:59, "Jan Beulich" <JBeulich@suse.com> wrote:
> Guests with a vCPU count not divisible by 4 have unused bits in the last
> word of their inuse bitmap, and the garbage collection code would therefore
> be misled into believing that some entries were actually recoverable
> for use.
>
> Also use an earlier established local variable in mapcache_vcpu_init()
> instead of re-calculating the value (noticed while investigating the
> generally better option of setting those overhanging bits once during
> setup - this didn't work out in a simple enough fashion because the
> mapping getting established there isn't in the current address space,
> and hence the bitmap isn't directly accessible there).
>
> Reported-by: Konrad Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Whilst I can't argue against this as the obvious bugfix to the existing
code, I personally object to clawing back hash-table entries at all. The
size of the per-vCPU hashtable is small, and it should be perfectly possible
to allow enough extra entries in the mapcache that an entry can always be
allocated even when all vCPUs' maphash buckets are in use.
Perhaps this is the right fix for 4.3 at this point, but in that case I am
quite inclined to simplify this down after 4.3, sidestepping the whole
issue.
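To put rough numbers on that sizing argument, here is a sketch with purely
illustrative constants (they are assumptions, not the values Xen actually
uses):

/* Illustrative sizing only -- the real constants live in the Xen headers. */
#define MAPHASH_ENTRIES_GUESS        8   /* buckets a vCPU's maphash can pin */
#define MAX_OPEN_MAPS_GUESS          4   /* map_domain_page()s held at once  */
#define MAPCACHE_VCPU_ENTRIES_GUESS 16   /* mapcache slots added per vCPU    */

/* If every vCPU contributes more slots than its maphash can pin plus the
 * mappings it may hold open simultaneously, a free slot always exists and
 * the claw-back (garbage collection) path is never needed. */
_Static_assert(MAPCACHE_VCPU_ENTRIES_GUESS >=
               MAPHASH_ENTRIES_GUESS + MAX_OPEN_MAPS_GUESS,
               "mapcache sized so reclaiming maphash entries is unnecessary");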
-- Keir
> --- a/xen/arch/x86/domain_page.c
> +++ b/xen/arch/x86/domain_page.c
> @@ -111,16 +111,17 @@ void *map_domain_page(unsigned long mfn)
> idx = find_next_zero_bit(dcache->inuse, dcache->entries, dcache->cursor);
> if ( unlikely(idx >= dcache->entries) )
> {
> - unsigned long accum = 0;
> + unsigned long accum = 0, prev = 0;
>
> /* /First/, clean the garbage map and update the inuse list. */
> for ( i = 0; i < BITS_TO_LONGS(dcache->entries); i++ )
> {
> + accum |= prev;
> dcache->inuse[i] &= ~xchg(&dcache->garbage[i], 0);
> - accum |= ~dcache->inuse[i];
> + prev = ~dcache->inuse[i];
> }
>
> - if ( accum )
> + if ( accum | (prev & BITMAP_LAST_WORD_MASK(dcache->entries)) )
> idx = find_first_zero_bit(dcache->inuse, dcache->entries);
> else
> {
> @@ -280,8 +281,7 @@ int mapcache_vcpu_init(struct vcpu *v)
> if ( ents > dcache->entries )
> {
> /* Populate page tables. */
> - int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START,
> - d->max_vcpus * MAPCACHE_VCPU_ENTRIES,
> + int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, ents,
> NIL(l1_pgentry_t *), NULL);
>
> /* Populate bit maps. */
>
>
>
* Re: [PATCH] x86: fix map_domain_page() last resort fallback
From: Jan Beulich @ 2013-06-13 7:49 UTC
To: Keir Fraser; +Cc: George Dunlap, Konrad Rzeszutek Wilk, xen-devel
>>> On 12.06.13 at 19:27, Keir Fraser <keir.xen@gmail.com> wrote:
> On 12/06/2013 16:59, "Jan Beulich" <JBeulich@suse.com> wrote:
>
>> Guests with a vCPU count not divisible by 4 have unused bits in the last
>> word of their inuse bitmap, and the garbage collection code would therefore
>> be misled into believing that some entries were actually recoverable
>> for use.
>>
>> Also use an earlier established local variable in mapcache_vcpu_init()
>> instead of re-calculating the value (noticed while investigating the
>> generally better option of setting those overhanging bits once during
>> setup - this didn't work out in a simple enough fashion because the
>> mapping getting established there isn't in the current address space,
>> and hence the bitmap isn't directly accessible there).
>>
>> Reported-by: Konrad Wilk <konrad.wilk@oracle.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Whilst I can't argue against this as the obvious bugfix to the existing
> code, I personally object to clawing back hash-table entries at all. The
> size of the per-vCPU hashtable is small, and it should be perfectly possible
> to allow enough extra entries in the mapcache that an entry can always be
> allocated even when all vCPUs' maphash buckets are in use.
>
> Perhaps this is the right fix for 4.3 at this point, but in that case I am
> quite inclined to simplify this down after 4.3, sidestepping the whole
> issue.
I won't object to undoing this and moving the MAPHASH_ENTRIES
definition into config.h, but I also won't put my name to it.
Jan
* Re: [PATCH] x86: fix map_domain_page() last resort fallback
From: Keir Fraser @ 2013-06-13 8:06 UTC
To: Jan Beulich; +Cc: George Dunlap, Konrad Rzeszutek Wilk, xen-devel
On 13/06/2013 08:49, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>> On 12.06.13 at 19:27, Keir Fraser <keir.xen@gmail.com> wrote:
>> On 12/06/2013 16:59, "Jan Beulich" <JBeulich@suse.com> wrote:
>>
>>> Guests with a vCPU count not divisible by 4 have unused bits in the last
>>> word of their inuse bitmap, and the garbage collection code would therefore
>>> be misled into believing that some entries were actually recoverable
>>> for use.
>>>
>>> Also use an earlier established local variable in mapcache_vcpu_init()
>>> instead of re-calculating the value (noticed while investigating the
>>> generally better option of setting those overhanging bits once during
>>> setup - this didn't work out in a simple enough fashion because the
>>> mapping getting established there isn't in the current address space,
>>> and hence the bitmap isn't directly accessible there).
>>>
>>> Reported-by: Konrad Wilk <konrad.wilk@oracle.com>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>
>> Whilst I can't argue against this as the obvious bugfix to the existing
>> code, I personally object to clawing back hash-table entries at all. The
>> size of the per-vCPU hashtable is small, and it should be perfectly possible
>> to allow enough extra entries in the mapcache that an entry can always be
>> allocated even when all vCPUs' maphash buckets are in use.
>>
>> Perhaps this is the right fix for 4.3 at this point, but in that case I am
>> quite inclined to simplify this down after 4.3, sidestepping the whole
>> issue.
>
> I won't object to undoing this and moving the MAPHASH_ENTRIES
> definition into config.h, but I also won't put my name to it.
I think your fix is best for 4.3. Let's get it checked in.
Acked-by: Keir Fraser <keir@xen.org>
> Jan
>