* [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().
@ 2017-11-10 7:18 Yu Zhang
2017-11-10 7:18 ` [PATCH v2 2/2] x86/mm: fix a potential race condition in modify_xen_mappings() Yu Zhang
2017-11-10 9:49 ` [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen() Jan Beulich
0 siblings, 2 replies; 11+ messages in thread
From: Yu Zhang @ 2017-11-10 7:18 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, min.he, Jan Beulich, yi.z.zhang
From: Min He <min.he@intel.com>
In map_pages_to_xen(), a L2 page table entry may be reset to point to
a superpage, and its corresponding L1 page table need be freed in such
scenario, when these L1 page table entries are mapping to consecutive
page frames and having the same mapping flags.
However, variable `pl1e` is not protected by the lock before L1 page table
is enumerated. A race condition may happen if this code path is invoked
simultaneously on different CPUs.
For example, `pl1e` value on CPU0 may hold an obsolete value, pointing
to a page which has just been freed on CPU1. Besides, before this page
is reused, it will still be holding the old PTEs, referencing consecutive
page frames. Consequently the `free_xen_pagetable(l2e_to_l1e(ol2e))` will
be triggered on CPU0, resulting the unexpected free of a normal page.
This patch fixes the potential race condition by protecting the `pl1e`
with the lock, and checking the PSE flag of the `pl2e`.
Note: PSE flag of `pl3e` is also checked before its re-consolidation,
for the same reason we do for `pl2e` - we cannot presume the contents
of the target superpage.
Signed-off-by: Min He <min.he@intel.com>
Signed-off-by: Yi Zhang <yi.z.zhang@intel.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Changes in v2:
According to comments from Jan Beulich,
- check PSE of pl2e and pl3e, and skip the re-consolidation if set.
- commit message changes, e.g. add "From :" tag etc.
- code style changes.
- introduce a seperate patch to resolve the similar issue in
modify_xen_mappings().
---
xen/arch/x86/mm.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a20fdca..47855fb 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4844,9 +4844,19 @@ int map_pages_to_xen(
{
unsigned long base_mfn;
- pl1e = l2e_to_l1e(*pl2e);
if ( locking )
spin_lock(&map_pgdir_lock);
+
+ /* Skip the re-consolidation if it's done on another CPU. */
+ if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
+ {
+ if ( locking )
+ spin_unlock(&map_pgdir_lock);
+ goto check_l3;
+ }
+
+ ol2e = *pl2e;
+ pl1e = l2e_to_l1e(ol2e);
base_mfn = l1e_get_pfn(*pl1e) & ~(L1_PAGETABLE_ENTRIES - 1);
for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++, pl1e++ )
if ( (l1e_get_pfn(*pl1e) != (base_mfn + i)) ||
@@ -4854,7 +4864,6 @@ int map_pages_to_xen(
break;
if ( i == L1_PAGETABLE_ENTRIES )
{
- ol2e = *pl2e;
l2e_write_atomic(pl2e, l2e_from_pfn(base_mfn,
l1f_to_lNf(flags)));
if ( locking )
@@ -4880,6 +4889,15 @@ int map_pages_to_xen(
if ( locking )
spin_lock(&map_pgdir_lock);
+
+ /* Skip the re-consolidation if it's done on another CPU. */
+ if ( l3e_get_flags(*pl3e) & _PAGE_PSE )
+ {
+ if ( locking )
+ spin_unlock(&map_pgdir_lock);
+ continue;
+ }
+
ol3e = *pl3e;
pl2e = l3e_to_l2e(ol3e);
base_mfn = l2e_get_pfn(*pl2e) & ~(L2_PAGETABLE_ENTRIES *
--
2.5.0
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply related [flat|nested] 11+ messages in thread* [PATCH v2 2/2] x86/mm: fix a potential race condition in modify_xen_mappings().
2017-11-10 7:18 [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen() Yu Zhang
@ 2017-11-10 7:18 ` Yu Zhang
2017-11-10 9:57 ` Jan Beulich
2017-11-10 9:49 ` [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen() Jan Beulich
1 sibling, 1 reply; 11+ messages in thread
From: Yu Zhang @ 2017-11-10 7:18 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper, min.he, Jan Beulich, yi.z.zhang
In modify_xen_mappings(), a L1/L2 page table may be freed,
if all entries of this page table are empty. Corresponding
L2/L3 PTE will need be cleared in such scenario.
However, logic to enumerate the L1/L2 page table and to reset
the corresponding L2/L3 PTE need to be protected with spinlock.
Otherwise, the paging structure may be freed more than once, if
the same routine is invoked simultaneously on different CPUs.
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
xen/arch/x86/mm.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 47855fb..c07c528 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5097,6 +5097,17 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
*/
if ( (nf & _PAGE_PRESENT) || ((v != e) && (l1_table_offset(v) != 0)) )
continue;
+ if ( locking )
+ spin_lock(&map_pgdir_lock);
+
+ /* L2E may be cleared on another CPU. */
+ if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
+ {
+ if ( locking )
+ spin_unlock(&map_pgdir_lock);
+ goto check_l3;
+ }
+
pl1e = l2e_to_l1e(*pl2e);
for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
if ( l1e_get_intpte(pl1e[i]) != 0 )
@@ -5105,11 +5116,16 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
{
/* Empty: zap the L2E and free the L1 page. */
l2e_write_atomic(pl2e, l2e_empty());
+ if ( locking )
+ spin_unlock(&map_pgdir_lock);
flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
free_xen_pagetable(pl1e);
}
+ else if ( locking )
+ spin_unlock(&map_pgdir_lock);
}
+check_l3:
/*
* If we are not destroying mappings, or not done with the L3E,
* skip the empty&free check.
@@ -5117,6 +5133,17 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
if ( (nf & _PAGE_PRESENT) ||
((v != e) && (l2_table_offset(v) + l1_table_offset(v) != 0)) )
continue;
+ if ( locking )
+ spin_lock(&map_pgdir_lock);
+
+ /* L3E may be cleared on another CPU. */
+ if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
+ {
+ if ( locking )
+ spin_unlock(&map_pgdir_lock);
+ continue;
+ }
+
pl2e = l3e_to_l2e(*pl3e);
for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
if ( l2e_get_intpte(pl2e[i]) != 0 )
@@ -5125,9 +5152,13 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
{
/* Empty: zap the L3E and free the L2 page. */
l3e_write_atomic(pl3e, l3e_empty());
+ if ( locking )
+ spin_unlock(&map_pgdir_lock);
flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
free_xen_pagetable(pl2e);
}
+ else if ( locking )
+ spin_unlock(&map_pgdir_lock);
}
flush_area(NULL, FLUSH_TLB_GLOBAL);
--
2.5.0
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH v2 2/2] x86/mm: fix a potential race condition in modify_xen_mappings().
2017-11-10 7:18 ` [PATCH v2 2/2] x86/mm: fix a potential race condition in modify_xen_mappings() Yu Zhang
@ 2017-11-10 9:57 ` Jan Beulich
2017-11-10 14:02 ` Yu Zhang
0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2017-11-10 9:57 UTC (permalink / raw)
To: Yu Zhang; +Cc: Andrew Cooper, min.he, xen-devel, yi.z.zhang
>>> On 10.11.17 at 08:18, <yu.c.zhang@linux.intel.com> wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5097,6 +5097,17 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
> */
> if ( (nf & _PAGE_PRESENT) || ((v != e) && (l1_table_offset(v) != 0)) )
> continue;
> + if ( locking )
> + spin_lock(&map_pgdir_lock);
> +
> + /* L2E may be cleared on another CPU. */
> + if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
I think you also need a PSE check here, or else the l2e_to_l1e() below
may be illegal.
> @@ -5105,11 +5116,16 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
> {
> /* Empty: zap the L2E and free the L1 page. */
> l2e_write_atomic(pl2e, l2e_empty());
> + if ( locking )
> + spin_unlock(&map_pgdir_lock);
> flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
> free_xen_pagetable(pl1e);
> }
> + else if ( locking )
> + spin_unlock(&map_pgdir_lock);
> }
>
> +check_l3:
Labels indented by at least one space please.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH v2 2/2] x86/mm: fix a potential race condition in modify_xen_mappings().
2017-11-10 9:57 ` Jan Beulich
@ 2017-11-10 14:02 ` Yu Zhang
2017-11-13 9:33 ` Jan Beulich
0 siblings, 1 reply; 11+ messages in thread
From: Yu Zhang @ 2017-11-10 14:02 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, min.he, xen-devel, yi.z.zhang
On 11/10/2017 5:57 PM, Jan Beulich wrote:
>>>> On 10.11.17 at 08:18, <yu.c.zhang@linux.intel.com> wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -5097,6 +5097,17 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
>> */
>> if ( (nf & _PAGE_PRESENT) || ((v != e) && (l1_table_offset(v) != 0)) )
>> continue;
>> + if ( locking )
>> + spin_lock(&map_pgdir_lock);
>> +
>> + /* L2E may be cleared on another CPU. */
>> + if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
> I think you also need a PSE check here, or else the l2e_to_l1e() below
> may be illegal.
Hmm, interesting point, and thanks! :-)
I did not check the PSE, because modify_xen_mappings() will not do the
re-consolidation, and
concurrent invokes of this routine will not change this flag. But now I
believe this presumption
shall not be made, because the paging structures may be modified by
other routines, like
map_pages_to_xen() on other CPUs.
So yes, I think a _PAGE_PSE check is necessary here. And I suggest we
also check the _PAGE_PRESENT
flag as well, for the re-consolidation part in my first patch for
map_pages_to_xen(). Do you agree?
>> @@ -5105,11 +5116,16 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
>> {
>> /* Empty: zap the L2E and free the L1 page. */
>> l2e_write_atomic(pl2e, l2e_empty());
>> + if ( locking )
>> + spin_unlock(&map_pgdir_lock);
>> flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
>> free_xen_pagetable(pl1e);
>> }
>> + else if ( locking )
>> + spin_unlock(&map_pgdir_lock);
>> }
>>
>> +check_l3:
> Labels indented by at least one space please.
Got it . Thanks.
Yu
>
> Jan
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH v2 2/2] x86/mm: fix a potential race condition in modify_xen_mappings().
2017-11-10 14:02 ` Yu Zhang
@ 2017-11-13 9:33 ` Jan Beulich
0 siblings, 0 replies; 11+ messages in thread
From: Jan Beulich @ 2017-11-13 9:33 UTC (permalink / raw)
To: Yu Zhang; +Cc: Andrew Cooper, min.he, xen-devel, yi.z.zhang
>>> On 10.11.17 at 15:02, <yu.c.zhang@linux.intel.com> wrote:
> On 11/10/2017 5:57 PM, Jan Beulich wrote:
>>>>> On 10.11.17 at 08:18, <yu.c.zhang@linux.intel.com> wrote:
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -5097,6 +5097,17 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
>>> */
>>> if ( (nf & _PAGE_PRESENT) || ((v != e) && (l1_table_offset(v) != 0)) )
>>> continue;
>>> + if ( locking )
>>> + spin_lock(&map_pgdir_lock);
>>> +
>>> + /* L2E may be cleared on another CPU. */
>>> + if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
>> I think you also need a PSE check here, or else the l2e_to_l1e() below
>> may be illegal.
>
> Hmm, interesting point, and thanks! :-)
> I did not check the PSE, because modify_xen_mappings() will not do the
> re-consolidation, and
> concurrent invokes of this routine will not change this flag. But now I
> believe this presumption
> shall not be made, because the paging structures may be modified by
> other routines, like
> map_pages_to_xen() on other CPUs.
>
> So yes, I think a _PAGE_PSE check is necessary here. And I suggest we
> also check the _PAGE_PRESENT
> flag as well, for the re-consolidation part in my first patch for
> map_pages_to_xen(). Do you agree?
Oh, yes, definitely. I should have noticed this myself.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().
2017-11-10 7:18 [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen() Yu Zhang
2017-11-10 7:18 ` [PATCH v2 2/2] x86/mm: fix a potential race condition in modify_xen_mappings() Yu Zhang
@ 2017-11-10 9:49 ` Jan Beulich
2017-11-10 14:05 ` Yu Zhang
1 sibling, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2017-11-10 9:49 UTC (permalink / raw)
To: Yu Zhang; +Cc: Andrew Cooper, min.he, xen-devel, yi.z.zhang
>>> On 10.11.17 at 08:18, <yu.c.zhang@linux.intel.com> wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -4844,9 +4844,19 @@ int map_pages_to_xen(
> {
> unsigned long base_mfn;
>
> - pl1e = l2e_to_l1e(*pl2e);
> if ( locking )
> spin_lock(&map_pgdir_lock);
> +
> + /* Skip the re-consolidation if it's done on another CPU. */
> + if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
> + {
> + if ( locking )
> + spin_unlock(&map_pgdir_lock);
> + goto check_l3;
> + }
> +
> + ol2e = *pl2e;
This wants moving up ahead of the if(), so you can use the local
variable inside that if().
> @@ -4880,6 +4889,15 @@ int map_pages_to_xen(
>
> if ( locking )
> spin_lock(&map_pgdir_lock);
> +
> + /* Skip the re-consolidation if it's done on another CPU. */
> + if ( l3e_get_flags(*pl3e) & _PAGE_PSE )
> + {
> + if ( locking )
> + spin_unlock(&map_pgdir_lock);
> + continue;
> + }
> +
> ol3e = *pl3e;
Same here - move the if() below here and use ol3e in there.
With that
Reviewed-by: Jan Beulich <jbeulich@suse.com>
I'm not certain this is important enough a fix to consider for 4.10,
and you seem to think it's good enough if this gets applied only
after the tree would be branched, as you didn't Cc Julien. Please
indicate if you actually simply weren't aware, and you indeed
there's an important aspect to this that I'm overlooking.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().
2017-11-10 9:49 ` [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen() Jan Beulich
@ 2017-11-10 14:05 ` Yu Zhang
2017-11-13 9:31 ` Jan Beulich
0 siblings, 1 reply; 11+ messages in thread
From: Yu Zhang @ 2017-11-10 14:05 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, min.he, yi.z.zhang, Julien Grall, xen-devel
On 11/10/2017 5:49 PM, Jan Beulich wrote:
>>>> On 10.11.17 at 08:18, <yu.c.zhang@linux.intel.com> wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -4844,9 +4844,19 @@ int map_pages_to_xen(
>> {
>> unsigned long base_mfn;
>>
>> - pl1e = l2e_to_l1e(*pl2e);
>> if ( locking )
>> spin_lock(&map_pgdir_lock);
>> +
>> + /* Skip the re-consolidation if it's done on another CPU. */
>> + if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
>> + {
>> + if ( locking )
>> + spin_unlock(&map_pgdir_lock);
>> + goto check_l3;
>> + }
>> +
>> + ol2e = *pl2e;
> This wants moving up ahead of the if(), so you can use the local
> variable inside that if().
Got it. Thanks.
>> @@ -4880,6 +4889,15 @@ int map_pages_to_xen(
>>
>> if ( locking )
>> spin_lock(&map_pgdir_lock);
>> +
>> + /* Skip the re-consolidation if it's done on another CPU. */
>> + if ( l3e_get_flags(*pl3e) & _PAGE_PSE )
>> + {
>> + if ( locking )
>> + spin_unlock(&map_pgdir_lock);
>> + continue;
>> + }
>> +
>> ol3e = *pl3e;
> Same here - move the if() below here and use ol3e in there.
>
> With that
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> I'm not certain this is important enough a fix to consider for 4.10,
> and you seem to think it's good enough if this gets applied only
> after the tree would be branched, as you didn't Cc Julien. Please
> indicate if you actually simply weren't aware, and you indeed
> there's an important aspect to this that I'm overlooking.
Well, at first I have not expected this to be accepted for 4.10. But
since we have
met this issue in practice, when running a graphic application which
consumes
memory intensively in dom0, I think it also makes sense if we can fix it
in xen's
release as early as possible. Do you think this is a reasonable
requirement? :-)
Also Cc to Julien. Julien, any comment? :-)
Yu
> Jan
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().
2017-11-10 14:05 ` Yu Zhang
@ 2017-11-13 9:31 ` Jan Beulich
2017-11-13 10:34 ` Yu Zhang
0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2017-11-13 9:31 UTC (permalink / raw)
To: Yu Zhang; +Cc: Andrew Cooper, Julien Grall, xen-devel, min.he, yi.z.zhang
>>> On 10.11.17 at 15:05, <yu.c.zhang@linux.intel.com> wrote:
> On 11/10/2017 5:49 PM, Jan Beulich wrote:
>> I'm not certain this is important enough a fix to consider for 4.10,
>> and you seem to think it's good enough if this gets applied only
>> after the tree would be branched, as you didn't Cc Julien. Please
>> indicate if you actually simply weren't aware, and you indeed
>> there's an important aspect to this that I'm overlooking.
>
> Well, at first I have not expected this to be accepted for 4.10. But
> since we have
> met this issue in practice, when running a graphic application which
> consumes
> memory intensively in dom0, I think it also makes sense if we can fix it
> in xen's
> release as early as possible. Do you think this is a reasonable
> requirement? :-)
You'd need to provide further details for us to understand the
scenario. It obviously depends on whether you have other
patches to Xen which actually trigger this. If the problem can
be triggered from outside of a vanilla upstream Xen, then yes,
I think I would favor the fixes being included.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().
2017-11-13 9:31 ` Jan Beulich
@ 2017-11-13 10:34 ` Yu Zhang
2017-11-13 11:06 ` Jan Beulich
0 siblings, 1 reply; 11+ messages in thread
From: Yu Zhang @ 2017-11-13 10:34 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Julien Grall, xen-devel, min.he, yi.z.zhang
On 11/13/2017 5:31 PM, Jan Beulich wrote:
>>>> On 10.11.17 at 15:05, <yu.c.zhang@linux.intel.com> wrote:
>> On 11/10/2017 5:49 PM, Jan Beulich wrote:
>>> I'm not certain this is important enough a fix to consider for 4.10,
>>> and you seem to think it's good enough if this gets applied only
>>> after the tree would be branched, as you didn't Cc Julien. Please
>>> indicate if you actually simply weren't aware, and you indeed
>>> there's an important aspect to this that I'm overlooking.
>> Well, at first I have not expected this to be accepted for 4.10. But
>> since we have
>> met this issue in practice, when running a graphic application which
>> consumes
>> memory intensively in dom0, I think it also makes sense if we can fix it
>> in xen's
>> release as early as possible. Do you think this is a reasonable
>> requirement? :-)
> You'd need to provide further details for us to understand the
> scenario. It obviously depends on whether you have other
> patches to Xen which actually trigger this. If the problem can
> be triggered from outside of a vanilla upstream Xen, then yes,
> I think I would favor the fixes being included.
Thank, Jan. Let me try to give an explaination of the scenario. :-)
We saw an ASSERT failue in ASSERT((page->count_info & PGC_count_mask) != 0)
in is_iomem_page() <- put_page_from_l1e() <- alloc_l1_table(), when we run a
graphic application(which is a memory eater, but close sourced) in dom0.
And this
failure only happens when dom0 is configured with 2 vCPUs.
Our debug showed the concerned page->count_info was already(and
unexpectedly)
cleared in free_xenheap_pages(), and the call trace should be like this:
free_xenheap_pages()
^
|
free_xen_pagetable()
^
|
map_pages_to_xen()
^
|
update_xen_mappings()
^
|
get_page_from_l1e()
^
|
mod_l1_entry()
^
|
do_mmu_update()
And we then realized that it happened when dom0 tries to update its page
table,
and when the cache attributes are gonna be changed for referenced page
frame,
corresponding mappings for xen VA space will be updated by
map_pages_to_xen()
as well.
However, since routine map_pages_to_xen() has the aforementioned racing
problem,
when MMU_NORMAL_PT_UPDATE is triggered concurrently on different CPUs, it
may mistakenly free a superpage referenced by pl2e. That's why our
ASSERT failure
only happens when dom0 has more than one vCPU configured.
As to the code base, we were running XenGT code, which has only a few
non-upstreamed
patches in Xen - I believe most of them are libxl related ones, and none
of them is
mmu related. So I believe this issue could be triggered by a pv guest to
a vanilla
upstream xen.
Is above description convincing enough? :-)
Yu
>
> Jan
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().
2017-11-13 10:34 ` Yu Zhang
@ 2017-11-13 11:06 ` Jan Beulich
2017-11-13 11:22 ` Julien Grall
0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2017-11-13 11:06 UTC (permalink / raw)
To: Yu Zhang; +Cc: Andrew Cooper, Julien Grall, xen-devel, min.he, yi.z.zhang
>>> On 13.11.17 at 11:34, <yu.c.zhang@linux.intel.com> wrote:
> Our debug showed the concerned page->count_info was already(and
> unexpectedly)
> cleared in free_xenheap_pages(), and the call trace should be like this:
>
> free_xenheap_pages()
> ^
> |
> free_xen_pagetable()
> ^
> |
> map_pages_to_xen()
> ^
> |
> update_xen_mappings()
> ^
> |
> get_page_from_l1e()
> ^
> |
> mod_l1_entry()
> ^
> |
> do_mmu_update()
This ...
> Is above description convincing enough? :-)
... is indeed enough for me to suggest to Julien that we take both
patches (once they're ready). But it's his decision in the end.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen().
2017-11-13 11:06 ` Jan Beulich
@ 2017-11-13 11:22 ` Julien Grall
0 siblings, 0 replies; 11+ messages in thread
From: Julien Grall @ 2017-11-13 11:22 UTC (permalink / raw)
To: Jan Beulich, Yu Zhang
Cc: Andrew Cooper, Julien Grall, yi.z.zhang, min.he, xen-devel
Hi,
On 11/13/2017 11:06 AM, Jan Beulich wrote:
>>>> On 13.11.17 at 11:34, <yu.c.zhang@linux.intel.com> wrote:
>> Our debug showed the concerned page->count_info was already(and
>> unexpectedly)
>> cleared in free_xenheap_pages(), and the call trace should be like this:
>>
>> free_xenheap_pages()
>> ^
>> |
>> free_xen_pagetable()
>> ^
>> |
>> map_pages_to_xen()
>> ^
>> |
>> update_xen_mappings()
>> ^
>> |
>> get_page_from_l1e()
>> ^
>> |
>> mod_l1_entry()
>> ^
>> |
>> do_mmu_update()
>
> This ...
>
>> Is above description convincing enough? :-)
>
> ... is indeed enough for me to suggest to Julien that we take both
> patches (once they're ready). But it's his decision in the end.
I will wait the series to be ready before giving my release-ack.
Cheers,
> Jan
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2017-11-13 11:22 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-10 7:18 [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen() Yu Zhang
2017-11-10 7:18 ` [PATCH v2 2/2] x86/mm: fix a potential race condition in modify_xen_mappings() Yu Zhang
2017-11-10 9:57 ` Jan Beulich
2017-11-10 14:02 ` Yu Zhang
2017-11-13 9:33 ` Jan Beulich
2017-11-10 9:49 ` [PATCH v2 1/2] x86/mm: fix a potential race condition in map_pages_to_xen() Jan Beulich
2017-11-10 14:05 ` Yu Zhang
2017-11-13 9:31 ` Jan Beulich
2017-11-13 10:34 ` Yu Zhang
2017-11-13 11:06 ` Jan Beulich
2017-11-13 11:22 ` Julien Grall
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).