* apparent regressions from TLB range flushing page set
@ 2012-08-20 14:12 Jan Beulich
2012-08-22 3:23 ` Alex Shi
2012-08-22 3:26 ` Alex Shi
0 siblings, 2 replies; 9+ messages in thread
From: Jan Beulich @ 2012-08-20 14:12 UTC (permalink / raw)
To: Alex Shi; +Cc: Konrad Rzeszutek Wilk, linux-kernel, hpa
Alex,
without even having run that code yet, I think I see two bugs here,
both of which I'm pretty sure I pointed out at least once during the
review cycle:
For one, while TLB_FLUSH_ALL gets passed as 'end' argument to
flush_tlb_others(), the Xen code was made to check its 'start'
parameter.
Second, the UV code doesn't flush the full range at all, it simply
ignores its 'end' parameter (and hence also the "all" indicator).
While the Xen one is simple to fix, I have no clue what to do for
the UV code.
Jan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: apparent regressions from TLB range flushing page set
2012-08-20 14:12 apparent regressions from TLB range flushing page set Jan Beulich
@ 2012-08-22 3:23 ` Alex Shi
2012-08-22 7:39 ` Jan Beulich
2012-08-22 3:26 ` Alex Shi
1 sibling, 1 reply; 9+ messages in thread
From: Alex Shi @ 2012-08-22 3:23 UTC (permalink / raw)
To: Jan Beulich; +Cc: Konrad Rzeszutek Wilk, linux-kernel, hpa
On 08/20/2012 10:12 PM, Jan Beulich wrote:
> Alex,
>
> without even having run that code yet, I think I see two bugs here,
> both of which I'm pretty sure I pointed out at least once during the
> review cycle:
I was thought you have 'Agreed' for xen part code. :)
>
> For one, while TLB_FLUSH_ALL gets passed as 'end' argument to
> flush_tlb_others(), the Xen code was made to check its 'start'
> parameter.
Do you mean need the following change? --untested.
>From the logical of flush_tlb_others, the old code should cause unflushed TLB issue
(when end == -1, xen just executed INVLPG_MULTI, not whole TLB_FLUSH_MULTI). but we
didn't find this problem in xen test. So, guess the xen code has other bug cover this.
=========
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index b65a761..5141d80 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1283,7 +1283,7 @@ static void xen_flush_tlb_others(const struct cpumask *cpus,
cpumask_clear_cpu(smp_processor_id(), to_cpumask(args->mask));
args->op.cmd = MMUEXT_TLB_FLUSH_MULTI;
- if (start != TLB_FLUSH_ALL && (end - start) <= PAGE_SIZE) {
+ if (end != TLB_FLUSH_ALL && (end - start) <= PAGE_SIZE) {
args->op.cmd = MMUEXT_INVLPG_MULTI;
args->op.arg1.linear_addr = start;
}
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: apparent regressions from TLB range flushing page set
2012-08-22 3:23 ` Alex Shi
@ 2012-08-22 7:39 ` Jan Beulich
2012-08-22 8:54 ` Alex Shi
0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2012-08-22 7:39 UTC (permalink / raw)
To: alex.shi; +Cc: konrad.wilk, linux-kernel, hpa
>>> Alex Shi <alex.shi@intel.com> 08/22/12 5:24 AM >>>
>On 08/20/2012 10:12 PM, Jan Beulich wrote:
>I was thought you have 'Agreed' for xen part code. :)
I had agreed to it being done the right way, and I had pointed out the
problem once. I can't say for sure that I looked at the most recent rev
closely enough to spot the issue still being unfixed.
>> For one, while TLB_FLUSH_ALL gets passed as 'end' argument to
>> flush_tlb_others(), the Xen code was made to check its 'start'
>> parameter.
>
>Do you mean need the following change? --untested.
Yes. I'd question though whether for that special case it shouldn't be
start _and_ end to get passed the special value.
Jan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: apparent regressions from TLB range flushing page set
2012-08-22 7:39 ` Jan Beulich
@ 2012-08-22 8:54 ` Alex Shi
2012-08-22 13:22 ` Jan Beulich
0 siblings, 1 reply; 9+ messages in thread
From: Alex Shi @ 2012-08-22 8:54 UTC (permalink / raw)
To: Jan Beulich; +Cc: konrad.wilk, linux-kernel, hpa
On 08/22/2012 03:39 PM, Jan Beulich wrote:
>>>> Alex Shi <alex.shi@intel.com> 08/22/12 5:24 AM >>>
>> On 08/20/2012 10:12 PM, Jan Beulich wrote:
>> I was thought you have 'Agreed' for xen part code. :)
>
> I had agreed to it being done the right way, and I had pointed out the
> problem once. I can't say for sure that I looked at the most recent rev
> closely enough to spot the issue still being unfixed.
>
>>> For one, while TLB_FLUSH_ALL gets passed as 'end' argument to
>>> flush_tlb_others(), the Xen code was made to check its 'start'
>>> parameter.
>>
>> Do you mean need the following change? --untested.
>
> Yes. I'd question though whether for that special case it shouldn't be
> start _and_ end to get passed the special value.
Actually the special value is already there in old code.
so, what's your meaning of the question?
>
> Jan
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: apparent regressions from TLB range flushing page set
2012-08-22 8:54 ` Alex Shi
@ 2012-08-22 13:22 ` Jan Beulich
2012-08-24 8:15 ` Alex Shi
0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2012-08-22 13:22 UTC (permalink / raw)
To: Alex Shi; +Cc: konrad.wilk, linux-kernel, hpa
>>> On 22.08.12 at 10:54, Alex Shi <alex.shi@intel.com> wrote:
> On 08/22/2012 03:39 PM, Jan Beulich wrote:
>
>>>>> Alex Shi <alex.shi@intel.com> 08/22/12 5:24 AM >>>
>>> On 08/20/2012 10:12 PM, Jan Beulich wrote:
>>> I was thought you have 'Agreed' for xen part code. :)
>>
>> I had agreed to it being done the right way, and I had pointed out the
>> problem once. I can't say for sure that I looked at the most recent rev
>> closely enough to spot the issue still being unfixed.
>>
>>>> For one, while TLB_FLUSH_ALL gets passed as 'end' argument to
>>>> flush_tlb_others(), the Xen code was made to check its 'start'
>>>> parameter.
>>>
>>> Do you mean need the following change? --untested.
>>
>> Yes. I'd question though whether for that special case it shouldn't be
>> start _and_ end to get passed the special value.
>
>
> Actually the special value is already there in old code.
> so, what's your meaning of the question?
I'm saying that I'd rather see
#define flush_tlb_mm(mm) flush_tlb_mm_range(mm, TLB_FLUSH_ALL, TLB_FLUSH_ALL, 0UL)
Jan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: apparent regressions from TLB range flushing page set
2012-08-22 13:22 ` Jan Beulich
@ 2012-08-24 8:15 ` Alex Shi
0 siblings, 0 replies; 9+ messages in thread
From: Alex Shi @ 2012-08-24 8:15 UTC (permalink / raw)
To: Jan Beulich; +Cc: konrad.wilk, linux-kernel, hpa
On 08/22/2012 09:22 PM, Jan Beulich wrote:
>>>> On 22.08.12 at 10:54, Alex Shi <alex.shi@intel.com> wrote:
>> On 08/22/2012 03:39 PM, Jan Beulich wrote:
>>
>>>>>> Alex Shi <alex.shi@intel.com> 08/22/12 5:24 AM >>>
>>>> On 08/20/2012 10:12 PM, Jan Beulich wrote:
>>>> I was thought you have 'Agreed' for xen part code. :)
>>>
>>> I had agreed to it being done the right way, and I had pointed out the
>>> problem once. I can't say for sure that I looked at the most recent rev
>>> closely enough to spot the issue still being unfixed.
>>>
>>>>> For one, while TLB_FLUSH_ALL gets passed as 'end' argument to
>>>>> flush_tlb_others(), the Xen code was made to check its 'start'
>>>>> parameter.
>>>>
>>>> Do you mean need the following change? --untested.
>>>
>>> Yes. I'd question though whether for that special case it shouldn't be
>>> start _and_ end to get passed the special value.
>>
>>
>> Actually the special value is already there in old code.
>> so, what's your meaning of the question?
>
> I'm saying that I'd rather see
>
> #define flush_tlb_mm(mm) flush_tlb_mm_range(mm, TLB_FLUSH_ALL, TLB_FLUSH_ALL, 0UL)
It bring logical confusing, and is no much help.
flush_tlb_mm_range still will call:
flush_tlb_others(mm_cpumask(mm), mm, 0UL, TLB_FLUSH_ALL);
So, since we already fix code error, we'd better not to do this change.
>
> Jan
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: apparent regressions from TLB range flushing page set
2012-08-20 14:12 apparent regressions from TLB range flushing page set Jan Beulich
2012-08-22 3:23 ` Alex Shi
@ 2012-08-22 3:26 ` Alex Shi
2012-08-22 7:44 ` Jan Beulich
1 sibling, 1 reply; 9+ messages in thread
From: Alex Shi @ 2012-08-22 3:26 UTC (permalink / raw)
To: Jan Beulich; +Cc: Konrad Rzeszutek Wilk, linux-kernel, hpa
> Second, the UV code doesn't flush the full range at all, it simply
> ignores its 'end' parameter (and hence also the "all" indicator).
>
Sure. the following rfc patch try to fix it. untested since no hardware.
=====
diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index a06983c..ac6f326 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -225,8 +225,10 @@ struct bau_local_cpumask {
* The payload is software-defined for INTD transactions
*/
struct bau_msg_payload {
- unsigned long address; /* signifies a page or all
- TLB's of the cpu */
+ unsigned long start; /* start address to flush TLB
+ of the cpu */
+ unsigned long end; /* end address to flush TLB
+ of the cpu */
/* 64 bits */
unsigned short sending_cpu; /* filled in by sender */
/* 16 bits */
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index b8b3a37..c603d15 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -280,12 +280,12 @@ static void bau_process_message(struct msg_desc *mdp, struct bau_control *bcp,
/*
* This must be a normal message, or retry of a normal message
*/
- if (msg->address == TLB_FLUSH_ALL) {
+ if (msg->end == 0) {
+ __flush_tlb_one(msg->start);
+ stat->d_onetlb++;
+ } else {
local_flush_tlb();
stat->d_alltlb++;
- } else {
- __flush_tlb_one(msg->address);
- stat->d_onetlb++;
}
stat->d_requestee++;
@@ -1034,7 +1034,8 @@ static int set_distrib_bits(struct cpumask *flush_mask, struct bau_control *bcp,
* globally purge translation cache of a virtual address or all TLB's
* @cpumask: mask of all cpu's in which the address is to be removed
* @mm: mm_struct containing virtual address range
- * @va: virtual address to be removed (or TLB_FLUSH_ALL for all TLB's on cpu)
+ * @start: start virtual address to be removed from TLB
+ * @end: end virtual address to be remove from TLB
* @cpu: the current cpu
*
* This is the entry point for initiating any UV global TLB shootdown.
@@ -1113,7 +1114,8 @@ const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
record_send_statistics(stat, locals, hubs, remotes, bau_desc);
- bau_desc->payload.address = start;
+ bau_desc->payload.start = start;
+ bau_desc->payload.end = end;
bau_desc->payload.sending_cpu = cpu;
/*
* uv_flush_send_and_wait returns 0 if all cpu's were messaged,
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: apparent regressions from TLB range flushing page set
2012-08-22 3:26 ` Alex Shi
@ 2012-08-22 7:44 ` Jan Beulich
2012-08-22 7:53 ` Alex Shi
0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2012-08-22 7:44 UTC (permalink / raw)
To: alex.shi; +Cc: konrad.wilk, linux-kernel, hpa
>>> Alex Shi <alex.shi@intel.com> 08/22/12 5:27 AM >>>
>> Second, the UV code doesn't flush the full range at all, it simply
>> ignores its 'end' parameter (and hence also the "all" indicator).
>
>Sure. the following rfc patch try to fix it. untested since no hardware.
Sure - this needs to be looked at by a person knowing UV (and I would
have thought a change like the one we're discussing here would also
have required an ack from such a person), but ...
>--- a/arch/x86/platform/uv/tlb_uv.c
>+++ b/arch/x86/platform/uv/tlb_uv.c
>@@ -280,12 +280,12 @@ static void bau_process_message(struct msg_desc *mdp, struct bau_control *bcp,
> /*
> * This must be a normal message, or retry of a normal message
> */
>- if (msg->address == TLB_FLUSH_ALL) {
>+ if (msg->end == 0) {
How would "end" end up being 0 here? Don't you rather mean "start and
end on the same page"? And even if you do, aren't you then losing the
intended optimization?
>+ __flush_tlb_one(msg->start);
>+ stat->d_onetlb++;
>+ } else {
> local_flush_tlb();
> stat->d_alltlb++;
>- } else {
>- __flush_tlb_one(msg->address);
>- stat->d_onetlb++;
> }
> stat->d_requestee++;
>
>@@ -1113,7 +1114,8 @@ const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
>
> record_send_statistics(stat, locals, hubs, remotes, bau_desc);
>
>- bau_desc->payload.address = start;
>+ bau_desc->payload.start = start;
>+ bau_desc->payload.end = end;
> bau_desc->payload.sending_cpu = cpu;
> /*
> * uv_flush_send_and_wait returns 0 if all cpu's were messaged,
Jan
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: apparent regressions from TLB range flushing page set
2012-08-22 7:44 ` Jan Beulich
@ 2012-08-22 7:53 ` Alex Shi
0 siblings, 0 replies; 9+ messages in thread
From: Alex Shi @ 2012-08-22 7:53 UTC (permalink / raw)
To: Jan Beulich; +Cc: konrad.wilk, linux-kernel, hpa
On 08/22/2012 03:44 PM, Jan Beulich wrote:
>>>> Alex Shi <alex.shi@intel.com> 08/22/12 5:27 AM >>>
>>> Second, the UV code doesn't flush the full range at all, it simply
>>> ignores its 'end' parameter (and hence also the "all" indicator).
>>
>> Sure. the following rfc patch try to fix it. untested since no hardware.
>
> Sure - this needs to be looked at by a person knowing UV (and I would
> have thought a change like the one we're discussing here would also
> have required an ack from such a person), but ...
>
>> --- a/arch/x86/platform/uv/tlb_uv.c
>> +++ b/arch/x86/platform/uv/tlb_uv.c
>> @@ -280,12 +280,12 @@ static void bau_process_message(struct msg_desc *mdp, struct bau_control *bcp,
>> /*
>> * This must be a normal message, or retry of a normal message
>> */
>> - if (msg->address == TLB_FLUSH_ALL) {
>> + if (msg->end == 0) {
>
> How would "end" end up being 0 here? Don't you rather mean "start and
> end on the same page"?
yes,
> And even if you do, aren't you then losing the
> intended optimization?
Sure, TLB optimization is relatively complex and specific on different
hardware. I am not sure this platform needs this, and even so, I can not
give reasonable flushall_shift value. So, it is better to recover the
system as before.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2012-08-24 8:15 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-20 14:12 apparent regressions from TLB range flushing page set Jan Beulich
2012-08-22 3:23 ` Alex Shi
2012-08-22 7:39 ` Jan Beulich
2012-08-22 8:54 ` Alex Shi
2012-08-22 13:22 ` Jan Beulich
2012-08-24 8:15 ` Alex Shi
2012-08-22 3:26 ` Alex Shi
2012-08-22 7:44 ` Jan Beulich
2012-08-22 7:53 ` Alex Shi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).