public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] less tlb flush in unmap_vmas
@ 2006-03-22  2:38 Shaohua Li
  2006-03-22  4:52 ` Nick Piggin
  0 siblings, 1 reply; 7+ messages in thread
From: Shaohua Li @ 2006-03-22  2:38 UTC (permalink / raw)
  To: lkml; +Cc: Andrew Morton

In unmapping a region, if the current task doesn't need to reschedule, don't
do a tlb_finish_mmu. This avoids some tlb flushes.

In the lmbench tests, this patch gives a 2.1% improvement on the exec proc
item and 4.2% on the sh proc item.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
---

 linux-2.6.16-rc5-root/mm/memory.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff -puN mm/memory.c~less_flush mm/memory.c
--- linux-2.6.16-rc5/mm/memory.c~less_flush	2006-03-21 07:22:47.000000000 +0800
+++ linux-2.6.16-rc5-root/mm/memory.c	2006-03-21 07:26:51.000000000 +0800
@@ -837,19 +837,18 @@ unsigned long unmap_vmas(struct mmu_gath
 				break;
 			}
 
-			tlb_finish_mmu(*tlbp, tlb_start, start);
-
 			if (need_resched() ||
 				(i_mmap_lock && need_lockbreak(i_mmap_lock))) {
+				tlb_finish_mmu(*tlbp, tlb_start, start);
 				if (i_mmap_lock) {
 					*tlbp = NULL;
 					goto out;
 				}
 				cond_resched();
+				tlb_start_valid = 0;
+				*tlbp = tlb_gather_mmu(vma->vm_mm, fullmm);
 			}
 
-			*tlbp = tlb_gather_mmu(vma->vm_mm, fullmm);
-			tlb_start_valid = 0;
 			zap_work = ZAP_BLOCK_SIZE;
 		}
 	}
_
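
For illustration, here is a simplified, annotated sketch of the inner-loop
control flow that results from the hunk above. The identifiers follow the
patch, but the surrounding loop and declarations are paraphrased from
2.6.16-rc5 mm/memory.c rather than quoted verbatim:

	/*
	 * Hot path: while no reschedule (and no lock break) is pending,
	 * the mmu_gather keeps accumulating pages across ZAP_BLOCK_SIZE
	 * chunks instead of being flushed after every chunk.
	 */
	if (need_resched() ||
			(i_mmap_lock && need_lockbreak(i_mmap_lock))) {
		/* Flush only when we really have to give up the CPU. */
		tlb_finish_mmu(*tlbp, tlb_start, start);
		if (i_mmap_lock) {
			*tlbp = NULL;		/* caller must restart */
			goto out;
		}
		cond_resched();		/* safe: the gather is finished */
		tlb_start_valid = 0;
		*tlbp = tlb_gather_mmu(vma->vm_mm, fullmm);
	}
	zap_work = ZAP_BLOCK_SIZE;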



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] less tlb flush in unmap_vmas
  2006-03-22  2:38 [PATCH] less tlb flush in unmap_vmas Shaohua Li
@ 2006-03-22  4:52 ` Nick Piggin
  2006-03-22  7:15   ` Chen, Kenneth W
  0 siblings, 1 reply; 7+ messages in thread
From: Nick Piggin @ 2006-03-22  4:52 UTC (permalink / raw)
  To: Shaohua Li; +Cc: lkml, Andrew Morton

Shaohua Li wrote:

>In unmapping a region, if the current task doesn't need to reschedule,
>don't do a tlb_finish_mmu. This avoids some tlb flushes.
>
>In the lmbench tests, this patch gives a 2.1% improvement on the exec proc
>item and 4.2% on the sh proc item.
>
>

The problem with this is that by the time we _do_ determine that a
reschedule is needed, we might have built up a huge amount of work
to do (which can probably be as much if not more expensive per-page
than the unmapping itself), so scheduling latency can still be
unacceptable. I'm afraid I don't think we can include this patch.

One option I've been looking into is my "mmu gather in-place" that
never needs extra tlb flushes and is always preemptible... so that
may be a way forward.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] less tlb flush in unmap_vmas
  2006-03-22  4:52 ` Nick Piggin
@ 2006-03-22  7:15   ` Chen, Kenneth W
  2006-03-22  7:30     ` Nick Piggin
  0 siblings, 1 reply; 7+ messages in thread
From: Chen, Kenneth W @ 2006-03-22  7:15 UTC (permalink / raw)
  To: 'Nick Piggin', Li, Shaohua
  Cc: 'lkml', 'Andrew Morton'

Nick Piggin wrote on Tuesday, March 21, 2006 8:53 PM
> Shaohua Li wrote:
> >In unmapping a region, if the current task doesn't need to reschedule,
> >don't do a tlb_finish_mmu. This avoids some tlb flushes.
> >
> >In the lmbench tests, this patch gives a 2.1% improvement on the exec proc
> >item and 4.2% on the sh proc item.
> 
> The problem with this is that by the time we _do_ determine that a
> reschedule is needed, we might have built up a huge amount of work
> to do (which can probably be as much if not more expensive per-page
> than the unmapping itself), so scheduling latency can still be
> unacceptable. I'm afraid I don't think we can include this patch.

Interesting. In the old days, since mm->page_table_lock was held for the
entire unmap_vmas function, it was beneficial to introduce periodic
reschedule points and to drop the spin lock under pressure. Now that the
page table lock is fine-grained and has been pushed into zap_pte_range(), I
would think scheduling latency would improve from a lock contention
avoidance point of view.  Is that not the case?
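
As a rough sketch of the fine-grained locking I mean (this paraphrases the
zap_pte_range() pattern rather than quoting the 2.6.16 source, so treat the
details as approximate):

	static unsigned long zap_pte_range_sketch(struct mmu_gather *tlb,
				pmd_t *pmd, unsigned long addr,
				unsigned long end)
	{
		spinlock_t *ptl;
		pte_t *pte;

		/*
		 * The page table lock is taken per pte page, inside the
		 * leaf walker, instead of around all of unmap_vmas(),
		 * so contention is confined to a single pte page.
		 */
		pte = pte_offset_map_lock(tlb->mm, pmd, addr, &ptl);
		do {
			/* ... zap one pte, feed its page to the gather ... */
		} while (pte++, addr += PAGE_SIZE, addr != end);
		pte_unmap_unlock(pte - 1, ptl);
		return addr;
	}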

- Ken


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] less tlb flush in unmap_vmas
  2006-03-22  7:15   ` Chen, Kenneth W
@ 2006-03-22  7:30     ` Nick Piggin
  2006-03-22  7:44       ` Chen, Kenneth W
  0 siblings, 1 reply; 7+ messages in thread
From: Nick Piggin @ 2006-03-22  7:30 UTC (permalink / raw)
  To: Chen, Kenneth W; +Cc: Li, Shaohua, 'lkml', 'Andrew Morton'

Chen, Kenneth W wrote:
> Nick Piggin wrote on Tuesday, March 21, 2006 8:53 PM
> 
>>Shaohua Li wrote:
>>
>>>In unmapping a region, if the current task doesn't need to reschedule,
>>>don't do a tlb_finish_mmu. This avoids some tlb flushes.
>>>
>>>In the lmbench tests, this patch gives a 2.1% improvement on the exec proc
>>>item and 4.2% on the sh proc item.
>>
>>The problem with this is that by the time we _do_ determine that a
>>reschedule is needed, we might have built up a huge amount of work
>>to do (which can probably be as much if not more expensive per-page
>>than the unmapping itself), so scheduling latency can still be
>>unacceptable. I'm afraid I don't think we can include this patch.
> 
> 
> Interesting. In the old days, since mm->page_table_lock was held for the
> entire unmap_vmas function, it was beneficial to introduce periodic
> reschedule points and to drop the spin lock under pressure. Now that the
> page table lock is fine-grained and has been pushed into zap_pte_range(), I
> would think scheduling latency would improve from a lock contention
> avoidance point of view.  Is that not the case?
> 

Well, mmu_gather uses a per-cpu data structure and is non-preemptible,
which I guess is one of the main reasons why we have this preemption
point here.

You're right that another good reason would be ptl lock contention;
however, I don't think that alleviating that problem alone would allow
longer mmu_gather scheduling latencies, because the longest latency
is still the tlb_gather_mmu() <--> tlb_finish_mmu() span.
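
Roughly, paraphrasing the generic asm-generic/tlb.h of this era (the
details vary per architecture, so read this as a sketch rather than the
exact source):

	/*
	 * tlb_gather_mmu() hands out a per-cpu mmu_gather and implicitly
	 * disables preemption via get_cpu_var(); only tlb_finish_mmu()
	 * re-enables it with put_cpu_var().  The whole gather <-> finish
	 * span therefore runs with preemption off, no matter how little
	 * ptl contention there is.
	 */
	static inline struct mmu_gather *
	tlb_gather_mmu(struct mm_struct *mm, unsigned int full_mm_flush)
	{
		struct mmu_gather *tlb = &get_cpu_var(mmu_gathers);
		/* ... initialise tlb->mm, tlb->fullmm, the page list ... */
		return tlb;
	}

	static inline void
	tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start,
			unsigned long end)
	{
		/* ... flush the TLB and free the gathered pages ... */
		put_cpu_var(mmu_gathers);
	}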

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] less tlb flush in unmap_vmas
  2006-03-22  7:30     ` Nick Piggin
@ 2006-03-22  7:44       ` Chen, Kenneth W
  2006-03-22 10:13         ` Nick Piggin
  2006-03-27  5:01         ` Lee Revell
  0 siblings, 2 replies; 7+ messages in thread
From: Chen, Kenneth W @ 2006-03-22  7:44 UTC (permalink / raw)
  To: 'Nick Piggin'
  Cc: Li, Shaohua, 'lkml', 'Andrew Morton'

Nick Piggin wrote on Tuesday, March 21, 2006 11:30 PM
> > Chen, Kenneth W wrote:
> > Interesting. In the old days, since mm->page_table_lock was held for the
> > entire unmap_vmas function, it was beneficial to introduce periodic
> > reschedule points and to drop the spin lock under pressure. Now that the
> > page table lock is fine-grained and has been pushed into zap_pte_range(), I
> > would think scheduling latency would improve from a lock contention
> > avoidance point of view.  Is that not the case?
> > 
> 
> Well, mmu_gather uses a per-cpu data structure and is non-preemptible,
> which I guess is one of the main reasons why we have this preemption
> point here.
> 
> You're right that another good reason would be ptl lock contention;
> however, I don't think that alleviating that problem alone would allow
> longer mmu_gather scheduling latencies, because the longest latency
> is still the tlb_gather_mmu() <--> tlb_finish_mmu() span.

OK, I think it would be beneficial to take a latency measurement again,
just to see how it performs nowadays.  The dynamics might have changed.

- Ken


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] less tlb flush in unmap_vmas
  2006-03-22  7:44       ` Chen, Kenneth W
@ 2006-03-22 10:13         ` Nick Piggin
  2006-03-27  5:01         ` Lee Revell
  1 sibling, 0 replies; 7+ messages in thread
From: Nick Piggin @ 2006-03-22 10:13 UTC (permalink / raw)
  To: Chen, Kenneth W
  Cc: Li, Shaohua, 'lkml', 'Andrew Morton',
	Hugh Dickins

Chen, Kenneth W wrote:
> Nick Piggin wrote on Tuesday, March 21, 2006 11:30 PM
> 

>>Well, mmu_gather uses a per-cpu data structure and is non-preemptible,
>>which I guess is one of the main reasons why we have this preemption
>>point here.
>>
>>You're right that another good reason would be ptl lock contention;
>>however, I don't think that alleviating that problem alone would allow
>>longer mmu_gather scheduling latencies, because the longest latency
>>is still the tlb_gather_mmu() <--> tlb_finish_mmu() span.
> 
> 
> OK, I think it would be beneficial to take a latency measurement again,
> just to see how it performs nowadays.  The dynamics might have changed.
> 

Well, I wouldn't argue against further investigation or fine-tuning of
the present code; however, also remember that unconditionally finishing
the mmu_gather, which is what this patch aims to avoid, never actually
lowered ptl hold times itself.

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] less tlb flush in unmap_vmas
  2006-03-22  7:44       ` Chen, Kenneth W
  2006-03-22 10:13         ` Nick Piggin
@ 2006-03-27  5:01         ` Lee Revell
  1 sibling, 0 replies; 7+ messages in thread
From: Lee Revell @ 2006-03-27  5:01 UTC (permalink / raw)
  To: Chen, Kenneth W
  Cc: 'Nick Piggin', Li, Shaohua, 'lkml',
	'Andrew Morton', Ingo Molnar

On Tue, 2006-03-21 at 23:44 -0800, Chen, Kenneth W wrote:
> 
> OK, I think it would be beneficial to take a latency measurement again,
> just to see how it performs nowadays.  The dynamics might have changed.

I will test this with Ingo's latency tracer as soon as I get a chance.
I had previously posted results showing this to be a problem spot.

Lee


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2006-03-27  5:01 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-03-22  2:38 [PATCH] less tlb flush in unmap_vmas Shaohua Li
2006-03-22  4:52 ` Nick Piggin
2006-03-22  7:15   ` Chen, Kenneth W
2006-03-22  7:30     ` Nick Piggin
2006-03-22  7:44       ` Chen, Kenneth W
2006-03-22 10:13         ` Nick Piggin
2006-03-27  5:01         ` Lee Revell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox