From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
Rik van Riel <riel@redhat.com>,
Hugh Dickins <hugh.dickins@tiscali.co.uk>,
Mel Gorman <mel@csn.ul.ie>, Nick Piggin <npiggin@kernel.dk>,
Alex Shi <alex.shi@intel.com>,
"Nikunj A. Dadhania" <nikunj@linux.vnet.ibm.com>,
Konrad Rzeszutek Wilk <konrad@darnok.org>,
David Miller <davem@davemloft.net>,
Russell King <rmk@arm.linux.org.uk>,
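Chris Metcalf <cmetcalf@tilera.com>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Tony Luck <tony.luck@intel.com>, Paul Mundt <lethal@linux-sh.org>,
Jeff Dike <jdike@addtoit.com>,
Richard Weinberger <richard@nod.at>,
Ralf Baechle <ralf@linux-mips.org>,
Kyle McMartin <kyle@mcmartin.ca>,
James Bottomley <jejb@parisc-linux.org>,
Chris Zankel <chris@zankel.net>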
Subject: Re: [PATCH 08/20] mm: Optimize fullmm TLB flushing
Date: Sat, 30 Jun 2012 08:11:43 +1000
Message-ID: <1341007903.2563.41.camel@pasglop>
In-Reply-To: <20120629152645.GG17837@arm.com>
On Fri, 2012-06-29 at 16:26 +0100, Catalin Marinas wrote:
> On Thu, Jun 28, 2012 at 10:57:21PM +0100, Benjamin Herrenschmidt wrote:
> > On Thu, 2012-06-28 at 18:52 +0200, Peter Zijlstra wrote:
> > > No I think you're right (as always).. also an IPI will not force
> > > schedule the thread that might be running on the receiving cpu, also
> > > we'd have to wait for any such schedule to complete in order to
> > > guarantee the mm isn't lazily used anymore.
> > >
> > > Bugger..
> >
> > You can still do it if the mm count is 1 no ? Ie, current is the last
> > holder of a reference to the mm struct... which will probably be the
> > common case for short lived programs.
>
> BTW, can we not move the free_pgtables() call in exit_mmap() to
> __mmdrop()? Something like below but I'm not entirely sure about its
> implications:
The main one is that it might remain active on another core for a
-loooong- time if that core is only running kernel threads or is
otherwise idle, thus wasting memory etc...
Also, mm_count being 1 is probably the common case for many short lived
processes, so it should be fine. I don't think the count can ever
increase back at that point, can it? (We could make sure it doesn't:
mark the mm as dead and WARN loudly if somebody tries to increase the
count.)
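
To make the "count is 1, mark it dead" idea concrete, here is a rough
userspace model, not kernel code. The struct, the "dead" flag and both
helper names are made up for illustration, and the sketch ignores the
race between checking the count and setting the flag, which a real
implementation would have to close:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant bits of struct mm_struct. */
struct mm_model {
	atomic_int count;	/* models mm_count */
	atomic_bool dead;	/* hypothetical "no new references" marker */
};

/* Take a reference; complain loudly and refuse if the mm is dead. */
static bool mm_model_grab(struct mm_model *mm)
{
	if (atomic_load(&mm->dead)) {
		fprintf(stderr, "WARNING: reference taken on dead mm\n");
		return false;
	}
	atomic_fetch_add(&mm->count, 1);
	return true;
}

/*
 * Exit-time fast path: if the caller holds the only reference, mark the
 * mm dead so the count cannot grow again and report that the page
 * tables can be torn down without involving other CPUs.
 */
static bool mm_model_try_exclusive_teardown(struct mm_model *mm)
{
	if (atomic_load(&mm->count) != 1)
		return false;
	atomic_store(&mm->dead, true);
	return true;
}
```

If the fast path fails (count larger than 1), the caller would fall back
to the slower path discussed next.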
The advantage of doing a "detach & flush" IPI if the count is larger is
that you already do the IPI for flushing anyway, so you just add a
detach to the path.
That avoids the problem of the mm staying around for too long as well.
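
Roughly what I mean by "detach & flush", again as a toy userspace model
rather than real kernel code. The per-CPU struct, the counters and the
function names are all invented for the sketch, and the loop stands in
for sending the IPI to the CPUs that still have the mm loaded:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mm with a plain reference count. */
struct mm_lazy {
	int count;
};

/* Hypothetical per-CPU state: which mm is loaded, and a flush counter. */
struct cpu_model {
	struct mm_lazy *active_mm;
	int tlb_flushes;
};

/* Models init_mm, the mm a CPU falls back to after detaching. */
static struct mm_lazy init_mm_model = { 1 };

/* What the handler on one targeted CPU would do. */
static void detach_and_flush(struct cpu_model *cpu, struct mm_lazy *dying)
{
	/* Detach: stop borrowing the dying mm, switch to init_mm. */
	cpu->active_mm = &init_mm_model;
	init_mm_model.count++;
	dying->count--;
	/* Flush: the part the IPI was being sent for anyway. */
	cpu->tlb_flushes++;
}

/* "Send the IPI" to every CPU that still has the dying mm loaded. */
static void detach_and_flush_lazy_users(struct cpu_model *cpus, size_t n,
					struct mm_lazy *dying)
{
	for (size_t i = 0; i < n; i++)
		if (cpus[i].active_mm == dying)
			detach_and_flush(&cpus[i], dying);
}
```

Once this returns, the only reference left in the model is the exiting
task's own, so the mm does not linger on idle cores.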
Cheers,
Ben.
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index b36d08c..507ee9f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1372,6 +1372,7 @@ extern void unlink_file_vma(struct vm_area_struct *);
>  extern struct vm_area_struct *copy_vma(struct vm_area_struct **,
>  	unsigned long addr, unsigned long len, pgoff_t pgoff);
>  extern void exit_mmap(struct mm_struct *);
> +extern void exit_pgtables(struct mm_struct *mm);
>
>  extern int mm_take_all_locks(struct mm_struct *mm);
>  extern void mm_drop_all_locks(struct mm_struct *mm);
> diff --git a/kernel/fork.c b/kernel/fork.c
> index ab5211b..3412b1a 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -588,6 +588,7 @@ struct mm_struct *mm_alloc(void)
>  void __mmdrop(struct mm_struct *mm)
>  {
>  	BUG_ON(mm == &init_mm);
> +	exit_pgtables(mm);
>  	mm_free_pgd(mm);
>  	destroy_context(mm);
>  	mmu_notifier_mm_destroy(mm);
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 074b487..d9ebfdb 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2269,7 +2269,6 @@ void exit_mmap(struct mm_struct *mm)
>  {
>  	struct mmu_gather tlb;
>  	struct vm_area_struct *vma;
> -	unsigned long nr_accounted = 0;
>
>  	/* mm's last user has gone, and its about to be pulled down */
>  	mmu_notifier_release(mm);
> @@ -2291,11 +2290,23 @@ void exit_mmap(struct mm_struct *mm)
>
>  	lru_add_drain();
>  	flush_cache_mm(mm);
> -	tlb_gather_mmu(&tlb, mm, 1);
> +	tlb_gather_mmu(&tlb, mm, 0);
>  	/* update_hiwater_rss(mm) here? but nobody should be looking */
>  	/* Use -1 here to ensure all VMAs in the mm are unmapped */
>  	unmap_vmas(&tlb, vma, 0, -1);
> +	tlb_finish_mmu(&tlb, 0, -1);
> +}
> +
> +void exit_pgtables(struct mm_struct *mm)
> +{
> +	struct mmu_gather tlb;
> +	struct vm_area_struct *vma;
> +	unsigned long nr_accounted = 0;
>
> +	vma = mm->mmap;
> +	if (!vma)
> +		return;
> +	tlb_gather_mmu(&tlb, mm, 1);
>  	free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, TASK_SIZE);
>  	tlb_finish_mmu(&tlb, 0, -1);
>
>