From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>,
"H. Peter Anvin" <hpa@zytor.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
Russell King - ARM Linux <linux@arm.linux.org.uk>,
Tony Luck <tony.luck@intel.com>,
kirill.shutemov@linux.intel.com
Subject: Re: Dirty/Access bits vs. page content
Date: Tue, 22 Apr 2014 09:34:43 +0200
Message-ID: <20140422073443.GC11182@twins.programming.kicks-ass.net>
In-Reply-To: <CA+55aFzFxBDJ2rWo9DggdNsq-qBCr11OVXnm64jx04KMSVCBAw@mail.gmail.com>
> From 21819f790e3d206ad77cd20d6e7cae86311fc87d Mon Sep 17 00:00:00 2001
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Mon, 21 Apr 2014 15:29:49 -0700
> Subject: [PATCH 1/2] mm: move page table dirty state into TLB gather operation
>
> When tearing down a memory mapping, we have long delayed the actual
> freeing of the pages until after the (batched) TLB flush, since only
> after the TLB entries have been flushed from all CPU's do we know that
> none of the pages will be accessed any more.
>
> HOWEVER.
>
> Ben Herrenschmidt points out that we need to do the same thing for
> marking a shared mapped page dirty. Because if we mark the underlying
> page dirty before we have flushed the TLB's, other CPU's may happily
> continue to write to the page (using their stale TLB contents) after
> we've marked the page dirty, and they can thus race with any cleaning
> operation.
>
> Now, in practice, any page cleaning operations will take much longer to
> start the IO on the page than it will have taken us to get to the TLB
> flush, so this is going to be hard to trigger in real life. In fact, so
> far nobody has even come up with a reasonable test-case for this to show
> it happening.
>
> But what we do now (set_page_dirty() before flushing the TLB) really is
> wrong. And this commit does not fix it, but by moving the dirty
> handling into the TLB gather operation at least the internal interfaces
> now support the notion of those TLB gather interfaces doing the right
> thing.
>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Peter Anvin <hpa@zytor.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: linux-arch@vger.kernel.org
> Cc: linux-mm@kvack.org
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> ---
>  arch/arm/include/asm/tlb.h  |  6 ++++--
>  arch/ia64/include/asm/tlb.h |  6 ++++--
>  arch/s390/include/asm/tlb.h |  4 +++-
>  arch/sh/include/asm/tlb.h   |  6 ++++--
>  arch/um/include/asm/tlb.h   |  6 ++++--
>  include/asm-generic/tlb.h   |  4 ++--
>  mm/hugetlb.c                |  4 +---
>  mm/memory.c                 | 15 +++++++++------
>  8 files changed, 31 insertions(+), 20 deletions(-)
>
> diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
> index 0baf7f0d9394..ac9c16af8e63 100644
> --- a/arch/arm/include/asm/tlb.h
> +++ b/arch/arm/include/asm/tlb.h
> @@ -165,8 +165,10 @@ tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
>  		tlb_flush(tlb);
>  }
>
> -static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
> +static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page, bool dirty)
>  {
> +	if (dirty)
> +		set_page_dirty(page);
>  	tlb->pages[tlb->nr++] = page;
>  	VM_BUG_ON(tlb->nr > tlb->max);
>  	return tlb->max - tlb->nr;
> @@ -174,7 +176,7 @@ static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
>
>  static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
>  {
> -	if (!__tlb_remove_page(tlb, page))
> +	if (!__tlb_remove_page(tlb, page, 0))
>  		tlb_flush_mmu(tlb);
>  }

So I checked this, and currently the only users of tlb_remove_page() are
the archs, for freeing the page table pages, and THP. The latter is OK
because it is strictly anonymous memory (for now).

Anybody (/me looks at Kiryl) thinking of making THP work for shared
pages should also cure this, since tlb_remove_page() passes dirty=0
unconditionally.
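
For illustration only, here is a user-space toy of where an interface
like this could go: the gather step merely records the dirty flag, and
the page is marked dirty only once the flush has happened. This is NOT
what the quoted patch does (it still calls set_page_dirty() at gather
time), and every name below (toy_page, toy_gather, toy_remove_page,
toy_flush_mmu, GATHER_MAX) is made up for the sketch; the "flush" is
just a counter standing in for the real TLB shootdown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GATHER_MAX 8	/* hypothetical batch size */

/* Hypothetical stand-in for struct page: only the state we care about. */
struct toy_page {
	bool dirty;
	bool freed;
};

/* Hypothetical stand-in for struct mmu_gather. */
struct toy_gather {
	struct toy_page *pages[GATHER_MAX];
	bool dirty[GATHER_MAX];	/* dirty state remembered per queued page */
	size_t nr;
	unsigned int flushes;	/* counts simulated TLB flushes */
};

/*
 * Queue a page and remember whether its PTE was dirty. Nothing is
 * marked dirty here. Returns the number of free slots left; 0 means
 * the caller must flush before queueing more.
 */
static size_t toy_remove_page(struct toy_gather *tlb, struct toy_page *page,
			      bool dirty)
{
	assert(tlb->nr < GATHER_MAX);
	tlb->pages[tlb->nr] = page;
	tlb->dirty[tlb->nr] = dirty;
	tlb->nr++;
	return GATHER_MAX - tlb->nr;
}

/*
 * Simulated flush: only after the (pretend) TLB flush do we propagate
 * the dirty state and release the pages.
 */
static void toy_flush_mmu(struct toy_gather *tlb)
{
	size_t i;

	tlb->flushes++;	/* stand-in for the real TLB flush */

	for (i = 0; i < tlb->nr; i++) {
		if (tlb->dirty[i])
			tlb->pages[i]->dirty = true;
		tlb->pages[i]->freed = true;
	}
	tlb->nr = 0;
}
```

In this toy, a caller queues pages with toy_remove_page() and calls
toy_flush_mmu() when it returns 0 or at the end of the teardown; no
page is observed dirty before the flush counter has been bumped.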