* [PATCH 1/3] mm: Disable tlb_fast_mode() when mm has notifiers
2011-10-21 12:21 [RFC][PATCH 0/3] Add support for non-CPU TLBs in MMU-Notifiers Joerg Roedel
@ 2011-10-21 12:21 ` Joerg Roedel
2011-10-21 12:21 ` [PATCH 2/3] mmu_notifier: Add invalidate_range_free_pages() notifier Joerg Roedel
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Joerg Roedel @ 2011-10-21 12:21 UTC (permalink / raw)
To: Andrea Arcangeli, Rik van Riel, akpm, Hugh Dickins, Mel Gorman,
Nick Piggin
Cc: linux-mm, linux-kernel, joro, Joerg Roedel
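When MMU notifiers are used to manage non-CPU TLBs,
tlb_fast_mode() can no longer be used: in fast mode pages
are freed immediately instead of being gathered, so there is
no point at which an external TLB could be flushed before the
pages are freed. So disable it when an mm has notifiers.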
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
include/asm-generic/tlb.h | 2 +-
include/linux/mmu_notifier.h | 1 +
mm/memory.c | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index e58fa77..8c6cc1b 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -100,7 +100,7 @@ struct mmu_gather {
static inline int tlb_fast_mode(struct mmu_gather *tlb)
{
-#ifdef CONFIG_SMP
+#if defined(CONFIG_SMP) || defined(CONFIG_MMU_NOTIFIER)
return tlb->fast_mode;
#else
/*
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 1d1b1e1..b9469d6 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -373,6 +373,7 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
#define pmdp_clear_flush_notify pmdp_clear_flush
#define pmdp_splitting_flush_notify pmdp_splitting_flush
#define set_pte_at_notify set_pte_at
+#define mm_has_notifiers(mm) 0
#endif /* CONFIG_MMU_NOTIFIER */
diff --git a/mm/memory.c b/mm/memory.c
index a56e3ba..b31f9e0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -230,7 +230,7 @@ void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm, bool fullmm)
tlb->fullmm = fullmm;
tlb->need_flush = 0;
- tlb->fast_mode = (num_possible_cpus() == 1);
+ tlb->fast_mode = (num_possible_cpus() == 1) && !mm_has_notifiers(mm);
tlb->local.next = NULL;
tlb->local.nr = 0;
tlb->local.max = ARRAY_SIZE(tlb->__pages);
--
1.7.5.4
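The effect of the one-line change in tlb_gather_mmu() can be modelled in userspace. The following is only a minimal sketch: `mm_model`, `gather_model` and `gather_init()` are hypothetical stand-ins invented for illustration, not kernel interfaces, and `possible_cpus` plays the role of num_possible_cpus().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct mm_model {
	bool has_notifiers;	/* models mm_has_notifiers(mm) */
};

struct gather_model {
	bool fast_mode;		/* models tlb->fast_mode */
};

/*
 * Mirrors the patched assignment in tlb_gather_mmu(): fast mode is only
 * chosen on a single-CPU system whose mm has no notifiers registered.
 */
static void gather_init(struct gather_model *tlb, const struct mm_model *mm,
			unsigned int possible_cpus)
{
	assert(tlb != NULL && mm != NULL);
	tlb->fast_mode = (possible_cpus == 1) && !mm->has_notifiers;
}
```

Under this model a uniprocessor mm without notifiers still gets fast mode, while the same mm with a notifier registered falls back to the gathering path.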
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/3] mmu_notifier: Add invalidate_range_free_pages() notifier
2011-10-21 12:21 [RFC][PATCH 0/3] Add support for non-CPU TLBs in MMU-Notifiers Joerg Roedel
2011-10-21 12:21 ` [PATCH 1/3] mm: Disable tlb_fast_mode() when mm has notifiers Joerg Roedel
@ 2011-10-21 12:21 ` Joerg Roedel
2011-10-21 12:21 ` [PATCH 3/3] mmu_notifier: Call " Joerg Roedel
2011-11-02 9:46 ` [RFC][PATCH 0/3] Add support for non-CPU TLBs in MMU-Notifiers Robin Holt
3 siblings, 0 replies; 8+ messages in thread
From: Joerg Roedel @ 2011-10-21 12:21 UTC (permalink / raw)
To: Andrea Arcangeli, Rik van Riel, akpm, Hugh Dickins, Mel Gorman,
Nick Piggin
Cc: linux-mm, linux-kernel, joro, Joerg Roedel
This notifier closes an important gap in the current
invalidate_range_start()/end() notifiers. The _start() part
is called when all pages are still mapped, while the _end()
notifier is called when all pages are potentially unmapped
and already freed.
This makes it impossible to manage external (non-CPU) hardware
TLBs with MMU notifiers, because there is no way to prevent
the hardware from establishing new TLB entries between the
calls to these two functions. But preventing that is a
requirement on the subsystem that implements the existing notifiers.
To allow managing external TLBs, the MMU notifiers need to
catch the moment when pages are unmapped but not yet freed.
This new notifier catches that moment and notifies the
interested subsystem when pages that were unmapped are about
to be freed. The new notifier will only be called between
invalidate_range_start()/end().
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
include/linux/mmu_notifier.h | 32 +++++++++++++++++++++++++++-----
mm/mmu_notifier.c | 13 +++++++++++++
2 files changed, 40 insertions(+), 5 deletions(-)
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index b9469d6..199813f 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -94,11 +94,17 @@ struct mmu_notifier_ops {
/*
* invalidate_range_start() and invalidate_range_end() must be
* paired and are called only when the mmap_sem and/or the
- * locks protecting the reverse maps are held. The subsystem
- * must guarantee that no additional references are taken to
- * the pages in the range established between the call to
- * invalidate_range_start() and the matching call to
- * invalidate_range_end().
+ * locks protecting the reverse maps are held.
+ * invalidate_range_free_pages() is called between the two
+ * functions every time when the VM has unmapped pages that are
+ * about to be freed.
+ * The subsystem must guarantee that no additional references
+ * are taken to the pages in the range established between the
+ * call to invalidate_range_start() and the matching call to
+ * invalidate_range_end(). If this guarantee can not be given
+ * by the subsystem it has to make sure that additional
+ * references are dropped again in the
+ * invalidate_range_free_pages() notifier.
*
* Invalidation of multiple concurrent ranges may be
* optionally permitted by the driver. Either way the
@@ -109,6 +115,9 @@ struct mmu_notifier_ops {
* invalidate_range_start() is called when all pages in the
* range are still mapped and have at least a refcount of one.
*
+ * invalidate_range_free_pages() is called when a bunch of pages
+ * are unmapped but not yet freed by the VM.
+ *
* invalidate_range_end() is called when all pages in the
* range have been unmapped and the pages have been freed by
* the VM.
@@ -137,6 +146,8 @@ struct mmu_notifier_ops {
void (*invalidate_range_start)(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start, unsigned long end);
+ void (*invalidate_range_free_pages)(struct mmu_notifier *mn,
+ struct mm_struct *mm);
void (*invalidate_range_end)(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start, unsigned long end);
@@ -181,6 +192,7 @@ extern void __mmu_notifier_invalidate_page(struct mm_struct *mm,
unsigned long address);
extern void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
unsigned long start, unsigned long end);
+extern void __mmu_notifier_invalidate_range_free_pages(struct mm_struct *mm);
extern void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
unsigned long start, unsigned long end);
@@ -227,6 +239,12 @@ static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
__mmu_notifier_invalidate_range_start(mm, start, end);
}
+static inline void mmu_notifier_invalidate_range_free_pages(struct mm_struct *mm)
+{
+ if (mm_has_notifiers(mm))
+ __mmu_notifier_invalidate_range_free_pages(mm);
+}
+
static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
@@ -354,6 +372,10 @@ static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
{
}
+static inline void mmu_notifier_invalidate_range_free_pages(struct mm_struct *mm)
+{
+}
+
static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 8d032de..ec6b11b 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -168,6 +168,19 @@ void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
rcu_read_unlock();
}
+void __mmu_notifier_invalidate_range_free_pages(struct mm_struct *mm)
+{
+ struct mmu_notifier *mn;
+ struct hlist_node *n;
+
+ rcu_read_lock();
+ hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_mm->list, hlist) {
+ if (mn->ops->invalidate_range_free_pages)
+ mn->ops->invalidate_range_free_pages(mn, mm);
+ }
+ rcu_read_unlock();
+}
+
void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
--
1.7.5.4
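The dispatch pattern used by __mmu_notifier_invalidate_range_free_pages() (walk all registered notifiers, call the callback only where one is set) can be sketched in userspace as follows. This is a simplified model under stated assumptions: it uses a plain singly linked list without RCU, and `notifier_model`, `notifier_ops_model`, `count_call()` and `notify_free_pages()` are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stddef.h>

struct notifier_model;

/* Hypothetical ops table; the callback is optional, as in the patch. */
struct notifier_ops_model {
	void (*free_pages)(struct notifier_model *mn);
};

struct notifier_model {
	const struct notifier_ops_model *ops;
	struct notifier_model *next;	/* plain list instead of an RCU hlist */
	int calls;			/* bookkeeping for the example only */
};

/* Example callback: a real driver would flush its device TLB here. */
static void count_call(struct notifier_model *mn)
{
	mn->calls++;
}

/*
 * Walk every registered notifier and invoke the callback where one is
 * provided. Returns the number of callbacks invoked.
 */
static int notify_free_pages(struct notifier_model *head)
{
	int invoked = 0;

	for (struct notifier_model *mn = head; mn != NULL; mn = mn->next) {
		assert(mn->ops != NULL);
		if (mn->ops->free_pages) {
			mn->ops->free_pages(mn);
			invoked++;
		}
	}
	return invoked;
}
```

Because the callback pointer is checked before the call, existing users that leave it NULL are simply skipped in this model, which matches the NULL check in the patch.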
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 3/3] mmu_notifier: Call invalidate_range_free_pages() notifier
2011-10-21 12:21 [RFC][PATCH 0/3] Add support for non-CPU TLBs in MMU-Notifiers Joerg Roedel
2011-10-21 12:21 ` [PATCH 1/3] mm: Disable tlb_fast_mode() when mm has notifiers Joerg Roedel
2011-10-21 12:21 ` [PATCH 2/3] mmu_notifier: Add invalidate_range_free_pages() notifier Joerg Roedel
@ 2011-10-21 12:21 ` Joerg Roedel
2011-11-02 9:46 ` [RFC][PATCH 0/3] Add support for non-CPU TLBs in MMU-Notifiers Robin Holt
3 siblings, 0 replies; 8+ messages in thread
From: Joerg Roedel @ 2011-10-21 12:21 UTC (permalink / raw)
To: Andrea Arcangeli, Rik van Riel, akpm, Hugh Dickins, Mel Gorman,
Nick Piggin
Cc: linux-mm, linux-kernel, joro, Joerg Roedel
This patch adds the necessary calls to the new notifier.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
mm/hugetlb.c | 1 +
mm/memory.c | 11 +++++++++++
2 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index dae27ba..d08998d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2278,6 +2278,7 @@ void __unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
}
spin_unlock(&mm->page_table_lock);
flush_tlb_range(vma, start, end);
+ mmu_notifier_invalidate_range_free_pages(mm);
mmu_notifier_invalidate_range_end(mm, start, end);
list_for_each_entry_safe(page, tmp, &page_list, lru) {
page_remove_rmap(page);
diff --git a/mm/memory.c b/mm/memory.c
index b31f9e0..a5cc335 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1207,6 +1207,7 @@ again:
* and page-free while holding it.
*/
if (force_flush) {
+ mmu_notifier_invalidate_range_free_pages(mm);
force_flush = 0;
tlb_flush_mmu(tlb);
if (addr != end)
@@ -1359,6 +1360,16 @@ unsigned long unmap_vmas(struct mmu_gather *tlb,
}
}
+ /*
+ * In theory it would be sufficient to do the final flush for the last
+ * bunch of pages queued by mmu_gather in mn_invalidate_range_end().
+ * But that would break the API definition because in the _end notifier
+ * the called subsystem has to assume that the pages are already freed.
+ * So call mn_invalidate_range_free_pages() explicitly here for the
+ * final bunch of pages.
+ */
+ mmu_notifier_invalidate_range_free_pages(mm);
+
mmu_notifier_invalidate_range_end(mm, start_addr, end_addr);
return start; /* which is now the end (or restart) address */
}
--
1.7.5.4
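The ordering contract this series aims for (pages are unmapped, the notifier fires, and only then are the pages freed) can be illustrated with a small userspace trace. This is a deliberately simplified sketch, not the kernel call sequence: it frees every batch before the _end event, and `trace`, `record()` and `unmap_range_model()` are hypothetical names. The `device_tlb_stale` flag stands for an external TLB that may still hold translations for unmapped pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum event { EV_START, EV_UNMAP, EV_FREE_PAGES, EV_FREE, EV_END };

#define MAX_EVENTS 32

struct trace {
	enum event ev[MAX_EVENTS];
	size_t n;
	bool device_tlb_stale;	/* device may still cache unmapped pages */
	bool freed_while_stale;	/* set if pages were freed too early */
};

static void record(struct trace *t, enum event e)
{
	assert(t->n < MAX_EVENTS);
	t->ev[t->n++] = e;
}

/*
 * Models unmapping `batches` groups of pages. With use_free_pages set, the
 * hypothetical device TLB is flushed before each batch is freed.
 */
static void unmap_range_model(struct trace *t, int batches, bool use_free_pages)
{
	record(t, EV_START);
	for (int i = 0; i < batches; i++) {
		record(t, EV_UNMAP);
		t->device_tlb_stale = true;	/* device may have cached them */
		if (use_free_pages) {
			record(t, EV_FREE_PAGES);
			t->device_tlb_stale = false;	/* driver flushes here */
		}
		record(t, EV_FREE);
		if (t->device_tlb_stale)
			t->freed_while_stale = true;
	}
	record(t, EV_END);
}
```

Running the model with `use_free_pages` cleared shows the gap described in patch 2: pages are freed while the device may still hold stale translations. With the extra notifier, every free is preceded by a flush opportunity.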
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [RFC][PATCH 0/3] Add support for non-CPU TLBs in MMU-Notifiers
2011-10-21 12:21 [RFC][PATCH 0/3] Add support for non-CPU TLBs in MMU-Notifiers Joerg Roedel
` (2 preceding siblings ...)
2011-10-21 12:21 ` [PATCH 3/3] mmu_notifier: Call " Joerg Roedel
@ 2011-11-02 9:46 ` Robin Holt
2011-11-02 9:57 ` Roedel, Joerg
3 siblings, 1 reply; 8+ messages in thread
From: Robin Holt @ 2011-11-02 9:46 UTC (permalink / raw)
To: Joerg Roedel
Cc: Andrea Arcangeli, Rik van Riel, akpm, Hugh Dickins, Mel Gorman,
Nick Piggin, linux-mm, linux-kernel, joro
On Fri, Oct 21, 2011 at 02:21:45PM +0200, Joerg Roedel wrote:
> Hi,
>
> this is my first attempt to add support for non-CPU TLBs to the
> MMU-Notifier framework. This will be used by the AMD IOMMU driver for
> the next generation of hardware. The next version of the AMD IOMMU can
> walk page-tables in AMD64 long-mode format (with setting
> accessed/dirty-bits atomically) and save the translations in its own
> TLB. Page faulting for IO devices is supported too. This will be used to
> let hardware devices share page-tables with CPU processes and access
> their memory directly. Please look at
>
> http://support.amd.com/us/Processor_TechDocs/48882.pdf
...
Did this patch set get any review or traction? Perhaps you should have
included the linux-mm@kvack.org mailing list.
Robin
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC][PATCH 0/3] Add support for non-CPU TLBs in MMU-Notifiers
2011-11-02 9:46 ` [RFC][PATCH 0/3] Add support for non-CPU TLBs in MMU-Notifiers Robin Holt
@ 2011-11-02 9:57 ` Roedel, Joerg
2011-11-03 0:07 ` Andrea Arcangeli
0 siblings, 1 reply; 8+ messages in thread
From: Roedel, Joerg @ 2011-11-02 9:57 UTC (permalink / raw)
To: Robin Holt
Cc: Andrea Arcangeli, Rik van Riel, akpm@linux-foundation.org,
Hugh Dickins, Mel Gorman, Nick Piggin, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, joro@8bytes.org
On Wed, Nov 02, 2011 at 05:46:02AM -0400, Robin Holt wrote:
> On Fri, Oct 21, 2011 at 02:21:45PM +0200, Joerg Roedel wrote:
> > Hi,
> >
> > this is my first attempt to add support for non-CPU TLBs to the
> > MMU-Notifier framework. This will be used by the AMD IOMMU driver for
> > the next generation of hardware. The next version of the AMD IOMMU can
> > walk page-tables in AMD64 long-mode format (with setting
> > accessed/dirty-bits atomically) and save the translations in its own
> > TLB. Page faulting for IO devices is supported too. This will be used to
> > let hardware devices share page-tables with CPU processes and access
> > their memory directly. Please look at
> >
> > http://support.amd.com/us/Processor_TechDocs/48882.pdf
>
> ...
>
> Did this patch set get any review or traction? Perhaps you should have
> included the linux-mm@kvack.org mailing list.
I have included this mailing list in my post afaics. I talked with
Andrea Arcangeli about these patches at LinuxCon in Prague and will post
a new version based on his comments.
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC][PATCH 0/3] Add support for non-CPU TLBs in MMU-Notifiers
2011-11-02 9:57 ` Roedel, Joerg
@ 2011-11-03 0:07 ` Andrea Arcangeli
2011-11-03 10:18 ` Roedel, Joerg
0 siblings, 1 reply; 8+ messages in thread
From: Andrea Arcangeli @ 2011-11-03 0:07 UTC (permalink / raw)
To: Roedel, Joerg
Cc: Robin Holt, Rik van Riel, akpm@linux-foundation.org, Hugh Dickins,
Mel Gorman, Nick Piggin, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, joro@8bytes.org
On Wed, Nov 02, 2011 at 10:57:40AM +0100, Roedel, Joerg wrote:
> I have included this mailing list in my post afaics. I talked with
> Andrea Arcangeli about these patches at LinuxCon in Prague and will post
> a new version based on his comments.
Thanks!
Andrea
PS. Luckily I already read your patches before we met :).
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC][PATCH 0/3] Add support for non-CPU TLBs in MMU-Notifiers
2011-11-03 0:07 ` Andrea Arcangeli
@ 2011-11-03 10:18 ` Roedel, Joerg
0 siblings, 0 replies; 8+ messages in thread
From: Roedel, Joerg @ 2011-11-03 10:18 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Robin Holt, Rik van Riel, akpm@linux-foundation.org, Hugh Dickins,
Mel Gorman, Nick Piggin, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, joro@8bytes.org
On Wed, Nov 02, 2011 at 08:07:25PM -0400, Andrea Arcangeli wrote:
> On Wed, Nov 02, 2011 at 10:57:40AM +0100, Roedel, Joerg wrote:
> > I have included this mailing list in my post afaics. I talked with
> > Andrea Arcangeli about these patches at LinuxCon in Prague and will post
> > a new version based on his comments.
>
> Thanks!
> Andrea
>
> PS. Luckily I already read your patches before we met :).
Yeah, otherwise we would have done some kind of peer-review ;)
Joerg
^ permalink raw reply [flat|nested] 8+ messages in thread