From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Christoph Lameter <clameter@sgi.com>
Cc: Andrea Arcangeli <andrea@qumranet.com>, Robin Holt <holt@sgi.com>,
kvm-devel@lists.sourceforge.net,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
general@lists.openfabrics.org, steiner@sgi.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch 01/10] emm: mm_lock: Lock a process against reclaim
Date: Fri, 04 Apr 2008 16:12:42 -0700
Message-ID: <47F6B5EA.6060106@goop.org>
In-Reply-To: <20080404223131.271668133@sgi.com>

Christoph Lameter wrote:
> Provide a way to lock an mm_struct against reclaim (try_to_unmap
> etc). This is necessary for the invalidate notifier approaches so
> that they can reliably add and remove a notifier.
>
> Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
>
> ---
> include/linux/mm.h | 10 ++++++++
> mm/mmap.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 76 insertions(+)
>
> Index: linux-2.6/include/linux/mm.h
> ===================================================================
> --- linux-2.6.orig/include/linux/mm.h 2008-04-02 11:41:47.741678873 -0700
> +++ linux-2.6/include/linux/mm.h 2008-04-04 15:02:17.660504756 -0700
> @@ -1050,6 +1050,16 @@ extern int install_special_mapping(struc
> unsigned long addr, unsigned long len,
> unsigned long flags, struct page **pages);
>
> +/*
> + * Locking and unlocking am mm against reclaim.
> + *
> + * mm_lock will take mmap_sem writably (to prevent additional vmas from being
> + * added) and then take all mapping locks of the existing vmas. With that
> + * reclaim is effectively stopped.
> + */
> +extern void mm_lock(struct mm_struct *mm);
> +extern void mm_unlock(struct mm_struct *mm);
> +
> extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
>
> extern unsigned long do_mmap_pgoff(struct file *file, unsigned long addr,
> Index: linux-2.6/mm/mmap.c
> ===================================================================
> --- linux-2.6.orig/mm/mmap.c 2008-04-04 14:55:03.477593980 -0700
> +++ linux-2.6/mm/mmap.c 2008-04-04 14:59:05.505395402 -0700
> @@ -2242,3 +2242,69 @@ int install_special_mapping(struct mm_st
>
> return 0;
> }
> +
> +static void mm_lock_unlock(struct mm_struct *mm, int lock)
> +{
> + struct vm_area_struct *vma;
> + spinlock_t *i_mmap_lock_last, *anon_vma_lock_last;
> +
> + i_mmap_lock_last = NULL;
> + for (;;) {
> + spinlock_t *i_mmap_lock = (spinlock_t *) -1UL;
> + for (vma = mm->mmap; vma; vma = vma->vm_next)
> + if (vma->vm_file && vma->vm_file->f_mapping &&
>

I think you can break this if() down a bit:

			if (!(vma->vm_file && vma->vm_file->f_mapping))
				continue;

> + (unsigned long) i_mmap_lock >
> + (unsigned long)
> + &vma->vm_file->f_mapping->i_mmap_lock &&
> + (unsigned long)
> + &vma->vm_file->f_mapping->i_mmap_lock >
> + (unsigned long) i_mmap_lock_last)
> + i_mmap_lock =
> + &vma->vm_file->f_mapping->i_mmap_lock;
>

So this is an O(n^2) algorithm to take the i_mmap_locks in ascending
address order?  A comment would be nice.  And O(n^2)?  Ouch.  How often
is it called?

And is it necessary to mush lock and unlock together?  Unlock ordering
doesn't matter, so you should just be able to have a much simpler loop, no?
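
To make the question concrete, here is a rough userspace sketch, not
kernel code and not tested against the patch: gather the lock pointers
into an array, sort it once by address, skip duplicates, and take each
distinct lock lowest address first.  That is O(n log n) instead of
O(n^2), and unlock becomes a plain loop.  struct fake_lock, lock_all()
and unlock_all() are made-up names for illustration only, and where the
array would come from in the kernel is left open.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for spinlock_t; it only counts lock depth. */
struct fake_lock {
	int held;
};

static void fake_lock_acquire(struct fake_lock *l) { l->held++; }
static void fake_lock_release(struct fake_lock *l) { l->held--; }

/* Compare two array slots by the address of the lock they point to. */
static int cmp_lock_addr(const void *a, const void *b)
{
	uintptr_t x = (uintptr_t)*(struct fake_lock * const *)a;
	uintptr_t y = (uintptr_t)*(struct fake_lock * const *)b;

	return (x > y) - (x < y);
}

/*
 * Sort locks[] by address, drop duplicates (several vmas can share one
 * mapping), and take each distinct lock once, lowest address first.
 * Returns the number of distinct locks taken; they end up compacted
 * into locks[0..ret).
 */
static size_t lock_all(struct fake_lock **locks, size_t n)
{
	size_t i, out = 0;

	qsort(locks, n, sizeof(*locks), cmp_lock_addr);
	for (i = 0; i < n; i++) {
		if (out && locks[out - 1] == locks[i])
			continue;	/* same lock seen via another vma */
		locks[out++] = locks[i];
		fake_lock_acquire(locks[i]);
	}
	return out;
}

/* Unlock order doesn't matter, so a plain loop over the array will do. */
static void unlock_all(struct fake_lock **locks, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		fake_lock_release(locks[i]);
}
```

The caller would pass the count returned by lock_all() to unlock_all(),
since duplicates have been squeezed out of the array by then.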
> + if (i_mmap_lock == (spinlock_t *) -1UL)
> + break;
> + i_mmap_lock_last = i_mmap_lock;
> + if (lock)
> + spin_lock(i_mmap_lock);
> + else
> + spin_unlock(i_mmap_lock);
> + }
> +
> + anon_vma_lock_last = NULL;
> + for (;;) {
> + spinlock_t *anon_vma_lock = (spinlock_t *) -1UL;
> + for (vma = mm->mmap; vma; vma = vma->vm_next)
> + if (vma->anon_vma &&
> + (unsigned long) anon_vma_lock >
> + (unsigned long) &vma->anon_vma->lock &&
> + (unsigned long) &vma->anon_vma->lock >
> + (unsigned long) anon_vma_lock_last)
> + anon_vma_lock = &vma->anon_vma->lock;
> + if (anon_vma_lock == (spinlock_t *) -1UL)
> + break;
> + anon_vma_lock_last = anon_vma_lock;
> + if (lock)
> + spin_lock(anon_vma_lock);
> + else
> + spin_unlock(anon_vma_lock);
> + }
> +}
>
> +
> +/*
> + * This operation locks against the VM for all pte/vma/mm related
> + * operations that could ever happen on a certain mm. This includes
> + * vmtruncate, try_to_unmap, and all page faults. The holder
> + * must not hold any mm related lock. A single task can't take more
> + * than one mm lock in a row or it would deadlock.
> + */
> +void mm_lock(struct mm_struct * mm)
> +{
> + down_write(&mm->mmap_sem);
> + mm_lock_unlock(mm, 1);
> +}
> +
> +void mm_unlock(struct mm_struct *mm)
> +{
> + mm_lock_unlock(mm, 0);
> + up_write(&mm->mmap_sem);
> +}
>
>
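
Coming back to the if() comment: pulled out into a helper, the scan for
the next lock to take might read something like the sketch below.  It is
a userspace model with made-up types (struct demo_vma and friends) and a
made-up helper name, next_lock_above(), so treat it as an illustration
rather than a drop-in replacement.  It also uses NULL instead of the
-1UL sentinel for "nothing found".  The point is only that the NULL
checks become an early continue and the address comparison stands alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures, illustration only. */
struct demo_lock { int held; };
struct demo_mapping { struct demo_lock i_mmap_lock; };
struct demo_file { struct demo_mapping *f_mapping; };
struct demo_vma {
	struct demo_file *vm_file;
	struct demo_vma *vm_next;
};

/*
 * Return the lowest-addressed i_mmap_lock that lies strictly above
 * 'last', or NULL when none is left.  One call walks every vma, so
 * calling it once per lock is what makes the overall scheme O(n^2).
 */
static struct demo_lock *next_lock_above(struct demo_vma *head,
					 struct demo_lock *last)
{
	struct demo_lock *best = NULL;
	struct demo_vma *vma;

	for (vma = head; vma; vma = vma->vm_next) {
		struct demo_lock *l;

		if (!(vma->vm_file && vma->vm_file->f_mapping))
			continue;	/* no file mapping, nothing to lock */

		l = &vma->vm_file->f_mapping->i_mmap_lock;
		if ((uintptr_t)l <= (uintptr_t)last)
			continue;	/* handled in an earlier pass */
		if (!best || (uintptr_t)l < (uintptr_t)best)
			best = l;
	}
	return best;
}
```

The outer loop would then start with last = NULL and keep calling the
helper, locking what it returns, until it gets NULL back.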