linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Haggai Eran <haggaie@mellanox.com>
Cc: linux-mm@kvack.org, Andrea Arcangeli <aarcange@redhat.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>,
	Shachar Raindel <raindel@mellanox.com>,
	Sagi Grimberg <sagig@mellanox.com>,
	Or Gerlitz <ogerlitz@mellanox.com>,
	Andrea Arcangeli <andrea@qumranet.com>
Subject: Re: [PATCH V1 1/2] mm: Move all mmu notifier invocations to be done outside the PT lock
Date: Tue, 4 Sep 2012 15:07:37 -0700	[thread overview]
Message-ID: <20120904150737.a6774600.akpm@linux-foundation.org> (raw)
In-Reply-To: <1346748081-1652-2-git-send-email-haggaie@mellanox.com>

On Tue,  4 Sep 2012 11:41:20 +0300
Haggai Eran <haggaie@mellanox.com> wrote:

> From: Sagi Grimberg <sagig@mellanox.com>
> 
> In order to allow sleeping during mmu notifier calls, we need to avoid
> invoking them under the page table spinlock. This patch solves the
> problem by calling invalidate_page notification after releasing
> the lock (but before freeing the page itself), or by wrapping the page
> invalidation with calls to invalidate_range_begin and
> invalidate_range_end.
> 
>
> ...
>
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -868,12 +868,14 @@ static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
>  		cond_resched();
>  	}
>  
> +	mmu_notifier_invalidate_range_start(mm, haddr, haddr + HPAGE_PMD_SIZE);
> +
>  	spin_lock(&mm->page_table_lock);
>  	if (unlikely(!pmd_same(*pmd, orig_pmd)))
>  		goto out_free_pages;
>  	VM_BUG_ON(!PageHead(page));
>  
> -	pmdp_clear_flush_notify(vma, haddr, pmd);
> +	pmdp_clear_flush(vma, haddr, pmd);
>  	/* leave pmd empty until pte is filled */
>  
>  	pgtable = get_pmd_huge_pte(mm);
> @@ -896,6 +898,9 @@ static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
>  	page_remove_rmap(page);
>  	spin_unlock(&mm->page_table_lock);
>  
> +	mmu_notifier_invalidate_range_end(vma->vm_mm, haddr,
> +					  haddr + HPAGE_PMD_SIZE);

But `haddr' has been altered by the time we get here.  We should have
saved the original value?

That's a thing I don't like about this patchset - it adds maintenance
overhead.  This need to retain values of local variables or incoming
args across lengthy code sequences is fragile.  We could easily break
your changes as the core code evolves, and it would take a long long
time before anyone noticed the breakage.

I'm wondering if it would be better to adopt the convention throughout
this patchset that mmu_notifier_invalidate_range_start() and
mmu_notifier_invalidate_range_end() always use their own locals.  ie:

	unsigned long mmun_start;	/* For mmu_notifiers */
	unsigned long mmun_end;		/* For mmu_notifiers */

	...

	mmun_start = ...;
	mmun_end = ...;

	...

	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);

	...

	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);


This is verbose, but it is explicit and clear and more robust than what
you have.  It shouldn't generate any additional text or stack usage or
register usage unless gcc is having an especially stupid day.


>  	ret |= VM_FAULT_WRITE;
>  	put_page(page);
>  
>
> ...
>
> @@ -1382,7 +1391,7 @@ static int try_to_unmap_cluster(unsigned long cursor, unsigned int *mapcount,
>  	spinlock_t *ptl;
>  	struct page *page;
>  	unsigned long address;
> -	unsigned long end;
> +	unsigned long start, end;

You'll note that this function uses the one-definition-per-line
convention, which has a few (smallish) advantages over
multiple-definitions-per-line.  One such advantage is that it leaves
room for a nice little comment at the RHS.  Take that as a hint ;)

>  	int ret = SWAP_AGAIN;
>  	int locked_vma = 0;
>  
> @@ -1405,6 +1414,9 @@ static int try_to_unmap_cluster(unsigned long cursor, unsigned int *mapcount,
>  	if (!pmd_present(*pmd))
>  		return ret;
>  
> +	start = address;
> +	mmu_notifier_invalidate_range_start(mm, start, end);

`end' is used uninitialised in this function.

I'm surprised that it didn't generate a warning(?) and I worry about
the testing coverage?

>  	/*
>  	 * If we can acquire the mmap_sem for read, and vma is VM_LOCKED,
>  	 * keep the sem while scanning the cluster for mlocking pages.
>
> ...
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2012-09-04 22:07 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-04  8:41 [PATCH V1 0/2] Enable clients to schedule in mmu_notifier methods Haggai Eran
2012-09-04  8:41 ` [PATCH V1 1/2] mm: Move all mmu notifier invocations to be done outside the PT lock Haggai Eran
2012-09-04 22:07   ` Andrew Morton [this message]
2012-09-06 14:34     ` [PATCH V2 0/2] Enable clients to schedule in mmu_notifier methods Haggai Eran
2012-09-06 14:34       ` [PATCH V2 1/2] mm: Move all mmu notifier invocations to be done outside the PT lock Haggai Eran
2012-09-06 14:34       ` [PATCH V2 2/2] mm: Wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end Haggai Eran
2012-09-06 20:08       ` [PATCH V2 0/2] Enable clients to schedule in mmu_notifier methods Andrew Morton
2012-09-04  8:41 ` [PATCH V1 2/2] mm: Wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end Haggai Eran
2012-09-04 22:07   ` Andrew Morton
2012-09-04 22:06 ` [PATCH V1 0/2] Enable clients to schedule in mmu_notifier methods Andrew Morton
2012-09-05 12:55   ` Avi Kivity
2012-09-05 14:01   ` Haggai Eran

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120904150737.a6774600.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=aarcange@redhat.com \
    --cc=andrea@qumranet.com \
    --cc=haggaie@mellanox.com \
    --cc=linux-mm@kvack.org \
    --cc=ogerlitz@mellanox.com \
    --cc=raindel@mellanox.com \
    --cc=sagig@mellanox.com \
    --cc=xiaoguangrong@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).