All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Izik Eidus <ieidus@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kvm@vger.kernel.org, aarcange@redhat.com, chrisw@redhat.com,
	avi@redhat.com, izike@qumranet.com
Subject: Re: [PATCH 1/4] rmap: add page_wrprotect() function,
Date: Tue, 11 Nov 2008 11:39:48 -0800	[thread overview]
Message-ID: <20081111113948.f38b9e95.akpm@linux-foundation.org> (raw)
In-Reply-To: <1226409701-14831-2-git-send-email-ieidus@redhat.com>

On Tue, 11 Nov 2008 15:21:38 +0200
Izik Eidus <ieidus@redhat.com> wrote:

> From: Izik Eidus <izike@qumranet.com>
> 
> this function is useful for cases you want to compare page and know
> that its value wont change during you compare it.
> 
> this function is working by walking over the whole rmap of a page
> and mark every pte related to the page as write_protect.
> 
> the odirect_sync paramter is used to notify the caller of
> page_wrprotect() if one pte or more was not marked readonly
> in order to avoid race with odirect.

The grammar is a bit mangled.  Missing apostrophes.  Sentences start
with capital letters.

Yeah, it's a nit, but we don't actually gain anything from deliberately
mangling the language.

>
> ...
>
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -576,6 +576,103 @@ int page_mkclean(struct page *page)
>  }
>  EXPORT_SYMBOL_GPL(page_mkclean);
>  
> +static int page_wrprotect_one(struct page *page, struct vm_area_struct *vma,
> +			      int *odirect_sync)
> +{
> +	struct mm_struct *mm = vma->vm_mm;
> +	unsigned long address;
> +	pte_t *pte;
> +	spinlock_t *ptl;
> +	int ret = 0;
> +
> +	address = vma_address(page, vma);
> +	if (address == -EFAULT)
> +		goto out;
> +
> +	pte = page_check_address(page, mm, address, &ptl, 0);
> +	if (!pte)
> +		goto out;
> +
> +	if (pte_write(*pte)) {
> +		pte_t entry;
> +
> +		if (page_mapcount(page) != page_count(page)) {
> +			*odirect_sync = 0;
> +			goto out_unlock;
> +		}
> +		flush_cache_page(vma, address, pte_pfn(*pte));
> +		entry = ptep_clear_flush_notify(vma, address, pte);
> +		entry = pte_wrprotect(entry);
> +		set_pte_at(mm, address, pte, entry);
> +	}
> +	ret = 1;
> +
> +out_unlock:
> +	pte_unmap_unlock(pte, ptl);
> +out:
> +	return ret;
> +}

OK.  I think.  We need to find a way of provoking Hugh to look at it.

> +static int page_wrprotect_file(struct page *page, int *odirect_sync)
> +{
> +	struct address_space *mapping;
> +	struct prio_tree_iter iter;
> +	struct vm_area_struct *vma;
> +	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
> +	int ret = 0;
> +
> +	mapping = page_mapping(page);

What pins *mapping in memory?  Usually this is done by requiring that
the caller has locked the page.  But no such precondition is documented
here.

> +	if (!mapping)
> +		return ret;
> +
> +	spin_lock(&mapping->i_mmap_lock);
> +
> +	vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff)
> +		ret += page_wrprotect_one(page, vma, odirect_sync);
> +
> +	spin_unlock(&mapping->i_mmap_lock);
> +
> +	return ret;
> +}
> +
> +static int page_wrprotect_anon(struct page *page, int *odirect_sync)
> +{
> +	struct vm_area_struct *vma;
> +	struct anon_vma *anon_vma;
> +	int ret = 0;
> +
> +	anon_vma = page_lock_anon_vma(page);
> +	if (!anon_vma)
> +		return ret;
> +
> +	list_for_each_entry(vma, &anon_vma->head, anon_vma_node)
> +		ret += page_wrprotect_one(page, vma, odirect_sync);
> +
> +	page_unlock_anon_vma(anon_vma);
> +
> +	return ret;
> +}
> +
> +/**

This token means "this is a kerneldoc comment".

> + * set all the ptes pointed to a page as read only,
> + * odirect_sync is set to 0 in case we cannot protect against race with odirect
> + * return the number of ptes that were set as read only
> + * (ptes that were read only before this function was called are couned as well)
> + */

But it isn't.

I don't understand this odirect_sync thing.  What race?  Please expand
this comment to make the function of odirect_sync more understandable.

> +int page_wrprotect(struct page *page, int *odirect_sync)
> +{
> +	int ret =0;
> +
> +	*odirect_sync = 1;
> +	if (PageAnon(page))
> +		ret = page_wrprotect_anon(page, odirect_sync);
> +	else
> +		ret = page_wrprotect_file(page, odirect_sync);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(page_wrprotect);

What do you think about making all this new code dependent upon some
CONFIG_ switch which CONFIG_KVM can select?


WARNING: multiple messages have this Message-ID (diff)
From: Andrew Morton <akpm@linux-foundation.org>
To: Izik Eidus <ieidus@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kvm@vger.kernel.org, aarcange@redhat.com, chrisw@redhat.com,
	avi@redhat.com, izike@qumranet.com
Subject: Re: [PATCH 1/4] rmap: add page_wrprotect() function,
Date: Tue, 11 Nov 2008 11:39:48 -0800	[thread overview]
Message-ID: <20081111113948.f38b9e95.akpm@linux-foundation.org> (raw)
In-Reply-To: <1226409701-14831-2-git-send-email-ieidus@redhat.com>

On Tue, 11 Nov 2008 15:21:38 +0200
Izik Eidus <ieidus@redhat.com> wrote:

> From: Izik Eidus <izike@qumranet.com>
> 
> this function is useful for cases you want to compare page and know
> that its value wont change during you compare it.
> 
> this function is working by walking over the whole rmap of a page
> and mark every pte related to the page as write_protect.
> 
> the odirect_sync paramter is used to notify the caller of
> page_wrprotect() if one pte or more was not marked readonly
> in order to avoid race with odirect.

The grammar is a bit mangled.  Missing apostrophes.  Sentences start
with capital letters.

Yeah, it's a nit, but we don't actually gain anything from deliberately
mangling the language.

>
> ...
>
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -576,6 +576,103 @@ int page_mkclean(struct page *page)
>  }
>  EXPORT_SYMBOL_GPL(page_mkclean);
>  
> +static int page_wrprotect_one(struct page *page, struct vm_area_struct *vma,
> +			      int *odirect_sync)
> +{
> +	struct mm_struct *mm = vma->vm_mm;
> +	unsigned long address;
> +	pte_t *pte;
> +	spinlock_t *ptl;
> +	int ret = 0;
> +
> +	address = vma_address(page, vma);
> +	if (address == -EFAULT)
> +		goto out;
> +
> +	pte = page_check_address(page, mm, address, &ptl, 0);
> +	if (!pte)
> +		goto out;
> +
> +	if (pte_write(*pte)) {
> +		pte_t entry;
> +
> +		if (page_mapcount(page) != page_count(page)) {
> +			*odirect_sync = 0;
> +			goto out_unlock;
> +		}
> +		flush_cache_page(vma, address, pte_pfn(*pte));
> +		entry = ptep_clear_flush_notify(vma, address, pte);
> +		entry = pte_wrprotect(entry);
> +		set_pte_at(mm, address, pte, entry);
> +	}
> +	ret = 1;
> +
> +out_unlock:
> +	pte_unmap_unlock(pte, ptl);
> +out:
> +	return ret;
> +}

OK.  I think.  We need to find a way of provoking Hugh to look at it.

> +static int page_wrprotect_file(struct page *page, int *odirect_sync)
> +{
> +	struct address_space *mapping;
> +	struct prio_tree_iter iter;
> +	struct vm_area_struct *vma;
> +	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
> +	int ret = 0;
> +
> +	mapping = page_mapping(page);

What pins *mapping in memory?  Usually this is done by requiring that
the caller has locked the page.  But no such precondition is documented
here.

> +	if (!mapping)
> +		return ret;
> +
> +	spin_lock(&mapping->i_mmap_lock);
> +
> +	vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff)
> +		ret += page_wrprotect_one(page, vma, odirect_sync);
> +
> +	spin_unlock(&mapping->i_mmap_lock);
> +
> +	return ret;
> +}
> +
> +static int page_wrprotect_anon(struct page *page, int *odirect_sync)
> +{
> +	struct vm_area_struct *vma;
> +	struct anon_vma *anon_vma;
> +	int ret = 0;
> +
> +	anon_vma = page_lock_anon_vma(page);
> +	if (!anon_vma)
> +		return ret;
> +
> +	list_for_each_entry(vma, &anon_vma->head, anon_vma_node)
> +		ret += page_wrprotect_one(page, vma, odirect_sync);
> +
> +	page_unlock_anon_vma(anon_vma);
> +
> +	return ret;
> +}
> +
> +/**

This token means "this is a kerneldoc comment".

> + * set all the ptes pointed to a page as read only,
> + * odirect_sync is set to 0 in case we cannot protect against race with odirect
> + * return the number of ptes that were set as read only
> + * (ptes that were read only before this function was called are couned as well)
> + */

But it isn't.

I don't understand this odirect_sync thing.  What race?  Please expand
this comment to make the function of odirect_sync more understandable.

> +int page_wrprotect(struct page *page, int *odirect_sync)
> +{
> +	int ret =0;
> +
> +	*odirect_sync = 1;
> +	if (PageAnon(page))
> +		ret = page_wrprotect_anon(page, odirect_sync);
> +	else
> +		ret = page_wrprotect_file(page, odirect_sync);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(page_wrprotect);

What do you think about making all this new code dependent upon some
CONFIG_ switch which CONFIG_KVM can select?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2008-11-11 19:41 UTC|newest]

Thread overview: 139+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-11 13:21 [PATCH 0/4] ksm - dynamic page sharing driver for linux Izik Eidus
2008-11-11 13:21 ` Izik Eidus
2008-11-11 13:21 ` [PATCH 1/4] rmap: add page_wrprotect() function, Izik Eidus
2008-11-11 13:21   ` Izik Eidus, Izik Eidus
2008-11-11 13:21   ` [PATCH 2/4] Add replace_page(), change the mapping of pte from one page into another Izik Eidus
2008-11-11 13:21     ` Izik Eidus, Izik Eidus
2008-11-11 13:21     ` [PATCH 3/4] add ksm kernel shared memory driver Izik Eidus
2008-11-11 13:21       ` Izik Eidus, Izik Eidus
2008-11-11 13:21       ` [PATCH 4/4] MMU_NOTIFIRES: add set_pte_at_notify() Izik Eidus
2008-11-11 13:21         ` Izik Eidus, Izik Eidus
2008-11-11 20:38       ` [PATCH 3/4] add ksm kernel shared memory driver Andrew Morton
2008-11-11 20:38         ` Andrew Morton
2008-11-11 22:03         ` Andrea Arcangeli
2008-11-11 22:03           ` Andrea Arcangeli
2008-11-11 22:03       ` Jonathan Corbet
2008-11-11 22:03         ` Jonathan Corbet
2008-11-11 22:17         ` Izik Eidus
2008-11-11 22:17           ` Izik Eidus
2008-11-11 22:25           ` Jonathan Corbet
2008-11-11 22:25             ` Jonathan Corbet
2008-11-11 22:31             ` Izik Eidus
2008-11-11 22:31               ` Izik Eidus
2008-11-11 22:30           ` Jonathan Corbet
2008-11-11 22:30             ` Jonathan Corbet
2008-11-11 22:38             ` Izik Eidus
2008-11-11 22:38               ` Izik Eidus
2008-11-11 23:02             ` Izik Eidus
2008-11-11 23:02               ` Izik Eidus
2008-11-11 23:03             ` Andrea Arcangeli
2008-11-11 23:03               ` Andrea Arcangeli
2008-11-11 22:49           ` Avi Kivity
2008-11-11 22:49             ` Avi Kivity
2008-11-11 22:40         ` Valdis.Kletnieks
2008-11-13  6:13           ` Eric Rannaud
2008-11-13  6:13             ` Eric Rannaud
2008-11-11 22:43         ` Avi Kivity
2008-11-11 22:43           ` Avi Kivity
2008-11-11 19:45     ` [PATCH 2/4] Add replace_page(), change the mapping of pte from one page into another Andrew Morton
2008-11-11 19:45       ` Andrew Morton
2008-11-11 20:57       ` Izik Eidus
2008-11-11 20:57         ` Izik Eidus
2008-11-11 21:21         ` Christoph Lameter
2008-11-11 21:21           ` Christoph Lameter
2008-11-11 21:23           ` Izik Eidus
2008-11-11 21:23             ` Izik Eidus
2008-11-11 21:31             ` Christoph Lameter
2008-11-11 21:31               ` Christoph Lameter
2008-11-11 21:37               ` Izik Eidus
2008-11-11 21:37                 ` Izik Eidus
2008-11-11 22:24               ` Andrea Arcangeli
2008-11-11 22:24                 ` Andrea Arcangeli
2008-11-12  2:19                 ` KAMEZAWA Hiroyuki
2008-11-12  2:19                   ` KAMEZAWA Hiroyuki
2008-11-12 10:05                   ` Avi Kivity
2008-11-12 10:05                     ` Avi Kivity
2008-11-12 11:11                     ` Izik Eidus
2008-11-12 11:11                       ` Izik Eidus
2008-11-13  6:11                       ` KAMEZAWA Hiroyuki
2008-11-13  6:11                         ` KAMEZAWA Hiroyuki
2008-11-13 10:38                         ` Izik Eidus
2008-11-13 10:38                           ` Izik Eidus
2008-11-13 11:32                           ` KAMEZAWA Hiroyuki
2008-11-13 11:32                             ` KAMEZAWA Hiroyuki
2008-11-11 21:35           ` Andrea Arcangeli
2008-11-11 21:35             ` Andrea Arcangeli
2008-11-11 21:06       ` Andrea Arcangeli
2008-11-11 21:06         ` Andrea Arcangeli
2008-11-11 21:26         ` Christoph Lameter
2008-11-11 21:26           ` Christoph Lameter
2008-11-11 21:39           ` Avi Kivity
2008-11-11 21:39             ` Avi Kivity
2008-11-11 21:47             ` Christoph Lameter
2008-11-11 21:47               ` Christoph Lameter
2008-11-11 21:55               ` Izik Eidus
2008-11-11 21:55                 ` Izik Eidus
2008-11-11 22:36               ` Avi Kivity
2008-11-11 22:36                 ` Avi Kivity
2008-11-11 22:17           ` Andrea Arcangeli
2008-11-11 22:17             ` Andrea Arcangeli
2008-11-11 22:30             ` Christoph Lameter
2008-11-11 22:30               ` Christoph Lameter
2008-11-11 23:17               ` Andrea Arcangeli
2008-11-11 23:17                 ` Andrea Arcangeli
2008-11-11 23:25                 ` Andrea Arcangeli
2008-11-11 23:25                   ` Andrea Arcangeli
2008-11-12  0:27                 ` Christoph Lameter
2008-11-12  0:27                   ` Christoph Lameter
2008-11-12  2:27                   ` Andrea Arcangeli
2008-11-12  2:27                     ` Andrea Arcangeli
2008-11-12  3:10                     ` Christoph Lameter
2008-11-12  3:10                       ` Christoph Lameter
2008-11-12 17:32                       ` Andrea Arcangeli
2008-11-12 17:32                         ` Andrea Arcangeli
2008-11-12 20:08                         ` Lee Schermerhorn
2008-11-12 20:08                           ` Lee Schermerhorn
2008-11-12 20:31                           ` Christoph Lameter
2008-11-12 20:31                             ` Christoph Lameter
2008-11-12 20:27                         ` Christoph Lameter
2008-11-12 20:27                           ` Christoph Lameter
2008-11-12 22:09                           ` Lee Schermerhorn
2008-11-12 22:09                             ` Lee Schermerhorn
2008-11-13  2:00                             ` Andrea Arcangeli
2008-11-13  2:00                               ` Andrea Arcangeli
2008-11-13  2:31                               ` Andrea Arcangeli
2008-11-13  2:31                                 ` Andrea Arcangeli
2008-11-13  4:02                                 ` Nick Piggin
2008-11-13  4:02                                   ` Nick Piggin
2008-11-11 19:39   ` Andrew Morton [this message]
2008-11-11 19:39     ` [PATCH 1/4] rmap: add page_wrprotect() function, Andrew Morton
2008-11-11 20:38     ` Andrea Arcangeli
2008-11-11 20:38       ` Andrea Arcangeli
2008-11-11 21:01       ` Andrew Morton
2008-11-11 21:01         ` Andrew Morton
2008-11-11 21:17         ` Andrea Arcangeli
2008-11-11 21:17           ` Andrea Arcangeli
2008-11-11 18:30 ` [PATCH 0/4] ksm - dynamic page sharing driver for linux Andrew Morton
2008-11-11 18:30   ` Andrew Morton
2008-11-11 18:48   ` Avi Kivity
2008-11-11 18:48     ` Avi Kivity
2008-11-11 19:08     ` Izik Eidus
2008-11-11 19:08       ` Izik Eidus
2008-11-11 19:11     ` Andrew Morton
2008-11-11 19:11       ` Andrew Morton
2008-11-11 19:18       ` Izik Eidus
2008-11-11 19:18         ` Izik Eidus
2008-11-11 19:32         ` Andrew Morton
2008-11-11 19:32           ` Andrew Morton
2008-11-11 19:52           ` Izik Eidus
2008-11-11 19:52             ` Izik Eidus
2008-11-11 20:08             ` Izik Eidus
2008-11-11 20:08               ` Izik Eidus
2008-11-11 19:29       ` Avi Kivity
2008-11-11 19:29         ` Avi Kivity
2008-11-11 19:55       ` Andrea Arcangeli
2008-11-11 19:55         ` Andrea Arcangeli
2008-11-11 19:07   ` Izik Eidus
2008-11-11 19:07     ` Izik Eidus
2008-11-11 19:20     ` Andrew Morton
2008-11-11 19:20       ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20081111113948.f38b9e95.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=aarcange@redhat.com \
    --cc=avi@redhat.com \
    --cc=chrisw@redhat.com \
    --cc=ieidus@redhat.com \
    --cc=izike@qumranet.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.