linux-mm.kvack.org archive mirror
* [PATCH bugfix] proc/pagemap: correctly report non-present ptes and holes between vmas
@ 2012-04-28 16:22 Konstantin Khlebnikov
  2012-04-28 16:24 ` Pavel Emelyanov
  2012-04-30 15:25 ` Naoya Horiguchi
  0 siblings, 2 replies; 5+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-28 16:22 UTC
  To: Andrew Morton
  Cc: Andi Kleen, Pavel Emelyanov, linux-kernel, linux-mm,
	Naoya Horiguchi, KAMEZAWA Hiroyuki

This patch resets the current pagemap entry if the current pte isn't present,
or if the current vma has ended. Otherwise pagemap reports the last entry again and again.

Non-present pte reporting was broken in commit v3.3-3738-g092b50b
("pagemap: introduce data structure for pagemap entry").

Reporting for holes was broken in commit v3.3-3734-g5aaabe8
("pagemap: avoid splitting thp when reading /proc/pid/pagemap").

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Reported-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 fs/proc/task_mmu.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)
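
One way to observe the symptom from user space is to decode entries read
from /proc/<pid>/pagemap. The sketch below is not part of the patch. It
assumes the entry layout described in Documentation/vm/pagemap.txt (one
64-bit entry per virtual page, bit 63 = present, bit 62 = swapped, PFN in
the low bits) and a 64-bit off_t. The PME_* macros and the pme_*() and
pagemap_*() helpers are hypothetical names chosen for illustration.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for pread() */
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed layout, per Documentation/vm/pagemap.txt. */
#define PME_PRESENT	(1ULL << 63)		/* page is present in RAM */
#define PME_SWAPPED	(1ULL << 62)		/* page is swapped out */
#define PME_PFN_MASK	((1ULL << 55) - 1)	/* bits 0-54: PFN if present */

static int pme_present(uint64_t entry)
{
	return (entry & PME_PRESENT) != 0;
}

static int pme_swapped(uint64_t entry)
{
	return (entry & PME_SWAPPED) != 0;
}

static uint64_t pme_pfn(uint64_t entry)
{
	return entry & PME_PFN_MASK;
}

/* File offset of the entry that describes virtual address 'vaddr'. */
static off_t pagemap_offset(uintptr_t vaddr, long page_size)
{
	assert(page_size > 0);
	return (off_t)(vaddr / (uintptr_t)page_size) * (off_t)sizeof(uint64_t);
}

/*
 * Read one entry from an already opened pagemap file descriptor.
 * Returns 0 on success, -1 on a failed or short read.
 */
static int pagemap_read(int fd, uintptr_t vaddr, long page_size,
			uint64_t *entry)
{
	ssize_t n = pread(fd, entry, sizeof(*entry),
			  pagemap_offset(vaddr, page_size));

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

A caller would open /proc/self/pagemap read-only, map two pages, touch only
the first, and read both entries. On an affected kernel the untouched page
would be expected to repeat the first page's entry instead of reporting
"not present".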

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index a580c69..9f9c033 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -747,6 +747,8 @@ static void pte_to_pagemap_entry(pagemap_entry_t *pme, pte_t pte)
 	else if (pte_present(pte))
 		*pme = make_pme(PM_PFRAME(pte_pfn(pte))
 				| PM_PSHIFT(PAGE_SHIFT) | PM_PRESENT);
+	else
+		*pme = make_pme(PM_NOT_PRESENT);
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -761,6 +763,8 @@ static void thp_pmd_to_pagemap_entry(pagemap_entry_t *pme,
 	if (pmd_present(pmd))
 		*pme = make_pme(PM_PFRAME(pmd_pfn(pmd) + offset)
 				| PM_PSHIFT(PAGE_SHIFT) | PM_PRESENT);
+	else
+		*pme = make_pme(PM_NOT_PRESENT);
 }
 #else
 static inline void thp_pmd_to_pagemap_entry(pagemap_entry_t *pme,
@@ -801,8 +805,10 @@ static int pagemap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 
 		/* check to see if we've left 'vma' behind
 		 * and need a new, higher one */
-		if (vma && (addr >= vma->vm_end))
+		if (vma && (addr >= vma->vm_end)) {
 			vma = find_vma(walk->mm, addr);
+			pme = make_pme(PM_NOT_PRESENT);
+		}
 
 		/* check that 'vma' actually covers this address,
 		 * and that it isn't a huge page vma */
@@ -830,6 +836,8 @@ static void huge_pte_to_pagemap_entry(pagemap_entry_t *pme,
 	if (pte_present(pte))
 		*pme = make_pme(PM_PFRAME(pte_pfn(pte) + offset)
 				| PM_PSHIFT(PAGE_SHIFT) | PM_PRESENT);
+	else
+		*pme = make_pme(PM_NOT_PRESENT);
 }
 
 /* This function walks within one hugetlb entry in the single call */
@@ -839,7 +847,7 @@ static int pagemap_hugetlb_range(pte_t *pte, unsigned long hmask,
 {
 	struct pagemapread *pm = walk->private;
 	int err = 0;
-	pagemap_entry_t pme = make_pme(PM_NOT_PRESENT);
+	pagemap_entry_t pme;
 
 	for (; addr != end; addr += PAGE_SIZE) {
 		int offset = (addr & ~hmask) >> PAGE_SHIFT;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH bugfix] proc/pagemap: correctly report non-present ptes and holes between vmas
  2012-04-28 16:22 [PATCH bugfix] proc/pagemap: correctly report non-present ptes and holes between vmas Konstantin Khlebnikov
@ 2012-04-28 16:24 ` Pavel Emelyanov
  2012-04-30 15:25 ` Naoya Horiguchi
  1 sibling, 0 replies; 5+ messages in thread
From: Pavel Emelyanov @ 2012-04-28 16:24 UTC
  To: Konstantin Khlebnikov
  Cc: Andrew Morton, Andi Kleen, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Naoya Horiguchi, KAMEZAWA Hiroyuki

On 04/28/2012 08:22 PM, Konstantin Khlebnikov wrote:
> This patch resets the current pagemap entry if the current pte isn't present,
> or if the current vma has ended. Otherwise pagemap reports the last entry again and again.
> 
> Non-present pte reporting was broken in commit v3.3-3738-g092b50b
> ("pagemap: introduce data structure for pagemap entry").
> 
> Reporting for holes was broken in commit v3.3-3734-g5aaabe8
> ("pagemap: avoid splitting thp when reading /proc/pid/pagemap").
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
> Reported-by: Pavel Emelyanov <xemul@parallels.com>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Andi Kleen <ak@linux.intel.com>

Acked-by: Pavel Emelyanov <xemul@parallels.com>


* Re: [PATCH bugfix] proc/pagemap: correctly report non-present ptes and holes between vmas
  2012-04-28 16:22 [PATCH bugfix] proc/pagemap: correctly report non-present ptes and holes between vmas Konstantin Khlebnikov
  2012-04-28 16:24 ` Pavel Emelyanov
@ 2012-04-30 15:25 ` Naoya Horiguchi
  2012-04-30 19:19   ` Konstantin Khlebnikov
  1 sibling, 1 reply; 5+ messages in thread
From: Naoya Horiguchi @ 2012-04-30 15:25 UTC
  To: khlebnikov
  Cc: Andrew Morton, ak, xemul, linux-kernel, linux-mm, Naoya Horiguchi,
	KAMEZAWA Hiroyuki

Hi,

On Sat, Apr 28, 2012 at 08:22:30PM +0400, Konstantin Khlebnikov wrote:
> This patch resets the current pagemap entry if the current pte isn't present,
> or if the current vma has ended. Otherwise pagemap reports the last entry again and again.
> 
> Non-present pte reporting was broken in commit v3.3-3738-g092b50b
> ("pagemap: introduce data structure for pagemap entry").
> 
> Reporting for holes was broken in commit v3.3-3734-g5aaabe8
> ("pagemap: avoid splitting thp when reading /proc/pid/pagemap").
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
> Reported-by: Pavel Emelyanov <xemul@parallels.com>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Andi Kleen <ak@linux.intel.com>

Thanks for your efforts.
I confirmed that this patch fixes the problem on v3.4-rc4.
But originally (before the commits you pointed to above), pagemap entries
(originally under the confusing name 'pfn') were initialized inside the
for-loop in pagemap_pte_range(), so I think it's better to go back to
something like that.

How about the following?
---
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 2b9a760..538f8d8 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -779,13 +779,14 @@ static int pagemap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	struct pagemapread *pm = walk->private;
 	pte_t *pte;
 	int err = 0;
-	pagemap_entry_t pme = make_pme(PM_NOT_PRESENT);
+	pagemap_entry_t pme;
 
 	/* find the first VMA at or above 'addr' */
 	vma = find_vma(walk->mm, addr);
 	if (pmd_trans_huge_lock(pmd, vma) == 1) {
 		for (; addr != end; addr += PAGE_SIZE) {
 			unsigned long offset;
+			pme = make_pme(PM_NOT_PRESENT);
 
 			offset = (addr & ~PAGEMAP_WALK_MASK) >>
 					PAGE_SHIFT;
@@ -801,6 +802,7 @@ static int pagemap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	if (pmd_trans_unstable(pmd))
 		return 0;
 	for (; addr != end; addr += PAGE_SIZE) {
+		pme = make_pme(PM_NOT_PRESENT);
 
 		/* check to see if we've left 'vma' behind
 		 * and need a new, higher one */
@@ -842,10 +844,10 @@ static int pagemap_hugetlb_range(pte_t *pte, unsigned long hmask,
 {
 	struct pagemapread *pm = walk->private;
 	int err = 0;
-	pagemap_entry_t pme = make_pme(PM_NOT_PRESENT);
 
 	for (; addr != end; addr += PAGE_SIZE) {
 		int offset = (addr & ~hmask) >> PAGE_SHIFT;
+		pagemap_entry_t pme = make_pme(PM_NOT_PRESENT);
 		huge_pte_to_pagemap_entry(&pme, *pte, offset);
 		err = add_to_pagemap(addr, &pme, pm);
 		if (err)

---
Thanks,
Naoya


* Re: [PATCH bugfix] proc/pagemap: correctly report non-present ptes and holes between vmas
  2012-04-30 15:25 ` Naoya Horiguchi
@ 2012-04-30 19:19   ` Konstantin Khlebnikov
  2012-04-30 22:12     ` Naoya Horiguchi
  0 siblings, 1 reply; 5+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-30 19:19 UTC
  To: Naoya Horiguchi
  Cc: Andrew Morton, ak@linux.intel.com, Pavel Emelianov,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	KAMEZAWA Hiroyuki

Naoya Horiguchi wrote:
> Hi,
>
> On Sat, Apr 28, 2012 at 08:22:30PM +0400, Konstantin Khlebnikov wrote:
>> This patch resets the current pagemap entry if the current pte isn't present,
>> or if the current vma has ended. Otherwise pagemap reports the last entry again and again.
>>
>> Non-present pte reporting was broken in commit v3.3-3738-g092b50b
>> ("pagemap: introduce data structure for pagemap entry").
>>
>> Reporting for holes was broken in commit v3.3-3734-g5aaabe8
>> ("pagemap: avoid splitting thp when reading /proc/pid/pagemap").
>>
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
>> Reported-by: Pavel Emelyanov <xemul@parallels.com>
>> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> Cc: Andi Kleen <ak@linux.intel.com>
>
> Thanks for your efforts.
> I confirmed that this patch fixes the problem on v3.4-rc4.
> But originally (before the commits you pointed to above), pagemap entries
> (originally under the confusing name 'pfn') were initialized inside the
> for-loop in pagemap_pte_range(), so I think it's better to go back to
> something like that.
>
> How about the following?

I don't like this. A function that returns void should always initialize its "output"
argument; that is much clearer than relying on a preinitialized value.
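
The trade-off under discussion can be sketched in plain user-space C. This
is only an illustration, not kernel code: NOT_PRESENT, PRESENT,
convert_partial(), convert_total() and walk() are hypothetical stand-ins
for PM_NOT_PRESENT, PM_PRESENT, the *_to_pagemap_entry() helpers and the
loops in pagemap_pte_range(), and the bit values are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the kernel's definitions. */
#define NOT_PRESENT	0ULL
#define PRESENT		(1ULL << 63)

typedef void (*convert_fn)(uint64_t *out, uint64_t pfn, int present);

/* Partial contract: *out is written only for present pages. */
static void convert_partial(uint64_t *out, uint64_t pfn, int present)
{
	if (present)
		*out = PRESENT | pfn;
}

/* Total contract: *out is written on every path. */
static void convert_total(uint64_t *out, uint64_t pfn, int present)
{
	if (present)
		*out = PRESENT | pfn;
	else
		*out = NOT_PRESENT;
}

/*
 * Walk n pages while reusing a single entry variable across iterations.
 * 'reset_each_iter' models resetting the entry at the top of the loop
 * body instead of relying on the converter to do it.
 */
static void walk(convert_fn convert, int reset_each_iter,
		 const uint64_t *pfn, const int *present, size_t n,
		 uint64_t *report)
{
	uint64_t pme = NOT_PRESENT;
	size_t i;

	assert(n == 0 || (pfn && present && report));
	for (i = 0; i < n; i++) {
		if (reset_each_iter)
			pme = NOT_PRESENT;
		convert(&pme, pfn[i], present[i]);
		report[i] = pme;
	}
}
```

With one present page followed by two non-present ones, the partial
converter without a per-iteration reset reports the first entry three
times, which matches the bug described in the patch. Either resetting in
the loop or using the total converter yields "not present" for the last
two pages. The total converter has the extra property that the result does
not depend on what the caller left in the variable.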

> ---
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 2b9a760..538f8d8 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -779,13 +779,14 @@ static int pagemap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>   	struct pagemapread *pm = walk->private;
>   	pte_t *pte;
>   	int err = 0;
> -	pagemap_entry_t pme = make_pme(PM_NOT_PRESENT);
> +	pagemap_entry_t pme;
>
>   	/* find the first VMA at or above 'addr' */
>   	vma = find_vma(walk->mm, addr);
>   	if (pmd_trans_huge_lock(pmd, vma) == 1) {
>   		for (; addr != end; addr += PAGE_SIZE) {
>   			unsigned long offset;
> +			pme = make_pme(PM_NOT_PRESENT);
>
>   			offset = (addr & ~PAGEMAP_WALK_MASK) >>
>   					PAGE_SHIFT;
> @@ -801,6 +802,7 @@ static int pagemap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>   	if (pmd_trans_unstable(pmd))
>   		return 0;
>   	for (; addr != end; addr += PAGE_SIZE) {
> +		pme = make_pme(PM_NOT_PRESENT);
>
>   		/* check to see if we've left 'vma' behind
>   		 * and need a new, higher one */
> @@ -842,10 +844,10 @@ static int pagemap_hugetlb_range(pte_t *pte, unsigned long hmask,
>   {
>   	struct pagemapread *pm = walk->private;
>   	int err = 0;
> -	pagemap_entry_t pme = make_pme(PM_NOT_PRESENT);
>
>   	for (; addr != end; addr += PAGE_SIZE) {
>   		int offset = (addr & ~hmask) >> PAGE_SHIFT;
> +		pagemap_entry_t pme = make_pme(PM_NOT_PRESENT);
>   		huge_pte_to_pagemap_entry(&pme, *pte, offset);
>   		err = add_to_pagemap(addr, &pme, pm);
>   		if (err)
>
> ---
> Thanks,
> Naoya


* Re: [PATCH bugfix] proc/pagemap: correctly report non-present ptes and holes between vmas
  2012-04-30 19:19   ` Konstantin Khlebnikov
@ 2012-04-30 22:12     ` Naoya Horiguchi
  0 siblings, 0 replies; 5+ messages in thread
From: Naoya Horiguchi @ 2012-04-30 22:12 UTC
  To: khlebnikov
  Cc: Naoya Horiguchi, Andrew Morton, ak, xemul, linux-kernel, linux-mm,
	KAMEZAWA Hiroyuki

On Mon, Apr 30, 2012 at 11:19:27PM +0400, Konstantin Khlebnikov wrote:
> Naoya Horiguchi wrote:
> >Hi,
> >
> >On Sat, Apr 28, 2012 at 08:22:30PM +0400, Konstantin Khlebnikov wrote:
> >>This patch resets the current pagemap entry if the current pte isn't present,
> >>or if the current vma has ended. Otherwise pagemap reports the last entry again and again.
> >>
> >>Non-present pte reporting was broken in commit v3.3-3738-g092b50b
> >>("pagemap: introduce data structure for pagemap entry").
> >>
> >>Reporting for holes was broken in commit v3.3-3734-g5aaabe8
> >>("pagemap: avoid splitting thp when reading /proc/pid/pagemap").
> >>
> >>Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
> >>Reported-by: Pavel Emelyanov <xemul@parallels.com>
> >>Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> >>Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >>Cc: Andi Kleen <ak@linux.intel.com>
> >
> >Thanks for your efforts.
> >I confirmed that this patch fixes the problem on v3.4-rc4.
> >But originally (before the commits you pointed to above), pagemap entries
> >(originally under the confusing name 'pfn') were initialized inside the
> >for-loop in pagemap_pte_range(), so I think it's better to go back to
> >something like that.
> >
> >How about the following?
> 
> I don't like this. A function that returns void should always initialize its "output"
> argument; that is much clearer than relying on a preinitialized value.

OK, it makes sense.

Thanks,
Naoya
