* 2.6.33 pagemap endless read loop
@ 2010-01-10 2:09 Andi Kleen
2010-01-10 6:33 ` Américo Wang
[not found] ` <20100113141648.42eff724.akpm@linux-foundation.org>
0 siblings, 2 replies; 11+ messages in thread
From: Andi Kleen @ 2010-01-10 2:09 UTC (permalink / raw)
To: linux-kernel
An LTP run on x86-64/2.6.33-rc3 run right now results in "proc01" hanging
lr-x------ 1 root root 64 2010-01-10 02:28 7 -> /proc/2679/task/2679/pagemap
strace shows an endless loop of
read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 10
cat of that file hangs in the same way so it seems like a kernel bug.
A cat of /proc/2679/pagemap also hangs, in fact any pagemap cat seems to.
I tried to revert the latest change from Horiguchi-san:
commit 5dc37642cbce34619e4588a9f0bdad1d2f870956
Author: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Date: Mon Dec 14 18:00:01 2009 -0800
mm hugetlb: add hugepage support to pagemap
but that didn't fix it, so it must be something else.
Haven't done further bisect or anything, should be easy to reproduce.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.33 pagemap endless read loop
2010-01-10 2:09 2.6.33 pagemap endless read loop Andi Kleen
@ 2010-01-10 6:33 ` Américo Wang
2010-01-10 12:37 ` Andi Kleen
[not found] ` <20100113141648.42eff724.akpm@linux-foundation.org>
1 sibling, 1 reply; 11+ messages in thread
From: Américo Wang @ 2010-01-10 6:33 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel
On Sun, Jan 10, 2010 at 03:09:55AM +0100, Andi Kleen wrote:
>
>An LTP run on x86-64/2.6.33-rc3 run right now results in "proc01" hanging
>
>lr-x------ 1 root root 64 2010-01-10 02:28 7 -> /proc/2679/task/2679/pagemap
>
>strace shows an endless loop of
>
>read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
>read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
>read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
>read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
>read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
>read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 1024
>read(7, "\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6\0\0\0"..., 1024) = 10
>
>cat of that file hangs in the same way so it seems like a kernel bug.
>A cat of /proc/2679/pagemap also hangs, in fact any pagemap cat seems to.
>
Are you sure?
On my 32bit machine, it runs for a long time, but finally exits.
>From the document of pagemap, it is the mapping of all the virtual pages
of a process, on x86-32, it should be 4G/4K = 1024*1024 pages, also
that when you 'cat' it, tty layer is involved to display it too.
--
Live like a child, think like the god.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.33 pagemap endless read loop
2010-01-10 6:33 ` Américo Wang
@ 2010-01-10 12:37 ` Andi Kleen
2010-01-12 5:59 ` Mike Galbraith
2010-01-12 15:32 ` Américo Wang
0 siblings, 2 replies; 11+ messages in thread
From: Andi Kleen @ 2010-01-10 12:37 UTC (permalink / raw)
To: Américo Wang; +Cc: Andi Kleen, linux-kernel
> Are you sure
I don't know if it's really endless, but it's doing a good emulation
of it if it's not.
>
> On my 32bit machine, it runs for a long time, but finally exits.
That was on 64bit.
proc01 had accumulated several minutes of CPU time when I stopped it.
> From the document of pagemap, it is the mapping of all the virtual pages
> of a process, on x86-32, it should be 4G/4K = 1024*1024 pages, also
Only the mapped ones surely?
Right now it looks like it dumps all the holes and that obviously
will never finish on 64bit which has 47bits of address space.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.33 pagemap endless read loop
2010-01-10 12:37 ` Andi Kleen
@ 2010-01-12 5:59 ` Mike Galbraith
2010-01-12 15:32 ` Américo Wang
1 sibling, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2010-01-12 5:59 UTC (permalink / raw)
To: Andi Kleen; +Cc: Américo Wang, linux-kernel
On Sun, 2010-01-10 at 13:37 +0100, Andi Kleen wrote:
> > Are you sure
>
> I don't know if it's really endless, but it's doing a good emulation
> of it if it's not.
>
> >
> > On my 32bit machine, it runs for a long time, but finally exits.
>
> That was on 64bit.
>
> proc01 had accumulated several minutes of CPU time when I stopped it.
>
> > From the document of pagemap, it is the mapping of all the virtual pages
> > of a process, on x86-32, it should be 4G/4K = 1024*1024 pages, also
>
> Only the mapped ones surely?
>
> Right now it looks like it dumps all the holes and that obviously
> will never finish on 64bit which has 47bits of address space.
Takes forever if you whack .pagemap_pte_hole too, just spends it's life
in pagemap_pte_range() instead.
start_vaddr:0x0 end_vaddr:0x7ffffffff000
Hm, zillion paces ain't a walk.
-Mike
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.33 pagemap endless read loop
2010-01-10 12:37 ` Andi Kleen
2010-01-12 5:59 ` Mike Galbraith
@ 2010-01-12 15:32 ` Américo Wang
2010-01-12 16:28 ` Andi Kleen
1 sibling, 1 reply; 11+ messages in thread
From: Américo Wang @ 2010-01-12 15:32 UTC (permalink / raw)
To: Andi Kleen; +Cc: Américo Wang, linux-kernel
On Sun, Jan 10, 2010 at 01:37:02PM +0100, Andi Kleen wrote:
>> Are you sure
>
>I don't know if it's really endless, but it's doing a good emulation
>of it if it's not.
>
>>
>> On my 32bit machine, it runs for a long time, but finally exits.
>
>That was on 64bit.
>
>proc01 had accumulated several minutes of CPU time when I stopped it.
>
>> From the document of pagemap, it is the mapping of all the virtual pages
>> of a process, on x86-32, it should be 4G/4K = 1024*1024 pages, also
>
>Only the mapped ones surely?
>
>Right now it looks like it dumps all the holes and that obviously
>will never finish on 64bit which has 47bits of address space.
>
Hmm, could you please try:
time cat /proc/self/pagemap >/dev/null
? Is it much faster than:
time cat /proc/self/pagemap
?
Thanks!
--
Live like a child, think like the god.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.33 pagemap endless read loop
2010-01-12 15:32 ` Américo Wang
@ 2010-01-12 16:28 ` Andi Kleen
2010-01-13 2:08 ` Américo Wang
0 siblings, 1 reply; 11+ messages in thread
From: Andi Kleen @ 2010-01-12 16:28 UTC (permalink / raw)
To: Américo Wang; +Cc: Andi Kleen, linux-kernel
> Hmm, could you please try:
>
> time cat /proc/self/pagemap >/dev/null
>
I stopped it after a minute of CPU time...
Obviously reading 47bits of zeroes take a long time.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.33 pagemap endless read loop
2010-01-12 16:28 ` Andi Kleen
@ 2010-01-13 2:08 ` Américo Wang
0 siblings, 0 replies; 11+ messages in thread
From: Américo Wang @ 2010-01-13 2:08 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel
On Wed, Jan 13, 2010 at 12:28 AM, Andi Kleen <andi@firstfloor.org> wrote:
>> Hmm, could you please try:
>>
>> time cat /proc/self/pagemap >/dev/null
>>
>
> I stopped it after a minute of CPU time...
>
> Obviously reading 47bits of zeroes take a long time.
>
Yeah, but this just proves it is not endless, doesn't it? :)
Thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.33 pagemap endless read loop
[not found] ` <20100113221013.b1beecb4.akpm@linux-foundation.org>
@ 2010-01-14 8:06 ` Wu Fengguang
2010-01-14 8:30 ` Andi Kleen
0 siblings, 1 reply; 11+ messages in thread
From: Wu Fengguang @ 2010-01-14 8:06 UTC (permalink / raw)
To: Andrew Morton; +Cc: Matt Mackall, Andi Kleen, LKML, Rafael J. Wysocki
On Thu, Jan 14, 2010 at 02:10:13PM +0800, Andrew Morton wrote:
> On Thu, 14 Jan 2010 13:53:19 +0800 Wu Fengguang <fengguang.wu@intel.com> wrote:
>
> > On Thu, Jan 14, 2010 at 07:15:02AM +0800, Andrew Morton wrote:
> > @@ -740,6 +740,11 @@ static ssize_t pagemap_read(struct file
> > if (!ptrace_may_access(task, PTRACE_MODE_READ))
> > goto out_task;
> >
> > + /* stop dumb cp/cat */
> > + ret = -EFAULT;
> > + if (!*ppos)
> > + goto out_task;
> > +
> > ret = -EINVAL;
> > /* file position must be aligned */
> > if ((*ppos % PM_ENTRY_BYTES) || (count % PM_ENTRY_BYTES))
>
> Seems simple. But is there a usecase for reading at that offset?
> I think it's possible to map a page at 0 with MAP_FIXED. I forget..
How about this trick?
Tests show that cp/cat will exit if read nothing :)
---
pagemap: early return on unmapped areas
This helps stop dumb cp/cat early instead of loop for days:
http://bugzilla.kernel.org/show_bug.cgi?id=15041
This also serves as an optimization to normal pagemap users (eg.
page-types).
CC: Andi Kleen <andi@firstfloor.org>
CC: Matt Mackall <mpm@selenic.com>
CC: <stable@kernel.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
fs/proc/task_mmu.c | 4 ++++
1 file changed, 4 insertions(+)
--- linux-mm.orig/fs/proc/task_mmu.c 2010-01-14 13:38:44.000000000 +0800
+++ linux-mm/fs/proc/task_mmu.c 2010-01-14 15:54:28.000000000 +0800
@@ -586,6 +586,10 @@ static int pagemap_pte_hole(unsigned lon
struct pagemapread *pm = walk->private;
unsigned long addr;
int err = 0;
+
+ if ((end - start) / PAGE_SIZE > pm->end - pm->out)
+ return PM_END_OF_BUFFER;
+
for (addr = start; addr < end; addr += PAGE_SIZE) {
err = add_to_pagemap(addr, PM_NOT_PRESENT, pm);
if (err)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: 2.6.33 pagemap endless read loop
2010-01-14 8:06 ` Wu Fengguang
@ 2010-01-14 8:30 ` Andi Kleen
2010-01-14 8:52 ` [PATCH] pagemap: early return on unmapped areas Wu Fengguang
0 siblings, 1 reply; 11+ messages in thread
From: Andi Kleen @ 2010-01-14 8:30 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andrew Morton, Matt Mackall, Andi Kleen, LKML, Rafael J. Wysocki
> ---
> pagemap: early return on unmapped areas
Yes seems like a reasonable idea.
-Andi
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH] pagemap: early return on unmapped areas
2010-01-14 8:30 ` Andi Kleen
@ 2010-01-14 8:52 ` Wu Fengguang
2010-01-14 17:02 ` Matt Mackall
0 siblings, 1 reply; 11+ messages in thread
From: Wu Fengguang @ 2010-01-14 8:52 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Matt Mackall, LKML, Rafael J. Wysocki
On Thu, Jan 14, 2010 at 04:30:48PM +0800, Andi Kleen wrote:
> > ---
> > pagemap: early return on unmapped areas
>
> Yes seems like a reasonable idea.
Thanks!
I just changed the ">" to ">=", because it happen to make a difference
for cp/cat, which does 4k sized read:
4k read buffer = 4k/8 pages = 512 pages = NR_PMD_PAGES
So the change to ">=" makes cp exit immediately (4k buffer matches
exactly one PMD hole).
before
dd if=/proc/$$/pagemap of=/dev/null bs=4k
512+0 records in
512+0 records out
2097152 bytes (2.1 MB) copied, 0.0225796 s, 92.9 MB/s
after
dd if=/proc/$$/pagemap of=/dev/null bs=4k
0+0 records in
0+0 records out
0 bytes (0 B) copied, 2.8029e-05 s, 0.0 kB/s
page-types still works (only faster;) with this patch.
Thanks,
Fengguang
---
pagemap: early return on unmapped areas
This helps stop dumb cp/cat early instead of loop for days:
http://bugzilla.kernel.org/show_bug.cgi?id=15041
This also serves as an optimization to normal pagemap users (eg.
page-types).
CC: Andi Kleen <andi@firstfloor.org>
CC: Matt Mackall <mpm@selenic.com>
CC: <stable@kernel.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
fs/proc/task_mmu.c | 4 ++++
1 file changed, 4 insertions(+)
--- linux-mm.orig/fs/proc/task_mmu.c 2010-01-14 13:38:44.000000000 +0800
+++ linux-mm/fs/proc/task_mmu.c 2010-01-14 16:38:49.000000000 +0800
@@ -586,6 +586,10 @@ static int pagemap_pte_hole(unsigned lon
struct pagemapread *pm = walk->private;
unsigned long addr;
int err = 0;
+
+ if ((end - start) / PAGE_SIZE >= pm->end - pm->out)
+ return PM_END_OF_BUFFER;
+
for (addr = start; addr < end; addr += PAGE_SIZE) {
err = add_to_pagemap(addr, PM_NOT_PRESENT, pm);
if (err)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] pagemap: early return on unmapped areas
2010-01-14 8:52 ` [PATCH] pagemap: early return on unmapped areas Wu Fengguang
@ 2010-01-14 17:02 ` Matt Mackall
0 siblings, 0 replies; 11+ messages in thread
From: Matt Mackall @ 2010-01-14 17:02 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andi Kleen, Andrew Morton, LKML, Rafael J. Wysocki
On Thu, 2010-01-14 at 16:52 +0800, Wu Fengguang wrote:
> On Thu, Jan 14, 2010 at 04:30:48PM +0800, Andi Kleen wrote:
> > > ---
> > > pagemap: early return on unmapped areas
> >
> > Yes seems like a reasonable idea.
>
> Thanks!
>
> I just changed the ">" to ">=", because it happen to make a difference
> for cp/cat, which does 4k sized read:
>
> 4k read buffer = 4k/8 pages = 512 pages = NR_PMD_PAGES
>
> So the change to ">=" makes cp exit immediately (4k buffer matches
> exactly one PMD hole).
>
> before
> dd if=/proc/$$/pagemap of=/dev/null bs=4k
> 512+0 records in
> 512+0 records out
> 2097152 bytes (2.1 MB) copied, 0.0225796 s, 92.9 MB/s
>
> after
> dd if=/proc/$$/pagemap of=/dev/null bs=4k
> 0+0 records in
> 0+0 records out
> 0 bytes (0 B) copied, 2.8029e-05 s, 0.0 kB/s
Have you guys lost your minds?
I regularly used the exact standard tool you just broke above to test
pagemap when I was developing it. There's no doubt that this patch (or
any similarly misguided ones) WILL regress existing naive or not
intended to be portable 32-bit users. And if I was some poor future
bastard trying to write a program against pagemap with your patch, odds
of getting a concussion banging my head against the wall trying to
figure out its highly non-file-like behavior would be quite high.
Why are we even still on this. There's no regression. LTP + pagemap +
64-bit has always behaved thus. Nor are large virtual files new. And
LTP's n00b assumption that it can just read all of any file in /proc has
always been wrong. LTP had the exact same problem on 64-bit /proc/kcore
(how long has that existed?) and finally fixed it in 2005.
--
http://selenic.com : development and support for Mercurial and Linux
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-01-14 17:02 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-10 2:09 2.6.33 pagemap endless read loop Andi Kleen
2010-01-10 6:33 ` Américo Wang
2010-01-10 12:37 ` Andi Kleen
2010-01-12 5:59 ` Mike Galbraith
2010-01-12 15:32 ` Américo Wang
2010-01-12 16:28 ` Andi Kleen
2010-01-13 2:08 ` Américo Wang
[not found] ` <20100113141648.42eff724.akpm@linux-foundation.org>
[not found] ` <1263421562.29868.5000.camel@calx>
[not found] ` <20100113143055.7bea30c5.akpm@linux-foundation.org>
[not found] ` <1263423916.29868.5009.camel@calx>
[not found] ` <20100113151502.fa425247.akpm@linux-foundation.org>
[not found] ` <20100114055319.GA17671@localhost>
[not found] ` <20100113221013.b1beecb4.akpm@linux-foundation.org>
2010-01-14 8:06 ` Wu Fengguang
2010-01-14 8:30 ` Andi Kleen
2010-01-14 8:52 ` [PATCH] pagemap: early return on unmapped areas Wu Fengguang
2010-01-14 17:02 ` Matt Mackall
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox