* [PATCH 0/3] /proc/kmem cleanups and hwpoison bits
@ 2009-09-15 2:18 Wu Fengguang
2009-09-15 2:18 ` [PATCH 1/3] vmalloc: ignore vmalloc area holes in vwrite() Wu Fengguang
` (3 more replies)
0 siblings, 4 replies; 20+ messages in thread
From: Wu Fengguang @ 2009-09-15 2:18 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki, Andrew Morton
Cc: Benjamin Herrenschmidt, Christoph Lameter, Ingo Molnar, Tejun Heo,
Nick Piggin, Wu Fengguang, LKML
Hi Kame,
Here are 3 more kmem patches in my queue. Comments are welcome.
If they look good to you, I can send you all the recent kmem cleanup
patches together.
Thanks,
Fengguang
--
^ permalink raw reply [flat|nested] 20+ messages in thread* [PATCH 1/3] vmalloc: ignore vmalloc area holes in vwrite() 2009-09-15 2:18 [PATCH 0/3] /proc/kmem cleanups and hwpoison bits Wu Fengguang @ 2009-09-15 2:18 ` Wu Fengguang 2009-09-15 2:34 ` KAMEZAWA Hiroyuki 2009-09-15 2:18 ` [PATCH 2/3] devmem: handle partial kmem write/read Wu Fengguang ` (2 subsequent siblings) 3 siblings, 1 reply; 20+ messages in thread From: Wu Fengguang @ 2009-09-15 2:18 UTC (permalink / raw) To: KAMEZAWA Hiroyuki, Andrew Morton Cc: Benjamin Herrenschmidt, Wu Fengguang, Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML [-- Attachment #1: vwrite-ret.patch --] [-- Type: text/plain, Size: 1970 bytes --] Silently ignore all vmalloc area holes in vwrite(), and report success to the caller even if nothing is written. CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> --- mm/vmalloc.c | 13 ++----------- 1 file changed, 2 insertions(+), 11 deletions(-) --- linux-mm.orig/mm/vmalloc.c 2009-09-15 10:08:33.000000000 +0800 +++ linux-mm/mm/vmalloc.c 2009-09-15 10:14:18.000000000 +0800 @@ -1805,10 +1805,8 @@ finished: * @addr: vm address. * @count: number of bytes to be read. * - * Returns # of bytes which addr and buf should be incresed. + * Returns # of bytes which addr and buf should be increased. * (same number to @count). - * If [addr...addr+count) doesn't includes any intersect with valid - * vmalloc area, returns 0. * * This function checks that addr is a valid vmalloc'ed area, and * copy data from a buffer to the given addr. If specified range of @@ -1816,8 +1814,6 @@ finished: * proper area of @buf. If there are memory holes, no copy to hole. * IOREMAP area is treated as memory hole and no copy is done. * - * If [addr...addr+count) doesn't includes any intersects with alive - * vm_struct area, returns 0. * @buf should be kernel's buffer. 
Because this function uses KM_USER0, * the caller should guarantee KM_USER0 is not used. * @@ -1834,7 +1830,6 @@ long vwrite(char *buf, char *addr, unsig struct vm_struct *tmp; char *vaddr; unsigned long n, buflen; - int copied = 0; /* Don't allow overflow */ if ((unsigned long) addr + count < count) @@ -1856,18 +1851,14 @@ long vwrite(char *buf, char *addr, unsig n = vaddr + tmp->size - PAGE_SIZE - addr; if (n > count) n = count; - if (!(tmp->flags & VM_IOREMAP)) { + if (!(tmp->flags & VM_IOREMAP)) aligned_vwrite(buf, addr, n); - copied++; - } buf += n; addr += n; count -= n; } finished: read_unlock(&vmlist_lock); - if (!copied) - return 0; return buflen; } -- ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/3] vmalloc: ignore vmalloc area holes in vwrite() 2009-09-15 2:18 ` [PATCH 1/3] vmalloc: ignore vmalloc area holes in vwrite() Wu Fengguang @ 2009-09-15 2:34 ` KAMEZAWA Hiroyuki 2009-09-15 3:15 ` Wu Fengguang 0 siblings, 1 reply; 20+ messages in thread From: KAMEZAWA Hiroyuki @ 2009-09-15 2:34 UTC (permalink / raw) To: Wu Fengguang Cc: Andrew Morton, Benjamin Herrenschmidt, Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML On Tue, 15 Sep 2009 10:18:52 +0800 Wu Fengguang <fengguang.wu@intel.com> wrote: > Siliently ignore all vmalloc area holes in vwrite(), > and report success to the caller even if nothing is written. > > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Why don't you modify vread() at the same time ? Because /proc/kcore ignores return value of vread(), I think you can modify it without no side-effect. Regards, -Kame > --- > mm/vmalloc.c | 13 ++----------- > 1 file changed, 2 insertions(+), 11 deletions(-) > > --- linux-mm.orig/mm/vmalloc.c 2009-09-15 10:08:33.000000000 +0800 > +++ linux-mm/mm/vmalloc.c 2009-09-15 10:14:18.000000000 +0800 > @@ -1805,10 +1805,8 @@ finished: > * @addr: vm address. > * @count: number of bytes to be read. > * > - * Returns # of bytes which addr and buf should be incresed. > + * Returns # of bytes which addr and buf should be increased. > * (same number to @count). > - * If [addr...addr+count) doesn't includes any intersect with valid > - * vmalloc area, returns 0. > * > * This function checks that addr is a valid vmalloc'ed area, and > * copy data from a buffer to the given addr. If specified range of > @@ -1816,8 +1814,6 @@ finished: > * proper area of @buf. If there are memory holes, no copy to hole. > * IOREMAP area is treated as memory hole and no copy is done. > * > - * If [addr...addr+count) doesn't includes any intersects with alive > - * vm_struct area, returns 0. > * @buf should be kernel's buffer. 
Because this function uses KM_USER0, > * the caller should guarantee KM_USER0 is not used. > * > @@ -1834,7 +1830,6 @@ long vwrite(char *buf, char *addr, unsig > struct vm_struct *tmp; > char *vaddr; > unsigned long n, buflen; > - int copied = 0; > > /* Don't allow overflow */ > if ((unsigned long) addr + count < count) > @@ -1856,18 +1851,14 @@ long vwrite(char *buf, char *addr, unsig > n = vaddr + tmp->size - PAGE_SIZE - addr; > if (n > count) > n = count; > - if (!(tmp->flags & VM_IOREMAP)) { > + if (!(tmp->flags & VM_IOREMAP)) > aligned_vwrite(buf, addr, n); > - copied++; > - } > buf += n; > addr += n; > count -= n; > } > finished: > read_unlock(&vmlist_lock); > - if (!copied) > - return 0; > return buflen; > } > > > -- > > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/3] vmalloc: ignore vmalloc area holes in vwrite() 2009-09-15 2:34 ` KAMEZAWA Hiroyuki @ 2009-09-15 3:15 ` Wu Fengguang 2009-09-15 3:18 ` KAMEZAWA Hiroyuki 0 siblings, 1 reply; 20+ messages in thread From: Wu Fengguang @ 2009-09-15 3:15 UTC (permalink / raw) To: KAMEZAWA Hiroyuki Cc: Andrew Morton, Benjamin Herrenschmidt, Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML Hi Kame, I'll go out for a while. If you are going to do your improvements to vmalloc.c, please feel free to do so. I can rebase the hwpoison bits after yours. I could do the kmem part(in a modified patch 2/3) and let it return any error code vread/vwrite reports. Ideally the kmem read/write could do if (zero bytes successfully read/written) return error_code; else return bytes_so_far; Do you agree? Thanks, Fengguang On Tue, Sep 15, 2009 at 10:34:25AM +0800, KAMEZAWA Hiroyuki wrote: > On Tue, 15 Sep 2009 10:18:52 +0800 > Wu Fengguang <fengguang.wu@intel.com> wrote: > > > Siliently ignore all vmalloc area holes in vwrite(), > > and report success to the caller even if nothing is written. > > > > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> > > Why don't you modify vread() at the same time ? > Because /proc/kcore ignores return value of vread(), I think you can > modify it without no side-effect. > > Regards, > -Kame > > > --- > > mm/vmalloc.c | 13 ++----------- > > 1 file changed, 2 insertions(+), 11 deletions(-) > > > > --- linux-mm.orig/mm/vmalloc.c 2009-09-15 10:08:33.000000000 +0800 > > +++ linux-mm/mm/vmalloc.c 2009-09-15 10:14:18.000000000 +0800 > > @@ -1805,10 +1805,8 @@ finished: > > * @addr: vm address. > > * @count: number of bytes to be read. > > * > > - * Returns # of bytes which addr and buf should be incresed. > > + * Returns # of bytes which addr and buf should be increased. > > * (same number to @count). 
> > - * If [addr...addr+count) doesn't includes any intersect with valid > > - * vmalloc area, returns 0. > > * > > * This function checks that addr is a valid vmalloc'ed area, and > > * copy data from a buffer to the given addr. If specified range of > > @@ -1816,8 +1814,6 @@ finished: > > * proper area of @buf. If there are memory holes, no copy to hole. > > * IOREMAP area is treated as memory hole and no copy is done. > > * > > - * If [addr...addr+count) doesn't includes any intersects with alive > > - * vm_struct area, returns 0. > > * @buf should be kernel's buffer. Because this function uses KM_USER0, > > * the caller should guarantee KM_USER0 is not used. > > * > > @@ -1834,7 +1830,6 @@ long vwrite(char *buf, char *addr, unsig > > struct vm_struct *tmp; > > char *vaddr; > > unsigned long n, buflen; > > - int copied = 0; > > > > /* Don't allow overflow */ > > if ((unsigned long) addr + count < count) > > @@ -1856,18 +1851,14 @@ long vwrite(char *buf, char *addr, unsig > > n = vaddr + tmp->size - PAGE_SIZE - addr; > > if (n > count) > > n = count; > > - if (!(tmp->flags & VM_IOREMAP)) { > > + if (!(tmp->flags & VM_IOREMAP)) > > aligned_vwrite(buf, addr, n); > > - copied++; > > - } > > buf += n; > > addr += n; > > count -= n; > > } > > finished: > > read_unlock(&vmlist_lock); > > - if (!copied) > > - return 0; > > return buflen; > > } > > > > > > -- > > > > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/3] vmalloc: ignore vmalloc area holes in vwrite() 2009-09-15 3:15 ` Wu Fengguang @ 2009-09-15 3:18 ` KAMEZAWA Hiroyuki 2009-09-15 5:21 ` Wu Fengguang 0 siblings, 1 reply; 20+ messages in thread From: KAMEZAWA Hiroyuki @ 2009-09-15 3:18 UTC (permalink / raw) To: Wu Fengguang Cc: Andrew Morton, Benjamin Herrenschmidt, Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML On Tue, 15 Sep 2009 11:15:07 +0800 Wu Fengguang <fengguang.wu@intel.com> wrote: > Hi Kame, > > I'll go out for a while. If you are going to do your improvements to > vmalloc.c, please feel free to do so. I can rebase the hwpoison bits > after yours. > > I could do the kmem part(in a modified patch 2/3) and let it return > any error code vread/vwrite reports. Ideally the kmem read/write > could do > > if (zero bytes successfully read/written) > return error_code; > else > return bytes_so_far; > > Do you agree? > Okay. I'll write patches for vread/vwrite. And make them just return bool. true .....requested area includes valid memory range and worth to be copied. false.....no valid range found. Thanks, -Kame > Thanks, > Fengguang > > On Tue, Sep 15, 2009 at 10:34:25AM +0800, KAMEZAWA Hiroyuki wrote: > > On Tue, 15 Sep 2009 10:18:52 +0800 > > Wu Fengguang <fengguang.wu@intel.com> wrote: > > > > > Siliently ignore all vmalloc area holes in vwrite(), > > > and report success to the caller even if nothing is written. > > > > > > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > > > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> > > > > Why don't you modify vread() at the same time ? > > Because /proc/kcore ignores return value of vread(), I think you can > > modify it without no side-effect. 
> > > > Regards, > > -Kame > > > > > --- > > > mm/vmalloc.c | 13 ++----------- > > > 1 file changed, 2 insertions(+), 11 deletions(-) > > > > > > --- linux-mm.orig/mm/vmalloc.c 2009-09-15 10:08:33.000000000 +0800 > > > +++ linux-mm/mm/vmalloc.c 2009-09-15 10:14:18.000000000 +0800 > > > @@ -1805,10 +1805,8 @@ finished: > > > * @addr: vm address. > > > * @count: number of bytes to be read. > > > * > > > - * Returns # of bytes which addr and buf should be incresed. > > > + * Returns # of bytes which addr and buf should be increased. > > > * (same number to @count). > > > - * If [addr...addr+count) doesn't includes any intersect with valid > > > - * vmalloc area, returns 0. > > > * > > > * This function checks that addr is a valid vmalloc'ed area, and > > > * copy data from a buffer to the given addr. If specified range of > > > @@ -1816,8 +1814,6 @@ finished: > > > * proper area of @buf. If there are memory holes, no copy to hole. > > > * IOREMAP area is treated as memory hole and no copy is done. > > > * > > > - * If [addr...addr+count) doesn't includes any intersects with alive > > > - * vm_struct area, returns 0. > > > * @buf should be kernel's buffer. Because this function uses KM_USER0, > > > * the caller should guarantee KM_USER0 is not used. 
> > > * > > > @@ -1834,7 +1830,6 @@ long vwrite(char *buf, char *addr, unsig > > > struct vm_struct *tmp; > > > char *vaddr; > > > unsigned long n, buflen; > > > - int copied = 0; > > > > > > /* Don't allow overflow */ > > > if ((unsigned long) addr + count < count) > > > @@ -1856,18 +1851,14 @@ long vwrite(char *buf, char *addr, unsig > > > n = vaddr + tmp->size - PAGE_SIZE - addr; > > > if (n > count) > > > n = count; > > > - if (!(tmp->flags & VM_IOREMAP)) { > > > + if (!(tmp->flags & VM_IOREMAP)) > > > aligned_vwrite(buf, addr, n); > > > - copied++; > > > - } > > > buf += n; > > > addr += n; > > > count -= n; > > > } > > > finished: > > > read_unlock(&vmlist_lock); > > > - if (!copied) > > > - return 0; > > > return buflen; > > > } > > > > > > > > > -- > > > > > > > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/3] vmalloc: ignore vmalloc area holes in vwrite() 2009-09-15 3:18 ` KAMEZAWA Hiroyuki @ 2009-09-15 5:21 ` Wu Fengguang 0 siblings, 0 replies; 20+ messages in thread From: Wu Fengguang @ 2009-09-15 5:21 UTC (permalink / raw) To: KAMEZAWA Hiroyuki Cc: Andrew Morton, Benjamin Herrenschmidt, Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML On Tue, Sep 15, 2009 at 11:18:29AM +0800, KAMEZAWA Hiroyuki wrote: > On Tue, 15 Sep 2009 11:15:07 +0800 > Wu Fengguang <fengguang.wu@intel.com> wrote: > > > Hi Kame, > > > > I'll go out for a while. If you are going to do your improvements to > > vmalloc.c, please feel free to do so. I can rebase the hwpoison bits > > after yours. > > > > I could do the kmem part(in a modified patch 2/3) and let it return > > any error code vread/vwrite reports. Ideally the kmem read/write > > could do > > > > if (zero bytes successfully read/written) > > return error_code; > > else > > return bytes_so_far; > > > > Do you agree? > > > Okay. I'll write patches for vread/vwrite. And make them just return bool. bool may not be sufficient? For obviously you need to return -EFAULT for invalid address and I need to return -EIO for hwpoison pages. (I assume you put is_vmalloc_or_module_addr() immediately before the vmalloc_to_page() calls in vmalloc.c, which seems a more natural place than in mem.c.) Thanks, Fengguang > true .....requested area includes valid memory range and worth to be copied. > false.....no valid range found. > > Thanks, > -Kame > > > > > > Thanks, > > Fengguang > > > > On Tue, Sep 15, 2009 at 10:34:25AM +0800, KAMEZAWA Hiroyuki wrote: > > > On Tue, 15 Sep 2009 10:18:52 +0800 > > > Wu Fengguang <fengguang.wu@intel.com> wrote: > > > > > > > Siliently ignore all vmalloc area holes in vwrite(), > > > > and report success to the caller even if nothing is written. 
> > > > > > > > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > > > > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> > > > > > > Why don't you modify vread() at the same time ? > > > Because /proc/kcore ignores return value of vread(), I think you can > > > modify it without no side-effect. > > > > > > Regards, > > > -Kame > > > > > > > --- > > > > mm/vmalloc.c | 13 ++----------- > > > > 1 file changed, 2 insertions(+), 11 deletions(-) > > > > > > > > --- linux-mm.orig/mm/vmalloc.c 2009-09-15 10:08:33.000000000 +0800 > > > > +++ linux-mm/mm/vmalloc.c 2009-09-15 10:14:18.000000000 +0800 > > > > @@ -1805,10 +1805,8 @@ finished: > > > > * @addr: vm address. > > > > * @count: number of bytes to be read. > > > > * > > > > - * Returns # of bytes which addr and buf should be incresed. > > > > + * Returns # of bytes which addr and buf should be increased. > > > > * (same number to @count). > > > > - * If [addr...addr+count) doesn't includes any intersect with valid > > > > - * vmalloc area, returns 0. > > > > * > > > > * This function checks that addr is a valid vmalloc'ed area, and > > > > * copy data from a buffer to the given addr. If specified range of > > > > @@ -1816,8 +1814,6 @@ finished: > > > > * proper area of @buf. If there are memory holes, no copy to hole. > > > > * IOREMAP area is treated as memory hole and no copy is done. > > > > * > > > > - * If [addr...addr+count) doesn't includes any intersects with alive > > > > - * vm_struct area, returns 0. > > > > * @buf should be kernel's buffer. Because this function uses KM_USER0, > > > > * the caller should guarantee KM_USER0 is not used. 
> > > > * > > > > @@ -1834,7 +1830,6 @@ long vwrite(char *buf, char *addr, unsig > > > > struct vm_struct *tmp; > > > > char *vaddr; > > > > unsigned long n, buflen; > > > > - int copied = 0; > > > > > > > > /* Don't allow overflow */ > > > > if ((unsigned long) addr + count < count) > > > > @@ -1856,18 +1851,14 @@ long vwrite(char *buf, char *addr, unsig > > > > n = vaddr + tmp->size - PAGE_SIZE - addr; > > > > if (n > count) > > > > n = count; > > > > - if (!(tmp->flags & VM_IOREMAP)) { > > > > + if (!(tmp->flags & VM_IOREMAP)) > > > > aligned_vwrite(buf, addr, n); > > > > - copied++; > > > > - } > > > > buf += n; > > > > addr += n; > > > > count -= n; > > > > } > > > > finished: > > > > read_unlock(&vmlist_lock); > > > > - if (!copied) > > > > - return 0; > > > > return buflen; > > > > } > > > > > > > > > > > > -- > > > > > > > > > > ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 2/3] devmem: handle partial kmem write/read 2009-09-15 2:18 [PATCH 0/3] /proc/kmem cleanups and hwpoison bits Wu Fengguang 2009-09-15 2:18 ` [PATCH 1/3] vmalloc: ignore vmalloc area holes in vwrite() Wu Fengguang @ 2009-09-15 2:18 ` Wu Fengguang 2009-09-15 2:38 ` KAMEZAWA Hiroyuki 2009-09-15 2:18 ` [PATCH 3/3] HWPOISON: prevent /dev/kmem users from accessing hwpoison pages Wu Fengguang 2009-09-15 3:09 ` [PATCH 0/3] /proc/kmem cleanups and hwpoison bits KAMEZAWA Hiroyuki 3 siblings, 1 reply; 20+ messages in thread From: Wu Fengguang @ 2009-09-15 2:18 UTC (permalink / raw) To: KAMEZAWA Hiroyuki, Andrew Morton Cc: Benjamin Herrenschmidt, Greg KH, Andi Kleen, Wu Fengguang, Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML [-- Attachment #1: kmem-vwrite-ret.patch --] [-- Type: text/plain, Size: 1579 bytes --] Return early on partial read/write, which may happen in future. (eg. hit hwpoison pages) CC: Greg KH <greg@kroah.com> CC: Andi Kleen <andi@firstfloor.org> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> --- drivers/char/mem.c | 30 ++++++++++++++++++------------ mm/vmalloc.c | 10 +--------- 2 files changed, 19 insertions(+), 21 deletions(-) --- linux-mm.orig/drivers/char/mem.c 2009-09-15 09:44:49.000000000 +0800 +++ linux-mm/drivers/char/mem.c 2009-09-15 09:44:55.000000000 +0800 @@ -444,18 +444,22 @@ static ssize_t read_kmem(struct file *fi if (!kbuf) return -ENOMEM; while (count > 0) { + unsigned long n; + sz = size_inside_page(p, count); - sz = vread(kbuf, (char *)p, sz); - if (!sz) + n = vread(kbuf, (char *)p, sz); + if (!n) break; - if (copy_to_user(buf, kbuf, sz)) { + if (copy_to_user(buf, kbuf, n)) { free_page((unsigned long)kbuf); return -EFAULT; } - count -= sz; - buf += sz; - read += sz; - p += sz; + count -= n; + buf += n; + read += n; + p += n; + if (n < sz) + break; } free_page((unsigned long)kbuf); } @@ -551,11 +555,13 @@ static ssize_t write_kmem(struct file * 
free_page((unsigned long)kbuf); return -EFAULT; } - sz = vwrite(kbuf, (char *)p, sz); - count -= sz; - buf += sz; - virtr += sz; - p += sz; + n = vwrite(kbuf, (char *)p, sz); + count -= n; + buf += n; + virtr += n; + p += n; + if (n < sz) + break; } free_page((unsigned long)kbuf); } -- ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] devmem: handle partial kmem write/read 2009-09-15 2:18 ` [PATCH 2/3] devmem: handle partial kmem write/read Wu Fengguang @ 2009-09-15 2:38 ` KAMEZAWA Hiroyuki 2009-09-15 9:01 ` Wu Fengguang 0 siblings, 1 reply; 20+ messages in thread From: KAMEZAWA Hiroyuki @ 2009-09-15 2:38 UTC (permalink / raw) To: Wu Fengguang Cc: Andrew Morton, Benjamin Herrenschmidt, Greg KH, Andi Kleen, Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML On Tue, 15 Sep 2009 10:18:53 +0800 Wu Fengguang <fengguang.wu@intel.com> wrote: > Return early on partial read/write, which may happen in future. > (eg. hit hwpoison pages) > Hmm, please modify vread() as you did in vwrite() and == kbuf = (char *)__get_free_page(GFP_KERNEL); if (!kbuf) return -ENOMEM; == Add __GFP_ZERO to kbuf allocation, and just ignore vread()'s return value. Then, this will be much simpler. Thanks, -Kame > CC: Greg KH <greg@kroah.com> > CC: Andi Kleen <andi@firstfloor.org> > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> > --- > drivers/char/mem.c | 30 ++++++++++++++++++------------ > mm/vmalloc.c | 10 +--------- > 2 files changed, 19 insertions(+), 21 deletions(-) > > --- linux-mm.orig/drivers/char/mem.c 2009-09-15 09:44:49.000000000 +0800 > +++ linux-mm/drivers/char/mem.c 2009-09-15 09:44:55.000000000 +0800 > @@ -444,18 +444,22 @@ static ssize_t read_kmem(struct file *fi > if (!kbuf) > return -ENOMEM; > while (count > 0) { > + unsigned long n; > + > sz = size_inside_page(p, count); > - sz = vread(kbuf, (char *)p, sz); > - if (!sz) > + n = vread(kbuf, (char *)p, sz); > + if (!n) > break; > - if (copy_to_user(buf, kbuf, sz)) { > + if (copy_to_user(buf, kbuf, n)) { > free_page((unsigned long)kbuf); > return -EFAULT; > } > - count -= sz; > - buf += sz; > - read += sz; > - p += sz; > + count -= n; > + buf += n; > + read += n; > + p += n; > + if (n < sz) > + break; > } > free_page((unsigned long)kbuf); > } > @@ -551,11 +555,13 @@ static 
ssize_t write_kmem(struct file * > free_page((unsigned long)kbuf); > return -EFAULT; > } > - sz = vwrite(kbuf, (char *)p, sz); > - count -= sz; > - buf += sz; > - virtr += sz; > - p += sz; > + n = vwrite(kbuf, (char *)p, sz); > + count -= n; > + buf += n; > + virtr += n; > + p += n; > + if (n < sz) > + break; > } > free_page((unsigned long)kbuf); > } > > -- > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] devmem: handle partial kmem write/read 2009-09-15 2:38 ` KAMEZAWA Hiroyuki @ 2009-09-15 9:01 ` Wu Fengguang 2009-09-15 9:03 ` Wu Fengguang 0 siblings, 1 reply; 20+ messages in thread From: Wu Fengguang @ 2009-09-15 9:01 UTC (permalink / raw) To: KAMEZAWA Hiroyuki Cc: Andrew Morton, Benjamin Herrenschmidt, Greg KH, Andi Kleen, Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML On Tue, Sep 15, 2009 at 10:38:11AM +0800, KAMEZAWA Hiroyuki wrote: > On Tue, 15 Sep 2009 10:18:53 +0800 > Wu Fengguang <fengguang.wu@intel.com> wrote: > > > Return early on partial read/write, which may happen in future. > > (eg. hit hwpoison pages) > > > Hmm, please modify vread() as you did in vwrite() and > > == > kbuf = (char *)__get_free_page(GFP_KERNEL); > if (!kbuf) > return -ENOMEM; > == > Add __GFP_ZERO to kbuf allocation, and just ignore vread()'s return value. > Then, this will be much simpler. Thanks, here is the updated patch, which - updates vread/vwrite prototype to return 0 (or error code in future). - does zero fill in the callers Comment updates are ignored for now. Thanks, Fengguang --- vmalloc: ignore vmalloc area holes in vread()/vwrite() Silently ignore all vmalloc area holes in vread()/vwrite(), and report success (or error code in future) to the caller. When /dev/kmem read()/write() encounters hwpoison page, stop it and return the amount of work done till now. 
CC: Andi Kleen <andi@firstfloor.org> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Christoph Lameter <cl@linux-foundation.org> CC: Ingo Molnar <mingo@elte.hu> CC: Tejun Heo <tj@kernel.org> CC: Nick Piggin <npiggin@suse.de> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> --- drivers/char/mem.c | 16 ++++++++++------ include/linux/vmalloc.h | 6 +++--- mm/vmalloc.c | 31 +++++++------------------------ 3 files changed, 20 insertions(+), 33 deletions(-) --- linux-mm.orig/mm/vmalloc.c 2009-09-15 16:38:49.000000000 +0800 +++ linux-mm/mm/vmalloc.c 2009-09-15 16:39:31.000000000 +0800 @@ -1752,11 +1752,10 @@ static int aligned_vwrite(char *buf, cha * */ -long vread(char *buf, char *addr, unsigned long count) +int vread(char *buf, char *addr, unsigned long count) { struct vm_struct *tmp; char *vaddr, *buf_start = buf; - unsigned long buflen = count; unsigned long n; /* Don't allow overflow */ @@ -1790,13 +1789,7 @@ long vread(char *buf, char *addr, unsign finished: read_unlock(&vmlist_lock); - if (buf == buf_start) - return 0; - /* zero-fill memory holes */ - if (buf != buf_start + buflen) - memset(buf, 0, buflen - (buf - buf_start)); - - return buflen; + return 0; } /** @@ -1805,10 +1798,8 @@ finished: * @addr: vm address. * @count: number of bytes to be read. * - * Returns # of bytes which addr and buf should be incresed. + * Returns # of bytes which addr and buf should be increased. * (same number to @count). - * If [addr...addr+count) doesn't includes any intersect with valid - * vmalloc area, returns 0. * * This function checks that addr is a valid vmalloc'ed area, and * copy data from a buffer to the given addr. If specified range of @@ -1816,8 +1807,6 @@ finished: * proper area of @buf. If there are memory holes, no copy to hole. * IOREMAP area is treated as memory hole and no copy is done. * - * If [addr...addr+count) doesn't includes any intersects with alive - * vm_struct area, returns 0. 
* @buf should be kernel's buffer. Because this function uses KM_USER0, * the caller should guarantee KM_USER0 is not used. * @@ -1829,17 +1818,15 @@ finished: * The caller should guarantee KM_USER1 is not used. */ -long vwrite(char *buf, char *addr, unsigned long count) +int vwrite(char *buf, char *addr, unsigned long count) { struct vm_struct *tmp; char *vaddr; - unsigned long n, buflen; - int copied = 0; + unsigned long n; /* Don't allow overflow */ if ((unsigned long) addr + count < count) count = -(unsigned long) addr; - buflen = count; read_lock(&vmlist_lock); for (tmp = vmlist; count && tmp; tmp = tmp->next) { @@ -1856,19 +1843,15 @@ long vwrite(char *buf, char *addr, unsig n = vaddr + tmp->size - PAGE_SIZE - addr; if (n > count) n = count; - if (!(tmp->flags & VM_IOREMAP)) { + if (!(tmp->flags & VM_IOREMAP)) aligned_vwrite(buf, addr, n); - copied++; - } buf += n; addr += n; count -= n; } finished: read_unlock(&vmlist_lock); - if (!copied) - return 0; - return buflen; + return 0; } /** --- linux-mm.orig/include/linux/vmalloc.h 2009-09-15 16:38:49.000000000 +0800 +++ linux-mm/include/linux/vmalloc.h 2009-09-15 16:39:31.000000000 +0800 @@ -104,9 +104,9 @@ extern void unmap_kernel_range(unsigned extern struct vm_struct *alloc_vm_area(size_t size); extern void free_vm_area(struct vm_struct *area); -/* for /dev/kmem */ -extern long vread(char *buf, char *addr, unsigned long count); -extern long vwrite(char *buf, char *addr, unsigned long count); +/* for /dev/kmem and /dev/kcore */ +extern int vread(char *buf, char *addr, unsigned long count); +extern int vwrite(char *buf, char *addr, unsigned long count); /* * Internals. Dont't use.. 
--- linux-mm.orig/drivers/char/mem.c 2009-09-15 15:26:11.000000000 +0800 +++ linux-mm/drivers/char/mem.c 2009-09-15 17:00:45.000000000 +0800 @@ -399,6 +399,7 @@ static ssize_t read_kmem(struct file *fi unsigned long p = *ppos; ssize_t low_count, read, sz; char * kbuf; /* k-addr because vread() takes vmlist_lock rwlock */ + int ret = 0; read = 0; if (p < (unsigned long) high_memory) { @@ -440,13 +441,13 @@ static ssize_t read_kmem(struct file *fi } if (count > 0) { - kbuf = (char *)__get_free_page(GFP_KERNEL); + kbuf = (char *)__get_free_page(GFP_KERNEL | __GFP_ZERO); if (!kbuf) return -ENOMEM; while (count > 0) { sz = size_inside_page(p, count); - sz = vread(kbuf, (char *)p, sz); - if (!sz) + ret = vread(kbuf, (char *)p, sz); + if (ret) break; if (copy_to_user(buf, kbuf, sz)) { free_page((unsigned long)kbuf); @@ -460,7 +461,7 @@ static ssize_t read_kmem(struct file *fi free_page((unsigned long)kbuf); } *ppos = p; - return read; + return read ? read : ret; } @@ -524,6 +525,7 @@ static ssize_t write_kmem(struct file * ssize_t wrote = 0; ssize_t virtr = 0; char * kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */ + int ret = 0; if (p < (unsigned long) high_memory) { unsigned long to_write = min_t(unsigned long, count, @@ -551,7 +553,9 @@ static ssize_t write_kmem(struct file * free_page((unsigned long)kbuf); return -EFAULT; } - sz = vwrite(kbuf, (char *)p, sz); + ret = vwrite(kbuf, (char *)p, sz); + if (ret) + break; count -= sz; buf += sz; virtr += sz; @@ -561,7 +565,7 @@ static ssize_t write_kmem(struct file * } *ppos = p; - return virtr + wrote; + return virtr + wrote ? virtr + wrote : ret; } #endif ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/3] devmem: handle partial kmem write/read 2009-09-15 9:01 ` Wu Fengguang @ 2009-09-15 9:03 ` Wu Fengguang 0 siblings, 0 replies; 20+ messages in thread From: Wu Fengguang @ 2009-09-15 9:03 UTC (permalink / raw) To: KAMEZAWA Hiroyuki Cc: Andrew Morton, Benjamin Herrenschmidt, Greg KH, Andi Kleen, Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML On Tue, Sep 15, 2009 at 05:01:30PM +0800, Wu Fengguang wrote: > On Tue, Sep 15, 2009 at 10:38:11AM +0800, KAMEZAWA Hiroyuki wrote: > > On Tue, 15 Sep 2009 10:18:53 +0800 > > Wu Fengguang <fengguang.wu@intel.com> wrote: > > > > > Return early on partial read/write, which may happen in future. > > > (eg. hit hwpoison pages) > > > > > Hmm, please modify vread() as you did in vwrite() and > > > > == > > kbuf = (char *)__get_free_page(GFP_KERNEL); > > if (!kbuf) > > return -ENOMEM; > > == > > Add __GFP_ZERO to kbuf allocation, and just ignore vread()'s return value. > > Then, this will be much simpler. > > Thanks, here is the updated patch, which > - updates vread/vwrite prototype to return 0 (or error code in future). > - do zero fill in the callers > Comment updates are ignored for now. 
This patch also makes it trivial to add your is_vmalloc_or_module_addr() fixes, like this :) --- From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> --- drivers/char/mem.c | 8 ++++++++ 1 file changed, 8 insertions(+) --- linux-mm.orig/drivers/char/mem.c 2009-09-15 16:58:06.000000000 +0800 +++ linux-mm/drivers/char/mem.c 2009-09-15 16:58:12.000000000 +0800 @@ -446,6 +446,10 @@ static ssize_t read_kmem(struct file *fi return -ENOMEM; while (count > 0) { sz = size_inside_page(p, count); + if (!is_vmalloc_or_module_addr((void*)p)) { + ret = -EFAULT; + break; + } ret = vread(kbuf, (char *)p, sz); if (ret) break; @@ -546,6 +550,10 @@ static ssize_t write_kmem(struct file * unsigned long sz = size_inside_page(p, count); unsigned long n; + if (!is_vmalloc_or_module_addr((void*)p)) { + ret = -EFAULT; + break; + } n = copy_from_user(kbuf, buf, sz); if (n) { if (wrote + virtr) ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 3/3] HWPOISON: prevent /dev/kmem users from accessing hwpoison pages 2009-09-15 2:18 [PATCH 0/3] /proc/kmem cleanups and hwpoison bits Wu Fengguang 2009-09-15 2:18 ` [PATCH 1/3] vmalloc: ignore vmalloc area holes in vwrite() Wu Fengguang 2009-09-15 2:18 ` [PATCH 2/3] devmem: handle partial kmem write/read Wu Fengguang @ 2009-09-15 2:18 ` Wu Fengguang 2009-09-15 2:45 ` KAMEZAWA Hiroyuki 2009-09-15 3:09 ` [PATCH 0/3] /proc/kmem cleanups and hwpoison bits KAMEZAWA Hiroyuki 3 siblings, 1 reply; 20+ messages in thread From: Wu Fengguang @ 2009-09-15 2:18 UTC (permalink / raw) To: KAMEZAWA Hiroyuki, Andrew Morton Cc: Benjamin Herrenschmidt, Greg KH, Andi Kleen, Wu Fengguang, Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML [-- Attachment #1: kmem-dev-kmem.patch --] [-- Type: text/plain, Size: 5557 bytes --] When /dev/kmem read()/write() encounters hwpoison page, stop it and return the amount of work done till now. CC: Greg KH <greg@kroah.com> CC: Andi Kleen <andi@firstfloor.org> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> --- drivers/char/mem.c | 18 +++++++++++---- mm/vmalloc.c | 51 +++++++++++++++++++++++++++---------------- 2 files changed, 47 insertions(+), 22 deletions(-) --- linux-mm.orig/drivers/char/mem.c 2009-09-15 10:14:20.000000000 +0800 +++ linux-mm/drivers/char/mem.c 2009-09-15 10:14:25.000000000 +0800 @@ -429,6 +429,9 @@ static ssize_t read_kmem(struct file *fi */ kbuf = xlate_dev_kmem_ptr((char *)p); + if (unlikely(virt_addr_valid(kbuf) && + PageHWPoison(virt_to_page(kbuf)))) + return -EIO; if (copy_to_user(buf, kbuf, sz)) return -EFAULT; buf += sz; @@ -474,6 +477,7 @@ do_write_kmem(unsigned long p, const cha { ssize_t written, sz; unsigned long copied; + int err = 0; written = 0; #ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED @@ -500,13 +504,19 @@ do_write_kmem(unsigned long p, const cha */ ptr = xlate_dev_kmem_ptr((char *)p); + if (unlikely(virt_addr_valid(ptr) && + 
PageHWPoison(virt_to_page(ptr)))) { + err = -EIO; + break; + } + copied = copy_from_user(ptr, buf, sz); if (copied) { written += sz - copied; - if (written) - break; - return -EFAULT; + err = -EFAULT; + break; } + buf += sz; p += sz; count -= sz; @@ -514,7 +524,7 @@ do_write_kmem(unsigned long p, const cha } *ppos += written; - return written; + return written ? written : err; } --- linux-mm.orig/mm/vmalloc.c 2009-09-15 10:14:18.000000000 +0800 +++ linux-mm/mm/vmalloc.c 2009-09-15 10:17:20.000000000 +0800 @@ -1661,6 +1661,8 @@ static int aligned_vread(char *buf, char if (length > count) length = count; p = vmalloc_to_page(addr); + if (unlikely(p && PageHWPoison(p))) + break; /* * To do safe access to this _mapped_ area, we need * lock. But adding lock here means that we need to add @@ -1700,6 +1702,8 @@ static int aligned_vwrite(char *buf, cha if (length > count) length = count; p = vmalloc_to_page(addr); + if (unlikely(p && PageHWPoison(p))) + break; /* * To do safe access to this _mapped_ area, we need * lock. But adding lock here means that we need to add @@ -1731,8 +1735,10 @@ static int aligned_vwrite(char *buf, cha * @count: number of bytes to be read. * * Returns # of bytes which addr and buf should be increased. - * (same number to @count). Returns 0 if [addr...addr+count) doesn't - * includes any intersect with alive vmalloc area. + * (same number to @count if no hwpoison pages encountered). + * + * Returns 0 if [addr...addr+count) doesn't includes any intersect with + * alive vmalloc area. * * This function checks that addr is a valid vmalloc'ed area, and * copy data from that area to a given buffer. If the given memory range @@ -1740,8 +1746,6 @@ static int aligned_vwrite(char *buf, cha * proper area of @buf. If there are memory holes, they'll be zero-filled. * IOREMAP area is treated as memory hole and no copy is done. * - * If [addr...addr+count) doesn't includes any intersects with alive - * vm_struct area, returns 0. * @buf should be kernel's buffer. 
Because this function uses KM_USER0, * the caller should guarantee KM_USER0 is not used. * @@ -1757,7 +1761,8 @@ long vread(char *buf, char *addr, unsign struct vm_struct *tmp; char *vaddr, *buf_start = buf; unsigned long buflen = count; - unsigned long n; + unsigned long ret = 0; + unsigned long n = 0; /* Don't allow overflow */ if ((unsigned long) addr + count < count) @@ -1780,12 +1785,16 @@ long vread(char *buf, char *addr, unsign if (n > count) n = count; if (!(tmp->flags & VM_IOREMAP)) - aligned_vread(buf, addr, n); - else /* IOREMAP area is treated as memory hole */ + ret = aligned_vread(buf, addr, n); + else { /* IOREMAP area is treated as memory hole */ memset(buf, 0, n); - buf += n; - addr += n; - count -= n; + ret = n; + } + buf += ret; + addr += ret; + count -= ret; + if (ret < n) + break; } finished: read_unlock(&vmlist_lock); @@ -1796,7 +1805,7 @@ finished: if (buf != buf_start + buflen) memset(buf, 0, buflen - (buf - buf_start)); - return buflen; + return ret == n ? buflen : buflen - count; } /** @@ -1806,7 +1815,7 @@ finished: * @count: number of bytes to be read. * * Returns # of bytes which addr and buf should be increased. - * (same number to @count). + * (same number to @count if no hwpoison pages encountered). * * This function checks that addr is a valid vmalloc'ed area, and * copy data from a buffer to the given addr. 
If specified range of @@ -1829,7 +1838,9 @@ long vwrite(char *buf, char *addr, unsig { struct vm_struct *tmp; char *vaddr; - unsigned long n, buflen; + unsigned long buflen; + unsigned long ret = 0; + unsigned long n = 0; /* Don't allow overflow */ if ((unsigned long) addr + count < count) @@ -1852,14 +1863,18 @@ long vwrite(char *buf, char *addr, unsig if (n > count) n = count; if (!(tmp->flags & VM_IOREMAP)) - aligned_vwrite(buf, addr, n); - buf += n; - addr += n; - count -= n; + ret = aligned_vwrite(buf, addr, n); + else + ret = n; + buf += ret; + addr += ret; + count -= ret; + if (ret < n) + break; } finished: read_unlock(&vmlist_lock); - return buflen; + return ret == n ? buflen : buflen - count; } /** -- ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 3/3] HWPOISON: prevent /dev/kmem users from accessing hwpoison pages 2009-09-15 2:18 ` [PATCH 3/3] HWPOISON: prevent /dev/kmem users from accessing hwpoison pages Wu Fengguang @ 2009-09-15 2:45 ` KAMEZAWA Hiroyuki 0 siblings, 0 replies; 20+ messages in thread From: KAMEZAWA Hiroyuki @ 2009-09-15 2:45 UTC (permalink / raw) To: Wu Fengguang Cc: Andrew Morton, Benjamin Herrenschmidt, Greg KH, Andi Kleen, Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML On Tue, 15 Sep 2009 10:18:54 +0800 Wu Fengguang <fengguang.wu@intel.com> wrote: > When /dev/kmem read()/write() encounters hwpoison page, stop it > and return the amount of work done till now. > Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > CC: Greg KH <greg@kroah.com> > CC: Andi Kleen <andi@firstfloor.org> > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> > --- > drivers/char/mem.c | 18 +++++++++++---- > mm/vmalloc.c | 51 +++++++++++++++++++++++++++---------------- > 2 files changed, 47 insertions(+), 22 deletions(-) > > --- linux-mm.orig/drivers/char/mem.c 2009-09-15 10:14:20.000000000 +0800 > +++ linux-mm/drivers/char/mem.c 2009-09-15 10:14:25.000000000 +0800 > @@ -429,6 +429,9 @@ static ssize_t read_kmem(struct file *fi > */ > kbuf = xlate_dev_kmem_ptr((char *)p); > > + if (unlikely(virt_addr_valid(kbuf) && > + PageHWPoison(virt_to_page(kbuf)))) > + return -EIO; > if (copy_to_user(buf, kbuf, sz)) > return -EFAULT; > buf += sz; > @@ -474,6 +477,7 @@ do_write_kmem(unsigned long p, const cha > { > ssize_t written, sz; > unsigned long copied; > + int err = 0; > > written = 0; > #ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED > @@ -500,13 +504,19 @@ do_write_kmem(unsigned long p, const cha > */ > ptr = xlate_dev_kmem_ptr((char *)p); > > + if (unlikely(virt_addr_valid(ptr) && > + PageHWPoison(virt_to_page(ptr)))) { > + err = -EIO; > + break; > + } > + > copied = copy_from_user(ptr, buf, sz); > if (copied) { > written += sz - 
copied; > - if (written) > - break; > - return -EFAULT; > + err = -EFAULT; > + break; > } > + > buf += sz; > p += sz; > count -= sz; > @@ -514,7 +524,7 @@ do_write_kmem(unsigned long p, const cha > } > > *ppos += written; > - return written; > + return written ? written : err; > } > > > --- linux-mm.orig/mm/vmalloc.c 2009-09-15 10:14:18.000000000 +0800 > +++ linux-mm/mm/vmalloc.c 2009-09-15 10:17:20.000000000 +0800 > @@ -1661,6 +1661,8 @@ static int aligned_vread(char *buf, char > if (length > count) > length = count; > p = vmalloc_to_page(addr); > + if (unlikely(p && PageHWPoison(p))) > + break; > /* > * To do safe access to this _mapped_ area, we need > * lock. But adding lock here means that we need to add > @@ -1700,6 +1702,8 @@ static int aligned_vwrite(char *buf, cha > if (length > count) > length = count; > p = vmalloc_to_page(addr); > + if (unlikely(p && PageHWPoison(p))) > + break; > /* > * To do safe access to this _mapped_ area, we need > * lock. But adding lock here means that we need to add > @@ -1731,8 +1735,10 @@ static int aligned_vwrite(char *buf, cha > * @count: number of bytes to be read. > * > * Returns # of bytes which addr and buf should be increased. > - * (same number to @count). Returns 0 if [addr...addr+count) doesn't > - * includes any intersect with alive vmalloc area. > + * (same number to @count if no hwpoison pages encountered). > + * > + * Returns 0 if [addr...addr+count) doesn't includes any intersect with > + * alive vmalloc area. > * > * This function checks that addr is a valid vmalloc'ed area, and > * copy data from that area to a given buffer. If the given memory range > @@ -1740,8 +1746,6 @@ static int aligned_vwrite(char *buf, cha > * proper area of @buf. If there are memory holes, they'll be zero-filled. > * IOREMAP area is treated as memory hole and no copy is done. > * > - * If [addr...addr+count) doesn't includes any intersects with alive > - * vm_struct area, returns 0. > * @buf should be kernel's buffer. 
Because this function uses KM_USER0, > * the caller should guarantee KM_USER0 is not used. > * > @@ -1757,7 +1761,8 @@ long vread(char *buf, char *addr, unsign > struct vm_struct *tmp; > char *vaddr, *buf_start = buf; > unsigned long buflen = count; > - unsigned long n; > + unsigned long ret = 0; > + unsigned long n = 0; > > /* Don't allow overflow */ > if ((unsigned long) addr + count < count) > @@ -1780,12 +1785,16 @@ long vread(char *buf, char *addr, unsign > if (n > count) > n = count; > if (!(tmp->flags & VM_IOREMAP)) > - aligned_vread(buf, addr, n); > - else /* IOREMAP area is treated as memory hole */ > + ret = aligned_vread(buf, addr, n); > + else { /* IOREMAP area is treated as memory hole */ > memset(buf, 0, n); > - buf += n; > - addr += n; > - count -= n; > + ret = n; > + } > + buf += ret; > + addr += ret; > + count -= ret; > + if (ret < n) > + break; > } > finished: > read_unlock(&vmlist_lock); > @@ -1796,7 +1805,7 @@ finished: > if (buf != buf_start + buflen) > memset(buf, 0, buflen - (buf - buf_start)); > > - return buflen; > + return ret == n ? buflen : buflen - count; > } > > /** > @@ -1806,7 +1815,7 @@ finished: > * @count: number of bytes to be read. > * > * Returns # of bytes which addr and buf should be increased. > - * (same number to @count). > + * (same number to @count if no hwpoison pages encountered). > * > * This function checks that addr is a valid vmalloc'ed area, and > * copy data from a buffer to the given addr. 
If specified range of > @@ -1829,7 +1838,9 @@ long vwrite(char *buf, char *addr, unsig > { > struct vm_struct *tmp; > char *vaddr; > - unsigned long n, buflen; > + unsigned long buflen; > + unsigned long ret = 0; > + unsigned long n = 0; > > /* Don't allow overflow */ > if ((unsigned long) addr + count < count) > @@ -1852,14 +1863,18 @@ long vwrite(char *buf, char *addr, unsig > if (n > count) > n = count; > if (!(tmp->flags & VM_IOREMAP)) > - aligned_vwrite(buf, addr, n); > - buf += n; > - addr += n; > - count -= n; > + ret = aligned_vwrite(buf, addr, n); > + else > + ret = n; > + buf += ret; > + addr += ret; > + count -= ret; > + if (ret < n) > + break; > } > finished: > read_unlock(&vmlist_lock); > - return buflen; > + return ret == n ? buflen : buflen - count; > } > > /** > > -- > > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] /proc/kmem cleanups and hwpoison bits 2009-09-15 2:18 [PATCH 0/3] /proc/kmem cleanups and hwpoison bits Wu Fengguang ` (2 preceding siblings ...) 2009-09-15 2:18 ` [PATCH 3/3] HWPOISON: prevent /dev/kmem users from accessing hwpoison pages Wu Fengguang @ 2009-09-15 3:09 ` KAMEZAWA Hiroyuki 2009-09-15 8:22 ` Wu Fengguang 3 siblings, 1 reply; 20+ messages in thread From: KAMEZAWA Hiroyuki @ 2009-09-15 3:09 UTC (permalink / raw) To: Wu Fengguang Cc: Andrew Morton, Benjamin Herrenschmidt, Ingo Molnar, Tejun Heo, Nick Piggin, LKML On Tue, 15 Sep 2009 10:18:51 +0800 Wu Fengguang <fengguang.wu@intel.com> wrote: > Hi Kame, > > Here are 3 more kmem patches in my queue. Comments are welcome. > If you feel good about them, I can send all recent kmem cleanup > patches for you. > This is my quick hack. But I don't want to be an obstacle for you. So, I'll wait for your updates. == Now, /dev/kmem's read/write of the vmalloc area doesn't do range checks. Because vread/vwrite traverse the vmalloc area list under a system-wide spinlock, it's better to avoid unnecessary calls to vread/vwrite. And vread/vwrite return 0 if we access memory holes. We can avoid the copy-to-user on the read side; at write we just ignore holes. 
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> --- drivers/char/mem.c | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-) Index: linux-2.6.31/drivers/char/mem.c =================================================================== --- linux-2.6.31.orig/drivers/char/mem.c +++ linux-2.6.31/drivers/char/mem.c @@ -437,17 +437,19 @@ static ssize_t read_kmem(struct file *fi } if (count > 0) { - kbuf = (char *)__get_free_page(GFP_KERNEL); + kbuf = (char *)__get_free_page(GFP_KERNEL | __GFP_ZERO); if (!kbuf) return -ENOMEM; while (count > 0) { sz = size_inside_page(p, count); - sz = vread(kbuf, (char *)p, sz); - if (!sz) - break; - if (copy_to_user(buf, kbuf, sz)) { - free_page((unsigned long)kbuf); - return -EFAULT; + if (is_vmalloc_or_module_addr((void*)p)) { + /* returns non-zero if a page is found */ + if (vread(kbuf, (char *)p, sz)) + if (copy_to_user(buf, kbuf, sz)) { + free_page((unsigned long)kbuf); + return -EFAULT; + } + } + } } count -= sz; buf += sz; @@ -541,6 +543,11 @@ static ssize_t write_kmem(struct file * unsigned long sz = size_inside_page(p, count); unsigned long n; + if (is_vmalloc_or_module_addr((void*)p)) { + free_page((unsigned long)kbuf); + return -EFAULT; + } + n = copy_from_user(kbuf, buf, sz); if (n) { if (wrote + virtr) @@ -548,7 +555,11 @@ static ssize_t write_kmem(struct file * free_page((unsigned long)kbuf); return -EFAULT; } - sz = vwrite(kbuf, (char *)p, sz); + /* + * returns non-zero if copied successfully.... + * But we don't care it. just make a progress. + */ + vwrite(kbuf, (char *)p, sz); count -= sz; buf += sz; virtr += sz; ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] /proc/kmem cleanups and hwpoison bits 2009-09-15 3:09 ` [PATCH 0/3] /proc/kmem cleanups and hwpoison bits KAMEZAWA Hiroyuki @ 2009-09-15 8:22 ` Wu Fengguang 2009-09-15 10:16 ` Hugh Dickins 0 siblings, 1 reply; 20+ messages in thread From: Wu Fengguang @ 2009-09-15 8:22 UTC (permalink / raw) To: KAMEZAWA Hiroyuki Cc: Andrew Morton, Benjamin Herrenschmidt, Ingo Molnar, Tejun Heo, Nick Piggin, LKML On Tue, Sep 15, 2009 at 11:09:39AM +0800, KAMEZAWA Hiroyuki wrote: > On Tue, 15 Sep 2009 10:18:51 +0800 > Wu Fengguang <fengguang.wu@intel.com> wrote: > > > Hi Kame, > > > > Here are 3 more kmem patches in my queue. Comments are welcome. > > If you feel good about them, I can send all recent kmem cleanup > > patches for you. > > > > This is my quick hack. But I don't want to be an obstacle for you. > So, I'll wait for your updates. Thanks. This is also a bug fix: vmalloc_to_page() will otherwise BUG() on !is_vmalloc_or_module_addr() pages. > == > Now, /dev/kmem's read/write of the vmalloc area doesn't do > range checks. Because vread/vwrite traverse the vmalloc area list > under a system-wide spinlock, it's better to avoid unnecessary > calls to vread/vwrite. is_vmalloc_or_module_addr() could be put in either read_kmem() or aligned_vread(), and I'm fine with both. It looks like vread can be shared by kmem and kcore :) > And vread/vwrite return 0 if we access memory holes. > We can avoid the copy-to-user on the read side; at write we just ignore holes. 
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > --- > drivers/char/mem.c | 27 +++++++++++++++++++-------- > 1 file changed, 19 insertions(+), 8 deletions(-) > > Index: linux-2.6.31/drivers/char/mem.c > =================================================================== > --- linux-2.6.31.orig/drivers/char/mem.c > +++ linux-2.6.31/drivers/char/mem.c > @@ -437,17 +437,19 @@ static ssize_t read_kmem(struct file *fi > } > > if (count > 0) { > - kbuf = (char *)__get_free_page(GFP_KERNEL); > + kbuf = (char *)__get_free_page(GFP_KERNEL | __GFP_ZERO); > if (!kbuf) > return -ENOMEM; > while (count > 0) { > sz = size_inside_page(p, count); > - sz = vread(kbuf, (char *)p, sz); > - if (!sz) > - break; > - if (copy_to_user(buf, kbuf, sz)) { > - free_page((unsigned long)kbuf); > - return -EFAULT; > + if (is_vmalloc_or_module_addr((void*)p)) { > + /* returns Non-zero if a page is found */ > + if (vread(kbuf, (char *)p, sz)) > + if (copy_to_user(buf, kbuf, sz)) { > + free_page((unsigned long)kbuf); > + return -EFAULT; > + } > + } > } > count -= sz; > buf += sz; > @@ -541,6 +543,11 @@ static ssize_t write_kmem(struct file * > unsigned long sz = size_inside_page(p, count); > unsigned long n; > > + if (is_vmalloc_or_module_addr((void*)p)) { !is_vmalloc_or_module_addr() ? > + free_page((unsignedl long)kbuf); > + retrun -EFAULT; > + } > + > n = copy_from_user(kbuf, buf, sz); > if (n) { > if (wrote + virtr) > @@ -548,7 +555,11 @@ static ssize_t write_kmem(struct file * > free_page((unsigned long)kbuf); > return -EFAULT; > } > - sz = vwrite(kbuf, (char *)p, sz); > + /* > + * returns non-zero if copied successfully.... > + * But we don't care it. just make a progress. > + */ > + vwrite(kbuf, (char *)p, sz); vwrite could return error code which should not be ignored. Thanks, Fengguang > count -= sz; > buf += sz; > virtr += sz; > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] /proc/kmem cleanups and hwpoison bits 2009-09-15 8:22 ` Wu Fengguang @ 2009-09-15 10:16 ` Hugh Dickins 2009-09-16 0:39 ` KAMEZAWA Hiroyuki ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Hugh Dickins @ 2009-09-15 10:16 UTC (permalink / raw) To: Wu Fengguang Cc: KAMEZAWA Hiroyuki, Andrew Morton, Benjamin Herrenschmidt, Ingo Molnar, Tejun Heo, Nick Piggin, LKML On Tue, 15 Sep 2009, Wu Fengguang wrote: > On Tue, Sep 15, 2009 at 11:09:39AM +0800, KAMEZAWA Hiroyuki wrote: > > On Tue, 15 Sep 2009 10:18:51 +0800 > > Wu Fengguang <fengguang.wu@intel.com> wrote: > > > > > Hi Kame, > > > > > > Here are 3 more kmem patches in my queue. Comments are welcome. > > > If you feel good about them, I can send all recent kmem cleanup > > > patches for you. > > > > > > > This is my quick hack. But I don't want to be an obstacle for you. > > So, I'll wait for your updates. > > Thanks. This is also a bug fix: vmalloc_to_page() will otherwise BUG() > on !is_vmalloc_or_module_addr() pages. > > > == > > Now, /dev/kmem's read/write vmalloc area doesn't do > > range-check. Because vread/vwrite traverse vmalloc area list > > under system-wide spinlock, it's better to avoid unnecessary > > to do unnecessary calls to vread/vwrite. > > is_vmalloc_or_module_addr() could be put to either read_kmem() > or aligned_vread(), and I'm fine with both. > > It looks like vread can be shared by kmem and kcore :) > > > And, vread/vwrite returns 0 if we accessed memory holes. > > We can avoid copy-to-user in read side, we just ignore at write. > > > > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> > > --- > > drivers/char/mem.c | 27 +++++++++++++++++++-------- > > 1 file changed, 19 insertions(+), 8 deletions(-) Sorry guys, I'm picking this mail pretty much at random as something to reply to. 
I'm interested to notice such work going on in drivers/char/mem.c, and don't have time to join in - you interact a lot faster than I manage, and I've other priorities to attend to; but thought I should at least send over the patch I've been including in my private debug kernels for the last couple of years (rebased to 2.6.31), which lets me peek and poke into /dev/kmem as I occasionally wish. Warning: it may be rubbish, it may just be a hack which appeared to work for me the last time I tried, on a particular address range of a particular set of configurations of a particular set of architectures (x86_32, x86_64, powerpc64). I've never thought it through enough to consider submitting, but it _might_ contain something useful for you to factor into your own efforts. Sorry for chucking it over the wall to you in this way, but I guess that's better than just sitting quietly on it for a few more years. Certainly-Not-Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> --- drivers/char/mem.c | 265 +++++++++++++++---------------------------- fs/read_write.c | 9 + 2 files changed, 105 insertions(+), 169 deletions(-) but if completed would also remove vread() and vwrite() from include/linux/vmalloc.h mm/nommu.c mm/vmalloc.c --- 2.6.31/drivers/char/mem.c 2009-09-09 23:13:59.000000000 +0100 +++ 2.6.31d/drivers/char/mem.c 2009-09-10 09:38:30.000000000 +0100 @@ -405,203 +405,134 @@ static ssize_t read_oldmem(struct file * static ssize_t read_kmem(struct file *file, char __user *buf, size_t count, loff_t *ppos) { - unsigned long p = *ppos; - ssize_t low_count, read, sz; - char * kbuf; /* k-addr because vread() takes vmlist_lock rwlock */ - - read = 0; - if (p < (unsigned long) high_memory) { - low_count = count; - if (count > (unsigned long) high_memory - p) - low_count = (unsigned long) high_memory - p; - -#ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED - /* we don't have page 0 mapped on sparc and m68k.. 
*/ - if (p < PAGE_SIZE && low_count > 0) { - size_t tmp = PAGE_SIZE - p; - if (tmp > low_count) tmp = low_count; - if (clear_user(buf, tmp)) - return -EFAULT; - buf += tmp; - p += tmp; - read += tmp; - low_count -= tmp; - count -= tmp; - } -#endif - while (low_count > 0) { - /* - * Handle first page in case it's not aligned - */ - if (-p & (PAGE_SIZE - 1)) - sz = -p & (PAGE_SIZE - 1); - else - sz = PAGE_SIZE; - - sz = min_t(unsigned long, sz, low_count); - - /* - * On ia64 if a page has been mapped somewhere as - * uncached, then it must also be accessed uncached - * by the kernel or data corruption may occur - */ - kbuf = xlate_dev_kmem_ptr((char *)p); - - if (copy_to_user(buf, kbuf, sz)) - return -EFAULT; - buf += sz; - p += sz; - read += sz; - low_count -= sz; - count -= sz; - } - } + char stack_kbuf[64]; + char *kbuf = stack_kbuf; + unsigned long kaddr = *ppos; + char *kptr; + size_t part; + unsigned long left; + ssize_t done = 0; + ssize_t err = 0; + mm_segment_t oldfs = get_fs(); - if (count > 0) { + if (count > sizeof(stack_kbuf)) { kbuf = (char *)__get_free_page(GFP_KERNEL); if (!kbuf) return -ENOMEM; - while (count > 0) { - int len = count; - - if (len > PAGE_SIZE) - len = PAGE_SIZE; - len = vread(kbuf, (char *)p, len); - if (!len) - break; - if (copy_to_user(buf, kbuf, len)) { - free_page((unsigned long)kbuf); - return -EFAULT; - } - count -= len; - buf += len; - read += len; - p += len; - } - free_page((unsigned long)kbuf); - } - *ppos = p; - return read; -} - - -static inline ssize_t -do_write_kmem(void *p, unsigned long realp, const char __user * buf, - size_t count, loff_t *ppos) -{ - ssize_t written, sz; - unsigned long copied; - - written = 0; -#ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED - /* we don't have page 0 mapped on sparc and m68k.. */ - if (realp < PAGE_SIZE) { - unsigned long sz = PAGE_SIZE - realp; - if (sz > count) - sz = count; - /* Hmm. Do something? 
*/ - buf += sz; - p += sz; - realp += sz; - count -= sz; - written += sz; } -#endif - - while (count > 0) { - char *ptr; - /* - * Handle first page in case it's not aligned - */ - if (-realp & (PAGE_SIZE - 1)) - sz = -realp & (PAGE_SIZE - 1); - else - sz = PAGE_SIZE; - - sz = min_t(unsigned long, sz, count); + part = min(count, (size_t)(PAGE_SIZE - (kaddr & ~PAGE_MASK))); + while (part && !err) { /* * On ia64 if a page has been mapped somewhere as * uncached, then it must also be accessed uncached * by the kernel or data corruption may occur */ - ptr = xlate_dev_kmem_ptr(p); + kptr = (char *) kaddr; + if (kptr >= (char *)PAGE_OFFSET && kptr < (char *)high_memory) + kptr = xlate_dev_kmem_ptr(kptr); + + set_fs(KERNEL_DS); + pagefault_disable(); + left = __copy_from_user_inatomic( + kbuf, (__force char __user *) kptr, part); + pagefault_enable(); + set_fs(oldfs); + + if (left) { + err = -ENXIO; + part -= left; + if (!part) + break; + } - copied = copy_from_user(ptr, buf, sz); - if (copied) { - written += sz - copied; - if (written) + left = copy_to_user(buf, kbuf, part); + if (left) { + err = -EFAULT; + part -= left; + if (!part) break; - return -EFAULT; } - buf += sz; - p += sz; - realp += sz; - count -= sz; - written += sz; + + buf += part; + kaddr += part; + done += part; + count -= part; + part = min(count, (size_t)PAGE_SIZE); } - *ppos += written; - return written; + if (kbuf != stack_kbuf) + free_page((unsigned long)kbuf); + *ppos = kaddr; + return done? : err; } - /* * This function writes to the *virtual* memory as seen by the kernel. 
*/ static ssize_t write_kmem(struct file * file, const char __user * buf, size_t count, loff_t *ppos) { - unsigned long p = *ppos; - ssize_t wrote = 0; - ssize_t virtr = 0; - ssize_t written; - char * kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */ - - if (p < (unsigned long) high_memory) { - - wrote = count; - if (count > (unsigned long) high_memory - p) - wrote = (unsigned long) high_memory - p; - - written = do_write_kmem((void*)p, p, buf, wrote, ppos); - if (written != wrote) - return written; - wrote = written; - p += wrote; - buf += wrote; - count -= wrote; - } + char stack_kbuf[64]; + char *kbuf = stack_kbuf; + unsigned long kaddr = *ppos; + char *kptr; + size_t part; + unsigned long left; + ssize_t done = 0; + ssize_t err = 0; + mm_segment_t oldfs = get_fs(); - if (count > 0) { + if (count > sizeof(stack_kbuf)) { kbuf = (char *)__get_free_page(GFP_KERNEL); if (!kbuf) - return wrote ? wrote : -ENOMEM; - while (count > 0) { - int len = count; - - if (len > PAGE_SIZE) - len = PAGE_SIZE; - if (len) { - written = copy_from_user(kbuf, buf, len); - if (written) { - if (wrote + virtr) - break; - free_page((unsigned long)kbuf); - return -EFAULT; - } - } - len = vwrite(kbuf, (char *)p, len); - count -= len; - buf += len; - virtr += len; - p += len; + return -ENOMEM; + } + + part = min(count, (size_t)(PAGE_SIZE - (kaddr & ~PAGE_MASK))); + while (part && !err) { + left = copy_from_user(kbuf, buf, part); + if (left) { + err = -EFAULT; + part -= left; + if (!part) + break; } - free_page((unsigned long)kbuf); + + /* + * On ia64 if a page has been mapped somewhere as + * uncached, then it must also be accessed uncached + * by the kernel or data corruption may occur + */ + kptr = (char *) kaddr; + if (kptr >= (char *)PAGE_OFFSET && kptr < (char *)high_memory) + kptr = xlate_dev_kmem_ptr(kptr); + + set_fs(KERNEL_DS); + pagefault_disable(); + left = __copy_to_user_inatomic( + (__force char __user *) kptr, kbuf, part); + pagefault_enable(); + set_fs(oldfs); + + if 
(left) { + err = -ENXIO; + part -= left; + if (!part) + break; + } + + buf += part; + kaddr += part; + done += part; + count -= part; + part = min(count, (size_t)PAGE_SIZE); } - *ppos = p; - return virtr + wrote; + if (kbuf != stack_kbuf) + free_page((unsigned long)kbuf); + *ppos = kaddr; + return done? : err; } #endif --- 2.6.31/fs/read_write.c 2009-09-09 23:13:59.000000000 +0100 +++ 2.6.31d/fs/read_write.c 2009-09-10 09:38:30.000000000 +0100 @@ -222,8 +222,13 @@ int rw_verify_area(int read_write, struc if (unlikely((ssize_t) count < 0)) return retval; pos = *ppos; - if (unlikely((pos < 0) || (loff_t) (pos + count) < 0)) - return retval; + if (pos >= 0) { + if (unlikely((loff_t) (pos + count) < 0)) + count = 1 + (size_t) LLONG_MAX - pos; + } else { + if (unlikely((loff_t) (pos + count) > 0)) + count = - pos; + } if (unlikely(inode->i_flock && mandatory_lock(inode))) { retval = locks_mandatory_area( ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] /proc/kmem cleanups and hwpoison bits 2009-09-15 10:16 ` Hugh Dickins @ 2009-09-16 0:39 ` KAMEZAWA Hiroyuki 2009-09-16 0:39 ` Wu Fengguang 2009-09-16 9:01 ` Geert Uytterhoeven 2 siblings, 0 replies; 20+ messages in thread From: KAMEZAWA Hiroyuki @ 2009-09-16 0:39 UTC (permalink / raw) To: Hugh Dickins Cc: Wu Fengguang, Andrew Morton, Benjamin Herrenschmidt, Ingo Molnar, Tejun Heo, Nick Piggin, LKML On Tue, 15 Sep 2009 11:16:36 +0100 (BST) Hugh Dickins <hugh.dickins@tiscali.co.uk> wrote: > Sorry guys, I'm picking this mail pretty much at random > as something to reply to. > > I'm interested to notice such work going on in drivers/char/mem.c, > and don't have time to join in - you interact a lot faster than I > manage, and I've other priorities to attend to; but thought I should > at least send over the patch I've been including in my private debug > kernels for the last couple of years (rebased to 2.6.31), which lets > me peek and poke into /dev/kmem as I occasionally wish. > > Warning: it may be rubbish, it may just be a hack which appeared to > work for me the last time I tried, on a particular address range of a > particular set of configurations of a particular set of architectures > (x86_32, x86_64, powerpc64). I've never thought it through enough to > consider submitting, but it _might_ contain something useful for you > to factor into your own efforts. > yes. thank you. very interesting. > Sorry for chucking it over the wall to you in this way, but I guess > that's better than just sitting quietly on it for a few more years. > Hmm..Hmm.. Then, concept of this patch is - just use copy_from/to_user() - avoid unnecessary get_user_pages(). Ah....yes, Because /dev/kmem accesses a page per iteration, new vread/vwrite which can handle memory holes may be overkill ;) And unlike /proc/kcore, /dev/kmem can return -ENXIO when the caller touches memory hole. Very impressive. Thank you. 
-Kame > Certainly-Not-Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> > --- > > drivers/char/mem.c | 265 +++++++++++++++---------------------------- > fs/read_write.c | 9 + > 2 files changed, 105 insertions(+), 169 deletions(-) > > but if completed would also remove vread() and vwrite() from > > include/linux/vmalloc.h > mm/nommu.c > mm/vmalloc.c > > --- 2.6.31/drivers/char/mem.c 2009-09-09 23:13:59.000000000 +0100 > +++ 2.6.31d/drivers/char/mem.c 2009-09-10 09:38:30.000000000 +0100 > @@ -405,203 +405,134 @@ static ssize_t read_oldmem(struct file * > static ssize_t read_kmem(struct file *file, char __user *buf, > size_t count, loff_t *ppos) > { > - unsigned long p = *ppos; > - ssize_t low_count, read, sz; > - char * kbuf; /* k-addr because vread() takes vmlist_lock rwlock */ > - > - read = 0; > - if (p < (unsigned long) high_memory) { > - low_count = count; > - if (count > (unsigned long) high_memory - p) > - low_count = (unsigned long) high_memory - p; > - > -#ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED > - /* we don't have page 0 mapped on sparc and m68k.. 
*/ > - if (p < PAGE_SIZE && low_count > 0) { > - size_t tmp = PAGE_SIZE - p; > - if (tmp > low_count) tmp = low_count; > - if (clear_user(buf, tmp)) > - return -EFAULT; > - buf += tmp; > - p += tmp; > - read += tmp; > - low_count -= tmp; > - count -= tmp; > - } > -#endif > - while (low_count > 0) { > - /* > - * Handle first page in case it's not aligned > - */ > - if (-p & (PAGE_SIZE - 1)) > - sz = -p & (PAGE_SIZE - 1); > - else > - sz = PAGE_SIZE; > - > - sz = min_t(unsigned long, sz, low_count); > - > - /* > - * On ia64 if a page has been mapped somewhere as > - * uncached, then it must also be accessed uncached > - * by the kernel or data corruption may occur > - */ > - kbuf = xlate_dev_kmem_ptr((char *)p); > - > - if (copy_to_user(buf, kbuf, sz)) > - return -EFAULT; > - buf += sz; > - p += sz; > - read += sz; > - low_count -= sz; > - count -= sz; > - } > - } > + char stack_kbuf[64]; > + char *kbuf = stack_kbuf; > + unsigned long kaddr = *ppos; > + char *kptr; > + size_t part; > + unsigned long left; > + ssize_t done = 0; > + ssize_t err = 0; > + mm_segment_t oldfs = get_fs(); > > - if (count > 0) { > + if (count > sizeof(stack_kbuf)) { > kbuf = (char *)__get_free_page(GFP_KERNEL); > if (!kbuf) > return -ENOMEM; > - while (count > 0) { > - int len = count; > - > - if (len > PAGE_SIZE) > - len = PAGE_SIZE; > - len = vread(kbuf, (char *)p, len); > - if (!len) > - break; > - if (copy_to_user(buf, kbuf, len)) { > - free_page((unsigned long)kbuf); > - return -EFAULT; > - } > - count -= len; > - buf += len; > - read += len; > - p += len; > - } > - free_page((unsigned long)kbuf); > - } > - *ppos = p; > - return read; > -} > - > - > -static inline ssize_t > -do_write_kmem(void *p, unsigned long realp, const char __user * buf, > - size_t count, loff_t *ppos) > -{ > - ssize_t written, sz; > - unsigned long copied; > - > - written = 0; > -#ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED > - /* we don't have page 0 mapped on sparc and m68k.. 
*/ > - if (realp < PAGE_SIZE) { > - unsigned long sz = PAGE_SIZE - realp; > - if (sz > count) > - sz = count; > - /* Hmm. Do something? */ > - buf += sz; > - p += sz; > - realp += sz; > - count -= sz; > - written += sz; > } > -#endif > - > - while (count > 0) { > - char *ptr; > - /* > - * Handle first page in case it's not aligned > - */ > - if (-realp & (PAGE_SIZE - 1)) > - sz = -realp & (PAGE_SIZE - 1); > - else > - sz = PAGE_SIZE; > - > - sz = min_t(unsigned long, sz, count); > > + part = min(count, (size_t)(PAGE_SIZE - (kaddr & ~PAGE_MASK))); > + while (part && !err) { > /* > * On ia64 if a page has been mapped somewhere as > * uncached, then it must also be accessed uncached > * by the kernel or data corruption may occur > */ > - ptr = xlate_dev_kmem_ptr(p); > + kptr = (char *) kaddr; > + if (kptr >= (char *)PAGE_OFFSET && kptr < (char *)high_memory) > + kptr = xlate_dev_kmem_ptr(kptr); > + > + set_fs(KERNEL_DS); > + pagefault_disable(); > + left = __copy_from_user_inatomic( > + kbuf, (__force char __user *) kptr, part); > + pagefault_enable(); > + set_fs(oldfs); > + > + if (left) { > + err = -ENXIO; > + part -= left; > + if (!part) > + break; > + } > > - copied = copy_from_user(ptr, buf, sz); > - if (copied) { > - written += sz - copied; > - if (written) > + left = copy_to_user(buf, kbuf, part); > + if (left) { > + err = -EFAULT; > + part -= left; > + if (!part) > break; > - return -EFAULT; > } > - buf += sz; > - p += sz; > - realp += sz; > - count -= sz; > - written += sz; > + > + buf += part; > + kaddr += part; > + done += part; > + count -= part; > + part = min(count, (size_t)PAGE_SIZE); > } > > - *ppos += written; > - return written; > + if (kbuf != stack_kbuf) > + free_page((unsigned long)kbuf); > + *ppos = kaddr; > + return done? : err; > } > > - > /* > * This function writes to the *virtual* memory as seen by the kernel. 
> */ > static ssize_t write_kmem(struct file * file, const char __user * buf, > size_t count, loff_t *ppos) > { > - unsigned long p = *ppos; > - ssize_t wrote = 0; > - ssize_t virtr = 0; > - ssize_t written; > - char * kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */ > - > - if (p < (unsigned long) high_memory) { > - > - wrote = count; > - if (count > (unsigned long) high_memory - p) > - wrote = (unsigned long) high_memory - p; > - > - written = do_write_kmem((void*)p, p, buf, wrote, ppos); > - if (written != wrote) > - return written; > - wrote = written; > - p += wrote; > - buf += wrote; > - count -= wrote; > - } > + char stack_kbuf[64]; > + char *kbuf = stack_kbuf; > + unsigned long kaddr = *ppos; > + char *kptr; > + size_t part; > + unsigned long left; > + ssize_t done = 0; > + ssize_t err = 0; > + mm_segment_t oldfs = get_fs(); > > - if (count > 0) { > + if (count > sizeof(stack_kbuf)) { > kbuf = (char *)__get_free_page(GFP_KERNEL); > if (!kbuf) > - return wrote ? wrote : -ENOMEM; > - while (count > 0) { > - int len = count; > - > - if (len > PAGE_SIZE) > - len = PAGE_SIZE; > - if (len) { > - written = copy_from_user(kbuf, buf, len); > - if (written) { > - if (wrote + virtr) > - break; > - free_page((unsigned long)kbuf); > - return -EFAULT; > - } > - } > - len = vwrite(kbuf, (char *)p, len); > - count -= len; > - buf += len; > - virtr += len; > - p += len; > + return -ENOMEM; > + } > + > + part = min(count, (size_t)(PAGE_SIZE - (kaddr & ~PAGE_MASK))); > + while (part && !err) { > + left = copy_from_user(kbuf, buf, part); > + if (left) { > + err = -EFAULT; > + part -= left; > + if (!part) > + break; > } > - free_page((unsigned long)kbuf); > + > + /* > + * On ia64 if a page has been mapped somewhere as > + * uncached, then it must also be accessed uncached > + * by the kernel or data corruption may occur > + */ > + kptr = (char *) kaddr; > + if (kptr >= (char *)PAGE_OFFSET && kptr < (char *)high_memory) > + kptr = xlate_dev_kmem_ptr(kptr); > + > + 
set_fs(KERNEL_DS); > + pagefault_disable(); > + left = __copy_to_user_inatomic( > + (__force char __user *) kptr, kbuf, part); > + pagefault_enable(); > + set_fs(oldfs); > + > + if (left) { > + err = -ENXIO; > + part -= left; > + if (!part) > + break; > + } > + > + buf += part; > + kaddr += part; > + done += part; > + count -= part; > + part = min(count, (size_t)PAGE_SIZE); > } > > - *ppos = p; > - return virtr + wrote; > + if (kbuf != stack_kbuf) > + free_page((unsigned long)kbuf); > + *ppos = kaddr; > + return done? : err; > } > #endif > > --- 2.6.31/fs/read_write.c 2009-09-09 23:13:59.000000000 +0100 > +++ 2.6.31d/fs/read_write.c 2009-09-10 09:38:30.000000000 +0100 > @@ -222,8 +222,13 @@ int rw_verify_area(int read_write, struc > if (unlikely((ssize_t) count < 0)) > return retval; > pos = *ppos; > - if (unlikely((pos < 0) || (loff_t) (pos + count) < 0)) > - return retval; > + if (pos >= 0) { > + if (unlikely((loff_t) (pos + count) < 0)) > + count = 1 + (size_t) LLONG_MAX - pos; > + } else { > + if (unlikely((loff_t) (pos + count) > 0)) > + count = - pos; > + } > > if (unlikely(inode->i_flock && mandatory_lock(inode))) { > retval = locks_mandatory_area( > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] /proc/kmem cleanups and hwpoison bits
  2009-09-15 10:16         ` Hugh Dickins
  2009-09-16  0:39           ` KAMEZAWA Hiroyuki
@ 2009-09-16  0:39           ` Wu Fengguang
  2009-09-16  9:01           ` Geert Uytterhoeven
  2 siblings, 0 replies; 20+ messages in thread
From: Wu Fengguang @ 2009-09-16 0:39 UTC (permalink / raw)
To: Hugh Dickins
Cc: KAMEZAWA Hiroyuki, Andrew Morton, Benjamin Herrenschmidt,
	Ingo Molnar, Tejun Heo, Nick Piggin, LKML

Hugh,

On Tue, Sep 15, 2009 at 06:16:36PM +0800, Hugh Dickins wrote:
> Sorry guys, I'm picking this mail pretty much at random
> as something to reply to.
>
> I'm interested to notice such work going on in drivers/char/mem.c,
> and don't have time to join in - you interact a lot faster than I
> manage, and I've other priorities to attend to; but thought I should
> at least send over the patch I've been including in my private debug
> kernels for the last couple of years (rebased to 2.6.31), which lets
> me peek and poke into /dev/kmem as I occasionally wish.
>
> Warning: it may be rubbish, it may just be a hack which appeared to
> work for me the last time I tried, on a particular address range of a
> particular set of configurations of a particular set of architectures
> (x86_32, x86_64, powerpc64). I've never thought it through enough to
> consider submitting, but it _might_ contain something useful for you
> to factor into your own efforts.
>
> Sorry for chucking it over the wall to you in this way, but I guess
> that's better than just sitting quietly on it for a few more years.

It's good to see a totally different approach. Although I'm not sure
if it provides equal functionality or error handling, it did amaze me
by rewriting read_kmem()/write_kmem() in such short code (without any
supporting functions!).
It looks like your ENXIO makes more sense than the EFAULT we used for
nonexistent addresses :)

Thanks,
Fengguang

> Certainly-Not-Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> ---
>
>  drivers/char/mem.c |  265 +++++++++++++++----------------------
>  fs/read_write.c    |    9 +
>  2 files changed, 105 insertions(+), 169 deletions(-)
>
> but if completed would also remove vread() and vwrite() from
>
>  include/linux/vmalloc.h
>  mm/nommu.c
>  mm/vmalloc.c
>
> [265-line patch snipped -- quoted in full from Hugh's mail above]

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] /proc/kmem cleanups and hwpoison bits
  2009-09-15 10:16         ` Hugh Dickins
  2009-09-16  0:39           ` KAMEZAWA Hiroyuki
  2009-09-16  0:39           ` Wu Fengguang
@ 2009-09-16  9:01           ` Geert Uytterhoeven
  2009-09-16 11:26             ` Hugh Dickins
  2 siblings, 1 reply; 20+ messages in thread
From: Geert Uytterhoeven @ 2009-09-16 9:01 UTC (permalink / raw)
To: Hugh Dickins
Cc: Wu Fengguang, KAMEZAWA Hiroyuki, Andrew Morton,
	Benjamin Herrenschmidt, Ingo Molnar, Tejun Heo, Nick Piggin,
	LKML, linux-m68k, sparclinux

On Tue, Sep 15, 2009 at 12:16, Hugh Dickins <hugh.dickins@tiscali.co.uk> wrote:
> On Tue, 15 Sep 2009, Wu Fengguang wrote:
>> On Tue, Sep 15, 2009 at 11:09:39AM +0800, KAMEZAWA Hiroyuki wrote:
>> > On Tue, 15 Sep 2009 10:18:51 +0800
>> > Wu Fengguang <fengguang.wu@intel.com> wrote:
>> >
>> > > Hi Kame,
>> > >
>> > > Here are 3 more kmem patches in my queue. Comments are welcome.
>> > > If you feel good about them, I can send all recent kmem cleanup
>> > > patches for you.
>> >
>> > This is my quick hack. But I don't want to be an obstacle for you.
>> > So, I'll wait for your updates.
>>
>> Thanks. This is also a bug fix: vmalloc_to_page() will otherwise BUG()
>> on !is_vmalloc_or_module_addr() pages.
>>
>> > ==
>> > Now, /dev/kmem's read/write of the vmalloc area doesn't do
>> > range-check. Because vread/vwrite traverse the vmalloc area list
>> > under a system-wide spinlock, it's better to avoid unnecessary
>> > calls to vread/vwrite.
>>
>> is_vmalloc_or_module_addr() could be put to either read_kmem()
>> or aligned_vread(), and I'm fine with both.
>>
>> It looks like vread can be shared by kmem and kcore :)
>>
>> > And, vread/vwrite returns 0 if we accessed memory holes.
>> > We can avoid copy-to-user in the read side; we just ignore at write.
>> >
>> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> > ---
>> >  drivers/char/mem.c |   27 +++++++++++++++++++--------
>> >  1 file changed, 19 insertions(+), 8 deletions(-)
>
> Sorry guys, I'm picking this mail pretty much at random
> as something to reply to.
>
> I'm interested to notice such work going on in drivers/char/mem.c,
> and don't have time to join in - you interact a lot faster than I
> manage, and I've other priorities to attend to; but thought I should
> at least send over the patch I've been including in my private debug
> kernels for the last couple of years (rebased to 2.6.31), which lets
> me peek and poke into /dev/kmem as I occasionally wish.
>
> Warning: it may be rubbish, it may just be a hack which appeared to
> work for me the last time I tried, on a particular address range of a
> particular set of configurations of a particular set of architectures
> (x86_32, x86_64, powerpc64). I've never thought it through enough to
> consider submitting, but it _might_ contain something useful for you
> to factor into your own efforts.
>
> Sorry for chucking it over the wall to you in this way, but I guess
> that's better than just sitting quietly on it for a few more years.
>
> Certainly-Not-Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> ---
>
>  drivers/char/mem.c |  265 +++++++++++++++----------------------
>  fs/read_write.c    |    9 +
>  2 files changed, 105 insertions(+), 169 deletions(-)
>
> but if completed would also remove vread() and vwrite() from
>
>  include/linux/vmalloc.h
>  mm/nommu.c
>  mm/vmalloc.c
>
> --- 2.6.31/drivers/char/mem.c	2009-09-09 23:13:59.000000000 +0100
> +++ 2.6.31d/drivers/char/mem.c	2009-09-10 09:38:30.000000000 +0100

> -#ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED
> -	/* we don't have page 0 mapped on sparc and m68k.. */

Is this `feature' removed?

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
						-- Linus Torvalds

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] /proc/kmem cleanups and hwpoison bits
  2009-09-16  9:01           ` Geert Uytterhoeven
@ 2009-09-16 11:26             ` Hugh Dickins
  2009-09-16 11:42               ` Geert Uytterhoeven
  0 siblings, 1 reply; 20+ messages in thread
From: Hugh Dickins @ 2009-09-16 11:26 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Wu Fengguang, KAMEZAWA Hiroyuki, Andrew Morton,
	Benjamin Herrenschmidt, Ingo Molnar, Tejun Heo, Nick Piggin,
	LKML, linux-m68k, sparclinux

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1993 bytes --]

On Wed, 16 Sep 2009, Geert Uytterhoeven wrote:
> On Tue, Sep 15, 2009 at 12:16, Hugh Dickins <hugh.dickins@tiscali.co.uk> wrote:
> > Warning: it may be rubbish, it may just be a hack which appeared to
> > work for me the last time I tried, on a particular address range of a
> > particular set of configurations of a particular set of architectures
> > (x86_32, x86_64, powerpc64). I've never thought it through enough to
> > consider submitting, but it _might_ contain something useful for you
> > to factor into your own efforts.
> >
> > Sorry for chucking it over the wall to you in this way, but I guess
> > that's better than just sitting quietly on it for a few more years.
> >
> > Certainly-Not-Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>

I think that gives a good idea of the status of this patch:
I'm not making any policy decisions here or submitting to any tree.

> > --- 2.6.31/drivers/char/mem.c	2009-09-09 23:13:59.000000000 +0100
> > +++ 2.6.31d/drivers/char/mem.c	2009-09-10 09:38:30.000000000 +0100
>
> > -#ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED
> > -	/* we don't have page 0 mapped on sparc and m68k.. */
>
> Is this `feature' removed?

The feature that some arches don't have page zero mapped?
No, I could hardly change that from drivers/char/mem.c,
and wouldn't wish to.

The feature that reading from that one unmapped page gave zeroes,
and writing to it threw away what you wrote?

Yes, in making that patch, I was thinking that if we cope with
unmapped areas elsewhere by returning -ENXIO, it made little sense
to have special code to do something different for page zero.

But of course there might be userspace compatibility reasons why that
couldn't change, e.g. a body of /dev/kmem users who would be surprised
by an error right at the "start" (though in kmem's case, they've
already had to seek to get anywhere).

Take the patch as saying "hmm, I wonder if we could do it this way".

Hugh

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] /proc/kmem cleanups and hwpoison bits
  2009-09-16 11:26             ` Hugh Dickins
@ 2009-09-16 11:42               ` Geert Uytterhoeven
  0 siblings, 0 replies; 20+ messages in thread
From: Geert Uytterhoeven @ 2009-09-16 11:42 UTC (permalink / raw)
To: Hugh Dickins
Cc: Wu Fengguang, KAMEZAWA Hiroyuki, Andrew Morton,
	Benjamin Herrenschmidt, Ingo Molnar, Tejun Heo, Nick Piggin,
	LKML, linux-m68k, sparclinux

On Wed, Sep 16, 2009 at 13:26, Hugh Dickins <hugh.dickins@tiscali.co.uk> wrote:
> On Wed, 16 Sep 2009, Geert Uytterhoeven wrote:
>> On Tue, Sep 15, 2009 at 12:16, Hugh Dickins <hugh.dickins@tiscali.co.uk> wrote:
>> > Warning: it may be rubbish, it may just be a hack which appeared to
>> > work for me the last time I tried, on a particular address range of a
>> > particular set of configurations of a particular set of architectures
>> > (x86_32, x86_64, powerpc64). I've never thought it through enough to
>> > consider submitting, but it _might_ contain something useful for you
>> > to factor into your own efforts.
>> >
>> > Sorry for chucking it over the wall to you in this way, but I guess
>> > that's better than just sitting quietly on it for a few more years.
>> >
>> > Certainly-Not-Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
>
> I think that gives a good idea of the status of this patch:
> I'm not making any policy decisions here or submitting to any tree.

OK, that was my understanding, too. But I thought it wouldn't hurt to
verify. You never know who's gonna take your patch and (try to) sneak
it in ;-)

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
						-- Linus Torvalds

^ permalink raw reply	[flat|nested] 20+ messages in thread
end of thread, other threads:[~2009-09-16 11:48 UTC | newest]

Thread overview: 20+ messages:
2009-09-15  2:18 [PATCH 0/3] /proc/kmem cleanups and hwpoison bits Wu Fengguang
2009-09-15  2:18 ` [PATCH 1/3] vmalloc: ignore vmalloc area holes in vwrite() Wu Fengguang
2009-09-15  2:34   ` KAMEZAWA Hiroyuki
2009-09-15  3:15     ` Wu Fengguang
2009-09-15  3:18       ` KAMEZAWA Hiroyuki
2009-09-15  5:21         ` Wu Fengguang
2009-09-15  2:18 ` [PATCH 2/3] devmem: handle partial kmem write/read Wu Fengguang
2009-09-15  2:38   ` KAMEZAWA Hiroyuki
2009-09-15  9:01     ` Wu Fengguang
2009-09-15  9:03       ` Wu Fengguang
2009-09-15  2:18 ` [PATCH 3/3] HWPOISON: prevent /dev/kmem users from accessing hwpoison pages Wu Fengguang
2009-09-15  2:45   ` KAMEZAWA Hiroyuki
2009-09-15  3:09 ` [PATCH 0/3] /proc/kmem cleanups and hwpoison bits KAMEZAWA Hiroyuki
2009-09-15  8:22   ` Wu Fengguang
2009-09-15 10:16     ` Hugh Dickins
2009-09-16  0:39       ` KAMEZAWA Hiroyuki
2009-09-16  0:39       ` Wu Fengguang
2009-09-16  9:01       ` Geert Uytterhoeven
2009-09-16 11:26         ` Hugh Dickins
2009-09-16 11:42           ` Geert Uytterhoeven