From: Wu Fengguang <fengguang.wu@intel.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"mtosatti@redhat.com" <mtosatti@redhat.com>,
"gregkh@suse.de" <gregkh@suse.de>,
"broonie@opensource.wolfsonmicro.com"
<broonie@opensource.wolfsonmicro.com>,
"johannes@sipsolutions.net" <johannes@sipsolutions.net>,
"avi@qumranet.com" <avi@qumranet.com>,
"andi@firstfloor.org" <andi@firstfloor.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
WANG Cong <xiyou.wangcong@gmail.com>,
Mike Smith <scgtrp@gmail.com>,
Nick Piggin <nickpiggin@yahoo.com.au>
Subject: Re: [PATCH] devmem: handle partial kmem write/read
Date: Tue, 15 Sep 2009 10:57:10 +0800
Message-ID: <20090915025710.GA23881@localhost>
In-Reply-To: <20090915113113.126f90f0.kamezawa.hiroyu@jp.fujitsu.com>
[-- Attachment #1: Type: text/plain, Size: 10101 bytes --]
On Tue, Sep 15, 2009 at 10:31:13AM +0800, KAMEZAWA Hiroyuki wrote:
> On Tue, 15 Sep 2009 10:02:08 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
>
> > On Tue, Sep 15, 2009 at 08:24:48AM +0800, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 14 Sep 2009 12:34:44 +0800
> > > Wu Fengguang <fengguang.wu@intel.com> wrote:
> > >
> > > > Hi Kame,
> > > >
> > > > > This patch needs more work. I first intended to fix a bug:
> > > >
> > > > sz = vwrite(kbuf, (char *)p, sz);
> > > > p += sz;
> > > > }
> > > >
> > > > So if the returned len is 0, the kbuf/p pointers will mismatch.
> > > >
> > > > > Then I realized it changed the write behavior. The current vwrite()
> > > > > behavior is strange: it returns 0 if the _whole range_ is a hole, otherwise
> > > > > it ignores the hole silently. So holes can be treated differently even in
> > > > > the original code.
> > > >
> > > Ah, ok...
> > >
> > > > I'm not really sure about the right behavior. KAME-san, do you have
> > > > any suggestions?
> > > >
> > > > Maybe the following makes sense.
> > > > =
> > > > written = vwrite(kbuf, (char *)p, sz);
> > > > if (!written) // whole vmem was a hole
> > > > written = sz;
> >
> > > Since the 0 return value won't be used at all, it would be simpler to
> > > have vwrite() return the untouched buflen/sz, like this. It will ignore
> > > _all_ the holes silently. The comments need updating too.
> >
>
> Hmm. IIUC the original kmem code returns immediately if vread/vwrite returns 0.
Agreed for vread. It seems vwrite doesn't do that for either the kmem or
the vmalloc part.
> But at first look, this seems to depend on the wrong assumption that the
> vmap area is contiguous.
You mean it is possible that not all pages are mapped within a vmlist [addr, addr+size) range?
Are vmalloc areas guaranteed to be page aligned? Sorry for the dumb questions.
> In man(4) mem,kmem
> ==
> Byte addresses in mem are interpreted as physical memory addresses.
> References to nonexistent locations cause errors to be returned.
> .....
> The file kmem is the same as mem, except that the kernel virtual memory
> rather than physical memory is accessed.
>
> ==
>
> Then, we have to return an error for accesses to "nonexistent locations".
That's one important factor to consider. On the other hand, the
original kmem read/write implementations don't return an error code for
holes. Instead, kmem read returns early, while kmem write ignores holes
but is buggy.
> Is a memory hole in the vmap area ....."nonexistent"?
> I think it's nonexistent if there is no overlap between the requested [pos, pos+len)
> and a registered vmalloc area.
> But, hmm, there is no way for users to know the "existing vmalloc areas".
> Then, my idea above may be wrong.
>
> Then, I'd like to modify it as follows:
>
> - If is_vmalloc_or_module_addr(requested address) is false, return -EFAULT.
> - If is_vmalloc_or_module_addr(requested address) is true, return no error,
> even if the specified range doesn't include any existing vmalloc area.
>
> What do you think?
Looks reasonable to me. But it would be good to hear wider opinions.
> Thanks,
> -Kame
> p.s. I suspect the current /dev/kmem cannot read the kernel text area if it's
> not directly mapped.
Attached is a small tool (from LKML) for reading 'modprobe_path' from kmem;
it's not text, but it is close.
Thanks,
Fengguang
>
>
> > --- linux-mm.orig/mm/vmalloc.c 2009-09-15 09:40:08.000000000 +0800
> > +++ linux-mm/mm/vmalloc.c 2009-09-15 09:43:33.000000000 +0800
> > @@ -1834,7 +1834,6 @@ long vwrite(char *buf, char *addr, unsig
> > struct vm_struct *tmp;
> > char *vaddr;
> > unsigned long n, buflen;
> > - int copied = 0;
> >
> > /* Don't allow overflow */
> > if ((unsigned long) addr + count < count)
> > @@ -1858,7 +1857,6 @@ long vwrite(char *buf, char *addr, unsig
> > n = count;
> > if (!(tmp->flags & VM_IOREMAP)) {
> > aligned_vwrite(buf, addr, n);
> > - copied++;
> > }
> > buf += n;
> > addr += n;
> > @@ -1866,8 +1864,6 @@ long vwrite(char *buf, char *addr, unsig
> > }
> > finished:
> > read_unlock(&vmlist_lock);
> > - if (!copied)
> > - return 0;
> > return buflen;
> > }
> >
> > > ==
> > > needs fix.
> > >
> > > > Anyway, I suspect kmem is broken. It should be totally rewritten.
> > >
> > > For example, this doesn't check anything.
> > > ==
> > > if (p < (unsigned long) high_memory) {
> > >
> > > ==
> > >
> > > But....are there users ?
> > > If necessary, I'll write some...
> >
> > > I'm trying to stop possible mem/kmem users from accessing hwpoison pages..
> > I'm not the user, but rather a tester ;)
> >
> > Thanks,
> > Fengguang
> >
> > > Thanks,
> > > -Kame
> > >
> > >
> > > > Thanks,
> > > > Fengguang
> > > >
> > > > On Mon, Sep 14, 2009 at 11:29:01AM +0800, Wu Fengguang wrote:
> > > > > > Current vwrite silently ignores vwrite() to vmap holes.
> > > > > > Better behavior would be to stop the write and return
> > > > > > on such holes.
> > > > >
> > > > > Also return on partial read, which may happen in future
> > > > > (eg. hit hwpoison pages).
> > > > >
> > > > > CC: Andi Kleen <andi@firstfloor.org>
> > > > > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > > > > ---
> > > > > drivers/char/mem.c | 30 ++++++++++++++++++------------
> > > > > 1 file changed, 18 insertions(+), 12 deletions(-)
> > > > >
> > > > > --- linux-mm.orig/drivers/char/mem.c 2009-09-14 10:59:50.000000000 +0800
> > > > > +++ linux-mm/drivers/char/mem.c 2009-09-14 11:06:25.000000000 +0800
> > > > > @@ -444,18 +444,22 @@ static ssize_t read_kmem(struct file *fi
> > > > > if (!kbuf)
> > > > > return -ENOMEM;
> > > > > while (count > 0) {
> > > > > + unsigned long n;
> > > > > +
> > > > > sz = size_inside_page(p, count);
> > > > > - sz = vread(kbuf, (char *)p, sz);
> > > > > - if (!sz)
> > > > > + n = vread(kbuf, (char *)p, sz);
> > > > > + if (!n)
> > > > > break;
> > > > > - if (copy_to_user(buf, kbuf, sz)) {
> > > > > + if (copy_to_user(buf, kbuf, n)) {
> > > > > free_page((unsigned long)kbuf);
> > > > > return -EFAULT;
> > > > > }
> > > > > - count -= sz;
> > > > > - buf += sz;
> > > > > - read += sz;
> > > > > - p += sz;
> > > > > + count -= n;
> > > > > + buf += n;
> > > > > + read += n;
> > > > > + p += n;
> > > > > + if (n < sz)
> > > > > + break;
> > > > > }
> > > > > free_page((unsigned long)kbuf);
> > > > > }
> > > > > @@ -551,11 +555,13 @@ static ssize_t write_kmem(struct file *
> > > > > free_page((unsigned long)kbuf);
> > > > > return -EFAULT;
> > > > > }
> > > > > - sz = vwrite(kbuf, (char *)p, sz);
> > > > > - count -= sz;
> > > > > - buf += sz;
> > > > > - virtr += sz;
> > > > > - p += sz;
> > > > > + n = vwrite(kbuf, (char *)p, sz);
> > > > > + count -= n;
> > > > > + buf += n;
> > > > > + virtr += n;
> > > > > + p += n;
> > > > > + if (n < sz)
> > > > > + break;
> > > > > }
> > > > > free_page((unsigned long)kbuf);
> > > > > }
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > Please read the FAQ at http://www.tux.org/lkml/
> > > >
> >
[-- Attachment #2: tmap.c --]
[-- Type: text/x-csrc, Size: 1385 bytes --]
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/poll.h>
#include <sys/mman.h>

int page_size;
#define PAGE_SIZE page_size
#define PAGE_MASK (~(PAGE_SIZE-1))

/* Map the page of /dev/kmem that contains addr and print the string there. */
void get_var (unsigned long addr) {
	off_t ptr = addr & ~(PAGE_MASK);	/* offset of addr inside its page */
	off_t offset = addr & PAGE_MASK;	/* page-aligned offset for mmap() */
	char *map;
	int kfd;

	kfd = open("/dev/kmem",O_RDONLY);
	if (kfd < 0) {
		perror("open");
		exit(-1);
	}
	map = mmap(NULL,PAGE_SIZE,PROT_READ,MAP_SHARED,kfd,offset);
	if (map == MAP_FAILED) {
		perror("mmap");
		exit(-1);
	}
	printf("%s\n",map+ptr);
	return;
}

int main(int argc, char **argv)
{
	FILE *fp;
	char addr_str[19]="0x";	/* "0x" + up to 16 hex digits + NUL */
	char var[51];
	unsigned long addr;
	char ch;
	int found = 0;

	if (argc != 2) {
		fprintf(stderr,"usage: %s System.map\n",argv[0]);
		exit(-1);
	}
	if ((fp = fopen(argv[1],"r")) == NULL) {
		perror("fopen");
		exit(-1);
	}
	/* Each System.map line looks like: ffffffff81723880 D modprobe_path */
	while (fscanf(fp,"%16s %c %50s\n",&addr_str[2],&ch,var) == 3) {
		if (strcmp(var,"modprobe_path")==0) {
			found = 1;
			break;
		}
	}
	fclose(fp);
	if (!found) {
		printf("could not find modprobe_path\n");
		exit(-1);
	}
	page_size = getpagesize();
	addr = strtoul(addr_str,NULL,16);
	printf("found modprobe_path at (%s) %08lx\n",addr_str,addr);
	get_var(addr);
	return 0;
}