From: Wu Fengguang <fengguang.wu@intel.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"mtosatti@redhat.com" <mtosatti@redhat.com>,
"gregkh@suse.de" <gregkh@suse.de>,
"broonie@opensource.wolfsonmicro.com"
<broonie@opensource.wolfsonmicro.com>,
"johannes@sipsolutions.net" <johannes@sipsolutions.net>,
"avi@qumranet.com" <avi@qumranet.com>,
"andi@firstfloor.org" <andi@firstfloor.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
WANG Cong <xiyou.wangcong@gmail.com>,
Mike Smith <scgtrp@gmail.com>,
Nick Piggin <nickpiggin@yahoo.com.au>
Subject: Re: [PATCH] devmem: handle partial kmem write/read
Date: Tue, 15 Sep 2009 10:57:10 +0800
Message-ID: <20090915025710.GA23881@localhost>
In-Reply-To: <20090915113113.126f90f0.kamezawa.hiroyu@jp.fujitsu.com>
[-- Attachment #1: Type: text/plain, Size: 10101 bytes --]
On Tue, Sep 15, 2009 at 10:31:13AM +0800, KAMEZAWA Hiroyuki wrote:
> On Tue, 15 Sep 2009 10:02:08 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
>
> > On Tue, Sep 15, 2009 at 08:24:48AM +0800, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 14 Sep 2009 12:34:44 +0800
> > > Wu Fengguang <fengguang.wu@intel.com> wrote:
> > >
> > > > Hi Kame,
> > > >
> > > > This patch needs more work. I first intended to fix a bug:
> > > >
> > > > sz = vwrite(kbuf, (char *)p, sz);
> > > > p += sz;
> > > > }
> > > >
> > > > So if the returned len is 0, the kbuf/p pointers will mismatch.
> > > >
> > > > Then I realized it changed the write behavior. The current vwrite()
> > > > behavior is strange: it returns 0 if the _whole range_ is a hole, and
> > > > otherwise ignores holes silently. So holes can be treated differently even
> > > > in the original code.
> > > >
> > > Ah, ok...
> > >
> > > > I'm not really sure about the right behavior. KAME-san, do you have
> > > > any suggestions?
> > > >
> > > Maybe the following makes sense.
> > > =
> > > written = vwrite(kbuf, (char *)p, sz);
> > > if (!written) // whole vmem was a hole
> > > written = sz;
> >
> > Since the 0 return value won't be used at all, it would be simpler to
> > make vwrite() return the untouched buflen/sz, like this. It will ignore
> > _all_ the holes silently. The comments need updating too.
> >
>
> Hmm. IIUC the original kmem code returns immediately if vread/vwrite returns 0.
Agreed for vread. It seems vwrite doesn't do that for either the kmem or the
vmalloc part.
> But at first look, this seems to depend on the wrong assumption that the
> vmap area is contiguous.
You mean it is possible that not all pages are mapped in a vmlist (addr, size) range?
Are vmalloc areas guaranteed to be page aligned? Sorry for the dumb questions.
> In man(4) mem,kmem
> ==
> Byte addresses in mem are interpreted as physical memory addresses.
> References to nonexistent locations cause errors to be returned.
> .....
> The file kmem is the same as mem, except that the kernel virtual memory
> rather than physical memory is accessed.
>
> ==
>
> Then, we have to return error for accesses to "nonexistent locations".
That's one important factor to consider. On the other hand, the
original kmem read/write implementations don't return an error code for
holes. Instead, kmem read returns early, while kmem write ignores holes
but is buggy.
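To make the "stop at the hole" behavior from the patch concrete, here is a toy user-space model, not kernel code. The names model_space, model_write, MODEL_PAGE and MODEL_PAGES are made up for illustration, and the per-page "mapped" flag is only an assumption standing in for whatever vwrite() would really find in the vmlist:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE  16u	/* toy page size, not the kernel's PAGE_SIZE */
#define MODEL_PAGES 4u

/* Toy address space: page i is backed only if mapped[i] is nonzero. */
struct model_space {
	unsigned char mem[MODEL_PAGE * MODEL_PAGES];
	int mapped[MODEL_PAGES];
};

/*
 * Page-at-a-time write loop in the style of write_kmem(). Each chunk is
 * either written completely (n == sz) or hits a hole (n == 0). The loop
 * advances by n, not sz, and stops on the first short chunk, so the
 * buffer and position never get out of step.
 */
static size_t model_write(struct model_space *s, const unsigned char *buf,
			  size_t pos, size_t count)
{
	size_t written = 0;

	assert(pos <= sizeof(s->mem) && count <= sizeof(s->mem) - pos);
	while (count > 0) {
		size_t sz = MODEL_PAGE - pos % MODEL_PAGE; /* bytes left in page */
		size_t n;

		if (sz > count)
			sz = count;
		n = s->mapped[pos / MODEL_PAGE] ? sz : 0;  /* hole: nothing done */
		memcpy(s->mem + pos, buf, n);
		count -= n;
		buf += n;
		pos += n;
		written += n;
		if (n < sz)
			break;	/* partial write: report what was done so far */
	}
	return written;
}
```

With pages 0, 1 and 3 mapped and page 2 a hole, a 40-byte write starting at offset 8 in this model stops at the hole and reports 24 bytes, leaving the caller to decide whether a short count is an error.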
> Is a memory hole in the vmap area ....."nonexistent"?
> I think it's nonexistent if there is no overlap between the requested [pos, pos+len)
> and any registered vmalloc area.
> But, hmm, there is no way for users to know the "existing vmalloc area".
> Then, my above idea may be wrong.
>
> Then, I'd like to modify it as follows:
>
> - If is_vmalloc_or_module_addr(requested address) is false, return -EFAULT.
> - If is_vmalloc_or_module_addr(requested address) is true, return no error,
> even if the specified range doesn't include any existing vmalloc area.
>
> What do you think?
Looks reasonable to me. But it would be good to hear wider opinions.
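As I read the proposal, the policy could be modeled in user space roughly like this. It is only a sketch: MODEL_VMALLOC_START, MODEL_VMALLOC_END, model_is_vmalloc_or_module_addr() and model_kmem_result() are hypothetical names with made-up bounds, not the kernel's helper or constants:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical bounds standing in for the kernel's vmalloc/module range. */
#define MODEL_VMALLOC_START 0x1000UL
#define MODEL_VMALLOC_END   0x9000UL

/* Toy stand-in for is_vmalloc_or_module_addr(); not the kernel helper. */
static int model_is_vmalloc_or_module_addr(unsigned long addr)
{
	return addr >= MODEL_VMALLOC_START && addr < MODEL_VMALLOC_END;
}

/*
 * Proposed policy: an address outside the range is an error. An address
 * inside the range is never an error, even if no vmalloc area backs the
 * requested bytes, so holes are skipped and the full count is reported.
 */
static long model_kmem_result(unsigned long addr, size_t count)
{
	assert(count <= LONG_MAX);
	if (!model_is_vmalloc_or_module_addr(addr))
		return -EFAULT;
	return (long)count;
}
```

The point of the design is that user space cannot tell which vmalloc areas exist, so only the coarse "is this in the vmalloc range at all" test can produce an error.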
> Thanks,
> -Kame
> p.s. I wonder whether the current /dev/kmem can read the kernel's text area
> if it's not directly mapped.
Attached is a small tool (from LKML) that reads 'modprobe_path' from kmem.
It's not text, but it's close.
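The tool depends on splitting a kernel virtual address into a page-aligned mmap() offset and an offset within that page. A minimal sketch of that arithmetic follows, assuming a power-of-two page size; struct page_split and split_addr() are illustrative names that do not appear in the tool:

```c
#include <assert.h>
#include <stdint.h>

/* Page-aligned base plus the remaining offset inside that page. */
struct page_split {
	uint64_t base;
	uint64_t offset;
};

/* Split addr for a power-of-two page_size: addr == base + offset. */
static struct page_split split_addr(uint64_t addr, uint64_t page_size)
{
	struct page_split s;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	s.base = addr & ~(page_size - 1);	/* what mmap() gets as offset */
	s.offset = addr & (page_size - 1);	/* index into the mapped page */
	return s;
}
```

For the sample System.map address ffffffff81723880 with 4096-byte pages, this gives base ffffffff81723000 and offset 0x880.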
Thanks,
Fengguang
>
>
> > --- linux-mm.orig/mm/vmalloc.c 2009-09-15 09:40:08.000000000 +0800
> > +++ linux-mm/mm/vmalloc.c 2009-09-15 09:43:33.000000000 +0800
> > @@ -1834,7 +1834,6 @@ long vwrite(char *buf, char *addr, unsig
> > struct vm_struct *tmp;
> > char *vaddr;
> > unsigned long n, buflen;
> > - int copied = 0;
> >
> > /* Don't allow overflow */
> > if ((unsigned long) addr + count < count)
> > @@ -1858,7 +1857,6 @@ long vwrite(char *buf, char *addr, unsig
> > n = count;
> > if (!(tmp->flags & VM_IOREMAP)) {
> > aligned_vwrite(buf, addr, n);
> > - copied++;
> > }
> > buf += n;
> > addr += n;
> > @@ -1866,8 +1864,6 @@ long vwrite(char *buf, char *addr, unsig
> > }
> > finished:
> > read_unlock(&vmlist_lock);
> > - if (!copied)
> > - return 0;
> > return buflen;
> > }
> >
> > > ==
> > > needs fix.
> > >
> > > Anyway, I suspect kmem is broken. It should be totally rewritten.
> > >
> > > For example, this doesn't check anything.
> > > ==
> > > if (p < (unsigned long) high_memory) {
> > >
> > > ==
> > >
> > > But....are there users ?
> > > If necessary, I'll write some...
> >
> > I'm trying to stop possible mem/kmem users from accessing hwpoison pages.
> > I'm not the user, but rather a tester ;)
> >
> > Thanks,
> > Fengguang
> >
> > > Thanks,
> > > -Kame
> > >
> > >
> > > > Thanks,
> > > > Fengguang
> > > >
> > > > On Mon, Sep 14, 2009 at 11:29:01AM +0800, Wu Fengguang wrote:
> > > > > The current code silently ignores vwrite() to vmap holes.
> > > > > A better behavior would be to stop the write and return
> > > > > on such holes.
> > > > >
> > > > > Also return on a partial read, which may happen in the future
> > > > > (e.g. on hitting hwpoison pages).
> > > > >
> > > > > CC: Andi Kleen <andi@firstfloor.org>
> > > > > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > > > > ---
> > > > > drivers/char/mem.c | 30 ++++++++++++++++++------------
> > > > > 1 file changed, 18 insertions(+), 12 deletions(-)
> > > > >
> > > > > --- linux-mm.orig/drivers/char/mem.c 2009-09-14 10:59:50.000000000 +0800
> > > > > +++ linux-mm/drivers/char/mem.c 2009-09-14 11:06:25.000000000 +0800
> > > > > @@ -444,18 +444,22 @@ static ssize_t read_kmem(struct file *fi
> > > > > if (!kbuf)
> > > > > return -ENOMEM;
> > > > > while (count > 0) {
> > > > > + unsigned long n;
> > > > > +
> > > > > sz = size_inside_page(p, count);
> > > > > - sz = vread(kbuf, (char *)p, sz);
> > > > > - if (!sz)
> > > > > + n = vread(kbuf, (char *)p, sz);
> > > > > + if (!n)
> > > > > break;
> > > > > - if (copy_to_user(buf, kbuf, sz)) {
> > > > > + if (copy_to_user(buf, kbuf, n)) {
> > > > > free_page((unsigned long)kbuf);
> > > > > return -EFAULT;
> > > > > }
> > > > > - count -= sz;
> > > > > - buf += sz;
> > > > > - read += sz;
> > > > > - p += sz;
> > > > > + count -= n;
> > > > > + buf += n;
> > > > > + read += n;
> > > > > + p += n;
> > > > > + if (n < sz)
> > > > > + break;
> > > > > }
> > > > > free_page((unsigned long)kbuf);
> > > > > }
> > > > > @@ -551,11 +555,13 @@ static ssize_t write_kmem(struct file *
> > > > > free_page((unsigned long)kbuf);
> > > > > return -EFAULT;
> > > > > }
> > > > > - sz = vwrite(kbuf, (char *)p, sz);
> > > > > - count -= sz;
> > > > > - buf += sz;
> > > > > - virtr += sz;
> > > > > - p += sz;
> > > > > + n = vwrite(kbuf, (char *)p, sz);
> > > > > + count -= n;
> > > > > + buf += n;
> > > > > + virtr += n;
> > > > > + p += n;
> > > > > + if (n < sz)
> > > > > + break;
> > > > > }
> > > > > free_page((unsigned long)kbuf);
> > > > > }
> > > >
> >
[-- Attachment #2: tmap.c --]
[-- Type: text/x-csrc, Size: 1385 bytes --]
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>

int page_size;
#define PAGE_SIZE page_size
#define PAGE_MASK (~(PAGE_SIZE-1))

/* Map the /dev/kmem page that contains addr and print the string found there. */
void get_var(unsigned long addr)
{
	off_t ptr = addr & ~(PAGE_MASK);	/* offset of addr within its page */
	off_t offset = addr & PAGE_MASK;	/* page-aligned offset passed to mmap() */
	char *map;
	int kfd;

	kfd = open("/dev/kmem", O_RDONLY);
	if (kfd < 0) {
		perror("open");
		exit(EXIT_FAILURE);
	}
	map = mmap(NULL, PAGE_SIZE, PROT_READ, MAP_SHARED, kfd, offset);
	if (map == MAP_FAILED) {
		perror("mmap");
		exit(EXIT_FAILURE);
	}
	printf("%s\n", map + ptr);
}

int main(int argc, char **argv)
{
	FILE *fp;
	char addr_str[2 + 16 + 1] = "0x";	/* "0x" + up to 16 hex digits + NUL */
	char var[51];
	unsigned long addr;
	char ch;
	int found = 0;

	if (argc != 2) {
		fprintf(stderr, "usage: %s System.map\n", argv[0]);
		exit(EXIT_FAILURE);
	}
	if ((fp = fopen(argv[1], "r")) == NULL) {
		perror("fopen");
		exit(EXIT_FAILURE);
	}
	/* Each line looks like: ffffffff81723880 D modprobe_path */
	while (fscanf(fp, "%16s %c %50s\n", &addr_str[2], &ch, var) == 3) {
		if (strcmp(var, "modprobe_path") == 0) {
			found = 1;
			break;
		}
	}
	fclose(fp);
	if (!found) {
		printf("could not find modprobe_path\n");
		exit(EXIT_FAILURE);
	}
	page_size = getpagesize();
	addr = strtoul(addr_str, NULL, 16);
	printf("found modprobe_path at (%s) %08lx\n", addr_str, addr);
	get_var(addr);
	return 0;
}