From: Christopher Yeoh <cyeoh@au1.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-man@vger.kernel.org, linux-arch@vger.kernel.org
Subject: Re: Cross Memory Attach v3
Date: Mon, 25 Jul 2011 16:41:48 +0930 [thread overview]
Message-ID: <20110725164148.05a672b0@lilo> (raw)
In-Reply-To: <20110721150908.a71a59d6.akpm@linux-foundation.org>
On Thu, 21 Jul 2011 15:09:08 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:
> On Tue, 19 Jul 2011 00:35:37 +0930
> Christopher Yeoh <cyeoh@au1.ibm.com> wrote:
>
> > +/*
> > + * process_vm_rw_pages - read/write pages from task specified
> > + * @task: task to read/write from
> > + * @mm: mm for task
> > + * @process_pages: struct pages area that can store at least
> > + * nr_pages_to_copy struct page pointers
> > + * @pa: address of page in task to start copying from/to
> > + * @start_offset: offset in page to start copying from/to
> > + * @len: number of bytes to copy
> > + * @lvec: iovec array specifying where to copy to/from
> > + * @lvec_cnt: number of elements in iovec array
> > + * @lvec_current: index in iovec array we are up to
> > + * @lvec_offset: offset in bytes from current iovec iov_base we
> > are up to
> > + * @vm_write: 0 means copy from, 1 means copy to
> > + * @nr_pages_to_copy: number of pages to copy
> > + */
> > +static ssize_t process_vm_rw_pages(struct task_struct *task,
> > + struct mm_struct *mm,
> > + - *lvec_offset);
> > .....
> >
> > +
> > + target_kaddr = kmap(process_pages[pgs_copied]) +
> > start_offset; +
> > + if (vm_write)
> > + ret = copy_from_user(target_kaddr,
> > +
> > lvec[*lvec_current].iov_base
> > + + *lvec_offset,
> > + bytes_to_copy);
> > + else
> > + ret =
> > copy_to_user(lvec[*lvec_current].iov_base
> > + + *lvec_offset,
> > + target_kaddr,
> > bytes_to_copy);
> > + kunmap(process_pages[pgs_copied]);
> > + if (ret) {
> > + pgs_copied++;
> > + goto end;
>
> afacit this will always return -EFAULT, even after copying some
> memory.
>
> Is this a misdesign? Would it not be better to return the number of
> bytes actually copied, or -EFAULT is we weren't able to copy any?
> Like read().
>
> That way, userspace can at least process the data which _was_
> transferred, before retrying and then handling the fault. With the
> proposed interface this is not possible, and the data is lost.
Perhaps because of how I'm using the interface for MPI I
was thinking that if it fails at this point due to EFAULT then
its an application error and that the partial read/write wouldn't be
used. But I see your point could be true for other uses of the interface
and the next version will return partial read/write information all the
way back to userspace.
> And note that the function's kerneldoc doesn't document the return
> value at all! kerneldoc sucks that way - there's no formal place in
> the template, so people often ignore this important part of the
> function interface. Similarly, there's no kerneldoc template for
> preconditions such as irq on/off, locks which must be held, etc.
> So they don't get documented. But they're part of the interface.
>
Ok. Will be fixed in next version
> > nr_pages_to_copy);
> > + start_offset = 0;
> > +
> > + if (rc < 0)
> > + return rc;
>
> It's propagated here.
>
> (CodingStyleNit: it's conventional to put {} around the single-line
> block in this case, btw).
>
Fixed now.
> > +
> > + if (nr_pages == 0)
> > + return 0;
> > +
> > + /* For reliability don't try to kmalloc more than 2 pages
> > worth */
> > + process_pages = kmalloc(min_t(size_t,
> > PVM_MAX_KMALLOC_PAGES,
> > + sizeof(struct pages
> > *)*nr_pages),
> > + GFP_KERNEL);
>
> You might get some speed benefit by optimising for the small copies
> here. Define a local on-stack array of N page*'s and point
> process_pages at that if the number of pages is <= N. Saves a
> malloc/free and is more cache-friendly. But only if the result is
> measurable!
ok. will do some benchmarking to see if its worth it.
> > +
> > +static ssize_t process_vm_rw_check_iovecs(pid_t pid,
> > + const struct iovec
> > __user *lvec,
> > + unsigned long liovcnt,
> > + const struct iovec
> > __user *rvec,
> > + unsigned long riovcnt,
> > + unsigned long flags, int
> > vm_write)
>
> I'm allergic to functions with "check" in their name. Check for what?
>
> And one would expect a check_foo() to return a bool or an errno. This
> one returns a ssize_t! Weird! Interface documentation, please.
> Including return value semantics ;)
That part was split out from the main function because the iovec
checks are different for 32 bit compatibility. Have renamed:
process_vm_rw -> process_vm_rw_core
process_vm_rw_check_iovecs -> process_vm_rw
compat_process_vm_rw_check_iovecs -> compat_process_vm_rw
...and added more doco - with return value semantics :-)
Chris
--
cyeoh@au.ibm.com
WARNING: multiple messages have this Message-ID (diff)
From: Christopher Yeoh <cyeoh@au1.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-man@vger.kernel.org, linux-arch@vger.kernel.org
Subject: Re: Cross Memory Attach v3
Date: Mon, 25 Jul 2011 16:41:48 +0930 [thread overview]
Message-ID: <20110725164148.05a672b0@lilo> (raw)
In-Reply-To: <20110721150908.a71a59d6.akpm@linux-foundation.org>
On Thu, 21 Jul 2011 15:09:08 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:
> On Tue, 19 Jul 2011 00:35:37 +0930
> Christopher Yeoh <cyeoh@au1.ibm.com> wrote:
>
> > +/*
> > + * process_vm_rw_pages - read/write pages from task specified
> > + * @task: task to read/write from
> > + * @mm: mm for task
> > + * @process_pages: struct pages area that can store at least
> > + * nr_pages_to_copy struct page pointers
> > + * @pa: address of page in task to start copying from/to
> > + * @start_offset: offset in page to start copying from/to
> > + * @len: number of bytes to copy
> > + * @lvec: iovec array specifying where to copy to/from
> > + * @lvec_cnt: number of elements in iovec array
> > + * @lvec_current: index in iovec array we are up to
> > + * @lvec_offset: offset in bytes from current iovec iov_base we
> > are up to
> > + * @vm_write: 0 means copy from, 1 means copy to
> > + * @nr_pages_to_copy: number of pages to copy
> > + */
> > +static ssize_t process_vm_rw_pages(struct task_struct *task,
> > + struct mm_struct *mm,
> > + - *lvec_offset);
> > .....
> >
> > +
> > + target_kaddr = kmap(process_pages[pgs_copied]) +
> > start_offset; +
> > + if (vm_write)
> > + ret = copy_from_user(target_kaddr,
> > +
> > lvec[*lvec_current].iov_base
> > + + *lvec_offset,
> > + bytes_to_copy);
> > + else
> > + ret =
> > copy_to_user(lvec[*lvec_current].iov_base
> > + + *lvec_offset,
> > + target_kaddr,
> > bytes_to_copy);
> > + kunmap(process_pages[pgs_copied]);
> > + if (ret) {
> > + pgs_copied++;
> > + goto end;
>
> afacit this will always return -EFAULT, even after copying some
> memory.
>
> Is this a misdesign? Would it not be better to return the number of
> bytes actually copied, or -EFAULT is we weren't able to copy any?
> Like read().
>
> That way, userspace can at least process the data which _was_
> transferred, before retrying and then handling the fault. With the
> proposed interface this is not possible, and the data is lost.
Perhaps because of how I'm using the interface for MPI I
was thinking that if it fails at this point due to EFAULT then
its an application error and that the partial read/write wouldn't be
used. But I see your point could be true for other uses of the interface
and the next version will return partial read/write information all the
way back to userspace.
> And note that the function's kerneldoc doesn't document the return
> value at all! kerneldoc sucks that way - there's no formal place in
> the template, so people often ignore this important part of the
> function interface. Similarly, there's no kerneldoc template for
> preconditions such as irq on/off, locks which must be held, etc.
> So they don't get documented. But they're part of the interface.
>
Ok. Will be fixed in next version
> > nr_pages_to_copy);
> > + start_offset = 0;
> > +
> > + if (rc < 0)
> > + return rc;
>
> It's propagated here.
>
> (CodingStyleNit: it's conventional to put {} around the single-line
> block in this case, btw).
>
Fixed now.
> > +
> > + if (nr_pages == 0)
> > + return 0;
> > +
> > + /* For reliability don't try to kmalloc more than 2 pages
> > worth */
> > + process_pages = kmalloc(min_t(size_t,
> > PVM_MAX_KMALLOC_PAGES,
> > + sizeof(struct pages
> > *)*nr_pages),
> > + GFP_KERNEL);
>
> You might get some speed benefit by optimising for the small copies
> here. Define a local on-stack array of N page*'s and point
> process_pages at that if the number of pages is <= N. Saves a
> malloc/free and is more cache-friendly. But only if the result is
> measurable!
ok. will do some benchmarking to see if its worth it.
> > +
> > +static ssize_t process_vm_rw_check_iovecs(pid_t pid,
> > + const struct iovec
> > __user *lvec,
> > + unsigned long liovcnt,
> > + const struct iovec
> > __user *rvec,
> > + unsigned long riovcnt,
> > + unsigned long flags, int
> > vm_write)
>
> I'm allergic to functions with "check" in their name. Check for what?
>
> And one would expect a check_foo() to return a bool or an errno. This
> one returns a ssize_t! Weird! Interface documentation, please.
> Including return value semantics ;)
That part was split out from the main function because the iovec
checks are different for 32 bit compatibility. Have renamed:
process_vm_rw -> process_vm_rw_core
process_vm_rw_check_iovecs -> process_vm_rw
compat_process_vm_rw_check_iovecs -> compat_process_vm_rw
...and added more doco - with return value semantics :-)
Chris
--
cyeoh@au.ibm.com
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2011-07-25 7:11 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-07-18 15:05 Cross Memory Attach v3 Christopher Yeoh
2011-07-18 15:05 ` Christopher Yeoh
2011-07-21 21:54 ` Andrew Morton
2011-07-21 21:54 ` Andrew Morton
2011-07-21 21:54 ` Andrew Morton
2011-07-25 7:02 ` Christopher Yeoh
2011-07-25 7:02 ` Christopher Yeoh
2011-07-21 22:09 ` Andrew Morton
2011-07-21 22:09 ` Andrew Morton
2011-07-25 7:11 ` Christopher Yeoh [this message]
2011-07-25 7:11 ` Christopher Yeoh
2011-08-04 3:54 ` Cross Memory Attach v3 (includes updated patch based on v4) Christopher Yeoh
2011-08-04 3:54 ` Christopher Yeoh
2011-11-20 10:16 ` Cross Memory Attach v3 Geert Uytterhoeven
2011-11-20 10:16 ` Geert Uytterhoeven
2011-11-21 0:13 ` Christopher Yeoh
2011-11-21 0:13 ` Christopher Yeoh
2011-12-04 14:19 ` Geert Uytterhoeven
2011-12-04 14:19 ` Geert Uytterhoeven
2011-12-04 14:37 ` Thorsten Glaser
2011-12-04 16:28 ` Geert Uytterhoeven
2011-12-04 18:34 ` Andreas Schwab
2011-12-11 22:11 ` Thorsten Glaser
2011-12-13 10:59 ` Aurelien Jarno
2011-12-17 3:18 ` Thorsten Glaser
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110725164148.05a672b0@lilo \
--to=cyeoh@au1.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-man@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.