From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andres Lagar-Cavilla Subject: Re: VMX status report. Xen:26323 & Dom0:3.7.1 Date: Mon, 14 Jan 2013 11:18:07 -0500 Message-ID: <8B662C72-CD4B-44EF-BF8C-19CDEE4D06CC@gmail.com> References: <1B4B44D9196EFF41AE41FDA404FC0A1024486E@SHSMSX101.ccr.corp.intel.com> <50EE908602000078000B44CE@nat28.tlf.novell.com> <50EFDC8802000078000B4AC2@nat28.tlf.novell.com> <750FD2DB-E7A5-4038-9274-2CBAF2B4027C@gridcentric.ca> <50F40F42.5020807@citrix.com> <65B8802C-FE47-43CB-87EF-B168F57FF6DA@gmail.com> <50F42C6A.2080702@citrix.com> Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Content-Type: multipart/mixed; boundary="===============1728604709129947767==" Return-path: In-Reply-To: <50F42C6A.2080702@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: David Vrabel Cc: Yongjie Ren , Ian Campbell , Konrad Wilk , Andres Lagar-Cavilla , xen-devel , Chao Zhou , Jan Beulich , Mats Petersson , Yan Dai , YongweiX Xu , SongtaoX Liu , Andres Lagar-Cavilla List-Id: xen-devel@lists.xenproject.org --===============1728604709129947767== Content-Type: multipart/alternative; boundary="Apple-Mail=_298094DA-B00E-4399-9D22-BE08576A2A7D" --Apple-Mail=_298094DA-B00E-4399-9D22-BE08576A2A7D Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=iso-8859-1 On Jan 14, 2013, at 11:03 AM, David Vrabel = wrote: > On 14/01/13 15:06, Andres Lagar-Cavilla wrote: >> On Jan 14, 2013, at 8:59 AM, David Vrabel = wrote: >>=20 >>> On 14/01/13 04:29, Andres Lagar-Cavilla wrote: >>>>=20 >>>> Below you'll find pasted an RFC patch to fix this. I've expanded = the >>>> cc line to add Mats Peterson, who is also looking into some = improvements >>>> to privcmd (and IanC for general feedback). >>>>=20 >>>> The RFC patch cuts down code overall and cleans up logic too. I did >>>> change the behavior wrt classic implementations when it comes to >>>> handling errors & EFAULT. Instead of doing all the mapping work and = then >>>> copying back to user, I copy back each individual mapping error as = soon >>>> as it arises. And short-circuit and quit the whole operation as = soon as >>>> the first EFAULT arises. >>>=20 >>> Which is broken. >> Certainly due to copy_on_write within mmap semaphore. Unfortunately I = didn't have time last night to post the fix, pardon for the noise. >>> Please just look at my v3 patch and implement that method. >=20 > ... but be aware that I messed up mmap_return_errors() for V1 and set > all MFNs as having errors. Oops. >=20 >> The one nit I have about that is that it does an unnecessary get_user = of the mfn on the second pass for V1. HOw about this? >=20 > __get_user() and __put_user() are actually cheap (provided they don't > fault). >=20 > This looks ok except for one thing. >=20 >> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c >> index 3421f0d..fc4952d 100644 >> --- a/drivers/xen/privcmd.c >> +++ b/drivers/xen/privcmd.c > [...] >> @@ -288,7 +289,19 @@ static int mmap_batch_fn(void *data, void = *state) >> &cur_page); >>=20 >> /* Store error code for second pass. */ >> - *(st->err++) =3D ret; >> + if (st->version =3D=3D 1) { >> + if (ret < 0) { >> + /* >> + * V1 encodes the error codes in the 32bit top = nibble of the >> + * mfn (with its known limitations vis-a-vis 64 = bit callers). >> + */ >> + *mfnp |=3D (ret =3D=3D -ENOENT) ? >> + = PRIVCMD_MMAPBATCH_PAGED_ERROR : >> + = PRIVCMD_MMAPBATCH_MFN_ERROR; >=20 > You also need to clear the top nibble on success (ret >=3D 0) so large > PFNs with the top nibble already set don't give false positives of = errors. But classic kernels don't do this either, afaict (e.g. = http://xenbits.xen.org/hg/linux-2.6.18-xen.hg/file/c340a22a3a63/drivers/xe= n/privcmd/privcmd.c#l282 or XenServer's 6.1 kernel). The key reason for V2 is to get rid of all these limitations and stop = trying to fix them in V1. I'm open to whichever fix, though. It'd be just one line for the else = case. I'd just like some feedback before "officially" submitting a = patch. Thanks! Andres >=20 > David >=20 --Apple-Mail=_298094DA-B00E-4399-9D22-BE08576A2A7D Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=iso-8859-1 david.vrabel@citrix.com> = wrote:
On 14/01/13 15:06, Andres Lagar-Cavilla = wrote:
On Jan 14, 2013, at 8:59 AM, David = Vrabel <david.vrabel@citrix.com> = wrote:

On 14/01/13 04:29, Andres = Lagar-Cavilla wrote:

Below you'll find = pasted an RFC patch to fix this. I've expanded the
cc line to add = Mats Peterson, who is also looking into some improvements
to privcmd = (and IanC for general feedback).

The RFC patch cuts down code = overall and cleans up logic too. I did
change the behavior wrt = classic implementations when it comes to
handling errors & = EFAULT. Instead of doing all the mapping work and then
copying back = to user, I copy back each individual mapping error as soon
as it = arises. And short-circuit and quit the whole operation as soon as
the = first EFAULT arises.

Which is = broken.
Certainly due to copy_on_write within mmap = semaphore. Unfortunately I didn't have time last night to post the fix, = pardon for the noise.
Please just look at = my v3 patch and implement that = method.

... but be aware that I messed = up mmap_return_errors() for V1 and set
all MFNs as having errors. =  Oops.

The one nit I have about = that is that it does an unnecessary get_user of the mfn on the second = pass for V1. HOw about this?

__get_user() and = __put_user() are actually cheap (provided they = don't
fault).

This looks ok except for one = thing.

diff --git = a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index = 3421f0d..fc4952d 100644
--- a/drivers/xen/privcmd.c
+++ = b/drivers/xen/privcmd.c
[...]
@@ -288,7 +289,19 @@ static int mmap_batch_fn(void *data, = void *state)
= = = = = &cur_page);

/* Store error code for second = pass. */
- = *(st->err++) =3D ret;
+ if (st->version =3D=3D 1) = {
+ = = if (ret < 0) {
+ /*
+ * V1 = encodes the error codes in the 32bit top nibble of the
+ * mfn = (with its known limitations vis-a-vis 64 bit callers).
+ = */
+ = = = *mfnp |=3D (ret =3D=3D -ENOENT) ?
+ = PRIVCMD_MMAPBATCH_PAGED_ERROR :
+ = PRIVCMD_MMAPBATCH_MFN_ERROR;

You also need to = clear the top nibble on success (ret >=3D 0) so large
PFNs with = the top nibble already set don't give false positives of = errors.

But classic kernels don't do this = either, afaict (e.g. http://xenbits.xen.org/hg/linux-2.6.18-x= en.hg/file/c340a22a3a63/drivers/xen/privcmd/privcmd.c#l282 or = XenServer's 6.1 kernel).

The key reason for V2 = is to get rid of all these limitations and stop trying to fix them in = V1.

I'm open to whichever fix, though. It'd be = just one line for the else case. I'd just like some feedback before = "officially" submitting a = patch.

Thanks!
Andres

David


= --Apple-Mail=_298094DA-B00E-4399-9D22-BE08576A2A7D-- --===============1728604709129947767== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============1728604709129947767==--