xen-devel.lists.xenproject.org archive mirror
* Xen hypercall API/ABI problems
@ 2013-06-19 15:43 Andrew Cooper
  2013-06-20  9:01 ` Jan Beulich
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Cooper @ 2013-06-19 15:43 UTC (permalink / raw)
  To: Xen-devel List, Keir Fraser, Jan Beulich, Tim Deegan,
	Ian Campbell

Hello,

While attempting to teach a hypercall-aware valgrind about enough
hypercalls to allow it to introspect HVM domain migration I came across
some systemic problems with certain hypercalls, particularly with migrate.

Here is the example of XENMEM_maximum_ram_page, but it is not alone as
far as this goes.

In Xen, it is defined as

/*
 * Returns the maximum machine frame number of mapped RAM in this system.
 * This command always succeeds (it never returns an error code).
 * arg == NULL.
 */

In memory.c, there is a possible unsigned->signed conversion error from
max_pages to rc.
In compat/memory.c, there is a long->int truncation error for compat
hypercalls, although newer versions of Xen cap this at INT_{MIN,MAX}.

The privcmd driver passes the hypercall rc through as the return from
the ioctl handler, which introduces a possible long->int truncation error.

In libxc, do_memory_op() is expected to use -errno style error handling,
but does not enforce it.  There is, however, a possible int->long
sign-extension issue with xc_maximum_ram_page().

The value from this is then stuffed into unsigned long minfo->max_mfn
and immediately used in an attempt to map the M2P table.
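
To make the chain above concrete, here is a minimal sketch of what can
happen to a large frame number on its way from the hypercall to
minfo->max_mfn.  The function names are hypothetical stand-ins, not the
real privcmd or libxc code, and fixed-width types are used in place of
long and int so the behaviour does not depend on the build target.

```c
#include <stdint.h>

/* Hypothetical stand-in for the privcmd ioctl handler: the hypercall
 * returns a 64-bit long, but the ioctl return type is a 32-bit int. */
static int32_t ioctl_return(int64_t hypercall_rc)
{
    /* long -> int truncation.  Converting an out-of-range value is
     * implementation-defined in C; GCC and Clang keep the low 32 bits. */
    return (int32_t)hypercall_rc;
}

/* Hypothetical stand-in for the libxc caller that stores the result in
 * an unsigned long field such as minfo->max_mfn. */
static uint64_t stored_max_mfn(int32_t ioctl_rc)
{
    int64_t widened = ioctl_rc;   /* int -> long sign extension */
    return (uint64_t)widened;     /* reinterpreted as unsigned  */
}
```

Under those assumptions, a frame number of 0x80000000 is stored as
0xffffffff80000000, and 0x100000005 is stored as 5.  Neither looks like
an error to the caller.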


From the work with XSA-55, we have already identified that the error
handling and propagation in libxc leaves a lot to be desired.  However,
the hypervisor side of things is just as problematic.

What policy do we have about deprecating hypercall interfaces and
introducing newer ones?  At a minimum, all hypercalls should be using
-errno style errors, with a possibility of returning 0 to LONG_MAX as well.
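
As a rough illustration of that convention, a caller could split a
return value into "error" and "value" like this.  The struct and
function names are made up for the example and do not exist in Xen or
libxc.

```c
#include <stdint.h>

/* Hypothetical decoded result: err is 0 on success, otherwise a
 * positive errno value; value is only meaningful when err is 0. */
struct hc_result {
    int32_t  err;
    uint64_t value;
};

static struct hc_result hc_decode(int64_t rc)
{
    struct hc_result r = { 0, 0 };

    if (rc < 0) {
        /* Clamp before negating so the negation cannot overflow and
         * the result always fits in err. */
        r.err = (rc < -INT32_MAX) ? INT32_MAX : (int32_t)-rc;
    } else {
        /* Anything from 0 up to the maximum signed value is data. */
        r.value = (uint64_t)rc;
    }
    return r;
}
```

The point of the sketch is that the sign of the return value is the
only thing the caller has to look at to tell an error from a result.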

I realise that simply changing the hypercalls in place is not possible. 
Would it be acceptable to have a step change across a Xen version (say
early in 4.4) where consumers of the public interface would have to make
use of -DXEN_LEGACY_UNSAFE_HYPERCALLS (or equivalent) in an attempt to
move them forward with the API ?

~Andrew


* Re: Xen hypercall API/ABI problems
  2013-06-19 15:43 Xen hypercall API/ABI problems Andrew Cooper
@ 2013-06-20  9:01 ` Jan Beulich
  2013-06-25 13:10   ` Andrew Cooper
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Beulich @ 2013-06-20  9:01 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Tim Deegan, Keir Fraser, Ian Campbell, Xen-devel List

>>> On 19.06.13 at 17:43, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> In memory.c, there is a possible unsigned->signed conversion error from
> max_pages to rc.

That's of no concern as long as the maximum possible value can't
result in the value being negative. Plus it's problematic only when
the hypervisor is 32-bit (as otherwise it's a conversion from
"unsigned int" to "signed long").

And for the list of items to be complete - there's a similar conversion
for d->tot_pages.

> In compat/memory.c, there is a long->int truncation error for compat
> hypercalls, although newer versions of Xen cap this at INT_{MIN,MAX}

That was added precisely to avoid uncontrolled truncation.
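
For readers who have not looked at it, the capping amounts to a
saturating conversion along these lines.  This is only a sketch of the
idea with a made-up function name, not the actual compat/memory.c code.

```c
#include <stdint.h>

/* Saturate a native (64-bit) return value into the 32-bit range seen
 * by a compat guest, instead of letting it truncate. */
static int32_t compat_clamp_rc(int64_t rc)
{
    if (rc > INT32_MAX)
        return INT32_MAX;
    if (rc < INT32_MIN)
        return INT32_MIN;
    return (int32_t)rc;
}
```

A value that does not fit still comes back wrong, but it stays a
success-looking value and is predictable rather than depending on
whichever low bits survive.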

> The privcmd driver passes the hypercall rc through as the return from
> the ioctl handler, which introduces a possible long->int truncation error.

That's an outright bug, introduced by improper code transformations
when porting the XenoLinux code to the upstream kernel, or - if the
porting was done long enough ago - lack of noticing linux-2.6.18-xen.hg
c/s 984.

> From the work with XSA-55, we have already identified that the error
> handling and propagation in libxc leaves a lot to be desired.  However,
> the hypervisor side of things is just as problematic.

Given the above I'm not clear what problematic point you see.

> What policy do we have about deprecating hypercall interfaces and
> introducing newer ones?  At a minimum, all hypercalls should be using
> -errno style errors, with a possibility of returning 0 to LONG_MAX as well.
> 
> I realise that simply changing the hypercalls in place is not possible. 
> Would it be acceptable to have a step change across a Xen version (say
> early in 4.4) where consumers of the public interface would have to make
> use of -DXEN_LEGACY_UNSAFE_HYPERCALLS (or equivalent) in an attempt to
> move them forward with the API ?

That's what we have __XEN_INTERFACE_VERSION__ for - just
guard stuff you don't want up-to-date consumers to use anymore
with a respective #if __XEN_INTERFACE_VERSION__ < 0x040400.
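
A minimal sketch of such a guard follows.  The default chosen for
__XEN_INTERFACE_VERSION__ and the DEMO_ macro are assumptions made for
the example; the real public headers set the default elsewhere and
would guard the actual declarations.

```c
/* Assumption for this demo: the consumer builds against the 4.4
 * interface unless it explicitly asks for something older. */
#ifndef __XEN_INTERFACE_VERSION__
#define __XEN_INTERFACE_VERSION__ 0x040400
#endif

#if __XEN_INTERFACE_VERSION__ < 0x040400
/* Only consumers requesting a pre-4.4 interface see the legacy op. */
#define DEMO_LEGACY_MAXIMUM_RAM_PAGE 1
#endif

/* Reports whether the legacy definition is visible in this build. */
static int legacy_op_visible(void)
{
#ifdef DEMO_LEGACY_MAXIMUM_RAM_PAGE
    return 1;
#else
    return 0;
#endif
}
```

Building the same file with -D__XEN_INTERFACE_VERSION__=0x040300 would
make the legacy definition visible again, which is the opt-in an old
consumer would use.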

Of course pv-ops is lacking any such version handling so far,
apparently with the original hope of only using up-to-date bits.

Jan


* Re: Xen hypercall API/ABI problems
  2013-06-20  9:01 ` Jan Beulich
@ 2013-06-25 13:10   ` Andrew Cooper
  2013-06-25 14:04     ` Jan Beulich
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Cooper @ 2013-06-25 13:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Tim (Xen.org), Keir (Xen.org), Ian Campbell, Xen-devel List

On 20/06/13 10:01, Jan Beulich wrote:
>>>> On 19.06.13 at 17:43, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> In memory.c, there is a possible unsigned->signed conversion error from
>> max_pages to rc.
> That's of no concern as long as the maximum possible value can't
> result in the value being negative. Plus it's problematic only when
> the hypervisor is 32-bit (as otherwise it's a conversion from
> "unsigned int" to "signed long").
>
> And for the list of items to be complete - there's a similar conversion
> for d->tot_pages.

In this case, 64bit domain on 64bit Xen is fine.  This hypercall is ok
as it really shouldn't be returning more than ((~0ULL)>>PAGE_SHIFT)
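
As a quick check of that bound, assuming 4 KiB pages (PAGE_SHIFT of 12,
named DEMO_PAGE_SHIFT here to make clear it is an assumption) and a
64-bit signed return type:

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Largest frame number a 64-bit address space can contain. */
static uint64_t max_frame_number(void)
{
    return ~UINT64_C(0) >> DEMO_PAGE_SHIFT;
}

/* Nonzero if v can be returned in a signed 64-bit value without
 * turning negative. */
static int fits_in_signed_long(uint64_t v)
{
    return v <= (uint64_t)INT64_MAX;
}
```

With those assumptions the bound is 0x000fffffffffffff (2^52 - 1),
which leaves the sign bit clear.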

I guess the question boils down to this:

Is it ok to retroactively apply -error semantics to hypercalls which
were previously defined to never return an error?  Already for the
compat layer a wrong value is being returned. All we would be doing is
changing from INT_MAX to -ERANGE which is differently wrong but more
consistent.

~Andrew


* Re: Xen hypercall API/ABI problems
  2013-06-25 13:10   ` Andrew Cooper
@ 2013-06-25 14:04     ` Jan Beulich
  0 siblings, 0 replies; 4+ messages in thread
From: Jan Beulich @ 2013-06-25 14:04 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Tim (Xen.org), Keir (Xen.org), Ian Campbell, Xen-devel List

>>> On 25.06.13 at 15:10, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 20/06/13 10:01, Jan Beulich wrote:
>>>>> On 19.06.13 at 17:43, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> In memory.c, there is a possible unsigned->signed conversion error from
>>> max_pages to rc.
>> That's of no concern as long as the maximum possible value can't
>> result in the value being negative. Plus it's problematic only when
>> the hypervisor is 32-bit (as otherwise it's a conversion from
>> "unsigned int" to "signed long").
>>
>> And for the list of items to be complete - there's a similar conversion
>> for d->tot_pages.
> 
> In this case, 64bit domain on 64bit Xen is fine.  This hypercall is ok
> as it really shouldn't be returning more than ((~0ULL)>>PAGE_SHIFT)
> 
> I guess the question boils down to this:
> 
> Is it ok to retroactively apply -error semantics to hypercalls which
> were previously defined to never return an error?  Already for the
> compat layer a wrong value is being returned. All we would be doing is
> changing from INT_MAX to -ERANGE which is differently wrong but more
> consistent.

I think it is okay if the change is, like here, from a de facto random
value (due to having got truncated) to a predictable error indicator.
The capping to INT_MAX was trying to do almost the same (with
the goal of not converting a success return to an error one).
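
For comparison with the capping approach, the -ERANGE variant discussed
here could look roughly like the following.  The function name is
hypothetical and this is a sketch of the idea, not a proposed patch.

```c
#include <errno.h>
#include <stdint.h>

/* Return the native value to a compat (32-bit) caller if it fits,
 * otherwise a predictable error instead of a truncated or capped
 * value. */
static int32_t compat_rc_or_erange(int64_t native_rc)
{
    if (native_rc > INT32_MAX || native_rc < INT32_MIN)
        return -ERANGE;
    return (int32_t)native_rc;
}
```

Existing negative error codes pass through unchanged, so the only
callers that see a difference are those that previously received a
wrong value.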

Jan

