xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] xen/page_alloc: Distinguish different errors from assign_pages()
@ 2016-06-28 17:56 Andrew Cooper
  2016-06-28 17:58 ` Andrew Cooper
  2016-06-28 18:56 ` Domain creation errors Andrew Cooper
  0 siblings, 2 replies; 8+ messages in thread
From: Andrew Cooper @ 2016-06-28 17:56 UTC (permalink / raw)
  To: Xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Tim Deegan, Jan Beulich

assign_pages() has a return type of int, but uses for a boolean value.  As
there are two distinct failure cases, return a more meaningful error.

All caller currently use its boolean nature, so there is no resulting
change (yet).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Jan Beulich <JBeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Tim Deegan <tim@xen.org>
CC: Wei Liu <wei.liu2@citrix.com>

This is the first in a number of changes trying to clean up error reporting of
memory conditions.
---
 xen/common/page_alloc.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 98e30e5..48cf90d 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1740,6 +1740,7 @@ int assign_pages(
     unsigned int order,
     unsigned int memflags)
 {
+    int rc = 0;
     unsigned long i;
 
     spin_lock(&d->page_alloc_lock);
@@ -1748,7 +1749,8 @@ int assign_pages(
     {
         gdprintk(XENLOG_INFO, "Cannot assign page to domain%d -- dying.\n",
                 d->domain_id);
-        goto fail;
+        rc = -EINVAL;
+        goto out;
     }
 
     if ( !(memflags & MEMF_no_refcount) )
@@ -1759,7 +1761,8 @@ int assign_pages(
                 gprintk(XENLOG_INFO, "Over-allocation for domain %u: "
                         "%u > %u\n", d->domain_id,
                         d->tot_pages + (1 << order), d->max_pages);
-            goto fail;
+            rc = -E2BIG;
+            goto out;
         }
 
         if ( unlikely(d->tot_pages == 0) )
@@ -1778,12 +1781,9 @@ int assign_pages(
         page_list_add_tail(&pg[i], &d->page_list);
     }
 
+ out:
     spin_unlock(&d->page_alloc_lock);
-    return 0;
-
- fail:
-    spin_unlock(&d->page_alloc_lock);
-    return -1;
+    return rc;
 }
 
 
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] xen/page_alloc: Distinguish different errors from assign_pages()
  2016-06-28 17:56 [PATCH] xen/page_alloc: Distinguish different errors from assign_pages() Andrew Cooper
@ 2016-06-28 17:58 ` Andrew Cooper
  2016-06-29  9:52   ` Jan Beulich
  2016-06-28 18:56 ` Domain creation errors Andrew Cooper
  1 sibling, 1 reply; 8+ messages in thread
From: Andrew Cooper @ 2016-06-28 17:58 UTC (permalink / raw)
  To: Xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Tim Deegan,
	Jan Beulich

On 28/06/16 18:56, Andrew Cooper wrote:
> assign_pages() has a return type of int, but uses for a boolean value.  As
> there are two distinct failure cases, return a more meaningful error.

Sorry - sent an out of date version.  This sentence actually reads

assign_pages() has a return type of int, but only returns 0 or -1.  As there
are two distinct failure cases, return a more meaningful error.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Domain creation errors
  2016-06-28 17:56 [PATCH] xen/page_alloc: Distinguish different errors from assign_pages() Andrew Cooper
  2016-06-28 17:58 ` Andrew Cooper
@ 2016-06-28 18:56 ` Andrew Cooper
  2016-06-29  9:55   ` Jan Beulich
  1 sibling, 1 reply; 8+ messages in thread
From: Andrew Cooper @ 2016-06-28 18:56 UTC (permalink / raw)
  To: Xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Tim Deegan,
	Julien Grall, Jan Beulich

On 28/06/16 18:56, Andrew Cooper wrote:
> <snip>
>
> This is the first in a number of changes trying to clean up error reporting of
> memory conditions.

One area which is constantly causing problems is creation of domains in
low memory conditions.  In the case where the toolstack gets its
calculations wrong, it is very difficult to diagnose what precicely went
wrong, for a number of reasons.

The error handling in xc_domain_populate_physmap_exact() looks like

    if ( err >= 0 )
    {
        DPRINTF("Failed allocation for dom %d: %ld extents of order %d\n",
                domid, nr_extents, extent_order);
        errno = EBUSY;
        err = -1;
    }

which hides any error whatsoever behind an -EBUSY.

The reason for this is that the hypercall interface for
XENMEM_populate_physmap identifies which extent failed, but not why it
failed.  (Actually, there are plenty of errors which are accounted
against "the next extent", rather than being directly caused by that
extent).


The error conditions I want to be able to distinguish are "Xen is
completely out of memory", and "You are trying to add memory to a domain
beyond its allotted limit".  The former indicates that a toolstacks
logic to account overall memory is problematic, while the latter
indicates a mismatch between different toolstack areas.  Sadly, both end
up as -EBUSY.

The first problem to solve is that alloc_domheap_pages() can't
distinguish the errors.  Two returns from it are legitimately an
indication of ENOMEM, but the assign_pages() failure is specifically
not.  Returning a struct page_info pointer is unhelpful at
distinguishing the issues.

I have thought of two approaches, but its hard to decide between them.

First is to change alloc_domheap_pages() prototype to

int alloc_domheap_pages(struct domain *d, unsigned int order, unsigned
int memflags, struct page_info *out);

while the second is to use PTR_ERR().

Using PTR_ERR() is less disruptive to the code, but will cause
collateral damage for anyone with out-of-tree patches, as the code will
compile but the error logic will be wrong.  The use of PTR_ERR() is also
quite dangerous in the context of a PV guest, as the resulting pointer
is under 64bit guest ABI control.

I am leaning towards the first option, which at least has the advantage
that any out-of-tree code will break in an obvious way.

Any opinions or alternate suggestions?

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] xen/page_alloc: Distinguish different errors from assign_pages()
  2016-06-28 17:58 ` Andrew Cooper
@ 2016-06-29  9:52   ` Jan Beulich
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2016-06-29  9:52 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, WeiLiu, George Dunlap, Tim Deegan, Xen-devel

>>> On 28.06.16 at 19:58, <andrew.cooper3@citrix.com> wrote:
> On 28/06/16 18:56, Andrew Cooper wrote:
>> assign_pages() has a return type of int, but uses for a boolean value.  As
>> there are two distinct failure cases, return a more meaningful error.
> 
> Sorry - sent an out of date version.  This sentence actually reads
> 
> assign_pages() has a return type of int, but only returns 0 or -1.  As there
> are two distinct failure cases, return a more meaningful error.

Reviewed-by: Jan Beulich <jbeulich@suse.com>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Domain creation errors
  2016-06-28 18:56 ` Domain creation errors Andrew Cooper
@ 2016-06-29  9:55   ` Jan Beulich
  2016-06-29 11:11     ` Tim Deegan
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2016-06-29  9:55 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, WeiLiu, George Dunlap, Tim Deegan, Xen-devel,
	Julien Grall

>>> On 28.06.16 at 20:56, <andrew.cooper3@citrix.com> wrote:
> On 28/06/16 18:56, Andrew Cooper wrote:
>> <snip>
>>
>> This is the first in a number of changes trying to clean up error reporting 
> of
>> memory conditions.
> 
> One area which is constantly causing problems is creation of domains in
> low memory conditions.  In the case where the toolstack gets its
> calculations wrong, it is very difficult to diagnose what precicely went
> wrong, for a number of reasons.
> 
> The error handling in xc_domain_populate_physmap_exact() looks like
> 
>     if ( err >= 0 )
>     {
>         DPRINTF("Failed allocation for dom %d: %ld extents of order %d\n",
>                 domid, nr_extents, extent_order);
>         errno = EBUSY;
>         err = -1;
>     }
> 
> which hides any error whatsoever behind an -EBUSY.
> 
> The reason for this is that the hypercall interface for
> XENMEM_populate_physmap identifies which extent failed, but not why it
> failed.  (Actually, there are plenty of errors which are accounted
> against "the next extent", rather than being directly caused by that
> extent).
> 
> 
> The error conditions I want to be able to distinguish are "Xen is
> completely out of memory", and "You are trying to add memory to a domain
> beyond its allotted limit".  The former indicates that a toolstacks
> logic to account overall memory is problematic, while the latter
> indicates a mismatch between different toolstack areas.  Sadly, both end
> up as -EBUSY.
> 
> The first problem to solve is that alloc_domheap_pages() can't
> distinguish the errors.  Two returns from it are legitimately an
> indication of ENOMEM, but the assign_pages() failure is specifically
> not.  Returning a struct page_info pointer is unhelpful at
> distinguishing the issues.
> 
> I have thought of two approaches, but its hard to decide between them.
> 
> First is to change alloc_domheap_pages() prototype to
> 
> int alloc_domheap_pages(struct domain *d, unsigned int order, unsigned
> int memflags, struct page_info *out);
> 
> while the second is to use PTR_ERR().
> 
> Using PTR_ERR() is less disruptive to the code, but will cause
> collateral damage for anyone with out-of-tree patches, as the code will
> compile but the error logic will be wrong.  The use of PTR_ERR() is also
> quite dangerous in the context of a PV guest, as the resulting pointer
> is under 64bit guest ABI control.
> 
> I am leaning towards the first option, which at least has the advantage
> that any out-of-tree code will break in an obvious way.
> 
> Any opinions or alternate suggestions?

To be honest I'm not worried much about out of tree code, and
the err.h abstractions are precisely for cases like this. So I'm for
the PTR_ERR() variant.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Domain creation errors
  2016-06-29  9:55   ` Jan Beulich
@ 2016-06-29 11:11     ` Tim Deegan
  2016-06-29 11:21       ` Andrew Cooper
  0 siblings, 1 reply; 8+ messages in thread
From: Tim Deegan @ 2016-06-29 11:11 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, WeiLiu, George Dunlap, Andrew Cooper,
	Xen-devel, Julien Grall

At 03:55 -0600 on 29 Jun (1467172554), Jan Beulich wrote:
> >>> On 28.06.16 at 20:56, <andrew.cooper3@citrix.com> wrote:
> > Using PTR_ERR() is less disruptive to the code, but will cause
> > collateral damage for anyone with out-of-tree patches, as the code will
> > compile but the error logic will be wrong.  The use of PTR_ERR() is also
> > quite dangerous in the context of a PV guest, as the resulting pointer
> > is under 64bit guest ABI control.
> > 
> > I am leaning towards the first option, which at least has the advantage
> > that any out-of-tree code will break in an obvious way.
> > 
> > Any opinions or alternate suggestions?
> 
> To be honest I'm not worried much about out of tree code, and
> the err.h abstractions are precisely for cases like this. So I'm for
> the PTR_ERR() variant.

+1, FWIW.  Can the x86_64/PV problem be avoided by using non-canonical
error addresses?

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Domain creation errors
  2016-06-29 11:11     ` Tim Deegan
@ 2016-06-29 11:21       ` Andrew Cooper
  2016-06-29 14:20         ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Cooper @ 2016-06-29 11:21 UTC (permalink / raw)
  To: Tim Deegan, Jan Beulich
  Cc: Stefano Stabellini, WeiLiu, George Dunlap, Xen-devel,
	Julien Grall

On 29/06/16 12:11, Tim Deegan wrote:
> At 03:55 -0600 on 29 Jun (1467172554), Jan Beulich wrote:
>>>>> On 28.06.16 at 20:56, <andrew.cooper3@citrix.com> wrote:
>>> Using PTR_ERR() is less disruptive to the code, but will cause
>>> collateral damage for anyone with out-of-tree patches, as the code will
>>> compile but the error logic will be wrong.  The use of PTR_ERR() is also
>>> quite dangerous in the context of a PV guest, as the resulting pointer
>>> is under 64bit guest ABI control.
>>>
>>> I am leaning towards the first option, which at least has the advantage
>>> that any out-of-tree code will break in an obvious way.
>>>
>>> Any opinions or alternate suggestions?
>> To be honest I'm not worried much about out of tree code, and
>> the err.h abstractions are precisely for cases like this. So I'm for
>> the PTR_ERR() variant.
> +1, FWIW.  Can the x86_64/PV problem be avoided by using non-canonical
> error addresses?

I can look into that, but it will definitely complicate the PTR_ERR()
handling.  Linux gets away with the status quo as the pointers which are
actually error integers fall into kernel-controlled memory.

The other reason I am hesitant about PTR_ERR() is that it obfuscates the
semantics sufficiently for Coverity to give up.  As Coverity has found
legitimate issues with the use of alloc_domheap_pages() in the past, I
am hesitant to make things harder to interpret.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Domain creation errors
  2016-06-29 11:21       ` Andrew Cooper
@ 2016-06-29 14:20         ` Jan Beulich
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2016-06-29 14:20 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, WeiLiu, George Dunlap, Tim Deegan, Xen-devel,
	Julien Grall

>>> On 29.06.16 at 13:21, <andrew.cooper3@citrix.com> wrote:
> The other reason I am hesitant about PTR_ERR() is that it obfuscates the
> semantics sufficiently for Coverity to give up.

Mind giving some more detail? These inline functions aren't all that
obfuscating - just a couple of casts. If that's enough to confuse
Coverity then I think we have many more problems elsewhere (no
matter how much we try to fight introduction of new casts).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-06-29 14:20 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-28 17:56 [PATCH] xen/page_alloc: Distinguish different errors from assign_pages() Andrew Cooper
2016-06-28 17:58 ` Andrew Cooper
2016-06-29  9:52   ` Jan Beulich
2016-06-28 18:56 ` Domain creation errors Andrew Cooper
2016-06-29  9:55   ` Jan Beulich
2016-06-29 11:11     ` Tim Deegan
2016-06-29 11:21       ` Andrew Cooper
2016-06-29 14:20         ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).