xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0 of 5] xl: fix guest reboot failures
@ 2012-06-22 10:56 Ian Campbell
  2012-06-22 10:56 ` [PATCH 1 of 5] libxl: validate scheduler parameters Ian Campbell
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Ian Campbell @ 2012-06-22 10:56 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson, Dario Faggioli

The following fixes recent test failures reboot a guest (e.g. flight
13302).

The main fix is the first patch which stops us from trying to validate
a domains scheduler paramters before it has been created.

The others are miscellaneous fixes I noticed while tracking that down.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1 of 5] libxl: validate scheduler parameters
  2012-06-22 10:56 [PATCH 0 of 5] xl: fix guest reboot failures Ian Campbell
@ 2012-06-22 10:56 ` Ian Campbell
  2012-06-22 11:23   ` Ian Campbell
  2012-06-22 10:56 ` [PATCH 2 of 5] libxl: initialise cpupoolinfo in libxl__domain_scheduler Ian Campbell
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Ian Campbell @ 2012-06-22 10:56 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson, Dario Faggioli

# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1340361703 -3600
# Node ID 998d48ccb8905907cb2f104b475e5ab6ad445348
# Parent  5fb3c536b5a8810fb8be4149df609d272beff959
libxl: validate scheduler parameters

This was previously done by xl itself however the domain was not
created at that point so there was no domid to check. This happened to
work on first boot because xl's global domid was initialised to zero
so we would (incorrectly) validate the new domain to be against
domain0. On reboot though we would try to use the old domain's id and
fail.

sched_params_valid is moved and gains a gc+domid parameters and
s/ctx/CTX/. The call is placed after
libxl__domain_build_info_setdefault in the create path, because
set_defaults doesn't have access to the domid and there are other
callers which don't even have a domid to give it.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

diff -r 5fb3c536b5a8 -r 998d48ccb890 tools/libxl/libxl_create.c
--- a/tools/libxl/libxl_create.c	Fri Jun 22 10:14:46 2012 +0100
+++ b/tools/libxl/libxl_create.c	Fri Jun 22 11:41:43 2012 +0100
@@ -72,6 +72,49 @@ int libxl__domain_create_info_setdefault
     return 0;
 }
 
+static int sched_params_valid(libxl__gc *gc,
+                              uint32_t domid, libxl_domain_sched_params *scp)
+{
+    int has_weight = scp->weight != LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT;
+    int has_period = scp->period != LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT;
+    int has_slice = scp->slice != LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT;
+    int has_extratime =
+                scp->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT;
+    libxl_domain_sched_params sci;
+
+    libxl_domain_sched_params_get(CTX, domid, &sci);
+
+    /* The sedf scheduler needs some more consistency checking */
+    if (sci.sched == LIBXL_SCHEDULER_SEDF) {
+        if (has_weight && (has_period || has_slice))
+            return 0;
+        if (has_period != has_slice)
+            return 0;
+
+        /*
+         * Idea is, if we specify a weight, then both period and
+         * slice has to be zero. OTOH, if we do not specify a weight,
+         * that means we want a pure best effort domain or an actual
+         * real-time one. In the former case, it is period that needs
+         * to be zero, in the latter, weight should be.
+         */
+        if (has_weight) {
+            scp->slice = 0;
+            scp->period = 0;
+        }
+        else if (!has_period) {
+            /* We can setup a proper best effort domain (extra time only)
+             * iff we either already have or are asking for some extra time. */
+            scp->weight = has_extratime ? scp->extratime : sci.extratime;
+            scp->period = 0;
+        }
+        if (has_period && has_slice)
+            scp->weight = 0;
+    }
+
+    return 1;
+}
+
 int libxl__domain_build_info_setdefault(libxl__gc *gc,
                                         libxl_domain_build_info *b_info)
 {
@@ -622,6 +665,12 @@ static void initiate_domain_create(libxl
     ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info);
     if (ret) goto error_out;
 
+    if (!sched_params_valid(gc, domid, &d_config->b_info.sched_params)) {
+        LOG(ERROR, "Invalid scheduling parameters\n");
+        ret = ERROR_INVAL;
+        goto error_out;
+    }
+
     for (i = 0; i < d_config->num_disks; i++) {
         ret = libxl__device_disk_setdefault(gc, &d_config->disks[i]);
         if (ret) goto error_out;
diff -r 5fb3c536b5a8 -r 998d48ccb890 tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c	Fri Jun 22 10:14:46 2012 +0100
+++ b/tools/libxl/xl_cmdimpl.c	Fri Jun 22 11:41:43 2012 +0100
@@ -550,48 +550,6 @@ vcpp_out:
     return rc;
 }
 
-static int sched_params_valid(libxl_domain_sched_params *scp)
-{
-    int has_weight = scp->weight != LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT;
-    int has_period = scp->period != LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT;
-    int has_slice = scp->slice != LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT;
-    int has_extratime =
-                scp->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT;
-    libxl_domain_sched_params sci;
-
-    libxl_domain_sched_params_get(ctx, domid, &sci);
-
-    /* The sedf scheduler needs some more consistency checking */
-    if (sci.sched == LIBXL_SCHEDULER_SEDF) {
-        if (has_weight && (has_period || has_slice))
-            return 0;
-        if (has_period != has_slice)
-            return 0;
-
-        /*
-         * Idea is, if we specify a weight, then both period and
-         * slice has to be zero. OTOH, if we do not specify a weight,
-         * that means we want a pure best effort domain or an actual
-         * real-time one. In the former case, it is period that needs
-         * to be zero, in the latter, weight should be.
-         */
-        if (has_weight) {
-            scp->slice = 0;
-            scp->period = 0;
-        }
-        else if (!has_period) {
-            /* We can setup a proper best effort domain (extra time only)
-             * iff we either already have or are asking for some extra time. */
-            scp->weight = has_extratime ? scp->extratime : sci.extratime;
-            scp->period = 0;
-        }
-        if (has_period && has_slice)
-            scp->weight = 0;
-    }
-
-    return 1;
-}
-
 static void parse_config_data(const char *config_source,
                               const char *config_data,
                               int config_len,
@@ -686,10 +644,6 @@ static void parse_config_data(const char
         b_info->sched_params.latency = l;
     if (!xlu_cfg_get_long (config, "extratime", &l, 0))
         b_info->sched_params.extratime = l;
-    if (!sched_params_valid(&b_info->sched_params)) {
-        fprintf(stderr, "Invalid scheduling parameters\n");
-        exit(1);
-    }
 
     if (!xlu_cfg_get_long (config, "vcpus", &l, 0)) {
         b_info->max_vcpus = l;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2 of 5] libxl: initialise cpupoolinfo in libxl__domain_scheduler
  2012-06-22 10:56 [PATCH 0 of 5] xl: fix guest reboot failures Ian Campbell
  2012-06-22 10:56 ` [PATCH 1 of 5] libxl: validate scheduler parameters Ian Campbell
@ 2012-06-22 10:56 ` Ian Campbell
  2012-06-22 10:56 ` [PATCH 3 of 5] libxl: correct type of cpupool variable Ian Campbell
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Ian Campbell @ 2012-06-22 10:56 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson, Dario Faggioli

# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1340362527 -3600
# Node ID b6a78743e13fb5b7f652f25a541eb425a21f1396
# Parent  998d48ccb8905907cb2f104b475e5ab6ad445348
libxl: initialise cpupoolinfo in libxl__domain_scheduler

If libxl_cpupool_info fails then we would call
libxl_cpupoolinfo_dispose on an uninitialised struct, and possibly
free an invalid pointer.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

diff -r 998d48ccb890 -r b6a78743e13f tools/libxl/libxl_dom.c
--- a/tools/libxl/libxl_dom.c	Fri Jun 22 11:41:43 2012 +0100
+++ b/tools/libxl/libxl_dom.c	Fri Jun 22 11:55:27 2012 +0100
@@ -84,6 +84,7 @@ libxl_scheduler libxl__domain_scheduler(
     if (cpupool < 0)
         return sched;
 
+    libxl_cpupoolinfo_init(&poolinfo);
     rc = libxl_cpupool_info(CTX, &poolinfo, cpupool);
     if (rc < 0)
         goto out;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3 of 5] libxl: correct type of cpupool variable
  2012-06-22 10:56 [PATCH 0 of 5] xl: fix guest reboot failures Ian Campbell
  2012-06-22 10:56 ` [PATCH 1 of 5] libxl: validate scheduler parameters Ian Campbell
  2012-06-22 10:56 ` [PATCH 2 of 5] libxl: initialise cpupoolinfo in libxl__domain_scheduler Ian Campbell
@ 2012-06-22 10:56 ` Ian Campbell
  2012-06-22 10:56 ` [PATCH 4 of 5] libxl: log on failure in cpupool_info and libxl__domain_cpupool Ian Campbell
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Ian Campbell @ 2012-06-22 10:56 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson, Dario Faggioli

# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1340362529 -3600
# Node ID b63faf77bff33d7c9ebfe7260ecaffc26f3773a8
# Parent  b6a78743e13fb5b7f652f25a541eb425a21f1396
libxl: correct type of cpupool variable.

libxl__domain_cpupool returns int and can return ERROR_* so we need to
use a signed type.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

diff -r b6a78743e13f -r b63faf77bff3 tools/libxl/libxl_dom.c
--- a/tools/libxl/libxl_dom.c	Fri Jun 22 11:55:27 2012 +0100
+++ b/tools/libxl/libxl_dom.c	Fri Jun 22 11:55:29 2012 +0100
@@ -76,7 +76,7 @@ int libxl__domain_cpupool(libxl__gc *gc,
 
 libxl_scheduler libxl__domain_scheduler(libxl__gc *gc, uint32_t domid)
 {
-    uint32_t cpupool = libxl__domain_cpupool(gc, domid);
+    int cpupool = libxl__domain_cpupool(gc, domid);
     libxl_cpupoolinfo poolinfo;
     libxl_scheduler sched = LIBXL_SCHEDULER_UNKNOWN;
     int rc;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 4 of 5] libxl: log on failure in cpupool_info and libxl__domain_cpupool
  2012-06-22 10:56 [PATCH 0 of 5] xl: fix guest reboot failures Ian Campbell
                   ` (2 preceding siblings ...)
  2012-06-22 10:56 ` [PATCH 3 of 5] libxl: correct type of cpupool variable Ian Campbell
@ 2012-06-22 10:56 ` Ian Campbell
  2012-06-22 10:56 ` [PATCH 5 of 5] xl: initialise domid to an explicitly invalid value Ian Campbell
  2012-06-29 14:45 ` [PATCH 0 of 5] xl: fix guest reboot failures Ian Jackson
  5 siblings, 0 replies; 13+ messages in thread
From: Ian Campbell @ 2012-06-22 10:56 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson, Dario Faggioli

# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1340362601 -3600
# Node ID 2829b66dfe19afd4537ea4b2cfcef0cf28b49472
# Parent  b63faf77bff33d7c9ebfe7260ecaffc26f3773a8
libxl: log on failure in cpupool_info and libxl__domain_cpupool

Also in cpupool_info propagate the failure value from
libxl_cpumap_alloc.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

diff -r b63faf77bff3 -r 2829b66dfe19 tools/libxl/libxl.c
--- a/tools/libxl/libxl.c	Fri Jun 22 11:55:29 2012 +0100
+++ b/tools/libxl/libxl.c	Fri Jun 22 11:56:41 2012 +0100
@@ -562,16 +562,27 @@ static int cpupool_info(libxl__gc *gc,
 
     xcinfo = xc_cpupool_getinfo(CTX->xch, poolid);
     if (xcinfo == NULL)
+    {
+        LOGE(ERROR, "failed to get info for cpupool%d\n", poolid);
         return ERROR_FAIL;
+    }
 
     if (exact && xcinfo->cpupool_id != poolid)
+    {
+        LOG(ERROR, "got info for cpupool%d, wanted cpupool%d\n",
+            xcinfo->cpupool_id, poolid);
         goto out;
+    }
 
     info->poolid = xcinfo->cpupool_id;
     info->sched = xcinfo->sched_id;
     info->n_dom = xcinfo->n_dom;
-    if (libxl_cpumap_alloc(CTX, &info->cpumap))
+    rc = libxl_cpumap_alloc(CTX, &info->cpumap);
+    if (rc)
+    {
+        LOG(ERROR, "unable to allocate cpumap %d\n", rc);
         goto out;
+    }
     memcpy(info->cpumap.map, xcinfo->cpumap, info->cpumap.size);
 
     rc = 0;
diff -r b63faf77bff3 -r 2829b66dfe19 tools/libxl/libxl_dom.c
--- a/tools/libxl/libxl_dom.c	Fri Jun 22 11:55:29 2012 +0100
+++ b/tools/libxl/libxl_dom.c	Fri Jun 22 11:56:41 2012 +0100
@@ -67,10 +67,15 @@ int libxl__domain_cpupool(libxl__gc *gc,
 
     ret = xc_domain_getinfolist(CTX->xch, domid, 1, &info);
     if (ret != 1)
+    {
+        LOGE(ERROR, "getinfolist failed %d\n", ret);
         return ERROR_FAIL;
+    }
     if (info.domain != domid)
+    {
+        LOGE(ERROR, "got info for dom%d, wanted dom%d\n", info.domain, domid);
         return ERROR_FAIL;
-
+    }
     return info.cpupool;
 }

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 5 of 5] xl: initialise domid to an explicitly invalid value
  2012-06-22 10:56 [PATCH 0 of 5] xl: fix guest reboot failures Ian Campbell
                   ` (3 preceding siblings ...)
  2012-06-22 10:56 ` [PATCH 4 of 5] libxl: log on failure in cpupool_info and libxl__domain_cpupool Ian Campbell
@ 2012-06-22 10:56 ` Ian Campbell
  2012-06-29 14:45 ` [PATCH 0 of 5] xl: fix guest reboot failures Ian Jackson
  5 siblings, 0 replies; 13+ messages in thread
From: Ian Campbell @ 2012-06-22 10:56 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson, Dario Faggioli

# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1340362601 -3600
# Node ID 9190b57657f4b6def855fa22de905f45d22ef7de
# Parent  2829b66dfe19afd4537ea4b2cfcef0cf28b49472
xl: initialise domid to an explicitly invalid value

also ensure it is invalid whenever we destroy the domain.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

diff -r 2829b66dfe19 -r 9190b57657f4 tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c	Fri Jun 22 11:56:41 2012 +0100
+++ b/tools/libxl/xl_cmdimpl.c	Fri Jun 22 11:56:41 2012 +0100
@@ -68,7 +68,8 @@ libxl_ctx *ctx;
 xlchild children[child_max];
 
 /* when we operate on a domain, it is this one: */
-static uint32_t domid;
+#define INVALID_DOMID ~0
+static uint32_t domid = INVALID_DOMID;
 static const char *common_domname;
 static int fd_lock = -1;
 
@@ -1389,6 +1390,7 @@ static int handle_domain_death(uint32_t 
     case LIBXL_ACTION_ON_SHUTDOWN_DESTROY:
         LOG("Domain %d needs to be cleaned up: destroying the domain", domid);
         libxl_domain_destroy(ctx, domid);
+        domid = INVALID_DOMID;
         break;
 
     case LIBXL_ACTION_ON_SHUTDOWN_COREDUMP_DESTROY:
@@ -1451,6 +1453,12 @@ static int preserve_domain(uint32_t domi
     LOG("Preserving domain %d %s with suffix%s", domid, d_config->c_info.name, stime);
     rc = libxl_domain_preserve(ctx, domid, &d_config->c_info, stime, new_uuid);
 
+    /*
+     * Although domid still exists it is no longer the one we are concerned
+     * with.
+     */
+    domid = INVALID_DOMID;
+
     return rc == 0 ? 1 : 0;
 }
 
@@ -1738,7 +1746,7 @@ static int create_domain(struct domain_c
         goto out;
 
 start:
-    domid = -1;
+    assert(domid == INVALID_DOMID);
 
     rc = acquire_lock();
     if (rc < 0)
@@ -1985,8 +1993,10 @@ start:
 
 error_out:
     release_lock();
-    if (libxl_domid_valid_guest(domid))
+    if (libxl_domid_valid_guest(domid)) {
         libxl_domain_destroy(ctx, domid);
+        domid = INVALID_DOMID;
+    }
 
 out:
     if (logfile != 2)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1 of 5] libxl: validate scheduler parameters
  2012-06-22 10:56 ` [PATCH 1 of 5] libxl: validate scheduler parameters Ian Campbell
@ 2012-06-22 11:23   ` Ian Campbell
  2012-06-22 15:58     ` Dario Faggioli
  0 siblings, 1 reply; 13+ messages in thread
From: Ian Campbell @ 2012-06-22 11:23 UTC (permalink / raw)
  To: xen-devel@lists.xen.org; +Cc: Dario Faggioli, Ian Jackson

On Fri, 2012-06-22 at 11:56 +0100, Ian Campbell wrote:
> # HG changeset patch
> # User Ian Campbell <ian.campbell@citrix.com>
> # Date 1340361703 -3600
> # Node ID 998d48ccb8905907cb2f104b475e5ab6ad445348
> # Parent  5fb3c536b5a8810fb8be4149df609d272beff959
> libxl: validate scheduler parameters
> 
> This was previously done by xl itself however the domain was not
> created at that point so there was no domid to check. This happened to
> work on first boot because xl's global domid was initialised to zero
> so we would (incorrectly) validate the new domain to be against
> domain0. On reboot though we would try to use the old domain's id and
> fail.
> 
> sched_params_valid is moved and gains a gc+domid parameters and
> s/ctx/CTX/. The call is placed after
> libxl__domain_build_info_setdefault in the create path, because
> set_defaults doesn't have access to the domid and there are other
> callers which don't even have a domid to give it.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

After consultation with Ian J and Dario I have committed this one in
order to get a test pass ASAP. I'll leave the remainder of this series
to get properly reviewed.

Ian.

> 
> diff -r 5fb3c536b5a8 -r 998d48ccb890 tools/libxl/libxl_create.c
> --- a/tools/libxl/libxl_create.c	Fri Jun 22 10:14:46 2012 +0100
> +++ b/tools/libxl/libxl_create.c	Fri Jun 22 11:41:43 2012 +0100
> @@ -72,6 +72,49 @@ int libxl__domain_create_info_setdefault
>      return 0;
>  }
>  
> +static int sched_params_valid(libxl__gc *gc,
> +                              uint32_t domid, libxl_domain_sched_params *scp)
> +{
> +    int has_weight = scp->weight != LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT;
> +    int has_period = scp->period != LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT;
> +    int has_slice = scp->slice != LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT;
> +    int has_extratime =
> +                scp->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT;
> +    libxl_domain_sched_params sci;
> +
> +    libxl_domain_sched_params_get(CTX, domid, &sci);
> +
> +    /* The sedf scheduler needs some more consistency checking */
> +    if (sci.sched == LIBXL_SCHEDULER_SEDF) {
> +        if (has_weight && (has_period || has_slice))
> +            return 0;
> +        if (has_period != has_slice)
> +            return 0;
> +
> +        /*
> +         * Idea is, if we specify a weight, then both period and
> +         * slice has to be zero. OTOH, if we do not specify a weight,
> +         * that means we want a pure best effort domain or an actual
> +         * real-time one. In the former case, it is period that needs
> +         * to be zero, in the latter, weight should be.
> +         */
> +        if (has_weight) {
> +            scp->slice = 0;
> +            scp->period = 0;
> +        }
> +        else if (!has_period) {
> +            /* We can setup a proper best effort domain (extra time only)
> +             * iff we either already have or are asking for some extra time. */
> +            scp->weight = has_extratime ? scp->extratime : sci.extratime;
> +            scp->period = 0;
> +        }
> +        if (has_period && has_slice)
> +            scp->weight = 0;
> +    }
> +
> +    return 1;
> +}
> +
>  int libxl__domain_build_info_setdefault(libxl__gc *gc,
>                                          libxl_domain_build_info *b_info)
>  {
> @@ -622,6 +665,12 @@ static void initiate_domain_create(libxl
>      ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info);
>      if (ret) goto error_out;
>  
> +    if (!sched_params_valid(gc, domid, &d_config->b_info.sched_params)) {
> +        LOG(ERROR, "Invalid scheduling parameters\n");
> +        ret = ERROR_INVAL;
> +        goto error_out;
> +    }
> +
>      for (i = 0; i < d_config->num_disks; i++) {
>          ret = libxl__device_disk_setdefault(gc, &d_config->disks[i]);
>          if (ret) goto error_out;
> diff -r 5fb3c536b5a8 -r 998d48ccb890 tools/libxl/xl_cmdimpl.c
> --- a/tools/libxl/xl_cmdimpl.c	Fri Jun 22 10:14:46 2012 +0100
> +++ b/tools/libxl/xl_cmdimpl.c	Fri Jun 22 11:41:43 2012 +0100
> @@ -550,48 +550,6 @@ vcpp_out:
>      return rc;
>  }
>  
> -static int sched_params_valid(libxl_domain_sched_params *scp)
> -{
> -    int has_weight = scp->weight != LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT;
> -    int has_period = scp->period != LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT;
> -    int has_slice = scp->slice != LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT;
> -    int has_extratime =
> -                scp->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT;
> -    libxl_domain_sched_params sci;
> -
> -    libxl_domain_sched_params_get(ctx, domid, &sci);
> -
> -    /* The sedf scheduler needs some more consistency checking */
> -    if (sci.sched == LIBXL_SCHEDULER_SEDF) {
> -        if (has_weight && (has_period || has_slice))
> -            return 0;
> -        if (has_period != has_slice)
> -            return 0;
> -
> -        /*
> -         * Idea is, if we specify a weight, then both period and
> -         * slice has to be zero. OTOH, if we do not specify a weight,
> -         * that means we want a pure best effort domain or an actual
> -         * real-time one. In the former case, it is period that needs
> -         * to be zero, in the latter, weight should be.
> -         */
> -        if (has_weight) {
> -            scp->slice = 0;
> -            scp->period = 0;
> -        }
> -        else if (!has_period) {
> -            /* We can setup a proper best effort domain (extra time only)
> -             * iff we either already have or are asking for some extra time. */
> -            scp->weight = has_extratime ? scp->extratime : sci.extratime;
> -            scp->period = 0;
> -        }
> -        if (has_period && has_slice)
> -            scp->weight = 0;
> -    }
> -
> -    return 1;
> -}
> -
>  static void parse_config_data(const char *config_source,
>                                const char *config_data,
>                                int config_len,
> @@ -686,10 +644,6 @@ static void parse_config_data(const char
>          b_info->sched_params.latency = l;
>      if (!xlu_cfg_get_long (config, "extratime", &l, 0))
>          b_info->sched_params.extratime = l;
> -    if (!sched_params_valid(&b_info->sched_params)) {
> -        fprintf(stderr, "Invalid scheduling parameters\n");
> -        exit(1);
> -    }
>  
>      if (!xlu_cfg_get_long (config, "vcpus", &l, 0)) {
>          b_info->max_vcpus = l;
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1 of 5] libxl: validate scheduler parameters
  2012-06-22 11:23   ` Ian Campbell
@ 2012-06-22 15:58     ` Dario Faggioli
  2012-06-22 16:29       ` Dario Faggioli
  0 siblings, 1 reply; 13+ messages in thread
From: Dario Faggioli @ 2012-06-22 15:58 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, xen-devel@lists.xen.org


[-- Attachment #1.1: Type: text/plain, Size: 2970 bytes --]

On Fri, 2012-06-22 at 12:23 +0100, Ian Campbell wrote:
> > sched_params_valid is moved and gains a gc+domid parameters and
> > s/ctx/CTX/. The call is placed after
> > libxl__domain_build_info_setdefault in the create path, because
> > set_defaults doesn't have access to the domid and there are other
> > callers which don't even have a domid to give it.
> > 
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> 
> After consultation with Ian J and Dario I have committed this one in
> order to get a test pass ASAP. I'll leave the remainder of this series
> to get properly reviewed.
> 
This looked good at inspection, and it indeed works for the credit
scheduler. Unfortunately, sedf seems to require the domain's vcpus to be
properly setup (in the hypervisor), which is not true at this point (it
is still too early!).

In fact, what happens is it fails right here:

> > diff -r 5fb3c536b5a8 -r 998d48ccb890 tools/libxl/libxl_create.c
> > --- a/tools/libxl/libxl_create.c	Fri Jun 22 10:14:46 2012 +0100
> > +++ b/tools/libxl/libxl_create.c	Fri Jun 22 11:41:43 2012 +0100
> > @@ -72,6 +72,49 @@ int libxl__domain_create_info_setdefault
> >      return 0;
> >  }
> >  
> > +static int sched_params_valid(libxl__gc *gc,
> > +                              uint32_t domid, libxl_domain_sched_params *scp)
> > +{
> > +    int has_weight = scp->weight != LIBXL_DOMAIN_SCHED_PARAM_WEIGHT_DEFAULT;
> > +    int has_period = scp->period != LIBXL_DOMAIN_SCHED_PARAM_PERIOD_DEFAULT;
> > +    int has_slice = scp->slice != LIBXL_DOMAIN_SCHED_PARAM_SLICE_DEFAULT;
> > +    int has_extratime =
> > +                scp->extratime != LIBXL_DOMAIN_SCHED_PARAM_EXTRATIME_DEFAULT;
> > +    libxl_domain_sched_params sci;
> > +
> > +    libxl_domain_sched_params_get(CTX, domid, &sci);
> > +
With a splat like this:
(XEN) ----[ Xen-4.2-unstable  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82c480120d57>] sedf_adjust+0x8bd/0x9ad
(XEN) RFLAGS: 0000000000010046   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: 0000000000000000   rcx: 0000000000000001
(XEN) rdx: ffff82c4802b7e48   rsi: ffff83023e401000   rdi: ffff83031b3e34e4

Because of this code in xen/common/sched_sedf.c:

        if ( p->vcpu[0] == NULL )
        {
            rc = -EINVAL;
            goto out;
        }

What I expect to happen, then, is the win7 HVM tests to restart being
successful, but the sedf-s ones restart failing. :-(

New patch against this patch coming. This time I tested it on both sedf
and credit and tested rebooting a domain as well... Let's hope this is
going to be the end of this!

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1 of 5] libxl: validate scheduler parameters
  2012-06-22 15:58     ` Dario Faggioli
@ 2012-06-22 16:29       ` Dario Faggioli
  0 siblings, 0 replies; 13+ messages in thread
From: Dario Faggioli @ 2012-06-22 16:29 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, xen-devel@lists.xen.org


[-- Attachment #1.1: Type: text/plain, Size: 707 bytes --]

On Fri, 2012-06-22 at 17:58 +0200, Dario Faggioli wrote:
> New patch against this patch coming. This time I tested it on both sedf
> and credit and tested rebooting a domain as well... Let's hope this is
> going to be the end of this!
> 
Which just for the records is <f5f29d8e2ccf642cad89.1340381818@Solace>,
which I think Ian is checking in right now. Let's wait for some verdict
from OSStest during the weekend. :-)

Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0 of 5] xl: fix guest reboot failures
  2012-06-22 10:56 [PATCH 0 of 5] xl: fix guest reboot failures Ian Campbell
                   ` (4 preceding siblings ...)
  2012-06-22 10:56 ` [PATCH 5 of 5] xl: initialise domid to an explicitly invalid value Ian Campbell
@ 2012-06-29 14:45 ` Ian Jackson
  2012-06-29 14:48   ` Ian Campbell
  5 siblings, 1 reply; 13+ messages in thread
From: Ian Jackson @ 2012-06-29 14:45 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Dario Faggioli, xen-devel

Ian Campbell writes ("[Xen-devel] [PATCH 0 of 5] xl: fix guest reboot failures"):
> The following fixes recent test failures reboot a guest (e.g. flight
> 13302).

Patches 2..5:

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>

1 was applied earlier.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0 of 5] xl: fix guest reboot failures
  2012-06-29 14:45 ` [PATCH 0 of 5] xl: fix guest reboot failures Ian Jackson
@ 2012-06-29 14:48   ` Ian Campbell
  2012-06-29 14:51     ` Ian Jackson
  0 siblings, 1 reply; 13+ messages in thread
From: Ian Campbell @ 2012-06-29 14:48 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Dario Faggioli, xen-devel@lists.xen.org

On Fri, 2012-06-29 at 15:45 +0100, Ian Jackson wrote:
> Ian Campbell writes ("[Xen-devel] [PATCH 0 of 5] xl: fix guest reboot failures"):
> > The following fixes recent test failures reboot a guest (e.g. flight
> > 13302).
> 
> Patches 2..5:

I sent out a V2 of this series with the (trivial) reject fixed. I guess
you fixed that yourself?

> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
> 
> 1 was applied earlier.
> 
> Thanks,
> Ian.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0 of 5] xl: fix guest reboot failures
  2012-06-29 14:48   ` Ian Campbell
@ 2012-06-29 14:51     ` Ian Jackson
  2012-06-29 14:54       ` Ian Campbell
  0 siblings, 1 reply; 13+ messages in thread
From: Ian Jackson @ 2012-06-29 14:51 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Dario Faggioli, xen-devel@lists.xen.org

Ian Campbell writes ("Re: [Xen-devel] [PATCH 0 of 5] xl: fix guest reboot failures"):
> On Fri, 2012-06-29 at 15:45 +0100, Ian Jackson wrote:
> > Ian Campbell writes ("[Xen-devel] [PATCH 0 of 5] xl: fix guest reboot failures"):
> > > The following fixes recent test failures reboot a guest (e.g. flight
> > > 13302).
> > 
> > Patches 2..5:
> 
> I sent out a V2 of this series with the (trivial) reject fixed. I guess
> you fixed that yourself?

I just replied to the wrong 0 of 5.  There wasn't any reject and I
applied v2 of the series.

Sorry for the confusion.

Ian.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0 of 5] xl: fix guest reboot failures
  2012-06-29 14:51     ` Ian Jackson
@ 2012-06-29 14:54       ` Ian Campbell
  0 siblings, 0 replies; 13+ messages in thread
From: Ian Campbell @ 2012-06-29 14:54 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Dario Faggioli, xen-devel@lists.xen.org

On Fri, 2012-06-29 at 15:51 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [PATCH 0 of 5] xl: fix guest reboot failures"):
> > On Fri, 2012-06-29 at 15:45 +0100, Ian Jackson wrote:
> > > Ian Campbell writes ("[Xen-devel] [PATCH 0 of 5] xl: fix guest reboot failures"):
> > > > The following fixes recent test failures reboot a guest (e.g. flight
> > > > 13302).
> > > 
> > > Patches 2..5:
> > 
> > I sent out a V2 of this series with the (trivial) reject fixed. I guess
> > you fixed that yourself?
> 
> I just replied to the wrong 0 of 5.  There wasn't any reject and I
> applied v2 of the series.

Grand!

> Sorry for the confusion.

No worries.

Ian.

> 
> Ian.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2012-06-29 14:54 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-22 10:56 [PATCH 0 of 5] xl: fix guest reboot failures Ian Campbell
2012-06-22 10:56 ` [PATCH 1 of 5] libxl: validate scheduler parameters Ian Campbell
2012-06-22 11:23   ` Ian Campbell
2012-06-22 15:58     ` Dario Faggioli
2012-06-22 16:29       ` Dario Faggioli
2012-06-22 10:56 ` [PATCH 2 of 5] libxl: initialise cpupoolinfo in libxl__domain_scheduler Ian Campbell
2012-06-22 10:56 ` [PATCH 3 of 5] libxl: correct type of cpupool variable Ian Campbell
2012-06-22 10:56 ` [PATCH 4 of 5] libxl: log on failure in cpupool_info and libxl__domain_cpupool Ian Campbell
2012-06-22 10:56 ` [PATCH 5 of 5] xl: initialise domid to an explicitly invalid value Ian Campbell
2012-06-29 14:45 ` [PATCH 0 of 5] xl: fix guest reboot failures Ian Jackson
2012-06-29 14:48   ` Ian Campbell
2012-06-29 14:51     ` Ian Jackson
2012-06-29 14:54       ` Ian Campbell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).