xen-devel.lists.xenproject.org archive mirror
From: Dario Faggioli <dfaggioli@suse.com>
To: Wei Liu <wei.liu2@citrix.com>,
	Xen-devel <xen-devel@lists.xenproject.org>
Cc: George Dunlap <dunlapg@gmail.com>,
	Jan Beulich <JBeulich@suse.com>,
	Roger Pau Monne <roger.paumonne@citrix.com>
Subject: Re: [PATCH RFC v1 42/74] sched/null: skip vCPUs on the waitqueue that are blocked
Date: Fri, 12 Jan 2018 11:41:56 +0100
Message-ID: <1515753716.30117.123.camel@suse.com>
In-Reply-To: <20180104130625.28605-43-wei.liu2@citrix.com>


On Thu, 2018-01-04 at 13:05 +0000, Wei Liu wrote:
> From: Roger Pau Monne <roger.pau@citrix.com>
> 
> Avoid scheduling vCPUs that are blocked, there's no point in
> assigning
> them to a pCPU because they are not going to run anyway.
> 
> Since blocked vCPUs are not assigned to pCPUs after this change,
> force
> a rescheduling when a vCPU is brought up if it's on the waitqueue.
> Also when scheduling try to pick a vCPU from the runqueue if the pCPU
> is running idle.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> Cc: Dario Faggioli <raistlin@linux.it>
> ---
> Changes since v1:
>  - Force a rescheduling when a vCPU is brought up.
>  - Try to pick a vCPU from the runqueue if running the idle vCPU.
>
As noted by Jan already, there's a mixing of "blocked" and "down" (or
offline).

In the null scheduler, a vCPU that is assigned to a pCPU is free to
block and wake up as many times as it wants (quite obviously). And when
it blocks, the pCPU will just stay idle.

There's no such thing as pulling another vCPU onto the pCPU, either
from the waitqueue or from anywhere else. That's the whole point of the
scheduler, actually.
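
To illustrate the behaviour described above, here is a toy standalone
model (not code from the tree; struct nvcpu, struct npcpu and
null_pick() are names made up for this sketch). The assignment is
static, so a blocked vCPU simply leaves its pCPU idle:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the scheduler's data. */
struct nvcpu {
    int  id;
    bool runnable;          /* false while the vCPU is blocked */
};

struct npcpu {
    struct nvcpu *assigned; /* static vCPU<->pCPU assignment, may be NULL */
};

/*
 * Return the vCPU this pCPU should run, or NULL meaning "run idle".
 * Note that the waitqueue is never consulted here: if the assigned
 * vCPU is blocked, the pCPU just idles until that same vCPU wakes up.
 */
struct nvcpu *null_pick(const struct npcpu *pc)
{
    if (pc->assigned != NULL && pc->assigned->runnable)
        return pc->assigned;
    return NULL;
}
```

In this model, null_pick() returns NULL for a pCPU whose vCPU is
blocked, and returns the very same vCPU again as soon as it is runnable.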

Now, I'm not quite sure whether or not this can be a problem in the
"shim scenario". If it is, we have to think of a solution that does not
totally defeat the purpose of the scheduler when used on bare metal.

Or use another scheduler, perhaps configuring static 1:1 pinning. Null
seems a great fit for this use case to me, so, I'd say, let's try to
find a nice and cool way to use it. :-)

> ---
>  xen/common/sched_null.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
> index b4a24baf8e..bacfb31cb3 100644
> --- a/xen/common/sched_null.c
> +++ b/xen/common/sched_null.c
> @@ -574,6 +574,8 @@ static void null_vcpu_wake(const struct scheduler
> *ops, struct vcpu *v)
>      {
>          /* Not exactly "on runq", but close enough for reusing the
> counter */
>          SCHED_STAT_CRANK(vcpu_wake_onrunq);
> +        /* Force a rescheduling in case some CPU is idle can pick
> this vCPU */
> +        cpumask_raise_softirq(&cpu_online_map, SCHEDULE_SOFTIRQ);
>          return;
>
This needs to become 'the CPUs of vcpu->domain's cpupool'.

I appreciate that this is fine when running as shim, where you
certainly don't use cpupools. But when this runs on bare metal, if we
use cpu_online_map, basically _all_ the online CPUs --even the ones that
are in another pool, under a different scheduler-- will be forced to
reschedule. And we don't want that.
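
To sketch the difference, here is a toy standalone model (not Xen code:
toy_cpumask_t, struct toy_cpupool, struct toy_domain and the two helpers
are made up for illustration). In the tree I would expect the mask to
come from something like cpupool_domain_cpumask(v->domain), but that is
an assumption to be checked against the actual code:

```c
#include <stdint.h>

/* Toy model: one bit per pCPU. This is NOT Xen's cpumask_t. */
typedef uint32_t toy_cpumask_t;

struct toy_cpupool {
    toy_cpumask_t cpu_valid;            /* pCPUs belonging to this pool */
};

struct toy_domain {
    const struct toy_cpupool *cpupool;  /* pool the domain lives in */
};

/* What the patch as posted does: kick every online pCPU. */
toy_cpumask_t kick_all_online(toy_cpumask_t online)
{
    return online;
}

/* What is being asked for: kick only the online pCPUs of the pool. */
toy_cpumask_t kick_pool_only(toy_cpumask_t online,
                             const struct toy_domain *d)
{
    return online & d->cpupool->cpu_valid;
}
```

With four online pCPUs (mask 0x0F) and a null-scheduler pool made of
pCPUs 2 and 3 (mask 0x0C), the first helper would poke all four, while
the second only pokes the two that can actually pick up the vCPU.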

I'm also not 100% convinced that this must/can live here. Basically,
you're saying that if vcpu_wake() is called on a vCPU that happens to
be in the waitqueue, we should reschedule. And, AFAIUI, this is to cover
the case of a vCPU of the L2 guest coming online.

Well, it may even be technically fine. Still, if what we want to deal
with is vCPU onlining, I would prefer to at least try to find a place
which is more related to the onlining path than to the wakeup path.

If you confirm your intent, I can have a look at the code and try to
identify such better place...

> @@ -761,9 +763,10 @@ static struct task_slice null_schedule(const
> struct scheduler *ops,
>      /*
>       * We may be new in the cpupool, or just coming back online. In
> which
>       * case, there may be vCPUs in the waitqueue that we can assign
> to us
> -     * and run.
> +     * and run. Also check whether this CPU is running idle, in
> which case try
> +     * to pick a vCPU from the waitqueue.
>       */
> -    if ( unlikely(ret.task == NULL) )
> +    if ( unlikely(ret.task == NULL || ret.task == idle_vcpu[cpu]) )
>
I don't think I understand this. I may be a bit rusty, but are you sure
that, on an idle pCPU, ret.task is idle_vcpu at this point in this
function? I don't think it is.

Also, I'm quite sure this may mess things up for tasklets. In fact, one
case when ret.task is idle_vcpu here is when I have just forced it to be
so, in order to run a tasklet. But with this, we scan the waitqueue
instead, and may end up running something else.
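
A toy standalone model of the selection logic may help show the
concern. This is a simplification of null_schedule() as I remember it,
not the actual function: struct t_vcpu, toy_null_schedule() and the
'patched' switch are all made up for this sketch, and the waitqueue is
reduced to its first element.

```c
#include <stdbool.h>
#include <stddef.h>

struct t_vcpu {
    int id;
};

/*
 * assigned:     vCPU statically assigned to this pCPU (NULL if none)
 * idle:         this pCPU's idle vCPU
 * waitq_head:   first vCPU waiting for a pCPU (NULL if waitqueue empty)
 * tasklet_work: a tasklet is pending, so idle must run
 * patched:      use the condition proposed in the hunk above
 */
const struct t_vcpu *toy_null_schedule(const struct t_vcpu *assigned,
                                       const struct t_vcpu *idle,
                                       const struct t_vcpu *waitq_head,
                                       bool tasklet_work, bool patched)
{
    /* Idle is forced only for tasklets; otherwise take the assignment. */
    const struct t_vcpu *task = tasklet_work ? idle : assigned;

    /* Original check is "task == NULL"; the patch adds "task == idle". */
    if (task == NULL || (patched && task == idle)) {
        if (waitq_head != NULL)
            task = waitq_head;
    }

    /* Nothing to run at all: fall back to idle. */
    if (task == NULL)
        task = idle;

    return task;
}
```

In this model, with a tasklet pending and a non-empty waitqueue, the
unpatched condition returns idle (so the tasklet runs), while the
patched one returns the waiting vCPU. It also shows that, without a
tasklet, an unassigned pCPU reaches the check with NULL rather than
with idle.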

> @@ -781,6 +784,10 @@ static struct task_slice null_schedule(const
> struct scheduler *ops,
>          {
>              list_for_each_entry( wvc, &prv->waitq, waitq_elem )
>              {
> +                if ( test_bit(_VPF_down, &wvc->vcpu->pause_flags) )
> +                    /* Skip vCPUs that are down. */
> +                    continue;
> +
So, yes, I think things like this are what we want. As said above for
the wakeup case, though, I'd prefer to find a way to avoid offline
vCPUs ending up in the waitqueue, rather than having to skip them.

Side note, is_vcpu_online() can be used for the test.
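
As a rough standalone illustration of the skip with such a helper (toy
code, not from the tree: TOY_VPF_DOWN, struct wq_vcpu,
toy_is_vcpu_online() and first_online_waiter() are stand-ins invented
here, and the waitqueue is modelled as a plain array):

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the "vCPU is down" pause flag. */
#define TOY_VPF_DOWN (1UL << 0)

struct wq_vcpu {
    int           id;
    unsigned long pause_flags;
};

/* Toy equivalent of the helper: online means the down flag is clear. */
bool toy_is_vcpu_online(const struct wq_vcpu *v)
{
    return !(v->pause_flags & TOY_VPF_DOWN);
}

/* Scan the waitqueue in order and return the first online vCPU. */
const struct wq_vcpu *first_online_waiter(const struct wq_vcpu *waitq,
                                          size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!toy_is_vcpu_online(&waitq[i]))
            continue;           /* skip vCPUs that are down */
        return &waitq[i];
    }
    return NULL;                /* nobody eligible is waiting */
}
```

If offline vCPUs were kept out of the waitqueue in the first place, the
check inside the loop would not be needed at all.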

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/
