* [Qemu-devel] [PATCH 0/2] qmp: add query-cpus-fast @ 2018-02-07 17:50 Luiz Capitulino 2018-02-07 17:50 ` [Qemu-devel] [PATCH 1/2] " Luiz Capitulino 2018-02-07 17:50 ` [Qemu-devel] [PATCH 2/2] qmp: document query-cpus performance issue Luiz Capitulino 0 siblings, 2 replies; 15+ messages in thread From: Luiz Capitulino @ 2018-02-07 17:50 UTC (permalink / raw) To: qemu-devel; +Cc: armbru, eblake, ehabkost, berrange, mihajlov We've recently debugged a huge performance degradation we were getting on a latency sensitive workload down to the fact that libvirt is issuing query-cpus. As it turns out, query-cpus always interrupts all vCPU threads so that they can run ioctl to collect a number of register information, most of which are not even used by query-cpus at all. This series adds a new command called query-cpus-fast, which returns the most relevant information returned by query-cpus without having to interrupt vCPU threads. This series also updates query-cpus documentation to advise against its use in production. More details in individual patches. Luiz Capitulino (2): qmp: add query-cpus-fast qmp: document query-cpus performance issue cpus.c | 44 ++++++++++++++++++++++++++++++ hmp-commands-info.hx | 14 ++++++++++ hmp.c | 24 +++++++++++++++++ hmp.h | 1 + qapi-schema.json | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 158 insertions(+) -- 2.14.3 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH 1/2] qmp: add query-cpus-fast 2018-02-07 17:50 [Qemu-devel] [PATCH 0/2] qmp: add query-cpus-fast Luiz Capitulino @ 2018-02-07 17:50 ` Luiz Capitulino 2018-02-07 18:49 ` Eric Blake ` (2 more replies) 2018-02-07 17:50 ` [Qemu-devel] [PATCH 2/2] qmp: document query-cpus performance issue Luiz Capitulino 1 sibling, 3 replies; 15+ messages in thread From: Luiz Capitulino @ 2018-02-07 17:50 UTC (permalink / raw) To: qemu-devel; +Cc: armbru, eblake, ehabkost, berrange, mihajlov The query-cpus command has an extremely serious side effect: it always interrupt all running vCPUs so that they can run ioctl calls. This can cause a huge performance degradation for some workloads. And most of the information retrieved by the ioctl calls are not even used by query-cpus. This commit introduces a replacement for query-cpus called query-cpus-fast, which has the following features: o Never interrupt vCPUs threads. query-cpus-fast only returns vCPU information maintained by QEMU itself, which should be sufficient for most management software needs o Make "halted" field optional: we only return it if the halted state is maintained by QEMU. But this also gives the option of dropping the field in the future (see below) o Drop irrelevant fields such as "current", "pc" and "arch" o Rename some fields for better clarification & proper naming standard The "halted" field is somewhat controversial. On the one hand, it offers a convenient way to know if a guest CPU is idle or running. On the other hand, it's a field that can change many times a second. In fact, the halted state can change even before query-cpus-fast has returned. This makes one wonder if this field should be dropped all together. Having the "halted" field as optional gives a better option for dropping it in the future, since we can just stop returning it. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> --- cpus.c | 44 ++++++++++++++++++++++++++++++++ hmp-commands-info.hx | 14 +++++++++++ hmp.c | 24 ++++++++++++++++++ hmp.h | 1 + qapi-schema.json | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 154 insertions(+) diff --git a/cpus.c b/cpus.c index 2cb0af9b22..3b68a8146c 100644 --- a/cpus.c +++ b/cpus.c @@ -2083,6 +2083,50 @@ CpuInfoList *qmp_query_cpus(Error **errp) return head; } +/* + * fast means: we NEVER interrupt vCPU threads to retrieve + * information from KVM. + */ +CpuInfo2List *qmp_query_cpus_fast(Error **errp) +{ + MachineState *ms = MACHINE(qdev_get_machine()); + MachineClass *mc = MACHINE_GET_CLASS(ms); + CpuInfo2List *head = NULL, *cur_item = NULL; + CPUState *cpu; + + CPU_FOREACH(cpu) { + CpuInfo2List *info = g_malloc0(sizeof(*info)); + info->value = g_malloc0(sizeof(*info->value)); + + info->value->cpu_index = cpu->cpu_index; + info->value->qom_path = object_get_canonical_path(OBJECT(cpu)); + info->value->thread_id = cpu->thread_id; + + info->value->has_props = !!mc->cpu_index_to_instance_props; + if (info->value->has_props) { + CpuInstanceProperties *props; + props = g_malloc0(sizeof(*props)); + *props = mc->cpu_index_to_instance_props(ms, cpu->cpu_index); + info->value->props = props; + } + + /* if in kernel irqchip is used, we don't have 'halted' */ + info->value->has_halted = !kvm_irqchip_in_kernel(); + if (info->value->has_halted) { + info->value->halted = cpu->halted; + } + + if (!cur_item) { + head = cur_item = info; + } else { + cur_item->next = info; + cur_item = info; + } + } + + return head; +} + void qmp_memsave(int64_t addr, int64_t size, const char *filename, bool has_cpu, int64_t cpu_index, Error **errp) { diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx index ad590a4ffb..8657ceceb0 100644 --- a/hmp-commands-info.hx +++ b/hmp-commands-info.hx @@ -157,6 +157,20 @@ STEXI @item info cpus @findex info cpus Show infos for each CPU. +ETEXI + + { + .name = "cpus_fast", + .args_type = "", + .params = "", + .help = "show infos for each CPU without performance penalty", + .cmd = hmp_info_cpus_fast, + }, + +STEXI +@item info cpus_fast +@findex info cpus_fast +Show infos for each CPU without performance penalty. ETEXI { diff --git a/hmp.c b/hmp.c index b3de32d219..3d32333fd2 100644 --- a/hmp.c +++ b/hmp.c @@ -404,6 +404,30 @@ void hmp_info_cpus(Monitor *mon, const QDict *qdict) qapi_free_CpuInfoList(cpu_list); } +void hmp_info_cpus_fast(Monitor *mon, const QDict *qdict) +{ + CpuInfo2List *head, *cpu; + TargetInfo *target; + + target = qmp_query_target(NULL); + monitor_printf(mon, "CPU architecture is '%s'\n\n", target->arch); + qapi_free_TargetInfo(target); + + head = qmp_query_cpus_fast(NULL); + + for (cpu = head; cpu; cpu = cpu->next) { + monitor_printf(mon, "CPU%" PRId64 "\n", cpu->value->cpu_index); + monitor_printf(mon, " thread-id=%" PRId64 "\n", cpu->value->thread_id); + if (cpu->value->has_halted) { + monitor_printf(mon, " halted=%d\n", cpu->value->halted); + } + monitor_printf(mon, " qom-path=%s\n", cpu->value->qom_path); + monitor_printf(mon, "\n"); + } + + qapi_free_CpuInfo2List(head); +} + static void print_block_info(Monitor *mon, BlockInfo *info, BlockDeviceInfo *inserted, bool verbose) { diff --git a/hmp.h b/hmp.h index 536cb91caa..f4ceba5cc8 100644 --- a/hmp.h +++ b/hmp.h @@ -31,6 +31,7 @@ void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict); void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict); void hmp_info_migrate_cache_size(Monitor *mon, const QDict *qdict); void hmp_info_cpus(Monitor *mon, const QDict *qdict); +void hmp_info_cpus_fast(Monitor *mon, const QDict *qdict); void hmp_info_block(Monitor *mon, const QDict *qdict); void hmp_info_blockstats(Monitor *mon, const QDict *qdict); void hmp_info_vnc(Monitor *mon, const QDict *qdict); diff --git a/qapi-schema.json b/qapi-schema.json index 5c06745c79..82d6f12b53 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -558,6 +558,77 @@ ## { 'command': 'query-cpus', 'returns': ['CpuInfo'] } +## +# @CpuInfo2: +# +# Information about a virtual CPU +# +# @cpu-index: index of the virtual CPU +# +# @halted: true if the virtual CPU is in the halt state. Halt usually refers +# to a processor specific low power mode. This field is optional, +# it is only present if the halted state can be retrieved without +# a performance penalty +# +# @qom-path: path to the CPU object in the QOM tree +# +# @thread-id: ID of the underlying host thread +# +# @props: properties describing to which node/socket/core/thread +# virtual CPU belongs to, provided if supported by board +# +# Since: 2.12 +# +# Notes: @halted is a transient state that changes frequently. By the time the +# data is sent to the client, the guest may no longer be halted. +## +{ 'struct': 'CpuInfo2', + 'data': {'cpu-index': 'int', '*halted': 'bool', 'qom-path': 'str', + 'thread-id': 'int', '*props': 'CpuInstanceProperties' } } + +## +# @query-cpus-fast: +# +# Returns information about all virtual CPUs. This command does not +# incur a performance penalty and should be used in production +# instead of query-cpus. +# +# Returns: list of @CpuInfo2 +# +# Notes: The CPU architecture name is not returned by query-cpus-fast. +# Use query-target to retreive that information. +# +# Since: 2.12 +# +# Example: +# +# -> { "execute": "query-cpus-fast" } +# <- { "return": [ +# { +# "thread-id": 25627, +# "props": { +# "core-id": 0, +# "thread-id": 0, +# "socket-id": 0 +# }, +# "qom-path": "/machine/unattached/device[0]", +# "cpu-index": 0 +# }, +# { +# "thread-id": 25628, +# "props": { +# "core-id": 0, +# "thread-id": 0, +# "socket-id": 1 +# }, +# "qom-path": "/machine/unattached/device[2]", +# "cpu-index": 1 +# } +# ] +# } +## +{ 'command': 'query-cpus-fast', 'returns': [ 'CpuInfo2' ] } + ## # @IOThreadInfo: # -- 2.14.3 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] qmp: add query-cpus-fast 2018-02-07 17:50 ` [Qemu-devel] [PATCH 1/2] " Luiz Capitulino @ 2018-02-07 18:49 ` Eric Blake 2018-02-08 7:41 ` Viktor Mihajlovski 2018-02-08 19:59 ` Eduardo Habkost 2 siblings, 0 replies; 15+ messages in thread From: Eric Blake @ 2018-02-07 18:49 UTC (permalink / raw) To: Luiz Capitulino, qemu-devel; +Cc: armbru, ehabkost, berrange, mihajlov On 02/07/2018 11:50 AM, Luiz Capitulino wrote: > The query-cpus command has an extremely serious side effect: > it always interrupt all running vCPUs so that they can run s/interrupt/interrupts/ > ioctl calls. This can cause a huge performance degradation for > some workloads. And most of the information retrieved by the > ioctl calls are not even used by query-cpus. > > This commit introduces a replacement for query-cpus called > query-cpus-fast, which has the following features: > > o Never interrupt vCPUs threads. query-cpus-fast only returns > vCPU information maintained by QEMU itself, which should be > sufficient for most management software needs > > o Make "halted" field optional: we only return it if the > halted state is maintained by QEMU. But this also gives > the option of dropping the field in the future (see below) > > o Drop irrelevant fields such as "current", "pc" and "arch" > > o Rename some fields for better clarification & proper naming > standard > > The "halted" field is somewhat controversial. On the one hand, > it offers a convenient way to know if a guest CPU is idle or > running. On the other hand, it's a field that can change many > times a second. In fact, the halted state can change even > before query-cpus-fast has returned. This makes one wonder if > this field should be dropped all together. Having the "halted" > field as optional gives a better option for dropping it in > the future, since we can just stop returning it. Makes sense to me. > > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> > --- > cpus.c | 44 ++++++++++++++++++++++++++++++++ > hmp-commands-info.hx | 14 +++++++++++ > hmp.c | 24 ++++++++++++++++++ > hmp.h | 1 + > qapi-schema.json | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++ > 5 files changed, 154 insertions(+) > > +++ b/hmp-commands-info.hx > @@ -157,6 +157,20 @@ STEXI > @item info cpus > @findex info cpus > Show infos for each CPU. Pre-existing, but... > +ETEXI > + > + { > + .name = "cpus_fast", > + .args_type = "", > + .params = "", > + .help = "show infos for each CPU without performance penalty", ...copied here. s/infos/information/ in public-facing documentation, while we're touching it. > +++ b/qapi-schema.json > @@ -558,6 +558,77 @@ > ## > { 'command': 'query-cpus', 'returns': ['CpuInfo'] } > > +## > +# @CpuInfo2: Good thing that type names aren't part of our introspection ABI. Would CpuInfoLite or CpuInfoFast make the code any more legible? > +# > +# Information about a virtual CPU > +# > +# @cpu-index: index of the virtual CPU > +# > +# @halted: true if the virtual CPU is in the halt state. Halt usually refers > +# to a processor specific low power mode. This field is optional, > +# it is only present if the halted state can be retrieved without > +# a performance penalty Disclaimer is good. > +# > +# @qom-path: path to the CPU object in the QOM tree > +# > +# @thread-id: ID of the underlying host thread > +# > +# @props: properties describing to which node/socket/core/thread > +# virtual CPU belongs to, provided if supported by board > +# > +# Since: 2.12 > +# > +# Notes: @halted is a transient state that changes frequently. By the time the > +# data is sent to the client, the guest may no longer be halted. > +## > +{ 'struct': 'CpuInfo2', > + 'data': {'cpu-index': 'int', '*halted': 'bool', 'qom-path': 'str', > + 'thread-id': 'int', '*props': 'CpuInstanceProperties' } } Looks reasonable. > + > +## > +# @query-cpus-fast: > +# > +# Returns information about all virtual CPUs. This command does not > +# incur a performance penalty and should be used in production > +# instead of query-cpus. > +# > +# Returns: list of @CpuInfo2 > +# > +# Notes: The CPU architecture name is not returned by query-cpus-fast. > +# Use query-target to retreive that information. s/retreive/retrieve/ > +## > +{ 'command': 'query-cpus-fast', 'returns': [ 'CpuInfo2' ] } > + Also, please add some docs to 'query-cpus' pointing to 'query-cpus-fast'. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] qmp: add query-cpus-fast 2018-02-07 17:50 ` [Qemu-devel] [PATCH 1/2] " Luiz Capitulino 2018-02-07 18:49 ` Eric Blake @ 2018-02-08 7:41 ` Viktor Mihajlovski 2018-02-08 10:13 ` Viktor Mihajlovski 2018-02-08 13:59 ` Luiz Capitulino 2018-02-08 19:59 ` Eduardo Habkost 2 siblings, 2 replies; 15+ messages in thread From: Viktor Mihajlovski @ 2018-02-08 7:41 UTC (permalink / raw) To: Luiz Capitulino, qemu-devel Cc: armbru, eblake, ehabkost, berrange, Christian Borntraeger On 07.02.2018 18:50, Luiz Capitulino wrote: > The query-cpus command has an extremely serious side effect: > it always interrupt all running vCPUs so that they can run > ioctl calls. This can cause a huge performance degradation for > some workloads. And most of the information retrieved by the > ioctl calls are not even used by query-cpus. > > This commit introduces a replacement for query-cpus called > query-cpus-fast, which has the following features: > > o Never interrupt vCPUs threads. query-cpus-fast only returns > vCPU information maintained by QEMU itself, which should be > sufficient for most management software needs > > o Make "halted" field optional: we only return it if the > halted state is maintained by QEMU. But this also gives > the option of dropping the field in the future (see below) > > o Drop irrelevant fields such as "current", "pc" and "arch" I disagree that arch is irrelevant and would strongly suggest to keep arch and arch-specific fields. At least in the case of s390 there's a cpu_state field that can be obtained cheaply. [...] > diff --git a/cpus.c b/cpus.c > index 2cb0af9b22..3b68a8146c 100644 > --- a/cpus.c > +++ b/cpus.c > @@ -2083,6 +2083,50 @@ CpuInfoList *qmp_query_cpus(Error **errp) > return head; > } > > +/* > + * fast means: we NEVER interrupt vCPU threads to retrieve > + * information from KVM. > + */ > +CpuInfo2List *qmp_query_cpus_fast(Error **errp) > +{ > + MachineState *ms = MACHINE(qdev_get_machine()); > + MachineClass *mc = MACHINE_GET_CLASS(ms); > + CpuInfo2List *head = NULL, *cur_item = NULL; > + CPUState *cpu; > + > + CPU_FOREACH(cpu) { > + CpuInfo2List *info = g_malloc0(sizeof(*info)); > + info->value = g_malloc0(sizeof(*info->value)); > + > + info->value->cpu_index = cpu->cpu_index; > + info->value->qom_path = object_get_canonical_path(OBJECT(cpu)); > + info->value->thread_id = cpu->thread_id; > + > + info->value->has_props = !!mc->cpu_index_to_instance_props; > + if (info->value->has_props) { > + CpuInstanceProperties *props; > + props = g_malloc0(sizeof(*props)); > + *props = mc->cpu_index_to_instance_props(ms, cpu->cpu_index); > + info->value->props = props; > + } > + > + /* if in kernel irqchip is used, we don't have 'halted' */ > + info->value->has_halted = !kvm_irqchip_in_kernel(); This is definitely not true for s390. Externally observable CPU state changes are handled by QEMU there. We may still drop halted if we add a more appropriate arch-specific field. > + if (info->value->has_halted) { > + info->value->halted = cpu->halted; > + } [...] -- Regards, Viktor Mihajlovski ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] qmp: add query-cpus-fast 2018-02-08 7:41 ` Viktor Mihajlovski @ 2018-02-08 10:13 ` Viktor Mihajlovski 2018-02-08 13:59 ` Luiz Capitulino 1 sibling, 0 replies; 15+ messages in thread From: Viktor Mihajlovski @ 2018-02-08 10:13 UTC (permalink / raw) To: Luiz Capitulino, qemu-devel; +Cc: armbru, Christian Borntraeger, ehabkost On 08.02.2018 08:41, Viktor Mihajlovski wrote: > On 07.02.2018 18:50, Luiz Capitulino wrote: >> The query-cpus command has an extremely serious side effect: >> it always interrupt all running vCPUs so that they can run >> ioctl calls. This can cause a huge performance degradation for >> some workloads. And most of the information retrieved by the >> ioctl calls are not even used by query-cpus. >> >> This commit introduces a replacement for query-cpus called >> query-cpus-fast, which has the following features: >> >> o Never interrupt vCPUs threads. query-cpus-fast only returns >> vCPU information maintained by QEMU itself, which should be >> sufficient for most management software needs >> >> o Make "halted" field optional: we only return it if the >> halted state is maintained by QEMU. But this also gives >> the option of dropping the field in the future (see below) >> >> o Drop irrelevant fields such as "current", "pc" and "arch" > I disagree that arch is irrelevant and would strongly suggest to keep > arch and arch-specific fields. At least in the case of s390 there's a > cpu_state field that can be obtained cheaply. I've posted a patch [1] to add s390-specific state info to the query-cpus output. This state *can* be obtained without kicking the CPU out of VM execution. With this info in the query-cpus-fast return data we can eventually get rid of halted and its ramifications. [...] [1] https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg02032.html -- Regards, Viktor Mihajlovski ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] qmp: add query-cpus-fast 2018-02-08 7:41 ` Viktor Mihajlovski 2018-02-08 10:13 ` Viktor Mihajlovski @ 2018-02-08 13:59 ` Luiz Capitulino 1 sibling, 0 replies; 15+ messages in thread From: Luiz Capitulino @ 2018-02-08 13:59 UTC (permalink / raw) To: Viktor Mihajlovski Cc: qemu-devel, armbru, eblake, ehabkost, berrange, Christian Borntraeger On Thu, 8 Feb 2018 08:41:31 +0100 Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> wrote: > On 07.02.2018 18:50, Luiz Capitulino wrote: > > The query-cpus command has an extremely serious side effect: > > it always interrupt all running vCPUs so that they can run > > ioctl calls. This can cause a huge performance degradation for > > some workloads. And most of the information retrieved by the > > ioctl calls are not even used by query-cpus. > > > > This commit introduces a replacement for query-cpus called > > query-cpus-fast, which has the following features: > > > > o Never interrupt vCPUs threads. query-cpus-fast only returns > > vCPU information maintained by QEMU itself, which should be > > sufficient for most management software needs > > > > o Make "halted" field optional: we only return it if the > > halted state is maintained by QEMU. But this also gives > > the option of dropping the field in the future (see below) > > > > o Drop irrelevant fields such as "current", "pc" and "arch" > I disagree that arch is irrelevant and would strongly suggest to keep > arch and arch-specific fields. At least in the case of s390 there's a > cpu_state field that can be obtained cheaply. The arch name can be queried with query-target. The only other arch field I'm dropping is pc, which should be considered debug only if anything. Also, if this need to query CPU registers increase, then we probably should port 'info registers' to QMP. Otherwise, we'll eventually run into the performance problem once again. > [...] > > diff --git a/cpus.c b/cpus.c > > index 2cb0af9b22..3b68a8146c 100644 > > --- a/cpus.c > > +++ b/cpus.c > > @@ -2083,6 +2083,50 @@ CpuInfoList *qmp_query_cpus(Error **errp) > > return head; > > } > > > > +/* > > + * fast means: we NEVER interrupt vCPU threads to retrieve > > + * information from KVM. > > + */ > > +CpuInfo2List *qmp_query_cpus_fast(Error **errp) > > +{ > > + MachineState *ms = MACHINE(qdev_get_machine()); > > + MachineClass *mc = MACHINE_GET_CLASS(ms); > > + CpuInfo2List *head = NULL, *cur_item = NULL; > > + CPUState *cpu; > > + > > + CPU_FOREACH(cpu) { > > + CpuInfo2List *info = g_malloc0(sizeof(*info)); > > + info->value = g_malloc0(sizeof(*info->value)); > > + > > + info->value->cpu_index = cpu->cpu_index; > > + info->value->qom_path = object_get_canonical_path(OBJECT(cpu)); > > + info->value->thread_id = cpu->thread_id; > > + > > + info->value->has_props = !!mc->cpu_index_to_instance_props; > > + if (info->value->has_props) { > > + CpuInstanceProperties *props; > > + props = g_malloc0(sizeof(*props)); > > + *props = mc->cpu_index_to_instance_props(ms, cpu->cpu_index); > > + info->value->props = props; > > + } > > + > > + /* if in kernel irqchip is used, we don't have 'halted' */ > > + info->value->has_halted = !kvm_irqchip_in_kernel(); > This is definitely not true for s390. Externally observable CPU state > changes are handled by QEMU there. We may still drop halted if we add a > more appropriate arch-specific field. > > + if (info->value->has_halted) { > > + info->value->halted = cpu->halted; > > + } > [...] > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] qmp: add query-cpus-fast 2018-02-07 17:50 ` [Qemu-devel] [PATCH 1/2] " Luiz Capitulino 2018-02-07 18:49 ` Eric Blake 2018-02-08 7:41 ` Viktor Mihajlovski @ 2018-02-08 19:59 ` Eduardo Habkost 2018-02-08 20:59 ` Eric Blake 2 siblings, 1 reply; 15+ messages in thread From: Eduardo Habkost @ 2018-02-08 19:59 UTC (permalink / raw) To: Luiz Capitulino; +Cc: qemu-devel, armbru, eblake, berrange, mihajlov On Wed, Feb 07, 2018 at 12:50:13PM -0500, Luiz Capitulino wrote: > The query-cpus command has an extremely serious side effect: > it always interrupt all running vCPUs so that they can run > ioctl calls. This can cause a huge performance degradation for > some workloads. And most of the information retrieved by the > ioctl calls are not even used by query-cpus. > > This commit introduces a replacement for query-cpus called > query-cpus-fast, which has the following features: > > o Never interrupt vCPUs threads. query-cpus-fast only returns > vCPU information maintained by QEMU itself, which should be > sufficient for most management software needs > > o Make "halted" field optional: we only return it if the > halted state is maintained by QEMU. But this also gives > the option of dropping the field in the future (see below) > > o Drop irrelevant fields such as "current", "pc" and "arch" > > o Rename some fields for better clarification & proper naming > standard > > The "halted" field is somewhat controversial. On the one hand, > it offers a convenient way to know if a guest CPU is idle or > running. On the other hand, it's a field that can change many > times a second. In fact, the halted state can change even > before query-cpus-fast has returned. This makes one wonder if > this field should be dropped all together. Having the "halted" > field as optional gives a better option for dropping it in > the future, since we can just stop returning it. > > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> [...] > diff --git a/qapi-schema.json b/qapi-schema.json > index 5c06745c79..82d6f12b53 100644 > --- a/qapi-schema.json > +++ b/qapi-schema.json > @@ -558,6 +558,77 @@ > ## > { 'command': 'query-cpus', 'returns': ['CpuInfo'] } > > +## > +# @CpuInfo2: > +# > +# Information about a virtual CPU > +# > +# @cpu-index: index of the virtual CPU > +# > +# @halted: true if the virtual CPU is in the halt state. Halt usually refers > +# to a processor specific low power mode. This field is optional, > +# it is only present if the halted state can be retrieved without > +# a performance penalty > +# > +# @qom-path: path to the CPU object in the QOM tree > +# > +# @thread-id: ID of the underlying host thread > +# > +# @props: properties describing to which node/socket/core/thread > +# virtual CPU belongs to, provided if supported by board > +# > +# Since: 2.12 > +# > +# Notes: @halted is a transient state that changes frequently. By the time the > +# data is sent to the client, the guest may no longer be halted. > +## > +{ 'struct': 'CpuInfo2', > + 'data': {'cpu-index': 'int', '*halted': 'bool', 'qom-path': 'str', > + 'thread-id': 'int', '*props': 'CpuInstanceProperties' } } This will require duplicating struct fields every time we add a new field to query-cpus-fast (e.g. how would VIktor's CpuInfoS390State patch[1] look like if rebased on top of yours?). One way to avoid that is to use CpuInfo for both, and make all "slow" fields optional. Another option is to use QAPI inheritance, but it could be a little complicated if unions are involved? [1] https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg02032.html -- Eduardo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] qmp: add query-cpus-fast 2018-02-08 19:59 ` Eduardo Habkost @ 2018-02-08 20:59 ` Eric Blake 2018-02-08 21:41 ` Eduardo Habkost 0 siblings, 1 reply; 15+ messages in thread From: Eric Blake @ 2018-02-08 20:59 UTC (permalink / raw) To: Eduardo Habkost, Luiz Capitulino; +Cc: qemu-devel, armbru, berrange, mihajlov On 02/08/2018 01:59 PM, Eduardo Habkost wrote: > On Wed, Feb 07, 2018 at 12:50:13PM -0500, Luiz Capitulino wrote: >> The query-cpus command has an extremely serious side effect: >> it always interrupt all running vCPUs so that they can run >> ioctl calls. This can cause a huge performance degradation for >> some workloads. And most of the information retrieved by the >> ioctl calls are not even used by query-cpus. >> >> This commit introduces a replacement for query-cpus called >> query-cpus-fast, which has the following features: >> >> +# Notes: @halted is a transient state that changes frequently. By the time the >> +# data is sent to the client, the guest may no longer be halted. >> +## >> +{ 'struct': 'CpuInfo2', >> + 'data': {'cpu-index': 'int', '*halted': 'bool', 'qom-path': 'str', >> + 'thread-id': 'int', '*props': 'CpuInstanceProperties' } } > > This will require duplicating struct fields every time we add a > new field to query-cpus-fast (e.g. how would VIktor's > CpuInfoS390State patch[1] look like if rebased on top of yours?). > > One way to avoid that is to use CpuInfo for both, and make all > "slow" fields optional. Another option is to use QAPI > inheritance, but it could be a little complicated if unions are > involved? Inheritance is better than optional fields for the sake of introspection learning which fields to expect. Put the common fields to both interfaces in the base class, then have the slower (older) CpuInfo class extend the base class to add the additional fields. Unions should be able to inherit just fine from structs (after all, a flat union requires a struct base); but if we need two layers of unions, we'll need to enhance QAPI code generation first. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] qmp: add query-cpus-fast 2018-02-08 20:59 ` Eric Blake @ 2018-02-08 21:41 ` Eduardo Habkost 2018-02-09 8:13 ` Viktor Mihajlovski 0 siblings, 1 reply; 15+ messages in thread From: Eduardo Habkost @ 2018-02-08 21:41 UTC (permalink / raw) To: Eric Blake; +Cc: Luiz Capitulino, qemu-devel, armbru, berrange, mihajlov On Thu, Feb 08, 2018 at 02:59:17PM -0600, Eric Blake wrote: > On 02/08/2018 01:59 PM, Eduardo Habkost wrote: > > On Wed, Feb 07, 2018 at 12:50:13PM -0500, Luiz Capitulino wrote: > > > The query-cpus command has an extremely serious side effect: > > > it always interrupt all running vCPUs so that they can run > > > ioctl calls. This can cause a huge performance degradation for > > > some workloads. And most of the information retrieved by the > > > ioctl calls are not even used by query-cpus. > > > > > > This commit introduces a replacement for query-cpus called > > > query-cpus-fast, which has the following features: > > > > > > > +# Notes: @halted is a transient state that changes frequently. By the time the > > > +# data is sent to the client, the guest may no longer be halted. > > > +## > > > +{ 'struct': 'CpuInfo2', > > > + 'data': {'cpu-index': 'int', '*halted': 'bool', 'qom-path': 'str', > > > + 'thread-id': 'int', '*props': 'CpuInstanceProperties' } } > > > > This will require duplicating struct fields every time we add a > > new field to query-cpus-fast (e.g. how would VIktor's > > CpuInfoS390State patch[1] look like if rebased on top of yours?). > > > > One way to avoid that is to use CpuInfo for both, and make all > > "slow" fields optional. Another option is to use QAPI > > inheritance, but it could be a little complicated if unions are > > involved? > > Inheritance is better than optional fields for the sake of introspection > learning which fields to expect. > > Put the common fields to both interfaces in the base class, then have the > slower (older) CpuInfo class extend the base class to add the additional > fields. > > Unions should be able to inherit just fine from structs (after all, a flat > union requires a struct base); but if we need two layers of unions, we'll > need to enhance QAPI code generation first. If we can't do union-union inheritance yet, maybe we can work around it this way: # fields that are always returned by both query-cpus and query-cpus-fast { 'struct': 'BothCpuInfoBase', 'data': {'cpu': 'int', 'qom_path': 'str', 'thread_id': 'int', '*props': 'CpuInstanceProperties' } } # fields that are always returned by query-cpus { 'struct': 'CpuInfoBase', 'base': 'BothCpuInfoBase', 'data': {'current': 'bool', 'halted': 'bool', 'arch': 'CpuInfoArch' } } # query-cpus return value { 'union': 'CpuInfo', 'base': 'CpuInfoBase', 'discriminator': 'arch', 'data': { 'x86': 'CpuInfoX86', 's390': 'CpuInfoS390', 'sparc': 'CpuInfoSPARC', 'ppc': 'CpuInfoPPC', 'mips': 'CpuInfoMIPS', 'tricore': 'CpuInfoTricore', 'other': 'CpuInfoOther' } } # fields that are always returned by query-cpus-fast { 'struct': 'FastCpuInfoBase', 'base': 'BothCpuInfoBase', 'data': { 'arch': 'CpuInfoArch' } } # return value of query-cpus-fast { 'union': 'FastCpuInfo', 'base': 'FastCpuInfoBase', 'discriminator': 'arch', 'data': { 'x86': 'CpuInfoOther', 's390': 'FastCpuInfoS390', 'sparc': 'CpuInfoOther', 'ppc': 'CpuInfoOther', 'mips': 'CpuInfoOther', 'tricore': 'CpuInfoOther', 'other': 'CpuInfoOther' } } # fields returned by both query-cpus and query-cpus-fast on s390 { 'struct': 'FastCpuInfoS390', 'data': { 'fast_field': 'int' } } # fields returned by query-cpus on s390 { 'struct': 'CpuInfoS390', 'base': 'FastCpuInfoS390', 'data': { 'slow_field': 'int' } } -- Eduardo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] qmp: add query-cpus-fast 2018-02-08 21:41 ` Eduardo Habkost @ 2018-02-09 8:13 ` Viktor Mihajlovski 0 siblings, 0 replies; 15+ messages in thread From: Viktor Mihajlovski @ 2018-02-09 8:13 UTC (permalink / raw) To: Eduardo Habkost, Eric Blake; +Cc: Luiz Capitulino, qemu-devel, armbru, berrange On 08.02.2018 22:41, Eduardo Habkost wrote: > On Thu, Feb 08, 2018 at 02:59:17PM -0600, Eric Blake wrote: >> On 02/08/2018 01:59 PM, Eduardo Habkost wrote: >>> On Wed, Feb 07, 2018 at 12:50:13PM -0500, Luiz Capitulino wrote: >>>> The query-cpus command has an extremely serious side effect: >>>> it always interrupt all running vCPUs so that they can run >>>> ioctl calls. This can cause a huge performance degradation for >>>> some workloads. And most of the information retrieved by the >>>> ioctl calls are not even used by query-cpus. >>>> >>>> This commit introduces a replacement for query-cpus called >>>> query-cpus-fast, which has the following features: >>>> >> >>>> +# Notes: @halted is a transient state that changes frequently. By the time the >>>> +# data is sent to the client, the guest may no longer be halted. >>>> +## >>>> +{ 'struct': 'CpuInfo2', >>>> + 'data': {'cpu-index': 'int', '*halted': 'bool', 'qom-path': 'str', >>>> + 'thread-id': 'int', '*props': 'CpuInstanceProperties' } } >>> >>> This will require duplicating struct fields every time we add a >>> new field to query-cpus-fast (e.g. how would VIktor's >>> CpuInfoS390State patch[1] look like if rebased on top of yours?). >>> >>> One way to avoid that is to use CpuInfo for both, and make all >>> "slow" fields optional. Another option is to use QAPI >>> inheritance, but it could be a little complicated if unions are >>> involved? >> >> Inheritance is better than optional fields for the sake of introspection >> learning which fields to expect. >> >> Put the common fields to both interfaces in the base class, then have the >> slower (older) CpuInfo class extend the base class to add the additional >> fields. >> >> Unions should be able to inherit just fine from structs (after all, a flat >> union requires a struct base); but if we need two layers of unions, we'll >> need to enhance QAPI code generation first. > > If we can't do union-union inheritance yet, maybe we can work around it this > way: > > # fields that are always returned by both query-cpus and query-cpus-fast > { 'struct': 'BothCpuInfoBase', > 'data': {'cpu': 'int', 'qom_path': 'str', 'thread_id': 'int', > '*props': 'CpuInstanceProperties' } } > > # fields that are always returned by query-cpus > { 'struct': 'CpuInfoBase', > 'base': 'BothCpuInfoBase', > 'data': {'current': 'bool', 'halted': 'bool', 'arch': 'CpuInfoArch' } } > > # query-cpus return value > { 'union': 'CpuInfo', > 'base': 'CpuInfoBase', > 'discriminator': 'arch', > 'data': { 'x86': 'CpuInfoX86', > 's390': 'CpuInfoS390', > 'sparc': 'CpuInfoSPARC', > 'ppc': 'CpuInfoPPC', > 'mips': 'CpuInfoMIPS', > 'tricore': 'CpuInfoTricore', > 'other': 'CpuInfoOther' } } > > # fields that are always returned by query-cpus-fast > { 'struct': 'FastCpuInfoBase', > 'base': 'BothCpuInfoBase', > 'data': { 'arch': 'CpuInfoArch' } } > > # return value of query-cpus-fast > { 'union': 'FastCpuInfo', > 'base': 'FastCpuInfoBase', > 'discriminator': 'arch', > 'data': { 'x86': 'CpuInfoOther', > 's390': 'FastCpuInfoS390', > 'sparc': 'CpuInfoOther', > 'ppc': 'CpuInfoOther', > 'mips': 'CpuInfoOther', > 'tricore': 'CpuInfoOther', > 'other': 'CpuInfoOther' } } > > # fields returned by both query-cpus and query-cpus-fast on s390 > { 'struct': 'FastCpuInfoS390', > 'data': { 'fast_field': 'int' } } > > # fields returned by query-cpus on s390 > { 'struct': 'CpuInfoS390', > 'base': 'FastCpuInfoS390', > 'data': { 'slow_field': 'int' } } > This is not exactly pretty, but would provide the functionality (and flexibility) I'd expect. -- Regards, Viktor Mihajlovski ^ permalink raw reply [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH 2/2] qmp: document query-cpus performance issue 2018-02-07 17:50 [Qemu-devel] [PATCH 0/2] qmp: add query-cpus-fast Luiz Capitulino 2018-02-07 17:50 ` [Qemu-devel] [PATCH 1/2] " Luiz Capitulino @ 2018-02-07 17:50 ` Luiz Capitulino 2018-02-07 18:50 ` Eric Blake 2018-02-08 9:29 ` Daniel P. Berrangé 1 sibling, 2 replies; 15+ messages in thread From: Luiz Capitulino @ 2018-02-07 17:50 UTC (permalink / raw) To: qemu-devel; +Cc: armbru, eblake, ehabkost, berrange, mihajlov Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> --- qapi-schema.json | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/qapi-schema.json b/qapi-schema.json index 82d6f12b53..0665a14dba 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -526,6 +526,10 @@ # # Returns a list of information about each virtual CPU. # +# WARNING: This command incurs a performance penalty for latency +# sensitive workloads and hence it's not recommended to +# to be used in production. Use query-cpus-fast instead +# # Returns: a list of @CpuInfo for each virtual CPU # # Since: 0.14.0 -- 2.14.3 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] qmp: document query-cpus performance issue 2018-02-07 17:50 ` [Qemu-devel] [PATCH 2/2] qmp: document query-cpus performance issue Luiz Capitulino @ 2018-02-07 18:50 ` Eric Blake 2018-02-07 19:14 ` Luiz Capitulino 2018-02-08 9:29 ` Daniel P. Berrangé 1 sibling, 1 reply; 15+ messages in thread From: Eric Blake @ 2018-02-07 18:50 UTC (permalink / raw) To: Luiz Capitulino, qemu-devel; +Cc: armbru, ehabkost, berrange, mihajlov On 02/07/2018 11:50 AM, Luiz Capitulino wrote: > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> > --- > qapi-schema.json | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/qapi-schema.json b/qapi-schema.json > index 82d6f12b53..0665a14dba 100644 > --- a/qapi-schema.json > +++ b/qapi-schema.json > @@ -526,6 +526,10 @@ > # > # Returns a list of information about each virtual CPU. > # > +# WARNING: This command incurs a performance penalty for latency > +# sensitive workloads and hence it's not recommended to > +# to be used in production. Use query-cpus-fast instead > +# Ah, I asked for this on 1/2. You could squash these. I didn't review the code, just the interface, so if you squash them, you can add the weaker: Acked-by: Eric Blake <eblake@redhat.com> -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] qmp: document query-cpus performance issue 2018-02-07 18:50 ` Eric Blake @ 2018-02-07 19:14 ` Luiz Capitulino 0 siblings, 0 replies; 15+ messages in thread From: Luiz Capitulino @ 2018-02-07 19:14 UTC (permalink / raw) To: Eric Blake; +Cc: qemu-devel, armbru, ehabkost, berrange, mihajlov On Wed, 7 Feb 2018 12:50:59 -0600 Eric Blake <eblake@redhat.com> wrote: > On 02/07/2018 11:50 AM, Luiz Capitulino wrote: > > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> > > --- > > qapi-schema.json | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/qapi-schema.json b/qapi-schema.json > > index 82d6f12b53..0665a14dba 100644 > > --- a/qapi-schema.json > > +++ b/qapi-schema.json > > @@ -526,6 +526,10 @@ > > # > > # Returns a list of information about each virtual CPU. > > # > > +# WARNING: This command incurs a performance penalty for latency > > +# sensitive workloads and hence it's not recommended to > > +# to be used in production. Use query-cpus-fast instead > > +# > > Ah, I asked for this on 1/2. You could squash these. > > I didn't review the code, just the interface, so if you squash them, you > can add the weaker: > > Acked-by: Eric Blake <eblake@redhat.com> OK, I'll wait for more review and do the changes you requested. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] qmp: document query-cpus performance issue 2018-02-07 17:50 ` [Qemu-devel] [PATCH 2/2] qmp: document query-cpus performance issue Luiz Capitulino 2018-02-07 18:50 ` Eric Blake @ 2018-02-08 9:29 ` Daniel P. Berrangé 2018-02-08 14:00 ` Luiz Capitulino 1 sibling, 1 reply; 15+ messages in thread From: Daniel P. Berrangé @ 2018-02-08 9:29 UTC (permalink / raw) To: Luiz Capitulino; +Cc: qemu-devel, armbru, eblake, ehabkost, mihajlov On Wed, Feb 07, 2018 at 12:50:14PM -0500, Luiz Capitulino wrote: > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> > --- > qapi-schema.json | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/qapi-schema.json b/qapi-schema.json > index 82d6f12b53..0665a14dba 100644 > --- a/qapi-schema.json > +++ b/qapi-schema.json > @@ -526,6 +526,10 @@ > # > # Returns a list of information about each virtual CPU. > # > +# WARNING: This command incurs a performance penalty for latency > +# sensitive workloads and hence it's not recommended to > +# to be used in production. Use query-cpus-fast instead I suggest being more explicit about exactly what the problem is, so people understand implications if they choose to still use it. ie This command causes vCPU threads to exit to userspace, which causes an small interruption guest CPU execution. This will have a negative impact on realtime guests and other latency sensitive guest workloads. It is recommended to use query-cpus-fast instead of this command to avoid the vCPU interruption. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] qmp: document query-cpus performance issue 2018-02-08 9:29 ` Daniel P. Berrangé @ 2018-02-08 14:00 ` Luiz Capitulino 0 siblings, 0 replies; 15+ messages in thread From: Luiz Capitulino @ 2018-02-08 14:00 UTC (permalink / raw) To: Daniel P. Berrangé; +Cc: qemu-devel, armbru, eblake, ehabkost, mihajlov On Thu, 8 Feb 2018 09:29:28 +0000 Daniel P. Berrangé <berrange@redhat.com> wrote: > On Wed, Feb 07, 2018 at 12:50:14PM -0500, Luiz Capitulino wrote: > > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> > > --- > > qapi-schema.json | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/qapi-schema.json b/qapi-schema.json > > index 82d6f12b53..0665a14dba 100644 > > --- a/qapi-schema.json > > +++ b/qapi-schema.json > > @@ -526,6 +526,10 @@ > > # > > # Returns a list of information about each virtual CPU. > > # > > +# WARNING: This command incurs a performance penalty for latency > > +# sensitive workloads and hence it's not recommended to > > +# to be used in production. Use query-cpus-fast instead > > I suggest being more explicit about exactly what the problem is, so people > understand implications if they choose to still use it. ie I'll add your text. > > This command causes vCPU threads to exit to userspace, which causes > an small interruption guest CPU execution. This will have a negative > impact on realtime guests and other latency sensitive guest workloads. > It is recommended to use query-cpus-fast instead of this command to > avoid the vCPU interruption. > > Regards, > Daniel ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2018-02-09 8:13 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-02-07 17:50 [Qemu-devel] [PATCH 0/2] qmp: add query-cpus-fast Luiz Capitulino 2018-02-07 17:50 ` [Qemu-devel] [PATCH 1/2] " Luiz Capitulino 2018-02-07 18:49 ` Eric Blake 2018-02-08 7:41 ` Viktor Mihajlovski 2018-02-08 10:13 ` Viktor Mihajlovski 2018-02-08 13:59 ` Luiz Capitulino 2018-02-08 19:59 ` Eduardo Habkost 2018-02-08 20:59 ` Eric Blake 2018-02-08 21:41 ` Eduardo Habkost 2018-02-09 8:13 ` Viktor Mihajlovski 2018-02-07 17:50 ` [Qemu-devel] [PATCH 2/2] qmp: document query-cpus performance issue Luiz Capitulino 2018-02-07 18:50 ` Eric Blake 2018-02-07 19:14 ` Luiz Capitulino 2018-02-08 9:29 ` Daniel P. Berrangé 2018-02-08 14:00 ` Luiz Capitulino
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).