* Supporting consistency of vcpu_runstate_info across cpus
From: Juergen Gross @ 2016-05-19 7:53 UTC
To: xen-devel
Cc: Stefano Stabellini, Wei Liu, George.Dunlap@eu.citrix.com, Andrew Cooper, Ian Jackson, Tim Deegan, Jan Beulich

A guest kernel can use the vcpu_op hypercall sub-op
VCPUOP_register_runstate_memory_area to get a copy of the
vcpu_runstate_info of a vcpu mapped into its memory. As this structure
has no update indicator it is only safe to be read by the vcpu whose
runstate information it contains.

Being able to read the runstate info of another cpu is required e.g.
by the Linux kernel to be able to calculate vruntime: see

http://lists.xen.org/archives/html/xen-devel/2016-05/msg01790.html

I'd suggest to add an "update in progress" indicator in the highest
bit of vcpu_runstate_info->state_entry_time as this structure element is
already used to detect vcpu scheduling when vcpu_runstate_info is read
by the owning vcpu.

The question is how to enable setting this indicator, as the guest must
be able to cope with it (I believe the Linux kernel would just run fine,
but we can't be sure this is true for all guests).

I see the following possible solutions:

a) Introduce a new vcpu_op hypercall sub-op for mapping the
   vcpu_runstate_info with update indicator support (a guest supporting
   this would try the new sub-op first and could fall back to
   VCPUOP_register_runstate_memory_area in case of ENOSYS).

b) Add a virtual MSR to switch on the feature (not being able to set the
   appropriate bit would indicate the feature not being available). This
   is the variant KVM is using. Does ARM have something like MSRs?

c) Add another hypercall to switch on the feature (similar to
   XENVER_get_features we could have a XENVER_set_features).

Any preferences?

Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
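The proposal above describes an "update in progress" indicator in the top bit of state_entry_time. A minimal sketch of how the updating side might use such a bit is given here, under stated assumptions: RUNSTATE_UPDATE_FLAG, struct runstate_sketch and runstate_publish() are names invented for illustration and are not part of the Xen interface, the struct is a simplified stand-in for vcpu_runstate_info, and C11 fences stand in for whatever barriers the hypervisor would really use. The non-atomic fields would need the hypervisor's own access primitives in real concurrent code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical name for the proposed indicator: bit 63 of state_entry_time. */
#define RUNSTATE_UPDATE_FLAG (UINT64_C(1) << 63)

/* Simplified stand-in for vcpu_runstate_info (assumption of this sketch). */
struct runstate_sketch {
    int state;                          /* current runstate, 0..3 */
    _Atomic uint64_t state_entry_time;  /* when 'state' was entered */
    uint64_t time[4];                   /* time accumulated per runstate */
};

/* Updating side: set the flag, change the payload, then publish the new
 * entry time with the flag clear. */
static void runstate_publish(struct runstate_sketch *r, int new_state,
                             uint64_t now)
{
    uint64_t old = atomic_load_explicit(&r->state_entry_time,
                                        memory_order_relaxed);

    /* 1. Announce that an update is in progress. */
    atomic_store_explicit(&r->state_entry_time, old | RUNSTATE_UPDATE_FLAG,
                          memory_order_relaxed);
    /* Order the flag store before the payload stores (like wmb()). */
    atomic_thread_fence(memory_order_release);

    /* 2. Account the time spent in the old state and switch state. */
    r->time[r->state] += now - old;
    r->state = new_state;

    /* 3. Publish the new entry time; the flag is clear again. */
    atomic_store_explicit(&r->state_entry_time, now & ~RUNSTATE_UPDATE_FLAG,
                          memory_order_release);
}
```

Because the final store writes a new state_entry_time, a reader on another cpu sees either the flag or a changed value whenever an update overlapped its read.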
* Re: Supporting consistency of vcpu_runstate_info across cpus
From: Andrew Cooper @ 2016-05-19 8:09 UTC
To: Juergen Gross, xen-devel
Cc: Stefano Stabellini, Wei Liu, George.Dunlap@eu.citrix.com, Ian Jackson, Tim Deegan, Jan Beulich

On 19/05/2016 08:53, Juergen Gross wrote:
> A guest kernel can use the vcpu_op hypercall sub-op
> VCPUOP_register_runstate_memory_area to get a copy of the
> vcpu_runstate_info of a vcpu mapped into its memory. As this structure
> has no update indicator it is only safe to be read by the vcpu whose
> runstate information it contains.
>
> Being able to read the runstate info of another cpu is required e.g.
> by the Linux kernel to be able to calculate vruntime: see
>
> http://lists.xen.org/archives/html/xen-devel/2016-05/msg01790.html
>
> I'd suggest to add an "update in progress" indicator in the highest
> bit of vcpu_runstate_info->state_entry_time as this structure element is
> already used to detect vcpu scheduling when vcpu_runstate_info is read
> by the owning vcpu.
>
> The question is how to enable setting this indicator, as the guest must
> be able to cope with it (I believe the Linux kernel would just run fine,
> but we can't be sure this is true for all guests).
>
> I see the following possible solutions:
>
> a) Introduce a new vcpu_op hypercall sub-op for mapping the
>    vcpu_runstate_info with update indicator support (a guest supporting
>    this would try the new sub-op first and could fall back to
>    VCPUOP_register_runstate_memory_area in case of ENOSYS).
>
> b) Add a virtual MSR to switch on the feature (not being able to set the
>    appropriate bit would indicate the feature not being available). This
>    is the variant KVM is using. Does ARM have something like MSRs?
>
> c) Add another hypercall to switch on the feature (similar to
>    XENVER_get_features we could have a XENVER_set_features).
>
> Any preferences?

However, irrespective of how you signal the request for new behaviour,
you should see about using a lockless clock rather than a single bit, as
a single bit can't indicate the case where a complete update has
occurred between two samplings. This will probably require an extension
to the current implementation, at which point you might be able to add a
capability field as well.

Alternatively, the easiest way will probably be to add a new VMASSIST,
which allows the guest to opt into the new behaviour.

~Andrew
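The objection to a bare single bit can be illustrated with a small single-threaded model. This is only a sketch: struct flagged, struct versioned and the helper functions are hypothetical names, no memory barriers are shown, and nothing here is taken from the Xen sources. It contrasts a lone "updating" flag with a version counter that is odd while an update is in progress.

```c
#include <stdbool.h>
#include <stdint.h>

/* Variant 1: a bare "update in progress" flag next to the payload. */
struct flagged {
    bool updating;
    uint64_t payload;
};

/* Variant 2: a version counter; an odd value means "update in progress". */
struct versioned {
    uint32_t version;
    uint64_t payload;
};

static void flagged_update(struct flagged *f, uint64_t value)
{
    f->updating = true;
    f->payload = value;
    f->updating = false;
}

static void versioned_update(struct versioned *v, uint64_t value)
{
    v->version++;        /* now odd: update in progress */
    v->payload = value;
    v->version++;        /* even again, but different from before */
}

/* Reader-side checks on samples taken before and after reading the payload. */
static bool flagged_looks_stable(bool before, bool after)
{
    return !before && !after;
}

static bool versioned_looks_stable(uint32_t before, uint32_t after)
{
    return (before & 1) == 0 && before == after;
}
```

If a complete update runs between the two samples, the flag reads clear both times and the reader wrongly accepts its data, while the counter has moved on and forces a retry. Any field that is guaranteed to change on every update can play the role of the counter.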
* Re: Supporting consistency of vcpu_runstate_info across cpus
From: Juergen Gross @ 2016-05-19 8:49 UTC
To: Andrew Cooper, xen-devel
Cc: Stefano Stabellini, Wei Liu, George.Dunlap@eu.citrix.com, Ian Jackson, Tim Deegan, Jan Beulich

On 19/05/16 10:09, Andrew Cooper wrote:
> On 19/05/2016 08:53, Juergen Gross wrote:
>> A guest kernel can use the vcpu_op hypercall sub-op
>> VCPUOP_register_runstate_memory_area to get a copy of the
>> vcpu_runstate_info of a vcpu mapped into its memory. As this structure
>> has no update indicator it is only safe to be read by the vcpu whose
>> runstate information it contains.
>>
>> Being able to read the runstate info of another cpu is required e.g.
>> by the Linux kernel to be able to calculate vruntime: see
>>
>> http://lists.xen.org/archives/html/xen-devel/2016-05/msg01790.html
>>
>> I'd suggest to add an "update in progress" indicator in the highest
>> bit of vcpu_runstate_info->state_entry_time as this structure element is
>> already used to detect vcpu scheduling when vcpu_runstate_info is read
>> by the owning vcpu.
>>
>> The question is how to enable setting this indicator, as the guest must
>> be able to cope with it (I believe the Linux kernel would just run fine,
>> but we can't be sure this is true for all guests).
>>
>> I see the following possible solutions:
>>
>> a) Introduce a new vcpu_op hypercall sub-op for mapping the
>>    vcpu_runstate_info with update indicator support (a guest supporting
>>    this would try the new sub-op first and could fall back to
>>    VCPUOP_register_runstate_memory_area in case of ENOSYS).
>>
>> b) Add a virtual MSR to switch on the feature (not being able to set the
>>    appropriate bit would indicate the feature not being available). This
>>    is the variant KVM is using. Does ARM have something like MSRs?
>>
>> c) Add another hypercall to switch on the feature (similar to
>>    XENVER_get_features we could have a XENVER_set_features).
>>
>> Any preferences?
>
> However, irrespective of how you signal the request for new behaviour,
> you should see about using a lockless clock rather than a single bit, as
> a single bit can't indicate the case where a complete update has
> occurred between two samplings. This will probably require an extension
> to the current implementation, at which point you might be able to add a
> capability field as well.

That's the reason I've chosen state_entry_time as the home for the new
bit. state_entry_time is guaranteed to change between two updates. So
the logic would look like the following:

    do {
        old_entry_time = READ_ONCE(r->state_entry_time);
        rmb();
        new_state = READ_ONCE(*r);
        rmb();
    } while (new_state.state_entry_time != old_entry_time ||
             (old_entry_time >> 63));

> Alternatively, the easiest way will probably be to add a new VMASSIST,
> which allows the guest to opt into the new behaviour.

Aah, nice. Yes, this seems to be a sensible option.

Juergen
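The retry loop in the message above relies on kernel helpers (READ_ONCE, rmb). A self-contained approximation in portable C11 might look like the following sketch. The names RUNSTATE_UPDATE_FLAG, struct runstate_sketch, struct runstate_snapshot, runstate_try_read() and runstate_read() are assumptions made for illustration, the struct is a simplified stand-in for vcpu_runstate_info, and acquire ordering stands in for rmb(). Real concurrent code would also need suitable accessors for the non-atomic fields.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the proposed indicator: bit 63 of state_entry_time. */
#define RUNSTATE_UPDATE_FLAG (UINT64_C(1) << 63)

/* Simplified stand-in for vcpu_runstate_info (assumption of this sketch). */
struct runstate_sketch {
    int state;
    _Atomic uint64_t state_entry_time;
    uint64_t time[4];
};

/* Plain copy handed back to the caller. */
struct runstate_snapshot {
    int state;
    uint64_t state_entry_time;
    uint64_t time[4];
};

/* One attempt: returns false if an update was in progress or completed
 * while the payload was being copied. */
static bool runstate_try_read(struct runstate_sketch *r,
                              struct runstate_snapshot *out)
{
    uint64_t before, after;
    int i;

    before = atomic_load_explicit(&r->state_entry_time,
                                  memory_order_acquire);
    if (before & RUNSTATE_UPDATE_FLAG)
        return false;

    out->state = r->state;
    for (i = 0; i < 4; i++)
        out->time[i] = r->time[i];

    /* Order the payload reads before the re-check (like rmb()). */
    atomic_thread_fence(memory_order_acquire);
    after = atomic_load_explicit(&r->state_entry_time,
                                 memory_order_relaxed);
    if (after != before)
        return false;

    out->state_entry_time = before;
    return true;
}

/* Retry until a consistent snapshot has been taken. */
static void runstate_read(struct runstate_sketch *r,
                          struct runstate_snapshot *out)
{
    while (!runstate_try_read(r, out))
        ;
}
```

Splitting the single attempt from the loop is a choice of this sketch: it makes the "flag set" case observable without spinning forever when nobody clears the flag.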
* Re: Supporting consistency of vcpu_runstate_info across cpus
From: Dario Faggioli @ 2016-05-19 10:21 UTC
To: Juergen Gross
Cc: Stefano Stabellini, Wei Liu, George.Dunlap@eu.citrix.com, Andrew Cooper, Ian Jackson, Tim Deegan, Jan Beulich, xen-devel, Roger Pau Monne

Since, AFAIUI, you're interested in non-Linux guests' perspective, I'm
adding Roger (and avoiding trimming, for his benefit), who can tell us
what he thinks of this all, from the FreeBSD point of view.

On Thu, May 19, 2016 at 10:49 AM, Juergen Gross <jgross@suse.com> wrote:
> On 19/05/16 10:09, Andrew Cooper wrote:
>> On 19/05/2016 08:53, Juergen Gross wrote:
>>> A guest kernel can use the vcpu_op hypercall sub-op
>>> VCPUOP_register_runstate_memory_area to get a copy of the
>>> vcpu_runstate_info of a vcpu mapped into its memory. As this structure
>>> has no update indicator it is only safe to be read by the vcpu whose
>>> runstate information it contains.
>>>
>>> Being able to read the runstate info of another cpu is required e.g.
>>> by the Linux kernel to be able to calculate vruntime: see
>>>
>>> http://lists.xen.org/archives/html/xen-devel/2016-05/msg01790.html
>>>
>>> I'd suggest to add an "update in progress" indicator in the highest
>>> bit of vcpu_runstate_info->state_entry_time as this structure element is
>>> already used to detect vcpu scheduling when vcpu_runstate_info is read
>>> by the owning vcpu.
>>>
>>> The question is how to enable setting this indicator, as the guest must
>>> be able to cope with it (I believe the Linux kernel would just run fine,
>>> but we can't be sure this is true for all guests).
>>>
>>> I see the following possible solutions:
>>>
>>> a) Introduce a new vcpu_op hypercall sub-op for mapping the
>>>    vcpu_runstate_info with update indicator support (a guest supporting
>>>    this would try the new sub-op first and could fall back to
>>>    VCPUOP_register_runstate_memory_area in case of ENOSYS).
>>>
>>> b) Add a virtual MSR to switch on the feature (not being able to set the
>>>    appropriate bit would indicate the feature not being available). This
>>>    is the variant KVM is using. Does ARM have something like MSRs?
>>>
>>> c) Add another hypercall to switch on the feature (similar to
>>>    XENVER_get_features we could have a XENVER_set_features).
>>>
>>> Any preferences?
>>
>> However, irrespective of how you signal the request for new behaviour,
>> you should see about using a lockless clock rather than a single bit, as
>> a single bit can't indicate the case where a complete update has
>> occurred between two samplings. This will probably require an extension
>> to the current implementation, at which point you might be able to add a
>> capability field as well.
>
> That's the reason I've chosen state_entry_time as the home for the new
> bit. state_entry_time is guaranteed to change between two updates. So
> the logic would look like the following:
>
>     do {
>         old_entry_time = READ_ONCE(r->state_entry_time);
>         rmb();
>         new_state = READ_ONCE(*r);
>         rmb();
>     } while (new_state.state_entry_time != old_entry_time ||
>              (old_entry_time >> 63));
>
>> Alternatively, the easiest way will probably be to add a new VMASSIST,
>> which allows the guest to opt into the new behaviour.
>
> Aah, nice. Yes, this seems to be a sensible option.
>
FWIW, this looks like a good approach to me as well.

Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
---------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
* Re: Supporting consistency of vcpu_runstate_info across cpus
From: Roger Pau Monne @ 2016-05-19 13:57 UTC
To: Dario Faggioli
Cc: Juergen Gross, Stefano Stabellini, Wei Liu, George.Dunlap@eu.citrix.com, Andrew Cooper, Ian Jackson, Tim Deegan, Jan Beulich, xen-devel

On Thu, May 19, 2016 at 12:21:57PM +0200, Dario Faggioli wrote:
> Since, AFAIUI, you're interested in non-Linux guests' perspective, I'm
> adding Roger (and avoiding trimming, for his benefit), who can tell us
> what he thinks of this all, from the FreeBSD point of view.

Thanks, AFAIK vcpu_runstate_info is only used by Linux ATM? (maybe
Windows?)

FreeBSD doesn't do stolen time accounting at all, and (although I would
really like to see this implemented) I don't foresee myself adding this
in the near future. That's mainly because FreeBSD doesn't have the
necessary scheduler hooks, so it's not only implementing the Xen side of
it, it needs to be plumbed through the scheduler and that doesn't look
like an easy task.

NetBSD also doesn't seem to do it, and OpenBSD just gained basic Xen
support, so no stolen time accounting there either.

> On Thu, May 19, 2016 at 10:49 AM, Juergen Gross <jgross@suse.com> wrote:
> > On 19/05/16 10:09, Andrew Cooper wrote:
> >> On 19/05/2016 08:53, Juergen Gross wrote:
> >>> A guest kernel can use the vcpu_op hypercall sub-op
> >>> VCPUOP_register_runstate_memory_area to get a copy of the
> >>> vcpu_runstate_info of a vcpu mapped into its memory. As this structure
> >>> has no update indicator it is only safe to be read by the vcpu whose
> >>> runstate information it contains.
> >>>
> >>> Being able to read the runstate info of another cpu is required e.g.
> >>> by the Linux kernel to be able to calculate vruntime: see
> >>>
> >>> http://lists.xen.org/archives/html/xen-devel/2016-05/msg01790.html
> >>>
> >>> I'd suggest to add an "update in progress" indicator in the highest
> >>> bit of vcpu_runstate_info->state_entry_time as this structure element is
> >>> already used to detect vcpu scheduling when vcpu_runstate_info is read
> >>> by the owning vcpu.
> >>>
> >>> The question is how to enable setting this indicator, as the guest must
> >>> be able to cope with it (I believe the Linux kernel would just run fine,
> >>> but we can't be sure this is true for all guests).
> >>>
> >>> I see the following possible solutions:
> >>>
> >>> a) Introduce a new vcpu_op hypercall sub-op for mapping the
> >>>    vcpu_runstate_info with update indicator support (a guest supporting
> >>>    this would try the new sub-op first and could fall back to
> >>>    VCPUOP_register_runstate_memory_area in case of ENOSYS).
> >>>
> >>> b) Add a virtual MSR to switch on the feature (not being able to set the
> >>>    appropriate bit would indicate the feature not being available). This
> >>>    is the variant KVM is using. Does ARM have something like MSRs?

So I assume the vcpu_runstate_info structure is shared between Xen and
KVM, just like the PV time info structure?

> >>> c) Add another hypercall to switch on the feature (similar to
> >>>    XENVER_get_features we could have a XENVER_set_features).
> >>>
> >>> Any preferences?
> >>
> >> However, irrespective of how you signal the request for new behaviour,
> >> you should see about using a lockless clock rather than a single bit, as
> >> a single bit can't indicate the case where a complete update has
> >> occurred between two samplings. This will probably require an extension
> >> to the current implementation, at which point you might be able to add a
> >> capability field as well.
> >
> > That's the reason I've chosen state_entry_time as the home for the new
> > bit. state_entry_time is guaranteed to change between two updates. So
> > the logic would look like the following:
> >
> >     do {
> >         old_entry_time = READ_ONCE(r->state_entry_time);
> >         rmb();
> >         new_state = READ_ONCE(*r);
> >         rmb();
> >     } while (new_state.state_entry_time != old_entry_time ||
> >              (old_entry_time >> 63));
> >
> >> Alternatively, the easiest way will probably be to add a new VMASSIST,
> >> which allows the guest to opt into the new behaviour.
> >
> > Aah, nice. Yes, this seems to be a sensible option.
> >
> FWIW, this looks like a good approach to me as well.

I don't have a problem with this, I would just like to use whatever KVM
uses in order to be able to reduce code duplication if I ever implement
this on FreeBSD.

Roger.
* Re: Supporting consistency of vcpu_runstate_info across cpus
From: Stefano Stabellini @ 2016-05-19 10:40 UTC
To: Juergen Gross
Cc: Stefano Stabellini, Wei Liu, George.Dunlap@eu.citrix.com, Andrew Cooper, Ian Jackson, Tim Deegan, Jan Beulich, xen-devel

On Thu, 19 May 2016, Juergen Gross wrote:
> > Alternatively, the easiest way will probably be to add a new VMASSIST,
> > which allows the guest to opt into the new behaviour.
>
> Aah, nice. Yes, this seems to be a sensible option.

If you are referring to VM_ASSIST, it is only available on x86. I
suggest we use a feature flag instead.
* Re: Supporting consistency of vcpu_runstate_info across cpus
From: Jan Beulich @ 2016-05-19 10:45 UTC
To: Stefano Stabellini
Cc: Juergen Gross, Wei Liu, George.Dunlap@eu.citrix.com, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

>>> On 19.05.16 at 12:40, <sstabellini@kernel.org> wrote:
> On Thu, 19 May 2016, Juergen Gross wrote:
>> > Alternatively, the easiest way will probably be to add a new VMASSIST,
>> > which allows the guest to opt into the new behaviour.
>>
>> Aah, nice. Yes, this seems to be a sensible option.
>
> If you are referring to VM_ASSIST, it is only available on x86. I
> suggest we use a feature flag instead.

A feature flag can only be checked by the guest, not set. How
about enabling VMASSIST for ARM?

Jan
* Re: Supporting consistency of vcpu_runstate_info across cpus
From: Stefano Stabellini @ 2016-05-19 10:48 UTC
To: Jan Beulich
Cc: Juergen Gross, Stefano Stabellini, Wei Liu, George.Dunlap@eu.citrix.com, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

On Thu, 19 May 2016, Jan Beulich wrote:
> >>> On 19.05.16 at 12:40, <sstabellini@kernel.org> wrote:
> > On Thu, 19 May 2016, Juergen Gross wrote:
> >> > Alternatively, the easiest way will probably be to add a new VMASSIST,
> >> > which allows the guest to opt into the new behaviour.
> >>
> >> Aah, nice. Yes, this seems to be a sensible option.
> >
> > If you are referring to VM_ASSIST, it is only available on x86. I
> > suggest we use a feature flag instead.
>
> A feature flag can only be checked by the guest, not set. How
> about enabling VMASSIST for ARM?

Sure
* Re: Supporting consistency of vcpu_runstate_info across cpus
From: Juergen Gross @ 2016-05-19 14:11 UTC
To: Stefano Stabellini, Jan Beulich
Cc: Wei Liu, George.Dunlap@eu.citrix.com, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

On 19/05/16 12:48, Stefano Stabellini wrote:
> On Thu, 19 May 2016, Jan Beulich wrote:
>>>>> On 19.05.16 at 12:40, <sstabellini@kernel.org> wrote:
>>> On Thu, 19 May 2016, Juergen Gross wrote:
>>>>> Alternatively, the easiest way will probably be to add a new VMASSIST,
>>>>> which allows the guest to opt into the new behaviour.
>>>>
>>>> Aah, nice. Yes, this seems to be a sensible option.
>>>
>>> If you are referring to VM_ASSIST, it is only available on x86. I
>>> suggest we use a feature flag instead.
>>
>> A feature flag can only be checked by the guest, not set. How
>> about enabling VMASSIST for ARM?
>
> Sure

Stefano, if you want I can do this when adding the VMASSIST option.

Juergen
* Re: Supporting consistency of vcpu_runstate_info across cpus
From: Stefano Stabellini @ 2016-05-19 16:54 UTC
To: Juergen Gross
Cc: Stefano Stabellini, Wei Liu, George.Dunlap@eu.citrix.com, Andrew Cooper, Ian Jackson, Tim Deegan, Jan Beulich, xen-devel

On Thu, 19 May 2016, Juergen Gross wrote:
> On 19/05/16 12:48, Stefano Stabellini wrote:
> > On Thu, 19 May 2016, Jan Beulich wrote:
> >>>>> On 19.05.16 at 12:40, <sstabellini@kernel.org> wrote:
> >>> On Thu, 19 May 2016, Juergen Gross wrote:
> >>>>> Alternatively, the easiest way will probably be to add a new VMASSIST,
> >>>>> which allows the guest to opt into the new behaviour.
> >>>>
> >>>> Aah, nice. Yes, this seems to be a sensible option.
> >>>
> >>> If you are referring to VM_ASSIST, it is only available on x86. I
> >>> suggest we use a feature flag instead.
> >>
> >> A feature flag can only be checked by the guest, not set. How
> >> about enabling VMASSIST for ARM?
> >
> > Sure
>
> Stefano, if you want I can do this when adding the VMASSIST option.

That would be great