From: "Alex Bennée" <alex.bennee@linaro.org>
To: Dave Martin <Dave.Martin@arm.com>
Cc: Okamoto Takayuki <tokamoto@jp.fujitsu.com>,
	Christoffer Dall <cdall@kernel.org>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers
Date: Tue, 20 Nov 2018 15:30:29 +0000
Message-ID: <87muq3ix62.fsf@linaro.org>
In-Reply-To: <20181120141659.GZ3505@e103592.cambridge.arm.com>


Dave Martin <Dave.Martin@arm.com> writes:

<snip>
>> >> > @@ -404,10 +444,11 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
>> >> >  	 * and restore the guest context lazily.
>> >> >  	 * If FP/SIMD is not implemented, handle the trap and inject an
>> >> >  	 * undefined instruction exception to the guest.
>> >> > +	 * Similarly for trapped SVE accesses.
>> >> >  	 */
>> >> > -	if (system_supports_fpsimd() &&
>> >> > -	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
>> >> > -		return __hyp_switch_fpsimd(vcpu);
>> >> > +	guest_has_sve = vcpu_has_sve(vcpu);
>> >>
>> >> I'm not sure if it's worth fishing this out here given you are already
>> >> passing vcpu down the chain.
>> >
>> > I wanted to discourage GCC from recomputing this.  If you're in a
>> > position to do so, can you look at the disassembly with/without this
>> > factored out and see whether it makes a difference?
>>
>> Hmm it is hard to tell. There is code motion but for some reason I'm
>> seeing the static jump code unrolled, for example (original on left):
>>
>> __hyp_switch_fpsimd():                                                                  __hyp_switch_fpsimd():
>> /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
>>                                                                                       >  ----:  tst     w0, #0x400000
>>                                                                                       >  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
>>                                                                                       > arch_static_branch_jump():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:45
>>                                                                                       >  ----:  b       38c <fixup_guest_exit+0x304>
>>                                                                                       > arch_static_branch():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/include/asm/jump_label.h:31
>>                                                                                       >  ----:  nop
>>                                                                                       >  ----:  b       22c <fixup_guest_exit+0x1a4>
>>                                                                                       > test_bit():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/include/asm-generic/bitops/non-atomic.h:106
>>                                                                                       >  ----:  adrp    x0, 0 <cpu_hwcaps>
>>                                                                                       >  ----:  ldr     x0, [x0]
>>                                                                                       > __hyp_switch_fpsimd():
>>                                                                                       > /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:381
>>  ----:  tst     w0, #0x400000                                                            ----:  tst     w0, #0x400000
>>  ----:  b.eq    238 <fixup_guest_exit+0x1b0>  // b.none                               |  ----:  b.eq    22c <fixup_guest_exit+0x1a4>  // b.none
>>  ----:  cbz     w21, 238 <fixup_guest_exit+0x1b0>                                     |  ----:  tbz     w2, #5, 22c <fixup_guest_exit+0x1a4>
>> /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:383                      | /home/alex/lsrc/kvm/linux.git/arch/arm64/kvm/hyp/switch.----:382
>>  ----:  ldr     w2, [x19, #2040]                                                      |  ----:  ldr     w2, [x20, #2040]
>>  ----:  add     x1, x19, #0x4b0                                                       |  ----:  add     x1, x20, #0x4b0
>>  ----:  ldr     x0, [x19, #2032]                                                      |  ----:  ldr     x0, [x20, #2032]
>> sve_ffr_offset():                                                                       sve_ffr_offset():
>>
>> Putting the calculation of guest_has_sve at the top of
>> __hyp_switch_fpsimd makes most of that go away and just moves things
>> around a little bit. So I guess it could make sense for the fast(ish)
>> path, although I'd be interested in knowing if it made any real
>> difference to the numbers.
>> After all the first read should be well cached and moving it through the
>> stack is just additional memory and register pressure.
>
> Hmmm, I will have a think about this when I respin.
>
> Explicitly caching guest_has_sve() does reduce the compiler's freedom to
> optimise.
>
> We might be able to mark it as __pure or __attribute_const__ to enable
> the compiler to decide whether to cache the result, but this may not be
> 100% safe.
>
> Part of me would prefer to leave things as they are to avoid the risk of
> breaking the code again...
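
For reference, a minimal sketch of what the __pure option would look
like, outside the kernel. Everything here is a hypothetical stand-in
(struct vcpu_sketch, GUEST_HAS_SVE_BIT, vcpu_has_sve_sketch); the
kernel's __pure expands to the GCC attribute used below. With "pure"
the compiler may merge repeated calls as long as no intervening store
could change the result. __attribute_const__ is stricter: a const
function must not read memory through its pointer arguments, so it
would be wrong for a helper that dereferences vcpu.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the vcpu flags word; not the kernel layout. */
struct vcpu_sketch {
	unsigned long flags;
};

/* Hypothetical flag bit, chosen only for this illustration. */
#define GUEST_HAS_SVE_BIT (1UL << 5)

/*
 * "pure": no side effects, result depends only on the arguments and on
 * memory the function reads. The compiler is free to reuse an earlier
 * result instead of calling again.
 */
__attribute__((pure))
static bool vcpu_has_sve_sketch(const struct vcpu_sketch *vcpu)
{
	return (vcpu->flags & GUEST_HAS_SVE_BIT) != 0;
}

/* Two back-to-back checks; the second call is a candidate for merging. */
static int count_sve_checks(const struct vcpu_sketch *vcpu)
{
	int n = 0;

	if (vcpu_has_sve_sketch(vcpu))
		n++;
	if (vcpu_has_sve_sketch(vcpu))
		n++;
	return n;
}
```

Whether the calls actually get merged is up to the compiler, so the
disassembly would still need checking.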

Given that the only place you call __hyp_switch_fpsimd is here, you could
just roll it into __hyp_trap_is_fpsimd and have:

	if (__hyp_trap_is_fpsimd(vcpu))
		return true;
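
Roughly the shape I have in mind, as a standalone sketch rather than
real hyp code. The struct, the EC_* constants and the system_has_fpsimd
flag are stand-ins I made up for illustration, and the handling of an
SVE trap for a non-SVE guest is an assumption, not what the patch does.
The point is only that guest_has_sve is computed inside the one helper
instead of being passed down from fixup_guest_exit():

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for ESR_ELx_EC_FP_ASIMD and ESR_ELx_EC_SVE. */
#define EC_FP_ASIMD 0x07
#define EC_SVE      0x19

/* Hypothetical, heavily reduced vcpu; not struct kvm_vcpu. */
struct vcpu_stub {
	uint8_t trap_class;	/* exception class of the trapped access */
	bool has_sve;		/* guest was configured with SVE */
	bool fp_restored;	/* set once the lazy restore has run */
};

/* Stand-in for system_supports_fpsimd(). */
static bool system_has_fpsimd = true;

/*
 * Classify the trap and, if it is an FP/SIMD (or permitted SVE) access,
 * do the lazy restore in the same function. Returns true when the trap
 * was handled here and the guest can be re-entered.
 */
static bool hyp_trap_is_fpsimd(struct vcpu_stub *vcpu)
{
	bool guest_has_sve = vcpu->has_sve;	/* computed once, locally */
	uint8_t ec = vcpu->trap_class;

	if (!system_has_fpsimd)
		return false;

	if (ec != EC_FP_ASIMD && !(guest_has_sve && ec == EC_SVE))
		return false;

	/* The guest FP/SVE context restore would go here. */
	vcpu->fp_restored = true;
	return true;
}
```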

--
Alex Bennée