public inbox for linux-arm-kernel@lists.infradead.org
From: ynorov@caviumnetworks.com (Yury Norov)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 32/37] KVM: arm/arm64: Handle VGICv2 save/restore from the main VGIC code
Date: Thu, 30 Nov 2017 15:09:41 +0300
Message-ID: <20171130120941.5vbpjmyrm2v2xcj4@yury-thinkpad>
In-Reply-To: <20171126194641.GP28855@cbox>

On Sun, Nov 26, 2017 at 08:46:41PM +0100, Christoffer Dall wrote:
> On Sun, Nov 26, 2017 at 01:29:30PM +0300, Yury Norov wrote:
> > On Wed, Nov 15, 2017 at 05:50:07PM +0000, Andre Przywara wrote:
> > > Hi,
> > > 
> > > those last few patches are actually helpful for the Xen port ...
> > 
> > [...] 
> > 
> > > > +static void save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
> > > > +{
> > > > +	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> > > > +	int nr_lr = kvm_vgic_global_state.nr_lr;
> > > > +	u32 elrsr0, elrsr1;
> > > > +
> > > > +	elrsr0 = readl_relaxed(base + GICH_ELRSR0);
> > > > +	if (unlikely(nr_lr > 32))
> > > > +		elrsr1 = readl_relaxed(base + GICH_ELRSR1);
> > > > +	else
> > > > +		elrsr1 = 0;
> > > > +
> > > > +#ifdef CONFIG_CPU_BIG_ENDIAN
> > > > +	cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
> > > > +#else
> > > > +	cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
> > > > +#endif
> > > 
> > > I have some gut feeling that this is really broken, since we mix up
> > > endian *byte* ordering with *bit* ordering here, don't we?
> > 
> > Good feeling indeed. :)
> > 
> > We have bitmap_{from,to}_u32array for things like this. But it was
> > considered badly designed, and I proposed a new bitmap_{from,to}_arr32().
> > 
> > https://lkml.org/lkml/2017/11/15/592
> > 
> > What else I have in mind is to introduce something like bitmap_{from,to}_pair_32(),
> > as most current users of bitmap_{from,to}_u32array() (and those who should
> > use it but don't, like this one) have only two 32-bit words to be copied
> > from/to a bitmap.
> > 
> > Also, it will be complementary to bitmap_from_u64().
> > 
> > More reading about bitmap/array conversion is in the comment for the
> > BITMAP_FROM_U64 macro.
> > 
> 
> I have no idea what you want to introduce here.  If you have an idea on
> how to improve the code, patches are welcome.

That's about Andre's gut feeling, not about your patch. I have some
ideas related to it and just wanted to share them with him, that's all.
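
To make the byte-versus-bit ordering point concrete, here is a minimal
user-space sketch, not kernel code. It assumes, as far as I can tell
correctly, that readl_relaxed() already returns the register value in
CPU byte order, so combining the two ELRSR words by shifting should not
depend on CONFIG_CPU_BIG_ENDIAN when the result is later tested with a
shifted mask. The helper names elrsr_pair_to_u64() and lr_is_empty()
are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: combine two 32-bit status words, already in CPU
 * byte order, into one 64-bit value. Bit n of the result describes
 * list register n. The shift is arithmetic, so the result is the same
 * on little-endian and big-endian CPUs.
 */
static uint64_t elrsr_pair_to_u64(uint32_t elrsr0, uint32_t elrsr1)
{
	return ((uint64_t)elrsr1 << 32) | elrsr0;
}

/*
 * Hypothetical helper: return 1 if the bit for list register 'lr' is
 * set (the LR is empty), 0 otherwise or if 'lr' is out of range.
 */
static int lr_is_empty(uint64_t elrsr, unsigned int lr)
{
	if (lr >= 64)
		return 0;
	return (int)((elrsr >> lr) & 1);
}
```

With elrsr0 = 0x5 and elrsr1 = 0x1, the combined value has bits 0, 2
and 32 set. The word order only starts to matter once the u64 is
reinterpreted as an array of unsigned long, which is where a dedicated
conversion helper would come in.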

> Please keep in mind that the purpose of this patch is to move code
> around to improve the GIC handling performance, not to change the
> lower-level details of the code.
>
> > > I understand it's just copied and gets removed later on, so I was
> > > wondering if you could actually move patch 35/37 ("Get rid of
> > > vgic_elrsr") before this patch here, to avoid copying bogus code around?
> > > Or does 35/37 depend on 34/37 to be correct?
> > > 
> > > > +}
> > > > +
> > > > +static void save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
> > > > +{
> > > > +	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> > > > +	int i;
> > > > +	u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
> > > > +
> > > > +	for (i = 0; i < used_lrs; i++) {
> > > > +		if (cpu_if->vgic_elrsr & (1UL << i))
> > 
> > So, vgic_elrsr is naturally a bitmap, and the bitmap API is preferred
> > if there are no other considerations:
> >                 if (test_bit(i, cpu_if->vgic_elrsr))
> > 
> > > > +			cpu_if->vgic_lr[i] &= ~GICH_LR_STATE;
> > > > +		else
> > > > +			cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
> > > > +
> > > > +		writel_relaxed(0, base + GICH_LR0 + (i * 4));
> > > > +	}
> > > > +}
> > 
> > I'd also consider using for_each_clear_bit() here:
> > 
> >         /*
> >          * Setup default vgic_lr values somewhere earlier.
> 
> Not sure what the 'default' values are.
> 
> >          * Not needed at all if you take my suggestion for
> >          * vgic_v2_restore_state() below
> >          */
> >         for (i = 0; i < used_lrs; i++)
> >                 cpu_if->vgic_lr[i] &= ~GICH_LR_STATE;
> > 
> > static void save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
> > {
> > 	[...]
> > 
> > 	for_each_clear_bit(i, cpu_if->vgic_elrsr, used_lrs)
> > 		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
> > 
> > 	for (i = 0; i < used_lrs; i++)
> > 		writel_relaxed(0, base + GICH_LR0 + (i * 4));
> > }
> > 
> > Not sure how performance-critical this path is, but sometimes things
> > get much faster with bitmaps.
> > 
> 
> Your suggestion below would require us to maintain elrsr when we setup
> list registers, and I don't really see the benefit.
 
That's what I was asking: is it maintained or not? If not, then it will
not work.
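
For reference, the dependency on elrsr can be seen in a small
user-space model of the save path quoted above. This is only a sketch:
struct lr_model, model_save_lrs(), NR_LRS and LR_STATE_MASK are
hypothetical stand-ins for the vgic_v2_cpu_if fields, save_lrs(), the
hardware LR count and GICH_LR_STATE, and the arrays replace the MMIO
accesses.

```c
#include <stdint.h>

#define NR_LRS        4                   /* illustrative LR count */
#define LR_STATE_MASK (UINT32_C(3) << 28) /* stand-in for GICH_LR_STATE */

/* Hypothetical model of the state touched by the save path. */
struct lr_model {
	uint32_t hw_lr[NR_LRS];    /* models the GICH_LRn registers */
	uint32_t saved_lr[NR_LRS]; /* models cpu_if->vgic_lr[] */
	uint64_t elrsr;            /* bit n set: LRn is empty */
};

/*
 * Mirror of the quoted save loop: an empty LR only has its state bits
 * cleared in the saved copy, a used LR is read back from "hardware",
 * and every used slot is zeroed afterwards.
 */
static void model_save_lrs(struct lr_model *m, unsigned int used_lrs)
{
	unsigned int i;

	if (used_lrs > NR_LRS)
		used_lrs = NR_LRS;

	for (i = 0; i < used_lrs; i++) {
		if ((m->elrsr >> i) & 1)
			m->saved_lr[i] &= ~LR_STATE_MASK;
		else
			m->saved_lr[i] = m->hw_lr[i];

		m->hw_lr[i] = 0;
	}
}
```

The result for each slot depends on the elrsr value sampled at save
time, so any scheme that consults elrsr later, on the restore side,
would need that value to stay valid in between.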
 
> > [...]
> > 
> > > > +void vgic_v2_restore_state(struct kvm_vcpu *vcpu)
> > > > +{
> > > > +	struct kvm *kvm = vcpu->kvm;
> > > > +	struct vgic_dist *vgic = &kvm->arch.vgic;
> > > > +	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> > > > +	void __iomem *base = vgic->vctrl_base;
> > > > +	u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
> > > > +	int i;
> > > > +
> > > > +	if (!base)
> > > > +		return;
> > > > +
> > > > +	if (used_lrs) {
> > > > +		writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
> > > > +		writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
> > > > +		for (i = 0; i < used_lrs; i++) {
> > > > +			writel_relaxed(cpu_if->vgic_lr[i],
> > > > +				       base + GICH_LR0 + (i * 4));
> > > > +		}
> > > > +	}
> > > > +}
> > 
> > The alternative approach would be:
> > 	for (i = 0; i < used_lrs; i++) {
> >                 if (test_bit(i, cpu_if->vgic_elrsr))
> >                         writel_relaxed(~GICH_LR_STATE, base + GICH_LR0 + (i * 4));
> >                 else
> >                         writel_relaxed(cpu_if->vgic_lr[i], base + GICH_LR0 + (i * 4));
> > 	}
> > 
> > If cpu_if->vgic_elrsr is untouched in between, of course. It will make
> > save_lrs() simpler and this function more verbose.
> > 
> I don't understand your suggestion.  As you will see later, we will get
> rid of storing the elrsr completely with a measurable performance
> improvement.

OK, now I see. Sorry for the stupid questions; I have only just started
learning the codebase. By the way, can you share the technique you use
to measure performance? It would be great if I could reproduce your
results.

> If you think you can improve the code beyond that, a follow-up patch
> would be most welcome.
> 
> Note that on all the implementations I'm familiar with, the maximum
> number of LRs is four, so we're not wading through massive bitmaps in
> practice here.
> 
> Thanks,
> -Christoffer
