From: Christoffer Dall <christoffer.dall@linaro.org>
To: Jia He <hejianet@gmail.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>,
	Jia He <jia.he@hxt-semitech.com>,
	kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM: arm/arm64: don't set vtimer->cnt_ctl in kvm_arch_timer_handler
Date: Thu, 14 Dec 2017 14:09:54 +0100
Message-ID: <20171214130954.GV910@cbox>
In-Reply-To: <dc95b58c-ee6c-e5c7-1f37-8f69c789a1fc@gmail.com>

On Thu, Dec 14, 2017 at 12:57:54PM +0800, Jia He wrote:
Hi Jia,

> 
> I have tried your newer level-mapped-v7 branch, but the bug is still there.
> 
> There is no special load on either the host or the guest. The guest
> (kernel 4.14) often hangs when booting.
> 
> the guest kernel log
> 
> [ OK ] Reached target Remote File Systems.
> Starting File System Check on /dev/mapper/fedora-root...
> [ OK ] Started File System Check on /dev/mapper/fedora-root.
> Mounting /sysroot...
> [ 2.670764] SGI XFS with ACLs, security attributes, no debug enabled
> [ 2.678180] XFS (dm-0): Mounting V5 Filesystem
> [ 2.740364] XFS (dm-0): Ending clean mount
> [ OK ] Mounted /sysroot.
> [ OK ] Reached target Initrd Root File System.
> Starting Reload Configuration from the Real Root...
> [ 61.288215] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 61.290791] 1-...!: (0 ticks this GP) idle=574/0/0 softirq=5/5 fqs=1
> [ 61.293664] (detected by 0, t=6002 jiffies, g=-263, c=-264, q=39760)
> [ 61.296480] Task dump for CPU 1:
> [ 61.297938] swapper/1 R running task 0 0 1 0x00000020
> [ 61.300643] Call trace:
> [ 61.301260] __switch_to+0x6c/0x78
> [ 61.302095] cpu_number+0x0/0x8
> [ 61.302867] rcu_sched kthread starved for 6000 jiffies!
> g18446744073709551353 c18446744073709551352 f0x0 RCU_GP_WAIT_FQS(3)
> ->state=0x402 ->cpu=1
> [ 61.305941] rcu_sched I 0 8 2 0x00000020
> [ 61.307250] Call trace:
> [ 61.307854] __switch_to+0x6c/0x78
> [ 61.308693] __schedule+0x268/0x8f0
> [ 61.309545] schedule+0x2c/0x88
> [ 61.310325] schedule_timeout+0x84/0x3b8
> [ 61.311278] rcu_gp_kthread+0x4d4/0x7d8
> [ 61.312213] kthread+0x134/0x138
> [ 61.313001] ret_from_fork+0x10/0x1c
> 
> Maybe my previous patch is not perfect enough, thanks for your comments.
> 
> I dug into it further. Do you think the code logic below is possibly
> problematic?
> 
> 
> vtimer_save_state        (vtimer->loaded = false, cntv_ctl is 0)
> kvm_arch_timer_handler   (read cntv_ctl and set vtimer->cnt_ctl = 0)
> vtimer_restore_state     (write vtimer->cnt_ctl to cntv_ctl, then
>                           cntv_ctl will be 0 forever)
> 
> 
> If above analysis is reasonable

Yes, I think there's something there if the hardware doesn't retire the
signal fast enough...

> how about the patch below? It is already
> tested on my arm64 server.
> 
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index f9555b1..ee6dd3f 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -99,7 +99,7 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>         }
>         vtimer = vcpu_vtimer(vcpu);
> 
> -       if (!vtimer->irq.level) {
> +       if (vtimer->loaded && !vtimer->irq.level) {
>                 vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
>                 if (kvm_timer_irq_can_fire(vtimer))
>                         kvm_timer_update_irq(vcpu, true, vtimer);
> 

There's nothing really wrong with that patch, I just didn't think it
would be necessary, as we really shouldn't see interrupts if the timer
is not loaded.  Can you confirm that a WARN_ON(!vtimer->loaded) in
kvm_arch_timer_handler() gives you a splat?
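
For reference, the ordering discussed above can be modelled in user space.
This is only a sketch under stated assumptions, not kernel code: the
struct and the model_* helpers are hypothetical names made up for
illustration, the "register" is a plain variable, and the save path is
assumed to copy the control register before disabling the timer. The
counter stands in for what a WARN_ON(!vtimer->loaded) in the handler
would report.

```c
/* User-space model only: names mirror the thread, nothing here is kernel code. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTL_ENABLE 0x1u	/* bit 0 of CNTV_CTL is the ENABLE bit */

struct model_vtimer {
	uint32_t hw_cntv_ctl;		/* stands in for the CNTV_CTL register */
	uint32_t cnt_ctl;		/* software copy, like vtimer->cnt_ctl */
	bool loaded;			/* like vtimer->loaded */
	bool irq_level;			/* like vtimer->irq.level */
	unsigned int unloaded_irqs;	/* times a WARN_ON(!loaded) would fire */
};

/* Assumed save path: copy the register, disable the timer, mark unloaded. */
static void model_save_state(struct model_vtimer *t)
{
	t->cnt_ctl = t->hw_cntv_ctl;
	t->hw_cntv_ctl = 0;
	t->loaded = false;
}

/* Restore path: write the software copy back to the register. */
static void model_restore_state(struct model_vtimer *t)
{
	t->hw_cntv_ctl = t->cnt_ctl;
	t->loaded = true;
}

/* Handler; check_loaded selects the guard from the proposed patch. */
static void model_irq_handler(struct model_vtimer *t, bool check_loaded)
{
	if (!t->loaded)
		t->unloaded_irqs++;
	if (check_loaded && !t->loaded)
		return;
	if (!t->irq_level)
		t->cnt_ctl = t->hw_cntv_ctl;	/* reads 0 after a save */
}

/*
 * Run save -> late interrupt -> restore and return the final register
 * value. If unloaded_irqs is non-NULL it receives the warning count.
 */
static uint32_t model_late_irq(bool check_loaded, unsigned int *unloaded_irqs)
{
	struct model_vtimer t = { .hw_cntv_ctl = CTL_ENABLE, .loaded = true };

	model_save_state(&t);
	model_irq_handler(&t, check_loaded);
	model_restore_state(&t);
	if (unloaded_irqs)
		*unloaded_irqs = t.unloaded_irqs;
	return t.hw_cntv_ctl;
}
```

In this model, model_late_irq(false, NULL) leaves the register at 0 after
the restore, while model_late_irq(true, NULL) keeps CTL_ENABLE. It says
nothing about whether real hardware can deliver the interrupt that late,
which is the open question here.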

Also, could you give the following a try (without your patch):

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 73d262c4712b..4751255345d1 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -367,6 +367,7 @@ static void vtimer_save_state(struct kvm_vcpu *vcpu)
 
 	/* Disable the virtual timer */
 	write_sysreg_el0(0, cntv_ctl);
+	isb();
 
 	vtimer->loaded = false;
 out:

Thanks,
-Christoffer
