From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755317AbcB2PMt (ORCPT ); Mon, 29 Feb 2016 10:12:49 -0500
Received: from mx1.redhat.com ([209.132.183.28]:46853 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751099AbcB2PMr (ORCPT ); Mon, 29 Feb 2016 10:12:47 -0500
Subject: Re: [PATCH 0/3] KVM: Fix lost IRQ acks for RTC
To: Joerg Roedel , Gleb Natapov
References: <1456758285-25060-1-git-send-email-joro@8bytes.org>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
From: Paolo Bonzini
Message-ID: <56D45FEA.9030307@redhat.com>
Date: Mon, 29 Feb 2016 16:12:42 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0
MIME-Version: 1.0
In-Reply-To: <1456758285-25060-1-git-send-email-joro@8bytes.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 29/02/2016 16:04, Joerg Roedel wrote:
> Hi,
>
> here is a small patch-set to fix a race condition which
> happens when an RTC-IRQ is migrated to another VCPU while it
> is being handled by the guest.
>
> The RTC-EOI handling in KVM requires that all sent interrupt
> messages to the VCPUs need to be acked before another
> RTC-IRQ can be sent. When an EOI signal from the guest is
> lost, it will never see an RTC interrupt again (until it
> reboots).
>
> This is easily reproducible with a Linux guest executing
> this loop:
>
> 	$ while true;do time hwclock --show --test --debug;done
>
> When the guest has multiple vcpus and the RTC-IRQ is
> regularly migrated (e.g. by irqbalance), the race condition
> will be hit after some time and the hwclock tool will fail
> with:
>
> 	select() to /dev/rtc to wait for clock tick timed out...synchronization failed
>
> The race condition happens because of the way the EOI
> backtracking between local APIC and IOAPIC works in KVM. The
> destination VCPU and vector are part of the IOAPIC state.
> When the guest sends an EOI to the local APIC the vector is
> matched against the destinations stored in the IOAPIC and
> ACKed there too if it matches.
>
> The problem begins when a VCPU handles an RTC interrupt and
> at the same time another VCPU migrates the RTC-IRQ away from
> that VCPU. This updates the IOAPIC state in KVM to
> the new destination, so that the EOI sent from the first
> VCPU does not match anymore in the IOAPIC, hence losing the
> RTC-EOI.
>
> This patch-set fixes the race-condition by adding explicit
> back-tracking information for RTC-IRQs. The rtc_status
> struct already holds a dest_map bitmap to store which VCPUs
> received an RTC-IRQ. This is extended to also hold the
> vector that was sent to this VCPU.
>
> This information is then used to match EOI signals from the
> guest to the RTC. This explicit back-tracking fixes the
> issue.
>
> Regards,

Nice patches, really. Ok to wait until 4.6?

Paolo