* [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock @ 2011-02-04 9:49 Jan Kiszka 2011-02-04 21:03 ` Zachary Amsden 2011-02-10 10:40 ` Avi Kivity 0 siblings, 2 replies; 17+ messages in thread From: Jan Kiszka @ 2011-02-04 9:49 UTC (permalink / raw) To: Avi Kivity, Marcelo Tosatti Cc: kvm, Linux Kernel Mailing List, Zachary Amsden Code under this lock requires non-preemptibility. Ensure this also over -rt by converting it to raw spinlock. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/x86.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ffd7f8d..25e7734 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -448,7 +448,7 @@ struct kvm_arch { unsigned long irq_sources_bitmap; s64 kvmclock_offset; - spinlock_t tsc_write_lock; + raw_spinlock_t tsc_write_lock; u64 last_tsc_nsec; u64 last_tsc_offset; u64 last_tsc_write; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a7f65aa..d813919 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1017,7 +1017,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data) unsigned long flags; s64 sdiff; - spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags); + raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags); offset = data - native_read_tsc(); ns = get_kernel_ns(); elapsed = ns - kvm->arch.last_tsc_nsec; @@ -1050,7 +1050,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data) kvm->arch.last_tsc_write = data; kvm->arch.last_tsc_offset = offset; kvm_x86_ops->write_tsc_offset(vcpu, offset); - spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags); + raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags); /* Reset of TSC must disable overshoot protection below */ vcpu->arch.hv_clock.tsc_timestamp = 0; @@ -6009,7 +6009,7 @@ int kvm_arch_init_vm(struct kvm *kvm) /* Reserve bit 0 of irq_sources_bitmap for userspace irq source */ set_bit(KVM_USERSPACE_IRQ_SOURCE_ID, &kvm->arch.irq_sources_bitmap); - spin_lock_init(&kvm->arch.tsc_write_lock); + raw_spin_lock_init(&kvm->arch.tsc_write_lock); return 0; } -- 1.7.1 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-04 9:49 [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock Jan Kiszka @ 2011-02-04 21:03 ` Zachary Amsden 2011-02-07 11:35 ` Jan Kiszka 2011-02-10 10:40 ` Avi Kivity 1 sibling, 1 reply; 17+ messages in thread From: Zachary Amsden @ 2011-02-04 21:03 UTC (permalink / raw) To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 02/04/2011 04:49 AM, Jan Kiszka wrote: > Code under this lock requires non-preemptibility. Ensure this also over > -rt by converting it to raw spinlock. > Oh dear, I had forgotten about that. I believe kvm_lock might have the same assumption in a few places regarding clock. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-04 21:03 ` Zachary Amsden @ 2011-02-07 11:35 ` Jan Kiszka 2011-02-07 14:11 ` Zachary Amsden 0 siblings, 1 reply; 17+ messages in thread From: Jan Kiszka @ 2011-02-07 11:35 UTC (permalink / raw) To: Zachary Amsden Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 2011-02-04 22:03, Zachary Amsden wrote: > On 02/04/2011 04:49 AM, Jan Kiszka wrote: >> Code under this lock requires non-preemptibility. Ensure this also over >> -rt by converting it to raw spinlock. >> > > Oh dear, I had forgotten about that. I believe kvm_lock might have the > same assumption in a few places regarding clock. I only found a problematic section in kvmclock_cpufreq_notifier. Didn't see this during my tests as I have CPUFREQ disabled in my .config. We may need something like this as converting kvm_lock would likely be overkill: diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 36f54fb..971ee0d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4530,7 +4530,7 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va struct cpufreq_freqs *freq = data; struct kvm *kvm; struct kvm_vcpu *vcpu; - int i, send_ipi = 0; + int i, me, send_ipi = 0; /* * We allow guests to temporarily run on slowing clocks, @@ -4583,9 +4583,11 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va kvm_for_each_vcpu(i, vcpu, kvm) { if (vcpu->cpu != freq->cpu) continue; + me = get_cpu(); kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); - if (vcpu->cpu != smp_processor_id()) + if (vcpu->cpu != me) send_ipi = 1; + put_cpu(); } } spin_unlock(&kvm_lock); Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-07 11:35 ` Jan Kiszka @ 2011-02-07 14:11 ` Zachary Amsden 2011-02-07 15:00 ` Jan Kiszka 0 siblings, 1 reply; 17+ messages in thread From: Zachary Amsden @ 2011-02-07 14:11 UTC (permalink / raw) To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 02/07/2011 06:35 AM, Jan Kiszka wrote: > On 2011-02-04 22:03, Zachary Amsden wrote: > >> On 02/04/2011 04:49 AM, Jan Kiszka wrote: >> >>> Code under this lock requires non-preemptibility. Ensure this also over >>> -rt by converting it to raw spinlock. >>> >>> >> Oh dear, I had forgotten about that. I believe kvm_lock might have the >> same assumption in a few places regarding clock. >> > I only found a problematic section in kvmclock_cpufreq_notifier. Didn't > see this during my tests as I have CPUFREQ disabled in my .config. > > We may need something like this as converting kvm_lock would likely be > overkill: > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 36f54fb..971ee0d 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -4530,7 +4530,7 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va > struct cpufreq_freqs *freq = data; > struct kvm *kvm; > struct kvm_vcpu *vcpu; > - int i, send_ipi = 0; > + int i, me, send_ipi = 0; > > /* > * We allow guests to temporarily run on slowing clocks, > @@ -4583,9 +4583,11 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va > kvm_for_each_vcpu(i, vcpu, kvm) { > if (vcpu->cpu != freq->cpu) > continue; > + me = get_cpu(); > kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); > - if (vcpu->cpu != smp_processor_id()) > + if (vcpu->cpu != me) > send_ipi = 1; > + put_cpu(); > } > } > spin_unlock(&kvm_lock); > > Jan > > That looks like a good solution, and I do believe that is the only place the lock is used in that fashion - please add a comment though in the giant comment block above that preemption protection is needed for RT. Also, gcc should catch this, but moving the me variable into the kvm_for_each_vcpu loop should allow for better register allocation. The only other thing I can think of is that RT lock preemption may break some of the CPU initialization semantics enforced by kvm_lock if you happen to get a hotplug event just as the module is loading. That should be rare, but if it is indeed a bug, it would be nice to fix, it would be a panic for sure not to initialize VMX. Cheers, Zach ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-07 14:11 ` Zachary Amsden @ 2011-02-07 15:00 ` Jan Kiszka 2011-02-07 15:15 ` Zachary Amsden 0 siblings, 1 reply; 17+ messages in thread From: Jan Kiszka @ 2011-02-07 15:00 UTC (permalink / raw) To: Zachary Amsden Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 2011-02-07 15:11, Zachary Amsden wrote: > On 02/07/2011 06:35 AM, Jan Kiszka wrote: >> On 2011-02-04 22:03, Zachary Amsden wrote: >> >>> On 02/04/2011 04:49 AM, Jan Kiszka wrote: >>> >>>> Code under this lock requires non-preemptibility. Ensure this also over >>>> -rt by converting it to raw spinlock. >>>> >>>> >>> Oh dear, I had forgotten about that. I believe kvm_lock might have the >>> same assumption in a few places regarding clock. >>> >> I only found a problematic section in kvmclock_cpufreq_notifier. Didn't >> see this during my tests as I have CPUFREQ disabled in my .config. >> >> We may need something like this as converting kvm_lock would likely be >> overkill: >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index 36f54fb..971ee0d 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -4530,7 +4530,7 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va >> struct cpufreq_freqs *freq = data; >> struct kvm *kvm; >> struct kvm_vcpu *vcpu; >> - int i, send_ipi = 0; >> + int i, me, send_ipi = 0; >> >> /* >> * We allow guests to temporarily run on slowing clocks, >> @@ -4583,9 +4583,11 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va >> kvm_for_each_vcpu(i, vcpu, kvm) { >> if (vcpu->cpu != freq->cpu) >> continue; >> + me = get_cpu(); >> kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); >> - if (vcpu->cpu != smp_processor_id()) >> + if (vcpu->cpu != me) >> send_ipi = 1; >> + put_cpu(); >> } >> } >> spin_unlock(&kvm_lock); >> >> Jan >> >> > > That looks like a good solution, and I do believe that is the only place > the lock is used in that fashion - please add a comment though in the > giant comment block above that preemption protection is needed for RT. > Also, gcc should catch this, but moving the me variable into the > kvm_for_each_vcpu loop should allow for better register allocation. > > The only other thing I can think of is that RT lock preemption may break > some of the CPU initialization semantics enforced by kvm_lock if you > happen to get a hotplug event just as the module is loading. That > should be rare, but if it is indeed a bug, it would be nice to fix, it > would be a panic for sure not to initialize VMX. Hmm, is a cpu hotplug notifier allowed to run sleepy code? Can't imagine. So we already have a strong reason to convert kvm_lock to a raw_spinlock which obsoletes the above workaround. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-07 15:00 ` Jan Kiszka @ 2011-02-07 15:15 ` Zachary Amsden 2011-02-07 15:38 ` Jan Kiszka 0 siblings, 1 reply; 17+ messages in thread From: Zachary Amsden @ 2011-02-07 15:15 UTC (permalink / raw) To: Jan Kiszka; +Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 02/07/2011 10:00 AM, Jan Kiszka wrote: > On 2011-02-07 15:11, Zachary Amsden wrote: > >> On 02/07/2011 06:35 AM, Jan Kiszka wrote: >> >>> On 2011-02-04 22:03, Zachary Amsden wrote: >>> >>> >>>> On 02/04/2011 04:49 AM, Jan Kiszka wrote: >>>> >>>> >>>>> Code under this lock requires non-preemptibility. Ensure this also over >>>>> -rt by converting it to raw spinlock. >>>>> >>>>> >>>>> >>>> Oh dear, I had forgotten about that. I believe kvm_lock might have the >>>> same assumption in a few places regarding clock. >>>> >>>> >>> I only found a problematic section in kvmclock_cpufreq_notifier. Didn't >>> see this during my tests as I have CPUFREQ disabled in my .config. >>> >>> We may need something like this as converting kvm_lock would likely be >>> overkill: >>> >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >>> index 36f54fb..971ee0d 100644 >>> --- a/arch/x86/kvm/x86.c >>> +++ b/arch/x86/kvm/x86.c >>> @@ -4530,7 +4530,7 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va >>> struct cpufreq_freqs *freq = data; >>> struct kvm *kvm; >>> struct kvm_vcpu *vcpu; >>> - int i, send_ipi = 0; >>> + int i, me, send_ipi = 0; >>> >>> /* >>> * We allow guests to temporarily run on slowing clocks, >>> @@ -4583,9 +4583,11 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va >>> kvm_for_each_vcpu(i, vcpu, kvm) { >>> if (vcpu->cpu != freq->cpu) >>> continue; >>> + me = get_cpu(); >>> kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); >>> - if (vcpu->cpu != smp_processor_id()) >>> + if (vcpu->cpu != me) >>> send_ipi = 1; >>> + put_cpu(); >>> } >>> } >>> spin_unlock(&kvm_lock); >>> >>> Jan >>> >>> >>> >> That looks like a good solution, and I do believe that is the only place >> the lock is used in that fashion - please add a comment though in the >> giant comment block above that preemption protection is needed for RT. >> Also, gcc should catch this, but moving the me variable into the >> kvm_for_each_vcpu loop should allow for better register allocation. >> >> The only other thing I can think of is that RT lock preemption may break >> some of the CPU initialization semantics enforced by kvm_lock if you >> happen to get a hotplug event just as the module is loading. That >> should be rare, but if it is indeed a bug, it would be nice to fix, it >> would be a panic for sure not to initialize VMX. >> > Hmm, is a cpu hotplug notifier allowed to run sleepy code? Can't > imagine. So we already have a strong reason to convert kvm_lock to a > raw_spinlock which obsoletes the above workaround. > I don't know as it is allowed to sleep, it doesn't call any sleeping functions to my knowledge. What worries me in the RT case is that the spinlock acquired for hardware_enable might be preempted and run on another CPU, which obviously isn't what you want. Zach ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-07 15:15 ` Zachary Amsden @ 2011-02-07 15:38 ` Jan Kiszka 2011-02-07 15:52 ` Avi Kivity 0 siblings, 1 reply; 17+ messages in thread From: Jan Kiszka @ 2011-02-07 15:38 UTC (permalink / raw) To: Zachary Amsden Cc: Avi Kivity, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 2011-02-07 16:15, Zachary Amsden wrote: > On 02/07/2011 10:00 AM, Jan Kiszka wrote: >> On 2011-02-07 15:11, Zachary Amsden wrote: >> >>> On 02/07/2011 06:35 AM, Jan Kiszka wrote: >>> >>>> On 2011-02-04 22:03, Zachary Amsden wrote: >>>> >>>> >>>>> On 02/04/2011 04:49 AM, Jan Kiszka wrote: >>>>> >>>>> >>>>>> Code under this lock requires non-preemptibility. Ensure this also over >>>>>> -rt by converting it to raw spinlock. >>>>>> >>>>>> >>>>>> >>>>> Oh dear, I had forgotten about that. I believe kvm_lock might have the >>>>> same assumption in a few places regarding clock. >>>>> >>>>> >>>> I only found a problematic section in kvmclock_cpufreq_notifier. Didn't >>>> see this during my tests as I have CPUFREQ disabled in my .config. >>>> >>>> We may need something like this as converting kvm_lock would likely be >>>> overkill: >>>> >>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >>>> index 36f54fb..971ee0d 100644 >>>> --- a/arch/x86/kvm/x86.c >>>> +++ b/arch/x86/kvm/x86.c >>>> @@ -4530,7 +4530,7 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va >>>> struct cpufreq_freqs *freq = data; >>>> struct kvm *kvm; >>>> struct kvm_vcpu *vcpu; >>>> - int i, send_ipi = 0; >>>> + int i, me, send_ipi = 0; >>>> >>>> /* >>>> * We allow guests to temporarily run on slowing clocks, >>>> @@ -4583,9 +4583,11 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va >>>> kvm_for_each_vcpu(i, vcpu, kvm) { >>>> if (vcpu->cpu != freq->cpu) >>>> continue; >>>> + me = get_cpu(); >>>> kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); >>>> - if (vcpu->cpu != smp_processor_id()) >>>> + if (vcpu->cpu != me) >>>> send_ipi = 1; >>>> + put_cpu(); >>>> } >>>> } >>>> spin_unlock(&kvm_lock); >>>> >>>> Jan >>>> >>>> >>>> >>> That looks like a good solution, and I do believe that is the only place >>> the lock is used in that fashion - please add a comment though in the >>> giant comment block above that preemption protection is needed for RT. >>> Also, gcc should catch this, but moving the me variable into the >>> kvm_for_each_vcpu loop should allow for better register allocation. >>> >>> The only other thing I can think of is that RT lock preemption may break >>> some of the CPU initialization semantics enforced by kvm_lock if you >>> happen to get a hotplug event just as the module is loading. That >>> should be rare, but if it is indeed a bug, it would be nice to fix, it >>> would be a panic for sure not to initialize VMX. >>> >> Hmm, is a cpu hotplug notifier allowed to run sleepy code? Can't >> imagine. So we already have a strong reason to convert kvm_lock to a >> raw_spinlock which obsoletes the above workaround. >> > > I don't know as it is allowed to sleep, it doesn't call any sleeping > functions to my knowledge. What worries me in the RT case is that the > spinlock acquired for hardware_enable might be preempted and run on > another CPU, which obviously isn't what you want. I see now, there are calls to raw_smp_processor_id. I think it's best to make this a raw lock. At this chance, some read-only users of vm_list should be rcu'ified. Will have a look. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-07 15:38 ` Jan Kiszka @ 2011-02-07 15:52 ` Avi Kivity 2011-02-07 15:58 ` Jan Kiszka 0 siblings, 1 reply; 17+ messages in thread From: Avi Kivity @ 2011-02-07 15:52 UTC (permalink / raw) To: Jan Kiszka Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 02/07/2011 05:38 PM, Jan Kiszka wrote: > > > > I don't know as it is allowed to sleep, it doesn't call any sleeping > > functions to my knowledge. What worries me in the RT case is that the > > spinlock acquired for hardware_enable might be preempted and run on > > another CPU, which obviously isn't what you want. > > I see now, there are calls to raw_smp_processor_id. > > I think it's best to make this a raw lock. At this chance, some > read-only users of vm_list should be rcu'ified. Will have a look. vm_list is rarely used, for either read or write. I don't see the need to rcu it. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-07 15:52 ` Avi Kivity @ 2011-02-07 15:58 ` Jan Kiszka 2011-02-07 16:26 ` Avi Kivity 0 siblings, 1 reply; 17+ messages in thread From: Jan Kiszka @ 2011-02-07 15:58 UTC (permalink / raw) To: Avi Kivity Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 2011-02-07 16:52, Avi Kivity wrote: > On 02/07/2011 05:38 PM, Jan Kiszka wrote: >>> >>> I don't know as it is allowed to sleep, it doesn't call any sleeping >>> functions to my knowledge. What worries me in the RT case is that the >>> spinlock acquired for hardware_enable might be preempted and run on >>> another CPU, which obviously isn't what you want. >> >> I see now, there are calls to raw_smp_processor_id. >> >> I think it's best to make this a raw lock. At this chance, some >> read-only users of vm_list should be rcu'ified. Will have a look. > > vm_list is rarely used, for either read or write. I don't see the need > to rcu it. Avoid that code under this lock expands the preempt-disabled period, specifically under -rt, and specifically as the number of objects over which we loop is user-defined. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-07 15:58 ` Jan Kiszka @ 2011-02-07 16:26 ` Avi Kivity 2011-02-07 16:59 ` Jan Kiszka 0 siblings, 1 reply; 17+ messages in thread From: Avi Kivity @ 2011-02-07 16:26 UTC (permalink / raw) To: Jan Kiszka Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 02/07/2011 05:58 PM, Jan Kiszka wrote: > On 2011-02-07 16:52, Avi Kivity wrote: > > On 02/07/2011 05:38 PM, Jan Kiszka wrote: > >>> > >>> I don't know as it is allowed to sleep, it doesn't call any sleeping > >>> functions to my knowledge. What worries me in the RT case is that the > >>> spinlock acquired for hardware_enable might be preempted and run on > >>> another CPU, which obviously isn't what you want. > >> > >> I see now, there are calls to raw_smp_processor_id. > >> > >> I think it's best to make this a raw lock. At this chance, some > >> read-only users of vm_list should be rcu'ified. Will have a look. > > > > vm_list is rarely used, for either read or write. I don't see the need > > to rcu it. > > Avoid that code under this lock expands the preempt-disabled period, > specifically under -rt, and specifically as the number of objects over > which we loop is user-defined. Good point; even under non-rt. (well, actually, cpufreq_notifier and kvm_arch_hardware_enable are already non preemptible, and the stats code should just go away?) -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-07 16:26 ` Avi Kivity @ 2011-02-07 16:59 ` Jan Kiszka 2011-02-07 17:10 ` Avi Kivity 0 siblings, 1 reply; 17+ messages in thread From: Jan Kiszka @ 2011-02-07 16:59 UTC (permalink / raw) To: Avi Kivity Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 2011-02-07 17:26, Avi Kivity wrote: > On 02/07/2011 05:58 PM, Jan Kiszka wrote: >> On 2011-02-07 16:52, Avi Kivity wrote: >>> On 02/07/2011 05:38 PM, Jan Kiszka wrote: >>>>> >>>>> I don't know as it is allowed to sleep, it doesn't call any sleeping >>>>> functions to my knowledge. What worries me in the RT case is that the >>>>> spinlock acquired for hardware_enable might be preempted and run on >>>>> another CPU, which obviously isn't what you want. >>>> >>>> I see now, there are calls to raw_smp_processor_id. >>>> >>>> I think it's best to make this a raw lock. At this chance, some >>>> read-only users of vm_list should be rcu'ified. Will have a look. >>> >>> vm_list is rarely used, for either read or write. I don't see the need >>> to rcu it. >> >> Avoid that code under this lock expands the preempt-disabled period, >> specifically under -rt, and specifically as the number of objects over >> which we loop is user-defined. > > Good point; even under non-rt. > > (well, actually, cpufreq_notifier and kvm_arch_hardware_enable are > already non preemptible, and the stats code should just go away?) The stats code is trivial to convert, so it doesn't matter. But what about mmu_shrink and its list_move_tail? How is this synchronized against kvm_destroy_vm - already today? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-07 16:59 ` Jan Kiszka @ 2011-02-07 17:10 ` Avi Kivity 2011-02-07 17:23 ` Jan Kiszka 0 siblings, 1 reply; 17+ messages in thread From: Avi Kivity @ 2011-02-07 17:10 UTC (permalink / raw) To: Jan Kiszka Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 02/07/2011 06:59 PM, Jan Kiszka wrote: > > > > (well, actually, cpufreq_notifier and kvm_arch_hardware_enable are > > already non preemptible, and the stats code should just go away?) > > The stats code is trivial to convert, so it doesn't matter. Removal is easier. > But what about mmu_shrink and its list_move_tail? How is this > synchronized against kvm_destroy_vm - already today? kvm_destroy_vm() takes kvm_lock. If a vm is destroyed before mmu_shrink(), mmu_shrink() will never see it. If we reach mmu_shrink() before kvm_destroy_vm(), the latter will wait until mmu_shrink() is done. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-07 17:10 ` Avi Kivity @ 2011-02-07 17:23 ` Jan Kiszka 2011-02-08 9:15 ` Avi Kivity 0 siblings, 1 reply; 17+ messages in thread From: Jan Kiszka @ 2011-02-07 17:23 UTC (permalink / raw) To: Avi Kivity Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 2011-02-07 18:10, Avi Kivity wrote: > On 02/07/2011 06:59 PM, Jan Kiszka wrote: >>> >>> (well, actually, cpufreq_notifier and kvm_arch_hardware_enable are >>> already non preemptible, and the stats code should just go away?) >> >> The stats code is trivial to convert, so it doesn't matter. > > Removal is easier. Is that stat interface no longer used? > >> But what about mmu_shrink and its list_move_tail? How is this >> synchronized against kvm_destroy_vm - already today? > > kvm_destroy_vm() takes kvm_lock. If a vm is destroyed before > mmu_shrink(), mmu_shrink() will never see it. If we reach mmu_shrink() > before kvm_destroy_vm(), the latter will wait until mmu_shrink() is done. > Ah, I was confused. Would require some more logic if we wanted to make the loop lock-less, though. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-07 17:23 ` Jan Kiszka @ 2011-02-08 9:15 ` Avi Kivity 2011-02-08 9:55 ` Jan Kiszka 0 siblings, 1 reply; 17+ messages in thread From: Avi Kivity @ 2011-02-08 9:15 UTC (permalink / raw) To: Jan Kiszka Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 02/07/2011 07:23 PM, Jan Kiszka wrote: > On 2011-02-07 18:10, Avi Kivity wrote: > > On 02/07/2011 06:59 PM, Jan Kiszka wrote: > >>> > >>> (well, actually, cpufreq_notifier and kvm_arch_hardware_enable are > >>> already non preemptible, and the stats code should just go away?) > >> > >> The stats code is trivial to convert, so it doesn't matter. > > > > Removal is easier. > > Is that stat interface no longer used? It's there for compatibility. I'm itching to remove it. See qemu-kvm.git/kvm/kvm_stat for the only known user, and for the replacement via tracepoints. Tracepoints have marginally lower overhead when disabled, and somewhat higher overhead when enabled. A disadvantage of tracepoints is that it is harder to associate an event with a vm when that event is triggered by a workqueue, but I don't think it matters in practice (kvm_stat doesn't even provide a per-vm breakdown). > > > >> But what about mmu_shrink and its list_move_tail? How is this > >> synchronized against kvm_destroy_vm - already today? > > > > kvm_destroy_vm() takes kvm_lock. If a vm is destroyed before > > mmu_shrink(), mmu_shrink() will never see it. If we reach mmu_shrink() > > before kvm_destroy_vm(), the latter will wait until mmu_shrink() is done. > > > > Ah, I was confused. Would require some more logic if we wanted to make > the loop lock-less, though. Yes, the usual rcu_read_lock() / grab reference / rcu_read_unlock(). -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-08 9:15 ` Avi Kivity @ 2011-02-08 9:55 ` Jan Kiszka 2011-02-08 9:58 ` Avi Kivity 0 siblings, 1 reply; 17+ messages in thread From: Jan Kiszka @ 2011-02-08 9:55 UTC (permalink / raw) To: Avi Kivity Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 2011-02-08 10:15, Avi Kivity wrote: > On 02/07/2011 07:23 PM, Jan Kiszka wrote: >> On 2011-02-07 18:10, Avi Kivity wrote: >>> On 02/07/2011 06:59 PM, Jan Kiszka wrote: >>>>> >>>>> (well, actually, cpufreq_notifier and kvm_arch_hardware_enable are >>>>> already non preemptible, and the stats code should just go away?) >>>> >>>> The stats code is trivial to convert, so it doesn't matter. >>> >>> Removal is easier. >> >> Is that stat interface no longer used? > > It's there for compatibility. I'm itching to remove it. See > qemu-kvm.git/kvm/kvm_stat for the only known user, and for the > replacement via tracepoints. OK, but that will first take a grace period. > > Tracepoints have marginally lower overhead when disabled, and somewhat > higher overhead when enabled. A disadvantage of tracepoints is that it > is harder to associate an event with a vm when that event is triggered > by a workqueue, but I don't think it matters in practice (kvm_stat > doesn't even provide a per-vm breakdown). What about using the perf infrastructure for this? Besides that perf can reuse tracepoints, maybe there is even a more efficient way of added new stat sources. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-08 9:55 ` Jan Kiszka @ 2011-02-08 9:58 ` Avi Kivity 0 siblings, 0 replies; 17+ messages in thread From: Avi Kivity @ 2011-02-08 9:58 UTC (permalink / raw) To: Jan Kiszka Cc: Zachary Amsden, Marcelo Tosatti, kvm, Linux Kernel Mailing List On 02/08/2011 11:55 AM, Jan Kiszka wrote: > > > > Tracepoints have marginally lower overhead when disabled, and somewhat > > higher overhead when enabled. A disadvantage of tracepoints is that it > > is harder to associate an event with a vm when that event is triggered > > by a workqueue, but I don't think it matters in practice (kvm_stat > > doesn't even provide a per-vm breakdown). > > What about using the perf infrastructure for this? Besides that perf can > reuse tracepoints, maybe there is even a more efficient way of added new > stat sources. We are using the perf infrastructure for this (using tracepoints). -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock 2011-02-04 9:49 [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock Jan Kiszka 2011-02-04 21:03 ` Zachary Amsden @ 2011-02-10 10:40 ` Avi Kivity 1 sibling, 0 replies; 17+ messages in thread From: Avi Kivity @ 2011-02-10 10:40 UTC (permalink / raw) To: Jan Kiszka Cc: Marcelo Tosatti, kvm, Linux Kernel Mailing List, Zachary Amsden On 02/04/2011 11:49 AM, Jan Kiszka wrote: > Code under this lock requires non-preemptibility. Ensure this also over > -rt by converting it to raw spinlock. Applied, thanks. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2011-02-10 10:40 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-02-04 9:49 [PATCH] KVM: x86: Convert tsc_write_lock to raw_spinlock Jan Kiszka 2011-02-04 21:03 ` Zachary Amsden 2011-02-07 11:35 ` Jan Kiszka 2011-02-07 14:11 ` Zachary Amsden 2011-02-07 15:00 ` Jan Kiszka 2011-02-07 15:15 ` Zachary Amsden 2011-02-07 15:38 ` Jan Kiszka 2011-02-07 15:52 ` Avi Kivity 2011-02-07 15:58 ` Jan Kiszka 2011-02-07 16:26 ` Avi Kivity 2011-02-07 16:59 ` Jan Kiszka 2011-02-07 17:10 ` Avi Kivity 2011-02-07 17:23 ` Jan Kiszka 2011-02-08 9:15 ` Avi Kivity 2011-02-08 9:55 ` Jan Kiszka 2011-02-08 9:58 ` Avi Kivity 2011-02-10 10:40 ` Avi Kivity
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).