From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gleb Natapov
Subject: Re: APIC lookups
Date: Sat, 3 Sep 2011 10:32:08 +0300
Message-ID: <20110903073208.GK26451@redhat.com>
References: <1314986155.31676.22.camel@lappy> <20110902181323.GJ26451@redhat.com> <1314990522.31676.30.camel@lappy>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm
To: Sasha Levin
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:65088 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751854Ab1ICHcL (ORCPT ); Sat, 3 Sep 2011 03:32:11 -0400
Content-Disposition: inline
In-Reply-To: <1314990522.31676.30.camel@lappy>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Fri, Sep 02, 2011 at 10:08:42PM +0300, Sasha Levin wrote:
> On Fri, 2011-09-02 at 21:13 +0300, Gleb Natapov wrote:
> > On Fri, Sep 02, 2011 at 08:55:55PM +0300, Sasha Levin wrote:
> > > Hi,
> > >
> > > I've noticed that kvm_irq_delivery_to_apic() is locating the destination
> > > APIC by running through kvm_for_each_vcpu() which becomes a scalability
> > > issue with a large number of vcpus.
> > >
> > > I'm thinking about speeding that up using a radix tree for lookups, and
> > > was wondering if it sounds right.
> > >
> > We have to call kvm_apic_match_dest() on each apic to see if it should
> > get the message. Single message can be sent to more than one apic. It is
> > likely possible to optimize common case of physical addressing fixed
> > destination, but then just use array of 256 elements, no need for a tree.
>
> I think it's also possible to handle it for logical addressing as well,
> instead of a simple compare we just need to go through all the IDs that
> would 'and' with the dest.
>
There are two kinds of logical addressing: flat and cluster, and I see
nothing that prevents different CPUs from being in different modes. It
is better to cache the lookup result in the irq routing entry to speed
up subsequent interrupts.

> > Do you see this function in profiling?
> >
> I was running profiling to see which functions get much slower during
> regular operation (not boot) when you run with large amount of vcpus,
> and this was one of them.
>
> Though this is probably due to the method we use to find lowest priority
> and not the lookups themselves.
>
Currently we round-robin between all CPUs on each interrupt when lowest
priority delivery is used. We should rotate only once every N
interrupts, where N >> 1.

--
Gleb.
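
To make the two suggestions above concrete, here is a rough userspace
sketch in C, not kernel code. It shows a 256-entry table indexed by
physical APIC ID for the fixed-destination fast path, and a round-robin
arbiter that moves to the next CPU only after N interrupts. All names
here (struct vcpu, struct apic_map, struct lowprio_rr and the helper
functions) are made up for illustration and are not existing KVM
symbols. The sketch assumes 8-bit xAPIC IDs with 0xff as broadcast, and
it does not cover logical (flat or cluster) addressing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define APIC_ID_SLOTS  256   /* assumption: 8-bit xAPIC physical IDs */
#define APIC_BROADCAST 0xff  /* physical-mode broadcast destination */

/* Hypothetical stand-in for a vcpu; not the kernel's struct kvm_vcpu. */
struct vcpu {
    int index;
};

/* Hypothetical per-VM table: physical APIC ID -> vcpu, NULL when unused. */
struct apic_map {
    struct vcpu *phys[APIC_ID_SLOTS];
};

void apic_map_set(struct apic_map *map, uint8_t apic_id, struct vcpu *v)
{
    map->phys[apic_id] = v;
}

/*
 * Fast path for a physical, non-broadcast destination: one array index
 * instead of a walk over every vcpu. Returns NULL when the caller has to
 * fall back to the full scan (broadcast, or no vcpu with that ID).
 */
struct vcpu *apic_map_lookup(const struct apic_map *map, uint8_t dest)
{
    if (dest == APIC_BROADCAST)
        return NULL;
    return map->phys[dest];
}

/* Hypothetical arbiter state for lowest priority delivery. */
struct lowprio_rr {
    unsigned int current;  /* index of the vcpu currently receiving */
    unsigned int count;    /* interrupts sent to 'current' so far */
    unsigned int period;   /* N: interrupts per vcpu before rotating */
};

void lowprio_rr_init(struct lowprio_rr *rr, unsigned int period)
{
    assert(period > 0);
    rr->current = 0;
    rr->count = 0;
    rr->period = period;
}

/*
 * Pick the target index for the next interrupt among nr_vcpus candidates.
 * The same index is returned 'period' times in a row before advancing.
 */
unsigned int lowprio_rr_next(struct lowprio_rr *rr, unsigned int nr_vcpus)
{
    unsigned int target;

    assert(nr_vcpus > 0);
    if (rr->current >= nr_vcpus)  /* candidate set shrank since last call */
        rr->current = 0;

    target = rr->current;
    if (++rr->count >= rr->period) {
        rr->count = 0;
        rr->current = (rr->current + 1) % nr_vcpus;
    }
    return target;
}
```

With period set to 1 the arbiter behaves like the current scheme and
rotates on every interrupt. A larger period keeps consecutive interrupts
on the same CPU, which is the point of N >> 1.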