From mboxrd@z Thu Jan 1 00:00:00 1970 From: Juergen Gross Subject: Re: [PATCH] cpupools: retry cpupool-destroy if domain in cpupool is dying Date: Wed, 14 May 2014 12:35:07 +0200 Message-ID: <537346DB.9070009@ts.fujitsu.com> References: <1399895385-18894-1-git-send-email-juergen.gross@ts.fujitsu.com> <53733DE5.9060406@ts.fujitsu.com> <53735E690200007800012106@mail.emea.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <53735E690200007800012106@mail.emea.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: George Dunlap , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org On 14.05.2014 12:15, Jan Beulich wrote: >>>> On 14.05.14 at 11:56, wrote: >> On 14.05.2014 11:16, George Dunlap wrote: >>> On Mon, May 12, 2014 at 12:49 PM, Juergen Gross >>> wrote: >>>> When a cpupool is destroyed just after the last domain has been stopped the >>>> domain might already be removed from the domain list without being removed >>>> from the cpupool. >>>> It is easy to detect this situation and to return EAGAIN in this case which >>>> is already handled in libxc by doing a retry. >>> >>> OK, I hate to be picky over two lines, but it still seems to me like >>> this is papering over issues instead of dealing with them properly. >>> The real problem here is that "for_each_domain_in_cpupool()" doesn't >>> actually go over every domain in the cpupool. Instead of making it so >>> that it actually does, you're compensating for that fact in an ad-hoc >>> fashion. >>> >>> Now as it happens, it looks like all the other current uses of >>> for_each_domain_in_cpupool() work just fine if there are domains in >>> the pool it doesn't see, as long as they're about to disappear. But >>> we've already seen a bug caused because of a situation where "don't >>> see domains that are about to disappear" *does* actually cause a >>> problem; working around it is just setting a trap for future >>> developers to fall into. (And who knows, there may already be a bug >>> we haven't discovered in the other invocations of >>> for_each_domain_in_cpupool()). >> >> This isn't unique to for_each_domain_in_cpupool(). It is a problem for all >> uses of for_each_domain() which are related to resources freed only in >> complete_domain_destroy(). > > Of which there shouldn't be that many, if any at all. > > What prevents cpupool_rm_domain() getting moved from > complete_domain_destroy() to domain_destroy(), before the > domain gets taken off the list? I actually assume that there are > more things here that may not really need deferring until the > last possible moment... sched_destroy_vcpu() and sched_destroy_domain() have to happen before cpupool_rm_domain(). This could be avoided if the domain would be moved to cpupool0 in domain_destroy(). Hmm, doesn't sound too bad. This would be just symmetrical to domain creation. What do you think? Juergen -- Juergen Gross Principal Developer Operating Systems PSO PM&D ES&S SWE OS6 Telephone: +49 (0) 89 62060 2932 Fujitsu e-mail: juergen.gross@ts.fujitsu.com Mies-van-der-Rohe-Str. 8 Internet: ts.fujitsu.com D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html