From: Juergen Gross
Subject: Re: [PATCH] move domain to cpupool0 before destroying it
Date: Tue, 20 May 2014 06:28:54 +0200
Message-ID: <537ADA06.4070903@ts.fujitsu.com>
References: <1400129994-26297-1-git-send-email-juergen.gross@ts.fujitsu.com>
 <537A40C00200007800013C2C@mail.emea.novell.com>
To: George Dunlap, Jan Beulich
Cc: "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On 19.05.2014 18:19, George Dunlap wrote:
> On Mon, May 19, 2014 at 4:34 PM, Jan Beulich wrote:
>>>>> On 19.05.14 at 16:57, wrote:
>>> On Thu, May 15, 2014 at 5:59 AM, Juergen Gross wrote:
>>>> Currently when a domain is destroyed it is removed from the domain_list
>>>> before all of its resources, including the cpupool membership, are freed.
>>>> This can lead to a situation where the domain is still a member of a
>>>> cpupool without for_each_domain_in_cpupool() (or even for_each_domain())
>>>> being able to find it any more. This in turn can result in rejection of
>>>> removing the last cpu from a cpupool, because there still seems to be a
>>>> domain in the cpupool, even though it can't be found by scanning through
>>>> all domains.
>>>>
>>>> This situation can be avoided by moving the domain to be destroyed to
>>>> cpupool0 first and then removing it from this cpupool BEFORE deleting it
>>>> from the domain_list. As cpupool0 is always active and a domain without
>>>> any cpupool membership is implicitly regarded as belonging to cpupool0,
>>>> this poses no problem.
>>>
>>> I'm a bit unclear why we're doing *both* a sched_move_domain(), *and*
>>> moving the "cpupool_rm_domain()".
>>>
>>> The sched_move_domain() only happens in domain_kill(), which is only
>>> initiated (AFAICT) by hypercall: does that mean that if a VM dies for
>>> some other reason (i.e., crashes), you may still have the same race?
>>> If not, then just this change alone should be sufficient. If it does,
>>> then this change is redundant.
>>
>> No, a crashed domain is merely being reported as crashed to the
>> tool stack. It's the tool stack that then actually invokes the killing
>> of it (or else e.g. "on_crash=preserve" would be rather hard to handle).
>
> Right, I see.
>
>>> Moving the cpupool_rm_domain() will change things so that there is now
>>> a period of time where the VM is not listed as being in cpupool0's
>>> pool, but may still be in that pool's scheduler's list of domains. Is
>>> that OK? If it is OK, it seems like that change alone should be
>>> sufficient.
>>
>> Moving this earlier was a requirement to avoid the race that the
>> earlier (much different) patch tried to address. Also I think the
>> patch's description already addresses that question (see the last
>> sentence of the quoted original mail contents above).
>
> But we're avoiding that race by instead moving the dying domain to
> cpupool0, which is never going to disappear.
>
> Or, moving the domain to cpupool0 *won't* sufficiently solve the race,
> and this will -- in which case, why are we bothering to move it to
> cpupool0 at all? Why not just remove it from the cpupool when
> removing it from the domain list? Wouldn't that also solve the
> original problem?
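
To make the intended ordering concrete, here is a simplified sketch
(illustrative only, not the literal patch; the two helper names below are
made up, and locking, error paths and the exact call sites are my own
assumptions):

/* Roughly the domain_kill() side: pull the dying domain back into
 * cpupool0. sched_move_domain() can fail, and domain_kill() can be
 * re-invoked, which is why this step cannot wait until destruction. */
static int kill_side(struct domain *d)          /* think: domain_kill() */
{
    int rc = sched_move_domain(d, cpupool0);

    if ( rc )
        return rc;       /* tool stack re-invokes the hypercall later */
    /* ... continue with the rest of the kill path ... */
    return 0;
}

/* Roughly the domain_destroy() side: drop the cpupool membership BEFORE
 * the domain is unlinked from domain_list, so a pool's domain count and
 * for_each_domain_in_cpupool() can no longer disagree. */
static void destroy_side(struct domain *d)      /* think: domain_destroy() */
{
    cpupool_rm_domain(d);
    /* ... only now remove d from domain_list ... */
}

With both halves in place a dying domain stays reachable via cpupool0 right
up to the point where its pool membership is dropped, so removing the last
cpu from another pool can no longer be blocked by a domain that is invisible
to for_each_domain_in_cpupool().
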
> Regarding the last bit, "...a domain without any cpupool membership is
> implicitly regarded as belonging to cpupool0...":
>
> 1. At a quick glance through the code, I couldn't find any evidence
> that this was the case; I couldn't find an instance where d->cpupool
> == NULL => assumed cpupool0.

xen/common/schedule.c:

#define DOM2OP(_d)    (((_d)->cpupool == NULL) ? &ops : ((_d)->cpupool->sched))

together with:

struct scheduler *scheduler_get_default(void)
{
    return &ops;
}

> 2. If in reality d->cpupool is never (or almost never) actually NULL,
> then the "implicitly belongs to cpupool0" assumption will bitrot.
> Having that kind of assumption without some way of making sure it's
> maintained is a bug waiting to happen.

That's not going to happen: this assumption is tested every time the idle
domain is being referenced by the scheduler...

>>> I've been trying to trace through the twisty little passages of domain
>>> destruction, and I'm still not quite sure: would it be OK if we just
>>> called sched_move_domain() in domain_destroy() instead of calling
>>> cpupool_rm_domain()?
>>
>> No, it would not, because then again we wouldn't be able to deal
>> with potential failure, needing re-invocation of the function.
>
> Right.


Juergen

-- 
Juergen Gross                  Principal Developer Operating Systems
PSO PM&D ES&S SWE OS6          Telephone: +49 (0) 89 62060 2932
Fujitsu                        e-mail: juergen.gross@ts.fujitsu.com
Mies-van-der-Rohe-Str. 8       Internet: ts.fujitsu.com
D-80807 Muenchen               Company details: ts.fujitsu.com/imprint.html