From mboxrd@z Thu Jan 1 00:00:00 1970
From: Juergen Gross
Subject: Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
Date: Tue, 05 Nov 2013 06:59:14 +0100
Message-ID: <52788932.1030209@ts.fujitsu.com>
References: <1383534234-3933-1-git-send-email-nate.studer@dornerworks.com>
 <52773EF2.8000308@ts.fujitsu.com>
 <1383557167.9207.35.camel@Solace>
 <52776FBC.50800@ts.fujitsu.com>
 <5277BBB6.5050604@dornerworks.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <5277BBB6.5050604@dornerworks.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Nate Studer
Cc: George Dunlap, Dario Faggioli, Keir Fraser, Jan Beulich, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On 04.11.2013 16:22, Nate Studer wrote:
> On 11/4/2013 4:58 AM, Juergen Gross wrote:
>> On 04.11.2013 10:26, Dario Faggioli wrote:
>>> On lun, 2013-11-04 at 07:30 +0100, Juergen Gross wrote:
>>>> On 04.11.2013 04:03, Nathan Studer wrote:
>>>>> From: Nathan Studer
>>>>>
>>>>> The domain destruction code, removes a domain from its cpupool
>>>>> before attempting to destroy its scheduler information. Since
>>>>> the scheduler framework uses the domain's cpupool information
>>>>> to decide on which scheduler ops to use, this results in the
>>>>> the wrong scheduler's destroy domain function being called
>>>>> when the cpupool scheduler and the initial scheduler are
>>>>> different.
>>>>>
>>>>> Correct this by destroying the domain's scheduling information
>>>>> before removing it from the pool.
>>>>>
>>>>> Signed-off-by: Nathan Studer
>>>>
>>>> Reviewed-by: Juergen Gross
>>>>
>>> I think this is a candidate for backports too, isn't it?
>>>
>>> Nathan, what was happening without this patch? Are you able to quickly
>>> figure out what previous Xen versions suffers from the same bug?
>
> Various things:
>
> If I used the credit scheduler in Pool-0 and the arinc653 scheduler in a cpupool
> the other pool, it would:
> 1. Hit a BUG_ON in the arinc653 scheduler.
> 2. Hit an assert in the scheduling framework code.
> 3. Or crash in the credit scheduler's csched_free_domdata function.
>
> The latter clued me in that the wrong scheduler's destroy function was somehow
> being called.
>
> If I used the credit2 scheduler in the other pool, I would only ever see the latter.
>
> Similarly, if I used the sedf scheduler in the other pool, I would only ever see
> the latter. However when using the sedf scheduler I would have to create and
> destroy the domain twice, instead of just once.
>
>>
>> In theory this bug is present since 4.1.
>>
>> OTOH it will be hit only with arinc653 scheduler in a cpupool other than
>> Pool-0. And I don't see how this is being supported by arinc653 today (pick_cpu
>> will always return 0).
>
> Correct, the arinc653 scheduler currently does not work with cpupools. We are
> working on remedying that though, which is how I ran into this. I would have
> just wrapped this patch in with the upcoming arinc653 ones, if I had not run
> into the same issue with the other schedulers.
>
>>
>> All other schedulers will just call xfree() for the domain specific data (and
>> may be update some statistic data, which is not critical).
>
> The credit and credit2 schedulers do a bit more than that in their free_domdata
> functions.

Sorry, I didn't get enough sleep on the weekend ;-)
I checked only the 4.1 and 4.2 trees. There, only an xfree() of the domain
data is done.

>
> The credit scheduler frees the node_affinity_cpumask contained in the domain
> data and the credit2 scheduler deletes a list element contained in the domain
> data. Since with this bug they are accessing structures that do not belong to
> them, bad things happen.

So the patch would be a candidate for a 4.3 backport, I think.
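
For anyone following along, the ordering problem can be shown with a small
standalone toy model. This is only a sketch, not the Xen code: every toy_*
name is made up for illustration, and the fallback to a default scheduler
when the domain has no pool is my assumption about how the lookup behaves,
based on the patch description above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not Xen's real code. */
enum toy_sched { TOY_CREDIT, TOY_ARINC653 };

struct toy_cpupool { enum toy_sched sched; };

struct toy_domain {
    struct toy_cpupool *pool;   /* NULL once removed from its pool */
    enum toy_sched data_owner;  /* scheduler that allocated the per-domain data */
};

/* The lookup goes through the pool; with no pool it falls back to the
 * default scheduler (assumed to be credit in this sketch). */
static enum toy_sched toy_dom_scheduler(const struct toy_domain *d)
{
    assert(d != NULL);
    return d->pool ? d->pool->sched : TOY_CREDIT;
}

/* Returns 1 if the scheduler that owns the data frees it, 0 on a mismatch. */
static int toy_sched_destroy_domain(const struct toy_domain *d)
{
    return toy_dom_scheduler(d) == d->data_owner;
}

static void toy_cpupool_rm_domain(struct toy_domain *d)
{
    d->pool = NULL;
}

/* Tear down a domain living in an arinc653 pool, in either order. */
static int toy_destroy(int sched_first)
{
    struct toy_cpupool pool = { TOY_ARINC653 };
    struct toy_domain d = { &pool, TOY_ARINC653 };
    int ok;

    if (sched_first) {
        ok = toy_sched_destroy_domain(&d);   /* order after the patch */
        toy_cpupool_rm_domain(&d);
    } else {
        toy_cpupool_rm_domain(&d);           /* order before the patch */
        ok = toy_sched_destroy_domain(&d);
    }
    return ok;
}
```

With the old order, toy_destroy(0) returns 0: the pool pointer is already
gone, so the default scheduler is picked to free data it does not own. With
the patched order, toy_destroy(1) returns 1.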

Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
PBG PDG ES&S SWE OS6                   Telephone: +49 (0) 89 62060 2932
Fujitsu                                   e-mail: juergen.gross@ts.fujitsu.com
Mies-van-der-Rohe-Str. 8                Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html