From: Juergen Gross <juergen.gross@ts.fujitsu.com>
To: Nate Studer <nate.studer@dornerworks.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>,
Dario Faggioli <dario.faggioli@citrix.com>,
Keir Fraser <keir@xen.org>, Jan Beulich <JBeulich@suse.com>,
xen-devel@lists.xen.org
Subject: Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
Date: Tue, 05 Nov 2013 06:59:14 +0100
Message-ID: <52788932.1030209@ts.fujitsu.com>
In-Reply-To: <5277BBB6.5050604@dornerworks.com>
On 04.11.2013 16:22, Nate Studer wrote:
> On 11/4/2013 4:58 AM, Juergen Gross wrote:
>> On 04.11.2013 10:26, Dario Faggioli wrote:
>>> On lun, 2013-11-04 at 07:30 +0100, Juergen Gross wrote:
>>>> On 04.11.2013 04:03, Nathan Studer wrote:
>>>>> From: Nathan Studer <nate.studer@dornerworks.com>
>>>>>
>>>>> The domain destruction code removes a domain from its cpupool
>>>>> before attempting to destroy its scheduler information. Since
>>>>> the scheduler framework uses the domain's cpupool information
>>>>> to decide which scheduler ops to use, this results in the
>>>>> wrong scheduler's destroy_domain function being called when
>>>>> the cpupool's scheduler and the initial scheduler are
>>>>> different.
>>>>>
>>>>> Correct this by destroying the domain's scheduling information
>>>>> before removing it from the pool.
>>>>>
>>>>> Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
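The ordering problem in the quoted commit message can be modeled in a few
lines. Below is a standalone toy sketch, not the literal Xen code: dom2op()
stands in for the DOM2OP() ops selection in xen/common/schedule.c, and the
structures are hypothetical simplifications.

#include <stdio.h>

struct scheduler {
    const char *name;
    void (*destroy_domain)(void);
};

static void credit_destroy(void)   { puts("credit: destroy_domain");   }
static void arinc653_destroy(void) { puts("arinc653: destroy_domain"); }

static struct scheduler credit_ops   = { "credit",   credit_destroy   };
static struct scheduler arinc653_ops = { "arinc653", arinc653_destroy };

struct cpupool { struct scheduler *sched; };
struct domain  { struct cpupool *cpupool; };

/* Models DOM2OP(): a domain with no pool falls back to the default
 * (boot) scheduler - here, credit. */
static struct scheduler *dom2op(struct domain *d)
{
    return d->cpupool ? d->cpupool->sched : &credit_ops;
}

int main(void)
{
    struct cpupool pool1 = { &arinc653_ops };
    struct domain d = { &pool1 };

    /* Buggy order: cpupool_rm_domain() first ... */
    d.cpupool = NULL;
    dom2op(&d)->destroy_domain();   /* prints "credit", the wrong hook */

    /* Fixed order: sched_destroy_domain() while the pool pointer is
     * still valid, then cpupool_rm_domain(). */
    d.cpupool = &pool1;
    dom2op(&d)->destroy_domain();   /* prints "arinc653", as intended */
    d.cpupool = NULL;
    return 0;
}

Run as-is, the buggy order invokes the default scheduler's hook on an
arinc653 domain's data; the fixed order invokes the right one.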
>>>>
>>>> Reviewed-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
>>>>
>>> I think this is a candidate for backports too, isn't it?
>>>
>>> Nathan, what was happening without this patch? Are you able to quickly
>>> figure out which previous Xen versions suffer from the same bug?
>
> Various things:
>
> If I used the credit scheduler in Pool-0 and the arinc653 scheduler in
> another cpupool, it would:
> 1. Hit a BUG_ON in the arinc653 scheduler.
> 2. Hit an assert in the scheduling framework code.
> 3. Or crash in the credit scheduler's csched_free_domdata function.
>
> The latter clued me in that the wrong scheduler's destroy function was somehow
> being called.
>
> If I used the credit2 scheduler in the other pool, I would only ever see the latter.
>
> Similarly, if I used the sedf scheduler in the other pool, I would only ever see
> the latter. However, when using the sedf scheduler I would have to create
> and destroy the domain twice, instead of just once.
>
>>
>> In theory this bug is present since 4.1.
>>
>> OTOH it will only be hit with the arinc653 scheduler in a cpupool other
>> than Pool-0. And I don't see how that is supported by arinc653 today
>> (pick_cpu always returns 0).
>
> Correct, the arinc653 scheduler currently does not work with cpupools. We are
> working on remedying that, though, which is how I ran into this. I would have
> just wrapped this patch in with the upcoming arinc653 ones, if I had not run
> into the same issue with the other schedulers.
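(For reference, the pick_cpu behavior described above, as a paraphrase:
hypothetical name and signature, not the literal xen/common/sched_arinc653.c
hook.)

/* Hard-coding CPU 0 means a cpupool that does not contain CPU 0
 * cannot host arinc653 domains, which is why cpupool support needs
 * more than just this reordering patch. */
static int a653_pick_cpu_sketch(void)
{
    return 0;   /* always the first CPU; the pool's cpumask is ignored */
}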
>
>>
>> All other schedulers will just call xfree() for the domain-specific data
>> (and maybe update some statistics, which is not critical).
>
> The credit and credit2 schedulers do a bit more than that in their free_domdata
> functions.
Sorry, I didn't get enough sleep over the weekend ;-)
I checked only the 4.1 and 4.2 trees; there, only an xfree() of the
domain data is done.
>
> The credit scheduler frees the node_affinity_cpumask contained in the domain
> data and the credit2 scheduler deletes a list element contained in the domain
> data. Since with this bug they are accessing structures that do not belong to
> them, bad things happen.
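A standalone toy of that mismatch, with hypothetical simplified structures
(the real ones are csched_dom in xen/common/sched_credit.c, credit2's
per-domain data in sched_credit2.c, and arinc653's per-domain data):

#include <stdio.h>
#include <stdlib.h>

struct credit_dom {                /* simplified csched_dom        */
    void *node_affinity_cpumask;   /* owned pointer, freed by the
                                      credit free_domdata hook     */
};

struct a653_dom {                  /* simplified arinc653 data     */
    unsigned long runtime_us;      /* plain scalar, no pointers    */
};

int main(void)
{
    /* With the bug, the framework hands an arinc653 domain's data
     * to credit's free hook, which reinterprets whatever sits at
     * offset 0 as a pointer to free. */
    struct a653_dom *ad = malloc(sizeof(*ad));
    ad->runtime_us = 0xdeadbeefUL;

    struct credit_dom *cd = (struct credit_dom *)ad;  /* wrong type! */
    printf("credit would free %p - a wild free, hence the crash\n",
           cd->node_affinity_cpumask);

    /* credit2's hook instead does a list_del() on a list_head that
     * was never linked into its list, corrupting another
     * scheduler's bookkeeping - the same class of failure. */
    free(ad);
    return 0;
}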
So the patch would be a candidate for a 4.3 backport, I think.
Juergen
--
Juergen Gross                 Principal Developer Operating Systems
PBG PDG ES&S SWE OS6          Telephone: +49 (0) 89 62060 2932
Fujitsu                       e-mail: juergen.gross@ts.fujitsu.com
Mies-van-der-Rohe-Str. 8      Internet: ts.fujitsu.com
D-80807 Muenchen              Company details: ts.fujitsu.com/imprint.html
Thread overview: 13+ messages
2013-11-04 3:03 [Patch] Call sched_destroy_domain before cpupool_rm_domain Nathan Studer
2013-11-04 6:30 ` Juergen Gross
2013-11-04 9:26 ` Dario Faggioli
2013-11-04 9:58 ` Juergen Gross
2013-11-04 15:22 ` Nate Studer
2013-11-05 5:59 ` Juergen Gross [this message]
2013-11-07 7:39 ` Jan Beulich
2013-11-07 9:09 ` Juergen Gross
2013-11-07 9:37 ` Jan Beulich
2013-11-07 9:43 ` Juergen Gross
2013-11-04 9:33 ` Andrew Cooper
2013-11-05 21:09 ` Keir Fraser
2013-11-04 15:10 ` George Dunlap