* [Patch] Call sched_destroy_domain before cpupool_rm_domain.
@ 2013-11-04 3:03 Nathan Studer
2013-11-04 6:30 ` Juergen Gross
2013-11-04 15:10 ` George Dunlap
0 siblings, 2 replies; 13+ messages in thread
From: Nathan Studer @ 2013-11-04 3:03 UTC (permalink / raw)
To: xen-devel; +Cc: George Dunlap, Juergen Gross, Keir Fraser, Nathan Studer
From: Nathan Studer <nate.studer@dornerworks.com>
The domain destruction code removes a domain from its cpupool
before attempting to destroy its scheduler information. Since
the scheduler framework uses the domain's cpupool information
to decide which scheduler ops to use, this results in the
wrong scheduler's destroy domain function being called when
the cpupool scheduler and the initial scheduler are different.
Correct this by destroying the domain's scheduling information
before removing it from the pool.
Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
---
xen/common/domain.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 5999779..78ce968 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -727,10 +727,10 @@ static void complete_domain_destroy(struct rcu_head *head)
rangeset_domain_destroy(d);
- cpupool_rm_domain(d);
-
sched_destroy_domain(d);
+ cpupool_rm_domain(d);
+
/* Free page used by xen oprofile buffer. */
#ifdef CONFIG_XENOPROF
free_xenoprof_pages(d);
--
1.7.9.5
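
For context, the scheduler dispatch the commit message refers to looks
roughly like the sketch below, modelled on xen/common/schedule.c of that
era (names abbreviated and details approximate, not verbatim source):

    /* Sketch modelled on xen/common/schedule.c (~Xen 4.3); approximate. */
    #define DOM2OP(d) (((d)->cpupool == NULL) ? &ops : ((d)->cpupool->sched))

    void sched_destroy_domain(struct domain *d)
    {
        /* Dispatch to the scheduler of the domain's current cpupool. */
        SCHED_OP(DOM2OP(d), destroy_domain, d);
    }

Since cpupool_rm_domain() clears d->cpupool, calling it first makes
DOM2OP() fall back to the default scheduler's ops, so the wrong
destroy_domain hook runs on data a different scheduler allocated.
Swapping the two calls keeps the pool association intact until the
correct hook has finished.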
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-04 3:03 [Patch] Call sched_destroy_domain before cpupool_rm_domain Nathan Studer
@ 2013-11-04 6:30 ` Juergen Gross
2013-11-04 9:26 ` Dario Faggioli
2013-11-04 9:33 ` Andrew Cooper
2013-11-04 15:10 ` George Dunlap
1 sibling, 2 replies; 13+ messages in thread
From: Juergen Gross @ 2013-11-04 6:30 UTC (permalink / raw)
To: Nathan Studer; +Cc: George Dunlap, Keir Fraser, xen-devel
On 04.11.2013 04:03, Nathan Studer wrote:
> From: Nathan Studer <nate.studer@dornerworks.com>
>
> The domain destruction code removes a domain from its cpupool
> before attempting to destroy its scheduler information. Since
> the scheduler framework uses the domain's cpupool information
> to decide which scheduler ops to use, this results in the
> wrong scheduler's destroy domain function being called when
> the cpupool scheduler and the initial scheduler are different.
>
> Correct this by destroying the domain's scheduling information
> before removing it from the pool.
>
> Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
Reviewed-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
> ---
> xen/common/domain.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 5999779..78ce968 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -727,10 +727,10 @@ static void complete_domain_destroy(struct rcu_head *head)
>
> rangeset_domain_destroy(d);
>
> - cpupool_rm_domain(d);
> -
> sched_destroy_domain(d);
>
> + cpupool_rm_domain(d);
> +
> /* Free page used by xen oprofile buffer. */
> #ifdef CONFIG_XENOPROF
> free_xenoprof_pages(d);
>
--
Juergen Gross Principal Developer Operating Systems
PBG PDG ES&S SWE OS6 Telephone: +49 (0) 89 62060 2932
Fujitsu e-mail: juergen.gross@ts.fujitsu.com
Mies-van-der-Rohe-Str. 8 Internet: ts.fujitsu.com
D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-04 6:30 ` Juergen Gross
@ 2013-11-04 9:26 ` Dario Faggioli
2013-11-04 9:58 ` Juergen Gross
2013-11-04 9:33 ` Andrew Cooper
1 sibling, 1 reply; 13+ messages in thread
From: Dario Faggioli @ 2013-11-04 9:26 UTC (permalink / raw)
To: Juergen Gross
Cc: George Dunlap, Keir Fraser, Nathan Studer, Jan Beulich, xen-devel
On Mon, 2013-11-04 at 07:30 +0100, Juergen Gross wrote:
> On 04.11.2013 04:03, Nathan Studer wrote:
> > From: Nathan Studer <nate.studer@dornerworks.com>
> >
> > The domain destruction code removes a domain from its cpupool
> > before attempting to destroy its scheduler information. Since
> > the scheduler framework uses the domain's cpupool information
> > to decide which scheduler ops to use, this results in the
> > wrong scheduler's destroy domain function being called when
> > the cpupool scheduler and the initial scheduler are different.
> >
> > Correct this by destroying the domain's scheduling information
> > before removing it from the pool.
> >
> > Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
>
> Reviewed-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
>
I think this is a candidate for backports too, isn't it?
Nathan, what was happening without this patch? Are you able to quickly
figure out which previous Xen versions suffer from the same bug?
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-04 6:30 ` Juergen Gross
2013-11-04 9:26 ` Dario Faggioli
@ 2013-11-04 9:33 ` Andrew Cooper
2013-11-05 21:09 ` Keir Fraser
1 sibling, 1 reply; 13+ messages in thread
From: Andrew Cooper @ 2013-11-04 9:33 UTC (permalink / raw)
To: Juergen Gross; +Cc: George Dunlap, Keir Fraser, Nathan Studer, xen-devel
On 04/11/13 06:30, Juergen Gross wrote:
> On 04.11.2013 04:03, Nathan Studer wrote:
>> From: Nathan Studer <nate.studer@dornerworks.com>
>>
>> The domain destruction code removes a domain from its cpupool
>> before attempting to destroy its scheduler information. Since
>> the scheduler framework uses the domain's cpupool information
>> to decide which scheduler ops to use, this results in the
>> wrong scheduler's destroy domain function being called when
>> the cpupool scheduler and the initial scheduler are different.
>>
>> Correct this by destroying the domain's scheduling information
>> before removing it from the pool.
>>
>> Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
>
> Reviewed-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
>> ---
>> xen/common/domain.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>> index 5999779..78ce968 100644
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -727,10 +727,10 @@ static void complete_domain_destroy(struct
>> rcu_head *head)
>>
>> rangeset_domain_destroy(d);
>>
>> - cpupool_rm_domain(d);
>> -
>> sched_destroy_domain(d);
>>
>> + cpupool_rm_domain(d);
>> +
>> /* Free page used by xen oprofile buffer. */
>> #ifdef CONFIG_XENOPROF
>> free_xenoprof_pages(d);
>>
>
>
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-04 9:26 ` Dario Faggioli
@ 2013-11-04 9:58 ` Juergen Gross
2013-11-04 15:22 ` Nate Studer
0 siblings, 1 reply; 13+ messages in thread
From: Juergen Gross @ 2013-11-04 9:58 UTC (permalink / raw)
To: Dario Faggioli
Cc: George Dunlap, Keir Fraser, Nathan Studer, Jan Beulich, xen-devel
On 04.11.2013 10:26, Dario Faggioli wrote:
> On Mon, 2013-11-04 at 07:30 +0100, Juergen Gross wrote:
>> On 04.11.2013 04:03, Nathan Studer wrote:
>>> From: Nathan Studer <nate.studer@dornerworks.com>
>>>
>>> The domain destruction code removes a domain from its cpupool
>>> before attempting to destroy its scheduler information. Since
>>> the scheduler framework uses the domain's cpupool information
>>> to decide which scheduler ops to use, this results in the
>>> wrong scheduler's destroy domain function being called when
>>> the cpupool scheduler and the initial scheduler are different.
>>>
>>> Correct this by destroying the domain's scheduling information
>>> before removing it from the pool.
>>>
>>> Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
>>
>> Reviewed-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
>>
> I think this is a candidate for backports too, isn't it?
>
> Nathan, what was happening without this patch? Are you able to quickly
> figure out which previous Xen versions suffer from the same bug?
In theory this bug has been present since 4.1.
OTOH it will only be hit with the arinc653 scheduler in a cpupool other than
Pool-0, and I don't see how that is supported by arinc653 today (pick_cpu
will always return 0).
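
Per the observation above, the arinc653 pick_cpu hook of the time reduces
to something like the following (a hypothetical reduction for illustration,
not the verbatim sched_arinc653.c source):

    /* Hypothetical reduction of arinc653's pick_cpu hook (2013-era). */
    static int a653sched_pick_cpu(const struct scheduler *ops, struct vcpu *vc)
    {
        return 0;  /* always CPU 0, so pools not containing CPU 0 cannot work */
    }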
All other schedulers will just call xfree() for the domain-specific data (and
maybe update some statistics, which is not critical).
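
As a minimal illustration of such an xfree()-only hook (a sketch, not the
verbatim 4.1/4.2 source):

    /* Sketch: a free_domdata hook that only frees the allocation.  Even
     * when invoked through the wrong scheduler's ops it merely frees the
     * block, so no corruption follows. */
    static void sedf_free_domdata(const struct scheduler *ops, void *data)
    {
        xfree(data);
    }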
Juergen
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-04 3:03 [Patch] Call sched_destroy_domain before cpupool_rm_domain Nathan Studer
2013-11-04 6:30 ` Juergen Gross
@ 2013-11-04 15:10 ` George Dunlap
1 sibling, 0 replies; 13+ messages in thread
From: George Dunlap @ 2013-11-04 15:10 UTC (permalink / raw)
To: Nathan Studer, xen-devel; +Cc: Juergen Gross, Keir Fraser
On 04/11/13 03:03, Nathan Studer wrote:
> From: Nathan Studer <nate.studer@dornerworks.com>
>
> The domain destruction code removes a domain from its cpupool
> before attempting to destroy its scheduler information. Since
> the scheduler framework uses the domain's cpupool information
> to decide which scheduler ops to use, this results in the
> wrong scheduler's destroy domain function being called when
> the cpupool scheduler and the initial scheduler are different.
>
> Correct this by destroying the domain's scheduling information
> before removing it from the pool.
>
> Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Thanks!
-George
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-04 9:58 ` Juergen Gross
@ 2013-11-04 15:22 ` Nate Studer
2013-11-05 5:59 ` Juergen Gross
0 siblings, 1 reply; 13+ messages in thread
From: Nate Studer @ 2013-11-04 15:22 UTC (permalink / raw)
To: Juergen Gross, Dario Faggioli
Cc: George Dunlap, Keir Fraser, Jan Beulich, xen-devel
On 11/4/2013 4:58 AM, Juergen Gross wrote:
> On 04.11.2013 10:26, Dario Faggioli wrote:
>> On Mon, 2013-11-04 at 07:30 +0100, Juergen Gross wrote:
>>> On 04.11.2013 04:03, Nathan Studer wrote:
>>>> From: Nathan Studer <nate.studer@dornerworks.com>
>>>>
>>>> The domain destruction code removes a domain from its cpupool
>>>> before attempting to destroy its scheduler information. Since
>>>> the scheduler framework uses the domain's cpupool information
>>>> to decide which scheduler ops to use, this results in the
>>>> wrong scheduler's destroy domain function being called when
>>>> the cpupool scheduler and the initial scheduler are different.
>>>>
>>>> Correct this by destroying the domain's scheduling information
>>>> before removing it from the pool.
>>>>
>>>> Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
>>>
>>> Reviewed-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
>>>
>> I think this is a candidate for backports too, isn't it?
>>
>> Nathan, what was happening without this patch? Are you able to quickly
>> figure out which previous Xen versions suffer from the same bug?
Various things:
If I used the credit scheduler in Pool-0 and the arinc653 scheduler in
another cpupool, it would:
1. Hit a BUG_ON in the arinc653 scheduler.
2. Hit an assert in the scheduling framework code.
3. Or crash in the credit scheduler's csched_free_domdata function.
The latter clued me in that the wrong scheduler's destroy function was somehow
being called.
If I used the credit2 scheduler in the other pool, I would only ever see the latter.
Similarly, if I used the sedf scheduler in the other pool, I would only ever see
the latter. However, when using the sedf scheduler I would have to create and
destroy the domain twice, instead of just once.
>
> In theory this bug has been present since 4.1.
>
> OTOH it will only be hit with the arinc653 scheduler in a cpupool other than
> Pool-0, and I don't see how that is supported by arinc653 today (pick_cpu
> will always return 0).
Correct, the arinc653 scheduler currently does not work with cpupools. We are
working on remedying that though, which is how I ran into this. I would have
just wrapped this patch in with the upcoming arinc653 ones, if I had not run
into the same issue with the other schedulers.
>
> All other schedulers will just call xfree() for the domain-specific data (and
> maybe update some statistics, which is not critical).
The credit and credit2 schedulers do a bit more than that in their free_domdata
functions.
The credit scheduler frees the node_affinity_cpumask contained in the domain
data and the credit2 scheduler deletes a list element contained in the domain
data. Since with this bug they are accessing structures that do not belong to
them, bad things happen.
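
For illustration, sketches of those two hooks (4.3-era behaviour, with
structure and field names simplified; not verbatim source):

    /* credit: frees a cpumask hanging off the per-domain data. */
    static void csched_free_domdata(const struct scheduler *ops, void *data)
    {
        struct csched_dom *sdom = data;
        /* Garbage pointer if another scheduler allocated *data. */
        free_cpumask_var(sdom->node_affinity_cpumask);
        xfree(sdom);
    }

    /* credit2: unlinks the domain from a scheduler-private list. */
    static void csched2_free_domdata(const struct scheduler *ops, void *data)
    {
        struct csched2_dom *sdom = data;
        /* A stray list_del through a foreign structure corrupts the list. */
        list_del_init(&sdom->sdom_elem);
        xfree(sdom);
    }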
With the credit scheduler in Pool-0, the result should be an invalid free and an
eventual crash.
With the credit2 scheduler in Pool-0, the effects might be a bit more
unpredictable. At best it should result in an invalid pointer dereference.
Likewise, since the other schedulers do not do this additional work, there would
probably be other issues if the sedf or arinc653 scheduler was running in Pool-0
and one of the credit schedulers was run in the other pool. I do not know
enough about the credit scheduler to make any predictions about what would
happen though.
>
>
> Juergen
>
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-04 15:22 ` Nate Studer
@ 2013-11-05 5:59 ` Juergen Gross
2013-11-07 7:39 ` Jan Beulich
0 siblings, 1 reply; 13+ messages in thread
From: Juergen Gross @ 2013-11-05 5:59 UTC (permalink / raw)
To: Nate Studer
Cc: George Dunlap, Dario Faggioli, Keir Fraser, Jan Beulich,
xen-devel
On 04.11.2013 16:22, Nate Studer wrote:
> On 11/4/2013 4:58 AM, Juergen Gross wrote:
>> On 04.11.2013 10:26, Dario Faggioli wrote:
>>> On Mon, 2013-11-04 at 07:30 +0100, Juergen Gross wrote:
>>>> On 04.11.2013 04:03, Nathan Studer wrote:
>>>>> From: Nathan Studer <nate.studer@dornerworks.com>
>>>>>
>>>>> The domain destruction code removes a domain from its cpupool
>>>>> before attempting to destroy its scheduler information. Since
>>>>> the scheduler framework uses the domain's cpupool information
>>>>> to decide which scheduler ops to use, this results in the
>>>>> wrong scheduler's destroy domain function being called when
>>>>> the cpupool scheduler and the initial scheduler are different.
>>>>>
>>>>> Correct this by destroying the domain's scheduling information
>>>>> before removing it from the pool.
>>>>>
>>>>> Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
>>>>
>>>> Reviewed-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
>>>>
>>> I think this is a candidate for backports too, isn't it?
>>>
>>> Nathan, what was happening without this patch? Are you able to quickly
>>> figure out which previous Xen versions suffer from the same bug?
>
> Various things:
>
> If I used the credit scheduler in Pool-0 and the arinc653 scheduler in
> another cpupool, it would:
> 1. Hit a BUG_ON in the arinc653 scheduler.
> 2. Hit an assert in the scheduling framework code.
> 3. Or crash in the credit scheduler's csched_free_domdata function.
>
> The latter clued me in that the wrong scheduler's destroy function was somehow
> being called.
>
> If I used the credit2 scheduler in the other pool, I would only ever see the latter.
>
> Similarly, if I used the sedf scheduler in the other pool, I would only ever see
> the latter. However, when using the sedf scheduler I would have to create and
> destroy the domain twice, instead of just once.
>
>>
>> In theory this bug has been present since 4.1.
>>
>> OTOH it will only be hit with the arinc653 scheduler in a cpupool other than
>> Pool-0, and I don't see how that is supported by arinc653 today (pick_cpu
>> will always return 0).
>
> Correct, the arinc653 scheduler currently does not work with cpupools. We are
> working on remedying that though, which is how I ran into this. I would have
> just wrapped this patch in with the upcoming arinc653 ones, if I had not run
> into the same issue with the other schedulers.
>
>>
>> All other schedulers will just call xfree() for the domain-specific data (and
>> maybe update some statistics, which is not critical).
>
> The credit and credit2 schedulers do a bit more than that in their free_domdata
> functions.
Sorry, didn't get enough sleep on the weekend ;-)
I checked only the 4.1 and 4.2 trees. There, only an xfree of the domain
data is done.
>
> The credit scheduler frees the node_affinity_cpumask contained in the domain
> data and the credit2 scheduler deletes a list element contained in the domain
> data. Since with this bug they are accessing structures that do not belong to
> them, bad things happen.
So the patch would be subject to a 4.3 backport, I think.
Juergen
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-04 9:33 ` Andrew Cooper
@ 2013-11-05 21:09 ` Keir Fraser
0 siblings, 0 replies; 13+ messages in thread
From: Keir Fraser @ 2013-11-05 21:09 UTC (permalink / raw)
To: Andrew Cooper, Juergen Gross
Cc: George Dunlap, Keir Fraser, Nathan Studer, xen-devel
On 04/11/2013 09:33, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> On 04/11/13 06:30, Juergen Gross wrote:
>> On 04.11.2013 04:03, Nathan Studer wrote:
>>> From: Nathan Studer <nate.studer@dornerworks.com>
>>>
>>> The domain destruction code removes a domain from its cpupool
>>> before attempting to destroy its scheduler information. Since
>>> the scheduler framework uses the domain's cpupool information
>>> to decide which scheduler ops to use, this results in the
>>> wrong scheduler's destroy domain function being called when
>>> the cpupool scheduler and the initial scheduler are different.
>>>
>>> Correct this by destroying the domain's scheduling information
>>> before removing it from the pool.
>>>
>>> Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
>>
>> Reviewed-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-05 5:59 ` Juergen Gross
@ 2013-11-07 7:39 ` Jan Beulich
2013-11-07 9:09 ` Juergen Gross
0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2013-11-07 7:39 UTC (permalink / raw)
To: Nate Studer, Juergen Gross
Cc: George Dunlap, Dario Faggioli, Keir Fraser, xen-devel
>>> On 05.11.13 at 06:59, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
> On 04.11.2013 16:22, Nate Studer wrote:
>> On 11/4/2013 4:58 AM, Juergen Gross wrote:
>>> All other schedulers will just call xfree() for the domain-specific data (and
>>> maybe update some statistics, which is not critical).
>>
>> The credit and credit2 schedulers do a bit more than that in their free_domdata
>> functions.
>
> Sorry, didn't get enough sleep on the weekend ;-)
>
> I checked only the 4.1 and 4.2 trees. There, only an xfree of the domain
> data is done.
>
>>
>> The credit scheduler frees the node_affinity_cpumask contained in the domain
>> data and the credit2 scheduler deletes a list element contained in the domain
>> data. Since with this bug they are accessing structures that do not belong to
>> them, bad things happen.
>
> So the patch would be subject to a 4.3 backport, I think.
Hmm, I'm slightly confused: credit2's free_domdata has always been
doing more than just xfree() afaict, and hence backporting is either
necessary uniformly or (taking into account that it was made clear
that arinc doesn't work with CPU pools anyway so far) not at all.
Please clarify.
Jan
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-07 7:39 ` Jan Beulich
@ 2013-11-07 9:09 ` Juergen Gross
2013-11-07 9:37 ` Jan Beulich
0 siblings, 1 reply; 13+ messages in thread
From: Juergen Gross @ 2013-11-07 9:09 UTC (permalink / raw)
To: Jan Beulich
Cc: George Dunlap, Dario Faggioli, Keir Fraser, Nate Studer,
xen-devel
On 07.11.2013 08:39, Jan Beulich wrote:
>>>> On 05.11.13 at 06:59, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
>> On 04.11.2013 16:22, Nate Studer wrote:
>>> On 11/4/2013 4:58 AM, Juergen Gross wrote:
>>>> All other schedulers will just call xfree() for the domain-specific data (and
>>>> maybe update some statistics, which is not critical).
>>>
>>> The credit and credit2 schedulers do a bit more than that in their free_domdata
>>> functions.
>>
>> Sorry, didn't get enough sleep on the weekend ;-)
>>
>> I checked only the 4.1 and 4.2 trees. There, only an xfree of the domain
>> data is done.
>>
>>>
>>> The credit scheduler frees the node_affinity_cpumask contained in the domain
>>> data and the credit2 scheduler deletes a list element contained in the domain
>>> data. Since with this bug they are accessing structures that do not belong to
>>> them, bad things happen.
>>
>> So the patch would be subject to a 4.3 backport, I think.
>
> Hmm, I'm slightly confused: credit2's free_domdata has always been
> doing more than just xfree() afaict, and hence backporting is either
> necessary uniformly or (taking into account that it was made clear
> that arinc doesn't work with CPU pools anyway so far) not at all.
>
> Please clarify.
Okay, I assumed only "production ready" features are to be taken into account
for a backport. And credit2 is clearly not in this state, or am I wrong?
A 4.3 backport should be considered in any case, as the sedf and credit
schedulers behave differently in free_domdata, and both are "production
ready". If you want to be safe for credit2 and/or arinc653 as well,
backports to 4.2 and 4.1 will be required.
In any case a backport isn't very complex. :-)
Juergen
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-07 9:09 ` Juergen Gross
@ 2013-11-07 9:37 ` Jan Beulich
2013-11-07 9:43 ` Juergen Gross
0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2013-11-07 9:37 UTC (permalink / raw)
To: Juergen Gross
Cc: George Dunlap, Dario Faggioli, Keir Fraser, Nate Studer,
xen-devel
>>> On 07.11.13 at 10:09, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
> On 07.11.2013 08:39, Jan Beulich wrote:
>>>>> On 05.11.13 at 06:59, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
>>> On 04.11.2013 16:22, Nate Studer wrote:
>>>> On 11/4/2013 4:58 AM, Juergen Gross wrote:
>>>>> All other schedulers will just call xfree() for the domain-specific data
>>>>> (and maybe update some statistics, which is not critical).
>>>>
>>>> The credit and credit2 schedulers do a bit more than that in their
>>>> free_domdata functions.
>>>
>>> Sorry, didn't get enough sleep on the weekend ;-)
>>>
>>> I checked only the 4.1 and 4.2 trees. There, only an xfree of the domain
>>> data is done.
>>>
>>>>
>>>> The credit scheduler frees the node_affinity_cpumask contained in the
>>>> domain data and the credit2 scheduler deletes a list element contained
>>>> in the domain data. Since with this bug they are accessing structures
>>>> that do not belong to them, bad things happen.
>>>
>>> So the patch would be subject to a 4.3 backport, I think.
>>
>> Hmm, I'm slightly confused: credit2's free_domdata has always been
>> doing more than just xfree() afaict, and hence backporting is either
>> necessary uniformly or (taking into account that it was made clear
>> that arinc doesn't work with CPU pools anyway so far) not at all.
>>
>> Please clarify.
>
>> Okay, I assumed only "production ready" features are to be taken into
>> account for a backport. And credit2 is clearly not in this state, or am
>> I wrong?
You aren't, but is arinc production ready? I wouldn't think so
simply based on it not working with CPU pools. And then the
backporting question would become moot.
> A 4.3 backport should be considered in any case, as the sedf and credit
> schedulers behave differently in free_domdata, and both are "production
> ready". If you want to be safe for credit2 and/or arinc653 as well,
> backports to 4.2 and 4.1 will be required.
>
> In any case a backport isn't very complex. :-)
Indeed. But I'd like backports to be on purpose as well as
consistent across trees (iow: applied to all maintained trees
where needed, and only there).
Jan
* Re: [Patch] Call sched_destroy_domain before cpupool_rm_domain.
2013-11-07 9:37 ` Jan Beulich
@ 2013-11-07 9:43 ` Juergen Gross
0 siblings, 0 replies; 13+ messages in thread
From: Juergen Gross @ 2013-11-07 9:43 UTC (permalink / raw)
To: Jan Beulich
Cc: George Dunlap, Dario Faggioli, Keir Fraser, Nate Studer,
xen-devel
On 07.11.2013 10:37, Jan Beulich wrote:
>>>> On 07.11.13 at 10:09, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
>> On 07.11.2013 08:39, Jan Beulich wrote:
>>>>>> On 05.11.13 at 06:59, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
>>>> On 04.11.2013 16:22, Nate Studer wrote:
>>>>> On 11/4/2013 4:58 AM, Juergen Gross wrote:
>>>>>> All other schedulers will just call xfree() for the domain-specific data
>>>>>> (and maybe update some statistics, which is not critical).
>>>>>
>>>>> The credit and credit2 schedulers do a bit more than that in their
>>>>> free_domdata functions.
>>>>
>>>> Sorry, didn't get enough sleep on the weekend ;-)
>>>>
>>>> I checked only the 4.1 and 4.2 trees. There, only an xfree of the domain
>>>> data is done.
>>>>
>>>>>
>>>>> The credit scheduler frees the node_affinity_cpumask contained in the
>>>>> domain data and the credit2 scheduler deletes a list element contained
>>>>> in the domain data. Since with this bug they are accessing structures
>>>>> that do not belong to them, bad things happen.
>>>>
>>>> So the patch would be subject to a 4.3 backport, I think.
>>>
>>> Hmm, I'm slightly confused: credit2's free_domdata has always been
>>> doing more than just xfree() afaict, and hence backporting is either
>>> necessary uniformly or (taking into account that it was made clear
>>> that arinc doesn't work with CPU pools anyway so far) not at all.
>>>
>>> Please clarify.
>>
>> Okay, I assumed only "production ready" features are to be taken into
>> account for a backport. And credit2 is clearly not in this state, or am
>> I wrong?
>
> You aren't, but is arinc production ready? I wouldn't think so
> simply based on it not working with CPU pools. And then the
> backporting question would become moot.
No, it doesn't. The following statement should have made that clear:
>> A 4.3 backport should be considered in any case, as the sedf and credit
>> schedulers behave differently in free_domdata, and both are "production
>> ready".
If you have credit as the default scheduler and use sedf in a cpupool,
destroying a domain in the sedf cpupool will use the credit free_domdata
routine, leading to an error in 4.3 when free_cpumask_var() is called.
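
Spelled out, that failure path under the old ordering looks roughly like
this (illustrative; exact 4.3 names approximate):

    /* complete_domain_destroy(d)      d in a sedf pool, credit as default
     *   cpupool_rm_domain(d)          sets d->cpupool = NULL
     *   sched_destroy_domain(d)       DOM2OP(d) now yields credit's ops
     *     csched_free_domdata(...)    runs on sedf's private data
     *       free_cpumask_var(...)     reads a bogus pointer -> crash
     */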
Juergen