* [Xenomai] Kernel OOPS during regression tests
@ 2013-01-12 17:26 John Morris
  2013-01-12 17:31 ` Gilles Chanteperdrix
                   ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: John Morris @ 2013-01-12 17:26 UTC (permalink / raw)
  To: Xenomai

Hi list,

The el6 packages are coming along nicely:

https://github.com/zultron/kernel-ml/tree/branch3.5.3-xeno

Still a few problems when running xeno-regression-test.

1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
getting foggy about all the things I've seen, but it seems like it was
happening earlier in the tests until these config values were quadrupled.

2)  posix/mprotect is failing:  "FAILURE: sigdebug_handler triggered,
reason 2    memory write after exec enable".

3)  FPU warning:  "fptest.h:24: Warning: Linux is compiled to use FPU in
kernel-space.    For this reason, switchtest can not test using FPU in
Linux kernel-space."

Regression test run log:
http://www.zultron.com/static/2013/01/xenomai/foo-xeno-regression-test.log

Log of dmesg with oops at end:
http://www.zultron.com/static/2013/01/xenomai/foo-oops.log

Kernel .config:
www.zultron.com/static/2013/01/xenomai/foo-kernel.config

The kernel is built from the elrepo.org kernel, and this config file
overrides the upstream config; makes it easy to see the Xenomai-specific
changes:
https://github.com/zultron/kernel-ml/blob/branch3.5.3-xeno/config-3.5.3-xenomai-x86_64

Any help is greatly appreciated!  These RPMs are part of my effort to
enable RH-like distros to run LinuxCNC and expand the pool of potential
users, so if I'm successful, many folks will benefit.

	John



* Re: [Xenomai] Kernel OOPS during regression tests
  2013-01-12 17:26 [Xenomai] Kernel OOPS during regression tests John Morris
@ 2013-01-12 17:31 ` Gilles Chanteperdrix
  2013-01-13  4:36   ` John Morris
  2013-01-12 19:02 ` [Xenomai] Kernel OOPS during regression tests Gilles Chanteperdrix
  2013-01-12 19:03 ` [Xenomai] Kernel OOPS during regression tests Gilles Chanteperdrix
  2 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-12 17:31 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 01/12/2013 06:26 PM, John Morris wrote:

> Hi list,
> 
> The el6 packages are coming along nicely:
> 
> https://github.com/zultron/kernel-ml/tree/branch3.5.3-xeno

>

> Still a few problems when running xeno-regression-test.
> 
> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
> getting foggy about all the things I've seen, but it seems like it was
> happening earlier in the tests until these config values were quadrupled.


Could you check whether you can reproduce this issue with the I-pipe
patch for 3.5.7? The next Xenomai release will be based on this version
on x86 anyway. It is in branch for-core-3.5.7 in ipipe-gch.git.

> 
> 2)  posix/mprotect is failing:  "FAILURE: sigdebug_handler triggered,
> reason 2    memory write after exec enable".


I guess you need Jan's fixes.

> 
> 3)  FPU warning:  "fptest.h:24: Warning: Linux is compiled to use FPU in
> kernel-space.    For this reason, switchtest can not test using FPU in
> Linux kernel-space."


Well, the warning says it all.

-- 
                                                                Gilles.



* Re: [Xenomai] Kernel OOPS during regression tests
  2013-01-12 17:26 [Xenomai] Kernel OOPS during regression tests John Morris
  2013-01-12 17:31 ` Gilles Chanteperdrix
@ 2013-01-12 19:02 ` Gilles Chanteperdrix
  2013-01-13  6:50   ` John Morris
  2013-01-12 19:03 ` [Xenomai] Kernel OOPS during regression tests Gilles Chanteperdrix
  2 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-12 19:02 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 01/12/2013 06:26 PM, John Morris wrote:

> Hi list,
> 
> The el6 packages are coming along nicely:
> 
> https://github.com/zultron/kernel-ml/tree/branch3.5.3-xeno
> 
> Still a few problems when running xeno-regression-test.
> 
> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
> getting foggy about all the things I've seen, but it seems like it was
> happening earlier in the tests until these config values were quadrupled.


Try disabling transparent hugepages, as noted by Jan.

> 
> 2)  posix/mprotect is failing:  "FAILURE: sigdebug_handler triggered,
> reason 2    memory write after exec enable".
> 
> 3)  FPU warning:  "fptest.h:24: Warning: Linux is compiled to use FPU in
> kernel-space.    For this reason, switchtest can not test using FPU in
> Linux kernel-space."
> 
> Regression test run log:
> http://www.zultron.com/static/2013/01/xenomai/foo-xeno-regression-test.log


You should normally use the -l option to use a non default "load", and
run the test for a long time.

See xeno-regression-test --help and dohell --help


-- 
                                                                Gilles.



* Re: [Xenomai] Kernel OOPS during regression tests
  2013-01-12 17:26 [Xenomai] Kernel OOPS during regression tests John Morris
  2013-01-12 17:31 ` Gilles Chanteperdrix
  2013-01-12 19:02 ` [Xenomai] Kernel OOPS during regression tests Gilles Chanteperdrix
@ 2013-01-12 19:03 ` Gilles Chanteperdrix
  2013-01-13  4:40   ` John Morris
  2 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-12 19:03 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 01/12/2013 06:26 PM, John Morris wrote:

> The kernel is built from the elrepo.org kernel, and this config file
> overrides the upstream config; makes it easy to see the Xenomai-specific
> changes:
> https://github.com/zultron/kernel-ml/blob/branch3.5.3-xeno/config-3.5.3-xenomai-x86_64


It seems a really bad idea to enable the SMI workaround by default.

-- 
                                                                Gilles.



* Re: [Xenomai] Kernel OOPS during regression tests
  2013-01-12 17:31 ` Gilles Chanteperdrix
@ 2013-01-13  4:36   ` John Morris
  2013-01-13 12:16     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: John Morris @ 2013-01-13  4:36 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
> On 01/12/2013 06:26 PM, John Morris wrote:
>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>> getting foggy about all the things I've seen, but it seems like it was
>> happening earlier in the tests until these config values were quadrupled.
> 
> 
> Could you check whether you can reproduce this issue with the I-pipe
> patch for 3.5.7 ? The next xenomai release will be based on this version
> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git

Different problem; Xenomai wouldn't start:

  I-pipe: could not find timer for cpu #0

dmesg:
http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log

.config:
www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config

FYI, I found this same problem on two of my systems while testing your
Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
 They're about the same generation of motherboards, AM2 or AM2+ socket.
 One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.

Hardware looks similar to Mariusz's in this post, where he had the same
problem:

http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html

He's also running AMD 64-bit on a Gigabyte motherboard, but the next
generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
BIOS option on these boards to enable/disable.  These same motherboards
don't suffer this problem with mainline Xenomai on 3.5.3.

>> 2)  posix/mprotect is failing:  "FAILURE: sigdebug_handler triggered,
>> reason 2    memory write after exec enable".
> 
> 
> I guess you need Jan's fixes.

Jan's fix is to turn off CONFIG_TRANSPARENT_HUGEPAGE, right?  I'll
report back after a new 3.5.3 kernel is compiled.

>> 3)  FPU warning:  "fptest.h:24: Warning: Linux is compiled to use FPU in
>> kernel-space.    For this reason, switchtest can not test using FPU in
>> Linux kernel-space."
> 
> 
> Well, the warning says it all.

Sorry, I hit 'send' without Googling first.  So, here you say it's
harmless, is that right?

http://www.xenomai.org/pipermail/xenomai-core/2009-05/msg00011.html

	John



* Re: [Xenomai] Kernel OOPS during regression tests
  2013-01-12 19:03 ` [Xenomai] Kernel OOPS during regression tests Gilles Chanteperdrix
@ 2013-01-13  4:40   ` John Morris
  2013-01-13 13:53     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: John Morris @ 2013-01-13  4:40 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai



On 01/12/2013 01:03 PM, Gilles Chanteperdrix wrote:
> On 01/12/2013 06:26 PM, John Morris wrote:
> 
>> The kernel is built from the elrepo.org kernel, and this config file
>> overrides the upstream config; makes it easy to see the Xenomai-specific
>> changes:
>> https://github.com/zultron/kernel-ml/blob/branch3.5.3-xeno/config-3.5.3-xenomai-x86_64
> 
> 
> It seems a really bad idea to enable the SMI workaround by default.
> 

Thanks!  Despite feeling uncomfortable contradicting the documentation,
I was encouraged to turn this on by someone more authoritative than me.
 It does seem like a separate package with it turned on is warranted for
those who have problems and have been fairly warned about the risk.

	John



* Re: [Xenomai] Kernel OOPS during regression tests
  2013-01-12 19:02 ` [Xenomai] Kernel OOPS during regression tests Gilles Chanteperdrix
@ 2013-01-13  6:50   ` John Morris
  2013-01-13 11:23     ` Jan Kiszka
  0 siblings, 1 reply; 54+ messages in thread
From: John Morris @ 2013-01-13  6:50 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 01/12/2013 01:02 PM, Gilles Chanteperdrix wrote:
> On 01/12/2013 06:26 PM, John Morris wrote:
> 
>> Hi list,
>>
>> The el6 packages are coming along nicely:
>>
>> https://github.com/zultron/kernel-ml/tree/branch3.5.3-xeno
>>
>> Still a few problems when running xeno-regression-test.
>>
>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>> getting foggy about all the things I've seen, but it seems like it was
>> happening earlier in the tests until these config values were quadrupled.
> 
> 
> Try disabling transparent hugepages, as noted by Jan.
> 
>>
>> 2)  posix/mprotect is failing:  "FAILURE: sigdebug_handler triggered,
>> reason 2    memory write after exec enable".
>>
>> 3)  FPU warning:  "fptest.h:24: Warning: Linux is compiled to use FPU in
>> kernel-space.    For this reason, switchtest can not test using FPU in
>> Linux kernel-space."
>>
>> Regression test run log:
>> http://www.zultron.com/static/2013/01/xenomai/foo-xeno-regression-test.log
> 
> 
> You should normally use the -l option to use a non default "load", and
> run the test for a long time.
> 
> See xeno-regression-test --help and dohell --help
> 
> 


Jan's fix appears to have solved 1) above, but 2) posix/mprotect is
still failing.  Thanks for the xeno-regression-test tips; ran a couple
more (not for a long time, will do that later).

Test runs:
http://www.zultron.com/static/2013/01/xenomai/3.5.3-jan-fix/foo-xeno-regression-test-3.5.3-jan-fix.log

Kernel config (only change is disabling TRANSPARENT_HUGEPAGE):
http://www.zultron.com/static/2013/01/xenomai/3.5.3-jan-fix/foo-kernel-3.5.3-jan-fix.log

Dmesg output with no oops (hopefully boring):
http://www.zultron.com/static/2013/01/xenomai/3.5.3-jan-fix/foo-dmesg-3.5.3-jan-fix.log
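
A claim like "only change is disabling TRANSPARENT_HUGEPAGE" can be
checked mechanically by comparing the two .config files. The following
is a minimal sketch and was not run in this thread; the helper names
and the two config fragments are made up for illustration, and a
symbol missing from a file is treated as unset, which is a
simplification.

```python
def parse_kconfig(text):
    """Map each CONFIG_ symbol in a kernel .config to its value ('n' if unset)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # "# CONFIG_FOO is not set" -> CONFIG_FOO = n
            name = line[2:-len(" is not set")]
            options[name] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def config_changes(old_text, new_text):
    """Return {symbol: (old_value, new_value)} for every symbol that differs."""
    old, new = parse_kconfig(old_text), parse_kconfig(new_text)
    return {
        name: (old.get(name, "n"), new.get(name, "n"))
        for name in sorted(set(old) | set(new))
        if old.get(name, "n") != new.get(name, "n")
    }


# Made-up fragments for illustration; not copied from the linked config files.
before = "CONFIG_TRANSPARENT_HUGEPAGE=y\nCONFIG_XENOMAI=y\n"
after = "# CONFIG_TRANSPARENT_HUGEPAGE is not set\nCONFIG_XENOMAI=y\n"
print(config_changes(before, after))
```

With the real files, the text of each would be read from disk and
passed to config_changes() in place of the two fragments.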

	John



* Re: [Xenomai] Kernel OOPS during regression tests
  2013-01-13  6:50   ` John Morris
@ 2013-01-13 11:23     ` Jan Kiszka
  2013-01-13 12:18       ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: Jan Kiszka @ 2013-01-13 11:23 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 2013-01-13 07:50, John Morris wrote:
> On 01/12/2013 01:02 PM, Gilles Chanteperdrix wrote:
>> On 01/12/2013 06:26 PM, John Morris wrote:
>>
>>> Hi list,
>>>
>>> The el6 packages are coming along nicely:
>>>
>>> https://github.com/zultron/kernel-ml/tree/branch3.5.3-xeno
>>>
>>> Still a few problems when running xeno-regression-test.
>>>
>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>> getting foggy about all the things I've seen, but it seems like it was
>>> happening earlier in the tests until these config values were quadrupled.
>>
>>
>> Try disabling transparent hugepages, as noted by Jan.
>>
>>>
>>> 2)  posix/mprotect is failing:  "FAILURE: sigdebug_handler triggered,
>>> reason 2    memory write after exec enable".
>>>
>>> 3)  FPU warning:  "fptest.h:24: Warning: Linux is compiled to use FPU in
>>> kernel-space.    For this reason, switchtest can not test using FPU in
>>> Linux kernel-space."
>>>
>>> Regression test run log:
>>> http://www.zultron.com/static/2013/01/xenomai/foo-xeno-regression-test.log
>>
>>
>> You should normally use the -l option to use a non default "load", and
>> run the test for a long time.
>>
>> See xeno-regression-test --help and dohell --help
>>
>>
> 
> 
> Jan's fix appears to have solved 1) above, but 2) posix/mprotect is
> still failing.  Thanks for the xeno-regression-test tips; ran a couple
> more (not for a long time, will do that later).

Can you clarify what your setup looks like? The kernel is a vanilla 3.5.3
patched with ipipe-core-3.5.3-x86-2.patch? And then you just disabled
THP? Or did you switch to my for-upstream/3.5 I-pipe branch?

I'll have a look at your .config next week.

Jan




* Re: [Xenomai] Kernel OOPS during regression tests
  2013-01-13  4:36   ` John Morris
@ 2013-01-13 12:16     ` Gilles Chanteperdrix
  2013-01-13 19:14       ` [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests) John Morris
  0 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-13 12:16 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 01/13/2013 05:36 AM, John Morris wrote:

> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>> On 01/12/2013 06:26 PM, John Morris wrote:
>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>> getting foggy about all the things I've seen, but it seems like it was
>>> happening earlier in the tests until these config values were quadrupled.
>>
>>
>> Could you check whether you can reproduce this issue with the I-pipe
>> patch for 3.5.7 ? The next xenomai release will be based on this version
>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
> 
> Different problem; Xenomai wouldn't start:
> 
>   I-pipe: could not find timer for cpu #0
> 
> dmesg:
> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
> 
> .config:
> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
> 
> FYI, I found this same problem on two of my systems while testing your
> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
> 
> Hardware looks similar to Mariusz's in this post, where he had the same
> problem:
> 
> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
> 
> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
> BIOS option on these boards to enable/disable.  These same motherboards
> don't suffer this problem with mainline Xenomai on 3.5.3.


If you had the same problem as Mariusz, you would have seen it with
3.5.3, and you would get the message about C1E in dmesg, so it is
probably something else. Could you run

cat /proc/timer_list

?
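
For reading that output, a minimal sketch of pulling the clock event
device names and their mode numbers out of /proc/timer_list. The
sample text is invented and heavily shortened, the function name is
hypothetical, and the mode numbering is recalled from the kernel's
clock_event_mode enum of that era, so treat all three as assumptions.

```python
import re

# Assumed numbering of the kernel's clock_event_mode enum (3.x era).
MODE_NAMES = {0: "unused", 1: "shutdown", 2: "periodic", 3: "oneshot", 4: "resume"}


def clock_event_modes(timer_list_text):
    """Return [(device_name, mode_number)] for each clock event device block."""
    devices = []
    current = None
    for line in timer_list_text.splitlines():
        header = re.match(r"\s*Clock Event Device:\s*(\S+)", line)
        if header:
            current = header.group(1)
            continue
        # Only lines that start with "mode:" count; "Tick Device: mode:" is skipped.
        mode = re.match(r"\s*mode:\s*(\d+)", line)
        if mode and current is not None:
            devices.append((current, int(mode.group(1))))
            current = None
    return devices


# Invented, shortened sample; real output has many more fields per device.
sample = """\
Tick Device: mode:     1
Per CPU device: 0
Clock Event Device: lapic
 max_delta_ns:   128000000000
 mode:           3
"""

for name, mode in clock_event_modes(sample):
    print(name, mode, MODE_NAMES.get(mode, "unknown"))
```

On a live system the text would come from reading /proc/timer_list,
which may need root privileges on newer kernels.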

-- 
                                                                Gilles.



* Re: [Xenomai] Kernel OOPS during regression tests
  2013-01-13 11:23     ` Jan Kiszka
@ 2013-01-13 12:18       ` Gilles Chanteperdrix
  2013-01-13 19:34         ` [Xenomai] 3.5.3 posix/mprotect fail "sigdebug_handler triggered" (Was: Re: Kernel OOPS during regression tests) John Morris
  0 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-13 12:18 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: John Morris, Xenomai

On 01/13/2013 12:23 PM, Jan Kiszka wrote:

> On 2013-01-13 07:50, John Morris wrote:
>> On 01/12/2013 01:02 PM, Gilles Chanteperdrix wrote:
>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>
>>>> Hi list,
>>>>
>>>> The el6 packages are coming along nicely:
>>>>
>>>> https://github.com/zultron/kernel-ml/tree/branch3.5.3-xeno
>>>>
>>>> Still a few problems when running xeno-regression-test.
>>>>
>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>> getting foggy about all the things I've seen, but it seems like it was
>>>> happening earlier in the tests until these config values were quadrupled.
>>>
>>>
>>> Try disabling transparent hugepages, as noted by Jan.
>>>
>>>>
>>>> 2)  posix/mprotect is failing:  "FAILURE: sigdebug_handler triggered,
>>>> reason 2    memory write after exec enable".
>>>>
>>>> 3)  FPU warning:  "fptest.h:24: Warning: Linux is compiled to use FPU in
>>>> kernel-space.    For this reason, switchtest can not test using FPU in
>>>> Linux kernel-space."
>>>>
>>>> Regression test run log:
>>>> http://www.zultron.com/static/2013/01/xenomai/foo-xeno-regression-test.log
>>>
>>>
>>> You should normally use the -l option to use a non default "load", and
>>> run the test for a long time.
>>>
>>> See xeno-regression-test --help and dohell --help
>>>
>>>
>>
>>
>> Jan's fix appears to have solved 1) above, but 2) posix/mprotect is
>> still failing.  Thanks for the xeno-regression-test tips; ran a couple
>> more (not for a long time, will do that later).
> 
> Can you clarify what your setup looks like? The kernel is a vanilla 3.5.3
> patched with ipipe-core-3.5.3-x86-2.patch? And then you just disabled
> THP? Or did you switch to my for-upstream/3.5 I-pipe branch?


I have merged the bugfixes from your branch in my branch in addition to
other bug fixes, so, I would recommend using this one.

-- 
                                                                Gilles.



* Re: [Xenomai] Kernel OOPS during regression tests
  2013-01-13  4:40   ` John Morris
@ 2013-01-13 13:53     ` Gilles Chanteperdrix
  2013-01-13 19:36       ` [Xenomai] SMI workarounds in one-size-fits-all kernel packages (Was: Re: Kernel OOPS during regression tests) John Morris
  0 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-13 13:53 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 01/13/2013 05:40 AM, John Morris wrote:

> 
> 
> On 01/12/2013 01:03 PM, Gilles Chanteperdrix wrote:
>> On 01/12/2013 06:26 PM, John Morris wrote:
>>
>>> The kernel is built from the elrepo.org kernel, and this config file
>>> overrides the upstream config; makes it easy to see the Xenomai-specific
>>> changes:
>>> https://github.com/zultron/kernel-ml/blob/branch3.5.3-xeno/config-3.5.3-xenomai-x86_64
>>
>>
>> It seems a really bad idea to enable the SMI workaround by default.
>>
> 
> Thanks!  Despite feeling uncomfortable contradicting the documentation,
> I was encouraged to turn this on by someone more authoritative than me.
>  It does seem like a separate package with it turned on is warranted for
> those who have problems and have been fairly warned about the risk.


I guess we should turn the compile-time option into a kernel parameter.
So, the smi workaround would be disabled by default, and only enabled
when passing a kernel command-line option.

-- 
                                                                Gilles.



* [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-13 12:16     ` Gilles Chanteperdrix
@ 2013-01-13 19:14       ` John Morris
  2013-01-13 19:41         ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: John Morris @ 2013-01-13 19:14 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

Hi Gilles and Jan,

Note change of thread subject.  I'm starting to get confused.

On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
> On 01/13/2013 05:36 AM, John Morris wrote:
> 
>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>> getting foggy about all the things I've seen, but it seems like it was
>>>> happening earlier in the tests until these config values were quadrupled.
>>>
>>>
>>> Could you check whether you can reproduce this issue with the I-pipe
>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>
>> Different problem; Xenomai wouldn't start:
>>
>>   I-pipe: could not find timer for cpu #0
>>
>> dmesg:
>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>
>> .config:
>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>
>> FYI, I found this same problem on two of my systems while testing your
>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>
>> Hardware looks similar to Mariusz's in this post, where he had the same
>> problem:
>>
>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>
>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>> BIOS option on these boards to enable/disable.  These same motherboards
>> don't suffer this problem with mainline Xenomai on 3.5.3.
> 
> 
> If you had the same problem as Marius, you would have seen it with
> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
> probably something else. 

Yes, I'm definitely getting confused.  I did see the same problem with
C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
packages that are the main subject of this sub-thread:

http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log

> Could you run
> 
> cat /proc/timer_list

Back to el6 again, 3.5.7 i-pipe:

http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log

	John



* [Xenomai] 3.5.3 posix/mprotect fail "sigdebug_handler triggered" (Was: Re: Kernel OOPS during regression tests)
  2013-01-13 12:18       ` Gilles Chanteperdrix
@ 2013-01-13 19:34         ` John Morris
  2013-01-13 19:42           ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: John Morris @ 2013-01-13 19:34 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Jan Kiszka, Xenomai

Another subject change here...

On 01/13/2013 06:18 AM, Gilles Chanteperdrix wrote:
> On 01/13/2013 12:23 PM, Jan Kiszka wrote:
> 
>> On 2013-01-13 07:50, John Morris wrote:
>>> On 01/12/2013 01:02 PM, Gilles Chanteperdrix wrote:
>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>
>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>
>>>>
>>>> Try disabling transparent hugepages, as noted by Jan.
>>>>
>>>>>
>>>>> 2)  posix/mprotect is failing:  "FAILURE: sigdebug_handler triggered,
>>>>> reason 2    memory write after exec enable".
>>>>>
>>>>> 3)  FPU warning:  "fptest.h:24: Warning: Linux is compiled to use FPU in
>>>>> kernel-space.    For this reason, switchtest can not test using FPU in
>>>>> Linux kernel-space."
>>>>>
>>>>> Regression test run log:
>>>>> http://www.zultron.com/static/2013/01/xenomai/foo-xeno-regression-test.log
>>>>
>>>>
>>>> You should normally use the -l option to use a non default "load", and
>>>> run the test for a long time.
>>>>
>>>> See xeno-regression-test --help and dohell --help
>>>>
>>>>
>>>
>>>
>>> Jan's fix appears to have solved 1) above, but 2) posix/mprotect is
>>> still failing.  Thanks for the xeno-regression-test tips; ran a couple
>>> more (not for a long time, will do that later).
>>
>> Can you clarify what your setup looks like? The kernel is a vanilla 3.5.3
>> patched with ipipe-core-3.5.3-x86-2.patch? And then you just disabled
>> THP? Or did you switch to my for-upstream/3.5 I-pipe branch?

Vanilla 3.5.3, released Xenomai 2.6.2, THP disabled.

> I have merged the bugfixes from your branch in my branch in addition to
> other bug fixes, so, I would recommend using this one.

You must mean the xenomai-2.6.git tree, master branch.  I'll begin using
that.

	John



* [Xenomai] SMI workarounds in one-size-fits-all kernel packages (Was: Re: Kernel OOPS during regression tests)
  2013-01-13 13:53     ` Gilles Chanteperdrix
@ 2013-01-13 19:36       ` John Morris
  2013-01-13 19:45         ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: John Morris @ 2013-01-13 19:36 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai



On 01/13/2013 07:53 AM, Gilles Chanteperdrix wrote:
> On 01/13/2013 05:40 AM, John Morris wrote:
> 
>>
>>
>> On 01/12/2013 01:03 PM, Gilles Chanteperdrix wrote:
>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>
>>>> The kernel is built from the elrepo.org kernel, and this config file
>>>> overrides the upstream config; makes it easy to see the Xenomai-specific
>>>> changes:
>>>> https://github.com/zultron/kernel-ml/blob/branch3.5.3-xeno/config-3.5.3-xenomai-x86_64
>>>
>>>
>>> It seems a really bad idea to enable the SMI workaround by default.
>>>
>>
>> Thanks!  Despite feeling uncomfortable contradicting the documentation,
>> I was encouraged to turn this on by someone more authoritative than me.
>>  It does seem like a separate package with it turned on is warranted for
>> those who have problems and have been fairly warned about the risk.
> 
> 
> I guess we should turn the compile-time option into a kernel parameter.
> So, the smi workaround would be disabled by default, and only enabled
> when passing a kernel command-line option.

That would be ideal!

	John



* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-13 19:14       ` [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests) John Morris
@ 2013-01-13 19:41         ` Gilles Chanteperdrix
  2013-01-14  4:47           ` [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed! John Morris
  2013-01-14 12:00           ` [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests) Jan Kiszka
  0 siblings, 2 replies; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-13 19:41 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 01/13/2013 08:14 PM, John Morris wrote:

> Hi Gilles and Jan,
> 
> Note change of thread subject.  I'm starting to get confused.
> 
> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>> On 01/13/2013 05:36 AM, John Morris wrote:
>>
>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>
>>>>
>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>
>>> Different problem; Xenomai wouldn't start:
>>>
>>>   I-pipe: could not find timer for cpu #0
>>>
>>> dmesg:
>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>
>>> .config:
>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>
>>> FYI, I found this same problem on two of my systems while testing your
>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>
>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>> problem:
>>>
>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>
>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>> BIOS option on these boards to enable/disable.  These same motherboards
>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>
>>
>> If you had the same problem as Marius, you would have seen it with
>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>> probably something else. 
> 
> Yes, I'm definitely getting confused.  I did see the same problem with
> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
> packages that are the main subject of this sub-thread:
> 
> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log


Ah, that is because I rebased the I-pipe tree in between, and at some
point the code printing the message was wrong (ATOMIC_INIT(0) instead of
ATOMIC_INIT(-1)). That is my fault then, sorry.

> 
>> Could you run
>>
>> cat /proc/timer_list
> 
> Back to el6 again, 3.5.7 i-pipe:
> 
> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log


The LAPIC is definitely up and running (mode: 3). So, it probably means
that the erratum detection is not sufficient to decide not to use a
LAPIC. Checking your logs, we see:

using AMD E400 aware idle routine

which means the LAPIC could potentially be unusable, but the idle
routine also checks for a bit in a K8 specific MSR and prints the message:

System has AMD C1E enabled

if this bit is set. In your case the message is not printed, so the bit
is not set. So the LAPIC is usable, but the changes I made to try to
print a message in Marius's case broke the detection in your case.
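
The reasoning above can be sketched as a small check over a saved dmesg
log. This is only a heuristic built from the two message strings quoted
in this mail; classify_lapic() and its return strings are hypothetical
names, and this is not how the I-pipe detection itself is implemented.

```python
E400_LINE = "using AMD E400 aware idle routine"
C1E_LINE = "System has AMD C1E enabled"


def classify_lapic(dmesg_text):
    """Classify a dmesg dump by the two messages quoted in this thread."""
    has_e400 = E400_LINE in dmesg_text
    has_c1e = C1E_LINE in dmesg_text
    if has_c1e:
        # The K8-specific MSR bit was found set by the idle routine.
        return "C1E enabled: LAPIC potentially unusable"
    if has_e400:
        # Erratum-aware idle routine selected, but the bit was not set.
        return "E400-aware idle, C1E bit clear: LAPIC should be usable"
    return "no E400 erratum handling reported"
```

Usage would be something like classify_lapic(open("dmesg.log").read()),
where dmesg.log is whatever file the log was saved to.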

I have just pushed a rework for this commit in the for-core-3.5.7 branch
in ipipe-gch git.

-- 
                                                                Gilles.



* Re: [Xenomai] 3.5.3 posix/mprotect fail "sigdebug_handler triggered" (Was: Re: Kernel OOPS during regression tests)
  2013-01-13 19:34         ` [Xenomai] 3.5.3 posix/mprotect fail "sigdebug_handler triggered" (Was: Re: Kernel OOPS during regression tests) John Morris
@ 2013-01-13 19:42           ` Gilles Chanteperdrix
  0 siblings, 0 replies; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-13 19:42 UTC (permalink / raw)
  To: John Morris; +Cc: Jan Kiszka, Xenomai

On 01/13/2013 08:34 PM, John Morris wrote:

> Another subject change here...
> 
> On 01/13/2013 06:18 AM, Gilles Chanteperdrix wrote:
>> On 01/13/2013 12:23 PM, Jan Kiszka wrote:
>>
>>> On 2013-01-13 07:50, John Morris wrote:
>>>> On 01/12/2013 01:02 PM, Gilles Chanteperdrix wrote:
>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>
>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>
>>>>>
>>>>> Try disabling transparent hugepages, as noted by Jan.
>>>>>
>>>>>>
>>>>>> 2)  posix/mprotect is failing:  "FAILURE: sigdebug_handler triggered,
>>>>>> reason 2    memory write after exec enable".
>>>>>>
>>>>>> 3)  FPU warning:  "fptest.h:24: Warning: Linux is compiled to use FPU in
>>>>>> kernel-space.    For this reason, switchtest can not test using FPU in
>>>>>> Linux kernel-space."
>>>>>>
>>>>>> Regression test run log:
>>>>>> http://www.zultron.com/static/2013/01/xenomai/foo-xeno-regression-test.log
>>>>>
>>>>>
>>>>> You should normally use the -l option to use a non default "load", and
>>>>> run the test for a long time.
>>>>>
>>>>> See xeno-regression-test --help and dohell --help
>>>>>
>>>>>
>>>>
>>>>
>>>> Jan's fix appears to have solved 1) above, but 2) posix/mprotect is
>>>> still failing.  Thanks for the xeno-regression-test tips; ran a couple
>>>> more (not for a long time, will do that later).
>>>
>>> Can you clarify what your setup looks like? The kernel is a vanilla 3.5.3
>>> patched with ipipe-core-3.5.3-x86-2.patch? And then you just disabled
>>> THP? Or did you switch to my for-upstream/3.5 I-pipe branch?
> 
> Vanilla 3.5.3, released Xenomai 2.6.2, THP disabled.
> 
>> I have merged the bugfixes from your branch in my branch in addition to
>> other bug fixes, so, I would recommend using this one.
> 
> You must mean the xenomai-2.6.git tree, master branch.  I'll begin using
> that.


I mean branch for-core-3.5.7 in ipipe-gch git. But you should also pull
the xenomai-2.6 git master branch fixes, yes.


-- 
                                                                Gilles.



* Re: [Xenomai] SMI workarounds in one-size-fits-all kernel packages (Was: Re: Kernel OOPS during regression tests)
  2013-01-13 19:36       ` [Xenomai] SMI workarounds in one-size-fits-all kernel packages (Was: Re: Kernel OOPS during regression tests) John Morris
@ 2013-01-13 19:45         ` Gilles Chanteperdrix
  2013-01-14  5:33           ` John Morris
  0 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-13 19:45 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 01/13/2013 08:36 PM, John Morris wrote:

> 
> 
> On 01/13/2013 07:53 AM, Gilles Chanteperdrix wrote:
>> On 01/13/2013 05:40 AM, John Morris wrote:
>>
>>>
>>>
>>> On 01/12/2013 01:03 PM, Gilles Chanteperdrix wrote:
>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>
>>>>> The kernel is built from the elrepo.org kernel, and this config file
>>>>> overrides the upstream config; makes it easy to see the Xenomai-specific
>>>>> changes:
>>>>> https://github.com/zultron/kernel-ml/blob/branch3.5.3-xeno/config-3.5.3-xenomai-x86_64
>>>>
>>>>
>>>> It seems a really bad idea to enable the SMI workaround by default.
>>>>
>>>
>>> Thanks!  Despite feeling uncomfortable contradicting the documentation,
>>> I was encouraged to turn this on by someone more authoritative than me.
>>>  It does seem like a separate package with it turned on is warranted for
>>> those who have problems and have been fairly warned about the risk.
>>
>>
>> I guess we should turn the compile-time option into a kernel parameter.
>> So, the smi workaround would be disabled by default, and only enabled
>> when passing a kernel command-line option.
> 
> That would be ideal!


We will try to issue a fix for the 2.6.2 release first; this improvement
will be part of the next release.


-- 
                                                                Gilles.



* Re: [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed!
  2013-01-13 19:41         ` Gilles Chanteperdrix
@ 2013-01-14  4:47           ` John Morris
  2013-01-14 11:57             ` Gilles Chanteperdrix
  2013-01-14 19:50             ` [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed! Gilles Chanteperdrix
  2013-01-14 12:00           ` [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests) Jan Kiszka
  1 sibling, 2 replies; 54+ messages in thread
From: John Morris @ 2013-01-14  4:47 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 01/13/2013 01:41 PM, Gilles Chanteperdrix wrote:
> On 01/13/2013 08:14 PM, John Morris wrote:
> 
>> Hi Gilles and Jan,
>>
>> Note change of thread subject.  I'm starting to get confused.
>>
>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>
>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>
>>>>>
>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>
>>>> Different problem; Xenomai wouldn't start:
>>>>
>>>>   I-pipe: could not find timer for cpu #0
>>>>
>>>> dmesg:
>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>
>>>> .config:
>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>
>>>> FYI, I found this same problem on two of my systems while testing your
>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>
>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>> problem:
>>>>
>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>
>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>
>>>
>>> If you had the same problem as Marius, you would have seen it with
>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>> probably something else. 
>>
>> Yes, I'm definitely getting confused.  I did see the same problem with
>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>> packages that are the main subject of this sub-thread:
>>
>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
> 
> 
> Ah, that is because I rebased the I-pipe tree in between, and at some
> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
> ATOMIC_INIT(-1)). That is my fault then, sorry.
> 
>>
>>> Could you run
>>>
>>> cat /proc/timer_list
>>
>> Back to el6 again, 3.5.7 i-pipe:
>>
>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
> 
> 
> The LAPIC is definitely up and running (mode: 3). So, it probably means
> that the erratum detection is not sufficient to decide not to use a
> LAPIC. Checking your logs, we see:
> 
> using AMD E400 aware idle routine
> 
> which means the LAPIC could potentially be unusable, but the idle
> routine also checks for a bit in a K8 specific MSR and prints the message:
> 
> System has AMD C1E enabled
> 
> if this bit is set. In your case the message is not printed, so the bit
> is not set. So the LAPIC is usable, but the changes I made to try to
> print a message in Marius's case broke the detection in your case.
> 
> I have just pushed a rework for this commit in the for-core-3.5.7 branch
> in ipipe-gch git.

And it worked, no more C1E error!  Thanks!

It looks like the AMD-64 AM2/AM2+ socket CPUs were the last generation
without C1E, and the AM3 socket CPUs were the first gen with.

Back to the original problem, the posix/mprotect problem is confirmed to
be in this branch:

  ++ /usr/lib64/xenomai/regression/posix/mprotect
  memory read
  FAILURE: sigdebug_handler triggered, reason 2
  memory write after exec enable

Regression test run:
http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-xeno-regression-test-3.5.7-test.log

Dmesg:
http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log

Kernel .config:
http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config.txt

To minimize confusion (esp. my own) and answer Jan's question, this is
Gilles's ipipe-gch/for-core-3.5.7 kernel (20130113git08f0596) with
xenomai master (20130113git210ed428) and standard xenomai kernel options
from the wiki, plus CPUMASK_OFFSTACK of course.

	John



* Re: [Xenomai] SMI workarounds in one-size-fits-all kernel packages (Was: Re: Kernel OOPS during regression tests)
  2013-01-13 19:45         ` Gilles Chanteperdrix
@ 2013-01-14  5:33           ` John Morris
  0 siblings, 0 replies; 54+ messages in thread
From: John Morris @ 2013-01-14  5:33 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 01/13/2013 01:45 PM, Gilles Chanteperdrix wrote:
> On 01/13/2013 08:36 PM, John Morris wrote:
>> On 01/13/2013 07:53 AM, Gilles Chanteperdrix wrote:
>>> On 01/13/2013 05:40 AM, John Morris wrote:
>>>> On 01/12/2013 01:03 PM, Gilles Chanteperdrix wrote:
>>>>> It seems a really bad idea to enable the SMI workaround by default.
>>>>
>>>> Thanks!  Despite feeling uncomfortable contradicting the documentation,
>>>> I was encouraged to turn this on by someone more authoritative than me.
>>>> It does seem like a separate package with it turned on is warranted for
>>>> those who have problems and have been fairly warned about the risk.
>>>
>>> I guess we should turn the compile-time option into a kernel parameter.
>>> So, the smi workaround would be disabled by default, and only enabled
>>> when passing a kernel command-line option.
>>
>> That would be ideal!
> 
> We will try to issue a fix for the 2.6.2 release first; this improvement
> will be part of the next release.

A boot-time option would definitely solve the one-kernel-fits-all need.
 In my ideal fantasy world, a run-time option, perhaps in /proc/xenomai,
would potentially make this dummy-proof, since someone could write a
utility to tweak settings in a running system, maybe even a smart
utility that could help detect dangerous conditions.

I did find Jan's userland utility 'smictrl' [1] that uses libpci to
provide a simple way to query and set SMI controls on a running system.
 If it has all the functionality of Xenomai's SMI workaround, maybe we
could toss that into the packages, write a wiki page and say 'good
enough for now'.  (I tried it on an old Dell PC and my old Thinkpad,
both with ICH5 chipset, and while it could manipulate some bits, the
ones that might have mattered couldn't be reset.  I assumed they were
locked down by the BIOS, not by any shortcoming of the utility.)

	John

[1]  http://git.kiszka.org/?p=smictrl.git



* Re: [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed!
  2013-01-14  4:47           ` [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed! John Morris
@ 2013-01-14 11:57             ` Gilles Chanteperdrix
  2013-01-14 12:00               ` Jan Kiszka
  2013-01-14 19:50             ` [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed! Gilles Chanteperdrix
  1 sibling, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-14 11:57 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 01/14/2013 05:47 AM, John Morris wrote:

> On 01/13/2013 01:41 PM, Gilles Chanteperdrix wrote:
>> On 01/13/2013 08:14 PM, John Morris wrote:
>>
>>> Hi Gilles and Jan,
>>>
>>> Note change of thread subject.  I'm starting to get confused.
>>>
>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>
>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>
>>>>>>
>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>
>>>>> Different problem; Xenomai wouldn't start:
>>>>>
>>>>>   I-pipe: could not find timer for cpu #0
>>>>>
>>>>> dmesg:
>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>
>>>>> .config:
>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>
>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>
>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>> problem:
>>>>>
>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>
>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>
>>>>
>>>> If you had the same problem as Marius, you would have seen it with
>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>> probably something else. 
>>>
>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>> packages that are the main subject of this sub-thread:
>>>
>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>
>>
>> Ah, that is because I rebased the I-pipe tree in between, and at some
>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>
>>>
>>>> Could you run
>>>>
>>>> cat /proc/timer_list
>>>
>>> Back to el6 again, 3.5.7 i-pipe:
>>>
>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>
>>
>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>> that the erratum detection is not sufficient to decide not to use a
>> LAPIC. Checking your logs, we see:
>>
>> using AMD E400 aware idle routine
>>
>> which means the LAPIC could potentially be unusable, but the idle
>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>
>> System has AMD C1E enabled
>>
>> if this bit is set. In your case the message is not printed, so the bit
>> is not set. So the LAPIC is usable, but the changes I made to try to
>> print a message in Marius's case broke the detection in your case.
>>
>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>> in ipipe-gch git.
> 
> And it worked, no more C1E error!  Thanks!
> 
> It looks like the AMD-64 AM2/AM2+ socket CPUs were the last generation
> without C1E, and the AM3 socket CPUs were the first gen with.
> 
> Back to the original problem, the posix/mprotect problem is confirmed to
> be in this branch:
> 
>   ++ /usr/lib64/xenomai/regression/posix/mprotect
>   memory read
>   FAILURE: sigdebug_handler triggered, reason 2
>   memory write after exec enable


I will try to compile a kernel with the same configuration as you to see
if I can reproduce the issue.
-- 
                                                                Gilles.



* Re: [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed!
  2013-01-14 11:57             ` Gilles Chanteperdrix
@ 2013-01-14 12:00               ` Jan Kiszka
  2013-01-14 13:36                 ` Jan Kiszka
  0 siblings, 1 reply; 54+ messages in thread
From: Jan Kiszka @ 2013-01-14 12:00 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: John Morris, Xenomai

On 2013-01-14 12:57, Gilles Chanteperdrix wrote:
> On 01/14/2013 05:47 AM, John Morris wrote:
> 
>> On 01/13/2013 01:41 PM, Gilles Chanteperdrix wrote:
>>> On 01/13/2013 08:14 PM, John Morris wrote:
>>>
>>>> Hi Gilles and Jan,
>>>>
>>>> Note change of thread subject.  I'm starting to get confused.
>>>>
>>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>>
>>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>>
>>>>>>>
>>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>>
>>>>>> Different problem; Xenomai wouldn't start:
>>>>>>
>>>>>>   I-pipe: could not find timer for cpu #0
>>>>>>
>>>>>> dmesg:
>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>>
>>>>>> .config:
>>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>>
>>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>>
>>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>>> problem:
>>>>>>
>>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>>
>>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>>
>>>>>
>>>>> If you had the same problem as Marius, you would have seen it with
>>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>>> probably something else. 
>>>>
>>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>>> packages that are the main subject of this sub-thread:
>>>>
>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>>
>>>
>>> Ah, that is because I rebased the I-pipe tree in between, and at some
>>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>>
>>>>
>>>>> Could you run
>>>>>
>>>>> cat /proc/timer_list
>>>>
>>>> Back to el6 again, 3.5.7 i-pipe:
>>>>
>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>>
>>>
>>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>>> that the erratum detection is not sufficient to decide not to use a
>>> LAPIC. Checking your logs, we see:
>>>
>>> using AMD E400 aware idle routine
>>>
>>> which means the LAPIC could potentially be unusable, but the idle
>>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>>
>>> System has AMD C1E enabled
>>>
>>> if this bit is set. In your case the message is not printed, so the bit
>>> is not set. So the LAPIC is usable, but the changes I made to try to
>>> print a message in Marius's case broke the detection in your case.
>>>
>>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>>> in ipipe-gch git.
>>
>> And it worked, no more C1E error!  Thanks!
>>
>> It looks like the AMD-64 AM2/AM2+ socket CPUs were the last generation
>> without C1E, and the AM3 socket CPUs were the first gen with.
>>
>> Back to the original problem, the posix/mprotect problem is confirmed to
>> be in this branch:
>>
>>   ++ /usr/lib64/xenomai/regression/posix/mprotect
>>   memory read
>>   FAILURE: sigdebug_handler triggered, reason 2
>>   memory write after exec enable
> 
> 
> I will try to compile a kernel with the same configuration as you to see
> if I can reproduce the issue.

Unless you want to double-check: build is already running here. This
feature is critical for us, but I have no clue ATM why it could fail
over the latest patch queue.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-13 19:41         ` Gilles Chanteperdrix
  2013-01-14  4:47           ` [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed! John Morris
@ 2013-01-14 12:00           ` Jan Kiszka
  2013-01-14 18:50             ` Gilles Chanteperdrix
  1 sibling, 1 reply; 54+ messages in thread
From: Jan Kiszka @ 2013-01-14 12:00 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: John Morris, Xenomai

On 2013-01-13 20:41, Gilles Chanteperdrix wrote:
> On 01/13/2013 08:14 PM, John Morris wrote:
> 
>> Hi Gilles and Jan,
>>
>> Note change of thread subject.  I'm starting to get confused.
>>
>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>
>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>
>>>>>
>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>
>>>> Different problem; Xenomai wouldn't start:
>>>>
>>>>   I-pipe: could not find timer for cpu #0
>>>>
>>>> dmesg:
>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>
>>>> .config:
>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>
>>>> FYI, I found this same problem on two of my systems while testing your
>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>
>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>> problem:
>>>>
>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>
>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>
>>>
>>> If you had the same problem as Marius, you would have seen it with
>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>> probably something else. 
>>
>> Yes, I'm definitely getting confused.  I did see the same problem with
>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>> packages that are the main subject of this sub-thread:
>>
>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
> 
> 
> Ah, that is because I rebased the I-pipe tree in between, and at some
> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
> ATOMIC_INIT(-1)). That is my fault then, sorry.
> 
>>
>>> Could you run
>>>
>>> cat /proc/timer_list
>>
>> Back to el6 again, 3.5.7 i-pipe:
>>
>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
> 
> 
> The LAPIC is definitely up and running (mode: 3). So, it probably means
> that the erratum detection is not sufficient to decide not to use a
> LAPIC. Checking your logs, we see:
> 
> using AMD E400 aware idle routine
> 
> which means the LAPIC could potentially be unusable, but the idle
> routine also checks for a bit in a K8 specific MSR and prints the message:
> 
> System has AMD C1E enabled
> 
> if this bit is set. In your case the message is not printed, so the bit
> is not set. So the LAPIC is usable, but the changes I made to try to
> print a message in Marius's case broke the detection in your case.
> 
> I have just pushed a rework for this commit in the for-core-3.5.7 branch
> in ipipe-gch git.

Could you fold those changes into a single patch and add a few words to
the changelog that setup_APIC_timer is too early to check? Then I'll
merge it into the x86 queue.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed!
  2013-01-14 12:00               ` Jan Kiszka
@ 2013-01-14 13:36                 ` Jan Kiszka
  2013-01-14 20:52                   ` John Morris
  0 siblings, 1 reply; 54+ messages in thread
From: Jan Kiszka @ 2013-01-14 13:36 UTC (permalink / raw)
  To: Gilles Chanteperdrix, John Morris; +Cc: Xenomai

On 2013-01-14 13:00, Jan Kiszka wrote:
> On 2013-01-14 12:57, Gilles Chanteperdrix wrote:
>> On 01/14/2013 05:47 AM, John Morris wrote:
>>
>>> On 01/13/2013 01:41 PM, Gilles Chanteperdrix wrote:
>>>> On 01/13/2013 08:14 PM, John Morris wrote:
>>>>
>>>>> Hi Gilles and Jan,
>>>>>
>>>>> Note change of thread subject.  I'm starting to get confused.
>>>>>
>>>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>>>
>>>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>>>
>>>>>>>>
>>>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>>>
>>>>>>> Different problem; Xenomai wouldn't start:
>>>>>>>
>>>>>>>   I-pipe: could not find timer for cpu #0
>>>>>>>
>>>>>>> dmesg:
>>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>>>
>>>>>>> .config:
>>>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>>>
>>>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>>>
>>>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>>>> problem:
>>>>>>>
>>>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>>>
>>>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>>>
>>>>>>
>>>>>> If you had the same problem as Marius, you would have seen it with
>>>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>>>> probably something else. 
>>>>>
>>>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>>>> packages that are the main subject of this sub-thread:
>>>>>
>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>>>
>>>>
>>>> Ah, that is because I rebased the I-pipe tree in between, and at some
>>>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>>>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>>>
>>>>>
>>>>>> Could you run
>>>>>>
>>>>>> cat /proc/timer_list
>>>>>
>>>>> Back to el6 again, 3.5.7 i-pipe:
>>>>>
>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>>>
>>>>
>>>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>>>> that the erratum detection is not sufficient to decide not to use a
>>>> LAPIC. Checking your logs, we see:
>>>>
>>>> using AMD E400 aware idle routine
>>>>
>>>> which means the LAPIC could potentially be unusable, but the idle
>>>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>>>
>>>> System has AMD C1E enabled
>>>>
>>>> if this bit is set. In your case the message is not printed, so the bit
>>>> is not set. So the LAPIC is usable, but the changes I made to try to
>>>> print a message in Marius's case broke the detection in your case.
>>>>
>>>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>>>> in ipipe-gch git.
>>>
>>> And it worked, no more C1E error!  Thanks!
>>>
>>> It looks like the AMD-64 AM2/AM2+ socket CPUs were the last generation
>>> without C1E, and the AM3 socket CPUs were the first gen with.
>>>
>>> Back to the original problem, the posix/mprotect problem is confirmed to
>>> be in this branch:
>>>
>>>   ++ /usr/lib64/xenomai/regression/posix/mprotect
>>>   memory read
>>>   FAILURE: sigdebug_handler triggered, reason 2
>>>   memory write after exec enable
>>
>>
>> I will try to compile a kernel with the same configuration as you to see
>> if I can reproduce the issue.
> 
> Unless you want to double-check: build is already running here. This
> feature is critical for us, but I have no clue ATM why it could fail
> over the latest patch queue.

OK, it would probably be good to double-check, as I'm unable to
reproduce the issue over my queue with John's .config.

The reason code is suspicious BTW: syscall. It would be good if someone
who can reproduce this attached gdb and provided a backtrace from the
signal to the call that triggered it. Could something have gone wrong
while building the test case, so that not all required functions, such
as printf, are wrapped?
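
Where attaching gdb is inconvenient, a handler that dumps the call chain
can give similar information. What follows is only a minimal sketch in
plain POSIX/glibc C, not the code of the mprotect test. It assumes that
the debug signal is SIGXCPU and that the reason code arrives in
si_value.sival_int; the only reason value it knows is 2 (syscall), as
named in this thread. All demo_* names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Assumptions for illustration, not copied from the Xenomai headers. */
#define DEMO_SIGDEBUG        SIGXCPU  /* assumed debug signal */
#define DEMO_REASON_SYSCALL  2        /* "reason 2" is syscall, per this thread */

/* Last reason code seen by the handler, -1 if it never ran. */
volatile sig_atomic_t demo_last_reason = -1;

/* Map a reason code to a label; only the code named in this thread is known. */
const char *demo_reason_name(int reason)
{
	return reason == DEMO_REASON_SYSCALL ? "syscall" : "unknown";
}

/* Record the reason code and dump the current call chain to stderr. */
void demo_sigdebug_handler(int sig, siginfo_t *si, void *context)
{
	void *frames[32];
	int depth;

	(void)sig;
	(void)context;
	demo_last_reason = si->si_value.sival_int;
	/* glibc's backtrace() is not strictly async-signal-safe; fine for a debug aid. */
	depth = backtrace(frames, 32);
	backtrace_symbols_fd(frames, depth, STDERR_FILENO);
}

/* Install the handler with SA_SIGINFO so that the reason code is delivered. */
int demo_install_sigdebug_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_sigaction = demo_sigdebug_handler;
	sa.sa_flags = SA_SIGINFO;
	return sigaction(DEMO_SIGDEBUG, &sa, NULL);
}

/* Self-check without Xenomai: queue the signal to ourselves with reason 2. */
void demo_self_check(void)
{
	union sigval value;

	value.sival_int = DEMO_REASON_SYSCALL;
	assert(demo_install_sigdebug_handler() == 0);
	assert(sigqueue(getpid(), DEMO_SIGDEBUG, value) == 0);
	assert(demo_last_reason == DEMO_REASON_SYSCALL);
}
```

Calling demo_install_sigdebug_handler() early in a test program and
building with -g -rdynamic should make the frames printed on stderr
resolve to function names, which would show which call caused the
switch. demo_self_check() only exercises the handler path on a stock
kernel and says nothing about the real failure.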

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-14 12:00           ` [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests) Jan Kiszka
@ 2013-01-14 18:50             ` Gilles Chanteperdrix
  2013-01-14 19:13               ` Jan Kiszka
  0 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-14 18:50 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: John Morris, Xenomai

On 01/14/2013 01:00 PM, Jan Kiszka wrote:

> On 2013-01-13 20:41, Gilles Chanteperdrix wrote:
>> On 01/13/2013 08:14 PM, John Morris wrote:
>>
>>> Hi Gilles and Jan,
>>>
>>> Note change of thread subject.  I'm starting to get confused.
>>>
>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>
>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>
>>>>>>
>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>
>>>>> Different problem; Xenomai wouldn't start:
>>>>>
>>>>>   I-pipe: could not find timer for cpu #0
>>>>>
>>>>> dmesg:
>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>
>>>>> .config:
>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>
>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>
>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>> problem:
>>>>>
>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>
>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>
>>>>
>>>> If you had the same problem as Marius, you would have seen it with
>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>> probably something else. 
>>>
>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>> packages that are the main subject of this sub-thread:
>>>
>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>
>>
>> Ah, that is because I rebased the I-pipe tree in between, and at some
>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>
>>>
>>>> Could you run
>>>>
>>>> cat /proc/timer_list
>>>
>>> Back to el6 again, 3.5.7 i-pipe:
>>>
>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>
>>
>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>> that the erratum detection is not sufficient to decide not to use a
>> LAPIC. Checking your logs, we see:
>>
>> using AMD E400 aware idle routine
>>
>> which means the LAPIC could potentially be unusable, but the idle
>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>
>> System has AMD C1E enabled
>>
>> If this bit is set, and in your case the message is not printed so the
>> bit is not set. So, the LAPIC is usable, but due to the changes I made
>> to try and print a message in Marius case, I broke the detection in your
>> case.
>>
>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>> in ipipe-gch git.
> 
> Could you fold those changes into a single patch and add a few words to
> the changelog that setup_APIC_timer is too early to check? Then I'll
> merge it into the x86 queue.


I am trying to reach a point where we add bug-fixes and only bug-fixes
to re-release 2.6.2, so the for-core-3.5.7 branch is what I intend to
put in this release. I would like to avoid the other commits in your
branch.

-- 
                                                                Gilles.



* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-14 18:50             ` Gilles Chanteperdrix
@ 2013-01-14 19:13               ` Jan Kiszka
  2013-01-14 19:15                 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: Jan Kiszka @ 2013-01-14 19:13 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: John Morris, Xenomai

On 2013-01-14 19:50, Gilles Chanteperdrix wrote:
> On 01/14/2013 01:00 PM, Jan Kiszka wrote:
> 
>> On 2013-01-13 20:41, Gilles Chanteperdrix wrote:
>>> On 01/13/2013 08:14 PM, John Morris wrote:
>>>
>>>> Hi Gilles and Jan,
>>>>
>>>> Note change of thread subject.  I'm starting to get confused.
>>>>
>>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>>
>>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>>
>>>>>>>
>>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>>
>>>>>> Different problem; Xenomai wouldn't start:
>>>>>>
>>>>>>   I-pipe: could not find timer for cpu #0
>>>>>>
>>>>>> dmesg:
>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>>
>>>>>> .config:
>>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>>
>>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>>
>>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>>> problem:
>>>>>>
>>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>>
>>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>>
>>>>>
>>>>> If you had the same problem as Marius, you would have seen it with
>>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>>> probably something else. 
>>>>
>>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>>> packages that are the main subject of this sub-thread:
>>>>
>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>>
>>>
>>> Ah, that is because I rebased the I-pipe tree in between, and at some
>>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>>
>>>>
>>>>> Could you run
>>>>>
>>>>> cat /proc/timer_list
>>>>
>>>> Back to el6 again, 3.5.7 i-pipe:
>>>>
>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>>
>>>
>>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>>> that the erratum detection is not sufficient to decide not to use a
>>> LAPIC. Checking your logs, we see:
>>>
>>> using AMD E400 aware idle routine
>>>
>>> which means the LAPIC could potentially be unusable, but the idle
>>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>>
>>> System has AMD C1E enabled
>>>
>>> If this bit is set, and in your case the message is not printed so the
>>> bit is not set. So, the LAPIC is usable, but due to the changes I made
>>> to try and print a message in Marius case, I broke the detection in your
>>> case.
>>>
>>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>>> in ipipe-gch git.
>>
>> Could you fold those changes into a single patch and add a few words to
>> the changelog that setup_APIC_timer is too early to check? Then I'll
>> merge it into the x86 queue.
> 
> 
> I am trying to reach a point where we add bug-fixes and only bug-fixes
> to re-release 2.6.2, so, the for-core-3.5.7 branch is what I intend to
> put in this release, I would like to avoid the other commits in your branch.

Please have a closer look at the patches before judging. First, many of
them fix bugs in features that used to work. Second, they add support
in an orthogonal way, i.e. they have no or minimal side effects when
the corresponding kernel features are off. And third, the features,
specifically ftrace/perf, are very useful for a broad audience - and
mandatory for our x86 use cases. It would help not only us a lot if we
could focus on other Xenomai tasks rather than continuing to maintain
the patch queues separately.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-14 19:13               ` Jan Kiszka
@ 2013-01-14 19:15                 ` Gilles Chanteperdrix
  2013-01-14 19:37                   ` Jan Kiszka
  0 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-14 19:15 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: John Morris, Xenomai

On 01/14/2013 08:13 PM, Jan Kiszka wrote:

> On 2013-01-14 19:50, Gilles Chanteperdrix wrote:
>> On 01/14/2013 01:00 PM, Jan Kiszka wrote:
>>
>>> On 2013-01-13 20:41, Gilles Chanteperdrix wrote:
>>>> On 01/13/2013 08:14 PM, John Morris wrote:
>>>>
>>>>> Hi Gilles and Jan,
>>>>>
>>>>> Note change of thread subject.  I'm starting to get confused.
>>>>>
>>>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>>>
>>>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>>>
>>>>>>>>
>>>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>>>
>>>>>>> Different problem; Xenomai wouldn't start:
>>>>>>>
>>>>>>>   I-pipe: could not find timer for cpu #0
>>>>>>>
>>>>>>> dmesg:
>>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>>>
>>>>>>> .config:
>>>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>>>
>>>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>>>
>>>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>>>> problem:
>>>>>>>
>>>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>>>
>>>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>>>
>>>>>>
>>>>>> If you had the same problem as Marius, you would have seen it with
>>>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>>>> probably something else. 
>>>>>
>>>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>>>> packages that are the main subject of this sub-thread:
>>>>>
>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>>>
>>>>
>>>> Ah, that is because I rebased the I-pipe tree in between, and at some
>>>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>>>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>>>
>>>>>
>>>>>> Could you run
>>>>>>
>>>>>> cat /proc/timer_list
>>>>>
>>>>> Back to el6 again, 3.5.7 i-pipe:
>>>>>
>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>>>
>>>>
>>>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>>>> that the erratum detection is not sufficient to decide not to use a
>>>> LAPIC. Checking your logs, we see:
>>>>
>>>> using AMD E400 aware idle routine
>>>>
>>>> which means the LAPIC could potentially be unusable, but the idle
>>>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>>>
>>>> System has AMD C1E enabled
>>>>
>>>> If this bit is set, and in your case the message is not printed so the
>>>> bit is not set. So, the LAPIC is usable, but due to the changes I made
>>>> to try and print a message in Marius case, I broke the detection in your
>>>> case.
>>>>
>>>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>>>> in ipipe-gch git.
>>>
>>> Could you fold those changes into a single patch and add a few words to
>>> the changelog that setup_APIC_timer is too early to check? Then I'll
>>> merge it into the x86 queue.
>>
>>
>> I am trying to reach a point where we add bug-fixes and only bug-fixes
>> to re-release 2.6.2, so, the for-core-3.5.7 branch is what I intend to
>> put in this release, I would like to avoid the other commits in your branch.
> 
> Please have a closer look at the patches before judging. First, many of
> them fix bugs of features that already used to work. Second, they add
> support in an orthogonal way, i.e. have no or minimal side effects when
> the corresponding kernel features are off. And third, the features,
> specifically ftrace/perf, are very useful for a broad audience - and
> mandatory for our x86 use cases. It would not only help us a lot if we
> could focus on different Xenomai tasks than continue to maintain the
> patch queues separately.


I have no problem with merging new features, but I would suggest waiting
until after the 2.6.2 re-release. But ultimately, I am not the one who
decides. As you know, people can upgrade the I-pipe patch used with a
Xenomai release.

-- 
                                                                Gilles.



* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-14 19:15                 ` Gilles Chanteperdrix
@ 2013-01-14 19:37                   ` Jan Kiszka
  2013-01-14 20:39                     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: Jan Kiszka @ 2013-01-14 19:37 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: John Morris, Xenomai

On 2013-01-14 20:15, Gilles Chanteperdrix wrote:
> On 01/14/2013 08:13 PM, Jan Kiszka wrote:
> 
>> On 2013-01-14 19:50, Gilles Chanteperdrix wrote:
>>> On 01/14/2013 01:00 PM, Jan Kiszka wrote:
>>>
>>>> On 2013-01-13 20:41, Gilles Chanteperdrix wrote:
>>>>> On 01/13/2013 08:14 PM, John Morris wrote:
>>>>>
>>>>>> Hi Gilles and Jan,
>>>>>>
>>>>>> Note change of thread subject.  I'm starting to get confused.
>>>>>>
>>>>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>>>>
>>>>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>>>>
>>>>>>>> Different problem; Xenomai wouldn't start:
>>>>>>>>
>>>>>>>>   I-pipe: could not find timer for cpu #0
>>>>>>>>
>>>>>>>> dmesg:
>>>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>>>>
>>>>>>>> .config:
>>>>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>>>>
>>>>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>>>>
>>>>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>>>>> problem:
>>>>>>>>
>>>>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>>>>
>>>>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>>>>
>>>>>>>
>>>>>>> If you had the same problem as Marius, you would have seen it with
>>>>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>>>>> probably something else. 
>>>>>>
>>>>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>>>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>>>>> packages that are the main subject of this sub-thread:
>>>>>>
>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>>>>
>>>>>
>>>>> Ah, that is because I rebased the I-pipe tree in between, and at some
>>>>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>>>>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>>>>
>>>>>>
>>>>>>> Could you run
>>>>>>>
>>>>>>> cat /proc/timer_list
>>>>>>
>>>>>> Back to el6 again, 3.5.7 i-pipe:
>>>>>>
>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>>>>
>>>>>
>>>>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>>>>> that the erratum detection is not sufficient to decide not to use a
>>>>> LAPIC. Checking your logs, we see:
>>>>>
>>>>> using AMD E400 aware idle routine
>>>>>
>>>>> which means the LAPIC could potentially be unusable, but the idle
>>>>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>>>>
>>>>> System has AMD C1E enabled
>>>>>
>>>>> If this bit is set, and in your case the message is not printed so the
>>>>> bit is not set. So, the LAPIC is usable, but due to the changes I made
>>>>> to try and print a message in Marius case, I broke the detection in your
>>>>> case.
>>>>>
>>>>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>>>>> in ipipe-gch git.
>>>>
>>>> Could you fold those changes into a single patch and add a few words to
>>>> the changelog that setup_APIC_timer is too early to check? Then I'll
>>>> merge it into the x86 queue.
>>>
>>>
>>> I am trying to reach a point where we add bug-fixes and only bug-fixes
>>> to re-release 2.6.2, so, the for-core-3.5.7 branch is what I intend to
>>> put in this release, I would like to avoid the other commits in your branch.
>>
>> Please have a closer look at the patches before judging. First, many of
>> them fix bugs of features that already used to work. Second, they add
>> support in an orthogonal way, i.e. have no or minimal side effects when
>> the corresponding kernel features are off. And third, the features,
>> specifically ftrace/perf, are very useful for a broad audience - and
>> mandatory for our x86 use cases. It would not only help us a lot if we
>> could focus on different Xenomai tasks than continue to maintain the
>> patch queues separately.
> 
> 
> I have no problem with merging new features, but I would suggest waiting
> for after the 2.6.2 re-release. But ultimately, I am not the one who
> decides. People can upgrade the I-pipe patch with a Xenomai release as
> you know.

If 2.6.2b (or whatever) is in a real hurry and this won't close the
core-3.5 branch, I'm fine with pulling only the bottom of my queue - or
yours (then ideally after that commit cleanup).

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed!
  2013-01-14  4:47           ` [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed! John Morris
  2013-01-14 11:57             ` Gilles Chanteperdrix
@ 2013-01-14 19:50             ` Gilles Chanteperdrix
  2013-01-14 20:56               ` John Morris
  1 sibling, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-14 19:50 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 01/14/2013 05:47 AM, John Morris wrote:

> On 01/13/2013 01:41 PM, Gilles Chanteperdrix wrote:
>> On 01/13/2013 08:14 PM, John Morris wrote:
>>
>>> Hi Gilles and Jan,
>>>
>>> Note change of thread subject.  I'm starting to get confused.
>>>
>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>
>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>
>>>>>>
>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>
>>>>> Different problem; Xenomai wouldn't start:
>>>>>
>>>>>   I-pipe: could not find timer for cpu #0
>>>>>
>>>>> dmesg:
>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>
>>>>> .config:
>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>
>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>
>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>> problem:
>>>>>
>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>
>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>
>>>>
>>>> If you had the same problem as Marius, you would have seen it with
>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>> probably something else. 
>>>
>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>> packages that are the main subject of this sub-thread:
>>>
>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>
>>
>> Ah, that is because I rebased the I-pipe tree in between, and at some
>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>
>>>
>>>> Could you run
>>>>
>>>> cat /proc/timer_list
>>>
>>> Back to el6 again, 3.5.7 i-pipe:
>>>
>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>
>>
>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>> that the erratum detection is not sufficient to decide not to use a
>> LAPIC. Checking your logs, we see:
>>
>> using AMD E400 aware idle routine
>>
>> which means the LAPIC could potentially be unusable, but the idle
>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>
>> System has AMD C1E enabled
>>
>> If this bit is set, and in your case the message is not printed so the
>> bit is not set. So, the LAPIC is usable, but due to the changes I made
>> to try and print a message in Marius case, I broke the detection in your
>> case.
>>
>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>> in ipipe-gch git.
> 
> And it worked, no more C1E error!  Thanks!
> 
> It looks like the AMD-64 AM2/AM2+ socket CPUs were the last generation
> without C1E, and the AM3 socket CPUs were the first gen with.
> 
> Back to the original problem, the posix/mprotect problem is confirmed to
> be in this branch:
> 
>   ++ /usr/lib64/xenomai/regression/posix/mprotect
>   memory read
>   FAILURE: sigdebug_handler triggered, reason 2
>   memory write after exec enable
> 
> Regression test run:
> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-xeno-regression-test-3.5.7-test.log
> 
> Dmesg:
> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
> 
> Kernel .config:
> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config.txt


I cannot reproduce this issue with the same configuration on Atom.

> 
> To minimize confusion (esp. my own) and answer Jan's question, this is
> Gilles's ipipe-gch/for-core-3.5.7 kernel (20130113git08f0596) with


I cannot find a commit beginning with 08f0596. The current head of the
xenomai-2.6 master branch is 851281e593d89573edec063fe02c913e425f121b

> xenomai master (20130113git210ed428)


210ed428 is the current head of the I-pipe for-core-3.5.7 branch.


-- 
                                                                Gilles.



* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-14 19:37                   ` Jan Kiszka
@ 2013-01-14 20:39                     ` Gilles Chanteperdrix
  2013-01-15 11:35                       ` Jan Kiszka
  0 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-14 20:39 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: John Morris, Xenomai

On 01/14/2013 08:37 PM, Jan Kiszka wrote:

> On 2013-01-14 20:15, Gilles Chanteperdrix wrote:
>> On 01/14/2013 08:13 PM, Jan Kiszka wrote:
>>
>>> On 2013-01-14 19:50, Gilles Chanteperdrix wrote:
>>>> On 01/14/2013 01:00 PM, Jan Kiszka wrote:
>>>>
>>>>> On 2013-01-13 20:41, Gilles Chanteperdrix wrote:
>>>>>> On 01/13/2013 08:14 PM, John Morris wrote:
>>>>>>
>>>>>>> Hi Gilles and Jan,
>>>>>>>
>>>>>>> Note change of thread subject.  I'm starting to get confused.
>>>>>>>
>>>>>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>>>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>>>>>
>>>>>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>>>>>
>>>>>>>>> Different problem; Xenomai wouldn't start:
>>>>>>>>>
>>>>>>>>>   I-pipe: could not find timer for cpu #0
>>>>>>>>>
>>>>>>>>> dmesg:
>>>>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>>>>>
>>>>>>>>> .config:
>>>>>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>>>>>
>>>>>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>>>>>
>>>>>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>>>>>> problem:
>>>>>>>>>
>>>>>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>>>>>
>>>>>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>>>>>
>>>>>>>>
>>>>>>>> If you had the same problem as Marius, you would have seen it with
>>>>>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>>>>>> probably something else. 
>>>>>>>
>>>>>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>>>>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>>>>>> packages that are the main subject of this sub-thread:
>>>>>>>
>>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>>>>>
>>>>>>
>>>>>> Ah, that is because I rebased the I-pipe tree in between, and at some
>>>>>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>>>>>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>>>>>
>>>>>>>
>>>>>>>> Could you run
>>>>>>>>
>>>>>>>> cat /proc/timer_list
>>>>>>>
>>>>>>> Back to el6 again, 3.5.7 i-pipe:
>>>>>>>
>>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>>>>>
>>>>>>
>>>>>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>>>>>> that the erratum detection is not sufficient to decide not to use a
>>>>>> LAPIC. Checking your logs, we see:
>>>>>>
>>>>>> using AMD E400 aware idle routine
>>>>>>
>>>>>> which means the LAPIC could potentially be unusable, but the idle
>>>>>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>>>>>
>>>>>> System has AMD C1E enabled
>>>>>>
>>>>>> if this bit is set. In your case the message is not printed, so the
>>>>>> bit is not set. So, the LAPIC is usable, but due to the changes I made
>>>>>> to try and print a message in Marius' case, I broke the detection in
>>>>>> your case.
>>>>>>
>>>>>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>>>>>> in ipipe-gch git.
>>>>>
>>>>> Could you fold those changes into a single patch and add a few words to
>>>>> the changelog that setup_APIC_timer is too early to check? Then I'll
>>>>> merge it into the x86 queue.
>>>>
>>>>
>>>> I am trying to reach a point where we add bug-fixes and only bug-fixes
>>>> to re-release 2.6.2, so the for-core-3.5.7 branch is what I intend to
>>>> put in this release. I would like to avoid the other commits in your branch.
>>>
>>> Please have a closer look at the patches before judging. First, many of
>>> them fix bugs in features that used to work. Second, they add
>>> support in an orthogonal way, i.e. have no or minimal side effects when
>>> the corresponding kernel features are off. And third, the features,
>>> specifically ftrace/perf, are very useful for a broad audience - and
>>> mandatory for our x86 use cases. It would help us a lot if we
>>> could focus on other Xenomai tasks rather than continue to maintain the
>>> patch queues separately.
>>
>>
>> I have no problem with merging new features, but I would suggest waiting
>> for after the 2.6.2 re-release. But ultimately, I am not the one who
>> decides. People can upgrade the I-pipe patch with a Xenomai release as
>> you know.
> 
> If 2.6.2b (or whatever) is in a real hurry and this won't close the
> core-3.5 branch, I'm fine with pulling only the bottom of my queue - or
> yours (then ideally after that commit cleanup).


Done, also note that my current work is the for-core-3.5.7 branch, not
the for-core-3.5 branch.

-- 
                                                                Gilles.



* Re: [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed!
  2013-01-14 13:36                 ` Jan Kiszka
@ 2013-01-14 20:52                   ` John Morris
  2013-01-14 22:54                     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: John Morris @ 2013-01-14 20:52 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Xenomai

On 01/14/2013 07:36 AM, Jan Kiszka wrote:
> On 2013-01-14 13:00, Jan Kiszka wrote:
>> On 2013-01-14 12:57, Gilles Chanteperdrix wrote:
>>> On 01/14/2013 05:47 AM, John Morris wrote:
>>>
>>>> On 01/13/2013 01:41 PM, Gilles Chanteperdrix wrote:
>>>>> On 01/13/2013 08:14 PM, John Morris wrote:
>>>>>
>>>>>> Hi Gilles and Jan,
>>>>>>
>>>>>> Note change of thread subject.  I'm starting to get confused.
>>>>>>
>>>>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>>>>
>>>>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>>>>
>>>>>>>> Different problem; Xenomai wouldn't start:
>>>>>>>>
>>>>>>>>   I-pipe: could not find timer for cpu #0
>>>>>>>>
>>>>>>>> dmesg:
>>>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>>>>
>>>>>>>> .config:
>>>>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>>>>
>>>>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>>>>
>>>>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>>>>> problem:
>>>>>>>>
>>>>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>>>>
>>>>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>>>>
>>>>>>>
>>>>>>> If you had the same problem as Marius, you would have seen it with
>>>>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>>>>> probably something else. 
>>>>>>
>>>>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>>>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>>>>> packages that are the main subject of this sub-thread:
>>>>>>
>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>>>>
>>>>>
>>>>> Ah, that is because I rebased the I-pipe tree in between, and at some
>>>>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>>>>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>>>>
>>>>>>
>>>>>>> Could you run
>>>>>>>
>>>>>>> cat /proc/timer_list
>>>>>>
>>>>>> Back to el6 again, 3.5.7 i-pipe:
>>>>>>
>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>>>>
>>>>>
>>>>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>>>>> that the erratum detection is not sufficient to decide not to use a
>>>>> LAPIC. Checking your logs, we see:
>>>>>
>>>>> using AMD E400 aware idle routine
>>>>>
>>>>> which means the LAPIC could potentially be unusable, but the idle
>>>>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>>>>
>>>>> System has AMD C1E enabled
>>>>>
>>>>> if this bit is set. In your case the message is not printed, so the
>>>>> bit is not set. So, the LAPIC is usable, but due to the changes I made
>>>>> to try and print a message in Marius' case, I broke the detection in
>>>>> your case.
>>>>>
>>>>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>>>>> in ipipe-gch git.
>>>>
>>>> And it worked, no more C1E error!  Thanks!
>>>>
>>>> It looks like the AMD-64 AM2/AM2+ socket CPUs were the last generation
>>>> without C1E, and the AM3 socket CPUs were the first gen with.
>>>>
>>>> Back to the original problem, the posix/mprotect problem is confirmed to
>>>> be in this branch:
>>>>
>>>>   ++ /usr/lib64/xenomai/regression/posix/mprotect
>>>>   memory read
>>>>   FAILURE: sigdebug_handler triggered, reason 2
>>>>   memory write after exec enable
>>>
>>>
>>> I will try to compile a kernel with the same configuration as you to see
>>> if I can reproduce the issue.
>>
>> Unless you want to double-check: build is already running here. This
>> feature is critical for us, but I have no clue ATM why it could fail
>> over the latest patch queue.
> 
> OK, would probably be good to double-check as I'm unable to reproduce
> the issue over my queue with John's .config.
> 
> The reason code is suspicious BTW: syscall. Would be good if someone who
> can reproduce attaches gdb and provides a backtrace from the signal to
> the call that triggered it. Could something have gone wrong while
> building the test case, so that not all required functions, such as
> printf, are wrapped?

Hi Jan,

Is this what you need?

http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-gdb-posix-mprotect-3.5.7-test.log

	John



* Re: [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed!
  2013-01-14 19:50             ` [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed! Gilles Chanteperdrix
@ 2013-01-14 20:56               ` John Morris
  2013-01-14 22:57                 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: John Morris @ 2013-01-14 20:56 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai



On 01/14/2013 01:50 PM, Gilles Chanteperdrix wrote:
> On 01/14/2013 05:47 AM, John Morris wrote:
> 
>> On 01/13/2013 01:41 PM, Gilles Chanteperdrix wrote:
>>> On 01/13/2013 08:14 PM, John Morris wrote:
>>>
>>>> Hi Gilles and Jan,
>>>>
>>>> Note change of thread subject.  I'm starting to get confused.
>>>>
>>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>>
>>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>>
>>>>>>>
>>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>>
>>>>>> Different problem; Xenomai wouldn't start:
>>>>>>
>>>>>>   I-pipe: could not find timer for cpu #0
>>>>>>
>>>>>> dmesg:
>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>>
>>>>>> .config:
>>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>>
>>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>>
>>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>>> problem:
>>>>>>
>>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>>
>>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>>
>>>>>
>>>>> If you had the same problem as Marius, you would have seen it with
>>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>>> probably something else. 
>>>>
>>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>>> packages that are the main subject of this sub-thread:
>>>>
>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>>
>>>
>>> Ah, that is because I rebased the I-pipe tree in between, and at some
>>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>>
>>>>
>>>>> Could you run
>>>>>
>>>>> cat /proc/timer_list
>>>>
>>>> Back to el6 again, 3.5.7 i-pipe:
>>>>
>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>>
>>>
>>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>>> that the erratum detection is not sufficient to decide not to use a
>>> LAPIC. Checking your logs, we see:
>>>
>>> using AMD E400 aware idle routine
>>>
>>> which means the LAPIC could potentially be unusable, but the idle
>>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>>
>>> System has AMD C1E enabled
>>>
>>> if this bit is set. In your case the message is not printed, so the
>>> bit is not set. So, the LAPIC is usable, but due to the changes I made
>>> to try and print a message in Marius' case, I broke the detection in
>>> your case.
>>>
>>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>>> in ipipe-gch git.
>>
>> And it worked, no more C1E error!  Thanks!
>>
>> It looks like the AMD-64 AM2/AM2+ socket CPUs were the last generation
>> without C1E, and the AM3 socket CPUs were the first gen with.
>>
>> Back to the original problem, the posix/mprotect problem is confirmed to
>> be in this branch:
>>
>>   ++ /usr/lib64/xenomai/regression/posix/mprotect
>>   memory read
>>   FAILURE: sigdebug_handler triggered, reason 2
>>   memory write after exec enable
>>
>> Regression test run:
>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-xeno-regression-test-3.5.7-test.log
>>
>> Dmesg:
>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>
>> Kernel .config:
>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config.txt
> 
> 
> I can not reproduce this issue with the same configuration on atom.
> 
>>
>> To minimize confusion (esp. my own) and answer Jan's question, this is
>> Gilles's ipipe-gch/for-core-3.5.7 kernel (20130113git08f0596) with
> 
> 
> I can not find a commit beginning with 08f0596. The current head of
> xenomai-2.6 master branch is 851281e593d89573edec063fe02c913e425f121b
> 
>> xenomai master (20130113git210ed428)
> 
> 
> 210ed428 is I-pipe current for-core-3.5.7 branch head.
> 
> 

How embarrassing, let's try again:

xenomai 20130113git851281e5, for-core-3.5.7 20130113git210ed428

Both HEAD pulled yesterday.

	John



* Re: [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed!
  2013-01-14 20:52                   ` John Morris
@ 2013-01-14 22:54                     ` Gilles Chanteperdrix
  2013-01-15  7:16                       ` [Xenomai] 3.5.7 posix/mprotect fixed; (Was: posix/mprotect failure) John Morris
  0 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-14 22:54 UTC (permalink / raw)
  To: John Morris; +Cc: Jan Kiszka, Xenomai

On 01/14/2013 09:52 PM, John Morris wrote:

> On 01/14/2013 07:36 AM, Jan Kiszka wrote:
>> On 2013-01-14 13:00, Jan Kiszka wrote:
>>> On 2013-01-14 12:57, Gilles Chanteperdrix wrote:
>>>> On 01/14/2013 05:47 AM, John Morris wrote:
>>>>
>>>>> On 01/13/2013 01:41 PM, Gilles Chanteperdrix wrote:
>>>>>> On 01/13/2013 08:14 PM, John Morris wrote:
>>>>>>
>>>>>>> Hi Gilles and Jan,
>>>>>>>
>>>>>>> Note change of thread subject.  I'm starting to get confused.
>>>>>>>
>>>>>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>>>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>>>>>
>>>>>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>>>>>
>>>>>>>>> Different problem; Xenomai wouldn't start:
>>>>>>>>>
>>>>>>>>>   I-pipe: could not find timer for cpu #0
>>>>>>>>>
>>>>>>>>> dmesg:
>>>>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>>>>>
>>>>>>>>> .config:
>>>>>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>>>>>
>>>>>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>>>>>
>>>>>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>>>>>> problem:
>>>>>>>>>
>>>>>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>>>>>
>>>>>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>>>>>
>>>>>>>>
>>>>>>>> If you had the same problem as Marius, you would have seen it with
>>>>>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>>>>>> probably something else. 
>>>>>>>
>>>>>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>>>>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>>>>>> packages that are the main subject of this sub-thread:
>>>>>>>
>>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>>>>>
>>>>>>
>>>>>> Ah, that is because I rebased the I-pipe tree in between, and at some
>>>>>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>>>>>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>>>>>
>>>>>>>
>>>>>>>> Could you run
>>>>>>>>
>>>>>>>> cat /proc/timer_list
>>>>>>>
>>>>>>> Back to el6 again, 3.5.7 i-pipe:
>>>>>>>
>>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>>>>>
>>>>>>
>>>>>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>>>>>> that the erratum detection is not sufficient to decide not to use a
>>>>>> LAPIC. Checking your logs, we see:
>>>>>>
>>>>>> using AMD E400 aware idle routine
>>>>>>
>>>>>> which means the LAPIC could potentially be unusable, but the idle
>>>>>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>>>>>
>>>>>> System has AMD C1E enabled
>>>>>>
>>>>>> if this bit is set. In your case the message is not printed, so the
>>>>>> bit is not set. So, the LAPIC is usable, but due to the changes I made
>>>>>> to try and print a message in Marius' case, I broke the detection in
>>>>>> your case.
>>>>>>
>>>>>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>>>>>> in ipipe-gch git.
>>>>>
>>>>> And it worked, no more C1E error!  Thanks!
>>>>>
>>>>> It looks like the AMD-64 AM2/AM2+ socket CPUs were the last generation
>>>>> without C1E, and the AM3 socket CPUs were the first gen with.
>>>>>
>>>>> Back to the original problem, the posix/mprotect problem is confirmed to
>>>>> be in this branch:
>>>>>
>>>>>   ++ /usr/lib64/xenomai/regression/posix/mprotect
>>>>>   memory read
>>>>>   FAILURE: sigdebug_handler triggered, reason 2
>>>>>   memory write after exec enable
>>>>
>>>>
>>>> I will try to compile a kernel with the same configuration as you to see
>>>> if I can reproduce the issue.
>>>
>>> Unless you want to double-check: build is already running here. This
>>> feature is critical for us, but I have no clue ATM why it could fail
>>> over the latest patch queue.
>>
>> OK, would probably be good to double-check as I'm unable to reproduce
>> the issue over my queue with John's .config.
>>
>> The reason code is suspicious BTW: syscall. Would be good if someone who
>> can reproduce attaches gdb and provides a backtrace from the signal to
>> the call that triggered it. Could something have gone wrong while
>> building the test case, so that not all required functions, such as
>> printf, are wrapped?
> 
> Hi Jan,
> 
> Is this what you need?
> 
> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-gdb-posix-mprotect-3.5.7-test.log


So, you are compiling with _FORTIFY_SOURCE, that is the difference with
how we are compiling. Is it easy for you to set _FORTIFY_SOURCE to 0?

-- 
                                                                Gilles.



* Re: [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed!
  2013-01-14 20:56               ` John Morris
@ 2013-01-14 22:57                 ` Gilles Chanteperdrix
  0 siblings, 0 replies; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-14 22:57 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 01/14/2013 09:56 PM, John Morris wrote:

> 
> 
> On 01/14/2013 01:50 PM, Gilles Chanteperdrix wrote:
>> On 01/14/2013 05:47 AM, John Morris wrote:
>>
>>> On 01/13/2013 01:41 PM, Gilles Chanteperdrix wrote:
>>>> On 01/13/2013 08:14 PM, John Morris wrote:
>>>>
>>>>> Hi Gilles and Jan,
>>>>>
>>>>> Note change of thread subject.  I'm starting to get confused.
>>>>>
>>>>> On 01/13/2013 06:16 AM, Gilles Chanteperdrix wrote:
>>>>>> On 01/13/2013 05:36 AM, John Morris wrote:
>>>>>>
>>>>>>> On 01/12/2013 11:31 AM, Gilles Chanteperdrix wrote:
>>>>>>>> On 01/12/2013 06:26 PM, John Morris wrote:
>>>>>>>>> 1)  Most worrisome is "kernel BUG at mm/mmap.c:2313!   invalid opcode:
>>>>>>>>> 0000 [#2] SMP".  Is this related to HEAPSZ or STACKPOOLSZ?  My mind is
>>>>>>>>> getting foggy about all the things I've seen, but it seems like it was
>>>>>>>>> happening earlier in the tests until these config values were quadrupled.
>>>>>>>>
>>>>>>>>
>>>>>>>> Could you check whether you can reproduce this issue with the I-pipe
>>>>>>>> patch for 3.5.7 ? The next xenomai release will be based on this version
>>>>>>>> on x86 anyway. Branch for-core-3.5.7 in ipipe-gch.git
>>>>>>>
>>>>>>> Different problem; Xenomai wouldn't start:
>>>>>>>
>>>>>>>   I-pipe: could not find timer for cpu #0
>>>>>>>
>>>>>>> dmesg:
>>>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>>>>>
>>>>>>> .config:
>>>>>>> www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config
>>>>>>>
>>>>>>> FYI, I found this same problem on two of my systems while testing your
>>>>>>> Debian packages.  Both AMD Athlon II 64-bit (one single, one dual core).
>>>>>>>  They're about the same generation of motherboards, AM2 or AM2+ socket.
>>>>>>>  One is AMD 770 chipset, the other NVidia GeForce 6100 / nForce 430.
>>>>>>>
>>>>>>> Hardware looks similar to Mariusz's in this post, where he had the same
>>>>>>> problem:
>>>>>>>
>>>>>>> http://www.xenomai.org/pipermail/xenomai/2012-December/027121.html
>>>>>>>
>>>>>>> He's also running AMD 64-bit on a Gigabyte motherboard, but the next
>>>>>>> generation AM3 socket, Phenom CPU, AMD 890 chipset.  I don't have a C1E
>>>>>>> BIOS option on these boards to enable/disable.  These same motherboards
>>>>>>> don't suffer this problem with mainline Xenomai on 3.5.3.
>>>>>>
>>>>>>
>>>>>> If you had the same problem as Marius, you would have seen it with
>>>>>> 3.5.3, and you would get the message in the dmesg about C1E, so, it is
>>>>>> probably something else. 
>>>>>
>>>>> Yes, I'm definitely getting confused.  I did see the same problem with
>>>>> C1E, but only while running your 3.5.7 .debs, and not in the 3.5.7 el6
>>>>> packages that are the main subject of this sub-thread:
>>>>>
>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-deb-gilles/dmesg-3.5.7-deb-gilles.log
>>>>
>>>>
>>>> Ah, that is because I rebased the I-pipe tree in between, and at some
>>>> point the code printing the message was wrong (ATOMIC_INIT(0) instead of
>>>> ATOMIC_INIT(-1)). That is my fault then, sorry.
>>>>
>>>>>
>>>>>> Could you run
>>>>>>
>>>>>> cat /proc/timer_list
>>>>>
>>>>> Back to el6 again, 3.5.7 i-pipe:
>>>>>
>>>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-cat-proc-timer_list-3.5.7-test.log
>>>>
>>>>
>>>> The LAPIC is definitely up and running (mode: 3). So, it probably means
>>>> that the erratum detection is not sufficient to decide not to use a
>>>> LAPIC. Checking your logs, we see:
>>>>
>>>> using AMD E400 aware idle routine
>>>>
>>>> which means the LAPIC could potentially be unusable, but the idle
>>>> routine also checks for a bit in a K8 specific MSR and prints the message:
>>>>
>>>> System has AMD C1E enabled
>>>>
>>>> if this bit is set. In your case the message is not printed, so the
>>>> bit is not set. So, the LAPIC is usable, but due to the changes I made
>>>> to try and print a message in Marius' case, I broke the detection in
>>>> your case.
>>>>
>>>> I have just pushed a rework for this commit in the for-core-3.5.7 branch
>>>> in ipipe-gch git.
>>>
>>> And it worked, no more C1E error!  Thanks!
>>>
>>> It looks like the AMD-64 AM2/AM2+ socket CPUs were the last generation
>>> without C1E, and the AM3 socket CPUs were the first gen with.
>>>
>>> Back to the original problem, the posix/mprotect problem is confirmed to
>>> be in this branch:
>>>
>>>   ++ /usr/lib64/xenomai/regression/posix/mprotect
>>>   memory read
>>>   FAILURE: sigdebug_handler triggered, reason 2
>>>   memory write after exec enable
>>>
>>> Regression test run:
>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-xeno-regression-test-3.5.7-test.log
>>>
>>> Dmesg:
>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7.log
>>>
>>> Kernel .config:
>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-3.5.7.config.txt
>>
>>
>> I can not reproduce this issue with the same configuration on atom.
>>
>>>
>>> To minimize confusion (esp. my own) and answer Jan's question, this is
>>> Gilles's ipipe-gch/for-core-3.5.7 kernel (20130113git08f0596) with
>>
>>
>> I can not find a commit beginning with 08f0596. The current head of
>> xenomai-2.6 master branch is 851281e593d89573edec063fe02c913e425f121b
>>
>>> xenomai master (20130113git210ed428)
>>
>>
>> 210ed428 is I-pipe current for-core-3.5.7 branch head.
>>
>>
> 
> How embarrassing, let's try again:
> 
> xenomai 20130113git851281e5, for-core-3.5.7 20130113git210ed428
> 
> Both HEAD pulled yesterday.


Yes, those were yesterday's heads. I have since rebased the for-core-3.5.7
branch, merging the two AMD C1E commits as Jan asked, so the new head of
the for-core-3.5.7 branch is fde77b2e48b2cb0b2ea20932fc3f66771939501d

Anyway, the problem you have comes from the fact that you compile with
_FORTIFY_SOURCE, and we did not handle this case in the Xenomai sources.

> 
> 	John
> 



-- 
                                                                Gilles.



* Re: [Xenomai] 3.5.7 posix/mprotect fixed; (Was:  posix/mprotect failure)
  2013-01-14 22:54                     ` Gilles Chanteperdrix
@ 2013-01-15  7:16                       ` John Morris
  2013-01-15  7:31                         ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: John Morris @ 2013-01-15  7:16 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Jan Kiszka, Xenomai



On 01/14/2013 04:54 PM, Gilles Chanteperdrix wrote:
> On 01/14/2013 09:52 PM, John Morris wrote:
>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-gdb-posix-mprotect-3.5.7-test.log
> 
> 
> So, you are compiling with _FORTIFY_SOURCE, that is the difference with
> how we are compiling. Is it easy for you to set _FORTIFY_SOURCE to 0?

Wow, you picked that out instantly.  That fixes it.  RH-like distros
define _FORTIFY_SOURCE in the default CFLAGS, worthy of a wiki note:

http://fedoraproject.org/wiki/Security/Features#Compile_Time_Buffer_Checks_.28FORTIFY_SOURCE.29

I'm guessing that Xenomai-enabled applications won't have to worry about
this, because they'll only be compiling header files.  Is that about
right, or should there be a notice in the -devel package?


Here's the latest xeno-regression-test run:

http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-xeno-regression-test-3.5.7-test.log

The latest xeno-regression-test run still has a few small problems:

1)  "xddp_test" fails:  "...IPCPROTO_XDDP): Address family not supported
by protocol".  DRIVERS_RTIPC_XDDP is enabled in the config.  Doesn't
look like others have experienced this before.  LinuxCNC doesn't use
XDDP, but it should be fixed for the general audience.

2)  "clocktest -C 42 -T 30" fails; apparently expected in some
circumstances ('... || :').  Does anything need to be fixed here?  The
test system switched to acpi_pm, and the TSCs on the dual-core are
unsynched.  We don't know if the end-user's application will lock the
process to a single CPU, so CLOCK_HOST_REALTIME isn't the best choice,
even if it were somehow available.

3)  "Warning: Linux is compiled to use FPU in kernel-space" in dmesg:
Ignoring:  disabling (at least) software RAID breaks the
'one-size-fits-all' goal, and Gilles states this is harmless:
http://www.xenomai.org/pipermail/xenomai/2009-May/016743.html
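
Regarding item 2, locking a process to a single CPU could be sketched as
below. This is only a minimal example, assuming Linux with glibc;
pin_to_first_allowed_cpu is a hypothetical helper name and is not part
of Xenomai or clocktest. It pins the calling thread to the lowest-numbered
CPU it is already allowed to run on.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper: pin the calling thread to the lowest-numbered CPU
 * in its current affinity mask. Returns that CPU number, or -1 on error.
 */
int pin_to_first_allowed_cpu(void)
{
	cpu_set_t allowed, one;
	int cpu;

	/* pid 0 means "the calling thread" for both affinity calls. */
	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, &allowed))
			continue;

		CPU_ZERO(&one);
		CPU_SET(cpu, &one);
		/* Debug-only sanity check: the new mask holds exactly one CPU. */
		assert(CPU_COUNT(&one) == 1);

		if (sched_setaffinity(0, sizeof(one), &one) != 0)
			return -1;
		return cpu;
	}

	return -1;	/* no usable CPU found in the mask */
}
```

Whether an end-user application actually does this is, as said, unknown
to the packager, so this only shows what such pinning looks like.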

	John


current status:
xenomai-2.6 master 851281e5 (Sun Jan 13 13:51:22 2013 +0100)
ipipe-gch for-core-3.5.7 fde77b2e (Wed Jan 9 12:14:54 2013 +0100)

kernel .config
http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-kernel-config-3.5.7-test.txt

dmesg
http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-dmesg-3.5.7-test.log

regression test
http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-xeno-regression-test-3.5.7-test.log



* Re: [Xenomai] 3.5.7 posix/mprotect fixed; (Was:  posix/mprotect failure)
  2013-01-15  7:16                       ` [Xenomai] 3.5.7 posix/mprotect fixed; (Was: posix/mprotect failure) John Morris
@ 2013-01-15  7:31                         ` Gilles Chanteperdrix
  2013-01-18  3:56                           ` John Morris
  0 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-15  7:31 UTC (permalink / raw)
  To: John Morris; +Cc: Jan Kiszka, Xenomai

On 01/15/2013 08:16 AM, John Morris wrote:

> 
> 
> On 01/14/2013 04:54 PM, Gilles Chanteperdrix wrote:
>> On 01/14/2013 09:52 PM, John Morris wrote:
>>> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-gdb-posix-mprotect-3.5.7-test.log
>>
>>
>> So, you are compiling with _FORTIFY_SOURCE, that is the difference with
>> how we are compiling. Is it easy for you to set _FORTIFY_SOURCE to 0?
> 
> Wow, you picked that out instantly.  That fixes it.  RH-like distros
> define _FORTIFY_SOURCE in the default CFLAGS, worthy of a wiki note:
> 
> http://fedoraproject.org/wiki/Security/Features#Compile_Time_Buffer_Checks_.28FORTIFY_SOURCE.29
> 
> I'm guessing that Xenomai-enabled applications won't have to worry about
> this, because they'll only be compiling header files.  Is that about
> right, or should there be a notice in the -devel package?


No, unfortunately, this is a problem: applications compiled with
-D_FORTIFY_SOURCE will not get proper symbol substitution by the Xenomai
library for the printf functions. The fix is to implement the checking
variants in the Xenomai library, so that FORTIFY_SOURCE and symbol
substitution can be used together.
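
To illustrate what "implementing the checking variants" could look like:
with _FORTIFY_SOURCE enabled (and optimization on), the glibc headers turn
printf() calls into calls to __printf_chk(), so a link-time wrapper for
printf alone never sees them. The sketch below is not Xenomai's actual
code. demo_rt_vprintf() and demo_buf are hypothetical stand-ins for the
library's real-time-safe output routine, and the __wrap_ prefix assumes
the GNU ld --wrap convention.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for a real-time-safe vprintf. It formats into
 * a buffer so that the example runs without Xenomai installed. */
char demo_buf[256];

int demo_rt_vprintf(const char *fmt, va_list ap)
{
	return vsnprintf(demo_buf, sizeof(demo_buf), fmt, ap);
}

/* Wrapper for plain printf(): reached only when the application was
 * built without _FORTIFY_SOURCE. */
int __wrap_printf(const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = demo_rt_vprintf(fmt, ap);
	va_end(ap);
	return ret;
}

/* Wrapper for the checking variant that fortified code calls instead.
 * The extra "flag" argument is glibc's fortification level; this
 * sketch ignores it and forwards to the same back end. */
int __wrap___printf_chk(int flag, const char *fmt, ...)
{
	va_list ap;
	int ret;

	(void)flag;
	va_start(ap, fmt);
	ret = demo_rt_vprintf(fmt, ap);
	va_end(ap);
	return ret;
}
```

With GNU ld, both symbols would then have to be wrapped at link time,
e.g. -Wl,--wrap=printf -Wl,--wrap=__printf_chk. The other printf-family
functions (fprintf, vprintf, ...) would need the same treatment.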

> 
> 
> Here's the latest xeno-regression-test run:
> 
> http://www.zultron.com/static/2013/01/xenomai/3.5.7-test/foo-xeno-regression-test-3.5.7-test.log
> 
> The latest xeno-regression-test run still has a few small problems:
> 
> 1)  "xddp_test" fails:  "...IPCPROTO_XDDP): Address family not supported
> by protocol".  DRIVERS_RTIPC_XDDP is enabled in the config.  Doesn't
> look like others have experienced this before.  LinuxCNC doesn't use
> XDDP, but it should be fixed for the general audience.


You are missing some kernel drivers. I used the following options in the
Debian configuration: basically, all options enabled (some as module,
because they are expected to be used less often).

CONFIG_XENOMAI=y
CONFIG_XENO_GENERIC_STACKPOOL=y
CONFIG_XENO_FASTSYNCH=y
CONFIG_XENO_OPT_NUCLEUS=y
CONFIG_XENO_OPT_PERVASIVE=y
# CONFIG_XENO_OPT_PRIOCPL is not set
CONFIG_XENO_OPT_PIPELINE_HEAD=y
# CONFIG_XENO_OPT_SCHED_CLASSES is not set
CONFIG_XENO_OPT_PIPE=y
CONFIG_XENO_OPT_MAP=y
CONFIG_XENO_OPT_VFILE=y
CONFIG_XENO_OPT_PIPE_NRDEV=32
CONFIG_XENO_OPT_REGISTRY_NRSLOTS=512
CONFIG_XENO_OPT_SYS_HEAPSZ=4096
CONFIG_XENO_OPT_SYS_STACKPOOLSZ=4096
CONFIG_XENO_OPT_SEM_HEAPSZ=12
CONFIG_XENO_OPT_GLOBAL_SEM_HEAPSZ=12
CONFIG_XENO_OPT_STATS=y
CONFIG_XENO_OPT_DEBUG=y
# CONFIG_XENO_OPT_DEBUG_NUCLEUS is not set
# CONFIG_XENO_OPT_DEBUG_XNLOCK is not set
# CONFIG_XENO_OPT_DEBUG_QUEUES is not set
# CONFIG_XENO_OPT_DEBUG_REGISTRY is not set
# CONFIG_XENO_OPT_DEBUG_TIMERS is not set
CONFIG_XENO_OPT_DEBUG_SYNCH_RELAX=y
CONFIG_XENO_OPT_WATCHDOG=y
CONFIG_XENO_OPT_WATCHDOG_TIMEOUT=4
CONFIG_XENO_OPT_SHIRQ=y
CONFIG_XENO_OPT_SELECT=y
CONFIG_XENO_OPT_HOSTRT=y
CONFIG_XENO_OPT_TIMING_PERIODIC=y
CONFIG_XENO_OPT_TIMING_VIRTICK=1000
CONFIG_XENO_OPT_TIMING_SCHEDLAT=0
# CONFIG_XENO_OPT_SCALABLE_SCHED is not set
CONFIG_XENO_OPT_TIMER_LIST=y
# CONFIG_XENO_OPT_TIMER_HEAP is not set
# CONFIG_XENO_OPT_TIMER_WHEEL is not set
CONFIG_XENO_HW_FPU=y
# CONFIG_XENO_HW_SMI_DETECT_DISABLE is not set
CONFIG_XENO_HW_SMI_DETECT=y
# CONFIG_XENO_HW_SMI_WORKAROUND is not set
CONFIG_XENO_SKIN_NATIVE=y
CONFIG_XENO_OPT_NATIVE_PERIOD=0
CONFIG_XENO_OPT_NATIVE_PIPE=y
CONFIG_XENO_OPT_NATIVE_PIPE_BUFSZ=1024
CONFIG_XENO_OPT_NATIVE_SEM=y
CONFIG_XENO_OPT_NATIVE_EVENT=y
CONFIG_XENO_OPT_NATIVE_MUTEX=y
CONFIG_XENO_OPT_NATIVE_COND=y
CONFIG_XENO_OPT_NATIVE_QUEUE=y
CONFIG_XENO_OPT_NATIVE_BUFFER=y
CONFIG_XENO_OPT_NATIVE_HEAP=y
CONFIG_XENO_OPT_NATIVE_ALARM=y
CONFIG_XENO_OPT_NATIVE_MPS=y
# CONFIG_XENO_OPT_NATIVE_INTR is not set
# CONFIG_XENO_OPT_DEBUG_NATIVE is not set
CONFIG_XENO_SKIN_POSIX=y
CONFIG_XENO_OPT_POSIX_PERIOD=0
# CONFIG_XENO_OPT_POSIX_SHM is not set
# CONFIG_XENO_OPT_POSIX_INTR is not set
CONFIG_XENO_OPT_POSIX_SELECT=y
# CONFIG_XENO_OPT_DEBUG_POSIX is not set
CONFIG_XENO_SKIN_PSOS=m
CONFIG_XENO_OPT_PSOS_PERIOD=1000
# CONFIG_XENO_OPT_DEBUG_PSOS is not set
CONFIG_XENO_SKIN_UITRON=m
CONFIG_XENO_OPT_UITRON_PERIOD=1000
# CONFIG_XENO_OPT_DEBUG_UITRON is not set
CONFIG_XENO_SKIN_VRTX=m
CONFIG_XENO_OPT_VRTX_PERIOD=1000
CONFIG_XENO_SKIN_VXWORKS=m
CONFIG_XENO_OPT_VXWORKS_PERIOD=1000
# CONFIG_XENO_OPT_DEBUG_VXWORKS is not set
# CONFIG_XENO_OPT_NOWARN_DEPRECATED is not set
CONFIG_XENO_SKIN_RTDM=y
CONFIG_XENO_OPT_RTDM_PERIOD=0
CONFIG_XENO_OPT_RTDM_FILDES=128
CONFIG_XENO_OPT_RTDM_SELECT=y
# CONFIG_XENO_OPT_DEBUG_RTDM is not set
# CONFIG_XENO_OPT_DEBUG_RTDM_APPL is not set
CONFIG_XENO_DRIVERS_16550A=m
# CONFIG_XENO_DRIVERS_16550A_PIO is not set
# CONFIG_XENO_DRIVERS_16550A_MMIO is not set
CONFIG_XENO_DRIVERS_16550A_ANY=y
CONFIG_XENO_DRIVERS_16550A_PCI=y
CONFIG_XENO_DRIVERS_16550A_PCI_MOXA=y
CONFIG_XENO_DRIVERS_TIMERBENCH=y
CONFIG_XENO_DRIVERS_KLATENCY=m
CONFIG_XENO_DRIVERS_IRQBENCH=m
CONFIG_XENO_DRIVERS_SWITCHTEST=y
CONFIG_XENO_DRIVERS_RTDMTEST=m
CONFIG_XENO_DRIVERS_CAN=m
CONFIG_XENO_DRIVERS_CAN_DEBUG=y
# CONFIG_XENO_DRIVERS_CAN_LOOPBACK is not set
CONFIG_XENO_DRIVERS_CAN_RXBUF_SIZE=1024
CONFIG_XENO_DRIVERS_CAN_MAX_DEVICES=4
CONFIG_XENO_DRIVERS_CAN_MAX_RECEIVERS=16
CONFIG_XENO_DRIVERS_CAN_BUS_ERR=y
# CONFIG_XENO_DRIVERS_CAN_CALC_BITTIME_OLD is not set
CONFIG_XENO_DRIVERS_CAN_VIRT=m
# CONFIG_XENO_DRIVERS_CAN_FLEXCAN is not set
CONFIG_XENO_DRIVERS_CAN_SJA1000=m
CONFIG_XENO_DRIVERS_CAN_SJA1000_ISA=m
CONFIG_XENO_DRIVERS_CAN_SJA1000_MEM=m
CONFIG_XENO_DRIVERS_CAN_SJA1000_PEAK_PCI=m
CONFIG_XENO_DRIVERS_CAN_SJA1000_IXXAT_PCI=m
CONFIG_XENO_DRIVERS_CAN_SJA1000_ADV_PCI=m
CONFIG_XENO_DRIVERS_CAN_SJA1000_PLX_PCI=m
CONFIG_XENO_DRIVERS_CAN_SJA1000_EMS_PCI=m
CONFIG_XENO_DRIVERS_CAN_SJA1000_ESD_PCI=m
CONFIG_XENO_DRIVERS_CAN_SJA1000_PEAK_DNG=m
CONFIG_XENO_DRIVERS_ANALOGY=m
CONFIG_XENO_DRIVERS_ANALOGY_DEBUG=y
CONFIG_XENO_DRIVERS_ANALOGY_DEBUG_LEVEL=0
CONFIG_XENO_DRIVERS_ANALOGY_DRIVER_DEBUG_LEVEL=0
CONFIG_XENO_DRIVERS_ANALOGY_FAKE=m
CONFIG_XENO_DRIVERS_ANALOGY_8255=m
CONFIG_XENO_DRIVERS_ANALOGY_PARPORT=m
CONFIG_XENO_DRIVERS_ANALOGY_NI_MITE=m
CONFIG_XENO_DRIVERS_ANALOGY_NI_TIO=m
CONFIG_XENO_DRIVERS_ANALOGY_NI_MIO=m
CONFIG_XENO_DRIVERS_ANALOGY_NI_PCIMIO=m
CONFIG_XENO_DRIVERS_ANALOGY_NI_670x=m
CONFIG_XENO_DRIVERS_ANALOGY_NI_660x=m
CONFIG_XENO_DRIVERS_ANALOGY_S526=m
CONFIG_XENO_DRIVERS_RTIPC=y
CONFIG_XENO_DRIVERS_RTIPC_XDDP=y
CONFIG_XENO_DRIVERS_RTIPC_IDDP=y
CONFIG_XENO_OPT_IDDP_NRPORT=32
CONFIG_XENO_DRIVERS_RTIPC_BUFP=y
CONFIG_XENO_OPT_BUFP_NRPORT=32
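
One way to compare a packaged kernel's .config against a reference list
like the one above is to look up each bool/tristate option in the config
text. The helper below is a minimal sketch under that assumption.
kconfig_state() is a hypothetical name, not part of any Xenomai or
kernel tool, and it deliberately ignores options that carry numeric or
string values.

```c
#include <string.h>

/* Look up a bool/tristate option in the text of a kernel .config.
 * Returns 'y' or 'm' if the option is set that way, and 'n' if it is
 * absent, commented out ("# CONFIG_FOO is not set"), or has any other
 * value. */
char kconfig_state(const char *config_text, const char *option)
{
	size_t optlen = strlen(option);
	const char *line = config_text;

	while (line && *line) {
		/* Match "OPTION=" exactly at the start of a line, so that
		 * CONFIG_FOO does not match CONFIG_FOO_BAR. */
		if (strncmp(line, option, optlen) == 0 &&
		    line[optlen] == '=') {
			char v = line[optlen + 1];

			return (v == 'y' || v == 'm') ? v : 'n';
		}
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline */
	}
	return 'n';
}
```

For instance, checking CONFIG_XENO_DRIVERS_RTIPC and
CONFIG_XENO_DRIVERS_RTIPC_XDDP against the running kernel's config would
show whether the RTIPC driver and its XDDP protocol are really built in.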


> 
> 2)  "clocktest -C 42 -T 30" fails; apparently expected in some
> circumstances ('... || :').  Does anything need to be fixed here?  The
> test system switched to acpi_pm, and the TSCs on the dual-core are
> unsynched.  We don't know if the end-user's application will lock the
> process to a single CPU, so CLOCK_HOST_REALTIME isn't the best choice,
> even if it were somehow available.


Yes, if Linux does not use the TSC for CLOCK_REALTIME,
CLOCK_HOST_REALTIME does not work.
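
Whether Linux is currently using the TSC can be read from sysfs. The
following is a minimal sketch, assuming the standard
clocksource0/current_clocksource file is present. The function names are
made up for illustration and are not part of Xenomai.

```c
#include <stdio.h>
#include <string.h>

#define CLOCKSOURCE_PATH \
	"/sys/devices/system/clocksource/clocksource0/current_clocksource"

/* Return 1 if the clocksource name (optionally newline-terminated)
 * is exactly "tsc", 0 otherwise. */
int clocksource_is_tsc(const char *name)
{
	size_t len = strcspn(name, "\n");

	return len == 3 && strncmp(name, "tsc", 3) == 0;
}

/* Return 1 if Linux currently uses the TSC, 0 if it uses another
 * clocksource (e.g. acpi_pm), -1 if the sysfs file cannot be read. */
int host_uses_tsc(void)
{
	char buf[64];
	FILE *f = fopen(CLOCKSOURCE_PATH, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return clocksource_is_tsc(buf);
}
```

On the test system described above, which switched to acpi_pm,
host_uses_tsc() would be expected to return 0.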


> 
> 3)  "Warning: Linux is compiled to use FPU in kernel-space" in dmesg:
> Ignoring:  disabling (at least) software RAID breaks the
> 'one-size-fits-all' goal, and Gilles states this is harmless:
> http://www.xenomai.org/pipermail/xenomai/2009-May/016743.html


It means that the switchtest test does not really test every possible
FPU switch. So, in order to thoroughly validate a release, you should
compile a kernel without the conflicting options, run
xeno-regression-test, then compile the final kernel, and run
xeno-regression-test again.

Regards.

-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-14 20:39                     ` Gilles Chanteperdrix
@ 2013-01-15 11:35                       ` Jan Kiszka
  2013-01-15 12:06                         ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: Jan Kiszka @ 2013-01-15 11:35 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: John Morris, Xenomai

On 2013-01-14 21:39, Gilles Chanteperdrix wrote:
> Done, also note that my current work is the for-core-3.5.7 branch, not
> the for-core-3.5 branch.

Both branches point to the same commit ATM. I suppose you didn't push
the new for-core-3.5.7 version yet. Once done, I'll rebase my stuff on top.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-15 11:35                       ` Jan Kiszka
@ 2013-01-15 12:06                         ` Gilles Chanteperdrix
  2013-01-15 12:09                           ` Philippe Gerum
  0 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-15 12:06 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: John Morris, Xenomai

On 01/15/2013 12:35 PM, Jan Kiszka wrote:

> On 2013-01-14 21:39, Gilles Chanteperdrix wrote:
>> Done, also note that my current work is the for-core-3.5.7 branch, not
>> the for-core-3.5 branch.
> 
> Both branches point to the same commit ATM. I suppose you didn't push
> the new for-core-3.5.7 version yet. Once done, I'll rebase my stuff on top.


Sorry, pushed.

-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-15 12:06                         ` Gilles Chanteperdrix
@ 2013-01-15 12:09                           ` Philippe Gerum
  2013-01-15 12:21                             ` Jan Kiszka
  0 siblings, 1 reply; 54+ messages in thread
From: Philippe Gerum @ 2013-01-15 12:09 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Jan Kiszka, John Morris, Xenomai

On 01/15/2013 01:06 PM, Gilles Chanteperdrix wrote:
> On 01/15/2013 12:35 PM, Jan Kiszka wrote:
>
>> On 2013-01-14 21:39, Gilles Chanteperdrix wrote:
>>> Done, also note that my current work is the for-core-3.5.7 branch, not
>>> the for-core-3.5 branch.
>>
>> Both branches point to the same commit ATM. I suppose you didn't push
>> the new for-core-3.5.7 version yet. Once done, I'll rebase my stuff on top.
>
>
> Sorry, pushed.
>

Please everybody, make sure to eventually resync with my tree, this is 
becoming a mess when merging your stuff here. TIA,


-- 
Philippe.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-15 12:09                           ` Philippe Gerum
@ 2013-01-15 12:21                             ` Jan Kiszka
  2013-01-15 13:44                               ` Philippe Gerum
  0 siblings, 1 reply; 54+ messages in thread
From: Jan Kiszka @ 2013-01-15 12:21 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: John Morris, Xenomai

On 2013-01-15 13:09, Philippe Gerum wrote:
> On 01/15/2013 01:06 PM, Gilles Chanteperdrix wrote:
>> On 01/15/2013 12:35 PM, Jan Kiszka wrote:
>>
>>> On 2013-01-14 21:39, Gilles Chanteperdrix wrote:
>>>> Done, also note that my current work is the for-core-3.5.7 branch, not
>>>> the for-core-3.5 branch.
>>>
>>> Both branches point to the same commit ATM. I suppose you didn't push
>>> the new for-core-3.5.7 version yet. Once done, I'll rebase my stuff on top.
>>
>>
>> Sorry, pushed.
>>
> 
> Please everybody, make sure to eventually resync with my tree, this is 
> becoming a mess when merging your stuff here. TIA,

I'm fetching from you regularly, but your public tree contains no
changes for 3.5, sorry.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-15 12:21                             ` Jan Kiszka
@ 2013-01-15 13:44                               ` Philippe Gerum
  2013-01-15 13:48                                 ` Jan Kiszka
  0 siblings, 1 reply; 54+ messages in thread
From: Philippe Gerum @ 2013-01-15 13:44 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: John Morris, Xenomai

On 01/15/2013 01:21 PM, Jan Kiszka wrote:
> On 2013-01-15 13:09, Philippe Gerum wrote:
>> On 01/15/2013 01:06 PM, Gilles Chanteperdrix wrote:
>>> On 01/15/2013 12:35 PM, Jan Kiszka wrote:
>>>
>>>> On 2013-01-14 21:39, Gilles Chanteperdrix wrote:
>>>>> Done, also note that my current work is the for-core-3.5.7 branch, not
>>>>> the for-core-3.5 branch.
>>>>
>>>> Both branches point to the same commit ATM. I suppose you didn't push
>>>> the new for-core-3.5.7 version yet. Once done, I'll rebase my stuff on top.
>>>
>>>
>>> Sorry, pushed.
>>>
>>
>> Please everybody, make sure to eventually resync with my tree, this is
>> becoming a mess when merging your stuff here. TIA,
>
> I'm fetching from you regularly, but your public tree contains no
> changes for 3.5, sorry.
>

You must mean no change for 3.5.3 since the last stable pipeline release 
I pushed out. I'm seeing several breakages when merging the very latest 
work, I'm solving this with Gilles.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-15 13:44                               ` Philippe Gerum
@ 2013-01-15 13:48                                 ` Jan Kiszka
  2013-01-15 13:58                                   ` Philippe Gerum
  2013-01-15 13:59                                   ` Jan Kiszka
  0 siblings, 2 replies; 54+ messages in thread
From: Jan Kiszka @ 2013-01-15 13:48 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: John Morris, Xenomai

On 2013-01-15 14:44, Philippe Gerum wrote:
> On 01/15/2013 01:21 PM, Jan Kiszka wrote:
>> On 2013-01-15 13:09, Philippe Gerum wrote:
>>> On 01/15/2013 01:06 PM, Gilles Chanteperdrix wrote:
>>>> On 01/15/2013 12:35 PM, Jan Kiszka wrote:
>>>>
>>>>> On 2013-01-14 21:39, Gilles Chanteperdrix wrote:
>>>>>> Done, also note that my current work is the for-core-3.5.7 branch, not
>>>>>> the for-core-3.5 branch.
>>>>>
>>>>> Both branches point to the same commit ATM. I suppose you didn't push
>>>>> the new for-core-3.5.7 version yet. Once done, I'll rebase my stuff on top.
>>>>
>>>>
>>>> Sorry, pushed.
>>>>
>>>
>>> Please everybody, make sure to eventually resync with my tree, this is
>>> becoming a mess when merging your stuff here. TIA,
>>
>> I'm fetching from you regularly, but your public tree contains no
>> changes for 3.5, sorry.
>>
> 
> You must mean no change for 3.5.3 since the last stable pipeline release 
> I pushed out. I'm seeing several breakages when merging the very latest 
> work, I'm solving this with Gilles.

I mean that I have no clue what I should resolve. My branch was based on
your core-3.5 branch, the latest publicly available version. I've just
rebased it on top of Gilles' 3.5.7 queue - without any conflicts. So
what are you talking about?

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-15 13:48                                 ` Jan Kiszka
@ 2013-01-15 13:58                                   ` Philippe Gerum
  2013-01-15 14:10                                     ` Jan Kiszka
  2013-01-15 19:39                                     ` Gilles Chanteperdrix
  2013-01-15 13:59                                   ` Jan Kiszka
  1 sibling, 2 replies; 54+ messages in thread
From: Philippe Gerum @ 2013-01-15 13:58 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: John Morris, Xenomai

On 01/15/2013 02:48 PM, Jan Kiszka wrote:
> On 2013-01-15 14:44, Philippe Gerum wrote:
>> On 01/15/2013 01:21 PM, Jan Kiszka wrote:
>>> On 2013-01-15 13:09, Philippe Gerum wrote:
>>>> On 01/15/2013 01:06 PM, Gilles Chanteperdrix wrote:
>>>>> On 01/15/2013 12:35 PM, Jan Kiszka wrote:
>>>>>
>>>>>> On 2013-01-14 21:39, Gilles Chanteperdrix wrote:
>>>>>>> Done, also note that my current work is the for-core-3.5.7 branch, not
>>>>>>> the for-core-3.5 branch.
>>>>>>
>>>>>> Both branches point to the same commit ATM. I suppose you didn't push
>>>>>> the new for-core-3.5.7 version yet. Once done, I'll rebase my stuff on top.
>>>>>
>>>>>
>>>>> Sorry, pushed.
>>>>>
>>>>
>>>> Please everybody, make sure to eventually resync with my tree, this is
>>>> becoming a mess when merging your stuff here. TIA,
>>>
>>> I'm fetching from you regularly, but your public tree contains no
>>> changes for 3.5, sorry.
>>>
>>
>> You must mean no change for 3.5.3 since the last stable pipeline release
>> I pushed out. I'm seeing several breakages when merging the very latest
>> work, I'm solving this with Gilles.
>
> I mean that I have no clue what I should resolve. My branch was based on
> your core-3.5 branch, the latest publicly available version. I've just
> rebased it on top of Gilles' 3.5.7 queue - without any conflicts. So
> what are you talking about?
>

I'm talking about conflicts in pgtable.h, apic.c with the atomic counter 
braindamage and stuff like this. I'm not asking you to fix anything in 
your tree, I'm pulling from Gilles' trees almost exclusively, and had 
issues with those. I raised an alert about painful merges happening 
lately, and a recommendation to avoid these. Gilles fixed the issue on 
his end, and the merge now resolves as a fast forward, as expected. 
Issue closed.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-15 13:48                                 ` Jan Kiszka
  2013-01-15 13:58                                   ` Philippe Gerum
@ 2013-01-15 13:59                                   ` Jan Kiszka
  1 sibling, 0 replies; 54+ messages in thread
From: Jan Kiszka @ 2013-01-15 13:59 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Xenomai

On 2013-01-15 14:48, Jan Kiszka wrote:
> On 2013-01-15 14:44, Philippe Gerum wrote:
>> On 01/15/2013 01:21 PM, Jan Kiszka wrote:
>>> On 2013-01-15 13:09, Philippe Gerum wrote:
>>>> On 01/15/2013 01:06 PM, Gilles Chanteperdrix wrote:
>>>>> On 01/15/2013 12:35 PM, Jan Kiszka wrote:
>>>>>
>>>>>> On 2013-01-14 21:39, Gilles Chanteperdrix wrote:
>>>>>>> Done, also note that my current work is the for-core-3.5.7 branch, not
>>>>>>> the for-core-3.5 branch.
>>>>>>
>>>>>> Both branches point to the same commit ATM. I suppose you didn't push
>>>>>> the new for-core-3.5.7 version yet. Once done, I'll rebase my stuff on top.
>>>>>
>>>>>
>>>>> Sorry, pushed.
>>>>>
>>>>
>>>> Please everybody, make sure to eventually resync with my tree, this is
>>>> becoming a mess when merging your stuff here. TIA,
>>>
>>> I'm fetching from you regularly, but your public tree contains no
>>> changes for 3.5, sorry.
>>>
>>
>> You must mean no change for 3.5.3 since the last stable pipeline release 
>> I pushed out. I'm seeing several breakages when merging the very latest 
>> work, I'm solving this with Gilles.
> 
> I mean that I have no clue what I should resolve. My branch was based on
> your core-3.5 branch, the latest publicly available version. I've just
> rebased it on top of Gilles' 3.5.7 queue - without any conflicts. So
> what are you talking about?
> 

[taking John from CC]

BTW, while we are at it, we may also discuss what the maintenance plans
for older core versions and for porting to newer ones are.

I could imagine backporting and testing our x86-related fixes to
core-3.4 (as it's still LTS), but I see no point in maintaining older
versions. If we manage to move forward soon, I'd move maintenance away
from 3.5 to that version once it's stable.

IOW: One stable kernel + one recent target appear reasonable to me, at
least for x86.

Jan

-- 
Siemens AG
Corporate Technology
CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
Otto-Hahn-Ring 6
81739 Muenchen
Tel.: +49 (89) 636-40042
Fax: +49 (89) 636-45450
mailto:jan.kiszka@siemens.com

Siemens Aktiengesellschaft: Chairman of the Supervisory Board: Gerhard
Cromme; Managing Board: Peter Loescher, Chairman, President and Chief
Executive Officer; Roland Busch, Brigitte Ederer, Klaus Helmrich, Joe
Kaeser, Barbara Kux, Hermann Requardt, Siegfried Russwurm, Peter Y.
Solmssen, Michael Suess; Registered offices: Berlin and Munich, Germany;
Commercial registries: Berlin  Charlottenburg, HRB 12300, Munich, HRB
6684; WEEE-Reg.-No. DE 23691322


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-15 13:58                                   ` Philippe Gerum
@ 2013-01-15 14:10                                     ` Jan Kiszka
  2013-01-15 19:39                                     ` Gilles Chanteperdrix
  1 sibling, 0 replies; 54+ messages in thread
From: Jan Kiszka @ 2013-01-15 14:10 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Xenomai

On 2013-01-15 14:58, Philippe Gerum wrote:
> On 01/15/2013 02:48 PM, Jan Kiszka wrote:
>> On 2013-01-15 14:44, Philippe Gerum wrote:
>>> On 01/15/2013 01:21 PM, Jan Kiszka wrote:
>>>> On 2013-01-15 13:09, Philippe Gerum wrote:
>>>>> On 01/15/2013 01:06 PM, Gilles Chanteperdrix wrote:
>>>>>> On 01/15/2013 12:35 PM, Jan Kiszka wrote:
>>>>>>
>>>>>>> On 2013-01-14 21:39, Gilles Chanteperdrix wrote:
>>>>>>>> Done, also note that my current work is the for-core-3.5.7 branch, not
>>>>>>>> the for-core-3.5 branch.
>>>>>>>
>>>>>>> Both branches point to the same commit ATM. I suppose you didn't push
>>>>>>> the new for-core-3.5.7 version yet. Once done, I'll rebase my stuff on top.
>>>>>>
>>>>>>
>>>>>> Sorry, pushed.
>>>>>>
>>>>>
>>>>> Please everybody, make sure to eventually resync with my tree, this is
>>>>> becoming a mess when merging your stuff here. TIA,
>>>>
>>>> I'm fetching from you regularly, but your public tree contains no
>>>> changes for 3.5, sorry.
>>>>
>>>
>>> You must mean no change for 3.5.3 since the last stable pipeline release
>>> I pushed out. I'm seeing several breakages when merging the very latest
>>> work, I'm solving this with Gilles.
>>
>> I mean that I have no clue what I should resolve. My branch was based on
>> your core-3.5 branch, the latest publicly available version. I've just
>> rebased it on top of Gilles' 3.5.7 queue - without any conflicts. So
>> what are you talking about?
>>
> 
> I'm talking about conflicts in pgtable.h, apic.c with the atomic counter 
> braindamage and stuff like this. I'm not asking you to fix anything in 
> your tree, I'm pulling from Gilles' trees almost exclusively, and had 
> issues with those. I raised an alert about painful merges happening 
> lately, and a recommendation to avoid these. Gilles fixed the issue on 
> his end, and the merge now resolves as a fast forward, as expected. 
> Issue closed.

I still don't see even that issue (based on what was publicly visible),
but if it's fine now, well, it's fine.

BTW, we are committed to x86 maintenance, so I would suggest routing
all related changes through one tree (per core version) to avoid
conflicts and confusion. Unless there are major concerns on your side,
I could offer that tree to Philippe to pull from.

In addition, I could provide all generic commits in a separate tree as
well so that Gilles can pull them earlier for testing purposes. If
conflicts between Gilles' and my tree are in sight, we should, of
course, always try to resolve them in advance before offering you the pull.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-15 13:58                                   ` Philippe Gerum
  2013-01-15 14:10                                     ` Jan Kiszka
@ 2013-01-15 19:39                                     ` Gilles Chanteperdrix
  2013-01-16  8:02                                       ` Jan Kiszka
  1 sibling, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-15 19:39 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Jan Kiszka, John Morris, Xenomai

On 01/15/2013 02:58 PM, Philippe Gerum wrote:

> On 01/15/2013 02:48 PM, Jan Kiszka wrote:
>> On 2013-01-15 14:44, Philippe Gerum wrote:
>>> On 01/15/2013 01:21 PM, Jan Kiszka wrote:
>>>> On 2013-01-15 13:09, Philippe Gerum wrote:
>>>>> On 01/15/2013 01:06 PM, Gilles Chanteperdrix wrote:
>>>>>> On 01/15/2013 12:35 PM, Jan Kiszka wrote:
>>>>>>
>>>>>>> On 2013-01-14 21:39, Gilles Chanteperdrix wrote:
>>>>>>>> Done, also note that my current work is the for-core-3.5.7 branch, not
>>>>>>>> the for-core-3.5 branch.
>>>>>>>
>>>>>>> Both branches point to the same commit ATM. I suppose you didn't push
>>>>>>> the new for-core-3.5.7 version yet. Once done, I'll rebase my stuff on top.
>>>>>>
>>>>>>
>>>>>> Sorry, pushed.
>>>>>>
>>>>>
>>>>> Please everybody, make sure to eventually resync with my tree, this is
>>>>> becoming a mess when merging your stuff here. TIA,
>>>>
>>>> I'm fetching from you regularly, but your public tree contains no
>>>> changes for 3.5, sorry.
>>>>
>>>
>>> You must mean no change for 3.5.3 since the last stable pipeline release
>>> I pushed out. I'm seeing several breakages when merging the very latest
>>> work, I'm solving this with Gilles.
>>
>> I mean that I have no clue what I should resolve. My branch was based on
>> your core-3.5 branch, the latest publicly available version. I've just
>> rebased it on top of Gilles' 3.5.7 queue - without any conflicts. So
>> what are you talking about?
>>
> 
> I'm talking about conflicts in pgtable.h, apic.c with the atomic counter 
> braindamage and stuff like this. I'm not asking you to fix anything in 
> your tree, I'm pulling from Gilles' trees almost exclusively, and had 
> issues with those. I raised an alert about painful merges happening 
> lately, and a recommendation to avoid these. Gilles fixed the issue on 
> his end, and the merge now resolves as a fast forward, as expected. 
> Issue closed.


These were issues due to rebasing. Rebasing makes incremental merging
harder; I will avoid this in the future.

-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-15 19:39                                     ` Gilles Chanteperdrix
@ 2013-01-16  8:02                                       ` Jan Kiszka
  2013-01-16  8:44                                         ` Jan Kiszka
  0 siblings, 1 reply; 54+ messages in thread
From: Jan Kiszka @ 2013-01-16  8:02 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: John Morris, Xenomai

On 2013-01-15 20:39, Gilles Chanteperdrix wrote:
> On 01/15/2013 02:58 PM, Philippe Gerum wrote:
> 
>> On 01/15/2013 02:48 PM, Jan Kiszka wrote:
>>> On 2013-01-15 14:44, Philippe Gerum wrote:
>>>> On 01/15/2013 01:21 PM, Jan Kiszka wrote:
>>>>> On 2013-01-15 13:09, Philippe Gerum wrote:
>>>>>> On 01/15/2013 01:06 PM, Gilles Chanteperdrix wrote:
>>>>>>> On 01/15/2013 12:35 PM, Jan Kiszka wrote:
>>>>>>>
>>>>>>>> On 2013-01-14 21:39, Gilles Chanteperdrix wrote:
>>>>>>>>> Done, also note that my current work is the for-core-3.5.7 branch, not
>>>>>>>>> the for-core-3.5 branch.
>>>>>>>>
>>>>>>>> Both branches point to the same commit ATM. I suppose you didn't push
>>>>>>>> the new for-core-3.5.7 version yet. Once done, I'll rebase my stuff on top.
>>>>>>>
>>>>>>>
>>>>>>> Sorry, pushed.
>>>>>>>
>>>>>>
>>>>>> Please everybody, make sure to eventually resync with my tree, this is
>>>>>> becoming a mess when merging your stuff here. TIA,
>>>>>
>>>>> I'm fetching from you regularly, but your public tree contains no
>>>>> changes for 3.5, sorry.
>>>>>
>>>>
>>>> You must mean no change for 3.5.3 since the last stable pipeline release
>>>> I pushed out. I'm seeing several breakages when merging the very latest
>>>> work, I'm solving this with Gilles.
>>>
>>> I mean that I have no clue what I should resolve. My branch was based on
>>> your core-3.5 branch, the latest publicly available version. I've just
>>> rebased it on top of Gilles' 3.5.7 queue - without any conflicts. So
>>> what are you talking about?
>>>
>>
>> I'm talking about conflicts in pgtable.h, apic.c with the atomic counter 
>> braindamage and stuff like this. I'm not asking you to fix anything in 
>> your tree, I'm pulling from Gilles' trees almost exclusively, and had 
>> issues with those. I raised an alert about painful merges happening 
>> lately, and a recommendation to avoid these. Gilles fixed the issue on 
>> his end, and the merge now resolves as a fast forward, as expected. 
>> Issue closed.
> 
> 
> These were issues due to rebasing. Rebasing makes incremental merging
> harder; I will avoid this in the future.

Rebasing is not bad per se and, where possible, better than leaving an
un-bisectable commit series behind - IMHO. I suppose Philippe started
pulling from your queue before you reordered/updated it again. That's
why I didn't send out a pull request yet - mine wasn't done. I'm trying
to formalize this process here, i.e. never rebase after releasing a
queue for pulling. At the same time, upstream should not pull or pick in
a way that makes life harder for downstream.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-16  8:02                                       ` Jan Kiszka
@ 2013-01-16  8:44                                         ` Jan Kiszka
  2013-01-16  9:41                                           ` Philippe Gerum
  0 siblings, 1 reply; 54+ messages in thread
From: Jan Kiszka @ 2013-01-16  8:44 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: John Morris, Xenomai

On 2013-01-16 09:02, Jan Kiszka wrote:
> At the same time, upstream should not pull or pick in
> a way that makes life harder for downstream.

Philippe, in the future, please keep your public tree up-to-date,
ideally on a daily basis. I'm seeing commits there that were done locally
more than a week ago. Only publishing your state will avoid the problems
you faced with integrating our changes.

Will you pull Gilles' 3.5.7 merge before the next core release?

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-16  8:44                                         ` Jan Kiszka
@ 2013-01-16  9:41                                           ` Philippe Gerum
  2013-01-16  9:48                                             ` Jan Kiszka
  0 siblings, 1 reply; 54+ messages in thread
From: Philippe Gerum @ 2013-01-16  9:41 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: John Morris, Xenomai

On 01/16/2013 09:44 AM, Jan Kiszka wrote:
> On 2013-01-16 09:02, Jan Kiszka wrote:
>> At the same time, upstream should not pull or pick in
>> a way that makes life harder for downstream.
>
> Philippe, in the future, please keep your public tree up-to-date,
> ideally on a daily basis. I'm seeing commits there that were done locally
> more than a week ago. Only publishing your state will avoid the problems
> you faced with integrating our changes.

No. If I don't publish, there must be a reason.

>
> Will you pull Gilles' 3.5.7 merge before the next core release?
>
> Thanks,
> Jan
>


-- 
Philippe.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-16  9:41                                           ` Philippe Gerum
@ 2013-01-16  9:48                                             ` Jan Kiszka
  2013-01-16 10:37                                               ` Philippe Gerum
  0 siblings, 1 reply; 54+ messages in thread
From: Jan Kiszka @ 2013-01-16  9:48 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: John Morris, Xenomai

On 2013-01-16 10:41, Philippe Gerum wrote:
> On 01/16/2013 09:44 AM, Jan Kiszka wrote:
>> On 2013-01-16 09:02, Jan Kiszka wrote:
>>> At the same time, upstream should not pull or pick in
>>> a way that makes life harder for downstream.
>>
>> Philippe, in the future, please keep your public tree up-to-date,
>> ideally on a daily basis. I'm seeing commits there that were done locally
>> more than a week ago. Only publishing your state will avoid the problems
>> you faced with integrating our changes.
> 
> No. If I don't publish, there must be a reason.

This approach doesn't work very well - to state it carefully. Push your
stuff at least into a public "next" branch, one that may be rebased /
reordered without warning. That allows us to prepare for what is under
test, maybe pick up arch-specific changes into the subsystem queues. And
don't complain about conflicts if you cherry-pick subsystem patches
without dropping a note to the author - or was Gilles aware of your
private queue?

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-16  9:48                                             ` Jan Kiszka
@ 2013-01-16 10:37                                               ` Philippe Gerum
  2013-01-16 12:03                                                 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: Philippe Gerum @ 2013-01-16 10:37 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: John Morris, Xenomai

On 01/16/2013 10:48 AM, Jan Kiszka wrote:
> On 2013-01-16 10:41, Philippe Gerum wrote:
>> On 01/16/2013 09:44 AM, Jan Kiszka wrote:
>>> On 2013-01-16 09:02, Jan Kiszka wrote:
>>>> At the same time, upstream should not pull or pick in
>>>> a way that makes life harder for downstream.
>>>
>>> Philippe, in the future, please keep your public tree up-to-date,
>>> ideally on a daily basis. I'm seeing commits there that were done locally
>>> more than a week ago. Only publishing your state will avoid the problems
>>> you faced with integrating our changes.
>>
>> No. If I don't publish, there must be a reason.
>
> This approach doesn't work very well - to state it carefully. Push your
> stuff at least into a public "next" branch, one that may be rebased /
> reordered without warning. That allows us to prepare for what is under
> test, maybe pick up arch-specific changes into the subsystem queues. And
> don't complain about conflicts if you cherry-pick subsystem patches
> without dropping a note to the author - or was Gilles aware of your
> private queue?
>

You are off base, this has nothing to do with random cherry picking, or 
any private queue I would maintain: none of these have existed. Besides, 
it's not about integrating my changes in the reference tree - I had 
almost none recently - but Gilles', and many of them came from other 
sources he merged into his local queue.

Please, we have been working reasonably successfully with our current 
workflow for the past ten years now, so we had some time to understand - 
even if our very limited brainpower made this quite a challenge - the 
basics of distributed development.

What happened is a misunderstanding between Gilles and myself on the 
presence of a pending pull request, nothing more, which eventually led 
to an out-of-sync pulling on my end. Since this happened when several 
people were pushing stuff to Gilles, this triggered my warning to the 
list, so that everyone involved may know that things stopped going as 
smoothly as usual. Rest assured that under normal circumstances, Gilles 
is very well aware of what I'm working on daily, and conversely. I 
understand your willingness to work the right way now that you recently 
committed to maintaining the x86 branch, and this is appreciated.

On a more general note, the situation changed because we have been 
receiving more patches from more authors in the last weeks than we used 
to get, and conflicts follow the same trend. So yes, for this reason I'm 
going to open a 'next' branch or something similar to be periodically 
merged to the reference core branch as a fast forward.
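
To make the "fast forward" condition concrete: merging 'next' into the
reference branch is a fast forward only when the tip of the reference
branch is an ancestor of (or equal to) the tip of 'next'. Below is a
minimal Python sketch over a toy commit graph. The commit names, the
`parents` mapping and both helper functions are made up for illustration;
this is not how git implements the check internally.

```python
def is_ancestor(parents, ancestor, commit):
    """Return True if `ancestor` is reachable from `commit` via parent links."""
    stack = [commit]
    seen = set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Walk towards the root; merge commits may have several parents.
        stack.extend(parents.get(current, []))
    return False


def can_fast_forward(parents, target_tip, source_tip):
    """Merging source into target is a fast forward when the target tip
    is an ancestor of (or equal to) the source tip."""
    return is_ancestor(parents, target_tip, source_tip)


# Toy history (hypothetical): A <- B <- C is 'next'; D is a commit made
# directly on top of B in the reference branch, i.e. a divergence.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}

print(can_fast_forward(parents, "B", "C"))  # reference at B, next at C
print(can_fast_forward(parents, "D", "C"))  # reference diverged at D
```

In the first case the reference branch can simply be moved forward to C;
in the second a real merge (or a rebase of 'next') would be needed.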

-- 
Philippe.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-16 10:37                                               ` Philippe Gerum
@ 2013-01-16 12:03                                                 ` Gilles Chanteperdrix
  2013-01-16 12:18                                                   ` Jan Kiszka
  0 siblings, 1 reply; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-16 12:03 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Jan Kiszka, John Morris, Xenomai

On 01/16/2013 11:37 AM, Philippe Gerum wrote:

> What happened is a misunderstanding between Gilles and myself on the 
> presence of a pending pull request, nothing more, which eventually led 
> to an out-of-sync pulling on my end. Since this happened when several 
> people were pushing stuff to Gilles, this triggered my warning to the 
> list, so that everyone involved may know that things stopped going as 
> smoothly as usual. Rest assured that under normal circumstances, Gilles 
> is very well aware of what I'm working on daily, and conversely. I 
> understand your willingness to work the right way now that you recently 
> committed to maintaining the x86 branch, and this is appreciated.


My mistake was sending the pull request and then continuing to work on
the same branch. I should have a dedicated branch for pull requests, or
send pull requests for tags, if that is possible.

-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests)
  2013-01-16 12:03                                                 ` Gilles Chanteperdrix
@ 2013-01-16 12:18                                                   ` Jan Kiszka
  0 siblings, 0 replies; 54+ messages in thread
From: Jan Kiszka @ 2013-01-16 12:18 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: John Morris, Xenomai

On 2013-01-16 13:03, Gilles Chanteperdrix wrote:
> On 01/16/2013 11:37 AM, Philippe Gerum wrote:
> 
>> What happened is a misunderstanding between Gilles and myself on the 
>> presence of a pending pull request, nothing more, which eventually led 
>> to an out-of-sync pulling on my end. Since this happened when several 
>> people were pushing stuff to Gilles, this triggered my warning to the 
>> list, so that everyone involved may know that things stopped going as 
>> smoothly as usual. Rest assured that under normal circumstances, Gilles 
>> is very well aware of what I'm working on daily, and conversely. I 
>> understand your willingness to work the right way now that you recently 
>> committed to maintaining the x86 branch, and this is appreciated.
> 
> 
> My mistake was sending the pull request and then continuing to work on
> the same branch. I should have a dedicated branch for pull requests, or
> send pull requests for tags, if that is possible.

I think this all boils down to communication, like formally revoking a
pull request or notifying the contributor when his pull request has been
merged. And this will become more important now that more people work on
I-pipe. There are much bigger projects that struggle with such glitches,
but we should be able to avoid them more easily.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 posix/mprotect fixed; (Was:  posix/mprotect failure)
  2013-01-15  7:31                         ` Gilles Chanteperdrix
@ 2013-01-18  3:56                           ` John Morris
  2013-01-18  4:31                             ` Gilles Chanteperdrix
  0 siblings, 1 reply; 54+ messages in thread
From: John Morris @ 2013-01-18  3:56 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 01/15/2013 01:31 AM, Gilles Chanteperdrix wrote:
> On 01/15/2013 08:16 AM, John Morris wrote:
>> 1)  "xddp_test" fails:  "...IPCPROTO_XDDP): Address family not supported
>> by protocol".  DRIVERS_RTIPC_XDDP is enabled in the config.  Doesn't
>> look like others have experienced this before.  LinuxCNC doesn't use
>> XDDP, but it should be fixed for the general audience.
> 
> 
> You are missing some kernel drivers. I used the following options in the
> Debian configuration: basically, all options enabled (some as module,
> because they are expected to be used less often).
> 
> CONFIG_XENO_DRIVERS_RTIPC_XDDP=y

I think the problem was here.  xddp was compiled as a module in my
original config, and wasn't loaded by the test suite.

	John



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Xenomai] 3.5.7 posix/mprotect fixed; (Was:  posix/mprotect failure)
  2013-01-18  3:56                           ` John Morris
@ 2013-01-18  4:31                             ` Gilles Chanteperdrix
  0 siblings, 0 replies; 54+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-18  4:31 UTC (permalink / raw)
  To: John Morris; +Cc: Xenomai

On 01/18/2013 04:56 AM, John Morris wrote:

> On 01/15/2013 01:31 AM, Gilles Chanteperdrix wrote:
>> On 01/15/2013 08:16 AM, John Morris wrote:
>>> 1)  "xddp_test" fails:  "...IPCPROTO_XDDP): Address family not supported
>>> by protocol".  DRIVERS_RTIPC_XDDP is enabled in the config.  Doesn't
>>> look like others have experienced this before.  LinuxCNC doesn't use
>>> XDDP, but it should be fixed for the general audience.
>>
>>
>> You are missing some kernel drivers. I used the following options in the
>> Debian configuration: basically, all options enabled (some as module,
>> because they are expected to be used less often).
>>
>> CONFIG_XENO_DRIVERS_RTIPC_XDDP=y
> 
> I think the problem was here.  xddp was compiled as a module in my
> original config, and wasn't loaded by the test suite.


The test suite does not load any modules. It makes more sense to have
xddp built in anyway.
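
As a rough illustration (not part of the test suite), the built-in versus
module distinction can be checked from a kernel .config before running
the tests. The Python sketch below is only that, a sketch: the helper name
`option_state`, the sample config text and CONFIG_HYPOTHETICAL_OPTION are
assumptions made for the example; only CONFIG_XENO_DRIVERS_RTIPC_XDDP
comes from this thread.

```python
def option_state(config_text, option):
    """Return the value of a kernel config option ('y', 'm', ...) or None.

    config_text is the content of a kernel .config file; option is the
    full name, e.g. CONFIG_XENO_DRIVERS_RTIPC_XDDP.
    """
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    # Covers both a missing option and "# CONFIG_FOO is not set".
    return None


# Hypothetical sample; a real check would read the actual .config file.
sample = """\
CONFIG_XENO_DRIVERS_RTIPC_XDDP=y
# CONFIG_HYPOTHETICAL_OPTION is not set
"""

state = option_state(sample, "CONFIG_XENO_DRIVERS_RTIPC_XDDP")
if state == "y":
    print("built in: nothing to load")
elif state == "m":
    print("built as a module: load it before running the tests")
else:
    print("not enabled")
```

With "=m" the driver exists only as a module, so it has to be loaded by
hand before the tests are started.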

-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2013-01-18  4:31 UTC | newest]

Thread overview: 54+ messages
2013-01-12 17:26 [Xenomai] Kernel OOPS during regression tests John Morris
2013-01-12 17:31 ` Gilles Chanteperdrix
2013-01-13  4:36   ` John Morris
2013-01-13 12:16     ` Gilles Chanteperdrix
2013-01-13 19:14       ` [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests) John Morris
2013-01-13 19:41         ` Gilles Chanteperdrix
2013-01-14  4:47           ` [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed! John Morris
2013-01-14 11:57             ` Gilles Chanteperdrix
2013-01-14 12:00               ` Jan Kiszka
2013-01-14 13:36                 ` Jan Kiszka
2013-01-14 20:52                   ` John Morris
2013-01-14 22:54                     ` Gilles Chanteperdrix
2013-01-15  7:16                       ` [Xenomai] 3.5.7 posix/mprotect fixed; (Was: posix/mprotect failure) John Morris
2013-01-15  7:31                         ` Gilles Chanteperdrix
2013-01-18  3:56                           ` John Morris
2013-01-18  4:31                             ` Gilles Chanteperdrix
2013-01-14 19:50             ` [Xenomai] 3.5.7 posix/mprotect failure; "I-pipe: could not find timer" fixed! Gilles Chanteperdrix
2013-01-14 20:56               ` John Morris
2013-01-14 22:57                 ` Gilles Chanteperdrix
2013-01-14 12:00           ` [Xenomai] 3.5.7 "I-pipe: could not find timer" (Was: Re: Kernel OOPS during regression tests) Jan Kiszka
2013-01-14 18:50             ` Gilles Chanteperdrix
2013-01-14 19:13               ` Jan Kiszka
2013-01-14 19:15                 ` Gilles Chanteperdrix
2013-01-14 19:37                   ` Jan Kiszka
2013-01-14 20:39                     ` Gilles Chanteperdrix
2013-01-15 11:35                       ` Jan Kiszka
2013-01-15 12:06                         ` Gilles Chanteperdrix
2013-01-15 12:09                           ` Philippe Gerum
2013-01-15 12:21                             ` Jan Kiszka
2013-01-15 13:44                               ` Philippe Gerum
2013-01-15 13:48                                 ` Jan Kiszka
2013-01-15 13:58                                   ` Philippe Gerum
2013-01-15 14:10                                     ` Jan Kiszka
2013-01-15 19:39                                     ` Gilles Chanteperdrix
2013-01-16  8:02                                       ` Jan Kiszka
2013-01-16  8:44                                         ` Jan Kiszka
2013-01-16  9:41                                           ` Philippe Gerum
2013-01-16  9:48                                             ` Jan Kiszka
2013-01-16 10:37                                               ` Philippe Gerum
2013-01-16 12:03                                                 ` Gilles Chanteperdrix
2013-01-16 12:18                                                   ` Jan Kiszka
2013-01-15 13:59                                   ` Jan Kiszka
2013-01-12 19:02 ` [Xenomai] Kernel OOPS during regression tests Gilles Chanteperdrix
2013-01-13  6:50   ` John Morris
2013-01-13 11:23     ` Jan Kiszka
2013-01-13 12:18       ` Gilles Chanteperdrix
2013-01-13 19:34         ` [Xenomai] 3.5.3 posix/mprotect fail "sigdebug_handler triggered" (Was: Re: Kernel OOPS during regression tests) John Morris
2013-01-13 19:42           ` Gilles Chanteperdrix
2013-01-12 19:03 ` [Xenomai] Kernel OOPS during regression tests Gilles Chanteperdrix
2013-01-13  4:40   ` John Morris
2013-01-13 13:53     ` Gilles Chanteperdrix
2013-01-13 19:36       ` [Xenomai] SMI workarounds in one-size-fits-all kernel packages (Was: Re: Kernel OOPS during regression tests) John Morris
2013-01-13 19:45         ` Gilles Chanteperdrix
2013-01-14  5:33           ` John Morris