* Dovetail 5.10 - Xenomai huge page support
@ 2022-01-19 13:35 Bezdeka, Florian
  2022-01-19 13:49 ` Bezdeka, Florian
  2022-01-20  8:30 ` Philippe Gerum
  0 siblings, 2 replies; 7+ messages in thread
From: Bezdeka, Florian @ 2022-01-19 13:35 UTC (permalink / raw)
  To: xenomai@xenomai.org

Hi all,

After migrating from 4.19-ipipe to 5.10-dovetail, internal users are
reporting that they receive sporadic SIGXCPUs. The reported reason is
"page_fault_user".

After disabling the huge page support in 5.10 the issue vanished. Tests
are still running, but we are optimistic that we found the root cause.

Are there any known limitations to the huge page support in Xenomai /
Dovetail? My understanding is that huge pages are supported by Xenomai,
so any ideas where to look?

Best regards,
Florian
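
[Editor's sketch] When reproducing a report like the one above, it helps to record which huge page mode the kernel was actually running with. The following is a minimal sketch, assuming that "huge page support" refers to transparent huge pages (the thread does not say whether THP or hugetlbfs was disabled). It reads the standard sysfs file `/sys/kernel/mm/transparent_hugepage/enabled`, whose content lists the available modes with the active one in brackets. The helper names are illustrative and not part of Xenomai or Dovetail.

```python
from pathlib import Path

# Standard sysfs location of the THP policy on kernels built with THP support.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def selected_thp_mode(text):
    """Return the bracketed token from a line such as 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED):
    """Read the live setting; return None if the file cannot be read
    (for instance when the kernel was built without THP)."""
    try:
        return selected_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("THP mode:", current_thp_mode())
```

Logging this value next to each test run makes it easier to tell later whether a run with "huge pages disabled" really had them off at runtime or only in the build configuration.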


* Re: Dovetail 5.10 - Xenomai huge page support
  2022-01-19 13:35 Dovetail 5.10 - Xenomai huge page support Bezdeka, Florian
@ 2022-01-19 13:49 ` Bezdeka, Florian
  2022-01-20  8:32   ` Philippe Gerum
  2022-01-20  8:30 ` Philippe Gerum
  1 sibling, 1 reply; 7+ messages in thread
From: Bezdeka, Florian @ 2022-01-19 13:49 UTC (permalink / raw)
  To: xenomai@xenomai.org

On Wed, 2022-01-19 at 13:35 +0000, Bezdeka, Florian via Xenomai wrote:
> Hi all,
> 
> after migrating from 4.19-ipipe to 5.10-dovetail internal users are
> reporting that they receive sporadic SIGXCPUs. The reported reason is
> "page_fault_user". 
> 
> After disabling the huge page support in 5.10 the issue vanished. Tests
> are still running, but we are optimistic that we found the root cause.
> 
> Are there any known limitations to the huge page support in Xenomai /
> Dovetail? My understanding is that huge pages are supported by Xenomai,
> so any ideas where to look at?

[1] is part of the 5.15-rebased dovetail branch but not part of the
5.10-rebase branch and mentions huge page handling. Could that be
related? Maybe this patch is not 5.15 specific?

[1] https://source.denx.de/Xenomai/linux-dovetail/-/commit/b0a4f4c3891949d079aa3d65f2a56428a67e580c

> 
> Best regards,
> Florian



* Re: Dovetail 5.10 - Xenomai huge page support
  2022-01-19 13:35 Dovetail 5.10 - Xenomai huge page support Bezdeka, Florian
  2022-01-19 13:49 ` Bezdeka, Florian
@ 2022-01-20  8:30 ` Philippe Gerum
  1 sibling, 0 replies; 7+ messages in thread
From: Philippe Gerum @ 2022-01-20  8:30 UTC (permalink / raw)
  To: Bezdeka, Florian; +Cc: xenomai@xenomai.org, jan.kiszka@siemens.com


"Bezdeka, Florian" <florian.bezdeka@siemens.com> writes:

> Hi all,
>
> after migrating from 4.19-ipipe to 5.10-dovetail internal users are
> reporting that they receive sporadic SIGXCPUs. The reported reason is
> "page_fault_user". 
>
> After disabling the huge page support in 5.10 the issue vanished. Tests
> are still running, but we are optimistic that we found the root cause.
>
> Are there any known limitations to the huge page support in Xenomai /
> Dovetail? My understanding is that huge pages are supported by Xenomai,
> so any ideas where to look at?
>

No limitation intended or known. However, the recent change in the
Dovetail implementation you mention in your next post may indeed help
address the issue by disabling COW for huge pages too, which was missing
so far.

-- 
Philippe.



* Re: Dovetail 5.10 - Xenomai huge page support
  2022-01-19 13:49 ` Bezdeka, Florian
@ 2022-01-20  8:32   ` Philippe Gerum
  2022-01-20  9:15     ` Bezdeka, Florian
  0 siblings, 1 reply; 7+ messages in thread
From: Philippe Gerum @ 2022-01-20  8:32 UTC (permalink / raw)
  To: Bezdeka, Florian; +Cc: xenomai@xenomai.org, jan.kiszka@siemens.com


"Bezdeka, Florian" <florian.bezdeka@siemens.com> writes:

> On Wed, 2022-01-19 at 13:35 +0000, Bezdeka, Florian via Xenomai wrote:
>> Hi all,
>> 
>> after migrating from 4.19-ipipe to 5.10-dovetail internal users are
>> reporting that they receive sporadic SIGXCPUs. The reported reason is
>> "page_fault_user". 
>> 
>> After disabling the huge page support in 5.10 the issue vanished. Tests
>> are still running, but we are optimistic that we found the root cause.
>> 
>> Are there any known limitations to the huge page support in Xenomai /
>> Dovetail? My understanding is that huge pages are supported by Xenomai,
>> so any ideas where to look at?
>
> [1] is part of the 5.15-rebased dovetail branch but not part of the
> 5.10-rebase branch and mentions huge page handling. Could that be
> related? Maybe this patch is not 5.15 specific?
>

It is not 5.15 specific. The same logic would apply to 5.10.

> [1] https://source.denx.de/Xenomai/linux-dovetail/-/commit/b0a4f4c3891949d079aa3d65f2a56428a67e580c
>
>> 
>> Best regards,
>> Florian


-- 
Philippe.



* Re: Dovetail 5.10 - Xenomai huge page support
  2022-01-20  8:32   ` Philippe Gerum
@ 2022-01-20  9:15     ` Bezdeka, Florian
  2022-01-25 16:48       ` Jan Kiszka
  0 siblings, 1 reply; 7+ messages in thread
From: Bezdeka, Florian @ 2022-01-20  9:15 UTC (permalink / raw)
  To: rpm@xenomai.org; +Cc: xenomai@xenomai.org, jan.kiszka@siemens.com

On Thu, 2022-01-20 at 09:32 +0100, Philippe Gerum wrote:
> "Bezdeka, Florian" <florian.bezdeka@siemens.com> writes:
> 
> > On Wed, 2022-01-19 at 13:35 +0000, Bezdeka, Florian via Xenomai wrote:
> > > Hi all,
> > > 
> > > after migrating from 4.19-ipipe to 5.10-dovetail internal users are
> > > reporting that they receive sporadic SIGXCPUs. The reported reason is
> > > "page_fault_user". 
> > > 
> > > After disabling the huge page support in 5.10 the issue vanished. Tests
> > > are still running, but we are optimistic that we found the root cause.
> > > 
> > > Are there any known limitations to the huge page support in Xenomai /
> > > Dovetail? My understanding is that huge pages are supported by Xenomai,
> > > so any ideas where to look at?
> > 
> > [1] is part of the 5.15-rebased dovetail branch but not part of the
> > 5.10-rebase branch and mentions huge page handling. Could that be
> > related? Maybe this patch is not 5.15 specific?
> > 
> 
> It is not 5.15 specific. The same logic would apply to 5.10.

Thanks, Philippe! I will give it a try. Note that the problem came back
even with huge page support disabled: the first run was OK, but starting
a second run after 2h of idle triggered a kind of SIGXCPU storm.

> 
> > [1] https://source.denx.de/Xenomai/linux-dovetail/-/commit/b0a4f4c3891949d079aa3d65f2a56428a67e580c
> > 
> > > 
> > > Best regards,
> > > Florian
> 
> 
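
[Editor's sketch] The pattern described in this message (a clean first run, then a storm after a long idle period) suggests comparing kernel memory-management counters before and after the idle window. The sketch below is one hedged way to do that: it parses `/proc/vmstat` and reports how much a few page-migration counters moved between two samples. The counter names `pgmigrate_success`, `pgmigrate_fail` and `numa_pages_migrated` are standard vmstat fields, but whether they appear depends on the kernel configuration, so missing counters are treated as zero. The function names and the one-second interval are assumptions for illustration only.

```python
import time
from pathlib import Path

# Counters that indicate the kernel moved pages between physical locations.
MIGRATION_COUNTERS = ("pgmigrate_success", "pgmigrate_fail", "numa_pages_migrated")


def parse_vmstat(text):
    """Turn 'name value' lines into a dict of integers, skipping anything else."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def counter_deltas(before, after, names=MIGRATION_COUNTERS):
    """Difference per counter; a counter absent from a sample counts as 0."""
    return {name: after.get(name, 0) - before.get(name, 0) for name in names}


def sample(path="/proc/vmstat"):
    """Read one snapshot; return an empty dict if the file is unavailable."""
    try:
        return parse_vmstat(Path(path).read_text())
    except OSError:
        return {}


if __name__ == "__main__":
    first = sample()
    time.sleep(1.0)  # in practice: the idle period between two test runs
    print(counter_deltas(first, sample()))
```

Non-zero migration deltas across the idle period would be consistent with pages being moved underneath the real-time process, though they do not prove that migration caused a given fault.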



* Re: Dovetail 5.10 - Xenomai huge page support
  2022-01-20  9:15     ` Bezdeka, Florian
@ 2022-01-25 16:48       ` Jan Kiszka
  2022-01-25 17:43         ` Philippe Gerum
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kiszka @ 2022-01-25 16:48 UTC (permalink / raw)
  To: Bezdeka, Florian (T CED SES-DE), rpm@xenomai.org; +Cc: xenomai@xenomai.org

On 20.01.22 10:15, Bezdeka, Florian (T CED SES-DE) wrote:
> On Thu, 2022-01-20 at 09:32 +0100, Philippe Gerum wrote:
>> "Bezdeka, Florian" <florian.bezdeka@siemens.com> writes:
>>
>>> On Wed, 2022-01-19 at 13:35 +0000, Bezdeka, Florian via Xenomai wrote:
>>>> Hi all,
>>>>
>>>> after migrating from 4.19-ipipe to 5.10-dovetail internal users are
>>>> reporting that they receive sporadic SIGXCPUs. The reported reason is
>>>> "page_fault_user".
>>>>
>>>> After disabling the huge page support in 5.10 the issue vanished. Tests
>>>> are still running, but we are optimistic that we found the root cause.
>>>>
>>>> Are there any known limitations to the huge page support in Xenomai /
>>>> Dovetail? My understanding is that huge pages are supported by Xenomai,
>>>> so any ideas where to look at?
>>>
>>> [1] is part of the 5.15-rebased dovetail branch but not part of the
>>> 5.10-rebase branch and mentions huge page handling. Could that be
>>> related? Maybe this patch is not 5.15 specific?
>>>
>>
>> It is not 5.15 specific. The same logic would apply to 5.10.
> 
> Thanks Philippe! I will give it a try! Even with huge page support
> disabled the problem came back. The first run was OK, starting a second
> run after 2h of idle triggers kind of a SIGXCPUs storm.
> 

That patch [1] is not targeting 5.10 - that was at least the information 
you shared, Philippe, when I asked this question as well back then.

Meanwhile we understood that the mode changes originated from 
CONFIG_MIGRATION. We are now trying to understand if we can live without 
it in that large NUMA setup or if some more targeted per-process pinning 
would be desirable. Such pinning may also come with implicit unCOW in 
the end, let's see.

Jan

>>
>>> [1] https://source.denx.de/Xenomai/linux-dovetail/-/commit/b0a4f4c3891949d079aa3d65f2a56428a67e580c
>>>
>>>>
>>>> Best regards,
>>>> Florian
>>
>>
> 

-- 
Siemens AG, Technology
Competence Center Embedded Linux
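
[Editor's sketch] Since the message above attributes the mode changes to CONFIG_MIGRATION, a quick check of which related options the running kernel was built with can be useful when comparing the 4.19 and 5.10 setups. The sketch below parses kernel configuration text in the usual `CONFIG_X=y` / `# CONFIG_X is not set` format. It assumes the configuration is available either as `/proc/config.gz` (requires CONFIG_IKCONFIG_PROC) or as `/boot/config-<release>`, which is a distribution convention and may not hold on a given target. The option list and helper names are the editor's choice, not something the thread prescribes.

```python
import gzip
import os
from pathlib import Path

# Options chosen for illustration; only CONFIG_MIGRATION is named in the thread.
OPTIONS = (
    "CONFIG_MIGRATION",
    "CONFIG_TRANSPARENT_HUGEPAGE",
    "CONFIG_NUMA_BALANCING",
    "CONFIG_COMPACTION",
)

NOT_SET_SUFFIX = " is not set"


def parse_kconfig(text):
    """Map option names to their value; explicitly unset options map to 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            options[line[2:-len(NOT_SET_SUFFIX)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def load_running_config():
    """Return the running kernel's config text, or None if it cannot be found."""
    try:
        proc = Path("/proc/config.gz")
        if proc.exists():
            with gzip.open(proc, "rt") as handle:
                return handle.read()
        boot = Path("/boot/config-" + os.uname().release)
        if boot.exists():
            return boot.read_text()
    except OSError:
        pass
    return None


if __name__ == "__main__":
    text = load_running_config()
    if text is None:
        print("kernel config not available on this system")
    else:
        config = parse_kconfig(text)
        for name in OPTIONS:
            print(name, "=", config.get(name, "absent"))
```

An option reported as "absent" simply does not exist in that kernel's configuration (for example because its dependencies are not met), which is different from being explicitly unset.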



* Re: Dovetail 5.10 - Xenomai huge page support
  2022-01-25 16:48       ` Jan Kiszka
@ 2022-01-25 17:43         ` Philippe Gerum
  0 siblings, 0 replies; 7+ messages in thread
From: Philippe Gerum @ 2022-01-25 17:43 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Bezdeka, Florian (T CED SES-DE), xenomai@xenomai.org


Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 20.01.22 10:15, Bezdeka, Florian (T CED SES-DE) wrote:
>> On Thu, 2022-01-20 at 09:32 +0100, Philippe Gerum wrote:
>>> "Bezdeka, Florian" <florian.bezdeka@siemens.com> writes:
>>>
>>>> On Wed, 2022-01-19 at 13:35 +0000, Bezdeka, Florian via Xenomai wrote:
>>>>> Hi all,
>>>>>
>>>>> after migrating from 4.19-ipipe to 5.10-dovetail internal users are
>>>>> reporting that they receive sporadic SIGXCPUs. The reported reason is
>>>>> "page_fault_user".
>>>>>
>>>>> After disabling the huge page support in 5.10 the issue vanished. Tests
>>>>> are still running, but we are optimistic that we found the root cause.
>>>>>
>>>>> Are there any known limitations to the huge page support in Xenomai /
>>>>> Dovetail? My understanding is that huge pages are supported by Xenomai,
>>>>> so any ideas where to look at?
>>>>
>>>> [1] is part of the 5.15-rebased dovetail branch but not part of the
>>>> 5.10-rebase branch and mentions huge page handling. Could that be
>>>> related? Maybe this patch is not 5.15 specific?
>>>>
>>>
>>> It is not 5.15 specific. The same logic would apply to 5.10.
>> Thanks Philippe! I will give it a try! Even with huge page support
>> disabled the problem came back. The first run was OK, starting a second
>> run after 2h of idle triggers kind of a SIGXCPUs storm.
>> 
>
> That patch [1] is not targeting 5.10 - that was at least the
> information you shared, Philippe, when I asked this question as well
> back then.
>

The fact that the original bug was not in 5.10, as I mentioned, does not
preclude the change from having a wider impact. Please see the commit log:

commit 4dfc2bc1c47d2343cacd2a4ae01a3c3513d27d12 (HEAD -> rebase/v5.15.y-evl, linux-evl/v5.15.y-evl-rebase)
Author: Philippe Gerum <rpm@xenomai.org>
Date:   Sat Jan 8 16:30:24 2022 +0100

    dovetail: mm: fix logic of COW-disabling check

    ...
    
    As a result of this change, other callers of page_needs_cow_for_dma()
    are now influenced by the Dovetail pinning status of the mm,
    specifically the huge page management.
    
    Finally, since page_needs_cow_for_dma() does not exclusively apply to
    pinned memory for DMA anymore, rename it to page_needs_cow().
    
> Meanwhile we understood that the mode changes originated from
> CONFIG_MIGRATION. We are now trying to understand if we can live
> without it in that large NUMA setup or if some more targeted
> per-process pinning would be desirable. Such pinning may also come
> with implicit unCOW in the end, let's see.
>

-- 
Philippe.


