* [Xenomai-core] arm: Unprotected access to irq_desc field?
@ 2010-10-28 15:38 Jan Kiszka
2010-10-28 19:15 ` Gilles Chanteperdrix
2010-10-28 19:37 ` Gilles Chanteperdrix
0 siblings, 2 replies; 13+ messages in thread
From: Jan Kiszka @ 2010-10-28 15:38 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: Xenomai core
Gilles,
I happened to come across rthal_mark_irq_disabled/enabled on arm. On
first glance, it looks like these helpers manipulate irq_desc::status
non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-28 15:38 [Xenomai-core] arm: Unprotected access to irq_desc field? Jan Kiszka
@ 2010-10-28 19:15 ` Gilles Chanteperdrix
2010-10-28 19:34 ` Philippe Gerum
2010-10-28 19:37 ` Gilles Chanteperdrix
1 sibling, 1 reply; 13+ messages in thread
From: Gilles Chanteperdrix @ 2010-10-28 19:15 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai core
Jan Kiszka wrote:
> Gilles,
>
> I happened to come across rthal_mark_irq_disabled/enabled on arm. On
> first glance, it looks like these helpers manipulate irq_desc::status
> non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
I have no idea. How do the other architectures do? As far as I know,
this code has been copied from there.
--
Gilles.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-28 19:15 ` Gilles Chanteperdrix
@ 2010-10-28 19:34 ` Philippe Gerum
2010-10-29 7:00 ` Jan Kiszka
0 siblings, 1 reply; 13+ messages in thread
From: Philippe Gerum @ 2010-10-28 19:34 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: Jan Kiszka, Xenomai core
On Thu, 2010-10-28 at 21:15 +0200, Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
> > Gilles,
> >
> > I happened to come across rthal_mark_irq_disabled/enabled on arm. On
> > first glance, it looks like these helpers manipulate irq_desc::status
> > non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
>
> I have no idea. How do the other architectures do? As far as I know,
> this code has been copied from there.
Other archs do the same, simply because once an irq is managed by the
hal, it may not be shared in any way with the regular kernel. So locking
is pointless.
--
Philippe.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-28 15:38 [Xenomai-core] arm: Unprotected access to irq_desc field? Jan Kiszka
2010-10-28 19:15 ` Gilles Chanteperdrix
@ 2010-10-28 19:37 ` Gilles Chanteperdrix
1 sibling, 0 replies; 13+ messages in thread
From: Gilles Chanteperdrix @ 2010-10-28 19:37 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai core
Jan Kiszka wrote:
> Gilles,
>
> I happened to come across rthal_mark_irq_disabled/enabled on arm. On
> first glance, it looks like these helpers manipulate irq_desc::status
> non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
>From my point of view, locking anything would be overkill on ARM: irq
configurations are completely static as per the board, and so, ARMs can
use proper irq demuxing, instead of the "shared irqs" workaround. So, in
other word, I do not see why we would need any locking.
--
Gilles.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-28 19:34 ` Philippe Gerum
@ 2010-10-29 7:00 ` Jan Kiszka
2010-10-29 8:27 ` Philippe Gerum
0 siblings, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2010-10-29 7:00 UTC (permalink / raw)
To: Philippe Gerum; +Cc: Xenomai core
[-- Attachment #1: Type: text/plain, Size: 1136 bytes --]
Am 28.10.2010 21:34, Philippe Gerum wrote:
> On Thu, 2010-10-28 at 21:15 +0200, Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles,
>>>
>>> I happened to come across rthal_mark_irq_disabled/enabled on arm. On
>>> first glance, it looks like these helpers manipulate irq_desc::status
>>> non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
>>
>> I have no idea. How do the other architectures do? As far as I know,
>> this code has been copied from there.
>
> Other archs do the same, simply because once an irq is managed by the
> hal, it may not be shared in any way with the regular kernel. So locking
> is pointless.
Indeed, I missed that all the other archs have this uninlined in hal.c.
However, this leaves at least a race between xnintr_disable/enable and
XN_ISR_PROPAGATE (ie. the related Linux path) behind. Not sure if it
matters practically - but risking silent breakage for this micro
optimization? Is disabling/enabling really in the highly
latency-critical anywhere? Otherwise, I would suggest to just plug this
by adding the intended lock for this field.
Jan
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-29 7:00 ` Jan Kiszka
@ 2010-10-29 8:27 ` Philippe Gerum
2010-10-29 9:05 ` Jan Kiszka
0 siblings, 1 reply; 13+ messages in thread
From: Philippe Gerum @ 2010-10-29 8:27 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai core
On Fri, 2010-10-29 at 09:00 +0200, Jan Kiszka wrote:
> Am 28.10.2010 21:34, Philippe Gerum wrote:
> > On Thu, 2010-10-28 at 21:15 +0200, Gilles Chanteperdrix wrote:
> >> Jan Kiszka wrote:
> >>> Gilles,
> >>>
> >>> I happened to come across rthal_mark_irq_disabled/enabled on arm. On
> >>> first glance, it looks like these helpers manipulate irq_desc::status
> >>> non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
> >>
> >> I have no idea. How do the other architectures do? As far as I know,
> >> this code has been copied from there.
> >
> > Other archs do the same, simply because once an irq is managed by the
> > hal, it may not be shared in any way with the regular kernel. So locking
> > is pointless.
>
> Indeed, I missed that all the other archs have this uninlined in hal.c.
>
> However, this leaves at least a race between xnintr_disable/enable and
> XN_ISR_PROPAGATE (ie. the related Linux path) behind.
I can't see why XN_ISR_PROPAGATE would be involved here. This service
pends an interrupt in the pipeline log.
> Not sure if it
> matters practically - but risking silent breakage for this micro
> optimization?
It was not meant as an optimization; we may not grab the linux
descriptor lock in this context because we may enter it in primary mode.
> Is disabling/enabling really in the highly
> latency-critical anywhere? Otherwise, I would suggest to just plug this
> by adding the intended lock for this field.
The caller is expected to manage locking; AFAICS the only one who does
not is the RTAI skin, which is obsolete and removed in 2.6.x, so no big
deal.
>
> Jan
>
--
Philippe.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-29 8:27 ` Philippe Gerum
@ 2010-10-29 9:05 ` Jan Kiszka
2010-10-29 12:00 ` Philippe Gerum
0 siblings, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2010-10-29 9:05 UTC (permalink / raw)
To: Philippe Gerum; +Cc: Xenomai core
Am 29.10.2010 10:27, Philippe Gerum wrote:
> On Fri, 2010-10-29 at 09:00 +0200, Jan Kiszka wrote:
>> Am 28.10.2010 21:34, Philippe Gerum wrote:
>>> On Thu, 2010-10-28 at 21:15 +0200, Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Gilles,
>>>>>
>>>>> I happened to come across rthal_mark_irq_disabled/enabled on arm. On
>>>>> first glance, it looks like these helpers manipulate irq_desc::status
>>>>> non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
>>>>
>>>> I have no idea. How do the other architectures do? As far as I know,
>>>> this code has been copied from there.
>>>
>>> Other archs do the same, simply because once an irq is managed by the
>>> hal, it may not be shared in any way with the regular kernel. So locking
>>> is pointless.
>>
>> Indeed, I missed that all the other archs have this uninlined in hal.c.
>>
>> However, this leaves at least a race between xnintr_disable/enable and
>> XN_ISR_PROPAGATE (ie. the related Linux path) behind.
>
> I can't see why XN_ISR_PROPAGATE would be involved here. This service
> pends an interrupt in the pipeline log.
And this finally lets Linux code run that fiddles with irq_desc::status
as well - potentially in parallel to an unsyncrhonized
xnintr_irq_disable in a different context. That's the problem.
>
>> Not sure if it
>> matters practically - but risking silent breakage for this micro
>> optimization?
>
> It was not meant as an optimization; we may not grab the linux
> descriptor lock in this context because we may enter it in primary mode.
Oh, that lock isn't harden as I somehow assumed. This of course
complicates things.
>
>> Is disabling/enabling really in the highly
>> latency-critical anywhere? Otherwise, I would suggest to just plug this
>> by adding the intended lock for this field.
>
> The caller is expected to manage locking; AFAICS the only one who does
> not is the RTAI skin, which is obsolete and removed in 2.6.x, so no big
> deal.
The problem is that IRQ forwarding to Linux may let this manipulation
race with plain Linux code, thus has to synchronize with it. It is a
corner case (no one is supposed to pass IRQs down blindly anyway - if at
all), but it should at least be documented ("Don't use disable/enable
together with IRQ forwarding unless you acquire the descriptor lock
properly!").
BTW, do we need to track the descriptor state in primary mode at all?
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-29 9:05 ` Jan Kiszka
@ 2010-10-29 12:00 ` Philippe Gerum
2010-10-29 12:03 ` Philippe Gerum
2010-10-29 12:09 ` Jan Kiszka
0 siblings, 2 replies; 13+ messages in thread
From: Philippe Gerum @ 2010-10-29 12:00 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai core
On Fri, 2010-10-29 at 11:05 +0200, Jan Kiszka wrote:
> Am 29.10.2010 10:27, Philippe Gerum wrote:
> > On Fri, 2010-10-29 at 09:00 +0200, Jan Kiszka wrote:
> >> Am 28.10.2010 21:34, Philippe Gerum wrote:
> >>> On Thu, 2010-10-28 at 21:15 +0200, Gilles Chanteperdrix wrote:
> >>>> Jan Kiszka wrote:
> >>>>> Gilles,
> >>>>>
> >>>>> I happened to come across rthal_mark_irq_disabled/enabled on arm. On
> >>>>> first glance, it looks like these helpers manipulate irq_desc::status
> >>>>> non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
> >>>>
> >>>> I have no idea. How do the other architectures do? As far as I know,
> >>>> this code has been copied from there.
> >>>
> >>> Other archs do the same, simply because once an irq is managed by the
> >>> hal, it may not be shared in any way with the regular kernel. So locking
> >>> is pointless.
> >>
> >> Indeed, I missed that all the other archs have this uninlined in hal.c.
> >>
> >> However, this leaves at least a race between xnintr_disable/enable and
> >> XN_ISR_PROPAGATE (ie. the related Linux path) behind.
> >
> > I can't see why XN_ISR_PROPAGATE would be involved here. This service
> > pends an interrupt in the pipeline log.
>
> And this finally lets Linux code run that fiddles with irq_desc::status
> as well - potentially in parallel to an unsyncrhonized
> xnintr_irq_disable in a different context. That's the problem.
Propagation happens in primary domain. When is this supposed to conflict
on the same CPU with linux?
>
> >
> >> Not sure if it
> >> matters practically - but risking silent breakage for this micro
> >> optimization?
> >
> > It was not meant as an optimization; we may not grab the linux
> > descriptor lock in this context because we may enter it in primary mode.
>
> Oh, that lock isn't harden as I somehow assumed. This of course
> complicates things.
>
> >
> >> Is disabling/enabling really in the highly
> >> latency-critical anywhere? Otherwise, I would suggest to just plug this
> >> by adding the intended lock for this field.
> >
> > The caller is expected to manage locking; AFAICS the only one who does
> > not is the RTAI skin, which is obsolete and removed in 2.6.x, so no big
> > deal.
>
> The problem is that IRQ forwarding to Linux may let this manipulation
> race with plain Linux code, thus has to synchronize with it. It is a
> corner case (no one is supposed to pass IRQs down blindly anyway - if at
> all), but it should at least be documented ("Don't use disable/enable
> together with IRQ forwarding unless you acquire the descriptor lock
> properly!").
>
> BTW, do we need to track the descriptor state in primary mode at all?
>
That is the real issue. I don't see the point of doing this with the
current kernel code.
> Jan
>
--
Philippe.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-29 12:00 ` Philippe Gerum
@ 2010-10-29 12:03 ` Philippe Gerum
2010-10-29 12:09 ` Jan Kiszka
1 sibling, 0 replies; 13+ messages in thread
From: Philippe Gerum @ 2010-10-29 12:03 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai core
On Fri, 2010-10-29 at 14:00 +0200, Philippe Gerum wrote:
> On Fri, 2010-10-29 at 11:05 +0200, Jan Kiszka wrote:
> > Am 29.10.2010 10:27, Philippe Gerum wrote:
> > > On Fri, 2010-10-29 at 09:00 +0200, Jan Kiszka wrote:
> > >> Am 28.10.2010 21:34, Philippe Gerum wrote:
> > >>> On Thu, 2010-10-28 at 21:15 +0200, Gilles Chanteperdrix wrote:
> > >>>> Jan Kiszka wrote:
> > >>>>> Gilles,
> > >>>>>
> > >>>>> I happened to come across rthal_mark_irq_disabled/enabled on arm. On
> > >>>>> first glance, it looks like these helpers manipulate irq_desc::status
> > >>>>> non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
> > >>>>
> > >>>> I have no idea. How do the other architectures do? As far as I know,
> > >>>> this code has been copied from there.
> > >>>
> > >>> Other archs do the same, simply because once an irq is managed by the
> > >>> hal, it may not be shared in any way with the regular kernel. So locking
> > >>> is pointless.
> > >>
> > >> Indeed, I missed that all the other archs have this uninlined in hal.c.
> > >>
> > >> However, this leaves at least a race between xnintr_disable/enable and
> > >> XN_ISR_PROPAGATE (ie. the related Linux path) behind.
> > >
> > > I can't see why XN_ISR_PROPAGATE would be involved here. This service
> > > pends an interrupt in the pipeline log.
> >
> > And this finally lets Linux code run that fiddles with irq_desc::status
> > as well - potentially in parallel to an unsyncrhonized
> > xnintr_irq_disable in a different context. That's the problem.
>
> Propagation happens in primary domain. When is this supposed to conflict
> on the same CPU with linux?
>
> >
> > >
> > >> Not sure if it
> > >> matters practically - but risking silent breakage for this micro
> > >> optimization?
> > >
> > > It was not meant as an optimization; we may not grab the linux
> > > descriptor lock in this context because we may enter it in primary mode.
> >
> > Oh, that lock isn't harden as I somehow assumed. This of course
> > complicates things.
> >
> > >
> > >> Is disabling/enabling really in the highly
> > >> latency-critical anywhere? Otherwise, I would suggest to just plug this
> > >> by adding the intended lock for this field.
> > >
> > > The caller is expected to manage locking; AFAICS the only one who does
> > > not is the RTAI skin, which is obsolete and removed in 2.6.x, so no big
> > > deal.
> >
> > The problem is that IRQ forwarding to Linux may let this manipulation
> > race with plain Linux code, thus has to synchronize with it. It is a
> > corner case (no one is supposed to pass IRQs down blindly anyway - if at
> > all), but it should at least be documented ("Don't use disable/enable
> > together with IRQ forwarding unless you acquire the descriptor lock
> > properly!").
> >
> > BTW, do we need to track the descriptor state in primary mode at all?
> >
>
> That is the real issue. I don't see the point of doing this with the
> current kernel code.
It must have been related to the irqchip::unmask handlers on some archs
which do test this, but reading the code now, I can't figure out the
upside of flipping this state actually.
>
> > Jan
> >
>
--
Philippe.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-29 12:00 ` Philippe Gerum
2010-10-29 12:03 ` Philippe Gerum
@ 2010-10-29 12:09 ` Jan Kiszka
2010-10-29 12:46 ` Philippe Gerum
1 sibling, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2010-10-29 12:09 UTC (permalink / raw)
To: Philippe Gerum; +Cc: Xenomai core
Am 29.10.2010 14:00, Philippe Gerum wrote:
> On Fri, 2010-10-29 at 11:05 +0200, Jan Kiszka wrote:
>> Am 29.10.2010 10:27, Philippe Gerum wrote:
>>> On Fri, 2010-10-29 at 09:00 +0200, Jan Kiszka wrote:
>>>> Am 28.10.2010 21:34, Philippe Gerum wrote:
>>>>> On Thu, 2010-10-28 at 21:15 +0200, Gilles Chanteperdrix wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>> Gilles,
>>>>>>>
>>>>>>> I happened to come across rthal_mark_irq_disabled/enabled on arm. On
>>>>>>> first glance, it looks like these helpers manipulate irq_desc::status
>>>>>>> non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
>>>>>>
>>>>>> I have no idea. How do the other architectures do? As far as I know,
>>>>>> this code has been copied from there.
>>>>>
>>>>> Other archs do the same, simply because once an irq is managed by the
>>>>> hal, it may not be shared in any way with the regular kernel. So locking
>>>>> is pointless.
>>>>
>>>> Indeed, I missed that all the other archs have this uninlined in hal.c.
>>>>
>>>> However, this leaves at least a race between xnintr_disable/enable and
>>>> XN_ISR_PROPAGATE (ie. the related Linux path) behind.
>>>
>>> I can't see why XN_ISR_PROPAGATE would be involved here. This service
>>> pends an interrupt in the pipeline log.
>>
>> And this finally lets Linux code run that fiddles with irq_desc::status
>> as well - potentially in parallel to an unsyncrhonized
>> xnintr_irq_disable in a different context. That's the problem.
>
> Propagation happens in primary domain. When is this supposed to conflict
> on the same CPU with linux?
The propagation triggers the delivery of this IRQ to the Linux domain,
thus at some point there will be Linux accessing the descriptor while
there might be xnintr_irq_enable/disable running on some other CPU (or
it was preempted at the wrong point on the very same CPU).
>
>>
>>>
>>>> Not sure if it
>>>> matters practically - but risking silent breakage for this micro
>>>> optimization?
>>>
>>> It was not meant as an optimization; we may not grab the linux
>>> descriptor lock in this context because we may enter it in primary mode.
>>
>> Oh, that lock isn't harden as I somehow assumed. This of course
>> complicates things.
>>
>>>
>>>> Is disabling/enabling really in the highly
>>>> latency-critical anywhere? Otherwise, I would suggest to just plug this
>>>> by adding the intended lock for this field.
>>>
>>> The caller is expected to manage locking; AFAICS the only one who does
>>> not is the RTAI skin, which is obsolete and removed in 2.6.x, so no big
>>> deal.
>>
>> The problem is that IRQ forwarding to Linux may let this manipulation
>> race with plain Linux code, thus has to synchronize with it. It is a
>> corner case (no one is supposed to pass IRQs down blindly anyway - if at
>> all), but it should at least be documented ("Don't use disable/enable
>> together with IRQ forwarding unless you acquire the descriptor lock
>> properly!").
>>
>> BTW, do we need to track the descriptor state in primary mode at all?
>>
>
> That is the real issue. I don't see the point of doing this with the
> current kernel code.
Do we need to keep the status in synch with the hardware state for the
case Linux may take over the descriptor again? Or will Linux test the
state when processing a forwarded IRQ? These are the two potential
scenarios that come to my mind. For former could be deferred, but the
latter would be critical again.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-29 12:09 ` Jan Kiszka
@ 2010-10-29 12:46 ` Philippe Gerum
2010-10-29 12:58 ` Philippe Gerum
0 siblings, 1 reply; 13+ messages in thread
From: Philippe Gerum @ 2010-10-29 12:46 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai core
On Fri, 2010-10-29 at 14:09 +0200, Jan Kiszka wrote:
> Am 29.10.2010 14:00, Philippe Gerum wrote:
> > On Fri, 2010-10-29 at 11:05 +0200, Jan Kiszka wrote:
> >> Am 29.10.2010 10:27, Philippe Gerum wrote:
> >>> On Fri, 2010-10-29 at 09:00 +0200, Jan Kiszka wrote:
> >>>> Am 28.10.2010 21:34, Philippe Gerum wrote:
> >>>>> On Thu, 2010-10-28 at 21:15 +0200, Gilles Chanteperdrix wrote:
> >>>>>> Jan Kiszka wrote:
> >>>>>>> Gilles,
> >>>>>>>
> >>>>>>> I happened to come across rthal_mark_irq_disabled/enabled on arm. On
> >>>>>>> first glance, it looks like these helpers manipulate irq_desc::status
> >>>>>>> non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
> >>>>>>
> >>>>>> I have no idea. How do the other architectures do? As far as I know,
> >>>>>> this code has been copied from there.
> >>>>>
> >>>>> Other archs do the same, simply because once an irq is managed by the
> >>>>> hal, it may not be shared in any way with the regular kernel. So locking
> >>>>> is pointless.
> >>>>
> >>>> Indeed, I missed that all the other archs have this uninlined in hal.c.
> >>>>
> >>>> However, this leaves at least a race between xnintr_disable/enable and
> >>>> XN_ISR_PROPAGATE (ie. the related Linux path) behind.
> >>>
> >>> I can't see why XN_ISR_PROPAGATE would be involved here. This service
> >>> pends an interrupt in the pipeline log.
> >>
> >> And this finally lets Linux code run that fiddles with irq_desc::status
> >> as well - potentially in parallel to an unsyncrhonized
> >> xnintr_irq_disable in a different context. That's the problem.
> >
> > Propagation happens in primary domain. When is this supposed to conflict
> > on the same CPU with linux?
>
> The propagation triggers the delivery of this IRQ to the Linux domain,
> thus at some point there will be Linux accessing the descriptor while
> there might be xnintr_irq_enable/disable running on some other CPU (or
> it was preempted at the wrong point on the very same CPU).
The point is that XN_ISR_PROPAGATE, as a mean to force sharing of an IRQ
between both domains is plain wrong. Remove this, and no conflict
remains; this is what needs to be addressed. The potential issue between
xnintr_enable/disable and the hal routines does not exist, if those
callers handle locking properly.
>
> >
> >>
> >>>
> >>>> Not sure if it
> >>>> matters practically - but risking silent breakage for this micro
> >>>> optimization?
> >>>
> >>> It was not meant as an optimization; we may not grab the linux
> >>> descriptor lock in this context because we may enter it in primary mode.
> >>
> >> Oh, that lock isn't harden as I somehow assumed. This of course
> >> complicates things.
> >>
> >>>
> >>>> Is disabling/enabling really in the highly
> >>>> latency-critical anywhere? Otherwise, I would suggest to just plug this
> >>>> by adding the intended lock for this field.
> >>>
> >>> The caller is expected to manage locking; AFAICS the only one who does
> >>> not is the RTAI skin, which is obsolete and removed in 2.6.x, so no big
> >>> deal.
> >>
> >> The problem is that IRQ forwarding to Linux may let this manipulation
> >> race with plain Linux code, thus has to synchronize with it. It is a
> >> corner case (no one is supposed to pass IRQs down blindly anyway - if at
> >> all), but it should at least be documented ("Don't use disable/enable
> >> together with IRQ forwarding unless you acquire the descriptor lock
> >> properly!").
> >>
> >> BTW, do we need to track the descriptor state in primary mode at all?
> >>
> >
> > That is the real issue. I don't see the point of doing this with the
> > current kernel code.
>
> Do we need to keep the status in synch with the hardware state for the
> case Linux may take over the descriptor again? Or will Linux test the
> state when processing a forwarded IRQ? These are the two potential
> scenarios that come to my mind. For former could be deferred, but the
> latter would be critical again.
>
> Jan
>
--
Philippe.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-29 12:46 ` Philippe Gerum
@ 2010-10-29 12:58 ` Philippe Gerum
2010-10-29 13:54 ` Jan Kiszka
0 siblings, 1 reply; 13+ messages in thread
From: Philippe Gerum @ 2010-10-29 12:58 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai core
On Fri, 2010-10-29 at 14:46 +0200, Philippe Gerum wrote:
> On Fri, 2010-10-29 at 14:09 +0200, Jan Kiszka wrote:
> > Am 29.10.2010 14:00, Philippe Gerum wrote:
> > > On Fri, 2010-10-29 at 11:05 +0200, Jan Kiszka wrote:
> > >> Am 29.10.2010 10:27, Philippe Gerum wrote:
> > >>> On Fri, 2010-10-29 at 09:00 +0200, Jan Kiszka wrote:
> > >>>> Am 28.10.2010 21:34, Philippe Gerum wrote:
> > >>>>> On Thu, 2010-10-28 at 21:15 +0200, Gilles Chanteperdrix wrote:
> > >>>>>> Jan Kiszka wrote:
> > >>>>>>> Gilles,
> > >>>>>>>
> > >>>>>>> I happened to come across rthal_mark_irq_disabled/enabled on arm. On
> > >>>>>>> first glance, it looks like these helpers manipulate irq_desc::status
> > >>>>>>> non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
> > >>>>>>
> > >>>>>> I have no idea. How do the other architectures do? As far as I know,
> > >>>>>> this code has been copied from there.
> > >>>>>
> > >>>>> Other archs do the same, simply because once an irq is managed by the
> > >>>>> hal, it may not be shared in any way with the regular kernel. So locking
> > >>>>> is pointless.
> > >>>>
> > >>>> Indeed, I missed that all the other archs have this uninlined in hal.c.
> > >>>>
> > >>>> However, this leaves at least a race between xnintr_disable/enable and
> > >>>> XN_ISR_PROPAGATE (ie. the related Linux path) behind.
> > >>>
> > >>> I can't see why XN_ISR_PROPAGATE would be involved here. This service
> > >>> pends an interrupt in the pipeline log.
> > >>
> > >> And this finally lets Linux code run that fiddles with irq_desc::status
> > >> as well - potentially in parallel to an unsyncrhonized
> > >> xnintr_irq_disable in a different context. That's the problem.
> > >
> > > Propagation happens in primary domain. When is this supposed to conflict
> > > on the same CPU with linux?
> >
> > The propagation triggers the delivery of this IRQ to the Linux domain,
> > thus at some point there will be Linux accessing the descriptor while
> > there might be xnintr_irq_enable/disable running on some other CPU (or
> > it was preempted at the wrong point on the very same CPU).
>
> The point is that XN_ISR_PROPAGATE, as a mean to force sharing of an IRQ
> between both domains is plain wrong. Remove this, and no conflict
> remains; this is what needs to be addressed. The potential issue between
> xnintr_enable/disable and the hal routines does not exist, if those
> callers handle locking properly.
>
In any case, I don't think we could accept that sharing, so flipping the
bits in the hal is in fact pointless. To match the linux locking, we
should iron the irq_desc::lock, which we won't since this would cause
massive jittery. We should stick to the basic logic: no sharing,
therefore no tracking need for the irqflags. I'll kill XN_ISR_PROPAGATE
in forge at some point, for sure.
> >
> > >
> > >>
> > >>>
> > >>>> Not sure if it
> > >>>> matters practically - but risking silent breakage for this micro
> > >>>> optimization?
> > >>>
> > >>> It was not meant as an optimization; we may not grab the linux
> > >>> descriptor lock in this context because we may enter it in primary mode.
> > >>
> > >> Oh, that lock isn't harden as I somehow assumed. This of course
> > >> complicates things.
> > >>
> > >>>
> > >>>> Is disabling/enabling really in the highly
> > >>>> latency-critical anywhere? Otherwise, I would suggest to just plug this
> > >>>> by adding the intended lock for this field.
> > >>>
> > >>> The caller is expected to manage locking; AFAICS the only one who does
> > >>> not is the RTAI skin, which is obsolete and removed in 2.6.x, so no big
> > >>> deal.
> > >>
> > >> The problem is that IRQ forwarding to Linux may let this manipulation
> > >> race with plain Linux code, thus has to synchronize with it. It is a
> > >> corner case (no one is supposed to pass IRQs down blindly anyway - if at
> > >> all), but it should at least be documented ("Don't use disable/enable
> > >> together with IRQ forwarding unless you acquire the descriptor lock
> > >> properly!").
> > >>
> > >> BTW, do we need to track the descriptor state in primary mode at all?
> > >>
> > >
> > > That is the real issue. I don't see the point of doing this with the
> > > current kernel code.
> >
> > Do we need to keep the status in synch with the hardware state for the
> > case Linux may take over the descriptor again? Or will Linux test the
> > state when processing a forwarded IRQ? These are the two potential
> > scenarios that come to my mind. For former could be deferred, but the
> > latter would be critical again.
> >
> > Jan
> >
>
> --
> Philippe.
>
>
>
> _______________________________________________
> Xenomai-core mailing list
> Xenomai-core@domain.hid
> https://mail.gna.org/listinfo/xenomai-core
--
Philippe.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Xenomai-core] arm: Unprotected access to irq_desc field?
2010-10-29 12:58 ` Philippe Gerum
@ 2010-10-29 13:54 ` Jan Kiszka
0 siblings, 0 replies; 13+ messages in thread
From: Jan Kiszka @ 2010-10-29 13:54 UTC (permalink / raw)
To: Philippe Gerum; +Cc: Xenomai core
Am 29.10.2010 14:58, Philippe Gerum wrote:
> On Fri, 2010-10-29 at 14:46 +0200, Philippe Gerum wrote:
>> On Fri, 2010-10-29 at 14:09 +0200, Jan Kiszka wrote:
>>> Am 29.10.2010 14:00, Philippe Gerum wrote:
>>>> On Fri, 2010-10-29 at 11:05 +0200, Jan Kiszka wrote:
>>>>> Am 29.10.2010 10:27, Philippe Gerum wrote:
>>>>>> On Fri, 2010-10-29 at 09:00 +0200, Jan Kiszka wrote:
>>>>>>> Am 28.10.2010 21:34, Philippe Gerum wrote:
>>>>>>>> On Thu, 2010-10-28 at 21:15 +0200, Gilles Chanteperdrix wrote:
>>>>>>>>> Jan Kiszka wrote:
>>>>>>>>>> Gilles,
>>>>>>>>>>
>>>>>>>>>> I happened to come across rthal_mark_irq_disabled/enabled on arm. On
>>>>>>>>>> first glance, it looks like these helpers manipulate irq_desc::status
>>>>>>>>>> non-atomically, i.e. without holding irq_desc::lock. Isn't this fragile?
>>>>>>>>>
>>>>>>>>> I have no idea. How do the other architectures do? As far as I know,
>>>>>>>>> this code has been copied from there.
>>>>>>>>
>>>>>>>> Other archs do the same, simply because once an irq is managed by the
>>>>>>>> hal, it may not be shared in any way with the regular kernel. So locking
>>>>>>>> is pointless.
>>>>>>>
>>>>>>> Indeed, I missed that all the other archs have this uninlined in hal.c.
>>>>>>>
>>>>>>> However, this leaves at least a race between xnintr_disable/enable and
>>>>>>> XN_ISR_PROPAGATE (ie. the related Linux path) behind.
>>>>>>
>>>>>> I can't see why XN_ISR_PROPAGATE would be involved here. This service
>>>>>> pends an interrupt in the pipeline log.
>>>>>
>>>>> And this finally lets Linux code run that fiddles with irq_desc::status
>>>>> as well - potentially in parallel to an unsyncrhonized
>>>>> xnintr_irq_disable in a different context. That's the problem.
>>>>
>>>> Propagation happens in primary domain. When is this supposed to conflict
>>>> on the same CPU with linux?
>>>
>>> The propagation triggers the delivery of this IRQ to the Linux domain,
>>> thus at some point there will be Linux accessing the descriptor while
>>> there might be xnintr_irq_enable/disable running on some other CPU (or
>>> it was preempted at the wrong point on the very same CPU).
>>
>> The point is that XN_ISR_PROPAGATE, as a mean to force sharing of an IRQ
>> between both domains is plain wrong. Remove this, and no conflict
>> remains; this is what needs to be addressed. The potential issue between
>> xnintr_enable/disable and the hal routines does not exist, if those
>> callers handle locking properly.
>>
>
> In any case, I don't think we could accept that sharing, so flipping the
> bits in the hal is in fact pointless. To match the linux locking, we
> should iron the irq_desc::lock, which we won't since this would cause
> massive jittery. We should stick to the basic logic: no sharing,
> therefore no tracking need for the irqflags. I'll kill XN_ISR_PROPAGATE
> in forge at some point, for sure.
>
Sounds good.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2010-10-29 13:54 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-10-28 15:38 [Xenomai-core] arm: Unprotected access to irq_desc field? Jan Kiszka
2010-10-28 19:15 ` Gilles Chanteperdrix
2010-10-28 19:34 ` Philippe Gerum
2010-10-29 7:00 ` Jan Kiszka
2010-10-29 8:27 ` Philippe Gerum
2010-10-29 9:05 ` Jan Kiszka
2010-10-29 12:00 ` Philippe Gerum
2010-10-29 12:03 ` Philippe Gerum
2010-10-29 12:09 ` Jan Kiszka
2010-10-29 12:46 ` Philippe Gerum
2010-10-29 12:58 ` Philippe Gerum
2010-10-29 13:54 ` Jan Kiszka
2010-10-28 19:37 ` Gilles Chanteperdrix
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.