* Re: [Xenomai] [Xenomai-git] Jan Kiszka : cobalt/rtdm: Fix driver reference counting
[not found] <E1b6BNz-0003SF-Fa@sd-51317.xenomai.org>
@ 2016-05-27 6:58 ` Gilles Chanteperdrix
2016-05-27 8:24 ` Jan Kiszka
0 siblings, 1 reply; 6+ messages in thread
From: Gilles Chanteperdrix @ 2016-05-27 6:58 UTC (permalink / raw)
To: xenomai
On Fri, May 27, 2016 at 08:36:43AM +0200, git repository hosting wrote:
> Module: xenomai-jki
> Branch: for-forge
> Commit: c9d83776c0ed882c71045dc32b340b57f88c5e00
> URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=c9d83776c0ed882c71045dc32b340b57f88c5e00
>
> Author: Jan Kiszka <jan.kiszka@siemens.com>
> Date: Fri May 27 08:32:41 2016 +0200
>
> cobalt/rtdm: Fix driver reference counting
>
> The rtdm smokey test triggered a BUG due to rtdm_dev_unregister not
> taking the reference counter of a driver into account. Fix this by
> moving the check into unregister_driver directly.
Did you have a look at commit
96e85548a56c8c7fbd6d64c079701483a8e5da27 ?
This looks like a revert, so since the commit was fixing something,
I believe you are reintroducting a bug.
--
Gilles.
https://click-hack.org
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Xenomai] [Xenomai-git] Jan Kiszka : cobalt/rtdm: Fix driver reference counting
2016-05-27 6:58 ` [Xenomai] [Xenomai-git] Jan Kiszka : cobalt/rtdm: Fix driver reference counting Gilles Chanteperdrix
@ 2016-05-27 8:24 ` Jan Kiszka
2016-05-27 8:33 ` Gilles Chanteperdrix
0 siblings, 1 reply; 6+ messages in thread
From: Jan Kiszka @ 2016-05-27 8:24 UTC (permalink / raw)
To: Gilles Chanteperdrix, xenomai
On 2016-05-27 08:58, Gilles Chanteperdrix wrote:
> On Fri, May 27, 2016 at 08:36:43AM +0200, git repository hosting wrote:
>> Module: xenomai-jki
>> Branch: for-forge
>> Commit: c9d83776c0ed882c71045dc32b340b57f88c5e00
>> URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=c9d83776c0ed882c71045dc32b340b57f88c5e00
>>
>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>> Date: Fri May 27 08:32:41 2016 +0200
>>
>> cobalt/rtdm: Fix driver reference counting
>>
>> The rtdm smokey test triggered a BUG due to rtdm_dev_unregister not
>> taking the reference counter of a driver into account. Fix this by
>> moving the check into unregister_driver directly.
>
> Did you have a look at commit
> 96e85548a56c8c7fbd6d64c079701483a8e5da27 ?
> This looks like a revert, so since the commit was fixing something,
> I believe you are reintroducting a bug.
>
Didn't see that. However, that commit was wrong because you were mixing
up different reference counters. One is that for devices, which is
decreased in __rtdm_put_device. The other is what un/register_driver
have to handle: that of the corresponding rtdm_driver. A device might
pass earlier than a driver because the latter may manage multiple
devices - exactly what the unit test checks.
Maybe you can describe what scenarios was triggering the issue back, and
we can check if it reoccurred and fix it for good.
Jan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: <http://xenomai.org/pipermail/xenomai/attachments/20160527/f8710e97/attachment.sig>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Xenomai] [Xenomai-git] Jan Kiszka : cobalt/rtdm: Fix driver reference counting
2016-05-27 8:24 ` Jan Kiszka
@ 2016-05-27 8:33 ` Gilles Chanteperdrix
2016-05-27 9:10 ` Jan Kiszka
0 siblings, 1 reply; 6+ messages in thread
From: Gilles Chanteperdrix @ 2016-05-27 8:33 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai
On Fri, May 27, 2016 at 10:24:21AM +0200, Jan Kiszka wrote:
> On 2016-05-27 08:58, Gilles Chanteperdrix wrote:
> > On Fri, May 27, 2016 at 08:36:43AM +0200, git repository hosting wrote:
> >> Module: xenomai-jki
> >> Branch: for-forge
> >> Commit: c9d83776c0ed882c71045dc32b340b57f88c5e00
> >> URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=c9d83776c0ed882c71045dc32b340b57f88c5e00
> >>
> >> Author: Jan Kiszka <jan.kiszka@siemens.com>
> >> Date: Fri May 27 08:32:41 2016 +0200
> >>
> >> cobalt/rtdm: Fix driver reference counting
> >>
> >> The rtdm smokey test triggered a BUG due to rtdm_dev_unregister not
> >> taking the reference counter of a driver into account. Fix this by
> >> moving the check into unregister_driver directly.
> >
> > Did you have a look at commit
> > 96e85548a56c8c7fbd6d64c079701483a8e5da27 ?
> > This looks like a revert, so since the commit was fixing something,
> > I believe you are reintroducting a bug.
> >
>
> Didn't see that. However, that commit was wrong because you were mixing
> up different reference counters. One is that for devices, which is
> decreased in __rtdm_put_device. The other is what un/register_driver
> have to handle: that of the corresponding rtdm_driver. A device might
> pass earlier than a driver because the latter may manage multiple
> devices - exactly what the unit test checks.
>
> Maybe you can describe what scenarios was triggering the issue back, and
> we can check if it reoccurred and fix it for good.
Well, that was pretty simple, removing kernel modules registering
devices (in my case it was rtnet.ko), would fail to unregister some
part of the driver (some proc or sys files if I remember correctly),
so that reinserting the driver would first cause some warning, and
after several rmmod/insmod result in a crash or a simple failure I
do not remember. I traced that to the fact that the test for the
reference counter in unregister_driver was failing because the
reference counter had already been decremented elsewhere. This was a
long time ago, I do not remember all the details, but I think it is
something like that.
--
Gilles.
https://click-hack.org
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Xenomai] [Xenomai-git] Jan Kiszka : cobalt/rtdm: Fix driver reference counting
2016-05-27 8:33 ` Gilles Chanteperdrix
@ 2016-05-27 9:10 ` Jan Kiszka
2016-05-27 9:39 ` Jan Kiszka
0 siblings, 1 reply; 6+ messages in thread
From: Jan Kiszka @ 2016-05-27 9:10 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai
On 2016-05-27 10:33, Gilles Chanteperdrix wrote:
> On Fri, May 27, 2016 at 10:24:21AM +0200, Jan Kiszka wrote:
>> On 2016-05-27 08:58, Gilles Chanteperdrix wrote:
>>> On Fri, May 27, 2016 at 08:36:43AM +0200, git repository hosting wrote:
>>>> Module: xenomai-jki
>>>> Branch: for-forge
>>>> Commit: c9d83776c0ed882c71045dc32b340b57f88c5e00
>>>> URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=c9d83776c0ed882c71045dc32b340b57f88c5e00
>>>>
>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>> Date: Fri May 27 08:32:41 2016 +0200
>>>>
>>>> cobalt/rtdm: Fix driver reference counting
>>>>
>>>> The rtdm smokey test triggered a BUG due to rtdm_dev_unregister not
>>>> taking the reference counter of a driver into account. Fix this by
>>>> moving the check into unregister_driver directly.
>>>
>>> Did you have a look at commit
>>> 96e85548a56c8c7fbd6d64c079701483a8e5da27 ?
>>> This looks like a revert, so since the commit was fixing something,
>>> I believe you are reintroducting a bug.
>>>
>>
>> Didn't see that. However, that commit was wrong because you were mixing
>> up different reference counters. One is that for devices, which is
>> decreased in __rtdm_put_device. The other is what un/register_driver
>> have to handle: that of the corresponding rtdm_driver. A device might
>> pass earlier than a driver because the latter may manage multiple
>> devices - exactly what the unit test checks.
>>
>> Maybe you can describe what scenarios was triggering the issue back, and
>> we can check if it reoccurred and fix it for good.
>
> Well, that was pretty simple, removing kernel modules registering
> devices (in my case it was rtnet.ko), would fail to unregister some
> part of the driver (some proc or sys files if I remember correctly),
> so that reinserting the driver would first cause some warning, and
> after several rmmod/insmod result in a crash or a simple failure I
> do not remember. I traced that to the fact that the test for the
> reference counter in unregister_driver was failing because the
> reference counter had already been decremented elsewhere. This was a
> long time ago, I do not remember all the details, but I think it is
> something like that.
>
I can remove and reload a stack of rtnet, rt_e1000 and rtipv4 multiple
times without any bug reports. /proc/rtnet also properly disappears and
reappears. However, unloading and loading rtpacket causes an oops.
BUG: unable to handle kernel paging request at ffffffffa01b5b20
IP: [<ffffffff81079d20>] blocking_notifier_chain_register+0x40/0xb0
...
Call Trace:
[<ffffffff8114f6b8>] cobalt_add_state_chain+0x18/0x20
[<ffffffff8117a403>] rtdm_dev_register+0x1d3/0x660
[<ffffffff8110e84c>] ? ipipe_unstall_root+0x5c/0x90
[<ffffffff810003b0>] ? do_one_initcall+0x80/0x1f0
[<ffffffffa01d5000>] ? 0xffffffffa01d5000
[<ffffffffa01d5015>] rt_packet_proto_init+0x15/0x48 [rtpacket]
[<ffffffffa01d5000>] ? 0xffffffffa01d5000
[<ffffffff810003c0>] do_one_initcall+0x90/0x1f0
Let me check that.
Jan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: <http://xenomai.org/pipermail/xenomai/attachments/20160527/e7bff3ed/attachment.sig>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Xenomai] [Xenomai-git] Jan Kiszka : cobalt/rtdm: Fix driver reference counting
2016-05-27 9:10 ` Jan Kiszka
@ 2016-05-27 9:39 ` Jan Kiszka
2016-05-27 10:50 ` Gilles Chanteperdrix
0 siblings, 1 reply; 6+ messages in thread
From: Jan Kiszka @ 2016-05-27 9:39 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai
On 2016-05-27 11:10, Jan Kiszka wrote:
> On 2016-05-27 10:33, Gilles Chanteperdrix wrote:
>> On Fri, May 27, 2016 at 10:24:21AM +0200, Jan Kiszka wrote:
>>> On 2016-05-27 08:58, Gilles Chanteperdrix wrote:
>>>> On Fri, May 27, 2016 at 08:36:43AM +0200, git repository hosting wrote:
>>>>> Module: xenomai-jki
>>>>> Branch: for-forge
>>>>> Commit: c9d83776c0ed882c71045dc32b340b57f88c5e00
>>>>> URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=c9d83776c0ed882c71045dc32b340b57f88c5e00
>>>>>
>>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>>> Date: Fri May 27 08:32:41 2016 +0200
>>>>>
>>>>> cobalt/rtdm: Fix driver reference counting
>>>>>
>>>>> The rtdm smokey test triggered a BUG due to rtdm_dev_unregister not
>>>>> taking the reference counter of a driver into account. Fix this by
>>>>> moving the check into unregister_driver directly.
>>>>
>>>> Did you have a look at commit
>>>> 96e85548a56c8c7fbd6d64c079701483a8e5da27 ?
>>>> This looks like a revert, so since the commit was fixing something,
>>>> I believe you are reintroducting a bug.
>>>>
>>>
>>> Didn't see that. However, that commit was wrong because you were mixing
>>> up different reference counters. One is that for devices, which is
>>> decreased in __rtdm_put_device. The other is what un/register_driver
>>> have to handle: that of the corresponding rtdm_driver. A device might
>>> pass earlier than a driver because the latter may manage multiple
>>> devices - exactly what the unit test checks.
>>>
>>> Maybe you can describe what scenarios was triggering the issue back, and
>>> we can check if it reoccurred and fix it for good.
>>
>> Well, that was pretty simple, removing kernel modules registering
>> devices (in my case it was rtnet.ko), would fail to unregister some
>> part of the driver (some proc or sys files if I remember correctly),
>> so that reinserting the driver would first cause some warning, and
>> after several rmmod/insmod result in a crash or a simple failure I
>> do not remember. I traced that to the fact that the test for the
>> reference counter in unregister_driver was failing because the
>> reference counter had already been decremented elsewhere. This was a
>> long time ago, I do not remember all the details, but I think it is
>> something like that.
>>
>
> I can remove and reload a stack of rtnet, rt_e1000 and rtipv4 multiple
> times without any bug reports. /proc/rtnet also properly disappears and
> reappears. However, unloading and loading rtpacket causes an oops.
>
> BUG: unable to handle kernel paging request at ffffffffa01b5b20
> IP: [<ffffffff81079d20>] blocking_notifier_chain_register+0x40/0xb0
> ...
> Call Trace:
> [<ffffffff8114f6b8>] cobalt_add_state_chain+0x18/0x20
> [<ffffffff8117a403>] rtdm_dev_register+0x1d3/0x660
> [<ffffffff8110e84c>] ? ipipe_unstall_root+0x5c/0x90
> [<ffffffff810003b0>] ? do_one_initcall+0x80/0x1f0
> [<ffffffffa01d5000>] ? 0xffffffffa01d5000
> [<ffffffffa01d5015>] rt_packet_proto_init+0x15/0x48 [rtpacket]
> [<ffffffffa01d5000>] ? 0xffffffffa01d5000
> [<ffffffff810003c0>] do_one_initcall+0x90/0x1f0
>
> Let me check that.
>
Pushed the proper fix for that. Was only affecting protocol drivers,
thus the exposure via rtnet.
Jan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: <http://xenomai.org/pipermail/xenomai/attachments/20160527/03841783/attachment.sig>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Xenomai] [Xenomai-git] Jan Kiszka : cobalt/rtdm: Fix driver reference counting
2016-05-27 9:39 ` Jan Kiszka
@ 2016-05-27 10:50 ` Gilles Chanteperdrix
0 siblings, 0 replies; 6+ messages in thread
From: Gilles Chanteperdrix @ 2016-05-27 10:50 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai
On Fri, May 27, 2016 at 11:39:07AM +0200, Jan Kiszka wrote:
> On 2016-05-27 11:10, Jan Kiszka wrote:
> > On 2016-05-27 10:33, Gilles Chanteperdrix wrote:
> >> On Fri, May 27, 2016 at 10:24:21AM +0200, Jan Kiszka wrote:
> >>> On 2016-05-27 08:58, Gilles Chanteperdrix wrote:
> >>>> On Fri, May 27, 2016 at 08:36:43AM +0200, git repository hosting wrote:
> >>>>> Module: xenomai-jki
> >>>>> Branch: for-forge
> >>>>> Commit: c9d83776c0ed882c71045dc32b340b57f88c5e00
> >>>>> URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=c9d83776c0ed882c71045dc32b340b57f88c5e00
> >>>>>
> >>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>> Date: Fri May 27 08:32:41 2016 +0200
> >>>>>
> >>>>> cobalt/rtdm: Fix driver reference counting
> >>>>>
> >>>>> The rtdm smokey test triggered a BUG due to rtdm_dev_unregister not
> >>>>> taking the reference counter of a driver into account. Fix this by
> >>>>> moving the check into unregister_driver directly.
> >>>>
> >>>> Did you have a look at commit
> >>>> 96e85548a56c8c7fbd6d64c079701483a8e5da27 ?
> >>>> This looks like a revert, so since the commit was fixing something,
> >>>> I believe you are reintroducting a bug.
> >>>>
> >>>
> >>> Didn't see that. However, that commit was wrong because you were mixing
> >>> up different reference counters. One is that for devices, which is
> >>> decreased in __rtdm_put_device. The other is what un/register_driver
> >>> have to handle: that of the corresponding rtdm_driver. A device might
> >>> pass earlier than a driver because the latter may manage multiple
> >>> devices - exactly what the unit test checks.
> >>>
> >>> Maybe you can describe what scenarios was triggering the issue back, and
> >>> we can check if it reoccurred and fix it for good.
> >>
> >> Well, that was pretty simple, removing kernel modules registering
> >> devices (in my case it was rtnet.ko), would fail to unregister some
> >> part of the driver (some proc or sys files if I remember correctly),
> >> so that reinserting the driver would first cause some warning, and
> >> after several rmmod/insmod result in a crash or a simple failure I
> >> do not remember. I traced that to the fact that the test for the
> >> reference counter in unregister_driver was failing because the
> >> reference counter had already been decremented elsewhere. This was a
> >> long time ago, I do not remember all the details, but I think it is
> >> something like that.
> >>
> >
> > I can remove and reload a stack of rtnet, rt_e1000 and rtipv4 multiple
> > times without any bug reports. /proc/rtnet also properly disappears and
> > reappears. However, unloading and loading rtpacket causes an oops.
> >
> > BUG: unable to handle kernel paging request at ffffffffa01b5b20
> > IP: [<ffffffff81079d20>] blocking_notifier_chain_register+0x40/0xb0
> > ...
> > Call Trace:
> > [<ffffffff8114f6b8>] cobalt_add_state_chain+0x18/0x20
> > [<ffffffff8117a403>] rtdm_dev_register+0x1d3/0x660
> > [<ffffffff8110e84c>] ? ipipe_unstall_root+0x5c/0x90
> > [<ffffffff810003b0>] ? do_one_initcall+0x80/0x1f0
> > [<ffffffffa01d5000>] ? 0xffffffffa01d5000
> > [<ffffffffa01d5015>] rt_packet_proto_init+0x15/0x48 [rtpacket]
> > [<ffffffffa01d5000>] ? 0xffffffffa01d5000
> > [<ffffffff810003c0>] do_one_initcall+0x90/0x1f0
> >
> > Let me check that.
> >
>
> Pushed the proper fix for that. Was only affecting protocol drivers,
> thus the exposure via rtnet.
Ok, but it was also probably affecting rtudp, rttcp, and the rtipc
protocols too I guess. Actually I wonder if I did not see it with
xddp.
--
Gilles.
https://click-hack.org
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-05-27 10:50 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <E1b6BNz-0003SF-Fa@sd-51317.xenomai.org>
2016-05-27 6:58 ` [Xenomai] [Xenomai-git] Jan Kiszka : cobalt/rtdm: Fix driver reference counting Gilles Chanteperdrix
2016-05-27 8:24 ` Jan Kiszka
2016-05-27 8:33 ` Gilles Chanteperdrix
2016-05-27 9:10 ` Jan Kiszka
2016-05-27 9:39 ` Jan Kiszka
2016-05-27 10:50 ` Gilles Chanteperdrix
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.