From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4CD6D22C.2030708@domain.hid>
Date: Sun, 07 Nov 2010 17:22:04 +0100
From: Jan Kiszka
In-Reply-To: <1289142959.1842.295.camel@domain.hid>
Sender: jan.kiszka@domain.hid
Subject: Re: [Xenomai-help] kernel oopses when killing realtime task
List-Id: Help regarding installation and common use of Xenomai
To: Philippe Gerum
Cc: xenomai@xenomai.org

On 07.11.2010 16:15, Philippe Gerum wrote:
> On Thu, 2010-10-28 at 09:46 +0200, Philippe Gerum wrote:
>> On Thu, 2010-10-28 at 09:31 +0200, Jan Kiszka wrote:
>>> On 28.10.2010 07:17, Philippe Gerum wrote:
>>>> On Tue, 2010-10-26 at 21:33 +0200, Jan Kiszka wrote:
>>>>> On 26.10.2010 07:22, Jan
Kiszka wrote:
>>>>>> Will come up with two patches for stable, one for I-pipe and one for
>>>>>> Xenomai, later today. Then we can discuss which cases I'm missing.
>>>>>
>>>>> While meditating over my approach (which turned out to be less trivial
>>>>> than expected - of course), I also reconsidered your current patches. The
>>>>> concerns I had (forwarding of spurious IRQ to Linux) turned out to be
>>>>> harmless (Linux will ignore such few spurious events).
>>>>>
>>>>
>>>> That is not even an issue if you consider the sequence to be
>>>> xnarch_disable_irq then ipipe_control (new version, doing a critical
>>>> entry to flip the irq mode).
>>>
>>> When you want to support shared IRQs, xnarch_disable_irq is taboo. I
>>> suppose you meant some my_device_disable_irqs().
>>
>> No, it is perfectly valid provided you made sure that no handler
>> remained on the shared list. There is absolutely no reason to keep a
>> line unmasked if no device is supposed to be active on it. Hence the
>> release sequence described earlier.
>>
>>>
>>>>
>>>>> Still, the approach to sync via shutting down the line for the current
>>>>> domain before xnintr_irq_detach doesn't work for us. It only works if
>>>>> xnintr_irq_detach actually detaches from the line, but it breaks if
>>>>> there are users remaining.
>>>>>
>>>>> We need intrlock to check if we are the last user while removing
>>>>> ourselves from the list. And we cannot postpone line detaching after the
>>>>> critical section as we may otherwise race with the next registration on
>>>>> that line. IOW, I don't see how to solve the issue without moving the
>>>>> drain after the detach and making the detach safer instead.
>>>>>
>>>>> Do you agree?
>>>>>
>>>>
>>>> I agree this is not trivial, for sure.
>>>> To keep things simple, I would
>>>> introduce a new "teardown" flag to freeze the descriptor, thus avoiding
>>>> further attachments, while xnintr_detach can probe the shared list for
>>>> lingering users, and eventually call xnarch_disable_irq
>>>> +xnarch_ignore_irq+xnarch_release_irq in sequence with all locks
>>>> dropped, if empty.
>>>>
>>>> The only adverse effect I can see ATM would be some concurrent caller of
>>>> xnintr_detach() blocked on the teardown flag on another CPU, albeit it
>>>> _could_ have joined the bandwagon, attaching the irq, in case the shared
>>>> list proved to remain active (and thus xnarch_release_irq was not
>>>> called). But this may also look like a simple way to prevent live
>>>> locking of interrupt descriptors. YMMV.
>>>
>>> This sounds like it's best discussed based on patches.
>>>
>>
>> Likely, yes. I'll have a look when time allows.
>
> The following patches implement the teardown approach. The basic idea
> is:
> - neither break nor improve old setups with legacy I-pipe patches not
>   providing the revised ipipe_control_irq call.
> - fix the SMP race when detaching interrupts.

Looks good.

>
> The last patch also fixes two other issues:
> - do not alter the irq descriptor (e.g. cookie and stats) if the
>   attachment fails early
> - do not set irq affinity before the validity checks, and set it only
>   for the first handler introduced in the shared list.

Separate commits? At least mention it in the change log.
Jan
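
For readers following the thread, the "teardown" flag idea quoted above can be
modelled roughly as below. This is only a userspace sketch under stated
assumptions, not the patches under discussion and not Xenomai code: struct
irq_desc, struct handler, irq_attach(), irq_detach() and line_release() are
hypothetical names, a pthread mutex stands in for intrlock, and a frozen
descriptor rejects new attachments with -EBUSY where the thread talks about
callers being blocked on the flag.

```c
/* Userspace model of the "teardown" flag idea; all names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct handler {
    struct handler *next;
    const char *name;
};

struct irq_desc {
    pthread_mutex_t lock;   /* stands in for intrlock */
    struct handler *shared; /* handlers sharing this line */
    bool teardown;          /* descriptor frozen while the line is released */
    bool line_enabled;      /* models the hardware line state */
};

/* Attach a handler unless the descriptor is frozen for teardown. */
static int irq_attach(struct irq_desc *d, struct handler *h)
{
    int ret = 0;

    pthread_mutex_lock(&d->lock);
    if (d->teardown) {
        ret = -EBUSY; /* assumption: fail instead of blocking */
    } else {
        h->next = d->shared;
        d->shared = h;
        d->line_enabled = true;
    }
    pthread_mutex_unlock(&d->lock);
    return ret;
}

/*
 * Models the disable + ignore + release sequence from the thread.
 * Called with the lock dropped; the teardown flag keeps attachers out.
 */
static void line_release(struct irq_desc *d)
{
    assert(d->teardown);
    d->line_enabled = false;
}

/*
 * Detach a handler. Returns 1 if this was the last user and the line was
 * released, 0 if other users remain, -ENOENT if the handler was not attached.
 */
static int irq_detach(struct irq_desc *d, struct handler *h)
{
    struct handler **p;
    bool found = false, last;

    pthread_mutex_lock(&d->lock);
    for (p = &d->shared; *p != NULL; p = &(*p)->next) {
        if (*p == h) {
            *p = h->next; /* unlink under the lock */
            found = true;
            break;
        }
    }
    last = found && d->shared == NULL;
    if (last)
        d->teardown = true; /* freeze: no new attachments from here on */
    pthread_mutex_unlock(&d->lock);

    if (!found)
        return -ENOENT;
    if (!last)
        return 0;

    line_release(d); /* all locks dropped */

    pthread_mutex_lock(&d->lock);
    d->teardown = false; /* descriptor may be reused */
    pthread_mutex_unlock(&d->lock);
    return 1;
}
```

In this model the "am I the last user" check and the unlink happen in the same
critical section, while the line release itself runs unlocked; the flag is
what closes the window against a racing registration on the same line.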