From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <45CFB49F.1050000@domain.hid>
Date: Mon, 12 Feb 2007 01:28:15 +0100
From: Jan Kiszka
Subject: Re: [Xenomai-core] [BUG] trunk: screwed Linux irq state
References: <45CF951B.8080404@domain.hid> <1171233732.5035.24.camel@domain.hid> <17871.41373.425284.839228@domain.hid> <1171237780.5035.30.camel@domain.hid> <17871.45750.434103.944040@domain.hid>
In-Reply-To: <17871.45750.434103.944040@domain.hid>
Sender: jan.kiszka@domain.hid
List-Id: "Xenomai life and development \(bug reports, patches, discussions\)"
To: Gilles Chanteperdrix
Cc: xenomai-core

Gilles Chanteperdrix wrote:
> Philippe Gerum wrote:
> > On Mon, 2007-02-12 at 00:07 +0100, Gilles Chanteperdrix wrote:
> > > Philippe Gerum wrote:
> > > > On Sun, 2007-02-11 at 23:13 +0100, Jan Kiszka wrote:
> > > > > Hi,
> > > > >
> > > > > while testing 2.6.20 with RTnet, I got this kernel BUG during the slave
> > > > > startup procedure:
> > > > >
> > > > > <4>[ 137.799234] TDMA: calibrated master-to-slave packet delay: 34 us (min/max: 33/38 us)
> > > > > <4>[ 142.291455] BUG: at kernel/fork.c:993 copy_process()
> > > > > <4>[ 142.291585] [] show_trace_log_lvl+0x1f/0x40
> > > > > <4>[ 142.291767] [] show_trace+0x17/0x20
> > > > > <4>[ 142.291896] [] dump_stack+0x1b/0x20
> > > > > <4>[ 142.292026] [] copy_process+0x914/0x13d0
> > > > > <4>[ 142.292190] [] do_fork+0x70/0x1b0
> > > > > <4>[ 142.292323] [] sys_clone+0x38/0x40
> > > > > <4>[ 142.292620] [] syscall_call+0x7/0xb
> > > > > <4>[ 142.292747] =======================
> > > > > <3>[ 142.292860] BUG: sleeping function called from invalid context at mm/slab.c:3034
> > > > > <4>[ 142.293052] in_atomic():0, irqs_disabled():1
> > > >                                                ^^^^
> > > >
> > > > Typical of something going wrong in entry.S.
> > >
> > > You mean, interrupts are not really disabled when forking ? :-)
> > >
> >
> > Eh, mmmh, no. Hopefully.
> >
> > > So, I am afraid the new fpu_counter optimization is buggy: if a task
> > > forks with fpu_counter greater than 5 and is preempted right after
> > > prepare_to_copy in dup_task_struct, when the system switches back to
> > > this task, the task FPU context will be restored and TS_USEDFPU set in
> > > the task flags, thereby voiding the effect of prepare_to_copy.
> > >
> >
> > You mean that the parent FPU context would leak into the child's one?
>
> Yes, something like that. The result is random segfaults, I do not
> remember exactly why.
>
> > Well, maybe the LKML people would like to know about this. As a
> > sidenote, I don't see anything bad with your latest counter-measure
> > disabling this optimization in Xenomai's context switch code, even in
> > the bugous case above. Right?
>
> Right, if there are random segfaults, they will not be xenomai's fault.
>

I'm currently sorting the symptoms again, or better I'm looking where
they went to. 2.6.20 just decided to work normally again, 2.6.19 needs a
re-check. It appears now that the tracer played an important role, but
I'm not 100% sure yet. I'll keep you posted.

Jan