From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 7 Feb 2008 08:23:29 -0700
Message-ID: <51CAD0CE1504444DBE77CBBE51A0135D3769E8@domain.hid>
From: "Steven Seeger"
Subject: [Xenomai-help] fpu
List-Id: Help regarding installation and common use of Xenomai
To: xenomai@xenomai.org

This is probably an application error, but I am having some random weird behavior where sometimes a double that I add 0.01 to in a periodic task gets corrupted and becomes nan. I didn't specify the T_FPU flag in rt_task_spawn() but the docs said that flag is assumed for a user space task. I've since set that flag manually and haven't had the problem happen yet. Is this just a coincidence?

The corruption seems to happen in a half-second window where the system is doing a lot and only about 35% of the CPU is available to Linux. However, I am testing for overruns on rt_task_wait_period() and never seeing any.

Steven

From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <47AB2771.7080508@domain.hid>
Date: Thu, 07 Feb 2008 16:44:49 +0100
From: Jan Kiszka
References: <51CAD0CE1504444DBE77CBBE51A0135D3769E8@domain.hid>
In-Reply-To: <51CAD0CE1504444DBE77CBBE51A0135D3769E8@domain.hid>
Subject: Re: [Xenomai-help] fpu
To: Steven Seeger
Cc: xenomai@xenomai.org

Steven Seeger wrote:
> This is probably an application error, but I am having some random weird
> behavior where sometimes a double that I add 0.01 to in a periodic task
> gets corrupted and becomes nan. I didn't specify the T_FPU flag in
> rt_task_spawn() but the docs said that flag is assumed for a user space
> task. I've since set that flag manually and haven't had the problem
> happen yet. Is this just a coincidence?

Likely, as T_FPU is hard-coded for user space threads. You could try to strip down your test so that incorrect application behaviour can be excluded (or use the test below).

> The corruption seems to happen in a half-second window where the system
> is doing a lot and only about 35% of the CPU is available to Linux.
> However, I am testing for overruns on rt_task_wait_period() and never
> seeing any.

Overload must not cause data corruption, "only" missed deadlines. This case needs closer examination.

As some first step: there is the switchtest tool coming with Xenomai. It includes consistency checks of the FPU environment across context switches. Maybe you can give this a try under similar load conditions over a longer period.

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 8 Feb 2008 10:59:47 -0700
Message-ID: <51CAD0CE1504444DBE77CBBE51A0135D376A5C@slcmail.slc.mew.int>
From: "Steven Seeger"
Subject: [Xenomai-help] fpu
To: xenomai@xenomai.org

We ran switchtest and it seems to pass. It continues to perform context switches (about 4600/sec) and doesn't report any errors. So, I'll take that to mean that I have an application problem somewhere.

Thanks for the advice.

Steven

From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <47ACA034.5040408@domain.hid>
Date: Fri, 08 Feb 2008 19:32:20 +0100
From: Jan Kiszka
References: <51CAD0CE1504444DBE77CBBE51A0135D376A5C@slcmail.slc.mew.int>
In-Reply-To: <51CAD0CE1504444DBE77CBBE51A0135D376A5C@slcmail.slc.mew.int>
Sender: jan.kiszka@domain.hid
Subject: Re: [Xenomai-help] fpu
To: Steven Seeger
Cc: xenomai@xenomai.org

Steven Seeger wrote:
> We ran switchtest and it seems to pass. It continues to perform context
> switches (about 4600/sec) and doesn't report any errors. So, I'll take
> that to mean that I have an application problem somewhere.

Well, let's say it is less likely. If you can manage to shrink down your scenario now, this may either point to the application bug - or generate a nice test case we could try out.

Jan
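
[Editor's note] A stripped-down check of the kind suggested in the thread could start from something like the sketch below. It is not code from either poster. It only models the reported symptom in portable C: a double that gains 0.01 per period is compared against an integer shadow counter, so a NaN or a grossly wrong value is caught in the period where it first appears. The names `fp_probe`, `fp_probe_init` and `fp_probe_step` are hypothetical, and the tolerance is a loose assumption, not a derived bound. No Xenomai calls are used here; in the real task, `fp_probe_step()` would presumably be called once after each return from rt_task_wait_period().

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper type; not part of the Xenomai API. */
typedef struct {
    double value;         /* accumulator the periodic task adds to */
    unsigned long ticks;  /* integer shadow count of additions */
    double step;          /* increment per period (0.01 in the report) */
} fp_probe;

/* Reset the probe before the periodic loop starts. */
static void fp_probe_init(fp_probe *p, double step)
{
    p->value = 0.0;
    p->ticks = 0;
    p->step = step;
}

/*
 * Perform one period's addition and check the result.
 * Returns true while the accumulator is finite and close to ticks * step,
 * false as soon as it is NaN, infinite, or far from the expected value.
 */
static bool fp_probe_step(fp_probe *p)
{
    p->value += p->step;
    p->ticks++;

    /* NaN and infinity are the symptoms described in the report. */
    if (!isfinite(p->value))
        return false;

    /* The integer counter is unaffected by FPU state, so it serves
     * as an independent reference for the floating-point sum. */
    double expected = (double)p->ticks * p->step;

    /* Assumed tolerance: generous relative slack for accumulated
     * rounding error, plus a small absolute floor. */
    double tol = 1e-6 * fabs(expected) + 1e-12;

    return fabs(p->value - expected) <= tol;
}
```

Logging the tick count and the raw bytes of `value` at the first failing period would narrow down when the corruption occurs relative to the load window described in the first message. Whether the cause lies in the application or in FPU context handling cannot be decided from this check alone.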