From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <533E90E2.1020302@xenomai.org>
Date: Fri, 04 Apr 2014 13:00:50 +0200
From: Gilles Chanteperdrix
MIME-Version: 1.0
References: <52CAEA4D.1020505@xenomai.org> <6FD43B5D-6C35-48E7-BC3C-1414A0B809C9@gmail.com>
In-Reply-To: <6FD43B5D-6C35-48E7-BC3C-1414A0B809C9@gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai] Command line freeze during xeno-regression-test on omap4460
List-Id: Discussions about the Xenomai project
To: Andreas Glatz
Cc: xenomai@xenomai.org

On 04/04/2014 12:27 PM, Andreas Glatz wrote:
> Hi Gilles,
>
> I'm finally back to my original problem below:
>
> On 6 Jan 2014, at 17:39, Gilles Chanteperdrix wrote:
>
>> On 01/06/2014 04:30 PM, Andreas Glatz wrote:
>>> Hi,
>>>
>>> I managed to produce a kernel (v3.8.13) with xenomai 2.6.3 ipipe
>>> patch and rootfs (debian wheezy) with xenomai 2.6.3 libraries for my
>>> Pandaboard ES (omap4460). The simple regression test, which only
>>> calls dd during the switchtest, works fine. However the regression
>>> test with the linux test project (ltp-full-20130904) scripts causes
>>> some sort of system lock up. After that I can only ctrl-c
>>> xeno-regression-test (i.e. switchtest), which, however, doesn't help
>>> to regain console access (neither over ethernet nor serial).
>>>
>>> Here's what I did:
>>>
>>> -- Building --
>>> As recommended in the Xenomai 2.6 readme I followed the instructions
>>> in [1] to produce a kernel and filesystem.
>>> To get a xenomai kernel I had to do three things differently:
>>>
>>> *) I used: git checkout origin/v3.8.x -b tmp
>>> *) I applied ipipe-core-3.8.13-arm-3.patch from the xenomai-2.6 git
>>>    tree as described in the Xenomai 2.6 readme
>>> *) I disabled KGDB and TIDSPBRIDGE since those produced compile
>>>    errors (see config [2])
>>>
>>> After a while I obtained the following messages from dmesg [3] and
>>> from the command prompt:
>>>
>>> root@arm:~# cat /proc/version
>>> Linux version 3.8.13-x3.6 (aglatz@linuxvbox) (gcc version 4.7.3 20130328 (prerelease) (crosstool-NG linaro-1.13.1-4.7-2013.04-20130415 - Linaro GCC 2013.04) ) #4 SMP Sat Jan 4 15:54:20 GMT 2014
>>>
>>> -- Testing Linux --
>>> To see if everything works I downloaded and cross-compiled
>>> ltp-full-20130904 [4] with the same toolchain and flags
>>> (-march=armv7-a -mfpu=vfp3) as the xenomai libs and runtime. I
>>> started ltp with
>>> "./runltp -p -l dohell-2014-01-06-1.log -S xenomai.skiplist"
>>> and after a while it finished with a few failed tests [5]. The
>>> console access, however, worked fine.
>>>
>>> -- Testing Xenomai --
>>> First I could successfully run the simple xenomai regression test:
>>> xeno-regression-test -l "/usr/lib/xenomai/testsuite/dohell -m /tmp 100" -t 2
>>> which produced the output in [6] and the following additional
>>> messages with dmesg:
>>>
>>> [ 476.215057] Xenomai: RTDM: closing file descriptor 1.
>>> [ 477.434936] Xenomai: Posix: destroying semaphore f0069c00.
>>> [ 477.440887] Xenomai: Posix: destroying mutex f0069a00.
>>> [ 477.475372] xnheap: destroying shared heap 'rt_heap: heap' with 16384 bytes still in use.
>>> [ 479.008453] Xenomai: Switching rt_task to secondary mode after exception #0 from user-space at 0x9620 (pid 2145)
>>> [ 480.574462] Xenomai: watchdog triggered -- signaling runaway thread 'rt_task'
>>> [ 480.582061] [sched_delayed] sched: RT throttling activated
>>> [ 557.336425] Xenomai: Posix: closing message queue descriptor 3.
>>>
>>> and "cat /proc/xenomai/*" produced [7].
>>>
>>> When I started the realistic xenomai regression test:
>>> xeno-regression-test -l "/usr/lib/xenomai/testsuite/dohell -m /tmp -l /opt/ltp" -t 2
>>> everything seemed fine at first - I could log on and start top to
>>> inspect the running processes. However, the command line (over
>>> serial and ethernet) consistently freezes after a while (at
>>> different ltp tests though). First I thought it's the massive system
>>> load which doesn't leave CPU for the console... however ctrl-c of
>>> xeno-regression-test does not help to regain console access...
>>
>> That is because killing xeno-regression-test does not kill all the
>> script children. So, basically, the load tasks are still running.
>> Also, what filesystem is /tmp? dohell is using dd to alternately
>> write to /tmp, then erase the file. If /tmp is some flash, it will
>> become slow after a while. If it is a tmpfs, it will eat RAM.
>>
>>
>
> The described problem is _very_ reproducible on my PandaBoard ES
> (omap4460), where I boot from an SD card partition and the rootfs is

I have a pandaboard, so I can check whether I can reproduce that. I
believe the same problem has also been reported on beagleboard XM:
http://www.xenomai.org/pipermail/xenomai/2014-March/030311.html

So, there may be an issue with Xenomai or interrupt pipelining and the
MMC driver for omap3 and omap4.

--
Gilles.
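
For the "what filesystem is /tmp?" question above, a small Python sketch along these lines might help on a given board. It is not part of Xenomai or dohell: the function name fs_type_for and the sample mount table are made up for illustration, and it assumes the usual /proc/mounts layout (device, mount point, filesystem type, options, ...) without handling escaped spaces in mount points.

```python
import os


def fs_type_for(path, mounts_text):
    """Return (mount_point, fs_type) for the mount that contains `path`.

    `mounts_text` is text in /proc/mounts layout. The longest matching
    mount point wins; on a tie, the later entry wins, since a later
    mount shadows an earlier one at the same location.
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        # "/" must match every absolute path, hence the rstrip before adding "/".
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


# Hypothetical mount table, for illustration only.
SAMPLE_MOUNTS = (
    "/dev/mmcblk0p2 / ext4 rw,relatime 0 0\n"
    "tmpfs /tmp tmpfs rw,nosuid,nodev 0 0\n"
)
```

On the target, the second argument would be the contents of /proc/mounts, e.g. fs_type_for("/tmp", open("/proc/mounts").read()). A "tmpfs" result would point at the RAM concern, while an ext4 or similar result on the SD card would point at the flash one.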