From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <513069D5.1090508@xenomai.org>
Date: Fri, 01 Mar 2013 09:41:57 +0100
From: Philippe Gerum
MIME-Version: 1.0
References: <512FB9B5.9040709@xenomai.org> <51306545.1010200@xenomai.org> <5130663F.7070209@xenomai.org> <51306710.5030201@xenomai.org> <5130673D.2090700@xenomai.org>
In-Reply-To: <5130673D.2090700@xenomai.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai] Xenomai-forge: thread using 100% cpu load
List-Id: Discussions about the Xenomai project
To: Gilles Chanteperdrix
Cc: xenomai@xenomai.org

On 03/01/2013 09:30 AM, Gilles Chanteperdrix wrote:
> On 03/01/2013 09:30 AM, Philippe Gerum wrote:
>
>> On 03/01/2013 09:26 AM, Gilles Chanteperdrix wrote:
>>> On 03/01/2013 09:22 AM, Philippe Gerum wrote:
>>>
>>>> On 02/28/2013 09:22 PM, Thomas De Schampheleire wrote:
>>>>> On Thu, Feb 28, 2013 at 9:10 PM, Gilles Chanteperdrix
>>>>> wrote:
>>>>>> On 02/28/2013 08:19 PM, Ronny Meeus wrote:
>>>>>>
>>>>>>> Hello
>>>>>>>
>>>>>>> we are using the PSOS interface of Xenomai forge, running completely
>>>>>>> in user-space using the mercury code.
>>>>>>> We deploy our application on different processors, one product is
>>>>>>> running on PPC multicore (P4040, P4080, P4034) and another one on
>>>>>>> Cavium (8 core device).
>>>>>>> The Linux version we use is 2.6.32 but I would assume that this is not
>>>>>>> so relevant.
>>>>>>>
>>>>>>> Our Xenomai application is running on one of the cores (affinity is
>>>>>>> set), while the other cores are running other code.
>>>>>>>
>>>>>>> On both architectures we recently started to see issues where one thread
>>>>>>> is consuming 100% of the core on which the application is pinned.
>>>>>>> The thread that monopolizes the core is the thread internally used to
>>>>>>> manage the timers, running at the highest priority.
>>>>>>> The trigger for running into this behavior is currently unclear.
>>>>>>> If we only start a part of the application (platform management only),
>>>>>>> the issue is not observed.
>>>>>>> We see this on both an old version of Xenomai and a very recent one
>>>>>>> (pulled from the git repo yesterday).
>>>>>>>
>>>>>>> I will continue to debug this issue in the coming days and try to isolate
>>>>>>> the code that is triggering it, but I can use hints from the
>>>>>>> community.
>>>>>>> Debugging is complex since once the load starts, the debugger is not
>>>>>>> reacting anymore.
>>>>>>> If I put breakpoints in the functions that are called when the timer
>>>>>>> expires (both oneshot and periodic), the process starts to clone
>>>>>>> itself and I end up with tens of them.
>>>>>>>
>>>>>>> Has anybody seen an issue like this before or does somebody have some
>>>>>>> hints on how to debug this problem?
>>>>>>
>>>>>>
>>>>>> First enable the watchdog. It will send a signal to the application when
>>>>>> detecting a problem, then you can use the watchdog to trigger an I-pipe
>>>>>> tracer trace when the bug happens. You will probably have to increase
>>>>>> the watchdog polling frequency, in order to have a meaningful trace.
>>>>>>
>>>>>
>>>>> I don't think an I-pipe tracer will be possible when using the Mercury
>>>>> core, right (xenomai-forge) ?
>>>>>
>>>>
>>>> Correct.
>>>
>>>
>>> I do not think so. The way I see it, you can enable the I-pipe tracer
>>> without CONFIG_XENOMAI.
>>>
>>
>> Mercury has NO pipeline in the kernel.
>>
>
> You mean mercury cannot run with an I-pipe kernel?
>

I mean it does not care about the pipeline; it does not need it. So if
this is about observing kernel activity, then ftrace should be fine, or
possibly perf to find out where userland spends time.

-- 
Philippe.
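
As a complement to the ftrace/perf suggestion, here is a minimal sketch (not from this thread) of how one might first confirm which thread of the process is burning the core, by sampling the per-thread CPU counters that Linux exposes under /proc/<pid>/task/<tid>/stat. The function names (parse_task_stat, sample, busiest_threads) are hypothetical, and the field positions assume the layout described in proc(5): utime and stime are fields 14 and 15.

```python
import os
import time


def parse_task_stat(line):
    """Return (tid, comm, utime + stime in clock ticks) from one stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so locate it via the first "(" and the last ")".
    lpar = line.index("(")
    rpar = line.rindex(")")
    tid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    # The remaining fields start at field 3 (state), so field n is index n - 3.
    fields = line[rpar + 2:].split()
    return tid, comm, int(fields[11]) + int(fields[12])


def sample(pid):
    """Map tid -> (comm, ticks used so far) for every thread of pid."""
    ticks = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                t, comm, used = parse_task_stat(f.read())
        except OSError:
            continue  # the thread exited between listdir() and open()
        ticks[t] = (comm, used)
    return ticks


def busiest_threads(pid, interval=1.0):
    """Return [(cpu_percent, tid, comm), ...] sorted from busiest to idlest."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    before = sample(pid)
    time.sleep(interval)
    after = sample(pid)
    usage = []
    for tid, (comm, used) in after.items():
        if tid in before:
            pct = 100.0 * (used - before[tid][1]) / (hz * interval)
            usage.append((pct, tid, comm))
    return sorted(usage, reverse=True)
```

Calling busiest_threads(pid) with the application's PID and printing the first few entries should show a thread near 100% if one is spinning. Since the application is pinned to a single core that the spinning thread monopolizes, it is probably safer to run the sampler on one of the other cores. The thread id it reports can then be handed to perf or ftrace to narrow the trace.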