From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 16 Apr 2014 18:02:05 +0200
From: Petr Cervenka
MIME-Version: 1.0
Message-Id: <20140416180205.C2109F56@centrum.cz>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: Re: [Xenomai] non-blocking rt_task_suspend(NULL)
List-Id: Discussions about the Xenomai project
To: Gilles Chanteperdrix
Cc: Xenomai

> From: Gilles Chanteperdrix
> CC: "Xenomai"
>On 04/16/2014 04:20 PM, Petr Cervenka wrote:
>>> From: Gilles Chanteperdrix
>>> CC: "Xenomai"
>>> On 04/16/2014 02:22 PM, Petr Cervenka wrote:
>>>>> From: Gilles Chanteperdrix
>>>>> CC: "Xenomai"
>>>>> On 04/15/2014 02:42 PM, Petr Cervenka wrote:
>>>>>> Hello, I have a problem with the rt_task_suspend(NULL) call.
>>>>>> I'm using it for synchronization of two (producer / consumer
>>>>>> like) tasks.
>>>>>> 1) When the consumer task has no work to do, it stops itself by
>>>>>> calling rt_task_suspend(NULL).
>>>>>> 2) When the producer creates new work for the consumer, it wakes
>>>>>> it up by calling rt_task_resume(&consumerTask).
>>>>>> The problem is that the consumer occasionally gets into a state
>>>>>> in which rt_task_suspend no longer puts it to sleep, and the
>>>>>> task then takes all the CPU time. The return code is 0, but I
>>>>>> have also seen a couple of -4 (-EINTR) values in the past.
>>>>>> Consumer task status was 00300380 before and 00300184 (if there
>>>>>> is a small safety sleep present). I can use for example an
>>>>>> RT_EVENT variable instead, but I'm curious whether you by chance
>>>>>> know what is happening?
>>>>>> Xenomai 2.6.3, Linux 3.5.7
>>>>>
>>>>> Could you post the example of code you are using to get this
>>>>> issue?
>>>>>
>>>>
>>>> It's an application with many threads, mutexes and others. It also
>>>> depends on special measuring HW. I can post here some simplified
>>>> example.
>>>> But I don't think it would be possible to reproduce the same
>>>> behavior easily. It happens in my configuration only about once
>>>> per day and very unpredictably. But I have more details. I
>>>> replaced rt_task_suspend / rt_task_resume by rt_event_wait /
>>>> rt_event_signal. It failed in a similar way, but this time the
>>>> result of the wait was -4 (-EINTR). And (after several million
>>>> invocations) it recovered by itself.
>>>
>>> -EINTR is a valid return value for both rt_event_wait and
>>> rt_task_suspend. In case you get this error, you should loop to
>>> call rt_event_wait again, and not call rt_event_clear, as you risk
>>> clearing an event which has been signaled afterwards.
>>>
>> You are right. It was just a very quick replacement of the waiting
>> and waking-up functions. But I'm checking the "work queue" anyway
>> and it also doesn't need exact timing here. My problem is that the
>> slow consumer task seems to be "interrupted by signal" (or whatever)
>> for several minutes. I mean that it doesn't wait for the event
>> anymore and it always returns immediately (with the -EINTR return
>> code).
>
>Are you running inside gdb? Does the task receive the SIGDEBUG signal?
>Do you have the XNWARNSW bit armed?
>
gdb: No.
SIGDEBUG, XNWARNSW: I don't even know what it is ;-).

>> I also already got one such situation half an hour ago. But the
>> return code was 0 that time. Could you give me some advice what to
>> check when such a situation happens again?
>
>Well the task status should help.
>
Normally the task status (from /proc/xenomai/stat) is 00300182
(XNFPU | XNSHADOW | XNMAPPED | XNSTARTED | XNPEND, i.e. waiting for an
event).
The task status during the last issue (from /proc/xenomai/stat) was
00300380 (XNFPU | XNSHADOW | XNRELAX | XNMAPPED | XNSTARTED).
The CPU load of the task was 23% (with more than 4 million MSW/CSW).
Perhaps the sending of UDP packets (used for debugging) caused some
sleep and prevented the computer from a total freeze.
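The status words quoted above can be decoded mechanically. The
following is a minimal sketch, not Xenomai code: the bit values are as
recalled from the Xenomai 2.6 nucleus header (include/nucleus/thread.h)
and should be checked against your own tree, and decode_status and
status_bits are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One named bit of the thread status word. */
struct status_bit {
    unsigned long mask;
    const char *name;
};

/* ASSUMPTION: values recalled from Xenomai 2.6 nucleus/thread.h.
 * Only a subset of the bits is listed; verify against your headers. */
const struct status_bit status_bits[] = {
    { 0x00000001UL, "XNSUSP"    },
    { 0x00000002UL, "XNPEND"    },
    { 0x00000004UL, "XNDELAY"   },
    { 0x00000080UL, "XNSTARTED" },
    { 0x00000100UL, "XNMAPPED"  },
    { 0x00000200UL, "XNRELAX"   },
    { 0x00100000UL, "XNFPU"     },
    { 0x00200000UL, "XNSHADOW"  },
};

/* Writes the names of the listed bits that are set in `status` into
 * `buf`, separated by " | ", lowest bit first. Bits not in the table
 * are appended as a single hex remainder. Output is cut short if
 * `buf` is too small. Returns `buf`. */
char *decode_status(unsigned long status, char *buf, size_t len)
{
    size_t i, used = 0;
    unsigned long rest = status;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (i = 0; i < sizeof(status_bits) / sizeof(status_bits[0]); i++) {
        int n;

        if (!(status & status_bits[i].mask))
            continue;
        rest &= ~status_bits[i].mask;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " | " : "", status_bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            return buf; /* buffer too small: stop here */
        used += (size_t)n;
    }
    if (rest)
        snprintf(buf + used, len - used, "%s0x%lx",
                 used ? " | " : "", rest);
    return buf;
}
```

For example, decode_status(0x00300380UL, buf, sizeof buf) yields
"XNSTARTED | XNMAPPED | XNRELAX | XNFPU | XNSHADOW", the same set of
flags as listed for the last issue.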
After the next issue I will have more precise information from
rt_task_inquire.

Petr
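The advice quoted in this thread, to call the wait again when it
returns -EINTR, can be sketched as a small retry loop. This is only an
illustration under assumptions: wait_fn, wait_retry_eintr and fake_wait
are hypothetical names, and fake_wait stands in for a real blocking
call such as rt_event_wait, which is deliberately not invoked so that
the example builds without Xenomai.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type standing in for a blocking call such as
 * rt_event_wait(): returns 0 on success or a negative errno value. */
typedef int (*wait_fn)(void *ctx);

/* Repeats the wait for as long as it reports -EINTR and returns the
 * first other result. If `interrupted` is not NULL, it is incremented
 * once per -EINTR so the caller can log how often this happened. */
int wait_retry_eintr(wait_fn wait, void *ctx, unsigned *interrupted)
{
    int ret;

    assert(wait != NULL);
    do {
        ret = wait(ctx);
        if (ret == -EINTR && interrupted != NULL)
            ++*interrupted;
    } while (ret == -EINTR);

    return ret;
}

/* Stand-in for the real wait (assumption for the example only):
 * fails with -EINTR `*remaining` times, then succeeds. */
int fake_wait(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        --*remaining;
        return -EINTR;
    }
    return 0;
}
```

Calling wait_retry_eintr(fake_wait, &remaining, &count) with
remaining set to 3 returns 0 and leaves count at 3. The loop does not
cure a wait that keeps returning -EINTR, but the counter makes that
symptom visible without clearing any event.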