Jan Kiszka wrote:
> Am 01.11.2010 17:55, Anders Blomdell wrote:
>> Jan Kiszka wrote:
>>> Am 28.10.2010 11:34, Anders Blomdell wrote:
>>>> Jan Kiszka wrote:
>>>>> Am 28.10.2010 09:34, Anders Blomdell wrote:
>>>>>> Anders Blomdell wrote:
>>>>>>> Anders Blomdell wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm trying to use rt_eepro100 for sending raw ethernet packets,
>>>>>>>> but I'm occasionally experiencing weird behaviour.
>>>>>>>>
>>>>>>>> Versions of things:
>>>>>>>>
>>>>>>>> linux-2.6.34.5
>>>>>>>> xenomai-2.5.5.2
>>>>>>>> rtnet-39f7fcf
>>>>>>>>
>>>>>>>> The test program runs on two computers with an "Intel Corporation
>>>>>>>> 82557/8/9/0/1 Ethernet Pro 100 (rev 08)" controller, where one
>>>>>>>> computer acts as a mirror sending back packets received from the
>>>>>>>> ethernet (only those two computers are on the network), and the
>>>>>>>> other sends packets and measures roundtrip time. Most packets come
>>>>>>>> back in approximately 100 us, but occasionally the reception times
>>>>>>>> out (once in about 100000 packets or more), but the packets get
>>>>>>>> immediately received when reception is retried, which might
>>>>>>>> indicate a race between rt_dev_recvmsg and interrupt, but I might
>>>>>>>> miss something obvious.
>>>>>>> Changing one of the ethernet cards to an "Intel Corporation 82541PI
>>>>>>> Gigabit Ethernet Controller (rev 05)", while keeping everything else
>>>>>>> constant, changes behavior somewhat; after receiving a few 100000
>>>>>>> packets, reception stops entirely (-EAGAIN is returned), while
>>>>>>> transmission proceeds as it should (and mirror returns packets).
>>>>>>>
>>>>>>> Any suggestions on what to try?
>>>>>> Since the problem disappears with 'maxcpus=1', I suspect I have an SMP
>>>>>> issue (machine is a Core2 Quad), so I'll move to xenomai-core.
>>>>>> (original message can be found at
>>>>>> http://sourceforge.net/mailarchive/message.php?msg_name=4CC82C8D.3080808%40control.lth.se
>>>>>> )
>>>>>>
>>>>>> Xenomai-core gurus: which is the correct way to debug SMP issues?
>>>>>> Can I run I-pipe-tracer and expect to be able to save at least 150 us
>>>>>> of traces for all cpus? Any hints/suggestions/insights are welcome...
>>>>> The i-pipe tracer unfortunately only saves traces for the CPU that
>>>>> triggered the freeze. To have a full picture, you may want to try my
>>>>> ftrace port I posted recently for 2.6.35.
>>>> 2.6.35.7 ?
>>>>
>>> Exactly.
>> Finally managed to get the ftrace to work
>> (one possible bug: had to manually copy
>> include/xenomai/trace/xn_nucleus.h to
>> include/xenomai/trace/events/xn_nucleus.h), and it looks like it can be
>> very useful...
>>
>> But I don't think it will give much info at the moment, since no
>> xenomai/ipipe interrupt activity shows up, and adding that is far above
>> my league :-(
>
> You could use the function tracer, provided you are able to stop the
> trace quickly enough on error.
>
>> My current theory is that the problem occurs when something like this
>> takes place:
>>
>> CPU-i CPU-j CPU-k CPU-l
>>
>> rt_dev_sendmsg
>> xmit_irq
>> rt_dev_recvmsg recv_irq
>
> Can't follow. What races here, and what will go wrong then?
That's a good question. Find attached:

1. .config (so you can check for stupid mistakes)
2. console log
3. latest version of test program
4. tail of ftrace dump

These are the xenomai tasks running when the test program is active:

CPU  PID    CLASS  PRI  TIMEOUT  TIMEBASE  STAT  NAME
  0  0      idle    -1  -        master    R     ROOT/0
  1  0      idle    -1  -        master    R     ROOT/1
  2  0      idle    -1  -        master    R     ROOT/2
  3  0      idle    -1  -        master    R     ROOT/3
  0  0      rt      98  -        master    W     rtnet-stack
  0  0      rt       0  -        master    W     rtnet-rtpc
  0  29901  rt      50  -        master          raw_test
  0  29906  rt       0  -        master    X     reporter

The lines of interest from the trace are probably:

  [003] 2061.347855: xn_nucleus_thread_resume: thread=f9bf7b00 thread_name=rtnet-stack mask=2
  [003] 2061.347862: xn_nucleus_sched: status=2000000
  [000] 2061.347866: xn_nucleus_sched_remote: status=0

since this is the only place where a packet gets delayed, and the only
place in the trace where sched_remote reports status=0.

Does anybody have any ideas?

/Anders
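
A minimal sketch of how the ftrace dump could be scanned for the pattern described above (xn_nucleus_sched_remote events reporting status=0). The line layout is an assumption taken only from the three trace lines quoted in this message; the helper names parse_trace_line and find_zero_status_remote are hypothetical and not part of Xenomai, RTnet or ftrace.

```python
import re

# Assumed line shape, based only on the three trace lines quoted above:
#   [003] 2061.347855: xn_nucleus_sched_remote: status=0
# search() is used so that any leading task/PID column is simply skipped.
TRACE_LINE = re.compile(
    r"\[(?P<cpu>\d+)\]\s+(?P<ts>\d+\.\d+):\s+(?P<event>\w+):\s*(?P<fields>.*)"
)


def parse_trace_line(line):
    """Return a dict with cpu, ts, event and key=value fields, or None."""
    m = TRACE_LINE.search(line)
    if m is None:
        return None
    # Field values stay strings: the numeric base of e.g. status is not known.
    fields = dict(
        item.split("=", 1) for item in m.group("fields").split() if "=" in item
    )
    return {
        "cpu": int(m.group("cpu")),
        "ts": float(m.group("ts")),
        "event": m.group("event"),
        "fields": fields,
    }


def find_zero_status_remote(lines):
    """Collect xn_nucleus_sched_remote events whose status field is '0'."""
    hits = []
    for line in lines:
        rec = parse_trace_line(line)
        if (
            rec is not None
            and rec["event"] == "xn_nucleus_sched_remote"
            and rec["fields"].get("status") == "0"
        ):
            hits.append(rec)
    return hits


# The three lines quoted in this message, used as sample input.
SAMPLE = [
    "[003] 2061.347855: xn_nucleus_thread_resume: thread=f9bf7b00 "
    "thread_name=rtnet-stack mask=2",
    "[003] 2061.347862: xn_nucleus_sched: status=2000000",
    "[000] 2061.347866: xn_nucleus_sched_remote: status=0",
]

if __name__ == "__main__":
    for hit in find_zero_status_remote(SAMPLE):
        print(hit["cpu"], hit["ts"], hit["event"])
```

Run against the full dump instead of SAMPLE (for example by passing an open file object), this would show whether status=0 really occurs only around the delayed packet.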