From mboxrd@z Thu Jan 1 00:00:00 1970
Return-path: 
Received: from mailout-de.gmx.net ([213.165.64.22]:60940 "HELO mailout-de.gmx.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1751174Ab1LPIKf (ORCPT ); Fri, 16 Dec 2011 03:10:35 -0500
Message-ID: <4EEAFCF4.5080008@gmx.de>
Date: Fri, 16 Dec 2011 09:10:28 +0100
From: Ninja
MIME-Version: 1.0
To: linux-media
Subject: Re: Mantis CAM not SMP safe / Activating CAM on Technisat Skystar HD2 (DVB-S2)
References: <4EC052CE.1080002@gmx.de> <4EE2A06D.7070901@gmx.de> <4EE5E0BE.4060300@kolumbus.fi> <4EE7C3F9.1080703@gmx.de>
In-Reply-To: <4EE7C3F9.1080703@gmx.de>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-media-owner@vger.kernel.org
List-ID: 

On 13.12.2011 22:30, Ninja wrote:
> On 12.12.2011 12:08, Marko Ristola wrote:
>> On 12/10/2011 01:57 AM, Ninja wrote:
>>> Hi,
>>>
>>> has anyone an idea how the SMP problems could be fixed?
>>
>> You could turn on Mantis Kernel module's debug messages.
>> It could tell you the emitted interrupts.
>>
>> One risky thing with the Interrupt handler code is that
>> MANTIS_GPIF_STATUS is cleared, even though IRQ0 isn't active yet.
>> This could lead to a rare starvation of the wait queue you described.
>> I supplied a patch below. Does it help?
>>
>>> I did some further investigation.
>>> When comparing the number of
>>> interrupts with all cores enabled and the interrupts with only one
>>> core enabled it seems like only the IRQ0 changed, the other IRQs and
>>> the total number stays quite the same:
>>>
>>> 4 Cores:
>>> All IRQ/sec: 493
>>> Masked IRQ/sec: 400
>>> Unknown IRQ/sec: 0
>>> DMA/sec: 400
>>> IRQ-0/sec: 143
>>> IRQ-1/sec: 0
>>> OCERR/sec: 0
>>> PABRT/sec: 0
>>> RIPRR/sec: 0
>>> PPERR/sec: 0
>>> FTRGT/sec: 0
>>> RISCI/sec: 258
>>> RACK/sec: 0
>>>
>>> 1 Core:
>>> All IRQ/sec: 518
>>> Masked IRQ/sec: 504
>>> Unknown IRQ/sec: 0
>>> DMA/sec: 504
>>> IRQ-0/sec: 246
>>> IRQ-1/sec: 0
>>> OCERR/sec: 0
>>> PABRT/sec: 0
>>> RIPRR/sec: 0
>>> PPERR/sec: 0
>>> FTRGT/sec: 0
>>> RISCI/sec: 258
>>> RACK/sec: 0
>>>
>>> So, where might be the problem?
>> Turning on Mantis debug messages, might tell the difference between
>> these interrupts.
>>
>> ....
>>> I hope somebody can help, because I think we are very close to a
>>> fully functional CAM here.
>>> I ran out of things to test to get closer to the solution :(
>>> Btw: Is there any documentation available for the mantis PCI bridge?
>> Not that I know.
>>
>>>
>>> Manuel
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe
>>> linux-media" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>
>>
>> Regards,
>> Marko Ristola
>>
>
> Hi Marko,
>
> thanks for the patch. I did some quick testing today. The IRQ0 problem
> stays, but it seems like the small hangs (3-5 seconds every 20 minutes
> or something) are fixed :)
>
> Manuel

Hi,

I did some further investigation of my problem. Almost all IRQ0s
originate from calling the function "mantis_hif_read_iom" (at least when
the CAM is up and running).
Changing the udelay between the writes to about 100 almost gets rid of
the lost IRQ0 problem, but somehow it also increases the total number of
interrupts and the number of IRQ0s to about two to three times the
values seen with udelay(20). This increase doesn't happen when reducing
the number of cores as a workaround. And getting *almost* no timeouts
doesn't help much, because every timeout causes a hang/freeze until the
CAM is initialized again. Changing the PCI latency to 0xff didn't help
either.

Btw: Marko's DMA patches, posted in the other thread "Multiple Mantis
devices gives me glitches", don't help me further, since I'm using the
latest code, which already includes the patch.

Manuel
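
For reference, the two counter tables quoted earlier in the thread (4
cores vs. 1 core) can be compared with a few lines of Python. This is
only a sketch: the dictionary and function names are made up for
illustration, the values are copied by hand from the quoted tables, and
the counters that are 0 in both tables are left out.

```python
# Per-second interrupt counters, copied from the two quoted tables.
# Names (four_cores, one_core, deltas) are illustrative only.
four_cores = {"All IRQ": 493, "Masked IRQ": 400, "DMA": 400,
              "IRQ-0": 143, "RISCI": 258}
one_core = {"All IRQ": 518, "Masked IRQ": 504, "DMA": 504,
            "IRQ-0": 246, "RISCI": 258}


def deltas(a, b):
    """Return b[k] - a[k] for every counter present in both tables."""
    return {k: b[k] - a[k] for k in a if k in b}


diff = deltas(four_cores, one_core)
for name, d in diff.items():
    print(f"{name:12s} {four_cores[name]:4d} -> {one_core[name]:4d}  ({d:+d})")

# Fraction of IRQ-0 events per second that is missing with 4 cores,
# relative to the 1-core run.
shortfall = 1 - four_cores["IRQ-0"] / one_core["IRQ-0"]
print(f"IRQ-0 shortfall with 4 cores: {shortfall:.0%}")
```

With these numbers the 4-core run sees 103 fewer IRQ-0 per second
(roughly 42% less) and RISCI is unchanged. The "Masked IRQ" and "DMA"
counters are also lower by 104 each, almost exactly the IRQ-0
difference; whether that is related is not clear from the counters
alone.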