From: Konrad Rzeszutek Wilk
Subject: Re: xen-swiotlb
Date: Mon, 16 Aug 2010 10:43:53 -0400
Message-ID: <20100816144353.GB29351@phenom.dumpdata.com>
References: <1045449371.20100815232839@eikelenboom.it>
In-Reply-To: <1045449371.20100815232839@eikelenboom.it>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Sander Eikelenboom
Cc: "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On Sun, Aug 15, 2010 at 11:28:39PM +0200, Sander Eikelenboom wrote:
> Hi Konrad,
>
> After i have exhausted all kernel debug options without getting a pointer to my freezes, i have added some printk's to all functions in swiotlb-xen.c
> I see a lot of calls to xen_swiotlb_dma_mapping_error, which doesn't seem to be good ?

The driver looks to be actually testing the value, which is great. Some of
the older drivers (r8169) don't even do that.

>
> Although all the errors the device works fine (grabs video), but eventually the machine freezes, probably due to overwriting some mem it shouldn't.

Looking at the output, the physical addresses that are DMA-ed are:

0x1f2962dc0
0x1f24f2e68

and they show up quite often. In fact, there looks to be a loop that does
something like this:

again:
	p = kmalloc(..)
	dma = pci_map_single(p)
	pci_dma_mapping_error(dma);
	/* get some data.. */
	/* parse the: (pipe 0x80000280): IN: c0 00 00 00 0c 00 01 00 */
	pci_unmap_single(dma);
	goto again;

I say that because the virtual address passed to pci_map_single looks to be
sequentially increasing, which suggests a fresh buffer is allocated on every
pass without the previous one being freed.

It might be:
 a). pci_dma_mapping_error is used incorrectly, i.e., it is used as
     !pci_dma_mapping_error, but I doubt that - the Linux kernel has so many
     examples of how to use it properly.
 b). The pci_dma_mapping_error implementation in Xen-SWIOTLB is busted, but
     I can't see how. The logic is basically 'return !addr', so if you have
     'addr = 0xf200000' you will get 0, which is the proper return value.
 c). The xhci driver does something similar to the pseudo-code I've pointed
     out. It is missing a kfree somewhere.

Can you point me to the git tree for the xhci so I can take a look there?
Also, could you send me your debug patch - that will help in finding the
culprit.
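
To make b) and c) concrete, here is a rough userspace sketch of the
'return !addr' check and of a loop iteration that frees its buffer on every
path. It is only an illustration under stated assumptions: mock_dma_addr_t,
mock_map_single, mock_dma_mapping_error and transfer_once are made-up names,
malloc/free stand in for kmalloc/kfree, and none of it is the actual
swiotlb-xen or xhci code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the kernel's dma_addr_t (assumption for this sketch). */
typedef uint64_t mock_dma_addr_t;

/* Mirrors the 'return !addr' logic: nonzero means the mapping failed. */
int mock_dma_mapping_error(mock_dma_addr_t addr)
{
	return !addr;
}

/* Hypothetical mapper: reuses the buffer address as the bus address,
 * so a NULL buffer yields 0, i.e. a failed mapping. */
mock_dma_addr_t mock_map_single(void *p)
{
	return (mock_dma_addr_t)(uintptr_t)p;
}

/* One pass of the loop, with the free that c) suspects is missing.
 * *live_buffers counts allocations that have not been freed yet.
 * Returns 0 on success, -1 on allocation or mapping failure. */
int transfer_once(size_t len, int *live_buffers)
{
	void *p = malloc(len);
	mock_dma_addr_t dma;

	if (!p)
		return -1;
	(*live_buffers)++;

	dma = mock_map_single(p);
	if (mock_dma_mapping_error(dma)) {	/* not !mock_dma_mapping_error() */
		free(p);
		(*live_buffers)--;
		return -1;
	}

	/* ... the device would fill the buffer here, then we would unmap ... */

	free(p);				/* the step the pseudo-code lacks */
	(*live_buffers)--;
	return 0;
}
```

If transfer_once is called in a loop, *live_buffers should be back at zero
after every pass; in the pseudo-code above it would grow by one each time
around.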