From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: Re: pci passthrough xhci host controller
Date: Fri, 1 Oct 2010 16:54:17 -0400
Message-ID: <20101001205417.GA18302@dumpdata.com>
References: <1262837074.20100915230935@eikelenboom.it> <20100920203344.GA26201@dumpdata.com> <1227438201.20100921220310@eikelenboom.it> <20100927155952.GA4741@dumpdata.com> <1115867214.20100930212448@eikelenboom.it>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <1115867214.20100930212448@eikelenboom.it>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Sander Eikelenboom
Cc: Jeremy Fitzhardinge, "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On Thu, Sep 30, 2010 at 09:24:48PM +0200, Sander Eikelenboom wrote:
> Hello Konrad,
>
> I have done some more tests, the results:
>
> - boot xen with mem=4G, > 2 days uptime with passthrough and videograbbing
> - boot xen without mem=4G, < 1 day freeze with passthrough and videograbbing
> - on both no problems as long as you don't grab video (so the controller doesn't do much)
> - on both no problems when grabbing video with usb2, so it's xhci specific
>
> I haven't changed anything else, same number of VM's running etc. etc., videograbbing is working on both (until the freeze or until i ended the test)
> I'm reading some messages about msi(-x) interrupt problems with xen on xen-devel, and suggestions to try noirqbalance with xen, so on both i use noirqbalance.
>
> So it seems to be related to the amount of mem available.
> I do see one difference on the domU, with mem=4G i see some occasional warnings in syslog:
> Sep 28 17:55:02 security kernel: [81744.078288] xhci_hcd 0000:07:00.0: WARN: transfer error on endpoint
> Sep 28 17:55:02 security kernel: [81744.092653] xhci_hcd 0000:07:00.0: WARN: transfer error on endpoint
> Sep 28 17:55:02 security kernel: [81744.093647] xhci_hcd 0000:07:00.0: WARN: transfer error on endpoint
> Sep 28 17:55:02 security kernel: [81744.093647] xhci_hcd 0000:07:00.0: WARN: transfer error on endpoint
> Sep 28 17:55:02 security kernel: [81744.093647] xhci_hcd 0000:07:00.0: WARN: transfer error on endpoint
>
> I don't see these warnings in the syslog when no mem=4G is used, so a hunch would be it goes wrong there while the xhci code tries to clean something up.
> It could do something "strange" that seems to work on bare metal and on xen with mem=4G, but freezes everything with mem > 4G and gives no time to write the warning to the syslog / disk in time.
>
> in the syslog of dom0 i do see some occasional memleaks going by, but one set could be related:
> Sep 28 17:55:19 localhost kernel: [81962.053321] kmemleak: 22 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
>
> I will add a script that cat's the content of /sys/kernel/debug/kmemleak to syslog when kmemleak reports new suspected leaks.
>
> Any suggestions to try to debug this further ?

Do you have the name of the grabber + USB3 device? If it is not too much I might as well get it and see what happens on my boxes.
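
For the kmemleak-to-syslog script mentioned in the quoted text, something along these lines might do. This is only a rough sketch in Python, not the script Sander actually wrote. The function names, the "kmemleak-dump:" prefix and the idea of feeding it log lines from a pipe are all assumptions made for illustration. It needs root and a mounted debugfs to read /sys/kernel/debug/kmemleak.

```python
import re
import syslog
from pathlib import Path

# Path taken from the kmemleak message quoted above.
KMEMLEAK_PATH = Path("/sys/kernel/debug/kmemleak")

# Matches kernel lines such as
# "kmemleak: 22 new suspected memory leaks (see /sys/kernel/debug/kmemleak)".
NEW_LEAKS_RE = re.compile(r"kmemleak: (\d+) new suspected memory leaks")


def count_new_leaks(line):
    """Return the leak count announced in a log line, or 0 if there is none."""
    match = NEW_LEAKS_RE.search(line)
    return int(match.group(1)) if match else 0


def dump_kmemleak(path=KMEMLEAK_PATH, log=syslog.syslog):
    """Copy every line of the kmemleak report to syslog.

    The "kmemleak-dump:" prefix (hypothetical) does not match NEW_LEAKS_RE,
    so the dumped lines cannot re-trigger the watcher.
    """
    for entry in path.read_text(errors="replace").splitlines():
        log(syslog.LOG_WARNING, "kmemleak-dump: " + entry)


def watch(lines, path=KMEMLEAK_PATH, log=syslog.syslog):
    """Dump the kmemleak report once per announcement seen in `lines`.

    Returns the number of dumps performed.
    """
    dumps = 0
    for line in lines:
        if count_new_leaks(line) > 0:
            dump_kmemleak(path, log)
            dumps += 1
    return dumps
```

To use it, append a call such as watch(sys.stdin) (with "import sys") and pipe the kernel log into the script, for example from "tail -F" on whatever file your syslog daemon writes kernel messages to. That file location differs between distributions, so it is left out here.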