xen-devel.lists.xenproject.org archive mirror
* [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
@ 2010-08-03 15:30 Sander Eikelenboom
  2010-08-03 15:45 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 14+ messages in thread
From: Sander Eikelenboom @ 2010-08-03 15:30 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com, Keir Fraser
  Cc: Jeremy Fitzhardinge, Konrad Rzeszutek Wilk

Hi All,

I'm experiencing what seems to be a random freeze with the current xen-4.0-testing and a pvops dom0 2.6.32.16 kernel, most of the time within 2 days after rebooting.

Symptoms:
- Complete freeze; only a power cycle works.
- No bug output/stack trace in the serial log or on screen.
- Not able to get into the hypervisor with Ctrl-A (it doesn't react to the keyboard).
- No info in syslog.

Are there any more boot options I could try in the hope of getting some debug output?



title           xen-4.0.1-rc5-pre.gz / Debian GNU/Linux, 2.6.32.16+xen-2.6.32.x-20100731
root            (hd0,0)
kernel          /xen-4.0.1-rc5-pre.gz dom0_mem=768M loglvl=all loglvl_guest=all com1=115200,8n1 sync_console console_to_ring console=com1,vga iommu=0,verbose,amd_iommu_debug lapic=debug apic_verbosity=debug apic=debug
module          /vmlinuz-2.6.32.16+xen-2.6.32.x-20100731 root=/dev/mapper/serveerstertje-root ro earlyprintk=xen max_loop=255 loop_max_part=63 console=hvc0 xen-pciback.hide=(03:06.0)(04:00.0)(08:00.0)(0a:01.1)(0a:01.2)(0f:00.0) pci=resource_alignment=03:06.0;04:00.0;08:00.0;0a:01.0;0a:01.1;0a:01.2;0f:00.0
module          /initrd.img-2.6.32.16+xen-2.6.32.x-20100731
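
The two PCI lists on the dom0 kernel line above can be compared mechanically. The following is only a sketch: the helper names are made up for illustration, and the command line is copied from the boot entry above. It extracts the devices given to `xen-pciback.hide` and to `pci=resource_alignment` and reports any device that appears in one list but not the other.

```python
import re

# Copied from the dom0 "module" line in the boot entry above.
CMDLINE = (
    "xen-pciback.hide=(03:06.0)(04:00.0)(08:00.0)(0a:01.1)(0a:01.2)(0f:00.0) "
    "pci=resource_alignment=03:06.0;04:00.0;08:00.0;0a:01.0;0a:01.1;0a:01.2;0f:00.0"
)


def hidden_devices(cmdline):
    """Return the bus:dev.fn entries listed in xen-pciback.hide=(..)(..)."""
    match = re.search(r"xen-pciback\.hide=(\S+)", cmdline)
    return re.findall(r"\(([^)]+)\)", match.group(1)) if match else []


def aligned_devices(cmdline):
    """Return the bus:dev.fn entries listed in pci=resource_alignment=a;b;c."""
    match = re.search(r"pci=resource_alignment=(\S+)", cmdline)
    return match.group(1).split(";") if match else []


hidden = hidden_devices(CMDLINE)
aligned = aligned_devices(CMDLINE)
print("hidden from dom0:", hidden)
print("aligned but not hidden:", sorted(set(aligned) - set(hidden)))
```

For this command line the comparison shows six hidden devices and one extra entry (0a:01.0) that is only in the alignment list.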

serveerstertje:~# xm info
host                   : serveerstertje
release                : 2.6.32.16+xen-2.6.32.x-20100731
version                : #1 SMP Sat Jul 31 14:27:35 CEST 2010
machine                : x86_64
nr_cpus                : 6
nr_nodes               : 1
cores_per_socket       : 6
threads_per_core       : 1
cpu_mhz                : 3200
hw_caps                : 178bf3ff:efd3fbff:00000000:00001310:00802001:00000000:000037ff:00000000
virt_caps              : hvm
total_memory           : 8191
free_memory            : 6312
node_to_cpu            : node0:0-5
node_to_memory         : node0:6312
node_to_dma32_mem      : node0:2876
max_node_id            : 0
xen_major              : 4
xen_minor              : 0
xen_extra              : .1-rc5-pre
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : Sun Jul 25 22:22:43 2010 +0100 21287:e6b5b2cb8146
xen_commandline        : dom0_mem=768M loglvl=all loglvl_guest=all com1=115200,8n1 sync_console console_to_ring console=com1,vga iommu=0,verbose,amd_iommu_debug lapic=debug apic_verbosity=debug apic=debug
cc_compiler            : gcc version 4.3.2 (Debian 4.3.2-1.1)
cc_compile_by          : root
cc_compile_domain      :
cc_compile_date        : Sat Jul 31 12:06:59 CEST 2010
xend_config_format     : 4


commit 78b55f90e72348e231092dbe3e50ac7414b9e1af
Merge: c0a00fb... dee9469...
Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Date:   Wed Jul 28 00:19:39 2010 -0700

    Merge branch 'xen/next-2.6.32' into xen/stable-2.6.32.x

    * xen/next-2.6.32: (24 commits)
      cmpxchg: fix some 32-bit typos
      x86/cmpxchg: fix asm constraints to mention memory modification
      apply_to_page_range: fix compilation warning
      xen/blktap: #if CONFIG_XEN -> #ifdef CONFIG_XEN
      x86/hugepte: use set_pgd for 2-level pagetables
      implement O_NONBLOCK for /proc/xen/xenbus
      xen-pcifront: Remove usage of spin-locks.
      xen-pciback: Redo spinlock usage.
      xen-pcifront: Fix spinlock usage.
      xen-pcifront: Don't race with udev when discovering new devices.
      xen/blktap: make protocol specific usage of shared sring explicit
      xen/netback: make protocol specific usage of shared sring explicit
      xen/rings: make protocol specific usage of shared sring explicit
      xen/rings: make protocol specific usage of shared sring explicit
      xen/netfront: make protocol specific usage of shared sring explicit
      xen/rings: make protocol specific usage of shared sring explicit
      xen/rings: make protocol specific usage of shared sring explicit
      xen/blkback: Flush blkback data when connecting.
      xen: support large numbers of CPUs with vcpu info placement
      rtnetlink: make SR-IOV VF interface symmetric
      ...


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-03 15:30 [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log Sander Eikelenboom
@ 2010-08-03 15:45 ` Konrad Rzeszutek Wilk
  2010-08-03 15:51   ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 14+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-08-03 15:45 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Keir Fraser

On Tue, Aug 03, 2010 at 05:30:57PM +0200, Sander Eikelenboom wrote:
> Hi All,
> 
> I'm experiencing for what it seems a random freeze with current xen-4.0-testing, pvops dom0 2.6.32.16 kernel, most of the time within 2 days after rebooting.
> 
You did not experience the freeze with 2.6.32.15?
> Symptoms:
> - Complete freeze, only power cycle does work.
> - No bug output/stacktrace in serial log / on screen.
> - Not able to get into hypervisor with ctrl-a (doesn't react to keyboard)
> - No info in syslog.
> 
> Are there any more boot options I could give a try in the hope it will give some debug output ?

The Linux kernel has some of those 'DETECT_SPINLOCK_HANG' or
'DETECT_WORK..something' flags. It might be a good idea to compile those
in and see whether, when your machine freezes, the kernel starts
reporting what is hung after about 2 minutes. That could give some idea.
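
The exact option names are not spelled out above. As an assumption, the closest matches in 2.6.32-era kernels are probably CONFIG_DETECT_SOFTLOCKUP, CONFIG_DETECT_HUNG_TASK (whose default timeout of 120 seconds would fit the "2 minutes" mentioned) and CONFIG_DEBUG_SPINLOCK. A minimal sketch for checking whether a given kernel config has them enabled could look like this; the sample config text is hypothetical and not taken from the reporter's machine.

```python
# Option names are an assumption about what the flags above refer to.
WANTED = (
    "CONFIG_DETECT_SOFTLOCKUP",
    "CONFIG_DETECT_HUNG_TASK",
    "CONFIG_DEBUG_SPINLOCK",
)


def enabled_options(config_text, wanted=WANTED):
    """Map each wanted option to True if the config sets it to y or m."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if name in wanted and value in ("y", "m"):
            enabled.add(name)
    return {name: name in enabled for name in wanted}


# Hypothetical config fragment, for illustration only.
SAMPLE_CONFIG = """\
CONFIG_DETECT_SOFTLOCKUP=y
CONFIG_DETECT_HUNG_TASK=y
# CONFIG_DEBUG_SPINLOCK is not set
"""

for option, is_on in enabled_options(SAMPLE_CONFIG).items():
    print(option, "enabled" if is_on else "missing")
```

On a real system the text would come from the kernel's build config (for example the `config-<version>` file shipped next to the kernel image, if the distribution provides one).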


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-03 15:45 ` Konrad Rzeszutek Wilk
@ 2010-08-03 15:51   ` Jeremy Fitzhardinge
  2010-08-03 16:18     ` Sander Eikelenboom
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Jeremy Fitzhardinge @ 2010-08-03 15:51 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Sander Eikelenboom, xen-devel@lists.xensource.com, Keir Fraser

  On 08/03/2010 08:45 AM, Konrad Rzeszutek Wilk wrote:
> On Tue, Aug 03, 2010 at 05:30:57PM +0200, Sander Eikelenboom wrote:
>> Hi All,
>>
>> I'm experiencing for what it seems a random freeze with current xen-4.0-testing, pvops dom0 2.6.32.16 kernel, most of the time within 2 days after rebooting.
>>
> You did not experience the freeze with 2.6.32.15?

There have been a few updates to the .32.16 kernel too (and now it's 
.17...).  But it would be very useful to identify which kernel was the 
last working one.

>> Symptoms:
>> - Complete freeze, only power cycle does work.
>> - No bug output/stacktrace in serial log / on screen.
>> - Not able to get into hypervisor with ctrl-a (doesn't react to keyboard)
>> - No info in syslog.
>>
>> Are there any more boot options I could give a try in the hope it will give some debug output ?
> The Linux kernel has some of those 'DETECT_SPINLOCK_HANG' or
> 'DETECT_WORK..something' flags. It might be a good idea to compile those
> and see when your machine freezes if after 2 minutes the kernel starts
> spitting out what is hung. That could give some idea.
>

If Xen doesn't respond then it isn't a kernel spinlock problem; it looks 
more system-wide than that.  I notice the kernel command line has lots 
of hidden PCI devices.  Sander, is there any particular activity (esp 
passthrough device activity) which might correspond to the hang?

     J


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-03 15:51   ` Jeremy Fitzhardinge
@ 2010-08-03 16:18     ` Sander Eikelenboom
  2010-08-03 17:18     ` Sander Eikelenboom
  2010-08-05  9:48     ` Sander Eikelenboom
  2 siblings, 0 replies; 14+ messages in thread
From: Sander Eikelenboom @ 2010-08-03 16:18 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: xen-devel@lists.xensource.com, Keir Fraser, Konrad Rzeszutek Wilk

Hi Jeremy,

Yes, I have a domU with 2 USB cards passed through, with 2 USB video grabbers attached.
This domain is running Konrad's devel/merge.2.6.35-rc6.t2 tree plus some additional patches for the usb3/xhci card, which still give some trouble.

But I didn't expect both dom0 and the hypervisor to freeze as well, leaving no clues :(

--

Sander



-- 
Best regards,
 Sander                            mailto:linux@eikelenboom.it


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-03 15:51   ` Jeremy Fitzhardinge
  2010-08-03 16:18     ` Sander Eikelenboom
@ 2010-08-03 17:18     ` Sander Eikelenboom
  2010-08-05  9:48     ` Sander Eikelenboom
  2 siblings, 0 replies; 14+ messages in thread
From: Sander Eikelenboom @ 2010-08-03 17:18 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: xen-devel@lists.xensource.com, Keir Fraser, Konrad Rzeszutek Wilk

Hi Jeremy,

Which kernel last worked is quite hard to say: the xhci patches depend on a 2.6.35-rc6 kernel, so I took that one from Konrad's tree.
But the pciback changes mean I have to use a recent dom0 kernel as well (if I recall correctly, it needs some of the spinlock cleanups to work).

--
Sander



-- 
Best regards,
 Sander                            mailto:linux@eikelenboom.it


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-03 15:51   ` Jeremy Fitzhardinge
  2010-08-03 16:18     ` Sander Eikelenboom
  2010-08-03 17:18     ` Sander Eikelenboom
@ 2010-08-05  9:48     ` Sander Eikelenboom
  2010-08-05 14:52       ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 14+ messages in thread
From: Sander Eikelenboom @ 2010-08-05  9:48 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: xen-devel@lists.xensource.com, Keir Fraser, Konrad Rzeszutek Wilk

Hi Konrad/Jeremy,

I have tested for the last 2 days with the VMs that have passed-through devices shut down, and there has been no freeze so far.
I'm now running one of those VMs, which uses an old 2.6.33 kernel from an old tree of Konrad's together with some hacked-up patches for xhci/usb3 support.
That has been running fine for some time now (although not a full 2 days yet).

So my other VM seems to cause the freeze.

- This one uses devel/merge.2.6.35-rc6.t2 as its domU kernel; I think I should perhaps try an older version of pci-front/xen-swiotlb.
- It has both a USB2 and a USB3 controller passed through, but the xhci module has changed a lot since the hacked-up patches in the kernel of the working domU VM.
- Most probably the drivers for the video grabbers have changed as well.

So I suspect:
   - newer pci-front / xen-swiotlb
   - xhci/usb3 driver
   - video grabber drivers

Most probable would be a rogue DMA transfer that can't be caught by Xen / pciback, I guess, and would therefore be hard to debug?

--
Sander




-- 
Best regards,
 Sander                            mailto:linux@eikelenboom.it


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-05  9:48     ` Sander Eikelenboom
@ 2010-08-05 14:52       ` Konrad Rzeszutek Wilk
  2010-08-05 15:12         ` Sander Eikelenboom
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-08-05 14:52 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Keir Fraser

On Thu, Aug 05, 2010 at 11:48:44AM +0200, Sander Eikelenboom wrote:
> Hi Konrad/Jeremy,
> 
> I have tested for the last 2 days with the VMs that have passed-through devices shut down, and there has been no freeze so far.
> I'm now running one of those VMs, which uses an old 2.6.33 kernel from an old tree of Konrad's together with some hacked-up patches for xhci/usb3 support.
> That has been running fine for some time now (although not a full 2 days yet).
> 
> So my other VM seems to cause the freeze.
> 
> - This one uses devel/merge.2.6.35-rc6.t2 as its domU kernel; I think I should perhaps try an older version of pci-front/xen-swiotlb.
> - It has both a USB2 and a USB3 controller passed through, but the xhci module has changed a lot since the hacked-up patches in the kernel of the working domU VM.
> - Most probably the drivers for the video grabbers have changed as well.
> 
> So I suspect:
>    - newer pci-front / xen-swiotlb
>    - xhci/usb3 driver
>    - video grabber drivers
> 
> Most probable would be a rogue DMA transfer that can't be caught by Xen / pciback, I guess, and would therefore be hard to debug?

The SWIOTLB "brains" by themselves haven't changed since the
uhh...2.6.33. The code internals that just got Ack-ed upstream look quite
similar to the ones that Jeremy carries in xen/stable-2.6.32.x. The
outside plumbing parts are the ones that changed.

The fixes in the pci-front, well, most of those are "bureaucratic" in
nature - set the ownership to this, make hotplug work, etc. The big
fixes were the MSI/MSI-X ones but those were big news a couple of months
ago (and I think that was when 2.6.34 came out).

The videograbber (v4l) stack trace you sent to me some time ago looked
like a mutex was held for a very very long time... which makes me wonder if
that is the cmpxchg compiler bug that has hit some folks. Are you using
Debian?

But we can do something easy. I can rebase my 2.6.33 kernel with the
latest Xen-SWIOTLB/SWIOTLB engine + Xen PCI front, and we can rule out
SWIOTLB/PCIfront being at fault here. Let me do that if your 2.6.33
VM guest has been running fine for the last two days.


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-05 14:52       ` Konrad Rzeszutek Wilk
@ 2010-08-05 15:12         ` Sander Eikelenboom
  2010-08-05 16:21         ` Jeremy Fitzhardinge
  2010-08-06  9:21         ` Sander Eikelenboom
  2 siblings, 0 replies; 14+ messages in thread
From: Sander Eikelenboom @ 2010-08-05 15:12 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Keir Fraser

Hi Konrad,

Thanks for your response, and I saw Linus pulled the swiotlb code... way to go!

Thursday, August 5, 2010, 4:52:14 PM, you wrote:

> The SWIOTLB "brains" by themselves haven't changed since the
> uhh...2.6.33. The code internals that just got Ack-ed upstream look quite
> similar to the ones that Jeremy carries in xen/stable-2.6.32.x. The
> outside plumbing parts are the ones that changed.

> The fixes in the pci-front, well, most of those are "bureaucratic" in
> nature - set the ownership to this, make hotplug work, etc. The big
> fixes were the MSI/MSI-X ones but those were big news a couple of months
> ago (and I think that was when 2.6.34 came out).

> The videograbber (v4l) stack trace you sent to me some time ago looked
> like a mutex was held for a very very long time... which makes me wonder if
> that is the cmpxchg compiler bug that has hit some folks. Are you using
> Debian?

Yes, I'm using Debian. I saw that bug fix too, but since Jeremy hasn't included it in stable yet, I haven't either :-)
Well, you gave me a pointer here: looking again, it seems to hang on the device on the usb2 controller and not the usb3.
So to rule out the usb3 stuff I will drop that usb2 controller and see if that works. If so, it must be a problem in the driver,
since that grabber + usb2 controller worked perfectly for quite a while.


> But we can do something easy. I can rebase my 2.6.33 kernel with the
> latest Xen-SWIOTLB/SWIOTLB engine + Xen PCI front, and we can rule out
> SWIOTLB/PCIfront being at fault here. Let me do that if your 2.6.33
> VM guest has been running fine for the last two days.

I will first try the above; if that doesn't work out, I will try 2.6.33 again for longer and report back!

Thanks again!


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-05 14:52       ` Konrad Rzeszutek Wilk
  2010-08-05 15:12         ` Sander Eikelenboom
@ 2010-08-05 16:21         ` Jeremy Fitzhardinge
  2010-08-06  9:21         ` Sander Eikelenboom
  2 siblings, 0 replies; 14+ messages in thread
From: Jeremy Fitzhardinge @ 2010-08-05 16:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Sander Eikelenboom, xen-devel@lists.xensource.com, Keir Fraser

  On 08/05/2010 07:52 AM, Konrad Rzeszutek Wilk wrote:
> The videograbber (v4l) stack trace you sent to me some time ago looked
> like a mutex was held for a very very long time... which makes me wonder if
> that is the cmpxchg compiler bug that has hit some folks. Are you using
> Debian?

The symptom of that bug is that gcc doesn't see any writes to a static 
variable, so it puts it in a RO section, causing faults.  So while I 
guess it's within the realm of possibility that this is a different 
manifestation of the same bug, it doesn't seem likely.

     J


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-05 14:52       ` Konrad Rzeszutek Wilk
  2010-08-05 15:12         ` Sander Eikelenboom
  2010-08-05 16:21         ` Jeremy Fitzhardinge
@ 2010-08-06  9:21         ` Sander Eikelenboom
  2010-08-06 15:17           ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 14+ messages in thread
From: Sander Eikelenboom @ 2010-08-06  9:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Keir Fraser

Hi Konrad,

Hmm, it seems the 2.6.33 tree does work for 1 VM with a video grabber, but doesn't for the VM which seems to cause the freeze.
It does spit out some stack traces after a while of not functioning, but since they are OOM-related I think they are fallout from something else and not anywhere near the root cause.
Although this at least didn't freeze the complete system :-)
I will try some more configurations to see if I can find a pattern somehow...

--
Sander

[ 1269.032133] submit of urb 0 failed (error=-90)
[ 1274.153341] motion: page allocation failure. order:6, mode:0xd4
[ 1274.153375] Pid: 1884, comm: motion Not tainted 2.6.33 #5
[ 1274.153391] Call Trace:
[ 1274.153416]  [<ffffffff810e4665>] __alloc_pages_nodemask+0x5b2/0x62b
[ 1274.153440]  [<ffffffff810338b9>] ? xen_force_evtchn_callback+0xd/0xf
[ 1274.153461]  [<ffffffff810e46f5>] __get_free_pages+0x17/0x5f
[ 1274.153483]  [<ffffffff8128042e>] xen_swiotlb_alloc_coherent+0x3c/0xe2
[ 1274.153507]  [<ffffffff81410931>] hcd_buffer_alloc+0xfa/0x11f
[ 1274.153527]  [<ffffffff81403e0c>] usb_buffer_alloc+0x17/0x1d
[ 1274.153562]  [<ffffffffa003f39e>] em28xx_init_isoc+0x16a/0x32b [em28xx]
[ 1274.153585]  [<ffffffff815ec0b9>] ? __down_read+0x47/0xed
[ 1274.153613]  [<ffffffffa003a4ac>] buffer_prepare+0xd7/0x10d [em28xx]
[ 1274.153639]  [<ffffffffa0016dac>] videobuf_qbuf+0x308/0x3f4 [videobuf_core]
[ 1274.153667]  [<ffffffffa0039cb3>] vidioc_qbuf+0x35/0x3a [em28xx]
[ 1274.153697]  [<ffffffffa0028229>] __video_do_ioctl+0x11ab/0x373b [videodev]
[ 1274.153720]  [<ffffffff814b51cd>] ? sock_def_readable+0x54/0x5f
[ 1274.153743]  [<ffffffff81541f65>] ? unix_dgram_sendmsg+0x3f1/0x43e
[ 1274.153764]  [<ffffffff810313b5>] ? __raw_callee_save_xen_pud_val+0x11/0x1e
[ 1274.153793]  [<ffffffffa0039c7e>] ? vidioc_qbuf+0x0/0x3a [em28xx]
[ 1274.153814]  [<ffffffff814b208b>] ? sock_sendmsg+0xa3/0xbc
[ 1274.153837]  [<ffffffff8123349b>] ? avc_has_perm+0x4e/0x60
[ 1274.153855]  [<ffffffff810338b9>] ? xen_force_evtchn_callback+0xd/0xf
[ 1274.153880]  [<ffffffffa002aab1>] video_ioctl2+0x2f8/0x3af [videodev]
[ 1274.153901]  [<ffffffff810357df>] ? __switch_to+0x265/0x277
[ 1274.153924]  [<ffffffffa0026122>] v4l2_ioctl+0x38/0x3a [videodev]
[ 1274.153944]  [<ffffffff8111ec90>] vfs_ioctl+0x72/0x9e
[ 1274.153961]  [<ffffffff8111f1d7>] do_vfs_ioctl+0x4a0/0x4e1
[ 1274.153980]  [<ffffffff8111f26d>] sys_ioctl+0x55/0x77
[ 1274.154000]  [<ffffffff81112e6a>] ? sys_write+0x60/0x70
[ 1274.154009]  [<ffffffff81036cc2>] system_call_fastpath+0x16/0x1b
[ 1274.154126] Mem-Info:
[ 1274.154138] DMA per-cpu:
[ 1274.154151] CPU    0: hi:    0, btch:   1 usd:   0
[ 1274.154165] CPU    1: hi:    0, btch:   1 usd:   0
[ 1274.154180] DMA32 per-cpu:
[ 1274.154202] CPU    0: hi:  186, btch:  31 usd:   0
[ 1274.154220] CPU    1: hi:  186, btch:  31 usd:  78
[ 1274.154241] active_anon:248 inactive_anon:326 isolated_anon:0
[ 1274.154244]  active_file:132 inactive_file:105 isolated_file:41
[ 1274.154247]  unevictable:0 dirty:0 writeback:19 unstable:0
[ 1274.154250]  free:1309 slab_reclaimable:642 slab_unreclaimable:3111
[ 1274.154254]  mapped:100846 shmem:4 pagetables:1187 bounce:0
[ 1274.154313] DMA free:2036kB min:80kB low:100kB high:120kB active_anon:0kB inactive_anon:24kB active_file:20kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:14752kB mlocked:0kB dirty:0kB writeback:0kB mapped:12804kB shmem:0kB slab_reclaimable:16kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:24kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 1274.154375] lowmem_reserve[]: 0 489 489 489
[ 1274.154415] DMA32 free:3200kB min:2788kB low:3484kB high:4180kB active_anon:992kB inactive_anon:1280kB active_file:508kB inactive_file:420kB unevictable:0kB isolated(anon):0kB isolated(file):164kB present:500960kB mlocked:0kB dirty:0kB writeback:76kB mapped:390580kB shmem:16kB slab_reclaimable:2552kB slab_unreclaimable:12404kB kernel_stack:592kB pagetables:4724kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:160 all_unreclaimable? no
[ 1274.154481] lowmem_reserve[]: 0 0 0 0
[ 1274.154508] DMA: 7*4kB 1*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2036kB
[ 1274.154571] DMA32: 409*4kB 33*8kB 2*16kB 0*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 3212kB
[ 1274.154634] 429 total pagecache pages
[ 1274.154646] 161 pages in swap cache
[ 1274.154658] Swap cache stats: add 344422, delete 344260, find 99167/143153
[ 1274.154673] Free swap  = 476756kB
[ 1274.154684] Total swap = 524280kB
[ 1274.160880] 131072 pages RAM
[ 1274.160902] 21934 pages reserved
[ 1274.160914] 101195 pages shared
[ 1274.160925] 6309 pages non-shared
[ 1274.160963] unable to allocate 185088 bytes for transfer buffer 4
[ 1287.634682] motion invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
[ 1287.634719] motion cpuset=/ mems_allowed=0
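
The per-zone free-list lines in the log above (for example `DMA32: 409*4kB 33*8kB ... = 3212kB`) can be tallied to see how the free memory is spread over block sizes. This is only a sketch with made-up helper names; the two input lines are copied from the log with the timestamp removed.

```python
import re

# Copied from the Mem-Info dump above, timestamps stripped.
DMA_LINE = ("DMA: 7*4kB 1*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB "
            "1*512kB 1*1024kB 0*2048kB 0*4096kB = 2036kB")
DMA32_LINE = ("DMA32: 409*4kB 33*8kB 2*16kB 0*32kB 0*64kB 0*128kB 1*256kB "
              "0*512kB 1*1024kB 0*2048kB 0*4096kB = 3212kB")


def parse_free_list(line):
    """Return (zone, [(count, block_kb), ...], reported_total_kb)."""
    zone, rest = line.split(":", 1)
    blocks = [(int(n), int(kb)) for n, kb in re.findall(r"(\d+)\*(\d+)kB", rest)]
    reported = int(re.search(r"=\s*(\d+)kB", rest).group(1))
    return zone.strip(), blocks, reported


def summarize(line):
    """Recompute the zone total and show where the free memory sits."""
    zone, blocks, reported = parse_free_list(line)
    return {
        "zone": zone,
        "computed_kb": sum(n * kb for n, kb in blocks),
        "reported_kb": reported,
        "largest_free_kb": max(kb for n, kb in blocks if n > 0),
        "single_page_kb": sum(n * kb for n, kb in blocks if kb == 4),
    }


for log_line in (DMA_LINE, DMA32_LINE):
    print(summarize(log_line))
```

For the DMA32 zone, more than half of the free memory (1636kB of 3212kB) sits in single 4kB pages, which fits a fragmented zone where large contiguous requests are hard to satisfy.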






-- 
Best regards,
 Sander                            mailto:linux@eikelenboom.it


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-06  9:21         ` Sander Eikelenboom
@ 2010-08-06 15:17           ` Konrad Rzeszutek Wilk
  2010-08-06 20:44             ` Jeremy Fitzhardinge
                               ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-08-06 15:17 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Keir Fraser

On Fri, Aug 06, 2010 at 11:21:11AM +0200, Sander Eikelenboom wrote:
> Hi Konrad,
> 
> Hmm, it seems the 2.6.33 tree does work for 1 VM with a video grabber, but doesn't for the VM which seems to cause the freeze.
> It does spit out some stack traces after a while of not functioning, but since they are OOM-related I think they are fallout from something else and not anywhere near the root cause.
> Although this at least didn't freeze the complete system :-)
> I will try some more configurations to see if I can find a pattern somehow...
> 
> --
> Sander
> 
> [ 1269.032133] submit of urb 0 failed (error=-90)
> [ 1274.153341] motion: page allocation failure. order:6, mode:0xd4

That is a 256kB request for memory (order 6, i.e. 64 pages of 4kB).
> [ 1274.153375] Pid: 1884, comm: motion Not tainted 2.6.33 #5
> [ 1274.153391] Call Trace:
> [ 1274.153416]  [<ffffffff810e4665>] __alloc_pages_nodemask+0x5b2/0x62b
> [ 1274.153440]  [<ffffffff810338b9>] ? xen_force_evtchn_callback+0xd/0xf
> [ 1274.153461]  [<ffffffff810e46f5>] __get_free_pages+0x17/0x5f
> [ 1274.153483]  [<ffffffff8128042e>] xen_swiotlb_alloc_coherent+0x3c/0xe2
> [ 1274.153507]  [<ffffffff81410931>] hcd_buffer_alloc+0xfa/0x11f
> [ 1274.153527]  [<ffffffff81403e0c>] usb_buffer_alloc+0x17/0x1d
> [ 1274.153562]  [<ffffffffa003f39e>] em28xx_init_isoc+0x16a/0x32b [em28xx]
> [ 1274.153585]  [<ffffffff815ec0b9>] ? __down_read+0x47/0xed
> [ 1274.153613]  [<ffffffffa003a4ac>] buffer_prepare+0xd7/0x10d [em28xx]
> [ 1274.153639]  [<ffffffffa0016dac>] videobuf_qbuf+0x308/0x3f4 [videobuf_core]
> [ 1274.153667]  [<ffffffffa0039cb3>] vidioc_qbuf+0x35/0x3a [em28xx]
> [ 1274.153697]  [<ffffffffa0028229>] __video_do_ioctl+0x11ab/0x373b [videodev]
> [ 1274.153720]  [<ffffffff814b51cd>] ? sock_def_readable+0x54/0x5f
> [ 1274.153743]  [<ffffffff81541f65>] ? unix_dgram_sendmsg+0x3f1/0x43e
> [ 1274.153764]  [<ffffffff810313b5>] ? __raw_callee_save_xen_pud_val+0x11/0x1e
> [ 1274.153793]  [<ffffffffa0039c7e>] ? vidioc_qbuf+0x0/0x3a [em28xx]
> [ 1274.153814]  [<ffffffff814b208b>] ? sock_sendmsg+0xa3/0xbc
> [ 1274.153837]  [<ffffffff8123349b>] ? avc_has_perm+0x4e/0x60
> [ 1274.153855]  [<ffffffff810338b9>] ? xen_force_evtchn_callback+0xd/0xf
> [ 1274.153880]  [<ffffffffa002aab1>] video_ioctl2+0x2f8/0x3af [videodev]
> [ 1274.153901]  [<ffffffff810357df>] ? __switch_to+0x265/0x277
> [ 1274.153924]  [<ffffffffa0026122>] v4l2_ioctl+0x38/0x3a [videodev]
> [ 1274.153944]  [<ffffffff8111ec90>] vfs_ioctl+0x72/0x9e
> [ 1274.153961]  [<ffffffff8111f1d7>] do_vfs_ioctl+0x4a0/0x4e1
> [ 1274.153980]  [<ffffffff8111f26d>] sys_ioctl+0x55/0x77
> [ 1274.154000]  [<ffffffff81112e6a>] ? sys_write+0x60/0x70
> [ 1274.154009]  [<ffffffff81036cc2>] system_call_fastpath+0x16/0x1b
> [ 1274.154126] Mem-Info:
> [ 1274.154138] DMA per-cpu:
> [ 1274.154151] CPU    0: hi:    0, btch:   1 usd:   0
> [ 1274.154165] CPU    1: hi:    0, btch:   1 usd:   0
> [ 1274.154180] DMA32 per-cpu:
> [ 1274.154202] CPU    0: hi:  186, btch:  31 usd:   0
> [ 1274.154220] CPU    1: hi:  186, btch:  31 usd:  78
> [ 1274.154241] active_anon:248 inactive_anon:326 isolated_anon:0
> [ 1274.154244]  active_file:132 inactive_file:105 isolated_file:41
> [ 1274.154247]  unevictable:0 dirty:0 writeback:19 unstable:0
> [ 1274.154250]  free:1309 slab_reclaimable:642 slab_unreclaimable:3111
> [ 1274.154254]  mapped:100846 shmem:4 pagetables:1187 bounce:0
> [ 1274.154313] DMA free:2036kB min:80kB low:100kB high:120kB active_anon:0kB inactive_anon:24kB active_file:20kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:14752kB mlocked:0kB dirty:0kB writeback:0kB mapped:12804kB shmem:0kB slab_reclaimable:16kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:24kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [ 1274.154375] lowmem_reserve[]: 0 489 489 489
> [ 1274.154415] DMA32 free:3200kB min:2788kB low:3484kB high:4180kB active_anon:992kB inactive_anon:1280kB active_file:508kB inactive_file:420kB unevictable:0kB isolated(anon):0kB isolated(file):164kB present:500960kB mlocked:0kB dirty:0kB writeback:76kB mapped:390580kB shmem:16kB slab_reclaimable:2552kB slab_unreclaimable:12404kB kernel_stack:592kB pagetables:4724kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:160 all_unreclaimable? no
> [ 1274.154481] lowmem_reserve[]: 0 0 0 0
> [ 1274.154508] DMA: 7*4kB 1*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2036kB
> [ 1274.154571] DMA32: 409*4kB 33*8kB 2*16kB 0*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 3212kB
> [ 1274.154634] 429 total pagecache pages
> [ 1274.154646] 161 pages in swap cache
> [ 1274.154658] Swap cache stats: add 344422, delete 344260, find 99167/143153
> [ 1274.154673] Free swap  = 476756kB
> [ 1274.154684] Total swap = 524280kB
> [ 1274.160880] 131072 pages RAM
> [ 1274.160902] 21934 pages reserved
> [ 1274.160914] 101195 pages shared
> [ 1274.160925] 6309 pages non-shared
> [ 1274.160963] unable to allocate 185088 bytes for transfer buffer 4

Though here it says it is 185 kbytes. Hmm.. You got 3MB in DMA32 and 2MB
in DMA so that should be enough.
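
To compare the total free memory with what is free in large enough blocks, the per-order free list from the log above can be added up. This is only a rough sketch: the helper names are made up for illustration, and it does not model the allocator's watermark checks, so a block showing up here does not guarantee that the allocation would succeed.

```python
import re

def parse_buddy_line(line):
    """Return {block_size_kB: count} from a 'DMA32: 409*4kB 33*8kB ...' line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}

def free_kb(blocks, min_block_kb=0):
    """Total free kB held in blocks of at least min_block_kb."""
    return sum(size * count
               for size, count in blocks.items()
               if size >= min_block_kb)

# Line copied from the Mem-Info dump above.
dma32 = parse_buddy_line(
    "DMA32: 409*4kB 33*8kB 2*16kB 0*32kB 0*64kB 0*128kB "
    "1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 3212kB")

total = free_kb(dma32)                     # everything on the free lists
usable = free_kb(dma32, min_block_kb=256)  # only blocks of order 6 and up
print(total, usable)
```

Most of the free memory in DMA32 sits in 4kB and 8kB blocks, which cannot satisfy an order-6 request.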

I am not that familiar with the VM, so the instinctive thing I can think
of is to raise the amount of memory your guest has from 512MB to
768MB. Does '/proc/meminfo' show an exceedingly small amount of MemFree
when this happens?
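
A small sketch of how MemFree could be pulled out of '/proc/meminfo' inside the guest while the problem is happening. The function names and the sample values are made up for illustration; only the "Key:  value kB" layout of the file is assumed.

```python
def meminfo_kb(text):
    """Parse /proc/meminfo-style text into {field: value}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values

def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return meminfo_kb(f.read())

# Illustrative sample, not taken from the affected guest.
sample = "MemTotal:  436552 kB\nMemFree:  5236 kB\nSwapFree:  476756 kB\n"
info = meminfo_kb(sample)
print("MemFree: %d kB of %d kB" % (info["MemFree"], info["MemTotal"]))
```

On the guest itself, read_meminfo() could be called in a loop and logged, so the last value before the hang ends up in the log.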

> [ 1287.634682] motion invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
> [ 1287.634719] motion cpuset=/ mems_allowed=0
> 
> 
> 
> 
> Thursday, August 5, 2010, 4:52:14 PM, you wrote:
> 
> > On Thu, Aug 05, 2010 at 11:48:44AM +0200, Sander Eikelenboom wrote:
> >> Hi Konrad/Jeremy,
> >> 
> >> I have tested the last 2 days with the vm's with passthroughed devices shutdown, and no freeze so far.
> >> I'm running now with one of the vm's that runs an old 2.6.33 kernel from an old tree from Konrad together with some hacked up patches for xhci/usb3 support.
> >> That seems to be running fine for some time now (although not a full 2 days yet).
> >> 
> >> So my other vm seems to cause the freeze.
> >> 
> >> - This one uses the devel/merge.2.6.35-rc6.t2 as domU kernel, i think i should try an older version of pci-front/xen-swiotlb perhaps.
> >> - It has both a usb2 and usb3 controller passed through, but the xhci module has much changed since the hacked up patches from the kernel in de working domU vm
> >> - Most probably the drivers for the videograbbers will have changed
> >> 
> >> So i suspect:
> >>    - newer pci-front / xen-swiotlb
> >>    - xhci/usb3 driver
> >>    - drivers videograbber
> >> 
> >> Most probable would be a roque dma transfer that can't be catched by xen / pciback I guess, and therefore would be hard to debug ?
> 
> > The SWIOTLB "brains" by themselves haven't changed since the
> > uhh...2.6.33. The code internals that just got Ack-ed upstream looks quite
> > similar to the one that Jeremy carries in xen/stable-2.6.32.x. The
> > outside plumbing parts are the ones that changed.
> 
> > The fixes in the pci-front, well, most of those are "burocractic" in
> > nature - set the ownership to this, make hotplug work, etc. The big
> > fixes were the MSI/MSI-X ones but those were big news a couple of months
> > ago (and I think that was when 2.6.34 came out).
> 
> > The videograbber (vl4) stack trace you sent to me some time ago looked
> > liked a mutex was held for a very very long time... which I wonder if
> > that is the cmpxch compiler bug that has hit some folks. Are you using
> > Debian?
> 
> > But we can do something easy. I can rebase my 2.6.33 kernel with the
> > latest Xen-SWIOTLB/SWIOTLB engine + Xen PCI front, and we can eliminate the
> > SWIOTLB/PCIfront being at fault here.. Let me do that if your  2.6.33
> > VM guest is running fine for the last two days.
> 
> 
> 
> 
> -- 
> Best regards,
>  Sander                            mailto:linux@eikelenboom.it


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-06 15:17           ` Konrad Rzeszutek Wilk
@ 2010-08-06 20:44             ` Jeremy Fitzhardinge
  2010-08-08 13:54             ` Sander Eikelenboom
  2010-08-08 16:57             ` Sander Eikelenboom
  2 siblings, 0 replies; 14+ messages in thread
From: Jeremy Fitzhardinge @ 2010-08-06 20:44 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Sander Eikelenboom, xen-devel@lists.xensource.com, Keir Fraser

  On 08/06/2010 08:17 AM, Konrad Rzeszutek Wilk wrote:
> On Fri, Aug 06, 2010 at 11:21:11AM +0200, Sander Eikelenboom wrote:
>> Hi Konrad,
>>
>> Hmm it seems that 2.6.33 tree does seem to work for 1 VM with a videograbber, but doesn't for the VM which seem to cause the freeze.
>> It does spit out some stacktraces after a while of not functioning, with since is OOM i will be something else caused by the fall out and not anywhere near the root cause.
>> Although this at least didn't freeze the complete system :-)
>> I will try some more configurations to see if i can find a pattern somehow ...
>>
>> --
>> Sander
>>
>> [ 1269.032133] submit of urb 0 failed (error=-90)
>> [ 1274.153341] motion: page allocation failure. order:6, mode:0xd4
> That is a 256kB request for memery.
>> [ 1274.153375] Pid: 1884, comm: motion Not tainted 2.6.33 #5
>> [ 1274.153391] Call Trace:
>> [ 1274.153416]  [<ffffffff810e4665>] __alloc_pages_nodemask+0x5b2/0x62b
>> [ 1274.153440]  [<ffffffff810338b9>] ? xen_force_evtchn_callback+0xd/0xf
>> [ 1274.153461]  [<ffffffff810e46f5>] __get_free_pages+0x17/0x5f
>> [ 1274.153483]  [<ffffffff8128042e>] xen_swiotlb_alloc_coherent+0x3c/0xe2
>> [ 1274.153507]  [<ffffffff81410931>] hcd_buffer_alloc+0xfa/0x11f
>> [ 1274.153527]  [<ffffffff81403e0c>] usb_buffer_alloc+0x17/0x1d
>> [ 1274.153562]  [<ffffffffa003f39e>] em28xx_init_isoc+0x16a/0x32b [em28xx]
>> [ 1274.153585]  [<ffffffff815ec0b9>] ? __down_read+0x47/0xed
>> [ 1274.153613]  [<ffffffffa003a4ac>] buffer_prepare+0xd7/0x10d [em28xx]
>> [ 1274.153639]  [<ffffffffa0016dac>] videobuf_qbuf+0x308/0x3f4 [videobuf_core]
>> [ 1274.153667]  [<ffffffffa0039cb3>] vidioc_qbuf+0x35/0x3a [em28xx]
>> [ 1274.153697]  [<ffffffffa0028229>] __video_do_ioctl+0x11ab/0x373b [videodev]
>> [ 1274.153720]  [<ffffffff814b51cd>] ? sock_def_readable+0x54/0x5f
>> [ 1274.153743]  [<ffffffff81541f65>] ? unix_dgram_sendmsg+0x3f1/0x43e
>> [ 1274.153764]  [<ffffffff810313b5>] ? __raw_callee_save_xen_pud_val+0x11/0x1e
>> [ 1274.153793]  [<ffffffffa0039c7e>] ? vidioc_qbuf+0x0/0x3a [em28xx]
>> [ 1274.153814]  [<ffffffff814b208b>] ? sock_sendmsg+0xa3/0xbc
>> [ 1274.153837]  [<ffffffff8123349b>] ? avc_has_perm+0x4e/0x60
>> [ 1274.153855]  [<ffffffff810338b9>] ? xen_force_evtchn_callback+0xd/0xf
>> [ 1274.153880]  [<ffffffffa002aab1>] video_ioctl2+0x2f8/0x3af [videodev]
>> [ 1274.153901]  [<ffffffff810357df>] ? __switch_to+0x265/0x277
>> [ 1274.153924]  [<ffffffffa0026122>] v4l2_ioctl+0x38/0x3a [videodev]
>> [ 1274.153944]  [<ffffffff8111ec90>] vfs_ioctl+0x72/0x9e
>> [ 1274.153961]  [<ffffffff8111f1d7>] do_vfs_ioctl+0x4a0/0x4e1
>> [ 1274.153980]  [<ffffffff8111f26d>] sys_ioctl+0x55/0x77
>> [ 1274.154000]  [<ffffffff81112e6a>] ? sys_write+0x60/0x70
>> [ 1274.154009]  [<ffffffff81036cc2>] system_call_fastpath+0x16/0x1b
>> [ 1274.154126] Mem-Info:
>> [ 1274.154138] DMA per-cpu:
>> [ 1274.154151] CPU    0: hi:    0, btch:   1 usd:   0
>> [ 1274.154165] CPU    1: hi:    0, btch:   1 usd:   0
>> [ 1274.154180] DMA32 per-cpu:
>> [ 1274.154202] CPU    0: hi:  186, btch:  31 usd:   0
>> [ 1274.154220] CPU    1: hi:  186, btch:  31 usd:  78
>> [ 1274.154241] active_anon:248 inactive_anon:326 isolated_anon:0
>> [ 1274.154244]  active_file:132 inactive_file:105 isolated_file:41
>> [ 1274.154247]  unevictable:0 dirty:0 writeback:19 unstable:0
>> [ 1274.154250]  free:1309 slab_reclaimable:642 slab_unreclaimable:3111
>> [ 1274.154254]  mapped:100846 shmem:4 pagetables:1187 bounce:0
>> [ 1274.154313] DMA free:2036kB min:80kB low:100kB high:120kB active_anon:0kB inactive_anon:24kB active_file:20kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:14752kB mlocked:0kB dirty:0kB writeback:0kB mapped:12804kB shmem:0kB slab_reclaimable:16kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:24kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>> [ 1274.154375] lowmem_reserve[]: 0 489 489 489
>> [ 1274.154415] DMA32 free:3200kB min:2788kB low:3484kB high:4180kB active_anon:992kB inactive_anon:1280kB active_file:508kB inactive_file:420kB unevictable:0kB isolated(anon):0kB isolated(file):164kB present:500960kB mlocked:0kB dirty:0kB writeback:76kB mapped:390580kB shmem:16kB slab_reclaimable:2552kB slab_unreclaimable:12404kB kernel_stack:592kB pagetables:4724kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:160 all_unreclaimable? no
>> [ 1274.154481] lowmem_reserve[]: 0 0 0 0
>> [ 1274.154508] DMA: 7*4kB 1*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2036kB
>> [ 1274.154571] DMA32: 409*4kB 33*8kB 2*16kB 0*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 3212kB
>> [ 1274.154634] 429 total pagecache pages
>> [ 1274.154646] 161 pages in swap cache
>> [ 1274.154658] Swap cache stats: add 344422, delete 344260, find 99167/143153
>> [ 1274.154673] Free swap  = 476756kB
>> [ 1274.154684] Total swap = 524280kB
>> [ 1274.160880] 131072 pages RAM
>> [ 1274.160902] 21934 pages reserved
>> [ 1274.160914] 101195 pages shared
>> [ 1274.160925] 6309 pages non-shared
>> [ 1274.160963] unable to allocate 185088 bytes for transfer buffer 4
> Though here it says it is 185 kbytes. Hmm.. You got 3MB in DMA32 and 2MB
> in DMA so that should be enough.
>
> I am not that familiar with the VM, so the instinctive thing I can think
> of is to raise the amount of memory your guest has from the 512MB to
> 768MB. Does 'proc/meminfo' when this happens show you an excedingly
> small amount of MemFree?

Memory allocations are rounded up to the next order, so 185k -> 256k.  
It's also a contiguous allocation, so it needs to find 64 contiguous 
pages, which is pretty much impossible in a system which has been 
running for a while.
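
The rounding can be sketched as follows. This is an illustration only, assuming 4kB pages; get_order here is a plain Python stand-in written for this mail, not the kernel function.

```python
PAGE_SIZE = 4096  # assumed 4kB pages, as on x86_64

def get_order(size):
    """Smallest order such that 2**order pages hold `size` bytes."""
    pages = -(-size // PAGE_SIZE)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order

# Size taken from the "unable to allocate 185088 bytes" message.
order = get_order(185088)
pages = 1 << order
print(order, pages, pages * PAGE_SIZE // 1024)
```

185088 bytes need 46 pages, and the next power of two is 64, hence the order:6 request in the log.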

     J


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-06 15:17           ` Konrad Rzeszutek Wilk
  2010-08-06 20:44             ` Jeremy Fitzhardinge
@ 2010-08-08 13:54             ` Sander Eikelenboom
  2010-08-08 16:57             ` Sander Eikelenboom
  2 siblings, 0 replies; 14+ messages in thread
From: Sander Eikelenboom @ 2010-08-08 13:54 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Keir Fraser

Hi Konrad/Jeremy,

Previously I had a working setup with the same VM I'm using now, but quite a few things have changed:

- Another motherboard
- Another processor
- Another USB controller (now usb3 pci-e instead of usb2 pci)

The freezes seem to be related to the USB3 controller, but I'm not sure, since the new setup may also produce a different workload.
What i have tried:
- latest xen-4.0-testing compiled with debug=y
- for dom0: latest 2.6.32 pvops stable kernel from Jeremy
- for dom0: latest 2.6.32 xen-next kernel from Jeremy
- added some kernel debug options (apart from some noise about hardirqs this doesn't seem to deliver much)
- for domU: 2.6.35-rc6 tree (devel/2.6.25-rc6-t2) from Konrad's tree.
- I also tried running without msi

It always freezes at some point, but the time until the freeze varies; most of the time it happens within a day.

- Ran another memtest to be sure it isn't faulty memory; cooling is on max and temperatures are good, so those seem to be ruled out as well.


- I tried using the IOMMU, which one would expect to rule out stray DMA, but it also froze with the IOMMU enabled and working, and again nothing in the serial log :-(

The things i'm about to try after i have finished some backups are:
- xen-unstable
- running the VM as an HVM guest



What i'm wondering about is:

- Could the 4GB memory barrier still be a problem? The machine has 8GB, and the domain is normally started as one of the last, by which point the running domains total around 4GB.
  Last night I let it run with only the troublesome PV domain, and it seems to work so far.
- Is there a way to force a domain to live below 4GB, or anything else I could try (besides ripping the RAM out of the machine)?
- Are there any other options that could prevent a full freeze by making things stricter, or that provide additional debug info?

--
Sander


* Re: [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
  2010-08-06 15:17           ` Konrad Rzeszutek Wilk
  2010-08-06 20:44             ` Jeremy Fitzhardinge
  2010-08-08 13:54             ` Sander Eikelenboom
@ 2010-08-08 16:57             ` Sander Eikelenboom
  2 siblings, 0 replies; 14+ messages in thread
From: Sander Eikelenboom @ 2010-08-08 16:57 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Keir Fraser

[-- Attachment #1: Type: text/plain, Size: 22490 bytes --]

Hi Konrad,

This time the grabbing application hung again in the VM. It seems you are right: available memory is down to 0.
It always used to work with 512MB assigned to the domain.

Most probably a bug in the xhci code, I assume?

Attached: Some hopefully relevant data from /proc


--
Sander



Aug  8 20:16:17 security kernel: [  721.555787] BUG: soft lockup - CPU#0 stuck for 82s! [kmemleak:374]
Aug  8 20:16:17 security kernel: [  721.555790] Modules linked in: fuse saa7115 em28xx v4l2_common videodev v4l1_compat v4l2_compat_ioctl32 videobuf_vmalloc videobuf_core tveeprom evdev i2c_core pcspkr thermal_sys [last unloaded: scsi_wait_scan]
Aug  8 20:16:17 security kernel: [  721.555814] CPU 0 
Aug  8 20:16:17 security kernel: [  721.555816] Modules linked in: fuse saa7115 em28xx v4l2_common videodev v4l1_compat v4l2_compat_ioctl32 videobuf_vmalloc videobuf_core tveeprom evdev i2c_core pcspkr thermal_sys [last unloaded: scsi_wait_scan]
Aug  8 20:16:17 security kernel: [  721.555838] 
Aug  8 20:16:17 security kernel: [  721.555841] Pid: 374, comm: kmemleak Not tainted 2.6.35-rc6+xen-2.6.35-rc6-xen-isoc-20100808-l3-mutex-dma-ed+ #7 /
Aug  8 20:16:17 security kernel: [  721.555847] RIP: e030:[<ffffffff81006318>]  [<ffffffff81006318>] xen_restore_fl_direct+0x18/0x1b
Aug  8 20:16:17 security kernel: [  721.555858] RSP: e02b:ffff88001d8abe40  EFLAGS: 00000246
Aug  8 20:16:17 security kernel: [  721.555861] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88001f7626d0
Aug  8 20:16:17 security kernel: [  721.555865] RDX: 0000000000000000 RSI: 0000000000000200 RDI: 0000000000000200
Aug  8 20:16:17 security kernel: [  721.555869] RBP: 0000000000000001 R08: fffc000000000000 R09: ffff88001d8abdb0
Aug  8 20:16:17 security kernel: [  721.555873] R10: 000000000000000c R11: ffffea00002fdef8 R12: 0000000000000200
Aug  8 20:16:17 security kernel: [  721.555877] R13: 0000000000000000 R14: ffffea00002fdf01 R15: 0000000000000001
Aug  8 20:16:17 security kernel: [  721.555886] FS:  00007fc794dfc910(0000) GS:ffff880002ced000(0000) knlGS:0000000000000000
Aug  8 20:16:17 security kernel: [  721.555891] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug  8 20:16:17 security kernel: [  721.555894] CR2: 0000000001840078 CR3: 000000001e40b000 CR4: 0000000000000660
Aug  8 20:16:17 security kernel: [  721.559780] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug  8 20:16:17 security kernel: [  721.559780] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug  8 20:16:17 security kernel: [  721.559780] Process kmemleak (pid: 374, threadinfo ffff88001d8aa000, task ffff88001fd918d0)
Aug  8 20:16:17 security kernel: [  721.559780] Stack:
Aug  8 20:16:17 security kernel: [  721.559780]  ffffffff8142d77e ffffffff810c638e 0000000000000000 ffff8800126e0260
Aug  8 20:16:17 security kernel: [  721.559780] <0> ffffea00002fdf00 ffffffff810c6967 ffff8800149b02b0 ffffea00002fded0
Aug  8 20:16:17 security kernel: [  721.559780] <0> 000000000000dad6 ffffea00002fdf08 0000000000020000 0000000000000000
Aug  8 20:16:17 security kernel: [  721.559780] Call Trace:
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff8142d77e>] ? _raw_read_unlock_irqrestore+0xd/0xe
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c638e>] ? find_and_get_object+0x4a/0x75
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c6967>] ? scan_block+0x4a/0xf7
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c6ce9>] ? kmemleak_scan+0x1a2/0x3e9
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c737c>] ? kmemleak_scan_thread+0x0/0x9b
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c737c>] ? kmemleak_scan_thread+0x0/0x9b
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c73d5>] ? kmemleak_scan_thread+0x59/0x9b
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff81054051>] ? kthread+0x79/0x81
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810094e4>] ? kernel_thread_helper+0x4/0x10
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810088e3>] ? int_ret_from_sys_call+0x7/0x1b
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff8142dadd>] ? retint_restore_args+0x5/0x6
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810094e0>] ? kernel_thread_helper+0x0/0x10
Aug  8 20:16:17 security kernel: [  721.559780] Code: 44 00 00 65 f6 04 25 21 b0 00 00 ff 0f 94 c4 00 e4 c3 90 66 f7 c7 00 02 65 0f 94 04 25 21 b0 00 00 65 66 83 3c 25 20 b0 00 00 01 <74> 05 e8 01 00 00 00 c3 50 51 52 56 57 41 50 41 51 41 52 41 53 
Aug  8 20:16:17 security kernel: [  721.559780] Call Trace:
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff8142d77e>] ? _raw_read_unlock_irqrestore+0xd/0xe
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c638e>] ? find_and_get_object+0x4a/0x75
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c6967>] ? scan_block+0x4a/0xf7
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c6ce9>] ? kmemleak_scan+0x1a2/0x3e9
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c737c>] ? kmemleak_scan_thread+0x0/0x9b
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c737c>] ? kmemleak_scan_thread+0x0/0x9b
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810c73d5>] ? kmemleak_scan_thread+0x59/0x9b
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff81054051>] ? kthread+0x79/0x81
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810094e4>] ? kernel_thread_helper+0x4/0x10
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810088e3>] ? int_ret_from_sys_call+0x7/0x1b
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff8142dadd>] ? retint_restore_args+0x5/0x6
Aug  8 20:16:17 security kernel: [  721.559780]  [<ffffffff810094e0>] ? kernel_thread_helper+0x0/0x10
Aug  8 20:16:19 security kernel: [  724.187104] kmemleak: 5 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
Aug  8 20:16:46 security motion: [0] Thread 1 - Watchdog timeout, trying to do a graceful restart
Aug  8 20:17:01 security /USR/SBIN/CRON[1865]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Aug  8 20:17:46 security motion: [0] Thread 1 - Watchdog timeout, did NOT restart graceful,killing it!
Aug  8 20:17:46 security motion: [0] Calling vid_close() from motion_cleanup
Aug  8 20:17:46 security motion: [0] Closing video device /dev/kworld
Aug  8 20:20:17 security kernel: [  961.780121] INFO: task motion:1257 blocked for more than 120 seconds.
Aug  8 20:20:17 security kernel: [  961.780155] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  8 20:20:17 security kernel: [  961.780177] motion        D ffff88001ef60bc0     0  1257      1 0x00000000
Aug  8 20:20:17 security kernel: [  961.780207]  ffff88001fd155d0 0000000000000282 ffffffff81005cc5 00000000000145c0
Aug  8 20:20:17 security kernel: [  961.780243]  ffff88001e41dfd8 ffff88001e41dfd8 ffff88001ef60930 00000000000145c0
Aug  8 20:20:17 security kernel: [  961.780278]  00000000000145c0 00000000000145c0 ffff88001ef60930 0000000000000000
Aug  8 20:20:17 security kernel: [  961.780313] Call Trace:
Aug  8 20:20:17 security kernel: [  961.780337]  [<ffffffff81005cc5>] ? xen_force_evtchn_callback+0x9/0xa
Aug  8 20:20:17 security kernel: [  961.780365]  [<ffffffffa002eaa6>] ? video_ioctl2+0x0/0x32e [videodev]
Aug  8 20:20:17 security kernel: [  961.780388]  [<ffffffff8142c544>] ? __mutex_lock_slowpath+0x12f/0x22c
Aug  8 20:20:17 security kernel: [  961.780409]  [<ffffffff8142c64a>] ? mutex_lock+0x9/0x1e
Aug  8 20:20:17 security kernel: [  961.780430]  [<ffffffffa0017e58>] ? videobuf_streamoff+0x13/0x35 [videobuf_core]
Aug  8 20:20:17 security kernel: [  961.780454]  [<ffffffff81005cc5>] ? xen_force_evtchn_callback+0x9/0xa
Aug  8 20:20:17 security kernel: [  961.780478]  [<ffffffffa003d573>] ? vidioc_streamoff+0x7e/0xb5 [em28xx]
Aug  8 20:20:17 security kernel: [  961.780500]  [<ffffffffa002c5fe>] ? __video_do_ioctl+0x181f/0x3cc7 [videodev]
Aug  8 20:20:17 security kernel: [  961.780523]  [<ffffffff8100631f>] ? xen_restore_fl_direct_end+0x0/0x1
Aug  8 20:20:17 security kernel: [  961.780544]  [<ffffffff8142d714>] ? _raw_spin_unlock_irqrestore+0xc/0xd
Aug  8 20:20:17 security kernel: [  961.780564]  [<ffffffff813941dd>] ? sock_def_readable+0x3b/0x5d
Aug  8 20:20:17 security kernel: [  961.780585]  [<ffffffff814043a6>] ? unix_dgram_sendmsg+0x428/0x4b2
Aug  8 20:20:17 security kernel: [  961.780606]  [<ffffffff810058fa>] ? xen_set_pte_at+0x196/0x1b6
Aug  8 20:20:17 security kernel: [  961.780625]  [<ffffffff810036bd>] ? __raw_callee_save_xen_make_pte+0x11/0x1e
Aug  8 20:20:17 security kernel: [  961.780648]  [<ffffffff8139115e>] ? sock_sendmsg+0xd1/0xec
Aug  8 20:20:17 security kernel: [  961.780669]  [<ffffffff810b0b00>] ? __do_fault+0x40f/0x44a
Aug  8 20:20:17 security kernel: [  961.780689]  [<ffffffff81005cc5>] ? xen_force_evtchn_callback+0x9/0xa
Aug  8 20:20:17 security kernel: [  961.780709]  [<ffffffff81006332>] ? check_events+0x12/0x20
Aug  8 20:20:17 security kernel: [  961.780730]  [<ffffffffa002ed38>] ? video_ioctl2+0x292/0x32e [videodev]
Aug  8 20:20:17 security kernel: [  961.780750]  [<ffffffff81002616>] ? xen_write_msr_safe+0x5d/0x79
Aug  8 20:20:17 security kernel: [  961.780770]  [<ffffffff81007337>] ? __switch_to+0x1b3/0x2a4
Aug  8 20:20:17 security kernel: [  961.780790]  [<ffffffff8100622a>] ? xen_sched_clock+0xf/0x8c
Aug  8 20:20:17 security kernel: [  961.780810]  [<ffffffff81005cc5>] ? xen_force_evtchn_callback+0x9/0xa
Aug  8 20:20:17 security kernel: [  961.780830]  [<ffffffff81006332>] ? check_events+0x12/0x20
Aug  8 20:20:17 security kernel: [  961.780850]  [<ffffffffa002a10b>] ? v4l2_ioctl+0x38/0x3a [videodev]
Aug  8 20:20:17 security kernel: [  961.780870]  [<ffffffff810d54be>] ? vfs_ioctl+0x69/0x92
Aug  8 20:20:17 security kernel: [  961.780889]  [<ffffffff810d596e>] ? do_vfs_ioctl+0x411/0x43c
Aug  8 20:20:17 security kernel: [  961.780909]  [<ffffffff810c96b4>] ? vfs_write+0x134/0x169
Aug  8 20:20:17 security kernel: [  961.780928]  [<ffffffff810d59ea>] ? sys_ioctl+0x51/0x70
Aug  8 20:20:17 security kernel: [  961.780947]  [<ffffffff810086c2>] ? system_call_fastpath+0x16/0x1b
Aug  8 20:22:17 security kernel: [ 1081.780140] INFO: task motion:1257 blocked for more than 120 seconds.
Aug  8 20:22:17 security kernel: [ 1081.780172] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  8 20:22:17 security kernel: [ 1081.780194] motion        D ffff88001ef60bc0     0  1257      1 0x00000000
Aug  8 20:22:17 security kernel: [ 1081.780224]  ffff88001fd155d0 0000000000000282 ffffffff81005cc5 00000000000145c0
Aug  8 20:22:17 security kernel: [ 1081.780261]  ffff88001e41dfd8 ffff88001e41dfd8 ffff88001ef60930 00000000000145c0
Aug  8 20:22:17 security kernel: [ 1081.780295]  00000000000145c0 00000000000145c0 ffff88001ef60930 0000000000000000
Aug  8 20:22:17 security kernel: [ 1081.780330] Call Trace:
Aug  8 20:22:17 security kernel: [ 1081.780355]  [<ffffffff81005cc5>] ? xen_force_evtchn_callback+0x9/0xa
Aug  8 20:22:17 security kernel: [ 1081.780382]  [<ffffffffa002eaa6>] ? video_ioctl2+0x0/0x32e [videodev]
Aug  8 20:22:17 security kernel: [ 1081.780405]  [<ffffffff8142c544>] ? __mutex_lock_slowpath+0x12f/0x22c
Aug  8 20:22:17 security kernel: [ 1081.780426]  [<ffffffff8142c64a>] ? mutex_lock+0x9/0x1e
Aug  8 20:22:17 security kernel: [ 1081.780447]  [<ffffffffa0017e58>] ? videobuf_streamoff+0x13/0x35 [videobuf_core]
Aug  8 20:22:17 security kernel: [ 1081.780471]  [<ffffffff81005cc5>] ? xen_force_evtchn_callback+0x9/0xa
Aug  8 20:22:17 security kernel: [ 1081.780495]  [<ffffffffa003d573>] ? vidioc_streamoff+0x7e/0xb5 [em28xx]
Aug  8 20:22:17 security kernel: [ 1081.780517]  [<ffffffffa002c5fe>] ? __video_do_ioctl+0x181f/0x3cc7 [videodev]
Aug  8 20:22:17 security kernel: [ 1081.780540]  [<ffffffff8100631f>] ? xen_restore_fl_direct_end+0x0/0x1
Aug  8 20:22:17 security kernel: [ 1081.780561]  [<ffffffff8142d714>] ? _raw_spin_unlock_irqrestore+0xc/0xd
Aug  8 20:22:17 security kernel: [ 1081.780581]  [<ffffffff813941dd>] ? sock_def_readable+0x3b/0x5d
Aug  8 20:22:17 security kernel: [ 1081.780602]  [<ffffffff814043a6>] ? unix_dgram_sendmsg+0x428/0x4b2
Aug  8 20:22:17 security kernel: [ 1081.780622]  [<ffffffff810058fa>] ? xen_set_pte_at+0x196/0x1b6
Aug  8 20:22:17 security kernel: [ 1081.780642]  [<ffffffff810036bd>] ? __raw_callee_save_xen_make_pte+0x11/0x1e
Aug  8 20:22:17 security kernel: [ 1081.780666]  [<ffffffff8139115e>] ? sock_sendmsg+0xd1/0xec
Aug  8 20:22:17 security kernel: [ 1081.780686]  [<ffffffff810b0b00>] ? __do_fault+0x40f/0x44a
Aug  8 20:22:17 security kernel: [ 1081.780706]  [<ffffffff81005cc5>] ? xen_force_evtchn_callback+0x9/0xa
Aug  8 20:22:17 security kernel: [ 1081.780726]  [<ffffffff81006332>] ? check_events+0x12/0x20
Aug  8 20:22:17 security kernel: [ 1081.780747]  [<ffffffffa002ed38>] ? video_ioctl2+0x292/0x32e [videodev]
Aug  8 20:22:17 security kernel: [ 1081.780767]  [<ffffffff81002616>] ? xen_write_msr_safe+0x5d/0x79
Aug  8 20:22:17 security kernel: [ 1081.780787]  [<ffffffff81007337>] ? __switch_to+0x1b3/0x2a4
Aug  8 20:22:17 security kernel: [ 1081.780806]  [<ffffffff8100622a>] ? xen_sched_clock+0xf/0x8c
Aug  8 20:22:17 security kernel: [ 1081.780826]  [<ffffffff81005cc5>] ? xen_force_evtchn_callback+0x9/0xa
Aug  8 20:22:17 security kernel: [ 1081.780847]  [<ffffffff81006332>] ? check_events+0x12/0x20
Aug  8 20:22:17 security kernel: [ 1081.780866]  [<ffffffffa002a10b>] ? v4l2_ioctl+0x38/0x3a [videodev]
Aug  8 20:22:17 security kernel: [ 1081.780886]  [<ffffffff810d54be>] ? vfs_ioctl+0x69/0x92
Aug  8 20:22:17 security kernel: [ 1081.780905]  [<ffffffff810d596e>] ? do_vfs_ioctl+0x411/0x43c
Aug  8 20:22:17 security kernel: [ 1081.780925]  [<ffffffff810c96b4>] ? vfs_write+0x134/0x169
Aug  8 20:22:17 security kernel: [ 1081.780943]  [<ffffffff810d59ea>] ? sys_ioctl+0x51/0x70
Aug  8 20:22:17 security kernel: [ 1081.780961]  [<ffffffff810086c2>] ? system_call_fastpath+0x16/0x1b






Friday, August 6, 2010, 5:17:43 PM, you wrote:

> On Fri, Aug 06, 2010 at 11:21:11AM +0200, Sander Eikelenboom wrote:
>> Hi Konrad,
>> 
>> Hmm it seems that 2.6.33 tree does seem to work for 1 VM with a videograbber, but doesn't for the VM which seem to cause the freeze.
>> It does spit out some stacktraces after a while of not functioning, with since is OOM i will be something else caused by the fall out and not anywhere near the root cause.
>> Although this at least didn't freeze the complete system :-)
>> I will try some more configurations to see if i can find a pattern somehow ...
>> 
>> --
>> Sander
>> 
>> [ 1269.032133] submit of urb 0 failed (error=-90)
>> [ 1274.153341] motion: page allocation failure. order:6, mode:0xd4

> That is a 256kB request for memery.
>> [ 1274.153375] Pid: 1884, comm: motion Not tainted 2.6.33 #5
>> [ 1274.153391] Call Trace:
>> [ 1274.153416]  [<ffffffff810e4665>] __alloc_pages_nodemask+0x5b2/0x62b
>> [ 1274.153440]  [<ffffffff810338b9>] ? xen_force_evtchn_callback+0xd/0xf
>> [ 1274.153461]  [<ffffffff810e46f5>] __get_free_pages+0x17/0x5f
>> [ 1274.153483]  [<ffffffff8128042e>] xen_swiotlb_alloc_coherent+0x3c/0xe2
>> [ 1274.153507]  [<ffffffff81410931>] hcd_buffer_alloc+0xfa/0x11f
>> [ 1274.153527]  [<ffffffff81403e0c>] usb_buffer_alloc+0x17/0x1d
>> [ 1274.153562]  [<ffffffffa003f39e>] em28xx_init_isoc+0x16a/0x32b [em28xx]
>> [ 1274.153585]  [<ffffffff815ec0b9>] ? __down_read+0x47/0xed
>> [ 1274.153613]  [<ffffffffa003a4ac>] buffer_prepare+0xd7/0x10d [em28xx]
>> [ 1274.153639]  [<ffffffffa0016dac>] videobuf_qbuf+0x308/0x3f4 [videobuf_core]
>> [ 1274.153667]  [<ffffffffa0039cb3>] vidioc_qbuf+0x35/0x3a [em28xx]
>> [ 1274.153697]  [<ffffffffa0028229>] __video_do_ioctl+0x11ab/0x373b [videodev]
>> [ 1274.153720]  [<ffffffff814b51cd>] ? sock_def_readable+0x54/0x5f
>> [ 1274.153743]  [<ffffffff81541f65>] ? unix_dgram_sendmsg+0x3f1/0x43e
>> [ 1274.153764]  [<ffffffff810313b5>] ? __raw_callee_save_xen_pud_val+0x11/0x1e
>> [ 1274.153793]  [<ffffffffa0039c7e>] ? vidioc_qbuf+0x0/0x3a [em28xx]
>> [ 1274.153814]  [<ffffffff814b208b>] ? sock_sendmsg+0xa3/0xbc
>> [ 1274.153837]  [<ffffffff8123349b>] ? avc_has_perm+0x4e/0x60
>> [ 1274.153855]  [<ffffffff810338b9>] ? xen_force_evtchn_callback+0xd/0xf
>> [ 1274.153880]  [<ffffffffa002aab1>] video_ioctl2+0x2f8/0x3af [videodev]
>> [ 1274.153901]  [<ffffffff810357df>] ? __switch_to+0x265/0x277
>> [ 1274.153924]  [<ffffffffa0026122>] v4l2_ioctl+0x38/0x3a [videodev]
>> [ 1274.153944]  [<ffffffff8111ec90>] vfs_ioctl+0x72/0x9e
>> [ 1274.153961]  [<ffffffff8111f1d7>] do_vfs_ioctl+0x4a0/0x4e1
>> [ 1274.153980]  [<ffffffff8111f26d>] sys_ioctl+0x55/0x77
>> [ 1274.154000]  [<ffffffff81112e6a>] ? sys_write+0x60/0x70
>> [ 1274.154009]  [<ffffffff81036cc2>] system_call_fastpath+0x16/0x1b
>> [ 1274.154126] Mem-Info:
>> [ 1274.154138] DMA per-cpu:
>> [ 1274.154151] CPU    0: hi:    0, btch:   1 usd:   0
>> [ 1274.154165] CPU    1: hi:    0, btch:   1 usd:   0
>> [ 1274.154180] DMA32 per-cpu:
>> [ 1274.154202] CPU    0: hi:  186, btch:  31 usd:   0
>> [ 1274.154220] CPU    1: hi:  186, btch:  31 usd:  78
>> [ 1274.154241] active_anon:248 inactive_anon:326 isolated_anon:0
>> [ 1274.154244]  active_file:132 inactive_file:105 isolated_file:41
>> [ 1274.154247]  unevictable:0 dirty:0 writeback:19 unstable:0
>> [ 1274.154250]  free:1309 slab_reclaimable:642 slab_unreclaimable:3111
>> [ 1274.154254]  mapped:100846 shmem:4 pagetables:1187 bounce:0
>> [ 1274.154313] DMA free:2036kB min:80kB low:100kB high:120kB active_anon:0kB inactive_anon:24kB active_file:20kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:14752kB mlocked:0kB dirty:0kB writeback:0kB mapped:12804kB shmem:0kB slab_reclaimable:16kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:24kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>> [ 1274.154375] lowmem_reserve[]: 0 489 489 489
>> [ 1274.154415] DMA32 free:3200kB min:2788kB low:3484kB high:4180kB active_anon:992kB inactive_anon:1280kB active_file:508kB inactive_file:420kB unevictable:0kB isolated(anon):0kB isolated(file):164kB present:500960kB mlocked:0kB dirty:0kB writeback:76kB mapped:390580kB shmem:16kB slab_reclaimable:2552kB slab_unreclaimable:12404kB kernel_stack:592kB pagetables:4724kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:160 all_unreclaimable? no
>> [ 1274.154481] lowmem_reserve[]: 0 0 0 0
>> [ 1274.154508] DMA: 7*4kB 1*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2036kB
>> [ 1274.154571] DMA32: 409*4kB 33*8kB 2*16kB 0*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 3212kB
>> [ 1274.154634] 429 total pagecache pages
>> [ 1274.154646] 161 pages in swap cache
>> [ 1274.154658] Swap cache stats: add 344422, delete 344260, find 99167/143153
>> [ 1274.154673] Free swap  = 476756kB
>> [ 1274.154684] Total swap = 524280kB
>> [ 1274.160880] 131072 pages RAM
>> [ 1274.160902] 21934 pages reserved
>> [ 1274.160914] 101195 pages shared
>> [ 1274.160925] 6309 pages non-shared
>> [ 1274.160963] unable to allocate 185088 bytes for transfer buffer 4

> Though here it says it is 185 kbytes. Hmm.. You got 3MB in DMA32 and 2MB
> in DMA so that should be enough.
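
The 256kB and 185 kbyte figures are consistent once rounding is taken into account. A minimal sketch, assuming 4 kB pages and that the buffer is served by a power-of-two page allocation (as the __get_free_pages frame in the trace suggests): 185088 bytes rounds up to order 6. The same sketch parses the DMA32 free-list line from the log above and counts free blocks of at least that order. It does not model zone watermarks, and the helper names are made up for illustration.

```python
import math
import re

PAGE_SIZE = 4096  # bytes; assumption: 4 kB pages, as on x86_64


def alloc_order(nbytes):
    """Smallest order such that 2**order pages can hold nbytes."""
    pages = math.ceil(nbytes / PAGE_SIZE)
    return max(0, (pages - 1).bit_length())


def parse_free_list(line):
    """Map block size in kB -> free block count for one free-list log line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def blocks_at_or_above(free, order):
    """Count free blocks that are at least 2**order pages large."""
    min_kb = (PAGE_SIZE << order) // 1024
    return sum(count for size_kb, count in free.items() if size_kb >= min_kb)


# Line copied from the log quoted above.
dma32 = ("DMA32: 409*4kB 33*8kB 2*16kB 0*32kB 0*64kB 0*128kB "
         "1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 3212kB")

order = alloc_order(185088)
free = parse_free_list(dma32)
total_kb = sum(size_kb * count for size_kb, count in free.items())

print("order for 185088 bytes:", order)
print("block size at that order (kB):", (PAGE_SIZE << order) // 1024)
print("DMA32 free total (kB):", total_kb)
print("DMA32 free blocks at or above that order:",
      blocks_at_or_above(free, order))
```

The total free memory in a zone says little about whether a high-order request can succeed; what matters is the number of free blocks at or above the requested order, and even then the allocator's watermark checks (not modelled here) can still refuse the request.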

> I am not that familiar with the VM, so the instinctive thing I can think
> of is to raise the amount of memory your guest has from the 512MB to
> 768MB. Does '/proc/meminfo' when this happens show you an exceedingly
> small amount of MemFree?

>> [ 1287.634682] motion invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
>> [ 1287.634719] motion cpuset=/ mems_allowed=0
>> 
>> 
>> 
>> 
>> Thursday, August 5, 2010, 4:52:14 PM, you wrote:
>> 
>> > On Thu, Aug 05, 2010 at 11:48:44AM +0200, Sander Eikelenboom wrote:
>> >> Hi Konrad/Jeremy,
>> >> 
>> >> I have tested the last 2 days with the VMs with passed-through devices shut down, and no freeze so far.
>> >> I'm now running one of the VMs, which runs an old 2.6.33 kernel from an old tree from Konrad together with some hacked-up patches for xhci/usb3 support.
>> >> That seems to be running fine for some time now (although not a full 2 days yet).
>> >> 
>> >> So my other vm seems to cause the freeze.
>> >> 
>> >> - This one uses devel/merge.2.6.35-rc6.t2 as domU kernel; I think I should perhaps try an older version of pci-front/xen-swiotlb.
>> >> - It has both a usb2 and a usb3 controller passed through, but the xhci module has changed a lot since the hacked-up patches in the kernel of the working domU VM
>> >> - Most probably the drivers for the videograbbers will have changed
>> >> 
>> >> So i suspect:
>> >>    - newer pci-front / xen-swiotlb
>> >>    - xhci/usb3 driver
>> >>    - drivers videograbber
>> >> 
>> >> Most probable would be a rogue DMA transfer that can't be caught by xen / pciback I guess, and therefore would be hard to debug ?
>> 
>> > The SWIOTLB "brains" by themselves haven't changed since the
>> > uhh...2.6.33. The code internals that just got Ack-ed upstream looks quite
>> > similar to the one that Jeremy carries in xen/stable-2.6.32.x. The
>> > outside plumbing parts are the ones that changed.
>> 
>> > The fixes in the pci-front, well, most of those are "bureaucratic" in
>> > nature - set the ownership to this, make hotplug work, etc. The big
>> > fixes were the MSI/MSI-X ones but those were big news a couple of months
>> > ago (and I think that was when 2.6.34 came out).
>> 
>> > The videograbber (v4l) stack trace you sent to me some time ago looked
>> > like a mutex was held for a very very long time... which makes me wonder if
>> > that is the cmpxchg compiler bug that has hit some folks. Are you using
>> > Debian?
>> 
>> > But we can do something easy. I can rebase my 2.6.33 kernel with the
>> > latest Xen-SWIOTLB/SWIOTLB engine + Xen PCI front, and we can rule out the
>> > SWIOTLB/PCIfront being at fault here. Let me do that if your 2.6.33
>> > VM guest has been running fine for the last two days.
>> 
>> 
>> 
>> 
>> -- 
>> Best regards,
>>  Sander                            mailto:linux@eikelenboom.it



-- 
Best regards,
 Sander                            mailto:linux@eikelenboom.it

[-- Attachment #2: interrupts.txt --]
[-- Type: text/plain, Size: 1249 bytes --]

            CPU0       
  86:          0  xen-pirq-pcifront-msi-x  xhci_hcd
  87:    4664997  xen-pirq-pcifront-msi-x  xhci_hcd
1268:       2585   xen-dyn-event     eth0
1269:       7108   xen-dyn-event     blkif
1270:       4491   xen-dyn-event     blkif
1271:         28   xen-dyn-event     blkif
1272:        775   xen-dyn-event     hvc_console
1273:          1   xen-dyn-event     pcifront
1274:        642   xen-dyn-event     xenbus
1275:          0   xen-dyn-ipi       callfuncsingle0
1276:          0   xen-dyn-virq      debug0
1277:          0   xen-dyn-ipi       callfunc0
1278:          0   xen-dyn-ipi       resched0
1279:     631711   xen-dyn-virq      timer0
 NMI:          0   Non-maskable interrupts
 LOC:          0   Local timer interrupts
 SPU:          0   Spurious interrupts
 PMI:          0   Performance monitoring interrupts
 PND:          0   Performance pending work
 RES:          0   Rescheduling interrupts
 CAL:          0   Function call interrupts
 TLB:          0   TLB shootdowns
 TRM:          0   Thermal event interrupts
 THR:          0   Threshold APIC interrupts
 MCE:          0   Machine check exceptions
 MCP:          0   Machine check polls
 ERR:          0
 MIS:          0

[-- Attachment #3: meminfo.txt --]
[-- Type: text/plain, Size: 1021 bytes --]

MemTotal:         440068 kB
MemFree:            9128 kB
Buffers:            9576 kB
Cached:           285616 kB
SwapCached:            0 kB
Active:            65168 kB
Inactive:         272952 kB
Active(anon):      20516 kB
Inactive(anon):    22628 kB
Active(file):      44652 kB
Inactive(file):   250324 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:        524284 kB
SwapFree:         524284 kB
Dirty:                32 kB
Writeback:             0 kB
AnonPages:         42968 kB
Mapped:            11580 kB
Shmem:               216 kB
Slab:              80792 kB
SReclaimable:      16836 kB
SUnreclaim:        63956 kB
KernelStack:         520 kB
PageTables:         4168 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:      744316 kB
Committed_AS:     182284 kB
VmallocTotal:   34359738367 kB
VmallocUsed:        3556 kB
VmallocChunk:   34359734779 kB
DirectMap4k:      524288 kB
DirectMap2M:           0 kB
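
To answer the MemFree question from a dump like the one above, a small sketch along these lines could be used. It assumes the usual "Name: value kB" layout of /proc/meminfo; the function names are illustrative, and the sample values are copied from this attachment.

```python
def parse_meminfo(text):
    """Return {field name: value in kB} for the numeric lines of a meminfo dump."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def free_fraction(fields):
    """MemFree as a fraction of MemTotal."""
    return fields["MemFree"] / fields["MemTotal"]


# Values copied from the meminfo.txt attachment above.
SAMPLE = """\
MemTotal:         440068 kB
MemFree:            9128 kB
Buffers:            9576 kB
Cached:           285616 kB
"""

fields = parse_meminfo(SAMPLE)
print("MemFree: %d kB (%.1f%% of MemTotal)"
      % (fields["MemFree"], 100 * free_fraction(fields)))
```

On a live guest the same function can be fed `open("/proc/meminfo").read()`. Note that a low MemFree alone is not alarming when Cached is large, since page cache can usually be reclaimed.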

[-- Attachment #4: slabinfo.txt --]
[-- Type: text/plain, Size: 18549 bytes --]

slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
fuse_request           0      0    632    6    1 : tunables   54   27    0 : slabdata      0      0      0
fuse_inode             0      0    768    5    1 : tunables   54   27    0 : slabdata      0      0      0
bridge_fdb_cache       0      0     64   59    1 : tunables  120   60    0 : slabdata      0      0      0
flow_cache             0      0     96   40    1 : tunables  120   60    0 : slabdata      0      0      0
dm_raid1_read_record      0      0   1064    7    2 : tunables   24   12    0 : slabdata      0      0      0
dm_snap_tracked_chunk      0      0     24  144    1 : tunables  120   60    0 : slabdata      0      0      0
dm_snap_pending_exception      0      0     80   48    1 : tunables  120   60    0 : slabdata      0      0      0
dm_exception           0      0     32  112    1 : tunables  120   60    0 : slabdata      0      0      0
dm_mpath_io            0      0     16  202    1 : tunables  120   60    0 : slabdata      0      0      0
dm_crypt_io            0      0    152   25    1 : tunables  120   60    0 : slabdata      0      0      0
kcopyd_job             0      0    384   10    1 : tunables   54   27    0 : slabdata      0      0      0
io                     0      0     64   59    1 : tunables  120   60    0 : slabdata      0      0      0
dm_rq_clone_bio_info      0      0     16  202    1 : tunables  120   60    0 : slabdata      0      0      0
dm_rq_target_io        0      0    368   10    1 : tunables   54   27    0 : slabdata      0      0      0
dm_target_io           0      0     24  144    1 : tunables  120   60    0 : slabdata      0      0      0
dm_io                  0      0     40   92    1 : tunables  120   60    0 : slabdata      0      0      0
uhci_urb_priv          0      0     56   67    1 : tunables  120   60    0 : slabdata      0      0      0
cfq_io_context        17     30    128   30    1 : tunables  120   60    0 : slabdata      1      1      0
cfq_queue             18     34    232   17    1 : tunables  120   60    0 : slabdata      2      2      0
bsg_cmd                0      0    312   12    1 : tunables   54   27    0 : slabdata      0      0      0
mqueue_inode_cache      1      4    960    4    1 : tunables   54   27    0 : slabdata      1      1      0
extent_map             0      0     88   44    1 : tunables  120   60    0 : slabdata      0      0      0
extent_buffers         0      0    144   27    1 : tunables  120   60    0 : slabdata      0      0      0
extent_state           0      0    128   30    1 : tunables  120   60    0 : slabdata      0      0      0
btrfs_path_cache       0      0    144   27    1 : tunables  120   60    0 : slabdata      0      0      0
btrfs_transaction_cache      0      0    232   17    1 : tunables  120   60    0 : slabdata      0      0      0
btrfs_trans_handle_cache      0      0     64   59    1 : tunables  120   60    0 : slabdata      0      0      0
btrfs_inode_cache      0      0   1024    4    1 : tunables   54   27    0 : slabdata      0      0      0
nilfs2_btree_path_cache      0      0   1792    2    1 : tunables   24   12    0 : slabdata      0      0      0
nilfs2_segbuf_cache      0      0    232   17    1 : tunables  120   60    0 : slabdata      0      0      0
nilfs2_transaction_cache      0      0     40   92    1 : tunables  120   60    0 : slabdata      0      0      0
nilfs2_inode_cache      0      0    960    4    1 : tunables   54   27    0 : slabdata      0      0      0
jbd2_journal_handle      0      0     24  144    1 : tunables  120   60    0 : slabdata      0      0      0
jbd2_journal_head      0      0    112   34    1 : tunables  120   60    0 : slabdata      0      0      0
jbd2_revoke_table      0      0     16  202    1 : tunables  120   60    0 : slabdata      0      0      0
jbd2_revoke_record      0      0     32  112    1 : tunables  120   60    0 : slabdata      0      0      0
journal_handle         2    144     24  144    1 : tunables  120   60    0 : slabdata      1      1      0
journal_head          17     34    112   34    1 : tunables  120   60    0 : slabdata      1      1      0
revoke_table           4    202     16  202    1 : tunables  120   60    0 : slabdata      1      1      0
revoke_record          0      0     32  112    1 : tunables  120   60    0 : slabdata      0      0      0
ext4_inode_cache       0      0   1008    4    1 : tunables   54   27    0 : slabdata      0      0      0
ext4_xattr             0      0     88   44    1 : tunables  120   60    0 : slabdata      0      0      0
ext4_free_block_extents      0      0     56   67    1 : tunables  120   60    0 : slabdata      0      0      0
ext4_alloc_context      0      0    144   27    1 : tunables  120   60    0 : slabdata      0      0      0
ext4_prealloc_space      0      0    104   37    1 : tunables  120   60    0 : slabdata      0      0      0
ext4_system_zone       0      0     40   92    1 : tunables  120   60    0 : slabdata      0      0      0
ext2_inode_cache       0      0    824    4    1 : tunables   54   27    0 : slabdata      0      0      0
ext2_xattr             0      0     88   44    1 : tunables  120   60    0 : slabdata      0      0      0
ext3_inode_cache    3922   3940    848    4    1 : tunables   54   27    0 : slabdata    985    985      0
ext3_xattr             0      0     88   44    1 : tunables  120   60    0 : slabdata      0      0      0
dquot                  0      0    256   15    1 : tunables  120   60    0 : slabdata      0      0      0
kioctx                 0      0    384   10    1 : tunables   54   27    0 : slabdata      0      0      0
kiocb                  0      0    256   15    1 : tunables  120   60    0 : slabdata      0      0      0
inotify_event_private_data      0      0     32  112    1 : tunables  120   60    0 : slabdata      0      0      0
inotify_inode_mark_entry     23     34    112   34    1 : tunables  120   60    0 : slabdata      1      1      0
dnotify_mark_entry      0      0    112   34    1 : tunables  120   60    0 : slabdata      0      0      0
dnotify_struct         0      0     32  112    1 : tunables  120   60    0 : slabdata      0      0      0
fasync_cache           0      0     48   77    1 : tunables  120   60    0 : slabdata      0      0      0
pid_namespace          0      0   2112    3    2 : tunables   24   12    0 : slabdata      0      0      0
nsproxy                0      0     48   77    1 : tunables  120   60    0 : slabdata      0      0      0
posix_timers_cache      0      0    176   22    1 : tunables  120   60    0 : slabdata      0      0      0
uid_cache              4     30    128   30    1 : tunables  120   60    0 : slabdata      1      1      0
UNIX                  12     18    832    9    2 : tunables   54   27    0 : slabdata      2      2      0
ip_mrt_cache           0      0    128   30    1 : tunables  120   60    0 : slabdata      0      0      0
UDP-Lite               0      0    832    9    2 : tunables   54   27    0 : slabdata      0      0      0
tcp_bind_bucket        3    112     32  112    1 : tunables  120   60    0 : slabdata      1      1      0
inet_peer_cache        0      0     64   59    1 : tunables  120   60    0 : slabdata      0      0      0
secpath_cache          0      0     64   59    1 : tunables  120   60    0 : slabdata      0      0      0
xfrm_dst_cache         0      0    448    8    1 : tunables   54   27    0 : slabdata      0      0      0
ip_fib_alias           0      0     32  112    1 : tunables  120   60    0 : slabdata      0      0      0
ip_fib_hash            9     53     72   53    1 : tunables  120   60    0 : slabdata      1      1      0
ip_dst_cache          11     20    384   10    1 : tunables   54   27    0 : slabdata      2      2      0
arp_cache              2     15    256   15    1 : tunables  120   60    0 : slabdata      1      1      0
RAW                    2      9    832    9    2 : tunables   54   27    0 : slabdata      1      1      0
UDP                    0      0    832    9    2 : tunables   54   27    0 : slabdata      0      0      0
tw_sock_TCP            0      0    256   15    1 : tunables  120   60    0 : slabdata      0      0      0
request_sock_TCP       0      0    128   30    1 : tunables  120   60    0 : slabdata      0      0      0
TCP                    5      8   1664    4    2 : tunables   24   12    0 : slabdata      2      2      0
eventpoll_pwq          5     53     72   53    1 : tunables  120   60    0 : slabdata      1      1      0
eventpoll_epi          5     30    128   30    1 : tunables  120   60    0 : slabdata      1      1      0
sgpool-128             2      2   5120    1    2 : tunables    8    4    0 : slabdata      2      2      0
sgpool-64              2      3   2560    3    2 : tunables   24   12    0 : slabdata      1      1      0
sgpool-32              2      3   1280    3    1 : tunables   24   12    0 : slabdata      1      1      0
sgpool-16              2      6    640    6    1 : tunables   54   27    0 : slabdata      1      1      0
sgpool-8               2     12    320   12    1 : tunables   54   27    0 : slabdata      1      1      0
scsi_data_buffer       0      0     24  144    1 : tunables  120   60    0 : slabdata      0      0      0
blkdev_queue          19     21   2344    3    2 : tunables   24   12    0 : slabdata      7      7      0
blkdev_requests       12     12    328   12    1 : tunables   54   27    0 : slabdata      1      1      0
blkdev_ioc            16     59     64   59    1 : tunables  120   60    0 : slabdata      1      1      0
fsnotify_event_holder      0      0     24  144    1 : tunables  120   60    0 : slabdata      0      0      0
fsnotify_event         0      0    104   37    1 : tunables  120   60    0 : slabdata      0      0      0
bio-0                  2     20    192   20    1 : tunables  120   60    0 : slabdata      1      1      0
biovec-256             2      2   4096    1    1 : tunables   24   12    0 : slabdata      2      2      0
biovec-128             0      0   2048    2    1 : tunables   24   12    0 : slabdata      0      0      0
biovec-64              0      0   1024    4    1 : tunables   54   27    0 : slabdata      0      0      0
biovec-16              0      0    256   15    1 : tunables  120   60    0 : slabdata      0      0      0
sock_inode_cache      35     35    704    5    1 : tunables   54   27    0 : slabdata      7      7      0
skbuff_fclone_cache      7      7    512    7    1 : tunables   54   27    0 : slabdata      1      1      0
skbuff_head_cache     66    165    256   15    1 : tunables  120   60    0 : slabdata     11     11      0
file_lock_cache       22     22    176   22    1 : tunables  120   60    0 : slabdata      1      1      0
shmem_inode_cache    507    540    808    5    1 : tunables   54   27    0 : slabdata    108    108      0
task_delay_info       81    136    112   34    1 : tunables  120   60    0 : slabdata      4      4      0
taskstats              2     12    328   12    1 : tunables   54   27    0 : slabdata      1      1      0
proc_inode_cache     252    252    664    6    1 : tunables   54   27    0 : slabdata     42     42      0
sigqueue               0      0    160   24    1 : tunables  120   60    0 : slabdata      0      0      0
bdev_cache            20     20    896    4    1 : tunables   54   27    0 : slabdata      5      5      0
sysfs_dir_cache     3501   3504     80   48    1 : tunables  120   60    0 : slabdata     73     73      0
mnt_cache             22     30    256   15    1 : tunables  120   60    0 : slabdata      2      2      0
filp                 540    540    192   20    1 : tunables  120   60    0 : slabdata     27     27      0
inode_cache         1016   1050    616    6    1 : tunables   54   27    0 : slabdata    175    175      0
dentry              4121   5660    192   20    1 : tunables  120   60    0 : slabdata    283    283      0
names_cache            1      1   4096    1    1 : tunables   24   12    0 : slabdata      1      1      0
key_jar                0      0    192   20    1 : tunables  120   60    0 : slabdata      0      0      0
buffer_head        68770  72409    104   37    1 : tunables  120   60    0 : slabdata   1957   1957      0
vm_area_struct      3298   3358    168   23    1 : tunables  120   60    0 : slabdata    146    146      0
mm_struct             45     52    896    4    1 : tunables   54   27    0 : slabdata     13     13      0
fs_cache              30     59     64   59    1 : tunables  120   60    0 : slabdata      1      1      0
files_cache           32     66    704   11    2 : tunables   54   27    0 : slabdata      6      6      0
signal_cache          64     92   1024    4    1 : tunables   54   27    0 : slabdata     23     23      0
sighand_cache         72     72   2112    3    2 : tunables   24   12    0 : slabdata     24     24      0
task_xstate           48     64    512    8    1 : tunables   54   27    0 : slabdata      8      8      0
task_struct           77     96   1712    4    2 : tunables   24   12    0 : slabdata     24     24      0
cred_jar             160    160    192   20    1 : tunables  120   60    0 : slabdata      8      8      0
anon_vma_chain      3163   3542     48   77    1 : tunables  120   60    0 : slabdata     46     46      0
anon_vma            1694   2304     24  144    1 : tunables  120   60    0 : slabdata     16     16      0
pid                  120    120    128   30    1 : tunables  120   60    0 : slabdata      4      4      0
idr_layer_cache      156    161    544    7    1 : tunables   54   27    0 : slabdata     23     23      0
kmemleak_scan_area    422    448     32  112    1 : tunables  120   60    0 : slabdata      4      4      0
kmemleak_object   142008 159024    320   12    1 : tunables   54   27    0 : slabdata  13252  13252      0
radix_tree_node     5202   5264    552    7    1 : tunables   54   27    0 : slabdata    752    752      0
size-4194304(DMA)      0      0 4194304    1 1024 : tunables    1    1    0 : slabdata      0      0      0
size-4194304           0      0 4194304    1 1024 : tunables    1    1    0 : slabdata      0      0      0
size-2097152(DMA)      0      0 2097152    1  512 : tunables    1    1    0 : slabdata      0      0      0
size-2097152           0      0 2097152    1  512 : tunables    1    1    0 : slabdata      0      0      0
size-1048576(DMA)      0      0 1048576    1  256 : tunables    1    1    0 : slabdata      0      0      0
size-1048576           0      0 1048576    1  256 : tunables    1    1    0 : slabdata      0      0      0
size-524288(DMA)       0      0 524288    1  128 : tunables    1    1    0 : slabdata      0      0      0
size-524288            0      0 524288    1  128 : tunables    1    1    0 : slabdata      0      0      0
size-262144(DMA)       0      0 262144    1   64 : tunables    1    1    0 : slabdata      0      0      0
size-262144            0      0 262144    1   64 : tunables    1    1    0 : slabdata      0      0      0
size-131072(DMA)       0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-131072            0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-65536(DMA)        0      0  65536    1   16 : tunables    8    4    0 : slabdata      0      0      0
size-65536             0      0  65536    1   16 : tunables    8    4    0 : slabdata      0      0      0
size-32768(DMA)        0      0  32768    1    8 : tunables    8    4    0 : slabdata      0      0      0
size-32768             1      1  32768    1    8 : tunables    8    4    0 : slabdata      1      1      0
size-16384(DMA)        0      0  16384    1    4 : tunables    8    4    0 : slabdata      0      0      0
size-16384             4      4  16384    1    4 : tunables    8    4    0 : slabdata      4      4      0
size-8192(DMA)         0      0   8192    1    2 : tunables    8    4    0 : slabdata      0      0      0
size-8192              7      7   8192    1    2 : tunables    8    4    0 : slabdata      7      7      0
size-4096(DMA)         0      0   4096    1    1 : tunables   24   12    0 : slabdata      0      0      0
size-4096             25     25   4096    1    1 : tunables   24   12    0 : slabdata     25     25      0
size-2048(DMA)         0      0   2048    2    1 : tunables   24   12    0 : slabdata      0      0      0
size-2048             50     50   2048    2    1 : tunables   24   12    0 : slabdata     25     25      0
size-1024(DMA)         0      0   1024    4    1 : tunables   54   27    0 : slabdata      0      0      0
size-1024            324    324   1024    4    1 : tunables   54   27    0 : slabdata     81     81      0
size-512(DMA)          0      0    512    8    1 : tunables   54   27    0 : slabdata      0      0      0
size-512             288    288    512    8    1 : tunables   54   27    0 : slabdata     36     36      0
size-256(DMA)          0      0    256   15    1 : tunables  120   60    0 : slabdata      0      0      0
size-256              66     75    256   15    1 : tunables  120   60    0 : slabdata      5      5      0
size-192(DMA)          0      0    192   20    1 : tunables  120   60    0 : slabdata      0      0      0
size-192           33260  33260    192   20    1 : tunables  120   60    0 : slabdata   1663   1663      0
size-128(DMA)          0      0    128   30    1 : tunables  120   60    0 : slabdata      0      0      0
size-64(DMA)           0      0     64   59    1 : tunables  120   60    0 : slabdata      0      0      0
size-64             5719   5959     64   59    1 : tunables  120   60    0 : slabdata    101    101      0
size-32(DMA)           0      0     32  112    1 : tunables  120   60    0 : slabdata      0      0      0
size-128             690    690    128   30    1 : tunables  120   60    0 : slabdata     23     23      0
size-32             4704   4704     32  112    1 : tunables  120   60    0 : slabdata     42     42      0
kmem_cache           168    170    384   10    1 : tunables   54   27    0 : slabdata     17     17      0
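
A dump this long is easier to read when ranked by footprint. The sketch below is an approximation, assuming the slabinfo 2.1 column layout shown in the header line and 4 kB pages: it estimates the memory held by each cache as num_slabs * pagesperslab * 4 kB. The function name is made up, and the sample lines are copied from the attachment above.

```python
PAGE_KB = 4  # assumption: 4 kB pages


def slab_footprint_kb(line):
    """Return (cache name, approximate kB held) for one slabinfo 2.1 data line."""
    head, _tunables, slabdata = (part.split() for part in line.split(":"))
    name = head[0]
    pages_per_slab = int(head[5])   # <pagesperslab> column
    num_slabs = int(slabdata[2])    # <num_slabs> column
    return name, num_slabs * pages_per_slab * PAGE_KB


# A few lines copied from the slabinfo.txt attachment above.
SAMPLE = """\
kmemleak_object   142008 159024    320   12    1 : tunables   54   27    0 : slabdata  13252  13252      0
buffer_head        68770  72409    104   37    1 : tunables  120   60    0 : slabdata   1957   1957      0
size-192           33260  33260    192   20    1 : tunables  120   60    0 : slabdata   1663   1663      0
task_struct           77     96   1712    4    2 : tunables   24   12    0 : slabdata     24     24      0
"""

ranking = sorted((slab_footprint_kb(line) for line in SAMPLE.splitlines()),
                 key=lambda item: item[1], reverse=True)
for name, kb in ranking:
    print("%-20s %8d kB" % (name, kb))
```

For the full file, skip the two header lines (the "slabinfo - version" line and the line starting with "#") before passing the remaining lines to the function.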

[-- Attachment #5: vmallacinfo.txt --]
[-- Type: text/plain, Size: 2329 bytes --]

0xffffc90000000000-0xffffc90000021000  135168 arch_gnttab_map_shared+0x24/0x56 ioremap
0xffffc90000022000-0xffffc90000024000    8192 init_vdso_vars+0xef/0x221
0xffffc90000026000-0xffffc90000028000    8192 pci_enable_msix+0x1dc/0x3eb phys=fe401000 ioremap
0xffffc90000028000-0xffffc9000002b000   12288 usb_hcd_pci_probe+0x15f/0x30f phys=fe400000 ioremap
0xffffc9000002c000-0xffffc9000004d000  135168 sys_swapon+0x4e7/0xae2 pages=32 vmalloc
0xffffc9000004e000-0xffffc9000011a000  835584 __videobuf_mmap_mapper+0x116/0x200 [videobuf_vmalloc] pages=203 vmalloc user
0xffffc9000011b000-0xffffc900001e7000  835584 __videobuf_mmap_mapper+0x116/0x200 [videobuf_vmalloc] pages=203 vmalloc user
0xffffc900001e8000-0xffffc900002b4000  835584 __videobuf_mmap_mapper+0x116/0x200 [videobuf_vmalloc] pages=203 vmalloc user
0xffffc900002b5000-0xffffc90000381000  835584 __videobuf_mmap_mapper+0x116/0x200 [videobuf_vmalloc] pages=203 vmalloc user
0xffffffffa0000000-0xffffffffa0004000   16384 module_alloc_update_bounds+0xe/0x5a pages=3 vmalloc
0xffffffffa0005000-0xffffffffa0007000    8192 module_alloc_update_bounds+0xe/0x5a pages=1 vmalloc
0xffffffffa0008000-0xffffffffa000d000   20480 module_alloc_update_bounds+0xe/0x5a pages=4 vmalloc
0xffffffffa000e000-0xffffffffa0011000   12288 module_alloc_update_bounds+0xe/0x5a pages=2 vmalloc
0xffffffffa0012000-0xffffffffa0016000   16384 module_alloc_update_bounds+0xe/0x5a pages=3 vmalloc
0xffffffffa0017000-0xffffffffa001c000   20480 module_alloc_update_bounds+0xe/0x5a pages=4 vmalloc
0xffffffffa001d000-0xffffffffa001f000    8192 module_alloc_update_bounds+0xe/0x5a pages=1 vmalloc
0xffffffffa0020000-0xffffffffa0024000   16384 module_alloc_update_bounds+0xe/0x5a pages=3 vmalloc
0xffffffffa0025000-0xffffffffa0029000   16384 module_alloc_update_bounds+0xe/0x5a pages=3 vmalloc
0xffffffffa002a000-0xffffffffa0035000   45056 module_alloc_update_bounds+0xe/0x5a pages=10 vmalloc
0xffffffffa0036000-0xffffffffa003b000   20480 module_alloc_update_bounds+0xe/0x5a pages=4 vmalloc
0xffffffffa003c000-0xffffffffa0050000   81920 module_alloc_update_bounds+0xe/0x5a pages=19 vmalloc
0xffffffffa0059000-0xffffffffa005d000   16384 module_alloc_update_bounds+0xe/0x5a pages=3 vmalloc
0xffffffffa005e000-0xffffffffa006c000   57344 module_alloc_update_bounds+0xe/0x5a pages=13 vmalloc

[-- Attachment #6: vmstat.txt --]
[-- Type: text/plain, Size: 1446 bytes --]

nr_free_pages 2253
nr_inactive_anon 5657
nr_active_anon 5129
nr_inactive_file 62586
nr_active_file 11165
nr_unevictable 0
nr_mlock 0
nr_anon_pages 10742
nr_mapped 2895
nr_file_pages 73805
nr_dirty 4
nr_writeback 0
nr_slab_reclaimable 4209
nr_slab_unreclaimable 15988
nr_page_table_pages 1042
nr_kernel_stack 64
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 54
pgpgin 68514
pgpgout 1132904
pswpin 0
pswpout 0
pgalloc_dma 7283
pgalloc_dma32 513627
pgalloc_normal 0
pgalloc_movable 0
pgfree 523304
pgactivate 14814
pgdeactivate 5664
pgfault 1005226
pgmajfault 575
pgrefill_dma 32
pgrefill_dma32 5632
pgrefill_normal 0
pgrefill_movable 0
pgsteal_dma 0
pgsteal_dma32 59063
pgsteal_normal 0
pgsteal_movable 0
pgscan_kswapd_dma 0
pgscan_kswapd_dma32 60864
pgscan_kswapd_normal 0
pgscan_kswapd_movable 0
pgscan_direct_dma 0
pgscan_direct_dma32 0
pgscan_direct_normal 0
pgscan_direct_movable 0
pginodesteal 0
slabs_scanned 12544
kswapd_steal 59063
kswapd_inodesteal 18737
kswapd_low_wmark_hit_quickly 9
kswapd_high_wmark_hit_quickly 33
kswapd_skip_congestion_wait 0
pageoutrun 1158
allocstall 0
pgrotated 0
unevictable_pgs_culled 0
unevictable_pgs_scanned 0
unevictable_pgs_rescued 0
unevictable_pgs_mlocked 0
unevictable_pgs_munlocked 0
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0
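
The nr_* counters above are expressed in pages, so they can be cross-checked against the kB values in meminfo.txt. A minimal sketch, assuming 4 kB pages and the "name value" layout of /proc/vmstat; the names are illustrative and the sample values are copied from this attachment. The pg* and kswapd_* entries are event counts rather than page gauges and are not converted.

```python
PAGE_KB = 4  # assumption: 4 kB pages


def parse_vmstat(text):
    """Return {counter name: integer value} for a /proc/vmstat style dump."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def pages_to_kb(pages):
    """Convert a page count to kB."""
    return pages * PAGE_KB


# Values copied from the vmstat.txt attachment above.
SAMPLE = """\
nr_free_pages 2253
nr_slab_reclaimable 4209
nr_slab_unreclaimable 15988
"""

counters = parse_vmstat(SAMPLE)
for name in sorted(counters):
    print("%-24s %6d pages = %6d kB"
          % (name, counters[name], pages_to_kb(counters[name])))
```

The converted values land close to MemFree, SReclaimable and SUnreclaim in meminfo.txt; small differences are expected if the two files were not captured at exactly the same moment.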

[-- Attachment #7: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel


end of thread, other threads:[~2010-08-08 16:57 UTC | newest]

Thread overview: 14+ messages
2010-08-03 15:30 [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log Sander Eikelenboom
2010-08-03 15:45 ` Konrad Rzeszutek Wilk
2010-08-03 15:51   ` Jeremy Fitzhardinge
2010-08-03 16:18     ` Sander Eikelenboom
2010-08-03 17:18     ` Sander Eikelenboom
2010-08-05  9:48     ` Sander Eikelenboom
2010-08-05 14:52       ` Konrad Rzeszutek Wilk
2010-08-05 15:12         ` Sander Eikelenboom
2010-08-05 16:21         ` Jeremy Fitzhardinge
2010-08-06  9:21         ` Sander Eikelenboom
2010-08-06 15:17           ` Konrad Rzeszutek Wilk
2010-08-06 20:44             ` Jeremy Fitzhardinge
2010-08-08 13:54             ` Sander Eikelenboom
2010-08-08 16:57             ` Sander Eikelenboom
