xen-devel.lists.xenproject.org archive mirror
* SLUB allocation error on 3.0.3 / 4.1.1
@ 2011-09-12 19:52 Nathan March
  2011-09-12 20:17 ` Konrad Rzeszutek Wilk
From: Nathan March @ 2011-09-12 19:52 UTC (permalink / raw)
  To: xen-devel

Hi All,

We are running into temporary pauses in our VMs that coincide with these
errors in dmesg on the dom0:

[1721485.352560] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[1721485.352563]   cache: kmalloc-2048, object size: 2048, buffer size: 2048, default order: 3, min order: 0
[1721485.352566]   node 0: slabs: 81, objs: 1296, free: 0
[1721485.352576] swapper: page allocation failure: order:0, mode:0x4020
[1721485.352579] Pid: 0, comm: swapper Not tainted 3.0.3 #1
[1721485.352582] Call Trace:
[1721485.352584] <IRQ>  [<ffffffff810bea48>] warn_alloc_failed+0x12b/0x142
[1721485.352595]  [<ffffffff810be8f2>] ? get_page_from_freelist+0x51c/0x547
[1721485.352601]  [<ffffffff810068a5>] ? xen_force_evtchn_callback+0xd/0xf
[1721485.352605]  [<ffffffff810bf30d>] __alloc_pages_nodemask+0x606/0x67d
[1721485.352610]  [<ffffffff81006eef>] ? xen_restore_fl_direct_reloc+0x4/0x4
[1721485.352614]  [<ffffffff810ed11e>] new_slab+0x7e/0x1f6
[1721485.352617]  [<ffffffff810ed430>] __slab_alloc+0x19a/0x33c
[1721485.352623]  [<ffffffff815b1655>] ? __netdev_alloc_skb+0x1d/0x3c
[1721485.352627]  [<ffffffff810ed84e>] __kmalloc_track_caller+0x106/0x145
[1721485.352631]  [<ffffffff815b1655>] ? __netdev_alloc_skb+0x1d/0x3c
[1721485.352634]  [<ffffffff815b066d>] __alloc_skb+0x69/0x129
[1721485.352638]  [<ffffffff815b1655>] __netdev_alloc_skb+0x1d/0x3c
[1721485.352643]  [<ffffffff81469c03>] e1000_alloc_rx_buffers+0x7f/0x14c
[1721485.352647]  [<ffffffff81469f84>] e1000_clean_rx_irq+0x265/0x28c
[1721485.352651]  [<ffffffff810068a5>] ? xen_force_evtchn_callback+0xd/0xf
[1721485.352655]  [<ffffffff8146b44a>] e1000_clean+0x75/0x24e
[1721485.352658]  [<ffffffff81006eef>] ? xen_restore_fl_direct_reloc+0x4/0x4
[1721485.352663]  [<ffffffff815b888e>] net_rx_action+0xdd/0x20f
[1721485.352668]  [<ffffffff810492d7>] __do_softirq+0xd3/0x1bb
[1721485.352673]  [<ffffffff81094be4>] ? handle_edge_irq+0x9d/0xbc
[1721485.352678]  [<ffffffff81731b1c>] call_softirq+0x1c/0x30
[1721485.352682]  [<ffffffff8100bd89>] do_softirq+0x61/0xbf
[1721485.352686]  [<ffffffff81049082>] irq_exit+0x43/0xb2
[1721485.352691]  [<ffffffff813712b1>] xen_evtchn_do_upcall+0x2f/0x3c
[1721485.352695]  [<ffffffff81731b6e>] xen_do_hypervisor_callback+0x1e/0x30
[1721485.352697] <EOI>  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[1721485.352705]  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[1721485.352709]  [<ffffffff8100693c>] ? xen_safe_halt+0x10/0x1a
[1721485.352714]  [<ffffffff81010fc3>] ? default_idle+0x5e/0xa6
[1721485.352718]  [<ffffffff81009f81>] ? cpu_idle+0x6d/0xa3
[1721485.352723]  [<ffffffff81701664>] ? rest_init+0x68/0x6a
[1721485.352729]  [<ffffffff81cb2d18>] ? start_kernel+0x412/0x41d
[1721485.352733]  [<ffffffff81cb22cb>] ? x86_64_start_reservations+0xb6/0xba
[1721485.352737]  [<ffffffff81cb5f55>] ? xen_start_kernel+0x59b/0x5a2
[1721485.352739] Mem-Info:
[1721485.352741] DMA per-cpu:
[1721485.352744] CPU    0: hi:    0, btch:   1 usd:   0
[1721485.352746] DMA32 per-cpu:
[1721485.352748] CPU    0: hi:  186, btch:  31 usd: 176
[1721485.352750] Normal per-cpu:
[1721485.352752] CPU    0: hi:  186, btch:  31 usd:   0
[1721485.352757] active_anon:2403 inactive_anon:13164 isolated_anon:0
[1721485.352758]  active_file:66256 inactive_file:75740 isolated_file:0
[1721485.352759]  unevictable:507 dirty:3175 writeback:40742 unstable:0
[1721485.352760]  free:13180 slab_reclaimable:5805 slab_unreclaimable:9005
[1721485.352761]  mapped:1983 shmem:4 pagetables:1147 bounce:0
[1721485.352768] DMA free:15904kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15680kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1721485.352774] lowmem_reserve[]: 0 952 10042 10042
[1721485.352784] DMA32 free:36816kB min:1212kB low:1512kB high:1816kB active_anon:9612kB inactive_anon:52656kB active_file:265024kB inactive_file:302960kB unevictable:2028kB isolated(anon):0kB isolated(file):0kB present:975072kB mlocked:2028kB dirty:12700kB writeback:162968kB mapped:7932kB shmem:16kB slab_reclaimable:23220kB slab_unreclaimable:36020kB kernel_stack:2360kB pagetables:4588kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1721485.352790] lowmem_reserve[]: 0 0 9090 9090
[1721485.352799] Normal free:0kB min:11600kB low:14500kB high:17400kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:9308160kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1721485.352805] lowmem_reserve[]: 0 0 0 0
[1721485.352810] DMA: 0*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15904kB
[1721485.352822] DMA32: 588*4kB 256*8kB 150*16kB 168*32kB 285*64kB 34*128kB 6*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 36816kB
[1721485.352834] Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
[1721485.352845] 153277 total pagecache pages
[1721485.352848] 10888 pages in swap cache
[1721485.352850] Swap cache stats: add 3069760, delete 3058872, find 3791763/4300237
[1721485.352852] Free swap  = 1816844kB
[1721485.352854] Total swap = 1943856kB
[1721485.377309] 2621424 pages RAM
[1721485.377312] 2427446 pages reserved
[1721485.377313] 108302 pages shared
[1721485.377315] 126166 pages non-shared
[1721485.377319] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[1721485.377323]   cache: kmalloc-2048, object size: 2048, buffer size: 2048, default order: 3, min order: 0
[1721485.377326]   node 0: slabs: 81, objs: 1296, free: 0
[1721485.377560] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[1721485.377564]   cache: kmalloc-2048, object size: 2048, buffer size: 2048, default order: 3, min order: 0
[1721485.377567]   node 0: slabs: 81, objs: 1296, free: 0
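
The gfp=0x20 and mode:0x4020 values in the trace above can be decoded into flag names, which helps explain why the allocation fails outright instead of waiting for memory. The sketch below is only an illustration: the flag values are recalled from include/linux/gfp.h in 3.0-era kernels and should be checked against the tree actually in use, and decode_gfp and can_sleep are hypothetical helper names, not kernel or Xen APIs.

```python
# Flag values as recalled from include/linux/gfp.h in 3.0-era kernels.
# This table is an assumption; verify it against your own source tree.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x100: "__GFP_COLD",
    0x200: "__GFP_NOWARN",
    0x400: "__GFP_REPEAT",
    0x800: "__GFP_NOFAIL",
    0x1000: "__GFP_NORETRY",
    0x4000: "__GFP_COMP",
    0x8000: "__GFP_ZERO",
}


def decode_gfp(mask):
    """Return the names of the known flags set in mask, lowest bit first."""
    names = [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit]
    known = sum(bit for bit in GFP_FLAGS if mask & bit)
    leftover = mask & ~known
    if leftover:
        # Report bits missing from the table instead of silently dropping them.
        names.append(hex(leftover))
    return names


def can_sleep(mask):
    """True if __GFP_WAIT is set, i.e. the allocator may block and reclaim."""
    return bool(mask & 0x10)


for value in (0x20, 0x4020):
    print(hex(value), decode_gfp(value), "may sleep:", can_sleep(value))
```

Under these assumptions both masks lack __GFP_WAIT, which fits an allocation made from the e1000 receive path in softirq context: it cannot block for reclaim, so it fails as soon as no free page is available.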

xen7 ~ # xm info
host                   : xen7
release                : 3.0.3
version                : #1 SMP Mon Aug 22 14:25:38 PDT 2011
machine                : x86_64
nr_cpus                : 24
nr_nodes               : 2
cores_per_socket       : 6
threads_per_core       : 2
cpu_mhz                : 2266
hw_caps                : bfebfbff:2c100800:00000000:00003f40:009ee3fd:00000000:00000001:00000000
virt_caps              : hvm hvm_directio
total_memory           : 98294
free_memory            : 36580
free_cpus              : 0
xen_major              : 4
xen_minor              : 1
xen_extra              : .1
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
xen_commandline        : console=com1,com2,vga com1=115200,8n1 com2=115200,8n1 dom0_mem=1024M dom0_max_vcpus=1 dom0_vcpus_pin=true
cc_compiler            : gcc version 4.3.4 (Gentoo 4.3.4 p1.1, pie-10.1.5)
cc_compile_by          : root
cc_compile_domain      : nmsrv.com
cc_compile_date        : Mon Aug 22 11:28:50 PDT 2011
xend_config_format     : 4

We are seeing this on multiple dom0s, all running on identical hardware
(Supermicro X8DTT with Intel 82574L GigE). The dom0s are limited to 1 GB
(dom0_mem=1024M dom0_max_vcpus=1 dom0_vcpus_pin=true), although they
never use more than 250 MB.

I am not sure whether this is a Xen bug, a network driver issue, or something else.

- Nathan

-- 
Nathan March <nathan@gt.net>
Gossamer Threads Inc. http://www.gossamer-threads.com/
Tel: (604) 687-5804 Fax: (604) 687-5806


* Re: SLUB allocation error on 3.0.3 / 4.1.1
  2011-09-12 19:52 SLUB allocation error on 3.0.3 / 4.1.1 Nathan March
@ 2011-09-12 20:17 ` Konrad Rzeszutek Wilk
From: Konrad Rzeszutek Wilk @ 2011-09-12 20:17 UTC (permalink / raw)
  To: Nathan March; +Cc: xen-devel

> total_memory           : 98294
> free_memory            : 36580
> free_cpus              : 0
> xen_major              : 4
> xen_minor              : 1
> xen_extra              : .1
> xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
> xen_scheduler          : credit
> xen_pagesize           : 4096
> platform_params        : virt_start=0xffff800000000000
> xen_changeset          : unavailable
> xen_commandline        : console=com1,com2,vga com1=115200,8n1 com2=115200,8n1 dom0_mem=1024M dom0_max_vcpus=1 dom0_vcpus_pin=true
> cc_compiler            : gcc version 4.3.4 (Gentoo 4.3.4 p1.1, pie-10.1.5)
> cc_compile_by          : root
> cc_compile_domain      : nmsrv.com
> cc_compile_date        : Mon Aug 22 11:28:50 PDT 2011
> xend_config_format     : 4
> 
> Seeing this on multiple dom0's which are all running identical
> hardware (Supermicro X8DTT w/ Intel 82574L gige). Dom0's are limited
> to 1gb (dom0_mem=1024M dom0_max_vcpus=1 dom0_vcpus_pin=true)
> although they don't go above 250mb used.
> 
> Not sure if this is a xen bug, network driver issue or something else?

It is a Linux kernel bug. The kernel does not respect the dom0_mem=max:X
argument, so Dom0 ends up with page tables for the full 98GB and cannot
allocate enough memory for your normal drivers (most of its memory is
taken up by those unused page tables).
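
The Mem-Info block in the original report gives a rough way to see how little of the memory tracked by the dom0 kernel is actually usable. The sketch below only redoes the arithmetic on the "pages RAM" and "pages reserved" lines, assuming 4 kB pages as reported by xen_pagesize; summarize is a hypothetical helper name, and the result does not by itself confirm the diagnosis.

```python
PAGE_SIZE_KB = 4  # xen_pagesize is 4096 in the xm info output


def summarize(pages_ram, pages_reserved, page_kb=PAGE_SIZE_KB):
    """Convert the page counts printed in Mem-Info into MiB and a ratio."""
    usable_pages = pages_ram - pages_reserved
    return {
        "tracked_mib": pages_ram * page_kb / 1024,
        "usable_mib": usable_pages * page_kb / 1024,
        "reserved_fraction": pages_reserved / pages_ram,
    }


# Values copied from the dmesg output in the report:
#   2621424 pages RAM
#   2427446 pages reserved
report = summarize(2621424, 2427446)
for key, value in report.items():
    print(f"{key}: {value:.2f}")
```

With those inputs the kernel tracks roughly 10 GiB of pages even though dom0_mem=1024M, and only about 758 MiB of them are not reserved, which is at least consistent with bookkeeping sized for far more memory than Dom0 was given.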

The workaround is to add "mem=1GB" to your Linux command line
(and keep the dom0_mem=.. argument).

A patch that fixes this should surface soon in 3.0.4 (or 3.0.5).
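
For anyone applying the workaround across several hosts, a quick sanity check is to confirm that the Linux mem= value matches the Xen dom0_mem= value. The sketch below is one hedged way to do that on two command-line strings. The helper names (parse_size, find_option, limits_agree) and the example Linux command line are hypothetical, and the parser deliberately handles only a single value with an explicit K, M or G suffix rather than the full syntax either option accepts.

```python
import re

_UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Parse sizes such as '1024M', '1G' or '1GB' into bytes.

    A suffix is required on purpose, because the meaning of a bare number
    differs between the options and is not modelled here.
    """
    match = re.fullmatch(r"(\d+)([KMG])B?", text.strip(), flags=re.IGNORECASE)
    if not match:
        raise ValueError(f"unsupported size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.upper()]


def find_option(cmdline, name):
    """Return the value of name=value on a command line, or None if absent."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key == name:
            return value
    return None


def limits_agree(xen_cmdline, linux_cmdline):
    """True if dom0_mem= (Xen) and mem= (Linux) describe the same size."""
    dom0_mem = find_option(xen_cmdline, "dom0_mem")
    mem = find_option(linux_cmdline, "mem")
    if dom0_mem is None or mem is None:
        return False
    # Accept a single 'max:X' form by dropping the prefix; comma-separated
    # min/max combinations are outside the scope of this sketch.
    dom0_mem = dom0_mem.split(":")[-1]
    return parse_size(dom0_mem) == parse_size(mem)


# Xen command line as shown by xm info in the report.
xen = ("console=com1,com2,vga com1=115200,8n1 com2=115200,8n1 "
       "dom0_mem=1024M dom0_max_vcpus=1 dom0_vcpus_pin=true")
# Hypothetical Linux command line with the workaround applied.
linux = "root=/dev/sda1 ro mem=1GB"

print("limits agree:", limits_agree(xen, linux))
```

On a running dom0 the Linux string could be read from /proc/cmdline instead of being hard-coded.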
