* migration of pv guest fails from small to large host
@ 2011-07-01 10:41 Olaf Hering
2011-07-01 16:20 ` Olaf Hering
2011-07-12 16:43 ` [PATCH] xen: update machine_to_phys_order on resume Olaf Hering
0 siblings, 2 replies; 17+ messages in thread
From: Olaf Hering @ 2011-07-01 10:41 UTC
To: xen-devel
This issue was initially reported on differently sized HP ProLiant
systems running SLES11SP1 in both dom0 and domU.
Migration of pv guests fails: the guest crashes on the target host once it
is unpaused after transit. It happens when the guest is started on a
small system and then migrated from that small system to a large system.
If the guest is started on a large system, then migrated to a small system and
back to the large system, the migration will be successful.
The symptoms on the target host differ with the systems I have access to,
which are listed below. It is not possible to take a core dump.
The pv guest has one vcpu and 256MB of memory, one network interface and a disk.
I currently have no idea what to look for. The xenctx patch for dumping
pagetables showed no differences between src/dst guest after transit to the
target host (I have to verify this on my hosts).
involved hardware:
bolen: ProLiant DL580 G7, 32GB, CPU E7540 @ 2.00GHz
falla: ProLiant DL360 G6, 8GB, CPU E5540 @ 2.53GHz
drnek: ProLiant DL170h G6, 6GB, CPU E5504 @ 2.00GHz
gubaidulina: Intel SDV S3E37, 192GB, CPU 000 @ 2.40GHz (unknown cpu 0x206f1)
(other target hosts from different vendors with large amount of memory were reported to fail as well.)
I am still trying to test a non-HP system as source host.
involved software:
host: sles11sp1, xen 4.0. Also xen-unstable 4.2 hg rev23640
pv guest: sles11sp1
migration with this command on bolen, falla, drnek:
"xm migrate sles11sp1_para_1 gubaidulina" fails on gubaidulina:
[2011-06-30 21:21:32 21858] WARNING (XendDomainInfo:2061) Domain has crashed: name=sles11sp1_para_1 id=1.
[2011-06-30 21:21:32 21858] ERROR (XendDomainInfo:2318) core dump failed: id = 1 name = sles11sp1_para_1: (1, 'Internal error', "Couldn't map p2m_frame_list_list (errno 1) (1 = Operation not permitted)")
[2011-06-30 21:21:32 21858] DEBUG (XendDomainInfo:3084) XendDomainInfo.destroy: domid=1
[2011-06-30 21:21:32 21858] DEBUG (XendDomainInfo:2403) Destroying device model
[2011-06-30 21:21:32 21858] INFO (image:702) sles11sp1_para_1 device model terminated
xm dmesg shows no errors.
notes from a "bisect" with limiting Xen memory:
gubaidulina booted with mem=64G, migration from bolen succeeds.
gubaidulina booted with mem=96G, migration from bolen fails.
gubaidulina booted with mem=80G, migration from bolen fails.
gubaidulina booted with mem=72G, migration from bolen fails.
gubaidulina booted with mem=68G, migration from bolen fails.
gubaidulina booted with mem=65G, migration from bolen succeeds.
now testing more after migration:
gubaidulina booted with mem=66G, migration from bolen fails, no coredump message, no coredump
gubaidulina booted with mem=66G, second migration from bolen succeeds. xm shutdown crashes guest, no coredump
gubaidulina booted with mem=65G, migration from bolen succeeds. xm shutdown succeeds
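The notes above can be condensed into a pass/fail boundary. The following is a minimal sketch, not part of the original test procedure: it records only the first migration attempt for each mem= value, assumes "G" means GiB, and converts the boundary sizes into 4 KiB machine frame counts. Whether the frame count is what actually matters is an assumption here, not something the notes establish.

```python
# First-migration outcomes from the notes above.
# Key: Xen mem= value on gubaidulina (assumed GiB). Value: did the migration succeed?
results = {64: True, 96: False, 80: False, 72: False, 68: False, 65: True, 66: False}


def boundary(outcomes):
    """Return (largest passing size, smallest failing size)."""
    passing = [size for size, ok in outcomes.items() if ok]
    failing = [size for size, ok in outcomes.items() if not ok]
    return max(passing), min(failing)


def frames_4k(gib):
    """Number of 4 KiB machine frames covered by `gib` GiB of memory."""
    return gib * 2**30 // 4096


good, bad = boundary(results)
print(f"largest passing mem={good}G, frames below {hex(frames_4k(good))}")
print(f"smallest failing mem={bad}G, frames below {hex(frames_4k(bad))}")
```

The second migration with mem=66G succeeded, so the boundary describes the first attempt only.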
Olaf
* Re: migration of pv guest fails from small to large host
2011-07-01 10:41 migration of pv guest fails from small to large host Olaf Hering
@ 2011-07-01 16:20 ` Olaf Hering
2011-07-05 13:46 ` Konrad Rzeszutek Wilk
2011-07-12 16:43 ` [PATCH] xen: update machine_to_phys_order on resume Olaf Hering
1 sibling, 1 reply; 17+ messages in thread
From: Olaf Hering @ 2011-07-01 16:20 UTC
To: xen-devel
The situation with Linux 3.0 is different: the guest crashes on the target. With the SLES11
kernel I cannot attach to the console, although the guest is apparently running.
gubaidulina -> bolen -> gubaidulina works.
bolen -> gubaidulina crashes:
root@bolen:~ # xl -v create -c -d /etc/xen/vm/sles11sp1_para_1
Parsing config file /etc/xen/vm/sles11sp1_para_1
(domain
(domid -1)
(create_info)
(hvm 0)
(hap 1)
(oos 1)
(ssidref 0)
(name sles11sp1_para_1)
(uuid <unknown>)
(cpupool Pool-0)
(xsdata (null))
(platformdata (null))
(build_info)
(max_vcpus 1)
(tsc_mode 0)
(max_memkb 50176)
(target_memkb 50176)
(nomigrate 0)
(image
(linux 0)
(kernel /boot/vmlinuz-kernel-mainline)
(cmdline quiet sysrq=yes panic=1 debug log_buf_len=4K console=hvc0)
(ramdisk /boot/initrd-kernel-mainline)
(e820_host 0)
)
)
)
domainbuilder: detail: xc_dom_allocate: cmdline=" quiet sysrq=yes panic=1 debug log_buf_len=4K console=hvc0", features="(null)"
domainbuilder: detail: xc_dom_kernel_file: filename="/boot/vmlinuz-kernel-mainline"
domainbuilder: detail: xc_dom_malloc_filemap : 2038 kB
domainbuilder: detail: xc_dom_ramdisk_file: filename="/boot/initrd-kernel-mainline"
domainbuilder: detail: xc_dom_malloc_filemap : 5830 kB
domainbuilder: detail: xc_dom_boot_xen_init: ver 4.2, caps xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
domainbuilder: detail: xc_dom_parse_image: called
domainbuilder: detail: xc_dom_find_loader: trying ELF-generic loader ...
domainbuilder: detail: loader probe failed
domainbuilder: detail: xc_dom_find_loader: trying Linux bzImage loader ...
domainbuilder: detail: xc_dom_malloc : 13010 kB
domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x1f4616 -> 0xcb4b30
domainbuilder: detail: loader probe OK
xc: detail: elf_parse_binary: phdr: paddr=0x1000000 memsz=0x4d3000
xc: detail: elf_parse_binary: phdr: paddr=0x1600000 memsz=0x38cc0
xc: detail: elf_parse_binary: phdr: paddr=0x1639000 memsz=0xd60
xc: detail: elf_parse_binary: phdr: paddr=0x163a000 memsz=0x11f80
xc: detail: elf_parse_binary: phdr: paddr=0x164c000 memsz=0x3e9000
xc: detail: elf_parse_binary: memory: 0x1000000 -> 0x1a35000
xc: detail: elf_xen_parse_note: GUEST_OS = "linux"
xc: detail: elf_xen_parse_note: GUEST_VERSION = "2.6"
xc: detail: elf_xen_parse_note: XEN_VERSION = "xen-3.0"
xc: detail: elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
xc: detail: elf_xen_parse_note: ENTRY = 0xffffffff8164c200
xc: detail: elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff81001000
xc: detail: elf_xen_parse_note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb"
xc: detail: elf_xen_parse_note: PAE_MODE = "yes"
xc: detail: elf_xen_parse_note: LOADER = "generic"
xc: detail: elf_xen_parse_note: unknown xen elf note (0xd)
xc: detail: elf_xen_parse_note: SUSPEND_CANCEL = 0x1
xc: detail: elf_xen_parse_note: HV_START_LOW = 0xffff800000000000
xc: detail: elf_xen_parse_note: PADDR_OFFSET = 0x0
xc: detail: elf_xen_addr_calc_check: addresses:
xc: detail: virt_base = 0xffffffff80000000
xc: detail: elf_paddr_offset = 0x0
xc: detail: virt_offset = 0xffffffff80000000
xc: detail: virt_kstart = 0xffffffff81000000
xc: detail: virt_kend = 0xffffffff81a35000
xc: detail: virt_entry = 0xffffffff8164c200
xc: detail: p2m_base = 0xffffffffffffffff
domainbuilder: detail: xc_dom_parse_elf_kernel: xen-3.0-x86_64: 0xffffffff81000000 -> 0xffffffff81a35000
domainbuilder: detail: xc_dom_mem_init: mem 49 MB, pages 0x3100 pages, 4k each
domainbuilder: detail: xc_dom_mem_init: 0x3100 pages
domainbuilder: detail: xc_dom_boot_mem_init: called
domainbuilder: detail: x86_compat: guest xen-3.0-x86_64, address size 64
domainbuilder: detail: xc_dom_build_image: called
domainbuilder: detail: xc_dom_alloc_segment: kernel : 0xffffffff81000000 -> 0xffffffff81a35000 (pfn 0x1000 + 0xa35 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x1000+0xa35 at 0x7fd2484c0000
xc: detail: elf_load_binary: phdr 0 at 0x0x7fd2484c0000 -> 0x0x7fd248993000
xc: detail: elf_load_binary: phdr 1 at 0x0x7fd248ac0000 -> 0x0x7fd248af8cc0
xc: detail: elf_load_binary: phdr 2 at 0x0x7fd248af9000 -> 0x0x7fd248af9d60
xc: detail: elf_load_binary: phdr 3 at 0x0x7fd248afa000 -> 0x0x7fd248b0bf80
xc: detail: elf_load_binary: phdr 4 at 0x0x7fd248b0c000 -> 0x0x7fd248b74000
domainbuilder: detail: xc_dom_alloc_segment: ramdisk : 0xffffffff81a35000 -> 0xffffffff826a1000 (pfn 0x1a35 + 0xc6c pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x1a35+0xc6c at 0x7fd247854000
domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x5b18b6 -> 0xc6be10
domainbuilder: detail: xc_dom_alloc_segment: phys2mach : 0xffffffff826a1000 -> 0xffffffff826ba000 (pfn 0x26a1 + 0x19 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x26a1+0x19 at 0x7fd24c19b000
domainbuilder: detail: xc_dom_alloc_page : start info : 0xffffffff826ba000 (pfn 0x26ba)
domainbuilder: detail: xc_dom_alloc_page : xenstore : 0xffffffff826bb000 (pfn 0x26bb)
domainbuilder: detail: xc_dom_alloc_page : console : 0xffffffff826bc000 (pfn 0x26bc)
domainbuilder: detail: nr_page_tables: 0x0000ffffffffffff/48: 0xffff000000000000 -> 0xffffffffffffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x0000007fffffffff/39: 0xffffff8000000000 -> 0xffffffffffffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x000000003fffffff/30: 0xffffffff80000000 -> 0xffffffffbfffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x00000000001fffff/21: 0xffffffff80000000 -> 0xffffffff827fffff, 20 table(s)
domainbuilder: detail: xc_dom_alloc_segment: page tables : 0xffffffff826bd000 -> 0xffffffff826d4000 (pfn 0x26bd + 0x17 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x26bd+0x17 at 0x7fd24c16c000
domainbuilder: detail: xc_dom_alloc_page : boot stack : 0xffffffff826d4000 (pfn 0x26d4)
domainbuilder: detail: xc_dom_build_image : virt_alloc_end : 0xffffffff826d5000
domainbuilder: detail: xc_dom_build_image : virt_pgtab_end : 0xffffffff82800000
domainbuilder: detail: xc_dom_boot_image: called
domainbuilder: detail: arch_setup_bootearly: doing nothing
domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_64 <= matches
domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_32p
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32p
domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_64
domainbuilder: detail: xc_dom_update_guest_p2m: dst 64bit, pages 0x3100
domainbuilder: detail: clear_page: pfn 0x26bc, mfn 0x466420
domainbuilder: detail: clear_page: pfn 0x26bb, mfn 0x2b7c69
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x26ba+0x1 at 0x7fd24c19a000
domainbuilder: detail: start_info_x86_64: called
domainbuilder: detail: setup_hypercall_page: vaddr=0xffffffff81001000 pfn=0x1001
domainbuilder: detail: domain builder memory footprint
domainbuilder: detail: allocated
domainbuilder: detail: malloc : 13248 kB
domainbuilder: detail: anon mmap : 0 bytes
domainbuilder: detail: mapped
domainbuilder: detail: file mmap : 7868 kB
domainbuilder: detail: domU mmap : 23368 kB
domainbuilder: detail: arch_setup_bootlate: shared_info: pfn 0x0, mfn 0x42d92
domainbuilder: detail: shared_info_x86_64: called
domainbuilder: detail: vcpu_x86_64: called
domainbuilder: detail: vcpu_x86_64: cr3: pfn 0x26bd mfn 0x2b7c68
domainbuilder: detail: launch_vm: called, ctxt=0x7fff7f588220
domainbuilder: detail: xc_dom_release: called
Daemon running with PID 33929
[ 0.000000] Linux version 3.0.0-rc5-5.1.home_olh-kernel-mainline (abuild@kuckuk) (gcc version 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux) ) #1 SMP Tue Jun 28 15:02:04 UTC 2011
[ 0.000000] Command line: quiet sysrq=yes panic=1 debug log_buf_len=4K console=hvc0
[ 0.000000] ACPI in unprivileged domain disabled
[ 0.000000] released 0 pages of unused memory
[ 0.000000] Set 0 page(s) to 1-1 mapping.
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] Xen: 0000000000000000 - 00000000000a0000 (usable)
[ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved)
[ 0.000000] Xen: 0000000000100000 - 0000000003900000 (usable)
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI not present or invalid.
[ 0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
[ 0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
[ 0.000000] No AGP bridge found
[ 0.000000] last_pfn = 0x3900 max_arch_pfn = 0x400000000
[ 0.000000] initial memory mapped : 0 - 026a1000
[ 0.000000] Base memory trampoline at [ffff88000009e000] 9e000 size 8192
[ 0.000000] init_memory_mapping: 0000000000000000-0000000003900000
[ 0.000000] 0000000000 - 0003900000 page 4k
[ 0.000000] kernel direct mapping tables up to 3900000 @ 30e1000-3100000
[ 0.000000] xen: setting RW the range 30e6000 - 3100000
[ 0.000000] RAMDISK: 01a35000 - 026a1000
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0x00000010 -> 0x00001000
[ 0.000000] DMA32 0x00001000 -> 0x00100000
[ 0.000000] Normal empty
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[2] active PFN ranges
[ 0.000000] 0: 0x00000010 -> 0x000000a0
[ 0.000000] 0: 0x00000100 -> 0x00003900
[ 0.000000] On node 0 totalpages: 14480
[ 0.000000] DMA zone: 56 pages used for memmap
[ 0.000000] DMA zone: 2 pages reserved
[ 0.000000] DMA zone: 3926 pages, LIFO batch:0
[ 0.000000] DMA32 zone: 144 pages used for memmap
[ 0.000000] DMA32 zone: 10352 pages, LIFO batch:1
[ 0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
[ 0.000000] No local APIC present
[ 0.000000] APIC: disable apic facility
[ 0.000000] APIC: switched to apic NOOP
[ 0.000000] nr_irqs_gsi: 16
[ 0.000000] Allocating PCI resources starting at 3900000 (gap: 3900000:fc700000)
[ 0.000000] Booting paravirtualized kernel on Xen
[ 0.000000] Xen version: 4.2.23640-2.1 (preserve-AD)
[ 0.000000] setup_percpu: NR_CPUS:8 nr_cpumask_bits:8 nr_cpu_ids:1 nr_node_ids:1
[ 0.000000] PERCPU: Embedded 25 pages/cpu @ffff8800030c8000 s73600 r8192 d20608 u102400
[ 0.000000] pcpu-alloc: s73600 r8192 d20608 u102400 alloc=25*4096
[ 0.000000] pcpu-alloc: [0] 0
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 14278
[ 0.000000] Kernel command line: quiet sysrq=yes panic=1 debug log_buf_len=4K console=hvc0
[ 0.000000] PID hash table entries: 256 (order: -1, 2048 bytes)
[ 0.000000] Dentry cache hash table entries: 8192 (order: 4, 65536 bytes)
[ 0.000000] Inode-cache hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.000000] Checking aperture...
[ 0.000000] No AGP bridge found
[ 0.000000] Memory: 25484k/58368k available (2800k kernel code, 448k absent, 32436k reserved, 3571k data, 476k init)
[ 0.000000] SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] NR_IRQS:4352 nr_irqs:256 16
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] console [tty0] enabled
[ 0.000000] console [hvc0] enabled
[ 0.000000] Xen: using vcpuop timer interface
[ 0.000000] installing Xen timer for CPU 0
[ 0.000000] Detected 1997.896 MHz processor.
[ 0.004000] Calibrating delay loop (skipped), value calculated using timer frequency.. 3995.79 BogoMIPS (lpj=7991584)
[ 0.004000] pid_max: default: 32768 minimum: 301
[ 0.004000] Mount-cache hash table entries: 256
[ 0.004000] CPU: Physical Processor ID: 0
[ 0.004000] CPU: Processor Core ID: 0
[ 0.004000] SMP alternatives: switching to UP code
[ 0.004032] Freeing SMP alternatives: 12k freed
[ 0.004142] Performance Events: unsupported p6 CPU model 46 no PMU driver, software events only.
[ 0.004302] Brought up 1 CPUs
[ 0.004667] Grant table initialized
[ 0.004729] NET: Registered protocol family 16
[ 0.006388] PCI: setting up Xen PCI frontend stub
[ 0.006406] PCI: pci_cache_line_size set to 64 bytes
[ 0.006971] bio: create slab <bio-0> at 0
[ 0.007068] ACPI: Interpreter disabled.
[ 0.007359] xen/balloon: Initialising balloon driver.
[ 0.007373] last_pfn = 0x3900 max_arch_pfn = 0x400000000
[ 0.007704] xen-balloon: Initialising balloon driver.
[ 0.007896] vgaarb: loaded
[ 0.007946] PCI: System does not support PCI
[ 0.007958] PCI: System does not support PCI
[ 0.008046] Switching to clocksource xen
[ 0.008132] pnp: PnP ACPI: disabled
[ 0.009368] PCI: max bus depth: 0 pci_try_num: 1
[ 0.009407] NET: Registered protocol family 2
[ 0.009458] IP route cache hash table entries: 512 (order: 0, 4096 bytes)
[ 0.009567] TCP established hash table entries: 2048 (order: 3, 32768 bytes)
[ 0.009591] TCP bind hash table entries: 2048 (order: 3, 32768 bytes)
[ 0.009611] TCP: Hash tables configured (established 2048 bind 2048)
[ 0.009623] TCP reno registered
[ 0.009632] UDP hash table entries: 128 (order: 0, 4096 bytes)
[ 0.009647] UDP-Lite hash table entries: 128 (order: 0, 4096 bytes)
[ 0.009716] NET: Registered protocol family 1
[ 0.009732] PCI: CLS 0 bytes, default 64
[ 0.009799] Unpacking initramfs...
[ 0.034581] Freeing initrd memory: 12720k freed
[ 0.040060] platform rtc_cmos: registered platform RTC device (no PNP device found)
[ 0.045296] msgmni has been set to 74
[ 0.045328] io scheduler noop registered (default)
[ 0.045784] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.046109] i8042: PNP: No PS/2 controller found. Probing ports directly.
[ 0.046958] i8042: No controller found
[ 0.047005] mousedev: PS/2 mouse device common for all mice
[ 0.086895] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
[ 0.086956] rtc_cmos: probe of rtc_cmos failed with error -38
[ 0.086993] cpuidle: using governor ladder
[ 0.087069] TCP cubic registered
[ 0.087083] NET: Registered protocol family 17
[ 0.087101] Registering the dns_resolver key type
[ 0.087219] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[ 0.087755] Freeing unused kernel memory: 476k freed
[ 0.088102] Write protecting the kernel read-only data: 6144k
[ 0.095393] Freeing unused kernel memory: 1276k freed
[ 0.097280] Freeing unused kernel memory: 1204k freed
doing fast boot
FATAL: Module cciss not found.
FATAL: Module hpsa not found.
[ 0.247494] SCSI subsystem initialized
[ 0.248930] libata version 3.00 loaded.
FATAL: Module ide_pci_generic not found.
FATAL: Module processor not found.
FATAL: Module thermal not found.
FATAL: Module fan not found.
[ 0.386837] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
[ 0.386848] EDD information not available.
FATAL: Error inserting edd (/lib/modules/3.0.0-rc5-5.1.home_olh-kernel-mainline/kernel/drivers/firmware/edd.ko): No such device
[ 0.403571] Initialising Xen virtual ethernet driver.
preping 03-storage.sh
running 03-storage.sh
preping 04-udev.sh
running 04-udev.sh
Creating device nodes with udev
[ 0.432303] udevd (155): /proc/155/oom_adj is deprecated, please use /proc/155/oom_score_adj instead.
[ 0.432336] udevd version 128 started
preping 05-blogd.sh
running 05-blogd.sh
mount: devpts already mounted or /dev/pts busy
mount: according to mtab, devpts is already mounted on /dev/pts
Boot logging started on /dev/hvc0(/dev/console) at Fri Jul 1 18:08:13 2011
preping 05-clock.sh
running 05-clock.sh
preping 11-block.sh
running 11-block.sh
preping 11-usb.sh
running 11-usb.sh
preping 21-devinit_done.sh
running 21-devinit_done.sh
preping 81-resume.userspace.sh
running 81-resume.userspace.sh
Trying manual resume from /dev/disk/by-id/cciss-3600508b1001ceea6b9fd83b3a5651463-part2
preping 82-resume.kernel.sh
running 82-resume.kernel.sh
Trying manual resume from /dev/disk/by-id/cciss-3600508b1001ceea6b9fd83b3a5651463-part2
preping 83-mount.sh
running 83-mount.sh
Waiting for device /dev/cciss/c0d0p6 to appear: ..............................Could not find /dev/cciss/c0d0p6.
Want me to fall back to /dev/cciss/c0d0p6? (Y/n)
n
not found -- exiting to /bin/sh
root@bolen:~ # xl -v migrate sles11sp1_para_1 gubaidulina
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x0/0x0/825)
Loading new save file incoming migration stream (new xl fmt info 0x0/0x0/825)
Savefile contains xl domain config
xc: detail: Had 0 unexplained entries in p2m table
xc: Saving memory: iter 0 (last sent 0 skipped 0): 14592/14592 100%
xc: detail: delta 1519ms, dom0 47%, target 0%, sent 269Mb/s, dirtied 0Mb/s 36 pages
xc: Saving memory: iter 1 (last sent 12508 skipped 36): 14592/14592 100%
xc: detail: Start last iteration
xc: detail: SUSPEND shinfo 00042d92
xc: detail: delta 212ms, dom0 8%, target 0%, sent 5Mb/s, dirtied 14Mb/s 92 pages
xc: Saving memory: iter 2 (last sent 36 skipped 0): 14592/14592 100%
xc: detail: delta 2ms, dom0 0%, target 0%, sent 1507Mb/s, dirtied 1507Mb/s 92 pages
xc: detail: Total pages sent= 12636 (0.87x)
xc: detail: (of which 0 were fixups)
xc: detail: All memory is saved
xc: detail: Save exit rc=0
migration target: Transfer complete, requesting permission to start domain.
migration sender: Target has acknowledged transfer.
migration sender: Giving target permission to start.
migration target: Got permission, starting domain.
migration sender: Target reports successful startup.
migration target: Domain started successsfully.
Migration successful.
gubaidulina:~/:[3]# xl console sles11sp1_para_1
[ 119.760310] BUG: unable to handle kernel paging request at 00007f30c3ff39a0
[ 119.760310] IP: [<ffffffff8102987e>] fill_pte+0x1e/0x110
[ 119.760310] PGD fffffffffffff067 BAD
[ 119.760310] Oops: 0000 [#1] SMP
[ 119.760310] CPU 0
[ 119.760310] Modules linked in: xen_blkfront xen_netfront xenbus_probe_frontend ext3 mbcache jbd ata_generic ata_piix libata scsi_mod
[ 119.760310]
[ 119.760310] Pid: 6, comm: migration/0 Not tainted 3.0.0-rc5-5.1.home_olh-kernel-mainline #1
[ 119.760310] RIP: e030:[<ffffffff8102987e>] [<ffffffff8102987e>] fill_pte+0x1e/0x110
[ 119.760310] RSP: e02b:ffff8800028ffd20 EFLAGS: 00010082
[ 119.760310] RAX: ffffc7ffffffffd0 RBX: ffffffffff57b000 RCX: 00003ffffffff000
[ 119.760310] RDX: 00003ffffffff000 RSI: ffffffffff57b000 RDI: ffffc7ffffffffd0
[ 119.760310] RBP: ffffffffff57b000 R08: 0000000000000010 R09: 0000000000000000
[ 119.760310] R10: 0720072007200720 R11: ffffffff810368f0 R12: ffffc7ffffffffd0
[ 119.760310] R13: ffff880002a33d01 R14: ffff880002a15c20 R15: ffff8800030d60c8
[ 119.760310] FS: 00007f1b5d4e2700(0000) GS:ffff8800030c8000(0000) knlGS:0000000000000000
[ 119.760310] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 119.760310] CR2: 00007f30c3ff39a0 CR3: 00000000020ae000 CR4: 0000000000002660
[ 119.760310] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 119.760310] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 119.760310] Process migration/0 (pid: 6, threadinfo ffff8800028fe000, task ffff880002a15c20)
[ 119.760310] Stack:
[ 119.760310] ffffffff810368f0 ffffffffff57b000 0000000000000000 80000000057d1063
[ 119.760310] ffff880002a33d01 ffffffff81029ac3 ffffffff81603000 0000000000000000
[ 119.760310] ffff880002a33df4 ffffffff8102ecf4 0000000000000000 ffffffff81003848
[ 119.760310] Call Trace:
[ 119.760310] [<ffffffff810368f0>] ? pick_next_task_fair+0x140/0x140
[ 119.760310] [<ffffffff81029ac3>] ? set_pte_vaddr_pud+0x33/0x60
[ 119.760310] [<ffffffff8102ecf4>] ? __native_set_fixmap+0x24/0x30
[ 119.760310] [<ffffffff81003848>] ? xen_setup_shared_info+0x58/0x70
[ 119.760310] [<ffffffff810073cd>] ? xen_arch_post_suspend+0xd/0xb0
[ 119.760310] [<ffffffff811af5a9>] ? xen_post_suspend+0x9/0x20
[ 119.760310] [<ffffffff811af552>] ? xen_suspend+0x52/0xa0
[ 119.760310] [<ffffffff81078fec>] ? stop_machine_cpu_stop+0x8c/0xd0
[ 119.760310] [<ffffffff81078f60>] ? cpu_stopper_thread+0x180/0x180
[ 119.760310] [<ffffffff81078f60>] ? cpu_stopper_thread+0x180/0x180
[ 119.760310] [<ffffffff81078ea4>] ? cpu_stopper_thread+0xc4/0x180
[ 119.760310] [<ffffffff8100713f>] ? xen_restore_fl_direct_reloc+0x4/0x4
[ 119.760310] [<ffffffff812b7ccc>] ? _raw_spin_unlock_irqrestore+0xc/0x10
[ 119.760310] [<ffffffff8103cb6a>] ? try_to_wake_up+0xaa/0x250
[ 119.760310] [<ffffffff81036ac0>] ? dequeue_task_fair+0x1d0/0x1d0
[ 119.760310] [<ffffffff81006ab9>] ? xen_force_evtchn_callback+0x9/0x10
[ 119.760310] [<ffffffff81007152>] ? check_events+0x12/0x20
[ 119.760310] [<ffffffff81036ac0>] ? dequeue_task_fair+0x1d0/0x1d0
[ 119.760310] [<ffffffff81078de0>] ? ikconfig_read_current+0x20/0x20
[ 119.760310] [<ffffffff8105ce86>] ? kthread+0x96/0xa0
[ 119.760310] [<ffffffff812b9504>] ? kernel_thread_helper+0x4/0x10
[ 119.760310] [<ffffffff812b8936>] ? int_ret_from_sys_call+0x7/0x1b
[ 119.760310] [<ffffffff812b80e1>] ? retint_restore_args+0x5/0x6
[ 119.760310] [<ffffffff812b9500>] ? gs_change+0x13/0x13
[ 119.760310] Code: e9 28 38 00 00 0f 1f 84 00 00 00 00 00 48 83 ec 28 48 89 6c 24 10 4c 89 64 24 18 49 89 fc 48 89 5c 24 08 4c 89 6c 24 20 48 89 f5
[ 119.760310] 8b 3f 48 85 ff 74 52 ff 14 25 70 f8 60 81 48 c1 ed 09 48 89
[ 119.760310] RIP [<ffffffff8102987e>] fill_pte+0x1e/0x110
[ 119.760310] RSP <ffff8800028ffd20>
[ 119.760310] CR2: 00007f30c3ff39a0
[ 119.760310] ---[ end trace f00341e36132aa0d ]---
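As a reading aid for the oops above: the trace runs through __native_set_fixmap and set_pte_vaddr_pud into fill_pte, and RBX, RSI and RBP all hold ffffffffff57b000. Assuming that value is the virtual address being mapped (an interpretation of the register dump, not something the report states), the sketch below splits it into the indices used by standard x86-64 4-level paging. The function name split_vaddr is made up for this illustration.

```python
def split_vaddr(vaddr):
    """Split an x86-64 virtual address into 4-level page table indices."""
    return {
        "pgd": (vaddr >> 39) & 0x1FF,  # bits 47..39
        "pud": (vaddr >> 30) & 0x1FF,  # bits 38..30
        "pmd": (vaddr >> 21) & 0x1FF,  # bits 29..21
        "pte": (vaddr >> 12) & 0x1FF,  # bits 20..12
        "offset": vaddr & 0xFFF,       # byte offset within the 4 KiB page
    }


# Value of RBX/RSI/RBP in the oops; assumed to be the address being mapped.
fixmap_vaddr = 0xFFFFFFFFFF57B000

for level, index in split_vaddr(fixmap_vaddr).items():
    print(f"{level}: {index}")
```

The address falls in the last PGD and PUD slots, which would fit a mapping in the topmost part of the kernel address space.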
#
gubaidulina:~/:[130]# xl -v create -c -d /etc/xen/vm/sles11sp1_para_1
Parsing config file /etc/xen/vm/sles11sp1_para_1
(domain
(domid -1)
(create_info)
(hvm 0)
(hap 1)
(oos 1)
(ssidref 0)
(name sles11sp1_para_1)
(uuid <unknown>)
(cpupool Pool-0)
(xsdata (null))
(platformdata (null))
(build_info)
(max_vcpus 1)
(tsc_mode 0)
(max_memkb 50176)
(target_memkb 50176)
(nomigrate 0)
(image
(linux 0)
(kernel /boot/vmlinuz-kernel-mainline)
(cmdline quiet sysrq=yes panic=1 debug log_buf_len=4K console=hvc0)
(ramdisk /boot/initrd-kernel-mainline)
(e820_host 0)
)
)
)
domainbuilder: detail: xc_dom_allocate: cmdline=" quiet sysrq=yes panic=1 debug log_buf_len=4K console=hvc0", features="(null)"
domainbuilder: detail: xc_dom_kernel_file: filename="/boot/vmlinuz-kernel-mainline"
domainbuilder: detail: xc_dom_malloc_filemap : 2038 kB
domainbuilder: detail: xc_dom_ramdisk_file: filename="/boot/initrd-kernel-mainline"
domainbuilder: detail: xc_dom_malloc_filemap : 5859 kB
domainbuilder: detail: xc_dom_boot_xen_init: ver 4.2, caps xen-3.0-x86_64 xen-3.0-x86_32p
domainbuilder: detail: xc_dom_parse_image: called
domainbuilder: detail: xc_dom_find_loader: trying ELF-generic loader ...
domainbuilder: detail: loader probe failed
domainbuilder: detail: xc_dom_find_loader: trying Linux bzImage loader ...
domainbuilder: detail: xc_dom_malloc : 13010 kB
domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x1f4616 -> 0xcb4b30
domainbuilder: detail: loader probe OK
xc: detail: elf_parse_binary: phdr: paddr=0x1000000 memsz=0x4d3000
xc: detail: elf_parse_binary: phdr: paddr=0x1600000 memsz=0x38cc0
xc: detail: elf_parse_binary: phdr: paddr=0x1639000 memsz=0xd60
xc: detail: elf_parse_binary: phdr: paddr=0x163a000 memsz=0x11f80
xc: detail: elf_parse_binary: phdr: paddr=0x164c000 memsz=0x3e9000
xc: detail: elf_parse_binary: memory: 0x1000000 -> 0x1a35000
xc: detail: elf_xen_parse_note: GUEST_OS = "linux"
xc: detail: elf_xen_parse_note: GUEST_VERSION = "2.6"
xc: detail: elf_xen_parse_note: XEN_VERSION = "xen-3.0"
xc: detail: elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
xc: detail: elf_xen_parse_note: ENTRY = 0xffffffff8164c200
xc: detail: elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff81001000
xc: detail: elf_xen_parse_note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb"
xc: detail: elf_xen_parse_note: PAE_MODE = "yes"
xc: detail: elf_xen_parse_note: LOADER = "generic"
xc: detail: elf_xen_parse_note: unknown xen elf note (0xd)
xc: detail: elf_xen_parse_note: SUSPEND_CANCEL = 0x1
xc: detail: elf_xen_parse_note: HV_START_LOW = 0xffff800000000000
xc: detail: elf_xen_parse_note: PADDR_OFFSET = 0x0
xc: detail: elf_xen_addr_calc_check: addresses:
xc: detail: virt_base = 0xffffffff80000000
xc: detail: elf_paddr_offset = 0x0
xc: detail: virt_offset = 0xffffffff80000000
xc: detail: virt_kstart = 0xffffffff81000000
xc: detail: virt_kend = 0xffffffff81a35000
xc: detail: virt_entry = 0xffffffff8164c200
xc: detail: p2m_base = 0xffffffffffffffff
domainbuilder: detail: xc_dom_parse_elf_kernel: xen-3.0-x86_64: 0xffffffff81000000 -> 0xffffffff81a35000
domainbuilder: detail: xc_dom_mem_init: mem 49 MB, pages 0x3100 pages, 4k each
domainbuilder: detail: xc_dom_mem_init: 0x3100 pages
domainbuilder: detail: xc_dom_boot_mem_init: called
domainbuilder: detail: x86_compat: guest xen-3.0-x86_64, address size 64
domainbuilder: detail: xc_dom_build_image: called
domainbuilder: detail: xc_dom_alloc_segment: kernel : 0xffffffff81000000 -> 0xffffffff81a35000 (pfn 0x1000 + 0xa35 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x1000+0xa35 at 0x7feeade83000
xc: detail: elf_load_binary: phdr 0 at 0x0x7feeade83000 -> 0x0x7feeae356000
xc: detail: elf_load_binary: phdr 1 at 0x0x7feeae483000 -> 0x0x7feeae4bbcc0
xc: detail: elf_load_binary: phdr 2 at 0x0x7feeae4bc000 -> 0x0x7feeae4bcd60
xc: detail: elf_load_binary: phdr 3 at 0x0x7feeae4bd000 -> 0x0x7feeae4cef80
xc: detail: elf_load_binary: phdr 4 at 0x0x7feeae4cf000 -> 0x0x7feeae537000
domainbuilder: detail: xc_dom_alloc_segment: ramdisk : 0xffffffff81a35000 -> 0xffffffff826ba000 (pfn 0x1a35 + 0xc85 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x1a35+0xc85 at 0x7feead1fe000
domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x5b8c39 -> 0xc84e10
domainbuilder: detail: xc_dom_alloc_segment: phys2mach : 0xffffffff826ba000 -> 0xffffffff826d3000 (pfn 0x26ba + 0x19 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x26ba+0x19 at 0x7feeb1b65000
domainbuilder: detail: xc_dom_alloc_page : start info : 0xffffffff826d3000 (pfn 0x26d3)
domainbuilder: detail: xc_dom_alloc_page : xenstore : 0xffffffff826d4000 (pfn 0x26d4)
domainbuilder: detail: xc_dom_alloc_page : console : 0xffffffff826d5000 (pfn 0x26d5)
domainbuilder: detail: nr_page_tables: 0x0000ffffffffffff/48: 0xffff000000000000 -> 0xffffffffffffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x0000007fffffffff/39: 0xffffff8000000000 -> 0xffffffffffffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x000000003fffffff/30: 0xffffffff80000000 -> 0xffffffffbfffffff, 1 table(s)
domainbuilder: detail: nr_page_tables: 0x00000000001fffff/21: 0xffffffff80000000 -> 0xffffffff827fffff, 20 table(s)
domainbuilder: detail: xc_dom_alloc_segment: page tables : 0xffffffff826d6000 -> 0xffffffff826ed000 (pfn 0x26d6 + 0x17 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x26d6+0x17 at 0x7feeb1b34000
domainbuilder: detail: xc_dom_alloc_page : boot stack : 0xffffffff826ed000 (pfn 0x26ed)
domainbuilder: detail: xc_dom_build_image : virt_alloc_end : 0xffffffff826ee000
domainbuilder: detail: xc_dom_build_image : virt_pgtab_end : 0xffffffff82800000
domainbuilder: detail: xc_dom_boot_image: called
domainbuilder: detail: arch_setup_bootearly: doing nothing
domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_64 <= matches
domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_32p
domainbuilder: detail: xc_dom_update_guest_p2m: dst 64bit, pages 0x3100
domainbuilder: detail: clear_page: pfn 0x26d5, mfn 0x207f639
domainbuilder: detail: clear_page: pfn 0x26d4, mfn 0x101e1f2
domainbuilder: detail: xc_dom_pfn_to_ptr: domU mapping: pfn 0x26d3+0x1 at 0x7feeb1b64000
domainbuilder: detail: start_info_x86_64: called
domainbuilder: detail: setup_hypercall_page: vaddr=0xffffffff81001000 pfn=0x1001
domainbuilder: detail: domain builder memory footprint
domainbuilder: detail: allocated
domainbuilder: detail: malloc : 13249 kB
domainbuilder: detail: anon mmap : 0 bytes
domainbuilder: detail: mapped
domainbuilder: detail: file mmap : 7897 kB
domainbuilder: detail: domU mmap : 23468 kB
domainbuilder: detail: arch_setup_bootlate: shared_info: pfn 0x0, mfn 0x57d1
domainbuilder: detail: shared_info_x86_64: called
domainbuilder: detail: vcpu_x86_64: called
domainbuilder: detail: vcpu_x86_64: cr3: pfn 0x26d6 mfn 0x20bf63d
domainbuilder: detail: launch_vm: called, ctxt=0x7fff9cb67240
domainbuilder: detail: xc_dom_release: called
Daemon running with PID 29339
[ 0.000000] Linux version 3.0.0-rc5-5.1.home_olh-kernel-mainline (abuild@kuckuk) (gcc version 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux) ) #1 SMP Tue Jun 28 15:02:04 UTC 2011
[ 0.000000] Command line: quiet sysrq=yes panic=1 debug log_buf_len=4K console=hvc0
[ 0.000000] ACPI in unprivileged domain disabled
[ 0.000000] released 0 pages of unused memory
[ 0.000000] Set 0 page(s) to 1-1 mapping.
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] Xen: 0000000000000000 - 00000000000a0000 (usable)
[ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved)
[ 0.000000] Xen: 0000000000100000 - 0000000003900000 (usable)
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI not present or invalid.
[ 0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
[ 0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
[ 0.000000] No AGP bridge found
[ 0.000000] last_pfn = 0x3900 max_arch_pfn = 0x400000000
[ 0.000000] initial memory mapped : 0 - 026ba000
[ 0.000000] Base memory trampoline at [ffff88000009e000] 9e000 size 8192
[ 0.000000] init_memory_mapping: 0000000000000000-0000000003900000
[ 0.000000] 0000000000 - 0003900000 page 4k
[ 0.000000] kernel direct mapping tables up to 3900000 @ 30e1000-3100000
[ 0.000000] xen: setting RW the range 30e6000 - 3100000
[ 0.000000] RAMDISK: 01a35000 - 026ba000
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0x00000010 -> 0x00001000
[ 0.000000] DMA32 0x00001000 -> 0x00100000
[ 0.000000] Normal empty
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[2] active PFN ranges
[ 0.000000] 0: 0x00000010 -> 0x000000a0
[ 0.000000] 0: 0x00000100 -> 0x00003900
[ 0.000000] On node 0 totalpages: 14480
[ 0.000000] DMA zone: 56 pages used for memmap
[ 0.000000] DMA zone: 2 pages reserved
[ 0.000000] DMA zone: 3926 pages, LIFO batch:0
[ 0.000000] DMA32 zone: 144 pages used for memmap
[ 0.000000] DMA32 zone: 10352 pages, LIFO batch:1
[ 0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
[ 0.000000] No local APIC present
[ 0.000000] APIC: disable apic facility
[ 0.000000] APIC: switched to apic NOOP
[ 0.000000] nr_irqs_gsi: 16
[ 0.000000] Allocating PCI resources starting at 3900000 (gap: 3900000:fc700000)
[ 0.000000] Booting paravirtualized kernel on Xen
[ 0.000000] Xen version: 4.2.23640-2.1 (preserve-AD)
[ 0.000000] setup_percpu: NR_CPUS:8 nr_cpumask_bits:8 nr_cpu_ids:1 nr_node_ids:1
[ 0.000000] PERCPU: Embedded 25 pages/cpu @ffff8800030c8000 s73600 r8192 d20608 u102400
[ 0.000000] pcpu-alloc: s73600 r8192 d20608 u102400 alloc=25*4096
[ 0.000000] pcpu-alloc: [0] 0
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 14278
[ 0.000000] Kernel command line: quiet sysrq=yes panic=1 debug log_buf_len=4K console=hvc0
[ 0.000000] PID hash table entries: 256 (order: -1, 2048 bytes)
[ 0.000000] Dentry cache hash table entries: 8192 (order: 4, 65536 bytes)
[ 0.000000] Inode-cache hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.000000] Checking aperture...
[ 0.000000] No AGP bridge found
[ 0.000000] Memory: 25384k/58368k available (2800k kernel code, 448k absent, 32536k reserved, 3571k data, 476k init)
[ 0.000000] SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] NR_IRQS:4352 nr_irqs:256 16
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] console [tty0] enabled
[ 0.000000] console [hvc0] enabled
[ 0.000000] Xen: using vcpuop timer interface
[ 0.000000] installing Xen timer for CPU 0
[ 0.000000] Detected 2394.054 MHz processor.
[ 0.004000] Calibrating delay loop (skipped), value calculated using timer frequency.. 4788.10 BogoMIPS (lpj=9576216)
[ 0.004000] pid_max: default: 32768 minimum: 301
[ 0.004000] Mount-cache hash table entries: 256
[ 0.004000] CPU: Physical Processor ID: 0
[ 0.004000] CPU: Processor Core ID: 0
[ 0.004000] SMP alternatives: switching to UP code
[ 0.005334] Freeing SMP alternatives: 12k freed
[ 0.005474] Performance Events: unsupported p6 CPU model 47 no PMU driver, software events only.
[ 0.005551] Brought up 1 CPUs
[ 0.005737] Grant table initialized
[ 0.005772] NET: Registered protocol family 16
[ 0.007024] PCI: setting up Xen PCI frontend stub
[ 0.007034] PCI: pci_cache_line_size set to 64 bytes
[ 0.007294] bio: create slab <bio-0> at 0
[ 0.007342] ACPI: Interpreter disabled.
[ 0.008000] xen/balloon: Initialising balloon driver.
[ 0.008000] last_pfn = 0x3900 max_arch_pfn = 0x400000000
[ 0.008584] xen-balloon: Initialising balloon driver.
[ 0.008917] vgaarb: loaded
[ 0.008944] PCI: System does not support PCI
[ 0.008948] PCI: System does not support PCI
[ 0.009017] Switching to clocksource xen
[ 0.009074] pnp: PnP ACPI: disabled
[ 0.009636] PCI: max bus depth: 0 pci_try_num: 1
[ 0.009658] NET: Registered protocol family 2
[ 0.009687] IP route cache hash table entries: 512 (order: 0, 4096 bytes)
[ 0.009744] TCP established hash table entries: 2048 (order: 3, 32768 bytes)
[ 0.009756] TCP bind hash table entries: 2048 (order: 3, 32768 bytes)
[ 0.009765] TCP: Hash tables configured (established 2048 bind 2048)
[ 0.009770] TCP reno registered
[ 0.009775] UDP hash table entries: 128 (order: 0, 4096 bytes)
[ 0.009782] UDP-Lite hash table entries: 128 (order: 0, 4096 bytes)
[ 0.009821] NET: Registered protocol family 1
[ 0.009830] PCI: CLS 0 bytes, default 64
[ 0.009871] Unpacking initramfs...
[ 0.022163] Freeing initrd memory: 12820k freed
[ 0.025665] platform rtc_cmos: registered platform RTC device (no PNP device found)
[ 0.027099] msgmni has been set to 74
[ 0.027116] io scheduler noop registered (default)
[ 0.027340] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.027495] i8042: PNP: No PS/2 controller found. Probing ports directly.
[ 0.028057] i8042: No controller found
[ 0.028057] mousedev: PS/2 mouse device common for all mice
[ 0.068335] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
[ 0.068477] rtc_cmos: probe of rtc_cmos failed with error -38
[ 0.068494] cpuidle: using governor ladder
[ 0.068532] TCP cubic registered
[ 0.068537] NET: Registered protocol family 17
[ 0.068546] Registering the dns_resolver key type
[ 0.068607] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[ 0.068856] Freeing unused kernel memory: 476k freed
[ 0.069040] Write protecting the kernel read-only data: 6144k
[ 0.072401] Freeing unused kernel memory: 1276k freed
[ 0.073409] Freeing unused kernel memory: 1204k freed
doing fast boot
FATAL: Module megaraid_sas not found.
[ 0.184890] SCSI subsystem initialized
[ 0.195210] libata version 3.00 loaded.
FATAL: Module processor not found.
FATAL: Module thermal not found.
FATAL: Module fan not found.
[ 0.286182] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
[ 0.286192] EDD information not available.
FATAL: Error inserting edd (/lib/modules/3.0.0-rc5-5.1.home_olh-kernel-mainline/kernel/drivers/firmware/edd.ko): No such device
[ 0.301295] Initialising Xen virtual ethernet driver.
preping 03-storage.sh
running 03-storage.sh
preping 04-udev.sh
running 04-udev.sh
Creating device nodes with udev
[ 0.324277] udevd (134): /proc/134/oom_adj is deprecated, please use /proc/134/oom_score_adj instead.
[ 0.324407] udevd version 128 started
preping 05-blogd.sh
running 05-blogd.sh
mount: devpts already mounted or /dev/pts busy
mount: according to mtab, devpts is already mounted on /dev/pts
Boot logging started on /dev/hvc0(/dev/console) at Fri Jul 1 18:15:40 2011
preping 05-clock.sh
running 05-clock.sh
preping 11-block.sh
running 11-block.sh
preping 11-usb.sh
running 11-usb.sh
preping 21-devinit_done.sh
running 21-devinit_done.sh
preping 81-resume.userspace.sh
running 81-resume.userspace.sh
Trying manual resume from /dev/disk/by-id/scsi-3600605b000d6850011e4fa070c236a10-part2
preping 82-resume.kernel.sh
running 82-resume.kernel.sh
Trying manual resume from /dev/disk/by-id/scsi-3600605b000d6850011e4fa070c236a10-part2
preping 83-mount.sh
running 83-mount.sh
Waiting for device /dev/sda6 to appear: ..............................Could not find /dev/sda6.
Want me to fall back to /dev/sda6? (Y/n)
n
not found -- exiting to /bin/sh
$
$ gubaidulina:~/:[0]#
gubaidulina:~/:[0]# xl -v migrate sles11sp1_para_1 bolen
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x0/0x0/825)
Loading new save file incoming migration stream (new xl fmt info 0x0/0x0/825)
Savefile contains xl domain config
xc: detail: Had 0 unexplained entries in p2m table
xc: Saving memory: iter 0 (last sent 0 skipped 0): 14592/14592 100%
xc: detail: delta 1430ms, dom0 68%, target 0%, sent 286Mb/s, dirtied 1Mb/s 48 pages
xc: Saving memory: iter 1 (last sent 12508 skipped 36): 14592/14592 100%
xc: detail: Start last iteration
xc: detail: SUSPEND shinfo 000057d1
xc: detail: delta 215ms, dom0 10%, target 0%, sent 7Mb/s, dirtied 13Mb/s 91 pages
xc: Saving memory: iter 2 (last sent 48 skipped 0): 14592/14592 100%
xc: detail: delta 1ms, dom0 0%, target 0%, sent 2981Mb/s, dirtied 2981Mb/s 91 pages
xc: detail: Total pages sent= 12647 (0.87x)
xc: detail: (of which 0 were fixups)
xc: detail: All memory is saved
xc: detail: Save exit rc=0
migration target: Transfer complete, requesting permission to start domain.
migration sender: Target has acknowledged transfer.
migration sender: Giving target permission to start.
migration target: Got permission, starting domain.
migration target: Domain started successsfully.
migration sender: Target reports successful startup.
Migration successful.
gubaidulina:~/:[0]#
root@bolen:~ # xl console sles11sp1_para_1
[ 53.328089] PM: early restore of devices complete after 0.014 msecs
[ 53.328923] PM: restore of devices complete after 0.049 msecs
$ root@bolen:~ #
root@bolen:~ # xl -v migrate sles11sp1_para_1 gubaidulina
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x0/0x0/825)
Loading new save file incoming migration stream (new xl fmt info 0x0/0x0/825)
Savefile contains xl domain config
xc: detail: Had 0 unexplained entries in p2m table
xc: Saving memory: iter 0 (last sent 0 skipped 0): 14592/14592 100%
xc: detail: delta 1462ms, dom0 54%, target 0%, sent 280Mb/s, dirtied 0Mb/s 36 pages
xc: Saving memory: iter 1 (last sent 12508 skipped 36): 14592/14592 100%
xc: detail: Start last iteration
xc: detail: SUSPEND shinfo 00042d92
xc: detail: delta 207ms, dom0 8%, target 0%, sent 5Mb/s, dirtied 14Mb/s 92 pages
xc: Saving memory: iter 2 (last sent 36 skipped 0): 14592/14592 100%
xc: detail: delta 1ms, dom0 100%, target 0%, sent 3014Mb/s, dirtied 3014Mb/s 92 pages
xc: detail: Total pages sent= 12636 (0.87x)
xc: detail: (of which 0 were fixups)
xc: detail: All memory is saved
xc: detail: Save exit rc=0
migration target: Transfer complete, requesting permission to start domain.
migration sender: Target has acknowledged transfer.
migration sender: Giving target permission to start.
migration target: Got permission, starting domain.
migration sender: Target reports successful startup.
migration target: Domain started successsfully.
Migration successful.
root@bolen:~ #
gubaidulina:~/:[0]# xl console sles11sp1_para_1
[ 110.769663] PM: early restore of devices complete after 0.023 msecs
[ 110.769663] PM: restore of devices complete after 0.052 msecs
$ gubaidulina:~/:[0]#
* Re: Re: migration of pv guest fails from small to large host
2011-07-01 16:20 ` Olaf Hering
@ 2011-07-05 13:46 ` Konrad Rzeszutek Wilk
2011-07-07 13:20 ` Olaf Hering
0 siblings, 1 reply; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-07-05 13:46 UTC (permalink / raw)
To: Olaf Hering; +Cc: xen-devel
On Fri, Jul 01, 2011 at 06:20:23PM +0200, Olaf Hering wrote:
>
> The situation with Linux 3.0 is different, crashes on target. With SLES11
> kernel I cannot attach to the console, the guest is apparently running.
What happens if you 'xl save', 'xl restore' on the same machine? Does that work?
* Re: Re: migration of pv guest fails from small to large host
2011-07-05 13:46 ` Konrad Rzeszutek Wilk
@ 2011-07-07 13:20 ` Olaf Hering
2011-07-07 18:36 ` Konrad Rzeszutek Wilk
0 siblings, 1 reply; 17+ messages in thread
From: Olaf Hering @ 2011-07-07 13:20 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: xen-devel
On Tue, Jul 05, Konrad Rzeszutek Wilk wrote:
> On Fri, Jul 01, 2011 at 06:20:23PM +0200, Olaf Hering wrote:
> >
> > The situation with Linux 3.0 is different, crashes on target. With SLES11
> > kernel I cannot attach to the console, the guest is apparently running.
>
> What happens if you 'xl save', 'xl restore' on the same machine? Does that work?
save/restore on the same system works, just any form of transit from
small to large fails.
Olaf
* Re: Re: migration of pv guest fails from small to large host
2011-07-07 13:20 ` Olaf Hering
@ 2011-07-07 18:36 ` Konrad Rzeszutek Wilk
2011-07-07 21:02 ` Olaf Hering
0 siblings, 1 reply; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-07-07 18:36 UTC (permalink / raw)
To: Olaf Hering; +Cc: xen-devel
On Thu, Jul 07, 2011 at 06:20:42AM -0700, Olaf Hering wrote:
> On Tue, Jul 05, Konrad Rzeszutek Wilk wrote:
>
> > On Fri, Jul 01, 2011 at 06:20:23PM +0200, Olaf Hering wrote:
> > >
> > > The situation with Linux 3.0 is different, crashes on target. With SLES11
> > > kernel I cannot attach to the console, the guest is apparently running.
> >
> > What happens if you 'xl save', 'xl restore' on the same machine? Does that work?
>
> save/restore on the same system works, just any form of transit from
> small to large fails.
I am not sure what 'small' to 'large' means in this context? As in the
migration from a machine with less memory than the other? What happens
if you migrate the other way? Does it work if you use the old style 'xm' commands?
>
> Olaf
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
* Re: Re: migration of pv guest fails from small to large host
2011-07-07 18:36 ` Konrad Rzeszutek Wilk
@ 2011-07-07 21:02 ` Olaf Hering
0 siblings, 0 replies; 17+ messages in thread
From: Olaf Hering @ 2011-07-07 21:02 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: xen-devel
On Thu, Jul 07, Konrad Rzeszutek Wilk wrote:
> On Thu, Jul 07, 2011 at 06:20:42AM -0700, Olaf Hering wrote:
> > On Tue, Jul 05, Konrad Rzeszutek Wilk wrote:
> >
> > > On Fri, Jul 01, 2011 at 06:20:23PM +0200, Olaf Hering wrote:
> > > >
> > > > The situation with Linux 3.0 is different, crashes on target. With SLES11
> > > > kernel I cannot attach to the console, the guest is apparently running.
> > >
> > > What happens if you 'xl save', 'xl restore' on the same machine? Does that work?
> >
> > save/restore on the same system works, just any form of transit from
> > small to large fails.
>
> I am not sure what 'small' to 'large' means in this context? As in the
> migration from a machine with less memory than the other? What happens
> if you migrate the other way? Does it work if you use the old style 'xm' commands?
xm or xl make no difference, they call the same helper function.
If the guest is started on a large system (>64GB), then migrated to a
small system (<64GB) and back to the large system nothing bad happens.
But if the guest is started on the small system, then migrated to the
large system the guest either crashes or hangs, depending on the guest
kernel. So far I have no handle on what actually happens. Doing a dump
fails because the p2m stuff is not yet configured by the guest.
Today I wrote some debug code to get some tracing info, which I need to
test.
Olaf
* [PATCH] xen: update machine_to_phys_order on resume
2011-07-01 10:41 migration of pv guest fails from small to large host Olaf Hering
2011-07-01 16:20 ` Olaf Hering
@ 2011-07-12 16:43 ` Olaf Hering
2011-07-12 18:11 ` [Xen-devel] " Konrad Rzeszutek Wilk
1 sibling, 1 reply; 17+ messages in thread
From: Olaf Hering @ 2011-07-12 16:43 UTC (permalink / raw)
To: xen-devel
Migration of pv guests fails, the guest crashes on the target host once the
guest is unpaused after transit. It happens when the guest is started on a
small system, then migrated from that small system to a large system.
If the guest is started on a large system, then migrated to a small system and
back to the large system, the migration will be successful.
The issue is that mfn_to_pfn() makes use of machine_to_phys_order, which
is only configured once early in the boot process. After migration to a
large host the mfns will exceed the order from the small system and a
wrong code path is taken.
Calling xen_setup_machphys_mapping() again in the resume path will avoid
the crash.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
arch/x86/xen/mmu.c | 2 +-
arch/x86/xen/suspend.c | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)
Index: linux-3.0-rc7/arch/x86/xen/mmu.c
===================================================================
--- linux-3.0-rc7.orig/arch/x86/xen/mmu.c
+++ linux-3.0-rc7/arch/x86/xen/mmu.c
@@ -1623,7 +1623,7 @@ static void __init xen_map_identity_earl
 	set_page_prot(pmd, PAGE_KERNEL_RO);
 }
 
-void __init xen_setup_machphys_mapping(void)
+void xen_setup_machphys_mapping(void)
 {
 	struct xen_machphys_mapping mapping;
 	unsigned long machine_to_phys_nr_ents;
Index: linux-3.0-rc7/arch/x86/xen/suspend.c
===================================================================
--- linux-3.0-rc7.orig/arch/x86/xen/suspend.c
+++ linux-3.0-rc7/arch/x86/xen/suspend.c
@@ -43,6 +43,8 @@ void xen_arch_hvm_post_suspend(int suspe
 
 void xen_arch_post_suspend(int suspend_cancelled)
 {
+	xen_setup_machphys_mapping();
+
 	xen_build_mfn_list_list();
 
 	xen_setup_shared_info();
* Re: [Xen-devel] [PATCH] xen: update machine_to_phys_order on resume
2011-07-12 16:43 ` [PATCH] xen: update machine_to_phys_order on resume Olaf Hering
@ 2011-07-12 18:11 ` Konrad Rzeszutek Wilk
2011-07-13 9:12 ` Ian Campbell
0 siblings, 1 reply; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-07-12 18:11 UTC (permalink / raw)
To: Olaf Hering, linux-kernel; +Cc: xen-devel
On Tue, Jul 12, 2011 at 06:43:42PM +0200, Olaf Hering wrote:
>
> Migration of pv guests fails, the guest crashes on the target host once the
> guest is unpaused after transit. It happens when the guest is started on a
> small system, then migrated from that small system to a large system.
> If the guest is started on a large system, then migrated to a small system and
> back to the large system, the migration will be successful.
>
> The issue is that mfn_to_pfn() makes use of machine_to_phys_order, which
> is only configured once early in the boot process. After migration to a
> large host the mfns will exceed the order from the small system and a
> wrong code path is taken.
>
> Calling xen_setup_machphys_mapping() again in the resume path will avoid
> the crash.
Oh, duh!
Let me queue that up for 3.0-rc7 unless there are objections?
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
>
> ---
> arch/x86/xen/mmu.c | 2 +-
> arch/x86/xen/suspend.c | 2 ++
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
> Index: linux-3.0-rc7/arch/x86/xen/mmu.c
> ===================================================================
> --- linux-3.0-rc7.orig/arch/x86/xen/mmu.c
> +++ linux-3.0-rc7/arch/x86/xen/mmu.c
> @@ -1623,7 +1623,7 @@ static void __init xen_map_identity_earl
> set_page_prot(pmd, PAGE_KERNEL_RO);
> }
>
> -void __init xen_setup_machphys_mapping(void)
> +void xen_setup_machphys_mapping(void)
> {
> struct xen_machphys_mapping mapping;
> unsigned long machine_to_phys_nr_ents;
> Index: linux-3.0-rc7/arch/x86/xen/suspend.c
> ===================================================================
> --- linux-3.0-rc7.orig/arch/x86/xen/suspend.c
> +++ linux-3.0-rc7/arch/x86/xen/suspend.c
> @@ -43,6 +43,8 @@ void xen_arch_hvm_post_suspend(int suspe
>
> void xen_arch_post_suspend(int suspend_cancelled)
> {
> + xen_setup_machphys_mapping();
> +
> xen_build_mfn_list_list();
>
> xen_setup_shared_info();
>
* Re: [Xen-devel] [PATCH] xen: update machine_to_phys_order on resume
2011-07-12 18:11 ` [Xen-devel] " Konrad Rzeszutek Wilk
@ 2011-07-13 9:12 ` Ian Campbell
2011-07-13 13:20 ` Konrad Rzeszutek Wilk
2011-07-15 16:05 ` Jan Beulich
0 siblings, 2 replies; 17+ messages in thread
From: Ian Campbell @ 2011-07-13 9:12 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk, Keir Fraser, Jan Beulich
Cc: Olaf Hering, linux-kernel@vger.kernel.org,
xen-devel@lists.xensource.com
On Tue, 2011-07-12 at 19:11 +0100, Konrad Rzeszutek Wilk wrote:
> On Tue, Jul 12, 2011 at 06:43:42PM +0200, Olaf Hering wrote:
> >
> > Migration of pv guests fails, the guest crashes on the target host once the
> > guest is unpaused after transit. It happens when the guest is started on a
> > small system, then migrated from that small system to a large system.
> > If the guest is started on a large system, then migrated to a small system and
> > back to the large system, the migration will be successful.
> >
> > The issue is that mfn_to_pfn() makes use of machine_to_phys_order, which
> > is only configured once early in the boot process. After migration to a
> > large host the mfns will exceed the order from the small system and a
> > wrong code path is taken.
> >
> > Calling xen_setup_machphys_mapping() again in the resume path will avoid
> > the crash.
>
> Oh, duh!
>
> Let me queue that up for 3.0-rc7 unless there are objections?
It's not so much an objection to this patch but this issue seems to have
been caused by Xen cset 20892:d311d1efc25e which looks to me like a
subtle ABI breakage for guests. Perhaps we should introduce a feature
flag to indicate that a guest can cope with the m2p changing size over
migration like this?
Ian.
>
> >
> > Signed-off-by: Olaf Hering <olaf@aepfle.de>
> >
> > ---
> > arch/x86/xen/mmu.c | 2 +-
> > arch/x86/xen/suspend.c | 2 ++
> > 2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > Index: linux-3.0-rc7/arch/x86/xen/mmu.c
> > ===================================================================
> > --- linux-3.0-rc7.orig/arch/x86/xen/mmu.c
> > +++ linux-3.0-rc7/arch/x86/xen/mmu.c
> > @@ -1623,7 +1623,7 @@ static void __init xen_map_identity_earl
> > set_page_prot(pmd, PAGE_KERNEL_RO);
> > }
> >
> > -void __init xen_setup_machphys_mapping(void)
> > +void xen_setup_machphys_mapping(void)
> > {
> > struct xen_machphys_mapping mapping;
> > unsigned long machine_to_phys_nr_ents;
> > Index: linux-3.0-rc7/arch/x86/xen/suspend.c
> > ===================================================================
> > --- linux-3.0-rc7.orig/arch/x86/xen/suspend.c
> > +++ linux-3.0-rc7/arch/x86/xen/suspend.c
> > @@ -43,6 +43,8 @@ void xen_arch_hvm_post_suspend(int suspe
> >
> > void xen_arch_post_suspend(int suspend_cancelled)
> > {
> > + xen_setup_machphys_mapping();
> > +
> > xen_build_mfn_list_list();
> >
> > xen_setup_shared_info();
> >
* Re: [Xen-devel] [PATCH] xen: update machine_to_phys_order on resume
2011-07-13 9:12 ` Ian Campbell
@ 2011-07-13 13:20 ` Konrad Rzeszutek Wilk
2011-07-15 8:56 ` Jan Beulich
2011-07-15 16:05 ` Jan Beulich
1 sibling, 1 reply; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-07-13 13:20 UTC (permalink / raw)
To: Ian Campbell
Cc: Keir Fraser, Jan Beulich, Olaf Hering,
linux-kernel@vger.kernel.org, xen-devel@lists.xensource.com
On Wed, Jul 13, 2011 at 10:12:44AM +0100, Ian Campbell wrote:
> On Tue, 2011-07-12 at 19:11 +0100, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jul 12, 2011 at 06:43:42PM +0200, Olaf Hering wrote:
> > >
> > > Migration of pv guests fails, the guest crashes on the target host once the
> > > guest is unpaused after transit. It happens when the guest is started on a
> > > small system, then migrated from that small system to a large system.
> > > If the guest is started on a large system, then migrated to a small system and
> > > back to the large system, the migration will be successful.
> > >
> > > The issue is that mfn_to_pfn() makes use of machine_to_phys_order, which
> > > is only configured once early in the boot process. After migration to a
> > > large host the mfns will exceed the order from the small system and a
> > > wrong code path is taken.
> > >
> > > Calling xen_setup_machphys_mapping() again in the resume path will avoid
> > > the crash.
> >
> > Oh, duh!
> >
> > Let me queue that up for 3.0-rc7 unless there are objections?
>
> It's not so much an objection to this patch but this issue seems to have
> been caused by Xen cset 20892:d311d1efc25e which looks to me like a
> subtle ABI breakage for guests. Perhaps we should introduce a feature
> flag to indicate that a guest can cope with the m2p changing size over
> migration like this?
Sounds reasonable to me.. I will wait (I can always submit it during 3.1 cycle
and CC stable@kernel.org to backport it to 3.0.1).
Jan, you are the one who came up with the c/s - what's your thought?
How does your kernel handle the changing size of the M2P - like the patch below?
>
> Ian.
>
> >
> > >
> > > Signed-off-by: Olaf Hering <olaf@aepfle.de>
> > >
> > > ---
> > > arch/x86/xen/mmu.c | 2 +-
> > > arch/x86/xen/suspend.c | 2 ++
> > > 2 files changed, 3 insertions(+), 1 deletion(-)
> > >
> > > Index: linux-3.0-rc7/arch/x86/xen/mmu.c
> > > ===================================================================
> > > --- linux-3.0-rc7.orig/arch/x86/xen/mmu.c
> > > +++ linux-3.0-rc7/arch/x86/xen/mmu.c
> > > @@ -1623,7 +1623,7 @@ static void __init xen_map_identity_earl
> > > set_page_prot(pmd, PAGE_KERNEL_RO);
> > > }
> > >
> > > -void __init xen_setup_machphys_mapping(void)
> > > +void xen_setup_machphys_mapping(void)
> > > {
> > > struct xen_machphys_mapping mapping;
> > > unsigned long machine_to_phys_nr_ents;
> > > Index: linux-3.0-rc7/arch/x86/xen/suspend.c
> > > ===================================================================
> > > --- linux-3.0-rc7.orig/arch/x86/xen/suspend.c
> > > +++ linux-3.0-rc7/arch/x86/xen/suspend.c
> > > @@ -43,6 +43,8 @@ void xen_arch_hvm_post_suspend(int suspe
> > >
> > > void xen_arch_post_suspend(int suspend_cancelled)
> > > {
> > > + xen_setup_machphys_mapping();
> > > +
> > > xen_build_mfn_list_list();
> > >
> > > xen_setup_shared_info();
> > >
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
* Re: [PATCH] xen: update machine_to_phys_order on resume
@ 2011-07-13 14:29 Jan Beulich
0 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2011-07-13 14:29 UTC (permalink / raw)
To: Ian.Campbell, konrad.wilk, keir; +Cc: olaf, xen-devel, linux-kernel
>>> Ian Campbell 07/13/11 11:12 AM >>>
>It's not so much an objection to this patch but this issue seems to have
>been caused by Xen cset 20892:d311d1efc25e which looks to me like a
>subtle ABI breakage for guests. Perhaps we should introduce a feature
>flag to indicate that a guest can cope with the m2p changing size over
>migration like this?
Indeed - migration was completely beyond my consideration when
submitting this. A feature flag seems the right way to go to me.
Jan
* Re: [Xen-devel] [PATCH] xen: update machine_to_phys_order on resume
2011-07-13 13:20 ` Konrad Rzeszutek Wilk
@ 2011-07-15 8:56 ` Jan Beulich
0 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2011-07-15 8:56 UTC (permalink / raw)
To: Ian Campbell, Konrad Rzeszutek Wilk
Cc: Olaf Hering, xen-devel@lists.xensource.com,
linux-kernel@vger.kernel.org, Keir Fraser
>>> On 13.07.11 at 15:20, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Wed, Jul 13, 2011 at 10:12:44AM +0100, Ian Campbell wrote:
>> On Tue, 2011-07-12 at 19:11 +0100, Konrad Rzeszutek Wilk wrote:
>> > On Tue, Jul 12, 2011 at 06:43:42PM +0200, Olaf Hering wrote:
>> > >
>> > > Migration of pv guests fails, the guest crashes on the target host once the
>> > > guest is unpaused after transit. It happens when the guest is started on a
>> > > small system, then migrated from that small system to a large system.
>> > > If the guest is started on a large system, then migrated to a small system and
>> > > back to the large system, the migration will be successful.
>> > >
>> > > The issue is that mfn_to_pfn() makes use of machine_to_phys_order, which
>> > > is only configured once early in the boot process. After migration to a
>> > > large host the mfns will exceed the order from the small system and a
>> > > wrong code path is taken.
>> > >
>> > > Calling xen_setup_machphys_mapping() again in the resume path will avoid
>> > > the crash.
>> >
>> > Oh, duh!
>> >
>> > Let me queue that up for 3.0-rc7 unless there are objections?
>>
>> It's not so much an objection to this patch but this issue seems to have
>> been caused by Xen cset 20892:d311d1efc25e which looks to me like a
>> subtle ABI breakage for guests. Perhaps we should introduce a feature
>> flag to indicate that a guest can cope with the m2p changing size over
>> migration like this?
>
> Sounds reasonable to me.. I will wait (I can always submit it during 3.1 cycle
> and CC stable@kernel.org to backport it to 3.0.1).
>
> Jan, you are the one who came up with the c/s - what's your thought?
> How does your kernel handle the changing size of the M2P - like the patch
> below?
As said in an earlier reply to Ian, I didn't pay attention to the
migration aspects and I'm in favor of introducing a feature flag
to control the behavior.
In the meantime, as an immediate fix, I just sent out a patch to
revert to original behavior for all but Dom0.
Jan
* Re: [Xen-devel] [PATCH] xen: update machine_to_phys_order on resume
2011-07-13 9:12 ` Ian Campbell
2011-07-13 13:20 ` Konrad Rzeszutek Wilk
@ 2011-07-15 16:05 ` Jan Beulich
1 sibling, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2011-07-15 16:05 UTC (permalink / raw)
To: Ian Campbell, Konrad Rzeszutek Wilk, Keir Fraser
Cc: Olaf Hering, xen-devel@lists.xensource.com,
linux-kernel@vger.kernel.org
>>> On 13.07.11 at 11:12, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> It's not so much an objection to this patch but this issue seems to have
> been caused by Xen cset 20892:d311d1efc25e which looks to me like a
> subtle ABI breakage for guests. Perhaps we should introduce a feature
> flag to indicate that a guest can cope with the m2p changing size over
> migration like this?
That's actually not straightforward, as the hypervisor can't see the ELF
note specified features of a DomU kernel. Passing this information
down from the tools or from the guest kernel itself otoh doesn't
necessarily seem worth it. Instead a guest that can deal with the
upper bound of the M2P table changing can easily obtain the
desired information through XENMEM_maximum_ram_page. So I
think on the hypervisor side we're good with the patch I sent
earlier today.
Now - does anyone remember why machine_to_phys_order got
introduced in the first place (rather than doing a precise upper
bound check using the maximum number the hypervisor returned)?
I really can't see any benefit in calculating and using the much
more coarse order only.
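
To illustrate the difference between the two checks, here is a minimal sketch,
not the actual kernel code. The helper names and the example MFN values are
hypothetical; the only assumption is that the guest learns a maximum MFN from
the hypervisor and either rounds the entry count up to a power-of-two order or
compares against the exact value.

```c
#include <stdbool.h>

/* Hypothetical helper: smallest order with (1UL << order) >= nr_entries. */
static unsigned int order_for_entries(unsigned long nr_entries)
{
    unsigned int order = 0;

    /* Stop one bit short of the word size so the shift stays defined. */
    while (order < sizeof(unsigned long) * 8 - 1 &&
           (1UL << order) < nr_entries)
        order++;
    return order;
}

/* Coarse check: accepts every MFN below the next power of two. */
static bool mfn_valid_by_order(unsigned long mfn, unsigned int order)
{
    return (mfn >> order) == 0;
}

/* Precise check: accepts only MFNs up to the reported maximum. */
static bool mfn_valid_by_bound(unsigned long mfn, unsigned long max_mfn)
{
    return mfn <= max_mfn;
}
```

With a hypothetical maximum MFN of 0x5FFFFF the order rounds up to 23, so the
coarse check still accepts MFN 0x700000 while the precise check rejects it.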
Jan
* Re: [PATCH] xen: update machine_to_phys_order on resume
@ 2011-07-15 17:30 Jan Beulich
2011-07-15 18:23 ` Keir Fraser
0 siblings, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2011-07-15 17:30 UTC (permalink / raw)
To: Ian.Campbell, konrad.wilk, keir; +Cc: olaf, xen-devel, linux-kernel
>>> "Jan Beulich" 07/15/11 6:07 PM >>>
>>>> On 13.07.11 at 11:12, Ian Campbell wrote:
>> It's not so much an objection to this patch but this issue seems to have
>> been caused by Xen cset 20892:d311d1efc25e which looks to me like a
>> subtle ABI breakage for guests. Perhaps we should introduce a feature
>> flag to indicate that a guest can cope with the m2p changing size over
>> migration like this?
>
>That's actually not straightforward, as the hypervisor can't see the ELF
>note specified features of a DomU kernel. Passing this information
>down from the tools or from the guest kernel itself otoh doesn't
>necessarily seem worth it. Instead a guest that can deal with the
>upper bound of the M2P table changing can easily obtain the
>desired information through XENMEM_maximum_ram_page. So I
>think on the hypervisor side we're good with the patch I sent
>earlier today.
Actually, one more thought: What's the purpose of this hypercall if
it is set in stone what values it ought to return? Isn't a guest using
it (supposed to be) advertising that it can deal with the values being
variable (and it was just overlooked so far that this doesn't only
include varying values from boot to boot, but also migration)? Or in
other words, if we found a need to relocate the M2P table or grow
its static maximum size, it would be impossible to migrate guests
from an old to a new hypervisor.
Jan
* Re: [PATCH] xen: update machine_to_phys_order on resume
2011-07-15 17:30 Jan Beulich
@ 2011-07-15 18:23 ` Keir Fraser
0 siblings, 0 replies; 17+ messages in thread
From: Keir Fraser @ 2011-07-15 18:23 UTC (permalink / raw)
To: Jan Beulich, Ian.Campbell, Konrad Rzeszutek Wilk, keir
Cc: olaf, xen-devel, linux-kernel
On 15/07/2011 18:30, "Jan Beulich" <JBeulich@novell.com> wrote:
> Actually, one more thought: What's the purpose of this hypercall if
> it is set in stone what values it ought to return? Isn't a guest using
> it (supposed to be) advertising that it can deal with the values being
> variable (and it was just overlooked so far that this doesn't only
> include varying values from boot to boot, but also migration)? Or in
> other words, if we found a need to relocate the M2P table or grow
> its static maximum size, it would be impossible to migrate guests
> from an old to a new hypervisor.
Fair point. There has to be a static fallback set of return values for old
guests.
-- Keir
* Re: [PATCH] xen: update machine_to_phys_order on resume
2011-07-18 7:05 [Xen-devel] " Jan Beulich
@ 2011-07-18 7:27 ` Keir Fraser
2011-07-18 8:31 ` [Xen-devel] " Ian Campbell
0 siblings, 1 reply; 17+ messages in thread
From: Keir Fraser @ 2011-07-18 7:27 UTC (permalink / raw)
To: Jan Beulich, Ian.Campbell, Konrad Rzeszutek Wilk, keir
Cc: olaf, xen-devel, linux-kernel
On 18/07/2011 08:05, "Jan Beulich" <JBeulich@novell.com> wrote:
>>>> On 15.07.11 at 20:23, Keir Fraser <keir.xen@gmail.com> wrote:
>> On 15/07/2011 18:30, "Jan Beulich" <JBeulich@novell.com> wrote:
>>
>>> Actually, one more thought: What's the purpose of this hypercall if
>>> it is set in stone what values it ought to return? Isn't a guest using
>>> it (supposed to be) advertising that it can deal with the values being
>>> variable (and it was just overlooked so far that this doesn't only
>>> include varying values from boot to boot, but also migration)? Or in
>>> other words, if we found a need to relocate the M2P table or grow
>>> its static maximum size, it would be impossible to migrate guests
>>> from an old to a new hypervisor.
>>
>> Fair point. There has to be a static fallback set of return values for old
>> guests.
>
> Hmm, in my reading the two sentences sort of contradict each other.
> That is, I'm not certain what route we want to go here: Keep things
> the way they are after 23706:3dd399873c9e, and introduce a
> completely new discovery mechanism if we find it necessary to change
> the M2P table's location and/or size, including a mechanism for a guest
> to announce it's capable of dealing with that? If so, I think we ought
> to add a comment to the hypercall implementation documenting that
> its return values must not be changed (and why).
We can return different values from the existing hypercall if that is
negotiated with the guest, at run time or build time.
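
As a rough sketch of that idea (not existing Xen code): the hypervisor could
keep a per-guest flag and pick the reported value from it. The structure, the
flag name, the function and the fallback constant below are all hypothetical
and only illustrate "static values for old guests, real values for guests
that opted in".

```c
#include <stdbool.h>

/* Hypothetical per-guest state; not a real Xen structure. */
struct guest_caps {
    /* Guest announced, at run time or build time, that it copes with
     * the M2P size changing. */
    bool handles_variable_m2p;
};

/* Hypothetical static limit that old guests were built against. */
#define LEGACY_M2P_MAX_MFN 0x3FFFFFUL

/* Value the hypercall would report to this guest. */
static unsigned long reported_max_mfn(const struct guest_caps *caps,
                                      unsigned long host_max_mfn)
{
    if (caps->handles_variable_m2p)
        return host_max_mfn;       /* negotiated: report the real value */
    return LEGACY_M2P_MAX_MFN;     /* old guest: static fallback */
}
```

An old guest migrated to a larger host would then keep seeing the same value,
while a guest that negotiated the capability sees the new host's maximum.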
-- Keir
> Jan
>
* Re: [PATCH] xen: update machine_to_phys_order on resume
2011-07-18 8:31 ` [Xen-devel] " Ian Campbell
@ 2011-07-18 8:47 ` Jan Beulich
0 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2011-07-18 8:47 UTC (permalink / raw)
To: Ian Campbell, Keir Fraser
Cc: olaf@aepfle.de, xen-devel@lists.xensource.com, keir@xen.org,
linux-kernel@vger.kernel.org, Konrad Rzeszutek Wilk
>>> On 18.07.11 at 10:31, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> Pushing things down at build time is pretty easy. FWIW here's an
> incomplete patch to push the kernel declared features down into the
> hypervisor at build time extracted from some old PV in HVM container
> stuff (so not directly applicable). I can bring it up to date if the
> approach seems useful.
Yes, that looks like what we'd want from a conceptual perspective,
but...
> One thing which is somewhat missing is user control of non-mandatory
> features declared by a kernel, although normally that should be a
> decision made by the tools/hypervisor based upon available features etc.
>
> diff -r a6c4d03b7d45 tools/libxc/xc_dom_core.c
> --- a/tools/libxc/xc_dom_core.c Mon Feb 08 13:05:48 2010 +0000
> +++ b/tools/libxc/xc_dom_core.c Thu Feb 11 12:48:47 2010 +0000
> @@ -609,6 +609,7 @@
>
> int xc_dom_parse_image(struct xc_dom_image *dom)
> {
> + DECLARE_DOMCTL;
> int i;
>
> xc_dom_printf("%s: called\n", __FUNCTION__);
> @@ -629,8 +630,26 @@
> /* check features */
> for ( i = 0; i < XENFEAT_NR_SUBMAPS; i++ )
> {
> - dom->f_active[i] |= dom->f_requested[i]; /* cmd line */
> - dom->f_active[i] |= dom->parms.f_required[i]; /* kernel */
> + domctl.cmd = XEN_DOMCTL_setfeatures;
> + domctl.domain = dom->guest_domid;
> +
> + domctl.u.setfeatures.submap_idx = i;
> + domctl.u.setfeatures.submap = 0;
> +
> + domctl.u.setfeatures.submap |= dom->f_requested[i]; /* cmd line */
> + domctl.u.setfeatures.submap |= dom->parms.f_required[i]; /* kernel */
> +
> + xc_dom_printf("requesting features[%d] = %#x\n", domctl.u.setfeatures.submap_idx, domctl.u.setfeatures.submap);
> + if (do_domctl(dom->guest_xc, &domctl))
> + {
> + xc_dom_panic(XC_INVALID_PARAM,
> + "%s: unable to set requested features\n", __FUNCTION__);
> + goto err;
> + }
> +
> + xc_dom_printf("received features[%d] = %#x\n", domctl.u.setfeatures.submap_idx, domctl.u.setfeatures.submap);
> + dom->f_active[i] = domctl.u.setfeatures.submap;
> +
> if ( (dom->f_active[i] & dom->parms.f_supported[i]) !=
> dom->f_active[i] )
> {
> @@ -639,6 +658,7 @@
> goto err;
> }
> }
> +
> return 0;
>
> err:
> diff -r a6c4d03b7d45 tools/libxl/xl.c
> --- a/tools/libxl/xl.c Mon Feb 08 13:05:48 2010 +0000
> +++ b/tools/libxl/xl.c Thu Feb 11 12:48:47 2010 +0000
> @@ -468,6 +468,8 @@
> }
> if (config_lookup_string (&config, "ramdisk", &buf) == CONFIG_TRUE)
> b_info->u.pv.ramdisk = strdup(buf);
> + if (config_lookup_string (&config, "features", &buf) == CONFIG_TRUE)
> + b_info->u.pv.features = strdup(buf);
> }
>
> if ((vbds = config_lookup (&config, "disk")) != NULL) {
> diff -r a6c4d03b7d45 xen/common/domctl.c
> --- a/xen/common/domctl.c Mon Feb 08 13:05:48 2010 +0000
> +++ b/xen/common/domctl.c Thu Feb 11 12:48:47 2010 +0000
> @@ -23,6 +23,7 @@
> #include <xen/paging.h>
> #include <asm/current.h>
> #include <public/domctl.h>
> +#include <public/features.h>
> #include <xsm/xsm.h>
>
> static DEFINE_SPINLOCK(domctl_lock);
> @@ -960,6 +962,54 @@
> }
> break;
>
> + case XEN_DOMCTL_setfeatures:
> + {
> + struct domain *d;
> + ret = -ESRCH;
> + if ( (d = rcu_lock_domain_by_id(op->domain)) != NULL )
> + {
> + printk("dom%d set features[%d] = %#x\n", d->domain_id, op->u.setfeatures.submap_idx, op->u.setfeatures.submap);
> +
> + switch (op->u.setfeatures.submap_idx) {
> + case 0:
> + if ( !paging_mode_translate(d) )
... this condition looks inverted to me.
> + {
> + op->u.setfeatures.submap &= ~(1U<<XENFEAT_writable_page_tables);
> + op->u.setfeatures.submap &= ~(1U<<XENFEAT_auto_translated_physmap);
> + }
> + if ( !is_pvhvm_domain(d) )
> + {
> + op->u.setfeatures.submap &= ~(1U<<XENFEAT_supervisor_mode_kernel);
> + }
> +
> + op->u.setfeatures.submap &= ~(1U<<XENFEAT_writable_descriptor_tables);
Why do you turn this off unconditionally?
> +
> + /* XXX other features */
That's perhaps also the place holder where the passed in information
would actually get stored?
Jan
> +
> +
> + if ( op->u.setfeatures.submap &(1U<<XENFEAT_supervisor_mode_kernel) )
> + d->arch.pv_kernel_minimum_rpl = 0;
> +
> + ret = 0;
> + break;
> +
> + default:
> + printk("dom%d unknown feature submap %d\n", d->domain_id, op->u.setfeatures.submap_idx);
> + op->u.setfeatures.submap = 0;
> + ret = -EINVAL;
> + break;
> + }
> +
> + rcu_unlock_domain(d);
> + ret = 0;
> +
> + if ( copy_to_guest(u_domctl, op, 1) )
> + ret = -EFAULT;
> +
> + }
> + }
> + break;
> +
> default:
> ret = arch_do_domctl(op, u_domctl);
> break;
> diff -r a6c4d03b7d45 xen/include/public/domctl.h
> --- a/xen/include/public/domctl.h Mon Feb 08 13:05:48 2010 +0000
> +++ b/xen/include/public/domctl.h Thu Feb 11 12:48:47 2010 +0000
> @@ -169,6 +169,13 @@
> XEN_GUEST_HANDLE_64(xen_pfn_t) array;
> };
>
> +/* XEN_DOMCTL_setfeatures */
> +struct xen_domctl_setfeatures {
> + /* IN variables */
> + unsigned int submap_idx; /* which 32-bit submap to return */
> + /* IN/OUT variables */
> + uint32_t submap; /* 32-bit submap, updated with actual result. */
> +};
>
> /*
> * Control shadow pagetables operation
> @@ -842,6 +848,7 @@
> #define XEN_DOMCTL_gettscinfo 59
> #define XEN_DOMCTL_settscinfo 60
> #define XEN_DOMCTL_getpageframeinfo3 61
> +#define XEN_DOMCTL_setfeatures 62
> #define XEN_DOMCTL_gdbsx_guestmemio 1000
> #define XEN_DOMCTL_gdbsx_pausevcpu 1001
> #define XEN_DOMCTL_gdbsx_unpausevcpu 1002
> @@ -855,6 +862,7 @@
> struct xen_domctl_getpageframeinfo getpageframeinfo;
> struct xen_domctl_getpageframeinfo2 getpageframeinfo2;
> struct xen_domctl_getpageframeinfo3 getpageframeinfo3;
> + struct xen_domctl_setfeatures setfeatures;
> struct xen_domctl_vcpuaffinity vcpuaffinity;
> struct xen_domctl_shadow_op shadow_op;
> struct xen_domctl_max_mem max_mem;
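
The negotiation flow in the quoted patch can be summarized in a small
standalone sketch. This is an illustration under stated assumptions, not the
patch itself: the function name is hypothetical, and the `hv_allowed` mask
stands in for whatever bits the hypervisor decides to leave set when it
handles the request.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of one feature submap negotiation:
 *   requested  - features asked for on the command line
 *   required   - features the kernel declares as required
 *   hv_allowed - stand-in for the bits the hypervisor leaves set
 *   supported  - features the kernel declares as supported
 * Stores the resulting active set in *active and returns false if the
 * active set contains a feature the kernel does not support.
 */
static bool negotiate_submap(uint32_t requested, uint32_t required,
                             uint32_t hv_allowed, uint32_t supported,
                             uint32_t *active)
{
    uint32_t submap = requested | required;  /* what the tools ask for */

    submap &= hv_allowed;                    /* hypervisor clears bits */
    *active = submap;                        /* result handed back */

    return (*active & supported) == *active; /* active within supported? */
}
```

The final comparison mirrors the existing check in the quoted
xc_dom_parse_image() hunk, which fails when an active feature is not in the
kernel's supported set.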
end of thread, other threads:[~2011-07-18 8:47 UTC]
Thread overview: 17+ messages
2011-07-01 10:41 migration of pv guest fails from small to large host Olaf Hering
2011-07-01 16:20 ` Olaf Hering
2011-07-05 13:46 ` Konrad Rzeszutek Wilk
2011-07-07 13:20 ` Olaf Hering
2011-07-07 18:36 ` Konrad Rzeszutek Wilk
2011-07-07 21:02 ` Olaf Hering
2011-07-12 16:43 ` [PATCH] xen: update machine_to_phys_order on resume Olaf Hering
2011-07-12 18:11 ` [Xen-devel] " Konrad Rzeszutek Wilk
2011-07-13 9:12 ` Ian Campbell
2011-07-13 13:20 ` Konrad Rzeszutek Wilk
2011-07-15 8:56 ` Jan Beulich
2011-07-15 16:05 ` Jan Beulich
-- strict thread matches above, loose matches on Subject: below --
2011-07-13 14:29 Jan Beulich
2011-07-15 17:30 Jan Beulich
2011-07-15 18:23 ` Keir Fraser
2011-07-18 7:05 [Xen-devel] " Jan Beulich
2011-07-18 7:27 ` Keir Fraser
2011-07-18 8:31 ` [Xen-devel] " Ian Campbell
2011-07-18 8:47 ` Jan Beulich