From: Atom2 <ariel.atom2@web2web.at>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: xen-devel@lists.xen.org
Subject: Re: [BUG] XEN 4.3.3 - segfault in xl create for HVM with PCI passthrough
Date: Wed, 05 Nov 2014 13:01:15 +0100
Message-ID: <545A118B.7040309@web2web.at>
In-Reply-To: <1415180713.11486.61.camel@citrix.com>
[-- Attachment #1: Type: text/plain, Size: 4137 bytes --]
On 05.11.14 at 10:45, Ian Campbell wrote:
> On Tue, 2014-11-04 at 18:30 +0100, Atom2 wrote:
>> On 04.11.14 at 17:31, Ian Campbell wrote:
>>> On Tue, 2014-11-04 at 17:14 +0100, Atom2 wrote:
>>>> On 04.11.14 at 16:44, Ian Campbell wrote:
>>>>> On Tue, 2014-11-04 at 16:13 +0100, Atom2 wrote:
> Sadly it looks like your version of valgrind doesn't know how to handle
> the hypercalls made by the Xen toolstack, which means it produces a lot
> of unrelated noise.
>
> You seem to be using valgrind 3.9.0, which lacked knowledge of some of
> the HVM related hypercalls that weren't added until 3.10.0. It's
> probably not worth pursuing this angle any further (unless it is utterly
> trivial to pull in the new version).
Many thanks again for your quick answers.
You were right: I was using valgrind-3.9.0, which is the latest stable version
for Gentoo. 3.10.0 is available under unstable and it was indeed trivial
to pull that in instead. The unrelated noise seems to have disappeared,
so please find attached the output of running
# valgrind xl create -F -c pfsense
The strange thing was: no segfault at the start, but evidently still
problems passing through the PCI devices, as shown by the same error
messages you flagged below. The boot menu now showed up and I was able
to boot the domain but, as the error messages suggest, no network
devices were passed through. Even a
# xl shutdown -F pfsense
Shutting down domain 2
PV control interface not available: sending ACPI power button event.
#
from another ssh connection to dom0 worked (no segfault message in that
session), so the attached file 'valgrind.out' contains the complete
screen output of the valgrind session from start to finish.
However, towards the end of that file (line 235) you'll see a SEGFAULT
message from valgrind. I hope you can make some sense out of that ... or
should I rerun with some options to valgrind (like the ones mentioned in
the output):
--leak-check=full
-v
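To pull the interesting lines out of a capture like valgrind.out without reading all of it, something along these lines might help. This is only a sketch: the function name `summarize` and the regular expressions are my own, written against the message wording visible in the attached output, so other valgrind or libxl versions may need different patterns.

```python
import re
from collections import Counter

# Patterns modelled on the lines seen in valgrind.out (an assumption:
# the wording may differ between valgrind and libxl releases).
UNHANDLED = re.compile(
    r"WARNING: unhandled (__HYPERVISOR_\w+)(?: \w+\(\d+\))? subop: (\d+)"
)
PCI_ERROR = re.compile(
    r"libxl__device_pci_add: PCI device (\S+) cannot be assigned"
)
FATAL = re.compile(
    r"Process terminating with default action of signal (\d+) \((\w+)\)"
)


def summarize(log_text):
    """Collect unhandled hypercalls, failed PCI devices and fatal signals."""
    unhandled = Counter()      # (hypercall name, subop number) -> occurrences
    failed_pci_devices = []    # PCI addresses libxl refused to assign
    fatal_signals = []         # (signal number, signal name) pairs

    for line in log_text.splitlines():
        match = UNHANDLED.search(line)
        if match:
            unhandled[(match.group(1), int(match.group(2)))] += 1
            continue
        match = PCI_ERROR.search(line)
        if match:
            failed_pci_devices.append(match.group(1))
            continue
        match = FATAL.search(line)
        if match:
            fatal_signals.append((int(match.group(1)), match.group(2)))

    return {
        "unhandled": unhandled,
        "failed_pci_devices": failed_pci_devices,
        "fatal_signals": fatal_signals,
    }
```

Calling `summarize(open("valgrind.out").read())` would then give a compact view of which domctl subops valgrind did not understand, which devices failed to attach, and whether the run ended on a signal.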
To me, it looks as if something is broken in the PCI passthrough code
and that this started with 4.3.3. Strangely, however, valgrind seems to
work around that issue insofar as no segfault happens. Is there any
explanation for the different behaviour between running xl natively
and running xl under valgrind's control?
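One way to record the difference between the two runs objectively is to look at how each process ended. A minimal sketch, assuming a POSIX dom0 where Python's subprocess module reports death by signal as a negative return code; `fatal_signal` is a hypothetical helper name, and the two command lists simply reuse the xl invocation and the valgrind options mentioned in this mail.

```python
import signal
import subprocess

# Command lines taken from this thread; nothing is executed at import time.
NATIVE = ["xl", "create", "-F", "-c", "pfsense"]
UNDER_VALGRIND = ["valgrind", "--leak-check=full", "-v"] + NATIVE


def fatal_signal(cmd):
    """Run cmd and return the name of the signal that killed it, else None.

    On POSIX, subprocess reports a child killed by signal N as
    returncode == -N, so a segfault shows up as -signal.SIGSEGV.
    """
    proc = subprocess.run(cmd)
    if proc.returncode < 0:
        return signal.Signals(-proc.returncode).name
    return None
```

Running `fatal_signal(NATIVE)` and `fatal_signal(UNDER_VALGRIND)` back to back would show whether each run ended on a signal and which one, without relying on what the shell happens to print.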
In any case, I am positive that there hasn't been any change to the
hardware of the system, not even a slot change of an add-on card. So I
have no clue why the system misbehaves after the upgrade.
>
> Apart from the valgrind output there is a new message from libxl:
> libxl: error: libxl_pci.c:1045:libxl__device_pci_add: PCI device 0000:04:00.0 cannot be assigned - no IOMMU?
> which suggests that it isn't passing things through (this might be
> fallout from valgrind not understanding things) and no segfault.
>
> OOI what does "xl create -F ..." do without valgrind (I'm wondering if
> -F is responsible for the change in behaviour).
I tried that as well:
vm-host auto [526] # xl create -F -c pfsense
Parsing config from pfsense
xc: info: VIRTUAL MEMORY ARRANGEMENT:
Loader: 0000000000100000->00000000001c12a4
Modules: 0000000000000000->0000000000000000
TOTAL: 0000000000000000->000000001f800000
ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
4KB PAGES: 0x0000000000000200
2MB PAGES: 0x00000000000000fb
1GB PAGES: 0x0000000000000000
Segmentation fault
vm-host auto [527] # xl list
Name ID Mem VCPUs State Time(s)
Domain-0 0 4094 8 r----- 451.5
pfsense 1 512 1 --p--- 0.0
vm-host auto [528] # xl destroy pfsense
Segmentation fault
vm-host auto [529] # xl list
Name ID Mem VCPUs State Time(s)
Domain-0 0 4096 8 r----- 452.1
vm-host auto [529] #
and, as you can see, I again got the segfault and the domU ended up in
the same state as when the issues first started (i.e. paused, which
you explained is normal right after a start).
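For scripting this check, the `xl list` table can be parsed and the State column decoded. A rough sketch, assuming the six-column layout shown above and the flag letters documented in the xl man page (r running, b blocked, p paused, s shutdown, c crashed, d dying); `parse_xl_list` is a made-up name, not part of any Xen tooling.

```python
# Flag letters of the State column, as described in the xl man page.
STATE_FLAGS = {
    "r": "running",
    "b": "blocked",
    "p": "paused",
    "s": "shutdown",
    "c": "crashed",
    "d": "dying",
}


def parse_xl_list(text):
    """Parse `xl list` output into one dict per domain.

    Lines that are not table rows (shell prompts, the header,
    'Segmentation fault' messages) are skipped.
    """
    domains = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # not a six-column table row
        name, domid, mem, vcpus, state, cpu_time = fields
        if not domid.isdigit():
            continue  # header row ("Name ID Mem ...")
        domains.append({
            "name": name,
            "id": int(domid),
            "mem_mb": int(mem),
            "vcpus": int(vcpus),
            "states": [STATE_FLAGS[c] for c in state if c in STATE_FLAGS],
            "time_s": float(cpu_time),
        })
    return domains
```

Fed the first `xl list` output above, this reports pfsense with states `["paused"]`, which is the condition described here.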
Thanks, Atom2
[-- Attachment #2: valgrind.out --]
[-- Type: text/plain, Size: 13682 bytes --]
vm-host auto [540] # valgrind xl create -F -c pfsense
==24982== Memcheck, a memory error detector
==24982== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==24982== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==24982== Command: xl create -F -c pfsense
==24982==
Parsing config from pfsense
--24982-- WARNING: unhandled __HYPERVISOR_domctl shadow(10) subop: 31
--24982-- You may be able to write your own handler.
--24982-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--24982-- Nevertheless we consider this a bug. Please report
--24982-- it at http://valgrind.org/support/bug_reports.html &
--24982-- http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen.
xc: info: VIRTUAL MEMORY ARRANGEMENT:
Loader: 0000000000100000->00000000001c12a4
Modules: 0000000000000000->0000000000000000
TOTAL: 0000000000000000->000000001f800000
ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
4KB PAGES: 0x0000000000000200
2MB PAGES: 0x00000000000000fb
1GB PAGES: 0x0000000000000000
--24982-- WARNING: unhandled __HYPERVISOR_domctl subop: 45
--24982-- You may be able to write your own handler.
--24982-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--24982-- Nevertheless we consider this a bug. Please report
--24982-- it at http://valgrind.org/support/bug_reports.html &
--24982-- http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen.
libxl: error: libxl_pci.c:1045:libxl__device_pci_add: PCI device 0000:04:00.0 cannot be assigned - no IOMMU?
--24982-- WARNING: unhandled __HYPERVISOR_domctl subop: 45
--24982-- You may be able to write your own handler.
--24982-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--24982-- Nevertheless we consider this a bug. Please report
--24982-- it at http://valgrind.org/support/bug_reports.html &
--24982-- http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen.
libxl: error: libxl_pci.c:1045:libxl__device_pci_add: PCI device 0000:0a:08.0 cannot be assigned - no IOMMU?
--24982-- WARNING: unhandled __HYPERVISOR_domctl subop: 45
--24982-- You may be able to write your own handler.
--24982-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--24982-- Nevertheless we consider this a bug. Please report
--24982-- it at http://valgrind.org/support/bug_reports.html &
--24982-- http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen.
libxl: error: libxl_pci.c:1045:libxl__device_pci_add: PCI device 0000:0a:0b.0 cannot be assigned - no IOMMU?
Waiting for domain pfsense (domid 2) to die [pid 24982]
/boot/config: -DConsoles: internal video/keyboard serial port
BIOS drive C: is disk0
BIOS 639kB/515068kB available memory
FreeBSD/x86 bootstrap loader, Revision 1.1
(root@pf2_1_1_amd64.pfsense.org, Mon Aug 25 08:18:48 EDT 2014)
Loading /boot/defaults/loader.conf
/boot/kernel/kernel data=0xbc1792 data=0x596478+0xe0ed0 syms=[0x8+0x125f70+0x8+0x113bc5]
\
+-----------------------------------------+
|                                         |
|                                         |
|                                         |
|           Welcome to pfSense!           |
|                                         |          ______
|                                         |         /      \
|  1. Boot pfSense [default]              |   _____/    f   \
|  2. Boot pfSense with ACPI disabled     |  /     \        /
|  3. Boot pfSense using USB device       | /   p   \______/  Sense
|  4. Boot pfSense in Safe Mode           | \       /      \
|  5. Boot pfSense in single user mode    |  \_____/        \
|  6. Boot pfSense with verbose logging   |        \        /
|  7. Escape to loader prompt             |         \______/
|  8. Reboot                              |
|                                         |
|                                         |
|                                         |
|  Select option, [Enter] for default     |
|  or [Space] to pause timer 3            |
+-----------------------------------------+
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2012 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.3-RELEASE-p16 #0: Mon Aug 25 08:27:11 EDT 2014
root@pf2_1_1_amd64.pfsense.org:/usr/obj.amd64/usr/pfSensesrc/src/sys/pfSense_SMP.8 amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM) i5-2300 CPU @ 2.80GHz (2394.57-MHz K8-class CPU)
Origin = "GenuineIntel" Id = 0x206a7 Family = 6 Model = 2a Stepping = 7
Features=0x1783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,HTT>
Features2=0x97ba2203<SSE3,PCLMULQDQ,SSSE3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,AVX,HV>
AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
AMD Features2=0x1<LAHF>
TSC: P-state invariant
real memory = 528482304 (504 MB)
avail memory = 483524608 (461 MB)
ACPI APIC Table: <HPQOEM SLIC-CPC>
ioapic0: Changing APIC ID to 1
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-47 on motherboard
wlan: mac acl policy registered
ipw_bss: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw/.
ipw_bss: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (ipw_bss_fw, 0xffffffff804abaf0, 0) error 1
ipw_ibss: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw/.
ipw_ibss: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (ipw_ibss_fw, 0xffffffff804abb90, 0) error 1
ipw_monitor: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw/.
ipw_monitor: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (ipw_monitor_fw, 0xffffffff804abc30, 0) error 1
kbd1 at kbdmux0
cryptosoft0: <software crypto> on motherboard
padlock0: No ACE support.
acpi0: <HPQOEM SLIC-CPC> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <32-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
cpu0: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX3 WDMA2 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc240-0xc24f at device 1.1 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata0: [ITHREAD]
ata1: <ATA channel> at channel 1 on atapci0
ata1: [ITHREAD]
pci0: <bridge> at device 1.3 (no driver attached)
vgapci0: <VGA-compatible display> mem 0xf0000000-0xf1ffffff,0xf3052000-0xf3052fff at device 2.0 on pci0
pci0: <unknown> at device 3.0 (no driver attached)
sym0: <895a> port 0xc100-0xc1ff mem 0xf3053000-0xf30533ff,0xf3050000-0xf3051fff irq 32 at device 4.0 on pci0
sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
sym0: [ITHREAD]
em0: <Intel(R) PRO/1000 Legacy Network Connection 1.0.6> port 0xc200-0xc23f mem 0xf3000000-0xf301ffff irq 36 at device 5.0 on pci0
em0: [FILTER]
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 62500000 Hz quality 900
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model IntelliMouse Explorer, device ID 4
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
uart0: console (115200,n,8,1)
ppc0: <Parallel port> port 0x378-0x37f irq 7 on acpi0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppc0: [ITHREAD]
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
plip0: [ITHREAD]
lpt0: <Printer> on ppbus0
lpt0: [ITHREAD]
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
orm0: <ISA Option ROM> at iomem 0xed800-0xeffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2394570596 Hz quality 800
Timecounters tick every 10.000 msec
IPsec: Initialized Security Association Processing.
acd0: DVDROM <QEMU DVD-ROM/1.3.1> at ata1-master WDMA2
sym0: unknown interrupt(s) ignored, ISTAT=0x1 DSTAT=0x80 SIST=0x0
da0 at sym0 bus 0 scbus0 target 0 lun 0
da0: <QEMU QEMU HARDDISK 1.3.> Fixed Direct Access SCSI-5 device
da0: 3.300MB/s transfers
da0: Command Queueing enabled
da0: 8192MB (16777216 512 byte sectors: 255H 63S/T 1044C)
Trying to mount root from ufs:/dev/da0s1a
Configuring crash dumps...
Using /dev/da0s1b for dump device.
Mounting filesystems...
ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
ZFS WARNING: Recommended minimum kmem_size is 512MB; expect unstable behavior.
Consider tuning vm.kmem_size and vm.kmem_size_max
in /boot/loader.conf.
ZFS filesystem version 5
ZFS storage pool version 28
     ___
 ___/ f \
/ p \___/ Sense
\___/   \
    \___/
Welcome to pfSense 2.1.5-RELEASE ...
No core dumps found.
Creating symlinks......done.
>>> Under 512 megabytes of ram detected. Not enabling APC.
External config loader 1.0 is now starting...
Launching the init system... done.
Initializing............................. done.
Starting device manager (devd)...done.
Loading configuration......done.
Warning: Configuration references interfaces that do not exist: ath0 rl0
Network interface mismatch -- Running interface assignment option.
Valid interfaces are:
em0 00:16:3e:a1:64:01 (up) Intel(R) PRO/1000 Legacy Network Connection 1.0.6
Do you want to set up VLANs first?
If you are not going to use VLANs, or only for optional interfaces, you should
say no here and use the webConfigurator to configure VLANs later, if required.
Do you want to set up VLANs now [y|n]? em0: link state changed to UP
pfSense is now shutting down ...
Waiting (max 60 seconds) for system process `vnlru' to stop...done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...0 done
All buffers synced.
Uptime: 4m13s
acpi0: Powering system off
Domain 2 has shut down, reason code 0 0x0
Action for shutdown reason code 0 is destroy
Domain 2 needs to be cleaned up: destroying the domain
==24982==
==24982== Process terminating with default action of signal 11 (SIGSEGV)
==24982== Bad permissions for mapped region at address 0x4035D40
==24982== at 0x56EDB95: sigcancel_handler (nptl-init.c:174)
==24982==
==24982== HEAP SUMMARY:
==24982== in use at exit: 7,580 bytes in 50 blocks
==24982== total heap usage: 1,688 allocs, 1,638 frees, 4,978,265 bytes allocated
==24982==
==24982== LEAK SUMMARY:
==24982== definitely lost: 516 bytes in 7 blocks
==24982== indirectly lost: 0 bytes in 0 blocks
==24982== possibly lost: 576 bytes in 2 blocks
==24982== still reachable: 6,488 bytes in 41 blocks
==24982== suppressed: 0 bytes in 0 blocks
==24982== Rerun with --leak-check=full to see details of leaked memory
==24982==
==24982== For counts of detected and suppressed errors, rerun with: -v
==24982== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Killed
[-- Attachment #3: Type: text/plain, Size: 126 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel