* [Qemu-devel] [PATCH] ohci: Stop OHCI bus when PCI bus master is disabled
From: Alexey Kardashevskiy @ 2014-09-11 7:11 UTC (permalink / raw)
To: qemu-devel; +Cc: Alexey Kardashevskiy, qemu-ppc, Alexander Graf, Gerd Hoffmann
When the guest performs kexec() (for example, as a part of kdump),
the new kernel does PCI probing. As a part of it, PCI_COMMAND_MASTER
gets disabled, which disables the bus master memory region.
Since the ohci_frame_boundary() timer is not stopped at this point
(the OHCI device was not reset), the device tries to access DMA memory,
fails and ends up in ohci_die(), producing errors:
usb-ohci: HCCA read error at 30000000
ohci_die: DMA error
This stops the OHCI bus (i.e. the timer) when bus master is disabled.
The kernel later resets OHCI anyway and the bus gets re-enabled, so
all this patch does is remove those QEMU errors about DMA.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
This is the log of what is happening during kdump. I am not sure what piece
of the guest kernel code is responsible for disabling bus master.
[root@localhost ~]# echo c> /proc/sysrq-trigger
SysRq : Trigger a crash
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc0000000003a4ecc
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
Modules linked in: ipv6 sg ext4 jbd2 mbcache sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
NIP: c0000000003a4ecc LR: c0000000003a5218 CTR: c0000000003a4eb0
REGS: c0000003efe4f860 TRAP: 0300 Not tainted (2.6.32-498.el6.ppc64)
MSR: 8000000000009032 <EE,ME,IR,DR> CR: 28242482 XER: 20000000
DAR: 0000000000000000, DSISR: 0000000042000000
TASK = c0000003efd28970[1309] 'bash' THREAD: c0000003efe4c000 CPU: 0
GPR00: 0000000000000001 c0000003efe4fae0 c000000000f6c0b0 0000000000000063
GPR04: 0000000000000000 ffffffffffffffff 0000000000000240 00000000001cd390
GPR08: 00000001ccd05260 0000000000000000 0000000000080000 c0000000003a4ee0
GPR12: 0000000028242482 c000000001042500 000000001012b42c 0000000000000000
GPR16: 0000000000000000 0000000010129ca8 0000000010129c48 0000000000000000
GPR20: 0000000000000000 0000000010123d98 0000000000000000 c000000000e98ce0
GPR24: 000000000000000a c000000000e98e80 c000000000e776b4 0000000000000000
GPR28: 0000000000000001 0000000000000063 c000000000f0b858 c00000000147209c
NIP [c0000000003a4ecc] .sysrq_handle_crash+0x1c/0x30
LR [c0000000003a5218] .__handle_sysrq+0x108/0x230
Call Trace:
[c0000003efe4fae0] [c0000000003a51ec] .__handle_sysrq+0xdc/0x230 (unreliable)
[c0000003efe4fba0] [c0000000003a53c0] .write_sysrq_trigger+0x80/0xa0
[c0000003efe4fc30] [c00000000024abf4] .proc_reg_write+0xb4/0x130
[c0000003efe4fce0] [c0000000001cd3bc] .vfs_write+0xec/0x1f0
[c0000003efe4fd80] [c0000000001cd5e8] .SyS_write+0x58/0xb0
[c0000003efe4fe30] [c000000000008564] syscall_exit+0x0/0x40
Instruction dump:
98090003 ebc1fff0 4e800020 60000000 60000000 fbc1fff0 ebc2c7f8 38000001
e93e8020 90090000 7c0004ac 39200000 <98090000> ebc1fff0 4e800020 60000000
Sending IPI to other cpus...
Using pSeries machine description
Page orders: linear mapping = 16, virtual = 16, io = 12
Using 1TB segments
Found initrd at 0xc0000000054e0000:0xc000000005af0000
bootconsole [udbg0] enabled
CPU maps initialized for 1 thread per core
(thread shift is 0)
Starting Linux PPC64 #1 SMP Fri Aug 22 22:45:56 EDT 2014
-----------------------------------------------------
ppc64_pft_size = 0x1b
physicalMemorySize = 0x24010000
htab_hash_mask = 0xfffff
physical_start = 0x4000000
-----------------------------------------------------
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-498.el6.ppc64 (mockbuild@ppc-003.build.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-10) (GCC) ) #1 SMP Fri Aug 22 22:45:56 EDT 2014
CF000012
Setup Arch[boot]0012 Setup Arch
Node 0 Memory: 0x0-0x30000000
PCI host bridge /pci@800000020000000 ranges:
IO 0x0000010080000000..0x000001008000ffff -> 0x0000000000000000
MEM 0x00000100a0000000..0x00000100bfffffff -> 0x0000000080000000
PPC64 nvram contains 65536 bytes
Using shared processor idle loop
Zone PFN ranges:
DMA 0x00000000 -> 0x00003000
Normal 0x00003000 -> 0x00003000
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
0: 0x00000000 -> 0x00002400
0: 0x00002fff -> 0x00003000
On node 0 totalpages: 9217
DMA zone: 11 pages used for memmap
DMA zone: 0 pages reserved
DMA zone: 9206 pages, LIFO batch:1
CF000015
Setup Done[boot]0015 Setup Done
PERCPU: Embedded 2 pages/cpu @c000000005c00000 s93080 r0 d37992 u1048576
pcpu-alloc: s93080 r0 d37992 u1048576 alloc=1*1048576
pcpu-alloc: [0] 0
Built 1 zonelists in Node order, mobility grouping on. Total pages: 9206
Policy zone: DMA
Kernel command line: root=/dev/mapper/VolGroup-lv_root ro rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD console=hvc0 KEYTABLE=us rd_LVM_LV=VolGroup/lv_swap SYSFONT=latarcyrheb-sun16 rd_LVM_LV=VolGroup/lv_root rd_NO_DM rhgb quiet debug irqpoll maxcpus=1 noirqdistrib reset_devices cgroup_disable=memory elfcorehdr=86784K savemaxmem=16384M
Misrouted IRQ fixup and polling support enabled
This may significantly impact system performance
Disabling memory control group subsystem
PID hash table entries: 4096 (order: -1, 32768 bytes)
freeing bootmem node 0
Memory: 493568k/589888k available (13760k kernel code, 112640k reserved, 2816k data, 4592k bss, 5120k init)
Hierarchical RCU implementation.
NR_IRQS:512
CF000020
XICS Init[boot]0020 XICS Init
CF000021
XICS Done[boot]0021 XICS Done
pic: no ISA interrupt controller
time_init: decrementer frequency = 512.000000 MHz
time_init: processor frequency = 3720.000000 MHz
clocksource: timebase mult[7d0000] shift[22] registered
clockevent: decrementer mult[83126e97] shift[32] cpu[0]
Console: colour dummy device 80x25
console [hvc0] enabled, bootconsole disabled
console [hvc0] enabled, bootconsole disabled
pid_max: default: 32768 minimum: 301
Security Framework initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
Dentry cache hash table entries: 131072 (order: 4, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 3, 524288 bytes)
Mount-cache hash table entries: 4096
Initializing cgroup subsys ns
Initializing cgroup subsys cpuacct
Initializing cgroup subsys memory
Initializing cgroup subsys devices
Initializing cgroup subsys freezer
Initializing cgroup subsys net_cls
Initializing cgroup subsys blkio
Initializing cgroup subsys perf_event
Initializing cgroup subsys net_prio
irq: irq 2 on host null mapped to virtual irq 16
POWER7 performance monitor hardware support registered
Brought up 1 CPUs
Node 0 CPUs: 0
sizeof(vma)=200 bytes
sizeof(page)=56 bytes
sizeof(inode)=584 bytes
sizeof(dentry)=192 bytes
sizeof(ext3inode)=784 bytes
sizeof(buffer_head)=104 bytes
sizeof(skbuff)=232 bytes
sizeof(task_struct)=3152 bytes
Enabling Asymmetric SMT scheduling
devtmpfs: initialized
regulator: core version 0.5
NET: Registered protocol family 16
IBM eBus Device Driver
nvram: pstore_register() failed, defaults to kmsg_dump; returned -1
Linux ppc64
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io 0x10000-0x1ffff] (bus address [0x0000-0xffff])
pci_bus 0000:00: root bus resource [mem 0x100a0000000-0x100bfffffff] (bus address [0x80000000-0x9fffffff])
pci_cfg_read pci-ohci 00:0 @0x6 -> 0x0
pci_cfg_read pci-ohci 00:0 @0x6 -> 0x0
pci_cfg_read pci-ohci 00:0 @0x6 -> 0x0
pci_cfg_read pci-ohci 00:0 @0x6 -> 0x0
pci_cfg_read pci-ohci 00:0 @0x6 -> 0x0
pci_cfg_read pci-ohci 00:0 @0x6 -> 0x0
IOMMU table initialized, virtual merging enabled
irq: irq 4098 on host null mapped to virtual irq 17
pci_cfg_read pci-ohci 00:0 @0x4 -> 0x106
pci_cfg_read pci-ohci 00:0 @0x4 -> 0x106
PCI: Probing PCI hardware done
irq: irq 4096 on host null mapped to virtual irq 18
bio: create slab <bio-0> at 0
vgaarb: loaded
SCSI subsystem initialized
libata version 3.00 loaded.
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols = UNLABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
Switching to clocksource timebase
NET: Registered protocol family 2
IP route cache hash table entries: 8192 (order: 0, 65536 bytes)
TCP established hash table entries: 32768 (order: 3, 524288 bytes)
TCP bind hash table entries: 32768 (order: 3, 524288 bytes)
TCP: Hash tables configured (established 32768 bind 32768)
TCP reno registered
NET: Registered protocol family 1
pci_cfg_read pci-ohci 00:0 @0x4 -> 0x106
pci_cfg_read pci-ohci 00:0 @0x4 -> 0x106
pci_cfg_read pci-ohci 00:0 @0x4 -> 0x106
pci_cfg_write pci-ohci 00:0 @0x4 <- 0x102
+++Q+++ pci_default_write_config 1154 cmd=106
Breakpoint 10, memory_region_set_enabled (mr=0x10f6cdc0, enabled=0x0) at /home/alexey/p/qemu/memory.c:1686
1686 printf("+++Q+++ %s %u %s %u\n", __func__, __LINE__, mr->name, enabled);
(gdb)
---
hw/usb/hcd-ohci.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index 83bec34..be67198 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -1972,6 +1972,18 @@ static void usb_ohci_exit(PCIDevice *dev)
}
}
+static void ohci_pci_write_config(PCIDevice *pdev, uint32_t addr,
+                                  uint32_t val, int len)
+{
+    if ((addr == PCI_COMMAND) &&
+        (pdev->config[PCI_COMMAND] & PCI_COMMAND_MASTER) &&
+        !(val & PCI_COMMAND_MASTER)) {
+
+        ohci_bus_stop(&PCI_OHCI(pdev)->state);
+    }
+    pci_default_write_config(pdev, addr, val, len);
+}
+
#define TYPE_SYSBUS_OHCI "sysbus-ohci"
#define SYSBUS_OHCI(obj) OBJECT_CHECK(OHCISysBusState, (obj), TYPE_SYSBUS_OHCI)
@@ -2113,6 +2125,7 @@ static void ohci_pci_class_init(ObjectClass *klass, void *data)
     k->vendor_id = PCI_VENDOR_ID_APPLE;
     k->device_id = PCI_DEVICE_ID_APPLE_IPID_USB;
     k->class_id = PCI_CLASS_SERIAL_USB;
+    k->config_write = ohci_pci_write_config;
     set_bit(DEVICE_CATEGORY_USB, dc->categories);
     dc->desc = "Apple USB Controller";
     dc->props = ohci_pci_properties;
--
2.0.0
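The condition the patch tests can be tried outside QEMU. What follows is a minimal standalone sketch, not QEMU code: `bus_master_cleared()` is a hypothetical helper, and the two constants are the standard PCI configuration-space values (command register at offset 0x04, bus master enable at bit 2). It reports whether a config write turns the bus master bit off while it was previously on, which is what the log above shows when 0x102 is written over 0x106.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard PCI configuration-space values. */
#define PCI_COMMAND        0x04   /* offset of the 16-bit command register */
#define PCI_COMMAND_MASTER 0x4    /* bus master enable bit */

/*
 * Hypothetical helper (not part of QEMU): returns true when a config
 * write at 'addr' clears the bus master bit while it was set in the
 * current command register value 'old_cmd'.
 */
static bool bus_master_cleared(uint32_t addr, uint16_t old_cmd, uint32_t val)
{
    return addr == PCI_COMMAND &&
           (old_cmd & PCI_COMMAND_MASTER) &&
           !(val & PCI_COMMAND_MASTER);
}
```

With the values from the log, `bus_master_cleared(0x04, 0x106, 0x102)` is true, while writing 0x106 over 0x106 or touching another register is not. Like the patch, the sketch only looks at writes that start exactly at the command register offset.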
* Re: [Qemu-devel] [PATCH] ohci: Stop OHCI bus when PCI bus master is disabled
From: Gerd Hoffmann @ 2014-09-11 7:22 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: qemu-ppc, qemu-devel, Alexander Graf
On Do, 2014-09-11 at 17:11 +1000, Alexey Kardashevskiy wrote:
> When the guest performs kexec() (for example, as a part of kdump),
> new kernel does PCI probing. As a part of it, PCI_COMMAND_MASTER
> gets disabled which disables bus master memory region.
> Since ohci_frame_boundary() timer is not stopped at this point
> as OHCI device was not reset, the device tries accessing DMA memory,
> fails and ends up in ohci_die() producing errors:
>
> usb-ohci: HCCA read error at 30000000
> ohci_die: DMA error
Which is the correct behavior.
IMHO the kernel should stop ohci before doing kexec.
Independent of that, we can move the ohci error logging to tracepoints,
so ohci emulation is silent by default.
cheers,
Gerd
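As a rough illustration of "silent by default", here is a standalone sketch of an error report that is only emitted when a runtime switch is on. It does not use QEMU's trace infrastructure; `toy_trace_enabled` and `toy_trace_hcca_read_error()` are hypothetical names chosen for this example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical runtime switch; off by default so the device model is quiet. */
static bool toy_trace_enabled = false;

/*
 * Hypothetical trace point for the HCCA read failure. Returns the number
 * of characters written to stderr, or 0 when tracing is disabled.
 */
static int toy_trace_hcca_read_error(uint32_t addr)
{
    if (!toy_trace_enabled) {
        return 0;
    }
    return fprintf(stderr, "hcca read error at %x\n", (unsigned)addr);
}
```

The DMA failure is still detected and handled by the caller; only the reporting becomes opt-in.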
* Re: [Qemu-devel] [PATCH] ohci: Stop OHCI bus when PCI bus master is disabled
From: Alexey Kardashevskiy @ 2014-09-11 8:18 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: qemu-ppc, qemu-devel, Alexander Graf
On 09/11/2014 05:22 PM, Gerd Hoffmann wrote:
> On Do, 2014-09-11 at 17:11 +1000, Alexey Kardashevskiy wrote:
>> When the guest performs kexec() (for example, as a part of kdump),
>> new kernel does PCI probing. As a part of it, PCI_COMMAND_MASTER
>> gets disabled which disables bus master memory region.
>> Since ohci_frame_boundary() timer is not stopped at this point
>> as OHCI device was not reset, the device tries accessing DMA memory,
>> fails and ends up in ohci_die() producing errors:
>>
>> usb-ohci: HCCA read error at 30000000
>> ohci_die: DMA error
>
> Which is the correct behavior.
>
> IMHO the kernel should stop ohci before doing kexec.
To be precise, it is kdump.
> Independent of that we can move the ohci error logging to tracepoints,
> so ohci emulation is silent by default.
That is the other way to go, yes.
--
Alexey
* Re: [Qemu-devel] [PATCH] ohci: Stop OHCI bus when PCI bus master is disabled
From: Gerd Hoffmann @ 2014-09-11 10:15 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: qemu-ppc, qemu-devel, Alexander Graf
Hi,
> To be precise, it is kdump.
Ok, kdump is a different story. Radically turning off all DMA is reasonable
in that case: normal driver shutdown might be unreliable after a panic, and
it also changes system state too much for a useful dump.
ohci (and probably others too, but without spamming the log) throwing
DMA errors then is normal fallout, though. The kdump kernel should be able
to bring the device back online after such an error; with your patch
applied you can't test that with qemu any more ...
> > Independent of that we can move the ohci error logging to tracepoints,
> > so ohci emulation is silent by default.
>
> That is the other way to go, yes.
Let's do it this way, please.
cheers,
Gerd
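The sequence discussed here (bus master goes away, the next frame tick fails its DMA access, the device enters an error state, and a later driver reset revives it) can be written down as a toy state machine. This is only a sketch to make the argument concrete; the struct and function names are hypothetical and do not correspond to QEMU's OHCI model.

```c
#include <stdbool.h>

/* Toy model only; field and function names are hypothetical, not QEMU's. */
typedef struct ToyOhci {
    bool bus_master;     /* PCI_COMMAND_MASTER as seen by the device */
    bool timer_running;  /* stands in for the frame boundary timer */
    bool dead;           /* set after a failed DMA access */
} ToyOhci;

/* One frame tick: the controller has to read its HCCA via DMA. */
static void toy_frame_tick(ToyOhci *s)
{
    if (!s->timer_running) {
        return;
    }
    if (!s->bus_master) {
        /* DMA is blocked: enter the error state and stop ticking. */
        s->dead = true;
        s->timer_running = false;
    }
}

/* Driver-initiated reset and restart, as a kdump kernel would do later. */
static void toy_reset_and_start(ToyOhci *s)
{
    s->dead = false;
    s->bus_master = true;
    s->timer_running = true;
}
```

If the timer were stopped as soon as bus master is cleared, `toy_frame_tick()` would never reach the error branch, so the recovery path from the error state would go unexercised.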
* Re: [Qemu-devel] [PATCH] ohci: Stop OHCI bus when PCI bus master is disabled
From: Alexey Kardashevskiy @ 2014-09-11 10:30 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: qemu-ppc, qemu-devel, Alexander Graf
On 09/11/2014 08:15 PM, Gerd Hoffmann wrote:
> Hi,
>
>> To be precise, it is kdump.
>
> Ok, kdump is a different story, radically turn off all DMA is reasonable
> in that case, normal driver shutdown might be unreliable after panic and
> it also changes system state too much for a useful dump.
>
> ohci (and probably others too, but without spamming the log) throwing
> dma errors then is normal fallout though. kdump kernel should be able
> to bring the device back online after such an error, with your patch
> applied you can't test that with qemu any more ...
>
>>> Independent of that we can move the ohci error logging to tracepoints,
>>> so ohci emulation is silent by default.
>>
>> That is the other way to go, yes.
>
> Let's do it this way, please.
Yep, no problem, I'll make a small patch.
Another question: I noticed that XHCI migration is broken in quite recent
upstream QEMU; it smells like memory corruption. Is it just me, or just PPC,
or is it a known issue?
--
Alexey
* Re: [Qemu-devel] [PATCH] ohci: Stop OHCI bus when PCI bus master is disabled
From: Gerd Hoffmann @ 2014-09-11 10:38 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: qemu-ppc, qemu-devel, Alexander Graf
Hi,
> Another question - I noticed that XHCI migration is broken in quite recent
> upstream QEMU, smells like memory corruption. Is it just me or just PPC or
> is it known issue?
2.0 -> 2.1 migration being broken is a known issue (patch for that one
was on the list earlier this week, unfortunately missed 2.1.1).
Other than that, I'm not aware of any issues.
cheers,
Gerd
* Re: [Qemu-devel] [PATCH] ohci: Stop OHCI bus when PCI bus master is disabled
From: Alexey Kardashevskiy @ 2014-09-11 11:02 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: qemu-ppc, qemu-devel, Alexander Graf
On 09/11/2014 08:38 PM, Gerd Hoffmann wrote:
> Hi,
>
>> Another question - I noticed that XHCI migration is broken in quite recent
>> upstream QEMU, smells like memory corruption. Is it just me or just PPC or
>> is it known issue?
>
> 2.0 -> 2.1 migration being broken is a known issue (patch for that one
> was on the list earlier this week, unfortunately missed 2.1.1).
>
> Other than that, I'm not aware of any issues.
My bad, it was me.
I enabled 64bit DMA on pseries (the guest RAM is mapped at
8000.0000.0000.0000 on the PCI bus) and somehow this causes migration
errors. I thought there was no 64bit DMA-capable device in QEMU, and I was
wrong :)
/me is debugging
--
Alexey
* Re: [Qemu-devel] [PATCH] ohci: Stop OHCI bus when PCI bus master is disabled
From: Alexey Kardashevskiy @ 2014-09-12 4:09 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: qemu-ppc, qemu-devel, Alexander Graf
On 09/11/2014 09:02 PM, Alexey Kardashevskiy wrote:
> On 09/11/2014 08:38 PM, Gerd Hoffmann wrote:
>> Hi,
>>
>>> Another question - I noticed that XHCI migration is broken in quite recent
>>> upstream QEMU, smells like memory corruption. Is it just me or just PPC or
>>> is it known issue?
>>
>> 2.0 -> 2.1 migration being broken is a known issue (patch for that one
>> was on the list earlier this week, unfortunately missed 2.1.1).
>>
>> Other than that, I'm not aware of any issues.
>
> My bad, it was me.
>
> I enabled 64bit DMA on pseries (the guest ram is mapped at
> 8000.0000.0000.0000 on the pci bus) and somehow this causes migration
> errors. I thought there is no 64bit DMA-capable device in QEMU, and I was
> wrong :)
>
> /me is debugging
After all, my issue has nothing to do with XHCI itself. I am implementing
another DMA window (64bit, maps the entire guest), which is yet another child
object on the PHB device, but it is dynamic: it is created on request from
the guest. The XHCI driver makes that request, so I end up with a source QEMU
which has this object and a destination QEMU which does not. When I try
creating this object during migration (which adds yet another SaveStateEntry),
something goes very wrong. QOM puzzle.
--
Alexey