* [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
@ 2011-07-06 12:39 Andrew Cooper
2011-07-06 12:43 ` Jan Beulich
2011-07-06 18:42 ` Konrad Rzeszutek Wilk
0 siblings, 2 replies; 17+ messages in thread
From: Andrew Cooper @ 2011-07-06 12:39 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
In the case of a crash, IOMMU DMA remapping gets turned off so that
the kdump kernel may boot. However, this is warned as being dangerous
in the VTD specification if a DMA transaction is in progress.
Also, in the case of a crash, DMA transactions and interrupts from
peripheral devices such as network cards are likely to keep coming in.
Without DMA remapping enabled, the transactions will be writing over
low memory, corrupting the crash state, and perhaps even the kdump
reserved memory.
Therefore, on the crash path, we can disconnect all PCI devices from
their respective buses so that they are no longer able to be DMA
busmasters. This reduces the risk of DMA transactions corrupting
state (and will also reduce spurious interrupts arriving to the kdump
kernel) until the kdump kernel and properly reset the PCI devices.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/arch/x86/crash.c
--- a/xen/arch/x86/crash.c Mon Jun 27 17:37:12 2011 +0100
+++ b/xen/arch/x86/crash.c Wed Jul 06 13:37:44 2011 +0100
@@ -28,6 +28,7 @@
#include <asm/apic.h>
#include <asm/io_apic.h>
#include <xen/iommu.h>
+#include <xen/pci.h>
static atomic_t waiting_for_crash_ipi;
static unsigned int crashing_cpu;
@@ -78,6 +79,8 @@ static void nmi_shootdown_cpus(void)
msecs--;
}
+ disconnect_pci_devices();
+
/* Crash shutdown any IOMMU functionality as the crashdump kernel is not
* happy when booting if interrupt/dma remapping is still enabled */
iommu_crash_shutdown();
diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/drivers/passthrough/pci.c
--- a/xen/drivers/passthrough/pci.c Mon Jun 27 17:37:12 2011 +0100
+++ b/xen/drivers/passthrough/pci.c Wed Jul 06 13:37:44 2011 +0100
@@ -462,6 +462,32 @@ int __init scan_pci_devices(void)
return 0;
}
+/* Disconnect a PCI device from the PCI bus. From the PCI spec:
+ * "When a 0 is written to [the COMMAND] register, the device is
+ * logically disconnected from the PCI bus for all accesses except
+ * configuration accesses. All devices are required to support
+ * this base level of functionality."
+ */
+void disconnect_pci_device(struct pci_dev *pdev)
+{
+ pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
+ PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
+}
+
+/* Diconnect all PCI devices from the PCI buses.
+ */
+void disconnect_pci_devices(void)
+{
+ struct pci_dev *pdev;
+
+ spin_lock(&pcidevs_lock);
+
+ list_for_each_entry ( pdev, &alldevs_list, alldevs_list )
+ disconnect_pci_device(pdev);
+
+ spin_unlock(&pcidevs_lock);
+}
+
#ifdef SUPPORT_MSI_REMAPPING
static void dump_pci_devices(unsigned char ch)
{
diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/include/xen/pci.h
--- a/xen/include/xen/pci.h Mon Jun 27 17:37:12 2011 +0100
+++ b/xen/include/xen/pci.h Wed Jul 06 13:37:44 2011 +0100
@@ -92,6 +92,9 @@ int pci_add_device_ext(u8 bus, u8 devfn,
struct pci_dev *pci_get_pdev(int bus, int devfn);
struct pci_dev *pci_get_pdev_by_domain(struct domain *d, int bus, int devfn);
+void disconnect_pci_device(struct pci_dev * pdev);
+void disconnect_pci_devices(void);
+
uint8_t pci_conf_read8(
unsigned int bus, unsigned int dev, unsigned int func, unsigned int reg);
uint16_t pci_conf_read16(
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-06 12:39 [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash Andrew Cooper @ 2011-07-06 12:43 ` Jan Beulich 2011-07-06 12:46 ` Andrew Cooper 2011-07-06 18:42 ` Konrad Rzeszutek Wilk 1 sibling, 1 reply; 17+ messages in thread From: Jan Beulich @ 2011-07-06 12:43 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen-devel >>> On 06.07.11 at 14:39, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > In the case of a crash, IOMMU DMA remapping gets turned off so that > the kdump kernel may boot. However, this is warned as being dangerous > in the VTD specification if a DMA transaction is in progress. > > Also, in the case of a crash, DMA transactions and interrupts from > peripheral devices such as network cards are likely to keep coming in. > Without DMA remapping enabled, the transactions will be writing over > low memory, corrupting the crash state, and perhaps even the kdump > reserved memory. > > Therefore, on the crash path, we can disconnect all PCI devices from > their respective buses so that they are no longer able to be DMA > busmasters. This reduces the risk of DMA transactions corrupting > state (and will also reduce spurious interrupts arriving to the kdump > kernel) until the kdump kernel and properly reset the PCI devices. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/arch/x86/crash.c > --- a/xen/arch/x86/crash.c Mon Jun 27 17:37:12 2011 +0100 > +++ b/xen/arch/x86/crash.c Wed Jul 06 13:37:44 2011 +0100 > @@ -28,6 +28,7 @@ > #include <asm/apic.h> > #include <asm/io_apic.h> > #include <xen/iommu.h> > +#include <xen/pci.h> > > static atomic_t waiting_for_crash_ipi; > static unsigned int crashing_cpu; > @@ -78,6 +79,8 @@ static void nmi_shootdown_cpus(void) > msecs--; > } > > + disconnect_pci_devices(); > + > /* Crash shutdown any IOMMU functionality as the crashdump kernel is > not > * happy when booting if interrupt/dma remapping is still enabled */ > iommu_crash_shutdown(); > diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/drivers/passthrough/pci.c > --- a/xen/drivers/passthrough/pci.c Mon Jun 27 17:37:12 2011 +0100 > +++ b/xen/drivers/passthrough/pci.c Wed Jul 06 13:37:44 2011 +0100 > @@ -462,6 +462,32 @@ int __init scan_pci_devices(void) > return 0; > } > > +/* Disconnect a PCI device from the PCI bus. From the PCI spec: > + * "When a 0 is written to [the COMMAND] register, the device is > + * logically disconnected from the PCI bus for all accesses except > + * configuration accesses. All devices are required to support > + * this base level of functionality." > + */ > +void disconnect_pci_device(struct pci_dev *pdev) Any reason this cannot be static? Or even be integrated into the single caller? Jan > +{ > + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), > + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); > +} > + > +/* Diconnect all PCI devices from the PCI buses. > + */ > +void disconnect_pci_devices(void) > +{ > + struct pci_dev *pdev; > + > + spin_lock(&pcidevs_lock); > + > + list_for_each_entry ( pdev, &alldevs_list, alldevs_list ) > + disconnect_pci_device(pdev); > + > + spin_unlock(&pcidevs_lock); > +} > + > #ifdef SUPPORT_MSI_REMAPPING > static void dump_pci_devices(unsigned char ch) > { > diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/include/xen/pci.h > --- a/xen/include/xen/pci.h Mon Jun 27 17:37:12 2011 +0100 > +++ b/xen/include/xen/pci.h Wed Jul 06 13:37:44 2011 +0100 > @@ -92,6 +92,9 @@ int pci_add_device_ext(u8 bus, u8 devfn, > struct pci_dev *pci_get_pdev(int bus, int devfn); > struct pci_dev *pci_get_pdev_by_domain(struct domain *d, int bus, int > devfn); > > +void disconnect_pci_device(struct pci_dev * pdev); > +void disconnect_pci_devices(void); > + > uint8_t pci_conf_read8( > unsigned int bus, unsigned int dev, unsigned int func, unsigned int > reg); > uint16_t pci_conf_read16( > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-06 12:43 ` Jan Beulich @ 2011-07-06 12:46 ` Andrew Cooper 0 siblings, 0 replies; 17+ messages in thread From: Andrew Cooper @ 2011-07-06 12:46 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel@lists.xensource.com On 06/07/11 13:43, Jan Beulich wrote: >>>> On 06.07.11 at 14:39, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> In the case of a crash, IOMMU DMA remapping gets turned off so that >> the kdump kernel may boot. However, this is warned as being dangerous >> in the VTD specification if a DMA transaction is in progress. >> >> Also, in the case of a crash, DMA transactions and interrupts from >> peripheral devices such as network cards are likely to keep coming in. >> Without DMA remapping enabled, the transactions will be writing over >> low memory, corrupting the crash state, and perhaps even the kdump >> reserved memory. >> >> Therefore, on the crash path, we can disconnect all PCI devices from >> their respective buses so that they are no longer able to be DMA >> busmasters. This reduces the risk of DMA transactions corrupting >> state (and will also reduce spurious interrupts arriving to the kdump >> kernel) until the kdump kernel and properly reset the PCI devices. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/arch/x86/crash.c >> --- a/xen/arch/x86/crash.c Mon Jun 27 17:37:12 2011 +0100 >> +++ b/xen/arch/x86/crash.c Wed Jul 06 13:37:44 2011 +0100 >> @@ -28,6 +28,7 @@ >> #include <asm/apic.h> >> #include <asm/io_apic.h> >> #include <xen/iommu.h> >> +#include <xen/pci.h> >> >> static atomic_t waiting_for_crash_ipi; >> static unsigned int crashing_cpu; >> @@ -78,6 +79,8 @@ static void nmi_shootdown_cpus(void) >> msecs--; >> } >> >> + disconnect_pci_devices(); >> + >> /* Crash shutdown any IOMMU functionality as the crashdump kernel is >> not >> * happy when booting if interrupt/dma remapping is still enabled */ >> iommu_crash_shutdown(); >> diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/drivers/passthrough/pci.c >> --- a/xen/drivers/passthrough/pci.c Mon Jun 27 17:37:12 2011 +0100 >> +++ b/xen/drivers/passthrough/pci.c Wed Jul 06 13:37:44 2011 +0100 >> @@ -462,6 +462,32 @@ int __init scan_pci_devices(void) >> return 0; >> } >> >> +/* Disconnect a PCI device from the PCI bus. From the PCI spec: >> + * "When a 0 is written to [the COMMAND] register, the device is >> + * logically disconnected from the PCI bus for all accesses except >> + * configuration accesses. All devices are required to support >> + * this base level of functionality." >> + */ >> +void disconnect_pci_device(struct pci_dev *pdev) > Any reason this cannot be static? Or even be integrated into the > single caller? > > Jan Not specifically - I can fold them together if you think that would be better ~Andrew >> +{ >> + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), >> + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); >> +} >> + >> +/* Diconnect all PCI devices from the PCI buses. >> + */ >> +void disconnect_pci_devices(void) >> +{ >> + struct pci_dev *pdev; >> + >> + spin_lock(&pcidevs_lock); >> + >> + list_for_each_entry ( pdev, &alldevs_list, alldevs_list ) >> + disconnect_pci_device(pdev); >> + >> + spin_unlock(&pcidevs_lock); >> +} >> + >> #ifdef SUPPORT_MSI_REMAPPING >> static void dump_pci_devices(unsigned char ch) >> { >> diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/include/xen/pci.h >> --- a/xen/include/xen/pci.h Mon Jun 27 17:37:12 2011 +0100 >> +++ b/xen/include/xen/pci.h Wed Jul 06 13:37:44 2011 +0100 >> @@ -92,6 +92,9 @@ int pci_add_device_ext(u8 bus, u8 devfn, >> struct pci_dev *pci_get_pdev(int bus, int devfn); >> struct pci_dev *pci_get_pdev_by_domain(struct domain *d, int bus, int >> devfn); >> >> +void disconnect_pci_device(struct pci_dev * pdev); >> +void disconnect_pci_devices(void); >> + >> uint8_t pci_conf_read8( >> unsigned int bus, unsigned int dev, unsigned int func, unsigned int >> reg); >> uint16_t pci_conf_read16( >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel > > -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-06 12:39 [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash Andrew Cooper 2011-07-06 12:43 ` Jan Beulich @ 2011-07-06 18:42 ` Konrad Rzeszutek Wilk 2011-07-07 8:41 ` George Dunlap 2011-07-07 8:53 ` Ian Campbell 1 sibling, 2 replies; 17+ messages in thread From: Konrad Rzeszutek Wilk @ 2011-07-06 18:42 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen-devel On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: > In the case of a crash, IOMMU DMA remapping gets turned off so that > the kdump kernel may boot. However, this is warned as being dangerous > in the VTD specification if a DMA transaction is in progress. > > Also, in the case of a crash, DMA transactions and interrupts from > peripheral devices such as network cards are likely to keep coming in. > Without DMA remapping enabled, the transactions will be writing over > low memory, corrupting the crash state, and perhaps even the kdump > reserved memory. > > Therefore, on the crash path, we can disconnect all PCI devices from > their respective buses so that they are no longer able to be DMA > busmasters. This reduces the risk of DMA transactions corrupting > state (and will also reduce spurious interrupts arriving to the kdump > kernel) until the kdump kernel and properly reset the PCI devices. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/arch/x86/crash.c > --- a/xen/arch/x86/crash.c Mon Jun 27 17:37:12 2011 +0100 > +++ b/xen/arch/x86/crash.c Wed Jul 06 13:37:44 2011 +0100 > @@ -28,6 +28,7 @@ > #include <asm/apic.h> > #include <asm/io_apic.h> > #include <xen/iommu.h> > +#include <xen/pci.h> > > static atomic_t waiting_for_crash_ipi; > static unsigned int crashing_cpu; > @@ -78,6 +79,8 @@ static void nmi_shootdown_cpus(void) > msecs--; > } > > + disconnect_pci_devices(); > + > /* Crash shutdown any IOMMU functionality as the crashdump kernel is not > * happy when booting if interrupt/dma remapping is still enabled */ > iommu_crash_shutdown(); > diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/drivers/passthrough/pci.c > --- a/xen/drivers/passthrough/pci.c Mon Jun 27 17:37:12 2011 +0100 > +++ b/xen/drivers/passthrough/pci.c Wed Jul 06 13:37:44 2011 +0100 > @@ -462,6 +462,32 @@ int __init scan_pci_devices(void) > return 0; > } > > +/* Disconnect a PCI device from the PCI bus. From the PCI spec: > + * "When a 0 is written to [the COMMAND] register, the device is > + * logically disconnected from the PCI bus for all accesses except > + * configuration accesses. All devices are required to support > + * this base level of functionality." > + */ > +void disconnect_pci_device(struct pci_dev *pdev) > +{ > + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), > + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); So if you have a PCI serial card (or Intel AMT) and you are using that for serial output on the hypervisor line, this will turn it off. There should be some whitelist capability to not do it for PCI serial devices that are owned (used) by the hypervisor. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-06 18:42 ` Konrad Rzeszutek Wilk @ 2011-07-07 8:41 ` George Dunlap 2011-07-07 8:53 ` Ian Campbell 1 sibling, 0 replies; 17+ messages in thread From: George Dunlap @ 2011-07-07 8:41 UTC (permalink / raw) To: Konrad Rzeszutek Wilk; +Cc: Andrew Cooper, xen-devel On Wed, Jul 6, 2011 at 7:42 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > So if you have a PCI serial card (or Intel AMT) and you are using that for > serial output on the hypervisor line, this will turn it off. There should > be some whitelist capability to not do it for PCI serial devices that are > owned (used) by the hypervisor. Good catch, Konrad. Andrew, I have a box with such a configuration, if you want to test it. -George ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-06 18:42 ` Konrad Rzeszutek Wilk 2011-07-07 8:41 ` George Dunlap @ 2011-07-07 8:53 ` Ian Campbell 2011-07-07 9:10 ` Jan Beulich 1 sibling, 1 reply; 17+ messages in thread From: Ian Campbell @ 2011-07-07 8:53 UTC (permalink / raw) To: Konrad Rzeszutek Wilk; +Cc: Andrew Cooper, xen-devel@lists.xensource.com On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote: > On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: > > In the case of a crash, IOMMU DMA remapping gets turned off so that > > the kdump kernel may boot. However, this is warned as being dangerous > > in the VTD specification if a DMA transaction is in progress. > > > > Also, in the case of a crash, DMA transactions and interrupts from > > peripheral devices such as network cards are likely to keep coming in. > > Without DMA remapping enabled, the transactions will be writing over > > low memory, corrupting the crash state, and perhaps even the kdump > > reserved memory. > > > > Therefore, on the crash path, we can disconnect all PCI devices from > > their respective buses so that they are no longer able to be DMA > > busmasters. This reduces the risk of DMA transactions corrupting > > state (and will also reduce spurious interrupts arriving to the kdump > > kernel) until the kdump kernel and properly reset the PCI devices. > > > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > > > diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/arch/x86/crash.c > > --- a/xen/arch/x86/crash.c Mon Jun 27 17:37:12 2011 +0100 > > +++ b/xen/arch/x86/crash.c Wed Jul 06 13:37:44 2011 +0100 > > @@ -28,6 +28,7 @@ > > #include <asm/apic.h> > > #include <asm/io_apic.h> > > #include <xen/iommu.h> > > +#include <xen/pci.h> > > > > static atomic_t waiting_for_crash_ipi; > > static unsigned int crashing_cpu; > > @@ -78,6 +79,8 @@ static void nmi_shootdown_cpus(void) > > msecs--; > > } > > > > + disconnect_pci_devices(); > > + > > /* Crash shutdown any IOMMU functionality as the crashdump kernel is not > > * happy when booting if interrupt/dma remapping is still enabled */ > > iommu_crash_shutdown(); > > diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/drivers/passthrough/pci.c > > --- a/xen/drivers/passthrough/pci.c Mon Jun 27 17:37:12 2011 +0100 > > +++ b/xen/drivers/passthrough/pci.c Wed Jul 06 13:37:44 2011 +0100 > > @@ -462,6 +462,32 @@ int __init scan_pci_devices(void) > > return 0; > > } > > > > +/* Disconnect a PCI device from the PCI bus. From the PCI spec: > > + * "When a 0 is written to [the COMMAND] register, the device is > > + * logically disconnected from the PCI bus for all accesses except > > + * configuration accesses. All devices are required to support > > + * this base level of functionality." > > + */ > > +void disconnect_pci_device(struct pci_dev *pdev) > > +{ > > + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), > > + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); > > So if you have a PCI serial card (or Intel AMT) and you are using that for > serial output on the hypervisor line, this will turn it off. There should > be some whitelist capability to not do it for PCI serial devices that are > owned (used) by the hypervisor. That would be useful for debugging the kexec process itself but in the general case there won't be any further output from the hypervisor and if the kexec'd kernel wants to use the device it is going to have to set it up again anyways. On the other hand if the hypervisor is driving a device it presumably knows (or could know) how to turn of interrupts and switch to polled mode so we might as well do that? Ian. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-07 8:53 ` Ian Campbell @ 2011-07-07 9:10 ` Jan Beulich 2011-07-07 9:12 ` Ian Campbell 0 siblings, 1 reply; 17+ messages in thread From: Jan Beulich @ 2011-07-07 9:10 UTC (permalink / raw) To: Andrew Cooper, Ian Campbell Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk >>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote: >> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: >> > +/* Disconnect a PCI device from the PCI bus. From the PCI spec: >> > + * "When a 0 is written to [the COMMAND] register, the device is >> > + * logically disconnected from the PCI bus for all accesses except >> > + * configuration accesses. All devices are required to support >> > + * this base level of functionality." >> > + */ >> > +void disconnect_pci_device(struct pci_dev *pdev) >> > +{ >> > + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), >> > + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); >> >> So if you have a PCI serial card (or Intel AMT) and you are using that for >> serial output on the hypervisor line, this will turn it off. There should >> be some whitelist capability to not do it for PCI serial devices that are >> owned (used) by the hypervisor. > > That would be useful for debugging the kexec process itself but in the > general case there won't be any further output from the hypervisor and > if the kexec'd kernel wants to use the device it is going to have to set > it up again anyways. No, not generally. Just look at Linux' early-printk code: The device is assumed to be enabled (by the BIOS), as the PCI subsystem can't possibly be initialized at this point already. This also means that white-listing just devices Xen uses may not be enough: If Xen doesn't use a serial console (or the secondary kernel wants to use some other device Xen doesn't care about - VGA or other kind of console devices come to mind), it must not find it fully disconnected from the bus. Consequently I would think that while interrupt and DMA activity should be forced off, decoding I/O and memory addresses by the devices shouldn't be. Jan ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-07 9:10 ` Jan Beulich @ 2011-07-07 9:12 ` Ian Campbell 2011-07-07 9:42 ` Jan Beulich 0 siblings, 1 reply; 17+ messages in thread From: Ian Campbell @ 2011-07-07 9:12 UTC (permalink / raw) To: Jan Beulich Cc: Andrew Cooper, xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote: > >>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote: > >> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: > >> > +/* Disconnect a PCI device from the PCI bus. From the PCI spec: > >> > + * "When a 0 is written to [the COMMAND] register, the device is > >> > + * logically disconnected from the PCI bus for all accesses except > >> > + * configuration accesses. All devices are required to support > >> > + * this base level of functionality." > >> > + */ > >> > +void disconnect_pci_device(struct pci_dev *pdev) > >> > +{ > >> > + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), > >> > + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); > >> > >> So if you have a PCI serial card (or Intel AMT) and you are using that for > >> serial output on the hypervisor line, this will turn it off. There should > >> be some whitelist capability to not do it for PCI serial devices that are > >> owned (used) by the hypervisor. > > > > That would be useful for debugging the kexec process itself but in the > > general case there won't be any further output from the hypervisor and > > if the kexec'd kernel wants to use the device it is going to have to set > > it up again anyways. > > No, not generally. Just look at Linux' early-printk code: The device > is assumed to be enabled (by the BIOS), as the PCI subsystem can't > possibly be initialized at this point already. That's arguably a debugging facility as well though. > This also means that white-listing just devices Xen uses may not be > enough: If Xen doesn't use a serial console (or the secondary kernel > wants to use some other device Xen doesn't care about - VGA or > other kind of console devices come to mind), it must not find it fully > disconnected from the bus. Consequently I would think that while > interrupt and DMA activity should be forced off, decoding I/O and > memory addresses by the devices shouldn't be. The problem is that this can't be done without device specific knowledge, which the hypervisor generally doesn't have and we can't get the device's owning domain to do anything because we are crashing. Ian. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-07 9:12 ` Ian Campbell @ 2011-07-07 9:42 ` Jan Beulich 2011-07-07 9:53 ` Ian Campbell 0 siblings, 1 reply; 17+ messages in thread From: Jan Beulich @ 2011-07-07 9:42 UTC (permalink / raw) To: Ian Campbell Cc: Andrew Cooper, xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk >>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote: > On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote: >> >>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: >> > On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote: >> >> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: >> >> > +/* Disconnect a PCI device from the PCI bus. From the PCI spec: >> >> > + * "When a 0 is written to [the COMMAND] register, the device is >> >> > + * logically disconnected from the PCI bus for all accesses except >> >> > + * configuration accesses. All devices are required to support >> >> > + * this base level of functionality." >> >> > + */ >> >> > +void disconnect_pci_device(struct pci_dev *pdev) >> >> > +{ >> >> > + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), >> >> > + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); >> >> >> >> So if you have a PCI serial card (or Intel AMT) and you are using that for >> >> serial output on the hypervisor line, this will turn it off. There should >> >> be some whitelist capability to not do it for PCI serial devices that are >> >> owned (used) by the hypervisor. >> > >> > That would be useful for debugging the kexec process itself but in the >> > general case there won't be any further output from the hypervisor and >> > if the kexec'd kernel wants to use the device it is going to have to set >> > it up again anyways. >> >> No, not generally. Just look at Linux' early-printk code: The device >> is assumed to be enabled (by the BIOS), as the PCI subsystem can't >> possibly be initialized at this point already. > > That's arguably a debugging facility as well though. > >> This also means that white-listing just devices Xen uses may not be >> enough: If Xen doesn't use a serial console (or the secondary kernel >> wants to use some other device Xen doesn't care about - VGA or >> other kind of console devices come to mind), it must not find it fully >> disconnected from the bus. Consequently I would think that while >> interrupt and DMA activity should be forced off, decoding I/O and >> memory addresses by the devices shouldn't be. > > The problem is that this can't be done without device specific > knowledge, which the hypervisor generally doesn't have and we can't get > the device's owning domain to do anything because we are crashing. Why would there be any device specific knowledge needed? It's all done through the command word, just that writing zero isn't really appropriate. Jan ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-07 9:42 ` Jan Beulich @ 2011-07-07 9:53 ` Ian Campbell 2011-07-07 10:02 ` Andrew Cooper 2011-07-07 11:21 ` Jan Beulich 0 siblings, 2 replies; 17+ messages in thread From: Ian Campbell @ 2011-07-07 9:53 UTC (permalink / raw) To: Jan Beulich Cc: Andrew Cooper, xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote: > >>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote: > > On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote: > >> >>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: > >> > On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote: > >> >> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: > >> >> > +/* Disconnect a PCI device from the PCI bus. From the PCI spec: > >> >> > + * "When a 0 is written to [the COMMAND] register, the device is > >> >> > + * logically disconnected from the PCI bus for all accesses except > >> >> > + * configuration accesses. All devices are required to support > >> >> > + * this base level of functionality." > >> >> > + */ > >> >> > +void disconnect_pci_device(struct pci_dev *pdev) > >> >> > +{ > >> >> > + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), > >> >> > + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); > >> >> > >> >> So if you have a PCI serial card (or Intel AMT) and you are using that for > >> >> serial output on the hypervisor line, this will turn it off. There should > >> >> be some whitelist capability to not do it for PCI serial devices that are > >> >> owned (used) by the hypervisor. > >> > > >> > That would be useful for debugging the kexec process itself but in the > >> > general case there won't be any further output from the hypervisor and > >> > if the kexec'd kernel wants to use the device it is going to have to set > >> > it up again anyways. > >> > >> No, not generally. Just look at Linux' early-printk code: The device > >> is assumed to be enabled (by the BIOS), as the PCI subsystem can't > >> possibly be initialized at this point already. > > > > That's arguably a debugging facility as well though. > > > >> This also means that white-listing just devices Xen uses may not be > >> enough: If Xen doesn't use a serial console (or the secondary kernel > >> wants to use some other device Xen doesn't care about - VGA or > >> other kind of console devices come to mind), it must not find it fully > >> disconnected from the bus. Consequently I would think that while > >> interrupt and DMA activity should be forced off, decoding I/O and > >> memory addresses by the devices shouldn't be. > > > > The problem is that this can't be done without device specific > > knowledge, which the hypervisor generally doesn't have and we can't get > > the device's owning domain to do anything because we are crashing. > > Why would there be any device specific knowledge needed? It's > all done through the command word, just that writing zero isn't > really appropriate. So presumably if you disable bus mastering you've effectively disabled DMA but how do you disable interrupts via the command word? Ian. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-07 9:53 ` Ian Campbell @ 2011-07-07 10:02 ` Andrew Cooper 2011-07-07 10:06 ` Tim Deegan ` (2 more replies) 2011-07-07 11:21 ` Jan Beulich 1 sibling, 3 replies; 17+ messages in thread From: Andrew Cooper @ 2011-07-07 10:02 UTC (permalink / raw) To: Ian Campbell Cc: xen-devel@lists.xensource.com, Jan Beulich, Konrad Rzeszutek Wilk On 07/07/11 10:53, Ian Campbell wrote: > On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote: >>>>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote: >>> On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote: >>>>>>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: >>>>> On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote: >>>>>> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: >>>>>>> +/* Disconnect a PCI device from the PCI bus. From the PCI spec: >>>>>>> + * "When a 0 is written to [the COMMAND] register, the device is >>>>>>> + * logically disconnected from the PCI bus for all accesses except >>>>>>> + * configuration accesses. All devices are required to support >>>>>>> + * this base level of functionality." >>>>>>> + */ >>>>>>> +void disconnect_pci_device(struct pci_dev *pdev) >>>>>>> +{ >>>>>>> + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), >>>>>>> + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); >>>>>> So if you have a PCI serial card (or Intel AMT) and you are using that for >>>>>> serial output on the hypervisor line, this will turn it off. There should >>>>>> be some whitelist capability to not do it for PCI serial devices that are >>>>>> owned (used) by the hypervisor. >>>>> That would be useful for debugging the kexec process itself but in the >>>>> general case there won't be any further output from the hypervisor and >>>>> if the kexec'd kernel wants to use the device it is going to have to set >>>>> it up again anyways. >>>> No, not generally. Just look at Linux' early-printk code: The device >>>> is assumed to be enabled (by the BIOS), as the PCI subsystem can't >>>> possibly be initialized at this point already. >>> That's arguably a debugging facility as well though. >>> >>>> This also means that white-listing just devices Xen uses may not be >>>> enough: If Xen doesn't use a serial console (or the secondary kernel >>>> wants to use some other device Xen doesn't care about - VGA or >>>> other kind of console devices come to mind), it must not find it fully >>>> disconnected from the bus. Consequently I would think that while >>>> interrupt and DMA activity should be forced off, decoding I/O and >>>> memory addresses by the devices shouldn't be. >>> The problem is that this can't be done without device specific >>> knowledge, which the hypervisor generally doesn't have and we can't get >>> the device's owning domain to do anything because we are crashing. >> Why would there be any device specific knowledge needed? It's >> all done through the command word, just that writing zero isn't >> really appropriate. > So presumably if you disable bus mastering you've effectively disabled > DMA but how do you disable interrupts via the command word? > > Ian. > Bit 10 of the control word is "disable assertion of INTx# pins" (so set bit to 1 to disable interrupts). That should cover legacy interrupts. For MSI and above, disabling DMA should prevent the bus writing to the magic local APIC addresses. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-07 10:02 ` Andrew Cooper @ 2011-07-07 10:06 ` Tim Deegan 2011-07-07 10:07 ` Ian Campbell 2011-07-07 11:25 ` Jan Beulich 2 siblings, 0 replies; 17+ messages in thread From: Tim Deegan @ 2011-07-07 10:06 UTC (permalink / raw) To: Andrew Cooper Cc: Ian Campbell, Rzeszutek Wilk, xen-devel@lists.xensource.com, Jan Beulich, Konrad At 11:02 +0100 on 07 Jul (1310036565), Andrew Cooper wrote: > > > On 07/07/11 10:53, Ian Campbell wrote: > > On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote: > >>>>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote: > >>> On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote: > >>>>>>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: > >>>>> On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote: > >>>>>> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: > >>>>>>> +/* Disconnect a PCI device from the PCI bus. From the PCI spec: > >>>>>>> + * "When a 0 is written to [the COMMAND] register, the device is > >>>>>>> + * logically disconnected from the PCI bus for all accesses except > >>>>>>> + * configuration accesses. All devices are required to support > >>>>>>> + * this base level of functionality." > >>>>>>> + */ > >>>>>>> +void disconnect_pci_device(struct pci_dev *pdev) > >>>>>>> +{ > >>>>>>> + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), > >>>>>>> + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); > >>>>>> So if you have a PCI serial card (or Intel AMT) and you are using that for > >>>>>> serial output on the hypervisor line, this will turn it off. There should > >>>>>> be some whitelist capability to not do it for PCI serial devices that are > >>>>>> owned (used) by the hypervisor. > >>>>> That would be useful for debugging the kexec process itself but in the > >>>>> general case there won't be any further output from the hypervisor and > >>>>> if the kexec'd kernel wants to use the device it is going to have to set > >>>>> it up again anyways. > >>>> No, not generally. Just look at Linux' early-printk code: The device > >>>> is assumed to be enabled (by the BIOS), as the PCI subsystem can't > >>>> possibly be initialized at this point already. > >>> That's arguably a debugging facility as well though. > >>> > >>>> This also means that white-listing just devices Xen uses may not be > >>>> enough: If Xen doesn't use a serial console (or the secondary kernel > >>>> wants to use some other device Xen doesn't care about - VGA or > >>>> other kind of console devices come to mind), it must not find it fully > >>>> disconnected from the bus. Consequently I would think that while > >>>> interrupt and DMA activity should be forced off, decoding I/O and > >>>> memory addresses by the devices shouldn't be. > >>> The problem is that this can't be done without device specific > >>> knowledge, which the hypervisor generally doesn't have and we can't get > >>> the device's owning domain to do anything because we are crashing. > >> Why would there be any device specific knowledge needed? It's > >> all done through the command word, just that writing zero isn't > >> really appropriate. > > So presumably if you disable bus mastering you've effectively disabled > > DMA but how do you disable interrupts via the command word? > > > > Ian. > > > Bit 10 of the control word is "disable assertion of INTx# pins" (so set > bit to 1 to disable interrupts). That should cover legacy interrupts. In older specs that's a reserved bit; I suspect on real PCI devices it will have no effect. > For MSI and above, disabling DMA should prevent the bus writing to the > magic local APIC addresses. Are MSI cycles always the same as DMA cycles? I though that MSI-X changed that (but might be quite wrong). Tim. -- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-07 10:02 ` Andrew Cooper 2011-07-07 10:06 ` Tim Deegan @ 2011-07-07 10:07 ` Ian Campbell 2011-07-07 11:25 ` Jan Beulich 2 siblings, 0 replies; 17+ messages in thread From: Ian Campbell @ 2011-07-07 10:07 UTC (permalink / raw) To: Andrew Cooper Cc: xen-devel@lists.xensource.com, Jan Beulich, Konrad Rzeszutek Wilk On Thu, 2011-07-07 at 11:02 +0100, Andrew Cooper wrote: > > On 07/07/11 10:53, Ian Campbell wrote: > > On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote: > >>>>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote: > >>> On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote: > >>>>>>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: > >>>>> On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote: > >>>>>> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: > >>>>>>> +/* Disconnect a PCI device from the PCI bus. From the PCI spec: > >>>>>>> + * "When a 0 is written to [the COMMAND] register, the device is > >>>>>>> + * logically disconnected from the PCI bus for all accesses except > >>>>>>> + * configuration accesses. All devices are required to support > >>>>>>> + * this base level of functionality." > >>>>>>> + */ > >>>>>>> +void disconnect_pci_device(struct pci_dev *pdev) > >>>>>>> +{ > >>>>>>> + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), > >>>>>>> + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); > >>>>>> So if you have a PCI serial card (or Intel AMT) and you are using that for > >>>>>> serial output on the hypervisor line, this will turn it off. There should > >>>>>> be some whitelist capability to not do it for PCI serial devices that are > >>>>>> owned (used) by the hypervisor. > >>>>> That would be useful for debugging the kexec process itself but in the > >>>>> general case there won't be any further output from the hypervisor and > >>>>> if the kexec'd kernel wants to use the device it is going to have to set > >>>>> it up again anyways. > >>>> No, not generally. Just look at Linux' early-printk code: The device > >>>> is assumed to be enabled (by the BIOS), as the PCI subsystem can't > >>>> possibly be initialized at this point already. > >>> That's arguably a debugging facility as well though. > >>> > >>>> This also means that white-listing just devices Xen uses may not be > >>>> enough: If Xen doesn't use a serial console (or the secondary kernel > >>>> wants to use some other device Xen doesn't care about - VGA or > >>>> other kind of console devices come to mind), it must not find it fully > >>>> disconnected from the bus. Consequently I would think that while > >>>> interrupt and DMA activity should be forced off, decoding I/O and > >>>> memory addresses by the devices shouldn't be. > >>> The problem is that this can't be done without device specific > >>> knowledge, which the hypervisor generally doesn't have and we can't get > >>> the device's owning domain to do anything because we are crashing. > >> Why would there be any device specific knowledge needed? It's > >> all done through the command word, just that writing zero isn't > >> really appropriate. > > So presumably if you disable bus mastering you've effectively disabled > > DMA but how do you disable interrupts via the command word? > > > > Ian. > > > Bit 10 of the control word is "disable assertion of INTx# pins" (so set > bit to 1 to disable interrupts). My copy of the spec isn't new enough to cover that particular bit but the Linux headers describe it as disabling INTx "emulation" so I wasn't sure if it actually did anything in all cases, especially on older hardware. > That should cover legacy interrupts. > For MSI and above, disabling DMA should prevent the bus writing to the > magic local APIC addresses. > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-07 10:02 ` Andrew Cooper 2011-07-07 10:06 ` Tim Deegan 2011-07-07 10:07 ` Ian Campbell @ 2011-07-07 11:25 ` Jan Beulich 2 siblings, 0 replies; 17+ messages in thread From: Jan Beulich @ 2011-07-07 11:25 UTC (permalink / raw) To: Andrew Cooper Cc: Ian Campbell, xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk >>> On 07.07.11 at 12:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > > On 07/07/11 10:53, Ian Campbell wrote: >> On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote: >>>>>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote: >>>> On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote: >>>>>>>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: >>>>>> On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote: >>>>>>> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: >>>>>>>> +/* Disconnect a PCI device from the PCI bus. From the PCI spec: >>>>>>>> + * "When a 0 is written to [the COMMAND] register, the device is >>>>>>>> + * logically disconnected from the PCI bus for all accesses except >>>>>>>> + * configuration accesses. All devices are required to support >>>>>>>> + * this base level of functionality." >>>>>>>> + */ >>>>>>>> +void disconnect_pci_device(struct pci_dev *pdev) >>>>>>>> +{ >>>>>>>> + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), >>>>>>>> + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); >>>>>>> So if you have a PCI serial card (or Intel AMT) and you are using that for >>>>>>> serial output on the hypervisor line, this will turn it off. There should >>>>>>> be some whitelist capability to not do it for PCI serial devices that are >>>>>>> owned (used) by the hypervisor. >>>>>> That would be useful for debugging the kexec process itself but in the >>>>>> general case there won't be any further output from the hypervisor and >>>>>> if the kexec'd kernel wants to use the device it is going to have to set >>>>>> it up again anyways. >>>>> No, not generally. Just look at Linux' early-printk code: The device >>>>> is assumed to be enabled (by the BIOS), as the PCI subsystem can't >>>>> possibly be initialized at this point already. >>>> That's arguably a debugging facility as well though. >>>> >>>>> This also means that white-listing just devices Xen uses may not be >>>>> enough: If Xen doesn't use a serial console (or the secondary kernel >>>>> wants to use some other device Xen doesn't care about - VGA or >>>>> other kind of console devices come to mind), it must not find it fully >>>>> disconnected from the bus. Consequently I would think that while >>>>> interrupt and DMA activity should be forced off, decoding I/O and >>>>> memory addresses by the devices shouldn't be. >>>> The problem is that this can't be done without device specific >>>> knowledge, which the hypervisor generally doesn't have and we can't get >>>> the device's owning domain to do anything because we are crashing. >>> Why would there be any device specific knowledge needed? It's >>> all done through the command word, just that writing zero isn't >>> really appropriate. >> So presumably if you disable bus mastering you've effectively disabled >> DMA but how do you disable interrupts via the command word? >> >> Ian. >> > Bit 10 of the control word is "disable assertion of INTx# pins" (so set > bit to 1 to disable interrupts). That should cover legacy interrupts. > For MSI and above, disabling DMA should prevent the bus writing to the > magic local APIC addresses. No, I don't thing the bus mastering bit disables MSI accesses (and even if officially it did, I wouldn't trust it). But disabling MSI itself isn't device specific either, so should be an option here (with the caveat that there are a few buggy devices that can't get back into MSI mode if it was once disabled, so perhaps masking MSI when possible should be preferred over disabling it). Jan ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-07 9:53 ` Ian Campbell 2011-07-07 10:02 ` Andrew Cooper @ 2011-07-07 11:21 ` Jan Beulich 2011-07-07 11:40 ` Stefano Stabellini 1 sibling, 1 reply; 17+ messages in thread From: Jan Beulich @ 2011-07-07 11:21 UTC (permalink / raw) To: Ian Campbell Cc: Andrew Cooper, xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk >>> On 07.07.11 at 11:53, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote: > On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote: >> >>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote: >> > On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote: >> >> >>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: >> >> > On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote: >> >> >> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: >> >> >> > +/* Disconnect a PCI device from the PCI bus. From the PCI spec: >> >> >> > + * "When a 0 is written to [the COMMAND] register, the device is >> >> >> > + * logically disconnected from the PCI bus for all accesses except >> >> >> > + * configuration accesses. All devices are required to support >> >> >> > + * this base level of functionality." >> >> >> > + */ >> >> >> > +void disconnect_pci_device(struct pci_dev *pdev) >> >> >> > +{ >> >> >> > + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), >> >> >> > + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); >> >> >> >> >> >> So if you have a PCI serial card (or Intel AMT) and you are using that for >> >> >> serial output on the hypervisor line, this will turn it off. There should >> >> >> be some whitelist capability to not do it for PCI serial devices that are >> >> >> owned (used) by the hypervisor. >> >> > >> >> > That would be useful for debugging the kexec process itself but in the >> >> > general case there won't be any further output from the hypervisor and >> >> > if the kexec'd kernel wants to use the device it is going to have to set >> >> > it up again anyways. >> >> >> >> No, not generally. Just look at Linux' early-printk code: The device >> >> is assumed to be enabled (by the BIOS), as the PCI subsystem can't >> >> possibly be initialized at this point already. >> > >> > That's arguably a debugging facility as well though. >> > >> >> This also means that white-listing just devices Xen uses may not be >> >> enough: If Xen doesn't use a serial console (or the secondary kernel >> >> wants to use some other device Xen doesn't care about - VGA or >> >> other kind of console devices come to mind), it must not find it fully >> >> disconnected from the bus. Consequently I would think that while >> >> interrupt and DMA activity should be forced off, decoding I/O and >> >> memory addresses by the devices shouldn't be. >> > >> > The problem is that this can't be done without device specific >> > knowledge, which the hypervisor generally doesn't have and we can't get >> > the device's owning domain to do anything because we are crashing. >> >> Why would there be any device specific knowledge needed? It's >> all done through the command word, just that writing zero isn't >> really appropriate. > > So presumably if you disable bus mastering you've effectively disabled > DMA but how do you disable interrupts via the command word? Interrupts can be disabled equally well in the IO-APIC and by disabling MSI for all devices. Jan ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-07 11:21 ` Jan Beulich @ 2011-07-07 11:40 ` Stefano Stabellini 2011-07-07 13:03 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 17+ messages in thread From: Stefano Stabellini @ 2011-07-07 11:40 UTC (permalink / raw) To: Jan Beulich Cc: Ian Campbell, Andrew Cooper, xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk On Thu, 7 Jul 2011, Jan Beulich wrote: > >>> On 07.07.11 at 11:53, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote: > > On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote: > >> >>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote: > >> > On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote: > >> >> >>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: > >> >> > On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote: > >> >> >> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote: > >> >> >> > +/* Disconnect a PCI device from the PCI bus. From the PCI spec: > >> >> >> > + * "When a 0 is written to [the COMMAND] register, the device is > >> >> >> > + * logically disconnected from the PCI bus for all accesses except > >> >> >> > + * configuration accesses. All devices are required to support > >> >> >> > + * this base level of functionality." > >> >> >> > + */ > >> >> >> > +void disconnect_pci_device(struct pci_dev *pdev) > >> >> >> > +{ > >> >> >> > + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn), > >> >> >> > + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0); > >> >> >> > >> >> >> So if you have a PCI serial card (or Intel AMT) and you are using that for > >> >> >> serial output on the hypervisor line, this will turn it off. There should > >> >> >> be some whitelist capability to not do it for PCI serial devices that are > >> >> >> owned (used) by the hypervisor. > >> >> > > >> >> > That would be useful for debugging the kexec process itself but in the > >> >> > general case there won't be any further output from the hypervisor and > >> >> > if the kexec'd kernel wants to use the device it is going to have to set > >> >> > it up again anyways. > >> >> > >> >> No, not generally. Just look at Linux' early-printk code: The device > >> >> is assumed to be enabled (by the BIOS), as the PCI subsystem can't > >> >> possibly be initialized at this point already. > >> > > >> > That's arguably a debugging facility as well though. > >> > > >> >> This also means that white-listing just devices Xen uses may not be > >> >> enough: If Xen doesn't use a serial console (or the secondary kernel > >> >> wants to use some other device Xen doesn't care about - VGA or > >> >> other kind of console devices come to mind), it must not find it fully > >> >> disconnected from the bus. Consequently I would think that while > >> >> interrupt and DMA activity should be forced off, decoding I/O and > >> >> memory addresses by the devices shouldn't be. > >> > > >> > The problem is that this can't be done without device specific > >> > knowledge, which the hypervisor generally doesn't have and we can't get > >> > the device's owning domain to do anything because we are crashing. > >> > >> Why would there be any device specific knowledge needed? It's > >> all done through the command word, just that writing zero isn't > >> really appropriate. > > > > So presumably if you disable bus mastering you've effectively disabled > > DMA but how do you disable interrupts via the command word? > > Interrupts can be disabled equally well in the IO-APIC and by > disabling MSI for all devices. Considering that it would be nice to have PCI device reset capabilities in the hypervisor anyway, in case a device is malfunctioning, and considering that we all know that PCI devices don't always follow PCI specs, wouldn't it be safer to just reset all the devices in this case? At least the reset code is well tested and known to work (or known to work with some particular devices, and for this reasons there are quirks). Also resetting devices would better cover the case when a device is misbehaving. Of course we still need a whitelist for serial consoles and maybe other devices. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash 2011-07-07 11:40 ` Stefano Stabellini @ 2011-07-07 13:03 ` Konrad Rzeszutek Wilk 0 siblings, 0 replies; 17+ messages in thread From: Konrad Rzeszutek Wilk @ 2011-07-07 13:03 UTC (permalink / raw) To: Stefano Stabellini Cc: Ian Campbell, Andrew Cooper, xen-devel@lists.xensource.com, Jan Beulich > quirks). Also resetting devices would better cover the case when a > device is misbehaving. > Of course we still need a whitelist for serial consoles and maybe other > devices. It sounds like I should post some of the patches I've that deal with various serial console now. They actually save the BDF of the PCI serial device so that the "whitelist" data can be easily extracted. ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2011-07-07 13:03 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-07-06 12:39 [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash Andrew Cooper 2011-07-06 12:43 ` Jan Beulich 2011-07-06 12:46 ` Andrew Cooper 2011-07-06 18:42 ` Konrad Rzeszutek Wilk 2011-07-07 8:41 ` George Dunlap 2011-07-07 8:53 ` Ian Campbell 2011-07-07 9:10 ` Jan Beulich 2011-07-07 9:12 ` Ian Campbell 2011-07-07 9:42 ` Jan Beulich 2011-07-07 9:53 ` Ian Campbell 2011-07-07 10:02 ` Andrew Cooper 2011-07-07 10:06 ` Tim Deegan 2011-07-07 10:07 ` Ian Campbell 2011-07-07 11:25 ` Jan Beulich 2011-07-07 11:21 ` Jan Beulich 2011-07-07 11:40 ` Stefano Stabellini 2011-07-07 13:03 ` Konrad Rzeszutek Wilk
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.