All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
@ 2011-07-06 12:39 Andrew Cooper
  2011-07-06 12:43 ` Jan Beulich
  2011-07-06 18:42 ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 17+ messages in thread
From: Andrew Cooper @ 2011-07-06 12:39 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

In the case of a crash, IOMMU DMA remapping gets turned off so that
the kdump kernel may boot.  However, this is warned as being dangerous
in the VTD specification if a DMA transaction is in progress.

Also, in the case of a crash, DMA transactions and interrupts from
peripheral devices such as network cards are likely to keep coming in.
Without DMA remapping enabled, the transactions will be writing over
low memory, corrupting the crash state, and perhaps even the kdump
reserved memory.

Therefore, on the crash path, we can disconnect all PCI devices from
their respective buses so that they are no longer able to be DMA
busmasters.  This reduces the risk of DMA transactions corrupting
state (and will also reduce spurious interrupts arriving to the kdump
kernel) until the kdump kernel and properly reset the PCI devices.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/arch/x86/crash.c
--- a/xen/arch/x86/crash.c	Mon Jun 27 17:37:12 2011 +0100
+++ b/xen/arch/x86/crash.c	Wed Jul 06 13:37:44 2011 +0100
@@ -28,6 +28,7 @@
 #include <asm/apic.h>
 #include <asm/io_apic.h>
 #include <xen/iommu.h>
+#include <xen/pci.h>
 
 static atomic_t waiting_for_crash_ipi;
 static unsigned int crashing_cpu;
@@ -78,6 +79,8 @@ static void nmi_shootdown_cpus(void)
         msecs--;
     }
 
+    disconnect_pci_devices();
+
     /* Crash shutdown any IOMMU functionality as the crashdump kernel is not
      * happy when booting if interrupt/dma remapping is still enabled */
     iommu_crash_shutdown();
diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/drivers/passthrough/pci.c
--- a/xen/drivers/passthrough/pci.c	Mon Jun 27 17:37:12 2011 +0100
+++ b/xen/drivers/passthrough/pci.c	Wed Jul 06 13:37:44 2011 +0100
@@ -462,6 +462,32 @@ int __init scan_pci_devices(void)
     return 0;
 }
 
+/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
+ *     "When a 0 is written to [the COMMAND] register, the device is
+ *     logically disconnected from the PCI bus for all accesses except
+ *     configuration accesses. All devices are required to support
+ *     this base level of functionality."
+ */
+void disconnect_pci_device(struct pci_dev *pdev)
+{
+    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
+                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
+}
+
+/* Diconnect all PCI devices from the PCI buses.
+ */
+void disconnect_pci_devices(void)
+{
+    struct pci_dev *pdev;
+
+    spin_lock(&pcidevs_lock);
+
+    list_for_each_entry ( pdev, &alldevs_list, alldevs_list )
+        disconnect_pci_device(pdev);
+
+    spin_unlock(&pcidevs_lock);
+}
+
 #ifdef SUPPORT_MSI_REMAPPING
 static void dump_pci_devices(unsigned char ch)
 {
diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/include/xen/pci.h
--- a/xen/include/xen/pci.h	Mon Jun 27 17:37:12 2011 +0100
+++ b/xen/include/xen/pci.h	Wed Jul 06 13:37:44 2011 +0100
@@ -92,6 +92,9 @@ int pci_add_device_ext(u8 bus, u8 devfn,
 struct pci_dev *pci_get_pdev(int bus, int devfn);
 struct pci_dev *pci_get_pdev_by_domain(struct domain *d, int bus, int devfn);
 
+void disconnect_pci_device(struct pci_dev * pdev);
+void disconnect_pci_devices(void);
+
 uint8_t pci_conf_read8(
     unsigned int bus, unsigned int dev, unsigned int func, unsigned int reg);
 uint16_t pci_conf_read16(

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
  2011-07-06 12:39 [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash Andrew Cooper
@ 2011-07-06 12:43 ` Jan Beulich
  2011-07-06 12:46   ` Andrew Cooper
  2011-07-06 18:42 ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2011-07-06 12:43 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 06.07.11 at 14:39, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> In the case of a crash, IOMMU DMA remapping gets turned off so that
> the kdump kernel may boot.  However, this is warned as being dangerous
> in the VTD specification if a DMA transaction is in progress.
> 
> Also, in the case of a crash, DMA transactions and interrupts from
> peripheral devices such as network cards are likely to keep coming in.
> Without DMA remapping enabled, the transactions will be writing over
> low memory, corrupting the crash state, and perhaps even the kdump
> reserved memory.
> 
> Therefore, on the crash path, we can disconnect all PCI devices from
> their respective buses so that they are no longer able to be DMA
> busmasters.  This reduces the risk of DMA transactions corrupting
> state (and will also reduce spurious interrupts arriving to the kdump
> kernel) until the kdump kernel and properly reset the PCI devices.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/arch/x86/crash.c
> --- a/xen/arch/x86/crash.c	Mon Jun 27 17:37:12 2011 +0100
> +++ b/xen/arch/x86/crash.c	Wed Jul 06 13:37:44 2011 +0100
> @@ -28,6 +28,7 @@
>  #include <asm/apic.h>
>  #include <asm/io_apic.h>
>  #include <xen/iommu.h>
> +#include <xen/pci.h>
>  
>  static atomic_t waiting_for_crash_ipi;
>  static unsigned int crashing_cpu;
> @@ -78,6 +79,8 @@ static void nmi_shootdown_cpus(void)
>          msecs--;
>      }
>  
> +    disconnect_pci_devices();
> +
>      /* Crash shutdown any IOMMU functionality as the crashdump kernel is 
> not
>       * happy when booting if interrupt/dma remapping is still enabled */
>      iommu_crash_shutdown();
> diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/drivers/passthrough/pci.c
> --- a/xen/drivers/passthrough/pci.c	Mon Jun 27 17:37:12 2011 +0100
> +++ b/xen/drivers/passthrough/pci.c	Wed Jul 06 13:37:44 2011 +0100
> @@ -462,6 +462,32 @@ int __init scan_pci_devices(void)
>      return 0;
>  }
>  
> +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
> + *     "When a 0 is written to [the COMMAND] register, the device is
> + *     logically disconnected from the PCI bus for all accesses except
> + *     configuration accesses. All devices are required to support
> + *     this base level of functionality."
> + */
> +void disconnect_pci_device(struct pci_dev *pdev)

Any reason this cannot be static? Or even be integrated into the
single caller?

Jan

> +{
> +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
> +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
> +}
> +
> +/* Diconnect all PCI devices from the PCI buses.
> + */
> +void disconnect_pci_devices(void)
> +{
> +    struct pci_dev *pdev;
> +
> +    spin_lock(&pcidevs_lock);
> +
> +    list_for_each_entry ( pdev, &alldevs_list, alldevs_list )
> +        disconnect_pci_device(pdev);
> +
> +    spin_unlock(&pcidevs_lock);
> +}
> +
>  #ifdef SUPPORT_MSI_REMAPPING
>  static void dump_pci_devices(unsigned char ch)
>  {
> diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/include/xen/pci.h
> --- a/xen/include/xen/pci.h	Mon Jun 27 17:37:12 2011 +0100
> +++ b/xen/include/xen/pci.h	Wed Jul 06 13:37:44 2011 +0100
> @@ -92,6 +92,9 @@ int pci_add_device_ext(u8 bus, u8 devfn,
>  struct pci_dev *pci_get_pdev(int bus, int devfn);
>  struct pci_dev *pci_get_pdev_by_domain(struct domain *d, int bus, int 
> devfn);
>  
> +void disconnect_pci_device(struct pci_dev * pdev);
> +void disconnect_pci_devices(void);
> +
>  uint8_t pci_conf_read8(
>      unsigned int bus, unsigned int dev, unsigned int func, unsigned int 
> reg);
>  uint16_t pci_conf_read16(
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com 
> http://lists.xensource.com/xen-devel 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from  the PCI bus on crash
  2011-07-06 12:43 ` Jan Beulich
@ 2011-07-06 12:46   ` Andrew Cooper
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Cooper @ 2011-07-06 12:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xensource.com



On 06/07/11 13:43, Jan Beulich wrote:
>>>> On 06.07.11 at 14:39, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> In the case of a crash, IOMMU DMA remapping gets turned off so that
>> the kdump kernel may boot.  However, this is warned as being dangerous
>> in the VTD specification if a DMA transaction is in progress.
>>
>> Also, in the case of a crash, DMA transactions and interrupts from
>> peripheral devices such as network cards are likely to keep coming in.
>> Without DMA remapping enabled, the transactions will be writing over
>> low memory, corrupting the crash state, and perhaps even the kdump
>> reserved memory.
>>
>> Therefore, on the crash path, we can disconnect all PCI devices from
>> their respective buses so that they are no longer able to be DMA
>> busmasters.  This reduces the risk of DMA transactions corrupting
>> state (and will also reduce spurious interrupts arriving to the kdump
>> kernel) until the kdump kernel and properly reset the PCI devices.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/arch/x86/crash.c
>> --- a/xen/arch/x86/crash.c	Mon Jun 27 17:37:12 2011 +0100
>> +++ b/xen/arch/x86/crash.c	Wed Jul 06 13:37:44 2011 +0100
>> @@ -28,6 +28,7 @@
>>  #include <asm/apic.h>
>>  #include <asm/io_apic.h>
>>  #include <xen/iommu.h>
>> +#include <xen/pci.h>
>>  
>>  static atomic_t waiting_for_crash_ipi;
>>  static unsigned int crashing_cpu;
>> @@ -78,6 +79,8 @@ static void nmi_shootdown_cpus(void)
>>          msecs--;
>>      }
>>  
>> +    disconnect_pci_devices();
>> +
>>      /* Crash shutdown any IOMMU functionality as the crashdump kernel is 
>> not
>>       * happy when booting if interrupt/dma remapping is still enabled */
>>      iommu_crash_shutdown();
>> diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/drivers/passthrough/pci.c
>> --- a/xen/drivers/passthrough/pci.c	Mon Jun 27 17:37:12 2011 +0100
>> +++ b/xen/drivers/passthrough/pci.c	Wed Jul 06 13:37:44 2011 +0100
>> @@ -462,6 +462,32 @@ int __init scan_pci_devices(void)
>>      return 0;
>>  }
>>  
>> +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
>> + *     "When a 0 is written to [the COMMAND] register, the device is
>> + *     logically disconnected from the PCI bus for all accesses except
>> + *     configuration accesses. All devices are required to support
>> + *     this base level of functionality."
>> + */
>> +void disconnect_pci_device(struct pci_dev *pdev)
> Any reason this cannot be static? Or even be integrated into the
> single caller?
>
> Jan

Not specifically - I can fold them together if you think that would be
better

~Andrew

>> +{
>> +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
>> +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
>> +}
>> +
>> +/* Diconnect all PCI devices from the PCI buses.
>> + */
>> +void disconnect_pci_devices(void)
>> +{
>> +    struct pci_dev *pdev;
>> +
>> +    spin_lock(&pcidevs_lock);
>> +
>> +    list_for_each_entry ( pdev, &alldevs_list, alldevs_list )
>> +        disconnect_pci_device(pdev);
>> +
>> +    spin_unlock(&pcidevs_lock);
>> +}
>> +
>>  #ifdef SUPPORT_MSI_REMAPPING
>>  static void dump_pci_devices(unsigned char ch)
>>  {
>> diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/include/xen/pci.h
>> --- a/xen/include/xen/pci.h	Mon Jun 27 17:37:12 2011 +0100
>> +++ b/xen/include/xen/pci.h	Wed Jul 06 13:37:44 2011 +0100
>> @@ -92,6 +92,9 @@ int pci_add_device_ext(u8 bus, u8 devfn,
>>  struct pci_dev *pci_get_pdev(int bus, int devfn);
>>  struct pci_dev *pci_get_pdev_by_domain(struct domain *d, int bus, int 
>> devfn);
>>  
>> +void disconnect_pci_device(struct pci_dev * pdev);
>> +void disconnect_pci_devices(void);
>> +
>>  uint8_t pci_conf_read8(
>>      unsigned int bus, unsigned int dev, unsigned int func, unsigned int 
>> reg);
>>  uint16_t pci_conf_read16(
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com 
>> http://lists.xensource.com/xen-devel 
>
>

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
  2011-07-06 12:39 [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash Andrew Cooper
  2011-07-06 12:43 ` Jan Beulich
@ 2011-07-06 18:42 ` Konrad Rzeszutek Wilk
  2011-07-07  8:41   ` George Dunlap
  2011-07-07  8:53   ` Ian Campbell
  1 sibling, 2 replies; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-07-06 18:42 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
> In the case of a crash, IOMMU DMA remapping gets turned off so that
> the kdump kernel may boot.  However, this is warned as being dangerous
> in the VTD specification if a DMA transaction is in progress.
> 
> Also, in the case of a crash, DMA transactions and interrupts from
> peripheral devices such as network cards are likely to keep coming in.
> Without DMA remapping enabled, the transactions will be writing over
> low memory, corrupting the crash state, and perhaps even the kdump
> reserved memory.
> 
> Therefore, on the crash path, we can disconnect all PCI devices from
> their respective buses so that they are no longer able to be DMA
> busmasters.  This reduces the risk of DMA transactions corrupting
> state (and will also reduce spurious interrupts arriving to the kdump
> kernel) until the kdump kernel and properly reset the PCI devices.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/arch/x86/crash.c
> --- a/xen/arch/x86/crash.c	Mon Jun 27 17:37:12 2011 +0100
> +++ b/xen/arch/x86/crash.c	Wed Jul 06 13:37:44 2011 +0100
> @@ -28,6 +28,7 @@
>  #include <asm/apic.h>
>  #include <asm/io_apic.h>
>  #include <xen/iommu.h>
> +#include <xen/pci.h>
>  
>  static atomic_t waiting_for_crash_ipi;
>  static unsigned int crashing_cpu;
> @@ -78,6 +79,8 @@ static void nmi_shootdown_cpus(void)
>          msecs--;
>      }
>  
> +    disconnect_pci_devices();
> +
>      /* Crash shutdown any IOMMU functionality as the crashdump kernel is not
>       * happy when booting if interrupt/dma remapping is still enabled */
>      iommu_crash_shutdown();
> diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/drivers/passthrough/pci.c
> --- a/xen/drivers/passthrough/pci.c	Mon Jun 27 17:37:12 2011 +0100
> +++ b/xen/drivers/passthrough/pci.c	Wed Jul 06 13:37:44 2011 +0100
> @@ -462,6 +462,32 @@ int __init scan_pci_devices(void)
>      return 0;
>  }
>  
> +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
> + *     "When a 0 is written to [the COMMAND] register, the device is
> + *     logically disconnected from the PCI bus for all accesses except
> + *     configuration accesses. All devices are required to support
> + *     this base level of functionality."
> + */
> +void disconnect_pci_device(struct pci_dev *pdev)
> +{
> +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
> +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);

So if you have a PCI serial card (or Intel AMT) and you are using that for
serial output on the hypervisor line, this will turn it off. There should
be some whitelist capability to not do it for PCI serial devices that are
owned (used) by the hypervisor.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
  2011-07-06 18:42 ` Konrad Rzeszutek Wilk
@ 2011-07-07  8:41   ` George Dunlap
  2011-07-07  8:53   ` Ian Campbell
  1 sibling, 0 replies; 17+ messages in thread
From: George Dunlap @ 2011-07-07  8:41 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Andrew Cooper, xen-devel

On Wed, Jul 6, 2011 at 7:42 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> So if you have a PCI serial card (or Intel AMT) and you are using that for
> serial output on the hypervisor line, this will turn it off. There should
> be some whitelist capability to not do it for PCI serial devices that are
> owned (used) by the hypervisor.

Good catch, Konrad.  Andrew, I have a box with such a configuration,
if you want to test it.

 -George

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
  2011-07-06 18:42 ` Konrad Rzeszutek Wilk
  2011-07-07  8:41   ` George Dunlap
@ 2011-07-07  8:53   ` Ian Campbell
  2011-07-07  9:10     ` Jan Beulich
  1 sibling, 1 reply; 17+ messages in thread
From: Ian Campbell @ 2011-07-07  8:53 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Andrew Cooper, xen-devel@lists.xensource.com

On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
> > In the case of a crash, IOMMU DMA remapping gets turned off so that
> > the kdump kernel may boot.  However, this is warned as being dangerous
> > in the VTD specification if a DMA transaction is in progress.
> > 
> > Also, in the case of a crash, DMA transactions and interrupts from
> > peripheral devices such as network cards are likely to keep coming in.
> > Without DMA remapping enabled, the transactions will be writing over
> > low memory, corrupting the crash state, and perhaps even the kdump
> > reserved memory.
> > 
> > Therefore, on the crash path, we can disconnect all PCI devices from
> > their respective buses so that they are no longer able to be DMA
> > busmasters.  This reduces the risk of DMA transactions corrupting
> > state (and will also reduce spurious interrupts arriving to the kdump
> > kernel) until the kdump kernel and properly reset the PCI devices.
> > 
> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> > 
> > diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/arch/x86/crash.c
> > --- a/xen/arch/x86/crash.c	Mon Jun 27 17:37:12 2011 +0100
> > +++ b/xen/arch/x86/crash.c	Wed Jul 06 13:37:44 2011 +0100
> > @@ -28,6 +28,7 @@
> >  #include <asm/apic.h>
> >  #include <asm/io_apic.h>
> >  #include <xen/iommu.h>
> > +#include <xen/pci.h>
> >  
> >  static atomic_t waiting_for_crash_ipi;
> >  static unsigned int crashing_cpu;
> > @@ -78,6 +79,8 @@ static void nmi_shootdown_cpus(void)
> >          msecs--;
> >      }
> >  
> > +    disconnect_pci_devices();
> > +
> >      /* Crash shutdown any IOMMU functionality as the crashdump kernel is not
> >       * happy when booting if interrupt/dma remapping is still enabled */
> >      iommu_crash_shutdown();
> > diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/drivers/passthrough/pci.c
> > --- a/xen/drivers/passthrough/pci.c	Mon Jun 27 17:37:12 2011 +0100
> > +++ b/xen/drivers/passthrough/pci.c	Wed Jul 06 13:37:44 2011 +0100
> > @@ -462,6 +462,32 @@ int __init scan_pci_devices(void)
> >      return 0;
> >  }
> >  
> > +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
> > + *     "When a 0 is written to [the COMMAND] register, the device is
> > + *     logically disconnected from the PCI bus for all accesses except
> > + *     configuration accesses. All devices are required to support
> > + *     this base level of functionality."
> > + */
> > +void disconnect_pci_device(struct pci_dev *pdev)
> > +{
> > +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
> > +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
> 
> So if you have a PCI serial card (or Intel AMT) and you are using that for
> serial output on the hypervisor line, this will turn it off. There should
> be some whitelist capability to not do it for PCI serial devices that are
> owned (used) by the hypervisor.

That would be useful for debugging the kexec process itself but in the
general case there won't be any further output from the hypervisor and
if the kexec'd kernel wants to use the device it is going to have to set
it up again anyways.

On the other hand if the hypervisor is driving a device it presumably
knows (or could know) how to turn of interrupts and switch to polled
mode so we might as well do that?

Ian.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
  2011-07-07  8:53   ` Ian Campbell
@ 2011-07-07  9:10     ` Jan Beulich
  2011-07-07  9:12       ` Ian Campbell
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2011-07-07  9:10 UTC (permalink / raw)
  To: Andrew Cooper, Ian Campbell
  Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk

>>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
>> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
>> > +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
>> > + *     "When a 0 is written to [the COMMAND] register, the device is
>> > + *     logically disconnected from the PCI bus for all accesses except
>> > + *     configuration accesses. All devices are required to support
>> > + *     this base level of functionality."
>> > + */
>> > +void disconnect_pci_device(struct pci_dev *pdev)
>> > +{
>> > +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
>> > +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
>> 
>> So if you have a PCI serial card (or Intel AMT) and you are using that for
>> serial output on the hypervisor line, this will turn it off. There should
>> be some whitelist capability to not do it for PCI serial devices that are
>> owned (used) by the hypervisor.
> 
> That would be useful for debugging the kexec process itself but in the
> general case there won't be any further output from the hypervisor and
> if the kexec'd kernel wants to use the device it is going to have to set
> it up again anyways.

No, not generally. Just look at Linux' early-printk code: The device
is assumed to be enabled (by the BIOS), as the PCI subsystem can't
possibly be initialized at this point already.

This also means that white-listing just devices Xen uses may not be
enough: If Xen doesn't use a serial console (or the secondary kernel
wants to use some other device Xen doesn't care about - VGA or
other kind of console devices come to mind), it must not find it fully
disconnected from the bus. Consequently I would think that while
interrupt and DMA activity should be forced off, decoding I/O and
memory addresses by the devices shouldn't be.

Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from  the PCI bus on crash
  2011-07-07  9:10     ` Jan Beulich
@ 2011-07-07  9:12       ` Ian Campbell
  2011-07-07  9:42         ` Jan Beulich
  0 siblings, 1 reply; 17+ messages in thread
From: Ian Campbell @ 2011-07-07  9:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, xen-devel@lists.xensource.com,
	Konrad Rzeszutek Wilk

On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote:
> >>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
> >> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
> >> > +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
> >> > + *     "When a 0 is written to [the COMMAND] register, the device is
> >> > + *     logically disconnected from the PCI bus for all accesses except
> >> > + *     configuration accesses. All devices are required to support
> >> > + *     this base level of functionality."
> >> > + */
> >> > +void disconnect_pci_device(struct pci_dev *pdev)
> >> > +{
> >> > +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
> >> > +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
> >> 
> >> So if you have a PCI serial card (or Intel AMT) and you are using that for
> >> serial output on the hypervisor line, this will turn it off. There should
> >> be some whitelist capability to not do it for PCI serial devices that are
> >> owned (used) by the hypervisor.
> > 
> > That would be useful for debugging the kexec process itself but in the
> > general case there won't be any further output from the hypervisor and
> > if the kexec'd kernel wants to use the device it is going to have to set
> > it up again anyways.
> 
> No, not generally. Just look at Linux' early-printk code: The device
> is assumed to be enabled (by the BIOS), as the PCI subsystem can't
> possibly be initialized at this point already.

That's arguably a debugging facility as well though.

> This also means that white-listing just devices Xen uses may not be
> enough: If Xen doesn't use a serial console (or the secondary kernel
> wants to use some other device Xen doesn't care about - VGA or
> other kind of console devices come to mind), it must not find it fully
> disconnected from the bus. Consequently I would think that while
> interrupt and DMA activity should be forced off, decoding I/O and
> memory addresses by the devices shouldn't be.

The problem is that this can't be done without device specific
knowledge, which the hypervisor generally doesn't have and we can't get
the device's owning domain to do anything because we are crashing.

Ian.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
  2011-07-07  9:12       ` Ian Campbell
@ 2011-07-07  9:42         ` Jan Beulich
  2011-07-07  9:53           ` Ian Campbell
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2011-07-07  9:42 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Andrew Cooper, xen-devel@lists.xensource.com,
	Konrad Rzeszutek Wilk

>>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote:
>> >>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> > On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
>> >> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
>> >> > +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
>> >> > + *     "When a 0 is written to [the COMMAND] register, the device is
>> >> > + *     logically disconnected from the PCI bus for all accesses except
>> >> > + *     configuration accesses. All devices are required to support
>> >> > + *     this base level of functionality."
>> >> > + */
>> >> > +void disconnect_pci_device(struct pci_dev *pdev)
>> >> > +{
>> >> > +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
>> >> > +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
>> >> 
>> >> So if you have a PCI serial card (or Intel AMT) and you are using that for
>> >> serial output on the hypervisor line, this will turn it off. There should
>> >> be some whitelist capability to not do it for PCI serial devices that are
>> >> owned (used) by the hypervisor.
>> > 
>> > That would be useful for debugging the kexec process itself but in the
>> > general case there won't be any further output from the hypervisor and
>> > if the kexec'd kernel wants to use the device it is going to have to set
>> > it up again anyways.
>> 
>> No, not generally. Just look at Linux' early-printk code: The device
>> is assumed to be enabled (by the BIOS), as the PCI subsystem can't
>> possibly be initialized at this point already.
> 
> That's arguably a debugging facility as well though.
> 
>> This also means that white-listing just devices Xen uses may not be
>> enough: If Xen doesn't use a serial console (or the secondary kernel
>> wants to use some other device Xen doesn't care about - VGA or
>> other kind of console devices come to mind), it must not find it fully
>> disconnected from the bus. Consequently I would think that while
>> interrupt and DMA activity should be forced off, decoding I/O and
>> memory addresses by the devices shouldn't be.
> 
> The problem is that this can't be done without device specific
> knowledge, which the hypervisor generally doesn't have and we can't get
> the device's owning domain to do anything because we are crashing.

Why would there be any device specific knowledge needed? It's
all done through the command word, just that writing zero isn't
really appropriate.

Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from  the PCI bus on crash
  2011-07-07  9:42         ` Jan Beulich
@ 2011-07-07  9:53           ` Ian Campbell
  2011-07-07 10:02             ` Andrew Cooper
  2011-07-07 11:21             ` Jan Beulich
  0 siblings, 2 replies; 17+ messages in thread
From: Ian Campbell @ 2011-07-07  9:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, xen-devel@lists.xensource.com,
	Konrad Rzeszutek Wilk

On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote:
> >>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> > On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote:
> >> >>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >> > On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
> >> >> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
> >> >> > +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
> >> >> > + *     "When a 0 is written to [the COMMAND] register, the device is
> >> >> > + *     logically disconnected from the PCI bus for all accesses except
> >> >> > + *     configuration accesses. All devices are required to support
> >> >> > + *     this base level of functionality."
> >> >> > + */
> >> >> > +void disconnect_pci_device(struct pci_dev *pdev)
> >> >> > +{
> >> >> > +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
> >> >> > +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
> >> >> 
> >> >> So if you have a PCI serial card (or Intel AMT) and you are using that for
> >> >> serial output on the hypervisor line, this will turn it off. There should
> >> >> be some whitelist capability to not do it for PCI serial devices that are
> >> >> owned (used) by the hypervisor.
> >> > 
> >> > That would be useful for debugging the kexec process itself but in the
> >> > general case there won't be any further output from the hypervisor and
> >> > if the kexec'd kernel wants to use the device it is going to have to set
> >> > it up again anyways.
> >> 
> >> No, not generally. Just look at Linux' early-printk code: The device
> >> is assumed to be enabled (by the BIOS), as the PCI subsystem can't
> >> possibly be initialized at this point already.
> > 
> > That's arguably a debugging facility as well though.
> > 
> >> This also means that white-listing just devices Xen uses may not be
> >> enough: If Xen doesn't use a serial console (or the secondary kernel
> >> wants to use some other device Xen doesn't care about - VGA or
> >> other kind of console devices come to mind), it must not find it fully
> >> disconnected from the bus. Consequently I would think that while
> >> interrupt and DMA activity should be forced off, decoding I/O and
> >> memory addresses by the devices shouldn't be.
> > 
> > The problem is that this can't be done without device specific
> > knowledge, which the hypervisor generally doesn't have and we can't get
> > the device's owning domain to do anything because we are crashing.
> 
> Why would there be any device specific knowledge needed? It's
> all done through the command word, just that writing zero isn't
> really appropriate.

So presumably if you disable bus mastering you've effectively disabled
DMA but how do you disable interrupts via the command word?

Ian.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from  the PCI bus on crash
  2011-07-07  9:53           ` Ian Campbell
@ 2011-07-07 10:02             ` Andrew Cooper
  2011-07-07 10:06               ` Tim Deegan
                                 ` (2 more replies)
  2011-07-07 11:21             ` Jan Beulich
  1 sibling, 3 replies; 17+ messages in thread
From: Andrew Cooper @ 2011-07-07 10:02 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel@lists.xensource.com, Jan Beulich, Konrad Rzeszutek Wilk



On 07/07/11 10:53, Ian Campbell wrote:
> On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote:
>>>>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
>>> On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote:
>>>>>>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>>>>> On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
>>>>>> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
>>>>>>> +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
>>>>>>> + *     "When a 0 is written to [the COMMAND] register, the device is
>>>>>>> + *     logically disconnected from the PCI bus for all accesses except
>>>>>>> + *     configuration accesses. All devices are required to support
>>>>>>> + *     this base level of functionality."
>>>>>>> + */
>>>>>>> +void disconnect_pci_device(struct pci_dev *pdev)
>>>>>>> +{
>>>>>>> +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
>>>>>>> +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
>>>>>> So if you have a PCI serial card (or Intel AMT) and you are using that for
>>>>>> serial output on the hypervisor line, this will turn it off. There should
>>>>>> be some whitelist capability to not do it for PCI serial devices that are
>>>>>> owned (used) by the hypervisor.
>>>>> That would be useful for debugging the kexec process itself but in the
>>>>> general case there won't be any further output from the hypervisor and
>>>>> if the kexec'd kernel wants to use the device it is going to have to set
>>>>> it up again anyways.
>>>> No, not generally. Just look at Linux' early-printk code: The device
>>>> is assumed to be enabled (by the BIOS), as the PCI subsystem can't
>>>> possibly be initialized at this point already.
>>> That's arguably a debugging facility as well though.
>>>
>>>> This also means that white-listing just devices Xen uses may not be
>>>> enough: If Xen doesn't use a serial console (or the secondary kernel
>>>> wants to use some other device Xen doesn't care about - VGA or
>>>> other kind of console devices come to mind), it must not find it fully
>>>> disconnected from the bus. Consequently I would think that while
>>>> interrupt and DMA activity should be forced off, decoding I/O and
>>>> memory addresses by the devices shouldn't be.
>>> The problem is that this can't be done without device specific
>>> knowledge, which the hypervisor generally doesn't have and we can't get
>>> the device's owning domain to do anything because we are crashing.
>> Why would there be any device specific knowledge needed? It's
>> all done through the command word, just that writing zero isn't
>> really appropriate.
> So presumably if you disable bus mastering you've effectively disabled
> DMA but how do you disable interrupts via the command word?
>
> Ian.
>
Bit 10 of the control word is "disable assertion of INTx# pins"  (so set
bit to 1 to disable interrupts).  That should cover legacy interrupts. 
For MSI and above, disabling DMA should prevent the bus writing to the
magic local APIC addresses.

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from  the PCI bus on crash
  2011-07-07 10:02             ` Andrew Cooper
@ 2011-07-07 10:06               ` Tim Deegan
  2011-07-07 10:07               ` Ian Campbell
  2011-07-07 11:25               ` Jan Beulich
  2 siblings, 0 replies; 17+ messages in thread
From: Tim Deegan @ 2011-07-07 10:06 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Ian Campbell, Rzeszutek Wilk, xen-devel@lists.xensource.com,
	Jan Beulich, Konrad

At 11:02 +0100 on 07 Jul (1310036565), Andrew Cooper wrote:
> 
> 
> On 07/07/11 10:53, Ian Campbell wrote:
> > On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote:
> >>>>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> >>> On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote:
> >>>>>>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >>>>> On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
> >>>>>> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
> >>>>>>> +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
> >>>>>>> + *     "When a 0 is written to [the COMMAND] register, the device is
> >>>>>>> + *     logically disconnected from the PCI bus for all accesses except
> >>>>>>> + *     configuration accesses. All devices are required to support
> >>>>>>> + *     this base level of functionality."
> >>>>>>> + */
> >>>>>>> +void disconnect_pci_device(struct pci_dev *pdev)
> >>>>>>> +{
> >>>>>>> +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
> >>>>>>> +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
> >>>>>> So if you have a PCI serial card (or Intel AMT) and you are using that for
> >>>>>> serial output on the hypervisor line, this will turn it off. There should
> >>>>>> be some whitelist capability to not do it for PCI serial devices that are
> >>>>>> owned (used) by the hypervisor.
> >>>>> That would be useful for debugging the kexec process itself but in the
> >>>>> general case there won't be any further output from the hypervisor and
> >>>>> if the kexec'd kernel wants to use the device it is going to have to set
> >>>>> it up again anyways.
> >>>> No, not generally. Just look at Linux' early-printk code: The device
> >>>> is assumed to be enabled (by the BIOS), as the PCI subsystem can't
> >>>> possibly be initialized at this point already.
> >>> That's arguably a debugging facility as well though.
> >>>
> >>>> This also means that white-listing just devices Xen uses may not be
> >>>> enough: If Xen doesn't use a serial console (or the secondary kernel
> >>>> wants to use some other device Xen doesn't care about - VGA or
> >>>> other kind of console devices come to mind), it must not find it fully
> >>>> disconnected from the bus. Consequently I would think that while
> >>>> interrupt and DMA activity should be forced off, decoding I/O and
> >>>> memory addresses by the devices shouldn't be.
> >>> The problem is that this can't be done without device specific
> >>> knowledge, which the hypervisor generally doesn't have and we can't get
> >>> the device's owning domain to do anything because we are crashing.
> >> Why would there be any device specific knowledge needed? It's
> >> all done through the command word, just that writing zero isn't
> >> really appropriate.
> > So presumably if you disable bus mastering you've effectively disabled
> > DMA but how do you disable interrupts via the command word?
> >
> > Ian.
> >
> Bit 10 of the control word is "disable assertion of INTx# pins"  (so set
> bit to 1 to disable interrupts).  That should cover legacy interrupts. 

In older specs that's a reserved bit; I suspect on real PCI devices
it will have no effect. 

> For MSI and above, disabling DMA should prevent the bus writing to the
> magic local APIC addresses.

Are MSI cycles always the same as DMA cycles?  I though that MSI-X
changed that (but might be quite wrong). 

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from  the PCI bus on crash
  2011-07-07 10:02             ` Andrew Cooper
  2011-07-07 10:06               ` Tim Deegan
@ 2011-07-07 10:07               ` Ian Campbell
  2011-07-07 11:25               ` Jan Beulich
  2 siblings, 0 replies; 17+ messages in thread
From: Ian Campbell @ 2011-07-07 10:07 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: xen-devel@lists.xensource.com, Jan Beulich, Konrad Rzeszutek Wilk

On Thu, 2011-07-07 at 11:02 +0100, Andrew Cooper wrote:
> 
> On 07/07/11 10:53, Ian Campbell wrote:
> > On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote:
> >>>>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> >>> On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote:
> >>>>>>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >>>>> On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
> >>>>>> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
> >>>>>>> +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
> >>>>>>> + *     "When a 0 is written to [the COMMAND] register, the device is
> >>>>>>> + *     logically disconnected from the PCI bus for all accesses except
> >>>>>>> + *     configuration accesses. All devices are required to support
> >>>>>>> + *     this base level of functionality."
> >>>>>>> + */
> >>>>>>> +void disconnect_pci_device(struct pci_dev *pdev)
> >>>>>>> +{
> >>>>>>> +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
> >>>>>>> +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
> >>>>>> So if you have a PCI serial card (or Intel AMT) and you are using that for
> >>>>>> serial output on the hypervisor line, this will turn it off. There should
> >>>>>> be some whitelist capability to not do it for PCI serial devices that are
> >>>>>> owned (used) by the hypervisor.
> >>>>> That would be useful for debugging the kexec process itself but in the
> >>>>> general case there won't be any further output from the hypervisor and
> >>>>> if the kexec'd kernel wants to use the device it is going to have to set
> >>>>> it up again anyways.
> >>>> No, not generally. Just look at Linux' early-printk code: The device
> >>>> is assumed to be enabled (by the BIOS), as the PCI subsystem can't
> >>>> possibly be initialized at this point already.
> >>> That's arguably a debugging facility as well though.
> >>>
> >>>> This also means that white-listing just devices Xen uses may not be
> >>>> enough: If Xen doesn't use a serial console (or the secondary kernel
> >>>> wants to use some other device Xen doesn't care about - VGA or
> >>>> other kind of console devices come to mind), it must not find it fully
> >>>> disconnected from the bus. Consequently I would think that while
> >>>> interrupt and DMA activity should be forced off, decoding I/O and
> >>>> memory addresses by the devices shouldn't be.
> >>> The problem is that this can't be done without device specific
> >>> knowledge, which the hypervisor generally doesn't have and we can't get
> >>> the device's owning domain to do anything because we are crashing.
> >> Why would there be any device specific knowledge needed? It's
> >> all done through the command word, just that writing zero isn't
> >> really appropriate.
> > So presumably if you disable bus mastering you've effectively disabled
> > DMA but how do you disable interrupts via the command word?
> >
> > Ian.
> >
> Bit 10 of the control word is "disable assertion of INTx# pins"  (so set
> bit to 1 to disable interrupts).

My copy of the spec isn't new enough to cover that particular bit but
the Linux headers describe it as disabling INTx "emulation" so I wasn't
sure if it actually did anything in all cases, especially on older
hardware.

>   That should cover legacy interrupts. 
> For MSI and above, disabling DMA should prevent the bus writing to the
> magic local APIC addresses.
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
  2011-07-07  9:53           ` Ian Campbell
  2011-07-07 10:02             ` Andrew Cooper
@ 2011-07-07 11:21             ` Jan Beulich
  2011-07-07 11:40               ` Stefano Stabellini
  1 sibling, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2011-07-07 11:21 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Andrew Cooper, xen-devel@lists.xensource.com,
	Konrad Rzeszutek Wilk

>>> On 07.07.11 at 11:53, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote:
>> >>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
>> > On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote:
>> >> >>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> >> > On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
>> >> >> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
>> >> >> > +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
>> >> >> > + *     "When a 0 is written to [the COMMAND] register, the device is
>> >> >> > + *     logically disconnected from the PCI bus for all accesses except
>> >> >> > + *     configuration accesses. All devices are required to support
>> >> >> > + *     this base level of functionality."
>> >> >> > + */
>> >> >> > +void disconnect_pci_device(struct pci_dev *pdev)
>> >> >> > +{
>> >> >> > +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
>> >> >> > +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
>> >> >> 
>> >> >> So if you have a PCI serial card (or Intel AMT) and you are using that for
>> >> >> serial output on the hypervisor line, this will turn it off. There should
>> >> >> be some whitelist capability to not do it for PCI serial devices that are
>> >> >> owned (used) by the hypervisor.
>> >> > 
>> >> > That would be useful for debugging the kexec process itself but in the
>> >> > general case there won't be any further output from the hypervisor and
>> >> > if the kexec'd kernel wants to use the device it is going to have to set
>> >> > it up again anyways.
>> >> 
>> >> No, not generally. Just look at Linux' early-printk code: The device
>> >> is assumed to be enabled (by the BIOS), as the PCI subsystem can't
>> >> possibly be initialized at this point already.
>> > 
>> > That's arguably a debugging facility as well though.
>> > 
>> >> This also means that white-listing just devices Xen uses may not be
>> >> enough: If Xen doesn't use a serial console (or the secondary kernel
>> >> wants to use some other device Xen doesn't care about - VGA or
>> >> other kind of console devices come to mind), it must not find it fully
>> >> disconnected from the bus. Consequently I would think that while
>> >> interrupt and DMA activity should be forced off, decoding I/O and
>> >> memory addresses by the devices shouldn't be.
>> > 
>> > The problem is that this can't be done without device specific
>> > knowledge, which the hypervisor generally doesn't have and we can't get
>> > the device's owning domain to do anything because we are crashing.
>> 
>> Why would there be any device specific knowledge needed? It's
>> all done through the command word, just that writing zero isn't
>> really appropriate.
> 
> So presumably if you disable bus mastering you've effectively disabled
> DMA but how do you disable interrupts via the command word?

Interrupts can be disabled equally well in the IO-APIC and by
disabling MSI for all devices.

Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
  2011-07-07 10:02             ` Andrew Cooper
  2011-07-07 10:06               ` Tim Deegan
  2011-07-07 10:07               ` Ian Campbell
@ 2011-07-07 11:25               ` Jan Beulich
  2 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2011-07-07 11:25 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Ian Campbell, xen-devel@lists.xensource.com,
	Konrad Rzeszutek Wilk

>>> On 07.07.11 at 12:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> 
> On 07/07/11 10:53, Ian Campbell wrote:
>> On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote:
>>>>>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
>>>> On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote:
>>>>>>>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>>>>>> On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
>>>>>>> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
>>>>>>>> +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
>>>>>>>> + *     "When a 0 is written to [the COMMAND] register, the device is
>>>>>>>> + *     logically disconnected from the PCI bus for all accesses except
>>>>>>>> + *     configuration accesses. All devices are required to support
>>>>>>>> + *     this base level of functionality."
>>>>>>>> + */
>>>>>>>> +void disconnect_pci_device(struct pci_dev *pdev)
>>>>>>>> +{
>>>>>>>> +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
>>>>>>>> +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
>>>>>>> So if you have a PCI serial card (or Intel AMT) and you are using that for
>>>>>>> serial output on the hypervisor line, this will turn it off. There should
>>>>>>> be some whitelist capability to not do it for PCI serial devices that are
>>>>>>> owned (used) by the hypervisor.
>>>>>> That would be useful for debugging the kexec process itself but in the
>>>>>> general case there won't be any further output from the hypervisor and
>>>>>> if the kexec'd kernel wants to use the device it is going to have to set
>>>>>> it up again anyways.
>>>>> No, not generally. Just look at Linux' early-printk code: The device
>>>>> is assumed to be enabled (by the BIOS), as the PCI subsystem can't
>>>>> possibly be initialized at this point already.
>>>> That's arguably a debugging facility as well though.
>>>>
>>>>> This also means that white-listing just devices Xen uses may not be
>>>>> enough: If Xen doesn't use a serial console (or the secondary kernel
>>>>> wants to use some other device Xen doesn't care about - VGA or
>>>>> other kind of console devices come to mind), it must not find it fully
>>>>> disconnected from the bus. Consequently I would think that while
>>>>> interrupt and DMA activity should be forced off, decoding I/O and
>>>>> memory addresses by the devices shouldn't be.
>>>> The problem is that this can't be done without device specific
>>>> knowledge, which the hypervisor generally doesn't have and we can't get
>>>> the device's owning domain to do anything because we are crashing.
>>> Why would there be any device specific knowledge needed? It's
>>> all done through the command word, just that writing zero isn't
>>> really appropriate.
>> So presumably if you disable bus mastering you've effectively disabled
>> DMA but how do you disable interrupts via the command word?
>>
>> Ian.
>>
> Bit 10 of the control word is "disable assertion of INTx# pins"  (so set
> bit to 1 to disable interrupts).  That should cover legacy interrupts. 
> For MSI and above, disabling DMA should prevent the bus writing to the
> magic local APIC addresses.

No, I don't thing the bus mastering bit disables MSI accesses (and
even if officially it did, I wouldn't trust it). But disabling MSI itself
isn't device specific either, so should be an option here (with the
caveat that there are a few buggy devices that can't get back
into MSI mode if it was once disabled, so perhaps masking MSI
when possible should be preferred over disabling it).

Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
  2011-07-07 11:21             ` Jan Beulich
@ 2011-07-07 11:40               ` Stefano Stabellini
  2011-07-07 13:03                 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 17+ messages in thread
From: Stefano Stabellini @ 2011-07-07 11:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ian Campbell, Andrew Cooper, xen-devel@lists.xensource.com,
	Konrad Rzeszutek Wilk

On Thu, 7 Jul 2011, Jan Beulich wrote:
> >>> On 07.07.11 at 11:53, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> > On Thu, 2011-07-07 at 10:42 +0100, Jan Beulich wrote:
> >> >>> On 07.07.11 at 11:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> >> > On Thu, 2011-07-07 at 10:10 +0100, Jan Beulich wrote:
> >> >> >>> On 07.07.11 at 10:53, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >> >> > On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
> >> >> >> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
> >> >> >> > +/* Disconnect a PCI device from the PCI bus.  From the PCI spec:
> >> >> >> > + *     "When a 0 is written to [the COMMAND] register, the device is
> >> >> >> > + *     logically disconnected from the PCI bus for all accesses except
> >> >> >> > + *     configuration accesses. All devices are required to support
> >> >> >> > + *     this base level of functionality."
> >> >> >> > + */
> >> >> >> > +void disconnect_pci_device(struct pci_dev *pdev)
> >> >> >> > +{
> >> >> >> > +    pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
> >> >> >> > +                     PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
> >> >> >> 
> >> >> >> So if you have a PCI serial card (or Intel AMT) and you are using that for
> >> >> >> serial output on the hypervisor line, this will turn it off. There should
> >> >> >> be some whitelist capability to not do it for PCI serial devices that are
> >> >> >> owned (used) by the hypervisor.
> >> >> > 
> >> >> > That would be useful for debugging the kexec process itself but in the
> >> >> > general case there won't be any further output from the hypervisor and
> >> >> > if the kexec'd kernel wants to use the device it is going to have to set
> >> >> > it up again anyways.
> >> >> 
> >> >> No, not generally. Just look at Linux' early-printk code: The device
> >> >> is assumed to be enabled (by the BIOS), as the PCI subsystem can't
> >> >> possibly be initialized at this point already.
> >> > 
> >> > That's arguably a debugging facility as well though.
> >> > 
> >> >> This also means that white-listing just devices Xen uses may not be
> >> >> enough: If Xen doesn't use a serial console (or the secondary kernel
> >> >> wants to use some other device Xen doesn't care about - VGA or
> >> >> other kind of console devices come to mind), it must not find it fully
> >> >> disconnected from the bus. Consequently I would think that while
> >> >> interrupt and DMA activity should be forced off, decoding I/O and
> >> >> memory addresses by the devices shouldn't be.
> >> > 
> >> > The problem is that this can't be done without device specific
> >> > knowledge, which the hypervisor generally doesn't have and we can't get
> >> > the device's owning domain to do anything because we are crashing.
> >> 
> >> Why would there be any device specific knowledge needed? It's
> >> all done through the command word, just that writing zero isn't
> >> really appropriate.
> > 
> > So presumably if you disable bus mastering you've effectively disabled
> > DMA but how do you disable interrupts via the command word?
> 
> Interrupts can be disabled equally well in the IO-APIC and by
> disabling MSI for all devices.

Considering that it would be nice to have PCI device reset capabilities
in the hypervisor anyway, in case a device is malfunctioning, and
considering that we all know that PCI devices don't always follow PCI
specs, wouldn't it be safer to just reset all the devices in this case?
At least the reset code is well tested and known to work (or known to
work with some particular devices, and for this reasons there are
quirks). Also resetting devices would better cover the case when a
device is misbehaving.
Of course we still need a whitelist for serial consoles and maybe other
devices.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash
  2011-07-07 11:40               ` Stefano Stabellini
@ 2011-07-07 13:03                 ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-07-07 13:03 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Campbell, Andrew Cooper, xen-devel@lists.xensource.com,
	Jan Beulich

> quirks). Also resetting devices would better cover the case when a
> device is misbehaving.
> Of course we still need a whitelist for serial consoles and maybe other
> devices.

It sounds like I should post some of the patches I've that deal with
various serial console now. They actually save the BDF of the PCI serial
device so that the "whitelist" data can be easily extracted.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2011-07-07 13:03 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-07-06 12:39 [PATCH] KEXEC: disconnect all PCI devices from the PCI bus on crash Andrew Cooper
2011-07-06 12:43 ` Jan Beulich
2011-07-06 12:46   ` Andrew Cooper
2011-07-06 18:42 ` Konrad Rzeszutek Wilk
2011-07-07  8:41   ` George Dunlap
2011-07-07  8:53   ` Ian Campbell
2011-07-07  9:10     ` Jan Beulich
2011-07-07  9:12       ` Ian Campbell
2011-07-07  9:42         ` Jan Beulich
2011-07-07  9:53           ` Ian Campbell
2011-07-07 10:02             ` Andrew Cooper
2011-07-07 10:06               ` Tim Deegan
2011-07-07 10:07               ` Ian Campbell
2011-07-07 11:25               ` Jan Beulich
2011-07-07 11:21             ` Jan Beulich
2011-07-07 11:40               ` Stefano Stabellini
2011-07-07 13:03                 ` Konrad Rzeszutek Wilk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.