From: konrad wilk <konrad.wilk@oracle.com>
To: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Jan Beulich <JBeulich@suse.com>, xen-devel <xen-devel@lists.xen.org>
Subject: Re: linux-3.9-rc0 regression from 3.8 SATA controller not detected under xen
Date: Wed, 27 Feb 2013 17:22:18 -0500
Message-ID: <512E871A.20706@oracle.com>
In-Reply-To: <171656525.20130227214153@eikelenboom.it>


On 2/27/2013 3:41 PM, Sander Eikelenboom wrote:
> Wednesday, February 27, 2013, 8:28:10 PM, you wrote:
>
>> On Wed, Feb 27, 2013 at 06:50:59PM +0100, Sander Eikelenboom wrote:
>>> Wednesday, February 27, 2013, 1:54:31 PM, you wrote:
>>>
>>>>>>> On 27.02.13 at 12:46, Sander Eikelenboom <linux@eikelenboom.it> wrote:
>>>>>    [   89.338827] ahci: probe of 0000:00:11.0 failed with error -22
>>>> Which is -EINVAL. With nothing else printed, I'm afraid you need to
>>>> find the origin of this return value by instrumenting the involved
>>>> call tree.
>>> Just wondering, is multiple msi's per device actually supported by xen ?
>> That is a very good question. I know we support MSI-X b/c 1GB or 10GB NICs
>> use them and they work great with Xen.
>> BTW, this is the merge:
>> commit 5800700f66678ea5c85e7d62b138416070bf7f60
>> Merge: 266d7ad af8d102
>> Author: Linus Torvalds <torvalds@linux-foundation.org>
>> Date:   Tue Feb 19 19:07:27 2013 -0800
>>      Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>      
>>      Pull x86/apic changes from Ingo Molnar:
>>       "Main changes:
>>      
>>         - Multiple MSI support added to the APIC, PCI and AHCI code - acked
>>           by all relevant maintainers, by Alexander Gordeev.
>>      
>>           The advantage is that multiple AHCI ports can have multiple MSI
>>           irqs assigned, and can thus spread to multiple CPUs.
>>      
>>           [ Drivers can make use of this new facility via the
>>             pci_enable_msi_block_auto() method ]
>
>
>> With MSI per device, the hypercall that ends up happening is:
>> PHYSDEVOP_map_pirq with:
>>     map_irq.domid = domid;
>>     map_irq.type = MAP_PIRQ_TYPE_MSI_SEG;
>>     map_irq.index = -1;
>>     map_irq.pirq = -1;
>>     map_irq.bus = dev->bus->number |
>>                   (pci_domain_nr(dev->bus) << 16);
>>     map_irq.devfn = dev->devfn;
>> Which would imply that we are doing this call multiple times?
>> (This is xen_initdom_setup_msi_irqs).
>> It looks like pci_enable_msi_block_auto is the multiple MSI one
>> and it should percolate down to xen_initdom_setup_msi_irqs.
>> Granted, xen_initdom_setup_msi_irqs does not do anything with the
>> 'nvec' argument.
>> So could I ask you to try out your hunch by doing three things:
>>   1). Instrument xen_initdom_setup_msi_irqs to see if nvec is
>>       anything but 1, and instrument its loop to see if the
>>       device has more than one MSI attribute.
>>   2). The ahci driver has ahci_init_interrupts, which only does
>>       the multiple MSI thing if AHCI_HFLAG_NO_MSI is not set.
>>       Edit drivers/ata/ahci ahci_port_info for the SB600 (or 700?)
>>       to have the AHCI_HFLAG_NO_MSI flag (you probably want to do
>>       this separately from 1).
>>   3). Check out the tree before merge 5800700f66678ea5c85e7d62b138416070bf7f60
>>       and try 266d7ad7f4fe2f44b91561f5b812115c1b3018ab.
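
(For reference, a rough sketch of how the bus and devfn fields in the
map_pirq call quoted above get their values. The helper names
encode_seg_bus and encode_devfn are made up for illustration; only the
bit layout comes from the snippet above and from the usual PCI_DEVFN
convention of slot in bits 7:3 and function in bits 2:0.)

```c
#include <stdint.h>

/*
 * Hypothetical helper: pack PCI segment (domain) and bus number the way
 * the quoted snippet fills map_irq.bus, i.e. bus | (segment << 16).
 */
static uint32_t encode_seg_bus(uint16_t segment, uint8_t bus)
{
    return (uint32_t)bus | ((uint32_t)segment << 16);
}

/*
 * Hypothetical helper: same layout as the kernel's PCI_DEVFN() macro,
 * 5 bits of slot followed by 3 bits of function.
 */
static uint8_t encode_devfn(unsigned int slot, unsigned int func)
{
    return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}
```

For the failing device 0000:00:11.0 that gives bus field 0 and devfn
0x88, so every map_pirq call made for that device carries the same
bus/devfn pair.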
>
> So of interest are commits:
> - 5ca72c4f7c412c2002363218901eba5516c476b1
> - 08261d87f7d1b6253ab3223756625a5c74532293
> - 51906e779f2b13b38f8153774c4c7163d412ffd9
>
> Hmmm reading the commit message of 51906e779f2b13b38f8153774c4c7163d412ffd9:
>
> x86/MSI: Support multiple MSIs in presense of IRQ remapping
>
> The MSI specification has several constraints in comparison with
> MSI-X, most notable of them is the inability to configure MSIs
> independently. As a result, it is impossible to dispatch
> interrupts from different queues to different CPUs. This is
> largely devalues the support of multiple MSIs in SMP systems.
>
> Also, a necessity to allocate a contiguous block of vector
> numbers for devices capable of multiple MSIs might cause a
> considerable pressure on x86 interrupt vector allocator and
> could lead to fragmentation of the interrupt vectors space.
>
> This patch overcomes both drawbacks in presense of IRQ remapping
> and lets devices take advantage of multiple queues and per-IRQ
> affinity assignments.
>
> At least makes clear why baremetal does boot and xen doesn't:
>
> Baremetal behaves differently and thus boots because interrupt remapping gets disabled on boot by the kernel iommu code due to the buggy bios iommu errata, so according to the commit message above it doesn't even try the multiple MSI per device scenario.
>
> So the question is whether it can be enabled in Xen (and whether it would actually be beneficial, because the commit message seems to indicate that is questionable).
> If not, the check in arch/x86/kernel/apic/io_apic.c:setup_msi_irqs should fail
Except that function is not run under Xen. That is b/c
x86_msi_ops.setup_msi_irqs ends up pointing to xen_initdom_setup_msi_irqs,
while if the IOMMU is enabled it gets set to irq_remapping_setup_msi_irqs.
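
To make the point concrete, here is a small user-space model of that
hook replacement. It is only a sketch: every name ending in _model is
invented for illustration and none of this is the kernel code itself.
The assumption is simply that the setup routine is reached through a
function pointer in an ops table, and that Xen dom0 init overwrites
that pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; only exists for this sketch. */
struct pci_dev_model { int unused; };

/* Simplified model of x86_msi_ops: one hook reached via function pointer. */
struct msi_ops_model {
    int (*setup_msi_irqs)(struct pci_dev_model *dev, int nvec, int type);
};

/* Tags returned so a caller can tell which backend actually ran. */
enum { BACKEND_NATIVE = 100, BACKEND_XEN_DOM0 = 200 };

/* Models the native path, where a "no remapping, nvec > 1" check would live. */
static int native_setup_model(struct pci_dev_model *dev, int nvec, int type)
{
    (void)dev; (void)nvec; (void)type;
    return BACKEND_NATIVE;
}

/* Models the Xen dom0 path, which never sees the native check. */
static int xen_initdom_setup_model(struct pci_dev_model *dev, int nvec, int type)
{
    (void)dev; (void)nvec; (void)type;
    return BACKEND_XEN_DOM0;
}

/* Default: the hook points at the native routine. */
static struct msi_ops_model msi_ops = { .setup_msi_irqs = native_setup_model };

/* Models dom0 init overwriting the hook. */
static void xen_dom0_init_model(void)
{
    msi_ops.setup_msi_irqs = xen_initdom_setup_model;
}

/* Models the arch entry point: it just calls through the hook. */
static int arch_setup_model(struct pci_dev_model *dev, int nvec, int type)
{
    return msi_ops.setup_msi_irqs(dev, nvec, type);
}
```

Once xen_dom0_init_model() has run, arch_setup_model() never reaches
native_setup_model(), so a check placed only in the native routine has
no effect under Xen.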

So a fix like this:
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 56ab749..47f8cca 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -263,6 +263,9 @@ static int xen_initdom_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
         int ret = 0;
         struct msi_desc *msidesc;

+       if (type == PCI_CAP_ID_MSI && nvec > 1)
+               return 1;
+
         list_for_each_entry(msidesc, &dev->msi_list, list) {
                 struct physdev_map_pirq map_irq;
                 domid_t domid;


(sorry about the paste getting messed up here) - ought to do it? For
example, one of my AMD machines has no IOMMU, and there AHCI works
on baremetal but not under Xen.

In the future we can implement a better version of this that deals with
multiple MSIs, but let's first make sure it boots.
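
A rough user-space model of why returning 1 should be enough. This
assumes the caller treats a positive return as "retry with this many
vectors", which is my reading of the pci_enable_msi_block_auto loop and
is worth double-checking against the tree. The function names and the
_MODEL constants below are invented for the sketch; only the
"type == MSI && nvec > 1 returns 1" rule is taken from the patch above.

```c
#include <stddef.h>

/* Capability IDs as defined by PCI: MSI is 0x05, MSI-X is 0x11. */
#define CAP_ID_MSI_MODEL  0x05
#define CAP_ID_MSIX_MODEL 0x11

/*
 * Model of the patched backend: refuse multi-vector MSI by reporting
 * that a single vector is available. 0 means success.
 */
static int setup_msi_irqs_model(int nvec, int type)
{
    if (type == CAP_ID_MSI_MODEL && nvec > 1)
        return 1;   /* positive: caller may retry with this count */
    return 0;
}

/*
 * Model of the caller's retry loop (assumed behaviour): keep asking
 * with the count hinted by a positive return value. Returns the number
 * of vectors finally enabled, or a negative error.
 */
static int enable_msi_block_auto_model(int maxvec)
{
    int nvec;
    int ret = maxvec;

    do {
        nvec = ret;
        ret = setup_msi_irqs_model(nvec, CAP_ID_MSI_MODEL);
    } while (ret > 0);

    if (ret < 0)
        return ret;
    return nvec;
}
```

Under that assumption, a request for several MSI vectors falls back to
a single vector instead of failing the probe, and MSI-X requests are
left alone.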
