From: Dongli Zhang <dongli.zhang@oracle.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: keith.busch@intel.com, qemu-devel@nongnu.org,
	coffmaker@gmail.com, maximlevitsky@gmail.com
Subject: Re: [Qemu-devel] vfio failure with intel 760p 128GB nvme
Date: Thu, 27 Dec 2018 20:30:48 +0800
Message-ID: <55d8a52e-7c0e-aa5b-ced2-176619e5c661@oracle.com>
In-Reply-To: <8f06611d-7b1e-5934-648c-f4cd7626fe63@oracle.com>

Hi Alex,

On 12/02/2018 09:29 AM, Dongli Zhang wrote:
> Hi Alex,
> 
> On 12/02/2018 03:29 AM, Alex Williamson wrote:
>> On Sat, 1 Dec 2018 10:52:21 -0800 (PST)
>> Dongli Zhang <dongli.zhang@oracle.com> wrote:
>>
>>> Hi,
>>>
>>> I obtained below error when assigning an intel 760p 128GB nvme to guest via
>>> vfio on my desktop:
>>>
>>> qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0: vfio 0000:01:00.0: failed to add PCI capability 0x11[0x50]@0xb0: table & pba overlap, or they don't fit in BARs, or don't align
>>>
>>>
>>> This is because the msix table is overlapping with pba. According to below
>>> 'lspci -vv' from host, the distance between msix table offset and pba offset is
>>> only 0x100, although there are 22 entries supported (22 entries need 0x160).
>>> Looks qemu supports at most 0x800.
>>>
>>> # sudo lspci -vv
>>> ... ...
>>> 01:00.0 Non-Volatile memory controller: Intel Corporation Device f1a6 (rev 03) (prog-if 02 [NVM Express])
>>> 	Subsystem: Intel Corporation Device 390b
>>> ... ...
>>> 	Capabilities: [b0] MSI-X: Enable- Count=22 Masked-
>>> 		Vector table: BAR=0 offset=00002000
>>> 		PBA: BAR=0 offset=00002100
>>>
>>>
>>>
>>> A patch below could workaround the issue and passthrough nvme successfully.
>>>
>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>> index 5c7bd96..54fc25e 100644
>>> --- a/hw/vfio/pci.c
>>> +++ b/hw/vfio/pci.c
>>> @@ -1510,6 +1510,11 @@ static void vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp)
>>>      msix->pba_offset = pba & ~PCI_MSIX_FLAGS_BIRMASK;
>>>      msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
>>>  
>>> +    if (msix->table_bar == msix->pba_bar &&
>>> +        msix->table_offset + msix->entries * PCI_MSIX_ENTRY_SIZE > msix->pba_offset) {
>>> +        msix->entries = (msix->pba_offset - msix->table_offset) / PCI_MSIX_ENTRY_SIZE;
>>> +    }
>>> +
>>>      /*
>>>       * Test the size of the pba_offset variable and catch if it extends outside
>>>       * of the specified BAR. If it is the case, we need to apply a hardware
>>>
>>>
>>> Would you please help confirm if this can be regarded as bug in qemu, or issue
>>> with nvme hardware? Should we fix this in qemu, or we should never use such buggy
>>> hardware with vfio?
>>
>> It's a hardware bug, is there perhaps a firmware update for the device
>> that resolves it?  It's curious that a vector table size of 0x100 gives
>> us 16 entries and 22 in hex is 0x16 (table size would be reported as
>> 0x15 for the N-1 algorithm).  I wonder if there's a hex vs decimal
>> mismatch going on.  We don't really know if the workaround above is
>> correct, are there really 16 entries or maybe does the PBA actually
>> start at a different offset?  We wouldn't want to generically assume
>> one or the other.  I think we need Intel to tell us in which way their
>> hardware is broken and whether it can or is already fixed in a firmware
>> update.  Thanks,
> 
> Thank you very much for the confirmation.
> 
> Just realized looks this would make trouble to my desktop as well when 17
> vectors are used.
> 
> I will report to intel and confirm how this can happen and if there is any
> firmware update available for this issue.
> 

I found that a similar issue has been reported to kvm:

https://bugzilla.kernel.org/show_bug.cgi?id=202055


I checked my environment again. By default, the MSI-X count is 16:

	Capabilities: [b0] MSI-X: Enable+ Count=16 Masked-
		Vector table: BAR=0 offset=00002000
		PBA: BAR=0 offset=00002100


The count is still 16 after the device is bound to vfio-pci (now Enable-):

# echo 0000:01:00.0 > /sys/bus/pci/devices/0000\:01\:00.0/driver/unbind
# echo "8086 f1a6" > /sys/bus/pci/drivers/vfio-pci/new_id

	Capabilities: [b0] MSI-X: Enable- Count=16 Masked-
		Vector table: BAR=0 offset=00002000
		PBA: BAR=0 offset=00002100


After I boot QEMU with "-device vfio-pci,host=0000:01:00.0", the count becomes 22:

	Capabilities: [b0] MSI-X: Enable- Count=22 Masked-
		Vector table: BAR=0 offset=00002000
		PBA: BAR=0 offset=00002100
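
For reference, the arithmetic behind the overlap can be sketched as below. This
is only an illustration, not QEMU code: the helper names are made up for this
mail, and the 16-byte entry size is the MSI-X vector table entry size (what
QEMU calls PCI_MSIX_ENTRY_SIZE). It assumes the table and the PBA live in the
same BAR with the table at the lower offset, as on this device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of one MSI-X vector table entry. */
#define SKETCH_MSIX_ENTRY_SIZE 16u

/* Hypothetical helper: bytes occupied by a vector table with 'entries' vectors. */
uint32_t msix_table_bytes(uint32_t entries)
{
    return entries * SKETCH_MSIX_ENTRY_SIZE;
}

/*
 * Hypothetical helper: true if a table starting at table_offset with
 * 'entries' vectors extends past the start of the PBA in the same BAR.
 */
bool msix_table_overlaps_pba(uint32_t table_offset, uint32_t pba_offset,
                             uint32_t entries)
{
    assert(pba_offset >= table_offset);
    return table_offset + msix_table_bytes(entries) > pba_offset;
}

/* Hypothetical helper: number of whole entries that fit before the PBA. */
uint32_t msix_entries_that_fit(uint32_t table_offset, uint32_t pba_offset)
{
    assert(pba_offset >= table_offset);
    return (pba_offset - table_offset) / SKETCH_MSIX_ENTRY_SIZE;
}
```

With table offset 0x2000 and PBA offset 0x2100, 22 entries need 0x160 bytes and
run into the PBA, while 16 entries need exactly 0x100 bytes and fit. That
matches the default count of 16, but it does not tell us which of the two
values the hardware really implements.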



Another interesting observation is that the vfio-based userspace NVMe driver
also changes the count from 16 to 22.

I rebooted the host, which reset the count to 16, and then booted a VM with
"-drive file=nvme://0000:01:00.0/1,if=none,id=nvmedrive0 -device
virtio-blk,drive=nvmedrive0,id=nvmevirtio0". Because the userspace NVMe driver
uses a different vfio code path, the VM boots without hitting the error.

However, the count then becomes 22 as well:

	Capabilities: [b0] MSI-X: Enable- Count=22 Masked-
		Vector table: BAR=0 offset=00002000
		PBA: BAR=0 offset=00002100


So both vfio-pci passthrough and the userspace NVMe driver (based on vfio)
change the count from 16 to 22.
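
For anyone comparing raw register values: the Count shown by lspci is decoded
from the Table Size field of the MSI-X Message Control register, which stores
N-1 (so 22 entries read back as 0x15 and 16 entries as 0x0f). A minimal sketch
of that decoding follows. The mask values are the ones I believe Linux's
pci_regs.h uses for PCI_MSIX_FLAGS_QSIZE and PCI_MSIX_FLAGS_ENABLE; the macro
and function names here are only illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Message Control bits 10:0: table size, encoded as N-1. */
#define SKETCH_MSIX_FLAGS_QSIZE  0x07FFu
/* Message Control bit 15: MSI-X enable. */
#define SKETCH_MSIX_FLAGS_ENABLE 0x8000u

/* Hypothetical helper: number of vectors advertised by a Message Control value. */
unsigned msix_count_from_ctrl(uint16_t ctrl)
{
    return (ctrl & SKETCH_MSIX_FLAGS_QSIZE) + 1u;
}

/* Hypothetical helper: true if the MSI-X enable bit is set ("Enable+" in lspci). */
bool msix_enabled(uint16_t ctrl)
{
    return (ctrl & SKETCH_MSIX_FLAGS_ENABLE) != 0;
}
```

Under these assumptions a Message Control value of 0x800f corresponds to
"Enable+ Count=16", 0x000f to "Enable- Count=16", and 0x0015 to
"Enable- Count=22".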

Dongli Zhang
