qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Jintack Lim <jintack@cs.columbia.edu>
To: Peter Xu <peterx@redhat.com>
Cc: mst@redhat.com, Alex Williamson <alex.williamson@redhat.com>,
	QEMU Devel Mailing List <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] iommu emulation
Date: Wed, 15 Feb 2017 17:05:35 -0500	[thread overview]
Message-ID: <CAHyh4xg0sY2woR5BPVd6uiESpu6U9PHYs7a173sQ7BcLSVk5Hg@mail.gmail.com> (raw)
In-Reply-To: <20170215025243.GA3988@pxdev.xzpeter.org>

On Tue, Feb 14, 2017 at 9:52 PM, Peter Xu <peterx@redhat.com> wrote:

> On Tue, Feb 14, 2017 at 07:50:39AM -0500, Jintack Lim wrote:
>
> [...]
>
> > > > >> > I misunderstood what you said?
> > > > >
> > > > > I failed to understand why an vIOMMU could help boost performance.
> :(
> > > > > Could you provide your command line here so that I can try to
> > > > > reproduce?
> > > >
> > > > Sure. This is the command line to launch L1 VM
> > > >
> > > > qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split \
> > > > -m 12G -device intel-iommu,intremap=on,eim=off,caching-mode=on \
> > > > -drive file=/mydata/guest0.img,format=raw --nographic -cpu host \
> > > > -smp 4,sockets=4,cores=1,threads=1 \
> > > > -device vfio-pci,host=08:00.0,id=net0
> > > >
> > > > And this is for L2 VM.
> > > >
> > > > ./qemu-system-x86_64 -M q35,accel=kvm \
> > > > -m 8G \
> > > > -drive file=/vm/l2guest.img,format=raw --nographic -cpu host \
> > > > -device vfio-pci,host=00:03.0,id=net0
> > >
> > > ... here looks like these are command lines for L1/L2 guest, rather
> > > than L1 guest with/without vIOMMU?
> > >
> >
> > That's right. I thought you were asking about command lines for L1/L2
> guest
> > :(.
> > I think I made the confusion, and as I said above, I didn't mean to talk
> > about the performance of L1 guest with/without vIOMMO.
> > We can move on!
>
> I see. Sure! :-)
>
> [...]
>
> > >
> > > Then, I *think* above assertion you encountered would fail only if
> > > prev == 0 here, but I still don't quite sure why was that happening.
> > > Btw, could you paste me your "lspci -vvv -s 00:03.0" result in your L1
> > > guest?
> > >
> >
> > Sure. This is from my L1 guest.
>
> Hmm... I think I found the problem...
>
> >
> > root@guest0:~# lspci -vvv -s 00:03.0
> > 00:03.0 Network controller: Mellanox Technologies MT27500 Family
> > [ConnectX-3]
> > Subsystem: Mellanox Technologies Device 0050
> > Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > Stepping- SERR+ FastB2B- DisINTx+
> > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
> > <MAbort- >SERR- <PERR- INTx-
> > Latency: 0, Cache Line Size: 64 bytes
> > Interrupt: pin A routed to IRQ 23
> > Region 0: Memory at fe900000 (64-bit, non-prefetchable) [size=1M]
> > Region 2: Memory at fe000000 (64-bit, prefetchable) [size=8M]
> > Expansion ROM at fea00000 [disabled] [size=1M]
> > Capabilities: [40] Power Management version 3
> > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-
> )
> > Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > Capabilities: [48] Vital Product Data
> > Product Name: CX354A - ConnectX-3 QSFP
> > Read-only fields:
> > [PN] Part number: MCX354A-FCBT
> > [EC] Engineering changes: A4
> > [SN] Serial number: MT1346X00791
> > [V0] Vendor specific: PCIe Gen3 x8
> > [RV] Reserved: checksum good, 0 byte(s) reserved
> > Read/write fields:
> > [V1] Vendor specific: N/A
> > [YA] Asset tag: N/A
> > [RW] Read-write area: 105 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 252 byte(s) free
> > End
> > Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
> > Vector table: BAR=0 offset=0007c000
> > PBA: BAR=0 offset=0007d000
> > Capabilities: [60] Express (v2) Root Complex Integrated Endpoint, MSI 00
> > DevCap: MaxPayload 256 bytes, PhantFunc 0
> > ExtTag- RBE+
> > DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
> > RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> > MaxPayload 256 bytes, MaxReadReq 4096 bytes
> > DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
> > DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not
> > Supported
> > DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF
> Disabled
> > Capabilities: [100 v0] #00
>
> Here we have the head of ecap capability as cap_id==0, then when we
> boot the l2 guest with the same device, we'll first copy this
> cap_id==0 cap, then when adding the 2nd ecap, we'll probably encounter
> problem since pcie_find_capability_list() will thought there is no cap
> at all (cap_id==0 is skipped).
>
> Do you want to try this "hacky patch" to see whether it works for you?
>

Thanks for following this up!

I just tried this, and I got some different message this time.

qemu-system-x86_64: vfio: Cannot reset device 0000:00:03.0, no available
reset mechanism.
qemu-system-x86_64: vfio: Cannot reset device 0000:00:03.0, no available
reset mechanism.


Thanks,
Jintack


> ------8<-------
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 332f41d..bacd302 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1925,11 +1925,6 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
>
>      }
>
> -    /* Cleanup chain head ID if necessary */
> -    if (pci_get_word(pdev->config + PCI_CONFIG_SPACE_SIZE) == 0xFFFF) {
> -        pci_set_word(pdev->config + PCI_CONFIG_SPACE_SIZE, 0);
> -    }
> -
>      g_free(config);
>      return;
>  }
> ------>8-------
>
> I don't think it's a good solution (it just used 0xffff instead of 0x0
> for the masked cap_id, then l2 guest would like to co-op with it), but
> it should workaround this temporarily. I'll try to think of a better
> one later and post when proper.
>
> (Alex, please leave comment if you have any better suggestion before
>  mine :)
>
> Thanks,
>
> -- peterx
>
>

  parent reply	other threads:[~2017-02-15 22:05 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CAHyh4xiVKjd+D=qaizUZ02O8xLYhpoVKOqC9cR0ZWWyLq9HtbQ@mail.gmail.com>
     [not found] ` <20170208031216.GA5151@pxdev.xzpeter.org>
     [not found]   ` <CAHyh4xg7NVPjXu3c+xGWNzQqwLgFqFJTPo4SgN-X+FNuHjGihQ@mail.gmail.com>
     [not found]     ` <CAHyh4xhOPmfLoU_fvtbBF1Wqbzji9q6rp_bRN38qfnwvhQq+9A@mail.gmail.com>
2017-02-09  3:52       ` [Qemu-devel] iommu emulation Peter Xu
2017-02-09 13:01         ` Jintack Lim
2017-02-14  7:35           ` Peter Xu
2017-02-14 12:50             ` Jintack Lim
2017-02-15  2:52               ` Peter Xu
2017-02-15  3:34                 ` Peter Xu
2017-02-15 18:15                   ` Alex Williamson
2017-02-16  2:28                     ` Peter Xu
2017-02-16  2:47                       ` Alex Williamson
2017-02-21 10:33                         ` Jintack Lim
2017-02-23 23:04                           ` Jintack Lim
2017-03-02 22:20                             ` Bandan Das
2017-03-02 23:36                               ` Jintack Lim
2017-03-03  3:43                               ` Peter Xu
2017-03-03  7:45                                 ` Bandan Das
2017-02-15 22:05                 ` Jintack Lim [this message]
2017-02-15 22:50                   ` Alex Williamson
2017-02-15 23:25                     ` Jintack Lim
2017-02-16  1:17                       ` Alex Williamson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHyh4xg0sY2woR5BPVd6uiESpu6U9PHYs7a173sQ7BcLSVk5Hg@mail.gmail.com \
    --to=jintack@cs.columbia.edu \
    --cc=alex.williamson@redhat.com \
    --cc=mst@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).