qemu-devel.nongnu.org archive mirror
From: Alex Williamson <alex.williamson@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, tianyu.lan@intel.com,
	kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com,
	jasowang@redhat.com, David Gibson <david@gibson.dropbear.id.au>,
	bd.aviv@gmail.com
Subject: Re: [Qemu-devel] [PATCH v7 00/17] VT-d: vfio enablement and misc enhances
Date: Mon, 20 Feb 2017 12:15:33 -0700
Message-ID: <20170220121533.6914307b@t450s.home>
In-Reply-To: <20170220074731.GD12693@pxdev.xzpeter.org>

On Mon, 20 Feb 2017 15:47:31 +0800
Peter Xu <peterx@redhat.com> wrote:

> On Fri, Feb 17, 2017 at 10:18:35AM -0700, Alex Williamson wrote:
> > On Tue,  7 Feb 2017 16:28:02 +0800
> > Peter Xu <peterx@redhat.com> wrote:
> >   
> > > This is v7 of vt-d vfio enablement series.  
> > [snip]  
> > > =========
> > > Test Done
> > > =========
> > > 
> > > Build test passed for x86_64/arm/ppc64.
> > > 
> > > Simply tested with x86_64, assigning two PCI devices to a single VM,
> > > boot the VM using:
> > > 
> > > bin=x86_64-softmmu/qemu-system-x86_64
> > > $bin -M q35,accel=kvm,kernel-irqchip=split -m 1G \
> > >      -device intel-iommu,intremap=on,eim=off,caching-mode=on \
> > >      -netdev user,id=net0,hostfwd=tcp::5555-:22 \
> > >      -device virtio-net-pci,netdev=net0 \
> > >      -device vfio-pci,host=03:00.0 \
> > >      -device vfio-pci,host=02:00.0 \
> > >      -trace events=".trace.vfio" \
> > >      /var/lib/libvirt/images/vm1.qcow2
> > > 
> > > pxdev:bin [vtd-vfio-enablement]# cat .trace.vfio
> > > vtd_page_walk*
> > > vtd_replay*
> > > vtd_inv_desc*
> > > 
> > > Then, in the guest, run the following tool:
> > > 
> > >   https://github.com/xzpeter/clibs/blob/master/gpl/userspace/vfio-bind-group/vfio-bind-group.c
> > > 
> > > With parameter:
> > > 
> > >   ./vfio-bind-group 00:03.0 00:04.0
> > > 
> > > Check host side trace log, I can see pages are replayed and mapped in
> > > 00:04.0 device address space, like:
> > > 
> > > ...
> > > vtd_replay_ce_valid replay valid context device 00:04.00 hi 0x401 lo 0x38fe1001
> > > vtd_page_walk Page walk for ce (0x401, 0x38fe1001) iova range 0x0 - 0x8000000000
> > > vtd_page_walk_level Page walk (base=0x38fe1000, level=3) iova range 0x0 - 0x8000000000
> > > vtd_page_walk_level Page walk (base=0x35d31000, level=2) iova range 0x0 - 0x40000000
> > > vtd_page_walk_level Page walk (base=0x34979000, level=1) iova range 0x0 - 0x200000
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x0 -> gpa 0x22dc3000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x1000 -> gpa 0x22e25000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x2000 -> gpa 0x22e12000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x3000 -> gpa 0x22e2d000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x4000 -> gpa 0x12a49000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x5000 -> gpa 0x129bb000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x6000 -> gpa 0x128db000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x7000 -> gpa 0x12a80000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x8000 -> gpa 0x12a7e000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x9000 -> gpa 0x12b22000 mask 0xfff perm 3
> > > vtd_page_walk_one Page walk detected map level 0x1 iova 0xa000 -> gpa 0x12b41000 mask 0xfff perm 3
> > > ...  
> > 
> > Hi Peter,
> > 
> > I'm trying to make use of this, with your vtd-vfio-enablement-v7 branch
> > (HEAD 0c1c4e738095).  I'm assigning an 82576 PF to a VM.  It works with
> > iommu=pt, but if I remove that option, the device does not work and
> > vfio_iommu_map_notify is never called.  Any suggestions?  My
> > commandline is below.  Thanks,
> > 
> > Alex
> > 
> > /usr/local/bin/qemu-system-x86_64 \
> >         -name guest=l1,debug-threads=on -S \
> >         -machine pc-q35-2.9,accel=kvm,usb=off,dump-guest-core=off,kernel-irqchip=split \
> >         -cpu host -m 10240 -realtime mlock=off -smp 4,sockets=1,cores=2,threads=2 \
> >         -no-user-config -nodefaults -monitor stdio -rtc base=utc,driftfix=slew \
> >         -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
> >         -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 \
> >         -boot strict=on \
> >         -device ioh3420,port=0x10,chassis=1,id=pci.1,bus=pcie.0,addr=0x2 \
> >         -device i82801b11-bridge,id=pci.2,bus=pcie.0,addr=0x1e \
> >         -device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.2,addr=0x0 \
> >         -device ioh3420,port=0x18,chassis=4,id=pci.4,bus=pcie.0,addr=0x3 \
> >         -device ioh3420,port=0x20,chassis=5,id=pci.5,bus=pcie.0,addr=0x4 \
> >         -device ioh3420,port=0x28,chassis=6,id=pci.6,bus=pcie.0,addr=0x5 \
> >         -device ioh3420,port=0x30,chassis=7,id=pci.7,bus=pcie.0,addr=0x6 \
> >         -device ioh3420,port=0x38,chassis=8,id=pci.8,bus=pcie.0,addr=0x7 \
> >         -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 \
> >         -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d \
> >         -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 \
> >         -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 \
> >         -device virtio-serial-pci,id=virtio-serial0,bus=pci.4,addr=0x0 \
> >         -drive file=/dev/vg_s20/lv_l1,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native \
> >         -device virtio-blk-pci,scsi=off,bus=pci.5,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
> >         -netdev user,id=hostnet0 \
> >         -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:c2:62:30,bus=pci.1,addr=0x0 \
> >         -device usb-tablet,id=input0,bus=usb.0,port=1 \
> >         -vnc :0 -vga std \
> >         -device vfio-pci,host=01:00.0,id=hostdev0,bus=pci.8,addr=0x0 \
> >         -device intel-iommu,intremap=on,eim=off,caching-mode=on -trace events=/trace-events.txt -msg timestamp=on  
> 
> Alex,
> 
> Thanks for testing this series.
> 
> I think I reproduced it using my 10g nic as well. What I got is:
> 
> [   23.724787] ixgbe 0000:01:00.0 enp1s0: Detected Tx Unit Hang
> [   23.724787]   Tx Queue             <0>
> [   23.724787]   TDH, TDT             <0>, <1>
> [   23.724787]   next_to_use          <1>
> [   23.724787]   next_to_clean        <0>
> [   23.724787] tx_buffer_info[next_to_clean]
> [   23.724787]   time_stamp           <fffbb8bb>
> [   23.724787]   jiffies              <fffbc780>
> [   23.729580] ixgbe 0000:01:00.0 enp1s0: tx hang 1 detected on queue 0, resetting adapter
> [   23.730752] ixgbe 0000:01:00.0 enp1s0: initiating reset due to tx timeout
> [   23.731768] ixgbe 0000:01:00.0 enp1s0: Reset adapter
> 
> Is this the problem you have encountered? (adapter continuously reset)
> 
> Interestingly, I found that the problem solves itself after I move the
> "-device intel-iommu,..." line before all the other devices.
> 
> Put another way, this is a much shorter reproducer that triggers the bug:
> 
> $qemu   -machine q35,accel=kvm,kernel-irqchip=split \
>         -cpu host -smp 4 -m 2048 \
>         -nographic -nodefaults -serial stdio \
>         -device vfio-pci,host=05:00.0,bus=pci.1 \
>         -device intel-iommu,intremap=on,eim=off,caching-mode=on \
>         /images/fedora-25.qcow2
> 
> While this may possibly be okay at least on my host (switching the
> order of the two devices):
> 
> $qemu   -machine q35,accel=kvm,kernel-irqchip=split \
>         -cpu host -smp 4 -m 2048 \
>         -nographic -nodefaults -serial stdio \
>         -device intel-iommu,intremap=on,eim=off,caching-mode=on \
>         -device vfio-pci,host=05:00.0,bus=pci.1 \
>         /images/fedora-25.qcow2
> 
> So not sure how the ordering of realization of these two devices
> (intel-iommu, vfio-pci) affected the behavior. One thing I suspect is
> that in vfio_realize(), we have:
> 
>   group = vfio_get_group(groupid, pci_device_iommu_address_space(pdev), errp);
> 
> while at that point we will possibly be getting &address_space_memory
> instead of the correct DMA address space, since the Intel IOMMU device
> has not yet been initialized...
> 
> Before I go deeper, any thoughts?


Sound theory, and it seems to be confirmed by Yi.  That makes it pretty
much impossible to test using libvirt <qemu:arg> support, which is how I
derived my VM command line.  Thanks,

Alex
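
If the ordering theory above holds, one way to keep testing from a
libvirt-derived command line might be a small wrapper that moves the
intel-iommu device ahead of everything else before QEMU sees it.  The
following is only a minimal sketch under that assumption: the function
names (is_iommu_pair, emit_args, reorder_iommu_first) are hypothetical,
nothing here comes from the series itself, and it assumes no argument
contains a newline.

```sh
#!/bin/sh
# Hypothetical helper, not part of QEMU or libvirt.
# Succeeds when "$1 $2" is a "-device intel-iommu[,...]" pair.
is_iommu_pair() {
    [ "$1" = -device ] || return 1
    case "${2-}" in
        intel-iommu|intel-iommu,*) return 0 ;;
    esac
    return 1
}

# Print, one per line, either only the intel-iommu pairs ($1 = iommu)
# or every other argument in its original order ($1 = rest).
emit_args() {
    want=$1
    shift
    while [ $# -gt 0 ]; do
        if is_iommu_pair "$@"; then
            [ "$want" = iommu ] && printf '%s\n' "$1" "$2"
            shift 2
        else
            [ "$want" = rest ] && printf '%s\n' "$1"
            shift
        fi
    done
}

# Print all arguments, one per line, with the intel-iommu pairs first.
reorder_iommu_first() {
    emit_args iommu "$@"
    emit_args rest "$@"
}
```

A wrapper script would read those lines back into an argument list and
exec the real binary with them.  Whether libvirt accepts such a wrapper
in place of its emulator binary is untested here.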

