From: Alex Williamson <alex.williamson@redhat.com>
To: Zhou Jie <zhoujie2011@cn.fujitsu.com>
Cc: Chen Fan <chen.fan.fnst@cn.fujitsu.com>,
	izumi.taku@jp.fujitsu.com, fan.chen@easystack.cn,
	qemu-devel@nongnu.org, mst@redhat.com
Subject: Re: [Qemu-devel] [PATCH v9 05/11] vfio: add check host bus reset is support or not
Date: Wed, 31 Aug 2016 20:12:42 -0600
Message-ID: <20160831201242.4d4d350d@t450s.home>
In-Reply-To: <20160831135620.1083b9a6@t450s.home>

On Wed, 31 Aug 2016 13:56:20 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Tue, 19 Jul 2016 15:38:23 +0800
> Zhou Jie <zhoujie2011@cn.fujitsu.com> wrote:
> 
> > From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> > 
> > When assigning a vfio device with AER enabled, we must check whether
> > the device supports a host bus reset (i.e. hot reset) as this may be
> > used by the guest OS in order to recover the device from an AER
> > error.  QEMU must therefore have the ability to perform a physical
> > host bus reset using the existing vfio APIs in response to a virtual
> > bus reset in the VM.  A physical bus reset affects all of the devices
> > on the host bus, therefore we place a few simplifying configuration
> > restrictions on the VM:
> > 
> >  - All physical devices affected by a bus reset must be assigned to
> >    the VM with AER enabled on each and be configured on the same
> >    virtual bus in the VM.
> > 
> >  - No devices unaffected by the bus reset, be they physical, emulated,
> >    or paravirtual may be configured on the same virtual bus as a
> >    device supporting AER signaling through vfio.
> > 
> > In other words users wishing to enable AER on a multifunction device
> > need to assign all functions of the device to the same virtual bus
> > and enable AER support for each device.  The easiest way to
> > accomplish this is to identity map the physical functions to virtual
> > functions with multifunction enabled on the virtual device.  
> 
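
The identity mapping described in the quoted commit message can be
sketched as a small generator for the corresponding -device arguments.
This is only an illustration: the option names (aer, host, bus, addr,
multifunction) are the ones that appear in the command line in this
message, while the function name vfio_device_args and the hostdevN id
scheme are hypothetical.

```python
def vfio_device_args(host_slot, functions, virt_bus, virt_slot):
    """Build QEMU -device arguments that identity map host functions.

    host_slot -- host bus:device, e.g. "07:00" (function is appended)
    functions -- function numbers to assign, e.g. [0, 1]
    virt_bus  -- id of the virtual bus, e.g. "pci.3"
    virt_slot -- slot number on the virtual bus, e.g. 1
    """
    args = []
    for fn in functions:
        opts = [
            "vfio-pci",
            "aer=on",                    # AER enabled on every function
            f"host={host_slot}.{fn}",    # physical function N ...
            f"id=hostdev{fn}",           # hypothetical id scheme
            f"bus={virt_bus}",           # ... all on the same virtual bus
        ]
        if fn == 0:
            # Function 0 carries the multifunction flag for the slot.
            opts.append("multifunction=on")
            opts.append(f"addr={virt_slot:#x}")
        else:
            # ... maps to virtual function N of the same slot.
            opts.append(f"addr={virt_slot:#x}.{fn:#x}")
        args += ["-device", ",".join(opts)]
    return args


if __name__ == "__main__":
    print(" ".join(vfio_device_args("07:00", [0, 1], "pci.3", 1)))
```

For the two 82576 functions this yields the same two vfio-pci -device
options that end the command line in this message.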
> Why am I able to start the following VM with aer=on for the vfio-pci
> devices?
> 
> # lspci -tv
> -[0000:00]-+-00.0  Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
>            +-01.0  Device 1234:1111
>            +-1c.0-[01]--
>            +-1d.0-[02]--+-01.0  Intel Corporation 82576 Gigabit Network Connection
>            |            \-01.1  Intel Corporation 82576 Gigabit Network Connection
>            ...
> 
> # lspci -vvv -s 1d.0
> 00:1d.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge (prog-if 00 [Normal decode])
> 
> The devices are behind a PCIe-to-PCI bridge, so shouldn't specifying
> aer=on for the vfio-pci devices cause a configuration error?
> 
> commandline:
> 
/home/alwillia/local/bin/qemu-system-x86_64 -name guest=rhel7-q35,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-11-rhel7-q35/master-key.aes -machine pc-q35-2.7,accel=kvm,usb=off,vmport=off -cpu IvyBridge -m 8192 -realtime mlock=off -smp 6,sockets=1,cores=6,threads=1 -uuid b20b28b4-9304-4e11-9ffa-0367aeb44afb -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-11-rhel7-q35/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e -device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 -device pci-bridge,chassis_nr=3,id=pci.3,bus=pcie.0,addr=0x1d -device ioh3420,port=0xe0,chassis=4,id=pci.4,bus=pcie.0,addr=0x1c -device ich9-usb-ehci1,id=usb,bus=pci.2,addr=0x3.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.2,multifunction=on,addr=0x3 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.2,addr=0x3.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.2,addr=0x3.0x2 -drive file=/dev/rhel/rhel7-q35,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.2,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:50:ec:0d,bus=pci.2,addr=0x1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 127.0.0.1:0 -device VGA,id=video0,vgamem_mb=16,bus=pcie.0,addr=0x1 -device intel-hda,id=sound0,bus=pci.2,addr=0x2 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device vfio-pci,aer=on,host=07:00.0,id=hostdev0,bus=pci.3,multifunction=on,addr=0x1 -device vfio-pci,aer=on,host=07:00.1,id=hostdev1,bus=pci.3,addr=0x1.0x1 -msg timestamp=on
> 
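
For reference, the two restrictions listed in the quoted commit message
can be written down as a small check. This is a minimal sketch and not
the code from the patch: the Device type, the check_aer_config function
and the reset_group mapping (host address -> set of host addresses hit
by the same bus reset) are all hypothetical names. It also models only
those two rules; it does not look at whether the virtual bus is PCIe or
conventional PCI, which is the question raised above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Device:
    name: str
    virt_bus: str
    host: Optional[str] = None  # host PCI address; None for emulated devices
    aer: bool = False


def check_aer_config(devices, reset_group):
    """Return a list of violations of the two restrictions (empty if none)."""
    errors = []
    by_host = {d.host: d for d in devices if d.host is not None}
    for dev in devices:
        if dev.host is None or not dev.aer:
            continue
        # Assumption: a device missing from reset_group affects only itself.
        affected = reset_group.get(dev.host, {dev.host})
        # Rule 1: every affected host device is assigned, has AER enabled,
        # and sits on the same virtual bus.
        for host in sorted(affected):
            peer = by_host.get(host)
            if peer is None:
                errors.append(f"{dev.name}: {host} is affected but not assigned")
            elif not peer.aer:
                errors.append(f"{dev.name}: {peer.name} lacks aer=on")
            elif peer.virt_bus != dev.virt_bus:
                errors.append(f"{dev.name}: {peer.name} is on {peer.virt_bus}")
        # Rule 2: nothing unaffected by the reset shares the virtual bus.
        for other in devices:
            if other.virt_bus == dev.virt_bus and other.host not in affected:
                errors.append(f"{dev.name}: {other.name} shares {dev.virt_bus}")
    return errors


if __name__ == "__main__":
    group = {"07:00.0": {"07:00.0", "07:00.1"},
             "07:00.1": {"07:00.0", "07:00.1"}}
    vm = [Device("hostdev0", "pci.3", "07:00.0", aer=True),
          Device("hostdev1", "pci.3", "07:00.1", aer=True)]
    print(check_aer_config(vm, group))
```

Under these two rules alone the configuration above is accepted, since
both 82576 functions are assigned with aer=on and nothing else shares
pci.3.
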

I had to move to a different system where I could actually inject an
AER error, and created a config similar to the one above but with the
82576 ports downstream of the ioh3420 root port.  When I inject a
malformed TLP uncorrectable error, my RHEL7.2 guest does this:

[   35.995645] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal) error received: id=0200
[   35.998483] igb 0000:02:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Unaccessible, id=0200(Unregistered Agent ID)
[   36.001965] igb 0000:02:00.0 enp2s0f0: PCIe link lost, device now detached
[   36.015092] igb 0000:02:00.1 enp2s0f1: PCIe link lost, device now detached
[   39.133185] igb 0000:02:00.0: enabling device (0000 -> 0002)
[   40.071245] igb 0000:02:00.1: enabling device (0000 -> 0002)
[   41.014451] BUG: unable to handle kernel paging request at 0000000000003818
[   41.015969] IP: [<ffffffffa02b438d>] igb_configure_tx_ring+0x14d/0x280 [igb]
[   41.017507] PGD 367e2067 PUD 7ae56067 PMD 0 
[   41.018497] Oops: 0002 [#1] SMP 
[   41.019242] Modules linked in: ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter snd_hda_codec_generic snd_hda_intel snd_hda_codec ppdev snd_hda_core snd_hwdep snd_seq snd_seq_device iTCO_wdt iTCO_vendor_support bochs_drm snd_pcm syscopyarea sysfillrect sysimgblt ttm virtio_balloon snd_timer snd igb drm_kms_helper soundcore ptp pps_core i2c_algo_bit i2c_i801 dca drm shpchp lpc_ich mfd_core pcspkr i2c_core parport_pc parport ip_tables xfs libcrc32c virtio_blk virtio_console virtio_net ahci libahci crc32c_intel serio_raw libata virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod
[   41.040590] CPU: 0 PID: 29 Comm: kworker/0:1 Not tainted 3.10.0-327.el7.x86_64 #1
[   41.042180] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
[   41.044635] Workqueue: events aer_isr
[   41.045478] task: ffff880179435080 ti: ffff880179680000 task.ti: ffff880179680000
[   41.047097] RIP: 0010:[<ffffffffa02b438d>]  [<ffffffffa02b438d>] igb_configure_tx_ring+0x14d/0x280 [igb]
[   41.049151] RSP: 0018:ffff880179683bf8  EFLAGS: 00010246
[   41.050260] RAX: 0000000000003818 RBX: 0000000000000000 RCX: 0000000000003818
[   41.051747] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 00000000002896b3
[   41.053268] RBP: ffff880179683c20 R08: 0000000001010100 R09: 00000000ffffffe7
[   41.054730] R10: ffffea0001eb6100 R11: ffffffffa02afa31 R12: 0000000000000000
[   41.056201] R13: ffff880035dbc8c0 R14: ffff880175d03f80 R15: 000000017716e000
[   41.057673] FS:  0000000000000000(0000) GS:ffff88017fc00000(0000) knlGS:0000000000000000
[   41.059337] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   41.060548] CR2: 0000000000003818 CR3: 0000000178331000 CR4: 00000000000006f0
[   41.062028] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   41.063534] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   41.065025] Stack:
[   41.065473]  ffff880035dbc8c0 ffff880035dbce70 0000000000000001 ffff880035dbc8c8
[   41.067119]  ffff880035dbce70 ffff880179683c80 ffffffffa02b8a77 fefdf27269fb3cd8
[   41.068781]  2009f9ee3386436f eb9e4e66756bbfdd 34002f8114a5d65f 9535990856231c4b
[   41.094179] Call Trace:
[   41.118688]  [<ffffffffa02b8a77>] igb_configure+0x267/0x450 [igb]
[   41.144286]  [<ffffffffa02b94f1>] igb_up+0x21/0x1a0 [igb]
[   41.170606]  [<ffffffffa02b96a7>] igb_io_resume+0x37/0x70 [igb]
[   41.195846]  [<ffffffff813381e0>] ? pci_cleanup_aer_uncorrect_error_status+0x90/0x90
[   41.221767]  [<ffffffff81338228>] report_resume+0x48/0x60
[   41.246455]  [<ffffffff8131e359>] pci_walk_bus+0x79/0xa0
[   41.270722]  [<ffffffff813381e0>] ? pci_cleanup_aer_uncorrect_error_status+0x90/0x90
[   41.296747]  [<ffffffff813382f0>] broadcast_error_message+0xb0/0x100
[   41.321552]  [<ffffffff81338509>] do_recovery+0x1c9/0x280
[   41.345507]  [<ffffffff81338f58>] aer_isr+0x348/0x430
[   41.368851]  [<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[   41.392157]  [<ffffffff8109e3cb>] worker_thread+0x11b/0x400
[   41.416852]  [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[   41.441577]  [<ffffffff810a5aef>] kthread+0xcf/0xe0
[   41.465029]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[   41.488341]  [<ffffffff81645858>] ret_from_fork+0x58/0x90
[   41.511247]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[   41.535442] Code: c1 49 89 4e 30 49 8b 85 b8 05 00 00 48 85 c0 0f 84 39 01 00 00 81 c2 10 38 00 00 48 63 d2 48 01 d0 31 d2 89 10 49 8b 46 30 31 d2 <89> 10 41 8b 95 3c 06 00 00 b8 14 01 10 02 83 fa 05 74 0b 83 fa 
[   41.587718] RIP  [<ffffffffa02b438d>] igb_configure_tx_ring+0x14d/0x280 [igb]
[   41.610872]  RSP <ffff880179683bf8>
[   41.632301] CR2: 0000000000003818

And then it reboots.  So what RAS improvement have we bought ourselves
here?  What endpoints have you tested with this?  Which ones recovered
reliably?  Thanks,

Alex


Thread overview: 25+ messages
2016-07-19  7:38 [Qemu-devel] [PATCH v9 00/11] vfio-pci: pass the aer error to guest Zhou Jie
2016-07-19  7:38 ` [Qemu-devel] [PATCH v9 01/11] vfio: extract vfio_get_hot_reset_info as a single function Zhou Jie
2016-07-19  7:38 ` [Qemu-devel] [PATCH v9 02/11] vfio: squeeze out vfio_pci_do_hot_reset for support bus reset Zhou Jie
2016-07-19  7:38 ` [Qemu-devel] [PATCH v9 03/11] vfio: add aer support for vfio device Zhou Jie
2016-08-31 19:44   ` Alex Williamson
2016-09-13  6:56     ` Dou Liyang
2016-09-13 15:14       ` Alex Williamson
2016-07-19  7:38 ` [Qemu-devel] [PATCH v9 04/11] vfio: refine function vfio_pci_host_match Zhou Jie
2016-07-19  7:38 ` [Qemu-devel] [PATCH v9 05/11] vfio: add check host bus reset is support or not Zhou Jie
2016-08-31 19:56   ` Alex Williamson
2016-09-01  2:12     ` Alex Williamson [this message]
2016-09-22 14:04       ` Dou Liyang
2016-09-22  8:34     ` Dou Liyang
2016-09-22 14:03       ` Alex Williamson
2016-09-22 14:27         ` Dou Liyang
2016-07-19  7:38 ` [Qemu-devel] [PATCH v9 06/11] pci: add a pci_function_is_valid callback to check function if valid Zhou Jie
2016-07-19  7:38 ` [Qemu-devel] [PATCH v9 07/11] vfio: add check aer functionality for hotplug device Zhou Jie
2016-07-19  7:38 ` [Qemu-devel] [PATCH v9 08/11] vfio: vote the function 0 to do host bus reset when aer occurred Zhou Jie
2016-10-09 13:07   ` Cao jin
2016-07-19  7:38 ` [Qemu-devel] [PATCH v9 09/11] vfio-pci: pass the aer error to guest Zhou Jie
2016-07-19  7:38 ` [Qemu-devel] [PATCH v9 10/11] vfio: Add waiting for host aer error progress Zhou Jie
2016-08-31 20:13   ` Alex Williamson
2016-08-31 20:34     ` Michael S. Tsirkin
2016-10-24  7:56   ` Cao jin
2016-07-19  7:38 ` [Qemu-devel] [PATCH v9 11/11] vfio: add 'aer' property to expose aercap Zhou Jie
