From: Maik Broemme <mbroemme@parallels.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Multi GPU passthrough via VFIO
Date: Fri, 7 Feb 2014 21:17:34 +0100
Message-ID: <20140207201734.GR995@parallels.com>
In-Reply-To: <1391800246.6959.280.camel@bling.home>

Hi Alex,

Alex Williamson <alex.williamson@redhat.com> wrote:
> On Fri, 2014-02-07 at 01:22 +0100, Maik Broemme wrote:
> > Interesting is the diff between 1st and 2nd boot, so if I do the lspci
> > prior to the booting. The only difference between 1st start and 2nd
> > start are:
> > 
> > --- 001-lspci.290x.before.1st.log	2014-02-07 01:13:41.498827928 +0100
> > +++ 004-lspci.290x.before.2nd.log	2014-02-07 01:16:50.966611282 +0100
> > @@ -24,7 +24,7 @@
> >  			ClockPM- Surprise- LLActRep- BwNot-
> >  		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
> >  			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > -		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > +		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> >  		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
> >  		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> >  		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> > @@ -33,13 +33,13 @@
> >  		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> >  			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> >  	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> > -		Address: 0000000000000000  Data: 0000
> > +		Address: 00000000fee00000  Data: 0000
> >  	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
> >  	Capabilities: [150 v2] Advanced Error Reporting
> >  		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >  		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >  		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > -		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> > +		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> >  		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> >  		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> >  	Capabilities: [270 v1] #19
> > 
> > After that if I do suspend-to-ram / resume trick I have again lspci
> > output from before 1st boot.
> 
> The Link Status change after X is stopped seems the most interesting to
> me.  The MSI change is probably explained by the MSI save/restore of the
> device, but should be harmless since MSI is disabled.  I'm a bit
> surprised the Correctable Error Status in the AER capability didn't get
> cleared.  I would have thought that a bus reset would have caused the
> link to retrain back to the original speed/width as well.  Let's check
> that we're actually getting a bus reset, try this in addition to the
> previous qemu patch.  This just enables debug logging for the bus reset
> function.  Thanks,
> 

Below is the debug output from two boots. In each one I passed through
VGA, loaded fglrx and started X. (On the 2nd boot X got killed and the
oops happened.)

- 1st boot:

vfio: vfio_pci_hot_reset(0000:01:00.1) multi
vfio: 0000:01:00.1: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: 	0000:01:00.1 group 1
vfio: 0000:01:00.1 hot reset: Success
vfio: vfio_pci_hot_reset(0000:01:00.1) one
vfio: 0000:01:00.1: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: vfio: found another in-use device 0000:01:00.0
vfio: vfio_pci_hot_reset(0000:01:00.0) one
vfio: 0000:01:00.0: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: 	0000:01:00.1 group 1
vfio: vfio: found another in-use device 0000:01:00.1

- 2nd boot:

vfio: vfio_pci_hot_reset(0000:01:00.1) multi
vfio: 0000:01:00.1: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: 	0000:01:00.1 group 1
vfio: 0000:01:00.1 hot reset: Success
vfio: vfio_pci_hot_reset(0000:01:00.1) one
vfio: 0000:01:00.1: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: vfio: found another in-use device 0000:01:00.0
vfio: vfio_pci_hot_reset(0000:01:00.0) one
vfio: 0000:01:00.0: hot reset dependent devices:
vfio: 	0000:01:00.0 group 1
vfio: 	0000:01:00.1 group 1
vfio: vfio: found another in-use device 0000:01:00.1

> Alex
> 
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index 8db182f..7fec259 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -2927,6 +2927,10 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress *hos
>              host1->slot == host2->slot && host1->function == host2->function);
>  }
>  
> +#undef DPRINTF
> +#define DPRINTF(fmt, ...) \
> +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> +
>  static int vfio_pci_hot_reset(VFIODevice *vdev, bool single)
>  {
>      VFIOGroup *group;
> @@ -3104,6 +3108,15 @@ out_single:
>      return ret;
>  }
>  
> +#undef DPRINTF
> +#ifdef DEBUG_VFIO
> +#define DPRINTF(fmt, ...) \
> +    do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> +#else
> +#define DPRINTF(fmt, ...) \
> +    do { } while (0)
> +#endif
> +
>  /*
>   * We want to differentiate hot reset of mulitple in-use devices vs hot reset
>   * of a single in-use device.  VFIO_DEVICE_RESET will already handle the case
> 
> 

--Maik