From mboxrd@z Thu Jan  1 00:00:00 1970
From: Alex Williamson
Subject: Re: [PATCH v6] vfio error recovery: kernel support
Date: Wed, 5 Apr 2017 16:56:15 -0600
Message-ID: <20170405165615.448bcead@t450s.home>
References: <58DA6954.2000601@cn.fujitsu.com>
	<20170328101233.74f50a92@t450s.home>
	<20170329000148.GA18849@redhat.com>
	<20170328205513.21b97381@t450s.home>
	<20170330205823-mutt-send-email-mst@kernel.org>
	<20170330121652.2ac8fa62@t450s.home>
	<58E4B0C9.50109@cn.fujitsu.com>
	<20170405133822.76cda620@t450s.home>
	<20170406004845-mutt-send-email-mst@kernel.org>
	<20170405161910.26fee7d1@t450s.home>
	<20170406013534-mutt-send-email-mst@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: Cao jin, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	qemu-devel@nongnu.org, izumi.taku@jp.fujitsu.com
To: "Michael S. Tsirkin"
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:59834 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751673AbdDEW4S (ORCPT ); Wed, 5 Apr 2017 18:56:18 -0400
In-Reply-To: <20170406013534-mutt-send-email-mst@kernel.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Thu, 6 Apr 2017 01:36:31 +0300
"Michael S. Tsirkin" wrote:

> On Wed, Apr 05, 2017 at 04:19:10PM -0600, Alex Williamson wrote:
> > On Thu, 6 Apr 2017 00:50:22 +0300
> > "Michael S. Tsirkin" wrote:
> >
> > > On Wed, Apr 05, 2017 at 01:38:22PM -0600, Alex Williamson wrote:
> > > > The previous intention of trying to handle all sorts of AER faults
> > > > clearly had more value, though even there the implementation and
> > > > configuration requirements restricted the practicality. For instance
> > > > is AER support actually useful to a customer if it requires all ports
> > > > of a multifunction device assigned to the VM? This seems more like a
> > > > feature targeting whole system partitioning rather than general VM
> > > > device assignment use cases. Maybe that's ok, but it should be a clear
> > > > design decision.
> > >
> > > Alex, what kind of testing do you expect to be necessary?
> > > Would you say testing on real hardware and making it trigger
> > > AER errors is a requirement?
> >
> > Testing various fatal, non-fatal, and corrected errors with aer-inject,
> > especially in multifunction configurations (where more than one port
> > is actually usable) would certainly be required. If we have cases where
> > the driver for a companion function can escalate a non-fatal error to a
> > bus reset, that should be tested, even if it requires temporary hacks to
> > the host driver for the companion function to trigger that case. AER
> > handling is not something that the typical user is going to experience,
> > so it should be thoroughly tested to make sure it works when needed
> > or there's little point to doing it at all. Thanks,
> >
> > Alex
>
> Some things can be tested within a VM. What would you
> say would be sufficient on a VM and what has to be
> tested on bare metal?

Testing on a VM could be interesting for development, but I'd expect
bare metal for validation, no offense. Bus reset timing can be
different, error propagation can be different, etc. Thanks,

Alex