From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:39953)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1cvtGq-0004f7-8e for qemu-devel@nongnu.org;
    Wed, 05 Apr 2017 18:19:21 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1cvtGl-00087R-Bg for qemu-devel@nongnu.org;
    Wed, 05 Apr 2017 18:19:20 -0400
Received: from mx1.redhat.com ([209.132.183.28]:33718)
    by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
    (Exim 4.71) (envelope-from ) id 1cvtGl-00087L-5P
    for qemu-devel@nongnu.org; Wed, 05 Apr 2017 18:19:15 -0400
Date: Wed, 5 Apr 2017 16:19:10 -0600
From: Alex Williamson
Message-ID: <20170405161910.26fee7d1@t450s.home>
In-Reply-To: <20170406004845-mutt-send-email-mst@kernel.org>
References: <1490260051-6046-1-git-send-email-caoj.fnst@cn.fujitsu.com>
    <20170324161238.366ce6a7@t450s.home>
    <58DA6954.2000601@cn.fujitsu.com>
    <20170328101233.74f50a92@t450s.home>
    <20170329000148.GA18849@redhat.com>
    <20170328205513.21b97381@t450s.home>
    <20170330205823-mutt-send-email-mst@kernel.org>
    <20170330121652.2ac8fa62@t450s.home>
    <58E4B0C9.50109@cn.fujitsu.com>
    <20170405133822.76cda620@t450s.home>
    <20170406004845-mutt-send-email-mst@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v6] vfio error recovery: kernel support
To: "Michael S. Tsirkin"
Cc: Cao jin , linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    qemu-devel@nongnu.org, izumi.taku@jp.fujitsu.com

On Thu, 6 Apr 2017 00:50:22 +0300 "Michael S. Tsirkin" wrote:

> On Wed, Apr 05, 2017 at 01:38:22PM -0600, Alex Williamson wrote:
> > The previous intention of trying to handle all sorts of AER faults
> > clearly had more value, though even there the implementation and
> > configuration requirements restricted the practicality.
> > For instance is AER support actually useful to a customer if it
> > requires all ports of a multifunction device assigned to the VM?
> > This seems more like a feature targeting whole system partitioning
> > rather than general VM device assignment use cases. Maybe that's
> > ok, but it should be a clear design decision.
>
> Alex, what kind of testing do you expect to be necessary?
> Would you say testing on real hardware and making it trigger
> AER errors is a requirement?

Testing various fatal, non-fatal, and corrected errors with
aer-inject, especially in multifunction configurations (where more
than one port is actually usable) would certainly be required. If we
have cases where the driver for a companion function can escalate a
non-fatal error to a bus reset, that should be tested, even if it
requires temporary hacks to the host driver for the companion function
to trigger that case. AER handling is not something that the typical
user is going to experience, so it should be thoroughly tested to
make sure it works when needed or there's little point to doing it at
all. Thanks,

Alex