Date: Tue, 1 Dec 2015 13:23:46 +1100
From: David Gibson
To: Alex Williamson
Cc: aik@ozlabs.ru, mdroth@linux.vnet.ibm.com, gwshan@au1.ibm.com,
    qemu-devel@nongnu.org, qemu-ppc@nongnu.org
Subject: Re: [Qemu-devel] [RFC 01/12] vfio: Start improving VFIO/EEH interface
Message-ID: <20151201022346.GJ31343@voom.redhat.com>
In-Reply-To: <1448315891.20382.261.camel@redhat.com>
References: <1447907368-9208-1-git-send-email-david@gibson.dropbear.id.au>
    <1447907368-9208-2-git-send-email-david@gibson.dropbear.id.au>
    <1448315891.20382.261.camel@redhat.com>

On Mon, Nov 23, 2015 at 02:58:11PM -0700, Alex Williamson wrote:
> On Thu, 2015-11-19 at 15:29 +1100, David Gibson wrote:
> > At present the code handling IBM's Enhanced Error Handling (EEH) interface
> > on VFIO devices operates by bypassing the usual VFIO logic with
> > vfio_container_ioctl().  That's a poorly designed interface with unclear
> > semantics about exactly what can be operated on.
> >
> > In particular it operates on a single vfio container internally (hence the
> > name), but takes an address space and group id, from which it deduces the
> > container in a rather roundabout way.
> > groupids are something that code
> > outside vfio shouldn't even be aware of.
> >
> > This patch creates new interfaces for EEH operations.  Internally we
> > have vfio_eeh_container_op() which takes a VFIOContainer object
> > directly.  For external use we have vfio_eeh_as_ok() which determines
> > if an AddressSpace is usable for EEH (at present this means it has a
> > single container and at most a single group attached), and
> > vfio_eeh_as_op() which will perform an operation on an AddressSpace in
> > the unambiguous case, and otherwise returns an error.
> >
> > This interface still isn't great, but it's enough of an improvement to
> > allow a number of cleanups in other places.
> >
> > Signed-off-by: David Gibson
> > ---
> >  hw/vfio/common.c       | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/hw/vfio/vfio.h |  2 ++
> >  2 files changed, 79 insertions(+)
> >
> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index 6797208..4733625 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -1002,3 +1002,80 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
> >
> >      return vfio_container_do_ioctl(as, groupid, req, param);
> >  }
> > +
> > +/*
> > + * Interfaces for IBM EEH (Enhanced Error Handling)
> > + */
> > +static bool vfio_eeh_container_ok(VFIOContainer *container)
> > +{
> > +    /* A broken kernel implementation means EEH operations won't work
> > +     * correctly if there are multiple groups in a container */
> > +
> > +    if (!QLIST_EMPTY(&container->group_list)
> > +        && QLIST_NEXT(QLIST_FIRST(&container->group_list), container_next)) {
> > +        return false;
> > +    }
> > +
> > +    return true;
> > +}
>
> Seems like there are ways to make this a non-eeh specific function,
> vfio_container_group_count(), vfio_container_group_empty_or_singleton(),
> etc.

I guess, but I don't know of anything else that needs to know, so is
there a point?

Actually..
I could do with another opinion here: so, logically EEH
operations should be possible on a container basis - the kernel
interface correctly reflects that (my previous comments that the
interface was broken were mistaken).  The current kernel
implementation *is* broken (and is non-trivial to fix) which is what
this test is about.

But is a probably broken kernel state something that we ought to be
checking for in qemu?  As it stands, when the kernel is fixed we'll
need a new capability so that qemu can know to disable this test.
Should we instead just proceed with any container and just advise
people not to attach multiple groups until the kernel is fixed?

A relevant point here might be that while I haven't implemented it so
far, I think it will be possible to work around the broken kernel with
full functionality by forcing each group into a separate container and
using one of a couple of possible different methods to handle EEH
functionality across multiple containers on a vPHB.

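For reference, the non-EEH-specific helper suggested above could look
roughly like the sketch below.  It is only an illustration: the struct
names are hypothetical stand-ins rather than the real VFIOContainer and
VFIOGroup definitions, and a plain pointer chain is assumed in place of
the QLIST macros.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the QEMU types; not the real definitions. */
struct group {
    struct group *next;    /* next group in the same container, or NULL */
};

struct container {
    struct group *groups;  /* head of the group list, NULL when empty */
};

/* Returns true when the container holds zero or one group, which is the
 * condition the quoted vfio_eeh_container_ok() tests for. */
static bool container_group_empty_or_singleton(const struct container *c)
{
    return c->groups == NULL || c->groups->next == NULL;
}
```

A helper shaped like this carries no EEH policy of its own, so the
decision about whether multiple groups are acceptable would stay with
the caller.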
> > +
> > +static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op)
> > +{
> > +    struct vfio_eeh_pe_op pe_op = {
> > +        .argsz = sizeof(pe_op),
> > +        .op = op,
> > +    };
> > +    int ret;
> > +
> > +    if (!vfio_eeh_container_ok(container)) {
> > +        error_report("vfio/eeh: EEH_PE_OP 0x%x called on container"
> > +                     " with multiple groups", op);
> > +        return -EPERM;
> > +    }
> > +
> > +    ret = ioctl(container->fd, VFIO_EEH_PE_OP, &pe_op);
> > +    if (ret < 0) {
> > +        error_report("vfio/eeh: EEH_PE_OP 0x%x failed: %m", op);
> > +        return -errno;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +static VFIOContainer *vfio_eeh_as_container(AddressSpace *as)
> > +{
> > +    VFIOAddressSpace *space = vfio_get_address_space(as);
> > +    VFIOContainer *container = NULL;
> > +
> > +    if (QLIST_EMPTY(&space->containers)) {
> > +        /* No containers to act on */
> > +        goto out;
> > +    }
> > +
> > +    container = QLIST_FIRST(&space->containers);
> > +
> > +    if (QLIST_NEXT(container, next)) {
> > +        /* We don't yet have logic to synchronize EEH state across
> > +         * multiple containers */
> > +        container = NULL;
> > +        goto out;
> > +    }
> > +
> > +out:
> > +    vfio_put_address_space(space);
> > +    return container;
> > +}
>
>
> Here too, vfio_container_from_as()

Ok, I'll make that change.

> Overall the series looks good to me, nice cleanup both in power and vfio
> code.
> Thanks,
>
> Alex
>
>
> > +
> > +bool vfio_eeh_as_ok(AddressSpace *as)
> > +{
> > +    VFIOContainer *container = vfio_eeh_as_container(as);
> > +
> > +    return (container != NULL) && vfio_eeh_container_ok(container);
> > +}
> > +
> > +int vfio_eeh_as_op(AddressSpace *as, uint32_t op)
> > +{
> > +    VFIOContainer *container = vfio_eeh_as_container(as);
> > +
> > +    return vfio_eeh_container_op(container, op);
> > +}
> > diff --git a/include/hw/vfio/vfio.h b/include/hw/vfio/vfio.h
> > index 0b26cd8..fd3933b 100644
> > --- a/include/hw/vfio/vfio.h
> > +++ b/include/hw/vfio/vfio.h
> > @@ -5,5 +5,7 @@
> >
> >  extern int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
> >                                  int req, void *param);
> > +bool vfio_eeh_as_ok(AddressSpace *as);
> > +int vfio_eeh_as_op(AddressSpace *as, uint32_t op);
> >
> >  #endif
>
>
>

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson