From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alex Williamson Subject: Re: VFIO no-iommu Date: Fri, 18 Dec 2015 07:38:40 -0700 Message-ID: <1450449520.2674.162.camel@redhat.com> References: <566B4A50.9090607@6wind.com> <1449874953.20509.6.camel@redhat.com> <26FA93C7ED1EAA44AB77D62FBE1D27BA6747CE55@IRSMSX108.ger.corp.intel.com> <1450198398.6042.32.camel@redhat.com> <20151216040408.GA18363@sivlogin002.ir.intel.com> <1450240711.2674.11.camel@redhat.com> <1450285912.2674.22.camel@redhat.com> <1450370639.2674.93.camel@redhat.com> <20151218104310.GA11371@sivlogin002.ir.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Cc: "dev@dpdk.org" To: "Yigit, Ferruh" Return-path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 7F49A7E23 for ; Fri, 18 Dec 2015 15:38:42 +0100 (CET) In-Reply-To: <20151218104310.GA11371@sivlogin002.ir.intel.com> List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Fri, 2015-12-18 at 10:43 +0000, Yigit, Ferruh wrote: > On Thu, Dec 17, 2015 at 09:43:59AM -0700, Alex Williamson wrote: > <...> > > > > > > >=20 > > > > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, > > > > > > > because in > > > > > > > vfio > > > > > > > module, > > > > > > > container->noiommu is not set before doing a > > > > > > > vfio_group_set_container() > > > > > > > and vfio_for_each_iommu_driver selects wrong driver. > > > > > >=20 > > > > > > Running CHECK_EXTENSION on a container without the group > > > > > > attached is > > > > > > only going to tell you what extensions vfio is capable of, > > > > > > not > > > > > > necessarily what extensions are available to you with that > > > > > > group. > > > > > > Is this just a general dpdk- vfio ordering bug? > > > > >=20 > > > > > Yes, that is how VFIO was implemented in DPDK. I was under > > > > > the > > > > > impression that checking extension before assigning devices > > > > > was > > > > > the > > > > > correct way to do things, so as to not to try anything we > > > > > know > > > > > would > > > > > fail anyway. Does this imply that CHECK_EXTENSION needs to be > > > > > called > > > > > on both container and groups (or just on groups)? > > > >=20 > > > > Hmm, in Documentation/vfio.txt we do give the following > > > > algorithm: > > > >=20 > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (ioctl(contain= er, VFIO_GET_API_VERSION) !=3D > > > > VFIO_API_VERSION) > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* Unknown API version */ > > > >=20 > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (!ioctl(contai= ner, VFIO_CHECK_EXTENSION, > > > > VFIO_TYPE1_IOMMU)) > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* Doesn't support the IOMMU driver we want= . */ > > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0... > > > >=20 > > > > That's just going to query each iommu driver and we can't yet > > > > say > > > > whether > > > > the group the user attaches to the container later will > > > > actually > > > > support that > > > > extension until we try to do it, that would come at > > > > VFIO_SET_IOMMU. > > > > =C2=A0So is > > > > it perhaps a vfio bug that we're not advertising no-iommu until > > > > the > > > > group is > > > > attached? =C2=A0After all, we are capable of it with just an empt= y > > > > container, just > > > > like we are with type1, but we're going to fail SET_IOMMU for > > > > the > > > > wrong > > > > combination. > > > > =C2=A0This is exactly the sort of thing that makes me glad we > > > > reverted > > > > it without > > > > feedback from a working user driver. =C2=A0Thanks, > > >=20 > > > Whether it should be considered a "bug" in VFIO or "by design" is > > > up > > > to you, of course, but at least according to the VFIO > > > documentation, > > > we are meant to check for type 1 extension and then attach > > > devices, > > > so it would be expected to get VFIO_NOIOMMU_IOMMU marked as > > > supported > > > even without any devices attached to the container (just like we > > > get > > > type 1 as supported without any devices attached). Having said > > > that, > > > if it was meant to attach devices first and then check the > > > extensions, then perhaps the documentation should also point out > > > that > > > fact (or perhaps I missed that detail in my readings of the docs, > > > in > > > which case my apologies). > >=20 > > Hi Anatoly, > >=20 > > Does the below patch make it behave more like you'd expect. =C2=A0Thi= s > > applies to v4.4-rc4, I'd fold this into the base patch if we > > reincorporate it to a future kernel. =C2=A0Thanks, > >=20 > > Alex > >=20 > > commit 88d4dcb6b77624965f0b45b5cd305a2b4a105c94 > > Author: Alex Williamson > > Date:=C2=A0=C2=A0=C2=A0Wed Dec 16 19:02:01 2015 -0700 > >=20 > > =C2=A0=C2=A0=C2=A0=C2=A0vfio: Fix no-iommu CHECK_EXTENSION > > =C2=A0=C2=A0=C2=A0=C2=A0 > > =C2=A0=C2=A0=C2=A0=C2=A0Previously the no-iommu iommu driver was only= visible when the > > =C2=A0=C2=A0=C2=A0=C2=A0container had an attached no-iommu group.=C2=A0= =C2=A0This means that > > =C2=A0=C2=A0=C2=A0=C2=A0CHECK_EXTENSION on and empty container couldn= 't report the > > possibility > > =C2=A0=C2=A0=C2=A0=C2=A0of using VFIO_NOIOMMU_IOMMU.=C2=A0=C2=A0We re= port TYPE1 whether or not > > the user > > =C2=A0=C2=A0=C2=A0=C2=A0can make use of it with the group, so this is > > inconsistent.=C2=A0=C2=A0Add the > > =C2=A0=C2=A0=C2=A0=C2=A0no-iommu iommu to the list of iommu drivers w= hen enabled via > > module > > =C2=A0=C2=A0=C2=A0=C2=A0option, but skip all the others if the contai= ner is attached to > > a > > =C2=A0=C2=A0=C2=A0=C2=A0no-iommu groups.=C2=A0=C2=A0Note that taintin= g is now done with the > > "unsafe" > > =C2=A0=C2=A0=C2=A0=C2=A0module callback rather than explictly within = vfio. > > =C2=A0=C2=A0=C2=A0=C2=A0 > > =C2=A0=C2=A0=C2=A0=C2=A0Also fixes module option and module descripti= on name > > inconsistency. > > =C2=A0=C2=A0=C2=A0=C2=A0 > > =C2=A0=C2=A0=C2=A0=C2=A0Also make vfio_noiommu_ops const. > > =C2=A0=C2=A0=C2=A0=C2=A0 > > =C2=A0=C2=A0=C2=A0=C2=A0Signed-off-by: Alex Williamson >=20 > Hi Alex, >=20 > I got following crash with this update: >=20 > [=C2=A0=C2=A0+0.046973] BUG: unable to handle kernel NULL pointer deref= erence > at=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0(nu= ll) > [=C2=A0=C2=A0+0.000031] IP: [] __list_add+V0x1f/0xc0 > [=C2=A0=C2=A0+0.000024] PGD 0=20 > [=C2=A0=C2=A0+0.000010] Oops: 0000 [#1] SMP=20 > ... > [=C2=A0=C2=A0+0.000028] Call Trace: > [=C2=A0=C2=A0+0.000014]=C2=A0=C2=A0[] __mutex_lock_sl= owpath+0x91/0x110 > [=C2=A0=C2=A0+0.000022]=C2=A0=C2=A0[] mutex_lock+0x23= /0x40 > [=C2=A0=C2=A0+0.000020]=C2=A0=C2=A0[] > vfio_register_iommu_driver+0x40/0xc0 [vfio] > [=C2=A0=C2=A0+0.000025]=C2=A0=C2=A0[] noiommu_param_s= et+0x51/0x60 > [vfio] > [=C2=A0=C2=A0+0.000022]=C2=A0=C2=A0[] parse_args+0x1b= e/0x4a0 > [=C2=A0=C2=A0+0.000021]=C2=A0=C2=A0[] load_module+0xe= 30/0x2730 > [=C2=A0=C2=A0+0.000942]=C2=A0=C2=A0[] ? __symbol_put+= 0x60/0x60 > [=C2=A0=C2=A0+0.000921]=C2=A0=C2=A0[] ? kernel_read+0= x50/0x80 > [=C2=A0=C2=A0+0.000917]=C2=A0=C2=A0[] SyS_finit_modul= e+0xb9/0xf0 > [=C2=A0=C2=A0+0.000925]=C2=A0=C2=A0[] > entry_SYSCALL_64_fastpath+0x12/0x71 >=20 > > -static struct vfio_iommu_driver vfio_noiommu_driver =3D { > > - .ops =3D &vfio_noiommu_ops, > > +static int noiommu_param_set(const char *val, const struct > > kernel_param *kp) > > +{ > > + int ret; > > + > > + if (!val) > > + val =3D "1"; > > + > > + ret =3D strtobool(val, kp->arg); > > + if (ret) > > + return ret; > > + > > + if (noiommu) > > + ret =3D > > vfio_register_iommu_driver(&vfio_noiommu_ops); >=20 > Is it possible that kernel_param_ops->set() called before > module_init() that initializes the mutex. > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/ > kernel/module.c?id=3Drefs/tags/v4.4-rc5#n3517 Yes, guess I didn't think that one through very well. =C2=A0I'll try agai= n. =C2=A0Thank you for testing, sorry it was a dud. =C2=A0Thanks, Alex