Message-ID: <1453304819.32741.277.camel@redhat.com>
From: Alex Williamson
Date: Wed, 20 Jan 2016 08:46:59 -0700
In-Reply-To: <569FA454.6050409@linux.vnet.ibm.com>
References: <1452611505-25478-1-git-send-email-pmorel@linux.vnet.ibm.com> <1452622595.9674.19.camel@redhat.com> <569FA454.6050409@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [PATCH v3] vfio/common: Check iova with limit not with size
To: Pierre Morel
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, peter.maydell@linaro.org

On Wed, 2016-01-20 at 16:14 +0100, Pierre Morel wrote:
> 
> On 01/12/2016 07:16 PM, Alex Williamson wrote:
> > On Tue, 2016-01-12 at 16:11 +0100, Pierre Morel wrote:
> > > In vfio_listener_region_add(), we try to validate that the region
> > > is not zero sized and hasn't overflowed the address space.
> > > 
> > > But the calculation uses the size of the region instead of
> > > using the region's limit (size - 1).
> > > 
> > > This leads to Int128 overflow when the region has
> > > been initialized to UINT64_MAX because in this case
> > > memory_region_init() transforms the size from UINT64_MAX
> > > to int128_2_64().
> > > 
> > > Let's really use the limit by subtracting one from the size
> > > and take care to use the limit for functions using limit
> > > and size to call functions which need size.
> > > 
> > > Signed-off-by: Pierre Morel
> > > ---
> > > 
> > > Changes from v2:
> > >     - all, just ignore v2, sorry about this,
> > >       this is built on v1
> > > 
> > > Changes from v1:
> > >     - adjust the tests by knowing we already subtracted one from end.
> > > 
> > >  hw/vfio/common.c |   14 +++++++-------
> > >  1 files changed, 7 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > > index 6797208..a5f6643 100644
> > > --- a/hw/vfio/common.c
> > > +++ b/hw/vfio/common.c
> > > @@ -348,12 +348,12 @@ static void vfio_listener_region_add(MemoryListener *listener,
> > >      if (int128_ge(int128_make64(iova), llend)) {
> > >          return;
> > >      }
> > > -    end = int128_get64(llend);
> > > +    end = int128_get64(int128_sub(llend, int128_one()));
> > >  
> > > -    if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) {
> > > +    if ((iova < container->min_iova) || (end  > container->max_iova)) {
> > >          error_report("vfio: IOMMU container %p can't map guest IOVA region"
> > >                       " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
> > > -                     container, iova, end - 1);
> > > +                     container, iova, end);
> > >          ret = -EFAULT;
> > >          goto fail;
> > >      }
> > > @@ -363,7 +363,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
> > >      if (memory_region_is_iommu(section->mr)) {
> > >          VFIOGuestIOMMU *giommu;
> > >  
> > > -        trace_vfio_listener_region_add_iommu(iova, end - 1);
> > > +        trace_vfio_listener_region_add_iommu(iova, end);
> > >          /*
> > >           * FIXME: We should do some checking to see if the
> > >           * capabilities of the host VFIO IOMMU are adequate to model
> > > @@ -394,13 +394,13 @@ static void vfio_listener_region_add(MemoryListener *listener,
> > >              section->offset_within_region +
> > >              (iova - section->offset_within_address_space);
> > >  
> > > -    trace_vfio_listener_region_add_ram(iova, end - 1, vaddr);
> > > +    trace_vfio_listener_region_add_ram(iova, end, vaddr);
> > >  
> > > -    ret = vfio_dma_map(container, iova, end - iova, vaddr, section->readonly);
> > > +    ret = vfio_dma_map(container, iova, end - iova + 1, vaddr, section->readonly);
> > >      if (ret) {
> > >          error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
> > >                       "0x%"HWADDR_PRIx", %p) = %d (%m)",
> > > -                     container, iova, end - iova, vaddr, ret);
> > > +                     container, iova, end - iova + 1, vaddr, ret);
> > >          goto fail;
> > >      }
> > >  
> > Hmm, did we just push the overflow from one place to another?  If we're
> > mapping a full region of size int128_2_64() starting at iova zero, then
> > this becomes (0xffff_ffff_ffff_ffff - 0 + 1) = 0.  So I think we need
> > to calculate size with 128bit arithmetic too and let it assert if we
> > overflow, ie:
> > 
> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index a5f6643..13ad90b 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -321,7 +321,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
> >                                       MemoryRegionSection *section)
> >  {
> >      VFIOContainer *container = container_of(listener, VFIOContainer, listener);
> > -    hwaddr iova, end;
> > +    hwaddr iova, end, size;
> >      Int128 llend;
> >      void *vaddr;
> >      int ret;
> > @@ -348,7 +348,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
> >      if (int128_ge(int128_make64(iova), llend)) {
> >          return;
> >      }
> > +
> >      end = int128_get64(int128_sub(llend, int128_one()));
> > +    size = int128_get64(int128_sub(llend, int128_make64(iova)));
> 
> here again, if iova is zero, since llend is section->size (2^64) ...
> 
> >  
> >      if ((iova < container->min_iova) || (end  > container->max_iova)) {
> >          error_report("vfio: IOMMU container %p can't map guest IOVA region"
> > @@ -396,11 +398,11 @@ static void vfio_listener_region_add(MemoryListener *listener,
> >  
> >      trace_vfio_listener_region_add_ram(iova, end, vaddr);
> >  
> > -    ret = vfio_dma_map(container, iova, end - iova + 1, vaddr, section->readonly);
> > +    ret = vfio_dma_map(container, iova, size, vaddr, section->readonly);
> >      if (ret) {
> >          error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
> >                       "0x%"HWADDR_PRIx", %p) = %d (%m)",
> > -                     container, iova, end - iova + 1, vaddr, ret);
> > +                     container, iova, size, vaddr, ret);
> >          goto fail;
> >      }
> >  
> > 
> > Does that still solve your scenario?  Perhaps vfio-iommu-type1 should
> > have used first/last rather than start/size for mapping since we seem
> > to have an off-by-one for mapping a full 64bit space.  Seems like we
> > could do it with two calls to vfio_dma_map if we really wanted to.
> > Thanks,
> > 
> > Alex
> > 
> 
> You are right, every attempt to solve this will push the overflow
> somewhere else.
> 
> There is just no way to express 2^64 with 64 bits. We have the
> int128() solution, but if we solve it here, we hit the same problem
> in the Linux ioctl call anyway.
> 
> Intuitively, making two calls does not seem right to me.
> 
> But, what do you think of something like:
> 
> - creating a new VFIO extension
> 
> - and in ioctl(), since we have a flag entry in the
>   vfio_iommu_type1_dma_map, maybe adding a new flag meaning
>   "map all virtual memory"? or meaning "use first/last"?
> 
> I think this would break existing code unless we add a new VFIO
> extension.

Back up, is there ever a case where we actually need to map the entire
64bit address space?  This is fairly well impossible on x86.  I'm
pointing out an issue, but I don't know that we need to solve it with
more than an assert since it's never likely to happen.  Thanks,

Alex
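
For readers following the arithmetic in this thread, here is a minimal standalone C sketch of the wrap. It is not QEMU code: size_wrapping() and size_checked() are hypothetical names introduced only for illustration, and plain uint64_t stands in for hwaddr. The first function mirrors the "end - iova + 1" expression from the v3 patch; the second shows one possible way to refuse a 2^64-byte range instead of silently producing a size of 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Mirrors the "end - iova + 1" expression from the v3 patch, where
 * 'end' is the inclusive last address of the range.  Unsigned 64-bit
 * arithmetic wraps modulo 2^64, so a full 64-bit range yields 0.
 */
static uint64_t size_wrapping(uint64_t iova, uint64_t end)
{
    return end - iova + 1;
}

/*
 * Hypothetical checked variant (an assumption, not the fix that was
 * merged): report failure when the byte count of [iova, end] would be
 * 2^64, which a uint64_t cannot hold.
 */
static bool size_checked(uint64_t iova, uint64_t end, uint64_t *size)
{
    assert(end >= iova);            /* caller passes an ordered range */

    if (end - iova == UINT64_MAX) {
        return false;               /* span + 1 would wrap to 0 */
    }
    *size = end - iova + 1;
    return true;
}
```

With iova = 0 and end = UINT64_MAX, size_wrapping() returns 0, which is the case called out above, while size_checked() returns false so the caller can assert or bail out. For an ordinary range such as 0x1000..0x1fff both produce 0x1000.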