From: Pierre Morel <pmorel@linux.vnet.ibm.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, peter.maydell@linaro.org
Subject: Re: [Qemu-devel] [PATCH v3] vfio/common: Check iova with limit not with size
Date: Thu, 21 Jan 2016 14:15:32 +0100
Message-ID: <56A0D9F4.1060708@linux.vnet.ibm.com>
In-Reply-To: <1453304819.32741.277.camel@redhat.com>



On 01/20/2016 04:46 PM, Alex Williamson wrote:
> On Wed, 2016-01-20 at 16:14 +0100, Pierre Morel wrote:
>> On 01/12/2016 07:16 PM, Alex Williamson wrote:
>>> On Tue, 2016-01-12 at 16:11 +0100, Pierre Morel wrote:
>>>> In vfio_listener_region_add(), we try to validate that the region
>>>> is not zero sized and hasn't overflowed the address space.
>>>>
>>>> But the calculation uses the size of the region instead of
>>>> using the region's limit (size - 1).
>>>>
>>>> This leads to Int128 overflow when the region has
>>>> been initialized to UINT64_MAX because in this case
>>>> memory_region_init() transforms the size from UINT64_MAX
>>>> to int128_2_64().
>>>>
>>>> Let's really use the limit by subtracting one from the size,
>>>> and take care to pass the limit to functions that take a limit
>>>> and the size to functions that need a size.
>>>>
>>>> Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
>>>> ---
>>>>
>>>> Changes from v2:
>>>>       - all, just ignore v2, sorry about this,
>>>>         this is build after v1
>>>>
>>>> Changes from v1:
>>>>       - adjust the tests, knowing we already subtracted one from
>>>>         end.
>>>>
>>>>    hw/vfio/common.c |   14 +++++++-------
>>>>    1 files changed, 7 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>>> index 6797208..a5f6643 100644
>>>> --- a/hw/vfio/common.c
>>>> +++ b/hw/vfio/common.c
>>>> @@ -348,12 +348,12 @@ static void
>>>> vfio_listener_region_add(MemoryListener *listener,
>>>>        if (int128_ge(int128_make64(iova), llend)) {
>>>>            return;
>>>>        }
>>>> -    end = int128_get64(llend);
>>>> +    end = int128_get64(int128_sub(llend, int128_one()));
>>>>    
>>>> -    if ((iova < container->min_iova) || ((end - 1) > container-
>>>>> max_iova)) {
>>>> +    if ((iova < container->min_iova) || (end  > container-
>>>>> max_iova)) {
>>>>            error_report("vfio: IOMMU container %p can't map guest
>>>> IOVA
>>>> region"
>>>>                         " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
>>>> -                     container, iova, end - 1);
>>>> +                     container, iova, end);
>>>>            ret = -EFAULT;
>>>>            goto fail;
>>>>        }
>>>> @@ -363,7 +363,7 @@ static void
>>>> vfio_listener_region_add(MemoryListener *listener,
>>>>        if (memory_region_is_iommu(section->mr)) {
>>>>            VFIOGuestIOMMU *giommu;
>>>>    
>>>> -        trace_vfio_listener_region_add_iommu(iova, end - 1);
>>>> +        trace_vfio_listener_region_add_iommu(iova, end);
>>>>            /*
>>>>             * FIXME: We should do some checking to see if the
>>>>             * capabilities of the host VFIO IOMMU are adequate to
>>>> model
>>>> @@ -394,13 +394,13 @@ static void
>>>> vfio_listener_region_add(MemoryListener *listener,
>>>>                section->offset_within_region +
>>>>                (iova - section->offset_within_address_space);
>>>>    
>>>> -    trace_vfio_listener_region_add_ram(iova, end - 1, vaddr);
>>>> +    trace_vfio_listener_region_add_ram(iova, end, vaddr);
>>>>    
>>>> -    ret = vfio_dma_map(container, iova, end - iova, vaddr,
>>>> section-
>>>>> readonly);
>>>> +    ret = vfio_dma_map(container, iova, end - iova + 1, vaddr,
>>>> section->readonly);
>>>>        if (ret) {
>>>>            error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
>>>>                         "0x%"HWADDR_PRIx", %p) = %d (%m)",
>>>> -                     container, iova, end - iova, vaddr, ret);
>>>> +                     container, iova, end - iova + 1, vaddr,
>>>> ret);
>>>>            goto fail;
>>>>        }
>>>>    
>>> Hmm, did we just push the overflow from one place to another?  If we're
>>> mapping a full region of size int128_2_64() starting at iova zero, then
>>> this becomes (0xffff_ffff_ffff_ffff - 0 + 1) = 0.  So I think we need
>>> to calculate size with 128bit arithmetic too and let it assert if we
>>> overflow, ie:
>>>
>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>> index a5f6643..13ad90b 100644
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -321,7 +321,7 @@ static void
>>> vfio_listener_region_add(MemoryListener *listener,
>>>                                         MemoryRegionSection
>>> *section)
>>>    {
>>>        VFIOContainer *container = container_of(listener,
>>> VFIOContainer, listener);
>>> -    hwaddr iova, end;
>>> +    hwaddr iova, end, size;
>>>        Int128 llend;
>>>        void *vaddr;
>>>        int ret;
>>> @@ -348,7 +348,9 @@ static void
>>> vfio_listener_region_add(MemoryListener *listener,
>>>        if (int128_ge(int128_make64(iova), llend)) {
>>>            return;
>>>        }
>>> +
>>>        end = int128_get64(int128_sub(llend, int128_one()));
>>> +    size = int128_get64(int128_sub(llend, int128_make64(iova)));
>> here again, if iova is null, since llend is section->size (2^64) ...
>>
>>>    
>>>        if ((iova < container->min_iova) || (end  > container-
>>>> max_iova)) {
>>>            error_report("vfio: IOMMU container %p can't map guest
>>> IOVA region"
>>> @@ -396,11 +398,11 @@ static void
>>> vfio_listener_region_add(MemoryListener *listener,
>>>    
>>>        trace_vfio_listener_region_add_ram(iova, end, vaddr);
>>>    
>>> -    ret = vfio_dma_map(container, iova, end - iova + 1, vaddr,
>>> section->readonly);
>>> +    ret = vfio_dma_map(container, iova, size, vaddr, section-
>>>> readonly);
>>>        if (ret) {
>>>            error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
>>>                         "0x%"HWADDR_PRIx", %p) = %d (%m)",
>>> -                     container, iova, end - iova + 1, vaddr, ret);
>>> +                     container, iova, size, vaddr, ret);
>>>            goto fail;
>>>        }
>>>    
>>>
>>> Does that still solve your scenario?  Perhaps vfio-iommu-type1 should
>>> have used first/last rather than start/size for mapping since we seem
>>> to have an off-by-one for mapping a full 64bit space.  Seems like we
>>> could do it with two calls to vfio_dma_map if we really wanted to.
>>> Thanks,
>>>
>>> Alex
>>>
>> You are right, every attempt to solve this will push the overflow
>> somewhere else.
>>
>> There is just no way to express 2^64 with 64 bits. We have the int128()
>> solution, but if we solve it here, we hit the same problem in the Linux
>> ioctl call anyway.
>>
>> Intuitively, making two calls does not seem right to me.
>>
>> But what do you think of something like:
>>
>> - creating a new VFIO extension
>>
>> - and in ioctl(), since we have a flags field in
>> vfio_iommu_type1_dma_map, maybe adding a new flag meaning
>> "map all virtual memory", or meaning "use first/last"?
>> I think this would break existing code unless we add a new VFIO
>> extension.
> Back up, is there ever a case where we actually need to map the entire
> 64bit address space?  This is fairly well impossible on x86.  I'm
> pointing out an issue, but I don't know that we need to solve it with
> more than an assert since it's never likely to happen.  Thanks,
>
> Alex
>

If I understood correctly, IOVA is the IO virtual address, so it is
possible to map the virtual address page 0xffff_ffff_ffff_f000
to something reasonable inside the real memory.
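
To make the arithmetic concrete, here is a minimal sketch in plain C.
It assumes nothing from QEMU or the kernel: range_size64() is a
hypothetical helper written only for this illustration. It shows that
the last page alone has a representable size (0x1000), while the size
of the full 64-bit space wraps to 0 in a 64-bit variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU or VFIO function): size of the
 * inclusive range [first, last], computed with 64-bit arithmetic.
 * *wrapped is set when the true size is 2^64, which does not fit
 * in a uint64_t and therefore shows up as 0.
 */
static uint64_t range_size64(uint64_t first, uint64_t last, bool *wrapped)
{
    assert(first <= last);

    /* last - first is size - 1 and always fits; the + 1 may wrap. */
    uint64_t size = last - first + 1;

    /* Only [0, UINT64_MAX] produces a size of 0 here. */
    *wrapped = (size == 0);
    return size;
}
```

With first = 0xfffffffffffff000 and last = UINT64_MAX the helper
returns 0x1000 without wrapping. With first = 0 and last = UINT64_MAX
it returns 0 and reports the wrap, which is the case discussed above.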

Possibly we do not need to map the last virtual page, but
I think that in the general case the whole virtual memory, as viewed by the
device through the IOMMU, should be mapped to avoid any uninitialized
virtual memory access.

It is the same reason that makes us map the whole virtual memory for the
CPU MMU.

Maybe I missed something, or maybe I worry too much,
but I see this as a restriction on the supported hardware
if we compare host and guest hardware support compatibility.

We can live with it, because in fact you are right and today
I am not aware of any hardware wanting to access this page. But a
hardware designer who knows there is an IOMMU may want to access exactly
this kind of strange virtual page for special features, and this would work
on the host but not inside the guest.

Anyway, we will find a trick, but I think this is the place to make it right.

Thank you, and tell me if I missed something.
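
For reference, one such trick is the two-call idea raised earlier in
this thread. The sketch below is only an illustration under stated
assumptions: struct chunk and split_inclusive() are hypothetical names,
not QEMU or VFIO code. It turns an inclusive first/last range into one
or two start/size pairs that each fit in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical start/size pair, as a start/size mapping call would take. */
struct chunk {
    uint64_t iova;
    uint64_t size;
};

/*
 * Hypothetical helper: split the inclusive range [first, last] into
 * start/size chunks whose sizes are representable in 64 bits.
 * Returns the number of chunks written to out (1 or 2).
 */
static int split_inclusive(uint64_t first, uint64_t last, struct chunk out[2])
{
    assert(first <= last);

    uint64_t span = last - first;      /* size - 1, never overflows */

    if (span != UINT64_MAX) {
        out[0].iova = first;
        out[0].size = span + 1;        /* fits: at most 2^64 - 1 */
        return 1;
    }

    /* Full 2^64 space: two halves of 2^63 bytes each. */
    uint64_t half = UINT64_C(1) << 63;
    out[0].iova = 0;
    out[0].size = half;
    out[1].iova = half;
    out[1].size = half;
    return 2;
}
```

Each chunk could then be handed to a start/size mapping call. Whether
two calls are acceptable is the open question here; the sketch only
shows that the split itself needs no 128-bit arithmetic.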

Pierre
