* [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
@ 2019-07-17 8:42 David Hildenbrand
2019-07-17 9:57 ` Michael S. Tsirkin
2019-07-17 11:10 ` Michael S. Tsirkin
0 siblings, 2 replies; 11+ messages in thread
From: David Hildenbrand @ 2019-07-17 8:42 UTC (permalink / raw)
To: qemu-devel
Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
Stefan Hajnoczi, Igor Mammedov, David Gibson
We are using the wrong functions to set/clear bits, effectively touching
multiple bits, writing out of range of the bitmap, resulting in memory
corruptions. We have to use set_bit()/clear_bit() instead.
Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
inflating the balloon. QEMU crashes. This never could have worked
properly - especially, also pages would have been discarded when the
first sub-page would be inflated (the whole bitmap would be set).
While testing I realized, that on hugetlbfs it is pretty much impossible
to discard a page - the guest just frees the 4k sub-pages in random order
most of the time. I was only able to discard a hugepage a handful of
times - so I hope that now works correctly.
Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
host page size")
Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
with inflates & deflates")
Cc: qemu-stable@nongnu.org #v4.0.0
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
hw/virtio/virtio-balloon.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index e85d1c0d5c..669067d661 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
balloon->pbp->base = host_page_base;
}
- bitmap_set(balloon->pbp->bitmap,
- (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
- subpages);
+ set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+ balloon->pbp->bitmap);
if (bitmap_full(balloon->pbp->bitmap, subpages)) {
/* We've accumulated a full host page, we can actually discard
@@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
* for a guest to do this in practice, but handle it anyway,
* since getting it wrong could mean discarding memory the
* guest is still using. */
- bitmap_clear(balloon->pbp->bitmap,
- (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
- subpages);
+ clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+ balloon->pbp->bitmap);
if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
g_free(balloon->pbp);
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
2019-07-17 8:42 [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE David Hildenbrand
@ 2019-07-17 9:57 ` Michael S. Tsirkin
2019-07-17 10:04 ` David Hildenbrand
2019-07-17 11:10 ` Michael S. Tsirkin
1 sibling, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2019-07-17 9:57 UTC (permalink / raw)
To: David Hildenbrand
Cc: Igor Mammedov, David Gibson, qemu-devel, Stefan Hajnoczi,
qemu-stable
On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
> We are using the wrong functions to set/clear bits, effectively touching
> multiple bits, writing out of range of the bitmap, resulting in memory
> corruptions. We have to use set_bit()/clear_bit() instead.
>
> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
> inflating the balloon. QEMU crashes. This never could have worked
> properly - especially, also pages would have been discarded when the
> first sub-page would be inflated (the whole bitmap would be set).
>
> While testing I realized, that on hugetlbfs it is pretty much impossible
> to discard a page - the guest just frees the 4k sub-pages in random order
> most of the time. I was only able to discard a hugepage a handful of
> times - so I hope that now works correctly.
>
> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
> host page size")
> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
> with inflates & deflates")
> Cc: qemu-stable@nongnu.org #v4.0.0
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> hw/virtio/virtio-balloon.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index e85d1c0d5c..669067d661 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
> balloon->pbp->base = host_page_base;
> }
>
> - bitmap_set(balloon->pbp->bitmap,
> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> - subpages);
> + set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> + balloon->pbp->bitmap);
>
> if (bitmap_full(balloon->pbp->bitmap, subpages)) {
> /* We've accumulated a full host page, we can actually discard
> @@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
> * for a guest to do this in practice, but handle it anyway,
> * since getting it wrong could mean discarding memory the
> * guest is still using. */
> - bitmap_clear(balloon->pbp->bitmap,
> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> - subpages);
> + clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> + balloon->pbp->bitmap);
>
> if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
> g_free(balloon->pbp);
I also started to wonder about this:
if (!balloon->pbp) {
/* Starting on a new host page */
size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
balloon->pbp->rb = rb;
balloon->pbp->base = host_page_base;
}
Is keeping a pointer to a ram block like this safe? what if the ramblock
gets removed?
> --
> 2.21.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
2019-07-17 9:57 ` Michael S. Tsirkin
@ 2019-07-17 10:04 ` David Hildenbrand
2019-07-17 10:17 ` David Hildenbrand
0 siblings, 1 reply; 11+ messages in thread
From: David Hildenbrand @ 2019-07-17 10:04 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Igor Mammedov, David Gibson, qemu-devel, Stefan Hajnoczi,
qemu-stable
On 17.07.19 11:57, Michael S. Tsirkin wrote:
> On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
>> We are using the wrong functions to set/clear bits, effectively touching
>> multiple bits, writing out of range of the bitmap, resulting in memory
>> corruptions. We have to use set_bit()/clear_bit() instead.
>>
>> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
>> inflating the balloon. QEMU crashes. This never could have worked
>> properly - especially, also pages would have been discarded when the
>> first sub-page would be inflated (the whole bitmap would be set).
>>
>> While testing I realized, that on hugetlbfs it is pretty much impossible
>> to discard a page - the guest just frees the 4k sub-pages in random order
>> most of the time. I was only able to discard a hugepage a handful of
>> times - so I hope that now works correctly.
>>
>> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
>> host page size")
>> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
>> with inflates & deflates")
>> Cc: qemu-stable@nongnu.org #v4.0.0
>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> Cc: David Gibson <david@gibson.dropbear.id.au>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>> hw/virtio/virtio-balloon.c | 10 ++++------
>> 1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>> index e85d1c0d5c..669067d661 100644
>> --- a/hw/virtio/virtio-balloon.c
>> +++ b/hw/virtio/virtio-balloon.c
>> @@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>> balloon->pbp->base = host_page_base;
>> }
>>
>> - bitmap_set(balloon->pbp->bitmap,
>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>> - subpages);
>> + set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>> + balloon->pbp->bitmap);
>>
>> if (bitmap_full(balloon->pbp->bitmap, subpages)) {
>> /* We've accumulated a full host page, we can actually discard
>> @@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>> * for a guest to do this in practice, but handle it anyway,
>> * since getting it wrong could mean discarding memory the
>> * guest is still using. */
>> - bitmap_clear(balloon->pbp->bitmap,
>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>> - subpages);
>> + clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>> + balloon->pbp->bitmap);
>>
>> if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
>> g_free(balloon->pbp);
>
> I also started to wonder about this:
>
> if (!balloon->pbp) {
> /* Starting on a new host page */
> size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
> balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
> balloon->pbp->rb = rb;
> balloon->pbp->base = host_page_base;
> }
>
> Is keeping a pointer to a ram block like this safe? what if the ramblock
> gets removed?
>
David added
if (balloon->pbp
&& (rb != balloon->pbp->rb ) ...
So in case the rb changes (IOW replaced - delete old one, new one
added), we reset the data.
After a ram block was deleted, there will be no more deflation requests
coming in for it. This should be fine I guess.
However, there is another possible issue: Resets.
If the balloon was inflated and we reboot, the old balloon->pbp will
remain intact. The guest will continue using all memory until
virtio-balloon guest driver comes up. If the stars align, it could
happen that new inflation requests by the guests will result in a
discard of a big chunk, although the guest is re-using some parts
already again.
We would have to reset balloon->pbp during virtio_balloon_device_reset().
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
2019-07-17 10:04 ` David Hildenbrand
@ 2019-07-17 10:17 ` David Hildenbrand
2019-07-17 11:06 ` Michael S. Tsirkin
0 siblings, 1 reply; 11+ messages in thread
From: David Hildenbrand @ 2019-07-17 10:17 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Igor Mammedov, David Gibson, qemu-devel, Stefan Hajnoczi,
qemu-stable
On 17.07.19 12:04, David Hildenbrand wrote:
> On 17.07.19 11:57, Michael S. Tsirkin wrote:
>> On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
>>> We are using the wrong functions to set/clear bits, effectively touching
>>> multiple bits, writing out of range of the bitmap, resulting in memory
>>> corruptions. We have to use set_bit()/clear_bit() instead.
>>>
>>> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
>>> inflating the balloon. QEMU crashes. This never could have worked
>>> properly - especially, also pages would have been discarded when the
>>> first sub-page would be inflated (the whole bitmap would be set).
>>>
>>> While testing I realized, that on hugetlbfs it is pretty much impossible
>>> to discard a page - the guest just frees the 4k sub-pages in random order
>>> most of the time. I was only able to discard a hugepage a handful of
>>> times - so I hope that now works correctly.
>>>
>>> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
>>> host page size")
>>> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
>>> with inflates & deflates")
>>> Cc: qemu-stable@nongnu.org #v4.0.0
>>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>>> Cc: David Gibson <david@gibson.dropbear.id.au>
>>> Cc: Michael S. Tsirkin <mst@redhat.com>
>>> Cc: Igor Mammedov <imammedo@redhat.com>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>> ---
>>> hw/virtio/virtio-balloon.c | 10 ++++------
>>> 1 file changed, 4 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>>> index e85d1c0d5c..669067d661 100644
>>> --- a/hw/virtio/virtio-balloon.c
>>> +++ b/hw/virtio/virtio-balloon.c
>>> @@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>>> balloon->pbp->base = host_page_base;
>>> }
>>>
>>> - bitmap_set(balloon->pbp->bitmap,
>>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>> - subpages);
>>> + set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>> + balloon->pbp->bitmap);
>>>
>>> if (bitmap_full(balloon->pbp->bitmap, subpages)) {
>>> /* We've accumulated a full host page, we can actually discard
>>> @@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>>> * for a guest to do this in practice, but handle it anyway,
>>> * since getting it wrong could mean discarding memory the
>>> * guest is still using. */
>>> - bitmap_clear(balloon->pbp->bitmap,
>>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>> - subpages);
>>> + clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>> + balloon->pbp->bitmap);
>>>
>>> if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
>>> g_free(balloon->pbp);
>>
>> I also started to wonder about this:
>>
>> if (!balloon->pbp) {
>> /* Starting on a new host page */
>> size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
>> balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
>> balloon->pbp->rb = rb;
>> balloon->pbp->base = host_page_base;
>> }
>>
>> Is keeping a pointer to a ram block like this safe? what if the ramblock
>> gets removed?
>>
>
> David added
>
> if (balloon->pbp
> && (rb != balloon->pbp->rb ) ...
>
> So in case the rb changes (IOW replaced - delete old one, new one
> added), we reset the data.
>
> After a ram block was deleted, there will be no more deflation requests
> coming in for it. This should be fine I guess.
>
>
> However, there is another possible issue: Resets.
>
> If the balloon was inflated and we reboot, the old balloon->pbp will
> remain intact. The guest will continue using all memory until
> virtio-balloon guest driver comes up. If the stars align, it could
> happen that new inflation requests by the guests will result in a
> discard of a big chunk, although the guest is re-using some parts
> already again.
>
> We would have to reset balloon->pbp during virtio_balloon_device_reset().
>
... also, I think balloon->pbp is not freed when unrealizing, resulting
in a memory leak ...
will craft some more patches.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
2019-07-17 10:17 ` David Hildenbrand
@ 2019-07-17 11:06 ` Michael S. Tsirkin
2019-07-17 11:10 ` David Hildenbrand
0 siblings, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2019-07-17 11:06 UTC (permalink / raw)
To: David Hildenbrand
Cc: Igor Mammedov, David Gibson, qemu-devel, Stefan Hajnoczi,
qemu-stable
On Wed, Jul 17, 2019 at 12:17:57PM +0200, David Hildenbrand wrote:
> On 17.07.19 12:04, David Hildenbrand wrote:
> > On 17.07.19 11:57, Michael S. Tsirkin wrote:
> >> On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
> >>> We are using the wrong functions to set/clear bits, effectively touching
> >>> multiple bits, writing out of range of the bitmap, resulting in memory
> >>> corruptions. We have to use set_bit()/clear_bit() instead.
> >>>
> >>> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
> >>> inflating the balloon. QEMU crashes. This never could have worked
> >>> properly - especially, also pages would have been discarded when the
> >>> first sub-page would be inflated (the whole bitmap would be set).
> >>>
> >>> While testing I realized, that on hugetlbfs it is pretty much impossible
> >>> to discard a page - the guest just frees the 4k sub-pages in random order
> >>> most of the time. I was only able to discard a hugepage a handful of
> >>> times - so I hope that now works correctly.
> >>>
> >>> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
> >>> host page size")
> >>> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
> >>> with inflates & deflates")
> >>> Cc: qemu-stable@nongnu.org #v4.0.0
> >>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> >>> Cc: David Gibson <david@gibson.dropbear.id.au>
> >>> Cc: Michael S. Tsirkin <mst@redhat.com>
> >>> Cc: Igor Mammedov <imammedo@redhat.com>
> >>> Signed-off-by: David Hildenbrand <david@redhat.com>
> >>> ---
> >>> hw/virtio/virtio-balloon.c | 10 ++++------
> >>> 1 file changed, 4 insertions(+), 6 deletions(-)
> >>>
> >>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> >>> index e85d1c0d5c..669067d661 100644
> >>> --- a/hw/virtio/virtio-balloon.c
> >>> +++ b/hw/virtio/virtio-balloon.c
> >>> @@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
> >>> balloon->pbp->base = host_page_base;
> >>> }
> >>>
> >>> - bitmap_set(balloon->pbp->bitmap,
> >>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>> - subpages);
> >>> + set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>> + balloon->pbp->bitmap);
> >>>
> >>> if (bitmap_full(balloon->pbp->bitmap, subpages)) {
> >>> /* We've accumulated a full host page, we can actually discard
> >>> @@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
> >>> * for a guest to do this in practice, but handle it anyway,
> >>> * since getting it wrong could mean discarding memory the
> >>> * guest is still using. */
> >>> - bitmap_clear(balloon->pbp->bitmap,
> >>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>> - subpages);
> >>> + clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>> + balloon->pbp->bitmap);
> >>>
> >>> if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
> >>> g_free(balloon->pbp);
> >>
> >> I also started to wonder about this:
> >>
> >> if (!balloon->pbp) {
> >> /* Starting on a new host page */
> >> size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
> >> balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
> >> balloon->pbp->rb = rb;
> >> balloon->pbp->base = host_page_base;
> >> }
> >>
> >> Is keeping a pointer to a ram block like this safe? what if the ramblock
> >> gets removed?
> >>
> >
> > David added
> >
> > if (balloon->pbp
> > && (rb != balloon->pbp->rb ) ...
> >
> > So in case the rb changes (IOW replaced - delete old one, new one
> > added), we reset the data.
> >
> > After a ram block was deleted, there will be no more deflation requests
> > coming in for it. This should be fine I guess.
I think it might happen that an old dangling pointer happens
to match a newly allocated one.
I think we really should just cache all data we want to take into account
and compare that.
> >
> >
> > However, there is another possible issue: Resets.
> >
> > If the balloon was inflated and we reboot, the old balloon->pbp will
> > remain intact. The guest will continue using all memory until
> > virtio-balloon guest driver comes up. If the stars align, it could
> > happen that new inflation requests by the guests will result in a
> > discard of a big chunk, although the guest is re-using some parts
> > already again.
> >
> > We would have to reset balloon->pbp during virtio_balloon_device_reset().
> >
>
> ... also, I think balloon->pbp is not freed when unrealizing, resulting
> in a memory leak ...
>
> will craft some more patches.
Ught.
> --
>
> Thanks,
>
> David / dhildenb
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
2019-07-17 11:06 ` Michael S. Tsirkin
@ 2019-07-17 11:10 ` David Hildenbrand
2019-07-17 11:22 ` Michael S. Tsirkin
0 siblings, 1 reply; 11+ messages in thread
From: David Hildenbrand @ 2019-07-17 11:10 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Igor Mammedov, David Gibson, qemu-devel, Stefan Hajnoczi,
qemu-stable
On 17.07.19 13:06, Michael S. Tsirkin wrote:
> On Wed, Jul 17, 2019 at 12:17:57PM +0200, David Hildenbrand wrote:
>> On 17.07.19 12:04, David Hildenbrand wrote:
>>> On 17.07.19 11:57, Michael S. Tsirkin wrote:
>>>> On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
>>>>> We are using the wrong functions to set/clear bits, effectively touching
>>>>> multiple bits, writing out of range of the bitmap, resulting in memory
>>>>> corruptions. We have to use set_bit()/clear_bit() instead.
>>>>>
>>>>> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
>>>>> inflating the balloon. QEMU crashes. This never could have worked
>>>>> properly - especially, also pages would have been discarded when the
>>>>> first sub-page would be inflated (the whole bitmap would be set).
>>>>>
>>>>> While testing I realized, that on hugetlbfs it is pretty much impossible
>>>>> to discard a page - the guest just frees the 4k sub-pages in random order
>>>>> most of the time. I was only able to discard a hugepage a handful of
>>>>> times - so I hope that now works correctly.
>>>>>
>>>>> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
>>>>> host page size")
>>>>> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
>>>>> with inflates & deflates")
>>>>> Cc: qemu-stable@nongnu.org #v4.0.0
>>>>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>>>>> Cc: David Gibson <david@gibson.dropbear.id.au>
>>>>> Cc: Michael S. Tsirkin <mst@redhat.com>
>>>>> Cc: Igor Mammedov <imammedo@redhat.com>
>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>>> ---
>>>>> hw/virtio/virtio-balloon.c | 10 ++++------
>>>>> 1 file changed, 4 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>>>>> index e85d1c0d5c..669067d661 100644
>>>>> --- a/hw/virtio/virtio-balloon.c
>>>>> +++ b/hw/virtio/virtio-balloon.c
>>>>> @@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>>>>> balloon->pbp->base = host_page_base;
>>>>> }
>>>>>
>>>>> - bitmap_set(balloon->pbp->bitmap,
>>>>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>> - subpages);
>>>>> + set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>> + balloon->pbp->bitmap);
>>>>>
>>>>> if (bitmap_full(balloon->pbp->bitmap, subpages)) {
>>>>> /* We've accumulated a full host page, we can actually discard
>>>>> @@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>>>>> * for a guest to do this in practice, but handle it anyway,
>>>>> * since getting it wrong could mean discarding memory the
>>>>> * guest is still using. */
>>>>> - bitmap_clear(balloon->pbp->bitmap,
>>>>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>> - subpages);
>>>>> + clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>> + balloon->pbp->bitmap);
>>>>>
>>>>> if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
>>>>> g_free(balloon->pbp);
>>>>
>>>> I also started to wonder about this:
>>>>
>>>> if (!balloon->pbp) {
>>>> /* Starting on a new host page */
>>>> size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
>>>> balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
>>>> balloon->pbp->rb = rb;
>>>> balloon->pbp->base = host_page_base;
>>>> }
>>>>
>>>> Is keeping a pointer to a ram block like this safe? what if the ramblock
>>>> gets removed?
>>>>
>>>
>>> David added
>>>
>>> if (balloon->pbp
>>> && (rb != balloon->pbp->rb ) ...
>>>
>>> So in case the rb changes (IOW replaced - delete old one, new one
>>> added), we reset the data.
>>>
>>> After a ram block was deleted, there will be no more deflation requests
>>> coming in for it. This should be fine I guess.
>
> I think it might happen that an old dangling pointer happens
> to match a newly allocated one.
> I think we really should just cache all data we want to take into account
> and compare that.
That's true. I think just remembering and comparing the GPA base address
would be sufficient.
However, I don't consider this here to trigger easily. We would need
some crazy memory unplug+replug going on while using the balloon. So I
assume we can just rework this part after 4.1
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
2019-07-17 8:42 [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE David Hildenbrand
2019-07-17 9:57 ` Michael S. Tsirkin
@ 2019-07-17 11:10 ` Michael S. Tsirkin
2019-07-17 11:12 ` David Hildenbrand
1 sibling, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2019-07-17 11:10 UTC (permalink / raw)
To: David Hildenbrand
Cc: Igor Mammedov, David Gibson, qemu-devel, Stefan Hajnoczi,
qemu-stable
On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
> We are using the wrong functions to set/clear bits, effectively touching
> multiple bits, writing out of range of the bitmap, resulting in memory
> corruptions. We have to use set_bit()/clear_bit() instead.
>
> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
> inflating the balloon. QEMU crashes. This never could have worked
> properly - especially, also pages would have been discarded when the
> first sub-page would be inflated (the whole bitmap would be set).
>
> While testing I realized, that on hugetlbfs it is pretty much impossible
> to discard a page - the guest just frees the 4k sub-pages in random order
> most of the time. I was only able to discard a hugepage a handful of
> times - so I hope that now works correctly.
I think this actually only works reasonably well on guests
which have pages larger than 4K.
So guest page size = host page size > balloon page size.
>
> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
> host page size")
> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
> with inflates & deflates")
> Cc: qemu-stable@nongnu.org #v4.0.0
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> hw/virtio/virtio-balloon.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index e85d1c0d5c..669067d661 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
> balloon->pbp->base = host_page_base;
> }
>
> - bitmap_set(balloon->pbp->bitmap,
> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> - subpages);
> + set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> + balloon->pbp->bitmap);
>
> if (bitmap_full(balloon->pbp->bitmap, subpages)) {
> /* We've accumulated a full host page, we can actually discard
> @@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
> * for a guest to do this in practice, but handle it anyway,
> * since getting it wrong could mean discarding memory the
> * guest is still using. */
> - bitmap_clear(balloon->pbp->bitmap,
> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> - subpages);
> + clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> + balloon->pbp->bitmap);
>
> if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
> g_free(balloon->pbp);
> --
> 2.21.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
2019-07-17 11:10 ` Michael S. Tsirkin
@ 2019-07-17 11:12 ` David Hildenbrand
0 siblings, 0 replies; 11+ messages in thread
From: David Hildenbrand @ 2019-07-17 11:12 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Igor Mammedov, David Gibson, qemu-devel, Stefan Hajnoczi,
qemu-stable
On 17.07.19 13:10, Michael S. Tsirkin wrote:
> On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
>> We are using the wrong functions to set/clear bits, effectively touching
>> multiple bits, writing out of range of the bitmap, resulting in memory
>> corruptions. We have to use set_bit()/clear_bit() instead.
>>
>> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
>> inflating the balloon. QEMU crashes. This never could have worked
>> properly - especially, also pages would have been discarded when the
>> first sub-page would be inflated (the whole bitmap would be set).
>>
>> While testing I realized, that on hugetlbfs it is pretty much impossible
>> to discard a page - the guest just frees the 4k sub-pages in random order
>> most of the time. I was only able to discard a hugepage a handful of
>> times - so I hope that now works correctly.
>
> I think this actually only works reasonably well on guests
> which have pages larger than 4K.
> So guest page size = host page size > balloon page size.
>
Yes, otherwise it's pure luck (and therefore the printed warning is
appropriate).
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
2019-07-17 11:10 ` David Hildenbrand
@ 2019-07-17 11:22 ` Michael S. Tsirkin
2019-07-17 11:28 ` David Hildenbrand
0 siblings, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2019-07-17 11:22 UTC (permalink / raw)
To: David Hildenbrand
Cc: Igor Mammedov, David Gibson, qemu-devel, Stefan Hajnoczi,
qemu-stable
On Wed, Jul 17, 2019 at 01:10:21PM +0200, David Hildenbrand wrote:
> On 17.07.19 13:06, Michael S. Tsirkin wrote:
> > On Wed, Jul 17, 2019 at 12:17:57PM +0200, David Hildenbrand wrote:
> >> On 17.07.19 12:04, David Hildenbrand wrote:
> >>> On 17.07.19 11:57, Michael S. Tsirkin wrote:
> >>>> On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
> >>>>> We are using the wrong functions to set/clear bits, effectively touching
> >>>>> multiple bits, writing out of range of the bitmap, resulting in memory
> >>>>> corruptions. We have to use set_bit()/clear_bit() instead.
> >>>>>
> >>>>> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
> >>>>> inflating the balloon. QEMU crashes. This never could have worked
> >>>>> properly - especially, also pages would have been discarded when the
> >>>>> first sub-page would be inflated (the whole bitmap would be set).
> >>>>>
> >>>>> While testing I realized, that on hugetlbfs it is pretty much impossible
> >>>>> to discard a page - the guest just frees the 4k sub-pages in random order
> >>>>> most of the time. I was only able to discard a hugepage a handful of
> >>>>> times - so I hope that now works correctly.
> >>>>>
> >>>>> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
> >>>>> host page size")
> >>>>> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
> >>>>> with inflates & deflates")
> >>>>> Cc: qemu-stable@nongnu.org #v4.0.0
> >>>>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> >>>>> Cc: David Gibson <david@gibson.dropbear.id.au>
> >>>>> Cc: Michael S. Tsirkin <mst@redhat.com>
> >>>>> Cc: Igor Mammedov <imammedo@redhat.com>
> >>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
> >>>>> ---
> >>>>> hw/virtio/virtio-balloon.c | 10 ++++------
> >>>>> 1 file changed, 4 insertions(+), 6 deletions(-)
> >>>>>
> >>>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> >>>>> index e85d1c0d5c..669067d661 100644
> >>>>> --- a/hw/virtio/virtio-balloon.c
> >>>>> +++ b/hw/virtio/virtio-balloon.c
> >>>>> @@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
> >>>>> balloon->pbp->base = host_page_base;
> >>>>> }
> >>>>>
> >>>>> - bitmap_set(balloon->pbp->bitmap,
> >>>>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>>>> - subpages);
> >>>>> + set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>>>> + balloon->pbp->bitmap);
> >>>>>
> >>>>> if (bitmap_full(balloon->pbp->bitmap, subpages)) {
> >>>>> /* We've accumulated a full host page, we can actually discard
> >>>>> @@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
> >>>>> * for a guest to do this in practice, but handle it anyway,
> >>>>> * since getting it wrong could mean discarding memory the
> >>>>> * guest is still using. */
> >>>>> - bitmap_clear(balloon->pbp->bitmap,
> >>>>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>>>> - subpages);
> >>>>> + clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>>>> + balloon->pbp->bitmap);
> >>>>>
> >>>>> if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
> >>>>> g_free(balloon->pbp);
> >>>>
> >>>> I also started to wonder about this:
> >>>>
> >>>> if (!balloon->pbp) {
> >>>> /* Starting on a new host page */
> >>>> size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
> >>>> balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
> >>>> balloon->pbp->rb = rb;
> >>>> balloon->pbp->base = host_page_base;
> >>>> }
> >>>>
> >>>> Is keeping a pointer to a ram block like this safe? what if the ramblock
> >>>> gets removed?
> >>>>
> >>>
> >>> David added
> >>>
> >>> if (balloon->pbp
> >>> && (rb != balloon->pbp->rb ) ...
> >>>
> >>> So in case the rb changes (IOW replaced - delete old one, new one
> >>> added), we reset the data.
> >>>
> >>> After a ram block was deleted, there will be no more deflation requests
> >>> coming in for it. This should be fine I guess.
> >
> > I think it might happen that an old dangling pointer happens
> > to match a newly allocated one.
> > I think we really should just cache all data we want to take into account
> > and compare that.
>
> That's true. I think just remembering and comparing the GPA base address
> would be sufficient.
Well we need to know the bitmap size allocated, too.
And I guess when we are ready to free we should
re-check it just in case.
> However, I don't consider this here to trigger easily. We would need
> some crazy memory unplug+replug going on while using the balloon. So I
> assume we can just rework this part after 4.1
Dangling pointers are just a recipe for CVEs. I'd rather rework it now.
> --
>
> Thanks,
>
> David / dhildenb
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
2019-07-17 11:22 ` Michael S. Tsirkin
@ 2019-07-17 11:28 ` David Hildenbrand
2019-07-17 11:32 ` Michael S. Tsirkin
0 siblings, 1 reply; 11+ messages in thread
From: David Hildenbrand @ 2019-07-17 11:28 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Igor Mammedov, David Gibson, qemu-devel, Stefan Hajnoczi,
qemu-stable
On 17.07.19 13:22, Michael S. Tsirkin wrote:
> On Wed, Jul 17, 2019 at 01:10:21PM +0200, David Hildenbrand wrote:
>> On 17.07.19 13:06, Michael S. Tsirkin wrote:
>>> On Wed, Jul 17, 2019 at 12:17:57PM +0200, David Hildenbrand wrote:
>>>> On 17.07.19 12:04, David Hildenbrand wrote:
>>>>> On 17.07.19 11:57, Michael S. Tsirkin wrote:
>>>>>> On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
>>>>>>> We are using the wrong functions to set/clear bits, effectively touching
>>>>>>> multiple bits, writing out of range of the bitmap, resulting in memory
>>>>>>> corruptions. We have to use set_bit()/clear_bit() instead.
>>>>>>>
>>>>>>> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
>>>>>>> inflating the balloon. QEMU crashes. This never could have worked
>>>>>>> properly - especially, also pages would have been discarded when the
>>>>>>> first sub-page would be inflated (the whole bitmap would be set).
>>>>>>>
>>>>>>> While testing I realized, that on hugetlbfs it is pretty much impossible
>>>>>>> to discard a page - the guest just frees the 4k sub-pages in random order
>>>>>>> most of the time. I was only able to discard a hugepage a handful of
>>>>>>> times - so I hope that now works correctly.
>>>>>>>
>>>>>>> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
>>>>>>> host page size")
>>>>>>> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
>>>>>>> with inflates & deflates")
>>>>>>> Cc: qemu-stable@nongnu.org #v4.0.0
>>>>>>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>>>>>>> Cc: David Gibson <david@gibson.dropbear.id.au>
>>>>>>> Cc: Michael S. Tsirkin <mst@redhat.com>
>>>>>>> Cc: Igor Mammedov <imammedo@redhat.com>
>>>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>>>>> ---
>>>>>>> hw/virtio/virtio-balloon.c | 10 ++++------
>>>>>>> 1 file changed, 4 insertions(+), 6 deletions(-)
>>>>>>>
>>>>>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>>>>>>> index e85d1c0d5c..669067d661 100644
>>>>>>> --- a/hw/virtio/virtio-balloon.c
>>>>>>> +++ b/hw/virtio/virtio-balloon.c
>>>>>>> @@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>>>>>>> balloon->pbp->base = host_page_base;
>>>>>>> }
>>>>>>>
>>>>>>> - bitmap_set(balloon->pbp->bitmap,
>>>>>>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>>>> - subpages);
>>>>>>> + set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>>>> + balloon->pbp->bitmap);
>>>>>>>
>>>>>>> if (bitmap_full(balloon->pbp->bitmap, subpages)) {
>>>>>>> /* We've accumulated a full host page, we can actually discard
>>>>>>> @@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>>>>>>> * for a guest to do this in practice, but handle it anyway,
>>>>>>> * since getting it wrong could mean discarding memory the
>>>>>>> * guest is still using. */
>>>>>>> - bitmap_clear(balloon->pbp->bitmap,
>>>>>>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>>>> - subpages);
>>>>>>> + clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>>>> + balloon->pbp->bitmap);
>>>>>>>
>>>>>>> if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
>>>>>>> g_free(balloon->pbp);
>>>>>>
>>>>>> I also started to wonder about this:
>>>>>>
>>>>>> if (!balloon->pbp) {
>>>>>> /* Starting on a new host page */
>>>>>> size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
>>>>>> balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
>>>>>> balloon->pbp->rb = rb;
>>>>>> balloon->pbp->base = host_page_base;
>>>>>> }
>>>>>>
>>>>>> Is keeping a pointer to a ram block like this safe? what if the ramblock
>>>>>> gets removed?
>>>>>>
>>>>>
>>>>> David added
>>>>>
>>>>> if (balloon->pbp
>>>>> && (rb != balloon->pbp->rb ) ...
>>>>>
>>>>> So in case the rb changes (IOW replaced - delete old one, new one
>>>>> added), we reset the data.
>>>>>
>>>>> After a ram block was deleted, there will be no more deflation requests
>>>>> coming in for it. This should be fine I guess.
>>>
>>> I think it might happen that an old dangling pointer happens
>>> to match a newly allocated one.
>>> I think we really should just cache all data we want to take into account
>>> and compare that.
>>
>> That's true. I think just remembering and comparing the GPA base address
>> would be sufficient.
>
> Well we need to know the bitmap size allocated, too.
> And I guess when we are ready to free we should
> re-check it just in case.
Right, either that or the page size, which is orthogonal.
>
>> However, I don't consider this here to trigger easily. We would need
>> some crazy memory unplug+replug going on while using the balloon. So I
>> assume we can just rework this part after 4.1
>
> Dangling pointers are just a recipe for CVEs. I'd rather rework it now.
>
If they are not dereferences, I don't consider it an ultimate problem.
But yeah, I'll look into that tomorrow. Can you pick up these patches in
the meantime?
Thanks!
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
2019-07-17 11:28 ` David Hildenbrand
@ 2019-07-17 11:32 ` Michael S. Tsirkin
0 siblings, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2019-07-17 11:32 UTC (permalink / raw)
To: David Hildenbrand
Cc: Igor Mammedov, David Gibson, qemu-devel, Stefan Hajnoczi,
qemu-stable
On Wed, Jul 17, 2019 at 01:28:19PM +0200, David Hildenbrand wrote:
> On 17.07.19 13:22, Michael S. Tsirkin wrote:
> > On Wed, Jul 17, 2019 at 01:10:21PM +0200, David Hildenbrand wrote:
> >> On 17.07.19 13:06, Michael S. Tsirkin wrote:
> >>> On Wed, Jul 17, 2019 at 12:17:57PM +0200, David Hildenbrand wrote:
> >>>> On 17.07.19 12:04, David Hildenbrand wrote:
> >>>>> On 17.07.19 11:57, Michael S. Tsirkin wrote:
> >>>>>> On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
> >>>>>>> We are using the wrong functions to set/clear bits, effectively touching
> >>>>>>> multiple bits, writing out of range of the bitmap, resulting in memory
> >>>>>>> corruptions. We have to use set_bit()/clear_bit() instead.
> >>>>>>>
> >>>>>>> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
> >>>>>>> inflating the balloon. QEMU crashes. This never could have worked
> >>>>>>> properly - especially, also pages would have been discarded when the
> >>>>>>> first sub-page would be inflated (the whole bitmap would be set).
> >>>>>>>
> >>>>>>> While testing I realized, that on hugetlbfs it is pretty much impossible
> >>>>>>> to discard a page - the guest just frees the 4k sub-pages in random order
> >>>>>>> most of the time. I was only able to discard a hugepage a handful of
> >>>>>>> times - so I hope that now works correctly.
> >>>>>>>
> >>>>>>> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
> >>>>>>> host page size")
> >>>>>>> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
> >>>>>>> with inflates & deflates")
> >>>>>>> Cc: qemu-stable@nongnu.org #v4.0.0
> >>>>>>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> >>>>>>> Cc: David Gibson <david@gibson.dropbear.id.au>
> >>>>>>> Cc: Michael S. Tsirkin <mst@redhat.com>
> >>>>>>> Cc: Igor Mammedov <imammedo@redhat.com>
> >>>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
> >>>>>>> ---
> >>>>>>> hw/virtio/virtio-balloon.c | 10 ++++------
> >>>>>>> 1 file changed, 4 insertions(+), 6 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> >>>>>>> index e85d1c0d5c..669067d661 100644
> >>>>>>> --- a/hw/virtio/virtio-balloon.c
> >>>>>>> +++ b/hw/virtio/virtio-balloon.c
> >>>>>>> @@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
> >>>>>>> balloon->pbp->base = host_page_base;
> >>>>>>> }
> >>>>>>>
> >>>>>>> - bitmap_set(balloon->pbp->bitmap,
> >>>>>>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>>>>>> - subpages);
> >>>>>>> + set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>>>>>> + balloon->pbp->bitmap);
> >>>>>>>
> >>>>>>> if (bitmap_full(balloon->pbp->bitmap, subpages)) {
> >>>>>>> /* We've accumulated a full host page, we can actually discard
> >>>>>>> @@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
> >>>>>>> * for a guest to do this in practice, but handle it anyway,
> >>>>>>> * since getting it wrong could mean discarding memory the
> >>>>>>> * guest is still using. */
> >>>>>>> - bitmap_clear(balloon->pbp->bitmap,
> >>>>>>> - (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>>>>>> - subpages);
> >>>>>>> + clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> >>>>>>> + balloon->pbp->bitmap);
> >>>>>>>
> >>>>>>> if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
> >>>>>>> g_free(balloon->pbp);
> >>>>>>
> >>>>>> I also started to wonder about this:
> >>>>>>
> >>>>>> if (!balloon->pbp) {
> >>>>>> /* Starting on a new host page */
> >>>>>> size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
> >>>>>> balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
> >>>>>> balloon->pbp->rb = rb;
> >>>>>> balloon->pbp->base = host_page_base;
> >>>>>> }
> >>>>>>
> >>>>>> Is keeping a pointer to a ram block like this safe? what if the ramblock
> >>>>>> gets removed?
> >>>>>>
> >>>>>
> >>>>> David added
> >>>>>
> >>>>> if (balloon->pbp
> >>>>> && (rb != balloon->pbp->rb ) ...
> >>>>>
> >>>>> So in case the rb changes (IOW replaced - delete old one, new one
> >>>>> added), we reset the data.
> >>>>>
> >>>>> After a ram block was deleted, there will be no more deflation requests
> >>>>> coming in for it. This should be fine I guess.
> >>>
> >>> I think it might happen that an old dangling pointer happens
> >>> to match a newly allocated one.
> >>> I think we really should just cache all data we want to take into account
> >>> and compare that.
> >>
> >> That's true. I think just remembering and comparing the GPA base address
> >> would be sufficient.
> >
> > Well we need to know the bitmap size allocated, too.
> > And I guess when we are ready to free we should
> > re-check it just in case.
>
> Right, either that or the page size, which is orthogonal.
>
> >
> >> However, I don't consider this here to trigger easily. We would need
> >> some crazy memory unplug+replug going on while using the balloon. So I
> >> assume we can just rework this part after 4.1
> >
> > Dangling pointers are just a recipe for CVEs. I'd rather rework it now.
> >
>
> If they are not dereferences, I don't consider it an ultimate problem.
The following pattern is highly unsafe if p has been freed and
reused:
if (d->p == p)
use p->foo
and this is because we can now have copies of d->p->foo != p->foo
resulting in inconsistencies.
> But yeah, I'll look into that tomorrow. Can you pick up these patches in
> the meantime?
>
> Thanks!
Sure, thanks!
> --
>
> Thanks,
>
> David / dhildenb
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2019-07-17 11:33 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-07-17 8:42 [Qemu-devel] [PATCH-for-4.1] virtio-balloon: fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE David Hildenbrand
2019-07-17 9:57 ` Michael S. Tsirkin
2019-07-17 10:04 ` David Hildenbrand
2019-07-17 10:17 ` David Hildenbrand
2019-07-17 11:06 ` Michael S. Tsirkin
2019-07-17 11:10 ` David Hildenbrand
2019-07-17 11:22 ` Michael S. Tsirkin
2019-07-17 11:28 ` David Hildenbrand
2019-07-17 11:32 ` Michael S. Tsirkin
2019-07-17 11:10 ` Michael S. Tsirkin
2019-07-17 11:12 ` David Hildenbrand
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).