From: Alexander Graf <agraf@suse.de>
To: Igor Mammedov <imammedo@redhat.com>
Cc: qemu-ppc@nongnu.org, stuart.yoder@freescale.com,
qemu-devel@nongnu.org, pbonzini@redhat.com, kvm@vger.kernel.org,
qemu-stable@nongnu.org
Subject: Re: [PATCH] kvm: Fix memory slot page alignment logic
Date: Mon, 10 Nov 2014 15:47:49 +0100
Message-ID: <5460D015.10900@suse.de>
In-Reply-To: <20141110145511.173f48ec@nial.usersys.redhat.com>
On 10.11.14 14:55, Igor Mammedov wrote:
> On Mon, 10 Nov 2014 14:16:58 +0100
> Alexander Graf <agraf@suse.de> wrote:
>
>>
>>
>> On 10.11.14 13:31, Igor Mammedov wrote:
>>> On Fri, 7 Nov 2014 22:18:45 +0100
>>> Alexander Graf <agraf@suse.de> wrote:
>>>
>>>> Memory slots have to be page aligned to get entered into KVM. There
>>>> is existing logic that tries to ensure that we pad memory slots that
>>>> are not page aligned to the biggest region that would still fit in the
>>>> alignment requirements.
>>>>
>>>> Unfortunately, that logic is broken. It tries to calculate the start
>>>> offset based on the region size.
>>>>
>>>> Fix up the logic to do the thing it was intended to do and document it
>>>> properly in the comment above it.
>>>>
>>>> With this patch applied, I can successfully run an e500 guest with more
>>>> than 3GB RAM (at which point RAM starts overlapping subpage memory regions).
>>>>
>>>> Cc: qemu-stable@nongnu.org
>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>> ---
>>>> kvm-all.c | 6 ++++--
>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/kvm-all.c b/kvm-all.c
>>>> index 44a5e72..596e7ce 100644
>>>> --- a/kvm-all.c
>>>> +++ b/kvm-all.c
>>>> @@ -634,8 +634,10 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
>>>> unsigned delta;
>>>>
>>>> /* kvm works in page size chunks, but the function may be called
>>>> - with sub-page size and unaligned start address. */
>>>> - delta = TARGET_PAGE_ALIGN(size) - size;
>>>> + with sub-page size and unaligned start address. Pad the start
>>>> + address to next and truncate size to previous page boundary. */
>>> I'm a bit confused how it works at all.
>>> Let's assume that there are no mapped pages that include start_addr,
>>> then if start_addr were padded to next page, kvm would map it from there
>>> but the rest of QEMU would still use unaligned start_addr for MemoryRegion
>>> that isn't even mapped.
>>
>> Sorry, I don't understand this paragraph. Memory slots in general are
>> accelerations for memory access - for MMIO (RAM is usually aligned), KVM
>> can always exit to QEMU and just do a manual MMIO exit.
>>
>>> It would seem that instead of padding up to the next page, start_addr
>>> should be moved to the start of the page that includes it to make page
>>> with original start_addr available to guest.
>>
>> No, because in that case you would map something as RAM that really
>> isn't RAM.
>>
>> Imagine you have the following memory layout:
>>
>> 0x1000 page size
>>
>> 1) 0x00000 - 0x10000 RAM
>> 2) 0x10000 - 0x10100 MMIO
>> 3) 0x10100 - 0x20000 RAM
>>
>> Then you want to map 1) as memory slot and 3) from 0x11000 onwards as
>> memory slot.
> so every access to RAM 0x10100-0x11000 which is not represented as memory
> slot would cause VMEXIT?
Yes, there's no other way. Otherwise we wouldn't be able to trap accesses
to 0x10000 - 0x10100. Hardware only gives us page granularity.
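
To make the consequence concrete, here is a rough sketch of the example
layout above with a 0x1000 page size. It is not QEMU code: slot_t, slots
and access_hits_slot are names made up for illustration. Only page-aligned
ranges are registered as slots, so everything else, including the RAM at
0x10100 - 0x11000, falls back to an exit to QEMU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration, not QEMU's data structures. */
typedef struct {
    uint64_t start; /* inclusive, page aligned */
    uint64_t end;   /* exclusive, page aligned */
} slot_t;

/* Slots for the example layout with a 0x1000 page size:
 *   0x00000 - 0x10000 RAM, fully covered
 *   0x10100 - 0x20000 RAM, covered only from 0x11000 onwards */
static const slot_t slots[] = {
    { 0x00000, 0x10000 },
    { 0x11000, 0x20000 },
};

/* Returns true if the guest physical address is backed by a slot,
 * false if the access has to exit to QEMU. */
static bool access_hits_slot(uint64_t gpa)
{
    for (size_t i = 0; i < sizeof(slots) / sizeof(slots[0]); i++) {
        if (gpa >= slots[i].start && gpa < slots[i].end) {
            return true;
        }
    }
    return false;
}
```

Here access_hits_slot(0x10050) (the MMIO window) and
access_hits_slot(0x10800) (RAM sharing a page with that window) are both
false, while access_hits_slot(0x11000) is true.
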
Usually this isn't an issue, because overlapping MMIO regions tend to be
large, power-of-2 sized chunks, if you see any overlap at all. On e500
this bites us though, because we end up with small MSI-X windows inside
our address space. That in turn might also be a bug, but it doesn't mean
the slot mapping logic should stay as broken as it is.
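
For reference, a minimal sketch of the rounding that the comment in the
patch describes: pad the start address up to the next page boundary and
truncate the end down to the previous one. This is not the code from the
patch. PAGE_SIZE_SKETCH, align_slot and size_based_delta are hypothetical
names, and the page size is assumed to be 0x1000 as in the example above.
size_based_delta mirrors the removed line
"delta = TARGET_PAGE_ALIGN(size) - size;" to show why an offset derived
from the size does not in general align the start.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 0x1000ULL                  /* assumed page size */
#define PAGE_MASK_SKETCH (~(PAGE_SIZE_SKETCH - 1))

/* Old approach: start offset derived from the region size only. */
static uint64_t size_based_delta(uint64_t size)
{
    uint64_t aligned = (size + PAGE_SIZE_SKETCH - 1) & PAGE_MASK_SKETCH;
    return aligned - size;
}

/* Pad the start up to the next page boundary and truncate the end down
 * to the previous one. Returns false if no full page remains.
 * Overflow of start + size is not handled in this sketch. */
static bool align_slot(uint64_t start, uint64_t size,
                       uint64_t *slot_start, uint64_t *slot_size)
{
    uint64_t end = start + size;
    uint64_t astart = (start + PAGE_SIZE_SKETCH - 1) & PAGE_MASK_SKETCH;
    uint64_t aend = end & PAGE_MASK_SKETCH;

    if (aend <= astart) {
        return false;
    }
    *slot_start = astart;
    *slot_size = aend - astart;
    return true;
}
```

For region 3) above (start 0x10100, size 0xff00), size_based_delta gives
0x100, which would move the start to 0x10200 and leave it unaligned.
align_slot yields a slot starting at 0x11000 with size 0xf000. For the
MMIO window 2) (start 0x10000, size 0x100) it returns false, since no
full page is left to map.
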
Alex