Re: [PATCH 0/2] Faster MMU lookups for Book3s v3

From: Avi Kivity <avi@redhat.com>
To: Alexander Graf <agraf-l3A5Bk7waGM@public.gmane.org>
Cc: kvm-ppc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	KVM list <kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	linuxppc-dev
	<linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org>
Subject: Re: [PATCH 0/2] Faster MMU lookups for Book3s v3
Date: Thu, 01 Jul 2010 12:43:54 +0000
Message-ID: <4C2C8D8A.7080103@redhat.com>
In-Reply-To: <4C2C89D6.3090401-l3A5Bk7waGM@public.gmane.org>

On 07/01/2010 03:28 PM, Alexander Graf wrote:
>
>>
>>>    Wouldn't it speed up dirty bitmap flushing
>>> a lot if we'd just have a simple linked list of all sPTEs belonging to
>>> that memslot?
>>>
>>>        
>> The complexity is O(pages_in_slot) + O(sptes_for_slot).
>>
>> Usually, every page is mapped at least once, so sptes_for_slot
>> dominates.  Even when it isn't so, iterating the rmap base pointers is
>> very fast since they are linear in memory, while sptes are scattered
>> around, causing cache misses.
>>      
> Why would pages be mapped often?

It's not a question of how often they are mapped (shadow: very often; 
tdp: very rarely) but what percentage of pages are mapped.  It's usually 
100%.
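
To illustrate the O(pages_in_slot) + O(sptes_for_slot) cost, here is a
minimal sketch of write-protecting a slot by walking its rmap array. The
types and names (struct memslot, struct rmap_node, SPTE_WRITABLE,
slot_write_protect) are assumptions made up for this example and are
simplified; they are not the actual KVM data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPTE_WRITABLE (1ull << 1)   /* hypothetical bit position */

/* Hypothetical rmap chain node: one per spte mapping a given guest page. */
struct rmap_node {
    uint64_t *sptep;
    struct rmap_node *next;
};

/* Hypothetical memslot: rmap[] holds one chain head per guest page. */
struct memslot {
    size_t npages;
    struct rmap_node **rmap;
};

/* Clear the writable bit on every spte of the slot; return sptes visited. */
static size_t slot_write_protect(struct memslot *slot)
{
    size_t visited = 0;

    /* Outer loop: O(pages_in_slot), over heads that are linear in memory. */
    for (size_t i = 0; i < slot->npages; i++) {
        /* Inner loops together: O(sptes_for_slot), scattered in memory. */
        for (struct rmap_node *n = slot->rmap[i]; n; n = n->next) {
            *n->sptep &= ~SPTE_WRITABLE;
            visited++;
        }
    }
    return visited;
}
```

If nearly every page is mapped, the inner loops do about the same work a
slot-wide linked list would, so the extra list mainly saves the cheap
linear scan over the heads.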

> Don't you use lazy spte updates?
>    

We do, but given enough time, the guest will touch its entire memory.


>> Another consideration is that on x86, an spte occupies just 64 bits
>> (for the hardware pte); if there are multiple sptes per page (rare on
>> modern hardware), there is also extra memory for rmap chains;
>> sometimes we also allocate 64 bits for the gfn.  Having an extra
>> linked list would require more memory to be allocated and maintained.
>>      
> Hrm. I was thinking of not having an rmap but only using the chain. The
> only slots that would require such a chain would be the ones with dirty
> bitmapping enabled, so no penalty for normal RAM (unless you use kemari
> or live migration of course).
>    

You could also only chain writeable ptes.
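
A rough sketch of that idea, assuming a per-slot chain that only ever
holds writeable ptes of slots with dirty logging enabled. All names here
(struct dirty_slot, struct pte_link, PTE_WRITABLE, chain_if_writable,
flush_writable_chain) are hypothetical and not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_WRITABLE (1ull << 1)    /* hypothetical bit position */

/* Hypothetical chain element, one per writeable pte in the slot. */
struct pte_link {
    uint64_t *ptep;
    struct pte_link *next;
};

/* Hypothetical slot state for dirty logging. */
struct dirty_slot {
    bool dirty_logging;
    struct pte_link *writable_chain;
};

/* Link the pte only if the slot logs dirty pages and the pte is writeable. */
static bool chain_if_writable(struct dirty_slot *slot, struct pte_link *link,
                              uint64_t *ptep)
{
    if (!slot->dirty_logging || !(*ptep & PTE_WRITABLE))
        return false;
    link->ptep = ptep;
    link->next = slot->writable_chain;
    slot->writable_chain = link;
    return true;
}

/* Write-protect every chained pte, empty the chain, return the count. */
static size_t flush_writable_chain(struct dirty_slot *slot)
{
    size_t n = 0;

    for (struct pte_link *l = slot->writable_chain; l; l = l->next) {
        *l->ptep &= ~PTE_WRITABLE;
        n++;
    }
    slot->writable_chain = NULL;
    return n;
}
```

Read-only ptes and slots without dirty logging never pay for a link, so
the flush cost follows the number of writeable ptes rather than all ptes.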

> But then again I probably do need an rmap for the mmu_notifier magic,
> right? But I'd rather prefer to have that code path be slow and the
> dirty bitmap invalidation fast than the other way around. Swapping is
> slow either way.
>    

It's not just swapping, it's also page ageing.  That needs to be fast.  
Does ppc have a hardware-set referenced bit?  If so, you need a fast 
rmap for mmu notifiers.
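
As a sketch of why ageing wants an rmap: the notifier asks about one
guest page at a time, so the lookup has to go from gfn to the ptes
mapping it without scanning the whole slot. The names below (struct
age_slot, struct age_rmap_node, PTE_REFERENCED, age_gfn) are assumptions
for illustration only, not the real KVM or mmu notifier interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_REFERENCED (1ull << 5)  /* hypothetical bit position */

/* Hypothetical rmap chain node: one per pte mapping a given guest page. */
struct age_rmap_node {
    uint64_t *ptep;
    struct age_rmap_node *next;
};

/* Hypothetical slot with one rmap chain head per guest page. */
struct age_slot {
    uint64_t base_gfn;
    size_t npages;
    struct age_rmap_node **rmap;
};

/* Test and clear the referenced bit on all ptes mapping one guest page. */
static bool age_gfn(struct age_slot *slot, uint64_t gfn)
{
    bool young = false;

    if (gfn < slot->base_gfn || gfn - slot->base_gfn >= slot->npages)
        return false;

    /* Cost is the number of ptes for this page, not for the whole slot. */
    for (struct age_rmap_node *n = slot->rmap[gfn - slot->base_gfn]; n;
         n = n->next) {
        if (*n->ptep & PTE_REFERENCED) {
            *n->ptep &= ~PTE_REFERENCED;
            young = true;
        }
    }
    return young;
}
```

With only a slot-wide chain, the same query would have to walk every pte
in the slot to find the ones belonging to a single page.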

-- 
error compiling committee.c: too many arguments to function
