From mboxrd@z Thu Jan  1 00:00:00 1970
From: Avi Kivity <avi@redhat.com>
Subject: Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions
 to access phys_ram_dirty.
Date: Tue, 16 Mar 2010 15:57:32 +0200
Message-ID: <4B9F8E4C.5070307@redhat.com>
References: <1268736839-27371-1-git-send-email-tamura.yoshiaki@lab.ntt.co.jp> <1268736839-27371-3-git-send-email-tamura.yoshiaki@lab.ntt.co.jp> <4B9F7D78.5090201@redhat.com> <4B9F8502.3070108@lab.ntt.co.jp> <4B9F87A9.3070509@redhat.com> <4B9F8CE2.7010104@codemonkey.ws>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>,
	kvm@vger.kernel.org, qemu-devel@nongnu.org,
	ohmura.kei@lab.ntt.co.jp
To: Anthony Liguori <anthony@codemonkey.ws>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:34996 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S966261Ab0CPN5m (ORCPT <rfc822;kvm@vger.kernel.org>);
	Tue, 16 Mar 2010 09:57:42 -0400
In-Reply-To: <4B9F8CE2.7010104@codemonkey.ws>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On 03/16/2010 03:51 PM, Anthony Liguori wrote:
> On 03/16/2010 08:29 AM, Avi Kivity wrote:
>> On 03/16/2010 03:17 PM, Yoshiaki Tamura wrote:
>>> Avi Kivity wrote:
>>>> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote:
>>>>> Modifies wrapper functions for byte-based phys_ram_dirty bitmap to
>>>>> bit-based phys_ram_dirty bitmap, and adds more wrapper functions to
>>>>> prevent
>>>>> direct access to the phys_ram_dirty bitmap.
>>>>
>>>>> +
>>>>> +static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t 
>>>>> addr)
>>>>> +{
>>>>> + unsigned long mask;
>>>>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS;
>>>>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1);
>>>>> + int ret = 0;
>>>>> +
>>>>> + mask = 1UL<< offset;
>>>>> + if (phys_ram_vga_dirty[index]& mask)
>>>>> + ret |= VGA_DIRTY_FLAG;
>>>>> + if (phys_ram_code_dirty[index]& mask)
>>>>> + ret |= CODE_DIRTY_FLAG;
>>>>> + if (phys_ram_migration_dirty[index]& mask)
>>>>> + ret |= MIGRATION_DIRTY_FLAG;
>>>>> +
>>>>> + return ret;
>>>>> }
>>>>>
>>>>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
>>>>> int dirty_flags)
>>>>> {
>>>>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags;
>>>>> + return cpu_physical_memory_get_dirty_flags(addr)& dirty_flags;
>>>>> }
>>>>
>>>> This turns one cacheline access into three. If the dirty bitmaps 
>>>> were in
>>>> an array, you could do
>>>>
>>>> return dirty_bitmaps[dirty_index][addr >> (TARGET_PAGE_BITS +
>>>> BITS_IN_LONG)] & mask;
>>>>
>>>> with one cacheline access.
>>>
>>> If I'm understanding the existing code correctly,
>>> int dirty_flags can be combined, like VGA + MIGRATION.
>>> If we only have to worry about a single dirty flag, I agree with 
>>> your idea.
>>
>> From a quick grep it seems flags are not combined, except for 
>> something strange with CODE_DIRTY_FLAG:
>>
>>> static void notdirty_mem_writel(void *opaque, target_phys_addr_t 
>>> ram_addr,
>>>                                 uint32_t val)
>>> {
>>>     int dirty_flags;
>>>     dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
>>>     if (!(dirty_flags & CODE_DIRTY_FLAG)) {
>>> #if !defined(CONFIG_USER_ONLY)
>>>         tb_invalidate_phys_page_fast(ram_addr, 4);
>>>         dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
>>> #endif
>>>     }
>>>     stl_p(qemu_get_ram_ptr(ram_addr), val);
>>>     dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
>>>     phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
>>>     /* we remove the notdirty callback only if the code has been
>>>        flushed */
>>>     if (dirty_flags == 0xff)
>>>         tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
>>> }
>>
>> I can't say I understand what it does.
>
> The semantics of CODE_DIRTY_FLAG are a little counter intuitive.  
> CODE_DIRTY_FLAG means that we know that something isn't code so writes 
> do not need checking for self modifying code.

So the hardware equivalent is, when the Instruction TLB loads a page 
address, clear CODE_DIRTY_FLAG?

>
> notdirty_mem_write() is called for any ram that is in the virtual TLB 
> that has not been updated yet and once a write has occurred, we can 
> switch to faster access functions (provided we've invalidated any 
> translation blocks).
>
> That's why the check is if (!(dirty_flags & CODE_DIRTY_FLAG)), if it 
> hasn't been set yet, we have to assume that it could be a TB so we 
> need to invalidate it.  tb_invalidate_phys_page_fast() will set the 
> CODE_DIRTY_FLAG if no code is present in that memory area which is why 
> we fetch dirty_flags again.

Ok.

>
> We do the store, and then set the dirty bits to mark that the page is 
> now dirty taking care to not change the CODE_DIRTY_FLAG bit.
>
> At the very end, we check to see if CODE_DIRTY_FLAG which indicates 
> that we no longer need to trap writes.  If so, we call tlb_set_dirty() 
> which will ultimately remove the notdirty callback in favor of a 
> faster access mechanism.
>
> With respect patch series, there should be no problem having a 
> separate code bitmap that gets updated along with a main bitmap 
> provided that the semantics of CODE_DIRTY_FLAG are preserved.
>
>>> Sounds good to me.
>>> So we're going to introduce 4 (VGA, CODE, MIGRATION, master) 
>>> bit-based bitmaps in total.
>>>
>>
>> Yeah, except CODE doesn't behave like the others.  Would be best to 
>> understand what it's requirements are before making the change.  
>> Maybe CODE will need separate handling (so master will only feed VGA 
>> and MIGRATION).
>
> Generally speaking, cpu_physical_memory_set_dirty() is called by the 
> device model.  Any writes by the device model that results in 
> self-modifying code are not going to have predictable semantics which 
> is why it can set CODE_DIRTY_FLAG.
>
> CODE_DIRTY_FLAG doesn't need to get updated from a master bitmap.  It 
> should be treated as a separate bitmap that is strictly dealt with by 
> the virtual TLB.

Thanks.

-- 
error compiling committee.c: too many arguments to function