From mboxrd@z Thu Jan 1 00:00:00 1970 From: Avi Kivity Subject: Re: [PATCH 2/6] qemu-kvm: Modify and introduce wrapper functions to access phys_ram_dirty. Date: Tue, 16 Mar 2010 15:29:13 +0200 Message-ID: <4B9F87A9.3070509@redhat.com> References: <1268736839-27371-1-git-send-email-tamura.yoshiaki@lab.ntt.co.jp> <1268736839-27371-3-git-send-email-tamura.yoshiaki@lab.ntt.co.jp> <4B9F7D78.5090201@redhat.com> <4B9F8502.3070108@lab.ntt.co.jp> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org, anthony@codemonkey.ws, ohmura.kei@lab.ntt.co.jp To: Yoshiaki Tamura Return-path: Received: from mx1.redhat.com ([209.132.183.28]:44719 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S966201Ab0CPN3T (ORCPT ); Tue, 16 Mar 2010 09:29:19 -0400 In-Reply-To: <4B9F8502.3070108@lab.ntt.co.jp> Sender: kvm-owner@vger.kernel.org List-ID: On 03/16/2010 03:17 PM, Yoshiaki Tamura wrote: > Avi Kivity wrote: >> On 03/16/2010 12:53 PM, Yoshiaki Tamura wrote: >>> Modifies wrapper functions for byte-based phys_ram_dirty bitmap to >>> bit-based phys_ram_dirty bitmap, and adds more wrapper functions to >>> prevent >>> direct access to the phys_ram_dirty bitmap. >> >>> + >>> +static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr) >>> +{ >>> + unsigned long mask; >>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS; >>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1); >>> + int ret = 0; >>> + >>> + mask = 1UL<< offset; >>> + if (phys_ram_vga_dirty[index]& mask) >>> + ret |= VGA_DIRTY_FLAG; >>> + if (phys_ram_code_dirty[index]& mask) >>> + ret |= CODE_DIRTY_FLAG; >>> + if (phys_ram_migration_dirty[index]& mask) >>> + ret |= MIGRATION_DIRTY_FLAG; >>> + >>> + return ret; >>> } >>> >>> static inline int cpu_physical_memory_get_dirty(ram_addr_t addr, >>> int dirty_flags) >>> { >>> - return phys_ram_dirty[addr>> TARGET_PAGE_BITS]& dirty_flags; >>> + return cpu_physical_memory_get_dirty_flags(addr)& dirty_flags; >>> } >> >> This turns one cacheline access into three. If the dirty bitmaps were in >> an array, you could do >> >> return dirty_bitmaps[dirty_index][addr >> (TARGET_PAGE_BITS + >> BITS_IN_LONG)] & mask; >> >> with one cacheline access. > > If I'm understanding the existing code correctly, > int dirty_flags can be combined, like VGA + MIGRATION. > If we only have to worry about a single dirty flag, I agree with your > idea. From a quick grep it seems flags are not combined, except for something strange with CODE_DIRTY_FLAG: > static void notdirty_mem_writel(void *opaque, target_phys_addr_t ram_addr, > uint32_t val) > { > int dirty_flags; > dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS]; > if (!(dirty_flags & CODE_DIRTY_FLAG)) { > #if !defined(CONFIG_USER_ONLY) > tb_invalidate_phys_page_fast(ram_addr, 4); > dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS]; > #endif > } > stl_p(qemu_get_ram_ptr(ram_addr), val); > dirty_flags |= (0xff & ~CODE_DIRTY_FLAG); > phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags; > /* we remove the notdirty callback only if the code has been > flushed */ > if (dirty_flags == 0xff) > tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr); > } I can't say I understand what it does. > > On the other hand, qemu seems to require getting combined dirty flags. > If we introduce dirty bitmaps for each type, we need to access each > bitmap to get combined flags. I wasn't sure how to make this more > efficient... > >>> static inline void cpu_physical_memory_set_dirty(ram_addr_t addr) >>> { >>> - phys_ram_dirty[addr>> TARGET_PAGE_BITS] = 0xff; >>> + unsigned long mask; >>> + int index = (addr>> TARGET_PAGE_BITS) / HOST_LONG_BITS; >>> + int offset = (addr>> TARGET_PAGE_BITS)& (HOST_LONG_BITS - 1); >>> + >>> + mask = 1UL<< offset; >>> + phys_ram_vga_dirty[index] |= mask; >>> + phys_ram_code_dirty[index] |= mask; >>> + phys_ram_migration_dirty[index] |= mask; >>> +} >> >> This is also three cacheline accesses. I think we should have a master >> bitmap which is updated by set_dirty(), and which is or'ed into the >> other bitmaps when they are accessed. At least the vga and migration >> bitmaps are only read periodically, not randomly, so this would be very >> fast. In a way, this is similar to how the qemu bitmap is updated from >> the kvm bitmap today. > > Sounds good to me. > So we're going to introduce 4 (VGA, CODE, MIGRATION, master) bit-based > bitmaps in total. > Yeah, except CODE doesn't behave like the others. Would be best to understand what it's requirements are before making the change. Maybe CODE will need separate handling (so master will only feed VGA and MIGRATION). -- error compiling committee.c: too many arguments to function