From: Juan Quintela
In-Reply-To: <1364291919-19563-1-git-send-email-pl@kamp.de> (Peter Lieven's message of "Tue, 26 Mar 2013 10:58:29 +0100")
References: <1364291919-19563-1-git-send-email-pl@kamp.de>
Date: Tue, 26 Mar 2013 11:46:46 +0100
Message-ID: <87boa6o2uh.fsf@elfo.elfo>
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [Qemu-devel] [PATCHv5 00/10] buffer_is_zero / migration optimizations
Reply-To: quintela@redhat.com
To: Peter Lieven
Cc: Orit Wasserman, Paolo Bonzini, qemu-devel@nongnu.org, Stefan Hajnoczi

Peter Lieven wrote:
> this is v5 of my patch series with various optimizations in
> zero buffer checking and migration tweaks.
>
> thanks especially to Eric Blake, Orit Wassermann and Paolo Bonzini
> for reviewing.
>
> v5:
> - move zero splat vector to a different patch
> - fix indentation of can_user_buffer_find_nonzero_offset()
> - do not unroll the first loop in buffer_find_nonzero_offset()
>   to optimize it for zero page checking
> - use an older unrolled version of find_next_bit() without
>   SIMD instruction as there is no evidence that the vectorized
>   version is better if not even worse and the code is easier
>   to understand.
> - added a word in the commit message of patch 8
>   about the skipped pages field in QMP MigrationStats.
> - fixed the order of key-value pairs of MigrationStats in
>   qapi-schema.json
> - updated info about the performance benefit of is_zero_page()
>   to the latest benchmark results in the commit message.
>
> v4:
> - do not inline buffer_find_nonzero_offset()
> - inline can_usebuffer_find_nonzero_offset() correctly
> - readd asserts in buffer_find_nonzero_offset() as profiling
>   shows they do not hurt.
> - change last occurrences of scalar 8 by
>   BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
> - avoid dereferencing p already in patch 5 where we
>   know that the page (p) is zero
> - explicitly set bytes_sent = 0 if we skip a zero page.
>   bytes_sent was 0 before, but it was not obvious.
> - add accounting information for skipped zero pages
> - fix errors reported by checkpatch.pl
>
> v3:
> - remove asserts, inline functions and add a check
>   function if buffer_find_nonzero_offset() can be used.
> - use above check function in buffer_is_zero() and
>   find_next_bit().
> - use buffer_is_nonzero_offset() directly to find
>   zero pages. we know that all requirements are met
>   for memory pages.
> - fix C89 violation in buffer_is_zero().
> - avoid dereferencing p in ram_save_block() if we already
>   know the page is zero.
> - fix initialization of last_offset in reset_ram_globals().
> - avoid skipping pages with offset == 0 in bulk stage in
>   migration_bitmap_find_and_reset_dirty().
> - compared to v1 check for zero pages also after bulk
>   ram migration as there are guests (e.g. Windows) which
>   zero out large amount of memory while running.
>
> v2:
> - fix description, add trivial zero check and add asserts
>   to buffer_find_nonzero_offset.
> - add a constant for the unroll factor of buffer_find_nonzero_offset
> - replace is_dup_page() by buffer_is_zero()
> - added test results to xbzrle patch
> - optimize descriptions
>
> Peter Lieven (10):
>   move vector definitions to qemu-common.h
>   add a zero splat vector to qemu-common.h
>   cutils: add a function to find non-zero content in a buffer
>   buffer_is_zero: use vector optimizations if possible
>   bitops: unroll while loop in find_next_bit()
>   migration: search for zero instead of dup pages
>   migration: add an indicator for bulk state of ram migration
>   migration: do not sent zero pages in bulk stage
>   migration: do not search dirty pages in bulk stage
>   migration: use XBZRLE only after bulk stage

Really nice series. I did some minor review changes, but they are more
nits than anything else.

Thanks, Juan.

Reviewed-by: Juan Quintela