From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55275) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bBIVb-0001Oa-Ku for qemu-devel@nongnu.org; Fri, 10 Jun 2016 05:13:44 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bBIVW-0004IF-IX for qemu-devel@nongnu.org; Fri, 10 Jun 2016 05:13:42 -0400 Received: from mx-v6.kamp.de ([2a02:248:0:51::16]:46689 helo=mx01.kamp.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bBIVW-0004I6-86 for qemu-devel@nongnu.org; Fri, 10 Jun 2016 05:13:38 -0400 Message-ID: <575A84BF.6030108@kamp.de> Date: Fri, 10 Jun 2016 11:13:35 +0200 From: Peter Lieven MIME-Version: 1.0 References: <1464079240-24362-1-git-send-email-pl@kamp.de> <5744DF5F.3070600@redhat.com> <574BDEA0.2040708@kamp.de> In-Reply-To: <574BDEA0.2040708@kamp.de> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH V3] block/iscsi: allow caching of the allocation map List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Eric Blake , qemu-block@nongnu.org Cc: pbonzini@redhat.com, famz@redhat.com, qemu-devel@nongnu.org Am 30.05.2016 um 08:33 schrieb Peter Lieven: > Am 25.05.2016 um 01:10 schrieb Eric Blake: >> On 05/24/2016 02:40 AM, Peter Lieven wrote: >>> until now the allocation map was used only as a hint if a cluster >>> is allocated or not. If a block was not allocated (or Qemu had >>> no info about the allocation status) a get_block_status call was >>> issued to check the allocation status and possibly avoid >>> a subsequent read of unallocated sectors. If a block known to be >>> allocated the get_block_status call was omitted. In the other case >>> a get_block_status call was issued before every read to avoid >>> the necessity for a consistent allocation map. To avoid the >>> potential overhead of calling get_block_status for each and >>> every read request this took only place for the bigger requests. >>> >>> This patch enhances this mechanism to cache the allocation >>> status and avoid calling get_block_status for blocks where >>> the allocation status has been queried before. This allows >>> for bypassing the read request even for smaller requests and >>> additionally omits calling get_block_status for known to be >>> unallocated blocks. >>> >>> Signed-off-by: Peter Lieven >>> --- >>> +static int iscsi_allocmap_init(IscsiLun *iscsilun, int open_flags) >>> { >>> - if (iscsilun->allocationmap == NULL) { >>> - return; >>> + iscsi_allocmap_free(iscsilun); >>> + >>> + iscsilun->allocmap_size = >>> + DIV_ROUND_UP(sector_lun2qemu(iscsilun->num_blocks, iscsilun), >>> + iscsilun->cluster_sectors); >>> + >> Computes: ceiling( (num_blocks * block_size / 512) / (cluster_size / >> 512) ); aka number of clusters. But we don't independently track the >> cluster size, so I don't see any simpler way of writing it, even if we >> could be more efficient by not having to first scale through qemu's >> sector size. >> >>> + iscsilun->allocmap = bitmap_try_new(iscsilun->allocmap_size); >>> + if (!iscsilun->allocmap) { >>> + return -ENOMEM; >>> + } >>> + >>> + if (open_flags & BDRV_O_NOCACHE) { >>> + /* in case that cache.direct = on all allocmap entries are >>> + * treated as invalid to force a relookup of the block >>> + * status on every read request */ >>> + return 0; >> Can we cache that we are opened BDRV_O_NOCACHE, so that we don't even >> have to bother allocating allocmap when we know we are never changing >> its bits? In other words, can you swap this to be before the >> bitmap_try_new()? > > The idea of the allocmap in cache.direct = on mode is that we can > still speed up block jobs by skipping large unallocated areas. In this case > the allocmap has only a hint character. If we don't know the status > we issue a get_block_status request and verify the status. If its unallocated > we return zeroes. If we new through an earlier get block status request > that the area is allocated we can skip the useless get_block_status request. > This is the old behaviour without this patch. > >> >>> + } >>> + >>> + iscsilun->allocmap_valid = bitmap_try_new(iscsilun->allocmap_size); >>> + if (!iscsilun->allocmap_valid) { >>> + /* if we are under memory pressure free the allocmap as well */ >>> + iscsi_allocmap_free(iscsilun); >>> + return -ENOMEM; >>> } >>> - bitmap_set(iscsilun->allocationmap, >>> - sector_num / iscsilun->cluster_sectors, >>> - DIV_ROUND_UP(nb_sectors, iscsilun->cluster_sectors)); >>> + >>> + return 0; >>> } >>> -static void iscsi_allocationmap_clear(IscsiLun *iscsilun, int64_t sector_num, >>> - int nb_sectors) >>> +static void >>> +iscsi_allocmap_update(IscsiLun *iscsilun, int64_t sector_num, >>> + int nb_sectors, bool allocated, bool valid) >>> { >>> int64_t cluster_num, nb_clusters; >>> - if (iscsilun->allocationmap == NULL) { >>> + >>> + if (iscsilun->allocmap == NULL) { >>> return; >>> } >> Here, you are short-circuiting when there is no allocmap, but shouldn't >> you also short-circuit if you are BDRV_O_NOCACHE? >> >> Should you assert(!(allocated && !valid)) [or by deMorgan's, >> assert(!allocated || valid)], to make sure we are only tracking 3 states >> rather than 4? > > sure, we thats a good enhancement altough allocated and invalid doesn't > hurt. > >> >>> cluster_num = DIV_ROUND_UP(sector_num, iscsilun->cluster_sectors); >>> nb_clusters = (sector_num + nb_sectors) / iscsilun->cluster_sectors >>> - cluster_num; >>> - if (nb_clusters > 0) { >>> - bitmap_clear(iscsilun->allocationmap, cluster_num, nb_clusters); >>> + if (allocated) { >>> + bitmap_set(iscsilun->allocmap, >>> + sector_num / iscsilun->cluster_sectors, >>> + DIV_ROUND_UP(nb_sectors, iscsilun->cluster_sectors)); >> This says that if we have a sub-cluster request, we can round out to >> cluster alignment (if our covered part of the cluster is allocated, >> presumably the rest is allocated as well)... >> >>> + } else if (nb_clusters > 0) { >>> + bitmap_clear(iscsilun->allocmap, cluster_num, nb_clusters); >> ...while this says if we are marking something clear, we have to round >> in (while we can trim the aligned middle, we should not mark any >> unaligned head or tail as trimmed, in case they remain allocated due to >> the unvisited sectors). Still, it may be worth comments for why your >> rounding between the two calls is different. > > Okay > >> >>> + } >>> + >>> + if (iscsilun->allocmap_valid == NULL) { >>> + return; >>> + } >> When do we ever have allocmap set but allocmap_valid clear? Isn't it >> easier to assume that both maps are present (we are tracking status) or >> absent (we are BDRV_O_NOCACHE)? > > Thats the current behaviour. See above. Ping. Peter