Re: [Qemu-devel] [PATCH V3] block/iscsi: allow caching of the allocation map

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Peter Lieven <pl@kamp.de>
To: Eric Blake <eblake@redhat.com>, qemu-block@nongnu.org
Cc: pbonzini@redhat.com, famz@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH V3] block/iscsi: allow caching of the allocation map
Date: Fri, 10 Jun 2016 11:13:35 +0200	[thread overview]
Message-ID: <575A84BF.6030108@kamp.de> (raw)
In-Reply-To: <574BDEA0.2040708@kamp.de>

Am 30.05.2016 um 08:33 schrieb Peter Lieven:
> Am 25.05.2016 um 01:10 schrieb Eric Blake:
>> On 05/24/2016 02:40 AM, Peter Lieven wrote:
>>> until now the allocation map was used only as a hint if a cluster
>>> is allocated or not. If a block was not allocated (or Qemu had
>>> no info about the allocation status) a get_block_status call was
>>> issued to check the allocation status and possibly avoid
>>> a subsequent read of unallocated sectors. If a block known to be
>>> allocated the get_block_status call was omitted. In the other case
>>> a get_block_status call was issued before every read to avoid
>>> the necessity for a consistent allocation map. To avoid the
>>> potential overhead of calling get_block_status for each and
>>> every read request this took only place for the bigger requests.
>>>
>>> This patch enhances this mechanism to cache the allocation
>>> status and avoid calling get_block_status for blocks where
>>> the allocation status has been queried before. This allows
>>> for bypassing the read request even for smaller requests and
>>> additionally omits calling get_block_status for known to be
>>> unallocated blocks.
>>>
>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>> ---
>>> +static int iscsi_allocmap_init(IscsiLun *iscsilun, int open_flags)
>>>   {
>>> -    if (iscsilun->allocationmap == NULL) {
>>> -        return;
>>> +    iscsi_allocmap_free(iscsilun);
>>> +
>>> +    iscsilun->allocmap_size =
>>> +        DIV_ROUND_UP(sector_lun2qemu(iscsilun->num_blocks, iscsilun),
>>> +                     iscsilun->cluster_sectors);
>>> +
>> Computes: ceiling( (num_blocks * block_size / 512) / (cluster_size /
>> 512) ); aka number of clusters.  But we don't independently track the
>> cluster size, so I don't see any simpler way of writing it, even if we
>> could be more efficient by not having to first scale through qemu's
>> sector size.
>>
>>> +    iscsilun->allocmap = bitmap_try_new(iscsilun->allocmap_size);
>>> +    if (!iscsilun->allocmap) {
>>> +        return -ENOMEM;
>>> +    }
>>> +
>>> +    if (open_flags & BDRV_O_NOCACHE) {
>>> +        /* in case that cache.direct = on all allocmap entries are
>>> +         * treated as invalid to force a relookup of the block
>>> +         * status on every read request */
>>> +        return 0;
>> Can we cache that we are opened BDRV_O_NOCACHE, so that we don't even
>> have to bother allocating allocmap when we know we are never changing
>> its bits?  In other words, can you swap this to be before the
>> bitmap_try_new()?
>
> The idea of the allocmap in cache.direct = on mode is that we can
> still speed up block jobs by skipping large unallocated areas. In this case
> the allocmap has only a hint character. If we don't know the status
> we issue a get_block_status request and verify the status. If its unallocated
> we return zeroes. If we new through an earlier get block status request
> that the area is allocated we can skip the useless get_block_status request.
> This is the old behaviour without this patch.
>
>>
>>> +    }
>>> +
>>> +    iscsilun->allocmap_valid = bitmap_try_new(iscsilun->allocmap_size);
>>> +    if (!iscsilun->allocmap_valid) {
>>> +        /* if we are under memory pressure free the allocmap as well */
>>> +        iscsi_allocmap_free(iscsilun);
>>> +        return -ENOMEM;
>>>       }
>>> -    bitmap_set(iscsilun->allocationmap,
>>> -               sector_num / iscsilun->cluster_sectors,
>>> -               DIV_ROUND_UP(nb_sectors, iscsilun->cluster_sectors));
>>> +
>>> +    return 0;
>>>   }
>>>   -static void iscsi_allocationmap_clear(IscsiLun *iscsilun, int64_t sector_num,
>>> -                                      int nb_sectors)
>>> +static void
>>> +iscsi_allocmap_update(IscsiLun *iscsilun, int64_t sector_num,
>>> +                      int nb_sectors, bool allocated, bool valid)
>>>   {
>>>       int64_t cluster_num, nb_clusters;
>>> -    if (iscsilun->allocationmap == NULL) {
>>> +
>>> +    if (iscsilun->allocmap == NULL) {
>>>           return;
>>>       }
>> Here, you are short-circuiting when there is no allocmap, but shouldn't
>> you also short-circuit if you are BDRV_O_NOCACHE?
>>
>> Should you assert(!(allocated && !valid)) [or by deMorgan's,
>> assert(!allocated || valid)], to make sure we are only tracking 3 states
>> rather than 4?
>
> sure, we thats a good enhancement altough allocated and invalid doesn't
> hurt.
>
>>
>>>       cluster_num = DIV_ROUND_UP(sector_num, iscsilun->cluster_sectors);
>>>       nb_clusters = (sector_num + nb_sectors) / iscsilun->cluster_sectors
>>>                     - cluster_num;
>>> -    if (nb_clusters > 0) {
>>> -        bitmap_clear(iscsilun->allocationmap, cluster_num, nb_clusters);
>>> +    if (allocated) {
>>> +        bitmap_set(iscsilun->allocmap,
>>> +                   sector_num / iscsilun->cluster_sectors,
>>> +                   DIV_ROUND_UP(nb_sectors, iscsilun->cluster_sectors));
>> This says that if we have a sub-cluster request, we can round out to
>> cluster alignment (if our covered part of the cluster is allocated,
>> presumably the rest is allocated as well)...
>>
>>> +    } else if (nb_clusters > 0) {
>>> +        bitmap_clear(iscsilun->allocmap, cluster_num, nb_clusters);
>> ...while this says if we are marking something clear, we have to round
>> in (while we can trim the aligned middle, we should not mark any
>> unaligned head or tail as trimmed, in case they remain allocated due to
>> the unvisited sectors).  Still, it may be worth comments for why your
>> rounding between the two calls is different.
>
> Okay
>
>>
>>> +    }
>>> +
>>> +    if (iscsilun->allocmap_valid == NULL) {
>>> +        return;
>>> +    }
>> When do we ever have allocmap set but allocmap_valid clear?  Isn't it
>> easier to assume that both maps are present (we are tracking status) or
>> absent (we are BDRV_O_NOCACHE)?
>
> Thats the current behaviour. See above.

Ping.

Peter

next prev parent reply	other threads:[~2016-06-10  9:13 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-24  8:40 [Qemu-devel] [PATCH V3] block/iscsi: allow caching of the allocation map Peter Lieven
2016-05-24 23:10 ` Eric Blake
2016-05-30  6:33   ` Peter Lieven
2016-06-10  9:13     ` Peter Lieven [this message]
2016-06-13  9:53     ` Paolo Bonzini
2016-06-12  5:04 ` Fam Zheng
2016-06-13  9:45 ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=575A84BF.6030108@kamp.de \
    --to=pl@kamp.de \
    --cc=eblake@redhat.com \
    --cc=famz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).