From: Hanna Czenczek <hreitz@redhat.com>
To: Alexander Ivanov <alexander.ivanov@virtuozzo.com>, qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, den@virtuozzo.com, stefanha@redhat.com,
vsementsov@yandex-team.ru, kwolf@redhat.com
Subject: Re: [PATCH v5 3/5] parallels: Add checking and repairing duplicate offsets in BAT
Date: Fri, 2 Jun 2023 16:43:27 +0200 [thread overview]
Message-ID: <5cbe5958-e47b-8151-9d5e-a9ae0d572d3d@redhat.com> (raw)
In-Reply-To: <20230529151503.34006-4-alexander.ivanov@virtuozzo.com>
On 29.05.23 17:15, Alexander Ivanov wrote:
> Cluster offsets must be unique among all the BAT entries. Find duplicate
> offsets in the BAT and fix it by copying the content of the relevant
> cluster to a newly allocated cluster and set the new cluster offset to the
> duplicated entry.
>
> Add host_cluster_index() helper to deduplicate the code.
>
> Move parallels_fix_leak() call to parallels_co_check() to fix both types
> of leak: real corruption and a leak produced by allocate_clusters()
> during deduplication.
I’m not really a fan of splitting parallels_fix_leak() in this way. One
problem is that parallels_check_leak() still increments leaks_fixed,
even though it cannot know whether that will succeed. Would it be a
problem to move parallels_check_leak() after parallels_check_duplicate()?
> Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
> ---
> block/parallels.c | 138 ++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 133 insertions(+), 5 deletions(-)
>
> diff --git a/block/parallels.c b/block/parallels.c
> index 64850b9655..9fa1f93973 100644
> --- a/block/parallels.c
> +++ b/block/parallels.c
> @@ -136,6 +136,12 @@ static int cluster_remainder(BDRVParallelsState *s, int64_t sector_num,
> return MIN(nb_sectors, ret);
> }
>
> +static uint32_t host_cluster_index(BDRVParallelsState *s, int64_t off)
> +{
> + off -= le32_to_cpu(s->header->data_off) << BDRV_SECTOR_BITS;
> + return off / s->cluster_size;
> +}
> +
> static int64_t block_status(BDRVParallelsState *s, int64_t sector_num,
> int nb_sectors, int *pnum)
> {
> @@ -533,7 +539,6 @@ parallels_check_leak(BlockDriverState *bs, BdrvCheckResult *res,
> {
> BDRVParallelsState *s = bs->opaque;
> int64_t count, leak_size;
> - int ret;
>
> leak_size = parallels_get_leak_size(bs, res);
> if (leak_size < 0) {
> @@ -550,16 +555,127 @@ parallels_check_leak(BlockDriverState *bs, BdrvCheckResult *res,
> fix & BDRV_FIX_LEAKS ? "Repairing" : "ERROR", leak_size);
>
> if (fix & BDRV_FIX_LEAKS) {
> - ret = parallels_fix_leak(bs, res);
> - if (ret < 0) {
> - return ret;
> - }
> res->leaks_fixed += count;
> }
>
> return 0;
> }
>
> +static int parallels_check_duplicate(BlockDriverState *bs,
> + BdrvCheckResult *res,
> + BdrvCheckMode *fix)
> +{
> + BDRVParallelsState *s = bs->opaque;
> + QEMUIOVector qiov;
> + int64_t off, sector;
> + unsigned long *bitmap;
> + uint32_t i, bitmap_size, cluster_index;
> + int n, ret = 0;
> + uint64_t *buf = NULL;
> +
> + /*
> + * Create a bitmap of used clusters.
> + * If a bit is set, there is a BAT entry pointing to this cluster.
> + * Loop through the BAT entries, check bits relevant to an entry offset.
> + * If bit is set, this entry is duplicated. Otherwise set the bit.
> + *
> + * We shouldn't worry about newly allocated clusters outside the image
> + * because they are created higher then any existing cluster pointed by
> + * a BAT entry.
> + */
> + bitmap_size = host_cluster_index(s, res->image_end_offset);
> + if (bitmap_size == 0) {
> + return 0;
> + }
> +
> + bitmap = bitmap_new(bitmap_size);
> +
> + buf = qemu_memalign(4096, s->cluster_size);
There is qemu_blockalign(), which actually uses the BDS’s memory
alignment, so should be a better fit.
> + qemu_iovec_init(&qiov, 0);
> + qemu_iovec_add(&qiov, buf, s->cluster_size);
I don’t think this is necessary, there is bdrv_co_pwrite() that takes a
buffer. OTOH, if you want to keep this, you could replace the
bdrv_co_pread() call by bdrv_co_preadv().
> +
> + for (i = 0; i < s->bat_size; i++) {
> + off = bat2sect(s, i) << BDRV_SECTOR_BITS;
> + if (off == 0) {
> + continue;
> + }
> +
> + cluster_index = host_cluster_index(s, off);
> + if (test_bit(cluster_index, bitmap)) {
I understand we’ve already ensured that image_end_offset (which
determines the bitmap size) is large enough, and so this can’t overflow,
but I could sleep better if there was an `assert(cluster_index <
bitmap_size);` before this `test_bit()`.
> + /* this cluster duplicates another one */
> + fprintf(stderr,
> + "%s duplicate offset in BAT entry %u\n",
> + *fix & BDRV_FIX_ERRORS ? "Repairing" : "ERROR", i);
> +
> + res->corruptions++;
> +
> + if (*fix & BDRV_FIX_ERRORS) {
> + /*
> + * Reset the entry and allocate a new cluster
> + * for the relevant guest offset. In this way we let
> + * the lower layer to place the new cluster properly.
> + * Copy the original cluster to the allocated one.
> + */
> + parallels_set_bat_entry(s, i, 0);
As far as I understand, this will modify the image content when read.
Can we perhaps revert this change if there’s an error in bdrv_co_pread()
or allocate_clusters()? I understand that a double allocation is a bad
corruption, but if we can’t fix it because of some unexpected error, I
feel like it’s better to still keep the image in the same state as
before rather than having effectively destroyed the data in the
respective cluster, so users can at least try to fix it by copying it.
> +
> + ret = bdrv_co_pread(bs->file, off, s->cluster_size, buf, 0);
> + if (ret < 0) {
> + res->check_errors++;
> + goto out;
> + }
> +
> + sector = (i * (int64_t)s->cluster_size) >> BDRV_SECTOR_BITS;
Both for my own understanding and to maybe suggest a simplification:
This is the same as `sector = i * (int64_t)s->tracks`, right?
Hanna
> + sector = allocate_clusters(bs, sector, s->tracks, &n);
> + if (sector < 0) {
> + res->check_errors++;
> + ret = sector;
> + goto out;
> + }
next prev parent reply other threads:[~2023-06-02 14:43 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-29 15:14 [PATCH v5 0/5] parallels: Add duplication check, repair at open, fix bugs Alexander Ivanov
2023-05-29 15:14 ` [PATCH v5 1/5] parallels: Incorrect data end calculation in parallels_open() Alexander Ivanov
2023-06-02 14:44 ` Hanna Czenczek
2023-05-29 15:15 ` [PATCH v5 2/5] parallels: Split image leak handling to separate check and fix helpers Alexander Ivanov
2023-06-02 14:08 ` Hanna Czenczek
2023-06-05 13:13 ` Alexander Ivanov
2023-05-29 15:15 ` [PATCH v5 3/5] parallels: Add checking and repairing duplicate offsets in BAT Alexander Ivanov
2023-06-02 14:43 ` Hanna Czenczek [this message]
2023-06-05 16:55 ` Alexander Ivanov
2023-05-29 15:15 ` [PATCH v5 4/5] parallels: Replace fprintf by qemu_log in check Alexander Ivanov
2023-06-02 14:48 ` Hanna Czenczek
2023-06-09 10:36 ` Alexander Ivanov
2023-06-09 10:59 ` Peter Maydell
2023-05-29 15:15 ` [PATCH v5 5/5] parallels: Image repairing in parallels_open() Alexander Ivanov
2023-06-02 14:59 ` Hanna Czenczek
2023-06-09 13:21 ` Alexander Ivanov
2023-06-09 13:41 ` Hanna Czenczek
2023-06-11 14:45 ` Alexander Ivanov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5cbe5958-e47b-8151-9d5e-a9ae0d572d3d@redhat.com \
--to=hreitz@redhat.com \
--cc=alexander.ivanov@virtuozzo.com \
--cc=den@virtuozzo.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@yandex-team.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).