From: Maxim Levitsky <mlevitsk@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: "Kevin Wolf" <kwolf@redhat.com>,
"Daniel P . Berrangé" <berrange@redhat.com>,
"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
qemu-stable <qemu-stable@nongnu.org>,
"Max Reitz" <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 1/3] block/qcow2: refactoring of threaded encryption code
Date: Tue, 10 Sep 2019 17:35:55 +0300 [thread overview]
Message-ID: <0d1bbda1fccb296746b5df3d5bde3ee9443a07b8.camel@redhat.com> (raw)
In-Reply-To: <a21dc942-404c-6fc5-08c5-88be1d01508b@virtuozzo.com>
On Tue, 2019-09-10 at 14:17 +0000, Vladimir Sementsov-Ogievskiy wrote:
> 10.09.2019 15:31, Maxim Levitsky wrote:
> > On Sat, 2019-09-07 at 19:08 +0000, Vladimir Sementsov-Ogievskiy wrote:
> > > 06.09.2019 22:57, Maxim Levitsky wrote:
> > > > This commit tries to clarify few function arguments,
> > > > and add comments describing the encrypt/decrypt interface
> > > >
> > > > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > > > ---
> > > > block/qcow2-cluster.c | 10 +++----
> > > > block/qcow2-threads.c | 61 ++++++++++++++++++++++++++++++++++---------
> > > > 2 files changed, 53 insertions(+), 18 deletions(-)
> > > >
> > > > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> > > > index f09cc992af..1989b423da 100644
> > > > --- a/block/qcow2-cluster.c
> > > > +++ b/block/qcow2-cluster.c
> > > > @@ -463,8 +463,8 @@ static int coroutine_fn do_perform_cow_read(BlockDriverState *bs,
> > > > }
> > > >
> > > > static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
> > > > - uint64_t src_cluster_offset,
> > > > - uint64_t cluster_offset,
> > > > + uint64_t guest_cluster_offset,
> > > > + uint64_t host_cluster_offset,
> > > > unsigned offset_in_cluster,
> > > > uint8_t *buffer,
> > > > unsigned bytes)
> > > > @@ -474,8 +474,8 @@ static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
> > > > assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
> > > > assert((bytes & ~BDRV_SECTOR_MASK) == 0);
> > > > assert(s->crypto);
> > > > - if (qcow2_co_encrypt(bs, cluster_offset,
> > > > - src_cluster_offset + offset_in_cluster,
> > > > + if (qcow2_co_encrypt(bs, host_cluster_offset,
> > > > + guest_cluster_offset + offset_in_cluster,
> > > > buffer, bytes) < 0) {
> > > > return false;
> > > > }
> > > > @@ -496,7 +496,7 @@ static int coroutine_fn do_perform_cow_write(BlockDriverState *bs,
> > > > }
> > > >
> > > > ret = qcow2_pre_write_overlap_check(bs, 0,
> > > > - cluster_offset + offset_in_cluster, qiov->size, true);
> > > > + cluster_offset + offset_in_cluster, qiov->size, true);
> > >
> > >
> > > Hmm, unrelated hunk.
> >
> > I was asked to do this to fix coding style, so that wrapped line,
> > is 4 characters shifted to the right.
>
> AFAIS, Eric asked only about qcow2_co_encdec calls and definition.. It's OK to fix style in code
> you touch in you patch anyway, but no reason to fix style somewhere else, it dirties patch for
> no reason, making it more difficult (a bit, but still) to review and more difficult to backport..
All right, then I'll drop that change. I kind of agree with this, but I also didn't mind doing these fixes
either.
>
> >
> > >
> > > > if (ret < 0) {
> > > > return ret;
> > > > }
> > > > diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
> > > > index 3b1e63fe41..c3cda0c6a5 100644
> > > > --- a/block/qcow2-threads.c
> > > > +++ b/block/qcow2-threads.c
> > > > @@ -234,15 +234,19 @@ static int qcow2_encdec_pool_func(void *opaque)
> > > > }
> > > >
> > > > static int coroutine_fn
> > > > -qcow2_co_encdec(BlockDriverState *bs, uint64_t file_cluster_offset,
> > > > - uint64_t offset, void *buf, size_t len, Qcow2EncDecFunc func)
> > > > +qcow2_co_encdec(BlockDriverState *bs, uint64_t host_cluster_offset,
> > > > + uint64_t guest_offset, void *buf, size_t len,
> > > > + Qcow2EncDecFunc func)
> > > > {
> > > > BDRVQcow2State *s = bs->opaque;
> > > > +
> > > > + uint64_t offset = s->crypt_physical_offset ?
> > > > + host_cluster_offset + offset_into_cluster(s, guest_offset) :
> > > > + guest_offset;
> > > > +
> > > > Qcow2EncDecData arg = {
> > > > .block = s->crypto,
> > > > - .offset = s->crypt_physical_offset ?
> > > > - file_cluster_offset + offset_into_cluster(s, offset) :
> > > > - offset,
> > > > + .offset = offset,
> > > > .buf = buf,
> > > > .len = len,
> > > > .func = func,
> > > > @@ -251,18 +255,49 @@ qcow2_co_encdec(BlockDriverState *bs, uint64_t file_cluster_offset,
> > > > return qcow2_co_process(bs, qcow2_encdec_pool_func, &arg);
> > > > }
> > > >
> > > > +
> > > > +/*
> > > > + * qcow2_co_encrypt()
> > > > + *
> > > > + * Encrypts one or more contiguous aligned sectors
> > > > + *
> > > > + * @host_cluster_offset - on disk offset of the first cluster in which
> > > > + * the encrypted data will be written
> > >
> > >
> > > It's not quite right, it's not on disk, but on .file child of qcow2 node, which
> > > may be any other format or protocol node.. So, I called it file_cluster_offset.
> > > But I'm OK with new naming anyway. And it may be better for encryption related
> > > logic..
> >
> > Yes, the .file is the underlying storage for both qcow2 metadata and the data,
> > and it is unlikely be another qcow2 file. Usually it will be a raw file,
> > accessed with some protocol.
> > I will change the wording to not include the 'disk' word though.
> >
> >
> > To be really honest, the best naming here would be one that follows the virtual memory concepts.
> > A virtual block/cluster address and a physical block/cluster address.
> > However we talked with Kevin recently and I also studied quite a lot of qcow2 code,
> > and the usual convention is guest cluster offset and host cluster offset,
> > and often guest offsets are just called offsets, which is very confusing IMHO.
>
> It don't confuse me, as it's an interface of block-layer. Interface user just writes at some offset.
> And the fact that internally, we consider interface parameter "offset" as "guest offset" and map it
> to some physical offset - it's an implementation details. In other words, guest don't know that it
> writes at "guest offset", it just writes at "offset" and don't care.
>
> >
> >
> > >
> > > > + * Used as an initialization vector for encryption
> > >
> > > Hmm, is it default now?
> >
> >
> > Most of block crypto implementations have IV which derive
> > some way or another from the sector address.
> >
> > From what I see, the block address is either used as is,
> > or encrypted itself with same encryption key,
> > and the result is used as IV. Even the legacy qcow
> > encryption uses this, although it uses the virtual block
> > address, and apparently this is one of its security flaws.
> >
> > If you don't use any IV, you end up with major security
> > hole - sectors of the same content will be encrypted
> > to the same cipertext.
> >
> > I added this comment to clarify the usage of offset,
> > since other that this aspect of IV generation,
> > the crypto routines only need the data to be encrypted and
> > the encryption key which is stored in the crypto context.
> >
> >
> > >
> > > > + *
> > > > + * @guest_offset - guest (virtual) offset of the first sector of the
> > > > + * data to be encrypted
> > >
> > > Hmm, stop. It's wrong. Data to be encrypted is in buffer, so, it's not first sector of
> > > the data to be encrypted, but first sector in which guest writes data (to be encrypted
> > > in meantime).
> >
> > No no no!
> >
> > guest_offset is literally the guest's disk address of the first sector that is in the buffer.
> > The qcow2_co_encrypt is called from 2 places:
> >
> > 1. qcow2_co_pwritev_part - here indeed the actually guest written data
> > is encrypted, and the 'offset' is passed which is the offset on which pwritev was called
>
> I mean, that in this case your comment for me sounds like "data to be encrypted is on disk, at this offset".
> but it's not actually on disk, data is in buffer. And what is on disk we don't know, we are going to
> write...
>
> But I don't really care. Hope it's actually obvious for averyone, where is data on write path.
Ah, now I understand. I don't mind adding a word or two explicitly mentioning that the *data* to be
encrypted is in the given buffer.
>
> >
> >
> > 2. do_perform_cow_encrypt - here the just read data from before or after actually written guest
> > data is encrypted, and the guest_offset represents the address of that data.
> > I changed the do_perform_cow_encrypt, so that it receives from the caller the host and guest offset
> > of the data to be encrypted. It then aligns the host offset on start of the cluster, and passes
> > the guest offset as is, so that it does the same as qcow2_co_pwritev_part.
> >
> >
> > So what is wrong here?
> >
> > >
> > >
> > > > + * Used as an initialization vector for older, qcow2 native encryption
> > > > + *
> > > > + * @buf - buffer with the data to encrypt
> > > > + * @len - length of the buffer (in sector size multiplies)
> > > > + *
> > > > + * Note that the group of the sectors, don't have to be aligned
> > > > + * on cluster boundary and can also cross a cluster boundary.
> > >
> > > And I doubt in it now. I'm afraid that if we call qcow2_co_encrypt for a group
> > > of the sectors crossing a cluster boundary, we will finish up with similar bug: we'll
> > > use first cluster offset as a vector for all the sectors. We still never do it.. So,
> > > I think it worth assertion and corresponding comment.
> > >
> > > Or is it correct?
> >
> > Crypto code receives the data to be encrypted and decrypted,
> > and offset of first sector (512 bytes aligned) of that data (for IV calculation).
> > If the data spans multiple sectors (this happens already a lot)
>
> Ah, yes, that's OK, sorry.
>
> > then it able to handle this since the code works.
> > The relevant code is in do_qcrypto_block_cipher_encdec, and
> > is actually generic for all the crypto modes.
> >
> > So there should not be anything special about crossing the cluster
> > boundary, other that noting this here, just in case.
> >
> > No only that code can cross a cluster boundary, but it can even be
> > called for more that one cluster at once. It looks like qcow2 code limits all
> > the IO in case crypto is used to QCOW_MAX_CRYPT_CLUSTERS, which is 32
> > currently. I don't know why to be honest.
> >
> > Remember that crypto code has no notion of clusters. It works purely
> > on sector (512 bytes) level, and each sector will have its own IV calculated,
> > based on its sector address.
> >
> > >
> > > > + *
> > > > + *
> > > > + */
> > > > int coroutine_fn
> > > > -qcow2_co_encrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
> > > > - uint64_t offset, void *buf, size_t len)
> > > > +qcow2_co_encrypt(BlockDriverState *bs, uint64_t host_cluster_offset,
> > > > + uint64_t guest_offset, void *buf, size_t len)
> > > > {
> > > > - return qcow2_co_encdec(bs, file_cluster_offset, offset, buf, len,
> > > > - qcrypto_block_encrypt);
> > > > + return qcow2_co_encdec(bs, host_cluster_offset, guest_offset, buf, len,
> > > > + qcrypto_block_encrypt);
> > > > }
> > > >
> > > > +
> > > > +/*
> > > > + * qcow2_co_decrypt()
> > > > + *
> > > > + * Decrypts one or more contiguous aligned sectors
> > > > + * Same function as qcow2_co_encrypt
> > >
> > > Hmm, not exactly same :)
> >
> > I'll fix that.
> >
> >
> > >
> > > > + *
> > > > + */
> > > > +
> > > > int coroutine_fn
> > > > -qcow2_co_decrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
> > > > - uint64_t offset, void *buf, size_t len)
> > > > +qcow2_co_decrypt(BlockDriverState *bs, uint64_t host_cluster_offset,
> > > > + uint64_t guest_offset, void *buf, size_t len)
> > > > {
> > > > - return qcow2_co_encdec(bs, file_cluster_offset, offset, buf, len,
> > > > - qcrypto_block_decrypt);
> > > > + return qcow2_co_encdec(bs, host_cluster_offset, guest_offset, buf, len,
> > > > + qcrypto_block_decrypt);
> > > > }
> > > >
Best regards,
Maxim Levitsky
next prev parent reply other threads:[~2019-09-10 14:37 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-06 19:57 [Qemu-devel] [PATCH v2 0/3] Fix qcow2+luks corruption introduced by commit 8ac0f15f335 Maxim Levitsky
2019-09-06 19:57 ` [Qemu-devel] [PATCH v2 1/3] block/qcow2: refactoring of threaded encryption code Maxim Levitsky
2019-09-07 19:08 ` Vladimir Sementsov-Ogievskiy
2019-09-10 12:31 ` Maxim Levitsky
2019-09-10 14:17 ` Vladimir Sementsov-Ogievskiy
2019-09-10 14:35 ` Maxim Levitsky [this message]
2019-09-06 19:57 ` [Qemu-devel] [PATCH v2 2/3] block/qcow2: fix the corruption when rebasing luks encrypted files Maxim Levitsky
2019-09-07 18:38 ` Vladimir Sementsov-Ogievskiy
2019-09-06 19:57 ` [Qemu-devel] [PATCH v2 3/3] qemu-iotests: Add test for bz #1745922 Maxim Levitsky
2019-09-09 10:35 ` Daniel P. Berrangé
2019-09-10 11:01 ` Maxim Levitsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0d1bbda1fccb296746b5df3d5bde3ee9443a07b8.camel@redhat.com \
--to=mlevitsk@redhat.com \
--cc=berrange@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=qemu-stable@nongnu.org \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).