Re: [Qemu-devel] blobstore disk format (was Re: Design of the blobstore)

From: "Michael S. Tsirkin" <mst@redhat.com>
To: Stefan Berger <stefanb@linux.vnet.ibm.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Markus Armbruster <armbru@redhat.com>
Subject: Re: [Qemu-devel] blobstore disk format (was Re: Design of the blobstore)
Date: Fri, 16 Sep 2011 17:44:43 +0300
Message-ID: <20110916144443.GB20933@redhat.com>
In-Reply-To: <4E720CA9.9050208@linux.vnet.ibm.com>

On Thu, Sep 15, 2011 at 10:33:13AM -0400, Stefan Berger wrote:
> On 09/15/2011 08:28 AM, Michael S. Tsirkin wrote:
> >So the below is a proposal for a directory scheme
> >for storing (optionally multiple) nvram images,
> >along with any metadata.
> >Data is encoded using BER:
> >http://en.wikipedia.org/wiki/Basic_Encoding_Rules
> >Specifically, we mostly use the subsets.
> >
> Would it change anything if we were to think of the NVRAM image as
> another piece of metadata?

Yes, we can do that, sure. My feeling was that it would help to lay
out the image at the end of each entry, to make directory listing
more efficient: the rest of the metadata is usually small, while
the image might be somewhat large.
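
To illustrate the listing point, here is a rough sketch in Python of
walking a buffer of definite-length (DER-style) entries by reading only
the tag and length headers. The function names are made up for this
example, and it assumes single-byte tags; it is not code from QEMU.

```python
def read_tlv_header(buf, pos):
    """Return (tag, length, header_size) for a definite-length TLV at pos.

    Assumes a single-byte tag (tag number below 31).
    """
    tag = buf[pos]
    first = buf[pos + 1]
    if first < 0x80:                 # short form: the byte is the length
        return tag, first, 2
    n = first & 0x7F                 # long form: next n bytes hold the length
    if n == 0:
        raise ValueError("indefinite length is not handled in this sketch")
    length = int.from_bytes(buf[pos + 2:pos + 2 + n], "big")
    return tag, length, 2 + n


def list_entries(buf):
    """Yield (offset, tag, length) for each top-level TLV in buf.

    The value bytes are never inspected: the length alone tells us
    where the next entry starts.
    """
    pos = 0
    while pos < len(buf):
        tag, length, hdr = read_tlv_header(buf, pos)
        yield pos, tag, length
        pos += hdr + length          # jump over the value
```

With a file instead of an in-memory buffer, the last step would be a
seek, so a large image at the end of an entry costs nothing to skip.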

> I am also wondering whether each device shouldn't just handle the
> metadata itself,

It could, but that just means we will have custom code with
different bugs in each device.
Note that, from experience with formats, the problem becomes
less trivial over time than it first seems, as we
need to provide forward and backward compatibility
guarantees.

> so generate a blob from data structures containing
> all the metadata it needs, arranging attribute and value pairs
> itself (maybe using some convenience function for
> serialization/deserialization) and let the NVRAM layer not handle
> the metadata at all but only blobs, their maximum sizes, actual
> sizes

Actual size seems to be a TPM specific thing.

> encryption, integrity value (crc32 or sha1) and so on. What
> metadata should there be that really need to be handled on the NVRAM
> API and below level rather than on the device-specific code level?

So the checksum (its value and type) 'and so on' are what I call
metadata :) Doing it at the device level seems wrong.

> >We use a directory as a SET in a CER format.
> >This allows generating directory online without scanning
> >the entries beforehand.
> >
> I guess it is the 'unknown' for me... but what is the advantage of
> using ASN1 for this rather than just writing out packed and
> endianess-normalized data structures (with revision value),

If you want an example of where this 'custom formats are easy
so let us write one' leads in the end,
look no further than the live migration code.
It's a mess of hacks that does not even work across
upstream qemu versions, let alone across
downstreams (different linux distros).

> having
> them crc32-protected to have some sanity checking in place?
> 
>     Stefan

I'm not sure why we want a crc specifically in TPM.
If it is 'just because we can' then it probably
applies to other non-volatile storage too?
To storage generally?

> >The rest of the encoding uses a DER format.
> >This makes for fast parsing as entries are easy to skip.
> >
> >Each entry is encoded in DER format.
> >Each entry is a SEQUENCE with two objects:
> >1. nvram
> >2. optional name - a UTF8String
> >
> >Binary data is stored as OCTET-STRING values on disk.
> >Any RW metadata is stored as OCTET-STRING value as well.
> >Any RO metadata is stored in appropriate universal encoding,
> >by type.
> >
> >In the context below, an attribute is either an IA5String or a SEQUENCE.
> >If IA5String, this is the attribute name, and it has no value.
> >If SEQUENCE, the first entry in the sequence is an
> >IA5String, it is the attribute name. The rest of the entries
> >represent the attribute value.
> >
> >Mandatory/optional attributes: depends on type.
> >tpm will have realsize as RW mandatory attribute.
> >
> >Each nvram is built as a SEQUENCE including 4 objects
> >1. type - an IA5String. downstreams can use other types such as
> >                      UUIDs instead to ensure no conflicts with upstream
> >2. SET of mandatory attributes
> >3. SET of optional attributes
> >4. data - a RW OCTET-STRING
> >
> >It is envisioned that attributes won't be too large,
> >so they can easily be kept in memory.
> >
> >
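
For concreteness, the layout quoted above could be encoded roughly as
follows. This is only a sketch under my own assumptions: the helper
names are invented, realsize is shown as a 4-byte big-endian
OCTET-STRING (the proposal does not fix its width), and sorting SET
members by their encoded bytes only approximates the DER ordering
rules. The tag values are the standard universal ones.

```python
# Universal tag bytes (constructed bit already set for SEQUENCE and SET).
OCTET_STRING, UTF8STRING, IA5STRING = 0x04, 0x0C, 0x16
SEQUENCE, SET = 0x30, 0x31


def der_len(n):
    """Encode a definite length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def tlv(tag, content):
    """Build one tag-length-value triplet."""
    return bytes([tag]) + der_len(len(content)) + content


def attribute(name, *values):
    """A bare IA5String if there is no value, else SEQUENCE(name, values...)."""
    name_tlv = tlv(IA5STRING, name.encode("ascii"))
    if not values:
        return name_tlv
    return tlv(SEQUENCE, name_tlv + b"".join(values))


def der_set(members):
    """SET of already-encoded members, sorted (approximation of DER rules)."""
    return tlv(SET, b"".join(sorted(members)))


def nvram(type_name, mandatory, optional, data):
    """SEQUENCE of type, mandatory attrs, optional attrs, data (data last)."""
    return tlv(SEQUENCE,
               tlv(IA5STRING, type_name.encode("ascii"))
               + der_set(mandatory)
               + der_set(optional)
               + tlv(OCTET_STRING, data))


def entry(nvram_blob, name=None):
    """Directory entry: SEQUENCE of nvram and an optional UTF8String name."""
    body = nvram_blob
    if name is not None:
        body += tlv(UTF8STRING, name.encode("utf-8"))
    return tlv(SEQUENCE, body)


def directory(entries):
    """SET with indefinite length (0x80), closed by end-of-contents octets.

    No total length is needed up front, so entries can be appended as
    they are produced.
    """
    return bytes([SET, 0x80]) + b"".join(entries) + b"\x00\x00"
```

A TPM entry would then be built with something like
entry(nvram("tpm", [attribute("realsize", tlv(OCTET_STRING, size))], [], image), "vtpm0"),
where size and image are byte strings supplied by the caller.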
