From: Anthony Liguori <anthony@codemonkey.ws>
To: Stefan Berger <stefanb@linux.vnet.ibm.com>,
qemu-devel <qemu-devel@nongnu.org>,
"Michael S. Tsirkin" <mst@redhat.com>
Cc: Joel Schopp <jschopp@linux.vnet.ibm.com>,
Corey Bryant <coreyb@linux.vnet.ibm.com>,
Michael Roth <mdroth@linux.vnet.ibm.com>,
Stefan Hajnoczi <stefanha@gmail.com>
Subject: Re: [Qemu-devel] vNVRAM / blobstore design
Date: Mon, 25 Mar 2013 17:05:13 -0500
Message-ID: <87ehf3nnja.fsf@codemonkey.ws>
In-Reply-To: <5150C415.9030302@linux.vnet.ibm.com>
Stefan Berger <stefanb@linux.vnet.ibm.com> writes:
> [argh, just posted this to qemu-trivial -- it's not trivial]
>
>
> Hello!
>
> I am posting this message to revive the previous discussions about the
> design of vNVRAM / blobstore cc'ing (at least) those that participated
> in this discussion 'back then'.
>
> The first goal of the implementation is to provide a vNVRAM storage for
> a software implementation of a TPM to store its different blobs into.
> Some of the data that the TPM writes into persistent memory needs to
> survive a power down / power up cycle of a virtual machine, therefore
> this type of persistent storage is needed. For the vNVRAM not to become
> a road-block for VM migration, we would make use of block device
> migration and layer the vNVRAM on top of the block device, therefore
> using virtual machine images for storing the vNVRAM data.
>
> Besides the TPM blobs, the vNVRAM should of course also be able to
> accommodate other use cases where persistent data is stored into
> NVRAM,
Well, let's focus more on the "blob store". What are the semantics of
this? Is there a max number of blobs? Are the sizes fixed or variable?
How often are new blobs added/removed?
Regards,
Anthony Liguori
> BBRAM (battery backed-up RAM) or EEPROM. As far as I know more recent
> machines with UEFI also have such types of persistent memory. I believe
> the current design of the vNVRAM layer accommodates other use cases as
> well, though additional 'glue devices' would need to be implemented to
> interact with this vNVRAM layer.
>
> Here is a reference to the previous discussion:
>
> http://lists.gnu.org/archive/html/qemu-devel/2011-09/msg01791.html
> http://lists.gnu.org/archive/html/qemu-devel/2011-09/msg01967.html
>
> Two aspects of the vNVRAM seem of primary interest: its API and how the
> data is organized in the virtual machine image, leaving its inner
> workings aside for now.
>
>
> API of the vNVRAM:
> ------------------
>
> The following functions and data structures are important for devices:
>
>
> enum NVRAMRWOp {
>     NVRAM_OP_READ,
>     NVRAM_OP_WRITE,
>     NVRAM_OP_FORMAT
> };
>
> /**
> * Callback function a device must provide for reading and writing
> * of blobs as well as for indicating to the NVRAM layer the maximum
> * blob size of the given entry. Due to the layout of the data in the
> * NVRAM, the device must always write a blob with the size indicated
> * during formatting.
> * @op: Indication of the NVRAM operation
> * @v: Input visitor in case of a read operation, output visitor in
> * case of a write or format operation.
> * @entry_name: Unique name of the NVRAM entry
> * @opaque: opaque data previously provided when registering the NVRAM
> * entry
> * @errp: Pointer to an Error pointer for the visitor to indicate error
> */
> typedef int (*NVRAMRWData)(enum NVRAMRWOp op, Visitor *v,
>                            const NVRAMEntryName *entry_name, void *opaque,
>                            Error **errp);
>
> /**
> * nvram_setup:
> * @drive_id: The ID of the drive to be used as NVRAM. Given the command
> * line '-drive if=none,id=tpm-bs,file=<file>', 'tpm-bs' would
> * have to be passed.
> * @errcode: Pointer to an integer for an error code
> * @resetfn: Device reset function
> * @dev: The DeviceState to be passed to the device reset function @resetfn
> *
> * This function returns a pointer to a VNVRAM or NULL in case an error
> * occurred.
> */
> VNVRAM *nvram_setup(const char *drive_id, int *errcode,
>                     qdev_resetfn resetfn, DeviceState *dev);
>
> /**
> * nvram_delete:
> * @nvram: The NVRAM to destroy
> *
> * Destroy the NVRAM previously allocated using nvram_setup.
> */
> int nvram_delete(VNVRAM *nvram);
>
> /**
> * nvram_start:
> * @nvram: The NVRAM to start
> * @fail_on_encrypted_drive: Fail if the drive is encrypted but no
> * key was provided so far to lower layers.
> *
> * After all blobs that the device intends to write have been registered
> * with the NVRAM, this function is used to start up the NVRAM. In case
> * no error occurred, 0 is returned, an error code otherwise.
> */
> int nvram_start(VNVRAM *nvram, bool fail_on_encrypted_drive);
>
> /**
> * nvram_process_requests:
> *
> * Have the NVRAM layer process all outstanding requests and wait
> * for their completion.
> */
> void nvram_process_requests(void);
>
> /**
> * nvram_register_entry:
> *
> * @nvram: The NVRAM to register an entry with
> * @entry_name: The unique name of the blob to register
> * @rwdata_callback: Callback function for the NVRAM layer to
> * invoke for asynchronous requests such as
> * delivering the results of a read operation
> * or requesting the maximum size of the blob
> * when formatting.
> * @opaque: Data to pass to the callback function
> *
> * Register an entry for the NVRAM layer to write. In case of success
> * this function returns 0, an error code otherwise.
> */
> int nvram_register_entry(VNVRAM *nvram, const NVRAMEntryName *entry_name,
>                          NVRAMRWData rwdata_callback, void *opaque);
>
> /**
> * nvram_deregister_entry:
> * @nvram: The NVRAM to deregister an entry from
> * @entry_name: the unique name of the entry
> *
> * Deregister an NVRAM entry previously registered with the NVRAM layer.
> * The function returns 0 on success, an error code on failure.
> */
> int nvram_deregister_entry(VNVRAM *nvram, const NVRAMEntryName *entry_name);
>
> /**
> * nvram_had_fatal_error:
> * @nvram: The NVRAM to check
> *
> * Returns true in case the NVRAM had a fatal error and
> * is unusable, false if the device can be used.
> */
> bool nvram_had_fatal_error(VNVRAM *nvram);
>
> /**
> * nvram_write_data:
> * @nvram: The NVRAM to write the data to
> * @entry_name: The name of the blob to write
> * @flags: Flags indicating synchronous or asynchronous
> * operation and whether to wait for completion
> * of the operation.
> * @cb: callback to invoke for an async write
> * @opaque: data to pass to the callback
> *
> * Write data into NVRAM. This function will invoke the callback
> * provided in nvram_setup where an output visitor will be
> * provided for writing the blob. This function returns 0 in case
> * of success, an error code otherwise.
> */
> int nvram_write_data(VNVRAM *nvram, const NVRAMEntryName *entry_name,
>                      int flags, NVRAMRWFinishCB cb, void *opaque);
>
> /**
> * nvram_read_data:
> * @nvram: The NVRAM to read the data from
> * @entry_name: The name of the blob to read
> * @flags: Flags indicating synchronous or asynchronous
> * operation and whether to wait for completion
> * of the operation.
> * @cb: callback to invoke for an async read
> * @opaque: data to pass to the callback
> *
> * Read data from NVRAM. This function will invoke the callback
> * provided in nvram_setup where an input visitor will be
> * provided for reading the data. This function returns 0 in
> * case of success, an error code otherwise.
> */
> int nvram_read_data(VNVRAM *nvram, const NVRAMEntryName *entry_name,
>                     int flags, NVRAMRWFinishCB cb, void *opaque);
>
> /* flags used above */
> #define VNVRAM_ASYNC_F (1 << 0)
> #define VNVRAM_WAIT_COMPLETION_F (1 << 1)
>
>
>
> Organization of the data in the virtual machine image:
> ------------------------------------------------------
>
> All data in the VM image are written as a single ASN.1 stream with a
> header followed by the individual fixed-sized NVRAM entries. The NVRAM
> layer creates the header during an NVRAM formatting step that must be
> initiated by the user (or libvirt) through an HMP or QMP command.
>
> The fact that we are writing ASN.1 formatted data into the virtual
> machine image is also the reason for the recent posts of the ASN.1
> visitor patch series.
>
>
> /*
> * The NVRAM on-disk format is as follows:
> * Let '{' and '}' denote an ASN.1 sequence start and end.
> *
> * {
> * NVRAM-header: "qemu-nvram"
> * # a sequence of blobs:
> * {
> * 1st NVRAM entry's name
> * 1st NVRAM entry's ASN.1 blob (fixed size)
> * }
> * {
> * 2nd NVRAM entry's name
> * 2nd NVRAM entry's ASN.1 blob (fixed size)
> * }
> * ...
> * }
> */
>
> NVRAM entries are read by searching for the entry identified by its
> unique name. If it is found, the device's callback function is invoked
> with an input visitor for the device to read the data.
>
> NVRAM entries are written by searching for the entry identified by its
> unique name. If it is found, the device's callback function is invoked
> with an output visitor positioned to where the data need to be written
> to in the VM image. The device then uses the visitor directly to write
> the blob.
>
> The ASN.1 blobs have to be of fixed size since an inflating or deflating
> 1st blob would require that all subsequent blobs be moved, or it would
> destroy the integrity of the ASN.1 stream.
>
>
> One complication is the size requirement of the NVRAM and the fact
> that virtual machine images typically don't grow. Here users may need
> a-priori knowledge as to what the size of the NVRAM has to be for the
> device to work properly. In the case of the TPM, for example, the TPM
> requires a virtual machine image of a certain size to be able to
> write all its blobs into. It may be necessary for human users to start
> QEMU once to find out the required size of the NVRAM image (using an HMP
> command) or get it through documentation. In the case of libvirt the
> required image size could be hard coded into libvirt since it will not
> change anymore and is a property of the device. Another possibility
> would be to use QEMU APIs to re-size the image before formatting (this
> at least did not work a few months ago if I recall correctly, or did not
> work with all VM image formats; details have faded from memory...)
>
> I think this is enough detail for now. Please let me know of any
> comments you may have. My primary concern for now is to get clarity on
> the layout of the data inside the virtual machine image. The ASN.1
> visitors were written for this purpose.
>
>
> Thanks and regards,
> Stefan