Date: Fri, 16 Sep 2011 07:36:31 -0400
From: Stefan Berger
Subject: Re: [Qemu-devel] Design of the blobstore [API of the NVRAM]
To: Stefan Hajnoczi
Cc: Kevin Wolf, Anthony Liguori, "Michael S. Tsirkin",
    Markus Armbruster, QEMU Developers

On 09/16/2011 06:35 AM, Stefan Hajnoczi wrote:
> On Thu, Sep 15, 2011 at 08:34:55AM -0400, Stefan Berger wrote:
>> On 09/15/2011 07:17 AM, Stefan Hajnoczi wrote:
>> [...]
>> Everything is kind of changing now. But here's what I have right now:
>>
>>     tb->s.tpm_ltpms->nvram = nvram_setup(tpm_ltpms->drive_id, &errcode);
>>     if (!tb->s.tpm_ltpms->nvram) {
>>         fprintf(stderr, "Could not find nvram.\n");
>>         return errcode;
>>     }
>>
>>     nvram_register_blob(tb->s.tpm_ltpms->nvram,
>>                         NVRAM_ENTRY_PERMSTATE,
>>                         tpmlib_get_prop(TPMPROP_TPM_MAX_NV_SPACE));
>>     nvram_register_blob(tb->s.tpm_ltpms->nvram,
>>                         NVRAM_ENTRY_SAVESTATE,
>>                         tpmlib_get_prop(TPMPROP_TPM_MAX_SAVESTATE_SPACE));
>>     nvram_register_blob(tb->s.tpm_ltpms->nvram,
>>                         NVRAM_ENTRY_VOLASTATE,
>>                         tpmlib_get_prop(TPMPROP_TPM_MAX_VOLATILESTATE_SPACE));
>>
>>     rc = nvram_start(tpm_ltpms->nvram, fail_on_encrypted_drive);
>>
>> The above first sets up the NVRAM using the drive's id, i.e., the
>> -tpmdev ...,nvram=my-bs parameter. This establishes the NVRAM.
>> Subsequently the blobs to be written into the NVRAM are registered.
>> nvram_start() then reconciles the registered blobs with those found on
>> disk, and if everything fits together the result is 'rc = 0' and the
>> NVRAM is ready to go. Other devices can then do the same, either with
>> the same NVRAM or with another one. (It is called NVRAM now, after
>> renaming from 'blobstore'.)
>>
>> Reading from NVRAM in case of the TPM is a rare event. It happens in
>> the context of QEMU's main thread:
>>
>>     if (nvram_read_data(tpm_ltpms->nvram,
>>                         NVRAM_ENTRY_PERMSTATE,
>>                         &tpm_ltpms->permanent_state.buffer,
>>                         &tpm_ltpms->permanent_state.size,
>>                         0, NULL, NULL) ||
>>         nvram_read_data(tpm_ltpms->nvram,
>>                         NVRAM_ENTRY_SAVESTATE,
>>                         &tpm_ltpms->save_state.buffer,
>>                         &tpm_ltpms->save_state.size,
>>                         0, NULL, NULL))
>>     {
>>         tpm_ltpms->had_fatal_error = true;
>>         return;
>>     }
>>
>> The above reads the data of two blobs synchronously; this happens
>> during startup.
>>
>> Writes depend on what the user does with the TPM. The user can trigger
>> lots of updates to persistent state by performing certain operations,
>> e.g., persisting keys inside the TPM.
>>
>>     rc = nvram_write_data(tpm_ltpms->nvram,
>>                           what, tsb->buffer, tsb->size,
>>                           VNVRAM_ASYNC_F | VNVRAM_WAIT_COMPLETION_F,
>>                           NULL, NULL);
>>
>> The above writes a TPM blob into the NVRAM. This is triggered by the
>> TPM thread, which notifies the QEMU main thread to write the blob into
>> NVRAM. At the moment I do this synchronously, using the two flags
>> rather than the last two parameters (a completion callback and its
>> opaque data): the first flag notifies the main thread, the second
>> makes the caller wait for the completion of the request (using a
>> condition variable internally).
>>
>> Here are the prototypes:
>>
>>     VNVRAM *nvram_setup(const char *drive_id, int *errcode);
>>
>>     int nvram_start(VNVRAM *, bool fail_on_encrypted_drive);
>>
>>     int nvram_register_blob(VNVRAM *bs, enum NVRAMEntryType type,
>>                             unsigned int maxsize);
>>
>>     unsigned int nvram_get_totalsize(VNVRAM *bs);
>>     unsigned int nvram_get_totalsize_kb(VNVRAM *bs);
>>
>>     typedef void NVRAMRWFinishCB(void *opaque, int errcode, bool is_write,
>>                                  unsigned char **data, unsigned int len);
>>
>>     int nvram_write_data(VNVRAM *bs, enum NVRAMEntryType type,
>>                          const unsigned char *data, unsigned int len,
>>                          int flags, NVRAMRWFinishCB cb, void *opaque);
>>
>> As said, things are changing right now, so this is to give an
>> impression...
>
> Thanks, these details are interesting. I interpreted the blobstore as a
> key-value store but these examples show it as a stream.

IMO the only stuff we should store there are blobs retrievable via keys
(names) -- no metadata.

> No IDs or offsets are given, the reads are just performed in order and
> move through the NVRAM.

There are no offsets because there is some intelligence in the
blobstore/NVRAM that lays out the data on the disk. That's why there is
a directory. This in turn allows multiple drivers to share the NVRAM
without each driver writer having to lay out the blobs themselves.
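To give an impression of what such a directory could look like, here is
a minimal sketch; the structures and field names below are made up for
illustration and are not the actual patch:

    /* Illustrative sketch only: a small directory at the start of the
     * image maps each registered blob (keyed by its NVRAMEntryType) to
     * the offset and space reserved for it, so callers never deal with
     * offsets themselves. */
    #include <stdint.h>

    #define VNVRAM_MAX_BLOBS 16    /* arbitrary limit for this sketch */

    typedef struct VNVRAMDirEntry {
        uint32_t type;     /* blob key, e.g. NVRAM_ENTRY_PERMSTATE */
        uint32_t offset;   /* where the blob's data starts in the image */
        uint32_t maxsize;  /* space reserved via nvram_register_blob() */
        uint32_t cursize;  /* number of bytes currently written */
    } VNVRAMDirEntry;

    typedef struct VNVRAMDir {
        uint32_t magic;        /* identifies the image as a VNVRAM */
        uint32_t num_entries;  /* used entries in the array below */
        VNVRAMDirEntry entries[VNVRAM_MAX_BLOBS];
    } VNVRAMDir;

With something like this on disk, nvram_start() can reconcile the sizes
announced via nvram_register_blob() against the directory entries, and
nvram_read_data()/nvram_write_data() only need to look up the offset
for a given type.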
> If it stays this simple then bdrv_*() is indeed a natural way to do
> this - although my migration point remains since this feature adds a
> new requirement for shared storage when it would be pretty easy to put
> this stuff in the vm data stream (IIUC the TPM NVRAM is relatively
> small?).

It's just another image; you have to treat it like the VM's 'main'
image. Block migration works fine on it, it's just that it may be
difficult for a user to handle the migration flags if one image is on
shared storage and the other isn't.

   Stefan

> Stefan
>