Date: Fri, 16 Sep 2011 07:36:31 -0400
From: Stefan Berger
Subject: Re: [Qemu-devel] Design of the blobstore [API of the NVRAM]
To: Stefan Hajnoczi
Cc: Kevin Wolf, Anthony Liguori, "Michael S. Tsirkin",
    Markus Armbruster, QEMU Developers

On 09/16/2011 06:35 AM, Stefan Hajnoczi wrote:
> On Thu, Sep 15, 2011 at 08:34:55AM -0400, Stefan Berger wrote:
>> On 09/15/2011 07:17 AM, Stefan Hajnoczi wrote:
>> [...]
>> Everything is kind of changing now. But here's what I have right now:
>>
>>     tb->s.tpm_ltpms->nvram = nvram_setup(tpm_ltpms->drive_id, &errcode);
>>     if (!tb->s.tpm_ltpms->nvram) {
>>         fprintf(stderr, "Could not find nvram.\n");
>>         return errcode;
>>     }
>>
>>     nvram_register_blob(tb->s.tpm_ltpms->nvram,
>>                         NVRAM_ENTRY_PERMSTATE,
>>                         tpmlib_get_prop(TPMPROP_TPM_MAX_NV_SPACE));
>>     nvram_register_blob(tb->s.tpm_ltpms->nvram,
>>                         NVRAM_ENTRY_SAVESTATE,
>>                         tpmlib_get_prop(TPMPROP_TPM_MAX_SAVESTATE_SPACE));
>>     nvram_register_blob(tb->s.tpm_ltpms->nvram,
>>                         NVRAM_ENTRY_VOLASTATE,
>>                         tpmlib_get_prop(TPMPROP_TPM_MAX_VOLATILESTATE_SPACE));
>>
>>     rc = nvram_start(tpm_ltpms->nvram, fail_on_encrypted_drive);
>>
>> The above first sets up the NVRAM using the drive's id, i.e., the
>> -tpmdev ...,nvram=my-bs parameter. This establishes the NVRAM.
>> Subsequently the blobs to be written into the NVRAM are registered.
>> nvram_start() then reconciles the registered blobs with those found on
>> disk, and if everything fits together the result is 'rc = 0' and the
>> NVRAM is ready to go. Other devices can then do the same, either with
>> the same NVRAM or with another one. (It is called NVRAM now, after
>> renaming from 'blobstore'.)
>>
>> Reading from NVRAM in case of the TPM is a rare event. It happens in
>> the context of QEMU's main thread:
>>
>>     if (nvram_read_data(tpm_ltpms->nvram,
>>                         NVRAM_ENTRY_PERMSTATE,
>>                         &tpm_ltpms->permanent_state.buffer,
>>                         &tpm_ltpms->permanent_state.size,
>>                         0, NULL, NULL) ||
>>         nvram_read_data(tpm_ltpms->nvram,
>>                         NVRAM_ENTRY_SAVESTATE,
>>                         &tpm_ltpms->save_state.buffer,
>>                         &tpm_ltpms->save_state.size,
>>                         0, NULL, NULL))
>>     {
>>         tpm_ltpms->had_fatal_error = true;
>>         return;
>>     }
>>
>> The above reads the data of two blobs synchronously; this happens
>> during startup.
>>
>> Writes depend on what the user does with the TPM. The user can trigger
>> lots of updates to persistent state by performing certain operations,
>> e.g., persisting keys inside the TPM.
>>
>>     rc = nvram_write_data(tpm_ltpms->nvram,
>>                           what, tsb->buffer, tsb->size,
>>                           VNVRAM_ASYNC_F | VNVRAM_WAIT_COMPLETION_F,
>>                           NULL, NULL);
>>
>> The above writes a TPM blob into the NVRAM. This is triggered by the
>> TPM thread, which notifies the QEMU main thread to write the blob into
>> NVRAM. At the moment I do this synchronously, using the two flags
>> rather than the last two parameters (a completion callback and its
>> opaque data): the first flag notifies the main thread, the second
>> makes the caller wait for the completion of the request (using a
>> condition variable internally).
>>
>> Here are the prototypes:
>>
>>     VNVRAM *nvram_setup(const char *drive_id, int *errcode);
>>
>>     int nvram_start(VNVRAM *, bool fail_on_encrypted_drive);
>>
>>     int nvram_register_blob(VNVRAM *bs, enum NVRAMEntryType type,
>>                             unsigned int maxsize);
>>
>>     unsigned int nvram_get_totalsize(VNVRAM *bs);
>>     unsigned int nvram_get_totalsize_kb(VNVRAM *bs);
>>
>>     typedef void NVRAMRWFinishCB(void *opaque, int errcode, bool is_write,
>>                                  unsigned char **data, unsigned int len);
>>
>>     int nvram_write_data(VNVRAM *bs, enum NVRAMEntryType type,
>>                          const unsigned char *data, unsigned int len,
>>                          int flags, NVRAMRWFinishCB cb, void *opaque);
>>
>> As said, things are changing right now, so this is to give an
>> impression...
>
> Thanks, these details are interesting. I interpreted the blobstore as a
> key-value store but these examples show it as a stream.

IMO the only stuff we should store there are blobs retrievable via keys
(names) -- no metadata.

> No IDs or offsets are given, the reads are just performed in order and
> move through the NVRAM.

There are no offsets because there is some intelligence in the
blobstore/NVRAM that lays out the data on the disk. That's why there is
a directory. This in turn allows multiple drivers to share the NVRAM
without each driver writer having to lay out the blobs themselves.
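To give an impression of what such a directory could look like, here is
a minimal sketch; the structures and field names below are made up for
illustration and are not the actual patch:

    /* Illustrative sketch only: a small directory at the start of the
     * image maps each registered blob (keyed by its NVRAMEntryType) to
     * the offset and space reserved for it, so callers never deal with
     * offsets themselves. */
    #include <stdint.h>

    #define VNVRAM_MAX_BLOBS 16    /* arbitrary limit for this sketch */

    typedef struct VNVRAMDirEntry {
        uint32_t type;     /* blob key, e.g. NVRAM_ENTRY_PERMSTATE */
        uint32_t offset;   /* where the blob's data starts in the image */
        uint32_t maxsize;  /* space reserved via nvram_register_blob() */
        uint32_t cursize;  /* number of bytes currently written */
    } VNVRAMDirEntry;

    typedef struct VNVRAMDir {
        uint32_t magic;        /* identifies the image as a VNVRAM */
        uint32_t num_entries;  /* used entries in the array below */
        VNVRAMDirEntry entries[VNVRAM_MAX_BLOBS];
    } VNVRAMDir;

With something like this on disk, nvram_start() can reconcile the sizes
announced via nvram_register_blob() against the directory entries, and
nvram_read_data()/nvram_write_data() only need to look up the offset
for a given type.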
> If it stays this simple then bdrv_*() is indeed a natural way to do
> this - although my migration point remains since this feature adds a
> new requirement for shared storage when it would be pretty easy to put
> this stuff in the vm data stream (IIUC the TPM NVRAM is relatively
> small?).

It's just another image; you have to treat it like the VM's 'main'
image. Block migration works fine on it, it's just that it may be
difficult for a user to handle the migration flags if one image is on
shared storage and the other isn't.

   Stefan

> Stefan
>