Message-ID: <515477AD.7030702@linux.vnet.ibm.com>
Date: Thu, 28 Mar 2013 13:02:37 -0400
From: Stefan Berger
In-Reply-To: <20130328163108.GA30183@redhat.com>
Subject: Re: [Qemu-devel] vNVRAM / blobstore design
To: "Michael S. Tsirkin"
Cc: Stefan Hajnoczi, Kent E Yoder, Corey Bryant, Michael Roth,
    qemu-devel, Joel Schopp, Kenneth Goldman, Anthony Liguori

On 03/28/2013 12:31 PM, Michael S. Tsirkin wrote:
> On Thu, Mar 28, 2013 at 12:11:22PM -0400, Stefan Berger wrote:
>> On 03/27/2013 03:12 PM, Stefan Berger wrote:
>>> On 03/27/2013 02:27 PM, Anthony Liguori wrote:
>>>> Stefan Berger writes:
>>>>
>>>>> On 03/27/2013 01:14 PM, Anthony Liguori wrote:
>>>>>> Stefan Berger writes:
>>>>>>
>>>>>> What I struggle with is that we're calling this a "blobstore". Using
>>>>>> BER to store "blobs" seems kind of pointless especially when we're
>>>>>> talking about exactly three blobs.
>>>>>>
>>>>>> I suspect real hardware does something like, flash is N bytes,
>>>>>> blob 1 is a max of X bytes, blob 2 is a max of Y bytes, and
>>>>>> blob 3 is (N - X - Y) bytes.
>>>>>>
>>>>>> Do we really need to do anything more than that?
>>>>> I typically call it NVRAM, but earlier discussions seemed to prefer
>>>>> 'blobstore'.
>>>>>
>>>>> Using BER is the 2nd design of the NVRAM/blobstore. The 1st one didn't
>>>>> use any visitors but used a directory in the first sector pointing to
>>>>> the actual blobs in other sectors of the block device. The organization
>>>>> of the directory and assignment of the blobs to their sectors, aka 'the
>>>>> layout of the data' in the disk image, was handled by the
>>>>> NVRAM/blobstore implementation.
>>>> Okay, the short response is:
>>>>
>>>> Just make the TPM have a DRIVE property, drop all notion of
>>>> NVRAM/blobstore, and used fixed offsets into the BlockDriverState for
>>>> each blob.
>>> Fine by me. I don't see the need for visitors. I guess sharing of
>>> the persistent storage between different types of devices is not a
>>> goal here so that a layer that hides the layout and the blobs'
>>> position within the storage would be necessary.
>>> Also fine by me for as long as we don't come back to this
>>> discussion.
>> One thing I'd like to get clarity about is the following
>> corner-case. A user supplies some VM image as persistent storage for
>> the TPM. It contains garbage. How do we handle this case? Does the
>> TPM then just start writing its state into this image or do we want
>> to have some layer in place that forces a user to go through the
>> step of formatting after that layer indicates that the data are
>> unreadable. Besides that a completely empty image also contains
>> garbage from the perspective of TPM persistent state and for that
>> layer.
>>
>> My intention would (again) be to put a header in front of every
>> blob. That header would contain a crc32 covering that header (minus
>> the crc32 field itself of course) plus the blob to determine whether
>> the blob is garbage or not. It is similar in those terms as the 1st
>> implementation where we also had a directory that contained that
>> crc32 for the directory itself and for each blob. This is not a
>> filesystem, I know that.
>>
>>    Regards,
>>       Stefan
>>
>>
> It was precisely this addition of more and more metadata
> that made me suggest a format like BER. But of course
> a file per field will do too: following what Anthony suggested you would
> put the checksum in a separate file?
>

I would still like to support migration, so a block device / image file
is probably the best choice for addressing this concern, unless we force
every setup to provide a shared filesystem. I don't think the latter
would be accepted.

Another idea (again) would be to support encryption on image file
formats other than QCoW2. The user would supply the AES key to the
persistent storage layer. That layer would keep a flag indicating
whether the blobs are encrypted, encrypt them upon writing, and decrypt
them upon reading.
The crc32 could also serve here to detect whether the right key was
supplied: after decryption with a wrong key, the blob's crc32 would not
match the one computed when the blob was written. If crc32 is not good
enough, we can use a sha1 for this 'integrity' check, or possibly the
AES padding can reveal the bad decryption as well. Some form of
integrity checking in conjunction with a formatting step seems
necessary.

Besides that, not having to use QCoW2, and with it getting automatic
support for snapshotting, addresses a concern from the virtualized TPM
spec, which as far as I know doesn't want to see multiple states of the
same TPM. Snapshotting could in effect cause exactly that. So anyone who
wants to be compliant with that spec could use raw VM image files and
still get encryption and migration.

In at least the following aspects we are away from the hardware world:

- image files are accessible through the filesystem and can be looked
  into and their secrets retrieved, while the NVRAM of a device may be
  shielded and more difficult to examine (though not impossible)
  -> so we may want to have encryption for every type of image file and
  not just rely on QCoW2 encryption or assume the image files always
  reside in encrypted filesystems

- admittedly a corner case: persistent storage that contains garbage
  needs to be consciously formatted, or the user asked what to do; an
  image file with all zeros could probably be detected, but if we
  require formatting for the case where garbage is found, we may as
  well require it here also

I understand your suggestion with the BER encoding. One problematic
aspect of the whole BER stuff, including all the other patches around
it, seems to be that it is too big (~5.5ksloc) to find much love.

    Stefan
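
P.S. To make the per-blob header idea more concrete, here is a rough
sketch of how fixed offsets plus a crc32 header could distinguish a
formatted image from garbage. It is in Python only for brevity and is
not taken from any patch. The magic value, the header field layout, and
the slot names, offsets, and sizes are all assumptions made up for
illustration.

```python
import struct
import zlib

# Everything below is hypothetical: magic, field layout, slot table.
MAGIC = b"TPMB"
HEADER = struct.Struct("<4sII")  # magic, blob length, crc32

# name -> (fixed offset into the image, slot size including the header)
SLOTS = {
    "permanent": (0, 4096),
    "volatile": (4096, 4096),
    "savestate": (8192, 4096),
}
IMAGE_SIZE = 12288


def _crc(length, blob):
    # The crc32 covers the header minus the crc32 field itself, plus the blob.
    return zlib.crc32(struct.pack("<4sI", MAGIC, length) + blob) & 0xFFFFFFFF


def write_blob(image, name, blob):
    """Store blob with its header at the slot's fixed offset."""
    offset, slot_size = SLOTS[name]
    if HEADER.size + len(blob) > slot_size:
        raise ValueError("blob does not fit in its slot")
    header = HEADER.pack(MAGIC, len(blob), _crc(len(blob), blob))
    image[offset:offset + HEADER.size + len(blob)] = header + blob


def read_blob(image, name):
    """Return the blob, or None if the slot holds garbage."""
    offset, slot_size = SLOTS[name]
    magic, length, crc = HEADER.unpack_from(image, offset)
    if magic != MAGIC or length > slot_size - HEADER.size:
        return None
    start = offset + HEADER.size
    blob = bytes(image[start:start + length])
    return blob if _crc(length, blob) == crc else None


def format_image(image):
    """The explicit formatting step: write an empty, valid blob per slot."""
    for name in SLOTS:
        write_blob(image, name, b"")
```

With this layout an all-zero or garbage image makes read_blob() return
None for every slot, so the layer can refuse to start until
format_image() has been run. The same crc32 mismatch would show up after
decrypting with a wrong key, provided the checksum is computed over the
plaintext.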