From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:41058) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1R1Hpv-0007Ad-S4 for qemu-devel@nongnu.org; Wed, 07 Sep 2011 09:06:44 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1R1Hpp-0000lW-U3 for qemu-devel@nongnu.org; Wed, 07 Sep 2011 09:06:39 -0400 Received: from e33.co.us.ibm.com ([32.97.110.151]:33921) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1R1Hpp-0000lQ-NS for qemu-devel@nongnu.org; Wed, 07 Sep 2011 09:06:33 -0400 Received: from d03relay02.boulder.ibm.com (d03relay02.boulder.ibm.com [9.17.195.227]) by e33.co.us.ibm.com (8.14.4/8.13.1) with ESMTP id p87CwRP4002342 for ; Wed, 7 Sep 2011 06:58:27 -0600 Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by d03relay02.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p87D6IbK043528 for ; Wed, 7 Sep 2011 07:06:20 -0600 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p87DC8IP030847 for ; Wed, 7 Sep 2011 07:12:08 -0600 Message-ID: <4E676C3D.2040607@linux.vnet.ibm.com> Date: Wed, 07 Sep 2011 09:06:05 -0400 From: Stefan Berger MIME-Version: 1.0 References: <20110831143551.127339744@linux.vnet.ibm.com> <20110831143621.799480525@linux.vnet.ibm.com> <20110901173225.GH10989@redhat.com> <4E603724.7010405@linux.vnet.ibm.com> <20110904193258.GA13640@redhat.com> <4E66B304.8020605@linux.vnet.ibm.com> <20110907111857.GB9337@redhat.com> In-Reply-To: <20110907111857.GB9337@redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH V8 08/14] Introduce file lock for the block layer List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: chrisw@redhat.com, Anthony Liguori , anbang.ruan@cs.ox.ac.uk, qemu-devel@nongnu.org, rrelyea@redhat.com, alevy@redhat.com, andreas.niederl@iaik.tugraz.at, "Serge E. Hallyn" On 09/07/2011 07:18 AM, Michael S. Tsirkin wrote: > On Tue, Sep 06, 2011 at 07:55:48PM -0400, Stefan Berger wrote: >> On 09/04/2011 03:32 PM, Michael S. Tsirkin wrote: >>> On Thu, Sep 01, 2011 at 09:53:40PM -0400, Stefan Berger wrote: >>>>> Generally, what all other devices do is perform validation >>>>> as the last step in migration when device state >>>>> is restored. On failure, management can decide what to do: >>>>> retry migration or restart on source. >>>>> >>>>> Why is TPM special and needs to be treated differently? >>>>> >>>>> >>>>> >>> ... >>> >>>> More detail: Typically one starts out with an empty QCoW2 file >>>> created via qemu-img. Once Qemu starts and initializes the >>>> libtpms-based TPM, it tries to read existing state from that QCoW2 >>>> file. Since there is no state stored in the QCoW2, the TPM will >>>> start initializing itself to an initial 'blank' state. >>> So it looks like the problem is you access the file when guest isn't >>> running. Delaying the initialization until the guest actually starts >>> running will solve the problem in a way that is more consistent with >>> other qemu devices. >>> >> I'd agree if there wasn't one more thing: encryption on the data >> inside the QCoW2 filesystem >> >> First: There are two ways to encrypt the data. >> >> One comes with the QCoW2 type of image and it comes for free. Set >> the encryption flag when creating the QCoW2 file and one has to >> provide a key to access the QCoW2. I found this mode problematic for >> users since it required me to go through the monitor every time I >> started the VM. Besides that the key is provided so late that all >> devices are already initialized and if the wrong key was provided >> the only thing the TPM can do is to go into shutdown mode since >> there is state on the QCoW2 but it cannot be decrypted. This also >> became problematic when doing migrations with libvirt for example >> and one was to have a wrong key/password installed on the target >> side -- graceful termination of the migration is impossible. >> >> So the above drove the implementation of the encryption mode added >> in patch 10 in this series. Here the key is provided via command >> line and it can be used immediately. So I am reading the state blobs >> from the file, decrypt them, create the CRC32 on the plain data and >> check against the CRC32 stored in the 'directory'. If it doesn't >> match the expected CRC32 either the key was wrong or the state is >> corrupted and I can terminate Qemu gracefully. I can also react >> appropriately if no key was provided but one is expected and >> vice-versa. Also in case of migration this now allows me to >> terminate Qemu gracefully so it continues running on the source. >> This is an improvement over the situation described above where in >> case the target had the wrong key the TPM went into shutdown mode >> and the user would be wondering why that is -- the TPM becomes >> inaccessible. However, particularly in the case of migration with >> shared storage I need to access the QCoW2 file to check whether on >> the target I have the right key. And this happens very early during >> qemu startup on the target machine. So that's where the file locking >> on the block layer comes in. I want to serialize access to the QCoW2 >> so I don't read intermediate state on the migration-target host that >> can occur if the source is currently writing TPM state data. >> >> This patch and the command line provided key along with the >> encryption mode added in patch 10 in this series add to usability >> and help prevent failures. >> >> Regards, >> Stefan > Sure, it's a useful feature of validating the device early. > But as I said above, it's useful for other devices besides > TPM. However it introduces issues where state changes > since guest keeps running (such as what you have described here). > > I think solving this in a way specific to TPM is a mistake. > Passing key on command line does not mean you must use it > immediately, this can still be done when guest starts running. > If I was not using the key immediately I could just drop this patch and the following one introducing the blob encryption and we would have to go with the QCoW2 encryption mode. Delaying the key usage then becomes equivalent to how QCoW2 is handling its encryption mode along with the problems related to user-friendliness / handling of missing or wrong keys described earlier. > Further, the patchset seems to be split in > a way that introduces support headaches and makes review Patch 8 introduces file locking for bdrv. Patch 9 implements support for string TPM blobs inside a QCoW2 image and makes use of the locking. Patch 10 adds encryption support for the TPM blobs. What otherwise would be the logical split? > harder, not easier. In this case, we will have to support > both a broken mode (key supplied later) and a non-broken one > (key supplied on init). It would be better to implement > a single mode, in a single patch. > > If we call QCoW2 encryption support the 'broken mode' and we want to do better than this then I do have to solve it in a TPM-specific way since Qemu otherwise does not support any better method (afaics). QCoW2 encryption is there today and we get it for free. The only thing I could really do in patch 10 in this series is check whether the QCoW2 image is encrypted and refuse to use it. I find it contradicting what you said above. Regards, Stefan