From: Martin Kletzander <mkletzan@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Erik Skultety <eskultet@redhat.com>,
libvir-list@redhat.com, qemu-devel@nongnu.org,
brijesh.singh@amd.com, dinechin@redhat.com
Subject: Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
Date: Fri, 18 Jan 2019 12:11:50 +0100
Message-ID: <20190118111150.GA28476@wheatley>
In-Reply-To: <20190118101638.GE20660@redhat.com>
On Fri, Jan 18, 2019 at 10:16:38AM +0000, Daniel P. Berrangé wrote:
>On Fri, Jan 18, 2019 at 10:39:35AM +0100, Erik Skultety wrote:
>> Hi,
>> this is a summary of a private discussion I've had with guys CC'd on this email
>> about finding a solution to [1] - basically, the default permissions on
>> /dev/sev (below) make it impossible to query for SEV platform capabilities,
>> since by default we run QEMU as qemu:qemu when probing for capabilities. It's
>> worth noting that this is only relevant to probing, since for a proper QEMU
>> VM we create a mount namespace for the process and chown all the nodes (needs a
>> SEV fix though).
>>
>> # ll /dev/sev
>> crw-------. 1 root root
>>
>> I suggested either force running QEMU as root for probing (despite the obvious
>> security implications) or using namespaces for probing too. Dan argued that
>> this would have a significant perf impact and suggested we ask systemd to add a
>> global udev rule.
>

If creating namespaces has a noticeable performance impact, then why don't we
special-case the probing: create one namespace for probing, once, and probe
all QEMU binaries in that one namespace?

>I've just realized there is a potential 3rd solution. Remember there is
>actually nothing inherently special about the 'root' user as an account
>ID. 'root' gains its powers from the fact that it has many capabilities
>by default. 'qemu' can't access /dev/sev because it is owned by a
>different user (happens to be root) and 'qemu' does not have capabilities.
>
>So we can make probing work by using our capabilities code to grant
>CAP_DAC_OVERRIDE to the qemu process we spawn. So probing still runs
>as 'qemu', but can nonetheless access /dev/sev while it is owned
>by root. We were not using 'qemu' for the sake of security, as the probing
>process is not executing any untrustworthy code, so we don't lose any
>security protection by granting CAP_DAC_OVERRIDE.
>

IMHO CAP_DAC_OVERRIDE is a lot to grant, especially on systems without SELinux.

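To illustrate why that capability is broad, here is a minimal sketch of the discretionary access check for read+write. It is a simplified model, not kernel or libvirt code: it ignores ACLs, LSMs and namespaces, and the function name and the uid/gid numbers are made up for the example.

```python
import stat


def dac_allows_rw(mode, owner_uid, owner_gid, uid, gids, cap_dac_override=False):
    """Simplified model of the owner/group/other check for read+write access."""
    if cap_dac_override:
        # CAP_DAC_OVERRIDE bypasses read/write permission checks on any file,
        # not just on /dev/sev.
        return True
    if uid == owner_uid:
        wanted = stat.S_IRUSR | stat.S_IWUSR
    elif owner_gid in gids:
        wanted = stat.S_IRGRP | stat.S_IWGRP
    else:
        wanted = stat.S_IROTH | stat.S_IWOTH
    return (mode & wanted) == wanted


# Hypothetical IDs: uid/gid 107 stands in for 'qemu', gid 36 for a 'sev' group.
QEMU_UID, QEMU_GID, SEV_GID = 107, 107, 36

# /dev/sev as root:root 0600: 'qemu' is denied without the capability.
print(dac_allows_rw(0o600, 0, 0, QEMU_UID, {QEMU_GID}))
# The same node is reachable once CAP_DAC_OVERRIDE is granted.
print(dac_allows_rw(0o600, 0, 0, QEMU_UID, {QEMU_GID}, cap_dac_override=True))
# A root:sev group-writable node needs only group membership.
print(dac_allows_rw(0o660, 0, SEV_GID, QEMU_UID, {QEMU_GID, SEV_GID}))
```

The capability branch returns before the mode is even looked at, which is the concern: the grant is not scoped to one device node.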
>> I proceeded with cloning [1] to systemd and creating a udev rule that I planned
>> on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
>> make it world accessible, to which Brijesh from AMD expressed a concern that
>> regular users might deplete the resources (limit on the number of guests
>> allowed by the platform). But since the limit is claimed to be around 4, Dan
>> discouraged me from restricting the udev rule to only the 'kvm' group, which
>> Laszlo had suggested earlier, as the limit is so small that a malicious QEMU
>> could easily deplete it during probing. This fact also ruled out any
>> kind of ACL we could create dynamically. Instead, he suggested that we filter
>> out the kvm-capable QEMU and put only that one in the namespace without a
>> significant perf impact.
>
>Yes, my suggestion to mimic /dev/kvm was based on the mistaken understanding
>that there was not a finite resource limit. Given that there are one or more
>finite resource limits, we need access control on which unprivileged users, and
>/or which individual QEMU instances are permitted access. This means /dev/sev
>must remain with restrictive user/group/permissions that prevent any unprivileged
>account from having access. This means either root:root 0770/0700, or possibly
>having an 'sev' group and using root:sev 0770, so that users can be granted
>access via 'sev' group membership which (might?) allow unprivileged libvirtd to
>use 'sev' if the user was added.
>
>> - my take on this is that there could potentially be more than a single
>> kvm-enabled QEMU and therefore we'd need to create more than just a
>> single namespace.
>
>True, I guess qemu-system-x86_64 and qemu-system-i386 both get KVM
>on an x86_64 host, and likewise for many other 64-bit archs supporting
>32-bit apps.
>
>> - I also argued that I can imagine that the same kind of DoS attack might be
>> possible from within the namespace, even if we created the /dev/sev node
>> only in SEV-enabled guests (which we currently don't). All of us have
>> agreed that allowing /dev/sev in the namespace for only SEV-enabled
>> guests is worth doing nonetheless.
>
>There's never any perfect level of protection. We're just striving to
>minimize the attack surface by only exposing it where there's a genuine
>need to use it.
>
>> In the meantime, Christophe went through the kernel code to verify how the SEV
>> resources are managed and what protection is currently in place to mitigate the
>> chance of a process easily depleting the limit on SEV guests. He found that
>> ASID, which determines the encryption key, is allocated from a single ASID
>> bitmap and essentially guarded by a single 'sev->active' flag.
>>
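The depletion concern can be sketched with a toy model of a fixed-size ASID bitmap. This is only an illustration under assumptions: the class and method names are hypothetical, the limit of 4 is the figure claimed above, and none of this is the kernel's actual code.

```python
class AsidPool:
    """Toy model of a fixed-size ASID bitmap; not the kernel implementation."""

    def __init__(self, max_guests):
        # One slot per ASID; ASIDs are numbered from 1 in this sketch.
        self._used = [False] * max_guests

    def alloc(self):
        """Return the lowest free ASID, or fail once every slot is taken."""
        for index, taken in enumerate(self._used):
            if not taken:
                self._used[index] = True
                return index + 1
        raise RuntimeError("no free ASID: SEV guest limit reached")

    def free(self, asid):
        """Release an ASID so a later guest can reuse it."""
        self._used[asid - 1] = False


pool = AsidPool(max_guests=4)
held = [pool.alloc() for _ in range(4)]  # four launches exhaust the pool
try:
    pool.alloc()  # a fifth launch fails until someone frees a slot
except RuntimeError as err:
    print(err)
```

With a pool this small, any process that can open the device and start guests can hold every slot, which is why the thread treats access control on the node as necessary.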
>> So, in conclusion, we absolutely need input from Brijesh (AMD) whether there
>> was something more than the low limit on number of guests behind the default
>> permissions. Also, we'd like to get some details on how the limit is managed,
>> helping to assess the approaches mentioned above.
>
>Regardless of this problem, I think it is important to have some docs
>in either libvirt or QEMU that describe the resource usage constraints
>so that management apps can decide how to best take advantage of SEV.
>
>>
>> Thanks and please do share your ideas,
>> Erik
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665400
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1561113
>
>Regards,
>Daniel
>--
>|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
>|: https://libvirt.org -o- https://fstop138.berrange.com :|
>|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|