From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Daniel P. Berrange"
Subject: Re: pci-stub error and MSI-X for KVM guest
Date: Fri, 8 Jan 2010 11:04:34 +0000
Message-ID: <20100108110434.GA29141@redhat.com>
References: <0199E0D51A61344794750DC57738F58E6D718922FC@GVW1118EXC.americas.hpqcorp.net>
 <20091221191923.GA5979@sequoia.sous-sol.org>
 <0199E0D51A61344794750DC57738F58E6D71892321@GVW1118EXC.americas.hpqcorp.net>
 <20091221195849.GC5979@sequoia.sous-sol.org>
 <0199E0D51A61344794750DC57738F58E6D723AADC9@GVW1118EXC.americas.hpqcorp.net>
 <20100104151659.GA27601@sequoia.sous-sol.org>
 <0199E0D51A61344794750DC57738F58E6D723AB1EC@GVW1118EXC.americas.hpqcorp.net>
 <20100108005003.GA20720@sequoia.sous-sol.org>
Reply-To: "Daniel P. Berrange"
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: "Fischer, Anna", "kvm@vger.kernel.org", libvir-list@redhat.com
To: Chris Wright
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:52091 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752765Ab0AHLFg (ORCPT ); Fri, 8 Jan 2010 06:05:36 -0500
Content-Disposition: inline
In-Reply-To: <20100108005003.GA20720@sequoia.sous-sol.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Thu, Jan 07, 2010 at 04:50:03PM -0800, Chris Wright wrote:
> * Fischer, Anna (anna.fischer@hp.com) wrote:
> > So, when setting a breakpoint for the exit() call I'm getting a bit closer to figuring where it kills my guest.
>
> Thanks, this helps clarify what is happening.
>
> > Breakpoint 1, exit (status=1) at exit.c:99
> > 99 {
> > Current language: auto
> > The current source language is "auto; currently c".
> > (gdb) bt
> > #0 exit (status=1) at exit.c:99
> > #1 0x0000000000470c6e in assigned_dev_pci_read_config (d=0x259c6f0, address=64, len=4)
>
> assigned_dev_pci_read_config(..., 64, 4)
>                                   ^^
> This is a libvirt issue. When you use virt-manager it has libvirtd
> fork/exec qemu-kvm.
> libvirtd will drop privileges and run qemu-kvm as
> user qemu (or perhaps root if you've edited qemu.conf). Regardless of
> the user, it clears capabilities. Reading PCI config space beyond just
> the header requires CAP_SYS_ADMIN. The above is reading the first 4
> bytes of device dependent config space, and the kernel is returning 0
> because qemu doesn't have CAP_SYS_ADMIN.

Hmm, libvirt also chown()'s the files in /sys/bus/pci/devices//* to 'qemu'
(and sets SELinux context) so that the unprivileged QEMU process can have
full read/write access to them. I would have hoped that would avoid the need
to have any capabilities like CAP_SYS_ADMIN :-(

> Basically, this means that device assignment w/ libvirt will break
> MSI/MSI-X because qemu will never be able to see that the host device
> has those PCI capabilities. This, in turn, renders VF device assignment
> useless (since a VF is required to support MSI and/or MSI-X).
>
> Granting CAP_SYS_ADMIN for each qemu instance that does device assignment
> would render the privilege reduction useless (CAP_SYS_ADMIN is the
> kitchen sink catchall of the Linux capability system).

Yeah that's pretty troublesome, even when libvirt runs QEMU as 'root', it will
remove all capabilities.

Why is the 'CAP_SYS_ADMIN' check there - is it a mistakenly over-zealous
permission check that could be removed, just relying on access controls on the
sysfs /sys/bus/pci/devices//config file ?

Daniel
-- 
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
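
For anyone wanting to check which side of the CAP_SYS_ADMIN limit a given process is on, the read that fails in the backtrace (address=64, len=4) can be reproduced against a sysfs config file. This is only a rough sketch, not what qemu-kvm does internally: the function names are made up for illustration, the path is whatever config file you pass in, and it assumes that a restricted reader gets a short (0 byte) read at offset 64, as described above.

```python
import os

# Size of the standard PCI configuration header; the device dependent
# region (where the MSI/MSI-X capabilities live) starts right after it.
PCI_STD_HEADER_SIZE = 64


def read_config(path, offset, length):
    """Read up to `length` bytes at `offset` from a config-space file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # pread() returns fewer bytes than requested (possibly none) when
        # the file ends before offset + length.
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)


def device_dependent_region_visible(path):
    """Return True if the first 4 bytes past the standard header are readable.

    Hypothetical helper: mirrors the (address=64, len=4) read from the
    backtrace and treats a short read as "region not visible".
    """
    data = read_config(path, PCI_STD_HEADER_SIZE, 4)
    return len(data) == 4
```

Running device_dependent_region_visible() on an assigned device's config file, once as the user libvirt starts QEMU as and once with full privileges, should show whether file ownership alone is enough or the capability check still applies.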