From: Yolkfull Chow
Subject: Re: [PATCH] KVM test: Add PCI device assignment support
Date: Tue, 12 Jan 2010 18:17:01 +0800
Message-ID: <20100112101701.GD2282@aFu.nay.redhat.com>
References: <1261958156-14136-1-git-send-email-lmr@redhat.com>
In-Reply-To: <1261958156-14136-1-git-send-email-lmr@redhat.com>
To: Lucas Meneghel Rodrigues
Cc: autotest@test.kernel.org, kvm@vger.kernel.org

On Sun, Dec 27, 2009 at 09:55:56PM -0200, Lucas Meneghel Rodrigues wrote:
> Add support to PCI device assignment on the kvm test. It supports
> both SR-IOV virtual functions and physical NIC card device
> assignment.
>
> Single Root I/O Virtualization (SR-IOV) allows a single PCI device to
> be shared amongst multiple virtual machines while retaining the
> performance benefit of assigning a PCI device to a virtual machine.
> A common example is where a single SR-IOV capable NIC - with perhaps
> only a single physical network port - might be shared with multiple
> virtual machines by assigning a virtual function to each VM.
>
> SR-IOV support is implemented in the kernel. The core implementation
> is contained in the PCI subsystem, but there must also be driver support
> for both the Physical Function (PF) and Virtual Function (VF) devices.
> With an SR-IOV capable device one can allocate VFs from a PF. The VFs
> surface as PCI devices which are backed on the physical PCI device by
> resources (queues, and register sets).
>
> Device support:
>
> In 2.6.30, the Intel® 82576 Gigabit Ethernet Controller is the only
> SR-IOV capable device supported. The igb driver has PF support and the
> igbvf has VF support.
>
> In 2.6.31 the Neterion® X3100™ is supported as well. This device uses
> the same vxge driver for the PF as well as the VFs.
>
> In order to configure the test:
>
>   * For SR-IOV virtual functions passthrough, we could specify the
>     module parameter 'max_vfs' in config file.
>   * For physical NIC card pass through, we should specify the device
>     name(s).
>
> 4th try: Implemented Yolkfull's suggestion of keeping 'max_vfs' and
> 'assignable_devices' as sepparated parameters. Yolkfull, please test this
> on your environment. Thank you!

Hi Lucas,

Sorry for the late reply. I just tested this patch and found some
problems; please see my comments below:

>
>   * Naming is consistent with "PCI assignment" instead of
>     "PCI passthrough", as it's a more correct term.
>   * No more device database file, as all information about devices
>     is stored on an attribute of the VM class (an instance of the
>     PciAssignable class), so we don't have to bother dumping this
>     info to a file.
>   * Code simplified to avoid duplication
>
> As it's a fairly involved feature, the more reviews we get the better.
>
> Signed-off-by: Lucas Meneghel Rodrigues
> ---
>  client/tests/kvm/kvm_utils.py          |  281 ++++++++++++++++++++++++++++++++
>  client/tests/kvm/kvm_vm.py             |   59 +++++++
>  client/tests/kvm/tests_base.cfg.sample |   20 +++
>  3 files changed, 360 insertions(+), 0 deletions(-)
>
> diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
> index 2bbbe22..59c72a9 100644
> --- a/client/tests/kvm/kvm_utils.py
> +++ b/client/tests/kvm/kvm_utils.py
> @@ -924,3 +924,284 @@ def create_report(report_dir, results_dir):
>      reporter = os.path.join(report_dir, 'html_report.py')
>      html_file = os.path.join(results_dir, 'results.html')
>      os.system('%s -r %s -f %s -R' % (reporter, results_dir, html_file))
> +
> +
> +def get_full_pci_id(pci_id):
> +    """
> +    Get full PCI ID of pci_id.
> +
> +    @param pci_id: PCI ID of a device.
> +    """
> +    cmd = "lspci -D | awk '/%s/ {print $1}'" % pci_id
> +    status, full_id = commands.getstatusoutput(cmd)
> +    if status != 0:
> +        return None
> +    return full_id
> +
> +
> +def get_vendor_from_pci_id(pci_id):
> +    """
> +    Check out the device vendor ID according to pci_id.
> +
> +    @param pci_id: PCI ID of a device.
> +    """
> +    cmd = "lspci -n | awk '/%s/ {print $3}'" % pci_id
> +    return re.sub(":", " ", commands.getoutput(cmd))
> +
> +
> +class PciAssignable(object):
> +    """
> +    Request PCI assignable devices on host. It will check whether to request
> +    PF (physical Functions) or VF (Virtual Functions).
> +    """
> +    def __init__(self, type="nic_vf", driver=None, driver_option=None,
> +                 names=None, devices_requested=None):
> +        """
> +        Initialize parameter 'type' which could be:
> +        nic_vf: Virtual Functions
> +        nic_pf: Physical Function (actual hardware)
> +        mixed:  Both includes VFs and PFs
> +
> +        If pass through Physical NIC cards, we need to specify which devices
> +        to be assigned, e.g.
'eth1 eth2'.
> +
> +        If pass through Virtual Functions, we need to specify how many vfs
> +        are going to be assigned, e.g. passthrough_count = 8 and max_vfs in
> +        config file.
> +
> +        @param type: PCI device type.
> +        @param driver: Kernel module for the PCI assignable device.
> +        @param driver_option: Module option to specify the maximum number of
> +                VFs (eg 'max_vfs=7')
> +        @param names: Physical NIC cards correspondent network interfaces,
> +                e.g.'eth1 eth2 ...'

Please add a description of the 'devices_requested' parameter here.

> +        """
> +        self.type = type
> +        self.driver = driver
> +        self.driver_option = driver_option
> +        if names:
> +            self.name_list = names.split()
> +        if devices_requested:
> +            self.devices_requested = int(devices_requested)

We need to initialize 'self.devices_requested' in any case, since the
following code uses this attribute, for example:

...
    def check_vfs_count(self):
> +        """
> +        Check VFs count number according to the parameter driver_options.
> +        """
> +        return (self.get_vfs_count == self.devices_requested)
...

> +
> +
> +    def _get_pf_pci_id(self, name, search_str):
> +        """
> +        Get the PF PCI ID according to name.
> +
> +        @param name: Name of the PCI device.
> +        @param search_str: Search string to be used on lspci.
> +        """
> +        cmd = "ethtool -i %s | awk '/bus-info/ {print $2}'" % name
> +        s, pci_id = commands.getstatusoutput(cmd)
> +        if not (s or "Cannot get driver information" in pci_id):
> +            return pci_id[5:]
> +        cmd = "lspci | awk '/%s/ {print $1}'" % search_str
> +        pci_ids = [id for id in commands.getoutput(cmd).splitlines()]
> +        nic_id = int(re.search('[0-9]+', name).group(0))
> +        if (len(pci_ids) - 1) < nic_id:
> +            return None
> +        return pci_ids[nic_id]
> +
> +
> +    def _release_dev(self, pci_id):
> +        """
> +        Release a single PCI device.
> +
> +        @param pci_id: PCI ID of a given PCI device.
> +        """
> +        base_dir = "/sys/bus/pci"
> +        full_id = get_full_pci_id(pci_id)
> +        vendor_id = get_vendor_from_pci_id(pci_id)
> +        drv_path = os.path.join(base_dir, "devices/%s/driver" % full_id)
> +        if 'pci-stub' in os.readlink(drv_path):
> +            cmd = "echo '%s' > %s/new_id" % (vendor_id, drv_path)
> +            if os.system(cmd):
> +                return False
> +
> +            stub_path = os.path.join(base_dir, "drivers/pci-stub")
> +            cmd = "echo '%s' > %s/unbind" % (full_id, stub_path)
> +            if os.system(cmd):
> +                return False
> +
> +            driver = self.dev_drivers[pci_id]
> +            cmd = "echo '%s' > %s/bind" % (full_id, driver)
> +            if os.system(cmd):
> +                return False
> +
> +        return True
> +
> +
> +    def get_vf_devs(self):
> +        """
> +        Catch all VFs PCI IDs.
> +
> +        @return: List with all PCI IDs for the Virtual Functions avaliable
> +        """
> +        if not self.sr_iov_setup():
> +            return []
> +
> +        cmd = "lspci | awk '/Virtual Function/ {print $1}'"
> +        return commands.getoutput(cmd).split()
> +
> +
> +    def get_pf_devs(self):
> +        """
> +        Catch all PFs PCI IDs.
> +
> +        @return: List with all PCI IDs for the physical hardware requested
> +        """
> +        pf_ids = []
> +        for name in self.name_list:
> +            pf_id = self._get_pf_pci_id(name, "Ethernet")
> +            if not pf_id:
> +                continue
> +            pf_ids.append(pf_id)
> +        return pf_ids
> +
> +
> +    def get_devs(self, count):
> +        """
> +        Check out all devices' PCI IDs according to their name.
> +
> +        @param count: count number of PCI devices needed for pass through
> +        @return: a list of all devices' PCI IDs
> +        """
> +        if self.type == "nic_vf":
> +            vf_ids = self.get_vf_devs()
> +        elif self.type == "nic_pf":
> +            vf_ids = self.get_pf_devs()
> +        elif self.type == "mixed":
> +            vf_ids = self.get_vf_devs()
> +            vf_ids.extend(self.get_pf_devs())
> +        return vf_ids[0:count]
> +
> +
> +    def get_vfs_count(self):
> +        """
> +        Get VFs count number according to lspci.
> +        """
> +        cmd = "lspci | grep 'Virtual Function' | wc -l"
> +        # For each VF we'll see 2 prints of 'Virtual Function', so let's
> +        # divide the result per 2
> +        return int(commands.getoutput(cmd)) / 2
> +
> +
> +    def check_vfs_count(self):
> +        """
> +        Check VFs count number according to the parameter driver_options.
> +        """
> +        return (self.get_vfs_count == self.devices_requested)
> +
> +
> +    def is_binded_to_stub(self, full_id):
> +        """
> +        Verify whether the device with full_id is already binded to pci-stub.
> +
> +        @param full_id: Full ID for the given PCI device
> +        """
> +        base_dir = "/sys/bus/pci"
> +        stub_path = os.path.join(base_dir, "drivers/pci-stub")
> +        if os.path.exists(os.path.join(stub_path, full_id)):
> +            return True
> +        return False
> +
> +
> +    def sr_iov_setup(self):
> +        """
> +        Ensure the PCI device is working in sr_iov mode.
> +
> +        Check if the PCI hardware device drive is loaded with the appropriate,
> +        parameters (number of VFs), and if it's not, perform setup.
> +
> +        @return: True, if the setup was completed successfuly, False otherwise.
> +        """
> +        re_probe = False
> +        s, o = commands.getstatusoutput('lsmod | grep %s' % self.driver)
> +        if s:
> +            re_probe = True
> +        elif not self.check_vfs_count():
> +            os.system("modprobe -r %s" % self.driver)
> +            re_probe = True
> +
> +        # Re-probe driver with proper number of VFs
> +        if re_probe:
> +            cmd = "modprobe %s %s" % (self.driver, self.driver_option)
> +            s, o = commands.getstatusoutput(cmd)
> +            if s:
> +                return False
> +            if not self.check_vfs_count():
> +                return False
> +        return True
> +
> +
> +    def request_devs(self):
> +        """
> +        Implement setup process: unbind the PCI device and then bind it
> +        to the pci-stub driver.
> +
> +        @return: a list of successfully requested devices' PCI IDs.
> +        """
> +        base_dir = "/sys/bus/pci"
> +        stub_path = os.path.join(base_dir, "drivers/pci-stub")
> +
> +        self.pci_ids = self.get_devs(self.devices_requested)
> +        logging.debug("The following pci_ids were found: %s" % self.pci_ids)
> +        requested_pci_ids = []
> +        self.dev_drivers = {}
> +
> +        # Setup all devices specified for assignment to guest
> +        for pci_id in self.pci_ids:
> +            full_id = get_full_pci_id(pci_id)
> +            if not full_id:
> +                continue
> +            drv_path = os.path.join(base_dir, "devices/%s/driver" % full_id)
> +            dev_prev_driver= os.path.realpath(os.path.join(drv_path,
> +                                              os.readlink(drv_path)))
> +            self.dev_drivers[pci_id] = dev_prev_driver
> +
> +            # Judge whether the device driver has been binded to stub
> +            if not self.is_binded_to_stub(full_id):
> +                logging.debug("Binding device %s to stub" % full_id)
> +                vendor_id = get_vendor_from_pci_id(pci_id)
> +                stub_new_id = os.path.join(stub_path, 'new_id')
> +                unbind_dev = os.path.join(drv_path, 'unbind')
> +                stub_bind = os.path.join(stub_path, 'bind')
> +
> +                info_write_to_files = [(vendor_id, stub_new_id),
> +                                       (full_id, unbind_dev),
> +                                       (full_id, stub_bind)]
> +
> +                for content, file in info_write_to_files:
> +                    try:
> +                        utils.open_write_close(content, file)
> +                    except IOError:
> +                        logging.debug("Failed to write %s to file %s" %
> +                                      (content, file))
> +                        continue
> +
> +                if not self.is_binded_to_stub(full_id):
> +                    logging.error("Binding device %s to stub failed" %
> +                                  pci_id)
> +                    continue
> +            else:
> +                logging.debug("Device %s already binded to stub" % pci_id)
> +            requested_pci_ids.append(pci_id)
> +        self.pci_ids = requested_pci_ids
> +        return self.pci_ids
> +
> +
> +    def release_devs(self):
> +        """
> +        Release all PCI devices currently assigned to VMs back to the
> +        virtualization host.
> +        """
> +        try:
> +            for pci_id in self.dev_drivers:
> +                if not self._release_dev(pci_id):
> +                    logging.error("Failed to release device %s to host" %
> +                                  pci_id)
> +                else:
> +                    logging.info("Released device %s successfully" % pci_id)
> +        except:
> +            return
> diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
> index cc314d4..a86c124 100755
> --- a/client/tests/kvm/kvm_vm.py
> +++ b/client/tests/kvm/kvm_vm.py
> @@ -304,6 +304,12 @@ class VM:
>          elif params.get("uuid"):
>              qemu_cmd += " -uuid %s" % params.get("uuid")
>
> +        # If the PCI assignment step went OK, add each one of the PCI assigned
> +        # devices to the qemu command line.
> +        if self.pci_assignable:
> +            for pci_id in self.pa_pci_ids:
> +                qemu_cmd += " -pcidevice host=%s" % pci_id
> +
>          return qemu_cmd
>
>
> @@ -392,6 +398,50 @@ class VM:
>                  self.uuid = f.read().strip()
>                  f.close()
>
> +            if not params.get("pci_assignable") == "no":
> +                pa_type = params.get("pci_assignable")
> +                pa_devices_requested = params.get("devices_requested")
> +
> +                # Virtual Functions (VF) assignable devices
> +                if pa_type == "vf":
> +                    pa_driver = params.get("driver")
> +                    pa_driver_option = params.get("driver_option")
> +                    self.pci_assignable = kvm_utils.PciAssignable(type=pa_type,
> +                        driver=pa_driver,
> +                        driver_option=pa_driver_option,
> +                        devices_requested=pa_devices_requested)
> +                # Physical NIC (PF) assignable devices
> +                elif pa_type == "pf":
> +                    pa_device_names = params.get("device_names")
> +                    self.pci_assignable = kvm_utils.PciAssignable(type=pa_type,
> +                        names=pa_device_names,
> +                        devices_requested=pa_devices_requested)
> +                # Working with both VF and PF
> +                elif pa_type == "mixed":
> +                    pa_device_names = params.get("device_names")
> +                    pa_driver = params.get("driver")
> +                    pa_driver_option = params.get("driver_option")
> +                    self.pci_assignable = kvm_utils.PciAssignable(type=pa_type,
> +                        driver=pa_driver,
> +
driver_option=pa_driver_option,
> +                        names=pa_device_names,
> +                        devices_requested=pa_devices_requested)
> +
> +                self.pa_pci_ids = self.pci_assignable.request_devs()
> +
> +                if self.pa_pci_ids:
> +                    logging.debug("Successfuly assigned devices: %s" %
> +                                  self.pa_pci_ids)
> +                else:
> +                    logging.error("No PCI assignable devices were assigned "
> +                                  "and 'pci_assignable' is defined to %s "
> +                                  "on your config file. Aborting VM creation." %
> +                                  pa_type)
> +                    return False
> +
> +            else:
> +                self.pci_assignable = None

It's weird that even though we initialize 'self.pci_assignable' when
params.get("pci_assignable") == "no", we still got this backtrace:

03:27:38 ERROR| child process failed
03:27:38 INFO |         FAIL    kvm.Fedora.11.64.boot   kvm.Fedora.11.64.boot   timestamp=1263284858    localtime=Jan 12 03:27:38       Unhandled AttributeError: VM instance has no attribute 'pci_assignable'
  Traceback (most recent call last):
    File "/root/pci_assign/client/common_lib/test.py", line 595, in _call_test_function
      return func(*args, **dargs)
    File "/root/pci_assign/client/common_lib/test.py", line 281, in execute
      postprocess_profiled_run, args, dargs)
    File "/root/pci_assign/client/common_lib/test.py", line 202, in _call_run_once
      self.run_once(*args, **dargs)
    File "/root/pci_assign/client/tests/kvm/kvm.py", line 69, in run_once
      kvm_preprocessing.postprocess(self, params, env)
    File "/root/pci_assign/client/tests/kvm/kvm_preprocessing.py", line 271, in postprocess
      process(test, params, env, postprocess_image, postprocess_vm)
    File "/root/pci_assign/client/tests/kvm/kvm_preprocessing.py", line 178, in process
      vm_func(test, vm_params, env, vm_name)
    File "/root/pci_assign/client/tests/kvm/kvm_preprocessing.py", line 126, in postprocess_vm
      vm.destroy(gracefully = params.get("kill_vm_gracefully") == "yes")
    File "/root/pci_assign/client/tests/kvm/kvm_vm.py", line 590, in destroy
      if self.pci_assignable:
  AttributeError: VM instance has no attribute 'pci_assignable'
03:27:38 INFO |         END FAIL        kvm.Fedora.11.64.boot   kvm.Fedora.11.64.boot   timestamp=1263284858    localtime=Jan 12 03:27:38

So I added 'self.pci_assignable = None' in __init__(), and that fixed the
problem.

> +
>              # Make qemu command
>              qemu_command = self.make_qemu_command()
>
> @@ -537,6 +587,8 @@ class VM:
>              # Is it already dead?
>              if self.is_dead():
>                  logging.debug("VM is already down")
> +                if self.pci_assignable:
> +                    self.pci_assignable.release_devs()
>                  return
>
>              logging.debug("Destroying VM with PID %d..." %
> @@ -557,6 +609,9 @@ class VM:
>                          return
>                      finally:
>                          session.close()
> +                        if self.pci_assignable:
> +                            self.pci_assignable.release_devs()
> +
>
>              # Try to destroy with a monitor command
>              logging.debug("Trying to kill VM with monitor command...")
> @@ -566,6 +621,8 @@ class VM:
>              # Wait for the VM to be really dead
>              if kvm_utils.wait_for(self.is_dead, 5, 0.5, 0.5):
>                  logging.debug("VM is down")
> +                if self.pci_assignable:
> +                    self.pci_assignable.release_devs()
>                  return
>
>              # If the VM isn't dead yet...
> @@ -575,6 +632,8 @@ class VM:
>              # Wait for the VM to be really dead
>              if kvm_utils.wait_for(self.is_dead, 5, 0.5, 0.5):
>                  logging.debug("VM is down")
> +                if self.pci_assignable:
> +                    self.pci_assignable.release_devs()
>                  return
>
>              logging.error("Process %s is a zombie!" % self.process.get_pid())
> diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample
> index a403399..b7ee2e1 100644
> --- a/client/tests/kvm/tests_base.cfg.sample
> +++ b/client/tests/kvm/tests_base.cfg.sample
> @@ -884,3 +884,23 @@ variants:
>          pre_command = "/usr/bin/python scripts/hugepage.py /mnt/kvm_hugepage"
>          extra_params += " -mem-path /mnt/kvm_hugepage"
>
> +
> +variants:
> +    - @no_pci_assignable:
> +        pci_assignable = no
> +    - pf_assignable:
> +        pci_assignable = pf
> +        device_names = eth1
> +    - vf_assignable:
> +        pci_assignable = vf
> +        # Driver (kernel module) that supports SR-IOV hardware.
> +        # As of today (30-11-2009), we have 2 drivers for this type of hardware:
> +        # Intel® 82576 Gigabit Ethernet Controller - igb
> +        # Neterion® X3100™ - vxge
> +        driver = igb
> +        # Driver option to specify the maximum number of virtual functions
> +        # (on vxge the option is , for example, is max_config_dev)
> +        # the default below is for the igb driver
> +        driver_option = "max_vfs=7"
> +        # Number of devices that are going to be requested.
> +        devices_requested = 7
> --
> 1.6.5.2

After fixing the above two problems, the patch runs well:

# ./scan_results.py
Test                              Status  Seconds  Info
----                              ------  -------  ----
        (Result file: ../../results/default/status)
Fedora.11.64.boot                 GOOD    51       completed successfully
pf_assignable.Fedora.11.64.boot   GOOD    19       completed successfully
vf_assignable.Fedora.11.64.boot   GOOD    14       completed successfully
----                              GOOD    117

Thanks very much for improving this patch. :-)
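
For reference, the two fixes described in the comments above could look
roughly like the sketch below. It is only an illustration under stated
assumptions, not the exact change that was tested: PciAssignableSketch and
VMSketch are hypothetical, stripped-down stand-ins for
kvm_utils.PciAssignable and kvm_vm.VM, the default of 0 for
'devices_requested' is an assumed choice, and check_vfs_count() takes the
current VF count as an argument only so that the sketch does not need to
call lspci.

```python
class PciAssignableSketch(object):
    """Hypothetical, stripped-down stand-in for kvm_utils.PciAssignable."""

    def __init__(self, type="nic_vf", driver=None, driver_option=None,
                 names=None, devices_requested=None):
        self.type = type
        self.driver = driver
        self.driver_option = driver_option
        # Always define both attributes, even when the caller omits them,
        # so that later methods never raise AttributeError.
        self.name_list = names.split() if names else []
        # Assumption of this sketch: fall back to 0 when nothing is requested.
        self.devices_requested = int(devices_requested or 0)

    def check_vfs_count(self, current_vfs):
        # 'current_vfs' is passed in to keep the sketch free of lspci calls;
        # the patch obtains this number from get_vfs_count().
        return current_vfs == self.devices_requested


class VMSketch(object):
    """Hypothetical, stripped-down stand-in for kvm_vm.VM."""

    def __init__(self):
        # Defined up front so destroy() is safe even if create() never
        # reached the PCI assignment step.
        self.pci_assignable = None

    def destroy(self):
        if self.pci_assignable:
            self.pci_assignable.release_devs()
        return True
```

With both attributes set in the constructors, calling destroy() on a VM
that was never fully created no longer depends on which branch of create()
ran.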