vfio problem

public inbox for kvm@vger.kernel.org

* vfio problem
@ 2012-06-08 16:58 Andreas Hartmann
  2012-06-08 17:35 ` Alex Williamson
  2012-06-08 17:39 ` Andreas Hartmann
  0 siblings, 2 replies; 11+ messages in thread
From: Andreas Hartmann @ 2012-06-08 16:58 UTC
  To: Alex Williamson; +Cc: KVM

Hello Alex,

You can probably tell me what this message on the host side means:

kernel: [ 3902.124109] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded

The WLAN card in the VM doesn't work any more. The message appeared after
restarting the VM a few times (with the unbind / rebind procedures).

I'll see if it is reproducible. I had to reboot to get it working again.

Thanks,
regards,
Andreas


* Re: vfio problem
  2012-06-08 16:58 vfio problem Andreas Hartmann
@ 2012-06-08 17:35 ` Alex Williamson
  2012-06-09 13:42   ` Andreas Hartmann
  2012-06-08 17:39 ` Andreas Hartmann
  1 sibling, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2012-06-08 17:35 UTC
  To: Andreas Hartmann; +Cc: KVM

On Fri, 2012-06-08 at 18:58 +0200, Andreas Hartmann wrote:
> Hello Alex,
> 
> You can probably say, what this message on host side means:
> 
> kernel: [ 3902.124109] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded

We've hit the limit of locked pages.  Are you trying to run as root or a
normal user?  If the latter, you need to play with ulimits to increase
the size.

> The WLAN card in the VM doesn't work any more. It came up after a few
> times of restarting the VM (with unbinding / rebinding - procedures).

Do I recall correctly that you reported a message about the WLAN device
not supporting reset?  Unfortunately devices are mostly black
boxes as far as VFIO is concerned, so if the device doesn't support
reset and doesn't have its own device-specific reset and doesn't simply
start behaving when we restore config space, there's little for vfio to
do.  We do have a bit more flexibility in performing a secondary bus
reset on the bridge since we own everything below the bridge.  We
probably need to consider adding a group reset ioctl to take advantage
of that.

> I'll see if it is reproducible. I had to reboot to get it working again.

I'm definitely curious if there's anything cumulative about the locked
memory problem above.  Thanks,

Alex


* Re: vfio problem
  2012-06-08 16:58 vfio problem Andreas Hartmann
  2012-06-08 17:35 ` Alex Williamson
@ 2012-06-08 17:39 ` Andreas Hartmann
  2012-06-08 18:17   ` Alex Williamson
  1 sibling, 1 reply; 11+ messages in thread
From: Andreas Hartmann @ 2012-06-08 17:39 UTC
  To: Alex Williamson; +Cc: KVM

Andreas Hartmann wrote:
> Hello Alex,
> 
> You can probably say, what this message on host side means:
> 
> kernel: [ 3902.124109] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
> 
> The WLAN card in the VM doesn't work any more. It came up after a few
> times of restarting the VM (with unbinding / rebinding - procedures).
> 
> I'll see if it is reproducible. I had to reboot to get it working again.

It is reproducible. And it seems not to be a problem of binding /
unbinding; rather, not starting the VM as the root user seems to be the
problem.

I never saw these problems with a root VM (and root does have the same
value for ulimit -l).

- Is it possible to run the VM / VFIO in user context?
- What size should be used for ulimit -l?


Thanks,
Andreas


* Re: vfio problem
  2012-06-08 17:39 ` Andreas Hartmann
@ 2012-06-08 18:17   ` Alex Williamson
  2012-06-08 19:43     ` Andreas Hartmann
  2012-06-09  0:00     ` Andreas Hartmann
  0 siblings, 2 replies; 11+ messages in thread
From: Alex Williamson @ 2012-06-08 18:17 UTC
  To: Andreas Hartmann; +Cc: KVM

On Fri, 2012-06-08 at 19:39 +0200, Andreas Hartmann wrote:
> Andreas Hartmann wrote:
> > Hello Alex,
> > 
> > You can probably say, what this message on host side means:
> > 
> > kernel: [ 3902.124109] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
> > 
> > The WLAN card in the VM doesn't work any more. It came up after a few
> > times of restarting the VM (with unbinding / rebinding - procedures).
> > 
> > I'll see if it is reproducible. I had to reboot to get it working again.
> 
> It is reproducible. And id seems not to be a problem of binding /
> unbinding, but the fact of not starting it as root user seems to be the
> problem.
> 
> I never saw these problems with a root VM (and root does have the same
> value for ulimit -l).

Yes, this is expected when running as non-root.  VFIO needs to lock
pages on behalf of the user, so the user needs limits granted to be able
to do that.  Otherwise a VFIO user could lock down all the memory in the
system.

> - Is it possible to run the VM / VFIO in user context?

Yes, this is one of the key design requirements of VFIO.
Prerequisites are that a privileged entity has sequestered all the
devices as being owned by vfio-pci or pci-stub, the user has permissions
to /dev/vfio/<group#> and /dev/vfio/vfio (the latter is expected to be
safe to leave as 0666), and the user has limits set to lock pages
sufficient for what they need (note that the default of 64k might be
enough for some userspace driver applications).
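
The permission part of these prerequisites can be checked before the VM is
started. The following is only a minimal sketch: the function name
vfio_nodes_accessible is hypothetical, and the caller is assumed to pass the
real node paths (/dev/vfio/vfio and the /dev/vfio/<group#> node for the
device's group).

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can open the given
# VFIO nodes read/write. The paths are arguments so that the caller decides
# which nodes to check.
vfio_nodes_accessible() {
    for node in "$@"; do
        if [ ! -r "$node" ] || [ ! -w "$node" ]; then
            echo "no access: $node"
            return 1
        fi
    done
    echo "ok"
}
```

Run as the unprivileged VM user, it prints the first node that cannot be
opened read/write and returns non-zero, or prints "ok" if all nodes pass.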

> - What size should be used for ulimit -l?

It should be about the size of memory assigned to the guest.
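
Since ulimit -l takes its value in KiB, the sizing rule above can be written
as a small calculation. This is a sketch only: the helper name
memlock_kib_for_guest and the optional headroom argument are assumptions for
illustration, not part of VFIO or QEMU.

```shell
#!/bin/sh
# Hypothetical helper: convert a guest RAM size in MiB (plus optional
# headroom in MiB) into the KiB value that "ulimit -l" expects.
memlock_kib_for_guest() {
    guest_mib=$1
    headroom_mib=${2:-0}
    echo $(( (guest_mib + headroom_mib) * 1024 ))
}

# Possible use before starting the VM (not executed here):
#   ulimit -l "$(memlock_kib_for_guest 256 256)"
```

For a 256 MB guest with another 256 MB of headroom this yields 524288, which
is the value used with ulimit -l in the start script later in this thread.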

Once we have libvirt support, all of this should be relatively
transparent as that will take care of the limits setting.  For now, it's
a bit of a pain running it as a normal user.  If you come up with an
easy way of doing it, please share.  Thanks,

Alex


* Re: vfio problem
  2012-06-08 18:17   ` Alex Williamson
@ 2012-06-08 19:43     ` Andreas Hartmann
  2012-06-09  0:00     ` Andreas Hartmann
  1 sibling, 0 replies; 11+ messages in thread
From: Andreas Hartmann @ 2012-06-08 19:43 UTC
  To: Alex Williamson; +Cc: KVM

Alex Williamson wrote:
> On Fri, 2012-06-08 at 19:39 +0200, Andreas Hartmann wrote:
[...]
>> - What size should be used for ulimit -l?
> 
> It should be about the size of memory assigned to the guest.

OK, I'm using 256MB, which means I should try setting ulimit -l to 256MB.
Nevertheless, I wonder why I don't get the same problem as root.
Apparently the limit is ignored for root.

> Once we have libvirt support, all of this should be relatively
> transparent as that will take care of the limits setting. For now, it's
> a bit of a pain running it as a normal user.  If you come up with an
> easy way of doing it, please share.  Thanks,

Hmm, it should work like this (I'm already running VMs this way, but
without changes to ulimit):

Define a new user and run libvirt as this user. For this user context,
ulimit can be set to, let's say, 256MB.
I'll try it.

Thanks for your explanations,
Andreas


* Re: vfio problem
  2012-06-08 18:17   ` Alex Williamson
  2012-06-08 19:43     ` Andreas Hartmann
@ 2012-06-09  0:00     ` Andreas Hartmann
  2012-06-09 14:09       ` Alex Williamson
  1 sibling, 1 reply; 11+ messages in thread
From: Andreas Hartmann @ 2012-06-09  0:00 UTC
  To: Alex Williamson; +Cc: KVM

Alex Williamson wrote:
> On Fri, 2012-06-08 at 19:39 +0200, Andreas Hartmann wrote:
>> Andreas Hartmann wrote:
>>> Hello Alex,
>>>
>>> You can probably say, what this message on host side means:
>>>
>>> kernel: [ 3902.124109] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
>>>
>>> The WLAN card in the VM doesn't work any more. It came up after a few
>>> times of restarting the VM (with unbinding / rebinding - procedures).
>>>
>>> I'll see if it is reproducible. I had to reboot to get it working again.
>>
>> It is reproducible. And id seems not to be a problem of binding /
>> unbinding, but the fact of not starting it as root user seems to be the
>> problem.
>>
>> I never saw these problems with a root VM (and root does have the same
>> value for ulimit -l).
> 
> Yes, this is expected when running as non-root.  VFIO needs to lock
> pages on behalf of the user, so the user needs limits granted to be able
> to do that.  Otherwise a VFIO user could lock down all the memory in the
> system.
> 
>> - Is it possible to run the VM / VFIO in user context?
> 
> Yes, this is one of the key design requirements of VFIO.
> Pre-requirements are that a privileged entity has sequestered all the
> devices as being owned by vfio-pci or pci-stub, the user has permissions
> to /dev/vfio/<group#> and /dev/vfio/vfio (the latter is expected to be
> safe to leave as 0666), and the user has limits set to lock pages
> sufficient for what they need (note that the default of 64k might be
> enough for some userspace driver applications).

I had already read about the permissions, but I wasn't aware that I have to
set the max locked memory for the user.

I set it to 512 MB (for a VM with 256 MB plus some overhead) to begin
with. It could most probably be reduced.

>> - What size should be used for ulimit -l?
> 
> It should be about the size of memory assigned to the guest.

Thanks for this hint! 64k is definitely not enough for my VM :-).

> Once we have libvirt support, all of this should be relatively
> transparent as that will take care of the limits setting.

If you are aware of it, it's no problem ... .

>  For now, it's
> a bit of a pain running it as a normal user. 

Not that much. It's good to gain an understanding of what is going on.

> If you come up with an
> easy way of doing it, please share.

I don't know if it's easy, but it's straightforward and it works:

1. define a user for running the VM.
2. create a VM for virsh (xml-File)
   libvirt has the ability to freely define parameters for qemu. That's
   the way I enabled libvirt to support VFIO:

   <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
	...
	<devices>
		...
	</devices>
	<qemu:commandline> <!-- instead of hostdev ... -->
		<qemu:arg value='-device'/>
		<qemu:arg value='vfio-pci,host=06:07.0'/>
    		<!-- <qemu:env name='QEMU_ENV' value='VAL'/> -->
	</qemu:commandline>
   </domain>

3. set max locked memory limit in /etc/security/limits.conf (PAM).
4. write a small start script which contains the bindings and the start
   of the VM as root with
   su user -c "virsh start VM"
   Don't forget to increase the max locked memory before virsh start if
   the default size is less than the desired size for the VM.
5. write a small stop script to shutdown the VM with
   su user -c "virsh shutdown VM" and free the devices again.
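
For step 3, the entries in /etc/security/limits.conf take the form
"<domain> <type> <item> <value>", with memlock given in KB. A sketch that
prints such entries follows; the function name limits_conf_memlock is
hypothetical, and the user name and size are whatever applies locally.

```shell
#!/bin/sh
# Hypothetical helper: print soft and hard memlock entries in
# limits.conf format for one user. The value is in KB.
limits_conf_memlock() {
    user=$1
    kib=$2
    printf '%s soft memlock %s\n' "$user" "$kib"
    printf '%s hard memlock %s\n' "$user" "$kib"
}

# Possible use as root (not executed here):
#   limits_conf_memlock vm 524288 >> /etc/security/limits.conf
```

The limits only take effect for sessions that go through PAM with pam_limits
enabled, so it is worth verifying the result with "ulimit -l" as that user.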

The start / stop scripts could be made available via sudo to enable a user
without root authorization to start the VM. You can even start it with an icon
with your favourite GUI if you want to (women acceptance factor).


BTW:
While playing around (yes, it was my fault), I accidentally tried to
free the devices while the VM was still running. Vfio wasn't that
happy about this accident :-) :


Jun  9 01:01:49 host kernel: [20108.859106] ------------[ cut here ]------------
Jun  9 01:01:49 host kernel: [20108.859117] kernel BUG at /rpm/BUILD/kernel-desktop-3.4.1/linux-3.4/drivers/vfio/vfio.c:574!
Jun  9 01:01:49 host kernel: [20108.859125] invalid opcode: 0000 [#1] PREEMPT SMP 
Jun  9 01:01:49 host kernel: [20108.859131] CPU 3 
Jun  9 01:01:49 host kernel: [20108.859133] Modules linked in: xfs vfio_iommu_type1 vfio_pci vfio ... [last unloaded: microcode]
Jun  9 01:01:49 host kernel: [20108.859226] 
Jun  9 01:01:49 host kernel: [20108.859231] Pid: 22178, comm: stopVM Tainted: P           O 3.4.1-3.1-desktop #1 Gigabyte Technology Co., Ltd. GA-990XA-UD3/GA-990XA-UD3
Jun  9 01:01:49 host kernel: [20108.859241] RIP: 0010:[<ffffffffa077f0f3>]  [<ffffffffa077f0f3>] vfio_iommu_group_notifier+0x1f3/0x300 [vfio]
Jun  9 01:01:49 host kernel: [20108.859252] RSP: 0018:ffff8801f25f9d18  EFLAGS: 00010282
Jun  9 01:01:49 host kernel: [20108.859259] RAX: 00000000ffffffea RBX: ffff88022065a6c0 RCX: 0000000000000008
Jun  9 01:01:49 host kernel: [20108.859265] RDX: ffff880223f98090 RSI: ffff880223f98090 RDI: 0000000000000000
Jun  9 01:01:49 host kernel: [20108.859271] RBP: ffff88022065a718 R08: ffff88022065a6c0 R09: 000000000000000a
Jun  9 01:01:49 host kernel: [20108.859277] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880223f98090
Jun  9 01:01:49 host kernel: [20108.859283] R13: ffffffffa04bb8f9 R14: 0000000000000000 R15: ffff880222906b40
Jun  9 01:01:49 host kernel: [20108.859290] FS:  00007f8613318700(0000) GS:ffff88022ecc0000(0000) knlGS:0000000000000000
Jun  9 01:01:49 host kernel: [20108.859297] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun  9 01:01:49 host kernel: [20108.859303] CR2: 00007f6c0e8674f8 CR3: 000000013a066000 CR4: 00000000000407e0
Jun  9 01:01:49 host kernel: [20108.859310] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun  9 01:01:49 host kernel: [20108.859316] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun  9 01:01:49 host kernel: [20108.859322] Process stopVM (pid: 22178, threadinfo ffff8801f25f8000, task ffff880111176580)
Jun  9 01:01:49 host kernel: [20108.859328] Stack:
Jun  9 01:01:49 host kernel: [20108.859333]  ffff8801f25f9d58 ffffffff8107471f ffff8801109b3e40 0000000000000004
Jun  9 01:01:49 host kernel: [20108.859342]  ffff880223f98090 00000000ffffffff 0000000000000000 ffffffff815b5e25
Jun  9 01:01:49 host kernel: [20108.859350]  ffff880221585e78 0000000000000004 ffff880223f98090 00000000ffffffff
Jun  9 01:01:49 host kernel: [20108.859358] Call Trace:
Jun  9 01:01:49 host kernel: [20108.859384]  [<ffffffff815b5e25>] notifier_call_chain+0x45/0x60
Jun  9 01:01:49 host kernel: [20108.859395]  [<ffffffff8106b886>] __blocking_notifier_call_chain+0x56/0x90
Jun  9 01:01:49 host kernel: [20108.859405]  [<ffffffff81497e79>] iommu_bus_notifier+0x89/0xe0
Jun  9 01:01:49 host kernel: [20108.859414]  [<ffffffff815b5e25>] notifier_call_chain+0x45/0x60
Jun  9 01:01:49 host kernel: [20108.859422]  [<ffffffff8106b886>] __blocking_notifier_call_chain+0x56/0x90
Jun  9 01:01:49 host kernel: [20108.859433]  [<ffffffff813bbb30>] really_probe+0xc0/0x300
Jun  9 01:01:49 host kernel: [20108.859442]  [<ffffffff813bbef7>] driver_probe_device+0x47/0xb0
Jun  9 01:01:49 host kernel: [20108.859450]  [<ffffffff813ba6f2>] driver_bind+0xd2/0x110
Jun  9 01:01:49 host kernel: [20108.859459]  [<ffffffff811db732>] sysfs_write_file+0xd2/0x160
Jun  9 01:01:49 host kernel: [20108.859468]  [<ffffffff8116b2b6>] vfs_write+0xc6/0x180
Jun  9 01:01:49 host kernel: [20108.859476]  [<ffffffff8116b5ce>] sys_write+0x4e/0x90
Jun  9 01:01:49 host kernel: [20108.859484]  [<ffffffff815b99f9>] system_call_fastpath+0x16/0x1b
Jun  9 01:01:49 host kernel: [20108.859494]  [<00007f86127a6190>] 0x7f86127a618f
Jun  9 01:01:49 host kernel: [20108.859499] Code: a0 48 c7 c7 60 01 78 a0 e8 4a 01 e3 e0 8b 45 b0 85 c0 0f 84 20 ff ff ff 48 89 de 4c 89 e7 e8 b5 fb ff ff 85 c0 0f 84 0d ff ff ff <0f> 0b 0f 1f 00 48 8b 7d b8 e8 8f 87 d1 e0 49 8b 54 24 50 48 85 
Jun  9 01:01:49 host kernel: [20108.859554] RIP  [<ffffffffa077f0f3>] vfio_iommu_group_notifier+0x1f3/0x300 [vfio]
Jun  9 01:01:49 host kernel: [20108.859563]  RSP <ffff8801f25f9d18>
Jun  9 01:01:49 host kernel: [20108.859573] ---[ end trace faef2013325649a8 ]---

The stopVM script segfaulted.


Thanks,
kind regards,
Andreas


* Re: vfio problem
  2012-06-08 17:35 ` Alex Williamson
@ 2012-06-09 13:42   ` Andreas Hartmann
  2012-06-09 14:30     ` Alex Williamson
  2012-06-09 15:32     ` Andreas Hartmann
  0 siblings, 2 replies; 11+ messages in thread
From: Andreas Hartmann @ 2012-06-09 13:42 UTC
  To: Alex Williamson; +Cc: Andreas Hartmann, KVM

On Fri, 08 Jun 2012 11:35:07 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Fri, 2012-06-08 at 18:58 +0200, Andreas Hartmann wrote:
> > Hello Alex,
> > 
> > You can probably say, what this message on host side means:
> > 
> > kernel: [ 3902.124109] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
> 
> We've hit the limit of locked pages.  Are you trying to run as root or a
> normal user?  If the latter, you need to play with ulimits to increase
> the size.

That's what I did now. What exactly is this memory needed for? I don't
think it is for the complete VM, because the VM without the device passed
through works fine without it (it comes up fine and can be reached via ssh).
That's why I think it's only needed for the communication between
the device and the guest. But why so much then?
I think I haven't understood it correctly yet ... .

> 
> > The WLAN card in the VM doesn't work any more. It came up after a few
> > times of restarting the VM (with unbinding / rebinding - procedures).
> 
> Do I recall correctly you reporting a message about the device not
> supporting reset for the WLAN?` 

Yes.

> Unfortunately devices are mostly black
> boxes as far as VFIO is concerned, so if the device doesn't support
> reset and doesn't have it's own device specific reset and doesn't simply
> start behaving when we restore config space, there's little for vfio to
> do.  We do have a bit more flexibility in performing a secondary bus
> reset on the bridge since we own everything below the bridge.  We
> probably need to consider adding a group reset ioctl to take advantage
> of that.
> 
> > I'll see if it is reproducible. I had to reboot to get it working again.
> 
> I'm definitely curious if there's anything cumulative about the locked
> memory problem above.  Thanks,

Ok, I managed to reproduce it. I'll describe how, step by step.

- set the locked memory limit low (64k)
- start VM:
  qemu-system-x86_64: vfio_dma_map(0x7fbfcf4fd170, 0x00000000febe0000, 0x10000, 0x7fbfb57b0000) = -12 (Cannot allocate memory)
  Jun  9 14:11:33 host kernel: [12001.026007] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
- VM is up
- module rt2800pci in VM is loaded fine - no errors can be seen in log.
- but: device doesn't work (no beaconing)
- stop hostapd
- unload wlan stack (hardware + nl80211)
- reload wlan stack 
- start hostapd
  Jun  9 14:16:17 vm kernel: [  286.088795] phy0 -> rt2x00lib_request_firmware: Info - Loading firmware file 'rt2860.bin'.
  Jun  9 14:16:17 vm kernel: [  286.090251] phy0 -> rt2x00lib_request_firmware: Info - Firmware detected - version: 0.34.
  Jun  9 14:16:18 vm kernel: [  287.194351] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
  Jun  9 14:16:19 vm kernel: [  288.294350] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
  Jun  9 14:16:19 vm kernel: [  288.294358] phy0 -> rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5).
- shutdown VM (virsh shutdown VM)


- set the locked memory limit to 512M
- start VM (no RLIMIT_MEMLOCK error)
- VM is up
- module rt2800pci doesn't load correctly:
  Jun  9 14:24:27 vm kernel: [    8.544858] phy0 -> rt2x00lib_request_firmware: Info - Loading firmware file 'rt2860.bin'.
  Jun  9 14:24:27 vm kernel: [    8.547870] phy0 -> rt2x00lib_request_firmware: Info - Firmware detected - version: 0.34.
  Jun  9 14:24:28 vm kernel: [    9.652364] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
  Jun  9 14:24:29 vm kernel: [   10.752363] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
  Jun  9 14:24:29 vm kernel: [   10.752371] phy0 -> rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5).


I couldn't get rid of this error except by rebooting.
I tried with and without including the bridge in the bind procedure. I even
tried to get it working again by loading the module on the host. Could
it possibly be an issue with rt2800pci?




Start script for VM:
---------------------------------------------------------------
#!/bin/sh
# plain sh has no "function" keyword; use the portable definition form
bind() {
modprobe vfio-pci
# not necessary
#echo "1002 4385" > /sys/bus/pci/drivers/pci-stub/new_id
#echo 0000:00:14.0 > /sys/bus/pci/devices/0000:00:14.0/driver/unbind
#echo 0000:00:14.0 > /sys/bus/pci/drivers/pci-stub/bind

echo "1002 439c" > /sys/bus/pci/drivers/pci-stub/new_id
echo 0000:00:14.1 > /sys/bus/pci/devices/0000:00:14.1/driver/unbind
echo 0000:00:14.1 > /sys/bus/pci/drivers/pci-stub/bind

echo "1002 4383" > /sys/bus/pci/drivers/pci-stub/new_id
echo 0000:00:14.2 > /sys/bus/pci/devices/0000:00:14.2/driver/unbind
echo 0000:00:14.2 > /sys/bus/pci/drivers/pci-stub/bind

# not necessary
#echo "1002 439d" > /sys/bus/pci/drivers/pci-stub/new_id
#echo 0000:00:14.3 > /sys/bus/pci/devices/0000:00:14.3/driver/unbind
#echo 0000:00:14.3 > /sys/bus/pci/drivers/pci-stub/bind

# not necessary
#echo "1002 4384" > /sys/bus/pci/drivers/pci-stub/new_id
#echo 0000:00:14.4 > /sys/bus/pci/devices/0000:00:14.4/driver/unbind
#echo 0000:00:14.4 > /sys/bus/pci/drivers/pci-stub/bind

echo "1002 4399" > /sys/bus/pci/drivers/pci-stub/new_id
echo 0000:00:14.5 > /sys/bus/pci/devices/0000:00:14.5/driver/unbind
echo 0000:00:14.5 > /sys/bus/pci/drivers/pci-stub/bind

echo "1814 0601" > /sys/bus/pci/drivers/vfio-pci/new_id
echo 0000:06:07.0 > /sys/bus/pci/devices/0000:06:07.0/driver/unbind
echo 0000:06:07.0 > /sys/bus/pci/drivers/vfio-pci/bind

sleep 1

chgrp virt5 /dev/vfio/9
chmod 660 /dev/vfio/9

chgrp virt5 /dev/vfio/vfio
chmod 660 /dev/vfio/vfio
}

if [ -S /a/vm/.libvirt/qemu/lib/VM.monitor ]; then
    echo "VM already running!"
    exit 1
fi

bind

ulimit -l 524288
su vm -c "virsh start VM"
----------------------------------------------------------------



Stop script for VM:
---------------------------------------------------------------
#!/bin/sh

if [ ! -S /a/vm/.libvirt/qemu/lib/VM.monitor ]; then
    echo "VM not running"
    exit 1
fi

su vm -c "virsh shutdown VM"

# wait for VM exited ...
wait4exit ()
{
while true ; do
    sleep 5
    cnt=$(su vm -c "virsh list" | grep -c VM)
    if [ "$cnt" -eq 0 ] ; then
        break
    fi
done
}

wait4exit

# 1002 4385 / 0000:00:14.0 not necessary

echo 0000:00:14.1 > /sys/bus/pci/drivers/pci-stub/unbind
echo "1002 439c" > /sys/bus/pci/drivers/pci-stub/remove_id

echo 0000:00:14.2 > /sys/bus/pci/drivers/pci-stub/unbind
echo "1002 4383" > /sys/bus/pci/drivers/pci-stub/remove_id
# rebind sound
echo 0000:00:14.2 > /sys/bus/pci/drivers/snd_hda_intel/bind


# 1002 439d / 0000:00:14.3 not necessary

# not necessary - bridge 1002 4384 / 0000:00:14.4
#echo 0000:00:14.4 > /sys/bus/pci/drivers/pci-stub/unbind
#echo "1002 4384" > /sys/bus/pci/drivers/pci-stub/remove_id

echo 0000:00:14.5 > /sys/bus/pci/drivers/pci-stub/unbind
echo "1002 4399" > /sys/bus/pci/drivers/pci-stub/remove_id

echo 0000:06:07.0 > /sys/bus/pci/drivers/vfio-pci/unbind
echo "1814 0601" > /sys/bus/pci/drivers/vfio-pci/remove_id
------------------------------------------------------------
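
The wait4exit loop in the stop script polls without an upper bound, so the
script hangs if the guest never shuts down. A bounded variant might look like
the sketch below; the helper name wait_until and its argument order are
assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a check command up to $1 times, sleeping $2
# seconds between attempts. Returns 0 as soon as the check succeeds and
# 1 if it never does.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Possible use in the stop script (not executed here), with a small
# function vm_gone that succeeds once "virsh list" no longer shows the VM:
#   wait_until 24 5 vm_gone || { echo "VM did not stop"; exit 1; }
```

Exiting on timeout also avoids unbinding the devices while the guest is still
running.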


* Re: vfio problem
  2012-06-09  0:00     ` Andreas Hartmann
@ 2012-06-09 14:09       ` Alex Williamson
  0 siblings, 0 replies; 11+ messages in thread
From: Alex Williamson @ 2012-06-09 14:09 UTC
  To: Andreas Hartmann; +Cc: KVM

On Sat, 2012-06-09 at 02:00 +0200, Andreas Hartmann wrote:
> Alex Williamson wrote:
> > On Fri, 2012-06-08 at 19:39 +0200, Andreas Hartmann wrote:
> >> Andreas Hartmann wrote:
> >>> Hello Alex,
> >>>
> >>> You can probably say, what this message on host side means:
> >>>
> >>> kernel: [ 3902.124109] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
> >>>
> >>> The WLAN card in the VM doesn't work any more. It came up after a few
> >>> times of restarting the VM (with unbinding / rebinding - procedures).
> >>>
> >>> I'll see if it is reproducible. I had to reboot to get it working again.
> >>
> >> It is reproducible. And id seems not to be a problem of binding /
> >> unbinding, but the fact of not starting it as root user seems to be the
> >> problem.
> >>
> >> I never saw these problems with a root VM (and root does have the same
> >> value for ulimit -l).
> > 
> > Yes, this is expected when running as non-root.  VFIO needs to lock
> > pages on behalf of the user, so the user needs limits granted to be able
> > to do that.  Otherwise a VFIO user could lock down all the memory in the
> > system.
> > 
> >> - Is it possible to run the VM / VFIO in user context?
> > 
> > Yes, this is one of the key design requirements of VFIO.
> > Pre-requirements are that a privileged entity has sequestered all the
> > devices as being owned by vfio-pci or pci-stub, the user has permissions
> > to /dev/vfio/<group#> and /dev/vfio/vfio (the latter is expected to be
> > safe to leave as 0666), and the user has limits set to lock pages
> > sufficient for what they need (note that the default of 64k might be
> > enough for some userspace driver applications).
> 
> I already read about the permissions, but I wasn't aware, that I have to
> set the max locked memory for the user.
> 
> I set it to 512 MB (for a VM with 256 MB plus some overhead) for the
> beginning. It most probably could be reduced.
> 
> >> - What size should be used for ulimit -l?
> > 
> > It should be about the size of memory assigned to the guest.
> 
> Thanks for this hint! 64k is definitely not enough for my VM :-).
> 
> > Once we have libvirt support, all of this should be relatively
> > transparent as that will take care of the limits setting.
> 
> If you are aware of it, it's no problem ... .
> 
> >  For now, it's
> > a bit of a pain running it as a normal user. 
> 
> Not that much. It's good to become an understanding of what is going on.
> 
> > If you come up with an
> > easy way of doing it, please share.
> 
> I don't know, if it's easy, but it's straight forward and working:
> 
> 1. define a user for running the VM.
> 2. create a VM for virsh (xml-File)
>    libvirt has the ability to freely define parameters for qemu. That's
>    the way I enabled libvirt to support VFIO:
> 
>    <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
> 	...
> 	<devices>
> 		...
> 	</devices>
> 	<qemu:commandline> <!-- instead of hostdev ... -->
> 		<qemu:arg value='-device'/>
> 		<qemu:arg value='vfio-pci,host=06:07.0'/>
>     		<!-- <qemu:env name='QEMU_ENV' value='VAL'/> -->
> 	</qemu:commandline>
>    </domain>
> 
> 3. set max locked memory limit in /etc/security/limits.conf (PAM).
> 4. write a small start script which contains the bindings and the start
>    of the VM as root with
>    su user -c "virsh start VM"
>    Don't forget to increase the max locked memory before virsh start if
>    the default size is less than the desired size for the VM.
> 5. write a small stop script to shutdown the VM with
>    su user -c "virsh shutdown VM" and free the devices again.
> 
> The start / stop scripts could be put to sudo to enable a user without
> root authorization to start the VM. You can even start it with an icon
> with your favourite GUI if you want to (women acceptance factor).
> 
> 
> BTW:
> During playing around (yes, it was my fault), I accidentally tried to
> free the devices though the VM still has been running. Vfio wasn't that
> happy about this accident :-) :
> 
> 
> Jun  9 01:01:49 host kernel: [20108.859106] ------------[ cut here ]------------
> Jun  9 01:01:49 host kernel: [20108.859117] kernel BUG at /rpm/BUILD/kernel-desktop-3.4.1/linux-3.4/drivers/vfio/vfio.c:574!

This is actually an intentional BUG_ON to catch just the sort of thing
you did.  This happens if the group is in use and one of the devices in
the group is re-bound to something other than pci-stub or vfio-pci.
Basically, the system has gone to a non-secure state, with kernel owned
devices and user owned devices in the same active group.  I'll probably
end up attempting to kill the user process when this occurs, but for now
it's just a BUG_ON.  Thanks,

Alex
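
A script can look for this condition by listing the drivers bound to the
devices of a group. The sketch below assumes a kernel that exposes groups
under /sys/kernel/iommu_groups/<n>/devices; the function name
group_foreign_drivers is hypothetical, and treating unbound devices as
harmless is an assumption as well.

```shell
#!/bin/sh
# Hypothetical helper: print "<device> <driver>" for every device in a
# group that is bound to something other than vfio-pci or pci-stub.
# $1 is the group's devices directory, for example
# /sys/kernel/iommu_groups/9/devices (the path layout is an assumption).
group_foreign_drivers() {
    for dev in "$1"/*; do
        [ -e "$dev" ] || continue
        if [ -L "$dev/driver" ]; then
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv=none
        fi
        case $drv in
            vfio-pci|pci-stub|none) ;;
            *) echo "$(basename "$dev") $drv" ;;
        esac
    done
}
```

Empty output means every device in the group is bound to vfio-pci or
pci-stub, or to no driver at all.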



* Re: vfio problem
  2012-06-09 13:42   ` Andreas Hartmann
@ 2012-06-09 14:30     ` Alex Williamson
  2012-06-09 15:58       ` Andreas Hartmann
  2012-06-09 15:32     ` Andreas Hartmann
  1 sibling, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2012-06-09 14:30 UTC
  To: Andreas Hartmann; +Cc: KVM

On Sat, 2012-06-09 at 15:42 +0200, Andreas Hartmann wrote:
> On Fri, 08 Jun 2012 11:35:07 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > On Fri, 2012-06-08 at 18:58 +0200, Andreas Hartmann wrote:
> > > Hello Alex,
> > > 
> > > You can probably say, what this message on host side means:
> > > 
> > > kernel: [ 3902.124109] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
> > 
> > We've hit the limit of locked pages.  Are you trying to run as root or a
> > normal user?  If the latter, you need to play with ulimits to increase
> > the size.
> 
> That's what I did now. What for is this memory exactly needed? I don't
> think for the complete VM, because the VM without the device passed
> through works fine without it (and it comes up fine and can ssh'd). 
> That's why I think, it's just needed for the communication between 
> the device and the guest. But why so much then? 
> I think I didn't got it right until now ... .

For x86 device assignment, we pin all of guest memory.  This allows the
guest to transparently use any guest physical address as a DMA target.
If we didn't have this memory locked, a page of guest memory could be
swapped out by the host just as the assigned device issued a DMA write
to that page.  This would result in corrupted host memory.
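
How much memory a running QEMU process has locked can be read from the VmLck
field of /proc/<pid>/status. This is a sketch under the assumption that the
pages pinned by VFIO are counted in that field, which the RLIMIT_MEMLOCK
message suggests; the function name locked_kib_from_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the VmLck value (in kB) from a file in
# /proc/<pid>/status format. $1 is the path to that file.
locked_kib_from_status() {
    awk '$1 == "VmLck:" { print $2 }' "$1"
}

# Possible use (not executed here), with the PID of the qemu process:
#   locked_kib_from_status /proc/<pid>/status
```

Comparing this value with "ulimit -l" for the same user shows how close the
process is to the limit.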
 
> > > The WLAN card in the VM doesn't work any more. It came up after a few
> > > times of restarting the VM (with unbinding / rebinding - procedures).
> > 
> > Do I recall correctly you reporting a message about the device not
> > supporting reset for the WLAN?` 
> 
> Yes.
> 
> > Unfortunately devices are mostly black
> > boxes as far as VFIO is concerned, so if the device doesn't support
> > reset and doesn't have it's own device specific reset and doesn't simply
> > start behaving when we restore config space, there's little for vfio to
> > do.  We do have a bit more flexibility in performing a secondary bus
> > reset on the bridge since we own everything below the bridge.  We
> > probably need to consider adding a group reset ioctl to take advantage
> > of that.
> > 
> > > I'll see if it is reproducible. I had to reboot to get it working again.
> > 
> > I'm definitely curious if there's anything cumulative about the locked
> > memory problem above.  Thanks,
> 
> Ok, I managed to get it reproducible. I'll describe step by step, how.
> 
> - setting low memory (64k)
> - start VM:
>   qemu-system-x86_64: vfio_dma_map(0x7fbfcf4fd170, 0x00000000febe0000, 0x10000, 0x7fbfb57b0000) = -12 (Cannot allocate memory)
>   Jun  9 14:11:33 host kernel: [12001.026007] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
> - VM is up
> - module rt2800pci in VM is loaded fine - no errors can be seen in log.
> - but: device doesn't work (no beaconing)

I'm surprised that the driver loaded, but not surprised that it doesn't
work since it can't do any DMA.  You were probably getting errors from
the IOMMU in host dmesg here too, right?

> - stop hostapd
> - unload wlan stack (hardware + nl80211)
> - reload wlan stack 
> - start hostapd
>   Jun  9 14:16:17 vm kernel: [  286.088795] phy0 -> rt2x00lib_request_firmware: Info - Loading firmware file 'rt2860.bin'.
>   Jun  9 14:16:17 vm kernel: [  286.090251] phy0 -> rt2x00lib_request_firmware: Info - Firmware detected - version: 0.34.
>   Jun  9 14:16:18 vm kernel: [  287.194351] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:16:19 vm kernel: [  288.294350] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:16:19 vm kernel: [  288.294358] phy0 -> rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5).
> - shutdown VM (virsh shutdown VM)
> 
> 
> - set memory to 512M
> - start VM (no RLIMIT_MEMLOCK error)
> - VM is up
> - module rt2800pci doesn't load correctly:
>   Jun  9 14:24:27 vm kernel: [    8.544858] phy0 -> rt2x00lib_request_firmware: Info - Loading firmware file 'rt2860.bin'.
>   Jun  9 14:24:27 vm kernel: [    8.547870] phy0 -> rt2x00lib_request_firmware: Info - Firmware detected - version: 0.34.
>   Jun  9 14:24:28 vm kernel: [    9.652364] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:24:29 vm kernel: [   10.752363] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:24:29 vm kernel: [   10.752371] phy0 -> rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5).
> 
> 
> I could not clear this error except by rebooting.
> I tried with and without including the bridge in the bind procedure. I
> even tried to get it working again by loading the module on the host.
> Could it be an issue with rt2800pci?

Quite possibly.  Since the device doesn't have a reset at the PCI level,
it's probably getting left in a weird state, perhaps still attempting to
do DMA from the first guest boot.  If rt2800pci isn't robust enough to
pull the device out of this mode, there's not much to do except some
kind of hard reset, like rebooting the host.  We need to figure out how
we can take advantage of this device being behind a PCI-to-PCI bridge,
possibly by issuing a secondary bus reset on that bridge, which could
get the device back to a known state.  Thanks,

Alex
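
For illustration, the secondary bus reset discussed above is controlled by
one bit in the bridge's config space: the 16-bit Bridge Control register at
offset 0x3E of the type 1 header, bit 6 (the same values Linux names
PCI_BRIDGE_CONTROL and PCI_BRIDGE_CTL_BUS_RESET). The sketch below only
computes the register values to write; bus_reset_values is a hypothetical
helper, not a VFIO or QEMU interface, and the sample register value is made
up. Actually writing these values resets every device below the bridge, so
this is a sketch of the mechanism, not a recommended procedure.

```python
# Offsets/bits from the PCI-to-PCI bridge (type 1) config header.
PCI_BRIDGE_CONTROL = 0x3E          # 16-bit Bridge Control register
PCI_BRIDGE_CTL_BUS_RESET = 0x0040  # bit 6: Secondary Bus Reset


def bus_reset_values(bridge_control):
    """Return (asserted, deasserted) Bridge Control values.

    bridge_control is the current 16-bit register content. All other
    bits are preserved; only the Secondary Bus Reset bit changes.
    """
    asserted = (bridge_control | PCI_BRIDGE_CTL_BUS_RESET) & 0xFFFF
    deasserted = bridge_control & ~PCI_BRIDGE_CTL_BUS_RESET & 0xFFFF
    return asserted, deasserted


if __name__ == "__main__":
    # 0x0003 is a made-up current register value.
    asserted, deasserted = bus_reset_values(0x0003)
    print("write 0x%04x, wait, then write 0x%04x to offset 0x%02x"
          % (asserted, deasserted, PCI_BRIDGE_CONTROL))
```

The reset is a two-step sequence: set the bit, hold it for a short time,
then clear it again so the devices behind the bridge come out of reset.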



* Re: vfio problem
  2012-06-09 13:42   ` Andreas Hartmann
  2012-06-09 14:30     ` Alex Williamson
@ 2012-06-09 15:32     ` Andreas Hartmann
  1 sibling, 0 replies; 11+ messages in thread
From: Andreas Hartmann @ 2012-06-09 15:32 UTC (permalink / raw)
  To: Alex Williamson; +Cc: KVM

Andreas Hartmann wrote:
> On Fri, 08 Jun 2012 11:35:07 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
[...]
>> I'm definitely curious if there's anything cumulative about the locked
>> memory problem above.  Thanks,
> 
> OK, I managed to reproduce it. Here is how, step by step.
> 
> - setting low memory (64k)
> - start VM:
>   qemu-system-x86_64: vfio_dma_map(0x7fbfcf4fd170, 0x00000000febe0000, 0x10000, 0x7fbfb57b0000) = -12 (Cannot allocate memory)
>   Jun  9 14:11:33 host kernel: [12001.026007] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
> - VM is up
> - module rt2800pci in VM is loaded fine - no errors can be seen in log.
> - but: device doesn't work (no beaconing)
> - stop hostapd
> - unload wlan stack (hardware + nl80211)
> - reload wlan stack 
> - start hostapd
>   Jun  9 14:16:17 vm kernel: [  286.088795] phy0 -> rt2x00lib_request_firmware: Info - Loading firmware file 'rt2860.bin'.
>   Jun  9 14:16:17 vm kernel: [  286.090251] phy0 -> rt2x00lib_request_firmware: Info - Firmware detected - version: 0.34.
>   Jun  9 14:16:18 vm kernel: [  287.194351] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:16:19 vm kernel: [  288.294350] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:16:19 vm kernel: [  288.294358] phy0 -> rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5).
> - shutdown VM (virsh shutdown VM)
> 
> 
> - set memory to 512M
> - start VM (no RLIMIT_MEMLOCK error)
> - VM is up
> - module rt2800pci doesn't load correctly:
>   Jun  9 14:24:27 vm kernel: [    8.544858] phy0 -> rt2x00lib_request_firmware: Info - Loading firmware file 'rt2860.bin'.
>   Jun  9 14:24:27 vm kernel: [    8.547870] phy0 -> rt2x00lib_request_firmware: Info - Firmware detected - version: 0.34.
>   Jun  9 14:24:28 vm kernel: [    9.652364] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:24:29 vm kernel: [   10.752363] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:24:29 vm kernel: [   10.752371] phy0 -> rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5).
> 
> 
> I could not clear this error except by rebooting.
> I tried with and without including the bridge in the bind procedure. I
> even tried to get it working again by loading the module on the host.
> Could it be an issue with rt2800pci?

Update: the device can also be reset by an s2ram/resume cycle.


Regards,
Andreas


* Re: vfio problem
  2012-06-09 14:30     ` Alex Williamson
@ 2012-06-09 15:58       ` Andreas Hartmann
  0 siblings, 0 replies; 11+ messages in thread
From: Andreas Hartmann @ 2012-06-09 15:58 UTC (permalink / raw)
  To: Alex Williamson; +Cc: KVM

Alex Williamson wrote:
> On Sat, 2012-06-09 at 15:42 +0200, Andreas Hartmann wrote:
>> On Fri, 08 Jun 2012 11:35:07 -0600
>> Alex Williamson <alex.williamson@redhat.com> wrote:
>>
>>> On Fri, 2012-06-08 at 18:58 +0200, Andreas Hartmann wrote:
>>>> Hello Alex,
>>>>
>>>> You can probably say, what this message on host side means:
>>>>
>>>> kernel: [ 3902.124109] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
>>>
>>> We've hit the limit of locked pages.  Are you trying to run as root or a
>>> normal user?  If the latter, you need to play with ulimits to increase
>>> the size.
>>
>> That's what I did now. What exactly is this memory needed for? I don't
>> think it is for the complete VM, because the VM without the device
>> passed through works fine without it (it comes up fine and can be
>> reached via ssh). That's why I think it's only needed for the
>> communication between the device and the guest. But why so much then?
>> I think I haven't understood it correctly yet.
> 
> For x86 device assignment, we pin all of guest memory.  This allows the
> guest to transparently use any guest physical address as a DMA target.
> If we didn't have this memory locked, a page of guest memory could be
> swapped out by the host just as the assigned device issued a DMA write
> to that page.  This would result in corrupted host memory.

Thanks, got it now!
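
A minimal sketch of the consequence, assuming the locked-memory limit has to
cover at least the whole guest RAM: compare the soft RLIMIT_MEMLOCK of the
process that will start the VM against the guest's RAM size.
memlock_shortfall is a hypothetical helper, not part of QEMU or VFIO, the
1 GiB guest size is made up, and any additional locked memory (for example
device regions) is ignored here.

```python
import resource


def memlock_shortfall(guest_ram_bytes, soft_limit):
    """Return how many bytes soft_limit falls short of guest_ram_bytes.

    Returns 0 when the limit is large enough or unlimited.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return 0
    return max(0, guest_ram_bytes - soft_limit)


if __name__ == "__main__":
    guest_ram = 1 << 30  # hypothetical 1 GiB guest
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    missing = memlock_shortfall(guest_ram, soft)
    if missing:
        print("RLIMIT_MEMLOCK is %d bytes too small for this guest" % missing)
    else:
        print("RLIMIT_MEMLOCK is large enough for this guest")
```

With the 65536-byte limit from the log and a guest of this size, almost the
entire guest RAM would be missing from the limit, which matches the failing
maps seen on the host.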

>>>> The WLAN card in the VM doesn't work any more. It came up after a few
>>>> times of restarting the VM (with unbinding / rebinding - procedures).
>>>
>>> Do I recall correctly you reporting a message about the device not
>>> supporting reset for the WLAN?
>>
>> Yes.
>>
>>> Unfortunately devices are mostly black
>>> boxes as far as VFIO is concerned, so if the device doesn't support
>>> reset and doesn't have its own device-specific reset and doesn't simply
>>> start behaving when we restore config space, there's little for vfio to
>>> do.  We do have a bit more flexibility in performing a secondary bus
>>> reset on the bridge since we own everything below the bridge.  We
>>> probably need to consider adding a group reset ioctl to take advantage
>>> of that.
>>>
>>>> I'll see if it is reproducible. I had to reboot to get it working again.
>>>
>>> I'm definitely curious if there's anything cumulative about the locked
>>> memory problem above.  Thanks,
>>
>> OK, I managed to reproduce it. Here is how, step by step.
>>
>> - setting low memory (64k)
>> - start VM:
>>   qemu-system-x86_64: vfio_dma_map(0x7fbfcf4fd170, 0x00000000febe0000, 0x10000, 0x7fbfb57b0000) = -12 (Cannot allocate memory)
>>   Jun  9 14:11:33 host kernel: [12001.026007] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
>> - VM is up
>> - module rt2800pci in VM is loaded fine - no errors can be seen in log.
>> - but: device doesn't work (no beaconing)
> 
> I'm surprised that the driver loaded, but not surprised that it doesn't
> work since it can't do any DMA.  You were probably getting errors from
> the IOMMU in host dmesg here too, right?

No. The only errors I get on the host side are the "RLIMIT_MEMLOCK
exceeded" errors (a lot of them).
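
As a small worked example, the failing QEMU line quoted above can be
decoded like this. The sketch assumes the four printed arguments are
container pointer, IOVA, size and host virtual address, in that order;
that order is a guess from the values and is not confirmed in this thread.
decode is a hypothetical helper.

```python
import errno
import re

# The QEMU message from the reproduction steps, copied verbatim.
LINE = ("qemu-system-x86_64: vfio_dma_map(0x7fbfcf4fd170, "
        "0x00000000febe0000, 0x10000, 0x7fbfb57b0000) "
        "= -12 (Cannot allocate memory)")

PATTERN = re.compile(
    r"vfio_dma_map\((0x[0-9a-f]+), (0x[0-9a-f]+), "
    r"(0x[0-9a-f]+), (0x[0-9a-f]+)\) = (-?\d+)")


def decode(line):
    """Extract IOVA, size and symbolic errno from a vfio_dma_map message."""
    match = PATTERN.search(line)
    if match is None:
        return None
    # Assumed argument order: container, iova, size, vaddr.
    _container, iova, size, _vaddr = (int(g, 16) for g in match.groups()[:4])
    ret = int(match.group(5))
    return {"iova": iova, "size": size, "errno": errno.errorcode.get(-ret)}


if __name__ == "__main__":
    print(decode(LINE))
```

Under that assumption the request is for 0x10000 = 65536 bytes and fails
with ENOMEM. The request alone is as large as the whole 65536-byte limit,
so it would presumably fail as soon as any other memory is already counted
as locked.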

[...]

>> I could not clear this error except by rebooting.
>> I tried with and without including the bridge in the bind procedure. I
>> even tried to get it working again by loading the module on the host.
>> Could it be an issue with rt2800pci?
> 
> Quite possibly.  Since the device doesn't have a reset at the PCI level,
> it's probably getting left in a weird state, perhaps still attempting to
> do DMA from the first guest boot.  If rt2800pci isn't robust enough to
> pull the device out of this mode, there's not much to do except some
> kind of hard reset, like rebooting the host.  We need to figure out how
> we can take advantage of this device being behind a PCI-to-PCI bridge,
> possibly by issuing a secondary bus reset on that bridge, which could
> get the device back to a known state.

Hm, I hoped binding pci-stub to the bridge would already do that. Maybe
it does it already, but it has no effect on this device?

Nevertheless, I'm willing to do tests if you have some code.


Thanks,
regards,
Andreas


Thread overview: 11+ messages
2012-06-08 16:58 vfio problem Andreas Hartmann
2012-06-08 17:35 ` Alex Williamson
2012-06-09 13:42   ` Andreas Hartmann
2012-06-09 14:30     ` Alex Williamson
2012-06-09 15:58       ` Andreas Hartmann
2012-06-09 15:32     ` Andreas Hartmann
2012-06-08 17:39 ` Andreas Hartmann
2012-06-08 18:17   ` Alex Williamson
2012-06-08 19:43     ` Andreas Hartmann
2012-06-09  0:00     ` Andreas Hartmann
2012-06-09 14:09       ` Alex Williamson
