From mboxrd@z Thu Jan 1 00:00:00 1970
From: Steve Ofsthun
Subject: Re: Paravirtualised drivers for fully virtualised domains
Date: Tue, 18 Jul 2006 19:24:52 -0400
Message-ID: <44BD6DC4.3020304@virtualiron.com>
References: <20060718125106.GA4727@cam.ac.uk> <44BD05B5.6050108@virtualiron.com> <20060718203422.GA7497@cam.ac.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <20060718203422.GA7497@cam.ac.uk>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Steven Smith
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Steven Smith wrote:
>>Have you built the guest environment on anything other than a 2.6.16
>>version of Linux?  We ran into extra work supporting older linux versions.
>
> #ifdef soup will get you back to about 2.6.12-ish without too many
> problems.  These patches don't include that, since it would complicate
> merging.

I was thinking about SLES9 (2.6.5), RHEL4 (2.6.9), RHEL3 (2.4.21).

>>You did some work to make xenbus a loadable module in the guest domains.
>>Can this be used to make xenbus loadable in Domain 0?
>
> I can't see any immediate reason why not, but it's not clear to me why
> that would be useful.

It just makes it easier to insert alternate bus implementations.

>>Domain 0 buffer cache coherency issues can cause catastrophic file
>>system corruption.  This is due to the backend accessing the backing
>>device directly, and QEMU accessing the device through buffered
>>reads and writes.  We are working on a patch to convert QEMU to use
>>O_DIRECT whenever possible.  This solves the cache coherency issue.
>
> I wasn't aware of these issues.  I was much more worried about domU
> trying to cache the devices twice, and those caches getting out of
> sync.
> It's pretty much the usual problem of configuring a device into
> two domains and then having them trip over each other.  Do you have a
> plan for dealing with this?

We eliminate any buffer cache use in domain 0 for backing store objects.
This prevents double caching and reduces domain 0's memory footprint.
We don't restrict multiple domain access to the same "raw" backing object.
Real hardware allows this (at least for SCSI/FC).  This may be necessary
for shared storage clustering.

>>Actually presenting two copies of the same device to linux can cause
>>its own problems.  Mounting using LABEL= will complain about duplicate
>>labels.  However, using the device names directly seems to work.  With
>>this approach it is possible to decide in the guest whether to mount
>>a device as an emulated disk or a PV disk.
>
> My plan here was to just not support VMs which mix paravirtualised and
> ioemulated devices, requiring the user to load the PV drivers from an
> initrd.  Of course, you have to load the initrd somehow, but the
> bootloader should only be reading the disk, which makes the coherency
> issues much easier.  As a last resort, rombios could learn about the
> PV devices, but I'd rather avoid that if possible.
>
> Your way would be preferable, though, if it works.

We currently only allow this for the boot device (mainly to avoid the
rombios work you mention).  In addition, we make the qemu device only
visible to the rombios (and not the guest O/S) by controlling the IDE
probe logic in qemu.

>>>-- Adding a new device to the qemu PCI bus which is used for
>>>   bootstrapping the devices and getting an IRQ.
>>
>>Have you thought about supporting more than one IRQ.  We are experimenting
>>with an IRQ per device class (BUS, NIC, VBD).
>
> I considered it, but it wasn't obvious that there would be much
> benefit.  You can potentially scan a smaller part of the pending event
> channel mask, but that's fairly quick already.

The main benefit we see is for legacy Linux variants that restrict each
IRQ to a single CPU.  Allowing additional IRQs increases the possible
interrupt processing concurrency.  In addition, one interrupt class can't
starve another (on SMP guests).

Steve
-- 
Steve Ofsthun - Virtual Iron Software, Inc.
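
To make the IRQ-per-device-class idea concrete, here is a minimal sketch in C.
It is an illustration only, not the Virtual Iron or Xen code: the names
(bind_port, raise_port, dispatch_class, class_mask, NR_PORTS), the single
64-bit pending word, and the port-to-class assignment are all assumptions
made for the example.  Each class IRQ handler consumes only the pending bits
that belong to its own class, so a burst of NIC events leaves VBD events
untouched for whichever CPU services the VBD IRQ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device classes, one IRQ each (the BUS, NIC, VBD split). */
enum dev_class { CLASS_BUS, CLASS_NIC, CLASS_VBD, NR_CLASSES };

#define NR_PORTS 64  /* assumed size of the pending bitmap in this sketch */

/* Assumed layout: one 64-bit pending word plus one mask per class. */
static uint64_t pending;
static uint64_t class_mask[NR_CLASSES];

/* Assign an event port to a device class (hypothetical helper). */
static void bind_port(unsigned port, enum dev_class c)
{
    assert(port < NR_PORTS);
    class_mask[c] |= UINT64_C(1) << port;
}

/* Mark a port pending, standing in for the event sender. */
static void raise_port(unsigned port)
{
    assert(port < NR_PORTS);
    pending |= UINT64_C(1) << port;
}

/*
 * Body of one class IRQ handler: take only the pending bits that belong
 * to class c and leave the rest for the other class IRQs.  Returns the
 * number of ports handled.  A real handler would need atomic bit
 * operations; this sketch is single-threaded.
 */
static int dispatch_class(enum dev_class c)
{
    uint64_t todo = pending & class_mask[c];
    int handled = 0;

    pending &= ~todo;  /* clear only this class's bits */
    for (unsigned port = 0; port < NR_PORTS; port++) {
        if (todo & (UINT64_C(1) << port)) {
            /* the per-port handler would be called here */
            handled++;
        }
    }
    return handled;
}
```

With ports bound to NIC and VBD and one event raised on each, calling
dispatch_class(CLASS_NIC) handles the NIC port and leaves the VBD bit set
in pending until dispatch_class(CLASS_VBD) runs.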