From mboxrd@z Thu Jan  1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH 1/10] Trivial: /dev/kvm interface is no longer experimental.
Date: Wed, 18 Jul 2007 12:31:23 +0300
Message-ID: <469DDDEB.9070009@qumranet.com>
References: <1184677946.10380.4.camel@localhost.localdomain> <200707171835.53092.arnd@arndb.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
To: Arnd Bergmann
In-Reply-To: <200707171835.53092.arnd-r2nGTMty4D4@public.gmane.org>
Sender: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Errors-To: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
List-Id: kvm.vger.kernel.org

Arnd Bergmann wrote:
> The equivalent of the current set of ioctls comes down to roughly this
> set of syscalls:
>
> int kvm_create_vm(void);
> int kvm_get_msr_index_list(int fd, struct kvm_msr_list);
> size_t kvm_get_vcpu_mmap_size(int fd);
> int kvm_set_memory_region(int fd, unsigned slot, unsigned flags,
> 			__u64 guest_phys_addr, __u64 size);
> int kvm_create_vcpu(int fd);
> int kvm_get_dirty_log(int fd, unsigned slot, void *dirty_bitmap);
> int kvm_get_memory_alias(int fd, struct kvm_memory_region *region);
> int kvm_run(int fd);
> int kvm_get_regs(int fd, struct kvm_regs *regs);
> int kvm_set_regs(int fd, const struct kvm_regs *regs);
> int kvm_get_sregs(int fd, struct kvm_sregs *sregs);
> int kvm_set_sregs(int fd, const struct kvm_sregs *sregs);
> int kvm_translate(int fd, __u64 linear_address, __u64 *physical_address,
> 			__u8 *valid, __u8 *writeable, __u8 *usermode);
> int kvm_interrupt(int fd, __u32 irq);
> int kvm_debug_guest(...);
> int kvm_get_msrs(...);
> int kvm_set_msrs(...);
> int kvm_set_cpuid(...);
> int kvm_set_signal_mask(...);
> int kvm_get_fpu(...);
> int kvm_set_fpu(...);
>
> That's a lot of system calls!
> The only ioctl calls that can immediately
> go away are KVM_GET_API_VERSION and KVM_CHECK_EXTENSION, if the API
> is fixed.

Many can be merged (set_fpu, set_regs, set_sregs).

We will need CHECK_EXTENSION as long as we are unable to predict the
future (well, for syscalls that are intended to be invoked on
initialization only we can use ENOSYS).

> Before moving to the final syscall interface, I'd wait for the memory
> model to have moved to using a region of the host process address instead
> of a separate address range. As I understood Carsten, that's what was
> discussed in Ottawa anyway. It can probably remove a few of the
> existing ioctl calls.

We will also wait until we get most ports working, so we get a chance to
test it in real life.

> Some more can be saved by changing the interface for the kvm_get/set_*
> calls. One way I can see this done at the syscall level is to replace
> kvm_get_regs() with
>
> 	regsfd = openat(vcpu, "regs", O_RDWR);
> 	(void)read(regsfd, &regs, sizeof (regs));
>
> Some of the others can also be done with files like this, e.g. writing
> to an "interrupt" file. Of course, you would keep the file descriptors
> open for the lifetime of the guest.

Once we're back to using fds, we might as well use ioctls.  If anything,
an ioctl has an explicit mention of the structure it manipulates in its
definition.  I don't care that it came in last in the last 15 annual
kernel beauty contests.

For me, the difference between syscalls and ioctls is whether the vcpu
is bound to a task (and the vm bound to the mm) rather than whether the
names are spelled in lowercase or uppercase.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
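
To illustrate two of the points above (an ioctl number carries the size of
the structure it manipulates, and CHECK_EXTENSION is how userspace probes for
features it cannot predict), here is a minimal sketch in C. It assumes a
Linux system with the <linux/kvm.h> header from a current kernel; some of the
constants it can be called with (for example KVM_CAP_USER_MEMORY) may
postdate this thread. The helper names ioctl_arg_size() and
kvm_probe_extension() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: return the argument size that the _IOR/_IOW/_IOWR
 * macros encoded into an ioctl request number. For KVM_GET_REGS this is
 * sizeof(struct kvm_regs); for an argument-less _IO() request it is 0.
 */
unsigned int ioctl_arg_size(unsigned long request)
{
	return _IOC_SIZE(request);
}

/*
 * Hypothetical helper: probe an optional KVM feature.
 * Returns a positive value if the extension is present, 0 if it is
 * absent, and -1 if /dev/kvm cannot be opened or the ioctl fails
 * (no KVM support, no permission, ...).
 */
int kvm_probe_extension(int extension)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int ret;

	if (kvm < 0)
		return -1;

	ret = ioctl(kvm, KVM_CHECK_EXTENSION, extension);
	close(kvm);
	return ret;
}
```

A caller could compare ioctl_arg_size(KVM_GET_REGS) against
sizeof(struct kvm_regs) to see the structure recorded in the request number,
and treat a -1 from kvm_probe_extension() as "KVM not usable here" rather
than "feature missing".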