From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH] ioeventfd: Introduce KVM_IOEVENTFD_FLAG_PIPE
Date: Mon, 04 Jul 2011 14:19:39 +0300
Message-ID: <4E11A1CB.2080709@redhat.com>
References: <1309712689-4290-1-git-send-email-levinsasha928@gmail.com>
 <20110704103207.GA11386@redhat.com>
 <4E1199B3.2010507@redhat.com>
 <20110704110723.GD11386@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Sasha Levin, kvm@vger.kernel.org, Ingo Molnar, Marcelo Tosatti, Pekka Enberg
To: "Michael S. Tsirkin"
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:35060 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754905Ab1GDLT7 (ORCPT ); Mon, 4 Jul 2011 07:19:59 -0400
In-Reply-To: <20110704110723.GD11386@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 07/04/2011 02:07 PM, Michael S. Tsirkin wrote:
> On Mon, Jul 04, 2011 at 01:45:07PM +0300, Avi Kivity wrote:
> > On 07/04/2011 01:32 PM, Michael S. Tsirkin wrote:
> > >On Sun, Jul 03, 2011 at 08:04:49PM +0300, Sasha Levin wrote:
> > >> The new flag allows passing a write side of a pipe instead of an
> > >> eventfd to be notified of writes to the specified memory region.
> > >>
> > >> Instead of signaling an event, the value written to the memory region
> > >> is written to the pipe.
> > >>
> > >> Using a pipe instead of an eventfd is usefull when any value can be
> > >> written to the memory region but we're interested in recieving the
> > >> actual value instead of just a notification.
> > >>
> > >> A simple example for practical use is the serial port. we are not
> > >> interested in an exit every time a char is written to the port, but
> > >> we do need to know what was written so we could handle it on the guest.
> > >
> > >Looking at this example, how would you handle a pipe full condition?
> > >We can't buffer unlimited amount of data in the host.
> >
> > Stall.
>
> Right, but the guest gets no indication that the pipe is full.
> Something like virtio would let the guest do something useful
> instead of stalling the vcpu.

That's not a problem.  The vcpu blocks, which lets the other process get
the cpu and run with it.  If there are not enough cpu resources, we'll
indeed stall the vcpu, but that happens whenever you're overcommitted
anyway.

> Also noting that the fd can be set not to block, or that
> a signal can interrupt the write. Both cases are not errors.

One thing we can do is return via the normal KVM_EXIT_MMIO method and
hope userspace knows how to handle this.  Otherwise I don't see what we
can do.

> > >
> > >If pipe is non-blocking, or if we get a signal,
> > >this might fail or return a value< len.
> > >Data will be lost then, won't it?
> >
> > Yes.  Need a loop-until-buffer-exhausted-or-error.
>
> Signal handling becomes a problem. You don't want a
> full pipe to prevent qemu from getting killed or
> getting a timer alert.

Maybe we should require AF_UNIX SOCK_SEQPACKET connection.  That gives
us atomicity, and drops the need for a mutex.

> >
> > We should allow unix domain sockets as well.  In fact, for
> > read/write support, we need this to be a unix domain socket.
>
> Sockets are actually better at this than pipes
> as you can at least make the writes
> non-blocking by passing in a message flag.

I'm not sure we want that.  How do we handle it?  If the socket buffers
get filled up, it's time for the vcpu to wait for the mmio server
process.  Let the scheduler sort things out.

btw, like vhost-net and other thread offloads, this sort of trick is
dangerous.  When you have excess cpu resources throughput improves, but
once the system is loaded, the workload is needlessly spread across more
cores than strictly necessary and communication is done by context
switches instead of user/system transitions.

> If we support sockets, do we really need to support
> pipes at all

I think not.
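
To make the SOCK_SEQPACKET suggestion a bit more concrete, here is a
rough userspace sketch of the idea (the real thing would live in the
kernel and look different).  Each guest write goes out as one
fixed-size record in a single send(), so a record either arrives whole
or not at all, and no lock is needed to keep records from interleaving.
The struct layout and the mmio_* names are made up for illustration;
none of this is an existing KVM interface.  MSG_DONTWAIT is the
per-message non-blocking flag mentioned above; dropping it gives the
"just stall the vcpu" behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical wire format: one record per guest write (an assumption,
 * not something defined by the patch under discussion). */
struct mmio_record {
    uint64_t addr;     /* guest address that was written */
    uint32_t len;      /* access size in bytes, 1..8 */
    uint8_t  data[8];  /* value written, first len bytes valid */
};

/* Create a connected AF_UNIX SOCK_SEQPACKET pair: sv[0] plays the
 * writer side, sv[1] the mmio server side. */
static int mmio_channel_create(int sv[2])
{
    return socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv);
}

/* Send one write as a single record.
 * Returns 0 on success, -EAGAIN/-EWOULDBLOCK if the socket buffer is
 * full, or another negative errno value on failure. */
static int mmio_send(int fd, uint64_t addr, const void *data, uint32_t len)
{
    struct mmio_record rec;
    ssize_t n;

    assert(data != NULL);
    if (len == 0 || len > sizeof(rec.data))
        return -EINVAL;

    memset(&rec, 0, sizeof(rec));
    rec.addr = addr;
    rec.len = len;
    memcpy(rec.data, data, len);

    /* One send() per record; retry only if a signal interrupted us. */
    do {
        n = send(fd, &rec, sizeof(rec), MSG_DONTWAIT | MSG_NOSIGNAL);
    } while (n < 0 && errno == EINTR);

    if (n < 0)
        return -errno;
    return n == (ssize_t)sizeof(rec) ? 0 : -EIO;
}

/* Receive exactly one record; record boundaries are preserved, so a
 * single recv() never returns parts of two different writes. */
static int mmio_recv(int fd, struct mmio_record *rec)
{
    ssize_t n;

    assert(rec != NULL);
    do {
        n = recv(fd, rec, sizeof(*rec), 0);
    } while (n < 0 && errno == EINTR);

    if (n < 0)
        return -errno;
    return n == (ssize_t)sizeof(*rec) ? 0 : -EIO;
}
```

The caller would create the pair with mmio_channel_create(), hand
sv[1] to the server process, and call mmio_send() on sv[0] for each
write.  An -EAGAIN return is the point where one has to choose between
blocking and falling back to a normal exit to userspace.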

--
error compiling committee.c: too many arguments to function