Message-ID: <4BB11A48.4060805@linux.vnet.ibm.com>
Date: Mon, 29 Mar 2010 16:23:20 -0500
From: Anthony Liguori
To: Avi Kivity
Cc: ericvh@gmail.com, jvrao, "Aneesh Kumar K.V", qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH -V3 09/32] virtio-9p: Implement P9_TWRITE/
 Thread model in QEMU
In-Reply-To: <4BB1136B.6050506@redhat.com>

On 03/29/2010 03:54 PM, Avi Kivity wrote:
> On 03/29/2010 11:42 PM, Anthony Liguori wrote:
>>>> For individual device models or host services, I think (3) is
>>>> probably the worst model overall.  I personally think that (1) is
>>>> better in the long run, but ultimately it would need an existence
>>>> proof to compare against (2).  (2) looks appealing until you
>>>> actually try to have the device handle multiple requests at a time.
>>>
>>> Sooner or later nature and the ever more complicated code will
>>> force us towards (3).  As an example, we've observed live migration
>>> throttling vcpus when sending a large guest's zeroed memory over;
>>> the bandwidth control doesn't kick in since zero pages are
>>> compressed, so the iothread spends large amounts of time reading
>>> memory.
>>
>> Making things re-entrant is different than (3) in my mind.
>>
>> There's no reason that VCPU threads should run in lock-step with
>> live migration during the live phase.  Making device models
>> re-entrant and making live migration not depend on the big global
>> lock is a good thing to do.
>
> It's not sufficient.  If you have a single thread that runs both live
> migration and timers, then timers will be backlogged behind live
> migration, or you'll have to yield often.  This is regardless of the
> locking model (and of course having threads without fixing the
> locking is insufficient as well; live migration accesses guest
> memory, so it needs the big qemu lock).

But what's the solution?  Running every timer in a separate thread?
We'll hit the same problem if we impose an arbitrary limit on the
number of threads.

>> What I'm skeptical of is whether converting virtio-9p or qcow2 to
>> handle each request in a separate thread is really going to improve
>> things.
>
> Currently qcow2 isn't even fully asynchronous, so it can't fail to
> improve things.

Unless it introduces more data corruption, which is my concern with
any significant change to qcow2.

>> The VNC server is another area that I think multithreading would be
>> a bad idea.
>
> If the vnc server is stuffing a few megabytes of screen into a
> socket, then timers will be delayed behind it, unless you litter the
> code with calls to bottom halves.  Even worse if it does complicated
> compression and encryption.

Sticking the VNC server in its own thread would be fine.  Trying to
make the VNC server itself multithreaded, though, would be
problematic.  Basically, sticking each isolated component in a single
thread should be pretty reasonable.

>>> But if those system calls are blocking, you need a thread?
>>
>> You can dispatch just the system call to a thread pool.  The
>> advantage of doing that is that you don't need to worry about
>> locking since the system calls are not (usually) handling shared
>> state.
>
> There is always implied shared state.  If you're doing direct guest
> memory access, you need to lock memory against hot unplug, or the
> syscall will end up writing into freed memory.  If the device can be
> hot unplugged, you need to make sure all threads have returned before
> unplugging it.

There are other ways to handle hot unplug (like reference counting)
that avoid this problem.
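Something along these lines is what I have in mind -- just a sketch
with made-up names, not code from our tree.  Each in-flight request
holds a reference; unplug flips a flag and waits for the count to
drain:

/* Hypothetical sketch: refcounted hot unplug for a device whose
 * requests are serviced by worker threads. */
#include <pthread.h>
#include <stdbool.h>

typedef struct Device {
    pthread_mutex_t lock;
    pthread_cond_t  drained;   /* signalled when refcount hits zero */
    int  refcount;             /* one reference per in-flight request */
    bool removing;             /* set once unplug has been requested */
} Device;

/* A worker takes a reference before touching guest memory... */
static bool device_ref(Device *d)
{
    pthread_mutex_lock(&d->lock);
    bool ok = !d->removing;
    if (ok) {
        d->refcount++;
    }
    pthread_mutex_unlock(&d->lock);
    return ok;
}

/* ...and drops it once its blocking syscall has completed. */
static void device_unref(Device *d)
{
    pthread_mutex_lock(&d->lock);
    if (--d->refcount == 0) {
        pthread_cond_broadcast(&d->drained);
    }
    pthread_mutex_unlock(&d->lock);
}

/* Unplug refuses new references and waits for in-flight ones to
 * drain, so nothing ever writes into freed guest memory. */
static void device_unplug(Device *d)
{
    pthread_mutex_lock(&d->lock);
    d->removing = true;
    while (d->refcount > 0) {
        pthread_cond_wait(&d->drained, &d->lock);
    }
    pthread_mutex_unlock(&d->lock);
    /* now it's safe to tear the device down */
}

The unplug path just stops handing out new references and waits for
the last request to finish; there's no need to track which threads
happen to be inside the device.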
Ultimately, this comes down to a question of lock granularity and
thread granularity.  I don't think it's a good idea to start with the
assumption that we want extremely fine granularity.  There's certainly
very low-hanging fruit with respect to threading.

>>> On a philosophical note, threads may make it easier to model
>>> complex hardware that includes a processor, for example our scsi
>>> card (and how about using tcg as a jit to boost it :)
>>
>> Yeah, it's hard to argue that script evaluation shouldn't be done in
>> a thread.  But that doesn't prevent me from being very cautious
>> about how and where we use threading :-)
>
> Caution where threads are involved is a good thing.  They are
> inevitable, however, IMO.

We are already using threads, so they aren't just inevitable; they're
reality.  I still don't think using threads would significantly
simplify virtio-9p.

Regards,

Anthony Liguori