qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Avi Kivity <avi@redhat.com>
To: Anthony Liguori <aliguori@linux.vnet.ibm.com>
Cc: ericvh@gmail.com, jvrao <jvrao@linux.vnet.ibm.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH -V3 09/32] virtio-9p: Implement P9_TWRITE/ Thread model in QEMU
Date: Tue, 30 Mar 2010 16:28:35 +0300	[thread overview]
Message-ID: <4BB1FC83.6000202@redhat.com> (raw)
In-Reply-To: <4BB1F915.6020800@linux.vnet.ibm.com>

On 03/30/2010 04:13 PM, Anthony Liguori wrote:
> On 03/30/2010 05:24 AM, Avi Kivity wrote:
>> On 03/30/2010 12:23 AM, Anthony Liguori wrote:
>>>> It's not sufficient.  If you have a single thread that runs both 
>>>> live migrations and timers, then timers will be backlogged behind 
>>>> live migration, or you'll have to yield often.  This is regardless 
>>>> of the locking model (and of course having threads without fixing 
>>>> the locking is insufficient as well, live migration accesses guest 
>>>> memory so it needs the big qemu lock).
>>>
>>>
>>> But what's the solution?  Sending every timer in a separate thread?  
>>> We'll hit the same problem if we implement an arbitrary limit to 
>>> number of threads.
>>
>> A completion that's expected to take a couple of microseconds at most 
>> can live in the iothread.  A completion that's expected to take a 
>> couple of milliseconds wants its own thread.  We'll have to think 
>> about anything in between.
>>
>> vnc and migration can perform large amounts of work in a single 
>> completion; they're limited only by the socket send rate and our 
>> internal rate-limiting which are both outside our control.  Most 
>> device timers are O(1).  virtio completions probably fall into the 
>> annoying "have to think about it" department.
>
> I think it may make more sense to have vcpu completions vs. io thread 
> completions and make vcpu completions target short lived operations.

vcpu completions make sense when you can tell that a completion will 
cause an interrupt injection and you have a good idea which cpu will be 
interrupted.

>
>>>>> What I'm skeptical of, is whether converting virtio-9p or qcow2 to 
>>>>> handle each request in a separate thread is really going to 
>>>>> improve things. 
>>>>
>>>> Currently qcow2 isn't even fullly asynchronous, so it can't fail to 
>>>> improve things.
>>>
>>> Unless it introduces more data corruptions which is my concern with 
>>> any significant change to qcow2.
>>
>> It's possible to move qcow2 to a thread without any significant 
>> change to it (simply run the current code in its own thread, 
>> protected by a mutex).  Further changes would be very incremental.
>
> But that offers no advantage to what we have which fails the 
> proof-by-example that threading makes the situation better. 

It has an advantage, qcow2 is currently synchronous in parts:

block/qcow2-cluster.c:    ret = bdrv_write(s->hd, (cluster_offset >> 9) 
+ n_start,
block/qcow2.c:        bdrv_write(s->hd, (meta.cluster_offset >> 9) + num 
- 1, buf, 1);
block/qcow2.c:        bdrv_write(bs, sector_num, buf, s->cluster_sectors);
block/qcow2-cluster.c:                    ret = 
bdrv_read(bs->backing_hd, sector_num, buf, n1);
block/qcow2-cluster.c:        ret = bdrv_read(s->hd, coffset >> 9, 
s->cluster_data, nb_csectors);

> To convert qcow2 to be threaded, I think you would have to wrap the 
> whole thing in a lock, then convert the current asynchronous functions 
> to synchronous (that's the whole point, right).  At this point, you've 
> regressed performance because you can only handle one read/write 
> outstanding at a given time.  So now you have to make the locking more 
> granular but because we do layered block devices, you've got to make 
> most of the core block driver functions thread safe.

Not at all.  The first conversion will be to keep the current code as 
is, operating asynchronously, but running in its own thread.  It will 
still support multiple outstanding requests using the current state 
machine code; the synchronous parts will be remain synchronous relative 
to the block device, but async relative to everything else.  The second 
stage will convert the state machine code to threaded code.  This is 
more difficult but not overly so - turn every dependency list into a mutex.

> Once you get basic data operations concurrent, which I expect won't be 
> so bad, to get an improvement over the current code, you have to allow 
> simultaneous access to metadata which is where I think the vast 
> majority of the complexity will come from.

I have no plans to do that, all I want is qcow2 not to block vcpus.  
btw, I don't think it's all that complicated, it's simple to lock 
individual L2 blocks and the L1 block.

> You could argue that we stick qcow2 into a thread and stop there and 
> that fixes the problems with synchronous data access.  If that's the 
> argument, then let's not even bother doing at the qcow layer, let's 
> just switch the block aio emulation to use a dedicated thread.

That's certainly the plan for vmdk and friends which are today useless.  
qcow2 deserves better treatment.

>>> Sticking the VNC server in it's own thread would be fine.  Trying to 
>>> make the VNC server multithreaded though would be problematic.
>>
>> Why would it be problematic?  Each client gets its own threads, they 
>> don't interact at all do they?
>
> Dealing with locking of the core display which each client uses for 
> rendering.  Things like CopyRect will get ugly quickly.Ultimately, 
> this comes down to a question of lock granularity and thread 
> granularity.  I don't think it's a good idea to start with the 
> assumption that we want extremely fine granularity.  There's certainly 
> very low hanging fruit with respect to threading.

Not familiar with the code, but doesn't vnc access the display core 
through an API?  Slap a lot onto that.

>>
>> I meant, exposing qemu core to the threads instead of pretending they 
>> aren't there.  I'm not familiar with 9p so don't hold much of an 
>> opinion, but didn't you say you need threads in order to handle async 
>> syscalls?  That may not be the deep threading we're discussing here.
>>
>> btw, IIUC currently disk hotunplug will stall a guest, no?  We need 
>> async aio_flush().
>
> But aio_flush() never takes a very long time, right :-)
>
> We had this discussion in the past re: live migration because we do an 
> aio_flush() in the critical stage.
>

Live migration will stall a guest anyway.  It doesn't matter if 
aio_flush blocks for a few ms, since the final stage will dominate it.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

  reply	other threads:[~2010-03-30 13:28 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-25 16:43 [Qemu-devel] [PATCH -V3 00/32] virtio-9p: paravirtual file system passthrough Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 01/32] vitio-9p: Add a virtio 9p device to qemu Aneesh Kumar K.V
2010-03-25 21:04   ` Anthony Liguori
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 02/32] vrtio-9p: Implement P9_TVERSION for 9P Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 03/32] virtio-9p: Implement P9_TATTACH Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 04/32] virtio-9p: Implement P9_TSTAT Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 05/32] virtio-9p: Implement P9_TWALK Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 06/32] virtio-9p: Implement P9_TOPEN Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 07/32] virtio-9p: Implement P9_TREAD Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 08/32] virtio-9p: Implement P9_TCLUNK Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 09/32] virtio-9p: Implement P9_TWRITE Aneesh Kumar K.V
2010-03-29  6:36   ` [Qemu-devel] [PATCH -V3 09/32] virtio-9p: Implement P9_TWRITE/ Thread model in QEMU jvrao
2010-03-29 15:00     ` Anthony Liguori
2010-03-29 20:31       ` Avi Kivity
2010-03-29 20:42         ` Anthony Liguori
2010-03-29 20:54           ` Avi Kivity
2010-03-29 21:23             ` Anthony Liguori
2010-03-30 10:24               ` Avi Kivity
2010-03-30 13:13                 ` Anthony Liguori
2010-03-30 13:28                   ` Avi Kivity [this message]
2010-03-30 13:54                     ` Anthony Liguori
2010-03-30 14:03                       ` Avi Kivity
2010-03-30 14:07                         ` Anthony Liguori
2010-03-30 14:23                           ` Avi Kivity
2010-03-30 14:59                             ` Anthony Liguori
2010-03-29 21:17           ` jvrao
2010-03-30 10:28             ` Avi Kivity
2010-04-01 13:14           ` Paul Brook
2010-04-01 14:34             ` Avi Kivity
2010-04-01 15:30               ` Paul Brook
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 10/32] virtio-9p: Implement P9_TCREATE Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 11/32] virtio-9p: Implement P9_TWSTAT Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 12/32] virtio-9p: Implement P9_TREMOVE Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 13/32] virtio-9p: Implement P9_TFLUSH Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 14/32] virtio-9p: Add multiple mount point support Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 15/32] virtio-9p: Use little endian format on virtio Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 16/32] virtio-9p: Add support for hardlink Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 17/32] Implement sync support in 9p server Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 18/32] virtio-9p: Fix sg usage in the code Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 19/32] virtio-9p: Get the correct count values from the pdu Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 20/32] virtio-9p: Remove BUG_ON and add proper error handling Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 21/32] virtio-9p: Remove unnecessary definition of fid Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 22/32] virtio-9p: Update existing fid path on rename Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 23/32] vritio-9p: Fix chmod bug with directory Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 24/32] qemu-malloc: Add qemu_asprintf Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 25/32] virtio-9p: Move V9fs File system specific options to a separate header file Aneesh Kumar K.V
2010-03-29  0:52   ` jvrao
2010-03-29  1:09   ` jvrao
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 26/32] virtio-9p: Create a commandline option -fsdev Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 27/32] virtio-9p: Create qemu_fsdev_opts Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 28/32] virtio-9p: Handle the fsdev command line options Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 29/32] virtio-9p: Decouple share_path details from virtio-9p-dev Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 30/32] virtio-9p: Create a syntactic shortcut for the file-system pass-thru Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 31/32] virtio-9p: Return proper errors from create paths Aneesh Kumar K.V
2010-03-25 16:43 ` [Qemu-devel] [PATCH -V3 32/32] virtio-9p: Handle unknown 9P protocol versions as per the standards Aneesh Kumar K.V

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BB1FC83.6000202@redhat.com \
    --to=avi@redhat.com \
    --cc=aliguori@linux.vnet.ibm.com \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=ericvh@gmail.com \
    --cc=jvrao@linux.vnet.ibm.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).