* [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
@ 2008-11-06 16:55 Laurent Vivier
2008-11-06 18:53 ` Anthony Liguori
0 siblings, 1 reply; 12+ messages in thread
From: Laurent Vivier @ 2008-11-06 16:55 UTC (permalink / raw)
To: qemu-devel@nongnu.org; +Cc: Laurent Vivier
This series of patches improves qcow2 performance with O_DIRECT
when the qcow2 file is empty and we begin to fill it.
For instance, an OS install took 12 min 56 sec without these patches,
and 8 min 37 sec with it (this has been measured with KVM).
To do that, they improve the updating of the clusters refcounts.
[PATCH 1/4] qcow2: Clean-up update_cluster_refcount().
[PATCH 2/4] qcow2: Allow update_cluster_refcount() to update several
clusters refcount.
[PATCH 3/4] qcow2: Align I/O access to l2 table and refcount block.
[PATCH 4/4] qcow2: detect if no disk cache
Laurent
--
------------------ Laurent.Vivier@bull.net ------------------
"Tout ce qui est impossible reste à accomplir" Jules Verne
"Things are only impossible until they're not" Jean-Luc Picard
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
2008-11-06 16:55 [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update Laurent Vivier
@ 2008-11-06 18:53 ` Anthony Liguori
2008-11-07 8:31 ` Kevin Wolf
2008-11-07 8:44 ` Laurent Vivier
0 siblings, 2 replies; 12+ messages in thread
From: Anthony Liguori @ 2008-11-06 18:53 UTC (permalink / raw)
To: qemu-devel; +Cc: Laurent Vivier
Laurent Vivier wrote:
> This series of patches improves qcow2 performance with O_DIRECT
> when the qcow2 file is empty and we begin to fill it.
>
I really dislike this series and any future series that does similar
things. The real problem is that non-aligned accesses are so slow, but
there's no reason that they must be so slow. It's because we're doing
synchronous IO operations instead of using posix-aio like we should.
That's the real problem. Just doing a memory copy to an aligned buffer
is not going to cause that much performance delay.
We shouldn't be adding more cruft to the code base to avoid fixing the
real problem. I think the current implementation of O_DIRECT needs to
be rewritten.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
2008-11-06 18:53 ` Anthony Liguori
@ 2008-11-07 8:31 ` Kevin Wolf
2008-11-07 8:44 ` Laurent Vivier
1 sibling, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2008-11-07 8:31 UTC (permalink / raw)
To: qemu-devel; +Cc: Laurent Vivier
Anthony Liguori schrieb:
> Laurent Vivier wrote:
>> This series of patches improves qcow2 performance with O_DIRECT
>> when the qcow2 file is empty and we begin to fill it.
>>
>
> I really dislike this series and any future series that does similar
> things. The real problem is that non-aligned accesses are so slow, but
> there's no reason that they must be so slow. It's because we're doing
> synchronous IO operations instead of using posix-aio like we should.
> That's the real problem. Just doing a memory copy to an aligned buffer
> is not going to cause that much performance delay.
Even if you don't like the alignment part, patches 1 and 2 are still
useful, IMHO.
Kevin
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
2008-11-06 18:53 ` Anthony Liguori
2008-11-07 8:31 ` Kevin Wolf
@ 2008-11-07 8:44 ` Laurent Vivier
2008-11-07 13:59 ` Anthony Liguori
1 sibling, 1 reply; 12+ messages in thread
From: Laurent Vivier @ 2008-11-07 8:44 UTC (permalink / raw)
To: qemu-devel
Le jeudi 06 novembre 2008 à 12:53 -0600, Anthony Liguori a écrit :
> Laurent Vivier wrote:
> > This series of patches improves qcow2 performance with O_DIRECT
> > when the qcow2 file is empty and we begin to fill it.
> >
>
> I really dislike this series and any future series that does similar
> things. The real problem is that non-aligned accesses are so slow, but
> there's no reason that they must be so slow. It's because we're doing
> synchronous IO operations instead of using posix-aio like we should.
> That's the real problem. Just doing a memory copy to an aligned buffer
> is not going to cause that much performance delay.
>
> We shouldn't be adding more cruft to the code base to avoid fixing the
> real problem. I think the current implementation of O_DIRECT needs to
> be rewritten.
Like any good maintainer, you are becoming a tyrant...
BTW, patch #1 is a clean-up (no change, only code move) and patch #2
doesn't align buffer, but try to minimise syscall (it improves
performance with O_DSYNC too).
Ignore patch #3 and #4 if you dislike them.
Regards,
Laurent
--
------------------ Laurent.Vivier@bull.net ------------------
"Tout ce qui est impossible reste à accomplir" Jules Verne
"Things are only impossible until they're not" Jean-Luc Picard
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
2008-11-07 8:44 ` Laurent Vivier
@ 2008-11-07 13:59 ` Anthony Liguori
2008-11-07 15:01 ` Laurent Vivier
0 siblings, 1 reply; 12+ messages in thread
From: Anthony Liguori @ 2008-11-07 13:59 UTC (permalink / raw)
To: qemu-devel
Laurent Vivier wrote:
> BTW, patch #1 is a clean-up (no change, only code move) and patch #2
> doesn't align buffer, but try to minimise syscall (it improves
> performance with O_DSYNC too).
>
> Ignore patch #3 and #4 if you dislike them.
>
But how useful is #1 and #2 if the buffers aren't aligned and we're
hitting the slow path?
Regards,
Anthony Liguori
> Regards,
> Laurent
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
2008-11-07 13:59 ` Anthony Liguori
@ 2008-11-07 15:01 ` Laurent Vivier
2008-11-07 15:37 ` Anthony Liguori
0 siblings, 1 reply; 12+ messages in thread
From: Laurent Vivier @ 2008-11-07 15:01 UTC (permalink / raw)
To: qemu-devel
Le vendredi 07 novembre 2008 à 07:59 -0600, Anthony Liguori a écrit :
> Laurent Vivier wrote:
> > BTW, patch #1 is a clean-up (no change, only code move) and patch #2
> > doesn't align buffer, but try to minimise syscall (it improves
> > performance with O_DSYNC too).
> >
> > Ignore patch #3 and #4 if you dislike them.
> >
>
> But how useful is #1 and #2 if the buffers aren't aligned and we're
> hitting the slow path?
In all cases #1 is totally useless as it is only cosmetic.
#2 improves performance without aligned buffers and with "cache" set to
default value.
I've installed ubuntu 8.04 desktop on an empty qcow2 file using qemu:
without patch: 16 minutes 54 seconds
with patch: 15 minutes 44 seconds
Regards,
Laurent
--
------------------ Laurent.Vivier@bull.net ------------------
"Tout ce qui est impossible reste à accomplir" Jules Verne
"Things are only impossible until they're not" Jean-Luc Picard
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
2008-11-07 15:01 ` Laurent Vivier
@ 2008-11-07 15:37 ` Anthony Liguori
2008-11-07 15:48 ` Laurent Vivier
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Anthony Liguori @ 2008-11-07 15:37 UTC (permalink / raw)
To: qemu-devel
Laurent Vivier wrote:
> Le vendredi 07 novembre 2008 à 07:59 -0600, Anthony Liguori a écrit :
>
> In all cases #1 is totally useless as it is only cosmetic.
>
> #2 improves performance without aligned buffers and with "cache" set to
> default value.
>
I'm currently rewriting the block-raw-posix.c O_DIRECT support. I'll
commit your patches once I finish that.
I'm switching everything to use the aio functions, introducing a memory
pool so that we can efficiently allocate memory for new requests without
having unbounded memory allocation, and switching all the synchronous
functions to use the aio functions.
Regards,
Anthony Liguori
> I've installed ubuntu 8.04 desktop on an empty qcow2 file using qemu:
>
> without patch: 16 minutes 54 seconds
> with patch: 15 minutes 44 seconds
>
> Regards,
> Laurent
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
2008-11-07 15:37 ` Anthony Liguori
@ 2008-11-07 15:48 ` Laurent Vivier
2008-11-09 14:35 ` Avi Kivity
2008-11-10 20:15 ` Andreas Färber
2 siblings, 0 replies; 12+ messages in thread
From: Laurent Vivier @ 2008-11-07 15:48 UTC (permalink / raw)
To: qemu-devel
Le vendredi 07 novembre 2008 à 09:37 -0600, Anthony Liguori a écrit :
> Laurent Vivier wrote:
> > Le vendredi 07 novembre 2008 à 07:59 -0600, Anthony Liguori a écrit :
> >
> > In all cases #1 is totally useless as it is only cosmetic.
> >
> > #2 improves performance without aligned buffers and with "cache" set to
> > default value.
> >
>
> I'm currently rewriting the block-raw-posix.c O_DIRECT support. I'll
> commit your patches once I finish that.
>
> I'm switching everything to use the aio functions, introducing a memory
> pool so that we can efficiently allocate memory for new requests without
> having unbounded memory allocation, and switching all the synchronous
> functions to use the aio functions.
OK, I'll repost the patch when your work will be done.
[to correct bugs reported by Kevin]
Regards,
Laurent
--
------------------ Laurent.Vivier@bull.net ------------------
"Tout ce qui est impossible reste à accomplir" Jules Verne
"Things are only impossible until they're not" Jean-Luc Picard
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
2008-11-07 15:37 ` Anthony Liguori
2008-11-07 15:48 ` Laurent Vivier
@ 2008-11-09 14:35 ` Avi Kivity
2008-11-09 16:32 ` Anthony Liguori
2008-11-10 20:15 ` Andreas Färber
2 siblings, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2008-11-09 14:35 UTC (permalink / raw)
To: qemu-devel
Anthony Liguori wrote:
> Laurent Vivier wrote:
>> Le vendredi 07 novembre 2008 à 07:59 -0600, Anthony Liguori a écrit :
>> In all cases #1 is totally useless as it is only cosmetic.
>>
>> #2 improves performance without aligned buffers and with "cache" set to
>> default value.
>>
>
> I'm currently rewriting the block-raw-posix.c O_DIRECT support. I'll
> commit your patches once I finish that.
>
> I'm switching everything to use the aio functions, introducing a
> memory pool so that we can efficiently allocate memory for new
> requests without having unbounded memory allocation, and switching all
> the synchronous functions to use the aio functions.
What about locking? If clusters are already allocated and we don't need
COW things are simple (and a global lock should suffice, provided it's
released before the io is actually submitted), but are you planning to
parallelize the metadata paths as well?
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
2008-11-09 14:35 ` Avi Kivity
@ 2008-11-09 16:32 ` Anthony Liguori
0 siblings, 0 replies; 12+ messages in thread
From: Anthony Liguori @ 2008-11-09 16:32 UTC (permalink / raw)
To: qemu-devel
Avi Kivity wrote:
> Anthony Liguori wrote:
>> Laurent Vivier wrote:
>>> Le vendredi 07 novembre 2008 à 07:59 -0600, Anthony Liguori a écrit :
>>> In all cases #1 is totally useless as it is only cosmetic.
>>>
>>> #2 improves performance without aligned buffers and with "cache" set to
>>> default value.
>>>
>>
>> I'm currently rewriting the block-raw-posix.c O_DIRECT support. I'll
>> commit your patches once I finish that.
>>
>> I'm switching everything to use the aio functions, introducing a
>> memory pool so that we can efficiently allocate memory for new
>> requests without having unbounded memory allocation, and switching
>> all the synchronous functions to use the aio functions.
>
> What about locking? If clusters are already allocated and we don't
> need COW things are simple (and a global lock should suffice, provided
> it's released before the io is actually submitted), but are you
> planning to parallelize the metadata paths as well?
I'm still doing a global lock model and requests submitted while a
request is pending is simply queued. There are certainly more effective
strategies but this will work for now.
My larger goal is to totally eliminate synchronous reads/writes APIs in
QEMU. They cause unending pain with the main loop.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
2008-11-07 15:37 ` Anthony Liguori
2008-11-07 15:48 ` Laurent Vivier
2008-11-09 14:35 ` Avi Kivity
@ 2008-11-10 20:15 ` Andreas Färber
2008-11-10 20:52 ` Anthony Liguori
2 siblings, 1 reply; 12+ messages in thread
From: Andreas Färber @ 2008-11-10 20:15 UTC (permalink / raw)
To: qemu-devel
Am 07.11.2008 um 16:37 schrieb Anthony Liguori:
> Laurent Vivier wrote:
>> Le vendredi 07 novembre 2008 à 07:59 -0600, Anthony Liguori a écrit :
>> In all cases #1 is totally useless as it is only cosmetic.
>>
>> #2 improves performance without aligned buffers and with "cache"
>> set to
>> default value.
>>
>
> I'm currently rewriting the block-raw-posix.c O_DIRECT support.
> I'll commit your patches once I finish that.
>
> I'm switching everything to use the aio functions,
I'm not familiar with this area of code, so just as a reminder for
consideration, recently Blue Swirl added an option to disable AIO.
Some BSDs as well as Haiku may not have (full) POSIX AIO support.
Dumping non-AIO code for better performance on Linux would seem to
counterpoint the broadening platform support?
Andreas
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update
2008-11-10 20:15 ` Andreas Färber
@ 2008-11-10 20:52 ` Anthony Liguori
0 siblings, 0 replies; 12+ messages in thread
From: Anthony Liguori @ 2008-11-10 20:52 UTC (permalink / raw)
To: qemu-devel
Andreas Färber wrote:
>
> Am 07.11.2008 um 16:37 schrieb Anthony Liguori:
>
>> Laurent Vivier wrote:
>>> Le vendredi 07 novembre 2008 à 07:59 -0600, Anthony Liguori a écrit :
>>> In all cases #1 is totally useless as it is only cosmetic.
>>>
>>> #2 improves performance without aligned buffers and with "cache" set to
>>> default value.
>>>
>>
>> I'm currently rewriting the block-raw-posix.c O_DIRECT support. I'll
>> commit your patches once I finish that.
>>
>> I'm switching everything to use the aio functions,
>
> I'm not familiar with this area of code, so just as a reminder for
> consideration, recently Blue Swirl added an option to disable AIO.
> Some BSDs as well as Haiku may not have (full) POSIX AIO support.
> Dumping non-AIO code for better performance on Linux would seem to
> counterpoint the broadening platform support?
We can emulate AIO either via a bottom half and synchronous IO (not
great) or using pthreads (better). I have patches for both. Mixing
synchronous IO and asynchronous has all sorts of really ugly consequences.
Regards,
Anthony Liguori
> Andreas
>
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2008-11-10 20:52 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-06 16:55 [Qemu-devel] [PATCH 0/4] qcow2: Improve cluster refcount update Laurent Vivier
2008-11-06 18:53 ` Anthony Liguori
2008-11-07 8:31 ` Kevin Wolf
2008-11-07 8:44 ` Laurent Vivier
2008-11-07 13:59 ` Anthony Liguori
2008-11-07 15:01 ` Laurent Vivier
2008-11-07 15:37 ` Anthony Liguori
2008-11-07 15:48 ` Laurent Vivier
2008-11-09 14:35 ` Avi Kivity
2008-11-09 16:32 ` Anthony Liguori
2008-11-10 20:15 ` Andreas Färber
2008-11-10 20:52 ` Anthony Liguori
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).