Subject: Re: [Qemu-devel] [PATCH] Align file accesses with cache=off (O_DIRECT)
From: Laurent Vivier
To: Kevin Wolf
Cc: qemu-devel@nongnu.org
Date: Wed, 30 Apr 2008 11:59:34 +0200

On Wednesday 30 April 2008 at 11:21 +0200, Kevin Wolf wrote:
> Laurent Vivier wrote:
> >> Hm, yes. We could call raw_pread in raw_aio_read when O_DIRECT is used
> >> and the request is not properly aligned. Is this what you meant?
> >
> > No, it was just a (stupid) comment. I think we must not convert
> > asynchronous I/O to synchronous I/O.
>
> Why not? If I'm not mistaken (but if you think it's wrong, probably I
> am) the only possible drawback should be performance. And we're not
> talking about normal guest I/O, these accesses are aligned anyway.

I think there can be side effects if the caller of the function thinks
the I/O is asynchronous when it is in fact synchronous (but perhaps we
can call that a "bug"...).

> >> I think we agree that item 3 is the main reason one would use O_DIRECT
> >> with qemu. In terms of reliability, it is important that the data
> >> really is written to the disk when the guest OS thinks so. But when,
> >> for example, qemu crashes, I don't think it's too important whether 40%
> >> or 50% of a snapshot has already been written - it's unusable anyway.
> >> A sync afterwards could be enough there.
> >
> > I'm not talking about "qemu crashes" but about "host crashes".
>
> Well, I've never seen a host crash where qemu survived. ;-)
>
> What I wanted to say: If the snapshot is incomplete and not usable
> anyway, why bother if some bytes more or less have been written?

What I mean is that with O_DIRECT we increase reliability because we are
sure the data is on disk when qemu has finished saving the snapshot.
Without O_DIRECT, qemu can have finished saving while the snapshot is
not really on the disk yet but still in the host cache. If the host
crashes between these two points, it's "bad" (the user thinks the
snapshot has been saved when in fact it has not). But perhaps it's just
a detail...

> > I'm not trying to say "my patch is better than yours" (and I don't
> > think so); but could you try to test my patch? If I remember correctly
> > it handles all cases, and this could help you to find a solution (or
> > perhaps you can add to your patch the part of my patch about
> > block-qcow2.c).
>
> I really didn't want to say that your code is bad or wouldn't work or
> something like that. I just tried it and it works fine as well.
>
> But the approaches are quite different. Your patch makes every user of
> the block driver functions aware that O_DIRECT might be in effect. My
> patch tries to do things in one common place, even though possibly at
> the cost of performance (however, I'm not sure anymore about the bad
> performance now that I use your fcntl method).
>
> So what I like about my patch is that it is one change in one place
> which should make everything work. Your patch could still be of use to
> speed things up, e.g. by making qcow2 aware that there is something
> like O_DIRECT (I would have to do a real benchmark with both patches)
> and have it align its buffers in the first place.

OK, I agree with you.
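
Just to illustrate what "align its buffers in the first place" means in
practice: qcow2 would allocate its I/O buffers with posix_memalign()
instead of a plain malloc(). A minimal sketch (the helper name, the
cluster_size variable and the 512-byte alignment are only assumptions
for the example; the real constraint depends on the device and
filesystem):

/* Minimal sketch: allocate an I/O buffer that satisfies the O_DIRECT
 * alignment constraints. */
#include <stdlib.h>

static void *qemu_memalign_sketch(size_t alignment, size_t size)
{
    void *buf;

    /* posix_memalign() needs a power-of-two alignment that is a
     * multiple of sizeof(void *); 512 covers most block devices. */
    if (posix_memalign(&buf, alignment, size) != 0) {
        return NULL;
    }
    return buf;
}

/* e.g. in qcow2, instead of a plain malloc(cluster_size):
 *     uint8_t *cluster_buf = qemu_memalign_sketch(512, cluster_size);
 */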
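
And for reference, since you mention it, the fcntl method is essentially
the following (again only a sketch, with error handling omitted; the
actual patch may differ in the details):

/* Sketch of the fcntl method: temporarily drop O_DIRECT so that an
 * unaligned request can go through the page cache, then restore the
 * original flags. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

static ssize_t pread_unaligned_sketch(int fd, void *buf,
                                      size_t count, off_t offset)
{
    int flags = fcntl(fd, F_GETFL);
    ssize_t ret;

    fcntl(fd, F_SETFL, flags & ~O_DIRECT);
    ret = pread(fd, buf, count, offset);
    fcntl(fd, F_SETFL, flags);

    return ret;
}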

> I'll attach the current version of my patch which emulates AIO through
> synchronous requests for unaligned buffers. In comparison, with your
> patch the bootup of my test VM was slightly faster, but loading/saving
> of snapshots was much faster with mine. Perhaps I'll try to combine
> them to get the best of both.

Just a comment on the patch: perhaps you could call your field
"open_flags" instead of "flags", and perhaps you could merge it with
"fd_open_flags"?

Any comments from the Qemu maintainers? Is someone ready to commit this
to SVN? I personally think this functionality should be included...

Regards,
Laurent
--
------------- Laurent.Vivier@bull.net ---------------
"The best way to predict the future is to invent it."
- Alan Kay