Date: Wed, 13 May 2015 16:43:42 +0100
From: Stefan Hajnoczi
To: Paolo Bonzini
Cc: Dmitry Monakhov, qemu-block@nongnu.org, Stefan Hajnoczi, "Denis V. Lunev", qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v5 0/2] block: enforce minimal 4096 alignment in qemu_blockalign
Message-ID: <20150513154342.GB24352@stefanha-thinkpad.redhat.com>
In-Reply-To: <5551DA21.7020105@redhat.com>

On Tue, May 12, 2015 at 12:46:57PM +0200, Paolo Bonzini wrote:
>
>
> On 12/05/2015 12:19, Denis V. Lunev wrote:
> >
> >
> > hades /vol $ strace -f -e pwrite -e raw=write,pwrite qemu-io -n -c
> > "write -P 0x11 0 64M" ./1.img
> > Process 19326 attached
> > [pid 19326] pwrite(0x6, 0x7fac07fff200, 0x4000000, 0x50000) = 0x4000000
> > <---- 1 GB Write from userspace
>
> FWIW this is 64 MB (as expected).
>
> > wrote 67108864/67108864 bytes at offset 0
> > 64 MiB, 1 ops; 0.2964 sec (215.863 MiB/sec and 3.3729 ops/sec)
> > [pid 19326] +++ exited with 0 +++
> > +++ exited with 0 +++
> > hades /vol $
>
> > 9,0 1 266 74.030359772 19326 Q WS 473095 + 1016 [(null)]
> > 9,0 1 267 74.030361546 19326 Q WS 474111 + 8 [(null)]
> > 9,0 1 268 74.030395522 19326 Q WS 474119 + 1016 [(null)]
> > 9,0 1 269 74.030397509 19326 Q WS 475135 + 8 [(null)]
> >
> > This means, yes, kernel is INEFFECTIVE performing direct IO with
> > not aligned address. For example, without direct IO the pattern is
> > much better.
>
> I think this means that the kernel is DMAing at most 128 pages at a
> time. If the buffer is misaligned, you need 129 pages and the kernel
> then splits the request into a 128 page and a 1 page part.
>
> This looks like a hardware limit, and the kernel probably cannot really
> do anything about it because we requested O_DIRECT. So your patch makes
> sense.

A 64 MB buffer was given in the pwrite() call. The first and the last
128-page write requests may have partial pages, but why should the rest
not use fully aligned 1024 sector writes?

Maybe the buffer is split by the max sectors per request before the
alignment requirements are considered. It would be more efficient to
first split off the unaligned parts.

So I think the kernel is still doing something suboptimal here.

Stefan
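
To make the page arithmetic in this thread concrete, here is a minimal
sketch in Python. It is a toy model, not the kernel's actual splitting
code: the 4096-byte page size and the helper names pages_touched() and
split_on_page_boundaries() are assumptions made for illustration, and the
128-page limit is only the guess made earlier in the thread. The buffer
address and length are the ones from the strace output. The sketch checks
that a 512 KiB chunk (1016 + 8 sectors) spans 129 pages at that address,
and shows what splitting on page boundaries first would produce.

```python
PAGE_SIZE = 4096     # assumed page size (x86-64)
SECTOR_SIZE = 512    # blktrace reports lengths in 512-byte sectors
MAX_PAGES = 128      # per-request page limit guessed in the thread


def pages_touched(addr, length):
    """Number of pages spanned by the byte range [addr, addr + length)."""
    if length == 0:
        return 0
    first = addr // PAGE_SIZE
    last = (addr + length - 1) // PAGE_SIZE
    return last - first + 1


def split_on_page_boundaries(addr, length, max_pages=MAX_PAGES):
    """Hypothetical splitter: each piece spans at most max_pages pages and
    every piece after the first starts on a page boundary.
    Returns the piece lengths in bytes."""
    pieces = []
    end = addr + length
    while addr < end:
        # A piece may run up to the end of the max_pages-th page it touches.
        limit = (addr // PAGE_SIZE + max_pages) * PAGE_SIZE
        piece_end = min(limit, end)
        pieces.append(piece_end - addr)
        addr = piece_end
    return pieces


if __name__ == "__main__":
    addr, length = 0x7fac07fff200, 0x4000000   # values from the strace output
    chunk = 1024 * SECTOR_SIZE                 # 1016 + 8 sectors = 512 KiB

    print(length // (1024 * 1024), "MiB written")
    print(pages_touched(addr - addr % PAGE_SIZE, chunk), "pages if aligned")
    print(pages_touched(addr, chunk), "pages at the traced address")

    pieces = split_on_page_boundaries(addr, length)
    sectors = [p // SECTOR_SIZE for p in pieces]
    print(len(pieces), "requests:", sectors[0], sectors[1], "...", sectors[-1])
```

Under these assumptions the aligned chunk spans 128 pages and the traced
one 129. The page-boundary split yields 129 requests for the 64 MiB
buffer: a 1023-sector head, 127 full 1024-sector requests, and a 1-sector
tail, rather than an alternating large/small pair for every 512 KiB.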