Date: Tue, 21 Jun 2016 12:43:08 +0200
From: Kevin Wolf
To: Stefan Hajnoczi
Cc: Eric Blake, qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 0/5] Auto-fragment large transactions at the block layer
Message-ID: <20160621104308.GC4520@noname.redhat.com>
In-Reply-To: <20160621102357.GG32560@stefanha-x1.localdomain>
References: <1466465969-25315-1-git-send-email-eblake@redhat.com> <20160621102357.GG32560@stefanha-x1.localdomain>

On 21.06.2016 at 12:23, Stefan Hajnoczi wrote:
> On Mon, Jun 20, 2016 at 05:39:24PM -0600, Eric Blake wrote:
> > We have max_transfer documented in BlockLimits, but while we
> > honor it during pwrite_zeroes, we were blindly ignoring it
> > during pwritev and preadv, leading to multiple drivers having
> > to implement fragmentation themselves.  This series moves
> > fragmentation to the block layer, then fixes the NBD driver to
> > use it; if you like this but it needs a v2, you can request that
> > I further do other drivers (I know at least iscsi and qcow2 do
> > some self-fragmenting and/or error reporting that can be
> > simplified by deferring fragmentation to the block layer).
>
> I'm concerned that requests A & B which should be atomic can now be
> interleaved.

I don't think there is any guarantee of atomicity for overlapping
requests, at least not with more than a single sector (logical block
size, not BDRV_SECTOR_SIZE). That is, as far as I know neither hardware
nor the Linux kernel nor the qemu block layer (image formats fragment
all the time!) protect against this.

If you have concurrent overlapping requests, you always get undefined
behaviour.

> For example, two writes that are overlapping and fragmented.
> Applications expect to either see A or B on disk when both requests have
> completed.  Fragmentation must serialize overlapping requests in order
> to prevent interleaved results where the application sees some of A and
> some of B when both requests have completed.
>
> A similar scenario happens when A is a read and B is a write, too.  Read
> A is supposed to see either "before B" or "after B".  With fragmentation
> it can see "some of before B and some of after B".

If we wanted to achieve this semantics, it would be easy enough: Add a
mark_request_serialising() in the right place. But I'm pretty sure that
this isn't needed.
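
To make the scenario under discussion concrete, here is a toy model in
Python, not QEMU code. It assumes a bytearray standing in for the disk, a
made-up limit MAX_TRANSFER of 4 bytes, and hypothetical helpers fragment()
and apply_fragments(). It only shows that two overlapping writes, once
split into fragments, can complete in an order that leaves part of A and
part of B on disk, whereas running one whole request after the other
leaves exactly one of them.

```python
# Toy model only: a bytearray plays the disk. All names are hypothetical
# and none of this corresponds to the actual QEMU block layer API.
MAX_TRANSFER = 4  # assumed per-request transfer limit, in bytes


def fragment(offset, data, max_transfer=MAX_TRANSFER):
    """Split one write into (offset, chunk) pieces of at most max_transfer bytes."""
    return [(offset + i, data[i:i + max_transfer])
            for i in range(0, len(data), max_transfer)]


def apply_fragments(disk, frags):
    """Apply the fragments to the disk in exactly the order given."""
    for off, chunk in frags:
        disk[off:off + len(chunk)] = chunk
    return disk


A = b"AAAAAAAA"
B = b"BBBBBBBB"
frags_a = fragment(0, A)  # two 4-byte fragments covering bytes 0..7
frags_b = fragment(0, B)  # same range, so the requests fully overlap

# One possible completion order without serialisation:
# first half ends with B's data, second half ends with A's data.
mixed = apply_fragments(bytearray(8),
                        [frags_a[0], frags_b[0], frags_b[1], frags_a[1]])
print(bytes(mixed))  # b'BBBBAAAA'

# With the requests serialised, whichever runs last wins for the whole range.
serial = apply_fragments(bytearray(8), frags_a + frags_b)
print(bytes(serial))  # b'BBBBBBBB'
```

Whether the block layer should rule out the mixed outcome is the question
in this thread; the sketch only shows what the two outcomes look like.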

Kevin