From mboxrd@z Thu Jan 1 00:00:00 1970 From: Stefan Bader Subject: Re: [PATCH] xen-blk(front|back): Handle large physical sector disks Date: Wed, 15 May 2013 12:58:51 +0200 Message-ID: <51936A6B.6040905@canonical.com> References: <5191272A.7070703@canonical.com> <51920C4B02000078000D5D9F@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============8750909821565549791==" Return-path: In-Reply-To: <51920C4B02000078000D5D9F@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: xen-devel , Konrad Rzeszutek Wilk , roger.pau@citrix.com List-Id: xen-devel@lists.xenproject.org This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --===============8750909821565549791== Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="------------enigE50F035DB97CCF8B50FABB5C" This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigE50F035DB97CCF8B50FABB5C Content-Type: multipart/mixed; boundary="------------050903030707000204010401" This is a multi-part message in MIME format. --------------050903030707000204010401 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On 14.05.2013 10:04, Jan Beulich wrote: >>>> On 13.05.13 at 19:47, Stefan Bader wrot= e: >> --- a/drivers/block/xen-blkback/xenbus.c >> +++ b/drivers/block/xen-blkback/xenbus.c >> @@ -704,6 +704,13 @@ again: >> dev->nodename); >> goto abort; >> } >> + err =3D xenbus_printf(xbt, dev->nodename, "physical-sector-size", "%= u", >> + bdev_physical_block_size(be->blkif->vbd.bdev)); >> + if (err) { >> + xenbus_dev_fatal(dev, err, "writing %s/physical-sector-size", >> + dev->nodename); >> + goto abort; >=20 > Failure here should not be fatal (as with any other protocol > extensions). >=20 > Beyond that the patch looks good to me. >=20 > Jan >=20 Ok, here the version with the fatal error changed to just error message. =46rom 922e5f4c3074559f5cb5475956bd4a1723cc4fdc Mon Sep 17 00:00:00 2001 From: Stefan Bader Date: Mon, 13 May 2013 16:28:15 +0200 Subject: [PATCH] xen/blk: Use physical sector size for setup Currently xen-blkback passes the logical sector size over xenbus and xen-blkfront sets up the paravirt disk with that logical block size. But newer drives usually have the logical sector size set to 512 for compatibility reasons and would show the actual sector size only in physical sector size. This results in the device being partitioned and accessed in dom0 with the correct sector size, but the guest thinks 512 bytes is the correct block size. And that results in poor performance. To fix this, blkback gets modified to pass also physical-sector-size over xenbus and blkfront to use both values to set up the paravirt disk. I did not just change the passed in sector-size because I am not sure having a bigger logical sector size than the physical one is valid (and that would happen if a newer dom0 kernel hits an older domU kernel). Also this way a domU set up before should still be accessible (just some tools might detect the unaligned setup). Signed-off-by: Stefan Bader --- drivers/block/xen-blkback/xenbus.c | 5 +++++ drivers/block/xen-blkfront.c | 24 ++++++++++++++++++++---- 2 files changed, 25 insertions(+), 4 deletions(-) diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkba= ck/xenbus.c index 8bfd1bc..25463e3 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -704,6 +704,11 @@ again: dev->nodename); goto abort; } + err =3D xenbus_printf(xbt, dev->nodename, "physical-sector-size", "%u",= + bdev_physical_block_size(be->blkif->vbd.bdev)); + if (err) + xenbus_dev_error(dev, err, "writing %s/physical-sector-size", + dev->nodename); err =3D xenbus_transaction_end(xbt, 0); if (err =3D=3D -EAGAIN) diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index d89ef86..60ec315 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -542,7 +542,8 @@ wait: flush_requests(info); } -static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size) +static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size, + unsigned int physical_sector_size) { struct request_queue *rq; struct blkfront_info *info =3D gd->private_data; @@ -564,6 +565,7 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u= 16 sector_size) /* Hard sector size and max sectors impersonate the equiv. hardware. */= blk_queue_logical_block_size(rq, sector_size); + blk_queue_physical_block_size(rq, physical_sector_size); blk_queue_max_hw_sectors(rq, 512); /* Each segment in a request is up to an aligned page in size. */ @@ -667,7 +669,8 @@ static char *encode_disk_name(char *ptr, unsigned int= n) static int xlvbd_alloc_gendisk(blkif_sector_t capacity, struct blkfront_info *info, - u16 vdisk_info, u16 sector_size) + u16 vdisk_info, u16 sector_size, + unsigned int physical_sector_size) { struct gendisk *gd; int nr_minors =3D 1; @@ -734,7 +737,7 @@ static int xlvbd_alloc_gendisk(blkif_sector_t capacit= y, gd->driverfs_dev =3D &(info->xbdev->dev); set_capacity(gd, capacity); - if (xlvbd_init_blk_queue(gd, sector_size)) { + if (xlvbd_init_blk_queue(gd, sector_size, physical_sector_size)) { del_gendisk(gd); goto release; } @@ -1395,6 +1398,7 @@ static void blkfront_connect(struct blkfront_info *= info) { unsigned long long sectors; unsigned long sector_size; + unsigned int physical_sector_size; unsigned int binfo; int err; int barrier, flush, discard, persistent; @@ -1437,6 +1441,17 @@ static void blkfront_connect(struct blkfront_info = *info) return; } + /* + * physcial-sector-size is a newer field, so old backends may not + * provide this. Assume physical sector size to be the same as + * sector_size in that case. + */ + err =3D xenbus_gather(XBT_NIL, info->xbdev->otherend, + "physical-sector-size", "%u", &physical_sector_size, + NULL); + if (err) + physical_sector_size =3D sector_size; + info->feature_flush =3D 0; info->flush_op =3D 0; @@ -1483,7 +1498,8 @@ static void blkfront_connect(struct blkfront_info *= info) else info->feature_persistent =3D persistent; - err =3D xlvbd_alloc_gendisk(sectors, info, binfo, sector_size); + err =3D xlvbd_alloc_gendisk(sectors, info, binfo, sector_size, + physical_sector_size); if (err) { xenbus_dev_fatal(info->xbdev, err, "xlvbd_add at %s", info->xbdev->otherend); --=20 1.7.9.5 --------------050903030707000204010401 Content-Type: text/x-diff; name="0001-xen-blk-Use-physical-sector-size-for-setup.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename*0="0001-xen-blk-Use-physical-sector-size-for-setup.patch" =46rom 922e5f4c3074559f5cb5475956bd4a1723cc4fdc Mon Sep 17 00:00:00 2001 From: Stefan Bader Date: Mon, 13 May 2013 16:28:15 +0200 Subject: [PATCH] xen/blk: Use physical sector size for setup Currently xen-blkback passes the logical sector size over xenbus and xen-blkfront sets up the paravirt disk with that logical block size. But newer drives usually have the logical sector size set to 512 for compatibility reasons and would show the actual sector size only in physical sector size. This results in the device being partitioned and accessed in dom0 with the correct sector size, but the guest thinks 512 bytes is the correct block size. And that results in poor performance. To fix this, blkback gets modified to pass also physical-sector-size over xenbus and blkfront to use both values to set up the paravirt disk. I did not just change the passed in sector-size because I am not sure having a bigger logical sector size than the physical one is valid (and that would happen if a newer dom0 kernel hits an older domU kernel). Also this way a domU set up before should still be accessible (just some tools might detect the unaligned setup). Signed-off-by: Stefan Bader --- drivers/block/xen-blkback/xenbus.c | 5 +++++ drivers/block/xen-blkfront.c | 24 ++++++++++++++++++++---- 2 files changed, 25 insertions(+), 4 deletions(-) diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkba= ck/xenbus.c index 8bfd1bc..25463e3 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -704,6 +704,11 @@ again: dev->nodename); goto abort; } + err =3D xenbus_printf(xbt, dev->nodename, "physical-sector-size", "%u",= + bdev_physical_block_size(be->blkif->vbd.bdev)); + if (err) + xenbus_dev_error(dev, err, "writing %s/physical-sector-size", + dev->nodename); =20 err =3D xenbus_transaction_end(xbt, 0); if (err =3D=3D -EAGAIN) diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index d89ef86..60ec315 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -542,7 +542,8 @@ wait: flush_requests(info); } =20 -static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size) +static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size, + unsigned int physical_sector_size) { struct request_queue *rq; struct blkfront_info *info =3D gd->private_data; @@ -564,6 +565,7 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u= 16 sector_size) =20 /* Hard sector size and max sectors impersonate the equiv. hardware. */= blk_queue_logical_block_size(rq, sector_size); + blk_queue_physical_block_size(rq, physical_sector_size); blk_queue_max_hw_sectors(rq, 512); =20 /* Each segment in a request is up to an aligned page in size. */ @@ -667,7 +669,8 @@ static char *encode_disk_name(char *ptr, unsigned int= n) =20 static int xlvbd_alloc_gendisk(blkif_sector_t capacity, struct blkfront_info *info, - u16 vdisk_info, u16 sector_size) + u16 vdisk_info, u16 sector_size, + unsigned int physical_sector_size) { struct gendisk *gd; int nr_minors =3D 1; @@ -734,7 +737,7 @@ static int xlvbd_alloc_gendisk(blkif_sector_t capacit= y, gd->driverfs_dev =3D &(info->xbdev->dev); set_capacity(gd, capacity); =20 - if (xlvbd_init_blk_queue(gd, sector_size)) { + if (xlvbd_init_blk_queue(gd, sector_size, physical_sector_size)) { del_gendisk(gd); goto release; } @@ -1395,6 +1398,7 @@ static void blkfront_connect(struct blkfront_info *= info) { unsigned long long sectors; unsigned long sector_size; + unsigned int physical_sector_size; unsigned int binfo; int err; int barrier, flush, discard, persistent; @@ -1437,6 +1441,17 @@ static void blkfront_connect(struct blkfront_info = *info) return; } =20 + /* + * physcial-sector-size is a newer field, so old backends may not + * provide this. Assume physical sector size to be the same as + * sector_size in that case. + */ + err =3D xenbus_gather(XBT_NIL, info->xbdev->otherend, + "physical-sector-size", "%u", &physical_sector_size, + NULL); + if (err) + physical_sector_size =3D sector_size; + info->feature_flush =3D 0; info->flush_op =3D 0; =20 @@ -1483,7 +1498,8 @@ static void blkfront_connect(struct blkfront_info *= info) else info->feature_persistent =3D persistent; =20 - err =3D xlvbd_alloc_gendisk(sectors, info, binfo, sector_size); + err =3D xlvbd_alloc_gendisk(sectors, info, binfo, sector_size, + physical_sector_size); if (err) { xenbus_dev_fatal(info->xbdev, err, "xlvbd_add at %s", info->xbdev->otherend); --=20 1.7.9.5 --------------050903030707000204010401-- --------------enigE50F035DB97CCF8B50FABB5C Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQIcBAEBCgAGBQJRk2prAAoJEOhnXe7L7s6jB6AQAJkDqjvPTei3mfJWH6wsdOfF rlgaU/+qfD0y6mDot/p2RCQnmOyr/MDcQlnSlrfnT3wjO1sz0Tj9riAaOju2yB0r 5fnXcR2S/im2cDgqibIIR/eaG9hCaw+WiNnKvz4+Gn2U4glDcSRWr1jYgkORZg8S 2wo++FxziZsE/6ZPQQ3IuLBSLtprWiHwxNkU8MLuj5gurl7y2heUHm0TdZkLVH0I oVsOZdyFvS5SyJ/phNFn0s1mMlPAN1/hhC9D6j/IHl9qKczQniOXyMeiIY3G4Zlr QIknnVgIh5wcJXpkoxJHq9q/h6CaKq4LJ+ELZOkQCU6WWWpk/8II+Azz8AgaUjVT KT00R2PHjSJfu4K8cjXDAoVrRePDYenBJCNR8VacMNo5hUeU+A9LLQlk8GMei79A ciEYN8oUlqrNSmDVNa2ggkKm7LOcFHbx75oce09G/JvijyveUd8D8IygUWTcnEAf 3CRLk1GSP/evY0vizkbIWxk7EU+KY2b/zDA87s5b3ez/oW76YiSm37EjgwhFWCRb mapUCJimNw57TVJHw6FjdnIlgo5o9DLquPYRcP9pZsbb4ttvHv7ortmLNtRrwAgu 5vONEWzbsNXwTc7/0aD4uVYk8XbtR28UQedvFP9TH5OTS6kylhMBWCEF2RvXGcdW fqmC6bJX5QCUTru2XTaW =ggeL -----END PGP SIGNATURE----- --------------enigE50F035DB97CCF8B50FABB5C-- --===============8750909821565549791== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============8750909821565549791==--