From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Markus Armbruster <armbru@redhat.com>,
Stefan Hajnoczi <stefanha@gmail.com>,
Kevin Wolf <kwolf@redhat.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>,
qemu-devel@nongnu.org,
Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>,
dahi@linux.vnet.ibm.com, cornelia.huck@de.ibm.com,
Paolo Bonzini <pbonzini@redhat.com>
Subject: [Qemu-devel] Geometry and blocksize support for backing devices
Date: Thu, 06 Nov 2014 16:58:10 +0100 [thread overview]
Message-ID: <545B9A92.4030802@de.ibm.com> (raw)
In-Reply-To: <1406636839-11946-1-git-send-email-tumanova@linux.vnet.ibm.com>
Markus, Kevin, Stefan,
here is a (somewhat late) followup of some KVM forum discussions regarding
block size and geometry of pass-through block devices. Let's just do a quick
wrap-up (as of my understanding) and a proposal at the end of the mail
- DASD/ECKD disk devices have several special properties, e.g. the geometry
still has an influence on the partitions and each track can have a different
format and the format of the disk actually follows what z/OS has as basic
data structures. Linux does a low-level format of the disk to look like a
block device (most common case is formatted with 4K blocks). There are
still some smalls warts for z/OS compatibility (block0 has 28 bytes of data,
block1 has 148 bytes and block2 has 84 bytes of data. everything else has
4k. the dasd device driver will fake 4k blocks for all blocks by ignoring
writes beyond the data for blocks 0-2 and filling the gaps with 0xe5).
- Since Linux uses DASDs as normal block device, we actually want to use
virtio-blk to pass those to KVM guests. Due to the warts mentioned above,
we have to have the proper block size and geometry in the guest. Otherwise
things will fail in certain cases (linux partition detection for example)
- we have libvirt support to provide this information in the XML. This is far
from user friendly, though, as the admin has to manually look up the
properties of the host disks and then modify the guest definition accordingly.
- Kate came up with patches (based on initial patches from Einar Lueck) for
auto-detection of geometry and block size for host block devices
- Stefan and Paolo had some concerns:
1. if the geometry etc is important, then make it part of the guest definition
2. what about migration and the target disk differs
3. is that issue system z specific or generic?
Regarding 1: this does work as of today, but it is pretty complicated for an
admin to do so
Regarding 2: System z system do not have built-in disks, they are always
accessed via fibre channel (either FICON protocol for DASDs or FCP protocol
for scsi). So its quite common to have shared access to the same disk from
different System z boxes. system z admins should be able to setup this
properly. Question is, is it ok to assume that and fail if not?
Regarding 3: No idea.
At KVM forum I talked to several people regarding a solution:
a) Stefan suggested to make the auto detection explicit, e.g.: provide a
"autodetect" tag for the secs, cylc, heads and logical_sector_size properties
This would require changes in qemu, but also in libvirt and its domain
configuration
b) Markus suggested that there are already some cases in QEMU, where we rely
on the admin to provide a proper setup on the target, e.g. an .iso file as image.
If the target has a different content in the .iso file things will break.
Markus said, there are two classes of this.
a) problems that can be detected by QEMU. Here QEMU will abort migration
b) problems that cannot be detected by QEMU (e.g. different iso content). this
will trigger failures later on
a is preferred
Now here comes my proposal:
Markus statement brought up an idea of special casing DASDs support. We can
call an ioctl BIODASDINFO on the block device that will only succeed if the host
disk is really a dasd. We could enable the auto detection for that case.
In addition, QEMU will check if geometry and block size match during migration, if
not, migration will fail. That would work with the following cases
(manual override == secs, cyls, headers, blocksize given by admin)
HOST A HOST B
dasd (auto) -----> dasd (auto)
dasd (auto) -----> image file (manual override)
image file (manual) -----> dasd (auto)
image file (manual) -----> image file (manual)
dasd (auto) -----> other host block device with manual override
if there are different dasds or different value migration will fail (and that is
what we want)
In addition we could also implement Stefans proposal to add a "autodetect" statement
for secs, cyls, head.... but I am not sure about libvirt support, though
So 3 parts:
1. autodetect on real DASDs
2. geometry and sector size checking in generic code
3. maybe an autodetect flag
makes sense? Any guidance how to proceed?
Christian
next prev parent reply other threads:[~2014-11-06 15:58 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-29 12:27 [Qemu-devel] [PATCH 0/4] Geometry and blocksize support for backing devices Ekaterina Tumanova
2014-07-29 12:27 ` [Qemu-devel] [PATCH 1/4] hd-geometry.c: Integrate HDIO_GETGEO in guessing for target-s390x Ekaterina Tumanova
2014-08-20 8:19 ` Paolo Bonzini
2014-08-20 9:35 ` Christian Borntraeger
2014-08-20 13:10 ` Paolo Bonzini
2014-08-25 8:21 ` Christian Borntraeger
2014-07-29 12:27 ` [Qemu-devel] [PATCH 2/4] blocksize: support auto-sensing of blocksizes Ekaterina Tumanova
2014-09-03 15:37 ` Stefan Hajnoczi
2014-07-29 12:27 ` [Qemu-devel] [PATCH 3/4] blocksize: Add driver method to get the blocksizes Ekaterina Tumanova
2014-07-29 12:27 ` [Qemu-devel] [PATCH 4/4] blocksize: add blkconf_blocksize call to all block devices Ekaterina Tumanova
2014-09-03 15:46 ` Stefan Hajnoczi
2014-09-04 10:28 ` Ekaterina Tumanova
2014-09-17 14:00 ` Stefan Hajnoczi
2014-09-04 14:15 ` Christian Borntraeger
2014-09-17 13:56 ` Stefan Hajnoczi
2014-07-29 12:37 ` [Qemu-devel] [PATCH 0/4] Geometry and blocksize support for backing devices Christian Borntraeger
2014-11-06 15:58 ` Christian Borntraeger [this message]
2014-11-06 17:24 ` [Qemu-devel] " Paolo Bonzini
2014-11-06 19:05 ` Christian Borntraeger
2014-11-07 9:17 ` Markus Armbruster
2014-11-07 9:48 ` Christian Borntraeger
2014-11-07 15:58 ` Markus Armbruster
2014-11-07 13:39 ` Ekaterina Tumanova
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=545B9A92.4030802@de.ibm.com \
--to=borntraeger@de.ibm.com \
--cc=armbru@redhat.com \
--cc=cornelia.huck@de.ibm.com \
--cc=dahi@linux.vnet.ibm.com \
--cc=kwolf@redhat.com \
--cc=mihajlov@linux.vnet.ibm.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@gmail.com \
--cc=tumanova@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).