qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH RFC 0/5] disk deadlines
@ 2015-09-08  8:00 Denis V. Lunev
From: Denis V. Lunev @ 2015-09-08  8:00 UTC
  Cc: Kevin Wolf, Denis V. Lunev, Stefan Hajnoczi, qemu-devel,
	Raushaniya Maksudova

Description of the problem:
A client and a server interact via the Network File System (NFS) or other
network storage such as Ceph. The server holds the image of a Virtual
Machine (VM) with Linux inside. The disk is exposed to the VM as SATA or
IDE. The VM is started on the client as usual. During a network outage,
requests from the virtual disk cannot be completed in predictable time.
If such a request is, e.g., an ext3/4 journal write, the guest will reset
the controller and restart the request the first time this happens. On the
next such event the guest will remount the affected filesystem read-only.
From the end user's point of view this looks like a fatal crash that
requires a manual reboot.

To avoid this situation, this patchset introduces a per-drive option
"disk-deadlines=on|off", which is unset by default. All disk requests
are tracked if the option is enabled. If requests are not completed
in time, countermeasures are applied (see below). The timeout could be
configured; the default was chosen from observations.

Test procedure to reproduce the problem:
1) configure and start the NFS server:
$sudo /etc/init.d/nfs-kernel-server restart
2) put a Virtual Machine image with a preinstalled operating system on the server
3) on the client, mount the server folder that contains the Virtual Machine image:
$sudo mount -t nfs -O uid=1000,iocharset=utf-8 server_ip:/path/to/folder/on/server /path/to/folder/on/client
4) start the Virtual Machine with QEMU on the client (for example):
$qemu-system-x86_64 -enable-kvm -vga std -balloon virtio -monitor stdio \
 -drive file=/path/to/folder/on/client/vdisk.img,media=disk,if=ide,disk-deadlines=on \
 -boot d -m 12288
5) inside the VM run the following command:
$dd if=/dev/urandom of=testfile bs=10M count=300
and stop the server (or disconnect the network) by running:
$sudo /etc/init.d/nfs-kernel-server stop
6) inside the VM periodically run:
$dmesg
and check for error messages.

One of the following outcomes can be observed (main lines only):
1) After the server restarts, the guest OS continues to run as usual, with
the following messages in dmesg:
  a) [ 1108.131474] nfs: server 10.30.23.163 not responding, still trying
     [ 1203.164903] INFO: task qemu-system-x86:3256 blocked for more than 120 seconds

  b) [ 581.184311] ata1.00: qc timeout (cmd 0xe7)
     [ 581.184321] ata1.00: FLUSH failed Emask 0x4
     [ 581.744271] ata1: soft resetting link
     [ 581.900346] ata1.01: NODEV after polling detection
     [ 581.900877] ata1.00: configured for MWDMA2
     [ 581.900879] ata1.00: retrying FLUSH 0xe7 Emask 0x4
     [ 581.901203] ata1.00: device reported invalid CHS sector 0
     [ 581.901213] ata1: EH complete
2) The guest OS remounts its filesystem read-only:
"remounting filesystem read-only"
3) The guest OS does not respond at all, even after the server restarts

Tested on:
Virtual Machine - Linux 3.11.0 SMP x86_64 Ubuntu 13.10 saucy;
client -  Linux 3.11.10 SMP x86_64, Ubuntu 13.10 saucy;
server - Linux 3.13.0 SMP x86_64, Ubuntu 14.04.1 LTS.

How does the proposed solution work?

If the disk-deadlines option is enabled for a drive, the completion time of
this drive's requests is tracked. The method is as follows (the option is
assumed to be enabled from here on).

Every drive has its own red-black tree for tracking its requests.
The expiration time of a request is the key, and a cookie (the id of the
request) is stored in the corresponding node. Assume that every request has
8 seconds to complete. If a request is not completed in time for some reason
(a server crash or something else), the timer of this drive fires and the
corresponding callback requests that the Virtual Machine (VM) be stopped.
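
As a rough illustration of the bookkeeping described above, here is a small
Python model (not the C code from this series). It is only a sketch under
assumptions: a binary heap with lazy deletion stands in for the red-black
tree, and the names DriveDeadlines, start_request, complete_request,
next_expiration and expired are made up for the example. Only the 8-second
budget and the "expiration time as key, cookie as payload" pairing come from
the description.

```python
import heapq
import itertools

DEADLINE_NS = 8 * 1_000_000_000  # the 8-second budget mentioned above


class DriveDeadlines:
    """Toy per-drive request tracker (hypothetical, not QEMU code)."""

    def __init__(self, timeout_ns=DEADLINE_NS):
        self.timeout_ns = timeout_ns
        self._heap = []                   # (expiration_ns, cookie), earliest first
        self._pending = {}                # cookie -> expiration_ns, in-flight only
        self._cookies = itertools.count(1)

    def start_request(self, now_ns):
        """Register a new request and return its cookie."""
        cookie = next(self._cookies)
        expires = now_ns + self.timeout_ns
        self._pending[cookie] = expires
        heapq.heappush(self._heap, (expires, cookie))
        return cookie

    def complete_request(self, cookie):
        """Forget a finished request; its heap entry is dropped lazily."""
        self._pending.pop(cookie, None)

    def next_expiration(self):
        """Time at which the drive's timer should fire, or None if idle."""
        while self._heap and self._heap[0][1] not in self._pending:
            heapq.heappop(self._heap)     # skip entries of completed requests
        return self._heap[0][0] if self._heap else None

    def expired(self, now_ns):
        """Cookies of in-flight requests whose deadline has passed."""
        return sorted(c for c, exp in self._pending.items() if exp <= now_ns)


if __name__ == "__main__":
    drive = DriveDeadlines()
    first = drive.start_request(now_ns=0)
    second = drive.start_request(now_ns=1_000_000_000)
    drive.complete_request(first)
    print(drive.next_expiration(), drive.expired(now_ns=9_000_000_000))
```

A balanced tree allows removing a completed request directly, which the heap
here only emulates by skipping stale entries when the next deadline is looked
up.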

The VM remains stopped until all requests from the disk that caused the stop
are completed. Furthermore, if there are other disks with 'disk-deadlines=on'
whose requests are still waiting to be completed, the VM is not started: it
waits for the completion of all "late" requests from all disks.
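
The resume rule above can be modelled in a few lines of Python. Again this is
a hedged sketch and not the code from the patches: VmPauseController and its
methods are hypothetical names, and the model only covers stops triggered by
the deadline logic (a VM stopped by the user for other reasons is out of
scope).

```python
class VmPauseController:
    """Toy model: the VM resumes only when no disk has late requests left."""

    def __init__(self):
        self.late = {}        # disk id -> set of cookies of late requests
        self.stopped = False  # True while the deadline logic keeps the VM stopped

    def request_timed_out(self, disk_id, cookie):
        """Called when a drive's timer fires for a request."""
        self.late.setdefault(disk_id, set()).add(cookie)
        self.stopped = True

    def request_completed(self, disk_id, cookie):
        """Called when a request finally completes."""
        cookies = self.late.get(disk_id)
        if cookies is None:
            return                    # this disk had no late requests
        cookies.discard(cookie)
        if not cookies:
            del self.late[disk_id]    # this disk has caught up
        if not self.late:
            self.stopped = False      # every disk has caught up: resume


if __name__ == "__main__":
    vm = VmPauseController()
    vm.request_timed_out("ide0-hd0", 1)
    vm.request_timed_out("ide0-hd1", 7)
    vm.request_completed("ide0-hd0", 1)
    print(vm.stopped)   # still waiting for ide0-hd1
    vm.request_completed("ide0-hd1", 7)
    print(vm.stopped)
```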

In addition, all requests that caused the VM to stop (or that simply were not
completed in time) can be printed using the "info disk-deadlines" QEMU monitor
command as follows:
(qemu) info disk-deadlines

   disk_id  type       size total_time        start_time
.--------------------------------------------------------
  ide0-hd1 FLUSH         0b 46.403s     22232930059574ns
  ide0-hd1 FLUSH         0b 57.591s     22451499241285ns
  ide0-hd1 FLUSH         0b 103.482s    22574100547397ns
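
For anyone who wants to post-process this output, the following Python sketch
parses the rows shown above. It assumes the column layout stays exactly as in
this example (five whitespace-separated fields, total_time with an "s" suffix,
start_time with an "ns" suffix); LateRequest and parse_disk_deadlines are
hypothetical names, not part of the series.

```python
from typing import NamedTuple


class LateRequest(NamedTuple):
    disk_id: str
    type: str
    size: str            # kept as printed, e.g. "0b"
    total_time_s: float
    start_time_ns: int


def parse_disk_deadlines(text):
    """Return one LateRequest per data row; header and rule lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue
        disk_id, req_type, size, total, start = fields
        if not (total.endswith("s") and start.endswith("ns")):
            continue                      # e.g. the header line
        try:
            rows.append(LateRequest(disk_id, req_type, size,
                                    float(total[:-1]), int(start[:-2])))
        except ValueError:
            continue                      # not a data row after all
    return rows


if __name__ == "__main__":
    sample = """\
   disk_id  type       size total_time        start_time
.--------------------------------------------------------
  ide0-hd1 FLUSH         0b 46.403s     22232930059574ns
"""
    for row in parse_disk_deadlines(sample):
        print(row)
```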

This set is sent in the hope that it might be useful.

Signed-off-by: Raushaniya Maksudova <rmaksudova@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>

Raushaniya Maksudova (5):
  add QEMU style defines for __sync_add_and_fetch
  disk_deadlines: add request to resume Virtual Machine
  disk_deadlines: add disk-deadlines option per drive
  disk_deadlines: add control of requests time expiration
  disk_deadlines: add info disk-deadlines option

 block/Makefile.objs            |   1 +
 block/accounting.c             |   8 ++
 block/disk-deadlines.c         | 280 +++++++++++++++++++++++++++++++++++++++++
 blockdev.c                     |  20 +++
 hmp.c                          |  37 ++++++
 hmp.h                          |   1 +
 include/block/accounting.h     |   2 +
 include/block/disk-deadlines.h |  48 +++++++
 include/qemu/atomic.h          |   3 +
 include/sysemu/sysemu.h        |   1 +
 monitor.c                      |   7 ++
 qapi-schema.json               |  33 +++++
 stubs/vm-stop.c                |   5 +
 vl.c                           |  18 +++
 14 files changed, 464 insertions(+)
 create mode 100644 block/disk-deadlines.c
 create mode 100644 include/block/disk-deadlines.h

-- 
2.1.4

Thread overview: 48+ messages
2015-09-08  8:00 [Qemu-devel] [PATCH RFC 0/5] disk deadlines Denis V. Lunev
2015-09-08  8:00 ` [Qemu-devel] [PATCH 1/5] add QEMU style defines for __sync_add_and_fetch Denis V. Lunev
2015-09-10  8:19   ` Stefan Hajnoczi
2015-09-08  8:00 ` [Qemu-devel] [PATCH 2/5] disk_deadlines: add request to resume Virtual Machine Denis V. Lunev
2015-09-10  8:51   ` Stefan Hajnoczi
2015-09-10 19:18     ` Denis V. Lunev
2015-09-14 16:46       ` Stefan Hajnoczi
2015-09-08  8:00 ` [Qemu-devel] [PATCH 3/5] disk_deadlines: add disk-deadlines option per drive Denis V. Lunev
2015-09-10  9:05   ` Stefan Hajnoczi
2015-09-08  8:00 ` [Qemu-devel] [PATCH 4/5] disk_deadlines: add control of requests time expiration Denis V. Lunev
2015-09-08  9:35   ` Fam Zheng
2015-09-08  9:42     ` Denis V. Lunev
2015-09-08 11:06   ` Kevin Wolf
2015-09-08 11:27     ` Denis V. Lunev
2015-09-08 13:05       ` Kevin Wolf
2015-09-08 14:23         ` Denis V. Lunev
2015-09-08 14:48           ` Kevin Wolf
2015-09-10 10:27             ` Stefan Hajnoczi
2015-09-10 11:39               ` Kevin Wolf
2015-09-14 16:53                 ` Stefan Hajnoczi
2015-09-25 12:34               ` Dr. David Alan Gilbert
2015-09-28 12:42                 ` Stefan Hajnoczi
2015-09-28 13:55                   ` Dr. David Alan Gilbert
2015-09-08  8:00 ` [Qemu-devel] [PATCH 5/5] disk_deadlines: add info disk-deadlines option Denis V. Lunev
2015-09-08 16:20   ` Eric Blake
2015-09-08 16:26     ` Eric Blake
2015-09-10 18:53       ` Denis V. Lunev
2015-09-10 19:13     ` Denis V. Lunev
2015-09-08  8:58 ` [Qemu-devel] [PATCH RFC 0/5] disk deadlines Vasiliy Tolstov
2015-09-08  9:20 ` Fam Zheng
2015-09-08 10:11   ` Kevin Wolf
2015-09-08 10:13     ` Denis V. Lunev
2015-09-08 10:20     ` Fam Zheng
2015-09-08 10:46       ` Denis V. Lunev
2015-09-08 10:49       ` Kevin Wolf
2015-09-08 13:20         ` Fam Zheng
2015-09-08  9:33 ` Paolo Bonzini
2015-09-08  9:41   ` Denis V. Lunev
2015-09-08  9:43     ` Paolo Bonzini
2015-09-08 10:37     ` Andrey Korolyov
2015-09-08 10:50       ` Denis V. Lunev
2015-09-08 10:07   ` Kevin Wolf
2015-09-08 10:08     ` Denis V. Lunev
2015-09-08 10:22   ` Stefan Hajnoczi
2015-09-08 10:26     ` Paolo Bonzini
2015-09-08 10:36     ` Denis V. Lunev
2015-09-08 19:11 ` John Snow
2015-09-10 19:29 ` [Qemu-devel] Summary: " Denis V. Lunev
