* [Qemu-devel] [PATCH 0/2] Enable qemu block layer to not flush v2 @ 2010-05-12 23:36 Alexander Graf 2010-05-12 23:36 ` [Qemu-devel] [PATCH 1/2] Add no-op aio emulation stub Alexander Graf 0 siblings, 1 reply; 5+ messages in thread From: Alexander Graf @ 2010-05-12 23:36 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, hch Thanks to recent improvements, qemu flushes guest data to disk when the guest tells us to do so. This is great if we care about data consistency on host disk failures. In cases where we don't it just creates additional overhead for no net win. One such use case is the building of appliances in SUSE Studio. We write the resulting images out of the build VM, but compress it directly afterwards. So if possible we'd love to keep it in RAM. This patchset introduces a new cache option to -drive called "volatile" which allows a user to signal qemu that he doesn't care about data integrity. To show the difference in performance this makes, I have put together a small test case. Inside the initrd, I call the following piece of code on a 500MB preallocated vmdk image: mkfs.ext3 /dev/vda mkdir -p /mnt mount /dev/vda /mnt dd if=/dev/zero of=/mnt/test bs=1M umount /mnt sync halt -fp With cache=writeback real 0m34.653s user 0m16.957s sys 0m5.872s With cache=volatile real 0m27.068s user 0m15.829s sys 0m6.648s When I use a loopback file with msdos format and put the vmdk inside there, performance for cache=writeback and cache=volatile is basically identical. It nevertheless does make sense to offer a cache=volatile option to the user, as it's not always possible to loopback mount your own file systems, especially given qemu's non-root nature. v1 -> v2: - clean up no_op patch - move to cache=volatile instead of flush=off - make code less intrusive by catching everything in block.c Alexander Graf (2): Add no-op aio emulation stub Add cache=volatile parameter to -drive block.c | 28 ++++++++++++++++++++++++++++ block.h | 1 + qemu-config.c | 2 +- qemu-options.hx | 7 ++++--- vl.c | 3 +++ 5 files changed, 37 insertions(+), 4 deletions(-) ^ permalink raw reply [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH 1/2] Add no-op aio emulation stub 2010-05-12 23:36 [Qemu-devel] [PATCH 0/2] Enable qemu block layer to not flush v2 Alexander Graf @ 2010-05-12 23:36 ` Alexander Graf 2010-05-12 23:36 ` [Qemu-devel] [PATCH 2/2] Add cache=volatile parameter to -drive Alexander Graf 2010-05-14 13:12 ` [Qemu-devel] Re: [PATCH 1/2] Add no-op aio emulation stub Kevin Wolf 0 siblings, 2 replies; 5+ messages in thread From: Alexander Graf @ 2010-05-12 23:36 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, hch We need to be able to do nothing in AIO fashion. Since I suspect this could be useful for more cases than the non flushing, I figured I'd create a new function that does everything AIO-like, but doesn't do anything. Signed-off-by: Alexander Graf <agraf@suse.de> --- v1 -> v2: - remove stray define - make aio noop handler static - remove header define --- block.c | 18 ++++++++++++++++++ 1 files changed, 18 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 48305b7..00e9b6b 100644 --- a/block.c +++ b/block.c @@ -2196,6 +2196,24 @@ static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs, return &acb->common; } +static BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs, + BlockDriverCompletionFunc *cb, void *opaque) +{ + BlockDriverAIOCBSync *acb; + + acb = qemu_aio_get(&bdrv_em_aio_pool, bs, cb, opaque); + acb->is_write = 1; /* don't bounce in the completion hadler */ + acb->qiov = NULL; + acb->bounce = NULL; + acb->ret = 0; + + if (!acb->bh) + acb->bh = qemu_bh_new(bdrv_aio_bh_cb, acb); + + qemu_bh_schedule(acb->bh); + return &acb->common; +} + /**************************************************************/ /* sync block device emulation */ -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH 2/2] Add cache=volatile parameter to -drive 2010-05-12 23:36 ` [Qemu-devel] [PATCH 1/2] Add no-op aio emulation stub Alexander Graf @ 2010-05-12 23:36 ` Alexander Graf 2010-05-14 13:12 ` [Qemu-devel] Re: [PATCH 1/2] Add no-op aio emulation stub Kevin Wolf 1 sibling, 0 replies; 5+ messages in thread From: Alexander Graf @ 2010-05-12 23:36 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, hch Usually the guest can tell the host to flush data to disk. In some cases we don't want to flush though, but try to keep everything in cache. So let's add a new cache value to -drive that allows us to set the cache policy to most aggressive, disabling flushes. We call this mode "volatile", as guest data is not guaranteed to survive host crashes anymore. Signed-off-by: Alexander Graf <agraf@suse.de> --- block.c | 10 ++++++++++ block.h | 1 + qemu-config.c | 2 +- qemu-options.hx | 7 ++++--- vl.c | 3 +++ 5 files changed, 19 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index 00e9b6b..b742965 100644 --- a/block.c +++ b/block.c @@ -50,6 +50,8 @@ static BlockDriverAIOCB *bdrv_aio_writev_em(BlockDriverState *bs, BlockDriverCompletionFunc *cb, void *opaque); static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs, BlockDriverCompletionFunc *cb, void *opaque); +static BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs, + BlockDriverCompletionFunc *cb, void *opaque); static int bdrv_read_em(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors); static int bdrv_write_em(BlockDriverState *bs, int64_t sector_num, @@ -1306,6 +1308,10 @@ const char *bdrv_get_device_name(BlockDriverState *bs) void bdrv_flush(BlockDriverState *bs) { + if (bs->open_flags & BDRV_O_NO_FLUSH) { + return; + } + if (bs->drv && bs->drv->bdrv_flush) bs->drv->bdrv_flush(bs); } @@ -2082,6 +2088,10 @@ BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs, { BlockDriver *drv = bs->drv; + if (bs->open_flags & BDRV_O_NO_FLUSH) { + return bdrv_aio_noop_em(bs, cb, opaque); + } + if (!drv) return NULL; return drv->bdrv_aio_flush(bs, cb, opaque); diff --git a/block.h b/block.h index f87d24e..8032b6b 100644 --- a/block.h +++ b/block.h @@ -33,6 +33,7 @@ typedef struct QEMUSnapshotInfo { #define BDRV_O_CACHE_WB 0x0040 /* use write-back caching */ #define BDRV_O_NATIVE_AIO 0x0080 /* use native AIO instead of the thread pool */ #define BDRV_O_NO_BACKING 0x0100 /* don't open the backing file */ +#define BDRV_O_NO_FLUSH 0x0200 /* disable flushing on this disk */ #define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_CACHE_WB) diff --git a/qemu-config.c b/qemu-config.c index d500885..bf3d493 100644 --- a/qemu-config.c +++ b/qemu-config.c @@ -54,7 +54,7 @@ QemuOptsList qemu_drive_opts = { },{ .name = "cache", .type = QEMU_OPT_STRING, - .help = "host cache usage (none, writeback, writethrough)", + .help = "host cache usage (none, writeback, writethrough, volatile)", },{ .name = "aio", .type = QEMU_OPT_STRING, diff --git a/qemu-options.hx b/qemu-options.hx index 12f6b51..0e42ee4 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -118,8 +118,9 @@ ETEXI DEF("drive", HAS_ARG, QEMU_OPTION_drive, "-drive [file=file][,if=type][,bus=n][,unit=m][,media=d][,index=i]\n" " [,cyls=c,heads=h,secs=s[,trans=t]][,snapshot=on|off]\n" - " [,cache=writethrough|writeback|none][,format=f][,serial=s]\n" - " [,addr=A][,id=name][,aio=threads|native][,readonly=on|off]\n" + " [,cache=writethrough|writeback|volatile|none][,format=f]\n" + " [,serial=s][,addr=A][,id=name][,aio=threads|native]\n" + " [,readonly=on|off]\n" " use 'file' as a drive image\n", QEMU_ARCH_ALL) STEXI @item -drive @var{option}[,@var{option}[,@var{option}[,...]]] @@ -148,7 +149,7 @@ These options have the same definition as they have in @option{-hdachs}. @item snapshot=@var{snapshot} @var{snapshot} is "on" or "off" and allows to enable snapshot for given drive (see @option{-snapshot}). @item cache=@var{cache} -@var{cache} is "none", "writeback", or "writethrough" and controls how the host cache is used to access block data. +@var{cache} is "none", "writeback", "volatile", or "writethrough" and controls how the host cache is used to access block data. @item aio=@var{aio} @var{aio} is "threads", or "native" and selects between pthread based disk I/O and native Linux AIO. @item format=@var{format} diff --git a/vl.c b/vl.c index 85bcc84..c8abce6 100644 --- a/vl.c +++ b/vl.c @@ -913,6 +913,9 @@ DriveInfo *drive_init(QemuOpts *opts, void *opaque, bdrv_flags |= BDRV_O_NOCACHE; } else if (!strcmp(buf, "writeback")) { bdrv_flags |= BDRV_O_CACHE_WB; + } else if (!strcmp(buf, "volatile")) { + bdrv_flags |= BDRV_O_CACHE_WB; + bdrv_flags |= BDRV_O_NO_FLUSH; } else if (!strcmp(buf, "writethrough")) { /* this is the default */ } else { -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Qemu-devel] Re: [PATCH 1/2] Add no-op aio emulation stub 2010-05-12 23:36 ` [Qemu-devel] [PATCH 1/2] Add no-op aio emulation stub Alexander Graf 2010-05-12 23:36 ` [Qemu-devel] [PATCH 2/2] Add cache=volatile parameter to -drive Alexander Graf @ 2010-05-14 13:12 ` Kevin Wolf 1 sibling, 0 replies; 5+ messages in thread From: Kevin Wolf @ 2010-05-14 13:12 UTC (permalink / raw) To: Alexander Graf; +Cc: qemu-devel, hch Am 13.05.2010 01:36, schrieb Alexander Graf: > We need to be able to do nothing in AIO fashion. Since I suspect this > could be useful for more cases than the non flushing, I figured I'd > create a new function that does everything AIO-like, but doesn't do > anything. > > Signed-off-by: Alexander Graf <agraf@suse.de> This isn't bisectable: cc1: warnings being treated as errors block.c:2208: error: 'bdrv_aio_noop_em' defined but not used Maybe put everything in one patch, it's small enough now. Also, in qemu-options.hx you could explain what volatile actually means. Below the line that you change there are some paragraphs that explain the behaviour for each value. Other than that the series looks good to me. I'm not sure if Anthony is already convinced, though. Everyone else seems to agree with the intention of the series. Kevin ^ permalink raw reply [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH 0/2] Enable qemu block layer to not flush @ 2010-05-10 21:51 Alexander Graf 2010-05-10 21:51 ` [Qemu-devel] [PATCH 1/2] Add no-op aio emulation stub Alexander Graf 0 siblings, 1 reply; 5+ messages in thread From: Alexander Graf @ 2010-05-10 21:51 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, hch Thanks to recent improvements, qemu flushes guest data to disk when the guest tells us to do so. This is great if we care about data consistency on host disk failures. In cases where we don't it just creates additional overhead for no net win. One such use case is the building of appliances in SUSE Studio. We write the resulting images out of the build VM, but compress it directly afterwards. So if possible we'd love to keep it in RAM. This patchset introduces a new block parameter to -drive called "flush" which allows a user to disable flushing in odd scenarios like the above. To show the difference in performance this makes, I have put together a small test case. Inside the initrd, I call the following piece of code on a 500MB preallocated vmdk image: mkfs.ext3 /dev/vda mkdir -p /mnt mount /dev/vda /mnt dd if=/dev/zero of=/mnt/test bs=1M umount /mnt sync halt -fp With flush=on (default) real 0m33.597s user 0m16.453s sys 0m6.192s With flush=off real 0m27.150s user 0m16.533s sys 0m5.348s Alexander Graf (2): Add no-op aio emulation stub Add flush=off parameter to -drive block.c | 18 ++++++++++++++++++ block.h | 5 +++++ block/raw-posix.c | 13 +++++++++++++ qemu-config.c | 3 +++ qemu-options.hx | 3 +++ vl.c | 3 +++ 6 files changed, 45 insertions(+), 0 deletions(-) ^ permalink raw reply [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH 1/2] Add no-op aio emulation stub 2010-05-10 21:51 [Qemu-devel] [PATCH 0/2] Enable qemu block layer to not flush Alexander Graf @ 2010-05-10 21:51 ` Alexander Graf 2010-05-11 8:29 ` [Qemu-devel] " Kevin Wolf 0 siblings, 1 reply; 5+ messages in thread From: Alexander Graf @ 2010-05-10 21:51 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, hch We need to be able to do nothing in AIO fashion. Since I suspect this could be useful for more cases than the non flushing, I figured I'd create a new function that does everything AIO-like, but doesn't do anything. Signed-off-by: Alexander Graf <agraf@suse.de> --- block.c | 18 ++++++++++++++++++ block.h | 5 +++++ 2 files changed, 23 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index 48305b7..1cd39d7 100644 --- a/block.c +++ b/block.c @@ -2196,6 +2196,24 @@ static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs, return &acb->common; } +BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs, + BlockDriverCompletionFunc *cb, void *opaque) +{ + BlockDriverAIOCBSync *acb; + + acb = qemu_aio_get(&bdrv_em_aio_pool, bs, cb, opaque); + acb->is_write = 1; /* don't bounce in the completion hadler */ + acb->qiov = NULL; + acb->bounce = NULL; + acb->ret = 0; + + if (!acb->bh) + acb->bh = qemu_bh_new(bdrv_aio_bh_cb, acb); + + qemu_bh_schedule(acb->bh); + return &acb->common; +} + /**************************************************************/ /* sync block device emulation */ diff --git a/block.h b/block.h index f87d24e..bef6358 100644 --- a/block.h +++ b/block.h @@ -33,6 +33,7 @@ typedef struct QEMUSnapshotInfo { #define BDRV_O_CACHE_WB 0x0040 /* use write-back caching */ #define BDRV_O_NATIVE_AIO 0x0080 /* use native AIO instead of the thread pool */ #define BDRV_O_NO_BACKING 0x0100 /* don't open the backing file */ +#define BDRV_O_NOFLUSH 0x0200 /* don't flush the image ever */ #define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_CACHE_WB) @@ -97,6 +98,10 @@ BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs, BlockDriverCompletionFunc *cb, void *opaque); void bdrv_aio_cancel(BlockDriverAIOCB *acb); +/* Emulate a no-op */ +BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs, + BlockDriverCompletionFunc *cb, void *opaque); + typedef struct BlockRequest { /* Fields to be filled by multiwrite caller */ int64_t sector; -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Qemu-devel] Re: [PATCH 1/2] Add no-op aio emulation stub 2010-05-10 21:51 ` [Qemu-devel] [PATCH 1/2] Add no-op aio emulation stub Alexander Graf @ 2010-05-11 8:29 ` Kevin Wolf 0 siblings, 0 replies; 5+ messages in thread From: Kevin Wolf @ 2010-05-11 8:29 UTC (permalink / raw) To: Alexander Graf; +Cc: qemu-devel, hch Am 10.05.2010 23:51, schrieb Alexander Graf: > We need to be able to do nothing in AIO fashion. Since I suspect this > could be useful for more cases than the non flushing, I figured I'd > create a new function that does everything AIO-like, but doesn't do > anything. > > Signed-off-by: Alexander Graf <agraf@suse.de> > --- > block.c | 18 ++++++++++++++++++ > block.h | 5 +++++ > 2 files changed, 23 insertions(+), 0 deletions(-) > > diff --git a/block.c b/block.c > index 48305b7..1cd39d7 100644 > --- a/block.c > +++ b/block.c > @@ -2196,6 +2196,24 @@ static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs, > return &acb->common; > } > > +BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs, > + BlockDriverCompletionFunc *cb, void *opaque) > +{ > + BlockDriverAIOCBSync *acb; > + > + acb = qemu_aio_get(&bdrv_em_aio_pool, bs, cb, opaque); > + acb->is_write = 1; /* don't bounce in the completion hadler */ Typo in the comment. > + acb->qiov = NULL; > + acb->bounce = NULL; > + acb->ret = 0; > + > + if (!acb->bh) > + acb->bh = qemu_bh_new(bdrv_aio_bh_cb, acb); > + > + qemu_bh_schedule(acb->bh); > + return &acb->common; > +} > + > /**************************************************************/ > /* sync block device emulation */ > > diff --git a/block.h b/block.h > index f87d24e..bef6358 100644 > --- a/block.h > +++ b/block.h > @@ -33,6 +33,7 @@ typedef struct QEMUSnapshotInfo { > #define BDRV_O_CACHE_WB 0x0040 /* use write-back caching */ > #define BDRV_O_NATIVE_AIO 0x0080 /* use native AIO instead of the thread pool */ > #define BDRV_O_NO_BACKING 0x0100 /* don't open the backing file */ > +#define BDRV_O_NOFLUSH 0x0200 /* don't flush the image ever */ > > #define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_CACHE_WB) This hunk should be in patch 2/2. > > @@ -97,6 +98,10 @@ BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs, > BlockDriverCompletionFunc *cb, void *opaque); > void bdrv_aio_cancel(BlockDriverAIOCB *acb); > > +/* Emulate a no-op */ > +BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs, > + BlockDriverCompletionFunc *cb, void *opaque); > + > typedef struct BlockRequest { > /* Fields to be filled by multiwrite caller */ > int64_t sector; I think exporting this function shouldn't be necessary. Everything that deals with AIO emulation should be contained in block.c. Kevin ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-05-14 13:13 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-05-12 23:36 [Qemu-devel] [PATCH 0/2] Enable qemu block layer to not flush v2 Alexander Graf 2010-05-12 23:36 ` [Qemu-devel] [PATCH 1/2] Add no-op aio emulation stub Alexander Graf 2010-05-12 23:36 ` [Qemu-devel] [PATCH 2/2] Add cache=volatile parameter to -drive Alexander Graf 2010-05-14 13:12 ` [Qemu-devel] Re: [PATCH 1/2] Add no-op aio emulation stub Kevin Wolf -- strict thread matches above, loose matches on Subject: below -- 2010-05-10 21:51 [Qemu-devel] [PATCH 0/2] Enable qemu block layer to not flush Alexander Graf 2010-05-10 21:51 ` [Qemu-devel] [PATCH 1/2] Add no-op aio emulation stub Alexander Graf 2010-05-11 8:29 ` [Qemu-devel] " Kevin Wolf
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).