qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 0/2] eliminate data write in bdrv_write_zeroes on Linux
@ 2014-12-25  5:37 Denis V. Lunev
  2014-12-25  5:37 ` [Qemu-devel] [PATCH 1/2] block: use fallocate(FALLOC_FL_ZERO_RANGE) in handle_aiocb_write_zeroes Denis V. Lunev
  2014-12-25  5:37 ` [Qemu-devel] [PATCH 2/2] block: use fallocate(FALLOC_FL_PUNCH_HOLE) & fallocate(0) to write zeroes Denis V. Lunev
  0 siblings, 2 replies; 5+ messages in thread
From: Denis V. Lunev @ 2014-12-25  5:37 UTC (permalink / raw)
  Cc: Kevin Wolf, Denis V. Lunev, qemu-devel, Stefan Hajnoczi

These patches eliminate data writes entirely on Linux when the
underlying filesystem supports fallocate() with FALLOC_FL_ZERO_RANGE
or FALLOC_FL_PUNCH_HOLE. This should significantly improve
performance in some cases.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>


* [Qemu-devel] [PATCH 1/2] block: use fallocate(FALLOC_FL_ZERO_RANGE) in handle_aiocb_write_zeroes
  2014-12-25  5:37 [Qemu-devel] [PATCH 0/2] eliminate data write in bdrv_write_zeroes on Linux Denis V. Lunev
@ 2014-12-25  5:37 ` Denis V. Lunev
  2014-12-26  8:45   ` Roman Kagan
  2014-12-25  5:37 ` [Qemu-devel] [PATCH 2/2] block: use fallocate(FALLOC_FL_PUNCH_HOLE) & fallocate(0) to write zeroes Denis V. Lunev
  1 sibling, 1 reply; 5+ messages in thread
From: Denis V. Lunev @ 2014-12-25  5:37 UTC (permalink / raw)
  Cc: Kevin Wolf, Denis V. Lunev, qemu-devel, Stefan Hajnoczi

This efficiently writes zeroes in the middle of the file on Linux
systems, provided the kernel supports FALLOC_FL_ZERO_RANGE.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/raw-posix.c | 11 +++++++++++
 configure         | 19 +++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index e51293a..9e66cb7 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -919,6 +919,17 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
             return xfs_write_zeroes(s, aiocb->aio_offset, aiocb->aio_nbytes);
         }
 #endif
+
+#ifdef CONFIG_FALLOCATE_ZERO_RANGE
+        do {
+            if (fallocate(s->fd, CONFIG_FALLOCATE_ZERO_RANGE,
+                          aiocb->aio_offset, aiocb->aio_nbytes) == 0) {
+                return 0;
+            }
+        } while (errno == EINTR);
+
+        ret = -errno;
+#endif
     }
 
     if (ret == -ENODEV || ret == -ENOSYS || ret == -EOPNOTSUPP ||
diff --git a/configure b/configure
index cae588c..dfcf7b3 100755
--- a/configure
+++ b/configure
@@ -3309,6 +3309,22 @@ if compile_prog "" "" ; then
   fallocate_punch_hole=yes
 fi
 
+# check that fallocate supports range zeroing inside the file
+fallocate_zero_range=no
+cat > $TMPC << EOF
+#include <fcntl.h>
+#include <linux/falloc.h>
+
+int main(void)
+{
+    fallocate(0, FALLOC_FL_ZERO_RANGE, 0, 0);
+    return 0;
+}
+EOF
+if compile_prog "" "" ; then
+  fallocate_zero_range=yes
+fi
+
 # check for posix_fallocate
 posix_fallocate=no
 cat > $TMPC << EOF
@@ -4538,6 +4554,9 @@ fi
 if test "$fallocate_punch_hole" = "yes" ; then
   echo "CONFIG_FALLOCATE_PUNCH_HOLE=y" >> $config_host_mak
 fi
+if test "$fallocate_zero_range" = "yes" ; then
+  echo "CONFIG_FALLOCATE_ZERO_RANGE=y" >> $config_host_mak
+fi
 if test "$posix_fallocate" = "yes" ; then
   echo "CONFIG_POSIX_FALLOCATE=y" >> $config_host_mak
 fi
-- 
1.9.1


* [Qemu-devel] [PATCH 2/2] block: use fallocate(FALLOC_FL_PUNCH_HOLE) & fallocate(0) to write zeroes
  2014-12-25  5:37 [Qemu-devel] [PATCH 0/2] eliminate data write in bdrv_write_zeroes on Linux Denis V. Lunev
  2014-12-25  5:37 ` [Qemu-devel] [PATCH 1/2] block: use fallocate(FALLOC_FL_ZERO_RANGE) in handle_aiocb_write_zeroes Denis V. Lunev
@ 2014-12-25  5:37 ` Denis V. Lunev
  2014-12-26  9:25   ` Roman Kagan
  1 sibling, 1 reply; 5+ messages in thread
From: Denis V. Lunev @ 2014-12-25  5:37 UTC (permalink / raw)
  Cc: Kevin Wolf, Denis V. Lunev, qemu-devel, Stefan Hajnoczi

This sequence works efficiently when FALLOC_FL_ZERO_RANGE is not supported.
The idea is that FALLOC_FL_PUNCH_HOLE cannot increase the file size, but it
does deallocate already allocated blocks inside the file, so that range
reads back as zeroes. If new blocks have to be created, a plain fallocate()
call does the job.

This should slightly improve performance on older kernels and on
filesystems that do not support FALLOC_FL_ZERO_RANGE.
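
For illustration only, the two-step sequence described above can be
sketched as a standalone program fragment. The names retry_fallocate(),
punch_then_allocate() and punch_then_allocate_selfcheck() are hypothetical
and do not appear in the patch, and unlike the patch this sketch retries
each fallocate() call on EINTR separately. The self-check assumes a
writable /tmp.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc: exposes fallocate() in <fcntl.h> */
#endif
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>       /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: fallocate() retried on EINTR; 0 or -errno. */
static int retry_fallocate(int fd, int mode, off_t offset, off_t len)
{
    int ret;

    do {
        ret = fallocate(fd, mode, offset, len);
    } while (ret == -1 && errno == EINTR);

    return ret == 0 ? 0 : -errno;
}

/*
 * Hypothetical helper: zero [offset, offset + len) without writing data.
 * Step 1 drops any blocks already allocated in the range; it cannot grow
 * the file. Step 2 allocates the range again, extending the file if the
 * range ends past the current end of file.
 */
static int punch_then_allocate(int fd, off_t offset, off_t len)
{
    int ret = retry_fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                              offset, len);
    if (ret < 0) {
        return ret;
    }
    return retry_fallocate(fd, 0, offset, len);
}

/*
 * Hypothetical self-check: write 8 KiB of 0xAA, then zero 8 KiB starting
 * at offset 4 KiB, so the range crosses the end of file.
 * Returns 1 if the file is 12 KiB with zeroes after the first 4 KiB,
 * 0 if the kernel or filesystem rejected the request, -1 otherwise.
 */
static int punch_then_allocate_selfcheck(void)
{
    char path[] = "/tmp/punch_alloc_XXXXXX";
    unsigned char buf[3 * 4096];
    struct stat st;
    int fd = mkstemp(path);
    int result = -1;
    size_t i;

    if (fd < 0) {
        return -1;
    }
    unlink(path);               /* file disappears once fd is closed */
    memset(buf, 0xAA, 2 * 4096);
    if (write(fd, buf, 2 * 4096) != 2 * 4096) {
        goto out;
    }
    if (punch_then_allocate(fd, 4096, 2 * 4096) < 0) {
        result = 0;             /* a caller would fall back to plain writes */
        goto out;
    }
    if (fstat(fd, &st) < 0 || st.st_size != 3 * 4096) {
        goto out;
    }
    if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf)) {
        goto out;
    }
    result = 1;
    for (i = 0; i < sizeof(buf); i++) {
        if (buf[i] != (i < 4096 ? 0xAA : 0x00)) {
            result = -1;
            break;
        }
    }
out:
    close(fd);
    return result;
}
```

Calling punch_then_allocate_selfcheck() from a small main() is enough to
see whether a given filesystem accepts the sequence.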

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/raw-posix.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 9e66cb7..60972a1 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -930,6 +930,18 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
 
         ret = -errno;
 #endif
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
+        do {
+            if (fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+                          aiocb->aio_offset, aiocb->aio_nbytes) == 0 &&
+                fallocate(s->fd, 0,
+                          aiocb->aio_offset, aiocb->aio_nbytes) == 0) {
+                return 0;
+            }
+        } while (errno == EINTR);
+
+        ret = -errno;
+#endif
     }
 
     if (ret == -ENODEV || ret == -ENOSYS || ret == -EOPNOTSUPP ||
-- 
1.9.1


* Re: [Qemu-devel] [PATCH 1/2] block: use fallocate(FALLOC_FL_ZERO_RANGE) in handle_aiocb_write_zeroes
  2014-12-25  5:37 ` [Qemu-devel] [PATCH 1/2] block: use fallocate(FALLOC_FL_ZERO_RANGE) in handle_aiocb_write_zeroes Denis V. Lunev
@ 2014-12-26  8:45   ` Roman Kagan
  0 siblings, 0 replies; 5+ messages in thread
From: Roman Kagan @ 2014-12-26  8:45 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

On Thu, Dec 25, 2014 at 08:37:29AM +0300, Denis V. Lunev wrote:
> +#ifdef CONFIG_FALLOCATE_ZERO_RANGE
> +        do {
> +            if (fallocate(s->fd, CONFIG_FALLOCATE_ZERO_RANGE,

Must be a typo, FALLOC_FL_ZERO_RANGE is what you mean.
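
As a rough sketch of the call with the kernel flag in place of the build
macro (zero_range() and zero_range_selfcheck() are hypothetical names,
not part of the patch, and the self-check assumes a writable /tmp):

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc: exposes fallocate() in <fcntl.h> */
#endif
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>       /* FALLOC_FL_ZERO_RANGE */
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: zero [offset, offset + len) of fd without writing
 * data. Returns 0 on success or -errno on failure.
 */
static int zero_range(int fd, off_t offset, off_t len)
{
    int ret;

    do {
        /* The mode is the kernel flag, not the CONFIG_* build macro. */
        ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len);
    } while (ret == -1 && errno == EINTR);

    return ret == 0 ? 0 : -errno;
}

/*
 * Hypothetical self-check: fill a 12 KiB temp file with 0xAA and zero its
 * middle 4 KiB. Returns 1 if only that range reads back as zeroes, 0 if
 * the kernel or filesystem rejected the request, -1 on any other error.
 */
static int zero_range_selfcheck(void)
{
    char path[] = "/tmp/zero_range_XXXXXX";
    unsigned char buf[3 * 4096];
    int fd = mkstemp(path);
    int result = -1;
    size_t i;

    if (fd < 0) {
        return -1;
    }
    unlink(path);               /* file disappears once fd is closed */
    memset(buf, 0xAA, sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
        goto out;
    }
    if (zero_range(fd, 4096, 4096) < 0) {
        result = 0;             /* a caller would try another method */
        goto out;
    }
    if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf)) {
        goto out;
    }
    result = 1;
    for (i = 0; i < sizeof(buf); i++) {
        unsigned char want = (i >= 4096 && i < 2 * 4096) ? 0x00 : 0xAA;
        if (buf[i] != want) {
            result = -1;
            break;
        }
    }
out:
    close(fd);
    return result;
}
```

A result of 0 from zero_range_selfcheck() only means this filesystem does
not take the flag, which is the case the second patch is aimed at.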

Roman.


* Re: [Qemu-devel] [PATCH 2/2] block: use fallocate(FALLOC_FL_PUNCH_HOLE) & fallocate(0) to write zeroes
  2014-12-25  5:37 ` [Qemu-devel] [PATCH 2/2] block: use fallocate(FALLOC_FL_PUNCH_HOLE) & fallocate(0) to write zeroes Denis V. Lunev
@ 2014-12-26  9:25   ` Roman Kagan
  0 siblings, 0 replies; 5+ messages in thread
From: Roman Kagan @ 2014-12-26  9:25 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

On Thu, Dec 25, 2014 at 08:37:30AM +0300, Denis V. Lunev wrote:
> This sequence works efficiently if FALLOC_FL_ZERO_RANGE is not supported.
> The idea is that FALLOC_FL_PUNCH_HOLE could not increase file size
> but it cleans already allocated blocks inside the file. If we have to
> create something new, simple fallocate will do the job.
> 
> This should increase performance a bit for not-so-modern kernels or for
> filesystems which do not support FALLOC_FL_ZERO_RANGE.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/raw-posix.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index 9e66cb7..60972a1 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -930,6 +930,18 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
>  
>          ret = -errno;
>  #endif
> +#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
> +        do {
> +            if (fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> +                          aiocb->aio_offset, aiocb->aio_nbytes) == 0 &&
> +                fallocate(s->fd, 0,
> +                          aiocb->aio_offset, aiocb->aio_nbytes) == 0) {
> +                return 0;
> +            }
> +        } while (errno == EINTR);
> +
> +        ret = -errno;
> +#endif

This is suboptimal in that fallocate(FALLOC_FL_ZERO_RANGE) would always
be called in vain for such systems.  Might be worth another flag in
BDRVRawState?
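
A minimal sketch of what such a flag could look like, outside of QEMU.
ZeroState is a hypothetical stand-in for BDRVRawState, and its field
names, the attempt counter (there only to make the effect observable) and
the choice of EOPNOTSUPP/ENOSYS as the "unsupported" signals are all
assumptions of this sketch:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc: exposes fallocate() in <fcntl.h> */
#endif
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical stand-in for the relevant part of BDRVRawState. */
typedef struct ZeroState {
    int fd;
    bool zero_range_unsupported;    /* set once ZERO_RANGE is rejected */
    unsigned zero_range_attempts;   /* illustration only */
} ZeroState;

/* fallocate() retried on EINTR; returns 0 or -errno. */
static int retry_fallocate(int fd, int mode, off_t offset, off_t len)
{
    int ret;

    do {
        ret = fallocate(fd, mode, offset, len);
    } while (ret == -1 && errno == EINTR);

    return ret == 0 ? 0 : -errno;
}

/*
 * Try FALLOC_FL_ZERO_RANGE unless it is already known not to work, then
 * fall back to punching a hole and allocating the range again.
 */
static int write_zeroes(ZeroState *s, off_t offset, off_t len)
{
    int ret;

    if (!s->zero_range_unsupported) {
        s->zero_range_attempts++;
        ret = retry_fallocate(s->fd, FALLOC_FL_ZERO_RANGE, offset, len);
        if (ret != -EOPNOTSUPP && ret != -ENOSYS) {
            return ret;         /* success, or an error worth reporting */
        }
        /* Remember the rejection so later requests skip this call. */
        s->zero_range_unsupported = true;
    }

    ret = retry_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                          offset, len);
    if (ret == 0) {
        ret = retry_fallocate(s->fd, 0, offset, len);
    }
    return ret;
}
```

With the flag set, every later request goes straight to the fallback and
costs no extra syscall.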

Roman.
