qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH for-2.8 0/4] Allow 'cache-clean-interval' in Linux only
@ 2016-11-25 11:27 Alberto Garcia
From: Alberto Garcia @ 2016-11-25 11:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Alberto Garcia

Hi all,

The cache-clean-interval setting of qcow2 frees the memory of the L2
cache tables that haven't been used after a certain interval of time.

QEMU uses madvise() with MADV_DONTNEED for this. After that call, the
data in the specified cache tables is discarded by the kernel. The
problem with this behavior is that it is Linux-specific. madvise()
itself is not a standard system call and while other implementations
(e.g. FreeBSD) also have MADV_DONTNEED, they don't share the same
semantics.

POSIX defines posix_madvise(), which has POSIX_MADV_DONTNEED, and
that's what QEMU uses on systems that don't implement madvise().
However, POSIX_MADV_DONTNEED also has different semantics and cannot be
used for our purposes. As a matter of fact, in glibc it is a no-op:

https://github.molgen.mpg.de/git-mirror/glibc/blob/glibc-2.23/sysdeps/unix/sysv/linux/posix_madvise.c
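
The difference can be observed with a small test program. The following
is a minimal sketch, not QEMU code: the function names and the marker
value are made up for illustration. It fills one anonymous private page
with a marker byte, applies one of the two calls and reads the first
byte back. On Linux the expectation is 0 after madvise(), because the
page is dropped and replaced by a zero-filled one on the next access,
and the marker after posix_madvise() with glibc, because that call does
nothing. Other systems may well give different results.

```c
#define _DEFAULT_SOURCE /* madvise() and MAP_ANONYMOUS with glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MARKER 0xAB /* arbitrary non-zero byte, chosen for this sketch */

/* Map one anonymous private page filled with MARKER; NULL on failure. */
static unsigned char *map_marked_page(size_t *len)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p;

    assert(page > 0);
    *len = (size_t) page;
    p = mmap(NULL, *len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return NULL;
    }
    memset(p, MARKER, *len);
    return p;
}

/* First byte of the page after madvise(MADV_DONTNEED), -1 on error. */
int first_byte_after_madvise(void)
{
    size_t len;
    unsigned char *p = map_marked_page(&len);
    int result = -1;

    if (!p) {
        return -1;
    }
    if (madvise(p, len, MADV_DONTNEED) == 0) {
        result = p[0];
    }
    munmap(p, len);
    return result;
}

/* Same check for posix_madvise(POSIX_MADV_DONTNEED), -1 on error. */
int first_byte_after_posix_madvise(void)
{
    size_t len;
    unsigned char *p = map_marked_page(&len);
    int result = -1;

    if (!p) {
        return -1;
    }
    /* posix_madvise() returns 0 on success, an error number otherwise */
    if (posix_madvise(p, len, POSIX_MADV_DONTNEED) == 0) {
        result = p[0];
    }
    munmap(p, len);
    return result;
}
```

Only the first result matches what cache-clean-interval needs: the
memory is actually given back and the old contents are gone.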

So while all of this is mentioned in the QEMU documentation, there's
nothing preventing users of other systems from trying to use this
feature. On non-Linux systems it is worse than a no-op: it invalidates
perfectly valid cache tables for no reason without freeing their
memory.

This series makes Linux a hard requirement for cache-clean-interval
and prints an error message on other systems.

Regards,

Berto

Alberto Garcia (4):
  qcow2: Make qcow2_cache_table_release() work only in Linux
  qcow2: Allow 'cache-clean-interval' in Linux only
  qcow2: Remove stale comment
  docs: Specify that cache-clean-interval is only supported in Linux

 block/qcow2-cache.c  | 6 +++---
 block/qcow2.c        | 8 ++++++++
 docs/qcow2-cache.txt | 5 +++--
 3 files changed, 14 insertions(+), 5 deletions(-)

-- 
2.10.2


* [Qemu-devel] [PATCH 1/4] qcow2: Make qcow2_cache_table_release() work only in Linux
@ 2016-11-25 11:27 ` Alberto Garcia
From: Alberto Garcia @ 2016-11-25 11:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Alberto Garcia

We are using QEMU_MADV_DONTNEED to discard the memory of individual L2
cache tables. The problem with this is that those semantics are
specific to the Linux madvise() system call. Other implementations of
madvise() (and also posix_madvise(), even as implemented by glibc on
Linux) don't do that, so we cannot use them for the same purpose.

This patch makes the code Linux-specific and uses madvise() directly
since there's no point in going through qemu_madvise() for this.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cache.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 6eaefed..ab8ee2d 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -66,7 +66,8 @@ static inline int qcow2_cache_get_table_idx(BlockDriverState *bs,
 static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
                                       int i, int num_tables)
 {
-#if QEMU_MADV_DONTNEED != QEMU_MADV_INVALID
+/* Using MADV_DONTNEED to discard memory is a Linux-specific feature */
+#ifdef CONFIG_LINUX
     BDRVQcow2State *s = bs->opaque;
     void *t = qcow2_cache_get_table_addr(bs, c, i);
     int align = getpagesize();
@@ -74,7 +75,7 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
     size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
     size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
     if (length > 0) {
-        qemu_madvise((uint8_t *) t + offset, length, QEMU_MADV_DONTNEED);
+        madvise((uint8_t *) t + offset, length, MADV_DONTNEED);
     }
 #endif
 }
-- 
2.10.2
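
The hunk above passes only the page-aligned part of a cache table to
madvise(): the start address is rounded up to a page boundary and the
remaining size is rounded down to a whole number of pages. The
following is a minimal standalone sketch of that arithmetic. The
struct, the function names and the local align_up()/align_down()
helpers are hypothetical stand-ins for QEMU_ALIGN_UP/QEMU_ALIGN_DOWN,
and the guard against a table smaller than the initial offset is an
addition of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Part of a table that can be released, relative to the table start */
struct page_range {
    size_t offset; /* bytes to skip to reach the first page boundary */
    size_t length; /* whole pages only, may be 0 */
};

/* Hypothetical stand-ins for QEMU_ALIGN_DOWN and QEMU_ALIGN_UP */
static uintptr_t align_down(uintptr_t n, uintptr_t m)
{
    return n / m * m;
}

static uintptr_t align_up(uintptr_t n, uintptr_t m)
{
    return align_down(n + m - 1, m);
}

struct page_range releasable_range(uintptr_t addr, size_t mem_size,
                                   size_t align)
{
    struct page_range r = { 0, 0 };

    assert(align > 0);
    r.offset = (size_t) (align_up(addr, align) - addr);
    /* Extra guard (assumption of this sketch): avoid unsigned underflow */
    if (r.offset < mem_size) {
        r.length = (size_t) align_down(mem_size - r.offset, align);
    }
    return r;
}
```

For example, with 4 KiB pages a 0x3000-byte table starting at 0x1100
gives offset 0xF00 and length 0x2000, so only the two complete pages
between 0x2000 and 0x4000 would be released.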


* [Qemu-devel] [PATCH 2/4] qcow2: Allow 'cache-clean-interval' in Linux only
@ 2016-11-25 11:27 ` Alberto Garcia
From: Alberto Garcia @ 2016-11-25 11:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Alberto Garcia

The cache-clean-interval option of qcow2 only works on Linux. However,
we allow setting it on other systems regardless of whether it works or
not.

On those systems this option is not simply a no-op: it actually
invalidates perfectly valid cache tables for no good reason without
freeing their memory.

This patch forbids using that option on non-Linux systems.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7cfcd84..ed9e0f3 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -668,6 +668,14 @@ static int qcow2_update_options_prepare(BlockDriverState *bs,
     r->cache_clean_interval =
         qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL,
                             s->cache_clean_interval);
+#ifndef CONFIG_LINUX
+    if (r->cache_clean_interval != 0) {
+        error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL
+                   " not supported on this host");
+        ret = -EINVAL;
+        goto fail;
+    }
+#endif
     if (r->cache_clean_interval > UINT_MAX) {
         error_setg(errp, "Cache clean interval too big");
         ret = -EINVAL;
-- 
2.10.2
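
The effect of the new check can be tried outside QEMU with a small
standalone function. This is only a sketch under assumptions: the
function name and the error buffer are hypothetical (QEMU reports
errors through error_setg() and an Error object), and the compiler's
__linux__ macro stands in for QEMU's CONFIG_LINUX.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical standalone version of the check added by the patch.
 * Returns 0 if the interval is acceptable, -EINVAL otherwise, and
 * writes a message into err in the failure case.
 */
int check_cache_clean_interval(uint64_t interval, char *err,
                               size_t err_len)
{
    assert(err != NULL && err_len > 0);
    err[0] = '\0';
#ifndef __linux__ /* stand-in for CONFIG_LINUX in this sketch */
    if (interval != 0) {
        snprintf(err, err_len,
                 "cache-clean-interval not supported on this host");
        return -EINVAL;
    }
#endif
    if (interval > UINT_MAX) {
        snprintf(err, err_len, "Cache clean interval too big");
        return -EINVAL;
    }
    return 0;
}
```

A value of 0 keeps the feature disabled and is accepted everywhere,
which is why the check only rejects non-zero values on other hosts.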


* [Qemu-devel] [PATCH 3/4] qcow2: Remove stale comment
@ 2016-11-25 11:27 ` Alberto Garcia
From: Alberto Garcia @ 2016-11-25 11:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Alberto Garcia

We haven't been using CONFIG_MADVISE since commit 02d0e095031b7fda77de8b.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cache.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index ab8ee2d..1d25147 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -22,7 +22,6 @@
  * THE SOFTWARE.
  */
 
-/* Needed for CONFIG_MADVISE */
 #include "qemu/osdep.h"
 #include "block/block_int.h"
 #include "qemu-common.h"
-- 
2.10.2


* [Qemu-devel] [PATCH 4/4] docs: Specify that cache-clean-interval is only supported in Linux
@ 2016-11-25 11:27 ` Alberto Garcia
From: Alberto Garcia @ 2016-11-25 11:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz, Alberto Garcia

Make it clear that having Linux is a hard requirement for this
feature.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 docs/qcow2-cache.txt | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 5bb0607..1fdd6f9 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -160,5 +160,6 @@ If unset, the default value for this parameter is 0 and it disables
 this feature.
 
 Note that this functionality currently relies on the MADV_DONTNEED
-argument for madvise() to actually free the memory, so it is not
-useful in systems that don't follow that behavior.
+argument for madvise() to actually free the memory. This is a
+Linux-specific feature, so cache-clean-interval is not supported in
+other systems.
-- 
2.10.2


* Re: [Qemu-devel] [PATCH for-2.8 0/4] Allow 'cache-clean-interval' in Linux only
@ 2016-11-28 14:46 ` Kevin Wolf
From: Kevin Wolf @ 2016-11-28 14:46 UTC (permalink / raw)
  To: Alberto Garcia; +Cc: qemu-devel, qemu-block, Max Reitz

On 25.11.2016 at 12:27, Alberto Garcia wrote:
> Hi all,
> 
> The cache-clean-interval setting of qcow2 frees the memory of the L2
> cache tables that haven't been used after a certain interval of time.
> 
> QEMU uses madvise() with MADV_DONTNEED for this. After that call, the
> data in the specified cache tables is discarded by the kernel. The
> problem with this behavior is that it is Linux-specific. madvise()
> itself is not a standard system call and while other implementations
> (e.g. FreeBSD) also have MADV_DONTNEED, they don't share the same
> semantics.
> 
> POSIX defines posix_madvise(), which has POSIX_MADV_DONTNEED, and
> that's what QEMU uses in systems that don't implement madvise().
> However POSIX_MADV_DONTNEED also has different semantics and cannot be
> used for our purposes. As a matter of fact, in glibc it is a no-op:
> 
> https://github.molgen.mpg.de/git-mirror/glibc/blob/glibc-2.23/sysdeps/unix/sysv/linux/posix_madvise.c
> 
> So while this all is mentioned in the QEMU documentation, there's
> nothing preventing users of other systems from trying to use this
> feature. In non-Linux systems it is worse than a no-op: it invalidates
> perfectly valid cache tables for no reason without freeing their
> memory.
> 
> This series makes Linux a hard requirement for cache-clean-interval
> and prints an error message in other systems.

Thanks, applied to the block branch.

Kevin

