qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] backends/hostmem: Report more errors on failures
@ 2024-05-31  7:28 Michal Privoznik
  2024-05-31  7:28 ` [PATCH v2 1/4] osdep: Make qemu_madvise() to set errno in all cases Michal Privoznik
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Michal Privoznik @ 2024-05-31  7:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: david, imammedo

v2 of:

https://lists.gnu.org/archive/html/qemu-devel/2024-05/msg05659.html

diff to v1:
- patch 2/4 is new
- Errors are reported instead of warnings on failed qemu_madvise()
- Instead of rounding up value passed to qemu_madvise()/mbind() an error
  is reported

Michal Privoznik (4):
  osdep: Make qemu_madvise() to set errno in all cases
  osdep: Make qemu_madvise() return ENOSYS on unsupported OSes
  backends/hostmem: Report error on qemu_madvise() failures
  backends/hostmem: Report error when memory size is unaligned

 backends/hostmem.c | 44 ++++++++++++++++++++++++++++++++++++--------
 util/osdep.c       |  9 +++++++--
 2 files changed, 43 insertions(+), 10 deletions(-)

-- 
2.44.1



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/4] osdep: Make qemu_madvise() to set errno in all cases
  2024-05-31  7:28 [PATCH v2 0/4] backends/hostmem: Report more errors on failures Michal Privoznik
@ 2024-05-31  7:28 ` Michal Privoznik
  2024-05-31  7:57   ` Philippe Mathieu-Daudé
  2024-05-31  7:28 ` [PATCH v2 2/4] osdep: Make qemu_madvise() return ENOSYS on unsupported OSes Michal Privoznik
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Michal Privoznik @ 2024-05-31  7:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: david, imammedo

The unspoken premise of qemu_madvise() is that errno is set on
error. And it is mostly the case except for posix_madvise() which
is documented to return either zero (on success) or a positive
error number. This means, we must set errno ourselves. And while
at it, make the function return a negative value on error, just
like other error paths do.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
 util/osdep.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/util/osdep.c b/util/osdep.c
index e996c4744a..e42f4e8121 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -57,7 +57,12 @@ int qemu_madvise(void *addr, size_t len, int advice)
 #if defined(CONFIG_MADVISE)
     return madvise(addr, len, advice);
 #elif defined(CONFIG_POSIX_MADVISE)
-    return posix_madvise(addr, len, advice);
+    int rc = posix_madvise(addr, len, advice);
+    if (rc) {
+        errno = rc;
+        return -1;
+    }
+    return 0;
 #else
     errno = EINVAL;
     return -1;
-- 
2.44.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/4] osdep: Make qemu_madvise() return ENOSYS on unsupported OSes
  2024-05-31  7:28 [PATCH v2 0/4] backends/hostmem: Report more errors on failures Michal Privoznik
  2024-05-31  7:28 ` [PATCH v2 1/4] osdep: Make qemu_madvise() to set errno in all cases Michal Privoznik
@ 2024-05-31  7:28 ` Michal Privoznik
  2024-05-31  7:48   ` Philippe Mathieu-Daudé
  2024-05-31  7:53   ` David Hildenbrand
  2024-05-31  7:28 ` [PATCH v2 3/4] backends/hostmem: Report error on qemu_madvise() failures Michal Privoznik
  2024-05-31  7:29 ` [PATCH v2 4/4] backends/hostmem: Report error when memory size is unaligned Michal Privoznik
  3 siblings, 2 replies; 17+ messages in thread
From: Michal Privoznik @ 2024-05-31  7:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: david, imammedo

Not every OS is capable of madvise() or posix_madvise() even. In
that case, errno should be set to ENOSYS as it reflects the cause
better. This also mimic what madvise()/posix_madvise() would
return if kernel lacks corresponding syscall (e.g. due to
configuration).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 util/osdep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/osdep.c b/util/osdep.c
index e42f4e8121..5d23bbfbec 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -64,7 +64,7 @@ int qemu_madvise(void *addr, size_t len, int advice)
     }
     return 0;
 #else
-    errno = EINVAL;
+    errno = ENOSYS;
     return -1;
 #endif
 }
-- 
2.44.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 3/4] backends/hostmem: Report error on qemu_madvise() failures
  2024-05-31  7:28 [PATCH v2 0/4] backends/hostmem: Report more errors on failures Michal Privoznik
  2024-05-31  7:28 ` [PATCH v2 1/4] osdep: Make qemu_madvise() to set errno in all cases Michal Privoznik
  2024-05-31  7:28 ` [PATCH v2 2/4] osdep: Make qemu_madvise() return ENOSYS on unsupported OSes Michal Privoznik
@ 2024-05-31  7:28 ` Michal Privoznik
  2024-05-31  7:49   ` Philippe Mathieu-Daudé
  2024-05-31  9:10   ` David Hildenbrand
  2024-05-31  7:29 ` [PATCH v2 4/4] backends/hostmem: Report error when memory size is unaligned Michal Privoznik
  3 siblings, 2 replies; 17+ messages in thread
From: Michal Privoznik @ 2024-05-31  7:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: david, imammedo

If user sets .merge or .dump attributes qemu_madvise() is called
with corresponding advice. But it is never checked for failure
which may mislead users into thinking the attribute is set
correctly. Report an appropriate error.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 backends/hostmem.c | 36 ++++++++++++++++++++++++++++--------
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index eb9682b4a8..012a8c190f 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -178,8 +178,14 @@ static void host_memory_backend_set_merge(Object *obj, bool value, Error **errp)
         void *ptr = memory_region_get_ram_ptr(&backend->mr);
         uint64_t sz = memory_region_size(&backend->mr);
 
-        qemu_madvise(ptr, sz,
-                     value ? QEMU_MADV_MERGEABLE : QEMU_MADV_UNMERGEABLE);
+        if (qemu_madvise(ptr, sz,
+                         value ? QEMU_MADV_MERGEABLE : QEMU_MADV_UNMERGEABLE)) {
+            error_setg_errno(errp, errno,
+                             "Couldn't change property 'merge' on '%s'",
+                             object_get_typename(obj));
+            return;
+        }
+
         backend->merge = value;
     }
 }
@@ -204,8 +210,14 @@ static void host_memory_backend_set_dump(Object *obj, bool value, Error **errp)
         void *ptr = memory_region_get_ram_ptr(&backend->mr);
         uint64_t sz = memory_region_size(&backend->mr);
 
-        qemu_madvise(ptr, sz,
-                     value ? QEMU_MADV_DODUMP : QEMU_MADV_DONTDUMP);
+        if (qemu_madvise(ptr, sz,
+                     value ? QEMU_MADV_DODUMP : QEMU_MADV_DONTDUMP)) {
+            error_setg_errno(errp, errno,
+                             "Couldn't change property 'dump' on '%s'",
+                             object_get_typename(obj));
+            return;
+        }
+
         backend->dump = value;
     }
 }
@@ -337,11 +349,19 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
     ptr = memory_region_get_ram_ptr(&backend->mr);
     sz = memory_region_size(&backend->mr);
 
-    if (backend->merge) {
-        qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
+    if (backend->merge &&
+        qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE)) {
+        error_setg_errno(errp, errno,
+                         "Couldn't set property 'merge' on '%s'",
+                         object_get_typename(OBJECT(uc)));
+        return;
     }
-    if (!backend->dump) {
-        qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP);
+    if (!backend->dump &&
+        qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP)) {
+        error_setg_errno(errp, errno,
+                         "Couldn't set property 'dump' on '%s'",
+                         object_get_typename(OBJECT(uc)));
+        return;
     }
 #ifdef CONFIG_NUMA
     unsigned long lastbit = find_last_bit(backend->host_nodes, MAX_NODES);
-- 
2.44.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 4/4] backends/hostmem: Report error when memory size is unaligned
  2024-05-31  7:28 [PATCH v2 0/4] backends/hostmem: Report more errors on failures Michal Privoznik
                   ` (2 preceding siblings ...)
  2024-05-31  7:28 ` [PATCH v2 3/4] backends/hostmem: Report error on qemu_madvise() failures Michal Privoznik
@ 2024-05-31  7:29 ` Michal Privoznik
  2024-05-31  7:53   ` Philippe Mathieu-Daudé
  2024-06-05  7:12   ` Mario Casquero
  3 siblings, 2 replies; 17+ messages in thread
From: Michal Privoznik @ 2024-05-31  7:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: david, imammedo

If memory-backend-{file,ram} has a size that's not aligned to
underlying page size it is not only wasteful, but also may lead
to hard to debug behaviour. For instance, in case
memory-backend-file and hugepages, madvise() and mbind() fail.
Rightfully so, page is the smallest unit they can work with. And
even though an error is reported, the root cause it not very
clear:

  qemu-system-x86_64: Couldn't set property 'dump' on 'memory-backend-file': Invalid argument

After this commit:

  qemu-system-x86_64: backend memory size must be multiple of 0x200000

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 backends/hostmem.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 012a8c190f..5f9c9ea8f5 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -337,6 +337,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
     HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
     void *ptr;
     uint64_t sz;
+    size_t pagesize;
     bool async = !phase_check(PHASE_LATE_BACKENDS_CREATED);
 
     if (!bc->alloc) {
@@ -348,6 +349,13 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
 
     ptr = memory_region_get_ram_ptr(&backend->mr);
     sz = memory_region_size(&backend->mr);
+    pagesize = qemu_ram_pagesize(backend->mr.ram_block);
+
+    if (!QEMU_IS_ALIGNED(sz, pagesize)) {
+        error_setg(errp, "backend memory size must be multiple of 0x%"
+                   PRIx64, pagesize);
+        return;
+    }
 
     if (backend->merge &&
         qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE)) {
-- 
2.44.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/4] osdep: Make qemu_madvise() return ENOSYS on unsupported OSes
  2024-05-31  7:28 ` [PATCH v2 2/4] osdep: Make qemu_madvise() return ENOSYS on unsupported OSes Michal Privoznik
@ 2024-05-31  7:48   ` Philippe Mathieu-Daudé
  2024-05-31  7:53   ` David Hildenbrand
  1 sibling, 0 replies; 17+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-31  7:48 UTC (permalink / raw)
  To: Michal Privoznik, qemu-devel; +Cc: david, imammedo

On 31/5/24 09:28, Michal Privoznik wrote:
> Not every OS is capable of madvise() or posix_madvise() even. In
> that case, errno should be set to ENOSYS as it reflects the cause
> better. This also mimic what madvise()/posix_madvise() would
> return if kernel lacks corresponding syscall (e.g. due to
> configuration).
> 
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>   util/osdep.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/4] backends/hostmem: Report error on qemu_madvise() failures
  2024-05-31  7:28 ` [PATCH v2 3/4] backends/hostmem: Report error on qemu_madvise() failures Michal Privoznik
@ 2024-05-31  7:49   ` Philippe Mathieu-Daudé
  2024-05-31  9:10   ` David Hildenbrand
  1 sibling, 0 replies; 17+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-31  7:49 UTC (permalink / raw)
  To: Michal Privoznik, qemu-devel; +Cc: david, imammedo

On 31/5/24 09:28, Michal Privoznik wrote:
> If user sets .merge or .dump attributes qemu_madvise() is called
> with corresponding advice. But it is never checked for failure
> which may mislead users into thinking the attribute is set
> correctly. Report an appropriate error.
> 
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>   backends/hostmem.c | 36 ++++++++++++++++++++++++++++--------
>   1 file changed, 28 insertions(+), 8 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/4] osdep: Make qemu_madvise() return ENOSYS on unsupported OSes
  2024-05-31  7:28 ` [PATCH v2 2/4] osdep: Make qemu_madvise() return ENOSYS on unsupported OSes Michal Privoznik
  2024-05-31  7:48   ` Philippe Mathieu-Daudé
@ 2024-05-31  7:53   ` David Hildenbrand
  2024-05-31  8:02     ` Philippe Mathieu-Daudé
  1 sibling, 1 reply; 17+ messages in thread
From: David Hildenbrand @ 2024-05-31  7:53 UTC (permalink / raw)
  To: Michal Privoznik, qemu-devel; +Cc: imammedo

On 31.05.24 09:28, Michal Privoznik wrote:
> Not every OS is capable of madvise() or posix_madvise() even. In
> that case, errno should be set to ENOSYS as it reflects the cause
> better. This also mimic what madvise()/posix_madvise() would
> return if kernel lacks corresponding syscall (e.g. due to
> configuration).

Yes and no. According to the man page

" EINVAL advice is not a valid."

if a particular MADV_* call is not implemented, we would get EINVAL, 
which is really unfortunate ... to detect what is actually supported :(

For the patch here ENOSYS makes sense:

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/4] backends/hostmem: Report error when memory size is unaligned
  2024-05-31  7:29 ` [PATCH v2 4/4] backends/hostmem: Report error when memory size is unaligned Michal Privoznik
@ 2024-05-31  7:53   ` Philippe Mathieu-Daudé
  2024-06-05  7:12   ` Mario Casquero
  1 sibling, 0 replies; 17+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-31  7:53 UTC (permalink / raw)
  To: Michal Privoznik, qemu-devel; +Cc: david, imammedo

Hi Michal,

On 31/5/24 09:29, Michal Privoznik wrote:
> If memory-backend-{file,ram} has a size that's not aligned to
> underlying page size it is not only wasteful, but also may lead
> to hard to debug behaviour. For instance, in case
> memory-backend-file and hugepages, madvise() and mbind() fail.
> Rightfully so, page is the smallest unit they can work with. And
> even though an error is reported, the root cause it not very
> clear:
> 
>    qemu-system-x86_64: Couldn't set property 'dump' on 'memory-backend-file': Invalid argument
> 
> After this commit:
> 
>    qemu-system-x86_64: backend memory size must be multiple of 0x200000
> 
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>   backends/hostmem.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index 012a8c190f..5f9c9ea8f5 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -337,6 +337,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
>       HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
>       void *ptr;
>       uint64_t sz;
> +    size_t pagesize;
>       bool async = !phase_check(PHASE_LATE_BACKENDS_CREATED);
>   
>       if (!bc->alloc) {
> @@ -348,6 +349,13 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
>   
>       ptr = memory_region_get_ram_ptr(&backend->mr);
>       sz = memory_region_size(&backend->mr);
> +    pagesize = qemu_ram_pagesize(backend->mr.ram_block);
> +
> +    if (!QEMU_IS_ALIGNED(sz, pagesize)) {
> +        error_setg(errp, "backend memory size must be multiple of 0x%"
> +                   PRIx64, pagesize);

Can we use size_to_str() instead?

Also display backend name:

"backend '%s' memory size must be multiple of %s"

> +        return;
> +    }
>   
>       if (backend->merge &&
>           qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE)) {



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/4] osdep: Make qemu_madvise() to set errno in all cases
  2024-05-31  7:28 ` [PATCH v2 1/4] osdep: Make qemu_madvise() to set errno in all cases Michal Privoznik
@ 2024-05-31  7:57   ` Philippe Mathieu-Daudé
  2024-05-31  8:01     ` David Hildenbrand
  0 siblings, 1 reply; 17+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-31  7:57 UTC (permalink / raw)
  To: Michal Privoznik, qemu-devel; +Cc: david, imammedo

Hi Michal,

On 31/5/24 09:28, Michal Privoznik wrote:
> The unspoken premise of qemu_madvise() is that errno is set on
> error. And it is mostly the case except for posix_madvise() which
> is documented to return either zero (on success) or a positive
> error number.

Watch out, Linux:

   RETURN VALUE

      On success, posix_madvise() returns 0.  On failure,
      it returns a positive error number.

but on Darwin:

   RETURN VALUES

      Upon successful completion, a value of 0 is returned.
      Otherwise, a value of -1 is returned and errno is set
      to indicate the error.

(Haven't checked other POSIX OSes).

So we likely need more #ifdef'ry here.

> This means, we must set errno ourselves. And while
> at it, make the function return a negative value on error, just
> like other error paths do.
> 
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> Reviewed-by: David Hildenbrand <david@redhat.com>
> ---
>   util/osdep.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/util/osdep.c b/util/osdep.c
> index e996c4744a..e42f4e8121 100644
> --- a/util/osdep.c
> +++ b/util/osdep.c
> @@ -57,7 +57,12 @@ int qemu_madvise(void *addr, size_t len, int advice)
>   #if defined(CONFIG_MADVISE)
>       return madvise(addr, len, advice);
>   #elif defined(CONFIG_POSIX_MADVISE)
> -    return posix_madvise(addr, len, advice);
> +    int rc = posix_madvise(addr, len, advice);
> +    if (rc) {
> +        errno = rc;
> +        return -1;
> +    }
> +    return 0;
>   #else
>       errno = EINVAL;
>       return -1;



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/4] osdep: Make qemu_madvise() to set errno in all cases
  2024-05-31  7:57   ` Philippe Mathieu-Daudé
@ 2024-05-31  8:01     ` David Hildenbrand
  2024-05-31  8:12       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand @ 2024-05-31  8:01 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Michal Privoznik, qemu-devel; +Cc: imammedo

On 31.05.24 09:57, Philippe Mathieu-Daudé wrote:
> Hi Michal,
> 
> On 31/5/24 09:28, Michal Privoznik wrote:
>> The unspoken premise of qemu_madvise() is that errno is set on
>> error. And it is mostly the case except for posix_madvise() which
>> is documented to return either zero (on success) or a positive
>> error number.
> 
> Watch out, Linux:
> 
>     RETURN VALUE
> 
>        On success, posix_madvise() returns 0.  On failure,
>        it returns a positive error number.
> 
> but on Darwin:
> 
>     RETURN VALUES
> 
>        Upon successful completion, a value of 0 is returned.
>        Otherwise, a value of -1 is returned and errno is set
>        to indicate the error.
> 
> (Haven't checked other POSIX OSes).
> 

... but it's supposed to follow the "posix standard" :D Maybe an issue 
in the docs?

FreeBSD seems to follow the standard: "The posix_madvise() interface is 
identical, except it returns an error number on error and does not 
modify errno, and is provided for standards conformance."

Same with OpenBSD: "The posix_madvise() interface has the same effect, 
but returns the error value instead of only setting errno."

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/4] osdep: Make qemu_madvise() return ENOSYS on unsupported OSes
  2024-05-31  7:53   ` David Hildenbrand
@ 2024-05-31  8:02     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 17+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-31  8:02 UTC (permalink / raw)
  To: David Hildenbrand, Michal Privoznik, qemu-devel; +Cc: imammedo

On 31/5/24 09:53, David Hildenbrand wrote:
> On 31.05.24 09:28, Michal Privoznik wrote:
>> Not every OS is capable of madvise() or posix_madvise() even. In
>> that case, errno should be set to ENOSYS as it reflects the cause
>> better. This also mimic what madvise()/posix_madvise() would
>> return if kernel lacks corresponding syscall (e.g. due to
>> configuration).
> 
> Yes and no. According to the man page
> 
> " EINVAL advice is not a valid."
> 
> if a particular MADV_* call is not implemented, we would get EINVAL, 
> which is really unfortunate ... to detect what is actually supported :(

Maybe skip "This also mimic what madvise()/posix_madvise()
would return if kernel lacks corresponding syscall (e.g. due
to configuration)." for clarity?

> For the patch here ENOSYS makes sense:
> 
> Reviewed-by: David Hildenbrand <david@redhat.com>
> 



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/4] osdep: Make qemu_madvise() to set errno in all cases
  2024-05-31  8:01     ` David Hildenbrand
@ 2024-05-31  8:12       ` Philippe Mathieu-Daudé
  2024-05-31  9:08         ` David Hildenbrand
  0 siblings, 1 reply; 17+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-31  8:12 UTC (permalink / raw)
  To: David Hildenbrand, Michal Privoznik, qemu-devel
  Cc: imammedo, Cameron Esfahani, Eric Blake

On 31/5/24 10:01, David Hildenbrand wrote:
> On 31.05.24 09:57, Philippe Mathieu-Daudé wrote:
>> Hi Michal,
>>
>> On 31/5/24 09:28, Michal Privoznik wrote:
>>> The unspoken premise of qemu_madvise() is that errno is set on
>>> error. And it is mostly the case except for posix_madvise() which
>>> is documented to return either zero (on success) or a positive
>>> error number.
>>
>> Watch out, Linux:
>>
>>     RETURN VALUE
>>
>>        On success, posix_madvise() returns 0.  On failure,
>>        it returns a positive error number.
>>
>> but on Darwin:
>>
>>     RETURN VALUES
>>
>>        Upon successful completion, a value of 0 is returned.
>>        Otherwise, a value of -1 is returned and errno is set
>>        to indicate the error.
>>
>> (Haven't checked other POSIX OSes).
>>
> 
> ... but it's supposed to follow the "posix standard" :D Maybe an issue 
> in the docs?
> 
> FreeBSD seems to follow the standard: "The posix_madvise() interface is 
> identical, except it returns an error number on error and does not 
> modify errno, and is provided for standards conformance."
> 
> Same with OpenBSD: "The posix_madvise() interface has the same effect, 
> but returns the error value instead of only setting errno."

On Darwin, MADVISE(2):

    The posix_madvise() behaves same as madvise() except that it uses
    values with POSIX_ prefix for the advice system call argument.

    The posix_madvise function is part of IEEE 1003.1-2001 and was first
    implemented in Mac OS X 10.2.

Per IEEE 1003.1-2001 
(https://pubs.opengroup.org/onlinepubs/009695399/functions/posix_madvise.html):

   RETURN VALUE

     Upon successful completion, posix_madvise() shall return zero;
     otherwise, an error number shall be returned to indicate the error.

Note the use of "shall" which is described in RFC2119 as:

    This word, or the adjective "RECOMMENDED", mean that there
    may exist valid reasons in particular circumstances to ignore a
    particular item, but the full implications must be understood and
    carefully weighed before choosing a different course.

Regards,

Phil.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/4] osdep: Make qemu_madvise() to set errno in all cases
  2024-05-31  8:12       ` Philippe Mathieu-Daudé
@ 2024-05-31  9:08         ` David Hildenbrand
  2024-05-31 11:10           ` Michal Prívozník
  0 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand @ 2024-05-31  9:08 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Michal Privoznik, qemu-devel
  Cc: imammedo, Cameron Esfahani, Eric Blake

On 31.05.24 10:12, Philippe Mathieu-Daudé wrote:
> On 31/5/24 10:01, David Hildenbrand wrote:
>> On 31.05.24 09:57, Philippe Mathieu-Daudé wrote:
>>> Hi Michal,
>>>
>>> On 31/5/24 09:28, Michal Privoznik wrote:
>>>> The unspoken premise of qemu_madvise() is that errno is set on
>>>> error. And it is mostly the case except for posix_madvise() which
>>>> is documented to return either zero (on success) or a positive
>>>> error number.
>>>
>>> Watch out, Linux:
>>>
>>>      RETURN VALUE
>>>
>>>         On success, posix_madvise() returns 0.  On failure,
>>>         it returns a positive error number.
>>>
>>> but on Darwin:
>>>
>>>      RETURN VALUES
>>>
>>>         Upon successful completion, a value of 0 is returned.
>>>         Otherwise, a value of -1 is returned and errno is set
>>>         to indicate the error.
>>>
>>> (Haven't checked other POSIX OSes).
>>>
>>
>> ... but it's supposed to follow the "posix standard" :D Maybe an issue
>> in the docs?
>>
>> FreeBSD seems to follow the standard: "The posix_madvise() interface is
>> identical, except it returns an error number on error and does not
>> modify errno, and is provided for standards conformance."
>>
>> Same with OpenBSD: "The posix_madvise() interface has the same effect,
>> but returns the error value instead of only setting errno."
> 
> On Darwin, MADVISE(2):
> 
>      The posix_madvise() behaves same as madvise() except that it uses
>      values with POSIX_ prefix for the advice system call argument.
> 
>      The posix_madvise function is part of IEEE 1003.1-2001 and was first
>      implemented in Mac OS X 10.2.
> 
> Per IEEE 1003.1-2001
> (https://pubs.opengroup.org/onlinepubs/009695399/functions/posix_madvise.html):
> 
>     RETURN VALUE
> 
>       Upon successful completion, posix_madvise() shall return zero;
>       otherwise, an error number shall be returned to indicate the error.
> 
> Note the use of "shall" which is described in RFC2119 as:
> 
>      This word, or the adjective "RECOMMENDED", mean that there
>      may exist valid reasons in particular circumstances to ignore a
>      particular item, but the full implications must be understood and
>      carefully weighed before choosing a different course.

Agreed, so we have to be careful.

I do wonder if there would be the option for an automatic approach: for 
example, detect if the errno was/was not changed. Hm.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/4] backends/hostmem: Report error on qemu_madvise() failures
  2024-05-31  7:28 ` [PATCH v2 3/4] backends/hostmem: Report error on qemu_madvise() failures Michal Privoznik
  2024-05-31  7:49   ` Philippe Mathieu-Daudé
@ 2024-05-31  9:10   ` David Hildenbrand
  1 sibling, 0 replies; 17+ messages in thread
From: David Hildenbrand @ 2024-05-31  9:10 UTC (permalink / raw)
  To: Michal Privoznik, qemu-devel; +Cc: imammedo

On 31.05.24 09:28, Michal Privoznik wrote:
> If user sets .merge or .dump attributes qemu_madvise() is called
> with corresponding advice. But it is never checked for failure
> which may mislead users into thinking the attribute is set
> correctly. Report an appropriate error.
> 
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/4] osdep: Make qemu_madvise() to set errno in all cases
  2024-05-31  9:08         ` David Hildenbrand
@ 2024-05-31 11:10           ` Michal Prívozník
  0 siblings, 0 replies; 17+ messages in thread
From: Michal Prívozník @ 2024-05-31 11:10 UTC (permalink / raw)
  To: David Hildenbrand, Philippe Mathieu-Daudé, qemu-devel
  Cc: imammedo, Cameron Esfahani, Eric Blake

On 5/31/24 11:08, David Hildenbrand wrote:
> On 31.05.24 10:12, Philippe Mathieu-Daudé wrote:
>> On 31/5/24 10:01, David Hildenbrand wrote:
>>> On 31.05.24 09:57, Philippe Mathieu-Daudé wrote:
>>>> Hi Michal,
>>>>
>>>> On 31/5/24 09:28, Michal Privoznik wrote:
>>>>> The unspoken premise of qemu_madvise() is that errno is set on
>>>>> error. And it is mostly the case except for posix_madvise() which
>>>>> is documented to return either zero (on success) or a positive
>>>>> error number.
>>>>
>>>> Watch out, Linux:
>>>>
>>>>      RETURN VALUE
>>>>
>>>>         On success, posix_madvise() returns 0.  On failure,
>>>>         it returns a positive error number.
>>>>
>>>> but on Darwin:
>>>>
>>>>      RETURN VALUES
>>>>
>>>>         Upon successful completion, a value of 0 is returned.
>>>>         Otherwise, a value of -1 is returned and errno is set
>>>>         to indicate the error.
>>>>
>>>> (Haven't checked other POSIX OSes).
>>>>
>>>
>>> ... but it's supposed to follow the "posix standard" :D Maybe an issue
>>> in the docs?
>>>
>>> FreeBSD seems to follow the standard: "The posix_madvise() interface is
>>> identical, except it returns an error number on error and does not
>>> modify errno, and is provided for standards conformance."
>>>
>>> Same with OpenBSD: "The posix_madvise() interface has the same effect,
>>> but returns the error value instead of only setting errno."
>>
>> On Darwin, MADVISE(2):
>>
>>      The posix_madvise() behaves same as madvise() except that it uses
>>      values with POSIX_ prefix for the advice system call argument.
>>
>>      The posix_madvise function is part of IEEE 1003.1-2001 and was first
>>      implemented in Mac OS X 10.2.
>>
>> Per IEEE 1003.1-2001
>> (https://pubs.opengroup.org/onlinepubs/009695399/functions/posix_madvise.html):
>>
>>     RETURN VALUE
>>
>>       Upon successful completion, posix_madvise() shall return zero;
>>       otherwise, an error number shall be returned to indicate the error.
>>
>> Note the use of "shall" which is described in RFC2119 as:
>>
>>      This word, or the adjective "RECOMMENDED", mean that there
>>      may exist valid reasons in particular circumstances to ignore a
>>      particular item, but the full implications must be understood and
>>      carefully weighed before choosing a different course.
> 
> Agreed, so we have to be careful.
> 
> I do wonder if there would be the option for an automatic approach: for
> example, detect if the errno was/was not changed. Hm.
> 

Firstly, thanks Philippe for this great catch! I did think that "posix_"
prefix might mean POSIX is followed. Anyway, looks like the common
denominator is: on success 0 returned. And then, on Darwin, errno is set
and -1 is returned. On everything(?) else, a positive value is returned
and errno is left untouched. So I think we can get away with something
like the following:

int rc = posix_madvise();
if (rc) {
  if (rc > 0) {
    errno = rc;
  }
  return -1;
}
return 0;

Plus a comment explaining the difference on Darwin.

Michal



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/4] backends/hostmem: Report error when memory size is unaligned
  2024-05-31  7:29 ` [PATCH v2 4/4] backends/hostmem: Report error when memory size is unaligned Michal Privoznik
  2024-05-31  7:53   ` Philippe Mathieu-Daudé
@ 2024-06-05  7:12   ` Mario Casquero
  1 sibling, 0 replies; 17+ messages in thread
From: Mario Casquero @ 2024-06-05  7:12 UTC (permalink / raw)
  To: Michal Privoznik; +Cc: qemu-devel, david, imammedo

This patch has been successfully tested by QE. After allocating some
hugepages in the host, try to boot up a VM with the memory backed by a
file and the size unaligned, check now the message displayed by QEMU:
qemu-system-x86_64: backend memory size must be multiple of 0x200000

Tested-by: Mario Casquero <mcasquer@redhat.com>


On Fri, May 31, 2024 at 9:29 AM Michal Privoznik <mprivozn@redhat.com> wrote:
>
> If memory-backend-{file,ram} has a size that's not aligned to
> underlying page size it is not only wasteful, but also may lead
> to hard to debug behaviour. For instance, in case
> memory-backend-file and hugepages, madvise() and mbind() fail.
> Rightfully so, page is the smallest unit they can work with. And
> even though an error is reported, the root cause it not very
> clear:
>
>   qemu-system-x86_64: Couldn't set property 'dump' on 'memory-backend-file': Invalid argument
>
> After this commit:
>
>   qemu-system-x86_64: backend memory size must be multiple of 0x200000
>
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>  backends/hostmem.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index 012a8c190f..5f9c9ea8f5 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -337,6 +337,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
>      HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
>      void *ptr;
>      uint64_t sz;
> +    size_t pagesize;
>      bool async = !phase_check(PHASE_LATE_BACKENDS_CREATED);
>
>      if (!bc->alloc) {
> @@ -348,6 +349,13 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
>
>      ptr = memory_region_get_ram_ptr(&backend->mr);
>      sz = memory_region_size(&backend->mr);
> +    pagesize = qemu_ram_pagesize(backend->mr.ram_block);
> +
> +    if (!QEMU_IS_ALIGNED(sz, pagesize)) {
> +        error_setg(errp, "backend memory size must be multiple of 0x%"
> +                   PRIx64, pagesize);
> +        return;
> +    }
>
>      if (backend->merge &&
>          qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE)) {
> --
> 2.44.1
>
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2024-06-05  7:13 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-31  7:28 [PATCH v2 0/4] backends/hostmem: Report more errors on failures Michal Privoznik
2024-05-31  7:28 ` [PATCH v2 1/4] osdep: Make qemu_madvise() to set errno in all cases Michal Privoznik
2024-05-31  7:57   ` Philippe Mathieu-Daudé
2024-05-31  8:01     ` David Hildenbrand
2024-05-31  8:12       ` Philippe Mathieu-Daudé
2024-05-31  9:08         ` David Hildenbrand
2024-05-31 11:10           ` Michal Prívozník
2024-05-31  7:28 ` [PATCH v2 2/4] osdep: Make qemu_madvise() return ENOSYS on unsupported OSes Michal Privoznik
2024-05-31  7:48   ` Philippe Mathieu-Daudé
2024-05-31  7:53   ` David Hildenbrand
2024-05-31  8:02     ` Philippe Mathieu-Daudé
2024-05-31  7:28 ` [PATCH v2 3/4] backends/hostmem: Report error on qemu_madvise() failures Michal Privoznik
2024-05-31  7:49   ` Philippe Mathieu-Daudé
2024-05-31  9:10   ` David Hildenbrand
2024-05-31  7:29 ` [PATCH v2 4/4] backends/hostmem: Report error when memory size is unaligned Michal Privoznik
2024-05-31  7:53   ` Philippe Mathieu-Daudé
2024-06-05  7:12   ` Mario Casquero

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).