From: zhanghailiang <zhang.zhanghailiang@huawei.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: hutao@cn.fujitsu.com, luonengjun@huawei.com,
	peter.huangpeng@huawei.com, xiexiangyou <xiexiangyou@huawei.com>,
	qemu-devel@nongnu.org, aliguori@amazon.com, imammedo@redhat.com,
	pbonzini@redhat.com, gaowanlong@cn.fujitsu.com
Subject: Re: [Qemu-devel] [PATCH] mlock: fix bug when mlockall called before mbind
Date: Thu, 14 Aug 2014 14:31:41 +0800
Message-ID: <53EC57CD.3020503@huawei.com>
In-Reply-To: <20140813115020.GC20244@redhat.com>

On 2014/8/13 19:50, Michael S. Tsirkin wrote:
> On Wed, Aug 13, 2014 at 07:21:57PM +0800, zhanghailiang wrote:
>> If we configure qemu with realtime-mlock-on and memory-node-bind at the same time,
>> Qemu will fail to start, and mbind() fails with message "Input/output error".
>>
>> From man page:
>> int mbind(void *addr, unsigned long len, int mode,
>>                   unsigned long *nodemask, unsigned long maxnode,
>>                   unsigned flags);
>> The *MPOL_BIND* mode specifies a strict policy that restricts memory allocation
>> to the nodes specified in nodemask.
>> If *MPOL_MF_STRICT* is passed in flags and policy is not MPOL_DEFAULT(In qemu
>> here is MPOL_BIND), then the call will fail with the error EIO if the existing
>> pages in  the memory range don't follow the policy.
>>
>> The memory locked ahead by mlockall can not guarantee to follow the policy above,
>> And if that happens, it will result in an EIO error.
>>
>> So we should call mlock after mbind, here we adjust the place where called mlock,
>> Move it to function pc_memory_init.
>>
>> Signed-off-by: xiexiangyou<xiexiangyou@huawei.com>
>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>
> OK but won't this still fail in case of memory hotplug?
> We set MCL_FUTURE so the same will apply?
> Maybe it's enough to set MPOL_MF_MOVE?
> Does the following work for you?
>

Hi Michael,

I have tested memory hotplug with a virsh command such as
'virsh setmem redhat-6.4 6388608 --config --live', and it works: mbind
is not called for this kind of memory hotplug.
But I don't know whether there is a command like 'memory-node hotplug'?

MPOL_MF_MOVE can work and it is simpler, but it is not perfect. It
takes more time to *move the memory* (I guess it has to allocate new
pages and copy the contents) that has already been locked by mlockall.
As a result, the VM will start more slowly than with the approach in my
patch (calling mlock after mbind).

BTW, I think the following order is clearer and more logical:
Allocate memory ---> Set memory policy ---> Lock memory.
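
To make that order concrete, here is a minimal C sketch of it. It is
not the QEMU code: the helper names raw_mbind and alloc_bind_lock are
hypothetical, it assumes a Linux host, and it calls the raw mbind
syscall so that it builds without libnuma. Failures of mbind or mlock
(for example EPERM in a container, or a low RLIMIT_MEMLOCK) are only
recorded, not treated as fatal.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/mempolicy.h>   /* MPOL_BIND, MPOL_MF_STRICT */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper over the raw mbind(2) syscall (no libnuma). */
static long raw_mbind(void *addr, unsigned long len, int mode,
                      const unsigned long *nodemask, unsigned long maxnode,
                      unsigned flags)
{
    return syscall(SYS_mbind, addr, len, mode, nodemask, maxnode, flags);
}

/*
 * Hypothetical helper: allocate `len` bytes, bind them to NUMA node `node`,
 * then lock them, in that order.
 * Returns the mapping, or NULL if the allocation itself failed.
 * *bind_err and *lock_err receive 0 on success or the errno of that step.
 */
void *alloc_bind_lock(size_t len, int node, int *bind_err, int *lock_err)
{
    /* Single-word node mask; keep the shift below in range. */
    assert(node >= 0 && (size_t)node < sizeof(unsigned long) * 8 - 1);

    *bind_err = 0;
    *lock_err = 0;

    /* 1. Allocate: anonymous mapping, no pages are faulted in yet. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return NULL;
    }

    /* 2. Set policy: nothing is resident, so MPOL_MF_STRICT has no
     *    misplaced pages to complain about. */
    unsigned long mask = 1UL << node;
    if (raw_mbind(p, len, MPOL_BIND, &mask, sizeof(mask) * 8,
                  MPOL_MF_STRICT) != 0) {
        *bind_err = errno;
    }

    /* 3. Lock: pages are faulted in now, under the policy set in step 2. */
    if (mlock(p, len) != 0) {
        *lock_err = errno;
    }
    return p;
}
```

A caller would use it as alloc_bind_lock(4096, 0, &bind_err, &lock_err)
and check the two error values. If the lock came first (for example via
mlockall with MCL_FUTURE), the pages would already be resident when
mbind runs, which is the case where MPOL_MF_STRICT can return EIO.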

So what's your opinion? Thanks very much.


> -->
>
> hostmem: set MPOL_MF_MOVE
>
> When memory is allocated on a wrong node, MPOL_MF_STRICT
> doesn't move it - it just fails the allocation.
> A simple way to reproduce the failure is with mlock=on
> realtime feature.
>
> The code comment actually says: "ensure policy won't be ignored"
> so setting MPOL_MF_MOVE seems like a better way to do this.
>
> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>
> ---
>
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index ca10c51..a9905c0 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -304,7 +304,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
>           /* ensure policy won't be ignored in case memory is preallocated
>            * before mbind(). note: MPOL_MF_STRICT is ignored on hugepages so
>            * this doesn't catch hugepage case. */
> -        unsigned flags = MPOL_MF_STRICT;
> +        unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
>
>           /* check for invalid host-nodes and policies and give more verbose
>            * error messages than mbind(). */
>
