qemu-devel.nongnu.org archive mirror
From: zhanghailiang <zhang.zhanghailiang@huawei.com>
To: Hu Tao <hutao@cn.fujitsu.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	luonengjun@huawei.com, qemu-devel@nongnu.org,
	xiexiangyou <xiexiangyou@huawei.com>,
	peter.huangpeng@huawei.com, aliguori@amazon.com,
	pbonzini@redhat.com, imammedo@redhat.com,
	gaowanlong@cn.fujitsu.com
Subject: Re: [Qemu-devel] [PATCH] mlock: fix bug when mlockall called before mbind
Date: Thu, 14 Aug 2014 17:09:00 +0800
Message-ID: <53EC7CAC.2020509@huawei.com>
In-Reply-To: <20140814071503.GA18294@G08FNSTD100614.fnst.cn.fujitsu.com>

On 2014/8/14 15:15, Hu Tao wrote:
> On Thu, Aug 14, 2014 at 02:31:41PM +0800, zhanghailiang wrote:
>> On 2014/8/13 19:50, Michael S. Tsirkin wrote:
>>> On Wed, Aug 13, 2014 at 07:21:57PM +0800, zhanghailiang wrote:
>>>> If we configure QEMU with realtime mlock=on and memory node binding at the same time,
>>>> QEMU fails to start: mbind() fails with the message "Input/output error".
>>>>
>>>> From man page:
>>>> int mbind(void *addr, unsigned long len, int mode,
>>>>                   unsigned long *nodemask, unsigned long maxnode,
>>>>                   unsigned flags);
>>>> The *MPOL_BIND* mode specifies a strict policy that restricts memory allocation
>>>> to the nodes specified in nodemask.
>>>> If *MPOL_MF_STRICT* is passed in flags and policy is not MPOL_DEFAULT (in QEMU
>>>> it is MPOL_BIND here), then the call will fail with the error EIO if the existing
>>>> pages in the memory range don't follow the policy.
>>>>
>>>> Memory that mlockall has already locked is not guaranteed to follow the policy above,
>>>> and when it does not, mbind() fails with EIO.
>>>>
>>>> So we should call mlock after mbind. This patch moves the mlock call
>>>> into the function pc_memory_init.
>>>>
>>>> Signed-off-by: xiexiangyou<xiexiangyou@huawei.com>
>>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>>
>>> OK but won't this still fail in case of memory hotplug?
>>> We set MCL_FUTURE so the same will apply?
>
> It has already been set.
>
>>> Maybe it's enough to set MPOL_MF_MOVE?
>>> Does the following work for you?
>>>
>>
>> Hi Michael,
>>
>> I have tested memory hotplug with a virsh command like
>> 'virsh setmem redhat-6.4 6388608 --config --live'. It works, and
>> mbind is not called for this kind of memory hotplug.
>> But I don't know whether there is a command for something like 'memory-node hotplug'?
>
> Using qemu monitor command:
> object_add memory-backend-ram,id=ram1,size=128M,host-nodes=0,policy=bind
>

Hi Hu Tao,

I have tested it using the above command, and yes, it fails.
If I remove the *MCL_FUTURE* flag from mlockall(), it works fine.

*MCL_FUTURE*  Lock all pages which will become mapped into the
address space of the process in the future. (From man page)

So I think we could remove this flag and call mlockall(MCL_CURRENT)
each time we do the above 'memory object add'.

>>
>> MPOL_MF_MOVE works and is simpler, but it is not perfect. It takes
>> more time to *move the memory* that mlockall has already locked (I guess
>> the pages are reallocated and their contents copied). As a result, the
>> VM starts more slowly than with the approach above.
>
> I think your patch makes sense but it doesn't work for hotplugged
> memory. We need MPOL_MF_MOVE, too.
>

Thank you very much. Before sending another patch, I will look into
QEMU's memory hotplug path and try to fix this failure. :)

>>
>> BTW, I think the following sequence is clearer and more logical:
>> Allocate memory ---> Set memory policy ---> Lock memory.
>>
>> So what's your opinion? Thanks very much.
>>
>>
>>> -->
>>>
>>> hostmem: set MPOL_MF_MOVE
>>>
>>> When memory is allocated on a wrong node, MPOL_MF_STRICT
>>> doesn't move it - it just fails the allocation.
>>> A simple way to reproduce the failure is with mlock=on
>>> realtime feature.
>>>
>>> The code comment actually says: "ensure policy won't be ignored"
>>> so setting MPOL_MF_MOVE seems like a better way to do this.
>>>
>>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>>>
>>> ---
>>>
>>> diff --git a/backends/hostmem.c b/backends/hostmem.c
>>> index ca10c51..a9905c0 100644
>>> --- a/backends/hostmem.c
>>> +++ b/backends/hostmem.c
>>> @@ -304,7 +304,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
>>>           /* ensure policy won't be ignored in case memory is preallocated
>>>            * before mbind(). note: MPOL_MF_STRICT is ignored on hugepages so
>>>            * this doesn't catch hugepage case. */
>>> -        unsigned flags = MPOL_MF_STRICT;
>>> +        unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
>>>
>>>           /* check for invalid host-nodes and policies and give more verbose
>>>            * error messages than mbind(). */
>>>
>>
>
>
> .
>

Thread overview: 7+ messages
2014-08-13 11:21 [Qemu-devel] [PATCH] mlock: fix bug when mlockall called before mbind zhanghailiang
2014-08-13 11:50 ` Michael S. Tsirkin
2014-08-14  6:31   ` zhanghailiang
2014-08-14  7:15     ` Hu Tao
2014-08-14  9:09       ` zhanghailiang [this message]
2014-08-14  9:56         ` Michael S. Tsirkin
2014-08-18  0:36           ` zhanghailiang
