Re: [RFC PATCH] memory-hotplug: Use dev_online for memhp_auto_offline

From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>,
	linux-mm@kvack.org, mpe@ellerman.id.au,
	linuxppc-dev@lists.ozlabs.org, mdroth@linux.vnet.ibm.com,
	kys@microsoft.com
Subject: Re: [RFC PATCH] memory-hotplug: Use dev_online for memhp_auto_offline
Date: Fri, 24 Feb 2017 17:09:13 +0100
Message-ID: <8760jzy3iu.fsf@vitty.brq.redhat.com>
In-Reply-To: <20170224153227.GL19161@dhcp22.suse.cz> (Michal Hocko's message of "Fri, 24 Feb 2017 16:32:27 +0100")

Michal Hocko <mhocko@kernel.org> writes:

> On Fri 24-02-17 16:05:18, Vitaly Kuznetsov wrote:
>> Michal Hocko <mhocko@kernel.org> writes:
>> 
>> > On Fri 24-02-17 15:10:29, Vitaly Kuznetsov wrote:
> [...]
>> >> Just did a quick (and probably dirty) test: increasing guest memory from
>> >> 4G to 8G (32 x 128MB blocks) requires 68MB of memory, so it's roughly 2MB
>> >> per block. It's really easy to trigger OOM for small guests.
>> >
>> > So we need ~1.5% of the added memory. That doesn't sound like something
>> > to trigger OOM killer too easily. Assuming that increase is not way too
>> > large. Going from 256M (your earlier example) to 8G looks like it will
>> > eat half the memory, which is still quite far away from the OOM.
>> 
>> And if the kernel itself takes 128MB of RAM (which is not something
>> extraordinary with many CPUs) we have zero left. Go to something bigger
>> than 8G and you die.
>
> Again, if you have 128M and jump to 8G then your memory balancing is
> most probably broken.
>

I don't understand what balancing you're talking about. I have a small
guest and I want to add more memory to it and the result is ... OOM. Not
something I expected.
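
For reference, the numbers in this exchange can be checked with a short
calculation. This is only a sketch: the 68MB / 32 blocks figure is the quick
measurement quoted above, the 128MB kernel footprint is the assumption from
the quoted reply, and the names used here (overhead_mib and so on) are
hypothetical, not anything from the kernel.

```python
# Rough arithmetic for the hotplug overhead discussed in this thread.
# All inputs come from the quick test quoted above; nothing here is measured anew.

BLOCK_MIB = 128              # memory block size used in the test
MEASURED_OVERHEAD_MIB = 68   # memory consumed when adding 32 blocks (4G -> 8G)
MEASURED_BLOCKS = 32

# Cost of hot-adding a single 128MB block, and the same cost as a fraction
# of the memory being added.
per_block_mib = MEASURED_OVERHEAD_MIB / MEASURED_BLOCKS
fraction_of_added = MEASURED_OVERHEAD_MIB / (MEASURED_BLOCKS * BLOCK_MIB)


def overhead_mib(added_mib):
    """Estimate the memory needed to hot-add `added_mib` MiB (whole blocks only)."""
    blocks = added_mib // BLOCK_MIB
    return blocks * per_block_mib


# Small-guest case from the thread: a 256M guest grown to 8G.
guest_mib = 256
kernel_mib = 128                       # assumed kernel footprint (hypothetical)
needed = overhead_mib(8 * 1024 - guest_mib)
left_after_add = guest_mib - kernel_mib - needed

print(f"per block: {per_block_mib} MiB, fraction: {fraction_of_added:.2%}")
print(f"256M -> 8G needs about {needed} MiB; left over: {left_after_add} MiB")
```

With these inputs the overhead for the 256M -> 8G case comes out at roughly
half of the guest's original memory, and the remainder goes negative once the
assumed 128MB kernel footprint is subtracted, which is the point both sides
are arguing about.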

>> > I would call such
>> > an increase a bad memory balancing, though, to be honest. A more
>> > reasonable memory balancing would go and double the available memory
>> > IMHO. Anyway, I still think that hotplug is a terrible way to do memory
>> > ballooning.
>> 
>> That's what we have in *all* modern hypervisors. And I don't see why
>> it's bad.
>
> Go and re-read the original thread. Dave has given many good arguments.
>

Are we discussing taking away the memory hotplug feature from all
hypervisors here?

>> > Just make them all online the memory explicitly. I really do not see why
>> > this should be decided by the poor user. Put differently, when should I
>> > disable auto online when using Hyper-V or another of the mentioned
>> > technologies? CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE should simply die and
>> > I would even be for killing the whole memhp_auto_online thing along the
>> > way. This simply doesn't make any sense to me.
>> 
>> ACPI, for example, is shared between KVM/QEMU, VMware and real
>> hardware. I can understand why bare metal guys might not want to have
>> auto-online by default (though major Linux distros ship the stupid
>> 'offline' -> 'online' udev rule and nobody complains) -- they're doing
>> some physical action - going to a server room, opening the box,
>> plugging in memory, going back to their place - but with VMs it's not like
>> that. What's gonna be the default for ACPI then?
>> 
>> I don't understand why CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is
>
> Because this is something a user has to think about and doesn't have a
> reasonable way to decide. Our config space is also way too large!

Config space is for distros, not users.

>
>> disturbing and why do we need to take this choice away from distros. I
>> don't understand what we're gaining by replacing it with
>> per-memory-add-technology defaults.
>
> Because those technologies know that they want to have the memory online
> as soon as possible. Jeez, just look at the hv code. It waits for the
> userspace to online memory before going further. Why would it ever want
> to have the tunable in "offline" state? This just doesn't make any
> sense. Look at how things get simplified if we get rid of this clutter

While this will most probably work for me, I still disagree with the
concept of 'one size fits all' here and with the default 'false' for ACPI:
we're taking away the feature from KVM/VMware folks, so they'll again
come up with the udev rule, which has known issues.
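
To make the userspace alternative concrete, here is a minimal sketch of what
an 'offline' -> 'online' rule amounts to: walk the memory block directories
in sysfs and write "online" to every block whose state is "offline". The
function name and the overridable root path are assumptions made for
illustration; this is not the code of any distro rule, and on a real system
it needs root.

```python
from pathlib import Path


def online_offline_blocks(root="/sys/devices/system/memory"):
    """Online every memory block under `root` that is currently offline.

    Returns the names of the blocks that were switched, e.g. ["memory40"].
    `root` is a parameter only so the function can be pointed at a test tree.
    """
    onlined = []
    # Each hot-added block shows up as memoryN/ with a writable `state` file.
    for state in sorted(Path(root).glob("memory*/state")):
        if state.read_text().strip() == "offline":
            state.write_text("online")
            onlined.append(state.parent.name)
    return onlined
```

A loop like this runs in userspace after the block has already been added,
so it needs memory and scheduling time to do its job.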

[snip].

-- 
  Vitaly
