From: ebiederm@xmission.com (Eric W. Biederman)
To: Michal Hocko <mhocko@kernel.org>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: threads-max observe limits
Date: Thu, 19 Sep 2019 14:33:24 -0500
Message-ID: <87h8585bej.fsf@x220.int.ebiederm.org>
In-Reply-To: <20190918071541.GB12770@dhcp22.suse.cz> (Michal Hocko's message of "Wed, 18 Sep 2019 09:15:41 +0200")
Michal Hocko <mhocko@kernel.org> writes:
> On Tue 17-09-19 12:26:18, Eric W. Biederman wrote:
>> Michal Hocko <mhocko@kernel.org> writes:
>>
>> > On Tue 17-09-19 17:28:02, Heinrich Schuchardt wrote:
>> >>
>> >> On 9/17/19 12:03 PM, Michal Hocko wrote:
>> >> > Hi,
>> >> > I have just stumbled over 16db3d3f1170 ("kernel/sysctl.c: threads-max
>> >> > observe limits") and I am really wondering what is the motivation behind
>> >> > the patch. We've had a customer noticing the threads_max autoscaling
>> >> > differences between 3.12 and 4.4 kernels and wanted to override the auto
>> >> > tuning from the userspace, just to find out that this is not possible.
>> >>
>> >> set_max_threads() sets the upper limit (max_threads_suggested) for
>> >> threads such that at a maximum 1/8th of the total memory can be occupied
>> >> by the thread's administrative data (of size THREAD_SIZE). On my 32 GiB
>> >> system this results in 254313 threads.
>> >
>> > This is quite arbitrary, isn't it? What would happen if the limit was
>> > twice as large?
>> >
>> >> With patch 16db3d3f1170 ("kernel/sysctl.c: threads-max observe limits")
>> >> a user cannot set an arbitrarily high number for
>> >> /proc/sys/kernel/threads-max which could lead to a system stalling
>> >> because the thread headers occupy all the memory.
>> >
>> > This is still a decision of the admin to make. You can consume the
>> > memory by other means and that is why we have measures in place. E.g.
>> > memcg accounting.
>> >
>> >> When developing the patch I remarked that on a system where memory is
>> >> installed dynamically it might be a good idea to recalculate this limit.
>> >> If you have a system that boots with let's say 8 GiB and then
>> >> dynamically installs a few TiB of RAM this might make sense. But such a
>> >> dynamic update of max_threads_suggested was left out for the sake of
>> >> simplicity.
>> >>
>> >> Anyway if more than 100,000 threads are used on a system, I would wonder
>> >> if the software should not be changed to use thread-pools instead.
>> >
>> > You do not change the software to overcome artificial bounds based on
>> > guessing.
>> >
>> > So can we get back to the justification of the patch. What kind of
>> > real life problem does it solve and why is it ok to override an admin
>> > decision?
>> > If there is no strong justification then the patch should be reverted
>> > because from what I have heard it has been noticed and it has broken
>> > a certain deployment. I am not really clear about technical details yet
>> > but it seems that there are workloads that believe they need to touch
>> > this tuning and complain if that is not possible.
>>
>> Taking a quick look myself.
>>
>> I am completely mystified by both sides of this conversation.
>>
>> a) The logic to set the default number of threads in a system
>> has not changed since 2.6.12-rc2 (the start of the git history).
>>
>> The implementation has changed but we should still get the same
>> value. So anyone seeing threads_max autoscaling differences
>> between kernels is either seeing a bug in the rewritten formula
>> or something else weird is going on.
>>
>> Michal is it a very small effect your customers are seeing?
>> Is it another bug somewhere else?
>
> I am still trying to get more information. Reportedly they see a
> different auto tuned limit between two kernel versions which results in
> an application complaining. As already mentioned this might be a side
> effect of something else and this is not yet fully analyzed. My main
> point for bringing up this discussion is ...
Please do. This sounds like the kind of issue that will reveal something
deeper about what is going on.
>
>> b) Not being able to bump threads_max to the physical limit of
>> the machine is very clearly a regression.
>
> ... exactly this part. The changelog of the respective patch doesn't
> really explain why it is needed except for "it sounds like a good idea
> to be consistent".

I suggest doing a partial revert to just:

diff --git a/kernel/fork.c b/kernel/fork.c
index 7a74ade4e7d6..de8264ea34a7 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2943,7 +2943,7 @@ int sysctl_max_threads(struct ctl_table *table, int write,
 	if (ret || !write)
 		return ret;

-	set_max_threads(threads);
+	max_threads = threads;

 	return 0;
 }

proc_dointvec_minmax limiting the values to MIN_THREADS and MAX_THREADS
is justifiable. Those are the minimum and maximum values the kernel can
function with.
With a good changelog we should be able to backport that change without
any fear.
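
To make the difference concrete, here is a rough userspace model of the
two write paths. It is only a sketch, not the kernel code: the helper
names (mem_limit, range_check, write_capped, write_direct) are made up
for illustration, the MIN_THREADS and MAX_THREADS values are quoted from
memory of kernel/fork.c, and the 16 KiB THREAD_SIZE used in the note
below is an assumption that holds for x86_64 but not everywhere.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed values, quoted from memory of kernel/fork.c; check your tree. */
#define MIN_THREADS 20
#define MAX_THREADS 0x3fffffff	/* FUTEX_TID_MASK */

/* Memory-derived ceiling: thread structures may use at most 1/8 of RAM. */
static uint64_t mem_limit(uint64_t ram_bytes, uint64_t thread_size)
{
	return ram_bytes / (thread_size * 8);
}

/* Model of the proc_dointvec_minmax() check: out-of-range writes fail. */
static int range_check(long val)
{
	return (val < MIN_THREADS || val > MAX_THREADS) ? -EINVAL : 0;
}

/*
 * Model of a write going through set_max_threads(): the value is
 * silently capped by the memory-derived ceiling.
 * Returns the stored value, or -EINVAL if the write is rejected.
 */
static long write_capped(long val, uint64_t ram_bytes, uint64_t thread_size)
{
	uint64_t limit;

	if (range_check(val))
		return -EINVAL;

	limit = mem_limit(ram_bytes, thread_size);
	if ((uint64_t)val > limit)
		val = (long)limit;
	if (val < MIN_THREADS)
		val = MIN_THREADS;

	return val;
}

/*
 * Model of a write after the partial revert: the admin's value is
 * stored as given once it passes the range check.
 */
static long write_direct(long val)
{
	if (range_check(val))
		return -EINVAL;

	return val;
}
```

With exactly 32 GiB of usable RAM and a 16 KiB THREAD_SIZE, mem_limit()
gives 262144, so write_capped(1000000, ...) stores 262144 while
write_direct(1000000) stores 1000000. The real figure on a 32 GiB
machine is a bit lower because not all of the installed memory is
counted.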
Eric
Thread overview: 15+ messages
2019-09-17 10:03 threads-max observe limits Michal Hocko
2019-09-17 15:28 ` Heinrich Schuchardt
2019-09-17 15:38 ` Michal Hocko
2019-09-17 17:26 ` Eric W. Biederman
2019-09-18 7:15 ` Michal Hocko
2019-09-19 7:59 ` Michal Hocko
2019-09-19 19:38 ` Andrew Morton
2019-09-19 19:33 ` Eric W. Biederman [this message]
2019-09-22 6:58 ` Michal Hocko
2019-09-22 15:31 ` Heinrich Schuchardt
2019-09-22 21:40 ` Eric W. Biederman
2019-09-22 21:24 ` Eric W. Biederman
2019-09-23 8:08 ` Michal Hocko
2019-09-23 21:23 ` Eric W. Biederman
2019-09-24 8:48 ` Michal Hocko