From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sha Zhengju
Subject: Re: [patch 3/5] mm, memcg: introduce own oom handler to iterate only over its own threads
Date: Thu, 12 Jul 2012 22:50:58 +0800
Message-ID: <4FFEE452.40300@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format="flowed"
Content-Transfer-Encoding: 7bit
Sender: cgroups-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: David Rientjes
Cc: Andrew Morton, KAMEZAWA Hiroyuki, Michal Hocko, Johannes Weiner,
 KOSAKI Motohiro, Minchan Kim, Oleg Nesterov,
 linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
 cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 06/30/2012 05:06 AM, David Rientjes wrote:
> The global oom killer is serialized by the zonelist being used in the
> page allocation. Concurrent oom kills are thus a rare event and only
> occur in systems using mempolicies and with a large number of nodes.
>
> Memory controller oom kills, however, can frequently be concurrent since
> there is no serialization once the oom killer is called for oom
> conditions in several different memcgs in parallel.
>
> This creates a massive contention on tasklist_lock since the oom killer
> requires the readside for the tasklist iteration.
> If several memcgs are calling the oom killer, this lock can be held for a
> substantial amount of time, especially if threads continue to enter it as
> other threads are exiting.
>
> Since the exit path grabs the writeside of the lock with irqs disabled in
> a few different places, this can cause a soft lockup on cpus as a result
> of tasklist_lock starvation.
>
> The kernel lacks unfair writelocks, and successful calls to the oom
> killer usually result in at least one thread entering the exit path, so
> an alternative solution is needed.
>
> This patch introduces a separate oom handler for memcgs so that they do
> not require tasklist_lock for as much time. Instead, it iterates only
> over the threads attached to the oom memcg and grabs a reference to the
> selected thread before calling oom_kill_process() to ensure it doesn't
> prematurely exit.
>
> This still requires tasklist_lock for the tasklist dump, iterating
> children of the selected process, and killing all other threads on the
> system sharing the same memory as the selected victim. So while this
> isn't a complete solution to tasklist_lock starvation, it significantly
> reduces the amount of time that it is held.
>

Looks good. You can add Reviewed-by: Sha Zhengju

Thanks,
Sha
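
For readers following the thread, the selection step described in the quoted changelog can be modelled roughly in userspace as below. This is only a sketch under stated assumptions, not the code from the patch: `Task`, `Memcg`, `select_victim()` and `oom_kill()` are hypothetical names, `rss_pages` stands in for the real badness score, and skipping already-exiting threads is a simplification. The point it illustrates is that the scan walks only the threads attached to the oom memcg and pins the chosen thread with a reference before the kill step runs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    """Hypothetical stand-in for a kernel thread (not the real task_struct)."""
    pid: int
    rss_pages: int          # toy badness score, assumed for illustration
    refcount: int = 1
    exiting: bool = False


@dataclass
class Memcg:
    """Hypothetical stand-in for a memory cgroup and its attached threads."""
    name: str
    tasks: List[Task] = field(default_factory=list)


def select_victim(memcg: Memcg) -> Optional[Task]:
    """Scan only the threads attached to `memcg` and pin the chosen one."""
    chosen = None
    for task in memcg.tasks:            # no system-wide tasklist walk here
        if task.exiting:
            continue                    # simplification: ignore exiting threads
        if chosen is None or task.rss_pages > chosen.rss_pages:
            chosen = task
    if chosen is not None:
        chosen.refcount += 1            # reference held so the victim cannot vanish
    return chosen


def oom_kill(memcg: Memcg) -> Optional[int]:
    """Pick a victim inside `memcg`, 'kill' it, and return its pid."""
    victim = select_victim(memcg)
    if victim is None:
        return None
    try:
        victim.exiting = True           # stands in for oom_kill_process()
        return victim.pid
    finally:
        victim.refcount -= 1            # drop the reference taken during selection
```

In this model, two memcgs hitting their limits at the same time each walk only their own `tasks` list, which is the property the changelog relies on to shorten the time tasklist_lock is held.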