Message-ID: <4FFEE452.40300@gmail.com>
Date: Thu, 12 Jul 2012 22:50:58 +0800
From: Sha Zhengju
Subject: Re: [patch 3/5] mm, memcg: introduce own oom handler to iterate only over its own threads
To: David Rientjes
Cc: Andrew Morton, KAMEZAWA Hiroyuki, Michal Hocko, Johannes Weiner, KOSAKI Motohiro, Minchan Kim, Oleg Nesterov, linux-mm@kvack.org, cgroups@vger.kernel.org

On 06/30/2012 05:06 AM, David Rientjes wrote:
> The global oom killer is serialized by the zonelist being used in the
> page allocation. Concurrent oom kills are thus a rare event and only
> occur in systems using mempolicies and with a large number of nodes.
>
> Memory controller oom kills, however, can frequently be concurrent since
> there is no serialization once the oom killer is called for oom
> conditions in several different memcgs in parallel.
>
> This creates a massive contention on tasklist_lock since the oom killer
> requires the readside for the tasklist iteration. If several memcgs are
> calling the oom killer, this lock can be held for a substantial amount of
> time, especially if threads continue to enter it as other threads are
> exiting.
>
> Since the exit path grabs the writeside of the lock with irqs disabled in
> a few different places, this can cause a soft lockup on cpus as a result
> of tasklist_lock starvation.
>
> The kernel lacks unfair writelocks, and successful calls to the oom
> killer usually result in at least one thread entering the exit path, so
> an alternative solution is needed.
>
> This patch introduces a separate oom handler for memcgs so that they do
> not require tasklist_lock for as much time. Instead, it iterates only
> over the threads attached to the oom memcg and grabs a reference to the
> selected thread before calling oom_kill_process() to ensure it doesn't
> prematurely exit.
>
> This still requires tasklist_lock for the tasklist dump, iterating
> children of the selected process, and killing all other threads on the
> system sharing the same memory as the selected victim. So while this
> isn't a complete solution to tasklist_lock starvation, it significantly
> reduces the amount of time that it is held.
>

Looks good.
You can add Reviewed-by: Sha Zhengju

Thanks,
Sha
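
Note for readers of the archive: the core idea in the quoted changelog (scan only the tasks attached to the oom memcg, and take a reference on the chosen victim before the scan ends) can be sketched in userspace C. This is not the patch's code. struct task, struct group, group_select_victim() and task_put() are hypothetical stand-ins, the badness field is an assumed score, and the small spinlock only models whatever protects the per-group iteration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a task; not a kernel type. */
struct task {
    int pid;
    long badness;               /* assumed score: higher is a better victim */
    atomic_int refcount;        /* models taking/dropping a task reference */
    struct task *next_in_group; /* link in the per-group list only */
};

/* Hypothetical stand-in for one memcg and the tasks attached to it. */
struct group {
    atomic_flag iter_lock;      /* models the per-group iteration protection */
    struct task *tasks;
};

static void group_iter_begin(struct group *g)
{
    /* Spin until the flag was previously clear, i.e. we own the lock. */
    while (atomic_flag_test_and_set_explicit(&g->iter_lock,
                                             memory_order_acquire))
        ;
}

static void group_iter_end(struct group *g)
{
    atomic_flag_clear_explicit(&g->iter_lock, memory_order_release);
}

/*
 * Walk only this group's tasks (never a global task list), pick the one
 * with the highest badness, and pin it with a reference before the
 * iteration ends so it cannot go away while the caller acts on it.
 * Returns NULL if the group has no tasks.
 */
struct task *group_select_victim(struct group *g)
{
    struct task *victim = NULL;

    assert(g != NULL);
    group_iter_begin(g);
    for (struct task *t = g->tasks; t != NULL; t = t->next_in_group) {
        if (victim == NULL || t->badness > victim->badness)
            victim = t;
    }
    if (victim != NULL)
        atomic_fetch_add(&victim->refcount, 1); /* pin before unlocking */
    group_iter_end(g);
    return victim;
}

/* Drop the reference taken by group_select_victim(). */
void task_put(struct task *t)
{
    atomic_fetch_sub(&t->refcount, 1);
}
```

A caller would act on the returned task and then call task_put(). In the kernel the corresponding step is oom_kill_process(), which per the changelog still needs tasklist_lock for the tasklist dump, the child iteration, and killing threads that share the victim's memory.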