From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 9 Oct 2018 09:50:15 +0200
From: Michal Hocko
To: Tetsuo Handa
Cc: ytk.lee@samsung.com, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org",
 Oleg Nesterov, David Rientjes, Vladimir Davydov, Andrew Morton,
 Linus Torvalds
Subject: Re: [PATCH] mm, oom_adj: avoid meaningless loop to find processes sharing mm
Message-ID: <20181009075015.GC8528@dhcp22.suse.cz>
References: <67eedc4c-7afa-e845-6c88-9716fd820de6@i-love.sakura.ne.jp>
 <20181008011931epcms1p82dd01b7e5c067ea99946418bc97de46a@epcms1p8>
 <20181008061407epcms1p519703ae6373a770160c8f912c7aa9521@epcms1p5>
 <20181008083855epcms1p20e691e5a001f3b94b267997c24e91128@epcms1p2>
 <20181009063541.GB8528@dhcp22.suse.cz>
In-Reply-To: <20181009063541.GB8528@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 09-10-18 08:35:41, Michal Hocko wrote:
> [I have only now noticed that the patch has been reposted]
>
> On Mon 08-10-18 18:27:39, Tetsuo Handa wrote:
> > On 2018/10/08 17:38, Yong-Taek Lee wrote:
[...]
> > > Thank you for your suggestion. But i think it would be better to seperate to 2 issues. How about think these
> > > issues separately because there are no dependency between race issue and my patch. As i already explained,
> > > for_each_process path is meaningless if there is only one thread group with many threads(mm_users > 1 but
> > > no other thread group sharing same mm). Do you have any other idea to avoid meaningless loop ?
> >
> > Yes. I suggest reverting commit 44a70adec910d692 ("mm, oom_adj: make sure processes
> > sharing mm have same view of oom_score_adj") and commit 97fd49c2355ffded ("mm, oom:
> > kill all tasks sharing the mm").
>
> This would require a lot of other work for something as border line as
> weird threading model like this. I will think about something more
> appropriate - e.g. we can take mmap_sem for read while doing this check
> and that should prevent from races with [v]fork.

Not really. We do not even take the mmap_sem when CLONE_VM. So this is
not the way. Doing a proper synchronization seems much harder. So let's
consider what is the worst case scenario. We would basically hit a race
window between copy_signal and copy_mm and the only relevant case would
be OOM_SCORE_ADJ_MIN which wouldn't propagate to the new "thread". OOM
killer could then pick up the "thread" and kill it along with the whole
process group sharing the mm.
Well, that is unfortunate indeed and it breaks the OOM_SCORE_ADJ_MIN
contract. There are basically two ways here
1) do not care and encourage users to use a saner way to set
   OOM_SCORE_ADJ_MIN because doing that externally is racy anyway e.g.
   setting it before [v]fork & exec. Btw. do we know about an actual user
   who would care?
2) add OOM_SCORE_ADJ_MIN and do not kill tasks sharing mm and do not reap
   the mm in the rare case of the race.

I would prefer the first, but if this race really has to be addressed then
option 2 sounds more reasonable than the wholesale revert.
-- 
Michal Hocko
SUSE Labs
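
For illustration only, option 1 above (setting the value before [v]fork &
exec rather than from an external writer) might look roughly like the
userspace sketch below. This is not code from the thread: the helper names
write_oom_score_adj() and spawn_protected() are hypothetical, the path is
passed in as a parameter (in real use it would be /proc/self/oom_score_adj),
and lowering the value is assumed to need CAP_SYS_RESOURCE.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed to mirror the kernel's OOM_SCORE_ADJ_MIN (-1000). */
#define OOM_SCORE_ADJ_MIN_VALUE (-1000)

/*
 * Hypothetical helper: write `value` to an oom_score_adj file.
 * `path` would normally be "/proc/self/oom_score_adj".
 * Returns 0 on success, -1 on failure.
 */
static int write_oom_score_adj(const char *path, int value)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%d\n", value) < 0) {
		fclose(f);
		return -1;
	}
	/* fclose() flushes the buffer, so a rejected write shows up here. */
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Hypothetical launcher: set the value in the caller *before* fork(),
 * so the child inherits it and nobody has to write to the child's
 * /proc entry while it is still being set up. Note that the caller
 * keeps the lowered value as well.
 * Returns the child's pid, or -1 on failure.
 */
static pid_t spawn_protected(const char *adj_path, char *const argv[])
{
	pid_t pid;

	if (write_oom_score_adj(adj_path, OOM_SCORE_ADJ_MIN_VALUE) != 0)
		return -1;

	pid = fork();
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);	/* exec failed */
	}
	return pid;
}
```

A launcher would call spawn_protected("/proc/self/oom_score_adj", argv) and
then wait for the returned pid. Because the value is already in place when
the child is created, nothing depends on winning the window between
copy_signal and copy_mm.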