From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757031AbbIUPfz (ORCPT <rfc822;w@1wt.eu>);
	Mon, 21 Sep 2015 11:35:55 -0400
Received: from mx1.redhat.com ([209.132.183.28]:45088 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1756425AbbIUPfy (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Mon, 21 Sep 2015 11:35:54 -0400
Date: Mon, 21 Sep 2015 17:32:52 +0200
From: Oleg Nesterov <oleg@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
        Kyle Walker <kwalker@redhat.com>, Christoph Lameter <cl@linux.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        David Rientjes <rientjes@google.com>,
        Johannes Weiner <hannes@cmpxchg.org>,
        Vladimir Davydov <vdavydov@parallels.com>,
        linux-mm <linux-mm@kvack.org>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        Stanislav Kozina <skozina@redhat.com>,
        Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Subject: Re: can't oom-kill zap the victim's memory?
Message-ID: <20150921153252.GA21988@redhat.com>
References: <1442512783-14719-1-git-send-email-kwalker@redhat.com> <20150919150316.GB31952@redhat.com> <CA+55aFwkvbMrGseOsZNaxgP3wzDoVjkGasBKFxpn07SaokvpXA@mail.gmail.com> <20150920125642.GA2104@redhat.com> <CA+55aFyajHq2W9HhJWbLASFkTx_kLSHtHuY6mDHKxmoW-LnVEw@mail.gmail.com> <20150921134414.GA15974@redhat.com> <20150921142423.GC19811@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150921142423.GC19811@dhcp22.suse.cz>
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/21, Michal Hocko wrote:
>
> On Mon 21-09-15 15:44:14, Oleg Nesterov wrote:
> [...]
> > So yes, in general oom_kill_process() can't call oom_unmap_func() directly.
> > That is why the patch uses queue_work(oom_unmap_func). The workqueue thread
> > takes mmap_sem and frees the memory allocated by user space.
>
> OK, this might have been a bit confusing. I didn't mean you cannot use
> mmap_sem directly from the workqueue context. You _can_ AFAICS. But I've
> mentioned that you _shouldn't_ use workqueue context in the first place
> because all the workers might be blocked on locks and new workers cannot
> be created due to memory pressure.

Yes, yes, and I already tried to comment on this part. We probably need a
dedicated kernel thread, but I still think (although I am not sure) that
the initial change can use a workqueue. In the likely case the
system_unbound_wq pool should have an idle thread; if not - OK, this change
won't help in that case. This is minor.

> So I think we probably need to do this in the OOM killer context (with
> try_lock)

Yes, we should try to do this in the OOM killer context, and in this case
(of course) we need trylock. Let me quote my previous email:

	And we want to avoid using workqueues when the caller can do this
	directly. And in this case we certainly need trylock. But this needs
	some refactoring: we do not want to do this under oom_lock, otoh it
	makes sense to do this from mark_oom_victim() if current && killed,
	and a lot more details.

and probably this is another reason why we need MMF_MEMDIE. But again,
I think the initial change should be simple.
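
To make the "trylock in the caller, otherwise defer" idea concrete, here is
a minimal userspace sketch. It is not kernel code and not the patch under
discussion: struct fake_mm, zap_user_memory() and oom_try_zap() are
hypothetical names, a plain pthread mutex stands in for mmap_sem, and the
"deferred" flag only models handing the work to a workqueue or a dedicated
thread.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the victim's mm; not a kernel structure. */
struct fake_mm {
	pthread_mutex_t lock;	/* models mmap_sem (simplified to a mutex) */
	size_t user_pages;	/* models user memory that could be zapped */
};

/* Models the unmap step. Must be called with mm->lock held. */
static size_t zap_user_memory(struct fake_mm *mm)
{
	size_t freed = mm->user_pages;

	mm->user_pages = 0;
	return freed;
}

/*
 * Try to free the victim's memory from the caller's context.
 * Never blocks: if the lock is contended, set *deferred so the caller
 * can hand the work off (assumption: to a worker or dedicated thread)
 * and return 0. Otherwise return the number of pages freed.
 */
static size_t oom_try_zap(struct fake_mm *mm, bool *deferred)
{
	size_t freed;

	if (pthread_mutex_trylock(&mm->lock) != 0) {
		*deferred = true;
		return 0;
	}

	freed = zap_user_memory(mm);
	pthread_mutex_unlock(&mm->lock);
	return freed;
}
```

The point of the sketch is only that the direct path never sleeps on the
lock; whatever handles the deferred case still has the "no idle worker"
problem mentioned above.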

Oleg.