From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755645AbbITTHk (ORCPT ); Sun, 20 Sep 2015 15:07:40 -0400
Received: from mail-pa0-f50.google.com ([209.85.220.50]:35807 "EHLO
	mail-pa0-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1755561AbbITTHh (ORCPT );
	Sun, 20 Sep 2015 15:07:37 -0400
Subject: Re: can't oom-kill zap the victim's memory?
To: Linus Torvalds , Oleg Nesterov
References: <1442512783-14719-1-git-send-email-kwalker@redhat.com>
	<20150919150316.GB31952@redhat.com>
	<20150920125642.GA2104@redhat.com>
Cc: Kyle Walker , Christoph Lameter , Michal Hocko , Andrew Morton ,
	David Rientjes , Johannes Weiner , Vladimir Davydov , linux-mm ,
	Linux Kernel Mailing List , Stanislav Kozina , Tetsuo Handa
From: Raymond Jennings
Message-ID: <55FF03F4.6000904@gmail.com>
Date: Sun, 20 Sep 2015 12:07:32 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
	Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/20/15 11:05, Linus Torvalds wrote:
> On Sun, Sep 20, 2015 at 5:56 AM, Oleg Nesterov wrote:
>> In this case the workqueue thread will block.
> What workqueue thread?
>
>    pagefault_out_of_memory ->
>       out_of_memory ->
>          oom_kill_process
>
> as far as I can tell, this can be called by any task. Now, that
> pagefault case should only happen when the page fault comes from user
> space, but we also have
>
>    __alloc_pages_slowpath ->
>       __alloc_pages_may_oom ->
>          out_of_memory ->
>             oom_kill_process
>
> which can be called from just about any context (but atomic
> allocations will never get here, so it can schedule etc).

I think in this case the oom killer should just slap a SIGKILL on the
task and then back out, and whatever needed the memory should just wait
patiently for the sacrificial lamb to commit seppuku.

Which, btw, we should IMO encourage ASAP in the context of the lamb:
anything potentially locky or semaphory should check whether the task
in question has a fatal signal pending and, if so, drop everything and
run like hell so that the task can cough up any locks or semaphores it
holds.

> So what's your point? Explain again just how do you guarantee that you
> can take the mmap_sem.
>
>                Linus

Also, I observed that a task in the middle of dumping core doesn't
respond to signals while it's dumping, and I would guess that might be
the case even if the task receives a SIGKILL from the OOM handler.

Just a potential observation.
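
To make the "check for a fatal signal and bail out" idea above a bit more concrete, here is a minimal userspace sketch in C. It is only a model, not kernel code: `struct model_task`, `struct model_lock`, `model_trylock()`, `model_unlock()`, `model_lock_killable()` and the `max_spins` bound are all hypothetical names made up for illustration. The kernel's own killable primitives, such as `mutex_lock_killable()` together with `fatal_signal_pending()`, are built around the same idea, but nothing below is taken from their implementation.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_task {
	bool fatal_signal_pending;	/* set once SIGKILL has been queued */
};

struct model_lock {
	bool held;
};

/* Take the lock if it is free; never waits. */
static bool model_trylock(struct model_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void model_unlock(struct model_lock *l)
{
	l->held = false;
}

/*
 * Try to take the lock, but give up with -EINTR as soon as the task has
 * a fatal signal pending, so the caller can unwind and release whatever
 * it already holds. max_spins only bounds the loop so that this model
 * terminates; it returns -EAGAIN when the bound is hit. A real
 * implementation would sleep instead of spinning.
 */
static int model_lock_killable(struct model_task *t, struct model_lock *l,
			       int max_spins)
{
	for (int i = 0; i < max_spins; i++) {
		if (t->fatal_signal_pending)
			return -EINTR;
		if (model_trylock(l))
			return 0;
	}
	return -EAGAIN;
}
```

In this model the signal check comes before the lock attempt, so a task that has already been killed never picks up a new lock, even a free one. That is a design choice of the sketch; the important part is that a caller seeing -EINTR is expected to back out and drop its other locks rather than retry.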