From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752649AbcEVVRn (ORCPT ); Sun, 22 May 2016 17:17:43 -0400
Received: from mx1.redhat.com ([209.132.183.28]:46656 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752575AbcEVVRm (ORCPT ); Sun, 22 May 2016 17:17:42 -0400
Date: Sun, 22 May 2016 23:17:36 +0200
From: Oleg Nesterov
To: Tetsuo Handa
Cc: Andrew Morton, Andrea Arcangeli, Mel Gorman, Michal Hocko,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: zone_reclaimable() leads to livelock in __alloc_pages_slowpath()
Message-ID: <20160522211736.GA3161@redhat.com>
References: <20160520202817.GA22201@redhat.com>
	<237e1113-fca7-51c7-1271-fb48398fd599@I-love.SAKURA.ne.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <237e1113-fca7-51c7-1271-fb48398fd599@I-love.SAKURA.ne.jp>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.25]); Sun, 22 May 2016 21:17:41 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/21, Tetsuo Handa wrote:
>
> On 2016/05/21 5:28, Oleg Nesterov wrote:
> > It spins in __alloc_pages_slowpath() forever, __alloc_pages_may_oom() is never
> > called, it doesn't react to SIGKILL, etc.
> >
> > This is because zone_reclaimable() is always true in shrink_zones(), and the
> > problem goes away if I comment out this code
> >
> >         if (global_reclaim(sc) &&
> >             !reclaimable && zone_reclaimable(zone))
> >                 reclaimable = true;
> >
> > in shrink_zones() which otherwise returns this "true" every time, and thus
> > __alloc_pages_slowpath() always sees did_some_progress != 0.
> >
>
> Michal Hocko's OOM detection rework patchset that removes that code was sent
> to Linus 4 hours ago. ( https://marc.info/?l=linux-mm-commits&m=146378862415399 )
> Please wait for a few days and try reproducing using linux.git .

I guess you mean

	http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/mm/vmscan.c?id=fa8c5f033ebb43f925d68c29d297bafd36af7114
	"mm, oom: rework oom detection"...

Yes, thanks a lot Tetsuo, it should fix the problem.

Cough, I can't resist: I hate Michal^W the fact this was already fixed ;) Because
it took me some time to understand what's going on; initially it looked like some
subtle and hard-to-reproduce bug in userfaultfd.

Thanks!

Oleg.
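
For readers of the archive, the livelock described in the quoted report can be
sketched as a toy user-space model. This is not the kernel code: toy_zone,
toy_shrink_zones() and toy_slowpath() are hypothetical names invented for
illustration, and the only assumption taken from the thread is that a zone
which merely looks reclaimable gets reported as progress even when no page was
freed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a zone; not the kernel's struct zone. */
struct toy_zone {
	bool looks_reclaimable;	/* what zone_reclaimable() would report */
};

/*
 * Toy version of the check quoted in the thread.  No page is ever freed
 * here; the function only reports whether the zone still looks
 * reclaimable.  Passing count_reclaimable = false models commenting the
 * check out.
 */
static bool toy_shrink_zones(const struct toy_zone *zone,
			     bool count_reclaimable)
{
	bool reclaimable = false;

	if (count_reclaimable && zone->looks_reclaimable)
		reclaimable = true;

	return reclaimable;
}

/*
 * Toy retry loop.  Returns the attempt number on which the allocator
 * would stop retrying and fall through to the OOM path, or -1 if it is
 * still retrying after max_attempts iterations (the livelock case).
 */
static int toy_slowpath(const struct toy_zone *zone, bool count_reclaimable,
			int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		unsigned long nr_reclaimed = 0;	/* nothing is freed */
		unsigned long did_some_progress = nr_reclaimed;

		/* "Reclaimable" alone is reported as progress. */
		if (!did_some_progress &&
		    toy_shrink_zones(zone, count_reclaimable))
			did_some_progress = 1;

		if (!did_some_progress)
			return attempt;	/* the OOM path would run here */
	}

	return -1;
}
```

With looks_reclaimable set, toy_slowpath(&zone, true, N) never reaches the OOM
path for any N, while toy_slowpath(&zone, false, N) reaches it on the first
attempt, which matches the behaviour reported with the check commented out.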