From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1751840Ab3LJWDf (ORCPT ); Tue, 10 Dec 2013 17:03:35 -0500
Received: from mx1.redhat.com ([209.132.183.28]:16666 "EHLO mx1.redhat.com"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1750976Ab3LJWDc (ORCPT ); Tue, 10 Dec 2013 17:03:32 -0500
Date: Tue, 10 Dec 2013 17:02:51 -0500
From: Dave Jones
To: Linus Torvalds
Cc: Oleg Nesterov, Thomas Gleixner, Darren Hart, Andrea Arcangeli,
    Linux Kernel Mailing List, Peter Zijlstra, Mel Gorman
Subject: Re: process 'stuck' at exit.
Message-ID: <20131210220251.GB5050@redhat.com>
Mail-Followup-To: Dave Jones, Linus Torvalds, Oleg Nesterov,
    Thomas Gleixner, Darren Hart, Andrea Arcangeli,
    Linux Kernel Mailing List, Peter Zijlstra, Mel Gorman
References: <20131210154724.GA30020@redhat.com>
    <20131210203559.GA1209@redhat.com>
    <20131210204925.GB27373@redhat.com>
    <20131210213431.GA6342@redhat.com>
    <20131210214143.GG27373@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 10, 2013 at 01:57:49PM -0800, Linus Torvalds wrote:
> On Tue, Dec 10, 2013 at 1:41 PM, Dave Jones wrote:
> >
> > http://codemonkey.org.uk/junk/trace
>
> Hmm. Ok, so something is calling [__]get_user_pages_fast() and
> put_page() in a loop, but the trace doesn't show what that "something"
> is, because it is itself not ever called.
>
> However, that pattern does seem to imply that the loop is in
> get_futex_key(), because all the other loops I see seem to be calling
> other things as well.
>
> And the __get_user_pages_fast() call implies that it's the THP case
> that triggers the "unlikely(PageTail(page))" case. And anyway,
> otherwise we'd see lock_page()/unlock_page() too.
>
> So it looks like __get_user_pages_fast() fails, and keeps failing.
> Andrea, this is your code, any ideas? Commit a5b338f2b0b1f ("thp:
> update futex compound knowledge") to be exact.

So, one reason this might only be showing up now is that in the last
week I added support to trinity to explicitly do huge page mmaps in the
children. Before that, it only ever did them with MAP_SHARED in the main
pid, and every child then inherited them on fork().

	Dave