From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756250AbZENUjF (ORCPT ); Thu, 14 May 2009 16:39:05 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753776AbZENUiw (ORCPT ); Thu, 14 May 2009 16:38:52 -0400
Received: from e34.co.us.ibm.com ([32.97.110.152]:56988 "EHLO e34.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753688AbZENUiv (ORCPT ); Thu, 14 May 2009 16:38:51 -0400
Subject: Re: Misleading OOM messages
From: Dave Hansen
To: Christoph Lameter
Cc: Pavel Machek , David Rientjes , Andrew Morton , Greg Kroah-Hartman , Nick Piggin , Mel Gorman , Peter Ziljstra , San Mehat , Arve Hjønnevåg , linux-kernel@vger.kernel.org
In-Reply-To:
References: <20090514092909.GG1365@ucw.cz>
Content-Type: text/plain
Date: Thu, 14 May 2009 13:38:39 -0700
Message-Id: <1242333519.15391.210.camel@nimitz>
Mime-Version: 1.0
X-Mailer: Evolution 2.26.1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2009-05-14 at 15:46 -0400, Christoph Lameter wrote:
> On Thu, 14 May 2009, Pavel Machek wrote:
>
> > It can be 'low on memory' if you play with mlock() a bit.
>
> But that is a reclaim failure because of mlocking pages.
>
> > It is out of memory if you run out of swap (or have no swap to begin with).
>
> That is a swap config issue.

The other thing that I find confusing myself is that we're almost never
at '0 pages free' (which is what I intrinsically think) when we OOM.
We're just under the watermarks and not apparently making any progress.
But I don't think we want to say "under the watermarks" in our error
message.

> > I believe message is often correct. What message would you suggest?
>
> "Failure to reclaim memory"

The problem I have with that is that it also doesn't tell the whole story.
It's the end symptom *just* before we OOM, but it doesn't characterize
the whole thing very well. It's like saying the Titanic sank because
"too much water onboard." :) It's true, but it concentrates a bit too
much on the end state.

To me, it's a question of how much information we can get out in a line
or two on the console. Is something like this better?

"Unable to satisfy memory allocation request and not making progress
reclaiming from other sources."

We can't exactly go spitting out an entire tutorial in dmesg, but could
we stick a short URL in there? Like http://linux-mm.org/OOM perhaps?

-- Dave
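
P.S. To make the "almost never at '0 pages free'" point concrete, here is
a rough userspace sketch that compares each zone's free page count with
its min watermark as reported in /proc/zoneinfo. The function names and
the sample numbers are made up for illustration, and the parser only
assumes the usual "Node N, zone NAME" / "pages free" / "min" layout of
that file. Being under the min watermark is only half of the condition
discussed above; the sketch says nothing about whether reclaim is making
progress.

```python
import re


def parse_zoneinfo(text):
    """Return one dict per zone with node, zone name, free pages and min watermark."""
    zones = []
    current = None
    for line in text.splitlines():
        header = re.match(r"Node\s+(\d+),\s+zone\s+(\S+)", line)
        if header:
            current = {"node": int(header.group(1)), "zone": header.group(2)}
            zones.append(current)
            continue
        if current is None:
            continue
        free = re.match(r"\s+pages free\s+(\d+)", line)
        if free:
            current["free"] = int(free.group(1))
            continue
        wmark = re.match(r"\s+min\s+(\d+)", line)
        # Only take the first "min" line seen for each zone.
        if wmark and "min" not in current:
            current["min"] = int(wmark.group(1))
    return zones


def below_min_watermark(zones):
    """Zones whose free page count is under the min watermark."""
    return [z for z in zones
            if "free" in z and "min" in z and z["free"] < z["min"]]


# Made-up sample in the /proc/zoneinfo layout (values are illustrative only).
SAMPLE = """\
Node 0, zone      DMA
  pages free     3968
        min      67
        low      83
        high     100
Node 0, zone   Normal
  pages free     1200
        min      1500
        low      1875
        high     2250
"""

if __name__ == "__main__":
    try:
        with open("/proc/zoneinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on systems without /proc
    for z in below_min_watermark(parse_zoneinfo(text)):
        print(f"node {z['node']} zone {z['zone']}: "
              f"{z['free']} pages free, min watermark {z['min']}")
```

In the sample, the Normal zone still has 1200 pages free, which is
nowhere near zero, yet it is already under its min watermark of 1500.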