From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail172.messagelabs.com (mail172.messagelabs.com [216.82.254.3])
    by kanga.kvack.org (Postfix) with SMTP id 0B0118D0015
    for ; Thu, 28 Oct 2010 09:36:14 -0400 (EDT)
Received: by gxk3 with SMTP id 3so887803gxk.14
    for ; Thu, 28 Oct 2010 06:36:04 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To:
References:
From: Tharindu Rukshan Bamunuarachchi
Date: Thu, 28 Oct 2010 19:05:33 +0530
Message-ID:
Subject: Re: TMPFS Maximum File Size
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
To: Christoph Lameter
Cc: Hugh Dickins, linux-mm@kvack.org
List-ID:

Dear Hugh/Christoph/All,

I have done further testing to isolate the issue & found following.

1. At the moment .... Issue only occurs with IBM hardware. (x3550/x3650).
   It did not occur in HP Nehalem or Sun X4600.
   I have only IBM/HP/Sun boxes.

2. Issue is not visible with vanilla kernel 2.6.32 or 2.6.36.
   SLES 11 is running with 2.6.27-45.

I think I should turn to IBM/Novell for further help.

I still wonder why this happens only with IBM+SLES 11 kernel ?
Same HW works with later kernels ?
__
Tharindu R Bamunuarachchi.

On Thu, Oct 28, 2010 at 1:38 AM, Christoph Lameter wrote:
>
> On Tue, 26 Oct 2010, Tharindu Rukshan Bamunuarachchi wrote:
>
> > I have two node NUMA system and 100G TMPFS mount.
> >
> > 1. When "dd" running freely (without CPU affinity) all memory pages
> > were allocated from NODE 0 and then from NODE 1.
> >
> > 2. When "dd" running bound (using taskset) to CPU core in NODE 1 ....
> >     All memory pages were allocated from NODE 1.
> >     BUT machine stopped responding after exhausting NODE 1.
> >     No memory pages were allocated from NODE 0.
>
> Hmmm... Strange it should fall back like under #1. Can you tell us where
> it hung?
>
> > Do you have any comment / suggestions to try out ?
> > Why "dd" cannot allocate memory from NODE 0 when it is running bound
> > to NODE 1 CPU core ?
>
> Definitely looks like a bug somewhere. TMPFS policies are not correctly
> falling over to more distant zones?
>
> > Core was generated by `DataWareHouseEngine Surv:1:1:DataWareHouseEngine:1'.
> > Program terminated with signal 11, Segmentation fault.
> > #0  0x00007fd924b0cf7c in write () from /lib64/libc.so.6
>
> Hmmm... Kernel oops? Or a segfault because of an invalid reference by your
> app?
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .