From: Zlatko Calusic
Organization: Iskon Internet d.d.
Date: Fri, 28 Dec 2012 01:42:41 +0100
To: sedat.dilek@gmail.com
CC: LKML , linux-mm
Subject: Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000500
Message-ID: <50DCEB01.3060903@iskon.hr>
References: <50DCDC21.6080303@iskon.hr> <50DCDEE5.9000700@iskon.hr> <50DCE8C8.8050103@iskon.hr>
X-Mailing-List: linux-kernel@vger.kernel.org

On 28.12.2012 01:37, Sedat Dilek wrote:
> On Fri, Dec 28, 2012 at 1:33 AM, Zlatko Calusic wrote:
>> On 28.12.2012 01:24, Sedat Dilek wrote:
>>>
>>> On Fri, Dec 28, 2012 at 12:51 AM, Zlatko Calusic wrote:
>>>>
>>>> On 28.12.2012 00:42, Sedat Dilek wrote:
>>>>>
>>>>> On Fri, Dec 28, 2012 at 12:39 AM, Zlatko Calusic wrote:
>>>>>>
>>>>>> On 28.12.2012 00:30, Sedat Dilek wrote:
>>>>>>>
>>>>>>> Hi Zlatko,
>>>>>>>
>>>>>>> I am not sure if I hit the same problem as described in this thread.
>>>>>>>
>>>>>>> Under heavy load, while building a customized toolchain for the Freetz
>>>>>>> router project I got a BUG || NULL pointer dereference || kswapd ||
>>>>>>> zone_balanced || pgdat_balanced() etc. (details see my screenshot).
>>>>>>>
>>>>>>> I will try your patch from [1] ***only*** on top of my last
>>>>>>> Linux-v3.8-rc1 GIT setup (post-v3.8-rc1 mainline + some net-fixes).
>>>>>>>
>>>>>>
>>>>>> Yes, that's the same bug. It should be fixed with my latest patch, so
>>>>>> I'd appreciate you testing it, to be on the safe side this time. There
>>>>>> should be no difference if you apply it to anything newer than 3.8-rc1,
>>>>>> so go for it. Thanks!
>>>>>>
>>>>>
>>>>> Not sure how I can really reproduce this bug as one build worked fine
>>>>> within my last v3.8-rc1 kernel.
>>>>> I increased the parallel-make-jobs-number from "4" to "8" to stress a
>>>>> bit harder.
>>>>> Just building right now... and will report.
>>>>>
>>>>> If you have any test-case (script or whatever), please let me/us know.
>>>>>
>>>>
>>>> Unfortunately not, I haven't reproduced it yet on my machines. But it
>>>> seems that bug will hit only under heavy memory pressure. When close to
>>>> OOM, or possibly with lots of writing to disk. It's also possible that
>>>> fragmentation of memory zones could provoke it, that means testing it
>>>> for a longer time.
>>>>
>>>
>>> I tested successfully by doing simultaneously...
>>> - building Freetz with 8 parallel make-jobs
>>> - building Linux GIT with 1 make-job
>>> - 9 tabs open in firefox
>>> - In one tab I ran YouTube music video
>>> - etc.
>>>
>>> I am reading [1] and [2] where another user reports success by reverting
>>> this...
>>>
>>> commit cda73a10eb3f493871ed39f468db50a65ebeddce
>>> "mm: do not sleep in balance_pgdat if there's no i/o congestion"
>>>
>>> BTW, this machine has also 4GiB RAM (Ubuntu/precise AMD64).
>>>
>>> Feel free to add a "Reported-by/Tested-by" if you think this is a
>>> positive report.
>>>
>>
>> Thanks for the testing! And keep running it in case something interesting
>> pops up. ;)
>>
>> No need to revert cda73a10eb because it fixes another bug. And the patch
>> you're now running fixes the new bug I introduced with a combination of my
>> latest 2 patches. Nah, it gets complicated... :)
>>
>> But, at least I found the culprit and as soon as Linus applies the fix,
>> everything will be hunky dory again, at least on this front. :P
>>
>
> I am not subscribed to LKML and linux-mm...
> Do you have a patch with a proper subject and descriptive text? URL?
>

Soon to follow. I'd appreciate Zhouping Liu testing it too, though.

-- 
Zlatko