From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1754548Ab3KVDxd (ORCPT ); Thu, 21 Nov 2013 22:53:33 -0500
Received: from fgwmail5.fujitsu.co.jp ([192.51.44.35]:48538 "EHLO
 fgwmail5.fujitsu.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753041Ab3KVDx2 (ORCPT );
 Thu, 21 Nov 2013 22:53:28 -0500
X-SecurityPolicyCheck: OK by SHieldMailChecker v1.8.9
X-SHieldMailCheckerPolicyVersion: FJ-ISEC-20120718-2
Message-ID: <528ED04E.4060606@jp.fujitsu.com>
Date: Fri, 22 Nov 2013 12:32:30 +0900
From: HATAYAMA Daisuke
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:24.0) Gecko/20100101 Thunderbird/24.1.1
MIME-Version: 1.0
To: Yinghai Lu
CC: Vivek Goyal , jerry.hoemann@hp.com, "H. Peter Anvin" ,
 Matthew Garrett , Rob Landley , Thomas Gleixner , Ingo Molnar ,
 the arch/x86 maintainers , Matt Fleming , Andrew Morton ,
 Borislav Petkov , "linux-doc@vger.kernel.org" ,
 Linux Kernel Mailing List , linux-efi@vger.kernel.org,
 Pekka Enberg , Ingo Molnar , Atsushi Kumagai
Subject: Re: [RFC v2 0/2] Early use of boot service memory
References: <1385067686-73500-1-git-send-email-jerry.hoemann@hp.com>
 <20131121230744.GA31592@srcf.ucam.org> <528E94D1.2050809@zytor.com>
 <20131121233705.GA32121@srcf.ucam.org> <528EAF99.1010503@zytor.com>
 <20131122012524.GA5627@anatevka.fc.hp.com>
 <20131122022957.GE31921@redhat.com>
In-Reply-To: <20131122022957.GE31921@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

(2013/11/22 11:29), Vivek Goyal wrote:
> [..]
>>> makedumpfile going to cyclic buffer has helped out greatly, but on
>>> our new systems we're still looking at 512 MB crash kernels.
>>
>> I tried 6TiB system/16 PCIE cards, kdump on RHEL 6.5 beta still does not work.
>> still get OOM.
>
> What crashkernel= option you are using?
>
> Interesting.
> So something is consuming lot of memory. How about setting
> "debug_mem_level 1" in /etc/kdump.conf and regenerate initrd and retry.
> This time it should output some memory usage info at various points
> during boot and that can give us some idea who is consuming how much
> memory.
>
> If some module are consuming lot of memory, then you can try "blacklist"
> option in /etc/kdump.conf to disable those.
>
> If it is not modules, then it will concern me because then either
> kernel is consuming too much memory (which it should not) or for
> some reason makedumpfile cyclic mode did not work for you properly.
>
> While you are re-testing, how about also increasing debug message
> level of makedumpfile. makedumpfile developers should be able to
> have a look at that. In /etc/kdump.conf, specify.
>
> core_collector makedumpfile -c --message-level 31 -d 31
>
> If message level 31 turns out to be too verbose, reduce it as per
> makedumpfile man page.
>

The following configuration is more flexible:

  core_collector false
  default shell

With this, crash dump collection fails and an emergency shell shows up,
where you can type a variety of commands.

If the 2nd kernel keeps failing even with this configuration, it's likely
that the kernel side already causes the OOM you're facing before it
reaches the invocation of the command specified by the core_collector
directive.

--
Thanks.
HATAYAMA, Daisuke