From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752388AbbEHFMM (ORCPT <rfc822;w@1wt.eu>);
	Fri, 8 May 2015 01:12:12 -0400
Received: from mail-wi0-f178.google.com ([209.85.212.178]:36866 "EHLO
	mail-wi0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751137AbbEHFMJ (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Fri, 8 May 2015 01:12:09 -0400
Message-ID: <1431061931.3168.41.camel@gmail.com>
Subject: Re: [KERNEL BUG] do_timer/tick_handover_do_timer 3.10.17
From: Mike Galbraith <umgwanakikbuti@gmail.com>
To: "Oza (Pawandeep) Oza" <oza@broadcom.com>
Cc: pawandeep oza <oza.contri.linux.kernel@gmail.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        malayasen rout <malayasen.rout@gmail.com>
Date: Fri, 08 May 2015 07:12:11 +0200
In-Reply-To: <5C6899BCED92C94EBDCC00F80838E3D52113AB15@SJEXCHMB06.corp.ad.broadcom.com>
References: <CAA-nRyrW8ehQ4CphjSQGmZi0XJepHO6O2Z7O0j=4-XtxmQR4WQ@mail.gmail.com>
	 <1430968960.2955.45.camel@gmail.com>
	 <5C6899BCED92C94EBDCC00F80838E3D52113A83F@SJEXCHMB06.corp.ad.broadcom.com>
	 <1430975311.2955.73.camel@gmail.com>
	 <5C6899BCED92C94EBDCC00F80838E3D52113A87B@SJEXCHMB06.corp.ad.broadcom.com>
	 <1430978071.2955.96.camel@gmail.com>
	 <5C6899BCED92C94EBDCC00F80838E3D52113A8D3@SJEXCHMB06.corp.ad.broadcom.com>
	 <1430981678.2955.121.camel@gmail.com>
	 <5C6899BCED92C94EBDCC00F80838E3D52113A908@SJEXCHMB06.corp.ad.broadcom.com>
	 <1430987391.2955.163.camel@gmail.com>
	  <5C6899BCED92C94EBDCC00F80838E3D52113AB15@SJEXCHMB06.corp.ad.broadcom.com>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.12.11 
Mime-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2015-05-08 at 04:16 +0000, Oza (Pawandeep) Oza wrote:
> So Mike, is this reason strong enough for you ?

Nope.  I think you did the right thing in removing your dependency on
jiffies reliability in a dying box.  You don't have to convince me of
anything though; CC the timer subsystem maintainer and see what he says.

> I understand your point: solve the BUG, and I do tend to agree with you.
> 
> But by design and implementation, the BUG() is just a beginning of the end for dying kernel.
> And what happens in between this 'the beginning' and 'the end' is not less important. 
> (because say,  on our platform we want to get clean RAMDUMP to analyze what happened, and for that we want to get clean reboot)

I don't see anybody else having any trouble getting crash dumps.  I
spent yet another long day just yesterday, rummaging through one.

> Also,
> If somebody's design is to legally Crash the kernel (e.g. where kernel is actually not faulty).
> Then, I do expect that tick/timekeeping framework do its job as long as it can do, and it should do, because kernel is not faulty.
> But in this case it doesn’t handover jiffies incrementing job sanely.

It seems odd to me to use BUG() for what you appear to be using it for...
not that I know exactly what that is, mind you, but when you said that
you crash the kernel when some other gizmo in your box has a problem, my
head tilted to the side - surely there's a more controlled response
possible than poking the big red self-destruct button ;-)

	-Mike