From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S967018Ab0GSVl1 (ORCPT ); Mon, 19 Jul 2010 17:41:27 -0400
Received: from relay3.sgi.com ([192.48.152.1]:40073 "EHLO relay.sgi.com"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S966876Ab0GSViy (ORCPT ); Mon, 19 Jul 2010 17:38:54 -0400
Message-Id: <20100719213853.942990184@sgi.com>
User-Agent: quilt/0.47-1
Date: Mon, 19 Jul 2010 16:32:35 -0500
From: steiner@sgi.com
To: akpm@osdl.org, linux-kernel@vger.kernel.org
Subject: [Patch 17/25] GRU - no panic on gru malfunction
References: <20100719213651.362618144@sgi.com>
Content-Disposition: inline; filename=uv_gru_no_panic_gru_failure
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Jack Steiner

If the GRU malfunctions, print a message instead of panicking the system.
This simplifies debugging since some of the debug tools can be used on a
live system.

Flush the cache on instruction timeouts in case the malfunction is related
to a coherency issue (never seen this but I'm paranoid).

Signed-off-by: Jack Steiner

---
 drivers/misc/sgi-gru/gruhandles.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux/drivers/misc/sgi-gru/gruhandles.c
===================================================================
--- linux.orig/drivers/misc/sgi-gru/gruhandles.c	2010-06-09 08:11:43.724081727 -0500
+++ linux/drivers/misc/sgi-gru/gruhandles.c	2010-06-09 08:11:46.697237522 -0500
@@ -71,7 +71,7 @@ static void report_instruction_timeout(v
 	else if (TYPE_IS(TFH, goff))
 		id = "TFH";
 
-	panic(KERN_ALERT "GRU %p (%s) is malfunctioning\n", h, id);
+	printk(KERN_ALERT "GRU:%d %p (%s) is malfunctioning\n", smp_processor_id(), h, id);
 }
 
 static int wait_instruction_complete(void *h, enum mcs_op opc)
@@ -85,6 +85,7 @@ static int wait_instruction_complete(voi
 		if (status != CCHSTATUS_ACTIVE)
 			break;
 		if (GRU_OPERATION_TIMEOUT < (get_cycles() - start_time)) {
+			gru_flush_cache(h);
 			report_instruction_timeout(h);
 			start_time = get_cycles();
 		}
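
For readers who want to see the resulting control flow outside the kernel, the following is a rough userspace sketch of the wait loop after this patch: poll the status, and on timeout flush, report, and restart the timer rather than halting. It is not the driver code. `struct fake_handle`, `read_status()`, `flush_cache_sim()`, `report_timeout()`, `get_cycles_sim()`, and the `OPERATION_TIMEOUT` value are all hypothetical stand-ins invented for illustration, and the "cycle counter" simply advances by one per poll.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the driver's symbols; not the real GRU API. */
enum { STATUS_ACTIVE = 0, STATUS_IDLE = 1 };
#define OPERATION_TIMEOUT 1000UL	/* assumed value, in simulated cycles */

struct fake_handle {
	unsigned long polls_until_idle;	/* active polls before "completion" */
	unsigned long flushes;		/* number of simulated cache flushes */
	unsigned long reports;		/* number of timeout messages printed */
};

static unsigned long fake_cycles;	/* simulated counter: +1 per poll */

static unsigned long get_cycles_sim(void)
{
	return fake_cycles;
}

/* Each status read costs one simulated cycle. */
static int read_status(struct fake_handle *h)
{
	fake_cycles++;
	if (h->polls_until_idle == 0)
		return STATUS_IDLE;
	h->polls_until_idle--;
	return STATUS_ACTIVE;
}

static void flush_cache_sim(struct fake_handle *h)
{
	h->flushes++;
}

/* Report and return, so the caller keeps running (no abort/panic). */
static void report_timeout(struct fake_handle *h)
{
	h->reports++;
	fprintf(stderr, "handle %p is malfunctioning\n", (void *)h);
}

static int wait_complete(struct fake_handle *h)
{
	unsigned long start_time = get_cycles_sim();
	int status;

	for (;;) {
		status = read_status(h);
		if (status != STATUS_ACTIVE)
			break;
		if (OPERATION_TIMEOUT < (get_cycles_sim() - start_time)) {
			flush_cache_sim(h);	/* flush first, then report */
			report_timeout(h);
			start_time = get_cycles_sim();	/* re-arm the timer */
		}
	}
	return status;
}
```

Because the timer is re-armed after each report, a handle that stays active keeps producing one message per timeout interval instead of stopping the system, which is what leaves it available for live debugging.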