Date: Thu, 8 Sep 2016 19:01:09 +0200
From: David Madore
To: Linux Kernel mailing-list
Subject: iTCO_wdt watchdog on Asus P10S-WS motherboard FREEZES MOTHERBOARD COMPLETELY
Message-ID: <20160908170109.GA927@achernar.madore.org>

TL;DR: the iTCO_wdt watchdog on the Asus P10S-WS motherboard, instead
of rebooting the machine, places the motherboard in a completely
nonfunctional state, from which it can be revived only by a hard power
cycle. I suspect this is a BIOS bug: seeking advice on how/where to
report this, and what to do generally. Maybe Linux can work around it?

Dear list,

I have an Asus P10S-WS motherboard (Intel C236 chipset). I have been
trying to get the iTCO_wdt hardware watchdog to work (I have used this
driver successfully with similar Intel-chipset-based Asus motherboards
before, and I know it to work reliably). I am using Linux 4.7.3. I
trigger a reboot by killing (with kill -9) the wd_keepalive daemon
once it has opened the watchdog device.

Sadly, it appears that on this motherboard, the watchdog does not
reboot the machine (or at least, does not successfully reboot it).
Instead, the machine enters a "frozen" state (fans spinning, screen
black, all peripherals unresponsive) from which it cannot be woken up
by pressing the reset button, or even the power button twice (the
first press does turn the machine off, but it returns to the same
nonfunctional state after power on). Instead, power has to be cut
completely, at the power supply level.

In this nonfunctional state, the Asus POST status display shows the
number "62", which according to the motherboard manual is the code for
"installation of the PCH runtime services" (I have no idea what that
means).

I suspect that this is a BIOS ^W UEFI bug and in no way Linux's fault.
It could also be a hardware problem, a chipset bug, or something else.
And even if it is a firmware bug, it is conceivable that there is a
way to work around the problem from Linux.

So I ask for guidance from the wisdom of this list:

* Is there something Linux can do about the problem?

* Is there a chance some kernel developer knows someone at Asus and
  can bring this problem to their attention?

* Can someone report success using the iTCO_wdt watchdog with other
  motherboards having the same Intel C236 chipset? (Note: for it to
  work, the i2c_smbus module needs to be loaded: it took me a long
  time to figure that out.)

* Is all hope lost for my motherboard? (I badly need a hardware
  watchdog: if there is no way to get it to work on this motherboard,
  I will need to buy a new one.)

Any suggestions are welcome (or even words of comfort :-).

-- 
     David A. Madore
   ( http://www.madore.org/~david/ )
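
For anyone who wants to reproduce the test without wd_keepalive, the
sketch below is one way to do it. It is not what was run for this
report; it only mimics the effect of killing the daemon with kill -9.
It assumes the watchdog is exposed as /dev/watchdog and that it is run
as root; the names pet() and arm_then_abandon() are made up for
illustration. WARNING: on a board where the watchdog works, this
forces a hard reset once the timeout expires.

```python
#!/usr/bin/env python3
"""Arm a Linux watchdog device, pet it a few times, then stop petting it."""
import os
import sys
import time


def pet(fd: int) -> None:
    """Send one keepalive: any write to the watchdog device restarts its timer."""
    os.write(fd, b"\0")


def arm_then_abandon(path: str, pets: int, interval: float) -> None:
    """Open the watchdog device, pet it `pets` times, then hold it without petting."""
    fd = os.open(path, os.O_WRONLY)  # opening the device starts the countdown
    for _ in range(pets):
        pet(fd)
        time.sleep(interval)
    # Deliberately no further keepalives and no magic-close character ('V'):
    # this imitates a keepalive daemon that died while holding the device.
    while True:
        time.sleep(60)


if __name__ == "__main__":
    # The default path is an assumption; pass another device node as argv[1].
    device = sys.argv[1] if len(sys.argv) > 1 else "/dev/watchdog"
    arm_then_abandon(device, pets=3, interval=1.0)
```

If the watchdog behaves correctly, the machine should reset shortly
after the last keepalive; on the board described above, the expected
outcome would instead be the frozen state with POST code "62".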