Date: Mon, 10 Oct 2022 15:24:13 -0400
From: Tom Rini
To: Pali Rohár
Cc: Stefan Roese, u-boot@lists.denx.de
Subject: Re: Broken watchdog in u-boot master branch
Message-ID: <20221010192413.GS2020586@bill-the-cat>
In-Reply-To: <20221010181425.GR2020586@bill-the-cat>

On Mon, Oct 10, 2022 at 02:14:25PM -0400, Tom Rini wrote:
> On Mon, Oct 10, 2022 at 08:01:23PM +0200, Pali Rohár wrote:
> > On Monday 10 October 2022 13:56:10 Tom Rini wrote:
> > > On Mon, Oct 10, 2022 at 07:44:05PM +0200, Pali Rohár wrote:
> > > > On Monday 10 October 2022 13:40:38 Tom Rini wrote:
> > > > > On Mon, Oct 10, 2022 at 07:22:56PM +0200, Pali Rohár wrote:
> > > > > > On Monday 10 October 2022 12:28:18 Tom Rini wrote:
> > > > > > > On Sun, Oct 09, 2022 at 09:12:25PM +0200, Pali Rohár wrote:
> > > > > > > > Hello! Watchdog code seems to be broken in u-boot master branch.
> > > > > > > > On Nokia N900 I'm getting following message in qemu:
> > > > > > > >
> > > > > > > > cyclic function rx51_watchdog took too long: 10000us vs 1000us max, disabling
> > > > > > > >
> > > > > > > > Seems that watchdog core code is not prepared for "slower" watchdogs
> > > > > > > > which communicate over slower i2c bus, like it is the case for N900.
> > > > > > > >
> > > > > > > > Disabling slower watchdog is a bad idea as it would result in reboot
> > > > > > > > loop instead of slower - but working code.
> > > > > > >
> > > > > > > So, looking at this in more detail, we have
> > > > > > > CONFIG_CYCLIC_MAX_CPU_TIME_US as a configuration option (which is where
> > > > > > > the too long comes from). And picking a random CI run:
> > > > > > > https://source.denx.de/u-boot/u-boot/-/jobs/511177
> > > > > > > I do see we hit this in CI once, but not every time, QEMU runs here. Is
> > > > > > > that the max time is configurable enough to satisfy your concerns here?
> > > > > >
> > > > > > It is needed to investigate, how to _properly_ fix this issue, not just
> > > > > > workarounded it. Probably other boards may be affected.
> > > > >
> > > > > So it's the cyclic watchdog code, which we merged as early as possible
> > > > > that's the reason here. And it was merged as early as we could to see if
> > > > > there's problems. Are there problems? We're seeing "system too slow,
> > > > > disabling" on QEMU, sometimes, and the value of too slow is
> > > > > configurable. I know you reported other problems with n900 HW, so we
> > > > > can't see if it's failing there
> > > >
> > > > I was tested it with older asm code (as described in that other email,
> > > > via git checkout commit -- file) on n900 HW and watchdog problem is
> > > > there too. Phone reboots in about 20 seconds.
> > > > But as I do not have
> > > > serial console, I do not know if that "disabling" message is printed
> > > > there too (but I guess it is).
> > >
> > > I think I'm a bit baffled at this point, honestly. The watchdog timeout
> > > is 60 seconds. If you're confident in it being about 20 seconds,
> > > consistently, changing WATCHDOG_TIMEOUT_MSECS to say 10000 (so, 10
> > > seconds) should let you see if U-Boot has configured the watchdog and
> > > it's being tripped, or if it's still at the prior stage value.
> >
> > $ git grep CONFIG_WATCHDOG_TIMEOUT_MSECS configs/nokia_rx51_defconfig
> > configs/nokia_rx51_defconfig:CONFIG_WATCHDOG_TIMEOUT_MSECS=31000
> >
> > Also watchdog is started by NOLO (which loads and execute U-Boot) so
> > there can be some smaller timeout.
> >
> > So I have feeling that on the real HW is same issue. cyclic code
> > disabled watchdog kicking and then watchdog restarted phone.
> >
> > I do not remember exact time (if it is 20s or 25s; I have not measured
> > it precisely), but it sounds plausible.
>
> OK, so what happens if you increase CONFIG_CYCLIC_MAX_CPU_TIME_US to
> something very high (so we should still enable the watchdog and
> configure the timeout) along with CONFIG_WATCHDOG_TIMEOUT_MSECS being
> high too (so if we can't service it in time really it's so long as to be
> noticeable) ? Or CONFIG_WATCHDOG_TIMEOUT_MSECS to something much lower
> (so that if the device is resetting quicker we're crashing elsewhere) ?
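
For reference, the effect of raising CONFIG_CYCLIC_MAX_CPU_TIME_US in the
experiment above can be sketched with a small C model. This is only an
illustration of the budget check implied by the "took too long ... disabling"
message, not U-Boot's actual cyclic code: struct cyclic_fn, cyclic_run_once()
and the simulated cost_us field are hypothetical names, and the 10000us and
1000us figures are simply the ones from the quoted log line.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified model of one registered cyclic function. */
struct cyclic_fn {
	const char *name;
	uint64_t cost_us;	/* simulated CPU time of one invocation */
	bool enabled;
	unsigned int runs;	/* how many times the function was invoked */
};

/*
 * Invoke the function once (simulated) and compare its cost with the
 * budget. A function that exceeds the budget is disabled and is never
 * invoked again, which for a watchdog means it stops being serviced.
 */
static void cyclic_run_once(struct cyclic_fn *fn, uint64_t max_cpu_time_us)
{
	assert(fn != NULL);

	if (!fn->enabled)
		return;

	fn->runs++;

	if (fn->cost_us > max_cpu_time_us) {
		printf("cyclic function %s took too long: %" PRIu64
		       "us vs %" PRIu64 "us max, disabling\n",
		       fn->name, fn->cost_us, max_cpu_time_us);
		fn->enabled = false;
	}
}
```

In this model, a 10000us callback checked against a 1000us budget is disabled
after its first run, while the same callback checked against a larger budget
(say 50000us, an arbitrary value) keeps running on every pass.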

OK, on my beagleboard xM with a small change:

diff --git a/drivers/watchdog/omap_wdt.c b/drivers/watchdog/omap_wdt.c
index ca2bc7cfb59e..f0e57b4f7286 100644
--- a/drivers/watchdog/omap_wdt.c
+++ b/drivers/watchdog/omap_wdt.c
@@ -39,7 +39,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include
 #include

On my beagleboard xM I now see:

U-Boot SPL 2022.10-00459-g73e741b8ee46-dirty (Oct 10 2022 - 15:18:38 -0400)
Trying to boot from MMC1

U-Boot 2022.10-00459-g73e741b8ee46-dirty (Oct 10 2022 - 15:18:38 -0400)

OMAP3630/3730-GP ES1.1, CPU-OPP2, L3-200MHz, Max CPU Clock 800 MHz
Model: TI OMAP3 BeagleBoard
OMAP3 Beagle board + LPDDR/NAND
I2C: ready
DRAM: 256 MiB
Core: 45 devices, 19 uclasses, devicetree: separate
WDT: Started wdt@48314000 without servicing (60s timeout)
NAND: 0 MiB
MMC: OMAP SD/MMC: 0
Loading Environment from NAND... *** Warning - readenv() failed, using default environment

Beagle xM Rev A/B
No EEPROM on expansion board
OMAP die ID: 6e5e00211ff00000015739eb08031024
Net: No ethernet found.
Hit any key to stop autoboot: 0

So, this is as close as I can get to testing on n900 HW, and it's fine here.

-- 
Tom