References: <20230920230257.GA280837@bhelgaas> <625cdd6c55994bf3a50efd8f79680029@AcuMS.aculab.com>
In-Reply-To: <625cdd6c55994bf3a50efd8f79680029@AcuMS.aculab.com>
From: "Oliver O'Halloran"
Date: Mon, 25 Sep 2023 13:54:08 +1000
Subject: Re: Questions: Should kernel panic when PCIe fatal error occurs?
To: David Laight
Cc: "james.morse@arm.com", "Rafael J. Wysocki", Linux PCI,
 Jonathan Cameron, "mahesh@linux.ibm.com",
 "linux-kernel@vger.kernel.org", "linux-acpi@vger.kernel.org",
 Bjorn Helgaas, Shuai Xue, Baolin Wang, "gregkh@linuxfoundation.org",
 "bhelgaas@google.com", "bp@alien8.de",
 "linuxppc-dev@lists.ozlabs.org", "lenb@kernel.org"

On Fri, Sep 22, 2023 at 8:23 AM David Laight wrote:
>
> > It would be nice if they worked the same, but I suspect that vendors
> > may rely on the fact that CPER_SEV_FATAL forces a restart/panic as
> > part of their system integrity story.
>
> The file system errors created by a panic (especially an NMI panic)
> could easily be more problematic than a failed PCIe data transfer.
> Even a read that returned ~0u - which can be checked for.
>
> Panicking a system that is converting TDM telephony to RTP for the
> 911 emergency service because a PCIe cable/riser connecting one of the
> TDM boards has become loose doesn't seem ideal.

For kernel-native AER, the default reaction to errors is
reset-and-reinit, which probably isn't much better for your case.
Sounds like you would want a knob to suppress everything except error
reporting so you can handle it in userspace?

> (Or because the TDM board's FPGA has decided it isn't going to respond
> to any accesses until the BARs are set up again...)
>
> The system can carry on with some TDM connections disabled - but that
> is ok because they are all duplicated in case a cable gets cut.

Well that's a relief :)

Oliver
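
For reference, the "read that returned ~0u - which can be checked for" idea from the quoted text could look roughly like the sketch below. This is plain C rather than kernel code, and every name in it (reg_read_fn, checked_read, the two fake readers) is hypothetical and only there for illustration; in a real driver the read would be something like readl() on a mapped BAR. The one fact it relies on is that a failed PCIe read typically completes with all bits set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor type; stands in for an MMIO read in a real driver. */
typedef uint32_t (*reg_read_fn)(const void *ctx, uint32_t offset);

enum read_status { READ_OK, READ_SUSPECT };

/* A failed PCIe read typically completes with every bit set (~0u). */
static bool is_all_ones(uint32_t val)
{
	return val == UINT32_MAX;
}

/*
 * Read one register and flag an all-ones result as suspect.
 * A register may legitimately contain 0xFFFFFFFF, so READ_SUSPECT only
 * means "confirm before trusting", not "the device is definitely gone".
 */
static enum read_status checked_read(reg_read_fn rd, const void *ctx,
				     uint32_t offset, uint32_t *out)
{
	uint32_t val = rd(ctx, offset);

	*out = val;
	return is_all_ones(val) ? READ_SUSPECT : READ_OK;
}

/* Illustrative stand-in for a device that answers normally. */
static uint32_t fake_healthy_read(const void *ctx, uint32_t offset)
{
	(void)ctx;
	return 0x12340000u | (offset & 0xFFFFu);
}

/* Illustrative stand-in for a device that has dropped off the link. */
static uint32_t fake_dead_read(const void *ctx, uint32_t offset)
{
	(void)ctx;
	(void)offset;
	return UINT32_MAX;
}
```

A caller would pass its own accessor to checked_read() and, on READ_SUSPECT, re-read a register that can never be all ones before taking the affected TDM connections out of service, instead of treating the event as fatal for the whole system.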