From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shiyang Ruan
Date: Mon, 2 Sep 2024 22:19:25 +0800
Subject: Re: [PATCH v4 2/2] cxl: avoid duplicated report from MCE & device
To: Jonathan Cameron
Cc: linux-cxl@vger.kernel.org, linux-edac@vger.kernel.org,
 linux-mm@kvack.org, dan.j.williams@intel.com, vishal.l.verma@intel.com,
 alison.schofield@intel.com, bp@alien8.de, dave.jiang@intel.com,
 dave@stgolabs.net, ira.weiny@intel.com, james.morse@arm.com,
 linmiaohe@huawei.com, mchehab@kernel.org, nao.horiguchi@gmail.com,
 rric@kernel.org, tony.luck@intel.com
References: <20240808151328.707869-1-ruansy.fnst@fujitsu.com>
 <20240808151328.707869-3-ruansy.fnst@fujitsu.com>
 <20240827165255.00003184@Huawei.com>
In-Reply-To: <20240827165255.00003184@Huawei.com>
X-Mailing-List: linux-edac@vger.kernel.org
Precedence: bulk
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 2024/8/27 23:52, Jonathan Cameron wrote:
> On Thu, 8 Aug 2024 23:13:28 +0800
> Shiyang Ruan wrote:
>
>> Since CXL device is a memory device, while CPU is consuming a poison
>> page of CXL device, it always triggers a MCE (via interrupt #18) and
>> calls memory_failure() to handle POISON page, no matter which-First path
>> is configured. CXL device could also find and report the POISON, kernel
>> now not only traces but also calls memory_failure() to handle it, which
>> is marked as "NEW" in the figure below.
>> ```
>> 1.  MCE (interrupt #18, while CPU consuming POISON)
>>      -> do_machine_check()
>>        -> mce_log()
>>          -> notify chain (x86_mce_decoder_chain)
>>            -> memory_failure() <---------------------------- EXISTS
>> 2.a FW-First (optional, CXL device proactively find&report)
>>      -> CXL device -> Firmware
>>       -> OS: ACPI->APEI->GHES->CPER -> CXL driver -> trace
>>                                                  \-> memory_failure()
>>                                                      ^----- NEW
>> 2.b OS-First (optional, CXL device proactively find&report)
>>      -> CXL device -> MSI
>>       -> OS: CXL driver -> trace
>>                        \-> memory_failure()
>>                            ^------------------------------- NEW
>> ```
>>
>> But in this way, the memory_failure() could be called twice or even at
>> same time, as is shown in the figure above: (1.) and (2.a or 2.b),
>> before the POISON page is cleared. memory_failure() has its own mutex
>> lock so it actually won't be called at same time and the later call
>> could be avoided because HWPoison bit has been set. However, assume
>> such a scenario, "CXL device reports POISON error" triggers 1st call,
>> user see it from log and want to clear the poison by executing `cxl
>> clear-poison` command, and at the same time, a process tries to access
>> this POISON page, which triggers MCE (it's the 2nd call).
>
> Attempting to clear poison in a page that is online seems unwise.
> Does that ever make sense today?

To be honest, I am not sure about this. Even if the error reported by
the CXL device is recoverable, would we never reuse the page?

>
>> Since there
>> is no lock between the 2nd call with clearing poison operation, race
>> condition may happen, which may cause HWPoison bit of the page in an
>> unknown state.
>
> As long as that state is always wrong in the sense we think it's poisoned
> when it isn't we don't care.

The 2nd memory_failure() call needs this state to decide whether to
continue processing or return.

>>
>> Thus, we have to avoid the 2nd call.
>> This patch[2] introduces a new
>> notifier_block into `x86_mce_decoder_chain` and a POISON cache list, to
>> stop the 2nd call of memory_failure(). It checks whether the current
>> poison page has been reported (if yes, stop the notifier chain, don't
>> call the following memory_failure() to report again).
>>
>
> If we do want to do this, it belongs in the generic code, not arch specific
> part. Can we do similar in memory failure?

Yes, I saw the build error. Will fix this.

>
> To RAS reviewers, this isn't a new problem unique to CXL. Does a solution
> like this make sense in practice, or are we fine to always let two reports
> for the same error get handled?
>
>
> Jonathan
>

--
Thanks,
Ruan.
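
For readers following the thread, here is a minimal userspace sketch of the de-duplication idea described in the quoted commit message: remember which poisoned pages have already been reported, and skip the second memory_failure() call for the same page. It is not the patch itself. Every name in it (`poison_cache`, `poison_first_report()`, `cxl_report_poison()`, `mce_report_poison()`, `mock_memory_failure()`, the slot count) is hypothetical, the real memory_failure() is replaced by a counter, and locking is left out, so it only illustrates the bookkeeping and not the race discussed above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POISON_CACHE_SLOTS 64 /* hypothetical capacity, for illustration only */

/* Hypothetical cache of page frame numbers already reported as poisoned. */
static uint64_t poison_cache[POISON_CACHE_SLOTS];
static size_t poison_cache_len;

/* Counts how often the stand-in for memory_failure() was invoked. */
static unsigned int memory_failure_calls;

/* Stand-in for the kernel's memory_failure(); it only counts calls. */
static void mock_memory_failure(uint64_t pfn)
{
    (void)pfn;
    memory_failure_calls++;
}

/*
 * Return true if this is the first report for pfn and record it.
 * Return false if pfn was already reported.
 * If the cache is full the page is not recorded and true is returned:
 * in this sketch a duplicate report is preferred over a lost one.
 */
static bool poison_first_report(uint64_t pfn)
{
    for (size_t i = 0; i < poison_cache_len; i++) {
        if (poison_cache[i] == pfn)
            return false;
    }
    if (poison_cache_len < POISON_CACHE_SLOTS)
        poison_cache[poison_cache_len++] = pfn;
    return true;
}

/* Paths 2.a / 2.b in the figure: the device reports the poison itself. */
static void cxl_report_poison(uint64_t pfn)
{
    if (poison_first_report(pfn))
        mock_memory_failure(pfn);
}

/*
 * Path 1 in the figure: the CPU consumed the poison and raised an MCE.
 * Returns true when the (simulated) notifier chain should stop because
 * the page was already reported, false when handling went ahead.
 */
static bool mce_report_poison(uint64_t pfn)
{
    if (!poison_first_report(pfn))
        return true; /* already reported: skip the 2nd memory_failure() */
    mock_memory_failure(pfn);
    return false;
}
```

Calling `cxl_report_poison()` and then `mce_report_poison()` for the same page invokes the stand-in handler once. A page first seen through the MCE path is still handled normally. A kernel version would also need locking around the cache and a way to drop entries once poison is cleared, both of which this sketch omits.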