From: Fan Ni
Date: Thu, 8 Aug 2024 11:28:05 -0700
To: Shiyang Ruan
Cc: qemu-devel@nongnu.org, linux-cxl@vger.kernel.org,
 linux-edac@vger.kernel.org, linux-mm@kvack.org, dan.j.williams@intel.com,
 vishal.l.verma@intel.com, Jonathan.Cameron@huawei.com,
 alison.schofield@intel.com, bp@alien8.de, dave.jiang@intel.com,
 dave@stgolabs.net, ira.weiny@intel.com, james.morse@arm.com,
 linmiaohe@huawei.com, mchehab@kernel.org, nao.horiguchi@gmail.com,
 rric@kernel.org, tony.luck@intel.com
Subject: Re: [PATCH v4 1/2] cxl/core: introduce device reporting poison hanlding
References: <20240808151328.707869-1-ruansy.fnst@fujitsu.com>
 <20240808151328.707869-2-ruansy.fnst@fujitsu.com>
In-Reply-To: <20240808151328.707869-2-ruansy.fnst@fujitsu.com>

On Thu, Aug 08, 2024 at 11:13:27PM +0800, Shiyang Ruan wrote:
> CXL device can find&report memory problems, even before MCE is detected
> by CPU. AFAIK, the current kernel only traces POISON error event
> from FW-First/OS-First path, but it doesn't handle them, neither
> notify processes who are using the POISON page like MCE does.
>
> Thus, user have to read logs from trace and find out which device
> reported the error and which applications are affected. That is not
> an easy work and cannot be handled in time. Thus, it is needed to add
> the feature to make the work done automatically and quickly. Once CXL
> device reports the POISON error (via FW-First/OS-First), kernel
> handles it immediately, similar to the flow when a MCE is triggered.
>
> The current call trace of error reporting&handling looks like this:
> ```
> 1.  MCE (interrupt #18, while CPU consuming POISON)
>      -> do_machine_check()
>        -> mce_log()
>          -> notify chain (x86_mce_decoder_chain)
>            -> memory_failure()
>
> 2.a FW-First (optional, CXL device proactively find&report)
>      -> CXL device -> Firmware
>        -> OS: ACPI->APEI->GHES->CPER -> CXL driver -> trace
>                                                   \-> memory_failure()
>                                                       ^----- ADD
> 2.b OS-First (optional, CXL device proactively find&report)
>      -> CXL device -> MSI
>        -> OS: CXL driver -> trace
>                         \-> memory_failure()
>                             ^------------------------------- ADD
> ```
> This patch adds calling memory_failure() while CXL device reporting
> error is received, marked as "ADD" in figure above.
>
> Signed-off-by: Shiyang Ruan
> ---
>  drivers/cxl/core/mbox.c   | 75 ++++++++++++++++++++++++++++++++-------
>  drivers/cxl/cxlmem.h      |  8 ++---
>  drivers/cxl/pci.c         |  4 +--
>  include/linux/cxl-event.h | 16 ++++++++-
>  4 files changed, 83 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index e5cdeafdf76e..0cb6ef2e6600 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -849,10 +849,55 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -			    enum cxl_event_log_type type,
> -			    enum cxl_event_type event_type,
> -			    const uuid_t *uuid, union cxl_event *evt)
> +static void cxl_report_poison(struct cxl_memdev *cxlmd, u64 hpa)
> +{
> +	unsigned long pfn = PHYS_PFN(hpa);
> +
> +	memory_failure_queue(pfn, 0);
> +}
> +
> +static void cxl_event_handle_general_media(struct cxl_memdev *cxlmd,
> +					   enum cxl_event_log_type type,
> +					   u64 hpa,
> +					   struct cxl_event_gen_media *rec)
> +{
> +	if (type == CXL_EVENT_TYPE_FAIL) {
> +		switch (rec->media_hdr.transaction_type) {
> +		case CXL_EVENT_TRANSACTION_READ:
> +		case CXL_EVENT_TRANSACTION_WRITE:
> +		case CXL_EVENT_TRANSACTION_SCAN_MEDIA:
> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
> +			cxl_report_poison(cxlmd, hpa);
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +}
> +
> +static void cxl_event_handle_dram(struct cxl_memdev *cxlmd,
> +				  enum cxl_event_log_type type,
> +				  u64 hpa,
> +				  struct cxl_event_dram *rec)
> +{
> +	if (type == CXL_EVENT_TYPE_FAIL) {
> +		switch (rec->media_hdr.transaction_type) {
> +		case CXL_EVENT_TRANSACTION_READ:
> +		case CXL_EVENT_TRANSACTION_WRITE:
> +		case CXL_EVENT_TRANSACTION_SCAN_MEDIA:
> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
> +			cxl_report_poison(cxlmd, hpa);
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +}
> +
> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +			     enum cxl_event_log_type type,
> +			     enum cxl_event_type event_type,
> +			     const uuid_t *uuid, union cxl_event *evt)
>  {
>  	if (event_type == CXL_CPER_EVENT_MEM_MODULE) {
>  		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
> @@ -880,18 +925,22 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>  		if (cxlr)
>  			hpa = cxl_dpa_to_hpa(cxlr, cxlmd, dpa);
>
> -		if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
> +		if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>  			trace_cxl_general_media(cxlmd, type, cxlr, hpa,
>  						&evt->gen_media);
> -		else if (event_type == CXL_CPER_EVENT_DRAM)
> +			cxl_event_handle_general_media(cxlmd, type, hpa,
> +						&evt->gen_media);
> +		} else if (event_type == CXL_CPER_EVENT_DRAM) {
>  			trace_cxl_dram(cxlmd, type, cxlr, hpa, &evt->dram);
> +			cxl_event_handle_dram(cxlmd, type, hpa, &evt->dram);

Would it make sense to move the trace calls into cxl_event_handle_dram()
and cxl_event_handle_general_media(), so that only the handle_* calls
remain here?

> +		}
>  	}
>  }
> -EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
> +EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
>
> -static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -				     enum cxl_event_log_type type,
> -				     struct cxl_event_record_raw *record)
> +static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +				      enum cxl_event_log_type type,
> +				      struct cxl_event_record_raw *record)
>  {
>  	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
>  	const uuid_t *uuid = &record->id;
> @@ -903,7 +952,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>  	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
>  		ev_type = CXL_CPER_EVENT_MEM_MODULE;
>
> -	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
> +	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
>  }
>
>  static int cxl_clear_event_record(struct cxl_memdev_state *mds,
> @@ -1012,8 +1061,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
>  			break;
>
>  		for (i = 0; i < nr_rec; i++)
> -			__cxl_event_trace_record(cxlmd, type,
> -						 &payload->records[i]);
> +			__cxl_event_handle_record(cxlmd, type,
> +						  &payload->records[i]);
>
>  		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
>  			trace_cxl_overflow(cxlmd, type, payload);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index afb53d058d62..5c4810dcbdeb 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -826,10 +826,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>  void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>  				  unsigned long *cmds);
>  void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -			    enum cxl_event_log_type type,
> -			    enum cxl_event_type event_type,
> -			    const uuid_t *uuid, union cxl_event *evt);
> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +			     enum cxl_event_log_type type,
> +			     enum cxl_event_type event_type,
> +			     const uuid_t *uuid, union cxl_event *evt);
>  int cxl_set_timestamp(struct cxl_memdev_state *mds);
>  int cxl_poison_state_init(struct cxl_memdev_state *mds);
>  int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 4be35dc22202..6e65ca89f666 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -1029,8 +1029,8 @@ static void cxl_handle_cper_event(enum cxl_event_type ev_type,
>  	hdr_flags = get_unaligned_le24(rec->event.generic.hdr.flags);
>  	log_type = FIELD_GET(CXL_EVENT_HDR_FLAGS_REC_SEVERITY, hdr_flags);
>
> -	cxl_event_trace_record(cxlds->cxlmd, log_type, ev_type,
> -			       &uuid_null, &rec->event);
> +	cxl_event_handle_record(cxlds->cxlmd, log_type, ev_type,
> +				&uuid_null, &rec->event);
>  }
>
>  static void cxl_cper_work_fn(struct work_struct *work)
> diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
> index 0bea1afbd747..be4342a2b597 100644
> --- a/include/linux/cxl-event.h
> +++ b/include/linux/cxl-event.h
> @@ -7,6 +7,20 @@
>  #include
>  #include
>
> +/*
> + * Event transaction type
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43

Here and below, please update the specification references to CXL 3.1.

Fan

> + */
> +enum cxl_event_transaction_type {
> +	CXL_EVENT_TRANSACTION_UNKNOWN = 0X00,
> +	CXL_EVENT_TRANSACTION_READ,
> +	CXL_EVENT_TRANSACTION_WRITE,
> +	CXL_EVENT_TRANSACTION_SCAN_MEDIA,
> +	CXL_EVENT_TRANSACTION_INJECT_POISON,
> +	CXL_EVENT_TRANSACTION_MEDIA_SCRUB,
> +	CXL_EVENT_TRANSACTION_MEDIA_MANAGEMENT,
> +};
> +
>  /*
>   * Common Event Record Format
>   * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
> @@ -26,7 +40,7 @@ struct cxl_event_media_hdr {
>  	__le64 phys_addr;
>  	u8 descriptor;
>  	u8 type;
> -	u8 transaction_type;
> +	u8 transaction_type;	/* enum cxl_event_transaction_type */
>  	/*
>  	 * The meaning of Validity Flags from bit 2 is
>  	 * different across DRAM and General Media records
> -- 
> 2.34.1
>
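
To make the question about the trace calls a bit more concrete, below is a
rough userspace sketch of the shape I have in mind. It is not kernel code:
the enum names, the two counters, trace_stub(), report_poison_stub(),
txn_reports_poison() and handle_media_event() are all hypothetical stand-ins
that I made up for illustration. The only things taken from the patch are the
order of the transaction types and the FAIL-severity check. The sketch also
folds the switch that cxl_event_handle_general_media() and
cxl_event_handle_dram() currently duplicate into one helper, which is an
extra assumption on my side rather than something the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; ordering mirrors the enum in the quoted patch. */
enum txn_type {
	TXN_UNKNOWN = 0x00,
	TXN_READ,
	TXN_WRITE,
	TXN_SCAN_MEDIA,
	TXN_INJECT_POISON,
	TXN_MEDIA_SCRUB,
	TXN_MEDIA_MANAGEMENT,
};

/* Hypothetical stand-in for the event log severity. */
enum log_type { LOG_INFO, LOG_WARN, LOG_FAIL, LOG_FATAL };

static int nr_traced;   /* how many events were traced */
static int nr_reported; /* how many events were reported as poison */

/* Stub standing in for trace_cxl_general_media()/trace_cxl_dram(). */
static void trace_stub(uint64_t hpa)
{
	nr_traced++;
	printf("trace: hpa=0x%llx\n", (unsigned long long)hpa);
}

/* Stub standing in for cxl_report_poison(). */
static void report_poison_stub(uint64_t hpa)
{
	(void)hpa;
	nr_reported++;
}

/* The transaction-type check that both handlers in the patch repeat. */
static bool txn_reports_poison(uint8_t txn)
{
	switch (txn) {
	case TXN_READ:
	case TXN_WRITE:
	case TXN_SCAN_MEDIA:
	case TXN_INJECT_POISON:
		return true;
	default:
		return false;
	}
}

/* Trace first, then act; the caller makes exactly one call per event. */
static void handle_media_event(enum log_type type, uint64_t hpa, uint8_t txn)
{
	trace_stub(hpa);
	if (type == LOG_FAIL && txn_reports_poison(txn))
		report_poison_stub(hpa);
}

/* Small walk-through of the three interesting cases. */
static void sketch_selftest(void)
{
	nr_traced = nr_reported = 0;
	handle_media_event(LOG_FAIL, 0x1000, TXN_READ);        /* traced + reported */
	handle_media_event(LOG_FAIL, 0x2000, TXN_MEDIA_SCRUB); /* traced only */
	handle_media_event(LOG_WARN, 0x3000, TXN_READ);        /* traced only */
	assert(nr_traced == 3);
	assert(nr_reported == 1);
}
```

With this shape, every event is still traced regardless of severity, and the
caller only needs a single handle call per record type.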