From: Jonathan Cameron via <qemu-devel@nongnu.org>
To: <nifan.cxl@gmail.com>
Cc: <qemu-devel@nongnu.org>, <linux-cxl@vger.kernel.org>,
<ira.weiny@intel.com>, <dan.j.williams@intel.com>,
<a.manzanares@samsung.com>, <dave@stgolabs.net>,
<nmtadam.samsung@gmail.com>, Fan Ni <fan.ni@samsung.com>
Subject: Re: [QEMU RFC] hw/mem/cxl_type3: add guard to avoid event log overflow during a DC extent add/release request
Date: Mon, 14 Oct 2024 12:23:22 +0100 [thread overview]
Message-ID: <20241014122322.00001ad4@Huawei.com> (raw)
In-Reply-To: <20241011202929.11611-2-nifan.cxl@gmail.com>
On Fri, 11 Oct 2024 13:24:50 -0700
nifan.cxl@gmail.com wrote:
> From: Fan Ni <fan.ni@samsung.com>
>
> One DC extent add/release request can take multiple DC extents.
> For each extent in the request, one DC event record will be generated and
> inserted into the event log. All the event records for the request will be
> grouped with the More flag (see CXL spec r3.1, Table 8-168 and 8-170).
> If an overflow happens during the process, the yet-to-be-inserted records
> are lost, leaving the device having notified the host of only part of the
> extents involved, while the host never surfaces the extents it did receive
> and keeps waiting for the remaining ones.
Interesting corner. For other 'events' an overflow is natural because
they can be outside the control of the device, and the artificial limit
exists to exercise the overflow handling in those cases. For this one I'd
expect the device to push back on the fabric management commands, or to
handle the event log filling up so that an overflow never happens.
>
> Add a check in qmp_cxl_process_dynamic_capacity_prescriptive to ensure
> the event log does not overflow during the process.
>
> Currently we check the number of extents involved against the event
> overflow threshold; do we need to tighten the check and compare against
> the remaining slots available in the event log?
Yes. I think we need to prevent other outstanding events from causing us
trouble (a toy sketch of that tighter check is at the end of this mail).
Is it useful to support the case where we have more than one group of
extents outstanding? If not, we could simply fail the add whenever that
happens, as sketched just below. Maybe that is a reasonable stopgap until
we have a reason to care about that case. We probably start caring once we
have the FM-API hooked up to this and want to test more advanced fabric
management, or to poke a corner of the kernel code perhaps.
From a 'would a device be within spec if it did this' point of view the
answer may be yes, but that doesn't mean Linux is going to support such a
device (at least not until we know one really exists). Ira, what do you
think about this corner case? Maybe detect it and scream if we aren't
already?
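
If we go that simple route, something along the lines of the stand-alone
toy below might be enough. To be clear, this is not QEMU code: the struct,
field and function names are invented purely to illustrate rejecting a new
request while a previous More-flag group is still sitting in the log.

#include <stdbool.h>
#include <stdio.h>

/* Toy model only: 'group_pending' stands in for whatever state the
 * device would keep to remember that a More-flag group of DC event
 * records has been queued but not yet cleared by the host. */
struct toy_dc_state {
    bool group_pending;
};

/* Reject a new add/release request while a previous group is pending. */
static bool toy_dc_request_allowed(struct toy_dc_state *s)
{
    if (s->group_pending) {
        return false;
    }
    s->group_pending = true;  /* would be cleared once the host has
                               * cleared the whole group of records */
    return true;
}

int main(void)
{
    struct toy_dc_state s = { .group_pending = false };

    printf("first request accepted:  %d\n", toy_dc_request_allowed(&s)); /* 1 */
    printf("second request accepted: %d\n", toy_dc_request_allowed(&s)); /* 0 */
    return 0;
}

The nice property of the stopgap is that it sidesteps the "how many slots
are left" question entirely; the cost is that a second request has to be
retried by the FM once the host has drained the first group.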
Jonathan
>
> Signed-off-by: Fan Ni <fan.ni@samsung.com>
> ---
> hw/cxl/cxl-events.c | 2 --
> hw/mem/cxl_type3.c | 7 +++++++
> include/hw/cxl/cxl_events.h | 3 +++
> 3 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/hw/cxl/cxl-events.c b/hw/cxl/cxl-events.c
> index 12dee2e467..05d8aae627 100644
> --- a/hw/cxl/cxl-events.c
> +++ b/hw/cxl/cxl-events.c
> @@ -16,8 +16,6 @@
> #include "hw/cxl/cxl.h"
> #include "hw/cxl/cxl_events.h"
>
> -/* Artificial limit on the number of events a log can hold */
> -#define CXL_TEST_EVENT_OVERFLOW 8
>
> static void reset_overflow(CXLEventLog *log)
> {
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 3d7289fa84..32668df365 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -2015,6 +2015,13 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
> num_extents++;
> }
>
> + if (num_extents > CXL_TEST_EVENT_OVERFLOW) {
> + error_setg(errp,
> + "at most %d extents allowed in one add/release request",
> + CXL_TEST_EVENT_OVERFLOW);
> + return;
> + }
> +
> /* Create extent list for event being passed to host */
> i = 0;
> list = records;
> diff --git a/include/hw/cxl/cxl_events.h b/include/hw/cxl/cxl_events.h
> index 38cadaa0f3..2a6b57e3e6 100644
> --- a/include/hw/cxl/cxl_events.h
> +++ b/include/hw/cxl/cxl_events.h
> @@ -12,6 +12,9 @@
>
> #include "qemu/uuid.h"
>
> +/* Artificial limit on the number of events a log can hold */
> +#define CXL_TEST_EVENT_OVERFLOW 8
> +
> /*
> * CXL r3.1 section 8.2.9.2.2: Get Event Records (Opcode 0100h); Table 8-52
> *
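
For the tighter check asked about above (compare against the slots actually
left in the log rather than against the fixed threshold), a stand-alone toy
of the arithmetic is below. Again, this is not QEMU code:
TOY_EVENT_LOG_CAPACITY merely stands in for CXL_TEST_EVENT_OVERFLOW and the
struct is invented for illustration.

#include <stdbool.h>
#include <stdio.h>

/* Stands in for the artificial CXL_TEST_EVENT_OVERFLOW limit. */
#define TOY_EVENT_LOG_CAPACITY 8

struct toy_event_log {
    unsigned int size;    /* records currently queued in the log */
};

/* True if all 'num_extents' records fit without overflowing the log. */
static bool toy_log_can_hold(const struct toy_event_log *log,
                             unsigned int num_extents)
{
    if (log->size >= TOY_EVENT_LOG_CAPACITY) {
        return false;
    }
    return num_extents <= TOY_EVENT_LOG_CAPACITY - log->size;
}

int main(void)
{
    struct toy_event_log log = { .size = 3 };

    printf("5 extents fit: %d\n", toy_log_can_hold(&log, 5)); /* 1 */
    printf("6 extents fit: %d\n", toy_log_can_hold(&log, 6)); /* 0 */
    return 0;
}

That is, the guard in the patch would reject based on the threshold minus
the current log size rather than on the bare threshold, which also covers
the case where unrelated events are already queued.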