From: Dan Williams <dan.j.williams@intel.com>
To: Vishal Verma <vishal.l.verma@intel.com>, <nvdimm@lists.linux.dev>,
<linux-cxl@vger.kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>,
Alison Schofield <alison.schofield@intel.com>,
Dave Jiang <dave.jiang@intel.com>,
Ira Weiny <ira.weiny@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>
Subject: Re: [PATCH ndctl 2/2] daxctl/device.c: Fix error propagation in do_xaction_device()
Date: Fri, 12 Apr 2024 14:30:04 -0700 [thread overview]
Message-ID: <6619a7dcd1a18_24596294ec@dwillia2-mobl3.amr.corp.intel.com.notmuch> (raw)
In-Reply-To: <20240412-vv-daxctl-fixes-v1-2-6e808174e24f@intel.com>
Vishal Verma wrote:
> The loop through the provided list of devices in do_xaction_device()
> returns the status based on whatever the last device did. Since the
> order of processing devices, especially in cases like the 'all' keyword,
> can be effectively random, this can lead to the same command, and same
> effects, exiting with a different error code based on device ordering.
>
> This was noticed with flakiness in the daxctl-create.sh unit test. Its
> 'destroy-device all' command would either pass or fail based on the
> order it tried to destroy devices in. (Recall that until now, destroying
> a daxX.0 device would result in a failure).
>
> Make this slightly more consistent by saving a failed status in
> do_xaction_device if any iteration of the loop produces a failure.
> Return this saved status instead of returning the status of the last
> device processed.
I think "this is the way", at least it follows what cxl/memdev.c is
doing. However we have ended up with an error scheme per tool when it
comes to reporting errors for multi-device operations.
cxl/memdev.c: report the first error
daxctl/device.c: now fixed to report the first error
ndctl/namespace.c: reports last result (same daxctl/device.c issue), unless in the greedy-create case
cxl/region.c: reports the last error even if that is not the last
result, immune to the above bug, but why different?
The struggle here is that all of these tools continue on error, so it
has always been the case that the only way to get a reliable error code
vs action carried out is to not use the "all" or "multi-device" ways to
specify the devices to operate upon.
I don't have a good answer besides, be careful when using "all".
It might make sense to bring ndctl/namespace.c in line to guarantee
"unless 100% of the attempts are successful the command reports
failure". However, it might be too late to make that change there if it
breaks people's scripts. ndctl/namespace.c does not suffer from needing
to know that namspaceX.0 can not be deleted since the deletion there is
exclusively done by setting namespace size to zero.
I think this daxctl change has a low risk of breaking folks because the
primary failure case is fixed to swallow the error.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
next prev parent reply other threads:[~2024-04-12 21:30 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-12 21:05 [PATCH ndctl 0/2] daxctl: Fix error handling and propagation in daxctl/device.c Vishal Verma
2024-04-12 21:05 ` [PATCH ndctl 1/2] daxctl/device.c: Handle special case of destroying daxX.0 Vishal Verma
2024-04-12 21:09 ` Dan Williams
2024-04-15 16:17 ` Dave Jiang
2024-04-12 21:05 ` [PATCH ndctl 2/2] daxctl/device.c: Fix error propagation in do_xaction_device() Vishal Verma
2024-04-12 21:30 ` Dan Williams [this message]
2024-04-12 21:42 ` Verma, Vishal L
2024-04-15 16:19 ` Dave Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6619a7dcd1a18_24596294ec@dwillia2-mobl3.amr.corp.intel.com.notmuch \
--to=dan.j.williams@intel.com \
--cc=alison.schofield@intel.com \
--cc=dave.jiang@intel.com \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=nvdimm@lists.linux.dev \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox