From: Alison Schofield <alison.schofield@intel.com>
To: Marc Herbert <marc.herbert@linux.intel.com>
Cc: <nvdimm@lists.linux.dev>, Marc Herbert <marc.herbert@intel.com>
Subject: Re: [ndctl PATCH] ndctl/test: fully reset nfit_test in pmem_ns unit test
Date: Wed, 22 Oct 2025 22:15:07 -0700 [thread overview]
Message-ID: <aPm524Y0hIEOUehg@aschofie-mobl2.lan> (raw)
In-Reply-To: <a9806830-6ce7-4d1b-a72d-7fa123e8b326@linux.intel.com>
On Wed, Oct 22, 2025 at 11:37:37AM -0700, Marc Herbert wrote:
> On 2025-10-21 14:26, Alison Schofield wrote:
> > The pmem_ns unit test frequently fails when run as part of the full
> > suite, yet passes when executed alone.
> >
> > [...]
> > > Replace the NULL context parameter when calling ndctl_test_init()
> > with the available ndctl_ctx to ensure pmem_ns can find usable PMEM
> > regions.
> >
> > Reported-by: Marc Herbert <marc.herbert@intel.com>
> > Closes: https://github.com/pmem/ndctl/issues/290
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > ---
> > test/pmem_namespaces.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/test/pmem_namespaces.c b/test/pmem_namespaces.c
> > index 4bafff5164c8..7b8de9dcb61d 100644
> > --- a/test/pmem_namespaces.c
> > +++ b/test/pmem_namespaces.c
> > @@ -191,7 +191,7 @@ int test_pmem_namespaces(int log_level, struct ndctl_test *test,
> >
> > if (!bus) {
> > fprintf(stderr, "ACPI.NFIT unavailable falling back to nfit_test\n");
> > - rc = ndctl_test_init(&kmod_ctx, &mod, NULL, log_level, test);
> > + rc = ndctl_test_init(&kmod_ctx, &mod, ctx, log_level, test);
> > ndctl_invalidate(ctx);
> > bus = ndctl_bus_get_by_provider(ctx, "nfit_test.0");
> > if (rc < 0 || !bus) {
>
> Thanks Alison! This does fix the crash, so you can also add my Tested-By:!
>
> But to test, I had to combine this fix with this temporary hack from
> https://github.com/pmem/ndctl/issues/290
Ah, yes I did similar to debug and test.
>
> --- a/test/pmem_namespaces.c
> +++ b/test/pmem_namespaces.c
> @@ -189,7 +189,7 @@ int test_pmem_namespaces(int log_level, struct ndctl_test *test,
> bus = NULL;
> }
>
> - if (!bus) {
> + if (!bus || true) {
> fprintf(stderr, "ACPI.NFIT unavailable falling back to nfit_test\n");
> rc = ndctl_test_init(&kmod_ctx, &mod, NULL, log_level, test);
> ndctl_invalidate(ctx);
>
>
>
> ... which explains why I disagree with... the commit message! I don't think
> this necessary fix "closes" https://github.com/pmem/ndctl/issues/290 entirely.
Marc,
Thanks for the review!
Ah, you disagree with the Closes tag? I added the close tag expecting
the test case will now pass. pmem-ns will successfully fallback to
nfit_test region if ACPI.NFIT is not present or does not have the pmem
capable region.
wrt the reason why ACPI.NFIT fails to find a suitable region, I haven't
given up on it. In my setup, it fails because the region type is
ND_DEVICE_NAMESPACE_IO (4) rather than ND_DEVICE_NAMESPACE_PMEM (5)
wrt why it fails in your case, a full test run after boot, and with
my reproducer (simply run pmem-ns alone). I don't have the soln yet.
If you have time to check that your failure is same as with my
reproducer, you can collect and share this:
ND_DEVICE_NAMESPACE_IO is 4
ND_DEVICE_NAMESPACE_PMEM is 5
diff --git a/test/pmem_namespaces.c b/test/pmem_namespaces.c
index 4bafff5164c8..c2f25bb02025 100644
--- a/test/pmem_namespaces.c
+++ b/test/pmem_namespaces.c
@@ -180,11 +180,15 @@ int test_pmem_namespaces(int log_level, struct ndctl_test *test,
bus = ndctl_bus_get_by_provider(ctx, "ACPI.NFIT");
if (bus) {
+ int nstype;
+
/* skip this bus if no label-enabled PMEM regions */
- ndctl_region_foreach(bus, region)
- if (ndctl_region_get_nstype(region)
- == ND_DEVICE_NAMESPACE_PMEM)
+ ndctl_region_foreach(bus, region) {
+ nstype = ndctl_region_get_nstype(region);
+ fprintf(stderr, "ALISON nstype %d\n", nstype);
+ if (nstype == ND_DEVICE_NAMESPACE_PMEM)
break;
+ }
if (!region)
bus = NULL;
}
>
> This fix does stop the test from failing which is great and it lowers dramatically
> the severity of 290. But we still don't know why ACPI.NFIT is "available" most of
> the time and... sometimes not. In other words, we still don't know why this test is
> non-deterministic. Of course, there will always be some non-determinism because
> the kernel and QEMU are too complex to be deterministic but I don't think
> non-determism should extend to test fixtures and test code themselves like this.
> Why 290 should stay open IMHO.
>
> Also, this feels like a (missed?) opportunity to add better logging of this
> non-determinism, I mean stuff like:
> https://github.com/pmem/ndctl/issues/290#issuecomment-3260168362
> This is test code, it should not be mean with logging. All bash scripts run
> with "set -x" already so this would not make much difference to the total
> volume.
>
>
> Generally speaking, tests should follow a CLEAN - TEST - CLEAN logic to
> minimize interferences; as much as time allows[*]. Bug 290 demonstrates that:
> 1. Some unknown test running before pmem-ns does not clean properly after itself, and
> 2. The pmem-ns test is not capable of creating a deterministic setup for itself.
>
> We still have no clue about 1. and 2. is not mitigated with logs
> and source comments. So there's still an open bug there.
>
> Marc
>
>
>
>
> [*] there are practical limits: rebooting QEMU for each test would be too slow.
prev parent reply other threads:[~2025-10-23 5:15 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-21 21:26 [ndctl PATCH] ndctl/test: fully reset nfit_test in pmem_ns unit test Alison Schofield
2025-10-22 15:04 ` Dave Jiang
2025-10-22 18:37 ` Marc Herbert
2025-10-23 5:15 ` Alison Schofield [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aPm524Y0hIEOUehg@aschofie-mobl2.lan \
--to=alison.schofield@intel.com \
--cc=marc.herbert@intel.com \
--cc=marc.herbert@linux.intel.com \
--cc=nvdimm@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox