All of lore.kernel.org
 help / color / mirror / Atom feed
From: Benjamin Marzinski <bmarzins@redhat.com>
To: Brian Bunker <brian@purestorage.com>
Cc: dm-devel@lists.linux.dev
Subject: Re: [PATCH] tests: add test demonstrating TUR checker race condition
Date: Mon, 9 Mar 2026 14:33:06 -0400	[thread overview]
Message-ID: <aa8SYkg7Uu7N52SU@redhat.com> (raw)
In-Reply-To: <20260307010614.1732-1-brian@purestorage.com>

On Fri, Mar 06, 2026 at 05:06:14PM -0800, Brian Bunker wrote:
> Add a test that reproduces the race condition in the TUR checker where
> the checker thread has finished updating ct->state and ct->msgid (under
> lock) but hasn't yet cleared ct->running. In this race window, the
> code incorrectly returns PATH_PENDING even though a valid result is
> already available.
> 
> NOTE: This test currently fails, demonstrating the bug exists.
> 
> The test includes two cases:
> - test_race_window_result_ready: Simulates the race window where
>   ct->running is still set but ct->msgid indicates the check has
>   completed. This currently returns PATH_PENDING instead of the
>   actual result (PATH_UP).
> - test_thread_still_running: Verifies that when ct->msgid is
>   MSG_TUR_RUNNING (thread genuinely hasn't finished), PATH_PENDING
>   is correctly returned.
> 

I thought your test program could show the TUR checker hanging. This is
just showing the known behavior, and I don't see a way for it to hang.
Also, unless you actually got the TUR checker to hang, I don't know of
anyone who has ever reported a hang.

-Ben

> == running tur_race-test ==
> [==========] Running 2 test(s).
> [ RUN      ] test_race_window_result_ready
> [  ERROR   ] --- 0x6 != 0x3
> [   LINE   ] --- tur_race.c:104: error: Failure!
> [  FAILED  ] test_race_window_result_ready
> [ RUN      ] test_thread_still_running
> [       OK ] test_thread_still_running
> [==========] 2 test(s) run.
> [  PASSED  ] 1 test(s).
> [  FAILED  ] 1 test(s), listed below:
> [  FAILED  ] test_race_window_result_ready
> 
>  1 FAILED TEST(S)
> 
> Signed-off-by: Brian Bunker <brian@purestorage.com>
> ---
>  tests/Makefile   |   3 +-
>  tests/tur_race.c | 175 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 177 insertions(+), 1 deletion(-)
>  create mode 100644 tests/tur_race.c
> 
> diff --git a/tests/Makefile b/tests/Makefile
> index 9f1b950f..3d033d9a 100644
> --- a/tests/Makefile
> +++ b/tests/Makefile
> @@ -9,7 +9,7 @@ CFLAGS += $(BIN_CFLAGS) -Wno-unused-parameter -Wno-unused-function $(W_MISSING_I
>  LIBDEPS += -L. -L $(mpathutildir) -L$(mpathcmddir) -lmultipath -lmpathutil -lmpathcmd -lcmocka
>  
>  TESTS := uevent parser util dmevents hwtable blacklist unaligned vpd pgpolicy \
> -	 alias directio valid devt mpathvalid strbuf sysfs features cli mapinfo
> +	 alias directio valid devt mpathvalid strbuf sysfs features cli mapinfo tur_race
>  HELPERS := test-lib.o test-log.o
>  
>  .PRECIOUS: $(TESTS:%=%-test)
> @@ -65,6 +65,7 @@ features-test_LIBDEPS := -ludev -lpthread
>  features-test_OBJDEPS := $(mpathutildir)/mt-libudev.o
>  cli-test_OBJDEPS := $(daemondir)/cli.o
>  mapinfo-test_LIBDEPS = -lpthread -ldevmapper
> +tur_race-test_LIBDEPS := -lpthread -lurcu
>  
>  %.o: %.c
>  	@echo building $@ because of $?
> diff --git a/tests/tur_race.c b/tests/tur_race.c
> new file mode 100644
> index 00000000..00ad3cea
> --- /dev/null
> +++ b/tests/tur_race.c
> @@ -0,0 +1,175 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Test for TUR checker race condition.
> + *
> + * This test demonstrates a race condition where the checker thread has
> + * finished (set state/msgid under lock) but hasn't yet cleared ct->running.
> + * Currently, libcheck_check() incorrectly returns PATH_PENDING even though
> + * a valid result is available.
> + *
> + * NOTE: test_race_window_result_ready currently FAILS, demonstrating the bug.
> + */
> +
> +#define _GNU_SOURCE
> +#include <stdint.h>
> +#include <stdbool.h>
> +#include <stdarg.h>
> +#include <stddef.h>
> +#include <setjmp.h>
> +#include <stdlib.h>
> +#include <pthread.h>
> +#include <time.h>
> +#include "cmocka-compat.h"
> +
> +#include "globals.c"
> +#include "../libmultipath/checkers/tur.c"
> +
> +/* Mock device for testing */
> +static dev_t test_devt;
> +
> +static int setup(void **state)
> +{
> +	test_devt = makedev(8, 0);
> +	return 0;
> +}
> +
> +static int teardown(void **state)
> +{
> +	return 0;
> +}
> +
> +/*
> + * Test: Simulate the race window where thread has finished but running != 0
> + *
> + * This simulates the state between:
> + *   pthread_mutex_unlock(&ct->lock);  // thread released lock
> + *   uatomic_set(&ct->running, 0);     // thread hasn't cleared running yet
> + *
> + * In this window:
> + *   - ct->state and ct->msgid have the final result
> + *   - ct->running is still 1
> + *
> + * BUG: The code sees running != 0 and returns PATH_PENDING, ignoring
> + * the valid result that is already available.
> + *
> + * This test currently FAILS, demonstrating the race condition.
> + */
> +static void test_race_window_result_ready(void **state)
> +{
> +	struct checker c = {0};
> +	struct tur_checker_context *ct;
> +	int result;
> +
> +	/* Initialize the checker context manually to avoid dependencies */
> +	ct = calloc(1, sizeof(struct tur_checker_context));
> +	assert_non_null(ct);
> +
> +	pthread_mutex_init(&ct->lock, NULL);
> +	pthread_cond_init(&ct->active, NULL);
> +	uatomic_set(&ct->holders, 1);
> +	ct->state = PATH_UNCHECKED;
> +	ct->fd = -1;
> +
> +	c.fd = 1;
> +	c.timeout = 30;
> +	c.context = ct;
> +	assert_non_null(ct);
> +
> +	/*
> +	 * Simulate race condition state:
> +	 * - Thread has finished and set state/msgid
> +	 * - Thread has NOT yet cleared running
> +	 * - ct->thread is set (appears to be an active check)
> +	 */
> +	ct->thread = (pthread_t)1;  /* Non-zero: appears to be running */
> +	ct->devt = test_devt;
> +
> +	pthread_mutex_lock(&ct->lock);
> +	ct->state = PATH_UP;
> +	ct->msgid = CHECKER_MSGID_UP;
> +	pthread_mutex_unlock(&ct->lock);
> +
> +	uatomic_set(&ct->running, 1);  /* Thread "still running" */
> +	ct->time = time(NULL) + 100;   /* Not timed out */
> +
> +	/* Call libcheck_check - this is where the race manifests */
> +	result = libcheck_check(&c);
> +
> +	/*
> +	 * Expected: PATH_UP (the result is ready)
> +	 * Actual (BUG): PATH_PENDING (ignores ready result)
> +	 *
> +	 * This assertion currently FAILS, demonstrating the bug.
> +	 */
> +	assert_int_equal(result, PATH_UP);
> +	assert_int_equal(c.msgid, CHECKER_MSGID_UP);
> +
> +	/* ct->thread was cleared by libcheck_check when it collected result */
> +	ct->thread = 0;
> +	uatomic_set(&ct->running, 0);
> +	pthread_mutex_destroy(&ct->lock);
> +	pthread_cond_destroy(&ct->active);
> +	free(ct);
> +}
> +
> +/*
> + * Test: Verify thread still running returns PATH_PENDING
> + *
> + * When the thread genuinely hasn't finished (msgid == MSG_TUR_RUNNING),
> + * we should correctly return PATH_PENDING.
> + */
> +static void test_thread_still_running(void **state)
> +{
> +	struct checker c = {0};
> +	struct tur_checker_context *ct;
> +	int result;
> +
> +	/* Initialize the checker context manually to avoid dependencies */
> +	ct = calloc(1, sizeof(struct tur_checker_context));
> +	assert_non_null(ct);
> +
> +	pthread_mutex_init(&ct->lock, NULL);
> +	pthread_cond_init(&ct->active, NULL);
> +	uatomic_set(&ct->holders, 1);
> +	ct->state = PATH_UNCHECKED;
> +	ct->fd = -1;
> +
> +	c.fd = 1;
> +	c.timeout = 30;
> +	c.context = ct;
> +
> +	/* Thread is genuinely still running - no result yet */
> +	ct->thread = (pthread_t)1;
> +	ct->devt = test_devt;
> +
> +	pthread_mutex_lock(&ct->lock);
> +	ct->state = PATH_PENDING;
> +	ct->msgid = MSG_TUR_RUNNING;  /* Thread hasn't set result yet */
> +	pthread_mutex_unlock(&ct->lock);
> +
> +	uatomic_set(&ct->running, 1);
> +	ct->time = time(NULL) + 100;
> +
> +	result = libcheck_check(&c);
> +
> +	/* Should correctly return PATH_PENDING */
> +	assert_int_equal(result, PATH_PENDING);
> +
> +	ct->thread = 0;
> +	uatomic_set(&ct->running, 0);
> +	pthread_mutex_destroy(&ct->lock);
> +	pthread_cond_destroy(&ct->active);
> +	free(ct);
> +}
> +
> +int main(void)
> +{
> +	const struct CMUnitTest tests[] = {
> +		cmocka_unit_test_setup_teardown(test_race_window_result_ready,
> +						setup, teardown),
> +		cmocka_unit_test_setup_teardown(test_thread_still_running,
> +						setup, teardown),
> +	};
> +	return cmocka_run_group_tests(tests, NULL, NULL);
> +}
> +
> -- 
> 2.52.0


      reply	other threads:[~2026-03-09 18:33 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-07  1:06 [PATCH] tests: add test demonstrating TUR checker race condition Brian Bunker
2026-03-09 18:33 ` Benjamin Marzinski [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aa8SYkg7Uu7N52SU@redhat.com \
    --to=bmarzins@redhat.com \
    --cc=brian@purestorage.com \
    --cc=dm-devel@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.