* [PATCH v2 01/22] libmultipath: store checker_check() result in checker struct
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 02/22] libmultipath: add missing checker function prototypes Benjamin Marzinski
` (22 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
checker_check() is now a void function that stores the path state in
the checker struct. It can be retrieved later using checker_get_state().
Right now, this is called immediately after checker_check(), but in
the future, it can be deferred to another time.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/checkers.c | 26 +++++++++++++-------------
libmultipath/checkers.h | 4 +++-
libmultipath/discovery.c | 3 ++-
libmultipath/propsel.c | 1 +
4 files changed, 19 insertions(+), 15 deletions(-)
diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
index fdb91e17..c4918d28 100644
--- a/libmultipath/checkers.c
+++ b/libmultipath/checkers.c
@@ -299,28 +299,28 @@ void checker_put (struct checker * dst)
free_checker_class(src);
}
-int checker_check (struct checker * c, int path_state)
+int checker_get_state(struct checker *c)
{
- int r;
+ return c ? c->path_state : PATH_UNCHECKED;
+}
+void checker_check (struct checker * c, int path_state)
+{
if (!c)
- return PATH_WILD;
+ return;
c->msgid = CHECKER_MSGID_NONE;
if (c->disable) {
c->msgid = CHECKER_MSGID_DISABLED;
- return PATH_UNCHECKED;
- }
- if (!strncmp(c->cls->name, NONE, 4))
- return path_state;
-
- if (c->fd < 0) {
+ c->path_state = PATH_UNCHECKED;
+ } else if (!strncmp(c->cls->name, NONE, 4)) {
+ c->path_state = path_state;
+ } else if (c->fd < 0) {
c->msgid = CHECKER_MSGID_NO_FD;
- return PATH_WILD;
+ c->path_state = PATH_WILD;
+ } else {
+ c->path_state = c->cls->check(c);
}
- r = c->cls->check(c);
-
- return r;
}
const char *checker_name(const struct checker *c)
diff --git a/libmultipath/checkers.h b/libmultipath/checkers.h
index 102351f6..6e54d8f0 100644
--- a/libmultipath/checkers.h
+++ b/libmultipath/checkers.h
@@ -123,6 +123,7 @@ struct checker {
int fd;
unsigned int timeout;
int disable;
+ int path_state;
short msgid; /* checker-internal extra status */
void * context; /* store for persistent data */
void ** mpcontext; /* store for persistent data shared
@@ -169,7 +170,8 @@ struct checker_context {
};
int start_checker_thread (pthread_t *thread, const pthread_attr_t *attr,
struct checker_context *ctx);
-int checker_check (struct checker *, int);
+int checker_get_state(struct checker *c);
+void checker_check (struct checker *, int);
int checker_is_sync(const struct checker *);
const char *checker_name (const struct checker *);
void reset_checker_classes(void);
diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index e94705bf..5648be60 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -2007,7 +2007,8 @@ get_state (struct path * pp, struct config *conf, int daemon, int oldstate)
checker_set_async(c);
else
checker_set_sync(c);
- state = checker_check(c, oldstate);
+ checker_check(c, oldstate);
+ state = checker_get_state(c);
condlog(3, "%s: %s state = %s", pp->dev,
checker_name(c), checker_state_name(state));
if (state != PATH_UP && state != PATH_GHOST &&
diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index a3fce203..ad771d35 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -700,6 +700,7 @@ out:
condlog(3, "%s: path_checker = %s %s", pp->dev,
checker_name(c), origin);
c->timeout = pp->checker_timeout;
+ c->path_state = PATH_UNCHECKED;
return 0;
}
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 02/22] libmultipath: add missing checker function prototypes
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 01/22] libmultipath: store checker_check() result in checker struct Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 03/22] libmultipath: split out the code to wait for pending checkers Benjamin Marzinski
` (21 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/checkers.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/libmultipath/checkers.h b/libmultipath/checkers.h
index 6e54d8f0..fb1160af 100644
--- a/libmultipath/checkers.h
+++ b/libmultipath/checkers.h
@@ -188,6 +188,9 @@ int libcheck_check(struct checker *);
int libcheck_init(struct checker *);
void libcheck_free(struct checker *);
void *libcheck_thread(struct checker_context *ctx);
+void libcheck_reset(void);
+int libcheck_mp_init(struct checker *);
+
/*
* msgid => message map.
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 03/22] libmultipath: split out the code to wait for pending checkers
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 01/22] libmultipath: store checker_check() result in checker struct Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 02/22] libmultipath: add missing checker function prototypes Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 04/22] libmultipath: remove pending wait code from libcheck_check calls Benjamin Marzinski
` (20 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
This patch adds a new optional symbol for the dynamic path checker
libraries, libcheck_pending. This is currently unused, but can be called
on pending checkers to check if they've completed and return their value.
The "tur" and "directio" checkers are the only ones which can return
PATH_PENDING. They now implement libcheck_pending() as a wrapper around
the code that libcheck_check uses to wait for pending paths, which has
been broken out into its own function.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/checkers.c | 4 +-
libmultipath/checkers.h | 1 +
libmultipath/checkers/directio.c | 86 ++++++++++++++++++++++----------
libmultipath/checkers/tur.c | 65 ++++++++++++++++--------
4 files changed, 109 insertions(+), 47 deletions(-)
diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
index c4918d28..298aec78 100644
--- a/libmultipath/checkers.c
+++ b/libmultipath/checkers.c
@@ -26,6 +26,7 @@ struct checker_class {
void (*free)(struct checker *); /* to free the context */
void (*reset)(void); /* to reset the global variables */
void *(*thread)(void *); /* async thread entry point */
+ int (*pending)(struct checker *); /* to recheck pending paths */
const char **msgtable;
short msgtable_size;
};
@@ -180,7 +181,8 @@ static struct checker_class *add_checker_class(const char *name)
c->mp_init = (int (*)(struct checker *)) dlsym(c->handle, "libcheck_mp_init");
c->reset = (void (*)(void)) dlsym(c->handle, "libcheck_reset");
c->thread = (void *(*)(void*)) dlsym(c->handle, "libcheck_thread");
- /* These 3 functions can be NULL. call dlerror() to clear out any
+ c->pending = (int (*)(struct checker *)) dlsym(c->handle, "libcheck_pending");
+ /* These 4 functions can be NULL. call dlerror() to clear out any
* error string */
dlerror();
diff --git a/libmultipath/checkers.h b/libmultipath/checkers.h
index fb1160af..b2342a1b 100644
--- a/libmultipath/checkers.h
+++ b/libmultipath/checkers.h
@@ -190,6 +190,7 @@ void libcheck_free(struct checker *);
void *libcheck_thread(struct checker_context *ctx);
void libcheck_reset(void);
int libcheck_mp_init(struct checker *);
+int libcheck_pending(struct checker *c);
/*
diff --git a/libmultipath/checkers/directio.c b/libmultipath/checkers/directio.c
index 8e87878b..3f88b40d 100644
--- a/libmultipath/checkers/directio.c
+++ b/libmultipath/checkers/directio.c
@@ -288,14 +288,36 @@ get_events(struct aio_group *aio_grp, struct timespec *timeout)
return got_events;
}
+static void
+check_pending(struct directio_context *ct, struct timespec endtime)
+{
+ int r;
+ struct timespec currtime, timeout;
+
+ while(1) {
+ get_monotonic_time(&currtime);
+ timespecsub(&endtime, &currtime, &timeout);
+ if (timeout.tv_sec < 0)
+ timeout.tv_sec = timeout.tv_nsec = 0;
+
+ r = get_events(ct->aio_grp, &timeout);
+
+ if (ct->req->state != PATH_PENDING) {
+ ct->running = 0;
+ return;
+ } else if (r == 0 ||
+ (timeout.tv_sec == 0 && timeout.tv_nsec == 0))
+ return;
+ }
+}
+
static int
check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
{
struct timespec timeout = { .tv_nsec = 1000 };
struct stat sb;
int rc;
- long r;
- struct timespec currtime, endtime;
+ struct timespec endtime;
if (fstat(fd, &sb) == 0) {
LOG(4, "called for %x", (unsigned) sb.st_rdev);
@@ -330,21 +352,11 @@ check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
endtime.tv_sec += timeout.tv_sec;
endtime.tv_nsec += timeout.tv_nsec;
normalize_timespec(&endtime);
- while(1) {
- r = get_events(ct->aio_grp, &timeout);
- if (ct->req->state != PATH_PENDING) {
- ct->running = 0;
- return ct->req->state;
- } else if (r == 0 ||
- (timeout.tv_sec == 0 && timeout.tv_nsec == 0))
- break;
+ check_pending(ct, endtime);
+ if (ct->req->state != PATH_PENDING)
+ return ct->req->state;
- get_monotonic_time(&currtime);
- timespecsub(&endtime, &currtime, &timeout);
- if (timeout.tv_sec < 0)
- timeout.tv_sec = timeout.tv_nsec = 0;
- }
if (ct->running > timeout_secs || sync) {
struct io_event event;
@@ -360,17 +372,9 @@ check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
return rc;
}
-int libcheck_check (struct checker * c)
+static void set_msgid(struct checker *c, int state)
{
- int ret;
- struct directio_context * ct = (struct directio_context *)c->context;
-
- if (!ct)
- return PATH_UNCHECKED;
-
- ret = check_state(c->fd, ct, checker_is_sync(c), c->timeout);
-
- switch (ret)
+ switch (state)
{
case PATH_UNCHECKED:
c->msgid = MSG_DIRECTIO_UNKNOWN;
@@ -387,5 +391,37 @@ int libcheck_check (struct checker * c)
default:
break;
}
+}
+
+int libcheck_pending(struct checker *c)
+{
+ struct timespec endtime;
+ struct directio_context *ct = (struct directio_context *)c->context;
+
+ /* The if path checker isn't running, just return the exiting value. */
+ if (!ct || !ct->running)
+ return c->path_state;
+
+ if (ct->req->state == PATH_PENDING) {
+ get_monotonic_time(&endtime);
+ check_pending(ct, endtime);
+ } else
+ ct->running = 0;
+ set_msgid(c, ct->req->state);
+
+ return ct->req->state;
+}
+
+int libcheck_check (struct checker * c)
+{
+ int ret;
+ struct directio_context * ct = (struct directio_context *)c->context;
+
+ if (!ct)
+ return PATH_UNCHECKED;
+
+ ret = check_state(c->fd, ct, checker_is_sync(c), c->timeout);
+ set_msgid(c, ret);
+
return ret;
}
diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c
index a2905af5..95af5214 100644
--- a/libmultipath/checkers/tur.c
+++ b/libmultipath/checkers/tur.c
@@ -323,6 +323,49 @@ static int tur_check_async_timeout(struct checker *c)
return (now.tv_sec > ct->time);
}
+int check_pending(struct checker *c, struct timespec endtime)
+{
+ struct tur_checker_context *ct = c->context;
+ int r, tur_status = PATH_PENDING;
+
+ pthread_mutex_lock(&ct->lock);
+
+ for (r = 0;
+ r == 0 && ct->state == PATH_PENDING &&
+ ct->msgid == MSG_TUR_RUNNING;
+ r = pthread_cond_timedwait(&ct->active, &ct->lock, &endtime));
+
+ if (!r) {
+ tur_status = ct->state;
+ c->msgid = ct->msgid;
+ }
+ pthread_mutex_unlock(&ct->lock);
+ if (tur_status == PATH_PENDING && c->msgid == MSG_TUR_RUNNING) {
+ condlog(4, "%d:%d : tur checker still running",
+ major(ct->devt), minor(ct->devt));
+ } else {
+ int running = uatomic_xchg(&ct->running, 0);
+ if (running)
+ pthread_cancel(ct->thread);
+ ct->thread = 0;
+ }
+
+ return tur_status;
+}
+
+int libcheck_pending(struct checker *c)
+{
+ struct timespec endtime;
+ struct tur_checker_context *ct = c->context;
+
+ /* The if path checker isn't running, just return the exiting value. */
+ if (!ct || !ct->thread)
+ return c->path_state;
+
+ get_monotonic_time(&endtime);
+ return check_pending(c, endtime);
+}
+
int libcheck_check(struct checker * c)
{
struct tur_checker_context *ct = c->context;
@@ -437,27 +480,7 @@ int libcheck_check(struct checker * c)
return tur_check(c->fd, c->timeout, &c->msgid);
}
tur_timeout(&tsp);
- pthread_mutex_lock(&ct->lock);
-
- for (r = 0;
- r == 0 && ct->state == PATH_PENDING &&
- ct->msgid == MSG_TUR_RUNNING;
- r = pthread_cond_timedwait(&ct->active, &ct->lock, &tsp));
-
- if (!r) {
- tur_status = ct->state;
- c->msgid = ct->msgid;
- }
- pthread_mutex_unlock(&ct->lock);
- if (tur_status == PATH_PENDING && c->msgid == MSG_TUR_RUNNING) {
- condlog(4, "%d:%d : tur checker still running",
- major(ct->devt), minor(ct->devt));
- } else {
- int running = uatomic_xchg(&ct->running, 0);
- if (running)
- pthread_cancel(ct->thread);
- ct->thread = 0;
- }
+ tur_status = check_pending(c, tsp);
}
return tur_status;
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 04/22] libmultipath: remove pending wait code from libcheck_check calls
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (2 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 03/22] libmultipath: split out the code to wait for pending checkers Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 05/22] multipath-tools tests: fix up directio tests Benjamin Marzinski
` (19 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
When the tur and directio checkers start an asynchronous checker, they
now immediately return with the path in PATH_PENDING, instead of
waiting in checker_check(). Instead the wait now happens in
checker_get_state().
Additionally, the directio checker now waits for 1 ms, like the tur
checker does. Also like the tur checker it now only waits once. If it
is still pending after the first call to checker_get_state(). It will
not wait at all on future calls, and will just process the already
completed IOs.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/checkers.c | 7 +++-
libmultipath/checkers/directio.c | 57 +++++++++++++++++---------------
libmultipath/checkers/tur.c | 13 +++-----
3 files changed, 41 insertions(+), 36 deletions(-)
diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
index 298aec78..ce3e48bd 100644
--- a/libmultipath/checkers.c
+++ b/libmultipath/checkers.c
@@ -303,7 +303,12 @@ void checker_put (struct checker * dst)
int checker_get_state(struct checker *c)
{
- return c ? c->path_state : PATH_UNCHECKED;
+ if (!c || !c->cls)
+ return PATH_UNCHECKED;
+ if (c->path_state != PATH_PENDING || !c->cls->pending)
+ return c->path_state;
+ c->path_state = c->cls->pending(c);
+ return c->path_state;
}
void checker_check (struct checker * c, int path_state)
diff --git a/libmultipath/checkers/directio.c b/libmultipath/checkers/directio.c
index 3f88b40d..904e3071 100644
--- a/libmultipath/checkers/directio.c
+++ b/libmultipath/checkers/directio.c
@@ -60,10 +60,11 @@ const char *libcheck_msgtable[] = {
#define LOG(prio, fmt, args...) condlog(prio, "directio: " fmt, ##args)
struct directio_context {
- int running;
+ unsigned int running;
int reset_flags;
struct aio_group *aio_grp;
struct async_req *req;
+ struct timespec endtime;
};
static struct aio_group *
@@ -314,19 +315,16 @@ check_pending(struct directio_context *ct, struct timespec endtime)
static int
check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
{
- struct timespec timeout = { .tv_nsec = 1000 };
struct stat sb;
int rc;
+ struct io_event event;
struct timespec endtime;
if (fstat(fd, &sb) == 0) {
LOG(4, "called for %x", (unsigned) sb.st_rdev);
}
- if (sync > 0) {
+ if (sync > 0)
LOG(4, "called in synchronous mode");
- timeout.tv_sec = timeout_secs;
- timeout.tv_nsec = 0;
- }
if (ct->running) {
if (ct->req->state != PATH_PENDING) {
@@ -345,31 +343,26 @@ check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
LOG(3, "io_submit error %i", -rc);
return PATH_UNCHECKED;
}
+ get_monotonic_time(&ct->endtime);
+ ct->endtime.tv_nsec += 1000 * 1000;
+ normalize_timespec(&ct->endtime);
}
ct->running++;
+ if (!sync)
+ return PATH_PENDING;
get_monotonic_time(&endtime);
- endtime.tv_sec += timeout.tv_sec;
- endtime.tv_nsec += timeout.tv_nsec;
+ endtime.tv_sec += timeout_secs;
normalize_timespec(&endtime);
check_pending(ct, endtime);
if (ct->req->state != PATH_PENDING)
return ct->req->state;
- if (ct->running > timeout_secs || sync) {
- struct io_event event;
-
- LOG(3, "abort check on timeout");
-
- io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event);
- rc = PATH_DOWN;
- } else {
- LOG(4, "async io pending");
- rc = PATH_PENDING;
- }
+ LOG(3, "abort check on timeout");
- return rc;
+ io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event);
+ return PATH_DOWN;
}
static void set_msgid(struct checker *c, int state)
@@ -395,21 +388,31 @@ static void set_msgid(struct checker *c, int state)
int libcheck_pending(struct checker *c)
{
- struct timespec endtime;
+ int rc;
+ struct io_event event;
struct directio_context *ct = (struct directio_context *)c->context;
/* The if path checker isn't running, just return the exiting value. */
if (!ct || !ct->running)
return c->path_state;
- if (ct->req->state == PATH_PENDING) {
- get_monotonic_time(&endtime);
- check_pending(ct, endtime);
- } else
+ if (ct->req->state == PATH_PENDING)
+ check_pending(ct, ct->endtime);
+ else
ct->running = 0;
- set_msgid(c, ct->req->state);
+ rc = ct->req->state;
+ if (rc == PATH_PENDING) {
+ if (ct->running > c->timeout) {
+ LOG(3, "abort check on timeout");
+ io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event);
+ rc = PATH_DOWN;
+ }
+ else
+ LOG(4, "async io pending");
+ }
+ set_msgid(c, rc);
- return ct->req->state;
+ return rc;
}
int libcheck_check (struct checker * c)
diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c
index 95af5214..81db565b 100644
--- a/libmultipath/checkers/tur.c
+++ b/libmultipath/checkers/tur.c
@@ -58,6 +58,7 @@ struct tur_checker_context {
int msgid;
struct checker_context ctx;
unsigned int nr_timeouts;
+ struct timespec endtime;
};
int libcheck_init (struct checker * c)
@@ -323,7 +324,7 @@ static int tur_check_async_timeout(struct checker *c)
return (now.tv_sec > ct->time);
}
-int check_pending(struct checker *c, struct timespec endtime)
+int check_pending(struct checker *c)
{
struct tur_checker_context *ct = c->context;
int r, tur_status = PATH_PENDING;
@@ -333,7 +334,7 @@ int check_pending(struct checker *c, struct timespec endtime)
for (r = 0;
r == 0 && ct->state == PATH_PENDING &&
ct->msgid == MSG_TUR_RUNNING;
- r = pthread_cond_timedwait(&ct->active, &ct->lock, &endtime));
+ r = pthread_cond_timedwait(&ct->active, &ct->lock, &ct->endtime));
if (!r) {
tur_status = ct->state;
@@ -355,21 +356,18 @@ int check_pending(struct checker *c, struct timespec endtime)
int libcheck_pending(struct checker *c)
{
- struct timespec endtime;
struct tur_checker_context *ct = c->context;
/* The if path checker isn't running, just return the exiting value. */
if (!ct || !ct->thread)
return c->path_state;
- get_monotonic_time(&endtime);
- return check_pending(c, endtime);
+ return check_pending(c);
}
int libcheck_check(struct checker * c)
{
struct tur_checker_context *ct = c->context;
- struct timespec tsp;
pthread_attr_t attr;
int tur_status, r;
@@ -479,8 +477,7 @@ int libcheck_check(struct checker * c)
" sync mode", major(ct->devt), minor(ct->devt));
return tur_check(c->fd, c->timeout, &c->msgid);
}
- tur_timeout(&tsp);
- tur_status = check_pending(c, tsp);
+ tur_timeout(&ct->endtime);
}
return tur_status;
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 05/22] multipath-tools tests: fix up directio tests
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (3 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 04/22] libmultipath: remove pending wait code from libcheck_check calls Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 06/22] libmultipath: split get_state into two functions Benjamin Marzinski
` (18 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Make the directio tests work with libcheck_pending() being separate from
libcheck_check
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
tests/directio.c | 133 +++++++++++++++++++++++++++++------------------
1 file changed, 82 insertions(+), 51 deletions(-)
diff --git a/tests/directio.c b/tests/directio.c
index 763929e5..2e22f831 100644
--- a/tests/directio.c
+++ b/tests/directio.c
@@ -219,13 +219,22 @@ static void return_io_getevents_nr(struct timespec *ts, int nr,
ev_off += i;
}
-void do_check_state(struct checker *c, int sync, int timeout, int chk_state)
+void do_check_state(struct checker *c, int sync, int chk_state)
{
struct directio_context * ct = (struct directio_context *)c->context;
if (!ct->running)
will_return(__wrap_io_submit, 1);
- assert_int_equal(check_state(test_fd, ct, sync, timeout), chk_state);
+ assert_int_equal(check_state(test_fd, ct, sync, c->timeout), chk_state);
+ if (sync) {
+ assert_int_equal(ev_off, 0);
+ memset(mock_events, 0, sizeof(mock_events));
+ }
+}
+
+void do_libcheck_pending(struct checker *c, int chk_state)
+{
+ assert_int_equal(libcheck_pending(c), chk_state);
assert_int_equal(ev_off, 0);
memset(mock_events, 0, sizeof(mock_events));
}
@@ -245,12 +254,13 @@ void do_libcheck_reset(int nr_aio_grps)
assert_int_equal(ioctx_count, 0);
}
-static void do_libcheck_init(struct checker *c, int blocksize,
+static void do_libcheck_init(struct checker *c, int blocksize, int timeout,
struct async_req **req)
{
struct directio_context * ct;
c->fd = test_fd;
+ c->timeout = timeout;
wrap_will_return(WRAP_IOCTL, blocksize);
assert_int_equal(libcheck_init(c), 0);
ct = (struct directio_context *)c->context;
@@ -305,7 +315,7 @@ static void test_init_reset_init(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
- do_libcheck_init(&c, 4096, NULL);
+ do_libcheck_init(&c, 4096, 0, NULL);
aio_grp = get_aio_grp(&c);
check_aio_grp(aio_grp, 1, 0);
list_for_each_entry(tmp_grp, &aio_grp_list, node)
@@ -314,7 +324,7 @@ static void test_init_reset_init(void **state)
check_aio_grp(aio_grp, 0, 0);
do_libcheck_reset(1);
will_return(__wrap_io_setup, 0);
- do_libcheck_init(&c, 4096, NULL);
+ do_libcheck_init(&c, 4096, 0, NULL);
aio_grp = get_aio_grp(&c);
check_aio_grp(aio_grp, 1, 0);
list_for_each_entry(tmp_grp, &aio_grp_list, node)
@@ -340,11 +350,11 @@ static void test_init_free(void **state)
struct directio_context * ct;
if (i % 3 == 0)
- do_libcheck_init(&c[i], 512, NULL);
+ do_libcheck_init(&c[i], 512, 0, NULL);
else if (i % 3 == 1)
- do_libcheck_init(&c[i], 1024, NULL);
+ do_libcheck_init(&c[i], 1024, 0, NULL);
else
- do_libcheck_init(&c[i], 4096, NULL);
+ do_libcheck_init(&c[i], 4096, 0, NULL);
ct = (struct directio_context *)c[i].context;
assert_non_null(ct->aio_grp);
if ((i & 1023) == 0)
@@ -385,7 +395,7 @@ static void test_multi_init_free(void **state)
for (count = 0, i = 0; i < 4096; count++) {
/* usually init, but occasionally free checkers */
if (count == 0 || (count % 5 != 0 && count % 7 != 0)) {
- do_libcheck_init(&c[i], 4096, NULL);
+ do_libcheck_init(&c[i], 4096, 0, NULL);
i++;
} else {
i--;
@@ -404,7 +414,7 @@ static void test_multi_init_free(void **state)
i--;
libcheck_free(&c[i]);
} else {
- do_libcheck_init(&c[i], 4096, NULL);
+ do_libcheck_init(&c[i], 4096, 0, NULL);
i++;
}
}
@@ -420,9 +430,9 @@ static void test_check_state_simple(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
- do_libcheck_init(&c, 4096, &req);
+ do_libcheck_init(&c, 4096, 30, &req);
return_io_getevents_nr(NULL, 1, &req, &res);
- do_check_state(&c, 1, 30, PATH_UP);
+ do_check_state(&c, 1, PATH_UP);
libcheck_free(&c);
do_libcheck_reset(1);
}
@@ -435,10 +445,10 @@ static void test_check_state_timeout(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
- do_libcheck_init(&c, 4096, NULL);
+ do_libcheck_init(&c, 4096, 30, NULL);
aio_grp = get_aio_grp(&c);
return_io_getevents_none();
- do_check_state(&c, 1, 30, PATH_DOWN);
+ do_check_state(&c, 1, PATH_DOWN);
check_aio_grp(aio_grp, 1, 0);
libcheck_free(&c);
do_libcheck_reset(1);
@@ -452,16 +462,20 @@ static void test_check_state_async_timeout(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
- do_libcheck_init(&c, 4096, NULL);
+ do_libcheck_init(&c, 4096, 3, NULL);
aio_grp = get_aio_grp(&c);
+ do_check_state(&c, 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c, 0, 3, PATH_PENDING);
+ do_libcheck_pending(&c, PATH_PENDING);
+ do_check_state(&c, 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c, 0, 3, PATH_PENDING);
+ do_libcheck_pending(&c, PATH_PENDING);
+ do_check_state(&c, 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c, 0, 3, PATH_PENDING);
+ do_libcheck_pending(&c, PATH_PENDING);
+ do_check_state(&c, 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c, 0, 3, PATH_DOWN);
+ do_libcheck_pending(&c, PATH_DOWN);
check_aio_grp(aio_grp, 1, 0);
libcheck_free(&c);
do_libcheck_reset(1);
@@ -477,14 +491,16 @@ static void test_free_with_pending(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
- do_libcheck_init(&c[0], 4096, &req);
- do_libcheck_init(&c[1], 4096, NULL);
+ do_libcheck_init(&c[0], 4096, 30, &req);
+ do_libcheck_init(&c[1], 4096, 30, NULL);
aio_grp = get_aio_grp(c);
+ do_check_state(&c[0], 0, PATH_PENDING);
+ do_check_state(&c[1], 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c[0], 0, 30, PATH_PENDING);
+ do_libcheck_pending(&c[0], PATH_PENDING);
return_io_getevents_nr(NULL, 1, &req, &res);
return_io_getevents_none();
- do_check_state(&c[1], 0, 30, PATH_PENDING);
+ do_libcheck_pending(&c[1], PATH_PENDING);
assert_true(is_checker_running(&c[0]));
assert_true(is_checker_running(&c[1]));
check_aio_grp(aio_grp, 2, 0);
@@ -505,9 +521,10 @@ static void test_orphaned_aio_group(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
for (i = 0; i < AIO_GROUP_SIZE; i++) {
- do_libcheck_init(&c[i], 4096, NULL);
+ do_libcheck_init(&c[i], 4096, 30, NULL);
+ do_check_state(&c[i], 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c[i], 0, 30, PATH_PENDING);
+ do_libcheck_pending(&c[i], PATH_PENDING);
}
aio_grp = get_aio_grp(c);
check_aio_grp(aio_grp, AIO_GROUP_SIZE, 0);
@@ -539,19 +556,19 @@ static void test_timeout_cancel_failed(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
for (i = 0; i < 2; i++)
- do_libcheck_init(&c[i], 4096, &reqs[i]);
+ do_libcheck_init(&c[i], 4096, 30, &reqs[i]);
aio_grp = get_aio_grp(c);
return_io_getevents_none();
- do_check_state(&c[0], 1, 30, PATH_DOWN);
+ do_check_state(&c[0], 1, PATH_DOWN);
assert_true(is_checker_running(&c[0]));
check_aio_grp(aio_grp, 2, 0);
return_io_getevents_none();
- do_check_state(&c[0], 1, 30, PATH_DOWN);
+ do_check_state(&c[0], 1, PATH_DOWN);
assert_true(is_checker_running(&c[0]));
return_io_getevents_nr(NULL, 1, &reqs[0], &res[0]);
return_io_getevents_nr(NULL, 1, &reqs[1], &res[1]);
- do_check_state(&c[1], 1, 30, PATH_UP);
- do_check_state(&c[0], 1, 30, PATH_UP);
+ do_check_state(&c[1], 1, PATH_UP);
+ do_check_state(&c[0], 1, PATH_UP);
for (i = 0; i < 2; i++) {
assert_false(is_checker_running(&c[i]));
libcheck_free(&c[i]);
@@ -571,28 +588,37 @@ static void test_async_timeout_cancel_failed(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
for (i = 0; i < 2; i++)
- do_libcheck_init(&c[i], 4096, &reqs[i]);
+ do_libcheck_init(&c[i], 4096, 2, &reqs[i]);
+ do_check_state(&c[0], 0, PATH_PENDING);
+ do_check_state(&c[1], 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c[0], 0, 2, PATH_PENDING);
+ do_libcheck_pending(&c[0], PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c[1], 0, 2, PATH_PENDING);
+ do_libcheck_pending(&c[1], PATH_PENDING);
+ do_check_state(&c[0], 0, PATH_PENDING);
+ do_check_state(&c[1], 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c[0], 0, 2, PATH_PENDING);
+ do_libcheck_pending(&c[0], PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c[1], 0, 2, PATH_PENDING);
+ do_libcheck_pending(&c[1], PATH_PENDING);
+ do_check_state(&c[0], 0, PATH_PENDING);
+ do_check_state(&c[1], 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c[0], 0, 2, PATH_DOWN);
+ do_libcheck_pending(&c[0], PATH_DOWN);
if (!test_dev) {
/* can't pick which even gets returned on real devices */
return_io_getevents_nr(NULL, 1, &reqs[1], &res[1]);
- do_check_state(&c[1], 0, 2, PATH_UP);
+ do_libcheck_pending(&c[1], PATH_UP);
}
+ do_check_state(&c[0], 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c[0], 0, 2, PATH_DOWN);
+ do_libcheck_pending(&c[0], PATH_DOWN);
assert_true(is_checker_running(&c[0]));
+ do_check_state(&c[1], 0, PATH_PENDING);
+ do_check_state(&c[0], 0, PATH_PENDING);
return_io_getevents_nr(NULL, 2, reqs, res);
- do_check_state(&c[1], 0, 2, PATH_UP);
- do_check_state(&c[0], 0, 2, PATH_UP);
+ do_libcheck_pending(&c[1], PATH_UP);
+ do_libcheck_pending(&c[0], PATH_UP);
for (i = 0; i < 2; i++) {
assert_false(is_checker_running(&c[i]));
libcheck_free(&c[i]);
@@ -612,15 +638,17 @@ static void test_orphan_checker_cleanup(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
for (i = 0; i < 2; i++)
- do_libcheck_init(&c[i], 4096, &reqs[i]);
+ do_libcheck_init(&c[i], 4096, 30, &reqs[i]);
aio_grp = get_aio_grp(c);
+ do_check_state(&c[0], 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c[0], 0, 30, PATH_PENDING);
+ do_libcheck_pending(&c[0], PATH_PENDING);
check_aio_grp(aio_grp, 2, 0);
libcheck_free(&c[0]);
check_aio_grp(aio_grp, 2, 1);
+ do_check_state(&c[1], 0, PATH_PENDING);
return_io_getevents_nr(NULL, 2, reqs, res);
- do_check_state(&c[1], 0, 2, PATH_UP);
+ do_libcheck_pending(&c[1], PATH_UP);
check_aio_grp(aio_grp, 1, 0);
libcheck_free(&c[1]);
check_aio_grp(aio_grp, 0, 0);
@@ -636,10 +664,11 @@ static void test_orphan_reset_cleanup(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
- do_libcheck_init(&c, 4096, NULL);
+ do_libcheck_init(&c, 4096, 30, NULL);
orphan_aio_grp = get_aio_grp(&c);
+ do_check_state(&c, 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c, 0, 30, PATH_PENDING);
+ do_libcheck_pending(&c, PATH_PENDING);
check_aio_grp(orphan_aio_grp, 1, 0);
libcheck_free(&c);
check_aio_grp(orphan_aio_grp, 1, 1);
@@ -671,10 +700,10 @@ static void test_check_state_blksize(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
for (i = 0; i < 3; i++)
- do_libcheck_init(&c[i], blksize[i], &reqs[i]);
+ do_libcheck_init(&c[i], blksize[i], 30, &reqs[i]);
for (i = 0; i < 3; i++) {
return_io_getevents_nr(NULL, 1, &reqs[i], &res[i]);
- do_check_state(&c[i], 1, 30, chk_state[i]);
+ do_check_state(&c[i], 1, chk_state[i]);
}
for (i = 0; i < 3; i++) {
assert_false(is_checker_running(&c[i]));
@@ -695,19 +724,21 @@ static void test_check_state_async(void **state)
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
for (i = 0; i < 257; i++)
- do_libcheck_init(&c[i], 4096, &reqs[i]);
+ do_libcheck_init(&c[i], 4096, 30, &reqs[i]);
for (i = 0; i < 256; i++) {
+ do_check_state(&c[i], 0, PATH_PENDING);
return_io_getevents_none();
- do_check_state(&c[i], 0, 30, PATH_PENDING);
+ do_libcheck_pending(&c[i], PATH_PENDING);
assert_true(is_checker_running(&c[i]));
}
+ do_check_state(&c[256], 0, PATH_PENDING);
return_io_getevents_nr(&full_timeout, 256, reqs, res);
return_io_getevents_nr(NULL, 1, &reqs[256], &res[256]);
- do_check_state(&c[256], 0, 30, PATH_UP);
+ do_libcheck_pending(&c[256], PATH_UP);
assert_false(is_checker_running(&c[256]));
libcheck_free(&c[256]);
for (i = 0; i < 256; i++) {
- do_check_state(&c[i], 0, 30, PATH_UP);
+ do_check_state(&c[i], 0, PATH_UP);
assert_false(is_checker_running(&c[i]));
libcheck_free(&c[i]);
}
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 06/22] libmultipath: split get_state into two functions
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (4 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 05/22] multipath-tools tests: fix up directio tests Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 07/22] libmultipath: change path_offline to path_sysfs_state Benjamin Marzinski
` (17 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
get_state() is now split into start_checker(), which runs the checker
but doesn't wait for async checkers, and get_state(), which returns the
state from the checker, after waiting for async checkers if necessary.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/discovery.c | 22 ++++++++++++++++------
libmultipath/discovery.h | 4 +++-
libmultipath/libmultipath.version | 1 +
multipathd/main.c | 4 +++-
4 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 5648be60..e0f46ff2 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1974,30 +1974,29 @@ cciss_ioctl_pathinfo(struct path *pp)
}
int
-get_state (struct path * pp, struct config *conf, int daemon, int oldstate)
+start_checker (struct path * pp, struct config *conf, int daemon, int oldstate)
{
struct checker * c = &pp->checker;
- int state;
if (!checker_selected(c)) {
if (daemon) {
if (pathinfo(pp, conf, DI_SYSFS) != PATHINFO_OK) {
condlog(3, "%s: couldn't get sysfs pathinfo",
pp->dev);
- return PATH_UNCHECKED;
+ return -1;
}
}
select_detect_checker(conf, pp);
select_checker(conf, pp);
if (!checker_selected(c)) {
condlog(3, "%s: No checker selected", pp->dev);
- return PATH_UNCHECKED;
+ return -1;
}
checker_set_fd(c, pp->fd);
if (checker_init(c, pp->mpp?&pp->mpp->mpcontext:NULL)) {
checker_clear(c);
condlog(3, "%s: checker init failed", pp->dev);
- return PATH_UNCHECKED;
+ return -1;
}
}
if (pp->mpp && !c->mpcontext)
@@ -2008,6 +2007,15 @@ get_state (struct path * pp, struct config *conf, int daemon, int oldstate)
else
checker_set_sync(c);
checker_check(c, oldstate);
+ return 0;
+}
+
+int
+get_state (struct path * pp)
+{
+ struct checker * c = &pp->checker;
+ int state;
+
state = checker_get_state(c);
condlog(3, "%s: %s state = %s", pp->dev,
checker_name(c), checker_state_name(state));
@@ -2455,7 +2463,9 @@ int pathinfo(struct path *pp, struct config *conf, int mask)
if (mask & DI_CHECKER) {
if (path_state == PATH_UP) {
- int newstate = get_state(pp, conf, 0, path_state);
+ int newstate = PATH_UNCHECKED;
+ if (start_checker(pp, conf, 0, path_state) == 0)
+ newstate = get_state(pp);
if (newstate != PATH_PENDING ||
pp->state == PATH_UNCHECKED ||
pp->state == PATH_WILD)
diff --git a/libmultipath/discovery.h b/libmultipath/discovery.h
index c93abf1c..f3e0c618 100644
--- a/libmultipath/discovery.h
+++ b/libmultipath/discovery.h
@@ -34,7 +34,9 @@ int path_discovery (vector pathvec, int flag);
int path_get_tpgs(struct path *pp); /* This function never returns TPGS_UNDEF */
int do_tur (char *);
int path_offline (struct path *);
-int get_state (struct path * pp, struct config * conf, int daemon, int state);
+int start_checker(struct path * pp, struct config * conf, int daemon,
+ int state);
+int get_state(struct path * pp);
int get_vpd_sgio (int fd, int pg, int vend_id, char * str, int maxlen);
int pathinfo (struct path * pp, struct config * conf, int mask);
int alloc_path_with_pathinfo (struct config *conf, struct udev_device *udevice,
diff --git a/libmultipath/libmultipath.version b/libmultipath/libmultipath.version
index 21d48da6..bb34d9e8 100644
--- a/libmultipath/libmultipath.version
+++ b/libmultipath/libmultipath.version
@@ -160,6 +160,7 @@ global:
remove_wwid;
replace_wwids;
reset_checker_classes;
+ start_checker;
select_all_tg_pt;
select_action;
select_find_multipaths_timeout;
diff --git a/multipathd/main.c b/multipathd/main.c
index 1b7fd04f..4f752adb 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2324,7 +2324,9 @@ check_path_state(struct path *pp)
if (newstate == PATH_UP) {
conf = get_multipath_config();
pthread_cleanup_push(put_multipath_config, conf);
- newstate = get_state(pp, conf, 1, newstate);
+ newstate = PATH_UNCHECKED;
+ if (start_checker(pp, conf, 1, newstate) == 0)
+ newstate = get_state(pp);
pthread_cleanup_pop(1);
} else {
checker_clear_message(&pp->checker);
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 07/22] libmultipath: change path_offline to path_sysfs_state
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (5 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 06/22] libmultipath: split get_state into two functions Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 08/22] multipathd: split check_path_state into two functions Benjamin Marzinski
` (16 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Instead of pp->offline being a binary value, change it to show the
actual result of looking at the sysfs state, and rename it to
pp->sysfs_state. Also change the function name from path_offline() to
path_sysfs_state(). Adapt the tests for pp->offline. This should not
change how multipath currently works. pp->sysfs_state will be used in
future patches.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/discovery.c | 65 ++++++++++++++++---------------
libmultipath/discovery.h | 2 +-
libmultipath/libmultipath.version | 2 +-
libmultipath/print.c | 2 +-
libmultipath/structs.h | 2 +-
multipathd/main.c | 4 +-
6 files changed, 40 insertions(+), 37 deletions(-)
diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index e0f46ff2..6ccdfa0b 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1076,18 +1076,16 @@ detect_alua(struct path * pp)
return;
}
- if (pp->fd == -1 || pp->offline)
+ if (pp->fd == -1 || pp->sysfs_state == PATH_DOWN)
return;
ret = get_target_port_group(pp);
if (ret < 0 || get_asymmetric_access_state(pp, ret) < 0) {
- int state;
-
if (ret == -RTPG_INQUIRY_FAILED)
return;
- state = path_offline(pp);
- if (state != PATH_UP)
+ path_sysfs_state(pp);
+ if (pp->sysfs_state != PATH_UP)
return;
pp->tpgs = TPGS_NONE;
@@ -1800,7 +1798,7 @@ common_sysfs_pathinfo (struct path * pp)
}
int
-path_offline (struct path * pp)
+path_sysfs_state(struct path * pp)
{
struct udev_device * parent;
char buff[SCSI_STATE_SIZE];
@@ -1814,7 +1812,8 @@ path_offline (struct path * pp)
subsys_type = "nvme";
}
else {
- return PATH_UP;
+ pp->sysfs_state = PATH_UP;
+ goto out;
}
parent = pp->udev;
@@ -1827,16 +1826,18 @@ path_offline (struct path * pp)
if (!parent) {
condlog(1, "%s: failed to get sysfs information", pp->dev);
- return PATH_REMOVED;
+ pp->sysfs_state = PATH_REMOVED;
+ goto out;
}
memset(buff, 0x0, SCSI_STATE_SIZE);
err = sysfs_attr_get_value(parent, "state", buff, sizeof(buff));
if (!sysfs_attr_value_ok(err, sizeof(buff))) {
if (err == -ENXIO)
- return PATH_REMOVED;
+ pp->sysfs_state = PATH_REMOVED;
else
- return PATH_DOWN;
+ pp->sysfs_state = PATH_DOWN;
+ goto out;
}
@@ -1844,31 +1845,34 @@ path_offline (struct path * pp)
if (pp->bus == SYSFS_BUS_SCSI) {
if (!strncmp(buff, "offline", 7)) {
- pp->offline = 1;
- return PATH_DOWN;
+ pp->sysfs_state = PATH_DOWN;
+ goto out;
+ } else if (!strncmp(buff, "blocked", 7) ||
+ !strncmp(buff, "quiesce", 7)) {
+ pp->sysfs_state = PATH_PENDING;
+ goto out;
+ } else if (!strncmp(buff, "running", 7)) {
+ pp->sysfs_state = PATH_UP;
+ goto out;
}
- pp->offline = 0;
- if (!strncmp(buff, "blocked", 7) ||
- !strncmp(buff, "quiesce", 7))
- return PATH_PENDING;
- else if (!strncmp(buff, "running", 7))
- return PATH_UP;
}
else if (pp->bus == SYSFS_BUS_NVME) {
if (!strncmp(buff, "dead", 4)) {
- pp->offline = 1;
- return PATH_DOWN;
+ pp->sysfs_state = PATH_DOWN;
+ goto out;
+ } else if (!strncmp(buff, "new", 3) ||
+ !strncmp(buff, "deleting", 8)) {
+ pp->sysfs_state = PATH_PENDING;
+ goto out;
+ } else if (!strncmp(buff, "live", 4)) {
+ pp->sysfs_state = PATH_UP;
+ goto out;
}
- pp->offline = 0;
- if (!strncmp(buff, "new", 3) ||
- !strncmp(buff, "deleting", 8))
- return PATH_PENDING;
- else if (!strncmp(buff, "live", 4))
- return PATH_UP;
}
-
- return PATH_DOWN;
+ pp->sysfs_state = PATH_DOWN;
+out:
+ return pp->sysfs_state;
}
static int
@@ -2052,8 +2056,7 @@ get_prio (struct path * pp)
old_prio = pp->priority;
pp->priority = prio_getprio(p, pp);
if (pp->priority < 0) {
- /* this changes pp->offline, but why not */
- int state = path_offline(pp);
+ int state = path_sysfs_state(pp);
if (state == PATH_DOWN || state == PATH_PENDING) {
pp->priority = old_prio;
@@ -2424,7 +2427,7 @@ int pathinfo(struct path *pp, struct config *conf, int mask)
return PATHINFO_SKIPPED;
}
- path_state = path_offline(pp);
+ path_state = path_sysfs_state(pp);
if (path_state == PATH_REMOVED)
goto blank;
else if (mask & DI_NOIO) {
diff --git a/libmultipath/discovery.h b/libmultipath/discovery.h
index f3e0c618..7d42eae5 100644
--- a/libmultipath/discovery.h
+++ b/libmultipath/discovery.h
@@ -33,7 +33,7 @@ struct config;
int path_discovery (vector pathvec, int flag);
int path_get_tpgs(struct path *pp); /* This function never returns TPGS_UNDEF */
int do_tur (char *);
-int path_offline (struct path *);
+int path_sysfs_state(struct path *);
int start_checker(struct path * pp, struct config * conf, int daemon,
int state);
int get_state(struct path * pp);
diff --git a/libmultipath/libmultipath.version b/libmultipath/libmultipath.version
index bb34d9e8..6439d3a7 100644
--- a/libmultipath/libmultipath.version
+++ b/libmultipath/libmultipath.version
@@ -146,7 +146,7 @@ global:
path_discovery;
path_get_tpgs;
pathinfo;
- path_offline;
+ path_sysfs_state;
print_all_paths;
print_foreign_topology;
print_multipath_topology__;
diff --git a/libmultipath/print.c b/libmultipath/print.c
index e536c5c0..00c03ace 100644
--- a/libmultipath/print.c
+++ b/libmultipath/print.c
@@ -514,7 +514,7 @@ snprint_offline (struct strbuf *buff, const struct path * pp)
{
if (!pp || !pp->mpp)
return append_strbuf_str(buff, "unknown");
- else if (pp->offline)
+ else if (pp->sysfs_state == PATH_DOWN)
return append_strbuf_str(buff, "offline");
else
return append_strbuf_str(buff, "running");
diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index 074faca6..d8231e95 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -362,7 +362,7 @@ struct path {
unsigned int tick;
unsigned int pending_ticks;
int bus;
- int offline;
+ int sysfs_state;
int state;
int dmstate;
int chkrstate;
diff --git a/multipathd/main.c b/multipathd/main.c
index 4f752adb..33a57041 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -96,7 +96,7 @@ void * mpath_pr_event_handler_fn (void * );
do { \
if (pp->mpp && checker_selected(&pp->checker) && \
lvl <= libmp_verbosity) { \
- if (pp->offline) \
+ if (pp->sysfs_state == PATH_DOWN) \
condlog(lvl, "%s: %s - path offline", \
pp->mpp->alias, pp->dev); \
else { \
@@ -2320,7 +2320,7 @@ check_path_state(struct path *pp)
int newstate;
struct config *conf;
- newstate = path_offline(pp);
+ newstate = path_sysfs_state(pp);
if (newstate == PATH_UP) {
conf = get_multipath_config();
pthread_cleanup_push(put_multipath_config, conf);
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 08/22] multipathd: split check_path_state into two functions
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (6 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 07/22] libmultipath: change path_offline to path_sysfs_state Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 09/22] multipathd: split do_checker_path Benjamin Marzinski
` (15 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
check_path_state() is now split into start_path_check(), which calls
path_sysfs_state() and if the path is up also calls start_checker(), and
get_new_state() which gets the new state from either pp->sysfs_state
or get_state().
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
multipathd/main.c | 30 +++++++++++++++++++-----------
1 file changed, 19 insertions(+), 11 deletions(-)
diff --git a/multipathd/main.c b/multipathd/main.c
index 33a57041..16c0531e 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2314,25 +2314,31 @@ should_skip_path(struct path *pp){
return 0;
}
-static int
-check_path_state(struct path *pp)
+static void
+start_path_check(struct path *pp)
{
- int newstate;
struct config *conf;
- newstate = path_sysfs_state(pp);
- if (newstate == PATH_UP) {
+ if (path_sysfs_state(pp) == PATH_UP) {
conf = get_multipath_config();
pthread_cleanup_push(put_multipath_config, conf);
- newstate = PATH_UNCHECKED;
- if (start_checker(pp, conf, 1, newstate) == 0)
- newstate = get_state(pp);
+ start_checker(pp, conf, 1, PATH_UNCHECKED);
pthread_cleanup_pop(1);
} else {
checker_clear_message(&pp->checker);
condlog(3, "%s: state %s, checker not called",
- pp->dev, checker_state_name(newstate));
+ pp->dev, checker_state_name(pp->sysfs_state));
}
+}
+
+static int
+get_new_state(struct path *pp)
+{
+ int newstate = pp->sysfs_state;
+ struct config *conf;
+
+ if (newstate == PATH_UP)
+ newstate = get_state(pp);
/*
* Wait for uevent for removed paths;
* some LLDDs like zfcp keep paths unavailable
@@ -2413,7 +2419,8 @@ do_check_path (struct vectors * vecs, struct path * pp)
pp->checkint = checkint;
};
- newstate = check_path_state(pp);
+ start_path_check(pp);
+ newstate = get_new_state(pp);
if (newstate == PATH_WILD || newstate == PATH_UNCHECKED)
return CHECK_PATH_SKIPPED;
/*
@@ -2752,7 +2759,8 @@ handle_uninitialized_path(struct vectors * vecs, struct path * pp,
}
}
- newstate = check_path_state(pp);
+ start_path_check(pp);
+ newstate = get_new_state(pp);
if (!strlen(pp->wwid) &&
(pp->initialized == INIT_FAILED || pp->initialized == INIT_NEW) &&
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 09/22] multipathd: split do_checker_path
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (7 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 08/22] multipathd: split check_path_state into two functions Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-10-03 20:23 ` Martin Wilck
2024-09-12 21:49 ` [PATCH v2 10/22] multipathd: split check_path into two functions Benjamin Marzinski
` (14 subsequent siblings)
23 siblings, 1 reply; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Move the code that starts the path checker from do_check_path() into
check_path(), rename the remainder of do_check_path() to
update_path_state() and call that from check_path().
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
multipathd/main.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/multipathd/main.c b/multipathd/main.c
index 16c0531e..9319751e 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2396,7 +2396,7 @@ enum check_path_return {
};
static int
-do_check_path (struct vectors * vecs, struct path * pp)
+update_path_state (struct vectors * vecs, struct path * pp)
{
int newstate;
int new_path_up = 0;
@@ -2414,12 +2414,6 @@ do_check_path (struct vectors * vecs, struct path * pp)
marginal_pathgroups = conf->marginal_pathgroups;
put_multipath_config(conf);
- if (pp->checkint == CHECKINT_UNDEF) {
- condlog(0, "%s: BUG: checkint is not set", pp->dev);
- pp->checkint = checkint;
- };
-
- start_path_check(pp);
newstate = get_new_state(pp);
if (newstate == PATH_WILD || newstate == PATH_UNCHECKED)
return CHECK_PATH_SKIPPED;
@@ -2639,7 +2633,7 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks,
time_t start_secs)
{
int r;
- unsigned int adjust_int, max_checkint;
+ unsigned int adjust_int, checkint, max_checkint;
struct config *conf;
time_t next_idx, goal_idx;
@@ -2652,14 +2646,21 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks,
return CHECK_PATH_SKIPPED;
conf = get_multipath_config();
+ checkint = conf->checkint;
max_checkint = conf->max_checkint;
adjust_int = conf->adjust_int;
put_multipath_config(conf);
- r = do_check_path(vecs, pp);
+ if (pp->checkint == CHECKINT_UNDEF) {
+ condlog(0, "%s: BUG: checkint is not set", pp->dev);
+ pp->checkint = checkint;
+ }
+
+ start_path_check(pp);
+ r = update_path_state(vecs, pp);
/*
- * do_check_path() removed or orphaned the path.
+ * update_path() removed or orphaned the path.
*/
if (r == CHECK_PATH_REMOVED || !pp->mpp)
return r;
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* Re: [PATCH v2 09/22] multipathd: split do_checker_path
2024-09-12 21:49 ` [PATCH v2 09/22] multipathd: split do_checker_path Benjamin Marzinski
@ 2024-10-03 20:23 ` Martin Wilck
2024-10-08 18:18 ` Benjamin Marzinski
0 siblings, 1 reply; 40+ messages in thread
From: Martin Wilck @ 2024-10-03 20:23 UTC (permalink / raw)
To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development
Nit: the subject should refer to "do_check_path".
Also, the word "split" is somewhat confusing as you've been using it
for actually splitting one function into two elsewhere in this set.
Perhaps just call the patch "rename do_check_path to
update_path_state".
On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> Move the code that starts the path checker from do_check_path() into
> check_path(), rename the remainder of do_check_path() to
> update_path_state() and call that from check_path().
>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
> multipathd/main.c | 21 +++++++++++----------
> 1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 16c0531e..9319751e 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2396,7 +2396,7 @@ enum check_path_return {
> };
>
> static int
> -do_check_path (struct vectors * vecs, struct path * pp)
> +update_path_state (struct vectors * vecs, struct path * pp)
> {
> int newstate;
> int new_path_up = 0;
> @@ -2414,12 +2414,6 @@ do_check_path (struct vectors * vecs, struct
> path * pp)
> marginal_pathgroups = conf->marginal_pathgroups;
> put_multipath_config(conf);
>
> - if (pp->checkint == CHECKINT_UNDEF) {
> - condlog(0, "%s: BUG: checkint is not set", pp->dev);
> - pp->checkint = checkint;
> - };
> -
> - start_path_check(pp);
> newstate = get_new_state(pp);
> if (newstate == PATH_WILD || newstate == PATH_UNCHECKED)
> return CHECK_PATH_SKIPPED;
> @@ -2639,7 +2633,7 @@ check_path (struct vectors * vecs, struct path
> * pp, unsigned int ticks,
> time_t start_secs)
> {
> int r;
> - unsigned int adjust_int, max_checkint;
> + unsigned int adjust_int, checkint, max_checkint;
> struct config *conf;
> time_t next_idx, goal_idx;
>
> @@ -2652,14 +2646,21 @@ check_path (struct vectors * vecs, struct
> path * pp, unsigned int ticks,
> return CHECK_PATH_SKIPPED;
>
> conf = get_multipath_config();
> + checkint = conf->checkint;
> max_checkint = conf->max_checkint;
> adjust_int = conf->adjust_int;
> put_multipath_config(conf);
>
> - r = do_check_path(vecs, pp);
> + if (pp->checkint == CHECKINT_UNDEF) {
> + condlog(0, "%s: BUG: checkint is not set", pp->dev);
> + pp->checkint = checkint;
> + }
> +
> + start_path_check(pp);
> + r = update_path_state(vecs, pp);
>
> /*
> - * do_check_path() removed or orphaned the path.
> + * update_path() removed or orphaned the path.
Nit: update_path() is introduced in the follow-up patch.
Should this be update_path_state()?
Martin
> */
> if (r == CHECK_PATH_REMOVED || !pp->mpp)
> return r;
^ permalink raw reply [flat|nested] 40+ messages in thread* Re: [PATCH v2 09/22] multipathd: split do_checker_path
2024-10-03 20:23 ` Martin Wilck
@ 2024-10-08 18:18 ` Benjamin Marzinski
0 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-10-08 18:18 UTC (permalink / raw)
To: Martin Wilck; +Cc: Christophe Varoqui, device-mapper development
On Thu, Oct 03, 2024 at 10:23:51PM +0200, Martin Wilck wrote:
> Nit: the subject should refer to "do_check_path".
> Also, the word "split" is somewhat confusing as you've been using it
> for actually splitting one function into two elsewhere in this set.
>
> Perhaps just call the patch "rename do_check_path to
> update_path_state".
Sure.
>
> On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> > Move the code that starts the path checker from do_check_path() into
> > check_path(), rename the remainder of do_check_path() to
> > update_path_state() and call that from check_path().
> >
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> > multipathd/main.c | 21 +++++++++++----------
> > 1 file changed, 11 insertions(+), 10 deletions(-)
> >
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index 16c0531e..9319751e 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -2396,7 +2396,7 @@ enum check_path_return {
> > };
> >
> > static int
> > -do_check_path (struct vectors * vecs, struct path * pp)
> > +update_path_state (struct vectors * vecs, struct path * pp)
> > {
> > int newstate;
> > int new_path_up = 0;
> > @@ -2414,12 +2414,6 @@ do_check_path (struct vectors * vecs, struct
> > path * pp)
> > marginal_pathgroups = conf->marginal_pathgroups;
> > put_multipath_config(conf);
> >
> > - if (pp->checkint == CHECKINT_UNDEF) {
> > - condlog(0, "%s: BUG: checkint is not set", pp->dev);
> > - pp->checkint = checkint;
> > - };
> > -
> > - start_path_check(pp);
> > newstate = get_new_state(pp);
> > if (newstate == PATH_WILD || newstate == PATH_UNCHECKED)
> > return CHECK_PATH_SKIPPED;
> > @@ -2639,7 +2633,7 @@ check_path (struct vectors * vecs, struct path
> > * pp, unsigned int ticks,
> > time_t start_secs)
> > {
> > int r;
> > - unsigned int adjust_int, max_checkint;
> > + unsigned int adjust_int, checkint, max_checkint;
> > struct config *conf;
> > time_t next_idx, goal_idx;
> >
> > @@ -2652,14 +2646,21 @@ check_path (struct vectors * vecs, struct
> > path * pp, unsigned int ticks,
> > return CHECK_PATH_SKIPPED;
> >
> > conf = get_multipath_config();
> > + checkint = conf->checkint;
> > max_checkint = conf->max_checkint;
> > adjust_int = conf->adjust_int;
> > put_multipath_config(conf);
> >
> > - r = do_check_path(vecs, pp);
> > + if (pp->checkint == CHECKINT_UNDEF) {
> > + condlog(0, "%s: BUG: checkint is not set", pp->dev);
> > + pp->checkint = checkint;
> > + }
> > +
> > + start_path_check(pp);
> > + r = update_path_state(vecs, pp);
> >
> > /*
> > - * do_check_path() removed or orphaned the path.
> > + * update_path() removed or orphaned the path.
>
> Nit: update_path() is introduced in the follow-up patch.
> Should this be update_path_state()?
Oops. Yep.
-Ben
>
> Martin
>
>
> > */
> > if (r == CHECK_PATH_REMOVED || !pp->mpp)
> > return r;
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v2 10/22] multipathd: split check_path into two functions
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (8 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 09/22] multipathd: split do_checker_path Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-10-03 20:23 ` Martin Wilck
2024-09-12 21:49 ` [PATCH v2 11/22] multipathd: split handle_uninitialized_path " Benjamin Marzinski
` (13 subsequent siblings)
23 siblings, 1 reply; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Split out the code that updates a path's state and sets up the next
check time into its own function, update_path().
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
multipathd/main.c | 31 ++++++++++++++++++++++---------
1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/multipathd/main.c b/multipathd/main.c
index 9319751e..8bfb166d 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2390,6 +2390,7 @@ sync_mpp(struct vectors * vecs, struct multipath *mpp, unsigned int ticks)
}
enum check_path_return {
+ CHECK_PATH_STARTED,
CHECK_PATH_CHECKED,
CHECK_PATH_SKIPPED,
CHECK_PATH_REMOVED,
@@ -2629,13 +2630,10 @@ update_path_state (struct vectors * vecs, struct path * pp)
}
static int
-check_path (struct vectors * vecs, struct path * pp, unsigned int ticks,
- time_t start_secs)
+check_path (struct path * pp, unsigned int ticks)
{
- int r;
- unsigned int adjust_int, checkint, max_checkint;
+ unsigned int checkint;
struct config *conf;
- time_t next_idx, goal_idx;
if (pp->initialized == INIT_REMOVED)
return CHECK_PATH_SKIPPED;
@@ -2647,8 +2645,6 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks,
conf = get_multipath_config();
checkint = conf->checkint;
- max_checkint = conf->max_checkint;
- adjust_int = conf->adjust_int;
put_multipath_config(conf);
if (pp->checkint == CHECKINT_UNDEF) {
@@ -2657,6 +2653,17 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks,
}
start_path_check(pp);
+ return CHECK_PATH_STARTED;
+}
+
+static int
+update_path(struct vectors * vecs, struct path * pp, time_t start_secs)
+{
+ int r;
+ unsigned int adjust_int, max_checkint;
+ struct config *conf;
+ time_t next_idx, goal_idx;
+
r = update_path_state(vecs, pp);
/*
@@ -2685,6 +2692,10 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks,
if (pp->tick == 1)
return r;
+ conf = get_multipath_config();
+ max_checkint = conf->max_checkint;
+ adjust_int = conf->adjust_int;
+ put_multipath_config(conf);
/*
* every mpp has a goal_idx in the range of
* 0 <= goal_idx < conf->max_checkint
@@ -2818,8 +2829,10 @@ check_paths(struct vectors *vecs, unsigned int ticks, int *num_paths_p)
if (!pp->mpp || pp->is_checked)
continue;
pp->is_checked = true;
- rc = check_path(vecs, pp, ticks,
- start_time.tv_sec);
+ rc = check_path(pp, ticks);
+ if (rc == CHECK_PATH_STARTED)
+ rc = update_path(vecs, pp,
+ start_time.tv_sec);
if (rc == CHECK_PATH_CHECKED)
(*num_paths_p)++;
if (++paths_checked % 128 == 0)
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* Re: [PATCH v2 10/22] multipathd: split check_path into two functions
2024-09-12 21:49 ` [PATCH v2 10/22] multipathd: split check_path into two functions Benjamin Marzinski
@ 2024-10-03 20:23 ` Martin Wilck
2024-10-08 18:29 ` Benjamin Marzinski
0 siblings, 1 reply; 40+ messages in thread
From: Martin Wilck @ 2024-10-03 20:23 UTC (permalink / raw)
To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development
On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> Split out the code that updates a path's state and sets up the next
> check time into its own function, update_path().
>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
> multipathd/main.c | 31 ++++++++++++++++++++++---------
> 1 file changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 9319751e..8bfb166d 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2390,6 +2390,7 @@ sync_mpp(struct vectors * vecs, struct
> multipath *mpp, unsigned int ticks)
> }
>
> enum check_path_return {
> + CHECK_PATH_STARTED,
> CHECK_PATH_CHECKED,
> CHECK_PATH_SKIPPED,
> CHECK_PATH_REMOVED,
> @@ -2629,13 +2630,10 @@ update_path_state (struct vectors * vecs,
> struct path * pp)
> }
>
> static int
> -check_path (struct vectors * vecs, struct path * pp, unsigned int
> ticks,
> - time_t start_secs)
> +check_path (struct path * pp, unsigned int ticks)
> {
> - int r;
> - unsigned int adjust_int, checkint, max_checkint;
> + unsigned int checkint;
> struct config *conf;
> - time_t next_idx, goal_idx;
>
> if (pp->initialized == INIT_REMOVED)
> return CHECK_PATH_SKIPPED;
> @@ -2647,8 +2645,6 @@ check_path (struct vectors * vecs, struct path
> * pp, unsigned int ticks,
>
> conf = get_multipath_config();
> checkint = conf->checkint;
> - max_checkint = conf->max_checkint;
> - adjust_int = conf->adjust_int;
> put_multipath_config(conf);
>
After this change, we set only checking in this RCU critical section,
and checkint is only used in the if clause below. Perhaps move the
critical section into the if block?
Martin
^ permalink raw reply [flat|nested] 40+ messages in thread* Re: [PATCH v2 10/22] multipathd: split check_path into two functions
2024-10-03 20:23 ` Martin Wilck
@ 2024-10-08 18:29 ` Benjamin Marzinski
0 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-10-08 18:29 UTC (permalink / raw)
To: Martin Wilck; +Cc: Christophe Varoqui, device-mapper development
On Thu, Oct 03, 2024 at 10:23:57PM +0200, Martin Wilck wrote:
> On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> > Split out the code that updates a path's state and sets up the next
> > check time into its own function, update_path().
> >
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> > multipathd/main.c | 31 ++++++++++++++++++++++---------
> > 1 file changed, 22 insertions(+), 9 deletions(-)
> >
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index 9319751e..8bfb166d 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -2390,6 +2390,7 @@ sync_mpp(struct vectors * vecs, struct
> > multipath *mpp, unsigned int ticks)
> > }
> >
> > enum check_path_return {
> > + CHECK_PATH_STARTED,
> > CHECK_PATH_CHECKED,
> > CHECK_PATH_SKIPPED,
> > CHECK_PATH_REMOVED,
> > @@ -2629,13 +2630,10 @@ update_path_state (struct vectors * vecs,
> > struct path * pp)
> > }
> >
> > static int
> > -check_path (struct vectors * vecs, struct path * pp, unsigned int
> > ticks,
> > - time_t start_secs)
> > +check_path (struct path * pp, unsigned int ticks)
> > {
> > - int r;
> > - unsigned int adjust_int, checkint, max_checkint;
> > + unsigned int checkint;
> > struct config *conf;
> > - time_t next_idx, goal_idx;
> >
> > if (pp->initialized == INIT_REMOVED)
> > return CHECK_PATH_SKIPPED;
> > @@ -2647,8 +2645,6 @@ check_path (struct vectors * vecs, struct path
> > * pp, unsigned int ticks,
> >
> > conf = get_multipath_config();
> > checkint = conf->checkint;
> > - max_checkint = conf->max_checkint;
> > - adjust_int = conf->adjust_int;
> > put_multipath_config(conf);
> >
>
> After this change, we set only checking in this RCU critical section,
> and checkint is only used in the if clause below. Perhaps move the
> critical section into the if block?
Makes sense. Sure.
-Ben
> Martin
>
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v2 11/22] multipathd: split handle_uninitialized_path into two functions
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (9 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 10/22] multipathd: split check_path into two functions Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 12/22] multipathd: split check_paths " Benjamin Marzinski
` (12 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Split handle_uninitialized_path() into check_uninitialized_path, which
handles udev retriggers for INIT_MISSING_UDEV paths and starts the path
checker for INIT_FAILED and INIT_NEW paths, and
update_uninitialized_path() which gets the path checker result and
reruns pathinfo if the path is up.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
multipathd/main.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/multipathd/main.c b/multipathd/main.c
index 8bfb166d..45d40559 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2718,13 +2718,10 @@ update_path(struct vectors * vecs, struct path * pp, time_t start_secs)
}
static int
-handle_uninitialized_path(struct vectors * vecs, struct path * pp,
- unsigned int ticks)
+check_uninitialized_path(struct path * pp, unsigned int ticks)
{
- int newstate;
int retrigger_tries;
struct config *conf;
- int ret;
if (pp->initialized != INIT_NEW && pp->initialized != INIT_FAILED &&
pp->initialized != INIT_MISSING_UDEV)
@@ -2772,6 +2769,15 @@ handle_uninitialized_path(struct vectors * vecs, struct path * pp,
}
start_path_check(pp);
+ return CHECK_PATH_STARTED;
+}
+
+static int
+update_uninitialized_path(struct vectors * vecs, struct path * pp)
+{
+ int newstate, ret;
+ struct config *conf;
+
newstate = get_new_state(pp);
if (!strlen(pp->wwid) &&
@@ -2862,7 +2868,9 @@ next_mpp:
continue;
pp->is_checked = true;
- rc = handle_uninitialized_path(vecs, pp, ticks);
+ rc = check_uninitialized_path(pp, ticks);
+ if (rc == CHECK_PATH_STARTED)
+ rc = update_uninitialized_path(vecs, pp);
if (rc == CHECK_PATH_REMOVED)
i--;
else if (rc == CHECK_PATH_CHECKED)
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 12/22] multipathd: split check_paths into two functions
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (10 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 11/22] multipathd: split handle_uninitialized_path " Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-10-03 20:41 ` Martin Wilck
2024-09-12 21:49 ` [PATCH v2 13/22] multipathd: fix "fail path" and "reinstate path" commands Benjamin Marzinski
` (11 subsequent siblings)
23 siblings, 1 reply; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Instead of checking and updating each path, the checkerloop now starts
the checkers on all the paths in check_paths(), and then goes back and
updates all the paths in update_paths(). Since the async checkers use an
absolute time to wait for before returning PATH_PENDING, only one
checker actually needs to be waited for in update_paths(). The rest will
already have reached their timeout when update_path() is called for
them.
The check_paths() and update_paths() loop over the pathvec instead of
looping through the multipath device paths to avoid having to restart
checking of the device paths when the multipath device needs to be
resynced while updating the paths.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/structs.h | 10 +++-
multipathd/main.c | 105 +++++++++++++++++++----------------------
2 files changed, 57 insertions(+), 58 deletions(-)
diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index d8231e95..af8e31e9 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -314,6 +314,14 @@ enum recheck_wwid_states {
RECHECK_WWID_ON = YNU_YES,
};
+enum check_path_states {
+ CHECK_PATH_UNCHECKED,
+ CHECK_PATH_STARTED,
+ CHECK_PATH_CHECKED,
+ CHECK_PATH_SKIPPED,
+ CHECK_PATH_REMOVED,
+};
+
struct vpd_vendor_page {
int pg;
const char *name;
@@ -395,7 +403,7 @@ struct path {
int fast_io_fail;
unsigned int dev_loss;
int eh_deadline;
- bool is_checked;
+ enum check_path_states is_checked;
bool can_use_env_uid;
unsigned int checker_timeout;
/* configlet pointers */
diff --git a/multipathd/main.c b/multipathd/main.c
index 45d40559..9519b6c5 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2389,13 +2389,6 @@ sync_mpp(struct vectors * vecs, struct multipath *mpp, unsigned int ticks)
do_sync_mpp(vecs, mpp);
}
-enum check_path_return {
- CHECK_PATH_STARTED,
- CHECK_PATH_CHECKED,
- CHECK_PATH_SKIPPED,
- CHECK_PATH_REMOVED,
-};
-
static int
update_path_state (struct vectors * vecs, struct path * pp)
{
@@ -2735,6 +2728,7 @@ check_uninitialized_path(struct path * pp, unsigned int ticks)
conf = get_multipath_config();
retrigger_tries = conf->retrigger_tries;
pp->tick = conf->max_checkint;
+ pp->checkint = conf->checkint;
put_multipath_config(conf);
if (pp->initialized == INIT_MISSING_UDEV) {
@@ -2778,6 +2772,10 @@ update_uninitialized_path(struct vectors * vecs, struct path * pp)
int newstate, ret;
struct config *conf;
+ if (pp->initialized != INIT_NEW && pp->initialized != INIT_FAILED &&
+ pp->initialized != INIT_MISSING_UDEV)
+ return CHECK_PATH_SKIPPED;
+
newstate = get_new_state(pp);
if (!strlen(pp->wwid) &&
@@ -2807,80 +2805,69 @@ update_uninitialized_path(struct vectors * vecs, struct path * pp)
enum checker_state {
CHECKER_STARTING,
- CHECKER_RUNNING,
+ CHECKER_CHECKING_PATHS,
+ CHECKER_UPDATING_PATHS,
CHECKER_FINISHED,
};
static enum checker_state
-check_paths(struct vectors *vecs, unsigned int ticks, int *num_paths_p)
+check_paths(struct vectors *vecs, unsigned int ticks)
{
unsigned int paths_checked = 0;
struct timespec diff_time, start_time, end_time;
- struct multipath *mpp;
struct path *pp;
- int i, rc;
+ int i;
get_monotonic_time(&start_time);
- vector_foreach_slot(vecs->mpvec, mpp, i) {
- struct pathgroup *pgp;
- struct path *pp;
- int j, k;
- bool check_for_waiters = false;
- /* maps can be rechecked, so this is not always 0 */
- int synced_count = mpp->synced_count;
-
- vector_foreach_slot (mpp->pg, pgp, j) {
- vector_foreach_slot (pgp->paths, pp, k) {
- if (!pp->mpp || pp->is_checked)
- continue;
- pp->is_checked = true;
- rc = check_path(pp, ticks);
- if (rc == CHECK_PATH_STARTED)
- rc = update_path(vecs, pp,
- start_time.tv_sec);
- if (rc == CHECK_PATH_CHECKED)
- (*num_paths_p)++;
- if (++paths_checked % 128 == 0)
- check_for_waiters = true;
- /*
- * mpp has been removed or resynced. Path may
- * have been removed.
- */
- if (VECTOR_SLOT(vecs->mpvec, i) != mpp ||
- synced_count != mpp->synced_count) {
- i--;
- goto next_mpp;
- }
- }
- }
-next_mpp:
- if (check_for_waiters &&
+ vector_foreach_slot(vecs->pathvec, pp, i) {
+ if (pp->is_checked != CHECK_PATH_UNCHECKED)
+ continue;
+ if (pp->mpp)
+ pp->is_checked = check_path(pp, ticks);
+ else
+ pp->is_checked = check_uninitialized_path(pp, ticks);
+ if (++paths_checked % 128 == 0 &&
(lock_has_waiters(&vecs->lock) || waiting_clients())) {
get_monotonic_time(&end_time);
timespecsub(&end_time, &start_time, &diff_time);
if (diff_time.tv_sec > 0)
- return CHECKER_RUNNING;
+ return CHECKER_CHECKING_PATHS;
}
}
+ return CHECKER_UPDATING_PATHS;
+}
+
+static enum checker_state
+update_paths(struct vectors *vecs, int *num_paths_p, time_t start_secs)
+{
+ unsigned int paths_checked = 0;
+ struct timespec diff_time, start_time, end_time;
+ struct path *pp;
+ int i, rc;
+
+ get_monotonic_time(&start_time);
+
vector_foreach_slot(vecs->pathvec, pp, i) {
- if (pp->mpp || pp->is_checked)
+ if (pp->is_checked != CHECK_PATH_STARTED)
continue;
- pp->is_checked = true;
-
- rc = check_uninitialized_path(pp, ticks);
- if (rc == CHECK_PATH_STARTED)
+ if (pp->mpp)
+ rc = update_path(vecs, pp, start_secs);
+ else
rc = update_uninitialized_path(vecs, pp);
if (rc == CHECK_PATH_REMOVED)
i--;
- else if (rc == CHECK_PATH_CHECKED)
- (*num_paths_p)++;
+ else {
+ pp->is_checked = rc;
+ if (rc == CHECK_PATH_CHECKED)
+ (*num_paths_p)++;
+ }
if (++paths_checked % 128 == 0 &&
(lock_has_waiters(&vecs->lock) || waiting_clients())) {
get_monotonic_time(&end_time);
timespecsub(&end_time, &start_time, &diff_time);
if (diff_time.tv_sec > 0)
- return CHECKER_RUNNING;
+ return CHECKER_UPDATING_PATHS;
}
}
return CHECKER_FINISHED;
@@ -2948,10 +2935,14 @@ checkerloop (void *ap)
vector_foreach_slot(vecs->mpvec, mpp, i)
sync_mpp(vecs, mpp, ticks);
vector_foreach_slot(vecs->pathvec, pp, i)
- pp->is_checked = false;
- checker_state = CHECKER_RUNNING;
+ pp->is_checked = CHECK_PATH_UNCHECKED;
+ checker_state = CHECKER_CHECKING_PATHS;
}
- checker_state = check_paths(vecs, ticks, &num_paths);
+ if (checker_state == CHECKER_CHECKING_PATHS)
+ checker_state = check_paths(vecs, ticks);
+ if (checker_state == CHECKER_UPDATING_PATHS)
+ checker_state = update_paths(vecs, &num_paths,
+ start_time.tv_sec);
lock_cleanup_pop(vecs->lock);
if (checker_state != CHECKER_FINISHED) {
/* Yield to waiters */
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* Re: [PATCH v2 12/22] multipathd: split check_paths into two functions
2024-09-12 21:49 ` [PATCH v2 12/22] multipathd: split check_paths " Benjamin Marzinski
@ 2024-10-03 20:41 ` Martin Wilck
2024-10-08 19:16 ` Benjamin Marzinski
0 siblings, 1 reply; 40+ messages in thread
From: Martin Wilck @ 2024-10-03 20:41 UTC (permalink / raw)
To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development
On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> Instead of checking and updating each path, the checkerloop now
> starts
> the checkers on all the paths in check_paths(), and then goes back
> and
> updates all the paths in update_paths(). Since the async checkers use
> an
> absolute time to wait for before returning PATH_PENDING, only one
> checker actually needs to be waited for in update_paths(). The rest
> will
> already have reached their timeout when update_path() is called for
> them.
>
> The check_paths() and update_paths() loop over the pathvec instead of
> looping through the multipath device paths to avoid having to restart
> checking of the device paths when the multipath device needs to be
> resynced while updating the paths.
>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
> libmultipath/structs.h | 10 +++-
> multipathd/main.c | 105 +++++++++++++++++++--------------------
> --
> 2 files changed, 57 insertions(+), 58 deletions(-)
>
> diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> index d8231e95..af8e31e9 100644
> --- a/libmultipath/structs.h
> +++ b/libmultipath/structs.h
> @@ -314,6 +314,14 @@ enum recheck_wwid_states {
> RECHECK_WWID_ON = YNU_YES,
> };
>
> +enum check_path_states {
> + CHECK_PATH_UNCHECKED,
> + CHECK_PATH_STARTED,
> + CHECK_PATH_CHECKED,
> + CHECK_PATH_SKIPPED,
> + CHECK_PATH_REMOVED,
> +};
> +
> struct vpd_vendor_page {
> int pg;
> const char *name;
> @@ -395,7 +403,7 @@ struct path {
> int fast_io_fail;
> unsigned int dev_loss;
> int eh_deadline;
> - bool is_checked;
> + enum check_path_states is_checked;
> bool can_use_env_uid;
> unsigned int checker_timeout;
> /* configlet pointers */
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 45d40559..9519b6c5 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2389,13 +2389,6 @@ sync_mpp(struct vectors * vecs, struct
> multipath *mpp, unsigned int ticks)
> do_sync_mpp(vecs, mpp);
> }
>
> -enum check_path_return {
> - CHECK_PATH_STARTED,
> - CHECK_PATH_CHECKED,
> - CHECK_PATH_SKIPPED,
> - CHECK_PATH_REMOVED,
> -};
> -
> static int
> update_path_state (struct vectors * vecs, struct path * pp)
> {
> @@ -2735,6 +2728,7 @@ check_uninitialized_path(struct path * pp,
> unsigned int ticks)
> conf = get_multipath_config();
> retrigger_tries = conf->retrigger_tries;
> pp->tick = conf->max_checkint;
> + pp->checkint = conf->checkint;
> put_multipath_config(conf);
>
> if (pp->initialized == INIT_MISSING_UDEV) {
> @@ -2778,6 +2772,10 @@ update_uninitialized_path(struct vectors *
> vecs, struct path * pp)
> int newstate, ret;
> struct config *conf;
>
> + if (pp->initialized != INIT_NEW && pp->initialized !=
> INIT_FAILED &&
> + pp->initialized != INIT_MISSING_UDEV)
> + return CHECK_PATH_SKIPPED;
> +
> newstate = get_new_state(pp);
>
It seems to me that this hunk and the previous one belong into 11/22.
Martin
^ permalink raw reply [flat|nested] 40+ messages in thread* Re: [PATCH v2 12/22] multipathd: split check_paths into two functions
2024-10-03 20:41 ` Martin Wilck
@ 2024-10-08 19:16 ` Benjamin Marzinski
0 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-10-08 19:16 UTC (permalink / raw)
To: Martin Wilck; +Cc: Christophe Varoqui, device-mapper development
On Thu, Oct 03, 2024 at 10:41:20PM +0200, Martin Wilck wrote:
> On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> > Instead of checking and updating each path, the checkerloop now
> > starts
> > the checkers on all the paths in check_paths(), and then goes back
> > and
> > updates all the paths in update_paths(). Since the async checkers use
> > an
> > absolute time to wait for before returning PATH_PENDING, only one
> > checker actually needs to be waited for in update_paths(). The rest
> > will
> > already have reached their timeout when update_path() is called for
> > them.
> >
> > The check_paths() and update_paths() loop over the pathvec instead of
> > looping through the multipath device paths to avoid having to restart
> > checking of the device paths when the multipath device needs to be
> > resynced while updating the paths.
> >
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> > libmultipath/structs.h | 10 +++-
> > multipathd/main.c | 105 +++++++++++++++++++--------------------
> > --
> > 2 files changed, 57 insertions(+), 58 deletions(-)
> >
> > diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> > index d8231e95..af8e31e9 100644
> > --- a/libmultipath/structs.h
> > +++ b/libmultipath/structs.h
> > @@ -314,6 +314,14 @@ enum recheck_wwid_states {
> > RECHECK_WWID_ON = YNU_YES,
> > };
> >
> > +enum check_path_states {
> > + CHECK_PATH_UNCHECKED,
> > + CHECK_PATH_STARTED,
> > + CHECK_PATH_CHECKED,
> > + CHECK_PATH_SKIPPED,
> > + CHECK_PATH_REMOVED,
> > +};
> > +
> > struct vpd_vendor_page {
> > int pg;
> > const char *name;
> > @@ -395,7 +403,7 @@ struct path {
> > int fast_io_fail;
> > unsigned int dev_loss;
> > int eh_deadline;
> > - bool is_checked;
> > + enum check_path_states is_checked;
> > bool can_use_env_uid;
> > unsigned int checker_timeout;
> > /* configlet pointers */
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index 45d40559..9519b6c5 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -2389,13 +2389,6 @@ sync_mpp(struct vectors * vecs, struct
> > multipath *mpp, unsigned int ticks)
> > do_sync_mpp(vecs, mpp);
> > }
> >
> > -enum check_path_return {
> > - CHECK_PATH_STARTED,
> > - CHECK_PATH_CHECKED,
> > - CHECK_PATH_SKIPPED,
> > - CHECK_PATH_REMOVED,
> > -};
> > -
> > static int
> > update_path_state (struct vectors * vecs, struct path * pp)
> > {
> > @@ -2735,6 +2728,7 @@ check_uninitialized_path(struct path * pp,
> > unsigned int ticks)
> > conf = get_multipath_config();
> > retrigger_tries = conf->retrigger_tries;
> > pp->tick = conf->max_checkint;
> > + pp->checkint = conf->checkint;
> > put_multipath_config(conf);
> >
> > if (pp->initialized == INIT_MISSING_UDEV) {
> > @@ -2778,6 +2772,10 @@ update_uninitialized_path(struct vectors *
> > vecs, struct path * pp)
> > int newstate, ret;
> > struct config *conf;
> >
> > + if (pp->initialized != INIT_NEW && pp->initialized !=
> > INIT_FAILED &&
> > + pp->initialized != INIT_MISSING_UDEV)
> > + return CHECK_PATH_SKIPPED;
> > +
> > newstate = get_new_state(pp);
> >
>
> It seems to me that this hunk and the previous one belong into 11/22.
Setting pp->checkint could certainly go in the earlier patch. It's just
there to match check_path(), although to be honest, I'm still not sure
that there is any way for either check_path() or
check_uninitialized_path() to get called with pp->checkint unset.
The checking the path->initialized state in update_uninitialized_path()
makes less sense. That code is already in check_unintialized_path(). In
11/22, update_uninitialized_path() is always called immediately after
check_unintialized_path() if it's called at all, and pp->initialized
never has a chance to change.
In this patch, check_unintialized_path() is called in check_paths(), and
it's possible that path checking takes too long, and we need take a
break and yield to other vecs lock waiters before we call
update_paths(), where update_uninitialized_path() is called. In this
case, it's possible that a uevent got processed and changed
pp->initialized, so we need to recheck it in
update_uninitialized_path().
-Ben
> Martin
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v2 13/22] multipathd: fix "fail path" and "reinstate path" commands
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (11 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 12/22] multipathd: split check_paths " Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 14/22] multipathd: update priority once after updating all paths Benjamin Marzinski
` (10 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Now that multipathd can drop the vecs lock and sleep in-between when the
checker runs and when the path gets updated, it is possible for the
user to either fail or reinstate the path in this window.
If a path gets manually failed in the window between when the checker
starts and the path is updated, the checker will have already been
started, on the path, and if multipathd read the results like normal,
it could simply reinstate the path. To avoid this, when a path checker
is disabled, the checker path_state and message are updated, just
like they are when checker_check() is run on a disabled path, so that
when checker_get_state() is called, it will always return the same
results for a disabled checker, regardless of when it was disabled.
Reinstating the path doesn't cause many problems, but still can be
improved. Since the checker was disabled when it would have been
started, it didn't run during this cycle, only the kernel state will get
updated. The rest of the path update changes won't happen until the next
time the checker runs. This is the case regardless of whether or not the
path was reinstated in the window between when the checker starts and
the path is updated. To make reinstated paths get updated sooner,
pp->tick is now set to 1 when the path is reinstated.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/checkers.c | 2 ++
multipathd/cli_handlers.c | 1 +
2 files changed, 3 insertions(+)
diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
index ce3e48bd..f3e98352 100644
--- a/libmultipath/checkers.c
+++ b/libmultipath/checkers.c
@@ -251,6 +251,8 @@ void checker_disable (struct checker * c)
if (!c)
return;
c->disable = 1;
+ c->msgid = CHECKER_MSGID_DISABLED;
+ c->path_state = PATH_UNCHECKED;
}
int checker_init (struct checker * c, void ** mpctxt_addr)
diff --git a/multipathd/cli_handlers.c b/multipathd/cli_handlers.c
index 42603544..184c3f91 100644
--- a/multipathd/cli_handlers.c
+++ b/multipathd/cli_handlers.c
@@ -1108,6 +1108,7 @@ cli_reinstate(void * v, struct strbuf *reply, void * data)
pp->mpp->alias, pp->dev_t);
checker_enable(&pp->checker);
+ pp->tick = 1;
return dm_reinstate_path(pp->mpp->alias, pp->dev_t);
}
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 14/22] multipathd: update priority once after updating all paths
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (12 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 13/22] multipathd: fix "fail path" and "reinstate path" commands Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-10-03 21:00 ` Martin Wilck
2024-09-12 21:49 ` [PATCH v2 15/22] multipathd: simplify checking for followover_should_failback Benjamin Marzinski
` (9 subsequent siblings)
23 siblings, 1 reply; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Instead of updating the path priorities and possibly reloading the
multipath device when each path is updated, wait till all paths
have been updated, and then go through the multipath devices updating
the priorities once, reloading if necessary.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/structs.h | 9 +++
multipathd/main.c | 169 ++++++++++++++++++++++++++---------------
2 files changed, 118 insertions(+), 60 deletions(-)
diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index af8e31e9..1f531d30 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -318,6 +318,7 @@ enum check_path_states {
CHECK_PATH_UNCHECKED,
CHECK_PATH_STARTED,
CHECK_PATH_CHECKED,
+ CHECK_PATH_NEW_UP,
CHECK_PATH_SKIPPED,
CHECK_PATH_REMOVED,
};
@@ -421,6 +422,13 @@ enum prflag_value {
PRFLAG_SET,
};
+enum prio_update_type {
+ PRIO_UPDATE_NONE,
+ PRIO_UPDATE_NORMAL,
+ PRIO_UPDATE_NEW_PATH,
+ PRIO_UPDATE_MARGINAL,
+};
+
struct multipath {
char wwid[WWID_SIZE];
char alias_old[WWID_SIZE];
@@ -464,6 +472,7 @@ struct multipath {
int queue_mode;
unsigned int sync_tick;
int synced_count;
+ enum prio_update_type prio_update;
uid_t uid;
gid_t gid;
mode_t mode;
diff --git a/multipathd/main.c b/multipathd/main.c
index 9519b6c5..3cda3c18 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1996,15 +1996,13 @@ mpvec_garbage_collector (struct vectors * vecs)
* best pathgroup, and this is the first path in the pathgroup to come back
* up, then switch to this pathgroup */
static int
-followover_should_failback(struct path * pp)
+do_followover_should_failback(struct path * pp)
{
struct pathgroup * pgp;
struct path *pp1;
int i;
- if (pp->mpp->pgfailback != -FAILBACK_FOLLOWOVER ||
- !pp->mpp->pg || !pp->pgindex ||
- pp->pgindex != pp->mpp->bestpg)
+ if (!pp->pgindex || pp->pgindex != pp->mpp->bestpg)
return 0;
pgp = VECTOR_SLOT(pp->mpp->pg, pp->pgindex - 1);
@@ -2017,6 +2015,26 @@ followover_should_failback(struct path * pp)
return 1;
}
+static int
+followover_should_failback(struct multipath *mpp)
+{
+ struct path *pp;
+ struct pathgroup * pgp;
+ int i, j;
+
+ if (mpp->pgfailback != -FAILBACK_FOLLOWOVER || !mpp->pg)
+ return 0;
+
+ vector_foreach_slot (mpp->pg, pgp, i) {
+ vector_foreach_slot (pgp->paths, pp, j) {
+ if (pp->is_checked == CHECK_PATH_NEW_UP &&
+ do_followover_should_failback(pp))
+ return 1;
+ }
+ }
+ return 0;
+}
+
static void
missing_uev_wait_tick(struct vectors *vecs)
{
@@ -2132,41 +2150,53 @@ partial_retrigger_tick(vector pathvec)
}
}
-static int update_prio(struct path *pp, int force_refresh_all)
+static bool update_prio(struct multipath *mpp, bool refresh_all)
{
int oldpriority;
- struct path *pp1;
+ struct path *pp;
struct pathgroup * pgp;
- int i, j, changed = 0;
+ int i, j;
+ bool changed = false;
+ bool skipped_path = false;
struct config *conf;
- oldpriority = pp->priority;
- if (pp->state != PATH_DOWN) {
- conf = get_multipath_config();
- pthread_cleanup_push(put_multipath_config, conf);
- pathinfo(pp, conf, DI_PRIO);
- pthread_cleanup_pop(1);
+ vector_foreach_slot (mpp->pg, pgp, i) {
+ vector_foreach_slot (pgp->paths, pp, j) {
+ if (pp->state == PATH_DOWN)
+ continue;
+ if (!refresh_all &&
+ pp->is_checked != CHECK_PATH_CHECKED) {
+ skipped_path = true;
+ continue;
+ }
+ oldpriority = pp->priority;
+ conf = get_multipath_config();
+ pthread_cleanup_push(put_multipath_config, conf);
+ pathinfo(pp, conf, DI_PRIO);
+ pthread_cleanup_pop(1);
+ if (pp->priority != oldpriority)
+ changed = true;
+ }
}
-
- if (pp->priority != oldpriority)
- changed = 1;
- else if (!force_refresh_all)
- return 0;
-
- vector_foreach_slot (pp->mpp->pg, pgp, i) {
- vector_foreach_slot (pgp->paths, pp1, j) {
- if (pp1 == pp || pp1->state == PATH_DOWN)
+ if (!changed || !skipped_path)
+ return changed;
+ /*
+ * If a path changed priorities, refresh the priorities of any
+ * paths we skipped
+ */
+ vector_foreach_slot (mpp->pg, pgp, i) {
+ vector_foreach_slot (pgp->paths, pp, j) {
+ if (pp->state == PATH_DOWN)
+ continue;
+ if (pp->is_checked == CHECK_PATH_CHECKED)
continue;
- oldpriority = pp1->priority;
conf = get_multipath_config();
pthread_cleanup_push(put_multipath_config, conf);
- pathinfo(pp1, conf, DI_PRIO);
+ pathinfo(pp, conf, DI_PRIO);
pthread_cleanup_pop(1);
- if (pp1->priority != oldpriority)
- changed = 1;
}
}
- return changed;
+ return true;
}
static int reload_map(struct vectors *vecs, struct multipath *mpp,
@@ -2393,14 +2423,12 @@ static int
update_path_state (struct vectors * vecs, struct path * pp)
{
int newstate;
- int new_path_up = 0;
int chkr_new_path_up = 0;
int disable_reinstate = 0;
int oldchkrstate = pp->chkrstate;
unsigned int checkint, max_checkint;
struct config *conf;
- int marginal_pathgroups, marginal_changed = 0;
- bool need_reload;
+ int marginal_pathgroups;
conf = get_multipath_config();
checkint = conf->checkint;
@@ -2462,7 +2490,7 @@ update_path_state (struct vectors * vecs, struct path * pp)
}
if (!pp->marginal) {
pp->marginal = 1;
- marginal_changed = 1;
+ pp->mpp->prio_update = PRIO_UPDATE_MARGINAL;
}
} else {
if (pp->marginal || pp->state == PATH_DELAYED)
@@ -2470,7 +2498,7 @@ update_path_state (struct vectors * vecs, struct path * pp)
pp->dev);
if (marginal_pathgroups && pp->marginal) {
pp->marginal = 0;
- marginal_changed = 1;
+ pp->mpp->prio_update = PRIO_UPDATE_MARGINAL;
}
}
}
@@ -2537,7 +2565,8 @@ update_path_state (struct vectors * vecs, struct path * pp)
*/
if (!disable_reinstate)
reinstate_path(pp);
- new_path_up = 1;
+ if (pp->mpp->prio_update != PRIO_UPDATE_MARGINAL)
+ pp->mpp->prio_update = PRIO_UPDATE_NEW_PATH;
if (oldchkrstate != PATH_UP && oldchkrstate != PATH_GHOST)
chkr_new_path_up = 1;
@@ -2588,38 +2617,48 @@ update_path_state (struct vectors * vecs, struct path * pp)
LOG_MSG(2, pp);
}
}
-
+ if (pp->mpp->prio_update == PRIO_UPDATE_NONE &&
+ (newstate == PATH_UP || newstate == PATH_GHOST))
+ pp->mpp->prio_update = PRIO_UPDATE_NORMAL;
pp->state = newstate;
+ return chkr_new_path_up ? CHECK_PATH_NEW_UP : CHECK_PATH_CHECKED;
+}
- if (pp->mpp->wait_for_udev)
- return CHECK_PATH_CHECKED;
- /*
- * path prio refreshing
- */
- condlog(4, "path prio refresh");
-
- if (marginal_changed) {
- update_prio(pp, 1);
- reload_and_sync_map(pp->mpp, vecs);
- } else if (update_prio(pp, new_path_up) &&
- pp->mpp->pgpolicyfn == (pgpolicyfn *)group_by_prio &&
- pp->mpp->pgfailback == -FAILBACK_IMMEDIATE) {
+static int
+update_mpp_prio(struct vectors *vecs, struct multipath *mpp)
+{
+ bool need_reload, changed;
+ enum prio_update_type prio_update = mpp->prio_update;
+ mpp->prio_update = PRIO_UPDATE_NONE;
+
+ if (mpp->wait_for_udev || prio_update == PRIO_UPDATE_NONE)
+ return 0;
+ condlog(4, "prio refresh");
+
+ changed = update_prio(mpp, prio_update != PRIO_UPDATE_NORMAL);
+ if (prio_update == PRIO_UPDATE_MARGINAL)
+ return reload_and_sync_map(mpp, vecs);
+ if (changed && mpp->pgpolicyfn == (pgpolicyfn *)group_by_prio &&
+ mpp->pgfailback == -FAILBACK_IMMEDIATE) {
condlog(2, "%s: path priorities changed. reloading",
- pp->mpp->alias);
- reload_and_sync_map(pp->mpp, vecs);
- } else if (need_switch_pathgroup(pp->mpp, &need_reload)) {
- if (pp->mpp->pgfailback > 0 &&
- (new_path_up || pp->mpp->failback_tick <= 0))
- pp->mpp->failback_tick = pp->mpp->pgfailback + 1;
- else if (pp->mpp->pgfailback == -FAILBACK_IMMEDIATE ||
- (chkr_new_path_up && followover_should_failback(pp))) {
+ mpp->alias);
+ return reload_and_sync_map(mpp, vecs);
+ }
+ if (need_switch_pathgroup(mpp, &need_reload)) {
+ if (mpp->pgfailback > 0 &&
+ (prio_update == PRIO_UPDATE_NEW_PATH ||
+ mpp->failback_tick <= 0))
+ mpp->failback_tick = mpp->pgfailback + 1;
+ else if (mpp->pgfailback == -FAILBACK_IMMEDIATE ||
+ (prio_update == PRIO_UPDATE_NEW_PATH &&
+ followover_should_failback(mpp))) {
if (need_reload)
- reload_and_sync_map(pp->mpp, vecs);
+ return reload_and_sync_map(mpp, vecs);
else
- switch_pathgroup(pp->mpp);
+ switch_pathgroup(mpp);
}
}
- return CHECK_PATH_CHECKED;
+ return 0;
}
static int
@@ -2859,7 +2898,7 @@ update_paths(struct vectors *vecs, int *num_paths_p, time_t start_secs)
i--;
else {
pp->is_checked = rc;
- if (rc == CHECK_PATH_CHECKED)
+ if (rc == CHECK_PATH_CHECKED || rc == CHECK_PATH_NEW_UP)
(*num_paths_p)++;
}
if (++paths_checked % 128 == 0 &&
@@ -2932,8 +2971,10 @@ checkerloop (void *ap)
vector_foreach_slot(vecs->mpvec, mpp, i)
mpp->synced_count = 0;
if (checker_state == CHECKER_STARTING) {
- vector_foreach_slot(vecs->mpvec, mpp, i)
+ vector_foreach_slot(vecs->mpvec, mpp, i) {
sync_mpp(vecs, mpp, ticks);
+ mpp->prio_update = PRIO_UPDATE_NONE;
+ }
vector_foreach_slot(vecs->pathvec, pp, i)
pp->is_checked = CHECK_PATH_UNCHECKED;
checker_state = CHECKER_CHECKING_PATHS;
@@ -2943,6 +2984,14 @@ checkerloop (void *ap)
if (checker_state == CHECKER_UPDATING_PATHS)
checker_state = update_paths(vecs, &num_paths,
start_time.tv_sec);
+ if (checker_state == CHECKER_FINISHED) {
+ vector_foreach_slot(vecs->mpvec, mpp, i) {
+ if (update_mpp_prio(vecs, mpp) == 2) {
+ /* multipath device deleted */
+ i--;
+ }
+ }
+ }
lock_cleanup_pop(vecs->lock);
if (checker_state != CHECKER_FINISHED) {
/* Yield to waiters */
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* Re: [PATCH v2 14/22] multipathd: update priority once after updating all paths
2024-09-12 21:49 ` [PATCH v2 14/22] multipathd: update priority once after updating all paths Benjamin Marzinski
@ 2024-10-03 21:00 ` Martin Wilck
2024-10-08 19:17 ` Benjamin Marzinski
0 siblings, 1 reply; 40+ messages in thread
From: Martin Wilck @ 2024-10-03 21:00 UTC (permalink / raw)
To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development
On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> Instead of updating the path priorities and possibly reloading the
> multipath device when each path is updated, wait till all paths
> have been updated, and then go through the multipath devices updating
> the priorities once, reloading if necessary.
>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
> libmultipath/structs.h | 9 +++
> multipathd/main.c | 169 ++++++++++++++++++++++++++-------------
> --
> 2 files changed, 118 insertions(+), 60 deletions(-)
>
> diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> index af8e31e9..1f531d30 100644
> --- a/libmultipath/structs.h
> +++ b/libmultipath/structs.h
> @@ -318,6 +318,7 @@ enum check_path_states {
> CHECK_PATH_UNCHECKED,
> CHECK_PATH_STARTED,
> CHECK_PATH_CHECKED,
> + CHECK_PATH_NEW_UP,
> CHECK_PATH_SKIPPED,
> CHECK_PATH_REMOVED,
> };
> @@ -421,6 +422,13 @@ enum prflag_value {
> PRFLAG_SET,
> };
>
> +enum prio_update_type {
> + PRIO_UPDATE_NONE,
> + PRIO_UPDATE_NORMAL,
> + PRIO_UPDATE_NEW_PATH,
> + PRIO_UPDATE_MARGINAL,
> +};
> +
> struct multipath {
> char wwid[WWID_SIZE];
> char alias_old[WWID_SIZE];
> @@ -464,6 +472,7 @@ struct multipath {
> int queue_mode;
> unsigned int sync_tick;
> int synced_count;
> + enum prio_update_type prio_update;
> uid_t uid;
> gid_t gid;
> mode_t mode;
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 9519b6c5..3cda3c18 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -1996,15 +1996,13 @@ mpvec_garbage_collector (struct vectors *
> vecs)
> * best pathgroup, and this is the first path in the pathgroup to
> come back
> * up, then switch to this pathgroup */
> static int
> -followover_should_failback(struct path * pp)
> +do_followover_should_failback(struct path * pp)
> {
> struct pathgroup * pgp;
> struct path *pp1;
> int i;
>
> - if (pp->mpp->pgfailback != -FAILBACK_FOLLOWOVER ||
> - !pp->mpp->pg || !pp->pgindex ||
> - pp->pgindex != pp->mpp->bestpg)
> + if (!pp->pgindex || pp->pgindex != pp->mpp->bestpg)
> return 0;
>
> pgp = VECTOR_SLOT(pp->mpp->pg, pp->pgindex - 1);
> @@ -2017,6 +2015,26 @@ followover_should_failback(struct path * pp)
> return 1;
> }
>
> +static int
> +followover_should_failback(struct multipath *mpp)
> +{
> + struct path *pp;
> + struct pathgroup * pgp;
> + int i, j;
> +
> + if (mpp->pgfailback != -FAILBACK_FOLLOWOVER || !mpp->pg)
> + return 0;
> +
> + vector_foreach_slot (mpp->pg, pgp, i) {
> + vector_foreach_slot (pgp->paths, pp, j) {
> + if (pp->is_checked == CHECK_PATH_NEW_UP &&
> + do_followover_should_failback(pp))
> + return 1;
> + }
> + }
> + return 0;
> +}
> +
> static void
> missing_uev_wait_tick(struct vectors *vecs)
> {
> @@ -2132,41 +2150,53 @@ partial_retrigger_tick(vector pathvec)
> }
> }
>
> -static int update_prio(struct path *pp, int force_refresh_all)
> +static bool update_prio(struct multipath *mpp, bool refresh_all)
> {
> int oldpriority;
> - struct path *pp1;
> + struct path *pp;
> struct pathgroup * pgp;
> - int i, j, changed = 0;
> + int i, j;
> + bool changed = false;
> + bool skipped_path = false;
> struct config *conf;
>
> - oldpriority = pp->priority;
> - if (pp->state != PATH_DOWN) {
> - conf = get_multipath_config();
> - pthread_cleanup_push(put_multipath_config, conf);
> - pathinfo(pp, conf, DI_PRIO);
> - pthread_cleanup_pop(1);
> + vector_foreach_slot (mpp->pg, pgp, i) {
> + vector_foreach_slot (pgp->paths, pp, j) {
> + if (pp->state == PATH_DOWN)
> + continue;
> + if (!refresh_all &&
> + pp->is_checked != CHECK_PATH_CHECKED) {
> + skipped_path = true;
> + continue;
> + }
Nit: My first thought here was that this would skip paths for which
pp->is_checked == CHECK_PATH_NEW_UP. Then I realized that if there was
a new path up, refresh_all would be set, which is not immediately
obvious. Can you add a comment?
Martin
^ permalink raw reply [flat|nested] 40+ messages in thread* Re: [PATCH v2 14/22] multipathd: update priority once after updating all paths
2024-10-03 21:00 ` Martin Wilck
@ 2024-10-08 19:17 ` Benjamin Marzinski
0 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-10-08 19:17 UTC (permalink / raw)
To: Martin Wilck; +Cc: Christophe Varoqui, device-mapper development
On Thu, Oct 03, 2024 at 11:00:20PM +0200, Martin Wilck wrote:
> On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> > Instead of updating the path priorities and possibly reloading the
> > multipath device when each path is updated, wait till all paths
> > have been updated, and then go through the multipath devices updating
> > the priorities once, reloading if necessary.
> >
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> > libmultipath/structs.h | 9 +++
> > multipathd/main.c | 169 ++++++++++++++++++++++++++-------------
> > --
> > 2 files changed, 118 insertions(+), 60 deletions(-)
> >
> > diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> > index af8e31e9..1f531d30 100644
> > --- a/libmultipath/structs.h
> > +++ b/libmultipath/structs.h
> > @@ -318,6 +318,7 @@ enum check_path_states {
> > CHECK_PATH_UNCHECKED,
> > CHECK_PATH_STARTED,
> > CHECK_PATH_CHECKED,
> > + CHECK_PATH_NEW_UP,
> > CHECK_PATH_SKIPPED,
> > CHECK_PATH_REMOVED,
> > };
> > @@ -421,6 +422,13 @@ enum prflag_value {
> > PRFLAG_SET,
> > };
> >
> > +enum prio_update_type {
> > + PRIO_UPDATE_NONE,
> > + PRIO_UPDATE_NORMAL,
> > + PRIO_UPDATE_NEW_PATH,
> > + PRIO_UPDATE_MARGINAL,
> > +};
> > +
> > struct multipath {
> > char wwid[WWID_SIZE];
> > char alias_old[WWID_SIZE];
> > @@ -464,6 +472,7 @@ struct multipath {
> > int queue_mode;
> > unsigned int sync_tick;
> > int synced_count;
> > + enum prio_update_type prio_update;
> > uid_t uid;
> > gid_t gid;
> > mode_t mode;
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index 9519b6c5..3cda3c18 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -1996,15 +1996,13 @@ mpvec_garbage_collector (struct vectors *
> > vecs)
> > * best pathgroup, and this is the first path in the pathgroup to
> > come back
> > * up, then switch to this pathgroup */
> > static int
> > -followover_should_failback(struct path * pp)
> > +do_followover_should_failback(struct path * pp)
> > {
> > struct pathgroup * pgp;
> > struct path *pp1;
> > int i;
> >
> > - if (pp->mpp->pgfailback != -FAILBACK_FOLLOWOVER ||
> > - !pp->mpp->pg || !pp->pgindex ||
> > - pp->pgindex != pp->mpp->bestpg)
> > + if (!pp->pgindex || pp->pgindex != pp->mpp->bestpg)
> > return 0;
> >
> > pgp = VECTOR_SLOT(pp->mpp->pg, pp->pgindex - 1);
> > @@ -2017,6 +2015,26 @@ followover_should_failback(struct path * pp)
> > return 1;
> > }
> >
> > +static int
> > +followover_should_failback(struct multipath *mpp)
> > +{
> > + struct path *pp;
> > + struct pathgroup * pgp;
> > + int i, j;
> > +
> > + if (mpp->pgfailback != -FAILBACK_FOLLOWOVER || !mpp->pg)
> > + return 0;
> > +
> > + vector_foreach_slot (mpp->pg, pgp, i) {
> > + vector_foreach_slot (pgp->paths, pp, j) {
> > + if (pp->is_checked == CHECK_PATH_NEW_UP &&
> > + do_followover_should_failback(pp))
> > + return 1;
> > + }
> > + }
> > + return 0;
> > +}
> > +
> > static void
> > missing_uev_wait_tick(struct vectors *vecs)
> > {
> > @@ -2132,41 +2150,53 @@ partial_retrigger_tick(vector pathvec)
> > }
> > }
> >
> > -static int update_prio(struct path *pp, int force_refresh_all)
> > +static bool update_prio(struct multipath *mpp, bool refresh_all)
> > {
> > int oldpriority;
> > - struct path *pp1;
> > + struct path *pp;
> > struct pathgroup * pgp;
> > - int i, j, changed = 0;
> > + int i, j;
> > + bool changed = false;
> > + bool skipped_path = false;
> > struct config *conf;
> >
> > - oldpriority = pp->priority;
> > - if (pp->state != PATH_DOWN) {
> > - conf = get_multipath_config();
> > - pthread_cleanup_push(put_multipath_config, conf);
> > - pathinfo(pp, conf, DI_PRIO);
> > - pthread_cleanup_pop(1);
> > + vector_foreach_slot (mpp->pg, pgp, i) {
> > + vector_foreach_slot (pgp->paths, pp, j) {
> > + if (pp->state == PATH_DOWN)
> > + continue;
> > + if (!refresh_all &&
> > + pp->is_checked != CHECK_PATH_CHECKED) {
> > + skipped_path = true;
> > + continue;
> > + }
>
> Nit: My first thought here was that this would skip paths for which
> pp->is_checked == CHECK_PATH_NEW_UP. Then I realized that if there was
> a new path up, refresh_all would be set, which is not immediately
> obvious. Can you add a comment?
Sure.
>
> Martin
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v2 15/22] multipathd: simplify checking for followover_should_failback
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (13 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 14/22] multipathd: update priority once after updating all paths Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 16/22] multipathd: only refresh prios on PATH_UP and PATH_GHOST Benjamin Marzinski
` (8 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
The code needs to check that pp->pgindex equals pp->mpp->bestpg and that
they aren't both 0 (which is an invalid pathgroup id). Since we're
already checking that they are equal, we only need to check one of them
to see if it's non-zero. Instead of checking if pp->pgindex is non-zero
for every path, just check mpp->bestpg once for the multipath device and
return immediately if it's zero.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
multipathd/main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/multipathd/main.c b/multipathd/main.c
index 3cda3c18..75bc0620 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2002,7 +2002,7 @@ do_followover_should_failback(struct path * pp)
struct path *pp1;
int i;
- if (!pp->pgindex || pp->pgindex != pp->mpp->bestpg)
+ if (pp->pgindex != pp->mpp->bestpg)
return 0;
pgp = VECTOR_SLOT(pp->mpp->pg, pp->pgindex - 1);
@@ -2022,7 +2022,7 @@ followover_should_failback(struct multipath *mpp)
struct pathgroup * pgp;
int i, j;
- if (mpp->pgfailback != -FAILBACK_FOLLOWOVER || !mpp->pg)
+ if (mpp->pgfailback != -FAILBACK_FOLLOWOVER || !mpp->pg || !mpp->bestpg)
return 0;
vector_foreach_slot (mpp->pg, pgp, i) {
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 16/22] multipathd: only refresh prios on PATH_UP and PATH_GHOST
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (14 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 15/22] multipathd: simplify checking for followover_should_failback Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 17/22] multipathd: remove pointless check Benjamin Marzinski
` (7 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
The way that multipathd was handling priority refreshes had some issues.
Some of the multipath prioritizers can hang when run on a failed path.
Multipathd was only skipping paths in PATH_DOWN, but there are other
states where the prioritizer is also likely to hang, such as
PATH_TIMEOUT, PATH_SHAKY, and to a lesser extent, PATH_DELAYED.
Also, before the recent patch splitting the priority updating from the
path checking, multipathd wasn't consistent with which states would
cause
paths to get their priorities updated. If a path changed its state to
anything other than PATH_UP or PATH_GHOST, it wouldn't get its priority
updated. But if a path kept the same state its priority would get
updated as long at the state wasn't PATH_DOWN.
For safety's sake, a path's priority should only get refreshed when its
in the PATH_UP or PATH_GHOST state. This shouldn't cause problems. Only
paths that are in the PATH_UP or PATH_GHOST state are usable by the
kenel and contibute to the pathgroup's priority.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
multipathd/main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/multipathd/main.c b/multipathd/main.c
index 75bc0620..91806e9d 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2162,7 +2162,7 @@ static bool update_prio(struct multipath *mpp, bool refresh_all)
vector_foreach_slot (mpp->pg, pgp, i) {
vector_foreach_slot (pgp->paths, pp, j) {
- if (pp->state == PATH_DOWN)
+ if (pp->state != PATH_UP && pp->state != PATH_GHOST)
continue;
if (!refresh_all &&
pp->is_checked != CHECK_PATH_CHECKED) {
@@ -2186,7 +2186,7 @@ static bool update_prio(struct multipath *mpp, bool refresh_all)
*/
vector_foreach_slot (mpp->pg, pgp, i) {
vector_foreach_slot (pgp->paths, pp, j) {
- if (pp->state == PATH_DOWN)
+ if (pp->state != PATH_UP && pp->state != PATH_GHOST)
continue;
if (pp->is_checked == CHECK_PATH_CHECKED)
continue;
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 17/22] multipathd: remove pointless check
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (15 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 16/22] multipathd: only refresh prios on PATH_UP and PATH_GHOST Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 18/22] multipathd: fix deferred_failback_tick for reload removes Benjamin Marzinski
` (6 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Since we just returned if newstate wasn't PATH_UP or PATH_GHOST, it
must obviously be PATH_UP or PATH_GHOST at this point in the code. We
don't even wrap all the code for dealing with a path that just came up
in this if-block, It's only the persistent resrvation code.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
multipathd/main.c | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/multipathd/main.c b/multipathd/main.c
index 91806e9d..18632c7d 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2545,19 +2545,19 @@ update_path_state (struct vectors * vecs, struct path * pp)
return CHECK_PATH_CHECKED;
}
- if (newstate == PATH_UP || newstate == PATH_GHOST) {
- if (pp->mpp->prflag != PRFLAG_UNSET) {
- int prflag = pp->mpp->prflag;
- /*
- * Check Persistent Reservation.
- */
- condlog(2, "%s: checking persistent "
- "reservation registration", pp->dev);
- mpath_pr_event_handle(pp);
- if (pp->mpp->prflag == PRFLAG_SET &&
- prflag != PRFLAG_SET)
- pr_register_active_paths(pp->mpp);
- }
+ /* newstate == PATH_UP || newstate == PATH_GHOST */
+
+ if (pp->mpp->prflag != PRFLAG_UNSET) {
+ int prflag = pp->mpp->prflag;
+ /*
+ * Check Persistent Reservation.
+ */
+ condlog(2, "%s: checking persistent "
+ "reservation registration", pp->dev);
+ mpath_pr_event_handle(pp);
+ if (pp->mpp->prflag == PRFLAG_SET &&
+ prflag != PRFLAG_SET)
+ pr_register_active_paths(pp->mpp);
}
/*
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 18/22] multipathd: fix deferred_failback_tick for reload removes
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (16 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 17/22] multipathd: remove pointless check Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 19/22] libmultipath: add libcheck_need_wait checker function Benjamin Marzinski
` (5 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
If reload_and_sync_map() removes the multipath device,
deferred_failback_tick() needs to decrement the counter so that it
doesn't skip the following device.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
multipathd/main.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/multipathd/main.c b/multipathd/main.c
index 18632c7d..5e68e470 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2098,9 +2098,12 @@ deferred_failback_tick (struct vectors *vecs)
if (!mpp->failback_tick &&
need_switch_pathgroup(mpp, &need_reload)) {
- if (need_reload)
- reload_and_sync_map(mpp, vecs);
- else
+ if (need_reload) {
+ if (reload_and_sync_map(mpp, vecs) == 2) {
+ /* multipath device removed */
+ i--;
+ }
+ } else
switch_pathgroup(mpp);
}
}
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 19/22] libmultipath: add libcheck_need_wait checker function
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (17 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 18/22] multipathd: fix deferred_failback_tick for reload removes Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-10-03 21:15 ` Martin Wilck
2024-09-12 21:49 ` [PATCH v2 20/22] libmultipath: don't wait in libcheck_pending Benjamin Marzinski
` (4 subsequent siblings)
23 siblings, 1 reply; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Add a new optional checker class function, libcheck_need_wait() and a
new function to call it, checker_need_wait(). This can be used to see if
a path_checker is currently running. This will be used to determine if
there are pending checkers that need to be waited for.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/checkers.c | 12 +++++++++++-
libmultipath/checkers.h | 4 +++-
libmultipath/checkers/directio.c | 10 ++++++++++
libmultipath/checkers/tur.c | 10 ++++++++++
4 files changed, 34 insertions(+), 2 deletions(-)
diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
index f3e98352..e2eda58d 100644
--- a/libmultipath/checkers.c
+++ b/libmultipath/checkers.c
@@ -27,6 +27,7 @@ struct checker_class {
void (*reset)(void); /* to reset the global variables */
void *(*thread)(void *); /* async thread entry point */
int (*pending)(struct checker *); /* to recheck pending paths */
+ bool (*need_wait)(struct checker *); /* checker needs waiting for */
const char **msgtable;
short msgtable_size;
};
@@ -182,7 +183,8 @@ static struct checker_class *add_checker_class(const char *name)
c->reset = (void (*)(void)) dlsym(c->handle, "libcheck_reset");
c->thread = (void *(*)(void*)) dlsym(c->handle, "libcheck_thread");
c->pending = (int (*)(struct checker *)) dlsym(c->handle, "libcheck_pending");
- /* These 4 functions can be NULL. call dlerror() to clear out any
+ c->need_wait = (bool (*)(struct checker *)) dlsym(c->handle, "libcheck_need_wait");
+ /* These 5 functions can be NULL. call dlerror() to clear out any
* error string */
dlerror();
@@ -313,6 +315,14 @@ int checker_get_state(struct checker *c)
return c->path_state;
}
+bool checker_need_wait(struct checker *c)
+{
+ if (!c || !c->cls || c->path_state != PATH_PENDING ||
+ !c->cls->need_wait)
+ return false;
+ return c->cls->need_wait(c);
+}
+
void checker_check (struct checker * c, int path_state)
{
if (!c)
diff --git a/libmultipath/checkers.h b/libmultipath/checkers.h
index b2342a1b..da91f499 100644
--- a/libmultipath/checkers.h
+++ b/libmultipath/checkers.h
@@ -2,6 +2,7 @@
#define CHECKERS_H_INCLUDED
#include <pthread.h>
+#include <stdbool.h>
#include "list.h"
#include "defaults.h"
@@ -171,6 +172,7 @@ struct checker_context {
int start_checker_thread (pthread_t *thread, const pthread_attr_t *attr,
struct checker_context *ctx);
int checker_get_state(struct checker *c);
+bool checker_need_wait(struct checker *c);
void checker_check (struct checker *, int);
int checker_is_sync(const struct checker *);
const char *checker_name (const struct checker *);
@@ -191,7 +193,7 @@ void *libcheck_thread(struct checker_context *ctx);
void libcheck_reset(void);
int libcheck_mp_init(struct checker *);
int libcheck_pending(struct checker *c);
-
+bool libcheck_need_wait(struct checker *c);
/*
* msgid => message map.
diff --git a/libmultipath/checkers/directio.c b/libmultipath/checkers/directio.c
index 904e3071..4beed02e 100644
--- a/libmultipath/checkers/directio.c
+++ b/libmultipath/checkers/directio.c
@@ -65,6 +65,7 @@ struct directio_context {
struct aio_group *aio_grp;
struct async_req *req;
struct timespec endtime;
+ bool waited_for;
};
static struct aio_group *
@@ -295,6 +296,7 @@ check_pending(struct directio_context *ct, struct timespec endtime)
int r;
struct timespec currtime, timeout;
+ ct->waited_for = true;
while(1) {
get_monotonic_time(&currtime);
timespecsub(&endtime, &currtime, &timeout);
@@ -346,6 +348,7 @@ check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
get_monotonic_time(&ct->endtime);
ct->endtime.tv_nsec += 1000 * 1000;
normalize_timespec(&ct->endtime);
+ ct->waited_for = false;
}
ct->running++;
if (!sync)
@@ -386,6 +389,13 @@ static void set_msgid(struct checker *c, int state)
}
}
+bool libcheck_need_wait(struct checker *c)
+{
+ struct directio_context *ct = (struct directio_context *)c->context;
+ return (ct && ct->running && ct->req->state == PATH_PENDING &&
+ !ct->waited_for);
+}
+
int libcheck_pending(struct checker *c)
{
int rc;
diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c
index 81db565b..41d6b9c3 100644
--- a/libmultipath/checkers/tur.c
+++ b/libmultipath/checkers/tur.c
@@ -59,6 +59,7 @@ struct tur_checker_context {
struct checker_context ctx;
unsigned int nr_timeouts;
struct timespec endtime;
+ bool waited_for;
};
int libcheck_init (struct checker * c)
@@ -351,9 +352,17 @@ int check_pending(struct checker *c)
ct->thread = 0;
}
+ ct->waited_for = true;
return tur_status;
}
+bool libcheck_need_wait(struct checker *c)
+{
+ struct tur_checker_context *ct = c->context;
+ return (ct && ct->thread && uatomic_read(&ct->running) != 0 &&
+ !ct->waited_for);
+}
+
int libcheck_pending(struct checker *c)
{
struct tur_checker_context *ct = c->context;
@@ -463,6 +472,7 @@ int libcheck_check(struct checker * c)
pthread_mutex_unlock(&ct->lock);
ct->fd = c->fd;
ct->timeout = c->timeout;
+ ct->waited_for = false;
uatomic_add(&ct->holders, 1);
uatomic_set(&ct->running, 1);
tur_set_async_timeout(c);
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* Re: [PATCH v2 19/22] libmultipath: add libcheck_need_wait checker function
2024-09-12 21:49 ` [PATCH v2 19/22] libmultipath: add libcheck_need_wait checker function Benjamin Marzinski
@ 2024-10-03 21:15 ` Martin Wilck
2024-10-08 19:33 ` Benjamin Marzinski
0 siblings, 1 reply; 40+ messages in thread
From: Martin Wilck @ 2024-10-03 21:15 UTC (permalink / raw)
To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development
On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> Add a new optional checker class function, libcheck_need_wait() and a
> new function to call it, checker_need_wait(). This can be used to see
> if
> a path_checker is currently running. This will be used to determine
> if
> there are pending checkers that need to be waited for.
>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
> libmultipath/checkers.c | 12 +++++++++++-
> libmultipath/checkers.h | 4 +++-
> libmultipath/checkers/directio.c | 10 ++++++++++
> libmultipath/checkers/tur.c | 10 ++++++++++
> 4 files changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
> index f3e98352..e2eda58d 100644
> --- a/libmultipath/checkers.c
> +++ b/libmultipath/checkers.c
> @@ -27,6 +27,7 @@ struct checker_class {
> void (*reset)(void); /* to reset the global
> variables */
> void *(*thread)(void *); /* async thread entry
> point */
> int (*pending)(struct checker *); /* to recheck pending
> paths */
> + bool (*need_wait)(struct checker *); /* checker needs
> waiting for */
> const char **msgtable;
> short msgtable_size;
> };
> @@ -182,7 +183,8 @@ static struct checker_class
> *add_checker_class(const char *name)
> c->reset = (void (*)(void)) dlsym(c->handle,
> "libcheck_reset");
> c->thread = (void *(*)(void*)) dlsym(c->handle,
> "libcheck_thread");
> c->pending = (int (*)(struct checker *)) dlsym(c->handle,
> "libcheck_pending");
> - /* These 4 functions can be NULL. call dlerror() to clear
> out any
> + c->need_wait = (bool (*)(struct checker *)) dlsym(c->handle,
> "libcheck_need_wait");
> + /* These 5 functions can be NULL. call dlerror() to clear
> out any
> * error string */
> dlerror();
>
> @@ -313,6 +315,14 @@ int checker_get_state(struct checker *c)
> return c->path_state;
> }
>
> +bool checker_need_wait(struct checker *c)
> +{
> + if (!c || !c->cls || c->path_state != PATH_PENDING ||
> + !c->cls->need_wait)
> + return false;
> + return c->cls->need_wait(c);
> +}
> +
> void checker_check (struct checker * c, int path_state)
> {
> if (!c)
> diff --git a/libmultipath/checkers.h b/libmultipath/checkers.h
> index b2342a1b..da91f499 100644
> --- a/libmultipath/checkers.h
> +++ b/libmultipath/checkers.h
> @@ -2,6 +2,7 @@
> #define CHECKERS_H_INCLUDED
>
> #include <pthread.h>
> +#include <stdbool.h>
> #include "list.h"
> #include "defaults.h"
>
> @@ -171,6 +172,7 @@ struct checker_context {
> int start_checker_thread (pthread_t *thread, const pthread_attr_t
> *attr,
> struct checker_context *ctx);
> int checker_get_state(struct checker *c);
> +bool checker_need_wait(struct checker *c);
> void checker_check (struct checker *, int);
> int checker_is_sync(const struct checker *);
> const char *checker_name (const struct checker *);
> @@ -191,7 +193,7 @@ void *libcheck_thread(struct checker_context
> *ctx);
> void libcheck_reset(void);
> int libcheck_mp_init(struct checker *);
> int libcheck_pending(struct checker *c);
> -
> +bool libcheck_need_wait(struct checker *c);
>
> /*
> * msgid => message map.
> diff --git a/libmultipath/checkers/directio.c
> b/libmultipath/checkers/directio.c
> index 904e3071..4beed02e 100644
> --- a/libmultipath/checkers/directio.c
> +++ b/libmultipath/checkers/directio.c
> @@ -65,6 +65,7 @@ struct directio_context {
> struct aio_group *aio_grp;
> struct async_req *req;
> struct timespec endtime;
> + bool waited_for;
> };
>
> static struct aio_group *
> @@ -295,6 +296,7 @@ check_pending(struct directio_context *ct, struct
> timespec endtime)
> int r;
> struct timespec currtime, timeout;
>
> + ct->waited_for = true;
Not sure if it's important, but strictly speaking, we have only waited
for the checker if the endtime was in the future, which is often not
the case. Otherwise we'd just have polled.
Martin
> while(1) {
> get_monotonic_time(&currtime);
> timespecsub(&endtime, &currtime, &timeout);
> @@ -346,6 +348,7 @@ check_state(int fd, struct directio_context *ct,
> int sync, int timeout_secs)
> get_monotonic_time(&ct->endtime);
> ct->endtime.tv_nsec += 1000 * 1000;
> normalize_timespec(&ct->endtime);
> + ct->waited_for = false;
> }
> ct->running++;
> if (!sync)
> @@ -386,6 +389,13 @@ static void set_msgid(struct checker *c, int
> state)
> }
> }
>
> +bool libcheck_need_wait(struct checker *c)
> +{
> + struct directio_context *ct = (struct directio_context *)c-
> >context;
> + return (ct && ct->running && ct->req->state == PATH_PENDING
> &&
> + !ct->waited_for);
> +}
> +
> int libcheck_pending(struct checker *c)
> {
> int rc;
> diff --git a/libmultipath/checkers/tur.c
> b/libmultipath/checkers/tur.c
> index 81db565b..41d6b9c3 100644
> --- a/libmultipath/checkers/tur.c
> +++ b/libmultipath/checkers/tur.c
> @@ -59,6 +59,7 @@ struct tur_checker_context {
> struct checker_context ctx;
> unsigned int nr_timeouts;
> struct timespec endtime;
> + bool waited_for;
> };
>
> int libcheck_init (struct checker * c)
> @@ -351,9 +352,17 @@ int check_pending(struct checker *c)
> ct->thread = 0;
> }
>
> + ct->waited_for = true;
> return tur_status;
> }
>
> +bool libcheck_need_wait(struct checker *c)
> +{
> + struct tur_checker_context *ct = c->context;
> + return (ct && ct->thread && uatomic_read(&ct->running) != 0
> &&
> + !ct->waited_for);
> +}
> +
> int libcheck_pending(struct checker *c)
> {
> struct tur_checker_context *ct = c->context;
> @@ -463,6 +472,7 @@ int libcheck_check(struct checker * c)
> pthread_mutex_unlock(&ct->lock);
> ct->fd = c->fd;
> ct->timeout = c->timeout;
> + ct->waited_for = false;
> uatomic_add(&ct->holders, 1);
> uatomic_set(&ct->running, 1);
> tur_set_async_timeout(c);
^ permalink raw reply [flat|nested] 40+ messages in thread* Re: [PATCH v2 19/22] libmultipath: add libcheck_need_wait checker function
2024-10-03 21:15 ` Martin Wilck
@ 2024-10-08 19:33 ` Benjamin Marzinski
2024-10-09 15:49 ` Martin Wilck
0 siblings, 1 reply; 40+ messages in thread
From: Benjamin Marzinski @ 2024-10-08 19:33 UTC (permalink / raw)
To: Martin Wilck; +Cc: Christophe Varoqui, device-mapper development
On Thu, Oct 03, 2024 at 11:15:21PM +0200, Martin Wilck wrote:
> On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> > Add a new optional checker class function, libcheck_need_wait() and a
> > new function to call it, checker_need_wait(). This can be used to see
> > if
> > a path_checker is currently running. This will be used to determine
> > if
> > there are pending checkers that need to be waited for.
> >
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> > libmultipath/checkers.c | 12 +++++++++++-
> > libmultipath/checkers.h | 4 +++-
> > libmultipath/checkers/directio.c | 10 ++++++++++
> > libmultipath/checkers/tur.c | 10 ++++++++++
> > 4 files changed, 34 insertions(+), 2 deletions(-)
> >
> > diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
> > index f3e98352..e2eda58d 100644
> > --- a/libmultipath/checkers.c
> > +++ b/libmultipath/checkers.c
> > @@ -27,6 +27,7 @@ struct checker_class {
> > void (*reset)(void); /* to reset the global
> > variables */
> > void *(*thread)(void *); /* async thread entry
> > point */
> > int (*pending)(struct checker *); /* to recheck pending
> > paths */
> > + bool (*need_wait)(struct checker *); /* checker needs
> > waiting for */
> > const char **msgtable;
> > short msgtable_size;
> > };
> > @@ -182,7 +183,8 @@ static struct checker_class
> > *add_checker_class(const char *name)
> > c->reset = (void (*)(void)) dlsym(c->handle,
> > "libcheck_reset");
> > c->thread = (void *(*)(void*)) dlsym(c->handle,
> > "libcheck_thread");
> > c->pending = (int (*)(struct checker *)) dlsym(c->handle,
> > "libcheck_pending");
> > - /* These 4 functions can be NULL. call dlerror() to clear
> > out any
> > + c->need_wait = (bool (*)(struct checker *)) dlsym(c->handle,
> > "libcheck_need_wait");
> > + /* These 5 functions can be NULL. call dlerror() to clear
> > out any
> > * error string */
> > dlerror();
> >
> > @@ -313,6 +315,14 @@ int checker_get_state(struct checker *c)
> > return c->path_state;
> > }
> >
> > +bool checker_need_wait(struct checker *c)
> > +{
> > + if (!c || !c->cls || c->path_state != PATH_PENDING ||
> > + !c->cls->need_wait)
> > + return false;
> > + return c->cls->need_wait(c);
> > +}
> > +
> > void checker_check (struct checker * c, int path_state)
> > {
> > if (!c)
> > diff --git a/libmultipath/checkers.h b/libmultipath/checkers.h
> > index b2342a1b..da91f499 100644
> > --- a/libmultipath/checkers.h
> > +++ b/libmultipath/checkers.h
> > @@ -2,6 +2,7 @@
> > #define CHECKERS_H_INCLUDED
> >
> > #include <pthread.h>
> > +#include <stdbool.h>
> > #include "list.h"
> > #include "defaults.h"
> >
> > @@ -171,6 +172,7 @@ struct checker_context {
> > int start_checker_thread (pthread_t *thread, const pthread_attr_t
> > *attr,
> > struct checker_context *ctx);
> > int checker_get_state(struct checker *c);
> > +bool checker_need_wait(struct checker *c);
> > void checker_check (struct checker *, int);
> > int checker_is_sync(const struct checker *);
> > const char *checker_name (const struct checker *);
> > @@ -191,7 +193,7 @@ void *libcheck_thread(struct checker_context
> > *ctx);
> > void libcheck_reset(void);
> > int libcheck_mp_init(struct checker *);
> > int libcheck_pending(struct checker *c);
> > -
> > +bool libcheck_need_wait(struct checker *c);
> >
> > /*
> > * msgid => message map.
> > diff --git a/libmultipath/checkers/directio.c
> > b/libmultipath/checkers/directio.c
> > index 904e3071..4beed02e 100644
> > --- a/libmultipath/checkers/directio.c
> > +++ b/libmultipath/checkers/directio.c
> > @@ -65,6 +65,7 @@ struct directio_context {
> > struct aio_group *aio_grp;
> > struct async_req *req;
> > struct timespec endtime;
> > + bool waited_for;
> > };
> >
> > static struct aio_group *
> > @@ -295,6 +296,7 @@ check_pending(struct directio_context *ct, struct
> > timespec endtime)
> > int r;
> > struct timespec currtime, timeout;
> >
> > + ct->waited_for = true;
>
> Not sure if it's important, but strictly speaking, we have only waited
> for the checker if the endtime was in the future, which is often not
> the case. Otherwise we'd just have polled.
But we always set the endtime in the future when we're starting a new
request, so the first time the we call check_pending() we don't return
till we either get an answer or endtime has passed, which is what we
want waited_for to check (that we've give the checker till endtime to
respond).
-Ben
>
> Martin
>
>
> > while(1) {
> > get_monotonic_time(&currtime);
> > timespecsub(&endtime, &currtime, &timeout);
> > @@ -346,6 +348,7 @@ check_state(int fd, struct directio_context *ct,
> > int sync, int timeout_secs)
> > get_monotonic_time(&ct->endtime);
> > ct->endtime.tv_nsec += 1000 * 1000;
> > normalize_timespec(&ct->endtime);
> > + ct->waited_for = false;
> > }
> > ct->running++;
> > if (!sync)
> > @@ -386,6 +389,13 @@ static void set_msgid(struct checker *c, int
> > state)
> > }
> > }
> >
> > +bool libcheck_need_wait(struct checker *c)
> > +{
> > + struct directio_context *ct = (struct directio_context *)c-
> > >context;
> > + return (ct && ct->running && ct->req->state == PATH_PENDING
> > &&
> > + !ct->waited_for);
> > +}
> > +
> > int libcheck_pending(struct checker *c)
> > {
> > int rc;
> > diff --git a/libmultipath/checkers/tur.c
> > b/libmultipath/checkers/tur.c
> > index 81db565b..41d6b9c3 100644
> > --- a/libmultipath/checkers/tur.c
> > +++ b/libmultipath/checkers/tur.c
> > @@ -59,6 +59,7 @@ struct tur_checker_context {
> > struct checker_context ctx;
> > unsigned int nr_timeouts;
> > struct timespec endtime;
> > + bool waited_for;
> > };
> >
> > int libcheck_init (struct checker * c)
> > @@ -351,9 +352,17 @@ int check_pending(struct checker *c)
> > ct->thread = 0;
> > }
> >
> > + ct->waited_for = true;
> > return tur_status;
> > }
> >
> > +bool libcheck_need_wait(struct checker *c)
> > +{
> > + struct tur_checker_context *ct = c->context;
> > + return (ct && ct->thread && uatomic_read(&ct->running) != 0
> > &&
> > + !ct->waited_for);
> > +}
> > +
> > int libcheck_pending(struct checker *c)
> > {
> > struct tur_checker_context *ct = c->context;
> > @@ -463,6 +472,7 @@ int libcheck_check(struct checker * c)
> > pthread_mutex_unlock(&ct->lock);
> > ct->fd = c->fd;
> > ct->timeout = c->timeout;
> > + ct->waited_for = false;
> > uatomic_add(&ct->holders, 1);
> > uatomic_set(&ct->running, 1);
> > tur_set_async_timeout(c);
^ permalink raw reply [flat|nested] 40+ messages in thread* Re: [PATCH v2 19/22] libmultipath: add libcheck_need_wait checker function
2024-10-08 19:33 ` Benjamin Marzinski
@ 2024-10-09 15:49 ` Martin Wilck
2024-10-14 17:48 ` Benjamin Marzinski
0 siblings, 1 reply; 40+ messages in thread
From: Martin Wilck @ 2024-10-09 15:49 UTC (permalink / raw)
To: Benjamin Marzinski; +Cc: Christophe Varoqui, device-mapper development
On Tue, 2024-10-08 at 15:33 -0400, Benjamin Marzinski wrote:
> On Thu, Oct 03, 2024 at 11:15:21PM +0200, Martin Wilck wrote:
> > On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> > > Add a new optional checker class function, libcheck_need_wait()
> > > and a
> > > new function to call it, checker_need_wait(). This can be used to
> > > see
> > > if
> > > a path_checker is currently running. This will be used to
> > > determine
> > > if
> > > there are pending checkers that need to be waited for.
> > >
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > ---
> > > libmultipath/checkers.c | 12 +++++++++++-
> > > libmultipath/checkers.h | 4 +++-
> > > libmultipath/checkers/directio.c | 10 ++++++++++
> > > libmultipath/checkers/tur.c | 10 ++++++++++
> > > 4 files changed, 34 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
> > > index f3e98352..e2eda58d 100644
> > > --- a/libmultipath/checkers.c
> > > +++ b/libmultipath/checkers.c
> > > @@ -27,6 +27,7 @@ struct checker_class {
> > > void (*reset)(void); /* to reset the
> > > global
> > > variables */
> > > void *(*thread)(void *); /* async thread
> > > entry
> > > point */
> > > int (*pending)(struct checker *); /* to recheck
> > > pending
> > > paths */
> > > + bool (*need_wait)(struct checker *); /* checker needs
> > > waiting for */
> > > const char **msgtable;
> > > short msgtable_size;
> > > };
> > > @@ -182,7 +183,8 @@ static struct checker_class
> > > *add_checker_class(const char *name)
> > > c->reset = (void (*)(void)) dlsym(c->handle,
> > > "libcheck_reset");
> > > c->thread = (void *(*)(void*)) dlsym(c->handle,
> > > "libcheck_thread");
> > > c->pending = (int (*)(struct checker *)) dlsym(c-
> > > >handle,
> > > "libcheck_pending");
> > > - /* These 4 functions can be NULL. call dlerror() to
> > > clear
> > > out any
> > > + c->need_wait = (bool (*)(struct checker *)) dlsym(c-
> > > >handle,
> > > "libcheck_need_wait");
> > > + /* These 5 functions can be NULL. call dlerror() to
> > > clear
> > > out any
> > > * error string */
> > > dlerror();
> > >
> > > @@ -313,6 +315,14 @@ int checker_get_state(struct checker *c)
> > > return c->path_state;
> > > }
> > >
> > > +bool checker_need_wait(struct checker *c)
> > > +{
> > > + if (!c || !c->cls || c->path_state != PATH_PENDING ||
> > > + !c->cls->need_wait)
> > > + return false;
> > > + return c->cls->need_wait(c);
> > > +}
> > > +
> > > void checker_check (struct checker * c, int path_state)
> > > {
> > > if (!c)
> > > diff --git a/libmultipath/checkers.h b/libmultipath/checkers.h
> > > index b2342a1b..da91f499 100644
> > > --- a/libmultipath/checkers.h
> > > +++ b/libmultipath/checkers.h
> > > @@ -2,6 +2,7 @@
> > > #define CHECKERS_H_INCLUDED
> > >
> > > #include <pthread.h>
> > > +#include <stdbool.h>
> > > #include "list.h"
> > > #include "defaults.h"
> > >
> > > @@ -171,6 +172,7 @@ struct checker_context {
> > > int start_checker_thread (pthread_t *thread, const
> > > pthread_attr_t
> > > *attr,
> > > struct checker_context *ctx);
> > > int checker_get_state(struct checker *c);
> > > +bool checker_need_wait(struct checker *c);
> > > void checker_check (struct checker *, int);
> > > int checker_is_sync(const struct checker *);
> > > const char *checker_name (const struct checker *);
> > > @@ -191,7 +193,7 @@ void *libcheck_thread(struct checker_context
> > > *ctx);
> > > void libcheck_reset(void);
> > > int libcheck_mp_init(struct checker *);
> > > int libcheck_pending(struct checker *c);
> > > -
> > > +bool libcheck_need_wait(struct checker *c);
> > >
> > > /*
> > > * msgid => message map.
> > > diff --git a/libmultipath/checkers/directio.c
> > > b/libmultipath/checkers/directio.c
> > > index 904e3071..4beed02e 100644
> > > --- a/libmultipath/checkers/directio.c
> > > +++ b/libmultipath/checkers/directio.c
> > > @@ -65,6 +65,7 @@ struct directio_context {
> > > struct aio_group *aio_grp;
> > > struct async_req *req;
> > > struct timespec endtime;
> > > + bool waited_for;
> > > };
> > >
> > > static struct aio_group *
> > > @@ -295,6 +296,7 @@ check_pending(struct directio_context *ct,
> > > struct
> > > timespec endtime)
> > > int r;
> > > struct timespec currtime, timeout;
> > >
> > > + ct->waited_for = true;
> >
> > Not sure if it's important, but strictly speaking, we have only
> > waited
> > for the checker if the endtime was in the future, which is often
> > not
> > the case. Otherwise we'd just have polled.
>
> But we always set the endtime in the future when we're starting a new
> request, so the first time the we call check_pending() we don't
> return
> till we either get an answer or endtime has passed, which is what we
> want waited_for to check (that we've give the checker till endtime to
> respond).
That holds for this patch. But right in the next patch, the code is
changed such that libcheck_pending doesn't actually wait at all. So
"ct->waited_for" is a really confusing name. It just means that we have
polled for the state once. In theory that can happen very quickly after
starting the checker.
Regards
Martin
^ permalink raw reply [flat|nested] 40+ messages in thread* Re: [PATCH v2 19/22] libmultipath: add libcheck_need_wait checker function
2024-10-09 15:49 ` Martin Wilck
@ 2024-10-14 17:48 ` Benjamin Marzinski
2024-10-14 21:08 ` Benjamin Marzinski
0 siblings, 1 reply; 40+ messages in thread
From: Benjamin Marzinski @ 2024-10-14 17:48 UTC (permalink / raw)
To: Martin Wilck; +Cc: Christophe Varoqui, device-mapper development
On Wed, Oct 09, 2024 at 05:49:33PM +0200, Martin Wilck wrote:
> On Tue, 2024-10-08 at 15:33 -0400, Benjamin Marzinski wrote:
> > On Thu, Oct 03, 2024 at 11:15:21PM +0200, Martin Wilck wrote:
> > > On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
<snip>
> > > > diff --git a/libmultipath/checkers/directio.c
> > > > b/libmultipath/checkers/directio.c
> > > > index 904e3071..4beed02e 100644
> > > > --- a/libmultipath/checkers/directio.c
> > > > +++ b/libmultipath/checkers/directio.c
> > > > @@ -65,6 +65,7 @@ struct directio_context {
> > > > struct aio_group *aio_grp;
> > > > struct async_req *req;
> > > > struct timespec endtime;
> > > > + bool waited_for;
> > > > };
> > > >
> > > > static struct aio_group *
> > > > @@ -295,6 +296,7 @@ check_pending(struct directio_context *ct,
> > > > struct
> > > > timespec endtime)
> > > > int r;
> > > > struct timespec currtime, timeout;
> > > >
> > > > + ct->waited_for = true;
> > >
> > > Not sure if it's important, but strictly speaking, we have only
> > > waited
> > > for the checker if the endtime was in the future, which is often
> > > not
> > > the case. Otherwise we'd just have polled.
> >
> > But we always set the endtime in the future when we're starting a new
> > request, so the first time the we call check_pending() we don't
> > return
> > till we either get an answer or endtime has passed, which is what we
> > want waited_for to check (that we've give the checker till endtime to
> > respond).
>
> That holds for this patch. But right in the next patch, the code is
> changed such that libcheck_pending doesn't actually wait at all. So
> "ct->waited_for" is a really confusing name. It just means that we have
> polled for the state once. In theory that can happen very quickly after
> starting the checker.
Oh. Yeah. I can change the of the variable to something like
is_unpolled, and the calling function to checker_is_unpolled(), which
matches what we're actually asking for.
-Ben
> Regards
> Martin
^ permalink raw reply [flat|nested] 40+ messages in thread* Re: [PATCH v2 19/22] libmultipath: add libcheck_need_wait checker function
2024-10-14 17:48 ` Benjamin Marzinski
@ 2024-10-14 21:08 ` Benjamin Marzinski
0 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-10-14 21:08 UTC (permalink / raw)
To: Martin Wilck; +Cc: Christophe Varoqui, device-mapper development
On Mon, Oct 14, 2024 at 01:48:14PM -0400, Benjamin Marzinski wrote:
> On Wed, Oct 09, 2024 at 05:49:33PM +0200, Martin Wilck wrote:
> > On Tue, 2024-10-08 at 15:33 -0400, Benjamin Marzinski wrote:
> > > On Thu, Oct 03, 2024 at 11:15:21PM +0200, Martin Wilck wrote:
> > > > On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
>
> <snip>
>
> > > > > diff --git a/libmultipath/checkers/directio.c
> > > > > b/libmultipath/checkers/directio.c
> > > > > index 904e3071..4beed02e 100644
> > > > > --- a/libmultipath/checkers/directio.c
> > > > > +++ b/libmultipath/checkers/directio.c
> > > > > @@ -65,6 +65,7 @@ struct directio_context {
> > > > > struct aio_group *aio_grp;
> > > > > struct async_req *req;
> > > > > struct timespec endtime;
> > > > > + bool waited_for;
> > > > > };
> > > > >
> > > > > static struct aio_group *
> > > > > @@ -295,6 +296,7 @@ check_pending(struct directio_context *ct,
> > > > > struct
> > > > > timespec endtime)
> > > > > int r;
> > > > > struct timespec currtime, timeout;
> > > > >
> > > > > + ct->waited_for = true;
> > > >
> > > > Not sure if it's important, but strictly speaking, we have only
> > > > waited
> > > > for the checker if the endtime was in the future, which is often
> > > > not
> > > > the case. Otherwise we'd just have polled.
> > >
> > > But we always set the endtime in the future when we're starting a new
> > > request, so the first time the we call check_pending() we don't
> > > return
> > > till we either get an answer or endtime has passed, which is what we
> > > want waited_for to check (that we've give the checker till endtime to
> > > respond).
> >
> > That holds for this patch. But right in the next patch, the code is
> > changed such that libcheck_pending doesn't actually wait at all. So
> > "ct->waited_for" is a really confusing name. It just means that we have
> > polled for the state once. In theory that can happen very quickly after
> > starting the checker.
>
> Oh. Yeah. I can change the of the variable to something like
> is_unpolled, and the calling function to checker_is_unpolled(), which
> matches what we're actually asking for.
Actually, I'm going to leave the calling functon as checker_need_wait(),
since the question we're asking the checker is whether or not we need to
wait. Even if the checker is unpolled, if it's not pending, there's no
point in waiting. Inside the checkers, I'll change the variable name to
checked_state, which makes more sense than is_unpolled in the initial
patch, where you're still waiting instead of just polling.
-Ben
>
> -Ben
>
> > Regards
> > Martin
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v2 20/22] libmultipath: don't wait in libcheck_pending
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (18 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 19/22] libmultipath: add libcheck_need_wait checker function Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 21/22] multipathd: wait for checkers to complete Benjamin Marzinski
` (3 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Disable waiting in the libcheck_pending() checker class functions.
Instead, they now just check if the checker has already completed.
A future patch will re-add waiting outside of libcheck_pending().
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/checkers/directio.c | 41 ++++++++++++++++----------------
libmultipath/checkers/tur.c | 20 +++-------------
2 files changed, 23 insertions(+), 38 deletions(-)
diff --git a/libmultipath/checkers/directio.c b/libmultipath/checkers/directio.c
index 4beed02e..f37b0d39 100644
--- a/libmultipath/checkers/directio.c
+++ b/libmultipath/checkers/directio.c
@@ -64,7 +64,6 @@ struct directio_context {
int reset_flags;
struct aio_group *aio_grp;
struct async_req *req;
- struct timespec endtime;
bool waited_for;
};
@@ -291,18 +290,17 @@ get_events(struct aio_group *aio_grp, struct timespec *timeout)
}
static void
-check_pending(struct directio_context *ct, struct timespec endtime)
+check_pending(struct directio_context *ct, struct timespec timeout)
{
int r;
- struct timespec currtime, timeout;
+ struct timespec endtime, currtime;
ct->waited_for = true;
+ get_monotonic_time(&endtime);
+ endtime.tv_sec += timeout.tv_sec;
+ endtime.tv_nsec += timeout.tv_nsec;
+ normalize_timespec(&endtime);
while(1) {
- get_monotonic_time(&currtime);
- timespecsub(&endtime, &currtime, &timeout);
- if (timeout.tv_sec < 0)
- timeout.tv_sec = timeout.tv_nsec = 0;
-
r = get_events(ct->aio_grp, &timeout);
if (ct->req->state != PATH_PENDING) {
@@ -311,6 +309,10 @@ check_pending(struct directio_context *ct, struct timespec endtime)
} else if (r == 0 ||
(timeout.tv_sec == 0 && timeout.tv_nsec == 0))
return;
+ get_monotonic_time(&currtime);
+ timespecsub(&endtime, &currtime, &timeout);
+ if (timeout.tv_sec < 0)
+ timeout.tv_sec = timeout.tv_nsec = 0;
}
}
@@ -320,7 +322,7 @@ check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
struct stat sb;
int rc;
struct io_event event;
- struct timespec endtime;
+ struct timespec timeout = { .tv_sec = timeout_secs };
if (fstat(fd, &sb) == 0) {
LOG(4, "called for %x", (unsigned) sb.st_rdev);
@@ -345,20 +347,13 @@ check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
LOG(3, "io_submit error %i", -rc);
return PATH_UNCHECKED;
}
- get_monotonic_time(&ct->endtime);
- ct->endtime.tv_nsec += 1000 * 1000;
- normalize_timespec(&ct->endtime);
ct->waited_for = false;
}
ct->running++;
if (!sync)
return PATH_PENDING;
- get_monotonic_time(&endtime);
- endtime.tv_sec += timeout_secs;
- normalize_timespec(&endtime);
-
- check_pending(ct, endtime);
+ check_pending(ct, timeout);
if (ct->req->state != PATH_PENDING)
return ct->req->state;
@@ -401,13 +396,16 @@ int libcheck_pending(struct checker *c)
int rc;
struct io_event event;
struct directio_context *ct = (struct directio_context *)c->context;
+ struct timespec no_wait = { .tv_sec = 0 };
/* The if path checker isn't running, just return the exiting value. */
- if (!ct || !ct->running)
- return c->path_state;
+ if (!ct || !ct->running) {
+ rc = c->path_state;
+ goto out;
+ }
if (ct->req->state == PATH_PENDING)
- check_pending(ct, ct->endtime);
+ check_pending(ct, no_wait);
else
ct->running = 0;
rc = ct->req->state;
@@ -420,8 +418,9 @@ int libcheck_pending(struct checker *c)
else
LOG(4, "async io pending");
}
- set_msgid(c, rc);
+out:
+ set_msgid(c, rc);
return rc;
}
diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c
index 41d6b9c3..5d606708 100644
--- a/libmultipath/checkers/tur.c
+++ b/libmultipath/checkers/tur.c
@@ -58,7 +58,6 @@ struct tur_checker_context {
int msgid;
struct checker_context ctx;
unsigned int nr_timeouts;
- struct timespec endtime;
bool waited_for;
};
@@ -299,14 +298,6 @@ void *libcheck_thread(struct checker_context *ctx)
return ((void *)0);
}
-
-static void tur_timeout(struct timespec *tsp)
-{
- get_monotonic_time(tsp);
- tsp->tv_nsec += 1000 * 1000; /* 1 millisecond */
- normalize_timespec(tsp);
-}
-
static void tur_set_async_timeout(struct checker *c)
{
struct tur_checker_context *ct = c->context;
@@ -328,16 +319,12 @@ static int tur_check_async_timeout(struct checker *c)
int check_pending(struct checker *c)
{
struct tur_checker_context *ct = c->context;
- int r, tur_status = PATH_PENDING;
+ int tur_status = PATH_PENDING;
pthread_mutex_lock(&ct->lock);
- for (r = 0;
- r == 0 && ct->state == PATH_PENDING &&
- ct->msgid == MSG_TUR_RUNNING;
- r = pthread_cond_timedwait(&ct->active, &ct->lock, &ct->endtime));
-
- if (!r) {
+ if (ct->state != PATH_PENDING || ct->msgid != MSG_TUR_RUNNING)
+ {
tur_status = ct->state;
c->msgid = ct->msgid;
}
@@ -487,7 +474,6 @@ int libcheck_check(struct checker * c)
" sync mode", major(ct->devt), minor(ct->devt));
return tur_check(c->fd, c->timeout, &c->msgid);
}
- tur_timeout(&ct->endtime);
}
return tur_status;
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 21/22] multipathd: wait for checkers to complete
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (19 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 20/22] libmultipath: don't wait in libcheck_pending Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-12 21:49 ` [PATCH v2 22/22] multipath-tools tests: fix up directio tests Benjamin Marzinski
` (2 subsequent siblings)
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Multipath again waits for running checkers to complete. In pathinfo(),
mutipath will just wait 1ms if the checker is running. In checkerloop(),
if there are running checkers, multipathd will drop the vecs lock and
wait for 5ms. The difference it wait times is because multipathd cannot
drop the vecs lock in pathinfo.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
libmultipath/discovery.c | 9 ++++++++-
libmultipath/libmultipath.version | 1 +
multipathd/main.c | 12 +++++++++++-
3 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 6ccdfa0b..1d48c30a 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -2467,8 +2467,15 @@ int pathinfo(struct path *pp, struct config *conf, int mask)
if (mask & DI_CHECKER) {
if (path_state == PATH_UP) {
int newstate = PATH_UNCHECKED;
- if (start_checker(pp, conf, 0, path_state) == 0)
+ if (start_checker(pp, conf, 0, path_state) == 0) {
+ if (checker_need_wait(&pp->checker)) {
+ struct timespec wait = {
+ .tv_nsec = 1000 * 1000,
+ };
+ nanosleep(&wait, NULL);
+ }
newstate = get_state(pp);
+ }
if (newstate != PATH_PENDING ||
pp->state == PATH_UNCHECKED ||
pp->state == PATH_WILD)
diff --git a/libmultipath/libmultipath.version b/libmultipath/libmultipath.version
index 6439d3a7..c2e5f552 100644
--- a/libmultipath/libmultipath.version
+++ b/libmultipath/libmultipath.version
@@ -62,6 +62,7 @@ global:
checker_enable;
checker_message;
checker_name;
+ checker_need_wait;
checker_state_name;
check_foreign;
cleanup_bindings;
diff --git a/multipathd/main.c b/multipathd/main.c
index 5e68e470..5bac76ae 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2848,6 +2848,7 @@ update_uninitialized_path(struct vectors * vecs, struct path * pp)
enum checker_state {
CHECKER_STARTING,
CHECKER_CHECKING_PATHS,
+ CHECKER_WAITING_FOR_PATHS,
CHECKER_UPDATING_PATHS,
CHECKER_FINISHED,
};
@@ -2859,6 +2860,7 @@ check_paths(struct vectors *vecs, unsigned int ticks)
struct timespec diff_time, start_time, end_time;
struct path *pp;
int i;
+ bool need_wait = false;
get_monotonic_time(&start_time);
@@ -2869,6 +2871,9 @@ check_paths(struct vectors *vecs, unsigned int ticks)
pp->is_checked = check_path(pp, ticks);
else
pp->is_checked = check_uninitialized_path(pp, ticks);
+ if (pp->is_checked == CHECK_PATH_STARTED &&
+ checker_need_wait(&pp->checker))
+ need_wait = true;
if (++paths_checked % 128 == 0 &&
(lock_has_waiters(&vecs->lock) || waiting_clients())) {
get_monotonic_time(&end_time);
@@ -2877,7 +2882,7 @@ check_paths(struct vectors *vecs, unsigned int ticks)
return CHECKER_CHECKING_PATHS;
}
}
- return CHECKER_UPDATING_PATHS;
+ return need_wait ? CHECKER_WAITING_FOR_PATHS : CHECKER_UPDATING_PATHS;
}
static enum checker_state
@@ -2999,6 +3004,11 @@ checkerloop (void *ap)
if (checker_state != CHECKER_FINISHED) {
/* Yield to waiters */
struct timespec wait = { .tv_nsec = 10000, };
+ if (checker_state == CHECKER_WAITING_FOR_PATHS) {
+ /* wait 5ms */
+ wait.tv_nsec = 5 * 1000 * 1000;
+ checker_state = CHECKER_UPDATING_PATHS;
+ }
nanosleep(&wait, NULL);
}
}
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* [PATCH v2 22/22] multipath-tools tests: fix up directio tests
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (20 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 21/22] multipathd: wait for checkers to complete Benjamin Marzinski
@ 2024-09-12 21:49 ` Benjamin Marzinski
2024-09-13 9:30 ` [PATCH v2 00/22] Yet Another path checker refactor Martin Wilck
2024-10-03 21:23 ` Martin Wilck
23 siblings, 0 replies; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-12 21:49 UTC (permalink / raw)
To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck
Make the directio tests work with no waiting in the libcheck_pending()
and the new libcheck_need_wait() call.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
tests/directio.c | 49 +++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 46 insertions(+), 3 deletions(-)
diff --git a/tests/directio.c b/tests/directio.c
index 2e22f831..5ee482a8 100644
--- a/tests/directio.c
+++ b/tests/directio.c
@@ -313,23 +313,28 @@ static void test_init_reset_init(void **state)
struct checker c = {.cls = NULL};
struct aio_group *aio_grp, *tmp_grp;
+ assert_int_equal(libcheck_need_wait(&c), false);
assert_true(list_empty(&aio_grp_list));
will_return(__wrap_io_setup, 0);
do_libcheck_init(&c, 4096, 0, NULL);
+ assert_int_equal(libcheck_need_wait(&c), false);
aio_grp = get_aio_grp(&c);
check_aio_grp(aio_grp, 1, 0);
list_for_each_entry(tmp_grp, &aio_grp_list, node)
assert_ptr_equal(aio_grp, tmp_grp);
libcheck_free(&c);
+ assert_int_equal(libcheck_need_wait(&c), false);
check_aio_grp(aio_grp, 0, 0);
do_libcheck_reset(1);
will_return(__wrap_io_setup, 0);
do_libcheck_init(&c, 4096, 0, NULL);
+ assert_int_equal(libcheck_need_wait(&c), false);
aio_grp = get_aio_grp(&c);
check_aio_grp(aio_grp, 1, 0);
list_for_each_entry(tmp_grp, &aio_grp_list, node)
assert_ptr_equal(aio_grp, tmp_grp);
libcheck_free(&c);
+ assert_int_equal(libcheck_need_wait(&c), false);
check_aio_grp(aio_grp, 0, 0);
do_libcheck_reset(1);
}
@@ -433,6 +438,7 @@ static void test_check_state_simple(void **state)
do_libcheck_init(&c, 4096, 30, &req);
return_io_getevents_nr(NULL, 1, &req, &res);
do_check_state(&c, 1, PATH_UP);
+ assert_int_equal(libcheck_need_wait(&c), false);
libcheck_free(&c);
do_libcheck_reset(1);
}
@@ -465,17 +471,25 @@ static void test_check_state_async_timeout(void **state)
do_libcheck_init(&c, 4096, 3, NULL);
aio_grp = get_aio_grp(&c);
do_check_state(&c, 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c), true);
return_io_getevents_none();
do_libcheck_pending(&c, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c), false);
do_check_state(&c, 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c), false);
return_io_getevents_none();
do_libcheck_pending(&c, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c), false);
do_check_state(&c, 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c), false);
return_io_getevents_none();
do_libcheck_pending(&c, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c), false);
do_check_state(&c, 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c), false);
return_io_getevents_none();
do_libcheck_pending(&c, PATH_DOWN);
+ assert_int_equal(libcheck_need_wait(&c), false);
check_aio_grp(aio_grp, 1, 0);
libcheck_free(&c);
do_libcheck_reset(1);
@@ -496,11 +510,16 @@ static void test_free_with_pending(void **state)
aio_grp = get_aio_grp(c);
do_check_state(&c[0], 0, PATH_PENDING);
do_check_state(&c[1], 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[0]), true);
+ assert_int_equal(libcheck_need_wait(&c[1]), true);
return_io_getevents_none();
do_libcheck_pending(&c[0], PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[0]), false);
+ assert_int_equal(libcheck_need_wait(&c[1]), true);
return_io_getevents_nr(NULL, 1, &req, &res);
- return_io_getevents_none();
do_libcheck_pending(&c[1], PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[0]), false);
+ assert_int_equal(libcheck_need_wait(&c[1]), false);
assert_true(is_checker_running(&c[0]));
assert_true(is_checker_running(&c[1]));
check_aio_grp(aio_grp, 2, 0);
@@ -523,8 +542,10 @@ static void test_orphaned_aio_group(void **state)
for (i = 0; i < AIO_GROUP_SIZE; i++) {
do_libcheck_init(&c[i], 4096, 30, NULL);
do_check_state(&c[i], 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[i]), true);
return_io_getevents_none();
do_libcheck_pending(&c[i], PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[i]), false);
}
aio_grp = get_aio_grp(c);
check_aio_grp(aio_grp, AIO_GROUP_SIZE, 0);
@@ -560,15 +581,19 @@ static void test_timeout_cancel_failed(void **state)
aio_grp = get_aio_grp(c);
return_io_getevents_none();
do_check_state(&c[0], 1, PATH_DOWN);
+ assert_int_equal(libcheck_need_wait(&c[0]), false);
assert_true(is_checker_running(&c[0]));
check_aio_grp(aio_grp, 2, 0);
return_io_getevents_none();
do_check_state(&c[0], 1, PATH_DOWN);
+ assert_int_equal(libcheck_need_wait(&c[0]), false);
assert_true(is_checker_running(&c[0]));
return_io_getevents_nr(NULL, 1, &reqs[0], &res[0]);
return_io_getevents_nr(NULL, 1, &reqs[1], &res[1]);
do_check_state(&c[1], 1, PATH_UP);
+ assert_int_equal(libcheck_need_wait(&c[1]), false);
do_check_state(&c[0], 1, PATH_UP);
+ assert_int_equal(libcheck_need_wait(&c[0]), false);
for (i = 0; i < 2; i++) {
assert_false(is_checker_running(&c[i]));
libcheck_free(&c[i]);
@@ -590,11 +615,15 @@ static void test_async_timeout_cancel_failed(void **state)
for (i = 0; i < 2; i++)
do_libcheck_init(&c[i], 4096, 2, &reqs[i]);
do_check_state(&c[0], 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[0]), true);
do_check_state(&c[1], 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[1]), true);
return_io_getevents_none();
do_libcheck_pending(&c[0], PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[0]), false);
return_io_getevents_none();
do_libcheck_pending(&c[1], PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[1]), false);
do_check_state(&c[0], 0, PATH_PENDING);
do_check_state(&c[1], 0, PATH_PENDING);
return_io_getevents_none();
@@ -605,10 +634,12 @@ static void test_async_timeout_cancel_failed(void **state)
do_check_state(&c[1], 0, PATH_PENDING);
return_io_getevents_none();
do_libcheck_pending(&c[0], PATH_DOWN);
+ assert_int_equal(libcheck_need_wait(&c[0]), false);
if (!test_dev) {
/* can't pick which even gets returned on real devices */
return_io_getevents_nr(NULL, 1, &reqs[1], &res[1]);
do_libcheck_pending(&c[1], PATH_UP);
+ assert_int_equal(libcheck_need_wait(&c[1]), false);
}
do_check_state(&c[0], 0, PATH_PENDING);
return_io_getevents_none();
@@ -641,14 +672,19 @@ static void test_orphan_checker_cleanup(void **state)
do_libcheck_init(&c[i], 4096, 30, &reqs[i]);
aio_grp = get_aio_grp(c);
do_check_state(&c[0], 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[0]), true);
return_io_getevents_none();
do_libcheck_pending(&c[0], PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[0]), false);
check_aio_grp(aio_grp, 2, 0);
libcheck_free(&c[0]);
+ assert_int_equal(libcheck_need_wait(&c[0]), false);
check_aio_grp(aio_grp, 2, 1);
do_check_state(&c[1], 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[1]), true);
return_io_getevents_nr(NULL, 2, reqs, res);
do_libcheck_pending(&c[1], PATH_UP);
+ assert_int_equal(libcheck_need_wait(&c[1]), false);
check_aio_grp(aio_grp, 1, 0);
libcheck_free(&c[1]);
check_aio_grp(aio_grp, 0, 0);
@@ -667,8 +703,10 @@ static void test_orphan_reset_cleanup(void **state)
do_libcheck_init(&c, 4096, 30, NULL);
orphan_aio_grp = get_aio_grp(&c);
do_check_state(&c, 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c), true);
return_io_getevents_none();
do_libcheck_pending(&c, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c), false);
check_aio_grp(orphan_aio_grp, 1, 0);
libcheck_free(&c);
check_aio_grp(orphan_aio_grp, 1, 1);
@@ -704,6 +742,7 @@ static void test_check_state_blksize(void **state)
for (i = 0; i < 3; i++) {
return_io_getevents_nr(NULL, 1, &reqs[i], &res[i]);
do_check_state(&c[i], 1, chk_state[i]);
+ assert_int_equal(libcheck_need_wait(&c[i]), false);
}
for (i = 0; i < 3; i++) {
assert_false(is_checker_running(&c[i]));
@@ -727,18 +766,22 @@ static void test_check_state_async(void **state)
do_libcheck_init(&c[i], 4096, 30, &reqs[i]);
for (i = 0; i < 256; i++) {
do_check_state(&c[i], 0, PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[i]), true);
return_io_getevents_none();
do_libcheck_pending(&c[i], PATH_PENDING);
+ assert_int_equal(libcheck_need_wait(&c[i]), false);
assert_true(is_checker_running(&c[i]));
}
do_check_state(&c[256], 0, PATH_PENDING);
- return_io_getevents_nr(&full_timeout, 256, reqs, res);
- return_io_getevents_nr(NULL, 1, &reqs[256], &res[256]);
+ assert_int_equal(libcheck_need_wait(&c[256]), true);
+ return_io_getevents_nr(&full_timeout, 257, reqs, res);
do_libcheck_pending(&c[256], PATH_UP);
+ assert_int_equal(libcheck_need_wait(&c[256]), false);
assert_false(is_checker_running(&c[256]));
libcheck_free(&c[256]);
for (i = 0; i < 256; i++) {
do_check_state(&c[i], 0, PATH_UP);
+ assert_int_equal(libcheck_need_wait(&c[i]), false);
assert_false(is_checker_running(&c[i]));
libcheck_free(&c[i]);
}
--
2.45.0
^ permalink raw reply related [flat|nested] 40+ messages in thread* Re: [PATCH v2 00/22] Yet Another path checker refactor
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (21 preceding siblings ...)
2024-09-12 21:49 ` [PATCH v2 22/22] multipath-tools tests: fix up directio tests Benjamin Marzinski
@ 2024-09-13 9:30 ` Martin Wilck
2024-09-16 21:11 ` Benjamin Marzinski
2024-10-03 21:23 ` Martin Wilck
23 siblings, 1 reply; 40+ messages in thread
From: Martin Wilck @ 2024-09-13 9:30 UTC (permalink / raw)
To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development
On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> This patchset cleans up some of the ugly code from my last refactor,
> and
> speeds up multipathd path checking, by refactoring how paths are
> checked.
>
I haven't started reviewing this yet, but I'm seeing a CI problem with
the sysfs test, only with clang 18 on Fedora 40:
https://github.com/openSUSE/multipath-tools/actions/runs/10845899189
Martin
^ permalink raw reply [flat|nested] 40+ messages in thread* Re: [PATCH v2 00/22] Yet Another path checker refactor
2024-09-13 9:30 ` [PATCH v2 00/22] Yet Another path checker refactor Martin Wilck
@ 2024-09-16 21:11 ` Benjamin Marzinski
2024-09-17 10:13 ` Martin Wilck
0 siblings, 1 reply; 40+ messages in thread
From: Benjamin Marzinski @ 2024-09-16 21:11 UTC (permalink / raw)
To: Martin Wilck; +Cc: Christophe Varoqui, device-mapper development
On Fri, Sep 13, 2024 at 11:30:23AM +0200, Martin Wilck wrote:
> On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> > This patchset cleans up some of the ugly code from my last refactor,
> > and
> > speeds up multipathd path checking, by refactoring how paths are
> > checked.
> >
>
> I haven't started reviewing this yet, but I'm seeing a CI problem with
> the sysfs test, only with clang 18 on Fedora 40:
> https://github.com/openSUSE/multipath-tools/actions/runs/10845899189
I'm not sure what this is and I've been entirely unable to reproduce it
using Fedora 40 and clang 18.1.8 (the current f40 version of clang). It
seems pretty obvious that I has nothing to do with my patchset. It
looks like it's related to the renaming open() issues. Given the
"%s() has remaining non-returned values."
and
"__wrap___open64_2: '%s' parameter still has values that haven't been
checked."
messages, I'm guessing something other than __open64_2() was called. But
like I said, I can't reproduce this.
-Ben
>
> Martin
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2 00/22] Yet Another path checker refactor
2024-09-16 21:11 ` Benjamin Marzinski
@ 2024-09-17 10:13 ` Martin Wilck
0 siblings, 0 replies; 40+ messages in thread
From: Martin Wilck @ 2024-09-17 10:13 UTC (permalink / raw)
To: Benjamin Marzinski; +Cc: Christophe Varoqui, device-mapper development
On Mon, 2024-09-16 at 17:11 -0400, Benjamin Marzinski wrote:
>
> I'm not sure what this is and I've been entirely unable to reproduce
> it
> using Fedora 40 and clang 18.1.8 (the current f40 version of clang).
> It
> seems pretty obvious that I has nothing to do with my patchset. It
> looks like it's related to the renaming open() issues. Given the
>
> "%s() has remaining non-returned values."
>
> and
>
> "__wrap___open64_2: '%s' parameter still has values that haven't been
> checked."
>
> messages, I'm guessing something other than __open64_2() was called.
> But
> like I said, I can't reproduce this.
OK, I'll double check. I don't recall seeing this before your patch
set, but that can have multiple reasons.
Martin
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2 00/22] Yet Another path checker refactor
2024-09-12 21:49 [PATCH v2 00/22] Yet Another path checker refactor Benjamin Marzinski
` (22 preceding siblings ...)
2024-09-13 9:30 ` [PATCH v2 00/22] Yet Another path checker refactor Martin Wilck
@ 2024-10-03 21:23 ` Martin Wilck
23 siblings, 0 replies; 40+ messages in thread
From: Martin Wilck @ 2024-10-03 21:23 UTC (permalink / raw)
To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development
On Thu, 2024-09-12 at 17:49 -0400, Benjamin Marzinski wrote:
> This patchset cleans up some of the ugly code from my last refactor,
> and
> speeds up multipathd path checking, by refactoring how paths are
> checked.
>
> multipathd previously ran the checker on a path, waited briefly for
> it
> to complete, and if it did, updated the path and the priorities for
> the
> multipath device and possibly reloaded it.
>
> This had multiple issues.
> 1. All the paths waited for their checkers to complete serially. This
> wasted time with no benefit. It also meant that it wasn't that
> uncommon
> for a checker to not finish in time, and get marked pending.
>
> 2. The prioritiers could get run on paths multiple times during a
> checker loop if multiple paths were restored at the same time, which
> is
> a common occurance.
>
> 3. In an effort to keep a multipath device's state consistent despite
> the delays introduced by waiting for the path checkers, paths were
> checked by multipath device. However checking the paths could involve
> reloading a multipath device table, or even removing a multipath
> device.
> This added some ugly backout and retry logic to the path check code.
>
> With this patchset, the code now first starts the path checkers on
> all
> devices due for a path check. If there are path checkers that need
> waiting it then unocks the vecs lock and waits 5ms. Next, it goes
> back
> and updates the state of all the path devices. Once all of the paths
> have been updated, it goes through each multipath device, updating
> path
> priorities and reloading the device as necessary.
>
> This allows multipathd to spend less time and do less redundant work
> while checking paths, while making paths more likely to not spend a
> checker cycle marked as pending. Since multipathd drops the vecs lock
> while waiting, uevents and user commands can be processed while its
> waiting.
>
> Since there isn't a delay waiting for the previous checker before
> starting the next one, the path checker code has reverted to checking
> all paths in pathvec order, instead of by multipath device. Updating
> the
> paths in pathvec order simplifies the code, since the multipath
> devices
> can change during path updates. Starting the checkers in pathvec
> order
> keeps a path from having its checker started towards the end of the
> list
> of paths but checking for the result towards the start of the list of
> paths.
>
> Changes since v1:
>
> The original patches have been updated based on suggestions from
> Martin
> Wilck (patches listed using their original numbering):
>
> - Moved checker path_state initialization code from 06 to 01
> - Check for NULL before dereference in checker_get_state() in 04
> - Moved adding start_checker to version file from 07 to 06
> - Renamed check_path_state() to get_new_state() in 08. updated 09,
> 11,
> and 12 to deal with the name change.
> - Added new patch before 13 ('fix "fail path" and "reinstate path"
> commands') to fix issues with manually failing and reinstating
> paths
> between when the checkers are run and the paths are updated
> - Fixed check in update_paths in 13
> - Split patch 13 into three patches, one to move priority updating
> out
> of the individual path updating code, and two others to clean up
> some
> of the logic about updating priorities and path groups.
>
> After the now 18 patches from the original patchset, four new patches
> have been added to move the waiting for pending paths out from
> checker_get_state() and into the existing waiting code in
> checkerloop().
> Doing the waiting outside of the checkers (in fact, outside of the
> vecs lock completely) means that we can't wait till a checker has
> completed. We must just pick a time and wait for all of it. On
> Martin's
> suggestion I picked 5ms. Since multipathd isn't sleeping with the
> vecs
> lock held, it's much less critical to do this quickly.
>
> For checker run in pathinfo, I kept the previous timeout of 1ms, but
> I
> waffled on whether to wait at all here, and I'd be perfectly fine
> with
> removing the wait altogether, and just let the paths be pending when
> called from multipathd with async checkers.
>
> Also, I've just added these patches to the end of the patchset to
> make
> it easier to see the changes from v1, but if people want less churn
> in
> the checker code commits, I'd be find with resending this patchset
> with
> these patches refactored into the earlier patches.
Except for the few minor nits in my previous replies:
Reviewed-by: Martin Wilck <mwilck@suse.com>
^ permalink raw reply [flat|nested] 40+ messages in thread