* [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm
@ 2017-12-14 11:02 Honggang LI
[not found] ` <20171214110241.4701-1-honli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Honggang LI @ 2017-12-14 11:02 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: honli-H+wXaHxf7aLQT0dZR+AlfA,
jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/
From: Honggang Li <honli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
[root@localhost ~]# /usr/sbin/ibsrpdm -c -d /dev/infiniband/umad0
id_ext=0002c90300317810,ioc_guid=0002c90300317810,dgid=fe800000000000000002c90300317811,pkey=ffff,service_id=0002c90300317810
[root@localhost ~]# echo $?
130
fd3005f0cd34 moves the signal handler setup from ibsrpdm path. So,
default signal handler will be used when it receives signal SIGINT.
ibsrpdm will exit with non-zero exit code as default signal handler
killed it. ibsrpdm should return with exit code zero, if no error
emerged.
main->ibsrpdm->free_res->pthread_kill(res->async_ev_thread, SIGINT)
Beside the wrong exit code, there is an extra blank line at the end of
output of ibsrpdm.
Fixes: fd3005f0cd34 ("srp_daemon: Move the setup of the wakeup_pipe after openlog")
Signed-off-by: Honggang Li <honli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
srp_daemon/srp_daemon.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/srp_daemon/srp_daemon.c b/srp_daemon/srp_daemon.c
index cec36db2..0d00d712 100644
--- a/srp_daemon/srp_daemon.c
+++ b/srp_daemon/srp_daemon.c
@@ -1996,7 +1996,6 @@ static void cleanup_wakeup_fd(void)
static int setup_wakeup_fd(void)
{
- struct sigaction sa = {};
int ret;
ret = pipe2(wakeup_pipe, O_NONBLOCK | O_CLOEXEC);
@@ -2005,11 +2004,6 @@ static int setup_wakeup_fd(void)
return -1;
}
- sigemptyset(&sa.sa_mask);
- sa.sa_handler = signal_handler;
- sigaction(SIGINT, &sa, NULL);
- sigaction(SIGTERM, &sa, NULL);
- sigaction(SRP_CATAS_ERR, &sa, NULL);
return 0;
}
@@ -2100,6 +2094,7 @@ int main(int argc, char *argv[])
int lockfd = -1;
int received_signal = 0;
bool systemd;
+ struct sigaction sa = {};
#ifndef __CHECKER__
/*
@@ -2117,6 +2112,12 @@ int main(int argc, char *argv[])
BUILD_ASSERT(sizeof(struct srp_dm_iou_info) == 132);
BUILD_ASSERT(sizeof(struct srp_dm_ioc_prof) == 128);
+ sigemptyset(&sa.sa_mask);
+ sa.sa_handler = signal_handler;
+ sigaction(SIGINT, &sa, NULL);
+ sigaction(SIGTERM, &sa, NULL);
+ sigaction(SRP_CATAS_ERR, &sa, NULL);
+
if (strcmp(argv[0] + max_t(int, 0, strlen(argv[0]) - strlen("ibsrpdm")),
"ibsrpdm") == 0) {
ret = ibsrpdm(argc, argv);
--
2.14.2
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm
[not found] ` <20171214110241.4701-1-honli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2017-12-14 14:59 ` Bart Van Assche
[not found] ` <1513263572.2986.2.camel-Sjgp3cTcYWE@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2017-12-14 14:59 UTC (permalink / raw)
To: honli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org
On Thu, 2017-12-14 at 19:02 +0800, Honggang LI wrote:
> fd3005f0cd34 moves the signal handler setup from ibsrpdm path. So,
> default signal handler will be used when it receives signal SIGINT.
> ibsrpdm will exit with non-zero exit code as default signal handler
> killed it.
Can you explain why you think that ibsrpdm needs a signal handler?
> ibsrpdm should return with exit code zero, if no error emerged.
That's correct, but if it was interrupted by a signal handler an error
code should be returned.
Bart.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm
[not found] ` <1513263572.2986.2.camel-Sjgp3cTcYWE@public.gmane.org>
@ 2017-12-15 1:36 ` Honggang LI
[not found] ` <20171215013628.GA743-Y5OA6DF/u0nid9cnFhDO8BcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Honggang LI @ 2017-12-15 1:36 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org
On Thu, Dec 14, 2017 at 02:59:34PM +0000, Bart Van Assche wrote:
> On Thu, 2017-12-14 at 19:02 +0800, Honggang LI wrote:
> > fd3005f0cd34 moves the signal handler setup from ibsrpdm path. So,
> > default signal handler will be used when it receives signal SIGINT.
> > ibsrpdm will exit with non-zero exit code as default signal handler
> > killed it.
>
> Can you explain why you think that ibsrpdm needs a signal handler?
main
ibsrpdm
alloc_res
pthread_create(&res->res.async_ev_thread [1]
....
free_res
if (res->async_ev_thread) { pthread_kill(res->async_ev_thread, SIGINT); [2]
The ibsrpdm program create a thread in [1], and send a SIGINT in [2].
The default behavior of SIGINT is to terminate the process.
According to the man page of pthread_kill:
Signal dispositions are process-wide: if a signal handler is installed,
the handler will be invoked in the thread _thread_, but if the disposition
of the signal is "stop", "continue", or "terminate", this action will
affect the whole process.
If we did not install a signal handler for ibsrpdm, ibsrpdm will be
terminated in [2], when it free the resource. ibsrpdm expected to return
to the main program after it free the resource. However, it failed as it
killed by default signal handler.
>
> > ibsrpdm should return with exit code zero, if no error emerged.
>
> That's correct, but if it was interrupted by a signal handler an error
> code should be returned.
>
> Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm
[not found] ` <20171215013628.GA743-Y5OA6DF/u0nid9cnFhDO8BcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
@ 2017-12-15 17:28 ` Jason Gunthorpe
[not found] ` <20171215172800.GA12434-uk2M96/98Pc@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2017-12-15 17:28 UTC (permalink / raw)
To: Honggang LI
Cc: Bart Van Assche,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Fri, Dec 15, 2017 at 09:36:28AM +0800, Honggang LI wrote:
> On Thu, Dec 14, 2017 at 02:59:34PM +0000, Bart Van Assche wrote:
> > On Thu, 2017-12-14 at 19:02 +0800, Honggang LI wrote:
> > > fd3005f0cd34 moves the signal handler setup from ibsrpdm path. So,
> > > default signal handler will be used when it receives signal SIGINT.
> > > ibsrpdm will exit with non-zero exit code as default signal handler
> > > killed it.
> >
> > Can you explain why you think that ibsrpdm needs a signal handler?
>
> main
> ibsrpdm
> alloc_res
> pthread_create(&res->res.async_ev_thread [1]
> ....
> free_res
> if (res->async_ev_thread) { pthread_kill(res->async_ev_thread, SIGINT); [2]
>
>
> The ibsrpdm program create a thread in [1], and send a SIGINT in [2].
> The default behavior of SIGINT is to terminate the process.
Yuk, no, using signals like this is horrifyingly buggy.
Here is a sketch on how to fix it properly. All the users of
pthread_kill should be eliminated.
Though overall, there is really no reason to even cleanup the threads,
just call exit?
diff --git a/srp_daemon/srp_daemon.c b/srp_daemon/srp_daemon.c
index cec36db2e0f12e..fe307705ad6a51 100644
--- a/srp_daemon/srp_daemon.c
+++ b/srp_daemon/srp_daemon.c
@@ -61,6 +61,7 @@
#include <string.h>
#include <signal.h>
#include <sys/syslog.h>
+#include <sys/eventfd.h>
#include <infiniband/umad.h>
#include <infiniband/umad_types.h>
#include <infiniband/umad_sa.h>
@@ -1887,7 +1888,9 @@ static void free_res(struct resources *res)
modify_qp_to_err(res->ud_res->qp);
if (res->reconnect_thread) {
- pthread_kill(res->reconnect_thread, SIGINT);
+ uint64_t val = 1;
+
+ write(res->sync_res->stop_event_fd, &val, sizeof(val));
pthread_join(res->reconnect_thread, &status);
}
if (res->async_ev_thread) {
@@ -1947,6 +1950,7 @@ static struct resources *alloc_res(void)
goto err;
}
+ res->sync_res.stop_event_fd = eventfd(0, EFD_CLOEXEC);
ret = pthread_create(&res->res.async_ev_thread, NULL,
run_thread_listen_to_events, &res->res);
if (ret)
diff --git a/srp_daemon/srp_daemon.h b/srp_daemon/srp_daemon.h
index 5d268ed395e17e..8cbee09d7c688b 100644
--- a/srp_daemon/srp_daemon.h
+++ b/srp_daemon/srp_daemon.h
@@ -245,6 +245,7 @@ enum {
};
struct sync_resources {
+ int stop_event_fd;
int stop_threads;
int next_task;
struct timespec next_recalc_time;
diff --git a/srp_daemon/srp_handle_traps.c b/srp_daemon/srp_handle_traps.c
index 6b36b15cc84c16..cedc2b0ab8e5af 100644
--- a/srp_daemon/srp_handle_traps.c
+++ b/srp_daemon/srp_handle_traps.c
@@ -44,6 +44,7 @@
#include <infiniband/verbs.h>
#include <infiniband/umad_sa.h>
#include <infiniband/umad_sm.h>
+#include <poll.h>
#include "srp_ib_types.h"
@@ -788,69 +789,94 @@ int register_to_traps(struct resources *res, int subscribe)
}
-void *run_thread_listen_to_events(void *res_in)
+static int do_async_event(struct resources *res)
{
- struct resources *res = (struct resources *)res_in;
struct ibv_async_event event;
- while (!stop_threads(res->sync_res)) {
- if (ibv_get_async_event(res->ud_res->ib_ctx, &event)) {
- if (errno != EINTR)
- pr_err("ibv_get_async_event failed (errno = %d)\n",
- errno);
- break;
+ if (ibv_get_async_event(res->ud_res->ib_ctx, &event)) {
+ if (errno != EINTR)
+ pr_err("ibv_get_async_event failed (errno = %d)\n",
+ errno);
+ return -1;
+ }
+
+ pr_debug("event_type %d, port %d\n", event.event_type,
+ event.element.port_num);
+
+ switch (event.event_type) {
+ case IBV_EVENT_PORT_ACTIVE:
+ case IBV_EVENT_SM_CHANGE:
+ case IBV_EVENT_LID_CHANGE:
+ case IBV_EVENT_CLIENT_REREGISTER:
+ case IBV_EVENT_PKEY_CHANGE:
+ if (event.element.port_num == config->port_num) {
+ pthread_mutex_lock(&res->sync_res->mutex);
+ __schedule_rescan(res->sync_res, 0);
+ wake_up_main_loop(0);
+ pthread_mutex_unlock(&res->sync_res->mutex);
}
+ break;
- pr_debug("event_type %d, port %d\n",
- event.event_type, event.element.port_num);
-
- switch (event.event_type) {
- case IBV_EVENT_PORT_ACTIVE:
- case IBV_EVENT_SM_CHANGE:
- case IBV_EVENT_LID_CHANGE:
- case IBV_EVENT_CLIENT_REREGISTER:
- case IBV_EVENT_PKEY_CHANGE:
- if (event.element.port_num == config->port_num) {
- pthread_mutex_lock(&res->sync_res->mutex);
- __schedule_rescan(res->sync_res, 0);
- wake_up_main_loop(0);
- pthread_mutex_unlock(&res->sync_res->mutex);
- }
- break;
-
- case IBV_EVENT_DEVICE_FATAL:
- case IBV_EVENT_CQ_ERR:
- case IBV_EVENT_QP_FATAL:
- /* clean and restart */
- pr_err("Critical event %d, raising catastrophic "
- "error signal\n", event.event_type);
- raise(SRP_CATAS_ERR);
- break;
+ case IBV_EVENT_DEVICE_FATAL:
+ case IBV_EVENT_CQ_ERR:
+ case IBV_EVENT_QP_FATAL:
+ /* clean and restart */
+ pr_err("Critical event %d, raising catastrophic "
+ "error signal\n",
+ event.event_type);
+ raise(SRP_CATAS_ERR);
+ break;
- /*
+ /*
- case IBV_EVENT_PORT_ERR:
- case IBV_EVENT_QP_REQ_ERR:
- case IBV_EVENT_QP_ACCESS_ERR:
- case IBV_EVENT_COMM_EST:
- case IBV_EVENT_SQ_DRAINED:
- case IBV_EVENT_PATH_MIG:
- case IBV_EVENT_PATH_MIG_ERR:
- case IBV_EVENT_SRQ_ERR:
- case IBV_EVENT_SRQ_LIMIT_REACHED:
- case IBV_EVENT_QP_LAST_WQE_REACHED:
+ case IBV_EVENT_PORT_ERR:
+ case IBV_EVENT_QP_REQ_ERR:
+ case IBV_EVENT_QP_ACCESS_ERR:
+ case IBV_EVENT_COMM_EST:
+ case IBV_EVENT_SQ_DRAINED:
+ case IBV_EVENT_PATH_MIG:
+ case IBV_EVENT_PATH_MIG_ERR:
+ case IBV_EVENT_SRQ_ERR:
+ case IBV_EVENT_SRQ_LIMIT_REACHED:
+ case IBV_EVENT_QP_LAST_WQE_REACHED:
- */
+ */
- default:
+ default:
+ break;
+ }
+
+ ibv_ack_async_event(&event);
+ return 0;
+}
+
+void *run_thread_listen_to_events(void *res_in)
+{
+ struct pollfd fds[2];
+ struct resources *res = (struct resources *)res_in;
+
+ fds[0].fd = res->ud_res->ib_ctx->async_fd;
+ fds[0].events = POLLIN;
+ fds[1].fd = res->sync_res->stop_event_fd;
+ fds[1].events = POLLIN;
+
+ while (1) {
+ if (poll(fds, 2, -1) == -1) {
+ if (errno == EINTR)
+ continue;
+ pr_err("ibv_get_async_event failed (errno = %d)\n",
+ errno);
break;
}
- ibv_ack_async_event(&event);
+ if (fds[1].revents & POLLIN)
+ break;
+ if (fds[0].revents & POLLIN)
+ if (do_async_event(res))
+ break;
}
return NULL;
}
-
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm
[not found] ` <20171215172800.GA12434-uk2M96/98Pc@public.gmane.org>
@ 2017-12-15 17:50 ` Bart Van Assche
[not found] ` <1513360253.2571.23.camel-Sjgp3cTcYWE@public.gmane.org>
2017-12-19 12:20 ` Honggang LI
1 sibling, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2017-12-15 17:50 UTC (permalink / raw)
To: honli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
jgg-uk2M96/98Pc@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1045 bytes --]
On Fri, 2017-12-15 at 10:28 -0700, Jason Gunthorpe wrote:
> Though overall, there is really no reason to even cleanup the threads,
> just call exit?
Memory leak detectors like the one in Valgrind only produce meaningful results
if threads are stopped cleanly before exit() is called.
> @@ -1887,7 +1888,9 @@ static void free_res(struct resources *res)
> modify_qp_to_err(res->ud_res->qp);
>
> if (res->reconnect_thread) {
> - pthread_kill(res->reconnect_thread, SIGINT);
> + uint64_t val = 1;
> +
> + write(res->sync_res->stop_event_fd, &val, sizeof(val));
> pthread_join(res->reconnect_thread, &status);
This patch does not remove all pthread_kill() calls. There are two other such
calls:
$ git grep -nHw pthread_kill
srp_daemon/srp_daemon.c:1897: pthread_kill(res->async_ev_thread, SIGINT);
srp_daemon/srp_daemon.c:1901: pthread_kill(res->trap_thread, SIGINT);
Bart.N§²æìr¸yúèØb²X¬¶Ç§vØ^)Þº{.nÇ+·¥{±Ù{ayº\x1dÊÚë,j\a¢f£¢·h»öì\x17/oSc¾Ú³9uÀ¦æåÈ&jw¨®\x03(éÝ¢j"ú\x1a¶^[m§ÿïêäz¹Þàþf£¢·h§~m
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm
[not found] ` <1513360253.2571.23.camel-Sjgp3cTcYWE@public.gmane.org>
@ 2017-12-15 17:59 ` Jason Gunthorpe
0 siblings, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2017-12-15 17:59 UTC (permalink / raw)
To: Bart Van Assche
Cc: honli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Fri, Dec 15, 2017 at 05:50:54PM +0000, Bart Van Assche wrote:
> On Fri, 2017-12-15 at 10:28 -0700, Jason Gunthorpe wrote:
> > Though overall, there is really no reason to even cleanup the threads,
> > just call exit?
>
> Memory leak detectors like the one in Valgrind only produce meaningful results
> if threads are stopped cleanly before exit() is called.
Fair enough
> > @@ -1887,7 +1888,9 @@ static void free_res(struct resources *res)
> > modify_qp_to_err(res->ud_res->qp);
> >
> > if (res->reconnect_thread) {
> > - pthread_kill(res->reconnect_thread, SIGINT);
> > + uint64_t val = 1;
> > +
> > + write(res->sync_res->stop_event_fd, &val, sizeof(val));
> > pthread_join(res->reconnect_thread, &status);
>
> This patch does not remove all pthread_kill() calls. There are two other such
> calls:
>
> $ git grep -nHw pthread_kill
> srp_daemon/srp_daemon.c:1897: pthread_kill(res->async_ev_thread, SIGINT);
> srp_daemon/srp_daemon.c:1901: pthread_kill(res->trap_thread, SIGINT);
Correct, this is why I called it a sketch :) I'm sure there are a few
mistakes too..
Hopefully Honggang can finish it and test it, as he found a real bug.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm
[not found] ` <20171215172800.GA12434-uk2M96/98Pc@public.gmane.org>
2017-12-15 17:50 ` Bart Van Assche
@ 2017-12-19 12:20 ` Honggang LI
[not found] ` <20171219122050.GA19682-Y5OA6DF/u0nid9cnFhDO8BcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
1 sibling, 1 reply; 10+ messages in thread
From: Honggang LI @ 2017-12-19 12:20 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bart Van Assche,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Fri, Dec 15, 2017 at 10:28:00AM -0700, Jason Gunthorpe wrote:
> On Fri, Dec 15, 2017 at 09:36:28AM +0800, Honggang LI wrote:
> > On Thu, Dec 14, 2017 at 02:59:34PM +0000, Bart Van Assche wrote:
> > > On Thu, 2017-12-14 at 19:02 +0800, Honggang LI wrote:
> > > > fd3005f0cd34 moves the signal handler setup from ibsrpdm path. So,
> > > > default signal handler will be used when it receives signal SIGINT.
> > > > ibsrpdm will exit with non-zero exit code as default signal handler
> > > > killed it.
> > >
> > > Can you explain why you think that ibsrpdm needs a signal handler?
> >
> > main
> > ibsrpdm
> > alloc_res
> > pthread_create(&res->res.async_ev_thread [1]
> > ....
> > free_res
> > if (res->async_ev_thread) { pthread_kill(res->async_ev_thread, SIGINT); [2]
> >
> >
> > The ibsrpdm program create a thread in [1], and send a SIGINT in [2].
> > The default behavior of SIGINT is to terminate the process.
>
> Yuk, no, using signals like this is horrifyingly buggy.
OK, I agree with you. The "signal_handler", which calls "wake_up_main_loop",
is not the right solution, because wakeup_pipe never used by ibsrpdm.
<snip>
void wake_up_main_loop(char ch)
{
int res;
assert(wakeup_pipe[1] >= 0); // this 'assert' will be failed
<snip>
> Here is a sketch on how to fix it properly. All the users of
> pthread_kill should be eliminated.
>
> Though overall, there is really no reason to even cleanup the threads,
> just call exit?
No, if one pthread just calls 'exit', the entire process will be
terminated immediately. So, we need to cleanup the threads.
I think the source of current issue is the async_ev_thread pthread.
We should *NOT* create such pthread for ibsrpdm.
I checked the old srptools git repo.
git://git.openfabrics.org/~bvanassche/srptools.git
Commit ab57a5b92eb3b8c9221f77235a028814a462d2cb merges "ibsrpdm" into
"srp_daemon". The old ibsrpdm program is a single thread program.
srp_daemon is multi-thread program.
As ibsrpdm always only run once, because "config->once" has been
assigned to 1. I would suggest to apply this patch.
diff --git a/srp_daemon/srp_daemon.c b/srp_daemon/srp_daemon.c
index cec36db2..4012a7db 100644
--- a/srp_daemon/srp_daemon.c
+++ b/srp_daemon/srp_daemon.c
@@ -1945,12 +1945,12 @@ static struct resources *alloc_res(void)
run_thread_get_trap_notices, &res->res);
if (ret)
goto err;
- }
- ret = pthread_create(&res->res.async_ev_thread, NULL,
- run_thread_listen_to_events, &res->res);
- if (ret)
- goto err;
+ ret = pthread_create(&res->res.async_ev_thread, NULL,
+ run_thread_listen_to_events, &res->res);
+ if (ret)
+ goto err;
+ }
if (config->retry_timeout && !config->once) {
ret = pthread_create(&res->res.reconnect_thread, NULL,
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm
[not found] ` <20171219122050.GA19682-Y5OA6DF/u0nid9cnFhDO8BcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
@ 2017-12-19 16:57 ` Bart Van Assche
[not found] ` <1513702653.2535.3.camel-Sjgp3cTcYWE@public.gmane.org>
2017-12-19 21:13 ` Jason Gunthorpe
1 sibling, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2017-12-19 16:57 UTC (permalink / raw)
To: honli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
jgg-uk2M96/98Pc@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, 2017-12-19 at 20:20 +0800, Honggang LI wrote:
> As ibsrpdm always only run once, because "config->once" has been
> assigned to 1. I would suggest to apply this patch.
>
> diff --git a/srp_daemon/srp_daemon.c b/srp_daemon/srp_daemon.c
> index cec36db2..4012a7db 100644
> --- a/srp_daemon/srp_daemon.c
> +++ b/srp_daemon/srp_daemon.c
> @@ -1945,12 +1945,12 @@ static struct resources *alloc_res(void)
> run_thread_get_trap_notices, &res->res);
> if (ret)
> goto err;
> - }
>
> - ret = pthread_create(&res->res.async_ev_thread, NULL,
> - run_thread_listen_to_events, &res->res);
> - if (ret)
> - goto err;
> + ret = pthread_create(&res->res.async_ev_thread, NULL,
> + run_thread_listen_to_events, &res->res);
> + if (ret)
> + goto err;
> + }
>
> if (config->retry_timeout && !config->once) {
> ret = pthread_create(&res->res.reconnect_thread, NULL,
Please submit this as a proper patch and add the following:
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Thanks,
Bart.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm
[not found] ` <1513702653.2535.3.camel-Sjgp3cTcYWE@public.gmane.org>
@ 2017-12-19 19:12 ` Honggang LI
0 siblings, 0 replies; 10+ messages in thread
From: Honggang LI @ 2017-12-19 19:12 UTC (permalink / raw)
To: Bart Van Assche
Cc: jgg-uk2M96/98Pc@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, Dec 19, 2017 at 04:57:35PM +0000, Bart Van Assche wrote:
> On Tue, 2017-12-19 at 20:20 +0800, Honggang LI wrote:
> > As ibsrpdm always only run once, because "config->once" has been
> > assigned to 1. I would suggest to apply this patch.
> >
> > diff --git a/srp_daemon/srp_daemon.c b/srp_daemon/srp_daemon.c
> > index cec36db2..4012a7db 100644
> > --- a/srp_daemon/srp_daemon.c
> > +++ b/srp_daemon/srp_daemon.c
> > @@ -1945,12 +1945,12 @@ static struct resources *alloc_res(void)
> > run_thread_get_trap_notices, &res->res);
> > if (ret)
> > goto err;
> > - }
> >
> > - ret = pthread_create(&res->res.async_ev_thread, NULL,
> > - run_thread_listen_to_events, &res->res);
> > - if (ret)
> > - goto err;
> > + ret = pthread_create(&res->res.async_ev_thread, NULL,
> > + run_thread_listen_to_events, &res->res);
> > + if (ret)
> > + goto err;
> > + }
> >
> > if (config->retry_timeout && !config->once) {
> > ret = pthread_create(&res->res.reconnect_thread, NULL,
>
> Please submit this as a proper patch and add the following:
[rdma-core PATCH] srp_daemon: Don't create async_ev_thread if only run once
I sent the patch with that $SUBJECT.
thanks
> Reviewed-by: Bart Van Assche <bart.vanassche-Sjgp3cTcYWE@public.gmane.org>
>
> Thanks,
>
> Bart.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm
[not found] ` <20171219122050.GA19682-Y5OA6DF/u0nid9cnFhDO8BcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
2017-12-19 16:57 ` Bart Van Assche
@ 2017-12-19 21:13 ` Jason Gunthorpe
1 sibling, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2017-12-19 21:13 UTC (permalink / raw)
To: Honggang LI
Cc: Bart Van Assche,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, Dec 19, 2017 at 08:20:50PM +0800, Honggang LI wrote:
> > Here is a sketch on how to fix it properly. All the users of
> > pthread_kill should be eliminated.
> >
> > Though overall, there is really no reason to even cleanup the threads,
> > just call exit?
>
> No, if one pthread just calls 'exit', the entire process will be
> terminated immediately. So, we need to cleanup the threads.
>
> I think the source of current issue is the async_ev_thread pthread.
> We should *NOT* create such pthread for ibsrpdm.
>
> I checked the old srptools git repo.
FYI, this is merged into rdma-core, use
git log -p --follow srp_daemon/srp_daemon.c
and it will show you full history on a single file.
> Commit ab57a5b92eb3b8c9221f77235a028814a462d2cb merges "ibsrpdm" into
> "srp_daemon". The old ibsrpdm program is a single thread program.
> srp_daemon is multi-thread program.
Makes sense that ibsrpdm does not need the
run_thread_listen_to_events() thread. Patch looks OK to me too.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2017-12-19 21:13 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-12-14 11:02 [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm Honggang LI
[not found] ` <20171214110241.4701-1-honli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-12-14 14:59 ` Bart Van Assche
[not found] ` <1513263572.2986.2.camel-Sjgp3cTcYWE@public.gmane.org>
2017-12-15 1:36 ` Honggang LI
[not found] ` <20171215013628.GA743-Y5OA6DF/u0nid9cnFhDO8BcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
2017-12-15 17:28 ` Jason Gunthorpe
[not found] ` <20171215172800.GA12434-uk2M96/98Pc@public.gmane.org>
2017-12-15 17:50 ` Bart Van Assche
[not found] ` <1513360253.2571.23.camel-Sjgp3cTcYWE@public.gmane.org>
2017-12-15 17:59 ` Jason Gunthorpe
2017-12-19 12:20 ` Honggang LI
[not found] ` <20171219122050.GA19682-Y5OA6DF/u0nid9cnFhDO8BcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
2017-12-19 16:57 ` Bart Van Assche
[not found] ` <1513702653.2535.3.camel-Sjgp3cTcYWE@public.gmane.org>
2017-12-19 19:12 ` Honggang LI
2017-12-19 21:13 ` Jason Gunthorpe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox