* [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
From: Philipp Reisner @ 2014-10-01 9:32 UTC
To: stable
Cc: Jens Axboe, David Mohr, drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Stable info:
This patch landed in upstream with v3.16 as commit
bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9
it should go into v3.14+

Since linux kernel 3.13, kthread_run() internally uses
wait_for_completion_killable(). We sometimes may use kthread_run()
while we still have a signal pending, which we used to kick our threads
out of potentially blocking network functions, causing kthread_run() to
mistake that as a new fatal signal and fail.

Fix: flush_signals() before kthread_run().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 drivers/block/drbd/drbd_nl.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 1b35c45..3f2e167 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -544,6 +544,12 @@ void conn_try_outdate_peer_async(struct drbd_connection *connection)
 	struct task_struct *opa;
 
 	kref_get(&connection->kref);
+	/* We may just have force_sig()'ed this thread
+	 * to get it out of some blocking network function.
+	 * Clear signals; otherwise kthread_run(), which internally uses
+	 * wait_on_completion_killable(), will mistake our pending signal
+	 * for a new fatal signal and fail. */
+	flush_signals(current);
 	opa = kthread_run(_try_outdate_peer_async, connection, "drbd_async_h");
 	if (IS_ERR(opa)) {
 		drbd_err(connection, "out of mem, failed to invoke fence-peer helper\n");
-- 
1.9.1
* Re: [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
From: Greg KH @ 2014-10-05 23:47 UTC
To: Philipp Reisner
Cc: Jens Axboe, David Mohr, stable, drbd-dev

On Wed, Oct 01, 2014 at 11:32:29AM +0200, Philipp Reisner wrote:
> From: Lars Ellenberg <lars.ellenberg@linbit.com>
>
> Stable info:
> This patch landed in upstream with v3.16 as commit
> bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9
> it should go into v3.14+
>
> Since linux kernel 3.13, kthread_run() internally uses
> wait_for_completion_killable(). We sometimes may use kthread_run()
> while we still have a signal pending, which we used to kick our threads
> out of potentially blocking network functions, causing kthread_run() to
> mistake that as a new fatal signal and fail.
>
> Fix: flush_signals() before kthread_run().
>
> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
> Signed-off-by: Jens Axboe <axboe@fb.com>
> ---
>  drivers/block/drbd/drbd_nl.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
> index 1b35c45..3f2e167 100644
> --- a/drivers/block/drbd/drbd_nl.c
> +++ b/drivers/block/drbd/drbd_nl.c
> @@ -544,6 +544,12 @@ void conn_try_outdate_peer_async(struct drbd_connection *connection)
>  	struct task_struct *opa;
>
>  	kref_get(&connection->kref);
> +	/* We may just have force_sig()'ed this thread
> +	 * to get it out of some blocking network function.
> +	 * Clear signals; otherwise kthread_run(), which internally uses
> +	 * wait_on_completion_killable(), will mistake our pending signal
> +	 * for a new fatal signal and fail. */
> +	flush_signals(current);
>  	opa = kthread_run(_try_outdate_peer_async, connection, "drbd_async_h");
>  	if (IS_ERR(opa)) {
>  		drbd_err(connection, "out of mem, failed to invoke fence-peer helper\n");

This doesn't apply to 3.16-stable or 3.14-stable, can you please provide
a working backport?

thanks,

greg k-h
* Re: [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
From: Lars Ellenberg @ 2014-10-07 15:33 UTC
To: Greg KH
Cc: Jens Axboe, David Mohr, Philipp Reisner, stable, drbd-dev

On Sun, Oct 05, 2014 at 04:47:01PM -0700, Greg KH wrote:
> On Wed, Oct 01, 2014 at 11:32:29AM +0200, Philipp Reisner wrote:
> > From: Lars Ellenberg <lars.ellenberg@linbit.com>
> >
> > Stable info:
> > This patch landed in upstream with v3.16 as commit
> > bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9
> > it should go into v3.14+
> >
> > Since linux kernel 3.13, kthread_run() internally uses
> > wait_for_completion_killable(). We sometimes may use kthread_run()
> > while we still have a signal pending, which we used to kick our threads
> > out of potentially blocking network functions, causing kthread_run() to
> > mistake that as a new fatal signal and fail.
> >
> > Fix: flush_signals() before kthread_run().
> >
> > Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
> > Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
> > Signed-off-by: Jens Axboe <axboe@fb.com>
> > ---
> >  drivers/block/drbd/drbd_nl.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
> > index 1b35c45..3f2e167 100644
> > --- a/drivers/block/drbd/drbd_nl.c
> > +++ b/drivers/block/drbd/drbd_nl.c
> > @@ -544,6 +544,12 @@ void conn_try_outdate_peer_async(struct drbd_connection *connection)
> >  	struct task_struct *opa;
> >
> >  	kref_get(&connection->kref);
> > +	/* We may just have force_sig()'ed this thread
> > +	 * to get it out of some blocking network function.
> > +	 * Clear signals; otherwise kthread_run(), which internally uses
> > +	 * wait_on_completion_killable(), will mistake our pending signal
> > +	 * for a new fatal signal and fail. */
> > +	flush_signals(current);
> >  	opa = kthread_run(_try_outdate_peer_async, connection, "drbd_async_h");
> >  	if (IS_ERR(opa)) {
> >  		drbd_err(connection, "out of mem, failed to invoke fence-peer helper\n");
>
> This doesn't apply to 3.16-stable or 3.14-stable, can you please provide
> a working backport?

There was a rename of "tconn" to "connection" between 3.14 and 3.15.
Other than that, this has not changed.

The patch below applies to 3.13 and 3.14 stable as of today.

	Lars

8<----
From a82efa2adeb992b5ded798b01b4567bc07b6ab1b Mon Sep 17 00:00:00 2001
From: Lars Ellenberg <lars.ellenberg@linbit.com>
Date: Tue, 7 Oct 2014 17:20:27 +0200
Subject: [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'

Stable info:
This patch landed in upstream with v3.16 as commit
bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9
it should go into v3.13+

Since linux kernel 3.13, kthread_run() internally uses
wait_for_completion_killable(). We sometimes may use kthread_run()
while we still have a signal pending, which we used to kick our threads
out of potentially blocking network functions, causing kthread_run() to
mistake that as a new fatal signal and fail.

Fix: flush_signals() before kthread_run().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 drivers/block/drbd/drbd_nl.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index c706d50..8c16c2f 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -525,6 +525,12 @@ void conn_try_outdate_peer_async(struct drbd_tconn *tconn)
 	struct task_struct *opa;
 
 	kref_get(&tconn->kref);
+	/* We may just have force_sig()'ed this thread
+	 * to get it out of some blocking network function.
+	 * Clear signals; otherwise kthread_run(), which internally uses
+	 * wait_on_completion_killable(), will mistake our pending signal
+	 * for a new fatal signal and fail. */
+	flush_signals(current);
 	opa = kthread_run(_try_outdate_peer_async, tconn, "drbd_async_h");
 	if (IS_ERR(opa)) {
 		conn_err(tconn, "out of mem, failed to invoke fence-peer helper\n");
-- 
1.9.1
* [Drbd-dev] [PATCH] Fix DRBD regression
From: Philipp Reisner @ 2014-07-09 19:18 UTC
To: linux-kernel, Jens Axboe
Cc: drbd-dev

Hi Jens,

In 3.13 the commit 786235eeb 'kthread: make kthread_create() killable'
broke DRBD's fence peer mechanism in a subtle way. Since only a part
of our user base has fencing properly configured, this regression went
unnoticed until now.

Please consider submitting this for 3.16-rc5.

Thanks,
 Phil

Lars Ellenberg (1):
  drbd: fix regression 'out of mem, failed to invoke fence-peer helper'

 drivers/block/drbd/drbd_nl.c | 6 ++++++
 1 file changed, 6 insertions(+)

-- 
1.9.1
* [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
From: Philipp Reisner @ 2014-07-09 19:18 UTC
To: linux-kernel, Jens Axboe
Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Since linux kernel 3.13, kthread_run() internally uses
wait_for_completion_killable(). We sometimes may use kthread_run()
while we still have a signal pending, which we used to kick our threads
out of potentially blocking network functions, causing kthread_run() to
mistake that as a new fatal signal and fail.

Fix: flush_signals() before kthread_run().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_nl.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 1b35c45..3f2e167 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -544,6 +544,12 @@ void conn_try_outdate_peer_async(struct drbd_connection *connection)
 	struct task_struct *opa;
 
 	kref_get(&connection->kref);
+	/* We may just have force_sig()'ed this thread
+	 * to get it out of some blocking network function.
+	 * Clear signals; otherwise kthread_run(), which internally uses
+	 * wait_on_completion_killable(), will mistake our pending signal
+	 * for a new fatal signal and fail. */
+	flush_signals(current);
 	opa = kthread_run(_try_outdate_peer_async, connection, "drbd_async_h");
 	if (IS_ERR(opa)) {
 		drbd_err(connection, "out of mem, failed to invoke fence-peer helper\n");
-- 
1.9.1
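[Editor's sketch, not part of the thread.] The failure mode described in the patch can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, the "task" is a plain struct with one flag, and nothing here is kernel code or the DRBD implementation. It models just the ordering issue: a wait that gives up when a signal is already pending makes thread creation fail, and clearing the signal first avoids that.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the calling thread's signal state. */
struct toy_task {
	bool signal_pending;	/* set when the thread was kicked by a signal */
};

/* Toy "killable" wait: gives up (returns -1) if a signal is already pending. */
static int toy_wait_killable(const struct toy_task *caller)
{
	return caller->signal_pending ? -1 : 0;
}

/* Toy thread creation: fails whenever the killable wait is interrupted. */
static bool toy_kthread_run(const struct toy_task *caller)
{
	return toy_wait_killable(caller) == 0;
}

/* Toy flush: drops the pending signal, in the spirit of flush_signals(). */
static void toy_flush_signals(struct toy_task *caller)
{
	caller->signal_pending = false;
}

/* Starts the helper, optionally clearing stale signals first.
 * Returns true if the (toy) helper thread could be started. */
static bool toy_start_helper(struct toy_task *caller, bool flush_first)
{
	if (flush_first)
		toy_flush_signals(caller);
	return toy_kthread_run(caller);
}
```

With `signal_pending` set, `toy_start_helper(&t, false)` fails, which corresponds to the misleading "out of mem" message in the real code path; `toy_start_helper(&t, true)` succeeds because the stale signal is cleared before the wait.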
* Re: [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
From: Jens Axboe @ 2014-07-10 9:07 UTC
To: Philipp Reisner, linux-kernel
Cc: drbd-dev

On 2014-07-09 21:18, Philipp Reisner wrote:
> From: Lars Ellenberg <lars.ellenberg@linbit.com>
>
> Since linux kernel 3.13, kthread_run() internally uses
> wait_for_completion_killable(). We sometimes may use kthread_run()
> while we still have a signal pending, which we used to kick our threads
> out of potentially blocking network functions, causing kthread_run() to
> mistake that as a new fatal signal and fail.
>
> Fix: flush_signals() before kthread_run().

Applied - should this have been marked stable, if it affects 3.13+ kernels?

-- 
Jens Axboe
* Re: [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
From: Philipp Reisner @ 2014-07-10 9:53 UTC
To: Jens Axboe
Cc: linux-kernel, drbd-dev

On Thursday, 10 July 2014, 11:07:22, Jens Axboe wrote:
> On 2014-07-09 21:18, Philipp Reisner wrote:
> > From: Lars Ellenberg <lars.ellenberg@linbit.com>
> >
> > Since linux kernel 3.13, kthread_run() internally uses
> > wait_for_completion_killable(). We sometimes may use kthread_run()
> > while we still have a signal pending, which we used to kick our threads
> > out of potentially blocking network functions, causing kthread_run() to
> > mistake that as a new fatal signal and fail.
> >
> > Fix: flush_signals() before kthread_run().
>
> Applied - should this have been marked stable, if it affects 3.13+ kernels?

Yes, you are right. It should go to the stable kernels since 3.13 as
well.

What is the correct way for me to mark it as stable when sending a patch?

-phil
* Re: [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
From: Jens Axboe @ 2014-07-10 9:55 UTC
To: Philipp Reisner
Cc: linux-kernel, drbd-dev

On 2014-07-10 11:53, Philipp Reisner wrote:
> On Thursday, 10 July 2014, 11:07:22, Jens Axboe wrote:
>> On 2014-07-09 21:18, Philipp Reisner wrote:
>>> From: Lars Ellenberg <lars.ellenberg@linbit.com>
>>>
>>> Since linux kernel 3.13, kthread_run() internally uses
>>> wait_for_completion_killable(). We sometimes may use kthread_run()
>>> while we still have a signal pending, which we used to kick our threads
>>> out of potentially blocking network functions, causing kthread_run() to
>>> mistake that as a new fatal signal and fail.
>>>
>>> Fix: flush_signals() before kthread_run().
>>
>> Applied - should this have been marked stable, if it affects 3.13+ kernels?
>
> Yes, you are right. It should go to the stable kernels since 3.13 as
> well.

Alright, we'll have to notify Greg/stable when it goes in.

> What is the correct way for me to mark it as stable when sending a patch?

You just add a:

Cc: stable@kernel.org

where the signed-off-by is. If you know the versions it should be
applied to, you can add that information as well. For this case, you
would have done:

Cc: stable@kernel.org # v3.13+

to get it into 3.13 stable and later.

-- 
Jens Axboe
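[Editor's sketch, not part of the thread.] Following Jens's description, the trailer block of the July patch might have looked roughly like the fragment below. This is an illustration assembled from the sign-offs in the posted patch and the tag Jens quotes; the placement of the Cc line above the sign-offs is an assumption. Note that current kernel documentation gives the address as stable@vger.kernel.org; the fragment keeps the form used in the message above.

```
Fix: flush_signals() before kthread_run().

Cc: stable@kernel.org # v3.13+
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
```

The comment after `#` tells the stable maintainers which release series the fix is meant for, which is the information that had to be supplied by hand in the October "Stable info" resend.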
Thread overview: 7+ messages
[not found] <026a6017e1b052f58cf908fc2f63aea7@de.mcbf.net>
2014-10-01  9:32 ` [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper' Philipp Reisner
2014-10-05 23:47   ` Greg KH
2014-10-07 15:33     ` Lars Ellenberg
2014-07-09 19:18 [Drbd-dev] [PATCH] Fix DRBD regression Philipp Reisner
2014-07-09 19:18 ` [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper' Philipp Reisner
2014-07-10  9:07   ` Jens Axboe
2014-07-10  9:53     ` Philipp Reisner
2014-07-10  9:55       ` Jens Axboe