* [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
  2014-07-09 19:18 [Drbd-dev] [PATCH] Fix DRBD regression Philipp Reisner
@ 2014-07-09 19:18 ` Philipp Reisner
  2014-07-10  9:07   ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Philipp Reisner @ 2014-07-09 19:18 UTC (permalink / raw)
  To: linux-kernel, Jens Axboe; +Cc: drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Since linux kernel 3.13, kthread_run() internally uses
wait_for_completion_killable().  We sometimes may use kthread_run()
while we still have a signal pending, which we used to kick our threads
out of potentially blocking network functions, causing kthread_run() to
mistake that as a new fatal signal and fail.

Fix: flush_signals() before kthread_run().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
 drivers/block/drbd/drbd_nl.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 1b35c45..3f2e167 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -544,6 +544,12 @@ void conn_try_outdate_peer_async(struct drbd_connection *connection)
 	struct task_struct *opa;
 
 	kref_get(&connection->kref);
+	/* We may just have force_sig()'ed this thread
+	 * to get it out of some blocking network function.
+	 * Clear signals; otherwise kthread_run(), which internally uses
+	 * wait_on_completion_killable(), will mistake our pending signal
+	 * for a new fatal signal and fail. */
+	flush_signals(current);
 	opa = kthread_run(_try_outdate_peer_async, connection, "drbd_async_h");
 	if (IS_ERR(opa)) {
 		drbd_err(connection, "out of mem, failed to invoke fence-peer helper\n");
-- 
1.9.1
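
Editorial note: the failure mode described in the commit message above can
be illustrated outside the kernel. What follows is a minimal user-space
sketch in C, not DRBD or kernel code. It assumes SIGUSR1 as a stand-in for
the signal DRBD uses to kick its threads; start_helper_killable() is a
hypothetical stand-in for the killable wait inside kthread_run(), and
flush_kick_signal() plays the role of flush_signals(current). All function
names in the sketch are invented for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Build a signal set containing only the "kick" signal (SIGUSR1 here). */
static void kick_set(sigset_t *set)
{
	sigemptyset(set);
	sigaddset(set, SIGUSR1);
}

/* Block SIGUSR1 so that raising it leaves it pending instead of
 * running a handler or terminating the process. */
static int block_kick_signal(void)
{
	sigset_t set;

	kick_set(&set);
	return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* True if a SIGUSR1 is still pending for the caller. */
static bool kick_signal_pending(void)
{
	sigset_t pending;

	if (sigpending(&pending) != 0)
		return false;
	return sigismember(&pending, SIGUSR1) == 1;
}

/* Hypothetical stand-in for a killable wait: it gives up with -EINTR
 * as soon as it sees a pending signal, even a stale one. */
static int start_helper_killable(void)
{
	if (kick_signal_pending())
		return -EINTR;
	return 0;
}

/* User-space analogue of flush_signals(current): consume any pending
 * SIGUSR1 so that later killable waits do not trip over it. */
static void flush_kick_signal(void)
{
	sigset_t set;
	int sig;

	kick_set(&set);
	while (kick_signal_pending()) {
		int rc = sigwait(&set, &sig);

		assert(rc == 0);
		(void)rc;
	}
}
```

After block_kick_signal() and raise(SIGUSR1), start_helper_killable()
returns -EINTR until flush_kick_signal() has run; afterwards it returns 0.
The patch applies the same ordering: clear the stale signal first, then
start the helper thread.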



* Re: [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
  2014-07-09 19:18 ` [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper' Philipp Reisner
@ 2014-07-10  9:07   ` Jens Axboe
  2014-07-10  9:53     ` Philipp Reisner
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2014-07-10  9:07 UTC (permalink / raw)
  To: Philipp Reisner, linux-kernel; +Cc: drbd-dev

On 2014-07-09 21:18, Philipp Reisner wrote:
> From: Lars Ellenberg <lars.ellenberg@linbit.com>
>
> Since linux kernel 3.13, kthread_run() internally uses
> wait_for_completion_killable().  We sometimes may use kthread_run()
> while we still have a signal pending, which we used to kick our threads
> out of potentially blocking network functions, causing kthread_run() to
> mistake that as a new fatal signal and fail.
>
> Fix: flush_signals() before kthread_run().

Applied - should this have been marked stable, if it affects 3.13+ kernels?

-- 
Jens Axboe



* Re: [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
  2014-07-10  9:07   ` Jens Axboe
@ 2014-07-10  9:53     ` Philipp Reisner
  2014-07-10  9:55       ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Philipp Reisner @ 2014-07-10  9:53 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-kernel, drbd-dev

On Thursday, 10 July 2014, 11:07:22, Jens Axboe wrote:
> On 2014-07-09 21:18, Philipp Reisner wrote:
> > From: Lars Ellenberg <lars.ellenberg@linbit.com>
> > 
> > Since linux kernel 3.13, kthread_run() internally uses
> > wait_for_completion_killable().  We sometimes may use kthread_run()
> > while we still have a signal pending, which we used to kick our threads
> > out of potentially blocking network functions, causing kthread_run() to
> > mistake that as a new fatal signal and fail.
> > 
> > Fix: flush_signals() before kthread_run().
> 
> > Applied - should this have been marked stable, if it affects 3.13+ kernels?

Yes, you are right. It should go to the stable kernels since 3.13 as
well.

What is the correct way for me to mark it as stable when sending a patch?

-phil



* Re: [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
  2014-07-10  9:53     ` Philipp Reisner
@ 2014-07-10  9:55       ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2014-07-10  9:55 UTC (permalink / raw)
  To: Philipp Reisner; +Cc: linux-kernel, drbd-dev

On 2014-07-10 11:53, Philipp Reisner wrote:
> On Thursday, 10 July 2014, 11:07:22, Jens Axboe wrote:
>> On 2014-07-09 21:18, Philipp Reisner wrote:
>>> From: Lars Ellenberg <lars.ellenberg@linbit.com>
>>>
>>> Since linux kernel 3.13, kthread_run() internally uses
>>> wait_for_completion_killable().  We sometimes may use kthread_run()
>>> while we still have a signal pending, which we used to kick our threads
>>> out of potentially blocking network functions, causing kthread_run() to
>>> mistake that as a new fatal signal and fail.
>>>
>>> Fix: flush_signals() before kthread_run().
>>
>> Applied - should this have been marked stable, if it affects 3.13+ kernels?
>
> Yes, you are right. It should go to the stable kernels since 3.13 as
> well.

Alright, we'll have to notify Greg/stable when it goes in.

> What is the correct way for me to mark it as stable when sending a patch?

You just add a:

Cc: stable@kernel.org

where the signed-off-by is. If you know the versions it should be 
applied to, you can add that information as well. For this case, you 
would have done:

Cc: stable@kernel.org # v3.13+

to get it into 3.13 stable and later.

-- 
Jens Axboe



* [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
       [not found] <026a6017e1b052f58cf908fc2f63aea7@de.mcbf.net>
@ 2014-10-01  9:32 ` Philipp Reisner
  2014-10-05 23:47   ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Philipp Reisner @ 2014-10-01  9:32 UTC (permalink / raw)
  To: stable; +Cc: Jens Axboe, David Mohr, drbd-dev

From: Lars Ellenberg <lars.ellenberg@linbit.com>

Stable info:
  This patch landed in upstream with v3.16 as commit
  bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9
  it should go into v3.14+

Since linux kernel 3.13, kthread_run() internally uses
wait_for_completion_killable().  We sometimes may use kthread_run()
while we still have a signal pending, which we used to kick our threads
out of potentially blocking network functions, causing kthread_run() to
mistake that as a new fatal signal and fail.

Fix: flush_signals() before kthread_run().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 drivers/block/drbd/drbd_nl.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 1b35c45..3f2e167 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -544,6 +544,12 @@ void conn_try_outdate_peer_async(struct drbd_connection *connection)
        struct task_struct *opa;

        kref_get(&connection->kref);
+       /* We may just have force_sig()'ed this thread
+        * to get it out of some blocking network function.
+        * Clear signals; otherwise kthread_run(), which internally uses
+        * wait_on_completion_killable(), will mistake our pending signal
+        * for a new fatal signal and fail. */
+       flush_signals(current);
        opa = kthread_run(_try_outdate_peer_async, connection, "drbd_async_h");
        if (IS_ERR(opa)) {
                drbd_err(connection, "out of mem, failed to invoke fence-peer helper\n");
-- 
1.9.1



* Re: [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
  2014-10-01  9:32 ` [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper' Philipp Reisner
@ 2014-10-05 23:47   ` Greg KH
  2014-10-07 15:33     ` Lars Ellenberg
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2014-10-05 23:47 UTC (permalink / raw)
  To: Philipp Reisner; +Cc: Jens Axboe, David Mohr, stable, drbd-dev

On Wed, Oct 01, 2014 at 11:32:29AM +0200, Philipp Reisner wrote:
> From: Lars Ellenberg <lars.ellenberg@linbit.com>
> 
> Stable info:
>   This patch landed in upstream with v3.16 as commit
>   bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9
>   it should go into v3.14+
> 
> Since linux kernel 3.13, kthread_run() internally uses
> wait_for_completion_killable().  We sometimes may use kthread_run()
> while we still have a signal pending, which we used to kick our threads
> out of potentially blocking network functions, causing kthread_run() to
> mistake that as a new fatal signal and fail.
> 
> Fix: flush_signals() before kthread_run().
> 
> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
> Signed-off-by: Jens Axboe <axboe@fb.com>
> ---
>  drivers/block/drbd/drbd_nl.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
> index 1b35c45..3f2e167 100644
> --- a/drivers/block/drbd/drbd_nl.c
> +++ b/drivers/block/drbd/drbd_nl.c
> @@ -544,6 +544,12 @@ void conn_try_outdate_peer_async(struct drbd_connection *connection)
>         struct task_struct *opa;
> 
>         kref_get(&connection->kref);
> +       /* We may just have force_sig()'ed this thread
> +        * to get it out of some blocking network function.
> +        * Clear signals; otherwise kthread_run(), which internally uses
> +        * wait_on_completion_killable(), will mistake our pending signal
> +        * for a new fatal signal and fail. */
> +       flush_signals(current);
>         opa = kthread_run(_try_outdate_peer_async, connection, "drbd_async_h");
>         if (IS_ERR(opa)) {
>                 drbd_err(connection, "out of mem, failed to invoke fence-peer helper\n");

This doesn't apply to 3.16-stable or 3.14-stable, can you please provide
a working backport?

thanks,


greg k-h


* Re: [Drbd-dev] [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
  2014-10-05 23:47   ` Greg KH
@ 2014-10-07 15:33     ` Lars Ellenberg
  0 siblings, 0 replies; 7+ messages in thread
From: Lars Ellenberg @ 2014-10-07 15:33 UTC (permalink / raw)
  To: Greg KH; +Cc: Jens Axboe, David Mohr, Philipp Reisner, stable, drbd-dev

On Sun, Oct 05, 2014 at 04:47:01PM -0700, Greg KH wrote:
> On Wed, Oct 01, 2014 at 11:32:29AM +0200, Philipp Reisner wrote:
> > From: Lars Ellenberg <lars.ellenberg@linbit.com>
> > 
> > Stable info:
> >   This patch landed in upstream with v3.16 as commit
> >   bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9
> >   it should go into v3.14+
> > 
> > Since linux kernel 3.13, kthread_run() internally uses
> > wait_for_completion_killable().  We sometimes may use kthread_run()
> > while we still have a signal pending, which we used to kick our threads
> > out of potentially blocking network functions, causing kthread_run() to
> > mistake that as a new fatal signal and fail.
> > 
> > Fix: flush_signals() before kthread_run().
> > 
> > Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
> > Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
> > Signed-off-by: Jens Axboe <axboe@fb.com>
> > ---
> >  drivers/block/drbd/drbd_nl.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
> > index 1b35c45..3f2e167 100644
> > --- a/drivers/block/drbd/drbd_nl.c
> > +++ b/drivers/block/drbd/drbd_nl.c
> > @@ -544,6 +544,12 @@ void conn_try_outdate_peer_async(struct drbd_connection *connection)
> >         struct task_struct *opa;
> > 
> >         kref_get(&connection->kref);
> > +       /* We may just have force_sig()'ed this thread
> > +        * to get it out of some blocking network function.
> > +        * Clear signals; otherwise kthread_run(), which internally uses
> > +        * wait_on_completion_killable(), will mistake our pending signal
> > +        * for a new fatal signal and fail. */
> > +       flush_signals(current);
> >         opa = kthread_run(_try_outdate_peer_async, connection, "drbd_async_h");
> >         if (IS_ERR(opa)) {
> >                 drbd_err(connection, "out of mem, failed to invoke fence-peer helper\n");
> 
> This doesn't apply to 3.16-stable or 3.14-stable, can you please provide
> a working backport?

There was a rename of "tconn" to "connection" between 3.14 and 3.15.
Other than that, this code has not changed.
The patch below applies to 3.13 and 3.14 stable as of today.

	Lars

8<----

From a82efa2adeb992b5ded798b01b4567bc07b6ab1b Mon Sep 17 00:00:00 2001
From: Lars Ellenberg <lars.ellenberg@linbit.com>
Date: Tue, 7 Oct 2014 17:20:27 +0200
Subject: [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer
 helper'

Stable info:
  This patch landed in upstream with v3.16 as commit
  bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9
  it should go into v3.13+

Since linux kernel 3.13, kthread_run() internally uses
wait_for_completion_killable().  We sometimes may use kthread_run()
while we still have a signal pending, which we used to kick our threads
out of potentially blocking network functions, causing kthread_run() to
mistake that as a new fatal signal and fail.

Fix: flush_signals() before kthread_run().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
---
 drivers/block/drbd/drbd_nl.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index c706d50..8c16c2f 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -525,6 +525,12 @@ void conn_try_outdate_peer_async(struct drbd_tconn *tconn)
 	struct task_struct *opa;
 
 	kref_get(&tconn->kref);
+	/* We may just have force_sig()'ed this thread
+	 * to get it out of some blocking network function.
+	 * Clear signals; otherwise kthread_run(), which internally uses
+	 * wait_on_completion_killable(), will mistake our pending signal
+	 * for a new fatal signal and fail. */
+	flush_signals(current);
 	opa = kthread_run(_try_outdate_peer_async, tconn, "drbd_async_h");
 	if (IS_ERR(opa)) {
 		conn_err(tconn, "out of mem, failed to invoke fence-peer helper\n");
-- 
1.9.1


