* [PATCH] Fix DRBD regression
@ 2014-07-09 19:18 Philipp Reisner
2014-07-09 19:18 ` [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper' Philipp Reisner
0 siblings, 1 reply; 5+ messages in thread
From: Philipp Reisner @ 2014-07-09 19:18 UTC (permalink / raw)
To: linux-kernel, Jens Axboe; +Cc: drbd-dev
Hi Jens,
In 3.13 the commit 786235eeb 'kthread: make kthread_create() killable'
broke DRBD's fence-peer mechanism in a subtle way. Since only part
of our user base has fencing properly configured, this regression went
unnoticed until now. Please consider submitting this for 3.16-rc5.
Thanks,
Phil
Lars Ellenberg (1):
drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
drivers/block/drbd/drbd_nl.c | 6 ++++++
1 file changed, 6 insertions(+)
--
1.9.1
* [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
2014-07-09 19:18 [PATCH] Fix DRBD regression Philipp Reisner
@ 2014-07-09 19:18 ` Philipp Reisner
2014-07-10 9:07 ` Jens Axboe
0 siblings, 1 reply; 5+ messages in thread
From: Philipp Reisner @ 2014-07-09 19:18 UTC (permalink / raw)
To: linux-kernel, Jens Axboe; +Cc: drbd-dev
From: Lars Ellenberg <lars.ellenberg@linbit.com>
Since linux kernel 3.13, kthread_run() internally uses
wait_for_completion_killable(). We sometimes may use kthread_run()
while we still have a signal pending, which we used to kick our threads
out of potentially blocking network functions, causing kthread_run() to
mistake that as a new fatal signal and fail.
Fix: flush_signals() before kthread_run().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---
drivers/block/drbd/drbd_nl.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 1b35c45..3f2e167 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -544,6 +544,12 @@ void conn_try_outdate_peer_async(struct drbd_connection *connection)
struct task_struct *opa;
kref_get(&connection->kref);
+ /* We may just have force_sig()'ed this thread
+ * to get it out of some blocking network function.
+ * Clear signals; otherwise kthread_run(), which internally uses
+ * wait_for_completion_killable(), will mistake our pending signal
+ * for a new fatal signal and fail. */
+ flush_signals(current);
opa = kthread_run(_try_outdate_peer_async, connection, "drbd_async_h");
if (IS_ERR(opa)) {
drbd_err(connection, "out of mem, failed to invoke fence-peer helper\n");
--
1.9.1
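The failure mode described in the patch above can be modelled in plain userspace C. This is only a toy sketch, not kernel code: every `toy_*` name and the `TOY_EFAIL` value are hypothetical stand-ins, and the model assumes nothing beyond what the commit message states, namely that thread creation fails when the caller still has a signal pending and succeeds once the signal has been flushed.

```c
#include <stdbool.h>

/* Hypothetical error value; the real kernel error code is not modelled. */
#define TOY_EFAIL (-1)

/* Stand-in for the calling thread: only tracks whether a signal is pending. */
struct toy_task {
	bool signal_pending;
};

/* Models force_sig(): leave a signal pending on the task. */
static void toy_force_sig(struct toy_task *t)
{
	t->signal_pending = true;
}

/* Models flush_signals(): discard whatever is pending. */
static void toy_flush_signals(struct toy_task *t)
{
	t->signal_pending = false;
}

/*
 * Models a thread-creation call that waits killably: a signal that is
 * already pending on the caller makes it give up and report failure.
 */
static int toy_kthread_run(const struct toy_task *caller)
{
	if (caller->signal_pending)
		return TOY_EFAIL;
	return 0;
}

/* Caller without the fix: the stale signal reaches toy_kthread_run(). */
static int toy_outdate_peer_unpatched(struct toy_task *self)
{
	return toy_kthread_run(self);
}

/* Caller with the fix: flush first, then create the thread. */
static int toy_outdate_peer_patched(struct toy_task *self)
{
	toy_flush_signals(self);
	return toy_kthread_run(self);
}
```

In this model, calling `toy_force_sig()` followed by `toy_outdate_peer_unpatched()` reports a failure, while `toy_outdate_peer_patched()` on the same task succeeds, which mirrors the ordering the patch introduces.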
* Re: [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
2014-07-09 19:18 ` [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper' Philipp Reisner
@ 2014-07-10 9:07 ` Jens Axboe
2014-07-10 9:53 ` Philipp Reisner
0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2014-07-10 9:07 UTC (permalink / raw)
To: Philipp Reisner, linux-kernel; +Cc: drbd-dev
On 2014-07-09 21:18, Philipp Reisner wrote:
> From: Lars Ellenberg <lars.ellenberg@linbit.com>
>
> Since linux kernel 3.13, kthread_run() internally uses
> wait_for_completion_killable(). We sometimes may use kthread_run()
> while we still have a signal pending, which we used to kick our threads
> out of potentially blocking network functions, causing kthread_run() to
> mistake that as a new fatal signal and fail.
>
> Fix: flush_signals() before kthread_run().
Applied - should this have been marked stable, if it affects 3.13+ kernels?
--
Jens Axboe
* Re: [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
2014-07-10 9:07 ` Jens Axboe
@ 2014-07-10 9:53 ` Philipp Reisner
2014-07-10 9:55 ` Jens Axboe
0 siblings, 1 reply; 5+ messages in thread
From: Philipp Reisner @ 2014-07-10 9:53 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel, drbd-dev
On Thursday, 10 July 2014, 11:07:22, Jens Axboe wrote:
> On 2014-07-09 21:18, Philipp Reisner wrote:
> > From: Lars Ellenberg <lars.ellenberg@linbit.com>
> >
> > Since linux kernel 3.13, kthread_run() internally uses
> > wait_for_completion_killable(). We sometimes may use kthread_run()
> > while we still have a signal pending, which we used to kick our threads
> > out of potentially blocking network functions, causing kthread_run() to
> > mistake that as a new fatal signal and fail.
> >
> > Fix: flush_signals() before kthread_run().
>
> Applied - should this have been marked stable, if it affects 3.13+ kernels?
Yes, you are right. It should go to the stable kernels since 3.13 as
well.
What is the correct way for me to mark it as stable when sending a patch?
-phil
* Re: [PATCH] drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
2014-07-10 9:53 ` Philipp Reisner
@ 2014-07-10 9:55 ` Jens Axboe
0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2014-07-10 9:55 UTC (permalink / raw)
To: Philipp Reisner; +Cc: linux-kernel, drbd-dev
On 2014-07-10 11:53, Philipp Reisner wrote:
> On Thursday, 10 July 2014, 11:07:22, Jens Axboe wrote:
>> On 2014-07-09 21:18, Philipp Reisner wrote:
>>> From: Lars Ellenberg <lars.ellenberg@linbit.com>
>>>
>>> Since linux kernel 3.13, kthread_run() internally uses
>>> wait_for_completion_killable(). We sometimes may use kthread_run()
>>> while we still have a signal pending, which we used to kick our threads
>>> out of potentially blocking network functions, causing kthread_run() to
>>> mistake that as a new fatal signal and fail.
>>>
>>> Fix: flush_signals() before kthread_run().
>>
>> Applied - should this have been marked stable, if it affects 3.13+ kernels?
>
> Yes, you are right. It should go to the stable kernels since 3.13 as
> well.
Alright, we'll have to notify Greg/stable when it goes in.
> What is the correct way for me to mark it as stable when sending a patch?
You just add a:
Cc: stable@kernel.org
where the signed-off-by is. If you know the versions it should be
applied to, you can add that information as well. For this case, you
would have done:
Cc: stable@kernel.org # v3.13+
to get it into 3.13 stable and later.
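For illustration, the trailer block of this patch might then have looked like the following. This is a sketch assembled from the sign-offs in the patch earlier in the thread, not the commit as it was actually applied:

```
Cc: stable@kernel.org # v3.13+
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
```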
--
Jens Axboe