linux-kernel.vger.kernel.org archive mirror
From: Oleg Drokin <green@linuxhacker.ru>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel@vger.kernel.org, devel@driverdev.osuosl.org
Cc: "Christopher J. Morrone" <morrone2@llnl.gov>,
	Oleg Drokin <oleg.drokin@intel.com>
Subject: [PATCH 24/47] staging/lustre/llite: Avoid statahead thread start/stop deadlocks
Date: Sun, 27 Apr 2014 13:06:48 -0400
Message-ID: <1398618431-29757-25-git-send-email-green@linuxhacker.ru>
In-Reply-To: <1398618431-29757-1-git-send-email-green@linuxhacker.ru>

From: "Christopher J. Morrone" <morrone2@llnl.gov>

The statahead and statahead agl threads blindly set their thread state
to SVC_RUNNING without checking the current state first.  If, for
instance, another thread has already set the state to SVC_STOPPING,
that stop request is overwritten and lost.  Deadlock ensues.
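
A minimal userspace sketch of the lost stop request described above,
assuming a simplified state machine.  The names sketch_thread, SK_* and
sketch_* are hypothetical stand-ins, not the Lustre definitions, and the
locking that the real code does around the update is omitted here:

```c
/* Hypothetical stand-ins for the SVC_* thread states. */
enum sketch_state { SK_INIT, SK_RUNNING, SK_STOPPING, SK_STOPPED };

struct sketch_thread {
	enum sketch_state state;
};

/* Another thread asks the worker to stop. */
static void sketch_request_stop(struct sketch_thread *t)
{
	t->state = SK_STOPPING;
}

/* Old behaviour: overwrite whatever state is currently set. */
static void sketch_start_blind(struct sketch_thread *t)
{
	t->state = SK_RUNNING;
}

/* New behaviour: only move from the initial state to running, so a
 * stop request that arrived first is preserved. */
static void sketch_start_checked(struct sketch_thread *t)
{
	if (t->state == SK_INIT)
		t->state = SK_RUNNING;
}
```

If sketch_request_stop() runs before sketch_start_blind(), the state ends
up as SK_RUNNING and the stop request is gone.  With
sketch_start_checked() the state stays SK_STOPPING.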

We also partly improve the sai reference counting, because a race exists
where a thread calling ll_stop_statahead() can drop the default reference
and the statahead thread can exit and drop its reference as well.  With no
references left on the sai, the final put poisons and frees the buffer.
The original do_statahead_enter() function may then continue to access
the buffer after it is freed, because it did not take a reference of its
own.  We add a local reference to address that.
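
The reference pattern above can be sketched in userspace as follows.
This is an illustration under stated assumptions, not the Lustre code:
sketch_sai and the sketch_* helpers are hypothetical, a plain int stands
in for the real counter, and the concurrent stopper is simulated by a
direct call in the same thread:

```c
#include <stdlib.h>

struct sketch_sai {
	int refs;	/* starts at 1: the "default" reference */
	int *freed;	/* set to 1 on release, for illustration only */
};

static struct sketch_sai *sketch_sai_alloc(int *freed)
{
	struct sketch_sai *sai = malloc(sizeof(*sai));

	if (sai == NULL)
		return NULL;
	sai->refs = 1;
	sai->freed = freed;
	*freed = 0;
	return sai;
}

static void sketch_sai_get(struct sketch_sai *sai)
{
	sai->refs++;
}

static void sketch_sai_put(struct sketch_sai *sai)
{
	if (--sai->refs == 0) {
		*sai->freed = 1;
		free(sai);
	}
}

/* Returns 1 if the object was still alive while the creator used it
 * after publishing it, 0 if it was already freed, -1 on allocation
 * failure. */
static int sketch_enter(int *freed)
{
	struct sketch_sai *sai = sketch_sai_alloc(freed);
	int alive;

	if (sai == NULL)
		return -1;

	sketch_sai_get(sai);	/* local reference, taken before publishing */
	/* The object is now visible to others.  Simulate a stopper
	 * dropping the default reference. */
	sketch_sai_put(sai);
	alive = !*freed;	/* still safe to touch sai here */
	sketch_sai_put(sai);	/* drop the local reference: final put */
	return alive;
}
```

Without the sketch_sai_get() call, the simulated stopper's put would be
the final one and any later access to sai would touch freed memory.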

Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/9358
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4624
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
 drivers/staging/lustre/lustre/llite/statahead.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/statahead.c b/drivers/staging/lustre/lustre/llite/statahead.c
index c8624b5..74d95b0 100644
--- a/drivers/staging/lustre/lustre/llite/statahead.c
+++ b/drivers/staging/lustre/lustre/llite/statahead.c
@@ -964,7 +964,11 @@ static int ll_agl_thread(void *arg)
 	atomic_inc(&sbi->ll_agl_total);
 	spin_lock(&plli->lli_agl_lock);
 	sai->sai_agl_valid = 1;
-	thread_set_flags(thread, SVC_RUNNING);
+	if (thread_is_init(thread))
+		/* If someone else has changed the thread state
+		 * (e.g. already changed to SVC_STOPPING), we can't just
+		 * blindly overwrite that setting. */
+		thread_set_flags(thread, SVC_RUNNING);
 	spin_unlock(&plli->lli_agl_lock);
 	wake_up(&thread->t_ctl_waitq);
 
@@ -1058,7 +1062,11 @@ static int ll_statahead_thread(void *arg)
 
 	atomic_inc(&sbi->ll_sa_total);
 	spin_lock(&plli->lli_sa_lock);
-	thread_set_flags(thread, SVC_RUNNING);
+	if (thread_is_init(thread))
+		/* If someone else has changed the thread state
+		 * (e.g. already changed to SVC_STOPPING), we can't just
+		 * blindly overwrite that setting. */
+		thread_set_flags(thread, SVC_RUNNING);
 	spin_unlock(&plli->lli_sa_lock);
 	wake_up(&thread->t_ctl_waitq);
 
@@ -1658,6 +1666,12 @@ int do_statahead_enter(struct inode *dir, struct dentry **dentryp,
 	CDEBUG(D_READA, "start statahead thread: [pid %d] [parent %.*s]\n",
 	       current_pid(), parent->d_name.len, parent->d_name.name);
 
+	/* The sai buffer already has one reference taken at allocation time,
+	 * but as soon as we expose the sai by attaching it to the lli that
+	 * default reference can be dropped by another thread calling
+	 * ll_stop_statahead. We need to take a local reference to protect
+	 * the sai buffer while we intend to access it. */
+	ll_sai_get(sai);
 	lli->lli_sai = sai;
 
 	plli = ll_i2info(parent->d_inode);
@@ -1670,6 +1684,9 @@ int do_statahead_enter(struct inode *dir, struct dentry **dentryp,
 		lli->lli_opendir_key = NULL;
 		thread_set_flags(thread, SVC_STOPPED);
 		thread_set_flags(&sai->sai_agl_thread, SVC_STOPPED);
+		/* Drop both our own local reference and the default
+		 * reference from allocation time. */
+		ll_sai_put(sai);
 		ll_sai_put(sai);
 		LASSERT(lli->lli_sai == NULL);
 		return -EAGAIN;
@@ -1678,6 +1695,7 @@ int do_statahead_enter(struct inode *dir, struct dentry **dentryp,
 	l_wait_event(thread->t_ctl_waitq,
 		     thread_is_running(thread) || thread_is_stopped(thread),
 		     &lwi);
+	ll_sai_put(sai);
 
 	/*
 	 * We don't stat-ahead for the first dirent since we are already in
-- 
1.8.5.3