From: green@linuxhacker.ru
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	devel@driverdev.osuosl.org,
	Andreas Dilger <andreas.dilger@intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Lustre Development List <lustre-devel@lists.lustre.org>,
	Bobi Jam <bobijam.xu@intel.com>,
	Oleg Drokin <green@linuxhacker.ru>
Subject: [PATCH 20/43] staging/lustre: update comments after cl_lock simplification
Date: Wed, 30 Mar 2016 12:47:52 -0400
Message-ID: <1459356495-2794775-21-git-send-email-green@linuxhacker.ru>
In-Reply-To: <1459356495-2794775-1-git-send-email-green@linuxhacker.ru>

From: Bobi Jam <bobijam.xu@intel.com>

Update comments to reflect the current state of cl_lock after its
simplification: the lock is no longer cached and no longer runs a state
machine.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Reviewed-on: http://review.whamcloud.com/13137
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6046
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
---
 drivers/staging/lustre/lustre/include/cl_object.h  | 130 +++------------------
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |  13 ---
 2 files changed, 19 insertions(+), 124 deletions(-)
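
To illustrate the life cycle that the updated comment describes, here is a
minimal, self-contained sketch in C. It is a toy model, not Lustre code: the
toy_* types and functions are hypothetical names chosen for this example and
only mirror the roles of cl_lock, the LDLM lock, clo_enqueue and clo_cancel.
It assumes a single LDLM lock and ignores striping, layering and concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cacheable LDLM lock (not the Lustre struct). */
struct toy_ldlm_lock {
	int  users;	/* number of cl_locks currently using this lock */
	bool granted;	/* true while the lock is cached on the client */
};

/* Hypothetical stand-in for cl_lock: a short-lived, uncached container. */
struct toy_cl_lock {
	struct toy_ldlm_lock *ldlm;	/* attached at enqueue time */
};

/*
 * Models the role of clo_enqueue at the OSC layer: reuse the cached LDLM
 * lock if there is one, otherwise pretend a new one was granted by the OST.
 * Either way, take a reference on behalf of this cl_lock.
 */
void toy_enqueue(struct toy_cl_lock *lock, struct toy_ldlm_lock *cached)
{
	if (!cached->granted)
		cached->granted = true;
	cached->users++;
	lock->ldlm = cached;
}

/* Models the role of clo_cancel: drop the reference taken at enqueue time. */
void toy_cancel(struct toy_cl_lock *lock)
{
	assert(lock->ldlm != NULL);
	lock->ldlm->users--;
	lock->ldlm = NULL;
}

/* The LDLM lock can be canceled only if no cl_lock is using it. */
bool toy_ldlm_try_cancel(struct toy_ldlm_lock *ldlm)
{
	if (ldlm->users > 0)
		return false;
	ldlm->granted = false;
	return true;
}
```

In this model, two I/Os that each call toy_enqueue() on the same
toy_ldlm_lock share it; toy_ldlm_try_cancel() fails until both have called
toy_cancel(), and the lock stays cached (granted) in between I/Os until it
is explicitly canceled.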

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index 8f9512e..e613007 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -1117,111 +1117,29 @@ static inline struct page *cl_page_vmpage(struct cl_page *page)
  *
  * LIFE CYCLE
  *
- * cl_lock is reference counted. When reference counter drops to 0, lock is
- * placed in the cache, except when lock is in CLS_FREEING state. CLS_FREEING
- * lock is destroyed when last reference is released. Referencing between
- * top-lock and its sub-locks is described in the lov documentation module.
- *
- * STATE MACHINE
- *
- * Also, cl_lock is a state machine. This requires some clarification. One of
- * the goals of client IO re-write was to make IO path non-blocking, or at
- * least to make it easier to make it non-blocking in the future. Here
- * `non-blocking' means that when a system call (read, write, truncate)
- * reaches a situation where it has to wait for a communication with the
- * server, it should --instead of waiting-- remember its current state and
- * switch to some other work.  E.g,. instead of waiting for a lock enqueue,
- * client should proceed doing IO on the next stripe, etc. Obviously this is
- * rather radical redesign, and it is not planned to be fully implemented at
- * this time, instead we are putting some infrastructure in place, that would
- * make it easier to do asynchronous non-blocking IO easier in the
- * future. Specifically, where old locking code goes to sleep (waiting for
- * enqueue, for example), new code returns cl_lock_transition::CLO_WAIT. When
- * enqueue reply comes, its completion handler signals that lock state-machine
- * is ready to transit to the next state. There is some generic code in
- * cl_lock.c that sleeps, waiting for these signals. As a result, for users of
- * this cl_lock.c code, it looks like locking is done in normal blocking
- * fashion, and it the same time it is possible to switch to the non-blocking
- * locking (simply by returning cl_lock_transition::CLO_WAIT from cl_lock.c
- * functions).
- *
- * For a description of state machine states and transitions see enum
- * cl_lock_state.
- *
- * There are two ways to restrict a set of states which lock might move to:
- *
- *     - placing a "hold" on a lock guarantees that lock will not be moved
- *       into cl_lock_state::CLS_FREEING state until hold is released. Hold
- *       can be only acquired on a lock that is not in
- *       cl_lock_state::CLS_FREEING. All holds on a lock are counted in
- *       cl_lock::cll_holds. Hold protects lock from cancellation and
- *       destruction. Requests to cancel and destroy a lock on hold will be
- *       recorded, but only honored when last hold on a lock is released;
- *
- *     - placing a "user" on a lock guarantees that lock will not leave
- *       cl_lock_state::CLS_NEW, cl_lock_state::CLS_QUEUING,
- *       cl_lock_state::CLS_ENQUEUED and cl_lock_state::CLS_HELD set of
- *       states, once it enters this set. That is, if a user is added onto a
- *       lock in a state not from this set, it doesn't immediately enforce
- *       lock to move to this set, but once lock enters this set it will
- *       remain there until all users are removed. Lock users are counted in
- *       cl_lock::cll_users.
- *
- *       User is used to assure that lock is not canceled or destroyed while
- *       it is being enqueued, or actively used by some IO.
- *
- *       Currently, a user always comes with a hold (cl_lock_invariant()
- *       checks that a number of holds is not less than a number of users).
- *
- * CONCURRENCY
- *
- * This is how lock state-machine operates. struct cl_lock contains a mutex
- * cl_lock::cll_guard that protects struct fields.
- *
- *     - mutex is taken, and cl_lock::cll_state is examined.
- *
- *     - for every state there are possible target states where lock can move
- *       into. They are tried in order. Attempts to move into next state are
- *       done by _try() functions in cl_lock.c:cl_{enqueue,unlock,wait}_try().
- *
- *     - if the transition can be performed immediately, state is changed,
- *       and mutex is released.
- *
- *     - if the transition requires blocking, _try() function returns
- *       cl_lock_transition::CLO_WAIT. Caller unlocks mutex and goes to
- *       sleep, waiting for possibility of lock state change. It is woken
- *       up when some event occurs, that makes lock state change possible
- *       (e.g., the reception of the reply from the server), and repeats
- *       the loop.
- *
- * Top-lock and sub-lock has separate mutexes and the latter has to be taken
- * first to avoid dead-lock.
- *
- * To see an example of interaction of all these issues, take a look at the
- * lov_cl.c:lov_lock_enqueue() function. It is called as a part of
- * cl_enqueue_try(), and tries to advance top-lock to ENQUEUED state, by
- * advancing state-machines of its sub-locks (lov_lock_enqueue_one()). Note
- * also, that it uses trylock to grab sub-lock mutex to avoid dead-lock. It
- * also has to handle CEF_ASYNC enqueue, when sub-locks enqueues have to be
- * done in parallel, rather than one after another (this is used for glimpse
- * locks, that cannot dead-lock).
+ * cl_lock is a cacheless data container for the requirements of locks to
+ * complete the IO. cl_lock is created before I/O starts and destroyed when the
+ * I/O is complete.
+ *
+ * cl_lock depends on LDLM lock to fulfill lock semantics. LDLM lock is attached
+ * to cl_lock at OSC layer. LDLM lock is still cacheable.
  *
  * INTERFACE AND USAGE
  *
- * struct cl_lock_operations provide a number of call-backs that are invoked
- * when events of interest occurs. Layers can intercept and handle glimpse,
- * blocking, cancel ASTs and a reception of the reply from the server.
+ * Two major methods are supported for cl_lock: clo_enqueue and clo_cancel.  A
+ * cl_lock is enqueued by cl_lock_request(), which will call clo_enqueue()
+ * methods for each layer to enqueue the lock. At the LOV layer, if a cl_lock
+ * consists of multiple sub cl_locks, each sub-lock will be enqueued
+ * correspondingly. At OSC layer, the lock enqueue request will tend to reuse
+ * cached LDLM lock; otherwise a new LDLM lock will have to be requested from
+ * OST side.
  *
- * One important difference with the old client locking model is that new
- * client has a representation for the top-lock, whereas in the old code only
- * sub-locks existed as real data structures and file-level locks are
- * represented by "request sets" that are created and destroyed on each and
- * every lock creation.
+ * cl_lock_cancel() must be called to release a cl_lock after use. clo_cancel()
+ * method will be called for each layer to release the resource held by this
+ * lock. At OSC layer, the reference count of LDLM lock, which is held at
+ * clo_enqueue time, is released.
  *
- * Top-locks are cached, and can be found in the cache by the system calls. It
- * is possible that top-lock is in cache, but some of its sub-locks were
- * canceled and destroyed. In that case top-lock has to be enqueued again
- * before it can be used.
+ * LDLM lock can only be canceled if there is no cl_lock using it.
  *
  * Overall process of the locking during IO operation is as following:
  *
@@ -1234,7 +1152,7 @@ static inline struct page *cl_page_vmpage(struct cl_page *page)
  *
  *     - when all locks are acquired, IO is performed;
  *
- *     - locks are released into cache.
+ *     - locks are released after IO is complete.
  *
  * Striping introduces major additional complexity into locking. The
  * fundamental problem is that it is generally unsafe to actively use (hold)
@@ -1256,16 +1174,6 @@ static inline struct page *cl_page_vmpage(struct cl_page *page)
  * buf is a part of memory mapped Lustre file, a lock or locks protecting buf
  * has to be held together with the usual lock on [offset, offset + count].
  *
- * As multi-stripe locks have to be allowed, it makes sense to cache them, so
- * that, for example, a sequence of O_APPEND writes can proceed quickly
- * without going down to the individual stripes to do lock matching. On the
- * other hand, multi-stripe locks shouldn't be used by normal read/write
- * calls. To achieve this, every layer can implement ->clo_fits_into() method,
- * that is called by lock matching code (cl_lock_lookup()), and that can be
- * used to selectively disable matching of certain locks for certain IOs. For
- * example, lov layer implements lov_lock_fits_into() that allow multi-stripe
- * locks to be matched only for truncates and O_APPEND writes.
- *
  * Interaction with DLM
  *
  * In the expected setup, cl_lock is ultimately backed up by a collection of
diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index dfe41a8..ac9744e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -73,19 +73,6 @@
  *     - top-page keeps a reference to its sub-page, and destroys it when it
  *       is destroyed.
  *
- *     - sub-lock keep a reference to its top-locks. Top-lock keeps a
- *       reference (and a hold, see cl_lock_hold()) on its sub-locks when it
- *       actively using them (that is, in cl_lock_state::CLS_QUEUING,
- *       cl_lock_state::CLS_ENQUEUED, cl_lock_state::CLS_HELD states). When
- *       moving into cl_lock_state::CLS_CACHED state, top-lock releases a
- *       hold. From this moment top-lock has only a 'weak' reference to its
- *       sub-locks. This reference is protected by top-lock
- *       cl_lock::cll_guard, and will be automatically cleared by the sub-lock
- *       when the latter is destroyed. When a sub-lock is canceled, a
- *       reference to it is removed from the top-lock array, and top-lock is
- *       moved into CLS_NEW state. It is guaranteed that all sub-locks exist
- *       while their top-lock is in CLS_HELD or CLS_CACHED states.
- *
  *     - IO's are not reference counted.
  *
  * To implement a connection between top and sub entities, lov layer is split
-- 
2.1.0

