From: Srinivas Eeda <srinivas.eeda@oracle.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [RFC] ocfs2/dlm: support range lock
Date: Mon, 26 Jan 2015 23:08:53 -0800
Message-ID: <54C73985.8060604@oracle.com>
In-Reply-To: <54C632F4.3070705@huawei.com>

Hi Yangwenfang,

Thank you very much for initiating this RFC :). This feature is long
overdue for OCFS2, and we are also interested in implementing it.
Wengang (cc'ed) has been analysing it and attempting an implementation.
We haven't looked at splitting and merging range locks yet, but we have
looked at lock fairness and range locking. Wengang has made some of the
dlm changes to see how it can be done, but the other changes are still
work in progress. We will email more details in the coming few days.

Since you are also looking into it, it would be great if we could
collaborate on this feature. Can you please share more info on the
demo code you mentioned? For example, what does it do, and how much
work has been done on it?

One of the things we considered was making the rw lock itself support
range locking, which is a different approach from the one you
described. Is there any reason why the rw lock cannot be used and we
need a new ip_range_lock_lockres?

Thanks,
--Srini


On 01/26/2015 04:28 AM, yangwenfang wrote:
> What:
> A byte range lock locks a region of a file so that concurrent reads and
> writes to different regions of the same file can proceed in parallel.
>
> Why:
> Currently ocfs2 does not support byte range locks. Since multiple nodes
> may concurrently update/write at different positions of the same file
> in database workloads, the performance (tpmC) of DB+ocfs2 is much poorer
> than that of DB+GPFS when running TPC-C.
> To improve the efficiency of parallel accesses to the same file, we have
> implemented a demo of the range lock feature, which is already supported
> by Lustre and GPFS, so that a file can be updated by different nodes in
> the cluster when they are accessing different blocks.
>
> How:
> Key issues in design and implementation:
> 1. In ocfs2, each file has only one lock, which cannot distinguish
> between different positions in the file.
> One solution is to add a range field (start,end) to a lock. For example:
> ocfs2_lock_res(N1)               dlm_lock_resource(Master)  ocfs2_lock_res(N2)
> ocfs2_res_range_lock(0,9) ------ dlm_lock(0,9)   N1
>                                  dlm_lock(10,19) N2 <------ ocfs2_res_range_lock(10,19)
> ocfs2_res_range_lock(20,29) ---- dlm_lock(20,29) N1
>                                  dlm_lock(30,49) N2 <------ ocfs2_res_range_lock(30,49)
> ocfs2_res_range_lock(50,59) ---- dlm_lock(50,59) N1
>                                  dlm_lock(60,69) N2 <------ ocfs2_res_range_lock(60,69)
>
> Each lock resource uses an interval tree to manage its ranges, which
> supports basic operations like add, delete, insert, find, split and merge.
> The most important issue is to determine whether ranges conflict.
> Conflict-free ranges of the same file can be accessed concurrently.
> Conversely, a node must wait for a conflicting lock to be released
> before accessing that range of the file.
>
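
The conflict rule described above could be sketched roughly as follows. This
is only an illustration, not code from the demo: struct range_lock, enum
range_level and ranges_conflict() are hypothetical names, bounds are assumed
to be inclusive as in the (0,9)/(10,19) example, and ranges held by the same
node are assumed never to conflict with each other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock levels: PR is shared (read), EX is exclusive (write). */
enum range_level { RANGE_PR, RANGE_EX };

/* Hypothetical descriptor for one held or requested range (inclusive bounds). */
struct range_lock {
    uint64_t start;
    uint64_t end;
    enum range_level level;
    int node;               /* node number that holds or requests the range */
};

/* Two inclusive ranges overlap when each one starts before the other ends. */
bool ranges_overlap(const struct range_lock *a, const struct range_lock *b)
{
    assert(a->start <= a->end && b->start <= b->end);
    return a->start <= b->end && b->start <= a->end;
}

/*
 * A request conflicts with a held range only if it comes from another node,
 * the ranges overlap, and at least one side is EX. PR/PR never conflicts.
 * Treating same-node ranges as compatible is an assumption of this sketch.
 */
bool ranges_conflict(const struct range_lock *held, const struct range_lock *req)
{
    if (held->node == req->node)
        return false;
    if (!ranges_overlap(held, req))
        return false;
    return held->level == RANGE_EX || req->level == RANGE_EX;
}
```

A real implementation would run this check against every overlapping entry
found in the interval tree of the lock resource rather than against a single
held range.
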
> Byte range locks support split and merge rules: at the same level, the
> larger range takes precedence; at different levels, write > read (if a
> node holds an EX lock on range (start,end), it also holds a PR range lock
> on (start,end)).
> For example:
> (1) merge: N1 holds range locks (0,9) PR and (5,19) PR; they are merged
> into (0,19) PR;
> (2) merge: N1 holds range locks (0,9) PR and (5,19) EX; the merged locks
> become (0,19) PR and (5,19) EX;
> (3) split: N1 holds range lock (0,9) PR and N2 tries to lock (0,5) PR; N1
> should split the lock and keep (6,9) PR.
>
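
The range arithmetic behind examples (1) and (3) could look something like
the sketch below. It is not taken from the demo: struct byte_range,
range_merge() and range_split() are hypothetical names, bounds are inclusive,
and merging ranges that merely touch, such as (0,9) and (10,19), is an
assumption that the RFC does not state. The sketch only computes the
resulting ranges; it does not decide when a merge or split is required, and
it ignores lock levels.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inclusive byte range, e.g. {0, 9} covers bytes 0 through 9. */
struct byte_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Merge two same-level ranges held by one node. Returns false and leaves
 * *out untouched when there is a gap between them. Ranges that only touch
 * are merged too, which is an assumption of this sketch.
 */
bool range_merge(struct byte_range a, struct byte_range b, struct byte_range *out)
{
    assert(a.start <= a.end && b.start <= b.end);
    if ((a.end < b.start && b.start - a.end > 1) ||
        (b.end < a.start && a.start - b.end > 1))
        return false;
    out->start = a.start < b.start ? a.start : b.start;
    out->end = a.end > b.end ? a.end : b.end;
    return true;
}

/*
 * Remove the part of 'held' covered by 'taken'. Writes the 0, 1 or 2
 * remaining pieces to out[] and returns how many pieces were written.
 */
int range_split(struct byte_range held, struct byte_range taken,
                struct byte_range out[2])
{
    int n = 0;

    assert(held.start <= held.end && taken.start <= taken.end);
    if (taken.end < held.start || taken.start > held.end) {
        out[0] = held;          /* no overlap: the held range is unchanged */
        return 1;
    }
    if (taken.start > held.start) {     /* piece left of the taken range */
        out[n].start = held.start;
        out[n].end = taken.start - 1;
        n++;
    }
    if (taken.end < held.end) {         /* piece right of the taken range */
        out[n].start = taken.end + 1;
        out[n].end = held.end;
        n++;
    }
    return n;
}
```

With these helpers, merging (0,9) and (5,19) yields (0,19) as in example (1),
and splitting (0,9) by (0,5) leaves the single piece (6,9) as in example (3).
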
> 2. In ocfs2, there are only three types of lock resources (rw, inode and
> open), which protect different contents.
> We need to add another lock resource (ip_range_lock_lockres) to protect
> different ranges in the IO read/write path.
> For example, buffered read/write:
> (1) ocfs2_file_aio_write ----------> ocfs2_file_aio_write
>       ocfs2_rw_lock(ex)                ocfs2_rw_lock(pr)
>                                        ocfs2_range_lock(start, end, ex)
>       ocfs2_write_begin
>         ocfs2_inode_lock(ex)           ocfs2_inode_lock(pr)
>                                        if append, update to ex;
> (2) ocfs2_file_aio_read -----------> no need to change.
>       ocfs2_readpage
>         ocfs2_inode_lock(pr)
> (3) but it is a problem in read_ahead.
>       ocfs2_readpages -------------> ocfs2_readpages
>       ocfs2_inode_lock(pr)           ocfs2_inode_lock(pr)
>                                      ocfs2_range_lock(start, end, pr)
>
> Limitations based on our assumptions:
> 1. Byte range locks are only beneficial for update writes.
> 2. Delayed unlock leads to too many locks.
> 3. Significant source code modification is required, involving almost the
> whole of the dlmglue and dlm modules.
>
> As described above, there are also many limitations based on our
> assumptions. Many thanks for any advice.
>
> Thanks.
>
