All of lore.kernel.org
 help / color / mirror / Atom feed
From: Coly Li <coly.li@suse.de>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH] ocfs2: fix DLM_BADARGS error in concurrent file locking
Date: Fri, 12 Dec 2008 03:04:49 +0800	[thread overview]
Message-ID: <49416451.90904@suse.de> (raw)
In-Reply-To: <1228789700-8226-3-git-send-email-sunil.mushran@oracle.com>

Sunil,

Sorry for my late replying and thanks for your patient explaining.

Sunil Mushran Wrote:
> So I have a new patch. It has debug code too. I am still testing this.
> Don't upload it to bz.
>
> --- a/fs/ocfs2/dlm/dlmmaster.c
> +++ b/fs/ocfs2/dlm/dlmmaster.c
> @@ -672,6 +672,12 @@ void __dlm_lockres_drop_inflight_ref(struct
> dlm_ctxt *dlm,
>  {
>         assert_spin_locked(&res->spinlock);
>
> +       if (res->inflight_locks == 0) {
> +               mlog(ML_ERROR, "%s:%.*s: inflight=0\n",
> +                    dlm->name, res->lockname.len, res->lockname.name);
> +               __dlm_print_one_lock_resource(res);
> +               dump_stack();
> +       }
>         BUG_ON(res->inflight_locks == 0);
>         res->inflight_locks--;
>         mlog(0, "%s:%.*s: inflight--: now %u\n",
> @@ -726,14 +732,22 @@ lookup:
>         if (tmpres) {
>                 int dropping_ref = 0;
>
> +               spin_unlock(&dlm->spinlock);
> +
>                 spin_lock(&tmpres->spinlock);
> +               if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
> +                       mlog(ML_NOTICE, "%s:%.*s Another thread is
> mastering "
> +                            "this resource\n", dlm->name, namelen, lockid);
> +                       __dlm_wait_on_lockres(tmpres);
> +                       BUG_ON(tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN);
> +               }
> +
>                 if (tmpres->owner == dlm->node_num) {
>                         BUG_ON(tmpres->state & DLM_LOCK_RES_DROPPING_REF);
>                         dlm_lockres_grab_inflight_ref(dlm, tmpres);
>                 } else if (tmpres->state & DLM_LOCK_RES_DROPPING_REF)
>                         dropping_ref = 1;
>                 spin_unlock(&tmpres->spinlock);
> -               spin_unlock(&dlm->spinlock);
>
>                 /* wait until done messaging the master, drop our ref to
> allow
>                  * the lockres to be purged, start over. */
>
>
>
> Coly Li wrote:
>> This is the way I tried firstly, didn't not work well. When I tested with more than 20 forks on 1
>> node, user space process got segmentation fault and kernel got oops. The kernel oops was triggered
>> on BUG_ON(res->inflight_locks == 0) in __dlm_lockres_drop_inflight_ref() IIRC.
>>
>> Also, BUG_ON(tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) might not always be correct IMHO. Even we
>> return from __dlm_wait_on_lockres_flags(), there still is chance to set tmpres->owner to
>> DLM_LOCK_RES_OWNER_UNKNOWN before BUG_ON is tested (in theory), especially when dlm->spinlock is not
>> held.
>>
> Well, if you get an oops, get the stack. If you had, it would have
> looked like this.
>
> (19990,0):__dlm_lockres_drop_inflight_ref:677
>    ERROR:
> 1B57953A48324D258163FE31C04C4B11:F00000000000000000801f61f38b60c: inflight=0
> Call Trace:
> .show_stack+0x68/0x1b0 (unreliable)
>  .__dlm_lockres_drop_inflight_ref+0x98/0x184 [ocfs2_dlm]
>  .dlmlock_master+0x41c/0x4f8 [ocfs2_dlm]
>  .dlmlock+0x8b0/0x11a0 [ocfs2_dlm]
>  .ocfs2_lock_create+0x174/0x3a8 [ocfs2]
>  .ocfs2_file_lock+0x158/0x6d8 [ocfs2]
>  .ocfs2_flock+0x18c/0x278 [ocfs2]
>  .sys_flock+0x170/0x1d4
>  syscall_exit+0x0/0x40
> kernel BUG in __dlm_lockres_drop_inflight_ref at
> /rpmbuild/smushran/BUILD/ocfs2-1.4.1/fs/ocfs2/dlm/dlmmaster.c:680!
>
> This tells you what the issue is. We are forgetting to take an inflight ref.
> The new patch moves the code up so that we take the reference.
Because I am not familiar with the code yet, I though this is an oops triggered by my first
modification. Therefore, I choose to use a loop which did not trigger the oops.

From your reply, it seems kernel BUG in __dlm_lockres_drop_inflight_ref at dlmmaster.c:680 is
another bug ? I saw a patch named "ocfs2/dlm: Fix race in adding/removing lockres' to/from the
tracking list", is it the fix for this bug ? If yes, I should learn how you resolve it ;)

>
> Secondly, dlm->spinlock does not protect res->owner. Furthermore, that
> BUG_ON is important because that is the case. If mastery is underway,
> the state will remain IN_PROGRESS. I am curious as to why you feel
> otherwise.
Here is how I thought, please comments on my mistake,
The dlm associated with lockres is projected by dlm->spinlock, if we only protect lockres by
lockres->spinlock, there *might* be possibility to modify dlm->node_num somewhere. Since we have
quite a few places to compare lockres->owner with dlm->node_num, I suspect that manipulating on
lockres->owner without protecting dlm->owner might be problematic.

I might be wrong, and happy to see your correction :)
>
> Lastly, the perl script can be improved by enclosing the 20 fork loop
> into another loop that changes the filename. To startwith, touch 5000 files
> (test.flock%d). umount and mount again. This will drop the caches. Now
> before running any ls, run flock.pl. It will spawn 20 processes to flock()
> each file and wait till the 20 forks have exited before letting loose
> another
> 20 for the next file. (Makesure you have another node or two or more
> that have
> just mounted that volume.)
>
> I ran this with 5000 files successfully. I did encounter a problem in
> the last
> run, not where I expected it. Still a work in progress. I can't email
> you the
> perl script as my box will remain in the debugger for the night. ;)
>
> My problem with your patch is not that it is wrong, but it is punting the
> issue. Say we do have a case in which the res->owner remains unknown
> eventhough
> the state in not in_prog. In that case, the process will keep looping. By
> oopsing, we catch the problem upfront.
Thanks for the comments, I do agree with you. Your comments help me to understand more to the code :-)

-- 
Coly Li
SuSE PRC Labs

  reply	other threads:[~2008-12-11 19:04 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-12-09  2:28 [Ocfs2-devel] flock() race - fix Sunil Mushran
2008-12-09  2:28 ` [Ocfs2-devel] [PATCH 10/11] ocfs2/dlm: Fix race in adding/removing lockres' to/from the tracking list Sunil Mushran
2008-12-09  2:28 ` [Ocfs2-devel] [PATCH 11/11] ocfs2/dlm: Fix race during lockres mastery Sunil Mushran
2008-12-11 19:04   ` Coly Li [this message]
2008-12-11 19:23     ` [Ocfs2-devel] [PATCH] ocfs2: fix DLM_BADARGS error in concurrent file locking Sunil Mushran
2008-12-11 21:10       ` Coly Li
  -- strict thread matches above, loose matches on Subject: below --
2008-12-02 12:06 Coly Li
2008-12-02 22:09 ` Sunil Mushran
2008-12-03  2:23   ` Coly Li
2008-12-04  3:58     ` Sunil Mushran
2008-12-03 19:48   ` Coly Li
2008-12-04  1:31     ` tristan.ye
     [not found]       ` <4940CC19.5010801@suse.de>
2008-12-11  8:26         ` tristan.ye

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49416451.90904@suse.de \
    --to=coly.li@suse.de \
    --cc=ocfs2-devel@oss.oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.