* [Ocfs2-devel] [patch 5/8] ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless, loop during umount
From: akpm at linux-foundation.org @ 2014-03-19 21:10 UTC
To: ocfs2-devel
From: jiangyiwen <jiangyiwen@huawei.com>
Subject: ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless,loop during umount
The following case may lead to an endless loop during umount.
node A              node B                    node C                            node D
umount volume,
migrate lockres1
to B
                                                                                want to lock lockres1,
                                                                                send
                                                                                MASTER_REQUEST_MSG
                                                                                to C
                                              init block mle
                    send
                    MIGRATE_REQUEST_MSG
                    to C
                                              find a block
                                              mle, and then
                                              return
                                              DLM_MIGRATE_RESPONSE_MASTERY_REF
                                              to B
                    set C in refmap
umount successfully
                    try to umount, endless
                    loop occurs when migrate
                    lockres1 since C is in
                    refmap
So we can fix this endless loop by returning
DLM_MIGRATE_RESPONSE_MASTERY_REF only when the node that receives
MIGRATE_REQUEST_MSG holds a master mle, not a block mle.
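
The rule described above can be sketched in isolation as follows. This is a
simplified userspace model, not the kernel code: the enum names, the numeric
values, the `only_master_mle` switch and both helper functions are
hypothetical and exist only for illustration. It assumes the existing mle is
not a migration mle, and it models the refmap as a plain bitmask.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the non-migration mle types (not kernel definitions). */
enum mle_type { MLE_BLOCK, MLE_MASTER };

/* Hypothetical response codes; the numeric values are illustrative only. */
enum migrate_response { RESPONSE_OK, RESPONSE_MASTERY_REF };

/*
 * Response of a node that receives a migrate request while it already
 * holds a (non-migration) mle for the lock resource.
 * only_master_mle == false models the behaviour before the patch,
 * only_master_mle == true models the behaviour after it.
 */
static enum migrate_response respond_to_migrate(enum mle_type existing,
						bool only_master_mle)
{
	if (!only_master_mle || existing == MLE_MASTER)
		return RESPONSE_MASTERY_REF;
	return RESPONSE_OK;
}

/* The requester sets the responder's refmap bit only on MASTERY_REF. */
static unsigned int update_refmap(unsigned int refmap, unsigned int node,
				  enum migrate_response resp)
{
	if (resp == RESPONSE_MASTERY_REF)
		refmap |= 1u << node;
	return refmap;
}
```

In this model, a responder holding only a block mle ends up in the refmap
under the old rule and stays out of it under the new one, which is the
difference the scenario above depends on.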
[akpm at linux-foundation.org: coding-style fixes]
Signed-off-by: jiangyiwen <jiangyiwen@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Xue jiufei <xuejiufei@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/ocfs2/dlm/dlmmaster.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff -puN fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount fs/ocfs2/dlm/dlmmaster.c
--- a/fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount
+++ a/fs/ocfs2/dlm/dlmmaster.c
@@ -3084,11 +3084,15 @@ static int dlm_add_migration_mle(struct
 			/* remove it so that only one mle will be found */
 			__dlm_unlink_mle(dlm, tmp);
 			__dlm_mle_detach_hb_events(dlm, tmp);
-			ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
-			mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
-					"telling master to get ref for cleared out mle "
-					"during migration\n", dlm->name, namelen, name,
-					master, new_master);
+			if (tmp->type == DLM_MLE_MASTER) {
+				ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
+				mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
+						"telling master to get ref "
+						"for cleared out mle during "
+						"migration\n", dlm->name,
+						namelen, name, master,
+						new_master);
+			}
 		}
 		spin_unlock(&tmp->spinlock);
 	}
_
* [Ocfs2-devel] [patch 5/8] ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless, loop during umount
From: Mark Fasheh @ 2014-03-31 2:23 UTC
To: ocfs2-devel
On Wed, Mar 19, 2014 at 02:10:03PM -0700, Andrew Morton wrote:
> From: jiangyiwen <jiangyiwen@huawei.com>
> Subject: ocfs2: do not return DLM_MIGRATE_RESPONSE_MASTERY_REF to avoid endless,loop during umount
>
> The following case may lead to endless loop during umount.
>
> node A              node B                    node C                            node D
> umount volume,
> migrate lockres1
> to B
>                                                                                 want to lock lockres1,
>                                                                                 send
>                                                                                 MASTER_REQUEST_MSG
>                                                                                 to C
>                                               init block mle
>                     send
>                     MIGRATE_REQUEST_MSG
>                     to C
>                                               find a block
>                                               mle, and then
>                                               return
>                                               DLM_MIGRATE_RESPONSE_MASTERY_REF
>                                               to B
>                     set C in refmap
> umount successfully
>                     try to umount, endless
>                     loop occurs when migrate
>                     lockres1 since C is in
>                     refmap
>
> So we can fix this endless loop case by only returning
> DLM_MIGRATE_RESPONSE_MASTERY_REF if it has a mastery mle when receiving
> MIGRATE_REQUEST_MSG.
>
> [akpm at linux-foundation.org: coding-style fixes]
> Signed-off-by: jiangyiwen <jiangyiwen@huawei.com>
> Cc: Mark Fasheh <mfasheh@suse.com>
> Cc: Joel Becker <jlbec@evilplan.org>
> Cc: Xue jiufei <xuejiufei@huawei.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ok, I _think_ I got this race condition, and the patch itself seems sane.
How was this bug hit, and how much testing did you do with this patch? I ask
because dlm changes can sometimes have unintended effects and I really don't
understand that particular code well enough right now to tell with 100%
certainty we didn't mess something else up.
Actually, I'm going to CC Sunil in the hopes he can look at this.
--Mark
> ---
>
> fs/ocfs2/dlm/dlmmaster.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff -puN fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount fs/ocfs2/dlm/dlmmaster.c
> --- a/fs/ocfs2/dlm/dlmmaster.c~ocfs2-do-not-return-dlm_migrate_response_mastery_ref-to-avoid-endlessloop-during-umount
> +++ a/fs/ocfs2/dlm/dlmmaster.c
> @@ -3084,11 +3084,15 @@ static int dlm_add_migration_mle(struct
>  			/* remove it so that only one mle will be found */
>  			__dlm_unlink_mle(dlm, tmp);
>  			__dlm_mle_detach_hb_events(dlm, tmp);
> -			ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
> -			mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
> -					"telling master to get ref for cleared out mle "
> -					"during migration\n", dlm->name, namelen, name,
> -					master, new_master);
> +			if (tmp->type == DLM_MLE_MASTER) {
> +				ret = DLM_MIGRATE_RESPONSE_MASTERY_REF;
> +				mlog(0, "%s:%.*s: master=%u, newmaster=%u, "
> +						"telling master to get ref "
> +						"for cleared out mle during "
> +						"migration\n", dlm->name,
> +						namelen, name, master,
> +						new_master);
> +			}
>  		}
>  		spin_unlock(&tmp->spinlock);
>  	}
> _
--
Mark Fasheh