* [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push [not found] <929991500.42327290.1412926226461.JavaMail.zimbra@redhat.com> @ 2014-10-10 7:31 ` Xu Wang 2014-10-10 12:07 ` Brian Foster 0 siblings, 1 reply; 3+ messages in thread From: Xu Wang @ 2014-10-10 7:31 UTC (permalink / raw) To: xfs From 59c6babab7c4ce8708036018143d2acc1477cc7f Mon Sep 17 00:00:00 2001 From: George Wang <xuw@redhat.com> Date: Fri, 10 Oct 2014 15:02:14 +0800 Subject: [PATCH] [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push There is a race condition when xlog_cil_force_lsn and xlog_cil_push for cil->xc_cil items. When function xlog_cil_push_now called in xlog_cil_force_lsn, and the xlog_cil_push run in workqueue just reaches the point for setting cil->xc_current_sequence, the xlog_cil_force_lsn got the lock and check the "sequence == cil->xc_current_sequence && !list_empty(&cil->xc_cil)" for restart. The statement "sequence == cil->xc_current_sequence" is true, but the cil->xc_cil is empty according to the xlog_cil_push removed the items in cil->xc_cil without lock protectionl. So the function xlog_cil_force_lsn will return NULLCOMMITLSN, which means the cil log for current sequence will not be submitted to disk. And if there is no more operations(for example, when unmount fs, this situation happened in xfs_unmountfs->xfs_log_sbcount->xfs_trans_commit), the xfs will hang up. Signed-off-by: George Wang <xuw@redhat.com> --- fs/xfs/xfs_log_cil.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c index f6b79e5..97ba527 100644 --- a/fs/xfs/xfs_log_cil.c +++ b/fs/xfs/xfs_log_cil.c @@ -478,6 +478,8 @@ xlog_cil_push( */ lv = NULL; num_iovecs = 0; + + spin_lock(&cil->xc_push_lock); while (!list_empty(&cil->xc_cil)) { struct xfs_log_item *item; @@ -530,7 +532,6 @@ xlog_cil_push( * against the current sequence in log forces without risking * deferencing a freed context pointer. */ - spin_lock(&cil->xc_push_lock); cil->xc_current_sequence = new_ctx->sequence; list_add(&ctx->committing, &cil->xc_committing); spin_unlock(&cil->xc_push_lock); -- 1.9.3 -- George Wang 王旭 Kernel Quantity Engineer Red Hat Software (Beijing) Co.,Ltd IRC:xuw Tel:+86-010-62608041 Phone:15901231579 9/F, Tower C, Raycom _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push 2014-10-10 7:31 ` [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push Xu Wang @ 2014-10-10 12:07 ` Brian Foster 2014-10-11 4:19 ` Eric Sandeen 0 siblings, 1 reply; 3+ messages in thread From: Brian Foster @ 2014-10-10 12:07 UTC (permalink / raw) To: Xu Wang; +Cc: xfs On Fri, Oct 10, 2014 at 03:31:46AM -0400, Xu Wang wrote: > From 59c6babab7c4ce8708036018143d2acc1477cc7f Mon Sep 17 00:00:00 2001 > From: George Wang <xuw@redhat.com> > Date: Fri, 10 Oct 2014 15:02:14 +0800 > Subject: [PATCH] [PATCH] xfs: lock for removing item from cil->xc_cil in > xlog_cil_push > > There is a race condition when xlog_cil_force_lsn and xlog_cil_push for > cil->xc_cil items. When function xlog_cil_push_now called in > xlog_cil_force_lsn, and the xlog_cil_push run in workqueue just reaches the point for > setting cil->xc_current_sequence, the xlog_cil_force_lsn got the lock > and check the "sequence == cil->xc_current_sequence && > !list_empty(&cil->xc_cil)" for restart. The statement "sequence == > cil->xc_current_sequence" is true, but the cil->xc_cil is empty > according to the xlog_cil_push removed the items in cil->xc_cil without > lock protectionl. So the function xlog_cil_force_lsn will return > NULLCOMMITLSN, which means the cil log for current sequence will not be > submitted to disk. And if there is no more operations(for example, when > unmount fs, this situation happened in > xfs_unmountfs->xfs_log_sbcount->xfs_trans_commit), the xfs will hang up. > > Signed-off-by: George Wang <xuw@redhat.com> > --- Isn't this fixed by the following commit? 8af3dcd3 xfs: xlog_cil_force_lsn doesn't always wait correctly With that patch, xlog_cil_push() adds the ctx to the committing list before draining the ctx and updating the current sequence number. xlog_cil_force_lsn() walks the committing list and checks the sequence number and ctx list, all under the push lock. That means that xlog_cil_force_lsn() should see the ctx either on the committing list and wait for it (commit_lsn != NULL), or not on the committing list at all. If it's not on the committing list yet, xlog_cil_push() won't have either drained the ctx or updated the sequence number, as it adds to the committing list first and that requires the push lock (which xlog_cil_force_lsn() holds and doesn't release until after the restart check). Am I missing something? Do you reproduce a problem with the latest tree that includes the above patch? Brian > fs/xfs/xfs_log_cil.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c > index f6b79e5..97ba527 100644 > --- a/fs/xfs/xfs_log_cil.c > +++ b/fs/xfs/xfs_log_cil.c > @@ -478,6 +478,8 @@ xlog_cil_push( > */ > lv = NULL; > num_iovecs = 0; > + > + spin_lock(&cil->xc_push_lock); > while (!list_empty(&cil->xc_cil)) { > struct xfs_log_item *item; > > @@ -530,7 +532,6 @@ xlog_cil_push( > * against the current sequence in log forces without risking > * deferencing a freed context pointer. > */ > - spin_lock(&cil->xc_push_lock); > cil->xc_current_sequence = new_ctx->sequence; > list_add(&ctx->committing, &cil->xc_committing); > spin_unlock(&cil->xc_push_lock); > -- > 1.9.3 > > > -- > George Wang 王旭 > > Kernel Quantity Engineer > Red Hat Software (Beijing) Co.,Ltd > IRC:xuw > Tel:+86-010-62608041 > Phone:15901231579 > 9/F, Tower C, Raycom > > _______________________________________________ > xfs mailing list > xfs@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/xfs _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push 2014-10-10 12:07 ` Brian Foster @ 2014-10-11 4:19 ` Eric Sandeen 0 siblings, 0 replies; 3+ messages in thread From: Eric Sandeen @ 2014-10-11 4:19 UTC (permalink / raw) To: Brian Foster, Xu Wang; +Cc: xfs On 10/10/14 7:07 AM, Brian Foster wrote: > Isn't this fixed by the following commit? > > 8af3dcd3 xfs: xlog_cil_force_lsn doesn't always wait correctly > > With that patch, xlog_cil_push() adds the ctx to the committing list > before draining the ctx and updating the current sequence number. > xlog_cil_force_lsn() walks the committing list and checks the sequence > number and ctx list, all under the push lock. > > That means that xlog_cil_force_lsn() should see the ctx either on the > committing list and wait for it (commit_lsn != NULL), or not on the > committing list at all. If it's not on the committing list yet, > xlog_cil_push() won't have either drained the ctx or updated the > sequence number, as it adds to the committing list first and that > requires the push lock (which xlog_cil_force_lsn() holds and doesn't > release until after the restart check). > > Am I missing something? Do you reproduce a problem with the latest tree > that includes the above patch? We clarified that Dave's patch does fix the problem; this was a Red Hat bug, and I hadn't put the upstream commit into it yet, sorry about that. George, thanks for working on it! -Eric > Brian _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2014-10-11 4:19 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <929991500.42327290.1412926226461.JavaMail.zimbra@redhat.com>
2014-10-10 7:31 ` [PATCH] xfs: lock for removing item from cil->xc_cil in xlog_cil_push Xu Wang
2014-10-10 12:07 ` Brian Foster
2014-10-11 4:19 ` Eric Sandeen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox