* [PATCH] xfs: flush workers before stopping log
[not found] <20120829134624.316257238@sgi.com>
@ 2012-08-29 13:46 ` tinguely
2012-08-29 14:31 ` Mark Tinguely
2012-08-30 0:23 ` Dave Chinner
0 siblings, 2 replies; 15+ messages in thread
From: tinguely @ 2012-08-29 13:46 UTC (permalink / raw)
To: xfs
[-- Attachment #1: xfs-flush_workers_before_stopping_log.patch --]
[-- Type: text/plain, Size: 7012 bytes --]
The unmount race continues with our test boxes.
The below trace gave the clue that there is a write of the superblock
after the log UNMOUNT record and xfs_logprint confirmed this write.
A couple different experiments points to the sync worker. The simplest
solution is to moved the final flush of the workers before the final
superblock write so there is no other filesystem activity after the
UNMOUNT record is written to the log.
PID: 21602 TASK: ee9df060 CPU: 0 COMMAND: "kworker/0:3"
#0 [c5377d28] crash_kexec at c0292c94
#1 [c5377d80] oops_end at c07090c2
#2 [c5377d98] no_context at c06f614e
#3 [c5377dbc] __bad_area_nosemaphore at c06f6281
#4 [c5377df4] bad_area_nosemaphore at c06f629b
#5 [c5377e00] do_page_fault at c070b0cb
#6 [c5377e7c] error_code (via page_fault) at c070892c
EAX: f300c6a8 EBX: f300c6a8 ECX: 000000c0 EDX: 000000c0 EBP: c5377ed0
DS: 007b ESI: 00000000 ES: 007b EDI: 00000001 GS: ffffad20
CS: 0060 EIP: c0481ad0 ERR: ffffffff EFLAGS: 00010246
#7 [c5377eb0] atomic64_read_cx8 at c0481ad0
#8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
#9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834
PID: 26653 TASK: e79143b0 CPU: 3 COMMAND: "umount"
#0 [cde0fda0] __schedule at c0706595
#1 [cde0fe28] schedule at c0706b89
#2 [cde0fe30] schedule_timeout at c0705600
#3 [cde0fe94] __down_common at c0706098
#4 [cde0fec8] __down at c0706122
#5 [cde0fed0] down at c025936f
#6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
#7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
#8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
#9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66
---- xfs_logprint ----
cycle: 10 version: 2 lsn: 10,4858 tail_lsn: 10,4856
length of Log Record: 512 prev offset: 4856 num ops: 5
uuid: aa1f1bfe-c226-42ae-a3db-ae0d4fc62938 format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 2ce1d11b len: 0 clientid: TRANS flags: START
----------------------------------------------------------------------------
Oper (1): tid: 2ce1d11b len: 16 clientid: TRANS flags: none
TRAN: type: CHECKPOINT tid: 2ce1d11b num_items: 2
----------------------------------------------------------------------------
Oper (2): tid: 2ce1d11b len: 24 clientid: TRANS flags: none
BUF: #regs: 2 start blkno: 0 (0x0) len: 1 bmap size: 1 flags: 0x0
Oper (3): tid: 2ce1d11b len: 128 clientid: TRANS flags: none
SUPER BLOCK Buffer:
icount: 85952 ifree: 150 fdblks: 344729 frext: 0
----------------------------------------------------------------------------
Oper (4): tid: 2ce1d11b len: 0 clientid: TRANS flags: COMMIT
============================================================================
cycle: 10 version: 2 lsn: 10,4860 tail_lsn: 10,4858
length of Log Record: 512 prev offset: 4858 num ops: 1
uuid: aa1f1bfe-c226-42ae-a3db-ae0d4fc62938 format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 9328d157 len: 8 clientid: LOG flags: UNMOUNT
Unmount filesystem
============================================================================
cycle: 10 version: 2 lsn: 10,4862 tail_lsn: 10,4860
length of Log Record: 512 prev offset: 4860 num ops: 5
uuid: aa1f1bfe-c226-42ae-a3db-ae0d4fc62938 format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: b6b71c0c len: 0 clientid: TRANS flags: START
----------------------------------------------------------------------------
Oper (1): tid: b6b71c0c len: 16 clientid: TRANS flags: none
TRAN: type: CHECKPOINT tid: b6b71c0c num_items: 2
----------------------------------------------------------------------------
Oper (2): tid: b6b71c0c len: 24 clientid: TRANS flags: none
BUF: #regs: 2 start blkno: 0 (0x0) len: 1 bmap size: 1 flags: 0x0
Oper (3): tid: b6b71c0c len: 128 clientid: TRANS flags: none
SUPER BLOCK Buffer:
icount: 6360863066640355328 ifree: 524288 fdblks: 0 frext: 0
----------------------------------------------------------------------------
Oper (4): tid: b6b71c0c len: 0 clientid: TRANS flags: COMMIT
============================================================================
xfs_logprint: logical end of log
============================================================================
There should be no more I/O after the UNMOUNT record is written to the log.
Flush the workers before the final sync of the superblock, write of the
UNMOUNT log record and tearing down the log.
This earlier flush prevents a late write of the superblock that raced with
the fiesystem shutdown.
---
fs/xfs/xfs_log.c | 1 -
fs/xfs/xfs_mount.c | 5 +++++
fs/xfs/xfs_super.c | 1 -
3 files changed, 5 insertions(+), 2 deletions(-)
Index: b/fs/xfs/xfs_log.c
===================================================================
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -858,7 +858,6 @@ xfs_log_unmount_write(xfs_mount_t *mp)
void
xfs_log_unmount(xfs_mount_t *mp)
{
- cancel_delayed_work_sync(&mp->m_sync_work);
xfs_trans_ail_destroy(mp);
xlog_dealloc_log(mp->m_log);
}
Index: b/fs/xfs/xfs_mount.c
===================================================================
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1490,6 +1490,11 @@ xfs_unmountfs(
xfs_qm_unmount(mp);
+ /* flush the worker queues while the log still exists and
+ * before the final sync and unmount record.
+ */
+ xfs_syncd_stop(mp);
+
/*
* Flush out the log synchronously so that we know for sure
* that nothing is pinned. This is important because bflush()
Index: b/fs/xfs/xfs_super.c
===================================================================
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -920,7 +920,6 @@ xfs_fs_put_super(
xfs_filestream_unmount(mp);
xfs_unmountfs(mp);
- xfs_syncd_stop(mp);
xfs_freesb(mp);
xfs_icsb_destroy_counters(mp);
xfs_destroy_mount_workqueues(mp);
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] xfs: flush workers before stopping log
2012-08-29 13:46 ` [PATCH] xfs: flush workers before stopping log tinguely
@ 2012-08-29 14:31 ` Mark Tinguely
2012-08-30 0:23 ` Dave Chinner
1 sibling, 0 replies; 15+ messages in thread
From: Mark Tinguely @ 2012-08-29 14:31 UTC (permalink / raw)
To: xfs
Sorry the log print was trucated.
The superblock/UNMOUNT/superblock log is here:
cycle: 10 version: 2 lsn: 10,4858 tail_lsn: 10,4856
length of Log Record: 512 prev offset: 4856 num ops: 5
uuid: aa1f1bfe-c226-42ae-a3db-ae0d4fc62938 format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 2ce1d11b len: 0 clientid: TRANS flags: START
----------------------------------------------------------------------------
Oper (1): tid: 2ce1d11b len: 16 clientid: TRANS flags: none
TRAN: type: CHECKPOINT tid: 2ce1d11b num_items: 2
----------------------------------------------------------------------------
Oper (2): tid: 2ce1d11b len: 24 clientid: TRANS flags: none
BUF: #regs: 2 start blkno: 0 (0x0) len: 1 bmap size: 1 flags: 0x0
Oper (3): tid: 2ce1d11b len: 128 clientid: TRANS flags: none
SUPER BLOCK Buffer:
icount: 85952 ifree: 150 fdblks: 344729 frext: 0
----------------------------------------------------------------------------
Oper (4): tid: 2ce1d11b len: 0 clientid: TRANS flags: COMMIT
============================================================================
cycle: 10 version: 2 lsn: 10,4860 tail_lsn: 10,4858
length of Log Record: 512 prev offset: 4858 num ops: 1
uuid: aa1f1bfe-c226-42ae-a3db-ae0d4fc62938 format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: 9328d157 len: 8 clientid: LOG flags: UNMOUNT
Unmount filesystem
============================================================================
cycle: 10 version: 2 lsn: 10,4862 tail_lsn: 10,4860
length of Log Record: 512 prev offset: 4860 num ops: 5
uuid: aa1f1bfe-c226-42ae-a3db-ae0d4fc62938 format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
Oper (0): tid: b6b71c0c len: 0 clientid: TRANS flags: START
----------------------------------------------------------------------------
Oper (1): tid: b6b71c0c len: 16 clientid: TRANS flags: none
TRAN: type: CHECKPOINT tid: b6b71c0c num_items: 2
----------------------------------------------------------------------------
Oper (2): tid: b6b71c0c len: 24 clientid: TRANS flags: none
BUF: #regs: 2 start blkno: 0 (0x0) len: 1 bmap size: 1 flags: 0x0
Oper (3): tid: b6b71c0c len: 128 clientid: TRANS flags: none
SUPER BLOCK Buffer:
icount: 6360863066640355328 ifree: 524288 fdblks: 0 frext: 0
----------------------------------------------------------------------------
Oper (4): tid: b6b71c0c len: 0 clientid: TRANS flags: COMMIT
============================================================================
xfs_logprint: logical end of log
============================================================================
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] xfs: flush workers before stopping log
2012-08-29 13:46 ` [PATCH] xfs: flush workers before stopping log tinguely
2012-08-29 14:31 ` Mark Tinguely
@ 2012-08-30 0:23 ` Dave Chinner
2012-08-30 17:25 ` Ben Myers
1 sibling, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2012-08-30 0:23 UTC (permalink / raw)
To: tinguely; +Cc: xfs
On Wed, Aug 29, 2012 at 08:46:25AM -0500, tinguely@sgi.com wrote:
> The unmount race continues with our test boxes.
>
> The below trace gave the clue that there is a write of the superblock
> after the log UNMOUNT record and xfs_logprint confirmed this write.
>
> A couple different experiments points to the sync worker. The simplest
> solution is to moved the final flush of the workers before the final
> superblock write so there is no other filesystem activity after the
> UNMOUNT record is written to the log.
....
> #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
> #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
> #10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
> #11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
> #12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
> #13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
> #14 [c5377f58] process_one_work at c024ee4c
> #15 [c5377f98] worker_thread at c024f43d
> #16 [c5377fbc] kthread at c025326b
> #17 [c5377fe8] kernel_thread_helper at c070e834
>
> PID: 26653 TASK: e79143b0 CPU: 3 COMMAND: "umount"
> #0 [cde0fda0] __schedule at c0706595
> #1 [cde0fe28] schedule at c0706b89
> #2 [cde0fe30] schedule_timeout at c0705600
> #3 [cde0fe94] __down_common at c0706098
> #4 [cde0fec8] __down at c0706122
> #5 [cde0fed0] down at c025936f
> #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
> #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
OK, so you've got IO on the superblock buffer still active when the
superblock is being freed.
> There should be no more I/O after the UNMOUNT record is written to the log.
That depends - a freeze leaves the filesystem in exactly this state.
:)
> Flush the workers before the final sync of the superblock, write of the
> UNMOUNT log record and tearing down the log.
>
> This earlier flush prevents a late write of the superblock that raced with
> the fiesystem shutdown.
I'm not sure the xfs_sync_work can be responsible for this - the
xfs_sync_worker() has a MS_ACTIVE guard on it, so it will not log a
dummy record (superblock) during the unmount procedure, nor does it
dispatch supblock buffer IO, so it can not be responsible for the
item in the log after the unmount record or the IO that is being
run.
> Index: b/fs/xfs/xfs_mount.c
> ===================================================================
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1490,6 +1490,11 @@ xfs_unmountfs(
>
> xfs_qm_unmount(mp);
>
> + /* flush the worker queues while the log still exists and
> + * before the final sync and unmount record.
> + */
> + xfs_syncd_stop(mp);
xfs_syncd_stop() needs to die rather than being moved from place to
place every time some problem is seemd. I outlined what we need to
do to solve the problems once and for all a couple of months ago:
http://oss.sgi.com/archives/xfs/2012-06/msg00064.html
I could use a break from backporting and testing - I might do this
today...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] xfs: flush workers before stopping log
2012-08-30 0:23 ` Dave Chinner
@ 2012-08-30 17:25 ` Ben Myers
2012-08-30 22:35 ` Dave Chinner
0 siblings, 1 reply; 15+ messages in thread
From: Ben Myers @ 2012-08-30 17:25 UTC (permalink / raw)
To: Dave Chinner; +Cc: tinguely, xfs
Hi Dave,
On Thu, Aug 30, 2012 at 10:23:35AM +1000, Dave Chinner wrote:
> On Wed, Aug 29, 2012 at 08:46:25AM -0500, tinguely@sgi.com wrote:
> > The unmount race continues with our test boxes.
> >
> > The below trace gave the clue that there is a write of the superblock
> > after the log UNMOUNT record and xfs_logprint confirmed this write.
> >
> > A couple different experiments points to the sync worker. The simplest
> > solution is to moved the final flush of the workers before the final
> > superblock write so there is no other filesystem activity after the
> > UNMOUNT record is written to the log.
>
> ....
> > #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
> > #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
> > #10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
> > #11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
> > #12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
> > #13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
> > #14 [c5377f58] process_one_work at c024ee4c
> > #15 [c5377f98] worker_thread at c024f43d
> > #16 [c5377fbc] kthread at c025326b
> > #17 [c5377fe8] kernel_thread_helper at c070e834
> >
> > PID: 26653 TASK: e79143b0 CPU: 3 COMMAND: "umount"
> > #0 [cde0fda0] __schedule at c0706595
> > #1 [cde0fe28] schedule at c0706b89
> > #2 [cde0fe30] schedule_timeout at c0705600
> > #3 [cde0fe94] __down_common at c0706098
> > #4 [cde0fec8] __down at c0706122
> > #5 [cde0fed0] down at c025936f
> > #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
> > #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
>
> OK, so you've got IO on the superblock buffer still active when the
> superblock is being freed.
>
> > There should be no more I/O after the UNMOUNT record is written to the log.
>
> That depends - a freeze leaves the filesystem in exactly this state.
> :)
>
> > Flush the workers before the final sync of the superblock, write of the
> > UNMOUNT log record and tearing down the log.
> >
> > This earlier flush prevents a late write of the superblock that raced with
> > the fiesystem shutdown.
>
> I'm not sure the xfs_sync_work can be responsible for this - the
> xfs_sync_worker() has a MS_ACTIVE guard on it, so it will not log a
> dummy record (superblock) during the unmount procedure, nor does it
> dispatch supblock buffer IO, so it can not be responsible for the
> item in the log after the unmount record or the IO that is being
> run.
>From what I've seen, with usage of the sync worker as of commit 1307bbd 'xfs:
protect xfs_sync_worker with s_umount semaphore' we have very solid test runs.
Unfortunately, as of commit 8866fc6 'xfs: shutdown xfs_sync_worker before the
log' we have started hitting these crashes again on an occasional basis. With
this newest patch we have not reproduced these crashes and it has been well by
Mark and on my dev boxes.
> > Index: b/fs/xfs/xfs_mount.c
> > ===================================================================
> > --- a/fs/xfs/xfs_mount.c
> > +++ b/fs/xfs/xfs_mount.c
> > @@ -1490,6 +1490,11 @@ xfs_unmountfs(
> >
> > xfs_qm_unmount(mp);
> >
> > + /* flush the worker queues while the log still exists and
> > + * before the final sync and unmount record.
> > + */
> > + xfs_syncd_stop(mp);
>
> xfs_syncd_stop() needs to die rather than being moved from place to
> place every time some problem is seemd. I outlined what we need to
> do to solve the problems once and for all a couple of months ago:
>
> http://oss.sgi.com/archives/xfs/2012-06/msg00064.html
Yikes, that patchset you've posted is very nice. Probably it is appropriate
for the 3.7 merge window. We also need a fix for 3.5-stable and 3.6. Do you
have a suggestion for that? It looks like you've also moved the final log
force to after cancellation of the work queue in patch 6, similar to what Mark
has done... so it would seem that Mark has the right idea. I think Mark's
patch is appropriate for 3.6 and 3.5-stable inclusion to fix the regression
introduced by me in 8866fc6.
What is your opinion?
Regards,
Ben
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] xfs: flush workers before stopping log
2012-08-30 17:25 ` Ben Myers
@ 2012-08-30 22:35 ` Dave Chinner
2012-08-31 18:15 ` Mark Tinguely
0 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2012-08-30 22:35 UTC (permalink / raw)
To: Ben Myers; +Cc: tinguely, xfs
On Thu, Aug 30, 2012 at 12:25:49PM -0500, Ben Myers wrote:
> On Thu, Aug 30, 2012 at 10:23:35AM +1000, Dave Chinner wrote:
> > On Wed, Aug 29, 2012 at 08:46:25AM -0500, tinguely@sgi.com wrote:
> > > Index: b/fs/xfs/xfs_mount.c
> > > ===================================================================
> > > --- a/fs/xfs/xfs_mount.c
> > > +++ b/fs/xfs/xfs_mount.c
> > > @@ -1490,6 +1490,11 @@ xfs_unmountfs(
> > >
> > > xfs_qm_unmount(mp);
> > >
> > > + /* flush the worker queues while the log still exists and
> > > + * before the final sync and unmount record.
> > > + */
> > > + xfs_syncd_stop(mp);
> >
> > xfs_syncd_stop() needs to die rather than being moved from place to
> > place every time some problem is seemd. I outlined what we need to
> > do to solve the problems once and for all a couple of months ago:
> >
> > http://oss.sgi.com/archives/xfs/2012-06/msg00064.html
>
> Yikes, that patchset you've posted is very nice. Probably it is appropriate
> for the 3.7 merge window. We also need a fix for 3.5-stable and 3.6. Do you
> have a suggestion for that?
3.7 - it's too late for 3.6.
> It looks like you've also moved the final log
> force to after cancellation of the work queue in patch 6, similar to what Mark
> has done... so it would seem that Mark has the right idea. I think Mark's
> patch is appropriate for 3.6 and 3.5-stable inclusion to fix the regression
> introduced by me in 8866fc6.
>
> What is your opinion?
Don't do unnecessary work. :)
Realistically, the question that I always ask myself in this
situation is: do we really need to fix it right now?
Backports to stable kernels are typically done in response to user
reported issues. If users are not hitting the problem, then there's
no compelling reason for backporting it. i.e. the answer is "no".
If someone is hitting the problem, then the answer is "yes" and we
need a temporary fix until the proper fix is done.
To that extent, I've seen one user report of an unmount problem in 3.5
(reported on #xfs), but it's not clear if the problem they hit was
this problem or the fact that the superblock IO is not being waited
for during unmount in 3.5. This was fixed in 3.6 by commit 9a57fa8e
("xfs: wait for the write the superblock on unmount"):
http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs.git;a=commit;h=9a57fa8ee7c29e11c2a29ce058573ba99157eda7
So, if we take this reported as a "yes, we need to fix it" trigger,
then we've got a couple of fixes that will need to be backported to
3.5.x..
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] xfs: flush workers before stopping log
2012-08-30 22:35 ` Dave Chinner
@ 2012-08-31 18:15 ` Mark Tinguely
2012-09-01 23:08 ` Christoph Hellwig
0 siblings, 1 reply; 15+ messages in thread
From: Mark Tinguely @ 2012-08-31 18:15 UTC (permalink / raw)
To: Dave Chinner; +Cc: Ben Myers, xfs
On 08/30/12 17:35, Dave Chinner wrote:
> On Thu, Aug 30, 2012 at 12:25:49PM -0500, Ben Myers wrote:
>> On Thu, Aug 30, 2012 at 10:23:35AM +1000, Dave Chinner wrote:
>>> On Wed, Aug 29, 2012 at 08:46:25AM -0500, tinguely@sgi.com wrote:
>>>> Index: b/fs/xfs/xfs_mount.c
>>>> ===================================================================
>>>> --- a/fs/xfs/xfs_mount.c
>>>> +++ b/fs/xfs/xfs_mount.c
>>>> @@ -1490,6 +1490,11 @@ xfs_unmountfs(
>>>>
>>>> xfs_qm_unmount(mp);
>>>>
>>>> + /* flush the worker queues while the log still exists and
>>>> + * before the final sync and unmount record.
>>>> + */
>>>> + xfs_syncd_stop(mp);
>>>
>>> xfs_syncd_stop() needs to die rather than being moved from place to
>>> place every time some problem is seemd. I outlined what we need to
>>> do to solve the problems once and for all a couple of months ago:
>>>
>>> http://oss.sgi.com/archives/xfs/2012-06/msg00064.html
>>
>> Yikes, that patchset you've posted is very nice. Probably it is appropriate
>> for the 3.7 merge window. We also need a fix for 3.5-stable and 3.6. Do you
>> have a suggestion for that?
>
> 3.7 - it's too late for 3.6.
>
>> It looks like you've also moved the final log
>> force to after cancellation of the work queue in patch 6, similar to what Mark
>> has done... so it would seem that Mark has the right idea. I think Mark's
>> patch is appropriate for 3.6 and 3.5-stable inclusion to fix the regression
>> introduced by me in 8866fc6.
>>
>> What is your opinion?
>
> Don't do unnecessary work. :)
>
> Realistically, the question that I always ask myself in this
> situation is: do we really need to fix it right now?
>
> Backports to stable kernels are typically done in response to user
> reported issues. If users are not hitting the problem, then there's
> no compelling reason for backporting it. i.e. the answer is "no".
> If someone is hitting the problem, then the answer is "yes" and we
> need a temporary fix until the proper fix is done.
>
> To that extent, I've seen one user report of an unmount problem in 3.5
> (reported on #xfs), but it's not clear if the problem they hit was
> this problem or the fact that the superblock IO is not being waited
> for during unmount in 3.5. This was fixed in 3.6 by commit 9a57fa8e
> ("xfs: wait for the write the superblock on unmount"):
>
> http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs.git;a=commit;h=9a57fa8ee7c29e11c2a29ce058573ba99157eda7
>
> So, if we take this reported as a "yes, we need to fix it" trigger,
> then we've got a couple of fixes that will need to be backported to
> 3.5.x..
>
> Cheers,
>
> Dave.
You are correct that Ben, Phil White, and myself have seen the problem -
yes all internal, but then we probably do more filesystem unmounts than
the ordinary host. Maybe users have good habits such as syncing 3 times
before umounting the filesystem (keep it up!) and they avoid this nastiness.
I see your point on fixing problems in older branches when/if they are
reported by an user. I am not glowing with pride with the patch, it is
something that survived a week of testing that would cause a panic in a
couple hours without the patch. Since we hit this problem with such
frequency, that we wanted to push for a little proactive attention to
prevent future panics.
Thanks,
--Mark Tinguely.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] xfs: flush workers before stopping log
2012-08-31 18:15 ` Mark Tinguely
@ 2012-09-01 23:08 ` Christoph Hellwig
2012-09-12 18:33 ` xfs: stop the sync worker before xfs_unmountfs Ben Myers
0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2012-09-01 23:08 UTC (permalink / raw)
To: Mark Tinguely; +Cc: Ben Myers, xfs
On Fri, Aug 31, 2012 at 01:15:01PM -0500, Mark Tinguely wrote:
> I see your point on fixing problems in older branches when/if they
> are reported by an user. I am not glowing with pride with the patch,
> it is something that survived a week of testing that would cause a
> panic in a couple hours without the patch. Since we hit this problem
> with such frequency, that we wanted to push for a little proactive
> attention to prevent future panics.
I'd love to see a relatively minimal patch which can also be backported.
I also have to say that your patch as-is scares me a bit. Everytime we
move the current xfs_sync_init/stop monsters around we created another
set of problems. So I'd prefer at least taking the bits from Dave's
series that kill these helpers and do individual calls, and only move
those that are needed. The other thing that I don't like about the
patch is that it causes assymetry in the mount/unmount path by moving
the stop into xfs_unmountfs but not the start into xfs_mountfs. That
might be needed in some cases, but that should be some detailed comments
explaining why.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* xfs: stop the sync worker before xfs_unmountfs
2012-09-01 23:08 ` Christoph Hellwig
@ 2012-09-12 18:33 ` Ben Myers
2012-09-12 23:14 ` Dave Chinner
2012-09-13 8:17 ` Christoph Hellwig
0 siblings, 2 replies; 15+ messages in thread
From: Ben Myers @ 2012-09-12 18:33 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Mark Tinguely, xfs
See what you think of this. Not heavily tested yet, and not pretty... but it
is fairly minimal.
Pull startup and shutdown of the sync worker out of xfs_syncd_init and
xfs_syncd_stop into their callers. In the case of unmount, move shutdown of
the xfs_sync_worker before teardown of the log in xfs_unmountfs. This prevents
occasional crashes on unmount like so:
PID: 21602 TASK: ee9df060 CPU: 0 COMMAND: "kworker/0:3"
#0 [c5377d28] crash_kexec at c0292c94
#1 [c5377d80] oops_end at c07090c2
#2 [c5377d98] no_context at c06f614e
#3 [c5377dbc] __bad_area_nosemaphore at c06f6281
#4 [c5377df4] bad_area_nosemaphore at c06f629b
#5 [c5377e00] do_page_fault at c070b0cb
#6 [c5377e7c] error_code (via page_fault) at c070892c
EAX: f300c6a8 EBX: f300c6a8 ECX: 000000c0 EDX: 000000c0 EBP: c5377ed0
DS: 007b ESI: 00000000 ES: 007b EDI: 00000001 GS: ffffad20
CS: 0060 EIP: c0481ad0 ERR: ffffffff EFLAGS: 00010246
#7 [c5377eb0] atomic64_read_cx8 at c0481ad0
#8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
#9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834
PID: 26653 TASK: e79143b0 CPU: 3 COMMAND: "umount"
#0 [cde0fda0] __schedule at c0706595
#1 [cde0fe28] schedule at c0706b89
#2 [cde0fe30] schedule_timeout at c0705600
#3 [cde0fe94] __down_common at c0706098
#4 [cde0fec8] __down at c0706122
#5 [cde0fed0] down at c025936f
#6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
#7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
#8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
#9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66
Signed-off-by: Ben Myers <bpm@sgi.com>
Index: xfs/fs/xfs/xfs_super.c
===================================================================
--- xfs.orig/fs/xfs/xfs_super.c
+++ xfs/fs/xfs/xfs_super.c
@@ -919,6 +919,7 @@ xfs_fs_put_super(
struct xfs_mount *mp = XFS_M(sb);
xfs_filestream_unmount(mp);
+ cancel_delayed_work_sync(&mp->m_sync_work);
xfs_unmountfs(mp);
xfs_syncd_stop(mp);
xfs_freesb(mp);
@@ -1210,6 +1211,9 @@ xfs_finish_flags(
return 0;
}
+extern void xfs_sync_worker(struct work_struct *);
+extern void xfs_syncd_queue_sync(struct xfs_mount *);
+
STATIC int
xfs_fs_fill_super(
struct super_block *sb,
@@ -1291,6 +1295,8 @@ xfs_fs_fill_super(
set_posix_acl_flag(sb);
error = xfs_syncd_init(mp);
+ INIT_DELAYED_WORK(&mp->m_sync_work, xfs_sync_worker);
+ xfs_syncd_queue_sync(mp);
if (error)
goto out_filestream_unmount;
@@ -1315,6 +1321,7 @@ xfs_fs_fill_super(
return 0;
out_syncd_stop:
+ cancel_delayed_work_sync(&mp->m_sync_work);
xfs_syncd_stop(mp);
out_filestream_unmount:
xfs_filestream_unmount(mp);
@@ -1335,6 +1342,7 @@ out_destroy_workqueues:
out_unmount:
xfs_filestream_unmount(mp);
xfs_unmountfs(mp);
+ cancel_delayed_work_sync(&mp->m_sync_work);
xfs_syncd_stop(mp);
goto out_free_sb;
}
Index: xfs/fs/xfs/xfs_sync.c
===================================================================
--- xfs.orig/fs/xfs/xfs_sync.c
+++ xfs/fs/xfs/xfs_sync.c
@@ -370,7 +370,7 @@ xfs_quiesce_attr(
xfs_buf_unlock(mp->m_sb_bp);
}
-static void
+void
xfs_syncd_queue_sync(
struct xfs_mount *mp)
{
@@ -383,7 +383,7 @@ xfs_syncd_queue_sync(
* disk quotas. We might need to cover the log to indicate that the
* filesystem is idle and not frozen.
*/
-STATIC void
+void
xfs_sync_worker(
struct work_struct *work)
{
@@ -494,11 +494,8 @@ xfs_syncd_init(
struct xfs_mount *mp)
{
INIT_WORK(&mp->m_flush_work, xfs_flush_worker);
- INIT_DELAYED_WORK(&mp->m_sync_work, xfs_sync_worker);
INIT_DELAYED_WORK(&mp->m_reclaim_work, xfs_reclaim_worker);
- xfs_syncd_queue_sync(mp);
-
return 0;
}
@@ -506,7 +503,6 @@ void
xfs_syncd_stop(
struct xfs_mount *mp)
{
- cancel_delayed_work_sync(&mp->m_sync_work);
cancel_delayed_work_sync(&mp->m_reclaim_work);
cancel_work_sync(&mp->m_flush_work);
}
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: xfs: stop the sync worker before xfs_unmountfs
2012-09-12 18:33 ` xfs: stop the sync worker before xfs_unmountfs Ben Myers
@ 2012-09-12 23:14 ` Dave Chinner
2012-09-13 16:43 ` Ben Myers
2012-09-13 8:17 ` Christoph Hellwig
1 sibling, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2012-09-12 23:14 UTC (permalink / raw)
To: Ben Myers; +Cc: Christoph Hellwig, Mark Tinguely, xfs
On Wed, Sep 12, 2012 at 01:33:47PM -0500, Ben Myers wrote:
> See what you think of this. Not heavily tested yet, and not pretty... but it
> is fairly minimal.
....
> Signed-off-by: Ben Myers <bpm@sgi.com>
>
> Index: xfs/fs/xfs/xfs_super.c
> ===================================================================
> --- xfs.orig/fs/xfs/xfs_super.c
> +++ xfs/fs/xfs/xfs_super.c
> @@ -919,6 +919,7 @@ xfs_fs_put_super(
> struct xfs_mount *mp = XFS_M(sb);
>
> xfs_filestream_unmount(mp);
> + cancel_delayed_work_sync(&mp->m_sync_work);
> xfs_unmountfs(mp);
> xfs_syncd_stop(mp);
> xfs_freesb(mp);
This is the only hunk in the patch needed to fix the problem.
The rest of the patch does not change the order in which the sync
worker is started and stopped - it just open codes it next to the
xfs_syncd_start/stop calls. Essentially, all it does is obfuscate
the real fix that is being made here and makes it harder to
understand what the relationship between xfs_sync_worker() and
xfs_syncd_start/stop is supposed to be.
So a minimal patch, IMO, is just the above hunk....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: xfs: stop the sync worker before xfs_unmountfs
2012-09-12 18:33 ` xfs: stop the sync worker before xfs_unmountfs Ben Myers
2012-09-12 23:14 ` Dave Chinner
@ 2012-09-13 8:17 ` Christoph Hellwig
2012-09-13 21:19 ` Ben Myers
1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2012-09-13 8:17 UTC (permalink / raw)
To: Ben Myers; +Cc: Christoph Hellwig, Mark Tinguely, xfs
On Wed, Sep 12, 2012 at 01:33:47PM -0500, Ben Myers wrote:
> See what you think of this. Not heavily tested yet, and not pretty... but it
> is fairly minimal.
>
> Pull startup and shutdown of the sync worker out of xfs_syncd_init and
> xfs_syncd_stop into their callers. In the case of unmount, move shutdown of
> the xfs_sync_worker before teardown of the log in xfs_unmountfs. This prevents
> occasional crashes on unmount like so:
>
> PID: 21602 TASK: ee9df060 CPU: 0 COMMAND: "kworker/0:3"
> #0 [c5377d28] crash_kexec at c0292c94
Can you remove the trailing whitespaces in these lines? They make
reading the changelog on a normal 80 character wide terminal pretty
hard.
The changes look fine to me,
Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: xfs: stop the sync worker before xfs_unmountfs
2012-09-12 23:14 ` Dave Chinner
@ 2012-09-13 16:43 ` Ben Myers
2012-09-13 21:18 ` [PATCH] " Ben Myers
0 siblings, 1 reply; 15+ messages in thread
From: Ben Myers @ 2012-09-13 16:43 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, Mark Tinguely, xfs
Hi Dave,
On Thu, Sep 13, 2012 at 09:14:06AM +1000, Dave Chinner wrote:
> On Wed, Sep 12, 2012 at 01:33:47PM -0500, Ben Myers wrote:
> > See what you think of this. Not heavily tested yet, and not pretty... but it
> > is fairly minimal.
> ....
> > Signed-off-by: Ben Myers <bpm@sgi.com>
> >
> > Index: xfs/fs/xfs/xfs_super.c
> > ===================================================================
> > --- xfs.orig/fs/xfs/xfs_super.c
> > +++ xfs/fs/xfs/xfs_super.c
> > @@ -919,6 +919,7 @@ xfs_fs_put_super(
> > struct xfs_mount *mp = XFS_M(sb);
> >
> > xfs_filestream_unmount(mp);
> > + cancel_delayed_work_sync(&mp->m_sync_work);
> > xfs_unmountfs(mp);
> > xfs_syncd_stop(mp);
> > xfs_freesb(mp);
>
> This is the only hunk in the patch needed to fix the problem.
>
> The rest of the patch does not change the order in which the sync
> worker is started and stopped - it just open codes it next to the
> xfs_syncd_start/stop calls. Essentially, all it does is obfuscate
> the real fix that is being made here and makes it harder to
> understand what the relationship between xfs_sync_worker() and
> xfs_syncd_start/stop is supposed to be.
>
> So a minimal patch, IMO, is just the above hunk....
Ah, you're right. I was working on the assumption that it is best not to
cancel the work twice. There really is no harm in cancel_delayed_work_sync
both in xfs_fs_put_super and in xfs_syncd_stop. However, we don't want them
spread around willy nilly. That can become obfuscatory too. I suggest we
should remove the cancel_delayed_work_sync(&mp->m_sync_work) in
xfs_log_unmount, leftover from commit 11159a05. It seems like that one hasn't
been effective.
Maybe we don't want to do that in this patch... I'll just add a comment and
repost.
Thanks,
Ben
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] xfs: stop the sync worker before xfs_unmountfs
2012-09-13 16:43 ` Ben Myers
@ 2012-09-13 21:18 ` Ben Myers
2012-09-14 1:07 ` Dave Chinner
2012-09-18 13:28 ` Mark Tinguely
0 siblings, 2 replies; 15+ messages in thread
From: Ben Myers @ 2012-09-13 21:18 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, Mark Tinguely, xfs
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs. This prevents occasional crashes on unmount like so:
PID: 21602 TASK: ee9df060 CPU: 0 COMMAND: "kworker/0:3"
#0 [c5377d28] crash_kexec at c0292c94
#1 [c5377d80] oops_end at c07090c2
#2 [c5377d98] no_context at c06f614e
#3 [c5377dbc] __bad_area_nosemaphore at c06f6281
#4 [c5377df4] bad_area_nosemaphore at c06f629b
#5 [c5377e00] do_page_fault at c070b0cb
#6 [c5377e7c] error_code (via page_fault) at c070892c
EAX: f300c6a8 EBX: f300c6a8 ECX: 000000c0 EDX: 000000c0 EBP: c5377ed0
DS: 007b ESI: 00000000 ES: 007b EDI: 00000001 GS: ffffad20
CS: 0060 EIP: c0481ad0 ERR: ffffffff EFLAGS: 00010246
#7 [c5377eb0] atomic64_read_cx8 at c0481ad0
#8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
#9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834
PID: 26653 TASK: e79143b0 CPU: 3 COMMAND: "umount"
#0 [cde0fda0] __schedule at c0706595
#1 [cde0fe28] schedule at c0706b89
#2 [cde0fe30] schedule_timeout at c0705600
#3 [cde0fe94] __down_common at c0706098
#4 [cde0fec8] __down at c0706122
#5 [cde0fed0] down at c025936f
#6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
#7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
#8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
#9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66
commit 11159a05 added this to xfs_log_unmount and needs to be cleaned up
at a later date.
Signed-off-by: Ben Myers <bpm@sgi.com>
Index: xfs/fs/xfs/xfs_super.c
===================================================================
--- xfs.orig/fs/xfs/xfs_super.c
+++ xfs/fs/xfs/xfs_super.c
@@ -919,6 +919,7 @@ xfs_fs_put_super(
struct xfs_mount *mp = XFS_M(sb);
xfs_filestream_unmount(mp);
+ cancel_delayed_work_sync(&mp->m_sync_work);
xfs_unmountfs(mp);
xfs_syncd_stop(mp);
xfs_freesb(mp);
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: xfs: stop the sync worker before xfs_unmountfs
2012-09-13 8:17 ` Christoph Hellwig
@ 2012-09-13 21:19 ` Ben Myers
0 siblings, 0 replies; 15+ messages in thread
From: Ben Myers @ 2012-09-13 21:19 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Mark Tinguely, xfs
On Thu, Sep 13, 2012 at 04:17:42AM -0400, Christoph Hellwig wrote:
> On Wed, Sep 12, 2012 at 01:33:47PM -0500, Ben Myers wrote:
> > See what you think of this. Not heavily tested yet, and not pretty... but it
> > is fairly minimal.
> >
> > Pull startup and shutdown of the sync worker out of xfs_syncd_init and
> > xfs_syncd_stop into their callers. In the case of unmount, move shutdown of
> > the xfs_sync_worker before teardown of the log in xfs_unmountfs. This prevents
> > occasional crashes on unmount like so:
> >
> > PID: 21602 TASK: ee9df060 CPU: 0 COMMAND: "kworker/0:3"
> > #0 [c5377d28] crash_kexec at c0292c94
>
> Can you remove the trailing whitespaces in these lines? They make
> reading the changelog on a normal 80 character wide terminal pretty
> hard.
Yeah. Sorry about that. Cleaned up.
> The changes look fine to me,
>
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
Thanks,
Ben
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] xfs: stop the sync worker before xfs_unmountfs
2012-09-13 21:18 ` [PATCH] " Ben Myers
@ 2012-09-14 1:07 ` Dave Chinner
2012-09-18 13:28 ` Mark Tinguely
1 sibling, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2012-09-14 1:07 UTC (permalink / raw)
To: Ben Myers; +Cc: Christoph Hellwig, Mark Tinguely, xfs
On Thu, Sep 13, 2012 at 04:18:47PM -0500, Ben Myers wrote:
> Cancel work of the xfs_sync_worker before teardown of the log in
> xfs_unmountfs. This prevents occasional crashes on unmount like so:
Looks OK.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] xfs: stop the sync worker before xfs_unmountfs
2012-09-13 21:18 ` [PATCH] " Ben Myers
2012-09-14 1:07 ` Dave Chinner
@ 2012-09-18 13:28 ` Mark Tinguely
1 sibling, 0 replies; 15+ messages in thread
From: Mark Tinguely @ 2012-09-18 13:28 UTC (permalink / raw)
To: Ben Myers; +Cc: xfs
On 09/13/12 16:18, Ben Myers wrote:
> Cancel work of the xfs_sync_worker before teardown of the log in
> xfs_unmountfs. This prevents occasional crashes on unmount like so:
...
> Signed-off-by: Ben Myers<bpm@sgi.com>
>
> Index: xfs/fs/xfs/xfs_super.c
> ===================================================================
> --- xfs.orig/fs/xfs/xfs_super.c
> +++ xfs/fs/xfs/xfs_super.c
> @@ -919,6 +919,7 @@ xfs_fs_put_super(
> struct xfs_mount *mp = XFS_M(sb);
>
> xfs_filestream_unmount(mp);
> + cancel_delayed_work_sync(&mp->m_sync_work);
> xfs_unmountfs(mp);
> xfs_syncd_stop(mp);
> xfs_freesb(mp);
>
Tests successfully on all 3 of the machines that were crashing without
the patch.
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-09-18 13:27 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20120829134624.316257238@sgi.com>
2012-08-29 13:46 ` [PATCH] xfs: flush workers before stopping log tinguely
2012-08-29 14:31 ` Mark Tinguely
2012-08-30 0:23 ` Dave Chinner
2012-08-30 17:25 ` Ben Myers
2012-08-30 22:35 ` Dave Chinner
2012-08-31 18:15 ` Mark Tinguely
2012-09-01 23:08 ` Christoph Hellwig
2012-09-12 18:33 ` xfs: stop the sync worker before xfs_unmountfs Ben Myers
2012-09-12 23:14 ` Dave Chinner
2012-09-13 16:43 ` Ben Myers
2012-09-13 21:18 ` [PATCH] " Ben Myers
2012-09-14 1:07 ` Dave Chinner
2012-09-18 13:28 ` Mark Tinguely
2012-09-13 8:17 ` Christoph Hellwig
2012-09-13 21:19 ` Ben Myers
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox