From: Dave Chinner <david@fromorbit.com>
To: Mark Tinguely <tinguely@sgi.com>
Cc: Ben Myers <bpm@sgi.com>, xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: shutdown xfs_sync_worker before the log
Date: Tue, 12 Jun 2012 09:36:18 +1000
Message-ID: <20120611233618.GD22848@dastard>
In-Reply-To: <4FD65F00.1010309@sgi.com>

On Mon, Jun 11, 2012 at 04:11:28PM -0500, Mark Tinguely wrote:
> On 06/11/12 15:45, Ben Myers wrote:
> ...
>
> >That sounds pretty good. In particular, I think that making the start
> >and stop of the workqueues correct should be the high priority. I'm not
> >as concerned about the accuracy of the names, or cleaning up xfs_sync.c
> >and xfs_iget.c, but cleanups are worth doing too.
> >
> >I hit a crash related to the xfslogd workqueue awhile back. Mark has
> >taken it up, so there might be a little coordination to do with him.
> >
> >Regards,
> > Ben
>
> To not leave a teaser out there:
>
> PID: 25879 TASK: ffff88012ac20340 CPU: 3 COMMAND: "kworker/3:3"
> #0 [ffff8801a72af920] machine_kexec at ffffffff810244e9
> #1 [ffff8801a72af990] crash_kexec at ffffffff8108d053
> #2 [ffff8801a72afa60] oops_end at ffffffff813ad1b8
> #3 [ffff8801a72afa90] no_context at ffffffff8102bd48
> #4 [ffff8801a72afae0] __bad_area_nosemaphore at ffffffff8102c04d
> #5 [ffff8801a72afb30] bad_area_nosemaphore at ffffffff8102c12e
> #6 [ffff8801a72afb40] do_page_fault at ffffffff813afaee
> #7 [ffff8801a72afc50] page_fault at ffffffff813ac635
> [exception RIP: xlog_assign_tail_lsn_locked+72]
> RIP: ffffffffa040da68 RSP: ffff8801a72afd00 RFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000200200
> RDX: ffff88013b32d550 RSI: dead000000100100 RDI: ffff88013b32d550
> RBP: ffff8801a72afd10 R8: ffff8801a72ae000 R9: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013b32d568
> R13: 0000000000000001 R14: ffff8801a72afd90 R15: ffff88013b32d540
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #8 [ffff8801a72afd18] xfs_trans_ail_delete_bulk at ffffffffa0414b2a [xfs]
> #9 [ffff8801a72afd78] xfs_buf_iodone at ffffffffa04119c7 [xfs]
> #10 [ffff8801a72afdb8] xfs_buf_do_callbacks at ffffffffa041166c [xfs]
> #11 [ffff8801a72afdd8] xfs_buf_iodone_callbacks at ffffffffa04117de [xfs]
> #12 [ffff8801a72afdf8] xfs_buf_iodone_work at ffffffffa03ad7e1 [xfs]
> #13 [ffff8801a72afe18] process_one_work at ffffffff8104c53b
> #14 [ffff8801a72afe68] worker_thread at ffffffff8104f0e3
> #15 [ffff8801a72afee8] kthread at ffffffff8105395e
> #16 [ffff8801a72aff48] kernel_thread_helper at ffffffff813b3ae4
>
> I am just digging through that crash. It appears that xfs_umountfs()
> did a good job in cleaning the AIL and the m_ddev_targp, but it
> needs to wait for the xfslogd to be finished before deallocating the
> log.

It is supposed to be already idle before the log is torn down. The
log is forced synchronously, which flushes remaining CIL items, then
the AIL is emptied synchronously. That should result in no
outstanding log operations to be run. Then xfs_log_unmount_write()
is called, which is supposed to wait for the log write to complete
before allowing the log to be torn down in xfs_log_unmount(). i.e.
it is also synchronous.
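
To make that ordering concrete, here is a rough userspace sketch of the
sequence described above. It is a toy model only: every name in it
(toy_umount_log, toy_log_force_sync, and so on) is hypothetical and is
not an XFS symbol, and the asserts merely encode the precondition each
step is supposed to be able to rely on.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of log state at unmount; all names are hypothetical. */
struct toy_umount_log {
    int  cil_items;       /* committed items not yet written to the log */
    int  ail_items;       /* logged items not yet written back */
    bool unmount_written; /* unmount record has reached the log */
    bool torn_down;       /* log structures have been freed */
};

/* Step 1: synchronous log force; returns only once the CIL is empty. */
static void toy_log_force_sync(struct toy_umount_log *log)
{
    log->ail_items += log->cil_items; /* flushed items now sit in the AIL */
    log->cil_items = 0;
}

/* Step 2: synchronous AIL push; returns only once the AIL is empty. */
static void toy_ail_push_all_sync(struct toy_umount_log *log)
{
    assert(log->cil_items == 0);
    log->ail_items = 0;
}

/* Step 3: write the unmount record and wait for it to complete. */
static void toy_log_unmount_write(struct toy_umount_log *log)
{
    assert(log->cil_items == 0 && log->ail_items == 0);
    log->unmount_written = true;
}

/* Step 4: tear the log down; nothing may be outstanding by now. */
static void toy_log_teardown(struct toy_umount_log *log)
{
    assert(log->unmount_written);
    log->torn_down = true;
}

/* The whole sequence, in the order the steps are meant to run. */
static void toy_unmount(struct toy_umount_log *log)
{
    toy_log_force_sync(log);
    toy_ail_push_all_sync(log);
    toy_log_unmount_write(log);
    toy_log_teardown(log);
}
```

If every step really is synchronous, none of the asserts can fire. The
crash above suggests that something still completes after the last
step, which the model has no way to represent.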

So the question that needs to be answered is this: what is the
transaction/checkpoint that is being completed here?

> Since workqueues are cheap, maybe it would be smart to have a
> per-filesystem xfslogd too.

That's overkill. If all we need is to ensure that we have emptied
the logd wq, then a synchronous flush is all that is necessary. But
first we need to find the cause of the above problem, and I'd
suggest a new thread for that...
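
As an illustration of what "a synchronous flush" buys, here is a
minimal single-threaded sketch in userspace C. It is an assumption-laden
toy, not kernel code: the toy_* names are made up, and the queue is
drained inline rather than by worker threads. In the kernel the
corresponding primitive for a real workqueue is flush_workqueue(),
which waits until previously queued work has finished.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_WQ_MAX 8

/* Hypothetical stand-in for the log that completion work touches. */
struct toy_log {
    int alive;        /* 0 once the log has been torn down */
    int completions;  /* number of completion handlers that ran */
};

struct toy_work {
    struct toy_log *log;
};

/* Fixed-size FIFO standing in for a shared completion workqueue. */
struct toy_wq {
    struct toy_work items[TOY_WQ_MAX];
    size_t head, tail;
};

/* Queue one completion for the given log; returns -1 if the queue is full. */
static int toy_queue_work(struct toy_wq *wq, struct toy_log *log)
{
    if (wq->tail == TOY_WQ_MAX)
        return -1;
    wq->items[wq->tail++].log = log;
    return 0;
}

/* Completion handler: dereferences the log, so the log must still exist. */
static void toy_iodone(struct toy_work *work)
{
    assert(work->log->alive);
    work->log->completions++;
}

/* Synchronous flush: returns only when every queued item has run. */
static void toy_flush_wq(struct toy_wq *wq)
{
    while (wq->head < wq->tail)
        toy_iodone(&wq->items[wq->head++]);
}

/* Teardown that drains the queue first, then invalidates the log. */
static void toy_log_teardown(struct toy_wq *wq, struct toy_log *log)
{
    toy_flush_wq(wq);
    log->alive = 0;
}
```

Swapping the two statements in toy_log_teardown() and then running the
queue would trip the assert in toy_iodone(), which is the toy
equivalent of completion work touching a log that is already gone.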

Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com