From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: dsterba@suse.cz, Josef Bacik <jbacik@fb.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v4 00/18] Replace btrfs_workers with kernel workqueue based btrfs_workqueue
Date: Thu, 09 Jan 2014 09:35:11 +0800 [thread overview]
Message-ID: <52CDFCCF.6030005@cn.fujitsu.com> (raw)
In-Reply-To: <20140108185159.GI6498@twin.jikos.cz>
On Wed, 8 Jan 2014 19:51:59 +0100, David Sterba wrote:
> On Wed, Jan 08, 2014 at 02:25:02PM +0800, Qu Wenruo wrote:
>>> But according to the dmesg, which is expired now though, I made the
>>> following assumption:
>>>
>>> There is a possibility that out of order execution made the "work->wq =
>>> wq" sentence executed behind the "queue_work()" call,
>>> and the "normal_work_helper" is executed in another CPU instantly, which
>>> caused the problem.
> This could be the reason, though I still don't see how exactly this
> happens.
>
>>> ------
>>> static inline void __btrfs_queue_work(struct __btrfs_workqueue *wq,
>>> struct btrfs_work *work)
>>> {
>>> unsigned long flags;
>>>
>>> work->wq = wq;
>>> /* There is no memory barrier ensure the work->wq = wq is executed
>>> before queue_work */
>>> thresh_queue_hook(wq);
>>> if (work->ordered_func) {
>>> spin_lock_irqsave(&wq->list_lock, flags);
>>> list_add_tail(&work->ordered_list, &wq->ordered_list);
>>> spin_unlock_irqrestore(&wq->list_lock, flags);
>>> }
>>> queue_work(wq->normal_wq, &work->normal_work);
>>> }
>>> ------
>>>
>>> The fix is as easy as just adding a smp_wmb() behind the "work->wq = wq"
>>> sentence, but since I failed to reproduce the bug,
>>> I can't confirm whether the fix is good or not.
>>>
>>> I know it's impolite but would you please first try the smp_wmb() first to
>>> confirm the fix?
>> If itis convenient for you,would you please give some feedback about the
>> smp_wmb() fix ?
>>
>> Also incase that smp_wmb() doesn't work, it would be very nice of you to
>> also try smp_mb() fix.
> The smp_wmb barriers have to pair with smp_rmb, which means that
> normal_work_helper() has to call smp_rmb at the beginning.
>
> What still puzzles me is why would the barriers are needed at all,
> although your example code shows the effects of a delayed store.
>
> Down the call stack, the barrier takes place:
>
> queue_work
> queue_work_on(WORK_CPU_UNBOUND)
> __queue_work
> insert_work
> smp_mb
>
> There is no barrier before the work is being processed in
> workqueue.c::process_one_work(), but there maybe an implied one.
>
>
> david
>
It is right that before processing the work, there is already an
smp_mb() in queue_work.
So the problem may not be the memory barrier things.
It looks like I have to dig deeper even I could not reproduce the bug. :(
BTW, since the original dmesg is expired, would josef upload the dmesg
again?
Also, did david success to reproduce the bug?
Thanks
Qu
prev parent reply other threads:[~2014-01-09 1:34 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-17 9:31 [PATCH v4 00/18] Replace btrfs_workers with kernel workqueue based btrfs_workqueue Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 01/18] btrfs: Cleanup the unused struct async_sched Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 02/18] btrfs: Added btrfs_workqueue_struct implemented ordered execution based on kernel workqueue Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 03/18] btrfs: Add high priority workqueue support for btrfs_workqueue_struct Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 04/18] btrfs: Add threshold workqueue based on kernel workqueue Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 05/18] btrfs: Replace fs_info->workers with btrfs_workqueue Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 06/18] btrfs: Replace fs_info->delalloc_workers " Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 07/18] btrfs: Replace fs_info->submit_workers " Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 08/18] btrfs: Replace fs_info->flush_workers " Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 09/18] btrfs: Replace fs_info->endio_* workqueue " Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 10/18] btrfs: Replace fs_info->rmw_workers " Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 11/18] btrfs: Replace fs_info->cache_workers " Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 12/18] btrfs: Replace fs_info->readahead_workers " Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 13/18] btrfs: Replace fs_info->fixup_workers " Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 14/18] btrfs: Replace fs_info->delayed_workers " Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 15/18] btrfs: Replace fs_info->qgroup_rescan_worker " Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 16/18] btrfs: Replace fs_info->scrub_* " Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 17/18] btrfs: Cleanup the old btrfs_worker Qu Wenruo
2013-12-17 9:31 ` [PATCH v4 18/18] btrfs: Cleanup the "_struct" suffix in btrfs_workequeue Qu Wenruo
2013-12-19 15:27 ` [PATCH v4 00/18] Replace btrfs_workers with kernel workqueue based btrfs_workqueue Josef Bacik
2013-12-20 0:37 ` Qu Wenruo
2013-12-20 3:08 ` Qu Wenruo
2013-12-20 13:30 ` Josef Bacik
2013-12-23 2:35 ` Qu Wenruo
2014-01-08 6:25 ` Qu Wenruo
2014-01-08 18:51 ` David Sterba
2014-01-09 1:35 ` Qu Wenruo [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52CDFCCF.6030005@cn.fujitsu.com \
--to=quwenruo@cn.fujitsu.com \
--cc=dsterba@suse.cz \
--cc=jbacik@fb.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).