From: "Shi, Yang" <yang.shi@linaro.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: tj@kernel.org, jack@suse.cz, axboe@fb.com,
fengguang.wu@intel.com, tglx@linutronix.de,
bigeasy@linutronix.de, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-rt-users@vger.kernel.org, linaro-kernel@lists.linaro.org
Subject: Re: [PATCH] writeback: call writeback tracepoints withoud holding list_lock in wb_writeback()
Date: Thu, 25 Feb 2016 15:47:20 -0800 [thread overview]
Message-ID: <56CF9288.5010406@linaro.org> (raw)
In-Reply-To: <20160225183107.1902d42b@gandalf.local.home>
On 2/25/2016 3:31 PM, Steven Rostedt wrote:
> On Thu, 25 Feb 2016 15:16:54 -0800
> "Shi, Yang" <yang.shi@linaro.org> wrote:
>
>
>> Actually, regardless whether this is the right fix for the splat, it
>> makes me be wondering if the spin lock which protects the whole for loop
>> is really necessary. It sounds feasible to move it into the for loop and
>> just protect the necessary area.
>
> That's a separate issue, which may have its own merits that should be
> decided by the writeback folks.
Yes, definitely. I will rework my commit log for this part.
>
>>
>>>
>>>>
>>>>>
>>>>>
>>>>>> INFO: lockdep is turned off.
>>>>>> Preemption disabled at:[<ffffffc000374a5c>] wb_writeback+0xec/0x830
>>>
>>> Can you disassemble the vmlinux file to see exactly where that call is.
>>> I use gdb to find the right locations.
>>>
>>> gdb> li *0xffffffc000374a5c
>>> gdb> disass 0xffffffc000374a5c
>>
>> I use gdb to get the code too.
>>
>> It does point to the spin_lock.
>>
>> (gdb) list *0xffffffc000374a5c
>> 0xffffffc000374a5c is in wb_writeback (fs/fs-writeback.c:1621).
>> 1616
>> 1617 oldest_jif = jiffies;
>> 1618 work->older_than_this = &oldest_jif;
>> 1619
>> 1620 blk_start_plug(&plug);
>> 1621 spin_lock(&wb->list_lock);
>> 1622 for (;;) {
>> 1623 /*
>> 1624 * Stop writeback when nr_pages has been consumed
>> 1625 */
>>
>>
>> The disassemble:
>> 0xffffffc000374a58 <+232>: bl 0xffffffc0001300b0
>
> The above is the place it recorded. But I just realized, this isn't the
> issue. I know where the problem is.
>
>
>> <migrate_disable>
>> 0xffffffc000374a5c <+236>: mov x0, x22
>> 0xffffffc000374a60 <+240>: bl 0xffffffc000d5d518 <rt_spin_lock>
>>
>>>
>
>
>
>>>> DECLARE_EVENT_CLASS(writeback_work_class,
>>>> TP_PROTO(struct bdi_writeback *wb, struct wb_writeback_work *work),
>>>> TP_ARGS(wb, work),
>>>> TP_STRUCT__entry(
>>>> __array(char, name, 32)
>>>> __field(long, nr_pages)
>>>> __field(dev_t, sb_dev)
>>>> __field(int, sync_mode)
>>>> __field(int, for_kupdate)
>>>> __field(int, range_cyclic)
>>>> __field(int, for_background)
>>>> __field(int, reason)
>>>> __dynamic_array(char, cgroup, __trace_wb_cgroup_size(wb))
>>>>
>>>
>>> Ah, thanks for pointing that out. I missed that.
>>
>> It sounds not correct if tracepoint doesn't allow sleep.
>>
>> I considered to change sleeping lock to raw lock in kernfs_* functions,
>> but it sounds not reasonable since they are used heavily by cgroup.
>
> It is the kernfs_* that can't sleep. Tracepoints use
> rcu_read_lock_sched_notrace(), which disables preemption, and not only
> that, hides itself from lockdep as the last place to disable preemption.
Ah, thanks for pointing out this.
>
> Is there a way to not use the kernfs_* function? At least for -rt?
I'm not quite sure if there is straightforward replacement. However, I'm
wondering if lock free version could be used by tracing.
For example, create __kernfs_path_len which doesn't acquire any lock for
writeback tracing as long as there is not any race condition.
At least we could rule out preemption.
Thanks,
Yang
>
> -- Steve
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: "Shi, Yang" <yang.shi@linaro.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: tj@kernel.org, jack@suse.cz, axboe@fb.com,
fengguang.wu@intel.com, tglx@linutronix.de,
bigeasy@linutronix.de, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-rt-users@vger.kernel.org, linaro-kernel@lists.linaro.org
Subject: Re: [PATCH] writeback: call writeback tracepoints withoud holding list_lock in wb_writeback()
Date: Thu, 25 Feb 2016 15:47:20 -0800 [thread overview]
Message-ID: <56CF9288.5010406@linaro.org> (raw)
In-Reply-To: <20160225183107.1902d42b@gandalf.local.home>
On 2/25/2016 3:31 PM, Steven Rostedt wrote:
> On Thu, 25 Feb 2016 15:16:54 -0800
> "Shi, Yang" <yang.shi@linaro.org> wrote:
>
>
>> Actually, regardless whether this is the right fix for the splat, it
>> makes me be wondering if the spin lock which protects the whole for loop
>> is really necessary. It sounds feasible to move it into the for loop and
>> just protect the necessary area.
>
> That's a separate issue, which may have its own merits that should be
> decided by the writeback folks.
Yes, definitely. I will rework my commit log for this part.
>
>>
>>>
>>>>
>>>>>
>>>>>
>>>>>> INFO: lockdep is turned off.
>>>>>> Preemption disabled at:[<ffffffc000374a5c>] wb_writeback+0xec/0x830
>>>
>>> Can you disassemble the vmlinux file to see exactly where that call is.
>>> I use gdb to find the right locations.
>>>
>>> gdb> li *0xffffffc000374a5c
>>> gdb> disass 0xffffffc000374a5c
>>
>> I use gdb to get the code too.
>>
>> It does point to the spin_lock.
>>
>> (gdb) list *0xffffffc000374a5c
>> 0xffffffc000374a5c is in wb_writeback (fs/fs-writeback.c:1621).
>> 1616
>> 1617 oldest_jif = jiffies;
>> 1618 work->older_than_this = &oldest_jif;
>> 1619
>> 1620 blk_start_plug(&plug);
>> 1621 spin_lock(&wb->list_lock);
>> 1622 for (;;) {
>> 1623 /*
>> 1624 * Stop writeback when nr_pages has been consumed
>> 1625 */
>>
>>
>> The disassemble:
>> 0xffffffc000374a58 <+232>: bl 0xffffffc0001300b0
>
> The above is the place it recorded. But I just realized, this isn't the
> issue. I know where the problem is.
>
>
>> <migrate_disable>
>> 0xffffffc000374a5c <+236>: mov x0, x22
>> 0xffffffc000374a60 <+240>: bl 0xffffffc000d5d518 <rt_spin_lock>
>>
>>>
>
>
>
>>>> DECLARE_EVENT_CLASS(writeback_work_class,
>>>> TP_PROTO(struct bdi_writeback *wb, struct wb_writeback_work *work),
>>>> TP_ARGS(wb, work),
>>>> TP_STRUCT__entry(
>>>> __array(char, name, 32)
>>>> __field(long, nr_pages)
>>>> __field(dev_t, sb_dev)
>>>> __field(int, sync_mode)
>>>> __field(int, for_kupdate)
>>>> __field(int, range_cyclic)
>>>> __field(int, for_background)
>>>> __field(int, reason)
>>>> __dynamic_array(char, cgroup, __trace_wb_cgroup_size(wb))
>>>>
>>>
>>> Ah, thanks for pointing that out. I missed that.
>>
>> It sounds not correct if tracepoint doesn't allow sleep.
>>
>> I considered to change sleeping lock to raw lock in kernfs_* functions,
>> but it sounds not reasonable since they are used heavily by cgroup.
>
> It is the kernfs_* that can't sleep. Tracepoints use
> rcu_read_lock_sched_notrace(), which disables preemption, and not only
> that, hides itself from lockdep as the last place to disable preemption.
Ah, thanks for pointing out this.
>
> Is there a way to not use the kernfs_* function? At least for -rt?
I'm not quite sure if there is straightforward replacement. However, I'm
wondering if lock free version could be used by tracing.
For example, create __kernfs_path_len which doesn't acquire any lock for
writeback tracing as long as there is not any race condition.
At least we could rule out preemption.
Thanks,
Yang
>
> -- Steve
>
next prev parent reply other threads:[~2016-02-25 23:47 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-24 22:47 [PATCH] writeback: call writeback tracepoints withoud holding list_lock in wb_writeback() Yang Shi
2016-02-24 22:47 ` Yang Shi
2016-02-25 2:40 ` Steven Rostedt
2016-02-25 2:40 ` Steven Rostedt
2016-02-25 19:38 ` Shi, Yang
2016-02-25 19:38 ` Shi, Yang
2016-02-25 19:54 ` Steven Rostedt
2016-02-25 19:54 ` Steven Rostedt
2016-02-25 23:16 ` Shi, Yang
2016-02-25 23:16 ` Shi, Yang
2016-02-25 23:31 ` Steven Rostedt
2016-02-25 23:31 ` Steven Rostedt
2016-02-25 23:47 ` Shi, Yang [this message]
2016-02-25 23:47 ` Shi, Yang
2016-02-25 23:54 ` Shi, Yang
2016-02-25 23:54 ` Shi, Yang
2016-02-26 0:09 ` Steven Rostedt
2016-02-26 0:09 ` Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56CF9288.5010406@linaro.org \
--to=yang.shi@linaro.org \
--cc=akpm@linux-foundation.org \
--cc=axboe@fb.com \
--cc=bigeasy@linutronix.de \
--cc=fengguang.wu@intel.com \
--cc=jack@suse.cz \
--cc=linaro-kernel@lists.linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-rt-users@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.