From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Shi, Yang" Subject: Re: [PATCH] writeback: call writeback tracepoints withoud holding list_lock in wb_writeback() Date: Thu, 25 Feb 2016 15:47:20 -0800 Message-ID: <56CF9288.5010406@linaro.org> References: <1456354043-31420-1-git-send-email-yang.shi@linaro.org> <20160224214042.71c3493b@grimm.local.home> <56CF5848.7050806@linaro.org> <20160225145432.3749e5ec@gandalf.local.home> <56CF8B66.8070108@linaro.org> <20160225183107.1902d42b@gandalf.local.home> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Cc: tj@kernel.org, jack@suse.cz, axboe@fb.com, fengguang.wu@intel.com, tglx@linutronix.de, bigeasy@linutronix.de, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-rt-users@vger.kernel.org, linaro-kernel@lists.linaro.org To: Steven Rostedt Return-path: In-Reply-To: <20160225183107.1902d42b@gandalf.local.home> Sender: owner-linux-mm@kvack.org List-Id: linux-rt-users.vger.kernel.org On 2/25/2016 3:31 PM, Steven Rostedt wrote: > On Thu, 25 Feb 2016 15:16:54 -0800 > "Shi, Yang" wrote: > > >> Actually, regardless whether this is the right fix for the splat, it >> makes me be wondering if the spin lock which protects the whole for loop >> is really necessary. It sounds feasible to move it into the for loop and >> just protect the necessary area. > > That's a separate issue, which may have its own merits that should be > decided by the writeback folks. Yes, definitely. I will rework my commit log for this part. > >> >>> >>>> >>>>> >>>>> >>>>>> INFO: lockdep is turned off. >>>>>> Preemption disabled at:[] wb_writeback+0xec/0x830 >>> >>> Can you disassemble the vmlinux file to see exactly where that call is. >>> I use gdb to find the right locations. >>> >>> gdb> li *0xffffffc000374a5c >>> gdb> disass 0xffffffc000374a5c >> >> I use gdb to get the code too. >> >> It does point to the spin_lock. >> >> (gdb) list *0xffffffc000374a5c >> 0xffffffc000374a5c is in wb_writeback (fs/fs-writeback.c:1621). >> 1616 >> 1617 oldest_jif = jiffies; >> 1618 work->older_than_this = &oldest_jif; >> 1619 >> 1620 blk_start_plug(&plug); >> 1621 spin_lock(&wb->list_lock); >> 1622 for (;;) { >> 1623 /* >> 1624 * Stop writeback when nr_pages has been consumed >> 1625 */ >> >> >> The disassemble: >> 0xffffffc000374a58 <+232>: bl 0xffffffc0001300b0 > > The above is the place it recorded. But I just realized, this isn't the > issue. I know where the problem is. > > >> >> 0xffffffc000374a5c <+236>: mov x0, x22 >> 0xffffffc000374a60 <+240>: bl 0xffffffc000d5d518 >> >>> > > > >>>> DECLARE_EVENT_CLASS(writeback_work_class, >>>> TP_PROTO(struct bdi_writeback *wb, struct wb_writeback_work *work), >>>> TP_ARGS(wb, work), >>>> TP_STRUCT__entry( >>>> __array(char, name, 32) >>>> __field(long, nr_pages) >>>> __field(dev_t, sb_dev) >>>> __field(int, sync_mode) >>>> __field(int, for_kupdate) >>>> __field(int, range_cyclic) >>>> __field(int, for_background) >>>> __field(int, reason) >>>> __dynamic_array(char, cgroup, __trace_wb_cgroup_size(wb)) >>>> >>> >>> Ah, thanks for pointing that out. I missed that. >> >> It sounds not correct if tracepoint doesn't allow sleep. >> >> I considered to change sleeping lock to raw lock in kernfs_* functions, >> but it sounds not reasonable since they are used heavily by cgroup. > > It is the kernfs_* that can't sleep. Tracepoints use > rcu_read_lock_sched_notrace(), which disables preemption, and not only > that, hides itself from lockdep as the last place to disable preemption. Ah, thanks for pointing out this. > > Is there a way to not use the kernfs_* function? At least for -rt? I'm not quite sure if there is straightforward replacement. However, I'm wondering if lock free version could be used by tracing. For example, create __kernfs_path_len which doesn't acquire any lock for writeback tracing as long as there is not any race condition. At least we could rule out preemption. Thanks, Yang > > -- Steve > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org