From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932181AbZEEBcT (ORCPT ); Mon, 4 May 2009 21:32:19 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1755189AbZEEBcK (ORCPT ); Mon, 4 May 2009 21:32:10 -0400 Received: from cn.fujitsu.com ([222.73.24.84]:62691 "EHLO song.cn.fujitsu.com" rhost-flags-OK-FAIL-OK-OK) by vger.kernel.org with ESMTP id S1754370AbZEEBcJ (ORCPT ); Mon, 4 May 2009 21:32:09 -0400 Message-ID: <49FF9742.4090402@cn.fujitsu.com> Date: Tue, 05 May 2009 09:32:50 +0800 From: Li Zefan User-Agent: Thunderbird 2.0.0.9 (X11/20071115) MIME-Version: 1.0 To: Steven Rostedt CC: LKML , Ingo Molnar , Andrew Morton , Peter Zijlstra , Frederic Weisbecker , Steven Rostedt Subject: Re: [PATCH 2/6] tracing: increase size of number of possible events References: <20090424043005.338878133@goodmis.org> <20090424043111.983189535@goodmis.org> <49F15607.1090707@cn.fujitsu.com> <49F404E4.6040907@cn.fujitsu.com> <49FEAFD0.30106@cn.fujitsu.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Steven Rostedt wrote: > On Mon, 4 May 2009, Li Zefan wrote: >>>>>>> # cat /debug/tracing/trace_pipe >>>>>>> >>>>>>> console 2: >>>>>>> while (1) { >>>>>>> insmod trace-events-sample.ko >>>>>>> echo foo_bar > /debug/tracing/set_event >>>>>>> rmmod trace-events-sample.ko >>>>>>> } >>>>>>> >>>>>>> I got this immediately: >>>>>>> >>>>>>> BUG: unable to handle kernel NULL pointer dereference at 0000006f >>>>>>> IP: [] bstr_printf+0x2ce/0x302 >>>>>>> ... >>>>>>> Call Trace: >>>>>>> [] ? trace_seq_bprintf+0x28/0x41 >>>>>>> [] ? trace_bprint_print+0x58/0x6c >>>>>>> [] ? print_trace_line+0x2c5/0x2df >>>>>>> [] ? sub_preempt_count+0x85/0xa0 >>>>>>> [] ? tracing_read_pipe+0x118/0x191 >>>>>>> [] ? tracing_read_pipe+0x0/0x191 >>>>>>> [] ? vfs_read+0x8f/0x136 >>>>>>> [] ? sys_read+0x40/0x65 >>>>>>> [] ? sysenter_do_call+0x12/0x36 >>>>>>> >>>>>>> (We can even get other crashes..) >>>>>>> >>>>>> Can you send me your full bootlog and config. I'm not able to reproduce >>>>>> this. >>>>> And the SHA1 of the HEAD of the git repo you are using. >>>>> >>>>> Thanks, >>>>> >>>> the HEAD: 9ce5424d75e56891905b77d1589924765e62059a + this patch >>>> >>>> (I commented out "hi" from the sample module) >>> I found that commenting out the "hi" too helped in causing races. >>> >>>> I think it's because "type" is wrapped back to 0. I changed the test script >>>> like this: >>>> >>>> while (foo_bar.id < 65536) >>>> { >>>> insmod trace-events-sample.ko >>>> rmmod trace-events-sample.ko >>>> } >>>> >>>> for ((; ;)) >>>> { >>>> insmod /home/lizf/linux-2.6-tip/samples/trace_events/*.ko >>>> echo trace-events-sample:foo_bar > /mnt/tracing/set_event >>>> sleep 1 >>>> cat /mnt/tracing/trace >>>> rmmod /home/lizf/linux-2.6-tip/samples/trace_events/*.ko >>>> } >>>> >>> Could you try my git repo with my latest changes? >>> >>> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git >>> >>> branch: rfc/debugfs >>> >> The bug won't be triggered with your latest changes, but it has nothing >> to do with the debugfs patch, it's because this patch: >> "tracing/events: reuse trace event ids after overflow" >> >> But that patch just hides the bug not fixes it.. I thought I explained >> the bug clearly. :( >> >> I think the bug is triggered because a wrong trace_event->trace() is called. >> >> >> Before the above patch: >> id overflowed >> =>foo_bar.id == 6 == TRACE_BPRINT >> =>echo foo_bar > set_event >> =>sleep N secs >> =>rmmod sample >> =>cat trace >> =>trace_bprint_event->trace() is called!! >> =>oops!!! >> >> >> After the above patch: >> >> Suppose id 0 ~ 65534 are used, only 65535 is left, and we have 2 different >> sample module trace_event subsystems foo and bar. >> =>insmod foo.ko, and foo.id == 65535 >> =>echo foo > set_event >> =>sleep N secs >> =>rmmod foo.ko >> =>insmod bar.ko, and bar.id == 65535 >> =>cat trace >> =>bar->trace() will be called on events which were generated by foo! >> >> So is it possible/sane to discard corresponding events from ring buffer when >> we unload a module? >> >> Correct me if I'm wrong.. > > OK, I understand what you are saying. > > You are saying that if we reuse the event id, then there is a case that > the event can be used by another module, and if we have old data in the > ring buffer the other module's callback will be called with the previous > module's data. > > Right? > You got it. :) > > A module should not be storing pointers into the ring buffer. We do > that with the trace_print, but I think we were suppose to do something > special if we detect that it is a module being called. > > Frederic? > > If the module only stores data and never a pointer, then the worst thing > stale data would do is show junk. > > Scanning the ring buffer and discarding the module data can be very racy. > We would need to disable the ring buffer while doing this. Probably the > best thing to do is to simpy reset the ring buffers if it detects that a > module is being unloaded and has callbacks. The data in the ring buffer is > garbage anyway, so might as well reset it. > I agree, if no better way.