From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3EF8ECAAA1 for ; Fri, 2 Sep 2022 03:48:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235343AbiIBDsG (ORCPT ); Thu, 1 Sep 2022 23:48:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40926 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235333AbiIBDsE (ORCPT ); Thu, 1 Sep 2022 23:48:04 -0400 Received: from mail-pf1-x435.google.com (mail-pf1-x435.google.com [IPv6:2607:f8b0:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7747F65560 for ; Thu, 1 Sep 2022 20:48:02 -0700 (PDT) Received: by mail-pf1-x435.google.com with SMTP id 145so716135pfw.4 for ; Thu, 01 Sep 2022 20:48:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=user-agent:in-reply-to:content-disposition:mime-version:references :message-id:subject:cc:to:from:date:from:to:cc:subject:date; bh=vKD8TfmMqTBWdVFR1mJEtEV+5qFuJOssmhigvIglB5U=; b=NjpNf1kyoBMcrdYMQvCMrMRWPoWuUrlUrKFbJ+4Suxauf1dHXRgC5PKWjypK4vLNgr J1OHz9RhTzYG4fff2tG4RRc7jT8BsmEAJJHZkQkvpFiOWpOBukW0jmHlB9SeTrdWOHsP Q/2brnGBpzSlKtOg5R/JB8XciFQMeYvAItWq6Z9tunB+EVFzhNnDsAn1Ccg4edOLSeJ/ v1S523eLud33y27j2AOjbmZztLPbYCQRcNN5XeV1KWawRxkV3cLW1IbkyQJ33MxsG4hz 8ND9OphqBKAVbNW5VIdr7yEAclQ9OBfKwBhY2vjMyY6RdnRO9wzuycvsBljXlxtXK7Tx /7ew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=user-agent:in-reply-to:content-disposition:mime-version:references :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date; bh=vKD8TfmMqTBWdVFR1mJEtEV+5qFuJOssmhigvIglB5U=; b=JLrzLVZKOZyCuhZYdOdhnukKtA6KD2554CQ29fINkH44C8rw8Mx8G+zQJLco6ipFRq 6FQv0txluIsRCu7dqrnsgKdsQfeFOe1814DzsZurEq3SdvVR8aMM7s+YEQkEE7BuU3co QHSepVev3HJal/ry7hQQsQAWlCDPncajtiqhV46XfZVs9Pgms+KRROxSzyE3RRt3XtQ4 10uSO9SekwQ0fyJrMYTu5dTJWfAk9yfdhCc2SwMxjPhXcpI6749Sl3DAfaowmIOXayU7 5A57hpzK+KrMDH/SNCy6/GwNWzs7SEiBr8o5O0p/REPVnWuVHAHjx9vM+n9vYepj7dpB p8mg== X-Gm-Message-State: ACgBeo1sCLdtZz7N/fRvKt9HuKvoxWs4+newwdOAPaCLdSWd3oq9iTj8 d7kwyyA9NvY/rLa7WmKaYdY= X-Google-Smtp-Source: AA6agR5GR8Db+xgSOjSfud9OcKBREVccJ59Bl53VopsGfi+kQWFlJcNPJm1WkSu6MWDa6tJfVYuuoQ== X-Received: by 2002:a63:d857:0:b0:42b:424f:9c8c with SMTP id k23-20020a63d857000000b0042b424f9c8cmr29279054pgj.8.1662090481752; Thu, 01 Sep 2022 20:48:01 -0700 (PDT) Received: from localhost ([58.34.94.196]) by smtp.gmail.com with ESMTPSA id y22-20020aa79436000000b0052ea306a1e8sm411165pfo.210.2022.09.01.20.48.00 (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Thu, 01 Sep 2022 20:48:01 -0700 (PDT) Date: Fri, 2 Sep 2022 11:47:58 +0800 From: lijiazi To: Steven Rostedt Cc: Ingo Molnar , "Jiazi.Li" , linux-trace-devel@vger.kernel.org Subject: Re: [PATCH] ring-buffer: Add barrire in rb_move_tail Message-ID: <20220902031717.GA27303@Jiazi.Li> References: <20220830120854.7545-1-jiazi.li@transsion.com> <20220901111320.04b57cb7@gandalf.local.home> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20220901111320.04b57cb7@gandalf.local.home> User-Agent: Mutt/1.9.4 (2018-02-28) Precedence: bulk List-ID: X-Mailing-List: linux-trace-devel@vger.kernel.org On Thu, Sep 01, 2022 at 11:13:20AM -0400, Steven Rostedt wrote: > On Tue, 30 Aug 2022 20:08:54 +0800 > "Jiazi.Li" wrote: > > > The following crash issue was encountered several times when read > > trace_pipe: > > > > 1. in ring_buffer_event_data, found event's type is bigger than > > RINGBUF_TYPE_DATA_TYPE_LEN_MAX, print WARNING: > > [249033.677658]WARNING: CPU: 1 PID: 9975 at ring_buffer_event_data+0x38/0x48 > > > > 2. trace_read_pipe print this event resulting in crash issue: > > [249033.806200] die+0xac/0x37c > > [249033.806891] die_kernel_fault+0x60/0x70 > > [249033.807710] die_kernel_fault+0x0/0x70 > > [249033.808518] do_bad_area+0x30/0x130 > > [249033.809296] do_translation_fault+0x34/0x40 > > [249033.810158] do_mem_abort+0x4c/0xf8 > > [249033.810935] el1_da+0x1c/0xac > > [249033.811647] string+0x38/0x74 > > [249033.812357] vsnprintf+0x430/0x65c > > [249033.813122] seq_buf_vprintf+0x64/0xcc > > [249033.813934] trace_seq_printf+0x70/0xd8 > > [249033.814755] trace_raw_output_ipi_handler+0x40/0x54 > > [249033.815705] print_trace_line+0x3ec/0x424 > > [249033.816547] tracing_read_pipe+0x28c/0x3ec > > [249033.817402] __vfs_read+0x44/0x13c > > [249033.818166] vfs_read+0xb8/0x150 > > [249033.818908] ksys_read+0x6c/0xcc > > [249033.819651] __arm64_sys_read+0x18/0x20 > > [249033.820470] el0_svc_common+0x98/0x160 > > [249033.821278] el0_svc_handler+0x60/0x78 > > [249033.822086] el0_svc+0x8/0x300 > > > > ipi event is not enable in my device, the event tracing_read_pipe > > The ipi event is *not* enabled? So you are saying that it translated the > extra data at the end that was suppose to be PADDING? > Yes, in my tracing instance, I only enable four events: signal_generate, signal_deliver, sched_process_fork and sched_process_exit. I also not enable any filter on these events. Since these four events not trigger frequently, so reader comsume speed is bigger than event write, and there is no overwrite. Reader mistook this extra data to ipi_entry event. > > want to print is a PADDING event generated in rb_move_tail. > > array[0] of this event is 0x18, ipi entry trace event's type value > > is 0x18 too in my device. > > so trace_read_pipe call trace_raw_output_ipi_handler to print this > > event, eventually, system crash because illegal pointers. > > > > In ring_buffer_peek, PADDING type event will not return to caller. > > So at least when ring_buffer_peek is called, this event has not > > been set to PADDING. > > > > My reader task poll on trace_pipe and other files, wakeup > > by other files and found that trace ring_buffer is not empty > > because of read != commit. > > But maybe because of cpu out-of-order execution, new event has not > > been set to PADDING type, and leads to above issue. > > > > Can we fix this issue by add a barrier between rb_reset_tail > > and rb_end_commit? > > Or are there other bugs that cause this issue? > > I rather not add a barrier in the writer because that will slow down the > tracing. > > > > > Signed-off-by: Jiazi.Li > > --- > > kernel/trace/ring_buffer.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c > > index d59b6a328b7f..20f3ef839467 100644 > > --- a/kernel/trace/ring_buffer.c > > +++ b/kernel/trace/ring_buffer.c > > @@ -2717,6 +2717,11 @@ rb_move_tail(struct ring_buffer_per_cpu *cpu_buffer, > > > > rb_reset_tail(cpu_buffer, tail, info); > > > > + /* > > + * Add barrier to make sure tail event be set to > > + * PADDING type before update commit. > > + */ > > + smp_mb(); > > Hmm, I'm thinking this may be a bigger race than a memory barrier. I'm > thinking that the reader has full access to this information at this > moment, as soon as the tail page moved forward. I unfortunately do not have > the time right now to do a full investigation, but thank you for reporting > this. > > I think your mb() solution is just making the race window smaller and not > actually fixing the race. > > -- Steve > Yes, I'm thinking this may be a race issue too. But I didn't find race point through code review. I have tried the following two methods to investigation this issue: 1. add log in rb_reset_tail, ring_buffer_peek, rb_per_cpu_empty 2. do not add log in these functions, just get timestamp and store to ring_buffer struct for debug. But All these changes make issue no longer reproduce, maybe because these changes also make race window smaller. Do you have any suggestions? We now have found more than 20 similar issues, all are reader peek a PADDING event. Compared with ordinary data event, the time between write RING_BUF_PADDING to memory and update commit is very short, because rb_reset_tail and rb_end_commit are inline functions. We do not found reader peek a ordinary data event with expired data, So I think maybe add a barrier could fix this issue. > > /* Commit what we have for now. */ > > rb_end_commit(cpu_buffer); > > /* rb_end_commit() decs committing */ >