From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0211E28ED
	for ; Mon, 9 Oct 2023 09:43:59 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="bG99yCTa"
Received: from out-197.mta1.migadu.com (out-197.mta1.migadu.com [IPv6:2001:41d0:203:375::c5])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0DCFF94
	for ; Mon, 9 Oct 2023 02:43:57 -0700 (PDT)
Message-ID: <68eb65c5-1870-0776-0878-694a8b002a6d@linux.dev>
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1;
	t=1696844635;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=O0YOF8WiqWV3qWEelhaf9YObeiYly4thXLasaoZJfXw=;
	b=bG99yCTavxE57bN/sitJRBv3pvuXgFF9RvNFfRAPaLA/yw7LIq18IbInFcZFQahAQDzbrG
	 YgvZMtphcWgi9sGRwruSxGXFM+LmJ86HtdVKsvysqZ+OcE7hl+tnmhZ1Sj7OAQv3ooAkqC
	 ZxCWXIA1dcG1z4QhUKfmSa6R6xxV+UY=
Date: Mon, 9 Oct 2023 17:43:43 +0800
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Subject: Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()
Content-Language: en-US
To: Eric Dumazet
Cc: rostedt@goodmis.org, mhiramat@kernel.org, dennis@kernel.org,
	tj@kernel.org, cl@linux.com, mark.rutland@arm.com, davem@davemloft.net,
	kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Alexander Lobakin ,
	linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20231007050621.1706331-1-yajun.deng@linux.dev>
	<917708b5-cb86-f233-e878-9233c4e6c707@linux.dev>
	<9f4fb613-d63f-9b86-fe92-11bf4dfb7275@linux.dev>
	<4a747fda-2bb9-4231-66d6-31306184eec2@linux.dev>
	<814b5598-5284-9558-8f56-12a6f7a67187@linux.dev>
	<508b33f7-3dc0-4536-21f6-4a5e7ade2b5c@linux.dev>
	<296ca17d-cff0-2d19-f620-eedab004ddde@linux.dev>
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
From: Yajun Deng
In-Reply-To:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Migadu-Flow: FLOW_OUT
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS
	autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net

On 2023/10/9 17:30, Eric Dumazet wrote:
> On Mon, Oct 9, 2023 at 10:36 AM Yajun Deng wrote:
>>
>> On 2023/10/9 16:20, Eric Dumazet wrote:
>>> On Mon, Oct 9, 2023 at 10:14 AM Yajun Deng wrote:
>>>> On 2023/10/9 15:53, Eric Dumazet wrote:
>>>>> On Mon, Oct 9, 2023 at 5:07 AM Yajun Deng wrote:
>>>>>
>>>>>> 'this_cpu_read + this_cpu_write' and 'pr_info + this_cpu_inc' will make
>>>>>> the trace work well.
>>>>>>
>>>>>> They all have 'pop' instructions in them. This may be the key to making
>>>>>> the trace work well.
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I need your help on percpu and ftrace.
>>>>>>
>>>>> I do not think you made sure netdev_core_stats_inc() was never inlined.
>>>>>
>>>>> Adding more code in it is simply changing how the compiler decides to
>>>>> inline or not.
>>>> Yes, you are right. It needs to add the 'noinline' prefix. The
>>>> disassembly code will have 'pop'
>>>>
>>>> instruction.
>>>>
>>> The function was fine, you do not need anything like push or pop.
>>>
>>> The only needed stuff was the call __fentry__.
>>>
>>> The fact that the function was inlined for some invocations was the
>>> issue, because the trace point
>>> is only planted in the out of line function.
>>
>> But somehow the following code isn't inline? They didn't need to add the
>> 'noinline' prefix.
>>
>> + field = (unsigned long *)((void *)this_cpu_ptr(p) + offset);
>> + WRITE_ONCE(*field, READ_ONCE(*field) + 1);
>>
>> Or
>> + (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++;
>>
> I think you are very confused.
>
> You only want to trace netdev_core_stats_inc() entry point, not
> arbitrary pieces of it.

Yes, I will trace the netdev_core_stats_inc() entry point. What I mean is
replacing

+	field = (__force unsigned long __percpu *)((__force void *)p + offset);
+	this_cpu_inc(*field);

with

+	field = (unsigned long *)((void *)this_cpu_ptr(p) + offset);
+	WRITE_ONCE(*field, READ_ONCE(*field) + 1);

Or

+	(*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++;

With either replacement, the netdev_core_stats_inc() entry point works fine
even without the 'noinline' prefix.

I don't understand why this code needs the 'noinline' prefix:

+	field = (__force unsigned long __percpu *)((__force void *)p + offset);
+	this_cpu_inc(*field);
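
For illustration only, here is a minimal userspace sketch of the point Eric makes in the quoted text: whether a tracer can hook a function's entry depends on the function staying out of line, not on what its body contains. This is not the kernel patch. struct core_stats and core_stats_inc() are hypothetical stand-ins for the per-CPU stats and netdev_core_stats_inc(), and the sketch uses the GCC/Clang attribute that the kernel's 'noinline' macro expands to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU core stats structure. */
struct core_stats {
	unsigned long rx_dropped;
	unsigned long tx_dropped;
	unsigned long rx_nohandler;
};

/*
 * The noinline attribute asks the compiler to keep a single out-of-line
 * copy of this function, so every caller reaches it through a real call.
 * Without the attribute, the compiler is free to inline some or all call
 * sites, and the choice can change whenever the body changes.
 */
__attribute__((noinline))
void core_stats_inc(struct core_stats *stats, size_t offset)
{
	unsigned long *field;

	/* The offset must name one of the counters in the structure. */
	assert(offset % sizeof(unsigned long) == 0);
	assert(offset + sizeof(unsigned long) <= sizeof(*stats));

	/* Same "base pointer + byte offset" addressing as in the patch. */
	field = (unsigned long *)((char *)stats + offset);
	(*field)++;
}
```

A caller would pass the offset of the counter to bump, for example core_stats_inc(&stats, offsetof(struct core_stats, tx_dropped)). Rewriting the body (a plain increment, a read followed by a write, an added print) only changes the compiler's inlining heuristics; the attribute is what makes the out-of-line entry point dependable.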