From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C405E732E5 for ; Thu, 28 Sep 2023 15:40:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231906AbjI1Pkh (ORCPT ); Thu, 28 Sep 2023 11:40:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59678 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232013AbjI1Pkf (ORCPT ); Thu, 28 Sep 2023 11:40:35 -0400 Received: from out-207.mta1.migadu.com (out-207.mta1.migadu.com [IPv6:2001:41d0:203:375::cf]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9D3041AA for ; Thu, 28 Sep 2023 08:40:32 -0700 (PDT) Message-ID: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1695915629; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=45MlEPGwvOexgNDZ/JPA6sJGHWuPY5aTD7qAaECk8Lg=; b=kQcPCDwhRi/lCsA3fubs+MheLqGfroa/cVrC1n9IwiHk93+aPsXWnanpy5UILTQvDg6bzT BaAiaYlBiTz4fl8QaZY30lXEKGCFOFP8D1ycw/KLW2GTAuFQa98a0hF6BkcO5m4Hcue/QM USTRqLV2w1W+uyHifI/Qqq3zwQLfEj8= Date: Thu, 28 Sep 2023 23:40:19 +0800 MIME-Version: 1.0 Subject: Re: [PATCH v6] net/core: Introduce netdev_core_stats_inc() Content-Language: en-US To: Eric Dumazet Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Alexander Lobakin References: <20230928100418.521594-1-yajun.deng@linux.dev> X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Yajun Deng In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Migadu-Flow: FLOW_OUT Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2023/9/28 22:18, Eric Dumazet wrote: > On Thu, Sep 28, 2023 at 12:04 PM Yajun Deng wrote: >> Although there is a kfree_skb_reason() helper function that can be used to >> find the reason why this skb is dropped, but most callers didn't increase >> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped. >> >> For the users, people are more concerned about why the dropped in ip >> is increasing. >> >> Introduce netdev_core_stats_inc() for trace the caller of the dropped >> skb. Also, add __code to netdev_core_stats_alloc(), as it's called >> unlinkly. >> >> Signed-off-by: Yajun Deng >> Suggested-by: Alexander Lobakin >> --- >> v6: merge netdev_core_stats and netdev_core_stats_inc together >> v5: Access the per cpu pointer before reach the relevant offset. >> v4: Introduce netdev_core_stats_inc() instead of export dev_core_stats_*_inc() >> v3: __cold should be added to the netdev_core_stats_alloc(). >> v2: use __cold instead of inline in dev_core_stats(). >> v1: https://lore.kernel.org/netdev/20230911082016.3694700-1-yajun.deng@linux.dev/ >> --- >> include/linux/netdevice.h | 21 ++++----------------- >> net/core/dev.c | 17 +++++++++++++++-- >> 2 files changed, 19 insertions(+), 19 deletions(-) >> >> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h >> index 7e520c14eb8c..eb1fa04fbccc 100644 >> --- a/include/linux/netdevice.h >> +++ b/include/linux/netdevice.h >> @@ -4002,32 +4002,19 @@ static __always_inline bool __is_skb_forwardable(const struct net_device *dev, >> return false; >> } >> >> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev); >> - >> -static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev) >> -{ >> - /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */ >> - struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats); >> - >> - if (likely(p)) >> - return p; >> - >> - return netdev_core_stats_alloc(dev); >> -} >> +void netdev_core_stats_inc(struct net_device *dev, u32 offset); >> >> #define DEV_CORE_STATS_INC(FIELD) \ >> static inline void dev_core_stats_##FIELD##_inc(struct net_device *dev) \ >> { \ >> - struct net_device_core_stats __percpu *p; \ >> - \ >> - p = dev_core_stats(dev); \ >> - if (p) \ >> - this_cpu_inc(p->FIELD); \ > Note that we were using this_cpu_inc() which implied : > - IRQ safety, and > - a barrier paired with : > > net/core/dev.c:10548: storage->rx_dropped += > READ_ONCE(core_stats->rx_dropped); > net/core/dev.c:10549: storage->tx_dropped += > READ_ONCE(core_stats->tx_dropped); > net/core/dev.c:10550: storage->rx_nohandler += > READ_ONCE(core_stats->rx_nohandler); > net/core/dev.c:10551: storage->rx_otherhost_dropped > += READ_ONCE(core_stats->rx_otherhost_dropped); > > >> + netdev_core_stats_inc(dev, \ >> + offsetof(struct net_device_core_stats, FIELD)); \ >> } >> DEV_CORE_STATS_INC(rx_dropped) >> DEV_CORE_STATS_INC(tx_dropped) >> DEV_CORE_STATS_INC(rx_nohandler) >> DEV_CORE_STATS_INC(rx_otherhost_dropped) >> +#undef DEV_CORE_STATS_INC >> >> static __always_inline int ____dev_forward_skb(struct net_device *dev, >> struct sk_buff *skb, >> diff --git a/net/core/dev.c b/net/core/dev.c >> index 606a366cc209..88a32c392c1d 100644 >> --- a/net/core/dev.c >> +++ b/net/core/dev.c >> @@ -10497,7 +10497,8 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64, >> } >> EXPORT_SYMBOL(netdev_stats_to_stats64); >> >> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) >> +static __cold struct net_device_core_stats __percpu *netdev_core_stats_alloc( >> + struct net_device *dev) >> { >> struct net_device_core_stats __percpu *p; >> >> @@ -10510,7 +10511,19 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device >> /* This READ_ONCE() pairs with the cmpxchg() above */ >> return READ_ONCE(dev->core_stats); >> } >> -EXPORT_SYMBOL(netdev_core_stats_alloc); >> + >> +void netdev_core_stats_inc(struct net_device *dev, u32 offset) >> +{ >> + /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */ >> + struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats); >> + >> + if (unlikely(!p)) >> + p = netdev_core_stats_alloc(dev); >> + >> + if (p) >> + (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++; > While here you are using a ++ operation that : > > - is not irq safe > - might cause store-tearing. > > I would suggest a preliminary patch converting the "unsigned long" fields in > struct net_device_core_stats to local_t Do you mean it needs to revert the commit 6510ea973d8d ("net: Use this_cpu_inc() to increment net->core_stats") first? But it would allocate memory which breaks on PREEMPT_RT. > > You might be able tweak this to > > unsigned long __percpu *field = (unsigned long __percpu) ((u8 *)p + offset); > this_cpu_inc(field);