From: Yajun Deng <yajun.deng@linux.dev>
To: Eric Dumazet <edumazet@google.com>
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Alexander Lobakin <aleksander.lobakin@intel.com>
Subject: Re: [PATCH v6] net/core: Introduce netdev_core_stats_inc()
Date: Fri, 29 Sep 2023 00:32:36 +0800
Message-ID: <a94ca1e1-d29a-5d98-bf39-97c7a1f25372@linux.dev>
In-Reply-To: <CANn89i+XQ_LKvr5LHd2QUgTMfZh9Nd1yQTYfRORHUt2_BCkxcg@mail.gmail.com>
On 2023/9/29 00:23, Eric Dumazet wrote:
> On Thu, Sep 28, 2023 at 6:16 PM Yajun Deng <yajun.deng@linux.dev> wrote:
>>
>> On 2023/9/28 23:44, Eric Dumazet wrote:
>>> On Thu, Sep 28, 2023 at 5:40 PM Yajun Deng <yajun.deng@linux.dev> wrote:
>>>> On 2023/9/28 22:18, Eric Dumazet wrote:
>>>>> On Thu, Sep 28, 2023 at 12:04 PM Yajun Deng <yajun.deng@linux.dev> wrote:
>>>>>> Although there is a kfree_skb_reason() helper function that can be used to
>>>>>> find the reason why an skb was dropped, most callers did not increase
>>>>>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.
>>>>>>
>>>>>> Users are more concerned about why the dropped counter shown by ip
>>>>>> is increasing.
>>>>>>
>>>>>> Introduce netdev_core_stats_inc() to trace the caller of the dropped
>>>>>> skb. Also, add __cold to netdev_core_stats_alloc(), as it is unlikely
>>>>>> to be called.
>>>>>>
>>>>>> Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
>>>>>> Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>>>>>> ---
>>>>>> v6: merge netdev_core_stats and netdev_core_stats_inc together
>>>>>> v5: Access the per-CPU pointer before reaching the relevant offset.
>>>>>> v4: Introduce netdev_core_stats_inc() instead of exporting dev_core_stats_*_inc()
>>>>>> v3: __cold should be added to the netdev_core_stats_alloc().
>>>>>> v2: use __cold instead of inline in dev_core_stats().
>>>>>> v1: https://lore.kernel.org/netdev/20230911082016.3694700-1-yajun.deng@linux.dev/
>>>>>> ---
>>>>>> include/linux/netdevice.h | 21 ++++-----------------
>>>>>> net/core/dev.c | 17 +++++++++++++++--
>>>>>> 2 files changed, 19 insertions(+), 19 deletions(-)
>>>>>>
>>>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>>>> index 7e520c14eb8c..eb1fa04fbccc 100644
>>>>>> --- a/include/linux/netdevice.h
>>>>>> +++ b/include/linux/netdevice.h
>>>>>> @@ -4002,32 +4002,19 @@ static __always_inline bool __is_skb_forwardable(const struct net_device *dev,
>>>>>> return false;
>>>>>> }
>>>>>>
>>>>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev);
>>>>>> -
>>>>>> -static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
>>>>>> -{
>>>>>> - /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
>>>>>> - struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
>>>>>> -
>>>>>> - if (likely(p))
>>>>>> - return p;
>>>>>> -
>>>>>> - return netdev_core_stats_alloc(dev);
>>>>>> -}
>>>>>> +void netdev_core_stats_inc(struct net_device *dev, u32 offset);
>>>>>>
>>>>>> #define DEV_CORE_STATS_INC(FIELD) \
>>>>>> static inline void dev_core_stats_##FIELD##_inc(struct net_device *dev) \
>>>>>> { \
>>>>>> - struct net_device_core_stats __percpu *p; \
>>>>>> - \
>>>>>> - p = dev_core_stats(dev); \
>>>>>> - if (p) \
>>>>>> - this_cpu_inc(p->FIELD); \
>>>>> Note that we were using this_cpu_inc() which implied :
>>>>> - IRQ safety, and
>>>>> - a barrier paired with :
>>>>>
>>>>> net/core/dev.c:10548: storage->rx_dropped += READ_ONCE(core_stats->rx_dropped);
>>>>> net/core/dev.c:10549: storage->tx_dropped += READ_ONCE(core_stats->tx_dropped);
>>>>> net/core/dev.c:10550: storage->rx_nohandler += READ_ONCE(core_stats->rx_nohandler);
>>>>> net/core/dev.c:10551: storage->rx_otherhost_dropped += READ_ONCE(core_stats->rx_otherhost_dropped);
>>>>>
>>>>>
>>>>>> + netdev_core_stats_inc(dev, \
>>>>>> + offsetof(struct net_device_core_stats, FIELD)); \
>>>>>> }
>>>>>> DEV_CORE_STATS_INC(rx_dropped)
>>>>>> DEV_CORE_STATS_INC(tx_dropped)
>>>>>> DEV_CORE_STATS_INC(rx_nohandler)
>>>>>> DEV_CORE_STATS_INC(rx_otherhost_dropped)
>>>>>> +#undef DEV_CORE_STATS_INC
>>>>>>
>>>>>> static __always_inline int ____dev_forward_skb(struct net_device *dev,
>>>>>> struct sk_buff *skb,
>>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>>>> index 606a366cc209..88a32c392c1d 100644
>>>>>> --- a/net/core/dev.c
>>>>>> +++ b/net/core/dev.c
>>>>>> @@ -10497,7 +10497,8 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64,
>>>>>> }
>>>>>> EXPORT_SYMBOL(netdev_stats_to_stats64);
>>>>>>
>>>>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>>>>>> +static __cold struct net_device_core_stats __percpu *netdev_core_stats_alloc(
>>>>>> + struct net_device *dev)
>>>>>> {
>>>>>> struct net_device_core_stats __percpu *p;
>>>>>>
>>>>>> @@ -10510,7 +10511,19 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
>>>>>> /* This READ_ONCE() pairs with the cmpxchg() above */
>>>>>> return READ_ONCE(dev->core_stats);
>>>>>> }
>>>>>> -EXPORT_SYMBOL(netdev_core_stats_alloc);
>>>>>> +
>>>>>> +void netdev_core_stats_inc(struct net_device *dev, u32 offset)
>>>>>> +{
>>>>>> + /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
>>>>>> + struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
>>>>>> +
>>>>>> + if (unlikely(!p))
>>>>>> + p = netdev_core_stats_alloc(dev);
>>>>>> +
>>>>>> + if (p)
>>>>>> + (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++;
>>>>> While here you are using a ++ operation that :
>>>>>
>>>>> - is not irq safe
>>>>> - might cause store-tearing.
>>>>>
>>>>> I would suggest a preliminary patch converting the "unsigned long" fields in
>>>>> struct net_device_core_stats to local_t
>>>> Do you mean that commit 6510ea973d8d ("net: Use this_cpu_inc() to
>>>> increment net->core_stats") needs to be reverted first? But that would
>>>> allocate memory, which breaks on PREEMPT_RT.
>>> I think I provided an (untested) alternative.
>>>
>>> unsigned long __percpu *field = (__force unsigned long __percpu *)
>>> ((__force u8 *)p + offset);
>>> this_cpu_inc(field);
>> unsigned long __percpu *field = (__force unsigned long __percpu *)
>> ((__force u8 *)p + offset);
>> this_cpu_inc(*(int *)field);
>>
>> This would compile successfully, but I didn't test it.
>> This could look complex.
> Why exactly ? Not very different from the cast you already had.
Okay, I'll test it.
>
>> Should I base it on v3, exporting dev_core_stats_*_inc() instead of introducing netdev_core_stats_inc()?
>> That would be easy.
> Well, you tell me, but this does not look incremental to me.
>
> I do not think we need 4 different (and maybe more to come if struct
> net_device_core_stats grows in the future) functions for some hardly
> used path.
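
For reference, the offsetof() approach discussed above can be sketched in
plain userspace C. This is only an illustration under stated assumptions,
not the kernel code: struct core_stats and core_stats_inc() are
hypothetical stand-ins for struct net_device_core_stats and
netdev_core_stats_inc(), there is no per-CPU storage, and the plain ++
here is exactly the kind of increment that would need to become
this_cpu_inc() in the kernel version.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct net_device_core_stats. */
struct core_stats {
	unsigned long rx_dropped;
	unsigned long tx_dropped;
	unsigned long rx_nohandler;
	unsigned long rx_otherhost_dropped;
};

/*
 * One out-of-line helper for every counter: the caller passes the byte
 * offset of the field instead of calling a per-field function.
 */
static void core_stats_inc(struct core_stats *s, size_t offset)
{
	unsigned long *field;

	/* Reject offsets that do not name an unsigned long inside *s. */
	assert(offset % sizeof(*field) == 0);
	assert(offset + sizeof(*field) <= sizeof(*s));

	field = (unsigned long *)((unsigned char *)s + offset);
	(*field)++;	/* plain increment: not IRQ safe, illustration only */
}

/* Thin inline wrappers, mirroring the DEV_CORE_STATS_INC() pattern. */
#define CORE_STATS_INC(FIELD)						\
static inline void core_stats_##FIELD##_inc(struct core_stats *s)	\
{									\
	core_stats_inc(s, offsetof(struct core_stats, FIELD));		\
}
CORE_STATS_INC(rx_dropped)
CORE_STATS_INC(tx_dropped)
CORE_STATS_INC(rx_nohandler)
CORE_STATS_INC(rx_otherhost_dropped)
#undef CORE_STATS_INC
```

With this layout a caller only sees core_stats_rx_dropped_inc(&stats) and
friends, while a tracer has a single function, core_stats_inc(), to attach
to, and adding a new counter does not add a new out-of-line function.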