From: Dongli Zhang <dongli.zhang@oracle.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: dsahern@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org,
linux-kernel@vger.kernel.org, davem@davemloft.net,
rostedt@goodmis.org, mingo@redhat.com, ast@kernel.org,
daniel@iogearbox.net, andrii@kernel.org, imagedong@tencent.com,
joao.m.martins@oracle.com, joe.jin@oracle.com,
edumazet@google.com
Subject: Re: [PATCH net-next v4 4/4] net: tun: track dropped skb via kfree_skb_reason()
Date: Wed, 2 Mar 2022 14:21:31 -0800 [thread overview]
Message-ID: <e3bc5ac4-e1db-584c-7219-54a09192a001@oracle.com> (raw)
In-Reply-To: <20220302111731.00746020@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net>
Hi Jakub,
On 3/2/22 11:17 AM, Jakub Kicinski wrote:
> On Wed, 2 Mar 2022 10:19:30 -0800 Dongli Zhang wrote:
>> On 3/1/22 6:50 PM, Jakub Kicinski wrote:
>>> On Sat, 26 Feb 2022 00:49:29 -0800 Dongli Zhang wrote:
>>>> + SKB_DROP_REASON_SKB_PULL, /* failed to pull sk_buff data */
>>>> + SKB_DROP_REASON_SKB_TRIM, /* failed to trim sk_buff data */
>>>
>>> IDK if these are not too low level and therefore lacking meaning.
>>>
>>> What are your thoughts David?
>>>
>>> Would it be better to up level the names a little bit and call SKB_PULL
>>> something like "HDR_TRUNC" or "HDR_INV" or "HDR_ERR" etc or maybe
>>> "L2_HDR_ERR" since in this case we seem to be pulling off ETH_HLEN?
>>
>> This is for device driver and I think for most of cases the people understanding
>> source code will be involved. I think SKB_PULL is more meaningful to people
>> understanding source code.
>>
>> The functions to pull data to skb is commonly used with the same pattern, and
>> not only for ETH_HLEN. E.g., I randomly found below in kernel source code.
>>
>> 1071 static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb)
>> 1072 {
>> ... ...
>> 1102 pulled_sci = pskb_may_pull(skb, macsec_extra_len(true));
>> 1103 if (!pulled_sci) {
>> 1104 if (!pskb_may_pull(skb, macsec_extra_len(false)))
>> 1105 goto drop_direct;
>> 1106 }
>> ... ...
>> 1254 drop_direct:
>> 1255 kfree_skb(skb);
>> 1256 *pskb = NULL;
>> 1257 return RX_HANDLER_CONSUMED;
>>
>>
>> About 'L2_HDR_ERR', I am curious what the user/administrator may do as next
>> step, while the 'SKB_PULL' will be very clear to the developers which kernel
>> operation (e.g., to pull some protocol/hdr data to sk_buff data) is with the issue.
>>
>> I may use 'L2_HDR_ERR' if you prefer.
>
> We don't have to break it out per layer if you prefer. Let's call it
> HDR_TRUNC.
>
> I don't like SKB_PULL, people using these trace points are as likely
> to be BPF developers as kernel developers and skb_pull will not be
> meaningful to them. Besides the code can check if header is not
> truncated in other ways than pskb_may_pull(). And calling things
> by the name of the helper that failed is bad precedent.
I will switch to SKB_DROP_REASON_HDR_TRUNC.
>
>>> For SKB_TRIM the error comes from allocation failures, there may be
>>> a whole bunch of skb helpers which will fail only under mem pressure,
>>> would it be better to identify them and return some ENOMEM related
>>> reason, since, most likely, those will be noise to whoever is tracking
>>> real errors?
>>
>> The reasons I want to use SKB_TRIM:
>>
>> 1. To have SKB_PULL and SKB_TRIM (perhaps more SKB_XXX in the future in the same
>> set).
>>
>> 2. Although so that SKB_TRIM is always caused by ENOMEM, suppose if there is new
>> return values by pskb_trim(), the reason is not going to be valid any longer.
>>
>>
>> I may use SKB_DROP_REASON_NOMEM if you prefer.
>>
>> Another concern is that many functions may return -ENOMEM. It is more likely
>> that if there are two "goto drop" to return -ENOMEM, we will not be able to tell
>> from which function the sk_buff is dropped, e.g.,
>>
>> if (function_A()) {
>> reason = -ENOMEM;
>> goto drop;
>> }
>>
>> if (function_B()) {
>> reason = -ENOMEM;
>> goto drop;
>> }
>
> Are you saying that you're intending to break out skb drop reasons
> by what entity failed to allocate memory? I'd think "skb was dropped
Yes.
> because of OOM" is what should be reported. What we were trying to
> allocate is not very relevant (and can be gotten from the stack trace
> if needed).
I think OOM is not enough. Although it may not be the case in this patchset,
sometimes the allocation is failed because we are allocating a large chunk of
physically continuous pages (kmalloc vs. vmalloc) while there is still plenty of
memory pages available.
As a kernel developer, it is very significant for me to identify the specific
line/function and specific data structure that cause the error. E.g, the bug
filer may be chasing which line is making trouble.
It is less likely to SKB_TRIM more than once in a driver function, compared to
ENOMEM.
I am the user of this patchset and I prefer to make my work easier in the future :)
>
>>>> SKB_DROP_REASON_DEV_HDR, /* there is something wrong with
>>>> * device driver specific header
>>>> */
>>>> + SKB_DROP_REASON_DEV_READY, /* device is not ready */
>>>
>>> What is ready? link is not up? peer not connected? can we expand?
>>
>> In this patchset, it is for either:
>>
>> - tun->tfiles[txq] is not set, or
>>
>> - !(tun->dev->flags & IFF_UP)
>>
>> I want to make it very generic so that the sk_buff dropped due to any device
>> level data structure that is not up/ready/initialized/allocated will use this
>> reason in the future.
>
> Let's expand the documentation so someone reading thru the enum can
> feel confident if they are using this reason correctly.
>
> Side note - you may want to switch to inline comments to make writing
> more verbose documentation, I mean:
>
> /* This is the explanation of reason one which explains what
> * reason ones means, and how it should be used. We can make
> * use of full line width this way.
> */
> SKB_DROP_REASON_ONE,
> /* And this is an explanation for reason two. */
> SKB_DROP_REASON_TWO,
>
I will expand the comments.
Thank you very much!
Dongli Zhang
next prev parent reply other threads:[~2022-03-02 22:22 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-26 8:49 [PATCH net-next v4 0/4] tun/tap: use kfree_skb_reason() to trace dropped skb Dongli Zhang
2022-02-26 8:49 ` [PATCH net-next v4 1/4] skbuff: introduce kfree_skb_list_reason() Dongli Zhang
2022-03-02 2:34 ` Jakub Kicinski
2022-03-02 16:49 ` Dongli Zhang
2022-02-26 8:49 ` [PATCH net-next v4 2/4] net: tap: track dropped skb via kfree_skb_reason() Dongli Zhang
2022-03-02 2:42 ` Jakub Kicinski
2022-03-02 17:43 ` Dongli Zhang
2022-03-02 19:03 ` Jakub Kicinski
2022-03-02 21:44 ` Dongli Zhang
2022-02-26 8:49 ` [PATCH net-next v4 3/4] net: tun: split run_ebpf_filter() and pskb_trim() into different "if statement" Dongli Zhang
2022-02-26 8:49 ` [PATCH net-next v4 4/4] net: tun: track dropped skb via kfree_skb_reason() Dongli Zhang
2022-02-28 1:20 ` David Ahern
2022-03-02 2:50 ` Jakub Kicinski
2022-03-02 3:29 ` David Ahern
2022-03-02 4:16 ` [Internet]Re: " imagedong(董梦龙)
2022-03-02 18:26 ` Dongli Zhang
2022-03-02 19:22 ` Jakub Kicinski
2022-03-02 18:19 ` Dongli Zhang
2022-03-02 19:17 ` Jakub Kicinski
2022-03-02 22:21 ` Dongli Zhang [this message]
2022-03-03 5:21 ` Jakub Kicinski
2022-03-03 5:55 ` Dongli Zhang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e3bc5ac4-e1db-584c-7219-54a09192a001@oracle.com \
--to=dongli.zhang@oracle.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=dsahern@gmail.com \
--cc=edumazet@google.com \
--cc=imagedong@tencent.com \
--cc=joao.m.martins@oracle.com \
--cc=joe.jin@oracle.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).