netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Drozdov <al.drozdov@gmail.com>
To: Willem de Bruijn <willemb@google.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Daniel Borkmann <dborkman@redhat.com>,
	Eric Dumazet <edumazet@google.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Network Development <netdev@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	tom@herbertland.com
Subject: Re: [PATCH 2/2] af_packet: pass checksum validation status to the user
Date: Fri, 20 Mar 2015 09:46:07 +0300	[thread overview]
Message-ID: <550BC22F.6090007@gmail.com> (raw)
In-Reply-To: <CA+FuTSeJH22uVJcZ6eAxau_ng4LdqLfVSRGYna_pi4pxr1Oz_A@mail.gmail.com>


On 19.03.2015 21:50:03 +0300 Willem de Bruijn wrote:
> On Thu, Mar 19, 2015 at 2:08 PM, Alexander Drozdov <al.drozdov@gmail.com> wrote:
>> On Thu, 19 Mar 2015 11:29:32 -0400, Willem de Bruijn <willemb@google.com>
>> wrote:
>>>
>>> On Thu, Mar 19, 2015 at 4:01 AM, Alexander Drozdov <al.drozdov@gmail.com>
>>> wrote:
>>>>
>>>> Introduce TP_STATUS_CSUM_VALID tp_status flag to tell the
>>>> af_packet user that at least the transport header checksum
>>>> has been already validated.
>>>
>>> This changes the interface slightly. Processes should be treating
>>> this previously unused bit as reserved and other flags have
>>> been added to the bitmap in this manner as well, so this should
>>> then be safe here, too.
>>>
>>>> For now, the flag may be set for incoming packets only.
>>>
>>> Why?
>>
>> I can't figure out how af_packet could get that the outgoing
>> packet's checksum has been validated. "Checksumming on
>> output" from skbuff.h tells that skb->ip_summed should
>> equal to CHECKSUM_NONE in that case,
>
> CHECKSUM_UNNECESSARY is a valid flag on the outgoing
> path according to that documentation. And, indeed, I also see
> no checksum scrubbing on forwarding paths in practice (but I
> may be wrong there, only took a quick glance).
>
>> but that is not true for me (in my tests, skb->ip_summed ==
>> CHECKSUM_UNNECESSARY for forwarded packets in some
>> cases).
>
> When you disable hardware checksum offload and generate
> packets locally, you do see the expected CHECKSUM_NONE
> value?

Yes, I do, but see below.

>>> You cannot change the semantics of the flag afterwards.
>>
>>
>> I think the semantics of the flag won't be changed if one set the flag
>> for outgoing packets. If the flag is not set (for any directions)
>> then that is not mean that the packet checksum is invalid. The user just can
>> then
>> checksum the packet by itself. So, the user may check the flag for any
>> packet
>> right now.
>
> I see. So it is a hint. Okay. It would be nice if it behaves as
> expected in as many cases as possible from the outset. This
> would include PACKET_OUTGOING and CHECKSUM_NONE.

I've just done some testing, and I've found that packets generated by
'nping --badsum' (socket(PF_INET, SOCK_RAW, IPPROTO_RAW)) have CHECKSUM_NONE
when they are viewed by af_packet. I've used rather old Linux kernel
for the tests, but isn't that a reason to not set TP_STATUS_CSUM_VALID for
outgoing packets right now?

> Please also note the flag and semantics in
> Documentation/networking/packet_mmap.txt

I'll do it and I'll resend the patches with the note.

>>
>>
>>> Better to support both directions from the start.
>>>
>>>> Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
>>>> ---
>>>>    include/uapi/linux/if_packet.h | 1 +
>>>>    net/packet/af_packet.c         | 9 +++++++++
>>>>    2 files changed, 10 insertions(+)
>>>>
>>>> diff --git a/include/uapi/linux/if_packet.h
>>>> b/include/uapi/linux/if_packet.h
>>>> index da2d668..053bd10 100644
>>>> --- a/include/uapi/linux/if_packet.h
>>>> +++ b/include/uapi/linux/if_packet.h
>>>> @@ -99,6 +99,7 @@ struct tpacket_auxdata {
>>>>    #define TP_STATUS_VLAN_VALID           (1 << 4) /* auxdata has valid
>>>> tp_vlan_tci */
>>>>    #define TP_STATUS_BLK_TMO              (1 << 5)
>>>>    #define TP_STATUS_VLAN_TPID_VALID      (1 << 6) /* auxdata has valid
>>>> tp_vlan_tpid */
>>>> +#define TP_STATUS_CSUM_VALID           (1 << 7)
>>>>
>>>>    /* Tx ring - header status */
>>>>    #define TP_STATUS_AVAILABLE          0
>>>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>>>> index 6ecf8dd..3f09dda 100644
>>>> --- a/net/packet/af_packet.c
>>>> +++ b/net/packet/af_packet.c
>>>> @@ -1918,6 +1918,10 @@ static int tpacket_rcv(struct sk_buff *skb, struct
>>>> net_device *dev,
>>>>
>>>>           if (skb->ip_summed == CHECKSUM_PARTIAL)
>>>>                   status |= TP_STATUS_CSUMNOTREADY;
>>>> +       else if (skb->pkt_type != PACKET_OUTGOING &&
>>>> +                (skb->ip_summed == CHECKSUM_COMPLETE ||
>>>> +                 skb_csum_unnecessary(skb)))
>>>> +               status |= TP_STATUS_CSUM_VALID;
>>>>
>>>>           if (snaplen > res)
>>>>                   snaplen = res;
>>>> @@ -3015,6 +3019,11 @@ static int packet_recvmsg(struct kiocb *iocb,
>>>> struct socket *sock,
>>>>                   aux.tp_status = TP_STATUS_USER;
>>>>                   if (skb->ip_summed == CHECKSUM_PARTIAL)
>>>>                           aux.tp_status |= TP_STATUS_CSUMNOTREADY;
>>>> +               else if (skb->pkt_type != PACKET_OUTGOING &&
>>>> +                        (skb->ip_summed == CHECKSUM_COMPLETE ||
>>>> +                         skb_csum_unnecessary(skb)))
>>>> +                       aux.tp_status |= TP_STATUS_CSUM_VALID;
>>>> +
>>>
>>> These two sections are near duplicates. I'd move the entire status
>>> initialization, including existing TP_STATUS_USER and
>>> TP_STATUS_CSUMNOTREADY fields to a helper function.
>>>
>>> It's a bit unfortunately that we have to use an extra bit to add a signal
>>> that is a near inverse of the existing TP_STATUS_CSUMNOTREADY
>>> (bar CHECKSUM_NONE). I do not immediately see a better way,
>>> either, though. And tp_status has plenty room at 32 bits.
>>>
>>>>                   aux.tp_len = PACKET_SKB_CB(skb)->origlen;
>>>>                   aux.tp_snaplen = skb->len;
>>>>                   aux.tp_mac = 0;
>>>> --
>>>> 1.9.1
>>>>
>>
>>

  reply	other threads:[~2015-03-20  6:46 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-19  8:01 [PATCH 1/2] af_packet: make tpacket_rcv to not set status value before run_filter Alexander Drozdov
2015-03-19  8:01 ` [PATCH 2/2] af_packet: pass checksum validation status to the user Alexander Drozdov
2015-03-19 15:29   ` Willem de Bruijn
2015-03-19 18:08     ` Alexander Drozdov
2015-03-19 18:50       ` Willem de Bruijn
2015-03-20  6:46         ` Alexander Drozdov [this message]
2015-03-20 15:29           ` Willem de Bruijn
2015-03-23  6:11   ` [PATCH V2 1/2] af_packet: make tpacket_rcv to not set status value before run_filter Alexander Drozdov
2015-03-23  6:11     ` [PATCH V2 2/2] af_packet: pass checksum validation status to the user Alexander Drozdov
2015-03-24  2:01       ` David Miller
2015-03-24  2:01     ` [PATCH V2 1/2] af_packet: make tpacket_rcv to not set status value before run_filter David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=550BC22F.6090007@gmail.com \
    --to=al.drozdov@gmail.com \
    --cc=davem@davemloft.net \
    --cc=dborkman@redhat.com \
    --cc=edumazet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=tom@herbertland.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).