From: Davide Caratti <dcaratti@redhat.com>
To: Tom Herbert <tom@herbertland.com>,
Alexander Duyck <alexander.duyck@gmail.com>,
David Laight <David.Laight@aculab.com>,
"David S . Miller" <davem@davemloft.net>,
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: netdev@vger.kernel.org, linux-sctp@vger.kernel.org
Subject: Re: [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c
Date: Thu, 13 Apr 2017 10:36:34 +0000 [thread overview]
Message-ID: <cover.1492077747.git.dcaratti@redhat.com> (raw)
In-Reply-To: <CALx6S36rem=OuN_At6qYA=se5cpuYM1N2R8efoaszvo8b8Tz5A@mail.gmail.com>
thank you,
On Fri, 2017-04-07 at 08:43 -0700, Tom Herbert wrote:
> Maybe just call it csum_not_ip then. Then just do "if
> (unlikely(skb->csum_not_ip)) ..."
OK, I will rename the bit, avoid the enum and use the 'unlikely'. Up to now,
this series uses the bit for SCTP only and leaves unmodified behavior of
offloaded FCoE frames: please let me know if you disagree on that.
On Fri, 2017-04-07 at 08:43 -0700, Tom Herbert wrote:
> On Fri, Apr 7, 2017 at 10:29 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> > In my understanding, csum_algo needs to be set to INTERNET_CHECKSUM after the
> > CRC32c is computed. Otherwise, after subsequent operation on the skb (e.g. it
> > is encapsulated in a UDP frame), there is the possibility for skb->ip_summed
> > to become CHECKSUM_PARTIAL again. So, to ensure that skb_checksum_help() and
> > not skb_crc32c_help() will be called, csum_algo must be 0.
> ip_summed should no longer be CHECKSUM_PARTIAL with CRC32c is computed.
Even though it's uncommon, skb->ip_summed can become CHECKSUM_PARTIAL again
after the CRC32c is computed and CHECKSUM_NONE is set: for example, when a
veth and a vxlan with UDP checksums are enslaved to the same bridge, and the
NIC below vxlan has no checksumming capabilities. Here, validate_xmit_skb is
called three times on the same skb (see perf output at the bottom):
* before transmission on the veth: here ip_summed is CHECKSUM_PARTIAL, but
the device supports CRC32c offload so the skb is (correctly) untouched.
* before vxlan encapsulation: here ip_summed is CHECKSUM_PARTIAL,
skb->csum_not_inet is 1 and NETIF_F_SCTP_CRC is not set. Here,
skb_csum_hwoffload_help() correctly fills the CRC32c and assigns ip_summed
to CHECKSUM_NONE.
* before transmission on the NIC: ip_summed is CHECKSUM_PARTIAL again (because
udp_set_csum changed csum_start and csum_offset to point to the tunnel
UDP header). No bit in NETIF_F_HW_CSUM is set: if skb->csum_not_inet is still 1,
the helper (wrongly) computes CRC32c again, thus corrupting the outer UDP
transport header. On the contrary, if skb->csum_not_inet is 0, skb_checksum_help()
correctly resolves CHECKSUM_PARTIAL.
To avoid this problem, skb->csum_not_inet must be assigned to 0 every time
the CHECKSUM_PARTIAL is resolved on skb carrying SCTP packets.
> > To minimize the impact of the patch, I substituted all assignments of skb->ip_summed,
> > done by SCTP-related code, with calls to skb_set_crc32c_ipsummed(). The alternative is
> > to explicitly set csum_algo to 0 (INTERNET_CHECKSUM) in SCTP-related code. Do you agree?
> No, like I said the only case where this new bit is relevant is when
> CHECKSUM_PARTIAL for a CRC is being done. When it's set for offloading
> sctp crc it must be set. When CRC is resolved, in the helper for
> instance, it must be cleared. If these rules are properly followed
> then the bit will be zero in all other cases without needing any
> additional work or conditionals.
At a minimum, this csum_not_inet bit needs to be cleared in three places:
1) in skb_crc32c_csum_help, to fix scenarios like veth->bridge->vxlan->NIC above.
2) in sctp_gso_make_checksum, a SCTP GSO packet is segmented and CRC32c is written
on each segment. skb->ip_summed transitions from CHECKSUM_PARTIAL to CHECKSUM_NONE.
3) in act_csum, because TC action mangling the packet are called before
validate_xmit_skb().
It is not necessary to do it in netfilter NAT (even it is harmless), because
SCTP packets having CHECKSUM_PARTIAL are not resolved (since commit 3189a290f98d
"netfilter: nat: skip checksum on offload SCTP packets"). And it should be not
needed in IPVS code, because ip_summed is set to CHECKSUM_UNNECESSARY, so skb
is not going to be checksummed anymore.
thank you in advance for the feedback!
regards,
--
davide
scenario:
vethA --> vethB --> bridge --> vxlan encaps --> NIC
# ./perf probe -k vmlinux --add \
"skb_csum_hwoffload_help csum_offset=skb->csum_offset ip_summed=skb->ip_summed:b2@7/32 skb=skb"
# ./perf record -e probe:skb_csum_hwoffload_help -aR -- ./scenario-vxlan.sh
# ./perf script
<....>
ncat 7577 [000] 22056.548907: probe:skb_csum_hwoffload_help: (ffffffff8162a6f0) csum_offset=8 ip_summed=1 skb=0xffff880106f6bd00
ncat 7577 [000] 22056.548915: probe:skb_csum_hwoffload_help: (ffffffff8162a6f0) csum_offset=8 ip_summed=1 skb=0xffff880106f6bd00
ncat 7577 [000] 22056.548917: probe:skb_csum_hwoffload_help: (ffffffff8162a6f0) csum_offset=6 ip_summed=1 skb=0xffff880106f6bd00
<....>
WARNING: multiple messages have this Message-ID (diff)
From: Davide Caratti <dcaratti@redhat.com>
To: Tom Herbert <tom@herbertland.com>,
Alexander Duyck <alexander.duyck@gmail.com>,
David Laight <David.Laight@aculab.com>,
"David S . Miller" <davem@davemloft.net>,
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: netdev@vger.kernel.org, linux-sctp@vger.kernel.org
Subject: Re: [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c
Date: Thu, 13 Apr 2017 12:36:34 +0200 [thread overview]
Message-ID: <cover.1492077747.git.dcaratti@redhat.com> (raw)
In-Reply-To: <CALx6S36rem=OuN_At6qYA=se5cpuYM1N2R8efoaszvo8b8Tz5A@mail.gmail.com>
thank you,
On Fri, 2017-04-07 at 08:43 -0700, Tom Herbert wrote:
> Maybe just call it csum_not_ip then. Then just do "if
> (unlikely(skb->csum_not_ip)) ..."
OK, I will rename the bit, avoid the enum and use the 'unlikely'. Up to now,
this series uses the bit for SCTP only and leaves unmodified behavior of
offloaded FCoE frames: please let me know if you disagree on that.
On Fri, 2017-04-07 at 08:43 -0700, Tom Herbert wrote:
> On Fri, Apr 7, 2017 at 10:29 AM, Davide Caratti <dcaratti@redhat.com> wrote:
> > In my understanding, csum_algo needs to be set to INTERNET_CHECKSUM after the
> > CRC32c is computed. Otherwise, after subsequent operation on the skb (e.g. it
> > is encapsulated in a UDP frame), there is the possibility for skb->ip_summed
> > to become CHECKSUM_PARTIAL again. So, to ensure that skb_checksum_help() and
> > not skb_crc32c_help() will be called, csum_algo must be 0.
> ip_summed should no longer be CHECKSUM_PARTIAL with CRC32c is computed.
Even though it's uncommon, skb->ip_summed can become CHECKSUM_PARTIAL again
after the CRC32c is computed and CHECKSUM_NONE is set: for example, when a
veth and a vxlan with UDP checksums are enslaved to the same bridge, and the
NIC below vxlan has no checksumming capabilities. Here, validate_xmit_skb is
called three times on the same skb (see perf output at the bottom):
* before transmission on the veth: here ip_summed is CHECKSUM_PARTIAL, but
the device supports CRC32c offload so the skb is (correctly) untouched.
* before vxlan encapsulation: here ip_summed is CHECKSUM_PARTIAL,
skb->csum_not_inet is 1 and NETIF_F_SCTP_CRC is not set. Here,
skb_csum_hwoffload_help() correctly fills the CRC32c and assigns ip_summed
to CHECKSUM_NONE.
* before transmission on the NIC: ip_summed is CHECKSUM_PARTIAL again (because
udp_set_csum changed csum_start and csum_offset to point to the tunnel
UDP header). No bit in NETIF_F_HW_CSUM is set: if skb->csum_not_inet is still 1,
the helper (wrongly) computes CRC32c again, thus corrupting the outer UDP
transport header. On the contrary, if skb->csum_not_inet is 0, skb_checksum_help()
correctly resolves CHECKSUM_PARTIAL.
To avoid this problem, skb->csum_not_inet must be assigned to 0 every time
the CHECKSUM_PARTIAL is resolved on skb carrying SCTP packets.
> > To minimize the impact of the patch, I substituted all assignments of skb->ip_summed,
> > done by SCTP-related code, with calls to skb_set_crc32c_ipsummed(). The alternative is
> > to explicitly set csum_algo to 0 (INTERNET_CHECKSUM) in SCTP-related code. Do you agree?
> No, like I said the only case where this new bit is relevant is when
> CHECKSUM_PARTIAL for a CRC is being done. When it's set for offloading
> sctp crc it must be set. When CRC is resolved, in the helper for
> instance, it must be cleared. If these rules are properly followed
> then the bit will be zero in all other cases without needing any
> additional work or conditionals.
At a minimum, this csum_not_inet bit needs to be cleared in three places:
1) in skb_crc32c_csum_help, to fix scenarios like veth->bridge->vxlan->NIC above.
2) in sctp_gso_make_checksum, a SCTP GSO packet is segmented and CRC32c is written
on each segment. skb->ip_summed transitions from CHECKSUM_PARTIAL to CHECKSUM_NONE.
3) in act_csum, because TC action mangling the packet are called before
validate_xmit_skb().
It is not necessary to do it in netfilter NAT (even it is harmless), because
SCTP packets having CHECKSUM_PARTIAL are not resolved (since commit 3189a290f98d
"netfilter: nat: skip checksum on offload SCTP packets"). And it should be not
needed in IPVS code, because ip_summed is set to CHECKSUM_UNNECESSARY, so skb
is not going to be checksummed anymore.
thank you in advance for the feedback!
regards,
next prev parent reply other threads:[~2017-04-13 10:36 UTC|newest]
Thread overview: 104+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-01-23 16:52 [RFC PATCH net-next 0/5] net: improve support for SCTP checksums Davide Caratti
2017-01-23 16:52 ` Davide Caratti
2017-01-23 16:52 ` [RFC PATCH net-next 1/5] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
2017-01-23 16:52 ` Davide Caratti
2017-01-23 16:52 ` [RFC PATCH net-next 2/5] net: split skb_checksum_help Davide Caratti
2017-01-23 16:52 ` Davide Caratti
2017-01-23 20:59 ` Tom Herbert
2017-01-23 20:59 ` Tom Herbert
2017-01-24 16:35 ` David Laight
2017-01-24 16:35 ` David Laight
2017-02-02 15:07 ` Davide Caratti
2017-02-02 15:07 ` Davide Caratti
2017-02-02 16:55 ` David Laight
2017-02-02 16:55 ` David Laight
2017-02-02 18:08 ` Tom Herbert
2017-02-02 18:08 ` Tom Herbert
2017-02-27 13:39 ` Davide Caratti
2017-02-27 13:39 ` Davide Caratti
2017-02-27 15:11 ` Tom Herbert
2017-02-27 15:11 ` Tom Herbert
2017-02-28 10:31 ` Davide Caratti
2017-02-28 10:31 ` Davide Caratti
2017-02-28 10:32 ` [PATCH RFC net-next v2 1/4] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
2017-02-28 10:32 ` Davide Caratti
2017-02-28 22:46 ` Alexander Duyck
2017-02-28 22:46 ` Alexander Duyck
2017-03-01 3:17 ` Tom Herbert
2017-03-01 3:17 ` Tom Herbert
2017-03-01 10:53 ` David Laight
2017-03-01 10:53 ` David Laight
2017-03-06 21:51 ` Davide Caratti
2017-03-06 21:51 ` Davide Caratti
2017-03-07 18:06 ` Alexander Duyck
2017-03-07 18:06 ` Alexander Duyck
2017-03-18 13:17 ` Davide Caratti
2017-03-18 13:17 ` Davide Caratti
2017-03-18 22:35 ` Tom Herbert
2017-03-18 22:35 ` Tom Herbert
2017-04-07 14:16 ` [PATCH RFC net-next v3 0/7] improve CRC32c in the forwarding path Davide Caratti
2017-04-07 14:16 ` Davide Caratti
2017-04-07 14:16 ` [PATCH RFC net-next v3 1/7] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
2017-04-07 14:16 ` Davide Caratti
2017-04-07 14:16 ` [PATCH RFC net-next v3 2/7] net: introduce skb_crc32c_csum_help Davide Caratti
2017-04-07 14:16 ` Davide Caratti
2017-04-07 14:16 ` [PATCH RFC net-next v3 3/7] sk_buff: remove support for csum_bad in sk_buff Davide Caratti
2017-04-07 14:16 ` Davide Caratti
2017-04-07 14:16 ` [PATCH RFC net-next v3 4/7] net: use skb->csum_algo to identify packets needing crc32c Davide Caratti
2017-04-07 14:16 ` Davide Caratti
2017-04-07 15:43 ` Tom Herbert
2017-04-07 15:43 ` Tom Herbert
2017-04-07 17:29 ` Davide Caratti
2017-04-07 17:29 ` Davide Caratti
2017-04-07 18:11 ` Tom Herbert
2017-04-07 18:11 ` Tom Herbert
2017-04-13 10:36 ` Davide Caratti [this message]
2017-04-13 10:36 ` Davide Caratti
2017-04-20 13:38 ` [PATCH RFC net-next v4 0/7] net: improve support for SCTP checksums Davide Caratti
2017-04-20 13:38 ` Davide Caratti
2017-04-27 12:41 ` Marcelo Ricardo Leitner
2017-04-27 12:41 ` Marcelo Ricardo Leitner
2017-04-20 13:38 ` [PATCH RFC net-next v4 1/7] skbuff: add stub to help computing crc32c on SCTP packets Davide Caratti
2017-04-20 13:38 ` Davide Caratti
2017-04-20 13:38 ` [PATCH RFC net-next v4 2/7] net: introduce skb_crc32c_csum_help Davide Caratti
2017-04-20 13:38 ` Davide Caratti
2017-04-27 12:29 ` Marcelo Ricardo Leitner
2017-04-27 12:29 ` Marcelo Ricardo Leitner
2017-04-20 13:38 ` [PATCH RFC net-next v4 3/7] sk_buff: remove support for csum_bad in sk_buff Davide Caratti
2017-04-20 13:38 ` Davide Caratti
2017-04-27 1:34 ` [sk_buff] 95510aef27: BUG:Bad_page_state_in_process kernel test robot
2017-04-27 1:34 ` kernel test robot
2017-04-29 20:21 ` [PATCH RFC net-next v4 3/7] sk_buff: remove support for csum_bad in sk_buff Tom Herbert
2017-04-29 20:21 ` Tom Herbert
2017-04-20 13:38 ` [PATCH RFC net-next v4 4/7] net: use skb->csum_not_inet to identify packets needing crc32c Davide Caratti
2017-04-20 13:38 ` Davide Caratti
2017-04-29 20:18 ` Tom Herbert
2017-04-29 20:18 ` Tom Herbert
2017-04-20 13:38 ` [PATCH RFC net-next v4 5/7] net: more accurate checksumming in validate_xmit_skb() Davide Caratti
2017-04-20 13:38 ` Davide Caratti
2017-04-20 13:38 ` [PATCH RFC net-next v4 6/7] openvswitch: more accurate checksumming in queue_userspace_packet() Davide Caratti
2017-04-20 13:38 ` Davide Caratti
2017-04-20 13:38 ` [PATCH RFC net-next v4 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY} Davide Caratti
2017-04-20 13:38 ` Davide Caratti
2017-04-29 20:20 ` Tom Herbert
2017-04-29 20:20 ` Tom Herbert
2017-04-07 14:16 ` [PATCH RFC net-next v3 5/7] net: more accurate checksumming in validate_xmit_skb() Davide Caratti
2017-04-07 14:16 ` Davide Caratti
2017-04-07 14:16 ` [PATCH RFC net-next v3 6/7] openvswitch: more accurate checksumming in queue_userspace_packet() Davide Caratti
2017-04-07 14:16 ` Davide Caratti
2017-04-07 14:16 ` [PATCH RFC net-next v3 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY} Davide Caratti
2017-04-07 14:16 ` Davide Caratti
2017-02-28 10:32 ` [PATCH RFC net-next v2 2/4] net: introduce skb_sctp_csum_help Davide Caratti
2017-02-28 10:32 ` Davide Caratti
2017-02-28 10:32 ` [PATCH RFC net-next v2 3/4] net: more accurate checksumming in validate_xmit_skb Davide Caratti
2017-02-28 10:32 ` Davide Caratti
2017-02-28 19:50 ` Tom Herbert
2017-02-28 19:50 ` Tom Herbert
2017-02-28 10:32 ` [PATCH RFC net-next v2 4/4] Documentation: update notes on checksum offloading Davide Caratti
2017-02-28 10:32 ` Davide Caratti
2017-01-23 16:52 ` [RFC PATCH net-next 3/5] net: introduce skb_sctp_csum_help Davide Caratti
2017-01-23 16:52 ` Davide Caratti
2017-01-23 16:52 ` [RFC PATCH net-next 4/5] net: more accurate checksumming in validate_xmit_skb Davide Caratti
2017-01-23 16:52 ` Davide Caratti
2017-01-23 16:52 ` [RFC PATCH net-next 5/5] Documentation: add description of skb_sctp_csum_help Davide Caratti
2017-01-23 16:52 ` Davide Caratti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1492077747.git.dcaratti@redhat.com \
--to=dcaratti@redhat.com \
--cc=David.Laight@aculab.com \
--cc=alexander.duyck@gmail.com \
--cc=davem@davemloft.net \
--cc=linux-sctp@vger.kernel.org \
--cc=marcelo.leitner@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=tom@herbertland.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.