* Re: [PATCH V2 net-next 06/14] net/tls: Add generic NIC offload infrastructure
From: Eric Dumazet @ 2018-03-21 21:10 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller
Cc: netdev, Dave Watson, Boris Pismenny, Ilya Lesokhin,
Aviad Yehezkel
In-Reply-To: <20180321210146.22537-7-saeedm@mellanox.com>
On 03/21/2018 02:01 PM, Saeed Mahameed wrote:
> From: Ilya Lesokhin <ilyal@mellanox.com>
>
> This patch adds a generic infrastructure to offload TLS crypto to a
...
> +
> +static inline int tls_push_record(struct sock *sk,
> + struct tls_context *ctx,
> + struct tls_offload_context *offload_ctx,
> + struct tls_record_info *record,
> + struct page_frag *pfrag,
> + int flags,
> + unsigned char record_type)
> +{
> + skb_frag_t *frag;
> + struct tcp_sock *tp = tcp_sk(sk);
> + struct page_frag fallback_frag;
> + struct page_frag *tag_pfrag = pfrag;
> + int i;
> +
> + /* fill prepand */
> + frag = &record->frags[0];
> + tls_fill_prepend(ctx,
> + skb_frag_address(frag),
> + record->len - ctx->prepend_size,
> + record_type);
> +
> + if (unlikely(!skb_page_frag_refill(ctx->tag_size, pfrag, GFP_KERNEL))) {
> + /* HW doesn't care about the data in the tag
> + * so in case pfrag has no room
> + * for a tag and we can't allocate a new pfrag
> + * just use the page in the first frag
> + * rather then write a complicated fall back code.
> + */
> + tag_pfrag = &fallback_frag;
> + tag_pfrag->page = skb_frag_page(frag);
> + tag_pfrag->offset = 0;
> + }
> +
If HW does not care, why even trying to call skb_page_frag_refill() ?
If you remove it, then we remove one seldom used path and might uncover bugs
This part looks very suspect to me, to be honest.
^ permalink raw reply
* Re: [PATCH net-next v2] net: mvpp2: Don't use dynamic allocs for local variables
From: Maxime Chevallier @ 2018-03-21 21:14 UTC (permalink / raw)
To: Yan Markman
Cc: davem@davemloft.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, Antoine Tenart,
thomas.petazzoni@bootlin.com, gregory.clement@bootlin.com,
miquel.raynal@bootlin.com, Nadav Haklai, Stefan Chulski,
mw@semihalf.com
In-Reply-To: <b97e76e64d494912ae09efcc64d612ff@IL-EXCH01.marvell.com>
Hello Yan,
On Wed, 21 Mar 2018 19:57:47 +0000,
Yan Markman <ymarkman@marvell.com> wrote :
> Hi Maxime
Please avoid top-posting on this list.
> Please check the TWO points:
>
> 1). The mvpp2_prs_flow_find() returns TID if found
> The TID=0 is valid FOUND value
> For Not-found use -ENOENT (just like your mvpp2_prs_vlan_find)
This is actually what is used in this patch. You might be refering to
a previous draft version of this patch.
> 2). The original code always uses "mvpp2_prs_entry *pe" storage
> Zero-Allocated Please check the correctnes of new "mvpp2_prs_entry
> pe" without memset(pe, 0, sizeof(pe));
> in all procedures where pe=kzalloc() has been replaced
I think we're good on that regard. On places where I didn't memset the
prs_entry, the pe.index field is set, and this is followed by a read
from TCAM that will initialize the prs_entry to the correct value :
pe.index = tid;
mvpp2_prs_hw_read(priv, &pe);
> Thanks
> Yan Markman
[...]
Thanks,
Maxime
^ permalink raw reply
* Re: [PATCH][next] gre: fix TUNNEL_SEQ bit check on sequence numbering
From: William Tu @ 2018-03-21 21:22 UTC (permalink / raw)
To: Colin King
Cc: David S . Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
Linux Kernel Network Developers, kernel-janitors, LKML
In-Reply-To: <20180321193458.4451-1-colin.king@canonical.com>
On Wed, Mar 21, 2018 at 12:34 PM, Colin King <colin.king@canonical.com> wrote:
> From: Colin Ian King <colin.king@canonical.com>
>
> The current logic of flags | TUNNEL_SEQ is always non-zero and hence
> sequence numbers are always incremented no matter the setting of the
> TUNNEL_SEQ bit. Fix this by using & instead of |.
>
> Detected by CoverityScan, CID#1466039 ("Operands don't affect result")
>
> Fixes: 77a5196a804e ("gre: add sequence number for collect md mode.")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Thanks for the fix!
btw, how can I access the CoverityScan result with this CID?
Acked-by: William Tu <u9012063@gmail.com>
> ---
> net/ipv4/ip_gre.c | 2 +-
> net/ipv6/ip6_gre.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
> index 2fa2ef2e2af9..9ab1aa2f7660 100644
> --- a/net/ipv4/ip_gre.c
> +++ b/net/ipv4/ip_gre.c
> @@ -550,7 +550,7 @@ static void gre_fb_xmit(struct sk_buff *skb, struct net_device *dev,
> (TUNNEL_CSUM | TUNNEL_KEY | TUNNEL_SEQ);
> gre_build_header(skb, tunnel_hlen, flags, proto,
> tunnel_id_to_key32(tun_info->key.tun_id),
> - (flags | TUNNEL_SEQ) ? htonl(tunnel->o_seqno++) : 0);
> + (flags & TUNNEL_SEQ) ? htonl(tunnel->o_seqno++) : 0);
>
> df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
>
> diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
> index 0bcefc480aeb..3a98c694da5f 100644
> --- a/net/ipv6/ip6_gre.c
> +++ b/net/ipv6/ip6_gre.c
> @@ -725,7 +725,7 @@ static netdev_tx_t __gre6_xmit(struct sk_buff *skb,
> gre_build_header(skb, tunnel->tun_hlen,
> flags, protocol,
> tunnel_id_to_key32(tun_info->key.tun_id),
> - (flags | TUNNEL_SEQ) ? htonl(tunnel->o_seqno++)
> + (flags & TUNNEL_SEQ) ? htonl(tunnel->o_seqno++)
> : 0);
>
> } else {
> --
> 2.15.1
>
^ permalink raw reply
* Re: [PATCH net-next RFC V1 1/5] net: Introduce peer to peer one step PTP time stamping.
From: Richard Cochran @ 2018-03-21 21:26 UTC (permalink / raw)
To: Keller, Jacob E
Cc: netdev@vger.kernel.org, devicetree@vger.kernel.org, Andrew Lunn,
David Miller, Florian Fainelli, Mark Rutland, Miroslav Lichvar,
Rob Herring, Willem de Bruijn
In-Reply-To: <02874ECE860811409154E81DA85FBB5882D02105@ORSMSX115.amr.corp.intel.com>
On Wed, Mar 21, 2018 at 08:05:36PM +0000, Keller, Jacob E wrote:
> I am guessing that we expect all devices which support onestep P2P messages, will always support onestep SYNC as well?
Yes. Anything else doesn't make sense, don't you think?
Also, reading 1588, it isn't clear whether supporting only 1-step Sync
without 1-step P2P is even intended. There is only a "one-step
clock", and it is described as doing both.
Thanks,
Richard
^ permalink raw reply
* Re: [PATCH net-next RFC V1 5/5] net: mdio: Add a driver for InES time stamping IP core.
From: Richard Cochran @ 2018-03-21 21:36 UTC (permalink / raw)
To: Andrew Lunn
Cc: netdev, devicetree, David Miller, Florian Fainelli, Mark Rutland,
Miroslav Lichvar, Rob Herring, Willem de Bruijn
In-Reply-To: <20180321193315.GR24516@lunn.ch>
On Wed, Mar 21, 2018 at 08:33:15PM +0100, Andrew Lunn wrote:
> Can you point us at some documentation for this.
The overall one-step functionality is described IEEE 1588.
> I think Florian and I want to better understand how this device works,
> in order to understand your other changes.
The device is from here:
https://www.zhaw.ch/en/engineering/institutes-centres/ines/products-and-services/ptp-ieee-1588/ptp-hardware/#c43991
The only other docs that I have is a PDF of the register layout, but I
don't think I can redistribute that. Actually, there really isn't any
detail in that doc at all.
Thanks,
Richard
^ permalink raw reply
* Re: [PATCH net-next RFC V1 5/5] net: mdio: Add a driver for InES time stamping IP core.
From: Andrew Lunn @ 2018-03-21 21:44 UTC (permalink / raw)
To: Richard Cochran
Cc: netdev, devicetree, David Miller, Florian Fainelli, Mark Rutland,
Miroslav Lichvar, Rob Herring, Willem de Bruijn
In-Reply-To: <20180321213636.2mmfveu2vg5qbwpp@localhost>
Hi Richard
> The only other docs that I have is a PDF of the register layout, but I
> don't think I can redistribute that. Actually, there really isn't any
> detail in that doc at all.
O.K, so lets do the 20 questions approach.
As far as i can see, this is not an MDIO device. It is not connected
to the MDIO bus, it has no MDIO registers, you don't even pass a valid
MDIO address in device tree.
It it actually an MII bus snooper? Does it snoop, or is it actually in
the MII bus, and can modify packets, i.e. insert time stamps as frames
pass over the MII bus?
When the driver talks about having three ports, does that mean it can
be on three different MII busses?
Thanks
Andrew
^ permalink raw reply
* Re: [PATCH net-next RFC V1 2/5] net: phy: Move time stamping interface into the generic mdio layer.
From: Richard Cochran @ 2018-03-21 21:45 UTC (permalink / raw)
To: Florian Fainelli
Cc: netdev, devicetree, Andrew Lunn, David Miller, Mark Rutland,
Miroslav Lichvar, Rob Herring, Willem de Bruijn
In-Reply-To: <e6c7b4e1-9449-58be-f670-4b6a8aa4b526@gmail.com>
On Wed, Mar 21, 2018 at 12:10:07PM -0700, Florian Fainelli wrote:
> > + phydev->mdio.ts_info = dp83640_ts_info;
> > + phydev->mdio.hwtstamp = dp83640_hwtstamp;
> > + phydev->mdio.rxtstamp = dp83640_rxtstamp;
> > + phydev->mdio.txtstamp = dp83640_txtstamp;
>
> Why is this implemented a the mdio_device level and not at the
> mdio_driver level? This looks like the wrong level at which this is done.
The question could be asked of:
struct mdio_device {
int (*bus_match)(struct device *dev, struct device_driver *drv);
void (*device_free)(struct mdio_device *mdiodev);
void (*device_remove)(struct mdio_device *mdiodev);
}
I saw how this is done for the phy, etc, but I don't see any benefit
of doing it that way. It would add an extra layer (or two) of
indirection and save the space four pointer functions. Is that
trade-off worth it?
Thanks,
Richard
^ permalink raw reply
* Re: [PATCH REPOST v4 5/7] ixgbevf: keep writel() closer to wmb()
From: Jeff Kirsher @ 2018-03-21 21:48 UTC (permalink / raw)
To: Sinan Kaya
Cc: netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
intel-wired-lan, linux-kernel
In-Reply-To: <1521658572-26354-6-git-send-email-okaya@codeaurora.org>
[-- Attachment #1: Type: text/plain, Size: 545 bytes --]
On Wed, 2018-03-21 at 14:56 -0400, Sinan Kaya wrote:
> Remove ixgbevf_write_tail() in favor of moving writel() close to
> wmb().
>
> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
> drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 -----
> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 ++--
> 2 files changed, 2 insertions(+), 7 deletions(-)
This patch fails to compile because there is a call to
ixgbevf_write_tail() which you missed cleaning up.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCH REPOST v4 5/7] ixgbevf: keep writel() closer to wmb()
From: okaya @ 2018-03-21 21:51 UTC (permalink / raw)
To: jeffrey.t.kirsher
Cc: netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
intel-wired-lan, linux-kernel
In-Reply-To: <1521668888.12746.32.camel@intel.com>
On 2018-03-21 17:48, Jeff Kirsher wrote:
> On Wed, 2018-03-21 at 14:56 -0400, Sinan Kaya wrote:
>> Remove ixgbevf_write_tail() in favor of moving writel() close to
>> wmb().
>>
>> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
>> Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
>> ---
>> drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 -----
>> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 ++--
>> 2 files changed, 2 insertions(+), 7 deletions(-)
>
> This patch fails to compile because there is a call to
> ixgbevf_write_tail() which you missed cleaning up.
Hah, I did a compile test but maybe I missed something. I will get v6 of
this patch only and leave the rest of the series as it is.
^ permalink raw reply
* Re: [PATCH net-next v5 1/2] net: permit skb_segment on head_frag frag_list skb
From: Alexander Duyck @ 2018-03-21 21:51 UTC (permalink / raw)
To: Yonghong Song
Cc: Eric Dumazet, ast, Daniel Borkmann, diptanu, Netdev, Kernel Team
In-Reply-To: <20180321203650.1404106-2-yhs@fb.com>
On Wed, Mar 21, 2018 at 1:36 PM, Yonghong Song <yhs@fb.com> wrote:
> One of our in-house projects, bpf-based NAT, hits a kernel BUG_ON at
> function skb_segment(), line 3667. The bpf program attaches to
> clsact ingress, calls bpf_skb_change_proto to change protocol
> from ipv4 to ipv6 or from ipv6 to ipv4, and then calls bpf_redirect
> to send the changed packet out.
>
> 3472 struct sk_buff *skb_segment(struct sk_buff *head_skb,
> 3473 netdev_features_t features)
> 3474 {
> 3475 struct sk_buff *segs = NULL;
> 3476 struct sk_buff *tail = NULL;
> ...
> 3665 while (pos < offset + len) {
> 3666 if (i >= nfrags) {
> 3667 BUG_ON(skb_headlen(list_skb));
> 3668
> 3669 i = 0;
> 3670 nfrags = skb_shinfo(list_skb)->nr_frags;
> 3671 frag = skb_shinfo(list_skb)->frags;
> 3672 frag_skb = list_skb;
> ...
>
> call stack:
> ...
> #1 [ffff883ffef03558] __crash_kexec at ffffffff8110c525
> #2 [ffff883ffef03620] crash_kexec at ffffffff8110d5cc
> #3 [ffff883ffef03640] oops_end at ffffffff8101d7e7
> #4 [ffff883ffef03668] die at ffffffff8101deb2
> #5 [ffff883ffef03698] do_trap at ffffffff8101a700
> #6 [ffff883ffef036e8] do_error_trap at ffffffff8101abfe
> #7 [ffff883ffef037a0] do_invalid_op at ffffffff8101acd0
> #8 [ffff883ffef037b0] invalid_op at ffffffff81a00bab
> [exception RIP: skb_segment+3044]
> RIP: ffffffff817e4dd4 RSP: ffff883ffef03860 RFLAGS: 00010216
> RAX: 0000000000002bf6 RBX: ffff883feb7aaa00 RCX: 0000000000000011
> RDX: ffff883fb87910c0 RSI: 0000000000000011 RDI: ffff883feb7ab500
> RBP: ffff883ffef03928 R8: 0000000000002ce2 R9: 00000000000027da
> R10: 000001ea00000000 R11: 0000000000002d82 R12: ffff883f90a1ee80
> R13: ffff883fb8791120 R14: ffff883feb7abc00 R15: 0000000000002ce2
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #9 [ffff883ffef03930] tcp_gso_segment at ffffffff818713e7
> --- <IRQ stack> ---
> ...
>
> The triggering input skb has the following properties:
> list_skb = skb->frag_list;
> skb->nfrags != NULL && skb_headlen(list_skb) != 0
> and skb_segment() is not able to handle a frag_list skb
> if its headlen (list_skb->len - list_skb->data_len) is not 0.
>
> This patch addressed the issue by handling skb_headlen(list_skb) != 0
> case properly if list_skb->head_frag is true, which is expected in
> most cases. The head frag is processed before list_skb->frags
> are processed.
>
> Reported-by: Diptanu Gon Choudhury <diptanu@fb.com>
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
> net/core/skbuff.c | 26 ++++++++++++++++++++------
> 1 file changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 715c134..23b317a 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3460,6 +3460,19 @@ void *skb_pull_rcsum(struct sk_buff *skb, unsigned int len)
> }
> EXPORT_SYMBOL_GPL(skb_pull_rcsum);
>
> +static inline skb_frag_t skb_head_frag_to_page_desc(struct sk_buff *frag_skb)
> +{
> + skb_frag_t head_frag;
> + struct page *page;
> +
> + page = virt_to_head_page(frag_skb->head);
> + head_frag.page.p = page;
> + head_frag.page_offset = frag_skb->data -
> + (unsigned char *)page_address(page);
> + head_frag.size = skb_headlen(frag_skb);
> + return head_frag;
> +}
> +
> /**
> * skb_segment - Perform protocol segmentation on skb.
> * @head_skb: buffer to segment
> @@ -3664,15 +3677,16 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>
> while (pos < offset + len) {
> if (i >= nfrags) {
> - BUG_ON(skb_headlen(list_skb));
> -
> i = 0;
> nfrags = skb_shinfo(list_skb)->nr_frags;
> frag = skb_shinfo(list_skb)->frags;
> - frag_skb = list_skb;
You could probably leave this line in place. No point in moving it.
> -
> - BUG_ON(!nfrags);
> + if (skb_headlen(list_skb)) {
> + BUG_ON(!list_skb->head_frag);
>
> + /* to make room for head_frag. */
> + i--; frag--;
Normally these should be two separate lines one for "i--;" and one for
"frag--;".
> + }
You could probably place the BUG_ON(!nfrags) in an else statement here
to handle the case where we have a potentially empty skb which would
be a bug.
> + frag_skb = list_skb;
> if (skb_orphan_frags(frag_skb, GFP_ATOMIC) ||
> skb_zerocopy_clone(nskb, frag_skb,
> GFP_ATOMIC))
> @@ -3689,7 +3703,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
> goto err;
> }
>
> - *nskb_frag = *frag;
> + *nskb_frag = (i < 0) ? skb_head_frag_to_page_desc(frag_skb) : *frag;
> __skb_frag_ref(nskb_frag);
> size = skb_frag_size(nskb_frag);
>
> --
> 2.9.5
>
^ permalink raw reply
* Re: [PATCH net-next RFC V1 3/5] net: Introduce field for the MII time stamper.
From: Richard Cochran @ 2018-03-21 21:51 UTC (permalink / raw)
To: Florian Fainelli
Cc: netdev, devicetree, Andrew Lunn, David Miller, Mark Rutland,
Miroslav Lichvar, Rob Herring, Willem de Bruijn
In-Reply-To: <cbab700a-8a4f-8c4a-b959-e356de006f6e@gmail.com>
On Wed, Mar 21, 2018 at 12:12:00PM -0700, Florian Fainelli wrote:
> > +static int mdiobus_netdev_notification(struct notifier_block *nb,
> > + unsigned long msg, void *ptr)
> > +{
> > + struct net_device *netdev = netdev_notifier_info_to_dev(ptr);
> > + struct phy_device *phydev = netdev->phydev;
> > + struct mdio_device *mdev;
> > + struct mii_bus *bus;
> > + int i;
> > +
> > + if (netdev->mdiots || msg != NETDEV_UP || !phydev)
> > + return NOTIFY_DONE;
>
> You are still assuming that we have a phy_device somehow, whereas you
> parch series wants to solve that for generic MDIO devices, that is a bit
> confusing.
The phydev is the only thing that associates a netdev with an MII bus.
> > +
> > + /*
> > + * Examine the MII bus associated with the PHY that is
> > + * attached to the MAC. If there is a time stamping device
> > + * on the bus, then connect it to the network device.
> > + */
> > + bus = phydev->mdio.bus;
> > +
> > + for (i = 0; i < PHY_MAX_ADDR; i++) {
> > + mdev = bus->mdio_map[i];
> > + if (!mdev)
> > + continue;
> > + if (mdiodev_supports_timestamping(mdev)) {
> > + netdev->mdiots = mdev;
> > + return NOTIFY_OK;
>
> What guarantees that netdev->mdiots gets cleared?
Why would it need to be cleared?
> Also, why is this done
> with a notifier instead of through phy_{connect,attach,disconnect}?
We have no guarantee the mdio device has been probed yet.
> It
> looks like we still have this requirement of the mdio TS device being a
> phy_device somehow, I am confused here...
We only need the phydev to get from the netdev to the mii bus.
> > + }
> > + }
> > +
> > + return NOTIFY_DONE;
> > +}
> > +
> > #ifdef CONFIG_PM
> > static int mdio_bus_suspend(struct device *dev)
> > {
>
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 5fbb9f1da7fd..223d691aa0b0 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -1943,6 +1943,7 @@ struct net_device {
> > struct netprio_map __rcu *priomap;
> > #endif
> > struct phy_device *phydev;
> > + struct mdio_device *mdiots;
>
> phy_device embedds a mdio_device, can you find a way to rework the PHY
> PTP code to utilize the phy_device's mdio instance so do not introduce
> yet another pointer in that big structure that net_device already is?
It would be strange and wrong to "steal" the phy's mdio struct, IMHO.
After all, we just got support for non-PHY mdio devices. The natural
solution is to use it.
Thanks,
Richard
^ permalink raw reply
* Re: [Intel-wired-lan] [PATCH REPOST v4 5/7] ixgbevf: keep writel() closer to wmb()
From: Alexander Duyck @ 2018-03-21 21:53 UTC (permalink / raw)
To: Sinan Kaya
Cc: Jeff Kirsher, sulrich, Netdev, Timur Tabi, LKML, intel-wired-lan,
linux-arm-msm, linux-arm-kernel
In-Reply-To: <c2ce63d1213b5145f77bf70f57c3360d@codeaurora.org>
On Wed, Mar 21, 2018 at 2:51 PM, <okaya@codeaurora.org> wrote:
> On 2018-03-21 17:48, Jeff Kirsher wrote:
>>
>> On Wed, 2018-03-21 at 14:56 -0400, Sinan Kaya wrote:
>>>
>>> Remove ixgbevf_write_tail() in favor of moving writel() close to
>>> wmb().
>>>
>>> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
>>> Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
>>> ---
>>> drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 -----
>>> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 ++--
>>> 2 files changed, 2 insertions(+), 7 deletions(-)
>>
>>
>> This patch fails to compile because there is a call to
>> ixgbevf_write_tail() which you missed cleaning up.
>
>
> Hah, I did a compile test but maybe I missed something. I will get v6 of
> this patch only and leave the rest of the series as it is.
Actually you might want to just pull Jeff's tree and rebase before you
submit your patches. I suspect the difference is the ixgbevf XDP code
that is present in Jeff's tree and not in Dave's. The alternative is
to wait for Jeff to push the ixgbevf code and then once Dave has
pulled it you could rebase your patches.
Thanks.
- Alex
^ permalink raw reply
* Re: [PATCH REPOST v4 5/7] ixgbevf: keep writel() closer to wmb()
From: David Miller @ 2018-03-21 21:54 UTC (permalink / raw)
To: jeffrey.t.kirsher
Cc: okaya, netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
intel-wired-lan, linux-kernel
In-Reply-To: <1521668888.12746.32.camel@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Wed, 21 Mar 2018 14:48:08 -0700
> On Wed, 2018-03-21 at 14:56 -0400, Sinan Kaya wrote:
>> Remove ixgbevf_write_tail() in favor of moving writel() close to
>> wmb().
>>
>> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
>> Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
>> ---
>> drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 -----
>> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 ++--
>> 2 files changed, 2 insertions(+), 7 deletions(-)
>
> This patch fails to compile because there is a call to
> ixgbevf_write_tail() which you missed cleaning up.
For a change with delicate side effects, it doesn't create much
confidence if the code does not even compile.
Sinan, please put more care into the changes you are making.
Thank you.
^ permalink raw reply
* Re: [PATCH net-next RFC V1 5/5] net: mdio: Add a driver for InES time stamping IP core.
From: Richard Cochran @ 2018-03-21 21:57 UTC (permalink / raw)
To: Andrew Lunn
Cc: netdev, devicetree, David Miller, Florian Fainelli, Mark Rutland,
Miroslav Lichvar, Rob Herring, Willem de Bruijn
In-Reply-To: <20180321214436.GX24516@lunn.ch>
On Wed, Mar 21, 2018 at 10:44:36PM +0100, Andrew Lunn wrote:
> O.K, so lets do the 20 questions approach.
:)
> As far as i can see, this is not an MDIO device. It is not connected
> to the MDIO bus, it has no MDIO registers, you don't even pass a valid
> MDIO address in device tree.
Right. There might very well be other products out there that *do*
use MDIO commands. I know that there are MII time stamping asics and
ip cores on the market, but I don't know all of their creative design
details.
> It it actually an MII bus snooper? Does it snoop, or is it actually in
> the MII bus, and can modify packets, i.e. insert time stamps as frames
> pass over the MII bus?
It acts like a "snooper" to provide out of band time stamps, but it
also can modify packets when for the one-step functionality.
> When the driver talks about having three ports, does that mean it can
> be on three different MII busses?
Yes.
HTH,
Richard
^ permalink raw reply
* Re: [PATCH v2 bpf-next 4/8] tracepoint: compute num_args at build time
From: Alexei Starovoitov @ 2018-03-21 22:05 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Miller, Daniel Borkmann, Peter Zijlstra, Steven Rostedt,
Network Development, kernel-team, Linux API
In-Reply-To: <CA+55aFyXFSuyxJrLqixsZ-SBXSOrSJ6Q_oOF04Fd4pzMRbPCEA@mail.gmail.com>
On 3/21/18 12:44 PM, Linus Torvalds wrote:
> On Wed, Mar 21, 2018 at 11:54 AM, Alexei Starovoitov <ast@fb.com> wrote:
>>
>> add fancy macro to compute number of arguments passed into tracepoint
>> at compile time and store it as part of 'struct tracepoint'.
>
> We should probably do this __COUNT() thing in some generic header, we
> just talked last week about another use case entirely.
ok. Not sure which generic header though.
Should I move it to include/linux/kernel.h ?
> And wouldn't it be nice to just have some generic infrastructure like this:
>
> /*
> * This counts to ten.
> *
> * Any more than that, and we'd need to take off our shoes
> */
> #define __GET_COUNT(_0,_1,_2,_3,_4,_5,_6,_7,_8,_9,_10,_n,...) _n
> #define __COUNT(...) \
> __GET_COUNT(__VA_ARGS__,10,9,8,7,6,5,4,3,2,1,0)
> #define COUNT(...) __COUNT(dummy,##__VA_ARGS__)
since it will be a build time error, it's a good time to discuss
how many arguments we want to support in tracepoints and
in general in other places that would want to use this macro.
Like the only reason my patch is counting till 17 is because of
trace_iwlwifi_dev_ucode_error().
The next offenders are using 12 arguments:
trace_mc_event()
trace_mm_vmscan_lru_shrink_inactive()
Clearly not every efficient usage of it:
trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
nr_scanned, nr_reclaimed,
stat.nr_dirty, stat.nr_writeback,
stat.nr_congested, stat.nr_immediate,
stat.nr_activate, stat.nr_ref_keep,
stat.nr_unmap_fail,
sc->priority, file);
could have passed &stat instead.
I'd like to refactor that trace_iwlwifi_dev_ucode_error()
and from now on set the limit to 12.
Any offenders should be using tracepoints with <= 12 args
instead of extending the macro.
Does it sound reasonable ?
> #define __CONCAT(a,b) a##b
> #define __CONCATENATE(a,b) __CONCAT(a,b)
>
> and then you can do things like:
>
> #define fn(...) __CONCATENATE(fn,COUNT(__VA_ARGS__))(__VA_ARGS__)
>
> which turns "fn(x,y,z..)" into "fn<N>(x,y,z)".
>
> That can be useful for things like "max(a,b,c,d)" expanding to
> "max4()", and then you can just have the trivial
>
> #define max3(a,b,c) max2(a,max2(b.c))
I can try that. Not sure my macro-fu is up to that level.
__CAST_TO_U64() macro from the next patch was difficult to make
work across compilers and architectures.
^ permalink raw reply
* [trivial PATCH V2] treewide: Align function definition open/close braces
From: Joe Perches @ 2018-03-21 22:09 UTC (permalink / raw)
To: Andrew Morton, Will Deacon, Peter Zijlstra, Boqun Feng,
Rafael J. Wysocki, Len Brown, Zhang Rui, Sathya Prakash,
Chaitra P B, Suganath Prabu Subramani, Manish Chopra, Rahul Verma,
Dept-GELinuxNICDev, QCA ath9k Development, Corentin Chary,
Darren Hart, Andy Shevchenko, Alessandro Zummo, Alexandre Belloni,
Adaptec OEM Raid Solutions, James E.J. Bottomley
Cc: alsa-devel, Liam Girdwood, David Airlie, dri-devel, Takashi Iwai,
H. Peter Anvin, linux-rtc, Mauro Carvalho Chehab, linux-scsi, x86,
amd-gfx, linux-acpi, MPT-FusionLinux.pdl, linux-media,
platform-driver-x86, acpi4asus-user, linux-fsdevel, Mark Brown,
Thomas Gleixner, Jaroslav Kysela, Kalle Valo, netdev,
linux-wireless, linux-kernel, linux-audit, Alex Deucher,
linuxppc-dev
Some functions definitions have either the initial open brace and/or
the closing brace outside of column 1.
Move those braces to column 1.
This allows various function analyzers like gnu complexity to work
properly for these modified functions.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
---
git diff -w still shows no difference.
This patch was sent but December and not applied.
As the trivial maintainer seems not active, it'd be nice if
Andrew Morton picks this up.
V2: Remove fs/xfs/libxfs/xfs_alloc.c as it's updated and remerge the rest
arch/x86/include/asm/atomic64_32.h | 2 +-
drivers/acpi/custom_method.c | 2 +-
drivers/acpi/fan.c | 2 +-
drivers/gpu/drm/amd/display/dc/core/dc.c | 2 +-
drivers/media/i2c/msp3400-kthreads.c | 2 +-
drivers/message/fusion/mptsas.c | 2 +-
drivers/net/ethernet/qlogic/netxen/netxen_nic_init.c | 2 +-
drivers/net/wireless/ath/ath9k/xmit.c | 2 +-
drivers/platform/x86/eeepc-laptop.c | 2 +-
drivers/rtc/rtc-ab-b5ze-s3.c | 2 +-
drivers/scsi/dpt_i2o.c | 2 +-
drivers/scsi/sym53c8xx_2/sym_glue.c | 2 +-
fs/locks.c | 2 +-
fs/ocfs2/stack_user.c | 2 +-
fs/xfs/xfs_export.c | 2 +-
kernel/audit.c | 6 +++---
kernel/trace/trace_printk.c | 4 ++--
lib/raid6/sse2.c | 14 +++++++-------
sound/soc/fsl/fsl_dma.c | 2 +-
19 files changed, 28 insertions(+), 28 deletions(-)
diff --git a/arch/x86/include/asm/atomic64_32.h b/arch/x86/include/asm/atomic64_32.h
index 46e1ef17d92d..92212bf0484f 100644
--- a/arch/x86/include/asm/atomic64_32.h
+++ b/arch/x86/include/asm/atomic64_32.h
@@ -123,7 +123,7 @@ static inline long long arch_atomic64_read(const atomic64_t *v)
long long r;
alternative_atomic64(read, "=&A" (r), "c" (v) : "memory");
return r;
- }
+}
/**
* arch_atomic64_add_return - add and return
diff --git a/drivers/acpi/custom_method.c b/drivers/acpi/custom_method.c
index b33fba70ec51..a07fbe999eb6 100644
--- a/drivers/acpi/custom_method.c
+++ b/drivers/acpi/custom_method.c
@@ -97,7 +97,7 @@ static void __exit acpi_custom_method_exit(void)
{
if (cm_dentry)
debugfs_remove(cm_dentry);
- }
+}
module_init(acpi_custom_method_init);
module_exit(acpi_custom_method_exit);
diff --git a/drivers/acpi/fan.c b/drivers/acpi/fan.c
index 6cf4988206f2..3563103590c6 100644
--- a/drivers/acpi/fan.c
+++ b/drivers/acpi/fan.c
@@ -219,7 +219,7 @@ fan_set_cur_state(struct thermal_cooling_device *cdev, unsigned long state)
return fan_set_state_acpi4(device, state);
else
return fan_set_state(device, state);
- }
+}
static const struct thermal_cooling_device_ops fan_cooling_ops = {
.get_max_state = fan_get_max_state,
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 8394d69b963f..e934326a95d3 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -588,7 +588,7 @@ static void disable_dangling_plane(struct dc *dc, struct dc_state *context)
******************************************************************************/
struct dc *dc_create(const struct dc_init_data *init_params)
- {
+{
struct dc *dc = kzalloc(sizeof(*dc), GFP_KERNEL);
unsigned int full_pipe_count;
diff --git a/drivers/media/i2c/msp3400-kthreads.c b/drivers/media/i2c/msp3400-kthreads.c
index 4dd01e9f553b..dc6cb8d475b3 100644
--- a/drivers/media/i2c/msp3400-kthreads.c
+++ b/drivers/media/i2c/msp3400-kthreads.c
@@ -885,7 +885,7 @@ static int msp34xxg_modus(struct i2c_client *client)
}
static void msp34xxg_set_source(struct i2c_client *client, u16 reg, int in)
- {
+{
struct msp_state *state = to_state(i2c_get_clientdata(client));
int source, matrix;
diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index 439ee9c5f535..231f3a1e27bf 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -2967,7 +2967,7 @@ mptsas_exp_repmanufacture_info(MPT_ADAPTER *ioc,
mutex_unlock(&ioc->sas_mgmt.mutex);
out:
return ret;
- }
+}
static void
mptsas_parse_device_info(struct sas_identify *identify,
diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_init.c b/drivers/net/ethernet/qlogic/netxen/netxen_nic_init.c
index 3dd973475125..0ea141ece19e 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_init.c
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_init.c
@@ -603,7 +603,7 @@ static struct uni_table_desc *nx_get_table_desc(const u8 *unirom, int section)
static int
netxen_nic_validate_header(struct netxen_adapter *adapter)
- {
+{
const u8 *unirom = adapter->fw->data;
struct uni_table_desc *directory = (struct uni_table_desc *) &unirom[0];
u32 fw_file_size = adapter->fw->size;
diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c
index 396bf05c6bf6..88be55ed5b4d 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -252,7 +252,7 @@ ath_tid_pull(struct ath_atx_tid *tid)
}
return skb;
- }
+}
static bool ath_tid_has_buffered(struct ath_atx_tid *tid)
diff --git a/drivers/platform/x86/eeepc-laptop.c b/drivers/platform/x86/eeepc-laptop.c
index 5a681962899c..4c38904a8a32 100644
--- a/drivers/platform/x86/eeepc-laptop.c
+++ b/drivers/platform/x86/eeepc-laptop.c
@@ -492,7 +492,7 @@ static void eeepc_platform_exit(struct eeepc_laptop *eeepc)
* potentially bad time, such as a timer interrupt.
*/
static void tpd_led_update(struct work_struct *work)
- {
+{
struct eeepc_laptop *eeepc;
eeepc = container_of(work, struct eeepc_laptop, tpd_led_work);
diff --git a/drivers/rtc/rtc-ab-b5ze-s3.c b/drivers/rtc/rtc-ab-b5ze-s3.c
index e55f35fa0b58..8dc451932446 100644
--- a/drivers/rtc/rtc-ab-b5ze-s3.c
+++ b/drivers/rtc/rtc-ab-b5ze-s3.c
@@ -646,7 +646,7 @@ static int abb5zes3_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alarm)
ret);
return ret;
- }
+}
/* Enable or disable battery low irq generation */
static inline int _abb5zes3_rtc_battery_low_irq_enable(struct regmap *regmap,
diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
index 0f30792d74c4..cc5fa99a6530 100644
--- a/drivers/scsi/dpt_i2o.c
+++ b/drivers/scsi/dpt_i2o.c
@@ -3521,7 +3521,7 @@ static int adpt_i2o_systab_send(adpt_hba* pHba)
#endif
return ret;
- }
+}
/*============================================================================
diff --git a/drivers/scsi/sym53c8xx_2/sym_glue.c b/drivers/scsi/sym53c8xx_2/sym_glue.c
index 791a2182de53..7320d5fe4cbc 100644
--- a/drivers/scsi/sym53c8xx_2/sym_glue.c
+++ b/drivers/scsi/sym53c8xx_2/sym_glue.c
@@ -1393,7 +1393,7 @@ static struct Scsi_Host *sym_attach(struct scsi_host_template *tpnt, int unit,
scsi_host_put(shost);
return NULL;
- }
+}
/*
diff --git a/fs/locks.c b/fs/locks.c
index d56a14894fb2..0feaed9f589b 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -559,7 +559,7 @@ static const struct lock_manager_operations lease_manager_ops = {
* Initialize a lease, use the default lock manager operations
*/
static int lease_init(struct file *filp, long type, struct file_lock *fl)
- {
+{
if (assign_type(fl, type) != 0)
return -EINVAL;
diff --git a/fs/ocfs2/stack_user.c b/fs/ocfs2/stack_user.c
index dae9eb7c441e..d2fb97b173da 100644
--- a/fs/ocfs2/stack_user.c
+++ b/fs/ocfs2/stack_user.c
@@ -398,7 +398,7 @@ static int ocfs2_control_do_setnode_msg(struct file *file,
static int ocfs2_control_do_setversion_msg(struct file *file,
struct ocfs2_control_message_setv *msg)
- {
+{
long major, minor;
char *ptr = NULL;
struct ocfs2_control_private *p = file->private_data;
diff --git a/fs/xfs/xfs_export.c b/fs/xfs/xfs_export.c
index 761f3189eff2..eed698aa9f16 100644
--- a/fs/xfs/xfs_export.c
+++ b/fs/xfs/xfs_export.c
@@ -122,7 +122,7 @@ xfs_nfs_get_inode(
struct super_block *sb,
u64 ino,
u32 generation)
- {
+{
xfs_mount_t *mp = XFS_M(sb);
xfs_inode_t *ip;
int error;
diff --git a/kernel/audit.c b/kernel/audit.c
index 8fe6dfb67a94..a97d004375e3 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -486,15 +486,15 @@ static int audit_set_failure(u32 state)
* Drop any references inside the auditd connection tracking struct and free
* the memory.
*/
- static void auditd_conn_free(struct rcu_head *rcu)
- {
+static void auditd_conn_free(struct rcu_head *rcu)
+{
struct auditd_connection *ac;
ac = container_of(rcu, struct auditd_connection, rcu);
put_pid(ac->pid);
put_net(ac->net);
kfree(ac);
- }
+}
/**
* auditd_set - Set/Reset the auditd connection state
diff --git a/kernel/trace/trace_printk.c b/kernel/trace/trace_printk.c
index ad1d6164e946..50f44b7b2b32 100644
--- a/kernel/trace/trace_printk.c
+++ b/kernel/trace/trace_printk.c
@@ -196,7 +196,7 @@ struct notifier_block module_trace_bprintk_format_nb = {
};
int __trace_bprintk(unsigned long ip, const char *fmt, ...)
- {
+{
int ret;
va_list ap;
@@ -214,7 +214,7 @@ int __trace_bprintk(unsigned long ip, const char *fmt, ...)
EXPORT_SYMBOL_GPL(__trace_bprintk);
int __ftrace_vbprintk(unsigned long ip, const char *fmt, va_list ap)
- {
+{
if (unlikely(!fmt))
return 0;
diff --git a/lib/raid6/sse2.c b/lib/raid6/sse2.c
index 1d2276b007ee..8191e1d0d2fb 100644
--- a/lib/raid6/sse2.c
+++ b/lib/raid6/sse2.c
@@ -91,7 +91,7 @@ static void raid6_sse21_gen_syndrome(int disks, size_t bytes, void **ptrs)
static void raid6_sse21_xor_syndrome(int disks, int start, int stop,
size_t bytes, void **ptrs)
- {
+{
u8 **dptr = (u8 **)ptrs;
u8 *p, *q;
int d, z, z0;
@@ -200,9 +200,9 @@ static void raid6_sse22_gen_syndrome(int disks, size_t bytes, void **ptrs)
kernel_fpu_end();
}
- static void raid6_sse22_xor_syndrome(int disks, int start, int stop,
+static void raid6_sse22_xor_syndrome(int disks, int start, int stop,
size_t bytes, void **ptrs)
- {
+{
u8 **dptr = (u8 **)ptrs;
u8 *p, *q;
int d, z, z0;
@@ -265,7 +265,7 @@ static void raid6_sse22_gen_syndrome(int disks, size_t bytes, void **ptrs)
asm volatile("sfence" : : : "memory");
kernel_fpu_end();
- }
+}
const struct raid6_calls raid6_sse2x2 = {
raid6_sse22_gen_syndrome,
@@ -366,9 +366,9 @@ static void raid6_sse24_gen_syndrome(int disks, size_t bytes, void **ptrs)
kernel_fpu_end();
}
- static void raid6_sse24_xor_syndrome(int disks, int start, int stop,
+static void raid6_sse24_xor_syndrome(int disks, int start, int stop,
size_t bytes, void **ptrs)
- {
+{
u8 **dptr = (u8 **)ptrs;
u8 *p, *q;
int d, z, z0;
@@ -471,7 +471,7 @@ static void raid6_sse24_gen_syndrome(int disks, size_t bytes, void **ptrs)
}
asm volatile("sfence" : : : "memory");
kernel_fpu_end();
- }
+}
const struct raid6_calls raid6_sse2x4 = {
diff --git a/sound/soc/fsl/fsl_dma.c b/sound/soc/fsl/fsl_dma.c
index fce2010d3c53..78871de35086 100644
--- a/sound/soc/fsl/fsl_dma.c
+++ b/sound/soc/fsl/fsl_dma.c
@@ -886,7 +886,7 @@ static const struct snd_pcm_ops fsl_dma_ops = {
};
static int fsl_soc_dma_probe(struct platform_device *pdev)
- {
+{
struct dma_object *dma;
struct device_node *np = pdev->dev.of_node;
struct device_node *ssi_np;
--
2.15.0
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply related
* Re: [PATCH] bnx2x: fix spelling mistake: "registeration" -> "registration"
From: David Miller @ 2018-03-21 22:10 UTC (permalink / raw)
To: colin.king
Cc: ariel.elior, everest-linux-l2, netdev, kernel-janitors,
linux-kernel
In-Reply-To: <20180319143259.5796-1-colin.king@canonical.com>
From: Colin King <colin.king@canonical.com>
Date: Mon, 19 Mar 2018 14:32:59 +0000
> From: Colin Ian King <colin.king@canonical.com>
>
> Trivial fix to spelling mistake in BNX2X_ERR error message text
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Applied.
^ permalink raw reply
* Re: [PATCH] qede: fix spelling mistake: "registeration" -> "registration"
From: David Miller @ 2018-03-21 22:10 UTC (permalink / raw)
To: colin.king
Cc: Ariel.Elior, everest-linux-l2, netdev, kernel-janitors,
linux-kernel
In-Reply-To: <20180319145711.6591-1-colin.king@canonical.com>
From: Colin King <colin.king@canonical.com>
Date: Mon, 19 Mar 2018 14:57:11 +0000
> From: Colin Ian King <colin.king@canonical.com>
>
> Trivial fix to spelling mistakes in DP_ERR error message text and
> comments
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Applied.
^ permalink raw reply
* Re: [PATCH net v2 0/7] fix idr leak in actions
From: David Miller @ 2018-03-21 22:13 UTC (permalink / raw)
To: dcaratti; +Cc: xiyou.wangcong, jiri, jhs, netdev
In-Reply-To: <cover.1521465261.git.dcaratti@redhat.com>
From: Davide Caratti <dcaratti@redhat.com>
Date: Mon, 19 Mar 2018 15:31:21 +0100
> This series fixes situations where a temporary failure to install a TC
> action results in the permanent impossibility to reuse the configured
> value of 'index'.
>
> Thanks to Cong Wang for the initial review.
>
> v2: fix build error in act_ipt.c, reported by kbuild test robot
Series applied, thanks Davide.
^ permalink raw reply
* Re: [PATCH REPOST v4 5/7] ixgbevf: keep writel() closer to wmb()
From: okaya @ 2018-03-21 22:14 UTC (permalink / raw)
To: David Miller
Cc: jeffrey.t.kirsher, netdev, timur, sulrich, linux-arm-msm,
linux-arm-kernel, intel-wired-lan, linux-kernel
In-Reply-To: <20180321.175427.1419929873765406157.davem@davemloft.net>
On 2018-03-21 17:54, David Miller wrote:
> From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Date: Wed, 21 Mar 2018 14:48:08 -0700
>
>> On Wed, 2018-03-21 at 14:56 -0400, Sinan Kaya wrote:
>>> Remove ixgbevf_write_tail() in favor of moving writel() close to
>>> wmb().
>>>
>>> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
>>> Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
>>> ---
>>> drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 -----
>>> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 ++--
>>> 2 files changed, 2 insertions(+), 7 deletions(-)
>>
>> This patch fails to compile because there is a call to
>> ixgbevf_write_tail() which you missed cleaning up.
>
> For a change with delicate side effects, it doesn't create much
> confidence if the code does not even compile.
>
> Sinan, please put more care into the changes you are making.
I think the issue is the tree that code is getting tested has
undelivered code as Alex mentioned.
I was using linux-next 4.16 rc4 for testing.
I will rebase to Jeff's tree.
>
> Thank you.
^ permalink raw reply
* Re: [PATCH net-next RFC V1 5/5] net: mdio: Add a driver for InES time stamping IP core.
From: Andrew Lunn @ 2018-03-21 22:16 UTC (permalink / raw)
To: Richard Cochran
Cc: netdev, devicetree, David Miller, Florian Fainelli, Mark Rutland,
Miroslav Lichvar, Rob Herring, Willem de Bruijn
In-Reply-To: <20180321215729.engnoxpaympvvdc5@localhost>
On Wed, Mar 21, 2018 at 02:57:29PM -0700, Richard Cochran wrote:
> On Wed, Mar 21, 2018 at 10:44:36PM +0100, Andrew Lunn wrote:
> > O.K, so lets do the 20 questions approach.
>
> :)
>
> > As far as i can see, this is not an MDIO device. It is not connected
> > to the MDIO bus, it has no MDIO registers, you don't even pass a valid
> > MDIO address in device tree.
>
> Right.
O.K, so i suggest we stop trying to model this thing as an MDIO
device. It is really an MMIO device.
> There might very well be other products out there that *do*
> use MDIO commands. I know that there are MII time stamping asics and
> ip cores on the market, but I don't know all of their creative design
> details.
So i suggest we leave the design for those until we actual see one.
> > It it actually an MII bus snooper? Does it snoop, or is it actually in
> > the MII bus, and can modify packets, i.e. insert time stamps as frames
> > pass over the MII bus?
>
> It acts like a "snooper" to provide out of band time stamps, but it
> also can modify packets when for the one-step functionality.
>
> > When the driver talks about having three ports, does that mean it can
> > be on three different MII busses?
O.K, so here is how i think it should be done. It is a device which
offers services to other devices. It is not that different to an
interrupt controller, a GPIO controller, etc. Lets follow how they
work in device tree....
The device itself is just another MMIO mapped device in the SoC:
timestamper@60000000 {
compatible = "ines,ptp-ctrl";
reg = <0x60000000 0x80>;
#address-cells = <1>;
#size-cells = <0>;
};
The MAC drivers are clients of this device. They then use a phandle
and specifier:
eth0: ethernet-controller@72000 {
compatible = "marvell,kirkwood-eth";
#address-cells = <1>;
#size-cells = <0>;
reg = <0x72000 0x4000>;
timerstamper = <×tamper 2>
}
The 2 indicates this MAC is using port 2.
The MAC driver can then do the standard device tree things to follow
the phandle to get access to the device and use the API it exports.
Andrew
^ permalink raw reply
* [PATCH net-next 1/1] tc-testing: updated police, mirred, skbedit and skbmod with more tests
From: Roman Mashak @ 2018-03-21 22:17 UTC (permalink / raw)
To: stephen; +Cc: netdev, kernel, jhs, xiyou.wangcong, jiri, Roman Mashak
Added extra test cases for control actions (reclassify, pipe etc.),
cookies, max index value and police args sanity check.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
---
.../tc-testing/tc-tests/actions/mirred.json | 192 +++++++++++++++++++++
.../tc-testing/tc-tests/actions/police.json | 144 ++++++++++++++++
.../tc-testing/tc-tests/actions/skbedit.json | 168 ++++++++++++++++++
.../tc-testing/tc-tests/actions/skbmod.json | 26 ++-
4 files changed, 529 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json b/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json
index 0fcccf18399b..443c9b3c8664 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json
@@ -171,6 +171,198 @@
]
},
{
+ "id": "8917",
+ "name": "Add mirred mirror action with control pass",
+ "category": [
+ "actions",
+ "mirred"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action mirred",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action mirred ingress mirror dev lo pass index 1",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action mirred index 1",
+ "matchPattern": "action order [0-9]*: mirred \\(Ingress Mirror to device lo\\) pass.*index 1 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action mirred"
+ ]
+ },
+ {
+ "id": "1054",
+ "name": "Add mirred mirror action with control pipe",
+ "category": [
+ "actions",
+ "mirred"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action mirred",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action mirred ingress mirror dev lo pipe index 15",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action mirred index 15",
+ "matchPattern": "action order [0-9]*: mirred \\(Ingress Mirror to device lo\\) pipe.*index 15 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action mirred"
+ ]
+ },
+ {
+ "id": "9887",
+ "name": "Add mirred mirror action with control continue",
+ "category": [
+ "actions",
+ "mirred"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action mirred",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action mirred ingress mirror dev lo continue index 15",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action mirred index 15",
+ "matchPattern": "action order [0-9]*: mirred \\(Ingress Mirror to device lo\\) continue.*index 15 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action mirred"
+ ]
+ },
+ {
+ "id": "e4aa",
+ "name": "Add mirred mirror action with control reclassify",
+ "category": [
+ "actions",
+ "mirred"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action mirred",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action mirred ingress mirror dev lo reclassify index 150",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action mirred index 150",
+ "matchPattern": "action order [0-9]*: mirred \\(Ingress Mirror to device lo\\) reclassify.*index 150 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action mirred"
+ ]
+ },
+ {
+ "id": "ece9",
+ "name": "Add mirred mirror action with control drop",
+ "category": [
+ "actions",
+ "mirred"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action mirred",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action mirred ingress mirror dev lo drop index 99",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action mirred index 99",
+ "matchPattern": "action order [0-9]*: mirred \\(Ingress Mirror to device lo\\) drop.*index 99 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action mirred"
+ ]
+ },
+ {
+ "id": "0031",
+ "name": "Add mirred mirror action with control jump",
+ "category": [
+ "actions",
+ "mirred"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action mirred",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action mirred ingress mirror dev lo jump 10 index 99",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action mirred index 99",
+ "matchPattern": "action order [0-9]*: mirred \\(Ingress Mirror to device lo\\) jump 10.*index 99 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action mirred"
+ ]
+ },
+ {
+ "id": "407c",
+ "name": "Add mirred mirror action with cookie",
+ "category": [
+ "actions",
+ "mirred"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action mirred",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action mirred ingress mirror dev lo reclassify cookie aa11bb22cc33dd44ee55",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions ls action mirred",
+ "matchPattern": "action order [0-9]*: mirred \\(Ingress Mirror to device lo\\) reclassify.*cookie aa11bb22cc33dd44ee55",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action mirred"
+ ]
+ },
+ {
+ "id": "8b69",
+ "name": "Add mirred mirror action with maximum index",
+ "category": [
+ "actions",
+ "mirred"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action mirred",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action mirred ingress mirror dev lo pipe index 4294967295",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action mirred index 4294967295",
+ "matchPattern": "action order [0-9]*: mirred \\(Ingress Mirror to device lo\\) pipe.*index 4294967295",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action mirred"
+ ]
+ },
+ {
"id": "a70e",
"name": "Delete mirred mirror action",
"category": [
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/police.json b/tools/testing/selftests/tc-testing/tc-tests/actions/police.json
index 0e602a3f9393..38d85a1d7492 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/police.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/police.json
@@ -265,6 +265,150 @@
]
},
{
+ "id": "ddd6",
+ "name": "Add police action with invalid rate value",
+ "category": [
+ "actions",
+ "police"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action police",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action police rate 3tb burst 250k conform-exceed pass/pipe index 5",
+ "expExitCode": "255",
+ "verifyCmd": "$TC actions ls action police",
+ "matchPattern": "action order [0-9]*: police 0x5 rate 3Tb burst 250Kb mtu 2Kb action pass/pipe",
+ "matchCount": "0",
+ "teardown": [
+ "$TC actions flush action police"
+ ]
+ },
+ {
+ "id": "f61c",
+ "name": "Add police action with invalid burst value",
+ "category": [
+ "actions",
+ "police"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action police",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action police rate 3kbit burst 250P conform-exceed pass/pipe index 5",
+ "expExitCode": "255",
+ "verifyCmd": "$TC actions ls action police",
+ "matchPattern": "action order [0-9]*: police 0x5 rate 3Kbit burst 250Pb mtu 2Kb action pass/pipe",
+ "matchCount": "0",
+ "teardown": [
+ "$TC actions flush action police"
+ ]
+ },
+ {
+ "id": "c26f",
+ "name": "Add police action with invalid peakrate value",
+ "category": [
+ "actions",
+ "police"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action police",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action police rate 90kbit burst 10k mtu 2kb peakrate 100T index 1",
+ "expExitCode": "255",
+ "verifyCmd": "$TC actions ls action police",
+ "matchPattern": "action order [0-9]*: police 0x1 rate 90Kbit burst 10Kb mtu 2Kb peakrate 100Tbit",
+ "matchCount": "0",
+ "teardown": [
+ "$TC actions flush action police"
+ ]
+ },
+ {
+ "id": "db04",
+ "name": "Add police action with invalid mtu value",
+ "category": [
+ "actions",
+ "police"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action police",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action police rate 10kbit burst 10k mtu 2Pbit index 1",
+ "expExitCode": "255",
+ "verifyCmd": "$TC actions ls action police",
+ "matchPattern": "action order [0-9]*: police 0x1 rate 10Kbit burst 1Kb mtu 2Pb",
+ "matchCount": "0",
+ "teardown": [
+ "$TC actions flush action police"
+ ]
+ },
+ {
+ "id": "f3c9",
+ "name": "Add police action with cookie",
+ "category": [
+ "actions",
+ "police"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action police",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action police rate 10mbit burst 10k index 1 cookie a1b1c1d1e1f12233bb",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action police index 1",
+ "matchPattern": "action order [0-9]*: police 0x1 rate 10Mbit burst 10Kb mtu 2Kb.*cookie a1b1c1d1e1f12233bb",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action police"
+ ]
+ },
+ {
+ "id": "d190",
+ "name": "Add police action with maximum index",
+ "category": [
+ "actions",
+ "police"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action police",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action police rate 10mbit burst 10k index 4294967295",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action mirred index 4294967295",
+ "matchPattern": "action order [0-9]*: police 0xffffffff rate 10Mbit burst 10Kb mtu 2Kb",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action mirred"
+ ]
+ },
+ {
"id": "336e",
"name": "Delete police action",
"category": [
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json b/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json
index 99635ea4722e..37ecc2716fee 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json
@@ -216,6 +216,174 @@
]
},
{
+ "id": "464a",
+ "name": "Add skbedit action with control pipe",
+ "category": [
+ "actions",
+ "skbedit"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action skbedit",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action skbedit ptype host pipe index 11",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action skbedit index 11",
+ "matchPattern": "action order [0-9]*: skbedit ptype host pipe.*index 11 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action skbedit"
+ ]
+ },
+ {
+ "id": "212f",
+ "name": "Add skbedit action with control reclassify",
+ "category": [
+ "actions",
+ "skbedit"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action skbedit",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action skbedit mark 56789 reclassify index 90",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action skbedit index 90",
+ "matchPattern": "action order [0-9]*: skbedit mark 56789 reclassify.*index 90 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action skbedit"
+ ]
+ },
+ {
+ "id": "0651",
+ "name": "Add skbedit action with control pass",
+ "category": [
+ "actions",
+ "skbedit"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action skbedit",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action skbedit queue_mapping 3 pass index 271",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action skbedit index 271",
+ "matchPattern": "action order [0-9]*: skbedit queue_mapping 3 pass.*index 271 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action skbedit"
+ ]
+ },
+ {
+ "id": "cc53",
+ "name": "Add skbedit action with control drop",
+ "category": [
+ "actions",
+ "skbedit"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action skbedit",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action skbedit queue_mapping 3 drop index 271",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action skbedit index 271",
+ "matchPattern": "action order [0-9]*: skbedit queue_mapping 3 drop.*index 271 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action skbedit"
+ ]
+ },
+ {
+ "id": "ec16",
+ "name": "Add skbedit action with control jump",
+ "category": [
+ "actions",
+ "skbedit"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action skbedit",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action skbedit priority 8 jump 9 index 2",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action skbedit index 2",
+ "matchPattern": "action order [0-9]*: skbedit priority :8 jump 9.*index 2 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action skbedit"
+ ]
+ },
+ {
+ "id": "db54",
+ "name": "Add skbedit action with control continue",
+ "category": [
+ "actions",
+ "skbedit"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action skbedit",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action skbedit priority 16 continue index 32",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action skbedit index 32",
+ "matchPattern": "action order [0-9]*: skbedit priority :16 continue.*index 32 ref",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action skbedit"
+ ]
+ },
+ {
+ "id": "1055",
+ "name": "Add skbedit action with cookie",
+ "category": [
+ "actions",
+ "skbedit"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action skbedit",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action skbedit priority 16 continue index 32 cookie deadbeef",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action skbedit index 32",
+ "matchPattern": "action order [0-9]*: skbedit priority :16 continue.*index 32 ref.*cookie deadbeef",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action skbedit"
+ ]
+ },
+ {
"id": "5172",
"name": "List skbedit actions",
"category": [
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/skbmod.json b/tools/testing/selftests/tc-testing/tc-tests/actions/skbmod.json
index 90bba48c3f07..8aa5a88ccb19 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/skbmod.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/skbmod.json
@@ -264,6 +264,30 @@
]
},
{
+ "id": "6046",
+ "name": "Add skbmod action with control reclassify and cookie",
+ "category": [
+ "actions",
+ "skbmod"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action skbmod",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action skbmod set smac 00:01:02:03:04:01 reclassify index 1 cookie ddeeffaabb11cc22",
+ "expExitCode": "0",
+ "verifyCmd": "$TC actions get action skbmod index 1",
+ "matchPattern": "action order [0-9]*: skbmod reclassify set smac 00:01:02:03:04:01.*index 1 ref.*cookie ddeeffaabb11cc22",
+ "matchCount": "1",
+ "teardown": [
+ "$TC actions flush action skbmod"
+ ]
+ },
+ {
"id": "58cb",
"name": "List skbmod actions",
"category": [
@@ -369,4 +393,4 @@
"$TC actions flush action skbmod"
]
}
-]
+]
\ No newline at end of file
--
2.7.4
^ permalink raw reply related
* Re: [PATCH V2 net-next 06/14] net/tls: Add generic NIC offload infrastructure
From: Kirill Tkhai @ 2018-03-21 22:27 UTC (permalink / raw)
To: Saeed Mahameed, David S. Miller
Cc: netdev, Dave Watson, Boris Pismenny, Ilya Lesokhin,
Aviad Yehezkel
In-Reply-To: <20180321210146.22537-7-saeedm@mellanox.com>
Hi, Saeed,
thanks for fixing some of my remarks, but I've dived into the code
more deeply, and found with a sadness, the patch lacks the readability.
It too big and not fit kernel coding style. Please, see some comments
below.
Can we do something with patch length? Is there a way to split it in
several small patches? It's difficult to review the logic of changes.
On 22.03.2018 00:01, Saeed Mahameed wrote:
> From: Ilya Lesokhin <ilyal@mellanox.com>
>
> This patch adds a generic infrastructure to offload TLS crypto to a
> network devices. It enables the kernel TLS socket to skip encryption
> and authentication operations on the transmit side of the data path.
> Leaving those computationally expensive operations to the NIC.
>
> The NIC offload infrastructure builds TLS records and pushes them to
> the TCP layer just like the SW KTLS implementation and using the same API.
> TCP segmentation is mostly unaffected. Currently the only exception is
> that we prevent mixed SKBs where only part of the payload requires
> offload. In the future we are likely to add a similar restriction
> following a change cipher spec record.
>
> The notable differences between SW KTLS and NIC offloaded TLS
> implementations are as follows:
> 1. The offloaded implementation builds "plaintext TLS record", those
> records contain plaintext instead of ciphertext and place holder bytes
> instead of authentication tags.
> 2. The offloaded implementation maintains a mapping from TCP sequence
> number to TLS records. Thus given a TCP SKB sent from a NIC offloaded
> TLS socket, we can use the tls NIC offload infrastructure to obtain
> enough context to encrypt the payload of the SKB.
> A TLS record is released when the last byte of the record is ack'ed,
> this is done through the new icsk_clean_acked callback.
>
> The infrastructure should be extendable to support various NIC offload
> implementations. However it is currently written with the
> implementation below in mind:
> The NIC assumes that packets from each offloaded stream are sent as
> plaintext and in-order. It keeps track of the TLS records in the TCP
> stream. When a packet marked for offload is transmitted, the NIC
> encrypts the payload in-place and puts authentication tags in the
> relevant place holders.
>
> The responsibility for handling out-of-order packets (i.e. TCP
> retransmission, qdisc drops) falls on the netdev driver.
>
> The netdev driver keeps track of the expected TCP SN from the NIC's
> perspective. If the next packet to transmit matches the expected TCP
> SN, the driver advances the expected TCP SN, and transmits the packet
> with TLS offload indication.
>
> If the next packet to transmit does not match the expected TCP SN. The
> driver calls the TLS layer to obtain the TLS record that includes the
> TCP of the packet for transmission. Using this TLS record, the driver
> posts a work entry on the transmit queue to reconstruct the NIC TLS
> state required for the offload of the out-of-order packet. It updates
> the expected TCP SN accordingly and transmit the now in-order packet.
> The same queue is used for packet transmission and TLS context
> reconstruction to avoid the need for flushing the transmit queue before
> issuing the context reconstruction request.
>
> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
> Signed-off-by: Boris Pismenny <borisp@mellanox.com>
> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> ---
> include/net/tls.h | 74 +++-
> net/tls/Kconfig | 10 +
> net/tls/Makefile | 2 +
> net/tls/tls_device.c | 793 ++++++++++++++++++++++++++++++++++++++++++
> net/tls/tls_device_fallback.c | 415 ++++++++++++++++++++++
> net/tls/tls_main.c | 33 +-
> 6 files changed, 1320 insertions(+), 7 deletions(-)
> create mode 100644 net/tls/tls_device.c
> create mode 100644 net/tls/tls_device_fallback.c
>
> diff --git a/include/net/tls.h b/include/net/tls.h
> index 4913430ab807..0bfb1b0a156a 100644
> --- a/include/net/tls.h
> +++ b/include/net/tls.h
> @@ -77,6 +77,37 @@ struct tls_sw_context {
> struct scatterlist sg_aead_out[2];
> };
>
> +struct tls_record_info {
> + struct list_head list;
> + u32 end_seq;
> + int len;
> + int num_frags;
> + skb_frag_t frags[MAX_SKB_FRAGS];
> +};
> +
> +struct tls_offload_context {
> + struct crypto_aead *aead_send;
> + spinlock_t lock; /* protects records list */
> + struct list_head records_list;
> + struct tls_record_info *open_record;
> + struct tls_record_info *retransmit_hint;
> + u64 hint_record_sn;
> + u64 unacked_record_sn;
> +
> + struct scatterlist sg_tx_data[MAX_SKB_FRAGS];
> + void (*sk_destruct)(struct sock *sk);
> + u8 driver_state[];
> + /* The TLS layer reserves room for driver specific state
> + * Currently the belief is that there is not enough
> + * driver specific state to justify another layer of indirection
> + */
> +#define TLS_DRIVER_STATE_SIZE (max_t(size_t, 8, sizeof(void *)))
> +};
> +
> +#define TLS_OFFLOAD_CONTEXT_SIZE \
> + (ALIGN(sizeof(struct tls_offload_context), sizeof(void *)) + \
> + TLS_DRIVER_STATE_SIZE)
> +
> enum {
> TLS_PENDING_CLOSED_RECORD
> };
> @@ -87,6 +118,10 @@ struct tls_context {
> struct tls12_crypto_info_aes_gcm_128 crypto_send_aes_gcm_128;
> };
>
> + struct list_head list;
> + struct net_device *netdev;
> + refcount_t refcount;
> +
> void *priv_ctx;
>
> u8 tx_conf:2;
> @@ -131,9 +166,29 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
> void tls_sw_close(struct sock *sk, long timeout);
> void tls_sw_free_tx_resources(struct sock *sk);
>
> -void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
> -void tls_icsk_clean_acked(struct sock *sk);
> +void tls_clear_device_offload(struct sock *sk, struct tls_context *ctx);
> +int tls_set_device_offload(struct sock *sk, struct tls_context *ctx);
> +int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
> +int tls_device_sendpage(struct sock *sk, struct page *page,
> + int offset, size_t size, int flags);
> +void tls_device_sk_destruct(struct sock *sk);
> +void tls_device_init(void);
> +void tls_device_cleanup(void);
> +
> +struct tls_record_info *tls_get_record(struct tls_offload_context *context,
> + u32 seq, u64 *p_record_sn);
> +
> +static inline bool tls_record_is_start_marker(struct tls_record_info *rec)
> +{
> + return rec->len == 0;
> +}
> +
> +static inline u32 tls_record_start_seq(struct tls_record_info *rec)
> +{
> + return rec->end_seq - rec->len;
> +}
>
> +void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
> int tls_push_sg(struct sock *sk, struct tls_context *ctx,
> struct scatterlist *sg, u16 first_offset,
> int flags);
> @@ -170,6 +225,13 @@ static inline bool tls_is_pending_open_record(struct tls_context *tls_ctx)
> return tls_ctx->pending_open_record_frags;
> }
>
> +static inline bool tls_is_sk_tx_device_offloaded(struct sock *sk)
> +{
> + return sk_fullsock(sk) &&
> + /* matches smp_store_release in tls_set_device_offload */
> + smp_load_acquire(&sk->sk_destruct) == &tls_device_sk_destruct;
> +}
> +
> static inline void tls_err_abort(struct sock *sk)
> {
> sk->sk_err = EBADMSG;
> @@ -257,4 +319,12 @@ static inline struct tls_offload_context *tls_offload_ctx(
> int tls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
> unsigned char *record_type);
>
> +struct sk_buff *tls_validate_xmit_skb(struct sock *sk,
> + struct net_device *dev,
> + struct sk_buff *skb);
> +
> +int tls_sw_fallback_init(struct sock *sk,
> + struct tls_offload_context *offload_ctx,
> + struct tls_crypto_info *crypto_info);
> +
> #endif /* _TLS_OFFLOAD_H */
> diff --git a/net/tls/Kconfig b/net/tls/Kconfig
> index eb583038c67e..9d3ef820bb16 100644
> --- a/net/tls/Kconfig
> +++ b/net/tls/Kconfig
> @@ -13,3 +13,13 @@ config TLS
> encryption handling of the TLS protocol to be done in-kernel.
>
> If unsure, say N.
> +
> +config TLS_DEVICE
> + bool "Transport Layer Security HW offload"
> + depends on TLS
> + select SOCK_VALIDATE_XMIT
> + default n
> + ---help---
> + Enable kernel support for HW offload of the TLS protocol.
> +
> + If unsure, say N.
> diff --git a/net/tls/Makefile b/net/tls/Makefile
> index a930fd1c4f7b..4d6b728a67d0 100644
> --- a/net/tls/Makefile
> +++ b/net/tls/Makefile
> @@ -5,3 +5,5 @@
> obj-$(CONFIG_TLS) += tls.o
>
> tls-y := tls_main.o tls_sw.o
> +
> +tls-$(CONFIG_TLS_DEVICE) += tls_device.o tls_device_fallback.o
> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> new file mode 100644
> index 000000000000..e623280ea019
> --- /dev/null
> +++ b/net/tls/tls_device.c
> @@ -0,0 +1,793 @@
> +/* Copyright (c) 2018, Mellanox Technologies All rights reserved.
> + *
> + * This software is available to you under a choice of one of two
> + * licenses. You may choose to be licensed under the terms of the GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * OpenIB.org BSD license below:
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * - Redistributions of source code must retain the above
> + * copyright notice, this list of conditions and the following
> + * disclaimer.
> + *
> + * - Redistributions in binary form must reproduce the above
> + * copyright notice, this list of conditions and the following
> + * disclaimer in the documentation and/or other materials
> + * provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +
> +#include <linux/module.h>
> +#include <net/tcp.h>
> +#include <net/inet_common.h>
> +#include <linux/highmem.h>
> +#include <linux/netdevice.h>
> +
> +#include <net/tls.h>
> +#include <crypto/aead.h>
> +
> +/* device_offload_lock is used to synchronize tls_dev_add
> + * against NETDEV_DOWN notifications.
> + */
> +static DECLARE_RWSEM(device_offload_lock);
> +
> +static void tls_device_gc_task(struct work_struct *work);
> +
> +static DECLARE_WORK(tls_device_gc_work, tls_device_gc_task);
> +static LIST_HEAD(tls_device_gc_list);
> +static LIST_HEAD(tls_device_list);
> +static DEFINE_SPINLOCK(tls_device_lock);
> +
> +static void tls_device_free_ctx(struct tls_context *ctx)
> +{
> + struct tls_offload_context *offlad_ctx = tls_offload_ctx(ctx);
> +
> + kfree(offlad_ctx);
> + kfree(ctx);
> +}
> +
> +static void tls_device_gc_task(struct work_struct *work)
> +{
> + struct tls_context *ctx, *tmp;
> + struct list_head gc_list;
> + unsigned long flags;
> +
> + INIT_LIST_HEAD(&gc_list);
You should just declare the list as
LIST_HEAD(gc_list);
I've already pointed this in commentary to v1.
> +
> + spin_lock_irqsave(&tls_device_lock, flags);
> + list_splice_init(&tls_device_gc_list, &gc_list);
> + spin_unlock_irqrestore(&tls_device_lock, flags);
> +
> + list_for_each_entry_safe(ctx, tmp, &gc_list, list) {
> + struct net_device *netdev = ctx->netdev;
> +
> + if (netdev) {
> + netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
> + TLS_OFFLOAD_CTX_DIR_TX);
> + dev_put(netdev);
> + }
> +
> + list_del(&ctx->list);
> + tls_device_free_ctx(ctx);
> + }
> +}
> +
> +static void tls_device_queue_ctx_destruction(struct tls_context *ctx)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&tls_device_lock, flags);
> + list_move_tail(&ctx->list, &tls_device_gc_list);
> +
> + /* schedule_work inside the spinlock
> + * to make sure tls_device_down waits for that work.
> + */
> + schedule_work(&tls_device_gc_work);
> +
> + spin_unlock_irqrestore(&tls_device_lock, flags);
> +}
> +
> +/* We assume that the socket is already connected */
> +static struct net_device *get_netdev_for_sock(struct sock *sk)
> +{
> + struct inet_sock *inet = inet_sk(sk);
> + struct net_device *netdev = NULL;
> +
> + netdev = dev_get_by_index(sock_net(sk), inet->cork.fl.flowi_oif);
> +
> + return netdev;
Why can't we just return
return dev_get_by_index(sock_net(sk), inet->cork.fl.flowi_oif);
?
What for we need this netdev declaration? The direct return makes the line
even shorter.
> +}
> +
> +static int attach_sock_to_netdev(struct sock *sk, struct net_device *netdev,
> + struct tls_context *ctx)
> +{
> + int rc;
> +
> + rc = netdev->tlsdev_ops->tls_dev_add(netdev, sk, TLS_OFFLOAD_CTX_DIR_TX,
> + &ctx->crypto_send,
> + tcp_sk(sk)->write_seq);
> + if (rc) {
> + pr_err_ratelimited("The netdev has refused to offload this socket\n");
> + goto out;
> + }
> +
> + rc = 0;
> +out:
> + return rc;
Too many not functional/useless lines. What for is this out label??? This has to be:
rc = netdev->tlsdev_ops->tls_dev_add(netdev, sk, TLS_OFFLOAD_CTX_DIR_TX,
&ctx->crypto_send,
tcp_sk(sk)->write_seq);
if (rc)
pr_err_ratelimited("The netdev has refused to offload this socket\n");
return rc;
> +}
> +
> +static void destroy_record(struct tls_record_info *record)
> +{
> + skb_frag_t *frag;
> + int nr_frags = record->num_frags;
> +
> + while (nr_frags > 0) {
> + frag = &record->frags[nr_frags - 1];
> + __skb_frag_unref(frag);
> + --nr_frags;
Why just not to write the below instead?
while (nr_frags-- > 0) {
frag = &record->frags[nr_frags];
__skb_frag_unref(frag);
}
> + }
> + kfree(record);
> +}
> +
> +static void delete_all_records(struct tls_offload_context *offload_ctx)
> +{
> + struct tls_record_info *info, *temp;
> +
> + list_for_each_entry_safe(info, temp, &offload_ctx->records_list, list) {
> + list_del(&info->list);
> + destroy_record(info);
> + }
> +
> + offload_ctx->retransmit_hint = NULL;
> +}
> +
> +static void tls_icsk_clean_acked(struct sock *sk, u32 acked_seq)
> +{
> + struct tls_context *tls_ctx = tls_get_ctx(sk);
> + struct tls_offload_context *ctx;
> + struct tls_record_info *info, *temp;
> + unsigned long flags;
> + u64 deleted_records = 0;
> +
> + if (!tls_ctx)
> + return;
> +
> + ctx = tls_offload_ctx(tls_ctx);
> +
> + spin_lock_irqsave(&ctx->lock, flags);
> + info = ctx->retransmit_hint;
> + if (info && !before(acked_seq, info->end_seq)) {
> + ctx->retransmit_hint = NULL;
> + list_del(&info->list);
> + destroy_record(info);
> + deleted_records++;
> + }
> +
> + list_for_each_entry_safe(info, temp, &ctx->records_list, list) {
> + if (before(acked_seq, info->end_seq))
> + break;
> + list_del(&info->list);
> +
> + destroy_record(info);
> + deleted_records++;
> + }
> +
> + ctx->unacked_record_sn += deleted_records;
> + spin_unlock_irqrestore(&ctx->lock, flags);
> +}
> +
> +/* At this point, there should be no references on this
> + * socket and no in-flight SKBs associated with this
> + * socket, so it is safe to free all the resources.
> + */
> +void tls_device_sk_destruct(struct sock *sk)
> +{
> + struct tls_context *tls_ctx = tls_get_ctx(sk);
> + struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
> +
> + if (ctx->open_record)
> + destroy_record(ctx->open_record);
> +
> + delete_all_records(ctx);
> + crypto_free_aead(ctx->aead_send);
> + ctx->sk_destruct(sk);
> + static_branch_dec(&clean_acked_data_enabled);
> +
> + if (refcount_dec_and_test(&tls_ctx->refcount))
> + tls_device_queue_ctx_destruction(tls_ctx);
> +}
> +EXPORT_SYMBOL(tls_device_sk_destruct);
> +
> +static inline void tls_append_frag(struct tls_record_info *record,
> + struct page_frag *pfrag,
> + int size)
> +{
> + skb_frag_t *frag;
> +
> + frag = &record->frags[record->num_frags - 1];
> + if (frag->page.p == pfrag->page &&
> + frag->page_offset + frag->size == pfrag->offset) {
> + frag->size += size;
> + } else {
> + ++frag;
> + frag->page.p = pfrag->page;
> + frag->page_offset = pfrag->offset;
> + frag->size = size;
> + ++record->num_frags;
> + get_page(pfrag->page);
> + }
> +
> + pfrag->offset += size;
> + record->len += size;
> +}
> +
> +static inline int tls_push_record(struct sock *sk,
> + struct tls_context *ctx,
> + struct tls_offload_context *offload_ctx,
> + struct tls_record_info *record,
> + struct page_frag *pfrag,
> + int flags,
> + unsigned char record_type)
> +{
> + skb_frag_t *frag;
> + struct tcp_sock *tp = tcp_sk(sk);
> + struct page_frag fallback_frag;
> + struct page_frag *tag_pfrag = pfrag;
> + int i;
> +
> + /* fill prepand */
> + frag = &record->frags[0];
> + tls_fill_prepend(ctx,
> + skb_frag_address(frag),
> + record->len - ctx->prepend_size,
> + record_type);
> +
> + if (unlikely(!skb_page_frag_refill(ctx->tag_size, pfrag, GFP_KERNEL))) {
> + /* HW doesn't care about the data in the tag
> + * so in case pfrag has no room
> + * for a tag and we can't allocate a new pfrag
> + * just use the page in the first frag
> + * rather then write a complicated fall back code.
> + */
> + tag_pfrag = &fallback_frag;
> + tag_pfrag->page = skb_frag_page(frag);
> + tag_pfrag->offset = 0;
> + }
> +
> + tls_append_frag(record, tag_pfrag, ctx->tag_size);
> + record->end_seq = tp->write_seq + record->len;
> + spin_lock_irq(&offload_ctx->lock);
> + list_add_tail(&record->list, &offload_ctx->records_list);
> + spin_unlock_irq(&offload_ctx->lock);
> + offload_ctx->open_record = NULL;
> + set_bit(TLS_PENDING_CLOSED_RECORD, &ctx->flags);
> + tls_advance_record_sn(sk, ctx);
> +
> + for (i = 0; i < record->num_frags; i++) {
> + frag = &record->frags[i];
> + sg_unmark_end(&offload_ctx->sg_tx_data[i]);
> + sg_set_page(&offload_ctx->sg_tx_data[i], skb_frag_page(frag),
> + frag->size, frag->page_offset);
> + sk_mem_charge(sk, frag->size);
> + get_page(skb_frag_page(frag));
> + }
> + sg_mark_end(&offload_ctx->sg_tx_data[record->num_frags - 1]);
> +
> + /* all ready, send */
> + return tls_push_sg(sk, ctx, offload_ctx->sg_tx_data, 0, flags);
> +}
> +
> +static inline int tls_create_new_record(struct tls_offload_context *offload_ctx,
> + struct page_frag *pfrag,
> + size_t prepend_size)
> +{
> + skb_frag_t *frag;
> + struct tls_record_info *record;
> +
> + record = kmalloc(sizeof(*record), GFP_KERNEL);
> + if (!record)
> + return -ENOMEM;
> +
> + frag = &record->frags[0];
> + __skb_frag_set_page(frag, pfrag->page);
> + frag->page_offset = pfrag->offset;
> + skb_frag_size_set(frag, prepend_size);
> +
> + get_page(pfrag->page);
> + pfrag->offset += prepend_size;
> +
> + record->num_frags = 1;
> + record->len = prepend_size;
> + offload_ctx->open_record = record;
> + return 0;
> +}
> +
> +static inline int tls_do_allocation(struct sock *sk,
> + struct tls_offload_context *offload_ctx,
> + struct page_frag *pfrag,
> + size_t prepend_size)
> +{
> + int ret;
> +
> + if (!offload_ctx->open_record) {
> + if (unlikely(!skb_page_frag_refill(prepend_size, pfrag,
> + sk->sk_allocation))) {
> + sk->sk_prot->enter_memory_pressure(sk);
> + sk_stream_moderate_sndbuf(sk);
> + return -ENOMEM;
> + }
> +
> + ret = tls_create_new_record(offload_ctx, pfrag, prepend_size);
> + if (ret)
> + return ret;
> +
> + if (pfrag->size > pfrag->offset)
> + return 0;
> + }
> +
> + if (!sk_page_frag_refill(sk, pfrag))
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +static int tls_push_data(struct sock *sk,
> + struct iov_iter *msg_iter,
> + size_t size, int flags,
> + unsigned char record_type)
> +{
> + struct tls_context *tls_ctx = tls_get_ctx(sk);
> + struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
> + struct tls_record_info *record = ctx->open_record;
> + struct page_frag *pfrag;
> + int copy, rc = 0;
> + size_t orig_size = size;
> + u32 max_open_record_len;
> + long timeo;
> + int more = flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE);
> + int tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST;
> + bool done = false;
As David says, variables have to be declared in reverse Сhristmas tree order.
> +
> + if (flags &
> + ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST))
> + return -ENOTSUPP;
> +
> + if (sk->sk_err)
> + return -sk->sk_err;
> +
> + timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
> + rc = tls_complete_pending_work(sk, tls_ctx, flags, &timeo);
> + if (rc < 0)
> + return rc;
> +
> + pfrag = sk_page_frag(sk);
> +
> + /* TLS_TLS_HEADER_SIZE is not counted as part of the TLS record, and
> + * we need to leave room for an authentication tag.
> + */
> + max_open_record_len = TLS_MAX_PAYLOAD_SIZE +
> + tls_ctx->prepend_size;
> + do {
> + if (tls_do_allocation(sk, ctx, pfrag,
> + tls_ctx->prepend_size)) {
> + rc = sk_stream_wait_memory(sk, &timeo);
> + if (!rc)
> + continue;
> +
> + record = ctx->open_record;
> + if (!record)
> + break;
> +handle_error:
> + if (record_type != TLS_RECORD_TYPE_DATA) {
> + /* avoid sending partial
> + * record with type !=
> + * application_data
> + */
> + size = orig_size;
> + destroy_record(record);
> + ctx->open_record = NULL;
> + } else if (record->len > tls_ctx->prepend_size) {
> + goto last_record;
> + }
> +
> + break;
> + }
> +
> + record = ctx->open_record;
> + copy = min_t(size_t, size, (pfrag->size - pfrag->offset));
> + copy = min_t(size_t, copy, (max_open_record_len - record->len));
> +
> + if (copy_from_iter_nocache(page_address(pfrag->page) +
> + pfrag->offset,
> + copy, msg_iter) != copy) {
> + rc = -EFAULT;
> + goto handle_error;
> + }
> + tls_append_frag(record, pfrag, copy);
> +
> + size -= copy;
> + if (!size) {
> +last_record:
> + tls_push_record_flags = flags;
> + if (more) {
> + tls_ctx->pending_open_record_frags =
> + record->num_frags;
> + break;
> + }
> +
> + done = true;
> + }
> +
> + if ((done) || record->len >= max_open_record_len ||
> + (record->num_frags >= MAX_SKB_FRAGS - 1)) {
> + rc = tls_push_record(sk,
> + tls_ctx,
> + ctx,
> + record,
> + pfrag,
> + tls_push_record_flags,
> + record_type);
> + if (rc < 0)
> + break;
> + }
> + } while (!done);
> +
> + if (orig_size - size > 0)
> + rc = orig_size - size;
> +
> + return rc;
> +}
> +
> +int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> +{
> + unsigned char record_type = TLS_RECORD_TYPE_DATA;
> + int rc = 0;
> +
> + lock_sock(sk);
> +
> + if (unlikely(msg->msg_controllen)) {
> + rc = tls_proccess_cmsg(sk, msg, &record_type);
> + if (rc)
> + goto out;
> + }
> +
> + rc = tls_push_data(sk, &msg->msg_iter, size,
> + msg->msg_flags, record_type);
> +
> +out:
> + release_sock(sk);
> + return rc;
> +}
> +
> +int tls_device_sendpage(struct sock *sk, struct page *page,
> + int offset, size_t size, int flags)
> +{
> + struct iov_iter msg_iter;
> + struct kvec iov;
> + char *kaddr = kmap(page);
> + int rc = 0;
> +
> + if (flags & MSG_SENDPAGE_NOTLAST)
> + flags |= MSG_MORE;
> +
> + lock_sock(sk);
> +
> + if (flags & MSG_OOB) {
> + rc = -ENOTSUPP;
> + goto out;
> + }
> +
> + iov.iov_base = kaddr + offset;
> + iov.iov_len = size;
> + iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, &iov, 1, size);
> + rc = tls_push_data(sk, &msg_iter, size,
> + flags, TLS_RECORD_TYPE_DATA);
> + kunmap(page);
> +
> +out:
> + release_sock(sk);
> + return rc;
> +}
> +
> +struct tls_record_info *tls_get_record(struct tls_offload_context *context,
> + u32 seq, u64 *p_record_sn)
> +{
> + struct tls_record_info *info;
> + u64 record_sn = context->hint_record_sn;
> +
> + info = context->retransmit_hint;
> + if (!info ||
> + before(seq, info->end_seq - info->len)) {
> + /* if retransmit_hint is irrelevant start
> + * from the begging of the list
> + */
> + info = list_first_entry(&context->records_list,
> + struct tls_record_info, list);
> + record_sn = context->unacked_record_sn;
> + }
> +
> + list_for_each_entry_from(info, &context->records_list, list) {
> + if (before(seq, info->end_seq)) {
> + if (!context->retransmit_hint ||
> + after(info->end_seq,
> + context->retransmit_hint->end_seq)) {
> + context->hint_record_sn = record_sn;
> + context->retransmit_hint = info;
> + }
> + *p_record_sn = record_sn;
> + return info;
> + }
> + record_sn++;
> + }
> +
> + return NULL;
> +}
> +EXPORT_SYMBOL(tls_get_record);
> +
> +static int tls_device_push_pending_record(struct sock *sk, int flags)
> +{
> + struct iov_iter msg_iter;
> +
> + iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, NULL, 0, 0);
> + return tls_push_data(sk, &msg_iter, 0, flags, TLS_RECORD_TYPE_DATA);
> +}
> +
> +int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
> +{
> + u16 nonece_size, tag_size, iv_size, rec_seq_size;
> + struct tls_record_info *start_marker_record;
> + struct tls_offload_context *offload_ctx;
> + struct tls_crypto_info *crypto_info;
> + struct net_device *netdev;
> + char *iv, *rec_seq;
> + struct sk_buff *skb;
> + int rc = -EINVAL;
> + __be64 rcd_sn;
> +
> + if (!ctx)
> + goto out;
> +
> + if (ctx->priv_ctx) {
> + rc = -EEXIST;
> + goto out;
> + }
> +
> + start_marker_record = kmalloc(sizeof(*start_marker_record), GFP_KERNEL);
> + if (!start_marker_record) {
> + rc = -ENOMEM;
> + goto out;
> + }
> +
> + offload_ctx = kzalloc(TLS_OFFLOAD_CONTEXT_SIZE, GFP_KERNEL);
> + if (!offload_ctx) {
> + rc = -ENOMEM;
> + goto free_marker_record;
> + }
> +
> + crypto_info = &ctx->crypto_send;
> + switch (crypto_info->cipher_type) {
> + case TLS_CIPHER_AES_GCM_128: {
> + nonece_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
> + tag_size = TLS_CIPHER_AES_GCM_128_TAG_SIZE;
> + iv_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
> + iv = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->iv;
> + rec_seq_size = TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE;
> + rec_seq =
> + ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->rec_seq;
> + break;
> + }
> + default:
> + rc = -EINVAL;
> + goto free_offload_ctx;
> + }
> +
> + ctx->prepend_size = TLS_HEADER_SIZE + nonece_size;
> + ctx->tag_size = tag_size;
> + ctx->iv_size = iv_size;
> + ctx->iv = kmalloc(iv_size + TLS_CIPHER_AES_GCM_128_SALT_SIZE,
> + GFP_KERNEL);
> + if (!ctx->iv) {
> + rc = -ENOMEM;
> + goto free_offload_ctx;
> + }
> +
> + memcpy(ctx->iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, iv, iv_size);
> +
> + ctx->rec_seq_size = rec_seq_size;
> + ctx->rec_seq = kmalloc(rec_seq_size, GFP_KERNEL);
> + if (!ctx->rec_seq) {
> + rc = -ENOMEM;
> + goto free_iv;
> + }
> + memcpy(ctx->rec_seq, rec_seq, rec_seq_size);
> +
> + rc = tls_sw_fallback_init(sk, offload_ctx, crypto_info);
> + if (rc)
> + goto free_rec_seq;
> +
> + /* start at rec_seq - 1 to account for the start marker record */
> + memcpy(&rcd_sn, ctx->rec_seq, sizeof(rcd_sn));
> + offload_ctx->unacked_record_sn = be64_to_cpu(rcd_sn) - 1;
> +
> + start_marker_record->end_seq = tcp_sk(sk)->write_seq;
> + start_marker_record->len = 0;
> + start_marker_record->num_frags = 0;
> +
> + INIT_LIST_HEAD(&offload_ctx->records_list);
> + list_add_tail(&start_marker_record->list, &offload_ctx->records_list);
> + spin_lock_init(&offload_ctx->lock);
> +
> + static_branch_inc(&clean_acked_data_enabled);
> + inet_csk(sk)->icsk_clean_acked = &tls_icsk_clean_acked;
> + ctx->push_pending_record = tls_device_push_pending_record;
> + offload_ctx->sk_destruct = sk->sk_destruct;
> +
> + /* TLS offload is greatly simplified if we don't send
> + * SKBs where only part of the payload needs to be encrypted.
> + * So mark the last skb in the write queue as end of record.
> + */
> + skb = tcp_write_queue_tail(sk);
> + if (skb)
> + TCP_SKB_CB(skb)->eor = 1;
> +
> + refcount_set(&ctx->refcount, 1);
> +
> + /* We support starting offload on multiple sockets
> + * concurrently, so we only need a read lock here.
> + * This lock must preceed get_netdev_for_sock to prevent races between
> + * NETDEV_DOWN and setsockopt.
> + */
> + down_read(&device_offload_lock);
> + netdev = get_netdev_for_sock(sk);
> + if (!netdev) {
> + pr_err_ratelimited("%s: netdev not found\n", __func__);
> + rc = -EINVAL;
> + goto release_lock;
> + }
> +
> + if (!(netdev->features & NETIF_F_HW_TLS_TX)) {
> + rc = -ENOTSUPP;
> + goto release_netdev;
> + }
> +
> + /* Avoid offloading if the device is down
> + * We don't want to offload new flows after
> + * the NETDEV_DOWN event
> + */
> + if (!(netdev->flags & IFF_UP)) {
> + rc = -EINVAL;
> + goto release_netdev;
> + }
> +
> + ctx->priv_ctx = offload_ctx;
> + rc = attach_sock_to_netdev(sk, netdev, ctx);
> + if (rc)
> + goto release_netdev;
> +
> + ctx->netdev = netdev;
> +
> + spin_lock_irq(&tls_device_lock);
> + list_add_tail(&ctx->list, &tls_device_list);
> + spin_unlock_irq(&tls_device_lock);
> +
> + sk->sk_validate_xmit_skb = tls_validate_xmit_skb;
> + /* following this assignment tls_is_sk_tx_device_offloaded
> + * will return true and the context might be accessed
> + * by the netdev's xmit function.
> + */
> + smp_store_release(&sk->sk_destruct,
> + &tls_device_sk_destruct);
> + up_read(&device_offload_lock);
> + goto out;
> +
> +release_netdev:
> + dev_put(netdev);
> +release_lock:
> + up_read(&device_offload_lock);
> + static_branch_dec(&clean_acked_data_enabled);
> + crypto_free_aead(offload_ctx->aead_send);
> +free_rec_seq:
> + kfree(ctx->rec_seq);
> +free_iv:
> + kfree(ctx->iv);
> +free_offload_ctx:
> + kfree(offload_ctx);
> + ctx->priv_ctx = NULL;
> +free_marker_record:
> + kfree(start_marker_record);
> +out:
> + return rc;
> +}
> +
> +static int tls_device_api_check(struct net_device *dev)
> +{
> + if ((dev->features & NETIF_F_HW_TLS_TX) && !dev->tlsdev_ops)
> + return NOTIFY_BAD;
You have almost the same check in tls_device_down().
Why can't just filter all !(dev->features & NETIF_F_HW_TLS_TX) devices
in tls_dev_event()?
I gave the code example in comment to previous patch.
> +
> + return NOTIFY_DONE;
> +}
> +
> +static int tls_device_down(struct net_device *netdev)
> +{
> + struct tls_context *ctx, *tmp;
> + struct list_head list;
> + unsigned long flags;
> +
> + if (!(netdev->features & NETIF_F_HW_TLS_TX))
> + return NOTIFY_DONE;
> +
> + INIT_LIST_HEAD(&list);
This should just be declarated as:
LIST_HEAD(list);
This will save us 2 lines.
> +
> + /* Request a write lock to block new offload attempts
> + */
> + down_write(&device_offload_lock);
> +
> + spin_lock_irqsave(&tls_device_lock, flags);
> + list_for_each_entry_safe(ctx, tmp, &tls_device_list, list) {
> + if (ctx->netdev != netdev ||
> + !refcount_inc_not_zero(&ctx->refcount))
> + continue;
> +
> + list_move(&ctx->list, &list);
> + }
> + spin_unlock_irqrestore(&tls_device_lock, flags);
> +
> + list_for_each_entry_safe(ctx, tmp, &list, list) {
> + netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
> + TLS_OFFLOAD_CTX_DIR_TX);
> + ctx->netdev = NULL;
> + dev_put(netdev);
> + list_del_init(&ctx->list);
> +
> + if (refcount_dec_and_test(&ctx->refcount))
> + tls_device_free_ctx(ctx);
> + }
> +
> + up_write(&device_offload_lock);
> +
> + flush_work(&tls_device_gc_work);
> +
> + return NOTIFY_DONE;
> +}
> +
> +static int tls_dev_event(struct notifier_block *this, unsigned long event,
> + void *ptr)
> +{
> + struct net_device *dev = netdev_notifier_info_to_dev(ptr);
> +
> + switch (event) {
> + case NETDEV_REGISTER:
> + case NETDEV_FEAT_CHANGE:
> + return tls_device_api_check(dev);
> + case NETDEV_DOWN:
> + return tls_device_down(dev);
> + }
> + return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block tls_dev_notifier = {
> + .notifier_call = tls_dev_event,
> +};
> +
> +void __init tls_device_init(void)
> +{
> + register_netdevice_notifier(&tls_dev_notifier);
> +}
> +
> +void __exit tls_device_cleanup(void)
> +{
> + unregister_netdevice_notifier(&tls_dev_notifier);
> + flush_work(&tls_device_gc_work);
> +}
> diff --git a/net/tls/tls_device_fallback.c b/net/tls/tls_device_fallback.c
> new file mode 100644
> index 000000000000..843c7331cfc4
> --- /dev/null
> +++ b/net/tls/tls_device_fallback.c
> @@ -0,0 +1,415 @@
> +/* Copyright (c) 2018, Mellanox Technologies All rights reserved.
> + *
> + * This software is available to you under a choice of one of two
> + * licenses. You may choose to be licensed under the terms of the GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * OpenIB.org BSD license below:
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * - Redistributions of source code must retain the above
> + * copyright notice, this list of conditions and the following
> + * disclaimer.
> + *
> + * - Redistributions in binary form must reproduce the above
> + * copyright notice, this list of conditions and the following
> + * disclaimer in the documentation and/or other materials
> + * provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +
> +#include <net/tls.h>
> +#include <crypto/aead.h>
> +#include <crypto/scatterwalk.h>
> +#include <net/ip6_checksum.h>
> +
> +static void chain_to_walk(struct scatterlist *sg, struct scatter_walk *walk)
> +{
> + struct scatterlist *src = walk->sg;
> + int diff = walk->offset - src->offset;
> +
> + sg_set_page(sg, sg_page(src),
> + src->length - diff, walk->offset);
> +
> + scatterwalk_crypto_chain(sg, sg_next(src), 0, 2);
> +}
> +
> +static int tls_enc_record(struct aead_request *aead_req,
> + struct crypto_aead *aead, char *aad, char *iv,
> + __be64 rcd_sn, struct scatter_walk *in,
> + struct scatter_walk *out, int *in_len)
> +{
> + struct scatterlist sg_in[3];
> + struct scatterlist sg_out[3];
> + unsigned char buf[TLS_HEADER_SIZE + TLS_CIPHER_AES_GCM_128_IV_SIZE];
Reverse Christmas tree.
> + u16 len;
> + int rc;
> +
> + len = min_t(int, *in_len, ARRAY_SIZE(buf));
> +
> + scatterwalk_copychunks(buf, in, len, 0);
> + scatterwalk_copychunks(buf, out, len, 1);
> +
> + *in_len -= len;
> + if (!*in_len)
> + return 0;
> +
> + scatterwalk_pagedone(in, 0, 1);
> + scatterwalk_pagedone(out, 1, 1);
> +
> + len = buf[4] | (buf[3] << 8);
> + len -= TLS_CIPHER_AES_GCM_128_IV_SIZE;
> +
> + tls_make_aad(aad, len - TLS_CIPHER_AES_GCM_128_TAG_SIZE,
> + (char *)&rcd_sn, sizeof(rcd_sn), buf[0]);
> +
> + memcpy(iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, buf + TLS_HEADER_SIZE,
> + TLS_CIPHER_AES_GCM_128_IV_SIZE);
> +
> + sg_init_table(sg_in, ARRAY_SIZE(sg_in));
> + sg_init_table(sg_out, ARRAY_SIZE(sg_out));
> + sg_set_buf(sg_in, aad, TLS_AAD_SPACE_SIZE);
> + sg_set_buf(sg_out, aad, TLS_AAD_SPACE_SIZE);
> + chain_to_walk(sg_in + 1, in);
> + chain_to_walk(sg_out + 1, out);
> +
> + *in_len -= len;
> + if (*in_len < 0) {
> + *in_len += TLS_CIPHER_AES_GCM_128_TAG_SIZE;
> + if (*in_len < 0)
> + /* the input buffer doesn't contain the entire record.
> + * trim len accordingly. The resulting authentication tag
> + * will contain garbage. but we don't care as we won't
> + * include any of it in the output skb
> + * Note that we assume the output buffer length
> + * is larger then input buffer length + tag size
> + */
> + len += *in_len;
> +
> + *in_len = 0;
> + }
> +
> + if (*in_len) {
> + scatterwalk_copychunks(NULL, in, len, 2);
> + scatterwalk_pagedone(in, 0, 1);
> + scatterwalk_copychunks(NULL, out, len, 2);
> + scatterwalk_pagedone(out, 1, 1);
> + }
> +
> + len -= TLS_CIPHER_AES_GCM_128_TAG_SIZE;
> + aead_request_set_crypt(aead_req, sg_in, sg_out, len, iv);
> +
> + rc = crypto_aead_encrypt(aead_req);
> +
> + return rc;
> +}
> +
> +static void tls_init_aead_request(struct aead_request *aead_req,
> + struct crypto_aead *aead)
> +{
> + aead_request_set_tfm(aead_req, aead);
> + aead_request_set_ad(aead_req, TLS_AAD_SPACE_SIZE);
> +}
> +
> +static struct aead_request *tls_alloc_aead_request(struct crypto_aead *aead,
> + gfp_t flags)
> +{
> + unsigned int req_size = sizeof(struct aead_request) +
> + crypto_aead_reqsize(aead);
> + struct aead_request *aead_req;
> +
> + aead_req = kzalloc(req_size, flags);
> + if (!aead_req)
> + return NULL;
> +
> + tls_init_aead_request(aead_req, aead);
> + return aead_req;
> +}
> +
> +static int tls_enc_records(struct aead_request *aead_req,
> + struct crypto_aead *aead, struct scatterlist *sg_in,
> + struct scatterlist *sg_out, char *aad, char *iv,
> + u64 rcd_sn, int len)
> +{
> + struct scatter_walk in;
> + struct scatter_walk out;
Why do not declare them both in a single line?
> + int rc;
> +
> + scatterwalk_start(&in, sg_in);
> + scatterwalk_start(&out, sg_out);
> +
> + do {
> + rc = tls_enc_record(aead_req, aead, aad, iv,
> + cpu_to_be64(rcd_sn), &in, &out, &len);
> + rcd_sn++;
> +
> + } while (rc == 0 && len);
> +
> + scatterwalk_done(&in, 0, 0);
> + scatterwalk_done(&out, 1, 0);
> +
> + return rc;
> +}
> +
> +static inline void update_chksum(struct sk_buff *skb, int headln)
> +{
> + /* Can't use icsk->icsk_af_ops->send_check here because the ip addresses
> + * might have been changed by NAT.
> + */
> +
> + const struct ipv6hdr *ipv6h;
> + const struct iphdr *iph;
> + struct tcphdr *th = tcp_hdr(skb);
> + int datalen = skb->len - headln;
> +
> + /* We only changed the payload so if we are using partial we don't
> + * need to update anything.
> + */
> + if (likely(skb->ip_summed == CHECKSUM_PARTIAL))
> + return;
> +
> + skb->ip_summed = CHECKSUM_PARTIAL;
> + skb->csum_start = skb_transport_header(skb) - skb->head;
> + skb->csum_offset = offsetof(struct tcphdr, check);
> +
> + if (skb->sk->sk_family == AF_INET6) {
> + ipv6h = ipv6_hdr(skb);
> + th->check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
> + datalen, IPPROTO_TCP, 0);
> + } else {
> + iph = ip_hdr(skb);
> + th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, datalen,
> + IPPROTO_TCP, 0);
> + }
> +}
> +
> +static void complete_skb(struct sk_buff *nskb, struct sk_buff *skb, int headln)
> +{
> + skb_copy_header(nskb, skb);
> +
> + skb_put(nskb, skb->len);
> + memcpy(nskb->data, skb->data, headln);
> + update_chksum(nskb, headln);
> +
> + nskb->destructor = skb->destructor;
> + nskb->sk = skb->sk;
> + skb->destructor = NULL;
> + skb->sk = NULL;
> + refcount_add(nskb->truesize - skb->truesize,
> + &nskb->sk->sk_wmem_alloc);
> +}
> +
> +/* This function may be called after the user socket is already
> + * closed so make sure we don't use anything freed during
> + * tls_sk_proto_close here
> + */
> +static struct sk_buff *tls_sw_fallback(struct sock *sk, struct sk_buff *skb)
> +{
> + int tcp_header_size = tcp_hdrlen(skb);
> + int tcp_payload_offset = skb_transport_offset(skb) + tcp_header_size;
> + int payload_len = skb->len - tcp_payload_offset;
> + struct tls_context *tls_ctx = tls_get_ctx(sk);
> + struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
> + int remaining, buf_len, resync_sgs, rc, i = 0;
> + void *buf, *dummy_buf, *iv, *aad;
> + struct scatterlist *sg_in;
> + struct scatterlist sg_out[3];
> + u32 tcp_seq = ntohl(tcp_hdr(skb)->seq);
> + struct aead_request *aead_req;
> + struct sk_buff *nskb = NULL;
> + struct tls_record_info *record;
> + unsigned long flags;
> + s32 sync_size;
> + u64 rcd_sn;
I don't remember I've seen a function with so many number of local variables.
Can we do something to improve the readability of this?
> + /* worst case is:
> + * MAX_SKB_FRAGS in tls_record_info
> + * MAX_SKB_FRAGS + 1 in SKB head and frags.
> + */
> + int sg_in_max_elements = 2 * MAX_SKB_FRAGS + 1;
> +
> + if (!payload_len)
> + return skb;
> +
> + sg_in = kmalloc_array(sg_in_max_elements, sizeof(*sg_in), GFP_ATOMIC);
> + if (!sg_in)
> + goto free_orig;
> +
> + sg_init_table(sg_in, sg_in_max_elements);
> + sg_init_table(sg_out, ARRAY_SIZE(sg_out));
> +
> + spin_lock_irqsave(&ctx->lock, flags);
> + record = tls_get_record(ctx, tcp_seq, &rcd_sn);
> + if (!record) {
> + spin_unlock_irqrestore(&ctx->lock, flags);
> + WARN(1, "Record not found for seq %u\n", tcp_seq);
> + goto free_sg;
> + }
> +
> + sync_size = tcp_seq - tls_record_start_seq(record);
> + if (sync_size < 0) {
> + int is_start_marker = tls_record_is_start_marker(record);
> +
> + spin_unlock_irqrestore(&ctx->lock, flags);
> + if (!is_start_marker)
> + /* This should only occur if the relevant record was
> + * already acked. In that case it should be ok
> + * to drop the packet and avoid retransmission.
> + *
> + * There is a corner case where the packet contains
> + * both an acked and a non-acked record.
> + * We currently don't handle that case and rely
> + * on TCP to retranmit a packet that doesn't contain
> + * already acked payload.
> + */
> + goto free_orig;
> +
> + if (payload_len > -sync_size) {
> + WARN(1, "Fallback of partially offloaded packets is not supported\n");
> + goto free_sg;
> + } else {
> + return skb;
> + }
> + }
> +
> + remaining = sync_size;
> + while (remaining > 0) {
> + skb_frag_t *frag = &record->frags[i];
> +
> + __skb_frag_ref(frag);
> + sg_set_page(sg_in + i, skb_frag_page(frag),
> + skb_frag_size(frag), frag->page_offset);
> +
> + remaining -= skb_frag_size(frag);
> +
> + if (remaining < 0)
> + sg_in[i].length += remaining;
> +
> + i++;
> + }
> + spin_unlock_irqrestore(&ctx->lock, flags);
> + resync_sgs = i;
> +
> + aead_req = tls_alloc_aead_request(ctx->aead_send, GFP_ATOMIC);
> + if (!aead_req)
> + goto put_sg;
> +
> + buf_len = TLS_CIPHER_AES_GCM_128_SALT_SIZE +
> + TLS_CIPHER_AES_GCM_128_IV_SIZE +
> + TLS_AAD_SPACE_SIZE +
> + sync_size +
> + tls_ctx->tag_size;
> + buf = kmalloc(buf_len, GFP_ATOMIC);
> + if (!buf)
> + goto free_req;
> +
> + nskb = alloc_skb(skb_headroom(skb) + skb->len, GFP_ATOMIC);
> + if (!nskb)
> + goto free_buf;
> +
> + skb_reserve(nskb, skb_headroom(skb));
> +
> + iv = buf;
> +
> + memcpy(iv, tls_ctx->crypto_send_aes_gcm_128.salt,
> + TLS_CIPHER_AES_GCM_128_SALT_SIZE);
> + aad = buf + TLS_CIPHER_AES_GCM_128_SALT_SIZE +
> + TLS_CIPHER_AES_GCM_128_IV_SIZE;
> + dummy_buf = aad + TLS_AAD_SPACE_SIZE;
> +
> + sg_set_buf(&sg_out[0], dummy_buf, sync_size);
> + sg_set_buf(&sg_out[1], nskb->data + tcp_payload_offset,
> + payload_len);
> + /* Add room for authentication tag produced by crypto */
> + dummy_buf += sync_size;
> + sg_set_buf(&sg_out[2], dummy_buf, tls_ctx->tag_size);
> + rc = skb_to_sgvec(skb, &sg_in[i], tcp_payload_offset,
> + payload_len);
> + if (rc < 0)
> + goto free_nskb;
> +
> + rc = tls_enc_records(aead_req, ctx->aead_send, sg_in, sg_out, aad, iv,
> + rcd_sn, sync_size + payload_len);
> + if (rc < 0)
> + goto free_nskb;
> +
> + complete_skb(nskb, skb, tcp_payload_offset);
> +
> + /* validate_xmit_skb_list assumes that if the skb wasn't segmented
> + * nskb->prev will point to the skb itself
> + */
> + nskb->prev = nskb;
> +free_buf:
> + kfree(buf);
> +free_req:
> + kfree(aead_req);
> +put_sg:
> + for (i = 0; i < resync_sgs; i++)
> + put_page(sg_page(&sg_in[i]));
> +free_sg:
> + kfree(sg_in);
> +free_orig:
> + kfree_skb(skb);
> + return nskb;
> +
> +free_nskb:
> + kfree_skb(nskb);
> + nskb = NULL;
> + goto free_buf;
> +}
> +
> +struct sk_buff *tls_validate_xmit_skb(struct sock *sk,
> + struct net_device *dev,
> + struct sk_buff *skb)
> +{
> + if (dev == tls_get_ctx(sk)->netdev)
> + return skb;
> +
> + return tls_sw_fallback(sk, skb);
> +}
> +
> +int tls_sw_fallback_init(struct sock *sk,
> + struct tls_offload_context *offload_ctx,
> + struct tls_crypto_info *crypto_info)
> +{
> + int rc;
> + const u8 *key;
Reverse Christmas tree.
> +
> + offload_ctx->aead_send =
> + crypto_alloc_aead("gcm(aes)", 0, CRYPTO_ALG_ASYNC);
> + if (IS_ERR(offload_ctx->aead_send)) {
> + rc = PTR_ERR(offload_ctx->aead_send);
> + pr_err_ratelimited("crypto_alloc_aead failed rc=%d\n", rc);
> + offload_ctx->aead_send = NULL;
> + goto err_out;
> + }
> +
> + key = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->key;
> +
> + rc = crypto_aead_setkey(offload_ctx->aead_send, key,
> + TLS_CIPHER_AES_GCM_128_KEY_SIZE);
> + if (rc)
> + goto free_aead;
> +
> + rc = crypto_aead_setauthsize(offload_ctx->aead_send,
> + TLS_CIPHER_AES_GCM_128_TAG_SIZE);
> + if (rc)
> + goto free_aead;
> +
> + return 0;
> +free_aead:
> + crypto_free_aead(offload_ctx->aead_send);
> +err_out:
> + return rc;
> +}
> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index d824d548447e..e0dface33017 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
> @@ -54,6 +54,9 @@ enum {
> enum {
> TLS_BASE_TX,
> TLS_SW_TX,
> +#ifdef CONFIG_TLS_DEVICE
> + TLS_HW_TX,
> +#endif
> TLS_NUM_CONFIG,
> };
>
> @@ -416,11 +419,19 @@ static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
> goto err_crypto_info;
> }
>
> - /* currently SW is default, we will have ethtool in future */
> - rc = tls_set_sw_offload(sk, ctx);
> - tx_conf = TLS_SW_TX;
> - if (rc)
> - goto err_crypto_info;
> +#ifdef CONFIG_TLS_DEVICE
> + rc = tls_set_device_offload(sk, ctx);
> + tx_conf = TLS_HW_TX;
> + if (rc) {
> +#else
> + {
> +#endif
> + /* if HW offload fails fallback to SW */
> + rc = tls_set_sw_offload(sk, ctx);
> + tx_conf = TLS_SW_TX;
> + if (rc)
> + goto err_crypto_info;
> + }
>
> ctx->tx_conf = tx_conf;
> update_sk_prot(sk, ctx);
> @@ -473,6 +484,12 @@ static void build_protos(struct proto *prot, struct proto *base)
> prot[TLS_SW_TX] = prot[TLS_BASE_TX];
> prot[TLS_SW_TX].sendmsg = tls_sw_sendmsg;
> prot[TLS_SW_TX].sendpage = tls_sw_sendpage;
> +
> +#ifdef CONFIG_TLS_DEVICE
> + prot[TLS_HW_TX] = prot[TLS_SW_TX];
> + prot[TLS_HW_TX].sendmsg = tls_device_sendmsg;
> + prot[TLS_HW_TX].sendpage = tls_device_sendpage;
> +#endif
> }
>
> static int tls_init(struct sock *sk)
> @@ -531,6 +548,9 @@ static int __init tls_register(void)
> {
> build_protos(tls_prots[TLSV4], &tcp_prot);
>
> +#ifdef CONFIG_TLS_DEVICE
> + tls_device_init();
> +#endif
> tcp_register_ulp(&tcp_tls_ulp_ops);
>
> return 0;
> @@ -539,6 +559,9 @@ static int __init tls_register(void)
> static void __exit tls_unregister(void)
> {
> tcp_unregister_ulp(&tcp_tls_ulp_ops);
> +#ifdef CONFIG_TLS_DEVICE
> + tls_device_cleanup();
> +#endif
> }
>
> module_init(tls_register);
Thanks,
Kirill
^ permalink raw reply
* Re: [RFC v3 net-next 13/18] net/sched: Introduce the TBS Qdisc
From: Thomas Gleixner @ 2018-03-21 22:29 UTC (permalink / raw)
To: Jesus Sanchez-Palencia
Cc: netdev, jhs, xiyou.wangcong, jiri, vinicius.gomes, richardcochran,
anna-maria, henrik, John Stultz, levi.pearson, edumazet, willemb,
mlichvar
In-Reply-To: <alpine.DEB.2.21.1803211407520.3754@nanos.tec.linutronix.de>
On Wed, 21 Mar 2018, Thomas Gleixner wrote:
> If you look at the use cases of TDM in various fields then FIFO mode is
> pretty much useless. In industrial/automotive fieldbus applications the
> various time slices are filled by different threads or even processes.
That brings me to a related question. The TDM cases I'm familiar with which
aim to use this utilize multiple periodic time slices, aka 802.1Qbv
time-aware scheduling.
Simple example:
[1a][1b][1c][1d] [1a][1b][1c][1d] [.....
[2a][2b] [2c][2d]
[3a] [3b]
[4a] [4b]
----------------------------------------------------------------------> t
where 1-4 is the slice level and a-d are network nodes.
In most cases the slice levels on a node are handled by different
applications or threads. Some of the protocols utilize dedicated time slice
levels - lets assume '4' in the above example - to run general network
traffic which might even be allowed to have collisions, i.e. [4a-d] would
become [4] and any node can send; the involved componets like switches are
supposed to handle that.
I'm not seing how TBS is going to assist with any of that. It requires
everything to be handled at the application level. Not really useful
especially not for general traffic which does not know about the scheduling
bands at all.
If you look at an industrial control node. It basically does:
queue_first_packet(tx, slice1);
while (!stop) {
if (wait_for_packet(rx) == ERROR)
goto errorhandling;
tx = do_computation(rx);
queue_next_tx(tx, slice1);
}
that's a pretty common pattern for these kind of applications. For audio
sources queue_next() might be triggered by the input sampler which needs to
be synchronized to the network slices anyway in order to work properly.
TBS per current implementation is nice as a proof of concept, but it solves
just a small portion of the complete problem space. I have the suspicion
that this was 'designed' to replace the user space hack in the AVNU stack
with something close to it. Not really a good plan to be honest.
I think what we really want is a strict periodic scheduler which supports
multiple slices as shown above because thats what all relevant TDM use
cases need: A/V, industrial fieldbusses .....
|---------------------------------------------------------|
| |
| TAS |<- Config
| 1 2 3 4 |
|---------------------------------------------------------|
| | | |
| | | |
| | | |
| | | |
[DirectSocket] [Qdisc FIFO] [Qdisc Prio] [Qdisc FIFO]
| | |
| | |
[Socket] [Socket] [General traffic]
The interesting thing here is that it does not require any time stamp
information brought in from the application. That's especially good for
general network traffic which is routed through a dedicated time slot. If
we don't have that then we need a user space scheduler which does exactly
the same thing and we have to route the general traffic out to user space
and back into the kernel, which is obviously a pointless exercise.
There are all kind of TDM schemes out there which are not directly driven
by applications, but rather route categorized traffic like VLANs through
dedicated time slices. That works pretty well with the above scheme because
in that case the applications might be completely oblivious about the tx
time schedule.
Surely there are protocols which do not utilize every time slice they could
use, so we need a way to tell the number of empty slices between two
consecutive packets. There are also different policies vs. the unused time
slices, like sending dummy frames or just nothing which wants to be
addressed, but I don't think that changes the general approach.
There might be some special cases for setup or node hotplug, but the
protocols I'm familiar with handle these in dedicated time slices or
through general traffic so it should just fit in.
I'm surely missing some details, but from my knowledge about the protocols
which want to utilize this, the general direction should be fine.
Feel free to tell me that I'm missing the point completely though :)
Thoughts?
Thanks,
tglx
^ permalink raw reply
* Re: [trivial PATCH V2] treewide: Align function definition open/close braces
From: Rafael J. Wysocki @ 2018-03-21 22:36 UTC (permalink / raw)
To: Joe Perches
Cc: open list:NETWORKING DRIVERS (WIRELESS), Alexandre Belloni,
the arch/x86 maintainers, Xiubo Li, Peter Zijlstra, Jeff Layton,
Will Deacon, Timur Tabi, dri-devel, Liam Girdwood,
J. Bruce Fields, Adaptec OEM Raid Solutions, H. Peter Anvin,
ACPI Devel Maling List, linux-rtc-u79uwXL29TY76Z2rM5mHXA,
David (ChunMing) Zhou, James E.J. Bottomley, Paul Moore,
open list:TARGET SUBSYSTEM, Darrick J. Wong, Dept-GELinuxNIC
In-Reply-To: <5ccbbf083e26bddfb4ea4f819ed62347ce266f39.1521669820.git.joe-6d6DIl74uiNBDgjK7y7TUQ@public.gmane.org>
On Wed, Mar 21, 2018 at 11:09 PM, Joe Perches <joe@perches.com> wrote:
> Some functions definitions have either the initial open brace and/or
> the closing brace outside of column 1.
>
> Move those braces to column 1.
>
> This allows various function analyzers like gnu complexity to work
> properly for these modified functions.
>
> Signed-off-by: Joe Perches <joe@perches.com>
> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
> Acked-by: Paul Moore <paul@paul-moore.com>
> Acked-by: Alex Deucher <alexander.deucher@amd.com>
> Acked-by: Dave Chinner <dchinner@redhat.com>
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
> Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
> Acked-by: Takashi Iwai <tiwai@suse.de>
> Acked-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
> ---
>
> git diff -w still shows no difference.
>
> This patch was sent but December and not applied.
>
> As the trivial maintainer seems not active, it'd be nice if
> Andrew Morton picks this up.
>
> V2: Remove fs/xfs/libxfs/xfs_alloc.c as it's updated and remerge the rest
>
> arch/x86/include/asm/atomic64_32.h | 2 +-
> drivers/acpi/custom_method.c | 2 +-
> drivers/acpi/fan.c | 2 +-
For the ACPI changes:
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox