Netdev List
* Re: [PATCH] debugfs: improve formatting of debugfs_real_fops()
From: Greg Kroah-Hartman @ 2016-11-10 17:51 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, Nicolai Stange, Christian Lamparter
In-Reply-To: <1478798629-22318-1-git-send-email-jakub.kicinski@netronome.com>

On Thu, Nov 10, 2016 at 05:23:49PM +0000, Jakub Kicinski wrote:
> Type of debugfs_real_fops() is longer than parameters and
> the name, so there is no way to break the declaration nicely.
> We have to go over 80 characters.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
>  include/linux/debugfs.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply

* Re: [PATCH net-next V5 3/9] liquidio CN23XX: Mailbox support
From: David Miller @ 2016-11-10 17:53 UTC (permalink / raw)
  To: Raghu.Vatsavayi; +Cc: netdev, Derek.Chickles, Satananda.Burla, Felix.Manlunas
In-Reply-To: <DM3PR07MB21382E9EA153F2C3FE1BC41881B80@DM3PR07MB2138.namprd07.prod.outlook.com>

From: "Vatsavayi, Raghu" <Raghu.Vatsavayi@cavium.com>
Date: Thu, 10 Nov 2016 17:44:50 +0000

> Thanks. Please let me know if you need any more changes; I will
> include them all in the next version.

Nobody is under any obligation to review your entire series every time
I or any other developer performs a review.

My time is extremely limited, and by the time I review a patch series
it is supposedly "ready".  So all I am able to do is review until I
find the first error and I will report it and push back on you.

Please make your reviewer expectations match reality.  Yes, this means
we might send you through several more iterations.  That's simply how
it works.

^ permalink raw reply

* Re: [PATCH net 0/2] qed: Fix RoCE infrastructure
From: David Miller @ 2016-11-10 17:55 UTC (permalink / raw)
  To: Yuval.Mintz-YGCgFSpz5w/QT0dZR+AlfA
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	dledford-H+wXaHxf7aLQT0dZR+AlfA,
	Ram.Amrani-YGCgFSpz5w/QT0dZR+AlfA
In-Reply-To: <1478724524-16940-1-git-send-email-Yuval.Mintz-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>

From: Yuval Mintz <Yuval.Mintz-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>
Date: Wed, 9 Nov 2016 22:48:42 +0200

> This series fixes 2 basic issues with RoCE support,
> one handles a missing configuration in the initial infrastructure
> support while the other is a regression introduced by one of the
> initial fix submissions.

Series applied, thanks.

^ permalink raw reply

* RE: [PATCH net-next V5 3/9] liquidio CN23XX: Mailbox support
From: Vatsavayi, Raghu @ 2016-11-10 17:44 UTC (permalink / raw)
  To: David Miller
  Cc: netdev@vger.kernel.org, Chickles, Derek, Burla, Satananda,
	Manlunas, Felix
In-Reply-To: <20161110.121910.154188943242358141.davem@davemloft.net>

Thanks. Please let me know if you need any more changes; I will include them all in the next version.

If this is the only change, please confirm so that I can send you the next patch with it soon.

Regards
Raghu.

> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Thursday, November 10, 2016 9:19 AM
> To: Vatsavayi, Raghu
> Cc: netdev@vger.kernel.org; Vatsavayi, Raghu; Chickles, Derek; Burla,
> Satananda; Manlunas, Felix
> Subject: Re: [PATCH net-next V5 3/9] liquidio CN23XX: Mailbox support
> 
> From: Raghu Vatsavayi <rvatsavayi@caviumnetworks.com>
> Date: Wed, 9 Nov 2016 15:13:46 -0800
> 
> > +		dev_info(&oct->pci_dev->dev,
> > +			 "PF changed VF's MAC address to
> %02hhx:%02hhx:%02hhx:%02hhx:%02hhx:%02hhx\n",
> > +			 mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
> 
> We have a standard mechanism to print network addresses, including MAC
> ones, via the %pXXX format specifier.  In this case you should use "%pM" or
> "%pm" depending upon whether you want the address printed with colon
> separators or not.
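
For reference, the two specifiers mentioned in the review can be illustrated outside the kernel. This is a minimal userspace sketch, not kernel code: format_mac() is a hypothetical helper that only mimics the text "%pM" (colon-separated) and "%pm" (no separators) produce. In the driver itself the six "%02hhx" conversions would collapse into one "%pM" with the mac pointer as its argument.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace illustration only. In the kernel the call would look like:
 *   dev_info(&oct->pci_dev->dev,
 *            "PF changed VF's MAC address to %pM\n", mac);
 * format_mac() is a hypothetical stand-in that reproduces the text of
 * "%pM" (with_colons != 0) or "%pm" (with_colons == 0).
 */
static int format_mac(char *buf, size_t len, const unsigned char mac[6],
		      int with_colons)
{
	if (with_colons)
		return snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
				mac[0], mac[1], mac[2],
				mac[3], mac[4], mac[5]);

	return snprintf(buf, len, "%02x%02x%02x%02x%02x%02x",
			mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

Besides being shorter, the single specifier keeps the format string on one line, which avoids the wrapped string literal seen in the quoted hunk.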

^ permalink raw reply

* Re: [patch net v2 0/2] mlxsw: Couple of router fixes
From: David Miller @ 2016-11-10 18:02 UTC (permalink / raw)
  To: jiri; +Cc: netdev, idosch, eladr, yotamg, nogahf, ogerlitz
In-Reply-To: <1478777465-6567-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Thu, 10 Nov 2016 12:31:03 +0100

> From: Jiri Pirko <jiri@mellanox.com>
> 
> ---
> v1->v2:
> - patch2:
>  - use net_eq

Series applied, thanks Jiri.

^ permalink raw reply

* Re: [PATCH net-next 0/2] Fixes for port refactoring
From: David Miller @ 2016-11-10 18:05 UTC (permalink / raw)
  To: andrew; +Cc: vivien.didelot, netdev
In-Reply-To: <1478789041-16264-1-git-send-email-andrew@lunn.ch>

From: Andrew Lunn <andrew@lunn.ch>
Date: Thu, 10 Nov 2016 15:43:59 +0100

> The patches which refactored setting up the switch MACs introduced a
> couple of regressions. The RGMII delays for a port can be set using
> mechanisms other than just phy-mode. Don't overwrite the delays unless
> explicitly asked to. This broke my Armada 370 RD. Also, the mv88e6351
> family supports setting RGMII delays, but is missing the necessary
> entries in the ops structures to allow this.
> 
> These fixes are to patches currently in net-next. No need for stable
> etc.

Series applied, thanks Andrew.

^ permalink raw reply

* Re: [patch net-next] mlxsw: spectrum_router: Add FIB abort warning
From: David Miller @ 2016-11-10 18:07 UTC (permalink / raw)
  To: jiri; +Cc: netdev, idosch, eladr, yotamg, nogahf, ogerlitz
In-Reply-To: <1478776229-4722-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Thu, 10 Nov 2016 12:10:29 +0100

> From: Jiri Pirko <jiri@mellanox.com>
> 
> Add a warning that the abort mechanism was triggered for device.
> Also avoid going through the procedure if abort was already done.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> Acked-by: Ido Schimmel <idosch@mellanox.com>

Applied.

^ permalink raw reply

* Re: [PATCH 1/2] net: mvpp2: don't bring up on MAC address set
From: Baruch Siach @ 2016-11-10 18:08 UTC (permalink / raw)
  To: David Miller; +Cc: mw, netdev, thomas.petazzoni, gregory.clement
In-Reply-To: <20161110.115718.1744727306693534849.davem@davemloft.net>

Hi David,

On Thu, Nov 10, 2016 at 11:57:18AM -0500, David Miller wrote:
> From: Baruch Siach <baruch@tkos.co.il>
> Date: Wed,  9 Nov 2016 14:56:33 +0200
> 
> > The current .ndo_set_mac_address implementation brings up the interface when
> > reverting to the original address succeeds after a failure. Fix this.
> > 
> > Signed-off-by: Baruch Siach <baruch@tkos.co.il>
> > ---
> > Untested; I don't have the hardware.
> 
> The code which updates the parser should keep the existing
> state if any part of the update fails.
> 
> This means it must attempt all memory allocations and whatever other
> resource acquisition is necessary, and only after all of those
> operations succeed and no more error cases are possible should it
> update the tables and release the old entry.
> 
> In other words, this whole mechanism must move to a proper "prepare
> --> commit" model of making changes.

Doing so is possible, but requires much larger changes in the driver code. 
This patch makes the minimal fix to a clear bug. It should be an improvement 
over the current state.

baruch
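
The "prepare --> commit" model requested in the review can be sketched in plain C. This is a generic userspace illustration under stated assumptions, not mvpp2 code: struct mac_entry, struct mac_table and all three functions are hypothetical names, and malloc() stands in for whatever resource acquisition the real parser update needs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one parser/table entry keyed by MAC address. */
struct mac_entry {
	unsigned char addr[6];
};

struct mac_table {
	struct mac_entry *cur;	/* entry currently in use, may be NULL */
};

/*
 * Prepare: do everything that can fail (here, only the allocation).
 * The table is not touched, so a failure leaves the old state intact.
 */
static struct mac_entry *mac_update_prepare(const unsigned char addr[6])
{
	struct mac_entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	memcpy(e->addr, addr, sizeof(e->addr));
	return e;
}

/* Commit: cannot fail; install the new entry and release the old one. */
static void mac_update_commit(struct mac_table *t, struct mac_entry *e)
{
	struct mac_entry *old = t->cur;

	t->cur = e;
	free(old);
}

/* Returns 0 on success, -1 on failure with the table left unchanged. */
static int mac_table_set(struct mac_table *t, const unsigned char addr[6])
{
	struct mac_entry *e = mac_update_prepare(addr);

	if (!e)
		return -1;
	mac_update_commit(t, e);
	return 0;
}
```

Because the commit step has no error path, there is never a need to revert to the original address, which is the situation the patch under review has to handle.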

-- 
     http://baruch.siach.name/blog/                  ~. .~   Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
   - baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il -

^ permalink raw reply

* Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling
From: Pravin Shelar @ 2016-11-10 18:10 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers
In-Reply-To: <20161110104703.7386ab29@griffin>

On Thu, Nov 10, 2016 at 1:47 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Wed, 9 Nov 2016 19:33:24 -0800, Pravin Shelar wrote:
>> I have moved the dst error handling to vxlan_build_skb(), which is
>> releasing the dst entry.
>
> Hmm, right, I missed that. But it's weird. The previous logic was
> confusing (skb is freed in case of error but not otherwise), I'm with
> you on this change. But having the same confusing logic with dst_entry
> instead doesn't improve things.
>
> Could we free both skb and dst_entry from the caller?
>

I wanted to do same, that is free dst and skb in caller function. But
that would need more changes due to discrepancy in IPv4 udp-tunnel and
IPv6 udp-tunnel api. IPv4 works on route entry and IPv6 needs dst
entry. so If caller frees dst-entry then I need additional variable to
keep track of dst entry which is what I am trying to avoid.

^ permalink raw reply

* Re: [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks.
From: Pravin Shelar @ 2016-11-10 18:10 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers
In-Reply-To: <20161110105639.6e8b70f7@griffin>

On Thu, Nov 10, 2016 at 1:56 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Wed, 9 Nov 2016 19:34:06 -0800, Pravin Shelar wrote:
>> Why it would not help in non-ovs vxlan egress path? It avoids checking
>> (if condition) for device loop.
>
> I may be missing something but I count the same number of conditions
> for each packet, they're just at a different place after the patch.
>
I am specifically talking about cached routes. If the dst entry is
cached, this patch avoids checking for device loop.

> E.g. for IPv4, the "if (!sock4)" is moved from vxlan_xmit_one into
> vxlan_get_route and the "rt" error handling stays logically the same
> (three if conditions in the non-error path) but is moved into
> vxlan_get_route. Similarly for IPv6.
>
>  Jiri

^ permalink raw reply

* Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling
From: Jiri Benc @ 2016-11-10 18:33 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Linux Kernel Network Developers
In-Reply-To: <CAOrHB_BdYpCpjrzHmNmEkvhXk7u6Gb_1aRyggeCrsL0Jx1nkvw@mail.gmail.com>

On Thu, 10 Nov 2016 10:10:09 -0800, Pravin Shelar wrote:
> I wanted to do same, that is free dst and skb in caller function. But
> that would need more changes due to discrepancy in IPv4 udp-tunnel and
> IPv6 udp-tunnel api. IPv4 works on route entry and IPv6 needs dst
> entry. so If caller frees dst-entry then I need additional variable to
> keep track of dst entry which is what I am trying to avoid.

Is additional variable really that bad? It's likely to be optimized by
the compiler and it will lead to less surprises. You obviously caught
me during review :-)

 Jiri

^ permalink raw reply

* Re: [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks.
From: Jiri Benc @ 2016-11-10 18:37 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Linux Kernel Network Developers
In-Reply-To: <CAOrHB_DPWiGah9vADm=SXXnWkiDcvsHdmy3RQ6f7ntjmjZOeDg@mail.gmail.com>

On Thu, 10 Nov 2016 10:10:25 -0800, Pravin Shelar wrote:
> I am specifically talking about cached routes. If the dst entry is
> cached, this patch avoids checking for device loop.

Okay, true, for cached routes we do less work with this patch.

This was more a side note anyway, the real comment was increasing of
the dev stats (which you already said you would fix).

Thanks!

 Jiri

^ permalink raw reply

* Re: [mm PATCH v3 07/23] arch/hexagon: Add option to skip DMA sync as a part of mapping
From: Richard Kuo @ 2016-11-10 18:40 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: linux-mm, akpm, linux-hexagon, netdev, linux-kernel
In-Reply-To: <20161110113452.76501.45864.stgit@ahduyck-blue-test.jf.intel.com>

On Thu, Nov 10, 2016 at 06:34:52AM -0500, Alexander Duyck wrote:
> This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
> avoid invoking cache line invalidation if the driver will just handle it
> later via a sync_for_cpu or sync_for_device call.
> 
> Cc: Richard Kuo <rkuo@codeaurora.org>
> Cc: linux-hexagon@vger.kernel.org
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
>  arch/hexagon/kernel/dma.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 

For Hexagon:

Acked-by: Richard Kuo <rkuo@codeaurora.org>



-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, 
a Linux Foundation Collaborative Project

^ permalink raw reply

* [PATCH net v2] igb: re-assign hw address pointer on reset after PCI error
From: Guilherme G. Piccoli @ 2016-11-10 18:46 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: jeffrey.t.kirsher, gpiccoli, netdev, alexander.duyck,
	todd.fujinaka

Whenever the igb driver detects that a read operation returned a value
composed only of F's (like 0xFFFFFFFF), it detaches the net_device,
clears the hw_addr pointer and warns the user that the adapter's link
is lost; those steps happen in igb_rd32().

In case a PCI error happens on Power architecture, there's a recovery
mechanism called EEH, that will reset the PCI slot and call driver's
handlers to reset the adapter and network functionality as well.

We observed that once hw_addr is NULL after the error is detected on
igb_rd32(), it's never assigned back, so in the process of resetting
the network functionality we got a NULL pointer dereference in both
igb_configure_tx_ring() and igb_configure_rx_ring(). To avoid this
bug, this patch re-assigns the hw_addr value in the slot_reset
handler.

Reported-by: Anthony H. Thai <ahthai@us.ibm.com>
Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index edc9a6a..136ee9e 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -7878,6 +7878,11 @@ static pci_ers_result_t igb_io_slot_reset(struct pci_dev *pdev)
 		pci_enable_wake(pdev, PCI_D3hot, 0);
 		pci_enable_wake(pdev, PCI_D3cold, 0);
 
+		/* In case of PCI error, adapter lose its HW address
+		 * so we should re-assign it here.
+		 */
+		hw->hw_addr = adapter->io_addr;
+
 		igb_reset(adapter);
 		wr32(E1000_WUS, ~0);
 		result = PCI_ERS_RESULT_RECOVERED;
-- 
2.1.0
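
The failure mode described in the commit message can be modelled in userspace. This is a simplified sketch under stated assumptions, not the igb code: struct hw_model, struct adapter_model, model_rd32() and model_slot_reset() are hypothetical names, and the real igb_rd32() does more than this model (it also detaches the net_device and warns the user).

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the driver structures named in the patch. */
struct hw_model {
	uint32_t *hw_addr;	/* cleared when the device looks gone */
};

struct adapter_model {
	uint32_t *io_addr;	/* mapping kept for the adapter's lifetime */
	struct hw_model hw;
};

/* Models the register read: an all-ones value is treated as a lost device. */
static uint32_t model_rd32(struct hw_model *hw, size_t reg)
{
	uint32_t val;

	if (!hw->hw_addr)
		return ~0u;
	val = hw->hw_addr[reg];
	if (val == 0xFFFFFFFFu)
		hw->hw_addr = NULL;
	return val;
}

/* Models the slot_reset fix: restore the pointer before reconfiguring. */
static void model_slot_reset(struct adapter_model *a)
{
	a->hw.hw_addr = a->io_addr;
}
```

Without the assignment in model_slot_reset(), every access after the first all-ones read would go through a NULL hw_addr, which corresponds to the dereference reported in the ring configuration functions.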

^ permalink raw reply related

* [Regression w/ patch] Restore network resistance to weird ICMP messages
From: Vicente Jiménez @ 2016-11-10 18:47 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy
  Cc: linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 2187 bytes --]

Handle weird ICMP fragmentation needed messages with next hop MTU
equal to (or exceeding) dropped packet size

Fixes: 46517008e116 ("ipv4: Kill ip_rt_frag_needed().")

In a large corporate network, we spotted this weird ICMP message after
lengthy troubleshooting. See the attached capture file. Those ICMP "network
unreachable - fragmentation needed and don't fragment bit set"
messages are sent by a router that drops 1500-byte IP packets and fills
the next hop MTU ICMP field with 1500.

Those messages cause the TCP connection to stall, but only on newer
kernels. Older kernels set the path MTU to 1492 and communicate
successfully.

After checking code and commit history, I spotted how commit
46517008e116 ("ipv4: Kill ip_rt_frag_needed().") from June 2012
changed ICMP messages handling by removing ip_rt_frag_needed function.

The relevant part of the ip_rt_frag_needed function that was removed is:

if (new_mtu < 68 || new_mtu >= old_mtu) {
        /* BSD 4.2 derived systems incorrectly adjust
         * tot_len by the IP header length, and report
         * a zero MTU in the ICMP message.
         */
        if (mtu == 0 &&
            old_mtu >= 68 + (iph->ihl << 2))
                old_mtu -= iph->ihl << 2;
        mtu = guess_mtu(old_mtu);
}


This condition handled the case where the next hop MTU was zero (less
than 68). Now this is handled by the protocol and fixed by commit
68b7107b6298 "ipv4: icmp: Fix pMTU handling for rare case".

But the rarest case, where new_mtu (the next hop MTU) >= old_mtu (the dropped
packet length), was also removed. This commit restores this check.
Instead of using a table lookup as guess_mtu() does, it just
sets the path MTU to the dropped packet size minus 2 bytes.

In our case, setting the path MTU to just 1498 (one iteration) worked.
This solution should converge to a good value in any case, in small
steps. I don't think there's a need for a more complex solution.

The patched kernel worked perfectly, setting the path MTU to 1498 from
the initial default interface value of 1500. This time I don't have a
capture file from inside the affected center, but all received packets
had a maximum size of 1498.
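
The check described above can be isolated as a small function. This is a minimal userspace sketch mirroring the logic of the attached patch: clamp_next_hop_mtu() is a hypothetical name, and the sketch assumes the dropped packet length is at least 2 (true for any valid IPv4 packet, whose header alone is 20 bytes), since smaller values would wrap around.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the check added by the attached patch:
 * if the advertised next-hop MTU is not smaller than the dropped
 * packet, fall back to the dropped packet size minus 2 bytes.
 * Assumes dropped_len >= 2.
 */
static uint32_t clamp_next_hop_mtu(uint32_t advertised_mtu,
				   uint16_t dropped_len)
{
	if (advertised_mtu >= dropped_len)
		return (uint32_t)dropped_len - 2;
	return advertised_mtu;
}
```

With a router that always advertises 1500, a sender whose packets keep being dropped would step from 1500 to 1498, then 1496, and so on, two bytes per ICMP message, until packets get through.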

-- 
cheers
vicente

[-- Attachment #2: 0001-ipv4-icmp-Fix-pMTU-handling-for-rarest-case.patch --]
[-- Type: application/octet-stream, Size: 1273 bytes --]

From 88ac49fa287e2e0d3d3bc805dea1fec301aad1df Mon Sep 17 00:00:00 2001
From: Vicente Jimenez Aguilar <googuy@gmail.com>
Date: Mon, 31 Oct 2016 13:10:29 +0100
Subject: [PATCH] ipv4: icmp: Fix pMTU handling for rarest case

Restore network resistance to weird ICMP fragmentation needed messages
with next hop MTU equal to (or exceeding) dropped packet size

Fixes: 46517008e116 ("ipv4: Kill ip_rt_frag_needed().")
Signed-off-by: Vicente Jimenez Aguilar <googuy@gmail.com>
---
 net/ipv4/icmp.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 38abe70..4c90d76 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -773,6 +773,7 @@ static bool icmp_tag_validation(int proto)
 static bool icmp_unreach(struct sk_buff *skb)
 {
 	const struct iphdr *iph;
+	unsigned short old_mtu;
 	struct icmphdr *icmph;
 	struct net *net;
 	u32 info = 0;
@@ -819,6 +820,12 @@ static bool icmp_unreach(struct sk_buff *skb)
 				/* fall through */
 			case 0:
 				info = ntohs(icmph->un.frag.mtu);
+				/* Handle weird case where next hop MTU is
+				 * equal to or exceeding dropped packet size
+				 */
+				old_mtu = ntohs(iph->tot_len);
+				if (info >= old_mtu)
+					info = old_mtu - 2;
 			}
 			break;
 		case ICMP_SR_FAILED:
-- 
2.7.3


[-- Attachment #3: ICMP discarting and sugesting 1500 2.pcapng --]
[-- Type: application/octet-stream, Size: 89664 bytes --]

^ permalink raw reply related

* Re: [Regression w/ patch] Restore network resistance to weird ICMP messages
From: David Miller @ 2016-11-10 19:07 UTC (permalink / raw)
  To: googuy; +Cc: kuznet, jmorris, yoshfuji, kaber, linux-kernel, netdev
In-Reply-To: <CAO1wt+Z12-0vDVy2yo_L24Uesa688-wLfDb-CUZ5qmfRUF982Q@mail.gmail.com>


This is still not formed properly.

Your Subject line should be of the form:

	Subject: $Subsystem: $Description

Something like "icmp: Restore resistance to abnormal messages."

Also, your patch must be inline, right after the commit message,
rather than included as an attachment.

Finally, your Fixes: line must be exactly right before the first
Signoff of your commit message.

^ permalink raw reply

* Re: [patch net-next 1/8] Introduce ife encapsulation module
From: David Miller @ 2016-11-10 19:17 UTC (permalink / raw)
  To: jiri
  Cc: netdev, yotamg, idosch, eladr, nogahf, ogerlitz, jhs,
	geert+renesas, stephen, xiyou.wangcong, linux, roopa
In-Reply-To: <1478776988-5400-2-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Thu, 10 Nov 2016 12:23:01 +0100

> +void *ife_encode(struct sk_buff *skb, u16 metalen)

metalen is u16

> +	metalen = htons(metalen);

htons() returns be16.

> +int ife_tlv_meta_encode(void *skbdata, u16 attrtype, u16 dlen, const void *dval)
> +{
> +	u32 *tlv = (u32 *) (skbdata);
 ...
> +	*tlv = htonl(htlv);

Similar situation here.
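
The point about mixing byte orders in one variable can be shown with a short userspace sketch. The names are assumptions for illustration: put_metalen() is hypothetical, and uint16_t stands in for the kernel's u16 and __be16 types (in the kernel, declaring the converted value as __be16 is what lets sparse flag accidental mixing).

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace sketch. metalen stays in host byte order for its whole
 * lifetime; the converted value gets its own variable instead of
 * being written back into the host-order parameter.
 */
static void put_metalen(uint8_t *dst, uint16_t metalen)
{
	uint16_t metalen_be = htons(metalen);	/* network byte order */

	memcpy(dst, &metalen_be, sizeof(metalen_be));
}
```

Keeping two variables costs nothing at runtime and makes it obvious at every use site which byte order a value is in.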

^ permalink raw reply

* Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling
From: Pravin Shelar @ 2016-11-10 19:21 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers
In-Reply-To: <20161110193347.0e81d68b@griffin>

On Thu, Nov 10, 2016 at 10:33 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Thu, 10 Nov 2016 10:10:09 -0800, Pravin Shelar wrote:
>> I wanted to do same, that is free dst and skb in caller function. But
>> that would need more changes due to discrepancy in IPv4 udp-tunnel and
>> IPv6 udp-tunnel api. IPv4 works on route entry and IPv6 needs dst
>> entry. so If caller frees dst-entry then I need additional variable to
>> keep track of dst entry which is what I am trying to avoid.
>
> Is additional variable really that bad? It's likely to be optimized by
> the compiler and it will lead to less surprises. You obviously caught
> me during review :-)
>
One additional variable is not bad, but look at what has happened in
vxlan_xmit_one(). There are already more than 20 variables defined, which
makes the code hard to read.
Anyway, I can add another variable to the function. I do not feel that
strongly about this.

^ permalink raw reply

* Re: BUG() can be hit in tcp_collapse()
From: Eric Dumazet @ 2016-11-10 19:26 UTC (permalink / raw)
  To: Vladis Dronov; +Cc: netdev, stable, Marco Grassi
In-Reply-To: <1478792652.8455.3.camel@edumazet-glaptop3.roam.corp.google.com>

On Thu, 2016-11-10 at 07:44 -0800, Eric Dumazet wrote:
> On Thu, 2016-11-10 at 09:47 -0500, Vladis Dronov wrote:
> > Hello,
> > 
> > It was discovered by Marco Grassi <marco.gra@gmail.com> (many thanks) that the
> > latest stable Linux kernel v4.8.6 is crashing in tcp_collapse() after making
> > certain syscalls:
> > 
> > [    9.622886] kernel BUG at net/ipv4/tcp_input.c:4813!
> > [    9.623299] invalid opcode: 0000 [#1] SMP
> > [    9.623642] Modules linked in: iptable_nat nf_nat_ipv4 nf_nat
> > [    9.624287] CPU: 2 PID: 2871 Comm: poc Not tainted 4.8.6 #2
> > [    9.624730] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
> > [    9.625459] task: ffff8801387b9a00 task.stack: ffff8801380e4000
> > [    9.625929] RIP: 0010:[<ffffffff8178d4ec>]  [<ffffffff8178d4ec>] tcp_collapse+0x3ac/0x3b0
> > [    9.626609] RSP: 0018:ffff8801380e7b78  EFLAGS: 00010282
> > [    9.627028] RAX: 00000000fffffff2 RBX: 0000000000000ec0 RCX: 0000000000000ec0
> > [    9.627587] RDX: ffff8801365cd000 RSI: 0000000000000000 RDI: ffff8801364106e0
> > [    9.628142] RBP: ffff8801380e7bc8 R08: 0000000000000000 R09: ffff88013b003300
> > [    9.628704] R10: ffff8801365cd000 R11: 0000000000000000 R12: 0000000000000ec0
> > [    9.629259] R13: ffff88013663ae00 R14: 00000000cdf0ca26 R15: ffff8801364106e0
> > [    9.629819] FS:  00007f2cef695800(0000) GS:ffff88013fc80000(0000) knlGS:0000000000000000
> > [    9.630945] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [    9.631655] CR2: 000000002002a000 CR3: 0000000139d46000 CR4: 00000000001406e0
> > [    9.632462] Stack:
> > [    9.632900]  0000000000000000 cdf0da2600000001 ffff880138050000 ffff8801380500a8
> > [    9.634138]  ffff880100000000 ffff880138050688 0000000000000900 ffff8801364136e0
> > [    9.635379]  ffff880138050000 ffff880138050688 ffff8801380e7c00 ffffffff8178d630
> > [    9.636622] Call Trace:
> > [    9.637087]  [<ffffffff8178d630>] tcp_try_rmem_schedule+0x140/0x380
> > [    9.637834]  [<ffffffff81791aa8>] tcp_data_queue+0x898/0xcf0
> > [    9.638538]  [<ffffffff8179210b>] tcp_rcv_established+0x20b/0x6c0
> > [    9.639268]  [<ffffffff81710143>] ? sk_reset_timer+0x13/0x30
> > [    9.639968]  [<ffffffff81813009>] tcp_v6_do_rcv+0x1b9/0x420
> > [    9.640666]  [<ffffffff81710b02>] __release_sock+0x82/0xf0
> > [    9.641353]  [<ffffffff81710b9b>] release_sock+0x2b/0x90
> > [    9.642029]  [<ffffffff817890ca>] tcp_sendmsg+0x55a/0xb60
> > [    9.642714]  [<ffffffff817b29d0>] inet_sendmsg+0x60/0x90
> > [    9.643389]  [<ffffffff8170c7b3>] sock_sendmsg+0x33/0x40
> > [    9.644064]  [<ffffffff8170ccee>] SYSC_sendto+0xee/0x160
> > [    9.645530]  [<ffffffff8170d6f9>] SyS_sendto+0x9/0x10
> > [    9.646190]  [<ffffffff81909df2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
> > [    9.646947] Code: 48 c7 07 00 00 00 00 48 89 42 08 48 89 10 e8 cc 7e f8 ff 49 8b 47 30 48 8b 80 80 01 00 00 65 48 ff 80 b0 01 00 00 e9 72 fd ff ff <0f> 0b 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fe 53 8b 
> > [    9.651794] RIP  [<ffffffff8178d4ec>] tcp_collapse+0x3ac/0x3b0
> > [    9.652554]  RSP <ffff8801380e7b78>
> > 
> > The reproducer is generated by the syzkaller, please, see attached. The
> > following BUG() is hit:
> > 
> > [net/ipv4/tcp_input.c]
> > static void
> > tcp_collapse(struct sock *sk, struct sk_buff_head *list,
> >              struct sk_buff *head, struct sk_buff *tail,
> >              u32 start, u32 end)
> > {
> > ...
> > /* Copy data, releasing collapsed skbs. */
> > while (copy > 0) {
> >         int offset = start - TCP_SKB_CB(skb)->seq;
> >         int size = TCP_SKB_CB(skb)->end_seq - start;
> > 
> >         BUG_ON(offset < 0);
> >         if (size > 0) {
> >                 size = min(copy, size);
> > 4812:           if (skb_copy_bits(skb, offset, skb_put(nskb, size), size))
> > 4813:                   BUG();
> > 
> > /usr/src/linux-4.8.6/net/ipv4/tcp_input.c: 4812
> > 0xffffffff8178d390 <tcp_collapse+0x250>:        mov    %r12d,%esi
> > 0xffffffff8178d393 <tcp_collapse+0x253>:        callq  0xffffffff81713ce0 <skb_put>
> > 0xffffffff8178d398 <tcp_collapse+0x258>:        mov    -0x30(%rbp),%r8d
> > 0xffffffff8178d39c <tcp_collapse+0x25c>:        mov    %r12d,%ecx
> > 0xffffffff8178d39f <tcp_collapse+0x25f>:        mov    %rax,%rdx
> > 0xffffffff8178d3a2 <tcp_collapse+0x262>:        mov    %r15,%rdi
> > 0xffffffff8178d3a5 <tcp_collapse+0x265>:        mov    %r8d,%esi
> > 0xffffffff8178d3a8 <tcp_collapse+0x268>:        callq  0xffffffff81714b90 <skb_copy_bits>
> > 0xffffffff8178d3ad <tcp_collapse+0x26d>:        test   %eax,%eax
> > 0xffffffff8178d3af <tcp_collapse+0x26f>:        jne    0xffffffff8178d4ec <tcp_collapse+0x3ac>
> > ...
> > /usr/src/linux-4.8.6/net/ipv4/tcp_input.c: 4813
> > 0xffffffff8178d4ec <tcp_collapse+0x3ac>:        ud2    
> > 
> > I have checked that the reproducer can cause hitting this BUG() in the kernels
> > since, at least v4.0. I was not checking the earlier kernels except RHEL-7 ones
> > (3.10.0-xxx) which are not vulnerable.
> > 
> > The upstream kernels since v4.9-rc1 are not vulnerable too and I have bisected
> > the repo to the commit c9c3321257 which fixes the issue.
> > 
> > $ git tag --contain c9c3321257e1b95be9b375f811fb250162af8d39
> > v4.9-rc1
> > 
> > Stable v4.8.6 kernel with the c9c3321257 commit applied does not hit the BUG(),
> > so I believe this commit should be backported to the stable branch. This commit
> > applies cleanly to the v4.8.6 tree with just line offsets.
> > 
> > Meanwhile, I see that commit c9c3321257 just increases(?) an skb buffer(?)
> > (which fixes hitting the BUG() for this exact reproducer), but does not fix the
> > real reason (so another set of syscalls still may cause hitting the BUG()). This
> > is why I'm emailing not only to stable@, but also to netdev@, asking to review the
> > data above and probably develop a fix.
> 
> I do not believe c9c3321257 is a proper fix for this issue.
> 
> It only works around the bug for this specific use case, probably
> because of :
> 
> +       /* In case all data was pulled from skb frags (in __pskb_pull_tail()),
> +        * we can fix skb->truesize to its real value to avoid future drops.
> +        * This is valid because skb is not yet charged to the socket.
> +        * It has been noticed pure SACK packets were sometimes dropped
> +        * (if cooked by drivers without copybreak feature).
> +        */
> +       if (!skb->data_len)
> +               skb->truesize = SKB_TRUESIZE(skb_end_offset(skb));
> 
> Meaning that (tcp_win_from_space(skb->truesize) > skb->len) expression
> result differs in tcp_collapse() later.

The issue is that sk_filter() truncates an incoming packet to a smaller
value.

Bad things happen because TCP_SKB_CB(skb)->end_seq is not updated.

I guess other issues would also happen if the truncation also removes
part of tcp header.

sk_filter_trim_cap(sk, skb, tcp_hlen) would be needed,
or sk_filter_trim_cap(sk, skb, skb->len) to only ACCEPT/DROP packets,
but no truncations.
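
The interaction described above can be modelled in userspace. This is a sketch under stated assumptions, not kernel code: struct seg_model and both functions are hypothetical, the trim rule (keep max(verdict, cap) bytes, never grow, verdict 0 means drop) is a simplified model of a capped filter, and SYN/FIN flags, which also consume sequence space, are ignored.

```c
#include <stdint.h>

/* Minimal model of a queued TCP segment (stand-in for an skb). */
struct seg_model {
	uint32_t seq;		/* first sequence number in the segment */
	uint32_t end_seq;	/* one past the last sequence number */
	uint32_t len;		/* bytes actually present (header + payload) */
};

/*
 * Models a socket filter verdict with a trim cap: a verdict of 0 drops
 * the packet, otherwise the packet is trimmed to max(verdict, cap) and
 * never grown. Returns 0 if accepted, -1 if dropped.
 */
static int filter_trim_cap_model(struct seg_model *s, uint32_t verdict,
				 uint32_t cap)
{
	uint32_t keep = verdict > cap ? verdict : cap;

	if (verdict == 0)
		return -1;
	if (keep < s->len)
		s->len = keep;
	return 0;
}

/* Non-zero when end_seq still matches the bytes that are present. */
static int seg_is_consistent(const struct seg_model *s, uint32_t hdr_len)
{
	return s->len >= hdr_len && s->end_seq - s->seq == s->len - hdr_len;
}
```

With a cap of 0 the filter can shorten the segment while end_seq keeps its old value, so later code that sizes a copy from the sequence numbers asks for bytes that are no longer there. With a cap equal to the packet length, the filter can only accept or drop.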

^ permalink raw reply

* Re: [patch net-next 5/8] Introduce sample tc action
From: John Fastabend @ 2016-11-10 19:35 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, yotamg, idosch, eladr, nogahf, ogerlitz, jhs,
	geert+renesas, stephen, xiyou.wangcong, linux, roopa
In-Reply-To: <1478776988-5400-6-git-send-email-jiri@resnulli.us>

On 16-11-10 03:23 AM, Jiri Pirko wrote:
> From: Yotam Gigi <yotamg@mellanox.com>
> 
> This action allows the user to sample traffic matched by a tc classifier.
> The sampling consists of choosing packets randomly, truncating them,
> adding some informative metadata regarding the interface and the original
> packet size, and marking them with a specific mark to allow further tc rules to
> match and process them. The marked sample packets are then injected into the
> device ingress qdisc using netif_receive_skb.
> 
> The packets metadata is packed using the ife encapsulation protocol, and
> the outer packet's ethernet dest, source and eth_type, along with the
> rate, mark and the optional truncation size can be configured from
> userspace.
> 
> Example:
> To sample ingress traffic from interface eth1 and redirect the
> sampled packets to interface dummy0, one may use the commands:
> 
> tc qdisc add dev eth1 handle ffff: ingress
> 
> tc filter add dev eth1 parent ffff: \
> 	   matchall action sample rate 12 mark 17
> 
> tc filter add parent ffff: dev eth1 protocol all \
> 	   u32 match mark 17 0xff \
> 	   action mirred egress redirect dev dummy0
> 
> Where the first command adds an ingress qdisc and the second starts
> sampling every 12'th packet on dev eth1 and marks the sampled packets with
> 17. The third command catches the sampled packets, which are marked with
> 17, and redirects them to dev dummy0.

The sampling algorithm was not randomized based on the above commit
log? It really needs to be for all the reasons Roopa mentioned earlier.
Did I miss some email on why it didn't get implemented?

Also there was an indication the already is actually implemented
correctly so don't we need the hw/sw to behave the same. The whole
argument about sw/hw parity, etc.

> 
> Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> ---
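
The difference between "every 12'th packet" and randomized sampling raised in this reply can be sketched in userspace. This is an illustration under stated assumptions, not the proposed tc action: all function names are hypothetical, xorshift32 is a swapped-in generator chosen only to keep the sketch self-contained (its state must be non-zero), rate must be non-zero, and the modulo introduces a slight bias that a real implementation might want to avoid.

```c
#include <stdint.h>

/* Deterministic sampler: picks exactly every rate-th packet. */
static int sample_every_nth(uint32_t *counter, uint32_t rate)
{
	if (++(*counter) < rate)
		return 0;
	*counter = 0;
	return 1;
}

/* Small xorshift32 generator; *state must be non-zero. */
static uint32_t xorshift32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/* Randomized sampler: each packet is picked with probability ~1/rate. */
static int sample_random(uint32_t *rng_state, uint32_t rate)
{
	return xorshift32(rng_state) % rate == 0;
}
```

Both samplers pick one packet in rate on average, but the deterministic one can stay synchronized with periodic traffic and consistently miss or over-represent a flow, which is one common argument for randomizing.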

^ permalink raw reply

* Re: [patch net-next 5/8] Introduce sample tc action
From: John Fastabend @ 2016-11-10 19:38 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, yotamg, idosch, eladr, nogahf, ogerlitz, jhs,
	geert+renesas, stephen, xiyou.wangcong, linux, roopa
In-Reply-To: <5824CBFE.5030502@gmail.com>

On 16-11-10 11:35 AM, John Fastabend wrote:
> On 16-11-10 03:23 AM, Jiri Pirko wrote:
>> From: Yotam Gigi <yotamg@mellanox.com>
>>
>> This action allows the user to sample traffic matched by a tc classifier.
>> The sampling consists of choosing packets randomly, truncating them,
>> adding some informative metadata regarding the interface and the original
>> packet size, and marking them with a specific mark to allow further tc rules to
>> match and process them. The marked sample packets are then injected into the
>> device ingress qdisc using netif_receive_skb.
>>
>> The packets metadata is packed using the ife encapsulation protocol, and
>> the outer packet's ethernet dest, source and eth_type, along with the
>> rate, mark and the optional truncation size can be configured from
>> userspace.
>>
>> Example:
>> To sample ingress traffic from interface eth1 and redirect the
>> sampled packets to interface dummy0, one may use the commands:
>>
>> tc qdisc add dev eth1 handle ffff: ingress
>>
>> tc filter add dev eth1 parent ffff: \
>> 	   matchall action sample rate 12 mark 17
>>
>> tc filter add parent ffff: dev eth1 protocol all \
>> 	   u32 match mark 17 0xff \
>> 	   action mirred egress redirect dev dummy0
>>
>> Where the first command adds an ingress qdisc and the second starts
>> sampling every 12'th packet on dev eth1 and marks the sampled packets with
>> 17. The third command catches the sampled packets, which are marked with
>> 17, and redirects them to dev dummy0.
> 
> The sampling algorithm was not randomized based on the above commit
> log? It really needs to be for all the reasons Roopa mentioned earlier.
> Did I miss some email on why it didn't get implemented?
> 
> Also there was an indication the already is actually implemented
> correctly so don't we need the hw/sw to behave the same. The whole
> argument about sw/hw parity, etc.

Sorry, bit of a typo there; corrected 2nd paragraph here:

Also there was an indication the hardware is already implemented
correctly, so don't we need the hw/sw to behave the same? The argument
about sw/hw parity, etc.
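
To make the distinction concrete, here is a minimal user-space sketch
contrasting deterministic every-Nth selection with randomized 1-in-N
selection. It is not the act_sample code: sample_every_nth(),
sample_random() and the xorshift32 generator are names and choices
assumed purely for illustration, and a kernel implementation would use
its own PRNG.

```c
#include <stdbool.h>
#include <stdint.h>

/* Deterministic: picks exactly every rate'th packet, so periodic
 * traffic can be systematically over- or under-sampled.
 */
struct nth_state {
	uint32_t count;
};

static bool sample_every_nth(struct nth_state *s, uint32_t rate)
{
	if (!rate)
		return false;
	if (++s->count < rate)
		return false;
	s->count = 0;
	return true;
}

/* xorshift32: tiny PRNG used only for this illustration.
 * The state must be seeded with a non-zero value.
 */
static uint32_t xorshift32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/* Randomized: each packet is picked independently with probability
 * of roughly 1/rate, so the average rate is the same but the picks
 * are not aligned with any period in the traffic.
 */
static bool sample_random(uint32_t *prng_state, uint32_t rate)
{
	return rate && (xorshift32(prng_state) % rate) == 0;
}
```

With rate 12, the first function always selects packets 12, 24, 36 and
so on, while the second selects about one packet in twelve on average
with no fixed spacing.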

> 
>>
>> Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>> ---
> 

^ permalink raw reply

* Re: BUG() can be hit in tcp_collapse()
From: Eric Dumazet @ 2016-11-10 19:49 UTC (permalink / raw)
  To: Vladis Dronov; +Cc: netdev, stable, Marco Grassi
In-Reply-To: <1478805992.8455.20.camel@edumazet-glaptop3.roam.corp.google.com>

On Thu, 2016-11-10 at 11:26 -0800, Eric Dumazet wrote:

> The issue is that sk_filter() truncates an incoming packet to a smaller
> value.
> 
> Bad things happen because TCP_SKB_CB(skb)->end_seq is not updated.
> 
> I guess other issues would also happen if the truncation also removes
> part of tcp header.
> 
> sk_filter_trim_cap(sk, skb, tcp_hlen) would be needed,
> or sk_filter_trim_cap(sk, skb, skb->len) to only ACCEPT/DROP packets,
> but no truncations.

Something like :

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 61b7be303eec..0b8f575eefaa 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1676,7 +1676,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 
 	nf_reset(skb);
 
-	if (sk_filter(sk, skb))
+	if (sk_filter_trim_cap(sk, skb, skb->len))
 		goto discard_and_relse;
 
 	skb->dev = NULL;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 6ca23c2e76f7..2c7a6f7f1113 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1229,7 +1229,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 	if (skb->protocol == htons(ETH_P_IP))
 		return tcp_v4_do_rcv(sk, skb);
 
-	if (sk_filter(sk, skb))
+	if (sk_filter_trim_cap(sk, skb, skb->len))
 		goto discard;
 
 	/*

^ permalink raw reply related

* Re: [PATCH] net: ethernet: ti: davinci_cpdma: fix fixed prio cpdma ctlr configuration
From: ivan.khoronzhuk @ 2016-11-10 19:52 UTC (permalink / raw)
  To: Grygorii Strashko, Ivan Khoronzhuk, mugunthanvnm, netdev
  Cc: linux-omap, linux-kernel
In-Reply-To: <37548a54-c54b-3194-691a-22fa8118ed3d@ti.com>



On 10.11.16 18:37, Grygorii Strashko wrote:
>
>
> On 11/09/2016 05:56 PM, Ivan Khoronzhuk wrote:
>>
>>
>> On 09.11.16 23:09, Grygorii Strashko wrote:
>>>
>>>
>>> On 11/08/2016 07:10 AM, Ivan Khoronzhuk wrote:
>>>> The dma ctlr is reseted to 0 while cpdma start, thus cpdma ctlr
>>>
>>> I assume this is because cpdma_ctlr_start() does soft reset. Is it
>>> correct?
>> Probably not. I've seen this register doesn't hold any previous settings
>> (just trash)
>
> What register CPDMA_DMACONTROL or CPSW_DMACTRL?
>
>> after cpdma_ctlr_stop(), actually after last channel is stopped (inside
>> of cpdma_ctlr_stop()).
>
> So, You are stating that Registers context is changed after stop?
>
>> Then cpdma_ctlr_start() just reset it to 0.
>
> "just trash" or "0".
>
> Sry, I do not see how cpdma_ctlr_stop() can affect on registers state :(
> and I'd very appreciated if can provide more detailed information.
>
I've checked again, it was my mistake. It is simply reset to 0 during the soft reset.
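
A rough model of the behaviour discussed here, in plain C: the control
register is cleared by the soft reset, so whatever was configured
before has to be written back on start. This is only a sketch; the
struct, the bit position and the fake_* function names are hypothetical
stand-ins, not the real davinci_cpdma code or register layout.

```c
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define CTRL_TX_PRIO_FIXED	(1u << 0)

struct fake_cpdma {
	uint32_t dmacontrol;	/* stands in for the hw control register */
	uint32_t saved_ctrl;	/* software copy kept by the cpdma layer */
};

/* Settings are recorded in software as well as written to "hw". */
static void fake_control_set(struct fake_cpdma *c, uint32_t bits)
{
	c->saved_ctrl |= bits;
	c->dmacontrol |= bits;
}

/* The soft reset clears the control register. */
static void fake_soft_reset(struct fake_cpdma *c)
{
	c->dmacontrol = 0;
}

/* Start: reset, then restore what was configured earlier, so callers
 * need not repeat fake_control_set() after every stop/start cycle.
 */
static void fake_ctlr_start(struct fake_cpdma *c)
{
	fake_soft_reset(c);
	c->dmacontrol = c->saved_ctrl;
}
```

Keeping the saved copy inside the cpdma layer means each place that
stops and restarts the controller gets the same register contents back
without duplicating the setup calls.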

>>
>>>
>>>> cannot be configured after cpdma is stopped. So, restore content
>>>> of cpdma ctlr while off/on procedure.
>>>>
>>>> Based on net-next/master
>>>
>>> ^ remove it
>> sure
>>
>>>
>>>>
>>>> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>>>> ---
>>>>  drivers/net/ethernet/ti/cpsw.c          |   6 +-
>>>>  drivers/net/ethernet/ti/davinci_cpdma.c | 103
>>>> +++++++++++++++++---------------
>>>>  drivers/net/ethernet/ti/davinci_cpdma.h |   2 +
>>>>  3 files changed, 58 insertions(+), 53 deletions(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/ti/cpsw.c
>>>> b/drivers/net/ethernet/ti/cpsw.c
>>>> index b1ddf89..4d04b8e 100644
>>>> --- a/drivers/net/ethernet/ti/cpsw.c
>>>> +++ b/drivers/net/ethernet/ti/cpsw.c
>>>> @@ -1376,10 +1376,6 @@ static int cpsw_ndo_open(struct net_device *ndev)
>>>>                    ALE_ALL_PORTS, ALE_ALL_PORTS, 0, 0);
>>>>
>>>>      if (!cpsw_common_res_usage_state(cpsw)) {
>>>> -        /* setup tx dma to fixed prio and zero offset */
>>>> -        cpdma_control_set(cpsw->dma, CPDMA_TX_PRIO_FIXED, 1);
>>>> -        cpdma_control_set(cpsw->dma, CPDMA_RX_BUFFER_OFFSET, 0);
>>>> -
>>>>          /* disable priority elevation */
>>>>          __raw_writel(0, &cpsw->regs->ptype);
>>>>
>>>> @@ -2710,6 +2706,8 @@ static int cpsw_probe(struct platform_device
>>>> *pdev)
>>>>      dma_params.desc_align        = 16;
>>>>      dma_params.has_ext_regs        = true;
>>>>      dma_params.desc_hw_addr         = dma_params.desc_mem_phys;
>>>> +    dma_params.rxbuf_offset        = 0;
>>>> +    dma_params.fixed_prio        = 1;
>>>
>>> Do we really need this new parameters? Do you have plans to use other
>>> values?
>> I'm ok if this is static (equally as a bunch of rest in dma_params), no
>> see reason to use other values,
>
> That's what i wanted to know :) - go static, pls.
>
>> it at least now. But the issue is not only in these two parameters and
>> not only in cpsw_ndo_open().
>> It touches cpsw_set_channels() also, where ctlr stop/start is present.
>> In order to not copy cpdma_control_set(cpsw->dma, CPDMA_TX_PRIO_FIXED,
>> 1)...
>> in all such kind places in eth drivers, better to allow the cpdma to
>> control it's parameters...
>>
>> The cpdma ctlr register holds a little more parameters (but only two of
>> them are set in cpsw)
>> Maybe there is reason to save them also. Actually I'd seen this problem
>> when playing with
>> on/off channel shapers, unfortunately the cpdma ctlr holds this info
>> also, and it was lost
>> while on/off (but I'm going to restore it in chan_start()).
>>
>
> I understand you change, but I'm note sure about real root cause :(
So, are you OK with the current version?

>

^ permalink raw reply

* RE: [patch net-next 1/8] Introduce ife encapsulation module
From: Yotam Gigi @ 2016-11-10 19:52 UTC (permalink / raw)
  To: David Miller, jiri@resnulli.us
  Cc: netdev@vger.kernel.org, Ido Schimmel, Elad Raz, Nogah Frankel,
	Or Gerlitz, jhs@mojatatu.com, geert+renesas@glider.be,
	stephen@networkplumber.org, xiyou.wangcong@gmail.com,
	linux@roeck-us.net, roopa@cumulusnetworks.com
In-Reply-To: <20161110.141731.1693686577994851858.davem@davemloft.net>

>-----Original Message-----
>From: David Miller [mailto:davem@davemloft.net]
>Sent: Thursday, November 10, 2016 9:18 PM
>To: jiri@resnulli.us
>Cc: netdev@vger.kernel.org; Yotam Gigi <yotamg@mellanox.com>; Ido Schimmel
><idosch@mellanox.com>; Elad Raz <eladr@mellanox.com>; Nogah Frankel
><nogahf@mellanox.com>; Or Gerlitz <ogerlitz@mellanox.com>;
>jhs@mojatatu.com; geert+renesas@glider.be; stephen@networkplumber.org;
>xiyou.wangcong@gmail.com; linux@roeck-us.net; roopa@cumulusnetworks.com
>Subject: Re: [patch net-next 1/8] Introduce ife encapsulation module
>
>From: Jiri Pirko <jiri@resnulli.us>
>Date: Thu, 10 Nov 2016 12:23:01 +0100
>
>> +void *ife_encode(struct sk_buff *skb, u16 metalen)
>
>metalen is u16
>
>> +	metalen = htons(metalen);
>
>htons() returns be16.
>
>> +int ife_tlv_meta_encode(void *skbdata, u16 attrtype, u16 dlen, const void
>*dval)
>> +{
>> +	u32 *tlv = (u32 *) (skbdata);
> ...
>> +	*tlv = htonl(htlv);
>
>Similar situation here.

I will fix those and we will send a v2.
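
For illustration, the pattern being pointed at is a host-order variable
being reused to hold a big-endian value. A minimal user-space sketch of
keeping the two apart follows. It is not the ife code: be16_t, be32_t
and the encode_* helpers are made-up names, and the TLV layout (type in
the upper 16 bits, length in the lower 16) is an assumption for the
example. In the kernel the annotated types are __be16 and __be32, which
sparse can check.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Plain typedefs here; they only document intent in user space. */
typedef uint16_t be16_t;	/* big-endian, as seen on the wire */
typedef uint32_t be32_t;

/* Write a host-order length as a big-endian 16-bit field. */
static void encode_len16(uint8_t *buf, uint16_t host_len)
{
	/* Separate variable: host_len itself stays in host order. */
	be16_t wire_len = htons(host_len);

	memcpy(buf, &wire_len, sizeof(wire_len));
}

/* Pack type and length into one 32-bit word, then convert once. */
static void encode_tlv32(uint8_t *buf, uint16_t type, uint16_t len)
{
	uint32_t host_tlv = ((uint32_t)type << 16) | len;
	be32_t wire_tlv = htonl(host_tlv);

	memcpy(buf, &wire_tlv, sizeof(wire_tlv));
}
```

The point is only that the converted value gets its own variable of the
wire type, so the host-order one is never silently overwritten.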

^ permalink raw reply

* Re: BUG() can be hit in tcp_collapse()
From: Eric Dumazet @ 2016-11-10 20:13 UTC (permalink / raw)
  To: Vladis Dronov; +Cc: netdev, stable, Marco Grassi
In-Reply-To: <1478807366.8455.21.camel@edumazet-glaptop3.roam.corp.google.com>

On Thu, 2016-11-10 at 11:49 -0800, Eric Dumazet wrote:
> On Thu, 2016-11-10 at 11:26 -0800, Eric Dumazet wrote:
> 
> > The issue is that sk_filter() truncates an incoming packet to a smaller
> > value.
> > 
> > Bad things happen because TCP_SKB_CB(skb)->end_seq is not updated.
> > 
> > I guess other issues would also happen if the truncation also removes
> > part of tcp header.
> > 
> > sk_filter_trim_cap(sk, skb, tcp_hlen) would be needed,
> > or sk_filter_trim_cap(sk, skb, skb->len) to only ACCEPT/DROP packets,
> > but no truncations.
> 
> Something like :

Another sk_filter() is used in tcp v6.
So the correct patch would be :

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 61b7be303eec..0b8f575eefaa 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1676,7 +1676,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 
 	nf_reset(skb);
 
-	if (sk_filter(sk, skb))
+	if (sk_filter_trim_cap(sk, skb, skb->len))
 		goto discard_and_relse;
 
 	skb->dev = NULL;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 6ca23c2e76f7..96525649a397 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1229,7 +1229,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 	if (skb->protocol == htons(ETH_P_IP))
 		return tcp_v4_do_rcv(sk, skb);
 
-	if (sk_filter(sk, skb))
+	if (sk_filter_trim_cap(sk, skb, skb->len))
 		goto discard;
 
 	/*
@@ -1457,7 +1457,7 @@ static int tcp_v6_rcv(struct sk_buff *skb)
 	if (tcp_v6_inbound_md5_hash(sk, skb))
 		goto discard_and_relse;
 
-	if (sk_filter(sk, skb))
+	if (sk_filter_trim_cap(sk, skb, skb->len))
 		goto discard_and_relse;
 
 	skb->dev = NULL;
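
To illustrate why passing skb->len as the cap avoids the problem, here
is a simplified user-space model. It is an assumption-laden sketch, not
kernel code: struct fake_skb, fake_filter_trim_cap() and
seq_consistent() are hypothetical, header bytes and SYN/FIN accounting
are ignored, and the only thing modelled is that a filter verdict may
shorten the packet but never below the cap.

```c
#include <stdint.h>

struct fake_skb {
	uint32_t seq;		/* first sequence number */
	uint32_t end_seq;	/* computed once, from the original length */
	uint32_t len;		/* current payload length */
};

/* verdict: 0 means drop, otherwise the number of bytes to keep.
 * The packet is never trimmed below cap.
 * Returns 0 on accept, -1 on drop.
 */
static int fake_filter_trim_cap(struct fake_skb *skb, uint32_t verdict,
				uint32_t cap)
{
	uint32_t keep;

	if (!verdict)
		return -1;
	keep = verdict > cap ? verdict : cap;
	if (keep < skb->len)
		skb->len = keep;
	return 0;
}

/* The invariant that breaks when a filter truncates the payload. */
static int seq_consistent(const struct fake_skb *skb)
{
	return skb->end_seq - skb->seq == skb->len;
}
```

With cap 0, a verdict of 40 on a 100-byte packet leaves len at 40 while
end_seq still covers 100 bytes. With cap equal to the packet length,
the same verdict leaves the packet untouched, so the filter can only
accept or drop.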

^ permalink raw reply related

