Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: TCP socket receives strange packet
From: Christoph Paasch @ 2014-10-14 16:54 UTC (permalink / raw)
  To: David Miller
  Cc: johnwheffner, eric.dumazet, Yurij.Plotnikov, netdev,
	Alexandra.Kossovsky
In-Reply-To: <20141014.125004.1504941413142780868.davem@davemloft.net>

On 14/10/14 - 12:50:04, David Miller wrote:
> From: Christoph Paasch <christoph.paasch@gmail.com>
> Date: Tue, 14 Oct 2014 09:41:44 -0700
> 
> > there was a long discussion whether for the updated version of RFC1323 (now
> > published as RFC 7323) a segment must be dropped if it does not contain a
> > timestamp. The rationale (defended by Joe Touch) was that it must be there to
> > protect against wrapped sequence numbers while others argued that mandating
> > a drop might result in stalling connections if (for one reason or another) a
> > host sends a segment without TS (or a middlebox removed it).
> > 
> > The RFC now says that a host SHOULD drop segments without timestamps.
> 
> There are too many middle-boxes that drop timestamps for that to be a tenable
> way to behave, especially by default.
> 
> If you want to be disconnected from various parts of the internet, feel free
> to follow that RFC's recommendations.

I'm completely with you. We actually argued against Joe Touch, who wanted to
have a  "MUST" for dropping these segments without TS.


Cheers,
Christoph

^ permalink raw reply

* [PATCH V2 1/2 net-next] caif_usb: remove redundant memory message
From: Fabian Frederick @ 2014-10-14 17:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joe Perches, Fabian Frederick, Dmitry Tarnyagin, David S. Miller,
	netdev

Let MM subsystem display out of memory messages.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
---
V2: add second patch with memset fix

 net/caif/caif_usb.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/caif/caif_usb.c b/net/caif/caif_usb.c
index ba02db0..0e487b0 100644
--- a/net/caif/caif_usb.c
+++ b/net/caif/caif_usb.c
@@ -87,10 +87,9 @@ static struct cflayer *cfusbl_create(int phyid, u8 ethaddr[ETH_ALEN],
 {
 	struct cfusbl *this = kmalloc(sizeof(struct cfusbl), GFP_ATOMIC);
 
-	if (!this) {
-		pr_warn("Out of memory\n");
+	if (!this)
 		return NULL;
-	}
+
 	caif_assert(offsetof(struct cfusbl, layer) == 0);
 
 	memset(this, 0, sizeof(struct cflayer));
-- 
1.9.3

^ permalink raw reply related

* [PATCH V2 2/2 net-next] caif_usb: use target structure member in memset
From: Fabian Frederick @ 2014-10-14 17:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joe Perches, Fabian Frederick, Dmitry Tarnyagin, David S. Miller,
	netdev

parent cfusbl was used instead of first structure member 'layer'

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
---
 net/caif/caif_usb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/caif/caif_usb.c b/net/caif/caif_usb.c
index 0e487b0..5cd44f0 100644
--- a/net/caif/caif_usb.c
+++ b/net/caif/caif_usb.c
@@ -92,7 +92,7 @@ static struct cflayer *cfusbl_create(int phyid, u8 ethaddr[ETH_ALEN],
 
 	caif_assert(offsetof(struct cfusbl, layer) == 0);
 
-	memset(this, 0, sizeof(struct cflayer));
+	memset(&this->layer, 0, sizeof(this->layer));
 	this->layer.receive = cfusbl_receive;
 	this->layer.transmit = cfusbl_transmit;
 	this->layer.ctrlcmd = cfusbl_ctrlcmd;
-- 
1.9.3

^ permalink raw reply related

* Re: [PATCH v2] fm10k: Add skb->xmit_more support
From: David Miller @ 2014-10-14 17:09 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: alexander.duyck, e1000-devel, netdev, matthew.vick
In-Reply-To: <1413026083.2427.59.camel@jtkirshe-mobl>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Sat, 11 Oct 2014 04:14:43 -0700

> On Fri, 2014-10-10 at 14:30 -0700, alexander.duyck@gmail.com wrote:
>> From: Alexander Duyck <alexander.h.duyck@redhat.com>
>> 
>> This change adds support for skb->xmit_more based on the changes that
>> were
>> made to igb to support the feature.  The main changes are moving up
>> the
>> check for maybe_stop_tx so that we can check netif_xmit_stopped to
>> determine
>> if we must write the tail because we can add no further buffers.
>> 
>> Acked-by: Matthew Vick <matthew.vick@intel.com>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
>> ---
>> 
>> v2: Minor fix to patch description for typo.
>> 
>>  drivers/net/ethernet/intel/fm10k/fm10k_main.c |   65
>> +++++++++++++------------
>>  1 file changed, 34 insertions(+), 31 deletions(-)
> 
> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> 
> Its like you never left Intel Alex... :-)
> Dave, please pull this.

Applied, thanks everyone.

^ permalink raw reply

* Re: [PATCH net] skbuff: fix ftrace handling in skb_unshare
From: David Miller @ 2014-10-14 17:10 UTC (permalink / raw)
  To: alex.aring; +Cc: netdev, kernel
In-Reply-To: <1412975447-29958-1-git-send-email-alex.aring@gmail.com>

From: Alexander Aring <alex.aring@gmail.com>
Date: Fri, 10 Oct 2014 23:10:47 +0200

> If the skb is not dropped afterwards we should run consume_skb instead
> kfree_skb. Inside of function skb_unshare we do always a kfree_skb,
> doesn't depend if skb_copy failed or was successful.
> 
> This patch switch this behaviour like skb_share_check, if allocation of
> sk_buff failed we use kfree_skb otherwise consume_skb.
> 
> Signed-off-by: Alexander Aring <alex.aring@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net] tcp: fix ooo_okay setting vs Small Queues
From: David Miller @ 2014-10-14 17:12 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1412989595.9362.30.camel@edumazet-glaptop2.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 10 Oct 2014 18:06:35 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> TCP Small Queues (tcp_tsq_handler()) can hold one reference on
> sk->sk_wmem_alloc, preventing skb->ooo_okay being set.
> 
> We should relax test done to set skb->ooo_okay to take care
> of this extra reference.
> 
> Minimal truesize of skb containing one byte of payload is
> SKB_TRUESIZE(1)
> 
> Without this fix, we have more chance locking flows into the wrong
> transmit queue.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [PATCH v2 net] x86: bpf_jit: fix two bugs in eBPF JIT compiler
From: David Miller @ 2014-10-14 17:13 UTC (permalink / raw)
  To: ast
  Cc: darrick.wong, edumazet, dborkman, hpa, tglx, mingo, netdev,
	linux-kernel
In-Reply-To: <1412998223-3805-1-git-send-email-ast@plumgrid.com>

From: Alexei Starovoitov <ast@plumgrid.com>
Date: Fri, 10 Oct 2014 20:30:23 -0700

> 1.
> JIT compiler using multi-pass approach to converge to final image size,
> since x86 instructions are variable length. It starts with large
> gaps between instructions (so some jumps may use imm32 instead of imm8)
> and iterates until total program size is the same as in previous pass.
> This algorithm works only if program size is strictly decreasing.
> Programs that use LD_ABS insn need additional code in prologue, but it
> was not emitted during 1st pass, so there was a chance that 2nd pass would
> adjust imm32->imm8 jump offsets to the same number of bytes as increase in
> prologue, which may cause algorithm to erroneously decide that size converged.
> Fix it by always emitting largest prologue in the first pass which
> is detected by oldproglen==0 check.
> Also change error check condition 'proglen != oldproglen' to fail gracefully.
> 
> 2.
> while staring at the code realized that 64-byte buffer may not be enough
> when 1st insn is large, so increase it to 128 to avoid buffer overflow
> (theoretical maximum size of prologue+div is 109) and add runtime check.
> 
> Fixes: 622582786c9e ("net: filter: x86: internal BPF JIT")
> Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
> ---
> v1->v2: reduce chances of stack corruption in case of future bugs (suggested by Eric)
> 
> note in classic BPF programs 1st insn is always short move, but native eBPF
> programs may trigger buffer overflow. I couldn't force the crash with overflow,
> since there are no further calls while this part of stack is used.
> Both are ugly bugs regardless.
> When net-next opens I will add narrowed down testcase from 'nmap' to testsuite.

Applied, thanks Alexei.

^ permalink raw reply

* Re: [PATCH 1/1 linux-next] zd1201: replace kmalloc/memset by kzalloc
From: Joe Perches @ 2014-10-14 17:15 UTC (permalink / raw)
  To: Fabian Frederick; +Cc: linux-kernel, John W. Linville, linux-wireless, netdev
In-Reply-To: <1413304831-7203-1-git-send-email-fabf@skynet.be>

On Tue, 2014-10-14 at 18:40 +0200, Fabian Frederick wrote:
> Signed-off-by: Fabian Frederick <fabf@skynet.be>

It might be clearer to use a structure for this 16 byte thing.

There's a comment bit in the code:

/* cmdreq message: 
	u32 type
	u16 cmd
	u16 parm0
	u16 parm1
	u16 parm2
	u8  pad[4]

	total: 4 + 2 + 2 + 2 + 2 + 4 = 16
*/

It seems thought that this first u32 is actually
a series of u8s.

Maybe:

struct zd1201_cmdreq {
	u8	type;
	u8	seq;
	u8	a;
	u8	b;
	__le16	cmd;
	__le16	parm0;
	__le16	parm1;
	__le16	parm2;
	u8	pad[4];
};

And does this really need to be alloc'd?

maybe

struct zd1202_cmdreq request = {};

etc...

^ permalink raw reply

* Re: [PATCH] ipv6: remove aca_lock spinlock from struct ifacaddr6
From: David Miller @ 2014-10-14 17:16 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev
In-Reply-To: <1413003814-14216-1-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
Date: Sat, 11 Oct 2014 13:03:34 +0800

> From: Li RongQing <roy.qing.li@gmail.com>
> 
> no user uses this lock.
> 
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

Applied, thanks.

^ permalink raw reply

* Fwd: micrel: ksz8051 badly detected as ksz8031
From: Angelo Dureghello @ 2014-10-14 17:24 UTC (permalink / raw)
  To: netdev@vger.kernel.org
In-Reply-To: <543C2954.5010609@gmail.com>

Dear,

have to apologize for the confusion, previous patch is not the proper fix,
since it is not solving completely the issue.

And also, i mainly misunderstood the issue.

The issue i am experiencing is :

https://lkml.org/lkml/2013/9/18/259

Mainly, i have Micrel chip marked KSZ8051(RNL), but the product Id in the
silicon is KSZ8031 and linux detects it as KSZ8031.
The attmept to mdio boot override register kill the Micrel functionality.

So i just replaced the phy_id code (hardcoded) in the code mdio detecion
routine.

Also, i am not giving to the Micrel any external 50Mhz clock, but
as per default, the Micrel is giving the clock out to the davinci-emac.
So no fixups are needed for my case.

But i still have a last issue now: i see link is up 100Mbit, but no
packets are really sent, and nothing is received.
Led links are up.

Time zone set
Starting network...
davinci_mdio davinci_mdio.0: resetting idled controller
net eth0: attached PHY driver [Micrel KSZ8051] 
(mii_bus:phy_addr=davinci_mdio-0:00, id=221550)
udhcpc (v1.20.2) started
Sending discover...
davinci_emac davinci_emac.1 eth0: Link is Up - 100Mbps/Full - flow 
control off
Sending discover...
Sending discover...
No lease, failing
....

[root@barix ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:08:E1:03:2A:C5
           inet6 addr: fe80::208:e1ff:fe03:2ac5/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:0 (0.0 B)  TX bytes:2258 (2.2 KiB)
           Interrupt:33

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:65536  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Packets seems sent but they are not sent at all (checking from WS)
and no packets are received at the same time.

Reagrds,
Angelo

^ permalink raw reply

* Re: [PATCH v2] netfilter: release skbuf when nlmsg put fail
From: Florian Westphal @ 2014-10-14 17:27 UTC (permalink / raw)
  To: Houcheng Lin
  Cc: Florian Westphal, pablo, Patrick McHardy, kadlec, davem,
	netfilter-devel, coreteam, netdev, Linux Kernel Mailing List
In-Reply-To: <CAL8JtxARGy6V1=m_4RJmeS5O0D0PD75cv+q2NrbmBr7ZGKeE4w@mail.gmail.com>

Houcheng Lin <houcheng@gmail.com> wrote:
> 2014-10-14 18:49 GMT+08:00 Florian Westphal <fw@strlen.de>:
> > Houcheng Lin <houcheng@gmail.com> wrote:
> >> When system is under heavy loading, the __nfulnl_send() may may failed
> >> to put nlmsg into skbuf of nfulnl_instance. If not clear the skbuff on failed,
> >> the __nfulnl_send() will still try to put next nlmsg onto this half-full skbuf
> >> and cause the user program can never receive packet.
> >>
> >> This patch fix this issue by releasing skbuf immediately after nlmst put
> >> failed.
> >
> > Could you please try this patch on top of this one and see if the
> > WARN_ON goes away?
> >
> > Thanks
> >
> > diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
> > --- a/net/netfilter/nfnetlink_log.c
> > +++ b/net/netfilter/nfnetlink_log.c
> > @@ -649,7 +649,8 @@ nfulnl_log_packet(struct net *net,
> >                 + nla_total_size(sizeof(u_int32_t))     /* gid */
> >                 + nla_total_size(plen)                  /* prefix */
> >                 + nla_total_size(sizeof(struct nfulnl_msg_packet_hw))
> > -               + nla_total_size(sizeof(struct nfulnl_msg_packet_timestamp));
> > +               + nla_total_size(sizeof(struct nfulnl_msg_packet_timestamp))
> > +               + nla_total_size(sizeof(struct nfgenmsg));      /* NLMSG_DONE */
> >
> >         if (in && skb_mac_header_was_set(skb)) {
> >                 size +=   nla_total_size(skb->dev->hard_header_len)
> > @@ -692,8 +693,7 @@ nfulnl_log_packet(struct net *net,
> >                 goto unlock_and_release;
> >         }
> >
> > -       if (inst->skb &&
> > -           size > skb_tailroom(inst->skb) - sizeof(struct nfgenmsg)) {
> > +       if (inst->skb && size > skb_tailroom(inst->skb)) {
> >                 /* either the queue len is too high or we don't have
> >                  * enough room in the skb left. flush to userspace. */
> >                 __nfulnl_flush(inst);
> Hi Florian,
> The modified code seems won't affect the program flow: Size is add a
> extra value,
> sizeof(struct nfgenmsg), during initialization. comparison size with tailroom
> space, the right-side value also add the same value.

There are two changes:

sizeof(struct nfgenmsg) is not the same as nla_total_size(sizeof(struct
nfgenmsg)).

Also, adding it means we consider the DONE space when allocationg the
skb.

^ permalink raw reply

* Re: [PATCH 1/1 linux-next] zd1201: replace kmalloc/memset by kzalloc
From: Bjørn Mork @ 2014-10-14 18:08 UTC (permalink / raw)
  To: Joe Perches
  Cc: Fabian Frederick, linux-kernel, John W. Linville, linux-wireless,
	netdev
In-Reply-To: <1413306955.3269.30.camel@joe-AO725>

Joe Perches <joe@perches.com> writes:

> And does this really need to be alloc'd?

Yes, it does. It is used as a transfer_buffer in usb_fill_bulk_urb() and
must be "suitable for DMA".  From include/linux/usb.h:

/**
 * struct urb - USB Request Block
..
 * @transfer_buffer:  This identifies the buffer to (or from) which the I/O
 *      request will be performed unless URB_NO_TRANSFER_DMA_MAP is set
 *      (however, do not leave garbage in transfer_buffer even then).
 *      This buffer must be suitable for DMA; allocate it with
 *      kmalloc() or equivalent.  For transfers to "in" endpoints, contents
 *      of this buffer will be modified.  This buffer is used for the data
 *      stage of control transfers.

Bjørn

^ permalink raw reply

* [PATCH] dsa: mv88e6171: Fix tag_protocol check
From: Guenter Roeck @ 2014-10-14 18:21 UTC (permalink / raw)
  To: David S. Miller
  Cc: Florian Fainelli, netdev, linux-kernel, Guenter Roeck,
	Andrew Lunn

tag_protocol is now an enum, so drivers have to check against it.

Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/net/dsa/mv88e6171.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index 6365e30..1020a7a 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -206,7 +206,7 @@ static int mv88e6171_setup_port(struct dsa_switch *ds, int p)
 	 */
 	val = 0x0433;
 	if (dsa_is_cpu_port(ds, p)) {
-		if (ds->dst->tag_protocol == htons(ETH_P_EDSA))
+		if (ds->dst->tag_protocol == DSA_TAG_PROTO_EDSA)
 			val |= 0x3300;
 		else
 			val |= 0x0100;
-- 
1.9.1

^ permalink raw reply related

* Re: [PATCH] dsa: mv88e6171: Fix tag_protocol check
From: Florian Fainelli @ 2014-10-14 18:23 UTC (permalink / raw)
  To: Guenter Roeck, David S. Miller; +Cc: netdev, linux-kernel, Andrew Lunn
In-Reply-To: <1413310864-16830-1-git-send-email-linux@roeck-us.net>

On 10/14/2014 11:21 AM, Guenter Roeck wrote:
> tag_protocol is now an enum, so drivers have to check against it.

Good catch, this driver got merged right in the middle of my changes
from __be16 to an enum and this slipped through.

> 
> Cc: Andrew Lunn <andrew@lunn.ch>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Fixes: 42f272539487e49c9ea830ad97db41eb9937d5dc ("net: DSA: Marvell
mv88e6171 switch driver")
Acked-by: Florian Fainelli <f.fainelli@gmail.com>

> ---
>  drivers/net/dsa/mv88e6171.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
> index 6365e30..1020a7a 100644
> --- a/drivers/net/dsa/mv88e6171.c
> +++ b/drivers/net/dsa/mv88e6171.c
> @@ -206,7 +206,7 @@ static int mv88e6171_setup_port(struct dsa_switch *ds, int p)
>  	 */
>  	val = 0x0433;
>  	if (dsa_is_cpu_port(ds, p)) {
> -		if (ds->dst->tag_protocol == htons(ETH_P_EDSA))
> +		if (ds->dst->tag_protocol == DSA_TAG_PROTO_EDSA)
>  			val |= 0x3300;
>  		else
>  			val |= 0x0100;
> 

^ permalink raw reply

* RE: [PATCH] net: fec: ptp: fix convergence issue to support LinuxPTP stack
From: Frank.Li @ 2014-10-14 18:27 UTC (permalink / raw)
  To: David Miller, richardcochran@gmail.com
  Cc: fugang.duan@freescale.com, netdev@vger.kernel.org,
	bhutchings@solarflare.com
In-Reply-To: <20141014.125239.1562178476120218308.davem@davemloft.net>



> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Tuesday, October 14, 2014 11:53 AM
> To: richardcochran@gmail.com
> Cc: Duan Fugang-B38611; netdev@vger.kernel.org; bhutchings@solarflare.com; Li
> Frank-B20596
> Subject: Re: [PATCH] net: fec: ptp: fix convergence issue to support LinuxPTP
> stack
> 
> From: Richard Cochran <richardcochran@gmail.com>
> Date: Tue, 14 Oct 2014 12:36:10 +0200
> 
> > On Tue, Oct 14, 2014 at 01:39:47PM +0800, Fugang Duan wrote:
> >> IEEE 1588 module has one hw issue in capturing the ATVR register.
> >> According to the user manual it is:
> >> 		ENET0->ATCR |= ENET_ATCR_CAPTURE_MASK;
> >> 		while(ENET0->ATCR & ENET_ATCR_CAPTURE_MASK);
> >> 		ts_counter_ns = ENET0->ATVR;
> >
> > ...
> >
> >> diff --git a/drivers/net/ethernet/freescale/fec_ptp.c
> >> b/drivers/net/ethernet/freescale/fec_ptp.c
> >> index cca3617..380bb10 100644
> >> --- a/drivers/net/ethernet/freescale/fec_ptp.c
> >> +++ b/drivers/net/ethernet/freescale/fec_ptp.c
> >> @@ -82,12 +82,17 @@ static cycle_t fec_ptp_read(const struct
> >> cyclecounter *cc)  {
> >>  	struct fec_enet_private *fep =
> >>  		container_of(cc, struct fec_enet_private, cc);
> >> +	const struct platform_device_id *id_entry =
> >> +		platform_get_device_id(fep->pdev);
> >>  	u32 tempval;
> >>
> >>  	tempval = readl(fep->hwp + FEC_ATIME_CTRL);
> >>  	tempval |= FEC_T_CTRL_CAPTURE;
> >>  	writel(tempval, fep->hwp + FEC_ATIME_CTRL);
> >>
> >> +	if (id_entry->driver_data & FEC_QUIRK_BUG_CAPTURE)
> >> +		udelay(1);
> >> +
> >
> > What? You never had
> >
> > 	while(ENET0->ATCR & ENET_ATCR_CAPTURE_MASK);
> >
> > in the first place. Maybe you should try that.
> >
> > Did this code ever work? I guess not.
> 
> Also see Luwei Zhou's series:
> 
> http://patchwork.ozlabs.org/patch/398467/
> http://patchwork.ozlabs.org/patch/398465/
> http://patchwork.ozlabs.org/patch/398466/
> 
> Let's get down to the bottom of all of this before I apply anything.  Does this
> fec PTP code work at all?

Yes, Luwei's 3 patch tested at MX6Q, MX6DL platform.

Only i.MX6SX has the problem, which fixed by fugang's patch.

i.MX6SX's ptp have the same issue even though without luwei's patch.

So luwei's patch is independent with fugang's patch.

Luwei's patch use hardware ptp adjustment method.
Fugang's patch fix the problem existed in MX6SX platform.

So I suggest apply luwei's 3 patches firstly.

Best regards
Frank Li

^ permalink raw reply

* Re: [PATCH] net: fec: ptp: fix convergence issue to support LinuxPTP stack
From: Richard Cochran @ 2014-10-14 18:39 UTC (permalink / raw)
  To: Frank.Li@freescale.com
  Cc: David Miller, fugang.duan@freescale.com, netdev@vger.kernel.org,
	bhutchings@solarflare.com
In-Reply-To: <7e6db0b336a340bd80b62475fca284fc@BLUPR03MB486.namprd03.prod.outlook.com>

Frank,

On Tue, Oct 14, 2014 at 06:27:28PM +0000, Frank.Li@freescale.com wrote:
> Fugang's patch fix the problem existed in MX6SX platform.

You say that Fugang's patch fixes a quirk in the SX device.

But he said that the user's manual tells us to wait for some status
bits to clear. (See the third line, with the 'while' key word).

> > > On Tue, Oct 14, 2014 at 01:39:47PM +0800, Fugang Duan wrote:
> > >> IEEE 1588 module has one hw issue in capturing the ATVR register.
> > >> According to the user manual it is:
> > >> 		ENET0->ATCR |= ENET_ATCR_CAPTURE_MASK;
> > >> 		while(ENET0->ATCR & ENET_ATCR_CAPTURE_MASK);
> > >> 		ts_counter_ns = ENET0->ATVR;

That applies to all IMX devices, doesn't it?
In any case, Fugang's change log is not clear.

Confused,
Richard

^ permalink raw reply

* [PATCH RFC net-next] sfc: add support for skb->xmit_more
From: Edward Cree @ 2014-10-14 18:41 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Edward Cree, davem, nikolay, netdev, Shradha Shah, Jon Cooper,
	linux-net-drivers
In-Reply-To: <543C2588.6060604@redhat.com>

Don't ring the doorbell, and don't do PIO.  This will also prevent
 TX Push, because there will be more than one buffer waiting when
 the doorbell is rung.

Signed-off-by: Edward Cree <ecree@solarflare.com>
---

 drivers/net/ethernet/sfc/nic.h | 29 +++++++++++++++++++++++-----
 drivers/net/ethernet/sfc/tx.c  | 43 +++++++++++++++++-------------------------
 2 files changed, 41 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index 60f8514..f77cce0 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -71,9 +71,17 @@ efx_tx_desc(struct efx_tx_queue *tx_queue, unsigned int index)
 	return ((efx_qword_t *) (tx_queue->txd.buf.addr)) + index;
 }
 
-/* Report whether the NIC considers this TX queue empty, given the
- * write_count used for the last doorbell push.  May return false
- * negative.
+/* Get partner of a TX queue, seen as part of the same net core queue */
+static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue)
+{
+	if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD)
+		return tx_queue - EFX_TXQ_TYPE_OFFLOAD;
+	else
+		return tx_queue + EFX_TXQ_TYPE_OFFLOAD;
+}
+
+/* Report whether this TX queue would be empty for the given write_count.
+ * May return false negative.
  */
 static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue,
 					 unsigned int write_count)
@@ -86,9 +94,18 @@ static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue,
 	return ((empty_read_count ^ write_count) & ~EFX_EMPTY_COUNT_VALID) == 0;
 }
 
-static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue)
+/* Decide whether we can use TX PIO, ie. write packet data directly into
+ * a buffer on the device.  This can reduce latency at the expense of
+ * throughput, so we only do this if both hardware and software TX rings
+ * are empty.  This also ensures that only one packet at a time can be
+ * using the PIO buffer.
+ */
+static inline bool efx_nic_may_tx_pio(struct efx_tx_queue *tx_queue)
 {
-	return __efx_nic_tx_is_empty(tx_queue, tx_queue->write_count);
+	struct efx_tx_queue *partner = efx_tx_queue_partner(tx_queue);
+	return tx_queue->piobuf &&
+	       __efx_nic_tx_is_empty(tx_queue, tx_queue->insert_count) &&
+	       __efx_nic_tx_is_empty(partner, partner->insert_count);
 }
 
 /* Decide whether to push a TX descriptor to the NIC vs merely writing
@@ -96,6 +113,8 @@ static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue)
  * descriptor to an empty queue, but is otherwise pointless.  Further,
  * Falcon and Siena have hardware bugs (SF bug 33851) that may be
  * triggered if we don't check this.
+ * We use the write_count used for the last doorbell push, to get the
+ * NIC's view of the tx queue.
  */
 static inline bool efx_nic_may_push_tx_desc(struct efx_tx_queue *tx_queue,
 					    unsigned int write_count)
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index 3206098..aaf2987 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -132,15 +132,6 @@ unsigned int efx_tx_max_skb_descs(struct efx_nic *efx)
 	return max_descs;
 }
 
-/* Get partner of a TX queue, seen as part of the same net core queue */
-static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue)
-{
-	if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD)
-		return tx_queue - EFX_TXQ_TYPE_OFFLOAD;
-	else
-		return tx_queue + EFX_TXQ_TYPE_OFFLOAD;
-}
-
 static void efx_tx_maybe_stop_queue(struct efx_tx_queue *txq1)
 {
 	/* We need to consider both queues that the net core sees as one */
@@ -344,6 +335,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 	struct efx_nic *efx = tx_queue->efx;
 	struct device *dma_dev = &efx->pci_dev->dev;
 	struct efx_tx_buffer *buffer;
+	unsigned int old_insert_count = tx_queue->insert_count;
 	skb_frag_t *fragment;
 	unsigned int len, unmap_len = 0;
 	dma_addr_t dma_addr, unmap_addr = 0;
@@ -351,8 +343,6 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 	unsigned short dma_flags;
 	int i = 0;
 
-	EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
-
 	if (skb_shinfo(skb)->gso_size)
 		return efx_enqueue_skb_tso(tx_queue, skb);
 
@@ -369,9 +359,8 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 
 	/* Consider using PIO for short packets */
 #ifdef EFX_USE_PIO
-	if (skb->len <= efx_piobuf_size && tx_queue->piobuf &&
-	    efx_nic_tx_is_empty(tx_queue) &&
-	    efx_nic_tx_is_empty(efx_tx_queue_partner(tx_queue))) {
+	if (skb->len <= efx_piobuf_size && !skb->xmit_more &&
+	    efx_nic_may_tx_pio(tx_queue)) {
 		buffer = efx_enqueue_skb_pio(tx_queue, skb);
 		dma_flags = EFX_TX_BUF_OPTION;
 		goto finish_packet;
@@ -439,13 +428,14 @@ finish_packet:
 
 	netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
 
+	efx_tx_maybe_stop_queue(tx_queue);
+
 	/* Pass off to hardware */
-	efx_nic_push_buffers(tx_queue);
+	if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+		efx_nic_push_buffers(tx_queue);
 
 	tx_queue->tx_packets++;
 
-	efx_tx_maybe_stop_queue(tx_queue);
-
 	return NETDEV_TX_OK;
 
  dma_err:
@@ -458,7 +448,7 @@ finish_packet:
 	dev_kfree_skb_any(skb);
 
 	/* Work backwards until we hit the original insert pointer value */
-	while (tx_queue->insert_count != tx_queue->write_count) {
+	while (tx_queue->insert_count != old_insert_count) {
 		unsigned int pkts_compl = 0, bytes_compl = 0;
 		--tx_queue->insert_count;
 		buffer = __efx_tx_queue_get_insert_buffer(tx_queue);
@@ -989,12 +979,13 @@ static int efx_tso_put_header(struct efx_tx_queue *tx_queue,
 /* Remove buffers put into a tx_queue.  None of the buffers must have
  * an skb attached.
  */
-static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue)
+static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue,
+			       unsigned int insert_count)
 {
 	struct efx_tx_buffer *buffer;
 
 	/* Work backwards until we hit the original insert pointer value */
-	while (tx_queue->insert_count != tx_queue->write_count) {
+	while (tx_queue->insert_count != insert_count) {
 		--tx_queue->insert_count;
 		buffer = __efx_tx_queue_get_insert_buffer(tx_queue);
 		efx_dequeue_buffer(tx_queue, buffer, NULL, NULL);
@@ -1258,14 +1249,13 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 			       struct sk_buff *skb)
 {
 	struct efx_nic *efx = tx_queue->efx;
+	unsigned int old_insert_count = tx_queue->insert_count;
 	int frag_i, rc;
 	struct tso_state state;
 
 	/* Find the packet protocol and sanity-check it */
 	state.protocol = efx_tso_check_protocol(skb);
 
-	EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count);
-
 	rc = tso_start(&state, efx, skb);
 	if (rc)
 		goto mem_err;
@@ -1308,11 +1298,12 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 
 	netdev_tx_sent_queue(tx_queue->core_txq, skb->len);
 
-	/* Pass off to hardware */
-	efx_nic_push_buffers(tx_queue);
-
 	efx_tx_maybe_stop_queue(tx_queue);
 
+	/* Pass off to hardware */
+	if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq))
+		efx_nic_push_buffers(tx_queue);
+
 	tx_queue->tso_bursts++;
 	return NETDEV_TX_OK;
 
@@ -1336,6 +1327,6 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 		dma_unmap_single(&efx->pci_dev->dev, state.header_dma_addr,
 				 state.header_unmap_len, DMA_TO_DEVICE);
 
-	efx_enqueue_unwind(tx_queue);
+	efx_enqueue_unwind(tx_queue, old_insert_count);
 	return NETDEV_TX_OK;
 }
-- 
1.7.11.7

^ permalink raw reply related

* RE: [PATCH] net: fec: ptp: fix convergence issue to support LinuxPTP stack
From: Frank.Li @ 2014-10-14 18:43 UTC (permalink / raw)
  To: Richard Cochran
  Cc: David Miller, fugang.duan@freescale.com, netdev@vger.kernel.org,
	bhutchings@solarflare.com
In-Reply-To: <20141014183933.GA14216@localhost.localdomain>



> -----Original Message-----
> From: Richard Cochran [mailto:richardcochran@gmail.com]
> Sent: Tuesday, October 14, 2014 1:40 PM
> To: Li Frank-B20596
> Cc: David Miller; Duan Fugang-B38611; netdev@vger.kernel.org;
> bhutchings@solarflare.com
> Subject: Re: [PATCH] net: fec: ptp: fix convergence issue to support LinuxPTP
> stack
> 
> Frank,
> 
> On Tue, Oct 14, 2014 at 06:27:28PM +0000, Frank.Li@freescale.com wrote:
> > Fugang's patch fix the problem existed in MX6SX platform.
> 
> You say that Fugang's patch fixes a quirk in the SX device.
> 
> But he said that the user's manual tells us to wait for some status bits to
> clear. (See the third line, with the 'while' key word).
> 
> > > > On Tue, Oct 14, 2014 at 01:39:47PM +0800, Fugang Duan wrote:
> > > >> IEEE 1588 module has one hw issue in capturing the ATVR register.
> > > >> According to the user manual it is:
> > > >> 		ENET0->ATCR |= ENET_ATCR_CAPTURE_MASK;
> > > >> 		while(ENET0->ATCR & ENET_ATCR_CAPTURE_MASK);
> > > >> 		ts_counter_ns = ENET0->ATVR;
> 
> That applies to all IMX devices, doesn't it?
> In any case, Fugang's change log is not clear.

Only MX6 SX. Only MX6 SX added FEC_QUIRK_BUG_CAPTURE.

.driver_data = FEC_QUIRK_ENET_MAC | FEC_QUIRK_HAS_GBIT |
 				FEC_QUIRK_HAS_BUFDESC_EX | FEC_QUIRK_HAS_CSUM |
 				FEC_QUIRK_HAS_VLAN | FEC_QUIRK_HAS_AVB |
-				FEC_QUIRK_ERR007885,
+				FEC_QUIRK_ERR007885 | FEC_QUIRK_BUG_CAPTURE,

> 
> Confused,
> Richard

^ permalink raw reply

* Re: [PATCH v3 0/3] Enable FEC pps feather
From: Richard Cochran @ 2014-10-14 18:45 UTC (permalink / raw)
  To: Luwei Zhou
  Cc: davem, netdev, shawn.guo, bhutchings, R49496, b38611, b20596,
	stephen
In-Reply-To: <1412918130-18830-1-git-send-email-b45643@freescale.com>


I had to think a lot about PPS feathers
and wonder what color they might be.

Finally I got it: you meant "feature" :^)

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH v3 0/3] Enable FEC pps feather
From: David Miller @ 2014-10-14 18:45 UTC (permalink / raw)
  To: b45643
  Cc: richardcochran, netdev, shawn.guo, bhutchings, R49496, b38611,
	b20596, stephen
In-Reply-To: <1412918130-18830-1-git-send-email-b45643@freescale.com>

From: Luwei Zhou <b45643@freescale.com>
Date: Fri, 10 Oct 2014 13:15:27 +0800

> Change from v2 to v3:
> 	-Using the default channel 0 to be PPS channel not PTP_PIN_SET/GETFUNC interface.
> 	-Using the linux definition of NSEC_PER_SEC.
> 
> Change from v1 to v2:
> 	- Fix the potential 32-bit multiplication overflow issue.
> 	- Optimize the hareware adjustment code to improve efficiency as Richard suggested
> 	- Use ptp PTP_PIN_SET/GETFUNC interface to set PPS channel not device tree
> 	and add PTP_PF_PPS enumeration
> 	- Modify comments style

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH] net: fec: ptp: fix convergence issue to support LinuxPTP stack
From: Richard Cochran @ 2014-10-14 18:49 UTC (permalink / raw)
  To: Frank.Li@freescale.com
  Cc: David Miller, fugang.duan@freescale.com, netdev@vger.kernel.org,
	bhutchings@solarflare.com
In-Reply-To: <6b1256b73b7a4f978573a77ec3e36831@BLUPR03MB486.namprd03.prod.outlook.com>

On Tue, Oct 14, 2014 at 06:43:51PM +0000, Frank.Li@freescale.com wrote:
> Only MX6 SX. Only MX6 SX added FEC_QUIRK_BUG_CAPTURE.

But what about this comment:

+/* ENET Block Guide/ Chapter for the iMX6SLX (PELE) address one issue:
+ * Incorrect behavior for ENET_ATCR[Capture and Restart Bits]. These bits will
+ * always read a value zero. When these bits are set to 1'b1, these should hold
+ * value 1'b1 until the counter value is capture in the register clock domain.
+ */

It sounds like the bits are "sticky" until the counter value has been
latched. Therefore you need to check the bits before reading the
counter, but the code does not do this for the case of !SX.

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH net-next RFC 0/3] virtio-net: Conditionally enable tx interrupt
From: David Miller @ 2014-10-14 18:53 UTC (permalink / raw)
  To: jasowang-H+wXaHxf7aLQT0dZR+AlfA
  Cc: rusty-8n+1lVoiYb80n/F98K4Iww, mst-H+wXaHxf7aLQT0dZR+AlfA,
	virtualization-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1413011806-3813-1-git-send-email-jasowang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

From: Jason Wang <jasowang-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Date: Sat, 11 Oct 2014 15:16:43 +0800

> We free old transmitted packets in ndo_start_xmit() currently, so any
> packet must be orphaned also there. This was used to reduce the overhead of
> tx interrupt to achieve better performance. But this may not work for some
> protocols such as TCP stream. TCP depends on the value of sk_wmem_alloc to
> implement various optimization for small packets stream such as TCP small
> queue and auto corking. But orphaning packets early in ndo_start_xmit()
> disable such things more or less since sk_wmem_alloc was not accurate. This
> lead extra low throughput for TCP stream of small writes.
> 
> This series tries to solve this issue by enable tx interrupts for all TCP
> packets other than the ones with push bit or pure ACK. This is done through
> the support of urgent descriptor which can force an interrupt for a
> specified packet. If tx interrupt was enabled for a packet, there's no need
> to orphan it in ndo_start_xmit(), we can free it tx napi which is scheduled
> by tx interrupt. Then sk_wmem_alloc was more accurate than before and TCP
> can batch more for small write. More larger skb was produced by TCP in this
> case to improve both throughput and cpu utilization.
> 
> Test shows great improvements on small write tcp streams. For most of the
> other cases, the throughput and cpu utilization are the same in the
> past. Only few cases, more cpu utilization was noticed which needs more
> investigation.
> 
> Review and comments are welcomed.

I think proper accounting and queueing (at all levels, not just TCP
sockets) is more important than trying to skim a bunch of cycles by
avoiding TX interrupts.

Having an event to free the SKB is absolutely essential for the stack
to operate correctly.

And with virtio-net you don't even have the excuse of "the HW
unfortunately doesn't have an appropriate TX event."

So please don't play games, and instead use TX interrupts all the
time.  You can mitigate them in various ways, but don't turn them on
selectively based upon traffic type, that's terrible.

You can even use ->xmit_more to defer the TX interrupt indication to
the final TX packet in the chain.

^ permalink raw reply

* Re: something is wrong in commit 971f10eca1 - tcp: better TCP_SKB_CB layout to reduce cache line misses
From: Cong Wang @ 2014-10-14 18:59 UTC (permalink / raw)
  To: Krzysztof Kolasa; +Cc: Eric Dumazet, netdev, edumazet, David Miller
In-Reply-To: <543D2435.8090002@winsoft.pl>

[-- Attachment #1: Type: text/plain, Size: 1044 bytes --]

On Tue, Oct 14, 2014 at 6:25 AM, Krzysztof Kolasa <kkolasa@winsoft.pl> wrote:
> W dniu 14.10.2014 o 02:09, Cong Wang pisze:
>
>> On Mon, Oct 13, 2014 at 4:59 PM, Cong Wang <cwang@twopensource.com> wrote:
>>>
>>> Probably not related with this bug, but with regarding to the
>>> offending commit, what's the point of the memmove() in tcp_v4_rcv()
>>> since ip_rcv() already clears IPCB()?
>>
>> Oh, ip options are actually saved in ip_rcv_finish()... Hmm, looks scary
>> to play with variable-length array with memmove()....
>>
> On my other old laptop with 32bit kernel next and graphics card Intel 945GM
> just after the revert commit working OK,
> before, after login to gnome shell in some seconds decorations disappear
>
> 32 bit Ubuntu 12.04.5 LTS, gnome shell, kernel source next 14-10-2014
>
> Can anyone confirm this ?
>

Sorry, believe it or not, for me it is hard to find a 32bit machine even VM. :)

Could the attached patch by any chance help? I noticed cookie_v4_check()
still uses IPCB() instead of TCPCB() at TCP layer.

Thanks.

[-- Attachment #2: tcp-cb.diff --]
[-- Type: text/plain, Size: 3524 bytes --]

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 74efeda..5d11012 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -468,8 +468,7 @@ void inet_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb);
 /* From syncookies.c */
 int __cookie_v4_check(const struct iphdr *iph, const struct tcphdr *th,
 		      u32 cookie);
-struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
-			     struct ip_options *opt);
+struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb);
 #ifdef CONFIG_SYN_COOKIES
 
 /* Syncookies use a monotonic timer which increments every 60 seconds.
@@ -530,6 +529,7 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority);
 int tcp_send_synack(struct sock *);
 bool tcp_syn_flood_action(struct sock *sk, const struct sk_buff *skb,
 			  const char *proto);
+struct ip_options_rcu *tcp_v4_save_options(struct sk_buff *skb);
 void tcp_push_one(struct sock *, unsigned int mss_now);
 void tcp_send_ack(struct sock *sk);
 void tcp_send_delayed_ack(struct sock *sk);
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 0431a8f..d346303 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -255,9 +255,9 @@ bool cookie_check_timestamp(struct tcp_options_received *tcp_opt,
 }
 EXPORT_SYMBOL(cookie_check_timestamp);
 
-struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
-			     struct ip_options *opt)
+struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
 {
+	struct ip_options *opt = &TCP_SKB_CB(skb)->header.h4.opt;
 	struct tcp_options_received tcp_opt;
 	struct inet_request_sock *ireq;
 	struct tcp_request_sock *treq;
@@ -317,15 +317,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 	/* We throwed the options of the initial SYN away, so we hope
 	 * the ACK carries the same options again (see RFC1122 4.2.3.8)
 	 */
-	if (opt && opt->optlen) {
-		int opt_size = sizeof(struct ip_options_rcu) + opt->optlen;
-
-		ireq->opt = kmalloc(opt_size, GFP_ATOMIC);
-		if (ireq->opt != NULL && ip_options_echo(&ireq->opt->opt, skb)) {
-			kfree(ireq->opt);
-			ireq->opt = NULL;
-		}
-	}
+	ireq->opt = tcp_v4_save_options(skb);
 
 	if (security_inet_conn_request(sk, skb, req)) {
 		reqsk_free(req);
@@ -344,7 +336,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 	flowi4_init_output(&fl4, sk->sk_bound_dev_if, ireq->ir_mark,
 			   RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE, IPPROTO_TCP,
 			   inet_sk_flowi_flags(sk),
-			   (opt && opt->srr) ? opt->faddr : ireq->ir_rmt_addr,
+			   opt->srr ? opt->faddr : ireq->ir_rmt_addr,
 			   ireq->ir_loc_addr, th->source, th->dest);
 	security_req_classify_flow(req, flowi4_to_flowi(&fl4));
 	rt = ip_route_output_key(sock_net(sk), &fl4);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 552e87e..e02d586 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -883,7 +883,7 @@ EXPORT_SYMBOL(tcp_syn_flood_action);
 /*
  * Save and compile IPv4 options into the request_sock if needed.
  */
-static struct ip_options_rcu *tcp_v4_save_options(struct sk_buff *skb)
+struct ip_options_rcu *tcp_v4_save_options(struct sk_buff *skb)
 {
 	const struct ip_options *opt = &TCP_SKB_CB(skb)->header.h4.opt;
 	struct ip_options_rcu *dopt = NULL;
@@ -1428,7 +1428,7 @@ static struct sock *tcp_v4_hnd_req(struct sock *sk, struct sk_buff *skb)
 
 #ifdef CONFIG_SYN_COOKIES
 	if (!th->syn)
-		sk = cookie_v4_check(sk, skb, &TCP_SKB_CB(skb)->header.h4.opt);
+		sk = cookie_v4_check(sk, skb);
 #endif
 	return sk;
 }

^ permalink raw reply related

* ixgbe: Question about Flow Control on 10G
From: dom @ 2014-10-14 19:01 UTC (permalink / raw)
  To: donald.c.skidmore, netdev

Hi

I have a question about the ixgbe driver's handling of 'ethtool -a ethX' 
when the NIC is using fibre.

Specifically I don't understand the code introduced by this commit:

commit 73d80953dfd1d5a92948005798c857c311c2834b
Author: Don Skidmore <donald.c.skidmore@intel.com>
Date:   Wed Jul 31 02:19:24 2013 +0000
Subject: ixgbe: fix fc autoneg ethtool reporting.

The function introduced the function:
        ixgbe_device_supports_autoneg_fc()

which gets called by ixgbe_get_pauseparam()/ixgbe_set_pauseparam().

specifically there is a  case in ixgbe_device_supports_autoneg_fc()

     case ixgbe_media_type_fiber_qsfp:
     case ixgbe_media_type_fiber:
         hw->mac.ops.check_link(hw, &speed, &link_up, false);
         /* if link is down, assume supported */
         if (link_up)
             supported = speed == IXGBE_LINK_SPEED_1GB_FULL ?
                 true : false;

If link_up=1 then why is supported only true for a 
speed=IXGBE_LINK_SPEED_1GB_FULL ?

Why is Flow Control not  supported for IXGBE_LINK_SPEED_10GB_FULL ?

I have search through the 82599 spec, but I can't find a reason.
Please can anyone help me understand ?

Thanks
dom

^ permalink raw reply

* Re: [PATCH RFC v4 net 1/3] ipv6: Remove BACKTRACK macro
From: David Miller @ 2014-10-14 19:01 UTC (permalink / raw)
  To: kafai; +Cc: netdev, hannes
In-Reply-To: <1412966888-31384-2-git-send-email-kafai@fb.com>

From: Martin KaFai Lau <kafai@fb.com>
Date: Fri, 10 Oct 2014 11:48:06 -0700

> +struct fib6_node *fib6_backtrack(struct fib6_node *fn,
> +				 struct in6_addr *saddr);
> +

I am completely mystified why you did this, could you explain the
logic?  I want to know what drove you to make this exported.

I marked it static in my example patch, and there is no caller outside
of route.c

Doing this also eliminates inlining opportunitites.

Please keep this private inside of route.c

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox