Netdev List


* Re: [RFC] netlink: get rid of nl_table_lock
From: David Miller @ 2015-01-06 22:19 UTC (permalink / raw)
  To: stephen
  Cc: tgraf, netdev, linux-kernel, herbert, paulmck, edumazet,
	john.r.fastabend, josh
In-Reply-To: <20150103110211.18b11f0f@urahara>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Sat, 3 Jan 2015 11:02:11 -0800

> As a follow on to Thomas's patch I think this would complete the
> transition to RCU for netlink.
> Compile tested only.
> 
> This patch gets rid of the reader/writer nl_table_lock and replaces it
> with exclusively using RCU for reading, and a mutex for writing.
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

FWIW, this approach looks fine to me.

Thomas can you review this?


* Re: [PATCH] net/fsl: Add mEMAC MDIO support to XGMAC MDIO
From: David Miller @ 2015-01-06 22:18 UTC (permalink / raw)
  To: shh.xie; +Cc: netdev, afleming, Shaohui.Xie
In-Reply-To: <1420364162-13109-1-git-send-email-shh.xie@gmail.com>

From: <shh.xie@gmail.com>
Date: Sun, 4 Jan 2015 17:36:02 +0800

> From: Andy Fleming <afleming@gmail.com>
> 
> The Freescale mEMAC supports operating at 10/100/1000/10G, and
> its associated MDIO controller is likewise capable of operating
> both Clause 22 and Clause 45 MDIO buses. It is nearly identical
> to the MDIO controller on the XGMAC, so we just modify that
> driver.
> 
> Portions of this driver developed by:
> 
> Sandeep Singh <sandeep@freescale.com>
> Roy Zang <tie-fei.zang@freescale.com>
> 
> Signed-off-by: Andy Fleming <afleming@gmail.com>
> Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>

Applied, thanks.


* Re: [PATCH] ethtool: Extend ethtool plugin module eeprom API to phylib
From: David Miller @ 2015-01-06 22:17 UTC (permalink / raw)
  To: eswierk; +Cc: netdev, f.fainelli, linux-kernel
In-Reply-To: <1420248476-110859-1-git-send-email-eswierk@skyportsystems.com>

From: Ed Swierk <eswierk@skyportsystems.com>
Date: Fri,  2 Jan 2015 17:27:56 -0800

> This patch extends the ethtool plugin module eeprom API to support cards
> whose phy support is delegated to a separate driver.
> 
> The handlers for ETHTOOL_GMODULEINFO and ETHTOOL_GMODULEEEPROM call the
> module_info and module_eeprom functions if the phy driver provides them;
> otherwise the handlers call the equivalent ethtool_ops functions provided
> by network drivers with built-in phy support.
> 
> Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>

Applied to net-next, thanks.


* Re: [PATCH net-next] openvswitch: Do not use private netdev_vport fields
From: Pravin Shelar @ 2015-01-06 22:15 UTC (permalink / raw)
  To: David Miller; +Cc: Daniele Di Proietto, netdev
In-Reply-To: <20150106.170206.2121115629154170856.davem@davemloft.net>

On Tue, Jan 6, 2015 at 2:02 PM, David Miller <davem@davemloft.net> wrote:
> From: Pravin Shelar <pshelar@nicira.com>
> Date: Tue, 6 Jan 2015 13:16:11 -0800
>
>> Function return type and function name should be on same line,
>> otherwise looks good.
>
> I disagree, where is the code in the tree that needs this?

Most of the function definitions that I have seen are defined like this.
I was pointing out a coding style issue.


* Re: [PATCH net-next] tg3: move init/deinit from open/close to probe/remove
From: Prashant Sreedharan @ 2015-01-06 21:57 UTC (permalink / raw)
  To: Ivan Vecera; +Cc: netdev, mchan
In-Reply-To: <1420576122-23618-1-git-send-email-ivecera@redhat.com>

On Tue, 2015-01-06 at 21:28 +0100, Ivan Vecera wrote:
> Move init and deinit of PTP support from open/close functions
> to probe/remove funcs to avoid removing/re-adding of associated PTP
> device(s) during ifup/ifdown.
> 
> Signed-off-by: Ivan Vecera <ivecera@redhat.com>
> ---
>  drivers/net/ethernet/broadcom/tg3.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
> index 553dcd8..e86bee4 100644
> --- a/drivers/net/ethernet/broadcom/tg3.c
> +++ b/drivers/net/ethernet/broadcom/tg3.c
> @@ -11681,13 +11681,6 @@ static int tg3_open(struct net_device *dev)
>  		pci_set_power_state(tp->pdev, PCI_D3hot);
>  	}
>  
> -	if (tg3_flag(tp, PTP_CAPABLE)) {
> -		tp->ptp_clock = ptp_clock_register(&tp->ptp_info,
> -						   &tp->pdev->dev);
> -		if (IS_ERR(tp->ptp_clock))
> -			tp->ptp_clock = NULL;
> -	}
> -
>  	return err;
>  }
>  
> @@ -11701,8 +11694,6 @@ static int tg3_close(struct net_device *dev)
>  		return -EAGAIN;
>  	}
>  
> -	tg3_ptp_fini(tp);
> -
>  	tg3_stop(tp);
>  
>  	/* Clear stats across close / open calls */
> @@ -17880,6 +17871,13 @@ static int tg3_init_one(struct pci_dev *pdev,
>  		goto err_out_apeunmap;
>  	}
>  
> +	if (tg3_flag(tp, PTP_CAPABLE)) {
> +		tp->ptp_clock = ptp_clock_register(&tp->ptp_info,
> +						   &tp->pdev->dev);
> +		if (IS_ERR(tp->ptp_clock))
> +			tp->ptp_clock = NULL;
> +	}
> +
>  	netdev_info(dev, "Tigon3 [partno(%s) rev %04x] (%s) MAC address %pM\n",
>  		    tp->board_part_number,
>  		    tg3_chip_rev_id(tp),
> @@ -17955,6 +17953,8 @@ static void tg3_remove_one(struct pci_dev *pdev)
>  	if (dev) {
>  		struct tg3 *tp = netdev_priv(dev);
>  
> +		tg3_ptp_fini(tp);
> +
>  		release_firmware(tp->fw);
>  
>  		tg3_reset_task_cancel(tp);

tg3_ptp_init() needs to be called before ptp_clock_register() to
initialize the HW and populate the ptp_clock_info structure. Could you
please test after making this change? Thanks.
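
The ordering requirement above can be illustrated with a small user-space
sketch. Everything in it is a hypothetical stand-in (struct fake_dev,
fake_ptp_init(), fake_clock_register(), fake_probe()); it is not the tg3
driver or the kernel PTP API, and the mock register function rejecting an
unpopulated description is an assumption made only to show why the order
matters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the real tg3 or PTP types. */
struct clock_info {
	char name[16];
	int max_adj;
};

struct fake_dev {
	int ptp_capable;
	struct clock_info info;
	const struct clock_info *clock;	/* NULL when no clock is registered */
};

/* Stand-in for tg3_ptp_init(): fills in the clock description. */
static void fake_ptp_init(struct fake_dev *d)
{
	strcpy(d->info.name, "fake clock");
	d->info.max_adj = 1000;		/* arbitrary illustrative value */
}

/* Stand-in for ptp_clock_register(): this mock rejects an empty description. */
static const struct clock_info *fake_clock_register(const struct clock_info *info)
{
	return info->name[0] != '\0' ? info : NULL;
}

/* Probe-time ordering suggested in the review: init first, then register. */
static void fake_probe(struct fake_dev *d)
{
	if (!d->ptp_capable)
		return;
	fake_ptp_init(d);
	d->clock = fake_clock_register(&d->info);
}
```

In this mock, registering before the init step would hand over a zeroed
description, which is the situation the review comment warns about.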


* Re: [PATCH] net: ethernet: cpsw: ignore VLAN ID 1
From: Felipe Balbi @ 2015-01-06 22:04 UTC (permalink / raw)
  To: David Miller; +Cc: balbi, netdev, linux-omap, stable, mugunthanvnm
In-Reply-To: <20150106.165911.604916635790072318.davem@davemloft.net>


Hi,

On Tue, Jan 06, 2015 at 04:59:11PM -0500, David Miller wrote:
> From: Felipe Balbi <balbi@ti.com>
> Date: Tue, 6 Jan 2015 14:31:19 -0600
> 
> > What you're saying here is that you prefer to drop a feature that works
> > for all other 1023 IDs because 1 ID is quirky. Sounds like overkill
> > to me.
> 
> The other option is to software fallback only for VLAN 1.

Now we're talking. Keep in mind, however, that this IP runs on mere
single-core Cortex-A8 and single-core Cortex-A9 devices, which already
have a somewhat hard time keeping up with the non-accelerated
checksum calculations. But fair enough, if that's the way to go, it is
the way to go.

cheers

-- 
balbi



* Fwd: [PATCH 1/1] update ip-sysctl.txt documentation
From: Ani Sinha @ 2015-01-06 22:02 UTC (permalink / raw)
  To: netdev@vger.kernel.org
In-Reply-To: <1420574648-24213-1-git-send-email-ani@arista.com>

+netdev


---------- Forwarded message ----------
From: Ani Sinha <ani@arista.com>
Date: Tue, Jan 6, 2015 at 12:04 PM
Subject: [PATCH 1/1] update ip-sysctl.txt documentation
To: corbet@lwn.net, davem@davemloft.net, edumazet@google.com,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
ani@arista.com, P@draigbrady.com


Update documentation to reflect the fact that
/proc/sys/net/ipv4/route/max_size is no longer used for
ipv4.

Signed-off-by: Ani Sinha <ani@arista.com>
---
 Documentation/networking/ip-sysctl.txt |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 9bffdfc..c8a7e37 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -64,8 +64,10 @@ fwmark_reflect - BOOLEAN
        Default: 0

 route/max_size - INTEGER
-       Maximum number of routes allowed in the kernel.  Increase
-       this when using large numbers of interfaces and/or routes.
+        As of Linux kernel 3.6, this is deprecated for ipv4, as the route
+        cache is no longer used. For ipv6, this is used to limit the maximum
+        number of ipv6 routes allowed in the kernel.  Increase this when
+        using large numbers of interfaces and/or routes.

 neigh/default/gc_thresh1 - INTEGER
        Minimum number of entries to keep.  Garbage collector will not
--
1.7.4.4


* Re: [PATCH net-next] openvswitch: Do not use private netdev_vport fields
From: David Miller @ 2015-01-06 22:02 UTC (permalink / raw)
  To: pshelar; +Cc: daniele.di.proietto, netdev
In-Reply-To: <CALnjE+odj6GdO97n4NBLQhP4egMzuNHMjxVqMwC7fSC5C=hT-g@mail.gmail.com>

From: Pravin Shelar <pshelar@nicira.com>
Date: Tue, 6 Jan 2015 13:16:11 -0800

> Function return type and function name should be on same line,
> otherwise looks good.

I disagree, where is the code in the tree that needs this?


* Re: [PATCH net-next] openvswitch: Do not use private netdev_vport fields
From: David Miller @ 2015-01-06 22:01 UTC (permalink / raw)
  To: daniele.di.proietto; +Cc: netdev, pshelar
In-Reply-To: <1420577481-20238-1-git-send-email-daniele.di.proietto@gmail.com>

From: Daniele Di Proietto <daniele.di.proietto@gmail.com>
Date: Tue,  6 Jan 2015 21:51:21 +0100

> This commit introduces netdev_vport_index() to prevent datapath.c from directly accessing the 'dev' member of 'struct netdev_vport'.
> This fix is needed to allow possible alternative netdev_vport implementations.
> 
> Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>

This doesn't make any sense to me, as the code currently stands your
change is not necessary at all.

If some need does arise, submit this patch along with the change
that creates the need.


* Re: TCP connection issues against Amazon S3
From: Yuchung Cheng @ 2015-01-06 22:00 UTC (permalink / raw)
  To: Erik Grinaker; +Cc: Eric Dumazet, linux-kernel@vger.kernel.org, netdev
In-Reply-To: <3F608393-E5F1-4647-81BF-C6C740100934@bengler.no>

On Tue, Jan 6, 2015 at 1:04 PM, Erik Grinaker <erik@bengler.no> wrote:
>
>> On 06 Jan 2015, at 20:26, Erik Grinaker <erik@bengler.no> wrote:
>>
>>>
>>> On 06 Jan 2015, at 20:13, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>>
>>> On Tue, 2015-01-06 at 19:42 +0000, Erik Grinaker wrote:
>>>
>>>> The transfer on the functioning Netherlands server does indeed use SACKs, while the Norway servers do not.
>>>>
>>>> For what it’s worth, I have made stripped down pcaps for a single failing transfer as well as a single functioning transfer in the Netherlands:
>>>>
>>>> http://abstrakt.bengler.no/tcp-issues-s3-failure.pcap.bz2
>>>> http://abstrakt.bengler.no/tcp-issues-s3-success-netherlands.pcap.bz2
>>>>
>>>
>>> Although sender seems to be reluctant to retransmit, this 'failure' is
>>> caused by receiver closing the connection too soon.
>>>
>>> Are you sure you do not ask curl to setup a very small completion
>>> timer ?
>>
>> For testing, I am using Curl with a 30 second timeout. This may well be a bit short, but the point is that with the older kernel I could run thousands of requests without a single failure (generally the requests would finish within seconds), while with the newer kernel about 5% of requests will time out (the rest complete within seconds).
>>
>>> 12:41:00.738336 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 767221:768681, ack 154, win 127, length 1460
>>> 12:41:00.738346 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 736561, win 1877, length 0
>>> 12:41:05.227150 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 736561:738021, ack 154, win 127, length 1460
>>> 12:41:05.227250 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 745321, win 1882, length 0
>>> 12:41:05.278287 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 768681:770141, ack 154, win 127, length 1460
>>> 12:41:05.278354 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 745321, win 1888, length 0
>>> 12:41:05.278421 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 770141:771601, ack 154, win 127, length 1460
>>> 12:41:05.278429 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 745321, win 1894, length 0
>>> 12:41:14.257102 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 745321:746781, ack 154, win 127, length 1460
>>> 12:41:14.257154 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 746781, win 1900, length 0
>>> 12:41:14.308117 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 771601:773061, ack 154, win 127, length 1460
>>> 12:41:14.308227 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 746781, win 1905, length 0
>>> 12:41:14.308387 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 773061:774521, ack 154, win 127, length 1460
>>> 12:41:14.308397 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 746781, win 1911, length 0
>>>
>>> -> Here receiver sends a FIN, because application closed the socket (or died)
>>> 12:41:23.237156 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [F.], seq 154, ack 746781, win 1911, length 0
>>> 12:41:23.289805 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 746781:748241, ack 155, win 127, length 1460
>>> 12:41:23.289882 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [R], seq 505782802, win 0, length 0
>>>
>>> Anyway, getting decent speed without SACK is going to be hard.
>>
>> Yes, I am not sure why the sender (S3) disables SACK on my Norwegian servers (across ISPs), while it enables SACK on my server in the Netherlands. They run the same kernel and configuration. I will have to look into it more closely tomorrow.
>
> It turns out the Norway and Netherlands servers were resolving different loadbalancers. The ones I reached in Norway did not support SACKs, while the ones in the Netherlands did. Going directly to a SACK-enabled IP fixes the problem.
>
> This still doesn’t explain why it works with older kernels, but not newer ones. I’m thinking it’s probably some minor change, which gets amplified by the lack of SACKs on the loadbalancer. Anyway, I’ll bring it up with Amazon.

Can you post traces with the older kernels?

>
> Many thanks for your help, everyone.


* Re: [PATCH] net: ethernet: cpsw: ignore VLAN ID 1
From: David Miller @ 2015-01-06 21:59 UTC (permalink / raw)
  To: balbi; +Cc: netdev, linux-omap, stable, mugunthanvnm
In-Reply-To: <20150106203119.GC32308@saruman>

From: Felipe Balbi <balbi@ti.com>
Date: Tue, 6 Jan 2015 14:31:19 -0600

> What you're saying here is that you prefer to drop a feature that works
> for all other 1023 IDs because 1 ID is quirky. Sounds like overkill
> to me.

The other option is to software fallback only for VLAN 1.
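
As a rough illustration of what a per-ID fallback decision could look like,
here is a minimal user-space sketch. The names (choose_vlan_path(),
enum vlan_path, QUIRKY_VID) are hypothetical and not taken from the cpsw
driver; the only facts used from the thread are that VLAN IDs are 12 bits
wide and that ID 1 is the quirky one. Reserved IDs are not treated specially
here.

```c
#include <stdbool.h>

#define VLAN_ID_COUNT 4096	/* 12-bit VLAN ID space: 0..4095 */
#define QUIRKY_VID    1		/* the single ID reported as problematic */

/* Hypothetical result type: how a given VLAN ID would be handled. */
enum vlan_path {
	VLAN_INVALID,		/* does not fit in 12 bits */
	VLAN_HW,		/* handled by hardware VLAN support */
	VLAN_SW_FALLBACK	/* handled in software only */
};

/* Decide, per VLAN ID, whether to use hardware support or fall back. */
static enum vlan_path choose_vlan_path(unsigned int vid)
{
	if (vid >= VLAN_ID_COUNT)
		return VLAN_INVALID;
	if (vid == QUIRKY_VID)
		return VLAN_SW_FALLBACK;
	return VLAN_HW;
}
```

With this shape, only ID 1 pays the software cost while every other valid ID
keeps using the hardware path.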


* Re: Does the ordering of the fib_table_dump or /proc/net/fib_trie matter?
From: David Miller @ 2015-01-06 21:58 UTC (permalink / raw)
  To: alexander.duyck; +Cc: stephen, netdev
In-Reply-To: <54AC45CE.80800@gmail.com>

From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Tue, 06 Jan 2015 12:30:06 -0800

> The question I have is if that would screw up any user-space apps.  I
> know ip route can dump the list via "ip route show".  I'm just wondering
> if there would be any problem with default being the last entry instead
> of the first entry?

The ordering already changed once when we went from fib_hash to
fib_trie, nobody should depend upon the ordering.


* Re: Possible BUG in ipv6_find_hdr function for fragmented packets
From: Rahul Sharma @ 2015-01-06 21:43 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev
In-Reply-To: <1420551094.32369.34.camel@stressinduktion.org>

Hi Hannes

On Tue, Jan 6, 2015 at 7:01 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> Hi Rahul,
>
> On Mi, 2014-12-31 at 12:33 +0530, Rahul Sharma wrote:
>> I have observed a problem when I added an AH header before protocol
>> header (OSPFv3) while implementing authentication support for OSPFv3.
>>
>> Problem: Fragmented packets which include authentication header don't
>> get reassembled in the kernel. This was because ipv6_find_hdr returns
>> ENOENT for the non-first fragment since AH is an extension header.
>>
>> Firstly, this comment  "Note that non-1st fragment is special case
>> that "the protocol number of last header" is "next header" field in
>> Fragment header" ('last header' doesn't include AH or other extension
>> headers) before ipv6_find_hdr looks incorrect as per the description
>> of the fragmentation process in RFC2460. The rfc clearly states that
>> next header value in the fragments will be the first header of the
>> Fragmentable part of the original packet which could be AH (51) as in
>> our case.
>>
>> This code looks like a problem:
>>         if (_frag_off) {
>>                 if (target < 0 &&
>>                     ((!ipv6_ext_hdr(hp->nexthdr)) ||
>>                      hp->nexthdr == NEXTHDR_NONE)) {
>>                         if (fragoff)
>>                                 *fragoff = _frag_off;
>>                         return hp->nexthdr;
>>                 }
>>                 return -ENOENT;
>>         }
>>
>> For non-first fragments, the 'next header' in the fragment header
>> would *always* be AUTH (or whatever extension header is the first
>> header in first fragment). But the above code will keep on returning
>> ENOENT for the non-first fragment in such cases.
>>
>> Solution: I suggest we do away with this check
>> ((!ipv6_ext_hdr(hp->nexthdr)) || hp->nexthdr == NEXTHDR_NONE) and
>> simply return hp->nexthdr if _frag_off is non-zero. I tested it on
>> my machine and it works. Adding a special case for NEXTHDR_AUTH also
>> works for me.
>
> The packets do get dropped in netfilter code? Do you have any idea where
> specifically?
>
> Your suggestion seems correct to me, can you provide a patch to fix
> this?
>
> Thanks,
> Hannes
>
>

Yes, the packets get dropped in the netfilter code. ip6table_raw_hook
was returning NF_DROP for the second fragment.
This was because of xt_action_param structure's hotdrop flag being set
to true for this fragment when ip6t_do_table tries to call
ip6_packet_match which in turn calls ipv6_find_hdr which was returning
ENOENT.

I have also emailed the patch.

Thanks


* [PATCH net] ipv6: Prevent ipv6_find_hdr() from returning ENOENT for valid non-first fragments
From: Rahul Sharma @ 2015-01-06 21:33 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, hannes

ipv6_find_hdr() currently assumes that the next-header field in the
fragment header of a non-first fragment is the "protocol number of
the last header" (where "last header" excludes any extension header
protocol numbers), which is incorrect as per RFC2460. The next-header
value is the first header of the fragmentable part of the original
packet (which can be an extension header as well).

This can create reassembly problems, for example with fragmented
authenticated OSPFv3 packets (where an AH header is inserted before
the protocol header). For the second fragment, the next-header value
in the fragment header will be NEXTHDR_AUTH, which is correct, but
ipv6_find_hdr() will return ENOENT since AH is an extension header,
resulting in the second fragment getting dropped. This check for the
presence of a non-extension header needs to be removed.

Signed-off-by: Rahul Sharma <rsharma@arista.com>
---
--- linux-3.18.1/net/ipv6/exthdrs_core.c.orig   2015-01-06 10:25:36.411419863 -0800
+++ linux-3.18.1/net/ipv6/exthdrs_core.c        2015-01-06 10:51:45.819364986 -0800
@@ -171,10 +171,11 @@ EXPORT_SYMBOL_GPL(ipv6_find_tlv);
  * If the first fragment doesn't contain the final protocol header or
  * NEXTHDR_NONE it is considered invalid.
  *
- * Note that non-1st fragment is special case that "the protocol number
- * of last header" is "next header" field in Fragment header. In this case,
- * *offset is meaningless and fragment offset is stored in *fragoff if fragoff
- * isn't NULL.
+ * Note that non-1st fragment is special case that "the protocol number of the
+ * first header of the fragmentable part of the original packet" is
+ * "next header" field in the Fragment header. In this case, *offset is
+ * meaningless and fragment offset is stored in *fragoff if fragoff isn't
+ * NULL.
  *
  * if flags is not NULL and it's a fragment, then the frag flag
  * IP6_FH_F_FRAG will be set. If it's an AH header, the
@@ -250,9 +251,7 @@ int ipv6_find_hdr(const struct sk_buff *

                        _frag_off = ntohs(*fp) & ~0x7;
                        if (_frag_off) {
-                               if (target < 0 &&
-                                   ((!ipv6_ext_hdr(hp->nexthdr)) ||
-                                    hp->nexthdr == NEXTHDR_NONE)) {
+                               if (target < 0) {
                                        if (fragoff)
                                                *fragoff = _frag_off;
                                        return hp->nexthdr;
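
The effect of the changed condition can be seen in a small user-space sketch.
It models only the non-first-fragment branch with no target requested
(target < 0); is_ext_hdr() is a simplified copy of the idea behind the
kernel's ipv6_ext_hdr(), and nonfirst_frag_old() / nonfirst_frag_new() are
hypothetical helper names introduced here for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* IPv6 next-header values (IANA protocol numbers). */
#define NEXTHDR_HOP       0
#define NEXTHDR_ROUTING  43
#define NEXTHDR_FRAGMENT 44
#define NEXTHDR_AUTH     51
#define NEXTHDR_NONE     59
#define NEXTHDR_DEST     60
#define PROTO_OSPF       89	/* OSPF; the macro name is made up here */

/* Simplified model of ipv6_ext_hdr(): is this value an extension header? */
static bool is_ext_hdr(uint8_t nexthdr)
{
	return nexthdr == NEXTHDR_HOP ||
	       nexthdr == NEXTHDR_ROUTING ||
	       nexthdr == NEXTHDR_FRAGMENT ||
	       nexthdr == NEXTHDR_AUTH ||
	       nexthdr == NEXTHDR_NONE ||
	       nexthdr == NEXTHDR_DEST;
}

/* Old check: only a non-extension header (or NONE) is accepted. */
static int nonfirst_frag_old(uint8_t nexthdr)
{
	if (!is_ext_hdr(nexthdr) || nexthdr == NEXTHDR_NONE)
		return nexthdr;
	return -ENOENT;
}

/* Patched check: report the fragment header's next-header value as-is. */
static int nonfirst_frag_new(uint8_t nexthdr)
{
	return nexthdr;
}
```

For a second fragment whose fragment header carries NEXTHDR_AUTH, the old
variant yields -ENOENT while the patched variant yields 51; for a plain
protocol value such as OSPF both variants agree.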


* [PATCH] qla3xxx: don't allow never end busy loop
From: Andy Shevchenko @ 2015-01-06 21:17 UTC (permalink / raw)
  To: netdev, linux-driver, David S . Miller; +Cc: Andy Shevchenko

The counter variable was never incremented, so the loop could spin
forever if the lock was never acquired.

Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
---
 drivers/net/ethernet/qlogic/qla3xxx.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qla3xxx.c b/drivers/net/ethernet/qlogic/qla3xxx.c
index c2f09af..4847713 100644
--- a/drivers/net/ethernet/qlogic/qla3xxx.c
+++ b/drivers/net/ethernet/qlogic/qla3xxx.c
@@ -146,10 +146,7 @@ static int ql_wait_for_drvr_lock(struct ql3_adapter *qdev)
 {
 	int i = 0;
 
-	while (i < 10) {
-		if (i)
-			ssleep(1);
-
+	do {
 		if (ql_sem_lock(qdev,
 				QL_DRVR_SEM_MASK,
 				(QL_RESOURCE_BITS_BASE_CODE | (qdev->mac_index)
@@ -158,7 +155,8 @@ static int ql_wait_for_drvr_lock(struct ql3_adapter *qdev)
 				      "driver lock acquired\n");
 			return 1;
 		}
-	}
+		ssleep(1);
+	} while (++i < 10);
 
 	netdev_err(qdev->ndev, "Timed out waiting for driver lock...\n");
 	return 0;
-- 
1.8.3.101.g727a46b
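
The loop shape used in the patch can be tried out in user space with the
sketch below. The helpers are hypothetical stand-ins: fake_try_lock() plays
the role of ql_sem_lock() and fake_sleep() plays the role of ssleep(1) by
counting calls instead of blocking. Only the do/while structure with the
incremented counter mirrors the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hardware semaphore. */
struct fake_lock {
	int calls;		/* number of attempts made so far */
	int succeed_on;		/* attempt number that succeeds; 0 = never */
};

/* Stand-in for ql_sem_lock(): succeeds only on the configured attempt. */
static bool fake_try_lock(struct fake_lock *l)
{
	l->calls++;
	return l->succeed_on != 0 && l->calls == l->succeed_on;
}

/* Stand-in for ssleep(1): counts sleeps instead of blocking. */
static int sleeps;

static void fake_sleep(void)
{
	sleeps++;
}

/* Bounded retry in the shape of the patched loop: try, sleep, count. */
static int wait_for_lock(struct fake_lock *l, int max_tries)
{
	int i = 0;

	do {
		if (fake_try_lock(l))
			return 1;	/* lock acquired */
		fake_sleep();
	} while (++i < max_tries);

	return 0;			/* timed out */
}
```

Because the counter is advanced in the loop condition, a lock that never
becomes available results in exactly max_tries attempts and then a timeout.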


* Re: [PATCH net-next] openvswitch: Do not use private netdev_vport fields
From: Pravin Shelar @ 2015-01-06 21:16 UTC (permalink / raw)
  To: Daniele Di Proietto; +Cc: netdev
In-Reply-To: <1420577481-20238-1-git-send-email-daniele.di.proietto@gmail.com>

On Tue, Jan 6, 2015 at 12:51 PM, Daniele Di Proietto
<daniele.di.proietto@gmail.com> wrote:
> This commit introduces netdev_vport_index() to prevent datapath.c from directly accessing the 'dev' member of 'struct netdev_vport'.
> This fix is needed to allow possible alternative netdev_vport implementations.
>
> Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
> ---
>  net/openvswitch/datapath.c     | 2 +-
>  net/openvswitch/vport-netdev.h | 6 ++++++
>  2 files changed, 7 insertions(+), 1 deletion(-)
>
...
>
> diff --git a/net/openvswitch/vport-netdev.h b/net/openvswitch/vport-netdev.h
> index 6f7038e..ecfcbd5 100644
> --- a/net/openvswitch/vport-netdev.h
> +++ b/net/openvswitch/vport-netdev.h
> @@ -38,6 +38,12 @@ netdev_vport_priv(const struct vport *vport)
>         return vport_priv(vport);
>  }
>
> +static inline int
> +netdev_vport_index(const struct vport *vport)
> +{
> +       return netdev_vport_priv(vport)->dev->ifindex;
> +}
> +
Function return type and function name should be on same line,
otherwise looks good.

>  const char *ovs_netdev_get_name(const struct vport *);
>  void ovs_netdev_detach_dev(struct vport *);
>
> --
> 2.1.4
>


* Re: [PATCH net-next v3 0/5]: ixgbevf: Allow querying VFs RSS indirection table and key
From: Greg Rose @ 2015-01-06 21:13 UTC (permalink / raw)
  To: Vlad Zolotarov; +Cc: Gleb Natapov, netdev, Avi Kivity, jeffrey.t.kirsher
In-Reply-To: <54AC4206.4030006@cloudius-systems.com>

On Tue, Jan 6, 2015 at 12:13 PM, Vlad Zolotarov
<vladz@cloudius-systems.com> wrote:
>
> On 01/06/15 20:22, Greg Rose wrote:
>>

[snip]

>> I have not reached any such conclusion - let me reiterate:  I have no
>> idea.  It is not my area of expertise.  However, to take the lowest
>> risk route just add a policy hook so that a system admin can turn the
>> feature on through the PF driver (which is acknowledged as secure) if
>> they wish then there is no worry.
>
>
> NP. Let's move on.
>
>>> However, I don't want to argue about it any longer. Let's move on.
>>>
>>> Let's clarify one thing about this "hook". Do u agree that it should
>>> cover
>>> only the cases when VF shares the mentioned above data with PF - namely
>>> for
>>> all devices but x550?
>>
>> Look at how spoof checking is turned off/on for each VF using the "ip
>> link set" commands.  That's what I'm envisioning - some way to decide
>> on a per VF basis which VFs should be allowed to perform the query.
>
>
> I will but let's agree that x550 VFs should be out of this since their RSS
> indirection table and Key belong to the specific domain and don't impose any
> even theoretical threat.

Sounds good to me.

Thanks!

- Greg

>
>
> thanks,
> vlad
>
>> Thanks,
>>
>> - Greg
>
>


* Re: TCP connection issues against Amazon S3
From: Erik Grinaker @ 2015-01-06 21:04 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Yuchung Cheng, linux-kernel@vger.kernel.org, netdev
In-Reply-To: <4DA8529D-4EEC-42DA-89B0-DC7746DB2B10@bengler.no>


> On 06 Jan 2015, at 20:26, Erik Grinaker <erik@bengler.no> wrote:
> 
>> 
>> On 06 Jan 2015, at 20:13, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> 
>> On Tue, 2015-01-06 at 19:42 +0000, Erik Grinaker wrote:
>> 
>>> The transfer on the functioning Netherlands server does indeed use SACKs, while the Norway servers do not.
>>> 
>>> For what it’s worth, I have made stripped down pcaps for a single failing transfer as well as a single functioning transfer in the Netherlands:
>>> 
>>> http://abstrakt.bengler.no/tcp-issues-s3-failure.pcap.bz2
>>> http://abstrakt.bengler.no/tcp-issues-s3-success-netherlands.pcap.bz2
>>> 
>> 
>> Although sender seems to be reluctant to retransmit, this 'failure' is
>> caused by receiver closing the connection too soon.
>> 
>> Are you sure you do not ask curl to setup a very small completion
>> timer ?
> 
> For testing, I am using Curl with a 30 second timeout. This may well be a bit short, but the point is that with the older kernel I could run thousands of requests without a single failure (generally the requests would finish within seconds), while with the newer kernel about 5% of requests will time out (the rest complete within seconds).
> 
>> 12:41:00.738336 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 767221:768681, ack 154, win 127, length 1460
>> 12:41:00.738346 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 736561, win 1877, length 0
>> 12:41:05.227150 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 736561:738021, ack 154, win 127, length 1460
>> 12:41:05.227250 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 745321, win 1882, length 0
>> 12:41:05.278287 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 768681:770141, ack 154, win 127, length 1460
>> 12:41:05.278354 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 745321, win 1888, length 0
>> 12:41:05.278421 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 770141:771601, ack 154, win 127, length 1460
>> 12:41:05.278429 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 745321, win 1894, length 0
>> 12:41:14.257102 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 745321:746781, ack 154, win 127, length 1460
>> 12:41:14.257154 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 746781, win 1900, length 0
>> 12:41:14.308117 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 771601:773061, ack 154, win 127, length 1460
>> 12:41:14.308227 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 746781, win 1905, length 0
>> 12:41:14.308387 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 773061:774521, ack 154, win 127, length 1460
>> 12:41:14.308397 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 746781, win 1911, length 0
>> 
>> -> Here receiver sends a FIN, because application closed the socket (or died)
>> 12:41:23.237156 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [F.], seq 154, ack 746781, win 1911, length 0
>> 12:41:23.289805 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 746781:748241, ack 155, win 127, length 1460
>> 12:41:23.289882 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [R], seq 505782802, win 0, length 0
>> 
>> Anyway, getting decent speed without SACK is going to be hard.
> 
> Yes, I am not sure why the sender (S3) disables SACK on my Norwegian servers (across ISPs), while it enables SACK on my server in the Netherlands. They run the same kernel and configuration. I will have to look into it more closely tomorrow.

It turns out the Norway and Netherlands servers were resolving different loadbalancers. The ones I reached in Norway did not support SACKs, while the ones in the Netherlands did. Going directly to a SACK-enabled IP fixes the problem.

This still doesn’t explain why it works with older kernels, but not newer ones. I’m thinking it’s probably some minor change, which gets amplified by the lack of SACKs on the loadbalancer. Anyway, I’ll bring it up with Amazon.

Many thanks for your help, everyone.


* [PATCH net-next] openvswitch: Do not use private netdev_vport fields
From: Daniele Di Proietto @ 2015-01-06 20:51 UTC (permalink / raw)
  To: netdev; +Cc: pshelar, Daniele Di Proietto

This commit introduces netdev_vport_index() to prevent datapath.c from directly accessing the 'dev' member of 'struct netdev_vport'.
This fix is needed to allow possible alternative netdev_vport implementations.

Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com>
---
 net/openvswitch/datapath.c     | 2 +-
 net/openvswitch/vport-netdev.h | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 4e9a5f0..d632535 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -186,7 +186,7 @@ static int get_dpifindex(const struct datapath *dp)
 
 	local = ovs_vport_rcu(dp, OVSP_LOCAL);
 	if (local)
-		ifindex = netdev_vport_priv(local)->dev->ifindex;
+		ifindex = netdev_vport_index(local);
 	else
 		ifindex = 0;
 
diff --git a/net/openvswitch/vport-netdev.h b/net/openvswitch/vport-netdev.h
index 6f7038e..ecfcbd5 100644
--- a/net/openvswitch/vport-netdev.h
+++ b/net/openvswitch/vport-netdev.h
@@ -38,6 +38,12 @@ netdev_vport_priv(const struct vport *vport)
 	return vport_priv(vport);
 }
 
+static inline int
+netdev_vport_index(const struct vport *vport)
+{
+	return netdev_vport_priv(vport)->dev->ifindex;
+}
+
 const char *ovs_netdev_get_name(const struct vport *);
 void ovs_netdev_detach_dev(struct vport *);
 
-- 
2.1.4


* Re: 3.12.33 - BUG xfrm_selector_match+0x25/0x2f6
From: Julian Anastasov @ 2015-01-06 20:46 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Smart Weblications GmbH - Florian Wiessner, Steffen Klassert,
	netdev, LKML, stable, Simon Horman, lvs-devel
In-Reply-To: <54ABDB98.9060504@suse.cz>


	Hello,

On Tue, 6 Jan 2015, Jiri Slaby wrote:

> So what should be done to fix the issue in stable 3.12? Are those
> patches needed in the upstream kernel too? In that case I suppose it
> will propagate to me through upstream. Otherwise, could you send "3.12
> only" patches to stable@ so that I can apply them?

	I asked Pablo for the old fix for IPVS-FTP:

http://www.spinics.net/lists/lvs-devel/msg03879.html

	The new fix for the xfrm crash is not applied yet:

http://www.spinics.net/lists/lvs-devel/msg03877.html

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [PATCH] net: ethernet: cpsw: ignore VLAN ID 1
From: Felipe Balbi @ 2015-01-06 20:31 UTC (permalink / raw)
  To: David Miller; +Cc: balbi, netdev, linux-omap, stable, mugunthanvnm
In-Reply-To: <20150106.141323.2091288413667564444.davem@davemloft.net>

Hi,

On Tue, Jan 06, 2015 at 02:13:23PM -0500, David Miller wrote:
> From: Felipe Balbi <balbi@ti.com>
> Date: Tue, 6 Jan 2015 11:43:32 -0600
> 
> > CPSW completely hangs if we add, and later remove,
> > VLAN ID #1. What happens is that after removing
> > VLAN ID #1, no packets will be received by CPSW
> > rendering network unusable.
> > 
> > In order to "fix" the issue, we're returning -EINVAL
> > if anybody tries to add VLAN ID #1. While at that,
> > also filter out any ID > 4095 because we only have
> > 12 bits for VLAN IDs.
> > 
> > Fixes: 3b72c2f (drivers: net:ethernet: cpsw: add support for VLAN)
> > Cc: <stable@vger.kernel.org> # v3.9+
> > Cc: Mugunthan V N <mugunthanvnm@ti.com>
> > Tested-by: Schuyler Patton <spatton@ti.com>
> > Signed-off-by: Felipe Balbi <balbi@ti.com>
> 
> You can't just unilaterally make one VLAN ID unusable.
> 
> A better way to handle this situation must be found,
> and if that means turning off hw VLAN support completely,
> that's a much better alternative to this.
> 
> I'm not applying this patch, sorry.

All other IDs work alright, it's just ID 1 which seems to be quirky. In
fact when trying to add VLAN ID 1, vconfig itself dumps out a warning
that VLAN ID 1 doesn't work on most switches.

What you're saying here is that you prefer to drop a feature that works
for all other IDs because 1 ID is quirky. Sounds like overkill
to me.
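
For reference, the check described in the quoted commit message amounts
to roughly the following. This is only a sketch of the described
behaviour, not the code from the patch: toy_vid_check() and the TOY_*
constants are hypothetical names made up for illustration.

```c
#include <errno.h>

/* Hypothetical constants, for illustration only. */
#define TOY_VID_MAX	4095u	/* largest value that fits in 12 bits */
#define TOY_VID_QUIRKY	1u	/* the ID reported to hang CPSW */

/*
 * Return 0 if the VLAN ID would be accepted, -EINVAL if it is the
 * quirky ID 1 or does not fit in the 12-bit VLAN ID field.
 */
static int toy_vid_check(unsigned int vid)
{
	if (vid == TOY_VID_QUIRKY || vid > TOY_VID_MAX)
		return -EINVAL;

	return 0;
}
```

With this check only ID 1 and out-of-range values are refused; every
other ID still goes through the existing hw VLAN path.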

-- 
balbi

^ permalink raw reply

* Does the ordering of the fib_table_dump or /proc/net/fib_trie matter?
From: Alexander Duyck @ 2015-01-06 20:30 UTC (permalink / raw)
  To: miller >> David Miller, stephen, NetDev

I am considering reversing the order of any non-lookup traversal of the
fib_trie so that it starts at the last node and works its way up toward
the first node.  This would make it so that all walks using the parent
pointer all go in the same direction.

The problem right now is that leaf_walk_rcu and a couple of other
iterators traverse the trie in one direction grabbing the next child
(child++) of the parent, while fib_table_lookup is traversing the list
grabbing a previous child (child & (child - 1)) of the parent.  It makes
things a bit ugly for RCU as we have to have the node fully populated
before we can start updating the parent pointers on the children.

I want to have them both moving in the same direction so the
fib_table_lookup would remain the same, but the leaf_walk_rcu and others
would walk from the last child to the first (child--) and as a result
when I assemble a tnode in inflate or halve I would be able to populate
children from 0 to ((1 << tn->bits) - 1) without having to worry about
any iterators walking into uninitialized memory.
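
To make the ordering argument concrete, here is a toy model of it. It is
a sketch under simplifying assumptions, not fib_trie code: there is no
RCU and there are no parent pointers, a child slot is just marked as
filled, and all names (TOY_BITS, lookup_prev, walk_prev, walk_next,
ascending_fill_is_safe) are hypothetical. It fills the slots from 0
upward and checks that the step a walker would take from the slot just
filled never lands on a slot that is not filled yet.

```c
#define TOY_BITS	3
#define TOY_NCHILD	(1u << TOY_BITS)

typedef unsigned int (*step_fn)(unsigned int);

/* Backtracking step of the lookup: clear the lowest set bit. */
static unsigned int lookup_prev(unsigned int child)
{
	return child & (child - 1);
}

/* Step of a reversed iterator (child--); wraps around at slot 0. */
static unsigned int walk_prev(unsigned int child)
{
	return child - 1;
}

/* Step of the current iterators (child++). */
static unsigned int walk_next(unsigned int child)
{
	return child + 1;
}

/*
 * Fill slots 0 .. TOY_NCHILD - 1 in ascending order.  After each slot
 * is filled, apply 'step' to it.  Return 0 if the step ever lands on a
 * slot of this node that has not been filled yet, 1 otherwise.  A step
 * that leaves the node (index >= TOY_NCHILD) is not counted.
 */
static int ascending_fill_is_safe(step_fn step)
{
	int filled[TOY_NCHILD] = { 0 };
	unsigned int i, next;

	for (i = 0; i < TOY_NCHILD; i++) {
		filled[i] = 1;
		next = step(i);
		if (next < TOY_NCHILD && !filled[next])
			return 0;
	}

	return 1;
}
```

In this model ascending_fill_is_safe() returns 1 for lookup_prev and
walk_prev, since both only ever move to a lower index, and 0 for
walk_next, which steps from slot 0 into the still empty slot 1.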

The question I have is if that would screw up any user-space apps.  I
know ip route can dump the list via "ip route show".  I'm just wondering
if there would be any problem with default being the last entry instead
of the first entry?

- Alex

^ permalink raw reply

* [PATCH net-next] tg3: move init/deinit from open/close to probe/remove
From: Ivan Vecera @ 2015-01-06 20:28 UTC (permalink / raw)
  To: netdev; +Cc: prashant, mchan

Move init and deinit of PTP support from the open/close functions
to the probe/remove functions to avoid removing and re-adding the
associated PTP device(s) during ifup/ifdown.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
---
 drivers/net/ethernet/broadcom/tg3.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 553dcd8..e86bee4 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -11681,13 +11681,6 @@ static int tg3_open(struct net_device *dev)
 		pci_set_power_state(tp->pdev, PCI_D3hot);
 	}
 
-	if (tg3_flag(tp, PTP_CAPABLE)) {
-		tp->ptp_clock = ptp_clock_register(&tp->ptp_info,
-						   &tp->pdev->dev);
-		if (IS_ERR(tp->ptp_clock))
-			tp->ptp_clock = NULL;
-	}
-
 	return err;
 }
 
@@ -11701,8 +11694,6 @@ static int tg3_close(struct net_device *dev)
 		return -EAGAIN;
 	}
 
-	tg3_ptp_fini(tp);
-
 	tg3_stop(tp);
 
 	/* Clear stats across close / open calls */
@@ -17880,6 +17871,13 @@ static int tg3_init_one(struct pci_dev *pdev,
 		goto err_out_apeunmap;
 	}
 
+	if (tg3_flag(tp, PTP_CAPABLE)) {
+		tp->ptp_clock = ptp_clock_register(&tp->ptp_info,
+						   &tp->pdev->dev);
+		if (IS_ERR(tp->ptp_clock))
+			tp->ptp_clock = NULL;
+	}
+
 	netdev_info(dev, "Tigon3 [partno(%s) rev %04x] (%s) MAC address %pM\n",
 		    tp->board_part_number,
 		    tg3_chip_rev_id(tp),
@@ -17955,6 +17953,8 @@ static void tg3_remove_one(struct pci_dev *pdev)
 	if (dev) {
 		struct tg3 *tp = netdev_priv(dev);
 
+		tg3_ptp_fini(tp);
+
 		release_firmware(tp->fw);
 
 		tg3_reset_task_cancel(tp);
-- 
2.0.5

^ permalink raw reply related

* Re: [PATCH net-next 1/3] net: add IPv4 routing FIB support for swdev
From: Hannes Frederic Sowa @ 2015-01-06 20:26 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Netdev, Jiří Pírko, john fastabend, Thomas Graf,
	Jamal Hadi Salim, Andy Gospodarek, Roopa Prabhu
In-Reply-To: <1420574353.15181.19.camel@stressinduktion.org>

On Di, 2015-01-06 at 20:59 +0100, Hannes Frederic Sowa wrote:
> Sorry, I haven't fully understood this. Does rocker first do a L3
> routing table lookup and *after* that decide which nexthop to choose
> based on preferences in the action-set found at the leaf? My gut tells
> me that we cannot do a semantic equivalent of ip rules then; we
> would have to use ACLs. Hmm...

Does rocker drop the packet if no match is found or can it pass the
packet onto the slowpath to the kernel?

^ permalink raw reply

* Re: TCP connection issues against Amazon S3
From: Erik Grinaker @ 2015-01-06 20:26 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Yuchung Cheng, linux-kernel@vger.kernel.org, netdev
In-Reply-To: <1420575216.5947.12.camel@edumazet-glaptop2.roam.corp.google.com>


> On 06 Jan 2015, at 20:13, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> On Tue, 2015-01-06 at 19:42 +0000, Erik Grinaker wrote:
> 
>> The transfer on the functioning Netherlands server does indeed use SACKs, while the Norway servers do not.
>> 
>> For what it’s worth, I have made stripped down pcaps for a single failing transfer as well as a single functioning transfer in the Netherlands:
>> 
>> http://abstrakt.bengler.no/tcp-issues-s3-failure.pcap.bz2
>> http://abstrakt.bengler.no/tcp-issues-s3-success-netherlands.pcap.bz2
>> 
> 
> Although sender seems to be reluctant to retransmit, this 'failure' is
> caused by receiver closing the connection too soon.
> 
> Are you sure you do not ask curl to set up a very small completion
> timer ?

For testing, I am using Curl with a 30 second timeout. This may well be a bit short, but the point is that with the older kernel I could run thousands of requests without a single failure (generally the requests would finish within seconds), while with the newer kernel about 5% of requests will time out (the rest complete within seconds).

> 12:41:00.738336 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 767221:768681, ack 154, win 127, length 1460
> 12:41:00.738346 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 736561, win 1877, length 0
> 12:41:05.227150 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 736561:738021, ack 154, win 127, length 1460
> 12:41:05.227250 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 745321, win 1882, length 0
> 12:41:05.278287 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 768681:770141, ack 154, win 127, length 1460
> 12:41:05.278354 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 745321, win 1888, length 0
> 12:41:05.278421 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 770141:771601, ack 154, win 127, length 1460
> 12:41:05.278429 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 745321, win 1894, length 0
> 12:41:14.257102 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 745321:746781, ack 154, win 127, length 1460
> 12:41:14.257154 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 746781, win 1900, length 0
> 12:41:14.308117 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 771601:773061, ack 154, win 127, length 1460
> 12:41:14.308227 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 746781, win 1905, length 0
> 12:41:14.308387 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 773061:774521, ack 154, win 127, length 1460
> 12:41:14.308397 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [.], ack 746781, win 1911, length 0
> 
> -> Here receiver sends a FIN, because application closed the socket (or died)
> 12:41:23.237156 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [F.], seq 154, ack 746781, win 1911, length 0
> 12:41:23.289805 IP 54.231.132.98.80 > 195.159.221.106.48837: Flags [.], seq 746781:748241, ack 155, win 127, length 1460
> 12:41:23.289882 IP 195.159.221.106.48837 > 54.231.132.98.80: Flags [R], seq 505782802, win 0, length 0
> 
> Anyway, getting decent speed without SACK is going to be hard.

Yes, I am not sure why the sender (S3) disables SACK on my Norwegian servers (across ISPs), while it enables SACK on my server in the Netherlands. They run the same kernel and configuration. I will have to look into it more closely tomorrow.

^ permalink raw reply
