Netdev List


* Re: /128 link-local subnet on 6in4 (sit) tunnels?
From: Wilco Baan Hofman @ 2013-03-28 13:00 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, YOSHIFUJI Hideaki
In-Reply-To: <20130327183558.GC23223@order.stressinduktion.org>

On Wed, 2013-03-27 at 19:35 +0100, Hannes Frederic Sowa wrote:
> On Wed, Mar 27, 2013 at 07:20:54PM +0100, Wilco Baan Hofman wrote:
> > http://tools.ietf.org/html/rfc4213
> 
> Thanks, I have seen that already. The sit driver is used for more than 6in4
> (6to4, isatap, 6rd). So such a change has to be ok with all the other
> protocols implemented by sit. I also looked in the historic git archive for a
> rationale of this but couldn't find one. Commit messages in 2002 were not as
> descriptive as today ("Import changeset"). :)
> 
> I also added YOSHIFUJI Hideaki as Cc, perhaps he knows the reason.
> 

I've been doing some RFC checking of my own..

As far as 6to4 and 6rd go, a link-local address is optional and not very
useful at all. ISATAP should have a /64 subnet configured as far as I
can tell, same for 6in4.

From rfc3056 section 3.1 [1]:

   The link-local address of a 6to4 pseudo-interface performing 6to4
   encapsulation would, if needed, be formed as described in Section 3.7
   of [MECH].  However, no scenario is known in which such an address
   would be useful, since a peer 6to4 gateway cannot determine the
   appropriate link-layer (IPv4) address to send to.

For 6rd, rfc5969 section 9 specifies that a link *may*, if needed, have
an unused link-local address [2]; this may be where the /128 comes in:

   The 6rd link is modeled as an NBMA link similar to other automatic
   IPv6 in IPv4 tunneling mechanisms like [RFC5214], with all 6rd CEs
   and BRs defined as off-link neighbors from one other.  The link-local
   address of a 6rd virtual interface performing the 6rd encapsulation
   would, if needed, be formed as described in Section 3.7 of [RFC4213].
   However, no communication using link-local addresses will occur.

For ISATAP, it basically states that a link-local address should have a
"subnet of appropriate length".
rfc5214 section 6.2 [3] refers to rfc4862 section 5.3 [4] for link-local
addressing:

   ISATAP interfaces form ISATAP interface identifiers from IPv4
   addresses in their locator set and use them to create link-local
   ISATAP addresses (Section 5.3 of [RFC4862]).

Which states:

   A link-local address is formed by combining the well-known link-local
   prefix FE80::0 [RFC4291] (of appropriate length) with an interface
   identifier as follows: >snip<

[1] http://tools.ietf.org/html/rfc3056#section-3.1
[2] http://tools.ietf.org/html/rfc5969#section-9
[3] http://tools.ietf.org/html/rfc5214#section-6.2
[4] http://tools.ietf.org/html/rfc4862#section-5.3
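
For reference, the address formation that the quoted RFCs point to (Section
3.7 of RFC 4213) can be sketched in user space as below. This is only an
illustration, not the sit driver's code: the helper name tunnel_link_local()
is made up for this example. It builds the address itself (fe80::/64 plus an
interface identifier of 32 zero bits followed by the IPv4 address); whether
the kernel then installs it with a /64 or a /128 prefix length is the
separate question discussed in this thread.

```c
#include <arpa/inet.h>
#include <string.h>

/*
 * Hypothetical helper: form the link-local address of an IPv4 tunnel
 * interface as described in RFC 4213 section 3.7. The result is the
 * prefix fe80::/64 followed by a 64-bit interface identifier made of
 * 32 zero bits and the 32-bit IPv4 address in network byte order.
 * Returns 0 on success, -1 if ipv4 is not a valid dotted-quad string.
 */
static int tunnel_link_local(const char *ipv4, struct in6_addr *out)
{
	struct in_addr v4;

	if (inet_pton(AF_INET, ipv4, &v4) != 1)
		return -1;

	memset(out, 0, sizeof(*out));
	out->s6_addr[0] = 0xfe;
	out->s6_addr[1] = 0x80;
	/* Bytes 8..11 stay zero; bytes 12..15 carry the IPv4 address. */
	memcpy(&out->s6_addr[12], &v4.s_addr, sizeof(v4.s_addr));
	return 0;
}
```

For a tunnel endpoint of 192.0.2.1 this gives fe80::c000:201.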


* Re: Difference between Net and Net-Next
From: Jim Baxter @ 2013-03-28 12:52 UTC (permalink / raw)
  To: netdev
In-Reply-To: <5153FA03.6010005@gmail.com>

Thank you both, that is much clearer.

Regards,
Jim


* [PATCH] smsc75xx: fix jumbo frame support
From: Steve Glendinning @ 2013-03-28 12:34 UTC (permalink / raw)
  To: netdev; +Cc: philip.dawson, Steve Glendinning

This patch enables RX of jumbo frames for LAN7500.

Previously the driver would transmit jumbo frames successfully but
would drop received jumbo frames (incrementing the interface errors
count).

With this patch applied the device can successfully receive jumbo
frames up to MTU 9000 (9014 bytes on the wire including ethernet
header).

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
---
 drivers/net/usb/smsc75xx.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/usb/smsc75xx.c b/drivers/net/usb/smsc75xx.c
index 9abe517..1a15ec1 100644
--- a/drivers/net/usb/smsc75xx.c
+++ b/drivers/net/usb/smsc75xx.c
@@ -914,8 +914,12 @@ static int smsc75xx_set_rx_max_frame_length(struct usbnet *dev, int size)
 static int smsc75xx_change_mtu(struct net_device *netdev, int new_mtu)
 {
 	struct usbnet *dev = netdev_priv(netdev);
+	int ret;
+
+	if (new_mtu > MAX_SINGLE_PACKET_SIZE)
+		return -EINVAL;
 
-	int ret = smsc75xx_set_rx_max_frame_length(dev, new_mtu);
+	ret = smsc75xx_set_rx_max_frame_length(dev, new_mtu + ETH_HLEN);
 	if (ret < 0) {
 		netdev_warn(dev->net, "Failed to set mac rx frame length\n");
 		return ret;
@@ -1324,7 +1328,7 @@ static int smsc75xx_reset(struct usbnet *dev)
 
 	netif_dbg(dev, ifup, dev->net, "FCT_TX_CTL set to 0x%08x\n", buf);
 
-	ret = smsc75xx_set_rx_max_frame_length(dev, 1514);
+	ret = smsc75xx_set_rx_max_frame_length(dev, dev->net->mtu + ETH_HLEN);
 	if (ret < 0) {
 		netdev_warn(dev->net, "Failed to set max rx frame length\n");
 		return ret;
@@ -2134,8 +2138,8 @@ static int smsc75xx_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
 			else if (rx_cmd_a & (RX_CMD_A_LONG | RX_CMD_A_RUNT))
 				dev->net->stats.rx_frame_errors++;
 		} else {
-			/* ETH_FRAME_LEN + 4(CRC) + 2(COE) + 4(Vlan) */
-			if (unlikely(size > (ETH_FRAME_LEN + 12))) {
+			/* MAX_SINGLE_PACKET_SIZE + 4(CRC) + 2(COE) + 4(Vlan) */
+			if (unlikely(size > (MAX_SINGLE_PACKET_SIZE + ETH_HLEN + 12))) {
 				netif_dbg(dev, rx_err, dev->net,
 					  "size err rx_cmd_a=0x%08x\n",
 					  rx_cmd_a);
-- 
1.7.10.4
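
The length arithmetic in the patch above can be checked in isolation with a
small user-space sketch. It is not driver code: rx_max_frame_length() is a
made-up name, and MAX_SINGLE_PACKET_SIZE is assumed here to be 9000, which
matches the "MTU 9000" in the commit message but is not shown in the diff.
ETH_HLEN is the 14-byte Ethernet header.

```c
#include <errno.h>

#define ETH_HLEN		14	/* destination + source MAC + ethertype */
#define MAX_SINGLE_PACKET_SIZE	9000	/* assumption, see text above */

/*
 * Hypothetical helper mirroring the check added to smsc75xx_change_mtu():
 * reject MTUs above the limit, otherwise return the frame length that
 * would be programmed into the MAC (MTU plus Ethernet header).
 */
static int rx_max_frame_length(int new_mtu)
{
	if (new_mtu > MAX_SINGLE_PACKET_SIZE)
		return -EINVAL;

	return new_mtu + ETH_HLEN;
}
```

With these values an MTU of 1500 gives 1514 (the constant the old code
hard-wired in smsc75xx_reset) and an MTU of 9000 gives 9014.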


* Re: 3.7.10 kernel crash
From: Peter Hurley @ 2013-03-28 12:35 UTC (permalink / raw)
  To: Fabio Coatti
  Cc: linux-kernel, Greg Kroah-Hartman, netdev, Matt Carlson,
	Michael Chan
In-Reply-To: <CADpTngW8RTVX28bVJQe5tV_ccc+FNOJezeQbczXFudy+7bMRDQ@mail.gmail.com>


[ +cc Matt Carlson, Michael Chan, netdev because this is a tg3-related oops]

On Thu, 2013-03-28 at 09:31 +0100, Fabio Coatti wrote:
> 2013/3/27 Fabio Coatti <fabio.coatti@gmail.com>:
> > Hi all,
> > we are experiencing crashes on some servers, right now running 3.7.10;
> > I've been able to get only screenshots from dying server that I
> > attached below. Probably we can exclude hardware issues, as it
> > happened on two different servers.
> 
> Further information: those crashes seem to happen only when the
> machine is heavily loaded (process, network and so on). We have seen
> this pattern several times.

I would recommend capturing the entire oops text (it will likely be
necessary anyway for someone to properly identify and fix the cause).

If the machine has a 2nd network port, then use netconsole on that
interface. If not, set up a serial console or try to get 50-line VGA
working.

Regards,
Peter Hurley


* [PATCH net-next] core: simplify the getting percpu of flow_cache
From: roy.qing.li @ 2013-03-28 12:24 UTC (permalink / raw)
  To: netdev

From: Li RongQing <roy.qing.li@gmail.com>

Replace &per_cpu(*ptr, cpu) with per_cpu_ptr(ptr, cpu) to avoid dereferencing
the percpu pointer only to take its address again.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
---
 net/core/flow.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/flow.c b/net/core/flow.c
index 7fae135..707fb7b 100644
--- a/net/core/flow.c
+++ b/net/core/flow.c
@@ -334,7 +334,7 @@ static int flow_cache_percpu_empty(struct flow_cache *fc, int cpu)
 	struct flow_cache_percpu *fcp;
 	int i;
 
-	fcp = &per_cpu(*fc->percpu, cpu);
+	fcp = per_cpu_ptr(fc->percpu, cpu);
 	for (i = 0; i < flow_cache_hash_size(fc); i++)
 		if (!hlist_empty(&fcp->hash_table[i]))
 			return 0;
-- 
1.7.10.4
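
The equivalence the patch above relies on can be illustrated with a toy
user-space model. Nothing below is the kernel's percpu implementation: the
model_* names are invented for this sketch, and the "percpu area" is just an
array with one element per CPU. It only shows that taking the address of the
per-CPU lvalue and asking for the per-CPU pointer directly name the same
object.

```c
#define MODEL_NR_CPUS 4

struct model_flow_cache_percpu {
	int hash_count;
};

/* Toy stand-in for a percpu allocation: one instance per CPU. */
static struct model_flow_cache_percpu model_area[MODEL_NR_CPUS];

/* Models per_cpu_ptr(ptr, cpu): percpu pointer in, plain pointer out. */
static struct model_flow_cache_percpu *
model_per_cpu_ptr(struct model_flow_cache_percpu *ptr, int cpu)
{
	return ptr + cpu;
}

/* Models per_cpu(var, cpu): percpu lvalue in, that CPU's lvalue out. */
#define model_per_cpu(var, cpu) (*model_per_cpu_ptr(&(var), (cpu)))
```

In this model, &model_per_cpu(*p, cpu) and model_per_cpu_ptr(p, cpu) always
compare equal, so the second form just drops a dereference and an
address-of.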


* Re: [PATCH v2 0/3] net/macb: fixes to use core on Zynq SoCs
From: Steffen Trumtrar @ 2013-03-28 11:28 UTC (permalink / raw)
  To: Nicolas Ferre; +Cc: netdev, David Miller
In-Reply-To: <51542531.9010300@atmel.com>

On Thu, Mar 28, 2013 at 12:10:41PM +0100, Nicolas Ferre wrote:
> On 03/28/2013 10:07 AM, Steffen Trumtrar :
> > Hi!
> > 
> > The Cadence GEM is also licensed for the Xilinx Zynq7000 SoCs.
> > As Xilinx uses other reset defaults, some fixes are necessary to have it
> > working there. And as the Zynq is dualcore, the clk_enables/disables now need
> > to be atomic.
> > 
> > Changes in v2:
> > 	- only 3/3 was changed to correctly use the atomic clk_[en|dis]able
> > 
> > Regards,
> > Steffen
> 
> On the whole series:
> 
> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
> 
> Thanks, best regards.
> 

Thanks,
Steffen

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


* Re: [PATCH v2 0/3] net/macb: fixes to use core on Zynq SoCs
From: Nicolas Ferre @ 2013-03-28 11:10 UTC (permalink / raw)
  To: Steffen Trumtrar, netdev, David Miller
In-Reply-To: <1364461627-26521-1-git-send-email-s.trumtrar@pengutronix.de>

On 03/28/2013 10:07 AM, Steffen Trumtrar :
> Hi!
> 
> The Cadence GEM is also licensed for the Xilinx Zynq7000 SoCs.
> As Xilinx uses other reset defaults, some fixes are necessary to have it
> working there. And as the Zynq is dualcore, the clk_enables/disables now need
> to be atomic.
> 
> Changes in v2:
> 	- only 3/3 was changed to correctly use the atomic clk_[en|dis]able
> 
> Regards,
> Steffen

On the whole series:

Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>

Thanks, best regards.

> Steffen Trumtrar (3):
>   net/macb: clear tx/rx completion flags in ISR
>   net/macb: force endian_swp_pkt_en to off
>   net/macb: make clk_enable atomic
> 
>  drivers/net/ethernet/cadence/macb.c | 24 ++++++++++++++----------
>  drivers/net/ethernet/cadence/macb.h |  2 ++
>  2 files changed, 16 insertions(+), 10 deletions(-)
> 


-- 
Nicolas Ferre


* Re: Deleting a network namespace
From: Eric W. Biederman @ 2013-03-28 11:05 UTC (permalink / raw)
  To: David Shwatrz; +Cc: netdev
In-Reply-To: <CAJJAcodGG6nmRNcF_rC+AUUpCp1MWjU3faGLfRTm9FNc6LT_Yw@mail.gmail.com>

David Shwatrz <dshwatrz@gmail.com> writes:

> Hello,
> When assigning a network interface to a network namespace and
> afterwards deleting the namespace, we will not see the network
> interface in any other namespace (including the default namespace) anymore:
>
> ip netns add ns1
> ip link set eth0 netns ns1
> ip netns del ns1
>
> This means that in fact we cannot use this interface again (only after
> rebooting)
> Am I right on this ?

Interfaces that represent physical hardware are moved to init_net.
Interfaces that are purely software constructs are deleted.

> Could moving an interface back to the default (init) namespace,
> when deleting the namespace which contains it, be considered?

If you aren't seeing that, either your interface is a purely software
construct like the veth or dummy interfaces, or something still has a
reference to your network namespace.

Eric


* Deleting a network namespace
From: David Shwatrz @ 2013-03-28 10:43 UTC (permalink / raw)
  To: netdev; +Cc: Eric W. Biederman

Hello,
When assigning a network interface to a network namespace and
afterwards deleting the namespace, we will not see the network
interface in any other namespace (including the default namespace) anymore:

ip netns add ns1
ip link set eth0 netns ns1
ip netns del ns1

This means that in fact we cannot use this interface again (only after
rebooting)
Am I right on this ?
Could moving an interface back to the default (init) namespace,
when deleting the namespace which contains it, be considered?

(AFAIK, we don't need to check that this interface is in any other namespace,
because by definition, a network interface belongs only to one namespace)

DS


* Re: [PATCH] man: packet.7: document fanout, ring and auxiliary options
From: Michael Kerrisk (man-pages) @ 2013-03-28 10:01 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: linux-man-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1363626807-22894-1-git-send-email-willemb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

Willem,

Thanks for sending this patch. This all looks good and authoritative.
Could I ask you to make a few small clean-ups and resubmit? See below.

On Mon, Mar 18, 2013 at 6:13 PM, Willem de Bruijn <willemb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> The packet socket manual page does not list all socket options.
>
> This patch adds descriptions of the common packet socket options
>   PACKET_AUXDATA, PACKET_FANOUT, PACKET_RX_RING, PACKET_STATISTICS,
>   PACKET_TX_RING
>
> and the ring-specific options
>   PACKET_LOSS, PACKET_RESERVE, PACKET_TIMESTAMP, PACKET_VERSION
>
> It does not yet add descriptions for
>   PACKET_COPY_THRESH, PACKET_HDRLEN, PACKET_ORIGDEV,
>   PACKET_TX_HAS_OFF, PACKET_TX_TIMESTAMP, PACKET_VNET_HDR
>
> It tries to balance being informative with exposing kernel detail
> that is unlikely to be used by most readers or that may change
> frequently. For implementation details, the manpage points to the
> documentation in kernel Documentation/networking. Let me know if
> options should be added or removed.

For the commit log message, could you just add a few lines for each of
the options stating how you determined the information. Also, if there
are specific individuals who could Ack the patch, please CC them and
ask them if they might Ack the patch.

> Signed-off-by: Willem de Bruijn <willemb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> ---
>  man7/packet.7 | 183 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 175 insertions(+), 8 deletions(-)
>
> diff --git a/man7/packet.7 b/man7/packet.7
> index 006f2ac..a9cc168 100644
> --- a/man7/packet.7
> +++ b/man7/packet.7
> @@ -177,17 +177,21 @@ and
>  .I sll_ifindex
>  are used.
>  .SS Socket options
> +Packet socket options are configured by calling
> +. BR setsockopt (2)
> +with level SOL_PACKET.

+with level
+.BR SOL_PACKET .

> +.TP
> +.BR PACKET_ADD_MEMBERSHIP
> +.PD 0
> +.TP
> +.BR PACKET_DROP_MEMBERSHIP
> +.PD
>  Packet sockets can be used to configure physical layer multicasting
>  and promiscuous mode.
> -It works by calling
> -.BR setsockopt (2)
> -on a packet socket for
> -.B SOL_PACKET
> -and one of the options
>  .B PACKET_ADD_MEMBERSHIP
> -to add a binding or
> +adds a binding and
>  .B PACKET_DROP_MEMBERSHIP
> -to drop it.
> +drops it.
>  They both expect a
>  .B packet_mreq
>  structure as argument:
> @@ -227,6 +231,169 @@ In addition the traditional ioctls
>  .BR SIOCADDMULTI ,
>  .B SIOCDELMULTI
>  can be used for the same purpose.
> +.TP
> +.BR PACKET_AUXDATA " (since Linux 2.6.21)"
> +.\" commit 8dc419447

It's great that you include these commit IDs, but I strongly prefer to
have the full 40-char ID. Potentially useful one day for scripting,
etc. Same comment for the instances below.

> +If this binary option is enabled, the packet socket passes a metadata
> +structure along with each packet in the
> +.BR recvmsg (2)
> +control field. The

Please start new sentences on new source lines (see man-pages(7)).
Same comment at numerous places below.


> +structure can be read with
> +.BR cmsg (3). It is defined as

Formatting broken there. Start new line after the period.

> +
> +.in +4n
> +.nf
> +struct tpacket_auxdata {
> +    __u32 tp_status;
> +    __u32 tp_len;      /* packet length */
> +    __u32 tp_snaplen;  /* captured length */
> +    __u16 tp_mac;
> +    __u16 tp_net;
> +    __u16 tp_vlan_tci;
> +    __u16 tp_padding;
> +};
> +.fi
> +.in
> +
> +.B tp_net

.I tp_net

> +stores the offset to the network layer. If the packet socket is of type
> +.BR SOCK_DGRAM ,
> +then
> +.B tp_mac
> +is the same. If it is of type
> +.B SOCK_RAW ,

.BR SOCK_RAW ,

> +then that stores the offset to the link layer frame.
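
A short example may help readers of this paragraph. The sketch below is one
possible way to enable the option and pull the structure out of the
recvmsg(2) control data; the function names packet_enable_auxdata() and
packet_find_auxdata() are invented for this illustration and do not come from
the patch or from any library.

```c
#include <linux/if_packet.h>
#include <string.h>
#include <sys/socket.h>

/* Ask the packet socket to attach a tpacket_auxdata to every packet. */
static int packet_enable_auxdata(int fd)
{
	int one = 1;

	return setsockopt(fd, SOL_PACKET, PACKET_AUXDATA, &one, sizeof(one));
}

/*
 * Walk the control messages of a msghdr filled in by recvmsg(2) and copy
 * out the first PACKET_AUXDATA entry.
 * Returns 0 if one was found, -1 otherwise.
 */
static int packet_find_auxdata(struct msghdr *msg, struct tpacket_auxdata *out)
{
	struct cmsghdr *cmsg;

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == SOL_PACKET &&
		    cmsg->cmsg_type == PACKET_AUXDATA &&
		    cmsg->cmsg_len >= CMSG_LEN(sizeof(*out))) {
			/* CMSG_DATA may be unaligned for the struct; copy it. */
			memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
			return 0;
		}
	}
	return -1;
}
```

The caller is expected to pass a control buffer of at least
CMSG_SPACE(sizeof(struct tpacket_auxdata)) bytes to recvmsg(2).
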
> +.TP
> +.BR PACKET_FANOUT " (since Linux 3.1)"
> +.\" commit dc99f6006
> +To scale processing across threads, packet sockets can form a fanout
> +group. In this mode, each matching packet is enqueued onto only one
> +socket in the group. A socket joins a fanout group by calling
> +.B setsockopt(2)
> +with level SOL_PACKET and option PACKET_FANOUT.

.B SOL_PACKET
.BR PACKET_FANOUT .

> +Each network namespace can have up to 65536 independent groups. A
> +socket selects a group by encoding the ID in the first 16 bits of
> +the integer option value. The first packet socket to join a group
> +implicitly creates it. To successfully join an existing group,
> +subsequent packet sockets must have the same
> +protocol, device settings and fanout mode and flags (see below).
> +Packet sockets can leave a fanout group only by closing the socket.
> +The group is deleted when the last socket is closed.
> +
> +Fanout supports multiple algorithms to spread traffic between sockets.
> +The default mode,
> +. BR PACKET_FANOUT_HASH ,
> +sends packets from the same flow to the same socket to maintain per-flow
> +ordering. For each packet, it chooses a socket by taking the packet
> +flow hash modulo the number of sockets in the group, where a flow hash
> +is a hash over network layer address and optional transport layer port
> +fields. The load balance mode
> +. BR PACKET_FANOUT_LB
> +implements a round robin algorithm.

round-robin

> +. BR PACKET_FANOUT_CPU
> +selects the socket based on the cpu that the packet arrived on.

CPU

> +
> +Fanout modes can take additional options. IP fragmentation causes packets
> +from the same flow to have different flow hashes. The flag
> +.BR PACKET_FANOUT_FLAG_DEFRAG ,
> +if set, causes packet to be defragmented before fanout is applied, to
> +preserve order even in this case. Fanout mode and options are communicated
> +in the second 16 bits of the integer option value.
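
The option value layout described here (group ID in the low 16 bits, mode and
flags in the high 16 bits) is easy to get wrong, so an example might be worth
having. A minimal sketch, assuming the constants from <linux/if_packet.h>;
the helper names fanout_arg() and fanout_join() are made up for this
illustration.

```c
#include <linux/if_packet.h>
#include <stdint.h>
#include <sys/socket.h>

/*
 * Encode the PACKET_FANOUT option value: the group ID occupies the first
 * (low) 16 bits, the fanout mode plus optional flags the second 16 bits.
 */
static uint32_t fanout_arg(uint16_t group_id, uint16_t mode_and_flags)
{
	return (uint32_t)group_id | ((uint32_t)mode_and_flags << 16);
}

/* Join (or implicitly create) a fanout group on a packet socket. */
static int fanout_join(int fd, uint16_t group_id, uint16_t mode_and_flags)
{
	uint32_t arg = fanout_arg(group_id, mode_and_flags);

	return setsockopt(fd, SOL_PACKET, PACKET_FANOUT, &arg, sizeof(arg));
}
```

For example, fanout_join(fd, 42, PACKET_FANOUT_HASH |
PACKET_FANOUT_FLAG_DEFRAG) would join group 42 in hash mode with
defragmentation. Every socket joining that group has to pass the same mode
and flags.
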
> +.TP
> +.BR PACKET_LOSS " (with PACKET_TX_RING)"
> +If set, do not silently drop on transmission errors, but return the
> +packet with status set to
> +.BR TP_STATUS_WRONG_FORMAT
> +.TP
> +.BR PACKET_RESERVE " (with PACKET_RX_RING)"
> +By default, a packet receive ring writes packets immediately following the
> +metadata structure and alignment padding. This integer option reserves
> +additional headroom.
> +.TP
> +.BR PACKET_RX_RING
> +Create a memory mapped ring buffer for asynchronous packet reception.
> +The packet socket reserves a contiguous region of application address
> +space, lays it out into an array of packet slots and copies packets
> +(up to snaplen)

.IR tp_snaplen )

> +into subsequent slots. Each packet is preceded by a
> +metadata structure similar to
> +.B tpacket_auxdata.

.IR tpacket_auxdata .

> +Packet socket and application communicate the head and tail of the ring
> +through the
> +.B tp_status

.I

> +field. The packet socket owns all slots with status
> +.BR TP_STATUS_KERNEL .
> +After filling a slot, it changes the status of the slot to transfer
> +ownership to the application. During normal operation, the new status is
> +.BR TP_STATUS_USER ,
> +to signal that a correctly received packet has been stored. When the
> +application has finished processing a packet, it transfers ownership of
> +the slot back to the socket by setting the status to
> +.BR TP_STATUS_KERNEL .
> +Packet sockets implement multiple
> +variants of the packet ring. The implementation details are described in
> +.IR Documentation/networking/packet_mmap.txt
> +in the Linux kernel source tree.
> +.TP
> +.BR PACKET_STATISTICS
> +Retrieve packet socket statistics in the form of a structure
> +
> +.in +4n
> +.nf
> +struct tpacket_stats {
> +    __u32 tp_packets;  /* total packet count */
> +    __u32 tp_drops;    /* dropped packet count */
> +};
> +.fi
> +.in
> +
> +Receiving statistics resets the internal counters. The exact statistics
> +structure differs when using a ring of variant
> +.BR TPACKET_V3 .
> +.TP
> +.BR PACKET_TIMESTAMP " (with PACKET_RX_RING)"
> +The packet receive ring always stores a timestamp in the metadata header.
> +By default, this is a software generated timestamp generated when the
> +packet is copied into the ring. This integer option selects the type of
> +timestamp. Besides the default, it support the two hardware formats
> +described in
> +.IR Documentation/networking/timestamping.txt
> +in the Linux kernel source tree.
> +.TP
> +.BR PACKET_TX_RING " (since Linux 2.6.31)"
> +.\" commit 69e3c75f4
> +Create a memory mapped ring buffer for packet transmission. This option
> +is similar to
> +.BR PACKET_RX_RING
> +and takes the same arguments. The application writes packets into slots
> +with status
> +.BR TP_STATUS_AVAILABLE
> +and schedules them for transmission by changing the status to
> +.BR TP_STATUS_SEND_REQUEST .
> +When packets are ready to be transmitted, the application calls
> +.BR send (2)
> +Or a variant thereof. The

s/Or/or/

> +.B buf

.I buf

> +and
> +.B len

.I len

> +fields of this call are ignored. If an address is passed using
> +.BR sendto (2)
> +or
> +.BR sendmsg (2) ,
> +then that overrides the socket default. On successful transmission, the
> +socket resets the slot to
> +.BR TP_STATUS_AVAILABLE .
> +It discards packets silently on error unless
> +.BR PACKET_LOSS
> +is set.
> +.TP
> +.BR PACKET_VERSION " (with PACKET_RX_RING)"
> +By default,
> +.BR PACKET_RX_RING
> +creates a packet receive ring of variant
> +.BR TPACKET_V1 .
> +To create another variant, configure the desired variant by setting this
> +integer option before creating the ring.
> +
>  .SS Ioctls
>  .B SIOCGSTAMP
>  can be used to receive the timestamp of the last received packet.
> @@ -318,7 +485,7 @@ header to get a fully conforming packet.
>  Incoming 802.3 packets are not multiplexed on the DSAP/SSAP protocol
>  fields; instead they are supplied to the user as protocol
>  .B ETH_P_802_2
> -with the LLC header prepended.
> +with the LLC header prefixed.
>  It is thus not possible to bind to
>  .BR ETH_P_802_3 ;
>  bind to

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface"; http://man7.org/tlpi/


* [PATCH] core: fix the use of this_cpu_ptr
From: roy.qing.li @ 2013-03-28  9:42 UTC (permalink / raw)
  To: netdev

From: Li RongQing <roy.qing.li@gmail.com>

flush_tasklet is not a percpu variable, whereas percpu is, and
	this_cpu_ptr(&info->cache->percpu->flush_tasklet)
is not equal to
	&this_cpu_ptr(info->cache->percpu)->flush_tasklet

Commit 1f743b076 (use this_cpu_ptr per-cpu helper) introduced this bug.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
---
 net/core/flow.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/flow.c b/net/core/flow.c
index 7fae135..e8084b8 100644
--- a/net/core/flow.c
+++ b/net/core/flow.c
@@ -346,7 +346,7 @@ static void flow_cache_flush_per_cpu(void *data)
 	struct flow_flush_info *info = data;
 	struct tasklet_struct *tasklet;
 
-	tasklet = this_cpu_ptr(&info->cache->percpu->flush_tasklet);
+	tasklet = &this_cpu_ptr(info->cache->percpu)->flush_tasklet;
 	tasklet->data = (unsigned long)info;
 	tasklet_schedule(tasklet);
 }
-- 
1.7.10.4


* [PATCH] net: core: Remove redundant call to 'nf_reset' in 'dev_forward_skb'
From: Shmulik Ladkani @ 2013-03-28  9:13 UTC (permalink / raw)
  To: David Miller; +Cc: Ben Greear, netdev, Igor Michailov, Shmulik Ladkani

'nf_reset' is called just prior to calling 'netif_rx'.
No need to call it twice.

Reported-by: Igor Michailov <rgohita@gmail.com>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
---
 net/core/dev.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 2db88df..071f398 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1624,7 +1624,6 @@ int dev_forward_skb(struct net_device *dev, struct sk_buff *skb)
 	}
 
 	skb_orphan(skb);
-	nf_reset(skb);
 
 	if (unlikely(!is_skb_forwardable(dev, skb))) {
 		atomic_long_inc(&dev->rx_dropped);
-- 
1.7.9


* [PATCH v2 3/3] net/macb: make clk_enable atomic
From: Steffen Trumtrar @ 2013-03-28  9:07 UTC (permalink / raw)
  To: netdev; +Cc: Nicolas Ferre, Steffen Trumtrar, Fabio Estevam
In-Reply-To: <1364461627-26521-1-git-send-email-s.trumtrar@pengutronix.de>

Use clk_prepare_enable/clk_disable_unprepare to be safe on SMP systems.

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Fabio Estevam <festevam@gmail.com>
---

Changes in v2:
	- use clk_disable_unprepare() (reported by Fabio Estevam)
	- use clk_prepare_enable() in resume function

 drivers/net/ethernet/cadence/macb.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 71e766b..4378d63 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -1561,14 +1561,14 @@ static int __init macb_probe(struct platform_device *pdev)
 		dev_err(&pdev->dev, "failed to get macb_clk\n");
 		goto err_out_free_dev;
 	}
-	clk_enable(bp->pclk);
+	clk_prepare_enable(bp->pclk);
 
 	bp->hclk = clk_get(&pdev->dev, "hclk");
 	if (IS_ERR(bp->hclk)) {
 		dev_err(&pdev->dev, "failed to get hclk\n");
 		goto err_out_put_pclk;
 	}
-	clk_enable(bp->hclk);
+	clk_prepare_enable(bp->hclk);
 
 	bp->regs = ioremap(regs->start, resource_size(regs));
 	if (!bp->regs) {
@@ -1658,9 +1658,9 @@ err_out_free_irq:
 err_out_iounmap:
 	iounmap(bp->regs);
 err_out_disable_clocks:
-	clk_disable(bp->hclk);
+	clk_disable_unprepare(bp->hclk);
 	clk_put(bp->hclk);
-	clk_disable(bp->pclk);
+	clk_disable_unprepare(bp->pclk);
 err_out_put_pclk:
 	clk_put(bp->pclk);
 err_out_free_dev:
@@ -1687,9 +1687,9 @@ static int __exit macb_remove(struct platform_device *pdev)
 		unregister_netdev(dev);
 		free_irq(dev->irq, dev);
 		iounmap(bp->regs);
-		clk_disable(bp->hclk);
+		clk_disable_unprepare(bp->hclk);
 		clk_put(bp->hclk);
-		clk_disable(bp->pclk);
+		clk_disable_unprepare(bp->pclk);
 		clk_put(bp->pclk);
 		free_netdev(dev);
 		platform_set_drvdata(pdev, NULL);
@@ -1707,8 +1707,8 @@ static int macb_suspend(struct platform_device *pdev, pm_message_t state)
 	netif_carrier_off(netdev);
 	netif_device_detach(netdev);
 
-	clk_disable(bp->hclk);
-	clk_disable(bp->pclk);
+	clk_disable_unprepare(bp->hclk);
+	clk_disable_unprepare(bp->pclk);
 
 	return 0;
 }
@@ -1718,8 +1718,8 @@ static int macb_resume(struct platform_device *pdev)
 	struct net_device *netdev = platform_get_drvdata(pdev);
 	struct macb *bp = netdev_priv(netdev);
 
-	clk_enable(bp->pclk);
-	clk_enable(bp->hclk);
+	clk_prepare_enable(bp->pclk);
+	clk_prepare_enable(bp->hclk);
 
 	netif_device_attach(netdev);
 
-- 
1.8.2.rc2


* [PATCH v2 2/3] net/macb: force endian_swp_pkt_en to off
From: Steffen Trumtrar @ 2013-03-28  9:07 UTC (permalink / raw)
  To: netdev; +Cc: Nicolas Ferre, Steffen Trumtrar
In-Reply-To: <1364461627-26521-1-git-send-email-s.trumtrar@pengutronix.de>

The core has a bit for swapping packet data endianness.
Reset default from Cadence is off. Xilinx however, who uses this core on the
Zynq SoCs, opted for on.
Force it to off. This shouldn't change the behaviour for current users of the
macb, but enables usage on Zynq devices.

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
---
 drivers/net/ethernet/cadence/macb.c | 1 +
 drivers/net/ethernet/cadence/macb.h | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 817835e..71e766b 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -1057,6 +1057,7 @@ static void macb_configure_dma(struct macb *bp)
 		dmacfg |= GEM_BF(RXBS, RX_BUFFER_SIZE / 64);
 		dmacfg |= GEM_BF(FBLDO, 16);
 		dmacfg |= GEM_BIT(TXPBMS) | GEM_BF(RXBMS, -1L);
+		dmacfg &= ~GEM_BIT(ENDIA);
 		gem_writel(bp, DMACFG, dmacfg);
 	}
 }
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 570908b..993d703 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -173,6 +173,8 @@
 /* Bitfields in DMACFG. */
 #define GEM_FBLDO_OFFSET			0
 #define GEM_FBLDO_SIZE				5
+#define GEM_ENDIA_OFFSET			7
+#define GEM_ENDIA_SIZE				1
 #define GEM_RXBMS_OFFSET			8
 #define GEM_RXBMS_SIZE				2
 #define GEM_TXPBMS_OFFSET			10
-- 
1.8.2.rc2
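
The effect of the added line on the DMACFG value can be shown with a small
stand-alone sketch. GEM_ENDIA_OFFSET is taken from the patch above; the
GEM_BIT() macro is redefined locally in the same shape as the driver's
token-pasting helper (an assumption for this example), and force_endia_off()
is a made-up name.

```c
#include <stdint.h>

#define GEM_ENDIA_OFFSET	7	/* from the macb.h hunk above */

/* Local stand-in for the driver's GEM_BIT() helper. */
#define GEM_BIT(name)		(1u << GEM_##name##_OFFSET)

/* Clear the packet data endian swap bit and leave all other bits alone. */
static uint32_t force_endia_off(uint32_t dmacfg)
{
	return dmacfg & ~GEM_BIT(ENDIA);
}
```

A reset value with bit 7 set (the Zynq case described in the commit message)
comes out with bit 7 cleared, and a value with bit 7 already clear is
returned unchanged, which is why existing macb users should see no
difference.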


* [PATCH v2 1/3] net/macb: clear tx/rx completion flags in ISR
From: Steffen Trumtrar @ 2013-03-28  9:07 UTC (permalink / raw)
  To: netdev; +Cc: Nicolas Ferre, Steffen Trumtrar
In-Reply-To: <1364461627-26521-1-git-send-email-s.trumtrar@pengutronix.de>

At least in the Cadence IP core on the Xilinx Zynq SoC, the TCOMP/RCOMP flags
are not auto-cleared. As these flags are evaluated, they need to be cleared.

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
---
 drivers/net/ethernet/cadence/macb.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 7903943..817835e 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -485,6 +485,8 @@ static void macb_tx_interrupt(struct macb *bp)
 	status = macb_readl(bp, TSR);
 	macb_writel(bp, TSR, status);
 
+	macb_writel(bp, ISR, MACB_BIT(TCOMP));
+
 	netdev_vdbg(bp->dev, "macb_tx_interrupt status = 0x%03lx\n",
 		(unsigned long)status);
 
@@ -736,6 +738,7 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
 			 * now.
 			 */
 			macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
+			macb_writel(bp, ISR, MACB_BIT(RCOMP));
 
 			if (napi_schedule_prep(&bp->napi)) {
 				netdev_vdbg(bp->dev, "scheduling RX softirq\n");
-- 
1.8.2.rc2
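
The patch above writes the completion bit back to ISR, which implies the
register is write-one-to-clear on this variant. A toy user-space model of
that behaviour is sketched below. It is not driver code: the model_* names
are invented, and the bit positions (RCOMP at bit 1, TCOMP at bit 7) are
assumptions made for the illustration rather than values shown in the diff.

```c
#include <stdint.h>

#define MODEL_RCOMP	(1u << 1)	/* assumed bit position */
#define MODEL_TCOMP	(1u << 7)	/* assumed bit position */

/* Pending interrupt flags of a modelled write-one-to-clear register. */
struct model_isr {
	uint32_t pending;
};

/* Writing a 1 clears that flag; writing a 0 leaves the flag untouched. */
static void model_isr_write(struct model_isr *isr, uint32_t value)
{
	isr->pending &= ~value;
}
```

In this model, writing only the TCOMP bit from the TX path leaves a pending
RCOMP intact for the RX path, which is the reason for writing a single bit
rather than the whole status word.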


* [PATCH v2 0/3] net/macb: fixes to use core on Zynq SoCs
From: Steffen Trumtrar @ 2013-03-28  9:07 UTC (permalink / raw)
  To: netdev; +Cc: Nicolas Ferre, Steffen Trumtrar

Hi!

The Cadence GEM is also licensed for the Xilinx Zynq7000 SoCs.
As Xilinx uses other reset defaults, some fixes are necessary to have it
working there. And as the Zynq is dualcore, the clk_enables/disables now need
to be atomic.

Changes in v2:
	- only 3/3 was changed to correctly use the atomic clk_[en|dis]able

Regards,
Steffen

Steffen Trumtrar (3):
  net/macb: clear tx/rx completion flags in ISR
  net/macb: force endian_swp_pkt_en to off
  net/macb: make clk_enable atomic

 drivers/net/ethernet/cadence/macb.c | 24 ++++++++++++++----------
 drivers/net/ethernet/cadence/macb.h |  2 ++
 2 files changed, 16 insertions(+), 10 deletions(-)

-- 
1.8.2.rc2


* Re: [PATCH net-next 00/19] IPVS optimizations, part 2
From: Simon Horman @ 2013-03-28  9:05 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: lvs-devel, netdev
In-Reply-To: <alpine.LFD.2.00.1303281052370.2010@ja.ssi.bg>

On Thu, Mar 28, 2013 at 10:57:12AM +0200, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Thu, 28 Mar 2013, Simon Horman wrote:
> 
> > On Fri, Mar 22, 2013 at 11:46:35AM +0200, Julian Anastasov wrote:
> > > 	This is the second patchset with IPVS optimizations.
> > > Now we convert the schedulers, dests and services to RCU.
> > > 
> > > 	All patches are for net-next based on the first
> > > patchset v3. The idea is for Simon to apply the patchset to the
> > > ipvs-next tree after a week or so of discussion and review.
> > > 
> > > 	The changes in this patchset eliminate global locks
> > > from packet processing by using RCU. There are more details
> > > in the patches.
> > > 
> > > 	After this patchset the situation is as follows:
> > > 
> > > - dests:
> > > 	- lookups under RCU lock allow ip_vs_dest_hold, used
> > > 	for binding dest to conn or to select dest by scheduler
> > > 	- dests are freed by dest_trash code long after grace period
> > > 
> > > - services:
> > > 	- no global read_lock
> > > 	- lookups under RCU lock allow scheduler to select
> > > 	dests under RCU lock
> > > 	- grace period implemented with IP_VS_WAIT_WHILE is
> > > 	gone allowing scheduler's dest selection and scheduler
> > > 	reconfiguration to run in parallel
> > > 
> > > - schedulers:
> > > 	- schedule method runs under RCU lock, needs _rcu
> > > 	if using svc->destinations, needs _bh suffix to locks
> > > 	because it can be called in LOCAL_OUT hook
> > > 	- when dest is added, the add_dest method is called
> > > 	instead of update_service
> > > 	- when dest is deleted, the del_dest method is called
> > > 	instead of update_service
> > > 	- when dest is updated, the upd_dest method is called
> > > 	instead of update_service
> > > 	- scheduler can hold dests in its state long after they are
> > > 	unlinked from svc, even without providing del_dest handler.
> > > 	But such dests must not be returned by the
> > > 	schedule method (needs IP_VS_DEST_F_AVAILABLE check)
> > > 	- sched_data must be freed after grace period and
> > > 	module exit should be delayed with synchronize_rcu
> > > 	to wait for all RCU read-side critical sections to
> > > 	complete
> > > 
> > > - BH:
> > > 	- we do not disable BHs in LOCAL_OUT and sync code anymore
> > > 	- _bh suffixes are added to all places that need them,
> > > 	except timer handlers
> > 
> > Hi Julian,
> > 
> > what is the status of this series?
> > 
> > N.B: This series is different to "[PATCH 00/15 v3] IPVS optimizations (repost)"
> 
> 	Yes, I posted 2 parts with optimizations. Part 1 is
> at v3 while part 2 is at its first version. Both series
> are ready for applying. BTW, Hans plans more tests in
> the next days, so maybe we will have some numbers for
> comparison.

Thanks, I'll apply these changes.
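
For reference, the scheduler rule quoted above (a scheduler may keep dests
in its own state after they are unlinked from the service, but must not
return them without the IP_VS_DEST_F_AVAILABLE check) can be sketched in
plain C as below. This is only a single-threaded illustration: struct dest,
DEST_F_AVAILABLE and pick_dest() are hypothetical stand-ins, the selection
policy (highest weight) is arbitrary, and the RCU read-side locking that the
real schedule method runs under is left out.

```c
#include <stddef.h>

/* Stand-in for IP_VS_DEST_F_AVAILABLE; the value is an assumption. */
#define DEST_F_AVAILABLE 0x0001u

/* Hypothetical, stripped-down destination entry. */
struct dest {
	unsigned int flags;
	int weight;
};

/*
 * Pick the highest-weight dest that is still available.  The scheduler's
 * private table may still hold dests that were already unlinked from the
 * service; those have the flag cleared and must be skipped.
 */
static struct dest *pick_dest(struct dest *tbl, size_t n)
{
	struct dest *best = NULL;

	for (size_t i = 0; i < n; i++) {
		struct dest *d = &tbl[i];

		if (!(d->flags & DEST_F_AVAILABLE))
			continue;	/* unlinked: never hand it out */
		if (d->weight > 0 && (!best || d->weight > best->weight))
			best = d;
	}
	return best;
}
```

With a table holding an unlinked dest of weight 10 and two available dests
of weight 3 and 5, pick_dest() returns the weight-5 entry; with no available
dest it returns NULL.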

^ permalink raw reply

* Re: [PATCH 00/15 v3] IPVS optimizations (repost)
From: Simon Horman @ 2013-03-28  9:04 UTC (permalink / raw)
  To: Pablo Neira Ayuso, David Miller
  Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
	Julian Anastasov
In-Reply-To: <1364449184-26672-1-git-send-email-horms@verge.net.au>

On Thu, Mar 28, 2013 at 02:39:29PM +0900, Simon Horman wrote:
> Hi Dave, Hi Pablo, Hi All,
> 
> This is a repost of an IPVS optimisations series by Julian Anastasov
> which has been acked by Hans Schillstrom.
> 
> I have tentatively applied them to the ipvs-next tree.
> However, the first patch of the series "net: add skb_dst_set_noref_force"
> touches core code and thus I believe it needs some review on netdev,
> this is the reason for me posting the series.
> 
> Assuming the patch is ok it would be easiest for me if
> it went through the ipvs-next tree. But if there is a preference
> for taking it through net-next, feel free.
> 
> What follows is Julian's cover-email for the series.
> And then git information. I am happy for Pablo to pull this
> if Dave is happy with that.

It seems that Dave has already acked the change.

http://marc.info/?l=linux-netdev&m=136294921002680&w=2

I'll tag ipvs-next and send a proper pull request.

> ======================================================================
> 
> Date: Thu, 21 Mar 2013 11:57:57 +0200
> From: Julian Anastasov <ja@ssi.bg>
> To: Simon Horman <horms@verge.net.au>
> Cc: lvs-devel@vger.kernel.org
> Subject: [PATCHv3 net-next 00/15] IPVS optimizations
> 
> 	This is a first patchset for IPVS optimizations.
> Another patchset will address the locking in schedulers
> and moving the global _bh disabling from LOCAL_OUT to all
> locks.
> 
> 	All patches are for net-next and Simon can
> take them for ipvs-next.
> 
> 	The changes in this patchset eliminate locks
> and dst refcnt operations from packet processing by
> using RCU. There are more details in the patches.
> 
> v3:
> * in "ipvs: consolidate all dst checks on transmit in one place"
>   preserve original skb dst even for local client, remove the
>   rt_is_input_route and skb->dev check. Call update_pmtu only for
>   local client by providing sk instead of skb.
> * in "ipvs: optimize dst usage for real server" use
>   rcu_dereference_protected for __ip_vs_dst_cache_reset instead of
>   rcu_dereference_raw. Use the new skb_dst_set_noref_force func.
> * in "ipvs: remove rs_lock by using RCU" prefer the port check in
>   ip_vs_has_real_service
> * "ipvs: convert locks used in persistence engines" needs only
>   synchronize_rcu, not rcu_barrier, we do not use rcu callbacks
> 
> v2:
> * use "net: add skb_dst_set_unref" instead of
>   "net: add dst_get_noref and refdst_ptr helpers"
> * add "ipvs: no need to reroute anymore on DNAT over loopback"
> * add "ipvs: do not use skb_share_check"
> * add "ipvs: consolidate all dst checks on transmit in one place", so
>   that we can avoid the refdst games in next patch
> * after "ipvs: consolidate all dst checks on transmit in one place"
>   "ipvs: optimize dst usage for real server" is simpler and
>   uses the new skb_dst_set_unref function
> * extend "ipvs: reorder keys in connection structure" with
>   changes in ip_vs_ct_in_get
> * fix "ipvs: avoid kmem_cache_zalloc in ip_vs_conn_new" to use new
>   function ip_vs_addr_set, so that we reset all address fields
>   that are used for hashing by hash_conntrack_raw
> 
> ======================================================================
> 
> The following changes since commit dece40e848f6e022f960dc9de54be518928460c3:
> 
>   netfilter: nf_conntrack: speed up module removal path if netns in use (2013-03-19 17:08:31 +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next.git master
> 
> for you to fetch changes up to e8a0eb703e38870118928384ffd5eeeb47e7e1ef:
> 
>   ipvs: avoid kmem_cache_zalloc in ip_vs_conn_new (2013-03-28 14:16:38 +0900)
> 
> ----------------------------------------------------------------
> Julian Anastasov (15):
>       net: add skb_dst_set_noref_force
>       ipvs: avoid routing by TOS for real server
>       ipvs: prefer NETDEV_DOWN event to free cached dsts
>       ipvs: convert the IP_VS_XMIT macros to functions
>       ipvs: rename functions related to dst_cache reset
>       ipvs: no need to reroute anymore on DNAT over loopback
>       ipvs: do not use skb_share_check
>       ipvs: consolidate all dst checks on transmit in one place
>       ipvs: optimize dst usage for real server
>       ipvs: convert app locks
>       ipvs: remove rs_lock by using RCU
>       ipvs: convert locks used in persistence engines
>       ipvs: convert connection locking
>       ipvs: reorder keys in connection structure
>       ipvs: avoid kmem_cache_zalloc in ip_vs_conn_new
> 
>  include/linux/skbuff.h                |   35 +-
>  include/net/ip_vs.h                   |   71 ++-
>  net/core/dst.c                        |    9 +-
>  net/netfilter/ipvs/ip_vs_app.c        |   27 +-
>  net/netfilter/ipvs/ip_vs_conn.c       |  271 +++++----
>  net/netfilter/ipvs/ip_vs_core.c       |   16 +-
>  net/netfilter/ipvs/ip_vs_ctl.c        |  143 +++--
>  net/netfilter/ipvs/ip_vs_ftp.c        |    2 +
>  net/netfilter/ipvs/ip_vs_pe.c         |   43 +-
>  net/netfilter/ipvs/ip_vs_pe_sip.c     |    1 +
>  net/netfilter/ipvs/ip_vs_proto_sctp.c |   18 +-
>  net/netfilter/ipvs/ip_vs_proto_tcp.c  |   18 +-
>  net/netfilter/ipvs/ip_vs_proto_udp.c  |   19 +-
>  net/netfilter/ipvs/ip_vs_xmit.c       | 1046 ++++++++++++++-------------------
>  14 files changed, 810 insertions(+), 909 deletions(-)

^ permalink raw reply

* [net-next 12/12] ixgbevf: Adjust to handle unassigned MAC address from PF
From: Jeff Kirsher @ 2013-03-28  9:00 UTC (permalink / raw)
  To: davem
  Cc: Greg Rose, netdev, gospo, sassmann, Andy Gospodarek,
	Stefan Assmann, Jeff Kirsher
In-Reply-To: <1364461215-7793-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Greg Rose <gregory.v.rose@intel.com>

If the administrator has not assigned a MAC address to the VF via the
PF then handle it gracefully by generating a temporary MAC address.
This ensures that we always know when we have a random address and
udev won't get upset about it.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 21 +++++++++++++++------
 drivers/net/ethernet/intel/ixgbevf/vf.c           |  7 ++++++-
 2 files changed, 21 insertions(+), 7 deletions(-)
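
A minimal userspace sketch of the fallback described above, for readers who
want to see the logic outside the driver. The helpers mac_is_zero(),
mac_is_valid(), mac_random() and vf_pick_mac() are hypothetical names; they
are meant to mirror what is_zero_ether_addr(), is_valid_ether_addr() and
eth_hw_addr_random() do in the diff below, and rand() is only a placeholder
entropy source.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

/* All six octets are zero: the PF has no address assigned for this VF. */
static int mac_is_zero(const uint8_t *a)
{
	static const uint8_t zero[ETH_ALEN];

	return memcmp(a, zero, ETH_ALEN) == 0;
}

/* Valid means unicast (low bit of first octet clear) and not all-zero. */
static int mac_is_valid(const uint8_t *a)
{
	return !(a[0] & 0x01) && !mac_is_zero(a);
}

/* Random unicast, locally administered address. */
static void mac_random(uint8_t *a)
{
	for (int i = 0; i < ETH_ALEN; i++)
		a[i] = (uint8_t)rand();
	a[0] &= 0xfe;	/* clear the multicast bit */
	a[0] |= 0x02;	/* set the locally administered bit */
}

/*
 * Copy the address reported by the PF; if it is unusable, replace it with
 * a random one.  Returns 1 if a random address was generated, 0 otherwise.
 */
static int vf_pick_mac(const uint8_t *pf_addr, uint8_t *dev_addr)
{
	memcpy(dev_addr, pf_addr, ETH_ALEN);
	if (mac_is_valid(dev_addr))
		return 0;
	mac_random(dev_addr);
	return 1;
}
```

The single validity check at the end covers both failure cases in the patch
(PF still in reset, or PF reporting an all-zero address), which is why the
random assignment moves out of the reset error branch.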

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 5563250..eeae934 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -2052,6 +2052,7 @@ static int ixgbevf_sw_init(struct ixgbevf_adapter *adapter)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
 	struct pci_dev *pdev = adapter->pdev;
+	struct net_device *netdev = adapter->netdev;
 	int err;
 
 	/* PCI config space info */
@@ -2071,18 +2072,26 @@ static int ixgbevf_sw_init(struct ixgbevf_adapter *adapter)
 	err = hw->mac.ops.reset_hw(hw);
 	if (err) {
 		dev_info(&pdev->dev,
-		         "PF still in reset state, assigning new address\n");
-		eth_hw_addr_random(adapter->netdev);
-		memcpy(adapter->hw.mac.addr, adapter->netdev->dev_addr,
-			adapter->netdev->addr_len);
+			 "PF still in reset state.  Is the PF interface up?\n");
 	} else {
 		err = hw->mac.ops.init_hw(hw);
 		if (err) {
 			pr_err("init_shared_code failed: %d\n", err);
 			goto out;
 		}
-		memcpy(adapter->netdev->dev_addr, adapter->hw.mac.addr,
-		       adapter->netdev->addr_len);
+		err = hw->mac.ops.get_mac_addr(hw, hw->mac.addr);
+		if (err)
+			dev_info(&pdev->dev, "Error reading MAC address\n");
+		else if (is_zero_ether_addr(adapter->hw.mac.addr))
+			dev_info(&pdev->dev,
+				 "MAC address not assigned by administrator.\n");
+		memcpy(netdev->dev_addr, hw->mac.addr, netdev->addr_len);
+	}
+
+	if (!is_valid_ether_addr(netdev->dev_addr)) {
+		dev_info(&pdev->dev, "Assigning random MAC address\n");
+		eth_hw_addr_random(netdev);
+		memcpy(hw->mac.addr, netdev->dev_addr, netdev->addr_len);
 	}
 
 	/* lock to protect mailbox accesses */
diff --git a/drivers/net/ethernet/intel/ixgbevf/vf.c b/drivers/net/ethernet/intel/ixgbevf/vf.c
index 0c94557..387b526 100644
--- a/drivers/net/ethernet/intel/ixgbevf/vf.c
+++ b/drivers/net/ethernet/intel/ixgbevf/vf.c
@@ -109,7 +109,12 @@ static s32 ixgbevf_reset_hw_vf(struct ixgbe_hw *hw)
 	if (ret_val)
 		return ret_val;
 
-	if (msgbuf[0] != (IXGBE_VF_RESET | IXGBE_VT_MSGTYPE_ACK))
+	/* New versions of the PF may NACK the reset return message
+	 * to indicate that no MAC address has yet been assigned for
+	 * the VF.
+	 */
+	if (msgbuf[0] != (IXGBE_VF_RESET | IXGBE_VT_MSGTYPE_ACK) &&
+	    msgbuf[0] != (IXGBE_VF_RESET | IXGBE_VT_MSGTYPE_NACK))
 		return IXGBE_ERR_INVALID_MAC_ADDR;
 
 	memcpy(hw->mac.perm_addr, addr, ETH_ALEN);
-- 
1.7.11.7

^ permalink raw reply related

* [net-next 11/12] ixgbe: Don't give VFs random MAC addresses
From: Jeff Kirsher @ 2013-03-28  9:00 UTC (permalink / raw)
  To: davem
  Cc: Greg Rose, netdev, gospo, sassmann, Andy Gospodarek,
	Stefan Assmann, Jeff Kirsher
In-Reply-To: <1364461215-7793-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Greg Rose <gregory.v.rose@intel.com>

If the user has not assigned a MAC address to a VM, then don't give it a
random one. Instead, just give it zeros and let it figure out what to do
with them.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)
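
The handshake this patch sets up with the VF driver can be sketched in
userspace C as follows. The names pf_reset_reply() and vf_reset_reply_ok()
are hypothetical, and the numeric values given to VF_RESET, MSGTYPE_ACK and
MSGTYPE_NACK are assumptions made for illustration; only the ACK/NACK
decision itself is taken from the diff below and from the matching ixgbevf
change in 12/12.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Illustrative stand-ins for IXGBE_VF_RESET and IXGBE_VT_MSGTYPE_*. */
#define VF_RESET	0x01u
#define MSGTYPE_ACK	0x80000000u
#define MSGTYPE_NACK	0x40000000u

static int mac_is_zero(const uint8_t *a)
{
	static const uint8_t zero[ETH_ALEN];

	return memcmp(a, zero, ETH_ALEN) == 0;
}

/*
 * PF side: build word 0 of the reset reply.  ACK and copy the address if
 * one was assigned; otherwise NACK and leave reply_addr untouched.
 */
static uint32_t pf_reset_reply(const uint8_t *vf_mac, uint8_t *reply_addr)
{
	uint32_t word0 = VF_RESET;

	if (!mac_is_zero(vf_mac)) {
		word0 |= MSGTYPE_ACK;
		memcpy(reply_addr, vf_mac, ETH_ALEN);
	} else {
		word0 |= MSGTYPE_NACK;
	}
	return word0;
}

/* VF side: both ACK and NACK mean the reset itself completed. */
static int vf_reset_reply_ok(uint32_t word0)
{
	return word0 == (VF_RESET | MSGTYPE_ACK) ||
	       word0 == (VF_RESET | MSGTYPE_NACK);
}
```

A NACK here does not signal a failed reset; it only tells the VF that no MAC
address has been assigned yet, so the VF has to pick one itself.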

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index d44b4d2..b3e6530 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -661,13 +661,7 @@ int ixgbe_vf_configuration(struct pci_dev *pdev, unsigned int event_mask)
 	bool enable = ((event_mask & 0x10000000U) != 0);
 
 	if (enable) {
-		eth_random_addr(vf_mac_addr);
-		e_info(probe, "IOV: VF %d is enabled MAC %pM\n",
-		       vfn, vf_mac_addr);
-		/*
-		 * Store away the VF "permananet" MAC address, it will ask
-		 * for it later.
-		 */
+		eth_zero_addr(vf_mac_addr);
 		memcpy(adapter->vfinfo[vfn].vf_mac_addresses, vf_mac_addr, 6);
 	}
 
@@ -688,7 +682,8 @@ static int ixgbe_vf_reset_msg(struct ixgbe_adapter *adapter, u32 vf)
 	ixgbe_vf_reset_event(adapter, vf);
 
 	/* set vf mac address */
-	ixgbe_set_vf_mac(adapter, vf, vf_mac);
+	if (!is_zero_ether_addr(vf_mac))
+		ixgbe_set_vf_mac(adapter, vf, vf_mac);
 
 	vf_shift = vf % 32;
 	reg_offset = vf / 32;
@@ -729,8 +724,16 @@ static int ixgbe_vf_reset_msg(struct ixgbe_adapter *adapter, u32 vf)
 	IXGBE_WRITE_REG(hw, IXGBE_VMECM(reg_offset), reg);
 
 	/* reply to reset with ack and vf mac address */
-	msgbuf[0] = IXGBE_VF_RESET | IXGBE_VT_MSGTYPE_ACK;
-	memcpy(addr, vf_mac, ETH_ALEN);
+	msgbuf[0] = IXGBE_VF_RESET;
+	if (!is_zero_ether_addr(vf_mac)) {
+		msgbuf[0] |= IXGBE_VT_MSGTYPE_ACK;
+		memcpy(addr, vf_mac, ETH_ALEN);
+	} else {
+		msgbuf[0] |= IXGBE_VT_MSGTYPE_NACK;
+		dev_warn(&adapter->pdev->dev,
+			 "VF %d has no MAC address assigned, you may have to assign one manually\n",
+			 vf);
+	}
 
 	/*
 	 * Piggyback the multicast filter type so VF can compute the
-- 
1.7.11.7

^ permalink raw reply related

* [net-next 10/12] e1000e: fix scheduling while atomic bugs
From: Jeff Kirsher @ 2013-03-28  9:00 UTC (permalink / raw)
  To: davem; +Cc: Bruce Allan, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1364461215-7793-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Bruce Allan <bruce.w.allan@intel.com>

The previous commit ce43a2168c59bc47b5f0c1825fd5f9a2a9e3b447 (e1000e:
cleanup USLEEP_RANGE checkpatch checks) converted a number of delays and
sleeps as recommended in ./Documentation/timers/timers-howto.txt.
Unfortunately, a few of the udelay() to usleep_range() conversions are in
code paths that are in an atomic context in which usleep_range() should
not be used.  Revert those specific changes.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/phy.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/phy.c b/drivers/net/ethernet/intel/e1000e/phy.c
index cbb310f..59c76a6 100644
--- a/drivers/net/ethernet/intel/e1000e/phy.c
+++ b/drivers/net/ethernet/intel/e1000e/phy.c
@@ -165,7 +165,7 @@ s32 e1000e_read_phy_reg_mdic(struct e1000_hw *hw, u32 offset, u16 *data)
 	 * the lower time out
 	 */
 	for (i = 0; i < (E1000_GEN_POLL_TIMEOUT * 3); i++) {
-		usleep_range(50, 100);
+		udelay(50);
 		mdic = er32(MDIC);
 		if (mdic & E1000_MDIC_READY)
 			break;
@@ -190,7 +190,7 @@ s32 e1000e_read_phy_reg_mdic(struct e1000_hw *hw, u32 offset, u16 *data)
 	 * reading duplicate data in the next MDIC transaction.
 	 */
 	if (hw->mac.type == e1000_pch2lan)
-		usleep_range(100, 200);
+		udelay(100);
 
 	return 0;
 }
@@ -229,7 +229,7 @@ s32 e1000e_write_phy_reg_mdic(struct e1000_hw *hw, u32 offset, u16 data)
 	 * the lower time out
 	 */
 	for (i = 0; i < (E1000_GEN_POLL_TIMEOUT * 3); i++) {
-		usleep_range(50, 100);
+		udelay(50);
 		mdic = er32(MDIC);
 		if (mdic & E1000_MDIC_READY)
 			break;
@@ -253,7 +253,7 @@ s32 e1000e_write_phy_reg_mdic(struct e1000_hw *hw, u32 offset, u16 data)
 	 * reading duplicate data in the next MDIC transaction.
 	 */
 	if (hw->mac.type == e1000_pch2lan)
-		usleep_range(100, 200);
+		udelay(100);
 
 	return 0;
 }
-- 
1.7.11.7

^ permalink raw reply related

* [net-next 09/12] e1000e: increase driver version number
From: Jeff Kirsher @ 2013-03-28  9:00 UTC (permalink / raw)
  To: davem; +Cc: Bruce Allan, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1364461215-7793-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Bruce Allan <bruce.w.allan@intel.com>

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 0459fe3..858d2a3 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -55,7 +55,7 @@
 
 #define DRV_EXTRAVERSION "-k"
 
-#define DRV_VERSION "2.2.14" DRV_EXTRAVERSION
+#define DRV_VERSION "2.3.2" DRV_EXTRAVERSION
 char e1000e_driver_name[] = "e1000e";
 const char e1000e_driver_version[] = DRV_VERSION;
 
-- 
1.7.11.7

^ permalink raw reply related

* [net-next 08/12] e1000e: cleanup unused defines
From: Jeff Kirsher @ 2013-03-28  9:00 UTC (permalink / raw)
  To: davem; +Cc: Bruce Allan, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1364461215-7793-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Bruce Allan <bruce.w.allan@intel.com>

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/ich8lan.h | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.h b/drivers/net/ethernet/intel/e1000e/ich8lan.h
index 21d21b9..80034a2 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.h
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.h
@@ -250,13 +250,6 @@
 /* Proprietary Latency Tolerance Reporting PCI Capability */
 #define E1000_PCI_LTR_CAP_LPT		0xA8
 
-/* OBFF Control & Threshold Defines */
-#define E1000_SVCR_OFF_EN		0x00000001
-#define E1000_SVCR_OFF_MASKINT		0x00001000
-#define E1000_SVCR_OFF_TIMER_MASK	0xFFFF0000
-#define E1000_SVCR_OFF_TIMER_SHIFT	16
-#define E1000_SVT_OFF_HWM_MASK		0x0000001F
-
 void e1000e_write_protect_nvm_ich8lan(struct e1000_hw *hw);
 void e1000e_set_kmrn_lock_loss_workaround_ich8lan(struct e1000_hw *hw,
 						  bool state);
-- 
1.7.11.7

^ permalink raw reply related

* [net-next 07/12] e1000e: add support for LTR on I217/I218
From: Jeff Kirsher @ 2013-03-28  9:00 UTC (permalink / raw)
  To: davem; +Cc: Bruce Allan, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1364461215-7793-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Bruce Allan <bruce.w.allan@intel.com>

Set the Latency Tolerance Reporting (LTR) values for the "PCIe-like"
GbE MAC in the Lynx Point PCH based on Rx buffer size and link speed
when link is up (which must not exceed the maximum latency supported
by the platform), otherwise specify there is no LTR requirement.
Unlike true-PCIe devices which set the LTR maximum snoop/no-snoop
latencies in the LTR Extended Capability Structure in the PCIe Extended
Capability register set, on this device LTR is set by writing the
equivalent snoop/no-snoop latencies in the LTRV register in the MAC and
setting the SEND bit to send an Intel On-chip System Fabric sideband
(IOSF-SB) message to the PMC.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 97 +++++++++++++++++++++++++++++
 1 file changed, 97 insertions(+)
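
The latency calculation and encoding described above can be tried out in
userspace with the sketch below. It assumes a 10-bit value field, a scale
field starting at bit 10 and a maximum scale of 5, as stated in the comment
in the diff; ltr_latency_ns() and ltr_encode() are hypothetical names, and
the register write and the clamping against the platform maximum are left
out.

```c
#include <stdint.h>

#define LTR_VALUE_MASK	0x3FF	/* 10-bit value field */
#define LTR_SCALE_SHIFT	10	/* scale field starts at bit 10 */
#define LTR_SCALE_MAX	5	/* scales 0..5 are valid */

/*
 * Time (ns) to fill the Rx packet buffer, less room for two maximum-sized
 * frames, at the given link speed in Mb/s.  The caller must pass a
 * non-zero speed.
 */
static int64_t ltr_latency_ns(uint32_t rxa_kb, uint32_t max_frame,
			      uint16_t speed_mbps)
{
	int64_t lat = ((int64_t)rxa_kb * 1024 - 2 * (int64_t)max_frame) *
		      8 * 1000;

	return lat < 0 ? 0 : lat / speed_mbps;
}

/*
 * Encode lat_ns as (scale << 10) | value, where the latency represented
 * is value * 2^(5 * scale) ns.  Returns 0 on success, -1 if the latency
 * needs a scale above LTR_SCALE_MAX.
 */
static int ltr_encode(int64_t lat_ns, uint16_t *enc)
{
	unsigned int scale = 0;
	int64_t value = lat_ns < 0 ? 0 : lat_ns;

	while (value > LTR_VALUE_MASK) {
		scale++;
		/* divide by 2^5, rounding up */
		value = value / 32 + (value % 32 != 0);
	}
	if (scale > LTR_SCALE_MAX)
		return -1;

	*enc = (uint16_t)((scale << LTR_SCALE_SHIFT) | (unsigned int)value);
	return 0;
}
```

For example, with a 26 KB Rx buffer, a 1522-byte maximum frame and a
1000 Mb/s link the latency works out to 188640 ns, which encodes as scale 2
with value 185 (0x8B9). A latency of 1024 ns no longer fits in 10 bits and
encodes as scale 1 with value 32 (0x420).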

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index 56c4935..ad9d8f2 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -839,6 +839,94 @@ release:
 }
 
 /**
+ *  e1000_platform_pm_pch_lpt - Set platform power management values
+ *  @hw: pointer to the HW structure
+ *  @link: bool indicating link status
+ *
+ *  Set the Latency Tolerance Reporting (LTR) values for the "PCIe-like"
+ *  GbE MAC in the Lynx Point PCH based on Rx buffer size and link speed
+ *  when link is up (which must not exceed the maximum latency supported
+ *  by the platform), otherwise specify there is no LTR requirement.
+ *  Unlike true-PCIe devices which set the LTR maximum snoop/no-snoop
+ *  latencies in the LTR Extended Capability Structure in the PCIe Extended
+ *  Capability register set, on this device LTR is set by writing the
+ *  equivalent snoop/no-snoop latencies in the LTRV register in the MAC and
+ *  set the SEND bit to send an Intel On-chip System Fabric sideband (IOSF-SB)
+ *  message to the PMC.
+ **/
+static s32 e1000_platform_pm_pch_lpt(struct e1000_hw *hw, bool link)
+{
+	u32 reg = link << (E1000_LTRV_REQ_SHIFT + E1000_LTRV_NOSNOOP_SHIFT) |
+	    link << E1000_LTRV_REQ_SHIFT | E1000_LTRV_SEND;
+	u16 lat_enc = 0;	/* latency encoded */
+
+	if (link) {
+		u16 speed, duplex, scale = 0;
+		u16 max_snoop, max_nosnoop;
+		u16 max_ltr_enc;	/* max LTR latency encoded */
+		s64 lat_ns;	/* latency (ns) */
+		s64 value;
+		u32 rxa;
+
+		if (!hw->adapter->max_frame_size) {
+			e_dbg("max_frame_size not set.\n");
+			return -E1000_ERR_CONFIG;
+		}
+
+		hw->mac.ops.get_link_up_info(hw, &speed, &duplex);
+		if (!speed) {
+			e_dbg("Speed not set.\n");
+			return -E1000_ERR_CONFIG;
+		}
+
+		/* Rx Packet Buffer Allocation size (KB) */
+		rxa = er32(PBA) & E1000_PBA_RXA_MASK;
+
+		/* Determine the maximum latency tolerated by the device.
+		 *
+		 * Per the PCIe spec, the tolerated latencies are encoded as
+		 * a 3-bit encoded scale (only 0-5 are valid) multiplied by
+		 * a 10-bit value (0-1023) to provide a range from 1 ns to
+		 * 2^25*(2^10-1) ns.  The scale is encoded as 0=2^0ns,
+		 * 1=2^5ns, 2=2^10ns,...5=2^25ns.
+		 */
+		lat_ns = ((s64)rxa * 1024 -
+			  (2 * (s64)hw->adapter->max_frame_size)) * 8 * 1000;
+		if (lat_ns < 0)
+			lat_ns = 0;
+		else
+			do_div(lat_ns, speed);
+
+		value = lat_ns;
+		while (value > PCI_LTR_VALUE_MASK) {
+			scale++;
+			value = DIV_ROUND_UP(value, (1 << 5));
+		}
+		if (scale > E1000_LTRV_SCALE_MAX) {
+			e_dbg("Invalid LTR latency scale %d\n", scale);
+			return -E1000_ERR_CONFIG;
+		}
+		lat_enc = (u16)((scale << PCI_LTR_SCALE_SHIFT) | value);
+
+		/* Determine the maximum latency tolerated by the platform */
+		pci_read_config_word(hw->adapter->pdev, E1000_PCI_LTR_CAP_LPT,
+				     &max_snoop);
+		pci_read_config_word(hw->adapter->pdev,
+				     E1000_PCI_LTR_CAP_LPT + 2, &max_nosnoop);
+		max_ltr_enc = max_t(u16, max_snoop, max_nosnoop);
+
+		if (lat_enc > max_ltr_enc)
+			lat_enc = max_ltr_enc;
+	}
+
+	/* Set Snoop and No-Snoop latencies the same */
+	reg |= lat_enc | (lat_enc << E1000_LTRV_NOSNOOP_SHIFT);
+	ew32(LTRV, reg);
+
+	return 0;
+}
+
+/**
  *  e1000_check_for_copper_link_ich8lan - Check for link (Copper)
  *  @hw: pointer to the HW structure
  *
@@ -911,6 +999,15 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
 			return ret_val;
 	}
 
+	if (hw->mac.type == e1000_pch_lpt) {
+		/* Set platform power management values for
+		 * Latency Tolerance Reporting (LTR)
+		 */
+		ret_val = e1000_platform_pm_pch_lpt(hw, link);
+		if (ret_val)
+			return ret_val;
+	}
+
 	/* Clear link partner's EEE ability */
 	hw->dev_spec.ich8lan.eee_lp_ability = 0;
 
-- 
1.7.11.7

^ permalink raw reply related

* [net-next 06/12] e1000e: enable EEE by default
From: Jeff Kirsher @ 2013-03-28  9:00 UTC (permalink / raw)
  To: davem; +Cc: Bruce Allan, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1364461215-7793-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Bruce Allan <bruce.w.allan@intel.com>

Now that IEEE802.3az-2010 Energy Efficient Ethernet has been approved as a
standard (September 2010) and the driver can enable and disable it via
ethtool, enable the feature by default on parts which support it.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/e1000e/ich8lan.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index 174507c..56c4935 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -1034,10 +1034,6 @@ static s32 e1000_get_variants_ich8lan(struct e1000_adapter *adapter)
 	    (er32(FWSM) & E1000_ICH_FWSM_FW_VALID))
 		adapter->flags2 |= FLAG2_PCIM2PCI_ARBITER_WA;
 
-	/* Disable EEE by default until IEEE802.3az spec is finalized */
-	if (adapter->flags2 & FLAG2_HAS_EEE)
-		adapter->hw.dev_spec.ich8lan.eee_disable = true;
-
 	return 0;
 }
 
-- 
1.7.11.7

^ permalink raw reply related
