* Re: [PATCH] bluetooth: use PTR_RET instead of IS_ERR + PTR_ERR
From: Silviu Popescu @ 2013-03-18 18:05 UTC (permalink / raw)
  To: David Miller
  Cc: linux-bluetooth-u79uwXL29TY76Z2rM5mHXA,
	marcel-kz+m5ild9QBg9hUCZPvPmw, gustavo-THi1TnShQwVAfugRpC6u6w,
	johan.hedberg-Re5JQEeQqe8AvxtiuMwx3w,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20130313.053145.2200448840921851390.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On Wed, Mar 13, 2013 at 11:31 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> From: Silviu-Mihai Popescu <silviupopescu1990-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Date: Tue, 12 Mar 2013 20:13:15 +0200
>
>> @@ -590,10 +590,7 @@ int __init bt_sysfs_init(void)
>>       bt_debugfs = debugfs_create_dir("bluetooth", NULL);
>>
>>       bt_class = class_create(THIS_MODULE, "bluetooth");
>> -     if (IS_ERR(bt_class))
>> -             return PTR_ERR(bt_class);
>> -
>> -     return 0;
>> +     return PTR_RET(bt_class)
>
> Don't bother submitting patches you aren't even going to try
> to compile.
>
> I'm rejecting all of your current submissions.  Resubmit them
> when you feel like typing 'make' from time to time.
>
>

Sorry for the trouble caused and sorry for the late reply.
That being said, I'd like to understand a bit better what exactly I messed up.
I've just pulled the latest revision of the mainline kernel and made
the changes in this patch.
I've tried with make defconfig (which would be x86_64_defconfig in my
case), followed by
make menuconfig to select the bluetooth options and make allyesconfig.
Both defconfig and allyesconfig compile successfully on my system.
Would you be so kind as to tell me what error you have encountered?
Or perhaps enlighten me as to what I'm still doing wrong. I'd like to
learn from my mistakes.

Thanks,
Silviu Popescu
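
For readers following the thread, here is a minimal userspace sketch of what the helpers in the quoted hunk do. It is modelled on the idea behind the kernel's IS_ERR(), PTR_ERR() and PTR_RET(), but every `sketch_` name, the `SKETCH_MAX_ERRNO` constant and the pointer encoding are re-implemented here for illustration and are assumptions, not the kernel's definitions. Note that the added `return PTR_RET(bt_class)` line, as quoted above, carries no terminating semicolon, which a C compiler would reject.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: the top 4095 pointer values encode -errno. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if ptr falls in the range reserved for encoded errors. */
static inline int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the negative errno value from an error pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Combined form: the error code for an error pointer, otherwise 0. */
static inline int sketch_ptr_ret(const void *ptr)
{
	if (sketch_is_err(ptr))
		return (int)sketch_ptr_err(ptr);
	return 0;
}

/* The two-step form that the hunk removes. */
static int init_two_step(void *cls)
{
	if (sketch_is_err(cls))
		return (int)sketch_ptr_err(cls);

	return 0;
}

/* The one-line form that the hunk switches to, with its semicolon. */
static int init_one_line(void *cls)
{
	return sketch_ptr_ret(cls);
}
```

Under these assumptions both forms return 0 for a valid pointer and the negative errno value for an encoded error, so the change is purely cosmetic once it compiles.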


* [PATCH can-next] can: dump stack on protocol bugs
From: Oliver Hartkopp @ 2013-03-18 17:52 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: Linux Netdev List

The rework of the kernel hlist implementation "hlist: drop the node parameter
from iterators" (b67bfe0d42cac56c512dd5da4b1b347a23f4b70a) created some
fallout in the form of non-matching comments and obsolete code.

In addition to the cleanup, this patch adds a WARN() statement to catch the
caller of the wrong filter removal request.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>

---

diff --git a/net/can/af_can.c b/net/can/af_can.c
index 8bacf28..c4e5085 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -546,16 +546,13 @@ void can_rx_unregister(struct net_device *dev, canid_t can_id, canid_t mask,
 	}
 
 	/*
-	 * Check for bugs in CAN protocol implementations:
-	 * If no matching list item was found, the list cursor variable next
-	 * will be NULL, while r will point to the last item of the list.
+	 * Check for bugs in CAN protocol implementations using af_can.c:
+	 * 'r' will be NULL if no matching list item was found for removal.
 	 */
 
 	if (!r) {
-		pr_err("BUG: receive list entry not found for "
-		       "dev %s, id %03X, mask %03X\n",
-		       DNAME(dev), can_id, mask);
-		r = NULL;
+		WARN(1, "BUG: receive list entry not found for dev %s, "
+		     "id %03X, mask %03X\n", DNAME(dev), can_id, mask);
 		goto out;
 	}
 


* Re: [PATCH] libertas: drop maintainership
From: Dan Williams @ 2013-03-18 17:51 UTC (permalink / raw)
  To: Joe Perches
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA, Daniel Drake, Bing Zhao
In-Reply-To: <1363627392.2074.21.camel@joe-AO722>

On Mon, 2013-03-18 at 10:23 -0700, Joe Perches wrote:
> On Mon, 2013-03-18 at 11:48 -0500, Dan Williams wrote:
> > Would be better maintained by somebody who actually has time for it.
> []
> > diff --git a/MAINTAINERS b/MAINTAINERS
> []
> > -MARVELL LIBERTAS WIRELESS DRIVER
> > -M:	Dan Williams <dcbw-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > -L:	libertas-dev-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
> > -S:	Maintained
> > -F:	drivers/net/wireless/libertas/
> 
> I think it better to mark it as Orphan
> and maybe leave the list.
> 
> Maybe:
> 
> MARVELL LIBERTAS WIRELESS DRIVER
> L:	libertas-dev-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
> S:	Orphan
> F:	drivers/net/wireless/libertas/
> 
> or
> 
> MARVELL LIBERTAS WIRELESS DRIVER
> S:	Orphan
> F:	drivers/net/wireless/libertas/

I can do that; I wasn't quite sure how to do this.  A quick check showed
patches that did what mine did, and oddly MAINTAINERS has no section for
dropping maintainership that I could quickly find.  If this is what
others prefer I'm happy to resubmit?

Dan



* Re: [PATCH V3 4/7] PHYLIB: queue work on any cpu
From: David Miller @ 2013-03-18 17:33 UTC (permalink / raw)
  To: viresh.kumar
  Cc: pjt, paul.mckenney, tglx, tj, suresh.b.siddha, venki, mingo,
	peterz, rostedt, linaro-kernel, robin.randhawa, Steve.Bannister,
	Liviu.Dudau, charles.garcia-tobin, Arvind.Chauhan, linux-rt-users,
	linux-kernel, netdev
In-Reply-To: <9a366f17b93a5e18777360481c94e6db763b45b7.1363617402.git.viresh.kumar@linaro.org>

From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Mon, 18 Mar 2013 20:53:26 +0530

> Phylib uses workqueues for multiple purposes. There is no real dependency of
> scheduling these on the cpu which scheduled them.
> 
> On an idle system, it is observed that an idle cpu wakes up many times just to
> service this work. It would be better if we can schedule it on a cpu which isn't
> idle to save on power.
> 
> By idle cpu (from scheduler's perspective) we mean:
> - Current task is idle task
> - nr_running == 0
> - wake_list is empty
> 
> This patch replaces the schedule_work() and schedule_delayed_work() routines
> with their queue_[delayed_]work_on_any_cpu() siblings with system_wq as
> parameter.
> 
> These routines would look for the closest (via scheduling domains) non-idle cpu
> (non-idle from scheduler's perspective). If the current cpu is not idle or all
> cpus are idle, work will be scheduled on local cpu.
> 
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: netdev@vger.kernel.org
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

This will need to be applied to whatever tree adds these new interfaces,
and for that:

Acked-by: David S. Miller <davem@davemloft.net>


* Re: [PATCH] tcp: dont handle MTU reduction on LISTEN socket
From: David Miller @ 2013-03-18 17:32 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, dormando
In-Reply-To: <1363626088.29475.155.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 18 Mar 2013 10:01:28 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> When an ICMP ICMP_FRAG_NEEDED (or ICMPV6_PKT_TOOBIG) message finds a
> LISTEN socket, and this socket is currently owned by the user, we
> set TCP_MTU_REDUCED_DEFERRED flag in listener tsq_flags.
> 
> This is bad because if we clone the parent before it had a chance to
> clear the flag, the child inherits the tsq_flags value, and next
> tcp_release_cb() on the child will decrement sk_refcnt.
> 
> Result is that we might free a live TCP socket, as reported by
> Dormando.
> 
> IPv4: Attempt to release TCP socket in state 1
> 
> Fix this issue by testing sk_state against TCP_LISTEN early, so that we
> set TCP_MTU_REDUCED_DEFERRED on appropriate sockets (not a LISTEN one)
> 
> This bug was introduced in commit 563d34d05786
> (tcp: dont drop MTU reduction indications)
> 
> Reported-by: dormando <dormando@rydia.net>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable, thanks.


* Re: igb_poll - device driver failed to check map error
From: Alexander Duyck @ 2013-03-18 17:29 UTC (permalink / raw)
  To: christoph.paasch
  Cc: Alexander Duyck, Jeff Kirsher, Jesse Brandeburg, Bruce Allan,
	Eric Dumazet, netdev
In-Reply-To: <3729150.HPUjKjXiGc@cpaasch-mac>

On 03/16/2013 04:07 AM, Christoph Paasch wrote:
> On Friday 15 March 2013 16:08:31 Alexander Duyck wrote:
>> On 03/15/2013 12:52 AM, Christoph Paasch wrote:
>>> On Thursday 14 March 2013 19:18:18 Alexander Duyck wrote:
>>>> On 03/12/2013 02:31 AM, Christoph Paasch wrote:
>>>>> Hello,
>>>>>
>>>>> I'm seeing a warning while booting my machine when DMA_API_DEBUG is set:
>>>>>
>>>>> [   36.402824] ------------[ cut here ]------------
>>>>> [   36.458070] WARNING: at
>>>>> /home/cpaasch/builder/net-next/lib/dma-debug.c:934
>>>>> check_unmap+0x648/0x702()
>>>>> [   36.567377] Hardware name: ProLiant DL165 G7
>>>>> [   36.618452] igb 0000:04:00.0: DMA-API: device driver failed to check
>>>>> map
>>>>> error[device address=0x0000000233d9b232] [size=154 bytes] [mapped as
>>>>> single] [   36.776640] Modules linked in:
>>>>> [   36.815446] Pid: 0, comm: swapper/7 Not tainted 3.9.0-rc1-mptcp+ #101
>>>>> [   36.892515] Call Trace:
>>>>> [   36.921745]  <IRQ>  [<ffffffff8102ad7f>]
>>>>> warn_slowpath_common+0x80/0x9a
>>>>> [   37.001023]  [<ffffffff8102ae2d>] warn_slowpath_fmt+0x41/0x43
>>>>> [   37.069771]  [<ffffffff811db17f>] check_unmap+0x648/0x702
>>>>> [   37.134363]  [<ffffffff811db3e9>] debug_dma_unmap_page+0x50/0x52
>>>>> [   37.206234]  [<ffffffff8136676a>] igb_poll+0x144/0xf7c
>>>>> [   37.267706]  [<ffffffff8104dd19>] ? sched_clock_cpu+0x46/0xd1
>>>>> [   37.336456]  [<ffffffff814458ce>] net_rx_action+0xa7/0x1d0
>>>>> [   37.402085]  [<ffffffff81030b65>] __do_softirq+0xb4/0x16f
>>>>> [   37.466673]  [<ffffffff81030c90>] irq_exit+0x40/0x87
>>>>> [   37.526067]  [<ffffffff81002db1>] do_IRQ+0x98/0xaf
>>>>> [   37.583378]  [<ffffffff815210aa>] common_interrupt+0x6a/0x6a
>>>>> [   37.651086]  <EOI>  [<ffffffff8105d4be>] ?
>>>>> __tick_nohz_idle_enter+0x116/0x31f
>>>>> [   37.736595]  [<ffffffff81008a04>] ? default_idle+0x24/0x39
>>>>> [   37.802224]  [<ffffffff81008c62>] cpu_idle+0x68/0xa4
>>>>> [   37.861616]  [<ffffffff81519f78>] start_secondary+0x1a9/0x1ad
>>>>> [   37.930364] ---[ end trace 01b5bb0fd75a464c ]---
>>>>>
>>>>>
>>>>> It happens shortly after mounting the NFS-root filesystem.
>>>>>
>>>>> I tried to understand what is going on, but I am now at my wit's end.
>>>>>
>>>>> By adding some print-statements, here is what I found out (not sure if
>>>>> this is anyhow helpful):
>>>>>
>>>>> The difference between tx_buffer->time_stamp and the current 'jiffies'
>>>>> is
>>>>> up to 2000 jiffies (HZ==1000) at the first time the above warning
>>>>> happens
>>>>> (this seems too much for me). From then on, I see my print 3-4 times
>>>>> appear but without such a big difference between the timestamps
>>>>> (difference around 1 and 2 jiffies).
>>>>>
>>>>> Some other stuff, I printed:
>>>>> tx_buffer->skb: ffff880235054c80
>>>>> tx_buffer->bytecount: 154
>>>>> tx_buffer->gso_segs: 1
>>>>> tx_buffer->protocol: 8
>>>>> tx_buffer->tx_flags 0x20
>>>>>
>>>>>
>>>>> One last thing:
>>>>> Am I right that after each call to dma_map_single/page a call to
>>>>> dma_mapping_error is needed? If that's the case, I have some patches
>>>>> that
>>>>> add this statement at missing places in the e1000, e1000e and ixgb
>>>>> driver. But these patches do not fix my above problem.
>>>>>
>>>>>
>>>>> Thanks for your help,
>>>>> Christoph
>>>> Christoph,
>>>>
>>>> One thing that might be useful would be to reproduce this with a
>>>> standard 3.9-rc kernel instead of one using the multipath TCP patches.
>>>> This will help us to verify that the issue is reproducible with a stock
>>>> kernel and is not related to any ongoing work you may have only in your
>>>> tree.
>>> Hello,
>>>
>>> this is on a clean net-next kernel without any MPTCP-code.
>>>
>>> I bisected it down to  787314c35fbb (Merge tag 'iommu-updates-v3.8' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu), which simply
>>> introduces the debug_dma_mapping_error-checks.
>>>
>>> Am I right with the missing calls to dma_mapping_error in e1000, e1000e
>>> and
>>> ixgb?
>>>
>>> Cheers,
>>> Christoph
>> Christoph,
>>
>> The cause of the issue you are seeing may be due to the fact that the
>> buffer triggering the error is being reused.  I was able to reproduce
>> this issue occasionally with pktgen if I cloned the skb.  What may be
>> happening is that the buffer is being mapped in the transmit path on one
>> CPU while on another CPU the buffer is being cleaned.  Since the output
>> of each mapping is the physical address there is nothing to make each
>> mapping unique and I suspect this is resulting in false hits.
>>
>> You should be able to verify this if you were to check the skb->users
>> count as well as the dataref value in the skb_shared_info.  I suspect
>> either the users count or the dataref will be greater than 1.
> Both, users and dataref, are equal to 1. Before the call to dev_kfree_skb_any 
> and after dma_unmap_single fails.
>
>> You might also try testing the patch below to see if it has any effect.
>>  All it does is reorder the free and the unmap so that the buffer is not
>> freed for reuse until after we have checked it in the unmap path.
> I tested your patch, and it fixes my issue. Feel free to add a "Tested-by" to 
> the official patch.
>
>
> Cheers,
> Christoph

I'm not going to submit that as a fix since it doesn't resolve the
underlying issue and I still see problems when I do the pktgen test.

I'll try to submit a patch for the DMA debug API later today to resolve
the issue.  Basically what needs to happen is that we need to step
through and make it so that we tag each instance of the mapping
correctly instead of only tagging the first instance in the bucket.

Thanks,

Alex
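
The difference between tagging only the first matching entry and tagging every matching entry can be illustrated with a toy userspace model. This is not the lib/dma-debug.c code and not the patch mentioned above: the `toy_` struct and function names are hypothetical, and the model only assumes that two tracked mappings in one bucket can share the same device address, as described earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one tracked mapping in a hash bucket. */
struct toy_entry {
	uint64_t dev_addr;	/* address returned by the mapping call */
	int error_checked;	/* set once the driver checked for a map error */
};

/* Tag only the first entry with a matching address. */
static void toy_mark_first(struct toy_entry *bucket, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (bucket[i].dev_addr == addr) {
			bucket[i].error_checked = 1;
			return;
		}
	}
}

/* Tag every entry with a matching address. */
static void toy_mark_all(struct toy_entry *bucket, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (bucket[i].dev_addr == addr)
			bucket[i].error_checked = 1;
	}
}

/* Count matching entries that would still look unchecked at unmap time. */
static size_t toy_count_unchecked(const struct toy_entry *bucket, size_t n,
				  uint64_t addr)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (bucket[i].dev_addr == addr && !bucket[i].error_checked)
			count++;
	}
	return count;
}
```

In this model, two entries with the same address leave one unchecked entry behind after `toy_mark_first()`, which is the kind of false hit discussed above, while `toy_mark_all()` leaves none.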


* Re: [PATCH] libertas: drop maintainership
From: Joe Perches @ 2013-03-18 17:23 UTC (permalink / raw)
  To: Dan Williams; +Cc: netdev, linux-wireless, Daniel Drake, Bing Zhao
In-Reply-To: <1363625321.1597.28.camel@dcbw.foobar.com>

On Mon, 2013-03-18 at 11:48 -0500, Dan Williams wrote:
> Would be better maintained by somebody who actually has time for it.
[]
> diff --git a/MAINTAINERS b/MAINTAINERS
[]
> -MARVELL LIBERTAS WIRELESS DRIVER
> -M:	Dan Williams <dcbw@redhat.com>
> -L:	libertas-dev@lists.infradead.org
> -S:	Maintained
> -F:	drivers/net/wireless/libertas/

I think it better to mark it as Orphan
and maybe leave the list.

Maybe:

MARVELL LIBERTAS WIRELESS DRIVER
L:	libertas-dev@lists.infradead.org
S:	Orphan
F:	drivers/net/wireless/libertas/

or

MARVELL LIBERTAS WIRELESS DRIVER
S:	Orphan
F:	drivers/net/wireless/libertas/


* [PATCH] man: packet.7: document fanout, ring and auxiliary options
From: Willem de Bruijn @ 2013-03-18 17:13 UTC (permalink / raw)
  To: mtk.manpages, linux-man, netdev; +Cc: Willem de Bruijn

The packet socket manual page does not list all socket options.

This patch adds descriptions of the common packet socket options
  PACKET_AUXDATA, PACKET_FANOUT, PACKET_RX_RING, PACKET_STATISTICS,
  PACKET_TX_RING

and the ring-specific options
  PACKET_LOSS, PACKET_RESERVE, PACKET_TIMESTAMP, PACKET_VERSION

It does not yet add descriptions for
  PACKET_COPY_THRESH, PACKET_HDRLEN, PACKET_ORIGDEV,
  PACKET_TX_HAS_OFF, PACKET_TX_TIMESTAMP, PACKET_VNET_HDR

It tries to balance being informative with exposing kernel detail
that is unlikely to be used by most readers or that may change
frequently. For implementation details, the manpage points to the
documentation in kernel Documentation/networking. Let me know if
options should be added or removed.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 man7/packet.7 | 183 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 175 insertions(+), 8 deletions(-)

diff --git a/man7/packet.7 b/man7/packet.7
index 006f2ac..a9cc168 100644
--- a/man7/packet.7
+++ b/man7/packet.7
@@ -177,17 +177,21 @@ and
 .I sll_ifindex
 are used.
 .SS Socket options
+Packet socket options are configured by calling
+.BR setsockopt (2)
+with level SOL_PACKET.
+.TP
+.BR PACKET_ADD_MEMBERSHIP
+.PD 0
+.TP
+.BR PACKET_DROP_MEMBERSHIP
+.PD
 Packet sockets can be used to configure physical layer multicasting
 and promiscuous mode.
-It works by calling
-.BR setsockopt (2)
-on a packet socket for
-.B SOL_PACKET
-and one of the options
 .B PACKET_ADD_MEMBERSHIP
-to add a binding or
+adds a binding and
 .B PACKET_DROP_MEMBERSHIP
-to drop it.
+drops it.
 They both expect a
 .B packet_mreq
 structure as argument:
@@ -227,6 +231,169 @@ In addition the traditional ioctls
 .BR SIOCADDMULTI ,
 .B SIOCDELMULTI
 can be used for the same purpose.
+.TP
+.BR PACKET_AUXDATA " (since Linux 2.6.21)"
+.\" commit 8dc419447
+If this binary option is enabled, the packet socket passes a metadata
+structure along with each packet in the
+.BR recvmsg (2)
+control field. The structure can be read with
+.BR cmsg (3).
+It is defined as
+
+.in +4n
+.nf
+struct tpacket_auxdata {
+    __u32 tp_status;
+    __u32 tp_len;      /* packet length */
+    __u32 tp_snaplen;  /* captured length */
+    __u16 tp_mac;
+    __u16 tp_net;
+    __u16 tp_vlan_tci;
+    __u16 tp_padding;
+};
+.fi
+.in
+
+.B tp_net
+stores the offset to the network layer. If the packet socket is of type
+.BR SOCK_DGRAM ,
+then
+.B tp_mac
+is the same. If it is of type
+.BR SOCK_RAW ,
+then that stores the offset to the link layer frame.
+.TP
+.BR PACKET_FANOUT " (since Linux 3.1)"
+.\" commit dc99f6006
+To scale processing across threads, packet sockets can form a fanout
+group. In this mode, each matching packet is enqueued onto only one
+socket in the group. A socket joins a fanout group by calling
+.BR setsockopt (2)
+with level SOL_PACKET and option PACKET_FANOUT.
+Each network namespace can have up to 65536 independent groups. A
+socket selects a group by encoding the ID in the first 16 bits of
+the integer option value. The first packet socket to join a group
+implicitly creates it. To successfully join an existing group,
+subsequent packet sockets must have the same
+protocol, device settings and fanout mode and flags (see below).
+Packet sockets can leave a fanout group only by closing the socket.
+The group is deleted when the last socket is closed.
+
+Fanout supports multiple algorithms to spread traffic between sockets.
+The default mode,
+.BR PACKET_FANOUT_HASH ,
+sends packets from the same flow to the same socket to maintain per-flow
+ordering. For each packet, it chooses a socket by taking the packet
+flow hash modulo the number of sockets in the group, where a flow hash
+is a hash over network layer address and optional transport layer port
+fields. The load balance mode
+.B PACKET_FANOUT_LB
+implements a round robin algorithm.
+.B PACKET_FANOUT_CPU
+selects the socket based on the CPU that the packet arrived on.
+
+Fanout modes can take additional options. IP fragmentation causes packets
+from the same flow to have different flow hashes. The flag
+.BR PACKET_FANOUT_FLAG_DEFRAG ,
+if set, causes packets to be defragmented before fanout is applied, to
+preserve order even in this case. Fanout mode and options are communicated
+in the second 16 bits of the integer option value.
+.TP
+.BR PACKET_LOSS " (with PACKET_TX_RING)"
+If set, do not silently drop on transmission errors, but return the
+packet with status set to
+.BR TP_STATUS_WRONG_FORMAT .
+.TP
+.BR PACKET_RESERVE " (with PACKET_RX_RING)"
+By default, a packet receive ring writes packets immediately following the
+metadata structure and alignment padding. This integer option reserves
+additional headroom.
+.TP
+.BR PACKET_RX_RING
+Create a memory mapped ring buffer for asynchronous packet reception.
+The packet socket reserves a contiguous region of application address
+space, lays it out into an array of packet slots and copies packets
+(up to snaplen) into subsequent slots. Each packet is preceded by a
+metadata structure similar to
+.BR tpacket_auxdata .
+Packet socket and application communicate the head and tail of the ring
+through the
+.B tp_status
+field. The packet socket owns all slots with status
+.BR TP_STATUS_KERNEL .
+After filling a slot, it changes the status of the slot to transfer
+ownership to the application. During normal operation, the new status is
+.BR TP_STATUS_USER ,
+to signal that a correctly received packet has been stored. When the
+application has finished processing a packet, it transfers ownership of
+the slot back to the socket by setting the status to
+.BR TP_STATUS_KERNEL .
+Packet sockets implement multiple
+variants of the packet ring. The implementation details are described in
+.IR Documentation/networking/packet_mmap.txt
+in the Linux kernel source tree.
+.TP
+.BR PACKET_STATISTICS
+Retrieve packet socket statistics in the form of a structure
+
+.in +4n
+.nf
+struct tpacket_stats {
+    __u32 tp_packets;  /* total packet count */
+    __u32 tp_drops;    /* dropped packet count */
+};
+.fi
+.in
+
+Receiving statistics resets the internal counters. The exact statistics
+structure differs when using a ring of variant
+.BR TPACKET_V3 .
+.TP
+.BR PACKET_TIMESTAMP " (with PACKET_RX_RING)"
+The packet receive ring always stores a timestamp in the metadata header.
+By default, this is a software generated timestamp generated when the
+packet is copied into the ring. This integer option selects the type of
+timestamp. Besides the default, it supports the two hardware formats
+described in
+.IR Documentation/networking/timestamping.txt
+in the Linux kernel source tree.
+.TP
+.BR PACKET_TX_RING " (since Linux 2.6.31)"
+.\" commit 69e3c75f4
+Create a memory mapped ring buffer for packet transmission. This option
+is similar to
+.BR PACKET_RX_RING
+and takes the same arguments. The application writes packets into slots
+with status
+.BR TP_STATUS_AVAILABLE
+and schedules them for transmission by changing the status to
+.BR TP_STATUS_SEND_REQUEST .
+When packets are ready to be transmitted, the application calls
+.BR send (2)
+or a variant thereof. The
+.B buf
+and
+.B len
+fields of this call are ignored. If an address is passed using
+.BR sendto (2)
+or
+.BR sendmsg (2) ,
+then that overrides the socket default. On successful transmission, the
+socket resets the slot to
+.BR TP_STATUS_AVAILABLE .
+It discards packets silently on error unless
+.BR PACKET_LOSS
+is set.
+.TP
+.BR PACKET_VERSION " (with PACKET_RX_RING)"
+By default,
+.BR PACKET_RX_RING
+creates a packet receive ring of variant
+.BR TPACKET_V1 .
+To create another variant, configure the desired variant by setting this
+integer option before creating the ring.
+
 .SS Ioctls
 .B SIOCGSTAMP
 can be used to receive the timestamp of the last received packet.
@@ -318,7 +485,7 @@ header to get a fully conforming packet.
 Incoming 802.3 packets are not multiplexed on the DSAP/SSAP protocol
 fields; instead they are supplied to the user as protocol
 .B ETH_P_802_2
-with the LLC header prepended.
+with the LLC header prefixed.
 It is thus not possible to bind to
 .BR ETH_P_802_3 ;
 bind to
-- 
1.8.1.3
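
The PACKET_FANOUT text in the patch above describes an integer option value with the group ID in the low 16 bits and the mode and flags in the high 16 bits. A minimal sketch of that packing follows. The `SKETCH_` constants are assumed to match the values in <linux/if_packet.h> and are redefined locally so the example builds without kernel headers; the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed to mirror <linux/if_packet.h>; redefined for a standalone build. */
#define SKETCH_FANOUT_HASH        0
#define SKETCH_FANOUT_LB          1
#define SKETCH_FANOUT_CPU         2
#define SKETCH_FANOUT_FLAG_DEFRAG 0x8000

/* Pack the group ID (low 16 bits) and mode plus flags (high 16 bits). */
static inline uint32_t sketch_fanout_arg(uint16_t group_id,
					 uint16_t mode_and_flags)
{
	return ((uint32_t)mode_and_flags << 16) | group_id;
}

/* Extract the group ID from a packed option value. */
static inline uint16_t sketch_fanout_group(uint32_t arg)
{
	return (uint16_t)(arg & 0xffff);
}

/* Extract the mode and flags from a packed option value. */
static inline uint16_t sketch_fanout_mode_flags(uint32_t arg)
{
	return (uint16_t)(arg >> 16);
}
```

With the real headers, a value packed this way would be handed to the socket as a 4-byte integer, for example `setsockopt(fd, SOL_PACKET, PACKET_FANOUT, &arg, sizeof(arg))`.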


* Re: [PATCH] bnx2x: fix occasional statistics off-by-4GB error
From: David Miller @ 2013-03-18 17:13 UTC (permalink / raw)
  To: eilong; +Cc: maze, eric.dumazet, dmitry, netdev, yuvalmin
In-Reply-To: <1363601182.4752.13.camel@lb-tlvb-eilong.il.broadcom.com>

From: "Eilon Greenstein" <eilong@broadcom.com>
Date: Mon, 18 Mar 2013 12:06:22 +0200

> Maciej - thanks for the detailed information. You are right - it has
> nothing to do with the HW/FW and it is simply a bug that needs to be
> fixed. I withdraw my objections and add my ACK.
> 
> Acked-by: Eilon Greenstein <eilong@broadcom.com>

Applied and queued up for -stable.


* Re: [PATCH net-next 2/2] bnx2x: add RSS capability for GRE traffic
From: David Miller @ 2013-03-18 17:11 UTC (permalink / raw)
  To: dmitry; +Cc: netdev, eilong
In-Reply-To: <1363625464-21633-2-git-send-email-dmitry@broadcom.com>

From: "Dmitry Kravkov" <dmitry@broadcom.com>
Date: Mon, 18 Mar 2013 18:51:04 +0200

> The patch drives FW to perform RSS for GRE traffic,
> based on inner headers.
> 
> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>

Applied.


* Re: [PATCH net-next 1/2] bnx2x: add CSUM and TSO support for encapsulation protocols
From: David Miller @ 2013-03-18 17:11 UTC (permalink / raw)
  To: dmitry; +Cc: netdev, eilong
In-Reply-To: <1363625464-21633-1-git-send-email-dmitry@broadcom.com>

From: "Dmitry Kravkov" <dmitry@broadcom.com>
Date: Mon, 18 Mar 2013 18:51:03 +0200

> The patch utilizes FW offload capabilities for
> encapsulation protocols.
> 
> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>

Applied.


* Re: [PATCH] macvlan: Remove an unnecessary goto
From: David Miller @ 2013-03-18 17:07 UTC (permalink / raw)
  To: slash; +Cc: netdev, kaber, linux-kernel
In-Reply-To: <20130318130339.4F76662C03B@msa106.auone-net.jp>

From: Kusanagi Kouichi <slash@ac.auone-net.jp>
Date: Mon, 18 Mar 2013 22:03:39 +0900

> Use else instead.
> 
> Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>

The code is not in any way more readable with your changes.

I'm not applying this.


* Re: [PATCH] net: Fix a comment typo
From: David Miller @ 2013-03-18 17:06 UTC (permalink / raw)
  To: slash; +Cc: netdev, linux-kernel, trivial
In-Reply-To: <20130318125952.CC5E615C03A@msa104.auone-net.jp>

From: Kusanagi Kouichi <slash@ac.auone-net.jp>
Date: Mon, 18 Mar 2013 21:59:52 +0900

> Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>

Applied.


* Re: [PATCH 3/3] net: ftgmac100: Use module_platform_driver()
From: David Miller @ 2013-03-18 17:03 UTC (permalink / raw)
  To: sachin.kamat; +Cc: netdev, ratbert
In-Reply-To: <1363607448-17369-3-git-send-email-sachin.kamat@linaro.org>

From: Sachin Kamat <sachin.kamat@linaro.org>
Date: Mon, 18 Mar 2013 17:20:48 +0530

> module_platform_driver macro removes some boilerplate and makes
> the code simpler.
> 
> Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>

Applied.


* Re: [PATCH 2/3] net: ep93xx_eth: Use module_platform_driver()
From: David Miller @ 2013-03-18 17:03 UTC (permalink / raw)
  To: sachin.kamat; +Cc: netdev
In-Reply-To: <1363607448-17369-2-git-send-email-sachin.kamat@linaro.org>

From: Sachin Kamat <sachin.kamat@linaro.org>
Date: Mon, 18 Mar 2013 17:20:47 +0530

> module_platform_driver macro removes some boilerplate and makes
> the code simpler.
> 
> Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>

Applied.


* Re: [PATCH 1/3] net: dm9000: Use module_platform_driver()
From: David Miller @ 2013-03-18 17:02 UTC (permalink / raw)
  To: sachin.kamat; +Cc: netdev
In-Reply-To: <1363607448-17369-1-git-send-email-sachin.kamat@linaro.org>

From: Sachin Kamat <sachin.kamat@linaro.org>
Date: Mon, 18 Mar 2013 17:20:46 +0530

> module_platform_driver macro removes some boilerplate and makes
> the code simpler.
> 
> Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>

Applied.


* Re: [PATCH net-next] net: neterion: replace ip_fast_csum with csum_replace2
From: David Miller @ 2013-03-18 17:02 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev
In-Reply-To: <1363595688-16004-1-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
Date: Mon, 18 Mar 2013 16:34:48 +0800

> From: Li RongQing <roy.qing.li@gmail.com>
> 
> replace ip_fast_csum with csum_replace2 to save cpu cycles
> 
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

Applied.


* Re: Who/What is supposed to remove IPv6 address from interface when moving from one network to another ?
From: Dan Williams @ 2013-03-18 17:03 UTC (permalink / raw)
  To: Lorenzo Colitti; +Cc: Sylvain Munaut, netdev@vger.kernel.org
In-Reply-To: <CAKD1Yr1=1kfcSmRYgnrYQBhNPPc=nquB2c1jLbopbvqz848cxQ@mail.gmail.com>

On Mon, 2013-03-18 at 09:51 -0700, Lorenzo Colitti wrote:
> On Thu, Mar 14, 2013 at 12:48 PM, Dan Williams <dcbw@redhat.com> wrote:
> > The kernel does not (and shouldn't) trigger anything on carrier change as that's a
> > site-specific/user-specific policy.
> 
> Actually, it *does* trigger events on carrier change: it creates the
> addresses when you connect. It just doesn't delete them when you
> disconnect. So you can get addresses without a userspace daemon, but
> you can never delete them without a userspace daemon.
> 
> I tried to argue that that's incorrect, but, well, the archives show
> how far I got.

It does handle them when you connect, but only if you've set accept_ra
to something > 0.  And something has to set that :)  But in reality,
it's not a problem to listen for new addresses.  But *deleting*
addresses is way out of the kernel's responsibility, because a carrier
event doesn't tell the kernel anything about whether it's reconnecting
to the same network or a different one and thus it doesn't know whether
it should delete the old address or keep it around.  And that's where
the userspace stuff and policy comes in, and the kernel doesn't do
policy.

Dan


* Re: [PULL] vhost: tcm_vhost fixes for 3.9
From: David Miller @ 2013-03-18 17:01 UTC (permalink / raw)
  To: mst; +Cc: kvm, virtualization, netdev, linux-kernel, asias, nab,
	target-devel
In-Reply-To: <20130318112003.GA7809@redhat.com>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Mon, 18 Mar 2013 13:20:03 +0200

> The following changes since commit 8c6216d7f118a128678270824b6a1286a63863ca:
> 
>   Revert "ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally" (2013-03-16 23:00:41 -0400)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net
> 
> for you to fetch changes up to deb7cb067dabd3be625eaf5495da8bdc97377fc1:
> 
>   tcm_vhost: Flush vhost_work in vhost_scsi_flush() (2013-03-17 13:04:14 +0200)

This is a scsi driver, I therefore don't think this pull request is for me.

Please avoid such confusion in the future, and don't use branch names
like "vhost-net" for SCSI driver fixes.

Thanks.


* [PATCH] tcp: dont handle MTU reduction on LISTEN socket
From: Eric Dumazet @ 2013-03-18 17:01 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, dormando

From: Eric Dumazet <edumazet@google.com>

When an ICMP ICMP_FRAG_NEEDED (or ICMPV6_PKT_TOOBIG) message finds a
LISTEN socket, and this socket is currently owned by the user, we
set TCP_MTU_REDUCED_DEFERRED flag in listener tsq_flags.

This is bad because if we clone the parent before it had a chance to
clear the flag, the child inherits the tsq_flags value, and next
tcp_release_cb() on the child will decrement sk_refcnt.

Result is that we might free a live TCP socket, as reported by
Dormando.

IPv4: Attempt to release TCP socket in state 1

Fix this issue by testing sk_state against TCP_LISTEN early, so that we
set TCP_MTU_REDUCED_DEFERRED on appropriate sockets (not a LISTEN one)

This bug was introduced in commit 563d34d05786
(tcp: dont drop MTU reduction indications)

Reported-by: dormando <dormando@rydia.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_ipv4.c |   14 +++++++-------
 net/ipv6/tcp_ipv6.c |    7 +++++++
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 4a8ec45..d09203c 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -274,13 +274,6 @@ static void tcp_v4_mtu_reduced(struct sock *sk)
 	struct inet_sock *inet = inet_sk(sk);
 	u32 mtu = tcp_sk(sk)->mtu_info;
 
-	/* We are not interested in TCP_LISTEN and open_requests (SYN-ACKs
-	 * send out by Linux are always <576bytes so they should go through
-	 * unfragmented).
-	 */
-	if (sk->sk_state == TCP_LISTEN)
-		return;
-
 	dst = inet_csk_update_pmtu(sk, mtu);
 	if (!dst)
 		return;
@@ -408,6 +401,13 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
 			goto out;
 
 		if (code == ICMP_FRAG_NEEDED) { /* PMTU discovery (RFC1191) */
+			/* We are not interested in TCP_LISTEN and open_requests
+			 * (SYN-ACKs send out by Linux are always <576bytes so
+			 * they should go through unfragmented).
+			 */
+			if (sk->sk_state == TCP_LISTEN)
+				goto out;
+
 			tp->mtu_info = info;
 			if (!sock_owned_by_user(sk)) {
 				tcp_v4_mtu_reduced(sk);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 9b64600..f6d629f 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -389,6 +389,13 @@ static void tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 	}
 
 	if (type == ICMPV6_PKT_TOOBIG) {
+		/* We are not interested in TCP_LISTEN and open_requests
+		 * (SYN-ACKs send out by Linux are always <576bytes so
+		 * they should go through unfragmented).
+		 */
+		if (sk->sk_state == TCP_LISTEN)
+			goto out;
+
 		tp->mtu_info = ntohl(info);
 		if (!sock_owned_by_user(sk))
 			tcp_v6_mtu_reduced(sk);

^ permalink raw reply related

* Re: [PATCH v2 0/4] ARM: mxs: sanitize enet_out clock handling
From: David Miller @ 2013-03-18 16:57 UTC (permalink / raw)
  To: shawn.guo; +Cc: linux-arm-kernel, netdev
In-Reply-To: <1363595185-12302-1-git-send-email-shawn.guo@linaro.org>

From: Shawn Guo <shawn.guo@linaro.org>
Date: Mon, 18 Mar 2013 16:26:21 +0800

> If the series looks good to you, I hope I can have your ACK on the
> first 2 patches to have the series go via arm-soc tree for sake of
> git bisect.  Alternatively, please apply the first 2 on your tree
> for 3.10 and we will queue the platform patches for 3.11.

Feel free to apply these to arm-soc:

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply

* Re: [PATCH net-next 0/2] be2net: patch set
From: David Miller @ 2013-03-18 16:56 UTC (permalink / raw)
  To: somnath.kotur; +Cc: netdev
In-Reply-To: <bfd8205d-1c91-4947-88f7-c27554b4dd06@CMEXHTCAS2.ad.emulex.com>

From: Somnath Kotur <somnath.kotur@emulex.com>
Date: Mon, 18 Mar 2013 12:33:55 +0530

> Pls apply.
> 
> Somnath Kotur (2):
>   be2net: enable interrupts in be_probe() (RoCE and other ULPs need
>     them)
>   be2net: Use new F/W mailbox cmd to manipulate interrupts.

These patches do not apply to net-next at all.

^ permalink raw reply

* Re: Who/What is supposed to remove IPv6 address from interface when moving from one network to another ?
From: Lorenzo Colitti @ 2013-03-18 16:51 UTC (permalink / raw)
  To: Dan Williams; +Cc: Sylvain Munaut, netdev@vger.kernel.org
In-Reply-To: <1363290495.1643.29.camel@dcbw.foobar.com>

On Thu, Mar 14, 2013 at 12:48 PM, Dan Williams <dcbw@redhat.com> wrote:
> The kernel does not (and shouldn't) trigger anything on carrier change as that's a
> site-specific/user-specific policy.

Actually, it *does* trigger events on carrier change: it creates the
addresses when you connect. It just doesn't delete them when you
disconnect. So you can get addresses without a userspace daemon, but
you can never delete them without a userspace daemon.

I tried to argue that that's incorrect, but, well, the archives show
how far I got.

^ permalink raw reply

* [PATCH net-next 2/2] bnx2x: add RSS capability for GRE traffic
From: Dmitry Kravkov @ 2013-03-18 16:51 UTC (permalink / raw)
  To: davem, netdev; +Cc: Dmitry Kravkov, Eilon Greenstein
In-Reply-To: <1363625464-21633-1-git-send-email-dmitry@broadcom.com>

The patch drives FW to perform RSS for GRE traffic,
based on inner headers.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h |    3 +++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c  |   23 ++++++++++++-----------
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.h  |    9 +++++++++
 3 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
index 8f96372..f9098d8 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
@@ -973,6 +973,9 @@ static inline int bnx2x_func_start(struct bnx2x *bp)
 	else /* CHIP_IS_E1X */
 		start_params->network_cos_mode = FW_WRR;
 
+	start_params->gre_tunnel_mode = IPGRE_TUNNEL;
+	start_params->gre_tunnel_rss = GRE_INNER_HEADERS_RSS;
+
 	return bnx2x_func_state_change(bp, &func_params);
 }
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
index 66ab259..5bdc1d6 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
@@ -5679,17 +5679,18 @@ static inline int bnx2x_func_send_start(struct bnx2x *bp,
 	memset(rdata, 0, sizeof(*rdata));
 
 	/* Fill the ramrod data with provided parameters */
-	rdata->function_mode    = (u8)start_params->mf_mode;
-	rdata->sd_vlan_tag      = cpu_to_le16(start_params->sd_vlan_tag);
-	rdata->path_id          = BP_PATH(bp);
-	rdata->network_cos_mode = start_params->network_cos_mode;
-
-	/*
-	 *  No need for an explicit memory barrier here as long we would
-	 *  need to ensure the ordering of writing to the SPQ element
-	 *  and updating of the SPQ producer which involves a memory
-	 *  read and we will have to put a full memory barrier there
-	 *  (inside bnx2x_sp_post()).
+	rdata->function_mode	= (u8)start_params->mf_mode;
+	rdata->sd_vlan_tag	= cpu_to_le16(start_params->sd_vlan_tag);
+	rdata->path_id		= BP_PATH(bp);
+	rdata->network_cos_mode	= start_params->network_cos_mode;
+	rdata->gre_tunnel_mode	= start_params->gre_tunnel_mode;
+	rdata->gre_tunnel_rss	= start_params->gre_tunnel_rss;
+
+	/* No need for an explicit memory barrier here as long we would
+	 * need to ensure the ordering of writing to the SPQ element
+	 * and updating of the SPQ producer which involves a memory
+	 * read and we will have to put a full memory barrier there
+	 * (inside bnx2x_sp_post()).
 	 */
 
 	return bnx2x_sp_post(bp, RAMROD_CMD_ID_COMMON_FUNCTION_START, 0,
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.h
index 064dba2..35479da 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.h
@@ -1123,6 +1123,15 @@ struct bnx2x_func_start_params {
 
 	/* Function cos mode */
 	u8 network_cos_mode;
+
+	/* NVGRE classification enablement */
+	u8 nvgre_clss_en;
+
+	/* NO_GRE_TUNNEL/NVGRE_TUNNEL/L2GRE_TUNNEL/IPGRE_TUNNEL */
+	u8 gre_tunnel_mode;
+
+	/* GRE_OUTER_HEADERS_RSS/GRE_INNER_HEADERS_RSS/NVGRE_KEY_ENTROPY_RSS */
+	u8 gre_tunnel_rss;
 };
 
 struct bnx2x_func_switch_update_params {
-- 
1.7.7.2

^ permalink raw reply related

* [PATCH net-next 1/2] bnx2x: add CSUM and TSO support for encapsulation protocols
From: Dmitry Kravkov @ 2013-03-18 16:51 UTC (permalink / raw)
  To: davem, netdev; +Cc: Dmitry Kravkov, Eilon Greenstein

The patch utilizes FW offload capabilities for
encapsulation protocols.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h      |   29 ++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c  |  196 ++++++++++++++++++++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |    7 +
 3 files changed, 204 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 9e8d195..a4729c7 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -612,9 +612,10 @@ struct bnx2x_fastpath {
  * START_BD		- describes packed
  * START_BD(splitted)	- includes unpaged data segment for GSO
  * PARSING_BD		- for TSO and CSUM data
+ * PARSING_BD2		- for encapsulation data
  * Frag BDs		- decribes pages for frags
  */
-#define BDS_PER_TX_PKT		3
+#define BDS_PER_TX_PKT		4
 #define MAX_BDS_PER_TX_PKT	(MAX_SKB_FRAGS + BDS_PER_TX_PKT)
 /* max BDs per tx packet including next pages */
 #define MAX_DESC_PER_TX_PKT	(MAX_BDS_PER_TX_PKT + \
@@ -731,16 +732,22 @@ struct bnx2x_fastpath {
 
 #define pbd_tcp_flags(tcp_hdr)	(ntohl(tcp_flag_word(tcp_hdr))>>16 & 0xff)
 
-#define XMIT_PLAIN			0
-#define XMIT_CSUM_V4			0x1
-#define XMIT_CSUM_V6			0x2
-#define XMIT_CSUM_TCP			0x4
-#define XMIT_GSO_V4			0x8
-#define XMIT_GSO_V6			0x10
-
-#define XMIT_CSUM			(XMIT_CSUM_V4 | XMIT_CSUM_V6)
-#define XMIT_GSO			(XMIT_GSO_V4 | XMIT_GSO_V6)
-
+#define XMIT_PLAIN		0
+#define XMIT_CSUM_V4		(1 << 0)
+#define XMIT_CSUM_V6		(1 << 1)
+#define XMIT_CSUM_TCP		(1 << 2)
+#define XMIT_GSO_V4		(1 << 3)
+#define XMIT_GSO_V6		(1 << 4)
+#define XMIT_CSUM_ENC_V4	(1 << 5)
+#define XMIT_CSUM_ENC_V6	(1 << 6)
+#define XMIT_GSO_ENC_V4		(1 << 7)
+#define XMIT_GSO_ENC_V6		(1 << 8)
+
+#define XMIT_CSUM_ENC		(XMIT_CSUM_ENC_V4 | XMIT_CSUM_ENC_V6)
+#define XMIT_GSO_ENC		(XMIT_GSO_ENC_V4 | XMIT_GSO_ENC_V6)
+
+#define XMIT_CSUM		(XMIT_CSUM_V4 | XMIT_CSUM_V6 | XMIT_CSUM_ENC)
+#define XMIT_GSO		(XMIT_GSO_V4 | XMIT_GSO_V6 | XMIT_GSO_ENC)
 
 /* stuff added to make the code fit 80Col */
 #define CQE_TYPE(cqe_fp_flags)	 ((cqe_fp_flags) & ETH_FAST_PATH_RX_CQE_TYPE)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 9f7a379..8091de7 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -3148,27 +3148,44 @@ static __le16 bnx2x_csum_fix(unsigned char *t_header, u16 csum, s8 fix)
 static u32 bnx2x_xmit_type(struct bnx2x *bp, struct sk_buff *skb)
 {
 	u32 rc;
+	__u8 prot = 0;
+	__be16 protocol;
 
 	if (skb->ip_summed != CHECKSUM_PARTIAL)
-		rc = XMIT_PLAIN;
+		return XMIT_PLAIN;
 
-	else {
-		if (vlan_get_protocol(skb) == htons(ETH_P_IPV6)) {
-			rc = XMIT_CSUM_V6;
-			if (ipv6_hdr(skb)->nexthdr == IPPROTO_TCP)
-				rc |= XMIT_CSUM_TCP;
+	protocol = vlan_get_protocol(skb);
+	if (protocol == htons(ETH_P_IPV6)) {
+		rc = XMIT_CSUM_V6;
+		prot = ipv6_hdr(skb)->nexthdr;
+	} else {
+		rc = XMIT_CSUM_V4;
+		prot = ip_hdr(skb)->protocol;
+	}
 
+	if (!CHIP_IS_E1x(bp) && skb->encapsulation) {
+		if (inner_ip_hdr(skb)->version == 6) {
+			rc |= XMIT_CSUM_ENC_V6;
+			if (inner_ipv6_hdr(skb)->nexthdr == IPPROTO_TCP)
+				rc |= XMIT_CSUM_TCP;
 		} else {
-			rc = XMIT_CSUM_V4;
-			if (ip_hdr(skb)->protocol == IPPROTO_TCP)
+			rc |= XMIT_CSUM_ENC_V4;
+			if (inner_ip_hdr(skb)->protocol == IPPROTO_TCP)
 				rc |= XMIT_CSUM_TCP;
 		}
 	}
+	if (prot == IPPROTO_TCP)
+		rc |= XMIT_CSUM_TCP;
 
-	if (skb_is_gso_v6(skb))
-		rc |= XMIT_GSO_V6 | XMIT_CSUM_TCP | XMIT_CSUM_V6;
-	else if (skb_is_gso(skb))
-		rc |= XMIT_GSO_V4 | XMIT_CSUM_V4 | XMIT_CSUM_TCP;
+	if (skb_is_gso_v6(skb)) {
+		rc |= (XMIT_GSO_V6 | XMIT_CSUM_TCP | XMIT_CSUM_V6);
+		if (rc & XMIT_CSUM_ENC)
+			rc |= XMIT_GSO_ENC_V6;
+	} else if (skb_is_gso(skb)) {
+		rc |= (XMIT_GSO_V4 | XMIT_CSUM_V4 | XMIT_CSUM_TCP);
+		if (rc & XMIT_CSUM_ENC)
+			rc |= XMIT_GSO_ENC_V4;
+	}
 
 	return rc;
 }
@@ -3256,11 +3273,20 @@ exit_lbl:
 static void bnx2x_set_pbd_gso_e2(struct sk_buff *skb, u32 *parsing_data,
 				 u32 xmit_type)
 {
+	struct ipv6hdr *ipv6;
+
 	*parsing_data |= (skb_shinfo(skb)->gso_size <<
 			      ETH_TX_PARSE_BD_E2_LSO_MSS_SHIFT) &
 			      ETH_TX_PARSE_BD_E2_LSO_MSS;
-	if ((xmit_type & XMIT_GSO_V6) &&
-	    (ipv6_hdr(skb)->nexthdr == NEXTHDR_IPV6))
+
+	if (xmit_type & XMIT_GSO_ENC_V6)
+		ipv6 = inner_ipv6_hdr(skb);
+	else if (xmit_type & XMIT_GSO_V6)
+		ipv6 = ipv6_hdr(skb);
+	else
+		ipv6 = NULL;
+
+	if (ipv6 && ipv6->nexthdr == NEXTHDR_IPV6)
 		*parsing_data |= ETH_TX_PARSE_BD_E2_IPV6_WITH_EXT_HDR;
 }
 
@@ -3297,6 +3323,40 @@ static void bnx2x_set_pbd_gso(struct sk_buff *skb,
 }
 
 /**
+ * bnx2x_set_pbd_csum_enc - update PBD with checksum and return header length
+ *
+ * @bp:			driver handle
+ * @skb:		packet skb
+ * @parsing_data:	data to be updated
+ * @xmit_type:		xmit flags
+ *
+ * 57712/578xx related, when skb has encapsulation
+ */
+static u8 bnx2x_set_pbd_csum_enc(struct bnx2x *bp, struct sk_buff *skb,
+				 u32 *parsing_data, u32 xmit_type)
+{
+	*parsing_data |=
+		((((u8 *)skb_inner_transport_header(skb) - skb->data) >> 1) <<
+		ETH_TX_PARSE_BD_E2_L4_HDR_START_OFFSET_W_SHIFT) &
+		ETH_TX_PARSE_BD_E2_L4_HDR_START_OFFSET_W;
+
+	if (xmit_type & XMIT_CSUM_TCP) {
+		*parsing_data |= ((inner_tcp_hdrlen(skb) / 4) <<
+			ETH_TX_PARSE_BD_E2_TCP_HDR_LENGTH_DW_SHIFT) &
+			ETH_TX_PARSE_BD_E2_TCP_HDR_LENGTH_DW;
+
+		return skb_inner_transport_header(skb) +
+			inner_tcp_hdrlen(skb) - skb->data;
+	}
+
+	/* We support checksum offload for TCP and UDP only.
+	 * No need to pass the UDP header length - it's a constant.
+	 */
+	return skb_inner_transport_header(skb) +
+		sizeof(struct udphdr) - skb->data;
+}
+
+/**
  * bnx2x_set_pbd_csum_e2 - update PBD with checksum and return header length
  *
  * @bp:			driver handle
@@ -3327,13 +3387,14 @@ static u8 bnx2x_set_pbd_csum_e2(struct bnx2x *bp, struct sk_buff *skb,
 	return skb_transport_header(skb) + sizeof(struct udphdr) - skb->data;
 }
 
+/* set FW indication according to inner or outer protocols if tunneled */
 static void bnx2x_set_sbd_csum(struct bnx2x *bp, struct sk_buff *skb,
 			       struct eth_tx_start_bd *tx_start_bd,
 			       u32 xmit_type)
 {
 	tx_start_bd->bd_flags.as_bitfield |= ETH_TX_BD_FLAGS_L4_CSUM;
 
-	if (xmit_type & XMIT_CSUM_V6)
+	if (xmit_type & (XMIT_CSUM_ENC_V6 | XMIT_CSUM_V6))
 		tx_start_bd->bd_flags.as_bitfield |= ETH_TX_BD_FLAGS_IPV6;
 
 	if (!(xmit_type & XMIT_CSUM_TCP))
@@ -3396,6 +3457,72 @@ static u8 bnx2x_set_pbd_csum(struct bnx2x *bp, struct sk_buff *skb,
 	return hlen;
 }
 
+static void bnx2x_update_pbds_gso_enc(struct sk_buff *skb,
+				      struct eth_tx_parse_bd_e2 *pbd_e2,
+				      struct eth_tx_parse_2nd_bd *pbd2,
+				      u16 *global_data,
+				      u32 xmit_type)
+{
+	u16 inner_hlen_w = 0;
+	u8 outerip_off, outerip_len = 0;
+
+	/* IP len */
+	inner_hlen_w = (skb_inner_transport_header(skb) -
+			skb_inner_network_header(skb)) >> 1;
+
+	/* transport len */
+	if (xmit_type & XMIT_CSUM_TCP)
+		inner_hlen_w += inner_tcp_hdrlen(skb) >> 1;
+	else
+		inner_hlen_w += sizeof(struct udphdr) >> 1;
+
+	pbd2->fw_ip_hdr_to_payload_w = inner_hlen_w;
+
+	if (xmit_type & XMIT_CSUM_ENC_V4) {
+		struct iphdr *iph = inner_ip_hdr(skb);
+
+		pbd2->fw_ip_csum_wo_len_flags_frag =
+			bswab16(csum_fold((~iph->check) -
+					  iph->tot_len - iph->frag_off));
+	} else {
+		pbd2->fw_ip_hdr_to_payload_w =
+			inner_hlen_w - ((sizeof(struct ipv6hdr)) >> 1);
+	}
+
+	pbd2->tcp_send_seq = bswab32(inner_tcp_hdr(skb)->seq);
+
+	pbd2->tcp_flags = pbd_tcp_flags(inner_tcp_hdr(skb));
+
+	if (xmit_type & XMIT_GSO_V4) {
+		pbd2->hw_ip_id = bswab16(ip_hdr(skb)->id);
+
+		pbd_e2->data.tunnel_data.pseudo_csum =
+			bswab16(~csum_tcpudp_magic(
+					inner_ip_hdr(skb)->saddr,
+					inner_ip_hdr(skb)->daddr,
+					0, IPPROTO_TCP, 0));
+
+		outerip_len = ip_hdr(skb)->ihl << 1;
+	} else {
+		pbd_e2->data.tunnel_data.pseudo_csum =
+			bswab16(~csum_ipv6_magic(
+					&inner_ipv6_hdr(skb)->saddr,
+					&inner_ipv6_hdr(skb)->daddr,
+					0, IPPROTO_TCP, 0));
+	}
+
+	outerip_off = (skb_network_header(skb) - skb->data) >> 1;
+
+	*global_data |=
+		outerip_off |
+		(!!(xmit_type & XMIT_CSUM_V6) <<
+			ETH_TX_PARSE_2ND_BD_IP_HDR_TYPE_OUTER_SHIFT) |
+		(outerip_len <<
+			ETH_TX_PARSE_2ND_BD_IP_HDR_LEN_OUTER_W_SHIFT) |
+		((skb->protocol == cpu_to_be16(ETH_P_8021Q)) <<
+			ETH_TX_PARSE_2ND_BD_LLC_SNAP_EN_SHIFT);
+}
+
 /* called with netif_tx_lock
  * bnx2x_tx_int() runs without netif_tx_lock unless it needs to call
  * netif_wake_queue()
@@ -3411,6 +3538,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct eth_tx_bd *tx_data_bd, *total_pkt_bd = NULL;
 	struct eth_tx_parse_bd_e1x *pbd_e1x = NULL;
 	struct eth_tx_parse_bd_e2 *pbd_e2 = NULL;
+	struct eth_tx_parse_2nd_bd *pbd2 = NULL;
 	u32 pbd_e2_parsing_data = 0;
 	u16 pkt_prod, bd_prod;
 	int nbd, txq_index;
@@ -3567,12 +3695,46 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (!CHIP_IS_E1x(bp)) {
 		pbd_e2 = &txdata->tx_desc_ring[bd_prod].parse_bd_e2;
 		memset(pbd_e2, 0, sizeof(struct eth_tx_parse_bd_e2));
-		/* Set PBD in checksum offload case */
-		if (xmit_type & XMIT_CSUM)
+
+		if (xmit_type & XMIT_CSUM_ENC) {
+			u16 global_data = 0;
+
+			/* Set PBD in enc checksum offload case */
+			hlen = bnx2x_set_pbd_csum_enc(bp, skb,
+						      &pbd_e2_parsing_data,
+						      xmit_type);
+
+			/* turn on 2nd parsing and get a BD */
+			bd_prod = TX_BD(NEXT_TX_IDX(bd_prod));
+
+			pbd2 = &txdata->tx_desc_ring[bd_prod].parse_2nd_bd;
+
+			memset(pbd2, 0, sizeof(*pbd2));
+
+			pbd_e2->data.tunnel_data.ip_hdr_start_inner_w =
+				(skb_inner_network_header(skb) -
+				 skb->data) >> 1;
+
+			if (xmit_type & XMIT_GSO_ENC)
+				bnx2x_update_pbds_gso_enc(skb, pbd_e2, pbd2,
+							  &global_data,
+							  xmit_type);
+
+			pbd2->global_data = cpu_to_le16(global_data);
+
+			/* add addition parse BD indication to start BD */
+			SET_FLAG(tx_start_bd->general_data,
+				 ETH_TX_START_BD_PARSE_NBDS, 1);
+			/* set encapsulation flag in start BD */
+			SET_FLAG(tx_start_bd->general_data,
+				 ETH_TX_START_BD_TUNNEL_EXIST, 1);
+			nbd++;
+		} else if (xmit_type & XMIT_CSUM) {
 			/* Set PBD in checksum offload case w/o encapsulation */
 			hlen = bnx2x_set_pbd_csum_e2(bp, skb,
 						     &pbd_e2_parsing_data,
 						     xmit_type);
+		}
 
 		/* Add the macs to the parsing BD this is a vf */
 		if (IS_VF(bp)) {
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 04d123f..4902d1e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -11965,6 +11965,13 @@ static int bnx2x_init_dev(struct bnx2x *bp, struct pci_dev *pdev,
 		NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6 |
 		NETIF_F_RXCSUM | NETIF_F_LRO | NETIF_F_GRO |
 		NETIF_F_RXHASH | NETIF_F_HW_VLAN_TX;
+	if (!CHIP_IS_E1x(bp)) {
+		dev->hw_features |= NETIF_F_GSO_GRE;
+		dev->hw_enc_features =
+			NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
+			NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6 |
+			NETIF_F_GSO_GRE;
+	}
 
 	dev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
 		NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6 | NETIF_F_HIGHDMA;
-- 
1.7.7.2

^ permalink raw reply related

