Netdev List
* [PATCH] Documentation: clarify phys_port_id
From: Dan Williams @ 2014-12-17 16:59 UTC (permalink / raw)
  To: netdev; +Cc: Joshua Watt, jpirko, Florian Fainelli
In-Reply-To: <1418834826.1160.35.camel@dcbw.local>

Signed-off-by: Dan Williams <dcbw@redhat.com>
---
 Documentation/ABI/testing/sysfs-class-net | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-class-net b/Documentation/ABI/testing/sysfs-class-net
index e1b2e78..7fe823a 100644
--- a/Documentation/ABI/testing/sysfs-class-net
+++ b/Documentation/ABI/testing/sysfs-class-net
@@ -186,7 +186,12 @@ KernelVersion:	3.12
 Contact:	netdev@vger.kernel.org
 Description:
 		Indicates the interface unique physical port identifier within
-		the NIC, as a string.
+		the NIC, as a string.  If two net_device objects share physical
+		hardware or other resources, and/or do not operate independently,
+		both net_device objects should be assigned the
+		same phys_port_id.  phys_port_id should be as globally unique
+		as possible to prevent conflicts between different drivers and
+		vendors, e.g. by using MAC addresses or hardware GUIDs.
 
 What:		/sys/class/net/<iface>/speed
 Date:		October 2009
-- 
1.9.3
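
The rule in the description above (interfaces that share hardware report the same phys_port_id) can be used from userspace roughly as follows. This is a minimal sketch, not part of the patch: the sysfs path is the one documented in sysfs-class-net, while both function names are hypothetical and the "empty ID never matches" policy is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Read /sys/class/net/<iface>/phys_port_id into buf.
 * Returns 0 on success, -1 if the attribute cannot be opened or read
 * (for example when the driver does not report an ID). */
int read_phys_port_id(const char *iface, char *buf, size_t len)
{
	char path[256];
	FILE *f;

	assert(iface && buf && len > 0);
	snprintf(path, sizeof(path), "/sys/class/net/%s/phys_port_id", iface);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';	/* strip the trailing newline */
	return 0;
}

/* Hypothetical policy: two interfaces count as sharing a port only if
 * both report a non-empty ID and the two strings are identical. */
bool same_phys_port(const char *id_a, const char *id_b)
{
	assert(id_a && id_b);
	return id_a[0] != '\0' && id_b[0] != '\0' && strcmp(id_a, id_b) == 0;
}
```

A caller would read the ID of each interface with read_phys_port_id() and pass the two strings to same_phys_port(); an interface whose read fails is simply treated as having no known port.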


* Re: [PATCH net-next 1/3] Implementation of RFC 4898 Extended TCP Statistics (Web10G)
From: rapier @ 2014-12-17 17:00 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: netdev
In-Reply-To: <87d27i35gb.fsf@nemi.mork.no>



On 12/17/14 6:01 AM, Bjørn Mork wrote:
> rapier <rapier@psc.edu> writes:
>
>> + * The Web10Gig project.  See http://www.web10gig.org
>
> URL is already outdated?

My apologies. The correct site is http://www.web10g.org. My fingers 
moved faster than my brain on that one.


* Re: [RFC PATCH net-next 3/5] tcp: Add a few more tracepoints for tcp tracer
From: Arnaldo Carvalho de Melo @ 2014-12-17 17:02 UTC (permalink / raw)
  To: David Ahern
  Cc: Martin KaFai Lau, netdev, David S. Miller, Hannes Frederic Sowa,
	Steven Rostedt, Lawrence Brakmo, Josef Bacik, Kernel Team
In-Reply-To: <5491A86C.5030207@gmail.com>

Em Wed, Dec 17, 2014 at 08:59:40AM -0700, David Ahern escreveu:
> On 12/17/14 8:33 AM, Arnaldo Carvalho de Melo wrote:

> >On a random RHEL7 kernel I had lying around on a test machine:

> >[root@ssdandy ~]# perf probe -L tcp_sacktag_write_queue | head -20
> ><tcp_sacktag_write_queue@/usr/src/debug/kernel-3.10.0-123.el7/linux-3.10.0-123.el7.x86_64/net/ipv4/tcp_input.c:0>
> >       0  tcp_sacktag_write_queue(struct sock *sk, const struct sk_buff *ack_skb,
> >          			u32 prior_snd_una)
> >       2  {
> >          	struct tcp_sock *tp = tcp_sk(sk);
> >       4  	const unsigned char *ptr = (skb_transport_header(ack_skb) +
> >          				    TCP_SKB_CB(ack_skb)->sacked);
> >          	struct tcp_sack_block_wire *sp_wire = (struct tcp_sack_block_wire *)(ptr+2);
> >          	struct tcp_sack_block sp[TCP_NUM_SACKS];
> >          	struct tcp_sack_block *cache;
> >          	struct tcp_sacktag_state state;
> >          	struct sk_buff *skb;
> >      11  	int num_sacks = min(TCP_NUM_SACKS, (ptr[1] - TCPOLEN_SACK_BASE) >> 3);
> >          	int used_sacks;
> >          	bool found_dup_sack = false;
> >          	int i, j;
> >          	int first_sack_index;
> >
> >      17  	state.flag = 0;
> >      18  	state.reord = tp->packets_out;
 
> But there are limitations/hassles with this approach. For starters I believe
> it requires vmlinux on box. The products I work on do not have vmlinux
> available in the runtime environment. I recall someone (Masami?) suggesting

Not necessarily, one can do all this on a development machine that has
that info and then end up with kprobe_trace expressions as described on:

Documentation/trace/kprobetrace.txt

> the ability to write the probe data to a file (ie., create the probe
> definition off box) and load the file to create the probe, so yes a solvable
> problem.

Exactly.
 
> But with this approach it could very well be that the function name and variable
> names differ with kernel version and that makes it hard to impossible to
> create a set of analysis commands.

Well, if this code is very much in flux, then there is no place
for a tracepoint there :-)

For instance: I've been using this definition:

commit c522739d72a341a3e74a369ce6298b9412813d3f
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Fri Sep 27 18:06:19 2013 -0300

    perf trace: Use vfs_getname hook if available
    
    Initially it tries to find a probe:vfs_getname that should be setup
    with:
    
     perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'

-----------------

For quite a while in tools/perf/builtin-trace.c

I.e. if this probe is in place, I get mappings from a pointer to a
pathname that I use in a tool, 'perf trace'; if not, it falls back to
reading /proc/, etc., which is suboptimal but is then the only way to
map an fd to a pathname.

On RHEL7:

[root@ssdandy ~]# perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
Added new event:
  probe:vfs_getname    (on getname_flags:65 with pathname=result->name:string)

You can now use it in all perf tools, such as:

	perf record -e probe:vfs_getname -aR sleep 1

[root@ssdandy ~]#
[root@ssdandy ~]# perf record -e probe:vfs_getname cat /etc/passwd > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (~812 samples) ]
[root@ssdandy ~]# perf script
 cat 24805 [005] 188716.046396: probe:vfs_getname: (ffffffff811bb7e3) pathname="/etc/ld.so.preload"
 cat 24805 [005] 188716.046408: probe:vfs_getname: (ffffffff811bb7e3) pathname="/etc/ld.so.cache"
 cat 24805 [005] 188716.046428: probe:vfs_getname: (ffffffff811bb7e3) pathname="/lib64/libc.so.6"
 cat 24805 [005] 188716.046675: probe:vfs_getname: (ffffffff811bb7e3) pathname="/usr/lib/locale/locale-archive"
 cat 24805 [005] 188716.046718: probe:vfs_getname: (ffffffff811bb7e3) pathname="/etc/passwd"
[root@ssdandy ~]# uname -r
3.10.0-123.el7.x86_64

And in Fedora there was a change in how we must set up the probe:

[root@zoo ~]# perf probe 'vfs_getname=getname_flags:65 pathname=filename:string'
Added new event:
  probe:vfs_getname    (on getname_flags:65 with pathname=filename:string)

You can now use it in all perf tools, such as:

	perf record -e probe:vfs_getname -aR sleep 1

[root@zoo ~]# perf record -e probe:vfs_getname cat /etc/passwd > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.016 MB perf.data (~687 samples) ]
[root@zoo ~]# perf script
 cat 27485 [000] 931.973139: probe:vfs_getname: (ffffffff8120ad13) pathname="/etc/ld.so.preload"
 cat 27485 [000] 931.973148: probe:vfs_getname: (ffffffff8120ad13) pathname="/etc/ld.so.cache"
 cat 27485 [000] 931.973169: probe:vfs_getname: (ffffffff8120ad13) pathname="/lib64/libc.so.6"
 cat 27485 [000] 931.973417: probe:vfs_getname: (ffffffff8120ad13) pathname="/usr/lib/locale/locale-archive"
 cat 27485 [000] 931.973488: probe:vfs_getname: (ffffffff8120ad13) pathname="/etc/passwd"
[root@zoo ~]# uname -r
3.17.4-200.fc20.x86_64
[root@zoo ~]#

What I found in 'result->name' changed to 'filename', but since I gave it
a standard name ("pathname"), this only matters when setting up the probes,
something we do at system start, i.e. just once, before the first 'perf
record'. We can deal with that with some scripting that does probe setup
fallbacks.

All this is for while we're prototyping a tool. When we're 100% sure that a
tracepoint would provide performance gains and that the area is so set in
stone that we can guarantee an ABI, go ahead and add the tracepoint.

- Arnaldo
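
The "probe setup fallbacks" mentioned in the message above could look something like the following. This is only a sketch under stated assumptions, not code from perf: the function names are hypothetical, the two definitions are the ones quoted in the message, and fake_probe_newer_kernel() is a stand-in that lets the logic be exercised without root or perf installed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Callback that tries to install one probe definition; 0 means success. */
typedef int (*probe_fn)(const char *definition);

/* Try each candidate in order. Returns the index of the first definition
 * that installs, or -1 if none of them does. */
int setup_probe_with_fallback(const char *const *defs, size_t n,
			      probe_fn try_probe)
{
	assert(defs && try_probe);
	for (size_t i = 0; i < n; i++)
		if (try_probe(defs[i]) == 0)
			return (int)i;
	return -1;
}

/* One possible callback: shell out to 'perf probe'. The definition must
 * not contain a single quote, since it is quoted that way here. */
int perf_probe_add(const char *definition)
{
	char cmd[512];
	int n = snprintf(cmd, sizeof(cmd),
			 "perf probe '%s' >/dev/null 2>&1", definition);

	if (n < 0 || (size_t)n >= sizeof(cmd))
		return -1;
	return system(cmd) == 0 ? 0 : -1;
}

/* Stand-in for a kernel where only the 'filename' variable exists. */
int fake_probe_newer_kernel(const char *definition)
{
	return strstr(definition, "pathname=filename:") ? 0 : -1;
}

/* The two definitions shown above, oldest kernel first. */
const char *const vfs_getname_defs[] = {
	"vfs_getname=getname_flags:65 pathname=result->name:string",
	"vfs_getname=getname_flags:65 pathname=filename:string",
};
```

Because both definitions expose the argument under the same name ("pathname"), whatever consumes the events does not need to know which variant was installed.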


* [PATCH][v3.13.y] e1000e: Fix no connectivity when driver loaded with cable out
From: Joseph Salisbury @ 2014-12-17 17:08 UTC (permalink / raw)
  To: Kamal Mostafa
  Cc: stable@vger.kernel.org, davidx.m.ertman, jeffrey.e.pieper,
	jeffrey.t.kirsher, LKML, linux.nics, e1000-devel,
	netdev@vger.kernel.org

Hello,

Please consider including mainline commit b20a774 in the next v3.13.y
stable release.  It was included in the mainline tree as of v3.15-rc1. 
It has been tested and confirmed to resolve
http://bugs.launchpad.net/bugs/1400365 .

commit b20a774495671f037e7160ea2ce8789af6b61533
Author: David Ertman <davidx.m.ertman@intel.com>
Date:   Tue Mar 25 04:27:55 2014 +0000

    e1000e: Fix no connectivity when driver loaded with cable out



Sincerely,
Joseph Salisbury


* Re: [PATCH net-next 1/3] Implementation of RFC 4898 Extended TCP Statistics (Web10G)
From: Bryan @ 2014-12-17 16:19 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: netdev
In-Reply-To: <87d27i35gb.fsf@nemi.mork.no>

On 12/17/2014 6:01 AM, Bjørn Mork wrote:
>> + * The Web10Gig project.  See http://www.web10gig.org
> URL is already outdated?
That's a typo. Please see http://www.web10g.org


* Re: [RFC PATCH net-next 0/5] tcp: TCP tracer
From: Alexei Starovoitov @ 2014-12-17 17:14 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Martin KaFai Lau, netdev@vger.kernel.org, David S. Miller,
	Hannes Frederic Sowa, Steven Rostedt, Lawrence Brakmo,
	Josef Bacik, Kernel Team

On Wed, Dec 17, 2014 at 7:07 AM, Arnaldo Carvalho de Melo
<arnaldo.melo@gmail.com> wrote:
>
> I guess even just using 'perf probe' to set those wannabe tracepoints
> should be enough, no? Then he can refer to those in his perf record
> call, etc and process it just like with the real tracepoints.

It's far from ideal for two reasons:
- They have different kernels, and dragging along vmlinux
with debug info or multiple 'perf list' data is too cumbersome
operationally. Permanent tracepoints solve this problem.
- The action upon hitting a tracepoint is non-trivial.
The perf probe style of unconditionally walking pointer chains
will trip over wrong pointers.
Plus they already need to do aggregation for high
frequency events.
As part of acting on trace_transmit_skb() event:
if (before(tcb->seq, tcp_sk(sk)->snd_nxt)) {
  tcp_trace_stats_add(...)
}
if (jiffies_to_msecs(jiffies - sktr->last_ts) ..) {
  tcp_trace_stats_add(...)
}
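
For readers unfamiliar with before(): it is the kernel's wraparound-safe comparison of 32-bit TCP sequence numbers. The sketch below reproduces that arithmetic in userspace. The struct and function names are hypothetical stand-ins for the kernel types in the snippet above, and the elided parts of that snippet are deliberately not reconstructed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same arithmetic as the kernel's before(): seq1 precedes seq2 when the
 * difference, interpreted as a signed 32-bit value, is negative. This
 * stays correct across sequence-number wraparound. */
bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/* Hypothetical stand-ins for the two kernel structures involved. */
struct tcb_sketch { uint32_t seq; };	 /* per-skb control block */
struct tp_sketch  { uint32_t snd_nxt; }; /* per-socket TCP state  */

/* Mirrors the first condition above: the segment starts below snd_nxt. */
bool starts_before_snd_nxt(const struct tcb_sketch *tcb,
			   const struct tp_sketch *tp)
{
	assert(tcb && tp);
	return seq_before(tcb->seq, tp->snd_nxt);
}
```

A plain "seq1 < seq2" would give the wrong answer once the sequence space wraps, which is why the signed difference is used.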


* [PATCH] MAINTAINERS: changes for wireless
From: John W. Linville @ 2014-12-17 17:07 UTC (permalink / raw)
  To: netdev; +Cc: linux-wireless, davem, John W. Linville

http://marc.info/?l=linux-wireless&m=141883202530292&w=2

This makes it official... :-)

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
 MAINTAINERS | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index fdffe962a16a..e82d31aeb936 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6603,19 +6603,8 @@ L:	netdev@vger.kernel.org
 S:	Maintained
 
 NETWORKING [WIRELESS]
-M:	"John W. Linville" <linville@tuxdriver.com>
 L:	linux-wireless@vger.kernel.org
 Q:	http://patchwork.kernel.org/project/linux-wireless/list/
-T:	git git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless.git
-S:	Maintained
-F:	net/mac80211/
-F:	net/rfkill/
-F:	net/wireless/
-F:	include/net/ieee80211*
-F:	include/linux/wireless.h
-F:	include/uapi/linux/wireless.h
-F:	include/net/iw_handler.h
-F:	drivers/net/wireless/
 
 NETWORKING DRIVERS
 L:	netdev@vger.kernel.org
@@ -6636,6 +6625,14 @@ F:	include/linux/inetdevice.h
 F:	include/uapi/linux/if_*
 F:	include/uapi/linux/netdevice.h
 
+NETWORKING DRIVERS (WIRELESS)
+M:	Kalle Valo <kvalo@codeaurora.org>
+L:	linux-wireless@vger.kernel.org
+Q:	http://patchwork.kernel.org/project/linux-wireless/list/
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git/
+S:	Maintained
+F:	drivers/net/wireless/
+
 NETXEN (1/10) GbE SUPPORT
 M:	Manish Chopra <manish.chopra@qlogic.com>
 M:	Sony Chacko <sony.chacko@qlogic.com>
-- 
1.9.3


* Re: [PATCH net-next 2/3] Implementation of RFC 4898 Extended TCP Statistics (Web10G)
From: rapier @ 2014-12-17 17:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, alexei.starovoitov, netdev
In-Reply-To: <1418769212.9773.65.camel@edumazet-glaptop2.roam.corp.google.com>

On 12/16/14 5:33 PM, Eric Dumazet wrote:

> There is very little chance web10g ~3000 lines of code are added into
> linux TCP stack, by people who did not submit netdev changes in last
> years.

I do understand this. I went through something similar when submitting 
the hpn-ssh patch set to OpenSSH many years ago. In retrospect we should 
have been submitting subsets of the instruments on a periodic basis over 
the past couple of years.

I also understand the need to be conservative in the approach to 
inclusion of new functionality. No one, least of all my team, wants to 
introduce instability or useless complexity into the stack.

> At Google, we tried the web10g route, but reverted it (today !) in favor
> of tcp_info extensions (ss command from iproute2 can also grab/display
> these), after too many bugs being filed.

I was informed that there was parallel development at Google and the 
decision to move in favor of tcp_info happened not that long ago. I do 
wish Google had shared more of these bug reports with the development 
team so we could have addressed them at the time. That's how things go 
though.

> Researchers love/want to have hundreds of metrics. This does not mean
> linux has to provide them natively, unless we can prove it is really
> damn useful.

We agree with that. What we've found is that determining the most 
valuable metrics often depends on the context. Those looking at 
instrumenting connections within a data center are going to be looking 
at different metrics than someone managing flows across a federated set 
of widely distributed data transfer nodes. I'd be more than happy to 
discuss what instruments provide the most value and which are 
superfluous. In fact, I believe this is a critical conversation *if* the 
community feels that more stack instrumentation would be useful.

> Sorry, but someone had to raise some reality concerns.

No need to apologize. Reality, like the Moon, is a harsh mistress.

> tcp_info _is_ extensible, granted you do not try to push 127 new metrics
> in it.

We are more than willing to look into extending tcp_info. We do think 
our methodology has some value though. One of the things we feel is an 
advantage is that tcp_estats has a method to query the MIB and 
dynamically determine what set of instruments are available. This allows 
for a bit more flexibility in terms of forward/backward compatibility.

Chris


* [PATCH 00/10] Split UFO into v4 and v6 versions.
From: Vladislav Yasevich @ 2014-12-17 18:20 UTC (permalink / raw)
  To: netdev; +Cc: mst, ben, stefanha, virtualization

UFO support in the kernel applies to both IPv4 and IPv6 protocols
with the same device feature.  However some devices may not be able
to support one of the offloads.  For this we split the UFO offload
feature into 2 pieces.  NETIF_F_UFO now controls the IPv4 part and
this series introduces NETIF_F_UFO6.

As a result of this work, we can now re-enable NETIF_F_UFO on
virtio_net devices and restore UDP over IPv4 performance for guests.
We also continue to support legacy guests that assume that UFO6
support is included in UFO(4).

Without this work, migrating a guest to a 3.18 kernel fails.

Vladislav Yasevich (10):
  core: Split out UFO6 support
  net:  Correctly mark IPv6 UFO offload type.
  ovs: Enable handling of UFO6 packets.
  loopback: Turn on UFO6 support.
  veth: Enable UFO6 support.
  macvlan: Enable UFO6 support.
  s2io: Enable UFO6 support.
  tun: Re-enable UFO support.
  macvtap: Re-enable UFO support
  Revert "drivers/net: Disable UFO through virtio"

 drivers/net/ethernet/neterion/s2io.c |  6 +++---
 drivers/net/loopback.c               |  4 ++--
 drivers/net/macvlan.c                |  2 +-
 drivers/net/macvtap.c                | 20 ++++++++++++++------
 drivers/net/tun.c                    | 26 ++++++++++++++------------
 drivers/net/veth.c                   |  2 +-
 drivers/net/virtio_net.c             | 24 ++++++++++--------------
 include/linux/netdev_features.h      |  7 +++++--
 include/linux/netdevice.h            |  1 +
 include/linux/skbuff.h               |  1 +
 net/core/dev.c                       | 35 +++++++++++++++++++----------------
 net/core/ethtool.c                   |  2 +-
 net/ipv6/ip6_offload.c               |  1 +
 net/ipv6/ip6_output.c                |  4 ++--
 net/ipv6/udp_offload.c               |  3 ++-
 net/mpls/mpls_gso.c                  |  1 +
 net/openvswitch/datapath.c           |  3 ++-
 net/openvswitch/flow.c               |  2 +-
 18 files changed, 81 insertions(+), 63 deletions(-)

-- 
1.9.3


* [PATCH 03/10] ovs: Enable handling of UFO6 packets.
From: Vladislav Yasevich @ 2014-12-17 18:20 UTC (permalink / raw)
  To: netdev; +Cc: mst, ben, stefanha, virtualization
In-Reply-To: <1418840455-22598-1-git-send-email-vyasevic@redhat.com>

Since UFO6 packets can now be identified by SKB_GSO_UDP6, add proper checks
to handle UFO6 flows.
Legacy applications may still have UFO6 packets identified by SKB_GSO_UDP,
so we need to continue to handle them correctly.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
 net/openvswitch/datapath.c | 3 ++-
 net/openvswitch/flow.c     | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index f9e556b..b43fc60 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -334,7 +334,8 @@ static int queue_gso_packets(struct datapath *dp, struct sk_buff *skb,
 		if (err)
 			break;
 
-		if (skb == segs && gso_type & SKB_GSO_UDP) {
+		if (skb == segs &&
+		    ((gso_type & SKB_GSO_UDP) || (gso_type & SKB_GSO_UDP6))) {
 			/* The initial flow key extracted by ovs_flow_extract()
 			 * in this case is for a first fragment, so we need to
 			 * properly mark later fragments.
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 2b78789..d03adf4 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -602,7 +602,7 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key)
 
 		if (key->ip.frag == OVS_FRAG_TYPE_LATER)
 			return 0;
-		if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
+		if (skb_shinfo(skb)->gso_type & (SKB_GSO_UDP | SKB_GSO_UDP6))
 			key->ip.frag = OVS_FRAG_TYPE_FIRST;
 
 		/* Transport layer. */
-- 
1.9.3
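
The two hunks above test for "either UFO type" in two different spellings. A small userspace sketch shows that they are the same predicate. The SKB_GSO_UDP value is an assumption about the skbuff.h of that era, SKB_GSO_UDP6 is the value added by patch 01 of this series, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values for illustration; check include/linux/skbuff.h for the
 * authoritative definitions. */
enum {
	SKB_GSO_UDP  = 1 << 1,	/* assumed */
	SKB_GSO_UDP6 = 1 << 13,	/* as added by patch 01 */
};

/* Spelling used in the datapath.c hunk. */
bool is_ufo_long_form(int gso_type)
{
	assert(gso_type >= 0);	/* gso_type is a bitmask */
	return (gso_type & SKB_GSO_UDP) || (gso_type & SKB_GSO_UDP6);
}

/* Spelling used in the flow.c hunk. */
bool is_ufo_mask_form(int gso_type)
{
	assert(gso_type >= 0);
	return (gso_type & (SKB_GSO_UDP | SKB_GSO_UDP6)) != 0;
}
```

Either form keeps legacy packets (IPv6 UFO still tagged SKB_GSO_UDP) on the same path as packets tagged with the new bit.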


* [PATCH 01/10] core: Split out UFO6 support
From: Vladislav Yasevich @ 2014-12-17 18:20 UTC (permalink / raw)
  To: netdev; +Cc: virtualization, mst, ben, stefanha, Vladislav Yasevich
In-Reply-To: <1418840455-22598-1-git-send-email-vyasevic@redhat.com>

Split IPv6 support for UFO into its own feature similar to TSO.
This will later allow us to re-enable UFO support for virtio-net
devices.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
 include/linux/netdev_features.h |  7 +++++--
 include/linux/netdevice.h       |  1 +
 include/linux/skbuff.h          |  1 +
 net/core/dev.c                  | 35 +++++++++++++++++++----------------
 net/core/ethtool.c              |  2 +-
 5 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index dcfdecb..a078945 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -48,8 +48,9 @@ enum {
 	NETIF_F_GSO_UDP_TUNNEL_BIT,	/* ... UDP TUNNEL with TSO */
 	NETIF_F_GSO_UDP_TUNNEL_CSUM_BIT,/* ... UDP TUNNEL with TSO & CSUM */
 	NETIF_F_GSO_MPLS_BIT,		/* ... MPLS segmentation */
+	NETIF_F_UFO6_BIT,		/* ... UDPv6 fragmentation */
 	/**/NETIF_F_GSO_LAST =		/* last bit, see GSO_MASK */
-		NETIF_F_GSO_MPLS_BIT,
+		NETIF_F_UFO6_BIT,
 
 	NETIF_F_FCOE_CRC_BIT,		/* FCoE CRC32 */
 	NETIF_F_SCTP_CSUM_BIT,		/* SCTP checksum offload */
@@ -109,6 +110,7 @@ enum {
 #define NETIF_F_TSO_ECN		__NETIF_F(TSO_ECN)
 #define NETIF_F_TSO		__NETIF_F(TSO)
 #define NETIF_F_UFO		__NETIF_F(UFO)
+#define NETIF_F_UFO6		__NETIF_F(UFO6)
 #define NETIF_F_VLAN_CHALLENGED	__NETIF_F(VLAN_CHALLENGED)
 #define NETIF_F_RXFCS		__NETIF_F(RXFCS)
 #define NETIF_F_RXALL		__NETIF_F(RXALL)
@@ -141,7 +143,7 @@ enum {
 
 /* List of features with software fallbacks. */
 #define NETIF_F_GSO_SOFTWARE	(NETIF_F_TSO | NETIF_F_TSO_ECN | \
-				 NETIF_F_TSO6 | NETIF_F_UFO)
+				 NETIF_F_TSO6 | NETIF_F_UFO | NETIF_F_UFO6)
 
 #define NETIF_F_GEN_CSUM	NETIF_F_HW_CSUM
 #define NETIF_F_V4_CSUM		(NETIF_F_GEN_CSUM | NETIF_F_IP_CSUM)
@@ -149,6 +151,7 @@ enum {
 #define NETIF_F_ALL_CSUM	(NETIF_F_V4_CSUM | NETIF_F_V6_CSUM)
 
 #define NETIF_F_ALL_TSO 	(NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_TSO_ECN)
+#define NETIF_F_ALL_UFO		(NETIF_F_UFO | NETIF_F_UFO6)
 
 #define NETIF_F_ALL_FCOE	(NETIF_F_FCOE_CRC | NETIF_F_FCOE_MTU | \
 				 NETIF_F_FSO)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 74fd5d3..86af10a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3559,6 +3559,7 @@ static inline bool net_gso_ok(netdev_features_t features, int gso_type)
 	/* check flags correspondence */
 	BUILD_BUG_ON(SKB_GSO_TCPV4   != (NETIF_F_TSO >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_UDP     != (NETIF_F_UFO >> NETIF_F_GSO_SHIFT));
+	BUILD_BUG_ON(SKB_GSO_UDP6    != (NETIF_F_UFO6 >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_DODGY   != (NETIF_F_GSO_ROBUST >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_TCP_ECN != (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_TCPV6   != (NETIF_F_TSO6 >> NETIF_F_GSO_SHIFT));
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6c8b6f6..8538b67 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -372,6 +372,7 @@ enum {
 
 	SKB_GSO_MPLS = 1 << 12,
 
+	SKB_GSO_UDP6 = 1 << 13
 };
 
 #if BITS_PER_LONG > 32
diff --git a/net/core/dev.c b/net/core/dev.c
index 945bbd0..fa4d2ee 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5929,6 +5929,12 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
 		features &= ~NETIF_F_ALL_TSO;
 	}
 
+	/* UFO requires that SG is present as well */
+	if ((features & NETIF_F_ALL_UFO) && !(features & NETIF_F_SG)) {
+		netdev_dbg(dev, "Dropping UFO features since no SG feature.\n");
+		features &= ~NETIF_F_ALL_UFO;
+	}
+
 	if ((features & NETIF_F_TSO) && !(features & NETIF_F_HW_CSUM) &&
 					!(features & NETIF_F_IP_CSUM)) {
 		netdev_dbg(dev, "Dropping TSO features since no CSUM feature.\n");
@@ -5952,24 +5958,21 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
 		features &= ~NETIF_F_GSO;
 	}
 
-	/* UFO needs SG and checksumming */
-	if (features & NETIF_F_UFO) {
-		/* maybe split UFO into V4 and V6? */
-		if (!((features & NETIF_F_GEN_CSUM) ||
-		    (features & (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))
-			    == (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))) {
-			netdev_dbg(dev,
-				"Dropping NETIF_F_UFO since no checksum offload features.\n");
-			features &= ~NETIF_F_UFO;
-		}
-
-		if (!(features & NETIF_F_SG)) {
-			netdev_dbg(dev,
-				"Dropping NETIF_F_UFO since no NETIF_F_SG feature.\n");
-			features &= ~NETIF_F_UFO;
-		}
+	/* UFO also needs checksumming */
+	if ((features & NETIF_F_UFO) && !(features & NETIF_F_GEN_CSUM) &&
+					!(features & NETIF_F_IP_CSUM)) {
+		netdev_dbg(dev,
+			   "Dropping NETIF_F_UFO since no checksum offload features.\n");
+		features &= ~NETIF_F_UFO;
+	}
+	if ((features & NETIF_F_UFO6) && !(features & NETIF_F_GEN_CSUM) &&
+					 !(features & NETIF_F_IPV6_CSUM)) {
+		netdev_dbg(dev,
+			   "Dropping NETIF_F_UFO6 since no checksum offload features.\n");
+		features &= ~NETIF_F_UFO6;
 	}
 
+
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	if (dev->netdev_ops->ndo_busy_poll)
 		features |= NETIF_F_BUSY_POLL;
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 06dfb29..93eff41 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -223,7 +223,7 @@ static netdev_features_t ethtool_get_feature_mask(u32 eth_cmd)
 		return NETIF_F_ALL_TSO;
 	case ETHTOOL_GUFO:
 	case ETHTOOL_SUFO:
-		return NETIF_F_UFO;
+		return NETIF_F_ALL_UFO;
 	case ETHTOOL_GGSO:
 	case ETHTOOL_SGSO:
 		return NETIF_F_GSO;
-- 
1.9.3
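
The dependency rules that the net/core/dev.c hunk above installs in netdev_fix_features() can be modelled in userspace as follows. This is a sketch only: the F_* bit positions are made up for the example (the real ones live in include/linux/netdev_features.h) and the function name is hypothetical. The three checks follow the order and conditions of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions, not the kernel's. */
#define F_SG		(1u << 0)
#define F_IP_CSUM	(1u << 1)
#define F_GEN_CSUM	(1u << 2)
#define F_IPV6_CSUM	(1u << 3)
#define F_UFO		(1u << 4)
#define F_UFO6		(1u << 5)
#define F_ALL_UFO	(F_UFO | F_UFO6)
#define F_KNOWN		(F_SG | F_IP_CSUM | F_GEN_CSUM | F_IPV6_CSUM | F_ALL_UFO)

uint32_t fix_ufo_features(uint32_t features)
{
	assert((features & ~F_KNOWN) == 0);	/* sketch models six bits only */

	/* Either UFO flavour requires scatter-gather. */
	if ((features & F_ALL_UFO) && !(features & F_SG))
		features &= ~F_ALL_UFO;

	/* IPv4 UFO needs generic or IPv4 checksum offload. */
	if ((features & F_UFO) && !(features & (F_GEN_CSUM | F_IP_CSUM)))
		features &= ~F_UFO;

	/* IPv6 UFO needs generic or IPv6 checksum offload. */
	if ((features & F_UFO6) && !(features & (F_GEN_CSUM | F_IPV6_CSUM)))
		features &= ~F_UFO6;

	return features;
}
```

Compared with the removed code, a device that offers only IPv4 checksum offload now keeps NETIF_F_UFO and loses only NETIF_F_UFO6, where previously it lost UFO entirely.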


* [PATCH 02/10] net:  Correctly mark IPv6 UFO offload type.
From: Vladislav Yasevich @ 2014-12-17 18:20 UTC (permalink / raw)
  To: netdev; +Cc: virtualization, mst, ben, stefanha, Vladislav Yasevich
In-Reply-To: <1418840455-22598-1-git-send-email-vyasevic@redhat.com>

If the device supports UFO6 features, mark the offloaded ipv6 udp
traffic appropriately.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
 net/ipv6/ip6_offload.c | 1 +
 net/ipv6/ip6_output.c  | 4 ++--
 net/ipv6/udp_offload.c | 3 ++-
 net/mpls/mpls_gso.c    | 1 +
 4 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 01e12d0..bd985d5 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -81,6 +81,7 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 		       SKB_GSO_UDP_TUNNEL_CSUM |
 		       SKB_GSO_MPLS |
 		       SKB_GSO_TCPV6 |
+		       SKB_GSO_UDP6 |
 		       0)))
 		goto out;
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 8e950c2..83f5c04 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1089,7 +1089,7 @@ static inline int ip6_ufo_append_data(struct sock *sk,
 	 */
 	skb_shinfo(skb)->gso_size = (mtu - fragheaderlen -
 				     sizeof(struct frag_hdr)) & ~7;
-	skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
+	skb_shinfo(skb)->gso_type = SKB_GSO_UDP6;
 	ipv6_select_ident(&fhdr, rt);
 	skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
 
@@ -1296,7 +1296,7 @@ emsgsize:
 	if (((length > mtu) ||
 	     (skb && skb_is_gso(skb))) &&
 	    (sk->sk_protocol == IPPROTO_UDP) &&
-	    (rt->dst.dev->features & NETIF_F_UFO)) {
+	    (rt->dst.dev->features & NETIF_F_UFO6)) {
 		err = ip6_ufo_append_data(sk, getfrag, from, length,
 					  hh_len, fragheaderlen,
 					  transhdrlen, mtu, flags, rt);
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 6b8f543..00d723e 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -39,6 +39,7 @@ static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb,
 		int type = skb_shinfo(skb)->gso_type;
 
 		if (unlikely(type & ~(SKB_GSO_UDP |
+				      SKB_GSO_UDP6 |
 				      SKB_GSO_DODGY |
 				      SKB_GSO_UDP_TUNNEL |
 				      SKB_GSO_UDP_TUNNEL_CSUM |
@@ -47,7 +48,7 @@ static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb,
 				      SKB_GSO_IPIP |
 				      SKB_GSO_SIT |
 				      SKB_GSO_MPLS) ||
-			     !(type & (SKB_GSO_UDP))))
+			     !(type & (SKB_GSO_UDP | SKB_GSO_UDP6))))
 			goto out;
 
 		skb_shinfo(skb)->gso_segs = DIV_ROUND_UP(skb->len, mss);
diff --git a/net/mpls/mpls_gso.c b/net/mpls/mpls_gso.c
index e3545f2..27343f0 100644
--- a/net/mpls/mpls_gso.c
+++ b/net/mpls/mpls_gso.c
@@ -30,6 +30,7 @@ static struct sk_buff *mpls_gso_segment(struct sk_buff *skb,
 				~(SKB_GSO_TCPV4 |
 				  SKB_GSO_TCPV6 |
 				  SKB_GSO_UDP |
+				  SKB_GSO_UDP6 |
 				  SKB_GSO_DODGY |
 				  SKB_GSO_TCP_ECN |
 				  SKB_GSO_GRE |
-- 
1.9.3
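
The gso_type check in udp6_ufo_fragment() above combines an allow-list with a "must be UFO" requirement. A reduced userspace model of that shape is sketched below. The enum values are assumptions for illustration (only the UDP6 bit comes from this series), the names are hypothetical, and the real allow-list is longer than the one used in the usage note.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values; see include/linux/skbuff.h for the real ones. */
enum {
	GSO_TCPV4 = 1 << 0,
	GSO_UDP   = 1 << 1,
	GSO_DODGY = 1 << 2,
	GSO_UDP6  = 1 << 13,	/* as added by patch 01 */
};

/* Accept a gso_type only if it carries no bit outside 'allowed' and
 * carries at least one of the two UFO bits. */
bool ufo_type_acceptable(unsigned int type, unsigned int allowed)
{
	/* An allow-list without either UFO bit could never accept anything. */
	assert(allowed & (GSO_UDP | GSO_UDP6));

	if (type & ~allowed)
		return false;
	return (type & (GSO_UDP | GSO_UDP6)) != 0;
}
```

With an allow-list of GSO_UDP | GSO_UDP6 | GSO_DODGY, a packet tagged GSO_UDP6 | GSO_DODGY passes, one tagged only GSO_DODGY is rejected for not being UFO, and one tagged GSO_UDP6 | GSO_TCPV4 is rejected for carrying a bit outside the list.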


* [PATCH 04/10] loopback: Turn on UFO6 support.
From: Vladislav Yasevich @ 2014-12-17 18:20 UTC (permalink / raw)
  To: netdev; +Cc: virtualization, mst, ben, stefanha, Vladislav Yasevich
In-Reply-To: <1418840455-22598-1-git-send-email-vyasevic@redhat.com>

Turn on UFO6 support.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
 drivers/net/loopback.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index c76283c..762c28a 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -170,10 +170,10 @@ static void loopback_setup(struct net_device *dev)
 	dev->flags		= IFF_LOOPBACK;
 	dev->priv_flags		|= IFF_LIVE_ADDR_CHANGE;
 	netif_keep_dst(dev);
-	dev->hw_features	= NETIF_F_ALL_TSO | NETIF_F_UFO;
+	dev->hw_features	= NETIF_F_ALL_TSO | NETIF_F_ALL_UFO;
 	dev->features 		= NETIF_F_SG | NETIF_F_FRAGLIST
 		| NETIF_F_ALL_TSO
-		| NETIF_F_UFO
+		| NETIF_F_ALL_UFO
 		| NETIF_F_HW_CSUM
 		| NETIF_F_RXCSUM
 		| NETIF_F_SCTP_CSUM
-- 
1.9.3


* [PATCH 05/10] veth: Enable UFO6 support.
From: Vladislav Yasevich @ 2014-12-17 18:20 UTC (permalink / raw)
  To: netdev; +Cc: virtualization, mst, ben, stefanha, Vladislav Yasevich
In-Reply-To: <1418840455-22598-1-git-send-email-vyasevic@redhat.com>

Turn on UFO6 feature.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
 drivers/net/veth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 8ad5965..0052db5 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -280,7 +280,7 @@ static const struct net_device_ops veth_netdev_ops = {
 #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_ALL_TSO |    \
 		       NETIF_F_HW_CSUM | NETIF_F_RXCSUM | NETIF_F_HIGHDMA | \
 		       NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL |	    \
-		       NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT | NETIF_F_UFO	|   \
+		       NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT | NETIF_F_ALL_UFO | \
 		       NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX | \
 		       NETIF_F_HW_VLAN_STAG_TX | NETIF_F_HW_VLAN_STAG_RX )
 
-- 
1.9.3


* [PATCH 06/10] macvlan: Enable UFO6 support.
From: Vladislav Yasevich @ 2014-12-17 18:20 UTC (permalink / raw)
  To: netdev; +Cc: virtualization, mst, ben, stefanha, Vladislav Yasevich
In-Reply-To: <1418840455-22598-1-git-send-email-vyasevic@redhat.com>

Turn on UFO6 feature.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
 drivers/net/macvlan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index bfb0b6e..807b98d 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -746,7 +746,7 @@ static struct lock_class_key macvlan_netdev_addr_lock_key;
 
 #define MACVLAN_FEATURES \
 	(NETIF_F_SG | NETIF_F_ALL_CSUM | NETIF_F_HIGHDMA | NETIF_F_FRAGLIST | \
-	 NETIF_F_GSO | NETIF_F_TSO | NETIF_F_UFO | NETIF_F_GSO_ROBUST | \
+	 NETIF_F_GSO | NETIF_F_TSO | NETIF_F_ALL_UFO | NETIF_F_GSO_ROBUST | \
 	 NETIF_F_TSO_ECN | NETIF_F_TSO6 | NETIF_F_GRO | NETIF_F_RXCSUM | \
 	 NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_STAG_FILTER)
 
-- 
1.9.3


* [PATCH 07/10] s2io: Enable UFO6 support.
From: Vladislav Yasevich @ 2014-12-17 18:20 UTC (permalink / raw)
  To: netdev; +Cc: virtualization, mst, ben, stefanha, Vladislav Yasevich, Jon Mason
In-Reply-To: <1418840455-22598-1-git-send-email-vyasevic@redhat.com>

CC: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
 drivers/net/ethernet/neterion/s2io.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/neterion/s2io.c b/drivers/net/ethernet/neterion/s2io.c
index f5e4b82..d823bb7 100644
--- a/drivers/net/ethernet/neterion/s2io.c
+++ b/drivers/net/ethernet/neterion/s2io.c
@@ -4140,7 +4140,7 @@ static netdev_tx_t s2io_xmit(struct sk_buff *skb, struct net_device *dev)
 	}
 
 	frg_len = skb_headlen(skb);
-	if (offload_type == SKB_GSO_UDP) {
+	if (offload_type == SKB_GSO_UDP || offload_type == SKB_GSO_UDP6) {
 		int ufo_size;
 
 		ufo_size = s2io_udp_mss(skb);
@@ -7917,9 +7917,9 @@ s2io_init_nic(struct pci_dev *pdev, const struct pci_device_id *pre)
 	dev->features |= dev->hw_features |
 		NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
 	if (sp->device_type & XFRAME_II_DEVICE) {
-		dev->hw_features |= NETIF_F_UFO;
+		dev->hw_features |= NETIF_F_ALL_UFO;
 		if (ufo)
-			dev->features |= NETIF_F_UFO;
+			dev->features |= NETIF_F_ALL_UFO;
 	}
 	if (sp->high_dma_flag == true)
 		dev->features |= NETIF_F_HIGHDMA;
-- 
1.9.3


* [PATCH 08/10] tun: Re-enable UFO support.
From: Vladislav Yasevich @ 2014-12-17 18:20 UTC (permalink / raw)
  To: netdev; +Cc: virtualization, mst, ben, stefanha, Vladislav Yasevich
In-Reply-To: <1418840455-22598-1-git-send-email-vyasevic@redhat.com>

Now that UFO is split into v4 and v6 parts, we can bring
back v4 support without any trouble.

Continue to handle legacy applications by selecting the
IPv6 fragment id but do not change the gso type.  This
makes sure that two legacy VMs may still communicate.
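
As a rough illustration of the legacy handling described above (a userspace
sketch, not the tun code itself): a VIRTIO_NET_HDR_GSO_UDP header keeps the
plain UDP GSO type even when the payload is IPv6, and the IPv6 case is only
flagged as needing a host-selected fragment id.  The demo_* names and the
demo_gso enum are hypothetical stand-ins; only the VIRTIO_NET_HDR_GSO_* and
ETH_P_* numbers are the real ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* GSO types carried in the virtio-net header. */
#define VIRTIO_NET_HDR_GSO_TCPV4 1
#define VIRTIO_NET_HDR_GSO_UDP   3
#define VIRTIO_NET_HDR_GSO_TCPV6 4

/* EtherType values. */
#define ETH_P_IP   0x0800
#define ETH_P_IPV6 0x86DD

/* Hypothetical stand-ins; these are NOT the kernel's SKB_GSO_* values. */
enum demo_gso { DEMO_GSO_NONE, DEMO_GSO_TCPV4, DEMO_GSO_TCPV6, DEMO_GSO_UDP };

struct demo_gso_result {
	enum demo_gso gso_type;
	bool need_frag_id;	/* host must pick an IPv6 fragment id */
};

/* Returns 0 on success, -1 for a GSO type the sketch does not know. */
static int demo_from_vnet_hdr(uint8_t hdr_gso, uint16_t l3_proto,
			      struct demo_gso_result *out)
{
	assert(out != NULL);
	out->need_frag_id = false;

	switch (hdr_gso) {
	case VIRTIO_NET_HDR_GSO_TCPV4:
		out->gso_type = DEMO_GSO_TCPV4;
		return 0;
	case VIRTIO_NET_HDR_GSO_TCPV6:
		out->gso_type = DEMO_GSO_TCPV6;
		return 0;
	case VIRTIO_NET_HDR_GSO_UDP:
		/* The GSO type stays "UDP" for both IPv4 and IPv6 so that
		 * a legacy peer still recognises it.
		 */
		out->gso_type = DEMO_GSO_UDP;
		if (l3_proto == ETH_P_IPV6)
			out->need_frag_id = true;
		return 0;
	default:
		out->gso_type = DEMO_GSO_NONE;
		return -1;
	}
}
```

Calling demo_from_vnet_hdr(VIRTIO_NET_HDR_GSO_UDP, ETH_P_IPV6, &r) leaves
r.gso_type at DEMO_GSO_UDP and sets r.need_frag_id, which is the point of
the "do not change the gso type" remark.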

Based on original work from Ben Hutchings.

Fixes: 88e0e0e5aa7a ("drivers/net: Disable UFO through virtio")
CC: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
 drivers/net/tun.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 9dd3746..8c32fca 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -175,7 +175,7 @@ struct tun_struct {
 	struct net_device	*dev;
 	netdev_features_t	set_features;
 #define TUN_USER_FEATURES (NETIF_F_HW_CSUM|NETIF_F_TSO_ECN|NETIF_F_TSO| \
-			  NETIF_F_TSO6)
+			  NETIF_F_TSO6|NETIF_F_UFO)
 
 	int			vnet_hdr_sz;
 	int			sndbuf;
@@ -1152,20 +1152,15 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 			skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
 			break;
 		case VIRTIO_NET_HDR_GSO_UDP:
-		{
-			static bool warned;
-
-			if (!warned) {
-				warned = true;
-				netdev_warn(tun->dev,
-					    "%s: using disabled UFO feature; please fix this program\n",
-					    current->comm);
-			}
 			skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
-			if (skb->protocol == htons(ETH_P_IPV6))
+			if (vlan_get_protocol(skb) == htons(ETH_P_IPV6)) {
+				/* This allows legacy applications to work.
+				 * Do not change the gso_type as it may
+				 * not be understood by legacy applications.
+				 */
 				ipv6_proxy_select_ident(skb);
+			}
 			break;
-		}
 		default:
 			tun->dev->stats.rx_frame_errors++;
 			kfree_skb(skb);
@@ -1273,6 +1268,8 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 				gso.gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
 			else if (sinfo->gso_type & SKB_GSO_TCPV6)
 				gso.gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
+			else if (sinfo->gso_type & SKB_GSO_UDP)
+				gso.gso_type = VIRTIO_NET_HDR_GSO_UDP;
 			else {
 				pr_err("unexpected GSO type: "
 				       "0x%x, gso_size %d, hdr_len %d\n",
@@ -1780,6 +1777,11 @@ static int set_offload(struct tun_struct *tun, unsigned long arg)
 				features |= NETIF_F_TSO6;
 			arg &= ~(TUN_F_TSO4|TUN_F_TSO6);
 		}
+
+		if (arg & TUN_F_UFO) {
+			features |= NETIF_F_UFO;
+			arg &= ~TUN_F_UFO;
+		}
 	}
 
 	/* This gives the user a way to test for new features in future by
-- 
1.9.3

^ permalink raw reply related

* [PATCH 09/10] macvtap: Re-enable UFO support
From: Vladislav Yasevich @ 2014-12-17 18:20 UTC (permalink / raw)
  To: netdev; +Cc: virtualization, mst, ben, stefanha, Vladislav Yasevich
In-Reply-To: <1418840455-22598-1-git-send-email-vyasevic@redhat.com>

Now that UFO is split into v4 and v6 parts, we can bring
back v4 support.  Continue to handle legacy applications
by selecting the IPv6 fragment id but do not change the
gso type.  This allows 2 legacy VMs to continue to communicate.
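
For readers following the TUNSETOFFLOAD side of this change, here is a
minimal userspace sketch of how the accepted flag set and the RX offload
decision might look once TUN_F_UFO is allowed again.  The TUN_F_* values are
those from linux/if_tun.h; demo_offload_check() and its out-parameter are
hypothetical and only model the logic, they are not the macvtap functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Offload flags as defined in linux/if_tun.h. */
#define TUN_F_CSUM    0x01
#define TUN_F_TSO4    0x02
#define TUN_F_TSO6    0x04
#define TUN_F_TSO_ECN 0x08
#define TUN_F_UFO     0x10

#define DEMO_KNOWN_FLAGS \
	(TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6 | TUN_F_TSO_ECN | TUN_F_UFO)

/* Hypothetical helper: rejects unknown flags with -EINVAL, otherwise
 * reports whether RX offloads would stay enabled.  Segmentation offloads
 * only count when checksum offload is requested as well.
 */
static int demo_offload_check(unsigned long arg, bool *rx_offloads)
{
	assert(rx_offloads != NULL);

	if (arg & ~(unsigned long)DEMO_KNOWN_FLAGS)
		return -EINVAL;	/* lets user space probe for future flags */

	*rx_offloads = (arg & TUN_F_CSUM) &&
		       (arg & (TUN_F_TSO4 | TUN_F_TSO6 | TUN_F_UFO));
	return 0;
}
```

With this model, TUN_F_CSUM | TUN_F_UFO alone is enough to keep RX offloads
on, which mirrors the NETIF_F_UFO addition to the feature_mask test in the
diff below.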

Based on original work from Ben Hutchings.

Fixes: 88e0e0e5aa7a ("drivers/net: Disable UFO through virtio")
CC: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
 drivers/net/macvtap.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 880cc09..75febd4 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -66,7 +66,7 @@ static struct cdev macvtap_cdev;
 static const struct proto_ops macvtap_socket_ops;
 
 #define TUN_OFFLOADS (NETIF_F_HW_CSUM | NETIF_F_TSO_ECN | NETIF_F_TSO | \
-		      NETIF_F_TSO6)
+		      NETIF_F_TSO6 | NETIF_F_UFO)
 #define RX_OFFLOADS (NETIF_F_GRO | NETIF_F_LRO)
 #define TAP_FEATURES (NETIF_F_GSO | NETIF_F_SG)
 
@@ -570,11 +570,14 @@ static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
 			gso_type = SKB_GSO_TCPV6;
 			break;
 		case VIRTIO_NET_HDR_GSO_UDP:
-			pr_warn_once("macvtap: %s: using disabled UFO feature; please fix this program\n",
-				     current->comm);
 			gso_type = SKB_GSO_UDP;
-			if (skb->protocol == htons(ETH_P_IPV6))
+			if (vlan_get_protocol(skb) == htons(ETH_P_IPV6)) {
+				/* This is to support legacy applications.
+				 * Do not change the gso_type as legacy apps
+				 * may not know about the new type.
+				 */
 				ipv6_proxy_select_ident(skb);
+			}
 			break;
 		default:
 			return -EINVAL;
@@ -619,6 +622,8 @@ static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
 			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
 		else if (sinfo->gso_type & SKB_GSO_TCPV6)
 			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
+		else if (sinfo->gso_type & SKB_GSO_UDP)
+			vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP;
 		else
 			BUG();
 		if (sinfo->gso_type & SKB_GSO_TCP_ECN)
@@ -955,6 +960,9 @@ static int set_offload(struct macvtap_queue *q, unsigned long arg)
 			if (arg & TUN_F_TSO6)
 				feature_mask |= NETIF_F_TSO6;
 		}
+
+		if (arg & TUN_F_UFO)
+			feature_mask |= NETIF_F_UFO;
 	}
 
 	/* tun/tap driver inverts the usage for TSO offloads, where
@@ -965,7 +973,7 @@ static int set_offload(struct macvtap_queue *q, unsigned long arg)
 	 * When user space turns off TSO, we turn off GSO/LRO so that
 	 * user-space will not receive TSO frames.
 	 */
-	if (feature_mask & (NETIF_F_TSO | NETIF_F_TSO6))
+	if (feature_mask & (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_UFO))
 		features |= RX_OFFLOADS;
 	else
 		features &= ~RX_OFFLOADS;
@@ -1066,7 +1074,7 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
 	case TUNSETOFFLOAD:
 		/* let the user check for future flags */
 		if (arg & ~(TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6 |
-			    TUN_F_TSO_ECN))
+			    TUN_F_TSO_ECN | TUN_F_UFO))
 			return -EINVAL;
 
 		rtnl_lock();
-- 
1.9.3

^ permalink raw reply related

* [PATCH 10/10] Revert "drivers/net: Disable UFO through virtio"
From: Vladislav Yasevich @ 2014-12-17 18:20 UTC (permalink / raw)
  To: netdev; +Cc: virtualization, mst, ben, stefanha, Vladislav Yasevich
In-Reply-To: <1418840455-22598-1-git-send-email-vyasevic@redhat.com>

This reverts commit 3d0ad09412ffe00c9afa201d01effdb6023d09b4.
Now that we've split UFO into v4 and v6 versions, we can turn
UFO support back on for IPv4.  Full IPv6 support will come later as
it requires extending the vnet header structure.

Any older VM that assumes IPv6 support is included in UFO
will continue to use UFO and the host will generate fragment
ids for it, thus preserving connectivity.

Fixes: 88e0e0e5aa7a ("drivers/net: Disable UFO through virtio")
CC: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
---
 drivers/net/virtio_net.c | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b0bc8ea..534b633 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -491,17 +491,8 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
 			skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
 			break;
 		case VIRTIO_NET_HDR_GSO_UDP:
-		{
-			static bool warned;
-
-			if (!warned) {
-				warned = true;
-				netdev_warn(dev,
-					    "host using disabled UFO feature; please fix it\n");
-			}
 			skb_shinfo(skb)->gso_type = SKB_GSO_UDP;
 			break;
-		}
 		case VIRTIO_NET_HDR_GSO_TCPV6:
 			skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
 			break;
@@ -890,6 +881,8 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
 			hdr->hdr.gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
 		else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
 			hdr->hdr.gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
+		else if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
+			hdr->hdr.gso_type = VIRTIO_NET_HDR_GSO_UDP;
 		else
 			BUG();
 		if (skb_shinfo(skb)->gso_type & SKB_GSO_TCP_ECN)
@@ -1749,7 +1742,7 @@ static int virtnet_probe(struct virtio_device *vdev)
 			dev->features |= NETIF_F_HW_CSUM|NETIF_F_SG|NETIF_F_FRAGLIST;
 
 		if (virtio_has_feature(vdev, VIRTIO_NET_F_GSO)) {
-			dev->hw_features |= NETIF_F_TSO
+			dev->hw_features |= NETIF_F_TSO | NETIF_F_UFO
 				| NETIF_F_TSO_ECN | NETIF_F_TSO6;
 		}
 		/* Individual feature bits: what can host handle? */
@@ -1759,9 +1752,11 @@ static int virtnet_probe(struct virtio_device *vdev)
 			dev->hw_features |= NETIF_F_TSO6;
 		if (virtio_has_feature(vdev, VIRTIO_NET_F_HOST_ECN))
 			dev->hw_features |= NETIF_F_TSO_ECN;
+		if (virtio_has_feature(vdev, VIRTIO_NET_F_HOST_UFO))
+			dev->hw_features |= NETIF_F_UFO;
 
 		if (gso)
-			dev->features |= dev->hw_features & NETIF_F_ALL_TSO;
+			dev->features |= dev->hw_features & (NETIF_F_ALL_TSO|NETIF_F_UFO);
 		/* (!csum && gso) case will be fixed by register_netdev() */
 	}
 	if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM))
@@ -1799,7 +1794,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 	/* If we can receive ANY GSO packets, we must allocate large ones. */
 	if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
 	    virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6) ||
-	    virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_ECN))
+	    virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_ECN) ||
+	    virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_UFO))
 		vi->big_packets = true;
 
 	if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
@@ -1993,9 +1989,9 @@ static struct virtio_device_id id_table[] = {
 static unsigned int features[] = {
 	VIRTIO_NET_F_CSUM, VIRTIO_NET_F_GUEST_CSUM,
 	VIRTIO_NET_F_GSO, VIRTIO_NET_F_MAC,
-	VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6,
+	VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_UFO, VIRTIO_NET_F_HOST_TSO6,
 	VIRTIO_NET_F_HOST_ECN, VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
-	VIRTIO_NET_F_GUEST_ECN,
+	VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_UFO,
 	VIRTIO_NET_F_MRG_RXBUF, VIRTIO_NET_F_STATUS, VIRTIO_NET_F_CTRL_VQ,
 	VIRTIO_NET_F_CTRL_RX, VIRTIO_NET_F_CTRL_VLAN,
 	VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ,
-- 
1.9.3

^ permalink raw reply related

* Re: [PATCH net 2/2] geneve: Fix races between socket add and release.
From: Jesse Gross @ 2014-12-17 18:48 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David Miller, netdev, Andy Zhou, Stephen Hemminger
In-Reply-To: <20141217165454.GE28766@casper.infradead.org>

On Wed, Dec 17, 2014 at 8:54 AM, Thomas Graf <tgraf@suug.ch> wrote:
> On 12/16/14 at 06:25pm, Jesse Gross wrote:
>> diff --git a/net/ipv4/geneve.c b/net/ipv4/geneve.c
>> index 5a47188..95e47c9 100644
>> --- a/net/ipv4/geneve.c
>> +++ b/net/ipv4/geneve.c
>> @@ -296,6 +296,7 @@ struct geneve_sock *geneve_sock_add(struct net *net, __be16 port,
>>                                   geneve_rcv_t *rcv, void *data,
>>                                   bool no_share, bool ipv6)
>>  {
>> +     struct geneve_net *gn = net_generic(net, geneve_net_id);
>>       struct geneve_sock *gs;
>>
>>       gs = geneve_socket_create(net, port, rcv, data, ipv6);
>> @@ -305,15 +306,15 @@ struct geneve_sock *geneve_sock_add(struct net *net, __be16 port,
>>       if (no_share)   /* Return error if sharing is not allowed. */
>>               return ERR_PTR(-EINVAL);
>>
>> +     spin_lock(&gn->sock_lock);
>>       gs = geneve_find_sock(net, port);
>
> Perhaps remove the _rcu of the iterator in the geneve_find_sock?
> Also, the kfree_rcu() seems no longer needed as all read accesses
> are protected by the spinlock.
>
>> -     if (gs) {
>> -             if (gs->rcv == rcv)
>> -                     atomic_inc(&gs->refcnt);
>> -             else
>> +     if (gs && ((gs->rcv != rcv) ||
>> +                !atomic_add_unless(&gs->refcnt, 1, 0)))
>>                       gs = ERR_PTR(-EBUSY);
>
> Since you are taking gn->sock_lock in geneve_sock_release()
> anyway, all accesses to refcnt could eventually be converted
> to non-atomic ops.

I generally agree (with the exception of kfree_rcu() - I believe that
is still needed since incoming packets reference it using RCU).
However, since this patch is targeted at net, I wanted to make a
minimal change and not completely redo the locking. A lot of the
locking here was pulled over from VXLAN and I think it can be
simplified since I don't expect that the Geneve code will bring in all
of that logic.

The one part that is not entirely clear is the workqueue in VXLAN used
for destroying the socket. This was added by Stephen in "vxlan: listen
on multiple ports" but it's not obvious to me what problem it is
trying to avoid and I don't see a comment. If possible, it would be
nice to simplify this as well if the issue doesn't apply to Geneve.
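
For reference, the atomic_add_unless(&gs->refcnt, 1, 0) call in the quoted
hunk takes a reference only if the count has not already dropped to zero.
A userspace C11 sketch of that idea (demo_ref_get_unless_zero() is a
hypothetical name, and this is an analogue rather than the kernel
implementation):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Increment *refcnt unless it is already zero.  Returns true if a
 * reference was taken, false if the object is already on its way out.
 */
static bool demo_ref_get_unless_zero(atomic_int *refcnt)
{
	int old;

	assert(refcnt != NULL);
	old = atomic_load(refcnt);
	while (old != 0) {
		/* On failure 'old' is refreshed with the current value,
		 * so the loop re-checks for zero before retrying.
		 */
		if (atomic_compare_exchange_weak(refcnt, &old, old + 1))
			return true;
	}
	return false;
}
```

A caller that gets false back would treat the socket as gone (the -EBUSY
path in the quoted code) instead of resurrecting an object whose last
reference has already been dropped.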

^ permalink raw reply

* [PATCH] net: unisys: adding unisys virtnic driver
From: Erik Arfvidson @ 2014-12-17 18:52 UTC (permalink / raw)
  To: benjamin.romer, netdev, dzickus, davem, Bruce.Vessey,
	sparmaintainer, prarit
  Cc: Erik Arfvidson

The purpose of this patch is to add Unisys virtual network driver
into the network directory and also to start a discussion about
the requirements needed.

Signed-off-by: Erik Arfvidson <earfvids@redhat.com>
---
 drivers/net/virtnic.c | 2475 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 2475 insertions(+)
 create mode 100644 drivers/net/virtnic.c

diff --git a/drivers/net/virtnic.c b/drivers/net/virtnic.c
new file mode 100644
index 0000000..0af48f3
--- /dev/null
+++ b/drivers/net/virtnic.c
@@ -0,0 +1,2475 @@
+/* virtnic.c
+ *
+ * Copyright © 2010 - 2014 UNISYS CORPORATION
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT.  See the GNU General Public License for more
+ * details.
+ */
+
+#define EXPORT_SYMTAB
+
+#include <linux/kernel.h>
+#ifdef CONFIG_MODVERSIONS
+#include <config/modversions.h>
+#endif
+
+#include "uniklog.h"
+#include "diagnostics/appos_subsystems.h"
+#include "uisutils.h"
+#include "uisthread.h"
+#include "uisqueue.h"
+#include "visorchipset.h"
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include <linux/device.h>
+#include <linux/slab.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/string.h>
+#include <linux/tcp.h>
+#include <linux/ip.h>
+#include <linux/types.h>
+#include <linux/uuid.h>
+#include <linux/debugfs.h>
+
+#include "virtpci.h"
+#include "version.h"
+
+/* this is shorter than using __FILE__ (full path name) in */
+/* debug/info/error messages */
+#define __MYFILE__ "virtnic.c"
+
+/* turn off collecting of debug statistics */
+#define VIRTNIC_STATS 0
+
+ /* MAX_BUF = 64 lines x 32 MAXVHBA x 80 characters
+ *         = 163840 bytes ~ 40 pages
+ */
+#define MAX_BUF 163840
+
+/*
+ * uisnic                   virtnic
+ *         <---- xmit ---  virtnic_xmit(hard-start-xmit)
+ *         <-- rcvpost --  open, virtnic_rx
+ *	   <-- unpost ---  close
+ *	   <-- enb/dis --  open, close
+ *
+ * open & close can't run at the same time as each other or rcv/xmit, but
+ * virtnic_xmit and virtnic_rx could be running at the same time.
+ * and all messages being sent to uisnic MUST be sent so if the queue is
+ * full we have to retry, but we don't want to retry with a spinlock held.
+ */
+
+/*****************************************************/
+/* Forward declarations                              */
+/*****************************************************/
+static int virtnic_probe(struct virtpci_dev *dev,
+			 const struct pci_device_id *id);
+static void virtnic_remove(struct virtpci_dev *dev);
+static int virtnic_change_mtu(struct net_device *netdev, int new_mtu);
+static int virtnic_close(struct net_device *netdev);
+static struct net_device_stats *virtnic_get_stats(struct net_device *netdev);
+static int virtnic_open(struct net_device *netdev);
+static int virtnic_ioctl(struct net_device *netdev, struct ifreq *ifr,
+			 int cmd);
+static void virtnic_rx(struct uiscmdrsp *cmdrsp);
+static int virtnic_xmit(struct sk_buff *skb, struct net_device *netdev);
+static void virtnic_xmit_timeout(struct net_device *netdev);
+static void virtnic_set_multi(struct net_device *netdev);
+static int virtnic_serverdown(struct virtpci_dev *virtpcidev, u32 state);
+static int virtnic_serverup(struct virtpci_dev *virtpcidev);
+static void virtnic_serverdown_complete(struct work_struct *work);
+static void virtnic_timeout_reset(struct work_struct *work);
+static int process_incoming_rsps(void *);
+static ssize_t info_debugfs_read(struct file *file, char __user *buf,
+				 size_t len, loff_t *offset);
+static ssize_t enable_ints_write(struct file *file,
+				 const char __user *buffer,
+				 size_t count, loff_t *ppos);
+
+/*****************************************************/
+/* Globals                                           */
+/*****************************************************/
+
+#define VIRTNIC_XMIT_TIMEOUT (5 * HZ)	/* Default timeout period in jiffies */
+#define VIRTNIC_INFINITE_RESPONSE_WAIT 0
+#define INTERRUPT_VECTOR_MASK 0x3F
+
+static struct workqueue_struct *virtnic_serverdown_workqueue;
+static struct workqueue_struct *virtnic_timeout_reset_workqueue;
+
+static const struct pci_device_id virtnic_id_table[] = {
+	{
+	PCI_DEVICE(PCI_VENDOR_ID_UNISYS, PCI_DEVICE_ID_VIRTNIC)}, {
+0},};
+/* export virtnic_id_table */
+MODULE_DEVICE_TABLE(pci, virtnic_id_table);
+
+static struct virtpci_driver virtnic_driver = {
+	.name = "uisvirtnic",
+	.version = VERSION,
+	.vertag = NULL,
+	.id_table = virtnic_id_table,
+	.probe = virtnic_probe,
+	.remove = virtnic_remove,
+	.suspend = virtnic_serverdown,
+	.resume = virtnic_serverup
+};
+
+#define SEND_ENBDIS(ndev, state, cmdrsp, queue, insertlock, stats) { \
+	DBGINF("sending rcv enb/dis netdev:%p state:%d\n", ndev, state); \
+	cmdrsp->net.enbdis.enable = state; \
+	cmdrsp->net.enbdis.context = ndev; \
+	cmdrsp->net.type = NET_RCV_ENBDIS; \
+	cmdrsp->cmdtype = CMD_NET_TYPE; \
+	uisqueue_put_cmdrsp_with_lock_client(queue, cmdrsp, IOCHAN_TO_IOPART, \
+					     (void *)insertlock, \
+					     DONT_ISSUE_INTERRUPT, \
+					     (uint64_t)NULL, \
+					     OK_TO_WAIT, "vnic"); \
+	stats.sent_enbdis++;\
+}
+
+struct chanstat {
+	unsigned long got_rcv;	/* count of NET_RCV received */
+	unsigned long got_enbdisack;	/* count of NET_RCV_ENBDIS_ACK rcvd */
+	unsigned long got_xmit_done;	/* count of NET_XMIT_DONE received */
+	unsigned long xmit_fail;	/* count of NET_XMIT_DONE failures */
+	unsigned long sent_enbdis;	/* count of NET_RCV_ENBDIS sent */
+	unsigned long sent_promisc;	/* count of NET_RCV_PROMISC sent */
+	unsigned long sent_post;	/* count of NET_RCV_POST sent */
+	unsigned long sent_xmit;	/* count of NET_XMIT sent */
+	unsigned long reject_count;	/* count of NET_XMIT rejected because */
+	/* of BUSY/queue full */
+	unsigned long extra_rcvbufs_sent;
+#if VIRTNIC_STATS
+	unsigned long reject_jiffies_start;	/* jiffie count at start of
+						   NET_XMIT rejects */
+#endif /* VIRTNIC_STATS */
+};
+
+struct datachan {
+	struct chaninfo chinfo;
+	struct chanstat chstat;
+};
+
+struct virtnic_info {
+	struct virtpci_dev *virtpcidev;
+	struct net_device *netdev;
+	struct net_device_stats net_stats;
+	spinlock_t priv_lock; /* spinlock check for private lock */
+	struct datachan datachan;
+	struct sk_buff **rcvbuf;	/* rcvbuf is the array of rcv buffer */
+	/* we post to */
+	unsigned long long uniquenum;
+
+	/* the IOPART end */
+	int num_rcv_bufs;	/* indicates how many receive buffers the
+				   vnic will post */
+	int num_rcv_bufs_could_not_alloc;
+	atomic_t num_rcv_bufs_in_iovm;	/* indicates how many receive buffers
+					   have actually been sent to the iovm */
+	unsigned long inner_loop_limit_reached_cnt;
+	unsigned long alloc_failed_in_if_needed_cnt;
+	unsigned long alloc_failed_in_repost_return_cnt;
+
+	struct sk_buff_head xmitbufhead;	/* xmitbufhead is the head of
+						   the  xmit buffer list that
+						   have been sent to the IOPART
+						   end */
+	int max_outstanding_net_xmits;	/* absolute max number of outstanding
+					   xmits - should never hit this */
+	int upper_threshold_net_xmits;	/* high water mark for calling
+					   netif_stop_queue() */
+	int lower_threshold_net_xmits;	/* low water mark for calling
+					   netif_wake_queue() */
+	uuid_le zoneguid;		/* specifies the zone for the switch in
+					   which this VNIC resides  */
+	struct uiscmdrsp *cmdrsp_rcv;	/* cmdrsp_rcv is used for
+					   posting/unposting rcv buffers */
+	unsigned short enabled;	/* 0 disabled 1 enabled to receive */
+	unsigned short enab_dis_acked;	/* NET_RCV_ENABLE/DISABLE acked by
+					   uisnic */
+	atomic_t usage;			/* count of users */
+	unsigned short old_flags;	/* flags as they were prior to
+					   set_multicast_list */
+	struct uiscmdrsp *xmit_cmdrsp;	/* used to issue NET_XMIT -  there is
+					   never more than one xmit in progress
+					   at a time */
+	struct dentry *eth_debugfs_dir;	/* this points to /proc/eth?
+						   directory */
+	struct dentry *zone_debugfs_entry;	/* this points to
+						   /proc/virtnic/eth?/zone */
+	/* file */
+	struct dentry *clientstr_debugfs_entry;/* this points to
+						  /proc/virtnic/eth?/clientstr
+						  file  */
+	struct irq_info intr;	/* use recvInterrupt info  to connect
+					   to this to receive interrupts when
+					   IOs complete */
+	int interrupt_vector;
+	int thread_wait_ms;
+	int queuefullmsg_logged;	/* flag for throttling queue full */
+	/* messages */
+	/* some debug counters */
+	ulong n_rcv0;			/* # rcvs of 0 buffers */
+	ulong n_rcv1;			/* # rcvs of 1 buffer */
+	ulong n_rcv2;			/* # rcvs of 2 buffers */
+	ulong n_rcvx;			/* # rcvs of >2 buffers */
+	ulong found_repost_rcvbuf_cnt;	/* #time we called repost_rcvbuf_cnt */
+	ulong repost_found_skb_cnt;	/* # times found the skb */
+	ulong n_repost_deficit;		/* # times we couldn't find all of the
+					   rcv buffers */
+	ulong bad_rcv_buf;		/* # times we neglected to
+					     free the rcv skb because
+					     we didn't know where it
+					     came from */
+	ulong n_rcv_packet_not_accepted;	/* # bogus recv packets */
+	bool server_down;
+	bool server_change_state;
+	unsigned long long interrupts_rcvd;
+	unsigned long long interrupts_notme;
+	unsigned long long interrupts_disabled;
+	unsigned long long busy_cnt;
+	unsigned long long flow_control_upper_hits;
+	unsigned long long flow_control_lower_hits;
+	struct work_struct serverdown_completion;
+	struct work_struct timeout_reset;
+	uint64_t __iomem *flags_addr;
+	atomic_t interrupt_rcvd;
+	wait_queue_head_t rsp_queue;
+};
+
+struct virtnic_devices_open {
+	struct net_device *netdev;
+	struct virtnic_info *vnicinfo;
+};
+
+static ssize_t show_zone(struct device *dev, struct device_attribute *attr,
+			 char *buf)
+{
+	struct net_device *net = to_net_dev(dev);
+	struct virtnic_info *vnicinfo = netdev_priv(net);
+
+	return scnprintf(buf, PAGE_SIZE, "%pUL\n", &vnicinfo->zoneguid);
+}
+
+static ssize_t show_clientstr(struct device *dev, struct device_attribute *attr,
+			      char *buf)
+{
+	struct net_device *net = to_net_dev(dev);
+	struct virtnic_info *vnicinfo = netdev_priv(net);
+	struct spar_io_channel_protocol *chan =
+		(struct spar_io_channel_protocol *)vnicinfo->
+		datachan.chinfo.queueinfo->chan;
+
+	return scnprintf(buf, PAGE_SIZE, "%s\n",
+			(char *)&chan->client_string);
+}
+static DEVICE_ATTR(clientstr, S_IRUGO, show_clientstr, NULL);
+static DEVICE_ATTR(zone, S_IRUGO, show_zone, NULL);
+
+#define VIRTNICSOPENMAX 32
+/* array of open devices maintained by open() and close() */
+static struct virtnic_devices_open num_virtnic_open[VIRTNICSOPENMAX];
+static struct dentry *virtnic_debugfs_dir;
+
+static const struct file_operations debugfs_info_fops = {
+	.read = info_debugfs_read,
+};
+
+static const struct file_operations debugfs_enable_ints_fops = {
+	.write = enable_ints_write,
+};
+
+/*****************************************************/
+/* Probe Remove Functions                            */
+/*****************************************************/
+/* set up net.rcvpost struct in cmdrsp.
+ * all rcv buf skb are allocated at RCVPOST_BUF_SIZE, so length is
+ * RCVPOST_BUF_SIZE by default. and since RCVPOST_BUF_SIZE < 2048, one
+ * phys_info struct can describe the rcv buf.
+ */
+static inline void
+post_skb(struct uiscmdrsp *cmdrsp,
+	 struct virtnic_info *vnicinfo, struct sk_buff *skb)
+{
+	cmdrsp->net.buf = skb;
+	cmdrsp->net.rcvpost.frag.pi_pfn = page_to_pfn(virt_to_page(skb->data));
+	cmdrsp->net.rcvpost.frag.pi_off =
+		(unsigned long)skb->data & PI_PAGE_MASK;
+	cmdrsp->net.rcvpost.frag.pi_len = skb->len;
+	cmdrsp->net.rcvpost.unique_num = vnicinfo->uniquenum;
+
+	DBGINF("RCV_POST skb:%p pfn:%llu off:%x len:%d\n", skb,
+	       cmdrsp->net.rcvpost.frag.pi_pfn,
+	       cmdrsp->net.rcvpost.frag.pi_off,
+	       cmdrsp->net.rcvpost.frag.pi_len);
+	if ((cmdrsp->net.rcvpost.frag.pi_off + skb->len) > PI_PAGE_SIZE) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "**** pi_off:0x%x pi_len:%d SPAN ACROSS A PAGE\n",
+			   cmdrsp->net.rcvpost.frag.pi_off, skb->len);
+	} else {
+		cmdrsp->net.type = NET_RCV_POST;
+		cmdrsp->cmdtype = CMD_NET_TYPE;
+		uisqueue_put_cmdrsp_with_lock_client(vnicinfo->datachan.chinfo.
+						     queueinfo, cmdrsp,
+						     IOCHAN_TO_IOPART,
+						     (void *)&vnicinfo->
+						     datachan.chinfo.insertlock,
+						     DONT_ISSUE_INTERRUPT,
+						     (uint64_t)NULL,
+						     OK_TO_WAIT,
+						     "vnic");
+		atomic_inc(&vnicinfo->num_rcv_bufs_in_iovm);
+		vnicinfo->datachan.chstat.sent_post++;
+	}
+}
+
+static irqreturn_t
+virtnic_ISR(int irq, void *dev_id)
+{
+	struct virtnic_info *vnicinfo = (struct virtnic_info *)dev_id;
+
+	struct channel_header __iomem *p_channel_header;
+
+	struct signal_queue_header __iomem *pqhdr;
+	uint64_t mask;
+	unsigned long long rc1;
+
+	if (vnicinfo == NULL)
+		return IRQ_NONE;
+	vnicinfo->interrupts_rcvd++;
+	p_channel_header = vnicinfo->datachan.chinfo.queueinfo->chan;
+	if (((readq(&p_channel_header->features) &
+	      ULTRA_IO_IOVM_IS_OK_WITH_DRIVER_DISABLING_INTS) != 0) &&
+	    ((readq(&p_channel_header->features) &
+	      ULTRA_IO_DRIVER_DISABLES_INTS) != 0)) {
+		/*
+		 * should not enter this path because we setup without
+		 * DRIVER_DISABLES_INTS.
+		 */
+		vnicinfo->interrupts_disabled++;
+		mask = ~ULTRA_CHANNEL_ENABLE_INTS;
+		rc1 = uisqueue_interlocked_and(vnicinfo->flags_addr, mask);
+	}
+	if (spar_signalqueue_empty(p_channel_header, IOCHAN_FROM_IOPART)) {
+		vnicinfo->interrupts_notme++;
+		return IRQ_NONE;
+	}
+	pqhdr = (struct signal_queue_header __iomem *)
+		((char __iomem *)p_channel_header +
+		 readq(&p_channel_header->ch_space_offset)) +
+		IOCHAN_FROM_IOPART;
+	writeq(readq(&pqhdr->num_irq_received) + 1,
+	       &pqhdr->num_irq_received);
+	atomic_set(&vnicinfo->interrupt_rcvd, 1);
+	wake_up_interruptible(&vnicinfo->rsp_queue);
+	return IRQ_HANDLED;
+}
+
+static const struct net_device_ops virtnic_dev_ops = {
+	.ndo_open = virtnic_open,
+	.ndo_stop = virtnic_close,
+	.ndo_start_xmit = virtnic_xmit,
+	.ndo_get_stats = virtnic_get_stats,
+	.ndo_do_ioctl = virtnic_ioctl,
+	.ndo_change_mtu = virtnic_change_mtu,
+	.ndo_tx_timeout = virtnic_xmit_timeout,
+	.ndo_set_rx_mode = virtnic_set_multi,
+};
+
+static int
+virtnic_probe(struct virtpci_dev *virtpcidev, const struct pci_device_id *id)
+{
+	struct net_device *netdev = NULL;
+	struct virtnic_info *vnicinfo;
+	int err;
+	int rsp;
+	irq_handler_t handler = virtnic_ISR;
+	struct channel_header __iomem *p_channel_header;
+	struct signal_queue_header __iomem *pqhdr;
+	uint64_t mask;
+
+#define RETFAIL(res) {\
+		kfree(vnicinfo->cmdrsp_rcv);  \
+		kfree(vnicinfo->xmit_cmdrsp); \
+		kfree(vnicinfo->rcvbuf);      \
+		if (vnicinfo->interrupt_vector != -1)		\
+			free_irq(vnicinfo->interrupt_vector, vnicinfo); \
+		if (netdev)						\
+			free_netdev(netdev);				\
+		return res;						\
+}
+
+	DBGINF("virtpci_dev:%p\n", virtpcidev);
+	DBGINF("virtpcidev busNo<<%d>>devNo<<%d>>",
+	       virtpcidev->busNo, virtpcidev->deviceNo);
+	netdev = alloc_etherdev(sizeof(struct virtnic_info));
+	if (netdev == NULL) {
+		LOGERR("**** FAILED to alloc etherdev\n");
+		return -ENOMEM;
+	}
+	netdev->netdev_ops = &virtnic_dev_ops;
+	netdev->watchdog_timeo = VIRTNIC_XMIT_TIMEOUT;
+
+	memcpy(netdev->dev_addr, virtpcidev->net.mac_addr, MAX_MACADDR_LEN);
+	netdev->addr_len = MAX_MACADDR_LEN;
+	/* netdev->name should be ethx already */
+	netdev->dev.parent = &virtpcidev->generic_dev;
+
+	/* setup our private struct */
+	vnicinfo = netdev_priv(netdev);
+	memset(vnicinfo, 0, sizeof(struct virtnic_info));
+	vnicinfo->interrupt_vector = -1;
+	vnicinfo->netdev = netdev;
+	vnicinfo->virtpcidev = virtpcidev;
+	init_waitqueue_head(&vnicinfo->rsp_queue);
+	spin_lock_init(&vnicinfo->priv_lock);
+	vnicinfo->datachan.chinfo.queueinfo = &virtpcidev->queueinfo;
+	spin_lock_init(&vnicinfo->datachan.chinfo.insertlock);
+	vnicinfo->enabled = 0;	/* not yet */
+	atomic_set(&vnicinfo->usage, 1);	/* starting val */
+	vnicinfo->zoneguid = virtpcidev->net.zone_uuid;
+	vnicinfo->num_rcv_bufs = virtpcidev->net.num_rcv_bufs;
+	LOGINFNAME(vnicinfo->netdev, "num_rcv_bufs =  %d\n",
+		   vnicinfo->num_rcv_bufs);
+	vnicinfo->rcvbuf = kmalloc(sizeof(struct sk_buff *) *
+				   vnicinfo->num_rcv_bufs, GFP_ATOMIC);
+	if (vnicinfo->rcvbuf == NULL) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "**** FAILED to allocate memory for %d receive buffers.\n",
+			   vnicinfo->num_rcv_bufs);
+		RETFAIL(-ENOMEM);
+	}
+	memset(vnicinfo->rcvbuf, 0,
+	       sizeof(struct sk_buff *) * vnicinfo->num_rcv_bufs);
+	/* set the net_xmit outstanding threshold */
+	vnicinfo->max_outstanding_net_xmits =
+	    max(3, ((vnicinfo->num_rcv_bufs / 3) - 2));
+	/* always leave two slots open but you should have 3 at a minimum */
+	LOGINFNAME(vnicinfo->netdev, "max_outstanding_net_xmits =  %d\n",
+		   vnicinfo->max_outstanding_net_xmits);
+	vnicinfo->upper_threshold_net_xmits =
+	    max(2, vnicinfo->max_outstanding_net_xmits - 1);
+	LOGINFNAME(vnicinfo->netdev, "upper_threshold_net_xmits =  %d\n",
+		   vnicinfo->upper_threshold_net_xmits);
+	vnicinfo->lower_threshold_net_xmits =
+	    max(1, vnicinfo->max_outstanding_net_xmits / 2);
+	LOGINFNAME(vnicinfo->netdev, "lower_threshold_net_xmits =  %d\n",
+		   vnicinfo->lower_threshold_net_xmits);
+	skb_queue_head_init(&vnicinfo->xmitbufhead);
+
+	/* create a cmdrsp we can use to post and unpost rcv buffers  */
+	vnicinfo->cmdrsp_rcv = kmalloc(SIZEOF_CMDRSP, GFP_ATOMIC);
+	if (vnicinfo->cmdrsp_rcv == NULL) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "**** FAILED to allocate cmdrsp to use for posting rcv buffers\n");
+		RETFAIL(-ENOMEM);
+	}
+	vnicinfo->xmit_cmdrsp = kmalloc(SIZEOF_CMDRSP, GFP_ATOMIC);
+	if (vnicinfo->xmit_cmdrsp == NULL) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "**** FAILED to allocate cmdrsp to use for xmits\n");
+		RETFAIL(-ENOMEM);
+	}
+	INIT_WORK(&vnicinfo->serverdown_completion,
+		  virtnic_serverdown_complete);
+	INIT_WORK(&vnicinfo->timeout_reset, virtnic_timeout_reset);
+	vnicinfo->server_down = false;
+	vnicinfo->server_change_state = false;
+
+	/* set the default mtu */
+	netdev->mtu = virtpcidev->net.mtu;
+
+	vnicinfo->intr = virtpcidev->intr;
+	/* buffers will be allocated in open using mtu */
+
+	/* save off netdev in virtpcidev  */
+	virtpcidev->net.netdev = netdev;
+
+	/* start thread that will receive responses */
+	writeq(readq(&vnicinfo->datachan.chinfo.queueinfo->chan->features) |
+	       ULTRA_IO_CHANNEL_IS_POLLING,
+	       &vnicinfo->datachan.chinfo.queueinfo->chan->features);
+	DBGINF("starting rsp thread queueinfo:%p threadinfo:%p\n",
+	       vnicinfo->datachan.chinfo.queueinfo,
+	       &vnicinfo->datachan.chinfo.threadinfo);
+	p_channel_header = vnicinfo->datachan.chinfo.queueinfo->chan;
+	pqhdr = (struct signal_queue_header __iomem *)
+		((char __iomem *)p_channel_header +
+		 readq(&p_channel_header->ch_space_offset)) +
+	    IOCHAN_FROM_IOPART;
+	vnicinfo->flags_addr = (__force uint64_t __iomem *)&pqhdr->features;
+	vnicinfo->thread_wait_ms = 2;
+	if (!uisthread_start(&vnicinfo->datachan.chinfo.threadinfo,
+			     process_incoming_rsps, &vnicinfo->datachan,
+			     "vnic_incoming")) {
+		LOGERRNAME(vnicinfo->netdev, "**** FAILED to start thread\n");
+		RETFAIL(-ENODEV);
+	}
+
+	/* register_netdev */
+	LOGINFNAME(vnicinfo->netdev, "sendInterruptHandle=0x%16llX",
+		   (unsigned long long)vnicinfo->intr.send_irq_handle);
+	LOGINFNAME(vnicinfo->netdev, "recvInterruptHandle=0x%16llX",
+		   (unsigned long long)vnicinfo->intr.recv_irq_handle);
+	LOGINFNAME(vnicinfo->netdev, "recvInterruptVector=0x%8X",
+		   vnicinfo->intr.recv_irq_vector);
+	LOGINFNAME(vnicinfo->netdev, "recvInterruptShared=0x%2X",
+		   vnicinfo->intr.recv_irq_shared);
+	LOGINFNAME(vnicinfo->netdev, "netdev->name=%s", netdev->name);
+	vnicinfo->interrupt_vector = vnicinfo->intr.recv_irq_handle &
+	    INTERRUPT_VECTOR_MASK;
+	netdev->irq = vnicinfo->interrupt_vector;
+	err = register_netdev(netdev);
+	if (err) {
+		uisthread_stop(&vnicinfo->datachan.chinfo.threadinfo);
+		RETFAIL(err);
+	}
+
+	/* create debugfs ethx directory */
+	vnicinfo->eth_debugfs_dir = debugfs_create_dir(netdev->name,
+						       virtnic_debugfs_dir);
+	if (!vnicinfo->eth_debugfs_dir) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "****FAILED to create debugfs dir entry:%s\n",
+			   netdev->name);
+		uisthread_stop(&vnicinfo->datachan.chinfo.threadinfo);
+		RETFAIL(-ENODEV);
+	}
+
+	if (device_create_file(&netdev->dev, &dev_attr_zone) < 0) {
+		uisthread_stop(&vnicinfo->datachan.chinfo.threadinfo);
+		RETFAIL(-ENODEV);
+	}
+	if (device_create_file(&netdev->dev, &dev_attr_clientstr) < 0) {
+		device_remove_file(&netdev->dev, &dev_attr_zone);
+		uisthread_stop(&vnicinfo->datachan.chinfo.threadinfo);
+		RETFAIL(-ENODEV);
+	}
+	/* request the irq; if that fails, keep polling the channel */
+	rsp = request_irq(vnicinfo->interrupt_vector, handler, IRQF_SHARED,
+			  netdev->name, vnicinfo);
+	if (rsp != 0) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "request_irq(%d) uislib_vnic_ISR request failed with rsp=%d\n",
+			   vnicinfo->interrupt_vector, rsp);
+		vnicinfo->interrupt_vector = -1;
+	} else {
+		uint64_t __iomem *features_addr =
+		    &vnicinfo->datachan.chinfo.queueinfo->chan->features;
+		LOGINFNAME(vnicinfo->netdev,
+			   "request_irq(%d) uislib_vnic_ISR request succeeded\n",
+			   vnicinfo->interrupt_vector);
+		mask = ~(ULTRA_IO_CHANNEL_IS_POLLING |
+			 ULTRA_IO_DRIVER_DISABLES_INTS |
+			 ULTRA_IO_DRIVER_SUPPORTS_ENHANCED_RCVBUF_CHECKING);
+		uisqueue_interlocked_and(features_addr, mask);
+		mask = ULTRA_IO_DRIVER_ENABLES_INTS |
+		    ULTRA_IO_DRIVER_SUPPORTS_ENHANCED_RCVBUF_CHECKING;
+		uisqueue_interlocked_or(features_addr, mask);
+
+		vnicinfo->thread_wait_ms = 2000;
+	}
+
+	LOGINFNAME(vnicinfo->netdev,
+		   "Added VirtNic:%p %s insertlock:%p %02x:%02x:%02x:%02x:%02x:%02x\n",
+		   netdev, netdev->name, &vnicinfo->datachan.chinfo.insertlock,
+		   netdev->dev_addr[0], netdev->dev_addr[1],
+		   netdev->dev_addr[2], netdev->dev_addr[3],
+		   netdev->dev_addr[4], netdev->dev_addr[5]);
+	return 0;
+}
+
+static void
+virtnic_remove(struct virtpci_dev *virtpcidev)
+{
+	struct net_device *netdev = virtpcidev->net.netdev;
+	struct virtnic_info *vnicinfo;
+
+	vnicinfo = netdev_priv(netdev);
+
+	LOGINFNAME(vnicinfo->netdev,
+		   "virtpcidev:%p netdev:%p name:%s vnicinfo:%p\n",
+		   virtpcidev, netdev, netdev->name, vnicinfo);
+	LOGINFNAME(vnicinfo->netdev,
+		   "virtpcidev busNo<<%d>>devNo<<%d>>",
+		   virtpcidev->bus_no, virtpcidev->device_no);
+	/* REMOVE netdev */
+	DBGINF("unregistering netdev\n");
+	if (vnicinfo->interrupt_vector != -1)
+		free_irq(vnicinfo->interrupt_vector, vnicinfo);
+	unregister_netdev(netdev);
+	/* this is going to call virtnic_close, which will send out a */
+	/* disable; don't take the thread down until after that */
+	uisthread_stop(&vnicinfo->datachan.chinfo.threadinfo);
+
+	/* freeing of rcv bufs should have happened in close. */
+	/* free cmdrsp we allocated for rcv post/unpost */
+	kfree(vnicinfo->cmdrsp_rcv);
+	kfree(vnicinfo->xmit_cmdrsp);
+
+	/* delete sysfs attribute files */
+	device_remove_file(&netdev->dev, &dev_attr_zone);
+	device_remove_file(&netdev->dev, &dev_attr_clientstr);
+
+	debugfs_remove(vnicinfo->eth_debugfs_dir);
+	LOGINFNAME(vnicinfo->netdev, "removed dentry %s\n",
+		   netdev->name);
+
+	kfree(vnicinfo->rcvbuf);
+	free_netdev(netdev);
+
+	LOGINF("virtnic removed\n");
+}
+
+/*****************************************************/
+/* NIC statistics handling                           */
+/*****************************************************/
+
+/* update rcv stats - locking done by invoker */
+#define UPD_RCV_STATS do { \
+	vnicinfo->net_stats.rx_packets++;  \
+	vnicinfo->net_stats.rx_bytes += skb->len;  \
+} while (0)
+
+/* update xmt stats - locking done by invoker */
+#define UPD_XMT_STATS do { \
+	vnicinfo->net_stats.tx_packets++;  \
+	vnicinfo->net_stats.tx_bytes += skb->len;  \
+} while (0)
+
+static struct net_device_stats *
+virtnic_get_stats(struct net_device *netdev)
+{
+	struct virtnic_info *vnicinfo = netdev_priv(netdev);
+
+	/* take this opportunity to print out our internal stats */
+	DBGINF
+	    ("NET_RCV_ENBDIS sent: %ld     NET_RCV_ENBDIS_ACK received: %ld\n",
+	     vnicinfo->datachan.chstat.sent_enbdis,
+	     vnicinfo->datachan.chstat.got_enbdisack);
+
+	DBGINF("NET_RCV received: %ld        NET_RCV_POST sent: %ld\n",
+	       vnicinfo->datachan.chstat.got_rcv,
+	       vnicinfo->datachan.chstat.sent_post);
+
+	DBGINF("extra NET_RCV_POST sent: %ld\n",
+	       vnicinfo->datachan.chstat.extra_rcvbufs_sent);
+
+	DBGINF("NET_XMIT sent: %ld           NET_XMIT_DONE received: %ld\n",
+	       vnicinfo->datachan.chstat.sent_xmit,
+	       vnicinfo->datachan.chstat.got_xmit_done);
+
+	DBGINF("XMIT failures: %ld           NET_RCV_PROMISC sent: %ld\n",
+	       vnicinfo->datachan.chstat.xmit_fail,
+	       vnicinfo->datachan.chstat.sent_promisc);
+
+	DBGINF("XMIT reject/busy: %ld\n",
+	       vnicinfo->datachan.chstat.reject_count);
+
+	return &vnicinfo->net_stats;
+}
+
+/*****************************************************/
+/* Local functions                                   */
+/*****************************************************/
+
+/*
+ * This function allocates skb, skb->data for first fragment. If Mtu
+ * size is > default, it allocates frags.
+ */
+static struct sk_buff *
+alloc_rcv_buf(struct net_device *netdev)
+{
+	struct sk_buff *skb;
+
+/*
+ * NOTE: the first fragment in each rcv buffer is pointed to by rcvskb->data.
+ * For now all rcv buffers will be RCVPOST_BUF_SIZE in length, so the firstfrag
+ * is large enough to hold 1514.
+ */
+	DBGINF("netdev->name <<%s>>:  allocating skb len:%d\n", netdev->name,
+	       RCVPOST_BUF_SIZE);
+	skb = alloc_skb(RCVPOST_BUF_SIZE, GFP_ATOMIC | __GFP_NOWARN);
+	if (!skb) {
+		LOGVER("**** alloc_skb failed\n");
+		return NULL;
+	}
+	skb->dev = netdev;
+	skb->len = RCVPOST_BUF_SIZE;
+	/* current value of mtu doesn't come into play here; large
+	 * packets will just end up using multiple rcv buffers all of
+	 * same size
+	 */
+	skb->data_len = 0;	/* alloc_skb already zeroes it out;
+				   set here for clarity. */
+	return skb;
+}
+
+static int
+init_rcv_bufs(struct net_device *netdev, struct virtnic_info *vnicinfo)
+{
+	int i, count;
+
+	DBGINF("netdev->name <<%s>>", netdev->name);
+	/*
+	 * allocate fixed number of receive buffers to post to uisnic
+	 * post receive buffers after we've allocated a required
+	 * amount
+	 */
+	for (i = 0; i < vnicinfo->num_rcv_bufs; i++) {
+		vnicinfo->rcvbuf[i] = alloc_rcv_buf(netdev);
+		if (!vnicinfo->rcvbuf[i])
+			break;	/* if we failed to allocate one let us stop */
+	}
+	if (i < vnicinfo->num_rcv_bufs) {
+		LOGWRNNAME(vnicinfo->netdev,
+			   "only allocated %d of %d receive buffers", i,
+			   vnicinfo->num_rcv_bufs);
+		if (i == 0) {
+			/* couldn't even allocate one - bail out */
+			LOGERRNAME(vnicinfo->netdev,
+				   "**** FAILED to allocate any rcv buffers\n");
+			return -ENOMEM;
+		}
+	}
+	count = i;
+	/* Ensure we can alloc 2/3rd of the requested number of
+	 * buffers. 2/3 is an arbitrary choice; used also in ndis
+	 * init.c.
+	 */
+	if (count < ((2 * vnicinfo->num_rcv_bufs) / 3)) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "**** FAILED to allocate enough rcv bufs; allocated only:%d MAX_NET_RCV_BUFS:%d\n",
+			   count, MAX_NET_RCV_BUFS);
+		/* free receive buffers we did allocate and then bail out */
+		for (i = 0; i < count; i++) {
+			kfree_skb(vnicinfo->rcvbuf[i]);
+			vnicinfo->rcvbuf[i] = NULL;
+		}
+		return -ENOMEM;
+	}
+
+	/* post receive buffers to receive incoming input - without holding */
+	/* lock - we've not enabled nor started the queue so there shouldn't */
+	/* be any rcv or xmit activity */
+	for (i = 0; i < count; i++)
+		post_skb(vnicinfo->cmdrsp_rcv, vnicinfo, vnicinfo->rcvbuf[i]);
+
+	/* push through with what buffers we've got - unallocated ones will */
+	/* be null */
+	LOGINFNAME(vnicinfo->netdev, "Allocated & posted %d rcv buffers\n",
+		   count);
+
+	return 0;
+}
+
+/* Sends disable to IOVM and frees receive buffers that were posted to
+ * IOVM (cleared by IOVM when disable is received)
+ * returns 0 on success, negative number on failure
+ *
+ * timeout is defined in msecs (timeout of 0 specifies infinite wait)
+ */
+static int
+virtnic_disable_with_timeout(struct net_device *netdev, const int timeout)
+{
+	struct virtnic_info *vnicinfo = netdev_priv(netdev);
+	int i, count = 0;
+	unsigned long flags;
+	int wait = 0;
+
+	LOGINFNAME(vnicinfo->netdev, "netdev->name <<%s>>", netdev->name);
+	/* stop the transmit queue so nothing more can be transmitted */
+	netif_stop_queue(netdev);
+
+	/* send a msg telling the other end we are stopping incoming pkts */
+	spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+	vnicinfo->enabled = 0;
+	vnicinfo->enab_dis_acked = 0;	/* must wait for ack */
+	spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+
+	/* send disable and wait for ack - don't hold lock when
+	 * sending disable because if the queue is full, insert might
+	 * sleep.
+	 */
+	SEND_ENBDIS(netdev, 0, vnicinfo->cmdrsp_rcv,
+		    vnicinfo->datachan.chinfo.queueinfo,
+		    &vnicinfo->datachan.chinfo.insertlock,
+		    vnicinfo->datachan.chstat);
+
+	LOGINFNAME(vnicinfo->netdev,
+		   "Waiting for ENBDIS ACK before freeing rcv buffers...\n");
+	/* wait for ack to arrive before we try to free rcv buffers
+	 * NOTE: the other end automatically unposts the rcv buffers
+	 * when it gets a disable.
+	 */
+	spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+	while ((timeout == VIRTNIC_INFINITE_RESPONSE_WAIT) ||
+	       (wait < timeout)) {
+		if (vnicinfo->enab_dis_acked)
+			break;	/* now we can continue with disable */
+		if (vnicinfo->server_down || vnicinfo->server_change_state) {
+			LOGERRNAME(vnicinfo->netdev,
+				   "IOVM is down so disable will not be acknowledged.  Stopping wait.\n");
+			spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+			return -1;
+		}
+		set_current_state(TASK_INTERRUPTIBLE);
+		spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+		schedule_timeout(msecs_to_jiffies(10));
+		wait += 10;	/* timeout is in msecs; slept up to 10 */
+		spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+	}
+	if (!vnicinfo->enab_dis_acked) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "IOVM did not respond to Disable in allotted time (%d msecs).\n",
+			   timeout);
+		spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+		return -1;
+	}
+	LOGINFNAME(vnicinfo->netdev,
+		   "Got ENBDIS ACK; now waiting for usage count to reach 1...\n");
+
+	/*
+	 * wait for usage to go to 1 (no other users) before freeing
+	 * rcv buffers
+	 */
+	if (atomic_read(&vnicinfo->usage) > 1) {
+		/* wait for usage count to be 1 */
+		while (1) {
+			set_current_state(TASK_INTERRUPTIBLE);
+			spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+			schedule_timeout(msecs_to_jiffies(10));
+			spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+			if (atomic_read(&vnicinfo->usage) == 1) {
+				break;	/* go do work and only after
+					   that give up lock */
+			}
+		}
+	}
+	/* we've set enabled to 0, so we can give up the lock. */
+	spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+	LOGINFNAME(vnicinfo->netdev,
+		   "Usage count is 1; freeing the rcv buffers now\n");
+
+	/* free rcv buffers - other end has automatically unposted
+	 * them on disable
+	 */
+	for (i = 0; i < vnicinfo->num_rcv_bufs; i++) {
+		if (vnicinfo->rcvbuf[i]) {
+			kfree_skb(vnicinfo->rcvbuf[i]);
+			vnicinfo->rcvbuf[i] = NULL;
+			count++;
+		}
+	}
+	LOGINFNAME(vnicinfo->netdev, "Freed %d rcv bufs\n", count);
+
+	/* remove references from debug array */
+	for (i = 0; i < VIRTNICSOPENMAX; i++) {
+		if (num_virtnic_open[i].netdev == netdev) {
+			num_virtnic_open[i].netdev = NULL;
+			num_virtnic_open[i].vnicinfo = NULL;
+			break;
+		}
+	}
+
+	return 0;
+}
+
+/* Wait indefinitely for IOVM to acknowledge disable request */
+static int
+virtnic_disable(struct net_device *netdev)
+{
+	return virtnic_disable_with_timeout(netdev,
+					    VIRTNIC_INFINITE_RESPONSE_WAIT);
+}
+
+/* Sends enable to IOVM, inits, and  posts receive buffers to IOVM
+ * returns 0 on success, negative number on failure
+ *
+ * timeout is defined in msecs (timeout of 0 specifies infinite wait)
+ */
+static int
+virtnic_enable_with_timeout(struct net_device *netdev, const int timeout)
+{
+	int i;
+	struct virtnic_info *vnicinfo = netdev_priv(netdev);
+	unsigned long flags;
+	int wait = 0;
+
+	/* NOTE: the other end automatically unposts the rcv buffers when
+	 * it gets a disable.
+	 */
+	i = init_rcv_bufs(netdev, vnicinfo);
+	if (i < 0)
+		return i;
+
+	spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+	vnicinfo->enabled = 1;
+	vnicinfo->enab_dis_acked = 0;	/* must wait for ack */
+	/* now we're ready, let's send an ENB to uisnic; until we get
+	 * an ACK back from uisnic, we'll drop the packets */
+	vnicinfo->n_rcv_packet_not_accepted = 0;
+	spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+
+	/* send enable and wait for ack - don't hold lock when sending
+	 * enable because if the queue is full, insert might sleep.
+	 */
+	SEND_ENBDIS(netdev, 1, vnicinfo->cmdrsp_rcv,
+		    vnicinfo->datachan.chinfo.queueinfo,
+		    &vnicinfo->datachan.chinfo.insertlock,
+		    vnicinfo->datachan.chstat);
+
+	LOGINFNAME(vnicinfo->netdev, "netdev->name <<%s>>", netdev->name);
+	LOGINFNAME(vnicinfo->netdev,
+		   "Waiting for ENBDIS ACK before starting device queue...\n");
+	spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+	while ((timeout == VIRTNIC_INFINITE_RESPONSE_WAIT) ||
+	       (wait < timeout)) {
+		if (vnicinfo->enab_dis_acked)
+			break;	/* now we can continue */
+		if (vnicinfo->server_down || vnicinfo->server_change_state) {
+			/* IOVM is going down so don't wait for a response */
+			LOGERRNAME(vnicinfo->netdev,
+				   "IOVM is down so enable will not be acknowledged.  Stopping wait.\n");
+			spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+			return -1;
+		}
+		set_current_state(TASK_INTERRUPTIBLE);
+		spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+		schedule_timeout(msecs_to_jiffies(10));
+		wait += 10;	/* timeout is in msecs; slept up to 10 */
+		spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+	}
+	if (!vnicinfo->enab_dis_acked) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "IOVM did not respond to Enable in allotted time (%d msecs).\n",
+			   timeout);
+		spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+		return -1;
+	}
+	spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+	LOGINFNAME(vnicinfo->netdev, "Got ENBDIS ACK\n");
+
+	/* find an open slot in the array to save off VirtNic
+	 * references for debug
+	 */
+	for (i = 0; i < VIRTNICSOPENMAX; i++) {
+		if (num_virtnic_open[i].netdev == NULL) {
+			num_virtnic_open[i].netdev = netdev;
+			num_virtnic_open[i].vnicinfo = vnicinfo;
+			break;
+		}
+	}
+	if (i == VIRTNICSOPENMAX)
+		LOGINFNAME(vnicinfo->netdev,
+			   "No storage for debug ref for netdev = 0x%p vnicinfo = 0x%p\n",
+			   netdev, vnicinfo);
+
+	return 0;
+}
+
+/* Wait indefinitely for IOVM to acknowledge enable request */
+static int
+virtnic_enable(struct net_device *netdev)
+{
+	return virtnic_enable_with_timeout(netdev,
+		VIRTNIC_INFINITE_RESPONSE_WAIT);
+}
+
+static void
+send_rcv_posts_if_needed(struct virtnic_info *vnicinfo)
+{
+	int i;
+	struct net_device *netdev;
+	struct uiscmdrsp *cmdrsp = vnicinfo->cmdrsp_rcv;
+	int cur_num_rcv_bufs_to_alloc, rcv_bufs_allocated;
+
+	if (!(vnicinfo->enabled && vnicinfo->enab_dis_acked)) {
+		/* don't do this until vnic is marked ready. */
+		return;
+	}
+	netdev = vnicinfo->netdev;
+	rcv_bufs_allocated = 0;
+	/* this code is trying to prevent getting stuck here forever,
+	 * but still retry it if you can't allocate them all this
+	 * time.
+	 */
+	cur_num_rcv_bufs_to_alloc = vnicinfo->num_rcv_bufs_could_not_alloc;
+	while (cur_num_rcv_bufs_to_alloc > 0) {
+		cur_num_rcv_bufs_to_alloc--;
+		for (i = 0; i < vnicinfo->num_rcv_bufs; i++) {
+			if (vnicinfo->rcvbuf[i] != NULL)
+				continue;
+			vnicinfo->rcvbuf[i] = alloc_rcv_buf(netdev);
+			if (!vnicinfo->rcvbuf[i]) {
+				LOGVER("**** %s FAILED to allocate new rcv buf - no REPOST\n",
+				       netdev->name);
+				vnicinfo->
+				    alloc_failed_in_if_needed_cnt++;
+				break;
+			} else {
+				rcv_bufs_allocated++;
+				post_skb(cmdrsp, vnicinfo,
+					 vnicinfo->rcvbuf[i]);
+				vnicinfo->datachan.chstat.
+				    extra_rcvbufs_sent++;
+			}
+		}
+	}
+	vnicinfo->num_rcv_bufs_could_not_alloc -= rcv_bufs_allocated;
+	if (vnicinfo->num_rcv_bufs_could_not_alloc > 0) {
+		/*
+		 * this path means you failed to alloc an skb in the
+		 * normal path, and you are trying again later, and
+		 * it still fails.
+		 */
+		LOGVER("attempted to recover buffers which could not be allocated and failed");
+		LOGVER("rcv_bufs_allocated=%d, num_rcv_bufs_could_not_alloc=%d",
+		       rcv_bufs_allocated,
+		       vnicinfo->num_rcv_bufs_could_not_alloc);
+	}
+}
+
+static void
+drain_queue(struct datachan *dc, struct uiscmdrsp *cmdrsp,
+	    struct virtnic_info *vnicinfo)
+{
+	unsigned long flags;
+	int qrslt;
+	struct net_device *netdev;
+
+	/* drain queue */
+	while (1) {
+		spin_lock_irqsave(&dc->chinfo.insertlock, flags);
+		if (!spar_channel_client_acquire_os(dc->chinfo.queueinfo->chan,
+						    "vnic")) {
+			spin_unlock_irqrestore(&dc->chinfo.insertlock,
+					       flags);
+			break;
+		}
+		qrslt = uisqueue_get_cmdrsp(dc->chinfo.queueinfo, cmdrsp,
+					    IOCHAN_FROM_IOPART);
+		spar_channel_client_release_os(dc->chinfo.queueinfo->chan,
+					       "vnic");
+		spin_unlock_irqrestore(&dc->chinfo.insertlock, flags);
+		if (qrslt == 0)
+			break;	/* queue empty */
+		DBGINF("%p cmdrsp->net.type:%d\n",
+		       &dc->chinfo.queueinfo, cmdrsp->net.type);
+		switch (cmdrsp->net.type) {
+		case NET_RCV:
+			DBGINF("Got NET_RCV\n");
+			dc->chstat.got_rcv++;
+			/* process incoming packet */
+			virtnic_rx(cmdrsp);
+			break;
+		case NET_XMIT_DONE:
+			DBGINF("Got NET_XMIT_DONE %p\n", cmdrsp->net.buf);
+			spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+			dc->chstat.got_xmit_done++;
+			if (cmdrsp->net.xmtdone.xmt_done_result) {
+				LOGERRNAME(vnicinfo->netdev,
+					   "XMIT_DONE failure buf:%p\n",
+					   cmdrsp->net.buf);
+				dc->chstat.xmit_fail++;
+			}
+			/* only call queue wake if we stopped it */
+			netdev = ((struct sk_buff *)cmdrsp->net.buf)->dev;
+			/* ASSERT netdev == vnicinfo->netdev; */
+			if (netdev != vnicinfo->netdev) {
+				LOGERRNAME(vnicinfo->netdev, "NET_XMIT_DONE something wrong; vnicinfo->netdev:%p != cmdrsp->net.buf)->dev:%p\n",
+					   vnicinfo->netdev, netdev);
+			} else if (netif_queue_stopped(netdev)) {
+				/*
+				 * check to see if we have crossed
+				 * the lower watermark for
+				 * netif_wake_queue()
+				 */
+				if (((vnicinfo->datachan.chstat.sent_xmit >=
+				    vnicinfo->datachan.chstat.got_xmit_done) &&
+				    (vnicinfo->datachan.chstat.sent_xmit -
+				    vnicinfo->datachan.chstat.got_xmit_done <=
+				    vnicinfo->lower_threshold_net_xmits)) ||
+				    ((vnicinfo->datachan.chstat.sent_xmit <
+				    vnicinfo->datachan.chstat.got_xmit_done) &&
+				    (ULONG_MAX -
+				    vnicinfo->datachan.chstat.got_xmit_done
+				    + vnicinfo->datachan.chstat.sent_xmit <=
+				    vnicinfo->lower_threshold_net_xmits))) {
+					/*
+					 * enough NET_XMITs completed
+					 * so can restart netif queue
+					 */
+					netif_wake_queue(netdev);
+					vnicinfo->flow_control_lower_hits++;
+				}
+			}
+			skb_unlink(cmdrsp->net.buf, &vnicinfo->xmitbufhead);
+			spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+			kfree_skb(cmdrsp->net.buf);
+			break;
+		case NET_RCV_ENBDIS_ACK:
+			DBGINF("Got NET_RCV_ENBDIS_ACK on:%p\n",
+			       (struct net_device *)
+			       cmdrsp->net.enbdis.context);
+			dc->chstat.got_enbdisack++;
+			netdev = (struct net_device *)
+				cmdrsp->net.enbdis.context;
+			spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+			vnicinfo->enab_dis_acked = 1;
+			spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+
+			if (vnicinfo->server_down &&
+			    vnicinfo->server_change_state) {
+				/* Inform Linux that the link is up */
+				vnicinfo->server_down = false;
+				vnicinfo->server_change_state = false;
+				netif_wake_queue(netdev);
+				netif_carrier_on(netdev);
+			}
+			break;
+		case NET_CONNECT_STATUS:
+			DBGINF("NET_CONNECT_STATUS, enable=:%d\n",
+			       cmdrsp->net.enbdis.enable);
+			netdev = vnicinfo->netdev;
+			if (cmdrsp->net.enbdis.enable == 1) {
+				spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+				vnicinfo->enabled = cmdrsp->net.enbdis.enable;
+				spin_unlock_irqrestore(&vnicinfo->priv_lock,
+						       flags);
+				netif_wake_queue(netdev);
+				netif_carrier_on(netdev);
+			} else {
+				netif_stop_queue(netdev);
+				netif_carrier_off(netdev);
+				spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+				vnicinfo->enabled = cmdrsp->net.enbdis.enable;
+				spin_unlock_irqrestore(&vnicinfo->priv_lock,
+						       flags);
+			}
+			break;
+		default:
+			LOGERRNAME(vnicinfo->netdev,
+				   "Invalid net type:%d in cmdrsp\n",
+				   cmdrsp->net.type);
+			break;
+		}
+		/* cmdrsp is now available for reuse  */
+
+		if (dc->chinfo.threadinfo.should_stop)
+			break;
+	}
+}
+
+static int
+process_incoming_rsps(void *v)
+{
+	struct datachan *dc = v;
+	struct uiscmdrsp *cmdrsp = NULL;
+	const int SZ = SIZEOF_CMDRSP;
+	struct virtnic_info *vnicinfo;
+	struct channel_header __iomem *p_channel_header;
+	struct signal_queue_header __iomem *pqhdr;
+	uint64_t mask;
+	unsigned long long rc1;
+
+	UIS_DAEMONIZE("vnic_incoming");
+	DBGINF("In process_incoming_rsps pid:%d queueinfo:%p threadinfo:%p\n",
+	       current->pid, dc->chinfo.queueinfo, &dc->chinfo.threadinfo);
+	/* alloc once and reuse */
+	vnicinfo = container_of(dc, struct virtnic_info, datachan);
+	cmdrsp = kmalloc(SZ, GFP_ATOMIC);
+	if (cmdrsp == NULL) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "**** FAILED to malloc - thread exiting\n");
+		complete_and_exit(&dc->chinfo.threadinfo.has_stopped, 0);
+	}
+	p_channel_header = vnicinfo->datachan.chinfo.queueinfo->chan;
+	pqhdr =
+	       (struct signal_queue_header __iomem *)
+	       ((char __iomem *)p_channel_header +
+	       readq(&p_channel_header->ch_space_offset)) +
+	       IOCHAN_FROM_IOPART;
+	mask = ULTRA_CHANNEL_ENABLE_INTS;
+	while (1) {
+		wait_event_interruptible_timeout(
+			vnicinfo->rsp_queue, (atomic_read
+					      (&vnicinfo->interrupt_rcvd) == 1),
+			msecs_to_jiffies(vnicinfo->thread_wait_ms));
+		/*
+		 * periodically check to see if there are any rcv bufs which
+		 * need to get sent to the iovm.   This can only happen if
+		 * we run out of memory when trying to allocate skbs.
+		 */
+		atomic_set(&vnicinfo->interrupt_rcvd, 0);
+		send_rcv_posts_if_needed(vnicinfo);
+		drain_queue(dc, cmdrsp, vnicinfo);
+		rc1 = uisqueue_interlocked_or((uint64_t __iomem *)
+					     vnicinfo->flags_addr, mask);
+		if (dc->chinfo.threadinfo.should_stop)
+			break;
+	}
+
+	kfree(cmdrsp);
+	DBGINF("process_incoming_rsps exiting\n");
+	complete_and_exit(&dc->chinfo.threadinfo.has_stopped, 0);
+}
+
+/*****************************************************/
+/* NIC support functions called external             */
+/*****************************************************/
+
+static int
+virtnic_change_mtu(struct net_device *netdev, int new_mtu)
+{
+	LOGERRNAME(netdev, "netdev->name <<%s>>", netdev->name);
+	LOGERRNAME(netdev, "**** FAILED: MTU cannot be changed at this end.\n");
+	LOGERRNAME(netdev, "The same MTU is used for all the PNICs and VNICs in a switch.\n");
+	LOGERRNAME(netdev, "Please change MTU from the Resource Partition\n");
+	LOGERRNAME(netdev, "Current MTU is: %d\n", netdev->mtu);
+	return -EINVAL;
+	/*
+	 * we cannot willy-nilly change the MTU; it has to come from
+	 * CONTROL VM and all the vnics and pnics in a switch have to
+	 * have the same MTU for everything to work.
+	 */
+}
+
+/*
+ * Called by kernel when ifconfig down is run.
+ * Returns 0 on success, negative value on failure.
+ */
+static int
+virtnic_close(struct net_device *netdev)
+{
+	/* this is called on ifconfig down but also if the device is
+	 * being removed
+	 */
+	LOGINFNAME(netdev, "Closing %p name:%s\n", netdev, netdev->name);
+
+	netif_stop_queue(netdev);
+	virtnic_disable(netdev);
+
+	LOGINFNAME(netdev, "Closed:%p\n", netdev);
+
+	return 0;
+}
+
+static int
+virtnic_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd)
+{
+	return -EOPNOTSUPP;
+}
+
+/*
+ * Called by kernel when ifconfig up is run.
+ * Returns 0 on success, negative value on failure.
+ */
+static int
+virtnic_open(struct net_device *netdev)
+{
+	struct virtnic_info *vnicinfo = netdev_priv(netdev);
+	void *p = (__force void *)netdev->ip_ptr;
+
+	LOGINFNAME(vnicinfo->netdev,
+		   "Opening %p name:%s allocating:%d rcvbufs mtu:%d\n", netdev,
+		   netdev->name, vnicinfo->num_rcv_bufs, netdev->mtu);
+
+	virtnic_enable(netdev);
+	/* start the interface's transmit queue, allowing it to accept
+	 * packets for transmission
+	 */
+	netif_start_queue(netdev);
+
+	LOGINFNAME(vnicinfo->netdev,
+		   "Opened %p netdev->ip_ptr:%p name:%s %02x:%02x:%02x:%02x:%02x:%02x\n",
+		   netdev, netdev->ip_ptr, netdev->name, netdev->dev_addr[0],
+		   netdev->dev_addr[1], netdev->dev_addr[2],
+		   netdev->dev_addr[3], netdev->dev_addr[4],
+		   netdev->dev_addr[5]);
+
+	/*
+	 * temporary code: trap to catch whether vnic inet addresses
+	 * are getting trashed
+	 */
+	if (p != (__force void *)netdev->ip_ptr) {
+		LOGERRNAME(vnicinfo->netdev, "***********FAILURE HAPPENED\n");
+		LOGERRNAME(vnicinfo->netdev, "           Test to catch if vnic inet addresses are getting trashed.\n");
+		set_current_state(TASK_INTERRUPTIBLE);
+		schedule_timeout(msecs_to_jiffies(1000));
+	}
+	return 0;
+}
+
+static inline int
+repost_return(
+	struct uiscmdrsp *cmdrsp,
+	struct virtnic_info *vnicinfo,
+	struct sk_buff *skb,
+	struct net_device *netdev)
+{
+	struct net_pkt_rcv copy;
+	int i = 0, cc, numreposted;
+	int found_skb = 0;
+	int status = 0;
+
+	copy = cmdrsp->net.rcv;
+	LOGVER("REPOST_RETURN: realloc rcv skbs to replace:%d rcvbufs\n",
+	       copy.numrcvbufs);
+	switch (copy.numrcvbufs) {
+	case 0:
+		vnicinfo->n_rcv0++;
+		break;
+	case 1:
+		vnicinfo->n_rcv1++;
+		break;
+	case 2:
+		vnicinfo->n_rcv2++;
+		break;
+	default:
+		vnicinfo->n_rcvx++;
+		break;
+	}
+	for (cc = 0, numreposted = 0; cc < copy.numrcvbufs; cc++) {
+		for (i = 0; i < vnicinfo->num_rcv_bufs; i++) {
+			if (vnicinfo->rcvbuf[i] != copy.rcvbuf[cc])
+				continue;
+
+			LOGVER("REPOST_RETURN: orphaning old rcvbuf[%d]:%p cc=%d",
+			       i, vnicinfo->rcvbuf[i], cc);
+			vnicinfo->found_repost_rcvbuf_cnt++;
+			if ((skb) && vnicinfo->rcvbuf[i] == skb) {
+				found_skb = 1;
+				vnicinfo->repost_found_skb_cnt++;
+			}
+			vnicinfo->rcvbuf[i] = alloc_rcv_buf(netdev);
+			if (!vnicinfo->rcvbuf[i]) {
+				LOGVER("**** %s FAILED to reallocate new rcv buf - no REPOST, found_skb=%d, cc=%d, i=%d\n",
+				       netdev->name, found_skb, cc, i);
+				vnicinfo->num_rcv_bufs_could_not_alloc++;
+				vnicinfo->alloc_failed_in_repost_return_cnt++;
+				status = -1;
+				break;
+			}
+			LOGVER("REPOST_RETURN: reposting new rcvbuf[%d]:%p\n",
+			       i, vnicinfo->rcvbuf[i]);
+			post_skb(cmdrsp, vnicinfo, vnicinfo->rcvbuf[i]);
+			numreposted++;
+			break;
+		}
+	}
+	LOGVER("REPOST_RETURN: num rcvbufs posted:%d\n", numreposted);
+	if (numreposted != copy.numrcvbufs) {
+		LOGVER("**** %s FAILED to repost all the rcv bufs; numreposted:%d rcv.numrcvbufs:%d\n",
+		       netdev->name, numreposted, copy.numrcvbufs);
+		vnicinfo->n_repost_deficit++;
+		status = -1;
+	}
+	if (skb) {
+		if (found_skb) {
+			LOGVER("REPOST_RETURN: skb is %p - freeing it", skb);
+			kfree_skb(skb);
+		} else {
+			LOGERRNAME(vnicinfo->netdev, "%s REPOST_RETURN: skb %p NOT found in rcvbuf list!!",
+				   netdev->name, skb);
+			status = -3;
+			vnicinfo->bad_rcv_buf++;
+		}
+	}
+	atomic_dec(&vnicinfo->usage);
+	return status;
+}
+
+static void
+virtnic_rx(struct uiscmdrsp *cmdrsp)
+{
+	struct virtnic_info *vnicinfo;
+	struct sk_buff *skb, *prev, *curr;
+	struct net_device *netdev;
+	int cc, currsize, off, status;
+	struct ethhdr *eth;
+	unsigned long flags;
+#ifdef DEBUG
+	struct phys_info testfrags[MAX_PHYS_INFO];
+#endif
+
+/*
+ * post new rcv buf to the other end using the cmdrsp we have at hand.
+ * post it without holding lock - but we'll use the signal lock to synchronize
+ * the queue insert. the cmdrsp that contains the net.rcv is the one we are
+ * using to repost, so copy the info we need from it.
+ */
+	skb = cmdrsp->net.buf;
+	netdev = skb->dev;
+
+	if (netdev)
+		DBGINF("in virtnic_rx %p %s len:%d\n", netdev, netdev->name,
+		       cmdrsp->net.rcv.rcv_done_len);
+	else {
+		/* We must have previously downed this network device and
+		 * this skb and device is no longer valid. This also means
+		 * the skb reference was removed from vnicinfo->rcvbuf so no
+		 * need to search for it.
+		 * All we can do is free the skb and return.
+		 * Note: We crash if we try to log this here.
+		 */
+		kfree_skb(skb);
+		return;
+	}
+
+	vnicinfo = netdev_priv(netdev);
+
+	spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+	atomic_dec(&vnicinfo->num_rcv_bufs_in_iovm);
+
+	atomic_inc(&vnicinfo->usage);	/* don't want a close to happen before
+					   we're done here */
+	/*
+	 * set length to how much was ACTUALLY received -
+	 * NOTE: rcv_done_len includes actual length of data rcvd
+	 * including ethhdr
+	 */
+	skb->len = cmdrsp->net.rcv.rcv_done_len;
+
+	/* update rcv stats from the actual skb->len - priv_lock is held */
+	UPD_RCV_STATS;
+
+	/* test enabled while holding lock */
+	if (!(vnicinfo->enabled && vnicinfo->enab_dis_acked)) {
+		/*
+		 * don't process it unless we're in enable mode and until
+		 * we've gotten an ACK saying the other end got our RCV enable
+		 */
+		LOGERRNAME(vnicinfo->netdev,
+			   "%s dropping packet - perhaps old\n", netdev->name);
+		spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+		if (repost_return(cmdrsp, vnicinfo, skb, netdev) < 0)
+			LOGERRNAME(vnicinfo->netdev, "repost_return failed");
+		return;
+	}
+
+	spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+
+	/*
+	 * when skb was allocated, skb->dev, skb->data, skb->len and
+	 * skb->data_len were setup. AND, data has already put into the
+	 * skb (both first frag and in frags pages)
+	 * NOTE: firstfragslen is the amount of data in skb->data and that
+	 * which is not in nr_frags or frag_list. This is now simply
+	 * RCVPOST_BUF_SIZE. bump tail to show how much data is in
+	 * firstfrag & set data_len to show rest see if we have to chain
+	 * frag_list.
+	 */
+	if (skb->len > RCVPOST_BUF_SIZE) {	/* do PRECAUTIONARY check */
+		if (cmdrsp->net.rcv.numrcvbufs < 2) {
+			LOGERRNAME(vnicinfo->netdev, "**** %s Something is wrong; rcv_done_len:%d > RCVPOST_BUF_SIZE:%d but numrcvbufs:%d < 2\n",
+				   netdev->name, skb->len, RCVPOST_BUF_SIZE,
+				   cmdrsp->net.rcv.numrcvbufs);
+			if (repost_return(cmdrsp, vnicinfo, skb, netdev) < 0)
+				LOGERRNAME(vnicinfo->netdev,
+					   "repost_return failed");
+			return;
+		}
+		/* length rcvd is greater than firstfrag in this skb rcv buf  */
+		skb->tail += RCVPOST_BUF_SIZE;	/* amount in skb->data */
+		skb->data_len = skb->len - RCVPOST_BUF_SIZE;	/* amount that
+								   will be in
+								   frag_list */
+		DBGINF("len:%d data:%d\n", skb->len, skb->data_len);
+	} else {
+		/*
+		 * data fits in this skb - no chaining - do PRECAUTIONARY check
+		 */
+		if (cmdrsp->net.rcv.numrcvbufs != 1) {	/* should be 1 */
+			LOGERRNAME(vnicinfo->netdev, "**** %s Something is wrong; rcv_done_len:%d <= RCVPOST_BUF_SIZE:%d but numrcvbufs:%d != 1\n",
+				   netdev->name, skb->len, RCVPOST_BUF_SIZE,
+				   cmdrsp->net.rcv.numrcvbufs);
+			if (repost_return(cmdrsp, vnicinfo, skb, netdev) < 0)
+				LOGERRNAME(vnicinfo->netdev,
+					   "repost_return failed");
+			return;
+		}
+		skb->tail += skb->len;
+		skb->data_len = 0;	/* nothing rcvd in frag_list */
+	}
+	off = skb_tail_pointer(skb) - skb->data;
+	/*
+	 * off is the amount we bumped tail by in the head skb.
+	 * It is used to calculate the size of each chained skb below.
+	 * It is also used to index into bufline to continue the copy
+	 * (for chansocktwopc).
+	 * If necessary, chain the rcv skbs together.
+	 * NOTE: rcvbuf[0] is the same skb as cmdrsp->net.rcv.skb; we need
+	 * to chain the rest to that one.
+	 * - do PRECAUTIONARY check
+	 */
+	if (cmdrsp->net.rcv.rcvbuf[0] != skb) {
+		LOGERRNAME(vnicinfo->netdev, "**** %s Something is wrong; rcvbuf[0]:%p != skb:%p\n",
+			   netdev->name, cmdrsp->net.rcv.rcvbuf[0], skb);
+		if (repost_return(cmdrsp, vnicinfo, skb, netdev) < 0)
+			LOGERRNAME(vnicinfo->netdev, "repost_return failed");
+		return;
+	}
+
+	if (cmdrsp->net.rcv.numrcvbufs > 1) {
+		/* chain the various rcv buffers into the skb's frag_list. */
+		/* Note: off was initialized above  */
+		for (cc = 1, prev = NULL;
+		     cc < cmdrsp->net.rcv.numrcvbufs; cc++) {
+			curr = (struct sk_buff *)cmdrsp->net.rcv.rcvbuf[cc];
+			curr->next = NULL;
+			DBGINF("chaining skb:%p data:%p to skb:%p data:%p\n",
+			       curr, curr->data, skb, skb->data);
+			if (prev == NULL)	/* start of list- set head */
+				skb_shinfo(skb)->frag_list = curr;
+			else
+				prev->next = curr;
+			prev = curr;
+			/*
+			 * should we set skb->len and skb->data_len for each
+			 * buffer being chained??? can't hurt!
+			 */
+			currsize =
+			    min(skb->len - off,
+				(unsigned int)RCVPOST_BUF_SIZE);
+			curr->len = currsize;
+			curr->tail += currsize;
+			curr->data_len = 0;
+			off += currsize;
+		}
+#ifdef DEBUG
+		/* assert skb->len == off */
+		if (skb->len != off) {
+			LOGERRNAME(vnicinfo->netdev, "%s something wrong; skb->len:%d != off:%d\n",
+				   netdev->name, skb->len, off);
+		}
+		/* test code */
+		cc = util_copy_fragsinfo_from_skb("rcvchaintest", skb,
+						  RCVPOST_BUF_SIZE,
+						  MAX_PHYS_INFO, testfrags);
+		LOGINFNAME(vnicinfo->netdev, "rcvchaintest returned:%d\n", cc);
+		if (cc != cmdrsp->net.rcv.numrcvbufs) {
+			LOGERRNAME(vnicinfo->netdev, "**** %s Something wrong; rcvd chain length %d different from one we calculated %d\n",
+				   netdev->name, cmdrsp->net.rcv.numrcvbufs,
+				   cc);
+		}
+		for (i = 0; i < cc; i++) {
+			LOGINFNAME(vnicinfo->netdev, "test:RCVPOST_BUF_SIZE:%d[%d] pfn:%llu off:0x%x len:%d\n",
+				   RCVPOST_BUF_SIZE, i, testfrags[i].pi_pfn,
+				   testfrags[i].pi_off, testfrags[i].pi_len);
+		}
+#endif
+	}
+
+	/* set up packet's protocol type using ethernet header - this
+	 * sets up skb->pkt_type & it also PULLS out the eth header
+	 */
+	skb->protocol = eth_type_trans(skb, netdev);
+
+	eth = eth_hdr(skb);
+
+	DBGINF("%d Src:%02x:%02x:%02x:%02x:%02x:%02x Dest:%02x:%02x:%02x:%02x:%02x:%02x proto:%x\n",
+	       skb->pkt_type, eth->h_source[0], eth->h_source[1],
+	       eth->h_source[2], eth->h_source[3], eth->h_source[4],
+	       eth->h_source[5], eth->h_dest[0], eth->h_dest[1], eth->h_dest[2],
+	       eth->h_dest[3], eth->h_dest[4], eth->h_dest[5], eth->h_proto);
+
+	skb->csum = 0;
+	skb->ip_summed = CHECKSUM_NONE;	/* checksum not verified here; the
+					   stack checks it in software */
+
+	do {
+		if (netdev->flags & IFF_PROMISC) {
+			DBGINF("IFF_PROMISC is set.\n");
+			break;	/* accept all packets */
+		}
+		if (skb->pkt_type == PACKET_BROADCAST) {
+			DBGINF("packet is broadcast.\n");
+			if (netdev->flags & IFF_BROADCAST) {
+				DBGINF("IFF_BROADCAST is set.\n");
+				break;	/* accept all broadcast packets */
+			}
+		} else if (skb->pkt_type == PACKET_MULTICAST) {
+			DBGINF("packet is multicast.\n");
+			if (netdev->flags & IFF_ALLMULTI)
+				DBGINF("IFF_ALLMULTI is set.\n");
+			if ((netdev->flags & IFF_MULTICAST) &&
+			    (netdev_mc_count(netdev))) {
+				struct netdev_hw_addr *ha;
+				int found_mc = 0;
+
+				DBGINF("IFF_MULTICAST is set %d.\n",
+				       netdev_mc_count(netdev));
+				/*
+				 * only accept multicast packets that we can
+				 * find in our multicast address list
+				 */
+				netdev_for_each_mc_addr(ha, netdev) {
+					if (memcmp
+					    (eth->h_dest, ha->addr,
+					     MAX_MACADDR_LEN) == 0) {
+						DBGINF("multicast address is in our list.\n");
+						found_mc = 1;
+						break;
+					}
+				}
+				if (found_mc) {
+					break;	/* accept packet, dest
+						   matches a multicast
+						   address */
+				}
+			}
+		} else if (skb->pkt_type == PACKET_HOST) {
+			DBGINF("packet is directed.\n");
+			break;	/* accept packet, h_dest must match vnic
+				   mac address */
+		} else if (skb->pkt_type == PACKET_OTHERHOST) {
+			/* something is not right */
+			LOGERRNAME(vnicinfo->netdev, "**** FAILED to deliver rcv packet to OS; name:%s Dest:%02x:%02x:%02x:%02x:%02x:%02x VNIC:%02x:%02x:%02x:%02x:%02x:%02x\n",
+				   netdev->name, eth->h_dest[0], eth->h_dest[1],
+				   eth->h_dest[2], eth->h_dest[3],
+				   eth->h_dest[4], eth->h_dest[5],
+				   netdev->dev_addr[0], netdev->dev_addr[1],
+				   netdev->dev_addr[2], netdev->dev_addr[3],
+				   netdev->dev_addr[4], netdev->dev_addr[5]);
+		}
+		/* drop packet - don't forward it up to OS */
+		DBGINF("we cannot indicate this recv pkt! (netdev->flags:0x%04x, skb->pkt_type:0x%02x).\n",
+		       netdev->flags, skb->pkt_type);
+		vnicinfo->n_rcv_packet_not_accepted++;
+		if (repost_return(cmdrsp, vnicinfo, skb, netdev) < 0)
+			LOGERRNAME(vnicinfo->netdev, "repost_return failed");
+		return;
+	} while (0);
+
+	DBGINF("Calling netif_rx skb:%p head:%p end:%p data:%p tail:%p len:%d data_len:%d skb->nr_frags:%d\n",
+	       skb, skb->head, skb->end, skb->data, skb->tail, skb->len,
+	       skb->data_len, skb_shinfo(skb)->nr_frags);
+
+	status = netif_rx(skb);
+	if (status != NET_RX_SUCCESS)
+		LOGWRNNAME(vnicinfo->netdev, "status=%d\n", status);
+	/*
+	 * netif_rx returns various values, but "in practice most drivers
+	 * ignore the return value"
+	 */
+
+	skb = NULL;
+	/*
+	 * whether the packet got dropped or handled, the skb is freed by
+	 * kernel code, so we shouldn't free it. but we should repost a
+	 * new rcv buffer.
+	 */
+	if (repost_return(cmdrsp, vnicinfo, skb, netdev) < 0)
+		LOGVER("repost_return failed");
+	return;
+}
+
+/*
+ * This function is protected from concurrent calls by a spinlock xmit_lock
+ * in the net_device struct, but as soon as the function returns it can be
+ * called again.
+ * Returns NETDEV_TX_OK (0) on success, NETDEV_TX_BUSY otherwise.
+ */
+static int
+virtnic_xmit(struct sk_buff *skb, struct net_device *netdev)
+{
+	struct virtnic_info *vnicinfo;
+	int len, firstfraglen, padlen;
+	struct uiscmdrsp *cmdrsp = NULL;
+	unsigned long flags;
+	int qrslt;
+
+/* Note: NETDEV_TX_OK is 0, NETDEV_TX_BUSY is 1. */
+#define BUSY { \
+	spin_unlock_irqrestore(&vnicinfo->priv_lock, flags); \
+	vnicinfo->busy_cnt++; \
+	return NETDEV_TX_BUSY; \
+}
+
+/* return value NETDEV_TX_OK == 0 */
+	DBGINF("got xmit for netdev:%p %s len:%d ip_summed:%d skb->data:%p data_len:%d skb->h.raw:%p maxdatalen:%d\n",
+	       netdev, netdev->name, skb->len, skb->ip_summed, skb->data,
+	       skb->data_len, skb->h.raw, skb->end - skb->data);
+
+	vnicinfo = netdev_priv(netdev);
+	spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+	/*Modified for Trac #2395 FIX TEL_CKS */
+	if (netif_queue_stopped(netdev)) {
+		LOGINFNAME(vnicinfo->netdev,
+			   "Returning Busy because queue is stopped\n");
+		BUSY;
+	}
+	if (vnicinfo->server_down || vnicinfo->server_change_state) {
+		LOGINFNAME(vnicinfo->netdev, "Returning BUSY because server is down/changing state\n");
+		BUSY;
+	}
+	/*
+	 * sk_buff struct is used to host network data throughout all the
+	 * Linux network subsystems
+	 */
+	len = skb->len;
+	/*
+	 * skb->len is the FULL length of data (including fragmentary portion)
+	 * skb->data_len is the length of the fragment portion in frags
+	 * skb->len - skb->data_len is the size of the 1st fragment in skb->data
+	 * calculate the length of the first fragment that skb->data is
+	 * pointing to
+	 */
+	firstfraglen = skb->len - skb->data_len;
+	if (firstfraglen < ETH_HEADER_SIZE) {
+		LOGERRNAME(vnicinfo->netdev, "first fragment in skb->data too small for ethernet header len:%d data_len:%d\n",
+			   skb->len, skb->data_len);
+		BUSY;		/* NOT LIKELY TO HAPPEN */
+	}
+
+	if ((len < ETH_MIN_PACKET_SIZE) &&
+	    ((skb_end_pointer(skb) - skb->data) >= ETH_MIN_PACKET_SIZE)) {
+		/* pad the packet out to minimum size */
+		padlen = ETH_MIN_PACKET_SIZE - len;
+		DBGINF("padding %d\n", padlen);
+		memset(&skb->data[len], 0, padlen);
+		skb->tail += padlen;
+		skb->len += padlen;
+		len += padlen;
+		firstfraglen += padlen;
+	}
+
+	cmdrsp = vnicinfo->xmit_cmdrsp;
+	/* clear cmdrsp */
+	memset(cmdrsp, 0, SIZEOF_CMDRSP);
+	cmdrsp->net.type = NET_XMIT;
+	cmdrsp->cmdtype = CMD_NET_TYPE;
+
+	/* save the pointer to skb - we'll need it for completion */
+	cmdrsp->net.buf = skb;
+
+	if (((vnicinfo->datachan.chstat.sent_xmit >=
+	      vnicinfo->datachan.chstat.got_xmit_done) &&
+	     (vnicinfo->datachan.chstat.sent_xmit -
+	     vnicinfo->datachan.chstat.got_xmit_done >=
+	     vnicinfo->max_outstanding_net_xmits)) ||
+	    /* OR check wrap condition */
+	    ((vnicinfo->datachan.chstat.sent_xmit <
+	      vnicinfo->datachan.chstat.got_xmit_done) &&
+	      (ULONG_MAX - vnicinfo->datachan.chstat.got_xmit_done +
+	       vnicinfo->datachan.chstat.sent_xmit >=
+	       vnicinfo->max_outstanding_net_xmits))
+	    ) {
+		/*
+		 * too many NET_XMITs queued over to IOVM - need to wait
+		 * Might need to remove the below message as these might be
+		 * excessive under load.
+		 */
+		vnicinfo->datachan.chstat.reject_count++;
+		if (!vnicinfo->queuefullmsg_logged &&
+		    ((vnicinfo->datachan.chstat.reject_count & 0x3ff) ==
+			1)) {
+			vnicinfo->queuefullmsg_logged = 1;
+#if VIRTNIC_STATS
+			vnicinfo->datachan.chstat.reject_jiffies_start =
+			    jiffies;
+#endif
+			LOGINFNAME(vnicinfo->netdev, "**** REJECTING NET_XMIT - rejected count=%ld chstat.sent_xmit=%lu chstat.got_xmit_done=%lu\n",
+				   vnicinfo->datachan.chstat.reject_count,
+				   vnicinfo->datachan.chstat.sent_xmit,
+				   vnicinfo->datachan.chstat.got_xmit_done);
+		}
+		netif_stop_queue(netdev);	/* calling stop queue */
+		BUSY;		/* return status that packet not accepted */
+	} else if (vnicinfo->queuefullmsg_logged) {
+#if VIRTNIC_STATS
+		LOGINFNAME(vnicinfo->netdev, "**** NET_XMITs now working again - rejected count = %ld msec = %ld\n",
+			   vnicinfo->datachan.chstat.reject_count,
+			   ((long)jiffies -
+			   (long)(vnicinfo->datachan.chstat.
+				    reject_jiffies_start)) * 1000 / HZ);
+#else
+		LOGINFNAME(vnicinfo->netdev, "**** NET_XMITs now working again - rejected count = %ld\n",
+			   vnicinfo->datachan.chstat.reject_count);
+#endif
+		/* queue is not blocked so reset the logging flag */
+		vnicinfo->queuefullmsg_logged = 0;
+	}
+
+	if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
+		DBGINF("CHECKSUM_HW protocol:%x csum:%x tso_size:%x data:%p h.raw:%p nh.raw:%p\n",
+		       skb->protocol, skb->csum, skb_shinfo(skb)->tso_size,
+		       skb->data, skb->h.raw, skb->nh.raw);
+		cmdrsp->net.xmt.lincsum.valid = 1;
+		cmdrsp->net.xmt.lincsum.protocol = skb->protocol;
+		if (skb_transport_header(skb) > skb->data) {
+			cmdrsp->net.xmt.lincsum.hrawoff =
+				skb_transport_header(skb) - skb->data;
+			cmdrsp->net.xmt.lincsum.hrawoffv = 1;
+		}
+		if (skb_network_header(skb) > skb->data) {
+			cmdrsp->net.xmt.lincsum.nhrawoff =
+			    skb_network_header(skb) - skb->data;
+			cmdrsp->net.xmt.lincsum.nhrawoffv = 1;
+		}
+		cmdrsp->net.xmt.lincsum.csum = skb->csum;
+	} else {
+		cmdrsp->net.xmt.lincsum.valid = 0;
+	}
+	/* save off the length of the entire data packet */
+	cmdrsp->net.xmt.len = len;	/* total data length */
+	/*
+	 * copy ethernet header from first frag into cmdrsp
+	 * - everything else will be passed in frags & DMA'ed
+	 */
+	memcpy(cmdrsp->net.xmt.ethhdr, skb->data, ETH_HEADER_SIZE);
+	/*
+	 * copy frags info - from skb->data we need to only provide access
+	 * beyond eth header
+	 */
+	cmdrsp->net.xmt.num_frags =
+	    uisutil_copy_fragsinfo_from_skb("virtnic_xmit", skb, firstfraglen,
+					    MAX_PHYS_INFO,
+					    cmdrsp->net.xmt.frags);
+	if (cmdrsp->net.xmt.num_frags == -1) {
+		LOGERRNAME(vnicinfo->netdev, "**** FAILED to copy fragsinfo\n");
+		BUSY;		/* WILL HAPPEN ONLY IF FRAG ARRAY WITH
+				   MAX_PHYS_INFO ENTRIES IS NOT ENOUGH */
+	}
+
+	DBGINF("Forwarding packet cmdrsp:%p\n", cmdrsp);
+
+	/*
+	 * priv_lock is still held here, so the insert must not sleep -
+	 * pass 0 (don't wait) so a full queue fails immediately
+	 */
+	qrslt = uisqueue_put_cmdrsp_with_lock_client(
+			vnicinfo->datachan.chinfo.queueinfo, cmdrsp,
+			IOCHAN_TO_IOPART,
+			(void *)&vnicinfo->datachan.chinfo.insertlock,
+			DONT_ISSUE_INTERRUPT, (uint64_t)NULL,
+			0 /* don't wait */ ,
+			"vnic");
+	if (!qrslt) {
+		/* failed to queue xmit - return busy */
+		LOGERRNAME(vnicinfo->netdev,
+			   "**** FAILED to insert NET_XMIT\n");
+		netif_stop_queue(netdev);	/* calling stop queue  */
+		BUSY;		/* return status that packet not accepted */
+	}
+	/* Track the skbs that have been sent to the IOVM for XMIT */
+	skb_queue_head(&vnicinfo->xmitbufhead, skb);
+
+	/*
+	 * set the last transmission start time
+	 * Linux docs say: Do not forget to update netdev->trans_start to
+	 * jiffies after each new tx packet is given to the hardware.
+	 */
+	netdev->trans_start = jiffies;	/* some code in Linux uses this. */
+
+	/* update xmt stats */
+	UPD_XMT_STATS;
+	vnicinfo->datachan.chstat.sent_xmit++;
+
+	/*
+	 * check to see if we have hit the high watermark for
+	 * netif_stop_queue()
+	 */
+	if (((vnicinfo->datachan.chstat.sent_xmit >=
+	      vnicinfo->datachan.chstat.got_xmit_done) &&
+	     (vnicinfo->datachan.chstat.sent_xmit -
+	      vnicinfo->datachan.chstat.got_xmit_done >=
+	      vnicinfo->upper_threshold_net_xmits)) ||
+	    /* OR check wrap condition */
+	    ((vnicinfo->datachan.chstat.sent_xmit <
+	      vnicinfo->datachan.chstat.got_xmit_done) &&
+	      (ULONG_MAX - vnicinfo->datachan.chstat.got_xmit_done +
+	       vnicinfo->datachan.chstat.sent_xmit >=
+	       vnicinfo->upper_threshold_net_xmits))
+	   ) {
+		/* too many NET_XMITs queued over to IOVM - need to wait */
+		netif_stop_queue(netdev); /* calling stop queue - call
+					     netif_wake_queue() after lower
+					     threshold */
+		vnicinfo->flow_control_upper_hits++;
+	}
+
+	spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+
+	/* skb will be freed when we get back NET_XMIT_DONE */
+	return NETDEV_TX_OK;
+}
+
+static void
+virtnic_serverdown_complete(struct work_struct *work)
+{
+	struct virtnic_info *vnicinfo;
+	struct net_device *netdev;
+	struct virtpci_dev *virtpcidev;
+	unsigned long flags;
+	int i = 0, count = 0;
+
+	vnicinfo =
+	    container_of(work, struct virtnic_info, serverdown_completion);
+	netdev = vnicinfo->netdev;
+	virtpcidev = vnicinfo->virtpcidev;
+
+	DBGINF("virtpcidev busNo<<%d>>devNo<<%d>>", virtpcidev->bus_no,
+	       virtpcidev->device_no);
+	DBGINF("net_device name<<%s>>", netdev->name);
+	/* Stop Using Datachan */
+	uisthread_stop(&vnicinfo->datachan.chinfo.threadinfo);
+
+	/* Inform Linux that the link is down */
+	netif_carrier_off(netdev);
+	netif_stop_queue(netdev);
+
+	/*
+	 * Free the skb for XMITs that haven't been serviced by the server
+	 * We shouldn't have to inform Linux about these IOs because they
+	 * are "lost in the ethernet"
+	 */
+	skb_queue_purge(&vnicinfo->xmitbufhead);
+
+	spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+	/* free rcv buffers */
+	for (i = 0; i < vnicinfo->num_rcv_bufs; i++) {
+		if (vnicinfo->rcvbuf[i]) {
+			kfree_skb(vnicinfo->rcvbuf[i]);
+			vnicinfo->rcvbuf[i] = NULL;
+			count++;
+		}
+	}
+	atomic_set(&vnicinfo->num_rcv_bufs_in_iovm, 0);
+	spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+
+	LOGINFNAME(vnicinfo->netdev, "Closed:%p Freed %d rcv bufs\n", netdev,
+		   count);
+
+	vnicinfo->server_down = true;
+	vnicinfo->server_change_state = false;
+	visorchipset_device_pause_response(virtpcidev->bus_no,
+					   virtpcidev->device_no, 0);
+}
+
+/* As per VirtpciFunc returns 1 for success and 0 for failure */
+static int
+virtnic_serverdown(struct virtpci_dev *virtpcidev, u32 state)
+{
+	struct net_device *netdev = virtpcidev->net.netdev;
+	struct virtnic_info *vnicinfo = netdev_priv(netdev);
+
+	DBGINF("virtpcidev busNo<<%d>>devNo<<%d>>", virtpcidev->bus_no,
+	       virtpcidev->device_no);
+	DBGINF("entering virtnic_serverdown");
+
+	if (!vnicinfo->server_down && !vnicinfo->server_change_state) {
+		vnicinfo->server_change_state = true;
+		queue_work(virtnic_serverdown_workqueue,
+			   &vnicinfo->serverdown_completion);
+	} else if (vnicinfo->server_change_state) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "Server already processing change state message.");
+		return 0;
+	} else {
+		LOGERRNAME(vnicinfo->netdev,
+			   "Server already down, but another server down message received.");
+	}
+	DBGINF("exiting virtnic_serverdown");
+	return 1;
+}
+
+/* As per VirtpciFunc returns 1 for success and 0 for failure */
+static int
+virtnic_serverup(struct virtpci_dev *virtpcidev)
+{
+	struct net_device *netdev = virtpcidev->net.netdev;
+	struct virtnic_info *vnicinfo = netdev_priv(netdev);
+	unsigned long flags;
+
+	DBGINF("entering virtnic_serverup");
+	DBGINF("virtpcidev busNo<<%d>>devNo<<%d>>", virtpcidev->bus_no,
+	       virtpcidev->device_no);
+	DBGINF("net_device name<<%s>>", netdev->name);
+	if (vnicinfo->server_down && !vnicinfo->server_change_state) {
+		vnicinfo->server_change_state = true;
+		/*
+		 * Must transition channel to ATTACHED state BEFORE we can
+		 * start using the device again
+		 */
+		SPAR_CHANNEL_CLIENT_TRANSITION(vnicinfo->datachan.chinfo.
+					       queueinfo->chan,
+					       dev_name(&virtpcidev->
+							generic_dev),
+					       CHANNELCLI_ATTACHED, NULL);
+
+		if (!uisthread_start(&vnicinfo->datachan.chinfo.threadinfo,
+				     process_incoming_rsps,
+				     &vnicinfo->datachan, "vnic_incoming")) {
+			LOGERRNAME(vnicinfo->netdev,
+				   "**** FAILED to start thread\n");
+			return 0;
+		}
+
+		init_rcv_bufs(netdev, vnicinfo);
+
+		spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+		vnicinfo->enabled = 1;
+		/*
+		 * now we're ready, let's send an ENB to uisnic
+		 * but until we get an ACK back from uisnic, we'll drop
+		 * the packets
+		 */
+		vnicinfo->enab_dis_acked = 0;
+		spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+
+		/*
+		 * send enable and wait for ack - don't hold lock when
+		 * sending enable because if the queue is full, insert
+		 * might sleep.
+		 */
+		SEND_ENBDIS(netdev, 1, vnicinfo->cmdrsp_rcv,
+			    vnicinfo->datachan.chinfo.queueinfo,
+			    &vnicinfo->datachan.chinfo.insertlock,
+			    vnicinfo->datachan.chstat);
+	} else if (vnicinfo->server_change_state) {
+		LOGERRNAME(vnicinfo->netdev,
+			   "Server already processing change state message.");
+		return 0;
+	} else {
+		DBGINF("Server up message received for server that was already up.");
+	}
+	DBGINF("exiting virtnic_serverup");
+	return 1;
+}
+
+static void
+virtnic_timeout_reset(struct work_struct *work)
+{
+	struct virtnic_info *vnicinfo;
+	struct net_device *netdev;
+	struct virtpci_dev *virtpcidev;
+	int response = 0;
+
+	vnicinfo = container_of(work, struct virtnic_info, timeout_reset);
+	netdev = vnicinfo->netdev;
+
+	DBGINF("net_device name<<%s>>", netdev->name);
+	/* Transmit timeouts are typically handled by resetting the
+	 * device. For our virtual NIC we will send a Disable and
+	 * Enable to the IOVM. If it doesn't respond we will trigger
+	 * a serverdown.
+	 */
+	DBGINF("Disabling connection to server.\n");
+	netif_stop_queue(netdev);
+	response = virtnic_disable_with_timeout(netdev, 100);
+	if (response != 0)
+		goto call_serverdown;
+
+	DBGINF("Disable returned so reenable connection to server.\n");
+	response = virtnic_enable_with_timeout(netdev, 100);
+	if (response != 0)
+		goto call_serverdown;
+	netif_wake_queue(netdev);
+
+	LOGWRNNAME(vnicinfo->netdev, "Virtual connection reset.\n");
+	return;
+
+call_serverdown:
+	LOGERRNAME(vnicinfo->netdev,
+		   "Disable/enable pair failed to return so start serverdown.\n");
+	virtpcidev = vnicinfo->virtpcidev;
+	virtnic_serverdown(virtpcidev, 0);
+	return;
+}
+
+static void
+virtnic_xmit_timeout(struct net_device *netdev)
+{
+	struct virtnic_info *vnicinfo = netdev_priv(netdev);
+	unsigned long flags;
+
+	LOGWRNNAME(vnicinfo->netdev,
+		   "Transmit Timeout.  Resetting virtual connection.\n");
+	LOGWRNNAME(vnicinfo->netdev, "net_device name<<%s>>", netdev->name);
+
+	spin_lock_irqsave(&vnicinfo->priv_lock, flags);
+	/* Ensure that a ServerDown message hasn't been received */
+	if (!vnicinfo->enabled ||
+	    (vnicinfo->server_down && !vnicinfo->server_change_state)) {
+		spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+		return;
+	}
+	spin_unlock_irqrestore(&vnicinfo->priv_lock, flags);
+
+	queue_work(virtnic_timeout_reset_workqueue, &vnicinfo->timeout_reset);
+}
+
+static void
+virtnic_set_multi(struct net_device *netdev)
+{
+	struct uiscmdrsp *cmdrsp;
+	struct virtnic_info *vnicinfo = netdev_priv(netdev);
+
+	DBGINF("net_device name<<%s>>", netdev->name);
+	DBGINF("entering virtnic_set_multi\n");
+
+	/* any filtering changes? */
+	if (vnicinfo->old_flags != netdev->flags) {
+		LOGINFNAME(vnicinfo->netdev,
+			   "old filter = 0x%04x, new filter = 0x%04x.\n",
+			   vnicinfo->old_flags, netdev->flags);
+		if ((netdev->flags & IFF_PROMISC) !=
+		    (vnicinfo->old_flags & IFF_PROMISC)) {
+			LOGINFNAME(vnicinfo->netdev,
+				   "we are %s promiscuous mode.\n",
+				   (netdev->
+				    flags & IFF_PROMISC) ? "entering" :
+				   "exiting");
+			cmdrsp = kzalloc(SIZEOF_CMDRSP, GFP_ATOMIC);
+			if (cmdrsp == NULL) {
+				LOGERRNAME(vnicinfo->netdev,
+					   "**** FAILED to kzalloc cmdrsp.\n");
+				return;
+			}
+			cmdrsp->cmdtype = CMD_NET_TYPE;
+			cmdrsp->net.type = NET_RCV_PROMISC;
+			cmdrsp->net.enbdis.context = netdev;
+			cmdrsp->net.enbdis.enable =
+			    (netdev->flags & IFF_PROMISC);
+			if (uisqueue_put_cmdrsp_with_lock_client
+			    (vnicinfo->datachan.chinfo.queueinfo, cmdrsp,
+			     IOCHAN_TO_IOPART,
+			     (void *)&vnicinfo->datachan.chinfo.insertlock,
+			     DONT_ISSUE_INTERRUPT, (uint64_t)NULL,
+			     0 /* don't wait */ , "vnic")) {
+				vnicinfo->datachan.chstat.sent_promisc++;
+			} else
+				LOGERRNAME(vnicinfo->netdev,
+					   "**** FAILED to insert NET_RCV_PROMISC.\n");
+			kfree(cmdrsp);
+		}
+
+		vnicinfo->old_flags = netdev->flags;
+	}
+	DBGINF("exiting virtnic_set_multi\n");
+}
+
+/*****************************************************/
+/* debugfs filesystem functions			     */
+/*****************************************************/
+
+static ssize_t info_debugfs_read(struct file *file,
+				 char __user *buf, size_t len, loff_t *offset)
+{
+	int i;
+	ssize_t bytes_read = 0;
+	int str_pos = 0;
+	struct virtnic_info *vni;
+	char *vbuf;
+
+	if (len > MAX_BUF)
+		len = MAX_BUF;
+	vbuf = kzalloc(len, GFP_KERNEL);
+	if (!vbuf)
+		return -ENOMEM;
+
+	/* for each vnic channel
+	 * dump out channel specific data
+	 */
+	for (i = 0; i < VIRTNICSOPENMAX; i++) {
+		if (num_virtnic_open[i].netdev == NULL)
+			continue;
+
+		vni = num_virtnic_open[i].vnicinfo;
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, "Vnic i = %d\n", i);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, "netdev = %s (0x%p), MAC Addr: %02x:%02x:%02x:%02x:%02x:%02x\n",
+			num_virtnic_open[i].netdev->name,
+			num_virtnic_open[i].netdev,
+			num_virtnic_open[i].netdev->dev_addr[0],
+			num_virtnic_open[i].netdev->dev_addr[1],
+			num_virtnic_open[i].netdev->dev_addr[2],
+			num_virtnic_open[i].netdev->dev_addr[3],
+			num_virtnic_open[i].netdev->dev_addr[4],
+			num_virtnic_open[i].netdev->dev_addr[5]);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, "vnicinfo = 0x%p\n", vni);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " num_rcv_bufs = %d\n",
+			vni->num_rcv_bufs);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " features = 0x%016llX\n",
+			(uint64_t)readq(&vni->datachan.chinfo.queueinfo->chan->
+				features));
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " max_outstanding_net_xmits = %d\n",
+			vni->max_outstanding_net_xmits);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " upper_threshold_net_xmits = %d\n",
+			vni->upper_threshold_net_xmits);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " lower_threshold_net_xmits = %d\n",
+			vni->lower_threshold_net_xmits);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " queuefullmsg_logged = %d\n",
+			vni->queuefullmsg_logged);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " queueinfo->packets_sent = %lld\n",
+			vni->datachan.chinfo.queueinfo->packets_sent);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " queueinfo->packets_received = %lld\n",
+			vni->datachan.chinfo.queueinfo->packets_received);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " chstat.got_rcv = %lu\n",
+			vni->datachan.chstat.got_rcv);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " chstat.got_enbdisack = %lu\n",
+			vni->datachan.chstat.got_enbdisack);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " chstat.got_xmit_done = %lu\n",
+			vni->datachan.chstat.got_xmit_done);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " chstat.xmit_fail = %lu\n",
+			vni->datachan.chstat.xmit_fail);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " chstat.sent_enbdis = %lu\n",
+			vni->datachan.chstat.sent_enbdis);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " chstat.sent_promisc = %lu\n",
+			vni->datachan.chstat.sent_promisc);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " chstat.sent_post = %lu\n",
+			vni->datachan.chstat.sent_post);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " chstat.sent_xmit = %lu\n",
+			vni->datachan.chstat.sent_xmit);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " chstat.reject_count = %lu\n",
+			vni->datachan.chstat.reject_count);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " chstat.extra_rcvbufs_sent = %lu\n",
+			vni->datachan.chstat.extra_rcvbufs_sent);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " n_rcv0 = %lu\n", vni->n_rcv0);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " n_rcv1 = %lu\n", vni->n_rcv1);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " n_rcv2 = %lu\n", vni->n_rcv2);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " n_rcvx = %lu\n", vni->n_rcvx);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " num_rcv_bufs_in_iovm = %d\n",
+			atomic_read(&vni->num_rcv_bufs_in_iovm));
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " alloc_failed_in_if_needed_cnt = %lu\n",
+			vni->alloc_failed_in_if_needed_cnt);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " alloc_failed_in_repost_return_cnt = %lu\n",
+			vni->alloc_failed_in_repost_return_cnt);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " inner_loop_limit_reached_cnt = %lu\n",
+			vni->inner_loop_limit_reached_cnt);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " found_repost_rcvbuf_cnt = %lu\n",
+			vni->found_repost_rcvbuf_cnt);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " repost_found_skb_cnt = %lu\n",
+			vni->repost_found_skb_cnt);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " n_repost_deficit = %lu\n",
+			vni->n_repost_deficit);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " bad_rcv_buf = %lu\n",
+			vni->bad_rcv_buf);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " n_rcv_packet_not_accepted = %lu\n",
+			vni->n_rcv_packet_not_accepted);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " interrupts_rcvd = %llu\n",
+			vni->interrupts_rcvd);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " interrupts_notme = %llu\n",
+			vni->interrupts_notme);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " interrupts_disabled = %llu\n",
+			vni->interrupts_disabled);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " busy_cnt = %llu\n",
+			vni->busy_cnt);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " flow_control_upper_hits = %llu\n",
+			vni->flow_control_upper_hits);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " flow_control_lower_hits = %llu\n",
+			vni->flow_control_lower_hits);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " thread_wait_ms = %d\n",
+			vni->thread_wait_ms);
+		str_pos += scnprintf(vbuf + str_pos,
+				len - str_pos, " netif_queue = %s\n",
+			netif_queue_stopped(vni->netdev) ?
+			"stopped" : "running");
+	}
+	bytes_read = simple_read_from_buffer(buf, len, offset, vbuf, str_pos);
+	kfree(vbuf);
+	return bytes_read;
+}
+
+static ssize_t enable_ints_write(struct file *file,
+				 const char __user *buffer,
+				 size_t count, loff_t *ppos)
+{
+	char buf[4];
+	int i, new_value;
+	struct virtnic_info *vnicinfo;
+	uint64_t __iomem *features_addr;
+	uint64_t mask;
+
+	if (count >= ARRAY_SIZE(buf))
+		return -EINVAL;
+
+	buf[count] = '\0';
+	if (copy_from_user(buf, buffer, count)) {
+		LOGERR("copy_from_user failed.\n");
+		return -EFAULT;
+	}
+
+	i = kstrtoint(buf, 10, &new_value);
+
+	if (i != 0) {
+		LOGERR("Failed to scan value for enable_ints, buf<<%.*s>>",
+		       (int)count, buf);
+		return -EINVAL;
+	}
+
+	/* new_value 1 selects interrupt mode on every open vnic;
+	 * any other value selects polling mode
+	 */
+	for (i = 0; i < VIRTNICSOPENMAX; i++) {
+		if (num_virtnic_open[i].vnicinfo != NULL) {
+			vnicinfo = num_virtnic_open[i].vnicinfo;
+			features_addr =
+				&vnicinfo->datachan.chinfo.queueinfo->chan->
+				features;
+			if (new_value == 1) {
+				mask =
+				    ~(ULTRA_IO_CHANNEL_IS_POLLING |
+				      ULTRA_IO_DRIVER_DISABLES_INTS);
+				uisqueue_interlocked_and(features_addr, mask);
+				mask = ULTRA_IO_DRIVER_ENABLES_INTS;
+				uisqueue_interlocked_or(features_addr, mask);
+				vnicinfo->thread_wait_ms = 2000;
+			} else {
+				mask =
+					~(ULTRA_IO_DRIVER_ENABLES_INTS |
+					ULTRA_IO_DRIVER_DISABLES_INTS);
+				uisqueue_interlocked_and(features_addr, mask);
+				mask = ULTRA_IO_CHANNEL_IS_POLLING;
+				uisqueue_interlocked_or(features_addr, mask);
+				vnicinfo->thread_wait_ms = 2;
+			}
+		}
+	}
+
+	return count;
+}
+
+/*****************************************************/
+/* Module init & exit functions                      */
+/*****************************************************/
+
+static int __init
+virtnic_mod_init(void)
+{
+	int error, i;
+
+	LOGINF("entering virtnic_mod_init");
+	/* ASSERT RCVPOST_BUF_SIZE fits in a page (<= PI_PAGE_SIZE) */
+	if (RCVPOST_BUF_SIZE > PI_PAGE_SIZE) {
+		LOGERR("**** FAILED RCVPOST_BUF_SIZE:%d larger than a page\n",
+		       RCVPOST_BUF_SIZE);
+		return -1;
+	}
+	/* ASSERT RCVPOST_BUF_SIZE is big enough to hold eth header */
+	if (RCVPOST_BUF_SIZE < ETH_HEADER_SIZE) {
+		LOGERR("**** FAILED RCVPOST_BUF_SIZE:%d is < ETH_HEADER_SIZE:%d\n",
+		       RCVPOST_BUF_SIZE, ETH_HEADER_SIZE);
+		return -1;
+	}
+
+	/* clear out array */
+	for (i = 0; i < VIRTNICSOPENMAX; i++) {
+		num_virtnic_open[i].netdev = NULL;
+		num_virtnic_open[i].vnicinfo = NULL;
+	}
+	/* create workqueue for serverdown completion */
+	virtnic_serverdown_workqueue =
+	    create_singlethread_workqueue("virtnic_serverdown");
+	if (virtnic_serverdown_workqueue == NULL) {
+		LOGERR("**** FAILED virtnic_serverdown_workqueue creation\n");
+		return -1;
+	}
+	/* create workqueue for tx timeout reset  */
+	virtnic_timeout_reset_workqueue =
+	    create_singlethread_workqueue("virtnic_timeout_reset");
+	if (virtnic_timeout_reset_workqueue == NULL) {
+		LOGERR
+		    ("**** FAILED virtnic_timeout_reset_workqueue creation\n");
+		return -1;
+	}
+	virtnic_debugfs_dir = debugfs_create_dir("virtnic", NULL);
+	debugfs_create_file("info", S_IRUSR, virtnic_debugfs_dir,
+			    NULL, &debugfs_info_fops);
+	debugfs_create_file("enable_ints", S_IWUSR,
+			    virtnic_debugfs_dir, NULL,
+			    &debugfs_enable_ints_fops);
+
+	error = virtpci_register_driver(&virtnic_driver);
+	if (error < 0) {
+		LOGERR("**** FAILED to register driver %x\n", error);
+		debugfs_remove_recursive(virtnic_debugfs_dir);
+		return -1;
+	}
+	LOGINF("exiting virtnic_mod_init");
+	return error;
+}
+
+static void __exit
+virtnic_mod_exit(void)
+{
+	LOGINF("entering virtnic_mod_exit...\n");
+	virtpci_unregister_driver(&virtnic_driver);
+	/* unregister is going to call virtnic_remove for all devices */
+	/* destroy serverdown completion workqueue */
+	if (virtnic_serverdown_workqueue) {
+		destroy_workqueue(virtnic_serverdown_workqueue);
+		virtnic_serverdown_workqueue = NULL;
+	}
+
+	/* destroy timeout reset workqueue */
+	if (virtnic_timeout_reset_workqueue) {
+		destroy_workqueue(virtnic_timeout_reset_workqueue);
+		virtnic_timeout_reset_workqueue = NULL;
+	}
+
+	debugfs_remove_recursive(virtnic_debugfs_dir);
+	LOGINF("exiting virtnic_mod_exit...\n");
+}
+
+module_init(virtnic_mod_init);
+module_exit(virtnic_mod_exit);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Usha Srinivasan");
+MODULE_ALIAS("uisvirtnic");
+/* this is extracted during depmod and kept in modules.dep */
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH] MAINTAINERS: changes for wireless
From: Arend van Spriel @ 2014-12-17 19:29 UTC (permalink / raw)
  To: John W. Linville; +Cc: netdev, linux-wireless, davem
In-Reply-To: <1418836025-9035-1-git-send-email-linville@tuxdriver.com>

On 12/17/14 18:07, John W. Linville wrote:
> http://marc.info/?l=linux-wireless&m=141883202530292&w=2
>
> This makes it official... :-)

Let's see if I can comment on this patch.

First of all, thanks for the years of service. You already gave the
heads-up a few months ago, but now it's official. It has been good working
with you on getting rid of eyesore code in the brcm80211 drivers.

> Signed-off-by: John W. Linville<linville@tuxdriver.com>
> ---
>   MAINTAINERS | 19 ++++++++-----------
>   1 file changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index fdffe962a16a..e82d31aeb936 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6603,19 +6603,8 @@ L:	netdev@vger.kernel.org
>   S:	Maintained
>
>   NETWORKING [WIRELESS]
> -M:	"John W. Linville"<linville@tuxdriver.com>
>   L:	linux-wireless@vger.kernel.org
>   Q:	http://patchwork.kernel.org/project/linux-wireless/list/
> -T:	git git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless.git
> -S:	Maintained
> -F:	net/mac80211/
> -F:	net/rfkill/
> -F:	net/wireless/
> -F:	include/net/ieee80211*
> -F:	include/linux/wireless.h
> -F:	include/uapi/linux/wireless.h
> -F:	include/net/iw_handler.h
> -F:	drivers/net/wireless/
>
>   NETWORKING DRIVERS
>   L:	netdev@vger.kernel.org
> @@ -6636,6 +6625,14 @@ F:	include/linux/inetdevice.h
>   F:	include/uapi/linux/if_*
>   F:	include/uapi/linux/netdevice.h
>
> +NETWORKING DRIVERS (WIRELESS)
> +M:	Kalle Valo<kvalo@codeaurora.org>
> +L:	linux-wireless@vger.kernel.org
> +Q:	http://patchwork.kernel.org/project/linux-wireless/list/
> +T:	git git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git/
> +S:	Maintained
> +F:	drivers/net/wireless/

So what about the other paths that were in "NETWORKING [WIRELESS]"?
A couple of them are obviously maintained by Johannes, but..

> +
>   NETXEN (1/10) GbE SUPPORT
>   M:	Manish Chopra<manish.chopra@qlogic.com>
>   M:	Sony Chacko<sony.chacko@qlogic.com>

^ permalink raw reply

* Re: [iproute2] tc: Show classes more hierarchically]
From: Stephen Hemminger @ 2014-12-17 19:55 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: vadim4j, netdev
In-Reply-To: <54907619.5080508@gmail.com>

On Tue, 16 Dec 2014 16:12:41 -0200
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote:

> On 15-12-2014 20:48, vadim4j@gmail.com wrote:
> > Hi All,
> >
> > I am playing with showing classes in a more hierarchical format. I
> > have some code, and example output from my TC looks like:
> >
> > # tc/tc -t class show dev tap0
> >
> >   \---1:2 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >          \---1:40 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >          \---1:50 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >          \---1:60 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >   \---1:1 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >          \---1:10 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >                 \---1:11 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >                        \---1:111 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >          \---1:20 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >          \---1:30 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >
> >
> > which in standard output mode looks like:
> >
> > # tc/tc class show dev tap0
> >
> > class htb 1:11 parent 1:10 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > class htb 1:111 parent 1:11 prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > class htb 1:10 parent 1:1 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
> > class htb 1:1 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > class htb 1:20 parent 1:1 leaf 20: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > class htb 1:2 root rate 6Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > class htb 1:30 parent 1:1 leaf 30: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> > class htb 1:40 parent 1:2 leaf 40: prio 0 rate 5Mbit ceil 5Mbit burst 15Kb cburst 1600b
> > class htb 1:50 parent 1:2 leaf 50: prio 0 rate 3Mbit ceil 6Mbit burst 15Kb cburst 1599b
> > class htb 1:60 parent 1:2 leaf 60: prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
> >
> > So I'd like to ask whether this would be useful for TC users
> > (maybe in a better format?).
> 
> Good idea! It already looks good, but what about:
> 
>    |-- 1:2 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>    |      |-- 1:40 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>    |      |-- 1:50 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>    |      '-- 1:60 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>    |-- 1:1 (htb) prio 0 rate 1Kbit ceil 6Mbit burst 15Kb cburst 1599b
>    ...
> 
> just another idea..
> 
> Thanks.
>    Marcelo

Several other tools also print in tree format (lspci, tree, ps), so
hopefully there is reusable code.
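
As a rough illustration of what such a tree printer involves, here is a
minimal userspace sketch. It is not iproute2 code: struct tc_class,
is_child_of() and render_tree() are hypothetical names, and the input is
assumed to be a flat, cycle-free table of (classid, parent) pairs like the
one the standard "tc class show" output implies. The connectors follow
Marcelo's suggested style.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical flat record, mirroring "class htb 1:11 parent 1:10". */
struct tc_class {
	const char *id;		/* e.g. "1:11" */
	const char *parent;	/* e.g. "1:10", or NULL for a root class */
};

/* A NULL parent selects the root classes. */
static int is_child_of(const struct tc_class *c, const char *parent)
{
	if (!parent)
		return c->parent == NULL;
	return c->parent && strcmp(c->parent, parent) == 0;
}

/*
 * Append every class below `parent` to `out`, one per line.
 * `out` must hold a NUL-terminated string on entry (e.g. "").
 * Assumes the parent links contain no cycles.
 */
static void render_tree(const struct tc_class *tab, size_t n,
			const char *parent, const char *prefix,
			char *out, size_t outsz)
{
	char child_prefix[128];
	size_t i, last = n;

	/* Find the last sibling so it can get the closing connector. */
	for (i = 0; i < n; i++)
		if (is_child_of(&tab[i], parent))
			last = i;

	for (i = 0; i < n; i++) {
		size_t len = strlen(out);

		if (!is_child_of(&tab[i], parent) || len >= outsz)
			continue;
		snprintf(out + len, outsz - len, "%s%s %s\n", prefix,
			 i == last ? "'--" : "|--", tab[i].id);
		/* Children of a non-last sibling keep the vertical bar. */
		snprintf(child_prefix, sizeof(child_prefix), "%s%s", prefix,
			 i == last ? "    " : "|   ");
		render_tree(tab, n, tab[i].id, child_prefix, out, outsz);
	}
}
```

Calling render_tree(tab, n, NULL, "", out, sizeof(out)) with an empty out
buffer renders the whole forest. Rescanning the table per level is
quadratic, which is probably acceptable for the number of classes a qdisc
typically has.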

^ permalink raw reply

* Re: [RFC PATCH net-next 0/5] tcp: TCP tracer
From: Arnaldo Carvalho de Melo @ 2014-12-17 19:51 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Martin KaFai Lau, netdev@vger.kernel.org, David S. Miller,
	Hannes Frederic Sowa, Steven Rostedt, Lawrence Brakmo,
	Josef Bacik, Kernel Team
In-Reply-To: <CAADnVQJgUn-hUrR1XLE4J64Ms52zRCRQkJDFUDu9SGTONJ507w@mail.gmail.com>

Em Wed, Dec 17, 2014 at 09:14:02AM -0800, Alexei Starovoitov escreveu:
> On Wed, Dec 17, 2014 at 7:07 AM, Arnaldo Carvalho de Melo
> <arnaldo.melo@gmail.com> wrote:
> > I guess even just using 'perf probe' to set those wannabe tracepoints
> > should be enough, no? Then he can refer to those in his perf record
> > call, etc and process it just like with the real tracepoints.
 
> it's far from ideal for two reasons.
> - they have different kernels and dragging along vmlinux
> with debug info or multiple 'perf list' data is too cumbersome

It is not strictly necessary to carry vmlinux; it is only needed at probe
point resolution time, and that can be handled by generating a shell
script, on the development machine, to insert the probes.

> operationally. Permanent tracepoints solve this problem.

Sure, and when available, use them. My suggestion wasn't to use any one
mechanism exclusively, but to start with what is available to create the
tools, then find places that could be improved (if that proves to be the
case) by using a higher performance mechanism.

> - the action upon hitting tracepoint is non-trivial.
> perf probe style of unconditionally walking pointer chains
> will be tripping over wrong pointers.

Huh? Care to elaborate on this one?

> Plus they already need to do aggregation for high
> frequency events.

> As part of acting on trace_transmit_skb() event:
> if (before(tcb->seq, tcp_sk(sk)->snd_nxt)) {
>   tcp_trace_stats_add(...)
> }
> if (jiffies_to_msecs(jiffies - sktr->last_ts) ..) {
>   tcp_trace_stats_add(...)
> }
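
For readers unfamiliar with the first check in the quoted snippet: it
relies on a wraparound-safe comparison of 32-bit sequence numbers. The
userspace sketch below shows the idea. seq_before() mirrors what the
kernel's before() helper does, while struct retrans_stats and
account_transmit() are hypothetical stand-ins for the aggregation step.
Reading the check as "this segment was sent before" is an assumption
about the quoted code, not something stated in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if seq1 precedes seq2 in 32-bit sequence space. The subtraction
 * wraps modulo 2^32, so the result stays correct across wraparound as
 * long as the two values are less than 2^31 apart. The cast to a signed
 * type relies on two's complement conversion, as mainstream compilers
 * provide.
 */
static bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/* Hypothetical per-connection aggregate. */
struct retrans_stats {
	uint64_t segs;
	uint64_t bytes;
};

/* Count a segment only if it starts before the next new sequence number. */
static void account_transmit(struct retrans_stats *st, uint32_t seq,
			     uint32_t snd_nxt, uint32_t len)
{
	if (seq_before(seq, snd_nxt)) {
		st->segs++;
		st->bytes += len;
	}
}
```

A plain "seq < snd_nxt" would misclassify segments once the sequence
space wraps, which is why the signed-difference form is used.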

But aren't these stats that TCP already keeps, or could be made to keep?

- Arnaldo

^ permalink raw reply

* Re: [PATCH 01/10] core: Split out UFO6 support
From: Ben Hutchings @ 2014-12-17 20:10 UTC (permalink / raw)
  To: Vladislav Yasevich
  Cc: netdev, virtualization, mst, stefanha, Vladislav Yasevich
In-Reply-To: <1418840455-22598-2-git-send-email-vyasevic@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 1588 bytes --]

On Wed, 2014-12-17 at 13:20 -0500, Vladislav Yasevich wrote:
> Split IPv6 support for UFO into its own feature similar to TSO.
> This will later allow us to re-enable UFO support for virtio-net
> devices.
[...]
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 6c8b6f6..8538b67 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -372,6 +372,7 @@ enum {
>  
>  	SKB_GSO_MPLS = 1 << 12,
>  
> +	SKB_GSO_UDP6 = 1 << 13

It seems like it would be cleaner to use the names SKB_GSO_UDPV{4,6},
similarly to SKB_GSO_TCPV{4,6}.

>  };
>  
>  #if BITS_PER_LONG > 32
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 945bbd0..fa4d2ee 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
[...]
> @@ -5952,24 +5958,21 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
[...]
> +	/* UFO also needs checksumming */
> +	if ((features & NETIF_F_UFO) && !(features & NETIF_F_GEN_CSUM) &&
> +					!(features & NETIF_F_IP_CSUM)) {

You can use !(features & NETIF_F_V4_CSUM) instead of the last two terms.

> +		netdev_dbg(dev,
> +			   "Dropping NETIF_F_UFO since no checksum offload features.\n");
> +		features &= ~NETIF_F_UFO;
> +	}
> +	if ((features & NETIF_F_UFO6) && !(features & NETIF_F_GEN_CSUM) &&
> +					 !(features & NETIF_F_IPV6_CSUM)) {
[...]

Similarly you can use !(features & NETIF_F_V6_CSUM) instead of the last
two terms.
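
To spell out why the shorter test is equivalent, here is a small
userspace sketch. The EX_* bit values are stand-ins invented for
illustration. It assumes the combined masks are the union of the generic
checksum bits and the per-family bit, which is how NETIF_F_V4_CSUM and
NETIF_F_V6_CSUM are defined in include/linux/netdev_features.h at the
time of writing.

```c
#include <stdint.h>

/* Stand-in feature bits; the real values differ. */
#define EX_GEN_CSUM	(1u << 0)	/* stands in for NETIF_F_GEN_CSUM */
#define EX_IP_CSUM	(1u << 1)
#define EX_IPV6_CSUM	(1u << 2)
#define EX_UFO		(1u << 3)
#define EX_UFO6		(1u << 4)

/* Assumed to match the kernel's combined masks. */
#define EX_V4_CSUM	(EX_GEN_CSUM | EX_IP_CSUM)
#define EX_V6_CSUM	(EX_GEN_CSUM | EX_IPV6_CSUM)

/* The checks as written in the patch: two separate negated terms. */
static uint32_t fix_ufo_long(uint32_t features)
{
	if ((features & EX_UFO) && !(features & EX_GEN_CSUM) &&
	    !(features & EX_IP_CSUM))
		features &= ~EX_UFO;
	if ((features & EX_UFO6) && !(features & EX_GEN_CSUM) &&
	    !(features & EX_IPV6_CSUM))
		features &= ~EX_UFO6;
	return features;
}

/* The suggested form: one test against the combined mask. */
static uint32_t fix_ufo_short(uint32_t features)
{
	if ((features & EX_UFO) && !(features & EX_V4_CSUM))
		features &= ~EX_UFO;
	if ((features & EX_UFO6) && !(features & EX_V6_CSUM))
		features &= ~EX_UFO6;
	return features;
}
```

Since !(f & A) && !(f & B) is the same as !(f & (A | B)), both functions
return the same result for every combination of the five bits.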

Aside from those minor points, this looks fine.

Ben.

-- 
Ben Hutchings
Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply
