Netdev List
 help / color / mirror / Atom feed
* [PATCH v2 net-next-2.6] ifb: add performance flags
From: Eric Dumazet @ 2011-01-02 20:24 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, xiaosuo, pstaszewski, netdev
In-Reply-To: <20101228230709.GA2088@del.dom.local>

Le mercredi 29 décembre 2010 à 00:07 +0100, Jarek Poplawski a écrit :

> Ingress is before vlans handler so these features and the
> NETIF_F_HW_VLAN_TX flag seem useful for ifb considering
> dev_hard_start_xmit() checks.

OK, here is v2 of the patch then, thanks everybody.


[PATCH v2 net-next-2.6] ifb: add performance flags

IFB can use the full set of features flags (NETIF_F_SG |
NETIF_F_FRAGLIST | NETIF_F_TSO | NETIF_F_NO_CSUM | NETIF_F_HIGHDMA) to
avoid unnecessary split of some packets (GRO for example)

Changli suggested to also set vlan_features,
Jarek suggested to add NETIF_F_HW_VLAN_TX as well.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Changli Gao <xiaosuo@gmail.com>
Cc: Jarek Poplawski <jarkao2@gmail.com>
Cc: Pawel Staszewski <pstaszewski@itcare.pl>
---
 drivers/net/ifb.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index 124dac4..66ca7bf 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -126,6 +126,9 @@ static const struct net_device_ops ifb_netdev_ops = {
 	.ndo_validate_addr = eth_validate_addr,
 };
 
+#define IFB_FEATURES (NETIF_F_NO_CSUM | NETIF_F_SG  | NETIF_F_FRAGLIST | \
+		      NETIF_F_HIGHDMA | NETIF_F_TSO | NETIF_F_HW_VLAN_TX)
+
 static void ifb_setup(struct net_device *dev)
 {
 	/* Initialize the device structure. */
@@ -136,6 +139,9 @@ static void ifb_setup(struct net_device *dev)
 	ether_setup(dev);
 	dev->tx_queue_len = TX_Q_LIMIT;
 
+	dev->features |= IFB_FEATURES;
+	dev->vlan_features |= IFB_FEATURES;
+
 	dev->flags |= IFF_NOARP;
 	dev->flags &= ~IFF_MULTICAST;
 	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;



^ permalink raw reply related

* [RFC] net_sched: mark packet staying on queue too long
From: Eric Dumazet @ 2011-01-02 21:27 UTC (permalink / raw)
  To: hadi, Jarek Poplawski, David Miller, Jesper Dangaard Brouer,
	Patrick McHardy
  Cc: netdev
In-Reply-To: <1293111333.11306.170.camel@mojatatu>

While playing with SFQ and other AQM, I was bothered to see how easy it
was for a single tcp flow to 'fill the pipe' and consume lot of memory
buffers in queues. I know Jesper use more than 50.000 SFQ on his
routers, and with GRO packets this can consume a lot of memory.

I played a bit adding ECN in SFQ, first by marking packets for a
particular flow if this flow qlen was above a given threshold, and later
using another trick : ECN mark packet if it stayed longer than a given
delay in the queue. This of course could be done on other modules, what
do you think ?

The idea is to take into account the time packet stayed in the queue,
regardless of other class parameters.

Following quick and dirty patch to show the idea. Of course, the delay
should be configured on each SFQ/RED/XXXX class, so it would need an
iproute2 patch, and the delay unit should be "ms" (or even "us"), not
ticks, but as I said this is a quick and dirty patch (net-next-2.6
based)

Using jiffies allows only delays above 3 or 4 ticks...

Or maybe ECN is just a dream :(

 include/net/sch_generic.h |    1 +
 net/sched/sch_sfq.c       |   11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 0af57eb..1a7ac68 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -198,6 +198,7 @@ struct tcf_proto {
 };
 
 struct qdisc_skb_cb {
+	unsigned long		timestamp;
 	unsigned int		pkt_len;
 	char			data[];
 };
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index d54ac94..baf9465 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -24,6 +24,8 @@
 #include <net/ip.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
+#include <net/inet_ecn.h>
+#include <linux/moduleparam.h>
 
 
 /*	Stochastic Fairness Queuing algorithm.
@@ -86,6 +88,10 @@
 /* This type should contain at least SFQ_DEPTH + SFQ_SLOTS values */
 typedef unsigned char sfq_index;
 
+static int ecn_delay; /* default : no ECN handling */
+module_param(ecn_delay, int, 0);
+MODULE_PARM_DESC(ecn_delay, "mark packets if they stay in queue longer than ecn_delay ticks");
+
 /*
  * We dont use pointers to save space.
  * Small indexes [0 ... SFQ_SLOTS - 1] are 'pointers' to slots[] array
@@ -391,6 +397,7 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 
 	sch->qstats.backlog += qdisc_pkt_len(skb);
 	slot_queue_add(slot, skb);
+	qdisc_skb_cb(skb)->timestamp = jiffies;
 	sfq_inc(q, x);
 	if (slot->qlen == 1) {		/* The flow is new */
 		if (q->tail == NULL) {	/* It is the first flow */
@@ -460,6 +467,10 @@ next_slot:
 		q->tail->next = next_a;
 	} else {
 		slot->allot -= SFQ_ALLOT_SIZE(qdisc_pkt_len(skb));
+		if (ecn_delay &&
+		    time_after(jiffies, qdisc_skb_cb(skb)->timestamp + ecn_delay) &&
+		    INET_ECN_set_ce(skb))
+			sch->qstats.overlimits++;
 	}
 	return skb;
 }



^ permalink raw reply related

* Re: [PATCH] new UDPCP Communication Protocol
From: Stefani Seibold @ 2011-01-02 21:33 UTC (permalink / raw)
  To: Daniel Baluta; +Cc: linux-kernel, akpm, davem, netdev, eric.dumazet, shemminger
In-Reply-To: <AANLkTi=7M=gSgBjcrCVEj+GKZ5cf5t4MPqm6r9NGr=MZ@mail.gmail.com>

Am Sonntag, den 02.01.2011, 21:48 +0200 schrieb Daniel Baluta:
> Hello,
> 
> I have some style comments, please read below.
> 
> > +struct udpcp_statistics {
> > +       unsigned int txMsgs;            /* Num of transmitted messages */
> > +       unsigned int rxMsgs;            /* Num of received messages */
> > +       unsigned int txNodes;           /* Num of receiver nodes */
> > +       unsigned int rxNodes;           /* Num of transmitter nodes */
> > +       unsigned int txTimeout;         /* Num of unsuccessful transmissions */
> > +       unsigned int rxTimeout;         /* Num of partial message receptions */
> > +       unsigned int txRetries;         /* Num of resends */
> > +       unsigned int rxDiscardedFrags;  /* Num of discarded fragments */
> > +       unsigned int crcErrors;         /* Num of crc errors detected */
> 
> Is there any strong reason to have this camel case naming?
> I would prefer tx_msgs, rx_msgs etc..
> 

This cannot be fixed for compatiblity reasons.

> > +struct udpcp_dest {
> > +       struct list_head list;
> > +       struct sk_buff_head xmit;
> > +       unsigned long tx_time;
> > +       unsigned long rx_time;
> > +       u32 txTimeout;
> > +       u32 rxTimeout;
> 
> Here you have mixed naming conventions. I guess
> tx_timeout will fit in better than txTimeout.
> 
> > +       u32 txRetries;
> > +       u32 rxDiscardedFrags;
> > +       struct sk_buff *xmit_wait;
> > +       struct sk_buff *xmit_last;
> > +       struct sk_buff *recv_msg;
> > +       struct sk_buff *recv_last;
> > +       struct udpcphdr lastmsg;
> > +       struct ipcm_cookie ipc;
> > +       struct flowi fl;
> > +       struct rtable *rt;
> > +       __be32 addr;
> > +       __be16 port;
> > +       u16 msgid;
> > +       u8 use_flag;
> > +       u8 insync;
> > +       u8 ackmode;
> > +       u8 chkmode;
> > +       u8 try;
> > +       u8 acks;
> > +       struct udp_sock udpsock;
> > +       struct sk_buff_head assembly;
> > +       u32 assembly_len;
> > +       struct udpcp_dest *assembly_dest;
> > +       wait_queue_head_t wq;
> > +       struct list_head destlist;
> > +       struct list_head udpcplist;
> > +       struct timer_list timer;
> > +       struct udpcp_statistics stat;
> > +       u32 pending;
> > +       unsigned long tx_timeout;
> > +       unsigned long rx_timeout;
> > +       void (*udp_data_ready) (struct sock *sk, int bytes);
> > +       u8 ackmode;
> > +       u8 chkmode;
> > +       u8 maxtry;
> > +       u8 acks;
> > +       u8 timeout;
> > +/* overall UDPCP statistics */
> > +static atomic_t udpcp_txMsgs;
> > +static atomic_t udpcp_rxMsgs;
> > +static atomic_t udpcp_txNodes;
> > +static atomic_t udpcp_rxNodes;
> > +static atomic_t udpcp_txTimeout;
> > +static atomic_t udpcp_rxTimeout;
> > +static atomic_t udpcp_txRetries;
> > +static atomic_t udpcp_rxDiscardedFrags;
> > +static atomic_t udpcp_crcErrors;
> 
> same here.
> 

I think there is no nameing convention in linux, as i know it is a
developer decision.

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Stefani Seibold @ 2011-01-02 21:37 UTC (permalink / raw)
  To: Rémi Denis-Courmont; +Cc: netdev
In-Reply-To: <201101022216.04863.remi@remlab.net>

Am Sonntag, den 02.01.2011, 22:16 +0200 schrieb Rémi Denis-Courmont:
> Le dimanche 2 janvier 2011 17:31:26 stefani@seibold.net, vous avez écrit :
> > UDPCP is a communication protocol specified by the Open Base Station
> > Architecture Initiative Special Interest Group (OBSAI SIG). The
> > protocol is based on UDP and is designed to meet the needs of "Mobile
> > Communcation Base Station" internal communications. It is widely used by
> > the major networks infrastructure supplier.
> 
> Unless I missed something, you did not declare the new lock classes for the 
> lock consistency checks.
> 

Pardon, i did not realize what you mean, Which new lock class for what
kind of lock consistency checks?




^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Jesper Juhl @ 2011-01-02 21:40 UTC (permalink / raw)
  To: Stefani Seibold
  Cc: Daniel Baluta, linux-kernel, akpm, davem, netdev, eric.dumazet,
	shemminger
In-Reply-To: <1294003993.5675.3.camel@wall-e>

On Sun, 2 Jan 2011, Stefani Seibold wrote:

> Am Sonntag, den 02.01.2011, 21:48 +0200 schrieb Daniel Baluta:
> > Hello,
> > 
> > I have some style comments, please read below.
> > 
> > > +struct udpcp_statistics {
> > > +       unsigned int txMsgs;            /* Num of transmitted messages */
> > > +       unsigned int rxMsgs;            /* Num of received messages */
> > > +       unsigned int txNodes;           /* Num of receiver nodes */
> > > +       unsigned int rxNodes;           /* Num of transmitter nodes */
> > > +       unsigned int txTimeout;         /* Num of unsuccessful transmissions */
> > > +       unsigned int rxTimeout;         /* Num of partial message receptions */
> > > +       unsigned int txRetries;         /* Num of resends */
> > > +       unsigned int rxDiscardedFrags;  /* Num of discarded fragments */
> > > +       unsigned int crcErrors;         /* Num of crc errors detected */
> > 
> > Is there any strong reason to have this camel case naming?
[...]
> > 
> > same here.
> > 
> 
> I think there is no nameing convention in linux, as i know it is a
> developer decision.
> 

Chapter 4 of Documentation/CodingStyle seems to disagree with you:


"                Chapter 4: Naming

C is a Spartan language, and so should your naming be.  Unlike Modula-2
and Pascal programmers, C programmers do not use cute names like
ThisVariableIsATemporaryCounter.  A C programmer would call that
variable "tmp", which is much easier to write, and not the least more
difficult to understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must.  To call a global function "foo" is a
shooting offense.
"

This seems (to me at least) to suggest that CammelCase is frawned upon.


-- 
Jesper Juhl <jj@chaosbits.net>            http://www.chaosbits.net/
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please.

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Stefani Seibold @ 2011-01-02 21:46 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: linux-kernel, akpm, davem, netdev, eric.dumazet, shemminger
In-Reply-To: <alpine.LNX.2.00.1101022024200.11481@swampdragon.chaosbits.net>

Am Sonntag, den 02.01.2011, 20:55 +0100 schrieb Jesper Juhl:
> On Sun, 2 Jan 2011, stefani@seibold.net wrote:


> > +
> > +#define VERSION	"0.71"
> 
> I personally don't think this makes much sense.
> Version numbers for individual modules tend to not get updated as the code 
> changes over the years, which make them rather meaningless.
> Since this module depends on functionallity of the kernel which it is 
> compiled with, the actual (meaningful) version of this code is that of the 
> kernel tree being compiled that includes this code. Which again makes this 
> specific version define meaningless.
> 
> So, why not save a few lines of code and get rid of this rather pointless 
> thing?
> 

I like it, it gives me a better monitoring during development which
version is currently tested.

> [...]
> > +static struct udpcp_dest *find_dest(struct sock *sk, __be32 addr, __be16 port)
> > +{
> > +	struct udpcp_dest *dest;
> > +
> > +	dest = __find_dest(sk, addr, port);
> 
> Why not
> 
> static struct udpcp_dest *find_dest(struct sock *sk, __be32 addr, __be16 port)
> {
>      struct udpcp_dest *dest =  __find_dest(sk, addr, port);
> 
> ?
I will fix it but i think this is counting peas.
 
> 
> 
> [...]
> > + * Release a routing table entry if no packed will be assembled
> 
> Don't you mean "packet" rather than "packed" here?
> 
> 
Right.

> [...]
> > + * Return true it the passed skb socket buffer is the last in the list
> 
> I believe you mean "Return true if the passed ..."
> 
Right.

> 
> [...]
> > +static void udpcp_flush_err(struct sock *sk, struct udpcp_dest *dest)
> > +{
> > +	struct inet_sock *inet = inet_sk(sk);
> > +	struct udpcp_sock *usk = udpcp_sk(sk);
> > +
> > +	if (!inet->recverr)
> > +		skb_queue_purge(&dest->xmit);
> > +	else {
> 
> CodingStyle would want this as
> 
>      if (!inet->recverr) {
>              skb_queue_purge(&dest->xmit);
>      } else {
> 
> If one branch needs {} then both should get them.
> 
./scripts/checkpatch.pl did not complain about this, so i think it is
okay.

> 
> [...]
> > +	if (!dest->xmit_last)
> > +		_udpcp_xmit(sk, dest);
> > +	else {
> > +		skb = dest->xmit_wait;
> 
> Same comment as above.
> There are more occurences of this, I'm not going to point them all out.
> 
> 
> [...]
> > +static inline void udpcp_release_sock(struct sock *sk)
> > +{
> > +	struct udpcp_sock *usk = udpcp_sk(sk);
> > +
> > +	while (usk->timeout)
> > +		udpcp_handle_timeout(sk);
> > +	release_sock(sk);
> > +        check_timeout(sk);
> 
> The line above uses spaces for indentation. It should use one tab.
> 
> 
> [...]
> > +static unsigned int udpcp_tx_queue_len(struct sock *sk, struct udpcp_dest *dest)
> > +{
> > +	struct sk_buff *skb;
> > +	unsigned int n;
> > +
> > +	n = 0;
> 
> Might as well save a few lines and make this
> 
> static unsigned int udpcp_tx_queue_len(struct sock *sk, struct udpcp_dest *dest)
> {
>      struct sk_buff *skb;
>      unsigned int n = 0;
> 
> 
> [...]
> > +static unsigned int udpcp_rx_queue_len(struct sock *sk, struct udpcp_dest *dest)
> > +{
> > +	struct sk_buff *skb;
> > +	unsigned int n;
> > +
> > +	n = 0;
> 
> Here as well
>         unsigned int n = 0;
> 
> 

I fix it in the next release.

Thanks

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Eric Dumazet @ 2011-01-02 21:55 UTC (permalink / raw)
  To: stefani; +Cc: linux-kernel, akpm, davem, netdev, shemminger
In-Reply-To: <1293982286-15389-1-git-send-email-stefani@seibold.net>

Le dimanche 02 janvier 2011 à 16:31 +0100, stefani@seibold.net a écrit :
> From: Stefani Seibold <stefani@seibold.net>
> 
> Changelog:
> 31.12.2010 first proposal
> 01.01.2011 code cleanup and fixes suggest by Eric Dumazet
> 02.01.2011 kick away UDP-Lite support
>            change spin_lock_irq into spin_lock_bh
> 	   faster udpcp_release_sock
> 	   base is now linux-next

Since you depend on zlib_adler32(), you should add a dependence in
Kconfig so that its available

select ZLIB_INFLATE

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Jesper Juhl @ 2011-01-02 22:04 UTC (permalink / raw)
  To: Stefani Seibold
  Cc: linux-kernel, akpm, davem, netdev, eric.dumazet, shemminger
In-Reply-To: <1294004772.5675.11.camel@wall-e>

On Sun, 2 Jan 2011, Stefani Seibold wrote:

> Am Sonntag, den 02.01.2011, 20:55 +0100 schrieb Jesper Juhl:
> > On Sun, 2 Jan 2011, stefani@seibold.net wrote:
> 
> 
> > > +
> > > +#define VERSION	"0.71"
> > 
> > I personally don't think this makes much sense.
> > Version numbers for individual modules tend to not get updated as the code 
> > changes over the years, which make them rather meaningless.
> > Since this module depends on functionallity of the kernel which it is 
> > compiled with, the actual (meaningful) version of this code is that of the 
> > kernel tree being compiled that includes this code. Which again makes this 
> > specific version define meaningless.
> > 
> > So, why not save a few lines of code and get rid of this rather pointless 
> > thing?
> > 
> 
> I like it, it gives me a better monitoring during development which
> version is currently tested.
> 
Does it really?  If your code is merged, then it's probably going to be
changed by various people over the years and not all of them (most) are
not going to notice nor change the version number, nor is the version
number here going to be changed when other parts of the kernel (that you
depend upon) are changed. So when you get a bug report in the future
mentioning VERSION xxx.yyy.zzz of your module it's not going to tell you
anything. What you want to know is the version of the kernel proper (or
git head commit id) - the VERSION defined here is likely going to be next
to useless in 1+ years (or less), so why have it at all?


> > [...]
> > > +static struct udpcp_dest *find_dest(struct sock *sk, __be32 addr, __be16 port)
> > > +{
> > > +	struct udpcp_dest *dest;
> > > +
> > > +	dest = __find_dest(sk, addr, port);
> > 
> > Why not
> > 
> > static struct udpcp_dest *find_dest(struct sock *sk, __be32 addr, __be16 port)
> > {
> >      struct udpcp_dest *dest =  __find_dest(sk, addr, port);
> > 
> > ?
> I will fix it but i think this is counting peas.
>  
Sure, it's a tiny trivial thing. I just took the time to actually read
through your patch and then I commented on everything I spotted.


> > [...]
> > > +static void udpcp_flush_err(struct sock *sk, struct udpcp_dest *dest)
> > > +{
> > > +	struct inet_sock *inet = inet_sk(sk);
> > > +	struct udpcp_sock *usk = udpcp_sk(sk);
> > > +
> > > +	if (!inet->recverr)
> > > +		skb_queue_purge(&dest->xmit);
> > > +	else {
> > 
> > CodingStyle would want this as
> > 
> >      if (!inet->recverr) {
> >              skb_queue_purge(&dest->xmit);
> >      } else {
> > 
> > If one branch needs {} then both should get them.
> > 
> ./scripts/checkpatch.pl did not complain about this, so i think it is
> okay.
> 
scripts/checkpatch.pl is not the final judge on style issues - not by a 
long shot. In any case, if you read Documentation/CodingStyle you'll 
notice this : 

"
Do not unnecessarily use braces where a single statement will do.

if (condition)
        action();

This does not apply if one branch of a conditional statement is a single
statement. Use braces in both branches.

if (condition) {
        do_this();
        do_that();
} else {
        otherwise();
}
"


-- 
Jesper Juhl <jj@chaosbits.net>            http://www.chaosbits.net/
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please.

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Stefani Seibold @ 2011-01-02 22:16 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, akpm, davem, netdev, shemminger
In-Reply-To: <1294005329.2535.258.camel@edumazet-laptop>

Am Sonntag, den 02.01.2011, 22:55 +0100 schrieb Eric Dumazet:
> Le dimanche 02 janvier 2011 à 16:31 +0100, stefani@seibold.net a écrit :
> > From: Stefani Seibold <stefani@seibold.net>
> > 
> > Changelog:
> > 31.12.2010 first proposal
> > 01.01.2011 code cleanup and fixes suggest by Eric Dumazet
> > 02.01.2011 kick away UDP-Lite support
> >            change spin_lock_irq into spin_lock_bh
> > 	   faster udpcp_release_sock
> > 	   base is now linux-next
> 
> Since you depend on zlib_adler32(), you should add a dependence in
> Kconfig so that its available
> 
> select ZLIB_INFLATE
> 

No this is not necessary, since zlib_adler32() is defined as a static
inline function in include/linux/zutil.h

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Stefani Seibold @ 2011-01-02 22:21 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: linux-kernel, akpm, davem, netdev, eric.dumazet, shemminger
In-Reply-To: <alpine.LNX.2.00.1101022300220.11481@swampdragon.chaosbits.net>

Am Sonntag, den 02.01.2011, 23:04 +0100 schrieb Jesper Juhl:
> On Sun, 2 Jan 2011, Stefani Seibold wrote:
> 
> > Am Sonntag, den 02.01.2011, 20:55 +0100 schrieb Jesper Juhl:
> > > On Sun, 2 Jan 2011, stefani@seibold.net wrote:
> > 
> > 
> > > > +
> > > > +#define VERSION	"0.71"
> > > 
> > > I personally don't think this makes much sense.
> > > Version numbers for individual modules tend to not get updated as the code 
> > > changes over the years, which make them rather meaningless.
> > > Since this module depends on functionallity of the kernel which it is 
> > > compiled with, the actual (meaningful) version of this code is that of the 
> > > kernel tree being compiled that includes this code. Which again makes this 
> > > specific version define meaningless.
> > > 
> > > So, why not save a few lines of code and get rid of this rather pointless 
> > > thing?
> > > 
> > 
> > I like it, it gives me a better monitoring during development which
> > version is currently tested.
> > 
> Does it really?  If your code is merged, then it's probably going to be
> changed by various people over the years and not all of them (most) are
> not going to notice nor change the version number, nor is the version
> number here going to be changed when other parts of the kernel (that you
> depend upon) are changed. So when you get a bug report in the future
> mentioning VERSION xxx.yyy.zzz of your module it's not going to tell you
> anything. What you want to know is the version of the kernel proper (or
> git head commit id) - the VERSION defined here is likely going to be next
> to useless in 1+ years (or less), so why have it at all?
> 

I said currently, so i agree but not yet. Okay?

> 
> > > [...]
> > > > +static struct udpcp_dest *find_dest(struct sock *sk, __be32 addr, __be16 port)
> > > > +{
> > > > +	struct udpcp_dest *dest;
> > > > +
> > > > +	dest = __find_dest(sk, addr, port);
> > > 
> > > Why not
> > > 
> > > static struct udpcp_dest *find_dest(struct sock *sk, __be32 addr, __be16 port)
> > > {
> > >      struct udpcp_dest *dest =  __find_dest(sk, addr, port);
> > > 
> > > ?
> > I will fix it but i think this is counting peas.
> >  
> Sure, it's a tiny trivial thing. I just took the time to actually read
> through your patch and then I commented on everything I spotted.
> 
> 
> > > [...]
> > > > +static void udpcp_flush_err(struct sock *sk, struct udpcp_dest *dest)
> > > > +{
> > > > +	struct inet_sock *inet = inet_sk(sk);
> > > > +	struct udpcp_sock *usk = udpcp_sk(sk);
> > > > +
> > > > +	if (!inet->recverr)
> > > > +		skb_queue_purge(&dest->xmit);
> > > > +	else {
> > > 
> > > CodingStyle would want this as
> > > 
> > >      if (!inet->recverr) {
> > >              skb_queue_purge(&dest->xmit);
> > >      } else {
> > > 
> > > If one branch needs {} then both should get them.
> > > 
> > ./scripts/checkpatch.pl did not complain about this, so i think it is
> > okay.
> > 
> scripts/checkpatch.pl is not the final judge on style issues - not by a 
> long shot. In any case, if you read Documentation/CodingStyle you'll 
> notice this : 
> 
> "
> Do not unnecessarily use braces where a single statement will do.
> 
> if (condition)
>         action();
> 
> This does not apply if one branch of a conditional statement is a single
> statement. Use braces in both branches.
> 
> if (condition) {
>         do_this();
>         do_that();
> } else {
>         otherwise();
> }
> "
> 
I will fix it but i think this is coding style from hell :-)

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Eric Dumazet @ 2011-01-02 22:31 UTC (permalink / raw)
  To: Stefani Seibold; +Cc: linux-kernel, akpm, davem, netdev, shemminger
In-Reply-To: <1294006616.5675.13.camel@wall-e>

Le dimanche 02 janvier 2011 à 23:16 +0100, Stefani Seibold a écrit :
> Am Sonntag, den 02.01.2011, 22:55 +0100 schrieb Eric Dumazet:
> > Le dimanche 02 janvier 2011 à 16:31 +0100, stefani@seibold.net a écrit :
> > > From: Stefani Seibold <stefani@seibold.net>
> > > 
> > > Changelog:
> > > 31.12.2010 first proposal
> > > 01.01.2011 code cleanup and fixes suggest by Eric Dumazet
> > > 02.01.2011 kick away UDP-Lite support
> > >            change spin_lock_irq into spin_lock_bh
> > > 	   faster udpcp_release_sock
> > > 	   base is now linux-next
> > 
> > Since you depend on zlib_adler32(), you should add a dependence in
> > Kconfig so that its available
> > 
> > select ZLIB_INFLATE
> > 
> 
> No this is not necessary, since zlib_adler32() is defined as a static
> inline function in include/linux/zutil.h
> 
> 

Wow, this huge function is inlined, how shocking :)

^ permalink raw reply

* [PATCH] new UDPCP Communication Protocol
From: stefani @ 2011-01-02 22:39 UTC (permalink / raw)
  To: linux-kernel, akpm, davem, netdev, eric.dumazet, shemminger, jj,
	daniel.baluta
  Cc: stefani

From: Stefani Seibold <stefani@seibold.net>

Changelog:
31.12.2010 first proposal
01.01.2011 code cleanup and fixes suggest by Eric Dumazet
02.01.2011 kick away UDP-Lite support
           change spin_lock_irq into spin_lock_bh
	   faster udpcp_release_sock
	   base is now linux-next
02.01.2011 fix camel style
           fix coding style
	   fix types in comments
	   add per socket max. connection limit (pevents against abuse)
	   make udpcp adjustable through /proc/sys/net/ipv4/udpcp_

UDPCP is a communication protocol specified by the Open Base Station
Architecture Initiative Special Interest Group (OBSAI SIG). The
protocol is based on UDP and is designed to meet the needs of "Mobile
Communcation Base Station" internal communications. It is widely used by
the major networks infrastructure supplier.

The UDPCP communication service supports the following features:

-Connectionless communication for serial mode data transfer
-Acknowledged and unacknowledged transfer modes
-Retransmissions Algorithm
-Checksum Algorithm using Adler32
-Fragmentation of long messages (disassembly/reassembly) to match to the MTU
 during transport:
-Broadcasting and multicasting messages to multiple peers in unacknowledged
  transfer mode

UDPCP supports application level messages up to 64 KBytes (limited by 16-bit
packet data length field). Messages that are longer than the MTU will be
fragmented to the MTU.

UDPCP provides a reliable transport service that will perform message
retransmissions in case transport failures occur.

The code is also a nice example how to implement a UDP based protocol as
a kernel socket modules.

Due the nature of UDPCP which has no sliding windows support, the latency has
a huge impact. The perfomance increase by implementing as a kernel module is
about the factor 10, because there are no context switches and data packets or
ACKs will be handled in the interrupt service.

There are no side effects to the network subsystems so i ask for merge it
into linux-next. Hope you like it.

The patch is against linux next-20101231

- Stefani

Signed-off-by: Stefani Seibold <stefani@seibold.net>
---
 include/linux/socket.h |    9 +-
 include/net/udp.h      |    1 +
 include/net/udpcp.h    |   47 +
 net/Kconfig            |    1 +
 net/Makefile           |    1 +
 net/ipv4/ip_output.c   |    2 +
 net/ipv4/ip_sockglue.c |    2 +
 net/udpcp/Kconfig      |   34 +
 net/udpcp/Makefile     |    5 +
 net/udpcp/udpcp.c      | 2889 ++++++++++++++++++++++++++++++++++++++++++++++++
 10 files changed, 2988 insertions(+), 3 deletions(-)
 create mode 100644 include/net/udpcp.h
 create mode 100644 net/udpcp/Kconfig
 create mode 100644 net/udpcp/Makefile
 create mode 100644 net/udpcp/udpcp.c

diff --git a/include/linux/socket.h b/include/linux/socket.h
index 2dccbeb..2e9157c 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -171,7 +171,7 @@ struct ucred {
 #define AF_DECnet	12	/* Reserved for DECnet project	*/
 #define AF_NETBEUI	13	/* Reserved for 802.2LLC project*/
 #define AF_SECURITY	14	/* Security callback pseudo AF */
-#define AF_KEY		15      /* PF_KEY key management API */
+#define AF_KEY		15	/* PF_KEY key management API */
 #define AF_NETLINK	16
 #define AF_ROUTE	AF_NETLINK /* Alias to emulate 4.4BSD */
 #define AF_PACKET	17	/* Packet family		*/
@@ -194,7 +194,8 @@ struct ucred {
 #define AF_IEEE802154	36	/* IEEE802154 sockets		*/
 #define AF_CAIF		37	/* CAIF sockets			*/
 #define AF_ALG		38	/* Algorithm sockets		*/
-#define AF_MAX		39	/* For now.. */
+#define AF_UDPCP	39	/* UDPCP sockets		*/
+#define AF_MAX		40	/* For now.. */
 
 /* Protocol families, same as address families. */
 #define PF_UNSPEC	AF_UNSPEC
@@ -204,7 +205,7 @@ struct ucred {
 #define PF_AX25		AF_AX25
 #define PF_IPX		AF_IPX
 #define PF_APPLETALK	AF_APPLETALK
-#define	PF_NETROM	AF_NETROM
+#define PF_NETROM	AF_NETROM
 #define PF_BRIDGE	AF_BRIDGE
 #define PF_ATMPVC	AF_ATMPVC
 #define PF_X25		AF_X25
@@ -236,6 +237,7 @@ struct ucred {
 #define PF_IEEE802154	AF_IEEE802154
 #define PF_CAIF		AF_CAIF
 #define PF_ALG		AF_ALG
+#define PF_UDPCP	AF_UDPCP
 #define PF_MAX		AF_MAX
 
 /* Maximum queue length specifiable by listen.  */
@@ -310,6 +312,7 @@ struct ucred {
 #define SOL_IUCV	277
 #define SOL_CAIF	278
 #define SOL_ALG		279
+#define SOL_UDPCP	280
 
 /* IPX options */
 #define IPX_TYPE	1
diff --git a/include/net/udp.h b/include/net/udp.h
index bb967dd..82c95a7 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -47,6 +47,7 @@ struct udp_skb_cb {
 	} header;
 	__u16		cscov;
 	__u8		partial_cov;
+	__u8            udpcp_flag;
 };
 #define UDP_SKB_CB(__skb)	((struct udp_skb_cb *)((__skb)->cb))
 
diff --git a/include/net/udpcp.h b/include/net/udpcp.h
new file mode 100644
index 0000000..0745b15
--- /dev/null
+++ b/include/net/udpcp.h
@@ -0,0 +1,47 @@
+/* Definitions for UDPCP sockets. */
+
+#ifndef __LINUX_IF_UDPCP
+#define __LINUX_IF_UDPCP
+
+#include "linux/ioctl.h"
+
+#define UDPCP_MAX_MSGSIZE	65487
+
+#define UDPCP_MAX_WAIT_SEC	60
+
+#define UDPCP_OPT_TRANSFER_MODE		0
+#define UDPCP_OPT_CHECKSUM_MODE		1
+#define UDPCP_OPT_TX_TIMEOUT		2
+#define UDPCP_OPT_RX_TIMEOUT		3
+#define UDPCP_OPT_MAXTRY		4
+#define UDPCP_OPT_OUTSTANDING_ACKS	5
+
+#define UDPCP_NOACK		0
+#define UDPCP_ACK		1
+#define UDPCP_SINGLE_ACK	2
+#define UDPCP_NOCHECKSUM	3
+#define UDPCP_CHECKSUM		4
+
+#define UDPCP_IOC_MAGIC  251
+
+#define UDPCP_IOCTL_GET_STATISTICS \
+	_IOR(UDPCP_IOC_MAGIC, 0x01, struct udpcp_statistics *)
+#define UDPCP_IOCTL_RESET_STATISTICS \
+	_IO(UDPCP_IOC_MAGIC, 0x02)
+#define UDPCP_IOCTL_SYNC \
+	_IOR(UDPCP_IOC_MAGIC, 0x03, unsigned long)
+
+struct udpcp_statistics {
+	unsigned int txMsgs;		/* Num of transmitted messages */
+	unsigned int rxMsgs;		/* Num of received messages */
+	unsigned int txNodes;		/* Num of transmitter nodes */
+	unsigned int rxNodes;		/* Num of receiver nodes */
+	unsigned int txTimeout;		/* Num of unsuccessful transmissions */
+	unsigned int rxTimeout;		/* Num of partial message receptions */
+	unsigned int txRetries;		/* Num of resends */
+	unsigned int rxDiscardedFrags;	/* Num of discarded fragments */
+	unsigned int crcErrors;		/* Num of crc errors detected */
+};
+
+#endif
+
diff --git a/net/Kconfig b/net/Kconfig
index 7284062..4b3b619 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -302,6 +302,7 @@ source "net/rfkill/Kconfig"
 source "net/9p/Kconfig"
 source "net/caif/Kconfig"
 source "net/ceph/Kconfig"
+source "net/udpcp/Kconfig"
 
 
 endif   # if NET
diff --git a/net/Makefile b/net/Makefile
index a3330eb..388a582 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -70,3 +70,4 @@ obj-$(CONFIG_WIMAX)		+= wimax/
 obj-$(CONFIG_DNS_RESOLVER)	+= dns_resolver/
 obj-$(CONFIG_CEPH_LIB)		+= ceph/
 obj-$(CONFIG_BATMAN_ADV)	+= batman-adv/
+obj-$(CONFIG_UDPCP)		+= udpcp/
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 04c7b3b..41f9276 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1084,6 +1084,7 @@ error:
 	IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTDISCARDS);
 	return err;
 }
+EXPORT_SYMBOL(ip_append_data);
 
 ssize_t	ip_append_page(struct sock *sk, struct page *page,
 		       int offset, size_t size, int flags)
@@ -1340,6 +1341,7 @@ error:
 	IP_INC_STATS(net, IPSTATS_MIB_OUTDISCARDS);
 	goto out;
 }
+EXPORT_SYMBOL(ip_push_pending_frames);
 
 /*
  *	Throw away all pending data on the socket.
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 3948c86..310369c 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -226,6 +226,7 @@ int ip_cmsg_send(struct net *net, struct msghdr *msg, struct ipcm_cookie *ipc)
 	}
 	return 0;
 }
+EXPORT_SYMBOL(ip_cmsg_send);
 
 
 /* Special input handler for packets caught by router alert option.
@@ -369,6 +370,7 @@ void ip_local_error(struct sock *sk, int err, __be32 daddr, __be16 port, u32 inf
 	if (sock_queue_err_skb(sk, skb))
 		kfree_skb(skb);
 }
+EXPORT_SYMBOL(ip_local_error);
 
 /*
  *	Handle MSG_ERRQUEUE
diff --git a/net/udpcp/Kconfig b/net/udpcp/Kconfig
new file mode 100644
index 0000000..a58c1b0
--- /dev/null
+++ b/net/udpcp/Kconfig
@@ -0,0 +1,34 @@
+#
+# UDPCP protocol
+#
+
+config UDPCP
+	tristate "UDPCP Communication Protocol"
+	depends on INET
+	---help---
+	  UDPCP is a communication protocol specified by the Open Base Station
+	  Architecture Initiative Special Interest Group (OBSAI SIG). The
+	  protocol is based on UDP and is designed to meet the needs of "Mobile
+	  Communcation Base Station" internal communications.
+
+	  The UDPCP communication service supports the following features:
+
+          -Connectionless communication for serial mode data transfer
+          -Acknowledged and unacknowledged transfer modes
+          -Retransmissions Algorithm
+          -Checksum Algorithm using Adler32
+          -Fragmentation of long messages (disassembly/reassembly) to
+           match to the MTU during transport:
+          -Broadcasting and multicasting messages to multiple peers in
+           unacknowledged transfer mode
+
+          UDPCP supports application level messages up to 64 KBytes (limited
+          by 16-bit packet data length field). Messages that are longer than the
+          MTU will be fragmented to the MTU.
+
+          UDPCP provides a reliable transport service that will perform message
+          retransmissions in case transport failures occur.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called udpcp.
+
diff --git a/net/udpcp/Makefile b/net/udpcp/Makefile
new file mode 100644
index 0000000..37f87c5
--- /dev/null
+++ b/net/udpcp/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for UDPCP support code.
+#
+
+obj-$(CONFIG_UDPCP) += udpcp.o
diff --git a/net/udpcp/udpcp.c b/net/udpcp/udpcp.c
new file mode 100644
index 0000000..287500b
--- /dev/null
+++ b/net/udpcp/udpcp.c
@@ -0,0 +1,2889 @@
+/*
+ * UDPCP communication protocol
+ *
+ * Copyright (C) 2010 Stefani Seibold <stefani@seibold.net>
+ * in order of NSN Ulm/Germany
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ *
+ */
+
+#include <net/xfrm.h>
+#include <net/protocol.h>
+#include <net/ip.h>
+#include <net/udp.h>
+#include <net/inet_common.h>
+#include <linux/zutil.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/spinlock.h>
+#include <linux/errqueue.h>
+#include <linux/atomic.h>
+
+#include <net/udpcp.h>
+
+#define VERSION	"0.73"
+
+/*
+ * UDPCP Protocol default parameters
+ */
+#define UDPCP_TX_TIMEOUT	100	/* milliseconds */
+#define UDPCP_RX_TIMEOUT	1000	/* milliseconds */
+#define UDPCP_TX_MAXTRY		5
+#define UDPCP_OUTSTANDING_ACKS	1
+
+/*
+ * UDPCP Protocol definitions
+ */
+#define UDPCP_MSG_TYPE_BIT		14
+#define UDPCP_PROTOCOL_VERSION_BIT	11
+#define UDPCP_NO_ACK_BIT		10
+#define UDPCP_CHECKSUM_BIT		9
+#define UDPCP_SINGLE_ACK_BIT		8
+#define UDPCP_DUPLICATE_BIT		7
+
+#define UDPCP_MSG_TYPE_MASK		(3 << UDPCP_MSG_TYPE_BIT)
+#define UDPCP_PROTOCOL_MASK		(7 << UDPCP_PROTOCOL_VERSION_BIT)
+
+#define UDPCP_MSG_TYPE_DATA		(1 << UDPCP_MSG_TYPE_BIT)
+#define UDPCP_MSG_TYPE_ACK		(2 << UDPCP_MSG_TYPE_BIT)
+#define UDPCP_PROTOCOL_VERSION_2	(2 << UDPCP_PROTOCOL_VERSION_BIT)
+
+#define UDPCP_NO_ACK_FLAG		(1 << UDPCP_NO_ACK_BIT)
+#define UDPCP_CHECKSUM_FLAG		(1 << UDPCP_CHECKSUM_BIT)
+#define UDPCP_SINGLE_ACK_FLAG		(1 << UDPCP_SINGLE_ACK_BIT)
+#define UDPCP_DUPLICATE_FLAG		(1 << UDPCP_DUPLICATE_BIT)
+
+/*
+ * helper macros
+ */
+#define list_to_udpcpdest(d) container_of(d, struct udpcp_dest, list)
+#define list_to_udpcpsock(d) container_of(d, struct udpcp_sock, udpcplist)
+
+#define UDPCP_HDRSIZE	(sizeof(struct udpcphdr)-sizeof(struct udphdr))
+
+#define RX_NODE	1
+#define TX_NODE	2
+
+/*
+ * name of the /proc entry
+ */
+#define UDPCP_PROC	"driver/udpcp"
+
+/*
+ * UDPCP message header
+ */
+struct udpcphdr {
+	struct udphdr udphdr;
+	__be32 chksum;
+	__be16 msginfo;
+	u8 fragamount;
+	u8 fragnum;
+	__be16 msgid;
+	__be16 length;
+};
+
+/*
+ * UDPCP destination descriptor
+ *
+ * For each communication address an individual destination descriptor will
+ * be create.
+ *
+ * The fields has the following meanings:
+ *
+ * list:		link list: part of udpcp_sock.destlist
+ * xmit:		messages fragments to be transmit
+ * tx_time:		timestamp of the last transmitted message fragment
+ * rx_time:		timestamp ot the last received message fragment
+ * tx_timeout:		statistic use only: number of transmit timeout
+ * rx_timeout:		statistic use only: number of receive timeout
+ * tx_retries:		statistic use only: number of transmit retries
+ * rx_discarded_frags:	statistic use only: number of discarded messages
+ * xmit_wait:		message fragment which is waiting for an ACK
+ * xmit_last:		last fragment transmitted
+ * recv_msg:		first fragment of the received message
+ * recv_last:		last fragment of the received message
+ * lastmsg:		last messages fragment header received
+ * ipc:			linux internal ipc cookie
+ * fl:			flow/routing information
+ * rt:			routing entry currently used for this destination
+ * addr:		ipv4 destination address
+ * port:		destination port number
+ * msgid:		current message id for outgoing data messages
+ * use_flag:		statistic use only: flag for dest using TX and/or RX
+ * insync:		flag for protocol synchronization
+ * ackmode;		ack mode for the current assembled message
+ * chkmode;		checksum mode for the current assembled message
+ * try:			current number of retries xmit_wait message
+ * acks:		number of outstandig ack's
+ */
+struct udpcp_dest {
+	struct list_head list;
+	struct sk_buff_head xmit;
+	unsigned long tx_time;
+	unsigned long rx_time;
+	u32 tx_timeout;
+	u32 rx_timeout;
+	u32 tx_retries;
+	u32 rx_discarded_frags;
+	struct sk_buff *xmit_wait;
+	struct sk_buff *xmit_last;
+	struct sk_buff *recv_msg;
+	struct sk_buff *recv_last;
+	struct udpcphdr lastmsg;
+	struct ipcm_cookie ipc;
+	struct flowi fl;
+	struct rtable *rt;
+	__be32 addr;
+	__be16 port;
+	u16 msgid;
+	u8 use_flag;
+	u8 insync;
+	u8 ackmode;
+	u8 chkmode;
+	u8 try;
+	u8 acks;
+};
+
+/*
+ * UDPCP socket descriptor
+ *
+ * For each opened socket individual socket descriptor will
+ * be created
+ *
+ * The fields has the following meanings:
+ *
+ * udpsock:		UDP socket has to be the first member of udpcp_sock
+ * assembly:		messages fragments currently assembled
+ * assembly_len:	current length of the assembled message
+ * assembly_dest:	current destination assembled
+ * wq:			wait queue for UDPCP_IOCTL_SYNC
+ * destlist:		head of destination descriptors link list
+ * udpcplist:		link list: part of udpcp_list
+ * timer:		timeout handler
+ * stat:		statistics for this socket
+ * pending:		number of pending messages fragment in the queues
+ * tx_timeout:		transmit timeout in jiffies
+ * rx_timeout:		receive timeout in jiffies
+ * udp_data_ready:	original data_ready handler for this socket
+ * ackmode:		default ack mode
+ * chkmode:		default checksum mode
+ * maxtry:		max. number of resends
+ * acks:		max. number of outstandig ack's
+ * timeout:		flag for unhandled timeout
+ */
+struct udpcp_sock {
+	struct udp_sock udpsock;
+	struct sk_buff_head assembly;
+	u32 assembly_len;
+	struct udpcp_dest *assembly_dest;
+	wait_queue_head_t wq;
+	struct list_head destlist;
+	struct list_head udpcplist;
+	struct timer_list timer;
+	struct udpcp_statistics stat;
+	u32 pending;
+	unsigned long tx_timeout;
+	unsigned long rx_timeout;
+	u32 connections;
+	void (*udp_data_ready) (struct sock *sk, int bytes);
+	u8 ackmode;
+	u8 chkmode;
+	u8 maxtry;
+	u8 acks;
+	u8 timeout;
+};
+
+/* head of struct udpcp_sock.udpcplist link list */
+static struct list_head udpcp_list;
+
+/* spinlock for race free access to the static variables */
+static spinlock_t udpcp_lock;
+
+/* debug flag, set != 0 to enable debug */
+static int udpcp_max_connections = 64;
+
+/* /proc/sys/net/ipv4/udpcp_* table */
+static struct ctl_table_header *udpcp_ctl_table;
+
+/* debug flag, set != 0 to enable debug */
+static int debug;
+
+/* overall UDPCP statistics */
+static atomic_t udpcp_tx_msgs;
+static atomic_t udpcp_rx_msgs;
+static atomic_t udpcp_tx_nodes;
+static atomic_t udpcp_rx_nodes;
+static atomic_t udpcp_tx_timeout;
+static atomic_t udpcp_rx_timeout;
+static atomic_t udpcp_tx_retries;
+static atomic_t udpcp_rx_discarded_frags;
+static atomic_t udpcp_crc_errors;
+
+module_param(debug, int, 0);
+MODULE_PARM_DESC(debug, "Debug enabled or not");
+
+module_param(udpcp_max_connections, int, 0);
+MODULE_PARM_DESC(udpcp_max_connections, "maximum connections per sockets");
+
+static int zero;
+
+static struct ctl_table ipv4_udpcp_table[] = {
+	{
+		.procname	= "udpcp_max_connections",
+		.data		= &udpcp_max_connections,
+		.maxlen		= sizeof(udpcp_max_connections),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero
+	},
+	{
+		.procname	= "udpcp_debug",
+		.data		= &debug,
+		.maxlen		= sizeof(debug),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero
+	},
+	{ }
+};
+
+#ifdef CONFIG_PROC_FS
+/*
+ * Handle /proc/driver/udpcp
+ *
+ * Show the statistics information
+ */
+static int udpcp_proc(char *page, char **start, off_t off, int count, int *eof,
+		      void *data)
+{
+	int len;
+
+	len = snprintf(page, count,
+		       "txMsgs:          %u\n"
+		       "rxMsgs:          %u\n"
+		       "txNodes:         %u\n"
+		       "rxNodes:         %u\n"
+		       "txTimeout:       %u\n"
+		       "rxTimeout:       %u\n"
+		       "txRetries:       %u\n"
+		       "rxDiscaredFrags: %u\n"
+		       "crcErrors:       %u\n",
+			atomic_read(&udpcp_tx_msgs),
+			atomic_read(&udpcp_rx_msgs),
+			atomic_read(&udpcp_tx_nodes),
+			atomic_read(&udpcp_rx_nodes),
+			atomic_read(&udpcp_tx_timeout),
+			atomic_read(&udpcp_rx_timeout),
+			atomic_read(&udpcp_tx_retries),
+			atomic_read(&udpcp_rx_discarded_frags),
+			atomic_read(&udpcp_crc_errors)
+		);
+
+	if (len <= off)
+		return 0;
+
+	len -= off;
+
+	if (len > count)
+		return count;
+
+	return len;
+}
+#endif
+
+/*
+ * Helper for the UDPCP header from a socket buffer
+ */
+static inline struct udpcphdr *udpcp_hdr(const struct sk_buff *skb)
+{
+	return (struct udpcphdr *)skb_transport_header(skb);
+}
+
+/*
+ * Helper for conversion a basic socket into a UDPCP socket
+ */
+static inline struct udpcp_sock *udpcp_sk(const struct sock *sk)
+{
+	return (struct udpcp_sock *)sk;
+}
+
+/*
+ * Dump the transport data of a socket buffer
+ */
+static inline void dump_data(struct sk_buff *skb, unsigned int max)
+{
+	unsigned int i;
+	unsigned char *data;
+	int data_len;
+
+	data = skb_transport_header(skb) + sizeof(struct udpcphdr);
+	data_len = skb_tail_pointer(skb) - data;
+
+	pr_debug(" data: ");
+
+	if (!data_len) {
+		pr_cont("<none>\n");
+		return;
+	}
+
+	if (max > data_len)
+		max = data_len;
+
+	for (i = 0; i < max; i++)
+		pr_cont("%02x ", data[i]);
+
+	if (data_len > max)
+		pr_cont("...");
+	pr_cont("\n");
+}
+
+/*
+ * Dump and decode a msginfo value
+ */
+static inline void dump_msginfo(u16 msginfo)
+{
+	pr_debug(" msginfo:0x%04x (", msginfo);
+
+	pr_cont("PCKT:");
+	switch (msginfo & UDPCP_MSG_TYPE_MASK) {
+	case UDPCP_MSG_TYPE_DATA:
+		pr_cont("DATA");
+		break;
+	case UDPCP_MSG_TYPE_ACK:
+		pr_cont("ACK");
+		break;
+	default:
+		pr_cont("UNKNOWN");
+		break;
+	}
+	pr_cont(" VER:%d",
+	       (msginfo & UDPCP_PROTOCOL_MASK) >> UDPCP_PROTOCOL_VERSION_BIT);
+
+	if (msginfo & UDPCP_NO_ACK_FLAG)
+		pr_cont(" NO_ACK");
+	if (msginfo & UDPCP_CHECKSUM_FLAG)
+		pr_cont(" CHECKSUM");
+	if (msginfo & UDPCP_SINGLE_ACK_FLAG)
+		pr_cont(" SINGLE_ACK");
+	if (msginfo & UDPCP_DUPLICATE_FLAG)
+		pr_cont(" DUPLICATE");
+	pr_cont(")\n");
+}
+
+/*
+ * Dump and decode a UDPCP message fragment
+ */
+static void dump_msg(const char *action, struct sk_buff *skb, __be32 saddr,
+		     __be32 daddr)
+{
+	struct udpcphdr *uh = udpcp_hdr(skb);
+
+	pr_debug("udpcp: %s (%lu)\n", action, jiffies);
+
+	pr_debug(" src:0x%08x:%d dst:0x%08x:%d fraglen:%d\n",
+	       saddr, uh->udphdr.source, daddr, uh->udphdr.dest, skb->len);
+
+	pr_debug(" fragamount:%u fragnum:%u msgid:%u%s"
+		 " length:%u checksum:0x%08x\n",
+	       uh->fragamount, uh->fragnum, ntohs(uh->msgid),
+	       (!uh->msgid) ? "(Sync)" : "", ntohs(uh->length),
+	       ntohl(uh->chksum)
+	    );
+
+	dump_msginfo(ntohs(uh->msginfo));
+	dump_data(skb, 16);
+}
+
+/*
+ * Create a new destination descriptor for the given IPV4 address and port
+ */
+static struct udpcp_dest *new_dest(struct sock *sk, __be32 addr, __be16 port)
+{
+	struct udpcp_dest *dest;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	if (usk->connections >= udpcp_max_connections)
+		return NULL;
+
+	dest = kzalloc(sizeof(*dest), sk->sk_allocation);
+
+	if (dest) {
+		usk->connections++;
+		skb_queue_head_init(&dest->xmit);
+		dest->addr = addr;
+		dest->port = port;
+		dest->ackmode = UDPCP_ACK;
+		list_add_tail(&dest->list, &usk->destlist);
+	}
+
+	return dest;
+}
+
+/*
+ * Lookup for a destination descriptor for the given IPV4 address and port
+ */
+static struct udpcp_dest *__find_dest(struct sock *sk, __be32 addr, __be16 port)
+{
+	struct udpcp_dest *dest;
+	struct list_head *p;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	list_for_each(p, &usk->destlist) {
+		dest = list_to_udpcpdest(p);
+
+		if ((dest->addr == addr) && (dest->port == port))
+			return dest;
+	}
+	return NULL;
+}
+
+/*
+ * Lookup for a destination descriptor and create a new one if no
+ * descriptor was found.
+ */
+static struct udpcp_dest *find_dest(struct sock *sk, __be32 addr, __be16 port)
+{
+	struct udpcp_dest *dest = __find_dest(sk, addr, port);
+
+	if (!dest)
+		dest = new_dest(sk, addr, port);
+
+	return dest;
+}
+
+/*
+ * Calculate udp checksum, mostly stolen from udp stack
+ */
+static void udpcp_do_csum(struct sock *sk, struct sk_buff *skb,
+			  struct udpcp_dest *dest)
+{
+	struct flowi *fl = &dest->fl;
+	struct udphdr *uh = udp_hdr(skb);
+	__wsum csum = 0;
+	unsigned short len = ntohs(uh->len);
+
+	if (sk->sk_no_check == UDP_CSUM_NOXMIT) {
+		skb->ip_summed = CHECKSUM_NONE;
+		return;
+	}
+	if (skb->ip_summed == CHECKSUM_PARTIAL) {
+		/* UDP hardware csum */
+		skb->csum_start = skb_transport_header(skb) - skb->head;
+		skb->csum_offset = offsetof(struct udphdr, check);
+		uh->check =
+		    ~csum_tcpudp_magic(fl->fl4_src, fl->fl4_dst, len,
+				       sk->sk_protocol, 0);
+		return;
+	}
+	csum = csum_partial(uh, sizeof(struct udpcphdr), 0);
+	csum = csum_add(csum, skb->csum);
+
+	/* add protocol-dependent pseudo-header */
+	uh->check =
+	    csum_tcpudp_magic(fl->fl4_src, fl->fl4_dst, len, sk->sk_protocol,
+			      csum);
+	if (uh->check == 0)
+		uh->check = CSUM_MANGLED_0;
+}
+
+/*
+ * Fetch data from kernel space and fill in checksum if needed.
+ */
+static int ip_reply_glue_bits(void *dptr, char *to, int offset,
+			      int len, int odd, struct sk_buff *skb)
+{
+	__wsum csum;
+
+	csum = csum_partial_copy_nocheck(dptr+offset, to, len, 0);
+	skb->csum = csum_block_add(skb->csum, csum, odd);
+	return 0;
+}
+
+/*
+ * Send an ack for a received data message fragment
+ *
+ * If the argument duplicate is true a ACK with UDPCP_DUPLICATE_FLAG set will
+ * be send
+ */
+static void udpcp_send_ack(struct sock *sk, struct sk_buff *skb,
+			   struct udpcp_dest *dest, int duplicate)
+{
+	struct inet_sock *inet = inet_sk(sk);
+	struct udpcphdr *uh = udpcp_hdr(skb);
+	struct rtable *rt = NULL;
+	__wsum csum;
+	struct ipcm_cookie ipc;
+	struct udpcphdr rep;
+
+	memset(&rep, 0, sizeof(rep));
+
+	/* Swap the send and the receive ports. */
+	rep.udphdr.source = uh->udphdr.dest;
+	rep.udphdr.dest = uh->udphdr.source;
+	rep.udphdr.len = htons(sizeof(struct udpcphdr));
+
+	rep.msginfo = htons(UDPCP_MSG_TYPE_ACK |
+			    UDPCP_NO_ACK_FLAG |
+			    UDPCP_SINGLE_ACK_FLAG | UDPCP_PROTOCOL_VERSION_2);
+	if (duplicate)
+		rep.msginfo |= htons(UDPCP_DUPLICATE_FLAG);
+	else
+		memcpy(&dest->lastmsg, uh, sizeof(dest->lastmsg));
+	rep.msgid = uh->msgid;
+	rep.fragamount = uh->fragamount;
+	rep.fragnum = uh->fragnum;
+	rep.length = 0;
+	rep.chksum = 0;
+	if (ntohs(uh->msginfo) & UDPCP_CHECKSUM_FLAG) {
+		u8 *data;
+		u32 data_len;
+
+		data = (u8 *) &rep + sizeof(struct udphdr);
+		data_len = sizeof(struct udpcphdr)-sizeof(struct udphdr);
+
+		rep.msginfo |= htons(UDPCP_CHECKSUM_FLAG);
+		rep.chksum = htonl(zlib_adler32(1, data, data_len));
+	}
+
+	if (unlikely(debug)) {
+		struct sk_buff tmp;
+
+		tmp.len = ntohs(rep.udphdr.len);
+		tmp.head = tmp.transport_header = tmp.data = (void *)&rep;
+		tmp.tail = tmp.head + tmp.len;
+
+		dump_msg("ack msg", &tmp, ip_hdr(skb)->daddr,
+			 ip_hdr(skb)->saddr);
+	}
+
+	csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr,
+				      ip_hdr(skb)->saddr,
+				      sizeof(rep), sk->sk_protocol, 0);
+
+	ipc.addr = dest->addr;
+	ipc.opt = NULL;
+	ipc.tx_flags = 0;
+
+	{
+		struct flowi fl = {
+			.nl_u = { .ip4_u = {
+						.daddr = ipc.addr,
+						.saddr = ip_hdr(skb)->daddr,
+						.tos = RT_TOS(ip_hdr(skb)->tos)
+					      }
+			},
+			.uli_u = { .ports = {
+						.sport = udp_hdr(skb)->dest,
+						.dport = udp_hdr(skb)->source
+				       }
+			},
+			.proto = sk->sk_protocol,
+		};
+		security_skb_classify_flow(skb, &fl);
+		if (ip_route_output_key(sock_net(sk), &rt, &fl))
+			return;
+	}
+
+	inet->tos = ip_hdr(skb)->tos;
+	sk->sk_priority = skb->priority;
+	sk->sk_protocol = ip_hdr(skb)->protocol;
+	sk->sk_bound_dev_if = 0;
+	ip_append_data(sk, ip_reply_glue_bits, &rep, sizeof(rep),
+				0, &ipc, &rt, MSG_DONTWAIT);
+	skb = skb_peek(&sk->sk_write_queue);
+	if (skb) {
+		*((__sum16 *)skb_transport_header(skb) +
+		  offsetof(struct udphdr, check) / 2) =
+			csum_fold(csum_add(skb->csum, csum));
+		skb->ip_summed = CHECKSUM_NONE;
+		ip_push_pending_frames(sk);
+	}
+
+	ip_rt_put(rt);
+
+	UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_OUTDATAGRAMS, 0);
+}
+
+/*
+ * Pass a UDPCP skb buffer to the ip stack and send it
+ */
+static int udpcp_send_skb(struct sock *sk, struct sk_buff *skb,
+			  struct udpcp_dest *dest, struct ip_options *opt)
+{
+	int err;
+
+	skb_dst_set(skb, dst_clone(&dest->rt->dst));
+
+	err = ip_build_and_send_pkt(skb, sk, dest->fl.fl4_src,
+					dest->fl.fl4_dst, opt);
+
+	if (!err)
+		UDP_INC_STATS_USER(sock_net(sk), UDP_MIB_OUTDATAGRAMS, 0);
+	return err;
+}
+
+/*
+ * Release a routing table entry if no packet will be assembled
+ */
+static void udpcp_dst_release(struct udpcp_sock *usk, struct udpcp_dest *dest)
+{
+	if (usk->assembly_dest != dest) {
+		dst_release(&dest->rt->dst);
+		dest->rt = NULL;
+	}
+}
+
+/*
+ * Return true if the passed skb socket buffer is the last in the list
+ */
+static inline bool skb_is_eoq(const struct sk_buff_head *list,
+			      const struct sk_buff *skb)
+{
+	return (skb->next == (struct sk_buff *)list);
+}
+
+/*
+ * Arm the timeout handler for the socket
+ */
+static void udpcp_timer(struct sock *sk, unsigned long timeout)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	mod_timer(&usk->timer, timeout);
+}
+
+/*
+ * Decrement the socket pending counter and wakeup a waiting UDPCP_IOCTL_SYNC
+ */
+static inline void udpcp_dec_pending(struct sock *sk)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	if (!--usk->pending) {
+		if (waitqueue_active(&usk->wq))
+			wake_up_interruptible(&usk->wq);
+	}
+}
+
+/*
+ * Returns true is the passed message fragment is the last fragment
+ */
+static inline int udpcp_is_last_frag(struct udpcphdr *uh)
+{
+	return uh->fragamount == uh->fragnum + 1;
+}
+
+/*
+ * Transmit data message fragments
+ */
+static int _udpcp_xmit(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct sk_buff *skb = NULL;
+	struct sk_buff *skbc;
+	struct udpcphdr *uh;
+	int err = 0;
+
+	if (dest->acks >= usk->acks)
+		goto out;
+
+	if (!dest->xmit_last) {
+		/*
+		 * handle data message fragments without an ack
+		 */
+		while ((skb = skb_peek(&dest->xmit))) {
+			uh = udpcp_hdr(skb);
+
+			if (!(ntohs(uh->msginfo) & UDPCP_NO_ACK_FLAG))
+				break;
+			if (udpcp_is_last_frag(uh)) {
+				usk->stat.txMsgs++;
+				atomic_inc(&udpcp_tx_msgs);
+			}
+			skb_unlink(skb, &dest->xmit);
+			udpcp_dec_pending(sk);
+			if (unlikely(debug))
+				dump_msg("send msg", skb, dest->fl.fl4_src,
+					 dest->fl.fl4_dst);
+			err = udpcp_send_skb(sk, skb, dest,
+						(struct ip_options *)skb->cb);
+			if (err) {
+				kfree_skb(skb);
+				skb = NULL;
+				break;
+			}
+		}
+		dest->xmit_wait = skb;
+	} else {
+		/*
+		 * handle next data message fragment waiting for an ack
+		 */
+		uh = udpcp_hdr(dest->xmit_last);
+
+		if (udpcp_is_last_frag(uh))
+			goto out;
+
+		/*
+		 * get next data message fragment
+		 */
+		skb = dest->xmit_last->next;
+	}
+
+	/*
+	 * send all data message fragment till the first which must be acked
+	 */
+	while (skb) {
+		skbc = skb_clone(skb, sk->sk_allocation);
+
+		if (!skbc)
+			break;
+
+		if (unlikely(debug))
+			dump_msg("send msg", skbc, dest->fl.fl4_src,
+				 dest->fl.fl4_dst);
+		err = udpcp_send_skb(sk, skbc, dest,
+					(struct ip_options *)skb->cb);
+		if (err) {
+			kfree_skb(skbc);
+			break;
+		}
+
+		uh = udpcp_hdr(skb);
+
+		if (!(ntohs(uh->msginfo) & UDPCP_SINGLE_ACK_FLAG)
+		    || udpcp_is_last_frag(uh)) {
+			dest->xmit_last = skb;
+
+			if (++dest->acks >= usk->acks || udpcp_is_last_frag(uh))
+				break;
+		}
+
+		skb = skb_is_eoq(&dest->xmit, skb) ? NULL : skb->next;
+	}
+
+out:
+	if (skb_queue_empty(&dest->xmit))
+		udpcp_dst_release(usk, dest);
+
+	return err;
+}
+
+/*
+ * Transmit data message fragments and rearm the timeout handler if necessary
+ */
+static int udpcp_xmit(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int ret;
+
+	ret = _udpcp_xmit(sk, dest);
+
+	if (dest->xmit_wait) {
+		dest->tx_time = jiffies;
+
+		if (!timer_pending(&usk->timer))
+			udpcp_timer(sk, dest->tx_time + usk->tx_timeout);
+	}
+	return ret;
+}
+
+/*
+ * Queue the assembled message fragment into the transmit queue
+ */
+static void udpcp_queue_xmit(struct sock *sk, struct udpcp_dest *dest,
+			     u8 ackmode, u8 chkmode)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct udpcphdr *uh;
+	struct sk_buff *skb;
+	u8 fragamount;
+	u8 fragnum;
+	unsigned short msginfo;
+	struct flowi *fl = &dest->fl;
+
+	msginfo = UDPCP_MSG_TYPE_DATA | UDPCP_PROTOCOL_VERSION_2;
+	switch (ackmode) {
+	case UDPCP_NOACK:
+		msginfo |= UDPCP_NO_ACK_FLAG;
+		break;
+	case UDPCP_SINGLE_ACK:
+		msginfo |= UDPCP_SINGLE_ACK_FLAG;
+		break;
+	case UDPCP_ACK:
+	default:
+		break;
+	}
+	switch (chkmode) {
+	case UDPCP_NOCHECKSUM:
+		break;
+	case UDPCP_CHECKSUM:
+	default:
+		msginfo |= UDPCP_CHECKSUM_FLAG;
+		break;
+	}
+
+	fragamount = skb_queue_len(&usk->assembly);
+
+	udpcp_sk(sk)->pending += fragamount;
+
+	for (fragnum = 0; fragnum != fragamount; fragnum++) {
+		unsigned char *data;
+		int data_len;
+
+		skb = skb_dequeue(&usk->assembly);
+		uh = udpcp_hdr(skb);
+
+		/*
+		 * setup a UDPCP header
+		 */
+		uh->chksum = 0;
+		uh->msginfo = htons(msginfo);
+		uh->fragnum = fragnum;
+		uh->fragamount = fragamount;
+		uh->msgid = htons(dest->msgid);
+		uh->length = htons(usk->assembly_len);
+
+		data = skb_transport_header(skb) + sizeof(struct udphdr);
+		data_len = skb_tail_pointer(skb) - data;
+
+		if (chkmode == UDPCP_CHECKSUM)
+			uh->chksum = htonl(zlib_adler32(1, data, data_len));
+		/*
+		 * create a UDP header
+		 */
+		uh->udphdr.source = fl->fl_ip_sport;
+		uh->udphdr.dest = fl->fl_ip_dport;
+		uh->udphdr.len = htons(sizeof(struct udphdr) + data_len);
+		uh->udphdr.check = 0;
+
+		/*
+		 * create UDP checksum
+		 */
+		udpcp_do_csum(sk, skb, dest);
+
+		/*
+		 * add to xmit queue
+		 */
+		skb_queue_tail(&dest->xmit, skb);
+	}
+
+	dest->msgid++;
+	usk->assembly_len = 0;
+	usk->assembly_dest = NULL;
+}
+
+/*
+ * Remove all data message fragments of the first message from the transmit
+ * queue all fragments will be merged together
+ */
+static struct sk_buff *udpcp_dequeue_msg(struct sock *sk,
+					 struct udpcp_dest *dest)
+{
+	struct sk_buff *msg;
+	struct sk_buff *skb;
+	struct sk_buff **next;
+	struct udpcphdr *uh;
+
+	msg = skb_dequeue(&dest->xmit);
+	if (!msg)
+		return NULL;
+	skb_orphan(msg);
+
+	uh = udpcp_hdr(msg);
+	if (!uh->msgid) {
+		/*
+		 * sync message
+		 */
+		kfree_skb(msg);
+		return NULL;
+	}
+
+	skb_pull(msg, sizeof(struct udpcphdr));
+	if (udpcp_is_last_frag(uh))
+		return msg;
+
+	next = &skb_shinfo(msg)->frag_list;
+	for (;;) {
+		skb = skb_dequeue(&dest->xmit);
+		if (!skb)
+			break;
+		skb_orphan(skb);
+		uh = udpcp_hdr(skb);
+		skb_pull(msg, sizeof(struct udpcphdr));
+		msg->len += skb->len;
+		msg->data_len += skb->len;
+		*next = skb;
+		if (udpcp_is_last_frag(uh))
+			break;
+		next = &skb->next;
+	}
+	return msg;
+}
+
+static void udpcp_flush_err(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct inet_sock *inet = inet_sk(sk);
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	if (!inet->recverr) {
+		skb_queue_purge(&dest->xmit);
+	} else {
+		struct sock_exterr_skb *serr;
+		struct iphdr *iph;
+		struct sk_buff *skb;
+
+		while (!skb_queue_empty(&dest->xmit)) {
+			skb = udpcp_dequeue_msg(sk, dest);
+			if (!skb)
+				continue;
+
+			if (unlikely(debug))
+				dump_msg("flush outgoing message", skb,
+					 dest->fl.fl4_src, dest->fl.fl4_dst);
+
+			skb_push(skb, sizeof(struct iphdr));
+			skb_reset_network_header(skb);
+			iph = ip_hdr(skb);
+			iph->daddr = dest->rt->rt_dst;
+
+			serr = SKB_EXT_ERR(skb);
+			serr->ee.ee_errno = EPROTO;
+			serr->ee.ee_origin = SO_EE_ORIGIN_LOCAL;
+			serr->ee.ee_type = 0;
+			serr->ee.ee_code = 0;
+			serr->ee.ee_pad = 0;
+			serr->ee.ee_info = 0;
+			serr->ee.ee_data = 0;
+			serr->addr_offset = (u8 *) &iph->daddr -
+						skb_network_header(skb);
+			serr->port = dest->fl.fl_ip_dport;
+
+			skb_reset_transport_header(skb);
+			skb_pull(skb, sizeof(struct iphdr));
+
+			/*
+			 * set a flag for UDPCP message
+			 */
+			UDP_SKB_CB(skb)->udpcp_flag = 1;
+
+			/*
+			 * pass the dequeued message to the error queue of the
+			 * socket
+			 */
+			skb_set_owner_r(skb, sk);
+			skb_queue_tail(&sk->sk_error_queue, skb);
+			if (!sock_flag(sk, SOCK_DEAD)) {
+				if (usk->udp_data_ready)
+					usk->udp_data_ready(sk, skb->len);
+			}
+		}
+	}
+
+	dest->xmit_wait = 0;
+	dest->xmit_last = 0;
+	dest->try = 0;
+	dest->acks = 0;
+
+	usk->pending = 0;
+	if (waitqueue_active(&usk->wq))
+		wake_up_interruptible(&usk->wq);
+}
+
+/*
+ * Purge the current incoming data message
+ */
+static void udpcp_purge_incoming(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	if (dest->recv_last) {
+		u32 fragnum = udpcp_hdr(dest->recv_last)->fragnum + 1;
+
+		dest->rx_discarded_frags += fragnum;
+		usk->stat.rxDiscardedFrags += fragnum;
+		atomic_add(fragnum, &udpcp_rx_discarded_frags);
+
+		dest->lastmsg.msgid = 0;
+
+		if (unlikely(debug))
+			dump_msg("purge incoming message", dest->recv_msg,
+				 dest->fl.fl4_src, dest->fl.fl4_dst);
+	}
+
+	kfree_skb(dest->recv_msg);
+	dest->recv_msg = 0;
+	dest->recv_last = 0;
+}
+
+/*
+ * Resend all data message fragments to the one which is currently waiting for
+ * an ack
+ */
+static int udpcp_resend(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct sk_buff *skb;
+	struct sk_buff *skbc;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int err;
+
+	if (++dest->try >= usk->maxtry) {
+		dest->insync = 0;
+		udpcp_flush_err(sk, dest);
+		udpcp_purge_incoming(sk, dest);
+		udpcp_dst_release(usk, dest);
+		return 0;
+	}
+
+	dest->tx_retries++;
+	usk->stat.txRetries++;
+	atomic_inc(&udpcp_tx_retries);
+
+	if (!dest->xmit_last) {
+		_udpcp_xmit(sk, dest);
+	} else {
+		skb = dest->xmit_wait;
+
+		for (;;) {
+			skbc = skb_clone(skb, sk->sk_allocation);
+
+			if (skbc == NULL)
+				break;
+
+			if (unlikely(debug))
+				dump_msg("resend msg", skbc, dest->fl.fl4_src,
+					 dest->fl.fl4_dst);
+			err = udpcp_send_skb(sk, skbc, dest,
+						(struct ip_options *)skb->cb);
+			if (err) {
+				kfree_skb(skbc);
+				break;
+			}
+
+			if (skb == dest->xmit_last) {
+				_udpcp_xmit(sk, dest);
+				break;
+			}
+
+			skb = skb->next;
+		}
+	}
+	dest->tx_time = jiffies;
+
+	return 1;
+}
+
+/*
+ * Handle udpcp timeout
+ */
+static void udpcp_handle_timeout(struct sock *sk)
+{
+	struct udpcp_dest *dest;
+	struct list_head *p;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int wflag = 0;
+	unsigned long t = jiffies + UDPCP_MAX_WAIT_SEC * HZ + 1;
+
+	usk->timeout = 0;
+
+	/*
+	 * walk through all destinations
+	 */
+	list_for_each(p, &usk->destlist) {
+		dest = list_to_udpcpdest(p);
+
+		if (dest->xmit_wait) {
+			if (time_is_before_eq_jiffies
+			    (dest->tx_time + usk->tx_timeout)) {
+				/*
+				 * transmit timeout expired
+				 */
+				if (unlikely(debug))
+					dump_msg("send timeout",
+						 dest->xmit_wait,
+						 dest->fl.fl4_src,
+						 dest->fl.fl4_dst);
+				if (udpcp_resend(sk, dest) == 0) {
+					dest->tx_timeout++;
+					usk->stat.txTimeout++;
+					atomic_inc(&udpcp_tx_timeout);
+					goto check_incoming;
+				}
+				wflag = 1;
+			}
+			if (time_before(dest->tx_time + usk->tx_timeout, t)) {
+				/*
+				 * calculate new timeout timer value
+				 */
+				t = dest->tx_time + usk->tx_timeout;
+				wflag = 1;
+			}
+		}
+check_incoming:
+		if (dest->recv_msg) {
+			if (time_is_before_eq_jiffies
+			    (dest->rx_time + usk->rx_timeout)) {
+				/*
+				 * receive timeout occurred
+				 */
+				if (unlikely(debug))
+					dump_msg("receive timeout",
+						 dest->recv_last,
+						 dest->fl.fl4_src,
+						 dest->fl.fl4_dst);
+				udpcp_purge_incoming(sk, dest);
+				dest->rx_timeout++;
+				usk->stat.rxTimeout++;
+				atomic_inc(&udpcp_rx_timeout);
+			} else
+			if (time_before(dest->rx_time + usk->rx_timeout, t)) {
+				/*
+				 * calculate new timeout timer value
+				 */
+				t = dest->rx_time + usk->rx_timeout;
+				wflag = 1;
+			}
+		}
+	}
+	/*
+	 * restart timer if necessary
+	 */
+	if (wflag)
+		udpcp_timer(sk, t);
+}
+
+/*
+ * Timeout function
+ */
+static void udpcp_timeout(unsigned long data)
+{
+	struct sock *sk = (struct sock *)data;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	bh_lock_sock(sk);
+	if (!sock_owned_by_user(sk)) {
+		udpcp_handle_timeout(sk);
+	} else {
+		/*
+		 * bad, cannot handle the timeout because the socket is in use
+		 * set flag for unhandled timeout and rearm the timer
+		 */
+		usk->timeout = 1;
+		udpcp_timer(sk, jiffies + 1);
+	}
+	bh_unlock_sock(sk);
+}
+
+/*
+ * Handle timeout if an the unhandled timeout flag is set
+ */
+static inline void check_timeout(struct sock *sk)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	while (usk->timeout) {
+		lock_sock(sk);
+		while (usk->timeout)
+			udpcp_handle_timeout(sk);
+		release_sock(sk);
+	}
+}
+
+/*
+ * Release the socket lock and test for unhandled timeouts
+ */
+static inline void udpcp_release_sock(struct sock *sk)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	while (usk->timeout)
+		udpcp_handle_timeout(sk);
+	release_sock(sk);
+	check_timeout(sk);
+}
+
+/*
+ * Parse sendmsg() control message
+ */
+static int udpcp_cmsg_send(struct msghdr *msg, u8 * ackmode, u8 * chkmode)
+{
+	struct cmsghdr *cmsg;
+
+	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
+		if (!CMSG_OK(msg, cmsg))
+			return -EINVAL;
+		if (cmsg->cmsg_level != SOL_UDPCP)
+			continue;
+		switch (cmsg->cmsg_type) {
+		case UDPCP_NOACK:
+		case UDPCP_ACK:
+		case UDPCP_SINGLE_ACK:
+			*ackmode = cmsg->cmsg_type;
+			break;
+		case UDPCP_CHECKSUM:
+		case UDPCP_NOCHECKSUM:
+			*chkmode = cmsg->cmsg_type;
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+/*
+ * Validate a skb buffer
+ */
+static int udpcp_validate_skb(struct sk_buff *skb)
+{
+	if (skb->next) {
+		pr_err("udpcp: unexpected skb_buff->next != NULL\n");
+		BUG();
+		return 1;
+	}
+	if (skb_shinfo(skb)->frag_list) {
+		pr_err("udpcp: unexpected skb_shinfo(skb)->frag_list != NULL\n");
+		BUG();
+		return 1;
+	}
+	return 0;
+}
+
+/*
+ * Split a message into fragments and store it into the assemble queue
+ * mostly stolen from UDP stack
+ */
+static int udpcp_data(struct sock *sk, struct udpcp_dest *dest,
+		      struct iovec *from, int length, unsigned int flags)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct inet_sock *inet = inet_sk(sk);
+	struct sk_buff *skb;
+	struct ipcm_cookie *ipc = &dest->ipc;
+	struct ip_options *opt = ipc->opt;
+	int hh_len;
+	int exthdrlen;
+	int mtu;
+	int copy;
+	int err;
+	int offset = 0;
+	unsigned int maxfraglen, fragheaderlen;
+	int csummode = CHECKSUM_NONE;
+	int transhdrlen = sizeof(struct udpcphdr);
+	struct rtable *rt = dest->rt;
+
+	if (opt && sizeof(skb->cb) < optlength(opt)) {
+		err = -EFAULT;
+		goto error;
+	}
+
+	usk->assembly_len += length;
+	usk->assembly_dest = dest;
+
+	if (usk->assembly_len > UDPCP_MAX_MSGSIZE) {
+		ip_local_error(sk, EMSGSIZE, rt->rt_dst, dest->fl.fl_ip_dport,
+				usk->assembly_len);
+		err = -EMSGSIZE;
+		goto error;
+	}
+
+	mtu = (inet->pmtudisc == IP_PMTUDISC_PROBE) ?
+		rt->dst.dev->mtu : dst_mtu(rt->dst.path);
+	sk->sk_sndmsg_page = NULL;
+	sk->sk_sndmsg_off = 0;
+	exthdrlen = rt->dst.header_len;
+	length += exthdrlen;
+	transhdrlen += exthdrlen;
+
+	hh_len = LL_RESERVED_SPACE(rt->dst.dev);
+
+	fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0);
+	maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen;
+
+	if (rt->dst.dev->features & NETIF_F_V4_CSUM && !exthdrlen)
+		csummode = CHECKSUM_PARTIAL;
+
+	skb = skb_peek_tail(&usk->assembly);
+	if (skb) {
+		unsigned int off;
+
+		off = skb->len;
+
+		copy = mtu - skb->len;
+		if (copy > length)
+			copy = length;
+
+		if (copy > 0 &&
+		    ip_generic_getfrag(
+		     from, skb_put(skb, copy), 0, copy, off, skb) < 0) {
+			__skb_trim(skb, off);
+			err = -EFAULT;
+			goto error;
+		}
+		length -= copy;
+		offset += copy;
+
+		if (!length)
+			return 0;
+	}
+
+	do {
+		char *data;
+		unsigned int datalen;
+		unsigned int fraglen;
+		unsigned int alloclen;
+
+		length += transhdrlen;
+		/*
+		 * If remaining data exceeds the mtu,
+		 * we know we need more fragment(s).
+		 */
+		datalen = length;
+		if (datalen > mtu - fragheaderlen)
+			datalen = maxfraglen - fragheaderlen;
+		fraglen = datalen + fragheaderlen;
+
+		if ((flags & MSG_MORE)
+		    && !(rt->dst.dev->features & NETIF_F_SG))
+			alloclen = mtu;
+		else
+			alloclen = fraglen;
+
+		alloclen += rt->dst.trailer_len + hh_len + 15;
+
+		udpcp_release_sock(sk);
+		skb = sock_alloc_send_skb(sk, alloclen,
+					(flags & MSG_DONTWAIT), &err);
+		lock_sock(sk);
+		if (skb == NULL)
+			goto error;
+
+		if (udpcp_validate_skb(skb)) {
+			kfree_skb(skb);
+
+			goto error;
+		}
+
+		/*
+		 * Fill in the control structures
+		 */
+		skb->ip_summed = csummode;
+		skb->csum = 0;
+		skb_reserve(skb, hh_len);
+
+		/*
+		 * Find where to start putting bytes.
+		 */
+		data = skb_put(skb, fraglen);
+		skb_set_network_header(skb, exthdrlen);
+		skb->transport_header = (skb->network_header + fragheaderlen);
+		data += fragheaderlen;
+
+		copy = datalen - transhdrlen;
+
+		if (copy > 0 &&
+		  ip_generic_getfrag(
+		   from, data + transhdrlen, offset, copy, 0, skb) < 0) {
+			err = -EFAULT;
+			kfree_skb(skb);
+			goto error;
+		}
+
+		offset += copy;
+		length -= datalen;
+
+		if (ipc->opt)
+			memcpy(skb->cb, &ipc->opt, optlength(opt));
+
+		skb_pull(skb, fragheaderlen);
+		skb_queue_tail(&usk->assembly, skb);
+	} while (length > 0);
+
+	return 0;
+error:
+	skb_queue_purge(&usk->assembly);
+	usk->assembly_len = 0;
+
+	IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTDISCARDS);
+	return err;
+}
+
+/*
+ * This function will be called by send(), sento() and sendmsg()
+ */
+static int udpcp_sendmsg(struct kiocb *iocb, struct sock *sk,
+			 struct msghdr *msg, size_t len)
+{
+	struct inet_sock *inet = inet_sk(sk);
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct ipcm_cookie *ipc;
+	struct rtable *rt = NULL;
+	int free = 0;
+	int connected = 0;
+	__be32 daddr, faddr, saddr;
+	__be16 dport;
+	u8 tos;
+	int err = 0;
+	int corkreq = usk->udpsock.corkflag || msg->msg_flags & MSG_MORE;
+	struct udpcp_dest *dest;
+
+	if (len > UDPCP_MAX_MSGSIZE)
+		return -EMSGSIZE;
+
+	/*
+	 * Check the flags.
+	 */
+	if (msg->msg_flags & MSG_OOB)
+		return -EOPNOTSUPP;
+
+	/*
+	 * check if socket is binded to a port
+	 */
+	if (!(sk->sk_userlocks & SOCK_BINDPORT_LOCK) || !inet->inet_num)
+		return -ENOTCONN;
+
+	/*
+	 * Get and verify the address.
+	 */
+	if (msg->msg_name) {
+		struct sockaddr_in *usin = (struct sockaddr_in *)msg->msg_name;
+		if (msg->msg_namelen < sizeof(*usin))
+			return -EINVAL;
+		if (usin->sin_family != AF_INET) {
+			if (usin->sin_family != AF_UNSPEC)
+				return -EAFNOSUPPORT;
+		}
+
+		daddr = usin->sin_addr.s_addr;
+		dport = usin->sin_port;
+	} else {
+		if (sk->sk_state != TCP_ESTABLISHED)
+			return -EDESTADDRREQ;
+		daddr = inet->inet_daddr;
+		dport = inet->inet_dport;
+		/* Open fast path for connected socket.
+		   Route will not be used, if at least one option is set.
+		 */
+		connected = 1;
+	}
+
+	if (dport == 0)
+		return -EINVAL;
+
+	dest = find_dest(sk, daddr, dport);
+	if (!dest)
+		return -ENOMEM;
+
+	if (!(dest->use_flag & TX_NODE)) {
+		dest->use_flag |= TX_NODE;
+		usk->stat.txNodes++;
+		atomic_inc(&udpcp_tx_nodes);
+	}
+
+	ipc = &dest->ipc;
+
+	if (!skb_queue_empty(&usk->assembly)) {
+		/*
+		 * assembly is ongoing
+		 */
+		lock_sock(sk);
+		if (likely(!skb_queue_empty(&usk->assembly))) {
+			if (usk->assembly_dest != dest) {
+				udpcp_release_sock(sk);
+				return -EUSERS;
+			}
+			ipc->opt =
+			    (struct ip_options *)skb_peek(&usk->assembly)->cb;
+			goto queue_data;
+		}
+		udpcp_release_sock(sk);
+	}
+
+	ipc->addr = inet->inet_saddr;
+	ipc->oif = sk->sk_bound_dev_if;
+
+	dest->ackmode = usk->ackmode;
+	dest->chkmode = usk->chkmode;
+
+	if (msg->msg_controllen) {
+		/*
+		 * handle control message
+		 */
+		err = udpcp_cmsg_send(msg, &dest->ackmode, &dest->chkmode);
+		if (err)
+			return err;
+		err = ip_cmsg_send(sock_net(sk), msg, ipc);
+		if (err)
+			return err;
+		if (ipc->opt)
+			free = 1;
+		connected = 0;
+	}
+
+	if (!ipc->opt)
+		ipc->opt = inet->opt;
+
+	saddr = ipc->addr;
+	ipc->addr = faddr = daddr;
+
+	if (ipc->opt && ipc->opt->srr) {
+		if (!daddr)
+			return -EINVAL;
+		faddr = ipc->opt->faddr;
+		connected = 0;
+	}
+	tos = RT_TOS(inet->tos);
+	if (sock_flag(sk, SOCK_LOCALROUTE) ||
+	    (msg->msg_flags & MSG_DONTROUTE) ||
+	    (ipc->opt && ipc->opt->is_strictroute)) {
+		tos |= RTO_ONLINK;
+		connected = 0;
+	}
+
+	if (ipv4_is_multicast(daddr)) {
+		if (dest->ackmode != UDPCP_NOACK) {
+			err = EOPNOTSUPP;
+			goto out;
+		}
+		if (!ipc->oif)
+			ipc->oif = inet->mc_index;
+		if (!saddr)
+			saddr = inet->mc_addr;
+		connected = 0;
+	}
+
+	lock_sock(sk);
+	rt = dest->rt;
+	if (rt)
+		goto queue_data;
+	udpcp_release_sock(sk);
+
+	/*
+	 * calculate routing
+	 */
+	if (connected)
+		rt = (struct rtable *)sk_dst_check(sk, 0);
+
+	if (rt == NULL) {
+		struct flowi fl = {.oif = ipc->oif,
+			.nl_u = {.ip4_u = {.daddr = faddr,
+					   .saddr = saddr,
+					   .tos = tos} },
+			.proto = sk->sk_protocol,
+			.uli_u = {.ports = {.sport = inet->inet_sport,
+					    .dport = dport} }
+		};
+		struct net *net = sock_net(sk);
+
+		security_sk_classify_flow(sk, &fl);
+		err = ip_route_output_flow(net, &rt, &fl, sk, 1);
+		if (err) {
+			if (err == -ENETUNREACH)
+				IP_INC_STATS_BH(net, IPSTATS_MIB_OUTNOROUTES);
+			goto out;
+		}
+
+		err = -EACCES;
+		if ((rt->rt_flags & RTCF_BROADCAST) &&
+		    !sock_flag(sk, SOCK_BROADCAST))
+			goto out;
+		if (connected)
+			sk_dst_set(sk, dst_clone(&rt->dst));
+	}
+
+	if (msg->msg_flags & MSG_CONFIRM)
+		goto do_confirm;
+back_from_confirm:
+
+	saddr = rt->rt_src;
+	if (!ipc->addr)
+		daddr = ipc->addr = rt->rt_dst;
+
+	lock_sock(sk);
+
+	dest->fl.fl4_dst = daddr;
+	dest->fl.fl_ip_dport = dport;
+	dest->fl.fl4_src = saddr;
+	dest->fl.fl_ip_sport = inet->inet_sport;
+	dest->rt = rt;
+
+queue_data:
+	if (msg->msg_flags & MSG_PROBE)
+		goto release;
+
+	if (!dest->insync && skb_queue_empty(&dest->xmit)) {
+		/*
+		 * if not synced, queue a SYNC message
+		 */
+		err = udpcp_data(sk, dest, NULL, 0, 0);
+		if (err)
+			goto release;
+		dest->msgid = 0;
+		udpcp_queue_xmit(sk, dest, UDPCP_ACK, UDPCP_CHECKSUM);
+	}
+
+	/*
+	 * split message and store it to the assembly queue
+	 */
+	err = udpcp_data(sk, dest, msg->msg_iov, len,
+		       corkreq ? msg->msg_flags | MSG_MORE : msg->msg_flags);
+	if (err)
+		goto release;
+
+	if (!dest->msgid)
+		dest->msgid = 1;
+
+	if (!corkreq) {
+		/*
+		 * message is complete, transfer it from the assembly queue
+		 * into the transmit queue
+		 */
+		udpcp_queue_xmit(sk, dest, dest->ackmode, dest->chkmode);
+		/*
+		 * start transmit if possible
+		 */
+		err = udpcp_xmit(sk, dest);
+	}
+release:
+	udpcp_release_sock(sk);
+out:
+	if (free)
+		kfree(ipc->opt);
+
+	if (!err)
+		return len;
+	/*
+	 * ENOBUFS = no kernel mem, SOCK_NOSPACE = no sndbuf space.  Reporting
+	 * ENOBUFS might not be good (it's not tunable per se), but otherwise
+	 * we don't have a good statistic (IpOutDiscards but it can be too many
+	 * things).  We could add another new stat but at least for now that
+	 * seems like overkill.
+	 */
+	if (err == -ENOBUFS || test_bit(SOCK_NOSPACE, &sk->sk_socket->flags))
+		UDP_INC_STATS_USER(sock_net(sk), UDP_MIB_SNDBUFERRORS, 0);
+	return err;
+
+do_confirm:
+	dst_confirm(&rt->dst);
+	if (!(msg->msg_flags & MSG_PROBE) || len)
+		goto back_from_confirm;
+
+	err = 0;
+	goto out;
+}
+
+/*
+ * Sendpage() is not really implemented
+ */
+static int udpcp_sendpage(struct sock *sk, struct page *page, int offset,
+			  size_t size, int flags)
+{
+	return sock_no_sendpage(sk->sk_socket, page, offset, size, flags);
+}
+
+/*
+ * Release all message fragments of the first in the transmit queue
+ */
+static void udpcp_release_xmit(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct sk_buff *skb;
+	struct udpcphdr *uh;
+
+	for (;;) {
+		skb = skb_dequeue(&dest->xmit);
+
+		uh = udpcp_hdr(skb);
+
+		if (udpcp_is_last_frag(uh) && uh->msgid) {
+			usk->stat.txMsgs++;
+			atomic_inc(&udpcp_tx_msgs);
+		}
+
+		udpcp_dec_pending(sk);
+
+		kfree_skb(skb);
+		if (skb == dest->xmit_last)
+			break;
+	}
+
+	dest->xmit_wait = 0;
+	dest->xmit_last = 0;
+	dest->try = 0;
+}
+
+/*
+ * Set the sync state
+ */
+static void udpcp_sync(struct sock *sk, struct udpcp_dest *dest)
+{
+	dest->xmit_wait = 0;
+	dest->xmit_last = 0;
+	dest->try = 0;
+	dest->acks = 0;
+	dest->insync = 1;
+}
+
+/*
+ * Returns true if the first message in the transmit queue is a sync message
+ */
+static inline int udpcp_xmit_is_sync(struct udpcp_dest *dest)
+{
+	struct sk_buff *skb = skb_peek(&dest->xmit);
+
+	return skb && !udpcp_hdr(skb)->msgid;
+}
+
+static inline struct udpcphdr *udpcp_ack_scan(struct sk_buff *skb)
+{
+	struct udpcphdr *uh;
+
+	for (;;) {
+		uh = udpcp_hdr(skb);
+
+		if (!(ntohs(uh->msginfo) & UDPCP_SINGLE_ACK_FLAG)
+		    || udpcp_is_last_frag(uh))
+			return uh;
+
+		skb = skb->next;
+	}
+}
+
+/*
+ * Handle an incoming ack
+ */
+static void udpcp_handle_ack(struct sock *sk, struct sk_buff *skb,
+			     struct udpcp_dest *dest)
+{
+	struct udpcphdr *r_uh;
+	struct udpcphdr *q_uh;
+
+	if (!dest->acks)
+		return;
+
+	r_uh = udpcp_hdr(skb);
+
+	/*
+	 * acks doesn't have a payload
+	 */
+	if (r_uh->length)
+		return;
+
+	q_uh = udpcp_ack_scan(dest->xmit_wait);
+
+	/*
+	 * message id, fragnum and fragamount must match the awaited message
+	 * fragment
+	 */
+	if (r_uh->msgid != q_uh->msgid)
+		return;
+
+	if (r_uh->fragnum != q_uh->fragnum)
+		return;
+
+	if (r_uh->fragamount != q_uh->fragamount)
+		return;
+
+	dest->acks--;
+
+	/*
+	 * if last fragment release message
+	 */
+	if (udpcp_is_last_frag(q_uh)) {
+		udpcp_release_xmit(sk, dest);
+
+		/*
+		 * special handling for sync messages
+		 */
+		if (r_uh->msgid == 0)
+			udpcp_sync(sk, dest);
+	} else {
+		dest->xmit_wait = dest->xmit_wait->next;
+	}
+	/*
+	 * try to transmit next message/fragment
+	 */
+	udpcp_xmit(sk, dest);
+}
+
+/*
+ * Queue incoming message as owned by udpcp socket
+ */
+static void udpcp_set_owner_r(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct sk_buff *skb;
+
+	skb = dest->recv_msg;
+	skb_set_owner_r(skb, sk);
+
+	skb = skb_shinfo(skb)->frag_list;
+	if (!skb)
+		return;
+
+	for (;;) {
+		skb_set_owner_r(skb, sk);
+		if (udpcp_is_last_frag(udpcp_hdr(skb)))
+			break;
+		skb = skb->next;
+	}
+}
+
+/*
+ * Handle an incoming data message fragment
+ */
+static int udpcp_handle_data(struct sock *sk, struct sk_buff *skb,
+			     struct udpcp_dest *dest)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct udpcphdr *uh = udpcp_hdr(skb);
+	unsigned short msginfo = ntohs(uh->msginfo);
+	unsigned short length = ntohs(uh->length);
+
+	/*
+	 * special handling for sync messages
+	 */
+	if (uh->msgid == 0) {
+		/*
+		 * sync messages doesn't have a payload
+		 */
+		if (length)
+			return 1;
+
+		/*
+		 * sync messages doesn't have a ack rules
+		 */
+		if (msginfo & (UDPCP_NO_ACK_FLAG | UDPCP_SINGLE_ACK_FLAG))
+			return 1;
+
+		udpcp_send_ack(sk, skb, dest,
+			       memcmp(uh, &dest->lastmsg,
+				      sizeof(dest->lastmsg)) ? 0 : 1);
+
+		udpcp_purge_incoming(sk, dest);
+
+		/*
+		 * skip the first message in the queue if it is a sync messages
+		 */
+		if (udpcp_xmit_is_sync(dest)) {
+			dest->acks--;
+			udpcp_dec_pending(sk);
+			kfree_skb(skb_dequeue(&dest->xmit));
+		}
+
+		if (!dest->insync)
+			udpcp_sync(sk, dest);
+
+		udpcp_xmit(sk, dest);
+
+		return -1;
+	}
+
+	if (!dest->insync)
+		return 1;
+
+	if (length > UDPCP_MAX_MSGSIZE)
+		return 1;
+
+	length += sizeof(struct udpcphdr);
+
+	/*
+	 * if the message was still handled, send a duplicate ack
+	 */
+	if (!memcmp(uh, &dest->lastmsg, sizeof(dest->lastmsg))) {
+		udpcp_send_ack(sk, skb, dest, 1);
+		return 1;
+	}
+
+	if (dest->recv_msg) {
+		/*
+		 * if a fragment is already received validate the fragment
+		 */
+		if ((uh->msgid != udpcp_hdr(dest->recv_msg)->msgid) ||
+		    (uh->msginfo != udpcp_hdr(dest->recv_msg)->msginfo) ||
+		    (uh->length != udpcp_hdr(dest->recv_msg)->length) ||
+		    (uh->fragamount != udpcp_hdr(dest->recv_msg)->fragamount)
+		    ) {
+			udpcp_purge_incoming(sk, dest);
+			goto newmsg;
+		}
+
+		if (uh->fragnum != udpcp_hdr(dest->recv_last)->fragnum + 1)
+			return 1;
+
+		if (dest->recv_msg->len + skb->len - sizeof(struct udpcphdr) >
+		    length)
+			return 1;
+	} else {
+newmsg:
+		/*
+		 * first fragment must have the number 0
+		 */
+		if (uh->fragnum != 0)
+			return 1;
+
+		/*
+		 * UDPCP data length cannot be smaller then the UDP data length
+		 */
+		if (skb->len > length)
+			return 1;
+
+		/*
+		 * id of the last received is not valid
+		 */
+		if (dest->lastmsg.msgid == uh->msgid)
+			return 1;
+
+		/*
+		 * check against receive buffer limit
+		 */
+		if (atomic_read(&sk->sk_rmem_alloc) + length > sk->sk_rcvbuf)
+			return 1;
+	}
+
+	memset(&dest->lastmsg, 0, sizeof(dest->lastmsg));
+
+	if (!dest->recv_msg) {
+		/*
+		 * store the first message fragment
+		 */
+		if (skb->cloned) {
+			struct sk_buff *skbc;
+
+			skbc = skb_copy(skb, sk->sk_allocation);
+			if (skbc == NULL)
+				return 1;
+			kfree_skb(skb);
+			skb = skbc;
+		}
+		dest->recv_msg = skb;
+	} else {
+		/*
+		 * store the consecutively message fragment
+		 */
+		struct skb_shared_info *shinfo;
+
+		shinfo = skb_shinfo(dest->recv_msg);
+
+		if (!shinfo->frag_list)
+			shinfo->frag_list = skb;
+		else
+			dest->recv_last->next = skb;
+
+		skb_pull(skb, sizeof(struct udpcphdr));
+		dest->recv_msg->len += skb->len;
+		dest->recv_msg->data_len += skb->len;
+	}
+	dest->recv_last = skb;
+
+	msginfo = ntohs(uh->msginfo);
+
+	if (udpcp_is_last_frag(uh) || uh->fragamount == 0) {
+		/*
+		 * last fragment: queue it to the socket sk_receive_queue
+		 * and ack it
+		 */
+
+		if (dest->recv_msg->len != length) {
+			udpcp_purge_incoming(sk, dest);
+			return 0;
+		}
+
+		if (!(msginfo & UDPCP_NO_ACK_FLAG))
+			udpcp_send_ack(sk, skb, dest, 0);
+
+		memcpy(dest->recv_msg->data + UDPCP_HDRSIZE,
+		       dest->recv_msg->data, sizeof(struct udphdr));
+		skb_pull(dest->recv_msg, UDPCP_HDRSIZE);
+
+		usk->stat.rxMsgs++;
+		atomic_inc(&udpcp_rx_msgs);
+
+		/*
+		 * set a flag for UDPCP message
+		 */
+		UDP_SKB_CB(skb)->udpcp_flag = 1;
+
+		udpcp_set_owner_r(sk, dest);
+		skb_queue_tail(&sk->sk_receive_queue, dest->recv_msg);
+
+		/*
+		 * call the original data available handler
+		 */
+		if (usk->udp_data_ready)
+			usk->udp_data_ready(sk, dest->recv_msg->len);
+
+		dest->recv_msg = 0;
+		dest->recv_last = 0;
+	} else {
+		/*
+		 * ack fragment if requiered
+		 */
+		if (!(msginfo & UDPCP_NO_ACK_FLAG)
+		    && !(msginfo & UDPCP_SINGLE_ACK_FLAG))
+			udpcp_send_ack(sk, skb, dest, 0);
+
+		/*
+		 * setup timeout handler
+		 */
+		dest->rx_time = jiffies;
+
+		if (!timer_pending(&usk->timer))
+			udpcp_timer(sk, dest->rx_time + usk->rx_timeout);
+	}
+
+	return 0;
+}
+
+/*
+ * Deal with received UDPCP frames - sort out what type source it is
+ * and hand of it to the udpcp_handle_packet function.
+ */
+static void udpcp_data_ready(struct sock *sk, int slen)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct sk_buff *skb;
+	struct udpcp_dest *dest;
+	struct udpcphdr *uh;
+	unsigned short msginfo;
+	int ret;
+
+	skb = skb_peek_tail(&sk->sk_receive_queue);
+
+	/*
+	 * don't handle NULL pointer buffer and UDPCP messages
+	 */
+	if (skb == NULL || UDP_SKB_CB(skb)->udpcp_flag) {
+		if (usk->udp_data_ready)
+			usk->udp_data_ready(sk, slen);
+		return;
+	}
+
+	__skb_unlink(skb, &sk->sk_receive_queue);
+	if (udpcp_validate_skb(skb)) {
+		kfree_skb(skb);
+
+		return;
+	}
+
+	skb_orphan(skb);
+
+	/*
+	 * do UDP checksum
+	 */
+	if (udp_lib_checksum_complete(skb)) {
+		UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS, 0);
+		return;
+	}
+
+	if (unlikely(debug))
+		dump_msg("receive", skb, ip_hdr(skb)->saddr,
+			 ip_hdr(skb)->daddr);
+
+	uh = udpcp_hdr(skb);
+	msginfo = ntohs(uh->msginfo);
+
+	/*
+	 * handle only UDPCP protocol version 2
+	 */
+	if ((msginfo & UDPCP_PROTOCOL_MASK) != UDPCP_PROTOCOL_VERSION_2) {
+		kfree_skb(skb);
+		return;
+	}
+
+	/*
+	 * handle UDPCP checksum
+	 */
+	if (msginfo & UDPCP_CHECKSUM_FLAG) {
+		u8 *data;
+		u32 data_len;
+		u32 chksum;
+
+		chksum = ntohl(uh->chksum);
+		data = (u8 *) skb->data + sizeof(struct udphdr);
+		data_len = skb->len - sizeof(struct udphdr);
+
+		uh->chksum = 0;
+
+		if (chksum != zlib_adler32(1, data, data_len)) {
+			kfree_skb(skb);
+			usk->stat.crcErrors++;
+			atomic_inc(&udpcp_crc_errors);
+			return;
+		}
+	}
+
+	dest = __find_dest(sk, ip_hdr(skb)->saddr, udp_hdr(skb)->source);
+
+	if (!dest) {
+		/*
+		 * new communication destination must start with an sync message
+		 */
+		if (((msginfo & UDPCP_MSG_TYPE_MASK) != UDPCP_MSG_TYPE_DATA) ||
+		    (uh->msgid != 0)) {
+			kfree_skb(skb);
+			return;
+		}
+
+		dest = new_dest(sk, ip_hdr(skb)->saddr, udp_hdr(skb)->source);
+
+		if (!dest) {
+			kfree_skb(skb);
+			return;
+		}
+	}
+
+	/*
+	 * handle message type
+	 */
+	switch (msginfo & UDPCP_MSG_TYPE_MASK) {
+	case UDPCP_MSG_TYPE_DATA:
+		if (!(dest->use_flag & RX_NODE)) {
+			dest->use_flag |= RX_NODE;
+			usk->stat.rxNodes++;
+			atomic_inc(&udpcp_rx_nodes);
+		}
+
+		ret = udpcp_handle_data(sk, skb, dest);
+
+		if (ret > 0) {
+			dest->rx_discarded_frags++;
+			usk->stat.rxDiscardedFrags++;
+			atomic_inc(&udpcp_rx_discarded_frags);
+		}
+		break;
+	case UDPCP_MSG_TYPE_ACK:
+		udpcp_handle_ack(sk, skb, dest);
+	default:
+		ret = 1;
+		break;
+	}
+	if (ret)
+		kfree_skb(skb);
+}
+
+/*
+ * Set socket options
+ */
+static int udpcp_setsockopt(struct sock *sk, int level, int optname,
+			    char __user *optval, unsigned int optlen)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int val, ret;
+
+	if (level != SOL_UDPCP) {
+		if (udp_prot.setsockopt) {
+			ret = udp_prot.setsockopt(sk, level, optname, optval,
+						optlen);
+			check_timeout(sk);
+			return ret;
+		}
+		return -ENOPROTOOPT;
+	}
+
+	if (optlen < sizeof(int))
+		return -EINVAL;
+
+	if (get_user(val, (int __user *)optval))
+		return -EFAULT;
+
+	switch (optname) {
+	case UDPCP_OPT_TRANSFER_MODE:
+		switch (val) {
+		case UDPCP_NOACK:
+		case UDPCP_ACK:
+		case UDPCP_SINGLE_ACK:
+			usk->ackmode = val;
+			break;
+		default:
+			return -EINVAL;
+		}
+		break;
+	case UDPCP_OPT_CHECKSUM_MODE:
+		switch (val) {
+		case UDPCP_NOCHECKSUM:
+		case UDPCP_CHECKSUM:
+			usk->chkmode = val;
+			break;
+		default:
+			return -EINVAL;
+		}
+		break;
+
+	case UDPCP_OPT_TX_TIMEOUT:
+		if ((val < 1) || (val > UDPCP_MAX_WAIT_SEC * 1000))
+			return -EINVAL;
+		usk->tx_timeout = msecs_to_jiffies(val);
+		break;
+
+	case UDPCP_OPT_RX_TIMEOUT:
+		if ((val < 1) || (val > UDPCP_MAX_WAIT_SEC * 1000))
+			return -EINVAL;
+		usk->rx_timeout = msecs_to_jiffies(val);
+		break;
+
+	case UDPCP_OPT_MAXTRY:
+		if ((val < 1) || (val > 10))
+			return -EINVAL;
+		usk->maxtry = val;
+		break;
+
+	case UDPCP_OPT_OUTSTANDING_ACKS:
+		if ((val < 1) || (val > 255))
+			return -EINVAL;
+		usk->acks = val;
+		break;
+
+	default:
+		return -ENOPROTOOPT;
+	}
+	return 0;
+}
+
+/*
+ * Get socket options
+ */
+static int udpcp_getsockopt(struct sock *sk, int level, int optname,
+			    char __user *optval, int __user *optlen)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int val, len, ret;
+
+	if (level != SOL_UDPCP) {
+		if (udp_prot.getsockopt) {
+			ret = udp_prot.getsockopt(sk, level, optname, optval,
+						optlen);
+			check_timeout(sk);
+			return ret;
+		}
+		return -ENOPROTOOPT;
+	}
+
+	if (get_user(len, optlen))
+		return -EFAULT;
+
+	len = min_t(unsigned int, len, sizeof(int));
+
+	if (len < 0)
+		return -EINVAL;
+
+	switch (optname) {
+	case UDPCP_OPT_TRANSFER_MODE:
+		val = usk->ackmode;
+		break;
+
+	case UDPCP_OPT_CHECKSUM_MODE:
+		val = usk->chkmode;
+		break;
+
+	case UDPCP_OPT_TX_TIMEOUT:
+		val = jiffies_to_msecs(usk->tx_timeout);
+		break;
+
+	case UDPCP_OPT_MAXTRY:
+		val = usk->maxtry;
+		break;
+
+	case UDPCP_OPT_OUTSTANDING_ACKS:
+		val = usk->acks;
+		break;
+
+	default:
+		return -ENOPROTOOPT;
+	}
+
+	if (put_user(len, optlen))
+		return -EFAULT;
+	if (copy_to_user(optval, &val, len))
+		return -EFAULT;
+	return 0;
+}
+
+/*
+ * ioctl() requests applicable to the UDPCP protocol
+ */
+int udpcp_ioctl(struct sock *sk, int cmd, unsigned long arg)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int ret = 0;
+
+	switch (cmd) {
+	case UDPCP_IOCTL_GET_STATISTICS:
+		lock_sock(sk);
+		if (copy_to_user((void *)arg, &usk->stat, sizeof(usk->stat)))
+			ret = -EFAULT;
+		udpcp_release_sock(sk);
+		break;
+
+	case UDPCP_IOCTL_RESET_STATISTICS:
+		lock_sock(sk);
+		usk->stat.txMsgs = 0;
+		usk->stat.rxMsgs = 0;
+		usk->stat.txTimeout = 0;
+		usk->stat.rxTimeout = 0;
+		usk->stat.txRetries = 0;
+		usk->stat.rxDiscardedFrags = 0;
+		usk->stat.crcErrors = 0;
+		udpcp_release_sock(sk);
+		break;
+
+	case UDPCP_IOCTL_SYNC:
+		if (arg)
+			ret = wait_event_interruptible_timeout(usk->wq,
+				!usk->pending, msecs_to_jiffies(arg));
+		else
+			ret = wait_event_interruptible(usk->wq, !usk->pending);
+
+		break;
+
+	default:
+		if (udp_prot.ioctl) {
+			ret = udp_prot.ioctl(sk, cmd, arg);
+			check_timeout(sk);
+		} else {
+			ret = -ENOIOCTLCMD;
+		}
+		break;
+	}
+	return ret;
+}
+
+/*
+ * This function will be called by recv(), recvfrom() and revmsg()
+ */
+int udpcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
+		  size_t len, int noblock, int flags, int *addr_len)
+{
+	int ret;
+
+	ret = udp_prot.recvmsg(iocb, sk, msg, len, noblock, flags, addr_len);
+	check_timeout(sk);
+	return ret;
+}
+
+/*
+ * This function will be called by socket() and initialized the socket
+ */
+static int udpcp_sockinit(struct sock *sk)
+{
+	int ret;
+	struct udpcp_sock *usk;
+
+	sk->sk_protocol = SOL_UDP;
+	sk->sk_allocation = GFP_ATOMIC;
+	if (udp_prot.init) {
+		ret = udp_prot.init(sk);
+
+		if (ret)
+			return ret;
+	}
+
+	usk = udpcp_sk(sk);
+	usk->timer.expires = 0;
+	usk->timer.function = udpcp_timeout;
+	usk->timer.data = (long)sk;
+	init_timer(&usk->timer);
+	INIT_LIST_HEAD(&usk->destlist);
+	init_waitqueue_head(&usk->wq);
+	usk->pending = 0;
+	usk->ackmode = UDPCP_ACK;
+	usk->chkmode = UDPCP_CHECKSUM;
+	usk->maxtry = UDPCP_TX_MAXTRY;
+	usk->acks = UDPCP_OUTSTANDING_ACKS;
+	usk->tx_timeout = msecs_to_jiffies(UDPCP_TX_TIMEOUT);
+	usk->rx_timeout = msecs_to_jiffies(UDPCP_RX_TIMEOUT);
+	usk->udp_data_ready = sk->sk_data_ready;
+	sk->sk_data_ready = udpcp_data_ready;
+	usk->udpsock.pending = 0;
+	skb_queue_head_init(&usk->assembly);
+	usk->assembly_len = 0;
+	usk->assembly_dest = NULL;
+
+	spin_lock_bh(&udpcp_lock);
+	list_add_tail(&usk->udpcplist, &udpcp_list);
+	spin_unlock_bh(&udpcp_lock);
+
+#ifdef MODULE
+	try_module_get(THIS_MODULE);
+#endif
+	return 0;
+}
+
+/*
+ * This function will be called by close()
+ */
+static void udpcp_destroy(struct sock *sk)
+{
+	struct list_head *p;
+	struct list_head *n;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	spin_lock_bh(&udpcp_lock);
+	list_del(&usk->udpcplist);
+	spin_unlock_bh(&udpcp_lock);
+
+	if (udp_prot.destroy)
+		udp_prot.destroy(sk);
+
+	lock_sock(sk);
+
+	del_timer_sync(&usk->timer);
+	sk->sk_data_ready = usk->udp_data_ready;
+
+	skb_queue_purge(&usk->assembly);
+
+	list_for_each_safe(p, n, &usk->destlist) {
+		struct udpcp_dest *dest;
+
+		dest = list_to_udpcpdest(p);
+
+		skb_queue_purge(&dest->xmit);
+
+		kfree_skb(dest->recv_msg);
+
+		if (dest->rt)
+			dst_release(&dest->rt->dst);
+
+		kfree(dest);
+	}
+
+	atomic_sub(usk->stat.txNodes, &udpcp_tx_nodes);
+	atomic_sub(usk->stat.rxNodes, &udpcp_rx_nodes);
+
+	usk->pending = 0;
+
+	if (waitqueue_active(&usk->wq))
+		wake_up_interruptible(&usk->wq);
+
+	release_sock(sk);
+
+#ifdef MODULE
+	module_put(THIS_MODULE);
+#endif
+}
+
+static struct proto udpcp_prot;
+
+/*
+ * inet protocol stack descriptor
+ */
+static struct inet_protosw udpcp_protosw = {
+	.type = SOCK_DGRAM,
+	.protocol = PF_UDPCP,
+	.prot = &udpcp_prot,
+	.ops = &inet_dgram_ops,
+	.no_check = UDP_CSUM_DEFAULT,
+	.flags = 0,
+};
+
+#ifdef CONFIG_PROC_FS
+/*
+ * The following functions handles the /proc/net/udpcp entry
+ */
+struct udpcp_seq_afinfo {
+	char *name;
+	const struct file_operations seq_fops;
+	const struct seq_operations seq_ops;
+};
+
+struct udpcp_iter_state {
+	struct seq_net_private p;
+	struct sock *sk;
+	struct list_head *list;
+	int bucket;
+};
+
+static int udpcp_get_destlist(struct udpcp_sock *usk,
+			struct udpcp_iter_state *state)
+{
+	struct sock *sk = (struct sock *)usk;
+
+	if (sock_flag(sk, SOCK_DEAD))
+		return 0;
+
+	sock_hold(sk);
+	if (!list_empty(&usk->destlist)) {
+		state->sk = sk;
+		state->list = &usk->destlist;
+		return 1;
+	}
+	sock_put(sk);
+
+	return 0;
+}
+
+static inline int udpcp_next_dest(struct udpcp_iter_state *state)
+{
+	struct sock *sk = state->sk;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int found = 0;
+
+	if (sock_flag(sk, SOCK_DEAD))
+		return 0;
+
+	lock_sock(sk);
+	if (!list_is_last(state->list, &usk->destlist)) {
+		state->list = state->list->next;
+		state->bucket++;
+		found = 1;
+	}
+	udpcp_release_sock(sk);
+	return found;
+}
+
+static void *udpcp_get_next(struct seq_file *seq)
+{
+	struct udpcp_iter_state *state = seq->private;
+	struct udpcp_sock *usk;
+	struct sock *sk;
+
+	while (state) {
+		if (udpcp_next_dest(state))
+			return state;
+
+		sk = state->sk;
+		usk = udpcp_sk(sk);
+
+		spin_lock_bh(&udpcp_lock);
+		while (!list_is_last(&usk->udpcplist, &udpcp_list)) {
+			usk = list_entry(usk->udpcplist.next, struct udpcp_sock,
+				       udpcplist);
+
+			if (udpcp_get_destlist(usk, state))
+				goto found;
+		}
+		state->sk = NULL;
+		state = NULL;
+found:
+		spin_unlock_bh(&udpcp_lock);
+		sock_put(sk);
+	}
+	return state;
+}
+
+static void *udpcp_get_first(struct seq_file *seq)
+{
+	struct list_head *p;
+	struct udpcp_iter_state *state = seq->private;
+	int found = 0;
+
+	if (!state)
+		return NULL;
+
+	spin_lock_bh(&udpcp_lock);
+	list_for_each(p, &udpcp_list) {
+		found = udpcp_get_destlist(list_to_udpcpsock(p), state);
+		if (found)
+			goto found;
+	}
+found:
+	spin_unlock_bh(&udpcp_lock);
+
+	if (!found)
+		return NULL;
+	return udpcp_get_next(seq);
+}
+
+static void *udpcp_get_idx(struct seq_file *seq, loff_t pos)
+{
+	if (!udpcp_get_first(seq))
+		return NULL;
+
+	while (pos--) {
+		if (!udpcp_get_next(seq))
+			return NULL;
+	}
+	return seq->private;
+}
+
+static void *udpcp_seq_start(struct seq_file *seq, loff_t * pos)
+{
+	return *pos ? udpcp_get_idx(seq, *pos - 1) : SEQ_START_TOKEN;
+}
+
+static void *udpcp_seq_next(struct seq_file *seq, void *v, loff_t * pos)
+{
+	void *private;
+
+	if (v == SEQ_START_TOKEN)
+		private = udpcp_get_idx(seq, 0);
+	else
+		private = udpcp_get_next(seq);
+
+	++*pos;
+	return private;
+}
+
+static void udpcp_seq_stop(struct seq_file *seq, void *v)
+{
+	struct udpcp_iter_state *state = seq->private;
+
+	if (state->sk)
+		sock_put(state->sk);
+}
+
+static int udpcp_seq_open(struct inode *inode, struct file *file)
+{
+	struct udpcp_seq_afinfo *afinfo = PDE(inode)->data;
+	int err;
+
+	err = seq_open_net(inode, file, &afinfo->seq_ops,
+			   sizeof(struct udpcp_iter_state));
+	if (err < 0)
+		return err;
+
+	return err;
+}
+
+int udpcp_proc_register(struct net *net, struct udpcp_seq_afinfo *afinfo)
+{
+	struct proc_dir_entry *p;
+	int rc = 0;
+
+	p = proc_create_data(afinfo->name, S_IRUGO, net->proc_net,
+			     &afinfo->seq_fops, afinfo);
+	if (!p)
+		rc = -ENOMEM;
+	return rc;
+}
+
+void udpcp_proc_unregister(struct net *net, struct udpcp_seq_afinfo *afinfo)
+{
+	proc_net_remove(net, afinfo->name);
+}
+
+static unsigned int udpcp_tx_queue_len(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct sk_buff *skb;
+	unsigned int n = 0;
+
+	skb_queue_walk(&dest->xmit, skb)
+	    n += skb->len;
+	return n;
+}
+
+static unsigned int udpcp_rx_queue_len(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct sk_buff *skb;
+	unsigned int n = 0;
+
+	skb_queue_walk(&sk->sk_receive_queue, skb) {
+		if (udp_hdr(skb)->source == dest->port
+		    && ip_hdr(skb)->saddr == dest->addr)
+			n += skb->len;
+	}
+	return n;
+}
+
+static void udpcp_format_sock(struct seq_file *seq, int *len)
+{
+	struct udpcp_iter_state *state = seq->private;
+	struct sock *sk = state->sk;
+	struct inet_sock *inet = inet_sk(sk);
+	struct udpcp_dest *p = list_to_udpcpdest(state->list);
+	__be32 src = inet->inet_rcv_saddr;
+	__u16 srcp = ntohs(inet->inet_sport);
+	__be32 dest = p->addr;
+	__u16 destp = ntohs(p->port);
+
+	lock_sock(sk);
+	seq_printf(seq, "%4d: %08X:%04X %08X:%04X"
+		   " %02X %08X:%08X %02X:%08lX %08X %5d %8d %lu %d %p %u%n",
+		   state->bucket, src, srcp, dest, destp, sk->sk_state,
+		   udpcp_tx_queue_len(sk, p),
+		   udpcp_rx_queue_len(sk, p),
+		   0, 0L, p->tx_retries, sock_i_uid(sk),
+		   p->tx_timeout, sock_i_ino(sk),
+		   atomic_read(&sk->sk_refcnt), sk, p->rx_timeout,
+		   len);
+	udpcp_release_sock(sk);
+}
+
+int udpcp_seq_show(struct seq_file *seq, void *v)
+{
+	if (v == SEQ_START_TOKEN) {
+		seq_printf(seq, "%-127s\n",
+			   "  sl  local_address rem_address   st tx_queue "
+			   "rx_queue tr tm->when retrnsmt   uid  timeout "
+			   "inode ref pointer drops");
+	} else {
+		int len;
+
+		udpcp_format_sock(seq, &len);
+		seq_printf(seq, "%*s\n", 127 - len, "");
+	}
+	return 0;
+}
+
+static struct udpcp_seq_afinfo udpcp_seq_afinfo = {
+	.name = "udpcp",
+	.seq_fops = {
+			.owner = THIS_MODULE,
+			.open = udpcp_seq_open,
+			.read = seq_read,
+			.llseek = seq_lseek,
+			.release = seq_release_net,
+		     },
+	.seq_ops = {
+			.show = udpcp_seq_show,
+			.start = udpcp_seq_start,
+			.next = udpcp_seq_next,
+			.stop = udpcp_seq_stop,
+		    },
+};
+
+static int udpcp_proc_init_net(struct net *net)
+{
+	return udpcp_proc_register(net, &udpcp_seq_afinfo);
+}
+
+static void udpcp_proc_exit_net(struct net *net)
+{
+	udpcp_proc_unregister(net, &udpcp_seq_afinfo);
+}
+
+static struct pernet_operations udpcp_net_ops = {
+	.init = udpcp_proc_init_net,
+	.exit = udpcp_proc_exit_net,
+};
+
+static int __init udpcp_proc_init(void)
+{
+	return register_pernet_subsys(&udpcp_net_ops);
+}
+
+static void udpcp_proc_exit(void)
+{
+	unregister_pernet_subsys(&udpcp_net_ops);
+}
+#endif /* CONFIG_PROC_FS */
+
+/*
+ * Install and init module
+ */
+static int __init udpcp_init(void)
+{
+	int ret;
+	struct proc_dir_entry *proc_entry = NULL;
+
+	spin_lock_init(&udpcp_lock);
+
+	INIT_LIST_HEAD(&udpcp_list);
+
+	/*
+	 * to prevent to rewrite the whole UDP protocol,
+	 * assign struct proto udp to the struct proto udpcp
+	 */
+	udpcp_prot = udp_prot;
+
+	/*
+	 * change the protocol name
+	 */
+	strcpy(udpcp_prot.name, "UDPCP");
+
+	/*
+	 * overload the following function, all other
+	 * functions will use the UDP protocol functions
+	 */
+	udpcp_prot.sendmsg = udpcp_sendmsg;
+	udpcp_prot.sendpage = udpcp_sendpage;
+	udpcp_prot.init = udpcp_sockinit;
+	udpcp_prot.destroy = udpcp_destroy;
+	udpcp_prot.setsockopt = udpcp_setsockopt;
+	udpcp_prot.getsockopt = udpcp_getsockopt;
+	udpcp_prot.ioctl = udpcp_ioctl;
+	udpcp_prot.recvmsg = udpcp_recvmsg;
+
+	/*
+	 * fix the object size for the embedded udpcp_sock structure
+	 */
+	udpcp_prot.obj_size = sizeof(struct udpcp_sock);
+
+	/*
+	 * register the UDPCP protocol
+	 */
+	ret = proto_register(&udpcp_prot, 1);
+	if (ret)
+		return ret;
+
+	/*
+	 * register the inet socket for UDPCP
+	 */
+	inet_register_protosw(&udpcp_protosw);
+
+	/*
+	 * register the /proc/sys/net/ipv4/udpcp_ entries
+	 */
+	udpcp_ctl_table =
+		register_sysctl_paths(net_ipv4_ctl_path, ipv4_udpcp_table);
+	if (udpcp_ctl_table == NULL) {
+		ret = -ENOMEM;
+		goto err1;
+	}
+
+#ifdef CONFIG_PROC_FS
+	/*
+	 * register /proc/driver/udpcp entry
+	 */
+	proc_entry =
+	    create_proc_read_entry(UDPCP_PROC, S_IRUSR | S_IRGRP | S_IROTH,
+				   NULL, udpcp_proc, NULL);
+
+	if (!proc_entry) {
+		ret = -ENOMEM;
+		goto err2;
+	}
+	/*
+	 * register /proc/net/udpcp entry
+	 */
+	ret = udpcp_proc_init();
+
+	if (ret)
+		goto err3;
+#endif
+	pr_info("UDPCP protocol stack version " VERSION "\n");
+	return 0;
+#ifdef CONFIG_PROC_FS
+err3:
+	remove_proc_entry(UDPCP_PROC, NULL);
+err2:
+	unregister_sysctl_table(udpcp_ctl_table);
+#endif
+err1:
+	inet_unregister_protosw(&udpcp_protosw);
+	proto_unregister(&udpcp_prot);
+	return ret;
+}
+
+/*
+ * Cleanup and exit module
+ */
+static void __exit udpcp_exit(void)
+{
+#ifdef CONFIG_PROC_FS
+	udpcp_proc_exit();
+	remove_proc_entry(UDPCP_PROC, NULL);
+#endif
+	unregister_sysctl_table(udpcp_ctl_table);
+	inet_unregister_protosw(&udpcp_protosw);
+	proto_unregister(&udpcp_prot);
+}
+
+module_init(udpcp_init);
+module_exit(udpcp_exit);
+
+MODULE_AUTHOR("Stefani Seibold <stefani@seibold.net>");
+MODULE_DESCRIPTION("UDPCP protocol stack v" VERSION);
+MODULE_LICENSE("GPL");
+
-- 
1.7.3.4

^ permalink raw reply related

* Re: [PATCH] new UDPCP Communication Protocol
From: Eric Dumazet @ 2011-01-02 22:49 UTC (permalink / raw)
  To: stefani
  Cc: linux-kernel, akpm, davem, netdev, shemminger, jj, daniel.baluta,
	jochen
In-Reply-To: <1294007971-18878-1-git-send-email-stefani@seibold.net>

Le dimanche 02 janvier 2011 à 23:39 +0100, stefani@seibold.net a écrit :
> +
> +/*
> + * Create a new destination descriptor for the given IPV4 address and port
> + */
> +static struct udpcp_dest *new_dest(struct sock *sk, __be32 addr, __be16 port)
> +{
> +	struct udpcp_dest *dest;
> +	struct udpcp_sock *usk = udpcp_sk(sk);
> +
> +	if (usk->connections >= udpcp_max_connections)
> +		return NULL;
> +
> +	dest = kzalloc(sizeof(*dest), sk->sk_allocation);
> +
> +	if (dest) {
> +		usk->connections++;
> +		skb_queue_head_init(&dest->xmit);
> +		dest->addr = addr;
> +		dest->port = port;
> +		dest->ackmode = UDPCP_ACK;
> +		list_add_tail(&dest->list, &usk->destlist);
> +	}
> +
> +	return dest;
> +}
> +

Hmm, so 'connections' is increased, never decreased.

This seems a fatal flaw in this protocol, since a malicious user can
easily fill the list with garbage, and block regular communications.




^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Stefani Seibold @ 2011-01-02 22:55 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: linux-kernel, akpm, davem, netdev, shemminger, jj, daniel.baluta,
	jochen
In-Reply-To: <1294008562.2535.263.camel@edumazet-laptop>

Am Sonntag, den 02.01.2011, 23:49 +0100 schrieb Eric Dumazet:
> Le dimanche 02 janvier 2011 à 23:39 +0100, stefani@seibold.net a écrit :
> > +
> > +/*
> > + * Create a new destination descriptor for the given IPV4 address and port
> > + */
> > +static struct udpcp_dest *new_dest(struct sock *sk, __be32 addr, __be16 port)
> > +{
> > +	struct udpcp_dest *dest;
> > +	struct udpcp_sock *usk = udpcp_sk(sk);
> > +
> > +	if (usk->connections >= udpcp_max_connections)
> > +		return NULL;
> > +
> > +	dest = kzalloc(sizeof(*dest), sk->sk_allocation);
> > +
> > +	if (dest) {
> > +		usk->connections++;
> > +		skb_queue_head_init(&dest->xmit);
> > +		dest->addr = addr;
> > +		dest->port = port;
> > +		dest->ackmode = UDPCP_ACK;
> > +		list_add_tail(&dest->list, &usk->destlist);
> > +	}
> > +
> > +	return dest;
> > +}
> > +
> 
> Hmm, so 'connections' is increased, never decreased.
> 
> This seems a fatal flaw in this protocol, since a malicious user can
> easily fill the list with garbage, and block regular communications.

You are right, there is now way to detect which connection is no longer
needed. I have not designed this protocol, so i cannot fix it. 

But in our environment this will be used together with an firewall
and/or ipsec. In this case it it safe.

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Jesper Juhl @ 2011-01-02 23:04 UTC (permalink / raw)
  To: Stefani Seibold
  Cc: Eric Dumazet, linux-kernel, akpm, davem, netdev, shemminger,
	daniel.baluta, jochen
In-Reply-To: <1294008917.18963.3.camel@wall-e>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2153 bytes --]

On Sun, 2 Jan 2011, Stefani Seibold wrote:

> Am Sonntag, den 02.01.2011, 23:49 +0100 schrieb Eric Dumazet:
> > Le dimanche 02 janvier 2011 à 23:39 +0100, stefani@seibold.net a écrit :
> > > +
> > > +/*
> > > + * Create a new destination descriptor for the given IPV4 address and port
> > > + */
> > > +static struct udpcp_dest *new_dest(struct sock *sk, __be32 addr, __be16 port)
> > > +{
> > > +	struct udpcp_dest *dest;
> > > +	struct udpcp_sock *usk = udpcp_sk(sk);
> > > +
> > > +	if (usk->connections >= udpcp_max_connections)
> > > +		return NULL;
> > > +
> > > +	dest = kzalloc(sizeof(*dest), sk->sk_allocation);
> > > +
> > > +	if (dest) {
> > > +		usk->connections++;
> > > +		skb_queue_head_init(&dest->xmit);
> > > +		dest->addr = addr;
> > > +		dest->port = port;
> > > +		dest->ackmode = UDPCP_ACK;
> > > +		list_add_tail(&dest->list, &usk->destlist);
> > > +	}
> > > +
> > > +	return dest;
> > > +}
> > > +
> > 
> > Hmm, so 'connections' is increased, never decreased.
> > 
> > This seems a fatal flaw in this protocol, since a malicious user can
> > easily fill the list with garbage, and block regular communications.
> 
> You are right, there is now way to detect which connection is no longer
> needed. I have not designed this protocol, so i cannot fix it. 
> 
> But in our environment this will be used together with an firewall
> and/or ipsec. In this case it it safe.
> 

Hmm, the first thing that springs into my head as a possible band-aid 
(which is probbaly wrong for many reasons I've not considered, so feel 
free to shoot it down) is; couldn't we use a timer (set to some outrageous 
high value by default and admin tunable) that would decrement 
'connections' (discount dead connections) when there has not been any 
acctivity for a huge period of time? Kill off connections that have been 
idle for ages.

Not perfect, but that would at least let the system recover after a while 
if a malicious client did something nasty with many connections...


-- 
Jesper Juhl <jj@chaosbits.net>            http://www.chaosbits.net/
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please.

^ permalink raw reply

* [net-next-2.6 00/08] r8169 driver updates
From: Francois Romieu @ 2011-01-02 23:35 UTC (permalink / raw)
  To: davem; +Cc: netdev, Hayes Wang, Ben Hutchings, David Woodhouse

The following series includes:
- Hayes'firmware removal. The firmware is already included in linux-firmware.
- my sauce to diminish the gap between the in-kernel r8169 driver
  and Realtek's one(s).

I have not noticed any regression while doing some basic testing with the
chipsets below (Realtek's own label are included between parenthesis).

- RTL8169sc/8110sc XID 98000000 (RTL8110SC)
- RTL8102e         XID 14c00000 (RTL8103E-GR)
- RTL8102e         XID 04a00000 (RTL8102E-VB-GR)
- RTL8101e         XID 00a00000 (RTL8105E-VG)
- RTL8168d/8111d   XID 083000c0 (RTL8111D-VB-GR)
- RTL8168d/8111d   XID 081000c0 (RTL8168D-GR)

The series is available as
git://git.kernel.org/pub/scm/linux/kernel/git/romieu/netdev-2.6.git r8169-davem

Distance from 'net-next' (40cd201e37073b3e2281cf2c73fcf5674f22267f)
-------------------------------------------------------------------

c853fd7221941839af1799d8f47daa2ae8cf65ef
ffbbf19c870a92b841a224db14cda64ad81598cf
82636521a5a60abc6ccadbfed1b2424d5be6bb49
ee37a075949e32bb336a54c93abc7d4771898264
a19878e168201c09f6f631bbef40b67a0e374078
dfbd1ebbd95cdc63f31b59d62bf9e652657d7241
d7b9a3f0f524085cc8fee0388d369cfb43e588c7
60631b47144e6f5d0003eaf66d376046d592b78f

Diffstat
--------

 drivers/net/r8169.c | 1632 +++++++++++++++++++++++++--------------------------
 1 files changed, 784 insertions(+), 848 deletions(-)

Shortlog
--------

Francois Romieu (7):
      r8169: identify different registers.
      r8169: use device dependent methods to access the MII registers.
      r8169: 8168DP specific MII registers access methods.
      r8169: phy power ops
      r8169: magic.
      r8169: rtl_csi_access_enable rename.
      r8169: more 8168dp support.

Hayes Wang (1):
      r8169: remove the firmware of RTL8111D.

-- 
Ueimor

^ permalink raw reply

* [net-next-2.6 01/08] r8169: remove the firmware of RTL8111D.
From: Francois Romieu @ 2011-01-02 23:35 UTC (permalink / raw)
  To: davem; +Cc: netdev, Hayes Wang, Ben Hutchings, David Woodhouse

The binary file of the firmware is moved to linux-firmware repository.
The firmwares are rtl_nic/rtl8168d-1.fw and rtl_nic/rtl8168d-2.fw.
The driver goes along if the firmware couldn't be found. However, it
is suggested to be done with the suitable firmware.

Some wrong PHY parameters are directly corrected in the driver.

Simple firmware checking added per Ben Hutchings suggestion.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Ben Hutchings <benh@debian.org>
---
 drivers/net/r8169.c |  812 +++++++++------------------------------------------
 1 files changed, 133 insertions(+), 679 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index e165d96..3124462 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -24,6 +24,7 @@
 #include <linux/init.h>
 #include <linux/dma-mapping.h>
 #include <linux/pm_runtime.h>
+#include <linux/firmware.h>
 
 #include <asm/system.h>
 #include <asm/io.h>
@@ -33,6 +34,9 @@
 #define MODULENAME "r8169"
 #define PFX MODULENAME ": "
 
+#define FIRMWARE_8168D_1	"rtl_nic/rtl8168d-1.fw"
+#define FIRMWARE_8168D_2	"rtl_nic/rtl8168d-2.fw"
+
 #ifdef RTL8169_DEBUG
 #define assert(expr) \
 	if (!(expr)) {					\
@@ -514,6 +518,8 @@ module_param_named(debug, debug.msg_enable, int, 0);
 MODULE_PARM_DESC(debug, "Debug verbosity level (0=none, ..., 16=all)");
 MODULE_LICENSE("GPL");
 MODULE_VERSION(RTL8169_VERSION);
+MODULE_FIRMWARE(FIRMWARE_8168D_1);
+MODULE_FIRMWARE(FIRMWARE_8168D_2);
 
 static int rtl8169_open(struct net_device *dev);
 static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
@@ -1393,6 +1399,65 @@ static void rtl_phy_write(void __iomem *ioaddr, const struct phy_reg *regs, int
 	}
 }
 
+#define PHY_READ		0x00000000
+#define PHY_DATA_OR		0x10000000
+#define PHY_DATA_AND		0x20000000
+#define PHY_BJMPN		0x30000000
+#define PHY_READ_EFUSE		0x40000000
+#define PHY_READ_MAC_BYTE	0x50000000
+#define PHY_WRITE_MAC_BYTE	0x60000000
+#define PHY_CLEAR_READCOUNT	0x70000000
+#define PHY_WRITE		0x80000000
+#define PHY_READCOUNT_EQ_SKIP	0x90000000
+#define PHY_COMP_EQ_SKIPN	0xa0000000
+#define PHY_COMP_NEQ_SKIPN	0xb0000000
+#define PHY_WRITE_PREVIOUS	0xc0000000
+#define PHY_SKIPN		0xd0000000
+#define PHY_DELAY_MS		0xe0000000
+#define PHY_WRITE_ERI_WORD	0xf0000000
+
+static void
+rtl_phy_write_fw(struct rtl8169_private *tp, const struct firmware *fw)
+{
+	void __iomem *ioaddr = tp->mmio_addr;
+	__le32 *phytable = (__le32 *)fw->data;
+	struct net_device *dev = tp->dev;
+	size_t i;
+
+	if (fw->size % sizeof(*phytable)) {
+		netif_err(tp, probe, dev, "odd sized firmware %zd\n", fw->size);
+		return;
+	}
+
+	for (i = 0; i < fw->size / sizeof(*phytable); i++) {
+		u32 action = le32_to_cpu(phytable[i]);
+
+		if (!action)
+			break;
+
+		if ((action & 0xf0000000) != PHY_WRITE) {
+			netif_err(tp, probe, dev,
+				  "unknown action 0x%08x\n", action);
+			return;
+		}
+	}
+
+	while (i-- != 0) {
+		u32 action = le32_to_cpu(*phytable);
+		u32 data = action & 0x0000ffff;
+		u32 reg = (action & 0x0fff0000) >> 16;
+
+		switch(action & 0xf0000000) {
+		case PHY_WRITE:
+			mdio_write(ioaddr, reg, data);
+			phytable++;
+			break;
+		default:
+			BUG();
+		}
+	}
+}
+
 static void rtl8169s_hw_phy_config(void __iomem *ioaddr)
 {
 	static const struct phy_reg phy_reg_init[] = {
@@ -1725,9 +1790,10 @@ static void rtl8168c_4_hw_phy_config(void __iomem *ioaddr)
 	rtl8168c_3_hw_phy_config(ioaddr);
 }
 
-static void rtl8168d_1_hw_phy_config(void __iomem *ioaddr)
+static void rtl8168d_1_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init_0[] = {
+		/* Channel Estimation */
 		{ 0x1f, 0x0001 },
 		{ 0x06, 0x4064 },
 		{ 0x07, 0x2863 },
@@ -1744,379 +1810,41 @@ static void rtl8168d_1_hw_phy_config(void __iomem *ioaddr)
 		{ 0x12, 0xf49f },
 		{ 0x13, 0x070b },
 		{ 0x1a, 0x05ad },
-		{ 0x14, 0x94c0 }
-	};
-	static const struct phy_reg phy_reg_init_1[] = {
+		{ 0x14, 0x94c0 },
+
+		/*
+		 * Tx Error Issue
+		 * enhance line driver power
+		 */
 		{ 0x1f, 0x0002 },
 		{ 0x06, 0x5561 },
 		{ 0x1f, 0x0005 },
 		{ 0x05, 0x8332 },
-		{ 0x06, 0x5561 }
-	};
-	static const struct phy_reg phy_reg_init_2[] = {
-		{ 0x1f, 0x0005 },
-		{ 0x05, 0xffc2 },
-		{ 0x1f, 0x0005 },
-		{ 0x05, 0x8000 },
-		{ 0x06, 0xf8f9 },
-		{ 0x06, 0xfaef },
-		{ 0x06, 0x59ee },
-		{ 0x06, 0xf8ea },
-		{ 0x06, 0x00ee },
-		{ 0x06, 0xf8eb },
-		{ 0x06, 0x00e0 },
-		{ 0x06, 0xf87c },
-		{ 0x06, 0xe1f8 },
-		{ 0x06, 0x7d59 },
-		{ 0x06, 0x0fef },
-		{ 0x06, 0x0139 },
-		{ 0x06, 0x029e },
-		{ 0x06, 0x06ef },
-		{ 0x06, 0x1039 },
-		{ 0x06, 0x089f },
-		{ 0x06, 0x2aee },
-		{ 0x06, 0xf8ea },
-		{ 0x06, 0x00ee },
-		{ 0x06, 0xf8eb },
-		{ 0x06, 0x01e0 },
-		{ 0x06, 0xf87c },
-		{ 0x06, 0xe1f8 },
-		{ 0x06, 0x7d58 },
-		{ 0x06, 0x409e },
-		{ 0x06, 0x0f39 },
-		{ 0x06, 0x46aa },
-		{ 0x06, 0x0bbf },
-		{ 0x06, 0x8290 },
-		{ 0x06, 0xd682 },
-		{ 0x06, 0x9802 },
-		{ 0x06, 0x014f },
-		{ 0x06, 0xae09 },
-		{ 0x06, 0xbf82 },
-		{ 0x06, 0x98d6 },
-		{ 0x06, 0x82a0 },
-		{ 0x06, 0x0201 },
-		{ 0x06, 0x4fef },
-		{ 0x06, 0x95fe },
-		{ 0x06, 0xfdfc },
-		{ 0x06, 0x05f8 },
-		{ 0x06, 0xf9fa },
-		{ 0x06, 0xeef8 },
-		{ 0x06, 0xea00 },
-		{ 0x06, 0xeef8 },
-		{ 0x06, 0xeb00 },
-		{ 0x06, 0xe2f8 },
-		{ 0x06, 0x7ce3 },
-		{ 0x06, 0xf87d },
-		{ 0x06, 0xa511 },
-		{ 0x06, 0x1112 },
-		{ 0x06, 0xd240 },
-		{ 0x06, 0xd644 },
-		{ 0x06, 0x4402 },
-		{ 0x06, 0x8217 },
-		{ 0x06, 0xd2a0 },
-		{ 0x06, 0xd6aa },
-		{ 0x06, 0xaa02 },
-		{ 0x06, 0x8217 },
-		{ 0x06, 0xae0f },
-		{ 0x06, 0xa544 },
-		{ 0x06, 0x4402 },
-		{ 0x06, 0xae4d },
-		{ 0x06, 0xa5aa },
-		{ 0x06, 0xaa02 },
-		{ 0x06, 0xae47 },
-		{ 0x06, 0xaf82 },
-		{ 0x06, 0x13ee },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x00ee },
-		{ 0x06, 0x834d },
-		{ 0x06, 0x0fee },
-		{ 0x06, 0x834c },
-		{ 0x06, 0x0fee },
-		{ 0x06, 0x834f },
-		{ 0x06, 0x00ee },
-		{ 0x06, 0x8351 },
-		{ 0x06, 0x00ee },
-		{ 0x06, 0x834a },
-		{ 0x06, 0xffee },
-		{ 0x06, 0x834b },
-		{ 0x06, 0xffe0 },
-		{ 0x06, 0x8330 },
-		{ 0x06, 0xe183 },
-		{ 0x06, 0x3158 },
-		{ 0x06, 0xfee4 },
-		{ 0x06, 0xf88a },
-		{ 0x06, 0xe5f8 },
-		{ 0x06, 0x8be0 },
-		{ 0x06, 0x8332 },
-		{ 0x06, 0xe183 },
-		{ 0x06, 0x3359 },
-		{ 0x06, 0x0fe2 },
-		{ 0x06, 0x834d },
-		{ 0x06, 0x0c24 },
-		{ 0x06, 0x5af0 },
-		{ 0x06, 0x1e12 },
-		{ 0x06, 0xe4f8 },
-		{ 0x06, 0x8ce5 },
-		{ 0x06, 0xf88d },
-		{ 0x06, 0xaf82 },
-		{ 0x06, 0x13e0 },
-		{ 0x06, 0x834f },
-		{ 0x06, 0x10e4 },
-		{ 0x06, 0x834f },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4e78 },
-		{ 0x06, 0x009f },
-		{ 0x06, 0x0ae0 },
-		{ 0x06, 0x834f },
-		{ 0x06, 0xa010 },
-		{ 0x06, 0xa5ee },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x01e0 },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x7805 },
-		{ 0x06, 0x9e9a },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4e78 },
-		{ 0x06, 0x049e },
-		{ 0x06, 0x10e0 },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x7803 },
-		{ 0x06, 0x9e0f },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4e78 },
-		{ 0x06, 0x019e },
-		{ 0x06, 0x05ae },
-		{ 0x06, 0x0caf },
-		{ 0x06, 0x81f8 },
-		{ 0x06, 0xaf81 },
-		{ 0x06, 0xa3af },
-		{ 0x06, 0x81dc },
-		{ 0x06, 0xaf82 },
-		{ 0x06, 0x13ee },
-		{ 0x06, 0x8348 },
-		{ 0x06, 0x00ee },
-		{ 0x06, 0x8349 },
-		{ 0x06, 0x00e0 },
-		{ 0x06, 0x8351 },
-		{ 0x06, 0x10e4 },
-		{ 0x06, 0x8351 },
-		{ 0x06, 0x5801 },
-		{ 0x06, 0x9fea },
-		{ 0x06, 0xd000 },
-		{ 0x06, 0xd180 },
-		{ 0x06, 0x1f66 },
-		{ 0x06, 0xe2f8 },
-		{ 0x06, 0xeae3 },
-		{ 0x06, 0xf8eb },
-		{ 0x06, 0x5af8 },
-		{ 0x06, 0x1e20 },
-		{ 0x06, 0xe6f8 },
-		{ 0x06, 0xeae5 },
-		{ 0x06, 0xf8eb },
-		{ 0x06, 0xd302 },
-		{ 0x06, 0xb3fe },
-		{ 0x06, 0xe2f8 },
-		{ 0x06, 0x7cef },
-		{ 0x06, 0x325b },
-		{ 0x06, 0x80e3 },
-		{ 0x06, 0xf87d },
-		{ 0x06, 0x9e03 },
-		{ 0x06, 0x7dff },
-		{ 0x06, 0xff0d },
-		{ 0x06, 0x581c },
-		{ 0x06, 0x551a },
-		{ 0x06, 0x6511 },
-		{ 0x06, 0xa190 },
-		{ 0x06, 0xd3e2 },
-		{ 0x06, 0x8348 },
-		{ 0x06, 0xe383 },
-		{ 0x06, 0x491b },
-		{ 0x06, 0x56ab },
-		{ 0x06, 0x08ef },
-		{ 0x06, 0x56e6 },
-		{ 0x06, 0x8348 },
-		{ 0x06, 0xe783 },
-		{ 0x06, 0x4910 },
-		{ 0x06, 0xd180 },
-		{ 0x06, 0x1f66 },
-		{ 0x06, 0xa004 },
-		{ 0x06, 0xb9e2 },
-		{ 0x06, 0x8348 },
-		{ 0x06, 0xe383 },
-		{ 0x06, 0x49ef },
-		{ 0x06, 0x65e2 },
-		{ 0x06, 0x834a },
-		{ 0x06, 0xe383 },
-		{ 0x06, 0x4b1b },
-		{ 0x06, 0x56aa },
-		{ 0x06, 0x0eef },
-		{ 0x06, 0x56e6 },
-		{ 0x06, 0x834a },
-		{ 0x06, 0xe783 },
-		{ 0x06, 0x4be2 },
-		{ 0x06, 0x834d },
-		{ 0x06, 0xe683 },
-		{ 0x06, 0x4ce0 },
-		{ 0x06, 0x834d },
-		{ 0x06, 0xa000 },
-		{ 0x06, 0x0caf },
-		{ 0x06, 0x81dc },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4d10 },
-		{ 0x06, 0xe483 },
-		{ 0x06, 0x4dae },
-		{ 0x06, 0x0480 },
-		{ 0x06, 0xe483 },
-		{ 0x06, 0x4de0 },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x7803 },
-		{ 0x06, 0x9e0b },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4e78 },
-		{ 0x06, 0x049e },
-		{ 0x06, 0x04ee },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x02e0 },
-		{ 0x06, 0x8332 },
-		{ 0x06, 0xe183 },
-		{ 0x06, 0x3359 },
-		{ 0x06, 0x0fe2 },
-		{ 0x06, 0x834d },
-		{ 0x06, 0x0c24 },
-		{ 0x06, 0x5af0 },
-		{ 0x06, 0x1e12 },
-		{ 0x06, 0xe4f8 },
-		{ 0x06, 0x8ce5 },
-		{ 0x06, 0xf88d },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x30e1 },
-		{ 0x06, 0x8331 },
-		{ 0x06, 0x6801 },
-		{ 0x06, 0xe4f8 },
-		{ 0x06, 0x8ae5 },
-		{ 0x06, 0xf88b },
-		{ 0x06, 0xae37 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4e03 },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4ce1 },
-		{ 0x06, 0x834d },
-		{ 0x06, 0x1b01 },
-		{ 0x06, 0x9e04 },
-		{ 0x06, 0xaaa1 },
-		{ 0x06, 0xaea8 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4e04 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4f00 },
-		{ 0x06, 0xaeab },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4f78 },
-		{ 0x06, 0x039f },
-		{ 0x06, 0x14ee },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x05d2 },
-		{ 0x06, 0x40d6 },
-		{ 0x06, 0x5554 },
-		{ 0x06, 0x0282 },
-		{ 0x06, 0x17d2 },
-		{ 0x06, 0xa0d6 },
-		{ 0x06, 0xba00 },
-		{ 0x06, 0x0282 },
-		{ 0x06, 0x17fe },
-		{ 0x06, 0xfdfc },
-		{ 0x06, 0x05f8 },
-		{ 0x06, 0xe0f8 },
-		{ 0x06, 0x60e1 },
-		{ 0x06, 0xf861 },
-		{ 0x06, 0x6802 },
-		{ 0x06, 0xe4f8 },
-		{ 0x06, 0x60e5 },
-		{ 0x06, 0xf861 },
-		{ 0x06, 0xe0f8 },
-		{ 0x06, 0x48e1 },
-		{ 0x06, 0xf849 },
-		{ 0x06, 0x580f },
-		{ 0x06, 0x1e02 },
-		{ 0x06, 0xe4f8 },
-		{ 0x06, 0x48e5 },
-		{ 0x06, 0xf849 },
-		{ 0x06, 0xd000 },
-		{ 0x06, 0x0282 },
-		{ 0x06, 0x5bbf },
-		{ 0x06, 0x8350 },
-		{ 0x06, 0xef46 },
-		{ 0x06, 0xdc19 },
-		{ 0x06, 0xddd0 },
-		{ 0x06, 0x0102 },
-		{ 0x06, 0x825b },
-		{ 0x06, 0x0282 },
-		{ 0x06, 0x77e0 },
-		{ 0x06, 0xf860 },
-		{ 0x06, 0xe1f8 },
-		{ 0x06, 0x6158 },
-		{ 0x06, 0xfde4 },
-		{ 0x06, 0xf860 },
-		{ 0x06, 0xe5f8 },
-		{ 0x06, 0x61fc },
-		{ 0x06, 0x04f9 },
-		{ 0x06, 0xfafb },
-		{ 0x06, 0xc6bf },
-		{ 0x06, 0xf840 },
-		{ 0x06, 0xbe83 },
-		{ 0x06, 0x50a0 },
-		{ 0x06, 0x0101 },
-		{ 0x06, 0x071b },
-		{ 0x06, 0x89cf },
-		{ 0x06, 0xd208 },
-		{ 0x06, 0xebdb },
-		{ 0x06, 0x19b2 },
-		{ 0x06, 0xfbff },
-		{ 0x06, 0xfefd },
-		{ 0x06, 0x04f8 },
-		{ 0x06, 0xe0f8 },
-		{ 0x06, 0x48e1 },
-		{ 0x06, 0xf849 },
-		{ 0x06, 0x6808 },
-		{ 0x06, 0xe4f8 },
-		{ 0x06, 0x48e5 },
-		{ 0x06, 0xf849 },
-		{ 0x06, 0x58f7 },
-		{ 0x06, 0xe4f8 },
-		{ 0x06, 0x48e5 },
-		{ 0x06, 0xf849 },
-		{ 0x06, 0xfc04 },
-		{ 0x06, 0x4d20 },
-		{ 0x06, 0x0002 },
-		{ 0x06, 0x4e22 },
-		{ 0x06, 0x0002 },
-		{ 0x06, 0x4ddf },
-		{ 0x06, 0xff01 },
-		{ 0x06, 0x4edd },
-		{ 0x06, 0xff01 },
-		{ 0x05, 0x83d4 },
-		{ 0x06, 0x8000 },
-		{ 0x05, 0x83d8 },
-		{ 0x06, 0x8051 },
-		{ 0x02, 0x6010 },
-		{ 0x03, 0xdc00 },
-		{ 0x05, 0xfff6 },
-		{ 0x06, 0x00fc },
-		{ 0x1f, 0x0000 },
+		{ 0x06, 0x5561 },
+
+		/*
+		 * Can not link to 1Gbps with bad cable
+		 * Decrease SNR threshold form 21.07dB to 19.04dB
+		 */
+		{ 0x1f, 0x0001 },
+		{ 0x17, 0x0cc0 },
 
 		{ 0x1f, 0x0000 },
-		{ 0x0d, 0xf880 },
-		{ 0x1f, 0x0000 }
+		{ 0x0d, 0xf880 }
 	};
+	void __iomem *ioaddr = tp->mmio_addr;
+	const struct firmware *fw;
 
 	rtl_phy_write(ioaddr, phy_reg_init_0, ARRAY_SIZE(phy_reg_init_0));
 
+	/*
+	 * Rx Error Issue
+	 * Fine Tune Switching regulator parameter
+	 */
 	mdio_write(ioaddr, 0x1f, 0x0002);
 	mdio_plus_minus(ioaddr, 0x0b, 0x0010, 0x00ef);
 	mdio_plus_minus(ioaddr, 0x0c, 0xa200, 0x5d00);
 
-	rtl_phy_write(ioaddr, phy_reg_init_1, ARRAY_SIZE(phy_reg_init_1));
-
 	if (rtl8168d_efuse_read(ioaddr, 0x01) == 0xb1) {
 		static const struct phy_reg phy_reg_init[] = {
 			{ 0x1f, 0x0002 },
@@ -2157,20 +1885,33 @@ static void rtl8168d_1_hw_phy_config(void __iomem *ioaddr)
 		rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 	}
 
+	/* RSET couple improve */
 	mdio_write(ioaddr, 0x1f, 0x0002);
 	mdio_patch(ioaddr, 0x0d, 0x0300);
 	mdio_patch(ioaddr, 0x0f, 0x0010);
 
+	/* Fine tune PLL performance */
 	mdio_write(ioaddr, 0x1f, 0x0002);
 	mdio_plus_minus(ioaddr, 0x02, 0x0100, 0x0600);
 	mdio_plus_minus(ioaddr, 0x03, 0x0000, 0xe000);
 
-	rtl_phy_write(ioaddr, phy_reg_init_2, ARRAY_SIZE(phy_reg_init_2));
+	mdio_write(ioaddr, 0x1f, 0x0005);
+	mdio_write(ioaddr, 0x05, 0x001b);
+	if (mdio_read(ioaddr, 0x06) == 0xbf00 &&
+	    request_firmware(&fw, FIRMWARE_8168D_1, &tp->pci_dev->dev) == 0) {
+		rtl_phy_write_fw(tp, fw);
+		release_firmware(fw);
+	} else {
+		netif_warn(tp, probe, tp->dev, "unable to apply firmware patch\n");
+	}
+
+	mdio_write(ioaddr, 0x1f, 0x0000);
 }
 
-static void rtl8168d_2_hw_phy_config(void __iomem *ioaddr)
+static void rtl8168d_2_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init_0[] = {
+		/* Channel Estimation */
 		{ 0x1f, 0x0001 },
 		{ 0x06, 0x4064 },
 		{ 0x07, 0x2863 },
@@ -2189,324 +1930,28 @@ static void rtl8168d_2_hw_phy_config(void __iomem *ioaddr)
 		{ 0x1a, 0x05ad },
 		{ 0x14, 0x94c0 },
 
+		/*
+		 * Tx Error Issue
+		 * enhance line driver power
+		 */
 		{ 0x1f, 0x0002 },
 		{ 0x06, 0x5561 },
 		{ 0x1f, 0x0005 },
 		{ 0x05, 0x8332 },
-		{ 0x06, 0x5561 }
-	};
-	static const struct phy_reg phy_reg_init_1[] = {
-		{ 0x1f, 0x0005 },
-		{ 0x05, 0xffc2 },
-		{ 0x1f, 0x0005 },
-		{ 0x05, 0x8000 },
-		{ 0x06, 0xf8f9 },
-		{ 0x06, 0xfaee },
-		{ 0x06, 0xf8ea },
-		{ 0x06, 0x00ee },
-		{ 0x06, 0xf8eb },
-		{ 0x06, 0x00e2 },
-		{ 0x06, 0xf87c },
-		{ 0x06, 0xe3f8 },
-		{ 0x06, 0x7da5 },
-		{ 0x06, 0x1111 },
-		{ 0x06, 0x12d2 },
-		{ 0x06, 0x40d6 },
-		{ 0x06, 0x4444 },
-		{ 0x06, 0x0281 },
-		{ 0x06, 0xc6d2 },
-		{ 0x06, 0xa0d6 },
-		{ 0x06, 0xaaaa },
-		{ 0x06, 0x0281 },
-		{ 0x06, 0xc6ae },
-		{ 0x06, 0x0fa5 },
-		{ 0x06, 0x4444 },
-		{ 0x06, 0x02ae },
-		{ 0x06, 0x4da5 },
-		{ 0x06, 0xaaaa },
-		{ 0x06, 0x02ae },
-		{ 0x06, 0x47af },
-		{ 0x06, 0x81c2 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4e00 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4d0f },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4c0f },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4f00 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x5100 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4aff },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4bff },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x30e1 },
-		{ 0x06, 0x8331 },
-		{ 0x06, 0x58fe },
-		{ 0x06, 0xe4f8 },
-		{ 0x06, 0x8ae5 },
-		{ 0x06, 0xf88b },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x32e1 },
-		{ 0x06, 0x8333 },
-		{ 0x06, 0x590f },
-		{ 0x06, 0xe283 },
-		{ 0x06, 0x4d0c },
-		{ 0x06, 0x245a },
-		{ 0x06, 0xf01e },
-		{ 0x06, 0x12e4 },
-		{ 0x06, 0xf88c },
-		{ 0x06, 0xe5f8 },
-		{ 0x06, 0x8daf },
-		{ 0x06, 0x81c2 },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4f10 },
-		{ 0x06, 0xe483 },
-		{ 0x06, 0x4fe0 },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x7800 },
-		{ 0x06, 0x9f0a },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4fa0 },
-		{ 0x06, 0x10a5 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4e01 },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4e78 },
-		{ 0x06, 0x059e },
-		{ 0x06, 0x9ae0 },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x7804 },
-		{ 0x06, 0x9e10 },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4e78 },
-		{ 0x06, 0x039e },
-		{ 0x06, 0x0fe0 },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x7801 },
-		{ 0x06, 0x9e05 },
-		{ 0x06, 0xae0c },
-		{ 0x06, 0xaf81 },
-		{ 0x06, 0xa7af },
-		{ 0x06, 0x8152 },
-		{ 0x06, 0xaf81 },
-		{ 0x06, 0x8baf },
-		{ 0x06, 0x81c2 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4800 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4900 },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x5110 },
-		{ 0x06, 0xe483 },
-		{ 0x06, 0x5158 },
-		{ 0x06, 0x019f },
-		{ 0x06, 0xead0 },
-		{ 0x06, 0x00d1 },
-		{ 0x06, 0x801f },
-		{ 0x06, 0x66e2 },
-		{ 0x06, 0xf8ea },
-		{ 0x06, 0xe3f8 },
-		{ 0x06, 0xeb5a },
-		{ 0x06, 0xf81e },
-		{ 0x06, 0x20e6 },
-		{ 0x06, 0xf8ea },
-		{ 0x06, 0xe5f8 },
-		{ 0x06, 0xebd3 },
-		{ 0x06, 0x02b3 },
-		{ 0x06, 0xfee2 },
-		{ 0x06, 0xf87c },
-		{ 0x06, 0xef32 },
-		{ 0x06, 0x5b80 },
-		{ 0x06, 0xe3f8 },
-		{ 0x06, 0x7d9e },
-		{ 0x06, 0x037d },
-		{ 0x06, 0xffff },
-		{ 0x06, 0x0d58 },
-		{ 0x06, 0x1c55 },
-		{ 0x06, 0x1a65 },
-		{ 0x06, 0x11a1 },
-		{ 0x06, 0x90d3 },
-		{ 0x06, 0xe283 },
-		{ 0x06, 0x48e3 },
-		{ 0x06, 0x8349 },
-		{ 0x06, 0x1b56 },
-		{ 0x06, 0xab08 },
-		{ 0x06, 0xef56 },
-		{ 0x06, 0xe683 },
-		{ 0x06, 0x48e7 },
-		{ 0x06, 0x8349 },
-		{ 0x06, 0x10d1 },
-		{ 0x06, 0x801f },
-		{ 0x06, 0x66a0 },
-		{ 0x06, 0x04b9 },
-		{ 0x06, 0xe283 },
-		{ 0x06, 0x48e3 },
-		{ 0x06, 0x8349 },
-		{ 0x06, 0xef65 },
-		{ 0x06, 0xe283 },
-		{ 0x06, 0x4ae3 },
-		{ 0x06, 0x834b },
-		{ 0x06, 0x1b56 },
-		{ 0x06, 0xaa0e },
-		{ 0x06, 0xef56 },
-		{ 0x06, 0xe683 },
-		{ 0x06, 0x4ae7 },
-		{ 0x06, 0x834b },
-		{ 0x06, 0xe283 },
-		{ 0x06, 0x4de6 },
-		{ 0x06, 0x834c },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4da0 },
-		{ 0x06, 0x000c },
-		{ 0x06, 0xaf81 },
-		{ 0x06, 0x8be0 },
-		{ 0x06, 0x834d },
-		{ 0x06, 0x10e4 },
-		{ 0x06, 0x834d },
-		{ 0x06, 0xae04 },
-		{ 0x06, 0x80e4 },
-		{ 0x06, 0x834d },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x4e78 },
-		{ 0x06, 0x039e },
-		{ 0x06, 0x0be0 },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x7804 },
-		{ 0x06, 0x9e04 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4e02 },
-		{ 0x06, 0xe083 },
-		{ 0x06, 0x32e1 },
-		{ 0x06, 0x8333 },
-		{ 0x06, 0x590f },
-		{ 0x06, 0xe283 },
-		{ 0x06, 0x4d0c },
-		{ 0x06, 0x245a },
-		{ 0x06, 0xf01e },
-		{ 0x06, 0x12e4 },
-		{ 0x06, 0xf88c },
-		{ 0x06, 0xe5f8 },
-		{ 0x06, 0x8de0 },
-		{ 0x06, 0x8330 },
-		{ 0x06, 0xe183 },
-		{ 0x06, 0x3168 },
-		{ 0x06, 0x01e4 },
-		{ 0x06, 0xf88a },
-		{ 0x06, 0xe5f8 },
-		{ 0x06, 0x8bae },
-		{ 0x06, 0x37ee },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x03e0 },
-		{ 0x06, 0x834c },
-		{ 0x06, 0xe183 },
-		{ 0x06, 0x4d1b },
-		{ 0x06, 0x019e },
-		{ 0x06, 0x04aa },
-		{ 0x06, 0xa1ae },
-		{ 0x06, 0xa8ee },
-		{ 0x06, 0x834e },
-		{ 0x06, 0x04ee },
-		{ 0x06, 0x834f },
-		{ 0x06, 0x00ae },
-		{ 0x06, 0xabe0 },
-		{ 0x06, 0x834f },
-		{ 0x06, 0x7803 },
-		{ 0x06, 0x9f14 },
-		{ 0x06, 0xee83 },
-		{ 0x06, 0x4e05 },
-		{ 0x06, 0xd240 },
-		{ 0x06, 0xd655 },
-		{ 0x06, 0x5402 },
-		{ 0x06, 0x81c6 },
-		{ 0x06, 0xd2a0 },
-		{ 0x06, 0xd6ba },
-		{ 0x06, 0x0002 },
-		{ 0x06, 0x81c6 },
-		{ 0x06, 0xfefd },
-		{ 0x06, 0xfc05 },
-		{ 0x06, 0xf8e0 },
-		{ 0x06, 0xf860 },
-		{ 0x06, 0xe1f8 },
-		{ 0x06, 0x6168 },
-		{ 0x06, 0x02e4 },
-		{ 0x06, 0xf860 },
-		{ 0x06, 0xe5f8 },
-		{ 0x06, 0x61e0 },
-		{ 0x06, 0xf848 },
-		{ 0x06, 0xe1f8 },
-		{ 0x06, 0x4958 },
-		{ 0x06, 0x0f1e },
-		{ 0x06, 0x02e4 },
-		{ 0x06, 0xf848 },
-		{ 0x06, 0xe5f8 },
-		{ 0x06, 0x49d0 },
-		{ 0x06, 0x0002 },
-		{ 0x06, 0x820a },
-		{ 0x06, 0xbf83 },
-		{ 0x06, 0x50ef },
-		{ 0x06, 0x46dc },
-		{ 0x06, 0x19dd },
-		{ 0x06, 0xd001 },
-		{ 0x06, 0x0282 },
-		{ 0x06, 0x0a02 },
-		{ 0x06, 0x8226 },
-		{ 0x06, 0xe0f8 },
-		{ 0x06, 0x60e1 },
-		{ 0x06, 0xf861 },
-		{ 0x06, 0x58fd },
-		{ 0x06, 0xe4f8 },
-		{ 0x06, 0x60e5 },
-		{ 0x06, 0xf861 },
-		{ 0x06, 0xfc04 },
-		{ 0x06, 0xf9fa },
-		{ 0x06, 0xfbc6 },
-		{ 0x06, 0xbff8 },
-		{ 0x06, 0x40be },
-		{ 0x06, 0x8350 },
-		{ 0x06, 0xa001 },
-		{ 0x06, 0x0107 },
-		{ 0x06, 0x1b89 },
-		{ 0x06, 0xcfd2 },
-		{ 0x06, 0x08eb },
-		{ 0x06, 0xdb19 },
-		{ 0x06, 0xb2fb },
-		{ 0x06, 0xfffe },
-		{ 0x06, 0xfd04 },
-		{ 0x06, 0xf8e0 },
-		{ 0x06, 0xf848 },
-		{ 0x06, 0xe1f8 },
-		{ 0x06, 0x4968 },
-		{ 0x06, 0x08e4 },
-		{ 0x06, 0xf848 },
-		{ 0x06, 0xe5f8 },
-		{ 0x06, 0x4958 },
-		{ 0x06, 0xf7e4 },
-		{ 0x06, 0xf848 },
-		{ 0x06, 0xe5f8 },
-		{ 0x06, 0x49fc },
-		{ 0x06, 0x044d },
-		{ 0x06, 0x2000 },
-		{ 0x06, 0x024e },
-		{ 0x06, 0x2200 },
-		{ 0x06, 0x024d },
-		{ 0x06, 0xdfff },
-		{ 0x06, 0x014e },
-		{ 0x06, 0xddff },
-		{ 0x06, 0x0100 },
-		{ 0x05, 0x83d8 },
-		{ 0x06, 0x8000 },
-		{ 0x03, 0xdc00 },
-		{ 0x05, 0xfff6 },
-		{ 0x06, 0x00fc },
-		{ 0x1f, 0x0000 },
+		{ 0x06, 0x5561 },
+
+		/*
+		 * Can not link to 1Gbps with bad cable
+		 * Decrease SNR threshold form 21.07dB to 19.04dB
+		 */
+		{ 0x1f, 0x0001 },
+		{ 0x17, 0x0cc0 },
 
 		{ 0x1f, 0x0000 },
-		{ 0x0d, 0xf880 },
-		{ 0x1f, 0x0000 }
+		{ 0x0d, 0xf880 }
 	};
+	void __iomem *ioaddr = tp->mmio_addr;
+	const struct firmware *fw;
 
 	rtl_phy_write(ioaddr, phy_reg_init_0, ARRAY_SIZE(phy_reg_init_0));
 
@@ -2550,17 +1995,26 @@ static void rtl8168d_2_hw_phy_config(void __iomem *ioaddr)
 		rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 	}
 
+	/* Fine tune PLL performance */
 	mdio_write(ioaddr, 0x1f, 0x0002);
 	mdio_plus_minus(ioaddr, 0x02, 0x0100, 0x0600);
 	mdio_plus_minus(ioaddr, 0x03, 0x0000, 0xe000);
 
-	mdio_write(ioaddr, 0x1f, 0x0001);
-	mdio_write(ioaddr, 0x17, 0x0cc0);
-
+	/* Switching regulator Slew rate */
 	mdio_write(ioaddr, 0x1f, 0x0002);
 	mdio_patch(ioaddr, 0x0f, 0x0017);
 
-	rtl_phy_write(ioaddr, phy_reg_init_1, ARRAY_SIZE(phy_reg_init_1));
+	mdio_write(ioaddr, 0x1f, 0x0005);
+	mdio_write(ioaddr, 0x05, 0x001b);
+	if (mdio_read(ioaddr, 0x06) == 0xb300 &&
+	    request_firmware(&fw, FIRMWARE_8168D_2, &tp->pci_dev->dev) == 0) {
+		rtl_phy_write_fw(tp, fw);
+		release_firmware(fw);
+	} else {
+		netif_warn(tp, probe, tp->dev, "unable to apply firmware patch\n");
+	}
+
+	mdio_write(ioaddr, 0x1f, 0x0000);
 }
 
 static void rtl8168d_3_hw_phy_config(void __iomem *ioaddr)
@@ -2698,10 +2152,10 @@ static void rtl_hw_phy_config(struct net_device *dev)
 		rtl8168cp_2_hw_phy_config(ioaddr);
 		break;
 	case RTL_GIGA_MAC_VER_25:
-		rtl8168d_1_hw_phy_config(ioaddr);
+		rtl8168d_1_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_26:
-		rtl8168d_2_hw_phy_config(ioaddr);
+		rtl8168d_2_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_27:
 		rtl8168d_3_hw_phy_config(ioaddr);
-- 
1.7.3.4


^ permalink raw reply related

* [net-next-2.6 02/08] r8169: identify different registers.
From: Francois Romieu @ 2011-01-02 23:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, Hayes Wang, Ben Hutchings, David Woodhouse

Documentation (sort of).

The location are the same, the values are the same but it is
just accidental. Note that the 810x could cope with a smaller
value as it does not support jumbo frames.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes <hayeswang@realtek.com>
---
 drivers/net/r8169.c |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 3124462..8aa92ad 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -67,7 +67,6 @@ static const int multicast_filter_limit = 32;
 #define RX_FIFO_THRESH	7	/* 7 means NO threshold, Rx buffer level before first PCI xfer. */
 #define RX_DMA_BURST	6	/* Maximum PCI burst, '6' is 1024 */
 #define TX_DMA_BURST	6	/* Maximum PCI burst, '6' is 1024 */
-#define EarlyTxThld	0x3F	/* 0x3F means NO early transmit */
 #define SafeMtu		0x1c20	/* ... actually life sucks beyond ~7k */
 #define InterFrameGap	0x03	/* 3 means InterFrameGap = the shortest one */
 
@@ -231,7 +230,14 @@ enum rtl_registers {
 	IntrMitigate	= 0xe2,
 	RxDescAddrLow	= 0xe4,
 	RxDescAddrHigh	= 0xe8,
-	EarlyTxThres	= 0xec,
+	EarlyTxThres	= 0xec,	/* 8169. Unit of 32 bytes. */
+
+#define NoEarlyTx	0x3f	/* Max value : no early transmit. */
+
+	MaxTxPacketSize	= 0xec,	/* 8101/8168. Unit of 128 bytes. */
+
+#define TxPacketMax	(8064 >> 7)
+
 	FuncEvent	= 0xf0,
 	FuncEventMask	= 0xf4,
 	FuncPresetState	= 0xf8,
@@ -2901,7 +2907,7 @@ static void rtl_hw_start_8169(struct net_device *dev)
 	    (tp->mac_version == RTL_GIGA_MAC_VER_04))
 		RTL_W8(ChipCmd, CmdTxEnb | CmdRxEnb);
 
-	RTL_W8(EarlyTxThres, EarlyTxThld);
+	RTL_W8(EarlyTxThres, NoEarlyTx);
 
 	rtl_set_rx_max_size(ioaddr, rx_buf_sz);
 
@@ -3036,7 +3042,7 @@ static void rtl_hw_start_8168bef(void __iomem *ioaddr, struct pci_dev *pdev)
 {
 	rtl_hw_start_8168bb(ioaddr, pdev);
 
-	RTL_W8(EarlyTxThres, EarlyTxThld);
+	RTL_W8(MaxTxPacketSize, 0x3f);
 
 	RTL_W8(Config4, RTL_R8(Config4) & ~(1 << 0));
 }
@@ -3091,7 +3097,7 @@ static void rtl_hw_start_8168cp_3(void __iomem *ioaddr, struct pci_dev *pdev)
 	/* Magic. */
 	RTL_W8(DBG_REG, 0x20);
 
-	RTL_W8(EarlyTxThres, EarlyTxThld);
+	RTL_W8(MaxTxPacketSize, TxPacketMax);
 
 	rtl_tx_performance_tweak(pdev, 0x5 << MAX_READ_REQUEST_SHIFT);
 
@@ -3147,7 +3153,7 @@ static void rtl_hw_start_8168d(void __iomem *ioaddr, struct pci_dev *pdev)
 
 	rtl_disable_clock_request(pdev);
 
-	RTL_W8(EarlyTxThres, EarlyTxThld);
+	RTL_W8(MaxTxPacketSize, TxPacketMax);
 
 	rtl_tx_performance_tweak(pdev, 0x5 << MAX_READ_REQUEST_SHIFT);
 
@@ -3162,7 +3168,7 @@ static void rtl_hw_start_8168(struct net_device *dev)
 
 	RTL_W8(Cfg9346, Cfg9346_Unlock);
 
-	RTL_W8(EarlyTxThres, EarlyTxThld);
+	RTL_W8(MaxTxPacketSize, TxPacketMax);
 
 	rtl_set_rx_max_size(ioaddr, rx_buf_sz);
 
@@ -3342,7 +3348,7 @@ static void rtl_hw_start_8101(struct net_device *dev)
 
 	RTL_W8(Cfg9346, Cfg9346_Unlock);
 
-	RTL_W8(EarlyTxThres, EarlyTxThld);
+	RTL_W8(MaxTxPacketSize, TxPacketMax);
 
 	rtl_set_rx_max_size(ioaddr, rx_buf_sz);
 
-- 
1.7.3.4


^ permalink raw reply related

* [net-next-2.6 03/08] r8169: use device dependent methods to access the MII registers.
From: Francois Romieu @ 2011-01-02 23:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, Hayes Wang, Ben Hutchings

Current mdio_{read/write} needs device specific information to work
correctly with newer chipsets.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes <hayeswang@realtek.com>
---
 drivers/net/r8169.c |  307 ++++++++++++++++++++++++++-------------------------
 1 files changed, 157 insertions(+), 150 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 8aa92ad..f9d8ff0 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -502,9 +502,9 @@ struct rtl8169_private {
 #endif
 	int (*set_speed)(struct net_device *, u8 autoneg, u16 speed, u8 duplex);
 	int (*get_settings)(struct net_device *, struct ethtool_cmd *);
-	void (*phy_reset_enable)(void __iomem *);
+	void (*phy_reset_enable)(struct rtl8169_private *tp);
 	void (*hw_start)(struct net_device *);
-	unsigned int (*phy_reset_pending)(void __iomem *);
+	unsigned int (*phy_reset_pending)(struct rtl8169_private *tp);
 	unsigned int (*link_ok)(void __iomem *);
 	int (*do_ioctl)(struct rtl8169_private *tp, struct mii_ioctl_data *data, int cmd);
 	int pcie_cap;
@@ -547,7 +547,7 @@ static int rtl8169_poll(struct napi_struct *napi, int budget);
 static const unsigned int rtl8169_rx_config =
 	(RX_FIFO_THRESH << RxCfgFIFOShift) | (RX_DMA_BURST << RxCfgDMAShift);
 
-static void mdio_write(void __iomem *ioaddr, int reg_addr, int value)
+static void r8169_mdio_write(void __iomem *ioaddr, int reg_addr, int value)
 {
 	int i;
 
@@ -569,7 +569,7 @@ static void mdio_write(void __iomem *ioaddr, int reg_addr, int value)
 	udelay(20);
 }
 
-static int mdio_read(void __iomem *ioaddr, int reg_addr)
+static int r8169_mdio_read(void __iomem *ioaddr, int reg_addr)
 {
 	int i, value = -1;
 
@@ -595,34 +595,42 @@ static int mdio_read(void __iomem *ioaddr, int reg_addr)
 	return value;
 }
 
-static void mdio_patch(void __iomem *ioaddr, int reg_addr, int value)
+static void rtl_writephy(struct rtl8169_private *tp, int location, u32 val)
 {
-	mdio_write(ioaddr, reg_addr, mdio_read(ioaddr, reg_addr) | value);
+	r8169_mdio_write(tp->mmio_addr, location, val);
 }
 
-static void mdio_plus_minus(void __iomem *ioaddr, int reg_addr, int p, int m)
+static int rtl_readphy(struct rtl8169_private *tp, int location)
+{
+	return r8169_mdio_read(tp->mmio_addr, location);
+}
+
+static void rtl_patchphy(struct rtl8169_private *tp, int reg_addr, int value)
+{
+	rtl_writephy(tp, reg_addr, rtl_readphy(tp, reg_addr) | value);
+}
+
+static void rtl_w1w0_phy(struct rtl8169_private *tp, int reg_addr, int p, int m)
 {
 	int val;
 
-	val = mdio_read(ioaddr, reg_addr);
-	mdio_write(ioaddr, reg_addr, (val | p) & ~m);
+	val = rtl_readphy(tp, reg_addr);
+	rtl_writephy(tp, reg_addr, (val | p) & ~m);
 }
 
 static void rtl_mdio_write(struct net_device *dev, int phy_id, int location,
 			   int val)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
-	void __iomem *ioaddr = tp->mmio_addr;
 
-	mdio_write(ioaddr, location, val);
+	rtl_writephy(tp, location, val);
 }
 
 static int rtl_mdio_read(struct net_device *dev, int phy_id, int location)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
-	void __iomem *ioaddr = tp->mmio_addr;
 
-	return mdio_read(ioaddr, location);
+	return rtl_readphy(tp, location);
 }
 
 static void rtl_ephy_write(void __iomem *ioaddr, int reg_addr, int value)
@@ -723,14 +731,16 @@ static void rtl8169_asic_down(void __iomem *ioaddr)
 	RTL_R16(CPlusCmd);
 }
 
-static unsigned int rtl8169_tbi_reset_pending(void __iomem *ioaddr)
+static unsigned int rtl8169_tbi_reset_pending(struct rtl8169_private *tp)
 {
+	void __iomem *ioaddr = tp->mmio_addr;
+
 	return RTL_R32(TBICSR) & TBIReset;
 }
 
-static unsigned int rtl8169_xmii_reset_pending(void __iomem *ioaddr)
+static unsigned int rtl8169_xmii_reset_pending(struct rtl8169_private *tp)
 {
-	return mdio_read(ioaddr, MII_BMCR) & BMCR_RESET;
+	return rtl_readphy(tp, MII_BMCR) & BMCR_RESET;
 }
 
 static unsigned int rtl8169_tbi_link_ok(void __iomem *ioaddr)
@@ -743,17 +753,19 @@ static unsigned int rtl8169_xmii_link_ok(void __iomem *ioaddr)
 	return RTL_R8(PHYstatus) & LinkStatus;
 }
 
-static void rtl8169_tbi_reset_enable(void __iomem *ioaddr)
+static void rtl8169_tbi_reset_enable(struct rtl8169_private *tp)
 {
+	void __iomem *ioaddr = tp->mmio_addr;
+
 	RTL_W32(TBICSR, RTL_R32(TBICSR) | TBIReset);
 }
 
-static void rtl8169_xmii_reset_enable(void __iomem *ioaddr)
+static void rtl8169_xmii_reset_enable(struct rtl8169_private *tp)
 {
 	unsigned int val;
 
-	val = mdio_read(ioaddr, MII_BMCR) | BMCR_RESET;
-	mdio_write(ioaddr, MII_BMCR, val & 0xffff);
+	val = rtl_readphy(tp, MII_BMCR) | BMCR_RESET;
+	rtl_writephy(tp, MII_BMCR, val & 0xffff);
 }
 
 static void __rtl8169_check_link_status(struct net_device *dev,
@@ -917,18 +929,17 @@ static int rtl8169_set_speed_xmii(struct net_device *dev,
 				  u8 autoneg, u16 speed, u8 duplex)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
-	void __iomem *ioaddr = tp->mmio_addr;
 	int giga_ctrl, bmcr;
 
 	if (autoneg == AUTONEG_ENABLE) {
 		int auto_nego;
 
-		auto_nego = mdio_read(ioaddr, MII_ADVERTISE);
+		auto_nego = rtl_readphy(tp, MII_ADVERTISE);
 		auto_nego |= (ADVERTISE_10HALF | ADVERTISE_10FULL |
 			      ADVERTISE_100HALF | ADVERTISE_100FULL);
 		auto_nego |= ADVERTISE_PAUSE_CAP | ADVERTISE_PAUSE_ASYM;
 
-		giga_ctrl = mdio_read(ioaddr, MII_CTRL1000);
+		giga_ctrl = rtl_readphy(tp, MII_CTRL1000);
 		giga_ctrl &= ~(ADVERTISE_1000FULL | ADVERTISE_1000HALF);
 
 		/* The 8100e/8101e/8102e do Fast Ethernet only. */
@@ -956,12 +967,12 @@ static int rtl8169_set_speed_xmii(struct net_device *dev,
 			 * Vendor specific (0x1f) and reserved (0x0e) MII
 			 * registers.
 			 */
-			mdio_write(ioaddr, 0x1f, 0x0000);
-			mdio_write(ioaddr, 0x0e, 0x0000);
+			rtl_writephy(tp, 0x1f, 0x0000);
+			rtl_writephy(tp, 0x0e, 0x0000);
 		}
 
-		mdio_write(ioaddr, MII_ADVERTISE, auto_nego);
-		mdio_write(ioaddr, MII_CTRL1000, giga_ctrl);
+		rtl_writephy(tp, MII_ADVERTISE, auto_nego);
+		rtl_writephy(tp, MII_CTRL1000, giga_ctrl);
 	} else {
 		giga_ctrl = 0;
 
@@ -975,21 +986,21 @@ static int rtl8169_set_speed_xmii(struct net_device *dev,
 		if (duplex == DUPLEX_FULL)
 			bmcr |= BMCR_FULLDPLX;
 
-		mdio_write(ioaddr, 0x1f, 0x0000);
+		rtl_writephy(tp, 0x1f, 0x0000);
 	}
 
 	tp->phy_1000_ctrl_reg = giga_ctrl;
 
-	mdio_write(ioaddr, MII_BMCR, bmcr);
+	rtl_writephy(tp, MII_BMCR, bmcr);
 
 	if ((tp->mac_version == RTL_GIGA_MAC_VER_02) ||
 	    (tp->mac_version == RTL_GIGA_MAC_VER_03)) {
 		if ((speed == SPEED_100) && (autoneg != AUTONEG_ENABLE)) {
-			mdio_write(ioaddr, 0x17, 0x2138);
-			mdio_write(ioaddr, 0x0e, 0x0260);
+			rtl_writephy(tp, 0x17, 0x2138);
+			rtl_writephy(tp, 0x0e, 0x0260);
 		} else {
-			mdio_write(ioaddr, 0x17, 0x2108);
-			mdio_write(ioaddr, 0x0e, 0x0000);
+			rtl_writephy(tp, 0x17, 0x2108);
+			rtl_writephy(tp, 0x0e, 0x0000);
 		}
 	}
 
@@ -1397,10 +1408,11 @@ struct phy_reg {
 	u16 val;
 };
 
-static void rtl_phy_write(void __iomem *ioaddr, const struct phy_reg *regs, int len)
+static void rtl_writephy_batch(struct rtl8169_private *tp,
+			       const struct phy_reg *regs, int len)
 {
 	while (len-- > 0) {
-		mdio_write(ioaddr, regs->reg, regs->val);
+		rtl_writephy(tp, regs->reg, regs->val);
 		regs++;
 	}
 }
@@ -1425,7 +1437,6 @@ static void rtl_phy_write(void __iomem *ioaddr, const struct phy_reg *regs, int
 static void
 rtl_phy_write_fw(struct rtl8169_private *tp, const struct firmware *fw)
 {
-	void __iomem *ioaddr = tp->mmio_addr;
 	__le32 *phytable = (__le32 *)fw->data;
 	struct net_device *dev = tp->dev;
 	size_t i;
@@ -1455,7 +1466,7 @@ rtl_phy_write_fw(struct rtl8169_private *tp, const struct firmware *fw)
 
 		switch(action & 0xf0000000) {
 		case PHY_WRITE:
-			mdio_write(ioaddr, reg, data);
+			rtl_writephy(tp, reg, data);
 			phytable++;
 			break;
 		default:
@@ -1464,7 +1475,7 @@ rtl_phy_write_fw(struct rtl8169_private *tp, const struct firmware *fw)
 	}
 }
 
-static void rtl8169s_hw_phy_config(void __iomem *ioaddr)
+static void rtl8169s_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0001 },
@@ -1528,10 +1539,10 @@ static void rtl8169s_hw_phy_config(void __iomem *ioaddr)
 		{ 0x00, 0x9200 }
 	};
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 }
 
-static void rtl8169sb_hw_phy_config(void __iomem *ioaddr)
+static void rtl8169sb_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0002 },
@@ -1539,11 +1550,10 @@ static void rtl8169sb_hw_phy_config(void __iomem *ioaddr)
 		{ 0x1f, 0x0000 }
 	};
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 }
 
-static void rtl8169scd_hw_phy_config_quirk(struct rtl8169_private *tp,
-					   void __iomem *ioaddr)
+static void rtl8169scd_hw_phy_config_quirk(struct rtl8169_private *tp)
 {
 	struct pci_dev *pdev = tp->pci_dev;
 	u16 vendor_id, device_id;
@@ -1554,13 +1564,12 @@ static void rtl8169scd_hw_phy_config_quirk(struct rtl8169_private *tp,
 	if ((vendor_id != PCI_VENDOR_ID_GIGABYTE) || (device_id != 0xe000))
 		return;
 
-	mdio_write(ioaddr, 0x1f, 0x0001);
-	mdio_write(ioaddr, 0x10, 0xf01b);
-	mdio_write(ioaddr, 0x1f, 0x0000);
+	rtl_writephy(tp, 0x1f, 0x0001);
+	rtl_writephy(tp, 0x10, 0xf01b);
+	rtl_writephy(tp, 0x1f, 0x0000);
 }
 
-static void rtl8169scd_hw_phy_config(struct rtl8169_private *tp,
-				     void __iomem *ioaddr)
+static void rtl8169scd_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0001 },
@@ -1602,12 +1611,12 @@ static void rtl8169scd_hw_phy_config(struct rtl8169_private *tp,
 		{ 0x1f, 0x0000 }
 	};
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 
-	rtl8169scd_hw_phy_config_quirk(tp, ioaddr);
+	rtl8169scd_hw_phy_config_quirk(tp);
 }
 
-static void rtl8169sce_hw_phy_config(void __iomem *ioaddr)
+static void rtl8169sce_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0001 },
@@ -1657,23 +1666,23 @@ static void rtl8169sce_hw_phy_config(void __iomem *ioaddr)
 		{ 0x1f, 0x0000 }
 	};
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 }
 
-static void rtl8168bb_hw_phy_config(void __iomem *ioaddr)
+static void rtl8168bb_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x10, 0xf41b },
 		{ 0x1f, 0x0000 }
 	};
 
-	mdio_write(ioaddr, 0x1f, 0x0001);
-	mdio_patch(ioaddr, 0x16, 1 << 0);
+	rtl_writephy(tp, 0x1f, 0x0001);
+	rtl_patchphy(tp, 0x16, 1 << 0);
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 }
 
-static void rtl8168bef_hw_phy_config(void __iomem *ioaddr)
+static void rtl8168bef_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0001 },
@@ -1681,10 +1690,10 @@ static void rtl8168bef_hw_phy_config(void __iomem *ioaddr)
 		{ 0x1f, 0x0000 }
 	};
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 }
 
-static void rtl8168cp_1_hw_phy_config(void __iomem *ioaddr)
+static void rtl8168cp_1_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0000 },
@@ -1694,10 +1703,10 @@ static void rtl8168cp_1_hw_phy_config(void __iomem *ioaddr)
 		{ 0x1f, 0x0000 }
 	};
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 }
 
-static void rtl8168cp_2_hw_phy_config(void __iomem *ioaddr)
+static void rtl8168cp_2_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0001 },
@@ -1705,14 +1714,14 @@ static void rtl8168cp_2_hw_phy_config(void __iomem *ioaddr)
 		{ 0x1f, 0x0000 }
 	};
 
-	mdio_write(ioaddr, 0x1f, 0x0000);
-	mdio_patch(ioaddr, 0x14, 1 << 5);
-	mdio_patch(ioaddr, 0x0d, 1 << 5);
+	rtl_writephy(tp, 0x1f, 0x0000);
+	rtl_patchphy(tp, 0x14, 1 << 5);
+	rtl_patchphy(tp, 0x0d, 1 << 5);
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 }
 
-static void rtl8168c_1_hw_phy_config(void __iomem *ioaddr)
+static void rtl8168c_1_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0001 },
@@ -1734,14 +1743,14 @@ static void rtl8168c_1_hw_phy_config(void __iomem *ioaddr)
 		{ 0x09, 0x0000 }
 	};
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 
-	mdio_patch(ioaddr, 0x14, 1 << 5);
-	mdio_patch(ioaddr, 0x0d, 1 << 5);
-	mdio_write(ioaddr, 0x1f, 0x0000);
+	rtl_patchphy(tp, 0x14, 1 << 5);
+	rtl_patchphy(tp, 0x0d, 1 << 5);
+	rtl_writephy(tp, 0x1f, 0x0000);
 }
 
-static void rtl8168c_2_hw_phy_config(void __iomem *ioaddr)
+static void rtl8168c_2_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0001 },
@@ -1761,15 +1770,15 @@ static void rtl8168c_2_hw_phy_config(void __iomem *ioaddr)
 		{ 0x1f, 0x0000 }
 	};
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 
-	mdio_patch(ioaddr, 0x16, 1 << 0);
-	mdio_patch(ioaddr, 0x14, 1 << 5);
-	mdio_patch(ioaddr, 0x0d, 1 << 5);
-	mdio_write(ioaddr, 0x1f, 0x0000);
+	rtl_patchphy(tp, 0x16, 1 << 0);
+	rtl_patchphy(tp, 0x14, 1 << 5);
+	rtl_patchphy(tp, 0x0d, 1 << 5);
+	rtl_writephy(tp, 0x1f, 0x0000);
 }
 
-static void rtl8168c_3_hw_phy_config(void __iomem *ioaddr)
+static void rtl8168c_3_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0001 },
@@ -1783,17 +1792,17 @@ static void rtl8168c_3_hw_phy_config(void __iomem *ioaddr)
 		{ 0x1f, 0x0000 }
 	};
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 
-	mdio_patch(ioaddr, 0x16, 1 << 0);
-	mdio_patch(ioaddr, 0x14, 1 << 5);
-	mdio_patch(ioaddr, 0x0d, 1 << 5);
-	mdio_write(ioaddr, 0x1f, 0x0000);
+	rtl_patchphy(tp, 0x16, 1 << 0);
+	rtl_patchphy(tp, 0x14, 1 << 5);
+	rtl_patchphy(tp, 0x0d, 1 << 5);
+	rtl_writephy(tp, 0x1f, 0x0000);
 }
 
-static void rtl8168c_4_hw_phy_config(void __iomem *ioaddr)
+static void rtl8168c_4_hw_phy_config(struct rtl8169_private *tp)
 {
-	rtl8168c_3_hw_phy_config(ioaddr);
+	rtl8168c_3_hw_phy_config(tp);
 }
 
 static void rtl8168d_1_hw_phy_config(struct rtl8169_private *tp)
@@ -1841,15 +1850,15 @@ static void rtl8168d_1_hw_phy_config(struct rtl8169_private *tp)
 	void __iomem *ioaddr = tp->mmio_addr;
 	const struct firmware *fw;
 
-	rtl_phy_write(ioaddr, phy_reg_init_0, ARRAY_SIZE(phy_reg_init_0));
+	rtl_writephy_batch(tp, phy_reg_init_0, ARRAY_SIZE(phy_reg_init_0));
 
 	/*
 	 * Rx Error Issue
 	 * Fine Tune Switching regulator parameter
 	 */
-	mdio_write(ioaddr, 0x1f, 0x0002);
-	mdio_plus_minus(ioaddr, 0x0b, 0x0010, 0x00ef);
-	mdio_plus_minus(ioaddr, 0x0c, 0xa200, 0x5d00);
+	rtl_writephy(tp, 0x1f, 0x0002);
+	rtl_w1w0_phy(tp, 0x0b, 0x0010, 0x00ef);
+	rtl_w1w0_phy(tp, 0x0c, 0xa200, 0x5d00);
 
 	if (rtl8168d_efuse_read(ioaddr, 0x01) == 0xb1) {
 		static const struct phy_reg phy_reg_init[] = {
@@ -1862,9 +1871,9 @@ static void rtl8168d_1_hw_phy_config(struct rtl8169_private *tp)
 		};
 		int val;
 
-		rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+		rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 
-		val = mdio_read(ioaddr, 0x0d);
+		val = rtl_readphy(tp, 0x0d);
 
 		if ((val & 0x00ff) != 0x006c) {
 			static const u32 set[] = {
@@ -1873,11 +1882,11 @@ static void rtl8168d_1_hw_phy_config(struct rtl8169_private *tp)
 			};
 			int i;
 
-			mdio_write(ioaddr, 0x1f, 0x0002);
+			rtl_writephy(tp, 0x1f, 0x0002);
 
 			val &= 0xff00;
 			for (i = 0; i < ARRAY_SIZE(set); i++)
-				mdio_write(ioaddr, 0x0d, val | set[i]);
+				rtl_writephy(tp, 0x0d, val | set[i]);
 		}
 	} else {
 		static const struct phy_reg phy_reg_init[] = {
@@ -1888,22 +1897,22 @@ static void rtl8168d_1_hw_phy_config(struct rtl8169_private *tp)
 			{ 0x06, 0x6662 }
 		};
 
-		rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+		rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 	}
 
 	/* RSET couple improve */
-	mdio_write(ioaddr, 0x1f, 0x0002);
-	mdio_patch(ioaddr, 0x0d, 0x0300);
-	mdio_patch(ioaddr, 0x0f, 0x0010);
+	rtl_writephy(tp, 0x1f, 0x0002);
+	rtl_patchphy(tp, 0x0d, 0x0300);
+	rtl_patchphy(tp, 0x0f, 0x0010);
 
 	/* Fine tune PLL performance */
-	mdio_write(ioaddr, 0x1f, 0x0002);
-	mdio_plus_minus(ioaddr, 0x02, 0x0100, 0x0600);
-	mdio_plus_minus(ioaddr, 0x03, 0x0000, 0xe000);
+	rtl_writephy(tp, 0x1f, 0x0002);
+	rtl_w1w0_phy(tp, 0x02, 0x0100, 0x0600);
+	rtl_w1w0_phy(tp, 0x03, 0x0000, 0xe000);
 
-	mdio_write(ioaddr, 0x1f, 0x0005);
-	mdio_write(ioaddr, 0x05, 0x001b);
-	if (mdio_read(ioaddr, 0x06) == 0xbf00 &&
+	rtl_writephy(tp, 0x1f, 0x0005);
+	rtl_writephy(tp, 0x05, 0x001b);
+	if (rtl_readphy(tp, 0x06) == 0xbf00 &&
 	    request_firmware(&fw, FIRMWARE_8168D_1, &tp->pci_dev->dev) == 0) {
 		rtl_phy_write_fw(tp, fw);
 		release_firmware(fw);
@@ -1911,7 +1920,7 @@ static void rtl8168d_1_hw_phy_config(struct rtl8169_private *tp)
 		netif_warn(tp, probe, tp->dev, "unable to apply firmware patch\n");
 	}
 
-	mdio_write(ioaddr, 0x1f, 0x0000);
+	rtl_writephy(tp, 0x1f, 0x0000);
 }
 
 static void rtl8168d_2_hw_phy_config(struct rtl8169_private *tp)
@@ -1959,7 +1968,7 @@ static void rtl8168d_2_hw_phy_config(struct rtl8169_private *tp)
 	void __iomem *ioaddr = tp->mmio_addr;
 	const struct firmware *fw;
 
-	rtl_phy_write(ioaddr, phy_reg_init_0, ARRAY_SIZE(phy_reg_init_0));
+	rtl_writephy_batch(tp, phy_reg_init_0, ARRAY_SIZE(phy_reg_init_0));
 
 	if (rtl8168d_efuse_read(ioaddr, 0x01) == 0xb1) {
 		static const struct phy_reg phy_reg_init[] = {
@@ -1973,9 +1982,9 @@ static void rtl8168d_2_hw_phy_config(struct rtl8169_private *tp)
 		};
 		int val;
 
-		rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+		rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 
-		val = mdio_read(ioaddr, 0x0d);
+		val = rtl_readphy(tp, 0x0d);
 		if ((val & 0x00ff) != 0x006c) {
 			static const u32 set[] = {
 				0x0065, 0x0066, 0x0067, 0x0068,
@@ -1983,11 +1992,11 @@ static void rtl8168d_2_hw_phy_config(struct rtl8169_private *tp)
 			};
 			int i;
 
-			mdio_write(ioaddr, 0x1f, 0x0002);
+			rtl_writephy(tp, 0x1f, 0x0002);
 
 			val &= 0xff00;
 			for (i = 0; i < ARRAY_SIZE(set); i++)
-				mdio_write(ioaddr, 0x0d, val | set[i]);
+				rtl_writephy(tp, 0x0d, val | set[i]);
 		}
 	} else {
 		static const struct phy_reg phy_reg_init[] = {
@@ -1998,21 +2007,21 @@ static void rtl8168d_2_hw_phy_config(struct rtl8169_private *tp)
 			{ 0x06, 0x2642 }
 		};
 
-		rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+		rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 	}
 
 	/* Fine tune PLL performance */
-	mdio_write(ioaddr, 0x1f, 0x0002);
-	mdio_plus_minus(ioaddr, 0x02, 0x0100, 0x0600);
-	mdio_plus_minus(ioaddr, 0x03, 0x0000, 0xe000);
+	rtl_writephy(tp, 0x1f, 0x0002);
+	rtl_w1w0_phy(tp, 0x02, 0x0100, 0x0600);
+	rtl_w1w0_phy(tp, 0x03, 0x0000, 0xe000);
 
 	/* Switching regulator Slew rate */
-	mdio_write(ioaddr, 0x1f, 0x0002);
-	mdio_patch(ioaddr, 0x0f, 0x0017);
+	rtl_writephy(tp, 0x1f, 0x0002);
+	rtl_patchphy(tp, 0x0f, 0x0017);
 
-	mdio_write(ioaddr, 0x1f, 0x0005);
-	mdio_write(ioaddr, 0x05, 0x001b);
-	if (mdio_read(ioaddr, 0x06) == 0xb300 &&
+	rtl_writephy(tp, 0x1f, 0x0005);
+	rtl_writephy(tp, 0x05, 0x001b);
+	if (rtl_readphy(tp, 0x06) == 0xb300 &&
 	    request_firmware(&fw, FIRMWARE_8168D_2, &tp->pci_dev->dev) == 0) {
 		rtl_phy_write_fw(tp, fw);
 		release_firmware(fw);
@@ -2020,10 +2029,10 @@ static void rtl8168d_2_hw_phy_config(struct rtl8169_private *tp)
 		netif_warn(tp, probe, tp->dev, "unable to apply firmware patch\n");
 	}
 
-	mdio_write(ioaddr, 0x1f, 0x0000);
+	rtl_writephy(tp, 0x1f, 0x0000);
 }
 
-static void rtl8168d_3_hw_phy_config(void __iomem *ioaddr)
+static void rtl8168d_3_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0002 },
@@ -2081,10 +2090,10 @@ static void rtl8168d_3_hw_phy_config(void __iomem *ioaddr)
 		{ 0x1f, 0x0000 }
 	};
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 }
 
-static void rtl8102e_hw_phy_config(void __iomem *ioaddr)
+static void rtl8102e_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
 		{ 0x1f, 0x0003 },
@@ -2093,18 +2102,17 @@ static void rtl8102e_hw_phy_config(void __iomem *ioaddr)
 		{ 0x1f, 0x0000 }
 	};
 
-	mdio_write(ioaddr, 0x1f, 0x0000);
-	mdio_patch(ioaddr, 0x11, 1 << 12);
-	mdio_patch(ioaddr, 0x19, 1 << 13);
-	mdio_patch(ioaddr, 0x10, 1 << 15);
+	rtl_writephy(tp, 0x1f, 0x0000);
+	rtl_patchphy(tp, 0x11, 1 << 12);
+	rtl_patchphy(tp, 0x19, 1 << 13);
+	rtl_patchphy(tp, 0x10, 1 << 15);
 
-	rtl_phy_write(ioaddr, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 }
 
 static void rtl_hw_phy_config(struct net_device *dev)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
-	void __iomem *ioaddr = tp->mmio_addr;
 
 	rtl8169_print_mac_version(tp);
 
@@ -2113,49 +2121,49 @@ static void rtl_hw_phy_config(struct net_device *dev)
 		break;
 	case RTL_GIGA_MAC_VER_02:
 	case RTL_GIGA_MAC_VER_03:
-		rtl8169s_hw_phy_config(ioaddr);
+		rtl8169s_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_04:
-		rtl8169sb_hw_phy_config(ioaddr);
+		rtl8169sb_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_05:
-		rtl8169scd_hw_phy_config(tp, ioaddr);
+		rtl8169scd_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_06:
-		rtl8169sce_hw_phy_config(ioaddr);
+		rtl8169sce_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_07:
 	case RTL_GIGA_MAC_VER_08:
 	case RTL_GIGA_MAC_VER_09:
-		rtl8102e_hw_phy_config(ioaddr);
+		rtl8102e_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_11:
-		rtl8168bb_hw_phy_config(ioaddr);
+		rtl8168bb_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_12:
-		rtl8168bef_hw_phy_config(ioaddr);
+		rtl8168bef_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_17:
-		rtl8168bef_hw_phy_config(ioaddr);
+		rtl8168bef_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_18:
-		rtl8168cp_1_hw_phy_config(ioaddr);
+		rtl8168cp_1_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_19:
-		rtl8168c_1_hw_phy_config(ioaddr);
+		rtl8168c_1_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_20:
-		rtl8168c_2_hw_phy_config(ioaddr);
+		rtl8168c_2_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_21:
-		rtl8168c_3_hw_phy_config(ioaddr);
+		rtl8168c_3_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_22:
-		rtl8168c_4_hw_phy_config(ioaddr);
+		rtl8168c_4_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_23:
 	case RTL_GIGA_MAC_VER_24:
-		rtl8168cp_2_hw_phy_config(ioaddr);
+		rtl8168cp_2_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_25:
 		rtl8168d_1_hw_phy_config(tp);
@@ -2164,7 +2172,7 @@ static void rtl_hw_phy_config(struct net_device *dev)
 		rtl8168d_2_hw_phy_config(tp);
 		break;
 	case RTL_GIGA_MAC_VER_27:
-		rtl8168d_3_hw_phy_config(ioaddr);
+		rtl8168d_3_hw_phy_config(tp);
 		break;
 
 	default:
@@ -2187,7 +2195,7 @@ static void rtl8169_phy_timer(unsigned long __opaque)
 
 	spin_lock_irq(&tp->lock);
 
-	if (tp->phy_reset_pending(ioaddr)) {
+	if (tp->phy_reset_pending(tp)) {
 		/*
 		 * A busy loop could burn quite a few cycles on nowadays CPU.
 		 * Let's delay the execution of the timer for a few ticks.
@@ -2201,7 +2209,7 @@ static void rtl8169_phy_timer(unsigned long __opaque)
 
 	netif_warn(tp, link, dev, "PHY reset until link up\n");
 
-	tp->phy_reset_enable(ioaddr);
+	tp->phy_reset_enable(tp);
 
 out_mod_timer:
 	mod_timer(timer, jiffies + timeout);
@@ -2261,12 +2269,11 @@ static void rtl8169_release_board(struct pci_dev *pdev, struct net_device *dev,
 static void rtl8169_phy_reset(struct net_device *dev,
 			      struct rtl8169_private *tp)
 {
-	void __iomem *ioaddr = tp->mmio_addr;
 	unsigned int i;
 
-	tp->phy_reset_enable(ioaddr);
+	tp->phy_reset_enable(tp);
 	for (i = 0; i < 100; i++) {
-		if (!tp->phy_reset_pending(ioaddr))
+		if (!tp->phy_reset_pending(tp))
 			return;
 		msleep(1);
 	}
@@ -2293,7 +2300,7 @@ static void rtl8169_init_phy(struct net_device *dev, struct rtl8169_private *tp)
 		dprintk("Set MAC Reg C+CR Offset 0x82h = 0x01h\n");
 		RTL_W8(0x82, 0x01);
 		dprintk("Set PHY Reg 0x0bh = 0x00h\n");
-		mdio_write(ioaddr, 0x0b, 0x0000); //w 0x0b 15 0 0
+		rtl_writephy(tp, 0x0b, 0x0000); //w 0x0b 15 0 0
 	}
 
 	rtl8169_phy_reset(dev, tp);
@@ -2363,11 +2370,11 @@ static int rtl_xmii_ioctl(struct rtl8169_private *tp, struct mii_ioctl_data *dat
 		return 0;
 
 	case SIOCGMIIREG:
-		data->val_out = mdio_read(tp->mmio_addr, data->reg_num & 0x1f);
+		data->val_out = rtl_readphy(tp, data->reg_num & 0x1f);
 		return 0;
 
 	case SIOCSMIIREG:
-		mdio_write(tp->mmio_addr, data->reg_num & 0x1f, data->val_in);
+		rtl_writephy(tp, data->reg_num & 0x1f, data->val_in);
 		return 0;
 	}
 	return -EOPNOTSUPP;
-- 
1.7.3.4


^ permalink raw reply related

* [net-next-2.6 04/08] r8169: 8168DP specific MII registers access methods.
From: Francois Romieu @ 2011-01-02 23:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Hayes Wang, Ben Hutchings

Adapted from version 8.019.00 of Realtek's r8168 driver.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes <hayeswang@realtek.com>
---
 drivers/net/r8169.c |   83 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index f9d8ff0..7f6fd12 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -277,6 +277,20 @@ enum rtl8168_8101_registers {
 #define	EFUSEAR_DATA_MASK		0xff
 };
 
+enum rtl8168_registers {
+	EPHY_RXER_NUM		= 0x7c,
+	OCPDR			= 0xb0,	/* OCP GPHY access */
+#define OCPDR_WRITE_CMD			0x80000000
+#define OCPDR_READ_CMD			0x00000000
+#define OCPDR_REG_MASK			0xff
+#define OCPDR_GPHY_REG_SHIFT		12
+#define OCPDR_DATA_MASK			0xffff
+	OCPAR			= 0xb4,
+#define OCPAR_FLAG			0x80000000
+#define OCPAR_GPHY_WRITE_CMD		0x8000f060
+#define OCPAR_GPHY_READ_CMD		0x0000f060
+};
+
 enum rtl_register_content {
 	/* InterruptStatusBits */
 	SYSErr		= 0x8000,
@@ -500,6 +514,12 @@ struct rtl8169_private {
 #ifdef CONFIG_R8169_VLAN
 	struct vlan_group *vlgrp;
 #endif
+
+	struct mdio_ops {
+		void (*write)(void __iomem *, int, int);
+		int (*read)(void __iomem *, int);
+	} mdio_ops;
+
 	int (*set_speed)(struct net_device *, u8 autoneg, u16 speed, u8 duplex);
 	int (*get_settings)(struct net_device *, struct ethtool_cmd *);
 	void (*phy_reset_enable)(struct rtl8169_private *tp);
@@ -595,14 +615,55 @@ static int r8169_mdio_read(void __iomem *ioaddr, int reg_addr)
 	return value;
 }
 
+static void r8168dp_1_mdio_access(void __iomem *ioaddr, int reg_addr, u32 data)
+{
+	int i;
+
+	RTL_W32(OCPDR, data |
+		((reg_addr & OCPDR_REG_MASK) << OCPDR_GPHY_REG_SHIFT));
+	RTL_W32(OCPAR, OCPAR_GPHY_WRITE_CMD);
+	RTL_W32(EPHY_RXER_NUM, 0);
+
+	for (i = 0; i < 100; i++) {
+		mdelay(1);
+		if (!(RTL_R32(OCPAR) & OCPAR_FLAG))
+			break;
+	}
+}
+
+static void r8168dp_1_mdio_write(void __iomem *ioaddr, int reg_addr, int value)
+{
+	r8168dp_1_mdio_access(ioaddr, reg_addr, OCPDR_WRITE_CMD |
+		(value & OCPDR_DATA_MASK));
+}
+
+static int r8168dp_1_mdio_read(void __iomem *ioaddr, int reg_addr)
+{
+	int i;
+
+	r8168dp_1_mdio_access(ioaddr, reg_addr, OCPDR_READ_CMD);
+
+	mdelay(1);
+	RTL_W32(OCPAR, OCPAR_GPHY_READ_CMD);
+	RTL_W32(EPHY_RXER_NUM, 0);
+
+	for (i = 0; i < 100; i++) {
+		mdelay(1);
+		if (RTL_R32(OCPAR) & OCPAR_FLAG)
+			break;
+	}
+
+	return RTL_R32(OCPDR) & OCPDR_DATA_MASK;
+}
+
 static void rtl_writephy(struct rtl8169_private *tp, int location, u32 val)
 {
-	r8169_mdio_write(tp->mmio_addr, location, val);
+	tp->mdio_ops.write(tp->mmio_addr, location, val);
 }
 
 static int rtl_readphy(struct rtl8169_private *tp, int location)
 {
-	return r8169_mdio_read(tp->mmio_addr, location);
+	return tp->mdio_ops.read(tp->mmio_addr, location);
 }
 
 static void rtl_patchphy(struct rtl8169_private *tp, int reg_addr, int value)
@@ -2474,6 +2535,22 @@ static const struct net_device_ops rtl8169_netdev_ops = {
 
 };
 
+static void __devinit rtl_init_mdio_ops(struct rtl8169_private *tp)
+{
+	struct mdio_ops *ops = &tp->mdio_ops;
+
+	switch (tp->mac_version) {
+	case RTL_GIGA_MAC_VER_27:
+		ops->write	= r8168dp_1_mdio_write;
+		ops->read	= r8168dp_1_mdio_read;
+		break;
+	default:
+		ops->write	= r8169_mdio_write;
+		ops->read	= r8169_mdio_read;
+		break;
+	}
+}
+
 static int __devinit
 rtl8169_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
@@ -2592,6 +2669,8 @@ rtl8169_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	/* Identify chip attached to board */
 	rtl8169_get_mac_version(tp, ioaddr);
 
+	rtl_init_mdio_ops(tp);
+
 	/* Use appropriate default if unknown */
 	if (tp->mac_version == RTL_GIGA_MAC_NONE) {
 		netif_notice(tp, probe, dev,
-- 
1.7.3.4


^ permalink raw reply related

* [net-next-2.6 05/08] r8169: phy power ops
From: Francois Romieu @ 2011-01-02 23:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Hayes Wang, Ben Hutchings

Bits from :
- version 8.019.00 of Realtek's 8168 driver
- version 1.019.00 of Realtek's 8101 driver

Plain old 8169 (PCI) devices do not seem to need anything akin to it.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes <hayeswang@realtek.com>
---
 drivers/net/r8169.c |  167 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 166 insertions(+), 1 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 7f6fd12..fd7e40a 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -258,7 +258,7 @@ enum rtl8168_8101_registers {
 #define	CSIAR_BYTE_ENABLE		0x0f
 #define	CSIAR_BYTE_ENABLE_SHIFT		12
 #define	CSIAR_ADDR_MASK			0x0fff
-
+	PMCH			= 0x6f,
 	EPHYAR			= 0x80,
 #define	EPHYAR_FLAG			0x80000000
 #define	EPHYAR_WRITE_CMD		0x80000000
@@ -520,6 +520,11 @@ struct rtl8169_private {
 		int (*read)(void __iomem *, int);
 	} mdio_ops;
 
+	struct pll_power_ops {
+		void (*down)(struct rtl8169_private *);
+		void (*up)(struct rtl8169_private *);
+	} pll_power_ops;
+
 	int (*set_speed)(struct net_device *, u8 autoneg, u16 speed, u8 duplex);
 	int (*get_settings)(struct net_device *, struct ethtool_cmd *);
 	void (*phy_reset_enable)(struct rtl8169_private *tp);
@@ -2551,6 +2556,152 @@ static void __devinit rtl_init_mdio_ops(struct rtl8169_private *tp)
 	}
 }
 
+static void r810x_phy_power_down(struct rtl8169_private *tp)
+{
+	rtl_writephy(tp, 0x1f, 0x0000);
+	rtl_writephy(tp, MII_BMCR, BMCR_PDOWN);
+}
+
+static void r810x_phy_power_up(struct rtl8169_private *tp)
+{
+	rtl_writephy(tp, 0x1f, 0x0000);
+	rtl_writephy(tp, MII_BMCR, BMCR_ANENABLE);
+}
+
+static void r810x_pll_power_down(struct rtl8169_private *tp)
+{
+	if (__rtl8169_get_wol(tp) & WAKE_ANY) {
+		rtl_writephy(tp, 0x1f, 0x0000);
+		rtl_writephy(tp, MII_BMCR, 0x0000);
+		return;
+	}
+
+	r810x_phy_power_down(tp);
+}
+
+static void r810x_pll_power_up(struct rtl8169_private *tp)
+{
+	r810x_phy_power_up(tp);
+}
+
+static void r8168_phy_power_up(struct rtl8169_private *tp)
+{
+	rtl_writephy(tp, 0x1f, 0x0000);
+	rtl_writephy(tp, 0x0e, 0x0000);
+	rtl_writephy(tp, MII_BMCR, BMCR_ANENABLE);
+}
+
+static void r8168_phy_power_down(struct rtl8169_private *tp)
+{
+	rtl_writephy(tp, 0x1f, 0x0000);
+	rtl_writephy(tp, 0x0e, 0x0200);
+	rtl_writephy(tp, MII_BMCR, BMCR_PDOWN);
+}
+
+static void r8168_pll_power_down(struct rtl8169_private *tp)
+{
+	void __iomem *ioaddr = tp->mmio_addr;
+
+	if (tp->mac_version == RTL_GIGA_MAC_VER_27)
+		return;
+
+	if (((tp->mac_version == RTL_GIGA_MAC_VER_23) ||
+	     (tp->mac_version == RTL_GIGA_MAC_VER_24)) &&
+	    (RTL_R16(CPlusCmd) & ASF)) {
+		return;
+	}
+
+	if (__rtl8169_get_wol(tp) & WAKE_ANY) {
+		rtl_writephy(tp, 0x1f, 0x0000);
+		rtl_writephy(tp, MII_BMCR, 0x0000);
+
+		RTL_W32(RxConfig, RTL_R32(RxConfig) |
+			AcceptBroadcast | AcceptMulticast | AcceptMyPhys);
+		return;
+	}
+
+	r8168_phy_power_down(tp);
+
+	switch (tp->mac_version) {
+	case RTL_GIGA_MAC_VER_25:
+	case RTL_GIGA_MAC_VER_26:
+		RTL_W8(PMCH, RTL_R8(PMCH) & ~0x80);
+		break;
+	}
+}
+
+static void r8168_pll_power_up(struct rtl8169_private *tp)
+{
+	void __iomem *ioaddr = tp->mmio_addr;
+
+	if (tp->mac_version == RTL_GIGA_MAC_VER_27)
+		return;
+
+	switch (tp->mac_version) {
+	case RTL_GIGA_MAC_VER_25:
+	case RTL_GIGA_MAC_VER_26:
+		RTL_W8(PMCH, RTL_R8(PMCH) | 0x80);
+		break;
+	}
+
+	r8168_phy_power_up(tp);
+}
+
+static void rtl_pll_power_op(struct rtl8169_private *tp,
+			     void (*op)(struct rtl8169_private *))
+{
+	if (op)
+		op(tp);
+}
+
+static void rtl_pll_power_down(struct rtl8169_private *tp)
+{
+	rtl_pll_power_op(tp, tp->pll_power_ops.down);
+}
+
+static void rtl_pll_power_up(struct rtl8169_private *tp)
+{
+	rtl_pll_power_op(tp, tp->pll_power_ops.up);
+}
+
+static void __devinit rtl_init_pll_power_ops(struct rtl8169_private *tp)
+{
+	struct pll_power_ops *ops = &tp->pll_power_ops;
+
+	switch (tp->mac_version) {
+	case RTL_GIGA_MAC_VER_07:
+	case RTL_GIGA_MAC_VER_08:
+	case RTL_GIGA_MAC_VER_09:
+	case RTL_GIGA_MAC_VER_10:
+	case RTL_GIGA_MAC_VER_16:
+		ops->down	= r810x_pll_power_down;
+		ops->up		= r810x_pll_power_up;
+		break;
+
+	case RTL_GIGA_MAC_VER_11:
+	case RTL_GIGA_MAC_VER_12:
+	case RTL_GIGA_MAC_VER_17:
+	case RTL_GIGA_MAC_VER_18:
+	case RTL_GIGA_MAC_VER_19:
+	case RTL_GIGA_MAC_VER_20:
+	case RTL_GIGA_MAC_VER_21:
+	case RTL_GIGA_MAC_VER_22:
+	case RTL_GIGA_MAC_VER_23:
+	case RTL_GIGA_MAC_VER_24:
+	case RTL_GIGA_MAC_VER_25:
+	case RTL_GIGA_MAC_VER_26:
+	case RTL_GIGA_MAC_VER_27:
+		ops->down	= r8168_pll_power_down;
+		ops->up		= r8168_pll_power_up;
+		break;
+
+	default:
+		ops->down	= NULL;
+		ops->up		= NULL;
+		break;
+	}
+}
+
 static int __devinit
 rtl8169_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
@@ -2670,6 +2821,7 @@ rtl8169_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	rtl8169_get_mac_version(tp, ioaddr);
 
 	rtl_init_mdio_ops(tp);
+	rtl_init_pll_power_ops(tp);
 
 	/* Use appropriate default if unknown */
 	if (tp->mac_version == RTL_GIGA_MAC_NONE) {
@@ -2849,6 +3001,8 @@ static int rtl8169_open(struct net_device *dev)
 
 	napi_enable(&tp->napi);
 
+	rtl_pll_power_up(tp);
+
 	rtl_hw_start(dev);
 
 	rtl8169_request_timer(dev);
@@ -4257,6 +4411,8 @@ static void rtl8169_down(struct net_device *dev)
 	rtl8169_tx_clear(tp);
 
 	rtl8169_rx_clear(tp);
+
+	rtl_pll_power_down(tp);
 }
 
 static int rtl8169_close(struct net_device *dev)
@@ -4361,9 +4517,13 @@ static struct net_device_stats *rtl8169_get_stats(struct net_device *dev)
 
 static void rtl8169_net_suspend(struct net_device *dev)
 {
+	struct rtl8169_private *tp = netdev_priv(dev);
+
 	if (!netif_running(dev))
 		return;
 
+	rtl_pll_power_down(tp);
+
 	netif_device_detach(dev);
 	netif_stop_queue(dev);
 }
@@ -4382,7 +4542,12 @@ static int rtl8169_suspend(struct device *device)
 
 static void __rtl8169_resume(struct net_device *dev)
 {
+	struct rtl8169_private *tp = netdev_priv(dev);
+
 	netif_device_attach(dev);
+
+	rtl_pll_power_up(tp);
+
 	rtl8169_schedule_work(dev, rtl8169_reset_task);
 }
 
-- 
1.7.3.4


^ permalink raw reply related

* [net-next-2.6 06/08] r8169: magic.
From: Francois Romieu @ 2011-01-02 23:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Hayes Wang, Ben Hutchings

Adapted from version 8.019.00 of Realtek's r8168 driver.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes <hayeswang@realtek.com>
---
 drivers/net/r8169.c |   93 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 93 insertions(+), 0 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index fd7e40a..ab1741d 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -278,6 +278,18 @@ enum rtl8168_8101_registers {
 };
 
 enum rtl8168_registers {
+	ERIDR			= 0x70,
+	ERIAR			= 0x74,
+#define ERIAR_FLAG			0x80000000
+#define ERIAR_WRITE_CMD			0x80000000
+#define ERIAR_READ_CMD			0x00000000
+#define ERIAR_ADDR_BYTE_ALIGN		4
+#define ERIAR_EXGMAC			0
+#define ERIAR_MSIX			1
+#define ERIAR_ASF			2
+#define ERIAR_TYPE_SHIFT		16
+#define ERIAR_BYTEEN			0x0f
+#define ERIAR_BYTEEN_SHIFT		12
 	EPHY_RXER_NUM		= 0x7c,
 	OCPDR			= 0xb0,	/* OCP GPHY access */
 #define OCPDR_WRITE_CMD			0x80000000
@@ -572,6 +584,81 @@ static int rtl8169_poll(struct napi_struct *napi, int budget);
 static const unsigned int rtl8169_rx_config =
 	(RX_FIFO_THRESH << RxCfgFIFOShift) | (RX_DMA_BURST << RxCfgDMAShift);
 
+static u32 ocp_read(struct rtl8169_private *tp, u8 mask, u16 reg)
+{
+	void __iomem *ioaddr = tp->mmio_addr;
+	int i;
+
+	RTL_W32(OCPAR, ((u32)mask & 0x0f) << 12 | (reg & 0x0fff));
+	for (i = 0; i < 20; i++) {
+		udelay(100);
+		if (RTL_R32(OCPAR) & OCPAR_FLAG)
+			break;
+	}
+	return RTL_R32(OCPDR);
+}
+
+static void ocp_write(struct rtl8169_private *tp, u8 mask, u16 reg, u32 data)
+{
+	void __iomem *ioaddr = tp->mmio_addr;
+	int i;
+
+	RTL_W32(OCPDR, data);
+	RTL_W32(OCPAR, OCPAR_FLAG | ((u32)mask & 0x0f) << 12 | (reg & 0x0fff));
+	for (i = 0; i < 20; i++) {
+		udelay(100);
+		if ((RTL_R32(OCPAR) & OCPAR_FLAG) == 0)
+			break;
+	}
+}
+
+static void rtl8168_oob_notify(void __iomem *ioaddr, u8 cmd)
+{
+	int i;
+
+	RTL_W8(ERIDR, cmd);
+	RTL_W32(ERIAR, 0x800010e8);
+	msleep(2);
+	for (i = 0; i < 5; i++) {
+		udelay(100);
+		if (!(RTL_R32(ERIDR) & ERIAR_FLAG))
+			break;
+	}
+
+	ocp_write(ioaddr, 0x1, 0x30, 0x00000001);
+}
+
+#define OOB_CMD_RESET		0x00
+#define OOB_CMD_DRIVER_START	0x05
+#define OOB_CMD_DRIVER_STOP	0x06
+
+static void rtl8168_driver_start(struct rtl8169_private *tp)
+{
+	int i;
+
+	rtl8168_oob_notify(tp, OOB_CMD_DRIVER_START);
+
+	for (i = 0; i < 10; i++) {
+		msleep(10);
+		if (ocp_read(tp, 0x0f, 0x0010) & 0x00000800)
+			break;
+	}
+}
+
+static void rtl8168_driver_stop(struct rtl8169_private *tp)
+{
+	int i;
+
+	rtl8168_oob_notify(tp, OOB_CMD_DRIVER_STOP);
+
+	for (i = 0; i < 10; i++) {
+		msleep(10);
+		if ((ocp_read(tp, 0x0f, 0x0010) & 0x00000800) == 0)
+			break;
+	}
+}
+
+
 static void r8169_mdio_write(void __iomem *ioaddr, int reg_addr, int value)
 {
 	int i;
@@ -2913,6 +3000,9 @@ rtl8169_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		   dev->base_addr, dev->dev_addr,
 		   (u32)(RTL_R32(TxConfig) & 0x9cf0f8ff), dev->irq);
 
+	if (tp->mac_version == RTL_GIGA_MAC_VER_27)
+		rtl8168_driver_start(tp);
+
 	rtl8169_init_phy(dev, tp);
 
 	/*
@@ -2948,6 +3038,9 @@ static void __devexit rtl8169_remove_one(struct pci_dev *pdev)
 	struct net_device *dev = pci_get_drvdata(pdev);
 	struct rtl8169_private *tp = netdev_priv(dev);
 
+	if (tp->mac_version == RTL_GIGA_MAC_VER_27)
+		rtl8168_driver_stop(tp);
+
 	cancel_delayed_work_sync(&tp->task);
 
 	unregister_netdev(dev);
-- 
1.7.3.4


^ permalink raw reply related

* [net-next-2.6 07/08] r8169: rtl_csi_access_enable rename.
From: Francois Romieu @ 2011-01-02 23:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Hayes Wang, Ben Hutchings

Newer 8168 needs a slightly different rtl_csi_access_enable.
This patch separates some noise from the real thing.

No functional change.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes <hayeswang@realtek.com>
---
 drivers/net/r8169.c |   27 ++++++++++++++++-----------
 1 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index ab1741d..a8df458 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -3310,12 +3310,17 @@ static void rtl_tx_performance_tweak(struct pci_dev *pdev, u16 force)
 	}
 }
 
-static void rtl_csi_access_enable(void __iomem *ioaddr)
+static void rtl_csi_access_enable(void __iomem *ioaddr, u32 bits)
 {
 	u32 csi;
 
 	csi = rtl_csi_read(ioaddr, 0x070c) & 0x00ffffff;
-	rtl_csi_write(ioaddr, 0x070c, csi | 0x27000000);
+	rtl_csi_write(ioaddr, 0x070c, csi | bits);
+}
+
+static void rtl_csi_access_enable_2(void __iomem *ioaddr)
+{
+	rtl_csi_access_enable(ioaddr, 0x27000000);
 }
 
 struct ephy_info {
@@ -3403,7 +3408,7 @@ static void rtl_hw_start_8168cp_1(void __iomem *ioaddr, struct pci_dev *pdev)
 		{ 0x07, 0,	0x2000 }
 	};
 
-	rtl_csi_access_enable(ioaddr);
+	rtl_csi_access_enable_2(ioaddr);
 
 	rtl_ephy_init(ioaddr, e_info_8168cp, ARRAY_SIZE(e_info_8168cp));
 
@@ -3412,7 +3417,7 @@ static void rtl_hw_start_8168cp_1(void __iomem *ioaddr, struct pci_dev *pdev)
 
 static void rtl_hw_start_8168cp_2(void __iomem *ioaddr, struct pci_dev *pdev)
 {
-	rtl_csi_access_enable(ioaddr);
+	rtl_csi_access_enable_2(ioaddr);
 
 	RTL_W8(Config3, RTL_R8(Config3) & ~Beacon_en);
 
@@ -3423,7 +3428,7 @@ static void rtl_hw_start_8168cp_2(void __iomem *ioaddr, struct pci_dev *pdev)
 
 static void rtl_hw_start_8168cp_3(void __iomem *ioaddr, struct pci_dev *pdev)
 {
-	rtl_csi_access_enable(ioaddr);
+	rtl_csi_access_enable_2(ioaddr);
 
 	RTL_W8(Config3, RTL_R8(Config3) & ~Beacon_en);
 
@@ -3445,7 +3450,7 @@ static void rtl_hw_start_8168c_1(void __iomem *ioaddr, struct pci_dev *pdev)
 		{ 0x06, 0x0080,	0x0000 }
 	};
 
-	rtl_csi_access_enable(ioaddr);
+	rtl_csi_access_enable_2(ioaddr);
 
 	RTL_W8(DBG_REG, 0x06 | FIX_NAK_1 | FIX_NAK_2);
 
@@ -3461,7 +3466,7 @@ static void rtl_hw_start_8168c_2(void __iomem *ioaddr, struct pci_dev *pdev)
 		{ 0x03, 0x0400,	0x0220 }
 	};
 
-	rtl_csi_access_enable(ioaddr);
+	rtl_csi_access_enable_2(ioaddr);
 
 	rtl_ephy_init(ioaddr, e_info_8168c_2, ARRAY_SIZE(e_info_8168c_2));
 
@@ -3475,14 +3480,14 @@ static void rtl_hw_start_8168c_3(void __iomem *ioaddr, struct pci_dev *pdev)
 
 static void rtl_hw_start_8168c_4(void __iomem *ioaddr, struct pci_dev *pdev)
 {
-	rtl_csi_access_enable(ioaddr);
+	rtl_csi_access_enable_2(ioaddr);
 
 	__rtl_hw_start_8168cp(ioaddr, pdev);
 }
 
 static void rtl_hw_start_8168d(void __iomem *ioaddr, struct pci_dev *pdev)
 {
-	rtl_csi_access_enable(ioaddr);
+	rtl_csi_access_enable_2(ioaddr);
 
 	rtl_disable_clock_request(pdev);
 
@@ -3611,7 +3616,7 @@ static void rtl_hw_start_8102e_1(void __iomem *ioaddr, struct pci_dev *pdev)
 	};
 	u8 cfg1;
 
-	rtl_csi_access_enable(ioaddr);
+	rtl_csi_access_enable_2(ioaddr);
 
 	RTL_W8(DBG_REG, FIX_NAK_1);
 
@@ -3632,7 +3637,7 @@ static void rtl_hw_start_8102e_1(void __iomem *ioaddr, struct pci_dev *pdev)
 
 static void rtl_hw_start_8102e_2(void __iomem *ioaddr, struct pci_dev *pdev)
 {
-	rtl_csi_access_enable(ioaddr);
+	rtl_csi_access_enable_2(ioaddr);
 
 	rtl_tx_performance_tweak(pdev, 0x5 << MAX_READ_REQUEST_SHIFT);
 
-- 
1.7.3.4


^ permalink raw reply related

* [net-next-2.6 08/08] r8169: more 8168dp support.
From: Francois Romieu @ 2011-01-02 23:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, Hayes Wang, Ben Hutchings

Adapted from version 8.019.00 of Realtek's r8168 driver

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes <hayeswang@realtek.com>
---
 drivers/net/r8169.c |  145 +++++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 136 insertions(+), 9 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index a8df458..f13194f 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -121,7 +121,8 @@ enum mac_version {
 	RTL_GIGA_MAC_VER_24 = 0x18, // 8168CP
 	RTL_GIGA_MAC_VER_25 = 0x19, // 8168D
 	RTL_GIGA_MAC_VER_26 = 0x1a, // 8168D
-	RTL_GIGA_MAC_VER_27 = 0x1b  // 8168DP
+	RTL_GIGA_MAC_VER_27 = 0x1b, // 8168DP
+	RTL_GIGA_MAC_VER_28 = 0x1c, // 8168DP
 };
 
 #define _R(NAME,MAC,MASK) \
@@ -158,7 +159,8 @@ static const struct {
 	_R("RTL8168cp/8111cp",	RTL_GIGA_MAC_VER_24, 0xff7e1880), // PCI-E
 	_R("RTL8168d/8111d",	RTL_GIGA_MAC_VER_25, 0xff7e1880), // PCI-E
 	_R("RTL8168d/8111d",	RTL_GIGA_MAC_VER_26, 0xff7e1880), // PCI-E
-	_R("RTL8168dp/8111dp",	RTL_GIGA_MAC_VER_27, 0xff7e1880)  // PCI-E
+	_R("RTL8168dp/8111dp",	RTL_GIGA_MAC_VER_27, 0xff7e1880), // PCI-E
+	_R("RTL8168dp/8111dp",	RTL_GIGA_MAC_VER_28, 0xff7e1880)  // PCI-E
 };
 #undef _R
 
@@ -301,6 +303,7 @@ enum rtl8168_registers {
 #define OCPAR_FLAG			0x80000000
 #define OCPAR_GPHY_WRITE_CMD		0x8000f060
 #define OCPAR_GPHY_READ_CMD		0x0000f060
+	RDSAR1			= 0xd0	/* 8168c only. Undocumented on 8168dp */
 };
 
 enum rtl_register_content {
@@ -748,6 +751,40 @@ static int r8168dp_1_mdio_read(void __iomem *ioaddr, int reg_addr)
 	return RTL_R32(OCPDR) & OCPDR_DATA_MASK;
 }
 
+#define R8168DP_1_MDIO_ACCESS_BIT	0x00020000
+
+static void r8168dp_2_mdio_start(void __iomem *ioaddr)
+{
+	RTL_W32(0xd0, RTL_R32(0xd0) & ~R8168DP_1_MDIO_ACCESS_BIT);
+}
+
+static void r8168dp_2_mdio_stop(void __iomem *ioaddr)
+{
+	RTL_W32(0xd0, RTL_R32(0xd0) | R8168DP_1_MDIO_ACCESS_BIT);
+}
+
+static void r8168dp_2_mdio_write(void __iomem *ioaddr, int reg_addr, int value)
+{
+	r8168dp_2_mdio_start(ioaddr);
+
+	r8169_mdio_write(ioaddr, reg_addr, value);
+
+	r8168dp_2_mdio_stop(ioaddr);
+}
+
+static int r8168dp_2_mdio_read(void __iomem *ioaddr, int reg_addr)
+{
+	int value;
+
+	r8168dp_2_mdio_start(ioaddr);
+
+	value = r8169_mdio_read(ioaddr, reg_addr);
+
+	r8168dp_2_mdio_stop(ioaddr);
+
+	return value;
+}
+
 static void rtl_writephy(struct rtl8169_private *tp, int location, u32 val)
 {
 	tp->mdio_ops.write(tp->mmio_addr, location, val);
@@ -1495,9 +1532,12 @@ static void rtl8169_get_mac_version(struct rtl8169_private *tp,
 		/* 8168D family. */
 		{ 0x7cf00000, 0x28300000,	RTL_GIGA_MAC_VER_26 },
 		{ 0x7cf00000, 0x28100000,	RTL_GIGA_MAC_VER_25 },
-		{ 0x7c800000, 0x28800000,	RTL_GIGA_MAC_VER_27 },
 		{ 0x7c800000, 0x28000000,	RTL_GIGA_MAC_VER_26 },
 
+		/* 8168DP family. */
+		{ 0x7cf00000, 0x28800000,	RTL_GIGA_MAC_VER_27 },
+		{ 0x7cf00000, 0x28a00000,	RTL_GIGA_MAC_VER_28 },
+
 		/* 8168C family. */
 		{ 0x7cf00000, 0x3cb00000,	RTL_GIGA_MAC_VER_24 },
 		{ 0x7cf00000, 0x3c900000,	RTL_GIGA_MAC_VER_23 },
@@ -2246,6 +2286,22 @@ static void rtl8168d_3_hw_phy_config(struct rtl8169_private *tp)
 	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
 }
 
+static void rtl8168d_4_hw_phy_config(struct rtl8169_private *tp)
+{
+	static const struct phy_reg phy_reg_init[] = {
+		{ 0x1f, 0x0001 },
+		{ 0x17, 0x0cc0 },
+
+		{ 0x1f, 0x0007 },
+		{ 0x1e, 0x002d },
+		{ 0x18, 0x0040 },
+		{ 0x1f, 0x0000 }
+	};
+
+	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
+	rtl_patchphy(tp, 0x0d, 1 << 5);
+}
+
 static void rtl8102e_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
@@ -2327,6 +2383,9 @@ static void rtl_hw_phy_config(struct net_device *dev)
 	case RTL_GIGA_MAC_VER_27:
 		rtl8168d_3_hw_phy_config(tp);
 		break;
+	case RTL_GIGA_MAC_VER_28:
+		rtl8168d_4_hw_phy_config(tp);
+		break;
 
 	default:
 		break;
@@ -2636,6 +2695,10 @@ static void __devinit rtl_init_mdio_ops(struct rtl8169_private *tp)
 		ops->write	= r8168dp_1_mdio_write;
 		ops->read	= r8168dp_1_mdio_read;
 		break;
+	case RTL_GIGA_MAC_VER_28:
+		ops->write	= r8168dp_2_mdio_write;
+		ops->read	= r8168dp_2_mdio_read;
+		break;
 	default:
 		ops->write	= r8169_mdio_write;
 		ops->read	= r8169_mdio_read;
@@ -2778,6 +2841,7 @@ static void __devinit rtl_init_pll_power_ops(struct rtl8169_private *tp)
 	case RTL_GIGA_MAC_VER_25:
 	case RTL_GIGA_MAC_VER_26:
 	case RTL_GIGA_MAC_VER_27:
+	case RTL_GIGA_MAC_VER_28:
 		ops->down	= r8168_pll_power_down;
 		ops->up		= r8168_pll_power_up;
 		break;
@@ -3000,8 +3064,10 @@ rtl8169_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		   dev->base_addr, dev->dev_addr,
 		   (u32)(RTL_R32(TxConfig) & 0x9cf0f8ff), dev->irq);
 
-	if (tp->mac_version == RTL_GIGA_MAC_VER_27)
+	if ((tp->mac_version == RTL_GIGA_MAC_VER_27) ||
+	    (tp->mac_version == RTL_GIGA_MAC_VER_28)) {
 		rtl8168_driver_start(tp);
+	}
 
 	rtl8169_init_phy(dev, tp);
 
@@ -3038,8 +3104,10 @@ static void __devexit rtl8169_remove_one(struct pci_dev *pdev)
 	struct net_device *dev = pci_get_drvdata(pdev);
 	struct rtl8169_private *tp = netdev_priv(dev);
 
-	if (tp->mac_version == RTL_GIGA_MAC_VER_27)
+	if ((tp->mac_version == RTL_GIGA_MAC_VER_27) ||
+	    (tp->mac_version == RTL_GIGA_MAC_VER_28)) {
 		rtl8168_driver_stop(tp);
+	}
 
 	cancel_delayed_work_sync(&tp->task);
 
@@ -3122,11 +3190,19 @@ err_pm_runtime_put:
 	goto out;
 }
 
-static void rtl8169_hw_reset(void __iomem *ioaddr)
+static void rtl8169_hw_reset(struct rtl8169_private *tp)
 {
+	void __iomem *ioaddr = tp->mmio_addr;
+
 	/* Disable interrupts */
 	rtl8169_irq_mask_and_ack(ioaddr);
 
+	if (tp->mac_version == RTL_GIGA_MAC_VER_28) {
+		while (RTL_R8(TxPoll) & NPQ)
+			udelay(20);
+
+	}
+
 	/* Reset the chipset */
 	RTL_W8(ChipCmd, CmdReset);
 
@@ -3318,6 +3394,11 @@ static void rtl_csi_access_enable(void __iomem *ioaddr, u32 bits)
 	rtl_csi_write(ioaddr, 0x070c, csi | bits);
 }
 
+static void rtl_csi_access_enable_1(void __iomem *ioaddr)
+{
+	rtl_csi_access_enable(ioaddr, 0x17000000);
+}
+
 static void rtl_csi_access_enable_2(void __iomem *ioaddr)
 {
 	rtl_csi_access_enable(ioaddr, 0x27000000);
@@ -3355,6 +3436,21 @@ static void rtl_disable_clock_request(struct pci_dev *pdev)
 	}
 }
 
+static void rtl_enable_clock_request(struct pci_dev *pdev)
+{
+	struct net_device *dev = pci_get_drvdata(pdev);
+	struct rtl8169_private *tp = netdev_priv(dev);
+	int cap = tp->pcie_cap;
+
+	if (cap) {
+		u16 ctl;
+
+		pci_read_config_word(pdev, cap + PCI_EXP_LNKCTL, &ctl);
+		ctl |= PCI_EXP_LNKCTL_CLKREQ_EN;
+		pci_write_config_word(pdev, cap + PCI_EXP_LNKCTL, ctl);
+	}
+}
+
 #define R8168_CPCMD_QUIRK_MASK (\
 	EnableBist | \
 	Mac_dbgo_oe | \
@@ -3498,6 +3594,32 @@ static void rtl_hw_start_8168d(void __iomem *ioaddr, struct pci_dev *pdev)
 	RTL_W16(CPlusCmd, RTL_R16(CPlusCmd) & ~R8168_CPCMD_QUIRK_MASK);
 }
 
+static void rtl_hw_start_8168d_4(void __iomem *ioaddr, struct pci_dev *pdev)
+{
+	static const struct ephy_info e_info_8168d_4[] = {
+		{ 0x0b, ~0,	0x48 },
+		{ 0x19, 0x20,	0x50 },
+		{ 0x0c, ~0,	0x20 }
+	};
+	int i;
+
+	rtl_csi_access_enable_1(ioaddr);
+
+	rtl_tx_performance_tweak(pdev, 0x5 << MAX_READ_REQUEST_SHIFT);
+
+	RTL_W8(MaxTxPacketSize, TxPacketMax);
+
+	for (i = 0; i < ARRAY_SIZE(e_info_8168d_4); i++) {
+		const struct ephy_info *e = e_info_8168d_4 + i;
+		u16 w;
+
+		w = rtl_ephy_read(ioaddr, e->offset);
+		rtl_ephy_write(ioaddr, 0x03, (w & e->mask) | e->bits);
+	}
+
+	rtl_enable_clock_request(pdev);
+}
+
 static void rtl_hw_start_8168(struct net_device *dev)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
@@ -3575,6 +3697,10 @@ static void rtl_hw_start_8168(struct net_device *dev)
 		rtl_hw_start_8168d(ioaddr, pdev);
 	break;
 
+	case RTL_GIGA_MAC_VER_28:
+		rtl_hw_start_8168d_4(ioaddr, pdev);
+	break;
+
 	default:
 		printk(KERN_ERR PFX "%s: unknown chipset (mac_version = %d).\n",
 			dev->name, tp->mac_version);
@@ -3987,7 +4113,7 @@ static void rtl8169_tx_timeout(struct net_device *dev)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
 
-	rtl8169_hw_reset(tp->mmio_addr);
+	rtl8169_hw_reset(tp);
 
 	/* Let's wait a bit while any (async) irq lands on */
 	rtl8169_schedule_work(dev, rtl8169_reset_task);
@@ -4145,7 +4271,6 @@ static void rtl8169_pcierr_interrupt(struct net_device *dev)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
 	struct pci_dev *pdev = tp->pci_dev;
-	void __iomem *ioaddr = tp->mmio_addr;
 	u16 pci_status, pci_cmd;
 
 	pci_read_config_word(pdev, PCI_COMMAND, &pci_cmd);
@@ -4176,13 +4301,15 @@ static void rtl8169_pcierr_interrupt(struct net_device *dev)
 
 	/* The infamous DAC f*ckup only happens at boot time */
 	if ((tp->cp_cmd & PCIDAC) && !tp->dirty_rx && !tp->cur_rx) {
+		void __iomem *ioaddr = tp->mmio_addr;
+
 		netif_info(tp, intr, dev, "disabling PCI DAC\n");
 		tp->cp_cmd &= ~PCIDAC;
 		RTL_W16(CPlusCmd, tp->cp_cmd);
 		dev->features &= ~NETIF_F_HIGHDMA;
 	}
 
-	rtl8169_hw_reset(ioaddr);
+	rtl8169_hw_reset(tp);
 
 	rtl8169_schedule_work(dev, rtl8169_reinit_task);
 }
-- 
1.7.3.4


^ permalink raw reply related

* [PATCH] ksz884x: Fix section mismatch derived from pci_device_driver variable
From: Sedat Dilek @ 2011-01-03  2:01 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: Sedat Dilek

WARNING: drivers/net/ksz884x.o(.data+0x18): Section mismatch in reference from the variable pci_device_driver to the function .init.text:pcidev_init()
The variable pci_device_driver references
the function __init pcidev_init()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

This patch fixes the warning.

Tested with linux-next (next-20101231)

Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
---
 drivers/net/ksz884x.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ksz884x.c b/drivers/net/ksz884x.c
index 49ea870..515b0d7 100644
--- a/drivers/net/ksz884x.c
+++ b/drivers/net/ksz884x.c
@@ -7277,7 +7277,7 @@ static struct pci_device_id pcidev_table[] = {
 
 MODULE_DEVICE_TABLE(pci, pcidev_table);
 
-static struct pci_driver pci_device_driver = {
+static struct pci_driver pci_device_driver __refdata = {
 #ifdef CONFIG_PM
 	.suspend	= pcidev_suspend,
 	.resume		= pcidev_resume,
-- 
1.7.4.rc0

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox