Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH v2 net-next-2.6] ifb: add performance flags
From: Eric Dumazet @ 2011-01-02 20:24 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, xiaosuo, pstaszewski, netdev
In-Reply-To: <20101228230709.GA2088@del.dom.local>

Le mercredi 29 décembre 2010 à 00:07 +0100, Jarek Poplawski a écrit :

> Ingress is before vlans handler so these features and the
> NETIF_F_HW_VLAN_TX flag seem useful for ifb considering
> dev_hard_start_xmit() checks.

OK, here is v2 of the patch then, thanks everybody.


[PATCH v2 net-next-2.6] ifb: add performance flags

IFB can use the full set of features flags (NETIF_F_SG |
NETIF_F_FRAGLIST | NETIF_F_TSO | NETIF_F_NO_CSUM | NETIF_F_HIGHDMA) to
avoid unnecessary split of some packets (GRO for example)

Changli suggested to also set vlan_features,
Jarek suggested to add NETIF_F_HW_VLAN_TX as well.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Changli Gao <xiaosuo@gmail.com>
Cc: Jarek Poplawski <jarkao2@gmail.com>
Cc: Pawel Staszewski <pstaszewski@itcare.pl>
---
 drivers/net/ifb.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index 124dac4..66ca7bf 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -126,6 +126,9 @@ static const struct net_device_ops ifb_netdev_ops = {
 	.ndo_validate_addr = eth_validate_addr,
 };
 
+#define IFB_FEATURES (NETIF_F_NO_CSUM | NETIF_F_SG  | NETIF_F_FRAGLIST | \
+		      NETIF_F_HIGHDMA | NETIF_F_TSO | NETIF_F_HW_VLAN_TX)
+
 static void ifb_setup(struct net_device *dev)
 {
 	/* Initialize the device structure. */
@@ -136,6 +139,9 @@ static void ifb_setup(struct net_device *dev)
 	ether_setup(dev);
 	dev->tx_queue_len = TX_Q_LIMIT;
 
+	dev->features |= IFB_FEATURES;
+	dev->vlan_features |= IFB_FEATURES;
+
 	dev->flags |= IFF_NOARP;
 	dev->flags &= ~IFF_MULTICAST;
 	dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;



^ permalink raw reply related

* Re: [PATCH] new UDPCP Communication Protocol
From: Rémi Denis-Courmont @ 2011-01-02 20:16 UTC (permalink / raw)
  To: stefani, netdev
In-Reply-To: <1293982286-15389-1-git-send-email-stefani@seibold.net>

Le dimanche 2 janvier 2011 17:31:26 stefani@seibold.net, vous avez écrit :
> UDPCP is a communication protocol specified by the Open Base Station
> Architecture Initiative Special Interest Group (OBSAI SIG). The
> protocol is based on UDP and is designed to meet the needs of "Mobile
> Communcation Base Station" internal communications. It is widely used by
> the major networks infrastructure supplier.

Unless I missed something, you did not declare the new lock classes for the 
lock consistency checks.

-- 
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis

^ permalink raw reply

* [PATCH 1/1 V3] bridge: fix br_multicast_ipv6_rcv for paged skbs
From: Tomas Winkler @ 2011-01-02 20:18 UTC (permalink / raw)
  To: davem; +Cc: netdev, Tomas Winkler, Johannes Berg, Stephen Hemminger

use pskb_may_pull to access ipv6 header correctly for paged skbs
It was omitted in the bridge code leading to crash in blind
__skb_pull

this fixes bug https://bugzilla.kernel.org/show_bug.cgi?id=25202

Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 IEEE 802.11: authenticated
Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 IEEE 802.11: associated (aid 2)
Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 RADIUS: starting accounting session 4D0608A3-00000005
Dec 15 14:36:41 User-PC kernel: [175576.120287] ------------[ cut here ]------------
Dec 15 14:36:41 User-PC kernel: [175576.120452] kernel BUG at include/linux/skbuff.h:1178!
Dec 15 14:36:41 User-PC kernel: [175576.120609] invalid opcode: 0000 [#1] SMP
Dec 15 14:36:41 User-PC kernel: [175576.120749] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/uevent
Dec 15 14:36:41 User-PC kernel: [175576.121035] Modules linked in: oprofile binfmt_misc bridge stp llc parport_pc ppdev arc4 iwlagn snd_hda_codec_realtek iwlcore i915 snd_hda_intel mac80211 joydev snd_hda_codec snd_hwdep snd_pcm snd_seq_midi drm_kms_helper snd_rawmidi drm snd_seq_midi_event snd_seq snd_timer snd_seq_device cfg80211 eeepc_wmi usbhid psmouse intel_agp i2c_algo_bit intel_gtt uvcvideo agpgart videodev sparse_keymap snd shpchp v4l1_compat lp hid video serio_raw soundcore output snd_page_alloc ahci libahci atl1c
Dec 15 14:36:41 User-PC kernel: [175576.122712]
Dec 15 14:36:41 User-PC kernel: [175576.122769] Pid: 0, comm: kworker/0:0 Tainted: G        W   2.6.37-rc5-wl+ #3 1015PE/1016P
Dec 15 14:36:41 User-PC kernel: [175576.123012] EIP: 0060:[<f83edd65>] EFLAGS: 00010283 CPU: 1
Dec 15 14:36:41 User-PC kernel: [175576.123193] EIP is at br_multicast_rcv+0xc95/0xe1c [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.123362] EAX: 0000001c EBX: f5626318 ECX: 00000000 EDX: 00000000
Dec 15 14:36:41 User-PC kernel: [175576.123550] ESI: ec512262 EDI: f5626180 EBP: f60b5ca0 ESP: f60b5bd8
Dec 15 14:36:41 User-PC kernel: [175576.123737]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Dec 15 14:36:41 User-PC kernel: [175576.123902] Process kworker/0:0 (pid: 0, ti=f60b4000 task=f60a8000 task.ti=f60b0000)
Dec 15 14:36:41 User-PC kernel: [175576.124137] Stack:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  ec556500 f6d06800 f60b5be8 c01087d8 ec512262 00000030 00000024 f5626180
Dec 15 14:36:41 User-PC kernel: [175576.124181]  f572c200 ef463440 f5626300 3affffff f6d06dd0 e60766a4 000000c4 f6d06860
Dec 15 14:36:41 User-PC kernel: [175576.124181]  ffffffff ec55652c 00000001 f6d06844 f60b5c64 c0138264 c016e451 c013e47d
Dec 15 14:36:41 User-PC kernel: [175576.124181] Call Trace:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01087d8>] ? sched_clock+0x8/0x10
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0138264>] ? enqueue_entity+0x174/0x440
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c016e451>] ? sched_clock_cpu+0x131/0x190
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c013e47d>] ? select_task_rq_fair+0x2ad/0x730
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0524fc1>] ? nf_iterate+0x71/0x90
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4914>] ? br_handle_frame_finish+0x184/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4790>] ? br_handle_frame_finish+0x0/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e46e9>] ? br_handle_frame+0x189/0x230 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4790>] ? br_handle_frame_finish+0x0/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4560>] ? br_handle_frame+0x0/0x230 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04ff026>] ? __netif_receive_skb+0x1b6/0x5b0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04f7a30>] ? skb_copy_bits+0x110/0x210
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0503a7f>] ? netif_receive_skb+0x6f/0x80
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cb74c>] ? ieee80211_deliver_skb+0x8c/0x1a0 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cc836>] ? ieee80211_rx_handlers+0xeb6/0x1aa0 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04ff1f0>] ? __netif_receive_skb+0x380/0x5b0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c016e242>] ? sched_clock_local+0xb2/0x190
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c012b688>] ? default_spin_lock_flags+0x8/0x10
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d83df>] ? _raw_spin_lock_irqsave+0x2f/0x50
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cd621>] ? ieee80211_prepare_and_rx_handle+0x201/0xa90 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82ce154>] ? ieee80211_rx+0x2a4/0x830 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f815a8d6>] ? iwl_update_stats+0xa6/0x2a0 [iwlcore]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8499212>] ? iwlagn_rx_reply_rx+0x292/0x3b0 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d83df>] ? _raw_spin_lock_irqsave+0x2f/0x50
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8483697>] ? iwl_rx_handle+0xe7/0x350 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8486ab7>] ? iwl_irq_tasklet+0xf7/0x5c0 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01aece1>] ? __rcu_process_callbacks+0x201/0x2d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150d05>] ? tasklet_action+0xc5/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150a07>] ? __do_softirq+0x97/0x1d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d910c>] ? nmi_stack_correct+0x2f/0x34
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150970>] ? __do_softirq+0x0/0x1d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  <IRQ>
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01508f5>] ? irq_exit+0x65/0x70
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05df062>] ? do_IRQ+0x52/0xc0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01036b0>] ? common_interrupt+0x30/0x38
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c03a1fc2>] ? intel_idle+0xc2/0x160
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04daebb>] ? cpuidle_idle_call+0x6b/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0101dea>] ? cpu_idle+0x8a/0xf0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d2702>] ? start_secondary+0x1e8/0x1ee

Cc: David Miller <davem@davemloft.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
---
V2: Implement David Miller's suggestion 
V3: Fix the mld_msg access:
	Length check was wrong and psk_may_pull performs itself the length check

 net/bridge/br_multicast.c |   24 ++++++++++++++++++------
 1 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index f19e347..6932a0c 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1469,15 +1469,16 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
 	if (!skb2)
 		return -ENOMEM;
 
+
+	err = -EINVAL;
+	if (!pskb_may_pull(skb2, offset + sizeof(struct icmp6hdr)))
+		goto out_nopush;
+
 	len -= offset - skb_network_offset(skb2);
 
 	__skb_pull(skb2, offset);
 	skb_reset_transport_header(skb2);
 
-	err = -EINVAL;
-	if (!pskb_may_pull(skb2, sizeof(*icmp6h)))
-		goto out;
-
 	icmp6h = icmp6_hdr(skb2);
 
 	switch (icmp6h->icmp6_type) {
@@ -1516,7 +1517,12 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
 	switch (icmp6h->icmp6_type) {
 	case ICMPV6_MGM_REPORT:
 	    {
-		struct mld_msg *mld = (struct mld_msg *)icmp6h;
+		struct mld_msg *mld;
+		if (!pskb_may_pull(skb2, sizeof(*mld))) {
+			err = -EINVAL;
+			goto out;
+		}
+		mld = (struct mld_msg *)icmp6h;
 		BR_INPUT_SKB_CB(skb2)->mrouters_only = 1;
 		err = br_ip6_multicast_add_group(br, port, &mld->mld_mca);
 		break;
@@ -1529,13 +1535,19 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
 		break;
 	case ICMPV6_MGM_REDUCTION:
 	    {
-		struct mld_msg *mld = (struct mld_msg *)icmp6h;
+		struct mld_msg *mld;
+		if (!pskb_may_pull(skb2, sizeof(*mld))) {
+			err = -EINVAL;
+			goto out;
+		}
+		mld = (struct mld_msg *)icmp6h;
 		br_ip6_multicast_leave_group(br, port, &mld->mld_mca);
 	    }
 	}
 
 out:
 	__skb_push(skb2, offset);
+out_nopush:
 	if (skb2 != skb)
 		kfree_skb(skb2);
 	return err;
-- 
1.7.3.4

---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


^ permalink raw reply related

* Re: [PATCH] new UDPCP Communication Protocol
From: Jesper Juhl @ 2011-01-02 19:55 UTC (permalink / raw)
  To: stefani; +Cc: linux-kernel, akpm, davem, netdev, eric.dumazet, shemminger
In-Reply-To: <1293982286-15389-1-git-send-email-stefani@seibold.net>

On Sun, 2 Jan 2011, stefani@seibold.net wrote:

> From: Stefani Seibold <stefani@seibold.net>
> 
> Changelog:
> 31.12.2010 first proposal
> 01.01.2011 code cleanup and fixes suggest by Eric Dumazet
> 02.01.2011 kick away UDP-Lite support
>            change spin_lock_irq into spin_lock_bh
> 	   faster udpcp_release_sock
> 	   base is now linux-next
> 
> UDPCP is a communication protocol specified by the Open Base Station
> Architecture Initiative Special Interest Group (OBSAI SIG). The
> protocol is based on UDP and is designed to meet the needs of "Mobile
> Communcation Base Station" internal communications. It is widely used by
> the major networks infrastructure supplier.
> 
> The UDPCP communication service supports the following features:
> 
> -Connectionless communication for serial mode data transfer
> -Acknowledged and unacknowledged transfer modes
> -Retransmissions Algorithm
> -Checksum Algorithm using Adler32
> -Fragmentation of long messages (disassembly/reassembly) to match to the MTU
>  during transport:
> -Broadcasting and multicasting messages to multiple peers in unacknowledged
>   transfer mode
> 
> UDPCP supports application level messages up to 64 KBytes (limited by 16-bit
> packet data length field). Messages that are longer than the MTU will be
> fragmented to the MTU.
> 
> UDPCP provides a reliable transport service that will perform message
> retransmissions in case transport failures occur.
> 
> The code is also a nice example how to implement a UDP based protocol as
> a kernel socket modules.
> 
> Due the nature of UDPCP which has no sliding windows support, the latency has
> a huge impact. The perfomance increase by implementing as a kernel module is
> about the factor 10, because there are no context switches and data packets or
> ACKs will be handled in the interrupt service.
> 
> There are no side effects to the network subsystems so i ask for merge it
> into linux-next. Hope you like it.
> 
> The patch is against linux next-20101231
> 
> - Stefani
> 
> Signed-off-by: Stefani Seibold <stefani@seibold.net>

[...]
> +
> +#define VERSION	"0.71"

I personally don't think this makes much sense.
Version numbers for individual modules tend to not get updated as the code 
changes over the years, which make them rather meaningless.
Since this module depends on functionallity of the kernel which it is 
compiled with, the actual (meaningful) version of this code is that of the 
kernel tree being compiled that includes this code. Which again makes this 
specific version define meaningless.

So, why not save a few lines of code and get rid of this rather pointless 
thing?

[...]
> +static struct udpcp_dest *find_dest(struct sock *sk, __be32 addr, __be16 port)
> +{
> +	struct udpcp_dest *dest;
> +
> +	dest = __find_dest(sk, addr, port);

Why not

static struct udpcp_dest *find_dest(struct sock *sk, __be32 addr, __be16 port)
{
     struct udpcp_dest *dest =  __find_dest(sk, addr, port);

?


[...]
> + * Release a routing table entry if no packed will be assembled

Don't you mean "packet" rather than "packed" here?


[...]
> + * Return true it the passed skb socket buffer is the last in the list

I believe you mean "Return true if the passed ..."


[...]
> +static void udpcp_flush_err(struct sock *sk, struct udpcp_dest *dest)
> +{
> +	struct inet_sock *inet = inet_sk(sk);
> +	struct udpcp_sock *usk = udpcp_sk(sk);
> +
> +	if (!inet->recverr)
> +		skb_queue_purge(&dest->xmit);
> +	else {

CodingStyle would want this as

     if (!inet->recverr) {
             skb_queue_purge(&dest->xmit);
     } else {

If one branch needs {} then both should get them.


[...]
> +	if (!dest->xmit_last)
> +		_udpcp_xmit(sk, dest);
> +	else {
> +		skb = dest->xmit_wait;

Same comment as above.
There are more occurences of this, I'm not going to point them all out.


[...]
> +static inline void udpcp_release_sock(struct sock *sk)
> +{
> +	struct udpcp_sock *usk = udpcp_sk(sk);
> +
> +	while (usk->timeout)
> +		udpcp_handle_timeout(sk);
> +	release_sock(sk);
> +        check_timeout(sk);

The line above uses spaces for indentation. It should use one tab.


[...]
> +static unsigned int udpcp_tx_queue_len(struct sock *sk, struct udpcp_dest *dest)
> +{
> +	struct sk_buff *skb;
> +	unsigned int n;
> +
> +	n = 0;

Might as well save a few lines and make this

static unsigned int udpcp_tx_queue_len(struct sock *sk, struct udpcp_dest *dest)
{
     struct sk_buff *skb;
     unsigned int n = 0;


[...]
> +static unsigned int udpcp_rx_queue_len(struct sock *sk, struct udpcp_dest *dest)
> +{
> +	struct sk_buff *skb;
> +	unsigned int n;
> +
> +	n = 0;

Here as well
        unsigned int n = 0;



-- 
Jesper Juhl <jj@chaosbits.net>            http://www.chaosbits.net/
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please.

^ permalink raw reply

* Re: [PATCH] net: eepro testing positive EBUSY return by request_irq()?
From: Ben Hutchings @ 2011-01-02 19:51 UTC (permalink / raw)
  To: roel kluin; +Cc: davem, netdev, Andrew Morton, LKML
In-Reply-To: <4D209127.5080505@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2115 bytes --]

On Sun, 2011-01-02 at 15:52 +0100, roel kluin wrote:
> >> -		if (request_irq (*irqp, NULL, IRQF_SHARED, "bogus", dev) != EBUSY) {
> >> +		if (request_irq (*irqp, NULL, IRQF_SHARED, "bogus", dev) != -EBUSY) {
> 
> > This condition is completely bogus - request_irq() with a NULL handler
> > now returns -EINVAL before even checking whether the IRQ is in use.  The
> > code should be fixed along the lines of what I did for 3c503 in commit
> > b0cf4dfb7cd21556efd9a6a67edcba0840b4d98d.
> > 
> > The e2100 and hp net drivers have the same bug.
> > 
> > Ben.
> 
> I've made an attempt to fix the mentioned issues, but I'm not sure it is complete.
> The commit made some other changes as well which didn't seem appropriate here to me.

They are absolutely necessary.

In the bad old days before plug-and-play, some expansion cards would
have jumpers to configure the IRQ number and no software mechanism to
set or query it.  Users were expected to specify the IRQ to the driver
explicitly.  probe_irq_{on,off}() allow drivers to avoid this by
triggering an IRQ and checking which was received.  That assumes that
there are no IRQ conflicts, and it only works reliably in a quiescent
system.

These drivers have been abusing request_irq(), probe_irq_{on,off}() to
do resource allocation for hardware that *does* have a software
mechanism to set the IRQ number.  This has become less reliable as
driver loading has been made more dynamic, and is now broken completely
due to the change in request_irq().  I believe the correct way to do
this allocation is what I did in 3c503.  Possibly it's worth adding a
generic function.

> And I didn't compile or run test this.

I compiled but didn't have hardware to test.

> As a side note, in the function changed by the commit you mentioned, el2_open(), it
> appears that in current git, boolean `seen' cannot ever become true, can it?
[...]

The IRQ handler data is set to &seen and el2_probe_interrupt() uses that
pointer to set it.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Daniel Baluta @ 2011-01-02 19:48 UTC (permalink / raw)
  To: stefani; +Cc: linux-kernel, akpm, davem, netdev, eric.dumazet, shemminger
In-Reply-To: <1293982286-15389-1-git-send-email-stefani@seibold.net>

Hello,

I have some style comments, please read below.

> +struct udpcp_statistics {
> +       unsigned int txMsgs;            /* Num of transmitted messages */
> +       unsigned int rxMsgs;            /* Num of received messages */
> +       unsigned int txNodes;           /* Num of receiver nodes */
> +       unsigned int rxNodes;           /* Num of transmitter nodes */
> +       unsigned int txTimeout;         /* Num of unsuccessful transmissions */
> +       unsigned int rxTimeout;         /* Num of partial message receptions */
> +       unsigned int txRetries;         /* Num of resends */
> +       unsigned int rxDiscardedFrags;  /* Num of discarded fragments */
> +       unsigned int crcErrors;         /* Num of crc errors detected */

Is there any strong reason to have this camel case naming?
I would prefer tx_msgs, rx_msgs etc..

> +struct udpcp_dest {
> +       struct list_head list;
> +       struct sk_buff_head xmit;
> +       unsigned long tx_time;
> +       unsigned long rx_time;
> +       u32 txTimeout;
> +       u32 rxTimeout;

Here you have mixed naming conventions. I guess
tx_timeout will fit in better than txTimeout.

> +       u32 txRetries;
> +       u32 rxDiscardedFrags;
> +       struct sk_buff *xmit_wait;
> +       struct sk_buff *xmit_last;
> +       struct sk_buff *recv_msg;
> +       struct sk_buff *recv_last;
> +       struct udpcphdr lastmsg;
> +       struct ipcm_cookie ipc;
> +       struct flowi fl;
> +       struct rtable *rt;
> +       __be32 addr;
> +       __be16 port;
> +       u16 msgid;
> +       u8 use_flag;
> +       u8 insync;
> +       u8 ackmode;
> +       u8 chkmode;
> +       u8 try;
> +       u8 acks;
> +       struct udp_sock udpsock;
> +       struct sk_buff_head assembly;
> +       u32 assembly_len;
> +       struct udpcp_dest *assembly_dest;
> +       wait_queue_head_t wq;
> +       struct list_head destlist;
> +       struct list_head udpcplist;
> +       struct timer_list timer;
> +       struct udpcp_statistics stat;
> +       u32 pending;
> +       unsigned long tx_timeout;
> +       unsigned long rx_timeout;
> +       void (*udp_data_ready) (struct sock *sk, int bytes);
> +       u8 ackmode;
> +       u8 chkmode;
> +       u8 maxtry;
> +       u8 acks;
> +       u8 timeout;
> +/* overall UDPCP statistics */
> +static atomic_t udpcp_txMsgs;
> +static atomic_t udpcp_rxMsgs;
> +static atomic_t udpcp_txNodes;
> +static atomic_t udpcp_rxNodes;
> +static atomic_t udpcp_txTimeout;
> +static atomic_t udpcp_rxTimeout;
> +static atomic_t udpcp_txRetries;
> +static atomic_t udpcp_rxDiscardedFrags;
> +static atomic_t udpcp_crcErrors;

same here.

> +
> +module_param(debug, int, 0);
> +MODULE_PARM_DESC(debug, "Debug enabled or not");
> +

thanks,
Daniel.

^ permalink raw reply

* Re: [PATCH v2] Broadcom CNIC core network driver: fix mem leak on allocation failures in cnic_alloc_uio_rings()
From: Jesper Juhl @ 2011-01-02 18:54 UTC (permalink / raw)
  To: David Miller; +Cc: joe, netdev, linux-kernel, zongxi, mchan
In-Reply-To: <20101231.112025.245393392.davem@davemloft.net>

On Fri, 31 Dec 2010, David Miller wrote:

> From: Jesper Juhl <jj@chaosbits.net>
> Date: Sun, 26 Dec 2010 21:57:39 +0100 (CET)
> 
> > We are leaking memory in drivers/net/cnic.c::cnic_alloc_uio_rings() if 
> > either of the calls to dma_alloc_coherent() fail. This patch fixes it by 
> > freeing both the memory allocated with kzalloc() and memory allocated with 
> > previous calls to dma_alloc_coherent() when there's a failure.
> > 
> > Thanks to  Joe Perches <joe@perches.com>  for suggesting a better 
> > implementation than my initial version.
> > 
> > 
> > Signed-off-by: Jesper Juhl <jj@chaosbits.net>
> 
>  ...
> > + err_dma:
> > +	dma_free_coherent(&udev->pdev->dev, udev->l2_ring_size,
> > +       			  udev->l2_ring, udev->l2_ring_map);
> 
> Space before tab in indentation, improperly positioned third argument
> indentation.
> 
Whoops.

> I fixed all of this up, but please do not skimp on making sure these
> details are taken care of when updating your patch in response to feedback.
> 
I usually try to take care that such things are in order. I often even 
point these details out to other people. It slipped past me this time. 
That was a mistake. Sorry.


-- 
Jesper Juhl <jj@chaosbits.net>            http://www.chaosbits.net/
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please.

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Eric Dumazet @ 2011-01-02 16:34 UTC (permalink / raw)
  To: stefani; +Cc: linux-kernel, akpm, davem, netdev, shemminger
In-Reply-To: <1293982286-15389-1-git-send-email-stefani@seibold.net>

Le dimanche 02 janvier 2011 à 16:31 +0100, stefani@seibold.net a écrit :
> From: Stefani Seibold <stefani@seibold.net>
> 
> Changelog:
> 31.12.2010 first proposal
> 01.01.2011 code cleanup and fixes suggest by Eric Dumazet
> 02.01.2011 kick away UDP-Lite support
>            change spin_lock_irq into spin_lock_bh
> 	   faster udpcp_release_sock
> 	   base is now linux-next
...

> +/*
> + * Create a new destination descriptor for the given IPV4 address and port
> + */
> +static struct udpcp_dest *new_dest(struct sock *sk, __be32 addr, __be16 port)
> +{
> +	struct udpcp_dest *dest;
> +	struct udpcp_sock *usk = udpcp_sk(sk);
> +
> +	dest = kzalloc(sizeof(*dest), sk->sk_allocation);
> +
> +	if (dest) {
> +		skb_queue_head_init(&dest->xmit);
> +		dest->addr = addr;
> +		dest->port = port;
> +		dest->ackmode = UDPCP_ACK;
> +		list_add_tail(&dest->list, &usk->destlist);
> +	}
> +
> +	return dest;
> +}
> +

I have not found what prevents a malicious user to make destlist grow
and consume all memory ?




^ permalink raw reply

* Re: [*v3 PATCH 00/22] IPVS Network Name Space aware
From: Hans Schillstrom @ 2011-01-02 16:27 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: horms, daniel.lezcano, wensong, lvs-devel, netdev,
	netfilter-devel, Hans Schillstrom
In-Reply-To: <alpine.LFD.2.00.1101011421410.2559@ja.ssi.bg>

On Saturday, January 01, 2011 13:27:00 Julian Anastasov wrote:
> 
>  	Hello,
> 
> On Thu, 30 Dec 2010, hans@schillstrom.com wrote:
> 
> > From: Hans Schillstrom <hans.schillstrom@ericsson.com>
> >
> > This patch series adds network name space support to the LVS.
> >
> > REVISION
> >
> > This is version 3
> >
> > OVERVIEW
> >
> > The patch doesn't remove or add any functionality except for netns.
> > For users that don't use network name space (netns) this patch is
> > completely transparent.
> >
> > Now it's possible to run LVS in a Linux container (see lxc-tools)
> > i.e.  a light weight visualization. For example it's possible to run
> > one or several lvs on a real server in their own network name spaces.
> >> From the LVS point of view it looks like it runs on it's own machine.
> 
>  	Only some comments for two of the patches:
> 
> v3 PATCH 10/22 use ip_vs_proto_data as param
> 
>  	- Can ip_vs_protocol_timeout_change() walk
>  	proto_data_table instead of ip_vs_proto_table to avoid the
>  	__ipvs_proto_data_get call?
> 

I just forget to do that ...

> v3 PATCH 15/22 ip_vs_stats
> 
>  	- Is ustats_seq allocated with alloc_percpu?
> 
>  	Such reader sections should be changed to use tmp vars because
>  	on retry we risk to add the values multiple times. For example:
> 
>  		do {
>  			start = read_seqcount_begin(seq_count);
>  			ipvs->ctl_stats->ustats.inbytes += u->inbytes;
>  			ipvs->ctl_stats->ustats.outbytes += u->outbytes;
>  		} while (read_seqcount_retry(seq_count, start));
> 
>  	should be changed as follows:
> 
>  		u64 inbytes, outbytes;
> 
>  		do {
>  			start = read_seqcount_begin(seq_count);
>  			inbytes = u->inbytes;
>  			outbytes = u->outbytes;
>  		} while (read_seqcount_retry(seq_count, start));
>  		ipvs->ctl_stats->ustats.inbytes += inbytes;
>  		ipvs->ctl_stats->ustats.outbytes += outbytes;
> 

Ooops, that was a bug :-(

>  	Or it is better to create new struct for percpu stats,
>  	they will have their own syncp, because we can not
>  	change struct ip_vs_stats_user. syncp should be percpu
>  	because we remove locks.
> 
>  	For example:
> 
>  	struct ip_vs_cpu_stats {
>  		struct ip_vs_stats_user	ustats;
>  		struct u64_stats_sync	syncp;
>  	};
> 
>  	Then we can add this in struct netns_ipvs:
> 
>  	struct ip_vs_cpu_stats __percpu *stats;	/* Statistics */
> 
>  	without the seqcount_t * ustats_seq;
> 
>  	Then syncp does not need any initialization, it seems
>  	alloc_percpu returns zeroed area.
> 
>  	When we use percpu stats for all places (dest and svc) we
>  	can create new struct struct ip_vs_counters, so that we
>  	can reduce the memory usage from percpu data. Now stats
>  	include counters and estimated values. The estimated
>  	values should not be percpu. Then ip_vs_cpu_stats
>  	will be shorter (it is not visible to user space):
> 
>  	struct ip_vs_cpu_stats {
>  		struct ip_vs_counters	ustats;
>  		struct u64_stats_sync	syncp;
>  	};
> 
>  	For writer side in softirq context we should protect the whole
>  	section with u64 counters, for example this code:
> 

There is only two u64,  inbytes and outbytes

>  	spin_lock(&ip_vs_stats.lock);
>  	ip_vs_stats.ustats.outpkts++;
>  	ip_vs_stats.ustats.outbytes += skb->len;
>  	spin_unlock(&ip_vs_stats.lock);
> 
>  	should be changed to:
> 
>  	struct ip_vs_cpu_stats *s = this_cpu_ptr(ipvs->stats);
> 
>  	u64_stats_update_begin(&s->syncp);
>  	s->ustats.outpkts++;
>  	s->ustats.outbytes += skb->len;
>  	u64_stats_update_end(&s->syncp);
> 
>  	Readers should look in similar way, only that _bh fetching
>  	should be used only for the u64 counters because writer is in
>  	softirq context:
> 
>  	u64 inbytes, outbytes;
> 
>  	for_each_possible_cpu(i) {
>  		const struct ip_vs_cpu_stats *s = per_cpu_ptr(ipvs->stats, i);
>  		unsigned int start;
> 
>  		...
>  		do {
>  			start = u64_stats_fetch_begin_bh(&s->syncp);
>  			inbytes = s->ustats.inbytes;
>  			outbytes = s->ustats.outbytes;
>  		} while (u64_stats_fetch_retry_bh(&s->syncp, start));
>  		ipvs->ctl_stats->ustats.inbytes += inbytes;
>  		ipvs->ctl_stats->ustats.outbytes += outbytes;
>  	}
> 
>  	Then IPVS_STAT_ADD and IPVS_STAT_INC can not access
>  	the syncp. They do not look generic enough, in case we
>  	decide to use struct ip_vs_cpu_stats for dest and svc stats.
>  	I think, these macros are not needed.
>  	It is possible to have other macros for write side,
>  	for percpu stats:
> 
>  	#define IP_VS_STATS_UPDATE_BEGIN(stats)		\
>  		{ struct ip_vs_cpu_stats *s = this_cpu_ptr(stats); \
>  		u64_stats_update_begin(&s->syncp)
> 
>  	#define IP_VS_STATS_UPDATE_END()		\
>  		u64_stats_update_end(&s->syncp);	\
>  		}
> 
>  	Then usage can be:
> 
>  	IP_VS_STATS_UPDATE_BEGIN(ipvs->stats);
>  	s->ustats.outpkts++;
>  	s->ustats.outbytes += skb->len;
>  	IP_VS_STATS_UPDATE_END();
> 
>  	Then we can rename ctl_stats to total_stats or full_stats.
>  	get_stats() can be changed to ip_vs_read_cpu_stats(), it
>  	will be used to read all percpu stats as sum in the
>  	total_stats/full_stats structure or tmp space:
> 
>  	/* Get sum of percpu stats */
>  	... ip_vs_read_cpu_stats(struct ip_vs_stats_user *sum,
>  		struct ip_vs_cpu_stats *stats)
>  	{
>  		...
>  		for_each_possible_cpu(i) {
>  			const struct ip_vs_cpu_stats *s = per_cpu_ptr(stats, i);
>  			unsigned int start;
> 
>  			if (i) {...
>  			...
>  			do {
>  				start = u64_stats_fetch_begin_bh(&s->syncp);
>  				inbytes = s->ustats.inbytes;
>  				outbytes = s->ustats.outbytes;
>  			} while (u64_stats_fetch_retry_bh(&s->syncp, start));
>  			sum->inbytes += inbytes;
>  			sum->outbytes += outbytes;
>  		}
>  	}
> 
>  	As result, ip_vs_read_cpu_stats() will be used in
>  	estimation_timer() instead of get_stats():
> 
>  	ip_vs_read_cpu_stats(&ipvs->total_stats->ustats, ipvs->stats);
> 
>  	but also in ip_vs_stats_show()
>  	{
>  		// I hope struct ip_vs_stats_user can fit in stack:
>  		struct ip_vs_stats_user sum;
> 
>  		here we do not need to use spin_lock_bh(&ctl_stats->lock);
>  		because now the stats are lockless, estimation_timer
>  		does not use ctl_stats->lock, so we have to call
>  		ip_vs_read_cpu_stats to avoid problems with
>  		reading incorrect u64 values directly from
>  		ipvs->total_stats->ustats.
> 
>  		ip_vs_read_cpu_stats(&sum, ipvs->stats);
>  		seq_printf for sum
>  	}
> 
>  	ip_vs_read_cpu_stats can be used also in ip_vs_copy_stats()
>  	which copies values to user space and needs proper u64 reading.
>  	But it is used only for svc and dest stats which are not
>  	percpu yet.
> 
>  	Now this code does not look ok:
> 
>  	ip_vs_zero_stats(net_ipvs(net)->ctl_stats);
> 
>  	May be we need new func ip_vs_zero_percpu_stats that will
>  	reset stats for all CPUs, i.e. ipvs->stats in our case?
>  	Even if this operation is not very safe if done from
>  	user space while the softirq can modify u64 counters
>  	on another CPU.

OK, I will take a look at this when I'm back from my vacation at 20 Jan.
I guess it might be worth the work to make all the stat counters per_cpu.

My tests shows that the big "CPU cycle bandit" was the one that I made per_cpu.

> 
> 
> Happy New Year!
> 

^ permalink raw reply

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
From: Eric Dumazet @ 2011-01-02 16:05 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Matt Carlson, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <AANLkTinq9ABi0prxe0m1Ma6rno4UnAOzuW-b5L362ij9@mail.gmail.com>

Le samedi 01 janvier 2011 à 19:27 -0500, Jesse Gross a écrit :
> On Sat, Jan 1, 2011 at 12:03 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > Le mardi 14 décembre 2010 à 11:15 -0800, Matt Carlson a écrit :
> >
> >> Thanks for the comments Jesse.  Below is an updated patch.
> >>
> >> Michael, I'm wondering if the difference in behavior can be explained by
> >> the presence or absence of management firmware.  Can you look at the
> >> driver sign-on messages in your syslogs for ASF[]?  I'm half expecting
> >> the 5752 to show "ASF[0]" and the 5714 to show "ASF[1]".  If you see
> >> this, and the below patch doesn't fix the problem, let me know.  I have
> >> another test I'd like you to run.
> >>
> >> ----
> >>
> >> [PATCH] tg3: Use new VLAN code
> >>
> >> This patch pivots the tg3 driver to the new VLAN infrastructure.
> >> All references to vlgrp have been removed and all VLAN code is
> >> unconditionally active.
> >>
> >> Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
> 
> [...]
> 
> > Hi Matt.
> >
> > Any news on this patch ?
> >
> > Without it, net-next-2.6 doesnt work for me on a vlan setup on top of
> > bonding.
> >
> > (bond0 : eth1 & eth2, eth1 being bnx2, eth2 beging tg3)
> >
> > ip link add link bond0 vlan.103 type vlan id 103
> > ip addr add 192.168.20.110/24 dev vlan.103
> > ip link set vlan.103 up
> >
> >
> > If active slave is eth1 (bnx2), everything works, but if active slave is
> > eth2 (tg3), incoming tagged frames (on vlan 103) are lost.
> 
> This patch isn't quite right - it always disables vlan stripping
> unless management firmware is in use, so it's not really a correct
> fix.
> 
> You said that this used to work correctly on this NIC?  Does it work
> without a bond, just a vlan on the tg3 device?  It sounds like Michael
> has a problem with vlan stripping on one of his NICs but if it works
> with just a vlan or on older kernels, it's probably not the same
> thing.
> 

1) current linux-2.6 works OK for me (and previous versions as well, I
am using this vlan/bonding setup since 3 years or so on one of my dev
machine)

Only net-next-2.6 has the problem.

If I remove bonding of the equation, I still have the problem, and can
see the 'dropped' counter increasing while I send packets to eth2 (tg3)

$ ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 00:1E:0B:92:78:50  
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:94 errors:0 dropped:38686 overruns:0 frame:0
          TX packets:18 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:8332 (8.1 Kb)  TX bytes:1392 (1.3 Kb)
          Interrupt:19 
$ ifconfig vlan.103
vlan.103  Link encap:Ethernet  HWaddr 00:1E:0B:92:78:50  
          inet addr:192.168.20.110  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 b)  TX bytes:846 (846.0 b)





> If it works on bnx2, it would seem to be a driver problem but it would
> be good to confirm that the tag in skb->vlan_tci is not being
> delievered to the networking core in this case.

Hmm, where do you want me to check this ?

^ permalink raw reply

* Re: [net-next-2.6 PATCH v2 2/4] dcbnl: adding DCBX feature flags get-set
From: Shmulik Ravid @ 2011-01-02 16:01 UTC (permalink / raw)
  To: John Fastabend
  Cc: davem@davemloft.net, Eilon Greenstein, Liu, Lucy,
	netdev@vger.kernel.org
In-Reply-To: <4D1CD814.2010903@intel.com>


> One more nit ;)
> 
> > +
> > +	ret = nla_parse_nested(data, DCB_FEATCFG_ATTR_MAX, tb[DCB_ATTR_FEATCFG],
> > +			       dcbnl_featcfg_nest);
> > +	if (ret) {
> > +		ret = -EINVAL;
> > +		goto err_out;
> > +	}
> 
> Why do you set EINVAL here if you use the returned error code from nla_parse_nested you get a more descriptive error. See ./lib/nlattr.c:nla_parse()/validate_nla().
> 
> [...]
> 
> > +static int dcbnl_setfeatcfg(struct net_device *netdev, struct nlattr **tb,
> > +			    u32 pid, u32 seq, u16 flags)
> > +{
> > +	struct nlattr *data[DCB_FEATCFG_ATTR_MAX + 1];
> > +	int ret = -EINVAL;
> > +	u8 value;
> > +	int i;
> > +
> > +	if (!tb[DCB_ATTR_FEATCFG] || !netdev->dcbnl_ops->setfeatcfg)
> > +		return ret;
> > +
> > +	ret = nla_parse_nested(data, DCB_FEATCFG_ATTR_MAX, tb[DCB_ATTR_FEATCFG],
> > +			       dcbnl_featcfg_nest);
> > +
> > +	if (ret) {
> > +		ret = -EINVAL;
> > +		goto err;
> > +	}
> 
> Same here.
> 

I'll send a patch with the improved return values for the the new dcbnl
routines. While I'm at it, is it safe to fix on the same lines the
older already established dcbnl routines?

Thanks,
Shmulik




^ permalink raw reply

* [PATCH] new UDPCP Communication Protocol
From: stefani @ 2011-01-02 15:31 UTC (permalink / raw)
  To: linux-kernel, akpm, davem, netdev, eric.dumazet, shemminger; +Cc: stefani

From: Stefani Seibold <stefani@seibold.net>

Changelog:
31.12.2010 first proposal
01.01.2011 code cleanup and fixes suggest by Eric Dumazet
02.01.2011 kick away UDP-Lite support
           change spin_lock_irq into spin_lock_bh
	   faster udpcp_release_sock
	   base is now linux-next

UDPCP is a communication protocol specified by the Open Base Station
Architecture Initiative Special Interest Group (OBSAI SIG). The
protocol is based on UDP and is designed to meet the needs of "Mobile
Communcation Base Station" internal communications. It is widely used by
the major networks infrastructure supplier.

The UDPCP communication service supports the following features:

-Connectionless communication for serial mode data transfer
-Acknowledged and unacknowledged transfer modes
-Retransmissions Algorithm
-Checksum Algorithm using Adler32
-Fragmentation of long messages (disassembly/reassembly) to match to the MTU
 during transport:
-Broadcasting and multicasting messages to multiple peers in unacknowledged
  transfer mode

UDPCP supports application level messages up to 64 KBytes (limited by 16-bit
packet data length field). Messages that are longer than the MTU will be
fragmented to the MTU.

UDPCP provides a reliable transport service that will perform message
retransmissions in case transport failures occur.

The code is also a nice example how to implement a UDP based protocol as
a kernel socket modules.

Due the nature of UDPCP which has no sliding windows support, the latency has
a huge impact. The perfomance increase by implementing as a kernel module is
about the factor 10, because there are no context switches and data packets or
ACKs will be handled in the interrupt service.

There are no side effects to the network subsystems so i ask for merge it
into linux-next. Hope you like it.

The patch is against linux next-20101231

- Stefani

Signed-off-by: Stefani Seibold <stefani@seibold.net>
---
 include/linux/socket.h |    9 +-
 include/net/udp.h      |    1 +
 include/net/udpcp.h    |   47 +
 net/Kconfig            |    1 +
 net/Makefile           |    1 +
 net/ipv4/ip_output.c   |    2 +
 net/ipv4/ip_sockglue.c |    2 +
 net/udpcp/Kconfig      |   34 +
 net/udpcp/Makefile     |    5 +
 net/udpcp/udpcp.c      | 2841 ++++++++++++++++++++++++++++++++++++++++++++++++
 10 files changed, 2940 insertions(+), 3 deletions(-)
 create mode 100644 include/net/udpcp.h
 create mode 100644 net/udpcp/Kconfig
 create mode 100644 net/udpcp/Makefile
 create mode 100644 net/udpcp/udpcp.c

diff --git a/include/linux/socket.h b/include/linux/socket.h
index 2dccbeb..2e9157c 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -171,7 +171,7 @@ struct ucred {
 #define AF_DECnet	12	/* Reserved for DECnet project	*/
 #define AF_NETBEUI	13	/* Reserved for 802.2LLC project*/
 #define AF_SECURITY	14	/* Security callback pseudo AF */
-#define AF_KEY		15      /* PF_KEY key management API */
+#define AF_KEY		15	/* PF_KEY key management API */
 #define AF_NETLINK	16
 #define AF_ROUTE	AF_NETLINK /* Alias to emulate 4.4BSD */
 #define AF_PACKET	17	/* Packet family		*/
@@ -194,7 +194,8 @@ struct ucred {
 #define AF_IEEE802154	36	/* IEEE802154 sockets		*/
 #define AF_CAIF		37	/* CAIF sockets			*/
 #define AF_ALG		38	/* Algorithm sockets		*/
-#define AF_MAX		39	/* For now.. */
+#define AF_UDPCP	39	/* UDPCP sockets		*/
+#define AF_MAX		40	/* For now.. */
 
 /* Protocol families, same as address families. */
 #define PF_UNSPEC	AF_UNSPEC
@@ -204,7 +205,7 @@ struct ucred {
 #define PF_AX25		AF_AX25
 #define PF_IPX		AF_IPX
 #define PF_APPLETALK	AF_APPLETALK
-#define	PF_NETROM	AF_NETROM
+#define PF_NETROM	AF_NETROM
 #define PF_BRIDGE	AF_BRIDGE
 #define PF_ATMPVC	AF_ATMPVC
 #define PF_X25		AF_X25
@@ -236,6 +237,7 @@ struct ucred {
 #define PF_IEEE802154	AF_IEEE802154
 #define PF_CAIF		AF_CAIF
 #define PF_ALG		AF_ALG
+#define PF_UDPCP	AF_UDPCP
 #define PF_MAX		AF_MAX
 
 /* Maximum queue length specifiable by listen.  */
@@ -310,6 +312,7 @@ struct ucred {
 #define SOL_IUCV	277
 #define SOL_CAIF	278
 #define SOL_ALG		279
+#define SOL_UDPCP	280
 
 /* IPX options */
 #define IPX_TYPE	1
diff --git a/include/net/udp.h b/include/net/udp.h
index bb967dd..82c95a7 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -47,6 +47,7 @@ struct udp_skb_cb {
 	} header;
 	__u16		cscov;
 	__u8		partial_cov;
+	__u8            udpcp_flag;
 };
 #define UDP_SKB_CB(__skb)	((struct udp_skb_cb *)((__skb)->cb))
 
diff --git a/include/net/udpcp.h b/include/net/udpcp.h
new file mode 100644
index 0000000..45180a5
--- /dev/null
+++ b/include/net/udpcp.h
@@ -0,0 +1,47 @@
+/* Definitions for UDPCP sockets. */
+
+#ifndef __LINUX_IF_UDPCP
+#define __LINUX_IF_UDPCP
+
+#include "linux/ioctl.h"
+
+#define UDPCP_MAX_MSGSIZE	65487
+
+#define UDPCP_MAX_WAIT_SEC	60
+
+#define UDPCP_OPT_TRANSFER_MODE		0
+#define UDPCP_OPT_CHECKSUM_MODE		1
+#define UDPCP_OPT_TX_TIMEOUT		2
+#define UDPCP_OPT_RX_TIMEOUT		3
+#define UDPCP_OPT_MAXTRY		4
+#define UDPCP_OPT_OUTSTANDING_ACKS	5
+
+#define UDPCP_NOACK		0
+#define UDPCP_ACK		1
+#define UDPCP_SINGLE_ACK	2
+#define UDPCP_NOCHECKSUM	3
+#define UDPCP_CHECKSUM		4
+
+#define UDPCP_IOC_MAGIC  251
+
+#define UDPCP_IOCTL_GET_STATISTICS \
+	_IOR(UDPCP_IOC_MAGIC, 0x01, struct udpcp_statistics *)
+#define UDPCP_IOCTL_RESET_STATISTICS \
+	_IO(UDPCP_IOC_MAGIC, 0x02)
+#define UDPCP_IOCTL_SYNC \
+	_IOR(UDPCP_IOC_MAGIC, 0x03, unsigned long)
+
+struct udpcp_statistics {
+	unsigned int txMsgs;		/* Num of transmitted messages */
+	unsigned int rxMsgs;		/* Num of received messages */
+	unsigned int txNodes;		/* Num of receiver nodes */
+	unsigned int rxNodes;		/* Num of transmitter nodes */
+	unsigned int txTimeout;		/* Num of unsuccessful transmissions */
+	unsigned int rxTimeout;		/* Num of partial message receptions */
+	unsigned int txRetries;		/* Num of resends */
+	unsigned int rxDiscardedFrags;	/* Num of discarded fragments */
+	unsigned int crcErrors;		/* Num of crc errors detected */
+};
+
+#endif
+
diff --git a/net/Kconfig b/net/Kconfig
index 7284062..4b3b619 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -302,6 +302,7 @@ source "net/rfkill/Kconfig"
 source "net/9p/Kconfig"
 source "net/caif/Kconfig"
 source "net/ceph/Kconfig"
+source "net/udpcp/Kconfig"
 
 
 endif   # if NET
diff --git a/net/Makefile b/net/Makefile
index a3330eb..388a582 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -70,3 +70,4 @@ obj-$(CONFIG_WIMAX)		+= wimax/
 obj-$(CONFIG_DNS_RESOLVER)	+= dns_resolver/
 obj-$(CONFIG_CEPH_LIB)		+= ceph/
 obj-$(CONFIG_BATMAN_ADV)	+= batman-adv/
+obj-$(CONFIG_UDPCP)		+= udpcp/
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 04c7b3b..41f9276 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1084,6 +1084,7 @@ error:
 	IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTDISCARDS);
 	return err;
 }
+EXPORT_SYMBOL(ip_append_data);
 
 ssize_t	ip_append_page(struct sock *sk, struct page *page,
 		       int offset, size_t size, int flags)
@@ -1340,6 +1341,7 @@ error:
 	IP_INC_STATS(net, IPSTATS_MIB_OUTDISCARDS);
 	goto out;
 }
+EXPORT_SYMBOL(ip_push_pending_frames);
 
 /*
  *	Throw away all pending data on the socket.
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 3948c86..310369c 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -226,6 +226,7 @@ int ip_cmsg_send(struct net *net, struct msghdr *msg, struct ipcm_cookie *ipc)
 	}
 	return 0;
 }
+EXPORT_SYMBOL(ip_cmsg_send);
 
 
 /* Special input handler for packets caught by router alert option.
@@ -369,6 +370,7 @@ void ip_local_error(struct sock *sk, int err, __be32 daddr, __be16 port, u32 inf
 	if (sock_queue_err_skb(sk, skb))
 		kfree_skb(skb);
 }
+EXPORT_SYMBOL(ip_local_error);
 
 /*
  *	Handle MSG_ERRQUEUE
diff --git a/net/udpcp/Kconfig b/net/udpcp/Kconfig
new file mode 100644
index 0000000..a58c1b0
--- /dev/null
+++ b/net/udpcp/Kconfig
@@ -0,0 +1,34 @@
+#
+# UDPCP protocol
+#
+
+config UDPCP
+	tristate "UDPCP Communication Protocol"
+	depends on INET
+	---help---
+	  UDPCP is a communication protocol specified by the Open Base Station
+	  Architecture Initiative Special Interest Group (OBSAI SIG). The
+	  protocol is based on UDP and is designed to meet the needs of "Mobile
+	  Communcation Base Station" internal communications.
+
+	  The UDPCP communication service supports the following features:
+
+          -Connectionless communication for serial mode data transfer
+          -Acknowledged and unacknowledged transfer modes
+          -Retransmissions Algorithm
+          -Checksum Algorithm using Adler32
+          -Fragmentation of long messages (disassembly/reassembly) to
+           match to the MTU during transport:
+          -Broadcasting and multicasting messages to multiple peers in
+           unacknowledged transfer mode
+
+          UDPCP supports application level messages up to 64 KBytes (limited
+          by 16-bit packet data length field). Messages that are longer than the
+          MTU will be fragmented to the MTU.
+
+          UDPCP provides a reliable transport service that will perform message
+          retransmissions in case transport failures occur.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called udpcp.
+
diff --git a/net/udpcp/Makefile b/net/udpcp/Makefile
new file mode 100644
index 0000000..37f87c5
--- /dev/null
+++ b/net/udpcp/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for UDPCP support code.
+#
+
+obj-$(CONFIG_UDPCP) += udpcp.o
diff --git a/net/udpcp/udpcp.c b/net/udpcp/udpcp.c
new file mode 100644
index 0000000..62eab61
--- /dev/null
+++ b/net/udpcp/udpcp.c
@@ -0,0 +1,2841 @@
+/*
+ * UDPCP communication protocol
+ *
+ * Copyright (C) 2010 Stefani Seibold <stefani@seibold.net>
+ * in order of NSN Ulm/Germany
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ *
+ */
+
+#include <net/xfrm.h>
+#include <net/protocol.h>
+#include <net/ip.h>
+#include <net/udp.h>
+#include <net/inet_common.h>
+#include <linux/zutil.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/spinlock.h>
+#include <linux/errqueue.h>
+#include <linux/atomic.h>
+
+#include <net/udpcp.h>
+
+#define VERSION	"0.71"
+
+/*
+ * UDPCP Protocol default parameters
+ */
+#define UDPCP_TX_TIMEOUT	100	/* milliseconds */
+#define UDPCP_RX_TIMEOUT	1000	/* milliseconds */
+#define UDPCP_TX_MAXTRY		5
+#define UDPCP_OUTSTANDING_ACKS	1
+
+/*
+ * UDPCP Protocol definitions
+ */
+#define UDPCP_MSG_TYPE_BIT		14
+#define UDPCP_PROTOCOL_VERSION_BIT	11
+#define UDPCP_NO_ACK_BIT		10
+#define UDPCP_CHECKSUM_BIT		9
+#define UDPCP_SINGLE_ACK_BIT		8
+#define UDPCP_DUPLICATE_BIT		7
+
+#define UDPCP_MSG_TYPE_MASK		(3 << UDPCP_MSG_TYPE_BIT)
+#define UDPCP_PROTOCOL_MASK		(7 << UDPCP_PROTOCOL_VERSION_BIT)
+
+#define UDPCP_MSG_TYPE_DATA		(1 << UDPCP_MSG_TYPE_BIT)
+#define UDPCP_MSG_TYPE_ACK		(2 << UDPCP_MSG_TYPE_BIT)
+#define UDPCP_PROTOCOL_VERSION_2	(2 << UDPCP_PROTOCOL_VERSION_BIT)
+
+#define UDPCP_NO_ACK_FLAG		(1 << UDPCP_NO_ACK_BIT)
+#define UDPCP_CHECKSUM_FLAG		(1 << UDPCP_CHECKSUM_BIT)
+#define UDPCP_SINGLE_ACK_FLAG		(1 << UDPCP_SINGLE_ACK_BIT)
+#define UDPCP_DUPLICATE_FLAG		(1 << UDPCP_DUPLICATE_BIT)
+
+/*
+ * helper macros
+ */
+#define list_to_udpcpdest(d) container_of(d, struct udpcp_dest, list)
+#define list_to_udpcpsock(d) container_of(d, struct udpcp_sock, udpcplist)
+
+#define UDPCP_HDRSIZE	(sizeof(struct udpcphdr)-sizeof(struct udphdr))
+
+#define RX_NODE	1
+#define TX_NODE	2
+
+/*
+ * name of the /proc entry
+ */
+#define UDPCP_PROC	"driver/udpcp"
+
+/*
+ * UDPCP message header
+ */
+struct udpcphdr {
+	struct udphdr udphdr;
+	__be32 chksum;
+	__be16 msginfo;
+	u8 fragamount;
+	u8 fragnum;
+	__be16 msgid;
+	__be16 length;
+};
+
+/*
+ * UDPCP destination descriptor
+ *
+ * For each communication address an individual destination descriptor will
+ * be create.
+ *
+ * The fields has the following meanings:
+ *
+ * list:		link list: part of udpcp_sock.destlist
+ * xmit:		messages fragments to be transmit
+ * tx_time:		timestamp of the last transmitted message fragment
+ * rx_time:		timestamp ot the last received message fragment
+ * txTimeout:		statistic use only: number of transmit timeout
+ * rxTimeout:		statistic use only: number of receive timeout
+ * txRetries:		statistic use only: number of transmit retries
+ * rxDiscardedFrags:	statistic use only: number of discarded messages
+ * xmit_wait:		message fragment which is waiting for an ACK
+ * xmit_last:		last fragment transmitted
+ * recv_msg:		first fragment of the received message
+ * recv_last:		last fragment of the received message
+ * lastmsg:		last messages fragment header received
+ * ipc:			linux internal ipc cookie
+ * fl:			flow/routing information
+ * rt:			routing entry currently used for this destination
+ * addr:		ipv4 destination address
+ * port:		destination port number
+ * msgid:		current message id for outgoing data messages
+ * use_flag:		statistic use only: flag for dest using TX and/or RX
+ * insync:		flag for protocol synchronization
+ * ackmode;		ack mode for the current assembled message
+ * chkmode;		checksum mode for the current assembled message
+ * try:			current number of retries xmit_wait message
+ * acks:		number of outstandig ack's
+ */
+struct udpcp_dest {
+	struct list_head list;
+	struct sk_buff_head xmit;
+	unsigned long tx_time;
+	unsigned long rx_time;
+	u32 txTimeout;
+	u32 rxTimeout;
+	u32 txRetries;
+	u32 rxDiscardedFrags;
+	struct sk_buff *xmit_wait;
+	struct sk_buff *xmit_last;
+	struct sk_buff *recv_msg;
+	struct sk_buff *recv_last;
+	struct udpcphdr lastmsg;
+	struct ipcm_cookie ipc;
+	struct flowi fl;
+	struct rtable *rt;
+	__be32 addr;
+	__be16 port;
+	u16 msgid;
+	u8 use_flag;
+	u8 insync;
+	u8 ackmode;
+	u8 chkmode;
+	u8 try;
+	u8 acks;
+};
+
+/*
+ * UDPCP socket descriptor
+ *
+ * For each opened socket individual socket descriptor will
+ * be created
+ *
+ * The fields has the following meanings:
+ *
+ * udpsock:		UDP socket has to be the first member of udpcp_sock
+ * assembly:		messages fragments currently assembled
+ * assembly_len:	current length of the assembled message
+ * assembly_dest:	current destination assembled
+ * wq:			wait queue for UDPCP_IOCTL_SYNC
+ * destlist:		head of destination descriptors link list
+ * udpcplist:		link list: part of udpcp_list
+ * timer:		timeout handler
+ * stat:		statistics for this socket
+ * pending:		number of pending messages fragment in the queues
+ * tx_timeout:		transmit timeout in jiffies
+ * rx_timeout:		receive timeout in jiffies
+ * udp_data_ready:	original data_ready handler for this socket
+ * ackmode:		default ack mode
+ * chkmode:		default checksum mode
+ * maxtry:		max. number of resends
+ * acks:		max. number of outstandig ack's
+ * timeout:		flag for unhandled timeout
+ */
+struct udpcp_sock {
+	struct udp_sock udpsock;
+	struct sk_buff_head assembly;
+	u32 assembly_len;
+	struct udpcp_dest *assembly_dest;
+	wait_queue_head_t wq;
+	struct list_head destlist;
+	struct list_head udpcplist;
+	struct timer_list timer;
+	struct udpcp_statistics stat;
+	u32 pending;
+	unsigned long tx_timeout;
+	unsigned long rx_timeout;
+	void (*udp_data_ready) (struct sock *sk, int bytes);
+	u8 ackmode;
+	u8 chkmode;
+	u8 maxtry;
+	u8 acks;
+	u8 timeout;
+};
+
+/* head of struct udpcp_sock.udpcplist link list */
+static struct list_head udpcp_list;
+
+/* spinlock for race free access to the static variables */
+static spinlock_t udpcp_lock;
+
+/* debug flag, set != 0 to enable debug */
+static int debug;
+
+/* overall UDPCP statistics */
+static atomic_t udpcp_txMsgs;
+static atomic_t udpcp_rxMsgs;
+static atomic_t udpcp_txNodes;
+static atomic_t udpcp_rxNodes;
+static atomic_t udpcp_txTimeout;
+static atomic_t udpcp_rxTimeout;
+static atomic_t udpcp_txRetries;
+static atomic_t udpcp_rxDiscardedFrags;
+static atomic_t udpcp_crcErrors;
+
+module_param(debug, int, 0);
+MODULE_PARM_DESC(debug, "Debug enabled or not");
+
+#ifdef CONFIG_PROC_FS
+/*
+ * Handle /proc/driver/udpcp
+ *
+ * Show the statistics information
+ */
+static int udpcp_proc(char *page, char **start, off_t off, int count, int *eof,
+		      void *data)
+{
+	int len;
+
+	len = snprintf(page, count,
+		       "txMsgs:          %u\n"
+		       "rxMsgs:          %u\n"
+		       "txNodes:         %u\n"
+		       "rxNodes:         %u\n"
+		       "txTimeout:       %u\n"
+		       "rxTimeout:       %u\n"
+		       "txRetries:       %u\n"
+		       "rxDiscaredFrags: %u\n"
+		       "crcErrors:       %u\n",
+			atomic_read(&udpcp_txMsgs),
+			atomic_read(&udpcp_rxMsgs),
+			atomic_read(&udpcp_txNodes),
+			atomic_read(&udpcp_rxNodes),
+			atomic_read(&udpcp_txTimeout),
+			atomic_read(&udpcp_rxTimeout),
+			atomic_read(&udpcp_txRetries),
+			atomic_read(&udpcp_rxDiscardedFrags),
+			atomic_read(&udpcp_crcErrors)
+		);
+
+	if (len <= off)
+		return 0;
+
+	len -= off;
+
+	if (len > count)
+		return count;
+
+	return len;
+}
+#endif
+
+/*
+ * Helper for the UDPCP header from a socket buffer
+ */
+static inline struct udpcphdr *udpcp_hdr(const struct sk_buff *skb)
+{
+	return (struct udpcphdr *)skb_transport_header(skb);
+}
+
+/*
+ * Helper for conversion a basic socket into a UDPCP socket
+ */
+static inline struct udpcp_sock *udpcp_sk(const struct sock *sk)
+{
+	return (struct udpcp_sock *)sk;
+}
+
+/*
+ * Dump the transport data of a socket buffer
+ */
+static inline void dump_data(struct sk_buff *skb, unsigned int max)
+{
+	unsigned int i;
+	unsigned char *data;
+	int data_len;
+
+	data = skb_transport_header(skb) + sizeof(struct udpcphdr);
+	data_len = skb_tail_pointer(skb) - data;
+
+	pr_debug(" data: ");
+
+	if (!data_len) {
+		pr_cont("<none>\n");
+		return;
+	}
+
+	if (max > data_len)
+		max = data_len;
+
+	for (i = 0; i < max; i++)
+		pr_cont("%02x ", data[i]);
+
+	if (data_len > max)
+		pr_cont("...");
+	pr_cont("\n");
+}
+
+/*
+ * Dump and decode a msginfo value
+ */
+static inline void dump_msginfo(u16 msginfo)
+{
+	pr_debug(" msginfo:0x%04x (", msginfo);
+
+	pr_cont("PCKT:");
+	switch (msginfo & UDPCP_MSG_TYPE_MASK) {
+	case UDPCP_MSG_TYPE_DATA:
+		pr_cont("DATA");
+		break;
+	case UDPCP_MSG_TYPE_ACK:
+		pr_cont("ACK");
+		break;
+	default:
+		pr_cont("UNKNOWN");
+		break;
+	}
+	pr_cont(" VER:%d",
+	       (msginfo & UDPCP_PROTOCOL_MASK) >> UDPCP_PROTOCOL_VERSION_BIT);
+
+	if (msginfo & UDPCP_NO_ACK_FLAG)
+		pr_cont(" NO_ACK");
+	if (msginfo & UDPCP_CHECKSUM_FLAG)
+		pr_cont(" CHECKSUM");
+	if (msginfo & UDPCP_SINGLE_ACK_FLAG)
+		pr_cont(" SINGLE_ACK");
+	if (msginfo & UDPCP_DUPLICATE_FLAG)
+		pr_cont(" DUPLICATE");
+	pr_cont(")\n");
+}
+
+/*
+ * Dump and decode a UDPCP message fragment
+ */
+static void dump_msg(const char *action, struct sk_buff *skb, __be32 saddr,
+		     __be32 daddr)
+{
+	struct udpcphdr *uh = udpcp_hdr(skb);
+
+	pr_debug("udpcp: %s (%lu)\n", action, jiffies);
+
+	pr_debug(" src:0x%08x:%d dst:0x%08x:%d fraglen:%d\n",
+	       saddr, uh->udphdr.source, daddr, uh->udphdr.dest, skb->len);
+
+	pr_debug(" fragamount:%u fragnum:%u msgid:%u%s"
+		 " length:%u checksum:0x%08x\n",
+	       uh->fragamount, uh->fragnum, ntohs(uh->msgid),
+	       (!uh->msgid) ? "(Sync)" : "", ntohs(uh->length),
+	       ntohl(uh->chksum)
+	    );
+
+	dump_msginfo(ntohs(uh->msginfo));
+	dump_data(skb, 16);
+}
+
+/*
+ * Create a new destination descriptor for the given IPV4 address and port
+ */
+static struct udpcp_dest *new_dest(struct sock *sk, __be32 addr, __be16 port)
+{
+	struct udpcp_dest *dest;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	dest = kzalloc(sizeof(*dest), sk->sk_allocation);
+
+	if (dest) {
+		skb_queue_head_init(&dest->xmit);
+		dest->addr = addr;
+		dest->port = port;
+		dest->ackmode = UDPCP_ACK;
+		list_add_tail(&dest->list, &usk->destlist);
+	}
+
+	return dest;
+}
+
+/*
+ * Lookup for a destination descriptor for the given IPV4 address and port
+ */
+static struct udpcp_dest *__find_dest(struct sock *sk, __be32 addr, __be16 port)
+{
+	struct udpcp_dest *dest;
+	struct list_head *p;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	list_for_each(p, &usk->destlist) {
+		dest = list_to_udpcpdest(p);
+
+		if ((dest->addr == addr) && (dest->port == port))
+			return dest;
+	}
+	return NULL;
+}
+
+/*
+ * Lookup for a destination descriptor and create a new one if no
+ * descriptor was found.
+ */
+static struct udpcp_dest *find_dest(struct sock *sk, __be32 addr, __be16 port)
+{
+	struct udpcp_dest *dest;
+
+	dest = __find_dest(sk, addr, port);
+
+	if (!dest)
+		dest = new_dest(sk, addr, port);
+
+	return dest;
+}
+
+/*
+ * Calculate udp checksum, mostly stolen from udp stack
+ */
+static void udpcp_do_csum(struct sock *sk, struct sk_buff *skb,
+			  struct udpcp_dest *dest)
+{
+	struct flowi *fl = &dest->fl;
+	struct udphdr *uh = udp_hdr(skb);
+	__wsum csum = 0;
+	unsigned short len = ntohs(uh->len);
+
+	if (sk->sk_no_check == UDP_CSUM_NOXMIT) {
+		skb->ip_summed = CHECKSUM_NONE;
+		return;
+	}
+	if (skb->ip_summed == CHECKSUM_PARTIAL) {
+		/* UDP hardware csum */
+		skb->csum_start = skb_transport_header(skb) - skb->head;
+		skb->csum_offset = offsetof(struct udphdr, check);
+		uh->check =
+		    ~csum_tcpudp_magic(fl->fl4_src, fl->fl4_dst, len,
+				       sk->sk_protocol, 0);
+		return;
+	}
+	csum = csum_partial(uh, sizeof(struct udpcphdr), 0);
+	csum = csum_add(csum, skb->csum);
+
+	/* add protocol-dependent pseudo-header */
+	uh->check =
+	    csum_tcpudp_magic(fl->fl4_src, fl->fl4_dst, len, sk->sk_protocol,
+			      csum);
+	if (uh->check == 0)
+		uh->check = CSUM_MANGLED_0;
+}
+
+/*
+ * Fetch data from kernel space and fill in checksum if needed.
+ */
+static int ip_reply_glue_bits(void *dptr, char *to, int offset,
+			      int len, int odd, struct sk_buff *skb)
+{
+	__wsum csum;
+
+	csum = csum_partial_copy_nocheck(dptr+offset, to, len, 0);
+	skb->csum = csum_block_add(skb->csum, csum, odd);
+	return 0;
+}
+
+/*
+ * Send an ack for a received data message fragment
+ *
+ * If the argument duplicate is true a ACK with UDPCP_DUPLICATE_FLAG set will
+ * be send
+ */
+static void udpcp_send_ack(struct sock *sk, struct sk_buff *skb,
+			   struct udpcp_dest *dest, int duplicate)
+{
+	struct inet_sock *inet = inet_sk(sk);
+	struct udpcphdr *uh = udpcp_hdr(skb);
+	struct rtable *rt = NULL;
+	__wsum csum;
+	struct ipcm_cookie ipc;
+	struct udpcphdr rep;
+
+	memset(&rep, 0, sizeof(rep));
+
+	/* Swap the send and the receive ports. */
+	rep.udphdr.source = uh->udphdr.dest;
+	rep.udphdr.dest = uh->udphdr.source;
+	rep.udphdr.len = htons(sizeof(struct udpcphdr));
+
+	rep.msginfo = htons(UDPCP_MSG_TYPE_ACK |
+			    UDPCP_NO_ACK_FLAG |
+			    UDPCP_SINGLE_ACK_FLAG | UDPCP_PROTOCOL_VERSION_2);
+	if (duplicate)
+		rep.msginfo |= htons(UDPCP_DUPLICATE_FLAG);
+	else
+		memcpy(&dest->lastmsg, uh, sizeof(dest->lastmsg));
+	rep.msgid = uh->msgid;
+	rep.fragamount = uh->fragamount;
+	rep.fragnum = uh->fragnum;
+	rep.length = 0;
+	rep.chksum = 0;
+	if (ntohs(uh->msginfo) & UDPCP_CHECKSUM_FLAG) {
+		u8 *data;
+		u32 data_len;
+
+		data = (u8 *) &rep + sizeof(struct udphdr);
+		data_len = sizeof(struct udpcphdr)-sizeof(struct udphdr);
+
+		rep.msginfo |= htons(UDPCP_CHECKSUM_FLAG);
+		rep.chksum = htonl(zlib_adler32(1, data, data_len));
+	}
+
+	if (unlikely(debug)) {
+		struct sk_buff tmp;
+
+		tmp.len = ntohs(rep.udphdr.len);
+		tmp.head = tmp.transport_header = tmp.data = (void *)&rep;
+		tmp.tail = tmp.head + tmp.len;
+
+		dump_msg("ack msg", &tmp, ip_hdr(skb)->daddr,
+			 ip_hdr(skb)->saddr);
+	}
+
+	csum = csum_tcpudp_nofold(ip_hdr(skb)->daddr,
+				      ip_hdr(skb)->saddr,
+				      sizeof(rep), sk->sk_protocol, 0);
+
+	ipc.addr = dest->addr;
+	ipc.opt = NULL;
+	ipc.tx_flags = 0;
+
+	{
+		struct flowi fl = {
+			.nl_u = { .ip4_u = {
+						.daddr = ipc.addr,
+						.saddr = ip_hdr(skb)->daddr,
+						.tos = RT_TOS(ip_hdr(skb)->tos)
+					      }
+			},
+			.uli_u = { .ports = {
+						.sport = udp_hdr(skb)->dest,
+						.dport = udp_hdr(skb)->source
+				       }
+			},
+			.proto = sk->sk_protocol,
+		};
+		security_skb_classify_flow(skb, &fl);
+		if (ip_route_output_key(sock_net(sk), &rt, &fl))
+			return;
+	}
+
+	inet->tos = ip_hdr(skb)->tos;
+	sk->sk_priority = skb->priority;
+	sk->sk_protocol = ip_hdr(skb)->protocol;
+	sk->sk_bound_dev_if = 0;
+	ip_append_data(sk, ip_reply_glue_bits, &rep, sizeof(rep),
+				0, &ipc, &rt, MSG_DONTWAIT);
+	skb = skb_peek(&sk->sk_write_queue);
+	if (skb) {
+		*((__sum16 *)skb_transport_header(skb) +
+		  offsetof(struct udphdr, check) / 2) =
+			csum_fold(csum_add(skb->csum, csum));
+		skb->ip_summed = CHECKSUM_NONE;
+		ip_push_pending_frames(sk);
+	}
+
+	ip_rt_put(rt);
+
+	UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_OUTDATAGRAMS, 0);
+}
+
+/*
+ * Pass a UDPCP skb buffer to the ip stack and send it
+ */
+static int udpcp_send_skb(struct sock *sk, struct sk_buff *skb,
+			  struct udpcp_dest *dest, struct ip_options *opt)
+{
+	int err;
+
+	skb_dst_set(skb, dst_clone(&dest->rt->dst));
+
+	err = ip_build_and_send_pkt(skb, sk, dest->fl.fl4_src,
+					dest->fl.fl4_dst, opt);
+
+	if (!err)
+		UDP_INC_STATS_USER(sock_net(sk), UDP_MIB_OUTDATAGRAMS, 0);
+	return err;
+}
+
+/*
+ * Release a routing table entry if no packed will be assembled
+ */
+static void udpcp_dst_release(struct udpcp_sock *usk, struct udpcp_dest *dest)
+{
+	if (usk->assembly_dest != dest) {
+		dst_release(&dest->rt->dst);
+		dest->rt = NULL;
+	}
+}
+
+/*
+ * Return true it the passed skb socket buffer is the last in the list
+ */
+static inline bool skb_is_eoq(const struct sk_buff_head *list,
+			      const struct sk_buff *skb)
+{
+	return (skb->next == (struct sk_buff *)list);
+}
+
+/*
+ * Arm the timeout handler for the socket
+ */
+static void udpcp_timer(struct sock *sk, unsigned long timeout)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	mod_timer(&usk->timer, timeout);
+}
+
+/*
+ * Decrement the socket pending counter and wakeup a waiting UDPCP_IOCTL_SYNC
+ */
+static inline void udpcp_dec_pending(struct sock *sk)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	if (!--usk->pending) {
+		if (waitqueue_active(&usk->wq))
+			wake_up_interruptible(&usk->wq);
+	}
+}
+
+/*
+ * Returns true is the passed message fragment is the last fragment
+ */
+static inline int udpcp_is_last_frag(struct udpcphdr *uh)
+{
+	return uh->fragamount == uh->fragnum + 1;
+}
+
+/*
+ * Transmit data message fragments
+ */
+static int _udpcp_xmit(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct sk_buff *skb = NULL;
+	struct sk_buff *skbc;
+	struct udpcphdr *uh;
+	int err = 0;
+
+	if (dest->acks >= usk->acks)
+		goto out;
+
+	if (!dest->xmit_last) {
+		/*
+		 * handle data message fragments without an ack
+		 */
+		while ((skb = skb_peek(&dest->xmit))) {
+			uh = udpcp_hdr(skb);
+
+			if (!(ntohs(uh->msginfo) & UDPCP_NO_ACK_FLAG))
+				break;
+			if (udpcp_is_last_frag(uh)) {
+				usk->stat.txMsgs++;
+				atomic_inc(&udpcp_txMsgs);
+			}
+			skb_unlink(skb, &dest->xmit);
+			udpcp_dec_pending(sk);
+			if (unlikely(debug))
+				dump_msg("send msg", skb, dest->fl.fl4_src,
+					 dest->fl.fl4_dst);
+			err = udpcp_send_skb(sk, skb, dest,
+						(struct ip_options *)skb->cb);
+			if (err) {
+				kfree_skb(skb);
+				skb = NULL;
+				break;
+			}
+		}
+		dest->xmit_wait = skb;
+	} else {
+		/*
+		 * handle next data message fragment waiting for an ack
+		 */
+		uh = udpcp_hdr(dest->xmit_last);
+
+		if (udpcp_is_last_frag(uh))
+			goto out;
+
+		/*
+		 * get next data message fragment
+		 */
+		skb = dest->xmit_last->next;
+	}
+
+	/*
+	 * send all data message fragment till the first which must be acked
+	 */
+	while (skb) {
+		skbc = skb_clone(skb, sk->sk_allocation);
+
+		if (!skbc)
+			break;
+
+		if (unlikely(debug))
+			dump_msg("send msg", skbc, dest->fl.fl4_src,
+				 dest->fl.fl4_dst);
+		err = udpcp_send_skb(sk, skbc, dest,
+					(struct ip_options *)skb->cb);
+		if (err) {
+			kfree_skb(skbc);
+			break;
+		}
+
+		uh = udpcp_hdr(skb);
+
+		if (!(ntohs(uh->msginfo) & UDPCP_SINGLE_ACK_FLAG)
+		    || udpcp_is_last_frag(uh)) {
+			dest->xmit_last = skb;
+
+			if (++dest->acks >= usk->acks || udpcp_is_last_frag(uh))
+				break;
+		}
+
+		skb = skb_is_eoq(&dest->xmit, skb) ? NULL : skb->next;
+	}
+
+out:
+	if (skb_queue_empty(&dest->xmit))
+		udpcp_dst_release(usk, dest);
+
+	return err;
+}
+
+/*
+ * Transmit data message fragments and rearm the timeout handler if necessary
+ */
+static int udpcp_xmit(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int ret;
+
+	ret = _udpcp_xmit(sk, dest);
+
+	if (dest->xmit_wait) {
+		dest->tx_time = jiffies;
+
+		if (!timer_pending(&usk->timer))
+			udpcp_timer(sk, dest->tx_time + usk->tx_timeout);
+	}
+	return ret;
+}
+
+/*
+ * Queue the assembled message fragment into the transmit queue
+ */
+static void udpcp_queue_xmit(struct sock *sk, struct udpcp_dest *dest,
+			     u8 ackmode, u8 chkmode)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct udpcphdr *uh;
+	struct sk_buff *skb;
+	u8 fragamount;
+	u8 fragnum;
+	unsigned short msginfo;
+	struct flowi *fl = &dest->fl;
+
+	msginfo = UDPCP_MSG_TYPE_DATA | UDPCP_PROTOCOL_VERSION_2;
+	switch (ackmode) {
+	case UDPCP_NOACK:
+		msginfo |= UDPCP_NO_ACK_FLAG;
+		break;
+	case UDPCP_SINGLE_ACK:
+		msginfo |= UDPCP_SINGLE_ACK_FLAG;
+		break;
+	case UDPCP_ACK:
+	default:
+		break;
+	}
+	switch (chkmode) {
+	case UDPCP_NOCHECKSUM:
+		break;
+	case UDPCP_CHECKSUM:
+	default:
+		msginfo |= UDPCP_CHECKSUM_FLAG;
+		break;
+	}
+
+	fragamount = skb_queue_len(&usk->assembly);
+
+	udpcp_sk(sk)->pending += fragamount;
+
+	for (fragnum = 0; fragnum != fragamount; fragnum++) {
+		unsigned char *data;
+		int data_len;
+
+		skb = skb_dequeue(&usk->assembly);
+		uh = udpcp_hdr(skb);
+
+		/*
+		 * setup a UDPCP header
+		 */
+		uh->chksum = 0;
+		uh->msginfo = htons(msginfo);
+		uh->fragnum = fragnum;
+		uh->fragamount = fragamount;
+		uh->msgid = htons(dest->msgid);
+		uh->length = htons(usk->assembly_len);
+
+		data = skb_transport_header(skb) + sizeof(struct udphdr);
+		data_len = skb_tail_pointer(skb) - data;
+
+		if (chkmode == UDPCP_CHECKSUM)
+			uh->chksum = htonl(zlib_adler32(1, data, data_len));
+		/*
+		 * create a UDP header
+		 */
+		uh->udphdr.source = fl->fl_ip_sport;
+		uh->udphdr.dest = fl->fl_ip_dport;
+		uh->udphdr.len = htons(sizeof(struct udphdr) + data_len);
+		uh->udphdr.check = 0;
+
+		/*
+		 * create UDP checksum
+		 */
+		udpcp_do_csum(sk, skb, dest);
+
+		/*
+		 * add to xmit queue
+		 */
+		skb_queue_tail(&dest->xmit, skb);
+	}
+
+	dest->msgid++;
+	usk->assembly_len = 0;
+	usk->assembly_dest = NULL;
+}
+
+/*
+ * Remove all data message fragments of the first message from the transmit
+ * queue all fragments will be merged together
+ */
+static struct sk_buff *udpcp_dequeue_msg(struct sock *sk,
+					 struct udpcp_dest *dest)
+{
+	struct sk_buff *msg;
+	struct sk_buff *skb;
+	struct sk_buff **next;
+	struct udpcphdr *uh;
+
+	msg = skb_dequeue(&dest->xmit);
+	if (!msg)
+		return NULL;
+	skb_orphan(msg);
+
+	uh = udpcp_hdr(msg);
+	if (!uh->msgid) {
+		/*
+		 * sync message
+		 */
+		kfree_skb(msg);
+		return NULL;
+	}
+
+	skb_pull(msg, sizeof(struct udpcphdr));
+	if (udpcp_is_last_frag(uh))
+		return msg;
+
+	next = &skb_shinfo(msg)->frag_list;
+	for (;;) {
+		skb = skb_dequeue(&dest->xmit);
+		if (!skb)
+			break;
+		skb_orphan(skb);
+		uh = udpcp_hdr(skb);
+		skb_pull(msg, sizeof(struct udpcphdr));
+		msg->len += skb->len;
+		msg->data_len += skb->len;
+		*next = skb;
+		if (udpcp_is_last_frag(uh))
+			break;
+		next = &skb->next;
+	}
+	return msg;
+}
+
+static void udpcp_flush_err(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct inet_sock *inet = inet_sk(sk);
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	if (!inet->recverr)
+		skb_queue_purge(&dest->xmit);
+	else {
+		struct sock_exterr_skb *serr;
+		struct iphdr *iph;
+		struct sk_buff *skb;
+
+		while (!skb_queue_empty(&dest->xmit)) {
+			skb = udpcp_dequeue_msg(sk, dest);
+			if (!skb)
+				continue;
+
+			if (unlikely(debug))
+				dump_msg("flush outgoing message", skb,
+					 dest->fl.fl4_src, dest->fl.fl4_dst);
+
+			skb_push(skb, sizeof(struct iphdr));
+			skb_reset_network_header(skb);
+			iph = ip_hdr(skb);
+			iph->daddr = dest->rt->rt_dst;
+
+			serr = SKB_EXT_ERR(skb);
+			serr->ee.ee_errno = EPROTO;
+			serr->ee.ee_origin = SO_EE_ORIGIN_LOCAL;
+			serr->ee.ee_type = 0;
+			serr->ee.ee_code = 0;
+			serr->ee.ee_pad = 0;
+			serr->ee.ee_info = 0;
+			serr->ee.ee_data = 0;
+			serr->addr_offset = (u8 *) &iph->daddr -
+						skb_network_header(skb);
+			serr->port = dest->fl.fl_ip_dport;
+
+			skb_reset_transport_header(skb);
+			skb_pull(skb, sizeof(struct iphdr));
+
+			/*
+			 * set a flag for UDPCP message
+			 */
+			UDP_SKB_CB(skb)->udpcp_flag = 1;
+
+			/*
+			 * pass the dequeued message to the error queue of the
+			 * socket
+			 */
+			skb_set_owner_r(skb, sk);
+			skb_queue_tail(&sk->sk_error_queue, skb);
+			if (!sock_flag(sk, SOCK_DEAD)) {
+				if (usk->udp_data_ready)
+					usk->udp_data_ready(sk, skb->len);
+			}
+		}
+	}
+
+	dest->xmit_wait = 0;
+	dest->xmit_last = 0;
+	dest->try = 0;
+	dest->acks = 0;
+
+	usk->pending = 0;
+	if (waitqueue_active(&usk->wq))
+		wake_up_interruptible(&usk->wq);
+}
+
+/*
+ * Purge the current incoming data message
+ */
+static void udpcp_purge_incoming(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	if (dest->recv_last) {
+		u32 fragnum = udpcp_hdr(dest->recv_last)->fragnum + 1;
+
+		dest->rxDiscardedFrags += fragnum;
+		usk->stat.rxDiscardedFrags += fragnum;
+		atomic_add(fragnum, &udpcp_rxDiscardedFrags);
+
+		dest->lastmsg.msgid = 0;
+
+		if (unlikely(debug))
+			dump_msg("purge incoming message", dest->recv_msg,
+				 dest->fl.fl4_src, dest->fl.fl4_dst);
+	}
+
+	kfree_skb(dest->recv_msg);
+	dest->recv_msg = 0;
+	dest->recv_last = 0;
+}
+
+/*
+ * Resend all data message fragments to the one which is currently waiting for
+ * an ack
+ */
+static int udpcp_resend(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct sk_buff *skb;
+	struct sk_buff *skbc;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int err;
+
+	if (++dest->try >= usk->maxtry) {
+		dest->insync = 0;
+		udpcp_flush_err(sk, dest);
+		udpcp_purge_incoming(sk, dest);
+		udpcp_dst_release(usk, dest);
+		return 0;
+	}
+
+	dest->txRetries++;
+	usk->stat.txRetries++;
+	atomic_inc(&udpcp_txRetries);
+
+	if (!dest->xmit_last)
+		_udpcp_xmit(sk, dest);
+	else {
+		skb = dest->xmit_wait;
+
+		for (;;) {
+			skbc = skb_clone(skb, sk->sk_allocation);
+
+			if (skbc == NULL)
+				break;
+
+			if (unlikely(debug))
+				dump_msg("resend msg", skbc, dest->fl.fl4_src,
+					 dest->fl.fl4_dst);
+			err = udpcp_send_skb(sk, skbc, dest,
+						(struct ip_options *)skb->cb);
+			if (err) {
+				kfree_skb(skbc);
+				break;
+			}
+
+			if (skb == dest->xmit_last) {
+				_udpcp_xmit(sk, dest);
+				break;
+			}
+
+			skb = skb->next;
+		}
+	}
+	dest->tx_time = jiffies;
+
+	return 1;
+}
+
+/*
+ * Handle udpcp timeout
+ */
+static void udpcp_handle_timeout(struct sock *sk)
+{
+	struct udpcp_dest *dest;
+	struct list_head *p;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int wflag = 0;
+	unsigned long t = jiffies + UDPCP_MAX_WAIT_SEC * HZ + 1;
+
+	usk->timeout = 0;
+
+	/*
+	 * walk through all destinations
+	 */
+	list_for_each(p, &usk->destlist) {
+		dest = list_to_udpcpdest(p);
+
+		if (dest->xmit_wait) {
+			if (time_is_before_eq_jiffies
+			    (dest->tx_time + usk->tx_timeout)) {
+				/*
+				 * transmit timeout expired
+				 */
+				if (unlikely(debug))
+					dump_msg("send timeout",
+						 dest->xmit_wait,
+						 dest->fl.fl4_src,
+						 dest->fl.fl4_dst);
+				if (udpcp_resend(sk, dest) == 0) {
+					dest->txTimeout++;
+					usk->stat.txTimeout++;
+					atomic_inc(&udpcp_txTimeout);
+					goto check_incoming;
+				}
+				wflag = 1;
+			}
+			if (time_before(dest->tx_time + usk->tx_timeout, t)) {
+				/*
+				 * calculate new timeout timer value
+				 */
+				t = dest->tx_time + usk->tx_timeout;
+				wflag = 1;
+			}
+		}
+check_incoming:
+		if (dest->recv_msg) {
+			if (time_is_before_eq_jiffies
+			    (dest->rx_time + usk->rx_timeout)) {
+				/*
+				 * receive timeout occurred
+				 */
+				if (unlikely(debug))
+					dump_msg("receive timeout",
+						 dest->recv_last,
+						 dest->fl.fl4_src,
+						 dest->fl.fl4_dst);
+				udpcp_purge_incoming(sk, dest);
+				dest->rxTimeout++;
+				usk->stat.rxTimeout++;
+				atomic_inc(&udpcp_rxTimeout);
+			} else
+			if (time_before(dest->rx_time + usk->rx_timeout, t)) {
+				/*
+				 * calculate new timeout timer value
+				 */
+				t = dest->rx_time + usk->rx_timeout;
+				wflag = 1;
+			}
+		}
+	}
+	/*
+	 * restart timer if necessary
+	 */
+	if (wflag)
+		udpcp_timer(sk, t);
+}
+
+/*
+ * Timeout function
+ */
+static void udpcp_timeout(unsigned long data)
+{
+	struct sock *sk = (struct sock *)data;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	bh_lock_sock(sk);
+	if (!sock_owned_by_user(sk))
+		udpcp_handle_timeout(sk);
+	else {
+		/*
+		 * bad, cannot handle the timeout because the socket is in use
+		 * set flag for unhandled timeout and rearm the timer
+		 */
+		usk->timeout = 1;
+		udpcp_timer(sk, jiffies + 1);
+	}
+	bh_unlock_sock(sk);
+}
+
+/*
+ * Handle timeout if an the unhandled timeout flag is set
+ */
+static inline void check_timeout(struct sock *sk)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	while (usk->timeout) {
+		lock_sock(sk);
+		while (usk->timeout)
+			udpcp_handle_timeout(sk);
+		release_sock(sk);
+	}
+}
+
+/*
+ * Release the socket lock and test for unhandled timeouts
+ */
+static inline void udpcp_release_sock(struct sock *sk)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	while (usk->timeout)
+		udpcp_handle_timeout(sk);
+	release_sock(sk);
+        check_timeout(sk);
+}
+
+/*
+ * Parse sendmsg() control message
+ */
+static int udpcp_cmsg_send(struct msghdr *msg, u8 * ackmode, u8 * chkmode)
+{
+	struct cmsghdr *cmsg;
+
+	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
+		if (!CMSG_OK(msg, cmsg))
+			return -EINVAL;
+		if (cmsg->cmsg_level != SOL_UDPCP)
+			continue;
+		switch (cmsg->cmsg_type) {
+		case UDPCP_NOACK:
+		case UDPCP_ACK:
+		case UDPCP_SINGLE_ACK:
+			*ackmode = cmsg->cmsg_type;
+			break;
+		case UDPCP_CHECKSUM:
+		case UDPCP_NOCHECKSUM:
+			*chkmode = cmsg->cmsg_type;
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+/*
+ * Validate a skb buffer
+ */
+static int udpcp_validate_skb(struct sk_buff *skb)
+{
+	if (skb->next) {
+		pr_err("udpcp: unexpected skb_buff->next != NULL\n");
+		BUG();
+		return 1;
+	}
+	if (skb_shinfo(skb)->frag_list) {
+		pr_err("udpcp: unexpected skb_shinfo(skb)->frag_list != NULL\n");
+		BUG();
+		return 1;
+	}
+	return 0;
+}
+
+/*
+ * Split a message into fragments and store it into the assemble queue
+ * mostly stolen from UDP stack
+ */
+static int udpcp_data(struct sock *sk, struct udpcp_dest *dest,
+		      struct iovec *from, int length, unsigned int flags)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct inet_sock *inet = inet_sk(sk);
+	struct sk_buff *skb;
+	struct ipcm_cookie *ipc = &dest->ipc;
+	struct ip_options *opt = ipc->opt;
+	int hh_len;
+	int exthdrlen;
+	int mtu;
+	int copy;
+	int err;
+	int offset = 0;
+	unsigned int maxfraglen, fragheaderlen;
+	int csummode = CHECKSUM_NONE;
+	int transhdrlen = sizeof(struct udpcphdr);
+	struct rtable *rt = dest->rt;
+
+	if (opt && sizeof(skb->cb) < optlength(opt)) {
+		err = -EFAULT;
+		goto error;
+	}
+
+	usk->assembly_len += length;
+	usk->assembly_dest = dest;
+
+	if (usk->assembly_len > UDPCP_MAX_MSGSIZE) {
+		ip_local_error(sk, EMSGSIZE, rt->rt_dst, dest->fl.fl_ip_dport,
+				usk->assembly_len);
+		err = -EMSGSIZE;
+		goto error;
+	}
+
+	mtu = (inet->pmtudisc == IP_PMTUDISC_PROBE) ?
+		rt->dst.dev->mtu : dst_mtu(rt->dst.path);
+	sk->sk_sndmsg_page = NULL;
+	sk->sk_sndmsg_off = 0;
+	exthdrlen = rt->dst.header_len;
+	length += exthdrlen;
+	transhdrlen += exthdrlen;
+
+	hh_len = LL_RESERVED_SPACE(rt->dst.dev);
+
+	fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0);
+	maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen;
+
+	if (rt->dst.dev->features & NETIF_F_V4_CSUM && !exthdrlen)
+		csummode = CHECKSUM_PARTIAL;
+
+	skb = skb_peek_tail(&usk->assembly);
+	if (skb) {
+		unsigned int off;
+
+		off = skb->len;
+
+		copy = mtu - skb->len;
+		if (copy > length)
+			copy = length;
+
+		if (copy > 0 &&
+		    ip_generic_getfrag(
+		     from, skb_put(skb, copy), 0, copy, off, skb) < 0) {
+			__skb_trim(skb, off);
+			err = -EFAULT;
+			goto error;
+		}
+		length -= copy;
+		offset += copy;
+
+		if (!length)
+			return 0;
+	}
+
+	do {
+		char *data;
+		unsigned int datalen;
+		unsigned int fraglen;
+		unsigned int alloclen;
+
+		length += transhdrlen;
+		/*
+		 * If remaining data exceeds the mtu,
+		 * we know we need more fragment(s).
+		 */
+		datalen = length;
+		if (datalen > mtu - fragheaderlen)
+			datalen = maxfraglen - fragheaderlen;
+		fraglen = datalen + fragheaderlen;
+
+		if ((flags & MSG_MORE)
+		    && !(rt->dst.dev->features & NETIF_F_SG))
+			alloclen = mtu;
+		else
+			alloclen = fraglen;
+
+		alloclen += rt->dst.trailer_len + hh_len + 15;
+
+		udpcp_release_sock(sk);
+		skb = sock_alloc_send_skb(sk, alloclen,
+					(flags & MSG_DONTWAIT), &err);
+		lock_sock(sk);
+		if (skb == NULL)
+			goto error;
+
+		if (udpcp_validate_skb(skb)) {
+			kfree_skb(skb);
+
+			goto error;
+		}
+
+		/*
+		 * Fill in the control structures
+		 */
+		skb->ip_summed = csummode;
+		skb->csum = 0;
+		skb_reserve(skb, hh_len);
+
+		/*
+		 * Find where to start putting bytes.
+		 */
+		data = skb_put(skb, fraglen);
+		skb_set_network_header(skb, exthdrlen);
+		skb->transport_header = (skb->network_header + fragheaderlen);
+		data += fragheaderlen;
+
+		copy = datalen - transhdrlen;
+
+		if (copy > 0 &&
+		  ip_generic_getfrag(
+		   from, data + transhdrlen, offset, copy, 0, skb) < 0) {
+			err = -EFAULT;
+			kfree_skb(skb);
+			goto error;
+		}
+
+		offset += copy;
+		length -= datalen;
+
+		if (ipc->opt)
+			memcpy(skb->cb, &ipc->opt, optlength(opt));
+
+		skb_pull(skb, fragheaderlen);
+		skb_queue_tail(&usk->assembly, skb);
+	} while (length > 0);
+
+	return 0;
+error:
+	skb_queue_purge(&usk->assembly);
+	usk->assembly_len = 0;
+
+	IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTDISCARDS);
+	return err;
+}
+
+/*
+ * This function will be called by send(), sento() and sendmsg()
+ */
+static int udpcp_sendmsg(struct kiocb *iocb, struct sock *sk,
+			 struct msghdr *msg, size_t len)
+{
+	struct inet_sock *inet = inet_sk(sk);
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct ipcm_cookie *ipc;
+	struct rtable *rt = NULL;
+	int free = 0;
+	int connected = 0;
+	__be32 daddr, faddr, saddr;
+	__be16 dport;
+	u8 tos;
+	int err = 0;
+	int corkreq = usk->udpsock.corkflag || msg->msg_flags & MSG_MORE;
+	struct udpcp_dest *dest;
+
+	if (len > UDPCP_MAX_MSGSIZE)
+		return -EMSGSIZE;
+
+	/*
+	 * Check the flags.
+	 */
+	if (msg->msg_flags & MSG_OOB)
+		return -EOPNOTSUPP;
+
+	/*
+	 * check if socket is binded to a port
+	 */
+	if (!(sk->sk_userlocks & SOCK_BINDPORT_LOCK) || !inet->inet_num)
+		return -ENOTCONN;
+
+	/*
+	 * Get and verify the address.
+	 */
+	if (msg->msg_name) {
+		struct sockaddr_in *usin = (struct sockaddr_in *)msg->msg_name;
+		if (msg->msg_namelen < sizeof(*usin))
+			return -EINVAL;
+		if (usin->sin_family != AF_INET) {
+			if (usin->sin_family != AF_UNSPEC)
+				return -EAFNOSUPPORT;
+		}
+
+		daddr = usin->sin_addr.s_addr;
+		dport = usin->sin_port;
+	} else {
+		if (sk->sk_state != TCP_ESTABLISHED)
+			return -EDESTADDRREQ;
+		daddr = inet->inet_daddr;
+		dport = inet->inet_dport;
+		/* Open fast path for connected socket.
+		   Route will not be used, if at least one option is set.
+		 */
+		connected = 1;
+	}
+
+	if (dport == 0)
+		return -EINVAL;
+
+	dest = find_dest(sk, daddr, dport);
+
+	if (!(dest->use_flag & TX_NODE)) {
+		dest->use_flag |= TX_NODE;
+		usk->stat.txNodes++;
+		atomic_inc(&udpcp_txNodes);
+	}
+
+	ipc = &dest->ipc;
+
+	if (!skb_queue_empty(&usk->assembly)) {
+		/*
+		 * assembly is ongoing
+		 */
+		lock_sock(sk);
+		if (likely(!skb_queue_empty(&usk->assembly))) {
+			if (usk->assembly_dest != dest) {
+				udpcp_release_sock(sk);
+				return -EUSERS;
+			}
+			ipc->opt =
+			    (struct ip_options *)skb_peek(&usk->assembly)->cb;
+			goto queue_data;
+		}
+		udpcp_release_sock(sk);
+	}
+
+	ipc->addr = inet->inet_saddr;
+	ipc->oif = sk->sk_bound_dev_if;
+
+	dest->ackmode = usk->ackmode;
+	dest->chkmode = usk->chkmode;
+
+	if (msg->msg_controllen) {
+		/*
+		 * handle control message
+		 */
+		err = udpcp_cmsg_send(msg, &dest->ackmode, &dest->chkmode);
+		if (err)
+			return err;
+		err = ip_cmsg_send(sock_net(sk), msg, ipc);
+		if (err)
+			return err;
+		if (ipc->opt)
+			free = 1;
+		connected = 0;
+	}
+
+	if (!ipc->opt)
+		ipc->opt = inet->opt;
+
+	saddr = ipc->addr;
+	ipc->addr = faddr = daddr;
+
+	if (ipc->opt && ipc->opt->srr) {
+		if (!daddr)
+			return -EINVAL;
+		faddr = ipc->opt->faddr;
+		connected = 0;
+	}
+	tos = RT_TOS(inet->tos);
+	if (sock_flag(sk, SOCK_LOCALROUTE) ||
+	    (msg->msg_flags & MSG_DONTROUTE) ||
+	    (ipc->opt && ipc->opt->is_strictroute)) {
+		tos |= RTO_ONLINK;
+		connected = 0;
+	}
+
+	if (ipv4_is_multicast(daddr)) {
+		if (dest->ackmode != UDPCP_NOACK) {
+			err = EOPNOTSUPP;
+			goto out;
+		}
+		if (!ipc->oif)
+			ipc->oif = inet->mc_index;
+		if (!saddr)
+			saddr = inet->mc_addr;
+		connected = 0;
+	}
+
+	lock_sock(sk);
+	rt = dest->rt;
+	if (rt)
+		goto queue_data;
+	udpcp_release_sock(sk);
+
+	/*
+	 * calculate routing
+	 */
+	if (connected)
+		rt = (struct rtable *)sk_dst_check(sk, 0);
+
+	if (rt == NULL) {
+		struct flowi fl = {.oif = ipc->oif,
+			.nl_u = {.ip4_u = {.daddr = faddr,
+					   .saddr = saddr,
+					   .tos = tos} },
+			.proto = sk->sk_protocol,
+			.uli_u = {.ports = {.sport = inet->inet_sport,
+					    .dport = dport} }
+		};
+		struct net *net = sock_net(sk);
+
+		security_sk_classify_flow(sk, &fl);
+		err = ip_route_output_flow(net, &rt, &fl, sk, 1);
+		if (err) {
+			if (err == -ENETUNREACH)
+				IP_INC_STATS_BH(net, IPSTATS_MIB_OUTNOROUTES);
+			goto out;
+		}
+
+		err = -EACCES;
+		if ((rt->rt_flags & RTCF_BROADCAST) &&
+		    !sock_flag(sk, SOCK_BROADCAST))
+			goto out;
+		if (connected)
+			sk_dst_set(sk, dst_clone(&rt->dst));
+	}
+
+	if (msg->msg_flags & MSG_CONFIRM)
+		goto do_confirm;
+back_from_confirm:
+
+	saddr = rt->rt_src;
+	if (!ipc->addr)
+		daddr = ipc->addr = rt->rt_dst;
+
+	lock_sock(sk);
+
+	dest->fl.fl4_dst = daddr;
+	dest->fl.fl_ip_dport = dport;
+	dest->fl.fl4_src = saddr;
+	dest->fl.fl_ip_sport = inet->inet_sport;
+	dest->rt = rt;
+
+queue_data:
+	if (msg->msg_flags & MSG_PROBE)
+		goto release;
+
+	if (!dest->insync && skb_queue_empty(&dest->xmit)) {
+		/*
+		 * if not synced, queue a SYNC message
+		 */
+		err = udpcp_data(sk, dest, NULL, 0, 0);
+		if (err)
+			goto release;
+		dest->msgid = 0;
+		udpcp_queue_xmit(sk, dest, UDPCP_ACK, UDPCP_CHECKSUM);
+	}
+
+	/*
+	 * split message and store it to the assembly queue
+	 */
+	err = udpcp_data(sk, dest, msg->msg_iov, len,
+		       corkreq ? msg->msg_flags | MSG_MORE : msg->msg_flags);
+	if (err)
+		goto release;
+
+	if (!dest->msgid)
+		dest->msgid = 1;
+
+	if (!corkreq) {
+		/*
+		 * message is complete, transfer it from the assembly queue
+		 * into the transmit queue
+		 */
+		udpcp_queue_xmit(sk, dest, dest->ackmode, dest->chkmode);
+		/*
+		 * start transmit if possible
+		 */
+		err = udpcp_xmit(sk, dest);
+	}
+release:
+	udpcp_release_sock(sk);
+out:
+	if (free)
+		kfree(ipc->opt);
+
+	if (!err)
+		return len;
+	/*
+	 * ENOBUFS = no kernel mem, SOCK_NOSPACE = no sndbuf space.  Reporting
+	 * ENOBUFS might not be good (it's not tunable per se), but otherwise
+	 * we don't have a good statistic (IpOutDiscards but it can be too many
+	 * things).  We could add another new stat but at least for now that
+	 * seems like overkill.
+	 */
+	if (err == -ENOBUFS || test_bit(SOCK_NOSPACE, &sk->sk_socket->flags))
+		UDP_INC_STATS_USER(sock_net(sk), UDP_MIB_SNDBUFERRORS, 0);
+	return err;
+
+do_confirm:
+	dst_confirm(&rt->dst);
+	if (!(msg->msg_flags & MSG_PROBE) || len)
+		goto back_from_confirm;
+
+	err = 0;
+	goto out;
+}
+
+/*
+ * Sendpage() is not really implemented
+ */
+static int udpcp_sendpage(struct sock *sk, struct page *page, int offset,
+			  size_t size, int flags)
+{
+	return sock_no_sendpage(sk->sk_socket, page, offset, size, flags);
+}
+
+/*
+ * Release all message fragments of the first in the transmit queue
+ */
+static void udpcp_release_xmit(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct sk_buff *skb;
+	struct udpcphdr *uh;
+
+	for (;;) {
+		skb = skb_dequeue(&dest->xmit);
+
+		uh = udpcp_hdr(skb);
+
+		if (udpcp_is_last_frag(uh) && uh->msgid) {
+			usk->stat.txMsgs++;
+			atomic_inc(&udpcp_txMsgs);
+		}
+
+		udpcp_dec_pending(sk);
+
+		kfree_skb(skb);
+		if (skb == dest->xmit_last)
+			break;
+	}
+
+	dest->xmit_wait = 0;
+	dest->xmit_last = 0;
+	dest->try = 0;
+}
+
+/*
+ * Set the sync state
+ */
+static void udpcp_sync(struct sock *sk, struct udpcp_dest *dest)
+{
+	dest->xmit_wait = 0;
+	dest->xmit_last = 0;
+	dest->try = 0;
+	dest->acks = 0;
+	dest->insync = 1;
+}
+
+/*
+ * Returns true if the first message in the transmit queue is a sync message
+ */
+static inline int udpcp_xmit_is_sync(struct udpcp_dest *dest)
+{
+	struct sk_buff *skb = skb_peek(&dest->xmit);
+
+	return skb && !udpcp_hdr(skb)->msgid;
+}
+
+static inline struct udpcphdr *udpcp_ack_scan(struct sk_buff *skb)
+{
+	struct udpcphdr *uh;
+
+	for (;;) {
+		uh = udpcp_hdr(skb);
+
+		if (!(ntohs(uh->msginfo) & UDPCP_SINGLE_ACK_FLAG)
+		    || udpcp_is_last_frag(uh))
+			return uh;
+
+		skb = skb->next;
+	}
+}
+
+/*
+ * Handle an incoming ack
+ */
+static void udpcp_handle_ack(struct sock *sk, struct sk_buff *skb,
+			     struct udpcp_dest *dest)
+{
+	struct udpcphdr *r_uh;
+	struct udpcphdr *q_uh;
+
+	if (!dest->acks)
+		return;
+
+	r_uh = udpcp_hdr(skb);
+
+	/*
+	 * acks doesn't have a payload
+	 */
+	if (r_uh->length)
+		return;
+
+	q_uh = udpcp_ack_scan(dest->xmit_wait);
+
+	/*
+	 * message id, fragnum and fragamount must match the awaited message
+	 * fragment
+	 */
+	if (r_uh->msgid != q_uh->msgid)
+		return;
+
+	if (r_uh->fragnum != q_uh->fragnum)
+		return;
+
+	if (r_uh->fragamount != q_uh->fragamount)
+		return;
+
+	dest->acks--;
+
+	/*
+	 * if last fragment release message
+	 */
+	if (udpcp_is_last_frag(q_uh)) {
+		udpcp_release_xmit(sk, dest);
+
+		/*
+		 * special handling for sync messages
+		 */
+		if (r_uh->msgid == 0)
+			udpcp_sync(sk, dest);
+	} else
+		dest->xmit_wait = dest->xmit_wait->next;
+
+	/*
+	 * try to transmit next message/fragment
+	 */
+	udpcp_xmit(sk, dest);
+}
+
+/*
+ * Queue incoming message as owned by udpcp socket
+ */
+static void udpcp_set_owner_r(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct sk_buff *skb;
+
+	skb = dest->recv_msg;
+	skb_set_owner_r(skb, sk);
+
+	skb = skb_shinfo(skb)->frag_list;
+	if (!skb)
+		return;
+
+	for (;;) {
+		skb_set_owner_r(skb, sk);
+		if (udpcp_is_last_frag(udpcp_hdr(skb)))
+			break;
+		skb = skb->next;
+	}
+}
+
+/*
+ * Handle an incoming data message fragment
+ */
+static int udpcp_handle_data(struct sock *sk, struct sk_buff *skb,
+			     struct udpcp_dest *dest)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct udpcphdr *uh = udpcp_hdr(skb);
+	unsigned short msginfo = ntohs(uh->msginfo);
+	unsigned short length = ntohs(uh->length);
+
+	/*
+	 * special handling for sync messages
+	 */
+	if (uh->msgid == 0) {
+		/*
+		 * sync messages doesn't have a payload
+		 */
+		if (length)
+			return 1;
+
+		/*
+		 * sync messages doesn't have a ack rules
+		 */
+		if (msginfo & (UDPCP_NO_ACK_FLAG | UDPCP_SINGLE_ACK_FLAG))
+			return 1;
+
+		udpcp_send_ack(sk, skb, dest,
+			       memcmp(uh, &dest->lastmsg,
+				      sizeof(dest->lastmsg)) ? 0 : 1);
+
+		udpcp_purge_incoming(sk, dest);
+
+		/*
+		 * skip the first message in the queue if it is a sync messages
+		 */
+		if (udpcp_xmit_is_sync(dest)) {
+			dest->acks--;
+			udpcp_dec_pending(sk);
+			kfree_skb(skb_dequeue(&dest->xmit));
+		}
+
+		if (!dest->insync)
+			udpcp_sync(sk, dest);
+
+		udpcp_xmit(sk, dest);
+
+		return -1;
+	}
+
+	if (!dest->insync)
+		return 1;
+
+	if (length > UDPCP_MAX_MSGSIZE)
+		return 1;
+
+	length += sizeof(struct udpcphdr);
+
+	/*
+	 * if the message was still handled, send a duplicate ack
+	 */
+	if (!memcmp(uh, &dest->lastmsg, sizeof(dest->lastmsg))) {
+		udpcp_send_ack(sk, skb, dest, 1);
+		return 1;
+	}
+
+	if (dest->recv_msg) {
+		/*
+		 * if a fragment is already received validate the fragment
+		 */
+		if ((uh->msgid != udpcp_hdr(dest->recv_msg)->msgid) ||
+		    (uh->msginfo != udpcp_hdr(dest->recv_msg)->msginfo) ||
+		    (uh->length != udpcp_hdr(dest->recv_msg)->length) ||
+		    (uh->fragamount != udpcp_hdr(dest->recv_msg)->fragamount)
+		    ) {
+			udpcp_purge_incoming(sk, dest);
+			goto newmsg;
+		}
+
+		if (uh->fragnum != udpcp_hdr(dest->recv_last)->fragnum + 1)
+			return 1;
+
+		if (dest->recv_msg->len + skb->len - sizeof(struct udpcphdr) >
+		    length)
+			return 1;
+	} else {
+newmsg:
+		/*
+		 * first fragment must have the number 0
+		 */
+		if (uh->fragnum != 0)
+			return 1;
+
+		/*
+		 * UDPCP data length cannot be smaller then the UDP data length
+		 */
+		if (skb->len > length)
+			return 1;
+
+		/*
+		 * id of the last received is not valid
+		 */
+		if (dest->lastmsg.msgid == uh->msgid)
+			return 1;
+
+		/*
+		 * check against receive buffer limit
+		 */
+		if (atomic_read(&sk->sk_rmem_alloc) + length > sk->sk_rcvbuf)
+			return 1;
+	}
+
+	memset(&dest->lastmsg, 0, sizeof(dest->lastmsg));
+
+	if (!dest->recv_msg) {
+		/*
+		 * store the first message fragment
+		 */
+		if (skb->cloned) {
+			struct sk_buff *skbc;
+
+			skbc = skb_copy(skb, sk->sk_allocation);
+			if (skbc == NULL)
+				return 1;
+			kfree_skb(skb);
+			skb = skbc;
+		}
+		dest->recv_msg = skb;
+	} else {
+		/*
+		 * store the consecutively message fragment
+		 */
+		struct skb_shared_info *shinfo;
+
+		shinfo = skb_shinfo(dest->recv_msg);
+
+		if (!shinfo->frag_list)
+			shinfo->frag_list = skb;
+		else
+			dest->recv_last->next = skb;
+
+		skb_pull(skb, sizeof(struct udpcphdr));
+		dest->recv_msg->len += skb->len;
+		dest->recv_msg->data_len += skb->len;
+	}
+	dest->recv_last = skb;
+
+	msginfo = ntohs(uh->msginfo);
+
+	if (udpcp_is_last_frag(uh) || uh->fragamount == 0) {
+		/*
+		 * last fragment: queue it to the socket sk_receive_queue
+		 * and ack it
+		 */
+
+		if (dest->recv_msg->len != length) {
+			udpcp_purge_incoming(sk, dest);
+			return 0;
+		}
+
+		if (!(msginfo & UDPCP_NO_ACK_FLAG))
+			udpcp_send_ack(sk, skb, dest, 0);
+
+		memcpy(dest->recv_msg->data + UDPCP_HDRSIZE,
+		       dest->recv_msg->data, sizeof(struct udphdr));
+		skb_pull(dest->recv_msg, UDPCP_HDRSIZE);
+
+		usk->stat.rxMsgs++;
+		atomic_inc(&udpcp_rxMsgs);
+
+		/*
+		 * set a flag for UDPCP message
+		 */
+		UDP_SKB_CB(skb)->udpcp_flag = 1;
+
+		udpcp_set_owner_r(sk, dest);
+		skb_queue_tail(&sk->sk_receive_queue, dest->recv_msg);
+
+		/*
+		 * call the original data available handler
+		 */
+		if (usk->udp_data_ready)
+			usk->udp_data_ready(sk, dest->recv_msg->len);
+
+		dest->recv_msg = 0;
+		dest->recv_last = 0;
+	} else {
+		/*
+		 * ack fragment if requiered
+		 */
+		if (!(msginfo & UDPCP_NO_ACK_FLAG)
+		    && !(msginfo & UDPCP_SINGLE_ACK_FLAG))
+			udpcp_send_ack(sk, skb, dest, 0);
+
+		/*
+		 * setup timeout handler
+		 */
+		dest->rx_time = jiffies;
+
+		if (!timer_pending(&usk->timer))
+			udpcp_timer(sk, dest->rx_time + usk->rx_timeout);
+	}
+
+	return 0;
+}
+
+/*
+ * Deal with received UDPCP frames - sort out what type source it is
+ * and hand of it to the udpcp_handle_packet function.
+ */
+static void udpcp_data_ready(struct sock *sk, int slen)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	struct sk_buff *skb;
+	struct udpcp_dest *dest;
+	struct udpcphdr *uh;
+	unsigned short msginfo;
+	int ret;
+
+	skb = skb_peek_tail(&sk->sk_receive_queue);
+
+	/*
+	 * don't handle NULL pointer buffer and UDPCP messages
+	 */
+	if (skb == NULL || UDP_SKB_CB(skb)->udpcp_flag) {
+		if (usk->udp_data_ready)
+			usk->udp_data_ready(sk, slen);
+		return;
+	}
+
+	__skb_unlink(skb, &sk->sk_receive_queue);
+	if (udpcp_validate_skb(skb)) {
+		kfree_skb(skb);
+
+		return;
+	}
+
+	skb_orphan(skb);
+
+	/*
+	 * do UDP checksum
+	 */
+	if (udp_lib_checksum_complete(skb)) {
+		UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS, 0);
+		return;
+	}
+
+	if (unlikely(debug))
+		dump_msg("receive", skb, ip_hdr(skb)->saddr,
+			 ip_hdr(skb)->daddr);
+
+	uh = udpcp_hdr(skb);
+	msginfo = ntohs(uh->msginfo);
+
+	/*
+	 * handle only UDPCP protocol version 2
+	 */
+	if ((msginfo & UDPCP_PROTOCOL_MASK) != UDPCP_PROTOCOL_VERSION_2) {
+		kfree_skb(skb);
+		return;
+	}
+
+	/*
+	 * handle UDPCP checksum
+	 */
+	if (msginfo & UDPCP_CHECKSUM_FLAG) {
+		u8 *data;
+		u32 data_len;
+		u32 chksum;
+
+		chksum = ntohl(uh->chksum);
+		data = (u8 *) skb->data + sizeof(struct udphdr);
+		data_len = skb->len - sizeof(struct udphdr);
+
+		uh->chksum = 0;
+
+		if (chksum != zlib_adler32(1, data, data_len)) {
+			kfree_skb(skb);
+			usk->stat.crcErrors++;
+			atomic_inc(&udpcp_crcErrors);
+			return;
+		}
+	}
+
+	dest = __find_dest(sk, ip_hdr(skb)->saddr, udp_hdr(skb)->source);
+
+	if (!dest) {
+		/*
+		 * new communication destination must start with an sync message
+		 */
+		if (((msginfo & UDPCP_MSG_TYPE_MASK) != UDPCP_MSG_TYPE_DATA) ||
+		    (uh->msgid != 0)) {
+			kfree_skb(skb);
+			return;
+		}
+
+		dest = new_dest(sk, ip_hdr(skb)->saddr, udp_hdr(skb)->source);
+
+		if (!dest) {
+			kfree_skb(skb);
+			return;
+		}
+	}
+
+	/*
+	 * handle message type
+	 */
+	switch (msginfo & UDPCP_MSG_TYPE_MASK) {
+	case UDPCP_MSG_TYPE_DATA:
+		if (!(dest->use_flag & RX_NODE)) {
+			dest->use_flag |= RX_NODE;
+			usk->stat.rxNodes++;
+			atomic_inc(&udpcp_rxNodes);
+		}
+
+		ret = udpcp_handle_data(sk, skb, dest);
+
+		if (ret > 0) {
+			dest->rxDiscardedFrags++;
+			usk->stat.rxDiscardedFrags++;
+			atomic_inc(&udpcp_rxDiscardedFrags);
+		}
+		break;
+	case UDPCP_MSG_TYPE_ACK:
+		udpcp_handle_ack(sk, skb, dest);
+	default:
+		ret = 1;
+		break;
+	}
+	if (ret)
+		kfree_skb(skb);
+}
+
+/*
+ * Set socket options
+ */
+static int udpcp_setsockopt(struct sock *sk, int level, int optname,
+			    char __user *optval, unsigned int optlen)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int val, ret;
+
+	if (level != SOL_UDPCP) {
+		if (udp_prot.setsockopt) {
+			ret = udp_prot.setsockopt(sk, level, optname, optval,
+						optlen);
+			check_timeout(sk);
+			return ret;
+		}
+		return -ENOPROTOOPT;
+	}
+
+	if (optlen < sizeof(int))
+		return -EINVAL;
+
+	if (get_user(val, (int __user *)optval))
+		return -EFAULT;
+
+	switch (optname) {
+	case UDPCP_OPT_TRANSFER_MODE:
+		switch (val) {
+		case UDPCP_NOACK:
+		case UDPCP_ACK:
+		case UDPCP_SINGLE_ACK:
+			usk->ackmode = val;
+			break;
+		default:
+			return -EINVAL;
+		}
+		break;
+	case UDPCP_OPT_CHECKSUM_MODE:
+		switch (val) {
+		case UDPCP_NOCHECKSUM:
+		case UDPCP_CHECKSUM:
+			usk->chkmode = val;
+			break;
+		default:
+			return -EINVAL;
+		}
+		break;
+
+	case UDPCP_OPT_TX_TIMEOUT:
+		if ((val < 1) || (val > UDPCP_MAX_WAIT_SEC * 1000))
+			return -EINVAL;
+		usk->tx_timeout = msecs_to_jiffies(val);
+		break;
+
+	case UDPCP_OPT_RX_TIMEOUT:
+		if ((val < 1) || (val > UDPCP_MAX_WAIT_SEC * 1000))
+			return -EINVAL;
+		usk->rx_timeout = msecs_to_jiffies(val);
+		break;
+
+	case UDPCP_OPT_MAXTRY:
+		if ((val < 1) || (val > 10))
+			return -EINVAL;
+		usk->maxtry = val;
+		break;
+
+	case UDPCP_OPT_OUTSTANDING_ACKS:
+		if ((val < 1) || (val > 255))
+			return -EINVAL;
+		usk->acks = val;
+		break;
+
+	default:
+		return -ENOPROTOOPT;
+	}
+	return 0;
+}
+
+/*
+ * Get socket options
+ */
+static int udpcp_getsockopt(struct sock *sk, int level, int optname,
+			    char __user *optval, int __user *optlen)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int val, len, ret;
+
+	if (level != SOL_UDPCP) {
+		if (udp_prot.getsockopt) {
+			ret = udp_prot.getsockopt(sk, level, optname, optval,
+						optlen);
+			check_timeout(sk);
+			return ret;
+		}
+		return -ENOPROTOOPT;
+	}
+
+	if (get_user(len, optlen))
+		return -EFAULT;
+
+	len = min_t(unsigned int, len, sizeof(int));
+
+	if (len < 0)
+		return -EINVAL;
+
+	switch (optname) {
+	case UDPCP_OPT_TRANSFER_MODE:
+		val = usk->ackmode;
+		break;
+
+	case UDPCP_OPT_CHECKSUM_MODE:
+		val = usk->chkmode;
+		break;
+
+	case UDPCP_OPT_TX_TIMEOUT:
+		val = jiffies_to_msecs(usk->tx_timeout);
+		break;
+
+	case UDPCP_OPT_MAXTRY:
+		val = usk->maxtry;
+		break;
+
+	case UDPCP_OPT_OUTSTANDING_ACKS:
+		val = usk->acks;
+		break;
+
+	default:
+		return -ENOPROTOOPT;
+	}
+
+	if (put_user(len, optlen))
+		return -EFAULT;
+	if (copy_to_user(optval, &val, len))
+		return -EFAULT;
+	return 0;
+}
+
+/*
+ * ioctl() requests applicable to the UDPCP protocol
+ */
+int udpcp_ioctl(struct sock *sk, int cmd, unsigned long arg)
+{
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int ret = 0;
+
+	switch (cmd) {
+	case UDPCP_IOCTL_GET_STATISTICS:
+		lock_sock(sk);
+		if (copy_to_user((void *)arg, &usk->stat, sizeof(usk->stat)))
+			ret = -EFAULT;
+		udpcp_release_sock(sk);
+		break;
+
+	case UDPCP_IOCTL_RESET_STATISTICS:
+		lock_sock(sk);
+		usk->stat.txMsgs = 0;
+		usk->stat.rxMsgs = 0;
+		usk->stat.txTimeout = 0;
+		usk->stat.rxTimeout = 0;
+		usk->stat.txRetries = 0;
+		usk->stat.rxDiscardedFrags = 0;
+		usk->stat.crcErrors = 0;
+		udpcp_release_sock(sk);
+		break;
+
+	case UDPCP_IOCTL_SYNC:
+		if (arg)
+			ret = wait_event_interruptible_timeout(usk->wq,
+				!usk->pending, msecs_to_jiffies(arg));
+		else
+			ret = wait_event_interruptible(usk->wq, !usk->pending);
+
+		break;
+
+	default:
+		if (udp_prot.ioctl) {
+			ret = udp_prot.ioctl(sk, cmd, arg);
+			check_timeout(sk);
+		} else
+			ret = -ENOIOCTLCMD;
+		break;
+	}
+	return ret;
+}
+
+/*
+ * This function will be called by recv(), recvfrom() and revmsg()
+ */
+int udpcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
+		  size_t len, int noblock, int flags, int *addr_len)
+{
+	int ret;
+
+	ret = udp_prot.recvmsg(iocb, sk, msg, len, noblock, flags, addr_len);
+	check_timeout(sk);
+	return ret;
+}
+
+/*
+ * This function will be called by socket() and initialized the socket
+ */
+static int udpcp_sockinit(struct sock *sk)
+{
+	int ret;
+	struct udpcp_sock *usk;
+
+	sk->sk_protocol = SOL_UDP;
+	sk->sk_allocation = GFP_ATOMIC;
+	if (udp_prot.init) {
+		ret = udp_prot.init(sk);
+
+		if (ret)
+			return ret;
+	}
+
+	usk = udpcp_sk(sk);
+	usk->timer.expires = 0;
+	usk->timer.function = udpcp_timeout;
+	usk->timer.data = (long)sk;
+	init_timer(&usk->timer);
+	INIT_LIST_HEAD(&usk->destlist);
+	init_waitqueue_head(&usk->wq);
+	usk->pending = 0;
+	usk->ackmode = UDPCP_ACK;
+	usk->chkmode = UDPCP_CHECKSUM;
+	usk->maxtry = UDPCP_TX_MAXTRY;
+	usk->acks = UDPCP_OUTSTANDING_ACKS;
+	usk->tx_timeout = msecs_to_jiffies(UDPCP_TX_TIMEOUT);
+	usk->rx_timeout = msecs_to_jiffies(UDPCP_RX_TIMEOUT);
+	usk->udp_data_ready = sk->sk_data_ready;
+	sk->sk_data_ready = udpcp_data_ready;
+	usk->udpsock.pending = 0;
+	skb_queue_head_init(&usk->assembly);
+	usk->assembly_len = 0;
+	usk->assembly_dest = NULL;
+
+	spin_lock_bh(&udpcp_lock);
+	list_add_tail(&usk->udpcplist, &udpcp_list);
+	spin_unlock_bh(&udpcp_lock);
+
+#ifdef MODULE
+	try_module_get(THIS_MODULE);
+#endif
+	return 0;
+}
+
+/*
+ * This function will be called by close()
+ */
+static void udpcp_destroy(struct sock *sk)
+{
+	struct list_head *p;
+	struct list_head *n;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+
+	spin_lock_bh(&udpcp_lock);
+	list_del(&usk->udpcplist);
+	spin_unlock_bh(&udpcp_lock);
+
+	if (udp_prot.destroy)
+		udp_prot.destroy(sk);
+
+	lock_sock(sk);
+
+	del_timer_sync(&usk->timer);
+	sk->sk_data_ready = usk->udp_data_ready;
+
+	skb_queue_purge(&usk->assembly);
+
+	list_for_each_safe(p, n, &usk->destlist) {
+		struct udpcp_dest *dest;
+
+		dest = list_to_udpcpdest(p);
+
+		skb_queue_purge(&dest->xmit);
+
+		kfree_skb(dest->recv_msg);
+
+		if (dest->rt)
+			dst_release(&dest->rt->dst);
+
+		kfree(dest);
+	}
+
+	atomic_sub(usk->stat.txNodes, &udpcp_txNodes);
+	atomic_sub(usk->stat.rxNodes, &udpcp_rxNodes);
+
+	usk->pending = 0;
+
+	if (waitqueue_active(&usk->wq))
+		wake_up_interruptible(&usk->wq);
+
+	release_sock(sk);
+
+#ifdef MODULE
+	module_put(THIS_MODULE);
+#endif
+}
+
+static struct proto udpcp_prot;
+
+/*
+ * inet protocol stack descriptor
+ */
+static struct inet_protosw udpcp_protosw = {
+	.type = SOCK_DGRAM,
+	.protocol = PF_UDPCP,
+	.prot = &udpcp_prot,
+	.ops = &inet_dgram_ops,
+	.no_check = UDP_CSUM_DEFAULT,
+	.flags = 0,
+};
+
+#ifdef CONFIG_PROC_FS
+/*
+ * The following functions handles the /proc/net/udpcp entry
+ */
+struct udpcp_seq_afinfo {
+	char *name;
+	const struct file_operations seq_fops;
+	const struct seq_operations seq_ops;
+};
+
+struct udpcp_iter_state {
+	struct seq_net_private p;
+	struct sock *sk;
+	struct list_head *list;
+	int bucket;
+};
+
+static int udpcp_get_destlist(struct udpcp_sock *usk,
+			struct udpcp_iter_state *state)
+{
+	struct sock *sk = (struct sock *)usk;
+
+	if (sock_flag(sk, SOCK_DEAD))
+		return 0;
+
+	sock_hold(sk);
+	if (!list_empty(&usk->destlist)) {
+		state->sk = sk;
+		state->list = &usk->destlist;
+		return 1;
+	}
+	sock_put(sk);
+
+	return 0;
+}
+
+static inline int udpcp_next_dest(struct udpcp_iter_state *state)
+{
+	struct sock *sk = state->sk;
+	struct udpcp_sock *usk = udpcp_sk(sk);
+	int found = 0;
+
+	if (sock_flag(sk, SOCK_DEAD))
+		return 0;
+
+	lock_sock(sk);
+	if (!list_is_last(state->list, &usk->destlist)) {
+		state->list = state->list->next;
+		state->bucket++;
+		found = 1;
+	}
+	udpcp_release_sock(sk);
+	return found;
+}
+
+static void *udpcp_get_next(struct seq_file *seq)
+{
+	struct udpcp_iter_state *state = seq->private;
+	struct udpcp_sock *usk;
+	struct sock *sk;
+
+	while (state) {
+		if (udpcp_next_dest(state))
+			return state;
+
+		sk = state->sk;
+		usk = udpcp_sk(sk);
+
+		spin_lock_bh(&udpcp_lock);
+		while (!list_is_last(&usk->udpcplist, &udpcp_list)) {
+			usk = list_entry(usk->udpcplist.next, struct udpcp_sock,
+				       udpcplist);
+
+			if (udpcp_get_destlist(usk, state))
+				goto found;
+		}
+		state->sk = NULL;
+		state = NULL;
+found:
+		spin_unlock_bh(&udpcp_lock);
+		sock_put(sk);
+	}
+	return state;
+}
+
+static void *udpcp_get_first(struct seq_file *seq)
+{
+	struct list_head *p;
+	struct udpcp_iter_state *state = seq->private;
+	int found = 0;
+
+	if (!state)
+		return NULL;
+
+	spin_lock_bh(&udpcp_lock);
+	list_for_each(p, &udpcp_list) {
+		found = udpcp_get_destlist(list_to_udpcpsock(p), state);
+		if (found)
+			goto found;
+	}
+found:
+	spin_unlock_bh(&udpcp_lock);
+
+	if (!found)
+		return NULL;
+	return udpcp_get_next(seq);
+}
+
+static void *udpcp_get_idx(struct seq_file *seq, loff_t pos)
+{
+	if (!udpcp_get_first(seq))
+		return NULL;
+
+	while (pos--) {
+		if (!udpcp_get_next(seq))
+			return NULL;
+	}
+	return seq->private;
+}
+
+static void *udpcp_seq_start(struct seq_file *seq, loff_t * pos)
+{
+	return *pos ? udpcp_get_idx(seq, *pos - 1) : SEQ_START_TOKEN;
+}
+
+static void *udpcp_seq_next(struct seq_file *seq, void *v, loff_t * pos)
+{
+	void *private;
+
+	if (v == SEQ_START_TOKEN)
+		private = udpcp_get_idx(seq, 0);
+	else
+		private = udpcp_get_next(seq);
+
+	++*pos;
+	return private;
+}
+
+static void udpcp_seq_stop(struct seq_file *seq, void *v)
+{
+	struct udpcp_iter_state *state = seq->private;
+
+	if (state->sk)
+		sock_put(state->sk);
+}
+
+static int udpcp_seq_open(struct inode *inode, struct file *file)
+{
+	struct udpcp_seq_afinfo *afinfo = PDE(inode)->data;
+	int err;
+
+	err = seq_open_net(inode, file, &afinfo->seq_ops,
+			   sizeof(struct udpcp_iter_state));
+	if (err < 0)
+		return err;
+
+	return err;
+}
+
+int udpcp_proc_register(struct net *net, struct udpcp_seq_afinfo *afinfo)
+{
+	struct proc_dir_entry *p;
+	int rc = 0;
+
+	p = proc_create_data(afinfo->name, S_IRUGO, net->proc_net,
+			     &afinfo->seq_fops, afinfo);
+	if (!p)
+		rc = -ENOMEM;
+	return rc;
+}
+
+void udpcp_proc_unregister(struct net *net, struct udpcp_seq_afinfo *afinfo)
+{
+	proc_net_remove(net, afinfo->name);
+}
+
+static unsigned int udpcp_tx_queue_len(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct sk_buff *skb;
+	unsigned int n;
+
+	n = 0;
+	skb_queue_walk(&dest->xmit, skb)
+	    n += skb->len;
+	return n;
+}
+
+static unsigned int udpcp_rx_queue_len(struct sock *sk, struct udpcp_dest *dest)
+{
+	struct sk_buff *skb;
+	unsigned int n;
+
+	n = 0;
+	skb_queue_walk(&sk->sk_receive_queue, skb) {
+		if (udp_hdr(skb)->source == dest->port
+		    && ip_hdr(skb)->saddr == dest->addr)
+			n += skb->len;
+	}
+	return n;
+}
+
+static void udpcp_format_sock(struct seq_file *seq, int *len)
+{
+	struct udpcp_iter_state *state = seq->private;
+	struct sock *sk = state->sk;
+	struct inet_sock *inet = inet_sk(sk);
+	struct udpcp_dest *p = list_to_udpcpdest(state->list);
+	__be32 src = inet->inet_rcv_saddr;
+	__u16 srcp = ntohs(inet->inet_sport);
+	__be32 dest = p->addr;
+	__u16 destp = ntohs(p->port);
+
+	lock_sock(sk);
+	seq_printf(seq, "%4d: %08X:%04X %08X:%04X"
+		   " %02X %08X:%08X %02X:%08lX %08X %5d %8d %lu %d %p %u%n",
+		   state->bucket, src, srcp, dest, destp, sk->sk_state,
+		   udpcp_tx_queue_len(sk, p),
+		   udpcp_rx_queue_len(sk, p),
+		   0, 0L, p->txRetries, sock_i_uid(sk),
+		   p->txTimeout, sock_i_ino(sk),
+		   atomic_read(&sk->sk_refcnt), sk, p->rxTimeout,
+		   len);
+	udpcp_release_sock(sk);
+}
+
+int udpcp_seq_show(struct seq_file *seq, void *v)
+{
+	if (v == SEQ_START_TOKEN)
+		seq_printf(seq, "%-127s\n",
+			   "  sl  local_address rem_address   st tx_queue "
+			   "rx_queue tr tm->when retrnsmt   uid  timeout "
+			   "inode ref pointer drops");
+	else {
+		int len;
+
+		udpcp_format_sock(seq, &len);
+		seq_printf(seq, "%*s\n", 127 - len, "");
+	}
+	return 0;
+}
+
+static struct udpcp_seq_afinfo udpcp_seq_afinfo = {
+	.name = "udpcp",
+	.seq_fops = {
+			.owner = THIS_MODULE,
+			.open = udpcp_seq_open,
+			.read = seq_read,
+			.llseek = seq_lseek,
+			.release = seq_release_net,
+		     },
+	.seq_ops = {
+			.show = udpcp_seq_show,
+			.start = udpcp_seq_start,
+			.next = udpcp_seq_next,
+			.stop = udpcp_seq_stop,
+		    },
+};
+
+static int udpcp_proc_init_net(struct net *net)
+{
+	return udpcp_proc_register(net, &udpcp_seq_afinfo);
+}
+
+static void udpcp_proc_exit_net(struct net *net)
+{
+	udpcp_proc_unregister(net, &udpcp_seq_afinfo);
+}
+
+static struct pernet_operations udpcp_net_ops = {
+	.init = udpcp_proc_init_net,
+	.exit = udpcp_proc_exit_net,
+};
+
+int __init udpcp_proc_init(void)
+{
+	return register_pernet_subsys(&udpcp_net_ops);
+}
+
+void udpcp_proc_exit(void)
+{
+	unregister_pernet_subsys(&udpcp_net_ops);
+}
+#endif /* CONFIG_PROC_FS */
+
+/*
+ * Install and init module
+ */
+static int __init udpcp_init(void)
+{
+	int ret;
+	struct proc_dir_entry *proc_entry = NULL;
+
+	spin_lock_init(&udpcp_lock);
+
+	INIT_LIST_HEAD(&udpcp_list);
+
+	/*
+	 * to prevent to rewrite the whole UDP protocol,
+	 * assign struct proto udp to the struct proto udpcp
+	 */
+	udpcp_prot = udp_prot;
+
+	/*
+	 * change the protocol name
+	 */
+	strcpy(udpcp_prot.name, "UDPCP");
+
+	/*
+	 * overload the following function, all other
+	 * functions will use the UDP protocol functions
+	 */
+	udpcp_prot.sendmsg = udpcp_sendmsg;
+	udpcp_prot.sendpage = udpcp_sendpage;
+	udpcp_prot.init = udpcp_sockinit;
+	udpcp_prot.destroy = udpcp_destroy;
+	udpcp_prot.setsockopt = udpcp_setsockopt;
+	udpcp_prot.getsockopt = udpcp_getsockopt;
+	udpcp_prot.ioctl = udpcp_ioctl;
+	udpcp_prot.recvmsg = udpcp_recvmsg;
+
+	/*
+	 * fix the object size for the embedded udpcp_sock structure
+	 */
+	udpcp_prot.obj_size = sizeof(struct udpcp_sock);
+
+	/*
+	 * register the UDPCP protocol
+	 */
+	ret = proto_register(&udpcp_prot, 1);
+	if (ret)
+		return ret;
+
+	/*
+	 * register the inet socket for UDPCP
+	 */
+	inet_register_protosw(&udpcp_protosw);
+
+#ifdef CONFIG_PROC_FS
+	/*
+	 * register /proc/driver/udpcp entry
+	 */
+	proc_entry =
+	    create_proc_read_entry(UDPCP_PROC, S_IRUSR | S_IRGRP | S_IROTH,
+				   NULL, udpcp_proc, NULL);
+
+	if (!proc_entry) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	/*
+	 * register /proc/net/udpcp entry
+	 */
+	ret = udpcp_proc_init();
+
+	if (ret)
+		goto err;
+#endif
+	pr_info("UDPCP protocol stack version " VERSION "\n");
+	return 0;
+#ifdef CONFIG_PROC_FS
+err:
+	if (proc_entry)
+		remove_proc_entry(UDPCP_PROC, NULL);
+	proto_unregister(&udpcp_prot);
+	return ret;
+#endif
+}
+
+/*
+ * Cleanup and exit module
+ */
+static void __exit udpcp_exit(void)
+{
+#ifdef CONFIG_PROC_FS
+	udpcp_proc_exit();
+	remove_proc_entry(UDPCP_PROC, NULL);
+#endif
+	inet_unregister_protosw(&udpcp_protosw);
+	proto_unregister(&udpcp_prot);
+}
+
+module_init(udpcp_init);
+module_exit(udpcp_exit);
+
+MODULE_AUTHOR("Stefani Seibold <stefani@seibold.net>");
+MODULE_DESCRIPTION("UDPCP protocol stack v" VERSION);
+MODULE_LICENSE("GPL");
+
-- 
1.7.3.4

^ permalink raw reply related

* Re: [PATCH] net: eepro testing positive EBUSY return by request_irq()?
From: roel kluin @ 2011-01-02 14:52 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: davem, netdev, Andrew Morton, LKML
In-Reply-To: <1293809262.2870.45.camel@localhost>


>> -		if (request_irq (*irqp, NULL, IRQF_SHARED, "bogus", dev) != EBUSY) {
>> +		if (request_irq (*irqp, NULL, IRQF_SHARED, "bogus", dev) != -EBUSY) {

> This condition is completely bogus - request_irq() with a NULL handler
> now returns -EINVAL before even checking whether the IRQ is in use.  The
> code should be fixed along the lines of what I did for 3c503 in commit
> b0cf4dfb7cd21556efd9a6a67edcba0840b4d98d.
> 
> The e2100 and hp net drivers have the same bug.
> 
> Ben.

I've made an attempt to fix the mentioned issues, but I'm not sure it is complete.
The commit made some other changes as well which didn't seem appropriate here to me.
And I didn't compile or run test this.

As a side note, in the function changed by the commit you mentioned, el2_open(), it
appears that in current git, boolean `seen' cannot ever become true, can it? So it
loops forever until the request_irq() returns an error other than -EBUSY. Not sure
how it should be fixed though.

------------->8-------------------8<----------------
Handle errors returned by request_irq() appropriately

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
---
 drivers/net/e2100.c |   14 +++++++++-----
 drivers/net/eepro.c |   26 +++++++++++++++-----------
 drivers/net/hp.c    |   25 ++++++++++++++-----------
 3 files changed, 38 insertions(+), 27 deletions(-)

diff --git a/drivers/net/e2100.c b/drivers/net/e2100.c
index 06e72fb..115fa16 100644
--- a/drivers/net/e2100.c
+++ b/drivers/net/e2100.c
@@ -217,11 +217,15 @@ static int __init e21_probe1(struct net_device *dev, int ioaddr)
 
 	if (dev->irq < 2) {
 		int irqlist[] = {15, 11, 10, 12, 5, 9, 3, 4};
-		for (i = 0; i < ARRAY_SIZE(irqlist); i++)
-			if (request_irq (irqlist[i], NULL, 0, "bogus", NULL) != -EBUSY) {
-				dev->irq = irqlist[i];
-				break;
-			}
+		for (i = 0; i < ARRAY_SIZE(irqlist); i++) {
+			retval = request_irq (irqlist[i], NULL, 0, "bogus", NULL);
+			if (retval != -EBUSY)
+				continue;
+			if (retval < 0)
+				goto out;
+			dev->irq = irqlist[i];
+			break;
+		}
 		if (i >= ARRAY_SIZE(irqlist)) {
 			printk(" unable to get IRQ %d.\n", dev->irq);
 			retval = -EAGAIN;
diff --git a/drivers/net/eepro.c b/drivers/net/eepro.c
index 7c82631..24ff522 100644
--- a/drivers/net/eepro.c
+++ b/drivers/net/eepro.c
@@ -897,6 +897,7 @@ static int	eepro_grab_irq(struct net_device *dev)
 {
 	int irqlist[] = { 3, 4, 5, 7, 9, 10, 11, 12, 0 };
 	int *irqp = irqlist, temp_reg, ioaddr = dev->base_addr;
+	int retval;
 
 	eepro_sw2bank1(ioaddr); /* be CAREFUL, BANK 1 now */
 
@@ -919,21 +920,24 @@ static int	eepro_grab_irq(struct net_device *dev)
 		outb((temp_reg & 0xf8) | irqrmap[*irqp], ioaddr + INT_NO_REG);
 
 		eepro_sw2bank0(ioaddr); /* Switch back to Bank 0 */
+		retval = request_irq (*irqp, NULL, IRQF_SHARED, "bogus", dev);
+		if (retval == -EBUSY)
+			continue;
+		if (retval < 0)
+			return retval;
 
-		if (request_irq (*irqp, NULL, IRQF_SHARED, "bogus", dev) != EBUSY) {
-			unsigned long irq_mask;
-			/* Twinkle the interrupt, and check if it's seen */
-			irq_mask = probe_irq_on();
+		unsigned long irq_mask;
+		/* Twinkle the interrupt, and check if it's seen */
+		irq_mask = probe_irq_on();
 
-			eepro_diag(ioaddr); /* RESET the 82595 */
-			mdelay(20);
+		eepro_diag(ioaddr); /* RESET the 82595 */
+		mdelay(20);
 
-			if (*irqp == probe_irq_off(irq_mask))  /* It's a good IRQ line */
-				break;
+		if (*irqp == probe_irq_off(irq_mask))  /* It's a good IRQ line */
+			break;
 
-			/* clear all interrupts */
-			eepro_clear_int(ioaddr);
-		}
+		/* clear all interrupts */
+		eepro_clear_int(ioaddr);
 	} while (*++irqp);
 
 	eepro_sw2bank1(ioaddr); /* Switch back to Bank 1 */
diff --git a/drivers/net/hp.c b/drivers/net/hp.c
index d15d2f2..aeb68a0 100644
--- a/drivers/net/hp.c
+++ b/drivers/net/hp.c
@@ -167,17 +167,20 @@ static int __init hp_probe1(struct net_device *dev, int ioaddr)
 		int *irqp = wordmode ? irq_16list : irq_8list;
 		do {
 			int irq = *irqp;
-			if (request_irq (irq, NULL, 0, "bogus", NULL) != -EBUSY) {
-				unsigned long cookie = probe_irq_on();
-				/* Twinkle the interrupt, and check if it's seen. */
-				outb_p(irqmap[irq] | HP_RUN, ioaddr + HP_CONFIGURE);
-				outb_p( 0x00 | HP_RUN, ioaddr + HP_CONFIGURE);
-				if (irq == probe_irq_off(cookie)		 /* It's a good IRQ line! */
-					&& request_irq (irq, eip_interrupt, 0, DRV_NAME, dev) == 0) {
-					printk(" selecting IRQ %d.\n", irq);
-					dev->irq = *irqp;
-					break;
-				}
+			retval = request_irq (irq, NULL, 0, "bogus", NULL);
+			if (retval == -EBUSY)
+				continue;
+			if (retval < 0)
+				goto out;
+			unsigned long cookie = probe_irq_on();
+			/* Twinkle the interrupt, and check if it's seen. */
+			outb_p(irqmap[irq] | HP_RUN, ioaddr + HP_CONFIGURE);
+			outb_p( 0x00 | HP_RUN, ioaddr + HP_CONFIGURE);
+			if (irq == probe_irq_off(cookie)		 /* It's a good IRQ line! */
+				&& request_irq (irq, eip_interrupt, 0, DRV_NAME, dev) == 0) {
+				printk(" selecting IRQ %d.\n", irq);
+				dev->irq = *irqp;
+				break;
 			}
 		} while (*++irqp);
 		if (*irqp == 0) {

^ permalink raw reply related

* [PATCH net-next-2.6] bonding: remove meaningless /sys/module/bonding/parameters entries.
From: Nicolas de Pesloüan @ 2011-01-02 14:35 UTC (permalink / raw)
  To: bonding-devel, netdev, fubar, davem; +Cc: nicolas.2p.debian

Only two bonding parameters are exposed in /sys/module/bonding:

num_grat_arp
num_unsol_na

Those values are not module global, but per device.

The per device values are available in /sys/class/net/<device>/bonding.

The values exposed in /sys/module/bonding are those given at module load time
and only used as default values when creating a device. They are read-only and
cannot change in any way.

As such, they are mostly meaningless.
---
 drivers/net/bonding/bond_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index b1025b8..b7fd966 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -116,9 +116,9 @@ module_param(max_bonds, int, 0);
 MODULE_PARM_DESC(max_bonds, "Max number of bonded devices");
 module_param(tx_queues, int, 0);
 MODULE_PARM_DESC(tx_queues, "Max number of transmit queues (default = 16)");
-module_param(num_grat_arp, int, 0644);
+module_param(num_grat_arp, int, 0);
 MODULE_PARM_DESC(num_grat_arp, "Number of gratuitous ARP packets to send on failover event");
-module_param(num_unsol_na, int, 0644);
+module_param(num_unsol_na, int, 0);
 MODULE_PARM_DESC(num_unsol_na, "Number of unsolicited IPv6 Neighbor Advertisements packets to send on failover event");
 module_param(miimon, int, 0);
 MODULE_PARM_DESC(miimon, "Link check interval in milliseconds");
-- 
1.7.2.3

^ permalink raw reply related

* [PATCH net-2.6 V2] bridge: fix br_multicast_ipv6_rcv for paged skbs
From: Tomas Winkler @ 2011-01-02 13:24 UTC (permalink / raw)
  To: davem; +Cc: netdev, Tomas Winkler, Johannes Berg, Stephen Hemminger

use pskb_may_pull to access ipv6 header correctly for paged skbs
It was omitted in the bridge code leading to crash in blind
__skb_pull

this fixes bug https://bugzilla.kernel.org/show_bug.cgi?id=25202

Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 IEEE 802.11: authenticated
Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 IEEE 802.11: associated (aid 2)
Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 RADIUS: starting accounting session 4D0608A3-00000005
Dec 15 14:36:41 User-PC kernel: [175576.120287] ------------[ cut here ]------------
Dec 15 14:36:41 User-PC kernel: [175576.120452] kernel BUG at include/linux/skbuff.h:1178!
Dec 15 14:36:41 User-PC kernel: [175576.120609] invalid opcode: 0000 [#1] SMP
Dec 15 14:36:41 User-PC kernel: [175576.120749] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/uevent
Dec 15 14:36:41 User-PC kernel: [175576.121035] Modules linked in: oprofile binfmt_misc bridge stp llc parport_pc ppdev arc4 iwlagn snd_hda_codec_realtek iwlcore i915 snd_hda_intel mac80211 joydev snd_hda_codec snd_hwdep snd_pcm snd_seq_midi drm_kms_helper snd_rawmidi drm snd_seq_midi_event snd_seq snd_timer snd_seq_device cfg80211 eeepc_wmi usbhid psmouse intel_agp i2c_algo_bit intel_gtt uvcvideo agpgart videodev sparse_keymap snd shpchp v4l1_compat lp hid video serio_raw soundcore output snd_page_alloc ahci libahci atl1c
Dec 15 14:36:41 User-PC kernel: [175576.122712]
Dec 15 14:36:41 User-PC kernel: [175576.122769] Pid: 0, comm: kworker/0:0 Tainted: G        W   2.6.37-rc5-wl+ #3 1015PE/1016P
Dec 15 14:36:41 User-PC kernel: [175576.123012] EIP: 0060:[<f83edd65>] EFLAGS: 00010283 CPU: 1
Dec 15 14:36:41 User-PC kernel: [175576.123193] EIP is at br_multicast_rcv+0xc95/0xe1c [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.123362] EAX: 0000001c EBX: f5626318 ECX: 00000000 EDX: 00000000
Dec 15 14:36:41 User-PC kernel: [175576.123550] ESI: ec512262 EDI: f5626180 EBP: f60b5ca0 ESP: f60b5bd8
Dec 15 14:36:41 User-PC kernel: [175576.123737]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Dec 15 14:36:41 User-PC kernel: [175576.123902] Process kworker/0:0 (pid: 0, ti=f60b4000 task=f60a8000 task.ti=f60b0000)
Dec 15 14:36:41 User-PC kernel: [175576.124137] Stack:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  ec556500 f6d06800 f60b5be8 c01087d8 ec512262 00000030 00000024 f5626180
Dec 15 14:36:41 User-PC kernel: [175576.124181]  f572c200 ef463440 f5626300 3affffff f6d06dd0 e60766a4 000000c4 f6d06860
Dec 15 14:36:41 User-PC kernel: [175576.124181]  ffffffff ec55652c 00000001 f6d06844 f60b5c64 c0138264 c016e451 c013e47d
Dec 15 14:36:41 User-PC kernel: [175576.124181] Call Trace:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01087d8>] ? sched_clock+0x8/0x10
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0138264>] ? enqueue_entity+0x174/0x440
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c016e451>] ? sched_clock_cpu+0x131/0x190
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c013e47d>] ? select_task_rq_fair+0x2ad/0x730
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0524fc1>] ? nf_iterate+0x71/0x90
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4914>] ? br_handle_frame_finish+0x184/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4790>] ? br_handle_frame_finish+0x0/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e46e9>] ? br_handle_frame+0x189/0x230 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4790>] ? br_handle_frame_finish+0x0/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4560>] ? br_handle_frame+0x0/0x230 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04ff026>] ? __netif_receive_skb+0x1b6/0x5b0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04f7a30>] ? skb_copy_bits+0x110/0x210
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0503a7f>] ? netif_receive_skb+0x6f/0x80
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cb74c>] ? ieee80211_deliver_skb+0x8c/0x1a0 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cc836>] ? ieee80211_rx_handlers+0xeb6/0x1aa0 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04ff1f0>] ? __netif_receive_skb+0x380/0x5b0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c016e242>] ? sched_clock_local+0xb2/0x190
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c012b688>] ? default_spin_lock_flags+0x8/0x10
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d83df>] ? _raw_spin_lock_irqsave+0x2f/0x50
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cd621>] ? ieee80211_prepare_and_rx_handle+0x201/0xa90 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82ce154>] ? ieee80211_rx+0x2a4/0x830 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f815a8d6>] ? iwl_update_stats+0xa6/0x2a0 [iwlcore]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8499212>] ? iwlagn_rx_reply_rx+0x292/0x3b0 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d83df>] ? _raw_spin_lock_irqsave+0x2f/0x50
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8483697>] ? iwl_rx_handle+0xe7/0x350 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8486ab7>] ? iwl_irq_tasklet+0xf7/0x5c0 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01aece1>] ? __rcu_process_callbacks+0x201/0x2d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150d05>] ? tasklet_action+0xc5/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150a07>] ? __do_softirq+0x97/0x1d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d910c>] ? nmi_stack_correct+0x2f/0x34
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150970>] ? __do_softirq+0x0/0x1d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  <IRQ>
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01508f5>] ? irq_exit+0x65/0x70
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05df062>] ? do_IRQ+0x52/0xc0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01036b0>] ? common_interrupt+0x30/0x38
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c03a1fc2>] ? intel_idle+0xc2/0x160
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04daebb>] ? cpuidle_idle_call+0x6b/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0101dea>] ? cpu_idle+0x8a/0xf0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d2702>] ? start_secondary+0x1e8/0x1ee

Cc: David Miller <davem@davemloft.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
---
V2: Implement David Miller's suggestion 

 net/bridge/br_multicast.c |   26 ++++++++++++++++++++------
 1 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index f19e347..5d1449f 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1469,15 +1469,16 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
 	if (!skb2)
 		return -ENOMEM;
 
+
+	err = -EINVAL;
+	if (!pskb_may_pull(skb2, offset + sizeof(struct icmp6hdr)))
+		goto out_nopush;
+
 	len -= offset - skb_network_offset(skb2);
 
 	__skb_pull(skb2, offset);
 	skb_reset_transport_header(skb2);
 
-	err = -EINVAL;
-	if (!pskb_may_pull(skb2, sizeof(*icmp6h)))
-		goto out;
-
 	icmp6h = icmp6_hdr(skb2);
 
 	switch (icmp6h->icmp6_type) {
@@ -1516,7 +1517,13 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
 	switch (icmp6h->icmp6_type) {
 	case ICMPV6_MGM_REPORT:
 	    {
-		struct mld_msg *mld = (struct mld_msg *)icmp6h;
+		struct mld_msg *mld;
+		if (skb->len < sizeof(*mld) ||
+		    !pskb_may_pull(skb2, sizeof(*mld))) {
+			err = -EINVAL;
+			goto out;
+		}
+		mld = (struct mld_msg *)icmp6h;
 		BR_INPUT_SKB_CB(skb2)->mrouters_only = 1;
 		err = br_ip6_multicast_add_group(br, port, &mld->mld_mca);
 		break;
@@ -1529,13 +1536,20 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
 		break;
 	case ICMPV6_MGM_REDUCTION:
 	    {
-		struct mld_msg *mld = (struct mld_msg *)icmp6h;
+		struct mld_msg *mld;
+		if (skb->len < sizeof(*mld) ||
+		    !pskb_may_pull(skb2, sizeof(*mld))) {
+			err = -EINVAL;
+			goto out;
+		}
+		mld = (struct mld_msg *)icmp6h;
 		br_ip6_multicast_leave_group(br, port, &mld->mld_mca);
 	    }
 	}
 
 out:
 	__skb_push(skb2, offset);
+out_nopush:
 	if (skb2 != skb)
 		kfree_skb(skb2);
 	return err;
-- 
1.7.3.4

---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


^ permalink raw reply related

* Final CFP and deadline extension: IFIP/IEEE DANMS 2011 (co-located with IM 2011)
From: danms2011 @ 2011-01-02 13:10 UTC (permalink / raw)
  To: danms2011

(We apologize if you receive multiple copies of this message)

(Due to multiple requests paper submission deadline has been extended to 7 Jan 2011)

CALL FOR PAPERS

IFIP/IEEE DANMS 2011
Forth International Workshop on Distributed Autonomous Network Management Systems
Co-located with IM 2011, Dublin, Ireland
www.danms.org

Important Dates
Paper submission deadline: 7 Jan 2011 (extended)
Notification of acceptance: 30 Jan 2011
Camera-ready paper: 15 Feb 2011

Paper Submission URL
https://jems.sbc.org.br/danms2011

The DANMS 2011 workshop is part of a series of workshops dedicated to advances in network management and the application of new management principles in network design.
This year’s workshop emphasizes on “Efficient Management of Loosely Collaborative Service Networks” where network connectivity providers, content and service providers come together to provide high-level services such as Over-The-Top (OTT) services to end-users in a loosely collaborative way.
In the recent past, end-user services such as high quality video calls, web-based video content servers, HD video streaming and VoD services etc. have driven the telecom market. The appearance of these new content/service providers, which include so called Over-The-Top (OTT) service providers, has driven network usage, but also created huge network management problems for the operators. In this context, content/service providers together with network operators provide value to the subscribers in a “loosely collaborative” fashion. Service assurance in this loose collaborative environment is challenging, particularly in the presence of a network with limited or no guarantees. This nascent eco-system is driving fundamental changes in how networks are deployed and managed as well as how they interact with content and service providers. In this context, together with wider network management contributions, we expect to have technical contributions in the areas of automated service assurance and in loosely collaborative service networks.

Topics of interest include (but are not limited to) the following:

 * Federated network management
 * Impact of Over the Top (OTT) services in resource management
 * Case studies of service assurance
 * Efficient use of terminal reports in service assurance
 * SLA for OTT service assurance
 * Service and resource modelling approaches for management
 * Business rules and organizational modelling
 * Use of semantics in service deployment and quality assurance
 * Extensions and refinements of NM standards
 * Aspects of service management and assurance
 * Automated service provisioning across multiple service providers
 * Fault and performance management, diagnosis, and troubleshooting
 * End-to-end QoS management for enterprise networks
 * Measurements and insights from network operations
 * Metrics, techniques, and experiments for evaluating network
 * Convergence of fixed and mobile networks

Steering Committee
 Nazim Agoulmine, University of Evry Val d'Essonne France
 Raouf Boutaba, University of Waterloo Canada
 Gabriel Hogan, Network Management Lab, Ericsson, Ireland

Workshop Co-Chairs
 Sidath Handurukande, Network Management Lab, Ericsson Ireland
 Jose Neuman de Souza, Federal University of Ceara, Brazil

Publicity Chair
 Yangcheng Huang, Ericsson Ireland

Technical Program Committee
 Arosha Bandara, Open University, UK
 Javier Baliosian, Uni. of the Republic, Uruguay
 Saleem Bhatti, Uni. of St Andrews, UK
 Sami Bhiri, DERI, Ireland
 Anne-Marie Bosneag, Ericsson, Ireland
 Anca Chandra, IBM Research, USA
 Dave Cleary, Ericsson, Ireland
 Zoran Despotovic, NTT DoCoMo Euro-Labs, Germany
 Dominique Dudkowski, NEC, Germany
 Liam Fallon, Ericsson, Ireland
 James Hong, POSTECH, Korea
 Jesse Kielthy, TSSG, Ireland
 Francine Krief, Bordeaux 1 University, France
 Petr Kuznetsov, Deutsche Telekom Laboratories, Germany
 Brian Lee, AIT, Ireland
 Declan O'Sullivan, Trinity College Dublin, Ireland
 Gerard Parr, University of Ulster, UK
 Luis Rodrigues, INESC-ID/IST, Portugal
 Florian Schreiner, Fraunhofer FOKUS, Germany
 Rolf Stadler, KTH, Sweden
 Martin van Steen, Vrje University, Netherlands
 Filip De Turck, Ghent University-IBBT, Belgium
 Stefan Wallin, Luleå University of Technology, Sweden
 Martin Zach, Siemens AG Austria

Paper Submission Instructions

Authors are invited to submit papers of maximum six pages or extended papers of up to eight pages (including figures, tables, and references), in the standard two-column, 11 pt font, IEEE conference paper format. Standard IEEE Transactions templates for Microsoft Word or LaTeX formats can be found at
http://www.ieee.org/portal/pages/pubs/transactions/stylesheets.html.
All work submitted must be original, not previously published or under submission at other venues.
At least one author of accepted papers must be present at the workshop, to present the paper. Accepted papers will be published by IEEE Xplore and indexed accordingly.
Only PDF files will be accepted for the review process and all submissions must be done through JEMS at the URL below:
https://jems.sbc.org.br/danms2011

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Stefani Seibold @ 2011-01-02 11:57 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, akpm, davem, netdev, shemminger
In-Reply-To: <1293967996.2535.76.camel@edumazet-laptop>

Am Sonntag, den 02.01.2011, 12:33 +0100 schrieb Eric Dumazet:
> Le dimanche 02 janvier 2011 à 12:17 +0100, Stefani Seibold a écrit :
> > Am Samstag, den 01.01.2011, 23:23 +0100 schrieb Eric Dumazet:
> 
> > > 3) I see UDPLITE references in your code. Are you sure UDPCP protocol
> > > can really work on top of UDPLITE ? I think not, so please remove dead
> > > code.
> > > 
> > 
> > I have not tested it yet. But i think i should work, the code is mostly
> > stolen from the udp.c file. Since the stack not depend on the len field
> > of the UDP header, it should not matter if UPD or UDP-Lite is used.
> > 
> 
> Hmm, but isnt your :
> 
> udpcp_prot = udp_prot; /* in udpcp_init() */
> 
> implies that all UDPCP sockets you are going to provide to user are on
> top of UDP, not UDPLITE ?
> 

Okay, i kick to udplite support away!

> Also, could you provide a link to UDPCP protocol specs ?
> 

http://read.pudn.com/downloads76/doc/project/283718/OBSAI/OBSAI/RP1_V2.0.PDF

> Thanks
> 
> 

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Eric Dumazet @ 2011-01-02 11:33 UTC (permalink / raw)
  To: Stefani Seibold; +Cc: linux-kernel, akpm, davem, netdev, shemminger
In-Reply-To: <1293967023.29525.16.camel@wall-e>

Le dimanche 02 janvier 2011 à 12:17 +0100, Stefani Seibold a écrit :
> Am Samstag, den 01.01.2011, 23:23 +0100 schrieb Eric Dumazet:

> > 3) I see UDPLITE references in your code. Are you sure UDPCP protocol
> > can really work on top of UDPLITE ? I think not, so please remove dead
> > code.
> > 
> 
> I have not tested it yet. But i think i should work, the code is mostly
> stolen from the udp.c file. Since the stack not depend on the len field
> of the UDP header, it should not matter if UPD or UDP-Lite is used.
> 

Hmm, but isnt your :

udpcp_prot = udp_prot; /* in udpcp_init() */

implies that all UDPCP sockets you are going to provide to user are on
top of UDP, not UDPLITE ?

Also, could you provide a link to UDPCP protocol specs ?

Thanks

^ permalink raw reply

* Re: [PATCH] new UDPCP Communication Protocol
From: Stefani Seibold @ 2011-01-02 11:17 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, akpm, davem, netdev, shemminger
In-Reply-To: <1293920601.2535.64.camel@edumazet-laptop>

Am Samstag, den 01.01.2011, 23:23 +0100 schrieb Eric Dumazet:
> Le samedi 01 janvier 2011 à 22:44 +0100, stefani@seibold.net a écrit :
> > From: Stefani Seibold <stefani@seibold.net>
> > 
> > Changelog:
> > 31.12.2010 first proposal
> > 01.01.2011 code cleanup and fixes suggest by Eric Dumazet
> > 
> > UDPCP is a communication protocol specified by the Open Base Station
> > Architecture Initiative Special Interest Group (OBSAI SIG). The
> > protocol is based on UDP and is designed to meet the needs of "Mobile
> > Communcation Base Station" internal communications. It is widely used by
> > the major networks infrastructure supplier.
> > 
> > The UDPCP communication service supports the following features:
> > 
> > -Connectionless communication for serial mode data transfer
> > -Acknowledged and unacknowledged transfer modes
> > -Retransmissions Algorithm
> > -Checksum Algorithm using Adler32
> > -Fragmentation of long messages (disassembly/reassembly) to match to the MTU
> >  during transport:
> > -Broadcasting and multicasting messages to multiple peers in unacknowledged
> >  transfer mode
> > 
> > UDPCP supports application level messages up to 64 KBytes (limited by 16-bit
> > packet data length field). Messages that are longer than the MTU will be
> > fragmented to the MTU.
> > 
> > UDPCP provides a reliable transport service that will perform message
> > retransmissions in case transport failures occur.
> > 
> > The code is also a nice example how to implement a UDP based protocol as
> > a kernel socket modules.
> > 
> > Due the nature of UDPCP which has no sliding windows support, the latency has a
> > huge impact. The perfomance increase by implementing as a kernel module is
> > about the factor 10, because there are no context switches and data packets or
> > ACKs will be handled in the interrupt service.
> > 
> > There are no side effects to the network subsystems so i ask for merge it
> > into linux-next. Hope you like it.
> > 
> > The patch is against 2.6.37-rc8
> > 
> 
> Hi Stefani
> 
> 1) Please base your next trys on net-next-2.6 : This is the reference
> for stuff like that. It probably does not matter a lot, but still...
> 

Okay.

> 
> 2) I still see some _irq() variants of spinlock(). Its not necessary in
> network stack at the level you are working (process context, and
> softirqs)
> 
> Please only use _bh() variants, it's enough.
> 

Will be fixed...

> 3) I see UDPLITE references in your code. Are you sure UDPCP protocol
> can really work on top of UDPLITE ? I think not, so please remove dead
> code.
> 

I have not tested it yet. But i think i should work, the code is mostly
stolen from the udp.c file. Since the stack not depend on the len field
of the UDP header, it should not matter if UPD or UDP-Lite is used.

> 4) udpcp_release_sock() seems expensive to me. Why not testing
> usk->timeout before releasing sock lock, and save a lock/unlock pair ?
> 
> static inline void udpcp_release_sock(struct sock *sk)
> {
> 	struct udpcp_sock *usk = udpcp_sk(sk);
> 
> 	if (usk->timeout)
> 		udpcp_handle_timeout(sk);
> 	release_sock(sk);
> }
> 

No... What is when a timer occurs between exiting udpcp_handle_timeout()
and release_sock(). This timer will not longer handled.

What about this:?

static inline void udpcp_release_sock(struct sock *sk)
{
	while (usk->timeout) {
		udpcp_handle_timeout(sk);
	release_sock(sk);
        check_timeout(sk);
}

> 5) In udpcp_timeout(), if you find socket is locked by user, you set
> timeout to one and rearm a timer to udpcp_timer(sk, jiffies + 1); 
> 
> Why is it needed, since user process is going to handle the timeout
> indication from udpcp_release_sock() ?
> 

It is only for paranoid reasons, maybe there is a bug in my wonderful
code and than the stack will stop to work.... If it is okay, i didn't
want to remove it, my cuts feels a little bit nervous. 



^ permalink raw reply

* Re: [*v3 PATCH 14/22] IPVS: netns awareness to ip_vs_sync
From: Hans Schillstrom @ 2011-01-02  9:47 UTC (permalink / raw)
  To: Simon Horman
  Cc: ja, daniel.lezcano, wensong, lvs-devel, netdev, netfilter-devel,
	Hans Schillstrom
In-Reply-To: <20101231004433.GA14681@verge.net.au>


On Friday, December 31, 2010 01:44:33 Simon Horman wrote:
> On Thu, Dec 30, 2010 at 11:50:58AM +0100, hans@schillstrom.com wrote:
> > From: Hans Schillstrom <hans.schillstrom@ericsson.com>
> > 
> > All global variables moved to struct ipvs,
> > most external changes fixed (i.e. init_net removed)
> > in sync_buf create  + 4 replaced by sizeof(struct..)
> 
> [ snip ]
> 
> > diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
> > index 0454a11..5d6e250 100644
> > --- a/net/netfilter/ipvs/ip_vs_core.c
> > +++ b/net/netfilter/ipvs/ip_vs_core.c
> > @@ -1477,6 +1477,8 @@ ip_vs_in(unsigned int hooknum, struct sk_buff *skb, int af)
> >  	struct ip_vs_proto_data *pd;
> >  	struct ip_vs_conn *cp;
> >  	int ret, restart, pkts;
> > +	struct net *net;
> 
> The line above adds a duplicate declaration of net as
> just before the diff-context starts there is
> 
> 	struct net *net = NULL;
> 
I'll remove that

^ permalink raw reply

* Re: [PATCH V7 7/8] ptp: Added a clock driver for the IXP46x.
From: Pavel Machek @ 2011-01-02  9:20 UTC (permalink / raw)
  To: Richard Cochran
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	Alan Cox, Arnd Bergmann, Christoph Lameter, David Miller,
	John Stultz, Krzysztof Halasa, Peter Zijlstra, Rodolfo Giometti,
	Thomas Gleixner
In-Reply-To: <20110102091233.GA2847-7KxsofuKt4IfAd9E5cN8NEzG7cXyKsk/@public.gmane.org>

On Sun 2011-01-02 10:12:33, Richard Cochran wrote:
> On Sun, Jan 02, 2011 at 09:45:05AM +0100, Pavel Machek wrote:
> > Hi!!
> > 
> > > +struct ixp46x_channel_ctl {
> > > +	u32 Ch_Control; /* 0x40 Time Synchronization Channel Control */
> > > +	u32 Ch_Event;   /* 0x44 Time Synchronization Channel Event */
> > > +	u32 TxSnapLo;   /* 0x48 Transmit Snapshot Low Register */
> > > +	u32 TxSnapHi;   /* 0x4C Transmit Snapshot High Register */
> > 
> > CouldWeGetRidOfCamelCase?
> 
> I agree that CamelCase is ugly and in bad taste. However, I make an
> exception when the register level programmer's manual uses this style.
> 
> IMHO, it is better to use the exact same mnemonics as in the
> manual. That way, when the next person comes along to make a change to

Given the comments -- does manual really use camelCase crap?
And... the identifiers actually combine camelCase with _. Better fix
it.
							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply

* Re: [PATCH V7 7/8] ptp: Added a clock driver for the IXP46x.
From: Richard Cochran @ 2011-01-02  9:19 UTC (permalink / raw)
  To: Pavel Machek
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	Alan Cox, Arnd Bergmann, Christoph Lameter, David Miller,
	John Stultz, Krzysztof Halasa, Peter Zijlstra, Rodolfo Giometti,
	Thomas Gleixner
In-Reply-To: <20110102084505.GA3980-+ZI9xUNit7I@public.gmane.org>

BTW, I posted an updated V8 of this patch series on December 31.

Richard

^ permalink raw reply

* Re: [PATCH V7 7/8] ptp: Added a clock driver for the IXP46x.
From: Richard Cochran @ 2011-01-02  9:12 UTC (permalink / raw)
  To: Pavel Machek
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	Alan Cox, Arnd Bergmann, Christoph Lameter, David Miller,
	John Stultz, Krzysztof Halasa, Peter Zijlstra, Rodolfo Giometti,
	Thomas Gleixner
In-Reply-To: <20110102084505.GA3980-+ZI9xUNit7I@public.gmane.org>

On Sun, Jan 02, 2011 at 09:45:05AM +0100, Pavel Machek wrote:
> Hi!!
> 
> > +struct ixp46x_channel_ctl {
> > +	u32 Ch_Control; /* 0x40 Time Synchronization Channel Control */
> > +	u32 Ch_Event;   /* 0x44 Time Synchronization Channel Event */
> > +	u32 TxSnapLo;   /* 0x48 Transmit Snapshot Low Register */
> > +	u32 TxSnapHi;   /* 0x4C Transmit Snapshot High Register */
> 
> CouldWeGetRidOfCamelCase?

I agree that CamelCase is ugly and in bad taste. However, I make an
exception when the register level programmer's manual uses this style.

IMHO, it is better to use the exact same mnemonics as in the
manual. That way, when the next person comes along to make a change to
the code, with manual in hand, he will immediately know what is what.

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH V7 7/8] ptp: Added a clock driver for the IXP46x.
From: Pavel Machek @ 2011-01-02  8:45 UTC (permalink / raw)
  To: Richard Cochran
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	Alan Cox, Arnd Bergmann, Christoph Lameter, David Miller,
	John Stultz, Krzysztof Halasa, Peter Zijlstra, Rodolfo Giometti,
	Thomas Gleixner
In-Reply-To: <d2e0a5ac5eb51e9e4c7fcc94723b67c72c2f57c1.1292512461.git.richard.cochran-3mrvs1K0uXizZXS1Dc/lvw@public.gmane.org>

Hi!!

> +struct ixp46x_channel_ctl {
> +	u32 Ch_Control; /* 0x40 Time Synchronization Channel Control */
> +	u32 Ch_Event;   /* 0x44 Time Synchronization Channel Event */
> +	u32 TxSnapLo;   /* 0x48 Transmit Snapshot Low Register */
> +	u32 TxSnapHi;   /* 0x4C Transmit Snapshot High Register */

CouldWeGetRidOfCamelCase?

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
From: Jesse Gross @ 2011-01-02  0:27 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Matt Carlson, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <1293901393.2535.49.camel@edumazet-laptop>

On Sat, Jan 1, 2011 at 12:03 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 14 décembre 2010 à 11:15 -0800, Matt Carlson a écrit :
>
>> Thanks for the comments Jesse.  Below is an updated patch.
>>
>> Michael, I'm wondering if the difference in behavior can be explained by
>> the presence or absence of management firmware.  Can you look at the
>> driver sign-on messages in your syslogs for ASF[]?  I'm half expecting
>> the 5752 to show "ASF[0]" and the 5714 to show "ASF[1]".  If you see
>> this, and the below patch doesn't fix the problem, let me know.  I have
>> another test I'd like you to run.
>>
>> ----
>>
>> [PATCH] tg3: Use new VLAN code
>>
>> This patch pivots the tg3 driver to the new VLAN infrastructure.
>> All references to vlgrp have been removed and all VLAN code is
>> unconditionally active.
>>
>> Signed-off-by: Matt Carlson <mcarlson@broadcom.com>

[...]

> Hi Matt.
>
> Any news on this patch ?
>
> Without it, net-next-2.6 doesnt work for me on a vlan setup on top of
> bonding.
>
> (bond0 : eth1 & eth2, eth1 being bnx2, eth2 beging tg3)
>
> ip link add link bond0 vlan.103 type vlan id 103
> ip addr add 192.168.20.110/24 dev vlan.103
> ip link set vlan.103 up
>
>
> If active slave is eth1 (bnx2), everything works, but if active slave is
> eth2 (tg3), incoming tagged frames (on vlan 103) are lost.

This patch isn't quite right - it always disables vlan stripping
unless management firmware is in use, so it's not really a correct
fix.

You said that this used to work correctly on this NIC?  Does it work
without a bond, just a vlan on the tg3 device?  It sounds like Michael
has a problem with vlan stripping on one of his NICs but if it works
with just a vlan or on older kernels, it's probably not the same
thing.

If it works on bnx2, it would seem to be a driver problem but it would
be good to confirm that the tag in skb->vlan_tci is not being
delievered to the networking core in this case.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox