Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH][RFC] sctp: fix reporting of unknown parameters
From: Jiri Bohac @ 2011-02-17 23:12 UTC (permalink / raw)
  To: linux-sctp, Neil Horman

Hi,

commit 5fa782c2f5ef6c2e4f04d3e228412c9b4a4c8809 re-worked the 
handling of unknown parameters. sctp_init_cause_fixed() can now
return -ENOSPC if there is not enough tailroom in the error
chunk skb. When this happens, the error header is not appended to
the error chunk. In that case, the payload of the unknown parameter
should not be appended either.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>


diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 2cc46f0..b23428f 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -2029,11 +2029,11 @@ static sctp_ierror_t sctp_process_unk_param(const struct sctp_association *asoc,
 			*errp = sctp_make_op_error_fixed(asoc, chunk);
 
 		if (*errp) {
-			sctp_init_cause_fixed(*errp, SCTP_ERROR_UNKNOWN_PARAM,
-					WORD_ROUND(ntohs(param.p->length)));
-			sctp_addto_chunk_fixed(*errp,
-					WORD_ROUND(ntohs(param.p->length)),
-					param.v);
+			if (!sctp_init_cause_fixed(*errp, SCTP_ERROR_UNKNOWN_PARAM,
+					WORD_ROUND(ntohs(param.p->length))))
+				sctp_addto_chunk_fixed(*errp,
+						WORD_ROUND(ntohs(param.p->length)),
+						param.v);
 		} else {
 			/* If there is no memory for generating the ERROR
 			 * report as specified, an ABORT will be triggered


-- 
Jiri Bohac <jbohac@suse.cz>
SUSE Labs, SUSE CZ


^ permalink raw reply related

* Loan
From: Orbit Loans @ 2011-02-18 13:38 UTC (permalink / raw)


Good day,

Do you have bad credit?
Do you want to upgrade your business?

I give loans to both private and public organizations at
very low and considerable interest rates.

For more information you could contact me at my email
address below:orbitfinanceltduk@mail2uk.com


Regards
Hill Colin Neil Esq.



^ permalink raw reply

* Re: [PATCH 1/1] tproxy: do not assign timewait sockets to skb->sk
From: Patrick McHardy @ 2011-02-18 16:19 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Balazs Scheidler, netfilter-devel, netdev, KOVACS Krisztian
In-Reply-To: <20110217212717.GG8821@Chamillionaire.breakpoint.cc>

Am 17.02.2011 22:27, schrieb Florian Westphal:
> Balazs Scheidler <bazsi@balabit.hu> wrote:
>> the destination port in the packet can be different in the two lookups. --on-port tproxy option.
> 
> Hrm...  The initial lookup uses the header ip addresses:
>         sk = nf_tproxy_get_sock_v4(dev_net(skb->dev), iph->protocol,
>                                    iph->saddr, iph->daddr,
>                                    hp->source, hp->dest,
>                                    skb->dev, NFT_LOOKUP_ESTABLISHED);
> 
> 3 possible cases:
> - no socket -- try to find listener. This case is not changed by my patch.
> - sk is normal socket. set nfmark and skb->sk. Also not changed.
> - sk is in TW state. This is not changed either:
> 	tproxy_handle_time_wait4() will check if this is a SYN. If it is, a new
> 	listener lookup is made, and TW socket is kicked out.
> 
> If the packet is not a SYN, then tproxy_handle_time_wait4() won't do anything.
> Previously, the timewait sk would now be assigned to skb->sk, which my patch
> prevents.  But I don't see where the '--on-port' port number is involved in the
> TW socket lookup?
> 
> Thanks for reviwing the patch!

For some reason I've not received Balazs' email. Balazs, I'm about to
submit my queued patches upstream, if you wish to object to this patch,
please do so now.

Thanks.

^ permalink raw reply

* IGMPMSG_WRONGVIF only happens when true VIF is an OIF of correct IIF?
From: Dan Langlois @ 2011-02-18 15:46 UTC (permalink / raw)
  To: netdev

Hi, I have a question about the IGMPMSG_WRONGVIF message.
I have MRT_ASSERT enabled but do not receive notifications.
However, if I enable MRT_PIM, I do get them.  I'm curious
why MRT_ASSERT alone isn't sufficient.

This if-statement appears in ip_mr_forward() in net/ipv4/ipmr.c

       if (true_vifi >= 0 && net->ipv4.mroute_do_assert &&
            /* pimsm uses asserts, when switching from RPT to SPT, 
               so that we cannot check that packet arrived on an oif. 
               It is bad, but otherwise we would need to move pretty
               large chunk of pimd to kernel. Ough... --ANK
             */      
            (net->ipv4.mroute_do_pim ||
             cache->mfc_un.res.ttls[true_vifi] < 255) && 
            time_after(jiffies,
                       cache->mfc_un.res.last_assert +
MFC_ASSERT_THRESH)) {
                cache->mfc_un.res.last_assert = jiffies;
                ipmr_cache_report(net, skb, true_vifi,
IGMPMSG_WRONGVIF);
        }       

By the time this statement is reached, it is known that the packet
did not arrive on the expected incoming interface (IIF) and thus a
"WRONGVIF" condition.  This if-statement decides whether a PIM-assert
notification needs to be sent to the multicast routing daemon.

The part of this if-statement I'm questioning is:
    (net->ipv4.mroute_do_pim ||
     cache->mfc_un.res.ttls[true_vifi] < 255) &&

I read this as:
"send WRONGVIF if PIM is enabled -OR-
the packet arrived on an outgoing interface OIF of the correct IIF"

So this statement is always true when PIM is enabled.
However, if only ASSERT is enabled, this statement is only true when
a packet is reflected back on an OIF.

Why not always send the assert for any WRONGVIF condition regardless
of the interface it came in on?  I mean, isn't that the point of
enabling MRT_ASSERT in the first place?  If the assertions weren't
wanted, just don't enable that feature, right?

In our system, multicast streams may get rerouted through a different
VIF than what was first installed in the MFC cache.  I would very much
like to get these WRONGVIF assertions in this case, which is NOT the
case that a packet is seen on an OIF.

Of course I can simply enable MRT_PIM to get the messages, but in that
case I don't understand the reason for having two separate toggles
(MRT_ASSERT versus MRT_PIM).

Could someone please explain the reasoning here, as I don't understand
the comment, and I'm tempted to patch out this part of the if-statement.

cheers,
Dan

^ permalink raw reply

* [RFC PATCH net-next-2.6 2/2] sfc: Add CPU queue mapping for XPS
From: Ben Hutchings @ 2011-02-18 16:15 UTC (permalink / raw)
  To: Tom Herbert; +Cc: netdev, linux-net-drivers
In-Reply-To: <1298045607.2570.17.camel@bwh-desktop>

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/sfc/efx.c |   59 ++++++++++++++++++++++++++++++++++---------------
 1 files changed, 41 insertions(+), 18 deletions(-)

diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index 35b7bc5..6d698c3 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -1179,9 +1179,11 @@ static int efx_wanted_channels(void)
 }
 
 static int
-efx_init_rx_cpu_rmap(struct efx_nic *efx, struct msix_entry *xentries)
+efx_init_cpu_rmaps(struct efx_nic *efx, struct msix_entry *xentries)
 {
-#ifdef CONFIG_RFS_ACCEL
+#ifndef CONFIG_NET_IRQ_CPU_RMAP
+	return 0;
+#else
 	int i, rc;
 
 	efx->net_dev->rx_cpu_rmap = alloc_irq_cpu_rmap(efx->n_rx_channels);
@@ -1190,14 +1192,38 @@ efx_init_rx_cpu_rmap(struct efx_nic *efx, struct msix_entry *xentries)
 	for (i = 0; i < efx->n_rx_channels; i++) {
 		rc = irq_cpu_rmap_add(efx->net_dev->rx_cpu_rmap,
 				      xentries[i].vector);
-		if (rc) {
-			free_irq_cpu_rmap(efx->net_dev->rx_cpu_rmap);
-			efx->net_dev->rx_cpu_rmap = NULL;
-			return rc;
+		if (rc)
+			goto fail_rx;
+	}
+
+	if (efx->tx_channel_offset == 0) {
+		efx->net_dev->tx_cpu_rmap = efx->net_dev->rx_cpu_rmap;
+	} else {
+		efx->net_dev->tx_cpu_rmap =
+			alloc_irq_cpu_rmap(efx->n_tx_channels);
+		if (!efx->net_dev->tx_cpu_rmap) {
+			rc = -ENOMEM;
+			goto fail_rx;
+		}
+		for (i = 0; i < efx->n_tx_channels; i++) {
+			rc = irq_cpu_rmap_add(
+				efx->net_dev->tx_cpu_rmap,
+				xentries[efx->tx_channel_offset + i].vector);
+			if (rc)
+				goto fail_tx;
 		}
 	}
-#endif
+
 	return 0;
+
+fail_tx:
+	free_irq_cpu_rmap(efx->net_dev->tx_cpu_rmap);
+	efx->net_dev->tx_cpu_rmap = NULL;
+fail_rx:
+	free_irq_cpu_rmap(efx->net_dev->rx_cpu_rmap);
+	efx->net_dev->rx_cpu_rmap = NULL;
+	return rc;
+#endif
 }
 
 /* Probe the number and type of interrupts we are able to obtain, and
@@ -1238,14 +1264,15 @@ static int efx_probe_interrupts(struct efx_nic *efx)
 			if (separate_tx_channels) {
 				efx->n_tx_channels =
 					max(efx->n_channels / 2, 1U);
+				efx->tx_channel_offset =
+					efx->n_channels - efx->n_tx_channels;
 				efx->n_rx_channels =
-					max(efx->n_channels -
-					    efx->n_tx_channels, 1U);
+					max(efx->tx_channel_offset, 1U);
 			} else {
 				efx->n_tx_channels = efx->n_channels;
 				efx->n_rx_channels = efx->n_channels;
 			}
-			rc = efx_init_rx_cpu_rmap(efx, xentries);
+			rc = efx_init_cpu_rmaps(efx, xentries);
 			if (rc) {
 				pci_disable_msix(efx->pci_dev);
 				return rc;
@@ -1301,12 +1328,6 @@ static void efx_remove_interrupts(struct efx_nic *efx)
 	efx->legacy_irq = 0;
 }
 
-static void efx_set_channels(struct efx_nic *efx)
-{
-	efx->tx_channel_offset =
-		separate_tx_channels ? efx->n_channels - efx->n_tx_channels : 0;
-}
-
 static int efx_probe_nic(struct efx_nic *efx)
 {
 	size_t i;
@@ -1330,7 +1351,6 @@ static int efx_probe_nic(struct efx_nic *efx)
 	for (i = 0; i < ARRAY_SIZE(efx->rx_indir_table); i++)
 		efx->rx_indir_table[i] = i % efx->n_rx_channels;
 
-	efx_set_channels(efx);
 	netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels);
 	netif_set_real_num_rx_queues(efx->net_dev, efx->n_rx_channels);
 
@@ -2315,7 +2335,10 @@ static void efx_fini_struct(struct efx_nic *efx)
  */
 static void efx_pci_remove_main(struct efx_nic *efx)
 {
-#ifdef CONFIG_RFS_ACCEL
+#ifdef CONFIG_NET_IRQ_CPU_RMAP
+	if (efx->net_dev->tx_cpu_rmap != efx->net_dev->rx_cpu_rmap)
+		free_irq_cpu_rmap(efx->net_dev->tx_cpu_rmap);
+	efx->net_dev->tx_cpu_rmap = NULL;
 	free_irq_cpu_rmap(efx->net_dev->rx_cpu_rmap);
 	efx->net_dev->rx_cpu_rmap = NULL;
 #endif
-- 
1.7.3.4


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [RFC PATCH net-next-2.6 1/2] net: XPS: Allow driver to provide a default mapping of CPUs to TX queues
From: Ben Hutchings @ 2011-02-18 16:15 UTC (permalink / raw)
  To: Tom Herbert; +Cc: netdev, linux-net-drivers
In-Reply-To: <1298045607.2570.17.camel@bwh-desktop>

As with RFS acceleration, drivers may use the irq_cpu_rmap facility to
provide a mapping from each CPU to the 'nearest' TX queue.

Define CONFIG_PACKET_STEERING and CONFIG_NET_IRQ_CPU_RMAP so that
drivers can make all cpu_rmap initialisation conditional on the
latter.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 include/linux/netdevice.h |   10 +++++++---
 net/Kconfig               |   17 ++++++++++++-----
 net/core/dev.c            |   41 +++++++++++++++++++++++------------------
 3 files changed, 42 insertions(+), 26 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c7d7074..c5cba16 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1080,14 +1080,13 @@ struct net_device {
 
 	/* Number of RX queues currently active in device */
 	unsigned int		real_num_rx_queues;
-
-#ifdef CONFIG_RFS_ACCEL
+#endif
+#ifdef CONFIG_NET_IRQ_CPU_RMAP
 	/* CPU reverse-mapping for RX completion interrupts, indexed
 	 * by RX queue number.  Assigned by driver.  This must only be
 	 * set if the ndo_rx_flow_steer operation is defined. */
 	struct cpu_rmap		*rx_cpu_rmap;
 #endif
-#endif
 
 	rx_handler_func_t __rcu	*rx_handler;
 	void __rcu		*rx_handler_data;
@@ -1114,6 +1113,11 @@ struct net_device {
 #ifdef CONFIG_XPS
 	struct xps_dev_maps __rcu *xps_maps;
 #endif
+#ifdef CONFIG_NET_IRQ_CPU_RMAP
+	/* CPU reverse-mapping for TX completion interrupts.  Assigned
+	 * by driver. */
+	struct cpu_rmap		*tx_cpu_rmap;
+#endif
 
 	/* These may be needed for future network-power-down code. */
 
diff --git a/net/Kconfig b/net/Kconfig
index 79cabf1..fa0a093 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -216,21 +216,28 @@ source "net/dcb/Kconfig"
 source "net/dns_resolver/Kconfig"
 source "net/batman-adv/Kconfig"
 
-config RPS
+config PACKET_STEERING
 	boolean
 	depends on SMP && SYSFS && USE_GENERIC_SMP_HELPERS
+	select RPS
+	select XPS
 	default y
 
-config RFS_ACCEL
+config NET_IRQ_CPU_RMAP
 	boolean
-	depends on RPS && GENERIC_HARDIRQS
+	depends on PACKET_STEERING && GENERIC_HARDIRQS
 	select CPU_RMAP
+	select RFS_ACCEL
 	default y
 
+config RPS
+	boolean
+
+config RFS_ACCEL
+	boolean
+
 config XPS
 	boolean
-	depends on SMP && SYSFS && USE_GENERIC_SMP_HELPERS
-	default y
 
 menu "Network testing"
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 54aaca6..dd38b88 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2257,27 +2257,32 @@ static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
 
 	rcu_read_lock();
 	dev_maps = rcu_dereference(dev->xps_maps);
-	if (dev_maps) {
+	if (dev_maps)
 		map = rcu_dereference(
-		    dev_maps->cpu_map[raw_smp_processor_id()]);
-		if (map) {
-			if (map->len == 1)
-				queue_index = map->queues[0];
-			else {
-				u32 hash;
-				if (skb->sk && skb->sk->sk_hash)
-					hash = skb->sk->sk_hash;
-				else
-					hash = (__force u16) skb->protocol ^
-					    skb->rxhash;
-				hash = jhash_1word(hash, hashrnd);
-				queue_index = map->queues[
-				    ((u64)hash * map->len) >> 32];
-			}
-			if (unlikely(queue_index >= dev->real_num_tx_queues))
-				queue_index = -1;
+			dev_maps->cpu_map[raw_smp_processor_id()]);
+	if (map) {
+		if (map->len == 1)
+			queue_index = map->queues[0];
+		else {
+			u32 hash;
+			if (skb->sk && skb->sk->sk_hash)
+				hash = skb->sk->sk_hash;
+			else
+				hash = (__force u16) skb->protocol ^
+					skb->rxhash;
+			hash = jhash_1word(hash, hashrnd);
+			queue_index = map->queues[
+				((u64)hash * map->len) >> 32];
 		}
+		if (unlikely(queue_index >= dev->real_num_tx_queues))
+			queue_index = -1;
+	}
+#ifdef CONFIG_PACKET_STEERING_CPU_RMAP
+	else if (dev->tx_cpu_rmap) {
+		queue_index = cpu_rmap_lookup_index(dev->tx_cpu_rmap,
+						    raw_smp_processor_id());
 	}
+#endif
 	rcu_read_unlock();
 
 	return queue_index;
-- 
1.7.3.4



-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* Re: [patch net-next-2.6] net: convert bonding to use rx_handler
From: Eric Dumazet @ 2011-02-18 16:14 UTC (permalink / raw)
  To: Patrick McHardy, Jiri Pirko
  Cc: netdev, davem, shemminger, fubar, nicolas.2p.debian, andy
In-Reply-To: <4D5E953F.6010606@trash.net>

Le vendredi 18 février 2011 à 16:50 +0100, Patrick McHardy a écrit :
> On 18.02.2011 15:58, Jiri Pirko wrote:
> > Fri, Feb 18, 2011 at 03:46:45PM CET, kaber@trash.net wrote:
> >> Am 18.02.2011 15:27, schrieb Eric Dumazet:
> >>> Le vendredi 18 février 2011 à 15:14 +0100, Jiri Pirko a écrit :
> >>>
> >>>> Do not know how to do it better. As for percpu variable, not only
> >>>> origdev would have to be remembered but also probably skb pointer to
> >>>> know if it's the first run on the skb or not. Can't really figure out a
> >>>> better solution. Can you?
> >>>
> >>> I'll try and let you know.
> >>
> >> Why not simply do a lookup on skb->iif?
> > 
> > Well I was trying to avoid iterating over list of devices for each
> > incoming frame.
> > 
> 
> Well, there are a couple of holes on 64 bit, perhaps you can rearrange
> things and eliminate either iif or input_dev without increasing size
> since they appear to be redundant.

Jiri

I dont understand why netif_rx() is needed in your patch.

Can we stack 10 bond devices or so ???

If we avoid this stage and call the real thing (netif_receive_skb()),
then we dont need adding a field in each skb, since it can be carried by
a global variable (per cpu of course)

bond_handle_frame() being called from __netif_receive_skb() I believe it
can use netif_receive_skb() instead of netif_rx().

Same remark for vlan_on_bond_hook()




^ permalink raw reply

* [RFC PATCH net-next-2.6 0/2] Automatic XPS mapping
From: Ben Hutchings @ 2011-02-18 16:13 UTC (permalink / raw)
  To: Tom Herbert; +Cc: netdev, sf-linux-drivers

In the same way that we maintain a mapping CPUs to RX queues for RFS
acceleration based on current IRQ affinity and the CPU topology, we can
maintain a mapping of CPUs to TX queues for queue selection in XPS.  (In
fact this may be the same mapping.)

Questions:
- Does this make a real difference to performance?
  (I've only barely tested this.)
- Should there be a way to disable it?
- Should the automatic mapping be made visible?
  (This applies RFS acceleration too.)
- Should different mappings be allowed for different traffic classes,
  in case they have separate sets of TX interrupts with different
  affinity?
  (This applies to manual XPS configuration too.)

Ben Hutchings (2):
  net: XPS: Allow driver to provide a default mapping of CPUs to TX
    queues
  sfc: Add CPU queue mapping for XPS

 drivers/net/sfc/efx.c     |   59 +++++++++++++++++++++++++++++++-------------
 include/linux/netdevice.h |   10 +++++--
 net/Kconfig               |   17 +++++++++----
 net/core/dev.c            |   41 +++++++++++++++++-------------
 4 files changed, 83 insertions(+), 44 deletions(-)

-- 
1.7.3.4


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* [PATCH net-next-2.6 3/3] sfc: Implement hardware acceleration of RFS
From: Ben Hutchings @ 2011-02-18 16:04 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1298044839.2570.7.camel@bwh-desktop>

Use the existing filter management functions to insert TCP/IPv4 and
UDP/IPv4 4-tuple filters for Receive Flow Steering.

For each channel, track how many RFS filters are being added during
processing of received packets and scan the corresponding number of
table entries for filters that may be reclaimed.  Do this in batches
to reduce lock overhead.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/sfc/efx.c        |   49 +++++++++++++++++++-
 drivers/net/sfc/efx.h        |   15 ++++++
 drivers/net/sfc/filter.c     |  104 ++++++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/net_driver.h |    3 +
 4 files changed, 169 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index d4e0425..35b7bc5 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -21,6 +21,7 @@
 #include <linux/ethtool.h>
 #include <linux/topology.h>
 #include <linux/gfp.h>
+#include <linux/cpu_rmap.h>
 #include "net_driver.h"
 #include "efx.h"
 #include "nic.h"
@@ -307,6 +308,8 @@ static int efx_poll(struct napi_struct *napi, int budget)
 			channel->irq_mod_score = 0;
 		}
 
+		efx_filter_rfs_expire(channel);
+
 		/* There is no race here; although napi_disable() will
 		 * only wait for napi_complete(), this isn't a problem
 		 * since efx_channel_processed() will have no effect if
@@ -1175,10 +1178,32 @@ static int efx_wanted_channels(void)
 	return count;
 }
 
+static int
+efx_init_rx_cpu_rmap(struct efx_nic *efx, struct msix_entry *xentries)
+{
+#ifdef CONFIG_RFS_ACCEL
+	int i, rc;
+
+	efx->net_dev->rx_cpu_rmap = alloc_irq_cpu_rmap(efx->n_rx_channels);
+	if (!efx->net_dev->rx_cpu_rmap)
+		return -ENOMEM;
+	for (i = 0; i < efx->n_rx_channels; i++) {
+		rc = irq_cpu_rmap_add(efx->net_dev->rx_cpu_rmap,
+				      xentries[i].vector);
+		if (rc) {
+			free_irq_cpu_rmap(efx->net_dev->rx_cpu_rmap);
+			efx->net_dev->rx_cpu_rmap = NULL;
+			return rc;
+		}
+	}
+#endif
+	return 0;
+}
+
 /* Probe the number and type of interrupts we are able to obtain, and
  * the resulting numbers of channels and RX queues.
  */
-static void efx_probe_interrupts(struct efx_nic *efx)
+static int efx_probe_interrupts(struct efx_nic *efx)
 {
 	int max_channels =
 		min_t(int, efx->type->phys_addr_channels, EFX_MAX_CHANNELS);
@@ -1220,6 +1245,11 @@ static void efx_probe_interrupts(struct efx_nic *efx)
 				efx->n_tx_channels = efx->n_channels;
 				efx->n_rx_channels = efx->n_channels;
 			}
+			rc = efx_init_rx_cpu_rmap(efx, xentries);
+			if (rc) {
+				pci_disable_msix(efx->pci_dev);
+				return rc;
+			}
 			for (i = 0; i < n_channels; i++)
 				efx_get_channel(efx, i)->irq =
 					xentries[i].vector;
@@ -1253,6 +1283,8 @@ static void efx_probe_interrupts(struct efx_nic *efx)
 		efx->n_tx_channels = 1;
 		efx->legacy_irq = efx->pci_dev->irq;
 	}
+
+	return 0;
 }
 
 static void efx_remove_interrupts(struct efx_nic *efx)
@@ -1289,7 +1321,9 @@ static int efx_probe_nic(struct efx_nic *efx)
 
 	/* Determine the number of channels and queues by trying to hook
 	 * in MSI-X interrupts. */
-	efx_probe_interrupts(efx);
+	rc = efx_probe_interrupts(efx);
+	if (rc)
+		goto fail;
 
 	if (efx->n_channels > 1)
 		get_random_bytes(&efx->rx_hash_key, sizeof(efx->rx_hash_key));
@@ -1304,6 +1338,10 @@ static int efx_probe_nic(struct efx_nic *efx)
 	efx_init_irq_moderation(efx, tx_irq_mod_usec, rx_irq_mod_usec, true);
 
 	return 0;
+
+fail:
+	efx->type->remove(efx);
+	return rc;
 }
 
 static void efx_remove_nic(struct efx_nic *efx)
@@ -1837,6 +1875,9 @@ static const struct net_device_ops efx_netdev_ops = {
 	.ndo_poll_controller = efx_netpoll,
 #endif
 	.ndo_setup_tc		= efx_setup_tc,
+#ifdef CONFIG_RFS_ACCEL
+	.ndo_rx_flow_steer	= efx_filter_rfs,
+#endif
 };
 
 static void efx_update_name(struct efx_nic *efx)
@@ -2274,6 +2315,10 @@ static void efx_fini_struct(struct efx_nic *efx)
  */
 static void efx_pci_remove_main(struct efx_nic *efx)
 {
+#ifdef CONFIG_RFS_ACCEL
+	free_irq_cpu_rmap(efx->net_dev->rx_cpu_rmap);
+	efx->net_dev->rx_cpu_rmap = NULL;
+#endif
 	efx_nic_fini_interrupt(efx);
 	efx_fini_channels(efx);
 	efx_fini_port(efx);
diff --git a/drivers/net/sfc/efx.h b/drivers/net/sfc/efx.h
index 0cb198a..cbce62b 100644
--- a/drivers/net/sfc/efx.h
+++ b/drivers/net/sfc/efx.h
@@ -76,6 +76,21 @@ extern int efx_filter_remove_filter(struct efx_nic *efx,
 				    struct efx_filter_spec *spec);
 extern void efx_filter_clear_rx(struct efx_nic *efx,
 				enum efx_filter_priority priority);
+#ifdef CONFIG_RFS_ACCEL
+extern int efx_filter_rfs(struct net_device *net_dev, const struct sk_buff *skb,
+			  u16 rxq_index, u32 flow_id);
+extern bool __efx_filter_rfs_expire(struct efx_nic *efx, unsigned quota);
+static inline void efx_filter_rfs_expire(struct efx_channel *channel)
+{
+	if (channel->rfs_filters_added >= 60 &&
+	    __efx_filter_rfs_expire(channel->efx, 100))
+		channel->rfs_filters_added -= 60;
+}
+#define efx_filter_rfs_enabled() 1
+#else
+static inline void efx_filter_rfs_expire(struct efx_channel *channel) {}
+#define efx_filter_rfs_enabled() 0
+#endif
 
 /* Channels */
 extern void efx_process_channel_now(struct efx_channel *channel);
diff --git a/drivers/net/sfc/filter.c b/drivers/net/sfc/filter.c
index 47a1b79..95a980f 100644
--- a/drivers/net/sfc/filter.c
+++ b/drivers/net/sfc/filter.c
@@ -8,6 +8,7 @@
  */
 
 #include <linux/in.h>
+#include <net/ip.h>
 #include "efx.h"
 #include "filter.h"
 #include "io.h"
@@ -51,6 +52,10 @@ struct efx_filter_table {
 struct efx_filter_state {
 	spinlock_t	lock;
 	struct efx_filter_table table[EFX_FILTER_TABLE_COUNT];
+#ifdef CONFIG_RFS_ACCEL
+	u32		*rps_flow_id;
+	unsigned	rps_expire_index;
+#endif
 };
 
 /* The filter hash function is LFSR polynomial x^16 + x^3 + 1 of a 32-bit
@@ -567,6 +572,13 @@ int efx_probe_filters(struct efx_nic *efx)
 	spin_lock_init(&state->lock);
 
 	if (efx_nic_rev(efx) >= EFX_REV_FALCON_B0) {
+#ifdef CONFIG_RFS_ACCEL
+		state->rps_flow_id = kcalloc(FR_BZ_RX_FILTER_TBL0_ROWS,
+					     sizeof(*state->rps_flow_id),
+					     GFP_KERNEL);
+		if (!state->rps_flow_id)
+			goto fail;
+#endif
 		table = &state->table[EFX_FILTER_TABLE_RX_IP];
 		table->id = EFX_FILTER_TABLE_RX_IP;
 		table->offset = FR_BZ_RX_FILTER_TBL0;
@@ -612,5 +624,97 @@ void efx_remove_filters(struct efx_nic *efx)
 		kfree(state->table[table_id].used_bitmap);
 		vfree(state->table[table_id].spec);
 	}
+#ifdef CONFIG_RFS_ACCEL
+	kfree(state->rps_flow_id);
+#endif
 	kfree(state);
 }
+
+#ifdef CONFIG_RFS_ACCEL
+
+int efx_filter_rfs(struct net_device *net_dev, const struct sk_buff *skb,
+		   u16 rxq_index, u32 flow_id)
+{
+	struct efx_nic *efx = netdev_priv(net_dev);
+	struct efx_channel *channel;
+	struct efx_filter_state *state = efx->filter_state;
+	struct efx_filter_spec spec;
+	const struct iphdr *ip;
+	const __be16 *ports;
+	int nhoff;
+	int rc;
+
+	nhoff = skb_network_offset(skb);
+
+	if (skb->protocol != htons(ETH_P_IP))
+		return -EPROTONOSUPPORT;
+
+	/* RFS must validate the IP header length before calling us */
+	EFX_BUG_ON_PARANOID(!pskb_may_pull(skb, nhoff + sizeof(*ip)));
+	ip = (const struct iphdr *)(skb->data + nhoff);
+	if (ip->frag_off & htons(IP_MF | IP_OFFSET))
+		return -EPROTONOSUPPORT;
+	EFX_BUG_ON_PARANOID(!pskb_may_pull(skb, nhoff + 4 * ip->ihl + 4));
+	ports = (const __be16 *)(skb->data + nhoff + 4 * ip->ihl);
+
+	efx_filter_init_rx(&spec, EFX_FILTER_PRI_HINT, 0, rxq_index);
+	rc = efx_filter_set_ipv4_full(&spec, ip->protocol,
+				      ip->daddr, ports[1], ip->saddr, ports[0]);
+	if (rc)
+		return rc;
+
+	rc = efx_filter_insert_filter(efx, &spec, true);
+	if (rc < 0)
+		return rc;
+
+	/* Remember this so we can check whether to expire the filter later */
+	state->rps_flow_id[rc] = flow_id;
+	channel = efx_get_channel(efx, skb_get_rx_queue(skb));
+	++channel->rfs_filters_added;
+
+	netif_info(efx, rx_status, efx->net_dev,
+		   "steering %s %pI4:%u:%pI4:%u to queue %u [flow %u filter %d]\n",
+		   (ip->protocol == IPPROTO_TCP) ? "TCP" : "UDP",
+		   &ip->saddr, ntohs(ports[0]), &ip->daddr, ntohs(ports[1]),
+		   rxq_index, flow_id, rc);
+
+	return rc;
+}
+
+bool __efx_filter_rfs_expire(struct efx_nic *efx, unsigned quota)
+{
+	struct efx_filter_state *state = efx->filter_state;
+	struct efx_filter_table *table = &state->table[EFX_FILTER_TABLE_RX_IP];
+	unsigned mask = table->size - 1;
+	unsigned index;
+	unsigned stop;
+
+	if (!spin_trylock_bh(&state->lock))
+		return false;
+
+	index = state->rps_expire_index;
+	stop = (index + quota) & mask;
+
+	while (index != stop) {
+		if (test_bit(index, table->used_bitmap) &&
+		    table->spec[index].priority == EFX_FILTER_PRI_HINT &&
+		    rps_may_expire_flow(efx->net_dev,
+					table->spec[index].dmaq_id,
+					state->rps_flow_id[index], index)) {
+			netif_info(efx, rx_status, efx->net_dev,
+				   "expiring filter %d [flow %u]\n",
+				   index, state->rps_flow_id[index]);
+			efx_filter_table_clear_entry(efx, table, index);
+		}
+		index = (index + 1) & mask;
+	}
+
+	state->rps_expire_index = stop;
+	if (table->used == 0)
+		efx_filter_table_reset_search_depth(table);
+
+	spin_unlock_bh(&state->lock);
+	return true;
+}
+
+#endif /* CONFIG_RFS_ACCEL */
diff --git a/drivers/net/sfc/net_driver.h b/drivers/net/sfc/net_driver.h
index 96e22ad..15b9068 100644
--- a/drivers/net/sfc/net_driver.h
+++ b/drivers/net/sfc/net_driver.h
@@ -362,6 +362,9 @@ struct efx_channel {
 
 	unsigned int irq_count;
 	unsigned int irq_mod_score;
+#ifdef CONFIG_RFS_ACCEL
+	unsigned int rfs_filters_added;
+#endif
 
 	int rx_alloc_level;
 	int rx_alloc_push_pages;
-- 
1.7.3.4


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH net-next-2.6 2/3] sfc: Limit filter search depth further for performance hints (i.e. RFS)
From: Ben Hutchings @ 2011-02-18 16:03 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1298044839.2570.7.camel@bwh-desktop>

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/sfc/filter.c |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/net/sfc/filter.c b/drivers/net/sfc/filter.c
index d4722c4..47a1b79 100644
--- a/drivers/net/sfc/filter.c
+++ b/drivers/net/sfc/filter.c
@@ -27,6 +27,10 @@
  */
 #define FILTER_CTL_SRCH_MAX 200
 
+/* Don't try very hard to find space for performance hints, as this is
+ * counter-productive. */
+#define FILTER_CTL_SRCH_HINT_MAX 5
+
 enum efx_filter_table_id {
 	EFX_FILTER_TABLE_RX_IP = 0,
 	EFX_FILTER_TABLE_RX_MAC,
@@ -325,15 +329,16 @@ static int efx_filter_search(struct efx_filter_table *table,
 			     struct efx_filter_spec *spec, u32 key,
 			     bool for_insert, int *depth_required)
 {
-	unsigned hash, incr, filter_idx, depth;
+	unsigned hash, incr, filter_idx, depth, depth_max;
 	struct efx_filter_spec *cmp;
 
 	hash = efx_filter_hash(key);
 	incr = efx_filter_increment(key);
+	depth_max = (spec->priority <= EFX_FILTER_PRI_HINT ?
+		     FILTER_CTL_SRCH_HINT_MAX : FILTER_CTL_SRCH_MAX);
 
 	for (depth = 1, filter_idx = hash & (table->size - 1);
-	     depth <= FILTER_CTL_SRCH_MAX &&
-		     test_bit(filter_idx, table->used_bitmap);
+	     depth <= depth_max && test_bit(filter_idx, table->used_bitmap);
 	     ++depth) {
 		cmp = &table->spec[filter_idx];
 		if (efx_filter_equal(spec, cmp))
@@ -342,7 +347,7 @@ static int efx_filter_search(struct efx_filter_table *table,
 	}
 	if (!for_insert)
 		return -ENOENT;
-	if (depth > FILTER_CTL_SRCH_MAX)
+	if (depth > depth_max)
 		return -EBUSY;
 found:
 	*depth_required = depth;
-- 
1.7.3.4



-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH net-next-2.6 1/3] net: RPS: Make hardware-accelerated RFS conditional on NETIF_F_NTUPLE
From: Ben Hutchings @ 2011-02-18 16:03 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1298044839.2570.7.camel@bwh-desktop>

For testing and debugging purposes it is useful to be able to disable
hardware acceleration of RFS without disabling RFS altogether.  Since
this is a similar feature to 'n-tuple' flow steering through the
ethtool API, test the same feature flag that controls that.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 net/core/dev.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 30c71f9..54aaca6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2607,7 +2607,8 @@ set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
 		int rc;
 
 		/* Should we steer this flow to a different hardware queue? */
-		if (!skb_rx_queue_recorded(skb) || !dev->rx_cpu_rmap)
+		if (!skb_rx_queue_recorded(skb) || !dev->rx_cpu_rmap ||
+		    !(dev->features & NETIF_F_NTUPLE))
 			goto out;
 		rxq_index = cpu_rmap_lookup_index(dev->rx_cpu_rmap, next_cpu);
 		if (rxq_index == skb_get_rx_queue(skb))
-- 
1.7.3.4



-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* pull request: sfc-next-2.6 2011-02-18
From: Ben Hutchings @ 2011-02-18 16:00 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, sf-linux-drivers, Tom Herbert

The following changes since commit f878b995b0f746f5726af9e66940f3bf373dae91:

  Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next-2.6 (2011-02-15 12:25:19 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next-2.6.git for-davem

Add support for RFS acceleration in the sfc driver, and allow it to be
enabled and disabled dynamically using ethtool.

Ben.

Ben Hutchings (3):
      net: RPS: Make hardware-accelerated RFS conditional on NETIF_F_NTUPLE
      sfc: Limit filter search depth further for performance hints (i.e. RFS)
      sfc: Implement hardware acceleration of RFS

 drivers/net/sfc/efx.c        |   49 +++++++++++++++++-
 drivers/net/sfc/efx.h        |   15 +++++
 drivers/net/sfc/filter.c     |  117 ++++++++++++++++++++++++++++++++++++++++--
 drivers/net/sfc/net_driver.h |    3 +
 net/core/dev.c               |    3 +-
 5 files changed, 180 insertions(+), 7 deletions(-)

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH ethtool] ethtool: Warn if we cannot set the advertising mask for specified speed/duplex
From: Ben Hutchings @ 2011-02-18 15:56 UTC (permalink / raw)
  To: netdev; +Cc: linux-net-drivers, Naveen Chouksey
In-Reply-To: <1297969576.2637.33.camel@bwh-desktop>

On Thu, 2011-02-17 at 19:06 +0000, Ben Hutchings wrote:
> When autonegotiation is enabled, drivers must determine link speed and
> duplex through the autonegotiation process and will generally ignore
> the speed and duplex specified in struct ethtool_cmd.  If the user
> specifies a recognised combination of speed and duplex, we set the
> advertising mask to include only the matching mode.  Otherwise, we
> advertise all supported modes.  We should really limit the advertised
> modes separately by speed and duplex, but for now we just warn.
> 
> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
> ---
> This is a longstanding bug in ethtool which people keep getting bitten
> by.  Naveen was just the latest to report it.  I have been working on
> some changes to improve link advertising and reporting, but until those
> are ready I'm adding this warning.

That patch fixes an additional bug: we will now update the advertising
mask based on speed/duplex when autoneg is already on.  Unfortunately it
introduces a bug: we will update advertising even if none of autoneg,
speed or duplex are specified!

Here's a second version which I think will do the right thing in all
cases, though I haven't tested it yet.  I'll include this in ethtool
2.6.38 if no-one can spot a flaw.

Ben.

---
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 17 Feb 2011 18:51:15 +0000
Subject: [PATCH ethtool] ethtool: Don't silently ignore speed/duplex when autoneg is on

When autonegotiation is enabled, drivers must determine link speed and
duplex through the autonegotiation process and will generally ignore
the speed and duplex specified in struct ethtool_cmd.  Currently, if
the user specifies autoneg on but does not specify the advertising
mask then:

- If the user specifies a recognised combination of speed and duplex,
  we set the advertising mask to the flag for that mode.  (Currently
  only one mode is recognised per combination of speed and duplex.)
- Otherwise, we advertise all recognised and supported modes.

But we should also set the advertising mask if autoneg is *already*
on.  Also, we should be able to limit the advertised modes separately
by speed and duplex.  For now, we just warn if we fail to do that.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 ethtool.c |   45 +++++++++++++++++++++++++++++++--------------
 1 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index 1afdfe4..2f4b6ee 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -1126,7 +1126,7 @@ static void parse_cmdline(int argc, char **argp)
 		}
 	}
 
-	if ((autoneg_wanted == AUTONEG_ENABLE) && (advertising_wanted < 0)) {
+	if (advertising_wanted < 0) {
 		if (speed_wanted == SPEED_10 && duplex_wanted == DUPLEX_HALF)
 			advertising_wanted = ADVERTISED_10baseT_Half;
 		else if (speed_wanted == SPEED_10 &&
@@ -2528,19 +2528,36 @@ static int do_sset(int fd, struct ifreq *ifr)
 				ecmd.phy_address = phyad_wanted;
 			if (xcvr_wanted != -1)
 				ecmd.transceiver = xcvr_wanted;
-			if (advertising_wanted != -1) {
-				if (advertising_wanted == 0)
-					ecmd.advertising = ecmd.supported &
-						(ADVERTISED_10baseT_Half |
-						 ADVERTISED_10baseT_Full |
-						 ADVERTISED_100baseT_Half |
-						 ADVERTISED_100baseT_Full |
-						 ADVERTISED_1000baseT_Half |
-						 ADVERTISED_1000baseT_Full |
-						 ADVERTISED_2500baseX_Full |
-						 ADVERTISED_10000baseT_Full);
-				else
-					ecmd.advertising = advertising_wanted;
+			/* XXX If the user specified speed or duplex
+			 * then we should mask the advertised modes
+			 * accordingly.  For now, warn that we aren't
+			 * doing that.
+			 */
+			if ((speed_wanted != -1 || duplex_wanted != -1) &&
+			    ecmd.autoneg && advertising_wanted == 0) {
+				fprintf(stderr, "Cannot advertise");
+				if (speed_wanted >= 0)
+					fprintf(stderr, " speed %d",
+						speed_wanted);
+				if (duplex_wanted >= 0)
+					fprintf(stderr, " duplex %s",
+						duplex_wanted ? 
+						"full" : "half");
+				fprintf(stderr,	"\n");
+			}
+			if (autoneg_wanted == AUTONEG_ENABLE &&
+			    advertising_wanted == 0) {
+				ecmd.advertising = ecmd.supported &
+					(ADVERTISED_10baseT_Half |
+					 ADVERTISED_10baseT_Full |
+					 ADVERTISED_100baseT_Half |
+					 ADVERTISED_100baseT_Full |
+					 ADVERTISED_1000baseT_Half |
+					 ADVERTISED_1000baseT_Full |
+					 ADVERTISED_2500baseX_Full |
+					 ADVERTISED_10000baseT_Full);
+			} else if (advertising_wanted != -1) {
+				ecmd.advertising = advertising_wanted;
 			}
 
 			/* Try to perform the update. */
-- 
1.7.3.4



-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* Re: [patch net-next-2.6] net: convert bonding to use rx_handler
From: Patrick McHardy @ 2011-02-18 15:50 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Eric Dumazet, netdev, davem, shemminger, fubar, nicolas.2p.debian,
	andy
In-Reply-To: <20110218145850.GF2939@psychotron.redhat.com>

On 18.02.2011 15:58, Jiri Pirko wrote:
> Fri, Feb 18, 2011 at 03:46:45PM CET, kaber@trash.net wrote:
>> Am 18.02.2011 15:27, schrieb Eric Dumazet:
>>> Le vendredi 18 février 2011 à 15:14 +0100, Jiri Pirko a écrit :
>>>
>>>> Do not know how to do it better. As for percpu variable, not only
>>>> origdev would have to be remembered but also probably skb pointer to
>>>> know if it's the first run on the skb or not. Can't really figure out a
>>>> better solution. Can you?
>>>
>>> I'll try and let you know.
>>
>> Why not simply do a lookup on skb->iif?
> 
> Well I was trying to avoid iterating over list of devices for each
> incoming frame.
> 

Well, there are a couple of holes on 64 bit, perhaps you can rearrange
things and eliminate either iif or input_dev without increasing size
since they appear to be redundant.

^ permalink raw reply

* Re: [PATCH v2 09/13] can: pruss CAN driver.
From: Arnd Bergmann @ 2011-02-18 15:07 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Subhasish Ghosh, davinci-linux-open-source, sachi, nsekhar,
	open list, open list:CAN NETWORK DRIVERS,
	open list:CAN NETWORK DRIVERS, m-watkins, Wolfgang Grandegger
In-Reply-To: <1297435892-28278-10-git-send-email-subhasish@mistralsolutions.com>

On Friday 11 February 2011, Subhasish Ghosh wrote:
> This patch adds support for the CAN device emulated on PRUSS.

Hi Subhasish,

This is a detailed walk through the can driver. The pruss_can.c
file mostly looks good, there are very tiny changes that I'm
suggesting to improve the code. I assume that you wrote that file.

The pruss_can_api.c is a bit of a mess and looks like it was copied
from some other code base and just barely changed to follow Linux
coding style. I can tell from the main driver file that you can do
better than that.

My recommendation for that would be to throw it away and reimplement
the few parts that you actually need, in proper coding style.
You can also try to fix the file according to the comments I give
you below, but I assume that would be signficantly more work.

Moving everything into one file also makes things easier to read
here and lets you identifer more quickly what is unused.

	Arnd

> +#ifdef __CAN_DEBUG
> +#define __can_debug(fmt, args...) printk(KERN_DEBUG "can_debug: " fmt, ## args)
> +#else
> +#define __can_debug(fmt, args...)
> +#endif
> +#define __can_err(fmt, args...) printk(KERN_ERR "can_err: " fmt, ## args)

Better use the existing dev_dbg() and dev_err() macros that provide the
same functionality in a more standard way. You already use them in
some places, as I noticed.

If you don't have a way to pass a meaningful device, you can use
pr_debug/pr_err.

> +void omapl_pru_can_rx_wQ(struct work_struct *work)
> +{

This is only used in the same file, so better make it static.

> +	if (-1 == pru_can_get_intr_status(priv->dev, &priv->can_rx_hndl))
> +		return;

Don't make up your own return values, just use the standard error codes,
e.g. -EIO or -EAGAIN, whatever fits.

The more common way to write the comparison would be the other way round,

	if (pru_can_get_intr_status(priv->dev, &priv->can_rx_hndl) == -EAGAIN)
		return;

or, simpler

	if (pru_can_get_intr_status(priv->dev, &priv->can_rx_hndl))
		return;

> +irqreturn_t omapl_tx_can_intr(int irq, void *dev_id)
> +{

This also should be static

> +		for (bit_set = 0; ((priv->can_tx_hndl.u32interruptstatus & 0xFF)
> +						>> bit_set != 0); bit_set++)
> +		;

	bit_set = fls(priv->can_tx_hndl.u32interruptstatus & 0xFF); ?

> +irqreturn_t omapl_rx_can_intr(int irq, void *dev_id)

static

> diff --git a/drivers/net/can/da8xx_pruss/pruss_can_api.c b/drivers/net/can/da8xx_pruss/pruss_can_api.c
> new file mode 100644
> index 0000000..2f7438a
> --- /dev/null
> +++ b/drivers/net/can/da8xx_pruss/pruss_can_api.c

A lot of code in this file seems to be unused. Is that right?
I would suggest adding only the code that is actually being
used. If you add more functionality later, you can always
add back the low-level functions, but dead code usually
turns into broken code quickly.

> +static can_emu_drv_inst gstr_can_inst[ecanmaxinst];

This is global data and probably needs some for of locking

> +/*
> + * pru_can_set_brp()	Updates the  BRP register of PRU0
> + * and PRU1 of OMAP L138. This API will be called by the
> + * Application to updtae the BRP register of PRU0 and PRU1
> + *
> + * param	u16bitrateprescaler		The can bus bitrate
> + * prescaler value be set
> + *
> + * return   SUCCESS or FAILURE
> + */
> +s16 pru_can_set_brp(struct device *dev, u16 u16bitrateprescaler)
> +{

unused.

> +	u32 u32offset;
> +
> +	if (u16bitrateprescaler > 255) {
> +		return -1;
> +	}

non-standard error code. It also doesn't match the comment, which
claims it is SUCCESS or FAILURE, both of which are (rightfully)
not defined.

> +	u32offset = (PRU_CAN_RX_CLOCK_BRP_REGISTER);
> +	pruss_writel(dev, u32offset, (u32 *) &u16bitrateprescaler, 1);
> +
> +	u32offset = (PRU_CAN_TX_CLOCK_BRP_REGISTER);
> +	pruss_writel(dev, u32offset, (u32 *) &u16bitrateprescaler, 1);

You pass a 32 bit pointer to a 16 bit local variable here.
This has an undefined effect, and if you build this code on
a big-endian platform, it cannot possibly do anything good.

pruss_writel() is defined in a funny way if it takes a thirty-two bit
input argument by reference, rather than by value. What is going
on there?

> +s16 pru_can_set_bit_timing(struct device *dev,
> +		can_bit_timing_consts *pstrbittiming)

unused.

> +	u32 u32offset;
> +	u32 u32serregister;

It's a bit silly to put the name of the type into the name
of a variable. You already spell it out in the definition.

> +s16 pru_can_write_data_to_mailbox(struct device *dev,
> +			can_emu_app_hndl *pstremuapphndl)
> +{
> +	s16 s16subrtnretval;
> +	u32 u32offset;
> +
> +	if (pstremuapphndl == NULL) {
> +		return -1;
> +	}

nonstandard error code. Also, why the heck is type function
return type s16 when the only possible return values are 0
and -1? Just make this an int.

> +	switch ((u8) pstremuapphndl->ecanmailboxnumber) {
> +	case 0:
> +		u32offset = (PRU_CAN_TX_MAILBOX0);
> +		break;
> +	case 1:
> +		u32offset = (PRU_CAN_TX_MAILBOX1);
> +		break;
> +	case 2:
> +		u32offset = (PRU_CAN_TX_MAILBOX2);
> +		break;
> +	case 3:
> +		u32offset = (PRU_CAN_TX_MAILBOX3);
> +		break;
> +	case 4:
> +		u32offset = (PRU_CAN_TX_MAILBOX4);
> +		break;
> +	case 5:
> +		u32offset = (PRU_CAN_TX_MAILBOX5);
> +		break;
> +	case 6:
> +		u32offset = (PRU_CAN_TX_MAILBOX6);
> +		break;
> +	case 7:
> +		u32offset = (PRU_CAN_TX_MAILBOX7);
> +		break;
> +	default:
> +		return -1;
> +	}

Lovely switch statement. I'm sure you find a better way to express this ;-)

> +	s16subrtnretval = pruss_writel(dev, u32offset,
> +		(u32 *) &(pstremuapphndl->strcanmailbox), 4);
> +	if (s16subrtnretval == -1) {
> +		return -1;
> +	}
> +	return 0;
> +}

return pruss_writel(...) ?

> +
> +/*
> + * pru_can_get_intr_status()
> + * Gets the interrupts status register value.
> + * This API will be called by the Application
> + * to get the interrupts status register value
> + *
> + * param  u8prunumber	PRU number for which IntStatusReg
> + * has to be read
> + *
> + * return   SUCCESS or FAILURE
> + */
> +s16 pru_can_get_intr_status(struct device *dev,
> +		can_emu_app_hndl *pstremuapphndl)
> +{
> +	u32 u32offset;
> +	s16 s16subrtnretval = -1;
> +
> +	if (pstremuapphndl == NULL) {
> +		return -1;
> +	}

In every function you check that pstremuapphndl is present. This seems
rather pointless. How about just making sure you never pass a NULL
value to these functions? That should not be hard at all from the
high-level driver.

> +	if (pstremuapphndl->u8prunumber == DA8XX_PRUCORE_1) {
> +		u32offset = (PRU_CAN_TX_INTERRUPT_STATUS_REGISTER);
> +	} else if (pstremuapphndl->u8prunumber == DA8XX_PRUCORE_0) {
> +		u32offset = (PRU_CAN_RX_INTERRUPT_STATUS_REGISTER);
> +	} else {
> +		return -1;
> +	}
> +
> +	s16subrtnretval = pruss_readl(dev, u32offset,
> +		(u32 *) &pstremuapphndl->u32interruptstatus, 1);
> +	if (s16subrtnretval == -1) {
> +		return -1;
> +	}

You can also get rid of all these {} braces around one-line statements.

> +/*
> + * pru_can_get_mailbox_status()		Gets the mailbox status
> + * register value. This API will be called by the Application
> + * to get the mailbox status register value
> + *
> + * return   SUCCESS or FAILURE
> + */
> +s16 pru_can_get_mailbox_status(struct device *dev,
> +		can_emu_app_hndl *pstremuapphndl)

unused.

> +/*
> + * pru_can_config_mode_set()		Sets the timing value
> + * for data transfer. This API will be called by the Application
> + * to set timing valus for data transfer
> + *
> + * return   SUCCESS or FAILURE
> + */
> +s16 pru_can_config_mode_set(struct device *dev, bool bconfigmodeflag)

unused.

> +	u32offset = (PRU_CAN_TX_GLOBAL_CONTROL_REGISTER & 0xFFFF);
> +	u32value = 0x00000000;
> +	s16subrtnretval = pruss_writel(dev, u32offset, (u32 *) &u32value, 1);
> +	if (s16subrtnretval == -1) {
> +		return -1;
> +	}
> +
> +	u32offset = (PRU_CAN_TX_GLOBAL_STATUS_REGISTER & 0xFFFF);
> +	u32value = 0x00000040;
> +	s16subrtnretval = pruss_writel(dev, u32offset, (u32 *) &u32value, 1);
> +	if (s16subrtnretval == -1) {
> +		return -1;
> +	}
> +	u32offset = (PRU_CAN_RX_GLOBAL_STATUS_REGISTER & 0xFFFF);
> +	u32value = 0x00000040;
> +	s16subrtnretval = pruss_writel(dev, u32offset, (u32 *) &u32value, 1);
> +	if (s16subrtnretval == -1) {
> +		return -1;
> +	}

<skipping 50 (!) more of these>

After the third time of writing the same code, you should have noticed that
there is some duplication involved that can trivially be reduced. A good
way to express the same would be a table with the contents:

static struct pru_can_register_init {
	u16 offset;
	u32 value;
} = {
	{ PRU_CAN_TX_GLOBAL_CONTROL_REGISTER, 0, },
	{ PRU_CAN_TX_GLOBAL_STATUS_REGISTER, 0x40, },
	{ PRU_CAN_RX_GLOBAL_STATUS_REGISTER, 0x40, },
	...
};


> +
> +
> +/*
> + * pru_can_emu_open()		Opens the can emu for
> + * application to use. This API will be called by the Application
> + * to Open the can emu for application to use.
> + *
> + * param	pstremuapphndl	Pointer to application handler
> + * structure
> + *
> + * return   SUCCESS or FAILURE
> + */
> +s16 pru_can_emu_open(struct device *dev, can_emu_app_hndl *pstremuapphndl)

unused.

> +/*
> + * brief    pru_can_emu_close()	Closes the can emu for other
> + * applications to use. This API will be called by the Application to Close
> + * the can emu for other applications to use
> + *
> + * param	pstremuapphndl	Pointer to application handler structure
> + *
> + * return   SUCCESS or FAILURE
> + */
> +s16 pru_can_emu_close(struct device *dev, can_emu_app_hndl *pstremuapphndl)

unused

> +s16 pru_can_emu_sreset(struct device *dev)
> +{
> +	return 0;
> +}

pointless.

> +s16 pru_can_tx(struct device *dev, u8 u8mailboxnumber, u8 u8prunumber)
> +{
> +	u32 u32offset = 0;
> +	u32 u32value = 0;
> +	s16 s16subrtnretval = -1;
> +
> +	if (DA8XX_PRUCORE_1 == u8prunumber) {
> +		switch (u8mailboxnumber) {
> +		case 0:
> +			u32offset = (PRU_CAN_TX_MAILBOX0_STATUS_REGISTER & 0xFFFF);
> +			u32value = 0x00000080;
> +			s16subrtnretval = pruss_writel(dev, u32offset,
> +					(u32 *) &u32value, 1);
> +			if (s16subrtnretval == -1) {
> +				return -1;
> +			}
> +			break;
> +		case 1:
> +			u32offset = (PRU_CAN_TX_MAILBOX1_STATUS_REGISTER & 0xFFFF);
> +			u32value = 0x00000080;
> +			s16subrtnretval = pruss_writel(dev, u32offset,
> +						(u32 *) &u32value, 1);
> +			if (s16subrtnretval == -1) {
> +				return -1;
> +			}
> +			break;
> +		case 2:
> +			u32offset = (PRU_CAN_TX_MAILBOX2_STATUS_REGISTER & 0xFFFF);

Another pointless switch statement.

> +s16 pru_can_mask_ints(struct device *dev, u32 int_mask)
> +{
> +	return 0;
> +}
> +
> +int pru_can_get_error_cnt(struct device *dev, u8 u8prunumber)
> +{
> +	return 0;
> +}

useless.

> +int pru_can_get_intc_status(struct device *dev)
> +{
> +	u32 u32offset = 0;
> +	u32 u32getvalue = 0;
> +	u32 u32clrvalue = 0;
> +
> +	u32offset = (PRUSS_INTC_STATCLRINT1 & 0xFFFF);
> +	pruss_readl(dev, u32offset, (u32 *) &u32getvalue, 1);
> +
> +	if (u32getvalue & 4)
> +		u32clrvalue = 34;	/* CLR Event 34 */
> +
> +	if (u32getvalue & 2)
> +		u32clrvalue = 33;	/* CLR Event 33  */
> +
> +	if (u32clrvalue) {
> +		u32offset = (PRUSS_INTC_STATIDXCLR & 0xFFFF);
> +		pruss_writel(dev, u32offset, &u32clrvalue, 1);
> +	} else
> +		return -1;

Could the controller signal both event 34 and 33 simultaneously?
The only user of this function looks at the individual bits
of the return value again, which looks wrong for all possible
return values here.

> +#ifndef _PRU_CAN_API_H_
> +#define CAN_BIT_TIMINGS			(0x273)
> +
> +/* Timer Clock is sourced from DDR freq (PLL1 SYS CLK 2) */
> +#define	TIMER_CLK_FREQ			132000000
> +
> +#define TIMER_SETUP_DELAY		14
> +#define GPIO_SETUP_DELAY		150
> +
> +#define CAN_RX_PRU_0			PRUSS_NUM0
> +#define CAN_TX_PRU_1			PRUSS_NUM1
> +
> +/* Number of Instruction in the Delay loop */
> +#define DELAY_LOOP_LENGTH		2
> +
> +#define PRU1_BASE_ADDR			0x2000
> +
> +#define PRU_CAN_TX_GLOBAL_CONTROL_REGISTER		(PRU1_BASE_ADDR)
> +#define PRU_CAN_TX_GLOBAL_STATUS_REGISTER		(PRU1_BASE_ADDR	+ 0x04)
> +#define PRU_CAN_TX_INTERRUPT_MASK_REGISTER		(PRU1_BASE_ADDR	+ 0x08)
> +#define PRU_CAN_TX_INTERRUPT_STATUS_REGISTER		(PRU1_BASE_ADDR	+ 0x0C)
> +#define PRU_CAN_TX_MAILBOX0_STATUS_REGISTER		(PRU1_BASE_ADDR	+ 0x10)
> +#define PRU_CAN_TX_MAILBOX1_STATUS_REGISTER		(PRU1_BASE_ADDR	+ 0x14)

The header file should be used for interfaces between the two .c files,
don't mix that with hardware specific definitions. Sometimes you may want
to have register number lists in a header, if that list is going to be
used in multiple places. In this case, there is just one user, so better
move all those definitions over there.

> +typedef enum {
> +	ecaninst0 = 0,
> +	ecaninst1,
> +	ecanmaxinst
> +} can_instance_enum;
> +
> +typedef enum {
> +	ecanmailbox0 = 0,
> +	ecanmailbox1,
> +	ecanmailbox2,
> +	ecanmailbox3,
> +	ecanmailbox4,
> +	ecanmailbox5,
> +	ecanmailbox6,
> +	ecanmailbox7
> +} can_mailbox_number;
> +
> +typedef enum {
> +	ecandirectioninit = 0,
> +	ecantransmit,
> +	ecanreceive
> +} can_transfer_direction;

The values are all unused, you only use the typedefs.
IMHO it would be more sensible to just pass these as unsigned int
or u32 values, but if you prefer, there is no reason to just do 

typedef u32 can_mailbox_number;

etc.

> +typedef struct {
> +	u16 u16extendedidentifier;
> +	u16 u16baseidentifier;
> +	u8 u8data7;
> +	u8 u8data6;
> +	u8 u8data5;
> +	u8 u8data4;
> +	u8 u8data3;
> +	u8 u8data2;
> +	u8 u8data1;
> +	u8 u8data0;
> +	u16 u16datalength;
> +	u16 u16crc;
> +} can_mail_box_structure;

Please don't use typedef for complex data structures, and learn about
better naming of identifiers. I would suggest writing this as

struct pru_can_mailbox {
	u16	ext_id;
	u16	base_id;
	u8	data[8]; /* note: reverse order */
	u16	len;
	u16	crc;
};

Sames rules apply to the other structures.

	Arnd

^ permalink raw reply

* Re: [patch net-next-2.6] net: convert bonding to use rx_handler
From: Jiri Pirko @ 2011-02-18 14:58 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Eric Dumazet, netdev, davem, shemminger, fubar, nicolas.2p.debian,
	andy
In-Reply-To: <4D5E8655.5070304@trash.net>

Fri, Feb 18, 2011 at 03:46:45PM CET, kaber@trash.net wrote:
>Am 18.02.2011 15:27, schrieb Eric Dumazet:
>> Le vendredi 18 février 2011 à 15:14 +0100, Jiri Pirko a écrit :
>> 
>>> Do not know how to do it better. As for percpu variable, not only
>>> origdev would have to be remembered but also probably skb pointer to
>>> know if it's the first run on the skb or not. Can't really figure out a
>>> better solution. Can you?
>> 
>> I'll try and let you know.
>
>Why not simply do a lookup on skb->iif?

Well I was trying to avoid iterating over list of devices for each
incoming frame.


^ permalink raw reply

* Re: [patch net-next-2.6] net: convert bonding to use rx_handler
From: Patrick McHardy @ 2011-02-18 14:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jiri Pirko, netdev, davem, shemminger, fubar, nicolas.2p.debian,
	andy
In-Reply-To: <1298039252.6201.66.camel@edumazet-laptop>

Am 18.02.2011 15:27, schrieb Eric Dumazet:
> Le vendredi 18 février 2011 à 15:14 +0100, Jiri Pirko a écrit :
> 
>> Do not know how to do it better. As for percpu variable, not only
>> origdev would have to be remembered but also probably skb pointer to
>> know if it's the first run on the skb or not. Can't really figure out a
>> better solution. Can you?
> 
> I'll try and let you know.

Why not simply do a lookup on skb->iif?

^ permalink raw reply

* Re: [PATCH v6 0/9] net: Unified offload configuration
From: Ben Hutchings @ 2011-02-18 14:29 UTC (permalink / raw)
  To: Michał Mirosław; +Cc: David Miller, netdev
In-Reply-To: <20110218142238.GA18099@rere.qmqm.pl>

On Fri, 2011-02-18 at 15:22 +0100, Michał Mirosław wrote:
> On Thu, Feb 17, 2011 at 02:56:11PM -0800, David Miller wrote:
[...]
> > Please get rid of that annoying message spit out by netif_features_change(),
> > it's just spam.  If we want notifications for stuff like this, use a
> > non-unicast netlink message so those who want to hear it can do so.
> 
> You mean netdev_update_features() "Features changed" message? Is it ok
> to just demote it to DEBUG level or you want to remove it altogether?
> What about netdev_fix_features() messages?

I think you need to emit these messages at 'error' severity when fixing
up features for a newly-added device, but at 'debug' later on.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [patch net-next-2.6] net: convert bonding to use rx_handler
From: Eric Dumazet @ 2011-02-18 14:27 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, shemminger, kaber, fubar, nicolas.2p.debian, andy
In-Reply-To: <20110218141456.GD2939@psychotron.redhat.com>

Le vendredi 18 février 2011 à 15:14 +0100, Jiri Pirko a écrit :

> Do not know how to do it better. As for percpu variable, not only
> origdev would have to be remembered but also probably skb pointer to
> know if it's the first run on the skb or not. Can't really figure out a
> better solution. Can you?

I'll try and let you know.

Thanks



^ permalink raw reply

* [patch 1/1] [PATCH] qeth: remove needless IPA-commands in offline
From: frank.blaschka @ 2011-02-18 14:22 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, Ursula Braun
In-Reply-To: <20110218142258.791988211@de.ibm.com>

[-- Attachment #1: 602-needless-ipa-commands.diff --]
[-- Type: text/plain, Size: 12373 bytes --]

From: Ursula Braun <ursula.braun@de.ibm.com>

If a qeth device is set offline, data and control subchannels are
cleared, which means removal of all IP Assist Primitive settings
implicitly. There is no need to delete those settings explicitly.
This patch removes all IP Assist invocations from offline.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
---


 drivers/s390/net/qeth_core.h      |    1 
 drivers/s390/net/qeth_core_main.c |   52 ++++++++++-------------------------
 drivers/s390/net/qeth_l2_main.c   |   45 +++++++++----------------------
 drivers/s390/net/qeth_l3_main.c   |   55 ++------------------------------------
 4 files changed, 33 insertions(+), 120 deletions(-)
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -741,7 +741,6 @@ struct qeth_card {
 	/* QDIO buffer handling */
 	struct qeth_qdio_info qdio;
 	struct qeth_perf_stats perf_stats;
-	int use_hard_stop;
 	int read_or_write_problem;
 	struct qeth_osn_info osn_info;
 	struct qeth_discipline discipline;
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -302,12 +302,15 @@ static void qeth_issue_ipa_msg(struct qe
 	int com = cmd->hdr.command;
 	ipa_name = qeth_get_ipa_cmd_name(com);
 	if (rc)
-		QETH_DBF_MESSAGE(2, "IPA: %s(x%X) for %s returned x%X \"%s\"\n",
-				ipa_name, com, QETH_CARD_IFNAME(card),
-					rc, qeth_get_ipa_msg(rc));
+		QETH_DBF_MESSAGE(2, "IPA: %s(x%X) for %s/%s returned "
+				"x%X \"%s\"\n",
+				ipa_name, com, dev_name(&card->gdev->dev),
+				QETH_CARD_IFNAME(card), rc,
+				qeth_get_ipa_msg(rc));
 	else
-		QETH_DBF_MESSAGE(5, "IPA: %s(x%X) for %s succeeded\n",
-				ipa_name, com, QETH_CARD_IFNAME(card));
+		QETH_DBF_MESSAGE(5, "IPA: %s(x%X) for %s/%s succeeded\n",
+				ipa_name, com, dev_name(&card->gdev->dev),
+				QETH_CARD_IFNAME(card));
 }
 
 static struct qeth_ipa_cmd *qeth_check_ipa_data(struct qeth_card *card,
@@ -1083,7 +1086,6 @@ static int qeth_setup_card(struct qeth_c
 	card->data.state  = CH_STATE_DOWN;
 	card->state = CARD_STATE_DOWN;
 	card->lan_online = 0;
-	card->use_hard_stop = 0;
 	card->read_or_write_problem = 0;
 	card->dev = NULL;
 	spin_lock_init(&card->vlanlock);
@@ -1732,20 +1734,22 @@ int qeth_send_control_data(struct qeth_c
 		};
 	}
 
+	if (reply->rc == -EIO)
+		goto error;
 	rc = reply->rc;
 	qeth_put_reply(reply);
 	return rc;
 
 time_err:
+	reply->rc = -ETIME;
 	spin_lock_irqsave(&reply->card->lock, flags);
 	list_del_init(&reply->list);
 	spin_unlock_irqrestore(&reply->card->lock, flags);
-	reply->rc = -ETIME;
 	atomic_inc(&reply->received);
+error:
 	atomic_set(&card->write.irq_pending, 0);
 	qeth_release_buffer(iob->channel, iob);
 	card->write.buf_no = (card->write.buf_no + 1) % QETH_CMD_BUFFER_NO;
-	wake_up(&reply->wait_q);
 	rc = reply->rc;
 	qeth_put_reply(reply);
 	return rc;
@@ -2490,45 +2494,19 @@ int qeth_send_ipa_cmd(struct qeth_card *
 }
 EXPORT_SYMBOL_GPL(qeth_send_ipa_cmd);
 
-static int qeth_send_startstoplan(struct qeth_card *card,
-		enum qeth_ipa_cmds ipacmd, enum qeth_prot_versions prot)
-{
-	int rc;
-	struct qeth_cmd_buffer *iob;
-
-	iob = qeth_get_ipacmd_buffer(card, ipacmd, prot);
-	rc = qeth_send_ipa_cmd(card, iob, NULL, NULL);
-
-	return rc;
-}
-
 int qeth_send_startlan(struct qeth_card *card)
 {
 	int rc;
+	struct qeth_cmd_buffer *iob;
 
 	QETH_DBF_TEXT(SETUP, 2, "strtlan");
 
-	rc = qeth_send_startstoplan(card, IPA_CMD_STARTLAN, 0);
+	iob = qeth_get_ipacmd_buffer(card, IPA_CMD_STARTLAN, 0);
+	rc = qeth_send_ipa_cmd(card, iob, NULL, NULL);
 	return rc;
 }
 EXPORT_SYMBOL_GPL(qeth_send_startlan);
 
-int qeth_send_stoplan(struct qeth_card *card)
-{
-	int rc = 0;
-
-	/*
-	 * TODO: according to the IPA format document page 14,
-	 * TCP/IP (we!) never issue a STOPLAN
-	 * is this right ?!?
-	 */
-	QETH_DBF_TEXT(SETUP, 2, "stoplan");
-
-	rc = qeth_send_startstoplan(card, IPA_CMD_STOPLAN, 0);
-	return rc;
-}
-EXPORT_SYMBOL_GPL(qeth_send_stoplan);
-
 int qeth_default_setadapterparms_cb(struct qeth_card *card,
 		struct qeth_reply *reply, unsigned long data)
 {
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -202,17 +202,19 @@ static void qeth_l2_add_mc(struct qeth_c
 		kfree(mc);
 }
 
-static void qeth_l2_del_all_mc(struct qeth_card *card)
+static void qeth_l2_del_all_mc(struct qeth_card *card, int del)
 {
 	struct qeth_mc_mac *mc, *tmp;
 
 	spin_lock_bh(&card->mclock);
 	list_for_each_entry_safe(mc, tmp, &card->mc_list, list) {
-		if (mc->is_vmac)
-			qeth_l2_send_setdelmac(card, mc->mc_addr,
+		if (del) {
+			if (mc->is_vmac)
+				qeth_l2_send_setdelmac(card, mc->mc_addr,
 					IPA_CMD_DELVMAC, NULL);
-		else
-			qeth_l2_send_delgroupmac(card, mc->mc_addr);
+			else
+				qeth_l2_send_delgroupmac(card, mc->mc_addr);
+		}
 		list_del(&mc->list);
 		kfree(mc);
 	}
@@ -288,18 +290,13 @@ static int qeth_l2_send_setdelvlan(struc
 				 qeth_l2_send_setdelvlan_cb, NULL);
 }
 
-static void qeth_l2_process_vlans(struct qeth_card *card, int clear)
+static void qeth_l2_process_vlans(struct qeth_card *card)
 {
 	struct qeth_vlan_vid *id;
 	QETH_CARD_TEXT(card, 3, "L2prcvln");
 	spin_lock_bh(&card->vlanlock);
 	list_for_each_entry(id, &card->vid_list, list) {
-		if (clear)
-			qeth_l2_send_setdelvlan(card, id->vid,
-				IPA_CMD_DELVLAN);
-		else
-			qeth_l2_send_setdelvlan(card, id->vid,
-				IPA_CMD_SETVLAN);
+		qeth_l2_send_setdelvlan(card, id->vid, IPA_CMD_SETVLAN);
 	}
 	spin_unlock_bh(&card->vlanlock);
 }
@@ -379,19 +376,11 @@ static int qeth_l2_stop_card(struct qeth
 			dev_close(card->dev);
 			rtnl_unlock();
 		}
-		if (!card->use_hard_stop ||
-			recovery_mode) {
-			__u8 *mac = &card->dev->dev_addr[0];
-			rc = qeth_l2_send_delmac(card, mac);
-			QETH_DBF_TEXT_(SETUP, 2, "Lerr%d", rc);
-		}
+		card->info.mac_bits &= ~QETH_LAYER2_MAC_REGISTERED;
 		card->state = CARD_STATE_SOFTSETUP;
 	}
 	if (card->state == CARD_STATE_SOFTSETUP) {
-		qeth_l2_process_vlans(card, 1);
-		if (!card->use_hard_stop ||
-			recovery_mode)
-			qeth_l2_del_all_mc(card);
+		qeth_l2_del_all_mc(card, 0);
 		qeth_clear_ipacmd_list(card);
 		card->state = CARD_STATE_HARDSETUP;
 	}
@@ -405,7 +394,6 @@ static int qeth_l2_stop_card(struct qeth
 		qeth_clear_cmd_buffers(&card->read);
 		qeth_clear_cmd_buffers(&card->write);
 	}
-	card->use_hard_stop = 0;
 	return rc;
 }
 
@@ -705,7 +693,7 @@ static void qeth_l2_set_multicast_list(s
 	if (qeth_threads_running(card, QETH_RECOVER_THREAD) &&
 	    (card->state != CARD_STATE_UP))
 		return;
-	qeth_l2_del_all_mc(card);
+	qeth_l2_del_all_mc(card, 1);
 	spin_lock_bh(&card->mclock);
 	netdev_for_each_mc_addr(ha, dev)
 		qeth_l2_add_mc(card, ha->addr, 0);
@@ -907,10 +895,8 @@ static void qeth_l2_remove_device(struct
 	qeth_set_allowed_threads(card, 0, 1);
 	wait_event(card->wait_q, qeth_threads_running(card, 0xffffffff) == 0);
 
-	if (cgdev->state == CCWGROUP_ONLINE) {
-		card->use_hard_stop = 1;
+	if (cgdev->state == CCWGROUP_ONLINE)
 		qeth_l2_set_offline(cgdev);
-	}
 
 	if (card->dev) {
 		unregister_netdev(card->dev);
@@ -1040,7 +1026,7 @@ contin:
 
 	if (card->info.type != QETH_CARD_TYPE_OSN &&
 	    card->info.type != QETH_CARD_TYPE_OSM)
-		qeth_l2_process_vlans(card, 0);
+		qeth_l2_process_vlans(card);
 
 	netif_tx_disable(card->dev);
 
@@ -1076,7 +1062,6 @@ contin:
 	return 0;
 
 out_remove:
-	card->use_hard_stop = 1;
 	qeth_l2_stop_card(card, 0);
 	ccw_device_set_offline(CARD_DDEV(card));
 	ccw_device_set_offline(CARD_WDEV(card));
@@ -1144,7 +1129,6 @@ static int qeth_l2_recover(void *ptr)
 	QETH_CARD_TEXT(card, 2, "recover2");
 	dev_warn(&card->gdev->dev,
 		"A recovery process has been started for the device\n");
-	card->use_hard_stop = 1;
 	__qeth_l2_set_offline(card->gdev, 1);
 	rc = __qeth_l2_set_online(card->gdev, 1);
 	if (!rc)
@@ -1191,7 +1175,6 @@ static int qeth_l2_pm_suspend(struct ccw
 	if (gdev->state == CCWGROUP_OFFLINE)
 		return 0;
 	if (card->state == CARD_STATE_UP) {
-		card->use_hard_stop = 1;
 		__qeth_l2_set_offline(card->gdev, 1);
 	} else
 		__qeth_l2_set_offline(card->gdev, 0);
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -510,8 +510,7 @@ static void qeth_l3_set_ip_addr_list(str
 	kfree(tbd_list);
 }
 
-static void qeth_l3_clear_ip_list(struct qeth_card *card, int clean,
-					int recover)
+static void qeth_l3_clear_ip_list(struct qeth_card *card, int recover)
 {
 	struct qeth_ipaddr *addr, *tmp;
 	unsigned long flags;
@@ -530,11 +529,6 @@ static void qeth_l3_clear_ip_list(struct
 		addr = list_entry(card->ip_list.next,
 				  struct qeth_ipaddr, entry);
 		list_del_init(&addr->entry);
-		if (clean) {
-			spin_unlock_irqrestore(&card->ip_lock, flags);
-			qeth_l3_deregister_addr_entry(card, addr);
-			spin_lock_irqsave(&card->ip_lock, flags);
-		}
 		if (!recover || addr->is_multicast) {
 			kfree(addr);
 			continue;
@@ -1611,29 +1605,6 @@ static int qeth_l3_start_ipassists(struc
 	return 0;
 }
 
-static int qeth_l3_put_unique_id(struct qeth_card *card)
-{
-
-	int rc = 0;
-	struct qeth_cmd_buffer *iob;
-	struct qeth_ipa_cmd *cmd;
-
-	QETH_CARD_TEXT(card, 2, "puniqeid");
-
-	if ((card->info.unique_id & UNIQUE_ID_NOT_BY_CARD) ==
-		UNIQUE_ID_NOT_BY_CARD)
-		return -1;
-	iob = qeth_get_ipacmd_buffer(card, IPA_CMD_DESTROY_ADDR,
-				     QETH_PROT_IPV6);
-	cmd = (struct qeth_ipa_cmd *)(iob->data+IPA_PDU_HEADER_SIZE);
-	*((__u16 *) &cmd->data.create_destroy_addr.unique_id[6]) =
-				card->info.unique_id;
-	memcpy(&cmd->data.create_destroy_addr.unique_id[0],
-	       card->dev->dev_addr, OSA_ADDR_LEN);
-	rc = qeth_send_ipa_cmd(card, iob, NULL, NULL);
-	return rc;
-}
-
 static int qeth_l3_iqd_read_initial_mac_cb(struct qeth_card *card,
 		struct qeth_reply *reply, unsigned long data)
 {
@@ -2324,25 +2295,14 @@ static int qeth_l3_stop_card(struct qeth
 			dev_close(card->dev);
 			rtnl_unlock();
 		}
-		if (!card->use_hard_stop) {
-			rc = qeth_send_stoplan(card);
-			if (rc)
-				QETH_DBF_TEXT_(SETUP, 2, "1err%d", rc);
-		}
 		card->state = CARD_STATE_SOFTSETUP;
 	}
 	if (card->state == CARD_STATE_SOFTSETUP) {
-		qeth_l3_clear_ip_list(card, !card->use_hard_stop, 1);
+		qeth_l3_clear_ip_list(card, 1);
 		qeth_clear_ipacmd_list(card);
 		card->state = CARD_STATE_HARDSETUP;
 	}
 	if (card->state == CARD_STATE_HARDSETUP) {
-		if (!card->use_hard_stop &&
-		    (card->info.type != QETH_CARD_TYPE_IQD)) {
-			rc = qeth_l3_put_unique_id(card);
-			if (rc)
-				QETH_DBF_TEXT_(SETUP, 2, "2err%d", rc);
-		}
 		qeth_qdio_clear_card(card, 0);
 		qeth_clear_qdio_buffers(card);
 		qeth_clear_working_pool_list(card);
@@ -2352,7 +2312,6 @@ static int qeth_l3_stop_card(struct qeth
 		qeth_clear_cmd_buffers(&card->read);
 		qeth_clear_cmd_buffers(&card->write);
 	}
-	card->use_hard_stop = 0;
 	return rc;
 }
 
@@ -3483,17 +3442,15 @@ static void qeth_l3_remove_device(struct
 	qeth_set_allowed_threads(card, 0, 1);
 	wait_event(card->wait_q, qeth_threads_running(card, 0xffffffff) == 0);
 
-	if (cgdev->state == CCWGROUP_ONLINE) {
-		card->use_hard_stop = 1;
+	if (cgdev->state == CCWGROUP_ONLINE)
 		qeth_l3_set_offline(cgdev);
-	}
 
 	if (card->dev) {
 		unregister_netdev(card->dev);
 		card->dev = NULL;
 	}
 
-	qeth_l3_clear_ip_list(card, 0, 0);
+	qeth_l3_clear_ip_list(card, 0);
 	qeth_l3_clear_ipato_list(card);
 	return;
 }
@@ -3594,7 +3551,6 @@ contin:
 	mutex_unlock(&card->discipline_mutex);
 	return 0;
 out_remove:
-	card->use_hard_stop = 1;
 	qeth_l3_stop_card(card, 0);
 	ccw_device_set_offline(CARD_DDEV(card));
 	ccw_device_set_offline(CARD_WDEV(card));
@@ -3663,7 +3619,6 @@ static int qeth_l3_recover(void *ptr)
 	QETH_CARD_TEXT(card, 2, "recover2");
 	dev_warn(&card->gdev->dev,
 		"A recovery process has been started for the device\n");
-	card->use_hard_stop = 1;
 	__qeth_l3_set_offline(card->gdev, 1);
 	rc = __qeth_l3_set_online(card->gdev, 1);
 	if (!rc)
@@ -3684,7 +3639,6 @@ static int qeth_l3_recover(void *ptr)
 static void qeth_l3_shutdown(struct ccwgroup_device *gdev)
 {
 	struct qeth_card *card = dev_get_drvdata(&gdev->dev);
-	qeth_l3_clear_ip_list(card, 0, 0);
 	qeth_qdio_clear_card(card, 0);
 	qeth_clear_qdio_buffers(card);
 }
@@ -3700,7 +3654,6 @@ static int qeth_l3_pm_suspend(struct ccw
 	if (gdev->state == CCWGROUP_OFFLINE)
 		return 0;
 	if (card->state == CARD_STATE_UP) {
-		card->use_hard_stop = 1;
 		__qeth_l3_set_offline(card->gdev, 1);
 	} else
 		__qeth_l3_set_offline(card->gdev, 0);


^ permalink raw reply

* [patch 0/1] s390: one more qeth patches for net-next
From: frank.blaschka @ 2011-02-18 14:22 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390

Hi Dave,

one more qeth patch for net-next

shortlog:
Ursula Braun (1)
qeth: remove needless IPA-commands in offline

Thanks,
        Frank

^ permalink raw reply

* Re: [PATCH v6 0/9] net: Unified offload configuration
From: Michał Mirosław @ 2011-02-18 14:22 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, bhutchings
In-Reply-To: <20110217.145611.229745070.davem@davemloft.net>

On Thu, Feb 17, 2011 at 02:56:11PM -0800, David Miller wrote:
> From: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> Date: Wed, 16 Feb 2011 03:59:16 +0100 (CET)
> > Here's a v6 of the ethtool unification patch series.
> > 
> > What's in it?
> >  1..4:
> > 	cleanups for the core patches
> >  5:
> > 	the patch - implement unified ethtool setting ops
> >  6..7:
> > 	implement interoperation between old and new ethtool ops
> >  8:
> > 	include RX checksum in features and plug it into new framework
> >  9:
> > 	convert loopback device to new framework
> > 
> > What is it good for?
> >  - unifies driver behaviour wrt hardware offloads
> >  - removes a lot of boilerplate code from drivers
> >  - allows better fine-grained control over used offloads
> Applied to net-next-2.6, please send any bug fixes relative to this.
> 
> Please get rid of that annoying message spit out by netif_features_change(),
> it's just spam.  If we want notifications for stuff like this, use a
> non-unicast netlink message so those who want to hear it can do so.

You mean netdev_update_features() "Features changed" message? Is it ok
to just demote it to DEBUG level or you want to remove it altogether?
What about netdev_fix_features() messages?

Best Regards,
Michał Mirosław

^ permalink raw reply

* Re: [patch net-next-2.6] net: convert bonding to use rx_handler
From: Jiri Pirko @ 2011-02-18 14:14 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, davem, shemminger, kaber, fubar, nicolas.2p.debian, andy
In-Reply-To: <1298035791.6201.56.camel@edumazet-laptop>

Fri, Feb 18, 2011 at 02:29:51PM CET, eric.dumazet@gmail.com wrote:
>Le vendredi 18 février 2011 à 14:25 +0100, Jiri Pirko a écrit :
>> This patch converts bonding to use rx_handler. Results in cleaner
>> __netif_receive_skb() with much less exceptions needed. Also bond-specific
>> work is moved into bond code.
>> 
>> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>> ---
>>  drivers/net/bonding/bond_main.c |   75 ++++++++++++++++++++-
>>  include/linux/skbuff.h          |    2 +
>>  net/core/dev.c                  |  144 +++++++++++---------------------------
>>  net/core/skbuff.c               |    1 +
>>  4 files changed, 119 insertions(+), 103 deletions(-)
>> 
>
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 31f02d0..9f3af5d 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -267,6 +267,7 @@ typedef unsigned char *sk_buff_data_t;
>>   *	@sk: Socket we are owned by
>>   *	@tstamp: Time we arrived
>>   *	@dev: Device we arrived on/are leaving by
>> + *	@input_dev: Original device on which we arrived
>>   *	@transport_header: Transport layer header
>>   *	@network_header: Network layer header
>>   *	@mac_header: Link layer header
>> @@ -325,6 +326,7 @@ struct sk_buff {
>>  
>>  	struct sock		*sk;
>>  	struct net_device	*dev;
>> +	struct net_device	*input_dev;
>>  
>
>Your patch looks fine, but adding 8 bytes to sk_buff for a "cleanup" is
>really a show stopper for me.

Do not know how to do it better. As for percpu variable, not only
origdev would have to be remembered but also probably skb pointer to
know if it's the first run on the skb or not. Can't really figure out a
better solution. Can you?

Thanks.

Jirka
>
>
>

^ permalink raw reply

* [PATCH v2] net: provide default_advmss() methods to blackhole dst_ops
From: Eric Dumazet @ 2011-02-18 13:44 UTC (permalink / raw)
  To: Changli Gao
  Cc: George Spelvin, David Miller, linux-kernel, netdev, Roland Dreier,
	Daniel Baluta
In-Reply-To: <1298036005.6201.58.camel@edumazet-laptop>

Le vendredi 18 février 2011 à 14:33 +0100, Eric Dumazet a écrit :

> I had this exact idea but found we need struct net pointer to get this
> value, not provided in parameters, so I falled back to the 256 value.
> 
> 

Hmm, reading again this stuff, maybe we can just use
ipv4_default_advmss() instead of a custom one.

dst->dev should be available

[PATCH] net: provide default_advmss() methods to blackhole dst_ops

Commit 0dbaee3b37e118a (net: Abstract default ADVMSS behind an
accessor.) introduced a possible crash in tcp_connect_init(), when
dst->default_advmss() is called from dst_metric_advmss()

Reported-by: George Spelvin <linux@horizon.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Roland Dreier <roland@purestorage.com>
CC: Changli Gao <xiaosuo@gmail.com>
CC: Daniel Baluta <dbaluta@ixiacom.com>
---
 net/ipv4/route.c |    1 +
 net/ipv6/route.c |    1 +
 2 files changed, 2 insertions(+)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 788a3e7..6ed6603 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2722,6 +2722,7 @@ static struct dst_ops ipv4_dst_blackhole_ops = {
 	.destroy		=	ipv4_dst_destroy,
 	.check			=	ipv4_blackhole_dst_check,
 	.default_mtu		=	ipv4_blackhole_default_mtu,
+	.default_advmss		=	ipv4_default_advmss,
 	.update_pmtu		=	ipv4_rt_blackhole_update_pmtu,
 };
 
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 1c29f95..a998db6 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -128,6 +128,7 @@ static struct dst_ops ip6_dst_blackhole_ops = {
 	.destroy		=	ip6_dst_destroy,
 	.check			=	ip6_dst_check,
 	.default_mtu		=	ip6_blackhole_default_mtu,
+	.default_advmss		=	ip6_default_advmss,
 	.update_pmtu		=	ip6_rt_blackhole_update_pmtu,
 };
 



^ permalink raw reply related

* Re: [PATCH] net: Add default_advmss() methods to blackhole dst_ops
From: Daniel Baluta @ 2011-02-18 13:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Changli Gao, George Spelvin, David Miller, linux-kernel, netdev,
	Roland Dreier
In-Reply-To: <1298036005.6201.58.camel@edumazet-laptop>

On Fri, Feb 18, 2011 at 3:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le vendredi 18 février 2011 à 21:29 +0800, Changli Gao a écrit :
>> On Fri, Feb 18, 2011 at 9:24 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > Le vendredi 18 février 2011 à 21:16 +0800, Changli Gao a écrit :
>> >
>> >> I am wondering why magic number 256 is used here. Is there a special
>> >> reason? Thanks.
>> >>
>> >
>> > It really doesnt matter. SYN message will be dropped anyway.
>> >
>> > 256 happens to be the default value
>> > of /proc/sys/net/ipv4/route/min_adv_mss

Perhaps a comment would make sense then.

thanks,
Daniel.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox