Netdev List
 help / color / mirror / Atom feed
* [PATCH 02/12] ipv4: netfilter: use PTR_RET instead of IS_ERR + PTR_ERR
From: pablo @ 2013-03-25 12:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1364213752-3934-1-git-send-email-pablo@netfilter.org>

From: Silviu-Mihai Popescu <silviupopescu1990@gmail.com>

This uses PTR_RET instead of IS_ERR and PTR_ERR in order to increase
readability.

Signed-off-by: Silviu-Mihai Popescu <silviupopescu1990@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv4/netfilter/arptable_filter.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/ipv4/netfilter/arptable_filter.c b/net/ipv4/netfilter/arptable_filter.c
index 79ca5e7..eadab1e 100644
--- a/net/ipv4/netfilter/arptable_filter.c
+++ b/net/ipv4/netfilter/arptable_filter.c
@@ -48,9 +48,7 @@ static int __net_init arptable_filter_net_init(struct net *net)
 	net->ipv4.arptable_filter =
 		arpt_register_table(net, &packet_filter, repl);
 	kfree(repl);
-	if (IS_ERR(net->ipv4.arptable_filter))
-		return PTR_ERR(net->ipv4.arptable_filter);
-	return 0;
+	return PTR_RET(net->ipv4.arptable_filter);
 }
 
 static void __net_exit arptable_filter_net_exit(struct net *net)
-- 
1.7.10.4


^ permalink raw reply related

* [PATCH 01/12] netfilter: ip6t_NPT: Use csum_partial()
From: pablo @ 2013-03-25 12:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1364213752-3934-1-git-send-email-pablo@netfilter.org>

From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[ Some fixes went into mainstream before this patch, so I needed
  to rebase it upon the current tree, that's why it's different from
  the original one posted on the list --pablo ]

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv6/netfilter/ip6t_NPT.c |   11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/net/ipv6/netfilter/ip6t_NPT.c b/net/ipv6/netfilter/ip6t_NPT.c
index 83acc14..59286a1 100644
--- a/net/ipv6/netfilter/ip6t_NPT.c
+++ b/net/ipv6/netfilter/ip6t_NPT.c
@@ -18,9 +18,8 @@
 static int ip6t_npt_checkentry(const struct xt_tgchk_param *par)
 {
 	struct ip6t_npt_tginfo *npt = par->targinfo;
-	__wsum src_sum = 0, dst_sum = 0;
 	struct in6_addr pfx;
-	unsigned int i;
+	__wsum src_sum, dst_sum;
 
 	if (npt->src_pfx_len > 64 || npt->dst_pfx_len > 64)
 		return -EINVAL;
@@ -33,12 +32,8 @@ static int ip6t_npt_checkentry(const struct xt_tgchk_param *par)
 	if (!ipv6_addr_equal(&pfx, &npt->dst_pfx.in6))
 		return -EINVAL;
 
-	for (i = 0; i < ARRAY_SIZE(npt->src_pfx.in6.s6_addr16); i++) {
-		src_sum = csum_add(src_sum,
-				(__force __wsum)npt->src_pfx.in6.s6_addr16[i]);
-		dst_sum = csum_add(dst_sum,
-				(__force __wsum)npt->dst_pfx.in6.s6_addr16[i]);
-	}
+	src_sum = csum_partial(&npt->src_pfx.in6, sizeof(npt->src_pfx.in6), 0);
+	dst_sum = csum_partial(&npt->dst_pfx.in6, sizeof(npt->dst_pfx.in6), 0);
 
 	npt->adjustment = ~csum_fold(csum_sub(src_sum, dst_sum));
 	return 0;
-- 
1.7.10.4


^ permalink raw reply related

* [PATCH 00/12] Netfilter updates for net-next
From: pablo @ 2013-03-25 12:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>

Hi David,

The following patchset contains Netfilter/IPVS updates for
your net-next tree, they are:

* Better performance in nfnetlink_queue by avoiding copy from the
  packet to netlink message, from Eric Dumazet.

* Remove unnecessary locking in the exit path of ebt_ulog, from Gao Feng.

* Use new function ipv6_iface_scope_id in nf_ct_ipv6, from Hannes Frederic Sowa.

* A couple of sparse fixes for IPVS, from Julian Anastasov.

* Use xor hashing in nfnetlink_queue, as suggested by Eric Dumazet, from
  myself.

* Allow to dump expectations per master conntrack via ctnetlink, from myself.

* A couple of cleanups to use PTR_RET in module init path, from Silviu-Mihai
  Popescu.

* Remove nf_conntrack module a bit faster if netns are in use, from
  Vladimir Davydov.

* Use checksum_partial in ip6t_NPT, from YOSHIFUJI Hideaki.

* Sparse fix for nf_conntrack, from Stephen Hemminger.

You can pull these changes from:

git://1984.lsi.us.es/nf-next master

Thanks!

Eric Dumazet (1):
  netfilter: nfnetlink_queue: zero copy support

Gao feng (1):
  netfilter: ebt_ulog: remove unnecessary spin lock protection

Hannes Frederic Sowa (1):
  netfilter: nf_ct_ipv6: use ipv6_iface_scope_id in conntrack to return scope id

Julian Anastasov (2):
  ipvs: fix hashing in ip_vs_svc_hashkey
  ipvs: fix some sparse warnings

Pablo Neira Ayuso (2):
  netfilter: nfnetlink_queue: use xor hash function to distribute instances
  netfilter: ctnetlink: allow to dump expectation per master conntrack

Silviu-Mihai Popescu (2):
  ipv4: netfilter: use PTR_RET instead of IS_ERR + PTR_ERR
  bridge: netfilter: use PTR_RET instead of IS_ERR + PTR_ERR

Vladimir Davydov (1):
  netfilter: nf_conntrack: speed up module removal path if netns in use

YOSHIFUJI Hideaki (1):
  netfilter: ip6t_NPT: Use csum_partial()

stephen hemminger (1):
  netfilter: nf_conntrack: add include to fix sparse warning

 include/net/ip_vs.h                            |    2 +-
 include/net/netfilter/nf_conntrack_core.h      |    1 +
 net/bridge/netfilter/ebt_ulog.c                |    3 +-
 net/bridge/netfilter/ebtable_broute.c          |    4 +-
 net/ipv4/netfilter/arptable_filter.c           |    4 +-
 net/ipv6/netfilter/ip6t_NPT.c                  |   11 +--
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |    8 +-
 net/netfilter/ipvs/ip_vs_core.c                |    8 +-
 net/netfilter/ipvs/ip_vs_ctl.c                 |    8 +-
 net/netfilter/ipvs/ip_vs_est.c                 |    2 +-
 net/netfilter/nf_conntrack_core.c              |   47 +++++++----
 net/netfilter/nf_conntrack_netlink.c           |  100 ++++++++++++++++++++++--
 net/netfilter/nf_conntrack_standalone.c        |   16 ++--
 net/netfilter/nfnetlink_queue_core.c           |   96 +++++++++++++++++------
 14 files changed, 228 insertions(+), 82 deletions(-)

-- 
1.7.10.4


^ permalink raw reply

* Re: [PATCH 6/6] xen-netback: don't disconnect frontend when seeing oversize frame
From: Wei Liu @ 2013-03-25 12:00 UTC (permalink / raw)
  To: David Vrabel
  Cc: xen-devel@lists.xen.org, netdev@vger.kernel.org, Ian Campbell,
	annie.li@oracle.com, konrad.wilk@oracle.com
In-Reply-To: <51503945.6040204@citrix.com>

On Mon, Mar 25, 2013 at 11:47:17AM +0000, David Vrabel wrote:
> On 25/03/13 11:08, Wei Liu wrote:
> > Some buggy frontends may generate frames larger than 64 KiB. We should
> > aggresively consume all slots and drop the packet instead of disconnecting the
> > frontend.
> 
> The following is the changeset description I wrote internally.  It's a
> bit more descriptive.
> 
> Apologies for not sending out a proper patch in the first place.
> 
> "Some frontend drivers are sending packets >= 64 KiB in length.  This
> length overflows the length field in the first frag making the
> following frags have an invalid length ("Frag is bigger than frame").
> 
> Turn this error back into a non-fatal error by dropping the packet.
> To avoid having the following frags having fatal errors, consume all
> frags in the packet.
> 
> This does not reopen the security hole as if the packet as an invalid
> number of frags it will still hit this fatal error case."
> 

Thanks.

Overall this looks good. I will need to change 'frags' to 'slots'
though.


Wei.

> David

^ permalink raw reply

* Re: [PATCH 6/6] xen-netback: don't disconnect frontend when seeing oversize frame
From: David Vrabel @ 2013-03-25 11:47 UTC (permalink / raw)
  To: Wei Liu
  Cc: xen-devel@lists.xen.org, netdev@vger.kernel.org, Ian Campbell,
	annie.li@oracle.com, konrad.wilk@oracle.com
In-Reply-To: <1364209702-12437-7-git-send-email-wei.liu2@citrix.com>

On 25/03/13 11:08, Wei Liu wrote:
> Some buggy frontends may generate frames larger than 64 KiB. We should
> aggresively consume all slots and drop the packet instead of disconnecting the
> frontend.

The following is the changeset description I wrote internally.  It's a
bit more descriptive.

Apologies for not sending out a proper patch in the first place.

"Some frontend drivers are sending packets >= 64 KiB in length.  This
length overflows the length field in the first frag making the
following frags have an invalid length ("Frag is bigger than frame").

Turn this error back into a non-fatal error by dropping the packet.
To avoid having the following frags having fatal errors, consume all
frags in the packet.

This does not reopen the security hole as if the packet as an invalid
number of frags it will still hit this fatal error case."

David

^ permalink raw reply

* [PATCH 3/6] xen-netfront: frags -> slots in xennet_get_responses
From: Wei Liu @ 2013-03-25 11:08 UTC (permalink / raw)
  To: xen-devel, netdev
  Cc: ian.campbell, annie.li, konrad.wilk, david.vrabel, Wei Liu
In-Reply-To: <1364209702-12437-1-git-send-email-wei.liu2@citrix.com>

This function is in fact counting the ring slots required for responses.
Separate the concepts of ring slots and skb frags make the code clearer, as
now netfront and netback can have different MAX_SKB_FRAGS, slot and frag are
not mapped 1:1 any more.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netfront.c |   18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 3ae9dc1..52f5010 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -724,7 +724,7 @@ static int xennet_get_responses(struct netfront_info *np,
 	struct sk_buff *skb = xennet_get_rx_skb(np, cons);
 	grant_ref_t ref = xennet_get_rx_ref(np, cons);
 	int max = MAX_SKB_FRAGS + (rx->status <= RX_COPY_THRESHOLD);
-	int frags = 1;
+	int slots = 1;
 	int err = 0;
 	unsigned long ret;
 
@@ -768,27 +768,27 @@ next:
 		if (!(rx->flags & XEN_NETRXF_more_data))
 			break;
 
-		if (cons + frags == rp) {
+		if (cons + slots == rp) {
 			if (net_ratelimit())
-				dev_warn(dev, "Need more frags\n");
+				dev_warn(dev, "Need more slots\n");
 			err = -ENOENT;
 			break;
 		}
 
-		rx = RING_GET_RESPONSE(&np->rx, cons + frags);
-		skb = xennet_get_rx_skb(np, cons + frags);
-		ref = xennet_get_rx_ref(np, cons + frags);
-		frags++;
+		rx = RING_GET_RESPONSE(&np->rx, cons + slots);
+		skb = xennet_get_rx_skb(np, cons + slots);
+		ref = xennet_get_rx_ref(np, cons + slots);
+		slots++;
 	}
 
-	if (unlikely(frags > max)) {
+	if (unlikely(slots > max)) {
 		if (net_ratelimit())
 			dev_warn(dev, "Too many frags\n");
 		err = -E2BIG;
 	}
 
 	if (unlikely(err))
-		np->rx.rsp_cons = cons + frags;
+		np->rx.rsp_cons = cons + slots;
 
 	return err;
 }
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 2/6] xen-netfront: reduce gso_max_size to account for ethernet header
From: Wei Liu @ 2013-03-25 11:08 UTC (permalink / raw)
  To: xen-devel, netdev
  Cc: ian.campbell, annie.li, konrad.wilk, david.vrabel, Wei Liu
In-Reply-To: <1364209702-12437-1-git-send-email-wei.liu2@citrix.com>

The maximum packet including ethernet header that can be handled by netfront /
netback wire format is 65535. Reduce gso_max_size accordingly.

Drop skb and print warning when skb->len > 65535. This can 1) save the effort
to send malformed packet to netback, 2) help spotting misconfiguration of
netfront in the future.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netfront.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 5527663..3ae9dc1 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -547,6 +547,18 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	unsigned int len = skb_headlen(skb);
 	unsigned long flags;
 
+	/* Wire format of xen_netif_tx_request only supports skb->len
+	 * < 64K, because size field in xen_netif_tx_request is
+	 * uint16_t. If skb->len is too big, drop it and alert user about
+	 * misconfiguration.
+	 */
+	if (unlikely(skb->len >= (uint16_t)(~0))) {
+		net_alert_ratelimited(
+			"xennet: skb->len = %u, too big for wire format\n",
+			skb->len);
+		goto drop;
+	}
+
 	slots = DIV_ROUND_UP(offset + len, PAGE_SIZE) +
 		xennet_count_skb_frag_slots(skb);
 	if (unlikely(slots > MAX_SKB_FRAGS + 1)) {
@@ -1362,6 +1374,8 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
 	SET_ETHTOOL_OPS(netdev, &xennet_ethtool_ops);
 	SET_NETDEV_DEV(netdev, &dev->dev);
 
+	netif_set_gso_max_size(netdev, 65535 - ETH_HLEN);
+
 	np->netdev = netdev;
 
 	netif_carrier_off(netdev);
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 4/6] xen-netback: remove skb in xen_netbk_alloc_page
From: Wei Liu @ 2013-03-25 11:08 UTC (permalink / raw)
  To: xen-devel, netdev
  Cc: ian.campbell, annie.li, konrad.wilk, david.vrabel, Wei Liu
In-Reply-To: <1364209702-12437-1-git-send-email-wei.liu2@citrix.com>

This variable is never used.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/xen-netback/netback.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index da726a3..6e8e51a 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -948,7 +948,6 @@ static int netbk_count_requests(struct xenvif *vif,
 }
 
 static struct page *xen_netbk_alloc_page(struct xen_netbk *netbk,
-					 struct sk_buff *skb,
 					 u16 pending_idx)
 {
 	struct page *page;
@@ -982,7 +981,7 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
 
 		index = pending_index(netbk->pending_cons++);
 		pending_idx = netbk->pending_ring[index];
-		page = xen_netbk_alloc_page(netbk, skb, pending_idx);
+		page = xen_netbk_alloc_page(netbk, pending_idx);
 		if (!page)
 			goto err;
 
@@ -1387,7 +1386,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 		}
 
 		/* XXX could copy straight to head */
-		page = xen_netbk_alloc_page(netbk, skb, pending_idx);
+		page = xen_netbk_alloc_page(netbk, pending_idx);
 		if (!page) {
 			kfree_skb(skb);
 			netbk_tx_err(vif, &txreq, idx);
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 1/6] xen-netfront: remove unused variable `extra'
From: Wei Liu @ 2013-03-25 11:08 UTC (permalink / raw)
  To: xen-devel, netdev
  Cc: ian.campbell, annie.li, konrad.wilk, david.vrabel, Wei Liu
In-Reply-To: <1364209702-12437-1-git-send-email-wei.liu2@citrix.com>

This variable is supposed to hold reference to the last extra_info in the
loop. However there is only type of extra info here and the loop to process
extra info is missing, so this variable is never used and causes confusion.

Remove it at the moment. We can add it back when necessary.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netfront.c |    8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 7ffa43b..5527663 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -537,7 +537,6 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct netfront_info *np = netdev_priv(dev);
 	struct netfront_stats *stats = this_cpu_ptr(np->stats);
 	struct xen_netif_tx_request *tx;
-	struct xen_netif_extra_info *extra;
 	char *data = skb->data;
 	RING_IDX i;
 	grant_ref_t ref;
@@ -581,7 +580,6 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	tx->gref = np->grant_tx_ref[id] = ref;
 	tx->offset = offset;
 	tx->size = len;
-	extra = NULL;
 
 	tx->flags = 0;
 	if (skb->ip_summed == CHECKSUM_PARTIAL)
@@ -597,10 +595,7 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		gso = (struct xen_netif_extra_info *)
 			RING_GET_REQUEST(&np->tx, ++i);
 
-		if (extra)
-			extra->flags |= XEN_NETIF_EXTRA_FLAG_MORE;
-		else
-			tx->flags |= XEN_NETTXF_extra_info;
+		tx->flags |= XEN_NETTXF_extra_info;
 
 		gso->u.gso.size = skb_shinfo(skb)->gso_size;
 		gso->u.gso.type = XEN_NETIF_GSO_TYPE_TCPV4;
@@ -609,7 +604,6 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 		gso->type = XEN_NETIF_EXTRA_TYPE_GSO;
 		gso->flags = 0;
-		extra = gso;
 	}
 
 	np->tx.req_prod_pvt = i + 1;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 0/6 V2] Bundle fixes for Xen netfront / netback
From: Wei Liu @ 2013-03-25 11:08 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: ian.campbell, annie.li, konrad.wilk, david.vrabel

1, 3 and 4 are just mechanical fixes to make code cleaner.

2 fixes the bug that netfront generates oversize skbs. The fix was suggested by
Ben. My test shows that netfront doesn't generate oversize skb after this fix.

4 tries to coalesce slots before copying. This version enlarges the grant copy
buffer to avoid overflow.

6 loosens the punishment when netback sees oversize packet.


Wei.

^ permalink raw reply

* [PATCH 5/6] xen-netback: coalesce slots before copying
From: Wei Liu @ 2013-03-25 11:08 UTC (permalink / raw)
  To: xen-devel, netdev
  Cc: ian.campbell, annie.li, konrad.wilk, david.vrabel, Wei Liu
In-Reply-To: <1364209702-12437-1-git-send-email-wei.liu2@citrix.com>

This patch tries to coalesce tx requests when constructing grant copy
structures. It enables netback to deal with situation when frontend's
MAX_SKB_FRAGS is larger than backend's MAX_SKB_FRAGS.

It defines max_skb_slots, which is a estimation of the maximum number of slots
a guest can send, anything bigger than that is considered malicious. Now it is
set to 20, which should be enough to accommodate Linux (16 to 19).

Also change variable name from "frags" to "slots" in netbk_count_requests.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/netback.c |  220 ++++++++++++++++++++++++++++---------
 1 file changed, 171 insertions(+), 49 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 6e8e51a..a634dc5 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -47,11 +47,26 @@
 #include <asm/xen/hypercall.h>
 #include <asm/xen/page.h>
 
+/*
+ * This is an estimation of the maximum possible frags a SKB might
+ * have, anything larger than this is considered malicious. Typically
+ * Linux has 16 to 19.
+ */
+#define MAX_SKB_SLOTS_DEFAULT 20
+static unsigned int max_skb_slots = MAX_SKB_SLOTS_DEFAULT;
+module_param(max_skb_slots, uint, 0444);
+
+typedef unsigned int pending_ring_idx_t;
+#define INVALID_PENDING_RING_IDX (~0U)
+
 struct pending_tx_info {
-	struct xen_netif_tx_request req;
+	struct xen_netif_tx_request req; /* coalesced tx request  */
 	struct xenvif *vif;
+	pending_ring_idx_t head; /* head != INVALID_PENDING_RING_IDX
+				  * if it is head of serveral merged
+				  * tx reqs
+				  */
 };
-typedef unsigned int pending_ring_idx_t;
 
 struct netbk_rx_meta {
 	int id;
@@ -102,7 +117,11 @@ struct xen_netbk {
 	atomic_t netfront_count;
 
 	struct pending_tx_info pending_tx_info[MAX_PENDING_REQS];
-	struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS];
+	/* Coalescing tx requests before copying makes number of grant
+	 * copy ops greater of equal to number of slots required. In
+	 * worst case a tx request consumes 2 gnttab_copy.
+	 */
+	struct gnttab_copy tx_copy_ops[2*MAX_PENDING_REQS];
 
 	u16 pending_ring[MAX_PENDING_REQS];
 
@@ -251,7 +270,7 @@ static int max_required_rx_slots(struct xenvif *vif)
 	int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
 
 	if (vif->can_sg || vif->gso || vif->gso_prefix)
-		max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
+		max += max_skb_slots + 1; /* extra_info + frags */
 
 	return max;
 }
@@ -657,7 +676,7 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
 		__skb_queue_tail(&rxq, skb);
 
 		/* Filled the batch queue? */
-		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
+		if (count + max_skb_slots >= XEN_NETIF_RX_RING_SIZE)
 			break;
 	}
 
@@ -908,34 +927,34 @@ static int netbk_count_requests(struct xenvif *vif,
 				int work_to_do)
 {
 	RING_IDX cons = vif->tx.req_cons;
-	int frags = 0;
+	int slots = 0;
 
 	if (!(first->flags & XEN_NETTXF_more_data))
 		return 0;
 
 	do {
-		if (frags >= work_to_do) {
-			netdev_err(vif->dev, "Need more frags\n");
+		if (slots >= work_to_do) {
+			netdev_err(vif->dev, "Need more slots\n");
 			netbk_fatal_tx_err(vif);
 			return -ENODATA;
 		}
 
-		if (unlikely(frags >= MAX_SKB_FRAGS)) {
-			netdev_err(vif->dev, "Too many frags\n");
+		if (unlikely(slots >= max_skb_slots)) {
+			netdev_err(vif->dev, "Too many slots\n");
 			netbk_fatal_tx_err(vif);
 			return -E2BIG;
 		}
 
-		memcpy(txp, RING_GET_REQUEST(&vif->tx, cons + frags),
+		memcpy(txp, RING_GET_REQUEST(&vif->tx, cons + slots),
 		       sizeof(*txp));
 		if (txp->size > first->size) {
-			netdev_err(vif->dev, "Frag is bigger than frame.\n");
+			netdev_err(vif->dev, "Packet is bigger than frame.\n");
 			netbk_fatal_tx_err(vif);
 			return -EIO;
 		}
 
 		first->size -= txp->size;
-		frags++;
+		slots++;
 
 		if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
 			netdev_err(vif->dev, "txp->offset: %x, size: %u\n",
@@ -944,7 +963,7 @@ static int netbk_count_requests(struct xenvif *vif,
 			return -EINVAL;
 		}
 	} while ((txp++)->flags & XEN_NETTXF_more_data);
-	return frags;
+	return slots;
 }
 
 static struct page *xen_netbk_alloc_page(struct xen_netbk *netbk,
@@ -968,48 +987,114 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
 	struct skb_shared_info *shinfo = skb_shinfo(skb);
 	skb_frag_t *frags = shinfo->frags;
 	u16 pending_idx = *((u16 *)skb->data);
-	int i, start;
+	u16 head_idx = 0;
+	int slot, start;
+	struct page *page;
+	pending_ring_idx_t index, start_idx = 0;
+	uint16_t dst_offset;
+	unsigned int nr_slots;
+	struct pending_tx_info *first = NULL;
+
+	/* At this point shinfo->nr_frags is in fact the number of
+	 * slots, which can be as large as max_skb_slots.
+	 */
+	nr_slots = shinfo->nr_frags;
 
 	/* Skip first skb fragment if it is on same page as header fragment. */
 	start = (frag_get_pending_idx(&shinfo->frags[0]) == pending_idx);
 
-	for (i = start; i < shinfo->nr_frags; i++, txp++) {
-		struct page *page;
-		pending_ring_idx_t index;
+	/* Coalesce tx requests, at this point the packet passed in
+	 * should be <= 64K. Any packets larger than 64K have been
+	 * handled in netbk_count_requests().
+	 */
+	for (shinfo->nr_frags = slot = start; slot < nr_slots;
+	     shinfo->nr_frags++) {
 		struct pending_tx_info *pending_tx_info =
 			netbk->pending_tx_info;
 
-		index = pending_index(netbk->pending_cons++);
-		pending_idx = netbk->pending_ring[index];
-		page = xen_netbk_alloc_page(netbk, pending_idx);
+		page = alloc_page(GFP_KERNEL|__GFP_COLD);
 		if (!page)
 			goto err;
 
-		gop->source.u.ref = txp->gref;
-		gop->source.domid = vif->domid;
-		gop->source.offset = txp->offset;
-
-		gop->dest.u.gmfn = virt_to_mfn(page_address(page));
-		gop->dest.domid = DOMID_SELF;
-		gop->dest.offset = txp->offset;
-
-		gop->len = txp->size;
-		gop->flags = GNTCOPY_source_gref;
+		dst_offset = 0;
+		first = NULL;
+		while (dst_offset < PAGE_SIZE && slot < nr_slots) {
+			gop->flags = GNTCOPY_source_gref;
+
+			gop->source.u.ref = txp->gref;
+			gop->source.domid = vif->domid;
+			gop->source.offset = txp->offset;
+
+			gop->dest.domid = DOMID_SELF;
+
+			gop->dest.offset = dst_offset;
+			gop->dest.u.gmfn = virt_to_mfn(page_address(page));
+
+			if (dst_offset + txp->size > PAGE_SIZE) {
+				/* This page can only merge a portion
+				 * of tx request. Do not increment any
+				 * pointer / counter here. The txp
+				 * will be dealt with in future
+				 * rounds, eventually hitting the
+				 * `else` branch.
+				 */
+				gop->len = PAGE_SIZE - dst_offset;
+				txp->offset += gop->len;
+				txp->size -= gop->len;
+				dst_offset += gop->len; /* quit loop */
+			} else {
+				/* This tx request can be merged in the page */
+				gop->len = txp->size;
+				dst_offset += gop->len;
+
+				index = pending_index(netbk->pending_cons++);
+
+				pending_idx = netbk->pending_ring[index];
+
+				memcpy(&pending_tx_info[pending_idx].req, txp,
+				       sizeof(*txp));
+				xenvif_get(vif);
+
+				pending_tx_info[pending_idx].vif = vif;
+
+				/* Poison these fields, corresponding
+				 * fields for head tx req will be set
+				 * to correct values after the loop.
+				 */
+				netbk->mmap_pages[pending_idx] = (void *)(~0UL);
+				pending_tx_info[pending_idx].head =
+					INVALID_PENDING_RING_IDX;
+
+				if (unlikely(!first)) {
+					first = &pending_tx_info[pending_idx];
+					start_idx = index;
+					head_idx = pending_idx;
+				}
+
+				txp++;
+				slot++;
+			}
 
-		gop++;
+			gop++;
+		}
 
-		memcpy(&pending_tx_info[pending_idx].req, txp, sizeof(*txp));
-		xenvif_get(vif);
-		pending_tx_info[pending_idx].vif = vif;
-		frag_set_pending_idx(&frags[i], pending_idx);
+		first->req.offset = 0;
+		first->req.size = dst_offset;
+		first->head = start_idx;
+		set_page_ext(page, netbk, head_idx);
+		netbk->mmap_pages[head_idx] = page;
+		frag_set_pending_idx(&frags[shinfo->nr_frags], head_idx);
 	}
 
+	BUG_ON(shinfo->nr_frags > MAX_SKB_FRAGS);
+
 	return gop;
 err:
 	/* Unwind, freeing all pages and sending error responses. */
-	while (i-- > start) {
-		xen_netbk_idx_release(netbk, frag_get_pending_idx(&frags[i]),
-				      XEN_NETIF_RSP_ERROR);
+	while (shinfo->nr_frags-- > start) {
+		xen_netbk_idx_release(netbk,
+				frag_get_pending_idx(&frags[shinfo->nr_frags]),
+				XEN_NETIF_RSP_ERROR);
 	}
 	/* The head too, if necessary. */
 	if (start)
@@ -1025,8 +1110,10 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk,
 	struct gnttab_copy *gop = *gopp;
 	u16 pending_idx = *((u16 *)skb->data);
 	struct skb_shared_info *shinfo = skb_shinfo(skb);
+	struct pending_tx_info *tx_info;
 	int nr_frags = shinfo->nr_frags;
 	int i, err, start;
+	u16 peek; /* peek into next tx request */
 
 	/* Check status of header. */
 	err = gop->status;
@@ -1038,11 +1125,21 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk,
 
 	for (i = start; i < nr_frags; i++) {
 		int j, newerr;
+		pending_ring_idx_t head;
 
 		pending_idx = frag_get_pending_idx(&shinfo->frags[i]);
+		tx_info = &netbk->pending_tx_info[pending_idx];
+		head = tx_info->head;
 
 		/* Check error status: if okay then remember grant handle. */
-		newerr = (++gop)->status;
+		do {
+			newerr = (++gop)->status;
+			if (newerr)
+				break;
+			peek = netbk->pending_ring[pending_index(++head)];
+		} while (netbk->pending_tx_info[peek].head
+			 == INVALID_PENDING_RING_IDX);
+
 		if (likely(!newerr)) {
 			/* Had a previous error? Invalidate this fragment. */
 			if (unlikely(err))
@@ -1267,11 +1364,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 	struct sk_buff *skb;
 	int ret;
 
-	while (((nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) &&
+	while (((nr_pending_reqs(netbk) + max_skb_slots) < MAX_PENDING_REQS) &&
 		!list_empty(&netbk->net_schedule_list)) {
 		struct xenvif *vif;
 		struct xen_netif_tx_request txreq;
-		struct xen_netif_tx_request txfrags[MAX_SKB_FRAGS];
+		struct xen_netif_tx_request txfrags[max_skb_slots];
 		struct page *page;
 		struct xen_netif_extra_info extras[XEN_NETIF_EXTRA_TYPE_MAX-1];
 		u16 pending_idx;
@@ -1359,7 +1456,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 		pending_idx = netbk->pending_ring[index];
 
 		data_len = (txreq.size > PKT_PROT_LEN &&
-			    ret < MAX_SKB_FRAGS) ?
+			    ret < max_skb_slots) ?
 			PKT_PROT_LEN : txreq.size;
 
 		skb = alloc_skb(data_len + NET_SKB_PAD + NET_IP_ALIGN,
@@ -1409,6 +1506,7 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 		memcpy(&netbk->pending_tx_info[pending_idx].req,
 		       &txreq, sizeof(txreq));
 		netbk->pending_tx_info[pending_idx].vif = vif;
+		netbk->pending_tx_info[pending_idx].head = index;
 		*((u16 *)skb->data) = pending_idx;
 
 		__skb_put(skb, data_len);
@@ -1539,7 +1637,10 @@ static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx,
 {
 	struct xenvif *vif;
 	struct pending_tx_info *pending_tx_info;
-	pending_ring_idx_t index;
+	pending_ring_idx_t head;
+	u16 peek; /* peek into next tx request */
+
+	BUG_ON(netbk->mmap_pages[pending_idx] == (void *)(~0UL));
 
 	/* Already complete? */
 	if (netbk->mmap_pages[pending_idx] == NULL)
@@ -1548,13 +1649,34 @@ static void xen_netbk_idx_release(struct xen_netbk *netbk, u16 pending_idx,
 	pending_tx_info = &netbk->pending_tx_info[pending_idx];
 
 	vif = pending_tx_info->vif;
+	head = pending_tx_info->head;
 
-	make_tx_response(vif, &pending_tx_info->req, status);
+	BUG_ON(head == INVALID_PENDING_RING_IDX);
+	BUG_ON(netbk->pending_ring[pending_index(head)] != pending_idx);
 
-	index = pending_index(netbk->pending_prod++);
-	netbk->pending_ring[index] = pending_idx;
+	do {
+		pending_ring_idx_t index;
+		pending_ring_idx_t idx = pending_index(head);
+		u16 info_idx = netbk->pending_ring[idx];
 
-	xenvif_put(vif);
+		pending_tx_info = &netbk->pending_tx_info[info_idx];
+		make_tx_response(vif, &pending_tx_info->req, status);
+
+		/* Setting any number other than
+		 * INVALID_PENDING_RING_IDX indicates this slot is
+		 * starting a new packet / ending a previous packet.
+		 */
+		pending_tx_info->head = 0;
+
+		index = pending_index(netbk->pending_prod++);
+		netbk->pending_ring[index] = netbk->pending_ring[info_idx];
+
+		xenvif_put(vif);
+
+		peek = netbk->pending_ring[pending_index(++head)];
+
+	} while (netbk->pending_tx_info[peek].head
+		 == INVALID_PENDING_RING_IDX);
 
 	netbk->mmap_pages[pending_idx]->mapping = 0;
 	put_page(netbk->mmap_pages[pending_idx]);
@@ -1613,7 +1735,7 @@ static inline int rx_work_todo(struct xen_netbk *netbk)
 static inline int tx_work_todo(struct xen_netbk *netbk)
 {
 
-	if (((nr_pending_reqs(netbk) + MAX_SKB_FRAGS) < MAX_PENDING_REQS) &&
+	if (((nr_pending_reqs(netbk) + max_skb_slots) < MAX_PENDING_REQS) &&
 			!list_empty(&netbk->net_schedule_list))
 		return 1;
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 6/6] xen-netback: don't disconnect frontend when seeing oversize frame
From: Wei Liu @ 2013-03-25 11:08 UTC (permalink / raw)
  To: xen-devel, netdev
  Cc: ian.campbell, annie.li, konrad.wilk, david.vrabel, Wei Liu
In-Reply-To: <1364209702-12437-1-git-send-email-wei.liu2@citrix.com>

Some buggy frontends may generate frames larger than 64 KiB. We should
aggresively consume all slots and drop the packet instead of disconnecting the
frontend.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/netback.c |   32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index a634dc5..1971623 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -924,10 +924,12 @@ static void netbk_fatal_tx_err(struct xenvif *vif)
 static int netbk_count_requests(struct xenvif *vif,
 				struct xen_netif_tx_request *first,
 				struct xen_netif_tx_request *txp,
-				int work_to_do)
+				int work_to_do,
+				RING_IDX first_idx)
 {
 	RING_IDX cons = vif->tx.req_cons;
 	int slots = 0;
+	bool drop = false;
 
 	if (!(first->flags & XEN_NETTXF_more_data))
 		return 0;
@@ -947,10 +949,21 @@ static int netbk_count_requests(struct xenvif *vif,
 
 		memcpy(txp, RING_GET_REQUEST(&vif->tx, cons + slots),
 		       sizeof(*txp));
-		if (txp->size > first->size) {
-			netdev_err(vif->dev, "Packet is bigger than frame.\n");
-			netbk_fatal_tx_err(vif);
-			return -EIO;
+
+		/* If the guest submitted a frame >= 64 KiB then
+		 * first->size overflowed and following slots will
+		 * appear to be larger than the frame.
+		 *
+		 * This cannot be fatal error as there are buggy
+		 * frontends that do this.
+		 *
+		 * Consume all slots and drop the packet.
+		 */
+		if (!drop && txp->size > first->size) {
+			if (net_ratelimit())
+				netdev_dbg(vif->dev,
+					   "Packet is bigger than frame.\n");
+			drop = true;
 		}
 
 		first->size -= txp->size;
@@ -963,6 +976,12 @@ static int netbk_count_requests(struct xenvif *vif,
 			return -EINVAL;
 		}
 	} while ((txp++)->flags & XEN_NETTXF_more_data);
+
+	if (drop) {
+		netbk_tx_err(vif, first, first_idx + slots);
+		return -EIO;
+	}
+
 	return slots;
 }
 
@@ -1429,7 +1448,8 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 				continue;
 		}
 
-		ret = netbk_count_requests(vif, &txreq, txfrags, work_to_do);
+		ret = netbk_count_requests(vif, &txreq, txfrags,
+					   work_to_do, idx);
 		if (unlikely(ret < 0))
 			continue;
 
-- 
1.7.10.4

^ permalink raw reply related

* [Eulerkernel] [PATCH] af_unix: dont send SCM_CREDENTIAL when dest socket is NULL
From: dingtianhong @ 2013-03-25 10:28 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, netdev, Li Zefan, Xinwei Hu

SCM_SCREDENTIALS should apply to write() syscalls only either source or destination
socket asserted SOCK_PASSCRED. The original implememtation in maybe_add_creds is wrong,
and breaks several LSB testcases ( i.e. /tset/LSB.os/netowkr/recvfrom/T.recvfrom).

Origionally-authored-by: Karel Srot <ksrot@redhat.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
   net/unix/af_unix.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 51be64f..99189fd 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1413,8 +1413,8 @@ static void maybe_add_creds(struct sk_buff *skb, const struct socket *sock,
          if (UNIXCB(skb).cred)
                  return;
          if (test_bit(SOCK_PASSCRED, &sock->flags) ||
-           !other->sk_socket ||
-           test_bit(SOCK_PASSCRED, &other->sk_socket->flags)) {
+           (other->sk_socket &&
+           test_bit(SOCK_PASSCRED, &other->sk_socket->flags))) {
                  UNIXCB(skb).pid  = get_pid(task_tgid(current));
                  UNIXCB(skb).cred = get_current_cred();
          }
-- 

^ permalink raw reply related

* [PATCH net-next 2/2] net/davinci_emac: use clk_{prepare|unprepare}
From: Sekhar Nori @ 2013-03-25  9:25 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, davinci-linux-open-source, Sekhar Nori, Mike Turquette
In-Reply-To: <1364203547-2960-1-git-send-email-nsekhar@ti.com>

Use clk_prepare()/clk_unprepare() in the driver since common
clock framework needs these to be called before clock is enabled.

This is in preparation of common clock framework migration
for DaVinci.

Cc: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
---
 drivers/net/ethernet/ti/davinci_emac.c |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index b1349b5..436296c 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -350,6 +350,7 @@ struct emac_priv {
 	/*platform specific members*/
 	void (*int_enable) (void);
 	void (*int_disable) (void);
+	struct clk *clk;
 };
 
 /* EMAC TX Host Error description strings */
@@ -1870,19 +1871,29 @@ static int davinci_emac_probe(struct platform_device *pdev)
 		dev_err(&pdev->dev, "failed to get EMAC clock\n");
 		return -EBUSY;
 	}
+
+	rc = clk_prepare(emac_clk);
+	if (rc) {
+		dev_err(&pdev->dev, "emac clock prepare failed.\n");
+		return rc;
+	}
+
 	emac_bus_frequency = clk_get_rate(emac_clk);
 
 	/* TODO: Probe PHY here if possible */
 
 	ndev = alloc_etherdev(sizeof(struct emac_priv));
-	if (!ndev)
-		return -ENOMEM;
+	if (!ndev) {
+		rc = -ENOMEM;
+		goto no_etherdev;
+	}
 
 	platform_set_drvdata(pdev, ndev);
 	priv = netdev_priv(ndev);
 	priv->pdev = pdev;
 	priv->ndev = ndev;
 	priv->msg_enable = netif_msg_init(debug_level, DAVINCI_EMAC_DEBUG);
+	priv->clk = emac_clk;
 
 	spin_lock_init(&priv->lock);
 
@@ -2020,6 +2031,8 @@ no_cpdma_chan:
 	cpdma_ctlr_destroy(priv->dma);
 no_pdata:
 	free_netdev(ndev);
+no_etherdev:
+	clk_unprepare(emac_clk);
 	return rc;
 }
 
@@ -2048,6 +2061,8 @@ static int davinci_emac_remove(struct platform_device *pdev)
 	unregister_netdev(ndev);
 	free_netdev(ndev);
 
+	clk_unprepare(priv->clk);
+
 	return 0;
 }
 
-- 
1.7.10.1

^ permalink raw reply related

* [PATCH net-next 1/2] net/davinci_emac: use devres APIs
From: Sekhar Nori @ 2013-03-25  9:25 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, davinci-linux-open-source, Sekhar Nori
In-Reply-To: <1364203547-2960-1-git-send-email-nsekhar@ti.com>

Use devres APIs where possible to simplify error handling
in driver probe.

While at it, also rename the goto targets in error path to
introduce some consistency in how they are named.

Signed-off-by: Sekhar Nori <nsekhar@ti.com>
---
 drivers/net/ethernet/ti/davinci_emac.c |   46 +++++++++++---------------------
 1 file changed, 16 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index ae1b77a..b1349b5 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -1865,21 +1865,18 @@ static int davinci_emac_probe(struct platform_device *pdev)
 
 
 	/* obtain emac clock from kernel */
-	emac_clk = clk_get(&pdev->dev, NULL);
+	emac_clk = devm_clk_get(&pdev->dev, NULL);
 	if (IS_ERR(emac_clk)) {
 		dev_err(&pdev->dev, "failed to get EMAC clock\n");
 		return -EBUSY;
 	}
 	emac_bus_frequency = clk_get_rate(emac_clk);
-	clk_put(emac_clk);
 
 	/* TODO: Probe PHY here if possible */
 
 	ndev = alloc_etherdev(sizeof(struct emac_priv));
-	if (!ndev) {
-		rc = -ENOMEM;
-		goto no_ndev;
-	}
+	if (!ndev)
+		return -ENOMEM;
 
 	platform_set_drvdata(pdev, ndev);
 	priv = netdev_priv(ndev);
@@ -1893,7 +1890,7 @@ static int davinci_emac_probe(struct platform_device *pdev)
 	if (!pdata) {
 		dev_err(&pdev->dev, "no platform data\n");
 		rc = -ENODEV;
-		goto probe_quit;
+		goto no_pdata;
 	}
 
 	/* MAC addr and PHY mask , RMII enable info from platform_data */
@@ -1913,23 +1910,23 @@ static int davinci_emac_probe(struct platform_device *pdev)
 	if (!res) {
 		dev_err(&pdev->dev,"error getting res\n");
 		rc = -ENOENT;
-		goto probe_quit;
+		goto no_pdata;
 	}
 
 	priv->emac_base_phys = res->start + pdata->ctrl_reg_offset;
 	size = resource_size(res);
-	if (!request_mem_region(res->start, size, ndev->name)) {
+	if (!devm_request_mem_region(&pdev->dev, res->start,
+				     size, ndev->name)) {
 		dev_err(&pdev->dev, "failed request_mem_region() for regs\n");
 		rc = -ENXIO;
-		goto probe_quit;
+		goto no_pdata;
 	}
 
-	priv->remap_addr = ioremap(res->start, size);
+	priv->remap_addr = devm_ioremap(&pdev->dev, res->start, size);
 	if (!priv->remap_addr) {
 		dev_err(&pdev->dev, "unable to map IO\n");
 		rc = -ENOMEM;
-		release_mem_region(res->start, size);
-		goto probe_quit;
+		goto no_pdata;
 	}
 	priv->emac_base = priv->remap_addr + pdata->ctrl_reg_offset;
 	ndev->base_addr = (unsigned long)priv->remap_addr;
@@ -1962,7 +1959,7 @@ static int davinci_emac_probe(struct platform_device *pdev)
 	if (!priv->dma) {
 		dev_err(&pdev->dev, "error initializing DMA\n");
 		rc = -ENOMEM;
-		goto no_dma;
+		goto no_pdata;
 	}
 
 	priv->txchan = cpdma_chan_create(priv->dma, tx_chan_num(EMAC_DEF_TX_CH),
@@ -1971,14 +1968,14 @@ static int davinci_emac_probe(struct platform_device *pdev)
 				       emac_rx_handler);
 	if (WARN_ON(!priv->txchan || !priv->rxchan)) {
 		rc = -ENOMEM;
-		goto no_irq_res;
+		goto no_cpdma_chan;
 	}
 
 	res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
 	if (!res) {
 		dev_err(&pdev->dev, "error getting irq res\n");
 		rc = -ENOENT;
-		goto no_irq_res;
+		goto no_cpdma_chan;
 	}
 	ndev->irq = res->start;
 
@@ -2000,7 +1997,7 @@ static int davinci_emac_probe(struct platform_device *pdev)
 	if (rc) {
 		dev_err(&pdev->dev, "error in register_netdev\n");
 		rc = -ENODEV;
-		goto no_irq_res;
+		goto no_cpdma_chan;
 	}
 
 
@@ -2015,20 +2012,14 @@ static int davinci_emac_probe(struct platform_device *pdev)
 
 	return 0;
 
-no_irq_res:
+no_cpdma_chan:
 	if (priv->txchan)
 		cpdma_chan_destroy(priv->txchan);
 	if (priv->rxchan)
 		cpdma_chan_destroy(priv->rxchan);
 	cpdma_ctlr_destroy(priv->dma);
-no_dma:
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	release_mem_region(res->start, resource_size(res));
-	iounmap(priv->remap_addr);
-
-probe_quit:
+no_pdata:
 	free_netdev(ndev);
-no_ndev:
 	return rc;
 }
 
@@ -2041,14 +2032,12 @@ no_ndev:
  */
 static int davinci_emac_remove(struct platform_device *pdev)
 {
-	struct resource *res;
 	struct net_device *ndev = platform_get_drvdata(pdev);
 	struct emac_priv *priv = netdev_priv(ndev);
 
 	dev_notice(&ndev->dev, "DaVinci EMAC: davinci_emac_remove()\n");
 
 	platform_set_drvdata(pdev, NULL);
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 
 	if (priv->txchan)
 		cpdma_chan_destroy(priv->txchan);
@@ -2056,10 +2045,7 @@ static int davinci_emac_remove(struct platform_device *pdev)
 		cpdma_chan_destroy(priv->rxchan);
 	cpdma_ctlr_destroy(priv->dma);
 
-	release_mem_region(res->start, resource_size(res));
-
 	unregister_netdev(ndev);
-	iounmap(priv->remap_addr);
 	free_netdev(ndev);
 
 	return 0;
-- 
1.7.10.1

^ permalink raw reply related

* [PATCH net-next 0/2] net/davinci_emac updates
From: Sekhar Nori @ 2013-03-25  9:25 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, davinci-linux-open-source, Sekhar Nori

This patch series converts davinci_emac driver to use
clk_prepare/unprepare in preparation towards common
clock migration for DaVinci. While at it, also introduce
a patch to start using devres APIs to simplify error
handling during probe.

Sekhar Nori (2):
  net/davinci_emac: use devres APIs
  net/davinci_emac: use clk_{prepare|unprepare}

 drivers/net/ethernet/ti/davinci_emac.c |   55 ++++++++++++++++----------------
 1 file changed, 28 insertions(+), 27 deletions(-)

-- 
1.7.10.1

^ permalink raw reply

* Re: [PATCH 3/3] net/macb: make clk_enable atomic
From: Steffen Trumtrar @ 2013-03-25  8:33 UTC (permalink / raw)
  To: David Miller; +Cc: festevam, netdev, nicolas.ferre
In-Reply-To: <20130324.170849.1662647665547155225.davem@davemloft.net>

On Sun, Mar 24, 2013 at 05:08:49PM -0400, David Miller wrote:
> From: Fabio Estevam <festevam@gmail.com>
> Date: Fri, 22 Mar 2013 14:38:39 -0300
> 
> > On Fri, Mar 22, 2013 at 2:33 PM, Steffen Trumtrar
> > <s.trumtrar@pengutronix.de> wrote:
> >> Use clk_prepare_enable to be safe on SMP systems.
> > 
> > Wouldn't you have to use clk_disable_unprepare() now?
> 
> Indeed I think he does.
> 

I will fix that of course.

Thanks,
Steffen

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply

* [PATCH 1/1] ethernet driver: dm9000: davicom: Upgrade the driver to suit for all DM9000 series chips
From: Joseph CHANG @ 2013-03-25  8:04 UTC (permalink / raw)
  To: David S. Miller, Bill Pemberton, Matthew Leach,
	Greg Kroah-Hartman, Joseph CHANG, Jiri Pirko, netdev
  Cc: linux-kernel, Joseph CHANG

From: Joseph CHANG <joseph_chang@davicom.com.tw>

Some necessary modification for DM9000 series chips to be better in operation!

Had tested to all of DM9000 series chips, include DM9000E, DM9000A, DM9000B,
and DM9000C in X86 and ARM embedded Linux system these years.

Signed-off-by: Joseph CHANG <joseph_chang@davicom.com.tw>
---
 drivers/net/ethernet/davicom/dm9000.c |   28 +++++++++++++++++++++++-----
 drivers/net/ethernet/davicom/dm9000.h |    4 +++-
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/davicom/dm9000.c b/drivers/net/ethernet/davicom/dm9000.c
index 8cdf025..54bdc92 100644
--- a/drivers/net/ethernet/davicom/dm9000.c
+++ b/drivers/net/ethernet/davicom/dm9000.c
@@ -47,7 +47,8 @@
 #define DM9000_PHY		0x40	/* PHY address 0x01 */
 
 #define CARDNAME	"dm9000"
-#define DRV_VERSION	"1.31"
+/* [KERN-ADD-1](upgrade) */
+#define DRV_VERSION	"1.39"
 
 /*
  * Transmit timeout, default 5 seconds.
@@ -671,10 +672,16 @@ dm9000_poll_work(struct work_struct *w)
 			if (netif_msg_link(db))
 				dm9000_show_carrier(db, new_carrier, nsr);
 
-			if (!new_carrier)
+			/* [KERN-ADD-2] */
+			if (!new_carrier) {
 				netif_carrier_off(ndev);
-			else
+				netdev_info(ndev, "[%s Ethernet Driver, V%s]: Link-Down!!\n",
+					CARDNAME, DRV_VERSION);
+			} else {
 				netif_carrier_on(ndev);
+				netdev_info(ndev, "[%s Ethernet Driver, V%s]: Link-Up!!\n",
+					CARDNAME, DRV_VERSION);
+			}
 		}
 	} else
 		mii_check_media(&db->mii, netif_msg_link(db), 0);
@@ -794,6 +801,9 @@ dm9000_init_dm9000(struct net_device *dev)
 			(dev->features & NETIF_F_RXCSUM) ? RCSR_CSUM : 0);
 
 	iow(db, DM9000_GPCR, GPCR_GEP_CNTL);	/* Let GPIO0 output */
+    /* [KERN-ADD-3] */
+	dm9000_phy_write(dev, 0, 0, 0x8000); /* reset PHY */
+	mdelay(2);
 
 	ncr = (db->flags & DM9000_PLATF_EXT_PHY) ? NCR_EXT_PHY : 0;
 
@@ -830,6 +840,8 @@ dm9000_init_dm9000(struct net_device *dev)
 	db->tx_pkt_cnt = 0;
 	db->queue_pkt_len = 0;
 	dev->trans_start = jiffies;
+    /* [KERN-ADD-4]	*/
+	dm9000_phy_write(dev, 0, 27, 0xE100);
 }
 
 /* Our watchdog timed out. Called by the networking layer */
@@ -1181,7 +1193,8 @@ dm9000_open(struct net_device *dev)
 
 	/* GPIO0 on pre-activate PHY, Reg 1F is not set by reset */
 	iow(db, DM9000_GPR, 0);	/* REG_1F bit0 activate phyxcer */
-	mdelay(1); /* delay needs by DM9000B */
+	/* [KERN-ADD-5](Enable PHY) */
+	mdelay(2); /* delay needs by DM9000B */
 
 	/* Initialize DM9000 board */
 	dm9000_reset(db);
@@ -1311,6 +1324,8 @@ dm9000_shutdown(struct net_device *dev)
 
 	/* RESET device */
 	dm9000_phy_write(dev, 0, MII_BMCR, BMCR_RESET);	/* PHY RESET */
+    /* [KERN-ADD-6](NOTICE)(by)("if (!db->wake_state) ..") */
+    /* iow(db, DM9000_GPR, 0x01); */
 	iow(db, DM9000_GPR, 0x01);	/* Power-Down PHY */
 	iow(db, DM9000_IMR, IMR_PAR);	/* Disable all interrupt */
 	iow(db, DM9000_RCR, 0x00);	/* Disable RX */
@@ -1465,6 +1480,8 @@ dm9000_probe(struct platform_device *pdev)
 	/* fill in parameters for net-dev structure */
 	ndev->base_addr = (unsigned long)db->io_addr;
 	ndev->irq	= db->irq_res->start;
+    /* [KERN-ADD-7] */
+	netdev_info(ndev, "[dm9] %s ndev->irq=%x\n", __func__, ndev->irq);
 
 	/* ensure at least we have a default set of IO routines */
 	dm9000_set_io(db, iosize);
@@ -1502,7 +1519,8 @@ dm9000_probe(struct platform_device *pdev)
 	db->flags |= DM9000_PLATF_SIMPLE_PHY;
 #endif
 
-	dm9000_reset(db);
+    /* [KERN-ADD-8](SPECIAL WHEN INIT_OF_PROBE)[takeover"dm9000_reset(db)"] */
+	iow(db, DM9000_NCR, NCR_MAC_LBK | NCR_RST); /* 0x03 */
 
 	/* try multiple times, DM9000 sometimes gets the read wrong */
 	for (i = 0; i < 8; i++) {
diff --git a/drivers/net/ethernet/davicom/dm9000.h b/drivers/net/ethernet/davicom/dm9000.h
index 55688bd..d644506 100644
--- a/drivers/net/ethernet/davicom/dm9000.h
+++ b/drivers/net/ethernet/davicom/dm9000.h
@@ -69,7 +69,9 @@
 #define NCR_WAKEEN          (1<<6)
 #define NCR_FCOL            (1<<4)
 #define NCR_FDX             (1<<3)
-#define NCR_LBK             (3<<1)
+
+#define NCR_RESERVED        (3<<1)
+#define NCR_MAC_LBK         (1<<1) /* [kern-add-0] */
 #define NCR_RST	            (1<<0)
 
 #define NSR_SPEED           (1<<7)
-- 
1.7.1

^ permalink raw reply related

* RX/dropped counter values for tagged packets
From: Ani Sinha @ 2013-03-25  4:43 UTC (permalink / raw)
  To: netdev, linuxdev, linux-netdev

hello everyone :

Please bear with me if you think I said something wrong.

I was looking at the netif_receive_skb() code and it seems to me that
for vlan tagged packets, the counters are manipulated at two different
places. One is from vlan_do_receive() where the receive counters are
incremented. The other counter (the dropped counter) is incremented at
the very end of netif_receive_skb() when there are no takers for this
packet. The dropped counts are incremented for skb->dev->rx_dropped
whereas the receive counts are incremented for the vlan device
specific private stat counter values. When the numbers are reported on
/proc/net/dev (from where ifconfig gets the numbers), these two stat
counters are combined together (see dev_get_stats() where rx_dropped
is added to the device private stat numbers). So if no one is
receiving the tagged packets, it will be reported both as received and
dropped! I wonder whether I am analyzing it wrong or whether no one
has noticed this before or whether this is intentional.

Any  clarification will be greatly appreciated. Again, I could be
wrong looking at the code, so please bear with me.

Thanks

ani

^ permalink raw reply

* [PATCH v2 1/1]core:Change a wrong explain about dev_get_by_name
From: tingwei liu @ 2013-03-25  4:10 UTC (permalink / raw)
  To: netdev, Alexey Kuznetsov, davem, Eric Dumazet, Ben Hutchings,
	Linus Torvalds

From: Tingwei Liu <tingw.liu@gmail.com>
Date: Mon, 25 Mar 2013 11:09:59 +0800
Subject: [PATCH] Change a wrong explain about dev_get_by_name

a long time ago by the  commit.

 commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
 Author: Linus Torvalds <torvalds@ppc970.osdl.org>
 Date:   Sat Apr 16 15:20:36 2005 -0700

    Linux-2.6.12-rc2

    Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

When function "list_netdevice" get write lock "dev_base_lock" only
disable soft interrupt. So dev_get_by_name get read lock
"dev_base_lock", can not called on interrupt context.


Signed-off-by: Tingwei Liu <tingw.liu@gmail.com>
---
Changes from v1:
- add change log
- add the commnet

 net/core/dev.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index d540ced..20c5c7c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -700,10 +700,10 @@ EXPORT_SYMBOL(dev_get_by_name_rcu);
  *     @name: name to find
  *
  *     Find an interface by name. This can be called from any
- *     context and does its own locking. The returned handle has
- *     the usage count incremented and the caller must use dev_put() to
- *     release it when it is no longer needed. %NULL is returned if no
- *     matching device is found.
+ *     context except hard irq context and does its own locking.
+ *     The returned handle has the usage count incremented and the
+ *     caller must use dev_put() to release it when it is no longer
+ *     needed. %NULL is returned if no matching device is found.
  */

 struct net_device *dev_get_by_name(struct net *net, const char *name)
--
1.6.0.2

^ permalink raw reply related

* Re: [PATCH net 0/3] ipv4: Fix ip-header identification for gso packets.
From: Cong Wang @ 2013-03-25  3:48 UTC (permalink / raw)
  To: Pravin B Shelar; +Cc: davem, netdev
In-Reply-To: <1364182567-1667-1-git-send-email-pshelar@nicira.com>

On Sun, 2013-03-24 at 20:36 -0700, Pravin B Shelar wrote:
> Following patch fixes ip-header identification according to comments from
> David Miller and Cong Wang.
> First two patches reverts fixes to upper layer gso handlers which is not
> required after fixing ipv4 gso handler.
> 

Looks good to me now:

Acked-by: Cong Wang <amwang@redhat.com>

Thanks!

^ permalink raw reply

* [PATCH net 3/3] ipv4: Fix ip-header identification for gso packets.
From: Pravin B Shelar @ 2013-03-25  3:36 UTC (permalink / raw)
  To: davem, netdev; +Cc: amwang, Pravin B Shelar

ip-header id needs to be incremented even if IP_DF flag is set.
This behaviour was changed in commit 490ab08127cebc25e3a26
(IP_GRE: Fix IP-Identification).

Following patch fixes it so that identification is always
incremented.

Reported-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
---
 include/net/ipip.h |   16 ++++++----------
 net/ipv4/af_inet.c |    3 +--
 2 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/include/net/ipip.h b/include/net/ipip.h
index 0c046e3..483b91a 100644
--- a/include/net/ipip.h
+++ b/include/net/ipip.h
@@ -74,15 +74,11 @@ static inline void tunnel_ip_select_ident(struct sk_buff *skb,
 {
 	struct iphdr *iph = ip_hdr(skb);
 
-	if (iph->frag_off & htons(IP_DF))
-		iph->id	= 0;
-	else {
-		/* Use inner packet iph-id if possible. */
-		if (skb->protocol == htons(ETH_P_IP) && old_iph->id)
-			iph->id	= old_iph->id;
-		else
-			__ip_select_ident(iph, dst,
-					  (skb_shinfo(skb)->gso_segs ?: 1) - 1);
-	}
+	/* Use inner packet iph-id if possible. */
+	if (skb->protocol == htons(ETH_P_IP) && old_iph->id)
+		iph->id	= old_iph->id;
+	else
+		__ip_select_ident(iph, dst,
+				  (skb_shinfo(skb)->gso_segs ?: 1) - 1);
 }
 #endif
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 9e5882c..70b2d4c 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1334,8 +1334,7 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 				iph->frag_off |= htons(IP_MF);
 			offset += (skb->len - skb->mac_len - iph->ihl * 4);
 		} else  {
-			if (!(iph->frag_off & htons(IP_DF)))
-				iph->id = htons(id++);
+			iph->id = htons(id++);
 		}
 		iph->tot_len = htons(skb->len - skb->mac_len);
 		iph->check = 0;
-- 
1.7.1

^ permalink raw reply related

* [PATCH net 2/3] Revert "udp: increase inner ip header ID during segmentation"
From: Pravin B Shelar @ 2013-03-25  3:36 UTC (permalink / raw)
  To: davem, netdev; +Cc: amwang, Pravin B Shelar

This reverts commit d6a8c36dd6f6f06f046e5c61d3fb39b777c3bdc6.
Next commit makes this commit unnecessary.

CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
---
 net/ipv4/udp.c |    7 +------
 1 files changed, 1 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 3c362ee..7117d14 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2304,8 +2304,7 @@ static struct sk_buff *skb_udp_tunnel_segment(struct sk_buff *skb,
 	struct sk_buff *segs = ERR_PTR(-EINVAL);
 	int mac_len = skb->mac_len;
 	int tnl_hlen = skb_inner_mac_header(skb) - skb_transport_header(skb);
-	int outer_hlen, id;
-	struct iphdr *iph;
+	int outer_hlen;
 	netdev_features_t enc_features;
 
 	if (unlikely(!pskb_may_pull(skb, tnl_hlen)))
@@ -2316,8 +2315,6 @@ static struct sk_buff *skb_udp_tunnel_segment(struct sk_buff *skb,
 	skb_reset_mac_header(skb);
 	skb_set_network_header(skb, skb_inner_network_offset(skb));
 	skb->mac_len = skb_inner_network_offset(skb);
-	iph = ip_hdr(skb);
-	id = ntohs(iph->id);
 
 	/* segment inner packet. */
 	enc_features = skb->dev->hw_enc_features & netif_skb_features(skb);
@@ -2332,8 +2329,6 @@ static struct sk_buff *skb_udp_tunnel_segment(struct sk_buff *skb,
 		int udp_offset = outer_hlen - tnl_hlen;
 
 		skb->mac_len = mac_len;
-		iph = (struct iphdr *)skb_inner_network_header(skb);
-		iph->id = htons(id++);
 
 		skb_push(skb, outer_hlen);
 		skb_reset_mac_header(skb);
-- 
1.7.1

^ permalink raw reply related

* [PATCH net 1/3] Revert "ip_gre: increase inner ip header ID during segmentation"
From: Pravin B Shelar @ 2013-03-25  3:36 UTC (permalink / raw)
  To: davem, netdev; +Cc: amwang, Pravin B Shelar

This reverts commit 10c0d7ed32b7c273970a20e211c08ab46fea3c26.
Next commit makes this commit unnecessary.

CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
---
 net/ipv4/gre.c |    7 +------
 1 files changed, 1 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/gre.c b/net/ipv4/gre.c
index e20631c..7a4c710 100644
--- a/net/ipv4/gre.c
+++ b/net/ipv4/gre.c
@@ -125,9 +125,8 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
 	netdev_features_t enc_features;
 	int ghl = GRE_HEADER_SECTION;
 	struct gre_base_hdr *greh;
-	struct iphdr *iph;
 	int mac_len = skb->mac_len;
-	int tnl_hlen, id;
+	int tnl_hlen;
 	bool csum;
 
 	if (unlikely(skb_shinfo(skb)->gso_type &
@@ -171,8 +170,6 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
 	skb_set_network_header(skb, skb_inner_network_offset(skb));
 	skb->mac_len = skb_inner_network_offset(skb);
 
-	iph = ip_hdr(skb);
-	id = ntohs(iph->id);
 	/* segment inner packet. */
 	enc_features = skb->dev->hw_enc_features & netif_skb_features(skb);
 	segs = skb_mac_gso_segment(skb, enc_features);
@@ -182,8 +179,6 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
 	skb = segs;
 	tnl_hlen = skb_tnl_header_len(skb);
 	do {
-		iph = (struct iphdr *)skb->data;
-		iph->id = htons(id++);
 		__skb_push(skb, ghl);
 		if (csum) {
 			__be32 *pcsum;
-- 
1.7.1

^ permalink raw reply related

* [PATCH net 0/3] ipv4: Fix ip-header identification for gso packets.
From: Pravin B Shelar @ 2013-03-25  3:36 UTC (permalink / raw)
  To: davem, netdev; +Cc: amwang, Pravin B Shelar

Following patch fixes ip-header identification according to comments from
David Miller and Cong Wang.
First two patches reverts fixes to upper layer gso handlers which is not
required after fixing ipv4 gso handler.

Pravin B Shelar (3):
  Revert "ip_gre: increase inner ip header ID during segmentation"
  Revert "udp: increase inner ip header ID during segmentation"
  ipv4: Fix ip-header identification for gso packets.

 include/net/ipip.h |   16 ++++++----------
 net/ipv4/af_inet.c |    3 +--
 net/ipv4/gre.c     |    7 +------
 net/ipv4/udp.c     |    7 +------
 4 files changed, 9 insertions(+), 24 deletions(-)

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox