* Re: [PATCH] net: ipmr: limit MRT_TABLE identifiers
From: Eric Dumazet @ 2012-11-26 2:55 UTC (permalink / raw)
To: Chen Gang; +Cc: David Miller, netdev
In-Reply-To: <50B2D42A.6090804@asianux.com>
On Mon, 2012-11-26 at 10:30 +0800, Chen Gang wrote:
> maybe a hacker can do ? (I just guess).
>
> for my experience:
> It is necessary to think of more coding ways, when we have to give comments inside a function.
I have absolutely no idea of what you are trying to say.
^ permalink raw reply
* Re: [RFC net-next PATCH V1 5/9] net: frag per CPU mem limit and LRU list accounting
From: Cong Wang @ 2012-11-26 2:54 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: Eric Dumazet, David S. Miller, Florian Westphal, netdev,
Pablo Neira Ayuso, Thomas Graf, Patrick McHardy, Paul E. McKenney,
Herbert Xu
In-Reply-To: <20121123130826.18764.66507.stgit@dragon>
On Fri, 2012-11-23 at 14:08 +0100, Jesper Dangaard Brouer wrote:
> + // QUESTION: Please advice in a better alloc solution than
> NR_CPUS
> + struct cpu_resource percpu[NR_CPUS];
alloc_percpu(struct cpu_resource) ?
BTW, struct cpu_resource is not a good name here, it is too generic.
^ permalink raw reply
* Re: [PATCH] net: ipmr: limit MRT_TABLE identifiers
From: Chen Gang @ 2012-11-26 3:13 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1353898520.30446.868.camel@edumazet-glaptop>
于 2012年11月26日 10:55, Eric Dumazet 写道:
> On Mon, 2012-11-26 at 10:30 +0800, Chen Gang wrote:
>
>> maybe a hacker can do ? (I just guess).
>>
>> for my experience:
>> It is necessary to think of more coding ways, when we have to give comments inside a function.
>
> I have absolutely no idea of what you are trying to say.
>
excuse me, my English is not quite well.
I will try to say it as clear as I can.
what I want to say are:
for os api, we need describe them as full as we can (including limitations, although it seems minor).
I do not suggest to give comments inside a fuction (especially coding with hard code number).
they are just my suggestion, not mean, it should be regression.
at least, I think this patch is valuable.
--
Chen Gang
Asianux Corporation
^ permalink raw reply
* [PATCH] ip6mr: Add sizeof verification to MRT6_ASSERT and MT6_PIM
From: Joe Perches @ 2012-11-26 4:26 UTC (permalink / raw)
To: David Miller; +Cc: eric.dumazet, netdev
In-Reply-To: <20121125.164036.1283541826012498001.davem@davemloft.net>
Verify the length of the user-space arguments.
Signed-off-by: Joe Perches <joe@perches.com>
---
net/ipv6/ip6mr.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 79bb490..926ea54 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -1646,6 +1646,9 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
case MRT6_ASSERT:
{
int v;
+
+ if (optlen != sizeof(v))
+ return -EINVAL;
if (get_user(v, (int __user *)optval))
return -EFAULT;
mrt->mroute_do_assert = v;
@@ -1656,6 +1659,9 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
case MRT6_PIM:
{
int v;
+
+ if (optlen != sizeof(v))
+ return -EINVAL;
if (get_user(v, (int __user *)optval))
return -EFAULT;
v = !!v;
^ permalink raw reply related
* linux-next: manual merge of the cgroup tree with the net-next tree
From: Stephen Rothwell @ 2012-11-26 4:31 UTC (permalink / raw)
To: Tejun Heo
Cc: linux-next-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Daniel Wagner, David Miller,
netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA
[-- Attachment #1: Type: text/plain, Size: 1203 bytes --]
Hi Tejun,
Today's linux-next merge of the cgroup tree got a conflict in
net/sched/cls_cgroup.c between commit 6a328d8c6f03 ("cgroup: net_cls:
Rework update socket logic") from the net-next tree and commits
92fb97487a7e ("cgroup: rename ->create/post_create/pre_destroy/destroy()
to ->css_alloc/online/offline/free()") and 0ba18f7a5e26 ("netcls_cgroup:
move config inheritance to ->css_online() and remove .broken_hierarchy
marking") from the cgroup tree.
I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).
--
Cheers,
Stephen Rothwell sfr-3FnU+UHB4dNDw9hX6IcOSA@public.gmane.org
diff --cc net/sched/cls_cgroup.c
index 709b0fb,31f06b6..0000000
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@@ -98,9 -79,9 +102,10 @@@ static struct cftype ss_files[] =
struct cgroup_subsys net_cls_subsys = {
.name = "net_cls",
- .create = cgrp_create,
- .destroy = cgrp_destroy,
+ .css_alloc = cgrp_css_alloc,
+ .css_online = cgrp_css_online,
+ .css_free = cgrp_css_free,
+ .attach = cgrp_attach,
.subsys_id = net_cls_subsys_id,
.base_cftypes = ss_files,
.module = THIS_MODULE,
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply
* Re: [PATCH 080/493] fddi: remove use of __devexit_p
From: Maciej W. Rozycki @ 2012-11-26 4:35 UTC (permalink / raw)
To: Greg KH; +Cc: Bill Pemberton, netdev
In-Reply-To: <20121122002337.GA16151@kroah.com>
On Wed, 21 Nov 2012, Greg KH wrote:
> > TURBOchannel, although valid, is an old exotic case that might not be
> > worth arguing for, except for purity maybe. But there are surely many
> > contemporary systems out there that are known they are never going to
> > support hot device replacement. Consider most of the embedded systems for
> > example, where devices may even physically be cast into a single SOC (with
> > no prospect of chipping off any pieces ever ;) ), that certainly could not
> > care less of device replacement, but they do care a lot about memory
> > consumption.
>
> Even those don't care about less than 5k of memory, do they?
I guess you're right. As long as it's not 5k per driver +
who_knows_how_much per platform for some generic stuff that is. Of course
this is still a waste, but I can accept it as a compromise between the use
of machine resources and the cost of maintenance.
> > Was this implication considered, discussed and diregarded as not
> > important enough compared to benefits from hardcoding HOTPLUG support?
>
> Yes, I don't know of any modern system that does not enable
> CONFIG_HOTPLUG, do you?
I've never enabled it outside x86 to be honest. None of the platforms I
care of supports it in hardware.
> > I'm seriously asking for a pointer, not trying to cause any stir-up --
> > regrettably I fail to follow most discussions these days, but I would like
> > to know what the background behind this decision was. Thanks a lot!
>
> See Russell's response in this thread for details if you are curious.
Hmm, not in my inbox for some reason (unless I'm blind), but thanks for
the pointer, I'll see if I can chase it online in some mailing list
archive.
Maciej
^ permalink raw reply
* [net-next:master 20/25] warning: (NET_DSA_TAG_DSA && NET_DSA_TAG_EDSA && NET_DSA_TAG_TRAILER) selects NET_DSA which has unmet direct dependencies (NET && EXPERIMENTAL && NETDEVICES && !S390)
From: kbuild test robot @ 2012-11-26 4:56 UTC (permalink / raw)
To: viresh kumar; +Cc: netdev
tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head: 53d6841d225b93c20d561878637c3cd307c11648
commit: 82167cb8c6b2f8166d5c7532e5ef4b5e0cc46a72 [20/25] net: dsa/slave: Fix compilation warnings
config: make ARCH=s390 allmodconfig
All warnings:
warning: (NET_DSA_TAG_DSA && NET_DSA_TAG_EDSA && NET_DSA_TAG_TRAILER) selects NET_DSA which has unmet direct dependencies (NET && EXPERIMENTAL && NETDEVICES && !S390)
warning: (NET_DSA_TAG_DSA && NET_DSA_TAG_EDSA && NET_DSA_TAG_TRAILER) selects NET_DSA which has unmet direct dependencies (NET && EXPERIMENTAL && NETDEVICES && !S390)
--
warning: (NET_DSA_TAG_DSA && NET_DSA_TAG_EDSA && NET_DSA_TAG_TRAILER) selects NET_DSA which has unmet direct dependencies (NET && EXPERIMENTAL && NETDEVICES && !S390)
---
0-DAY kernel build testing backend Open Source Technology Center
Fengguang Wu, Yuanhan Liu Intel Corporation
^ permalink raw reply
* [PATCH net-next] net: clean up locking in inet_frag_find()
From: Cong Wang @ 2012-11-26 5:18 UTC (permalink / raw)
To: netdev; +Cc: Patrick McHardy, Pablo Neira Ayuso, David S. Miller, Cong Wang
It is weird to take the read lock outside of inet_frag_find()
but release it inside... Also, the hash function doesn't
need this lock.
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
---
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index 4750d2b..caf1807 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -272,11 +272,11 @@ static struct inet_frag_queue *inet_frag_create(struct netns_frags *nf,
struct inet_frag_queue *inet_frag_find(struct netns_frags *nf,
struct inet_frags *f, void *key, unsigned int hash)
- __releases(&f->lock)
{
struct inet_frag_queue *q;
struct hlist_node *n;
+ read_lock(&f->lock);
hlist_for_each_entry(q, n, &f->hash[hash], list) {
if (q->net == nf && f->match(q, key)) {
atomic_inc(&q->refcnt);
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 1cf6a76..d9541f4 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -295,9 +295,7 @@ static inline struct ipq *ip_find(struct net *net, struct iphdr *iph, u32 user)
arg.iph = iph;
arg.user = user;
- read_lock(&ip4_frags.lock);
hash = ipqhashfn(iph->id, iph->saddr, iph->daddr, iph->protocol);
-
q = inet_frag_find(&net->ipv4.frags, &ip4_frags, &arg, hash);
if (q == NULL)
goto out_nomem;
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index 22c8ea9..b778423 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -175,9 +175,8 @@ static inline struct frag_queue *fq_find(struct net *net, __be32 id,
arg.src = src;
arg.dst = dst;
- read_lock_bh(&nf_frags.lock);
hash = inet6_hash_frag(id, src, dst, nf_frags.rnd);
-
+ local_bh_disable();
q = inet_frag_find(&net->nf_frag.frags, &nf_frags, &arg, hash);
local_bh_enable();
if (q == NULL)
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index e5253ec..15e09a6 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -193,9 +193,7 @@ fq_find(struct net *net, __be32 id, const struct in6_addr *src, const struct in6
arg.src = src;
arg.dst = dst;
- read_lock(&ip6_frags.lock);
hash = inet6_hash_frag(id, src, dst, ip6_frags.rnd);
-
q = inet_frag_find(&net->ipv6.frags, &ip6_frags, &arg, hash);
if (q == NULL)
return NULL;
^ permalink raw reply related
* Re: [PATCH net-next] net: clean up locking in inet_frag_find()
From: Eric Dumazet @ 2012-11-26 5:43 UTC (permalink / raw)
To: Cong Wang; +Cc: netdev, Patrick McHardy, Pablo Neira Ayuso, David S. Miller
In-Reply-To: <1353907085-30824-1-git-send-email-amwang@redhat.com>
On Mon, 2012-11-26 at 13:18 +0800, Cong Wang wrote:
> It is weird to take the read lock outside of inet_frag_find()
> but release it inside... Also, the hash function doesn't
> need this lock.
This patch is wrong. hash needs the lock.
^ permalink raw reply
* Re: [PATCH net-next] net: clean up locking in inet_frag_find()
From: Cong Wang @ 2012-11-26 5:53 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, Patrick McHardy, Pablo Neira Ayuso, David S. Miller
In-Reply-To: <1353908592.30446.1136.camel@edumazet-glaptop>
On Sun, 2012-11-25 at 21:43 -0800, Eric Dumazet wrote:
> On Mon, 2012-11-26 at 13:18 +0800, Cong Wang wrote:
> > It is weird to take the read lock outside of inet_frag_find()
> > but release it inside... Also, the hash function doesn't
> > need this lock.
>
> This patch is wrong. hash needs the lock.
>
>
Indeed, I missed .rnd...
I am cooking a better patch now.
Thanks!
^ permalink raw reply
* Re: [RFC net-next PATCH V1 9/9] net: frag remove readers-writer lock (hack)
From: Stephen Hemminger @ 2012-11-26 6:03 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: Eric Dumazet, David S. Miller, Florian Westphal, netdev,
Pablo Neira Ayuso, Thomas Graf, Cong Wang, Patrick McHardy,
Paul E. McKenney, Herbert Xu
In-Reply-To: <20121123130847.18764.87682.stgit@dragon>
On Fri, 23 Nov 2012 14:08:47 +0100
Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> Do NOT APPLY this patch.
>
> After all the other patches, the rw_lock is now the contention point.
Reader-writer locks are significantly slower than a simple spinlock.
Unless the reader holds the lock for "significant" number of cycles,
a spin lock will be faster.
Did you try just changing it to a spinlock?
^ permalink raw reply
* Re: [net-next:master 20/25] warning: (NET_DSA_TAG_DSA && NET_DSA_TAG_EDSA && NET_DSA_TAG_TRAILER) selects NET_DSA which has unmet direct dependencies (NET && EXPERIMENTAL && NETDEVICES && !S390)
From: Viresh Kumar @ 2012-11-26 6:13 UTC (permalink / raw)
To: kbuild test robot, Ben Hutchings, davem; +Cc: netdev
In-Reply-To: <50b2f66a.czw+uqRm3Q/zfbER%fengguang.wu@intel.com>
Hi Ben/Dave,
On 26 November 2012 10:26, kbuild test robot <fengguang.wu@intel.com> wrote:
> tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
> head: 53d6841d225b93c20d561878637c3cd307c11648
> commit: 82167cb8c6b2f8166d5c7532e5ef4b5e0cc46a72 [20/25] net: dsa/slave: Fix compilation warnings
> config: make ARCH=s390 allmodconfig
>
> All warnings:
>
> warning: (NET_DSA_TAG_DSA && NET_DSA_TAG_EDSA && NET_DSA_TAG_TRAILER) selects NET_DSA which has unmet direct dependencies (NET && EXPERIMENTAL && NETDEVICES && !S390)
> warning: (NET_DSA_TAG_DSA && NET_DSA_TAG_EDSA && NET_DSA_TAG_TRAILER) selects NET_DSA which has unmet direct dependencies (NET && EXPERIMENTAL && NETDEVICES && !S390)
> --
> warning: (NET_DSA_TAG_DSA && NET_DSA_TAG_EDSA && NET_DSA_TAG_TRAILER) selects NET_DSA which has unmet direct dependencies (NET && EXPERIMENTAL && NETDEVICES && !S390)
I am not really sure how to fix this up. Can you help?
--
viresh
^ permalink raw reply
* Re: [PATCH RFC] [INET]: Get cirtical word in first 64bit of cache line
From: Eric Dumazet @ 2012-11-26 6:44 UTC (permalink / raw)
To: ling.ma.program; +Cc: linux-kernel, netdev
In-Reply-To: <1353900555-5966-1-git-send-email-ling.ma.program@gmail.com>
On Mon, 2012-11-26 at 11:29 +0800, ling.ma.program@gmail.com wrote:
> From: Ma Ling <ling.ma.program@gmail.com>
>
> In order to reduce memory latency when last level cache miss occurs,
> modern CPUs i.e. x86 and arm introduced Critical Word First(CWF) or
> Early Restart(ER) to get data ASAP. For CWF if critical word is first member
> in cache line, memory feed CPU with critical word, then fill others
> data in cache line one by one, otherwise after critical word it must
> cost more cycle to fill the remaining cache line. For Early First CPU will
> restart until critical word in cache line reaches.
>
> Hash value is critical word, so in this patch we place it as first member
> in cache line(sock address is cache-line aligned), and it is also good for
> Early Restart platform as well .
>
> Thanks
> Ling
networking patches should be sent to netdev.
(I understand this patch is more a generic one, but at least CC netdev)
You give no performance numbers for this change...
I never heard of this CWF/ER, where are the official Intel documents
about this, and what models really benefit from it ?
Also, why not moving skc_net as well ?
BTW, skc_daddr & skc_rcv_saddr are 'critical' as well, we use them in
INET_MATCH()
It seems we have a 32bit hole on 64bit arches, so we probably should
move inet_dport/inet_num in it. It could well remove a full cache line
miss (I'll send a patch for this after tests)
Thanks
^ permalink raw reply
* [PATCH net-next v2] net: clean up locking in inet_frag_find()
From: Cong Wang @ 2012-11-26 7:26 UTC (permalink / raw)
To: netdev
Cc: Eric Dumazet, Patrick McHardy, Pablo Neira Ayuso, David S. Miller,
Cong Wang
In-Reply-To: <1353909184.11282.3.camel@cr0>
It is weird to take the read lock outside of inet_frag_find()
but release it inside... This can be improved by refactoring
the code, that is, introducing inet{4,6}_frag_find() which call
the their own hash function, inet{4,6}_hash_frag(), hiding the
details from their callers.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
---
include/net/inet_frag.h | 17 +++++-
include/net/ipv6.h | 3 -
net/ipv4/inet_fragment.c | 82 +++++++++++++++++++++++++++++--
net/ipv4/ip_fragment.c | 16 +-----
net/ipv6/netfilter/nf_conntrack_reasm.c | 7 +--
net/ipv6/reassembly.c | 34 +------------
6 files changed, 97 insertions(+), 62 deletions(-)
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
index 32786a0..7314ec5 100644
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -1,6 +1,8 @@
#ifndef __NET_FRAG_H__
#define __NET_FRAG_H__
+#include <linux/in6.h>
+
struct netns_frags {
int nqueues;
atomic_t mem;
@@ -62,9 +64,18 @@ void inet_frag_kill(struct inet_frag_queue *q, struct inet_frags *f);
void inet_frag_destroy(struct inet_frag_queue *q,
struct inet_frags *f, int *work);
int inet_frag_evictor(struct netns_frags *nf, struct inet_frags *f, bool force);
-struct inet_frag_queue *inet_frag_find(struct netns_frags *nf,
- struct inet_frags *f, void *key, unsigned int hash)
- __releases(&f->lock);
+
+extern unsigned int inet4_hash_frag(__be16 id, __be32 saddr, __be32 daddr,
+ u8 prot, u32 rnd);
+struct inet_frag_queue *inet4_frag_find(struct netns_frags *nf,
+ struct inet_frags *f, void *key);
+
+#if IS_ENABLED(CONFIG_IPV6)
+extern unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
+ const struct in6_addr *daddr, u32 rnd);
+struct inet_frag_queue *inet6_frag_find(struct netns_frags *nf,
+ struct inet_frags *f, void *key);
+#endif
static inline void inet_frag_put(struct inet_frag_queue *q, struct inet_frags *f)
{
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 979bf6c..ba19ebc 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -695,9 +695,6 @@ extern int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf);
extern int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf,
struct group_filter __user *optval,
int __user *optlen);
-extern unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
- const struct in6_addr *daddr, u32 rnd);
-
#ifdef CONFIG_PROC_FS
extern int ac6_proc_init(struct net *net);
extern void ac6_proc_exit(struct net *net);
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index 4750d2b..7290238 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -20,7 +20,10 @@
#include <linux/skbuff.h>
#include <linux/rtnetlink.h>
#include <linux/slab.h>
+#include <linux/jhash.h>
+#include <linux/ip.h>
+#include <net/ipv6.h>
#include <net/inet_frag.h>
static void inet_frag_secret_rebuild(unsigned long dummy)
@@ -270,9 +273,8 @@ static struct inet_frag_queue *inet_frag_create(struct netns_frags *nf,
return inet_frag_intern(nf, q, f, arg);
}
-struct inet_frag_queue *inet_frag_find(struct netns_frags *nf,
+static struct inet_frag_queue *__inet_frag_find(struct netns_frags *nf,
struct inet_frags *f, void *key, unsigned int hash)
- __releases(&f->lock)
{
struct inet_frag_queue *q;
struct hlist_node *n;
@@ -280,12 +282,84 @@ struct inet_frag_queue *inet_frag_find(struct netns_frags *nf,
hlist_for_each_entry(q, n, &f->hash[hash], list) {
if (q->net == nf && f->match(q, key)) {
atomic_inc(&q->refcnt);
- read_unlock(&f->lock);
return q;
}
}
+
+ return NULL;
+}
+
+unsigned int inet4_hash_frag(__be16 id, __be32 saddr, __be32 daddr, u8 prot, u32 rnd)
+{
+ return jhash_3words((__force u32)id << 16 | prot,
+ (__force u32)saddr, (__force u32)daddr,
+ rnd) & (INETFRAGS_HASHSZ - 1);
+}
+EXPORT_SYMBOL_GPL(inet4_hash_frag);
+
+struct inet_frag_queue *inet4_frag_find(struct netns_frags *nf,
+ struct inet_frags *f, void *key)
+{
+ struct inet_frag_queue *q;
+ struct iphdr *iph;
+ unsigned int hash;
+
+ iph = *(struct iphdr **)key;
+
+ read_lock(&f->lock);
+ hash = inet4_hash_frag(iph->id, iph->saddr, iph->daddr, iph->protocol, f->rnd);
+ q = __inet_frag_find(nf, f, key, hash);
read_unlock(&f->lock);
+ if (q)
+ return q;
+ return inet_frag_create(nf, f, key);
+}
+EXPORT_SYMBOL(inet4_frag_find);
+
+#if IS_ENABLED(CONFIG_IPV6)
+/*
+ * callers should be careful not to use the hash value outside the ipfrag_lock
+ * as doing so could race with ipfrag_hash_rnd being recalculated.
+ */
+unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
+ const struct in6_addr *daddr, u32 rnd)
+{
+ u32 c;
+
+ c = jhash_3words((__force u32)saddr->s6_addr32[0],
+ (__force u32)saddr->s6_addr32[1],
+ (__force u32)saddr->s6_addr32[2],
+ rnd);
+
+ c = jhash_3words((__force u32)saddr->s6_addr32[3],
+ (__force u32)daddr->s6_addr32[0],
+ (__force u32)daddr->s6_addr32[1],
+ c);
+
+ c = jhash_3words((__force u32)daddr->s6_addr32[2],
+ (__force u32)daddr->s6_addr32[3],
+ (__force u32)id,
+ c);
+
+ return c & (INETFRAGS_HASHSZ - 1);
+}
+EXPORT_SYMBOL_GPL(inet6_hash_frag);
+
+struct inet_frag_queue *inet6_frag_find(struct netns_frags *nf,
+ struct inet_frags *f, void *key)
+{
+ struct inet_frag_queue *q;
+ struct ip6_create_arg *arg = key;
+ unsigned int hash;
+
+ read_lock(&f->lock);
+ hash = inet6_hash_frag(arg->id, arg->src, arg->dst, f->rnd);
+ q = __inet_frag_find(nf, f, key, hash);
+ read_unlock(&f->lock);
+ if (q)
+ return q;
return inet_frag_create(nf, f, key);
}
-EXPORT_SYMBOL(inet_frag_find);
+EXPORT_SYMBOL(inet6_frag_find);
+#endif
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 1cf6a76..0a02037 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -32,7 +32,6 @@
#include <linux/ip.h>
#include <linux/icmp.h>
#include <linux/netdevice.h>
-#include <linux/jhash.h>
#include <linux/random.h>
#include <linux/slab.h>
#include <net/route.h>
@@ -133,19 +132,12 @@ struct ip4_create_arg {
u32 user;
};
-static unsigned int ipqhashfn(__be16 id, __be32 saddr, __be32 daddr, u8 prot)
-{
- return jhash_3words((__force u32)id << 16 | prot,
- (__force u32)saddr, (__force u32)daddr,
- ip4_frags.rnd) & (INETFRAGS_HASHSZ - 1);
-}
-
static unsigned int ip4_hashfn(struct inet_frag_queue *q)
{
struct ipq *ipq;
ipq = container_of(q, struct ipq, q);
- return ipqhashfn(ipq->id, ipq->saddr, ipq->daddr, ipq->protocol);
+ return inet4_hash_frag(ipq->id, ipq->saddr, ipq->daddr, ipq->protocol, ip4_frags.rnd);
}
static bool ip4_frag_match(struct inet_frag_queue *q, void *a)
@@ -290,15 +282,11 @@ static inline struct ipq *ip_find(struct net *net, struct iphdr *iph, u32 user)
{
struct inet_frag_queue *q;
struct ip4_create_arg arg;
- unsigned int hash;
arg.iph = iph;
arg.user = user;
- read_lock(&ip4_frags.lock);
- hash = ipqhashfn(iph->id, iph->saddr, iph->daddr, iph->protocol);
-
- q = inet_frag_find(&net->ipv4.frags, &ip4_frags, &arg, hash);
+ q = inet4_frag_find(&net->ipv4.frags, &ip4_frags, &arg);
if (q == NULL)
goto out_nomem;
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index 22c8ea9..2d51584 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -168,17 +168,14 @@ static inline struct frag_queue *fq_find(struct net *net, __be32 id,
{
struct inet_frag_queue *q;
struct ip6_create_arg arg;
- unsigned int hash;
arg.id = id;
arg.user = user;
arg.src = src;
arg.dst = dst;
- read_lock_bh(&nf_frags.lock);
- hash = inet6_hash_frag(id, src, dst, nf_frags.rnd);
-
- q = inet_frag_find(&net->nf_frag.frags, &nf_frags, &arg, hash);
+ local_bh_disable();
+ q = inet6_frag_find(&net->nf_frag.frags, &nf_frags, &arg);
local_bh_enable();
if (q == NULL)
goto oom;
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index e5253ec..08c5063 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -70,34 +70,6 @@ static struct inet_frags ip6_frags;
static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
struct net_device *dev);
-/*
- * callers should be careful not to use the hash value outside the ipfrag_lock
- * as doing so could race with ipfrag_hash_rnd being recalculated.
- */
-unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
- const struct in6_addr *daddr, u32 rnd)
-{
- u32 c;
-
- c = jhash_3words((__force u32)saddr->s6_addr32[0],
- (__force u32)saddr->s6_addr32[1],
- (__force u32)saddr->s6_addr32[2],
- rnd);
-
- c = jhash_3words((__force u32)saddr->s6_addr32[3],
- (__force u32)daddr->s6_addr32[0],
- (__force u32)daddr->s6_addr32[1],
- c);
-
- c = jhash_3words((__force u32)daddr->s6_addr32[2],
- (__force u32)daddr->s6_addr32[3],
- (__force u32)id,
- c);
-
- return c & (INETFRAGS_HASHSZ - 1);
-}
-EXPORT_SYMBOL_GPL(inet6_hash_frag);
-
static unsigned int ip6_hashfn(struct inet_frag_queue *q)
{
struct frag_queue *fq;
@@ -186,17 +158,13 @@ fq_find(struct net *net, __be32 id, const struct in6_addr *src, const struct in6
{
struct inet_frag_queue *q;
struct ip6_create_arg arg;
- unsigned int hash;
arg.id = id;
arg.user = IP6_DEFRAG_LOCAL_DELIVER;
arg.src = src;
arg.dst = dst;
- read_lock(&ip6_frags.lock);
- hash = inet6_hash_frag(id, src, dst, ip6_frags.rnd);
-
- q = inet_frag_find(&net->ipv6.frags, &ip6_frags, &arg, hash);
+ q = inet6_frag_find(&net->ipv6.frags, &ip6_frags, &arg);
if (q == NULL)
return NULL;
^ permalink raw reply related
* Re: Fwd: Re: RTL 8169 linux driver question
From: Stéphane ANCELOT @ 2012-11-26 7:35 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev, sancelot, Hayes Wang
In-Reply-To: <20121123192056.GA30290@electric-eye.fr.zoreil.com>
On 23/11/2012 20:20, Francois Romieu wrote:
> Stéphane ANCELOT <sancelot@free.fr> :
> [...]
>> I have an adapted version of this driver for a realtime linux kernel
>> and a 8168/811B rev 2 component (as listed by lspci).
> You should grep for the XID line in the kernel dmesg to identify
> the chipset version.
XID 1c4000c0
>> I had problem with it, my application sends a frame that is
>> immediately transmitted back by some slaves, there was abnormally
>> 100us lost between the send and receive call.
>>
>> Finally I found it was coming from the following register setup in
>> the driver :
>>
>> RTL_W16(IntrMitigate, 0x5151);
>>
>> Can you give me some details about it, since I do not have the
>> RTL8169 programming guide.
> "Reserved" in my 2007 8168c rev1.0 datasheet.
>
> I merged it long ago from Realtek's driver. It has now changed to 0x5f51.
>
> On the old PCI 8169, bits 15..8 relate to Tx and 7..0 to Rx. Bits 7..4
> count in units of 125 us and bits 0..3 in packet units. You may give
> 0x..00 a try.
>
> Hayes knows better for the 8168 line.
>
>> /100us is important since this component acts as an Ethercat Master
>> running at 1ms./
> Which realtime kernel is it ?
xenomai.
^ permalink raw reply
* [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent
From: Jason Wang @ 2012-11-26 7:56 UTC (permalink / raw)
To: rusty, mst, davem, netdev, linux-kernel, virtualization; +Cc: shemminger
Some deivces do not free the old tx skbs immediately after it has been sent
(usually in tx interrupt). One such example is virtio-net which optimizes for
virt and only free the possible old tx skbs during the next packet sending. This
would lead the pktgen to wait forever in the refcount of the skb if no other
pakcet will be sent afterwards.
Solving this issue by introducing a new flag IFF_TX_SKB_FREE_DELAY which could
notify the pktgen that the device does not free skb immediately after it has
been sent and let it not to wait for the refcount to be one.
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
Another choice is to introduce an new .ndo which could be called by pktgen
during the waiting to free the old tx skbs.
drivers/net/virtio_net.c | 3 ++-
include/uapi/linux/if.h | 2 ++
net/core/pktgen.c | 3 ++-
3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 26c502e..8262232 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1058,7 +1058,8 @@ static int virtnet_probe(struct virtio_device *vdev)
return -ENOMEM;
/* Set up network device as normal. */
- dev->priv_flags |= IFF_UNICAST_FLT | IFF_LIVE_ADDR_CHANGE;
+ dev->priv_flags |= IFF_UNICAST_FLT | IFF_LIVE_ADDR_CHANGE |
+ IFF_TX_SKB_FREE_DELAY;
dev->netdev_ops = &virtnet_netdev;
dev->features = NETIF_F_HIGHDMA;
diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h
index 1ec407b..a0d2168 100644
--- a/include/uapi/linux/if.h
+++ b/include/uapi/linux/if.h
@@ -83,6 +83,8 @@
#define IFF_SUPP_NOFCS 0x80000 /* device supports sending custom FCS */
#define IFF_LIVE_ADDR_CHANGE 0x100000 /* device supports hardware address
* change when it's running */
+#define IFF_TX_SKB_FREE_DELAY 0x200000 /* device does free the tx skb
+ * immediately when it is sent */
#define IF_GET_IFACE 0x0001 /* for querying only */
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index b29dacf..85d4e53 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -3269,7 +3269,8 @@ unlock:
/* If pkt_dev->count is zero, then run forever */
if ((pkt_dev->count != 0) && (pkt_dev->sofar >= pkt_dev->count)) {
- pktgen_wait_for_skb(pkt_dev);
+ if (!(pkt_dev->odev->priv_flags & IFF_TX_SKB_FREE_DELAY))
+ pktgen_wait_for_skb(pkt_dev);
/* Done with this */
pktgen_stop_device(pkt_dev);
--
1.7.1
^ permalink raw reply related
* [PATCH net-next 0/2] tun: trivial fixes.
From: Rami Rosen @ 2012-11-26 8:07 UTC (permalink / raw)
To: davem; +Cc: maxk, netdev, Rami Rosen
These two patches contain trivial fixes to the tun driver.
First patch fixes 4 typos.
Second patch puts the correct method name in a debug message in tun_do_read().
Rami Rosen (2):
[PATCH net-next 1/2] vtun: fix typos.
[PATCH net-next 2/2] tun: put correct method name in a debug message.
drivers/net/tun.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
--
1.7.11.7
^ permalink raw reply
* [PATCH net-next 1/2] vtun: fix typos.
From: Rami Rosen @ 2012-11-26 8:07 UTC (permalink / raw)
To: davem; +Cc: maxk, netdev, Rami Rosen
In-Reply-To: <1353917261-2974-1-git-send-email-ramirose@gmail.com>
This patch fixes four typos in drivers/net/vtun.c.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
---
drivers/net/tun.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 3bd9932..1dfb135 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -121,11 +121,11 @@ struct tap_filter {
* also contains all socket related strctures (except sock_fprog and tap_filter)
* to serve as one transmit queue for tuntap device. The sock_fprog and
* tap_filter were kept in tun_struct since they were used for filtering for the
- * netdevice not for a specific queue (at least I didn't see the reqirement for
+ * netdevice not for a specific queue (at least I didn't see the requirement for
* this).
*
* RCU usage:
- * The tun_file and tun_struct are loosely coupled, the pointer from on to the
+ * The tun_file and tun_struct are loosely coupled, the pointer from one to the
* other can only be read while rcu_read_lock or rtnl_lock is held.
*/
struct tun_file {
@@ -153,7 +153,7 @@ struct tun_flow_entry {
#define TUN_NUM_FLOW_ENTRIES 1024
/* Since the socket were moved to tun_file, to preserve the behavior of persist
- * device, socket fileter, sndbuf and vnet header size were restore when the
+ * device, socket filter, sndbuf and vnet header size were restore when the
* file were attached to a persist device.
*/
struct tun_struct {
@@ -689,7 +689,7 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
sk_filter(tfile->socket.sk, skb))
goto drop;
- /* Limit the number of packets queued by divining txq length with the
+ /* Limit the number of packets queued by dividing txq length with the
* number of queues.
*/
if (skb_queue_len(&tfile->socket.sk->sk_receive_queue)
--
1.7.11.7
^ permalink raw reply related
* [PATCH net-next 2/2] tun: put correct method name in a debug message.
From: Rami Rosen @ 2012-11-26 8:07 UTC (permalink / raw)
To: davem; +Cc: maxk, netdev, Rami Rosen
In-Reply-To: <1353917261-2974-1-git-send-email-ramirose@gmail.com>
This patch puts the correct method name, tun_do_read, in a debug message.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
---
drivers/net/tun.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 1dfb135..607a3a5 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1296,7 +1296,7 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
struct sk_buff *skb;
ssize_t ret = 0;
- tun_debug(KERN_INFO, tun, "tun_chr_read\n");
+ tun_debug(KERN_INFO, tun, "tun_do_read\n");
if (unlikely(!noblock))
add_wait_queue(&tfile->wq.wait, &wait);
--
1.7.11.7
^ permalink raw reply related
* [net-next.git 1/6] stmmac: remove dead code for STMMAC_TIMER support
From: Giuseppe CAVALLARO @ 2012-11-26 9:10 UTC (permalink / raw)
To: netdev; +Cc: bhutchings, davem, Giuseppe Cavallaro
In-Reply-To: <1353921046-5121-1-git-send-email-peppe.cavallaro@st.com>
The TIMER option is not longer supported and this
code can be considered dead for this driver in
the new kernel series.
In fact, It was not updated at all and never used.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
drivers/net/ethernet/stmicro/stmmac/Kconfig | 25 ----
drivers/net/ethernet/stmicro/stmmac/Makefile | 1 -
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 6 -
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 101 +--------------
drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c | 134 --------------------
drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h | 46 -------
6 files changed, 3 insertions(+), 310 deletions(-)
delete mode 100644 drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c
delete mode 100644 drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h
diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index 9f44827..1164930 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -54,31 +54,6 @@ config STMMAC_DA
By default, the DMA arbitration scheme is based on Round-robin
(rx:tx priority is 1:1).
-config STMMAC_TIMER
- bool "STMMAC Timer optimisation"
- default n
- depends on RTC_HCTOSYS_DEVICE
- ---help---
- Use an external timer for mitigating the number of network
- interrupts. Currently, for SH architectures, it is possible
- to use the TMU channel 2 and the SH-RTC device.
-
-choice
- prompt "Select Timer device"
- depends on STMMAC_TIMER
-
-config STMMAC_TMU_TIMER
- bool "TMU channel 2"
- depends on CPU_SH4
- ---help---
-
-config STMMAC_RTC_TIMER
- bool "Real time clock"
- depends on RTC_CLASS
- ---help---
-
-endchoice
-
choice
prompt "Select the DMA TX/RX descriptor operating modes"
depends on STMMAC_ETH
diff --git a/drivers/net/ethernet/stmicro/stmmac/Makefile b/drivers/net/ethernet/stmicro/stmmac/Makefile
index bc965ac..c8e8ea6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Makefile
+++ b/drivers/net/ethernet/stmicro/stmmac/Makefile
@@ -1,5 +1,4 @@
obj-$(CONFIG_STMMAC_ETH) += stmmac.o
-stmmac-$(CONFIG_STMMAC_TIMER) += stmmac_timer.o
stmmac-$(CONFIG_STMMAC_RING) += ring_mode.o
stmmac-$(CONFIG_STMMAC_CHAINED) += chain_mode.o
stmmac-$(CONFIG_STMMAC_PLATFORM) += stmmac_platform.o
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 7d51a65..5f89415 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -31,9 +31,6 @@
#include <linux/phy.h>
#include <linux/pci.h>
#include "common.h"
-#ifdef CONFIG_STMMAC_TIMER
-#include "stmmac_timer.h"
-#endif
struct stmmac_priv {
/* Frequently used values are kept adjacent for cache effect */
@@ -77,9 +74,6 @@ struct stmmac_priv {
spinlock_t tx_lock;
int wolopts;
int wol_irq;
-#ifdef CONFIG_STMMAC_TIMER
- struct stmmac_timer *tm;
-#endif
struct plat_stmmacenet_data *plat;
struct stmmac_counters mmc;
struct dma_features dma_cap;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index c6cdbc4..777234c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -115,16 +115,6 @@ static int tc = TC_DEFAULT;
module_param(tc, int, S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(tc, "DMA threshold control value");
-/* Pay attention to tune this parameter; take care of both
- * hardware capability and network stabitily/performance impact.
- * Many tests showed that ~4ms latency seems to be good enough. */
-#ifdef CONFIG_STMMAC_TIMER
-#define DEFAULT_PERIODIC_RATE 256
-static int tmrate = DEFAULT_PERIODIC_RATE;
-module_param(tmrate, int, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(tmrate, "External timer freq. (default: 256Hz)");
-#endif
-
#define DMA_BUFFER_SIZE BUF_SIZE_2KiB
static int buf_sz = DMA_BUFFER_SIZE;
module_param(buf_sz, int, S_IRUGO | S_IWUSR);
@@ -536,12 +526,6 @@ static void init_dma_desc_rings(struct net_device *dev)
else
bfsize = stmmac_set_bfsize(dev->mtu, priv->dma_buf_sz);
-#ifdef CONFIG_STMMAC_TIMER
- /* Disable interrupts on completion for the reception if timer is on */
- if (likely(priv->tm->enable))
- dis_ic = 1;
-#endif
-
DBG(probe, INFO, "stmmac: txsize %d, rxsize %d, bfsize %d\n",
txsize, rxsize, bfsize);
@@ -775,22 +759,12 @@ static void stmmac_tx(struct stmmac_priv *priv)
static inline void stmmac_enable_irq(struct stmmac_priv *priv)
{
-#ifdef CONFIG_STMMAC_TIMER
- if (likely(priv->tm->enable))
- priv->tm->timer_start(tmrate);
- else
-#endif
- priv->hw->dma->enable_dma_irq(priv->ioaddr);
+ priv->hw->dma->enable_dma_irq(priv->ioaddr);
}
static inline void stmmac_disable_irq(struct stmmac_priv *priv)
{
-#ifdef CONFIG_STMMAC_TIMER
- if (likely(priv->tm->enable))
- priv->tm->timer_stop();
- else
-#endif
- priv->hw->dma->disable_dma_irq(priv->ioaddr);
+ priv->hw->dma->disable_dma_irq(priv->ioaddr);
}
static int stmmac_has_work(struct stmmac_priv *priv)
@@ -818,25 +792,6 @@ static inline void _stmmac_schedule(struct stmmac_priv *priv)
}
}
-#ifdef CONFIG_STMMAC_TIMER
-void stmmac_schedule(struct net_device *dev)
-{
- struct stmmac_priv *priv = netdev_priv(dev);
-
- priv->xstats.sched_timer_n++;
-
- _stmmac_schedule(priv);
-}
-
-static void stmmac_no_timer_started(unsigned int x)
-{;
-};
-
-static void stmmac_no_timer_stopped(void)
-{;
-};
-#endif
-
/**
* stmmac_tx_err:
* @priv: pointer to the private device structure
@@ -1038,23 +993,6 @@ static int stmmac_open(struct net_device *dev)
struct stmmac_priv *priv = netdev_priv(dev);
int ret;
-#ifdef CONFIG_STMMAC_TIMER
- priv->tm = kzalloc(sizeof(struct stmmac_timer *), GFP_KERNEL);
- if (unlikely(priv->tm == NULL))
- return -ENOMEM;
-
- priv->tm->freq = tmrate;
-
- /* Test if the external timer can be actually used.
- * In case of failure continue without timer. */
- if (unlikely((stmmac_open_ext_timer(dev, priv->tm)) < 0)) {
- pr_warning("stmmaceth: cannot attach the external timer.\n");
- priv->tm->freq = 0;
- priv->tm->timer_start = stmmac_no_timer_started;
- priv->tm->timer_stop = stmmac_no_timer_stopped;
- } else
- priv->tm->enable = 1;
-#endif
clk_prepare_enable(priv->stmmac_clk);
stmmac_check_ether_addr(priv);
@@ -1141,10 +1079,6 @@ static int stmmac_open(struct net_device *dev)
priv->hw->dma->start_tx(priv->ioaddr);
priv->hw->dma->start_rx(priv->ioaddr);
-#ifdef CONFIG_STMMAC_TIMER
- priv->tm->timer_start(tmrate);
-#endif
-
/* Dump DMA/MAC registers */
if (netif_msg_hw(priv)) {
priv->hw->mac->dump_regs(priv->ioaddr);
@@ -1170,9 +1104,6 @@ open_error_wolirq:
free_irq(dev->irq, dev);
open_error:
-#ifdef CONFIG_STMMAC_TIMER
- kfree(priv->tm);
-#endif
if (priv->phydev)
phy_disconnect(priv->phydev);
@@ -1203,12 +1134,6 @@ static int stmmac_release(struct net_device *dev)
netif_stop_queue(dev);
-#ifdef CONFIG_STMMAC_TIMER
- /* Stop and release the timer */
- stmmac_close_ext_timer();
- if (priv->tm != NULL)
- kfree(priv->tm);
-#endif
napi_disable(&priv->napi);
/* Free the IRQ lines */
@@ -1323,12 +1248,6 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
/* Interrupt on completition only for the latest segment */
priv->hw->desc->close_tx_desc(desc);
-#ifdef CONFIG_STMMAC_TIMER
- /* Clean IC while using timer */
- if (likely(priv->tm->enable))
- priv->hw->desc->clear_tx_ic(desc);
-#endif
-
wmb();
/* To avoid raise condition */
@@ -1523,7 +1442,7 @@ static int stmmac_poll(struct napi_struct *napi, int budget)
* stmmac_tx_timeout
* @dev : Pointer to net device structure
* Description: this function is called when a packet transmission fails to
- * complete within a reasonable tmrate. The driver will mark the error in the
+ * complete within a reasonable time. The driver will mark the error in the
* netdev structure and arrange for the device to be reset to a sane state
* in order to transmit a new packet.
*/
@@ -2141,11 +2060,6 @@ int stmmac_suspend(struct net_device *ndev)
netif_device_detach(ndev);
netif_stop_queue(ndev);
-#ifdef CONFIG_STMMAC_TIMER
- priv->tm->timer_stop();
- if (likely(priv->tm->enable))
- dis_ic = 1;
-#endif
napi_disable(&priv->napi);
/* Stop TX/RX DMA */
@@ -2196,10 +2110,6 @@ int stmmac_resume(struct net_device *ndev)
priv->hw->dma->start_tx(priv->ioaddr);
priv->hw->dma->start_rx(priv->ioaddr);
-#ifdef CONFIG_STMMAC_TIMER
- if (likely(priv->tm->enable))
- priv->tm->timer_start(tmrate);
-#endif
napi_enable(&priv->napi);
netif_start_queue(ndev);
@@ -2295,11 +2205,6 @@ static int __init stmmac_cmdline_opt(char *str)
} else if (!strncmp(opt, "eee_timer:", 6)) {
if (kstrtoint(opt + 10, 0, &eee_timer))
goto err;
-#ifdef CONFIG_STMMAC_TIMER
- } else if (!strncmp(opt, "tmrate:", 7)) {
- if (kstrtoint(opt + 7, 0, &tmrate))
- goto err;
-#endif
}
}
return 0;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c
deleted file mode 100644
index 4ccd4e2..0000000
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c
+++ /dev/null
@@ -1,134 +0,0 @@
-/*******************************************************************************
- STMMAC external timer support.
-
- Copyright (C) 2007-2009 STMicroelectronics Ltd
-
- This program is free software; you can redistribute it and/or modify it
- under the terms and conditions of the GNU General Public License,
- version 2, as published by the Free Software Foundation.
-
- This program is distributed in the hope it will be useful, but WITHOUT
- ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- more details.
-
- You should have received a copy of the GNU General Public License along with
- this program; if not, write to the Free Software Foundation, Inc.,
- 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
-
- The full GNU General Public License is included in this distribution in
- the file called "COPYING".
-
- Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
-*******************************************************************************/
-
-#include <linux/kernel.h>
-#include <linux/etherdevice.h>
-#include "stmmac_timer.h"
-
-static void stmmac_timer_handler(void *data)
-{
- struct net_device *dev = (struct net_device *)data;
-
- stmmac_schedule(dev);
-}
-
-#define STMMAC_TIMER_MSG(timer, freq) \
-printk(KERN_INFO "stmmac_timer: %s Timer ON (freq %dHz)\n", timer, freq);
-
-#if defined(CONFIG_STMMAC_RTC_TIMER)
-#include <linux/rtc.h>
-static struct rtc_device *stmmac_rtc;
-static rtc_task_t stmmac_task;
-
-static void stmmac_rtc_start(unsigned int new_freq)
-{
- rtc_irq_set_freq(stmmac_rtc, &stmmac_task, new_freq);
- rtc_irq_set_state(stmmac_rtc, &stmmac_task, 1);
-}
-
-static void stmmac_rtc_stop(void)
-{
- rtc_irq_set_state(stmmac_rtc, &stmmac_task, 0);
-}
-
-int stmmac_open_ext_timer(struct net_device *dev, struct stmmac_timer *tm)
-{
- stmmac_task.private_data = dev;
- stmmac_task.func = stmmac_timer_handler;
-
- stmmac_rtc = rtc_class_open(CONFIG_RTC_HCTOSYS_DEVICE);
- if (stmmac_rtc == NULL) {
- pr_err("open rtc device failed\n");
- return -ENODEV;
- }
-
- rtc_irq_register(stmmac_rtc, &stmmac_task);
-
- /* Periodic mode is not supported */
- if ((rtc_irq_set_freq(stmmac_rtc, &stmmac_task, tm->freq) < 0)) {
- pr_err("set periodic failed\n");
- rtc_irq_unregister(stmmac_rtc, &stmmac_task);
- rtc_class_close(stmmac_rtc);
- return -1;
- }
-
- STMMAC_TIMER_MSG(CONFIG_RTC_HCTOSYS_DEVICE, tm->freq);
-
- tm->timer_start = stmmac_rtc_start;
- tm->timer_stop = stmmac_rtc_stop;
-
- return 0;
-}
-
-int stmmac_close_ext_timer(void)
-{
- rtc_irq_set_state(stmmac_rtc, &stmmac_task, 0);
- rtc_irq_unregister(stmmac_rtc, &stmmac_task);
- rtc_class_close(stmmac_rtc);
- return 0;
-}
-
-#elif defined(CONFIG_STMMAC_TMU_TIMER)
-#include <linux/clk.h>
-#define TMU_CHANNEL "tmu2_clk"
-static struct clk *timer_clock;
-
-static void stmmac_tmu_start(unsigned int new_freq)
-{
- clk_set_rate(timer_clock, new_freq);
- clk_prepare_enable(timer_clock);
-}
-
-static void stmmac_tmu_stop(void)
-{
- clk_disable_unprepare(timer_clock);
-}
-
-int stmmac_open_ext_timer(struct net_device *dev, struct stmmac_timer *tm)
-{
- timer_clock = clk_get(NULL, TMU_CHANNEL);
-
- if (IS_ERR(timer_clock))
- return -1;
-
- if (tmu2_register_user(stmmac_timer_handler, (void *)dev) < 0) {
- timer_clock = NULL;
- return -1;
- }
-
- STMMAC_TIMER_MSG("TMU2", tm->freq);
- tm->timer_start = stmmac_tmu_start;
- tm->timer_stop = stmmac_tmu_stop;
-
- return 0;
-}
-
-int stmmac_close_ext_timer(void)
-{
- clk_disable_unprepare(timer_clock);
- tmu2_unregister_user();
- clk_put(timer_clock);
- return 0;
-}
-#endif
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h b/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h
deleted file mode 100644
index aea9b14..0000000
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h
+++ /dev/null
@@ -1,46 +0,0 @@
-/*******************************************************************************
- STMMAC external timer Header File.
-
- Copyright (C) 2007-2009 STMicroelectronics Ltd
-
- This program is free software; you can redistribute it and/or modify it
- under the terms and conditions of the GNU General Public License,
- version 2, as published by the Free Software Foundation.
-
- This program is distributed in the hope it will be useful, but WITHOUT
- ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- more details.
-
- You should have received a copy of the GNU General Public License along with
- this program; if not, write to the Free Software Foundation, Inc.,
- 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
-
- The full GNU General Public License is included in this distribution in
- the file called "COPYING".
-
- Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
-*******************************************************************************/
-#ifndef __STMMAC_TIMER_H__
-#define __STMMAC_TIMER_H__
-
-struct stmmac_timer {
- void (*timer_start) (unsigned int new_freq);
- void (*timer_stop) (void);
- unsigned int freq;
- unsigned int enable;
-};
-
-/* Open the HW timer device and return 0 in case of success */
-int stmmac_open_ext_timer(struct net_device *dev, struct stmmac_timer *tm);
-/* Stop the timer and release it */
-int stmmac_close_ext_timer(void);
-/* Function used for scheduling task within the stmmac */
-void stmmac_schedule(struct net_device *dev);
-
-#if defined(CONFIG_STMMAC_TMU_TIMER)
-extern int tmu2_register_user(void *fnt, void *data);
-extern void tmu2_unregister_user(void);
-#endif
-
-#endif /* __STMMAC_TIMER_H__ */
--
1.7.4.4
^ permalink raw reply related
* [net-next.git 3/6 (V4)] stmmac: add Rx watchdog support to mitigate the DMA irqs
From: Giuseppe CAVALLARO @ 2012-11-26 9:10 UTC (permalink / raw)
To: netdev; +Cc: bhutchings, davem, Giuseppe Cavallaro
In-Reply-To: <1353921046-5121-1-git-send-email-peppe.cavallaro@st.com>
GMAC devices newer than databook 3.40 has an embedded timer
that can be used for mitigating the number of interrupts.
So this patch adds this optimizations.
At any rate, the Rx watchdog can be disable (on bugged HW) by
passing from the platform the riwt_off field.
In this implementation the rx timer stored in the Reg9 is fixed
to the max value. This will be tuned by using ethtool.
V2: added a platform parameter to force to disable the rx-watchdog
for example on new core where it is bugged.
V3: do not disable NAPI when Rx watchdog is used.
V4: a new extra statistic field has been added to show the early
receive status in the interrupt handler.
This patch also adds an extra check to avoid to call
napi_schedule when the DMA_INTR_ENA_RIE bit is disabled in the
Interrupt Mask register.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
drivers/net/ethernet/stmicro/stmmac/common.h | 17 +++++++++--
drivers/net/ethernet/stmicro/stmmac/dwmac1000.h | 3 --
.../net/ethernet/stmicro/stmmac/dwmac1000_dma.c | 6 ++++
drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h | 5 ++-
drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c | 19 ++++++++++--
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 +
.../net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 9 ++++--
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 30 ++++++++++++++++----
include/linux/stmmac.h | 1 +
9 files changed, 72 insertions(+), 20 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index d17eeef..98203f6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -48,6 +48,10 @@
#define CHIP_DBG(fmt, args...) do { } while (0)
#endif
+/* Synopsys Core versions */
+#define DWMAC_CORE_3_40 0x34
+#define DWMAC_CORE_3_50 0x35
+
#undef FRAME_FILTER_DEBUG
/* #define FRAME_FILTER_DEBUG */
@@ -81,7 +85,7 @@ struct stmmac_extra_stats {
unsigned long rx_missed_cntr;
unsigned long rx_overflow_cntr;
unsigned long rx_vlan;
- /* Tx/Rx IRQ errors */
+ /* Tx/Rx IRQ error info */
unsigned long tx_undeflow_irq;
unsigned long tx_process_stopped_irq;
unsigned long tx_jabber_irq;
@@ -91,7 +95,8 @@ struct stmmac_extra_stats {
unsigned long rx_watchdog_irq;
unsigned long tx_early_irq;
unsigned long fatal_bus_error_irq;
- /* Extra info */
+ /* Tx/Rx IRQ Events */
+ unsigned long rx_early_irq;
unsigned long threshold;
unsigned long tx_pkt_n;
unsigned long rx_pkt_n;
@@ -101,11 +106,12 @@ struct stmmac_extra_stats {
unsigned long tx_normal_irq_n;
unsigned long tx_clean;
unsigned long tx_reset_ic_bit;
+ unsigned long irq_receive_pmt_irq_n;
+ /* MMC info */
unsigned long mmc_tx_irq_n;
unsigned long mmc_rx_irq_n;
unsigned long mmc_rx_csum_offload_irq_n;
/* EEE */
- unsigned long irq_receive_pmt_irq_n;
unsigned long irq_tx_path_in_lpi_mode_n;
unsigned long irq_tx_path_exit_lpi_mode_n;
unsigned long irq_rx_path_in_lpi_mode_n;
@@ -165,6 +171,9 @@ struct stmmac_extra_stats {
#define DMA_HW_FEAT_ACTPHYIF 0x70000000 /* Active/selected PHY interface */
#define DEFAULT_DMA_PBL 8
+/* Max/Min RI Watchdog Timer count value */
+#define MAX_DMA_RIWT 0xff
+#define MIN_DMA_RIWT 0x20
/* Tx coalesce parameters */
#define STMMAC_COAL_TX_TIMER 40000
#define STMMAC_MAX_COAL_TX_TICK 100000
@@ -306,6 +315,8 @@ struct stmmac_dma_ops {
struct stmmac_extra_stats *x);
/* If supported then get the optional core features */
unsigned int (*get_hw_feature) (void __iomem *ioaddr);
+ /* Program the HW RX Watchdog */
+ void (*rx_watchdog) (void __iomem *ioaddr, u32 riwt);
};
struct stmmac_ops {
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
index 0e4cace..7ad56af 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
@@ -230,8 +230,5 @@ enum rtc_control {
#define GMAC_MMC_TX_INTR 0x108
#define GMAC_MMC_RX_CSUM_OFFLOAD 0x208
-/* Synopsys Core versions */
-#define DWMAC_CORE_3_40 0x34
-
extern const struct stmmac_dma_ops dwmac1000_dma_ops;
#endif /* __DWMAC1000_H__ */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
index 0335000..bf83c03 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
@@ -174,6 +174,11 @@ static unsigned int dwmac1000_get_hw_feature(void __iomem *ioaddr)
return readl(ioaddr + DMA_HW_FEATURE);
}
+static void dwmac1000_rx_watchdog(void __iomem *ioaddr, u32 riwt)
+{
+ writel(riwt, ioaddr + DMA_RX_WATCHDOG);
+}
+
const struct stmmac_dma_ops dwmac1000_dma_ops = {
.init = dwmac1000_dma_init,
.dump_regs = dwmac1000_dump_dma_regs,
@@ -187,4 +192,5 @@ const struct stmmac_dma_ops dwmac1000_dma_ops = {
.stop_rx = dwmac_dma_stop_rx,
.dma_interrupt = dwmac_dma_interrupt,
.get_hw_feature = dwmac1000_get_hw_feature,
+ .rx_watchdog = dwmac1000_rx_watchdog,
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
index e49c9a0..807f303 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
@@ -35,7 +35,10 @@
#define DMA_CONTROL 0x00001018 /* Ctrl (Operational Mode) */
#define DMA_INTR_ENA 0x0000101c /* Interrupt Enable */
#define DMA_MISSED_FRAME_CTR 0x00001020 /* Missed Frame Counter */
-#define DMA_AXI_BUS_MODE 0x00001028 /* AXI Bus Mode */
+/* Rx watchdog register */
+#define DMA_RX_WATCHDOG 0x00001024
+/* AXI Bus Mode */
+#define DMA_AXI_BUS_MODE 0x00001028
#define DMA_CUR_TX_BUF_ADDR 0x00001050 /* Current Host Tx Buffer */
#define DMA_CUR_RX_BUF_ADDR 0x00001054 /* Current Host Rx Buffer */
#define DMA_HW_FEATURE 0x00001058 /* HW Feature Register */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
index 73766e6..491d7e9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
@@ -204,17 +204,28 @@ int dwmac_dma_interrupt(void __iomem *ioaddr,
}
}
/* TX/RX NORMAL interrupts */
- if (intr_status & DMA_STATUS_NIS) {
+ if (likely(intr_status & DMA_STATUS_NIS)) {
x->normal_irq_n++;
- if (likely(intr_status & DMA_STATUS_RI))
- ret |= handle_rx;
- if (intr_status & (DMA_STATUS_TI))
+ if (likely(intr_status & DMA_STATUS_RI)) {
+ u32 value = readl(ioaddr + DMA_INTR_ENA);
+ /* to schedule NAPI on real RIE event. */
+ if (likely(value & DMA_INTR_ENA_RIE)) {
+ x->rx_normal_irq_n++;
+ ret |= handle_rx;
+ }
+ }
+ if (likely(intr_status & DMA_STATUS_TI)) {
+ x->tx_normal_irq_n++;
ret |= handle_tx;
+ }
+ if (unlikely(intr_status & DMA_STATUS_ERI))
+ x->rx_early_irq++;
}
/* Optional hardware blocks, interrupts should be disabled */
if (unlikely(intr_status &
(DMA_STATUS_GPI | DMA_STATUS_GMI | DMA_STATUS_GLI)))
pr_info("%s: unexpected status %08x\n", __func__, intr_status);
+
/* Clear the interrupt by writing a logic 1 to the CSR5[15-0] */
writel((intr_status & 0x1ffff), ioaddr + DMA_STATUS);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 01b3451..fe0535c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -91,6 +91,8 @@ struct stmmac_priv {
u32 tx_count_frames;
u32 tx_coal_frames;
u32 tx_coal_timer;
+ int use_riwt;
+ u32 rx_riwt;
};
extern int phyaddr;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index de2210c..240acd2 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -76,7 +76,7 @@ static const struct stmmac_stats stmmac_gstrings_stats[] = {
STMMAC_STAT(rx_missed_cntr),
STMMAC_STAT(rx_overflow_cntr),
STMMAC_STAT(rx_vlan),
- /* Tx/Rx IRQ errors */
+ /* Tx/Rx IRQ error info */
STMMAC_STAT(tx_undeflow_irq),
STMMAC_STAT(tx_process_stopped_irq),
STMMAC_STAT(tx_jabber_irq),
@@ -86,7 +86,8 @@ static const struct stmmac_stats stmmac_gstrings_stats[] = {
STMMAC_STAT(rx_watchdog_irq),
STMMAC_STAT(tx_early_irq),
STMMAC_STAT(fatal_bus_error_irq),
- /* Extra info */
+ /* Tx/Rx IRQ Events */
+ STMMAC_STAT(rx_early_irq),
STMMAC_STAT(threshold),
STMMAC_STAT(tx_pkt_n),
STMMAC_STAT(rx_pkt_n),
@@ -96,10 +97,12 @@ static const struct stmmac_stats stmmac_gstrings_stats[] = {
STMMAC_STAT(tx_normal_irq_n),
STMMAC_STAT(tx_clean),
STMMAC_STAT(tx_reset_ic_bit),
+ STMMAC_STAT(irq_receive_pmt_irq_n),
+ /* MMC info */
STMMAC_STAT(mmc_tx_irq_n),
STMMAC_STAT(mmc_rx_irq_n),
STMMAC_STAT(mmc_rx_csum_offload_irq_n),
- STMMAC_STAT(irq_receive_pmt_irq_n),
+ /* EEE */
STMMAC_STAT(irq_tx_path_in_lpi_mode_n),
STMMAC_STAT(irq_tx_path_exit_lpi_mode_n),
STMMAC_STAT(irq_rx_path_in_lpi_mode_n),
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index d40cf9e..2fb0e5f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -603,6 +603,8 @@ static void init_dma_desc_rings(struct net_device *dev)
priv->dirty_tx = 0;
priv->cur_tx = 0;
+ if (priv->use_riwt)
+ dis_ic = 1;
/* Clear the Rx/Tx descriptors */
priv->hw->desc->init_rx_desc(priv->dma_rx, rxsize, dis_ic);
priv->hw->desc->init_tx_desc(priv->dma_tx, txsize);
@@ -1106,6 +1108,11 @@ static int stmmac_open(struct net_device *dev)
stmmac_init_tx_coalesce(priv);
+ if ((priv->use_riwt) && (priv->hw->dma->rx_watchdog)) {
+ priv->rx_riwt = MAX_DMA_RIWT;
+ priv->hw->dma->rx_watchdog(priv->ioaddr, MAX_DMA_RIWT);
+ }
+
napi_enable(&priv->napi);
netif_start_queue(dev);
@@ -1423,14 +1430,12 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
#endif
skb->protocol = eth_type_trans(skb, priv->dev);
- if (unlikely(!priv->plat->rx_coe)) {
- /* No RX COE for old mac10/100 devices */
+ if (unlikely(!priv->plat->rx_coe))
skb_checksum_none_assert(skb);
- netif_receive_skb(skb);
- } else {
+ else
skb->ip_summed = CHECKSUM_UNNECESSARY;
- napi_gro_receive(&priv->napi, skb);
- }
+
+ napi_gro_receive(&priv->napi, skb);
priv->dev->stats.rx_packets++;
priv->dev->stats.rx_bytes += frame_len;
@@ -2001,6 +2006,16 @@ struct stmmac_priv *stmmac_dvr_probe(struct device *device,
if (flow_ctrl)
priv->flow_ctrl = FLOW_AUTO; /* RX/TX pause on */
+ /* Rx Watchdog is available in the COREs newer than the 3.40.
+ * In some case, for example on bugged HW this feature
+ * has to be disable and this can be done by passing the
+ * riwt_off field from the platform.
+ */
+ if ((priv->synopsys_id >= DWMAC_CORE_3_50) && (!priv->plat->riwt_off)) {
+ priv->use_riwt = 1;
+ pr_info(" Enable RX Mitigation via HW Watchdog Timer\n");
+ }
+
netif_napi_add(ndev, &priv->napi, stmmac_poll, 64);
spin_lock_init(&priv->lock);
@@ -2092,6 +2107,9 @@ int stmmac_suspend(struct net_device *ndev)
netif_device_detach(ndev);
netif_stop_queue(ndev);
+ if (priv->use_riwt)
+ dis_ic = 1;
+
napi_disable(&priv->napi);
/* Stop TX/RX DMA */
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index a1547ea..de5b2f8 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -104,6 +104,7 @@ struct plat_stmmacenet_data {
int bugged_jumbo;
int pmt;
int force_sf_dma_mode;
+ int riwt_off;
void (*fix_mac_speed)(void *priv, unsigned int speed);
void (*bus_setup)(void __iomem *ioaddr);
int (*init)(struct platform_device *pdev);
--
1.7.4.4
^ permalink raw reply related
* [net-next.git 0/6 (V5)] stmmac: remove dead code for STMMAC_TIMER and add new mitigation schema
From: Giuseppe CAVALLARO @ 2012-11-26 9:10 UTC (permalink / raw)
To: netdev; +Cc: bhutchings, davem, Giuseppe Cavallaro
These patch series remove the STMMAC_TIMER option no longer updated
and never used and add a new mitigation schema.
Having removed the Timer opt, this has made the driver slim.
On top of this work, it has been easier to introduce the new
mitigation schema based on HW RX-watchdog (available in new cores).
In fact, 3.50 and newer cores have an HW RX-Watchdog that can be used for
mitigating the Rx-interrupts and first results look promising.
Running n-u-t-t-c-p with the following parameters:
Throughput: 500Mbps
UDP Buffer size: 1328bytes
TCP Buffer size: 65536bytes
for example, I got on ST box (arm-based) these improvements:
--------------------------------------------------------------------
Original | With New Mitigation patch
--------------------------------------------------------------------
Test CPU usage pkt/loss | CPU usage pkt/loss
Type Mbps % % |Mbps % %
--------------------------------------------------------------------
UDP-RX 395.5065 95 20.89 |499.8811 23 0.00
UDP-TX 499.5578 100 0.08915 |499.1340 99 0.00
TCP-RX 499.9221 77 |499.8776 26
TCP-TX 389.5719 99 |499.9771 79
--------------------------------------------------------------------
... no regression on ST boxes (SH based) I always test.
This is a brief explanation of the new mitigation schema although there
is a patch that updates the driver's documentation.
o On Rx-side I have:
New GMACs will use the RX-watchdog timer; old ones will continue to
use NAPI to mitigate the RX DMA interrupts.
For the RX-watchdog, there is a parameter that is the RI Watchdog
Timer count. It indicates the number of system clock cycles and can be
set via *ethtool*.
o On Tx-side, the mitigation schema is based on a SW timer
that calls the tx function (stmmac_tx) to reclaim the resource after
transmitting the frames.
Also there is another parameter (a threshold) used to program
the descriptors avoiding to set the interrupt on completion bit in
when the frame is sent (xmit).
V2: these patches add the ethtool support to get/set coalesce parameters
and totally remove the sysFS support added in the first patches.
V3: added several fixes: for example NAPI and RX-watchdog work together
while in the previous implementation the HW RX-watchdog disabled NAPI.
On the tx side, erroneously the tx coalesce frame parameter was limited
to the ring size and the driver didn't take care of the segment numbers
when enable/disable the IC bit in the TDES.
V4: reject not supported coalesce settings in ethtool.
V5: in these new patch series, I am releasing two patches that
add the mitigation for tx and rx where I collected the feedback received
from the mailing list. Concerning the tx, I've also reworked and
fixed the spinlock irqsave and restore as D. Miller had suggested (thx).
Giuseppe Cavallaro (6):
stmmac: remove dead code for STMMAC_TIMER support
stmmac: add the initial tx coalesce schema
stmmac: add Rx watchdog support to mitigate the DMA irqs
stmmac: get/set coalesce parameters via ethtool
stmmac: update the doc with new IRQ mitigation
stmmac: update the driver version to Nov_2012
Documentation/networking/stmmac.txt | 28 ++-
drivers/net/ethernet/stmicro/stmmac/Kconfig | 25 --
drivers/net/ethernet/stmicro/stmmac/Makefile | 1 -
drivers/net/ethernet/stmicro/stmmac/common.h | 39 +++-
drivers/net/ethernet/stmicro/stmmac/dwmac1000.h | 3 -
.../net/ethernet/stmicro/stmmac/dwmac1000_dma.c | 6 +
drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h | 5 +-
drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c | 20 ++-
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 14 +-
.../net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 100 +++++++-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 261 ++++++++------------
drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c | 134 ----------
drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h | 46 ----
include/linux/stmmac.h | 1 +
14 files changed, 281 insertions(+), 402 deletions(-)
delete mode 100644 drivers/net/ethernet/stmicro/stmmac/stmmac_timer.c
delete mode 100644 drivers/net/ethernet/stmicro/stmmac/stmmac_timer.h
--
1.7.4.4
^ permalink raw reply
* [net-next.git 2/6 (V3)] stmmac: add the initial tx coalesce schema
From: Giuseppe CAVALLARO @ 2012-11-26 9:10 UTC (permalink / raw)
To: netdev; +Cc: bhutchings, davem, Giuseppe Cavallaro
In-Reply-To: <1353921046-5121-1-git-send-email-peppe.cavallaro@st.com>
This patch adds a new schema used for mitigating the
number of transmit interrupts.
It is based on a SW timer and a threshold value.
The timer is used to periodically call the stmmac_tx_clean
function; the threshold is used for setting the IC (Interrupt
on Completion bit). The ISR will then invoke the poll method.
Also the patch improves some ethtool stat fields.
V2: review the logic to manage the IC bit in the TDESC
that was bugged because it didn't take care about the
fragments. Also fix the tx_count_frames that has not to be
limited to TX DMA ring. Thanks to Ben Hutchings.
V3: removed the spin_lock irqsave/restore as D. Miller suggested.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
drivers/net/ethernet/stmicro/stmmac/common.h | 22 +++-
drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c | 7 +-
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 4 +
.../net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 8 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 132 ++++++++++++--------
5 files changed, 111 insertions(+), 62 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 719be39..d17eeef 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -95,9 +95,12 @@ struct stmmac_extra_stats {
unsigned long threshold;
unsigned long tx_pkt_n;
unsigned long rx_pkt_n;
- unsigned long poll_n;
- unsigned long sched_timer_n;
unsigned long normal_irq_n;
+ unsigned long rx_normal_irq_n;
+ unsigned long napi_poll;
+ unsigned long tx_normal_irq_n;
+ unsigned long tx_clean;
+ unsigned long tx_reset_ic_bit;
unsigned long mmc_tx_irq_n;
unsigned long mmc_rx_irq_n;
unsigned long mmc_rx_csum_offload_irq_n;
@@ -162,6 +165,12 @@ struct stmmac_extra_stats {
#define DMA_HW_FEAT_ACTPHYIF 0x70000000 /* Active/selected PHY interface */
#define DEFAULT_DMA_PBL 8
+/* Tx coalesce parameters */
+#define STMMAC_COAL_TX_TIMER 40000
+#define STMMAC_MAX_COAL_TX_TICK 100000
+#define STMMAC_TX_MAX_FRAMES 256
+#define STMMAC_TX_FRAMES 64
+
enum rx_frame_status { /* IPC status */
good_frame = 0,
discard_frame = 1,
@@ -169,10 +178,11 @@ enum rx_frame_status { /* IPC status */
llc_snap = 4,
};
-enum tx_dma_irq_status {
- tx_hard_error = 1,
- tx_hard_error_bump_tc = 2,
- handle_tx_rx = 3,
+enum dma_irq_status {
+ tx_hard_error = 0x1,
+ tx_hard_error_bump_tc = 0x2,
+ handle_rx = 0x4,
+ handle_tx = 0x8,
};
enum core_specific_irq_mask {
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
index 4e0e18a..73766e6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
@@ -206,9 +206,10 @@ int dwmac_dma_interrupt(void __iomem *ioaddr,
/* TX/RX NORMAL interrupts */
if (intr_status & DMA_STATUS_NIS) {
x->normal_irq_n++;
- if (likely((intr_status & DMA_STATUS_RI) ||
- (intr_status & (DMA_STATUS_TI))))
- ret = handle_tx_rx;
+ if (likely(intr_status & DMA_STATUS_RI))
+ ret |= handle_rx;
+ if (intr_status & (DMA_STATUS_TI))
+ ret |= handle_tx;
}
/* Optional hardware blocks, interrupts should be disabled */
if (unlikely(intr_status &
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 5f89415..01b3451 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -87,6 +87,10 @@ struct stmmac_priv {
int eee_enabled;
int eee_active;
int tx_lpi_timer;
+ struct timer_list txtimer;
+ u32 tx_count_frames;
+ u32 tx_coal_frames;
+ u32 tx_coal_timer;
};
extern int phyaddr;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index 76fd61a..de2210c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -90,10 +90,12 @@ static const struct stmmac_stats stmmac_gstrings_stats[] = {
STMMAC_STAT(threshold),
STMMAC_STAT(tx_pkt_n),
STMMAC_STAT(rx_pkt_n),
- STMMAC_STAT(poll_n),
- STMMAC_STAT(sched_timer_n),
- STMMAC_STAT(normal_irq_n),
STMMAC_STAT(normal_irq_n),
+ STMMAC_STAT(rx_normal_irq_n),
+ STMMAC_STAT(napi_poll),
+ STMMAC_STAT(tx_normal_irq_n),
+ STMMAC_STAT(tx_clean),
+ STMMAC_STAT(tx_reset_ic_bit),
STMMAC_STAT(mmc_tx_irq_n),
STMMAC_STAT(mmc_rx_irq_n),
STMMAC_STAT(mmc_rx_csum_offload_irq_n),
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 777234c..d40cf9e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -137,6 +137,8 @@ static int stmmac_init_fs(struct net_device *dev);
static void stmmac_exit_fs(void);
#endif
+#define STMMAC_COAL_TIMER(x) (jiffies + usecs_to_jiffies(x))
+
/**
* stmmac_verify_args - verify the driver parameters.
* Description: it verifies if some wrong parameter is passed to the driver.
@@ -688,16 +690,18 @@ static void stmmac_dma_operation_mode(struct stmmac_priv *priv)
}
/**
- * stmmac_tx:
- * @priv: private driver structure
+ * stmmac_tx_clean:
+ * @priv: private data pointer
* Description: it reclaims resources after transmission completes.
*/
-static void stmmac_tx(struct stmmac_priv *priv)
+static void stmmac_tx_clean(struct stmmac_priv *priv)
{
unsigned int txsize = priv->dma_tx_size;
spin_lock(&priv->tx_lock);
+ priv->xstats.tx_clean++;
+
while (priv->dirty_tx != priv->cur_tx) {
int last;
unsigned int entry = priv->dirty_tx % txsize;
@@ -757,40 +761,16 @@ static void stmmac_tx(struct stmmac_priv *priv)
spin_unlock(&priv->tx_lock);
}
-static inline void stmmac_enable_irq(struct stmmac_priv *priv)
+static inline void stmmac_enable_dma_irq(struct stmmac_priv *priv)
{
priv->hw->dma->enable_dma_irq(priv->ioaddr);
}
-static inline void stmmac_disable_irq(struct stmmac_priv *priv)
+static inline void stmmac_disable_dma_irq(struct stmmac_priv *priv)
{
priv->hw->dma->disable_dma_irq(priv->ioaddr);
}
-static int stmmac_has_work(struct stmmac_priv *priv)
-{
- unsigned int has_work = 0;
- int rxret, tx_work = 0;
-
- rxret = priv->hw->desc->get_rx_owner(priv->dma_rx +
- (priv->cur_rx % priv->dma_rx_size));
-
- if (priv->dirty_tx != priv->cur_tx)
- tx_work = 1;
-
- if (likely(!rxret || tx_work))
- has_work = 1;
-
- return has_work;
-}
-
-static inline void _stmmac_schedule(struct stmmac_priv *priv)
-{
- if (likely(stmmac_has_work(priv))) {
- stmmac_disable_irq(priv);
- napi_schedule(&priv->napi);
- }
-}
/**
* stmmac_tx_err:
@@ -813,16 +793,18 @@ static void stmmac_tx_err(struct stmmac_priv *priv)
netif_wake_queue(priv->dev);
}
-
static void stmmac_dma_interrupt(struct stmmac_priv *priv)
{
int status;
status = priv->hw->dma->dma_interrupt(priv->ioaddr, &priv->xstats);
- if (likely(status == handle_tx_rx))
- _stmmac_schedule(priv);
-
- else if (unlikely(status == tx_hard_error_bump_tc)) {
+ if (likely((status & handle_rx)) || (status & handle_tx)) {
+ if (likely(napi_schedule_prep(&priv->napi))) {
+ stmmac_disable_dma_irq(priv);
+ __napi_schedule(&priv->napi);
+ }
+ }
+ if (unlikely(status & tx_hard_error_bump_tc)) {
/* Try to bump up the dma threshold on this failure */
if (unlikely(tc != SF_DMA_MODE) && (tc <= 256)) {
tc += 64;
@@ -938,7 +920,6 @@ static int stmmac_get_hw_features(struct stmmac_priv *priv)
/* Alternate (enhanced) DESC mode*/
priv->dma_cap.enh_desc =
(hw_cap & DMA_HW_FEAT_ENHDESSEL) >> 24;
-
}
return hw_cap;
@@ -980,6 +961,38 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
}
/**
+ * stmmac_tx_timer:
+ * @data: data pointer
+ * Description:
+ * This is the timer handler to directly invoke the stmmac_tx_clean.
+ */
+static void stmmac_tx_timer(unsigned long data)
+{
+ struct stmmac_priv *priv = (struct stmmac_priv *)data;
+
+ stmmac_tx_clean(priv);
+}
+
+/**
+ * stmmac_tx_timer:
+ * @priv: private data structure
+ * Description:
+ * This inits the transmit coalesce parameters: i.e. timer rate,
+ * timer handler and default threshold used for enabling the
+ * interrupt on completion bit.
+ */
+static void stmmac_init_tx_coalesce(struct stmmac_priv *priv)
+{
+ priv->tx_coal_frames = STMMAC_TX_FRAMES;
+ priv->tx_coal_timer = STMMAC_COAL_TX_TIMER;
+ init_timer(&priv->txtimer);
+ priv->txtimer.expires = STMMAC_COAL_TIMER(priv->tx_coal_timer);
+ priv->txtimer.data = (unsigned long)priv;
+ priv->txtimer.function = stmmac_tx_timer;
+ add_timer(&priv->txtimer);
+}
+
+/**
* stmmac_open - open entry point of the driver
* @dev : pointer to the device structure.
* Description:
@@ -1091,6 +1104,8 @@ static int stmmac_open(struct net_device *dev)
priv->tx_lpi_timer = STMMAC_DEFAULT_TWT_LS_TIMER;
priv->eee_enabled = stmmac_eee_init(priv);
+ stmmac_init_tx_coalesce(priv);
+
napi_enable(&priv->napi);
netif_start_queue(dev);
@@ -1136,6 +1151,8 @@ static int stmmac_release(struct net_device *dev)
napi_disable(&priv->napi);
+ del_timer_sync(&priv->txtimer);
+
/* Free the IRQ lines */
free_irq(dev->irq, dev);
if (priv->wol_irq != dev->irq)
@@ -1198,11 +1215,13 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
#ifdef STMMAC_XMIT_DEBUG
if ((skb->len > ETH_FRAME_LEN) || nfrags)
- pr_info("stmmac xmit:\n"
- "\tskb addr %p - len: %d - nopaged_len: %d\n"
- "\tn_frags: %d - ip_summed: %d - %s gso\n",
- skb, skb->len, nopaged_len, nfrags, skb->ip_summed,
- !skb_is_gso(skb) ? "isn't" : "is");
+ pr_debug("stmmac xmit: [entry %d]\n"
+ "\tskb addr %p - len: %d - nopaged_len: %d\n"
+ "\tn_frags: %d - ip_summed: %d - %s gso\n"
+ "\ttx_count_frames %d\n", entry,
+ skb, skb->len, nopaged_len, nfrags, skb->ip_summed,
+ !skb_is_gso(skb) ? "isn't" : "is",
+ priv->tx_count_frames);
#endif
csum_insertion = (skb->ip_summed == CHECKSUM_PARTIAL);
@@ -1212,9 +1231,9 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
#ifdef STMMAC_XMIT_DEBUG
if ((nfrags > 0) || (skb->len > ETH_FRAME_LEN))
- pr_debug("stmmac xmit: skb len: %d, nopaged_len: %d,\n"
- "\t\tn_frags: %d, ip_summed: %d\n",
- skb->len, nopaged_len, nfrags, skb->ip_summed);
+ pr_debug("\tskb len: %d, nopaged_len: %d,\n"
+ "\t\tn_frags: %d, ip_summed: %d\n",
+ skb->len, nopaged_len, nfrags, skb->ip_summed);
#endif
priv->tx_skbuff[entry] = skb;
@@ -1245,10 +1264,24 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
wmb();
}
- /* Interrupt on completition only for the latest segment */
+ /* Finalize the latest segment. */
priv->hw->desc->close_tx_desc(desc);
wmb();
+ /* According to the coalesce parameter the IC bit for the latest
+ * segment could be reset and the timer re-started to invoke the
+ * stmmac_tx function. This approach takes care about the fragments.
+ */
+ priv->tx_count_frames += nfrags + 1;
+ if (priv->tx_coal_frames > priv->tx_count_frames) {
+ priv->hw->desc->clear_tx_ic(desc);
+ priv->xstats.tx_reset_ic_bit++;
+ TX_DBG("\t[entry %d]: tx_count_frames %d\n", entry,
+ priv->tx_count_frames);
+ mod_timer(&priv->txtimer,
+ STMMAC_COAL_TIMER(priv->tx_coal_timer));
+ } else
+ priv->tx_count_frames = 0;
/* To avoid raise condition */
priv->hw->desc->set_tx_owner(first);
@@ -1419,21 +1452,20 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
* @budget : maximum number of packets that the current CPU can receive from
* all interfaces.
* Description :
- * This function implements the the reception process.
- * Also it runs the TX completion thread
+ * To look at the incoming frames and clear the tx resources.
*/
static int stmmac_poll(struct napi_struct *napi, int budget)
{
struct stmmac_priv *priv = container_of(napi, struct stmmac_priv, napi);
int work_done = 0;
- priv->xstats.poll_n++;
- stmmac_tx(priv);
- work_done = stmmac_rx(priv, budget);
+ priv->xstats.napi_poll++;
+ stmmac_tx_clean(priv);
+ work_done = stmmac_rx(priv, budget);
if (work_done < budget) {
napi_complete(napi);
- stmmac_enable_irq(priv);
+ stmmac_enable_dma_irq(priv);
}
return work_done;
}
--
1.7.4.4
^ permalink raw reply related
* [net-next.git 5/6] stmmac: update the doc with new IRQ mitigation
From: Giuseppe CAVALLARO @ 2012-11-26 9:10 UTC (permalink / raw)
To: netdev; +Cc: bhutchings, davem, Giuseppe Cavallaro
In-Reply-To: <1353921046-5121-1-git-send-email-peppe.cavallaro@st.com>
This patch updates the stmmac.txt adding some information
about the new rx/tx mitigation schema adopted in the driver.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
Documentation/networking/stmmac.txt | 28 +++++++++++++++-------------
1 files changed, 15 insertions(+), 13 deletions(-)
diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
index ef9ee71..f9fa6db 100644
--- a/Documentation/networking/stmmac.txt
+++ b/Documentation/networking/stmmac.txt
@@ -29,11 +29,9 @@ The kernel configuration option is STMMAC_ETH:
dma_txsize: DMA tx ring size;
buf_sz: DMA buffer size;
tc: control the HW FIFO threshold;
- tx_coe: Enable/Disable Tx Checksum Offload engine;
watchdog: transmit timeout (in milliseconds);
flow_ctrl: Flow control ability [on/off];
pause: Flow Control Pause Time;
- tmrate: timer period (only if timer optimisation is configured).
3) Command line options
Driver parameters can be also passed in command line by using:
@@ -60,17 +58,19 @@ Then the poll method will be scheduled at some future point.
The incoming packets are stored, by the DMA, in a list of pre-allocated socket
buffers in order to avoid the memcpy (Zero-copy).
-4.3) Timer-Driver Interrupt
-Instead of having the device that asynchronously notifies the frame receptions,
-the driver configures a timer to generate an interrupt at regular intervals.
-Based on the granularity of the timer, the frames that are received by the
-device will experience different levels of latency. Some NICs have dedicated
-timer device to perform this task. STMMAC can use either the RTC device or the
-TMU channel 2 on STLinux platforms.
-The timers frequency can be passed to the driver as parameter; when change it,
-take care of both hardware capability and network stability/performance impact.
-Several performance tests on STM platforms showed this optimisation allows to
-spare the CPU while having the maximum throughput.
+4.3) Interrupt Mitigation
+The driver is able to mitigate the number of its DMA interrupts
+using NAPI for the reception on chips older than the 3.50.
+New chips have an HW RX-Watchdog used for this mitigation.
+
+On Tx-side, the mitigation schema is based on a SW timer that calls the
+tx function (stmmac_tx) to reclaim the resource after transmitting the
+frames.
+Also there is another parameter (like a threshold) used to program
+the descriptors avoiding to set the interrupt on completion bit in
+when the frame is sent (xmit).
+
+Mitigation parameters can be tuned by ethtool.
4.4) WOL
Wake up on Lan feature through Magic and Unicast frames are supported for the
@@ -121,6 +121,7 @@ struct plat_stmmacenet_data {
int bugged_jumbo;
int pmt;
int force_sf_dma_mode;
+ int riwt_off;
void (*fix_mac_speed)(void *priv, unsigned int speed);
void (*bus_setup)(void __iomem *ioaddr);
int (*init)(struct platform_device *pdev);
@@ -156,6 +157,7 @@ Where:
o pmt: core has the embedded power module (optional).
o force_sf_dma_mode: force DMA to use the Store and Forward mode
instead of the Threshold.
+ o riwt_off: force to disable the RX watchdog feature and switch to NAPI mode.
o fix_mac_speed: this callback is used for modifying some syscfg registers
(on ST SoCs) according to the link speed negotiated by the
physical layer .
--
1.7.4.4
^ permalink raw reply related
* [net-next.git 4/6 (V2)] stmmac: get/set coalesce parameters via ethtool
From: Giuseppe CAVALLARO @ 2012-11-26 9:10 UTC (permalink / raw)
To: netdev; +Cc: bhutchings, davem, Giuseppe Cavallaro
In-Reply-To: <1353921046-5121-1-git-send-email-peppe.cavallaro@st.com>
This patch is to get/set the tx/rx coalesce parameters
via ethtool interface.
Tests have been done on several platform with different GMAC chips w/o and w/
RX watchdog feature.
V2: reject coalesce settings that are not supported.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
---
.../net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 83 ++++++++++++++++++++
1 files changed, 83 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index 240acd2..f2aa013 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -524,6 +524,87 @@ static int stmmac_ethtool_op_set_eee(struct net_device *dev,
return phy_ethtool_set_eee(priv->phydev, edata);
}
+static u32 stmmac_usec2riwt(u32 usec, struct stmmac_priv *priv)
+{
+ unsigned long clk = clk_get_rate(priv->stmmac_clk);
+
+ if (!clk)
+ return 0;
+
+ return (usec * (clk / 1000000)) / 256;
+}
+
+static u32 stmmac_riwt2usec(u32 riwt, struct stmmac_priv *priv)
+{
+ unsigned long clk = clk_get_rate(priv->stmmac_clk);
+
+ if (!clk)
+ return 0;
+
+ return (riwt * 256) / (clk / 1000000);
+}
+
+static int stmmac_get_coalesce(struct net_device *dev,
+ struct ethtool_coalesce *ec)
+{
+ struct stmmac_priv *priv = netdev_priv(dev);
+
+ ec->tx_coalesce_usecs = priv->tx_coal_timer;
+ ec->tx_max_coalesced_frames = priv->tx_coal_frames;
+
+ if (priv->use_riwt)
+ ec->rx_coalesce_usecs = stmmac_riwt2usec(priv->rx_riwt, priv);
+
+ return 0;
+}
+
+static int stmmac_set_coalesce(struct net_device *dev,
+ struct ethtool_coalesce *ec)
+{
+ struct stmmac_priv *priv = netdev_priv(dev);
+ unsigned int rx_riwt;
+
+ /* Check not supported parameters */
+ if ((ec->rx_max_coalesced_frames) || (ec->rx_coalesce_usecs_irq) ||
+ (ec->rx_max_coalesced_frames_irq) || (ec->tx_coalesce_usecs_irq) ||
+ (ec->use_adaptive_rx_coalesce) || (ec->use_adaptive_tx_coalesce) ||
+ (ec->pkt_rate_low) || (ec->rx_coalesce_usecs_low) ||
+ (ec->rx_max_coalesced_frames_low) || (ec->tx_coalesce_usecs_high) ||
+ (ec->tx_max_coalesced_frames_low) || (ec->pkt_rate_high) ||
+ (ec->tx_coalesce_usecs_low) || (ec->rx_coalesce_usecs_high) ||
+ (ec->rx_max_coalesced_frames_high) ||
+ (ec->tx_max_coalesced_frames_irq) ||
+ (ec->stats_block_coalesce_usecs) ||
+ (ec->tx_max_coalesced_frames_high) || (ec->rate_sample_interval))
+ return -EOPNOTSUPP;
+
+ if (ec->rx_coalesce_usecs == 0)
+ return -EINVAL;
+
+ if ((ec->tx_coalesce_usecs == 0) &&
+ (ec->tx_max_coalesced_frames == 0))
+ return -EINVAL;
+
+ if ((ec->tx_coalesce_usecs > STMMAC_COAL_TX_TIMER) ||
+ (ec->tx_max_coalesced_frames > STMMAC_TX_MAX_FRAMES))
+ return -EINVAL;
+
+ rx_riwt = stmmac_usec2riwt(ec->rx_coalesce_usecs, priv);
+
+ if ((rx_riwt > MAX_DMA_RIWT) || (rx_riwt < MIN_DMA_RIWT))
+ return -EINVAL;
+ else if (!priv->use_riwt)
+ return -EOPNOTSUPP;
+
+ /* Only copy relevant parameters, ignore all others. */
+ priv->tx_coal_frames = ec->tx_max_coalesced_frames;
+ priv->tx_coal_timer = ec->tx_coalesce_usecs;
+ priv->rx_riwt = rx_riwt;
+ priv->hw->dma->rx_watchdog(priv->ioaddr, priv->rx_riwt);
+
+ return 0;
+}
+
static const struct ethtool_ops stmmac_ethtool_ops = {
.begin = stmmac_check_if_running,
.get_drvinfo = stmmac_ethtool_getdrvinfo,
@@ -544,6 +625,8 @@ static const struct ethtool_ops stmmac_ethtool_ops = {
.set_eee = stmmac_ethtool_op_set_eee,
.get_sset_count = stmmac_get_sset_count,
.get_ts_info = ethtool_op_get_ts_info,
+ .get_coalesce = stmmac_get_coalesce,
+ .set_coalesce = stmmac_set_coalesce,
};
void stmmac_set_ethtool_ops(struct net_device *netdev)
--
1.7.4.4
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox