* [PATCH] netfilter: CONNMARK: support save the mark of the master connection
From: Changli Gao @ 2011-01-27 9:38 UTC (permalink / raw)
To: Patrick McHardy; +Cc: David S. Miller, netfilter-devel, netdev, Changli Gao
In some cases (e.g. policy routing), all sub-connections are expected to
share the same mark as their master connection.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
---
include/linux/netfilter/xt_connmark.h | 3 ++-
net/netfilter/xt_connmark.c | 15 +++++++++++++++
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/include/linux/netfilter/xt_connmark.h b/include/linux/netfilter/xt_connmark.h
index efc17a8..4b513f8 100644
--- a/include/linux/netfilter/xt_connmark.h
+++ b/include/linux/netfilter/xt_connmark.h
@@ -15,7 +15,8 @@
enum {
XT_CONNMARK_SET = 0,
XT_CONNMARK_SAVE,
- XT_CONNMARK_RESTORE
+ XT_CONNMARK_RESTORE,
+ XT_CONNMARK_SAVE_MASTER,
};
struct xt_connmark_tginfo1 {
diff --git a/net/netfilter/xt_connmark.c b/net/netfilter/xt_connmark.c
index 7278145..4207bb6 100644
--- a/net/netfilter/xt_connmark.c
+++ b/net/netfilter/xt_connmark.c
@@ -69,6 +69,21 @@ connmark_tg(struct sk_buff *skb, const struct xt_action_param *par)
(ct->mark & info->ctmask);
skb->mark = newmark;
break;
+ case XT_CONNMARK_SAVE_MASTER:
+ if (ct->master) {
+ struct nf_conn *master;
+
+ master = ct->master;
+ while (master->master)
+ master = master->master;
+ newmark = (ct->mark & ~info->ctmask) ^
+ (master->mark & info->nfmask);
+ if (ct->mark != newmark) {
+ ct->mark = newmark;
+ nf_conntrack_event_cache(IPCT_MARK, ct);
+ }
+ }
+ break;
}
return XT_CONTINUE;
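The new XT_CONNMARK_SAVE_MASTER case can be sketched in userspace roughly as
follows. This is only an illustration: struct conn is a hypothetical,
simplified stand-in for struct nf_conn, and the helper names root_master()
and save_master_mark() do not exist in the kernel. The mask expression is
the one used in the hunk above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct nf_conn: only the fields used here. */
struct conn {
	struct conn *master;	/* NULL for a top-level connection */
	uint32_t mark;
};

/* Follow ->master links until the top-level connection is reached. */
static struct conn *root_master(struct conn *ct)
{
	while (ct->master)
		ct = ct->master;
	return ct;
}

/*
 * Same expression as in the patch: clear the ctmask bits of the child's
 * mark, then XOR in the nfmask bits of the top-level master's mark.
 * A connection without a master keeps its mark unchanged.
 */
static uint32_t save_master_mark(struct conn *ct, uint32_t ctmask,
				 uint32_t nfmask)
{
	struct conn *master = root_master(ct);

	if (master == ct)
		return ct->mark;
	return (ct->mark & ~ctmask) ^ (master->mark & nfmask);
}
```

With both masks set to 0xffffffff the child simply inherits the mark of the
top-level master, however long the chain of expectations is.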
^ permalink raw reply related
* Re: [PATCH net-next-2.6] drivers/net: remove some rcu sparse warnings
From: Arnd Bergmann @ 2011-01-27 9:22 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev, Michael Chan, Eilon Greenstein
In-Reply-To: <1296106103.1783.114.camel@edumazet-laptop>
On Thursday 27 January 2011, Eric Dumazet wrote:
> Add missing __rcu annotations and helpers.
> minor : Fix some rcu_dereference() calls in macvtap
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Arnd Bergmann <arnd@arndb.de>
> CC: Michael Chan <mchan@broadcom.com>
> CC: Eilon Greenstein <eilong@broadcom.com>
Macvtap bits look good, thanks!
Acked-by: Arnd Bergmann <arnd@arndb.de>
^ permalink raw reply
* Re: [PATCH] pch_gbe: Fix the issue that the receiving data is not normal.
From: Toshiharu Okada @ 2011-01-27 8:45 UTC (permalink / raw)
To: David Miller
Cc: netdev, linux-kernel, qi.wang, yong.y.wang, andrew.chih.howe.khor,
joel.clark, kok.howg.ewe
In-Reply-To: <20110125.133240.59688304.davem@davemloft.net>
Hi David
Thank you for your comment.
I will check them and submit a revised patch.
Best regards
Toshiharu Okada(OKI semiconductor)
----- Original Message -----
From: "David Miller" <davem@davemloft.net>
To: <toshiharu-linux@dsn.okisemi.com>
Cc: <netdev@vger.kernel.org>; <linux-kernel@vger.kernel.org>;
<qi.wang@intel.com>; <yong.y.wang@intel.com>;
<andrew.chih.howe.khor@intel.com>; <joel.clark@intel.com>;
<kok.howg.ewe@intel.com>
Sent: Wednesday, January 26, 2011 6:32 AM
Subject: Re: [PATCH] pch_gbe: Fix the issue that the receiving data is not
normal.
From: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com>
Date: Mon, 24 Jan 2011 13:43:31 +0900
> This PCH_GBE driver had an issue that the receiving data is not normal.
> This driver had not removed correctly the padding data
> which the DMA include in receiving data.
>
> This patch fixed this issue.
>
> Signed-off-by: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com>
There are bugs in these changes:
> if (skb_copy_flag) { /* recycle skb */
> struct sk_buff *new_skb;
> new_skb =
> - netdev_alloc_skb(netdev,
> - length + NET_IP_ALIGN);
> + netdev_alloc_skb(netdev, length);
> if (new_skb) {
> if (!skb_padding_flag) {
> skb_reserve(new_skb,
> - NET_IP_ALIGN);
> + PCH_GBE_DMA_PADDING);
> }
> memcpy(new_skb->data, skb->data,
> length);
If "!skb_padding_flag" then you will write past the end of the SKB
data in that memcpy.
You cannot allocate only "length" then proceed to reserve
PCH_GBE_DMA_PADDING and then add "length" worth of data on top of that.
In such a case you must allocate at least "length + PCH_GBE_DMA_PADDING".
Furthermore you _MUST_ respect NET_IP_ALIGN. Some platforms set this value
to "0", because otherwise performance suffers greatly.
There are two separate issues, removing the padding bytes provided by
the device, and aligning the IP headers as wanted by the cpu
architecture. Therefore they should be handled separately, and we
therefore should still see references to NET_IP_ALIGN in your patch.
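The size arithmetic behind the overflow can be sketched with a plain-buffer
model. This is not driver code: the helper names rx_copy_alloc_len() and
rx_copy_fits() are hypothetical, and the headroom argument stands for
whatever is reserved before the copy (DMA padding, NET_IP_ALIGN, or both).

```c
#include <stddef.h>

/*
 * Bytes to request so that `headroom` bytes can be reserved and `length`
 * bytes of payload can still be copied in behind them.
 */
static size_t rx_copy_alloc_len(size_t length, size_t headroom)
{
	return length + headroom;
}

/* Non-zero if `length` bytes written after `headroom` stay within `alloc`. */
static int rx_copy_fits(size_t alloc, size_t headroom, size_t length)
{
	return headroom + length <= alloc;
}
```

Allocating only "length" and then reserving any non-zero headroom fails the
rx_copy_fits() check, which is the case described above.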
^ permalink raw reply
* Re: TSO/GRO/LRO/somethingO breaks LVS on 2.6.36
From: Eric Dumazet @ 2011-01-27 7:42 UTC (permalink / raw)
To: Simon Kirby; +Cc: Simon Horman, netdev
In-Reply-To: <20110127004805.GB11578@hostway.ca>
Le mercredi 26 janvier 2011 à 16:48 -0800, Simon Kirby a écrit :
> On Thu, Jan 13, 2011 at 03:34:22PM +0900, Simon Horman wrote:
>
> > Hi Simon,
> >
> > thanks for prodding me to respond to this post offline and sorry for not
> > responding earlier.
> >
> > Firstly, I think that this is a receive-side problem so I don't believe
> > that GSO (generic segmentation offload) or other transmit-side options are
> > likely to have any effect.
> >
> > My understanding is that on the receive-side there are two options which
> > when enabled can result in the behaviour that you describe.
> >
> > * LRO (large receive offload)
> >
> > You have this disabled, and assuming it really is disabled it
> > shouldn't be causing a problem.
> >
> > * GRO (generic receive offload)
> >
> > This does not seem to be in the output of your ethtool commands at all.
> > So I wonder if your ethtool is too old to support this option?
>
> So, this was the case. Our ethtool (lenny) was too old to see the GRO
> option, only GSO. Disabling GRO on eth1.39 has no effect, but disabling
> it on eth1 caused it to stop receiving the merged frames, fixing the LVS
> packet loss (due to no sending GSO support from LVS/IPVS).
>
> Speaking of this, did your patch for LVS/IPVS GSO support go anywhere?
>
> > In any case, I was able to reproduce the problem that you describe (or at
> > least something very similar) using 2.6.36 with GRO enabled on eth1.1 and
> > the problem did not manifest when I disabled GRO on eth1.1.
>
> It worked for you to do ethtool -K eth1.1 gro off, then? For me on
> 2.6.37, it seemed to be that "ethtool -K eth1 gro off" was needed, even
> though packets arrive on eth1.39.
>
> Also, strangely, 2.6.35.4's default state (with no received merged frames)
> has GRO on for eth1 but off for eth1.39:
>
> # ethtool -k eth1
> Offload parameters for eth1:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp-segmentation-offload: on
> udp-fragmentation-offload: off
> generic-segmentation-offload: on
> generic-receive-offload: on
> large-receive-offload: off
> ntuple-filters: off
> receive-hashing: off
>
> # ethtool -k eth1.39
> Offload parameters for eth1.39:
> rx-checksumming: on
> tx-checksumming: off
> scatter-gather: off
> tcp-segmentation-offload: off
> udp-fragmentation-offload: off
> generic-segmentation-offload: off
> generic-receive-offload: off
> large-receive-offload: off
> ntuple-filters: off
> receive-hashing: off
>
> If I set 2.6.37 to have all of the same options, I still see GRO frames
> on 2.6.37 (tg3), which is weird.
>
Weird maybe, but GRO check/handling is done in dev_gro_receive(), on the
eth1 receive path.
Frames are assembled by the GRO layer using the tg3 NAPI structure (holding
GRO machine state) before being delivered to eth1.39.
It would be useless/expensive to add another GRO layer on eth1.39.
We might not report GRO state on vlan/bonding (or reflect real device
GRO state).
^ permalink raw reply
* Re: [PATCH net-next-2.6] net_sched: sch_mqprio: dont leak kernel memory
From: Eric Dumazet @ 2011-01-27 7:04 UTC (permalink / raw)
To: Joe Perches; +Cc: David Miller, netdev, john.r.fastabend
In-Reply-To: <1296108251.2448.183.camel@Joe-Laptop>
Le mercredi 26 janvier 2011 à 22:04 -0800, Joe Perches a écrit :
> /* MQPRIO */
> #define TC_QOPT_BITMASK 15
> #define TC_QOPT_MAX_QUEUE 16
>
> struct tc_mqprio_qopt {
> __u8 num_tc;
> __u8 prio_tc_map[TC_QOPT_BITMASK + 1];
> __u8 hw;
> __u16 count[TC_QOPT_MAX_QUEUE];
> __u16 offset[TC_QOPT_MAX_QUEUE];
> };
>
> I believe this struct needs to be declared __packed.
>
Oh my god. Yet another ugly thing.
> It could otherwise be 24 bytes not 22.
22? You are probably kidding. It's exactly 82.
Listen, I double-checked my patch and it's good, while your rants are lazy.
> Or if char array declarations have a different
> alignment requirement, could be any size.
>
If if if... could could could...
> memset is better than {0}.
>
You never stop do you ?
The largest member type is u16, therefore alignof() is 2, not 4.
No ABI requires a short (u16) to be aligned on a 4-byte boundary.
If you find a compiler not respecting this, you can bet Linux won't run
at all if compiled with it. A 'potential 2-byte leak' in mqprio will hardly
be a problem.
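The size and alignment argument can be checked in userspace with a small
sketch. It assumes a common ABI where uint16_t is 2-byte aligned, uses
uint8_t/uint16_t in place of __u8/__u16, and the helper names
mqprio_member_bytes() and mqprio_padding_bytes() are made up for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define TC_QOPT_BITMASK 15
#define TC_QOPT_MAX_QUEUE 16

/* Userspace copy of the quoted layout, with fixed-width types. */
struct tc_mqprio_qopt {
	uint8_t  num_tc;
	uint8_t  prio_tc_map[TC_QOPT_BITMASK + 1];
	uint8_t  hw;
	uint16_t count[TC_QOPT_MAX_QUEUE];
	uint16_t offset[TC_QOPT_MAX_QUEUE];
};

/* Sum of the member sizes: 1 + 16 + 1 + 32 + 32. */
static size_t mqprio_member_bytes(void)
{
	struct tc_mqprio_qopt q;

	return sizeof(q.num_tc) + sizeof(q.prio_tc_map) + sizeof(q.hw) +
	       sizeof(q.count) + sizeof(q.offset);
}

/* Padding is whatever sizeof() reports beyond the members themselves. */
static size_t mqprio_padding_bytes(void)
{
	return sizeof(struct tc_mqprio_qopt) - mqprio_member_bytes();
}
```

The three one-byte-aligned members occupy 18 bytes, so count[] already
starts on an even offset and no hole is needed before it.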
^ permalink raw reply
* Re: [PATCH net-next-2.6] net_sched: sch_mqprio: dont leak kernel memory
From: Changli Gao @ 2011-01-27 6:54 UTC (permalink / raw)
To: Joe Perches; +Cc: David Miller, eric.dumazet, netdev, john.r.fastabend
In-Reply-To: <1296108251.2448.183.camel@Joe-Laptop>
On Thu, Jan 27, 2011 at 2:04 PM, Joe Perches <joe@perches.com> wrote:
> On Wed, 2011-01-26 at 11:55 -0800, David Miller wrote:
>> From: Joe Perches <joe@perches.com>
>> Date: Wed, 26 Jan 2011 09:43:43 -0800
>> > On Wed, 2011-01-26 at 18:21 +0100, Eric Dumazet wrote:
>> >> mqprio_dump() should make sure all fields of struct tc_mqprio_qopt are
>> >> initialized.
>> >> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>> >> CC: John Fastabend <john.r.fastabend@intel.com>
>> >> ---
>> >> net/sched/sch_mqprio.c | 2 +-
>> >> 1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
>> >> index fbc6f53..effd4ee 100644
>> >> --- a/net/sched/sch_mqprio.c
>> >> +++ b/net/sched/sch_mqprio.c
>> >> @@ -215,7 +215,7 @@ static int mqprio_dump(struct Qdisc *sch, struct sk_buff *skb)
>> >> struct net_device *dev = qdisc_dev(sch);
>> >> struct mqprio_sched *priv = qdisc_priv(sch);
>> >> unsigned char *b = skb_tail_pointer(skb);
>> >> - struct tc_mqprio_qopt opt;
>> >> + struct tc_mqprio_qopt opt = { 0 };
>> > I think the best style to use memset so that any
>> > possible struct padding is guaranteed to be zeroed.
>> Such padding does not exist, and we won't add such padding since this is
>> a user visible data structure and thus whose layout is cast in stone.
>
> /* MQPRIO */
> #define TC_QOPT_BITMASK 15
> #define TC_QOPT_MAX_QUEUE 16
>
> struct tc_mqprio_qopt {
> __u8 num_tc; // 1
> __u8 prio_tc_map[TC_QOPT_BITMASK + 1]; // 16
> __u8 hw; // 1
> __u16 count[TC_QOPT_MAX_QUEUE]; // 32
> __u16 offset[TC_QOPT_MAX_QUEUE]; //32
> };
>
> I believe this struct needs to be declared __packed.
>
> It could otherwise be 24 bytes not 22.
> Or if char array declarations have a different
> alignment requirement, could be any size.
>
The total size is 1 + 16 + 1 + 32 + 32 = 82.
How do you get 24 or 22?
> memset is better than {0}.
>
--
Regards,
Changli Gao(xiaosuo@gmail.com)
^ permalink raw reply
* Re: [PATCH net-next-2.6] net_sched: sch_mqprio: dont leak kernel memory
From: Joe Perches @ 2011-01-27 6:04 UTC (permalink / raw)
To: David Miller; +Cc: eric.dumazet, netdev, john.r.fastabend
In-Reply-To: <20110126.115530.226756606.davem@davemloft.net>
On Wed, 2011-01-26 at 11:55 -0800, David Miller wrote:
> From: Joe Perches <joe@perches.com>
> Date: Wed, 26 Jan 2011 09:43:43 -0800
> > On Wed, 2011-01-26 at 18:21 +0100, Eric Dumazet wrote:
> >> mqprio_dump() should make sure all fields of struct tc_mqprio_qopt are
> >> initialized.
> >> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> >> CC: John Fastabend <john.r.fastabend@intel.com>
> >> ---
> >> net/sched/sch_mqprio.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
> >> index fbc6f53..effd4ee 100644
> >> --- a/net/sched/sch_mqprio.c
> >> +++ b/net/sched/sch_mqprio.c
> >> @@ -215,7 +215,7 @@ static int mqprio_dump(struct Qdisc *sch, struct sk_buff *skb)
> >> struct net_device *dev = qdisc_dev(sch);
> >> struct mqprio_sched *priv = qdisc_priv(sch);
> >> unsigned char *b = skb_tail_pointer(skb);
> >> - struct tc_mqprio_qopt opt;
> >> + struct tc_mqprio_qopt opt = { 0 };
> > I think the best style to use memset so that any
> > possible struct padding is guaranteed to be zeroed.
> Such padding does not exist, and we won't add such padding since this is
> a user visible data structure and thus whose layout is cast in stone.
/* MQPRIO */
#define TC_QOPT_BITMASK 15
#define TC_QOPT_MAX_QUEUE 16
struct tc_mqprio_qopt {
__u8 num_tc;
__u8 prio_tc_map[TC_QOPT_BITMASK + 1];
__u8 hw;
__u16 count[TC_QOPT_MAX_QUEUE];
__u16 offset[TC_QOPT_MAX_QUEUE];
};
I believe this struct needs to be declared __packed.
It could otherwise be 24 bytes not 22.
Or if char array declarations have a different
alignment requirement, could be any size.
memset is better than {0}.
cheers, Joe
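For a structure that does contain holes, the memset() style mentioned above
might look like the sketch below. struct padded is a hypothetical example
(not tc_mqprio_qopt), and padded_clear() and nonzero_bytes() are
illustrative helpers. memset() writes every byte of the object, whereas
whether "= { 0 }" also clears padding bytes is, as far as I know, left to
the compiler.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical struct with a hole: padding typically follows `tag`. */
struct padded {
	uint8_t  tag;
	uint32_t value;
};

/* Zero the whole object representation, padding bytes included. */
static void padded_clear(struct padded *p)
{
	memset(p, 0, sizeof(*p));
}

/* Count the non-zero bytes in an object's representation. */
static size_t nonzero_bytes(const void *obj, size_t len)
{
	const unsigned char *b = obj;
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		if (b[i])
			n++;
	return n;
}
```

After padded_clear(), nonzero_bytes() reports 0 regardless of what the
memory held before, which is the property wanted before copying a struct to
userspace.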
^ permalink raw reply
* [PATCH net-next-2.6] drivers/net: remove some rcu sparse warnings
From: Eric Dumazet @ 2011-01-27 5:28 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Michael Chan, Arnd Bergmann, Eilon Greenstein
Add missing __rcu annotations and helpers.
minor : Fix some rcu_dereference() calls in macvtap
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Michael Chan <mchan@broadcom.com>
CC: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/bnx2.c | 6 ++++--
drivers/net/bnx2.h | 2 +-
drivers/net/bnx2x/bnx2x.h | 2 +-
drivers/net/bnx2x/bnx2x_main.c | 3 ++-
drivers/net/cnic.c | 27 ++++++++++++++++++---------
drivers/net/cnic.h | 2 +-
drivers/net/hamradio/bpqether.c | 5 +++--
drivers/net/macvtap.c | 18 ++++++++++--------
8 files changed, 40 insertions(+), 25 deletions(-)
diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 231aa97..2c3d747 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -435,7 +435,8 @@ bnx2_cnic_stop(struct bnx2 *bp)
struct cnic_ctl_info info;
mutex_lock(&bp->cnic_lock);
- c_ops = bp->cnic_ops;
+ c_ops = rcu_dereference_protected(bp->cnic_ops,
+ lockdep_is_held(&bp->cnic_lock));
if (c_ops) {
info.cmd = CNIC_CTL_STOP_CMD;
c_ops->cnic_ctl(bp->cnic_data, &info);
@@ -450,7 +451,8 @@ bnx2_cnic_start(struct bnx2 *bp)
struct cnic_ctl_info info;
mutex_lock(&bp->cnic_lock);
- c_ops = bp->cnic_ops;
+ c_ops = rcu_dereference_protected(bp->cnic_ops,
+ lockdep_is_held(&bp->cnic_lock));
if (c_ops) {
if (!(bp->flags & BNX2_FLAG_USING_MSIX)) {
struct bnx2_napi *bnapi = &bp->bnx2_napi[0];
diff --git a/drivers/net/bnx2.h b/drivers/net/bnx2.h
index 5488a2e..6824eba 100644
--- a/drivers/net/bnx2.h
+++ b/drivers/net/bnx2.h
@@ -6758,7 +6758,7 @@ struct bnx2 {
u32 tx_wake_thresh;
#ifdef BCM_CNIC
- struct cnic_ops *cnic_ops;
+ struct cnic_ops __rcu *cnic_ops;
void *cnic_data;
#endif
diff --git a/drivers/net/bnx2x/bnx2x.h b/drivers/net/bnx2x/bnx2x.h
index 8e41837..dfdb9b5 100644
--- a/drivers/net/bnx2x/bnx2x.h
+++ b/drivers/net/bnx2x/bnx2x.h
@@ -1110,7 +1110,7 @@ struct bnx2x {
#define BNX2X_CNIC_FLAG_MAC_SET 1
void *t2;
dma_addr_t t2_mapping;
- struct cnic_ops *cnic_ops;
+ struct cnic_ops __rcu *cnic_ops;
void *cnic_data;
u32 cnic_tag;
struct cnic_eth_dev cnic_eth_dev;
diff --git a/drivers/net/bnx2x/bnx2x_main.c b/drivers/net/bnx2x/bnx2x_main.c
index 8cdcf5b..a2a1bc4 100644
--- a/drivers/net/bnx2x/bnx2x_main.c
+++ b/drivers/net/bnx2x/bnx2x_main.c
@@ -9862,7 +9862,8 @@ static int bnx2x_cnic_ctl_send(struct bnx2x *bp, struct cnic_ctl_info *ctl)
int rc = 0;
mutex_lock(&bp->cnic_mutex);
- c_ops = bp->cnic_ops;
+ c_ops = rcu_dereference_protected(bp->cnic_ops,
+ lockdep_is_held(&bp->cnic_mutex));
if (c_ops)
rc = c_ops->cnic_ctl(bp->cnic_data, ctl);
mutex_unlock(&bp->cnic_mutex);
diff --git a/drivers/net/cnic.c b/drivers/net/cnic.c
index 263a294..e12049e 100644
--- a/drivers/net/cnic.c
+++ b/drivers/net/cnic.c
@@ -65,7 +65,14 @@ static LIST_HEAD(cnic_udev_list);
static DEFINE_RWLOCK(cnic_dev_lock);
static DEFINE_MUTEX(cnic_lock);
-static struct cnic_ulp_ops *cnic_ulp_tbl[MAX_CNIC_ULP_TYPE];
+static struct cnic_ulp_ops __rcu *cnic_ulp_tbl[MAX_CNIC_ULP_TYPE];
+
+/* helper function, assuming cnic_lock is held */
+static inline struct cnic_ulp_ops *cnic_ulp_tbl_prot(int type)
+{
+ return rcu_dereference_protected(cnic_ulp_tbl[type],
+ lockdep_is_held(&cnic_lock));
+}
static int cnic_service_bnx2(void *, void *);
static int cnic_service_bnx2x(void *, void *);
@@ -435,7 +442,7 @@ int cnic_register_driver(int ulp_type, struct cnic_ulp_ops *ulp_ops)
return -EINVAL;
}
mutex_lock(&cnic_lock);
- if (cnic_ulp_tbl[ulp_type]) {
+ if (cnic_ulp_tbl_prot(ulp_type)) {
pr_err("%s: Type %d has already been registered\n",
__func__, ulp_type);
mutex_unlock(&cnic_lock);
@@ -478,7 +485,7 @@ int cnic_unregister_driver(int ulp_type)
return -EINVAL;
}
mutex_lock(&cnic_lock);
- ulp_ops = cnic_ulp_tbl[ulp_type];
+ ulp_ops = cnic_ulp_tbl_prot(ulp_type);
if (!ulp_ops) {
pr_err("%s: Type %d has not been registered\n",
__func__, ulp_type);
@@ -529,7 +536,7 @@ static int cnic_register_device(struct cnic_dev *dev, int ulp_type,
return -EINVAL;
}
mutex_lock(&cnic_lock);
- if (cnic_ulp_tbl[ulp_type] == NULL) {
+ if (cnic_ulp_tbl_prot(ulp_type) == NULL) {
pr_err("%s: Driver with type %d has not been registered\n",
__func__, ulp_type);
mutex_unlock(&cnic_lock);
@@ -544,7 +551,7 @@ static int cnic_register_device(struct cnic_dev *dev, int ulp_type,
clear_bit(ULP_F_START, &cp->ulp_flags[ulp_type]);
cp->ulp_handle[ulp_type] = ulp_ctx;
- ulp_ops = cnic_ulp_tbl[ulp_type];
+ ulp_ops = cnic_ulp_tbl_prot(ulp_type);
rcu_assign_pointer(cp->ulp_ops[ulp_type], ulp_ops);
cnic_hold(dev);
@@ -2953,7 +2960,8 @@ static void cnic_ulp_stop(struct cnic_dev *dev)
struct cnic_ulp_ops *ulp_ops;
mutex_lock(&cnic_lock);
- ulp_ops = cp->ulp_ops[if_type];
+ ulp_ops = rcu_dereference_protected(cp->ulp_ops[if_type],
+ lockdep_is_held(&cnic_lock));
if (!ulp_ops) {
mutex_unlock(&cnic_lock);
continue;
@@ -2977,7 +2985,8 @@ static void cnic_ulp_start(struct cnic_dev *dev)
struct cnic_ulp_ops *ulp_ops;
mutex_lock(&cnic_lock);
- ulp_ops = cp->ulp_ops[if_type];
+ ulp_ops = rcu_dereference_protected(cp->ulp_ops[if_type],
+ lockdep_is_held(&cnic_lock));
if (!ulp_ops || !ulp_ops->cnic_start) {
mutex_unlock(&cnic_lock);
continue;
@@ -3041,7 +3050,7 @@ static void cnic_ulp_init(struct cnic_dev *dev)
struct cnic_ulp_ops *ulp_ops;
mutex_lock(&cnic_lock);
- ulp_ops = cnic_ulp_tbl[i];
+ ulp_ops = cnic_ulp_tbl_prot(i);
if (!ulp_ops || !ulp_ops->cnic_init) {
mutex_unlock(&cnic_lock);
continue;
@@ -3065,7 +3074,7 @@ static void cnic_ulp_exit(struct cnic_dev *dev)
struct cnic_ulp_ops *ulp_ops;
mutex_lock(&cnic_lock);
- ulp_ops = cnic_ulp_tbl[i];
+ ulp_ops = cnic_ulp_tbl_prot(i);
if (!ulp_ops || !ulp_ops->cnic_exit) {
mutex_unlock(&cnic_lock);
continue;
diff --git a/drivers/net/cnic.h b/drivers/net/cnic.h
index b328f6c..4456260 100644
--- a/drivers/net/cnic.h
+++ b/drivers/net/cnic.h
@@ -220,7 +220,7 @@ struct cnic_local {
#define ULP_F_INIT 0
#define ULP_F_START 1
#define ULP_F_CALL_PENDING 2
- struct cnic_ulp_ops *ulp_ops[MAX_CNIC_ULP_TYPE];
+ struct cnic_ulp_ops __rcu *ulp_ops[MAX_CNIC_ULP_TYPE];
unsigned long cnic_local_flags;
#define CNIC_LCL_FL_KWQ_INIT 0x0
diff --git a/drivers/net/hamradio/bpqether.c b/drivers/net/hamradio/bpqether.c
index ac1d323..8931168 100644
--- a/drivers/net/hamradio/bpqether.c
+++ b/drivers/net/hamradio/bpqether.c
@@ -400,13 +400,14 @@ static void *bpq_seq_start(struct seq_file *seq, loff_t *pos)
static void *bpq_seq_next(struct seq_file *seq, void *v, loff_t *pos)
{
struct list_head *p;
+ struct bpqdev *bpqdev = v;
++*pos;
if (v == SEQ_START_TOKEN)
- p = rcu_dereference(bpq_devices.next);
+ p = rcu_dereference(list_next_rcu(&bpq_devices));
else
- p = rcu_dereference(((struct bpqdev *)v)->bpq_list.next);
+ p = rcu_dereference(list_next_rcu(&bpqdev->bpq_list));
return (p == &bpq_devices) ? NULL
: list_entry(p, struct bpqdev, bpq_list);
diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 5933621..2300e45 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -39,7 +39,7 @@ struct macvtap_queue {
struct socket sock;
struct socket_wq wq;
int vnet_hdr_sz;
- struct macvlan_dev *vlan;
+ struct macvlan_dev __rcu *vlan;
struct file *file;
unsigned int flags;
};
@@ -141,7 +141,8 @@ static void macvtap_put_queue(struct macvtap_queue *q)
struct macvlan_dev *vlan;
spin_lock(&macvtap_lock);
- vlan = rcu_dereference(q->vlan);
+ vlan = rcu_dereference_protected(q->vlan,
+ lockdep_is_held(&macvtap_lock));
if (vlan) {
int index = get_slot(vlan, q);
@@ -219,7 +220,8 @@ static void macvtap_del_queues(struct net_device *dev)
/* macvtap_put_queue can free some slots, so go through all slots */
spin_lock(&macvtap_lock);
for (i = 0; i < MAX_MACVTAP_QUEUES && vlan->numvtaps; i++) {
- q = rcu_dereference(vlan->taps[i]);
+ q = rcu_dereference_protected(vlan->taps[i],
+ lockdep_is_held(&macvtap_lock));
if (q) {
qlist[j++] = q;
rcu_assign_pointer(vlan->taps[i], NULL);
@@ -569,7 +571,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q,
}
rcu_read_lock_bh();
- vlan = rcu_dereference(q->vlan);
+ vlan = rcu_dereference_bh(q->vlan);
if (vlan)
macvlan_start_xmit(skb, vlan->dev);
else
@@ -583,7 +585,7 @@ err_kfree:
err:
rcu_read_lock_bh();
- vlan = rcu_dereference(q->vlan);
+ vlan = rcu_dereference_bh(q->vlan);
if (vlan)
vlan->dev->stats.tx_dropped++;
rcu_read_unlock_bh();
@@ -631,7 +633,7 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
ret = skb_copy_datagram_const_iovec(skb, 0, iv, vnet_hdr_len, len);
rcu_read_lock_bh();
- vlan = rcu_dereference(q->vlan);
+ vlan = rcu_dereference_bh(q->vlan);
if (vlan)
macvlan_count_rx(vlan, len, ret == 0, 0);
rcu_read_unlock_bh();
@@ -727,7 +729,7 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
case TUNGETIFF:
rcu_read_lock_bh();
- vlan = rcu_dereference(q->vlan);
+ vlan = rcu_dereference_bh(q->vlan);
if (vlan)
dev_hold(vlan->dev);
rcu_read_unlock_bh();
@@ -736,7 +738,7 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
return -ENOLINK;
ret = 0;
- if (copy_to_user(&ifr->ifr_name, q->vlan->dev->name, IFNAMSIZ) ||
+ if (copy_to_user(&ifr->ifr_name, vlan->dev->name, IFNAMSIZ) ||
put_user(q->flags, &ifr->ifr_flags))
ret = -EFAULT;
dev_put(vlan->dev);
^ permalink raw reply related
* Re: [net-next 08/12] ixgb: convert to new VLAN model
From: Ben Hutchings @ 2011-01-27 4:18 UTC (permalink / raw)
To: Jesse Gross
Cc: Tantilov, Emil S, Kirsher, Jeffrey T, davem@davemloft.net,
netdev@vger.kernel.org, bphilips@novell.com, Pieper, Jeffrey E
In-Reply-To: <AANLkTi=RU11ibzd3c9sqCLL0pNowvx1_ow7C=qWVoPMt@mail.gmail.com>
On Wed, 2011-01-26 at 19:53 -0800, Jesse Gross wrote:
> On Tue, Jan 25, 2011 at 10:20 AM, Tantilov, Emil S
> <emil.s.tantilov@intel.com> wrote:
[...]
> > Sure, but I think a savvy user would always check the result of an
> > ethtool command (ie. `ethtool -K` followed with `ethtool -k`, -A/-a,
> > etc).
>
> Probably, but it seems the less savviness required from the user the
> better. Regardless, it doesn't affect anything here, it would just be
> a change to the userspace tool.
I am intending to modify ethtool so that it will report any other
offload settings that were changed automatically.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* [PATCH net-next-2.6] net: fix dev_seq_next()
From: Eric Dumazet @ 2011-01-27 4:08 UTC (permalink / raw)
To: David Miller, Paul E. McKenney; +Cc: netdev
Paul, the following comment in include/linux/rculist.h is misleading :
"Why is there no list_empty_rcu()? Because list_empty() serves this
purpose..."
This is probably why I made the error ;)
list_empty() has a meaning only if state cannot change right after its
use.
In an rcu_read_lock() section, state _can_ change, so there is no way
list_empty() can be used at all.
Thanks
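A userspace analogy of the point above, as a sketch only: C11 atomics with
acquire loads stand in for rcu_dereference(), and struct node, first_entry()
and next_entry() are hypothetical names. The pointer is loaded exactly once
and that same value is compared against the list head, instead of testing
for emptiness first and reading ->next a second time.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Circular singly linked list: the last node points back to the head. */
struct node {
	_Atomic(struct node *) next;
	int id;
};

/* Load head->next once; NULL means the list was empty at that instant. */
static struct node *first_entry(struct node *head)
{
	struct node *n = atomic_load_explicit(&head->next,
					      memory_order_acquire);

	return n == head ? NULL : n;
}

/* Load cur->next once; NULL means cur was the last entry at that instant. */
static struct node *next_entry(struct node *head, struct node *cur)
{
	struct node *n = atomic_load_explicit(&cur->next,
					      memory_order_acquire);

	return n == head ? NULL : n;
}
```

Because the comparison uses the value that was actually loaded, a concurrent
writer cannot make the reader return the head itself as if it were an entry.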
[PATCH net-next-2.6] net: fix dev_seq_next()
Commit c6d14c84566d (net: Introduce for_each_netdev_rcu() iterator)
added a race in dev_seq_next().
The rcu_dereference() call should be done _before_ testing the end of
list, or we might return a wrong net_device if a concurrent thread
changes net_device list under us.
Note : discovered thanks to a sparse warning :
net/core/dev.c:3919:9: error: incompatible types in comparison expression
(different address spaces)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
Given this was discovered by code analysis rather than a bug report, I
prepared a patch for net-next-2.6. Once fully tested, this could be
backported to 2.6.33
include/linux/netdevice.h | 9 ++++++++-
net/core/dev.c | 11 +++++++----
2 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 8858422..c7d7074 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1447,7 +1447,7 @@ static inline struct net_device *next_net_device_rcu(struct net_device *dev)
struct net *net;
net = dev_net(dev);
- lh = rcu_dereference(dev->dev_list.next);
+ lh = rcu_dereference(list_next_rcu(&dev->dev_list));
return lh == &net->dev_base_head ? NULL : net_device_entry(lh);
}
@@ -1457,6 +1457,13 @@ static inline struct net_device *first_net_device(struct net *net)
net_device_entry(net->dev_base_head.next);
}
+static inline struct net_device *first_net_device_rcu(struct net *net)
+{
+ struct list_head *lh = rcu_dereference(list_next_rcu(&net->dev_base_head));
+
+ return lh == &net->dev_base_head ? NULL : net_device_entry(lh);
+}
+
extern int netdev_boot_setup_check(struct net_device *dev);
extern unsigned long netdev_boot_base(const char *prefix, int unit);
extern struct net_device *dev_getbyhwaddr_rcu(struct net *net, unsigned short type,
diff --git a/net/core/dev.c b/net/core/dev.c
index 1b4c07f..ddd5df2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4051,12 +4051,15 @@ void *dev_seq_start(struct seq_file *seq, loff_t *pos)
void *dev_seq_next(struct seq_file *seq, void *v, loff_t *pos)
{
- struct net_device *dev = (v == SEQ_START_TOKEN) ?
- first_net_device(seq_file_net(seq)) :
- next_net_device((struct net_device *)v);
+ struct net_device *dev = v;
+
+ if (v == SEQ_START_TOKEN)
+ dev = first_net_device_rcu(seq_file_net(seq));
+ else
+ dev = next_net_device_rcu(dev);
++*pos;
- return rcu_dereference(dev);
+ return dev;
}
void dev_seq_stop(struct seq_file *seq, void *v)
^ permalink raw reply related
* Re: [net-next 08/12] ixgb: convert to new VLAN model
From: Jesse Gross @ 2011-01-27 3:53 UTC (permalink / raw)
To: Tantilov, Emil S
Cc: Kirsher, Jeffrey T, davem@davemloft.net, netdev@vger.kernel.org,
bphilips@novell.com, Pieper, Jeffrey E, Ben Hutchings
In-Reply-To: <EA929A9653AAE14F841771FB1DE5A136602DF59AA3@rrsmsx501.amr.corp.intel.com>
On Tue, Jan 25, 2011 at 10:20 AM, Tantilov, Emil S
<emil.s.tantilov@intel.com> wrote:
>>-----Original Message-----
>>From: Jesse Gross [mailto:jesse@nicira.com]
>>Sent: Tuesday, January 25, 2011 9:23 AM
>>To: Tantilov, Emil S
>>Cc: Kirsher, Jeffrey T; davem@davemloft.net; netdev@vger.kernel.org;
>>bphilips@novell.com; Pieper, Jeffrey E
>>Subject: Re: [net-next 08/12] ixgb: convert to new VLAN model
>>
>>On Sun, Jan 23, 2011 at 4:25 PM, Tantilov, Emil S
>><emil.s.tantilov@intel.com> wrote:
>>> Jesse Gross wrote:
>>>> On Thu, Jan 6, 2011 at 7:29 PM, <jeffrey.t.kirsher@intel.com> wrote:
>>>>> +static int ixgb_set_flags(struct net_device *netdev, u32 data) +{
>>>>> + struct ixgb_adapter *adapter = netdev_priv(netdev); +
>>>>> bool need_reset; + int rc;
>>>>> +
>>>>> + /*
>>>>> + * TX vlan insertion does not work per HW design when Rx
>>>>> stripping is + * disabled. Disable txvlan when rxvlan is
>>>>> off. + */ + if ((data & ETH_FLAG_RXVLAN) !=
>>>>> (netdev->features & NETIF_F_HW_VLAN_RX)) + data ^=
>>>>> ETH_FLAG_TXVLAN;
>>>>
>>>> Does this really do the right thing? If the RX vlan setting is
>>>> changed, it will do the opposite of what the user requested for TX
>>>> vlan?
>>>>
>>>> So if I start with both on (the default) and turn them both off in one
>>>> command (a valid setting), I will get RX off and TX on (an invalid
>>>> setting).
>>>>
>>>> Why not:
>>>>
>>>> if (!(data & ETH_FLAG_RXVLAN))
>>>> data &= ~ETH_FLAG_TXVLAN;
>>>
>>> Yeah that works for disabling rxvlan, but what if rxvlan is disabled, and
>>the user attempts to enable txvlan? At least our validation argued that we
>>should make it work both ways. Perhaps something like the following?
>>>
>>> if (!(data & ETH_FLAG_RXVLAN) &&
>>> (netdev->features & NETIF_F_HW_VLAN_TX))
>>> data &= ~ETH_FLAG_TXVLAN;
>>> else if (data & ETH_FLAG_TXVLAN)
>>> data |= ETH_FLAG_RXVLAN;
>>
>>I think the logic above does what you describe and will always result
>>in a consistent state. Turning dependent features on when needed is a
>>little bit inconsistent with the rest of Ethtool (for example, turning
>>on TSO when checksum offloading is off will not enable checksum
>>offloading, it will produce an error). However, I know that drivers
>
> That is the reason I asked, as I don't want to keep bouncing the patch back and forth. Personally I like the idea of helping the user and adjusting the flags to something that works rather than a generic error message.
I think it is fine to adjust things, especially where the restrictions
are hardware specific and the user is less likely to know what
settings are related. As long as it works, it doesn't matter too much
to me either way, so please do what you think is the most appropriate.
>
>>aren't completely consistent here and the most important part is that
>>it enforces valid states, so I don't have a strong opinion. Ben's
>>previous suggestion of Ethtool querying again after the operation and
>>reporting any flags that were automatically changed would help a lot
>>here.
>
> Sure, but I think a savvy user would always check the result of an ethtool command (ie. `ethtool -K` followed with `ethtool -k`, -A/-a, etc).
Probably, but it seems the less savviness required from the user the
better. Regardless, it doesn't affect anything here, it would just be
a change to the userspace tool.
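The flag adjustment proposed in the quoted text can be written as a small
standalone function to check that it never leaves TX VLAN offload on with RX
stripping off. This is a sketch: the FLAG_* bit values are placeholders and
not the kernel's ETH_FLAG_* values, and fixup_vlan_flags() is a hypothetical
name.

```c
#include <stdint.h>

/* Placeholder bit values for illustration; not taken from kernel headers. */
#define FLAG_TXVLAN	(1u << 0)
#define FLAG_RXVLAN	(1u << 1)

/*
 * requested: flags the user asked for
 * tx_was_on: non-zero if TX VLAN offload is currently enabled
 *
 * Mirrors the quoted proposal: turning RX off while TX was on also turns
 * TX off; otherwise asking for TX also turns RX on.
 */
static uint32_t fixup_vlan_flags(uint32_t requested, int tx_was_on)
{
	if (!(requested & FLAG_RXVLAN) && tx_was_on)
		requested &= ~FLAG_TXVLAN;
	else if (requested & FLAG_TXVLAN)
		requested |= FLAG_RXVLAN;
	return requested;
}
```

Enumerating all four requested combinations for both prior TX states shows
that the result is always a state the hardware supports.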
^ permalink raw reply
* Re: TSO/GRO/LRO/somethingO breaks LVS on 2.6.36
From: Simon Horman @ 2011-01-27 1:36 UTC (permalink / raw)
To: Simon Kirby; +Cc: Eric Dumazet, netdev
In-Reply-To: <20110127004805.GB11578@hostway.ca>
On Wed, Jan 26, 2011 at 04:48:05PM -0800, Simon Kirby wrote:
> On Thu, Jan 13, 2011 at 03:34:22PM +0900, Simon Horman wrote:
>
> > Hi Simon,
> >
> > thanks for prodding me to respond to this post offline and sorry for not
> > responding earlier.
> >
> > Firstly, I think that this is a receive-side problem so I don't believe
> > that GSO (generic segmentation offload) or other transmit-side options are
> > likely to have any effect.
> >
> > My understanding is that on the receive-side there are two options which
> > when enabled can result in the behaviour that you describe.
> >
> > * LRO (large receive offload)
> >
> > You have this disabled, and assuming it really is disabled it
> > shouldn't be causing a problem.
> >
> > * GRO (generic receive offload)
> >
> > This does not seem to be in the output of your ethtool commands at all.
> > So I wonder if your ethtool is too old to support this option?
>
> So, this was the case. Our ethtool (lenny) was too old to see the GRO
> option, only GSO. Disabling GRO on eth1.39 has no effect, but disabling
> it on eth1 caused it to stop receiving the merged frames, fixing the LVS
> packet loss (due to no sending GSO support from LVS/IPVS).
>
> Speaking of this, did your patch for LVS/IPVS GSO support go anywhere?
The patch for IPVS GRO support has been merged and should appear in 2.6.39.
This is somewhat later than I previously anticipated due to a merge mix-up
on my part.
> > In any case, I was able to reproduce the problem that you describe (or at
> > least something very similar) using 2.6.36 with GRO enabled on eth1.1 and
> > the problem did not manifest when I disabled GRO on eth1.1.
>
> It worked for you to do ethtool -K eth1.1 gro off, then? For me on
> 2.6.37, it seemed to be that "ethtool -K eth1 gro off" was needed, even
> though packets arrive on eth1.39.
I will recheck my results, but in general I think it is a bit
of an open question as to how ethtool settings should be propagated
between related devices.
> Also, strangely, 2.6.35.4's default state (with no received merged frames)
> has GRO on for eth1 but off for eth1.39:
>
> # ethtool -k eth1
> Offload parameters for eth1:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp-segmentation-offload: on
> udp-fragmentation-offload: off
> generic-segmentation-offload: on
> generic-receive-offload: on
> large-receive-offload: off
> ntuple-filters: off
> receive-hashing: off
>
> # ethtool -k eth1.39
> Offload parameters for eth1.39:
> rx-checksumming: on
> tx-checksumming: off
> scatter-gather: off
> tcp-segmentation-offload: off
> udp-fragmentation-offload: off
> generic-segmentation-offload: off
> generic-receive-offload: off
> large-receive-offload: off
> ntuple-filters: off
> receive-hashing: off
>
> If I set 2.6.37 to have all of the same options, I still see GRO frames
> on 2.6.37 (tg3), which is weird.
Yes, that is weird.
There has been quite a lot of work on VLANs recently and
I suspect that the behaviour that you are observing with 2.6.37
is a regression that occurred during that work. It would
be good to fix things to restore the 2.6.35 behaviour.
^ permalink raw reply
* Re: Simultaneous error on two different machines
From: Michael Chan @ 2011-01-27 0:52 UTC (permalink / raw)
To: J.H.; +Cc: netdev@vger.kernel.org
In-Reply-To: <4D40BE2A.2010903@kernel.org>
On Wed, 2011-01-26 at 16:36 -0800, J.H. wrote:
> Afternoon,
>
> Happened to trip over this yesterday on two machines (one an HP DL380 G6
> and one an HP DL 380 G7). Error was within minutes of each other,
> identical on the two boxes. Driver seemed to reset (as indicated
> below), however loads on the two boxes almost immediately skyrocketed,
> eventually leading to a more serious deadlock panic on one of the boxes,
> and me just rebooting the other.
>
> The chipsets involved seem to be:
>
> Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit
> Ethernet (rev 20)
>
> Kernel is from Fedora 13, 2.6.34.7-61.korg.fc13.x86_64. The only
> additional patches applied involve fixing an XFS bug.
>
Please check if you have this patch in your kernel:
commit c441b8d2cb2194b05550a558d6d95d8944e56a84
bnx2: Fix lost MSI-X problem on 5709 NICs.
^ permalink raw reply
* Re: TSO/GRO/LRO/somethingO breaks LVS on 2.6.36
From: Simon Kirby @ 2011-01-27 0:48 UTC (permalink / raw)
To: Simon Horman; +Cc: Eric Dumazet, netdev
In-Reply-To: <20110113063422.GB14643@verge.net.au>
On Thu, Jan 13, 2011 at 03:34:22PM +0900, Simon Horman wrote:
> Hi Simon,
>
> thanks for prodding me to respond to this post offline and sorry for not
> responding earlier.
>
> Firstly, I think that this is a receive-side problem so I don't believe
> that GSO (generic segmentation offload) or other transmit-side options are
> likely to have any effect.
>
> My understanding is that on the receive-side there are two options which
> when enabled can result in the behaviour that you describe.
>
> * LRO (large receive offload)
>
> You have this disabled, and assuming it really is disabled it
> shouldn't be causing a problem.
>
> * GRO (generic receive offload)
>
> This does not seem to be in the output of your ethtool commands at all.
> So I wonder if your ethtool is too old to support this option?
So, this was the case. Our ethtool (lenny) was too old to see the GRO
option, only GSO. Disabling GRO on eth1.39 has no effect, but disabling
it on eth1 caused it to stop receiving the merged frames, fixing the LVS
packet loss (due to no sending GSO support from LVS/IPVS).
Speaking of this, did your patch for LVS/IPVS GSO support go anywhere?
> In any case, I was able to reproduce the problem that you describe (or at
> least something very similar) using 2.6.36 with GRO enabled on eth1.1 and
> the problem did not manifest when I disabled GRO on eth1.1.
It worked for you to do ethtool -K eth1.1 gro off, then? For me on
2.6.37, it seemed to be that "ethtool -K eth1 gro off" was needed, even
though packets arrive on eth1.39.
Also, strangely, 2.6.35.4's default state (with no received merged frames)
has GRO on for eth1 but off for eth1.39:
# ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
ntuple-filters: off
receive-hashing: off
# ethtool -k eth1.39
Offload parameters for eth1.39:
rx-checksumming: on
tx-checksumming: off
scatter-gather: off
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off
ntuple-filters: off
receive-hashing: off
If I set 2.6.37 to have all of the same options, I still see GRO frames
on 2.6.37 (tg3), which is weird.
Cheers,
Simon-
^ permalink raw reply
* Simultaneous error on two different machines
From: J.H. @ 2011-01-27 0:36 UTC (permalink / raw)
To: Michael Chan, netdev
Afternoon,
Happened to trip over this yesterday on two machines (one an HP DL380 G6
and one an HP DL 380 G7). Error was within minutes of each other,
identical on the two boxes. Driver seemed to reset (as indicated
below), however loads on the two boxes almost immediately skyrocketed,
eventually leading to a more serious deadlock panic on one of the boxes,
and me just rebooting the other.
The chipsets involved seem to be:
Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit
Ethernet (rev 20)
Kernel is from Fedora 13, 2.6.34.7-61.korg.fc13.x86_64. The only
additional patches applied involve fixing an XFS bug.
Here follows what I was able to snag from the system:
------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:256 dev_watchdog+0xf5/0x197()
Hardware name: ProLiant DL380 G6
NETDEV WATCHDOG: eth0 (bnx2): transmit queue 2 timed out
Modules linked in: ipmi_devintf coretemp ipv6 xt_multiport
iptable_mangle xfs exportfs uinput hpwdt ipmi_si ipmi_msghandler bnx2
iTCO_wdt iTCO_vendor_support serio_raw microcode power_meter raid0 cciss
hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded:
speedstep_lib]
Pid: 0, comm: swapper Tainted: G M 2.6.34.7-61.korg.fc13.x86_64 #1
Call Trace:
<IRQ> [<ffffffff8104d14f>] warn_slowpath_common+0x7c/0x94
[<ffffffff8104d1be>] warn_slowpath_fmt+0x41/0x43
[<ffffffff813b530f>] ? netif_tx_lock+0x44/0x6d
[<ffffffff813b542d>] dev_watchdog+0xf5/0x197
[<ffffffff81010261>] ? sched_clock+0x9/0xd
[<ffffffff8106b2f7>] ? sched_clock_cpu+0x44/0xce
[<ffffffff810594d6>] run_timer_softirq+0x1bf/0x263
[<ffffffff8106e3f7>] ? ktime_get+0x65/0xbe
[<ffffffff81053285>] __do_softirq+0xe5/0x1a6
[<ffffffff810726d0>] ? tick_program_event+0x2a/0x2c
[<ffffffff8100ab5c>] call_softirq+0x1c/0x30
[<ffffffff8100c342>] do_softirq+0x46/0x83
[<ffffffff810530f6>] irq_exit+0x3b/0x7d
[<ffffffff8144dc30>] smp_apic_timer_interrupt+0x8d/0x9b
[<ffffffff8100a613>] apic_timer_interrupt+0x13/0x20
<EOI> [<ffffffff8127d4c7>] ? acpi_idle_enter_bm+0x288/0x2bc
[<ffffffff8127d4c0>] ? acpi_idle_enter_bm+0x281/0x2bc
[<ffffffff8137596c>] ? menu_select+0x141/0x1f8
[<ffffffff81374b74>] cpuidle_idle_call+0x99/0xf1
[<ffffffff81008c22>] cpu_idle+0xaa/0xe4
[<ffffffff81440b89>] start_secondary+0x253/0x294
---[ end trace d6fc5aa3e2b641f7 ]---
bnx2 0000:02:00.0: eth0: DEBUG: intr_sem[0]
bnx2 0000:02:00.0: eth0: DEBUG: EMAC_TX_STATUS[00000008]
RPM_MGMT_PKT_CTRL[40000088]
bnx2 0000:02:00.0: eth0: DEBUG: MCP_STATE_P0[0003610e]
MCP_STATE_P1[0003600e]
bnx2 0000:02:00.0: eth0: DEBUG: HC_STATS_INTERRUPT_STATUS[01fb0004]
bnx2 0000:02:00.0: eth0: DEBUG: PBA[00000000]
bnx2 0000:02:00.0: eth0: NIC Copper Link is Down
bnx2 0000:02:00.0: eth0: NIC Copper Link is Up, 1000 Mbps full duplex
- John 'Warthog9' Hawley
^ permalink raw reply
* Re: [RFC PATCH] net: Implement read-only protection and COW'ing of metrics.
From: David Miller @ 2011-01-26 23:31 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <20110126.152538.260074157.davem@davemloft.net>
From: David Miller <davem@davemloft.net>
Date: Wed, 26 Jan 2011 15:25:38 -0800 (PST)
> Finally, once this change is stabilized we can be a lot smarter about
> what we do at the time an entry is created. For example, when a route
> is looked up for a TCP socket, we essentially know we are going to COW
> the route 99.99999% of the time. So we can pass a hint into TCP's
> route lookups in the flow flags field telling it to pre-COW the route.
>
> TCP pre-COW'ing of metrics will thus save several atomics.
I forgot to mention one other idea I had.
To get rid of the atomics in the non-TCP cases, we note that pretty much
all routes installed have no special metrics attached, the fib_info
metrics are equal to dst_default_metrics.
This means if we check for this case, we can point the dst->_metrics
at dst_default_metrics and then we don't need to do any atomics at
all. Just one straight assignment at creation and then absolutely no
work at all during destroy.
We could even consider allocating fib_info->metrics externally, and point
it directly at dst_default_metrics when possible. This is going to save
an enormous amount of memory as well as get rid of the atomics.
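To make the idea concrete, here is a rough userspace sketch, not the kernel code. All names (NMETRICS, default_metrics, metrics_for_route, metrics_release) are hypothetical stand-ins. The non-default branch simply makes a private copy, whereas the real patch would reference the fib_info metrics instead. The point is only that the common all-default case needs no allocation at creation and no work at destroy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NMETRICS 15  /* hypothetical stand-in for RTAX_MAX; value is illustrative */

/* One shared, all-zero table standing in for dst_default_metrics. */
static const uint32_t default_metrics[NMETRICS] = { 0 };

/* True when a metrics array carries nothing beyond the defaults. */
static bool metrics_are_default(const uint32_t *m)
{
	return memcmp(m, default_metrics, sizeof(default_metrics)) == 0;
}

/* Pick the table a new route entry should reference.
 * Returns the shared default when the FIB metrics carry nothing special
 * (one plain assignment, no allocation), otherwise a private heap copy.
 * Returns NULL only if that copy cannot be allocated. */
static const uint32_t *metrics_for_route(const uint32_t *fib_metrics)
{
	uint32_t *copy;

	assert(fib_metrics != NULL);
	if (metrics_are_default(fib_metrics))
		return default_metrics;

	copy = malloc(sizeof(default_metrics));
	if (copy)
		memcpy(copy, fib_metrics, sizeof(default_metrics));
	return copy;
}

/* Release whatever metrics_for_route() returned.
 * The shared default table needs no work at all. */
static void metrics_release(const uint32_t *m)
{
	if (m != default_metrics)
		free((void *)m);
}
```

A caller would pair each metrics_for_route() with one metrics_release(); for routes without special metrics both calls reduce to a pointer compare.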
So Eric, I really hope I can sell you on this :-)
^ permalink raw reply
* Re: [RFC PATCH] net: Implement read-only protection and COW'ing of metrics.
From: David Miller @ 2011-01-26 23:25 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <20101216.115900.183061857.davem@davemloft.net>
Eric, thanks again for your feedback. I've taken a stab at fixing the
various races, in particular the one you discovered about metrics
sharing and how this interacts with fib_info releases.
What I've chosen to do is two-fold:
1) Update ->_metrics atomically with cmpxchg once a route becomes publicly
visible.
2) Remember and grab a reference to the fib_info for shared read-only
metrics in rt->fi, then release it once the metrics reference goes
away.
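For readers who want the shape of point 1 without the kernel context, a minimal userspace sketch follows. It is an illustration under assumptions, not the patch itself: struct entry, NMETRICS, entry_init_shared and entry_write_ptr are hypothetical names, and C11 atomic_compare_exchange_strong() stands in for the kernel's cmpxchg(). The low bit of the pointer word marks a shared read-only array, and the first writer swaps in a private copy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NMETRICS 15                /* hypothetical stand-in for RTAX_MAX */
#define METRICS_READ_ONLY 0x1UL    /* low pointer bit: array is shared, read-only */

struct entry {
	_Atomic uintptr_t metrics; /* pointer value with the flag kept in bit 0 */
};

/* Strip the flag bit and recover the array pointer. */
static uint32_t *metrics_ptr(uintptr_t v)
{
	return (uint32_t *)(v & ~(uintptr_t)METRICS_READ_ONLY);
}

/* Point the entry at a shared array.
 * Only valid before the entry becomes visible to other threads. */
static void entry_init_shared(struct entry *e, const uint32_t *shared)
{
	/* The flag lives in bit 0, so the array must be at least 2-byte aligned. */
	assert(((uintptr_t)shared & METRICS_READ_ONLY) == 0);
	atomic_init(&e->metrics, (uintptr_t)shared | METRICS_READ_ONLY);
}

/* Return a writable array, copying the shared one on first write.
 * Returns NULL if allocation fails or if a racing update left the
 * entry read-only again. */
static uint32_t *entry_write_ptr(struct entry *e)
{
	uintptr_t old = atomic_load(&e->metrics);
	uint32_t *copy;

	if (!(old & METRICS_READ_ONLY))
		return metrics_ptr(old);

	copy = malloc(sizeof(uint32_t) * NMETRICS);
	if (!copy)
		return NULL;
	memcpy(copy, metrics_ptr(old), sizeof(uint32_t) * NMETRICS);

	/* Publish the private copy only if nobody changed the word meanwhile. */
	if (atomic_compare_exchange_strong(&e->metrics, &old, (uintptr_t)copy))
		return copy;

	/* Lost the race: 'old' now holds the value the winner installed. */
	free(copy);
	return (old & METRICS_READ_ONLY) ? NULL : metrics_ptr(old);
}
```

Readers never take a lock here: they load the word, mask off the flag, and index the array, which is why the update has to be a single compare-and-swap.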
It sounds expensive but hear me out :-)
First of all, at rt_set_nexthop() time, the atomic we use to grab a
ref to the fib_info is replacing a 60-byte memcpy() into the dst
metrics.
Next, the ->_metrics atomic to un-COW the metrics at destroy time
might in fact be overkill. Especially once writable metrics live in
the inetpeer cache (that's the next set of patches after this one).
Finally, once this change is stabilized we can be a lot smarter about
what we do at the time an entry is created. For example, when a route
is looked up for a TCP socket, we essentially know we are going to COW
the route 99.99999% of the time. So we can pass a hint into TCP's
route lookups in the flow flags field telling it to pre-COW the route.
TCP pre-COW'ing of metrics will thus save several atomics.
Anyways, here is the patch, it is only build tested at this point, but
I wanted to get feedback from you about the basic gist of things
as soon as possible.
Thanks!
diff --git a/include/net/dst.h b/include/net/dst.h
index be5a0d4..94a8c23 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -40,24 +40,10 @@ struct dst_entry {
struct rcu_head rcu_head;
struct dst_entry *child;
struct net_device *dev;
- short error;
- short obsolete;
- int flags;
-#define DST_HOST 0x0001
-#define DST_NOXFRM 0x0002
-#define DST_NOPOLICY 0x0004
-#define DST_NOHASH 0x0008
-#define DST_NOCACHE 0x0010
+ struct dst_ops *ops;
+ unsigned long _metrics;
unsigned long expires;
-
- unsigned short header_len; /* more space at head required */
- unsigned short trailer_len; /* space to reserve at tail */
-
- unsigned int rate_tokens;
- unsigned long rate_last; /* rate limiting for ICMP */
-
struct dst_entry *path;
-
struct neighbour *neighbour;
struct hh_cache *hh;
#ifdef CONFIG_XFRM
@@ -68,17 +54,16 @@ struct dst_entry {
int (*input)(struct sk_buff*);
int (*output)(struct sk_buff*);
- struct dst_ops *ops;
-
- u32 _metrics[RTAX_MAX];
-
+ short error;
+ short obsolete;
+ unsigned short header_len; /* more space at head required */
+ unsigned short trailer_len; /* space to reserve at tail */
#ifdef CONFIG_IP_ROUTE_CLASSID
__u32 tclassid;
#else
__u32 __pad2;
#endif
-
/*
* Align __refcnt to a 64 bytes alignment
* (L1_CACHE_SIZE would be too much)
@@ -93,6 +78,14 @@ struct dst_entry {
atomic_t __refcnt; /* client references */
int __use;
unsigned long lastuse;
+ unsigned long rate_last; /* rate limiting for ICMP */
+ unsigned int rate_tokens;
+ int flags;
+#define DST_HOST 0x0001
+#define DST_NOXFRM 0x0002
+#define DST_NOPOLICY 0x0004
+#define DST_NOHASH 0x0008
+#define DST_NOCACHE 0x0010
union {
struct dst_entry *next;
struct rtable __rcu *rt_next;
@@ -103,10 +96,69 @@ struct dst_entry {
#ifdef __KERNEL__
+extern u32 *dst_cow_metrics_generic(struct dst_entry *dst, unsigned long old);
+
+#define DST_METRICS_READ_ONLY 0x1UL
+#define __DST_METRICS_PTR(Y) \
+ ((u32 *)((Y) & ~DST_METRICS_READ_ONLY))
+#define DST_METRICS_PTR(X) __DST_METRICS_PTR((X)->_metrics)
+
+static inline bool dst_metrics_read_only(const struct dst_entry *dst)
+{
+ return dst->_metrics & DST_METRICS_READ_ONLY;
+}
+
+extern void __dst_destroy_metrics_generic(struct dst_entry *dst, unsigned long old);
+
+static inline void dst_destroy_metrics_generic(struct dst_entry *dst)
+{
+ unsigned long val = dst->_metrics;
+ if (!(val & DST_METRICS_READ_ONLY))
+ __dst_destroy_metrics_generic(dst, val);
+}
+
+static inline u32 *dst_metrics_write_ptr(struct dst_entry *dst)
+{
+ unsigned long p = dst->_metrics;
+
+ if (p & DST_METRICS_READ_ONLY)
+ return dst->ops->cow_metrics(dst, p);
+ return __DST_METRICS_PTR(p);
+}
+
+/* This may only be invoked before the entry has reached global
+ * visibility.
+ */
+static inline void dst_init_metrics(struct dst_entry *dst,
+ const u32 *src_metrics,
+ bool read_only)
+{
+ dst->_metrics = ((unsigned long) src_metrics) |
+ (read_only ? DST_METRICS_READ_ONLY : 0);
+}
+
+static inline void dst_copy_metrics(struct dst_entry *dest, const struct dst_entry *src)
+{
+ u32 *dst_metrics = dst_metrics_write_ptr(dest);
+
+ if (dst_metrics) {
+ u32 *src_metrics = DST_METRICS_PTR(src);
+
+ memcpy(dst_metrics, src_metrics, RTAX_MAX * sizeof(u32));
+ }
+}
+
+static inline u32 *dst_metrics_ptr(struct dst_entry *dst)
+{
+ return DST_METRICS_PTR(dst);
+}
+
static inline u32
dst_metric_raw(const struct dst_entry *dst, const int metric)
{
- return dst->_metrics[metric-1];
+ u32 *p = DST_METRICS_PTR(dst);
+
+ return p[metric-1];
}
static inline u32
@@ -131,22 +183,10 @@ dst_metric_advmss(const struct dst_entry *dst)
static inline void dst_metric_set(struct dst_entry *dst, int metric, u32 val)
{
- dst->_metrics[metric-1] = val;
-}
-
-static inline void dst_import_metrics(struct dst_entry *dst, const u32 *src_metrics)
-{
- memcpy(dst->_metrics, src_metrics, RTAX_MAX * sizeof(u32));
-}
+ u32 *p = dst_metrics_write_ptr(dst);
-static inline void dst_copy_metrics(struct dst_entry *dest, const struct dst_entry *src)
-{
- dst_import_metrics(dest, src->_metrics);
-}
-
-static inline u32 *dst_metrics_ptr(struct dst_entry *dst)
-{
- return dst->_metrics;
+ if (p)
+ p[metric-1] = val;
}
static inline u32
diff --git a/include/net/dst_ops.h b/include/net/dst_ops.h
index 21a320b..dc07463 100644
--- a/include/net/dst_ops.h
+++ b/include/net/dst_ops.h
@@ -18,6 +18,7 @@ struct dst_ops {
struct dst_entry * (*check)(struct dst_entry *, __u32 cookie);
unsigned int (*default_advmss)(const struct dst_entry *);
unsigned int (*default_mtu)(const struct dst_entry *);
+ u32 * (*cow_metrics)(struct dst_entry *, unsigned long);
void (*destroy)(struct dst_entry *);
void (*ifdown)(struct dst_entry *,
struct net_device *dev, int how);
diff --git a/include/net/route.h b/include/net/route.h
index 93e10c4..5677cbf 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -49,6 +49,7 @@
struct fib_nh;
struct inet_peer;
+struct fib_info;
struct rtable {
struct dst_entry dst;
@@ -69,6 +70,7 @@ struct rtable {
/* Miscellaneous cached information */
__be32 rt_spec_dst; /* RFC1122 specific destination */
struct inet_peer *peer; /* long-living peer info */
+ struct fib_info *fi; /* for client ref to shared metrics */
};
static inline bool rt_is_input_route(struct rtable *rt)
diff --git a/net/core/dst.c b/net/core/dst.c
index b99c7c7..5788935 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -164,6 +164,8 @@ int dst_discard(struct sk_buff *skb)
}
EXPORT_SYMBOL(dst_discard);
+static const u32 dst_default_metrics[RTAX_MAX];
+
void *dst_alloc(struct dst_ops *ops)
{
struct dst_entry *dst;
@@ -180,6 +182,7 @@ void *dst_alloc(struct dst_ops *ops)
dst->lastuse = jiffies;
dst->path = dst;
dst->input = dst->output = dst_discard;
+ dst_init_metrics(dst, dst_default_metrics, true);
#if RT_CACHE_DEBUG >= 2
atomic_inc(&dst_total);
#endif
@@ -282,6 +285,42 @@ void dst_release(struct dst_entry *dst)
}
EXPORT_SYMBOL(dst_release);
+u32 *dst_cow_metrics_generic(struct dst_entry *dst, unsigned long old)
+{
+ u32 *p = kmalloc(sizeof(u32) * RTAX_MAX, GFP_ATOMIC);
+
+ if (p) {
+ u32 *old_p = __DST_METRICS_PTR(old);
+ unsigned long prev, new;
+
+ memcpy(p, old_p, sizeof(u32) * RTAX_MAX);
+
+ new = (unsigned long) p;
+ prev = cmpxchg(&dst->_metrics, old, new);
+
+ if (prev != old) {
+ kfree(p);
+ p = __DST_METRICS_PTR(prev);
+ if (prev & DST_METRICS_READ_ONLY)
+ p = NULL;
+ }
+ }
+ return p;
+}
+EXPORT_SYMBOL(dst_cow_metrics_generic);
+
+/* Caller asserts that dst_metrics_read_only(dst) is false. */
+void __dst_destroy_metrics_generic(struct dst_entry *dst, unsigned long old)
+{
+ unsigned long prev, new;
+
+ new = (unsigned long) dst_default_metrics;
+ prev = cmpxchg(&dst->_metrics, old, new);
+ if (prev == old)
+ kfree(__DST_METRICS_PTR(old));
+}
+EXPORT_SYMBOL(__dst_destroy_metrics_generic);
+
/**
* skb_dst_set_noref - sets skb dst, without a reference
* @skb: buffer
diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c
index 5e63636..42c9c62 100644
--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -112,6 +112,7 @@ static int dn_dst_gc(struct dst_ops *ops);
static struct dst_entry *dn_dst_check(struct dst_entry *, __u32);
static unsigned int dn_dst_default_advmss(const struct dst_entry *dst);
static unsigned int dn_dst_default_mtu(const struct dst_entry *dst);
+static void dn_dst_destroy(struct dst_entry *);
static struct dst_entry *dn_dst_negative_advice(struct dst_entry *);
static void dn_dst_link_failure(struct sk_buff *);
static void dn_dst_update_pmtu(struct dst_entry *dst, u32 mtu);
@@ -133,11 +134,18 @@ static struct dst_ops dn_dst_ops = {
.check = dn_dst_check,
.default_advmss = dn_dst_default_advmss,
.default_mtu = dn_dst_default_mtu,
+ .cow_metrics = dst_cow_metrics_generic,
+ .destroy = dn_dst_destroy,
.negative_advice = dn_dst_negative_advice,
.link_failure = dn_dst_link_failure,
.update_pmtu = dn_dst_update_pmtu,
};
+static void dn_dst_destroy(struct dst_entry *dst)
+{
+ dst_destroy_metrics_generic(dst);
+}
+
static __inline__ unsigned dn_hash(__le16 src, __le16 dst)
{
__u16 tmp = (__u16 __force)(src ^ dst);
@@ -814,14 +822,14 @@ static int dn_rt_set_next_hop(struct dn_route *rt, struct dn_fib_res *res)
{
struct dn_fib_info *fi = res->fi;
struct net_device *dev = rt->dst.dev;
+ unsigned int mss_metric;
struct neighbour *n;
- unsigned int metric;
if (fi) {
if (DN_FIB_RES_GW(*res) &&
DN_FIB_RES_NH(*res).nh_scope == RT_SCOPE_LINK)
rt->rt_gateway = DN_FIB_RES_GW(*res);
- dst_import_metrics(&rt->dst, fi->fib_metrics);
+ dst_init_metrics(&rt->dst, fi->fib_metrics, true);
}
rt->rt_type = res->type;
@@ -834,10 +842,10 @@ static int dn_rt_set_next_hop(struct dn_route *rt, struct dn_fib_res *res)
if (dst_metric(&rt->dst, RTAX_MTU) > rt->dst.dev->mtu)
dst_metric_set(&rt->dst, RTAX_MTU, rt->dst.dev->mtu);
- metric = dst_metric_raw(&rt->dst, RTAX_ADVMSS);
- if (metric) {
+ mss_metric = dst_metric_raw(&rt->dst, RTAX_ADVMSS);
+ if (mss_metric) {
unsigned int mss = dn_mss_from_pmtu(dev, dst_mtu(&rt->dst));
- if (metric > mss)
+ if (mss_metric > mss)
dst_metric_set(&rt->dst, RTAX_ADVMSS, mss);
}
return 0;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 3e5b7cc..7fc6301 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -152,6 +152,36 @@ static void ipv4_dst_ifdown(struct dst_entry *dst, struct net_device *dev,
{
}
+static u32 *ipv4_cow_metrics(struct dst_entry *dst, unsigned long old)
+{
+ u32 *p = kmalloc(sizeof(u32) * RTAX_MAX, GFP_ATOMIC);
+
+ if (p) {
+ u32 *old_p = __DST_METRICS_PTR(old);
+ unsigned long prev, new;
+
+ memcpy(p, old_p, sizeof(u32) * RTAX_MAX);
+
+ new = (unsigned long) p;
+ prev = cmpxchg(&dst->_metrics, old, new);
+
+ if (prev != old) {
+ kfree(p);
+ p = __DST_METRICS_PTR(prev);
+ if (prev & DST_METRICS_READ_ONLY)
+ p = NULL;
+ } else {
+ struct rtable *rt = (struct rtable *) dst;
+
+ if (rt->fi) {
+ fib_info_put(rt->fi);
+ rt->fi = NULL;
+ }
+ }
+ }
+ return p;
+}
+
static struct dst_ops ipv4_dst_ops = {
.family = AF_INET,
.protocol = cpu_to_be16(ETH_P_IP),
@@ -159,6 +189,7 @@ static struct dst_ops ipv4_dst_ops = {
.check = ipv4_dst_check,
.default_advmss = ipv4_default_advmss,
.default_mtu = ipv4_default_mtu,
+ .cow_metrics = ipv4_cow_metrics,
.destroy = ipv4_dst_destroy,
.ifdown = ipv4_dst_ifdown,
.negative_advice = ipv4_negative_advice,
@@ -1720,6 +1751,11 @@ static void ipv4_dst_destroy(struct dst_entry *dst)
struct rtable *rt = (struct rtable *) dst;
struct inet_peer *peer = rt->peer;
+ dst_destroy_metrics_generic(dst);
+ if (rt->fi) {
+ fib_info_put(rt->fi);
+ rt->fi = NULL;
+ }
if (peer) {
rt->peer = NULL;
inet_putpeer(peer);
@@ -1824,7 +1860,9 @@ static void rt_set_nexthop(struct rtable *rt, struct fib_result *res, u32 itag)
if (FIB_RES_GW(*res) &&
FIB_RES_NH(*res).nh_scope == RT_SCOPE_LINK)
rt->rt_gateway = FIB_RES_GW(*res);
- dst_import_metrics(dst, fi->fib_metrics);
+ rt->fi = fi;
+ atomic_inc(&fi->fib_clntref);
+ dst_init_metrics(dst, fi->fib_metrics, true);
#ifdef CONFIG_IP_ROUTE_CLASSID
dst->tclassid = FIB_RES_NH(*res).nh_tclassid;
#endif
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index b057d40..19fbdec 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -196,8 +196,11 @@ static void xfrm4_dst_destroy(struct dst_entry *dst)
{
struct xfrm_dst *xdst = (struct xfrm_dst *)dst;
+ dst_destroy_metrics_generic(dst);
+
if (likely(xdst->u.rt.peer))
inet_putpeer(xdst->u.rt.peer);
+
xfrm_dst_destroy(xdst);
}
@@ -215,6 +218,7 @@ static struct dst_ops xfrm4_dst_ops = {
.protocol = cpu_to_be16(ETH_P_IP),
.gc = xfrm4_garbage_collect,
.update_pmtu = xfrm4_update_pmtu,
+ .cow_metrics = dst_cow_metrics_generic,
.destroy = xfrm4_dst_destroy,
.ifdown = xfrm4_dst_ifdown,
.local_out = __ip_local_out,
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 1534508..45fafa0 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -105,6 +105,7 @@ static struct dst_ops ip6_dst_ops_template = {
.check = ip6_dst_check,
.default_advmss = ip6_default_advmss,
.default_mtu = ip6_default_mtu,
+ .cow_metrics = dst_cow_metrics_generic,
.destroy = ip6_dst_destroy,
.ifdown = ip6_dst_ifdown,
.negative_advice = ip6_negative_advice,
@@ -125,6 +126,10 @@ static struct dst_ops ip6_dst_blackhole_ops = {
.update_pmtu = ip6_rt_blackhole_update_pmtu,
};
+static const u32 ip6_template_metrics[RTAX_MAX] = {
+ [RTAX_HOPLIMIT - 1] = 255,
+};
+
static struct rt6_info ip6_null_entry_template = {
.dst = {
.__refcnt = ATOMIC_INIT(1),
@@ -193,6 +198,7 @@ static void ip6_dst_destroy(struct dst_entry *dst)
rt->rt6i_idev = NULL;
in6_dev_put(idev);
}
+ dst_destroy_metrics_generic(dst);
if (peer) {
BUG_ON(!(rt->rt6i_flags & RTF_CACHE));
rt->rt6i_peer = NULL;
@@ -2681,7 +2687,8 @@ static int __net_init ip6_route_net_init(struct net *net)
net->ipv6.ip6_null_entry->dst.path =
(struct dst_entry *)net->ipv6.ip6_null_entry;
net->ipv6.ip6_null_entry->dst.ops = &net->ipv6.ip6_dst_ops;
- dst_metric_set(&net->ipv6.ip6_null_entry->dst, RTAX_HOPLIMIT, 255);
+ dst_init_metrics(&net->ipv6.ip6_null_entry->dst,
+ ip6_template_metrics, true);
#ifdef CONFIG_IPV6_MULTIPLE_TABLES
net->ipv6.ip6_prohibit_entry = kmemdup(&ip6_prohibit_entry_template,
@@ -2692,7 +2699,8 @@ static int __net_init ip6_route_net_init(struct net *net)
net->ipv6.ip6_prohibit_entry->dst.path =
(struct dst_entry *)net->ipv6.ip6_prohibit_entry;
net->ipv6.ip6_prohibit_entry->dst.ops = &net->ipv6.ip6_dst_ops;
- dst_metric_set(&net->ipv6.ip6_prohibit_entry->dst, RTAX_HOPLIMIT, 255);
+ dst_init_metrics(&net->ipv6.ip6_prohibit_entry->dst,
+ ip6_template_metrics, true);
net->ipv6.ip6_blk_hole_entry = kmemdup(&ip6_blk_hole_entry_template,
sizeof(*net->ipv6.ip6_blk_hole_entry),
@@ -2702,7 +2710,8 @@ static int __net_init ip6_route_net_init(struct net *net)
net->ipv6.ip6_blk_hole_entry->dst.path =
(struct dst_entry *)net->ipv6.ip6_blk_hole_entry;
net->ipv6.ip6_blk_hole_entry->dst.ops = &net->ipv6.ip6_dst_ops;
- dst_metric_set(&net->ipv6.ip6_blk_hole_entry->dst, RTAX_HOPLIMIT, 255);
+ dst_init_metrics(&net->ipv6.ip6_blk_hole_entry->dst,
+ ip6_template_metrics, true);
#endif
net->ipv6.sysctl.flush_delay = 0;
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index da87428..834dc02 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -220,6 +220,7 @@ static void xfrm6_dst_destroy(struct dst_entry *dst)
if (likely(xdst->u.rt6.rt6i_idev))
in6_dev_put(xdst->u.rt6.rt6i_idev);
+ dst_destroy_metrics_generic(dst);
if (likely(xdst->u.rt6.rt6i_peer))
inet_putpeer(xdst->u.rt6.rt6i_peer);
xfrm_dst_destroy(xdst);
@@ -257,6 +258,7 @@ static struct dst_ops xfrm6_dst_ops = {
.protocol = cpu_to_be16(ETH_P_IPV6),
.gc = xfrm6_garbage_collect,
.update_pmtu = xfrm6_update_pmtu,
+ .cow_metrics = dst_cow_metrics_generic,
.destroy = xfrm6_dst_destroy,
.ifdown = xfrm6_dst_ifdown,
.local_out = __ip6_local_out,
^ permalink raw reply related
* [PATCH v5] Gemini: Gigabit ethernet driver
From: Michał Mirosław @ 2011-01-26 23:24 UTC (permalink / raw)
To: Hans Ulli Kroll, gemini-board-dev; +Cc: netdev, Christoph Biedl
In-Reply-To: <20101230083905.5A8EB13909@rere.qmqm.pl>
Driver for SL351x (Gemini) SoC ethernet peripheral. Based in part
on work by Paulius Zaleckas and GPLd code from Raidsonic and other
NAS vendors.
Tested on Raidsonic IcyBox 4220-B (dual SATA NAS).
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
Against: v2.6.38-rc2+
This depends on following commits from
git://git.berlios.de/gemini-board
1. 732ae5221a89db85626c7c636bd2520fa98768d2
ARM: Gemini: add missing gmac.h for gigabit driver
2. feda237215382063b4174929ac18701976353418
ARM: Gemini: add gmac platform driver o IB4220B
Note for testers: you may tweak DEFAULT_RX_BUF_ORDER (=log2(buffer size),
at most a page - so in range [6..12]) and RX_MAX_ALLOC_ORDER (max allocated
rx buffer page order) to find best memory usage/performance settings apart
from what is available via ethtool. For TX, there's not much you can do when
offloads are disabled.
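As a rough aid for picking values, the arithmetic behind the two knobs can be sketched as below. This assumes 4 KiB pages and that one allocation of 2^RX_MAX_ALLOC_ORDER pages is split evenly into buffers; the helper names are hypothetical and the driver's actual carving may differ.

```c
#include <assert.h>

#define PAGE_SHIFT_ASSUMED 12   /* assumption: 4 KiB pages */

/* Size in bytes of one RX buffer for a given buffer order (log2 of size). */
static unsigned int rx_buf_size(unsigned int buf_order)
{
	assert(buf_order >= 6 && buf_order <= PAGE_SHIFT_ASSUMED);
	return 1u << buf_order;
}

/* Number of RX buffers obtained from one allocation of 2^alloc_order pages,
 * assuming the allocation is split evenly with no per-buffer overhead. */
static unsigned int rx_bufs_per_alloc(unsigned int buf_order,
				      unsigned int alloc_order)
{
	assert(buf_order >= 6 && buf_order <= PAGE_SHIFT_ASSUMED);
	return 1u << (PAGE_SHIFT_ASSUMED + alloc_order - buf_order);
}
```

With the defaults in this patch (DEFAULT_RX_BUF_ORDER 11, RX_MAX_ALLOC_ORDER 2) that works out to 2048-byte buffers and, under the assumptions above, 8 buffers per 16 KiB allocation.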
MAC address needs to be set by userspace after parsing VCTL flash partition
(mtd4 on my box).
Changes from v4:
- rebased on upcoming 2.6.38 (removal of page_to_dma() and per-txq stats)
- removed setting last_rx and trans_start as that's handled by net core
- changed __raw_read/writel() to read/writel()
- added setting of AHB_WEIGHT register (didn't improve anything, I'm afraid)
- fixed DMA unmapping bug
- added limit of packet size for TX offload (HW checks only 13 bits of mtu_size field)
- reduced RX_MAX_ALLOC_ORDER as it caused a lot of order 4 allocation failures
under load
- cleanups
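To spell out the 13-bit limit mentioned above: a field that wide can hold at most 8191, so anything larger has to skip the offload path. The sketch below only shows that arithmetic; the macro and function names are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define MTU_SIZE_BITS 13u                           /* field width stated in the changelog */
#define MTU_SIZE_MAX  ((1u << MTU_SIZE_BITS) - 1u)  /* largest value the field can hold */

static_assert(MTU_SIZE_MAX == 8191u, "a 13-bit field tops out at 8191");

/* True if the length still fits in the 13-bit field; larger packets
 * would have to be handled without the hardware offload. */
static bool fits_mtu_field(unsigned int len)
{
	return len <= MTU_SIZE_MAX;
}
```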
Changes from v3:
- fixed remaining tx_queue_len misuse bugs
- bulk RX DMA page map/unmap
- whitespace changes to make checkpatch happier (please ignore remaining
complaints - long lines in .c and typedefs/whitespace/long lines in .h)
Changes from v2:
- converted to page buffers and napi_gro_frags()
- later IRQ acking and NAPI exits
- larger rings by default
- tx-interrupt coalescing
- MTU changing
- jumbo frames support
- ringparam and coalesce settings via ethtool
- more fixes/cleanups
Changes from v1:
- fixed stats (now using u64_stats_sync; no-op on UP anyway)
- pre-load mdio-gpio if built as module
- disable TX checksum offload by default (unreliable HW)
- convert to NAPI+GRO (netperf TCP STREAM RX test:
before: 156mbit/s, now: 185mbit/s)
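For anyone unfamiliar with the u64_stats_sync idea, here is a simplified userspace model of the sequence-counter pattern it is built on. This is a sketch under assumptions, not the kernel implementation: it uses C11 atomics, assumes a single writer, and the names (struct stats, stats_add, stats_read) are hypothetical. The kernel version exists so that 32-bit SMP readers never see a torn 64-bit counter; on 64-bit or UP builds it compiles away, as noted above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct stats {
	atomic_uint seq;            /* odd while an update is in progress */
	_Atomic uint64_t rx_packets;
	_Atomic uint64_t rx_bytes;
};

/* Writer side: account one received packet. Assumes a single writer. */
static void stats_add(struct stats *s, uint64_t bytes)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
	uint64_t p = atomic_load_explicit(&s->rx_packets, memory_order_relaxed);
	uint64_t b = atomic_load_explicit(&s->rx_bytes, memory_order_relaxed);

	assert((seq & 1) == 0);     /* single writer: no update already in flight */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->rx_packets, p + 1, memory_order_relaxed);
	atomic_store_explicit(&s->rx_bytes, b + bytes, memory_order_relaxed);
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retry until both counters come from the same update. */
static void stats_read(struct stats *s, uint64_t *packets, uint64_t *bytes)
{
	unsigned int start, end;

	do {
		start = atomic_load_explicit(&s->seq, memory_order_acquire);
		*packets = atomic_load_explicit(&s->rx_packets, memory_order_relaxed);
		*bytes = atomic_load_explicit(&s->rx_bytes, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while ((start & 1) || start != end);
}
```

The reader never blocks the writer; it just re-reads when the sequence number was odd or changed underneath it.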
Later TODO:
- netpoll (netconsole)
- parse MAC address from flash settings and pass it through platform data
- move TX completion to NAPI poll
- maybe implement rx copybreak
diff --git a/MAINTAINERS b/MAINTAINERS
index cf0f3a5..bc5f4bf 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5670,6 +5670,15 @@ S: Maintained
F: drivers/net/skge.*
F: drivers/net/sky2.*
+SL351X (STORLINK GEMINI SOC) GIGABIT ETHERNET DRIVER
+M: Michał Mirosław <mirq-linux@rere.qmqm.pl>
+L: netdev@vger.kernel.org
+L: gemini-board-dev@lists.berlios.de
+T: git git://git.berlios.de/gemini-board
+S: Maintained
+F: drivers/net/sl351x.c
+F: drivers/net/sl351x_hw.h
+
SLAB ALLOCATOR
M: Christoph Lameter <cl@linux-foundation.org>
M: Pekka Enberg <penberg@kernel.org>
diff --git a/drivers/net/sl351x.c b/drivers/net/sl351x.c
new file mode 100644
index 0000000..60ad482
--- /dev/null
+++ b/drivers/net/sl351x.c
@@ -0,0 +1,2363 @@
+/*
+ * Ethernet device driver for Gemini SoC (SL351x GMAC).
+ *
+ * Copyright (C) 2010, Michał Mirosław <mirq-linux@rere.qmqm.pl>
+ *
+ * Based on work by Paulius Zaleckas <paulius.zaleckas@gmail.com> and
+ * GPLd spaghetti code from Raidsonic and other Gemini-based NAS vendors.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+
+#include <linux/spinlock.h>
+#include <linux/slab.h>
+#include <linux/dma-mapping.h>
+#include <linux/cache.h>
+#include <linux/interrupt.h>
+
+#include <linux/platform_device.h>
+#include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
+#include <linux/skbuff.h>
+#include <linux/phy.h>
+#include <linux/crc32.h>
+#include <linux/ethtool.h>
+#include <linux/tcp.h>
+#include <linux/u64_stats_sync.h>
+
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+
+#include <mach/hardware.h>
+#include <mach/global_reg.h>
+#include <mach/irqs.h>
+#include <mach/gmac.h>
+#include "sl351x_hw.h"
+
+#define DEFAULT_TX_COALESCE 16
+#define DEFAULT_GMAC_RXQ_ORDER 10
+#define DEFAULT_GMAC_TXQ_ORDER 10
+#define DEFAULT_RX_BUF_ORDER 11
+#define DEFAULT_NAPI_WEIGHT 64
+#define RX_INSERT_BYTES 2
+#define TX_MAX_FRAGS 8
+#define TX_QUEUE_NUM 1 /* max: 6 */
+#define RX_MAX_ALLOC_ORDER 2
+#define NETIF_TSO_FEATURES \
+ (NETIF_F_TSO|NETIF_F_TSO_ECN|NETIF_F_TSO6)
+#define GMAC_TX_OFFLOAD_FEATURES \
+ (NETIF_TSO_FEATURES|NETIF_F_ALL_CSUM)
+
+static int debug_level;
+module_param(debug_level, int, 0600);
+MODULE_PARM_DESC(debug_level, "netif debug level mask");
+
+struct toe_private {
+ void __iomem *iomem;
+ GMAC_RXDESC_T *freeq_ring;
+ spinlock_t irq_lock;
+
+ struct net_device *netdev[2];
+ struct device *dev;
+ int irq;
+
+ unsigned int freeq_frag_order;
+ unsigned int freeq_order;
+ unsigned int freeq_entries;
+ dma_addr_t freeq_dma_base;
+
+ struct page *freeq_page;
+ unsigned int freeq_page_count;
+ unsigned int alloc_order;
+ unsigned int freeq_page_offs;
+};
+
+struct gmac_txq {
+ GMAC_TXDESC_T *ring;
+ unsigned int cptr;
+ struct sk_buff **skb;
+ unsigned int noirq_packets;
+} ____cacheline_aligned_in_smp;
+
+struct gmac_private {
+ void __iomem *dma_iomem;
+
+ void __iomem *rxq_rwptr;
+ GMAC_RXDESC_T *rxq_ring;
+ unsigned int rxq_order;
+
+ struct napi_struct napi;
+ struct gmac_txq txq[TX_QUEUE_NUM];
+ unsigned int txq_order;
+ unsigned int irq_every_tx_packets;
+
+ dma_addr_t rxq_dma_base;
+ dma_addr_t txq_dma_base;
+
+ unsigned int msg_enable;
+ spinlock_t config_lock;
+
+ int in_reset;
+
+ struct u64_stats_sync tx_stats_syncp;
+ struct u64_stats_sync rx_stats_syncp;
+ struct u64_stats_sync ir_stats_syncp;
+
+ struct rtnl_link_stats64 stats;
+ u64 hw_stats[RX_STATS_NUM];
+ u64 rx_stats[RX_STATUS_NUM];
+ u64 rx_csum_stats[RX_CHKSUM_NUM];
+ u64 rx_napi_exits;
+ u64 tx_frag_stats[TX_MAX_FRAGS];
+ u64 tx_frags_linearized;
+ u64 tx_hw_csummed;
+};
+
+#define GMAC_STATS_NUM ( \
+ RX_STATS_NUM + RX_STATUS_NUM + RX_CHKSUM_NUM + 1 + \
+ TX_MAX_FRAGS + 2)
+
+static const char gmac_stats_strings[GMAC_STATS_NUM][ETH_GSTRING_LEN] = {
+ "GMAC_IN_DISCARDS",
+ "GMAC_IN_ERRORS",
+ "GMAC_IN_MCAST",
+ "GMAC_IN_BCAST",
+ "GMAC_IN_MAC1",
+ "GMAC_IN_MAC2",
+ "RX_STATUS_GOOD_FRAME",
+ "RX_STATUS_TOO_LONG_GOOD_CRC",
+ "RX_STATUS_RUNT_FRAME",
+ "RX_STATUS_SFD_NOT_FOUND",
+ "RX_STATUS_CRC_ERROR",
+ "RX_STATUS_TOO_LONG_BAD_CRC",
+ "RX_STATUS_ALIGNMENT_ERROR",
+ "RX_STATUS_TOO_LONG_BAD_ALIGN",
+ "RX_STATUS_RX_ERR",
+ "RX_STATUS_DA_FILTERED",
+ "RX_STATUS_BUFFER_FULL",
+ "RX_STATUS_11",
+ "RX_STATUS_12",
+ "RX_STATUS_13",
+ "RX_STATUS_14",
+ "RX_STATUS_15",
+ "RX_CHKSUM_IP_UDP_TCP_OK",
+ "RX_CHKSUM_IP_OK_ONLY",
+ "RX_CHKSUM_NONE",
+ "RX_CHKSUM_3",
+ "RX_CHKSUM_IP_ERR_UNKNOWN",
+ "RX_CHKSUM_IP_ERR",
+ "RX_CHKSUM_TCP_UDP_ERR",
+ "RX_CHKSUM_7",
+ "RX_NAPI_EXITS",
+ "TX_FRAGS[1]",
+ "TX_FRAGS[2]",
+ "TX_FRAGS[3]",
+ "TX_FRAGS[4]",
+ "TX_FRAGS[5]",
+ "TX_FRAGS[6]",
+ "TX_FRAGS[7]",
+ "TX_FRAGS[8]",
+ "TX_FRAGS_LINEARIZED",
+ "TX_HW_CSUMMED",
+};
+
+static struct gmac_private *netdev_to_gmac(struct net_device *dev)
+{
+ return netdev_priv(dev);
+}
+
+static struct toe_private *netdev_to_toe(struct net_device *dev)
+{
+ return dev->ml_priv;
+}
+
+static struct gmac_private *napi_to_gmac(struct napi_struct *napi)
+{
+ return container_of(napi, struct gmac_private, napi);
+}
+
+static void __iomem *toe_reg(struct toe_private *toe, unsigned int reg)
+{
+ return toe->iomem + reg;
+}
+
+static void __iomem *gmac_dma_reg(struct net_device *dev, unsigned int reg)
+{
+ return netdev_to_gmac(dev)->dma_iomem + reg;
+}
+
+static void __iomem *gmac_ctl_reg(struct net_device *dev, unsigned int reg)
+{
+ return (void __iomem *)dev->base_addr + reg;
+}
+
+static struct page *toe_unmap_rx_desc(struct toe_private *toe,
+ GMAC_RXDESC_T *rx)
+{
+ struct page *page;
+
+ if (unlikely(!rx->word2.buf_adr))
+ return NULL;
+
+ page = pfn_to_page(dma_to_pfn(toe->dev, rx->word2.buf_adr));
+
+ dma_unmap_page(toe->dev, rx->word2.buf_adr,
+ 1 << toe->freeq_frag_order, DMA_FROM_DEVICE);
+
+ return page;
+}
+
+static void gmac_hw_start(struct net_device *dev)
+{
+ GMAC_DMA_CTRL_T dma_ctrl;
+
+ dma_ctrl.bits32 = readl(gmac_dma_reg(dev, GMAC_DMA_CTRL_REG));
+
+ dma_ctrl.bits.rd_enable = 1;
+ dma_ctrl.bits.td_enable = 1;
+ dma_ctrl.bits.loopback = 0;
+ dma_ctrl.bits.drop_small_ack = 0;
+ dma_ctrl.bits.rd_prot = 0;
+ dma_ctrl.bits.rd_burst_size = 3;
+ dma_ctrl.bits.rd_insert_bytes = RX_INSERT_BYTES;
+ dma_ctrl.bits.rd_bus = 3;
+ dma_ctrl.bits.td_prot = 0;
+ dma_ctrl.bits.td_burst_size = 3;
+ dma_ctrl.bits.td_bus = 3;
+
+ writel(dma_ctrl.bits32, gmac_dma_reg(dev, GMAC_DMA_CTRL_REG));
+}
+
+static void gmac_hw_stop(struct net_device *dev)
+{
+ GMAC_DMA_CTRL_T dma_ctrl;
+
+ dma_ctrl.bits32 = readl(gmac_dma_reg(dev, GMAC_DMA_CTRL_REG));
+
+ dma_ctrl.bits.rd_enable = 0;
+ dma_ctrl.bits.td_enable = 0;
+
+ writel(dma_ctrl.bits32, gmac_dma_reg(dev, GMAC_DMA_CTRL_REG));
+}
+
+static void gmac_update_config0_reg(struct net_device *dev, u32 val, u32 vmask)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ unsigned long flags;
+ u32 reg;
+
+ spin_lock_irqsave(&gmac->config_lock, flags);
+
+ reg = readl(gmac_ctl_reg(dev, GMAC_CONFIG0));
+ reg = (reg & ~vmask) | val;
+ writel(reg, gmac_ctl_reg(dev, GMAC_CONFIG0));
+
+ spin_unlock_irqrestore(&gmac->config_lock, flags);
+}
+
+static void gmac_enable_tx_rx(struct net_device *dev)
+{
+ gmac_update_config0_reg(dev, 0, CONFIG0_TX_RX_DISABLE);
+}
+
+static void gmac_disable_tx_rx(struct net_device *dev)
+{
+ gmac_update_config0_reg(dev, CONFIG0_TX_RX_DISABLE,
+ CONFIG0_TX_RX_DISABLE);
+ mdelay(10); /* let the GMAC finish any packet in flight */
+}
+
+static void gmac_set_flow_control(struct net_device *dev, bool tx, bool rx)
+{
+ u32 val = (tx ? CONFIG0_FLOW_TX : 0)|(rx ? CONFIG0_FLOW_RX : 0);
+
+ gmac_update_config0_reg(dev, val, CONFIG0_FLOW_CTL);
+}
+
+static void gmac_update_link_state(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ void __iomem *status_reg = gmac_ctl_reg(dev, GMAC_STATUS);
+ struct phy_device *phydev = dev->phydev;
+ GMAC_STATUS_T status, old_status;
+
+ old_status.bits32 = status.bits32 = readl(status_reg);
+
+ status.bits.link = phydev->link;
+ status.bits.duplex = phydev->duplex;
+
+ switch (phydev->speed) {
+ case 1000:
+ status.bits.speed = GMAC_SPEED_1000;
+ if (phydev->interface == PHY_INTERFACE_MODE_RGMII)
+ status.bits.mii_rmii = GMAC_PHY_RGMII_1000;
+ break;
+ case 100:
+ status.bits.speed = GMAC_SPEED_100;
+ if (phydev->interface == PHY_INTERFACE_MODE_RGMII)
+ status.bits.mii_rmii = GMAC_PHY_RGMII_100_10;
+ break;
+ case 10:
+ status.bits.speed = GMAC_SPEED_10;
+ if (phydev->interface == PHY_INTERFACE_MODE_RGMII)
+ status.bits.mii_rmii = GMAC_PHY_RGMII_100_10;
+ break;
+ default:
+ dev_warn(&dev->dev, "Unsupported PHY speed (%d)\n",
+ phydev->speed);
+ }
+
+ gmac_set_flow_control(dev, phydev->pause,
+ phydev->pause ^ phydev->asym_pause);
+
+ if (old_status.bits32 == status.bits32)
+ return;
+
+ if (netif_msg_link(gmac)) {
+ phy_print_status(phydev);
+ netdev_info(dev, "link flow control: %s\n",
+ phydev->pause
+ ? (phydev->asym_pause ? "tx" : "both")
+ : (phydev->asym_pause ? "rx" : "none")
+ );
+ }
+
+ gmac_disable_tx_rx(dev);
+ writel(status.bits32, status_reg);
+ gmac_enable_tx_rx(dev);
+}
+
+static int gmac_setup_phy(struct net_device *dev)
+{
+ struct toe_private *toe = netdev_to_toe(dev);
+ struct gemini_gmac_platform_data *pdata = toe->dev->platform_data;
+ GMAC_STATUS_T status = { .bits32 = 0 };
+ int num = dev->dev_id;
+
+ dev->phydev = phy_connect(dev, pdata->bus_id[num],
+ &gmac_update_link_state, 0, pdata->interface[num]);
+
+ if (IS_ERR(dev->phydev)) {
+ int err = PTR_ERR(dev->phydev);
+ dev->phydev = NULL;
+ return err;
+ }
+
+ dev->phydev->supported &= PHY_GBIT_FEATURES|SUPPORTED_Pause;
+ dev->phydev->advertising = dev->phydev->supported;
+
+ /* set PHY interface type */
+ switch (dev->phydev->interface) {
+ case PHY_INTERFACE_MODE_MII:
+ status.bits.mii_rmii = GMAC_PHY_MII;
+ break;
+ case PHY_INTERFACE_MODE_GMII:
+ status.bits.mii_rmii = GMAC_PHY_GMII;
+ break;
+ case PHY_INTERFACE_MODE_RGMII:
+ status.bits.mii_rmii = GMAC_PHY_RGMII_100_10;
+ break;
+ default:
+ dev_err(&dev->dev, "Unsupported MII interface\n");
+ phy_disconnect(dev->phydev);
+ dev->phydev = NULL;
+ return -EINVAL;
+ }
+ writel(status.bits32, gmac_ctl_reg(dev, GMAC_STATUS));
+
+ return 0;
+}
+
+static int gmac_pick_rx_max_len(int max_l3_len)
+{
+ /* index = CONFIG_MAXLEN_XXX values */
+ static const int max_len[8] = {
+ 1536, 1518, 1522, 1542,
+ 9212, 10236, 1518, 1518
+ };
+ int i, n = 5;
+
+ max_l3_len += ETH_HLEN + VLAN_HLEN;
+
+ if (max_l3_len > max_len[n])
+ return -1;
+
+ for (i = 0; i < 5; ++i) {
+ if (max_len[i] >= max_l3_len && max_len[i] < max_len[n])
+ n = i;
+ }
+
+ return n;
+}
+
+static int gmac_init(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ u32 val;
+
+ GMAC_CONFIG0_T config0 = { .bits = {
+ .dis_tx = 1,
+ .dis_rx = 1,
+ .ipv4_rx_chksum = 1,
+ .ipv6_rx_chksum = 1,
+ .rx_err_detect = 1,
+ .rgmm_edge = 1,
+ .port0_chk_hwq = 1,
+ .port1_chk_hwq = 1,
+ .port0_chk_toeq = 1,
+ .port1_chk_toeq = 1,
+ .port0_chk_classq = 1,
+ .port1_chk_classq = 1,
+ } };
+ GMAC_AHB_WEIGHT_T ahb_weight = { .bits = {
+ .rx_weight = 1,
+ .tx_weight = 1,
+ .hash_weight = 1,
+ .pre_req = 0x1f,
+ .tqDV_threshold = 0,
+ } };
+ GMAC_TX_WCR0_T hw_weigh = { .bits = {
+ .hw_tq3 = 1,
+ .hw_tq2 = 1,
+ .hw_tq1 = 1,
+ .hw_tq0 = 1,
+ } };
+ GMAC_TX_WCR1_T sw_weigh = { .bits = {
+ .sw_tq5 = 1,
+ .sw_tq4 = 1,
+ .sw_tq3 = 1,
+ .sw_tq2 = 1,
+ .sw_tq1 = 1,
+ .sw_tq0 = 1,
+ } };
+ GMAC_CONFIG1_T config1 = { .bits = {
+ .set_threshold = 16,
+ .rel_threshold = 24,
+ } };
+ GMAC_CONFIG2_T config2 = { .bits = {
+ .set_threshold = 16,
+ .rel_threshold = 32,
+ } };
+ GMAC_CONFIG3_T config3 = { .bits = {
+ .set_threshold = 0,
+ .rel_threshold = 0,
+ } };
+
+ config0.bits.max_len = gmac_pick_rx_max_len(dev->mtu);
+
+ val = readl(gmac_ctl_reg(dev, GMAC_CONFIG0));
+ config0.bits.reserved = ((GMAC_CONFIG0_T)val).bits.reserved;
+ writel(config0.bits32, gmac_ctl_reg(dev, GMAC_CONFIG0));
+ writel(config1.bits32, gmac_ctl_reg(dev, GMAC_CONFIG1));
+ writel(config2.bits32, gmac_ctl_reg(dev, GMAC_CONFIG2));
+ writel(config3.bits32, gmac_ctl_reg(dev, GMAC_CONFIG3));
+
+ val = readl(gmac_dma_reg(dev, GMAC_AHB_WEIGHT_REG));
+ netif_info(gmac, ifup, dev, "init: prev AHB_WEIGHT = 0x%08x\n", val);
+ writel(ahb_weight.bits32, gmac_dma_reg(dev, GMAC_AHB_WEIGHT_REG));
+
+ writel(hw_weigh.bits32,
+ gmac_dma_reg(dev, GMAC_TX_WEIGHTING_CTRL_0_REG));
+ writel(sw_weigh.bits32,
+ gmac_dma_reg(dev, GMAC_TX_WEIGHTING_CTRL_1_REG));
+
+ gmac->rxq_order = DEFAULT_GMAC_RXQ_ORDER;
+ gmac->txq_order = DEFAULT_GMAC_TXQ_ORDER;
+
+ gmac->irq_every_tx_packets = DEFAULT_TX_COALESCE;
+
+ return 0;
+}
+
+static void gmac_uninit(struct net_device *dev)
+{
+ if (dev->phydev)
+ phy_disconnect(dev->phydev);
+}
+
+static int gmac_setup_txqs(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+ void __iomem *rwptr_reg = gmac_dma_reg(dev, GMAC_SW_TX_QUEUE0_PTR_REG);
+ void __iomem *base_reg = gmac_dma_reg(dev, GMAC_SW_TX_QUEUE_BASE_REG);
+
+ unsigned int n_txq = dev->num_tx_queues;
+ struct gmac_txq *txq = gmac->txq;
+ GMAC_TXDESC_T *desc_ring;
+ struct sk_buff **skb_tab;
+ int i;
+
+ skb_tab = kzalloc(
+ n_txq * sizeof(*skb_tab) << gmac->txq_order, GFP_KERNEL);
+ if (!skb_tab)
+ return -ENOMEM;
+
+ desc_ring = dma_alloc_coherent(toe->dev,
+ n_txq * sizeof(*desc_ring) << gmac->txq_order,
+ &gmac->txq_dma_base, GFP_KERNEL);
+ if (!desc_ring) {
+ kfree(skb_tab);
+ return -ENOMEM;
+ }
+
+ BUG_ON(gmac->txq_dma_base & ~DMA_Q_BASE_MASK);
+
+ for (i = 0; i < n_txq; i++, txq++) {
+ netif_info(gmac, ifup, dev,
+ "txq%u: ring %p (dma 0x%08x), skb %p, rwptr %p, len %u (order %u)\n",
+ i, desc_ring, gmac->txq_dma_base, skb_tab, rwptr_reg,
+ 1 << gmac->txq_order, gmac->txq_order);
+
+ writel(0, rwptr_reg);
+ txq->ring = desc_ring;
+ txq->cptr = 0;
+ txq->skb = skb_tab;
+
+ desc_ring += 1 << gmac->txq_order;
+ skb_tab += 1 << gmac->txq_order;
+ rwptr_reg += 4;
+ }
+
+ writel(gmac->txq_dma_base | gmac->txq_order, base_reg);
+
+ return 0;
+}
+
+static void gmac_cleanup_txqs(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+ void __iomem *rwptr_reg = gmac_dma_reg(dev, GMAC_SW_TX_QUEUE0_PTR_REG);
+ void __iomem *base_reg = gmac_dma_reg(dev, GMAC_SW_TX_QUEUE_BASE_REG);
+
+ struct gmac_txq *txq = gmac->txq;
+ unsigned n_txq = dev->num_tx_queues;
+ int i, j;
+
+ for (i = 0; i < n_txq; ++i, ++txq) {
+ writel(0, rwptr_reg + 4 * i);
+ for (j = 0; j < (1 << gmac->txq_order); ++j)
+ if (txq->skb[j])
+ dev_kfree_skb(txq->skb[j]);
+ }
+
+ writel(0, base_reg);
+
+ kfree(gmac->txq->skb);
+ dma_free_coherent(toe->dev,
+ n_txq * sizeof(*gmac->txq->ring) << gmac->txq_order,
+ gmac->txq->ring, gmac->txq_dma_base);
+}
+
+static int gmac_setup_rxq(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+ NONTOE_QHDR_T __iomem *qhdr = toe_reg(toe, TOE_DEFAULT_Q_HDR_BASE(dev->dev_id));
+
+ gmac->rxq_rwptr = &qhdr->word1;
+ gmac->rxq_ring = dma_alloc_coherent(toe->dev,
+ sizeof(*gmac->rxq_ring) << gmac->rxq_order,
+ &gmac->rxq_dma_base, GFP_KERNEL);
+ if (!gmac->rxq_ring)
+ return -ENOMEM;
+
+ BUG_ON(gmac->rxq_dma_base & ~NONTOE_QHDR0_BASE_MASK);
+
+ writel(0, gmac->rxq_rwptr);
+ writel(gmac->rxq_dma_base | gmac->rxq_order, &qhdr->word0);
+
+ netif_info(gmac, ifup, dev,
+ "rxq: ring %p (dma 0x%08x), rwptr %p, len %u (order %u)\n",
+ gmac->rxq_ring, gmac->rxq_dma_base, gmac->rxq_rwptr,
+ 1 << gmac->rxq_order, gmac->rxq_order);
+ return 0;
+}
+
+static void gmac_cleanup_rxq(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+
+ NONTOE_QHDR_T __iomem *qhdr = toe_reg(toe, TOE_DEFAULT_Q_HDR_BASE(dev->dev_id));
+ void __iomem *dma_reg = &qhdr->word0;
+ void __iomem *ptr_reg = &qhdr->word1;
+ unsigned i, e, mask = __RWPTR_MASK(gmac->rxq_order);
+ struct page *page;
+
+ i = GET_RPTR(ptr_reg);
+ e = GET_WPTR(ptr_reg);
+ writel(0, ptr_reg);
+ writel(0, dma_reg);
+
+ for (; i != e; i = __RWPTR_NEXT(i, mask)) {
+ page = toe_unmap_rx_desc(toe, &gmac->rxq_ring[i]);
+ if (likely(page))
+ put_page(page);
+ }
+
+ dma_free_coherent(toe->dev, sizeof(*gmac->rxq_ring) << gmac->rxq_order,
+ gmac->rxq_ring, gmac->rxq_dma_base);
+}
+
+static void __gmac_enable_txfin_irq(struct net_device *, int txq, int enable);
+
+static void gmac_tx_interrupt(struct net_device *dev, unsigned txq_num)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+
+ void __iomem *ptr_reg = gmac_dma_reg(dev, GMAC_SW_TX_QUEUE_PTR_REG(txq_num));
+ struct netdev_queue *ntxq = netdev_get_tx_queue(dev, txq_num);
+ struct gmac_txq *txq = &gmac->txq[txq_num];
+
+ unsigned i, n, mask = __RWPTR_MASK(gmac->txq_order);
+ unsigned errs = 0, pkts = 0, bytes = 0;
+ struct sk_buff *skb;
+ GMAC_TXDESC_T *tx;
+
+ netif_info(gmac, tx_done, dev, "txirq%u: %u,%u,%u\n",
+ txq_num, txq->cptr, GET_RPTR(ptr_reg), GET_WPTR(ptr_reg));
+
+ for (i = txq->cptr; i != GET_RPTR(ptr_reg); i = __RWPTR_NEXT(i, mask)) {
+retry:
+ tx = &txq->ring[i];
+ skb = txq->skb[i];
+ txq->skb[i] = NULL;
+
+ BUG_ON(!skb);
+
+ dma_unmap_single(toe->dev, tx->word2.buf_adr,
+ tx->word0.bits.buffer_size, DMA_TO_DEVICE);
+
+ if (tx->word0.bits.status_tx_ok) {
+ pkts++;
+ bytes += skb->len;
+ netif_info(gmac, tx_done, dev,
+ "TX done descriptor: [%u] 0x%08x 0x%08x 0x%08x 0x%08x\n",
+ i, tx->word0.bits32, tx->word1.bits32,
+ tx->word2.bits32, tx->word3.bits32);
+ } else {
+ errs++;
+ netif_err(gmac, tx_err, dev,
+ "TX error descriptor: [%u] 0x%08x 0x%08x 0x%08x 0x%08x\n",
+ i, tx->word0.bits32, tx->word1.bits32,
+ tx->word2.bits32, tx->word3.bits32);
+ }
+
+ n = tx->word0.bits.desc_count;
+ BUG_ON(__RWPTR_DISTANCE(i, GET_RPTR(ptr_reg), mask) < n);
+
+ while (--n) {
+ i = __RWPTR_NEXT(i, mask);
+ dma_unmap_page(toe->dev, txq->ring[i].word2.buf_adr,
+ txq->ring[i].word0.bits.buffer_size,
+ DMA_TO_DEVICE);
+ netif_info(gmac, tx_done, dev,
+ "TX frag descriptor: [%u] 0x%08x 0x%08x 0x%08x 0x%08x\n",
+ i, txq->ring[i].word0.bits32,
+ txq->ring[i].word1.bits32,
+ txq->ring[i].word2.bits32,
+ txq->ring[i].word3.bits32);
+ }
+
+ dev_kfree_skb_irq(skb);
+ }
+
+ spin_lock(&toe->irq_lock);
+
+ u64_stats_update_begin(&gmac->ir_stats_syncp);
+ gmac->stats.tx_errors += errs;
+ gmac->stats.tx_packets += pkts;
+ gmac->stats.tx_bytes += bytes;
+ u64_stats_update_end(&gmac->ir_stats_syncp);
+
+ txq->cptr = i;
+ __gmac_enable_txfin_irq(dev, txq_num, 0);
+ netif_tx_wake_queue(ntxq);
+
+ spin_unlock(&toe->irq_lock);
+
+ writel(
+ (GMAC0_SWTQ00_EOF_INT_BIT|GMAC0_SWTQ00_FIN_INT_BIT)
+ << (6 * dev->dev_id + txq_num),
+ toe_reg(toe, GLOBAL_INTERRUPT_STATUS_0_REG));
+ txq->noirq_packets = gmac->irq_every_tx_packets;
+
+ if (unlikely(i != GET_RPTR(ptr_reg))) {
+ errs = pkts = bytes = 0;
+ goto retry;
+ }
+}
+
+static inline unsigned tss_pkt_len(struct sk_buff *skb)
+{
+ if (!skb_is_gso(skb))
+ return skb->len;
+
+ return skb_transport_offset(skb) +
+ tcp_hdrlen(skb) + skb_shinfo(skb)->gso_size;
+}
+
+static int gmac_map_tx_bufs(struct net_device *dev, struct sk_buff *skb,
+ struct gmac_txq *txq, int desc)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct device *dma_dev = netdev_to_toe(dev)->dev;
+ skb_frag_t *frag;
+ dma_addr_t mapping;
+ int nfrags, w;
+ unsigned tss_flags = TSS_MTU_ENABLE_BIT;
+
+ frag = skb_shinfo(skb)->frags;
+ nfrags = skb_shinfo(skb)->nr_frags;
+ w = desc;
+
+ mapping = dma_map_single(dma_dev, skb->data,
+ skb_headlen(skb), DMA_TO_DEVICE);
+ if (dma_mapping_error(dma_dev, mapping))
+ goto map1_error;
+
+ if (skb->ip_summed != CHECKSUM_NONE) {
+ int tcp = 0;
+ if (skb->protocol == htons(ETH_P_IP)) {
+ tss_flags |= TSS_IP_CHKSUM_BIT;
+ tcp = ip_hdr(skb)->protocol == IPPROTO_TCP;
+ } else { /* IPv6 */
+ tss_flags |= TSS_IPV6_ENABLE_BIT;
+ tcp = ipv6_hdr(skb)->nexthdr == IPPROTO_TCP;
+ }
+
+ if (tcp)
+ tss_flags |= TSS_TCP_CHKSUM_BIT;
+ else
+ tss_flags |= TSS_UDP_CHKSUM_BIT;
+ } else if (!skb_is_gso(skb))
+ tss_flags |= TSS_BYPASS_BIT;
+
+ txq->ring[w].word0.bits32 = skb_headlen(skb);
+ txq->ring[w].word1.bits32 = skb->len | tss_flags;
+ txq->ring[w].word2.bits32 = mapping;
+ txq->ring[w].word3.bits32 = tss_pkt_len(skb) | SOF_BIT;
+
+ /* racing with TX completion irq, harmless */
+ if (txq->noirq_packets == 1) {
+ txq->noirq_packets = 0;
+ txq->ring[w].word3.bits32 |= EOFIE_BIT;
+ } else if (txq->noirq_packets)
+ txq->noirq_packets--;
+
+ netif_info(gmac, tx_queued, dev,
+ "txq%td[%u]: 0x%08x 0x%08x 0x%08x 0x%08x, datap %p\n",
+ txq - gmac->txq, w,
+ txq->ring[w].word0.bits32, txq->ring[w].word1.bits32,
+ txq->ring[w].word2.bits32, txq->ring[w].word3.bits32,
+ skb->data);
+
+ while (nfrags--) {
+ mapping = dma_map_page(dma_dev, frag->page,
+ frag->page_offset, frag->size, DMA_TO_DEVICE);
+ if (dma_mapping_error(dma_dev, mapping))
+ goto map_error;
+
+ w = RWPTR_NEXT(w, gmac->txq_order);
+ txq->ring[w].word0.bits32 = frag->size;
+ txq->ring[w].word1.bits32 = 0;
+ txq->ring[w].word2.bits32 = mapping;
+ txq->ring[w].word3.bits32 = 0;
+
+ netif_info(gmac, tx_queued, dev,
+ "txq%td[%u]: 0x%08x 0x%08x 0x%08x 0x%08x, data %u @ %p+0x%03x\n",
+ txq - gmac->txq, w,
+ txq->ring[w].word0.bits32, txq->ring[w].word1.bits32,
+ txq->ring[w].word2.bits32, txq->ring[w].word3.bits32,
+ frag->size, frag->page, frag->page_offset);
+
+ ++frag;
+ }
+
+ txq->ring[w].word3.bits32 |= EOFIE_BIT | EOF_BIT;
+
+ return RWPTR_NEXT(w, gmac->txq_order);
+
+map_error:
+ while (w != desc) {
+ dma_unmap_page(dma_dev, txq->ring[w].word2.buf_adr,
+ txq->ring[w].word0.bits.buffer_size, DMA_TO_DEVICE);
+ w = RWPTR_PREV(w, gmac->txq_order);
+ }
+
+ dma_unmap_single(dma_dev, txq->ring[w].word2.buf_adr,
+ txq->ring[w].word0.bits.buffer_size, DMA_TO_DEVICE);
+
+map1_error:
+ netif_info(gmac, tx_err, dev,
+ "txq%td: DMA mapping error\n", txq - gmac->txq);
+
+ return -ENOMEM;
+}
+
+static netdev_tx_t gmac_start_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+
+ void __iomem *ptr_reg;
+ struct gmac_txq *txq;
+ struct netdev_queue *ntxq;
+ int w, nw, txq_num, nfrags;
+ unsigned long flags;
+
+ SKB_FRAG_ASSERT(skb);
+
+ txq_num = skb_get_queue_mapping(skb);
+ ptr_reg = gmac_dma_reg(dev, GMAC_SW_TX_QUEUE_PTR_REG(txq_num));
+ txq = &gmac->txq[txq_num];
+ ntxq = netdev_get_tx_queue(dev, txq_num);
+
+ netif_info(gmac, tx_queued, dev, "txq%u: %u,%u,%u ? %p (%u @ %p) /%u\n",
+ txq_num, txq->cptr, GET_RPTR(ptr_reg), GET_WPTR(ptr_reg),
+ skb, skb->len, skb->data, skb_shinfo(skb)->gso_size);
+ if (netif_msg_pktdata(gmac))
+ print_hex_dump(KERN_DEBUG, "TX: ", DUMP_PREFIX_OFFSET, 16, 1,
+ skb->data, skb_headlen(skb), true);
+
+ u64_stats_update_begin(&gmac->tx_stats_syncp);
+
+ if (skb->len >= 0x10000)
+ goto out_drop_free;
+
+ w = GET_WPTR(ptr_reg);
+ spin_lock_irqsave(&toe->irq_lock, flags);
+ nw = RWPTR_DISTANCE(w, txq->cptr - 1, gmac->txq_order);
+ if (!nw) {
+ netif_tx_stop_queue(ntxq);
+ __gmac_enable_txfin_irq(dev, txq_num, 1);
+ spin_unlock_irqrestore(&toe->irq_lock, flags);
+ goto out_drop_free;
+ }
+ spin_unlock_irqrestore(&toe->irq_lock, flags);
+
+ nfrags = skb_shinfo(skb)->nr_frags;
+ if (nw <= nfrags || nfrags >= TX_MAX_FRAGS) {
+ /* skb_linearize() does not free the skb on failure */
+ if (skb_linearize(skb))
+ goto out_drop_free;
+ gmac->tx_frags_linearized++;
+ } else
+ gmac->tx_frag_stats[nfrags]++;
+
+ nw = gmac_map_tx_bufs(dev, skb, txq, w);
+ if (nw < 0)
+ goto out_drop_free;
+ txq->skb[w] = skb;
+ w = nw;
+
+ if (skb->ip_summed != CHECKSUM_NONE)
+ gmac->tx_hw_csummed++;
+
+ u64_stats_update_end(&gmac->tx_stats_syncp);
+
+ netif_info(gmac, tx_queued, dev, "txq%u: %u,%u,%u + %p\n",
+ txq_num, txq->cptr, GET_RPTR(ptr_reg), w, skb);
+
+ SET_WPTR(ptr_reg, w);
+
+ /* stats updated on tx completion */
+ return NETDEV_TX_OK;
+
+out_drop_free:
+ dev_kfree_skb(skb);
+ gmac->stats.tx_dropped++;
+ u64_stats_update_end(&gmac->tx_stats_syncp);
+ return NETDEV_TX_OK;
+}
+
+static struct sk_buff *gmac_drop_napi_skb(struct gmac_private *gmac,
+ struct toe_private *toe)
+{
+ napi_free_frags(&gmac->napi);
+
+ u64_stats_update_begin(&gmac->rx_stats_syncp);
+ gmac->stats.rx_dropped++;
+ u64_stats_update_end(&gmac->rx_stats_syncp);
+
+ return NULL;
+}
+
+static struct sk_buff *gmac_skb_if_good_frame(struct gmac_private *gmac,
+ GMAC_RXDESC_T *rx)
+{
+ struct sk_buff *skb;
+ unsigned pkt_size = rx->word1.bits.byte_count;
+ unsigned rx_status = rx->word0.bits.status;
+ unsigned rx_csum = rx->word0.bits.chksum_status;
+
+ u64_stats_update_begin(&gmac->rx_stats_syncp);
+
+ gmac->rx_stats[rx_status]++;
+ gmac->rx_csum_stats[rx_csum]++;
+
+ if (rx->word0.bits.derr || rx->word0.bits.perr ||
+ rx_status || pkt_size < ETH_ZLEN ||
+ rx_csum >= RX_CHKSUM_IP_ERR_UNKNOWN) {
+ gmac->stats.rx_errors++;
+
+ if (pkt_size < ETH_ZLEN || RX_ERROR_LENGTH(rx_status))
+ gmac->stats.rx_length_errors++;
+ if (RX_ERROR_OVER(rx_status))
+ gmac->stats.rx_over_errors++;
+ if (RX_ERROR_CRC(rx_status))
+ gmac->stats.rx_crc_errors++;
+ if (RX_ERROR_FRAME(rx_status))
+ gmac->stats.rx_frame_errors++;
+
+ u64_stats_update_end(&gmac->rx_stats_syncp);
+
+ return NULL;
+ }
+
+ skb = napi_get_frags(&gmac->napi);
+ if (!skb) {
+ gmac->stats.rx_dropped++;
+ u64_stats_update_end(&gmac->rx_stats_syncp);
+
+ return NULL;
+ }
+
+ if (rx_csum == RX_CHKSUM_IP_UDP_TCP_OK)
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+ gmac->stats.rx_bytes += pkt_size;
+ gmac->stats.rx_packets++;
+
+ u64_stats_update_end(&gmac->rx_stats_syncp);
+
+ return skb;
+}
+
+static unsigned gmac_rx(struct net_device *dev, unsigned budget)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+ void __iomem *ptr_reg = gmac->rxq_rwptr;
+
+ unsigned i, mask = __RWPTR_MASK(gmac->rxq_order);
+ GMAC_RXDESC_T *rx = NULL;
+ struct sk_buff *skb = NULL;
+ struct page *page, *last_page = NULL;
+ unsigned page_offs, next_offs = 0;
+ unsigned pkt_size = 0, frag_size;
+ int frag_nr = 0;
+
+ netif_info(gmac, rx_status, dev, "rxq: %u,%u\n",
+ GET_RPTR(ptr_reg), GET_WPTR(ptr_reg));
+
+ i = GET_RPTR(ptr_reg);
+ for (; budget && i != GET_WPTR(ptr_reg); i = __RWPTR_NEXT(i, mask)) {
+ rx = &gmac->rxq_ring[i];
+
+ page = toe_unmap_rx_desc(toe, rx);
+ page_offs = rx->word2.buf_adr & ~PAGE_MASK;
+
+ netif_info(gmac, rx_status, dev,
+ "rxq[%u]: 0x%08x 0x%08x 0x%08x 0x%08x, page %p, offs 0x%04x\n",
+ i, rx->word0.bits32, rx->word1.bits32,
+ rx->word2.bits32, rx->word3.bits32, page, page_offs);
+
+ if (unlikely(!rx->word2.buf_adr)) {
+ netif_err(gmac, rx_status, dev,
+ "rxq[%u]: HW BUG: zero DMA descriptor\n", i);
+ if (skb)
+ skb = gmac_drop_napi_skb(gmac, toe);
+ continue;
+ }
+
+ if (rx->word3.bits32 & SOF_BIT) {
+ if (skb)
+ gmac_drop_napi_skb(gmac, toe);
+
+ skb = gmac_skb_if_good_frame(gmac, rx);
+ if (!skb) {
+ put_page(page);
+ continue;
+ }
+ skb->dev = dev;
+
+ pkt_size = rx->word1.bits.byte_count;
+ frag_nr = -1;
+ last_page = NULL;
+ page_offs += RX_INSERT_BYTES;
+ } else if (!skb) {
+ put_page(page);
+ continue;
+ }
+
+ /* append page frag to skb */
+
+ if (rx->word3.bits32 & EOF_BIT)
+ frag_size = pkt_size - skb->len;
+ else {
+ frag_size = 1 << toe->freeq_frag_order;
+ if (rx->word3.bits32 & SOF_BIT)
+ frag_size -= RX_INSERT_BYTES;
+ }
+
+ if (page == last_page && page_offs == next_offs) {
+ skb_shinfo(skb)->frags[frag_nr].size += frag_size;
+ put_page(page);
+ } else if (likely(++frag_nr != MAX_SKB_FRAGS))
+ skb_fill_page_desc(skb, frag_nr,
+ page, page_offs, frag_size);
+ else {
+ skb = gmac_drop_napi_skb(gmac, toe);
+ put_page(page);
+ continue;
+ }
+
+ last_page = page;
+ next_offs = page_offs + frag_size;
+
+ skb->len += frag_size;
+ skb->data_len += frag_size;
+ skb->truesize += frag_size;
+
+ /* receive */
+
+ if (rx->word3.bits32 & EOF_BIT) {
+ napi_gro_frags(&gmac->napi);
+ skb = NULL;
+ --budget;
+ }
+ }
+
+ SET_RPTR(ptr_reg, i);
+
+ if (skb)
+ gmac_drop_napi_skb(gmac, toe);
+
+ return budget;
+}
+
+#define GMAC0_IRQ0_2 (GMAC0_TXDERR_INT_BIT|GMAC0_TXPERR_INT_BIT| \
+ GMAC0_RXDERR_INT_BIT|GMAC0_RXPERR_INT_BIT)
+#define GMAC0_IRQ0_6 (GMAC0_SWTQ00_EOF_INT_BIT|GMAC0_SWTQ00_FIN_INT_BIT)
+#define GMAC0_IRQ4_8 (GMAC0_MIB_INT_BIT|GMAC0_RX_OVERRUN_INT_BIT)
+
+static void gmac_enable_irq(struct net_device *dev, int enable)
+{
+ struct toe_private *toe = netdev_to_toe(dev);
+ unsigned long flags;
+ unsigned val, mask;
+
+ spin_lock_irqsave(&toe->irq_lock, flags);
+
+ mask = (GMAC0_IRQ0_2 << (dev->dev_id * 2)) |
+ (GMAC0_IRQ0_6 << (dev->dev_id * 6));
+ val = readl(toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_0_REG));
+ val = enable ? (val | mask) : (val & ~mask);
+ writel(val, toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_0_REG));
+
+ mask = DEFAULT_Q0_INT_BIT << dev->dev_id;
+ val = readl(toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_1_REG));
+ val = enable ? (val | mask) : (val & ~mask);
+ writel(val, toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_1_REG));
+
+ mask = GMAC0_IRQ4_8 << (dev->dev_id * 8);
+ val = readl(toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_4_REG));
+ val = enable ? (val | mask) : (val & ~mask);
+ writel(val, toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_4_REG));
+
+ spin_unlock_irqrestore(&toe->irq_lock, flags);
+}
+
+static void __gmac_enable_txfin_irq(struct net_device *dev, int txq, int enable)
+{
+ struct toe_private *toe = netdev_to_toe(dev);
+ unsigned val, mask;
+
+ mask = GMAC0_SWTQ00_FIN_INT_BIT << (6 * dev->dev_id + txq);
+ val = readl(toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_0_REG));
+ val = enable ? (val | mask) : (val & ~mask);
+ writel(val, toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_0_REG));
+}
+
+static void gmac_enable_rx_irq(struct net_device *dev, int enable)
+{
+ struct toe_private *toe = netdev_to_toe(dev);
+ unsigned long flags;
+ unsigned val, mask;
+
+ spin_lock_irqsave(&toe->irq_lock, flags);
+
+ mask = DEFAULT_Q0_INT_BIT << dev->dev_id;
+ val = readl(toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_1_REG));
+ val = enable ? (val | mask) : (val & ~mask);
+ writel(val, toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_1_REG));
+
+ spin_unlock_irqrestore(&toe->irq_lock, flags);
+}
+
+static int gmac_napi_poll(struct napi_struct *napi, int max_work)
+{
+ struct gmac_private *gmac = napi_to_gmac(napi);
+ unsigned work_left;
+
+ work_left = gmac_rx(napi->dev, max_work);
+
+ if (work_left != max_work) {
+ if (work_left) {
+ struct toe_private *toe = netdev_to_toe(napi->dev);
+ /* we've cleared the queue, ack rx interrupt;
+ * on next poll the interrupt will be enabled
+ * if the queue stays empty
+ */
+ writel(DEFAULT_Q0_INT_BIT << napi->dev->dev_id,
+ toe_reg(toe, GLOBAL_INTERRUPT_STATUS_1_REG));
+ }
+ return max_work - work_left;
+ }
+
+ napi_complete(napi);
+ u64_stats_update_begin(&gmac->rx_stats_syncp);
+ ++gmac->rx_napi_exits;
+ u64_stats_update_end(&gmac->rx_stats_syncp);
+ gmac_enable_rx_irq(napi->dev, 1);
+
+ return 0;
+}
+
+static void gmac_dump_dma_state(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+ void __iomem *ptr_reg;
+ unsigned reg[5];
+
+ /* Interrupt status */
+ reg[0] = readl(toe_reg(toe, GLOBAL_INTERRUPT_STATUS_0_REG));
+ reg[1] = readl(toe_reg(toe, GLOBAL_INTERRUPT_STATUS_1_REG));
+ reg[2] = readl(toe_reg(toe, GLOBAL_INTERRUPT_STATUS_2_REG));
+ reg[3] = readl(toe_reg(toe, GLOBAL_INTERRUPT_STATUS_3_REG));
+ reg[4] = readl(toe_reg(toe, GLOBAL_INTERRUPT_STATUS_4_REG));
+ netdev_err(dev, "IRQ status: 0x%08x 0x%08x 0x%08x 0x%08x 0x%08x\n",
+ reg[0], reg[1], reg[2], reg[3], reg[4]);
+
+ /* Interrupt enable */
+ reg[0] = readl(toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_0_REG));
+ reg[1] = readl(toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_1_REG));
+ reg[2] = readl(toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_2_REG));
+ reg[3] = readl(toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_3_REG));
+ reg[4] = readl(toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_4_REG));
+ netdev_err(dev, "IRQ enable: 0x%08x 0x%08x 0x%08x 0x%08x 0x%08x\n",
+ reg[0], reg[1], reg[2], reg[3], reg[4]);
+
+ /* RX DMA status */
+ reg[0] = readl(gmac_dma_reg(dev, GMAC_DMA_RX_FIRST_DESC_REG));
+ reg[1] = readl(gmac_dma_reg(dev, GMAC_DMA_RX_CURR_DESC_REG));
+ reg[2] = GET_RPTR(gmac->rxq_rwptr);
+ reg[3] = GET_WPTR(gmac->rxq_rwptr);
+ netdev_err(dev, "RX DMA regs: 0x%08x 0x%08x, ptr: %u %u\n",
+ reg[0], reg[1], reg[2], reg[3]);
+
+ reg[0] = readl(gmac_dma_reg(dev, GMAC_DMA_RX_DESC_WORD0_REG));
+ reg[1] = readl(gmac_dma_reg(dev, GMAC_DMA_RX_DESC_WORD1_REG));
+ reg[2] = readl(gmac_dma_reg(dev, GMAC_DMA_RX_DESC_WORD2_REG));
+ reg[3] = readl(gmac_dma_reg(dev, GMAC_DMA_RX_DESC_WORD3_REG));
+ netdev_err(dev, "RX DMA descriptor: 0x%08x 0x%08x 0x%08x 0x%08x\n",
+ reg[0], reg[1], reg[2], reg[3]);
+
+ /* TX DMA status */
+ ptr_reg = gmac_dma_reg(dev, GMAC_SW_TX_QUEUE0_PTR_REG);
+
+ reg[0] = readl(gmac_dma_reg(dev, GMAC_DMA_TX_FIRST_DESC_REG));
+ reg[1] = readl(gmac_dma_reg(dev, GMAC_DMA_TX_CURR_DESC_REG));
+ reg[2] = GET_RPTR(ptr_reg);
+ reg[3] = GET_WPTR(ptr_reg);
+ netdev_err(dev, "TX DMA regs: 0x%08x 0x%08x, ptr: %u %u\n",
+ reg[0], reg[1], reg[2], reg[3]);
+
+ reg[0] = readl(gmac_dma_reg(dev, GMAC_DMA_TX_DESC_WORD0_REG));
+ reg[1] = readl(gmac_dma_reg(dev, GMAC_DMA_TX_DESC_WORD1_REG));
+ reg[2] = readl(gmac_dma_reg(dev, GMAC_DMA_TX_DESC_WORD2_REG));
+ reg[3] = readl(gmac_dma_reg(dev, GMAC_DMA_TX_DESC_WORD3_REG));
+ netdev_err(dev, "TX DMA descriptor: 0x%08x 0x%08x 0x%08x 0x%08x\n",
+ reg[0], reg[1], reg[2], reg[3]);
+
+ /* FREE queues status */
+ ptr_reg = toe_reg(toe, GLOBAL_SWFQ_RWPTR_REG);
+
+ reg[0] = GET_RPTR(ptr_reg);
+ reg[1] = GET_WPTR(ptr_reg);
+
+ ptr_reg = toe_reg(toe, GLOBAL_HWFQ_RWPTR_REG);
+
+ reg[2] = GET_RPTR(ptr_reg);
+ reg[3] = GET_WPTR(ptr_reg);
+ netdev_err(dev, "FQ SW ptr: %u %u, HW ptr: %u %u\n",
+ reg[0], reg[1], reg[2], reg[3]);
+}
+
+static void gmac_update_hw_stats(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+ unsigned long flags;
+ unsigned int rx_discards, rx_mcast, rx_bcast;
+
+ spin_lock_irqsave(&toe->irq_lock, flags);
+ u64_stats_update_begin(&gmac->ir_stats_syncp);
+
+ gmac->hw_stats[0] += rx_discards = readl(gmac_ctl_reg(dev, GMAC_IN_DISCARDS));
+ gmac->hw_stats[1] += readl(gmac_ctl_reg(dev, GMAC_IN_ERRORS));
+ gmac->hw_stats[2] += rx_mcast = readl(gmac_ctl_reg(dev, GMAC_IN_MCAST));
+ gmac->hw_stats[3] += rx_bcast = readl(gmac_ctl_reg(dev, GMAC_IN_BCAST));
+ gmac->hw_stats[4] += readl(gmac_ctl_reg(dev, GMAC_IN_MAC1));
+ gmac->hw_stats[5] += readl(gmac_ctl_reg(dev, GMAC_IN_MAC2));
+
+ gmac->stats.rx_missed_errors += rx_discards;
+ gmac->stats.multicast += rx_mcast;
+ gmac->stats.multicast += rx_bcast;
+
+ writel(GMAC0_MIB_INT_BIT << (dev->dev_id * 8),
+ toe_reg(toe, GLOBAL_INTERRUPT_STATUS_4_REG));
+
+ u64_stats_update_end(&gmac->ir_stats_syncp);
+ spin_unlock_irqrestore(&toe->irq_lock, flags);
+}
+
+static inline unsigned gmac_get_intr_flags(struct net_device *dev, int i)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+ void __iomem *irqif_reg, *irqen_reg;
+ unsigned offs, val;
+
+ offs = i * (GLOBAL_INTERRUPT_STATUS_1_REG - GLOBAL_INTERRUPT_STATUS_0_REG);
+
+ irqif_reg = toe_reg(toe, GLOBAL_INTERRUPT_STATUS_0_REG + offs);
+ irqen_reg = toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_0_REG + offs);
+
+ val = readl(irqif_reg) & readl(irqen_reg);
+ if (val)
+ netif_info(gmac, intr, dev, "irq: val%d&en = 0x%08x\n", i, val);
+
+ return val;
+}
+
+static irqreturn_t gmac_interrupt(int irq, void *data)
+{
+ struct net_device *dev = data;
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+ unsigned val, orr = 0;
+
+
+ orr |= val = gmac_get_intr_flags(dev, 0);
+
+ if (unlikely(val & (GMAC0_IRQ0_2 << (dev->dev_id * 2)))) {
+ /* oh, crap. */
+ netif_err(gmac, intr, dev, "hw failure/sw bug\n");
+ gmac_dump_dma_state(dev);
+
+ /* don't know how to recover, just reduce losses */
+ gmac_enable_irq(dev, 0);
+ return IRQ_HANDLED;
+ }
+
+ if (val & (GMAC0_IRQ0_6 << (dev->dev_id * 6)))
+ gmac_tx_interrupt(dev, 0);
+
+
+ orr |= val = gmac_get_intr_flags(dev, 1);
+
+ if (val & (DEFAULT_Q0_INT_BIT << dev->dev_id)) {
+ gmac_enable_rx_irq(dev, 0);
+ napi_schedule(&gmac->napi);
+ }
+
+
+ orr |= val = gmac_get_intr_flags(dev, 4);
+
+ if (unlikely(val & (GMAC0_MIB_INT_BIT << (dev->dev_id * 8))))
+ gmac_update_hw_stats(dev);
+
+ if (unlikely(val & (GMAC0_RX_OVERRUN_INT_BIT << (dev->dev_id * 8)))) {
+ writel(GMAC0_RXDERR_INT_BIT << (dev->dev_id * 8),
+ toe_reg(toe, GLOBAL_INTERRUPT_STATUS_4_REG));
+
+ spin_lock(&toe->irq_lock);
+ u64_stats_update_begin(&gmac->ir_stats_syncp);
+ ++gmac->stats.rx_fifo_errors;
+ u64_stats_update_end(&gmac->ir_stats_syncp);
+ spin_unlock(&toe->irq_lock);
+ }
+
+ return orr ? IRQ_HANDLED : IRQ_NONE;
+}
+
+static int gmac_open_running(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ int err;
+
+ err = gmac_setup_rxq(dev);
+ if (unlikely(err))
+ return err;
+
+ err = gmac_setup_txqs(dev);
+ if (unlikely(err)) {
+ gmac_cleanup_rxq(dev);
+ return err;
+ }
+
+ napi_enable(&gmac->napi);
+ gmac_hw_start(dev);
+ gmac_enable_irq(dev, 1);
+ gmac_enable_tx_rx(dev);
+ netif_tx_start_all_queues(dev);
+
+ gmac->in_reset = 0;
+
+ return 0;
+}
+
+static void gmac_stop_running(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+
+ netif_tx_stop_all_queues(dev);
+
+ gmac_disable_tx_rx(dev);
+ gmac_hw_stop(dev);
+
+ napi_disable(&gmac->napi);
+
+ gmac_enable_irq(dev, 0);
+
+ gmac_cleanup_txqs(dev);
+ gmac_cleanup_rxq(dev);
+
+ gmac->in_reset = 1;
+}
+
+static int gmac_open(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ int err;
+
+ if (!dev->phydev) {
+ err = gmac_setup_phy(dev);
+ if (err) {
+ netif_err(gmac, ifup, dev,
+ "PHY init failed: %d\n", err);
+ return err;
+ }
+ }
+
+ err = request_irq(dev->irq, gmac_interrupt,
+ IRQF_SHARED, dev->name, dev);
+ if (unlikely(err))
+ return err;
+
+ netif_carrier_off(dev);
+ phy_start(dev->phydev);
+
+ err = gmac_open_running(dev);
+ if (likely(!err))
+ return 0;
+
+ phy_stop(dev->phydev);
+ free_irq(dev->irq, dev);
+ return err;
+}
+
+static int gmac_stop(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+
+ if (!gmac->in_reset)
+ gmac_stop_running(dev);
+
+ phy_stop(dev->phydev);
+ free_irq(dev->irq, dev);
+
+ gmac_update_hw_stats(dev);
+
+ return 0;
+}
+
+static void gmac_set_multicast_list(struct net_device *dev)
+{
+ struct netdev_hw_addr *ha;
+ __u32 mc_filter[2];
+ unsigned bit_nr;
+
+ if (dev->flags & IFF_ALLMULTI)
+ return;
+
+ mc_filter[1] = mc_filter[0] = 0;
+ netdev_for_each_mc_addr(ha, dev) {
+ bit_nr = ~crc32_be(~0, ha->addr, ETH_ALEN) & 0x3f;
+ mc_filter[bit_nr >> 5] |= 1 << (bit_nr & 0x1f);
+ }
+
+ writel(mc_filter[0], gmac_ctl_reg(dev, GMAC_MCAST_FIL0));
+ writel(mc_filter[1], gmac_ctl_reg(dev, GMAC_MCAST_FIL1));
+}
+
+static void gmac_set_rx_mode(struct net_device *dev)
+{
+ GMAC_RX_FLTR_T filter = { .bits = {
+ .broadcast = 1,
+ .multicast = 1,
+ .unicast = 1,
+ } };
+
+ if (dev->flags & IFF_PROMISC) {
+ filter.bits.error = 1;
+ filter.bits.promiscuous = 1;
+ } else if (dev->flags & IFF_ALLMULTI) {
+ writel(~0, gmac_ctl_reg(dev, GMAC_MCAST_FIL0));
+ writel(~0, gmac_ctl_reg(dev, GMAC_MCAST_FIL1));
+ } else {
+ gmac_set_multicast_list(dev);
+ }
+
+ writel(filter.bits32, gmac_ctl_reg(dev, GMAC_RX_FLTR));
+}
+
+static void __gmac_set_mac_address(struct net_device *dev)
+{
+ __le32 addr[3];
+
+ memset(addr, 0, sizeof(addr));
+ memcpy(addr, dev->dev_addr, ETH_ALEN);
+
+ writel(le32_to_cpu(addr[0]), gmac_ctl_reg(dev, GMAC_STA_ADD0));
+ writel(le32_to_cpu(addr[1]), gmac_ctl_reg(dev, GMAC_STA_ADD1));
+ writel(le32_to_cpu(addr[2]), gmac_ctl_reg(dev, GMAC_STA_ADD2));
+}
+
+static int gmac_set_mac_address(struct net_device *dev, void *addr)
+{
+ struct sockaddr *sa = addr;
+
+ if (!is_valid_ether_addr(sa->sa_data))
+ return -EADDRNOTAVAIL;
+
+ memcpy(dev->dev_addr, sa->sa_data, ETH_ALEN);
+ __gmac_set_mac_address(dev);
+
+ return 0;
+}
+
+static void gmac_clear_hw_stats(struct net_device *dev)
+{
+ readl(gmac_ctl_reg(dev, GMAC_IN_DISCARDS));
+ readl(gmac_ctl_reg(dev, GMAC_IN_ERRORS));
+ readl(gmac_ctl_reg(dev, GMAC_IN_MCAST));
+ readl(gmac_ctl_reg(dev, GMAC_IN_BCAST));
+ readl(gmac_ctl_reg(dev, GMAC_IN_MAC1));
+ readl(gmac_ctl_reg(dev, GMAC_IN_MAC2));
+}
+
+static struct rtnl_link_stats64 *gmac_get_stats64(struct net_device *dev,
+ struct rtnl_link_stats64 *storage)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ unsigned int start;
+
+ gmac_update_hw_stats(dev);
+
+ /* racing with RX NAPI */
+ do {
+ start = u64_stats_fetch_begin(&gmac->rx_stats_syncp);
+
+ storage->rx_packets = gmac->stats.rx_packets;
+ storage->rx_bytes = gmac->stats.rx_bytes;
+ storage->rx_errors = gmac->stats.rx_errors;
+ storage->rx_dropped = gmac->stats.rx_dropped;
+
+ storage->rx_length_errors = gmac->stats.rx_length_errors;
+ storage->rx_over_errors = gmac->stats.rx_over_errors;
+ storage->rx_crc_errors = gmac->stats.rx_crc_errors;
+ storage->rx_frame_errors = gmac->stats.rx_frame_errors;
+
+ } while (u64_stats_fetch_retry(&gmac->rx_stats_syncp, start));
+
+ /* racing with MIB and TX completion interrupts */
+ do {
+ start = u64_stats_fetch_begin(&gmac->ir_stats_syncp);
+
+ storage->tx_errors = gmac->stats.tx_errors;
+ storage->tx_packets = gmac->stats.tx_packets;
+ storage->tx_bytes = gmac->stats.tx_bytes;
+
+ storage->multicast = gmac->stats.multicast;
+ storage->rx_missed_errors = gmac->stats.rx_missed_errors;
+ storage->rx_fifo_errors = gmac->stats.rx_fifo_errors;
+
+ } while (u64_stats_fetch_retry(&gmac->ir_stats_syncp, start));
+
+ /* racing with hard_start_xmit */
+ do {
+ start = u64_stats_fetch_begin(&gmac->tx_stats_syncp);
+
+ storage->tx_dropped = gmac->stats.tx_dropped;
+
+ } while (u64_stats_fetch_retry(&gmac->tx_stats_syncp, start));
+
+ storage->rx_dropped += storage->rx_missed_errors;
+
+ return storage;
+}
+
+static int gmac_change_mtu(struct net_device *dev, int new_mtu)
+{
+ int max_len = gmac_pick_rx_max_len(new_mtu);
+
+ if (max_len < 0)
+ return -EINVAL;
+
+ gmac_disable_tx_rx(dev);
+
+ dev->mtu = new_mtu;
+ gmac_update_config0_reg(dev,
+ max_len << CONFIG0_MAXLEN_SHIFT,
+ CONFIG0_MAXLEN_MASK);
+ if (new_mtu + ETH_HLEN + VLAN_HLEN > MTU_SIZE_BIT_MASK &&
+ dev->features & GMAC_TX_OFFLOAD_FEATURES) {
+ netdev_warn(dev, "Dropping TX offloads for MTU > %d\n",
+ MTU_SIZE_BIT_MASK - ETH_HLEN - VLAN_HLEN);
+ dev->features &= ~GMAC_TX_OFFLOAD_FEATURES;
+ }
+
+ gmac_enable_tx_rx(dev);
+
+ return 0;
+}
+
+static u32 gmac_get_rx_csum(struct net_device *dev)
+{
+ return !!(readl(gmac_ctl_reg(dev, GMAC_CONFIG0)) & CONFIG0_RX_CHKSUM);
+}
+
+static int gmac_set_rx_csum(struct net_device *dev, u32 enable)
+{
+ gmac_update_config0_reg(dev,
+ enable ? CONFIG0_RX_CHKSUM : 0, CONFIG0_RX_CHKSUM);
+
+ return 0;
+}
+
+static int gmac_set_tx_csum(struct net_device *dev, u32 data)
+{
+ if (data && dev->mtu + ETH_HLEN + VLAN_HLEN > MTU_SIZE_BIT_MASK)
+ return -EINVAL;
+
+ return ethtool_op_set_tx_ipv6_csum(dev, data);
+}
+
+static int gmac_set_tso(struct net_device *dev, u32 data)
+{
+ if (!data) {
+ dev->features &= ~NETIF_TSO_FEATURES;
+ return 0;
+ }
+
+ if (dev->mtu + ETH_HLEN + VLAN_HLEN > MTU_SIZE_BIT_MASK)
+ return -EINVAL;
+
+ dev->features |= NETIF_TSO_FEATURES;
+
+ return 0;
+}
+
+static int gmac_get_sset_count(struct net_device *dev, int sset)
+{
+ return sset == ETH_SS_STATS ? GMAC_STATS_NUM : 0;
+}
+
+static void gmac_get_strings(struct net_device *dev, u32 stringset, u8 *data)
+{
+ if (stringset != ETH_SS_STATS)
+ return;
+
+ memcpy(data, gmac_stats_strings, sizeof(gmac_stats_strings));
+}
+
+static void gmac_get_ethtool_stats(struct net_device *dev,
+ struct ethtool_stats *estats, u64 *values)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ unsigned int start;
+ u64 *p;
+ int i;
+
+ gmac_update_hw_stats(dev);
+
+ /* racing with MIB interrupt */
+ do {
+ p = values;
+ start = u64_stats_fetch_begin(&gmac->ir_stats_syncp);
+
+ for (i = 0; i < RX_STATS_NUM; ++i)
+ *p++ = gmac->hw_stats[i];
+
+ } while (u64_stats_fetch_retry(&gmac->ir_stats_syncp, start));
+ values = p;
+
+ /* racing with RX NAPI */
+ do {
+ p = values;
+ start = u64_stats_fetch_begin(&gmac->rx_stats_syncp);
+
+ for (i = 0; i < RX_STATUS_NUM; ++i)
+ *p++ = gmac->rx_stats[i];
+ for (i = 0; i < RX_CHKSUM_NUM; ++i)
+ *p++ = gmac->rx_csum_stats[i];
+ *p++ = gmac->rx_napi_exits;
+
+ } while (u64_stats_fetch_retry(&gmac->rx_stats_syncp, start));
+ values = p;
+
+ /* racing with TX start_xmit */
+ do {
+ p = values;
+ start = u64_stats_fetch_begin(&gmac->tx_stats_syncp);
+
+ /* write through p so a seqcount retry restarts from the same slot */
+ for (i = 0; i < TX_MAX_FRAGS; ++i)
+ *p++ = gmac->tx_frag_stats[i];
+ *p++ = gmac->tx_frags_linearized;
+ *p++ = gmac->tx_hw_csummed;
+
+ } while (u64_stats_fetch_retry(&gmac->tx_stats_syncp, start));
+}
+
+static int gmac_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+{
+ if (!dev->phydev)
+ return -ENXIO;
+ return phy_ethtool_gset(dev->phydev, cmd);
+}
+
+static int gmac_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+{
+ if (!dev->phydev)
+ return -ENXIO;
+ return phy_ethtool_sset(dev->phydev, cmd);
+}
+
+static int gmac_nway_reset(struct net_device *dev)
+{
+ if (!dev->phydev)
+ return -ENXIO;
+ return phy_start_aneg(dev->phydev);
+}
+
+static void gmac_get_pauseparam(struct net_device *dev,
+ struct ethtool_pauseparam *pparam)
+{
+ GMAC_CONFIG0_T config0;
+
+ config0.bits32 = readl(gmac_ctl_reg(dev, GMAC_CONFIG0));
+
+ pparam->rx_pause = config0.bits.rx_fc_en;
+ pparam->tx_pause = config0.bits.tx_fc_en;
+ pparam->autoneg = true;
+}
+
+static void gmac_get_ringparam(struct net_device *dev,
+ struct ethtool_ringparam *rp)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ GMAC_CONFIG0_T config0;
+
+ config0.bits32 = readl(gmac_ctl_reg(dev, GMAC_CONFIG0));
+
+ rp->rx_max_pending = 1 << 15;
+ rp->rx_mini_max_pending = 0;
+ rp->rx_jumbo_max_pending = 0;
+ rp->tx_max_pending = 1 << 15;
+
+ rp->rx_pending = 1 << gmac->rxq_order;
+ rp->rx_mini_pending = 0;
+ rp->rx_jumbo_pending = 0;
+ rp->tx_pending = 1 << gmac->txq_order;
+}
+
+static int toe_resize_freeq(struct toe_private *toe, int changing_dev_id);
+
+static int gmac_set_ringparam(struct net_device *dev,
+ struct ethtool_ringparam *rp)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ struct toe_private *toe = netdev_to_toe(dev);
+ int err = 0;
+
+ if (netif_running(dev))
+ return -EBUSY;
+
+ if (rp->rx_pending) {
+ gmac->rxq_order = min(15, ilog2(rp->rx_pending - 1) + 1);
+ err = toe_resize_freeq(toe, dev->dev_id);
+ }
+
+ if (rp->tx_pending)
+ gmac->txq_order = min(15, ilog2(rp->tx_pending - 1) + 1);
+
+ return err;
+}
+
+static int gmac_get_coalesce(struct net_device *dev,
+ struct ethtool_coalesce *ecmd)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+
+ ecmd->rx_max_coalesced_frames = 1;
+ ecmd->tx_max_coalesced_frames = gmac->irq_every_tx_packets;
+
+ return 0;
+}
+
+static int gmac_set_coalesce(struct net_device *dev,
+ struct ethtool_coalesce *ecmd)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+
+ if (ecmd->tx_max_coalesced_frames < 1)
+ return -EINVAL;
+ if (ecmd->tx_max_coalesced_frames >= 1 << gmac->txq_order)
+ return -EINVAL;
+
+ gmac->irq_every_tx_packets = ecmd->tx_max_coalesced_frames;
+
+ return 0;
+}
+
+static u32 gmac_get_msglevel(struct net_device *dev)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ return gmac->msg_enable;
+}
+
+static void gmac_set_msglevel(struct net_device *dev, u32 level)
+{
+ struct gmac_private *gmac = netdev_to_gmac(dev);
+ gmac->msg_enable = level;
+}
+
+static void gmac_get_drvinfo(struct net_device *dev,
+ struct ethtool_drvinfo *info)
+{
+ strlcpy(info->driver, "sl351x", sizeof(info->driver));
+ strlcpy(info->version, "mq-k", sizeof(info->version));
+ strlcpy(info->bus_info, dev->dev_id ? "1" : "0",
+ sizeof(info->bus_info));
+}
+
+static const struct net_device_ops gmac_351x_ops = {
+ .ndo_init = gmac_init,
+ .ndo_uninit = gmac_uninit,
+ .ndo_open = gmac_open,
+ .ndo_stop = gmac_stop,
+ .ndo_start_xmit = gmac_start_xmit,
+ .ndo_tx_timeout = gmac_dump_dma_state,
+ .ndo_set_multicast_list = gmac_set_multicast_list,
+ .ndo_set_rx_mode = gmac_set_rx_mode,
+ .ndo_set_mac_address = gmac_set_mac_address,
+ .ndo_get_stats64 = gmac_get_stats64,
+ .ndo_change_mtu = gmac_change_mtu,
+};
+
+static const struct ethtool_ops gmac_351x_ethtool_ops = {
+ .get_rx_csum = gmac_get_rx_csum,
+ .set_rx_csum = gmac_set_rx_csum,
+ .get_tx_csum = ethtool_op_get_tx_csum,
+ .set_tx_csum = gmac_set_tx_csum,
+ .get_sg = ethtool_op_get_sg,
+ .set_sg = ethtool_op_set_sg,
+ .get_tso = ethtool_op_get_tso,
+ .set_tso = gmac_set_tso,
+ .get_sset_count = gmac_get_sset_count,
+ .get_strings = gmac_get_strings,
+ .get_ethtool_stats = gmac_get_ethtool_stats,
+ .get_settings = gmac_get_settings,
+ .set_settings = gmac_set_settings,
+ .get_link = ethtool_op_get_link,
+ .nway_reset = gmac_nway_reset,
+ .get_pauseparam = gmac_get_pauseparam,
+ .get_ringparam = gmac_get_ringparam,
+ .set_ringparam = gmac_set_ringparam,
+ .get_coalesce = gmac_get_coalesce,
+ .set_coalesce = gmac_set_coalesce,
+ .get_msglevel = gmac_get_msglevel,
+ .set_msglevel = gmac_set_msglevel,
+ .get_drvinfo = gmac_get_drvinfo,
+};
+
+static int __devinit gmac_init_netdev(struct toe_private *toe, int num,
+ struct platform_device *pdev)
+{
+ struct gemini_gmac_platform_data *pdata = pdev->dev.platform_data;
+ struct gmac_private *gmac;
+ struct net_device *dev;
+ __le32 addr[3];
+ int irq, err;
+
+ if (!pdata->bus_id[num])
+ return 0;
+
+ irq = platform_get_irq(pdev, num);
+ if (irq < 0) {
+ dev_err(toe->dev, "No IRQ for ethernet device #%d\n", num);
+ return irq;
+ }
+
+ dev = alloc_etherdev_mq(sizeof(*gmac), TX_QUEUE_NUM);
+ if (!dev) {
+ dev_err(toe->dev, "Can't allocate ethernet device #%d\n", num);
+ return -ENOMEM;
+ }
+
+ gmac = netdev_priv(dev);
+ dev->ml_priv = toe;
+ SET_NETDEV_DEV(dev, toe->dev);
+
+ toe->netdev[num] = dev;
+ dev->dev_id = num;
+
+ gmac->dma_iomem = toe->iomem + TOE_GMAC_DMA_BASE(num);
+ dev->base_addr = (unsigned long)(toe->iomem + TOE_GMAC_BASE(num));
+ dev->irq = irq;
+
+ dev->netdev_ops = &gmac_351x_ops;
+ SET_ETHTOOL_OPS(dev, &gmac_351x_ethtool_ops);
+
+ spin_lock_init(&gmac->config_lock);
+ gmac->msg_enable = debug_level;
+ gmac_clear_hw_stats(dev);
+
+ /* select working offloads by default */
+ /* (SG will be disabled when HW csum is disabled) */
+ /* TSO_ECN untested, TX csum unreliable, TX DMA unreliable */
+ dev->features |= NETIF_F_SG | NETIF_F_GSO | NETIF_F_GRO;
+
+ netif_napi_add(dev, &gmac->napi, gmac_napi_poll, DEFAULT_NAPI_WEIGHT);
+
+ /* dump MAC address regs; CPU is LE anyway */
+ addr[0] = cpu_to_le32(readl(gmac_ctl_reg(dev, GMAC_STA_ADD0)));
+ addr[1] = cpu_to_le32(readl(gmac_ctl_reg(dev, GMAC_STA_ADD1)));
+ addr[2] = cpu_to_le32(readl(gmac_ctl_reg(dev, GMAC_STA_ADD2)));
+ dev_dbg(&pdev->dev, "port %d address regs: %pM %pM\n",
+ num, (char *)addr, (char *)addr + ETH_ALEN);
+
+ if (is_valid_ether_addr((void *)addr))
+ memcpy(dev->dev_addr, addr, ETH_ALEN);
+ else
+ random_ether_addr(dev->dev_addr);
+ __gmac_set_mac_address(dev);
+
+ err = gmac_setup_phy(dev);
+ if (err)
+ netif_warn(gmac, probe, dev,
+ "PHY init failed: %d, deferring to ifup time\n", err);
+
+ err = register_netdev(dev);
+ if (!err)
+ return 0;
+
+ toe->netdev[num] = NULL;
+ free_netdev(dev);
+ return err;
+}
+
+static struct page *toe_alloc_freeq_pages(struct toe_private *toe, bool emerg)
+{
+ gfp_t gfp_mask = GFP_NOIO;
+ bool retried = false;
+
+retry:
+ toe->freeq_page = alloc_pages(gfp_mask, toe->alloc_order);
+ if (!toe->freeq_page) {
+ if (gfp_mask & __GFP_HIGH)
+ /* even emergency alloc failed */
+ return NULL;
+ if (!toe->alloc_order) {
+ /* order 0 failed: give up unless emergency reserves are allowed */
+ if (!emerg)
+ return NULL;
+ gfp_mask |= __GFP_HIGH;
+ }
+ toe->alloc_order >>= 1;
+ retried = true;
+ goto retry;
+ }
+
+ if (dma_mapping_error(toe->dev, dma_map_page(toe->dev,
+ toe->freeq_page, 0, PAGE_SIZE << toe->alloc_order,
+ DMA_FROM_DEVICE))) {
+ put_page(toe->freeq_page);
+ goto retry;
+ }
+
+ toe->freeq_page_count = 1 << toe->alloc_order;
+ toe->freeq_page_offs = 0;
+
+ if (!retried && toe->alloc_order < RX_MAX_ALLOC_ORDER)
+ toe->alloc_order++;
+
+ return toe->freeq_page;
+}
+
+static struct page *toe_get_next_page(struct toe_private *toe,
+ struct page *page, unsigned eaten_size)
+{
+ toe->freeq_page_offs += eaten_size;
+ if (toe->freeq_page_offs & ~PAGE_MASK) {
+ get_page(page);
+ return page;
+ }
+
+ if (!--toe->freeq_page_count)
+ return NULL;
+
+ toe->freeq_page_offs = 0;
+
+ toe->freeq_page = ++page;
+ get_page(page);
+
+ return page;
+}
+
+static unsigned int toe_fill_freeq_range(struct toe_private *toe,
+ unsigned int begin, unsigned int end)
+{
+ void __iomem *ptr_reg = toe_reg(toe, GLOBAL_SWFQ_RWPTR_REG);
+ GMAC_RXDESC_T *desc, *dend;
+ struct page *page;
+ unsigned count;
+
+ dev_dbg(toe->dev, "freeq: filling <%u,%u) (ptr: %u %u)\n",
+ begin, end, GET_RPTR(ptr_reg), GET_WPTR(ptr_reg));
+
+ if (toe->freeq_page_count)
+ page = toe->freeq_page;
+ else
+ page = NULL;
+
+ desc = toe->freeq_ring + begin;
+ dend = toe->freeq_ring + end;
+
+ for (count = 0; desc != dend; ++desc, ++count) {
+ if (!page) {
+ page = toe_alloc_freeq_pages(toe, 0);
+ if (!page)
+ break;
+ }
+
+ /* only word2 gets copied to rxq descriptor */
+ /* buffer size is taken from DMA_SKB_SIZE_REG */
+ desc->word2.buf_adr = pfn_to_dma(toe->dev, page_to_pfn(page)) +
+ toe->freeq_page_offs;
+
+ if (unlikely(!count))
+ dev_dbg(toe->dev,
+ "freeq[%zu]: 0x%08x 0x%08x 0x%08x 0x%08x, page %p, offs 0x%04x\n",
+ desc - toe->freeq_ring,
+ desc->word0.bits32, desc->word1.bits32,
+ desc->word2.bits32, desc->word3.bits32,
+ page, toe->freeq_page_offs);
+
+ page = toe_get_next_page(toe, page,
+ 1 << toe->freeq_frag_order);
+ }
+
+ end = (desc - toe->freeq_ring) & __RWPTR_MASK(toe->freeq_order);
+ SET_WPTR(ptr_reg, end);
+
+ return count;
+}
+
+static void toe_enable_irq(struct toe_private *toe, int enable)
+{
+ void __iomem *irqen_reg = toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_4_REG);
+
+ unsigned long flags;
+ unsigned val;
+
+ spin_lock_irqsave(&toe->irq_lock, flags);
+
+ val = readl(irqen_reg);
+ if (enable)
+ val |= SWFQ_EMPTY_INT_BIT;
+ else
+ val &= ~SWFQ_EMPTY_INT_BIT;
+ writel(val, irqen_reg);
+
+ spin_unlock_irqrestore(&toe->irq_lock, flags);
+}
+
+static irqreturn_t toe_interrupt(int irq, void *data)
+{
+ struct toe_private *toe = data;
+
+ void __iomem *irqif_reg = toe_reg(toe, GLOBAL_INTERRUPT_STATUS_4_REG);
+ void __iomem *irqen_reg = toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_4_REG);
+ unsigned val, en;
+ irqreturn_t ret = IRQ_NONE;
+
+ spin_lock(&toe->irq_lock);
+
+ val = readl(irqif_reg);
+ val &= en = readl(irqen_reg);
+
+ if (val & SWFQ_EMPTY_INT_BIT) {
+ /* mask here: toe_enable_irq() would retake irq_lock and deadlock */
+ writel(en & ~SWFQ_EMPTY_INT_BIT, irqen_reg);
+ ret = IRQ_WAKE_THREAD;
+ }
+
+ spin_unlock(&toe->irq_lock);
+
+ return ret;
+}
+
+static irqreturn_t toe_interrupt_thread(int irq, void *data)
+{
+ struct toe_private *toe = data;
+ void __iomem *rwptr_reg = toe_reg(toe, GLOBAL_SWFQ_RWPTR_REG);
+ void __iomem *irqif_reg = toe_reg(toe, GLOBAL_INTERRUPT_STATUS_4_REG);
+ unsigned int r, w, d, end, count = 0;
+
+retry:
+ w = GET_WPTR(rwptr_reg);
+ r = GET_RPTR(rwptr_reg);
+
+ d = RWPTR_DISTANCE(r, w, toe->freeq_order);
+ if (unlikely(d >= toe->freeq_entries))
+ goto full;
+ d = toe->freeq_entries - d;
+
+ end = min(w + d, 1u << toe->freeq_order);
+ count += toe_fill_freeq_range(toe, w, end);
+
+ if (count == end - w && !(end & __RWPTR_MASK(toe->freeq_order)))
+ goto retry;
+
+full:
+ dev_dbg(toe->dev, "freeq: filled %u buffers\n", count);
+
+ writel(SWFQ_EMPTY_INT_BIT, irqif_reg);
+ if (unlikely(GET_WPTR(rwptr_reg) == GET_RPTR(rwptr_reg))) {
+ if (unlikely(!count))
+ dev_warn(toe->dev, "freeq: full, but empty?\n");
+ count = 0;
+ goto retry;
+ }
+
+ toe_enable_irq(toe, 1);
+
+ return IRQ_HANDLED;
+}
+
+static int toe_setup_freeq(struct toe_private *toe)
+{
+ void __iomem *dma_reg = toe_reg(toe, GLOBAL_SW_FREEQ_BASE_SIZE_REG);
+ QUEUE_THRESHOLD_T qt;
+ DMA_SKB_SIZE_T skbsz = { .bits = { .sw_skb_size = 1 << toe->freeq_frag_order } };
+ unsigned n;
+
+ toe->freeq_ring = dma_alloc_coherent(toe->dev,
+ sizeof(*toe->freeq_ring) << toe->freeq_order,
+ &toe->freeq_dma_base, GFP_KERNEL);
+ if (!toe->freeq_ring)
+ return -ENOMEM;
+
+ BUG_ON(toe->freeq_dma_base & ~DMA_Q_BASE_MASK);
+
+ writel(skbsz.bits32, toe_reg(toe, GLOBAL_DMA_SKB_SIZE_REG));
+ writel(toe->freeq_dma_base | toe->freeq_order, dma_reg);
+
+ /* fill ring */
+ n = toe_fill_freeq_range(toe, 0, toe->freeq_entries);
+ if (!n)
+ goto err_freeq;
+ if (n != toe->freeq_entries)
+ dev_warn(toe->dev, "Allocated only %u of %u RX buffers\n",
+ n, toe->freeq_entries);
+
+ qt.bits32 = readl(toe_reg(toe, GLOBAL_QUEUE_THRESHOLD_REG));
+ qt.bits.swfq_empty = min_t(unsigned, (n + 1) >> 1, 255);
+ writel(qt.bits32, toe_reg(toe, GLOBAL_QUEUE_THRESHOLD_REG));
+
+ dev_dbg(toe->dev, "freeq: ring %p (dma 0x%08x), len %u (order %u), thr %u\n",
+ toe->freeq_ring, toe->freeq_dma_base,
+ toe->freeq_entries, toe->freeq_order, qt.bits.swfq_empty);
+
+ return 0;
+
+err_freeq:
+ writel(0, dma_reg);
+ dma_free_coherent(toe->dev,
+ sizeof(*toe->freeq_ring) << toe->freeq_order,
+ toe->freeq_ring, toe->freeq_dma_base);
+ toe->freeq_ring = NULL;
+
+ return -ENOMEM;
+}
+
+static void toe_cleanup_freeq(struct toe_private *toe)
+{
+ void __iomem *dma_reg = toe_reg(toe, GLOBAL_SW_FREEQ_BASE_SIZE_REG);
+ void __iomem *ptr_reg = toe_reg(toe, GLOBAL_SWFQ_RWPTR_REG);
+ unsigned i, e, mask = __RWPTR_MASK(toe->freeq_order);
+
+ i = GET_RPTR(ptr_reg);
+ e = GET_WPTR(ptr_reg);
+ writel(0, ptr_reg);
+ writel(0, dma_reg);
+
+ for (; i != e; i = __RWPTR_NEXT(i, mask))
+ put_page(toe_unmap_rx_desc(toe, &toe->freeq_ring[i]));
+
+ dma_free_coherent(toe->dev,
+ sizeof(*toe->freeq_ring) << toe->freeq_order,
+ toe->freeq_ring, toe->freeq_dma_base);
+
+ toe->freeq_ring = NULL;
+}
+
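+/*
+ * Sizing sketch (illustrative numbers, not taken from hardware docs): with
+ * both ports at rxq_order 8, new_size starts at 256 + 256 = 512 and
+ * new_order = min(15, ilog2(511) + 1) = 9. Because 512 >= 1 << 9, new_size
+ * is then trimmed to 511, presumably so that equal read and write pointers
+ * can only mean an empty ring.
+ */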
+static int toe_resize_freeq(struct toe_private *toe, int changing_dev_id)
+{
+ struct net_device *other = toe->netdev[1 - changing_dev_id];
+ unsigned new_size = 0;
+ unsigned new_order;
+ int err;
+
+ if (other && netif_running(other))
+ return -EBUSY;
+
+ if (toe->netdev[0])
+ new_size = 1 << netdev_to_gmac(toe->netdev[0])->rxq_order;
+
+ if (toe->netdev[1])
+ new_size += 1 << netdev_to_gmac(toe->netdev[1])->rxq_order;
+
+ new_order = min(15, ilog2(new_size - 1) + 1);
+ if (new_size >= 1 << new_order)
+ new_size = (1 << new_order) - 1;
+
+ toe_enable_irq(toe, 0);
+ if (toe->freeq_ring)
+ toe_cleanup_freeq(toe);
+
+ toe->freeq_order = new_order;
+ toe->freeq_entries = new_size;
+
+ err = toe_setup_freeq(toe);
+ if (unlikely(err))
+ return err;
+
+ toe_enable_irq(toe, 1);
+
+ return 0;
+}
+
+
+/*
+ * Interrupt config:
+ *
+ * GMAC0 intr bits ------> int0 ----> eth0
+ * GMAC1 intr bits ------> int1 ----> eth1
+ * TOE intr -------------> int1 ----> eth1
+ * Classification Intr --> int0 ----> eth0
+ * Default Q0 -----------> int0 ----> eth0
+ * Default Q1 -----------> int1 ----> eth1
+ * FreeQ intr -----------> int1 ----> eth1
+ */
+static void toe_prepare_irq(struct toe_private *toe)
+{
+ writel(0, toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_0_REG));
+ writel(0, toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_1_REG));
+ writel(0, toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_2_REG));
+ writel(0, toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_3_REG));
+ writel(0, toe_reg(toe, GLOBAL_INTERRUPT_ENABLE_4_REG));
+
+ writel(0xCCFC0FC0, toe_reg(toe, GLOBAL_INTERRUPT_SELECT_0_REG));
+ writel(0x00F00002, toe_reg(toe, GLOBAL_INTERRUPT_SELECT_1_REG));
+ writel(0xFFFFFFFF, toe_reg(toe, GLOBAL_INTERRUPT_SELECT_2_REG));
+ writel(0xFFFFFFFF, toe_reg(toe, GLOBAL_INTERRUPT_SELECT_3_REG));
+ writel(0xFF000003, toe_reg(toe, GLOBAL_INTERRUPT_SELECT_4_REG));
+
+ /* edge-triggered sources share one level-triggered line; ack anything pending */
+ writel(~0, toe_reg(toe, GLOBAL_INTERRUPT_STATUS_0_REG));
+ writel(~0, toe_reg(toe, GLOBAL_INTERRUPT_STATUS_1_REG));
+ writel(~0, toe_reg(toe, GLOBAL_INTERRUPT_STATUS_2_REG));
+ writel(~0, toe_reg(toe, GLOBAL_INTERRUPT_STATUS_3_REG));
+ writel(~0, toe_reg(toe, GLOBAL_INTERRUPT_STATUS_4_REG));
+}
+
+static __devinit int toe_init(struct toe_private *toe,
+ struct platform_device *pdev)
+{
+ int err;
+
+ writel(0, toe_reg(toe, GLOBAL_SW_FREEQ_BASE_SIZE_REG));
+ writel(0, toe_reg(toe, GLOBAL_HW_FREEQ_BASE_SIZE_REG));
+ writel(0, toe_reg(toe, GLOBAL_SWFQ_RWPTR_REG));
+ writel(0, toe_reg(toe, GLOBAL_HWFQ_RWPTR_REG));
+
+ toe->freeq_frag_order = DEFAULT_RX_BUF_ORDER;
+ toe->freeq_order = ~0;
+
+ toe_prepare_irq(toe);
+ err = request_threaded_irq(toe->irq, toe_interrupt,
+ toe_interrupt_thread, IRQF_SHARED, "sl351x-TOE", toe);
+ /* the free queue is not allocated yet, so there is nothing to unwind */
+ return err;
+}
+
+static void toe_deinit(struct toe_private *toe)
+{
+ toe_prepare_irq(toe);
+ free_irq(toe->irq, toe);
+ if (toe->freeq_ring)
+ toe_cleanup_freeq(toe);
+
+ if (toe->freeq_page_count)
+ put_page(toe->freeq_page);
+}
+
+static int toe_reset(struct toe_private *toe)
+{
+ unsigned int reg, retry = 5;
+
+ reg = readl(IO_ADDRESS(GEMINI_GLOBAL_BASE) + GLOBAL_RESET);
+ reg |= RESET_GMAC1 | RESET_GMAC0;
+ writel(reg, IO_ADDRESS(GEMINI_GLOBAL_BASE) + GLOBAL_RESET);
+
+ do {
+ udelay(2);
+ reg = readl(toe_reg(toe, GLOBAL_TOE_VERSION_REG));
+ barrier();
+ } while (!reg && --retry);
+
+ dev_info(toe->dev, "Gemini GMAC version 0x%x\n", reg);
+
+ return reg ? 0 : -EIO;
+}
+
+static int __devinit gemini_gmac_probe(struct platform_device *pdev)
+{
+ struct resource *res;
+ struct toe_private *toe;
+ int retval;
+
+ if (!pdev->dev.platform_data)
+ return -EINVAL;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res) {
+ dev_err(&pdev->dev, "can't get device resources\n");
+ return -ENODEV;
+ }
+
+ toe = kzalloc(sizeof(*toe), GFP_KERNEL);
+ if (!toe)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, toe);
+ toe->dev = &pdev->dev;
+ toe->iomem = ioremap(res->start, resource_size(res));
+ if (!toe->iomem) {
+ dev_err(toe->dev, "ioremap failed\n");
+ retval = -EIO;
+ goto err_data;
+ }
+
+ retval = toe_reset(toe);
+ if (retval < 0)
+ goto err_unmap;
+
+ retval = toe->irq = platform_get_irq(pdev, 1);
+ if (retval < 0)
+ goto err_unmap;
+
+ spin_lock_init(&toe->irq_lock);
+
+ retval = toe_init(toe, pdev);
+ if (retval)
+ goto err_unmap;
+
+ retval = gmac_init_netdev(toe, 0, pdev);
+ if (retval)
+ goto err_uninit;
+
+ retval = gmac_init_netdev(toe, 1, pdev);
+ if (retval)
+ goto err_uninit;
+
+ toe_resize_freeq(toe, 0);
+
+ dev_dbg(&pdev->dev, "initialized.\n");
+ return 0;
+
+err_uninit:
+ if (toe->netdev[0])
+ unregister_netdev(toe->netdev[0]);
+ toe_deinit(toe);
+err_unmap:
+ iounmap(toe->iomem);
+err_data:
+ kfree(toe);
+ return retval;
+}
+
+static int __devexit gemini_gmac_remove(struct platform_device *pdev)
+{
+ struct toe_private *toe = platform_get_drvdata(pdev);
+ int i;
+
+ for (i = 0; i < 2; i++)
+ if (toe->netdev[i])
+ unregister_netdev(toe->netdev[i]);
+ toe_deinit(toe);
+
+ iounmap(toe->iomem);
+ kfree(toe);
+
+ return 0;
+}
+
+static struct platform_driver gemini_gmac_driver = {
+ .probe = gemini_gmac_probe,
+ .remove = __devexit_p(gemini_gmac_remove),
+ .driver.name = "gemini-gmac",
+ .driver.owner = THIS_MODULE,
+};
+
+static int __init gemini_gmac_init(void)
+{
+#ifdef CONFIG_MDIO_GPIO_MODULE
+ request_module("mdio-gpio");
+#endif
+ return platform_driver_register(&gemini_gmac_driver);
+}
+
+static void __exit gemini_gmac_exit(void)
+{
+ platform_driver_unregister(&gemini_gmac_driver);
+}
+
+module_init(gemini_gmac_init);
+module_exit(gemini_gmac_exit);
+
+MODULE_AUTHOR("Michał Mirosław");
+MODULE_DESCRIPTION("StorLink SL351x (Gemini) ethernet driver");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS("platform:gemini-gmac");
diff --git a/drivers/net/sl351x_hw.h b/drivers/net/sl351x_hw.h
new file mode 100644
index 0000000..f7bff5a
--- /dev/null
+++ b/drivers/net/sl351x_hw.h
@@ -0,0 +1,1436 @@
+/*
+ * Register definitions for Gemini LEPUS GMAC Ethernet device driver.
+ *
+ * Copyright (C) 2006, Storlink, Corp.
+ * Copyright (C) 2008-2009, Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
+ * Copyright (C) 2010, Michał Mirosław <mirq-linux@rere.qmqm.pl>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+#ifndef _GMAC_HW_H
+#define _GMAC_HW_H
+
+#include <linux/bitops.h>
+
+/*
+ * Base Registers
+ */
+#define TOE_NONTOE_QUE_HDR_BASE 0x2000
+#define TOE_TOE_QUE_HDR_BASE 0x3000
+#define TOE_V_BIT_BASE 0x4000
+#define TOE_A_BIT_BASE 0x6000
+#define TOE_GMAC_DMA_BASE(x) (0x8000 + 0x4000 * (x))
+#define TOE_GMAC_BASE(x) (0xA000 + 0x4000 * (x))
+
+/*
+ * Queue ID
+ */
+#define TOE_SW_FREE_QID 0x00
+#define TOE_HW_FREE_QID 0x01
+#define TOE_GMAC0_SW_TXQ0_QID 0x02
+#define TOE_GMAC0_SW_TXQ1_QID 0x03
+#define TOE_GMAC0_SW_TXQ2_QID 0x04
+#define TOE_GMAC0_SW_TXQ3_QID 0x05
+#define TOE_GMAC0_SW_TXQ4_QID 0x06
+#define TOE_GMAC0_SW_TXQ5_QID 0x07
+#define TOE_GMAC0_HW_TXQ0_QID 0x08
+#define TOE_GMAC0_HW_TXQ1_QID 0x09
+#define TOE_GMAC0_HW_TXQ2_QID 0x0A
+#define TOE_GMAC0_HW_TXQ3_QID 0x0B
+#define TOE_GMAC1_SW_TXQ0_QID 0x12
+#define TOE_GMAC1_SW_TXQ1_QID 0x13
+#define TOE_GMAC1_SW_TXQ2_QID 0x14
+#define TOE_GMAC1_SW_TXQ3_QID 0x15
+#define TOE_GMAC1_SW_TXQ4_QID 0x16
+#define TOE_GMAC1_SW_TXQ5_QID 0x17
+#define TOE_GMAC1_HW_TXQ0_QID 0x18
+#define TOE_GMAC1_HW_TXQ1_QID 0x19
+#define TOE_GMAC1_HW_TXQ2_QID 0x1A
+#define TOE_GMAC1_HW_TXQ3_QID 0x1B
+#define TOE_GMAC0_DEFAULT_QID 0x20
+#define TOE_GMAC1_DEFAULT_QID 0x21
+#define TOE_CLASSIFICATION_QID(x) (0x22 + (x)) /* 0x22 ~ 0x2F */
+#define TOE_TOE_QID(x) (0x40 + (x)) /* 0x40 ~ 0x7F */
+
+/*
+ * old info:
+ * TOE DMA Queue Size should be 2^n, n = 6...12
+ * TOE DMA Queues are the following queue types:
+ * SW Free Queue, HW Free Queue,
+ * GMAC 0/1 SW TX Q0-5, and GMAC 0/1 HW TX Q0-3
+ * The base address and descriptor number are configured at
+ * DMA Queues Descriptor Ring Base Address/Size Register (offset 0x0004)
+ */
+
+#define GET_WPTR(addr) __raw_readw((addr) + 2)
+#define GET_RPTR(addr) __raw_readw((addr))
+#define SET_WPTR(addr, data) __raw_writew((data), (addr) + 2)
+#define SET_RPTR(addr, data) __raw_writew((data), (addr))
+#define __RWPTR_NEXT(x, mask) (((unsigned int)(x) + 1) & (mask))
+#define __RWPTR_PREV(x, mask) (((unsigned int)(x) - 1) & (mask))
+#define __RWPTR_DISTANCE(r, w, mask) (((unsigned int)(w) - (r)) & (mask))
+#define __RWPTR_MASK(order) ((1 << (order)) - 1)
+#define RWPTR_NEXT(x, order) __RWPTR_NEXT((x), __RWPTR_MASK((order)))
+#define RWPTR_PREV(x, order) __RWPTR_PREV((x), __RWPTR_MASK((order)))
+#define RWPTR_DISTANCE(r, w, order) __RWPTR_DISTANCE((r), (w), \
+ __RWPTR_MASK((order)))
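+
+/*
+ * Worked example of the ring pointer helpers above (illustrative values,
+ * assuming a ring of order 3, i.e. 8 entries and a mask of 0x7):
+ * RWPTR_NEXT(7, 3) == 0 (wraps forward past the last slot)
+ * RWPTR_PREV(0, 3) == 7 (wraps backward past the first slot)
+ * RWPTR_DISTANCE(6, 2, 3) == 4 (slots 6, 7, 0 and 1 separate r from w)
+ */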
+
+/*
+ * Global registers
+ * #define TOE_GLOBAL_BASE (TOE_BASE + 0x0000)
+ * Base 0x60000000
+ */
+#define GLOBAL_TOE_VERSION_REG 0x0000
+#define GLOBAL_SW_FREEQ_BASE_SIZE_REG 0x0004
+#define GLOBAL_HW_FREEQ_BASE_SIZE_REG 0x0008
+#define GLOBAL_DMA_SKB_SIZE_REG 0x0010
+#define GLOBAL_SWFQ_RWPTR_REG 0x0014
+#define GLOBAL_HWFQ_RWPTR_REG 0x0018
+#define GLOBAL_INTERRUPT_STATUS_0_REG 0x0020
+#define GLOBAL_INTERRUPT_ENABLE_0_REG 0x0024
+#define GLOBAL_INTERRUPT_SELECT_0_REG 0x0028
+#define GLOBAL_INTERRUPT_STATUS_1_REG 0x0030
+#define GLOBAL_INTERRUPT_ENABLE_1_REG 0x0034
+#define GLOBAL_INTERRUPT_SELECT_1_REG 0x0038
+#define GLOBAL_INTERRUPT_STATUS_2_REG 0x0040
+#define GLOBAL_INTERRUPT_ENABLE_2_REG 0x0044
+#define GLOBAL_INTERRUPT_SELECT_2_REG 0x0048
+#define GLOBAL_INTERRUPT_STATUS_3_REG 0x0050
+#define GLOBAL_INTERRUPT_ENABLE_3_REG 0x0054
+#define GLOBAL_INTERRUPT_SELECT_3_REG 0x0058
+#define GLOBAL_INTERRUPT_STATUS_4_REG 0x0060
+#define GLOBAL_INTERRUPT_ENABLE_4_REG 0x0064
+#define GLOBAL_INTERRUPT_SELECT_4_REG 0x0068
+#define GLOBAL_HASH_TABLE_BASE_REG 0x006C
+#define GLOBAL_QUEUE_THRESHOLD_REG 0x0070
+
+/*
+ * GMAC 0/1 DMA/TOE register
+ * #define TOE_GMAC0_DMA_BASE (TOE_BASE + 0x8000)
+ * #define TOE_GMAC1_DMA_BASE (TOE_BASE + 0xC000)
+ * Base 0x60008000 or 0x6000C000
+ */
+#define GMAC_DMA_CTRL_REG 0x0000
+#define GMAC_TX_WEIGHTING_CTRL_0_REG 0x0004
+#define GMAC_TX_WEIGHTING_CTRL_1_REG 0x0008
+#define GMAC_SW_TX_QUEUE0_PTR_REG 0x000C
+#define GMAC_SW_TX_QUEUE1_PTR_REG 0x0010
+#define GMAC_SW_TX_QUEUE2_PTR_REG 0x0014
+#define GMAC_SW_TX_QUEUE3_PTR_REG 0x0018
+#define GMAC_SW_TX_QUEUE4_PTR_REG 0x001C
+#define GMAC_SW_TX_QUEUE5_PTR_REG 0x0020
+#define GMAC_SW_TX_QUEUE_PTR_REG(i) (GMAC_SW_TX_QUEUE0_PTR_REG + 4 * (i))
+#define GMAC_HW_TX_QUEUE0_PTR_REG 0x0024
+#define GMAC_HW_TX_QUEUE1_PTR_REG 0x0028
+#define GMAC_HW_TX_QUEUE2_PTR_REG 0x002C
+#define GMAC_HW_TX_QUEUE3_PTR_REG 0x0030
+#define GMAC_HW_TX_QUEUE_PTR_REG(i) (GMAC_HW_TX_QUEUE0_PTR_REG + 4 * (i))
+#define GMAC_DMA_TX_FIRST_DESC_REG 0x0038
+#define GMAC_DMA_TX_CURR_DESC_REG 0x003C
+#define GMAC_DMA_TX_DESC_WORD0_REG 0x0040
+#define GMAC_DMA_TX_DESC_WORD1_REG 0x0044
+#define GMAC_DMA_TX_DESC_WORD2_REG 0x0048
+#define GMAC_DMA_TX_DESC_WORD3_REG 0x004C
+#define GMAC_SW_TX_QUEUE_BASE_REG 0x0050
+#define GMAC_HW_TX_QUEUE_BASE_REG 0x0054
+#define GMAC_DMA_RX_FIRST_DESC_REG 0x0058
+#define GMAC_DMA_RX_CURR_DESC_REG 0x005C
+#define GMAC_DMA_RX_DESC_WORD0_REG 0x0060
+#define GMAC_DMA_RX_DESC_WORD1_REG 0x0064
+#define GMAC_DMA_RX_DESC_WORD2_REG 0x0068
+#define GMAC_DMA_RX_DESC_WORD3_REG 0x006C
+#define GMAC_HASH_ENGINE_REG0 0x0070
+#define GMAC_HASH_ENGINE_REG1 0x0074
+/* matching rule 0 Control register 0 */
+#define GMAC_MR0CR0 0x0078
+#define GMAC_MR0CR1 0x007C
+#define GMAC_MR0CR2 0x0080
+#define GMAC_MR1CR0 0x0084
+#define GMAC_MR1CR1 0x0088
+#define GMAC_MR1CR2 0x008C
+#define GMAC_MR2CR0 0x0090
+#define GMAC_MR2CR1 0x0094
+#define GMAC_MR2CR2 0x0098
+#define GMAC_MR3CR0 0x009C
+#define GMAC_MR3CR1 0x00A0
+#define GMAC_MR3CR2 0x00A4
+/* Support Protocol Register 0 */
+#define GMAC_SPR0 0x00A8
+#define GMAC_SPR1 0x00AC
+#define GMAC_SPR2 0x00B0
+#define GMAC_SPR3 0x00B4
+#define GMAC_SPR4 0x00B8
+#define GMAC_SPR5 0x00BC
+#define GMAC_SPR6 0x00C0
+#define GMAC_SPR7 0x00C4
+/* GMAC Hash/Rx/Tx AHB Weighting register */
+#define GMAC_AHB_WEIGHT_REG 0x00C8
+
+/*
+ * TOE GMAC 0/1 register
+ * #define TOE_GMAC0_BASE (TOE_BASE + 0xA000)
+ * #define TOE_GMAC1_BASE (TOE_BASE + 0xE000)
+ * Base 0x6000A000 or 0x6000E000
+ */
+enum GMAC_REGISTER {
+ GMAC_STA_ADD0 = 0x0000,
+ GMAC_STA_ADD1 = 0x0004,
+ GMAC_STA_ADD2 = 0x0008,
+ GMAC_RX_FLTR = 0x000c,
+ GMAC_MCAST_FIL0 = 0x0010,
+ GMAC_MCAST_FIL1 = 0x0014,
+ GMAC_CONFIG0 = 0x0018,
+ GMAC_CONFIG1 = 0x001c,
+ GMAC_CONFIG2 = 0x0020,
+ GMAC_CONFIG3 = 0x0024,
+ GMAC_RESERVED = 0x0028,
+ GMAC_STATUS = 0x002c,
+ GMAC_IN_DISCARDS = 0x0030,
+ GMAC_IN_ERRORS = 0x0034,
+ GMAC_IN_MCAST = 0x0038,
+ GMAC_IN_BCAST = 0x003c,
+ GMAC_IN_MAC1 = 0x0040, /* for STA 1 MAC Address */
+ GMAC_IN_MAC2 = 0x0044 /* for STA 2 MAC Address */
+};
+
+#define RX_STATS_NUM 6
+
+/*
+ * DMA Queues Descriptor Ring Base Address/Size Register (offset 0x0004)
+ */
+typedef union {
+ unsigned int bits32;
+ unsigned int base_size;
+} DMA_Q_BASE_SIZE_T;
+#define DMA_Q_BASE_MASK (~0x0f)
+
+/*
+ * DMA SKB Buffer register (offset 0x0010, see GLOBAL_DMA_SKB_SIZE_REG)
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_0008 {
+ unsigned int sw_skb_size : 16; /* SW Free pool SKB Size */
+ unsigned int hw_skb_size : 16; /* HW Free pool SKB Size */
+ } bits;
+} DMA_SKB_SIZE_T;
+
+/*
+ * DMA SW Free Queue Read/Write Pointer Register (offset 0x0014)
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_000c {
+ unsigned int rptr : 16; /* Read Ptr, RO */
+ unsigned int wptr : 16; /* Write Ptr, RW */
+ } bits;
+} DMA_RWPTR_T;
+
+/*
+ * DMA HW Free Queue Read/Write Pointer Register (offset 0x0018)
+ * see DMA_RWPTR_T structure
+ */
+
+/*
+ * Interrupt Status Register 0 (offset 0x0020)
+ * Interrupt Mask Register 0 (offset 0x0024)
+ * Interrupt Select Register 0 (offset 0x0028)
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_0020 {
+ /* GMAC0 SW Tx Queue 0-5 EOF Interrupts */
+ unsigned int swtq00_eof : 1;
+ unsigned int swtq01_eof : 1;
+ unsigned int swtq02_eof : 1;
+ unsigned int swtq03_eof : 1;
+ unsigned int swtq04_eof : 1;
+ unsigned int swtq05_eof : 1;
+ /* GMAC1 SW Tx Queue 0-5 EOF Interrupts */
+ unsigned int swtq10_eof : 1;
+ unsigned int swtq11_eof : 1;
+ unsigned int swtq12_eof : 1;
+ unsigned int swtq13_eof : 1;
+ unsigned int swtq14_eof : 1;
+ unsigned int swtq15_eof : 1;
+ /* GMAC0 SW Tx Queue 0-5 Finish Interrupts */
+ unsigned int swtq00_fin : 1;
+ unsigned int swtq01_fin : 1;
+ unsigned int swtq02_fin : 1;
+ unsigned int swtq03_fin : 1;
+ unsigned int swtq04_fin : 1;
+ unsigned int swtq05_fin : 1;
+ /* GMAC1 SW Tx Queue 0-5 Finish Interrupts */
+ unsigned int swtq10_fin : 1;
+ unsigned int swtq11_fin : 1;
+ unsigned int swtq12_fin : 1;
+ unsigned int swtq13_fin : 1;
+ unsigned int swtq14_fin : 1;
+ unsigned int swtq15_fin : 1;
+ /* GMAC0 Rx Descriptor Protocol Error */
+ unsigned int rxPerr0 : 1;
+ /* GMAC0 AHB Bus Error while Rx */
+ unsigned int rxDerr0 : 1;
+ /* GMAC1 Rx Descriptor Protocol Error */
+ unsigned int rxPerr1 : 1;
+ /* GMAC1 AHB Bus Error while Rx */
+ unsigned int rxDerr1 : 1;
+ /* GMAC0 Tx Descriptor Protocol Error */
+ unsigned int txPerr0 : 1;
+ /* GMAC0 AHB Bus Error while Tx */
+ unsigned int txDerr0 : 1;
+ /* GMAC1 Tx Descriptor Protocol Error */
+ unsigned int txPerr1 : 1;
+ /* GMAC1 AHB Bus Error while Tx */
+ unsigned int txDerr1 : 1;
+ } bits;
+} INTR_REG0_T;
+
+#define GMAC1_TXDERR_INT_BIT BIT(31)
+#define GMAC1_TXPERR_INT_BIT BIT(30)
+#define GMAC0_TXDERR_INT_BIT BIT(29)
+#define GMAC0_TXPERR_INT_BIT BIT(28)
+#define GMAC1_RXDERR_INT_BIT BIT(27)
+#define GMAC1_RXPERR_INT_BIT BIT(26)
+#define GMAC0_RXDERR_INT_BIT BIT(25)
+#define GMAC0_RXPERR_INT_BIT BIT(24)
+#define GMAC1_SWTQ15_FIN_INT_BIT BIT(23)
+#define GMAC1_SWTQ14_FIN_INT_BIT BIT(22)
+#define GMAC1_SWTQ13_FIN_INT_BIT BIT(21)
+#define GMAC1_SWTQ12_FIN_INT_BIT BIT(20)
+#define GMAC1_SWTQ11_FIN_INT_BIT BIT(19)
+#define GMAC1_SWTQ10_FIN_INT_BIT BIT(18)
+#define GMAC0_SWTQ05_FIN_INT_BIT BIT(17)
+#define GMAC0_SWTQ04_FIN_INT_BIT BIT(16)
+#define GMAC0_SWTQ03_FIN_INT_BIT BIT(15)
+#define GMAC0_SWTQ02_FIN_INT_BIT BIT(14)
+#define GMAC0_SWTQ01_FIN_INT_BIT BIT(13)
+#define GMAC0_SWTQ00_FIN_INT_BIT BIT(12)
+#define GMAC1_SWTQ15_EOF_INT_BIT BIT(11)
+#define GMAC1_SWTQ14_EOF_INT_BIT BIT(10)
+#define GMAC1_SWTQ13_EOF_INT_BIT BIT(9)
+#define GMAC1_SWTQ12_EOF_INT_BIT BIT(8)
+#define GMAC1_SWTQ11_EOF_INT_BIT BIT(7)
+#define GMAC1_SWTQ10_EOF_INT_BIT BIT(6)
+#define GMAC0_SWTQ05_EOF_INT_BIT BIT(5)
+#define GMAC0_SWTQ04_EOF_INT_BIT BIT(4)
+#define GMAC0_SWTQ03_EOF_INT_BIT BIT(3)
+#define GMAC0_SWTQ02_EOF_INT_BIT BIT(2)
+#define GMAC0_SWTQ01_EOF_INT_BIT BIT(1)
+#define GMAC0_SWTQ00_EOF_INT_BIT BIT(0)
+
+/*
+ * Interrupt Status Register 1 (offset 0x0030)
+ * Interrupt Mask Register 1 (offset 0x0034)
+ * Interrupt Select Register 1 (offset 0x0038)
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_0030 {
+ unsigned int default_q0_eof : 1; /* Default Queue 0 EOF Interrupt */
+ unsigned int default_q1_eof : 1; /* Default Queue 1 EOF Interrupt */
+ unsigned int class_rx : 14; /* Classification Queue Rx Interrupt */
+ unsigned int hwtq00_eof : 1; /* GMAC0 HW Tx Queue0 EOF Interrupt */
+ unsigned int hwtq01_eof : 1; /* GMAC0 HW Tx Queue1 EOF Interrupt */
+ unsigned int hwtq02_eof : 1; /* GMAC0 HW Tx Queue2 EOF Interrupt */
+ unsigned int hwtq03_eof : 1; /* GMAC0 HW Tx Queue3 EOF Interrupt */
+ unsigned int hwtq10_eof : 1; /* GMAC1 HW Tx Queue0 EOF Interrupt */
+ unsigned int hwtq11_eof : 1; /* GMAC1 HW Tx Queue1 EOF Interrupt */
+ unsigned int hwtq12_eof : 1; /* GMAC1 HW Tx Queue2 EOF Interrupt */
+ unsigned int hwtq13_eof : 1; /* GMAC1 HW Tx Queue3 EOF Interrupt */
+ unsigned int toe_iq0_intr : 1; /* TOE Interrupt Queue 0 with Interrupts */
+ unsigned int toe_iq1_intr : 1; /* TOE Interrupt Queue 1 with Interrupts */
+ unsigned int toe_iq2_intr : 1; /* TOE Interrupt Queue 2 with Interrupts */
+ unsigned int toe_iq3_intr : 1; /* TOE Interrupt Queue 3 with Interrupts */
+ unsigned int toe_iq0_full : 1; /* TOE Interrupt Queue 0 Full Interrupt */
+ unsigned int toe_iq1_full : 1; /* TOE Interrupt Queue 1 Full Interrupt */
+ unsigned int toe_iq2_full : 1; /* TOE Interrupt Queue 2 Full Interrupt */
+ unsigned int toe_iq3_full : 1; /* TOE Interrupt Queue 3 Full Interrupt */
+ } bits;
+} INTR_REG1_T;
+
+#define TOE_IQ3_FULL_INT_BIT BIT(31)
+#define TOE_IQ2_FULL_INT_BIT BIT(30)
+#define TOE_IQ1_FULL_INT_BIT BIT(29)
+#define TOE_IQ0_FULL_INT_BIT BIT(28)
+#define TOE_IQ3_INT_BIT BIT(27)
+#define TOE_IQ2_INT_BIT BIT(26)
+#define TOE_IQ1_INT_BIT BIT(25)
+#define TOE_IQ0_INT_BIT BIT(24)
+#define GMAC1_HWTQ13_EOF_INT_BIT BIT(23)
+#define GMAC1_HWTQ12_EOF_INT_BIT BIT(22)
+#define GMAC1_HWTQ11_EOF_INT_BIT BIT(21)
+#define GMAC1_HWTQ10_EOF_INT_BIT BIT(20)
+#define GMAC0_HWTQ03_EOF_INT_BIT BIT(19)
+#define GMAC0_HWTQ02_EOF_INT_BIT BIT(18)
+#define GMAC0_HWTQ01_EOF_INT_BIT BIT(17)
+#define GMAC0_HWTQ00_EOF_INT_BIT BIT(16)
+#define CLASS_RX_INT_BIT(x) BIT((x) + 2)
+#define DEFAULT_Q1_INT_BIT BIT(1)
+#define DEFAULT_Q0_INT_BIT BIT(0)
+
+#define TOE_IQ_INT_BITS (TOE_IQ0_INT_BIT | TOE_IQ1_INT_BIT | \
+ TOE_IQ2_INT_BIT | TOE_IQ3_INT_BIT)
+#define TOE_IQ_FULL_BITS (TOE_IQ0_FULL_INT_BIT | TOE_IQ1_FULL_INT_BIT | \
+ TOE_IQ2_FULL_INT_BIT | TOE_IQ3_FULL_INT_BIT)
+#define TOE_IQ_ALL_BITS (TOE_IQ_INT_BITS | TOE_IQ_FULL_BITS)
+#define TOE_CLASS_RX_INT_BITS 0xfffc
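Editorial note on the classification-queue bits above: the 14 classification queues occupy bits 15:2 of Interrupt Status Register 1, which is why the combined mask is 0xfffc. The following standalone sketch is not part of the patch. The EX_* names and both helper functions are hypothetical; they only illustrate how the per-queue bit and the mask relate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local equivalents of BIT(), CLASS_RX_INT_BIT() and
 * TOE_CLASS_RX_INT_BITS from the patch. */
#define EX_BIT(n)              (1u << (n))
#define EX_CLASS_RX_INT_BIT(x) EX_BIT((x) + 2)
#define EX_CLASS_RX_INT_BITS   0xfffcu
#define EX_CLASS_QUEUES        14

/* OR together the per-queue bits for classification queues 0..13. */
static uint32_t class_rx_mask(void)
{
	uint32_t mask = 0;
	unsigned int q;

	for (q = 0; q < EX_CLASS_QUEUES; q++)
		mask |= EX_CLASS_RX_INT_BIT(q);
	return mask;
}

/* Return the lowest classification queue flagged in status, or -1. */
static int first_class_queue(uint32_t status)
{
	unsigned int q;

	status &= EX_CLASS_RX_INT_BITS;
	for (q = 0; q < EX_CLASS_QUEUES; q++)
		if (status & EX_CLASS_RX_INT_BIT(q))
			return (int)q;
	return -1;
}

static void class_rx_selfcheck(void)
{
	assert(class_rx_mask() == EX_CLASS_RX_INT_BITS);
	/* Bit 0 is the default queue 0 bit and is ignored here. */
	assert(first_class_queue(EX_BIT(0) | EX_CLASS_RX_INT_BIT(5)) == 5);
	assert(first_class_queue(EX_BIT(1)) == -1);
}
```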
+
+/*
+ * Interrupt Status Register 2 (offset 0x0040)
+ * Interrupt Mask Register 2 (offset 0x0044)
+ * Interrupt Select Register 2 (offset 0x0048)
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_0040 {
+ unsigned int toe_q0_full : 1; /* bit 0 TOE Queue 0 Full Interrupt */
+ unsigned int toe_q1_full : 1; /* bit 1 TOE Queue 1 Full Interrupt */
+ unsigned int toe_q2_full : 1; /* bit 2 TOE Queue 2 Full Interrupt */
+ unsigned int toe_q3_full : 1; /* bit 3 TOE Queue 3 Full Interrupt */
+ unsigned int toe_q4_full : 1; /* bit 4 TOE Queue 4 Full Interrupt */
+ unsigned int toe_q5_full : 1; /* bit 5 TOE Queue 5 Full Interrupt */
+ unsigned int toe_q6_full : 1; /* bit 6 TOE Queue 6 Full Interrupt */
+ unsigned int toe_q7_full : 1; /* bit 7 TOE Queue 7 Full Interrupt */
+ unsigned int toe_q8_full : 1; /* bit 8 TOE Queue 8 Full Interrupt */
+ unsigned int toe_q9_full : 1; /* bit 9 TOE Queue 9 Full Interrupt */
+ unsigned int toe_q10_full : 1; /* bit 10 TOE Queue 10 Full Interrupt */
+ unsigned int toe_q11_full : 1; /* bit 11 TOE Queue 11 Full Interrupt */
+ unsigned int toe_q12_full : 1; /* bit 12 TOE Queue 12 Full Interrupt */
+ unsigned int toe_q13_full : 1; /* bit 13 TOE Queue 13 Full Interrupt */
+ unsigned int toe_q14_full : 1; /* bit 14 TOE Queue 14 Full Interrupt */
+ unsigned int toe_q15_full : 1; /* bit 15 TOE Queue 15 Full Interrupt */
+ unsigned int toe_q16_full : 1; /* bit 16 TOE Queue 16 Full Interrupt */
+ unsigned int toe_q17_full : 1; /* bit 17 TOE Queue 17 Full Interrupt */
+ unsigned int toe_q18_full : 1; /* bit 18 TOE Queue 18 Full Interrupt */
+ unsigned int toe_q19_full : 1; /* bit 19 TOE Queue 19 Full Interrupt */
+ unsigned int toe_q20_full : 1; /* bit 20 TOE Queue 20 Full Interrupt */
+ unsigned int toe_q21_full : 1; /* bit 21 TOE Queue 21 Full Interrupt */
+ unsigned int toe_q22_full : 1; /* bit 22 TOE Queue 22 Full Interrupt */
+ unsigned int toe_q23_full : 1; /* bit 23 TOE Queue 23 Full Interrupt */
+ unsigned int toe_q24_full : 1; /* bit 24 TOE Queue 24 Full Interrupt */
+ unsigned int toe_q25_full : 1; /* bit 25 TOE Queue 25 Full Interrupt */
+ unsigned int toe_q26_full : 1; /* bit 26 TOE Queue 26 Full Interrupt */
+ unsigned int toe_q27_full : 1; /* bit 27 TOE Queue 27 Full Interrupt */
+ unsigned int toe_q28_full : 1; /* bit 28 TOE Queue 28 Full Interrupt */
+ unsigned int toe_q29_full : 1; /* bit 29 TOE Queue 29 Full Interrupt */
+ unsigned int toe_q30_full : 1; /* bit 30 TOE Queue 30 Full Interrupt */
+ unsigned int toe_q31_full : 1; /* bit 31 TOE Queue 31 Full Interrupt */
+ } bits;
+} INTR_REG2_T;
+
+#define TOE_QL_FULL_INT_BIT(x) BIT(x)
+
+/*
+ * Interrupt Status Register 3 (offset 0x0050)
+ * Interrupt Mask Register 3 (offset 0x0054)
+ * Interrupt Select Register 3 (offset 0x0058)
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_0050 {
+ unsigned int toe_q32_full : 1; /* bit 32 TOE Queue 32 Full Interrupt */
+ unsigned int toe_q33_full : 1; /* bit 33 TOE Queue 33 Full Interrupt */
+ unsigned int toe_q34_full : 1; /* bit 34 TOE Queue 34 Full Interrupt */
+ unsigned int toe_q35_full : 1; /* bit 35 TOE Queue 35 Full Interrupt */
+ unsigned int toe_q36_full : 1; /* bit 36 TOE Queue 36 Full Interrupt */
+ unsigned int toe_q37_full : 1; /* bit 37 TOE Queue 37 Full Interrupt */
+ unsigned int toe_q38_full : 1; /* bit 38 TOE Queue 38 Full Interrupt */
+ unsigned int toe_q39_full : 1; /* bit 39 TOE Queue 39 Full Interrupt */
+ unsigned int toe_q40_full : 1; /* bit 40 TOE Queue 40 Full Interrupt */
+ unsigned int toe_q41_full : 1; /* bit 41 TOE Queue 41 Full Interrupt */
+ unsigned int toe_q42_full : 1; /* bit 42 TOE Queue 42 Full Interrupt */
+ unsigned int toe_q43_full : 1; /* bit 43 TOE Queue 43 Full Interrupt */
+ unsigned int toe_q44_full : 1; /* bit 44 TOE Queue 44 Full Interrupt */
+ unsigned int toe_q45_full : 1; /* bit 45 TOE Queue 45 Full Interrupt */
+ unsigned int toe_q46_full : 1; /* bit 46 TOE Queue 46 Full Interrupt */
+ unsigned int toe_q47_full : 1; /* bit 47 TOE Queue 47 Full Interrupt */
+ unsigned int toe_q48_full : 1; /* bit 48 TOE Queue 48 Full Interrupt */
+ unsigned int toe_q49_full : 1; /* bit 49 TOE Queue 49 Full Interrupt */
+ unsigned int toe_q50_full : 1; /* bit 50 TOE Queue 50 Full Interrupt */
+ unsigned int toe_q51_full : 1; /* bit 51 TOE Queue 51 Full Interrupt */
+ unsigned int toe_q52_full : 1; /* bit 52 TOE Queue 52 Full Interrupt */
+ unsigned int toe_q53_full : 1; /* bit 53 TOE Queue 53 Full Interrupt */
+ unsigned int toe_q54_full : 1; /* bit 54 TOE Queue 54 Full Interrupt */
+ unsigned int toe_q55_full : 1; /* bit 55 TOE Queue 55 Full Interrupt */
+ unsigned int toe_q56_full : 1; /* bit 56 TOE Queue 56 Full Interrupt */
+ unsigned int toe_q57_full : 1; /* bit 57 TOE Queue 57 Full Interrupt */
+ unsigned int toe_q58_full : 1; /* bit 58 TOE Queue 58 Full Interrupt */
+ unsigned int toe_q59_full : 1; /* bit 59 TOE Queue 59 Full Interrupt */
+ unsigned int toe_q60_full : 1; /* bit 60 TOE Queue 60 Full Interrupt */
+ unsigned int toe_q61_full : 1; /* bit 61 TOE Queue 61 Full Interrupt */
+ unsigned int toe_q62_full : 1; /* bit 62 TOE Queue 62 Full Interrupt */
+ unsigned int toe_q63_full : 1; /* bit 63 TOE Queue 63 Full Interrupt */
+ } bits;
+} INTR_REG3_T;
+
+#define TOE_QH_FULL_INT_BIT(x) BIT((x) - 32)
+
+/*
+ * Interrupt Status Register 4 (offset 0x0060)
+ * Interrupt Mask Register 4 (offset 0x0064)
+ * Interrupt Select Register 4 (offset 0x0068)
+ */
+typedef union {
+ unsigned char byte;
+ struct bit_0060 {
+ unsigned char status_changed : 1; /* Status Changed Intr for RGMII Mode */
+ unsigned char rx_overrun : 1; /* GMAC Rx FIFO overrun interrupt */
+ unsigned char tx_pause_off : 1; /* transmit pause off frame interrupt */
+ unsigned char rx_pause_off : 1; /* received pause off frame interrupt */
+ unsigned char tx_pause_on : 1; /* transmit pause on frame interrupt */
+ unsigned char rx_pause_on : 1; /* received pause on frame interrupt */
+ unsigned char cnt_full : 1; /* MIB counters half full interrupt */
+ unsigned char reserved : 1; /* */
+ } __packed bits;
+} __packed GMAC_INTR_T;
+
+typedef union {
+ unsigned int bits32;
+ struct bit_0060_2 {
+ unsigned int swfq_empty : 1; /* bit 0 Software Free Queue Empty Intr. */
+ unsigned int hwfq_empty : 1; /* bit 1 Hardware Free Queue Empty Intr. */
+ unsigned int class_qf_int : 14; /* bit 15:2 Classification Rx Queue13-0 Full Intr. */
+ GMAC_INTR_T gmac0;
+ GMAC_INTR_T gmac1;
+ } bits;
+} INTR_REG4_T;
+
+#define GMAC1_RESERVED_INT_BIT BIT(31)
+#define GMAC1_MIB_INT_BIT BIT(30)
+#define GMAC1_RX_PAUSE_ON_INT_BIT BIT(29)
+#define GMAC1_TX_PAUSE_ON_INT_BIT BIT(28)
+#define GMAC1_RX_PAUSE_OFF_INT_BIT BIT(27)
+#define GMAC1_TX_PAUSE_OFF_INT_BIT BIT(26)
+#define GMAC1_RX_OVERRUN_INT_BIT BIT(25)
+#define GMAC1_STATUS_CHANGE_INT_BIT BIT(24)
+#define GMAC0_RESERVED_INT_BIT BIT(23)
+#define GMAC0_MIB_INT_BIT BIT(22)
+#define GMAC0_RX_PAUSE_ON_INT_BIT BIT(21)
+#define GMAC0_TX_PAUSE_ON_INT_BIT BIT(20)
+#define GMAC0_RX_PAUSE_OFF_INT_BIT BIT(19)
+#define GMAC0_TX_PAUSE_OFF_INT_BIT BIT(18)
+#define GMAC0_RX_OVERRUN_INT_BIT BIT(17)
+#define GMAC0_STATUS_CHANGE_INT_BIT BIT(16)
+#define CLASS_RX_FULL_INT_BIT(x) BIT((x) + 2)
+#define HWFQ_EMPTY_INT_BIT BIT(1)
+#define SWFQ_EMPTY_INT_BIT BIT(0)
+
+#define GMAC0_INT_BITS (GMAC0_RESERVED_INT_BIT | GMAC0_MIB_INT_BIT | \
+ GMAC0_RX_PAUSE_ON_INT_BIT | GMAC0_TX_PAUSE_ON_INT_BIT | \
+ GMAC0_RX_PAUSE_OFF_INT_BIT | GMAC0_TX_PAUSE_OFF_INT_BIT | \
+ GMAC0_RX_OVERRUN_INT_BIT | GMAC0_STATUS_CHANGE_INT_BIT)
+#define GMAC1_INT_BITS (GMAC1_RESERVED_INT_BIT | GMAC1_MIB_INT_BIT | \
+ GMAC1_RX_PAUSE_ON_INT_BIT | GMAC1_TX_PAUSE_ON_INT_BIT | \
+ GMAC1_RX_PAUSE_OFF_INT_BIT | GMAC1_TX_PAUSE_OFF_INT_BIT | \
+ GMAC1_RX_OVERRUN_INT_BIT | GMAC1_STATUS_CHANGE_INT_BIT)
+
+#define CLASS_RX_FULL_INT_BITS 0xfffc
+
+/*
+ * GLOBAL_QUEUE_THRESHOLD_REG (offset 0x0070)
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_0070_2 {
+ unsigned int swfq_empty : 8; /* 7:0 Software Free Queue Empty Threshold */
+ unsigned int hwfq_empty : 8; /* 15:8 Hardware Free Queue Empty Threshold */
+ unsigned int intrq : 8; /* 23:16 */
+ unsigned int toe_class : 8; /* 31:24 */
+ } bits;
+} QUEUE_THRESHOLD_T;
+
+
+/*
+ * GMAC DMA Control Register
+ * GMAC0 offset 0x8000
+ * GMAC1 offset 0xC000
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_8000 {
+ unsigned int td_bus : 2; /* bit 1:0 Peripheral Bus Width */
+ unsigned int td_burst_size : 2; /* bit 3:2 TxDMA max burst size for every AHB request */
+ unsigned int td_prot : 4; /* bit 7:4 TxDMA protection control */
+ unsigned int rd_bus : 2; /* bit 9:8 Peripheral Bus Width */
+ unsigned int rd_burst_size : 2; /* bit 11:10 DMA max burst size for every AHB request */
+ unsigned int rd_prot : 4; /* bit 15:12 DMA Protection Control */
+ unsigned int rd_insert_bytes : 2; /* bit 17:16 */
+ unsigned int reserved : 10; /* bit 27:18 */
+ unsigned int drop_small_ack : 1; /* bit 28 1: Drop, 0: Accept */
+ unsigned int loopback : 1; /* bit 29 Loopback TxDMA to RxDMA */
+ unsigned int td_enable : 1; /* bit 30 Tx DMA Enable */
+ unsigned int rd_enable : 1; /* bit 31 Rx DMA Enable */
+ } bits;
+} GMAC_DMA_CTRL_T;
+
+/*
+ * GMAC Tx Weighting Control Register 0
+ * GMAC0 offset 0x8004
+ * GMAC1 offset 0xC004
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_8004 {
+ unsigned int hw_tq0 : 6; /* bit 5:0 HW TX Queue 3 */
+ unsigned int hw_tq1 : 6; /* bit 11:6 HW TX Queue 2 */
+ unsigned int hw_tq2 : 6; /* bit 17:12 HW TX Queue 1 */
+ unsigned int hw_tq3 : 6; /* bit 23:18 HW TX Queue 0 */
+ unsigned int reserved : 8; /* bit 31:24 */
+ } bits;
+} GMAC_TX_WCR0_T; /* Weighting Control Register 0 */
+
+/*
+ * GMAC Tx Weighting Control Register 1
+ * GMAC0 offset 0x8008
+ * GMAC1 offset 0xC008
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_8008 {
+ unsigned int sw_tq0 : 5; /* bit 4:0 SW TX Queue 0 */
+ unsigned int sw_tq1 : 5; /* bit 9:5 SW TX Queue 1 */
+ unsigned int sw_tq2 : 5; /* bit 14:10 SW TX Queue 2 */
+ unsigned int sw_tq3 : 5; /* bit 19:15 SW TX Queue 3 */
+ unsigned int sw_tq4 : 5; /* bit 24:20 SW TX Queue 4 */
+ unsigned int sw_tq5 : 5; /* bit 29:25 SW TX Queue 5 */
+ unsigned int reserved : 2; /* bit 31:30 */
+ } bits;
+} GMAC_TX_WCR1_T; /* Weighting Control Register 1 */
+
+/*
+ * Queue Read/Write Pointer
+ * GMAC SW TX Queue 0~5 Read/Write Pointer register
+ * GMAC0 offset 0x800C ~ 0x8020
+ * GMAC1 offset 0xC00C ~ 0xC020
+ * GMAC HW TX Queue 0~3 Read/Write Pointer register
+ * GMAC0 offset 0x8024 ~ 0x8030
+ * GMAC1 offset 0xC024 ~ 0xC030
+ *
+ * see DMA_RWPTR_T structure
+ */
+
+/*
+ * GMAC DMA Tx First Descriptor Address Register
+ * GMAC0 offset 0x8038
+ * GMAC1 offset 0xC038
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_8038 {
+ unsigned int reserved : 3;
+ unsigned int td_busy : 1; /* bit 3 1: TxDMA busy; 0: TxDMA idle */
+ unsigned int td_first_des_ptr : 28; /* bit 31:4 first descriptor address */
+ } bits;
+} GMAC_TXDMA_FIRST_DESC_T;
+
+/*
+ * GMAC DMA Tx Current Descriptor Address Register
+ * GMAC0 offset 0x803C
+ * GMAC1 offset 0xC03C
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_803C {
+ unsigned int reserved : 4;
+ unsigned int td_curr_desc_ptr : 28; /* bit 31:4 current descriptor address */
+ } bits;
+} GMAC_TXDMA_CURR_DESC_T;
+
+/*
+ * GMAC DMA Tx Descriptor Word 0 Register
+ * GMAC0 offset 0x8040
+ * GMAC1 offset 0xC040
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_8040 {
+ unsigned int buffer_size : 16; /* bit 15:0 Transfer size */
+ unsigned int desc_count : 6; /* bit 21:16 number of descriptors used for the current frame */
+ unsigned int status_tx_ok : 1; /* bit 22 Tx Status, 1: Successful 0: Failed */
+ unsigned int status_rvd : 6; /* bit 28:23 Tx Status, Reserved bits */
+ unsigned int perr : 1; /* bit 29 protocol error during processing this descriptor */
+ unsigned int derr : 1; /* bit 30 data error during processing this descriptor */
+ unsigned int reserved : 1; /* bit 31 */
+ } bits;
+} GMAC_TXDESC_0_T;
+
+/*
+ * GMAC DMA Tx Descriptor Word 1 Register
+ * GMAC0 offset 0x8044
+ * GMAC1 offset 0xC044
+ */
+typedef union {
+ unsigned int bits32;
+ struct txdesc_word1 {
+ unsigned int byte_count : 16; /* bit 15: 0 Tx Frame Byte Count */
+ unsigned int mtu_enable : 1; /* bit 16 TSS segmentation use MTU setting */
+ unsigned int ip_chksum : 1; /* bit 17 IPV4 Header Checksum Enable */
+ unsigned int ipv6_enable : 1; /* bit 18 IPV6 Tx Enable */
+ unsigned int tcp_chksum : 1; /* bit 19 TCP Checksum Enable */
+ unsigned int udp_chksum : 1; /* bit 20 UDP Checksum Enable */
+ unsigned int bypass_tss : 1; /* bit 21 Bypass HW offload engine */
+ unsigned int ip_fixed_len : 1; /* bit 22 Don't update IP length field */
+ unsigned int reserved : 9; /* bit 31:23 Tx Flag, Reserved */
+ } bits;
+} GMAC_TXDESC_1_T;
+
+#define TSS_IP_FIXED_LEN_BIT BIT(22)
+#define TSS_BYPASS_BIT BIT(21)
+#define TSS_UDP_CHKSUM_BIT BIT(20)
+#define TSS_TCP_CHKSUM_BIT BIT(19)
+#define TSS_IPV6_ENABLE_BIT BIT(18)
+#define TSS_IP_CHKSUM_BIT BIT(17)
+#define TSS_MTU_ENABLE_BIT BIT(16)
+
+#define TSS_CHECKUM_ENABLE \
+ (TSS_IP_CHKSUM_BIT|TSS_IPV6_ENABLE_BIT| \
+ TSS_TCP_CHKSUM_BIT|TSS_UDP_CHKSUM_BIT)
+
+/*
+ * GMAC DMA Tx Descriptor Word 2 Register
+ * GMAC0 offset 0x8048
+ * GMAC1 offset 0xC048
+ */
+typedef union {
+ unsigned int bits32;
+ unsigned int buf_adr;
+} GMAC_TXDESC_2_T;
+
+/*
+ * GMAC DMA Tx Descriptor Word 3 Register
+ * GMAC0 offset 0x804C
+ * GMAC1 offset 0xC04C
+ */
+typedef union {
+ unsigned int bits32;
+ struct txdesc_word3 {
+ unsigned int mtu_size : 13; /* bit 12: 0 MTU size */
+ unsigned int reserved : 16; /* bit 28:13 */
+ unsigned int eofie : 1; /* bit 29 End of frame interrupt enable */
+ unsigned int sof_eof : 2; /* bit 31:30 11: only one, 10: first, 01: last, 00: linking */
+ } bits;
+} GMAC_TXDESC_3_T;
+#define SOF_EOF_BIT_MASK 0x3fffffff
+#define SOF_BIT 0x80000000
+#define EOF_BIT 0x40000000
+#define EOFIE_BIT BIT(29)
+#define MTU_SIZE_BIT_MASK 0x1fff
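Editorial note on the word 3 flags above: bits 31:30 mark where a descriptor sits within a frame (11: only one, 10: first, 01: last, 00: linking). The sketch below is a standalone illustration and not part of the patch. The EX_* names and the function txdesc_word3() are hypothetical; it shows one way the flags could be composed for fragment idx of nfrags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local equivalents of SOF_BIT, EOF_BIT and
 * MTU_SIZE_BIT_MASK from the patch. */
#define EX_SOF_BIT           0x80000000u
#define EX_EOF_BIT           0x40000000u
#define EX_MTU_SIZE_BIT_MASK 0x1fffu

/* Compose Tx descriptor word 3 for fragment idx out of nfrags. */
static uint32_t txdesc_word3(unsigned int idx, unsigned int nfrags,
			     unsigned int mtu)
{
	uint32_t word3 = mtu & EX_MTU_SIZE_BIT_MASK;

	if (idx == 0)
		word3 |= EX_SOF_BIT;   /* first descriptor of the frame */
	if (idx == nfrags - 1)
		word3 |= EX_EOF_BIT;   /* last descriptor of the frame */
	return word3;
}

static void txdesc_selfcheck(void)
{
	/* Single-descriptor frame: both flags set ("11: only one"). */
	assert(txdesc_word3(0, 1, 1500) == (0xC0000000u | 1500u));
	/* Three-descriptor frame: first, linking, last. */
	assert(txdesc_word3(0, 3, 0) == 0x80000000u);
	assert(txdesc_word3(1, 3, 0) == 0x00000000u);
	assert(txdesc_word3(2, 3, 0) == 0x40000000u);
}
```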
+
+/*
+ * GMAC Tx Descriptor
+ */
+typedef struct {
+ GMAC_TXDESC_0_T word0;
+ GMAC_TXDESC_1_T word1;
+ GMAC_TXDESC_2_T word2;
+ GMAC_TXDESC_3_T word3;
+} GMAC_TXDESC_T;
+
+/*
+ * GMAC DMA Rx First Descriptor Address Register
+ * GMAC0 offset 0x8058
+ * GMAC1 offset 0xC058
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_8058 {
+ unsigned int reserved : 3; /* bit 2:0 */
+ unsigned int rd_busy : 1; /* bit 3 1-RxDMA busy; 0-RxDMA idle */
+ unsigned int rd_first_des_ptr : 28; /* bit 31:4 first descriptor address */
+ } bits;
+} GMAC_RXDMA_FIRST_DESC_T;
+
+/*
+ * GMAC DMA Rx Current Descriptor Address Register
+ * GMAC0 offset 0x805C
+ * GMAC1 offset 0xC05C
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_805C {
+ unsigned int reserved : 4; /* bit 3:0 */
+ unsigned int rd_curr_des_ptr : 28; /* bit 31:4 current descriptor address */
+ } bits;
+} GMAC_RXDMA_CURR_DESC_T;
+
+/*
+ * GMAC DMA Rx Descriptor Word 0 Register
+ * GMAC0 offset 0x8060
+ * GMAC1 offset 0xC060
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_8060 {
+ unsigned int buffer_size : 16; /* bit 15:0 buffer size */
+ unsigned int desc_count : 6; /* bit 21:16 number of descriptors used for the current frame */
+ unsigned int status : 4; /* bit 25:22 Status of rx frame */
+ unsigned int chksum_status : 3; /* bit 28:26 Check Sum Status */
+ unsigned int perr : 1; /* bit 29 protocol error during processing this descriptor */
+ unsigned int derr : 1; /* bit 30 data error during processing this descriptor */
+ unsigned int drop : 1; /* bit 31 TOE/CIS Queue Full dropped packet to default queue */
+ } bits;
+} GMAC_RXDESC_0_T;
+
+#define GMAC_RXDESC_0_T_derr BIT(30)
+#define GMAC_RXDESC_0_T_perr BIT(29)
+#define GMAC_RXDESC_0_T_chksum_status(x) BIT((x) + 26)
+#define GMAC_RXDESC_0_T_status(x) BIT((x) + 22)
+#define GMAC_RXDESC_0_T_desc_count(x) BIT((x) + 16)
+
+#define RX_CHKSUM_IP_UDP_TCP_OK 0
+#define RX_CHKSUM_IP_OK_ONLY 1
+#define RX_CHKSUM_NONE 2
+#define RX_CHKSUM_IP_ERR_UNKNOWN 4
+#define RX_CHKSUM_IP_ERR 5
+#define RX_CHKSUM_TCP_UDP_ERR 6
+#define RX_CHKSUM_NUM 8
+
+#define RX_STATUS_GOOD_FRAME 0
+#define RX_STATUS_TOO_LONG_GOOD_CRC 1
+#define RX_STATUS_RUNT_FRAME 2
+#define RX_STATUS_SFD_NOT_FOUND 3
+#define RX_STATUS_CRC_ERROR 4
+#define RX_STATUS_TOO_LONG_BAD_CRC 5
+#define RX_STATUS_ALIGNMENT_ERROR 6
+#define RX_STATUS_TOO_LONG_BAD_ALIGN 7
+#define RX_STATUS_RX_ERR 8
+#define RX_STATUS_DA_FILTERED 9
+#define RX_STATUS_BUFFER_FULL 10
+#define RX_STATUS_NUM 16
+
+#define RX_ERROR_LENGTH(s) \
+ ((s) == RX_STATUS_TOO_LONG_GOOD_CRC || \
+ (s) == RX_STATUS_TOO_LONG_BAD_CRC || \
+ (s) == RX_STATUS_TOO_LONG_BAD_ALIGN)
+#define RX_ERROR_OVER(s) \
+ ((s) == RX_STATUS_BUFFER_FULL)
+#define RX_ERROR_CRC(s) \
+ ((s) == RX_STATUS_CRC_ERROR || \
+ (s) == RX_STATUS_TOO_LONG_BAD_CRC)
+#define RX_ERROR_FRAME(s) \
+ ((s) == RX_STATUS_ALIGNMENT_ERROR || \
+ (s) == RX_STATUS_TOO_LONG_BAD_ALIGN)
+#define RX_ERROR_FIFO(s) \
+ (0)
+
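Editorial note on the Rx status codes above: the frame status is a 4-bit field and the checksum status a 3-bit field in Rx descriptor word 0. The sketch below is a standalone illustration and not part of the patch. It assumes the bitfields are allocated from bit 0 upward (consistent with the BIT(29)/BIT(30) defines for perr/derr), which puts status at bits 25:22 and chksum_status at bits 28:26. The EX_* names and helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Subset of the RX_STATUS_* codes from the patch, renamed locally. */
#define EX_RX_STATUS_TOO_LONG_GOOD_CRC  1
#define EX_RX_STATUS_CRC_ERROR          4
#define EX_RX_STATUS_TOO_LONG_BAD_CRC   5
#define EX_RX_STATUS_ALIGNMENT_ERROR    6
#define EX_RX_STATUS_TOO_LONG_BAD_ALIGN 7

#define EX_RX_ERROR_LENGTH(s) \
	((s) == EX_RX_STATUS_TOO_LONG_GOOD_CRC || \
	 (s) == EX_RX_STATUS_TOO_LONG_BAD_CRC || \
	 (s) == EX_RX_STATUS_TOO_LONG_BAD_ALIGN)
#define EX_RX_ERROR_CRC(s) \
	((s) == EX_RX_STATUS_CRC_ERROR || \
	 (s) == EX_RX_STATUS_TOO_LONG_BAD_CRC)
#define EX_RX_ERROR_FRAME(s) \
	((s) == EX_RX_STATUS_ALIGNMENT_ERROR || \
	 (s) == EX_RX_STATUS_TOO_LONG_BAD_ALIGN)

/* Assumed layout: status in bits 25:22, chksum_status in bits 28:26. */
static unsigned int rx_status(uint32_t word0)
{
	return (word0 >> 22) & 0xf;
}

static unsigned int rx_chksum_status(uint32_t word0)
{
	return (word0 >> 26) & 0x7;
}

static void rx_status_selfcheck(void)
{
	/* 1500-byte buffer, status 5 (too long, bad CRC), checksum 6. */
	uint32_t word0 = 1500u | (5u << 22) | (6u << 26);
	unsigned int s = rx_status(word0);

	assert(s == EX_RX_STATUS_TOO_LONG_BAD_CRC);
	assert(rx_chksum_status(word0) == 6);
	assert(EX_RX_ERROR_LENGTH(s));  /* counts as a length error */
	assert(EX_RX_ERROR_CRC(s));     /* and as a CRC error */
	assert(!EX_RX_ERROR_FRAME(s));  /* but not as a framing error */
}
```

A single status code can fall into more than one error class, so a driver that keeps separate counters would test each macro independently rather than in an else-if chain.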
+/*
+ * GMAC DMA Rx Descriptor Word 1 Register
+ * GMAC0 offset 0x8064
+ * GMAC1 offset 0xC064
+ */
+typedef union {
+ unsigned int bits32;
+ struct rxdesc_word1 {
+ unsigned int byte_count : 16; /* bit 15: 0 Rx Frame Byte Count */
+ unsigned int sw_id : 16; /* bit 31:16 Software ID */
+ } bits;
+} GMAC_RXDESC_1_T;
+
+/*
+ * GMAC DMA Rx Descriptor Word 2 Register
+ * GMAC0 offset 0x8068
+ * GMAC1 offset 0xC068
+ */
+typedef union {
+ unsigned int bits32;
+ unsigned int buf_adr;
+} GMAC_RXDESC_2_T;
+
+#define RX_INSERT_NONE 0
+#define RX_INSERT_1_BYTE 1
+#define RX_INSERT_2_BYTE 2
+#define RX_INSERT_3_BYTE 3
+
+/*
+ * GMAC DMA Rx Descriptor Word 3 Register
+ * GMAC0 offset 0x806C
+ * GMAC1 offset 0xC06C
+ */
+typedef union {
+ unsigned int bits32;
+ struct rxdesc_word3 {
+ unsigned int l3_offset : 8; /* bit 7: 0 L3 data offset */
+ unsigned int l4_offset : 8; /* bit 15: 8 L4 data offset */
+ unsigned int l7_offset : 8; /* bit 23: 16 L7 data offset */
+ unsigned int dup_ack : 1; /* bit 24 Duplicated ACK detected */
+ unsigned int abnormal : 1; /* bit 25 abnormal case found */
+ unsigned int option : 1; /* bit 26 IPV4 option or IPV6 extension header */
+ unsigned int out_of_seq : 1; /* bit 27 Out of Sequence packet */
+ unsigned int ctrl_flag : 1; /* bit 28 Control Flag is present */
+ unsigned int eofie : 1; /* bit 29 End of frame interrupt enable */
+ unsigned int sof_eof : 2; /* bit 31:30 11: only one, 10: first, 01: last, 00: linking */
+ } bits;
+} GMAC_RXDESC_3_T;
+
+/*
+ * GMAC Rx Descriptor
+ */
+typedef struct {
+ GMAC_RXDESC_0_T word0;
+ GMAC_RXDESC_1_T word1;
+ GMAC_RXDESC_2_T word2;
+ GMAC_RXDESC_3_T word3;
+} GMAC_RXDESC_T;
+
+/*
+ * GMAC Hash Engine Enable/Action Register 0 Offset Register
+ * GMAC0 offset 0x8070
+ * GMAC1 offset 0xC070
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_8070 {
+ unsigned int mr0hel : 6; /* bit 5:0 match rule 0 hash entry size */
+ unsigned int mr0_action : 5; /* bit 10:6 Matching Rule 0 action offset */
+ unsigned int reserved0 : 4; /* bit 14:11 */
+ unsigned int mr0en : 1; /* bit 15 Enable Matching Rule 0 */
+ unsigned int mr1hel : 6; /* bit 21:16 match rule 1 hash entry size */
+ unsigned int mr1_action : 5; /* bit 26:22 Matching Rule 1 action offset */
+ unsigned int timing : 3; /* bit 29:27 */
+ unsigned int reserved1 : 1; /* bit 30 */
+ unsigned int mr1en : 1; /* bit 31 Enable Matching Rule 1 */
+ } bits;
+} GMAC_HASH_ENABLE_REG0_T;
+
+/*
+ * GMAC Hash Engine Enable/Action Register 1 Offset Register
+ * GMAC0 offset 0x8074
+ * GMAC1 offset 0xC074
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_8074 {
+ unsigned int mr2hel : 6; /* bit 5:0 match rule 2 hash entry size */
+ unsigned int mr2_action : 5; /* bit 10:6 Matching Rule 2 action offset */
+ unsigned int reserved2 : 4; /* bit 14:11 */
+ unsigned int mr2en : 1; /* bit 15 Enable Matching Rule 2 */
+ unsigned int mr3hel : 6; /* bit 21:16 match rule 3 hash entry size */
+ unsigned int mr3_action : 5; /* bit 26:22 Matching Rule 3 action offset */
+ unsigned int reserved1 : 4; /* bit 30:27 */
+ unsigned int mr3en : 1; /* bit 31 Enable Matching Rule 3 */
+ } bits;
+} GMAC_HASH_ENABLE_REG1_T;
+
+/*
+ * GMAC Matching Rule Control Register 0
+ * GMAC0 offset 0x8078
+ * GMAC1 offset 0xC078
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_8078 {
+ unsigned int sprx : 8; /* bit 7:0 Support Protocol Register 7:0 */
+ unsigned int reserved2 : 4; /* bit 11:8 */
+ unsigned int tos_traffic : 1; /* bit 12 IPV4 TOS or IPV6 Traffic Class */
+ unsigned int flow_lable : 1; /* bit 13 IPV6 Flow label */
+ unsigned int ip_hdr_len : 1; /* bit 14 IPV4 Header length */
+ unsigned int ip_version : 1; /* bit 15 0: IPV4, 1: IPV6 */
+ unsigned int reserved1 : 3; /* bit 18:16 */
+ unsigned int pppoe : 1; /* bit 19 PPPoE Session ID enable */
+ unsigned int vlan : 1; /* bit 20 VLAN ID enable */
+ unsigned int ether_type : 1; /* bit 21 Ethernet type enable */
+ unsigned int sa : 1; /* bit 22 MAC SA enable */
+ unsigned int da : 1; /* bit 23 MAC DA enable */
+ unsigned int priority : 3; /* bit 26:24 priority if multi-rules matched */
+ unsigned int port : 1; /* bit 27 PORT ID matching enable */
+ unsigned int l7 : 1; /* bit 28 L7 matching enable */
+ unsigned int l4 : 1; /* bit 29 L4 matching enable */
+ unsigned int l3 : 1; /* bit 30 L3 matching enable */
+ unsigned int l2 : 1; /* bit 31 L2 matching enable */
+ } bits;
+} GMAC_MRxCR0_T;
+
+#define MR_L2_BIT BIT(31)
+#define MR_L3_BIT BIT(30)
+#define MR_L4_BIT BIT(29)
+#define MR_L7_BIT BIT(28)
+#define MR_PORT_BIT BIT(27)
+#define MR_PRIORITY_BIT BIT(26)
+#define MR_DA_BIT BIT(23)
+#define MR_SA_BIT BIT(22)
+#define MR_ETHER_TYPE_BIT BIT(21)
+#define MR_VLAN_BIT BIT(20)
+#define MR_PPPOE_BIT BIT(19)
+#define MR_IP_VER_BIT BIT(15)
+#define MR_IP_HDR_LEN_BIT BIT(14)
+#define MR_FLOW_LABLE_BIT BIT(13)
+#define MR_TOS_TRAFFIC_BIT BIT(12)
+#define MR_SPR_BIT(x) BIT(x)
+#define MR_SPR_BITS 0xff
+
+/*
+ * GMAC Matching Rule Control Register 1
+ * GMAC0 offset 0x807C
+ * GMAC1 offset 0xC07C
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_807C {
+ unsigned int l4_byte0_15 : 16; /* bit 15: 0 */
+ unsigned int dip_netmask : 7; /* bit 22:16 Dest IP net mask, number of mask bits */
+ unsigned int dip : 1; /* bit 23 Dest IP */
+ unsigned int sip_netmask : 7; /* bit 30:24 Srce IP net mask, number of mask bits */
+ unsigned int sip : 1; /* bit 31 Srce IP */
+ } bits;
+} GMAC_MRxCR1_T;
+
+/*
+ * GMAC Matching Rule Control Register 2
+ * GMAC0 offset 0x8080
+ * GMAC1 offset 0xC080
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_8080 {
+ unsigned int l7_byte0_23 : 24; /* bit 23:0 */
+ unsigned int l4_byte16_24 : 8; /* bit 31: 24 */
+ } bits;
+} GMAC_MRxCR2_T;
+
+/*
+ * GMAC Support registers
+ * GMAC0 offset 0x80A8
+ * GMAC1 offset 0xC0A8
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_80A8 {
+ unsigned int protocol : 8; /* bit 7:0 Supported protocol */
+ unsigned int swap : 3; /* bit 10:8 Swap */
+ unsigned int reserved : 21; /* bit 31:11 */
+ } bits;
+} GMAC_SPR_T;
+
+/*
+ * GMAC_AHB_WEIGHT registers
+ * GMAC0 offset 0x80C8
+ * GMAC1 offset 0xC0C8
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_80C8 {
+ unsigned int hash_weight : 5; /* 4:0 */
+ unsigned int rx_weight : 5; /* 9:5 */
+ unsigned int tx_weight : 5; /* 14:10 */
+ unsigned int pre_req : 5; /* 19:15 Rx Data Pre Request FIFO Threshold */
+ unsigned int tqDV_threshold : 5; /* 24:20 DMA TqCtrl to Start tqDV FIFO Threshold */
+ unsigned int reserved : 7; /* 31:25 */
+ } bits;
+} GMAC_AHB_WEIGHT_T;
+
+/*
+ * the register structure of GMAC
+ */
+
+/*
+ * GMAC RX FLTR
+ * GMAC0 Offset 0xA00C
+ * GMAC1 Offset 0xE00C
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit1_000c {
+ unsigned int unicast : 1; /* enable receive of unicast frames that are sent to STA address */
+ unsigned int multicast : 1; /* enable receive of multicast frames that pass multicast filter */
+ unsigned int broadcast : 1; /* enable receive of broadcast frames */
+ unsigned int promiscuous : 1; /* enable receive of all frames */
+ unsigned int error : 1; /* enable receive of all error frames */
+ unsigned int : 27;
+ } bits;
+} GMAC_RX_FLTR_T;
+
+/*
+ * GMAC Configuration 0
+ * GMAC0 Offset 0xA018
+ * GMAC1 Offset 0xE018
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit1_0018 {
+ unsigned int dis_tx : 1; /* 0: disable transmit */
+ unsigned int dis_rx : 1; /* 1: disable receive */
+ unsigned int loop_back : 1; /* 2: transmit data loopback enable */
+ unsigned int flow_ctrl : 1; /* 3: flow control also triggered by Rx queues */
+ unsigned int adj_ifg : 4; /* 4-7: adjust IFG from 96+/-56 */
+ unsigned int max_len : 3; /* 8-10 maximum receive frame length allowed */
+ unsigned int dis_bkoff : 1; /* 11: disable back-off function */
+ unsigned int dis_col : 1; /* 12: disable 16 collisions abort function */
+ unsigned int sim_test : 1; /* 13: speed up timers in simulation */
+ unsigned int rx_fc_en : 1; /* 14: RX flow control enable */
+ unsigned int tx_fc_en : 1; /* 15: TX flow control enable */
+ unsigned int rgmii_en : 1; /* 16: RGMII in-band status enable */
+ unsigned int ipv4_rx_chksum : 1; /* 17: IPv4 RX Checksum enable */
+ unsigned int ipv6_rx_chksum : 1; /* 18: IPv6 RX Checksum enable */
+ unsigned int rx_tag_remove : 1; /* 19: Remove Rx VLAN tag */
+ unsigned int rgmm_edge : 1; /* 20 */
+ unsigned int rxc_inv : 1; /* 21 */
+ unsigned int ipv6_exthdr_order : 1; /* 22 */
+ unsigned int rx_err_detect : 1; /* 23 */
+ unsigned int port0_chk_hwq : 1; /* 24 */
+ unsigned int port1_chk_hwq : 1; /* 25 */
+ unsigned int port0_chk_toeq : 1; /* 26 */
+ unsigned int port1_chk_toeq : 1; /* 27 */
+ unsigned int port0_chk_classq : 1; /* 28 */
+ unsigned int port1_chk_classq : 1; /* 29 */
+ unsigned int reserved : 2; /* 30-31 */
+ } bits;
+} GMAC_CONFIG0_T;
+
+#define CONFIG0_TX_RX_DISABLE (BIT(1)|BIT(0))
+#define CONFIG0_RX_CHKSUM (BIT(18)|BIT(17))
+#define CONFIG0_FLOW_RX (BIT(14))
+#define CONFIG0_FLOW_TX (BIT(15))
+#define CONFIG0_FLOW_TX_RX (BIT(14)|BIT(15))
+#define CONFIG0_FLOW_CTL (BIT(14)|BIT(15))
+
+#define CONFIG0_MAXLEN_SHIFT 8
+#define CONFIG0_MAXLEN_MASK (7 << CONFIG0_MAXLEN_SHIFT)
+#define CONFIG0_MAXLEN_1536 0
+#define CONFIG0_MAXLEN_1518 1
+#define CONFIG0_MAXLEN_1522 2
+#define CONFIG0_MAXLEN_1542 3
+#define CONFIG0_MAXLEN_9k 4 /* 9212 */
+#define CONFIG0_MAXLEN_10k 5 /* 10236 */
+#define CONFIG0_MAXLEN_1518__6 6
+#define CONFIG0_MAXLEN_1518__7 7
+
+/*
+ * GMAC Configuration 1
+ * GMAC0 Offset 0xA01C
+ * GMAC1 Offset 0xE01C
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit1_001c {
+ unsigned int set_threshold : 8; /* flow control set threshold */
+ unsigned int rel_threshold : 8; /* flow control release threshold */
+ unsigned int reserved : 16;
+ } bits;
+} GMAC_CONFIG1_T;
+
+#define GMAC_FLOWCTRL_SET_MAX 32
+#define GMAC_FLOWCTRL_SET_MIN 0
+#define GMAC_FLOWCTRL_RELEASE_MAX 32
+#define GMAC_FLOWCTRL_RELEASE_MIN 0
+
+/*
+ * GMAC Configuration 2
+ * GMAC0 Offset 0xA020
+ * GMAC1 Offset 0xE020
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit1_0020 {
+ unsigned int set_threshold : 16; /* flow control set threshold */
+ unsigned int rel_threshold : 16; /* flow control release threshold */
+ } bits;
+} GMAC_CONFIG2_T;
+
+/*
+ * GMAC Configuration 3
+ * GMAC0 Offset 0xA024
+ * GMAC1 Offset 0xE024
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit1_0024 {
+ unsigned int set_threshold : 16; /* flow control set threshold */
+ unsigned int rel_threshold : 16; /* flow control release threshold */
+ } bits;
+} GMAC_CONFIG3_T;
+
+
+/*
+ * GMAC STATUS
+ * GMAC0 Offset 0xA02C
+ * GMAC1 Offset 0xE02C
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit1_002c {
+ unsigned int link : 1; /* link status */
+ unsigned int speed : 2; /* link speed(00->2.5M 01->25M 10->125M) */
+ unsigned int duplex : 1; /* duplex mode */
+ unsigned int reserved : 1;
+ unsigned int mii_rmii : 2; /* PHY interface type */
+ unsigned int : 25;
+ } bits;
+} GMAC_STATUS_T;
+
+#define GMAC_SPEED_10 0
+#define GMAC_SPEED_100 1
+#define GMAC_SPEED_1000 2
+
+#define GMAC_PHY_MII 0
+#define GMAC_PHY_GMII 1
+#define GMAC_PHY_RGMII_100_10 2
+#define GMAC_PHY_RGMII_1000 3
+
+/*
+ * Queue Header
+ * (1) TOE Queue Header
+ * (2) Non-TOE Queue Header
+ * (3) Interrupt Queue Header
+ *
+ * memory Layout
+ * TOE Queue Header
+ * 0x60003000 +---------------------------+ 0x0000
+ * | TOE Queue 0 Header |
+ * | 8 * 4 Bytes |
+ * +---------------------------+ 0x0020
+ * | TOE Queue 1 Header |
+ * | 8 * 4 Bytes |
+ * +---------------------------+ 0x0040
+ * | ...... |
+ * | |
+ * +---------------------------+
+ *
+ * Non TOE Queue Header
+ * 0x60002000 +---------------------------+ 0x0000
+ * | Default Queue 0 Header |
+ * | 2 * 4 Bytes |
+ * +---------------------------+ 0x0008
+ * | Default Queue 1 Header |
+ * | 2 * 4 Bytes |
+ * +---------------------------+ 0x0010
+ * | Classification Queue 0 |
+ * | 2 * 4 Bytes |
+ * +---------------------------+
+ * | Classification Queue 1 |
+ * | 2 * 4 Bytes |
+ * +---------------------------+ (n * 8 + 0x10)
+ * | ... |
+ * | 2 * 4 Bytes |
+ * +---------------------------+ (13 * 8 + 0x10)
+ * | Classification Queue 13 |
+ * | 2 * 4 Bytes |
+ * +---------------------------+ 0x80
+ * | Interrupt Queue 0 |
+ * | 2 * 4 Bytes |
+ * +---------------------------+
+ * | Interrupt Queue 1 |
+ * | 2 * 4 Bytes |
+ * +---------------------------+
+ * | Interrupt Queue 2 |
+ * | 2 * 4 Bytes |
+ * +---------------------------+
+ * | Interrupt Queue 3 |
+ * | 2 * 4 Bytes |
+ * +---------------------------+
+ *
+ */
+#define TOE_QUEUE_HDR_ADDR(n) (TOE_TOE_QUE_HDR_BASE + (n) * 32)
+#define TOE_Q_HDR_AREA_END (TOE_QUEUE_HDR_ADDR(TOE_TOE_QUEUE_MAX + 1))
+#define TOE_DEFAULT_Q_HDR_BASE(x) (TOE_NONTOE_QUE_HDR_BASE + 0x08 * (x))
+#define TOE_CLASS_Q_HDR_BASE (TOE_NONTOE_QUE_HDR_BASE + 0x10)
+#define TOE_INTR_Q_HDR_BASE (TOE_NONTOE_QUE_HDR_BASE + 0x80)
+#define INTERRUPT_QUEUE_HDR_ADDR(n) (TOE_INTR_Q_HDR_BASE + (n) * 8)
+#define NONTOE_Q_HDR_AREA_END (INTERRUPT_QUEUE_HDR_ADDR(TOE_INTR_QUEUE_MAX + 1))
+/*
+ * TOE Queue Header Word 0
+ */
+typedef union {
+ unsigned int bits32;
+ unsigned int base_size;
+} TOE_QHDR0_T;
+
+#define TOE_QHDR0_BASE_MASK (~0x0f)
+
+/*
+ * TOE Queue Header Word 1
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_qhdr1 {
+ unsigned int rptr : 16; /* bit 15:0 */
+ unsigned int wptr : 16; /* bit 31:16 */
+ } bits;
+} TOE_QHDR1_T;
+
+/*
+ * TOE Queue Header Word 2
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_qhdr2 {
+ unsigned int TotalPktSize : 17; /* bit 16: 0 Total packet size */
+ unsigned int reserved : 7; /* bit 23:17 */
+ unsigned int dack : 1; /* bit 24 1: Duplicated ACK */
+ unsigned int abn : 1; /* bit 25 1: Abnormal case Found */
+ unsigned int tcp_opt : 1; /* bit 26 1: Have TCP option */
+ unsigned int ip_opt : 1; /* bit 27 1: have IPV4 option or IPV6 Extension header */
+ unsigned int sat : 1; /* bit 28 1: SeqCnt > SeqThreshold, or AckCnt > AckThreshold */
+ unsigned int osq : 1; /* bit 29 1: out of sequence */
+ unsigned int ctl : 1; /* bit 30 1: have control flag bits (except ack) */
+ unsigned int usd : 1; /* bit 31 0: if no data assembled yet */
+ } bits;
+} TOE_QHDR2_T;
+
+/*
+ * TOE Queue Header Word 3
+ */
+typedef union {
+ unsigned int bits32;
+ unsigned int seq_num;
+} TOE_QHDR3_T;
+
+/*
+ * TOE Queue Header Word 4
+ */
+typedef union {
+ unsigned int bits32;
+ unsigned int ack_num;
+} TOE_QHDR4_T;
+
+/*
+ * TOE Queue Header Word 5
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_qhdr5 {
+ unsigned int AckCnt : 16; /* bit 15:0 */
+ unsigned int SeqCnt : 16; /* bit 31:16 */
+ } bits;
+} TOE_QHDR5_T;
+
+/*
+ * TOE Queue Header Word 6
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_qhdr6 {
+ unsigned int WinSize : 16; /* bit 15:0 */
+ unsigned int iq_num : 2; /* bit 17:16 */
+ unsigned int MaxPktSize : 14; /* bit 31:18 */
+ } bits;
+} TOE_QHDR6_T;
+
+/*
+ * TOE Queue Header Word 7
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_qhdr7 {
+ unsigned int AckThreshold : 16; /* bit 15:0 */
+ unsigned int SeqThreshold : 16; /* bit 31:16 */
+ } bits;
+} TOE_QHDR7_T;
+
+/*
+ * TOE Queue Header
+ */
+typedef struct {
+ TOE_QHDR0_T word0;
+ TOE_QHDR1_T word1;
+ TOE_QHDR2_T word2;
+ TOE_QHDR3_T word3;
+ TOE_QHDR4_T word4;
+ TOE_QHDR5_T word5;
+ TOE_QHDR6_T word6;
+ TOE_QHDR7_T word7;
+} TOE_QHDR_T;
+
+/*
+ * NONTOE Queue Header Word 0
+ */
+typedef union {
+ unsigned int bits32;
+ unsigned int base_size;
+} NONTOE_QHDR0_T;
+
+#define NONTOE_QHDR0_BASE_MASK (~0x0f)
+
+/*
+ * NONTOE Queue Header Word 1
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_nonqhdr1 {
+ unsigned int rptr : 16; /* bit 15:0 */
+ unsigned int wptr : 16; /* bit 31:16 */
+ } bits;
+} NONTOE_QHDR1_T;
+
+/*
+ * Non-TOE Queue Header
+ */
+typedef struct {
+ NONTOE_QHDR0_T word0;
+ NONTOE_QHDR1_T word1;
+} NONTOE_QHDR_T;
+
+/*
+ * Interrupt Queue Header Word 0
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_intrqhdr0 {
+ unsigned int win_size : 16; /* bit 15:0 Descriptor Ring Size */
+ unsigned int wptr : 16; /* bit 31:16 Write Pointer where hw stopped */
+ } bits;
+} INTR_QHDR0_T;
+
+/*
+ * Interrupt Queue Header Word 1
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_intrqhdr1 {
+ unsigned int TotalPktSize : 17; /* bit 16: 0 Total packet size */
+ unsigned int tcp_qid : 8; /* bit 24:17 TCP Queue ID */
+ unsigned int dack : 1; /* bit 25 1: Duplicated ACK */
+ unsigned int abn : 1; /* bit 26 1: Abnormal case Found */
+ unsigned int tcp_opt : 1; /* bit 27 1: Have TCP option */
+ unsigned int ip_opt : 1; /* bit 28 1: have IPV4 option or IPV6 Extension header */
+ unsigned int sat : 1; /* bit 29 1: SeqCnt > SeqThreshold, or AckCnt > AckThreshold */
+ unsigned int osq : 1; /* bit 30 1: out of sequence */
+ unsigned int ctl : 1; /* bit 31 1: have control flag bits (except ack) */
+ } bits;
+} INTR_QHDR1_T;
+
+/*
+ * Interrupt Queue Header Word 2
+ */
+typedef union {
+ unsigned int bits32;
+ unsigned int seq_num;
+} INTR_QHDR2_T;
+
+/*
+ * Interrupt Queue Header Word 3
+ */
+typedef union {
+ unsigned int bits32;
+ unsigned int ack_num;
+} INTR_QHDR3_T;
+
+/*
+ * Interrupt Queue Header Word 4
+ */
+typedef union {
+ unsigned int bits32;
+ struct bit_intrqhdr4 {
+ unsigned int AckCnt : 16; /* bit 15:0 Ack# change since last ack# intr. */
+ unsigned int SeqCnt : 16; /* bit 31:16 Seq# change since last seq# intr. */
+ } bits;
+} INTR_QHDR4_T;
+
+/*
+ * Interrupt Queue Header
+ */
+typedef struct {
+ INTR_QHDR0_T word0;
+ INTR_QHDR1_T word1;
+ INTR_QHDR2_T word2;
+ INTR_QHDR3_T word3;
+ INTR_QHDR4_T word4;
+ unsigned int word5;
+ unsigned int word6;
+ unsigned int word7;
+} INTR_QHDR_T;
+
+#endif /* _GMAC_SL351x_H */
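The address arithmetic described by the queue-header memory-layout comment in the header above can be sketched as follows. This is an illustration only: the base addresses are read off that layout comment (0x60003000 and 0x60002000), and every `EX_`/`ex_` name is hypothetical rather than part of the patch. Macro arguments are parenthesized so that an argument such as `max + 1` is scaled as a whole.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; values taken from the layout comment in the patch. */
#define EX_TOE_QUE_HDR_BASE     0x60003000u
#define EX_NONTOE_QUE_HDR_BASE  0x60002000u

/* Each TOE queue header is 8 words (32 bytes). */
#define EX_TOE_QUEUE_HDR_ADDR(n)  (EX_TOE_QUE_HDR_BASE + (n) * 32u)

/* Non-TOE headers are spaced 8 bytes apart, per the layout comment. */
#define EX_DEFAULT_Q_HDR_ADDR(x)  (EX_NONTOE_QUE_HDR_BASE + 0x08u * (x))
#define EX_CLASS_Q_HDR_ADDR(n)    (EX_NONTOE_QUE_HDR_BASE + 0x10u + (n) * 8u)
#define EX_INTR_Q_HDR_ADDR(n)     (EX_NONTOE_QUE_HDR_BASE + 0x80u + (n) * 8u)

/*
 * End of the TOE header area: the address just past the last queue header.
 * Without parentheses around n in the macro, "last_queue + 1" would expand
 * to base + last_queue + 1 * 32, which is not the intended address.
 */
static inline uint32_t ex_toe_area_end(uint32_t last_queue)
{
	return EX_TOE_QUEUE_HDR_ADDR(last_queue + 1);
}
```

As a consistency check on the layout comment, classification queue 13 ends at offset 13 * 8 + 0x10 + 8 = 0x80, which is where the interrupt queue headers begin.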
^ permalink raw reply related
* [GIT] Networking
From: David Miller @ 2011-01-26 23:13 UTC (permalink / raw)
To: torvalds; +Cc: akpm, netdev, linux-kernel
The ipv6 inetpeer support changes this merge window introduced a
few regressions, sorry about that, but all of the ones I am aware
of should be completely fixed here.
We also have a bug fix for an 11 year old TCP bug, that to me is
the definition of "awesome".
1) INET peer cache releases ipv6 peers using ipv4 tree root, oops.
Based upon a report by Eric Dumazet.
2) GRO packet merging bug fix from Michal Schmidt. If the GRO code
gets a linear SKB then a paged one, we merge incorrectly. This is
possible with the SFC driver which packages RX frames dynamically
based upon a flow's behavior.
3) /proc/net/tcp optimization regression fix, we walk the listener hash
incorrectly such that if there are enough sockets we can loop
essentially forever. Fix from Eric Dumazet.
4) The ipv6 inetpeer support wants to always attach peers to cached routes
only. This is almost always true, except for some special case situations
wrt. local network routes. Fix by always cloning these routes into
RTF_CACHE ones. Based upon a report by PK <runningdoglackey@yahoo.com>
5) Due to some merging side effects, we ended up undoing a memory clear for
the ethtool get-regs request structure. One change (the original fix)
went kmalloc --> kzalloc, the other change did kmalloc --> vmalloc,
and this wasn't caught during the merge (my bad). Luckily Eugene Teo caught
it and submitted this fix.
6) Like ipv4 ipsec routes, ipv6 ones must propagate the inetpeer binding from
the non-ipsec child route.
7) Fix tg3 driver VLAN regressions reported by Eric Dumazet by converting over
to the new VLAN driver interface framework. From Matt Carlson.
8) Fix a TCP bug that causes erroneous resets to be emitted on the
final "data + FIN" packet. This bug dates back to January, 2000 :-)
Fix from Jerry Chu.
9) Revert a set of ipv6 interface address semantic changes from last
January that have broken several things, in particular the "disable_ipv6"
sysctl.
10) at91_can driver chip bug errata handling from Marc Kleine-Budde.
11) New softing CAN driver from Kurt Van Dijck.
12) arp_ioctl() locking regression fix from Eric Dumazet based upon a report
by Jamie Heilman.
13) atm idt77105 driver copies wrong stats back to userspace, fix from
Vasiliy Kulikov.
14) Work cancelling fixup in pch_gbe from Tejun Heo.
15) CNIC endianness bug fix from Michael Chan, based upon a report by Breno
Leitao.
16) BNX2 driver barks in logs about AER even on platforms where AER isn't even
supported, fix from Michael Chan.
17) cxgb4 driver needs to call netif_carrier_off() after registering the device,
not before.
18) bonding crashes because it performs pskb_may_pull() potentially on shared
SKBs, fix from Neil Horman.
19) All packet schedulers that can drop previously enqueued packets over-estimate
their stats because the rules concerning bstats updates keep us from being
able to undo the increment at drop time. This makes accurate rate estimation
et al. basically impossible. Fix this by creating a helper function and
doing the bstats increment at dequeue instead of enqueue time, this way the
dropped frames do not accidentally get into the stats. From Eric Dumazet.
20) Various bluetooth regression and memory leak fixes from Alexander Holler,
Gustavo F. Padovan, David Sterba, Johan Hedberg, and Lukas Turek.
21) EEPROM reading fix for older iwlwifi devices from Wey-Yi Guy.
22) Disable PAPRD on ath9k wireless, it is causing regressions in both
connectivity and performance. From Luis R. Rodriguez.
23) Missing dev_alloc_skb() error checking in rtlwifi, from Jesper Juhl.
24) ieee80211_beacon_get_tim() crash fix from Felix Fietkau.
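As an illustration of item 19, here is a toy FIFO that accounts for bytes and packets at dequeue time rather than at enqueue time, so a packet dropped while still queued never reaches the counters. This is a userspace sketch under stated assumptions, not the kernel change itself: all `ex_` names and the fixed queue length are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define EX_QLEN 4

struct ex_qdisc {
	unsigned int len[EX_QLEN];  /* queued packet lengths, oldest first */
	size_t count;               /* number of queued packets */
	unsigned long bytes;        /* stats: bytes actually dequeued */
	unsigned long packets;      /* stats: packets actually dequeued */
};

/* Enqueue does not touch the stats. Returns 0 on success, -1 if full. */
static int ex_enqueue(struct ex_qdisc *q, unsigned int len)
{
	if (q->count == EX_QLEN)
		return -1;
	q->len[q->count++] = len;
	return 0;
}

/* Drop the newest queued packet; there is nothing to undo in the stats. */
static int ex_drop_tail(struct ex_qdisc *q)
{
	if (q->count == 0)
		return -1;
	q->count--;
	return 0;
}

/* Dequeue the oldest packet and count it only now. Returns 0 if empty. */
static unsigned int ex_dequeue(struct ex_qdisc *q)
{
	unsigned int len;
	size_t i;

	if (q->count == 0)
		return 0;
	len = q->len[0];
	for (i = 1; i < q->count; i++)
		q->len[i - 1] = q->len[i];
	q->count--;
	q->bytes += len;
	q->packets++;
	return len;
}
```

With counting done at enqueue, the drop path would need to subtract what it had added; counting at dequeue avoids that bookkeeping entirely.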
Please pull, thanks a lot!
The following changes since commit c723fdab8aa728dc2bf0da6a0de8bb9c3f588d84:
Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 (2011-01-25 14:23:54 +1000)
are available in the git repository at:
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master
Alexander Holler (1):
Bluetooth: ath3k: reduce memory usage
David S. Miller (7):
Merge branch 'master' of master.kernel.org:/.../torvalds/linux-2.6
inetpeer: Use correct AVL tree base pointer in inet_getpeer().
Merge branch 'can/at91_can-for-net-2.6' of git://git.pengutronix.de/git/mkl/linux-2.6
ipv6: Always clone offlink routes.
ipv6: Revert 'administrative down' address handling changes.
Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6
xfrm6: Don't forget to propagate peer into ipsec route.
David Sterba (1):
Bluetooth: l2cap: fix misuse of logical operation in place of bitop
Dimitris Michailidis (1):
cxgb4: fix reported state of interfaces without link
Eric Dumazet (3):
net_sched: accurate bytes/packets stats/rates
net: arp_ioctl() must hold RTNL
tcp: fix bug in listening_get_next()
Eugene Teo (1):
net: clear heap allocation for ethtool_get_regs()
Felix Fietkau (2):
ath9k: add missing ps wakeup/restore calls
mac80211: fix a crash in ieee80211_beacon_get_tim on change_interface
Greg Kroah-Hartman (1):
rt2x00: add device id for windy31 usb device
Jerry Chu (1):
TCP: fix a bug that triggers large number of TCP RST by mistake
Jesper Dangaard Brouer (1):
textsearch: doc - fix spelling in lib/textsearch.c.
Jesper Juhl (2):
rtlwifi: Fix possible NULL dereference
USB NET KL5KUSB101: Fix mem leak in error path of kaweth_download_firmware()
Johan Hedberg (6):
Bluetooth: Fix leaking blacklist when unregistering a hci device
Revert "Bluetooth: Update sec_level/auth_type for already existing connections"
Bluetooth: Fix MITM protection requirement preservation
Bluetooth: Create a unified auth_type evaluation function
Bluetooth: Fix authentication request for L2CAP raw sockets
Bluetooth: Fix race condition with conn->sec_level
John Fastabend (1):
dcbnl: make get_app handling symmetric for IEEE and CEE DCBx
Kurt Van Dijck (2):
can: add driver for Softing card
can: add driver for Softing card
Luis R. Rodriguez (1):
ath9k_hw: disabled PAPRD for AR9003
Lukáš Turek (1):
Bluetooth: Never deallocate a session when some DLC points to it
Marc Kleine-Budde (3):
can: at91_can: clean up usage of AT91_MB_RX_FIRST and AT91_MB_RX_NUM
can: at91_can: don't use mailbox 0
can: at91_can: make can_id of mailbox 0 configurable
Matt Carlson (1):
tg3: Use new VLAN code
Michael Chan (3):
bnx2: Always set ETH_FLAG_TXVLAN
cnic: Fix big endian bug
bnx2: Eliminate AER error messages on systems not supporting it
Michal Schmidt (1):
GRO: fix merging a paged skb after non-paged skbs
Neil Horman (1):
bonding: Ensure that we unshare skbs prior to calling pskb_may_pull
Nicolas de Pesloüan (1):
bonding: update documentation - alternate configuration.
Reinette Chatre (1):
MAINTAINERS: remove Reinette Chatre as iwlwifi maintainer
Tejun Heo (1):
pch_gbe: don't use flush_scheduled_work()
Vasiliy Kulikov (1):
atm: idt77105: fix fetch_stats() result
Wey-Yi Guy (1):
iwlwifi: don't read sku information from EEPROM for 4965
Documentation/ABI/testing/sysfs-platform-at91 | 25 +
Documentation/networking/bonding.txt | 83 ++-
MAINTAINERS | 1 -
drivers/atm/idt77105.c | 2 +-
drivers/bluetooth/ath3k.c | 75 +--
drivers/net/bnx2.c | 21 +-
drivers/net/bnx2.h | 1 +
drivers/net/bonding/bond_3ad.c | 4 +
drivers/net/bonding/bond_alb.c | 4 +
drivers/net/bonding/bond_main.c | 4 +
drivers/net/can/Kconfig | 2 +
drivers/net/can/Makefile | 1 +
drivers/net/can/at91_can.c | 138 +++-
drivers/net/can/softing/Kconfig | 30 +
drivers/net/can/softing/Makefile | 6 +
drivers/net/can/softing/softing.h | 167 +++++
drivers/net/can/softing/softing_cs.c | 359 ++++++++++
drivers/net/can/softing/softing_fw.c | 691 +++++++++++++++++++
drivers/net/can/softing/softing_main.c | 893 +++++++++++++++++++++++++
drivers/net/can/softing/softing_platform.h | 40 ++
drivers/net/cnic.c | 12 +-
drivers/net/cxgb4/cxgb4_main.c | 3 +-
drivers/net/pch_gbe/pch_gbe_main.c | 2 +-
drivers/net/tg3.c | 95 +---
drivers/net/tg3.h | 3 -
drivers/net/usb/kaweth.c | 1 +
drivers/net/wireless/ath/ath9k/hw.c | 6 +-
drivers/net/wireless/ath/ath9k/hw.h | 1 +
drivers/net/wireless/ath/ath9k/main.c | 8 +-
drivers/net/wireless/ath/ath9k/xmit.c | 2 -
drivers/net/wireless/iwlwifi/iwl-4965.c | 1 +
drivers/net/wireless/iwlwifi/iwl-agn-eeprom.c | 11 +-
drivers/net/wireless/rt2x00/rt73usb.c | 1 +
drivers/net/wireless/rtlwifi/pci.c | 11 +-
include/net/bluetooth/hci_core.h | 1 +
include/net/sch_generic.h | 8 +-
lib/textsearch.c | 10 +-
net/bluetooth/hci_conn.c | 16 +-
net/bluetooth/hci_core.c | 4 +
net/bluetooth/hci_event.c | 9 +-
net/bluetooth/l2cap.c | 84 +--
net/bluetooth/rfcomm/core.c | 3 +-
net/core/dev.c | 3 +-
net/core/ethtool.c | 2 +-
net/core/skbuff.c | 8 +-
net/dcb/dcbnl.c | 13 +-
net/ipv4/arp.c | 11 +-
net/ipv4/inetpeer.c | 2 +-
net/ipv4/tcp_input.c | 2 +-
net/ipv4/tcp_ipv4.c | 1 -
net/ipv6/addrconf.c | 81 +--
net/ipv6/route.c | 9 +-
net/ipv6/xfrm6_policy.c | 6 +
net/mac80211/tx.c | 3 +
net/sched/sch_cbq.c | 3 +-
net/sched/sch_drr.c | 2 +-
net/sched/sch_dsmark.c | 2 +-
net/sched/sch_fifo.c | 5 +-
net/sched/sch_hfsc.c | 2 +-
net/sched/sch_htb.c | 12 +-
net/sched/sch_multiq.c | 2 +-
net/sched/sch_netem.c | 3 +-
net/sched/sch_prio.c | 2 +-
net/sched/sch_red.c | 11 +-
net/sched/sch_sfq.c | 5 +-
net/sched/sch_tbf.c | 2 +-
net/sched/sch_teql.c | 3 +-
67 files changed, 2650 insertions(+), 384 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-platform-at91
create mode 100644 drivers/net/can/softing/Kconfig
create mode 100644 drivers/net/can/softing/Makefile
create mode 100644 drivers/net/can/softing/softing.h
create mode 100644 drivers/net/can/softing/softing_cs.c
create mode 100644 drivers/net/can/softing/softing_fw.c
create mode 100644 drivers/net/can/softing/softing_main.c
create mode 100644 drivers/net/can/softing/softing_platform.h
^ permalink raw reply
* Re: [PATCH net-2.6] bnx2: Eliminate AER error messages on systems not supporting it
From: David Miller @ 2011-01-26 22:28 UTC (permalink / raw)
To: mchan; +Cc: leitao, netdev
In-Reply-To: <20110126.142635.59692241.davem@davemloft.net>
From: David Miller <davem@davemloft.net>
Date: Wed, 26 Jan 2011 14:26:35 -0800 (PST)
> From: "Michael Chan" <mchan@broadcom.com>
> Date: Wed, 26 Jan 2011 00:14:51 -0800
>
>> On PPC for example, AER is not supported and we see unnecessary AER
>> error message without this patch:
>>
>> bnx2 0003:01:00.1: pci_cleanup_aer_uncorrect_error_status failed 0xfffffffb
>>
>> Reported-by: Breno Leitao <leitao@linux.vnet.ibm.com>
>> Signed-off-by: Michael Chan <mchan@broadcom.com>
>
> Applied.
Please check for warnings in your build, I amended the following fix
into this commit:
diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 62c6079..0ba59d5 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -8540,7 +8540,7 @@ static pci_ers_result_t bnx2_io_slot_reset(struct pci_dev *pdev)
}
rtnl_unlock();
- if (!bp->flags & BNX2_FLAG_AER_ENABLED)
+ if (!(bp->flags & BNX2_FLAG_AER_ENABLED))
return result;
err = pci_cleanup_aer_uncorrect_error_status(pdev);
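The fix in the diff above comes down to operator precedence: unary `!` binds tighter than binary `&`, so `!flags & FLAG` is evaluated as `(!flags) & FLAG`. A standalone sketch follows, using a hypothetical flag value (the real value of `BNX2_FLAG_AER_ENABLED` does not appear in this thread) and hypothetical `ex_` helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bit, for illustration only. */
#define EX_FLAG_AER_ENABLED 0x00000100u

/*
 * What the compiler sees for "!flags & FLAG", written with explicit
 * parentheses. !flags is 0 or 1, so the result is always 0 unless the
 * flag happens to include bit 0.
 */
static int ex_aer_disabled_buggy(uint32_t flags)
{
	return ((!flags) & EX_FLAG_AER_ENABLED) != 0;
}

/* The intended test: true when the flag bit is not set. */
static int ex_aer_disabled_fixed(uint32_t flags)
{
	return !(flags & EX_FLAG_AER_ENABLED);
}
```

With this flag value the buggy form never reports "disabled", so the early return would never be taken.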
^ permalink raw reply related
* Re: [PATCH net-2.6] bnx2: Eliminate AER error messages on systems not supporting it
From: David Miller @ 2011-01-26 22:26 UTC (permalink / raw)
To: mchan; +Cc: leitao, netdev
In-Reply-To: <1296029691-3591-2-git-send-email-mchan@broadcom.com>
From: "Michael Chan" <mchan@broadcom.com>
Date: Wed, 26 Jan 2011 00:14:51 -0800
> On PPC for example, AER is not supported and we see unnecessary AER
> error message without this patch:
>
> bnx2 0003:01:00.1: pci_cleanup_aer_uncorrect_error_status failed 0xfffffffb
>
> Reported-by: Breno Leitao <leitao@linux.vnet.ibm.com>
> Signed-off-by: Michael Chan <mchan@broadcom.com>
Applied.
^ permalink raw reply
* Re: [PATCH net-2.6] cnic: Fix big endian bug
From: David Miller @ 2011-01-26 22:26 UTC (permalink / raw)
To: mchan; +Cc: leitao, netdev
In-Reply-To: <1296029691-3591-1-git-send-email-mchan@broadcom.com>
From: "Michael Chan" <mchan@broadcom.com>
Date: Wed, 26 Jan 2011 00:14:50 -0800
> The chip's page tables did not set up properly on big endian machines,
> causing EEH errors on PPC machines.
>
> Reported-by: Breno Leitao <leitao@linux.vnet.ibm.com>
> Signed-off-by: Michael Chan <mchan@broadcom.com>
Applied.
^ permalink raw reply
* [PATCH] xfrm6: Don't forget to propagate peer into ipsec route.
From: David Miller @ 2011-01-26 21:42 UTC (permalink / raw)
To: netdev
Like ipv4, we have to propagate the ipv6 route peer into
the ipsec top-level route during instantiation.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
I noticed this oversight while going back to work on route metrics
COW'ing. Committed to net-2.6
net/ipv6/xfrm6_policy.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 7e74023..da87428 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -98,6 +98,10 @@ static int xfrm6_fill_dst(struct xfrm_dst *xdst, struct net_device *dev,
if (!xdst->u.rt6.rt6i_idev)
return -ENODEV;
+ xdst->u.rt6.rt6i_peer = rt->rt6i_peer;
+ if (rt->rt6i_peer)
+ atomic_inc(&rt->rt6i_peer->refcnt);
+
/* Sheit... I remember I did this right. Apparently,
* it was magically lost, so this code needs audit */
xdst->u.rt6.rt6i_flags = rt->rt6i_flags & (RTF_ANYCAST |
@@ -216,6 +220,8 @@ static void xfrm6_dst_destroy(struct dst_entry *dst)
if (likely(xdst->u.rt6.rt6i_idev))
in6_dev_put(xdst->u.rt6.rt6i_idev);
+ if (likely(xdst->u.rt6.rt6i_peer))
+ inet_putpeer(xdst->u.rt6.rt6i_peer);
xfrm_dst_destroy(xdst);
}
--
1.7.3.4
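The pattern in this patch, taking a reference when a second object starts pointing at the peer and dropping it in the destructor, can be sketched outside the kernel as below. All `ex_` names are hypothetical and the counter is a plain `int`; the actual patch uses `atomic_inc()` on the peer's `refcnt` and `inet_putpeer()`, as shown in the diff.

```c
#include <assert.h>
#include <stddef.h>

struct ex_peer {
	int refcnt;
};

struct ex_route {
	struct ex_peer *peer;
};

/* Copy the peer pointer from the child route and take a reference on it. */
static void ex_fill_dst(struct ex_route *xdst, const struct ex_route *rt)
{
	xdst->peer = rt->peer;
	if (rt->peer)
		rt->peer->refcnt++;   /* the kernel code uses atomic_inc() */
}

/* Drop the reference taken in ex_fill_dst() when the route is destroyed. */
static void ex_dst_destroy(struct ex_route *xdst)
{
	if (xdst->peer) {
		xdst->peer->refcnt--;
		xdst->peer = NULL;
	}
}
```

Every pointer copy that outlives the caller is paired with exactly one release, so the peer's count returns to its starting value once the ipsec route is gone.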
^ permalink raw reply related
* Re: [PATCH net-next-2.6] net_sched: sch_mqprio: dont leak kernel memory
From: Joe Perches @ 2011-01-26 21:33 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev, john.r.fastabend
In-Reply-To: <1296077287.2631.9.camel@edumazet-laptop>
On Wed, 2011-01-26 at 22:28 +0100, Eric Dumazet wrote:
> Le mercredi 26 janvier 2011 à 13:24 -0800, Joe Perches a écrit :
> > Ugly maybe, but correct, definitely.
> > The same can not be said of the {0}.
> What about fixing real problems Joe ?
What about it? You seem to have fixed it.
> Are you telling me I dont know C ?
All I'm saying is that from a style and auditing
perspective, it's better to use memset for all
structs that are exposed to user space.
Other than that, there's no issue here.
cheers, Joe
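A minimal sketch of the style argued for here, assuming a hypothetical struct `ex_opts` that has padding between its members on common ABIs: `memset()` writes every byte of the object, padding included, whereas an `= {0}` initializer is not clearly guaranteed by the C standard to clear padding bytes. The names below are illustrative and are not taken from sch_mqprio.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical structure; on most ABIs 3 padding bytes follow 'flag'. */
struct ex_opts {
	unsigned char flag;
	unsigned int value;
};

/* Zero the whole object, padding included, then fill in the fields. */
static void ex_fill_opts(struct ex_opts *opt, unsigned char flag,
			 unsigned int value)
{
	memset(opt, 0, sizeof(*opt));
	opt->flag = flag;
	opt->value = value;
}
```

The benefit for auditing is that a reviewer only has to find the `memset()` call, without reasoning about how a particular compiler treats the initializer.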
^ permalink raw reply