* Re: [PATCH v3] net: add Faraday FTMAC100 10/100 Ethernet driver
From: Po-Yu Chuang @ 2011-01-21 7:06 UTC
To: Joe Perches; +Cc: netdev, linux-kernel, bhutchings, eric.dumazet, dilinger
In-Reply-To: <1295592411.6795.10.camel@Joe-Laptop>
Dear Joe,
On Fri, Jan 21, 2011 at 2:46 PM, Joe Perches <joe@perches.com> wrote:
> On Fri, 2011-01-21 at 13:03 +0800, Po-Yu Chuang wrote:
>> > Is it useful to retry the NORXBUF case?
>> The idea is that if I miss packet finished interrupts (then rx buffers used up),
>> I should retrieve the received packets ASAP to free buffers to HW.
>> Maybe this is really unnecessary.
>> I am not quite sure, but I'll follow your advice for now.
>
> I wasn't giving advice, just asking a question.
> Your concept makes sense to me.
I see. So I will leave it as is.
>> >> + if (status & FTMAC100_INT_NORXBUF) {
>> >> + /* RX buffer unavailable */
>> >> + if (net_ratelimit())
>> >> + netdev_info(netdev, "INT_NORXBUF\n");
>> >> +
>> >> + netdev->stats.rx_over_errors++;
>> >> + }
>> >
>> > Perhaps this "if (status & FTMAC100_INT_NORXBUF)" block should be
>> > moved into the test block above it before the retry?
>>
>> Since status is not changed in the function, it does not matter where
>> the test is.
>> But I agree that it is better to handle error cases earlier.
>
> This wasn't so much a handle error case early, but
> a suggestion that
> if (status & (foo | bar)) {
> ...
> }
> if (status & bar) {
> ...
> }
> should be
> if (status & (foo | bar)) {
> ...
> if (status & bar) {
> ...
> }
> }
>
> so that when the first test fails, a known
> subset of the first test isn't tested again.
Understood. Thanks.
best regards,
Po-Yu Chuang
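Joe's restructuring suggestion above can be sketched in plain C. This is a minimal userspace illustration, not the driver's code: the bit values, `struct toy_counters` and `toy_handle_status()` are made up for the example, and only the nesting of the two tests mirrors the thread.

```c
/* Hypothetical status bits; the real definitions live in ftmac100.h. */
#define TOY_INT_RPKT_FINISH (1u << 0)
#define TOY_INT_NORXBUF     (1u << 1)

struct toy_counters {
	unsigned int rx_polls;        /* times the RX path was run */
	unsigned int rx_over_errors;  /* times NORXBUF was seen */
};

static void toy_handle_status(unsigned int status, struct toy_counters *c)
{
	if (status & (TOY_INT_RPKT_FINISH | TOY_INT_NORXBUF)) {
		c->rx_polls++;

		/*
		 * NORXBUF is a subset of the outer mask, so it is only
		 * tested when the outer test has already passed.
		 */
		if (status & TOY_INT_NORXBUF)
			c->rx_over_errors++;
	}
}
```

When neither bit is set, the inner test is never evaluated, which is the saving Joe describes.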
* Re: 2.6.38-rc1: arp triggers RTNL assertion
From: Eric Dumazet @ 2011-01-21 7:12 UTC
To: Jamie Heilman; +Cc: linux-kernel, netdev
In-Reply-To: <20110121061758.GA2247@fifty-fifty.audible.transient.net>
On Thursday, 20 January 2011 at 22:17 -0800, Jamie Heilman wrote:
> With 2.6.38-rc1 when I run: arp -Ds 192.168.2.41 eth0 pub
> I see:
>
> RTNL: assertion failed at net/core/neighbour.c (589)
> Pid: 2330, comm: arp Not tainted 2.6.38-rc1-00132-g8d99641-dirty #1
> Call Trace:
> [<c11ed339>] ? pneigh_lookup+0xc3/0x168
> [<c1219f27>] ? arp_req_set+0x86/0x1d5
> [<c11e74b5>] ? dev_get_by_name_rcu+0x72/0x7f
> [<c121a1a3>] ? arp_ioctl+0x12d/0x22e
> [<c121dfeb>] ? inet_ioctl+0x82/0xa7
> [<c11d8ffc>] ? sock_ioctl+0x1b7/0x1db
> [<c11d8e45>] ? sock_ioctl+0x0/0x1db
> [<c108f02f>] ? do_vfs_ioctl+0x47c/0x4c5
> [<c101803c>] ? do_page_fault+0x315/0x341
> [<c11daaf3>] ? sys_socket+0x44/0x5a
> [<c11dab71>] ? sys_socketcall+0x68/0x270
> [<c108f0ab>] ? sys_ioctl+0x33/0x4b
> [<c1002897>] ? sysenter_do_call+0x12/0x26
>
> Figured I'd Cc Eric as this could be related to commit 941666c2,
> "net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()"
>
> Config attached, just in case (the uncommitted change, in the tree this
> kernel was built from, is just Chuck Lever's recent nfs3xdr.c patch).
Thanks for the report, I am looking at this right now.
* Re: [PATCH] rtnetlink: fix link attribute validation with IFLA_GROUP
From: Vlad Dogaru @ 2011-01-20 16:09 UTC
To: Patrick McHardy; +Cc: NetDev, David S. Miller
In-Reply-To: <4D3831FA.2010806@trash.net>
On Thu, Jan 20, 2011 at 02:00:42PM +0100, Patrick McHardy wrote:
> Fix a few semantic problems with the new IFLA_GROUP attribute.
>
> Vlad, could you please give this a try to see whether it
> still behaves as expected?
> commit e4b31d565a45e06ed2e51a005f5c00ff1d00725c
> Author: Patrick McHardy <kaber@trash.net>
> Date: Thu Jan 20 13:55:25 2011 +0100
>
> rtnetlink: fix link attribute validation with IFLA_GROUP
>
> rtnl_group_changelink() is invoked by rtnl_newlink() before the link
> attributes have been validated. Additionally the group changes are
> performed even if NLM_F_CREATE is specified and a new link is
> created, while more reasonable semantics would be to set the group
> value on the newly created link.
>
> Fix both problems by moving the rtnl_group_changelink() invocation
> down to the handling of non-existent links without NLM_F_CREATE
> and add a dev_set_group() call to rtnl_create_link().
>
> Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Vlad Dogaru <ddvlad@rosedu.org>
This looks OK and behaves as before. Thanks for taking the time to look
through it, the new semantics do seem saner.
There is a slight difference, though: in the (yet unapplied) iproute2
patch, ifi_index is set to -1 and kernel space checks that it is <= 0;
your patch checks for == 0. I will update the iproute2 patches with
setting ifi_index to 0 (and other changes) and resubmit them.
* Re: [PATCH] rtnetlink: fix link attribute validation with IFLA_GROUP
From: David Miller @ 2011-01-21 7:29 UTC
To: ddvlad; +Cc: kaber, netdev
In-Reply-To: <20110120160909.GC12415@cormyr>
From: Vlad Dogaru <ddvlad@rosedu.org>
Date: Thu, 20 Jan 2011 18:09:10 +0200
> On Thu, Jan 20, 2011 at 02:00:42PM +0100, Patrick McHardy wrote:
>> Fix a few semantic problems with the new IFLA_GROUP attribute.
>>
>> Vlad, could you please give this a try to see whether it
>> still behaves as expected?
>
>> commit e4b31d565a45e06ed2e51a005f5c00ff1d00725c
>> Author: Patrick McHardy <kaber@trash.net>
>> Date: Thu Jan 20 13:55:25 2011 +0100
>>
>> rtnetlink: fix link attribute validation with IFLA_GROUP
>>
>> rtnl_group_changelink() is invoked by rtnl_newlink() before the link
>> attributes have been validated. Additionally the group changes are
>> performed even if NLM_F_CREATE is specified and a new link is
>> created, while more reasonable semantics would be to set the group
>> value on the newly created link.
>>
>> Fix both problems by moving the rtnl_group_changelink() invocation
>> down to the handling of non-existent links without NLM_F_CREATE
>> and add a dev_set_group() call to rtnl_create_link().
>>
>> Signed-off-by: Patrick McHardy <kaber@trash.net>
>
> Acked-by: Vlad Dogaru <ddvlad@rosedu.org>
Applied, thanks everyone.
* Re: [PATCH] net_sched: accurate bytes/packets stats/rates
From: David Miller @ 2011-01-21 7:31 UTC
To: shemminger; +Cc: eric.dumazet, netdev, kaber, hadi, jarkao2
In-Reply-To: <20110114110342.4d95ad5b@nehalam>
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 14 Jan 2011 11:03:42 -0800
> From Eric Dumazet <eric.dumazet@gmail.com>
>
> In commit 44b8288308ac9d (net_sched: pfifo_head_drop problem), we fixed
> a problem with pfifo_head drops that incorrectly decreased
> sch->bstats.bytes and sch->bstats.packets
>
> Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
> previously enqueued packet, and bstats cannot be changed, so
> bstats/rates are not accurate (over estimated)
>
> This patch changes the qdisc_bstats updates to be done at dequeue() time
> instead of enqueue() time. bstats counters no longer account for dropped
> frames, and rates are more correct, since enqueue() bursts don't have
> effect on dequeue() rate.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Applied to net-2.6, thanks everyone.
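The effect of moving the accounting from enqueue() to dequeue() can be sketched with a toy head-drop queue. This is a minimal userspace illustration under stated assumptions: `struct toy_qdisc` and its helpers are hypothetical and do not use the kernel's qdisc API; the `bytes` and `packets` fields merely stand in for the bstats counters named in the commit message.

```c
#include <stddef.h>

#define TOY_QLEN 4

struct toy_qdisc {
	unsigned int len[TOY_QLEN];  /* queued packet lengths, oldest first */
	size_t count;
	unsigned long bytes;         /* stands in for bstats.bytes */
	unsigned long packets;       /* stands in for bstats.packets */
	unsigned long drops;
};

/* Remove the oldest entry and shift the rest forward. */
static void toy_pop_head(struct toy_qdisc *q)
{
	size_t i;

	for (i = 1; i < q->count; i++)
		q->len[i - 1] = q->len[i];
	q->count--;
}

/* Head-drop policy: when full, discard the oldest packet. No stats here. */
static void toy_enqueue(struct toy_qdisc *q, unsigned int pkt_len)
{
	if (q->count == TOY_QLEN) {
		toy_pop_head(q);
		q->drops++;
	}
	q->len[q->count++] = pkt_len;
}

/* Stats are updated only for packets that actually leave the queue. */
static int toy_dequeue(struct toy_qdisc *q, unsigned int *pkt_len)
{
	if (q->count == 0)
		return 0;
	*pkt_len = q->len[0];
	toy_pop_head(q);
	q->bytes += *pkt_len;
	q->packets++;
	return 1;
}
```

Because a dropped packet never reaches `toy_dequeue()`, there is nothing to subtract from the counters afterwards.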
* Re: [Bugme-new] [Bug 27212] New: Warning kmemcheck: Caught 64-bit read from uninitialized memory in netlink_broadcast_filtered
From: Pekka Enberg @ 2011-01-21 7:49 UTC
To: Eric Dumazet
Cc: Andrew Morton, netdev, bugzilla-daemon, bugme-daemon,
casteyde.christian, Changli Gao, Vegard Nossum
In-Reply-To: <1295556085.2613.22.camel@edumazet-laptop>
On 1/20/11 10:41 PM, Eric Dumazet wrote:
> On Thursday, 20 January 2011 at 12:25 -0800, Andrew Morton wrote:
>> (switched to email. Please respond via emailed reply-to-all, not via the
>> bugzilla web interface).
>>
>> On Thu, 20 Jan 2011 20:08:32 GMT
>> bugzilla-daemon@bugzilla.kernel.org wrote:
>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=27212
>>>
>>> Summary: Warning kmemcheck: Caught 64-bit read from
>>> uninitialized memory in netlink_broadcast_filtered
>>> Product: Other
>>> Version: 2.5
>>> Kernel Version: 2.6.38-rc1
>>> Platform: All
>>> OS/Version: Linux
>>> Tree: Mainline
>>> Status: NEW
>>> Severity: normal
>>> Priority: P1
>>> Component: Other
>>> AssignedTo: other_other@kernel-bugs.osdl.org
>>> ReportedBy: casteyde.christian@free.fr
>>> Regression: Yes
>>>
>>>
>>> Athlon 64 X2 3000 in 64bits
>>> Slackware64 13.1
>>> Kernel compiled with kmemcheck and other debug options
>>>
>>> At boot I got the following warning:
>>>
>>> PCI: Using ACPI for IRQ routing
>>> PCI: pci_cache_line_size set to 64 bytes
>>> pci 0000:00:00.0: address space collision: [mem 0xe0000000-0xefffffff pref]
>>> conflicts with GART [mem 0xe0000000-0xefffffff]
>>> reserve RAM buffer: 000000000009fc00 - 000000000009ffff
>>> reserve RAM buffer: 000000003ffb0000 - 000000003fffffff
>>> WARNING: kmemcheck: Caught 64-bit read from uninitialized memory
>>> (ffff88003e170eb0)
>>> 0000000000000000010000000000000000000000000000000000000000000000
>>> i i i i i i i i i i i i u u u u u u u u u u u u u u u u u u u u
>>> ^
>>>
>>> Pid: 1, comm: swapper Not tainted 2.6.38-rc1 #2 K8 Combo-Z/K8 Combo-Z
>>> RIP: 0010:[<ffffffff8127ad72>] [<ffffffff8127ad72>] memmove+0x122/0x1a0
>>> RSP: 0018:ffff88003e0b3c60 EFLAGS: 00010202
>>> RAX: ffff88003e170080 RBX: ffff88003e27b500 RCX: 0000000000000020
>>> RDX: 0000000000000018 RSI: ffff88003e170ea0 RDI: ffff88003e1700a0
>>> RBP: ffff88003e0b3c60 R08: 0000000000000001 R09: 0000000000000001
>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>>> R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000001
>>> FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
>>> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> CR2: ffff88003e018abc CR3: 0000000001a1c000 CR4: 00000000000006f0
>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
>>> [<ffffffff814741c2>] pskb_expand_head+0xc2/0x2a0
>>> [<ffffffff81498fa7>] netlink_broadcast_filtered+0xa7/0x4a0
>>> [<ffffffff814993b8>] netlink_broadcast+0x18/0x20
>>> [<ffffffff8149b884>] genlmsg_mcast+0x144/0x180
>>> [<ffffffff8149bc4a>] genl_ctrl_event+0xca/0x450
>>> [<ffffffff8149c75d>] genl_register_mc_group+0x10d/0x2a0
>>> [<ffffffff81ad9da4>] genl_init+0x6c/0x84
>>> [<ffffffff810001de>] do_one_initcall+0x3e/0x170
>>> [<ffffffff81aae6ea>] kernel_init+0x197/0x21b
>>> [<ffffffff81003254>] kernel_thread_helper+0x4/0x10
>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>> pnp: PnP ACPI init
>>> ACPI: bus type pnp registered
>>> pnp 00:00: [bus 00-ff]
>>> pnp 00:00: [io 0x0cf8-0x0cff]
>>>
>>> This is specific to 2.6.38-rc1.
>>>
> Likely a false positive after commit ca44ac38
> (net: don't reallocate skb->head unless the current one hasn't the
> needed extra size or is shared)
>
> ksize() allows us to use a bit more than what was asked at kmalloc()
> time, because of discrete kmem caches sizes.
>
> We probably need to instruct kmemcheck of this.
It actually looks like a bug in SLUB+kmemcheck. The
kmemcheck_slab_alloc() call in slab_post_alloc_hook() should use ksize()
instead of s->objsize. SLAB seems to do the right thing already. Anyone
care to send a patch my way?
Pekka
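Eric's remark about discrete cache sizes can be sketched as follows. This is a toy model only: the size-class table and the `toy_ksize()` and `toy_slack()` helpers are hypothetical, and the real kmalloc cache sizes depend on the allocator and kernel configuration.

```c
#include <stddef.h>

/* Illustrative size classes only; not the kernel's actual cache sizes. */
static const size_t toy_classes[] = { 8, 16, 32, 64, 96, 128, 192, 256 };

/* Usable size for a request, or 0 if it does not fit the toy table. */
static size_t toy_ksize(size_t requested)
{
	size_t i;

	for (i = 0; i < sizeof(toy_classes) / sizeof(toy_classes[0]); i++)
		if (requested <= toy_classes[i])
			return toy_classes[i];
	return 0;
}

/* Bytes the caller may use beyond what it originally asked for. */
static size_t toy_slack(size_t requested)
{
	size_t usable = toy_ksize(requested);

	return usable ? usable - requested : 0;
}
```

A checker that tracks a different size than the one the caller is allowed to use would disagree with the caller about the slack bytes, which appears to be the kind of mismatch discussed above.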
* [PATCH 0/4] Convert PPP to frag lists and SKB list accessors
From: David Miller @ 2011-01-21 7:55 UTC
To: netdev; +Cc: paulus
Thanks to some stellar testing and debugging work by Paul, my PPP
SKB list handling changes are in a working state and now checked into
net-next-2.6
There is still a stray "head->next" access in ppp_generic.c but this
is a huge step in the right direction.
I'm posting the final version of this stuff so people can see what ended up
in the tree after Paul fixed the bugs.
* [PATCH 1/4] ppp: Clean up kernel log messages.
From: David Miller @ 2011-01-21 7:56 UTC
To: netdev; +Cc: paulus
Use netdev_*() and pr_*().
To preserve existing semantics in cases where KERN_DEBUG is indeed
appropriate, use netdev_printk(KERN_DEBUG, ...)
Convert PPPIOCDETACH to pr_warn() because an unexpected file count is
a serious bug and should be logged with KERN_WARNING.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
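The practical gain of the netdev_*() helpers over bare printk() is that each message identifies the device it concerns. A minimal userspace sketch of that idea follows. It is an assumption-laden illustration: `struct toy_netdev` and `toy_netdev_msg()` are hypothetical, and the real netdev_err() prefix contains more than the interface name and is not reproduced here.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for struct net_device; only the name is modeled. */
struct toy_netdev {
	const char *name;
};

/*
 * Format "<name>: <message>" into buf. Returns the number of characters
 * that the full message needs, or a negative value on a formatting error.
 */
static int toy_netdev_msg(char *buf, size_t size,
			  const struct toy_netdev *dev, const char *fmt, ...)
{
	va_list ap;
	int n, m;

	n = snprintf(buf, size, "%s: ", dev->name);
	if (n < 0 || (size_t)n >= size)
		return n;

	va_start(ap, fmt);
	m = vsnprintf(buf + n, size - (size_t)n, fmt, ap);
	va_end(ap);

	return m < 0 ? m : n + m;
}
```

With two PPP units active, a message such as "no memory (VJ compressor)" then shows which unit ran out of memory.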
drivers/net/ppp_generic.c | 86 +++++++++++++++++++++++++-------------------
1 files changed, 49 insertions(+), 37 deletions(-)
diff --git a/drivers/net/ppp_generic.c b/drivers/net/ppp_generic.c
index c7a6c44..3d7a38e 100644
--- a/drivers/net/ppp_generic.c
+++ b/drivers/net/ppp_generic.c
@@ -592,8 +592,8 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
ppp_release(NULL, file);
err = 0;
} else
- printk(KERN_DEBUG "PPPIOCDETACH file->f_count=%ld\n",
- atomic_long_read(&file->f_count));
+ pr_warn("PPPIOCDETACH file->f_count=%ld\n",
+ atomic_long_read(&file->f_count));
mutex_unlock(&ppp_mutex);
return err;
}
@@ -630,7 +630,7 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
if (pf->kind != INTERFACE) {
/* can't happen */
- printk(KERN_ERR "PPP: not interface or channel??\n");
+ pr_err("PPP: not interface or channel??\n");
return -EINVAL;
}
@@ -704,7 +704,8 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
}
vj = slhc_init(val2+1, val+1);
if (!vj) {
- printk(KERN_ERR "PPP: no memory (VJ compressor)\n");
+ netdev_err(ppp->dev,
+ "PPP: no memory (VJ compressor)\n");
err = -ENOMEM;
break;
}
@@ -898,17 +899,17 @@ static int __init ppp_init(void)
{
int err;
- printk(KERN_INFO "PPP generic driver version " PPP_VERSION "\n");
+ pr_info("PPP generic driver version " PPP_VERSION "\n");
err = register_pernet_device(&ppp_net_ops);
if (err) {
- printk(KERN_ERR "failed to register PPP pernet device (%d)\n", err);
+ pr_err("failed to register PPP pernet device (%d)\n", err);
goto out;
}
err = register_chrdev(PPP_MAJOR, "ppp", &ppp_device_fops);
if (err) {
- printk(KERN_ERR "failed to register PPP device (%d)\n", err);
+ pr_err("failed to register PPP device (%d)\n", err);
goto out_net;
}
@@ -1078,7 +1079,7 @@ pad_compress_skb(struct ppp *ppp, struct sk_buff *skb)
new_skb = alloc_skb(new_skb_size, GFP_ATOMIC);
if (!new_skb) {
if (net_ratelimit())
- printk(KERN_ERR "PPP: no memory (comp pkt)\n");
+ netdev_err(ppp->dev, "PPP: no memory (comp pkt)\n");
return NULL;
}
if (ppp->dev->hard_header_len > PPP_HDRLEN)
@@ -1108,7 +1109,7 @@ pad_compress_skb(struct ppp *ppp, struct sk_buff *skb)
* the same number.
*/
if (net_ratelimit())
- printk(KERN_ERR "ppp: compressor dropped pkt\n");
+ netdev_err(ppp->dev, "ppp: compressor dropped pkt\n");
kfree_skb(skb);
kfree_skb(new_skb);
new_skb = NULL;
@@ -1138,7 +1139,9 @@ ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
if (ppp->pass_filter &&
sk_run_filter(skb, ppp->pass_filter) == 0) {
if (ppp->debug & 1)
- printk(KERN_DEBUG "PPP: outbound frame not passed\n");
+ netdev_printk(KERN_DEBUG, ppp->dev,
+ "PPP: outbound frame "
+ "not passed\n");
kfree_skb(skb);
return;
}
@@ -1164,7 +1167,7 @@ ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
new_skb = alloc_skb(skb->len + ppp->dev->hard_header_len - 2,
GFP_ATOMIC);
if (!new_skb) {
- printk(KERN_ERR "PPP: no memory (VJ comp pkt)\n");
+ netdev_err(ppp->dev, "PPP: no memory (VJ comp pkt)\n");
goto drop;
}
skb_reserve(new_skb, ppp->dev->hard_header_len - 2);
@@ -1202,7 +1205,9 @@ ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
proto != PPP_LCP && proto != PPP_CCP) {
if (!(ppp->flags & SC_CCP_UP) && (ppp->flags & SC_MUST_COMP)) {
if (net_ratelimit())
- printk(KERN_ERR "ppp: compression required but down - pkt dropped.\n");
+ netdev_err(ppp->dev,
+ "ppp: compression required but "
+ "down - pkt dropped.\n");
goto drop;
}
skb = pad_compress_skb(ppp, skb);
@@ -1505,7 +1510,7 @@ static int ppp_mp_explode(struct ppp *ppp, struct sk_buff *skb)
noskb:
spin_unlock_bh(&pch->downl);
if (ppp->debug & 1)
- printk(KERN_ERR "PPP: no memory (fragment)\n");
+ netdev_err(ppp->dev, "PPP: no memory (fragment)\n");
++ppp->dev->stats.tx_errors;
++ppp->nxseq;
return 1; /* abandon the frame */
@@ -1686,7 +1691,8 @@ ppp_receive_nonmp_frame(struct ppp *ppp, struct sk_buff *skb)
/* copy to a new sk_buff with more tailroom */
ns = dev_alloc_skb(skb->len + 128);
if (!ns) {
- printk(KERN_ERR"PPP: no memory (VJ decomp)\n");
+ netdev_err(ppp->dev, "PPP: no memory "
+ "(VJ decomp)\n");
goto err;
}
skb_reserve(ns, 2);
@@ -1699,7 +1705,8 @@ ppp_receive_nonmp_frame(struct ppp *ppp, struct sk_buff *skb)
len = slhc_uncompress(ppp->vj, skb->data + 2, skb->len - 2);
if (len <= 0) {
- printk(KERN_DEBUG "PPP: VJ decompression error\n");
+ netdev_printk(KERN_DEBUG, ppp->dev,
+ "PPP: VJ decompression error\n");
goto err;
}
len += 2;
@@ -1721,7 +1728,7 @@ ppp_receive_nonmp_frame(struct ppp *ppp, struct sk_buff *skb)
goto err;
if (slhc_remember(ppp->vj, skb->data + 2, skb->len - 2) <= 0) {
- printk(KERN_ERR "PPP: VJ uncompressed error\n");
+ netdev_err(ppp->dev, "PPP: VJ uncompressed error\n");
goto err;
}
proto = PPP_IP;
@@ -1762,8 +1769,9 @@ ppp_receive_nonmp_frame(struct ppp *ppp, struct sk_buff *skb)
if (ppp->pass_filter &&
sk_run_filter(skb, ppp->pass_filter) == 0) {
if (ppp->debug & 1)
- printk(KERN_DEBUG "PPP: inbound frame "
- "not passed\n");
+ netdev_printk(KERN_DEBUG, ppp->dev,
+ "PPP: inbound frame "
+ "not passed\n");
kfree_skb(skb);
return;
}
@@ -1821,7 +1829,8 @@ ppp_decompress_frame(struct ppp *ppp, struct sk_buff *skb)
ns = dev_alloc_skb(obuff_size);
if (!ns) {
- printk(KERN_ERR "ppp_decompress_frame: no memory\n");
+ netdev_err(ppp->dev, "ppp_decompress_frame: "
+ "no memory\n");
goto err;
}
/* the decompressor still expects the A/C bytes in the hdr */
@@ -2002,8 +2011,9 @@ ppp_mp_reconstruct(struct ppp *ppp)
next = p->next;
if (seq_before(PPP_MP_CB(p)->sequence, seq)) {
/* this can't happen, anyway ignore the skb */
- printk(KERN_ERR "ppp_mp_reconstruct bad seq %u < %u\n",
- PPP_MP_CB(p)->sequence, seq);
+ netdev_err(ppp->dev, "ppp_mp_reconstruct bad "
+ "seq %u < %u\n",
+ PPP_MP_CB(p)->sequence, seq);
head = next;
continue;
}
@@ -2042,8 +2052,9 @@ ppp_mp_reconstruct(struct ppp *ppp)
(PPP_MP_CB(head)->BEbits & B)) {
if (len > ppp->mrru + 2) {
++ppp->dev->stats.rx_length_errors;
- printk(KERN_DEBUG "PPP: reconstructed packet"
- " is too long (%d)\n", len);
+ netdev_printk(KERN_DEBUG, ppp->dev,
+ "PPP: reconstructed packet"
+ " is too long (%d)\n", len);
} else if (p == head) {
/* fragment is complete packet - reuse skb */
tail = p;
@@ -2051,8 +2062,9 @@ ppp_mp_reconstruct(struct ppp *ppp)
break;
} else if ((skb = dev_alloc_skb(len)) == NULL) {
++ppp->dev->stats.rx_missed_errors;
- printk(KERN_DEBUG "PPP: no memory for "
- "reconstructed packet");
+ netdev_printk(KERN_DEBUG, ppp->dev,
+ "PPP: no memory for "
+ "reconstructed packet");
} else {
tail = p;
break;
@@ -2077,9 +2089,10 @@ ppp_mp_reconstruct(struct ppp *ppp)
signal a receive error. */
if (PPP_MP_CB(head)->sequence != ppp->nextseq) {
if (ppp->debug & 1)
- printk(KERN_DEBUG " missed pkts %u..%u\n",
- ppp->nextseq,
- PPP_MP_CB(head)->sequence-1);
+ netdev_printk(KERN_DEBUG, ppp->dev,
+ " missed pkts %u..%u\n",
+ ppp->nextseq,
+ PPP_MP_CB(head)->sequence-1);
++ppp->dev->stats.rx_dropped;
ppp_receive_error(ppp);
}
@@ -2617,8 +2630,8 @@ ppp_create_interface(struct net *net, int unit, int *retp)
ret = register_netdev(dev);
if (ret != 0) {
unit_put(&pn->units_idr, unit);
- printk(KERN_ERR "PPP: couldn't register device %s (%d)\n",
- dev->name, ret);
+ netdev_err(ppp->dev, "PPP: couldn't register device %s (%d)\n",
+ dev->name, ret);
goto out2;
}
@@ -2690,9 +2703,9 @@ static void ppp_destroy_interface(struct ppp *ppp)
if (!ppp->file.dead || ppp->n_channels) {
/* "can't happen" */
- printk(KERN_ERR "ppp: destroying ppp struct %p but dead=%d "
- "n_channels=%d !\n", ppp, ppp->file.dead,
- ppp->n_channels);
+ netdev_err(ppp->dev, "ppp: destroying ppp struct %p "
+ "but dead=%d n_channels=%d !\n",
+ ppp, ppp->file.dead, ppp->n_channels);
return;
}
@@ -2834,8 +2847,7 @@ static void ppp_destroy_channel(struct channel *pch)
if (!pch->file.dead) {
/* "can't happen" */
- printk(KERN_ERR "ppp: destroying undead channel %p !\n",
- pch);
+ pr_err("ppp: destroying undead channel %p !\n", pch);
return;
}
skb_queue_purge(&pch->file.xq);
@@ -2847,7 +2859,7 @@ static void __exit ppp_cleanup(void)
{
/* should never happen */
if (atomic_read(&ppp_unit_count) || atomic_read(&channel_count))
- printk(KERN_ERR "PPP: removing module but units remain!\n");
+ pr_err("PPP: removing module but units remain!\n");
unregister_chrdev(PPP_MAJOR, "ppp");
device_destroy(ppp_class, MKDEV(PPP_MAJOR, 0));
class_destroy(ppp_class);
@@ -2865,7 +2877,7 @@ static int __unit_alloc(struct idr *p, void *ptr, int n)
again:
if (!idr_pre_get(p, GFP_KERNEL)) {
- printk(KERN_ERR "PPP: No free memory for idr\n");
+ pr_err("PPP: No free memory for idr\n");
return -ENOMEM;
}
--
1.7.3.4
* [PATCH 2/4] ppp: Reconstruct fragmented packets using frag lists instead of copying.
From: David Miller @ 2011-01-21 7:56 UTC
To: netdev; +Cc: paulus
[paulus@samba.org: fixed a couple of bugs]
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
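The idea of chaining fragments instead of copying them can be sketched with a toy buffer type. This is a simplified userspace model, not the kernel code: `struct toy_buf` and `toy_reconstruct()` are hypothetical, the list is singly linked rather than an sk_buff queue, and truesize accounting is left out.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct sk_buff. */
struct toy_buf {
	struct toy_buf *next;       /* next buffer in the queue or chain */
	struct toy_buf *frag_list;  /* fragments hanging off a head buffer */
	unsigned int len;           /* total bytes, fragments included */
	unsigned int data_len;      /* bytes held in fragments */
};

/*
 * Turn head..tail (inclusive) into one packet: head keeps its own data
 * and every later buffer is linked onto head->frag_list. No payload is
 * copied. Returns the first buffer after tail, or NULL if there is none.
 */
static struct toy_buf *toy_reconstruct(struct toy_buf *head,
				       struct toy_buf *tail)
{
	struct toy_buf **fragpp = &head->frag_list;
	struct toy_buf *p = (head == tail) ? NULL : head->next;
	struct toy_buf *rest = tail->next;

	head->next = NULL;
	while (p) {
		struct toy_buf *next = (p == tail) ? NULL : p->next;

		*fragpp = p;
		p->next = NULL;
		fragpp = &p->next;

		head->len += p->len;
		head->data_len += p->len;

		p = next;
	}
	return rest;
}
```

The cost is proportional to the number of fragments rather than to the number of payload bytes, and no new buffer has to be allocated for the reassembled packet.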
drivers/net/ppp_generic.c | 39 +++++++++++++++++++++++----------------
1 files changed, 23 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ppp_generic.c b/drivers/net/ppp_generic.c
index 3d7a38e..1d4fb34 100644
--- a/drivers/net/ppp_generic.c
+++ b/drivers/net/ppp_generic.c
@@ -2055,16 +2055,6 @@ ppp_mp_reconstruct(struct ppp *ppp)
netdev_printk(KERN_DEBUG, ppp->dev,
"PPP: reconstructed packet"
" is too long (%d)\n", len);
- } else if (p == head) {
- /* fragment is complete packet - reuse skb */
- tail = p;
- skb = skb_get(p);
- break;
- } else if ((skb = dev_alloc_skb(len)) == NULL) {
- ++ppp->dev->stats.rx_missed_errors;
- netdev_printk(KERN_DEBUG, ppp->dev,
- "PPP: no memory for "
- "reconstructed packet");
} else {
tail = p;
break;
@@ -2097,16 +2087,33 @@ ppp_mp_reconstruct(struct ppp *ppp)
ppp_receive_error(ppp);
}
- if (head != tail)
- /* copy to a single skb */
- for (p = head; p != tail->next; p = p->next)
- skb_copy_bits(p, 0, skb_put(skb, p->len), p->len);
+ skb = head;
+ if (head != tail) {
+ struct sk_buff **fragpp = &skb_shinfo(skb)->frag_list;
+ p = skb_queue_next(list, head);
+ __skb_unlink(skb, list);
+ skb_queue_walk_from_safe(list, p, tmp) {
+ __skb_unlink(p, list);
+ *fragpp = p;
+ p->next = NULL;
+ fragpp = &p->next;
+
+ skb->len += p->len;
+ skb->data_len += p->len;
+ skb->truesize += p->len;
+
+ if (p == tail)
+ break;
+ }
+ } else {
+ __skb_unlink(skb, list);
+ }
+
ppp->nextseq = PPP_MP_CB(tail)->sequence + 1;
head = tail->next;
}
- /* Discard all the skbuffs that we have copied the data out of
- or that we can't use. */
+ /* Discard all the skbuffs that we can't use. */
while ((p = list->next) != head) {
__skb_unlink(p, list);
kfree_skb(p);
--
1.7.3.4
* [PATCH 3/4] net: Add safe reverse SKB queue walkers.
From: David Miller @ 2011-01-21 7:56 UTC
To: netdev; +Cc: paulus
Signed-off-by: David S. Miller <davem@davemloft.net>
---
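What "safe" means for a reverse walker can be sketched on a toy circular list. This is a minimal userspace illustration: `struct toy_node`, `toy_reverse_walk_safe()` and the helpers are hypothetical names that only imitate the shape of the macros added by this patch.

```c
/* Hypothetical circular doubly linked list with a sentinel head node. */
struct toy_node {
	struct toy_node *next, *prev;
	int val;
};

/* Save the predecessor in tmp before the body runs, so the body may unlink pos. */
#define toy_reverse_walk_safe(head, pos, tmp)			\
	for (pos = (head)->prev, tmp = pos->prev;		\
	     pos != (head);					\
	     pos = tmp, tmp = pos->prev)

static void toy_init(struct toy_node *head)
{
	head->next = head->prev = head;
}

static void toy_add_tail(struct toy_node *head, struct toy_node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void toy_unlink(struct toy_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = n;  /* pos->prev is now useless for iteration */
}

/* Walk from tail to head, unlinking every node with an odd value. */
static int toy_remove_odd(struct toy_node *head)
{
	struct toy_node *pos, *tmp;
	int removed = 0;

	toy_reverse_walk_safe(head, pos, tmp) {
		if (pos->val & 1) {
			toy_unlink(pos);
			removed++;
		}
	}
	return removed;
}
```

A non-safe walker would read `pos->prev` after the unlink and, in this toy, loop on the removed node forever.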
include/linux/skbuff.h | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index bf221d6..6e946da 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1801,6 +1801,15 @@ static inline int pskb_trim_rcsum(struct sk_buff *skb, unsigned int len)
prefetch(skb->prev), (skb != (struct sk_buff *)(queue)); \
skb = skb->prev)
+#define skb_queue_reverse_walk_safe(queue, skb, tmp) \
+ for (skb = (queue)->prev, tmp = skb->prev; \
+ skb != (struct sk_buff *)(queue); \
+ skb = tmp, tmp = skb->prev)
+
+#define skb_queue_reverse_walk_from_safe(queue, skb, tmp) \
+ for (tmp = skb->prev; \
+ skb != (struct sk_buff *)(queue); \
+ skb = tmp, tmp = skb->prev)
static inline bool skb_has_frag_list(const struct sk_buff *skb)
{
--
1.7.3.4
* [PATCH v4] net: add Faraday FTMAC100 10/100 Ethernet driver
From: Po-Yu Chuang @ 2011-01-21 7:55 UTC
To: netdev
Cc: linux-kernel, bhutchings, eric.dumazet, joe, dilinger, mirqus,
Po-Yu Chuang
In-Reply-To: <1295537418-2057-1-git-send-email-ratbert.chuang@gmail.com>
From: Po-Yu Chuang <ratbert@faraday-tech.com>
FTMAC100 Ethernet Media Access Controller supports 10/100 Mbps and
MII. This driver has been working on some ARM/NDS32 SoC's including
Faraday A320 and Andes AG101.
Signed-off-by: Po-Yu Chuang <ratbert@faraday-tech.com>
---
v2:
always use NAPI
do not use our own net_device_stats structure
don't set trans_start and last_rx
stats.rx_packets and stats.rx_bytes include dropped packets
add missed netif_napi_del()
initialize spinlocks in probe function
remove rx_lock and hw_lock
use netdev_[err/info/dbg] instead of dev_* ones
use netdev_alloc_skb_ip_align()
remove ftmac100_get_stats()
use is_valid_ether_addr() instead of is_zero_ether_addr()
add const to ftmac100_ethtool_ops and ftmac100_netdev_ops
use net_ratelimit() instead of printk_ratelimit()
no explicit inline
use %pM to print MAC address
add comment before wmb
use napi poll() to handle all interrupts
v3:
undo "stats.rx_packets and stats.rx_bytes include dropped packets"
ftmac100_mdio_read() returns 0 if error
fix comment typos
use pr_fmt and pr_info
define INT_MASK_ALL_ENABLED
define MACCR_ENABLE_ALL
do not count length error many times
use bool/true/false
use cpu_to_le32/le32_to_cpu to access descriptors
indent style fix
v4:
should not access skb after netif_receive_skb()
use resource_size()
better way to use cpu_to_le32/le32_to_cpu
use spin_lock() for tx_lock
combine all netdev_info() together in ftmac100_poll()
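The v4 item about cpu_to_le32/le32_to_cpu most likely refers to the pattern visible in the descriptor helpers below, where the constant mask is converted instead of the descriptor word. A minimal userspace sketch of why both forms give the same answer follows. The `toy_*` helpers and the bit position of `TOY_RXDES0_FRS` are assumptions made for the example, not the kernel's implementations or the hardware's layout.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_RXDES0_FRS (1u << 29)  /* hypothetical bit position */

/* Return a value whose in-memory bytes are v in little-endian order. */
static uint32_t toy_cpu_to_le32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)(v & 0xff), (uint8_t)((v >> 8) & 0xff),
		(uint8_t)((v >> 16) & 0xff), (uint8_t)((v >> 24) & 0xff)
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Interpret the in-memory bytes of v as a little-endian number. */
static uint32_t toy_le32_to_cpu(uint32_t v)
{
	uint8_t b[4];

	memcpy(b, &v, sizeof(b));
	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* Variant A: convert the descriptor word, then test the flag. */
static bool toy_first_segment_a(uint32_t rxdes0_le)
{
	return toy_le32_to_cpu(rxdes0_le) & TOY_RXDES0_FRS;
}

/* Variant B: convert the constant mask, leave the descriptor untouched. */
static bool toy_first_segment_b(uint32_t rxdes0_le)
{
	return rxdes0_le & toy_cpu_to_le32(TOY_RXDES0_FRS);
}
```

Variant B is attractive because a conversion applied to a constant can be evaluated at compile time, so a big-endian CPU would not need to swap the descriptor word at run time.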
drivers/net/Kconfig | 9 +
drivers/net/Makefile | 1 +
drivers/net/ftmac100.c | 1207 ++++++++++++++++++++++++++++++++++++++++++++++++
drivers/net/ftmac100.h | 180 +++++++
4 files changed, 1397 insertions(+), 0 deletions(-)
create mode 100644 drivers/net/ftmac100.c
create mode 100644 drivers/net/ftmac100.h
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 4f1755b..26da0ee 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2014,6 +2014,15 @@ config BCM63XX_ENET
This driver supports the ethernet MACs in the Broadcom 63xx
MIPS chipset family (BCM63XX).
+config FTMAC100
+ tristate "Faraday FTMAC100 10/100 Ethernet support"
+ depends on ARM
+ select MII
+ help
+ This driver supports the FTMAC100 Ethernet controller from
+ Faraday. It is used on Faraday A320, Andes AG101, AG101P
+ and some other ARM/NDS32 SoC's.
+
source "drivers/net/fs_enet/Kconfig"
source "drivers/net/octeon/Kconfig"
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index b90738d..7c21711 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -147,6 +147,7 @@ obj-$(CONFIG_FORCEDETH) += forcedeth.o
obj-$(CONFIG_NE_H8300) += ne-h8300.o 8390.o
obj-$(CONFIG_AX88796) += ax88796.o
obj-$(CONFIG_BCM63XX_ENET) += bcm63xx_enet.o
+obj-$(CONFIG_FTMAC100) += ftmac100.o
obj-$(CONFIG_TSI108_ETH) += tsi108_eth.o
obj-$(CONFIG_MV643XX_ETH) += mv643xx_eth.o
diff --git a/drivers/net/ftmac100.c b/drivers/net/ftmac100.c
new file mode 100644
index 0000000..58b2d5f
--- /dev/null
+++ b/drivers/net/ftmac100.c
@@ -0,0 +1,1207 @@
+/*
+ * Faraday FTMAC100 10/100 Ethernet
+ *
+ * (C) Copyright 2009-2011 Faraday Technology
+ * Po-Yu Chuang <ratbert@faraday-tech.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/dma-mapping.h>
+#include <linux/etherdevice.h>
+#include <linux/ethtool.h>
+#include <linux/init.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/mii.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/netdevice.h>
+#include <linux/platform_device.h>
+
+#include "ftmac100.h"
+
+#define DRV_NAME "ftmac100"
+#define DRV_VERSION "0.2"
+
+#define RX_QUEUE_ENTRIES 128 /* must be power of 2 */
+#define TX_QUEUE_ENTRIES 16 /* must be power of 2 */
+
+#define MAX_PKT_SIZE 1518
+#define RX_BUF_SIZE 2044 /* must be smaller than 0x7ff */
+
+/******************************************************************************
+ * private data
+ *****************************************************************************/
+struct ftmac100_descs {
+ struct ftmac100_rxdes rxdes[RX_QUEUE_ENTRIES];
+ struct ftmac100_txdes txdes[TX_QUEUE_ENTRIES];
+};
+
+struct ftmac100 {
+ struct resource *res;
+ void *base;
+ int irq;
+
+ struct ftmac100_descs *descs;
+ dma_addr_t descs_dma_addr;
+
+ unsigned int rx_pointer;
+ unsigned int tx_clean_pointer;
+ unsigned int tx_pointer;
+ unsigned int tx_pending;
+
+ spinlock_t tx_lock;
+
+ struct net_device *netdev;
+ struct device *dev;
+ struct napi_struct napi;
+
+ struct mii_if_info mii;
+};
+
+/******************************************************************************
+ * internal functions (hardware register access)
+ *****************************************************************************/
+#define INT_MASK_ALL_ENABLED (FTMAC100_INT_RPKT_FINISH | \
+ FTMAC100_INT_NORXBUF | \
+ FTMAC100_INT_XPKT_OK | \
+ FTMAC100_INT_XPKT_LOST | \
+ FTMAC100_INT_RPKT_LOST | \
+ FTMAC100_INT_AHB_ERR | \
+ FTMAC100_INT_PHYSTS_CHG)
+
+static void ftmac100_enable_all_int(struct ftmac100 *priv)
+{
+ iowrite32(INT_MASK_ALL_ENABLED, priv->base + FTMAC100_OFFSET_IMR);
+}
+
+static void ftmac100_disable_all_int(struct ftmac100 *priv)
+{
+ iowrite32(0, priv->base + FTMAC100_OFFSET_IMR);
+}
+
+static void ftmac100_set_rx_ring_base(struct ftmac100 *priv, dma_addr_t addr)
+{
+ iowrite32(addr, priv->base + FTMAC100_OFFSET_RXR_BADR);
+}
+
+static void ftmac100_set_tx_ring_base(struct ftmac100 *priv, dma_addr_t addr)
+{
+ iowrite32(addr, priv->base + FTMAC100_OFFSET_TXR_BADR);
+}
+
+static void ftmac100_txdma_start_polling(struct ftmac100 *priv)
+{
+ iowrite32(1, priv->base + FTMAC100_OFFSET_TXPD);
+}
+
+static int ftmac100_reset(struct ftmac100 *priv)
+{
+ struct net_device *netdev = priv->netdev;
+ int i;
+
+ /* NOTE: reset clears all registers */
+ iowrite32(FTMAC100_MACCR_SW_RST, priv->base + FTMAC100_OFFSET_MACCR);
+
+ for (i = 0; i < 5; i++) {
+ unsigned int maccr;
+
+ maccr = ioread32(priv->base + FTMAC100_OFFSET_MACCR);
+ if (!(maccr & FTMAC100_MACCR_SW_RST)) {
+ /*
+ * FTMAC100_MACCR_SW_RST cleared does not indicate
+ * that hardware reset completed (what the f*ck).
+ * We still need to wait for a while.
+ */
+ usleep_range(500, 1000);
+ return 0;
+ }
+
+ usleep_range(1000, 10000);
+ }
+
+ netdev_err(netdev, "software reset failed\n");
+ return -EIO;
+}
+
+static void ftmac100_set_mac(struct ftmac100 *priv, const unsigned char *mac)
+{
+ unsigned int maddr = mac[0] << 8 | mac[1];
+ unsigned int laddr = mac[2] << 24 | mac[3] << 16 | mac[4] << 8 | mac[5];
+
+ iowrite32(maddr, priv->base + FTMAC100_OFFSET_MAC_MADR);
+ iowrite32(laddr, priv->base + FTMAC100_OFFSET_MAC_LADR);
+}
+
+#define MACCR_ENABLE_ALL (FTMAC100_MACCR_XMT_EN | \
+ FTMAC100_MACCR_RCV_EN | \
+ FTMAC100_MACCR_XDMA_EN | \
+ FTMAC100_MACCR_RDMA_EN | \
+ FTMAC100_MACCR_CRC_APD | \
+ FTMAC100_MACCR_FULLDUP | \
+ FTMAC100_MACCR_RX_RUNT | \
+ FTMAC100_MACCR_RX_BROADPKT)
+
+static int ftmac100_start_hw(struct ftmac100 *priv)
+{
+ struct net_device *netdev = priv->netdev;
+
+ if (ftmac100_reset(priv))
+ return -EIO;
+
+ /* setup ring buffer base registers */
+
+ ftmac100_set_rx_ring_base(priv,
+ priv->descs_dma_addr +
+ offsetof(struct ftmac100_descs, rxdes));
+ ftmac100_set_tx_ring_base(priv,
+ priv->descs_dma_addr +
+ offsetof(struct ftmac100_descs, txdes));
+
+ iowrite32(FTMAC100_APTC_RXPOLL_CNT(1), priv->base + FTMAC100_OFFSET_APTC);
+
+ ftmac100_set_mac(priv, netdev->dev_addr);
+
+ iowrite32(MACCR_ENABLE_ALL, priv->base + FTMAC100_OFFSET_MACCR);
+ return 0;
+}
+
+static void ftmac100_stop_hw(struct ftmac100 *priv)
+{
+ iowrite32(0, priv->base + FTMAC100_OFFSET_MACCR);
+}
+
+/******************************************************************************
+ * internal functions (receive descriptor)
+ *****************************************************************************/
+static bool ftmac100_rxdes_first_segment(struct ftmac100_rxdes *rxdes)
+{
+ return rxdes->rxdes0 & cpu_to_le32(FTMAC100_RXDES0_FRS);
+}
+
+static bool ftmac100_rxdes_last_segment(struct ftmac100_rxdes *rxdes)
+{
+ return rxdes->rxdes0 & cpu_to_le32(FTMAC100_RXDES0_LRS);
+}
+
+static bool ftmac100_rxdes_owned_by_dma(struct ftmac100_rxdes *rxdes)
+{
+ return rxdes->rxdes0 & cpu_to_le32(FTMAC100_RXDES0_RXDMA_OWN);
+}
+
+static void ftmac100_rxdes_set_dma_own(struct ftmac100_rxdes *rxdes)
+{
+ /* clear status bits */
+ rxdes->rxdes0 = cpu_to_le32(FTMAC100_RXDES0_RXDMA_OWN);
+}
+
+static bool ftmac100_rxdes_rx_error(struct ftmac100_rxdes *rxdes)
+{
+ return rxdes->rxdes0 & cpu_to_le32(FTMAC100_RXDES0_RX_ERR);
+}
+
+static bool ftmac100_rxdes_crc_error(struct ftmac100_rxdes *rxdes)
+{
+ return rxdes->rxdes0 & cpu_to_le32(FTMAC100_RXDES0_CRC_ERR);
+}
+
+static bool ftmac100_rxdes_frame_too_long(struct ftmac100_rxdes *rxdes)
+{
+ return rxdes->rxdes0 & cpu_to_le32(FTMAC100_RXDES0_FTL);
+}
+
+static bool ftmac100_rxdes_runt(struct ftmac100_rxdes *rxdes)
+{
+ return rxdes->rxdes0 & cpu_to_le32(FTMAC100_RXDES0_RUNT);
+}
+
+static bool ftmac100_rxdes_odd_nibble(struct ftmac100_rxdes *rxdes)
+{
+ return rxdes->rxdes0 & cpu_to_le32(FTMAC100_RXDES0_RX_ODD_NB);
+}
+
+static unsigned int ftmac100_rxdes_frame_length(struct ftmac100_rxdes *rxdes)
+{
+ return le32_to_cpu(rxdes->rxdes0) & FTMAC100_RXDES0_RFL;
+}
+
+static bool ftmac100_rxdes_multicast(struct ftmac100_rxdes *rxdes)
+{
+ return rxdes->rxdes0 & cpu_to_le32(FTMAC100_RXDES0_MULTICAST);
+}
+
+static void ftmac100_rxdes_set_buffer_size(struct ftmac100_rxdes *rxdes,
+ unsigned int size)
+{
+ rxdes->rxdes1 &= cpu_to_le32(FTMAC100_RXDES1_EDORR);
+ rxdes->rxdes1 |= cpu_to_le32(FTMAC100_RXDES1_RXBUF_SIZE(size));
+}
+
+static void ftmac100_rxdes_set_end_of_ring(struct ftmac100_rxdes *rxdes)
+{
+ rxdes->rxdes1 |= cpu_to_le32(FTMAC100_RXDES1_EDORR);
+}
+
+static void ftmac100_rxdes_set_dma_addr(struct ftmac100_rxdes *rxdes,
+ dma_addr_t addr)
+{
+ rxdes->rxdes2 = cpu_to_le32(addr);
+}
+
+static dma_addr_t ftmac100_rxdes_get_dma_addr(struct ftmac100_rxdes *rxdes)
+{
+ return le32_to_cpu(rxdes->rxdes2);
+}
+
+/* rxdes3 is not used by hardware; we use it to track the buffer */
+static void ftmac100_rxdes_set_va(struct ftmac100_rxdes *rxdes, void *addr)
+{
+ rxdes->rxdes3 = cpu_to_le32(addr);
+}
+
+static void *ftmac100_rxdes_get_va(struct ftmac100_rxdes *rxdes)
+{
+ return (void *)le32_to_cpu(rxdes->rxdes3);
+}
+
+/******************************************************************************
+ * internal functions (receive)
+ *****************************************************************************/
+static int ftmac100_next_rx_pointer(int pointer)
+{
+ return (pointer + 1) & (RX_QUEUE_ENTRIES - 1);
+}
+
+static void ftmac100_rx_pointer_advance(struct ftmac100 *priv)
+{
+ priv->rx_pointer = ftmac100_next_rx_pointer(priv->rx_pointer);
+}
+
+static struct ftmac100_rxdes *ftmac100_current_rxdes(struct ftmac100 *priv)
+{
+ return &priv->descs->rxdes[priv->rx_pointer];
+}
+
+static struct ftmac100_rxdes *
+ftmac100_rx_locate_first_segment(struct ftmac100 *priv)
+{
+ struct ftmac100_rxdes *rxdes = ftmac100_current_rxdes(priv);
+
+ while (!ftmac100_rxdes_owned_by_dma(rxdes)) {
+ if (ftmac100_rxdes_first_segment(rxdes))
+ return rxdes;
+
+ ftmac100_rxdes_set_dma_own(rxdes);
+ ftmac100_rx_pointer_advance(priv);
+ rxdes = ftmac100_current_rxdes(priv);
+ }
+
+ return NULL;
+}
+
+static bool ftmac100_rx_packet_error(struct ftmac100 *priv,
+ struct ftmac100_rxdes *rxdes)
+{
+ struct net_device *netdev = priv->netdev;
+ bool error = false;
+
+ if (unlikely(ftmac100_rxdes_rx_error(rxdes))) {
+ if (net_ratelimit())
+ netdev_info(netdev, "rx err\n");
+
+ netdev->stats.rx_errors++;
+ error = true;
+ }
+
+ if (unlikely(ftmac100_rxdes_crc_error(rxdes))) {
+ if (net_ratelimit())
+ netdev_info(netdev, "rx crc err\n");
+
+ netdev->stats.rx_crc_errors++;
+ error = true;
+ }
+
+ if (unlikely(ftmac100_rxdes_frame_too_long(rxdes))) {
+ if (net_ratelimit())
+ netdev_info(netdev, "rx frame too long\n");
+
+ netdev->stats.rx_length_errors++;
+ error = true;
+ } else if (unlikely(ftmac100_rxdes_runt(rxdes))) {
+ if (net_ratelimit())
+ netdev_info(netdev, "rx runt\n");
+
+ netdev->stats.rx_length_errors++;
+ error = true;
+ } else if (unlikely(ftmac100_rxdes_odd_nibble(rxdes))) {
+ if (net_ratelimit())
+ netdev_info(netdev, "rx odd nibble\n");
+
+ netdev->stats.rx_length_errors++;
+ error = true;
+ }
+
+ return error;
+}
+
+static void ftmac100_rx_drop_packet(struct ftmac100 *priv)
+{
+ struct net_device *netdev = priv->netdev;
+ struct ftmac100_rxdes *rxdes = ftmac100_current_rxdes(priv);
+ bool done = false;
+
+ if (net_ratelimit())
+ netdev_dbg(netdev, "drop packet %p\n", rxdes);
+
+ do {
+ if (ftmac100_rxdes_last_segment(rxdes))
+ done = true;
+
+ ftmac100_rxdes_set_dma_own(rxdes);
+ ftmac100_rx_pointer_advance(priv);
+ rxdes = ftmac100_current_rxdes(priv);
+ } while (!done && !ftmac100_rxdes_owned_by_dma(rxdes));
+
+ netdev->stats.rx_dropped++;
+}
+
+static bool ftmac100_rx_packet(struct ftmac100 *priv, int *processed)
+{
+ struct net_device *netdev = priv->netdev;
+ struct ftmac100_rxdes *rxdes;
+ struct sk_buff *skb;
+ int length;
+ int copied = 0;
+ bool done = false;
+
+ rxdes = ftmac100_rx_locate_first_segment(priv);
+ if (!rxdes)
+ return false;
+
+ if (unlikely(ftmac100_rx_packet_error(priv, rxdes))) {
+ ftmac100_rx_drop_packet(priv);
+ return true;
+ }
+
+ /* start processing */
+
+ length = ftmac100_rxdes_frame_length(rxdes);
+
+ skb = netdev_alloc_skb_ip_align(netdev, length);
+ if (unlikely(!skb)) {
+ if (net_ratelimit())
+ netdev_err(netdev, "rx skb alloc failed\n");
+
+ ftmac100_rx_drop_packet(priv);
+ return true;
+ }
+
+ if (unlikely(ftmac100_rxdes_multicast(rxdes)))
+ netdev->stats.multicast++;
+
+ do {
+ dma_addr_t d = ftmac100_rxdes_get_dma_addr(rxdes);
+ void *buf = ftmac100_rxdes_get_va(rxdes);
+ int size;
+
+ size = min(length - copied, RX_BUF_SIZE);
+
+ dma_sync_single_for_cpu(priv->dev, d, RX_BUF_SIZE,
+ DMA_FROM_DEVICE);
+ memcpy(skb_put(skb, size), buf, size);
+
+ copied += size;
+
+ if (ftmac100_rxdes_last_segment(rxdes))
+ done = true;
+
+ dma_sync_single_for_device(priv->dev, d, RX_BUF_SIZE,
+ DMA_FROM_DEVICE);
+
+ ftmac100_rxdes_set_dma_own(rxdes);
+
+ ftmac100_rx_pointer_advance(priv);
+ rxdes = ftmac100_current_rxdes(priv);
+ } while (!done && copied < length);
+
+ skb->protocol = eth_type_trans(skb, netdev);
+
+ netdev->stats.rx_packets++;
+ netdev->stats.rx_bytes += skb->len;
+
+ /* push packet to protocol stack */
+ netif_receive_skb(skb);
+
+ (*processed)++;
+ return true;
+}
+
+/******************************************************************************
+ * internal functions (transmit descriptor)
+ *****************************************************************************/
+static void ftmac100_txdes_reset(struct ftmac100_txdes *txdes)
+{
+ /* clear all except end of ring bit */
+ txdes->txdes0 = 0;
+ txdes->txdes1 &= cpu_to_le32(FTMAC100_TXDES1_EDOTR);
+ txdes->txdes2 = 0;
+ txdes->txdes3 = 0;
+}
+
+static bool ftmac100_txdes_owned_by_dma(struct ftmac100_txdes *txdes)
+{
+ return txdes->txdes0 & cpu_to_le32(FTMAC100_TXDES0_TXDMA_OWN);
+}
+
+static void ftmac100_txdes_set_dma_own(struct ftmac100_txdes *txdes)
+{
+ /*
+ * Make sure dma own bit will not be set before any other
+ * descriptor fields.
+ */
+ wmb();
+ txdes->txdes0 |= cpu_to_le32(FTMAC100_TXDES0_TXDMA_OWN);
+}
+
+static bool ftmac100_txdes_excessive_collision(struct ftmac100_txdes *txdes)
+{
+ return txdes->txdes0 & cpu_to_le32(FTMAC100_TXDES0_TXPKT_EXSCOL);
+}
+
+static bool ftmac100_txdes_late_collision(struct ftmac100_txdes *txdes)
+{
+ return txdes->txdes0 & cpu_to_le32(FTMAC100_TXDES0_TXPKT_LATECOL);
+}
+
+static void ftmac100_txdes_set_end_of_ring(struct ftmac100_txdes *txdes)
+{
+ txdes->txdes1 |= cpu_to_le32(FTMAC100_TXDES1_EDOTR);
+}
+
+static void ftmac100_txdes_set_first_segment(struct ftmac100_txdes *txdes)
+{
+ txdes->txdes1 |= cpu_to_le32(FTMAC100_TXDES1_FTS);
+}
+
+static void ftmac100_txdes_set_last_segment(struct ftmac100_txdes *txdes)
+{
+ txdes->txdes1 |= cpu_to_le32(FTMAC100_TXDES1_LTS);
+}
+
+static void ftmac100_txdes_set_txint(struct ftmac100_txdes *txdes)
+{
+ txdes->txdes1 |= cpu_to_le32(FTMAC100_TXDES1_TXIC);
+}
+
+static void ftmac100_txdes_set_buffer_size(struct ftmac100_txdes *txdes,
+ unsigned int len)
+{
+ txdes->txdes1 |= cpu_to_le32(FTMAC100_TXDES1_TXBUF_SIZE(len));
+}
+
+static void ftmac100_txdes_set_dma_addr(struct ftmac100_txdes *txdes,
+ dma_addr_t addr)
+{
+ txdes->txdes2 = cpu_to_le32(addr);
+}
+
+static dma_addr_t ftmac100_txdes_get_dma_addr(struct ftmac100_txdes *txdes)
+{
+ return le32_to_cpu(txdes->txdes2);
+}
+
+/* txdes3 is not used by hardware; we use it to track the socket buffer */
+static void ftmac100_txdes_set_skb(struct ftmac100_txdes *txdes,
+ struct sk_buff *skb)
+{
+ txdes->txdes3 = cpu_to_le32(skb);
+}
+
+static struct sk_buff *ftmac100_txdes_get_skb(struct ftmac100_txdes *txdes)
+{
+ return (struct sk_buff *)le32_to_cpu(txdes->txdes3);
+}
+
+/******************************************************************************
+ * internal functions (transmit)
+ *****************************************************************************/
+static int ftmac100_next_tx_pointer(int pointer)
+{
+ return (pointer + 1) & (TX_QUEUE_ENTRIES - 1);
+}
+
+static void ftmac100_tx_pointer_advance(struct ftmac100 *priv)
+{
+ priv->tx_pointer = ftmac100_next_tx_pointer(priv->tx_pointer);
+}
+
+static void ftmac100_tx_clean_pointer_advance(struct ftmac100 *priv)
+{
+ priv->tx_clean_pointer = ftmac100_next_tx_pointer(priv->tx_clean_pointer);
+}
+
+static struct ftmac100_txdes *ftmac100_current_txdes(struct ftmac100 *priv)
+{
+ return &priv->descs->txdes[priv->tx_pointer];
+}
+
+static struct ftmac100_txdes *
+ftmac100_current_clean_txdes(struct ftmac100 *priv)
+{
+ return &priv->descs->txdes[priv->tx_clean_pointer];
+}
+
+static bool ftmac100_tx_complete_packet(struct ftmac100 *priv)
+{
+ struct net_device *netdev = priv->netdev;
+ struct ftmac100_txdes *txdes;
+ struct sk_buff *skb;
+ dma_addr_t map;
+
+ if (priv->tx_pending == 0)
+ return false;
+
+ txdes = ftmac100_current_clean_txdes(priv);
+
+ if (ftmac100_txdes_owned_by_dma(txdes))
+ return false;
+
+ skb = ftmac100_txdes_get_skb(txdes);
+ map = ftmac100_txdes_get_dma_addr(txdes);
+
+ if (unlikely(ftmac100_txdes_excessive_collision(txdes) ||
+ ftmac100_txdes_late_collision(txdes))) {
+ /*
+ * packet transmitted to ethernet lost due to late collision
+ * or excessive collision
+ */
+ netdev->stats.tx_aborted_errors++;
+ } else {
+ netdev->stats.tx_packets++;
+ netdev->stats.tx_bytes += skb->len;
+ }
+
+ dma_unmap_single(priv->dev, map, skb_headlen(skb), DMA_TO_DEVICE);
+
+ dev_kfree_skb_irq(skb);
+
+ ftmac100_txdes_reset(txdes);
+
+ ftmac100_tx_clean_pointer_advance(priv);
+
+ priv->tx_pending--;
+ netif_wake_queue(netdev);
+
+ return true;
+}
+
+static void ftmac100_tx_complete(struct ftmac100 *priv)
+{
+ spin_lock(&priv->tx_lock);
+ while (ftmac100_tx_complete_packet(priv))
+ ;
+ spin_unlock(&priv->tx_lock);
+}
+
+static int ftmac100_xmit(struct ftmac100 *priv, struct sk_buff *skb,
+ dma_addr_t map)
+{
+ struct net_device *netdev = priv->netdev;
+ struct ftmac100_txdes *txdes;
+ unsigned int len = (skb->len < ETH_ZLEN) ? ETH_ZLEN : skb->len;
+
+ txdes = ftmac100_current_txdes(priv);
+ ftmac100_tx_pointer_advance(priv);
+
+ /* setup TX descriptor */
+
+ spin_lock(&priv->tx_lock);
+ ftmac100_txdes_set_skb(txdes, skb);
+ ftmac100_txdes_set_dma_addr(txdes, map);
+
+ ftmac100_txdes_set_first_segment(txdes);
+ ftmac100_txdes_set_last_segment(txdes);
+ ftmac100_txdes_set_txint(txdes);
+ ftmac100_txdes_set_buffer_size(txdes, len);
+
+ priv->tx_pending++;
+ if (priv->tx_pending == TX_QUEUE_ENTRIES) {
+ if (net_ratelimit())
+ netdev_info(netdev, "tx queue full\n");
+
+ netif_stop_queue(netdev);
+ }
+
+ /* start transmit */
+ ftmac100_txdes_set_dma_own(txdes);
+ spin_unlock(&priv->tx_lock);
+
+ ftmac100_txdma_start_polling(priv);
+
+ return NETDEV_TX_OK;
+}
+
+/******************************************************************************
+ * internal functions (buffer)
+ *****************************************************************************/
+static void ftmac100_free_buffers(struct ftmac100 *priv)
+{
+ int i;
+
+ for (i = 0; i < RX_QUEUE_ENTRIES; i += 2) {
+ struct ftmac100_rxdes *rxdes = &priv->descs->rxdes[i];
+ dma_addr_t d = ftmac100_rxdes_get_dma_addr(rxdes);
+ void *page = ftmac100_rxdes_get_va(rxdes);
+
+ if (d)
+ dma_unmap_single(priv->dev, d, PAGE_SIZE,
+ DMA_FROM_DEVICE);
+
+ if (page != NULL)
+ free_page((unsigned long)page);
+ }
+
+ for (i = 0; i < TX_QUEUE_ENTRIES; i++) {
+ struct ftmac100_txdes *txdes = &priv->descs->txdes[i];
+ struct sk_buff *skb = ftmac100_txdes_get_skb(txdes);
+
+ if (skb) {
+ dma_addr_t map;
+
+ map = ftmac100_txdes_get_dma_addr(txdes);
+ dma_unmap_single(priv->dev, map, skb_headlen(skb),
+ DMA_TO_DEVICE);
+ dev_kfree_skb(skb);
+ }
+ }
+
+ dma_free_coherent(priv->dev, sizeof(struct ftmac100_descs),
+ priv->descs, priv->descs_dma_addr);
+}
+
+static int ftmac100_alloc_buffers(struct ftmac100 *priv)
+{
+ int i;
+
+ priv->descs = dma_alloc_coherent(priv->dev,
+ sizeof(struct ftmac100_descs),
+ &priv->descs_dma_addr,
+ GFP_KERNEL | GFP_DMA);
+ if (priv->descs == NULL)
+ return -ENOMEM;
+
+ memset(priv->descs, 0, sizeof(struct ftmac100_descs));
+
+ /* initialize RX ring */
+
+ ftmac100_rxdes_set_end_of_ring(&priv->descs->rxdes[RX_QUEUE_ENTRIES - 1]);
+
+ for (i = 0; i < RX_QUEUE_ENTRIES; i += 2) {
+ struct ftmac100_rxdes *rxdes = &priv->descs->rxdes[i];
+ void *page;
+ dma_addr_t d;
+
+ page = (void *)__get_free_page(GFP_KERNEL | GFP_DMA);
+ if (page == NULL)
+ goto err;
+
+ d = dma_map_single(priv->dev, page, PAGE_SIZE, DMA_FROM_DEVICE);
+ if (unlikely(dma_mapping_error(priv->dev, d))) {
+ free_page((unsigned long)page);
+ goto err;
+ }
+
+ /*
+ * The hardware enforces a sub-2K maximum packet size, so we
+ * put two buffers on every hardware page.
+ */
+ ftmac100_rxdes_set_va(rxdes, page);
+ ftmac100_rxdes_set_va(rxdes + 1, page + PAGE_SIZE / 2);
+
+ ftmac100_rxdes_set_dma_addr(rxdes, d);
+ ftmac100_rxdes_set_dma_addr(rxdes + 1, d + PAGE_SIZE / 2);
+
+ ftmac100_rxdes_set_buffer_size(rxdes, RX_BUF_SIZE);
+ ftmac100_rxdes_set_buffer_size(rxdes + 1, RX_BUF_SIZE);
+
+ ftmac100_rxdes_set_dma_own(rxdes);
+ ftmac100_rxdes_set_dma_own(rxdes + 1);
+ }
+
+ /* initialize TX ring */
+
+ ftmac100_txdes_set_end_of_ring(&priv->descs->txdes[TX_QUEUE_ENTRIES - 1]);
+ return 0;
+
+err:
+ ftmac100_free_buffers(priv);
+ return -ENOMEM;
+}
+
+/******************************************************************************
+ * struct mii_if_info functions
+ *****************************************************************************/
+static int ftmac100_mdio_read(struct net_device *netdev, int phy_id, int reg)
+{
+ struct ftmac100 *priv = netdev_priv(netdev);
+ unsigned int phycr;
+ int i;
+
+ phycr = FTMAC100_PHYCR_PHYAD(phy_id) |
+ FTMAC100_PHYCR_REGAD(reg) |
+ FTMAC100_PHYCR_MIIRD;
+
+ iowrite32(phycr, priv->base + FTMAC100_OFFSET_PHYCR);
+ for (i = 0; i < 10; i++) {
+ phycr = ioread32(priv->base + FTMAC100_OFFSET_PHYCR);
+
+ if ((phycr & FTMAC100_PHYCR_MIIRD) == 0)
+ return phycr & FTMAC100_PHYCR_MIIRDATA;
+
+ usleep_range(100, 1000);
+ }
+
+ netdev_err(netdev, "mdio read timed out\n");
+ return 0;
+}
+
+static void ftmac100_mdio_write(struct net_device *netdev, int phy_id, int reg,
+ int data)
+{
+ struct ftmac100 *priv = netdev_priv(netdev);
+ unsigned int phycr;
+ int i;
+
+ phycr = FTMAC100_PHYCR_PHYAD(phy_id) |
+ FTMAC100_PHYCR_REGAD(reg) |
+ FTMAC100_PHYCR_MIIWR;
+
+ data = FTMAC100_PHYWDATA_MIIWDATA(data);
+
+ iowrite32(data, priv->base + FTMAC100_OFFSET_PHYWDATA);
+ iowrite32(phycr, priv->base + FTMAC100_OFFSET_PHYCR);
+
+ for (i = 0; i < 10; i++) {
+ phycr = ioread32(priv->base + FTMAC100_OFFSET_PHYCR);
+
+ if ((phycr & FTMAC100_PHYCR_MIIWR) == 0)
+ return;
+
+ usleep_range(100, 1000);
+ }
+
+ netdev_err(netdev, "mdio write timed out\n");
+}
+
+/******************************************************************************
+ * struct ethtool_ops functions
+ *****************************************************************************/
+static void ftmac100_get_drvinfo(struct net_device *netdev,
+ struct ethtool_drvinfo *info)
+{
+ strlcpy(info->driver, DRV_NAME, sizeof(info->driver));
+ strlcpy(info->version, DRV_VERSION, sizeof(info->version));
+ strlcpy(info->bus_info, dev_name(&netdev->dev), sizeof(info->bus_info));
+}
+
+static int ftmac100_get_settings(struct net_device *netdev,
+ struct ethtool_cmd *cmd)
+{
+ struct ftmac100 *priv = netdev_priv(netdev);
+ return mii_ethtool_gset(&priv->mii, cmd);
+}
+
+static int ftmac100_set_settings(struct net_device *netdev,
+ struct ethtool_cmd *cmd)
+{
+ struct ftmac100 *priv = netdev_priv(netdev);
+ return mii_ethtool_sset(&priv->mii, cmd);
+}
+
+static int ftmac100_nway_reset(struct net_device *netdev)
+{
+ struct ftmac100 *priv = netdev_priv(netdev);
+ return mii_nway_restart(&priv->mii);
+}
+
+static u32 ftmac100_get_link(struct net_device *netdev)
+{
+ struct ftmac100 *priv = netdev_priv(netdev);
+ return mii_link_ok(&priv->mii);
+}
+
+static const struct ethtool_ops ftmac100_ethtool_ops = {
+ .set_settings = ftmac100_set_settings,
+ .get_settings = ftmac100_get_settings,
+ .get_drvinfo = ftmac100_get_drvinfo,
+ .nway_reset = ftmac100_nway_reset,
+ .get_link = ftmac100_get_link,
+};
+
+/******************************************************************************
+ * interrupt handler
+ *****************************************************************************/
+static irqreturn_t ftmac100_interrupt(int irq, void *dev_id)
+{
+ struct net_device *netdev = dev_id;
+ struct ftmac100 *priv = netdev_priv(netdev);
+
+ if (likely(netif_running(netdev))) {
+ /* Disable interrupts for polling */
+ ftmac100_disable_all_int(priv);
+ napi_schedule(&priv->napi);
+ }
+
+ return IRQ_HANDLED;
+}
+
+/******************************************************************************
+ * struct napi_struct functions
+ *****************************************************************************/
+static int ftmac100_poll(struct napi_struct *napi, int budget)
+{
+ struct ftmac100 *priv = container_of(napi, struct ftmac100, napi);
+ struct net_device *netdev = priv->netdev;
+ unsigned int status;
+ bool completed = true;
+ int rx = 0;
+
+ status = ioread32(priv->base + FTMAC100_OFFSET_ISR);
+
+ if (status & (FTMAC100_INT_RPKT_FINISH | FTMAC100_INT_NORXBUF)) {
+ /*
+ * FTMAC100_INT_RPKT_FINISH:
+ * RX DMA has received packets into RX buffer successfully
+ *
+ * FTMAC100_INT_NORXBUF:
+ * RX buffer unavailable
+ */
+ bool retry;
+
+ do {
+ retry = ftmac100_rx_packet(priv, &rx);
+ } while (retry && rx < budget);
+
+ if (retry && rx == budget)
+ completed = false;
+ }
+
+ if (status & (FTMAC100_INT_XPKT_OK | FTMAC100_INT_XPKT_LOST)) {
+ /*
+ * FTMAC100_INT_XPKT_OK:
+ * packet transmitted to ethernet successfully
+ *
+ * FTMAC100_INT_XPKT_LOST:
+ * packet transmitted to ethernet lost due to late
+ * collision or excessive collision
+ */
+ ftmac100_tx_complete(priv);
+ }
+
+ if (status & (FTMAC100_INT_NORXBUF | FTMAC100_INT_RPKT_LOST |
+ FTMAC100_INT_AHB_ERR | FTMAC100_INT_PHYSTS_CHG)) {
+ if (net_ratelimit())
+ netdev_info(netdev, "[ISR] = 0x%x: %s%s%s%s\n", status,
+ status & FTMAC100_INT_NORXBUF ? "NORXBUF " : "",
+ status & FTMAC100_INT_RPKT_LOST ? "RPKT_LOST " : "",
+ status & FTMAC100_INT_AHB_ERR ? "AHB_ERR " : "",
+ status & FTMAC100_INT_PHYSTS_CHG ? "PHYSTS_CHG" : "");
+
+ if (status & FTMAC100_INT_NORXBUF) {
+ /* RX buffer unavailable */
+ netdev->stats.rx_over_errors++;
+ }
+
+ if (status & FTMAC100_INT_RPKT_LOST) {
+ /* received packet lost due to RX FIFO full */
+ netdev->stats.rx_fifo_errors++;
+ }
+
+ if (status & FTMAC100_INT_PHYSTS_CHG) {
+ /* PHY link status change */
+ mii_check_link(&priv->mii);
+ }
+ }
+
+ if (completed) {
+ /* stop polling */
+ napi_complete(napi);
+ ftmac100_enable_all_int(priv);
+ }
+
+ return rx;
+}
+
+/******************************************************************************
+ * struct net_device_ops functions
+ *****************************************************************************/
+static int ftmac100_open(struct net_device *netdev)
+{
+ struct ftmac100 *priv = netdev_priv(netdev);
+ int err;
+
+ err = ftmac100_alloc_buffers(priv);
+ if (err) {
+ netdev_err(netdev, "failed to allocate buffers\n");
+ goto err_alloc;
+ }
+
+ err = request_irq(priv->irq, ftmac100_interrupt, 0, netdev->name,
+ netdev);
+ if (err) {
+ netdev_err(netdev, "failed to request irq %d\n", priv->irq);
+ goto err_irq;
+ }
+
+ priv->rx_pointer = 0;
+ priv->tx_clean_pointer = 0;
+ priv->tx_pointer = 0;
+ priv->tx_pending = 0;
+
+ err = ftmac100_start_hw(priv);
+ if (err)
+ goto err_hw;
+
+ napi_enable(&priv->napi);
+ netif_start_queue(netdev);
+
+ ftmac100_enable_all_int(priv);
+ return 0;
+
+err_hw:
+ free_irq(priv->irq, netdev);
+err_irq:
+ ftmac100_free_buffers(priv);
+err_alloc:
+ return err;
+}
+
+static int ftmac100_stop(struct net_device *netdev)
+{
+ struct ftmac100 *priv = netdev_priv(netdev);
+
+ ftmac100_disable_all_int(priv);
+ netif_stop_queue(netdev);
+ napi_disable(&priv->napi);
+ ftmac100_stop_hw(priv);
+ free_irq(priv->irq, netdev);
+ ftmac100_free_buffers(priv);
+
+ return 0;
+}
+
+static int ftmac100_hard_start_xmit(struct sk_buff *skb,
+ struct net_device *netdev)
+{
+ struct ftmac100 *priv = netdev_priv(netdev);
+ dma_addr_t map;
+
+ if (unlikely(skb->len > MAX_PKT_SIZE)) {
+ if (net_ratelimit())
+ netdev_dbg(netdev, "tx packet too big\n");
+
+ netdev->stats.tx_dropped++;
+ dev_kfree_skb(skb);
+ return NETDEV_TX_OK;
+ }
+
+ map = dma_map_single(priv->dev, skb->data, skb_headlen(skb), DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(priv->dev, map))) {
+ /* drop packet */
+ if (net_ratelimit())
+ netdev_err(netdev, "map socket buffer failed\n");
+
+ netdev->stats.tx_dropped++;
+ dev_kfree_skb(skb);
+ return NETDEV_TX_OK;
+ }
+
+ return ftmac100_xmit(priv, skb, map);
+}
+
+/* optional */
+static int ftmac100_do_ioctl(struct net_device *netdev, struct ifreq *ifr,
+ int cmd)
+{
+ struct ftmac100 *priv = netdev_priv(netdev);
+ struct mii_ioctl_data *data = if_mii(ifr);
+
+ return generic_mii_ioctl(&priv->mii, data, cmd, NULL);
+}
+
+static const struct net_device_ops ftmac100_netdev_ops = {
+ .ndo_open = ftmac100_open,
+ .ndo_stop = ftmac100_stop,
+ .ndo_start_xmit = ftmac100_hard_start_xmit,
+ .ndo_set_mac_address = eth_mac_addr,
+ .ndo_validate_addr = eth_validate_addr,
+ .ndo_do_ioctl = ftmac100_do_ioctl,
+};
+
+/******************************************************************************
+ * struct platform_driver functions
+ *****************************************************************************/
+static int ftmac100_remove(struct platform_device *pdev)
+{
+ struct net_device *netdev;
+ struct ftmac100 *priv;
+
+ netdev = platform_get_drvdata(pdev);
+ if (netdev == NULL)
+ return 0;
+
+ platform_set_drvdata(pdev, NULL);
+
+ priv = netdev_priv(netdev);
+
+ netif_napi_del(&priv->napi);
+ unregister_netdev(netdev);
+
+ if (priv->base != NULL)
+ iounmap(priv->base);
+
+ if (priv->res != NULL)
+ release_resource(priv->res);
+
+ free_netdev(netdev);
+ return 0;
+}
+
+static int ftmac100_probe(struct platform_device *pdev)
+{
+ struct resource *res;
+ int irq;
+ struct net_device *netdev;
+ struct ftmac100 *priv;
+ int err;
+
+ if (!pdev)
+ return -ENODEV;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res)
+ return -ENXIO;
+
+ irq = platform_get_irq(pdev, 0);
+ if (irq < 0)
+ return irq;
+
+ /* setup net_device */
+
+ netdev = alloc_etherdev(sizeof(struct ftmac100));
+ if (netdev == NULL) {
+ err = -ENOMEM;
+ goto err_out;
+ }
+
+ SET_NETDEV_DEV(netdev, &pdev->dev);
+ SET_ETHTOOL_OPS(netdev, &ftmac100_ethtool_ops);
+ netdev->netdev_ops = &ftmac100_netdev_ops;
+
+ platform_set_drvdata(pdev, netdev);
+
+ /* setup private data */
+
+ priv = netdev_priv(netdev);
+ priv->netdev = netdev;
+ priv->dev = &pdev->dev;
+
+ spin_lock_init(&priv->tx_lock);
+
+ /* initialize NAPI */
+ netif_napi_add(netdev, &priv->napi, ftmac100_poll, 64);
+
+ /* map io memory */
+ priv->res = request_mem_region(res->start, resource_size(res),
+ dev_name(&pdev->dev));
+ if (priv->res == NULL) {
+ dev_err(&pdev->dev, "Could not reserve memory region\n");
+ err = -ENOMEM;
+ goto err_out;
+ }
+
+ priv->base = ioremap(res->start, resource_size(res));
+ if (priv->base == NULL) {
+ dev_err(&pdev->dev, "Failed to ioremap ethernet registers\n");
+ err = -EIO;
+ goto err_out;
+ }
+
+ priv->irq = irq;
+
+ /* initialize struct mii_if_info */
+
+ priv->mii.phy_id = 0;
+ priv->mii.phy_id_mask = 0x1f;
+ priv->mii.reg_num_mask = 0x1f;
+ priv->mii.dev = netdev;
+ priv->mii.mdio_read = ftmac100_mdio_read;
+ priv->mii.mdio_write = ftmac100_mdio_write;
+
+ /* register network device */
+
+ err = register_netdev(netdev);
+ if (err) {
+ dev_err(&pdev->dev, "Failed to register netdev\n");
+ goto err_out;
+ }
+
+ netdev_info(netdev, "irq %d, mapped at %p\n", priv->irq, priv->base);
+
+ if (!is_valid_ether_addr(netdev->dev_addr)) {
+ random_ether_addr(netdev->dev_addr);
+ netdev_info(netdev, "generated random MAC address %pM\n",
+ netdev->dev_addr);
+ }
+
+ return 0;
+
+err_out:
+ ftmac100_remove(pdev);
+ return err;
+}
+
+static struct platform_driver ftmac100_driver = {
+ .probe = ftmac100_probe,
+ .remove = ftmac100_remove,
+ .driver = {
+ .name = DRV_NAME,
+ .owner = THIS_MODULE,
+ },
+};
+
+/******************************************************************************
+ * initialization / finalization
+ *****************************************************************************/
+static int __init ftmac100_init(void)
+{
+ pr_info("Loading version " DRV_VERSION " ...\n");
+ return platform_driver_register(&ftmac100_driver);
+}
+
+static void __exit ftmac100_exit(void)
+{
+ platform_driver_unregister(&ftmac100_driver);
+}
+
+module_init(ftmac100_init);
+module_exit(ftmac100_exit);
+
+MODULE_AUTHOR("Po-Yu Chuang <ratbert@faraday-tech.com>");
+MODULE_DESCRIPTION("FTMAC100 driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/net/ftmac100.h b/drivers/net/ftmac100.h
new file mode 100644
index 0000000..46a0c47
--- /dev/null
+++ b/drivers/net/ftmac100.h
@@ -0,0 +1,180 @@
+/*
+ * Faraday FTMAC100 10/100 Ethernet
+ *
+ * (C) Copyright 2009-2011 Faraday Technology
+ * Po-Yu Chuang <ratbert@faraday-tech.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#ifndef __FTMAC100_H
+#define __FTMAC100_H
+
+#define FTMAC100_OFFSET_ISR 0x00
+#define FTMAC100_OFFSET_IMR 0x04
+#define FTMAC100_OFFSET_MAC_MADR 0x08
+#define FTMAC100_OFFSET_MAC_LADR 0x0c
+#define FTMAC100_OFFSET_MAHT0 0x10
+#define FTMAC100_OFFSET_MAHT1 0x14
+#define FTMAC100_OFFSET_TXPD 0x18
+#define FTMAC100_OFFSET_RXPD 0x1c
+#define FTMAC100_OFFSET_TXR_BADR 0x20
+#define FTMAC100_OFFSET_RXR_BADR 0x24
+#define FTMAC100_OFFSET_ITC 0x28
+#define FTMAC100_OFFSET_APTC 0x2c
+#define FTMAC100_OFFSET_DBLAC 0x30
+#define FTMAC100_OFFSET_MACCR 0x88
+#define FTMAC100_OFFSET_MACSR 0x8c
+#define FTMAC100_OFFSET_PHYCR 0x90
+#define FTMAC100_OFFSET_PHYWDATA 0x94
+#define FTMAC100_OFFSET_FCR 0x98
+#define FTMAC100_OFFSET_BPR 0x9c
+#define FTMAC100_OFFSET_TS 0xc4
+#define FTMAC100_OFFSET_DMAFIFOS 0xc8
+#define FTMAC100_OFFSET_TM 0xcc
+#define FTMAC100_OFFSET_TX_MCOL_SCOL 0xd4
+#define FTMAC100_OFFSET_RPF_AEP 0xd8
+#define FTMAC100_OFFSET_XM_PG 0xdc
+#define FTMAC100_OFFSET_RUNT_TLCC 0xe0
+#define FTMAC100_OFFSET_CRCER_FTL 0xe4
+#define FTMAC100_OFFSET_RLC_RCC 0xe8
+#define FTMAC100_OFFSET_BROC 0xec
+#define FTMAC100_OFFSET_MULCA 0xf0
+#define FTMAC100_OFFSET_RP 0xf4
+#define FTMAC100_OFFSET_XP 0xf8
+
+/*
+ * Interrupt status register & interrupt mask register
+ */
+#define FTMAC100_INT_RPKT_FINISH (1 << 0)
+#define FTMAC100_INT_NORXBUF (1 << 1)
+#define FTMAC100_INT_XPKT_FINISH (1 << 2)
+#define FTMAC100_INT_NOTXBUF (1 << 3)
+#define FTMAC100_INT_XPKT_OK (1 << 4)
+#define FTMAC100_INT_XPKT_LOST (1 << 5)
+#define FTMAC100_INT_RPKT_SAV (1 << 6)
+#define FTMAC100_INT_RPKT_LOST (1 << 7)
+#define FTMAC100_INT_AHB_ERR (1 << 8)
+#define FTMAC100_INT_PHYSTS_CHG (1 << 9)
+
+/*
+ * Interrupt timer control register
+ */
+#define FTMAC100_ITC_RXINT_CNT(x) (((x) & 0xf) << 0)
+#define FTMAC100_ITC_RXINT_THR(x) (((x) & 0x7) << 4)
+#define FTMAC100_ITC_RXINT_TIME_SEL (1 << 7)
+#define FTMAC100_ITC_TXINT_CNT(x) (((x) & 0xf) << 8)
+#define FTMAC100_ITC_TXINT_THR(x) (((x) & 0x7) << 12)
+#define FTMAC100_ITC_TXINT_TIME_SEL (1 << 15)
+
+/*
+ * Automatic polling timer control register
+ */
+#define FTMAC100_APTC_RXPOLL_CNT(x) (((x) & 0xf) << 0)
+#define FTMAC100_APTC_RXPOLL_TIME_SEL (1 << 4)
+#define FTMAC100_APTC_TXPOLL_CNT(x) (((x) & 0xf) << 8)
+#define FTMAC100_APTC_TXPOLL_TIME_SEL (1 << 12)
+
+/*
+ * DMA burst length and arbitration control register
+ */
+#define FTMAC100_DBLAC_INCR4_EN (1 << 0)
+#define FTMAC100_DBLAC_INCR8_EN (1 << 1)
+#define FTMAC100_DBLAC_INCR16_EN (1 << 2)
+#define FTMAC100_DBLAC_RXFIFO_LTHR(x) (((x) & 0x7) << 3)
+#define FTMAC100_DBLAC_RXFIFO_HTHR(x) (((x) & 0x7) << 6)
+#define FTMAC100_DBLAC_RX_THR_EN (1 << 9)
+
+/*
+ * MAC control register
+ */
+#define FTMAC100_MACCR_XDMA_EN (1 << 0)
+#define FTMAC100_MACCR_RDMA_EN (1 << 1)
+#define FTMAC100_MACCR_SW_RST (1 << 2)
+#define FTMAC100_MACCR_LOOP_EN (1 << 3)
+#define FTMAC100_MACCR_CRC_DIS (1 << 4)
+#define FTMAC100_MACCR_XMT_EN (1 << 5)
+#define FTMAC100_MACCR_ENRX_IN_HALFTX (1 << 6)
+#define FTMAC100_MACCR_RCV_EN (1 << 8)
+#define FTMAC100_MACCR_HT_MULTI_EN (1 << 9)
+#define FTMAC100_MACCR_RX_RUNT (1 << 10)
+#define FTMAC100_MACCR_RX_FTL (1 << 11)
+#define FTMAC100_MACCR_RCV_ALL (1 << 12)
+#define FTMAC100_MACCR_CRC_APD (1 << 14)
+#define FTMAC100_MACCR_FULLDUP (1 << 15)
+#define FTMAC100_MACCR_RX_MULTIPKT (1 << 16)
+#define FTMAC100_MACCR_RX_BROADPKT (1 << 17)
+
+/*
+ * PHY control register
+ */
+#define FTMAC100_PHYCR_MIIRDATA 0xffff
+#define FTMAC100_PHYCR_PHYAD(x) (((x) & 0x1f) << 16)
+#define FTMAC100_PHYCR_REGAD(x) (((x) & 0x1f) << 21)
+#define FTMAC100_PHYCR_MIIRD (1 << 26)
+#define FTMAC100_PHYCR_MIIWR (1 << 27)
+
+/*
+ * PHY write data register
+ */
+#define FTMAC100_PHYWDATA_MIIWDATA(x) ((x) & 0xffff)
+
+/*
+ * Transmit descriptor, aligned to 16 bytes
+ */
+struct ftmac100_txdes {
+ unsigned int txdes0;
+ unsigned int txdes1;
+ unsigned int txdes2; /* TXBUF_BADR */
+ unsigned int txdes3; /* not used by HW */
+} __attribute__ ((aligned(16)));
+
+#define FTMAC100_TXDES0_TXPKT_LATECOL (1 << 0)
+#define FTMAC100_TXDES0_TXPKT_EXSCOL (1 << 1)
+#define FTMAC100_TXDES0_TXDMA_OWN (1 << 31)
+
+#define FTMAC100_TXDES1_TXBUF_SIZE(x) ((x) & 0x7ff)
+#define FTMAC100_TXDES1_LTS (1 << 27)
+#define FTMAC100_TXDES1_FTS (1 << 28)
+#define FTMAC100_TXDES1_TX2FIC (1 << 29)
+#define FTMAC100_TXDES1_TXIC (1 << 30)
+#define FTMAC100_TXDES1_EDOTR (1 << 31)
+
+/*
+ * Receive descriptor, aligned to 16 bytes
+ */
+struct ftmac100_rxdes {
+ unsigned int rxdes0;
+ unsigned int rxdes1;
+ unsigned int rxdes2; /* RXBUF_BADR */
+ unsigned int rxdes3; /* not used by HW */
+} __attribute__ ((aligned(16)));
+
+#define FTMAC100_RXDES0_RFL 0x7ff
+#define FTMAC100_RXDES0_MULTICAST (1 << 16)
+#define FTMAC100_RXDES0_BROADCAST (1 << 17)
+#define FTMAC100_RXDES0_RX_ERR (1 << 18)
+#define FTMAC100_RXDES0_CRC_ERR (1 << 19)
+#define FTMAC100_RXDES0_FTL (1 << 20)
+#define FTMAC100_RXDES0_RUNT (1 << 21)
+#define FTMAC100_RXDES0_RX_ODD_NB (1 << 22)
+#define FTMAC100_RXDES0_LRS (1 << 28)
+#define FTMAC100_RXDES0_FRS (1 << 29)
+#define FTMAC100_RXDES0_RXDMA_OWN (1 << 31)
+
+#define FTMAC100_RXDES1_RXBUF_SIZE(x) ((x) & 0x7ff)
+#define FTMAC100_RXDES1_EDORR (1 << 31)
+
+#endif /* __FTMAC100_H */
--
1.6.3.3
^ permalink raw reply related
* [PATCH 4/4] ppp: Use SKB queue abstraction interfaces in fragment processing.
From: David Miller @ 2011-01-21 7:56 UTC (permalink / raw)
To: netdev; +Cc: paulus
No more direct references to SKB queue and list implementation
details.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
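A minimal userspace sketch of why a "safe" walk is needed when entries are unlinked during iteration. The demo_* names are hypothetical and only mimic the shape of a sentinel-headed circular queue; they are not the kernel's sk_buff API.

```c
#include <stdlib.h>

/* Hypothetical miniature of a sentinel-headed, circular, doubly linked queue. */
struct demo_node {
	struct demo_node *next, *prev;
	int seq;
};

static void demo_queue_init(struct demo_node *head)
{
	head->next = head->prev = head;
}

/* Append n at the tail of the queue. */
static void demo_queue_tail(struct demo_node *head, struct demo_node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void demo_unlink(struct demo_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* Remember the successor before the body runs, so the body may unlink p. */
#define demo_walk_safe(head, p, tmp) \
	for ((p) = (head)->next, (tmp) = (p)->next; \
	     (p) != (head); \
	     (p) = (tmp), (tmp) = (p)->next)

/* Free every entry whose sequence number is below seq; return the count. */
static int demo_drop_before(struct demo_node *head, int seq)
{
	struct demo_node *p, *tmp;
	int dropped = 0;

	demo_walk_safe(head, p, tmp) {
		if (p->seq < seq) {
			demo_unlink(p);
			free(p);
			dropped++;
		}
	}
	return dropped;
}
```

Because tmp is captured before the loop body runs, freeing p inside the body does not disturb the iteration.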
drivers/net/ppp_generic.c | 31 ++++++++++++++++---------------
1 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ppp_generic.c b/drivers/net/ppp_generic.c
index 1d4fb34..9f6d670 100644
--- a/drivers/net/ppp_generic.c
+++ b/drivers/net/ppp_generic.c
@@ -1998,7 +1998,7 @@ ppp_mp_reconstruct(struct ppp *ppp)
u32 seq = ppp->nextseq;
u32 minseq = ppp->minseq;
struct sk_buff_head *list = &ppp->mrq;
- struct sk_buff *p, *next;
+ struct sk_buff *p, *tmp;
struct sk_buff *head, *tail;
struct sk_buff *skb = NULL;
int lost = 0, len = 0;
@@ -2007,14 +2007,15 @@ ppp_mp_reconstruct(struct ppp *ppp)
return NULL;
head = list->next;
tail = NULL;
- for (p = head; p != (struct sk_buff *) list; p = next) {
- next = p->next;
+ skb_queue_walk_safe(list, p, tmp) {
+ again:
if (seq_before(PPP_MP_CB(p)->sequence, seq)) {
/* this can't happen, anyway ignore the skb */
netdev_err(ppp->dev, "ppp_mp_reconstruct bad "
"seq %u < %u\n",
PPP_MP_CB(p)->sequence, seq);
- head = next;
+ __skb_unlink(p, list);
+ kfree_skb(p);
continue;
}
if (PPP_MP_CB(p)->sequence != seq) {
@@ -2026,8 +2027,7 @@ ppp_mp_reconstruct(struct ppp *ppp)
lost = 1;
seq = seq_before(minseq, PPP_MP_CB(p)->sequence)?
minseq + 1: PPP_MP_CB(p)->sequence;
- next = p;
- continue;
+ goto again;
}
/*
@@ -2067,9 +2067,17 @@ ppp_mp_reconstruct(struct ppp *ppp)
* and we haven't found a complete valid packet yet,
* we can discard up to and including this fragment.
*/
- if (PPP_MP_CB(p)->BEbits & E)
- head = next;
+ if (PPP_MP_CB(p)->BEbits & E) {
+ struct sk_buff *tmp2;
+ skb_queue_reverse_walk_from_safe(list, p, tmp2) {
+ __skb_unlink(p, list);
+ kfree_skb(p);
+ }
+ head = skb_peek(list);
+ if (!head)
+ break;
+ }
++seq;
}
@@ -2110,13 +2118,6 @@ ppp_mp_reconstruct(struct ppp *ppp)
}
ppp->nextseq = PPP_MP_CB(tail)->sequence + 1;
- head = tail->next;
- }
-
- /* Discard all the skbuffs that we can't use. */
- while ((p = list->next) != head) {
- __skb_unlink(p, list);
- kfree_skb(p);
}
return skb;
--
1.7.3.4
^ permalink raw reply related
* Re: [PATCH v4] net: add Faraday FTMAC100 10/100 Ethernet driver
From: Eric Dumazet @ 2011-01-21 9:08 UTC (permalink / raw)
To: Po-Yu Chuang
Cc: netdev, linux-kernel, bhutchings, joe, dilinger, mirqus,
Po-Yu Chuang
In-Reply-To: <1295596533-1748-1-git-send-email-ratbert.chuang@gmail.com>
On Friday, 21 January 2011 at 15:55 +0800, Po-Yu Chuang wrote:
> From: Po-Yu Chuang <ratbert@faraday-tech.com>
>
> FTMAC100 Ethernet Media Access Controller supports 10/100 Mbps and
> MII. This driver has been working on some ARM/NDS32 SoC's including
> Faraday A320 and Andes AG101.
>
> Signed-off-by: Po-Yu Chuang <ratbert@faraday-tech.com>
> +
> +static bool ftmac100_tx_complete_packet(struct ftmac100 *priv)
> +{
...
> +
> + dma_unmap_single(priv->dev, map, skb_headlen(skb), DMA_TO_DEVICE);
> +
> + dev_kfree_skb_irq(skb);
> +
> + ftmac100_txdes_reset(txdes);
> +
> + ftmac100_tx_clean_pointer_advance(priv);
> +
> + priv->tx_pending--;
> + netif_wake_queue(netdev);
> +
> + return true;
> +}
> +
Thanks to NAPI, you can free the skb directly with dev_kfree_skb()
instead of dev_kfree_skb_irq(), rather than queuing it via the
NET_TX_SOFTIRQ softirq.
^ permalink raw reply
* [PATCH] atm: idt77105: fix fetch_stats() result
From: Vasiliy Kulikov @ 2011-01-21 9:43 UTC (permalink / raw)
To: kernel-janitors; +Cc: Chas Williams, linux-atm-general, netdev, linux-kernel
copy_to_user() used PRIV(dev)->stats instead of the local stats variable.
As a result, zeroed stats were returned to the user when zero != 0, and
the memcpy() into the local copy was pointless.
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
---
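A minimal userspace sketch of the bug pattern, assuming fetch_stats() snapshots the live counters into a local copy and optionally zeroes the live ones before returning. The names demo_stats, fetch_stats_buggy and fetch_stats_fixed are hypothetical, and memcpy() stands in for copy_to_user().

```c
#include <string.h>

/* Hypothetical stand-in for the driver's counters. */
struct demo_stats {
	unsigned int symbol_errors;
	unsigned int tx_cells;
};

/* Live counters, as the device's private data would hold them. */
static struct demo_stats live = { 3, 7 };

/* Buggy shape: a snapshot is taken, but the live (possibly zeroed) data is returned. */
static void fetch_stats_buggy(struct demo_stats *out, int zero)
{
	struct demo_stats snap;

	memcpy(&snap, &live, sizeof(snap));
	if (zero)
		memset(&live, 0, sizeof(live));
	memcpy(out, &live, sizeof(*out));	/* wrong source */
}

/* Fixed shape: return the snapshot taken before zeroing. */
static void fetch_stats_fixed(struct demo_stats *out, int zero)
{
	struct demo_stats snap;

	memcpy(&snap, &live, sizeof(snap));
	if (zero)
		memset(&live, 0, sizeof(live));
	memcpy(out, &snap, sizeof(*out));
}
```

With zero set, the buggy variant hands back all-zero counters, while the fixed one returns the values as they were before the reset.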
drivers/atm/idt77105.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/atm/idt77105.c b/drivers/atm/idt77105.c
index bca9cb8..487a547 100644
--- a/drivers/atm/idt77105.c
+++ b/drivers/atm/idt77105.c
@@ -151,7 +151,7 @@ static int fetch_stats(struct atm_dev *dev,struct idt77105_stats __user *arg,int
spin_unlock_irqrestore(&idt77105_priv_lock, flags);
if (arg == NULL)
return 0;
- return copy_to_user(arg, &PRIV(dev)->stats,
+ return copy_to_user(arg, &stats,
sizeof(struct idt77105_stats)) ? -EFAULT : 0;
}
--
1.7.0.4
^ permalink raw reply related
* RFC: pid "ownership" of ip config information
From: Patrick Schaaf @ 2011-01-21 9:28 UTC (permalink / raw)
To: netdev
Dear netdev,
I want to solicit comments on a feature enhancement that occurred
to me recently.
Feature:
- For "ip addr add", "ip route add", "ip rule add", and maybe "ip link add",
implement an option 'pid XXXXX' to specify a PID
- if that PID does not currently exist, fail the operation
- if, at a later time, that PID dies, automatically remove the configuration,
as if a corresponding "ip ... del" had been given
The feature would be useful in any kind of "IP takeover" scenario.
I'm concretely working on deployment of keepalived (VRRP address
takeover) and memcachedb (address takeover after berkeley DB master
selection).
It would also apply to all kinds of routing daemons (zebra, quagga...).
In all these cases, for as long as the process is working normally,
it can trigger the relevant address withdrawal, but when the process
dies unexpectedly (OOM killer or whatever), addresses are left configured
while a partner on another host might take them over, resulting in
duplicate active IPs and the application breaking.
The alternative to such a feature would be an additional
monitoring process that watches the PID somehow and has to
be configured to know what to withdraw when it dies.
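A minimal sketch of such a watcher, assuming simple polling is acceptable. pid_alive and watch_pid are hypothetical names; the cleanup callback is where a real tool would withdraw the configuration (for example by running the matching "ip ... del" commands).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Return true while pid still exists. kill(pid, 0) sends no signal; it only checks. */
static bool pid_alive(pid_t pid)
{
	if (kill(pid, 0) == 0)
		return true;
	return errno == EPERM;	/* exists, but we may not signal it */
}

/* Poll until pid is gone, then run the cleanup callback once. */
static void watch_pid(pid_t pid, unsigned int interval_sec, void (*cleanup)(void))
{
	while (pid_alive(pid))
		sleep(interval_sec);
	cleanup();
}
```

Polling reacts only after up to one interval and can be fooled if the PID is reused in the meantime, so it is a weaker guarantee than removal tied to process exit.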
Before I go ahead and try to implement that, I would like to have
some feedback regarding the idea
- has it been discussed before?
- would it be accepted by the relevant maintainers?
- did I overlook alternative solutions to the problem?
best regards
Patrick
^ permalink raw reply
* [vmxnet3] possible irq lock inversion dependency detected
From: Jongman Heo @ 2011-01-21 9:44 UTC (permalink / raw)
To: netdev
I'm using Fedora 14 on VMware.
With the latest Linus git tree (2b1caf6ed7b888), the following warnings are printed.
Is this a known issue? I don't know whether this is a regression or not;
this is my first time using the vmxnet3 driver.
===============================================================
[ 17.593243] NET: Registered protocol family 10
[ 17.640420] ip6_tables: (C) 2000-2006 Netfilter Core Team
[ 18.418134] auditd (733): /proc/733/oom_adj is deprecated, please
use /proc/733/oom_score_adj instead.
[ 24.074627] eth0: intr type 3, mode 0, 5 vectors allocated
[ 24.075450] eth0: NIC Link is Up 10000 Mbps
[ 24.081505]
[ 24.081507] =========================================================
[ 24.081693] [ INFO: possible irq lock inversion dependency detected ]
[ 24.081797] 2.6.38-rc1+ #85
[ 24.081914] ---------------------------------------------------------
[ 24.082061] dbus-daemon/847 just changed the state of lock:
[ 24.082200] (&(&mc->mca_lock)->rlock){+.-...}, at: [<f85a034e>]
mld_ifc_timer_expire+0x12a/0x1f2 [ipv6]
[ 24.082488] but this lock took another, SOFTIRQ-unsafe lock in the past:
[ 24.082690] (&(&adapter->cmd_lock)->rlock){+.+...}
[ 24.082769]
[ 24.082770] and interrupts could create inverse lock ordering between them.
[ 24.082772]
[ 24.083196]
[ 24.083197] other info that might help us debug this:
[ 24.083415] 3 locks held by dbus-daemon/847:
[ 24.083538] #0: (&mm->mmap_sem){++++++}, at: [<c07d49ea>]
do_page_fault+0x140/0x33b
[ 24.083799] #1: (&idev->mc_ifc_timer){+.-...}, at: [<c04459c7>]
run_timer_softirq+0x11c/0x268
[ 24.084081] #2: (&ndev->lock){++.-..}, at: [<f85a023f>]
mld_ifc_timer_expire+0x1b/0x1f2 [ipv6]
[ 24.084364]
[ 24.084365] the shortest dependencies between 2nd lock and 1st lock:
[ 24.084659] -> (&(&adapter->cmd_lock)->rlock){+.+...} ops: 28 {
[ 24.084826] HARDIRQ-ON-W at:
[ 24.084987] [<c0461e11>]
__lock_acquire+0x2d9/0xbf2
[ 24.085302] [<c0462b5f>]
lock_acquire+0xb7/0xd7
[ 24.085507] [<c07d1835>]
_raw_spin_lock+0x33/0x40
[ 24.085708] [<f855e2bf>]
vmxnet3_alloc_intr_resources+0x18/0x1c1 [vmxnet3]
[ 24.085964] [<f8562b23>]
vmxnet3_probe_device+0x503/0x712 [vmxnet3]
[ 24.086180] [<c05f645a>]
local_pci_probe+0x2f/0x5a
[ 24.086382] [<c05f68ed>]
pci_device_probe+0x48/0x6b
[ 24.086582] [<c067f87a>]
driver_probe_device+0x115/0x1ec
[ 24.086788] [<c067f990>]
__driver_attach+0x3f/0x5b
[ 24.087014] [<c067eb28>]
bus_for_each_dev+0x3d/0x60
[ 24.087214] [<c067f50e>]
driver_attach+0x19/0x1b
[ 24.087411] [<c067f1a4>]
bus_add_driver+0xbd/0x215
[ 24.087611] [<c067fb61>]
driver_register+0x7f/0xde
[ 24.087811] [<c05f6adb>]
__pci_register_driver+0x4c/0xa9
[ 24.088046] [<f8568036>]
0xf8568036
[ 24.088238] [<c0401268>]
do_one_initcall+0x87/0x143
[ 24.088439] [<c046b0a6>]
sys_init_module+0x130d/0x14aa
[ 24.088643] [<c040319f>]
sysenter_do_call+0x12/0x38
[ 24.088844] SOFTIRQ-ON-W at:
[ 24.115469] [<c0461e30>]
__lock_acquire+0x2f8/0xbf2
[ 24.115483] [<c0462b5f>]
lock_acquire+0xb7/0xd7
[ 24.115486] [<c07d1835>]
_raw_spin_lock+0x33/0x40
[ 24.115493] [<f855e2bf>]
vmxnet3_alloc_intr_resources+0x18/0x1c1 [vmxnet3]
[ 24.115508] [<f8562b23>]
vmxnet3_probe_device+0x503/0x712 [vmxnet3]
[ 24.115513] [<c05f645a>]
local_pci_probe+0x2f/0x5a
[ 24.115519] [<c05f68ed>]
pci_device_probe+0x48/0x6b
[ 24.115523] [<c067f87a>]
driver_probe_device+0x115/0x1ec
[ 24.115529] [<c067f990>]
__driver_attach+0x3f/0x5b
[ 24.115532] [<c067eb28>]
bus_for_each_dev+0x3d/0x60
[ 24.115535] [<c067f50e>]
driver_attach+0x19/0x1b
[ 24.115539] [<c067f1a4>]
bus_add_driver+0xbd/0x215
[ 24.115542] [<c067fb61>]
driver_register+0x7f/0xde
[ 24.115545] [<c05f6adb>]
__pci_register_driver+0x4c/0xa9
[ 24.115555] [<f8568036>]
0xf8568036
[ 24.115562] [<c0401268>]
do_one_initcall+0x87/0x143
[ 24.115567] [<c046b0a6>]
sys_init_module+0x130d/0x14aa
[ 24.115590] [<c040319f>]
sysenter_do_call+0x12/0x38
[ 24.115596] INITIAL USE at:
[ 24.115598] [<c0461e85>]
__lock_acquire+0x34d/0xbf2
[ 24.115602] [<c0462b5f>]
lock_acquire+0xb7/0xd7
[ 24.115606] [<c07d1835>]
_raw_spin_lock+0x33/0x40
[ 24.115609] [<f855e2bf>]
vmxnet3_alloc_intr_resources+0x18/0x1c1 [vmxnet3]
[ 24.115614] [<f8562b23>]
vmxnet3_probe_device+0x503/0x712 [vmxnet3]
[ 24.115619] [<c05f645a>]
local_pci_probe+0x2f/0x5a
[ 24.115622] [<c05f68ed>]
pci_device_probe+0x48/0x6b
[ 24.115626] [<c067f87a>]
driver_probe_device+0x115/0x1ec
[ 24.115629] [<c067f990>]
__driver_attach+0x3f/0x5b
[ 24.115633] [<c067eb28>]
bus_for_each_dev+0x3d/0x60
[ 24.115636] [<c067f50e>]
driver_attach+0x19/0x1b
[ 24.115639] [<c067f1a4>]
bus_add_driver+0xbd/0x215
[ 24.115642] [<c067fb61>]
driver_register+0x7f/0xde
[ 24.115645] [<c05f6adb>]
__pci_register_driver+0x4c/0xa9
[ 24.115648] [<f8568036>]
0xf8568036
[ 24.115652] [<c0401268>]
do_one_initcall+0x87/0x143
[ 24.115655] [<c046b0a6>]
sys_init_module+0x130d/0x14aa
[ 24.115659] [<c040319f>]
sysenter_do_call+0x12/0x38
[ 24.115662] }
[ 24.115663] ... key at: [<f8564580>] __key.40447+0x0/0xffffe7b2
[vmxnet3]
[ 24.115668] ... acquired at:
[ 24.115670] [<c0462b5f>] lock_acquire+0xb7/0xd7
[ 24.115673] [<c07d1920>] _raw_spin_lock_irqsave+0x40/0x50
[ 24.115676] [<f855f494>] vmxnet3_set_mc+0x11a/0x165 [vmxnet3]
[ 24.115684] [<c0751f4d>] __dev_set_rx_mode+0x76/0x7a
[ 24.115689] [<c0751f6c>] dev_set_rx_mode+0x1b/0x26
[ 24.115692] [<c0752014>] __dev_open+0x9d/0xaf
[ 24.115694] [<c07521e6>] __dev_change_flags+0x98/0x10d
[ 24.115697] [<c07522c1>] dev_change_flags+0x13/0x3f
[ 24.115699] [<c075ae71>] do_setlink+0x245/0x56b
[ 24.115703] [<c075b6a6>] rtnl_setlink+0xaa/0xc6
[ 24.115706] [<c075b90f>] rtnetlink_rcv_msg+0x1a0/0x1af
[ 24.115709] [<c07694fd>] netlink_rcv_skb+0x32/0x73
[ 24.115712] [<c075b3bc>] rtnetlink_rcv+0x1b/0x22
[ 24.115714] [<c0769098>] netlink_unicast+0xc4/0x120
[ 24.115716] [<c076934e>] netlink_sendmsg+0x25a/0x271
[ 24.115719] [<c074135e>] __sock_sendmsg+0x54/0x5b
[ 24.115723] [<c07419dd>] sock_sendmsg+0x95/0xac
[ 24.115726] [<c0742fbc>] sys_sendmsg+0x181/0x1e8
[ 24.115729] [<c074348c>] sys_socketcall+0x22c/0x287
[ 24.115732] [<c040319f>] sysenter_do_call+0x12/0x38
[ 24.115735]
[ 24.115736] -> (_xmit_ETHER){+.....} ops: 6 {
[ 24.115741] HARDIRQ-ON-W at:
[ 24.115742] [<c0461e11>]
__lock_acquire+0x2d9/0xbf2
[ 24.115746] [<c0462b5f>]
lock_acquire+0xb7/0xd7
[ 24.115750] [<c07d1a20>]
_raw_spin_lock_bh+0x38/0x45
[ 24.115753] [<c07553cd>]
__dev_mc_add+0x23/0x61
[ 24.115761] [<c0755424>]
dev_mc_add+0xa/0xc
[ 24.115764] [<f85a0bb9>]
igmp6_group_added+0x56/0x139 [ipv6]
[ 24.115784] [<f85a114f>]
ipv6_dev_mc_inc+0x1fb/0x20c [ipv6]
[ 24.115799] [<f858d0f9>]
ipv6_add_dev+0x26d/0x28b [ipv6]
[ 24.115834] [<f8590007>]
addrconf_notify+0x57/0x52c [ipv6]
[ 24.115848] [<c074ec2a>]
register_netdevice_notifier+0x54/0x14e
[ 24.115852] [<f866b324>] 0xf866b324
[ 24.115856] [<f866b18a>] 0xf866b18a
[ 24.115859] [<c0401268>]
do_one_initcall+0x87/0x143
[ 24.115862] [<c046b0a6>]
sys_init_module+0x130d/0x14aa
[ 24.115867] [<c040319f>]
sysenter_do_call+0x12/0x38
[ 24.115870] INITIAL USE at:
[ 24.115872] [<c0461e85>]
__lock_acquire+0x34d/0xbf2
[ 24.115876] [<c0462b5f>]
lock_acquire+0xb7/0xd7
[ 24.115880] [<c07d1a20>]
_raw_spin_lock_bh+0x38/0x45
[ 24.115884] [<c07553cd>]
__dev_mc_add+0x23/0x61
[ 24.115887] [<c0755424>]
dev_mc_add+0xa/0xc
[ 24.115891] [<f85a0bb9>]
igmp6_group_added+0x56/0x139 [ipv6]
[ 24.115911] [<f85a114f>]
ipv6_dev_mc_inc+0x1fb/0x20c [ipv6]
[ 24.115926] [<f858d0f9>]
ipv6_add_dev+0x26d/0x28b [ipv6]
[ 24.115939] [<f8590007>]
addrconf_notify+0x57/0x52c [ipv6]
[ 24.115951] [<c074ec2a>]
register_netdevice_notifier+0x54/0x14e
[ 24.115954] [<f866b324>] 0xf866b324
[ 24.115957] [<f866b18a>] 0xf866b18a
[ 24.115960] [<c0401268>]
do_one_initcall+0x87/0x143
[ 24.115963] [<c046b0a6>]
sys_init_module+0x130d/0x14aa
[ 24.115966] [<c040319f>]
sysenter_do_call+0x12/0x38
[ 24.115970] }
[ 24.115971] ... key at: [<c10308b8>] netdev_addr_lock_key+0x8/0x1d0
[ 24.115985] ... acquired at:
[ 24.115986] [<c0462b5f>] lock_acquire+0xb7/0xd7
[ 24.115990] [<c07d1a20>] _raw_spin_lock_bh+0x38/0x45
[ 24.115992] [<c07553cd>] __dev_mc_add+0x23/0x61
[ 24.115995] [<c0755424>] dev_mc_add+0xa/0xc
[ 24.115997] [<f85a0bb9>] igmp6_group_added+0x56/0x139 [ipv6]
[ 24.116013] [<f85a114f>] ipv6_dev_mc_inc+0x1fb/0x20c [ipv6]
[ 24.116027] [<f858d0f9>] ipv6_add_dev+0x26d/0x28b [ipv6]
[ 24.116039] [<f8590007>] addrconf_notify+0x57/0x52c [ipv6]
[ 24.116051] [<c074ec2a>] register_netdevice_notifier+0x54/0x14e
[ 24.116054] [<f866b324>] 0xf866b324
[ 24.116056] [<f866b18a>] 0xf866b18a
[ 24.116058] [<c0401268>] do_one_initcall+0x87/0x143
[ 24.116061] [<c046b0a6>] sys_init_module+0x130d/0x14aa
[ 24.116064] [<c040319f>] sysenter_do_call+0x12/0x38
[ 24.116067]
[ 24.116068] -> (&(&mc->mca_lock)->rlock){+.-...} ops: 6 {
[ 24.116071] HARDIRQ-ON-W at:
[ 24.116073] [<c0461e11>]
__lock_acquire+0x2d9/0xbf2
[ 24.116077] [<c0462b5f>]
lock_acquire+0xb7/0xd7
[ 24.116080] [<c07d1a20>]
_raw_spin_lock_bh+0x38/0x45
[ 24.116083] [<f85a0b8b>]
igmp6_group_added+0x28/0x139 [ipv6]
[ 24.116102] [<f85a114f>]
ipv6_dev_mc_inc+0x1fb/0x20c [ipv6]
[ 24.116118] [<f858d0f9>]
ipv6_add_dev+0x26d/0x28b [ipv6]
[ 24.116130] [<f866b2f0>] 0xf866b2f0
[ 24.116133] [<f866b18a>] 0xf866b18a
[ 24.116136] [<c0401268>]
do_one_initcall+0x87/0x143
[ 24.116139] [<c046b0a6>]
sys_init_module+0x130d/0x14aa
[ 24.116143] [<c040319f>]
sysenter_do_call+0x12/0x38
[ 24.116146] IN-SOFTIRQ-W at:
[ 24.116148] [<c0461dbc>]
__lock_acquire+0x284/0xbf2
[ 24.116151] [<c0462b5f>]
lock_acquire+0xb7/0xd7
[ 24.116154] [<c07d1a20>]
_raw_spin_lock_bh+0x38/0x45
[ 24.116158] [<f85a034e>]
mld_ifc_timer_expire+0x12a/0x1f2 [ipv6]
[ 24.116173] [<c0445a4a>]
run_timer_softirq+0x19f/0x268
[ 24.116180] [<c043fd5b>]
__do_softirq+0xa9/0x16a
[ 24.116183] INITIAL USE at:
[ 24.116185] [<c0461e85>]
__lock_acquire+0x34d/0xbf2
[ 24.116188] [<c0462b5f>]
lock_acquire+0xb7/0xd7
[ 24.116191] [<c07d1a20>]
_raw_spin_lock_bh+0x38/0x45
[ 24.116195] [<f85a0b8b>]
igmp6_group_added+0x28/0x139 [ipv6]
[ 24.116210] [<f85a114f>]
ipv6_dev_mc_inc+0x1fb/0x20c [ipv6]
[ 24.116226] [<f858d0f9>]
ipv6_add_dev+0x26d/0x28b [ipv6]
[ 24.116238] [<f866b2f0>] 0xf866b2f0
[ 24.116241] [<f866b18a>] 0xf866b18a
[ 24.116244] [<c0401268>]
do_one_initcall+0x87/0x143
[ 24.116247] [<c046b0a6>]
sys_init_module+0x130d/0x14aa
[ 24.116251] [<c040319f>]
sysenter_do_call+0x12/0x38
[ 24.116254] }
[ 24.116255] ... key at: [<f85b382c>] __key.38329+0x0/0xffff9cd8 [ipv6]
[ 24.116266] ... acquired at:
[ 24.116268] [<c046135b>] check_usage_forwards+0x6f/0x77
[ 24.116271] [<c0461a70>] mark_lock+0xf3/0x1bb
[ 24.116273] [<c0461dbc>] __lock_acquire+0x284/0xbf2
[ 24.116276] [<c0462b5f>] lock_acquire+0xb7/0xd7
[ 24.116279] [<c07d1a20>] _raw_spin_lock_bh+0x38/0x45
[ 24.116282] [<f85a034e>] mld_ifc_timer_expire+0x12a/0x1f2 [ipv6]
[ 24.116296] [<c0445a4a>] run_timer_softirq+0x19f/0x268
[ 24.116299] [<c043fd5b>] __do_softirq+0xa9/0x16a
[ 24.116302]
[ 24.116303]
[ 24.116303] stack backtrace:
[ 24.116307] Pid: 847, comm: dbus-daemon Not tainted 2.6.38-rc1+ #85
[ 24.116309] Call Trace:
[ 24.116314] [<c04612e2>] ? print_irq_inversion_bug+0xfc/0x106
[ 24.116317] [<c046135b>] ? check_usage_forwards+0x6f/0x77
[ 24.116320] [<c0461a70>] ? mark_lock+0xf3/0x1bb
[ 24.116323] [<c04612ec>] ? check_usage_forwards+0x0/0x77
[ 24.116327] [<c0461dbc>] ? __lock_acquire+0x284/0xbf2
[ 24.116330] [<c04607f5>] ? save_trace+0x37/0x93
[ 24.116333] [<c046267c>] ? __lock_acquire+0xb44/0xbf2
[ 24.116348] [<f85a034e>] ? mld_ifc_timer_expire+0x12a/0x1f2 [ipv6]
[ 24.116352] [<c0462b5f>] ? lock_acquire+0xb7/0xd7
[ 24.116366] [<f85a034e>] ? mld_ifc_timer_expire+0x12a/0x1f2 [ipv6]
[ 24.116370] [<c07d1a20>] ? _raw_spin_lock_bh+0x38/0x45
[ 24.116385] [<f85a034e>] ? mld_ifc_timer_expire+0x12a/0x1f2 [ipv6]
[ 24.116400] [<f85a034e>] ? mld_ifc_timer_expire+0x12a/0x1f2 [ipv6]
[ 24.116403] [<c04459c7>] ? run_timer_softirq+0x11c/0x268
[ 24.116410] [<c0445a4a>] ? run_timer_softirq+0x19f/0x268
[ 24.116413] [<c04459c7>] ? run_timer_softirq+0x11c/0x268
[ 24.116428] [<f85a0224>] ? mld_ifc_timer_expire+0x0/0x1f2 [ipv6]
[ 24.116432] [<c043fd5b>] ? __do_softirq+0xa9/0x16a
[ 24.116434] [<c043fcb2>] ? __do_softirq+0x0/0x16a
[ 24.116436] <IRQ> [<c043fead>] ? irq_exit+0x38/0x6c
[ 24.116443] [<c0419e71>] ? smp_apic_timer_interrupt+0x66/0x73
[ 24.116447] [<c05e6cc0>] ? trace_hardirqs_off_thunk+0xc/0x10
[ 24.116451] [<c07d2522>] ? apic_timer_interrupt+0x36/0x3c
[ 24.116456] [<c04bb9d7>] ? copy_user_highpage.clone.44+0x21/0x34
[ 24.116459] [<c04bc87a>] ? do_wp_page+0x397/0x514
[ 24.116462] [<c07d183c>] ? _raw_spin_lock+0x3a/0x40
[ 24.116465] [<c04be2a8>] ? handle_pte_fault+0x67f/0x6ea
[ 24.116468] [<c04be3bf>] ? handle_mm_fault+0xac/0xb8
[ 24.116472] [<c07d4bcd>] ? do_page_fault+0x323/0x33b
[ 24.116475] [<c0462b77>] ? lock_acquire+0xcf/0xd7
[ 24.116478] [<c07d205d>] ? restore_all_notrace+0x0/0x18
[ 24.116481] [<c04601a3>] ? trace_hardirqs_off_caller+0x2e/0x86
[ 24.116484] [<c07d48aa>] ? do_page_fault+0x0/0x33b
[ 24.116487] [<c07d27a4>] ? error_code+0x6c/0x74
[ 24.550299] RPC: Registered udp transport module.
[ 24.550405] RPC: Registered tcp transport module.
[ 24.550498] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 28.499064] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
[ 28.725260] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery
directory
[ 28.783996] NFSD: starting 90-second grace period
[ 33.488381] Bridge firewalling registered
[ 34.443551] ------------[ cut here ]------------
[ 34.443561] WARNING: at net/core/dev.c:1351 dev_disable_lro+0x54/0x57()
[ 34.443563] Hardware name: VMware Virtual Platform
[ 34.443565] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat bridge stp
llc nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc xt_physdev
nf_conntrack_tftp nf_conntrack_netbios_ns ip6t_REJECT nf_conntrack_ipv6
nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 vmhgfs uinput snd_ens1371
gameport snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm
snd_timer microcode vmxnet3 vmci snd soundcore snd_page_alloc i2c_piix4 mptspi
mptscsih mptbase scsi_transport_spi [last unloaded: scsi_wait_scan]
[ 34.443605] Pid: 1358, comm: libvirtd Not tainted 2.6.38-rc1+ #85
[ 34.443607] Call Trace:
[ 34.443615] [<c043a801>] ? warn_slowpath_common+0x77/0x8c
[ 34.443618] [<c074d8cb>] ? dev_disable_lro+0x54/0x57
[ 34.443620] [<c074d8cb>] ? dev_disable_lro+0x54/0x57
[ 34.443623] [<c043a833>] ? warn_slowpath_null+0x1d/0x1f
[ 34.443626] [<c074d8cb>] ? dev_disable_lro+0x54/0x57
[ 34.443630] [<c079a574>] ? devinet_sysctl_forward+0xd5/0x139
[ 34.443633] [<c079a49f>] ? devinet_sysctl_forward+0x0/0x139
[ 34.443638] [<c051e889>] ? proc_sys_call_handler.clone.0+0x6a/0x89
[ 34.443641] [<c051e8a8>] ? proc_sys_write+0x0/0x22
[ 34.443643] [<c051e8c5>] ? proc_sys_write+0x1d/0x22
[ 34.443649] [<c04da9d8>] ? vfs_write+0x86/0xde
[ 34.443651] [<c04dba58>] ? fget_light+0x5f/0x66
[ 34.443654] [<c04daba6>] ? sys_write+0x3d/0x5e
[ 34.443659] [<c040319f>] ? sysenter_do_call+0x12/0x38
[ 34.443662] ---[ end trace 06a697a570356b0c ]---
^ permalink raw reply
* [PATCH] netfilter: ipvs: fix compiler warnings
From: Changli Gao @ 2011-01-21 10:02 UTC (permalink / raw)
To: Simon Horman
Cc: Wensong Zhang, Julian Anastasov, Patrick McHardy, David S. Miller,
netdev, lvs-devel, netfilter-devel, Changli Gao
Fix compiler warnings when no transport protocol load balancing support
is configured.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
---
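A minimal sketch of the warning class being fixed, assuming a local variable that is only touched inside per-protocol #ifdef blocks. DEMO_PROTO_TCP and DEMO_PROTO_UDP are hypothetical stand-ins for the Kconfig symbols, and the timeout numbers are arbitrary.

```c
/* Hypothetical stand-ins for Kconfig symbols; only TCP is enabled here. */
#define DEMO_PROTO_TCP 1
/* #define DEMO_PROTO_UDP 1 */

struct demo_timeouts {
	int tcp;
	int udp;
};

/* Arbitrary demo values: 6 is treated as TCP, anything else as UDP. */
static int demo_default_timeout(int proto)
{
	return proto == 6 ? 900 : 300;
}

static void demo_get_timeouts(struct demo_timeouts *u)
{
	/* Declare the scratch variable only if some block below will use it. */
#if defined(DEMO_PROTO_TCP) || defined(DEMO_PROTO_UDP)
	int t;
#endif

	u->tcp = 0;
	u->udp = 0;
#ifdef DEMO_PROTO_TCP
	t = demo_default_timeout(6);
	u->tcp = t;
#endif
#ifdef DEMO_PROTO_UDP
	t = demo_default_timeout(17);
	u->udp = t;
#endif
}
```

If neither macro is defined, the declaration disappears together with its users, so the compiler has nothing to flag as unused. demo_default_timeout() would then itself be unused, which is the situation the ip_vs_proto.c hunk appears to address.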
net/netfilter/ipvs/ip_vs_core.c | 4 +---
net/netfilter/ipvs/ip_vs_ctl.c | 4 ++++
net/netfilter/ipvs/ip_vs_proto.c | 4 ++++
3 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index f36a84f..d889f4f 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1894,9 +1894,7 @@ static int __net_init __ip_vs_init(struct net *net)
static void __net_exit __ip_vs_cleanup(struct net *net)
{
- struct netns_ipvs *ipvs = net_ipvs(net);
-
- IP_VS_DBG(10, "ipvs netns %d released\n", ipvs->gen);
+ IP_VS_DBG(10, "ipvs netns %d released\n", net_ipvs(net)->gen);
}
static struct pernet_operations ipvs_core_ops = {
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 09ca2ce..68b8033 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -2062,7 +2062,9 @@ static const struct file_operations ip_vs_stats_percpu_fops = {
*/
static int ip_vs_set_timeout(struct net *net, struct ip_vs_timeout_user *u)
{
+#if defined(CONFIG_IP_VS_PROTO_TCP) || defined(CONFIG_IP_VS_PROTO_UDP)
struct ip_vs_proto_data *pd;
+#endif
IP_VS_DBG(2, "Setting timeout tcp:%d tcpfin:%d udp:%d\n",
u->tcp_timeout,
@@ -2405,7 +2407,9 @@ __ip_vs_get_dest_entries(struct net *net, const struct ip_vs_get_dests *get,
static inline void
__ip_vs_get_timeouts(struct net *net, struct ip_vs_timeout_user *u)
{
+#if defined(CONFIG_IP_VS_PROTO_TCP) || defined(CONFIG_IP_VS_PROTO_UDP)
struct ip_vs_proto_data *pd;
+#endif
#ifdef CONFIG_IP_VS_PROTO_TCP
pd = ip_vs_proto_data_get(net, IPPROTO_TCP);
diff --git a/net/netfilter/ipvs/ip_vs_proto.c b/net/netfilter/ipvs/ip_vs_proto.c
index 6ac986c..17484a4 100644
--- a/net/netfilter/ipvs/ip_vs_proto.c
+++ b/net/netfilter/ipvs/ip_vs_proto.c
@@ -60,6 +60,9 @@ static int __used __init register_ip_vs_protocol(struct ip_vs_protocol *pp)
return 0;
}
+#if defined(CONFIG_IP_VS_PROTO_TCP) || defined(CONFIG_IP_VS_PROTO_UDP) || \
+ defined(CONFIG_IP_VS_PROTO_SCTP) || defined(CONFIG_IP_VS_PROTO_AH) || \
+ defined(CONFIG_IP_VS_PROTO_ESP)
/*
* register an ipvs protocols netns related data
*/
@@ -85,6 +88,7 @@ register_ip_vs_proto_netns(struct net *net, struct ip_vs_protocol *pp)
return 0;
}
+#endif
/*
* unregister an ipvs protocol
^ permalink raw reply related
* Re: RFC: pid "ownership" of ip config information
From: Nicolas de Pesloüan @ 2011-01-21 10:17 UTC (permalink / raw)
To: Patrick Schaaf; +Cc: netdev
In-Reply-To: <1295602091.3582.1.camel@lat1>
On 21/01/2011 10:28, Patrick Schaaf wrote:
> Dear netdev,
>
> I want to solicit comments on a feature enhancement that occured
> to me recently.
>
> Feature:
>
> - For "ip addr add", "ip route add", "ip rule add", and maybe "ip link
> add",
> implement an option 'pid XXXXX' to specify a PID
> - if that PID is not currently existing, fail the operation
> - if, at a later time, that PID dies, automatically remove the
> configuration,
> as if a corresponding "ip ... del" would have been given
>
> The feature would be useful in any kind of "IP takeover" scenario.
>
> I'm concretely working on deployment of keepalived (VRRP address
> takeover) and memcachedb (address takeover after berkeley DB master
> selection).
>
> It would also apply to all kinds of routing daemons (zebra, quagga...).
>
> In all these cases, for as long as the process is working normally,
> it can trigger the relevant address withdrawal, but when the process
> dies unexpectedly (oom killer or whatever), addresses are left
> configured,
> while a partner on another host might take them over, resulting in
> actively duplicate IPs and the application breaking.
>
> The alternative to such a feature, would be to have an additional
> monitoring process, which would watch the PID somehow, and need to
> be configured to know what to withdraw when it dies.
>
> Before I go ahead and try to implement that, I would like to have
> some feedback regarding the idea
>
> - has it been discussed before?
> - would it be accepted by the relevant maintainers?
> - did I overlook alternative solutions to the problem?
There are some user-space clustering systems that should provide the same functionality. Did you
have a look at http://www.linux-ha.org/ ?
> best regards
> Patrick
^ permalink raw reply
* Re: Flow Control and Port Mirroring Revisited
From: Michael S. Tsirkin @ 2011-01-21 9:59 UTC (permalink / raw)
To: Simon Horman
Cc: Rick Jones, Jesse Gross, Rusty Russell, virtualization, dev,
virtualization, netdev, kvm
In-Reply-To: <20110120083727.GA1807@verge.net.au>
On Thu, Jan 20, 2011 at 05:38:33PM +0900, Simon Horman wrote:
> [ Trimmed Eric from CC list as vger was complaining that it is too long ]
>
> On Tue, Jan 18, 2011 at 11:41:22AM -0800, Rick Jones wrote:
> > >So it won't be all that simple to implement well, and before we try,
> > >I'd like to know whether there are applications that are helped
> > >by it. For example, we could try to measure latency at various
> > >pps and see whether the backpressure helps. netperf has -b, -w
> > >flags which might help these measurements.
> >
> > Those options are enabled when one adds --enable-burst to the
> > pre-compilation ./configure of netperf (one doesn't have to
> > recompile netserver). However, if one is also looking at latency
> > statistics via the -j option in the top-of-trunk, or simply at the
> > histogram with --enable-histogram on the ./configure and a verbosity
> > level of 2 (global -v 2) then one wants the very top of trunk
> > netperf from:
>
> Hi,
>
> I have constructed a test where I run an un-paced UDP_STREAM test in
> one guest and a paced omni rr test in another guest at the same time.
Hmm, what is this supposed to measure? Basically each time you run an
un-paced UDP_STREAM you get some random load on the network.
You can't tell what it was exactly, only that it was between
the send and receive throughput.
> Breifly I get the following results from the omni test..
>
> 1. Omni test only: MEAN_LATENCY=272.00
> 2. Omni and stream test: MEAN_LATENCY=3423.00
> 3. cpu and net_cls group: MEAN_LATENCY=493.00
> As per 2 plus cgoups are created for each guest
> and guest tasks added to the groups
> 4. 100Mbit/s class: MEAN_LATENCY=273.00
> As per 3 plus the net_cls groups each have a 100MBit/s HTB class
> 5. cpu.shares=128: MEAN_LATENCY=652.00
> As per 4 plus the cpu groups have cpu.shares set to 128
> 6. Busy CPUS: MEAN_LATENCY=15126.00
> As per 5 but the CPUs are made busy using a simple shell while loop
>
> There is a bit of noise in the results as the two netperf invocations
> aren't started at exactly the same moment
>
> For reference, my netperf invocations are:
> netperf -c -C -t UDP_STREAM -H 172.17.60.216 -l 12
> netperf.omni -p 12866 -D -c -C -H 172.17.60.216 -t omni -j -v 2 -- -r 1 -d rr -k foo -b 1 -w 200 -m 200
>
> foo contains
> PROTOCOL
> THROUGHPUT,THROUGHPUT_UNITS
> LOCAL_SEND_THROUGHPUT
> LOCAL_RECV_THROUGHPUT
> REMOTE_SEND_THROUGHPUT
> REMOTE_RECV_THROUGHPUT
> RT_LATENCY,MIN_LATENCY,MEAN_LATENCY,MAX_LATENCY
> P50_LATENCY,P90_LATENCY,P99_LATENCY,STDDEV_LATENCY
> LOCAL_CPU_UTIL,REMOTE_CPU_UTIL
^ permalink raw reply
* [PATCH net-next-2.6] net_sched: TCQ_F_CAN_BYPASS generalization
From: Eric Dumazet @ 2011-01-21 11:04 UTC (permalink / raw)
To: David Miller
Cc: netdev, Patrick McHardy, Jesper Dangaard Brouer, Jarek Poplawski,
jamal
In-Reply-To: <1295537236.2825.286.camel@edumazet-laptop>
Now that the qdisc stab is handled before the TCQ_F_CAN_BYPASS test in
__dev_xmit_skb(), we can generalize TCQ_F_CAN_BYPASS to qdiscs other
than pfifo_fast: pfifo, bfifo, pfifo_head_drop and sfq.
SFQ is special because it can have external classifiers, and in these
cases we cannot bypass the queue discipline (the packet could be dropped
by the classifier) without the admin asking for it, or further changes.
It's worth doing this, especially for SFQ, since it avoids dirtying
memory when no packets are already waiting in the queue.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Jesper Dangaard Brouer <hawk@diku.dk>
CC: Jarek Poplawski <jarkao2@gmail.com>
CC: Jamal Hadi Salim <hadi@cyberus.ca>
CC: Stephen Hemminger <shemminger@vyatta.com>
---
I am not sure RED can use bypass too, feel free to comment on this ;)
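A minimal sketch of the decision the fifo_init() hunk makes, written as standalone C. demo_fifo_can_bypass, demo_update_flags and DEMO_F_CAN_BYPASS are hypothetical names, and the flag value is chosen for the demo only.

```c
#include <stdbool.h>

#define DEMO_F_CAN_BYPASS 0x4u	/* demo-only flag bit */

/*
 * A FIFO may be bypassed only if it could hold at least one packet:
 * bfifo limits are counted in bytes, pfifo limits in packets.
 */
static bool demo_fifo_can_bypass(bool is_bfifo, unsigned int limit, unsigned int mtu)
{
	return is_bfifo ? limit >= mtu : limit >= 1;
}

/* Set or clear the bypass bit, leaving all other flag bits untouched. */
static unsigned int demo_update_flags(unsigned int flags, bool bypass)
{
	return bypass ? (flags | DEMO_F_CAN_BYPASS) : (flags & ~DEMO_F_CAN_BYPASS);
}
```

Clearing the bit explicitly in the "else" case matters when the qdisc is reconfigured with a smaller limit after the flag was already set.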
net/sched/sch_fifo.c | 13 ++++++++++++-
net/sched/sch_generic.c | 5 ++---
net/sched/sch_mq.c | 1 -
net/sched/sch_mqprio.c | 1 -
net/sched/sch_sfq.c | 6 ++++++
5 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/net/sched/sch_fifo.c b/net/sched/sch_fifo.c
index b3075f8..f7290d2 100644
--- a/net/sched/sch_fifo.c
+++ b/net/sched/sch_fifo.c
@@ -64,11 +64,13 @@ static int pfifo_tail_enqueue(struct sk_buff *skb, struct Qdisc *sch)
static int fifo_init(struct Qdisc *sch, struct nlattr *opt)
{
struct fifo_sched_data *q = qdisc_priv(sch);
+ bool bypass;
+ bool is_bfifo = sch->ops == &bfifo_qdisc_ops;
if (opt == NULL) {
u32 limit = qdisc_dev(sch)->tx_queue_len ? : 1;
- if (sch->ops == &bfifo_qdisc_ops)
+ if (is_bfifo)
limit *= psched_mtu(qdisc_dev(sch));
q->limit = limit;
@@ -81,6 +83,15 @@ static int fifo_init(struct Qdisc *sch, struct nlattr *opt)
q->limit = ctl->limit;
}
+ if (is_bfifo)
+ bypass = q->limit >= psched_mtu(qdisc_dev(sch));
+ else
+ bypass = q->limit >= 1;
+
+ if (bypass)
+ sch->flags |= TCQ_F_CAN_BYPASS;
+ else
+ sch->flags &= ~TCQ_F_CAN_BYPASS;
return 0;
}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index cc17e79..0da09d5 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -527,6 +527,8 @@ static int pfifo_fast_init(struct Qdisc *qdisc, struct nlattr *opt)
for (prio = 0; prio < PFIFO_FAST_BANDS; prio++)
skb_queue_head_init(band2list(priv, prio));
+ /* Can by-pass the queue discipline */
+ qdisc->flags |= TCQ_F_CAN_BYPASS;
return 0;
}
@@ -691,9 +693,6 @@ static void attach_one_default_qdisc(struct net_device *dev,
netdev_info(dev, "activation failed\n");
return;
}
-
- /* Can by-pass the queue discipline for default qdisc */
- qdisc->flags |= TCQ_F_CAN_BYPASS;
}
dev_queue->qdisc_sleeping = qdisc;
}
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index ecc302f..ec5cbc8 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -61,7 +61,6 @@ static int mq_init(struct Qdisc *sch, struct nlattr *opt)
TC_H_MIN(ntx + 1)));
if (qdisc == NULL)
goto err;
- qdisc->flags |= TCQ_F_CAN_BYPASS;
priv->qdiscs[ntx] = qdisc;
}
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index 8620c65..fbc6f53 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -130,7 +130,6 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr *opt)
err = -ENOMEM;
goto err;
}
- qdisc->flags |= TCQ_F_CAN_BYPASS;
priv->qdiscs[i] = qdisc;
}
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 156ad30..fdba52a 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -560,6 +560,10 @@ static int sfq_init(struct Qdisc *sch, struct nlattr *opt)
slot_queue_init(&q->slots[i]);
sfq_link(q, i);
}
+ if (q->limit >= 1)
+ sch->flags |= TCQ_F_CAN_BYPASS;
+ else
+ sch->flags &= ~TCQ_F_CAN_BYPASS;
return 0;
}
@@ -611,6 +615,8 @@ static unsigned long sfq_get(struct Qdisc *sch, u32 classid)
static unsigned long sfq_bind(struct Qdisc *sch, unsigned long parent,
u32 classid)
{
+ /* we cannot bypass queue discipline anymore */
+ sch->flags &= ~TCQ_F_CAN_BYPASS;
return 0;
}
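Editorial sketch: the flag decisions made by the patch above can be restated as a small userspace model. This is not kernel code. The function names are hypothetical, and takes_fast_path() assumes, as the commit message implies, that the transmit path only skips the qdisc when the flag is set and no packets are waiting.

```python
def fifo_can_bypass(limit, is_bfifo, mtu):
    """Model of the test added to fifo_init().

    bfifo limits are in bytes, so the queue must hold at least one
    MTU-sized packet; pfifo limits are in packets.
    """
    if is_bfifo:
        return limit >= mtu
    return limit >= 1


def sfq_can_bypass(limit, classifier_bound):
    """Model of sfq_init() setting the flag and sfq_bind() clearing it."""
    return limit >= 1 and not classifier_bound


def takes_fast_path(can_bypass, queue_len):
    """Assumed transmit-side rule: bypass only when nothing is queued."""
    return can_bypass and queue_len == 0


if __name__ == "__main__":
    print("pfifo, limit 1000:", fifo_can_bypass(1000, False, 1500))
    print("bfifo, limit 1000 bytes:", fifo_can_bypass(1000, True, 1500))
    print("sfq with classifier:", sfq_can_bypass(127, True))
```

The model makes one consequence of the patch easy to see: a bfifo whose byte limit is smaller than one MTU never gets the flag, because a full-sized packet could not be queued there in the first place.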
* RE: Using ethernet device as efficient small packet generator
From: juice @ 2011-01-21 11:44 UTC (permalink / raw)
To: Loke, Chetan, Jon Zhou, Eric Dumazet, Stephen Hemminger, netdev
In-Reply-To: <D3F292ADF945FB49B35E96C94C2061B90ECC4FAC@nsmail.netscout.com>
>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>> owner@vger.kernel.org] On Behalf Of Jon Zhou
>> Sent: December 23, 2010 3:58 AM
>> To: juice@swagman.org; Eric Dumazet; Stephen Hemminger;
>> netdev@vger.kernel.org
>> Subject: RE: Using ethernet device as efficient small packet generator
>>
>>
>> At another old kernel(2.6.16) with tg3 and bnx2 1G NIC,XEON E5450, I
>> only got 490K pps(it is about 300Mbps,30% GE), I think the reason is
>> multiqueue unsupported in this kernel.
>>
>> I will do a test with 1Gb nic on the new kernel later.
>>
>
>
> I can hit close to 1M pps(first time every time) w/ a 64-byte payload on
> my VirtualMachine(running 2.6.33) via vmxnet3 vNIC -
>
>
> [root@localhost ~]# cat /proc/net/pktgen/eth2
> Params: count 0 min_pkt_size: 60 max_pkt_size: 60
> frags: 0 delay: 0 clone_skb: 0 ifname: eth2
> flows: 0 flowlen: 0
> queue_map_min: 0 queue_map_max: 0
> dst_min: 192.168.222.2 dst_max:
> src_min: src_max:
> src_mac: 00:50:56:b1:00:19 dst_mac: 00:50:56:c0:00:3e
> udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9
> src_mac_count: 0 dst_mac_count: 0
> Flags:
> Current:
> pkts-sofar: 59241012 errors: 0
> started: 1898437021us stopped: 1957709510us idle: 9168us
> seq_num: 59241013 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
> cur_saddr: 0x0 cur_daddr: 0x2dea8c0
> cur_udp_dst: 9 cur_udp_src: 9
> cur_queue_map: 0
> flows: 0
> Result: OK: 59272488(c59263320+d9168) nsec, 59241012 (60byte,0frags)
> 999468pps 479Mb/sec (479744640bps) errors: 0
>
>
>
> Chetan
>
Hi again.
It has been a while since I last had a chance to test this,
as there have been some other matters at hand.
However, I have now managed to rerun my tests on several different kernels.
I am now using a PCIe Intel e1000e card, which should be able to handle
the needed amount of traffic.
The statistics that I get are as follows:
kernel 2.6.32-27 (ubuntu 10.10 default)
pktgen: 750064pps 360Mb/sec (360030720bps)
AX4000 analyser: Total bitrate: 383.879 MBits/s
Bandwidth: 38.39% GE
Average packet interval: 1.33 us
kernel 2.6.37 (latest stable from kernel.org)
pktgen: 786848pps 377Mb/sec (377687040bps)
AX4000 analyser: Total bitrate: 402.904 MBits/s
Bandwidth: 40.29% GE
Average packet interval: 1.27 us
kernel 2.6.38-rc1 (latest from kernel.org)
pktgen: 795297pps 381Mb/sec (381742560bps)
AX4000 analyser: Total bitrate: 407.117 MBits/s
Bandwidth: 40.72% GE
Average packet interval: 1.26 us
In every case I have set the IRQ affinity of eth1 to CPU0 and started
the test running in kpktgend_0.
The complete data from my measurements follows at the end of this post.
It looks like the small-packet sending efficiency of the ethernet driver
is improving all the time, albeit quite slowly.
Now, I would be interested in knowing whether it is indeed possible to
increase the sending rate to near full 1GE capacity with the ethernet
card I am currently using, or whether I have a hardware limitation here.
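Editorial sketch: for reference, the 1GE ceiling for these packets can be worked out from the figures in the post. The 60-byte pkt_size and the 4-byte CRC added by the NIC come from the pktgen script below. The 8 bytes of preamble plus start-of-frame delimiter and the 12-byte inter-frame gap are standard Ethernet overheads and are not stated in the post. The function names are hypothetical.

```python
LINK_BPS = 10**9      # 1 Gbit/s
FCS = 4               # CRC appended by the NIC
PREAMBLE_SFD = 8      # preamble + start-of-frame delimiter (standard Ethernet)
IFG = 12              # minimum inter-frame gap (standard Ethernet)


def max_pps(pkt_size, link_bps=LINK_BPS):
    """Theoretical packets per second when every wire byte is counted."""
    wire_bits = (pkt_size + FCS + PREAMBLE_SFD + IFG) * 8
    return link_bps / wire_bits


def pktgen_bps(pps, pkt_size):
    """Bit rate as pktgen reports it: payload bytes only, no CRC."""
    return pps * pkt_size * 8


def frame_bps(pps, pkt_size):
    """Bit rate counting the whole frame including CRC."""
    return pps * (pkt_size + FCS) * 8


if __name__ == "__main__":
    ceiling = max_pps(60)
    print("ceiling: %.0f pps" % ceiling)
    for kernel, pps in [("2.6.32-27", 750064),
                        ("2.6.37", 786848),
                        ("2.6.38-rc1", 795297)]:
        print("%-11s %d pps, %.1f%% of ceiling, frame rate %.3f Mbit/s"
              % (kernel, pps, 100.0 * pps / ceiling,
                 frame_bps(pps, 60) / 1e6))
```

Under these assumptions the ceiling is about 1,488,095 pps, so the measured rates are a little over half of what the wire allows. The frame rate including CRC comes out close to the analyser's "Total bitrate", which suggests the analyser counts the 64-byte frame but not the preamble or gap.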
I recall hearing that there are some enhanced versions of the e1000
network card that have been geared towards higher performance
at the expense of some functionality or general system efficiency.
Can anybody point me to how to do that?
As I stated before, quoting myself:
> Which do you suppose is the reason for poor performance on my setup,
> is it lack of multiqueue HW in the GE NIC's I am using or is it lack
> of multiqueue support in the kernel (2.6.32) that I am using?
>
> Is multiqueue really necessary to achieve the full 1GE saturation, or
> is it only needed on 10GE NIC's?
>
> As I understand multiqueue is useful only if there are lots of CPU cores
> to run, each handling one queue.
>
> The application I am thinking of, preloading a packet sequence into
> kernel from userland application and then starting to send from buffer
> propably does not benefit so much from many cores, it would be enough
> that one CPU would handle the sending and other core(s) would handle
> other tasks.
Yours, Jussi Ohenoja
*** Measurement details follow ***
root@d8labralinux:/var/home/juice# lspci -vvv -s 04:00.0
04:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet
Controller (Copper) (rev 06)
Subsystem: Intel Corporation Device 1082
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 11
Region 0: Memory at f3cc0000 (32-bit, non-prefetchable) [size=128K]
Region 1: Memory at f3ce0000 (32-bit, non-prefetchable) [size=128K]
Region 2: I/O ports at cce0 [size=32]
Expansion ROM at f3d00000 [disabled] [size=128K]
Capabilities: [c8] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0
Enable-
Address: 0000000000000000 Data: 0000
Capabilities: [e0] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Latency L0 <4us, L1
<64us
ClockPM- Suprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive-
BWMgmt- ABWMgmt-
Capabilities: [100] Advanced Error Reporting <?>
Capabilities: [140] Device Serial Number b1-e5-7c-ff-ff-21-1b-00
Kernel modules: e1000e
root@d8labralinux:/var/home/juice# ethtool eth1
Settings for eth1:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Link partner advertised link modes: Not reported
Link partner advertised pause frame use: No
Link partner advertised auto-negotiation: No
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
MDI-X: on
Supports Wake-on: pumbag
Wake-on: d
Current message level: 0x00000001 (1)
Link detected: yes
2.6.38-rc1
----------
dmesg:
[ 195.685655] e1000e: Intel(R) PRO/1000 Network Driver - 1.2.20-k2
[ 195.685658] e1000e: Copyright(c) 1999 - 2011 Intel Corporation.
[ 195.685677] e1000e 0000:04:00.0: Disabling ASPM L1
[ 195.685690] e1000e 0000:04:00.0: PCI INT A -> GSI 16 (level, low) ->
IRQ 16
[ 195.685707] e1000e 0000:04:00.0: setting latency timer to 64
[ 195.685852] e1000e 0000:04:00.0: irq 69 for MSI/MSI-X
[ 195.869917] e1000e 0000:04:00.0: eth1: (PCI Express:2.5GB/s:Width x1)
00:1b:21:7c:e5:b1
[ 195.869921] e1000e 0000:04:00.0: eth1: Intel(R) PRO/1000 Network
Connection
[ 195.870006] e1000e 0000:04:00.0: eth1: MAC: 1, PHY: 4, PBA No: D50861-006
[ 196.017285] e1000e 0000:04:00.0: irq 69 for MSI/MSI-X
[ 196.073144] e1000e 0000:04:00.0: irq 69 for MSI/MSI-X
[ 196.073630] ADDRCONF(NETDEV_UP): eth1: link is not ready
[ 198.746000] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: None
[ 198.746162] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[ 209.564433] eth1: no IPv6 routers present
pktgen:
Params: count 10000000 min_pkt_size: 60 max_pkt_size: 60
frags: 0 delay: 0 clone_skb: 1 ifname: eth1
flows: 0 flowlen: 0
queue_map_min: 0 queue_map_max: 0
dst_min: 10.10.11.2 dst_max:
src_min: src_max:
src_mac: 00:1b:21:7c:e5:b1 dst_mac: 00:04:23:08:91:dc
udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9
src_mac_count: 0 dst_mac_count: 0
Flags:
Current:
pkts-sofar: 10000000 errors: 0
started: 77203892067us stopped: 77216465982us idle: 1325us
seq_num: 10000001 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
cur_saddr: 0x0 cur_daddr: 0x20b0a0a
cur_udp_dst: 9 cur_udp_src: 9
cur_queue_map: 0
flows: 0
Result: OK: 12573914(c12572589+d1325) nsec, 10000000 (60byte,0frags)
795297pps 381Mb/sec (381742560bps) errors: 0
AX4000 analyser:
Total bitrate: 407.117 MBits/s
Bandwidth: 40.72% GE
Average packet interval: 1.26 us
2.6.37
------
dmesg:
[ 1810.959907] e1000e: Intel(R) PRO/1000 Network Driver - 1.2.7-k2
[ 1810.959909] e1000e: Copyright (c) 1999 - 2010 Intel Corporation.
[ 1810.959928] e1000e 0000:04:00.0: Disabling ASPM L1
[ 1810.959942] e1000e 0000:04:00.0: PCI INT A -> GSI 16 (level, low) ->
IRQ 16
[ 1810.959961] e1000e 0000:04:00.0: setting latency timer to 64
[ 1810.960103] e1000e 0000:04:00.0: irq 66 for MSI/MSI-X
[ 1811.137269] e1000e 0000:04:00.0: eth1: (PCI Express:2.5GB/s:Width x1)
00:1b:21:7c:e5:b1
[ 1811.137272] e1000e 0000:04:00.0: eth1: Intel(R) PRO/1000 Network
Connection
[ 1811.137358] e1000e 0000:04:00.0: eth1: MAC: 1, PHY: 4, PBA No: d50861-006
[ 1811.286173] e1000e 0000:04:00.0: irq 66 for MSI/MSI-X
[ 1811.342065] e1000e 0000:04:00.0: irq 66 for MSI/MSI-X
[ 1811.342575] ADDRCONF(NETDEV_UP): eth1: link is not ready
[ 1814.010736] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: None
[ 1814.010949] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[ 1824.082148] eth1: no IPv6 routers present
pktgen:
Params: count 10000000 min_pkt_size: 60 max_pkt_size: 60
frags: 0 delay: 0 clone_skb: 1 ifname: eth1
flows: 0 flowlen: 0
queue_map_min: 0 queue_map_max: 0
dst_min: 10.10.11.2 dst_max:
src_min: src_max:
src_mac: 00:1b:21:7c:e5:b1 dst_mac: 00:04:23:08:91:dc
udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9
src_mac_count: 0 dst_mac_count: 0
Flags:
Current:
pkts-sofar: 10000000 errors: 0
started: 265936151us stopped: 278645077us idle: 1651us
seq_num: 10000001 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
cur_saddr: 0x0 cur_daddr: 0x20b0a0a
cur_udp_dst: 9 cur_udp_src: 9
cur_queue_map: 0
flows: 0
Result: OK: 12708925(c12707274+d1651) nsec, 10000000 (60byte,0frags)
786848pps 377Mb/sec (377687040bps) errors: 0
AX4000 analyser:
Total bitrate: 402.904 MBits/s
Bandwidth: 40.29% GE
Average packet interval: 1.27 us
2.6.32-27
---------
dmesg:
[ 2.178800] e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
[ 2.178802] e1000e: Copyright (c) 1999-2008 Intel Corporation.
[ 2.178854] e1000e 0000:04:00.0: PCI INT A -> GSI 16 (level, low) ->
IRQ 16
[ 2.178887] e1000e 0000:04:00.0: setting latency timer to 64
[ 2.179039] e1000e 0000:04:00.0: irq 53 for MSI/MSI-X
[ 2.360700] 0000:04:00.0: eth1: (PCI Express:2.5GB/s:Width x1)
00:1b:21:7c:e5:b1
[ 2.360702] 0000:04:00.0: eth1: Intel(R) PRO/1000 Network Connection
[ 2.360787] 0000:04:00.0: eth1: MAC: 1, PHY: 4, PBA No: d50861-006
[ 9.551486] e1000e 0000:04:00.0: irq 53 for MSI/MSI-X
[ 9.607309] e1000e 0000:04:00.0: irq 53 for MSI/MSI-X
[ 9.607876] ADDRCONF(NETDEV_UP): eth1: link is not ready
[ 12.448302] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow
Control: None
[ 12.448544] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[ 23.068498] eth1: no IPv6 routers present
pktgen:
Params: count 10000000 min_pkt_size: 60 max_pkt_size: 60
frags: 0 delay: 0 clone_skb: 1 ifname: eth1
flows: 0 flowlen: 0
queue_map_min: 0 queue_map_max: 0
dst_min: 10.10.11.2 dst_max:
src_min: src_max:
src_mac: 00:1b:21:7c:e5:b1 dst_mac: 00:04:23:08:91:dc
udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9
src_mac_count: 0 dst_mac_count: 0
Flags:
Current:
pkts-sofar: 10000000 errors: 0
started: 799760010us stopped: 813092189us idle: 1314us
seq_num: 10000001 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
cur_saddr: 0x0 cur_daddr: 0x20b0a0a
cur_udp_dst: 9 cur_udp_src: 9
cur_queue_map: 0
flows: 0
Result: OK: 13332178(c13330864+d1314) nsec, 10000000 (60byte,0frags)
750064pps 360Mb/sec (360030720bps) errors: 0
AX4000 analyser:
Total bitrate: 383.879 MBits/s
Bandwidth: 38.39% GE
Average packet interval: 1.33 us
root@d8labralinux:/var/home/juice/pkt_test# cat ./pktgen_conf
#!/bin/bash
#modprobe pktgen
function pgset() {
local result
echo $1 > $PGDEV
result=`cat $PGDEV | fgrep "Result: OK:"`
if [ "$result" = "" ]; then
cat $PGDEV | fgrep Result:
fi
}
function pg() {
echo inject > $PGDEV
cat $PGDEV
}
# Config Start Here -----------------------------------------------------------
# thread config
# Each CPU has its own thread. Two-CPU example. We add eth1, eth2 respectively.
PGDEV=/proc/net/pktgen/kpktgend_0
echo "Removing all devices"
pgset "rem_device_all"
PGDEV=/proc/net/pktgen/kpktgend_1
pgset "rem_device_all"
PGDEV=/proc/net/pktgen/kpktgend_0
echo "Adding eth1"
pgset "add_device eth1"
#echo "Setting max_before_softirq 10000"
#pgset "max_before_softirq 10000"
# device config
# ipg is inter packet gap. 0 means maximum speed.
CLONE_SKB="clone_skb 1"
# NIC adds 4 bytes CRC
PKT_SIZE="pkt_size 60"
# COUNT 0 means forever
#COUNT="count 0"
COUNT="count 10000000"
IPG="delay 0"
PGDEV=/proc/net/pktgen/eth1
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "$PKT_SIZE"
pgset "$IPG"
pgset "dst 10.10.11.2"
pgset "dst_mac 00:04:23:08:91:dc"
pgset "queue_map_min 0"
# Time to run
PGDEV=/proc/net/pktgen/pgctrl
echo "Running... ctrl^C to stop"
pgset "start"
echo "Done"
# Result can be viewed in /proc/net/pktgen/eth1
* Re: [PATCH] Ensure that we unshare skbs prior to calling pskb_may_pull in bonding driver
From: Neil Horman @ 2011-01-21 11:51 UTC (permalink / raw)
To: David Miller; +Cc: netdev, andy, fubar
In-Reply-To: <20110120.164723.73670910.davem@davemloft.net>
On Thu, Jan 20, 2011 at 04:47:23PM -0800, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Thu, 20 Jan 2011 14:02:31 -0500
>
> > Recently reported oops:
>
> Applied, but please compose reasonable Subject lines with your patches,
> always begin the line with a subsystem tag followed by a colon.
>
> This way we get
>
> bonding: Foo bar baz
>
> instead of
>
> Foo bar baz in the bonding driver
>
> Thanks.
>
Yeah, my bad, I realized I screwed up the Subject the second I sent the email,
sorry about that.
Regards
Neil
* RE: Using ethernet device as efficient small packet generator
From: Eric Dumazet @ 2011-01-21 11:51 UTC (permalink / raw)
To: juice; +Cc: Loke, Chetan, Jon Zhou, Stephen Hemminger, netdev
In-Reply-To: <13dbf221c875a931d408784495884998.squirrel@www.liukuma.net>
Le vendredi 21 janvier 2011 à 13:44 +0200, juice a écrit :
> Hi again.
>
> It has been a while since last time I got to be able to test this
> again, as there have been some other matters at hand.
> However, now I managed to rerun my tests in several different kernels.
>
> I am using now a PCIe Intel e1000e card, that should be able to handle
> the needed traffic amount.
>
> The statistics that I get are as follows:
>
> kernel 2.6.32-27 (ubuntu 10.10 default)
> pktgen: 750064pps 360Mb/sec (360030720bps)
> AX4000 analyser: Total bitrate: 383.879 MBits/s
> Bandwidth: 38.39% GE
> Average packet interval: 1.33 us
>
> kernel 2.6.37 (latest stable from kernel.org)
> pktgen: 786848pps 377Mb/sec (377687040bps)
> AX4000 analyser: Total bitrate: 402.904 MBits/s
> Bandwidth: 40.29% GE
> Average packet interval: 1.27 us
>
> kernel 2.6.38-rc1 (latest from kernel.org)
> pktgen: 795297pps 381Mb/sec (381742560bps)
> AX4000 analyser: Total bitrate: 407.117 MBits/s
> Bandwidth: 40.72% GE
> Average packet interval: 1.26 us
>
>
...
> pktgen:
>
> Params: count 10000000 min_pkt_size: 60 max_pkt_size: 60
> frags: 0 delay: 0 clone_skb: 1 ifname: eth1
> flows: 0 flowlen: 0
> queue_map_min: 0 queue_map_max: 0
> dst_min: 10.10.11.2 dst_max:
> src_min: src_max:
> src_mac: 00:1b:21:7c:e5:b1 dst_mac: 00:04:23:08:91:dc
> udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9
> src_mac_count: 0 dst_mac_count: 0
> Flags:
> Current:
> pkts-sofar: 10000000 errors: 0
> started: 77203892067us stopped: 77216465982us idle: 1325us
> seq_num: 10000001 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
> cur_saddr: 0x0 cur_daddr: 0x20b0a0a
> cur_udp_dst: 9 cur_udp_src: 9
> cur_queue_map: 0
> flows: 0
> Result: OK: 12573914(c12572589+d1325) nsec, 10000000 (60byte,0frags)
> 795297pps 381Mb/sec (381742560bps) errors: 0
>
>
> AX4000 analyser:
>
> Total bitrate: 407.117 MBits/s
> Bandwidth: 40.72% GE
> Average packet interval: 1.26 us
>
>
You should try
CLONE_SKB="clone_skb 10"
...
pgset "$CLONE_SKB"
Because I suspect you hit a performance problem on skb
allocation/filling/use/freeing
You can use perf tool to get some performance profile while your pktgen
session is running
# cd tools/perf
# make
...
# ./perf top
* RE: Using ethernet device as efficient small packet generator
From: juice @ 2011-01-21 12:12 UTC (permalink / raw)
To: Eric Dumazet, Loke, Chetan, Jon Zhou, Stephen Hemminger, netdev
In-Reply-To: <1295610709.2601.35.camel@edumazet-laptop>
> Le vendredi 21 janvier 2011 à 13:44 +0200, juice a écrit :
>
>
> You should try
>
> CLONE_SKB="clone_skb 10"
> ...
> pgset "$CLONE_SKB"
>
>
> Because I suspect you hit a performance problem on skb
> allocation/filling/use/freeing
Actually, that makes the performance worse:
(Now I tried it with kernel 2.6.37, which is currently running)
root@d8labralinux:/var/home/juice/pkt_test# cat /proc/net/pktgen/eth1
Params: count 10000000 min_pkt_size: 60 max_pkt_size: 60
frags: 0 delay: 0 clone_skb: 10 ifname: eth1
flows: 0 flowlen: 0
queue_map_min: 0 queue_map_max: 0
dst_min: 10.10.11.2 dst_max:
src_min: src_max:
src_mac: 00:1b:21:7c:e5:b1 dst_mac: 00:04:23:08:91:dc
udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9
src_mac_count: 0 dst_mac_count: 0
Flags:
Current:
pkts-sofar: 10000000 errors: 0
started: 2555660074us stopped: 2569239323us idle: 3484us
seq_num: 10000001 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
cur_saddr: 0x0 cur_daddr: 0x20b0a0a
cur_udp_dst: 9 cur_udp_src: 9
cur_queue_map: 0
flows: 0
Result: OK: 13579248(c13575763+d3484) nsec, 10000000 (60byte,0frags)
736417pps 353Mb/sec (353480160bps) errors: 0
> You can use perf tool to get some performance profile while your pktgen
> session is running
>
> # cd tools/perf
> # make
> ...
> # ./perf top
>
I can try that.
Where do I get the performance profiler tool?
Yours, Jussi Ohenoja
* Re: [PATCH v4] net: add Faraday FTMAC100 10/100 Ethernet driver
From: Michał Mirosław @ 2011-01-21 12:26 UTC (permalink / raw)
To: Po-Yu Chuang
Cc: netdev, linux-kernel, bhutchings, eric.dumazet, joe, dilinger,
Po-Yu Chuang
In-Reply-To: <1295596533-1748-1-git-send-email-ratbert.chuang@gmail.com>
2011/1/21 Po-Yu Chuang <ratbert.chuang@gmail.com>:
> From: Po-Yu Chuang <ratbert@faraday-tech.com>
>
> FTMAC100 Ethernet Media Access Controller supports 10/100 Mbps and
> MII. This driver has been working on some ARM/NDS32 SoC's including
> Faraday A320 and Andes AG101.
>
> Signed-off-by: Po-Yu Chuang <ratbert@faraday-tech.com>
[...]
> +static void ftmac100_txdes_reset(struct ftmac100_txdes *txdes)
> +{
> + /* clear all except end of ring bit */
> + txdes->txdes0 = 0;
> + txdes->txdes1 &= FTMAC100_TXDES1_EDOTR;
> + txdes->txdes2 = 0;
> + txdes->txdes3 = 0;
> +}
This also probably needs cpu_to_le32().
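Editorial sketch: the byte-order point above can be illustrated in userspace. The model below assumes the hardware reads descriptor words as little-endian, which is what the cpu_to_le32() remark implies, and assumes for illustration only that the end-of-ring flag is bit 31. The helper names are hypothetical and none of this is driver code.

```python
import struct

EDOTR = 1 << 31  # assumed bit position, for illustration only


def swab32(x):
    """Byte-swap a 32-bit value (what cpu_to_le32() does on a big-endian CPU)."""
    return struct.unpack("<I", struct.pack(">I", x))[0]


def and_on_big_endian_cpu(desc_bytes, mask):
    """Model `word &= mask` executed by a big-endian CPU on 4 raw bytes."""
    (word,) = struct.unpack(">I", desc_bytes)   # native big-endian load
    return struct.pack(">I", word & mask)       # native big-endian store


def device_view(desc_bytes):
    """The value the little-endian hardware sees in those 4 bytes."""
    return struct.unpack("<I", desc_bytes)[0]


if __name__ == "__main__":
    # Descriptor as the device wrote it: end-of-ring set plus other bits.
    desc = struct.pack("<I", EDOTR | 0x123)

    naive = and_on_big_endian_cpu(desc, EDOTR)
    fixed = and_on_big_endian_cpu(desc, swab32(EDOTR))

    print("unswapped mask, device sees: 0x%08x" % device_view(naive))
    print("swapped mask, device sees:   0x%08x" % device_view(fixed))
```

With the unswapped mask the end-of-ring bit is lost on a big-endian CPU, while the swapped mask keeps it. On a little-endian CPU the swap is a no-op, which is why the problem would not show up in testing there.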
[...]
> +static void ftmac100_free_buffers(struct ftmac100 *priv)
> +{
> + int i;
> +
> + for (i = 0; i < RX_QUEUE_ENTRIES; i += 2) {
> + struct ftmac100_rxdes *rxdes = &priv->descs->rxdes[i];
> + dma_addr_t d = ftmac100_rxdes_get_dma_addr(rxdes);
> + void *page = ftmac100_rxdes_get_va(rxdes);
> +
> + if (d)
> + dma_unmap_single(priv->dev, d, PAGE_SIZE,
> + DMA_FROM_DEVICE);
> +
> + if (page != NULL)
> + free_page((unsigned long)page);
> + }
> +
[...]
> +static int ftmac100_alloc_buffers(struct ftmac100 *priv)
> +{
> + int i;
> +
> + priv->descs = dma_alloc_coherent(priv->dev,
> + sizeof(struct ftmac100_descs),
> + &priv->descs_dma_addr,
> + GFP_KERNEL | GFP_DMA);
> + if (priv->descs == NULL)
> + return -ENOMEM;
> +
> + memset(priv->descs, 0, sizeof(struct ftmac100_descs));
> +
> + /* initialize RX ring */
> +
> + ftmac100_rxdes_set_end_of_ring(&priv->descs->rxdes[RX_QUEUE_ENTRIES - 1]);
> +
> + for (i = 0; i < RX_QUEUE_ENTRIES; i += 2) {
> + struct ftmac100_rxdes *rxdes = &priv->descs->rxdes[i];
> + void *page;
> + dma_addr_t d;
> +
> + page = (void *)__get_free_page(GFP_KERNEL | GFP_DMA);
> + if (page == NULL)
> + goto err;
> +
> + d = dma_map_single(priv->dev, page, PAGE_SIZE, DMA_FROM_DEVICE);
> + if (unlikely(dma_mapping_error(priv->dev, d))) {
> + free_page((unsigned long)page);
> + goto err;
> + }
> +
> + /*
> + * The hardware enforces a sub-2K maximum packet size, so we
> + * put two buffers on every hardware page.
> + */
> + ftmac100_rxdes_set_va(rxdes, page);
> + ftmac100_rxdes_set_va(rxdes + 1, page + PAGE_SIZE / 2);
> +
> + ftmac100_rxdes_set_dma_addr(rxdes, d);
> + ftmac100_rxdes_set_dma_addr(rxdes + 1, d + PAGE_SIZE / 2);
> +
> + ftmac100_rxdes_set_buffer_size(rxdes, RX_BUF_SIZE);
> + ftmac100_rxdes_set_buffer_size(rxdes + 1, RX_BUF_SIZE);
> +
> + ftmac100_rxdes_set_dma_own(rxdes);
> + ftmac100_rxdes_set_dma_own(rxdes + 1);
> + }
[...]
Did you test this? This looks like it will result in double free after
packet RX, as you are giving the same page (referenced once) to two
distinct RX descriptors, that may be assigned different packets.
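Editorial sketch: the concern above can be made concrete with a toy reference-count model. It is not driver code. The Page class and receive_two_packets() are hypothetical, with get() and put() loosely standing in for taking and dropping a page reference, and the model assumes each received packet eventually drops one reference on the page that backs its buffer.

```python
class Page:
    """A page that starts with a single reference, as after allocation."""

    def __init__(self):
        self.refcount = 1
        self.freed = False

    def get(self):
        """Take an additional reference."""
        self.refcount += 1

    def put(self):
        """Drop a reference; free the page when the count reaches zero."""
        if self.freed:
            raise RuntimeError("double free: page already released")
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True
        return self.freed


def receive_two_packets(take_extra_reference):
    """Two descriptors share one page; each packet later drops a reference."""
    page = Page()
    if take_extra_reference:
        page.get()          # one reference per descriptor
    try:
        page.put()          # buffer of the first packet is released
        page.put()          # buffer of the second packet is released
    except RuntimeError as exc:
        return str(exc)
    return "ok"


if __name__ == "__main__":
    print("one reference, two users: ", receive_two_packets(False))
    print("one reference per user:   ", receive_two_packets(True))
```

In this model the shared page survives both releases only when each descriptor holds its own reference. Giving every descriptor its own buffer, as suggested in the review, avoids the sharing altogether.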
Since you're not implementing any RX offloads, you might just allocate
fresh skb's with alloc_skb() and store skb pointer in rxdes3. Since
hardware doesn't touch it, you can skip cpu_to_le32()/le32_to_cpu()
there (leave a comment, though).
Unless this needs to work for ISA devices, you should drop GFP_DMA
allocation flag.
Best Regards,
Michał Mirosław