* Re: Linux Route Cache performance tests
From: Paweł Staszewski @ 2011-11-07 8:36 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Linux Network Development list
In-Reply-To: <1320620915.6506.44.camel@edumazet-laptop>
On 2011-11-07 00:08, Eric Dumazet wrote:
> On Sunday 06 November 2011 at 22:57 +0100, Paweł Staszewski wrote:
>> On 2011-11-06 22:26, Eric Dumazet wrote:
>>> On Sunday 06 November 2011 at 21:25 +0100, Paweł Staszewski wrote:
>>>> Yes, I think there is a small problem with this on kernel 3.1, because
>>>> dmesg | egrep '(rhash)|(route)'
>>>> [ 0.000000] Command line: root=/dev/md2 rhash_entries=2097152
>>>> [ 0.000000] Kernel command line: root=/dev/md2 rhash_entries=2097152
>>>> [ 4.697294] IP route cache hash table entries: 524288 (order: 10,
>>>> 4194304 bytes)
>>>>
>>>>
>>> Don't tell me you _still_ use a 32-bit kernel?
>> No, it is 64-bit :)
>> Linux localhost 3.1.0 #16 SMP Sun Nov 6 18:09:48 CET 2011 x86_64 Intel(R)
>> :)
>>
>>> If so, you need to tweak alloc_large_system_hash() to use vmalloc,
>>> because you hit MAX_ORDER (10) page allocations.
>> Funny then :)
>> Maybe I turned off too many kernel features.
>>> But considering LOWMEM is about 700 Mbytes, you won't be able to create a
>>> lot of route cache entries.
>>>
>>> Come on, do us a favor, and enter the new era of computing.
> OK, then your kernel is not CONFIG_NUMA enabled
>
> It seems strange given you probably have a NUMA machine (24 cpus)
Yes, NUMA was not enabled.
I ran some tests with and without NUMA to compare the performance of ixgbe
when using the Node="" parameter of the ixgbe module.
> If so, your choices are :
>
> 1) enable CONFIG_NUMA. Really this is a must given the workload of your
> machine.
>
> 2) Or : you need to add "hashdist=1" on boot params
> and patch your kernel with following patch :
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 9dd443d..07f86e0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5362,7 +5362,6 @@ int percpu_pagelist_fraction_sysctl_handler(ctl_table *table, int write,
>
> int hashdist = HASHDIST_DEFAULT;
>
> -#ifdef CONFIG_NUMA
> static int __init set_hashdist(char *str)
> {
> if (!str)
> @@ -5371,7 +5370,6 @@ static int __init set_hashdist(char *str)
> return 1;
> }
> __setup("hashdist=", set_hashdist);
> -#endif
>
> /*
> * allocate a large system hash table from bootmem
>
Yes, after enabling NUMA I can change rhash_entries at kernel boot.
The most important setting for a big route cache is rhash_entries:
if the route cache size exceeds the hash size, performance drops 6x to 8x.
So the best settings for the route cache are:
rhash_entries = gc_thresh = max_size
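As a rough check of why the requested rhash_entries was capped in the dmesg output earlier in this thread, the table size and page-allocation order can be worked out by hand. This is only a sketch: the 8 bytes per bucket is inferred from 4194304 / 524288 in that dmesg line, the 4096-byte page size is an assumption for x86_64, and the function names are made up for illustration.

```c
/* Assumed values, for illustration only: 8 bytes per hash bucket
 * (4194304 bytes / 524288 entries in the dmesg line) and 4 KiB pages. */
#define BUCKET_BYTES 8UL
#define PAGE_BYTES   4096UL

/* Bytes needed for a route cache hash table with the given entry count. */
unsigned long rhash_table_bytes(unsigned long entries)
{
	return entries * BUCKET_BYTES;
}

/* Smallest order such that (PAGE_BYTES << order) covers the table. */
unsigned int rhash_alloc_order(unsigned long bytes)
{
	unsigned int order = 0;

	while ((PAGE_BYTES << order) < bytes)
		order++;
	return order;
}
```

Under these assumptions 524288 entries fit in an order-10 allocation, while rhash_entries=2097152 would need order 12, which is above the order-10 limit Eric mentioned unless the table is allocated with vmalloc.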
Eric, can you tell me what the plans are for removing the route cache from
the kernel?
As you can see, performance is better with the route cache.
Without the route cache performance is not as good as with the route
cache enabled, but it is stable in all situations, even under a DDoS with
10kk random_ips.
So in the future, do we need to prepare for lower kernel IP forwarding
performance because there is no route cache?
Or will removing the route cache save some time in IP stack processing?
Thanks
Pawel
^ permalink raw reply
* Re: Linux Route Cache performance tests
From: Eric Dumazet @ 2011-11-07 9:08 UTC (permalink / raw)
To: Paweł Staszewski; +Cc: Linux Network Development list
In-Reply-To: <4EB798A5.7000900@itcare.pl>
On Monday 07 November 2011 at 09:36 +0100, Paweł Staszewski wrote:
> Yes, after enabling NUMA I can change rhash_entries at kernel boot.
>
> The most important setting for a big route cache is rhash_entries:
> if the route cache size exceeds the hash size, performance drops 6x to 8x.
> So the best settings for the route cache are:
> rhash_entries = gc_thresh = max_size
>
> Eric, can you tell me what the plans are for removing the route cache from
> the kernel?
> As you can see, performance is better with the route cache.
> Without the route cache performance is not as good as with the route
> cache enabled, but it is stable in all situations, even under a DDoS with
> 10kk random_ips.
>
> So in the future, do we need to prepare for lower kernel IP forwarding
> performance because there is no route cache?
> Or will removing the route cache save some time in IP stack processing?
>
>
Obviously, cache removal will be possible only when performance without
it is the same.
Work is in progress, it started a long time ago.
^ permalink raw reply
* Re: Linux Route Cache performance tests
From: Eric Dumazet @ 2011-11-07 9:16 UTC (permalink / raw)
To: Paweł Staszewski; +Cc: Linux Network Development list
In-Reply-To: <1320656922.2361.1.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
On Monday 07 November 2011 at 10:08 +0100, Eric Dumazet wrote:
> Obviously, cache removal will be possible only when performance without
> it is the same.
>
> Work is in progress, it started a long time ago.
>
One of the reasons to get rid of this cache is its memory use.
256 bytes per entry, that's a lot of memory if you need 2,000,000
entries...
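To put a number on that, here is a small worked sketch of the arithmetic. The function names are made up for illustration, and it counts only the per-entry cost quoted above, not the hash table itself.

```c
/* Memory consumed by the route cache entries alone, in bytes. */
unsigned long long route_cache_bytes(unsigned long long entries,
				     unsigned long long bytes_per_entry)
{
	return entries * bytes_per_entry;
}

/* The same figure in whole MiB, rounded down. */
unsigned long long route_cache_mib(unsigned long long entries,
				   unsigned long long bytes_per_entry)
{
	return route_cache_bytes(entries, bytes_per_entry) / (1024ULL * 1024ULL);
}
```

For 2,000,000 entries of 256 bytes this gives 512,000,000 bytes, or about 488 MiB.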
^ permalink raw reply
* [PATCH resend] MAINTAINERS/rds: update maintainer
From: Or Gerlitz @ 2011-11-07 9:39 UTC (permalink / raw)
To: David Miller; +Cc: netdev
update for the actual maintainer
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
This was previously sent by Joe Perches but somehow missed upstream.
Resending again now; the earlier resend used the wrong address for Dave Miller.
MAINTAINERS | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: b/MAINTAINERS
===================================================================
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5470,7 +5470,7 @@ S: Maintained
F: drivers/net/ethernet/rdc/r6040.c
RDS - RELIABLE DATAGRAM SOCKETS
-M: Andy Grover <andy.grover@oracle.com>
+M: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
L: rds-devel@oss.oracle.com (moderated for non-subscribers)
S: Supported
F: net/rds/
^ permalink raw reply
* Re: net-next tree question: time to submit new features
From: Daniel Baluta @ 2011-11-07 10:10 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Alexander Smirnov, open list:NETWORKING [GENERAL]
In-Reply-To: <1320654133.6506.48.camel@edumazet-laptop>
On Mon, Nov 7, 2011 at 10:22 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Monday 07 November 2011 at 11:11 +0300, Alexander Smirnov wrote:
>> Hello everybody,
>>
>> last week I sent to the list patch series for 6LoWPAN, but the reply
>> was that it wasn't proper time for submitting new features right in
>> the middle of merge window.
>>
>> Could anyone please specify when the net-next tree opens back and the
>> 'proper' time comes?
>>
>
> It reopens after Linus announces linux-3.x.0-rc1
s/linux-3.x.0-rc1/linux-3.x-rc1/
> Right now, we are in the 'merge window'.
Daniel.
^ permalink raw reply
* Re: [PATCH 6/7] fsl_pmc: Add API to enable device as wakeup event source
From: Zhao Chenhui @ 2011-11-07 11:22 UTC (permalink / raw)
To: Scott Wood; +Cc: linuxppc-dev, netdev, leoli
In-Reply-To: <4EB455B1.8030009@freescale.com>
On Fri, Nov 04, 2011 at 04:14:25PM -0500, Scott Wood wrote:
> On 11/04/2011 07:39 AM, Zhao Chenhui wrote:
> > @@ -45,6 +46,72 @@ static int has_lossless;
> > * code can be compatible with both 32-bit & 36-bit */
> > extern void mpc85xx_enter_deep_sleep(u64 ccsrbar, u32 powmgtreq);
> >
> > +#ifdef CONFIG_FSL_PMC
> > +/**
> > + * pmc_enable_wake - enable OF device as wakeup event source
> > + * @pdev: platform device affected
> > + * @state: PM state from which device will issue wakeup events
> > + * @enable: True to enable event generation; false to disable
> > + *
> > + * This enables the device as a wakeup event source, or disables it.
> > + *
> > + * RETURN VALUE:
> > + * 0 is returned on success
> > + * -EINVAL is returned if device is not supposed to wake up the system
> > + * Error code depending on the platform is returned if both the platform and
> > + * the native mechanism fail to enable the generation of wake-up events
> > + */
> > +int pmc_enable_wake(struct platform_device *pdev,
> > + suspend_state_t state, bool enable)
>
> "pmc" is too generic for a global function. If this can be either
> enable or disable, perhaps it should be something like
> mpc85xx_pmc_set_wake().
>
> > +{
> > + int ret = 0;
> > + struct device_node *clk_np;
> > + u32 *pmcdr_mask;
> > +
> > + if (!pmc_regs) {
> > + printk(KERN_WARNING "PMC is unavailable\n");
> > + return -ENOMEM;
> > + }
>
> -ENOMEM is not appropriate here, maybe -ENODEV?
>
> Should print __func__ so the user knows what's complaining.
>
> > + if (enable && !device_may_wakeup(&pdev->dev))
> > + return -EINVAL;
> > +
> > + clk_np = of_parse_phandle(pdev->dev.of_node, "clk-handle", 0);
> > + if (!clk_np)
> > + return -EINVAL;
> > +
> > + pmcdr_mask = (u32 *)of_get_property(clk_np, "fsl,pmcdr-mask", NULL);
> > + if (!pmcdr_mask) {
> > + ret = -EINVAL;
> > + goto out;
> > + }
> > +
> > + /* clear to enable clock in low power mode */
> > + if (enable)
> > + clrbits32(&pmc_regs->pmcdr, *pmcdr_mask);
> > + else
> > + setbits32(&pmc_regs->pmcdr, *pmcdr_mask);
>
> We should probably initialize PMCDR to all bits set (or at least all
> ones we know are valid) -- the default should be "not a wakeup source".
I think it should be initialized in u-boot.
>
> > +/**
> > + * pmc_enable_lossless - enable lossless ethernet in low power mode
> > + * @enable: True to enable event generation; false to disable
> > + */
> > +void pmc_enable_lossless(int enable)
> > +{
> > + if (enable && has_lossless)
> > + setbits32(&pmc_regs->pmcsr, PMCSR_LOSSLESS);
> > + else
> > + clrbits32(&pmc_regs->pmcsr, PMCSR_LOSSLESS);
> > +}
> > +EXPORT_SYMBOL_GPL(pmc_enable_lossless);
> > +#endif
>
> Won't we overwrite this later?
>
> -Scott
Do you have any idea?
-chenhui
^ permalink raw reply
* Re: [PATCH 6/7] fsl_pmc: Add API to enable device as wakeup event source
From: Zhao Chenhui @ 2011-11-07 11:24 UTC (permalink / raw)
To: Tabi Timur-B04825
Cc: Zhao Chenhui-B35336, linuxppc-dev@lists.ozlabs.org,
Li Yang-R58472, netdev@vger.kernel.org
In-Reply-To: <CAOZdJXXB9zJWqC+kPq7ZDdzePtp8XNBnWcf5UmE8Ye50U-G7Dg@mail.gmail.com>
On Fri, Nov 04, 2011 at 07:08:24PM -0500, Tabi Timur-B04825 wrote:
> On Fri, Nov 4, 2011 at 7:39 AM, Zhao Chenhui <chenhui.zhao@freescale.com> wrote:
>
> > + if (!pmc_regs) {
> > + printk(KERN_WARNING "PMC is unavailable\n");
>
> Use pr_warn() and the other pr_xxx functions.
>
> > + pmcdr_mask = (u32 *)of_get_property(clk_np, "fsl,pmcdr-mask", NULL);
>
> The typecast is unnecessary here.
>
> > + /* clear to enable clock in low power mode */
> > + if (enable)
> > + clrbits32(&pmc_regs->pmcdr, *pmcdr_mask);
> > + else
> > + setbits32(&pmc_regs->pmcdr, *pmcdr_mask);
>
> You need to use be32_to_cpup() when dereferencing a pointer to a
> device tree property.
>
> --
> Timur Tabi
> Linux kernel developer at Freescale
Thanks. I will fix them all.
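For reference, the be32_to_cpup() point comes down to reading the property cell as big-endian, which is how device tree property values are stored. A minimal userspace sketch of that conversion follows; be32_cell_to_host is a hypothetical helper name, not a kernel function, and the kernel code would simply call be32_to_cpup() on the pointer returned by of_get_property().

```c
#include <stdint.h>

/* Read one 32-bit big-endian cell, as stored in a device tree property,
 * and return it in host byte order regardless of host endianness. */
uint32_t be32_cell_to_host(const void *cell)
{
	const uint8_t *b = cell;

	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```

Assembling the value byte by byte gives the same result on big-endian and little-endian hosts, which is the property the review comment asks for.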
-chenhui
^ permalink raw reply
* Re: [GIT] Networking
From: Stanislaw Gruszka @ 2011-11-07 11:40 UTC (permalink / raw)
To: Johannes Berg
Cc: Linus Torvalds, David Miller, Grumbach, Emmanuel, Guy, Wey-Yi W,
John W. Linville, akpm@linux-foundation.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <1320426222.3969.121.camel@jlt3.sipsolutions.net>
On Fri, Nov 04, 2011 at 06:03:42PM +0100, Johannes Berg wrote:
> On Tue, 2011-10-25 at 12:26 +0000, Linus Torvalds wrote:
> > On Tue, Oct 25, 2011 at 12:32 PM, David Miller <davem@davemloft.net> wrote:
> > >
> > > The most visible thing in here is the reorganization of the drivers
> > > under drivers/net.
> >
> > Nope, the most visible thing are some new annoying warnings, that seem
> > absolutely broken.
> >
> > At the very minimum, that WARN_ON(1) should be a WARN_ON_ONCE() or
> > something like that. Because showing it over and over again is not
> > helpful.
> >
> > Added the people who signed off on the changes to the affected files,
> > since they hopefully know why this happens and can fix it..
>
> Sorry for the late reply.
>
> I think this was actually caused by some changes from Ben Greear
> "optimising" stuff in net/mac80211/work.c and keeping the wrong channel,
> Stanislaw was also looking at this at some point I think. I agree that
> it would be good to change to WARN_ON_ONCE(), will do that.
>
> The warning itself has been around forever, but the fact that it
> triggers now is probably related to the channel work & band switches.
I think this problem is related to LinuxCon :-) and it is not a recent
regression, i.e. Ben's off-channel code was already committed in 3.0.
I also have a RHEL6 (2.6.32) bug report about it:
https://bugzilla.redhat.com/show_bug.cgi?id=749125
Probably there was something special about the LinuxCon wireless network
(e.g. heavy contention) that caused people to start seeing this warning. Or
perhaps nobody uses wireless except at conferences :-/
I have one pending patch which fixes another instance of the warning (started
at ieee80211_tx_status()). I'll post it today. As Ben mentioned, two
fixes are pending in the wireless-testing tree:
http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=commitdiff;h=682ba5a44516529b29fc780c055e06104d36e31e
http://git.kernel.org/?p=linux/kernel/git/linville/wireless-testing.git;a=commitdiff;h=776308dd5fa8a41e8bbb79818a66bd68a7db757e
However, I'm not sure that they cover all possible situations in which the
warning can be triggered. We need to do a careful review of channel
switching in mac80211.
Stanislaw
^ permalink raw reply
* Re: [PATCH 3/5] net/mac80211/debugfs.c: use kstrtoul, etc
From: Eliad Peller @ 2011-11-07 11:47 UTC (permalink / raw)
To: Julia Lawall
Cc: John W. Linville, kernel-janitors, Johannes Berg, David S. Miller,
linux-wireless, netdev, linux-kernel
In-Reply-To: <1320586010-21931-4-git-send-email-julia@diku.dk>
On Sun, Nov 6, 2011 at 3:26 PM, Julia Lawall <julia@diku.dk> wrote:
> From: Julia Lawall <julia@diku.dk>
>
> Use kstrtoul, etc instead of the now deprecated strict_strtoul, etc.
>
> A semantic patch rule for the kstrtoul case is as follows:
> (http://coccinelle.lip6.fr/)
>
> // <smpl>
> @@
> expression a,b;
> {int,long} *c;
> @@
>
> -strict_strtoul
> +kstrtoul
> (a,b,c)
> // </smpl>
>
> Signed-off-by: Julia Lawall <julia@diku.dk>
>
> ---
> net/mac80211/debugfs.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff -u -p a/net/mac80211/debugfs.c b/net/mac80211/debugfs.c
> --- a/net/mac80211/debugfs.c
> +++ b/net/mac80211/debugfs.c
> @@ -190,7 +190,7 @@ static ssize_t uapsd_max_sp_len_write(st
> return -EFAULT;
> buf[len] = '\0';
>
> - ret = strict_strtoul(buf, 0, &val);
> + ret = kstrtoul(buf, 0, &val);
>
> if (ret)
> return -EINVAL;
>
Maybe, while cleaning it up, also change copy_from_user +
strict_strtoul/kstrtoul -> kstrtoul_from_user?
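For context, here is a userspace sketch of the stricter contract that, as I understand it, kstrtoul() enforces compared with plain strtoul(): the whole string must be a number, one trailing newline is tolerated, a leading minus sign or whitespace is rejected, and overflow is an error. The helper name parse_ulong_strict is hypothetical and this is not the kernel implementation; kstrtoul_from_user() additionally folds in the copy from userspace, which cannot be shown outside the kernel.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Strict unsigned long parser modelled on the kstrtoul() contract.
 * Returns 0 on success, -EINVAL on a parse error, -ERANGE on overflow. */
int parse_ulong_strict(const char *s, unsigned int base, unsigned long *res)
{
	char *end;
	unsigned long val;

	if (*s == '+')
		s++;
	/* strtoul() would skip whitespace and accept '-'; reject both here. */
	if (!isxdigit((unsigned char)*s))
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, (int)base);
	if (end == s)
		return -EINVAL;
	if (errno == ERANGE)
		return -ERANGE;

	/* Allow a single trailing newline, then require end of string. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	*res = val;
	return 0;
}
```

A debugfs write handler typically receives input such as "42\n", which this accepts, while "12abc" or "-1" are rejected instead of being silently truncated or wrapped.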
Eliad.
^ permalink raw reply
* Re: [PATCH 3/5] net/mac80211/debugfs.c: use kstrtoul, etc
From: Julia Lawall @ 2011-11-07 11:58 UTC (permalink / raw)
To: Eliad Peller
Cc: Julia Lawall, John W. Linville, kernel-janitors, Johannes Berg,
David S. Miller, linux-wireless, netdev, linux-kernel
In-Reply-To: <CAB3XZEfHV+sO=+vJdZUQVqXFtHvZyhJQ1Cm8LJ8Puyz9eqq9Kg@mail.gmail.com>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1159 bytes --]
On Mon, 7 Nov 2011, Eliad Peller wrote:
> On Sun, Nov 6, 2011 at 3:26 PM, Julia Lawall <julia@diku.dk> wrote:
>> From: Julia Lawall <julia@diku.dk>
>>
>> Use kstrtoul, etc instead of the now deprecated strict_strtoul, etc.
>>
>> A semantic patch rule for the kstrtoul case is as follows:
>> (http://coccinelle.lip6.fr/)
>>
>> // <smpl>
>> @@
>> expression a,b;
>> {int,long} *c;
>> @@
>>
>> -strict_strtoul
>> +kstrtoul
>> (a,b,c)
>> // </smpl>
>>
>> Signed-off-by: Julia Lawall <julia@diku.dk>
>>
>> ---
>> net/mac80211/debugfs.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff -u -p a/net/mac80211/debugfs.c b/net/mac80211/debugfs.c
>> --- a/net/mac80211/debugfs.c
>> +++ b/net/mac80211/debugfs.c
>> @@ -190,7 +190,7 @@ static ssize_t uapsd_max_sp_len_write(st
>> return -EFAULT;
>> buf[len] = '\0';
>>
>> - ret = strict_strtoul(buf, 0, &val);
>> + ret = kstrtoul(buf, 0, &val);
>>
>> if (ret)
>> return -EINVAL;
>>
>
> Maybe, while cleaning it up, also change copy_from_user +
> strict_strtoul/kstrtoul -> kstrtoul_from_user?
Thanks for the suggestion. I will look into it.
julia
^ permalink raw reply
* Re: [PATCH V2] Add platform driver support to the CS890x driver
From: Jaccon Bastiaansen @ 2011-11-07 12:14 UTC (permalink / raw)
To: Sascha Hauer; +Cc: netdev, Uwe Kleine-König, kernel
In-Reply-To: <20111103080615.GD16886@pengutronix.de>
Hello Sascha,
I will do this. Since I'm on holiday the rest of the week, it will
take some time before I can post the new patches.
Regards,
Jaccon
2011/11/3 Sascha Hauer <s.hauer@pengutronix.de>:
> Hi Jaccon,
>
> On Sun, Oct 09, 2011 at 10:51:23PM +0200, Jaccon Bastiaansen wrote:
>> Hello,
>>
>> This patch hasn't been sent to the netdev mailing list before, sorry for that.
>
> I appreciate what you are trying to do. The cs89x0 is still not dead and
> it's quite annoying that we do not have proper platform device driver
> support for it. Unfortunately your patch was ignored by the important
> people, so can you respin it? You should create a proper series from it
> with one patch for the driver and one patch per board. This helps to
> increase your visibility and also you can set the individual board
> maintainers on Cc for their board. Please also Cc
> netdev@vger.kernel.org and the arm linux kernel mailing list.
> Your patch also contains some cleanups like the removal of the unused
> QQ2440. You should create a separate patch for this, then it will be
> easier to review (and also people love to read 'cleanup' in a patch
> subject ;)
>
> Your mailer turns tabs into spaces, you should fix this before
> resending.
>
> Some more comments inline.
>
>
>>
>> +static struct resource ixdp2x01_cs8900_resources[] = {
>> + {
>> + .start = (u32)IXDP2X01_CS8900_VIRT_BASE,
>> + .end = (u32)IXDP2X01_CS8900_VIRT_BASE + 0x1000 - 1,
>> + .flags = IORESOURCE_MEM,
>> + },
>
> This is wrong. resources are about physical addresses, not virtual. You
> have to ioremap them in the driver.
>
>> diff --git a/drivers/net/cs89x0.c b/drivers/net/cs89x0.c
>> index 537a4b2..b7fb3bc 100644
>> --- a/drivers/net/cs89x0.c
>> +++ b/drivers/net/cs89x0.c
>> @@ -12,6 +12,14 @@
>> The author may be reached at nelson@crynwr.com, Crynwr
>> Software, 521 Pleasant Valley Rd., Potsdam, NY 13676
>>
>> +Sources
>> +
>> + Crynwr packet driver epktisa.
>> +
>> + Crystal Semiconductor data sheets.
>> +
>> +
>> +
>
> This seems unrelated to this patch. Please drop.
>
>> Changelog:
>>
>> Mike Cruse : mcruse@cti-ltd.com
>> @@ -98,39 +106,14 @@
>> Domenico Andreoli : cavokz@gmail.com
>> : QQ2440 platform support
>>
>> + Jaccon Bastiaansen: jaccon.bastiaansen@gmail.com
>> + : added platform driver support
>
> The history in git is enough (and even better) than the changelog in the
> file headers. Please drop this.
>
>> +
>> +#ifdef CONFIG_CS89x0_PLATFORM
>> +static int cs89x0_platform_probe(struct platform_device *pdev)
>> +{
>> + struct net_device *dev = alloc_etherdev(sizeof(struct net_local));
>> + struct resource *mem_res;
>> + struct resource *irq_res;
>> + int err;
>> +
>> + if (!dev)
>> + return -ENODEV;
>> +
>> + mem_res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> + irq_res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
>> + if (mem_res == NULL || irq_res == NULL) {
>> + pr_warning("memory and/or interrupt resource missing.\n");
>> + err = -ENOENT;
>> + goto out;
>> + }
>> +
>> + cs8900_irq_map[0] = irq_res->start;
>
> This limits the driver to a single instance. I think this is ok for now
> as an intermediate step, but you should check this and bail out with
> -EBUSY if a second instance is registered.
>
>> + err = cs89x0_probe1(dev, mem_res->start, 0);
>> + if (err) {
>> + pr_warning("no cs8900 or cs8920 detected.\n");
>> + goto out;
>> + }
>> +
>> + platform_set_drvdata(pdev, dev);
>> + return 0;
>> +out:
>> + free_netdev(dev);
>> + return err;
>> +}
>
> Sascha
>
> --
> Pengutronix e.K. | |
> Industrial Linux Solutions | http://www.pengutronix.de/ |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
> Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
>
^ permalink raw reply
* Re: ixgbe: compilation failed if CONFIG_PCI_IOV isn't set
From: Alexander Kolesen @ 2011-11-07 12:46 UTC (permalink / raw)
To: Or Gerlitz
Cc: jeffrey.t.kirsher, netdev@vger.kernel.org, Rose, Gregory V,
Li, Sibai
In-Reply-To: <CAJZOPZJrrSFr_WvXWpd9xH1ioa9kmg+KX8Xzx940DB=_V4W6EQ@mail.gmail.com>
> On Sun, Nov 6, 2011 at 5:18 AM, Jeff Kirsher
> <jeffrey.t.kirsher@intel.com> wrote:
> > Was this with the latest kernel from David Miller's net.git tree? I just ask
> > because I just pushed a patch (a couple of days ago) to resolve
> > compilation errors when CONFIG_PCI_IOV is not enabled by Greg Rose.
>
> In my case it was with Linus tree from github
>
> Or.
It was with Linus' tree, but David Miller's net.git also has this issue.
^ permalink raw reply
* Re: Kernel Panic every 2 weeks on ISP server (NULL pointer dereference)
From: Luciano Ruete @ 2011-11-07 13:11 UTC (permalink / raw)
To: netdev; +Cc: Eric Dumazet
In-Reply-To: <201110241509.14027.lruete@sequre.com.ar>
[-- Attachment #1: Type: Text/Plain, Size: 1059 bytes --]
On Monday, October 24, 2011 03:09:13 pm Luciano Ruete wrote:
> On Sunday, October 23, 2011 02:16:29 am Eric Dumazet wrote:
> > On Saturday 22 October 2011 at 22:18 -0300, Luciano Ruete wrote:
> [...]
> Thanks again i will try the kernel upgrade and post results in this thread.
OK, I am now running Linux kernel 3.0.0 (Ubuntu 11.10) [0].
After 3 days of uptime I had a new kind of crash (panic), this time in
the flush_expectations function of nf_conntrack_sip.
The trace and the decoded trace are attached. I still do not know how to read
them in order to follow the execution and blame a particular line of kernel
code. I guess compiling C into assembler on the fly is not in my skills bag.
I just want to check whether this is a kernel bug, or whether there is
something wrong somewhere else in my setup...
[0] server:~# uname -a
Linux server 3.0.0-12-server #20-Ubuntu SMP Fri Oct 7 16:36:30 UTC 2011 x86_64
GNU/Linux
--
Luciano Ruete
Sequre - Sys Admin
Mitre 617, piso 7, of. 1
+54 261 4254894
Mendoza - Argentina
http://www.sequre.com.ar/
[-- Attachment #2: decoded_trace.txt --]
[-- Type: text/plain, Size: 1695 bytes --]
[328686.010062] Code: 84 d2 75 7f 48 c7 c7 e8 19 12 a0 45 0f b6 ee e8 47 2d 48 e1 48 8b 5b 28 48 85 db 75 0e eb 4c 0f 1f 40 00 4d 85 e4 74 43 4c 89 e3 <8b> b3 d0 00 00 00 31 c0 4c 8b 23 85 f6 0f 95 c0 41 39 c5 75 e3
All code
========
0: 84 d2 test %dl,%dl
2: 75 7f jne 0x83
4: 48 c7 c7 e8 19 12 a0 mov $0xffffffffa01219e8,%rdi
b: 45 0f b6 ee movzbl %r14b,%r13d
f: e8 47 2d 48 e1 callq 0xffffffffe1482d5b
14: 48 8b 5b 28 mov 0x28(%rbx),%rbx
18: 48 85 db test %rbx,%rbx
1b: 75 0e jne 0x2b
1d: eb 4c jmp 0x6b
1f: 0f 1f 40 00 nopl 0x0(%rax)
23: 4d 85 e4 test %r12,%r12
26: 74 43 je 0x6b
28: 4c 89 e3 mov %r12,%rbx
2b:* 8b b3 d0 00 00 00 mov 0xd0(%rbx),%esi <-- trapping instruction
31: 31 c0 xor %eax,%eax
33: 4c 8b 23 mov (%rbx),%r12
36: 85 f6 test %esi,%esi
38: 0f 95 c0 setne %al
3b: 41 39 c5 cmp %eax,%r13d
3e: 75 e3 jne 0x23
Code starting with the faulting instruction
===========================================
0: 8b b3 d0 00 00 00 mov 0xd0(%rbx),%esi
6: 31 c0 xor %eax,%eax
8: 4c 8b 23 mov (%rbx),%r12
b: 85 f6 test %esi,%esi
d: 0f 95 c0 setne %al
10: 41 39 c5 cmp %eax,%r13d
13: 75 e3 jne 0xfffffffffffffff8
[-- Attachment #3: kern.log.txt --]
[-- Type: text/plain, Size: 8709 bytes --]
[328680.672986] general protection fault: 0000 [#1] SMP
[328680.733325] CPU 1
[328680.756199] Modules linked in: ppp_deflate zlib_deflate bsd_comp ppp_async crc_ccitt nf_conntrack_netlink nfnetlink xt_owner ipt_REJECT ipt_REDIRECT ipt_MASQUERADE xt_iprange xt_helper xt_length xt_TCPMSS xt_connmark xt_mark xt_state xt_tcpudp xt_multiport iptable_mangle iptable_nat iptable_filter ip_tables x_tables sch_sfq act_mirred cls_u32 sch_prio cls_fw sch_htb ifb dummy 8021q garp stp nf_nat_irc nf_conntrack_irc nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat_amanda ts_kmp nf_conntrack_amanda nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack cdc_ether usbnet i7core_edac ioatdma tpm_tis serio_raw lp shpchp parport edac_core raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 igb usbhid raid0 hid megaraid_sas dca multipath linear
[328681.672554]
[328681.691295] Pid: 0, comm: kworker/0:0 Not tainted 3.0.0-12-server #20-Ubuntu IBM System x3650 M3 -[7945AC1]-/69Y5698
[328681.823341] RIP: 0010:[<ffffffffa017bc70>] [<ffffffffa017bc70>] flush_expectations+0x50/0xc0 [nf_conntrack_sip]
[328681.945982] RSP: 0018:ffff88027f223890 EFLAGS: 00010286
[328682.010416] RAX: 0000000000000000 RBX: dead000000100100 RCX: ffff88026fb00480
[328682.096674] RDX: ffff88027ec1c420 RSI: 0000000000000001 RDI: ffff88026aa8df68
[328682.182933] RBP: ffff88027f2238b0 R08: ffff88027ec1c000 R09: dead000000200200
[328682.269198] R10: dead000000200200 R11: dead000000200200 R12: dead000000100100
[328682.355459] R13: 0000000000000001 R14: 0000000000000001 R15: 000000000000012e
[328682.441720] FS: 0000000000000000(0000) GS:ffff88027f220000(0000) knlGS:0000000000000000
[328682.539407] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[328682.609036] CR2: 00007f07284b6500 CR3: 0000000001c03000 CR4: 00000000000006e0
[328682.695299] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[328682.781560] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[328682.867822] Process kworker/0:0 (pid: 0, threadinfo ffff880272a2e000, task ffff880272a30000)
[328682.969665] Stack:
[328682.994635] 0000000000000030 ffff88027f2239cc ffff88027f2239c0 ffff88013dfe2100
[328683.084115] ffff88027f2238e0 ffffffffa017cec0 ffff88027f2239cc ffff880200000001
[328683.173599] ffff88027f2238e0 0000000000000000 ffff88027f223970 ffffffffa017b9eb
[328683.263086] Call Trace:
[328683.293249] <IRQ>
[328683.319288] [<ffffffffa017cec0>] process_invite_response+0x80/0x90 [nf_conntrack_sip]
[328683.414895] [<ffffffffa017b9eb>] process_sip_response+0x15b/0x170 [nf_conntrack_sip]
[328683.509468] [<ffffffffa017d10d>] process_sip_msg.isra.8+0x7d/0xb0 [nf_conntrack_sip]
[328683.604039] [<ffffffffa017d1dd>] sip_help_udp+0x9d/0xd0 [nf_conntrack_sip]
[328683.688209] [<ffffffffa012fdcf>] ipv4_confirm+0xbf/0x200 [nf_conntrack_ipv4]
[328683.774479] [<ffffffff81516075>] nf_iterate+0x85/0xc0
[328683.836838] [<ffffffff815223e0>] ? ip_fragment+0x950/0x950
[328683.904391] [<ffffffff81516126>] nf_hook_slow+0x76/0x130
[328683.969865] [<ffffffff815223e0>] ? ip_fragment+0x950/0x950
[328684.037414] [<ffffffff8152310c>] ip_output+0x9c/0xc0
[328684.098737] [<ffffffff8151f094>] ip_forward_finish+0x44/0x60
[328684.168372] [<ffffffff8151f355>] ip_forward+0x2a5/0x440
[328684.232807] [<ffffffff8151d641>] ip_rcv_finish+0x131/0x370
[328684.300363] [<ffffffff8151deee>] ip_rcv+0x21e/0x2f0
[328684.360647] [<ffffffff814e9972>] __netif_receive_skb+0x4a2/0x540
[328684.434433] [<ffffffff814dc63f>] ? __alloc_skb+0x4f/0x230
[328684.500945] [<ffffffff814e99d9>] __netif_receive_skb+0x509/0x540
[328684.574730] [<ffffffff814ea530>] netif_receive_skb+0x80/0x90
[328684.644361] [<ffffffff814ea928>] ? dev_gro_receive+0x1b8/0x2c0
[328684.716067] [<ffffffff814ea670>] napi_skb_finish+0x50/0x70
[328684.783620] [<ffffffff814eaba5>] napi_gro_receive+0xb5/0xc0
[328684.852215] [<ffffffff815ba50b>] vlan_gro_receive+0x1b/0x20
[328684.920811] [<ffffffffa00c4be8>] igb_clean_rx_irq_adv+0x2a8/0x630 [igb]
[328685.001873] [<ffffffffa00c4fde>] igb_poll+0x6e/0x140 [igb]
[328685.069425] [<ffffffff814eadb4>] net_rx_action+0x134/0x290
[328685.136984] [<ffffffffa00bf796>] ? igb_msix_ring+0x36/0x50 [igb]
[328685.210772] [<ffffffff81065e38>] __do_softirq+0xa8/0x210
[328685.276249] [<ffffffff815fe82e>] ? _raw_spin_lock+0xe/0x20
[328685.343804] [<ffffffff81607e1c>] call_softirq+0x1c/0x30
[328685.408240] [<ffffffff8100c295>] do_softirq+0x65/0xa0
[328685.470602] [<ffffffff8106621e>] irq_exit+0x8e/0xb0
[328685.530885] [<ffffffff81608673>] do_IRQ+0x63/0xe0
[328685.589092] [<ffffffff815fed53>] common_interrupt+0x13/0x13
[328685.657682] <EOI>
[328685.683719] [<ffffffff814bc8ba>] ? poll_idle+0x3a/0x80
[328685.747119] [<ffffffff814bc893>] ? poll_idle+0x13/0x80
[328685.810519] [<ffffffff814bcba2>] cpuidle_idle_call+0xa2/0x1d0
[328685.881188] [<ffffffff8100920b>] cpu_idle+0xab/0x100
[328685.942510] [<ffffffff815de7ec>] start_secondary+0xd9/0xdb
[328686.010062] Code: 84 d2 75 7f 48 c7 c7 e8 19 12 a0 45 0f b6 ee e8 47 2d 48 e1 48 8b 5b 28 48 85 db 75 0e eb 4c 0f 1f 40 00 4d 85 e4 74 43 4c 89 e3 <8b> b3 d0 00 00 00 31 c0 4c 8b 23 85 f6 0f 95 c0 41 39 c5 75 e3
[328686.238203] RIP [<ffffffffa017bc70>] flush_expectations+0x50/0xc0 [nf_conntrack_sip]
[328686.332795] RSP <ffff88027f223890>
[328686.375794] ---[ end trace 806ab2e6e0730fa6 ]---
[328686.431970] Kernel panic - not syncing: Fatal exception in interrupt
[328686.508915] Pid: 0, comm: kworker/0:0 Tainted: G D 3.0.0-12-server #20-Ubuntu
[328686.604571] Call Trace:
[328686.634779] <IRQ> [<ffffffff815e8184>] panic+0x91/0x194
[328686.700387] [<ffffffff815ffd0a>] oops_end+0xea/0xf0
[328686.760720] [<ffffffff8100d8c8>] die+0x58/0x90
[328686.815859] [<ffffffff815ff7c2>] do_general_protection+0x162/0x170
[328686.891771] [<ffffffff8150a4eb>] ? qdisc_watchdog_schedule+0x3b/0x40
[328686.969755] [<ffffffff815fefe5>] general_protection+0x25/0x30
[328687.040477] [<ffffffffa017bc70>] ? flush_expectations+0x50/0xc0 [nf_conntrack_sip]
[328687.133024] [<ffffffffa017cec0>] process_invite_response+0x80/0x90 [nf_conntrack_sip]
[328687.228685] [<ffffffffa017b9eb>] process_sip_response+0x15b/0x170 [nf_conntrack_sip]
[328687.323308] [<ffffffffa017d10d>] process_sip_msg.isra.8+0x7d/0xb0 [nf_conntrack_sip]
[328687.417932] [<ffffffffa017d1dd>] sip_help_udp+0x9d/0xd0 [nf_conntrack_sip]
[328687.502154] [<ffffffffa012fdcf>] ipv4_confirm+0xbf/0x200 [nf_conntrack_ipv4]
[328687.588464] [<ffffffff81516075>] nf_iterate+0x85/0xc0
[328687.650873] [<ffffffff815223e0>] ? ip_fragment+0x950/0x950
[328687.718473] [<ffffffff81516126>] nf_hook_slow+0x76/0x130
[328687.783991] [<ffffffff815223e0>] ? ip_fragment+0x950/0x950
[328687.851594] [<ffffffff8152310c>] ip_output+0x9c/0xc0
[328687.912962] [<ffffffff8151f094>] ip_forward_finish+0x44/0x60
[328687.982636] [<ffffffff8151f355>] ip_forward+0x2a5/0x440
[328688.047122] [<ffffffff8151d641>] ip_rcv_finish+0x131/0x370
[328688.114722] [<ffffffff8151deee>] ip_rcv+0x21e/0x2f0
[328688.175049] [<ffffffff814e9972>] __netif_receive_skb+0x4a2/0x540
[328688.248881] [<ffffffff814dc63f>] ? __alloc_skb+0x4f/0x230
[328688.315440] [<ffffffff814e99d9>] __netif_receive_skb+0x509/0x540
[328688.389273] [<ffffffff814ea530>] netif_receive_skb+0x80/0x90
[328688.458955] [<ffffffff814ea928>] ? dev_gro_receive+0x1b8/0x2c0
[328688.530709] [<ffffffff814ea670>] napi_skb_finish+0x50/0x70
[328688.598311] [<ffffffff814eaba5>] napi_gro_receive+0xb5/0xc0
[328688.666951] [<ffffffff815ba50b>] vlan_gro_receive+0x1b/0x20
[328688.735594] [<ffffffffa00c4be8>] igb_clean_rx_irq_adv+0x2a8/0x630 [igb]
[328688.816700] [<ffffffffa00c4fde>] igb_poll+0x6e/0x140 [igb]
[328688.884300] [<ffffffff814eadb4>] net_rx_action+0x134/0x290
[328688.951901] [<ffffffffa00bf796>] ? igb_msix_ring+0x36/0x50 [igb]
[328689.025732] [<ffffffff81065e38>] __do_softirq+0xa8/0x210
[328689.091255] [<ffffffff815fe82e>] ? _raw_spin_lock+0xe/0x20
[328689.158854] [<ffffffff81607e1c>] call_softirq+0x1c/0x30
[328689.223337] [<ffffffff8100c295>] do_softirq+0x65/0xa0
[328689.285746] [<ffffffff8106621e>] irq_exit+0x8e/0xb0
[328689.346074] [<ffffffff81608673>] do_IRQ+0x63/0xe0
[328689.404330] [<ffffffff815fed53>] common_interrupt+0x13/0x13
[328689.472964] <EOI> [<ffffffff814bc8ba>] ? poll_idle+0x3a/0x80
[328689.543765] [<ffffffff814bc893>] ? poll_idle+0x13/0x80
[328689.607210] [<ffffffff814bcba2>] cpuidle_idle_call+0xa2/0x1d0
[328689.677935] [<ffffffff8100920b>] cpu_idle+0xab/0x100
[328689.739309] [<ffffffff815de7ec>] start_secondary+0xd9/0xdb
[328689.806913] Rebooting in 1 seconds..
* Re: Contributing for the first time
From: Daniel Baluta @ 2011-11-07 13:17 UTC (permalink / raw)
To: Alexandru Juncu; +Cc: kernelnewbies, shemminger, Greg Freemyer, netdev
In-Reply-To: <4a08f8c6-d02e-40c4-b976-5437270ba5b2@email.android.com>
On Mon, Nov 7, 2011 at 3:05 PM, Greg Freemyer <greg.freemyer@gmail.com> wrote:
> Alexandru Juncu <alex.juncu@rosedu.org> wrote:
>
>>Hello!
>>
>>I have been a linux user for many years, and mostly on the networking
>>side. And I would like to start contributing somehow to the linux
>>community (in some other way than just promoting it). I guess that I
>>should start small with something like man pages.
>>
>>And there is something that really has been bugging me for some time.
>>I'm an iproute2 user and I teach linux courses and show people how to
>>use it. On most questions from my students about new commands, I
>>redirect them to the man pages and to the Examples section of that
>>page. iproute2 doesn't have such examples and I always wish it did.
>>
>>Do you think that if I submit a patch to the man pages, adding some
>>examples of how to use the ip command, will it get accepted? Because
>>this sounds like a simple thing and it's hard to believe that someone
>>else didn't try to do this before. What do you think?
>>Alexandru Juncu
> I think it will be accepted, but few people like to work on the man pages.
>
> Since this is a userspace package you will need to figure out who the maintainer is and if there is a mailinglist they use to discuss/support the package.
>
> Then submit your patch there.
According to [1], the maintainer of iproute2 is Stephen Hemminger (CC'ed), and
I think patches should be sent to the netdev mailing list ([2]).
thanks,
Daniel.
[1] http://www.linuxfoundation.org/collaborate/workgroups/networking/iproute2
[2] http://vger.kernel.org/vger-lists.html#netdev
* Re: [PATCH] usbnet: fix oops in usbnet_start_xmit
From: Richard Cochran @ 2011-11-07 13:33 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Oliver Neukum, netdev, David S. Miller, devel, Michael Riesch
In-Reply-To: <20111106183337.5379.4356.stgit@zurg>
On Sun, Nov 06, 2011 at 10:33:37PM +0300, Konstantin Khlebnikov wrote:
> This patch fixes the bug added in commit v3.1-rc7-1055-gf9b491e
> SKB can be NULL at this point, at least for cdc-ncm.
What? You mean .ndo_start_xmit is called with skb NULL?
> Let's call skb_tx_timestamp() after driver specific tx-fixup hacks.
No, that won't work.
That call is before the fixup on purpose, because some fixups add
padding in front of the Ethernet payload, and this will spoil the PTP
packet detection filter.
I don't know why the skb can be NULL here. If that is really the case,
then the correct fix is:
    if (skb)
        skb_tx_timestamp(skb);
Thanks,
Richard
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
> ---
> drivers/net/usb/usbnet.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
> index 7d60821..485be70 100644
> --- a/drivers/net/usb/usbnet.c
> +++ b/drivers/net/usb/usbnet.c
> @@ -1057,8 +1057,6 @@ netdev_tx_t usbnet_start_xmit (struct sk_buff *skb,
> unsigned long flags;
> int retval;
>
> - skb_tx_timestamp(skb);
> -
> // some devices want funky USB-level framing, for
> // win32 driver (usually) and/or hardware quirks
> if (info->tx_fixup) {
> @@ -1075,6 +1073,8 @@ netdev_tx_t usbnet_start_xmit (struct sk_buff *skb,
> }
> length = skb->len;
>
> + skb_tx_timestamp(skb);
> +
> if (!(urb = usb_alloc_urb (0, GFP_ATOMIC))) {
> netif_dbg(dev, tx_err, dev->net, "no urb\n");
> goto drop;
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Linux Route Cache performance tests
From: Ben Hutchings @ 2011-11-07 13:42 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Paweł Staszewski, Linux Network Development list
In-Reply-To: <1320608290.6506.33.camel@edumazet-laptop>
On Sun, 2011-11-06 at 20:38 +0100, Eric Dumazet wrote:
> Le dimanche 06 novembre 2011 à 20:20 +0100, Paweł Staszewski a écrit :
[...]
> > So the point of this test was figure out how much of route cache entries
> > Linux can handle without dropping performance.
>
> No need to even do a bench, its pretty easy to understand how a hash
> table is handled.
>
> Allowing long chains is not good.
>
> With your 512k slots hash table, you cannot expect handling 1.4M routes
> with optimal performance. End of story.
>
> Since route hash table is allocated at boot time, only way to change its
> size is using "rhash_entries=2097152" boot parameter.
>
> If it still doesnt fly, try with "rhash_entries=4194304"
A routing cache this big is not going to fit in the processor caches,
anyway; in fact even the hash table may not. So a routing cache hit is
likely to involve processor cache misses. After David's work to make
cacheless operation faster, I suspect that such a 'hit' can be a net
loss. But it *is* necessary to run a benchmark to answer this (and the
answer will obviously vary between systems).
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
* Re: [PATCH] usbnet: fix oops in usbnet_start_xmit
From: Michael Riesch @ 2011-11-07 13:29 UTC (permalink / raw)
To: Konstantin Khlebnikov; +Cc: Oliver Neukum, netdev, David S. Miller, devel
In-Reply-To: <20111106183337.5379.4356.stgit@zurg>
On Sun, 2011-11-06 at 22:33 +0300, Konstantin Khlebnikov wrote:
> This patch fixes the bug added in commit v3.1-rc7-1055-gf9b491e
> SKB can be NULL at this point, at least for cdc-ncm.
OK, I didn't think of that, but...
> Let's call skb_tx_timestamp() after driver specific tx-fixup hacks.
... the reason I put the skb_tx_timestamp() call before the tx_fixup is
that these hacks often perform skb_push/skb_pull or any other kind of
framing. This may result (at least in the case of the asix drivers) in
perfectly correct PTP packets not being recognized as such by the packet
filter.
Can we do a check like this:
    if (skb)
        skb_tx_timestamp(skb);
    /* ... followed by the existing info->tx_fixup() call ... */
?
Regards, Michael
* How to identify the real physical network interface?
From: santosh @ 2011-11-07 13:59 UTC (permalink / raw)
To: netdev
Hi,
I am posting this question to the "netdev" mailing list because I could
not find the "linux-net" mailing list suggested at
http://kernelnewbies.org/ML .
I have a wireless device running Linux 2.6.15 (I can't upgrade to a
newer kernel at this time).
It has 3 interfaces:
ath0 - Wireless interface.
eth0 - Ethernet interface.
br0 - Bridge interface joining ath0 and eth0.
I have a user space socket program that listens to the broadcast
messages and responds.
My socket is opened as sock = socket(PF_INET, SOCK_DGRAM, 0).
I need this socket program to listen only for packets coming from the
Ethernet interface.
Alternatively, the program should be able to figure out the actual
interface a packet came from.
I tried the methods below, but neither helps because the kernel reports
the bridge as the interface rather than the real interface to the socket
program.
1. //setsockopt(sock, SOL_SOCKET, SO_BINDTODEVICE, "br0", 3)
setsockopt(sock, SOL_SOCKET, SO_BINDTODEVICE, "eth0", 4)
2. setsockopt(sock, SOL_IP, IP_PKTINFO, (char *) &on, sizeof on)
// Use recvmsg instead of recvfrom and read the interface index.
// If interface is not Ethernet do not respond.
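For reference, method 2 above (IP_PKTINFO plus recvmsg) can be sketched as follows. This is a minimal userspace sketch, not a fix for the bridge problem: the helper names enable_pktinfo(), pktinfo_ifindex() and recv_with_ifindex() are hypothetical, and on a bridged setup like the one described the index read this way is expected to be that of br0, as reported above.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>

/* Ask the kernel to attach an IP_PKTINFO control message to each datagram. */
int enable_pktinfo(int sock)
{
	int on = 1;

	return setsockopt(sock, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

/*
 * Walk the ancillary data of a message filled in by recvmsg() and return
 * the receiving interface index from the IP_PKTINFO control message,
 * or -1 if no such control message is present.
 */
int pktinfo_ifindex(struct msghdr *msg)
{
	struct cmsghdr *cmsg;

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == IPPROTO_IP &&
		    cmsg->cmsg_type == IP_PKTINFO) {
			struct in_pktinfo info;

			memcpy(&info, CMSG_DATA(cmsg), sizeof(info));
			return info.ipi_ifindex;
		}
	}
	return -1;
}

/*
 * Receive one datagram on a socket that already has IP_PKTINFO enabled.
 * Returns the payload length (or -1 on error) and stores the receiving
 * interface index in *ifindex.
 */
ssize_t recv_with_ifindex(int sock, void *buf, size_t len, int *ifindex)
{
	/* The union keeps the control buffer aligned for struct cmsghdr. */
	union {
		char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
		struct cmsghdr align;
	} control;
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg;
	ssize_t n;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = control.buf;
	msg.msg_controllen = sizeof(control.buf);

	n = recvmsg(sock, &msg, 0);
	if (n >= 0)
		*ifindex = pktinfo_ifindex(&msg);
	return n;
}
```

The returned index can be compared with if_nametoindex("eth0") from <net/if.h> to decide whether to respond.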
Can you please let me know if there is a way to identify the actual
interface in a UDP socket program when traffic is being controlled by
a Bridge?
Thanks,
Santosh
* Re: [PATCH] usbnet: fix oops in usbnet_start_xmit
From: Konstantin Khlebnikov @ 2011-11-07 14:05 UTC (permalink / raw)
To: Richard Cochran
Cc: Oliver Neukum, netdev@vger.kernel.org, David S. Miller,
devel@openvz.org, Michael Riesch
In-Reply-To: <20111107133351.GA1484@cherladcori01>
Richard Cochran wrote:
> On Sun, Nov 06, 2011 at 10:33:37PM +0300, Konstantin Khlebnikov wrote:
>> This patch fixes the bug added in commit v3.1-rc7-1055-gf9b491e
>> SKB can be NULL at this point, at least for cdc-ncm.
>
> What? You mean .ndo_start_xmit is called with skb NULL?
No. cdc_ncm calls usbnet_start_xmit() with a NULL skb from its timer handler,
and the tx_fixup hook picks up the skb from the driver's internal context.
Yeah, it's really messy.
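The call pattern described here, combined with the guarded timestamp call suggested in the quoted reply, can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: struct sk_buff and the mock_* functions are hypothetical stand-ins, not the kernel's definitions or the actual cdc_ncm code.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct sk_buff. */
struct sk_buff {
	int len;
};

/* Counts how many frames were timestamped (for illustration only). */
static int timestamps_taken;

/* Mock of skb_tx_timestamp(): like the real one, it dereferences skb. */
void mock_skb_tx_timestamp(struct sk_buff *skb)
{
	(void)skb->len;
	timestamps_taken++;
}

/* A frame assumed to be queued in driver-internal state. */
static struct sk_buff pending = { .len = 64 };

/*
 * Mock tx_fixup in the style described for cdc_ncm: when called with a
 * NULL skb (timer path), it hands back the internally queued frame.
 */
struct sk_buff *mock_tx_fixup(struct sk_buff *skb)
{
	return skb ? skb : &pending;
}

/*
 * Ordering discussed in the thread: timestamp before the fixup so that
 * added framing cannot hide a PTP packet, but only if an skb was passed.
 * Returns the length of the frame that would be transmitted, or -1.
 */
int mock_start_xmit(struct sk_buff *skb)
{
	if (skb)
		mock_skb_tx_timestamp(skb);

	skb = mock_tx_fixup(skb);
	if (!skb)
		return -1;

	return skb->len;
}
```

In this model, a frame that the fixup hook pulls from internal context on the timer path is transmitted without being timestamped at this point, while a frame passed in by the caller is timestamped before any framing is added.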
>
>> Let's call skb_tx_timestamp() after driver specific tx-fixup hacks.
>
> No, that won't work.
>
> That call is before the fixup on purpose, because some fixups add
> padding in front of the Ethernet payload, and this will spoil the PTP
> packet detection filter.
>
> I don't know why the skb can be NULL here. If that is really the case,
> then the correct fix is:
>
>     if (skb)
>         skb_tx_timestamp(skb);
>
> Thanks,
> Richard
>
>
>>
>> Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> ---
>> drivers/net/usb/usbnet.c | 4 ++--
>> 1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
>> index 7d60821..485be70 100644
>> --- a/drivers/net/usb/usbnet.c
>> +++ b/drivers/net/usb/usbnet.c
>> @@ -1057,8 +1057,6 @@ netdev_tx_t usbnet_start_xmit (struct sk_buff *skb,
>> unsigned long flags;
>> int retval;
>>
>> - skb_tx_timestamp(skb);
>> -
>> // some devices want funky USB-level framing, for
>> // win32 driver (usually) and/or hardware quirks
>> if (info->tx_fixup) {
>> @@ -1075,6 +1073,8 @@ netdev_tx_t usbnet_start_xmit (struct sk_buff *skb,
>> }
>> length = skb->len;
>>
>> + skb_tx_timestamp(skb);
>> +
>> if (!(urb = usb_alloc_urb (0, GFP_ATOMIC))) {
>> netif_dbg(dev, tx_err, dev->net, "no urb\n");
>> goto drop;
>>
* Re: [PATCH] tcp: Fix comments for Nagle algorithm
From: Srivatsa S. Bhat @ 2011-11-07 14:24 UTC (permalink / raw)
To: Feng King; +Cc: netdev, davem, linux-kernel
In-Reply-To: <1320503003-5169-1-git-send-email-kinwin2008@gmail.com>
On 11/05/2011 07:53 PM, Feng King wrote:
> TCP_NODELAY is weaker than TCP_CORK, when TCP_CORK was set, small
> segments will always pass Nagle test regardless of TCP_NODELAY option.
>
> Signed-off-by: Feng King <kinwin2008@gmail.com>
>
You have missed adding "---" after the Signed-off-by line.
The tools that are used to extract the patch from the email will look
for that symbol to distinguish things.
[FYI, tools like git or stgit can generate your patches in the right
format automatically for you.]
Thanks,
Srivatsa S. Bhat
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 882e0b0..a908f95 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1377,7 +1377,7 @@ static inline int tcp_minshall_check(const struct tcp_sock *tp)
> /* Return 0, if packet can be sent now without violation Nagle's rules:
> * 1. It is full sized.
> * 2. Or it contains FIN. (already checked by caller)
> - * 3. Or TCP_NODELAY was set.
> + * 3. Or TCP_CORK is not set, and TCP_NODELAY is set.
> * 4. Or TCP_CORK is not set, and all sent packets are ACKed.
> * With Minshall's modification: all sent small packets are ACKed.
> */
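The rules listed in the patched comment can be read as a single predicate. The following is a userspace sketch of that reading, assuming a hypothetical struct nagle_state that summarizes the relevant conditions; it is not the kernel's tcp_nagle_check() and does not use its data structures.

```c
#include <stdbool.h>

/* Hypothetical summary of the conditions the comment refers to. */
struct nagle_state {
	bool full_sized;     /* rule 1: the segment is full sized */
	bool has_fin;        /* rule 2: the segment carries FIN */
	bool cork;           /* TCP_CORK is set on the socket */
	bool nodelay;        /* TCP_NODELAY is set on the socket */
	bool small_unacked;  /* a small packet was sent and is not yet ACKed */
};

/*
 * Returns 0 if the segment can be sent now without violating Nagle's
 * rules as listed in the comment, 1 if it has to wait.
 */
int nagle_must_wait(const struct nagle_state *s)
{
	if (s->full_sized || s->has_fin)
		return 0;               /* rules 1 and 2 */
	if (s->cork)
		return 1;               /* TCP_CORK wins over TCP_NODELAY */
	if (s->nodelay)
		return 0;               /* rule 3 */
	return s->small_unacked ? 1 : 0; /* rule 4, Minshall's variant */
}
```

With both cork and nodelay set, a small segment is held back, which is the case the reworded rule 3 is meant to make explicit.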
* Re: Linux Route Cache performance tests
From: Eric Dumazet @ 2011-11-07 14:33 UTC (permalink / raw)
To: Ben Hutchings; +Cc: Paweł Staszewski, Linux Network Development list
In-Reply-To: <1320673364.3020.21.camel@bwh-desktop>
Le lundi 07 novembre 2011 à 13:42 +0000, Ben Hutchings a écrit :
> A routing cache this big is not going to fit in the processor caches,
> anyway; in fact even the hash table may not. So a routing cache hit is
> likely to involve processor cache misses. After David's work to make
> cacheless operation faster, I suspect that such a 'hit' can be a net
> loss. But it *is* necessary to run a benchmark to answer this (and the
> answer will obviously vary between systems).
>
I don't know why you think the full hash table should fit in the processor
cache. If it does, that's perfect, but it's not a requirement.
This is one cache miss, to get the pointer to the first element in
chain. Of course this might be a cache hit if several packets for a
given flow are processed in a short period of time.
Given a dst itself is 256 bytes (4 cache lines), one extra cache miss to
get the pointer to dst is not very expensive.
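To put rough numbers on this, the figures mentioned in the thread can be plugged into a few lines of arithmetic. This is a back-of-the-envelope sketch only: it assumes 64-byte cache lines (implied by 256 bytes being 4 cache lines), takes "1.4M routes" as 1,400,000 entries and "512k slots" as 524288, and the function names are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Average number of cached routes per hash bucket (load factor). */
double avg_chain_length(uint64_t entries, uint64_t slots)
{
	return slots ? (double)entries / (double)slots : 0.0;
}

/* Number of cache lines touched when reading one whole dst entry. */
uint64_t dst_cache_lines(uint64_t dst_size, uint64_t line_size)
{
	return (dst_size + line_size - 1) / line_size;
}

/* Memory taken by the cached dst entries alone, in bytes. */
uint64_t dst_footprint_bytes(uint64_t entries, uint64_t dst_size)
{
	return entries * dst_size;
}

/* Print the estimates for the table sizes discussed in the thread. */
void print_route_cache_estimates(void)
{
	const uint64_t entries = 1400000;  /* assumed value for "1.4M" */
	const uint64_t slots[] = { 524288, 2097152, 4194304 };
	size_t i;

	for (i = 0; i < sizeof(slots) / sizeof(slots[0]); i++)
		printf("%llu slots: average chain length %.2f\n",
		       (unsigned long long)slots[i],
		       avg_chain_length(entries, slots[i]));

	printf("cache lines per dst: %llu\n",
	       (unsigned long long)dst_cache_lines(256, 64));
	printf("dst footprint: %llu bytes\n",
	       (unsigned long long)dst_footprint_bytes(entries, 256));
}
```

Under these assumptions the 524288-slot table averages about 2.67 entries per bucket, rhash_entries=2097152 brings that to about 0.67, and the dst entries alone occupy 358,400,000 bytes (roughly 342 MiB), far more than any processor cache.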
At least, in recent kernels we don't change dst->refcnt in the forwarding
path (using NOREF skb->dst).
One particular point is the atomic_inc(dst->refcnt) we have to perform
when queuing a UDP packet if the socket asked for PKTINFO (for example, a
typical DNS server has to set this option).
I have a patch somewhere that stores the information in skb->cb[] and
avoids the atomic_{inc|dec}(dst->refcnt).
* [PATCH net-next] W5300: Add WIZnet W5300 Ethernet driver
From: Taehun Kim @ 2011-11-07 14:37 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, linux-kernel, suhwan, bongbong
I have modified the W5300 driver to address David Miller's feedback
(http://www.spinics.net/lists/netdev/msg177862.html).
Please review this driver and apply it if there are no problems.
Thank you,
T.K.
Signed-off-by: Taehun Kim <kth3321@gmail.com>
---
drivers/net/ethernet/Kconfig | 1 +
drivers/net/ethernet/Makefile | 1 +
drivers/net/ethernet/wiznet/Kconfig | 32 ++
drivers/net/ethernet/wiznet/Makefile | 5 +
drivers/net/ethernet/wiznet/w5300.c | 697 ++++++++++++++++++++++++++++++++++
drivers/net/ethernet/wiznet/w5300.h | 121 ++++++
6 files changed, 857 insertions(+), 0 deletions(-)
create mode 100644 drivers/net/ethernet/wiznet/Kconfig
create mode 100644 drivers/net/ethernet/wiznet/Makefile
create mode 100644 drivers/net/ethernet/wiznet/w5300.c
create mode 100644 drivers/net/ethernet/wiznet/w5300.h
diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
index 6dff5a0..6325d85 100644
--- a/drivers/net/ethernet/Kconfig
+++ b/drivers/net/ethernet/Kconfig
@@ -173,5 +173,6 @@ source "drivers/net/ethernet/tundra/Kconfig"
source "drivers/net/ethernet/via/Kconfig"
source "drivers/net/ethernet/xilinx/Kconfig"
source "drivers/net/ethernet/xircom/Kconfig"
+source "drivers/net/ethernet/wiznet/Kconfig"
endif # ETHERNET
diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
index c53ad3a..7bd5211 100644
--- a/drivers/net/ethernet/Makefile
+++ b/drivers/net/ethernet/Makefile
@@ -72,3 +72,4 @@ obj-$(CONFIG_NET_VENDOR_TUNDRA) += tundra/
obj-$(CONFIG_NET_VENDOR_VIA) += via/
obj-$(CONFIG_NET_VENDOR_XILINX) += xilinx/
obj-$(CONFIG_NET_VENDOR_XIRCOM) += xircom/
+obj-$(CONFIG_NET_VENDOR_WIZNET) += wiznet/
diff --git a/drivers/net/ethernet/wiznet/Kconfig b/drivers/net/ethernet/wiznet/Kconfig
new file mode 100644
index 0000000..b5925bd
--- /dev/null
+++ b/drivers/net/ethernet/wiznet/Kconfig
@@ -0,0 +1,32 @@
+#
+# WIZnet device configuration
+#
+
+config NET_VENDOR_WIZNET
+ bool "WIZnet devices"
+ default y
+ ---help---
+ If you have a network (Ethernet) card belonging to this class, say Y
+ and read the Ethernet-HOWTO, available from
+ <http://www.tldp.org/docs.html#howto>.
+
+ Note that the answer to this question doesn't directly affect the
+ kernel: saying N will just cause the configurator to skip all
+ the questions about WIZnet devices. If you say Y, you will be asked for
+ your specific card in the following questions.
+
+if NET_VENDOR_WIZNET
+
+config W5300
+ tristate "WIZnet W5300 Ethernet support"
+ depends on ARM
+ ---help---
+ This driver supports the Ethernet in the WIZnet W5300 chips.
+ The W5300 has a hardwired TCP/IP stack, but this driver is limited to
+ the Ethernet function. Using the hardwired TCP/IP stack would require
+ modifying the TCP/IP stack in the Linux kernel.
+
+ To compile this driver as a module, choose M here: the module
+ will be called w5300.
+
+endif # NET_VENDOR_WIZNET
diff --git a/drivers/net/ethernet/wiznet/Makefile b/drivers/net/ethernet/wiznet/Makefile
new file mode 100644
index 0000000..53120bc
--- /dev/null
+++ b/drivers/net/ethernet/wiznet/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for the WIZnet device drivers.
+#
+
+obj-$(CONFIG_W5300) += w5300.o
diff --git a/drivers/net/ethernet/wiznet/w5300.c b/drivers/net/ethernet/wiznet/w5300.c
new file mode 100644
index 0000000..6fe3b57
--- /dev/null
+++ b/drivers/net/ethernet/wiznet/w5300.c
@@ -0,0 +1,697 @@
+/* w5300.c: A Linux Ethernet driver for the WIZnet W5300 chip. */
+/*
+ Copyright (C) 2011 Taehun Kim <kth3321@gmail.com>
+
+ This software may be used and distributed according to the terms of
+ the GNU General Public License (GPL), incorporated herein by reference.
+ Drivers based on or derived from this code fall under the GPL and must
+ retain the authorship, copyright and license notice. This file is not
+ a complete program and may only be used when the entire operating
+ system is licensed under the GPL.
+*/
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/errno.h>
+
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/skbuff.h>
+
+#include <linux/device.h>
+#include <linux/platform_device.h>
+#include <linux/ioport.h>
+#include <linux/io.h>
+
+#include "w5300.h"
+
+#define DEV_NAME "W5300"
+#define DRV_VERSION "1.0"
+#define DRV_RELDATE "Nov 7, 2011"
+
+#define W5300_DEF_MSG_ENABLE \
+ (NETIF_MSG_DRV | \
+ NETIF_MSG_TIMER | \
+ NETIF_MSG_IFUP | \
+ NETIF_MSG_RX_ERR | \
+ NETIF_MSG_INTR | \
+ NETIF_MSG_TX_DONE)
+
+static const char version[] =
+ DEV_NAME ": Ethernet driver v" DRV_VERSION "(" DRV_RELDATE ")\n";
+
+MODULE_AUTHOR("Taehun Kim <kth3321@gmail.com>");
+MODULE_DESCRIPTION("WIZnet W5300 Ethernet driver");
+MODULE_VERSION(DRV_VERSION);
+MODULE_LICENSE("GPL");
+
+/* Transmit timeout, default 5 seconds. */
+static int watchdog = 5000;
+module_param(watchdog, int, 0400);
+MODULE_PARM_DESC(watchdog, "transmit timeout in milliseconds");
+
+static int debug = -1;
+module_param(debug, int, 0);
+MODULE_PARM_DESC(debug, "W5300: bitmapped message enable number");
+
+/*
+ * This is W5300 information structure.
+ * Additional information is included in struct net_device.
+ */
+struct wiz_private {
+ void __iomem *base;
+ struct net_device *dev;
+ u8 rxbuf_conf[MAX_SOCK_NUM];
+ u8 txbuf_conf[MAX_SOCK_NUM];
+ struct napi_struct napi;
+ spinlock_t lock;
+ u32 msg_enable;
+};
+
+/* Default MAC address. */
+static __initdata u8 w5300_defmac[6] = {0x00, 0x08, 0xDC, 0xA0, 0x00, 0x01};
+
+/* Default RX/TX buffer size(KByte). */
+static u8 w5300_rxbuf_conf[MAX_SOCK_NUM] __initdata = {
+ 64, 0, 0, 0, 0, 0, 0, 0
+};
+
+static u8 w5300_txbuf_conf[MAX_SOCK_NUM] __initdata = {
+ 64, 0, 0, 0, 0, 0, 0, 0
+};
+
+/* Notifying packet size in the RX FIFO */
+static int w5300_get_rxsize(struct wiz_private *wp, int s)
+{
+ u32 val;
+
+ val = w5300_read(wp, Sn_RX_RSR(s));
+ val = (val << 16) + w5300_read(wp, Sn_RX_RSR(s) + 2);
+ return val;
+}
+
+/* Packet Receive Function. It reads received packet from the Rx FIFO. */
+static void w5300_recv_data(struct wiz_private *wp, int s, u8 *buf,
+ ssize_t len)
+{
+ int i;
+ u16 recv_data;
+
+ /* read from RX FIFO */
+ for (i = 0; i < len; i += 2) {
+ recv_data = w5300_read(wp, Sn_RX_FIFO(s));
+ buf[i] = (u8) ((recv_data & 0xFF00) >> 8);
+ buf[i + 1] = (u8) (recv_data & 0x00FF);
+ }
+}
+
+/* Setting MAC address of W5300 */
+static void w5300_set_macaddr(struct wiz_private *wp, u8 * addr)
+{
+ int i;
+
+ for (i = 0; i < 3; ++i) {
+ u16 mac_addr = (addr[2*i] << 8) | addr[2*i+1];
+
+ w5300_write(wp, SHAR + 2*i, mac_addr);
+ }
+}
+
+/* Opening channels of W5300 */
+static int w5300_channel_open(struct wiz_private *wp, u32 type)
+{
+ int timeout = 1000;
+
+ /* Which type will be used for open? */
+ switch (type) {
+ case Sn_MR_MACRAW:
+ case Sn_MR_MACRAW_MF:
+ w5300_write(wp, Sn_MR(0), type);
+ break;
+ default:
+ netif_err(wp, ifup, wp->dev,
+ "Unknown socket type (%d)\n", type);
+
+ return -EFAULT;
+ }
+
+ w5300_write(wp, Sn_CR(0), Sn_CR_OPEN);
+
+ while (timeout--) {
+ if (!w5300_read(wp, Sn_CR(0)))
+ return 0;
+ udelay(1);
+ }
+
+ return -EBUSY;
+}
+
+/* Activating the interrupt of related channel */
+static void w5300_interrupt_enable(struct wiz_private *wp)
+{
+ u16 mask;
+
+ mask = w5300_read(wp, IMR) | 0x1;
+ w5300_write(wp, IMR, mask);
+}
+
+/* De-activating the interrupt of related channel */
+static void w5300_interrupt_disable(struct wiz_private *wp)
+{
+ u16 mask;
+
+ mask = w5300_read(wp, IMR) & ~0x1;
+ w5300_write(wp, IMR, mask);
+}
+
+/* W5300 initialization function */
+static int w5300_reset(struct net_device *dev)
+{
+ struct wiz_private *wp = netdev_priv(dev);
+ u32 txbuf_total = 0, i;
+ u16 mem_cfg = 0;
+ u16 rx_size, tx_size;
+
+ netif_dbg(wp, drv, wp->dev, "w5300 chip reset\n");
+
+ /* W5300 is initialized by sending RESET command. */
+ w5300_write(wp, MR, MR_RST);
+ mdelay(5);
+
+ /* Mode Register Setting
+ * Ping uses S/W stack of the Linux kernel. Set the Ping Block.*/
+ w5300_write(wp, MR, MR_WDF(1) | MR_PB);
+
+ /* Setting MAC address */
+ w5300_set_macaddr(wp, dev->dev_addr);
+
+ /* Setting the size of Rx/Tx FIFO */
+ for (i = 0; i < MAX_SOCK_NUM; ++i) {
+ if (wp->rxbuf_conf[i] > 64) {
+ netif_err(wp, drv, wp->dev,
+ "Illegal Channel(%d) RX memory size.\n", i);
+
+ return -EINVAL;
+ }
+ if (wp->txbuf_conf[i] > 64) {
+ netif_err(wp, drv, wp->dev,
+ "Illegal Channel(%d) TX memory size.\n", i);
+
+ return -EINVAL;
+ }
+ txbuf_total += wp->txbuf_conf[i];
+ }
+
+ if (txbuf_total % 8) {
+ netif_err(wp, drv, wp->dev,
+ "Illegal memory size register setting.\n");
+
+ return -EINVAL;
+ }
+
+ for (i = 0; i < 4; ++i) {
+ rx_size = (wp->rxbuf_conf[2*i] << 8) | wp->rxbuf_conf[2*i+1];
+ tx_size = (wp->txbuf_conf[2*i] << 8) | wp->txbuf_conf[2*i+1];
+
+ w5300_write(wp, RMSR + 2*i, rx_size);
+ w5300_write(wp, TMSR + 2*i, tx_size);
+ }
+
+ /* Setting FIFO Memory Type (TX&RX) */
+ for (i = 0; i < txbuf_total / 8; ++i) {
+ mem_cfg <<= 1;
+ mem_cfg |= 1;
+ }
+ w5300_write(wp, MTYPER, mem_cfg);
+
+ /* Masking all interrupts */
+ w5300_write(wp, IMR, 0x0000);
+
+ return 0;
+}
+
+/* Interrupt Handler(ISR) */
+static irqreturn_t wiz_interrupt(int irq, void *dev_instance)
+{
+ struct net_device *dev = dev_instance;
+ struct wiz_private *wp = netdev_priv(dev);
+ int timeout = 100;
+ u16 isr, ssr;
+ int s;
+
+ isr = w5300_read(wp, IR);
+
+ /* Completing all interrupts at a time. */
+ while (isr && timeout--) {
+ w5300_write(wp, IR, isr);
+
+ /* Finding the channel to create the interrupt */
+ s = find_first_bit((ulong *)&isr, sizeof(u16));
+ ssr = w5300_read(wp, Sn_IR(s));
+ /* socket interrupt is cleared. */
+ w5300_write(wp, Sn_IR(s), ssr);
+
+ netif_dbg(wp, intr, wp->dev,
+ "ISR = %X, SSR = %X, s = %X\n",
+ isr, ssr, s);
+
+ if (likely(!s)) {
+ if (ssr & Sn_IR_RECV) {
+ /* Interrupt disable. */
+ w5300_interrupt_disable(wp);
+ /* Receiving by polling method */
+ napi_schedule(&wp->napi);
+ }
+ }
+
+ /* Is there any interrupt to be processed? */
+ isr = w5300_read(wp, IR);
+ }
+
+ return IRQ_HANDLED;
+}
+
+static int wiz_open(struct net_device *dev)
+{
+ struct wiz_private *wp = netdev_priv(dev);
+ int ret;
+
+ napi_enable(&wp->napi);
+
+ ret = request_irq(dev->irq, wiz_interrupt, IRQF_SHARED,
+ dev->name, dev);
+ if (ret < 0) {
+ netif_err(wp, ifup, wp->dev, "request_irq() error!\n");
+ return ret;
+ }
+
+ w5300_interrupt_enable(wp);
+
+ /* Sending OPEN command to use channel 0 as MACRAW mode. */
+ ret = w5300_channel_open(wp, Sn_MR_MACRAW_MF);
+ if (ret < 0) {
+ netif_err(wp, ifup, wp->dev, "w5300 channel open fail!\n");
+ return ret;
+ }
+
+ netif_carrier_on(dev);
+ netif_start_queue(dev);
+
+ return 0;
+}
+
+static int wiz_close(struct net_device *dev)
+{
+ struct wiz_private *wp = netdev_priv(dev);
+ int timeout = 1000;
+
+ napi_disable(&wp->napi);
+ netif_carrier_off(dev);
+
+ /* Interrupt masking of all channels */
+ w5300_write(wp, IMR, 0x0000);
+ w5300_write(wp, Sn_CR(0), Sn_CR_CLOSE);
+
+ while (timeout--) {
+ if (!w5300_read(wp, Sn_CR(0)))
+ break;
+ udelay(1);
+ }
+
+ free_irq(dev->irq, dev);
+
+ return 0;
+}
+
+static int w5300_send_data(struct wiz_private *wp, u8 *buf, ssize_t len)
+{
+ int i;
+ u16 send_data;
+ int timeout = 1000;
+
+ /* Writing packets in to Tx FIFO */
+ for (i = 0; i < len; i += 2) {
+ send_data = (buf[i] << 8) | buf[i+1];
+ w5300_write(wp, Sn_TX_FIFO(0), send_data);
+ }
+
+ w5300_write(wp, Sn_TX_WRSR(0), (u16)(len >> 16));
+ w5300_write(wp, Sn_TX_WRSR(0) + 2, (u16)len);
+ w5300_write(wp, Sn_CR(0), Sn_CR_SEND);
+
+ while (timeout--) {
+ if (!w5300_read(wp, Sn_CR(0)))
+ return len;
+ udelay(1);
+ }
+
+ return -EBUSY;
+}
+
+/* Function to transmit data at the MACRAW mode */
+static int wiz_start_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+ struct wiz_private *wp = netdev_priv(dev);
+ int ret;
+
+ ret = w5300_send_data(wp, skb->data, skb->len);
+
+ /* Statistical Process */
+ if (ret < 0) {
+ dev->stats.tx_dropped++;
+ } else {
+ dev->stats.tx_bytes += skb->len;
+ dev->stats.tx_packets++;
+ dev->trans_start = jiffies;
+ netif_dbg(wp, tx_done, wp->dev,
+ "tx done, packet size = %d\n", skb->len);
+ }
+ dev_kfree_skb(skb);
+
+ return NETDEV_TX_OK;
+}
+
+/* It is called when multi-cast list or flag is changed. */
+static void wiz_set_multicast(struct net_device *dev)
+{
+ struct wiz_private *wp = netdev_priv(dev);
+ int ret;
+ u32 type = dev->flags & IFF_PROMISC ? Sn_MR_MACRAW : Sn_MR_MACRAW_MF;
+
+ ret = w5300_channel_open(wp, type);
+ if (ret < 0) {
+ netif_err(wp, ifup, wp->dev,
+ "w5300 channel open fail!\n");
+ }
+}
+
+static int wiz_set_mac_address(struct net_device *dev, void *addr)
+{
+ struct wiz_private *wp = netdev_priv(dev);
+ struct sockaddr *sock_addr = addr;
+
+ netif_dbg(wp, drv, wp->dev, "set mac address");
+
+ spin_lock(&wp->lock);
+ w5300_set_macaddr(wp, sock_addr->sa_data);
+ memcpy(dev->dev_addr, sock_addr->sa_data, dev->addr_len);
+ spin_unlock(&wp->lock);
+
+ return 0;
+}
+
+static void wiz_tx_timeout(struct net_device *dev)
+{
+ struct wiz_private *wp = netdev_priv(dev);
+ unsigned long flags;
+
+ netif_dbg(wp, timer, wp->dev, "Transmit timeout");
+
+ spin_lock_irqsave(&wp->lock, flags);
+
+ /* Initializing W5300 chip. */
+ if (w5300_reset(dev) < 0) {
+ netif_err(wp, timer, wp->dev, "w5300 reset fail!\n");
+ } else {
+ /* Waking up network interface */
+ netif_wake_queue(dev);
+ }
+
+ spin_unlock_irqrestore(&wp->lock, flags);
+}
+
+/*
+ * Polling function that handles reception in MACRAW mode.
+ * The interrupt handler masks the interrupt when a RECV interrupt occurs
+ * and schedules this function to do the actual receive processing.
+ * The interrupt is unmasked again once reception is complete.
+ * RECV interrupts often arrive at short intervals, so handling reception
+ * in the interrupt handler itself would put a heavy load on the system.
+ */
+static int wiz_rx_poll(struct napi_struct *napi, int budget)
+{
+ struct wiz_private *wp = container_of(napi, struct wiz_private, napi);
+ struct net_device *dev = wp->dev;
+ int npackets = 0;
+
+ /* Process received packets while the Rx FIFO contains data */
+ while (w5300_get_rxsize(wp, 0) > 0) {
+ struct sk_buff *skb;
+ u16 rxbuf_len, pktlen;
+ u32 crc;
+
+ /* The first 2 bytes hold the packet length. */
+ w5300_recv_data(wp, 0, (u8 *)&pktlen, 2);
+ pktlen = be16_to_cpu(pktlen);
+
+ netif_dbg(wp, rx_err, wp->dev, "pktlen = %d\n", pktlen);
+
+ /*
+ * Allotting the socket buffer in which packet will be contained
+ * Ethernet packet is of 14byte.
+ * In order to make it multiplied by 2, the buffer allocation
+ * should be 2bytes bigger than the packet.
+ */
+ skb = netdev_alloc_skb_ip_align(dev, pktlen);
+ if (!skb) {
+ u8 temp[pktlen + 4];
+
+ dev->stats.rx_dropped++;
+ w5300_recv_data(wp, 0, temp, pktlen + 4);
+ continue;
+ }
+
+ /* Initializing the socket buffer */
+ skb->dev = dev;
+ skb_reserve(skb, 2);
+ skb_put(skb, pktlen);
+
+ /* Reading packets from W5300 Rx FIFO into socket buffer. */
+ w5300_recv_data(wp, 0, (u8 *)skb->data, pktlen);
+
+ /* Reading and discarding 4byte CRC. */
+ w5300_recv_data(wp, 0, (u8 *)&crc, 4);
+ crc = be32_to_cpu(crc);
+
+ /* The packet type is Ethernet. */
+ skb->protocol = eth_type_trans(skb, dev);
+
+ /* Pass the packet to the upper stack (kernel). */
+ netif_receive_skb(skb);
+
+ /* Processing statistical information */
+ dev->stats.rx_packets++;
+ dev->stats.rx_bytes += pktlen;
+ dev->last_rx = jiffies;
+ rxbuf_len -= pktlen;
+ npackets++;
+
+ if (npackets >= budget)
+ break;
+ }
+
+ /* If fewer packets than the budget were handled when the loop exits,
+ * the receive processing is complete. */
+ if (npackets < budget) {
+ unsigned long flags;
+
+ spin_lock_irqsave(&wp->lock, flags);
+ w5300_interrupt_enable(wp);
+ __napi_complete(napi);
+ spin_unlock_irqrestore(&wp->lock, flags);
+ }
+ return npackets;
+}
+
+static const struct net_device_ops wiz_netdev_ops = {
+ .ndo_open = wiz_open,
+ .ndo_stop = wiz_close,
+ .ndo_validate_addr = eth_validate_addr,
+ .ndo_set_mac_address = wiz_set_mac_address,
+ .ndo_set_rx_mode = wiz_set_multicast,
+ .ndo_start_xmit = wiz_start_xmit,
+ .ndo_tx_timeout = wiz_tx_timeout,
+};
+
+/* Initialize W5300 driver. */
+static int __devinit w5300_drv_probe(struct platform_device *pdev)
+{
+ struct net_device *dev;
+ struct wiz_private *wp;
+ struct resource *res;
+ void __iomem *addr;
+ int ret;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ /* Request the chip register regions. */
+ if (!request_mem_region(res->start, resource_size(res), DEV_NAME)) {
+ ret = -EBUSY;
+ goto out;
+ }
+
+ /* Allocate the struct net_device that manages the W5300 */
+ dev = alloc_etherdev(sizeof(struct wiz_private));
+ if (!dev) {
+ ret = -ENOMEM;
+ goto release_region;
+ }
+
+ dev->dma = (unsigned char)-1;
+ dev->irq = platform_get_irq(pdev, 0);
+ wp = netdev_priv(dev);
+ wp->dev = dev;
+ wp->msg_enable = (debug < 0 ? W5300_DEF_MSG_ENABLE : debug);
+ addr = ioremap(res->start, SZ_1M);
+ if (!addr) {
+ ret = -ENOMEM;
+ goto release_both;
+ }
+
+ platform_set_drvdata(pdev, dev);
+ wp->base = addr;
+
+ spin_lock_init(&wp->lock);
+
+ /* Initialization of Rx/Tx FIFO size */
+ memcpy(wp->rxbuf_conf, w5300_rxbuf_conf, MAX_SOCK_NUM);
+ memcpy(wp->txbuf_conf, w5300_txbuf_conf, MAX_SOCK_NUM);
+
+ dev->base_addr = res->start;
+
+ memcpy(dev->dev_addr, w5300_defmac, dev->addr_len);
+ dev->netdev_ops = &wiz_netdev_ops;
+
+ /* Setting napi. Enabling to process max 16 packets at a time. */
+ netif_napi_add(dev, &wp->napi, wiz_rx_poll, 16);
+
+ dev->watchdog_timeo = msecs_to_jiffies(watchdog);
+
+ ret = w5300_reset(dev);
+ if (ret < 0)
+ goto release_both;
+
+ ret = register_netdev(dev);
+ if (ret != 0) {
+ platform_set_drvdata(pdev, NULL);
+ iounmap(addr);
+release_both:
+ free_netdev(dev);
+release_region:
+ release_mem_region(res->start, resource_size(res));
+ }
+out:
+ return ret;
+}
+
+static int __devexit w5300_drv_remove(struct platform_device *pdev)
+{
+ struct net_device *dev = platform_get_drvdata(pdev);
+ struct wiz_private *wp = netdev_priv(dev);
+ struct resource *res;
+
+ platform_set_drvdata(pdev, NULL);
+ unregister_netdev(dev);
+
+ iounmap(wp->base);
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (res != NULL)
+ release_mem_region(res->start, resource_size(res));
+
+ free_netdev(dev);
+
+ return 0;
+}
+
+#ifdef CONFIG_PM
+
+static int w5300_drv_suspend(struct platform_device *pdev, pm_message_t state)
+{
+ struct net_device *dev = platform_get_drvdata(pdev);
+
+ if (dev) {
+ struct wiz_private *wp = netdev_priv(dev);
+
+ if (netif_running(dev)) {
+ int timeout = 1000;
+
+ netif_carrier_off(dev);
+ netif_device_detach(dev);
+ w5300_write(wp, IMR, 0x0000);
+ w5300_write(wp, Sn_CR(0), Sn_CR_CLOSE);
+
+ while (timeout--) {
+ if (!w5300_read(wp, Sn_CR(0)))
+ return 0;
+ udelay(1);
+ }
+ return -EBUSY;
+ }
+ }
+ return 0;
+}
+
+static int w5300_drv_resume(struct platform_device *pdev)
+{
+ struct net_device *dev = platform_get_drvdata(pdev);
+ int ret = 0;
+
+ if (dev) {
+ struct wiz_private *wp = netdev_priv(dev);
+
+ if (netif_running(dev)) {
+ ret = w5300_reset(dev);
+ if (ret < 0)
+ goto out;
+
+ w5300_interrupt_enable(wp);
+
+ ret = w5300_channel_open(wp, Sn_MR_MACRAW_MF);
+ if (ret < 0)
+ goto out;
+
+ netif_carrier_on(dev);
+ netif_device_attach(dev);
+ }
+ }
+
+out:
+ return ret;
+}
+#endif /* CONFIG_PM */
+
+static struct platform_driver w5300_driver = {
+ .driver = {
+ .name = DEV_NAME,
+ .owner = THIS_MODULE,
+ },
+ .probe = w5300_drv_probe,
+ .remove = __devexit_p(w5300_drv_remove),
+#ifdef CONFIG_PM
+ .suspend = w5300_drv_suspend,
+ .resume = w5300_drv_resume,
+#endif
+};
+
+static int __init wiz_module_init(void)
+{
+ return platform_driver_register(&w5300_driver);
+}
+
+static void __exit wiz_module_exit(void)
+{
+ platform_driver_unregister(&w5300_driver);
+}
+
+module_init(wiz_module_init);
+module_exit(wiz_module_exit);
diff --git a/drivers/net/ethernet/wiznet/w5300.h b/drivers/net/ethernet/wiznet/w5300.h
new file mode 100644
index 0000000..0d12288
--- /dev/null
+++ b/drivers/net/ethernet/wiznet/w5300.h
@@ -0,0 +1,121 @@
+#ifndef _W5300_H_
+#define _W5300_H_
+
+/* Maximum socket number. W5300 supports max 8 channels. */
+#define MAX_SOCK_NUM 8
+
+/* socket register */
+#define CH_BASE (0x200)
+
+/* size of each channel register map */
+#define CH_SIZE 0x40
+
+#define MR (0) /**< Mode register */
+#define IR (0x02) /**< Interrupt register */
+#define IMR (0x04) /**< Interrupt mask register */
+#define SHAR (0x08) /**< Source MAC register address */
+#define TMSR (0x20) /**< Transmit memory size register */
+#define RMSR (0x28) /**< Receive memory size register */
+
+/*
+ * Memory Type Register
+ * '1' - TX memory
+ * '0' - RX memory
+ */
+#define MTYPER (0x30)
+
+/* Chip ID register(=0x5300) */
+#define IDR (0xFE)
+#define IDR1 (IDR + 1)
+
+/* socket Mode register */
+#define Sn_MR(ch) (CH_BASE + ch * CH_SIZE + 0x00)
+#define Sn_MR1(ch) (Sn_MR(ch)+1)
+
+/* socket command register */
+#define Sn_CR(ch) (CH_BASE + ch * CH_SIZE + 0x02)
+#define Sn_CR1(ch) (Sn_CR(ch)+1)
+
+/* socket interrupt register */
+#define Sn_IR(ch) (CH_BASE + ch * CH_SIZE + 0x06)
+
+/* Transmit Size Register (Byte count) */
+#define Sn_TX_WRSR(ch) (CH_BASE + ch * CH_SIZE + 0x20)
+
+/* Transmit free memory size register (Byte count) */
+#define Sn_TX_FSR(ch) (CH_BASE + ch * CH_SIZE + 0x24)
+
+/* Received data size register (Byte count) */
+#define Sn_RX_RSR(ch) (CH_BASE + ch * CH_SIZE + 0x28)
+
+/* FIFO register for Transmit */
+#define Sn_TX_FIFO(ch) (CH_BASE + ch * CH_SIZE + 0x2E)
+
+/* FIFO register for Receive */
+#define Sn_RX_FIFO(ch) (CH_BASE + ch * CH_SIZE + 0x30)
+
+/* MODE register values */
+#define MR_DBW (1 << 15) /**< Data bus width bit of MR. */
+#define MR_MPF (1 << 14) /**< Mac layer pause frame bit of MR. */
+#define MR_WDF(x) ((x & 0x07) << 11) /**< Write data fetch time bit of MR. */
+#define MR_RDH (1 << 10) /**< Read data hold time bit of MR. */
+#define MR_FS (1 << 8) /**< FIFO swap bit of MR. */
+#define MR_RST (1 << 7) /**< S/W reset bit of MR. */
+#define MR_MT (1 << 5) /**< Memory test bit of MR. */
+#define MR_PB (1 << 4) /**< Ping block bit of MR. */
+#define MR_PPPoE (1 << 3) /**< PPPoE bit of MR. */
+#define MR_DBS (1 << 2) /**< Data bus swap of MR. */
+#define MR_IND (1 << 0) /**< Indirect mode bit of MR. */
+
+/* IR register values */
+#define IR_IPCF (1 << 7) /**< IP conflict bit of IR. */
+#define IR_DPUR (1 << 6) /**< Destination port unreachable bit of IR. */
+#define IR_PPPT (1 << 5) /**< PPPoE terminate bit of IR. */
+#define IR_FMTU (1 << 4) /**< Fragment MTU bit of IR. */
+#define IR_SnINT(n) (0x01 << n) /**< SOCKETn interrupt occurrence bit of IR. */
+
+/* Sn_MR values */
+#define Sn_MR_ALIGN (1 << 8) /**< Alignment bit of Sn_MR. */
+#define Sn_MR_MULTI (1 << 7) /**< Multicasting bit of Sn_MR. */
+#define Sn_MR_MF (1 << 6) /**< MAC filter bit of Sn_MR. */
+#define Sn_MR_IGMPv (1 << 5) /**< IGMP version bit of Sn_MR. */
+#define Sn_MR_ND (1 << 5) /**< No delayed ack bit of Sn_MR. */
+#define Sn_MR_CLOSE 0x00 /**< Protocol bits of Sn_MR. */
+#define Sn_MR_TCP 0x01 /**< Protocol bits of Sn_MR. */
+#define Sn_MR_UDP 0x02 /**< Protocol bits of Sn_MR. */
+#define Sn_MR_IPRAW 0x03 /**< Protocol bits of Sn_MR. */
+#define Sn_MR_MACRAW 0x04 /**< Protocol bits of Sn_MR. */
+#define Sn_MR_MACRAW_MF 0x44 /**< Protocol bits of Sn_MR */
+#define Sn_MR_PPPoE 0x05 /**< Protocol bits of Sn_MR. */
+
+/* Sn_CR values */
+#define Sn_CR_OPEN 0x01 /**< OPEN command value of Sn_CR. */
+#define Sn_CR_LISTEN 0x02 /**< LISTEN command value of Sn_CR. */
+#define Sn_CR_CONNECT 0x04 /**< CONNECT command value of Sn_CR. */
+#define Sn_CR_DISCON 0x08 /**< DISCONNECT command value of Sn_CR. */
+#define Sn_CR_CLOSE 0x10 /**< CLOSE command value of Sn_CR. */
+#define Sn_CR_SEND 0x20 /**< SEND command value of Sn_CR. */
+#define Sn_CR_SEND_MAC 0x21 /**< SEND_MAC command value of Sn_CR. */
+#define Sn_CR_SEND_KEEP 0x22 /**< SEND_KEEP command value of Sn_CR */
+#define Sn_CR_RECV 0x40 /**< RECV command value of Sn_CR */
+#define Sn_CR_PCON 0x23 /**< PCON command value of Sn_CR */
+#define Sn_CR_PDISCON 0x24 /**< PDISCON command value of Sn_CR */
+#define Sn_CR_PCR 0x25 /**< PCR command value of Sn_CR */
+#define Sn_CR_PCN 0x26 /**< PCN command value of Sn_CR */
+#define Sn_CR_PCJ 0x27 /**< PCJ command value of Sn_CR */
+
+/* Sn_IR values */
+#define Sn_IR_PRECV 0x80 /**< PPP receive bit of Sn_IR */
+#define Sn_IR_PFAIL 0x40 /**< PPP fail bit of Sn_IR */
+#define Sn_IR_PNEXT 0x20 /**< PPP next phase bit of Sn_IR */
+#define Sn_IR_SENDOK 0x10 /**< Send OK bit of Sn_IR */
+#define Sn_IR_TIMEOUT 0x08 /**< Timeout bit of Sn_IR */
+#define Sn_IR_RECV 0x04 /**< Receive bit of Sn_IR */
+#define Sn_IR_DISCON 0x02 /**< Disconnect bit of Sn_IR */
+#define Sn_IR_CON 0x01 /**< Connect bit of Sn_IR */
+
+/* W5300 register read/write functions (16-bit interface only). */
+#define w5300_write(wp, addr, val) writew(val, (wp->base + addr))
+#define w5300_read(wp, addr) readw((wp->base + addr))
+
+#endif /* _W5300_H_ */
--
1.7.1
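A note on the per-channel register macros in w5300.h: they compute CH_BASE + ch * CH_SIZE + offset without parenthesizing ch, which is fine for plain arguments but changes meaning when an expression is passed. The sketch below is a user-space illustration only. CH_BASE, CH_SIZE and the 0x02 offset are copied from the patch; SN_CR_POSTED, SN_CR_SAFE and the helper functions are hypothetical names made up for this example and are not part of the driver.

```c
#include <assert.h>

/* Values copied from the w5300.h hunk. */
#define CH_BASE 0x200
#define CH_SIZE 0x40

/* Same shape as the posted Sn_CR(): the argument is not parenthesized. */
#define SN_CR_POSTED(ch) (CH_BASE + ch * CH_SIZE + 0x02)

/* Hypothetical hardened variant with a parenthesized argument. */
#define SN_CR_SAFE(ch) (CH_BASE + (ch) * CH_SIZE + 0x02)

/* Expands to CH_BASE + ch + 1 * CH_SIZE + 0x02, which is not the next channel. */
unsigned int sn_cr_posted_next(unsigned int ch)
{
	return SN_CR_POSTED(ch + 1);
}

/* Expands to CH_BASE + (ch + 1) * CH_SIZE + 0x02, the command register of channel ch + 1. */
unsigned int sn_cr_safe_next(unsigned int ch)
{
	return SN_CR_SAFE(ch + 1);
}

/* Plain arguments agree; expression arguments do not. */
void sn_cr_selfcheck(void)
{
	assert(SN_CR_POSTED(1) == SN_CR_SAFE(1));
	assert(sn_cr_posted_next(1) != sn_cr_safe_next(1));
}
```

The driver as posted only seems to pass simple values such as 0, so this is a latent hazard rather than a bug in the shown code.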
* Re: [PATCH] usbnet: fix oops in usbnet_start_xmit
From: Richard Cochran @ 2011-11-07 14:42 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Oliver Neukum, netdev@vger.kernel.org, David S. Miller,
devel@openvz.org, Michael Riesch
In-Reply-To: <4EB7E5A2.2070407@openvz.org>
On Mon, Nov 07, 2011 at 06:05:22PM +0400, Konstantin Khlebnikov wrote:
> Richard Cochran wrote:
> >On Sun, Nov 06, 2011 at 10:33:37PM +0300, Konstantin Khlebnikov wrote:
> >>This patch fixes the bug added in commit v3.1-rc7-1055-gf9b491e
> >>SKB can be NULL at this point, at least for cdc-ncm.
> >
> >What? You mean .ndo_start_xmit is called with skb NULL?
>
> No. cdc_ncm calls usbnet_start_xmit with a NULL skb from its timer handler,
> and the tx_fixup hook picks up the skb from internal context. Yeah, it is really messy.
You said it.
Can you please submit the fix suggested by Michael and myself?
Thanks,
Richard
* [PATCH v2] usbnet: fix oops in usbnet_start_xmit
From: Konstantin Khlebnikov @ 2011-11-07 15:54 UTC (permalink / raw)
To: Oliver Neukum
Cc: Michael Riesch, Alexey Orishko, netdev, Richard Cochran,
David S. Miller, devel
In-Reply-To: <20111106183337.5379.4356.stgit@zurg>
This patch fixes the bug added in commit v3.1-rc7-1055-gf9b491e.
The skb can be NULL at this point, at least for cdc-ncm, so skb_tx_timestamp()
must only be called when the skb is non-NULL.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
drivers/net/usb/usbnet.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index 7d60821..fae0fbd 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1057,7 +1057,8 @@ netdev_tx_t usbnet_start_xmit (struct sk_buff *skb,
unsigned long flags;
int retval;
- skb_tx_timestamp(skb);
+ if (skb)
+ skb_tx_timestamp(skb);
// some devices want funky USB-level framing, for
// win32 driver (usually) and/or hardware quirks
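For readers following the thread, the shape of the fix can be sketched in user space as below. This is an illustration under stated assumptions, not the usbnet code: struct fake_skb, fake_tx_timestamp() and fake_start_xmit() are hypothetical stand-ins for struct sk_buff, skb_tx_timestamp() and the top of usbnet_start_xmit(). The only behaviour modelled is that the timestamp helper dereferences its argument, so it has to be skipped when the caller passes NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only what the sketch needs. */
struct fake_skb {
	int timestamped;
};

/* Stand-in for skb_tx_timestamp(): dereferences skb unconditionally. */
void fake_tx_timestamp(struct fake_skb *skb)
{
	skb->timestamped = 1;
}

/*
 * Stand-in for the start of usbnet_start_xmit().
 * Returns 1 if a timestamp was taken, 0 if the skb was NULL.
 */
int fake_start_xmit(struct fake_skb *skb)
{
	if (skb) {
		fake_tx_timestamp(skb);
		return 1;
	}
	/* NULL skb: per the thread, tx_fixup supplies the data later. */
	return 0;
}

/* Both call styles described in the thread must be safe. */
void fake_xmit_selfcheck(void)
{
	struct fake_skb skb = { 0 };

	assert(fake_start_xmit(&skb) == 1);
	assert(fake_start_xmit(NULL) == 0);
}
```

Without the guard, the NULL call path would reach the dereference in the timestamp helper, which matches the oops named in the subject line.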
* Re: [PATCH net-next 0/2] 802.1ad S-VLAN support
From: Ben Hutchings @ 2011-11-07 15:11 UTC (permalink / raw)
To: David Lamparter; +Cc: netdev
In-Reply-To: <1320512055-1231037-1-git-send-email-equinox@diac24.net>
On Sat, 2011-11-05 at 17:54 +0100, David Lamparter wrote:
> Hi DaveM, hi everyone,
>
>
> this kernel patch, together with the iproute2 userspace support,
> allows creating 802.1ad S-VLAN devices.
>
> This feature might have weird interactions with hardware VLAN
> acceleration. I've done my best to make sure it doesn't break
> 802.1Q, but my access to hardware is rather limited. I did grep
> & scan all drivers for maybe-affected vlan behaviour and found
> nothing. I've tested on e1000, forcedeth, virtio and a Kirkwood
> ARM.
I didn't try it at all, but it looks reasonable to me.
We definitely need to think about how MTU/MRU are configured when
multiple VLAN tags are used, though I don't think it's essential to do
before this goes in. To be slightly more blunt than your documentation,
our current handling of MTU/MRU and VLANs is a botch.
Do you have any plan to improve that? Or to allow use of offload
features for multiple-tagged packets?
Ben.
> It'd be nice to get this into the next merge window to get some
> people with funny hardware a nice smoke trail...
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
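To make the MTU/MRU concern concrete, here is a small worked calculation of frame sizes with stacked tags. It is a sketch using standard Ethernet sizes (14-byte header, 4 bytes per VLAN tag, 4-byte FCS); the macro and function names are local to the example and are not kernel identifiers, and nothing here describes how the patch itself handles MTU.

```c
#include <assert.h>

#define ETH_HDR_LEN   14 /* destination MAC + source MAC + EtherType */
#define VLAN_TAG_LEN   4 /* TPID + TCI, once per tag */
#define ETH_FCS_LEN_B  4 /* frame check sequence */

/* On-wire frame length, FCS included, for `mtu` payload bytes and `ntags` VLAN tags. */
unsigned int frame_len(unsigned int mtu, unsigned int ntags)
{
	return ETH_HDR_LEN + ntags * VLAN_TAG_LEN + mtu + ETH_FCS_LEN_B;
}

/* Largest payload that fits if the link accepts at most `max_frame` bytes. */
unsigned int payload_room(unsigned int max_frame, unsigned int ntags)
{
	unsigned int overhead = ETH_HDR_LEN + ntags * VLAN_TAG_LEN + ETH_FCS_LEN_B;

	return max_frame > overhead ? max_frame - overhead : 0;
}

/* Worked numbers for untagged, single-tagged and double-tagged frames. */
void vlan_mtu_selfcheck(void)
{
	assert(frame_len(1500, 0) == 1518);
	assert(frame_len(1500, 1) == 1522);
	assert(frame_len(1500, 2) == 1526);
	assert(payload_room(1522, 2) == 1496);
}
```

In other words, a receiver sized for single-tagged 1522-byte frames leaves only 1496 bytes of payload once an S-tag is stacked on top of a C-tag, which is the kind of mismatch the MTU/MRU question is about.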