* [RFC PATCH 04/17] phy/icplus: Fix read_status/config_aneg error handling
From: Kyle Moffett @ 2011-10-20 21:00 UTC
To: linux-kernel, netdev
Cc: Kyle Moffett, David S. Miller, Greg Dietsche, Giuseppe Cavallaro
In-Reply-To: <1319144425-15547-1-git-send-email-Kyle.D.Moffett@boeing.com>
Fixes the icplus PHY driver to propagate the return values of the
functions genphy_read_status() and genphy_config_aneg() instead of
ignoring them.
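For illustration, the before/after shape of this change can be modelled in plain userspace C. This is only a sketch under stated assumptions: `struct fake_phy` and the `fake_generic_read_status()` / `read_status_*()` helpers are hypothetical stand-ins for `struct phy_device` and `genphy_read_status()`, not kernel code.

```c
#include <errno.h>

/* Hypothetical stand-in for struct phy_device; only the fields used here. */
struct fake_phy {
	int addr;        /* MDIO address; 4 is the WAN port in the patch */
	int read_result; /* what the simulated generic read should return */
};

/* Stand-in for genphy_read_status(): returns 0 or a negative errno. */
static int fake_generic_read_status(struct fake_phy *phy)
{
	return phy->read_result;
}

/* Old shape: the result of the generic call is silently discarded. */
static int read_status_ignoring(struct fake_phy *phy)
{
	if (phy->addr == 4) /* WAN port */
		fake_generic_read_status(phy);
	return 0;
}

/* New shape: the result of the generic call is returned to the caller. */
static int read_status_propagating(struct fake_phy *phy)
{
	if (phy->addr == 4) /* WAN port */
		return fake_generic_read_status(phy);
	/* Nothing to read for switch ports */
	return 0;
}
```

With a simulated `-EIO` from the generic helper, the old shape still reports success for the WAN port, while the new shape hands the error back; switch ports return 0 in both.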
NOTE: Completely untested. Needs somebody with hardware to try it out.
Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
---
drivers/net/phy/icplus.c | 10 +++++-----
1 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/phy/icplus.c b/drivers/net/phy/icplus.c
index d4cbc29..2969dac 100644
--- a/drivers/net/phy/icplus.c
+++ b/drivers/net/phy/icplus.c
@@ -115,19 +115,19 @@ static int ip1001_config_init(struct phy_device *phydev)
static int ip175c_read_status(struct phy_device *phydev)
{
if (phydev->addr == 4) /* WAN port */
- genphy_read_status(phydev);
- else
- /* Don't need to read status for switch ports */
- phydev->irq = PHY_IGNORE_INTERRUPT;
+ return genphy_read_status(phydev);
+ /* Don't need to read status for switch ports */
+ phydev->irq = PHY_IGNORE_INTERRUPT;
return 0;
}
static int ip175c_config_aneg(struct phy_device *phydev)
{
if (phydev->addr == 4) /* WAN port */
- genphy_config_aneg(phydev);
+ return genphy_config_aneg(phydev);
+ /* Don't need to do anything for switch ports */
return 0;
}
--
1.7.2.5
* [RFC PATCH 03/17] greth: Allow PHYs to override ->read_status
From: Kyle Moffett @ 2011-10-20 21:00 UTC
To: linux-kernel, netdev; +Cc: Kyle Moffett, Kristoffer Glembo
In-Reply-To: <1319144425-15547-1-git-send-email-Kyle.D.Moffett@boeing.com>
Instead of manually calling genphy_read_status(), the greth driver
should call phy_read_status() to allow the PHY driver to override the
read_status method with its own version.
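The difference between the two calls can be sketched in userspace C, assuming that the wrapper simply dispatches through the driver's method table. All `fake_*` names below are hypothetical stand-ins (modelled loosely on `struct phy_driver` and `struct phy_device`), and the speeds are arbitrary marker values.

```c
struct fake_phy;

/* Hypothetical per-driver method table, modelled on struct phy_driver. */
struct fake_phy_driver {
	int (*read_status)(struct fake_phy *phy);
};

/* Hypothetical stand-in for struct phy_device. */
struct fake_phy {
	const struct fake_phy_driver *drv;
	int speed;
};

/* Stand-in for genphy_read_status(): the generic behaviour. */
static int fake_generic_read_status(struct fake_phy *phy)
{
	phy->speed = 100; /* arbitrary marker value */
	return 0;
}

/* A driver-specific override, e.g. for a PHY with vendor registers. */
static int fake_vendor_read_status(struct fake_phy *phy)
{
	phy->speed = 1000; /* arbitrary marker value */
	return 0;
}

/* Stand-in for phy_read_status(): always goes through the driver. */
static int fake_phy_read_status(struct fake_phy *phy)
{
	return phy->drv->read_status(phy);
}

static const struct fake_phy_driver fake_generic_driver = {
	.read_status = fake_generic_read_status,
};

static const struct fake_phy_driver fake_vendor_driver = {
	.read_status = fake_vendor_read_status,
};
```

Calling `fake_generic_read_status()` directly on a PHY bound to `fake_vendor_driver` bypasses the override, which is the situation the patch avoids in greth.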
NOTE: Completely untested. Needs somebody with hardware to try it out.
Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
---
drivers/net/greth.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/greth.c b/drivers/net/greth.c
index 52a3900..e7f268f 100644
--- a/drivers/net/greth.c
+++ b/drivers/net/greth.c
@@ -1367,7 +1367,7 @@ static int greth_mdio_init(struct greth_private *greth)
timeout = jiffies + 6*HZ;
while (!phy_aneg_done(greth->phy) && time_before(jiffies, timeout)) {
}
- genphy_read_status(greth->phy);
+ phy_read_status(greth->phy);
greth_link_change(greth->netdev);
}
--
1.7.2.5
* [RFC PATCH 02/17] of_mdio: Don't phy_scan_fixups() twice
From: Kyle Moffett @ 2011-10-20 21:00 UTC
To: linux-kernel, netdev; +Cc: Kyle Moffett, Grant Likely, devicetree-discuss
In-Reply-To: <1319144425-15547-1-git-send-email-Kyle.D.Moffett@boeing.com>
The phy_device_register() call five lines further down already calls
phy_scan_fixups(), so there is no need to do it a second time.
Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
---
drivers/of/of_mdio.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
index d35e300..980c079 100644
--- a/drivers/of/of_mdio.c
+++ b/drivers/of/of_mdio.c
@@ -83,7 +83,6 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
addr);
continue;
}
- phy_scan_fixups(phy);
/* Associate the OF node with the device structure so it
* can be looked up later */
--
1.7.2.5
* [RFC PATCH 01/17] et1011c: Replaced PHY driver by a small dm646x board fixup
From: Kyle Moffett @ 2011-10-20 21:00 UTC
To: linux-kernel, netdev
Cc: Kevin Hilman, Russell King, Sekhar Nori, David S. Miller,
H Hartley Sweeten, John Stultz, Kyle Moffett, Florian Fainelli,
Giuseppe Cavallaro, Richard Cochran, linux-arm-kernel
In-Reply-To: <1319144425-15547-1-git-send-email-Kyle.D.Moffett@boeing.com>
The et1011c PHY driver has several noticeable code smells:
(1) It uses a "static int speed" variable to see if the speed changed
between calls to et1011c_read_status(). This obviously breaks if
more than one PHY is present in a system.
(2) The "GMII_INTERFACE" and "SYS_CLK_EN" bits should be properly set
at reset-time by hardwired pins on the ET1011C chip, as they are
specific to the wiring on the board. They should NOT be set by a
generic PHY driver, and at best belong in a board-specific fixup.
(3) The FIFO bits are "changed" to the default reset value specified
in the datasheet.
(4) The driver does not appear to contain code anywhere which undoes
any of the above changes if the interface drops from 1000BaseT to
100BaseTX without a chip reset. Instead it appears to perform an
extraneous BMCR_RESET in its ->config_aneg() method, which would
wipe out any settings applied by phy_register_fixup() and friends.
This PHY should be handled entirely by the genphy driver with only a
minimal board-specific "phy_register_fixup()" in the DM646x code.
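Point (1) above can be reproduced with a small userspace sketch. It is not the driver code: `struct fake_phy` and both update functions are hypothetical, and `reconfigured` merely counts how often the speed-change path runs.

```c
/* Hypothetical stand-in for a PHY device. */
struct fake_phy {
	int speed;        /* currently negotiated speed */
	int reconfigured; /* how often the speed-change path ran */
	int last_speed;   /* per-device memory of the previous speed */
};

/* Shape of the removed driver: one function-local static for all PHYs. */
static void update_shared_static(struct fake_phy *phy)
{
	static int speed; /* shared by every device that calls this function */

	if (speed != phy->speed) {
		speed = phy->speed;
		phy->reconfigured++;
	}
}

/* Per-device alternative: each PHY remembers its own last speed. */
static void update_per_device(struct fake_phy *phy)
{
	if (phy->last_speed != phy->speed) {
		phy->last_speed = phy->speed;
		phy->reconfigured++;
	}
}
```

With two PHYs that both come up at the same speed, the shared-static version runs the speed-change path only for whichever PHY is polled first; the second one is skipped because the static already holds that speed.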
NOTE: Completely untested. Needs somebody with hardware to try it out.
Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
---
arch/arm/mach-davinci/dm646x.c | 24 ++++++++
drivers/net/phy/Kconfig | 5 --
drivers/net/phy/Makefile | 1 -
drivers/net/phy/et1011c.c | 119 ----------------------------------------
4 files changed, 24 insertions(+), 125 deletions(-)
delete mode 100644 drivers/net/phy/et1011c.c
diff --git a/arch/arm/mach-davinci/dm646x.c b/arch/arm/mach-davinci/dm646x.c
index 1802e71..afcdf37 100644
--- a/arch/arm/mach-davinci/dm646x.c
+++ b/arch/arm/mach-davinci/dm646x.c
@@ -14,6 +14,7 @@
#include <linux/serial_8250.h>
#include <linux/platform_device.h>
#include <linux/gpio.h>
+#include <linux/phy.h>
#include <asm/mach/map.h>
@@ -908,6 +909,27 @@ void __init dm646x_init(void)
davinci_common_init(&davinci_soc_info_dm646x);
}
+/* Apparently the PHY bootstrap pin wiring on the board is wrong */
+#define ET1011C_CONFIG_REG (0x16)
+#define ET1011C_TX_FIFO_MASK (0x3000)
+#define ET1011C_TX_FIFO_DEPTH_16 (0x1000)
+#define ET1011C_SYS_CLK_EN (0x0010)
+#define ET1011C_INTERFACE_MASK (0x0007)
+#define ET1011C_INTERFACE_GMII_GTX_CLK (0x0002)
+static int dm646x_et1011c_phy_fixup(struct phy_device *phydev)
+{
+ int val = phy_read(phydev, ET1011C_CONFIG_REG);
+ if (val < 0)
+ return val;
+
+ val &= ~ET1011C_TX_FIFO_MASK;
+ val |= ET1011C_TX_FIFO_DEPTH_16;
+ val |= ET1011C_SYS_CLK_EN;
+ val &= ~ET1011C_INTERFACE_MASK;
+ val |= ET1011C_INTERFACE_GMII_GTX_CLK;
+ return phy_write(phydev, ET1011C_CONFIG_REG, val);
+}
+
static int __init dm646x_init_devices(void)
{
if (!cpu_is_davinci_dm646x())
@@ -917,6 +939,8 @@ static int __init dm646x_init_devices(void)
platform_device_register(&dm646x_emac_device);
clk_add_alias(NULL, dev_name(&dm646x_mdio_device.dev),
NULL, &dm646x_emac_device.dev);
+ phy_register_fixup_for_uid(0x0282f014, 0xfffffff0,
+ &dm646x_et1011c_phy_fixup);
return 0;
}
diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index a702443..fdd2ace 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -82,11 +82,6 @@ config STE10XP
---help---
This is the driver for the STe100p and STe101p PHYs.
-config LSI_ET1011C_PHY
- tristate "Driver for LSI ET1011C PHY"
- ---help---
- Supports the LSI ET1011C PHY.
-
config MICREL_PHY
tristate "Driver for Micrel PHYs"
---help---
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index 2333215..d4c0bd0 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -14,7 +14,6 @@ obj-$(CONFIG_BROADCOM_PHY) += broadcom.o
obj-$(CONFIG_BCM63XX_PHY) += bcm63xx.o
obj-$(CONFIG_ICPLUS_PHY) += icplus.o
obj-$(CONFIG_REALTEK_PHY) += realtek.o
-obj-$(CONFIG_LSI_ET1011C_PHY) += et1011c.o
obj-$(CONFIG_FIXED_PHY) += fixed.o
obj-$(CONFIG_MDIO_BITBANG) += mdio-bitbang.o
obj-$(CONFIG_MDIO_GPIO) += mdio-gpio.o
diff --git a/drivers/net/phy/et1011c.c b/drivers/net/phy/et1011c.c
deleted file mode 100644
index a8eb19e..0000000
--- a/drivers/net/phy/et1011c.c
+++ /dev/null
@@ -1,119 +0,0 @@
-/*
- * drivers/net/phy/et1011c.c
- *
- * Driver for LSI ET1011C PHYs
- *
- * Author: Chaithrika U S
- *
- * Copyright (c) 2008 Texas Instruments
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms of the GNU General Public License as published by the
- * Free Software Foundation; either version 2 of the License, or (at your
- * option) any later version.
- *
- */
-#include <linux/kernel.h>
-#include <linux/string.h>
-#include <linux/errno.h>
-#include <linux/unistd.h>
-#include <linux/interrupt.h>
-#include <linux/init.h>
-#include <linux/delay.h>
-#include <linux/netdevice.h>
-#include <linux/etherdevice.h>
-#include <linux/skbuff.h>
-#include <linux/spinlock.h>
-#include <linux/mm.h>
-#include <linux/module.h>
-#include <linux/mii.h>
-#include <linux/ethtool.h>
-#include <linux/phy.h>
-#include <linux/io.h>
-#include <linux/uaccess.h>
-#include <asm/irq.h>
-
-#define ET1011C_STATUS_REG (0x1A)
-#define ET1011C_CONFIG_REG (0x16)
-#define ET1011C_SPEED_MASK (0x0300)
-#define ET1011C_GIGABIT_SPEED (0x0200)
-#define ET1011C_TX_FIFO_MASK (0x3000)
-#define ET1011C_TX_FIFO_DEPTH_8 (0x0000)
-#define ET1011C_TX_FIFO_DEPTH_16 (0x1000)
-#define ET1011C_INTERFACE_MASK (0x0007)
-#define ET1011C_GMII_INTERFACE (0x0002)
-#define ET1011C_SYS_CLK_EN (0x01 << 4)
-
-
-MODULE_DESCRIPTION("LSI ET1011C PHY driver");
-MODULE_AUTHOR("Chaithrika U S");
-MODULE_LICENSE("GPL");
-
-static int et1011c_config_aneg(struct phy_device *phydev)
-{
- int ctl = 0;
- ctl = phy_read(phydev, MII_BMCR);
- if (ctl < 0)
- return ctl;
- ctl &= ~(BMCR_FULLDPLX | BMCR_SPEED100 | BMCR_SPEED1000 |
- BMCR_ANENABLE);
- /* First clear the PHY */
- phy_write(phydev, MII_BMCR, ctl | BMCR_RESET);
-
- return genphy_config_aneg(phydev);
-}
-
-static int et1011c_read_status(struct phy_device *phydev)
-{
- int ret;
- u32 val;
- static int speed;
- ret = genphy_read_status(phydev);
-
- if (speed != phydev->speed) {
- speed = phydev->speed;
- val = phy_read(phydev, ET1011C_STATUS_REG);
- if ((val & ET1011C_SPEED_MASK) ==
- ET1011C_GIGABIT_SPEED) {
- val = phy_read(phydev, ET1011C_CONFIG_REG);
- val &= ~ET1011C_TX_FIFO_MASK;
- phy_write(phydev, ET1011C_CONFIG_REG, val\
- | ET1011C_GMII_INTERFACE\
- | ET1011C_SYS_CLK_EN\
- | ET1011C_TX_FIFO_DEPTH_16);
-
- }
- }
- return ret;
-}
-
-static struct phy_driver et1011c_driver = {
- .phy_id = 0x0282f014,
- .name = "ET1011C",
- .phy_id_mask = 0xfffffff0,
- .features = (PHY_BASIC_FEATURES | SUPPORTED_1000baseT_Full),
- .flags = PHY_POLL,
- .config_aneg = et1011c_config_aneg,
- .read_status = et1011c_read_status,
- .driver = { .owner = THIS_MODULE,},
-};
-
-static int __init et1011c_init(void)
-{
- return phy_driver_register(&et1011c_driver);
-}
-
-static void __exit et1011c_exit(void)
-{
- phy_driver_unregister(&et1011c_driver);
-}
-
-module_init(et1011c_init);
-module_exit(et1011c_exit);
-
-static struct mdio_device_id __maybe_unused et1011c_tbl[] = {
- { 0x0282f014, 0xfffffff0 },
- { }
-};
-
-MODULE_DEVICE_TABLE(mdio, et1011c_tbl);
--
1.7.2.5
* Re: [PATCH net-next] myri10ge: fix truesize underestimation
From: Eric Dumazet @ 2011-10-20 20:59 UTC
To: Andrew Gallatin; +Cc: Jon Mason, David Miller, netdev
In-Reply-To: <4EA0885A.9010009@myri.com>
On Thursday 20 October 2011 at 16:45 -0400, Andrew Gallatin wrote:
> On 10/20/11 16:44, Eric Dumazet wrote:
> > On Thursday 20 October 2011 at 15:33 -0500, Jon Mason wrote:
> >> On Thu, Oct 20, 2011 at 3:10 PM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
> >>> skb->truesize must account for allocated memory, not the used part of
> >>> it. Doing this work is important to avoid unexpected OOM situations.
> >>>
> >>> Signed-off-by: Eric Dumazet<eric.dumazet@gmail.com>
> >>
> >> Acked-by: Jon Mason<mason@myri.com>
> >
> > Thanks for reviewing Jon !
> >
> >
>
> Please wait a second.. I think the patch is incorrect.
>
> There is already code in myri10ge_rx_skb_build() which
> attempts to set the truesize. However, it sets it to
> the used, rather than the allocated size so it is apparently
> incorrect.
>
> I'd prefer we fix that code.
Well, I believe I did exactly that :)
truesize of initial skb is fine.
Then for every frag added, you must add to skb->truesize the allocated
memory for this frag.
You add frags of a given size (small or big).
In the end, it's truesize += bytes * number_of_frags
(bytes being small_size or big_size)
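As a worked example of that accounting, here is a minimal userspace sketch. The function and parameter names are hypothetical, and any numbers fed into it are illustrative rather than taken from the driver.

```c
#include <stddef.h>

/*
 * Illustrative accounting only:
 *   base_truesize    - truesize of the initial skb
 *   frag_alloc_bytes - memory allocated per fragment (small or big size)
 *   nr_frags         - number of fragments attached
 */
static size_t truesize_allocated(size_t base_truesize,
				 size_t frag_alloc_bytes,
				 unsigned int nr_frags)
{
	return base_truesize + frag_alloc_bytes * nr_frags;
}

/* The under-estimate under discussion: only the bytes in use are counted. */
static size_t truesize_used(size_t base_truesize, size_t used_len)
{
	return base_truesize + used_len;
}
```

For instance, with an assumed base of 256 bytes, two fragments of 4096 allocated bytes each and 4200 bytes of received data, the allocated figure is 256 + 2 * 4096 = 8448, while the used figure is only 256 + 4200 = 4456.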
* Re: [PATCH 07/10] RDMA/cxgb4: DB Drop Recovery for RDMA and LLD queues.
From: David Miller @ 2011-10-20 20:57 UTC
To: swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW
Cc: roland-BHEL68pLQRGGvPXPguhicg, vipul-ut6Up61K2wZBDgjK7y7TUQ,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
divy-ut6Up61K2wZBDgjK7y7TUQ, dm-ut6Up61K2wZBDgjK7y7TUQ,
kumaras-ut6Up61K2wZBDgjK7y7TUQ
In-Reply-To: <4EA05A27.9090605-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
From: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Date: Thu, 20 Oct 2011 12:28:07 -0500
> On 10/20/2011 12:17 PM, Roland Dreier wrote:
>>> I believe 5 and 7 have build dependencies.
>> Right, missed that one too.
>>
>> But it seems 4,6,8,9,10 are independent of the rest of the series?
>>
>> ie I can trivially apply them and then worry about working out
>> the drivers/net / drivers/infiniband interdependency a bit later?
>>
>
> Some of these might be dependent on prior patches in the series. But if
> they aren't, yes, you could do that.
So, how do you guys want to do this? If you give me a list of which
patches I should put into net-next and leave the rest to the infiniband
tree, that'd work fine for me as long as net-next is left in a working
state independent of the infiniband tree.
* Re: [patch] pktgen: bug when calling ndelay in x86 architectures
From: Eric Dumazet @ 2011-10-20 20:55 UTC
To: David Miller
Cc: bhutchings, daniel.turull, netdev, robert, voravit, jens.laas
In-Reply-To: <20111020.162444.559487256559727633.davem@davemloft.net>
On Thursday 20 October 2011 at 16:24 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Tue, 18 Oct 2011 16:47:44 +0200
>
> > On Tuesday 18 October 2011 at 15:00 +0100, Ben Hutchings wrote:
> >
> >> AIUI, the reason for limits on delays is not that it's bad practice to
> >> spin for so long, but that the delay calculations may overflow or
> >> otherwise become inaccurate.
> >
> > OK, I can understand that, then a more appropriate patch would be :
>
> I think doing the udelay/ndelay thing is the way to go for 'net' and
> -stable. We can do something sophisticated with ktime et al. in
> 'net-next'.
>
Well, I am not sure a patch is needed for net, since there is no bug,
but maybe small inaccuracies ? Correct me if I misunderstood Daniel !
> Eric, could you please formally submit this patch with proper
> changelog etc.?
Sure !
[PATCH net-next] pktgen: remove ndelay() call
Daniel Turull reported inaccuracies in pktgen when using low packet
rates, because we call ndelay(val) with values bigger than 20000.
For delays < 100us, rather than calling ndelay(), we can simply loop
calling ktime_now() until the deadline is reached.
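The idea of polling the clock until a deadline, rather than delaying for a precomputed duration, can be sketched in userspace C. This is an illustration only: `now_ns()` and `spin_until_ns()` are hypothetical helpers, with `clock_gettime(CLOCK_MONOTONIC, ...)` standing in for ktime_now().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Current monotonic time in nanoseconds (stand-in for ktime_now()). */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Busy-wait until the absolute deadline is reached and return the time
 * at which the loop ended. The clock is re-read on every iteration, so
 * there is no delay value to compute and none that could overflow.
 */
static int64_t spin_until_ns(int64_t deadline_ns)
{
	int64_t end;

	do {
		end = now_ns();
	} while (end < deadline_ns);

	return end;
}
```

Because the loop returns the last timestamp it read, the caller gets its end time for idle accounting without an extra clock read.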
Reported-by: Daniel Turull <daniel.turull@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
net/core/pktgen.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 6bbf008..0001c24 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -2145,9 +2145,12 @@ static void spin(struct pktgen_dev *pkt_dev, ktime_t spin_until)
}
start_time = ktime_now();
- if (remaining < 100000)
- ndelay(remaining); /* really small just spin */
- else {
+ if (remaining < 100000) {
+ /* for small delays (<100us), just loop until limit is reached */
+ do {
+ end_time = ktime_now();
+ } while (ktime_lt(end_time, spin_until));
+ } else {
/* see do_nanosleep */
hrtimer_init_sleeper(&t, current);
do {
@@ -2162,8 +2165,8 @@ static void spin(struct pktgen_dev *pkt_dev, ktime_t spin_until)
hrtimer_cancel(&t.timer);
} while (t.task && pkt_dev->running && !signal_pending(current));
__set_current_state(TASK_RUNNING);
+ end_time = ktime_now();
}
- end_time = ktime_now();
pkt_dev->idle_acc += ktime_to_ns(ktime_sub(end_time, start_time));
pkt_dev->next_tx = ktime_add_ns(spin_until, pkt_dev->delay);
* Re: [PATCH v2 net-next] tcp: use TCP_DEFAULT_INIT_RCVWND in tcp_fixup_rcvbuf()
From: David Miller @ 2011-10-20 20:54 UTC
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1319143281.2854.25.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 20 Oct 2011 22:41:21 +0200
> Since commit 356f039822b (TCP: increase default initial receive
> window.), we allow sender to send 10 (TCP_DEFAULT_INIT_RCVWND) segments.
>
> Change tcp_fixup_rcvbuf() to reflect this change, even if no real change
> is expected, since sysctl_tcp_rmem[1] = 87380 and this value
> is bigger than tcp_fixup_rcvbuf() computed rcvmem (~23720)
>
> Note: Since commit 356f039822b limited default window to maximum of
> 10*1460 and 2*MSS, we use same heuristic in this patch.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied, thanks a lot Eric.
* RE: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering support for passthru mode
From: Rose, Gregory V @ 2011-10-20 20:47 UTC
To: Rose, Gregory V, Roopa Prabhu, netdev@vger.kernel.org
Cc: sri@us.ibm.com, dragos.tatulea@gmail.com, arnd@arndb.de,
kvm@vger.kernel.org, mst@redhat.com, davem@davemloft.net,
mchan@broadcom.com, dwang2@cisco.com, shemminger@vyatta.com,
eric.dumazet@gmail.com, kaber@trash.net, benve@cisco.com
In-Reply-To: <43F901BD926A4E43B106BF17856F075501A19FF1D8@orsmsx508.amr.corp.intel.com>
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Rose, Gregory V
> Sent: Thursday, October 20, 2011 1:44 PM
> To: Roopa Prabhu; netdev@vger.kernel.org
> Cc: sri@us.ibm.com; dragos.tatulea@gmail.com; arnd@arndb.de;
> kvm@vger.kernel.org; mst@redhat.com; davem@davemloft.net;
> mchan@broadcom.com; dwang2@cisco.com; shemminger@vyatta.com;
> eric.dumazet@gmail.com; kaber@trash.net; benve@cisco.com
> Subject: RE: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address
> filtering support for passthru mode
>
> > -----Original Message-----
> > From: Roopa Prabhu [mailto:roprabhu@cisco.com]
> > Sent: Wednesday, October 19, 2011 3:30 PM
> > To: Rose, Gregory V; netdev@vger.kernel.org
> > Cc: sri@us.ibm.com; dragos.tatulea@gmail.com; arnd@arndb.de;
> > kvm@vger.kernel.org; mst@redhat.com; davem@davemloft.net;
> > mchan@broadcom.com; dwang2@cisco.com; shemminger@vyatta.com;
> > eric.dumazet@gmail.com; kaber@trash.net; benve@cisco.com
> > Subject: Re: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address
> > filtering support for passthru mode
> >
> >
> >
> >
> > On 10/19/11 2:06 PM, "Rose, Gregory V" <gregory.v.rose@intel.com> wrote:
> >
> > >> -----Original Message-----
> > >> From: netdev-owner@vger.kernel.org [mailto:netdev-
> > owner@vger.kernel.org]
> > >> On Behalf Of Roopa Prabhu
> > >> Sent: Tuesday, October 18, 2011 11:26 PM
> > >> To: netdev@vger.kernel.org
> > >> Cc: sri@us.ibm.com; dragos.tatulea@gmail.com; arnd@arndb.de;
> > >> kvm@vger.kernel.org; mst@redhat.com; davem@davemloft.net;
> > >> mchan@broadcom.com; dwang2@cisco.com; shemminger@vyatta.com;
> > >> eric.dumazet@gmail.com; kaber@trash.net; benve@cisco.com
> > >> Subject: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address
> filtering
> > >> support for passthru mode
> > >>
> > >
> > > [snip...]
> > >
> > >>
> > >>
> > >> Note: The choice of rtnl_link_ops was because I saw the use case for
> > >> this in virtual devices that need to do filtering in sw like macvlan
> > >> and tun. Hw devices usually have filtering in hw with netdev->uc and
> > >> mc lists to indicate active filters. But I can move from
> rtnl_link_ops
> > >> to netdev_ops if that is the preferred way to go and if there is a
> > >> need to support this interface on all kinds of interfaces.
> > >> Please suggest.
> > >
> > > I'm still digesting the rest of the RFC patches but I did want to
> > quickly jump
> > > in and push for adding this support in netdev_ops. I would like to
> see
> > these
> > > features available in more devices than just macvtap and macvlan. I
> can
> > > conceive
> > > of use cases for multiple HW MAC and VLAN filters for a VF device that
> > isn't
> > > owned by a macvlan/macvtap interface and only has netdev_ops support.
> > In this
> > > case it would be necessary to program the filters directly to the VF
> > device
> > > interface or PF interface (or lowerdev as you refer to it) instead of
> > going
> > > through macvlan/macvtap.
> > >
> > > This work dovetails nicely with some work I've been doing and I'd be
> > very
> > > interested
> > > in helping move this forward if we could work out the details that
> would
> > allow
> > > support
> > > of the features we (and the community) require.
> >
> > Great. Thanks. I will definitely be interested to get this patch working
> > for
> > any other use case you have.
> >
> > Moving the ops to netdev should be trivial. You probably want the ops to
> > work on the VF via the PF, like the existing ndo_set_vf_mac etc.
>
> That is correct, so we would need to add some way to pass the VF number to
> the op.
> In addition, there are use cases for multiple MAC address filters for the
> Physical
> Function (PF) so we would like to be able to identify to the netdev op
> that it is
> supposed to perform the action on the PF filters instead of a VF.
>
> An example of this would be when an administrator has created some number of VFs
> for a given PF but is also running the PF in bridged (i.e. promiscuous) mode so
> that it can support purely SW emulated network connections in some VMs that have
> low network latency and bandwidth requirements while reserving the VFs for VMs that
  ^^^
That should be "no", not "low"...
- Greg
* Re: [PATCH net-next] myri10ge: fix truesize underestimation
From: Andrew Gallatin @ 2011-10-20 20:45 UTC
To: Eric Dumazet; +Cc: Jon Mason, David Miller, netdev
In-Reply-To: <1319143442.2854.26.camel@edumazet-laptop>
On 10/20/11 16:44, Eric Dumazet wrote:
> On Thursday 20 October 2011 at 15:33 -0500, Jon Mason wrote:
>> On Thu, Oct 20, 2011 at 3:10 PM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
>>> skb->truesize must account for allocated memory, not the used part of
>>> it. Doing this work is important to avoid unexpected OOM situations.
>>>
>>> Signed-off-by: Eric Dumazet<eric.dumazet@gmail.com>
>>
>> Acked-by: Jon Mason<mason@myri.com>
>
> Thanks for reviewing Jon !
>
>
Please wait a second.. I think the patch is incorrect.
There is already code in myri10ge_rx_skb_build() which
attempts to set the truesize. However, it sets it to
the used, rather than the allocated size so it is apparently
incorrect.
I'd prefer we fix that code.
Thanks,
Drew
* Re: [PATCH net-next] myri10ge: fix truesize underestimation
From: Eric Dumazet @ 2011-10-20 20:44 UTC
To: Jon Mason; +Cc: David Miller, netdev, Andrew Gallatin
In-Reply-To: <CAMaF-rN8K3hDiwwqh_eGQ0nrxskn+7r9Rn_yDJ46aesKR77nbg@mail.gmail.com>
On Thursday 20 October 2011 at 15:33 -0500, Jon Mason wrote:
> On Thu, Oct 20, 2011 at 3:10 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > skb->truesize must account for allocated memory, not the used part of
> > it. Doing this work is important to avoid unexpected OOM situations.
> >
> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>
> Acked-by: Jon Mason <mason@myri.com>
Thanks for reviewing Jon !
* RE: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering support for passthru mode
From: Rose, Gregory V @ 2011-10-20 20:43 UTC
To: Roopa Prabhu, netdev@vger.kernel.org
Cc: sri@us.ibm.com, dragos.tatulea@gmail.com, arnd@arndb.de,
kvm@vger.kernel.org, mst@redhat.com, davem@davemloft.net,
mchan@broadcom.com, dwang2@cisco.com, shemminger@vyatta.com,
eric.dumazet@gmail.com, kaber@trash.net, benve@cisco.com
In-Reply-To: <CAC49D8A.374FB%roprabhu@cisco.com>
> -----Original Message-----
> From: Roopa Prabhu [mailto:roprabhu@cisco.com]
> Sent: Wednesday, October 19, 2011 3:30 PM
> To: Rose, Gregory V; netdev@vger.kernel.org
> Cc: sri@us.ibm.com; dragos.tatulea@gmail.com; arnd@arndb.de;
> kvm@vger.kernel.org; mst@redhat.com; davem@davemloft.net;
> mchan@broadcom.com; dwang2@cisco.com; shemminger@vyatta.com;
> eric.dumazet@gmail.com; kaber@trash.net; benve@cisco.com
> Subject: Re: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address
> filtering support for passthru mode
>
>
>
>
> On 10/19/11 2:06 PM, "Rose, Gregory V" <gregory.v.rose@intel.com> wrote:
>
> >> -----Original Message-----
> >> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org]
> >> On Behalf Of Roopa Prabhu
> >> Sent: Tuesday, October 18, 2011 11:26 PM
> >> To: netdev@vger.kernel.org
> >> Cc: sri@us.ibm.com; dragos.tatulea@gmail.com; arnd@arndb.de;
> >> kvm@vger.kernel.org; mst@redhat.com; davem@davemloft.net;
> >> mchan@broadcom.com; dwang2@cisco.com; shemminger@vyatta.com;
> >> eric.dumazet@gmail.com; kaber@trash.net; benve@cisco.com
> >> Subject: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering
> >> support for passthru mode
> >>
> >
> > [snip...]
> >
> >>
> >>
> >> Note: The choice of rtnl_link_ops was because I saw the use case for
> >> this in virtual devices that need to do filtering in sw like macvlan
> >> and tun. Hw devices usually have filtering in hw with netdev->uc and
> >> mc lists to indicate active filters. But I can move from rtnl_link_ops
> >> to netdev_ops if that is the preferred way to go and if there is a
> >> need to support this interface on all kinds of interfaces.
> >> Please suggest.
> >
> > I'm still digesting the rest of the RFC patches but I did want to
> quickly jump
> > in and push for adding this support in netdev_ops. I would like to see
> these
> > features available in more devices than just macvtap and macvlan. I can
> > conceive
> > of use cases for multiple HW MAC and VLAN filters for a VF device that
> isn't
> > owned by a macvlan/macvtap interface and only has netdev_ops support.
> In this
> > case it would be necessary to program the filters directly to the VF
> device
> > interface or PF interface (or lowerdev as you refer to it) instead of
> going
> > through macvlan/macvtap.
> >
> > This work dovetails nicely with some work I've been doing and I'd be
> very
> > interested
> > in helping move this forward if we could work out the details that would
> allow
> > support
> > of the features we (and the community) require.
>
> Great. Thanks. I will definitely be interested to get this patch working
> for
> any other use case you have.
>
> Moving the ops to netdev should be trivial. You probably want the ops to
> work on the VF via the PF, like the existing ndo_set_vf_mac etc.
That is correct, so we would need to add some way to pass the VF number to the op.
In addition, there are use cases for multiple MAC address filters for the Physical
Function (PF) so we would like to be able to identify to the netdev op that it is
supposed to perform the action on the PF filters instead of a VF.
An example of this would be when an administrator has created some number of VFs
for a given PF but is also running the PF in bridged (i.e. promiscuous) mode so that it
can support purely SW emulated network connections in some VMs that have low network
latency and bandwidth requirements while reserving the VFs for VMs that require the
low latency, high throughput that directly assigned VFs can provide. In this case an
emulated SW interface in a VM is unable to properly communicate with VFs on the same
PF because the emulated SW interface's MAC address isn't programmed into the HW filters
on the PF. If we could use this op to program the MAC address and VLAN filters of
the emulated SW interfaces into the PF HW a VF could then properly communicate across
the NIC's internal VEB to the emulated SW interfaces.
> Yes, let's work out the details and I can move this to netdev->ops. Let me
> know.
I think essentially if you could add some parameter to the ops to specify whether it
is addressing a VF or the PF and then if it is a VF further specify the VF number we
would be very close to addressing the requirements of many valuable use cases in
addition to the ones you have identified in your RFC.
Does that sound reasonable?
Thanks,
- Greg
* [PATCH v2 net-next] tcp: use TCP_DEFAULT_INIT_RCVWND in tcp_fixup_rcvbuf()
From: Eric Dumazet @ 2011-10-20 20:41 UTC
To: David Miller; +Cc: netdev
In-Reply-To: <20111020.161356.2120784879469409197.davem@davemloft.net>
Since commit 356f039822b (TCP: increase default initial receive
window.), we allow sender to send 10 (TCP_DEFAULT_INIT_RCVWND) segments.
Change tcp_fixup_rcvbuf() to reflect this change, even if no real change
is expected, since sysctl_tcp_rmem[1] = 87380 and this value
is bigger than tcp_fixup_rcvbuf() computed rcvmem (~23720)
Note: Since commit 356f039822b limited default window to maximum of
10*1460 and 2*MSS, we use same heuristic in this patch.
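The segment-count heuristic described above can be checked in isolation with a small userspace sketch. `INIT_RCVWND_SEGMENTS` and `init_rcvwnd_segments()` are hypothetical names; the constant assumes the value of 10 for TCP_DEFAULT_INIT_RCVWND stated in the text. The subsequent per-segment memory sizing is kernel-specific and deliberately left out.

```c
#include <stdint.h>

/* Assumed value of TCP_DEFAULT_INIT_RCVWND, per the changelog text. */
#define INIT_RCVWND_SEGMENTS 10u

/*
 * Number of segments the receive buffer is sized for:
 * 10 segments if mss <= 1460, otherwise 14600 / mss segments,
 * with a minimum of two segments.
 */
static uint32_t init_rcvwnd_segments(uint32_t mss)
{
	uint32_t icwnd = INIT_RCVWND_SEGMENTS;

	if (mss > 1460) {
		icwnd = (1460u * INIT_RCVWND_SEGMENTS) / mss;
		if (icwnd < 2)
			icwnd = 2;
	}
	return icwnd;
}
```

An MSS of 1460 or less keeps the full 10 segments, an MSS of 4096 yields 14600 / 4096 = 3 segments, and a jumbo-frame MSS such as 8960 hits the floor of 2.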
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
net/ipv4/tcp_input.c | 23 +++++++++++++++--------
1 file changed, 15 insertions(+), 8 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1e848b2..e8e6d49 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -345,17 +345,24 @@ static void tcp_grow_window(struct sock *sk, struct sk_buff *skb)
static void tcp_fixup_rcvbuf(struct sock *sk)
{
- struct tcp_sock *tp = tcp_sk(sk);
- int rcvmem = SKB_TRUESIZE(tp->advmss + MAX_TCP_HEADER);
+ u32 mss = tcp_sk(sk)->advmss;
+ u32 icwnd = TCP_DEFAULT_INIT_RCVWND;
+ int rcvmem;
- /* Try to select rcvbuf so that 4 mss-sized segments
- * will fit to window and corresponding skbs will fit to our rcvbuf.
- * (was 3; 4 is minimum to allow fast retransmit to work.)
+ /* Limit to 10 segments if mss <= 1460,
+ * or 14600/mss segments, with a minimum of two segments.
*/
- while (tcp_win_from_space(rcvmem) < tp->advmss)
+ if (mss > 1460)
+ icwnd = max_t(u32, (1460 * TCP_DEFAULT_INIT_RCVWND) / mss, 2);
+
+ rcvmem = SKB_TRUESIZE(mss + MAX_TCP_HEADER);
+ while (tcp_win_from_space(rcvmem) < mss)
rcvmem += 128;
- if (sk->sk_rcvbuf < 4 * rcvmem)
- sk->sk_rcvbuf = min(4 * rcvmem, sysctl_tcp_rmem[2]);
+
+ rcvmem *= icwnd;
+
+ if (sk->sk_rcvbuf < rcvmem)
+ sk->sk_rcvbuf = min(rcvmem, sysctl_tcp_rmem[2]);
}
/* 4. Try to fixup all. It is made immediately after connection enters
* Re: Kernel panic from tg3 net driver
From: Ari Savolainen @ 2011-10-20 20:37 UTC
To: Eric Dumazet; +Cc: David Miller, richardcochran, netdev, linux-kernel
In-Reply-To: <1319141867.2854.19.camel@edumazet-laptop>
That's right. I tried the patch and it didn't help.
Ari
2011/10/20 Eric Dumazet <eric.dumazet@gmail.com>:
> On Thursday 20 October 2011 at 16:11 -0400, David Miller wrote:
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Thu, 20 Oct 2011 22:05:25 +0200
>>
>> > And I think this was fixed yesterday ?
>> >
>> > From: roy.qing.li@gmail.com
>> > To: ari.m.savolainen@gmail.com, netdev@vger.kernel.org
>> > Subject: [PATCH net-next] neigh: fix rcu splat in neigh_update()
>> > Date: Tue, 18 Oct 2011 16:32:42 +0800 (18/10/2011 10:32:42)
>> >
>>
>> Good catch, it seems to be this bug.
>
> Oh well, sorry, it seems it was one bug hit during bisection, but maybe
> it's completely unrelated to the real problem.
>
>
>
>
* Re: [PATCH net-next] myri10ge: fix truesize underestimation
From: Jon Mason @ 2011-10-20 20:33 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev, Andrew Gallatin
In-Reply-To: <1319141403.2854.17.camel@edumazet-laptop>
On Thu, Oct 20, 2011 at 3:10 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> skb->truesize must account for allocated memory, not the used part of
> it. Doing this work is important to avoid unexpected OOM situations.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Jon Mason <mason@myri.com>
> CC: Jon Mason <mason@myri.com>
> ---
> drivers/net/ethernet/myricom/myri10ge/myri10ge.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
> index c970a48..0778edc 100644
> --- a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
> +++ b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
> @@ -1210,7 +1210,6 @@ myri10ge_rx_skb_build(struct sk_buff *skb, u8 * va,
> struct skb_frag_struct *skb_frags;
>
> skb->len = skb->data_len = len;
> - skb->truesize = len + sizeof(struct sk_buff);
> /* attach the page(s) */
>
> skb_frags = skb_shinfo(skb)->frags;
> @@ -1385,6 +1384,8 @@ myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum,
> if (skb_frag_size(&skb_shinfo(skb)->frags[0]) <= 0) {
> skb_frag_unref(skb, 0);
> skb_shinfo(skb)->nr_frags = 0;
> + } else {
> + skb->truesize += bytes * skb_shinfo(skb)->nr_frags;
> }
> skb->protocol = eth_type_trans(skb, dev);
> skb_record_rx_queue(skb, ss - &mgp->ss[0]);
>
>
>
^ permalink raw reply
* Re: [patch net-next]alx: Atheros AR8131/AR8151/AR8152/AR8161 Ethernet driver
From: Luis R. Rodriguez @ 2011-10-20 20:33 UTC (permalink / raw)
To: Ren, Cloud
Cc: David Miller, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <6349D7A510622448B1BA0967850A8438011CC2A0@nasanexd02d.na.qualcomm.com>
On Thu, Oct 20, 2011 at 2:48 AM, Ren, Cloud <cjren@qca.qualcomm.com> wrote:
>
>>From: "Ren, Cloud" <cjren@qca.qualcomm.com>
>>Date: Thu, 20 Oct 2011 09:23:07 +0000
>>
>>> As you saw, should I do the two following steps?
>>> 1. I firstly try to submit code to linux-staging.git.
>>> 2. After the driver has been accepted by linux-staging.git, I submit to
>>> net-next.git again.
>>
>>You submit and get it into staging so that it can sit there for some time and get
>>reviewed and improved by others.
>>
>>One doesn't submit directly to net-next right after it gets into staging; staging
>>is a place where your driver lives while it is still smelly and funky and needs more
>>work.
>
> The driver will support the next generation of Atheros NICs. Meanwhile, the driver is
> also better optimized for AR8131 and AR8151 than atl1c. For some reason, we
> don't plan to patch the atl1c driver to support our new NICs, such as AR8161. So I hope the driver
> can stay in net-next in the end. Of course, I will be responsible for modifying the source code
> so that it meets kernel requirements.
Cloud,
If you want to skip staging (which I recommend) then you need to
address all upstream concerns expressed. Given that you indicate that
you will keep working on the driver until it's acceptable upstream,
my recommendation is either to clean up the driver very well and
review it internally at Atheros prior to a public submission *or*
just dump it into staging, get the benefit of community cleanup, and
wait until it is ready for proper upstream. If you want internal
private review at Atheros you can use the internal private
ath9k-devel list.
Also, are you going to maintain the older atlx drivers? While you're at it,
can you clear up who maintains what as far as Atheros is concerned for
Ethernet?
Luis
^ permalink raw reply
* Re: [patch] pktgen: bug when calling ndelay in x86 architectures
From: David Miller @ 2011-10-20 20:24 UTC (permalink / raw)
To: eric.dumazet
Cc: bhutchings, daniel.turull, netdev, robert, voravit, jens.laas
In-Reply-To: <1318949264.2657.97.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 18 Oct 2011 16:47:44 +0200
> On Tuesday, 18 October 2011 at 15:00 +0100, Ben Hutchings wrote:
>
>> AIUI, the reason for limits on delays is not that it's bad practice to
>> spin for so long, but that the delay calculations may overflow or
>> otherwise become inaccurate.
>
> OK, I can understand that, then a more appropriate patch would be :
I think doing the udelay/ndelay thing is the way to go for 'net' and
-stable. We can do something sophisticated with ktime et al. in
'net-next'.
Eric, could you please formally submit this patch with proper
changelog etc.?
Thanks.
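
Eric's patch is not shown in this message, so the following is only one
possible reading of "the udelay/ndelay thing": split a long delay into a
microsecond part and a sub-microsecond remainder, so the value handed to a
nanosecond-granularity delay always stays small. The struct and function
names are hypothetical and are not taken from pktgen.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from pktgen: split a long delay so the
 * nanosecond-granularity part always stays below one microsecond. */
struct split_delay {
	uint64_t usecs;	/* portion suited to a microsecond-granularity delay */
	uint32_t nsecs;	/* remainder, always < 1000 */
};

static struct split_delay split_delay_ns(uint64_t total_ns)
{
	struct split_delay d;

	d.usecs = total_ns / 1000u;
	d.nsecs = (uint32_t)(total_ns % 1000u);
	assert(d.nsecs < 1000u);	/* holds by construction of the modulo */
	return d;
}
```

In kernel code the two parts would presumably be handed to udelay() and
ndelay() respectively; whether that matches the patch under discussion is
an assumption.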
^ permalink raw reply
* Re: PROBLEM: System call 'sendmsg' of process ospfd (quagga) causes kernel oops
From: David Miller @ 2011-10-20 20:21 UTC (permalink / raw)
To: herbert; +Cc: eric.dumazet, evonlanthen, linux-kernel, netdev, timo.teras
In-Reply-To: <20111020093541.GA3024@gondor.apana.org.au>
From: Herbert Xu <herbert@gondor.hengli.com.au>
Date: Thu, 20 Oct 2011 11:35:41 +0200
> On Thu, Oct 20, 2011 at 05:30:50AM -0400, David Miller wrote:
>>
>> So I'm a little confused what your suggestion for rc10 really
>> is :-)
>
> I meant his first initial patch :)
>
> While it is suboptimal in the sense that should the value of
> needed_headroom increase we'll end up constantly reallocating
> skbs, I believe that it is at least semantically correct.
Ok, I applied Eric's patch which removes the dynamic changing of the
needed_headroom in IP_GRE.
Thanks everyone!
^ permalink raw reply
* Re: [PATCH] route: fix ICMP redirect validation
From: David Miller @ 2011-10-20 20:19 UTC (permalink / raw)
To: fbl; +Cc: netdev
In-Reply-To: <20111020154702.13f69021@asterix.rh>
From: Flavio Leitner <fbl@redhat.com>
Date: Thu, 20 Oct 2011 15:47:02 -0200
> I was reviewing this again and instead of doing the above, it would
> be better to use rt_bind_peer() to update rt->peer as well.
>
> if (!rt->peer)
> rt_bind_peer(rt, rt->rt_dst, 1);
>
> peer = rt->peer;
> if (peer) {
> peer->redirect_learned.a4 = new_gw;
> atomic_inc(&__rt_peer_genid);
> }
>
>
> but I am not sure if I understood you completely when you say
> to do such that only an inetpeer cache probe is necessary.
If you have the route entry available already and you're doing the
inetpeer lookup anyways, you might as well use rt_bind_peer() since
all of the expensive work has to be done anyways.
So yes, using rt_bind_peer() would be the best thing to do here.
^ permalink raw reply
* Re: Kernel panic from tg3 net driver
From: Eric Dumazet @ 2011-10-20 20:17 UTC (permalink / raw)
To: David Miller; +Cc: ari.m.savolainen, richardcochran, netdev, linux-kernel
In-Reply-To: <20111020.161147.33259825921677777.davem@davemloft.net>
On Thursday, 20 October 2011 at 16:11 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 20 Oct 2011 22:05:25 +0200
>
> > And I think this was fixed yesterday ?
> >
> > From: roy.qing.li@gmail.com
> > To: ari.m.savolainen@gmail.com, netdev@vger.kernel.org
> > Subject: [PATCH net-next] neigh: fix rcu splat in neigh_update()
> > Date: Tue, 18 Oct 2011 16:32:42 +0800 (18/10/2011 10:32:42)
> >
>
> Good catch, it seems to be this bug.
Oh well, sorry, it seems it was one bug hit during bisection, but maybe
it's completely unrelated to the real problem.
^ permalink raw reply
* Re: [PATCH] dev: use name hash for dev_seq_ops
From: David Miller @ 2011-10-20 20:17 UTC (permalink / raw)
To: mihai.maruseac
Cc: shemminger, eric.dumazet, mirq-linux, therbert, jpirko, netdev,
linux-kernel, dbaluta, mmaruseac
In-Reply-To: <1319097717-14910-1-git-send-email-mmaruseac@ixiacom.com>
From: Mihai Maruseac <mihai.maruseac@gmail.com>
Date: Thu, 20 Oct 2011 11:01:57 +0300
> Instead of using the dev->next chain and trying to resync at each call to
> dev_seq_start, use the name hash, keeping the bucket and the offset in
> seq->private field.
I'm totally fine with this patch from a technical perspective, but I'd
like one small thing tidied up before I apply this.
> + unsigned int pos; /* bucket << 24 + offset */
Please don't hard-code this constant in the comment; if we ever
change NETDEV_HASHBITS, this comment will be inaccurate.
I'd suggest putting the BUCKET_SPACE define before the dev_iter_state
definition, and using BUCKET_SPACE in the comment instead of 24.
Thanks.
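
A minimal sketch of the packing being discussed, with the shift expressed
through a named constant instead of a literal 24. HASHBITS here is a
hypothetical stand-in for NETDEV_HASHBITS, and the way BUCKET_SPACE is
derived from it is an assumption made for illustration; the patch itself
may define it differently.

```c
#include <assert.h>
#include <stdint.h>

#define HASHBITS	8			/* hypothetical stand-in for NETDEV_HASHBITS */
#define BUCKET_SPACE	(32 - HASHBITS)		/* bits left for the offset (assumed layout) */

/* pos = (bucket << BUCKET_SPACE) | offset */
static uint32_t pack_pos(uint32_t bucket, uint32_t offset)
{
	assert(bucket < (1u << HASHBITS));
	assert(offset < (1u << BUCKET_SPACE));
	return (bucket << BUCKET_SPACE) | offset;
}

static uint32_t pos_bucket(uint32_t pos)
{
	return pos >> BUCKET_SPACE;
}

static uint32_t pos_offset(uint32_t pos)
{
	return pos & ((1u << BUCKET_SPACE) - 1u);
}
```

Note that, read as C, "bucket << 24 + offset" would shift by (24 + offset)
because + binds tighter than <<, so a comment written as
"(bucket << BUCKET_SPACE) + offset" is less ambiguous.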
^ permalink raw reply
* Re: [PATCH net-next] tcp: use TCP_DEFAULT_INIT_RCVWND in tcp_fixup_rcvbuf()
From: David Miller @ 2011-10-20 20:13 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1319140954.2854.12.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 20 Oct 2011 22:02:34 +0200
> On Thursday, 20 October 2011 at 15:50 -0400, David Miller wrote:
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Thu, 20 Oct 2011 21:16:26 +0200
>>
>> > Since commit 356f039822b (TCP: increase default initial receive
>> > window.), we allow sender to send 10 (TCP_DEFAULT_INIT_RCVWND) segments.
>> >
>> > Change tcp_fixup_rcvbuf() to reflect this change, even if no real change
>> > is expected, since sysctl_tcp_rmem[1] = 87380 and this value
>> > is bigger than tcp_fixup_rcvbuf() computed rcvmem (~23720)
>> >
>> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>> ...
>> > + unsigned int mss = min_t(unsigned int, tp->advmss, 1460);
>>
>> I don't understand where this calculation comes from, and even if it
>> should be obvious it isn't to me and deserves a mention in the commit
>> message at a minimum.
>
> This is the calculation done in commit 356f039822b as well.
>
> The window is 10*MSS, but no more than 14600
>
> On loopback, this matters, because we could end with rcvmem=219680
Thanks, please help weak brains like mine by adding this to the commit message.
:-)
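
The rule Eric describes (10 segments, but no more than 14600 bytes in
total, and never fewer than two segments) can be sketched in user-space C
as below. DEFAULT_INIT_RCVWND mirrors the value 10 given for
TCP_DEFAULT_INIT_RCVWND in this thread; the function name is hypothetical.

```c
#include <stdint.h>

#define DEFAULT_INIT_RCVWND 10u	/* value of TCP_DEFAULT_INIT_RCVWND per this thread */

/* Number of mss-sized segments the initial receive window should hold. */
static uint32_t init_rcvwnd_segments(uint32_t mss)
{
	uint32_t icwnd = DEFAULT_INIT_RCVWND;

	if (mss > 1460) {
		/* Cap the window at 14600 bytes, but keep at least two segments. */
		icwnd = (1460u * DEFAULT_INIT_RCVWND) / mss;
		if (icwnd < 2)
			icwnd = 2;
	}
	return icwnd;
}
```

For an Ethernet-sized mss of 1460 this gives 10 segments; for a very large
mss, such as one seen on loopback, it falls back to the two-segment
minimum instead of scaling the buffer by 10 large segments.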
^ permalink raw reply
* Re: Kernel panic from tg3 net driver
From: David Miller @ 2011-10-20 20:11 UTC (permalink / raw)
To: eric.dumazet; +Cc: ari.m.savolainen, richardcochran, netdev, linux-kernel
In-Reply-To: <1319141125.2854.14.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 20 Oct 2011 22:05:25 +0200
> And I think this was fixed yesterday ?
>
> From: roy.qing.li@gmail.com
> To: ari.m.savolainen@gmail.com, netdev@vger.kernel.org
> Subject: [PATCH net-next] neigh: fix rcu splat in neigh_update()
> Date: Tue, 18 Oct 2011 16:32:42 +0800 (18/10/2011 10:32:42)
>
Good catch, it seems to be this bug.
^ permalink raw reply
* [PATCH net-next] myri10ge: fix truesize underestimation
From: Eric Dumazet @ 2011-10-20 20:10 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Jon Mason
skb->truesize must account for allocated memory, not the used part of
it. Doing this work is important to avoid unexpected OOM situations.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jon Mason <mason@myri.com>
---
drivers/net/ethernet/myricom/myri10ge/myri10ge.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
index c970a48..0778edc 100644
--- a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
+++ b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
@@ -1210,7 +1210,6 @@ myri10ge_rx_skb_build(struct sk_buff *skb, u8 * va,
struct skb_frag_struct *skb_frags;
skb->len = skb->data_len = len;
- skb->truesize = len + sizeof(struct sk_buff);
/* attach the page(s) */
skb_frags = skb_shinfo(skb)->frags;
@@ -1385,6 +1384,8 @@ myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum,
if (skb_frag_size(&skb_shinfo(skb)->frags[0]) <= 0) {
skb_frag_unref(skb, 0);
skb_shinfo(skb)->nr_frags = 0;
+ } else {
+ skb->truesize += bytes * skb_shinfo(skb)->nr_frags;
}
skb->protocol = eth_type_trans(skb, dev);
skb_record_rx_queue(skb, ss - &mgp->ss[0]);
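
To illustrate the difference the patch is after, here is a small
user-space sketch comparing a "bytes used" estimate with an "bytes
allocated" estimate. The helper names and the idea of passing the header
cost as a plain number are assumptions for illustration only; they are not
kernel APIs.

```c
#include <stddef.h>

/* Old-style estimate: only the bytes the packet actually uses, plus a
 * fixed header cost. */
static size_t truesize_used(size_t hdr_size, size_t len)
{
	return hdr_size + len;
}

/* Allocation-based estimate: every attached fragment is charged its full
 * allocated size, however little of it the packet occupies. */
static size_t truesize_allocated(size_t base_truesize,
				 size_t frag_alloc_bytes,
				 unsigned int nr_frags)
{
	return base_truesize + frag_alloc_bytes * nr_frags;
}
```

With a 100-byte packet sitting in one 4096-byte fragment, the first
estimate charges 100 bytes of payload while the second charges 4096, which
is the memory really pinned by the skb.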
^ permalink raw reply related
* Re: Kernel panic from tg3 net driver
From: Eric Dumazet @ 2011-10-20 20:05 UTC (permalink / raw)
To: David Miller; +Cc: ari.m.savolainen, richardcochran, netdev, linux-kernel
In-Reply-To: <20111020.155659.486754557434415381.davem@davemloft.net>
On Thursday, 20 October 2011 at 15:56 -0400, David Miller wrote:
> From: Ari Savolainen <ari.m.savolainen@gmail.com>
> Date: Thu, 20 Oct 2011 22:30:44 +0300
>
> > I finally got time to continue bisecting. The commit that causes the
> > kernel panic is: 2669069aacc9 "tg3: enable transmit time stamping."
>
> I thought initially that the issue might be that we have to do the
> skb_tx_timestamp() call before we advance the mailbox transmit
> descriptor pointer.
>
> But that shouldn't matter, we run with a lock held, and TX reclaim takes
> that same lock.
>
> So I'm sort of stumped at the moment.
But it's not a panic, it's an RCU splat?
> [ 105.612129] [<ffffffff810ccdcb>] lockdep_rcu_dereference+0xbb/0xc0
> [ 105.612132] [<ffffffff815dc5a9>] neigh_update+0x4f9/0x5f0
> [ 105.612135] [<ffffffff815da001>] ? neigh_lookup+0xe1/0x220
> [ 105.612139] [<ffffffff81639298>] arp_req_set+0xb8/0x230
> [ 105.612142] [<ffffffff8163a59f>] arp_ioctl+0x1bf/0x310
> [ 105.612146] [<ffffffff810baa40>] ? lock_hrtimer_base.isra.26+0x30/0x60
> [ 105.612150] [<ffffffff8163fb75>] inet_ioctl+0x85/0x90
> [ 105.612154] [<ffffffff815b5520>] sock_do_ioctl+0x30/0x70
> [ 105.612157] [<ffffffff815b55d3>] sock_ioctl+0x73/0x280
> [ 105.612162] [<ffffffff811b7698>] do_vfs_ioctl+0x98/0x570
> [ 105.612165] [<ffffffff811a5c40>] ? fget_light+0x340/0x3a0
> [ 105.612168] [<ffffffff811b7bbf>] sys_ioctl+0x4f/0x80
> [ 105.612172] [<ffffffff816fdcab>] system_call_fastpath+0x16/0x1b
And I think this was fixed yesterday ?
From: roy.qing.li@gmail.com
To: ari.m.savolainen@gmail.com, netdev@vger.kernel.org
Subject: [PATCH net-next] neigh: fix rcu splat in neigh_update()
Date: Tue, 18 Oct 2011 16:32:42 +0800 (18/10/2011 10:32:42)
^ permalink raw reply