* Re: [PATCH] ip: reuse ip_summed of first fragment for all subsequent fragments
From: Timo Juhani Lindfors @ 2011-02-16 9:10 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <20110112.184220.250810179.davem@davemloft.net>
David Miller <davem@davemloft.net> writes:
> You're now not handling the code block above this one, guarded
> by the "if (len <= 0)" check.
Yes that's true. Should we use skb_copy_bits() when sk->sk_no_check ==
UDP_CSUM_NOXMIT?
> You seem to just be peppering checks all over the place rather
> than coming up with a coherent, complete, fix for this problem.
I can understand that but I'm afraid that I lack the expertise to do
that. I might be able to fix the above problem but I can't be sure that
it is the only one. The bug report will remain at
https://bugzilla.kernel.org/show_bug.cgi?id=24832
in case somebody wants to continue from here.
^ permalink raw reply
* Re: [PATCH 1/1] tproxy: do not assign timewait sockets to skb->sk
From: KOVACS Krisztian @ 2011-02-16 8:54 UTC (permalink / raw)
To: Patrick McHardy
Cc: Florian Westphal, netfilter-devel, netdev, Balazs Scheidler
In-Reply-To: <4D594F9E.2090100@trash.net>
Hi,
On 02/14/2011 04:51 PM, Patrick McHardy wrote:
> Am 14.02.2011 12:44, schrieb Florian Westphal:
>> Assigning a socket in timewait state to skb->sk can trigger
>> kernel oops, e.g. in nfnetlink_log, which does:
>>
>> if (skb->sk) {
>> read_lock_bh(&skb->sk->sk_callback_lock);
>> if (skb->sk->sk_socket&& skb->sk->sk_socket->file) ...
>>
>> in the timewait case, accessing sk->sk_callback_lock and sk->sk_socket
>> is invalid.
>>
>> Either all of these spots will need to add a test for sk->sk_state != TCP_TIME_WAIT,
>> or xt_TPROXY must not assign a timewait socket to skb->sk.
>>
>> This does the latter.
>>
>> If a TW socket is found, assign the tproxy nfmark, but skip the skb->sk assignment,
>> thus mimicking behaviour of a '-m socket .. -j MARK/ACCEPT' re-routing rule.
>>
>> The 'SYN to TW socket' case is left unchanged -- we try to redirect to the
>> listener socket.
>>
>> Cc: Balazs Scheidler<bazsi@balabit.hu>
>> Cc: KOVACS Krisztian<hidden@balabit.hu>
>> Signed-off-by: Florian Westphal<fwestphal@astaro.com>
>
> Looks fine to me. Balazs. Krisztian, any objections?
Seems to be OK, as far as I can see.
Florian, did you make sure the tests still run after applying this patch?
http://git.balabit.hu/?p=bazsi/tproxy-test.git;a=summary
--
KOVACS Krisztian
^ permalink raw reply
* Re: [RFC !!BONUS!! PATCH 6/5] ipv4: Delete routing cache.
From: Eric Dumazet @ 2011-02-16 7:56 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <20110215.185534.71133854.davem@davemloft.net>
Le mardi 15 février 2011 à 18:55 -0800, David Miller a écrit :
> From: David Miller <davem@davemloft.net>
> Date: Wed, 09 Feb 2011 22:39:39 -0800 (PST)
>
> >
> > Signed-off-by: David S. Miller <davem@davemloft.net>
>
> Ok, this patch had one nasty bug:
>
> > + if (!err == 0)
>
> Yeah... right.
>
> I'm actively testing this version at the moment, against net-next-2.6,
> works fine thus far.
>
> --------------------
> ipv4: Delete routing cache.
>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
I suspect we can zap DST_NOCACHE later ?
^ permalink raw reply
* [PATCH v3] sh: sh_eth: Add support ethtool
From: Nobuhiro Iwamatsu @ 2011-02-16 7:17 UTC (permalink / raw)
To: netdev
Cc: linux-sh, bhutchings, eric.dumazet, Nobuhiro Iwamatsu,
Yoshihiro Shimoda
This commit supports following functions.
- get_settings
- set_settings
- nway_reset
- get_msglevel
- set_msglevel
- get_link
- get_strings
- get_ethtool_stats
- get_sset_count
About other function, the device does not support.
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
---
>From v2:
- Remove get function of standard device stats.
- Remove get_drvinfo function.
- Change mdelay 100ms to 1ms in reset fucntion
- Change function name from sh_eth_link* to sh_eth_rcv_snd_*.
Because sh_eth_link* function does not linkup/down.
- Add netif_msg_* function.
drivers/net/sh_eth.c | 208 +++++++++++++++++++++++++++++++++++++++++++++-----
1 files changed, 189 insertions(+), 19 deletions(-)
diff --git a/drivers/net/sh_eth.c b/drivers/net/sh_eth.c
index 819c175..095e525 100644
--- a/drivers/net/sh_eth.c
+++ b/drivers/net/sh_eth.c
@@ -32,10 +32,17 @@
#include <linux/io.h>
#include <linux/pm_runtime.h>
#include <linux/slab.h>
+#include <linux/ethtool.h>
#include <asm/cacheflush.h>
#include "sh_eth.h"
+#define SH_ETH_DEF_MSG_ENABLE \
+ (NETIF_MSG_LINK | \
+ NETIF_MSG_TIMER | \
+ NETIF_MSG_RX_ERR| \
+ NETIF_MSG_TX_ERR)
+
/* There is CPU dependent code */
#if defined(CONFIG_CPU_SUBTYPE_SH7724)
#define SH_ETH_RESET_DEFAULT 1
@@ -817,6 +824,20 @@ static int sh_eth_rx(struct net_device *ndev)
return 0;
}
+static void sh_eth_rcv_snd_disable(u32 ioaddr)
+{
+ /* disable tx and rx */
+ writel(readl(ioaddr + ECMR) &
+ ~(ECMR_RE | ECMR_TE), ioaddr + ECMR);
+}
+
+static void sh_eth_rcv_snd_enable(u32 ioaddr)
+{
+ /* enable tx and rx */
+ writel(readl(ioaddr + ECMR) |
+ (ECMR_RE | ECMR_TE), ioaddr + ECMR);
+}
+
/* error control function */
static void sh_eth_error(struct net_device *ndev, int intr_status)
{
@@ -843,11 +864,9 @@ static void sh_eth_error(struct net_device *ndev, int intr_status)
if (mdp->ether_link_active_low)
link_stat = ~link_stat;
}
- if (!(link_stat & PHY_ST_LINK)) {
- /* Link Down : disable tx and rx */
- writel(readl(ioaddr + ECMR) &
- ~(ECMR_RE | ECMR_TE), ioaddr + ECMR);
- } else {
+ if (!(link_stat & PHY_ST_LINK))
+ sh_eth_rcv_snd_disable(ioaddr);
+ else {
/* Link Up */
writel(readl(ioaddr + EESIPR) &
~DMAC_M_ECI, ioaddr + EESIPR);
@@ -857,8 +876,7 @@ static void sh_eth_error(struct net_device *ndev, int intr_status)
writel(readl(ioaddr + EESIPR) |
DMAC_M_ECI, ioaddr + EESIPR);
/* enable tx and rx */
- writel(readl(ioaddr + ECMR) |
- (ECMR_RE | ECMR_TE), ioaddr + ECMR);
+ sh_eth_rcv_snd_enable(ioaddr);
}
}
}
@@ -867,6 +885,8 @@ static void sh_eth_error(struct net_device *ndev, int intr_status)
/* Write buck end. unused write back interrupt */
if (intr_status & EESR_TABT) /* Transmit Abort int */
mdp->stats.tx_aborted_errors++;
+ if (netif_msg_tx_err(mdp))
+ dev_err(&ndev->dev, "Transmit Abort\n");
}
if (intr_status & EESR_RABT) {
@@ -874,14 +894,23 @@ static void sh_eth_error(struct net_device *ndev, int intr_status)
if (intr_status & EESR_RFRMER) {
/* Receive Frame Overflow int */
mdp->stats.rx_frame_errors++;
- dev_err(&ndev->dev, "Receive Frame Overflow\n");
+ if (netif_msg_rx_err(mdp))
+ dev_err(&ndev->dev, "Receive Abort\n");
}
}
- if (!mdp->cd->no_ade) {
- if (intr_status & EESR_ADE && intr_status & EESR_TDE &&
- intr_status & EESR_TFE)
- mdp->stats.tx_fifo_errors++;
+ if (intr_status & EESR_TDE) {
+ /* Transmit Descriptor Empty int */
+ mdp->stats.tx_fifo_errors++;
+ if (netif_msg_tx_err(mdp))
+ dev_err(&ndev->dev, "Transmit Descriptor Empty\n");
+ }
+
+ if (intr_status & EESR_TFE) {
+ /* FIFO under flow */
+ mdp->stats.tx_fifo_errors++;
+ if (netif_msg_tx_err(mdp))
+ dev_err(&ndev->dev, "Transmit FIFO Under flow\n");
}
if (intr_status & EESR_RDE) {
@@ -890,12 +919,22 @@ static void sh_eth_error(struct net_device *ndev, int intr_status)
if (readl(ioaddr + EDRRR) ^ EDRRR_R)
writel(EDRRR_R, ioaddr + EDRRR);
- dev_err(&ndev->dev, "Receive Descriptor Empty\n");
+ if (netif_msg_rx_err(mdp))
+ dev_err(&ndev->dev, "Receive Descriptor Empty\n");
}
+
if (intr_status & EESR_RFE) {
/* Receive FIFO Overflow int */
mdp->stats.rx_fifo_errors++;
- dev_err(&ndev->dev, "Receive FIFO Overflow\n");
+ if (netif_msg_rx_err(mdp))
+ dev_err(&ndev->dev, "Receive FIFO Overflow\n");
+ }
+
+ if (!mdp->cd->no_ade && (intr_status & EESR_ADE)) {
+ /* Address Error */
+ mdp->stats.tx_fifo_errors++;
+ if (netif_msg_tx_err(mdp))
+ dev_err(&ndev->dev, "Address Error\n");
}
mask = EESR_TWB | EESR_TABT | EESR_ADE | EESR_TDE | EESR_TFE;
@@ -1012,7 +1051,7 @@ static void sh_eth_adjust_link(struct net_device *ndev)
mdp->duplex = -1;
}
- if (new_state)
+ if (new_state && netif_msg_link(mdp))
phy_print_status(phydev);
}
@@ -1063,6 +1102,132 @@ static int sh_eth_phy_start(struct net_device *ndev)
return 0;
}
+static int sh_eth_get_settings(struct net_device *ndev,
+ struct ethtool_cmd *ecmd)
+{
+ struct sh_eth_private *mdp = netdev_priv(ndev);
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(&mdp->lock, flags);
+ ret = phy_ethtool_gset(mdp->phydev, ecmd);
+ spin_unlock_irqrestore(&mdp->lock, flags);
+
+ return ret;
+}
+
+static int sh_eth_set_settings(struct net_device *ndev,
+ struct ethtool_cmd *ecmd)
+{
+ struct sh_eth_private *mdp = netdev_priv(ndev);
+ unsigned long flags;
+ int ret;
+ u32 ioaddr = ndev->base_addr;
+
+ spin_lock_irqsave(&mdp->lock, flags);
+
+ /* disable tx and rx */
+ sh_eth_rcv_snd_disable(ioaddr);
+
+ ret = phy_ethtool_sset(mdp->phydev, ecmd);
+ if (ret)
+ goto error_exit;
+
+ if (ecmd->duplex == DUPLEX_FULL)
+ mdp->duplex = 1;
+ else
+ mdp->duplex = 0;
+
+ if (mdp->cd->set_duplex)
+ mdp->cd->set_duplex(ndev);
+
+error_exit:
+ mdelay(1);
+
+ /* enable tx and rx */
+ sh_eth_rcv_snd_enable(ioaddr);
+
+ spin_unlock_irqrestore(&mdp->lock, flags);
+
+ return ret;
+}
+
+static int sh_eth_nway_reset(struct net_device *ndev)
+{
+ struct sh_eth_private *mdp = netdev_priv(ndev);
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(&mdp->lock, flags);
+ ret = phy_start_aneg(mdp->phydev);
+ spin_unlock_irqrestore(&mdp->lock, flags);
+
+ return ret;
+}
+
+static u32 sh_eth_get_msglevel(struct net_device *ndev)
+{
+ struct sh_eth_private *mdp = netdev_priv(ndev);
+ return mdp->msg_enable;
+}
+
+static void sh_eth_set_msglevel(struct net_device *ndev, u32 value)
+{
+ struct sh_eth_private *mdp = netdev_priv(ndev);
+ mdp->msg_enable = value;
+}
+
+static const char sh_eth_gstrings_stats[][ETH_GSTRING_LEN] = {
+ "rx_current", "tx_current",
+ "rx_dirty", "tx_dirty",
+};
+#define SH_ETH_STATS_LEN ARRAY_SIZE(sh_eth_gstrings_stats)
+
+static int sh_eth_get_sset_count(struct net_device *netdev, int sset)
+{
+ switch (sset) {
+ case ETH_SS_STATS:
+ return SH_ETH_STATS_LEN;
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static void sh_eth_get_ethtool_stats(struct net_device *ndev,
+ struct ethtool_stats *stats, u64 *data)
+{
+ struct sh_eth_private *mdp = netdev_priv(ndev);
+ int i = 0;
+
+ /* device-specific stats */
+ data[i++] = mdp->cur_rx;
+ data[i++] = mdp->cur_tx;
+ data[i++] = mdp->dirty_rx;
+ data[i++] = mdp->dirty_tx;
+}
+
+static void sh_eth_get_strings(struct net_device *ndev, u32 stringset, u8 *data)
+{
+ switch (stringset) {
+ case ETH_SS_STATS:
+ memcpy(data, *sh_eth_gstrings_stats,
+ sizeof(sh_eth_gstrings_stats));
+ break;
+ }
+}
+
+static struct ethtool_ops sh_eth_ethtool_ops = {
+ .get_settings = sh_eth_get_settings,
+ .set_settings = sh_eth_set_settings,
+ .nway_reset = sh_eth_nway_reset,
+ .get_msglevel = sh_eth_get_msglevel,
+ .set_msglevel = sh_eth_set_msglevel,
+ .get_link = ethtool_op_get_link,
+ .get_strings = sh_eth_get_strings,
+ .get_ethtool_stats = sh_eth_get_ethtool_stats,
+ .get_sset_count = sh_eth_get_sset_count,
+};
+
/* network device open function */
static int sh_eth_open(struct net_device *ndev)
{
@@ -1073,8 +1238,8 @@ static int sh_eth_open(struct net_device *ndev)
ret = request_irq(ndev->irq, sh_eth_interrupt,
#if defined(CONFIG_CPU_SUBTYPE_SH7763) || \
- defined(CONFIG_CPU_SUBTYPE_SH7764) || \
- defined(CONFIG_CPU_SUBTYPE_SH7757)
+ defined(CONFIG_CPU_SUBTYPE_SH7764) || \
+ defined(CONFIG_CPU_SUBTYPE_SH7757)
IRQF_SHARED,
#else
0,
@@ -1123,8 +1288,8 @@ static void sh_eth_tx_timeout(struct net_device *ndev)
netif_stop_queue(ndev);
- /* worning message out. */
- printk(KERN_WARNING "%s: transmit timed out, status %8.8x,"
+ if (netif_msg_timer(mdp))
+ dev_err(&ndev->dev, "%s: transmit timed out, status %8.8x,"
" resetting...\n", ndev->name, (int)readl(ioaddr + EESR));
/* tx_errors count up */
@@ -1167,6 +1332,8 @@ static int sh_eth_start_xmit(struct sk_buff *skb, struct net_device *ndev)
spin_lock_irqsave(&mdp->lock, flags);
if ((mdp->cur_tx - mdp->dirty_tx) >= (TX_RING_SIZE - 4)) {
if (!sh_eth_txfree(ndev)) {
+ if (netif_msg_tx_queued(mdp))
+ dev_warn(&ndev->dev, "TxFD exhausted.\n");
netif_stop_queue(ndev);
spin_unlock_irqrestore(&mdp->lock, flags);
return NETDEV_TX_BUSY;
@@ -1497,8 +1664,11 @@ static int sh_eth_drv_probe(struct platform_device *pdev)
/* set function */
ndev->netdev_ops = &sh_eth_netdev_ops;
+ SET_ETHTOOL_OPS(ndev, &sh_eth_ethtool_ops);
ndev->watchdog_timeo = TX_TIMEOUT;
+ /* debug message level */
+ mdp->msg_enable = SH_ETH_DEF_MSG_ENABLE;
mdp->post_rx = POST_RX >> (devno << 1);
mdp->post_fw = POST_FW >> (devno << 1);
--
1.7.2.3
^ permalink raw reply related
* Re: [PATCH v2] sh: sh_eth: Add support ethtool
From: Nobuhiro Iwamatsu @ 2011-02-16 7:04 UTC (permalink / raw)
To: Ben Hutchings; +Cc: netdev, linux-sh, yoshihiro.shimoda.uh
In-Reply-To: <1294761426.3637.8.camel@bwh-desktop>
2011/1/12 Ben Hutchings <bhutchings@solarflare.com>:
> On Tue, 2011-01-11 at 20:58 +0900, nobuhiro.iwamatsu.yj@renesas.com
> wrote:
>> From: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
>>
>> This commit supports following functions.
>> - get_drvinfo
>> - get_settings
>> - set_settings
>> - nway_reset
>> - get_msglevel
>> - set_msglevel
>> - get_link
>> - get_strings
>> - get_ethtool_stats
>> - get_sset_count
>>
>> About other function, the device does not support.
>>
>> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
>> Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
>> ---
>> v2: reverted one part of the checks of checkpatch.pl.
>> foo *bar -> foo * bar.
>> changed function copying of net_device_stats from *for* to memcopy.
>>
>> drivers/net/sh_eth.c | 186 ++++++++++++++++++++++++++++++++++++++++++++++---
>> 1 files changed, 174 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/net/sh_eth.c b/drivers/net/sh_eth.c
>> index 819c175..0b2cb7d 100644
>> --- a/drivers/net/sh_eth.c
>> +++ b/drivers/net/sh_eth.c
> [...]
>> @@ -1063,6 +1074,154 @@ static int sh_eth_phy_start(struct net_device *ndev)
>> return 0;
>> }
>>
>> +static void sh_eth_get_drvinfo(struct net_device *ndev,
>> + struct ethtool_drvinfo *info)
>> +{
>> + strncpy(info->driver, "sh_eth", sizeof(info->driver) - 1);
>> + strcpy(info->version, "N/A");
>> + strcpy(info->fw_version, "N/A");
>> + strlcpy(info->bus_info, dev_name(ndev->dev.parent),
>> + sizeof(info->bus_info));
>> +}
>
> This is redundant; the default implementation already does this.
I see. I removed this.
>
> [...]
>> +static int sh_eth_set_settings(struct net_device *ndev,
>> + struct ethtool_cmd *ecmd)
>> +{
>> + struct sh_eth_private *mdp = netdev_priv(ndev);
>> + unsigned long flags;
>> + int ret;
>> + u32 ioaddr = ndev->base_addr;
>> +
>> + spin_lock_irqsave(&mdp->lock, flags);
>> +
>> + /* disable tx and rx */
>> + sh_eth_linkdown(ioaddr);
>> +
>> + ret = phy_ethtool_sset(mdp->phydev, ecmd);
>> + if (ret)
>> + goto error_exit;
>> +
>> + if (ecmd->duplex == DUPLEX_FULL)
>> + mdp->duplex = 1;
>> + else
>> + mdp->duplex = 0;
>> +
>> + if (mdp->cd->set_duplex)
>> + mdp->cd->set_duplex(ndev);
>> +
>> +error_exit:
>> + mdelay(100);
>
> Ugh, 100 ms holding a spinlock?!
Oh, This was not need 100ms.
I changed to 1 ms.
>
>> + /* enable tx and rx */
>> + sh_eth_linkup(ioaddr);
>
> How do you know the link is up? Shouldn't this be left to the link
> polling function?
>
Hmm. this has bad function name.
This function does not linkup. This enable recv / send function of the
hardware.
I changed a function name from sh_eth_linkup to sh_eth_rcv_send_enable.
> [...]
>> +static u32 sh_eth_get_msglevel(struct net_device *ndev)
>> +{
>> + struct sh_eth_private *mdp = netdev_priv(ndev);
>> + return mdp->msg_enable;
>> +}
>> +
>> +static void sh_eth_set_msglevel(struct net_device *ndev, u32 value)
>> +{
>> + struct sh_eth_private *mdp = netdev_priv(ndev);
>> + mdp->msg_enable = value;
>> +}
>
> This would be more useful if msg_enable was actually used anywhere in
> the driver.
I forgot this.
I am going to include msglevel stuff.
>
> [...]
>> @@ -1073,8 +1232,8 @@ static int sh_eth_open(struct net_device *ndev)
>>
>> ret = request_irq(ndev->irq, sh_eth_interrupt,
>> #if defined(CONFIG_CPU_SUBTYPE_SH7763) || \
>> - defined(CONFIG_CPU_SUBTYPE_SH7764) || \
>> - defined(CONFIG_CPU_SUBTYPE_SH7757)
>> + defined(CONFIG_CPU_SUBTYPE_SH7764) || \
>> + defined(CONFIG_CPU_SUBTYPE_SH7757)
>> IRQF_SHARED,
>> #else
>> 0,
>> @@ -1232,11 +1391,11 @@ static int sh_eth_close(struct net_device *ndev)
>> sh_eth_ring_free(ndev);
>>
>> /* free DMA buffer */
>> - ringsize = sizeof(struct sh_eth_rxdesc) * RX_RING_SIZE;
>> + ringsize = sizeof(struct sh_eth_rxdesc) *RX_RING_SIZE;
>> dma_free_coherent(NULL, ringsize, mdp->rx_ring, mdp->rx_desc_dma);
>>
>> /* free DMA buffer */
>> - ringsize = sizeof(struct sh_eth_txdesc) * TX_RING_SIZE;
>> + ringsize = sizeof(struct sh_eth_txdesc) *TX_RING_SIZE;
>> dma_free_coherent(NULL, ringsize, mdp->tx_ring, mdp->tx_desc_dma);
>>
>> pm_runtime_put_sync(&mdp->pdev->dev);
>
> Please do not include these space changes.
I revised this.
>
>> @@ -1497,8 +1656,11 @@ static int sh_eth_drv_probe(struct platform_device *pdev)
>>
>> /* set function */
>> ndev->netdev_ops = &sh_eth_netdev_ops;
>> + SET_ETHTOOL_OPS(ndev, &sh_eth_ethtool_ops);
>> ndev->watchdog_timeo = TX_TIMEOUT;
>>
>> + /* debug message level */
>> + mdp->msg_enable = (1 << 3) - 1;
>
> If you're actually going to *use* msg_enable, its value should be
> initialised in terms of the NETIF_MSG_* flags defined in
> <linux/netdevice.h>.
Thanks, I replaced to NETIF_MSG_*.
>
>> mdp->post_rx = POST_RX >> (devno << 1);
>> mdp->post_fw = POST_FW >> (devno << 1);
>>
>> @@ -1572,7 +1734,7 @@ static int sh_eth_runtime_nop(struct device *dev)
>> return 0;
>> }
>>
>> -static struct dev_pm_ops sh_eth_dev_pm_ops = {
>> +static const struct dev_pm_ops sh_eth_dev_pm_ops = {
>> .runtime_suspend = sh_eth_runtime_nop,
>> .runtime_resume = sh_eth_runtime_nop,
>> };
>
> This is worthwhile but unrelated to ethtool!
Oh, I split to other patch.
>
> Ben.
>
Best regards,
Nobuhiro
--
Nobuhiro Iwamatsu
^ permalink raw reply
* Re: [PATCH v2] sh: sh_eth: Add support ethtool
From: Nobuhiro Iwamatsu @ 2011-02-16 7:04 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, linux-sh, yoshihiro.shimoda.uh, bhutchings
In-Reply-To: <1294748343.2927.57.camel@edumazet-laptop>
2011/1/11 Eric Dumazet <eric.dumazet@gmail.com>:
> Le mardi 11 janvier 2011 à 20:58 +0900, nobuhiro.iwamatsu.yj@renesas.com
> a écrit :
>> From: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
>>
>> This commit supports following functions.
>> - get_drvinfo
>> - get_settings
>> - set_settings
>> - nway_reset
>> - get_msglevel
>> - set_msglevel
>> - get_link
>> - get_strings
>> - get_ethtool_stats
>> - get_sset_count
>>
>> About other function, the device does not support.
>>
>> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
>> Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
>> ---
>
>> +static const char sh_eth_gstrings_stats[][ETH_GSTRING_LEN] = {
>> + "rx_packets", "tx_packets", "rx_bytes", "tx_bytes", "rx_errors",
>> + "tx_errors", "rx_dropped", "tx_dropped", "multicast", "collisions",
>> + "rx_length_errors", "rx_over_errors", "rx_crc_errors",
>> + "rx_frame_errors", "rx_fifo_errors", "rx_missed_errors",
>> + "tx_aborted_errors", "tx_carrier_errors", "tx_fifo_errors",
>> + "tx_heartbeat_errors", "tx_window_errors",
>> + /* device-specific stats */
>> + "rx_current", "tx_current",
>> + "rx_dirty", "tx_dirty",
>> +};
>> +#define SH_ETH_NET_STATS_LEN 21
>> +#define SH_ETH_STATS_LEN ARRAY_SIZE(sh_eth_gstrings_stats)
>
> Why is it needed to report standard device stats ?
>
I dont know that we could get standart device status from basic interface.
I removed this.
>
>> +
>> +static int sh_eth_get_sset_count(struct net_device *netdev, int sset)
>> +{
>> + switch (sset) {
>> + case ETH_SS_STATS:
>> + return SH_ETH_STATS_LEN;
>> + default:
>> + return -EOPNOTSUPP;
>> + }
>> +}
>> +
>> +static void sh_eth_get_ethtool_stats(struct net_device *ndev,
>> + struct ethtool_stats *stats, u64 *data)
>> +{
>> + struct sh_eth_private *mdp = netdev_priv(ndev);
>> + int i = SH_ETH_NET_STATS_LEN;
>> +
>> + memcpy(data, (unsigned long *)&ndev->stats,
>> + SH_ETH_NET_STATS_LEN * sizeof(unsigned long));
>
> This is wrong on 32bit arches.
> ndev->stats is an array of "long" values, not u64 ones.
I removed this too.
>> +
>> + /* device-specific stats */
>> + data[i++] = mdp->cur_rx;
>> + data[i++] = mdp->cur_tx;
>> + data[i++] = mdp->dirty_rx;
>> + data[i++] = mdp->dirty_tx;
>> +}
>> +
>
--
Nobuhiro Iwamatsu
^ permalink raw reply
* [PATCH 2/3]drivers:net:rrunner.c Fix typo occationally to occasionally
From: Justin P. Mattock @ 2011-02-16 6:55 UTC (permalink / raw)
To: trivial; +Cc: davem, eric.dumazet, netdev, linux-kernel, Justin P. Mattock
The below patch fixes a typo occationally to occasionally.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
---
drivers/net/rrunner.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/rrunner.c b/drivers/net/rrunner.c
index e68c941..6dceeb5 100644
--- a/drivers/net/rrunner.c
+++ b/drivers/net/rrunner.c
@@ -1072,7 +1072,7 @@ static irqreturn_t rr_interrupt(int irq, void *dev_id)
txcon = rrpriv->dirty_tx;
if (txcsmr != txcon) {
do {
- /* Due to occational firmware TX producer/consumer out
+ /* Due to occasional firmware TX producer/consumer out
* of sync. error need to check entry in ring -kbf
*/
if(rrpriv->tx_skbuff[txcon]){
--
1.6.5.2.180.gc5b3e
^ permalink raw reply related
* [PATCH 3/3] ipvs: make "no destination available" message more informative
From: Simon Horman @ 2011-02-16 6:04 UTC (permalink / raw)
To: lvs-devel, netdev, netfilter-devel, netfilter
Cc: Julian Anastasov, Patrick Schaaf, Patrick McHardy, Simon Horman
In-Reply-To: <1297836293-5942-1-git-send-email-horms@verge.net.au>
From: Patrick Schaaf <netdev@bof.de>
When IP_VS schedulers do not find a destination, they output a terse
"WLC: no destination available" message through kernel syslog, which I
can not only make sense of because syslog puts them in a logfile
together with keepalived checker results.
This patch makes the output a bit more informative, by telling you which
virtual service failed to find a destination.
Example output:
kernel: [1539214.552233] IPVS: wlc: TCP 192.168.8.30:22 - no destination available
kernel: [1539299.674418] IPVS: wlc: FWM 22 0x00000016 - no destination available
I have tested the code for IPv4 and FWM services, as you can see from
the example; I do not have an IPv6 setup to test the third code path
with.
To avoid code duplication, I put a new function ip_vs_scheduler_err()
into ip_vs_sched.c, and use that from the schedulers instead of calling
IP_VS_ERR_RL directly.
Signed-off-by: Patrick Schaaf <netdev@bof.de>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
include/net/ip_vs.h | 2 ++
net/netfilter/ipvs/ip_vs_lblc.c | 2 +-
net/netfilter/ipvs/ip_vs_lblcr.c | 2 +-
net/netfilter/ipvs/ip_vs_lc.c | 2 +-
net/netfilter/ipvs/ip_vs_nq.c | 2 +-
net/netfilter/ipvs/ip_vs_rr.c | 2 +-
net/netfilter/ipvs/ip_vs_sched.c | 25 +++++++++++++++++++++++++
net/netfilter/ipvs/ip_vs_sed.c | 2 +-
net/netfilter/ipvs/ip_vs_sh.c | 2 +-
net/netfilter/ipvs/ip_vs_wlc.c | 2 +-
net/netfilter/ipvs/ip_vs_wrr.c | 14 ++++++++------
11 files changed, 43 insertions(+), 14 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 5d75fea..9399549 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1019,6 +1019,8 @@ ip_vs_schedule(struct ip_vs_service *svc, struct sk_buff *skb,
extern int ip_vs_leave(struct ip_vs_service *svc, struct sk_buff *skb,
struct ip_vs_proto_data *pd);
+extern void ip_vs_scheduler_err(struct ip_vs_service *svc, const char *msg);
+
/*
* IPVS control data and functions (from ip_vs_ctl.c)
diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index 00b5ffa..4a9c8cd 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -510,7 +510,7 @@ ip_vs_lblc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
/* No cache entry or it is invalid, time to schedule */
dest = __ip_vs_lblc_schedule(svc);
if (!dest) {
- IP_VS_ERR_RL("LBLC: no destination available\n");
+ ip_vs_scheduler_err(svc, "no destination available");
return NULL;
}
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index bfa25f1..bd329b1 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -692,7 +692,7 @@ ip_vs_lblcr_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
/* The cache entry is invalid, time to schedule */
dest = __ip_vs_lblcr_schedule(svc);
if (!dest) {
- IP_VS_ERR_RL("LBLCR: no destination available\n");
+ ip_vs_scheduler_err(svc, "no destination available");
read_unlock(&svc->sched_lock);
return NULL;
}
diff --git a/net/netfilter/ipvs/ip_vs_lc.c b/net/netfilter/ipvs/ip_vs_lc.c
index 4f69db1..6063800 100644
--- a/net/netfilter/ipvs/ip_vs_lc.c
+++ b/net/netfilter/ipvs/ip_vs_lc.c
@@ -70,7 +70,7 @@ ip_vs_lc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
}
if (!least)
- IP_VS_ERR_RL("LC: no destination available\n");
+ ip_vs_scheduler_err(svc, "no destination available");
else
IP_VS_DBG_BUF(6, "LC: server %s:%u activeconns %d "
"inactconns %d\n",
diff --git a/net/netfilter/ipvs/ip_vs_nq.c b/net/netfilter/ipvs/ip_vs_nq.c
index c413e18..984d9c1 100644
--- a/net/netfilter/ipvs/ip_vs_nq.c
+++ b/net/netfilter/ipvs/ip_vs_nq.c
@@ -99,7 +99,7 @@ ip_vs_nq_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
}
if (!least) {
- IP_VS_ERR_RL("NQ: no destination available\n");
+ ip_vs_scheduler_err(svc, "no destination available");
return NULL;
}
diff --git a/net/netfilter/ipvs/ip_vs_rr.c b/net/netfilter/ipvs/ip_vs_rr.c
index e210f37..c49b388 100644
--- a/net/netfilter/ipvs/ip_vs_rr.c
+++ b/net/netfilter/ipvs/ip_vs_rr.c
@@ -72,7 +72,7 @@ ip_vs_rr_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
q = q->next;
} while (q != p);
write_unlock(&svc->sched_lock);
- IP_VS_ERR_RL("RR: no destination available\n");
+ ip_vs_scheduler_err(svc, "no destination available");
return NULL;
out:
diff --git a/net/netfilter/ipvs/ip_vs_sched.c b/net/netfilter/ipvs/ip_vs_sched.c
index 076ebe0..08dbdd5 100644
--- a/net/netfilter/ipvs/ip_vs_sched.c
+++ b/net/netfilter/ipvs/ip_vs_sched.c
@@ -29,6 +29,7 @@
#include <net/ip_vs.h>
+EXPORT_SYMBOL(ip_vs_scheduler_err);
/*
* IPVS scheduler list
*/
@@ -146,6 +147,30 @@ void ip_vs_scheduler_put(struct ip_vs_scheduler *scheduler)
module_put(scheduler->module);
}
+/*
+ * Common error output helper for schedulers
+ */
+
+void ip_vs_scheduler_err(struct ip_vs_service *svc, const char *msg)
+{
+ if (svc->fwmark) {
+ IP_VS_ERR_RL("%s: FWM %u 0x%08X - %s\n",
+ svc->scheduler->name, svc->fwmark,
+ svc->fwmark, msg);
+#ifdef CONFIG_IP_VS_IPV6
+ } else if (svc->af == AF_INET6) {
+ IP_VS_ERR_RL("%s: %s [%pI6]:%d - %s\n",
+ svc->scheduler->name,
+ ip_vs_proto_name(svc->protocol),
+ &svc->addr.in6, ntohs(svc->port), msg);
+#endif
+ } else {
+ IP_VS_ERR_RL("%s: %s %pI4:%d - %s\n",
+ svc->scheduler->name,
+ ip_vs_proto_name(svc->protocol),
+ &svc->addr.ip, ntohs(svc->port), msg);
+ }
+}
/*
* Register a scheduler in the scheduler list
diff --git a/net/netfilter/ipvs/ip_vs_sed.c b/net/netfilter/ipvs/ip_vs_sed.c
index 1ab75a9..89ead24 100644
--- a/net/netfilter/ipvs/ip_vs_sed.c
+++ b/net/netfilter/ipvs/ip_vs_sed.c
@@ -87,7 +87,7 @@ ip_vs_sed_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
goto nextstage;
}
}
- IP_VS_ERR_RL("SED: no destination available\n");
+ ip_vs_scheduler_err(svc, "no destination available");
return NULL;
/*
diff --git a/net/netfilter/ipvs/ip_vs_sh.c b/net/netfilter/ipvs/ip_vs_sh.c
index e6cc174..b5e2556 100644
--- a/net/netfilter/ipvs/ip_vs_sh.c
+++ b/net/netfilter/ipvs/ip_vs_sh.c
@@ -223,7 +223,7 @@ ip_vs_sh_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
|| !(dest->flags & IP_VS_DEST_F_AVAILABLE)
|| atomic_read(&dest->weight) <= 0
|| is_overloaded(dest)) {
- IP_VS_ERR_RL("SH: no destination available\n");
+ ip_vs_scheduler_err(svc, "no destination available");
return NULL;
}
diff --git a/net/netfilter/ipvs/ip_vs_wlc.c b/net/netfilter/ipvs/ip_vs_wlc.c
index bbddfdb..fdf0f58 100644
--- a/net/netfilter/ipvs/ip_vs_wlc.c
+++ b/net/netfilter/ipvs/ip_vs_wlc.c
@@ -75,7 +75,7 @@ ip_vs_wlc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
goto nextstage;
}
}
- IP_VS_ERR_RL("WLC: no destination available\n");
+ ip_vs_scheduler_err(svc, "no destination available");
return NULL;
/*
diff --git a/net/netfilter/ipvs/ip_vs_wrr.c b/net/netfilter/ipvs/ip_vs_wrr.c
index 30db633..1ef41f5 100644
--- a/net/netfilter/ipvs/ip_vs_wrr.c
+++ b/net/netfilter/ipvs/ip_vs_wrr.c
@@ -147,8 +147,9 @@ ip_vs_wrr_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
if (mark->cl == mark->cl->next) {
/* no dest entry */
- IP_VS_ERR_RL("WRR: no destination available: "
- "no destinations present\n");
+ ip_vs_scheduler_err(svc,
+ "no destination available: "
+ "no destinations present");
dest = NULL;
goto out;
}
@@ -162,8 +163,8 @@ ip_vs_wrr_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
*/
if (mark->cw == 0) {
mark->cl = &svc->destinations;
- IP_VS_ERR_RL("WRR: no destination "
- "available\n");
+ ip_vs_scheduler_err(svc,
+ "no destination available");
dest = NULL;
goto out;
}
@@ -185,8 +186,9 @@ ip_vs_wrr_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
/* back to the start, and no dest is found.
It is only possible when all dests are OVERLOADED */
dest = NULL;
- IP_VS_ERR_RL("WRR: no destination available: "
- "all destinations are overloaded\n");
+ ip_vs_scheduler_err(svc,
+ "no destination available: "
+ "all destinations are overloaded");
goto out;
}
}
--
1.7.2.3
^ permalink raw reply related
* [PATCH 2/3] ipvs: remove extra lookups for ICMP packets
From: Simon Horman @ 2011-02-16 6:04 UTC (permalink / raw)
To: lvs-devel, netdev, netfilter-devel, netfilter
Cc: Julian Anastasov, Patrick Schaaf, Patrick McHardy, Simon Horman
In-Reply-To: <1297836293-5942-1-git-send-email-horms@verge.net.au>
From: Julian Anastasov <ja@ssi.bg>
Remove code that should not be called anymore.
Now when ip_vs_out handles replies for local clients at
LOCAL_IN hook we do not need to call conn_out_get and
handle_response_icmp from ip_vs_in_icmp* because such
lookups were already performed for the ICMP packet and no
connection was found.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
net/netfilter/ipvs/ip_vs_core.c | 28 +++-------------------------
1 files changed, 3 insertions(+), 25 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 4d06617..2d1f932 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -729,7 +729,7 @@ void ip_vs_nat_icmp_v6(struct sk_buff *skb, struct ip_vs_protocol *pp,
#endif
/* Handle relevant response ICMP messages - forward to the right
- * destination host. Used for NAT and local client.
+ * destination host.
*/
static int handle_response_icmp(int af, struct sk_buff *skb,
union nf_inet_addr *snet,
@@ -979,7 +979,6 @@ static inline int is_tcp_reset(const struct sk_buff *skb, int nh_len)
}
/* Handle response packets: rewrite addresses and send away...
- * Used for NAT and local client.
*/
static unsigned int
handle_response(int af, struct sk_buff *skb, struct ip_vs_proto_data *pd,
@@ -1280,7 +1279,6 @@ ip_vs_in_icmp(struct sk_buff *skb, int *related, unsigned int hooknum)
struct ip_vs_protocol *pp;
struct ip_vs_proto_data *pd;
unsigned int offset, ihl, verdict;
- union nf_inet_addr snet;
*related = 1;
@@ -1339,17 +1337,8 @@ ip_vs_in_icmp(struct sk_buff *skb, int *related, unsigned int hooknum)
ip_vs_fill_iphdr(AF_INET, cih, &ciph);
/* The embedded headers contain source and dest in reverse order */
cp = pp->conn_in_get(AF_INET, skb, &ciph, offset, 1);
- if (!cp) {
- /* The packet could also belong to a local client */
- cp = pp->conn_out_get(AF_INET, skb, &ciph, offset, 1);
- if (cp) {
- snet.ip = iph->saddr;
- return handle_response_icmp(AF_INET, skb, &snet,
- cih->protocol, cp, pp,
- offset, ihl);
- }
+ if (!cp)
return NF_ACCEPT;
- }
verdict = NF_DROP;
@@ -1395,7 +1384,6 @@ ip_vs_in_icmp_v6(struct sk_buff *skb, int *related, unsigned int hooknum)
struct ip_vs_protocol *pp;
struct ip_vs_proto_data *pd;
unsigned int offset, verdict;
- union nf_inet_addr snet;
struct rt6_info *rt;
*related = 1;
@@ -1455,18 +1443,8 @@ ip_vs_in_icmp_v6(struct sk_buff *skb, int *related, unsigned int hooknum)
ip_vs_fill_iphdr(AF_INET6, cih, &ciph);
/* The embedded headers contain source and dest in reverse order */
cp = pp->conn_in_get(AF_INET6, skb, &ciph, offset, 1);
- if (!cp) {
- /* The packet could also belong to a local client */
- cp = pp->conn_out_get(AF_INET6, skb, &ciph, offset, 1);
- if (cp) {
- ipv6_addr_copy(&snet.in6, &iph->saddr);
- return handle_response_icmp(AF_INET6, skb, &snet,
- cih->nexthdr,
- cp, pp, offset,
- sizeof(struct ipv6hdr));
- }
+ if (!cp)
return NF_ACCEPT;
- }
verdict = NF_DROP;
--
1.7.2.3
^ permalink raw reply related
* [PATCH 1/3] ipvs: fix timer in get_curr_sync_buff
From: Simon Horman @ 2011-02-16 6:04 UTC (permalink / raw)
To: lvs-devel, netdev, netfilter-devel, netfilter
Cc: Julian Anastasov, Patrick Schaaf, Patrick McHardy, Tinggong Wang,
Simon Horman
In-Reply-To: <1297836293-5942-1-git-send-email-horms@verge.net.au>
From: Tinggong Wang <wangtinggong@gmail.com>
Fix get_curr_sync_buff to keep buffer for 2 seconds
as intended, not just for the current jiffie. By this way
we will sync more connection structures with single packet.
Signed-off-by: Tinggong Wang <wangtinggong@gmail.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
net/netfilter/ipvs/ip_vs_sync.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index d1b7298..fecf24d 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -374,8 +374,8 @@ get_curr_sync_buff(struct netns_ipvs *ipvs, unsigned long time)
struct ip_vs_sync_buff *sb;
spin_lock_bh(&ipvs->sync_buff_lock);
- if (ipvs->sync_buff && (time == 0 ||
- time_before(jiffies - ipvs->sync_buff->firstuse, time))) {
+ if (ipvs->sync_buff &&
+ time_after_eq(jiffies - ipvs->sync_buff->firstuse, time)) {
sb = ipvs->sync_buff;
ipvs->sync_buff = NULL;
} else
--
1.7.2.3
^ permalink raw reply related
* [GIT PULL nf-next-2.6] IPVS
From: Simon Horman @ 2011-02-16 6:04 UTC (permalink / raw)
To: lvs-devel, netdev, netfilter-devel, netfilter
Cc: Julian Anastasov, Patrick Schaaf, Patrick McHardy
Hi Patrick,
please consider pulling
git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-test-2.6.git master
go get:
* Removal of unused ICMP code by Julian
* More informative "no destination available" messages
by Patrick Schaaf
* Fix to buffering of synchronisation messages
by Tinggong Wang and Julian
include/net/ip_vs.h | 2 ++
net/netfilter/ipvs/ip_vs_core.c | 28 +++-------------------------
net/netfilter/ipvs/ip_vs_lblc.c | 2 +-
net/netfilter/ipvs/ip_vs_lblcr.c | 2 +-
net/netfilter/ipvs/ip_vs_lc.c | 2 +-
net/netfilter/ipvs/ip_vs_nq.c | 2 +-
net/netfilter/ipvs/ip_vs_rr.c | 2 +-
net/netfilter/ipvs/ip_vs_sched.c | 25 +++++++++++++++++++++++++
net/netfilter/ipvs/ip_vs_sed.c | 2 +-
net/netfilter/ipvs/ip_vs_sh.c | 2 +-
net/netfilter/ipvs/ip_vs_sync.c | 4 ++--
net/netfilter/ipvs/ip_vs_wlc.c | 2 +-
net/netfilter/ipvs/ip_vs_wrr.c | 14 ++++++++------
13 files changed, 48 insertions(+), 41 deletions(-)
^ permalink raw reply
* Re: [PATCH] bnx2x: Support for managing RX indirection table
From: Vlad Zolotarov @ 2011-02-16 5:53 UTC (permalink / raw)
To: Tom Herbert
Cc: Ben Hutchings, davem@davemloft.net, Eilon Greenstein,
netdev@vger.kernel.org
In-Reply-To: <AANLkTi=VT7BMpJZeE_gKOceNw+=Db40p3znmyt=Jc2Un@mail.gmail.com>
On Wednesday 16 February 2011 00:50:18 Tom Herbert wrote:
> >> + u32 rx_indir_table[128];
> >
> > Shouldn't the dimension be TSTORM_INDIRECTION_TABLE_SIZE?
> >
>
> It's not a defined constant, so the alternative would be to malloc it
> which seems like overkill to me.
>
> Broadcom guys: are there any adapters or configuration of bnx2x where
> the indirection table would be greater than 128?
Although for all currently supported adapters the actual value of the indirection
table size is 128 I agree with Ben and would like to ask u to use the above macro (which
is a rename for an entry in a per-adapter array of constants) to keep the code
scalable and clean. I don't think that a malloc would be too much of a price for it... ;)
thanks,
vlad
>
> Tom
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply
* [PATCH v6 9/9] loopback: convert to hw_features
From: Michał Mirosław @ 2011-02-16 2:59 UTC (permalink / raw)
To: netdev; +Cc: Ben Hutchings, David Miller
In-Reply-To: <cover.1297824704.git.mirq-linux@rere.qmqm.pl>
This also enables TSOv6, TSO-ECN, and UFO as loopback clearly can handle them.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
drivers/net/loopback.c | 9 ++++-----
1 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 2d9663a..ea0dc45 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -129,10 +129,6 @@ static u32 always_on(struct net_device *dev)
static const struct ethtool_ops loopback_ethtool_ops = {
.get_link = always_on,
- .set_tso = ethtool_op_set_tso,
- .get_tx_csum = always_on,
- .get_sg = always_on,
- .get_rx_csum = always_on,
};
static int loopback_dev_init(struct net_device *dev)
@@ -169,9 +165,12 @@ static void loopback_setup(struct net_device *dev)
dev->type = ARPHRD_LOOPBACK; /* 0x0001*/
dev->flags = IFF_LOOPBACK;
dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
+ dev->hw_features = NETIF_F_ALL_TSO | NETIF_F_UFO;
dev->features = NETIF_F_SG | NETIF_F_FRAGLIST
- | NETIF_F_TSO
+ | NETIF_F_ALL_TSO
+ | NETIF_F_UFO
| NETIF_F_NO_CSUM
+ | NETIF_F_RXCSUM
| NETIF_F_HIGHDMA
| NETIF_F_LLTX
| NETIF_F_NETNS_LOCAL;
--
1.7.2.3
^ permalink raw reply related
* [PATCH v6 8/9] net: introduce NETIF_F_RXCSUM
From: Michał Mirosław @ 2011-02-16 2:59 UTC (permalink / raw)
To: netdev; +Cc: Ben Hutchings, David Miller
In-Reply-To: <cover.1297824704.git.mirq-linux@rere.qmqm.pl>
Introduce NETIF_F_RXCSUM to replace device-private flags for RX checksum
offload. Integrate it with ndo_fix_features.
ethtool_op_get_rx_csum() is removed altogether as nothing in-tree uses it.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
include/linux/ethtool.h | 1 -
include/linux/netdevice.h | 5 +++-
net/core/ethtool.c | 47 ++++++++++++++++++++++-----------------------
3 files changed, 27 insertions(+), 26 deletions(-)
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 806e716..54d776c 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -625,7 +625,6 @@ struct net_device;
/* Some generic methods drivers may use in their ethtool_ops */
u32 ethtool_op_get_link(struct net_device *dev);
-u32 ethtool_op_get_rx_csum(struct net_device *dev);
u32 ethtool_op_get_tx_csum(struct net_device *dev);
int ethtool_op_set_tx_csum(struct net_device *dev, u32 data);
int ethtool_op_set_tx_hw_csum(struct net_device *dev, u32 data);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 85f67e2..ffe56c1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -977,6 +977,7 @@ struct net_device {
#define NETIF_F_FCOE_MTU (1 << 26) /* Supports max FCoE MTU, 2158 bytes*/
#define NETIF_F_NTUPLE (1 << 27) /* N-tuple filters supported */
#define NETIF_F_RXHASH (1 << 28) /* Receive hashing offload */
+#define NETIF_F_RXCSUM (1 << 29) /* Receive checksumming offload */
/* Segmentation offload features */
#define NETIF_F_GSO_SHIFT 16
@@ -992,7 +993,7 @@ struct net_device {
/* = all defined minus driver/device-class-related */
#define NETIF_F_NEVER_CHANGE (NETIF_F_HIGHDMA | NETIF_F_VLAN_CHALLENGED | \
NETIF_F_LLTX | NETIF_F_NETNS_LOCAL)
-#define NETIF_F_ETHTOOL_BITS (0x1f3fffff & ~NETIF_F_NEVER_CHANGE)
+#define NETIF_F_ETHTOOL_BITS (0x3f3fffff & ~NETIF_F_NEVER_CHANGE)
/* List of features with software fallbacks. */
#define NETIF_F_GSO_SOFTWARE (NETIF_F_TSO | NETIF_F_TSO_ECN | \
@@ -2510,6 +2511,8 @@ static inline int dev_ethtool_get_settings(struct net_device *dev,
static inline u32 dev_ethtool_get_rx_csum(struct net_device *dev)
{
+ if (dev->hw_features & NETIF_F_RXCSUM)
+ return !!(dev->features & NETIF_F_RXCSUM);
if (!dev->ethtool_ops || !dev->ethtool_ops->get_rx_csum)
return 0;
return dev->ethtool_ops->get_rx_csum(dev);
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 65b3d50..66cdc76 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -34,12 +34,6 @@ u32 ethtool_op_get_link(struct net_device *dev)
}
EXPORT_SYMBOL(ethtool_op_get_link);
-u32 ethtool_op_get_rx_csum(struct net_device *dev)
-{
- return (dev->features & NETIF_F_ALL_CSUM) != 0;
-}
-EXPORT_SYMBOL(ethtool_op_get_rx_csum);
-
u32 ethtool_op_get_tx_csum(struct net_device *dev)
{
return (dev->features & NETIF_F_ALL_CSUM) != 0;
@@ -274,7 +268,7 @@ static const char netdev_features_strings[ETHTOOL_DEV_FEATURE_WORDS * 32][ETH_GS
/* NETIF_F_FCOE_MTU */ "fcoe-mtu",
/* NETIF_F_NTUPLE */ "rx-ntuple-filter",
/* NETIF_F_RXHASH */ "rx-hashing",
- "",
+ /* NETIF_F_RXCSUM */ "rx-checksum",
"",
"",
};
@@ -313,6 +307,9 @@ static u32 ethtool_get_feature_mask(u32 eth_cmd)
case ETHTOOL_GTXCSUM:
case ETHTOOL_STXCSUM:
return NETIF_F_ALL_CSUM | NETIF_F_SCTP_CSUM;
+ case ETHTOOL_GRXCSUM:
+ case ETHTOOL_SRXCSUM:
+ return NETIF_F_RXCSUM;
case ETHTOOL_GSG:
case ETHTOOL_SSG:
return NETIF_F_SG;
@@ -343,6 +340,8 @@ static void *__ethtool_get_one_feature_actor(struct net_device *dev, u32 ethcmd)
switch (ethcmd) {
case ETHTOOL_GTXCSUM:
return ops->get_tx_csum;
+ case ETHTOOL_GRXCSUM:
+ return ops->get_rx_csum;
case ETHTOOL_SSG:
return ops->get_sg;
case ETHTOOL_STSO:
@@ -354,6 +353,11 @@ static void *__ethtool_get_one_feature_actor(struct net_device *dev, u32 ethcmd)
}
}
+static u32 __ethtool_get_rx_csum_oldbug(struct net_device *dev)
+{
+ return !!(dev->features & NETIF_F_ALL_CSUM);
+}
+
static int ethtool_get_one_feature(struct net_device *dev,
char __user *useraddr, u32 ethcmd)
{
@@ -369,6 +373,10 @@ static int ethtool_get_one_feature(struct net_device *dev,
actor = __ethtool_get_one_feature_actor(dev, ethcmd);
+ /* bug compatibility with old get_rx_csum */
+ if (ethcmd == ETHTOOL_GRXCSUM && !actor)
+ actor = __ethtool_get_rx_csum_oldbug;
+
if (actor)
edata.data = actor(dev);
}
@@ -379,6 +387,7 @@ static int ethtool_get_one_feature(struct net_device *dev,
}
static int __ethtool_set_tx_csum(struct net_device *dev, u32 data);
+static int __ethtool_set_rx_csum(struct net_device *dev, u32 data);
static int __ethtool_set_sg(struct net_device *dev, u32 data);
static int __ethtool_set_tso(struct net_device *dev, u32 data);
static int __ethtool_set_ufo(struct net_device *dev, u32 data);
@@ -416,6 +425,8 @@ static int ethtool_set_one_feature(struct net_device *dev,
switch (ethcmd) {
case ETHTOOL_STXCSUM:
return __ethtool_set_tx_csum(dev, edata.data);
+ case ETHTOOL_SRXCSUM:
+ return __ethtool_set_rx_csum(dev, edata.data);
case ETHTOOL_SSG:
return __ethtool_set_sg(dev, edata.data);
case ETHTOOL_STSO:
@@ -1404,20 +1415,15 @@ static int __ethtool_set_tx_csum(struct net_device *dev, u32 data)
return dev->ethtool_ops->set_tx_csum(dev, data);
}
-static int ethtool_set_rx_csum(struct net_device *dev, char __user *useraddr)
+static int __ethtool_set_rx_csum(struct net_device *dev, u32 data)
{
- struct ethtool_value edata;
-
if (!dev->ethtool_ops->set_rx_csum)
return -EOPNOTSUPP;
- if (copy_from_user(&edata, useraddr, sizeof(edata)))
- return -EFAULT;
-
- if (!edata.data && dev->ethtool_ops->set_sg)
+ if (!data)
dev->features &= ~NETIF_F_GRO;
- return dev->ethtool_ops->set_rx_csum(dev, edata.data);
+ return dev->ethtool_ops->set_rx_csum(dev, data);
}
static int __ethtool_set_tso(struct net_device *dev, u32 data)
@@ -1765,15 +1771,6 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
case ETHTOOL_SPAUSEPARAM:
rc = ethtool_set_pauseparam(dev, useraddr);
break;
- case ETHTOOL_GRXCSUM:
- rc = ethtool_get_value(dev, useraddr, ethcmd,
- (dev->ethtool_ops->get_rx_csum ?
- dev->ethtool_ops->get_rx_csum :
- ethtool_op_get_rx_csum));
- break;
- case ETHTOOL_SRXCSUM:
- rc = ethtool_set_rx_csum(dev, useraddr);
- break;
case ETHTOOL_TEST:
rc = ethtool_self_test(dev, useraddr);
break;
@@ -1846,6 +1843,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
rc = ethtool_set_features(dev, useraddr);
break;
case ETHTOOL_GTXCSUM:
+ case ETHTOOL_GRXCSUM:
case ETHTOOL_GSG:
case ETHTOOL_GTSO:
case ETHTOOL_GUFO:
@@ -1854,6 +1852,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
rc = ethtool_get_one_feature(dev, useraddr, ethcmd);
break;
case ETHTOOL_STXCSUM:
+ case ETHTOOL_SRXCSUM:
case ETHTOOL_SSG:
case ETHTOOL_STSO:
case ETHTOOL_SUFO:
--
1.7.2.3
^ permalink raw reply related
* [PATCH v6 3/9] ethtool: factorize ethtool_get_strings() and ethtool_get_sset_count()
From: Michał Mirosław @ 2011-02-16 2:59 UTC (permalink / raw)
To: netdev; +Cc: Ben Hutchings, David Miller
In-Reply-To: <cover.1297824704.git.mirq-linux@rere.qmqm.pl>
This is needed for unified offloads patch.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
net/core/ethtool.c | 35 +++++++++++++++++++++++------------
1 files changed, 23 insertions(+), 12 deletions(-)
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 9eb8277..85aaeab 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -172,6 +172,25 @@ EXPORT_SYMBOL(ethtool_ntuple_flush);
/* Handlers for each ethtool command */
+static int __ethtool_get_sset_count(struct net_device *dev, int sset)
+{
+ const struct ethtool_ops *ops = dev->ethtool_ops;
+
+ if (ops && ops->get_sset_count && ops->get_strings)
+ return ops->get_sset_count(dev, sset);
+ else
+ return -EOPNOTSUPP;
+}
+
+static void __ethtool_get_strings(struct net_device *dev,
+ u32 stringset, u8 *data)
+{
+ const struct ethtool_ops *ops = dev->ethtool_ops;
+
+ /* ops->get_strings is valid because checked earlier */
+ ops->get_strings(dev, stringset, data);
+}
+
static int ethtool_get_settings(struct net_device *dev, void __user *useraddr)
{
struct ethtool_cmd cmd = { .cmd = ETHTOOL_GSET };
@@ -252,14 +271,10 @@ static noinline_for_stack int ethtool_get_sset_info(struct net_device *dev,
void __user *useraddr)
{
struct ethtool_sset_info info;
- const struct ethtool_ops *ops = dev->ethtool_ops;
u64 sset_mask;
int i, idx = 0, n_bits = 0, ret, rc;
u32 *info_buf = NULL;
- if (!ops->get_sset_count)
- return -EOPNOTSUPP;
-
if (copy_from_user(&info, useraddr, sizeof(info)))
return -EFAULT;
@@ -286,7 +301,7 @@ static noinline_for_stack int ethtool_get_sset_info(struct net_device *dev,
if (!(sset_mask & (1ULL << i)))
continue;
- rc = ops->get_sset_count(dev, i);
+ rc = __ethtool_get_sset_count(dev, i);
if (rc >= 0) {
info.sset_mask |= (1ULL << i);
info_buf[idx++] = rc;
@@ -1287,17 +1302,13 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
static int ethtool_get_strings(struct net_device *dev, void __user *useraddr)
{
struct ethtool_gstrings gstrings;
- const struct ethtool_ops *ops = dev->ethtool_ops;
u8 *data;
int ret;
- if (!ops->get_strings || !ops->get_sset_count)
- return -EOPNOTSUPP;
-
if (copy_from_user(&gstrings, useraddr, sizeof(gstrings)))
return -EFAULT;
- ret = ops->get_sset_count(dev, gstrings.string_set);
+ ret = __ethtool_get_sset_count(dev, gstrings.string_set);
if (ret < 0)
return ret;
@@ -1307,7 +1318,7 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr)
if (!data)
return -ENOMEM;
- ops->get_strings(dev, gstrings.string_set, data);
+ __ethtool_get_strings(dev, gstrings.string_set, data);
ret = -EFAULT;
if (copy_to_user(useraddr, &gstrings, sizeof(gstrings)))
@@ -1317,7 +1328,7 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr)
goto out;
ret = 0;
- out:
+out:
kfree(data);
return ret;
}
--
1.7.2.3
^ permalink raw reply related
* [PATCH v6 7/9] net: use ndo_fix_features for ethtool_ops->set_flags
From: Michał Mirosław @ 2011-02-16 2:59 UTC (permalink / raw)
To: netdev; +Cc: Ben Hutchings, David Miller
In-Reply-To: <cover.1297824704.git.mirq-linux@rere.qmqm.pl>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
net/core/ethtool.c | 31 +++++++++++++++++++++++++++++--
1 files changed, 29 insertions(+), 2 deletions(-)
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 6599997..65b3d50 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -427,6 +427,34 @@ static int ethtool_set_one_feature(struct net_device *dev,
}
}
+static int __ethtool_set_flags(struct net_device *dev, u32 data)
+{
+ u32 changed;
+
+ if (data & ~flags_dup_features)
+ return -EINVAL;
+
+ /* legacy set_flags() op */
+ if (dev->ethtool_ops->set_flags) {
+ if (unlikely(dev->hw_features & flags_dup_features))
+ netdev_warn(dev,
+ "driver BUG: mixed hw_features and set_flags()\n");
+ return dev->ethtool_ops->set_flags(dev, data);
+ }
+
+ /* allow changing only bits set in hw_features */
+ changed = (data ^ dev->wanted_features) & flags_dup_features;
+ if (changed & ~dev->hw_features)
+ return (changed & dev->hw_features) ? -EINVAL : -EOPNOTSUPP;
+
+ dev->wanted_features =
+ (dev->wanted_features & ~changed) | data;
+
+ netdev_update_features(dev);
+
+ return 0;
+}
+
static int ethtool_get_settings(struct net_device *dev, void __user *useraddr)
{
struct ethtool_cmd cmd = { .cmd = ETHTOOL_GSET };
@@ -1768,8 +1796,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
ethtool_op_get_flags));
break;
case ETHTOOL_SFLAGS:
- rc = ethtool_set_value(dev, useraddr,
- dev->ethtool_ops->set_flags);
+ rc = ethtool_set_value(dev, useraddr, __ethtool_set_flags);
break;
case ETHTOOL_GPFLAGS:
rc = ethtool_get_value(dev, useraddr, ethcmd,
--
1.7.2.3
^ permalink raw reply related
* [PATCH v6 5/9] net: Introduce new feature setting ops
From: Michał Mirosław @ 2011-02-16 2:59 UTC (permalink / raw)
To: netdev; +Cc: Ben Hutchings, David Miller
In-Reply-To: <cover.1297824704.git.mirq-linux@rere.qmqm.pl>
This introduces a new framework to handle device features setting.
It consists of:
- new fields in struct net_device:
+ hw_features - features that hw/driver supports toggling
+ wanted_features - features that user wants enabled, when possible
- new netdev_ops:
+ feat = ndo_fix_features(dev, feat) - API checking constraints for
enabling features or their combinations
+ ndo_set_features(dev) - API updating hardware state to match
changed dev->features
- new ethtool commands:
+ ETHTOOL_GFEATURES/ETHTOOL_SFEATURES: get/set dev->wanted_features
and trigger device reconfiguration if resulting dev->features
changed
+ ETHTOOL_GSTRINGS(ETH_SS_FEATURES): get feature bits names (meaning)
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
include/linux/ethtool.h | 85 ++++++++++++++++++++++++++++++
include/linux/netdevice.h | 37 +++++++++++++-
net/core/dev.c | 46 ++++++++++++++--
net/core/ethtool.c | 125 ++++++++++++++++++++++++++++++++++++++++++++-
4 files changed, 283 insertions(+), 10 deletions(-)
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 1908929..806e716 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -251,6 +251,7 @@ enum ethtool_stringset {
ETH_SS_STATS,
ETH_SS_PRIV_FLAGS,
ETH_SS_NTUPLE_FILTERS,
+ ETH_SS_FEATURES,
};
/* for passing string sets for data tagging */
@@ -523,6 +524,87 @@ struct ethtool_flash {
char data[ETHTOOL_FLASH_MAX_FILENAME];
};
+/* for returning and changing feature sets */
+
+/**
+ * struct ethtool_get_features_block - block with state of 32 features
+ * @available: mask of changeable features
+ * @requested: mask of features requested to be enabled if possible
+ * @active: mask of currently enabled features
+ * @never_changed: mask of features not changeable for any device
+ */
+struct ethtool_get_features_block {
+ __u32 available;
+ __u32 requested;
+ __u32 active;
+ __u32 never_changed;
+};
+
+/**
+ * struct ethtool_gfeatures - command to get state of device's features
+ * @cmd: command number = %ETHTOOL_GFEATURES
+ * @size: in: number of elements in the features[] array;
+ * out: number of elements in features[] needed to hold all features
+ * @features: state of features
+ */
+struct ethtool_gfeatures {
+ __u32 cmd;
+ __u32 size;
+ struct ethtool_get_features_block features[0];
+};
+
+/**
+ * struct ethtool_set_features_block - block with request for 32 features
+ * @valid: mask of features to be changed
+ * @requested: values of features to be changed
+ */
+struct ethtool_set_features_block {
+ __u32 valid;
+ __u32 requested;
+};
+
+/**
+ * struct ethtool_sfeatures - command to request change in device's features
+ * @cmd: command number = %ETHTOOL_SFEATURES
+ * @size: array size of the features[] array
+ * @features: feature change masks
+ */
+struct ethtool_sfeatures {
+ __u32 cmd;
+ __u32 size;
+ struct ethtool_set_features_block features[0];
+};
+
+/*
+ * %ETHTOOL_SFEATURES changes features present in features[].valid to the
+ * values of corresponding bits in features[].requested. Bits in .requested
+ * not set in .valid or not changeable are ignored.
+ *
+ * Returns %EINVAL when .valid contains undefined or never-changable bits
+ * or size is not equal to required number of features words (32-bit blocks).
+ * Returns >= 0 if request was completed; bits set in the value mean:
+ * %ETHTOOL_F_UNSUPPORTED - there were bits set in .valid that are not
+ * changeable (not present in %ETHTOOL_GFEATURES' features[].available)
+ * those bits were ignored.
+ * %ETHTOOL_F_WISH - some or all changes requested were recorded but the
+ * resulting state of bits masked by .valid is not equal to .requested.
+ * Probably there are other device-specific constraints on some features
+ * in the set. When %ETHTOOL_F_UNSUPPORTED is set, .valid is considered
+ * here as though ignored bits were cleared.
+ *
+ * Meaning of bits in the masks are obtained by %ETHTOOL_GSSET_INFO (number of
+ * bits in the arrays - always multiple of 32) and %ETHTOOL_GSTRINGS commands
+ * for ETH_SS_FEATURES string set. First entry in the table corresponds to least
+ * significant bit in features[0] fields. Empty strings mark undefined features.
+ */
+enum ethtool_sfeatures_retval_bits {
+ ETHTOOL_F_UNSUPPORTED__BIT,
+ ETHTOOL_F_WISH__BIT,
+};
+
+#define ETHTOOL_F_UNSUPPORTED (1 << ETHTOOL_F_UNSUPPORTED__BIT)
+#define ETHTOOL_F_WISH (1 << ETHTOOL_F_WISH__BIT)
+
#ifdef __KERNEL__
#include <linux/rculist.h>
@@ -744,6 +826,9 @@ struct ethtool_ops {
#define ETHTOOL_GRXFHINDIR 0x00000038 /* Get RX flow hash indir'n table */
#define ETHTOOL_SRXFHINDIR 0x00000039 /* Set RX flow hash indir'n table */
+#define ETHTOOL_GFEATURES 0x0000003a /* Get device offload settings */
+#define ETHTOOL_SFEATURES 0x0000003b /* Change device offload settings */
+
/* compatibility with older code */
#define SPARC_ETH_GSET ETHTOOL_GSET
#define SPARC_ETH_SSET ETHTOOL_SSET
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index dede3fd..85f67e2 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -791,6 +791,18 @@ struct netdev_tc_txq {
*
* int (*ndo_del_slave)(struct net_device *dev, struct net_device *slave_dev);
* Called to release previously enslaved netdev.
+ *
+ * Feature/offload setting functions.
+ * u32 (*ndo_fix_features)(struct net_device *dev, u32 features);
+ * Adjusts the requested feature flags according to device-specific
+ * constraints, and returns the resulting flags. Must not modify
+ * the device state.
+ *
+ * int (*ndo_set_features)(struct net_device *dev, u32 features);
+ * Called to update device configuration to new features. Passed
+ * feature set might be less than what was returned by ndo_fix_features()).
+ * Must return >0 or -errno if it changed dev->features itself.
+ *
*/
#define HAVE_NET_DEVICE_OPS
struct net_device_ops {
@@ -874,6 +886,10 @@ struct net_device_ops {
struct net_device *slave_dev);
int (*ndo_del_slave)(struct net_device *dev,
struct net_device *slave_dev);
+ u32 (*ndo_fix_features)(struct net_device *dev,
+ u32 features);
+ int (*ndo_set_features)(struct net_device *dev,
+ u32 features);
};
/*
@@ -925,12 +941,18 @@ struct net_device {
struct list_head napi_list;
struct list_head unreg_list;
- /* Net device features */
+ /* currently active device features */
u32 features;
-
+ /* user-changeable features */
+ u32 hw_features;
+ /* user-requested features */
+ u32 wanted_features;
/* VLAN feature mask */
u32 vlan_features;
+ /* Net device feature bits; if you change something,
+ * also update netdev_features_strings[] in ethtool.c */
+
#define NETIF_F_SG 1 /* Scatter/gather IO. */
#define NETIF_F_IP_CSUM 2 /* Can checksum TCP/UDP over IPv4. */
#define NETIF_F_NO_CSUM 4 /* Does not require checksum. F.e. loopack. */
@@ -966,6 +988,12 @@ struct net_device {
#define NETIF_F_TSO6 (SKB_GSO_TCPV6 << NETIF_F_GSO_SHIFT)
#define NETIF_F_FSO (SKB_GSO_FCOE << NETIF_F_GSO_SHIFT)
+ /* Features valid for ethtool to change */
+ /* = all defined minus driver/device-class-related */
+#define NETIF_F_NEVER_CHANGE (NETIF_F_HIGHDMA | NETIF_F_VLAN_CHALLENGED | \
+ NETIF_F_LLTX | NETIF_F_NETNS_LOCAL)
+#define NETIF_F_ETHTOOL_BITS (0x1f3fffff & ~NETIF_F_NEVER_CHANGE)
+
/* List of features with software fallbacks. */
#define NETIF_F_GSO_SOFTWARE (NETIF_F_TSO | NETIF_F_TSO_ECN | \
NETIF_F_TSO6 | NETIF_F_UFO)
@@ -2428,8 +2456,13 @@ extern char *netdev_drivername(const struct net_device *dev, char *buffer, int l
extern void linkwatch_run_queue(void);
+static inline u32 netdev_get_wanted_features(struct net_device *dev)
+{
+ return (dev->features & ~dev->hw_features) | dev->wanted_features;
+}
u32 netdev_increment_features(u32 all, u32 one, u32 mask);
u32 netdev_fix_features(struct net_device *dev, u32 features);
+void netdev_update_features(struct net_device *dev);
void netif_stacked_transfer_operstate(const struct net_device *rootdev,
struct net_device *dev);
diff --git a/net/core/dev.c b/net/core/dev.c
index 8686f6f..4f69439 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5302,6 +5302,37 @@ u32 netdev_fix_features(struct net_device *dev, u32 features)
}
EXPORT_SYMBOL(netdev_fix_features);
+void netdev_update_features(struct net_device *dev)
+{
+ u32 features;
+ int err = 0;
+
+ features = netdev_get_wanted_features(dev);
+
+ if (dev->netdev_ops->ndo_fix_features)
+ features = dev->netdev_ops->ndo_fix_features(dev, features);
+
+ /* driver might be less strict about feature dependencies */
+ features = netdev_fix_features(dev, features);
+
+ if (dev->features == features)
+ return;
+
+ netdev_info(dev, "Features changed: 0x%08x -> 0x%08x\n",
+ dev->features, features);
+
+ if (dev->netdev_ops->ndo_set_features)
+ err = dev->netdev_ops->ndo_set_features(dev, features);
+
+ if (!err)
+ dev->features = features;
+ else if (err < 0)
+ netdev_err(dev,
+ "set_features() failed (%d); wanted 0x%08x, left 0x%08x\n",
+ err, features, dev->features);
+}
+EXPORT_SYMBOL(netdev_update_features);
+
/**
* netif_stacked_transfer_operstate - transfer operstate
* @rootdev: the root or lower level device to transfer state from
@@ -5436,15 +5467,18 @@ int register_netdevice(struct net_device *dev)
if (dev->iflink == -1)
dev->iflink = dev->ifindex;
- /* Enable software offloads by default - will be stripped in
- * netdev_fix_features() if not supported. */
- dev->features |= NETIF_F_SOFT_FEATURES;
+ /* Transfer changeable features to wanted_features and enable
+ * software offloads (GSO and GRO).
+ */
+ dev->hw_features |= NETIF_F_SOFT_FEATURES;
+ dev->wanted_features = (dev->features & dev->hw_features)
+ | NETIF_F_SOFT_FEATURES;
/* Avoid warning from netdev_fix_features() for GSO without SG */
- if (!(dev->features & NETIF_F_SG))
- dev->features &= ~NETIF_F_GSO;
+ if (!(dev->wanted_features & NETIF_F_SG))
+ dev->wanted_features &= ~NETIF_F_GSO;
- dev->features = netdev_fix_features(dev, dev->features);
+ netdev_update_features(dev);
/* Enable GRO and NETIF_F_HIGHDMA for vlans by default,
* vlan_dev_init() will do the dev->features check, so these features
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index c3fb8f9..9577396 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -172,10 +172,120 @@ EXPORT_SYMBOL(ethtool_ntuple_flush);
/* Handlers for each ethtool command */
+#define ETHTOOL_DEV_FEATURE_WORDS 1
+
+static int ethtool_get_features(struct net_device *dev, void __user *useraddr)
+{
+ struct ethtool_gfeatures cmd = {
+ .cmd = ETHTOOL_GFEATURES,
+ .size = ETHTOOL_DEV_FEATURE_WORDS,
+ };
+ struct ethtool_get_features_block features[ETHTOOL_DEV_FEATURE_WORDS] = {
+ {
+ .available = dev->hw_features,
+ .requested = dev->wanted_features,
+ .active = dev->features,
+ .never_changed = NETIF_F_NEVER_CHANGE,
+ },
+ };
+ u32 __user *sizeaddr;
+ u32 copy_size;
+
+ sizeaddr = useraddr + offsetof(struct ethtool_gfeatures, size);
+ if (get_user(copy_size, sizeaddr))
+ return -EFAULT;
+
+ if (copy_size > ETHTOOL_DEV_FEATURE_WORDS)
+ copy_size = ETHTOOL_DEV_FEATURE_WORDS;
+
+ if (copy_to_user(useraddr, &cmd, sizeof(cmd)))
+ return -EFAULT;
+ useraddr += sizeof(cmd);
+ if (copy_to_user(useraddr, features, copy_size * sizeof(*features)))
+ return -EFAULT;
+
+ return 0;
+}
+
+static int ethtool_set_features(struct net_device *dev, void __user *useraddr)
+{
+ struct ethtool_sfeatures cmd;
+ struct ethtool_set_features_block features[ETHTOOL_DEV_FEATURE_WORDS];
+ int ret = 0;
+
+ if (copy_from_user(&cmd, useraddr, sizeof(cmd)))
+ return -EFAULT;
+ useraddr += sizeof(cmd);
+
+ if (cmd.size != ETHTOOL_DEV_FEATURE_WORDS)
+ return -EINVAL;
+
+ if (copy_from_user(features, useraddr, sizeof(features)))
+ return -EFAULT;
+
+ if (features[0].valid & ~NETIF_F_ETHTOOL_BITS)
+ return -EINVAL;
+
+ if (features[0].valid & ~dev->hw_features) {
+ features[0].valid &= dev->hw_features;
+ ret |= ETHTOOL_F_UNSUPPORTED;
+ }
+
+ dev->wanted_features &= ~features[0].valid;
+ dev->wanted_features |= features[0].valid & features[0].requested;
+ netdev_update_features(dev);
+
+ if ((dev->wanted_features ^ dev->features) & features[0].valid)
+ ret |= ETHTOOL_F_WISH;
+
+ return ret;
+}
+
+static const char netdev_features_strings[ETHTOOL_DEV_FEATURE_WORDS * 32][ETH_GSTRING_LEN] = {
+ /* NETIF_F_SG */ "tx-scatter-gather",
+ /* NETIF_F_IP_CSUM */ "tx-checksum-ipv4",
+ /* NETIF_F_NO_CSUM */ "tx-checksum-unneeded",
+ /* NETIF_F_HW_CSUM */ "tx-checksum-ip-generic",
+ /* NETIF_F_IPV6_CSUM */ "tx_checksum-ipv6",
+ /* NETIF_F_HIGHDMA */ "highdma",
+ /* NETIF_F_FRAGLIST */ "tx-scatter-gather-fraglist",
+ /* NETIF_F_HW_VLAN_TX */ "tx-vlan-hw-insert",
+
+ /* NETIF_F_HW_VLAN_RX */ "rx-vlan-hw-parse",
+ /* NETIF_F_HW_VLAN_FILTER */ "rx-vlan-filter",
+ /* NETIF_F_VLAN_CHALLENGED */ "vlan-challenged",
+ /* NETIF_F_GSO */ "tx-generic-segmentation",
+ /* NETIF_F_LLTX */ "tx-lockless",
+ /* NETIF_F_NETNS_LOCAL */ "netns-local",
+ /* NETIF_F_GRO */ "rx-gro",
+ /* NETIF_F_LRO */ "rx-lro",
+
+ /* NETIF_F_TSO */ "tx-tcp-segmentation",
+ /* NETIF_F_UFO */ "tx-udp-fragmentation",
+ /* NETIF_F_GSO_ROBUST */ "tx-gso-robust",
+ /* NETIF_F_TSO_ECN */ "tx-tcp-ecn-segmentation",
+ /* NETIF_F_TSO6 */ "tx-tcp6-segmentation",
+ /* NETIF_F_FSO */ "tx-fcoe-segmentation",
+ "",
+ "",
+
+ /* NETIF_F_FCOE_CRC */ "tx-checksum-fcoe-crc",
+ /* NETIF_F_SCTP_CSUM */ "tx-checksum-sctp",
+ /* NETIF_F_FCOE_MTU */ "fcoe-mtu",
+ /* NETIF_F_NTUPLE */ "rx-ntuple-filter",
+ /* NETIF_F_RXHASH */ "rx-hashing",
+ "",
+ "",
+ "",
+};
+
static int __ethtool_get_sset_count(struct net_device *dev, int sset)
{
const struct ethtool_ops *ops = dev->ethtool_ops;
+ if (sset == ETH_SS_FEATURES)
+ return ARRAY_SIZE(netdev_features_strings);
+
if (ops && ops->get_sset_count && ops->get_strings)
return ops->get_sset_count(dev, sset);
else
@@ -187,8 +297,12 @@ static void __ethtool_get_strings(struct net_device *dev,
{
const struct ethtool_ops *ops = dev->ethtool_ops;
- /* ops->get_strings is valid because checked earlier */
- ops->get_strings(dev, stringset, data);
+ if (stringset == ETH_SS_FEATURES)
+ memcpy(data, netdev_features_strings,
+ sizeof(netdev_features_strings));
+ else
+ /* ops->get_strings is valid because checked earlier */
+ ops->get_strings(dev, stringset, data);
}
static u32 ethtool_get_feature_mask(u32 eth_cmd)
@@ -1533,6 +1647,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
case ETHTOOL_GRXCLSRLCNT:
case ETHTOOL_GRXCLSRULE:
case ETHTOOL_GRXCLSRLALL:
+ case ETHTOOL_GFEATURES:
break;
default:
if (!capable(CAP_NET_ADMIN))
@@ -1678,6 +1793,12 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
case ETHTOOL_SRXFHINDIR:
rc = ethtool_set_rxfh_indir(dev, useraddr);
break;
+ case ETHTOOL_GFEATURES:
+ rc = ethtool_get_features(dev, useraddr);
+ break;
+ case ETHTOOL_SFEATURES:
+ rc = ethtool_set_features(dev, useraddr);
+ break;
case ETHTOOL_GTXCSUM:
case ETHTOOL_GSG:
case ETHTOOL_GTSO:
--
1.7.2.3
^ permalink raw reply related
* [PATCH v6 6/9] net: ethtool: use ndo_fix_features for offload setting
From: Michał Mirosław @ 2011-02-16 2:59 UTC (permalink / raw)
To: netdev; +Cc: Ben Hutchings, David Miller
In-Reply-To: <cover.1297824704.git.mirq-linux@rere.qmqm.pl>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
net/core/ethtool.c | 45 ++++++++++++++++++++++++++++++++-------------
1 files changed, 32 insertions(+), 13 deletions(-)
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 9577396..6599997 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -357,15 +357,21 @@ static void *__ethtool_get_one_feature_actor(struct net_device *dev, u32 ethcmd)
static int ethtool_get_one_feature(struct net_device *dev,
char __user *useraddr, u32 ethcmd)
{
+ u32 mask = ethtool_get_feature_mask(ethcmd);
struct ethtool_value edata = {
.cmd = ethcmd,
- .data = !!(dev->features & ethtool_get_feature_mask(ethcmd)),
+ .data = !!(dev->features & mask),
};
- u32 (*actor)(struct net_device *);
- actor = __ethtool_get_one_feature_actor(dev, ethcmd);
- if (actor)
- edata.data = actor(dev);
+ /* compatibility with discrete get_ ops */
+ if (!(dev->hw_features & mask)) {
+ u32 (*actor)(struct net_device *);
+
+ actor = __ethtool_get_one_feature_actor(dev, ethcmd);
+
+ if (actor)
+ edata.data = actor(dev);
+ }
if (copy_to_user(useraddr, &edata, sizeof(edata)))
return -EFAULT;
@@ -386,6 +392,27 @@ static int ethtool_set_one_feature(struct net_device *dev,
if (copy_from_user(&edata, useraddr, sizeof(edata)))
return -EFAULT;
+ mask = ethtool_get_feature_mask(ethcmd);
+ mask &= dev->hw_features;
+ if (mask) {
+ if (edata.data)
+ dev->wanted_features |= mask;
+ else
+ dev->wanted_features &= ~mask;
+
+ netdev_update_features(dev);
+ return 0;
+ }
+
+ /* Driver is not converted to ndo_fix_features or does not
+ * support changing this offload. In the latter case it won't
+ * have corresponding ethtool_ops field set.
+ *
+ * Following part is to be removed after all drivers advertise
+ * their changeable features in netdev->hw_features and stop
+ * using discrete offload setting ops.
+ */
+
switch (ethcmd) {
case ETHTOOL_STXCSUM:
return __ethtool_set_tx_csum(dev, edata.data);
@@ -395,14 +422,6 @@ static int ethtool_set_one_feature(struct net_device *dev,
return __ethtool_set_tso(dev, edata.data);
case ETHTOOL_SUFO:
return __ethtool_set_ufo(dev, edata.data);
- case ETHTOOL_SGSO:
- case ETHTOOL_SGRO:
- mask = ethtool_get_feature_mask(ethcmd);
- if (edata.data)
- dev->features |= mask;
- else
- dev->features &= ~mask;
- return 0;
default:
return -EOPNOTSUPP;
}
--
1.7.2.3
^ permalink raw reply related
* [PATCH v6 2/9] ethtool: enable GSO and GRO by default
From: Michał Mirosław @ 2011-02-16 2:59 UTC (permalink / raw)
To: netdev; +Cc: Ben Hutchings, David Miller
In-Reply-To: <cover.1297824704.git.mirq-linux@rere.qmqm.pl>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
include/linux/netdevice.h | 3 +++
net/core/dev.c | 18 ++++++++++++++----
2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d08ef65..168e3ad 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -984,6 +984,9 @@ struct net_device {
NETIF_F_SG | NETIF_F_HIGHDMA | \
NETIF_F_FRAGLIST)
+ /* changeable features with no special hardware requirements */
+#define NETIF_F_SOFT_FEATURES (NETIF_F_GSO | NETIF_F_GRO)
+
/* Interface index. Unique device identifier */
int ifindex;
int iflink;
diff --git a/net/core/dev.c b/net/core/dev.c
index 4580460..8686f6f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5274,6 +5274,12 @@ u32 netdev_fix_features(struct net_device *dev, u32 features)
features &= ~NETIF_F_TSO;
}
+ /* Software GSO depends on SG. */
+ if ((features & NETIF_F_GSO) && !(features & NETIF_F_SG)) {
+ netdev_info(dev, "Dropping NETIF_F_GSO since no SG feature.\n");
+ features &= ~NETIF_F_GSO;
+ }
+
/* UFO needs SG and checksumming */
if (features & NETIF_F_UFO) {
/* maybe split UFO into V4 and V6? */
@@ -5430,12 +5436,16 @@ int register_netdevice(struct net_device *dev)
if (dev->iflink == -1)
dev->iflink = dev->ifindex;
+ /* Enable software offloads by default - will be stripped in
+ * netdev_fix_features() if not supported. */
+ dev->features |= NETIF_F_SOFT_FEATURES;
+
+ /* Avoid warning from netdev_fix_features() for GSO without SG */
+ if (!(dev->features & NETIF_F_SG))
+ dev->features &= ~NETIF_F_GSO;
+
dev->features = netdev_fix_features(dev, dev->features);
- /* Enable software GSO if SG is supported. */
- if (dev->features & NETIF_F_SG)
- dev->features |= NETIF_F_GSO;
-
/* Enable GRO and NETIF_F_HIGHDMA for vlans by default,
* vlan_dev_init() will do the dev->features check, so these features
* are enabled only if supported by underlying device.
--
1.7.2.3
^ permalink raw reply related
* [PATCH v6 1/9] ethtool: move EXPORT_SYMBOL(ethtool_op_set_tx_csum) to correct place
From: Michał Mirosław @ 2011-02-16 2:59 UTC (permalink / raw)
To: netdev; +Cc: Ben Hutchings, David Miller
In-Reply-To: <cover.1297824704.git.mirq-linux@rere.qmqm.pl>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
net/core/ethtool.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 5984ee0..9eb8277 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -55,6 +55,7 @@ int ethtool_op_set_tx_csum(struct net_device *dev, u32 data)
return 0;
}
+EXPORT_SYMBOL(ethtool_op_set_tx_csum);
int ethtool_op_set_tx_hw_csum(struct net_device *dev, u32 data)
{
@@ -1124,7 +1125,6 @@ static int ethtool_set_tx_csum(struct net_device *dev, char __user *useraddr)
return dev->ethtool_ops->set_tx_csum(dev, edata.data);
}
-EXPORT_SYMBOL(ethtool_op_set_tx_csum);
static int ethtool_set_rx_csum(struct net_device *dev, char __user *useraddr)
{
--
1.7.2.3
^ permalink raw reply related
* [PATCH v6 4/9] ethtool: factorize get/set_one_feature
From: Michał Mirosław @ 2011-02-16 2:59 UTC (permalink / raw)
To: netdev; +Cc: Ben Hutchings, David Miller
In-Reply-To: <cover.1297824704.git.mirq-linux@rere.qmqm.pl>
This allows to enable GRO even if RX csum is disabled. GRO will not
be used for packets without hardware checksum anyway.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
---
include/linux/netdevice.h | 6 +
net/core/ethtool.c | 274 ++++++++++++++++++++++-----------------------
2 files changed, 138 insertions(+), 142 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 168e3ad..dede3fd 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -976,6 +976,12 @@ struct net_device {
#define NETIF_F_V6_CSUM (NETIF_F_GEN_CSUM | NETIF_F_IPV6_CSUM)
#define NETIF_F_ALL_CSUM (NETIF_F_V4_CSUM | NETIF_F_V6_CSUM)
+#define NETIF_F_ALL_TSO (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_TSO_ECN)
+
+#define NETIF_F_ALL_TX_OFFLOADS (NETIF_F_ALL_CSUM | NETIF_F_SG | \
+ NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \
+ NETIF_F_SCTP_CSUM | NETIF_F_FCOE_CRC)
+
/*
* If one device supports one of these features, then enable them
* for all in netdev_increment_features.
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 85aaeab..c3fb8f9 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -191,6 +191,109 @@ static void __ethtool_get_strings(struct net_device *dev,
ops->get_strings(dev, stringset, data);
}
+static u32 ethtool_get_feature_mask(u32 eth_cmd)
+{
+ /* feature masks of legacy discrete ethtool ops */
+
+ switch (eth_cmd) {
+ case ETHTOOL_GTXCSUM:
+ case ETHTOOL_STXCSUM:
+ return NETIF_F_ALL_CSUM | NETIF_F_SCTP_CSUM;
+ case ETHTOOL_GSG:
+ case ETHTOOL_SSG:
+ return NETIF_F_SG;
+ case ETHTOOL_GTSO:
+ case ETHTOOL_STSO:
+ return NETIF_F_ALL_TSO;
+ case ETHTOOL_GUFO:
+ case ETHTOOL_SUFO:
+ return NETIF_F_UFO;
+ case ETHTOOL_GGSO:
+ case ETHTOOL_SGSO:
+ return NETIF_F_GSO;
+ case ETHTOOL_GGRO:
+ case ETHTOOL_SGRO:
+ return NETIF_F_GRO;
+ default:
+ BUG();
+ }
+}
+
+static void *__ethtool_get_one_feature_actor(struct net_device *dev, u32 ethcmd)
+{
+ const struct ethtool_ops *ops = dev->ethtool_ops;
+
+ if (!ops)
+ return NULL;
+
+ switch (ethcmd) {
+ case ETHTOOL_GTXCSUM:
+ return ops->get_tx_csum;
+ case ETHTOOL_SSG:
+ return ops->get_sg;
+ case ETHTOOL_STSO:
+ return ops->get_tso;
+ case ETHTOOL_SUFO:
+ return ops->get_ufo;
+ default:
+ return NULL;
+ }
+}
+
+static int ethtool_get_one_feature(struct net_device *dev,
+ char __user *useraddr, u32 ethcmd)
+{
+ struct ethtool_value edata = {
+ .cmd = ethcmd,
+ .data = !!(dev->features & ethtool_get_feature_mask(ethcmd)),
+ };
+ u32 (*actor)(struct net_device *);
+
+ actor = __ethtool_get_one_feature_actor(dev, ethcmd);
+ if (actor)
+ edata.data = actor(dev);
+
+ if (copy_to_user(useraddr, &edata, sizeof(edata)))
+ return -EFAULT;
+ return 0;
+}
+
+static int __ethtool_set_tx_csum(struct net_device *dev, u32 data);
+static int __ethtool_set_sg(struct net_device *dev, u32 data);
+static int __ethtool_set_tso(struct net_device *dev, u32 data);
+static int __ethtool_set_ufo(struct net_device *dev, u32 data);
+
+static int ethtool_set_one_feature(struct net_device *dev,
+ void __user *useraddr, u32 ethcmd)
+{
+ struct ethtool_value edata;
+ u32 mask;
+
+ if (copy_from_user(&edata, useraddr, sizeof(edata)))
+ return -EFAULT;
+
+ switch (ethcmd) {
+ case ETHTOOL_STXCSUM:
+ return __ethtool_set_tx_csum(dev, edata.data);
+ case ETHTOOL_SSG:
+ return __ethtool_set_sg(dev, edata.data);
+ case ETHTOOL_STSO:
+ return __ethtool_set_tso(dev, edata.data);
+ case ETHTOOL_SUFO:
+ return __ethtool_set_ufo(dev, edata.data);
+ case ETHTOOL_SGSO:
+ case ETHTOOL_SGRO:
+ mask = ethtool_get_feature_mask(ethcmd);
+ if (edata.data)
+ dev->features |= mask;
+ else
+ dev->features &= ~mask;
+ return 0;
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
static int ethtool_get_settings(struct net_device *dev, void __user *useraddr)
{
struct ethtool_cmd cmd = { .cmd = ETHTOOL_GSET };
@@ -1107,6 +1210,9 @@ static int __ethtool_set_sg(struct net_device *dev, u32 data)
{
int err;
+ if (data && !(dev->features & NETIF_F_ALL_CSUM))
+ return -EINVAL;
+
if (!data && dev->ethtool_ops->set_tso) {
err = dev->ethtool_ops->set_tso(dev, 0);
if (err)
@@ -1121,24 +1227,20 @@ static int __ethtool_set_sg(struct net_device *dev, u32 data)
return dev->ethtool_ops->set_sg(dev, data);
}
-static int ethtool_set_tx_csum(struct net_device *dev, char __user *useraddr)
+static int __ethtool_set_tx_csum(struct net_device *dev, u32 data)
{
- struct ethtool_value edata;
int err;
if (!dev->ethtool_ops->set_tx_csum)
return -EOPNOTSUPP;
- if (copy_from_user(&edata, useraddr, sizeof(edata)))
- return -EFAULT;
-
- if (!edata.data && dev->ethtool_ops->set_sg) {
+ if (!data && dev->ethtool_ops->set_sg) {
err = __ethtool_set_sg(dev, 0);
if (err)
return err;
}
- return dev->ethtool_ops->set_tx_csum(dev, edata.data);
+ return dev->ethtool_ops->set_tx_csum(dev, data);
}
static int ethtool_set_rx_csum(struct net_device *dev, char __user *useraddr)
@@ -1157,108 +1259,28 @@ static int ethtool_set_rx_csum(struct net_device *dev, char __user *useraddr)
return dev->ethtool_ops->set_rx_csum(dev, edata.data);
}
-static int ethtool_set_sg(struct net_device *dev, char __user *useraddr)
+static int __ethtool_set_tso(struct net_device *dev, u32 data)
{
- struct ethtool_value edata;
-
- if (!dev->ethtool_ops->set_sg)
- return -EOPNOTSUPP;
-
- if (copy_from_user(&edata, useraddr, sizeof(edata)))
- return -EFAULT;
-
- if (edata.data &&
- !(dev->features & NETIF_F_ALL_CSUM))
- return -EINVAL;
-
- return __ethtool_set_sg(dev, edata.data);
-}
-
-static int ethtool_set_tso(struct net_device *dev, char __user *useraddr)
-{
- struct ethtool_value edata;
-
if (!dev->ethtool_ops->set_tso)
return -EOPNOTSUPP;
- if (copy_from_user(&edata, useraddr, sizeof(edata)))
- return -EFAULT;
-
- if (edata.data && !(dev->features & NETIF_F_SG))
+ if (data && !(dev->features & NETIF_F_SG))
return -EINVAL;
- return dev->ethtool_ops->set_tso(dev, edata.data);
+ return dev->ethtool_ops->set_tso(dev, data);
}
-static int ethtool_set_ufo(struct net_device *dev, char __user *useraddr)
+static int __ethtool_set_ufo(struct net_device *dev, u32 data)
{
- struct ethtool_value edata;
-
if (!dev->ethtool_ops->set_ufo)
return -EOPNOTSUPP;
- if (copy_from_user(&edata, useraddr, sizeof(edata)))
- return -EFAULT;
- if (edata.data && !(dev->features & NETIF_F_SG))
+ if (data && !(dev->features & NETIF_F_SG))
return -EINVAL;
- if (edata.data && !((dev->features & NETIF_F_GEN_CSUM) ||
+ if (data && !((dev->features & NETIF_F_GEN_CSUM) ||
(dev->features & (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))
== (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM)))
return -EINVAL;
- return dev->ethtool_ops->set_ufo(dev, edata.data);
-}
-
-static int ethtool_get_gso(struct net_device *dev, char __user *useraddr)
-{
- struct ethtool_value edata = { ETHTOOL_GGSO };
-
- edata.data = dev->features & NETIF_F_GSO;
- if (copy_to_user(useraddr, &edata, sizeof(edata)))
- return -EFAULT;
- return 0;
-}
-
-static int ethtool_set_gso(struct net_device *dev, char __user *useraddr)
-{
- struct ethtool_value edata;
-
- if (copy_from_user(&edata, useraddr, sizeof(edata)))
- return -EFAULT;
- if (edata.data)
- dev->features |= NETIF_F_GSO;
- else
- dev->features &= ~NETIF_F_GSO;
- return 0;
-}
-
-static int ethtool_get_gro(struct net_device *dev, char __user *useraddr)
-{
- struct ethtool_value edata = { ETHTOOL_GGRO };
-
- edata.data = dev->features & NETIF_F_GRO;
- if (copy_to_user(useraddr, &edata, sizeof(edata)))
- return -EFAULT;
- return 0;
-}
-
-static int ethtool_set_gro(struct net_device *dev, char __user *useraddr)
-{
- struct ethtool_value edata;
-
- if (copy_from_user(&edata, useraddr, sizeof(edata)))
- return -EFAULT;
-
- if (edata.data) {
- u32 rxcsum = dev->ethtool_ops->get_rx_csum ?
- dev->ethtool_ops->get_rx_csum(dev) :
- ethtool_op_get_rx_csum(dev);
-
- if (!rxcsum)
- return -EINVAL;
- dev->features |= NETIF_F_GRO;
- } else
- dev->features &= ~NETIF_F_GRO;
-
- return 0;
+ return dev->ethtool_ops->set_ufo(dev, data);
}
static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
@@ -1590,33 +1612,6 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
case ETHTOOL_SRXCSUM:
rc = ethtool_set_rx_csum(dev, useraddr);
break;
- case ETHTOOL_GTXCSUM:
- rc = ethtool_get_value(dev, useraddr, ethcmd,
- (dev->ethtool_ops->get_tx_csum ?
- dev->ethtool_ops->get_tx_csum :
- ethtool_op_get_tx_csum));
- break;
- case ETHTOOL_STXCSUM:
- rc = ethtool_set_tx_csum(dev, useraddr);
- break;
- case ETHTOOL_GSG:
- rc = ethtool_get_value(dev, useraddr, ethcmd,
- (dev->ethtool_ops->get_sg ?
- dev->ethtool_ops->get_sg :
- ethtool_op_get_sg));
- break;
- case ETHTOOL_SSG:
- rc = ethtool_set_sg(dev, useraddr);
- break;
- case ETHTOOL_GTSO:
- rc = ethtool_get_value(dev, useraddr, ethcmd,
- (dev->ethtool_ops->get_tso ?
- dev->ethtool_ops->get_tso :
- ethtool_op_get_tso));
- break;
- case ETHTOOL_STSO:
- rc = ethtool_set_tso(dev, useraddr);
- break;
case ETHTOOL_TEST:
rc = ethtool_self_test(dev, useraddr);
break;
@@ -1632,21 +1627,6 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
case ETHTOOL_GPERMADDR:
rc = ethtool_get_perm_addr(dev, useraddr);
break;
- case ETHTOOL_GUFO:
- rc = ethtool_get_value(dev, useraddr, ethcmd,
- (dev->ethtool_ops->get_ufo ?
- dev->ethtool_ops->get_ufo :
- ethtool_op_get_ufo));
- break;
- case ETHTOOL_SUFO:
- rc = ethtool_set_ufo(dev, useraddr);
- break;
- case ETHTOOL_GGSO:
- rc = ethtool_get_gso(dev, useraddr);
- break;
- case ETHTOOL_SGSO:
- rc = ethtool_set_gso(dev, useraddr);
- break;
case ETHTOOL_GFLAGS:
rc = ethtool_get_value(dev, useraddr, ethcmd,
(dev->ethtool_ops->get_flags ?
@@ -1677,12 +1657,6 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
case ETHTOOL_SRXCLSRLINS:
rc = ethtool_set_rxnfc(dev, ethcmd, useraddr);
break;
- case ETHTOOL_GGRO:
- rc = ethtool_get_gro(dev, useraddr);
- break;
- case ETHTOOL_SGRO:
- rc = ethtool_set_gro(dev, useraddr);
- break;
case ETHTOOL_FLASHDEV:
rc = ethtool_flash_device(dev, useraddr);
break;
@@ -1704,6 +1678,22 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
case ETHTOOL_SRXFHINDIR:
rc = ethtool_set_rxfh_indir(dev, useraddr);
break;
+ case ETHTOOL_GTXCSUM:
+ case ETHTOOL_GSG:
+ case ETHTOOL_GTSO:
+ case ETHTOOL_GUFO:
+ case ETHTOOL_GGSO:
+ case ETHTOOL_GGRO:
+ rc = ethtool_get_one_feature(dev, useraddr, ethcmd);
+ break;
+ case ETHTOOL_STXCSUM:
+ case ETHTOOL_SSG:
+ case ETHTOOL_STSO:
+ case ETHTOOL_SUFO:
+ case ETHTOOL_SGSO:
+ case ETHTOOL_SGRO:
+ rc = ethtool_set_one_feature(dev, useraddr, ethcmd);
+ break;
default:
rc = -EOPNOTSUPP;
}
--
1.7.2.3
^ permalink raw reply related
* [PATCH v6 0/9] net: Unified offload configuration
From: Michał Mirosław @ 2011-02-16 2:59 UTC (permalink / raw)
To: netdev; +Cc: Ben Hutchings, David Miller
Here's a v6 of the ethtool unification patch series.
What's in it?
1..4:
cleanups for the core patches
5:
the patch - implement unified ethtool setting ops
6..7:
implement interoperation between old and new ethtool ops
8:
include RX checksum in features and plug it into new framework
9:
convert loopback device to new framework
What is it good for?
- unifies driver behaviour wrt hardware offloads
- removes a lot of boilerplate code from drivers
- allows better fine-grained control over used offloads
This version is not tested, yet.
Best Regards,
Michał Mirosław
v1: http://marc.info/?l=linux-netdev&m=129245188832643&w=3
Changes from v5:
- register_netdevice(): avoid warning on GSO for non-SG capable devices
- rebased on current net-next (introduction of ndo_add/del_slave)
Changes from v4:
- more split cleanups
- fix error return for ETHTOOL_SFLAGS
- fix ETHTOOL_G* compatibility for not converted drivers
Changes from v3:
- fixed kernel-doc and other comments
- added HIGHDMA to never-changeable features
- changed GFEATURES .size interpretation
- changed feature strings
- change __ethtool_set_flags() to reject invalid changes
Changes from v2:
- rebase to net-next after merging v2 leading patches
- fix missing comma in feature name table
- force NETIF_F_SOFT_FEATURES in hw_features for simpler code
(fixes a bug that disallowed changing GSO and GRO state)
Changes from v1:
- split structures for GFEATURES/SFEATURES
- naming of feature bits using GSTRINGS ETH_SS_FEATURES
- strict checking of bits used in SFEATURES call
- more comments and kernel-doc
- rebased to net-next after 2.6.37
---
Michał Mirosław (9):
ethtool: move EXPORT_SYMBOL(ethtool_op_set_tx_csum) to correct place
ethtool: enable GSO and GRO by default
ethtool: factorize ethtool_get_strings() and ethtool_get_sset_count()
ethtool: factorize get/set_one_feature
net: Introduce new feature setting ops
net: ethtool: use ndo_fix_features for offload setting
net: use ndo_fix_features for ethtool_ops->set_flags
net: introduce NETIF_F_RXCSUM
loopback: convert to hw_features
drivers/net/loopback.c | 9 +-
include/linux/ethtool.h | 86 ++++++++-
include/linux/netdevice.h | 49 ++++-
net/core/dev.c | 52 ++++-
net/core/ethtool.c | 527 +++++++++++++++++++++++++++++---------------
5 files changed, 531 insertions(+), 192 deletions(-)
--
1.7.2.3
^ permalink raw reply
* Re: [RFC !!BONUS!! PATCH 6/5] ipv4: Delete routing cache.
From: David Miller @ 2011-02-16 2:55 UTC (permalink / raw)
To: netdev
In-Reply-To: <20110209.223939.246547003.davem@davemloft.net>
From: David Miller <davem@davemloft.net>
Date: Wed, 09 Feb 2011 22:39:39 -0800 (PST)
>
> Signed-off-by: David S. Miller <davem@davemloft.net>
Ok, this patch had one nasty bug:
> + if (!err == 0)
Yeah... right.
I'm actively testing this version at the moment, against net-next-2.6,
works fine thus far.
--------------------
ipv4: Delete routing cache.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
include/net/route.h | 1 -
net/ipv4/fib_frontend.c | 5 -
net/ipv4/route.c | 891 ++---------------------------------------------
3 files changed, 23 insertions(+), 874 deletions(-)
diff --git a/include/net/route.h b/include/net/route.h
index bf790c1..fcf1b11 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -117,7 +117,6 @@ extern int ip_rt_init(void);
extern void ip_rt_redirect(__be32 old_gw, __be32 dst, __be32 new_gw,
__be32 src, struct net_device *dev);
extern void rt_cache_flush(struct net *net, int how);
-extern void rt_cache_flush_batch(struct net *net);
extern int __ip_route_output_key(struct net *, struct rtable **, const struct flowi *flp);
extern int ip_route_output_key(struct net *, struct rtable **, struct flowi *flp);
extern int ip_route_output_flow(struct net *, struct rtable **rp, struct flowi *flp, struct sock *sk, int flags);
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 2a49c06..694145c 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -978,11 +978,6 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo
rt_cache_flush(dev_net(dev), 0);
break;
case NETDEV_UNREGISTER_BATCH:
- /* The batch unregister is only called on the first
- * device in the list of devices being unregistered.
- * Therefore we should not pass dev_net(dev) in here.
- */
- rt_cache_flush_batch(NULL);
break;
}
return NOTIFY_DONE;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 756f544..58419fe 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -129,7 +129,6 @@ static int ip_rt_gc_elasticity __read_mostly = 8;
static int ip_rt_mtu_expires __read_mostly = 10 * 60 * HZ;
static int ip_rt_min_pmtu __read_mostly = 512 + 20 + 20;
static int ip_rt_min_advmss __read_mostly = 256;
-static int rt_chain_length_max __read_mostly = 20;
/*
* Interface to generic destination cache.
@@ -222,184 +221,30 @@ const __u8 ip_tos2prio[16] = {
};
-/*
- * Route cache.
- */
-
-/* The locking scheme is rather straight forward:
- *
- * 1) Read-Copy Update protects the buckets of the central route hash.
- * 2) Only writers remove entries, and they hold the lock
- * as they look at rtable reference counts.
- * 3) Only readers acquire references to rtable entries,
- * they do so with atomic increments and with the
- * lock held.
- */
-
-struct rt_hash_bucket {
- struct rtable __rcu *chain;
-};
-
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || \
- defined(CONFIG_PROVE_LOCKING)
-/*
- * Instead of using one spinlock for each rt_hash_bucket, we use a table of spinlocks
- * The size of this table is a power of two and depends on the number of CPUS.
- * (on lockdep we have a quite big spinlock_t, so keep the size down there)
- */
-#ifdef CONFIG_LOCKDEP
-# define RT_HASH_LOCK_SZ 256
-#else
-# if NR_CPUS >= 32
-# define RT_HASH_LOCK_SZ 4096
-# elif NR_CPUS >= 16
-# define RT_HASH_LOCK_SZ 2048
-# elif NR_CPUS >= 8
-# define RT_HASH_LOCK_SZ 1024
-# elif NR_CPUS >= 4
-# define RT_HASH_LOCK_SZ 512
-# else
-# define RT_HASH_LOCK_SZ 256
-# endif
-#endif
-
-static spinlock_t *rt_hash_locks;
-# define rt_hash_lock_addr(slot) &rt_hash_locks[(slot) & (RT_HASH_LOCK_SZ - 1)]
-
-static __init void rt_hash_lock_init(void)
-{
- int i;
-
- rt_hash_locks = kmalloc(sizeof(spinlock_t) * RT_HASH_LOCK_SZ,
- GFP_KERNEL);
- if (!rt_hash_locks)
- panic("IP: failed to allocate rt_hash_locks\n");
-
- for (i = 0; i < RT_HASH_LOCK_SZ; i++)
- spin_lock_init(&rt_hash_locks[i]);
-}
-#else
-# define rt_hash_lock_addr(slot) NULL
-
-static inline void rt_hash_lock_init(void)
-{
-}
-#endif
-
-static struct rt_hash_bucket *rt_hash_table __read_mostly;
-static unsigned rt_hash_mask __read_mostly;
-static unsigned int rt_hash_log __read_mostly;
-
static DEFINE_PER_CPU(struct rt_cache_stat, rt_cache_stat);
#define RT_CACHE_STAT_INC(field) __this_cpu_inc(rt_cache_stat.field)
-static inline unsigned int rt_hash(__be32 daddr, __be32 saddr, int idx,
- int genid)
-{
- return jhash_3words((__force u32)daddr, (__force u32)saddr,
- idx, genid)
- & rt_hash_mask;
-}
-
static inline int rt_genid(struct net *net)
{
return atomic_read(&net->ipv4.rt_genid);
}
#ifdef CONFIG_PROC_FS
-struct rt_cache_iter_state {
- struct seq_net_private p;
- int bucket;
- int genid;
-};
-
-static struct rtable *rt_cache_get_first(struct seq_file *seq)
-{
- struct rt_cache_iter_state *st = seq->private;
- struct rtable *r = NULL;
-
- for (st->bucket = rt_hash_mask; st->bucket >= 0; --st->bucket) {
- if (!rcu_dereference_raw(rt_hash_table[st->bucket].chain))
- continue;
- rcu_read_lock_bh();
- r = rcu_dereference_bh(rt_hash_table[st->bucket].chain);
- while (r) {
- if (dev_net(r->dst.dev) == seq_file_net(seq) &&
- r->rt_genid == st->genid)
- return r;
- r = rcu_dereference_bh(r->dst.rt_next);
- }
- rcu_read_unlock_bh();
- }
- return r;
-}
-
-static struct rtable *__rt_cache_get_next(struct seq_file *seq,
- struct rtable *r)
-{
- struct rt_cache_iter_state *st = seq->private;
-
- r = rcu_dereference_bh(r->dst.rt_next);
- while (!r) {
- rcu_read_unlock_bh();
- do {
- if (--st->bucket < 0)
- return NULL;
- } while (!rcu_dereference_raw(rt_hash_table[st->bucket].chain));
- rcu_read_lock_bh();
- r = rcu_dereference_bh(rt_hash_table[st->bucket].chain);
- }
- return r;
-}
-
-static struct rtable *rt_cache_get_next(struct seq_file *seq,
- struct rtable *r)
-{
- struct rt_cache_iter_state *st = seq->private;
- while ((r = __rt_cache_get_next(seq, r)) != NULL) {
- if (dev_net(r->dst.dev) != seq_file_net(seq))
- continue;
- if (r->rt_genid == st->genid)
- break;
- }
- return r;
-}
-
-static struct rtable *rt_cache_get_idx(struct seq_file *seq, loff_t pos)
-{
- struct rtable *r = rt_cache_get_first(seq);
-
- if (r)
- while (pos && (r = rt_cache_get_next(seq, r)))
- --pos;
- return pos ? NULL : r;
-}
-
static void *rt_cache_seq_start(struct seq_file *seq, loff_t *pos)
{
- struct rt_cache_iter_state *st = seq->private;
if (*pos)
- return rt_cache_get_idx(seq, *pos - 1);
- st->genid = rt_genid(seq_file_net(seq));
+ return NULL;
return SEQ_START_TOKEN;
}
static void *rt_cache_seq_next(struct seq_file *seq, void *v, loff_t *pos)
{
- struct rtable *r;
-
- if (v == SEQ_START_TOKEN)
- r = rt_cache_get_first(seq);
- else
- r = rt_cache_get_next(seq, v);
++*pos;
- return r;
+ return NULL;
}
static void rt_cache_seq_stop(struct seq_file *seq, void *v)
{
- if (v && v != SEQ_START_TOKEN)
- rcu_read_unlock_bh();
}
static int rt_cache_seq_show(struct seq_file *seq, void *v)
@@ -409,29 +254,6 @@ static int rt_cache_seq_show(struct seq_file *seq, void *v)
"Iface\tDestination\tGateway \tFlags\t\tRefCnt\tUse\t"
"Metric\tSource\t\tMTU\tWindow\tIRTT\tTOS\tHHRef\t"
"HHUptod\tSpecDst");
- else {
- struct rtable *r = v;
- int len;
-
- seq_printf(seq, "%s\t%08X\t%08X\t%8X\t%d\t%u\t%d\t"
- "%08X\t%d\t%u\t%u\t%02X\t%d\t%1d\t%08X%n",
- r->dst.dev ? r->dst.dev->name : "*",
- (__force u32)r->rt_dst,
- (__force u32)r->rt_gateway,
- r->rt_flags, atomic_read(&r->dst.__refcnt),
- r->dst.__use, 0, (__force u32)r->rt_src,
- dst_metric_advmss(&r->dst) + 40,
- dst_metric(&r->dst, RTAX_WINDOW),
- (int)((dst_metric(&r->dst, RTAX_RTT) >> 3) +
- dst_metric(&r->dst, RTAX_RTTVAR)),
- r->fl.fl4_tos,
- r->dst.hh ? atomic_read(&r->dst.hh->hh_refcnt) : -1,
- r->dst.hh ? (r->dst.hh->hh_output ==
- dev_queue_xmit) : 0,
- r->rt_spec_dst, &len);
-
- seq_printf(seq, "%*s\n", 127 - len, "");
- }
return 0;
}
@@ -444,8 +266,7 @@ static const struct seq_operations rt_cache_seq_ops = {
static int rt_cache_seq_open(struct inode *inode, struct file *file)
{
- return seq_open_net(inode, file, &rt_cache_seq_ops,
- sizeof(struct rt_cache_iter_state));
+ return seq_open_net(inode, file, &rt_cache_seq_ops, 0);
}
static const struct file_operations rt_cache_seq_fops = {
@@ -643,184 +464,12 @@ static inline int ip_rt_proc_init(void)
}
#endif /* CONFIG_PROC_FS */
-static inline void rt_free(struct rtable *rt)
-{
- call_rcu_bh(&rt->dst.rcu_head, dst_rcu_free);
-}
-
-static inline void rt_drop(struct rtable *rt)
-{
- ip_rt_put(rt);
- call_rcu_bh(&rt->dst.rcu_head, dst_rcu_free);
-}
-
-static inline int rt_fast_clean(struct rtable *rth)
-{
- /* Kill broadcast/multicast entries very aggresively, if they
- collide in hash table with more useful entries */
- return (rth->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST)) &&
- rt_is_input_route(rth) && rth->dst.rt_next;
-}
-
-static inline int rt_valuable(struct rtable *rth)
-{
- return (rth->rt_flags & (RTCF_REDIRECTED | RTCF_NOTIFY)) ||
- (rth->peer && rth->peer->pmtu_expires);
-}
-
-static int rt_may_expire(struct rtable *rth, unsigned long tmo1, unsigned long tmo2)
-{
- unsigned long age;
- int ret = 0;
-
- if (atomic_read(&rth->dst.__refcnt))
- goto out;
-
- age = jiffies - rth->dst.lastuse;
- if ((age <= tmo1 && !rt_fast_clean(rth)) ||
- (age <= tmo2 && rt_valuable(rth)))
- goto out;
- ret = 1;
-out: return ret;
-}
-
-/* Bits of score are:
- * 31: very valuable
- * 30: not quite useless
- * 29..0: usage counter
- */
-static inline u32 rt_score(struct rtable *rt)
-{
- u32 score = jiffies - rt->dst.lastuse;
-
- score = ~score & ~(3<<30);
-
- if (rt_valuable(rt))
- score |= (1<<31);
-
- if (rt_is_output_route(rt) ||
- !(rt->rt_flags & (RTCF_BROADCAST|RTCF_MULTICAST|RTCF_LOCAL)))
- score |= (1<<30);
-
- return score;
-}
-
-static inline bool rt_caching(const struct net *net)
-{
- return net->ipv4.current_rt_cache_rebuild_count <=
- net->ipv4.sysctl_rt_cache_rebuild_count;
-}
-
-static inline bool compare_hash_inputs(const struct flowi *fl1,
- const struct flowi *fl2)
-{
- return ((((__force u32)fl1->fl4_dst ^ (__force u32)fl2->fl4_dst) |
- ((__force u32)fl1->fl4_src ^ (__force u32)fl2->fl4_src) |
- (fl1->iif ^ fl2->iif)) == 0);
-}
-
-static inline int compare_keys(struct flowi *fl1, struct flowi *fl2)
-{
- return (((__force u32)fl1->fl4_dst ^ (__force u32)fl2->fl4_dst) |
- ((__force u32)fl1->fl4_src ^ (__force u32)fl2->fl4_src) |
- (fl1->mark ^ fl2->mark) |
- (*(u16 *)&fl1->fl4_tos ^ *(u16 *)&fl2->fl4_tos) |
- (fl1->oif ^ fl2->oif) |
- (fl1->iif ^ fl2->iif)) == 0;
-}
-
-static inline int compare_netns(struct rtable *rt1, struct rtable *rt2)
-{
- return net_eq(dev_net(rt1->dst.dev), dev_net(rt2->dst.dev));
-}
-
static inline int rt_is_expired(struct rtable *rth)
{
return rth->rt_genid != rt_genid(dev_net(rth->dst.dev));
}
/*
- * Perform a full scan of hash table and free all entries.
- * Can be called by a softirq or a process.
- * In the later case, we want to be reschedule if necessary
- */
-static void rt_do_flush(struct net *net, int process_context)
-{
- unsigned int i;
- struct rtable *rth, *next;
-
- for (i = 0; i <= rt_hash_mask; i++) {
- struct rtable __rcu **pprev;
- struct rtable *list;
-
- if (process_context && need_resched())
- cond_resched();
- rth = rcu_dereference_raw(rt_hash_table[i].chain);
- if (!rth)
- continue;
-
- spin_lock_bh(rt_hash_lock_addr(i));
-
- list = NULL;
- pprev = &rt_hash_table[i].chain;
- rth = rcu_dereference_protected(*pprev,
- lockdep_is_held(rt_hash_lock_addr(i)));
-
- while (rth) {
- next = rcu_dereference_protected(rth->dst.rt_next,
- lockdep_is_held(rt_hash_lock_addr(i)));
-
- if (!net ||
- net_eq(dev_net(rth->dst.dev), net)) {
- rcu_assign_pointer(*pprev, next);
- rcu_assign_pointer(rth->dst.rt_next, list);
- list = rth;
- } else {
- pprev = &rth->dst.rt_next;
- }
- rth = next;
- }
-
- spin_unlock_bh(rt_hash_lock_addr(i));
-
- for (; list; list = next) {
- next = rcu_dereference_protected(list->dst.rt_next, 1);
- rt_free(list);
- }
- }
-}
-
-/*
- * While freeing expired entries, we compute average chain length
- * and standard deviation, using fixed-point arithmetic.
- * This to have an estimation of rt_chain_length_max
- * rt_chain_length_max = max(elasticity, AVG + 4*SD)
- * We use 3 bits for frational part, and 29 (or 61) for magnitude.
- */
-
-#define FRACT_BITS 3
-#define ONE (1UL << FRACT_BITS)
-
-/*
- * Given a hash chain and an item in this hash chain,
- * find if a previous entry has the same hash_inputs
- * (but differs on tos, mark or oif)
- * Returns 0 if an alias is found.
- * Returns ONE if rth has no alias before itself.
- */
-static int has_noalias(const struct rtable *head, const struct rtable *rth)
-{
- const struct rtable *aux = head;
-
- while (aux != rth) {
- if (compare_hash_inputs(&aux->fl, &rth->fl))
- return 0;
- aux = rcu_dereference_protected(aux->dst.rt_next, 1);
- }
- return ONE;
-}
-
-/*
* Pertubation of rt_genid by a small quantity [1..256]
* Using 8 bits of shuffling ensure we can call rt_cache_invalidate()
* many times (2^24) without giving recent rt_genid.
@@ -841,366 +490,32 @@ static void rt_cache_invalidate(struct net *net)
void rt_cache_flush(struct net *net, int delay)
{
rt_cache_invalidate(net);
- if (delay >= 0)
- rt_do_flush(net, !in_softirq());
-}
-
-/* Flush previous cache invalidated entries from the cache */
-void rt_cache_flush_batch(struct net *net)
-{
- rt_do_flush(net, !in_softirq());
}
-static void rt_emergency_hash_rebuild(struct net *net)
-{
- if (net_ratelimit())
- printk(KERN_WARNING "Route hash chain too long!\n");
- rt_cache_invalidate(net);
-}
-
-/*
- Short description of GC goals.
-
- We want to build algorithm, which will keep routing cache
- at some equilibrium point, when number of aged off entries
- is kept approximately equal to newly generated ones.
-
- Current expiration strength is variable "expire".
- We try to adjust it dynamically, so that if networking
- is idle expires is large enough to keep enough of warm entries,
- and when load increases it reduces to limit cache size.
- */
-
static int rt_garbage_collect(struct dst_ops *ops)
{
- static unsigned long expire = RT_GC_TIMEOUT;
- static unsigned long last_gc;
- static int rover;
- static int equilibrium;
- struct rtable *rth;
- struct rtable __rcu **rthp;
- unsigned long now = jiffies;
- int goal;
- int entries = dst_entries_get_fast(&ipv4_dst_ops);
-
- /*
- * Garbage collection is pretty expensive,
- * do not make it too frequently.
- */
-
RT_CACHE_STAT_INC(gc_total);
-
- if (now - last_gc < ip_rt_gc_min_interval &&
- entries < ip_rt_max_size) {
- RT_CACHE_STAT_INC(gc_ignored);
- goto out;
- }
-
- entries = dst_entries_get_slow(&ipv4_dst_ops);
- /* Calculate number of entries, which we want to expire now. */
- goal = entries - (ip_rt_gc_elasticity << rt_hash_log);
- if (goal <= 0) {
- if (equilibrium < ipv4_dst_ops.gc_thresh)
- equilibrium = ipv4_dst_ops.gc_thresh;
- goal = entries - equilibrium;
- if (goal > 0) {
- equilibrium += min_t(unsigned int, goal >> 1, rt_hash_mask + 1);
- goal = entries - equilibrium;
- }
- } else {
- /* We are in dangerous area. Try to reduce cache really
- * aggressively.
- */
- goal = max_t(unsigned int, goal >> 1, rt_hash_mask + 1);
- equilibrium = entries - goal;
- }
-
- if (now - last_gc >= ip_rt_gc_min_interval)
- last_gc = now;
-
- if (goal <= 0) {
- equilibrium += goal;
- goto work_done;
- }
-
- do {
- int i, k;
-
- for (i = rt_hash_mask, k = rover; i >= 0; i--) {
- unsigned long tmo = expire;
-
- k = (k + 1) & rt_hash_mask;
- rthp = &rt_hash_table[k].chain;
- spin_lock_bh(rt_hash_lock_addr(k));
- while ((rth = rcu_dereference_protected(*rthp,
- lockdep_is_held(rt_hash_lock_addr(k)))) != NULL) {
- if (!rt_is_expired(rth) &&
- !rt_may_expire(rth, tmo, expire)) {
- tmo >>= 1;
- rthp = &rth->dst.rt_next;
- continue;
- }
- *rthp = rth->dst.rt_next;
- rt_free(rth);
- goal--;
- }
- spin_unlock_bh(rt_hash_lock_addr(k));
- if (goal <= 0)
- break;
- }
- rover = k;
-
- if (goal <= 0)
- goto work_done;
-
- /* Goal is not achieved. We stop process if:
-
- - if expire reduced to zero. Otherwise, expire is halfed.
- - if table is not full.
- - if we are called from interrupt.
- - jiffies check is just fallback/debug loop breaker.
- We will not spin here for long time in any case.
- */
-
- RT_CACHE_STAT_INC(gc_goal_miss);
-
- if (expire == 0)
- break;
-
- expire >>= 1;
-#if RT_CACHE_DEBUG >= 2
- printk(KERN_DEBUG "expire>> %u %d %d %d\n", expire,
- dst_entries_get_fast(&ipv4_dst_ops), goal, i);
-#endif
-
- if (dst_entries_get_fast(&ipv4_dst_ops) < ip_rt_max_size)
- goto out;
- } while (!in_softirq() && time_before_eq(jiffies, now));
-
- if (dst_entries_get_fast(&ipv4_dst_ops) < ip_rt_max_size)
- goto out;
- if (dst_entries_get_slow(&ipv4_dst_ops) < ip_rt_max_size)
- goto out;
- if (net_ratelimit())
- printk(KERN_WARNING "dst cache overflow\n");
- RT_CACHE_STAT_INC(gc_dst_overflow);
- return 1;
-
-work_done:
- expire += ip_rt_gc_min_interval;
- if (expire > ip_rt_gc_timeout ||
- dst_entries_get_fast(&ipv4_dst_ops) < ipv4_dst_ops.gc_thresh ||
- dst_entries_get_slow(&ipv4_dst_ops) < ipv4_dst_ops.gc_thresh)
- expire = ip_rt_gc_timeout;
-#if RT_CACHE_DEBUG >= 2
- printk(KERN_DEBUG "expire++ %u %d %d %d\n", expire,
- dst_entries_get_fast(&ipv4_dst_ops), goal, rover);
-#endif
-out: return 0;
-}
-
-/*
- * Returns number of entries in a hash chain that have different hash_inputs
- */
-static int slow_chain_length(const struct rtable *head)
-{
- int length = 0;
- const struct rtable *rth = head;
-
- while (rth) {
- length += has_noalias(head, rth);
- rth = rcu_dereference_protected(rth->dst.rt_next, 1);
- }
- return length >> FRACT_BITS;
+ return 0;
}
-static int rt_intern_hash(unsigned hash, struct rtable *rt,
- struct rtable **rp, struct sk_buff *skb, int ifindex)
+static int rt_finalize(struct rtable *rt, struct rtable **rp, struct sk_buff *skb)
{
- struct rtable *rth, *cand;
- struct rtable __rcu **rthp, **candp;
- unsigned long now;
- u32 min_score;
- int chain_length;
- int attempts = !in_softirq();
-
-restart:
- chain_length = 0;
- min_score = ~(u32)0;
- cand = NULL;
- candp = NULL;
- now = jiffies;
-
- if (!rt_caching(dev_net(rt->dst.dev))) {
- /*
- * If we're not caching, just tell the caller we
- * were successful and don't touch the route. The
- * caller hold the sole reference to the cache entry, and
- * it will be released when the caller is done with it.
- * If we drop it here, the callers have no way to resolve routes
- * when we're not caching. Instead, just point *rp at rt, so
- * the caller gets a single use out of the route
- * Note that we do rt_free on this new route entry, so that
- * once its refcount hits zero, we are still able to reap it
- * (Thanks Alexey)
- * Note: To avoid expensive rcu stuff for this uncached dst,
- * we set DST_NOCACHE so that dst_release() can free dst without
- * waiting a grace period.
- */
-
- rt->dst.flags |= DST_NOCACHE;
- if (rt->rt_type == RTN_UNICAST || rt_is_output_route(rt)) {
- int err = arp_bind_neighbour(&rt->dst);
- if (err) {
- if (net_ratelimit())
- printk(KERN_WARNING
- "Neighbour table failure & not caching routes.\n");
- ip_rt_put(rt);
- return err;
- }
- }
-
- goto skip_hashing;
- }
-
- rthp = &rt_hash_table[hash].chain;
-
- spin_lock_bh(rt_hash_lock_addr(hash));
- while ((rth = rcu_dereference_protected(*rthp,
- lockdep_is_held(rt_hash_lock_addr(hash)))) != NULL) {
- if (rt_is_expired(rth)) {
- *rthp = rth->dst.rt_next;
- rt_free(rth);
- continue;
- }
- if (compare_keys(&rth->fl, &rt->fl) && compare_netns(rth, rt)) {
- /* Put it first */
- *rthp = rth->dst.rt_next;
- /*
- * Since lookup is lockfree, the deletion
- * must be visible to another weakly ordered CPU before
- * the insertion at the start of the hash chain.
- */
- rcu_assign_pointer(rth->dst.rt_next,
- rt_hash_table[hash].chain);
- /*
- * Since lookup is lockfree, the update writes
- * must be ordered for consistency on SMP.
- */
- rcu_assign_pointer(rt_hash_table[hash].chain, rth);
-
- dst_use(&rth->dst, now);
- spin_unlock_bh(rt_hash_lock_addr(hash));
-
- rt_drop(rt);
- if (rp)
- *rp = rth;
- else
- skb_dst_set(skb, &rth->dst);
- return 0;
- }
-
- if (!atomic_read(&rth->dst.__refcnt)) {
- u32 score = rt_score(rth);
-
- if (score <= min_score) {
- cand = rth;
- candp = rthp;
- min_score = score;
- }
- }
-
- chain_length++;
-
- rthp = &rth->dst.rt_next;
- }
-
- if (cand) {
- /* ip_rt_gc_elasticity used to be average length of chain
- * length, when exceeded gc becomes really aggressive.
- *
- * The second limit is less certain. At the moment it allows
- * only 2 entries per bucket. We will see.
- */
- if (chain_length > ip_rt_gc_elasticity) {
- *candp = cand->dst.rt_next;
- rt_free(cand);
- }
- } else {
- if (chain_length > rt_chain_length_max &&
- slow_chain_length(rt_hash_table[hash].chain) > rt_chain_length_max) {
- struct net *net = dev_net(rt->dst.dev);
- int num = ++net->ipv4.current_rt_cache_rebuild_count;
- if (!rt_caching(net)) {
- printk(KERN_WARNING "%s: %d rebuilds is over limit, route caching disabled\n",
- rt->dst.dev->name, num);
- }
- rt_emergency_hash_rebuild(net);
- spin_unlock_bh(rt_hash_lock_addr(hash));
-
- hash = rt_hash(rt->fl.fl4_dst, rt->fl.fl4_src,
- ifindex, rt_genid(net));
- goto restart;
- }
- }
-
- /* Try to bind route to arp only if it is output
- route or unicast forwarding path.
+ /* To avoid expensive rcu stuff for this uncached dst, we set
+ * DST_NOCACHE so that dst_release() can free dst without
+ * waiting a grace period.
*/
+ rt->dst.flags |= DST_NOCACHE;
if (rt->rt_type == RTN_UNICAST || rt_is_output_route(rt)) {
int err = arp_bind_neighbour(&rt->dst);
if (err) {
- spin_unlock_bh(rt_hash_lock_addr(hash));
-
- if (err != -ENOBUFS) {
- rt_drop(rt);
- return err;
- }
-
- /* Neighbour tables are full and nothing
- can be released. Try to shrink route cache,
- it is most likely it holds some neighbour records.
- */
- if (attempts-- > 0) {
- int saved_elasticity = ip_rt_gc_elasticity;
- int saved_int = ip_rt_gc_min_interval;
- ip_rt_gc_elasticity = 1;
- ip_rt_gc_min_interval = 0;
- rt_garbage_collect(&ipv4_dst_ops);
- ip_rt_gc_min_interval = saved_int;
- ip_rt_gc_elasticity = saved_elasticity;
- goto restart;
- }
-
if (net_ratelimit())
- printk(KERN_WARNING "ipv4: Neighbour table overflow.\n");
- rt_drop(rt);
- return -ENOBUFS;
+ printk(KERN_WARNING
+ "Neighbour table failure & not caching routes.\n");
+ ip_rt_put(rt);
+ return err;
}
}
- rt->dst.rt_next = rt_hash_table[hash].chain;
-
-#if RT_CACHE_DEBUG >= 2
- if (rt->dst.rt_next) {
- struct rtable *trt;
- printk(KERN_DEBUG "rt_cache @%02x: %pI4",
- hash, &rt->rt_dst);
- for (trt = rt->dst.rt_next; trt; trt = trt->dst.rt_next)
- printk(" . %pI4", &trt->rt_dst);
- printk("\n");
- }
-#endif
- /*
- * Since lookup is lockfree, we must make sure
- * previous writes to rt are comitted to memory
- * before making rt visible to other CPUS.
- */
- rcu_assign_pointer(rt_hash_table[hash].chain, rt);
-
- spin_unlock_bh(rt_hash_lock_addr(hash));
-
-skip_hashing:
if (rp)
*rp = rt;
else
@@ -1270,26 +585,6 @@ void __ip_select_ident(struct iphdr *iph, struct dst_entry *dst, int more)
}
EXPORT_SYMBOL(__ip_select_ident);
-static void rt_del(unsigned hash, struct rtable *rt)
-{
- struct rtable __rcu **rthp;
- struct rtable *aux;
-
- rthp = &rt_hash_table[hash].chain;
- spin_lock_bh(rt_hash_lock_addr(hash));
- ip_rt_put(rt);
- while ((aux = rcu_dereference_protected(*rthp,
- lockdep_is_held(rt_hash_lock_addr(hash)))) != NULL) {
- if (aux == rt || rt_is_expired(aux)) {
- *rthp = aux->dst.rt_next;
- rt_free(aux);
- continue;
- }
- rthp = &aux->dst.rt_next;
- }
- spin_unlock_bh(rt_hash_lock_addr(hash));
-}
-
/* called in rcu_read_lock() section */
void ip_rt_redirect(__be32 old_gw, __be32 daddr, __be32 new_gw,
__be32 saddr, struct net_device *dev)
@@ -1348,14 +643,11 @@ static struct dst_entry *ipv4_negative_advice(struct dst_entry *dst)
ip_rt_put(rt);
ret = NULL;
} else if (rt->rt_flags & RTCF_REDIRECTED) {
- unsigned hash = rt_hash(rt->fl.fl4_dst, rt->fl.fl4_src,
- rt->fl.oif,
- rt_genid(dev_net(dst->dev)));
#if RT_CACHE_DEBUG >= 1
printk(KERN_DEBUG "ipv4_negative_advice: redirect to %pI4/%02x dropped\n",
- &rt->rt_dst, rt->fl.fl4_tos);
+ &rt->rt_dst, rt->fl.fl4_tos);
#endif
- rt_del(hash, rt);
+ ip_rt_put(rt);
ret = NULL;
} else if (rt->peer &&
rt->peer->pmtu_expires &&
@@ -1820,7 +1112,6 @@ static void rt_set_nexthop(struct rtable *rt, struct fib_result *res, u32 itag)
static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
u8 tos, struct net_device *dev, int our)
{
- unsigned int hash;
struct rtable *rth;
__be32 spec_dst;
struct in_device *in_dev = __in_dev_get_rcu(dev);
@@ -1887,8 +1178,7 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
#endif
RT_CACHE_STAT_INC(in_slow_mc);
- hash = rt_hash(daddr, saddr, dev->ifindex, rt_genid(dev_net(dev)));
- return rt_intern_hash(hash, rth, NULL, skb, dev->ifindex);
+ return rt_finalize(rth, NULL, skb);
e_nobufs:
return -ENOBUFS;
@@ -2035,7 +1325,6 @@ static int ip_mkroute_input(struct sk_buff *skb,
{
struct rtable* rth = NULL;
int err;
- unsigned hash;
#ifdef CONFIG_IP_ROUTE_MULTIPATH
if (res->fi && res->fi->fib_nhs > 1 && fl->oif == 0)
@@ -2048,9 +1337,7 @@ static int ip_mkroute_input(struct sk_buff *skb,
return err;
/* put it into the cache */
- hash = rt_hash(daddr, saddr, fl->iif,
- rt_genid(dev_net(rth->dst.dev)));
- return rt_intern_hash(hash, rth, NULL, skb, fl->iif);
+ return rt_finalize(rth, NULL, skb);
}
/*
@@ -2078,7 +1365,6 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
unsigned flags = 0;
u32 itag = 0;
struct rtable * rth;
- unsigned hash;
__be32 spec_dst;
int err = -EINVAL;
struct net * net = dev_net(dev);
@@ -2197,8 +1483,7 @@ local_input:
rth->rt_flags &= ~RTCF_LOCAL;
}
rth->rt_type = res.type;
- hash = rt_hash(daddr, saddr, fl.iif, rt_genid(net));
- err = rt_intern_hash(hash, rth, NULL, skb, fl.iif);
+ err = rt_finalize(rth, NULL, skb);
goto out;
no_route:
@@ -2242,47 +1527,10 @@ martian_source_keep_err:
int ip_route_input_common(struct sk_buff *skb, __be32 daddr, __be32 saddr,
u8 tos, struct net_device *dev, bool noref)
{
- struct rtable * rth;
- unsigned hash;
- int iif = dev->ifindex;
- struct net *net;
int res;
- net = dev_net(dev);
-
rcu_read_lock();
- if (!rt_caching(net))
- goto skip_cache;
-
- tos &= IPTOS_RT_MASK;
- hash = rt_hash(daddr, saddr, iif, rt_genid(net));
-
- for (rth = rcu_dereference(rt_hash_table[hash].chain); rth;
- rth = rcu_dereference(rth->dst.rt_next)) {
- if ((((__force u32)rth->fl.fl4_dst ^ (__force u32)daddr) |
- ((__force u32)rth->fl.fl4_src ^ (__force u32)saddr) |
- (rth->fl.iif ^ iif) |
- rth->fl.oif |
- (rth->fl.fl4_tos ^ tos)) == 0 &&
- rth->fl.mark == skb->mark &&
- net_eq(dev_net(rth->dst.dev), net) &&
- !rt_is_expired(rth)) {
- if (noref) {
- dst_use_noref(&rth->dst, jiffies);
- skb_dst_set_noref(skb, &rth->dst);
- } else {
- dst_use(&rth->dst, jiffies);
- skb_dst_set(skb, &rth->dst);
- }
- RT_CACHE_STAT_INC(in_hit);
- rcu_read_unlock();
- return 0;
- }
- RT_CACHE_STAT_INC(in_hlist_search);
- }
-
-skip_cache:
/* Multicast recognition logic is moved from route cache to here.
The problem was that too many Ethernet cards have broken/missing
hardware multicast filters :-( As result the host on multicasting
@@ -2439,12 +1687,9 @@ static int ip_mkroute_output(struct rtable **rp,
{
struct rtable *rth = NULL;
int err = __mkroute_output(&rth, res, fl, oldflp, dev_out, flags);
- unsigned hash;
- if (err == 0) {
- hash = rt_hash(oldflp->fl4_dst, oldflp->fl4_src, oldflp->oif,
- rt_genid(dev_net(dev_out)));
- err = rt_intern_hash(hash, rth, rp, NULL, oldflp->oif);
- }
+
+ if (err == 0)
+ err = rt_finalize(rth, rp, NULL);
return err;
}
@@ -2635,38 +1880,8 @@ out: return err;
int __ip_route_output_key(struct net *net, struct rtable **rp,
const struct flowi *flp)
{
- unsigned int hash;
int res;
- struct rtable *rth;
- if (!rt_caching(net))
- goto slow_output;
-
- hash = rt_hash(flp->fl4_dst, flp->fl4_src, flp->oif, rt_genid(net));
-
- rcu_read_lock_bh();
- for (rth = rcu_dereference_bh(rt_hash_table[hash].chain); rth;
- rth = rcu_dereference_bh(rth->dst.rt_next)) {
- if (rth->fl.fl4_dst == flp->fl4_dst &&
- rth->fl.fl4_src == flp->fl4_src &&
- rt_is_output_route(rth) &&
- rth->fl.oif == flp->oif &&
- rth->fl.mark == flp->mark &&
- !((rth->fl.fl4_tos ^ flp->fl4_tos) &
- (IPTOS_RT_MASK | RTO_ONLINK)) &&
- net_eq(dev_net(rth->dst.dev), net) &&
- !rt_is_expired(rth)) {
- dst_use(&rth->dst, jiffies);
- RT_CACHE_STAT_INC(out_hit);
- rcu_read_unlock_bh();
- *rp = rth;
- return 0;
- }
- RT_CACHE_STAT_INC(out_hlist_search);
- }
- rcu_read_unlock_bh();
-
-slow_output:
rcu_read_lock();
res = ip_route_output_slow(net, rp, flp);
rcu_read_unlock();
@@ -2966,43 +2181,6 @@ errout_free:
int ip_rt_dump(struct sk_buff *skb, struct netlink_callback *cb)
{
- struct rtable *rt;
- int h, s_h;
- int idx, s_idx;
- struct net *net;
-
- net = sock_net(skb->sk);
-
- s_h = cb->args[0];
- if (s_h < 0)
- s_h = 0;
- s_idx = idx = cb->args[1];
- for (h = s_h; h <= rt_hash_mask; h++, s_idx = 0) {
- if (!rt_hash_table[h].chain)
- continue;
- rcu_read_lock_bh();
- for (rt = rcu_dereference_bh(rt_hash_table[h].chain), idx = 0; rt;
- rt = rcu_dereference_bh(rt->dst.rt_next), idx++) {
- if (!net_eq(dev_net(rt->dst.dev), net) || idx < s_idx)
- continue;
- if (rt_is_expired(rt))
- continue;
- skb_dst_set_noref(skb, &rt->dst);
- if (rt_fill_info(net, skb, NETLINK_CB(cb->skb).pid,
- cb->nlh->nlmsg_seq, RTM_NEWROUTE,
- 1, NLM_F_MULTI) <= 0) {
- skb_dst_drop(skb);
- rcu_read_unlock_bh();
- goto done;
- }
- skb_dst_drop(skb);
- }
- rcu_read_unlock_bh();
- }
-
-done:
- cb->args[0] = h;
- cb->args[1] = idx;
return skb->len;
}
@@ -3235,16 +2413,6 @@ static __net_initdata struct pernet_operations rt_genid_ops = {
struct ip_rt_acct __percpu *ip_rt_acct __read_mostly;
#endif /* CONFIG_IP_ROUTE_CLASSID */
-static __initdata unsigned long rhash_entries;
-static int __init set_rhash_entries(char *str)
-{
- if (!str)
- return 0;
- rhash_entries = simple_strtoul(str, &str, 0);
- return 1;
-}
-__setup("rhash_entries=", set_rhash_entries);
-
int __init ip_rt_init(void)
{
int rc = 0;
@@ -3267,21 +2435,8 @@ int __init ip_rt_init(void)
if (dst_entries_init(&ipv4_dst_blackhole_ops) < 0)
panic("IP: failed to allocate ipv4_dst_blackhole_ops counter\n");
- rt_hash_table = (struct rt_hash_bucket *)
- alloc_large_system_hash("IP route cache",
- sizeof(struct rt_hash_bucket),
- rhash_entries,
- (totalram_pages >= 128 * 1024) ?
- 15 : 17,
- 0,
- &rt_hash_log,
- &rt_hash_mask,
- rhash_entries ? 0 : 512 * 1024);
- memset(rt_hash_table, 0, (rt_hash_mask + 1) * sizeof(struct rt_hash_bucket));
- rt_hash_lock_init();
-
- ipv4_dst_ops.gc_thresh = (rt_hash_mask + 1);
- ip_rt_max_size = (rt_hash_mask + 1) * 16;
+ ipv4_dst_ops.gc_thresh = ~0;
+ ip_rt_max_size = INT_MAX;
devinet_init();
ip_fib_init();
--
1.7.4.1
^ permalink raw reply related
* Re: [PATCH v6 2/9] ethtool: enable GSO and GRO by default
From: Michał Mirosław @ 2011-02-16 2:50 UTC (permalink / raw)
To: David Miller; +Cc: netdev, bhutchings
In-Reply-To: <20110215.184808.226754921.davem@davemloft.net>
On Tue, Feb 15, 2011 at 06:48:08PM -0800, David Miller wrote:
> You can't just update specific patches, you have to freshly resend
> the entire series again if you want me to apply your work.
Ah, ok. I tried to minimize duplicate mails in your mailbox. :)
-- Michał Mirosław
^ permalink raw reply
* Re: [PATCH v6 2/9] ethtool: enable GSO and GRO by default
From: David Miller @ 2011-02-16 2:48 UTC (permalink / raw)
To: mirq-linux; +Cc: netdev, bhutchings
In-Reply-To: <d256f661690245d75bec50f5a6acafcefcae1d8a.1297823573.git.mirq-linux@rere.qmqm.pl>
You can't just update specific patches, you have to freshly resend
the entire series again if you want me to apply your work.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox