* Re: [PATCH] net: phy: smsc: Re-enable EDPD mode for LAN87xx
From: David Miller @ 2012-11-15 22:49 UTC (permalink / raw)
To: patrick.trantham
Cc: netdev, steve.glendinning, otavio, marex, chohnstaedt, jkosina
In-Reply-To: <1353006057-7593-1-git-send-email-patrick.trantham@fuel7.com>
From: Patrick Trantham <patrick.trantham@fuel7.com>
Date: Thu, 15 Nov 2012 13:00:57 -0600
> This patch re-enables Energy Detect Power Down (EDPD) mode for the
> LAN8710/LAN8720. EDPD mode was disabled in a previous commit,
> (b629820d18fa65cc598390e4b9712fd5f83ee693), because it was causing the
> PHY to not be able to detect a link when cold started without a cable
> connected.
>
> The LAN8710/LAN8720 requires a minimum of 2 link pulses within 64ms of
> each other in order to set the ENERGYON bit and exit EDPD mode. If a
> link partner does not send the pulses within this interval, the PHY will
> remain powered down.
>
> This workaround will manually toggle the PHY on/off upon calls to
> read_status in order to generate link test pulses if the link is down.
> If a link partner is present, it will respond to the pulses, which will
> cause the ENERGYON bit to be set and will cause the EDPD mode to be
> exited.
>
> Signed-off-by: Patrick Trantham <patrick.trantham@fuel7.com>
Applied to net-next, thanks.
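For readers following the workaround described above, here is a minimal userspace sketch of the toggle logic. It is not the driver code from the patch: the `fake_phy` structure, the helper names, and the EDPD bit position are assumptions made only for illustration, and the delay the real driver would need between the two writes is reduced to a comment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of the EDPD enable flag; illustration only. */
#define SKETCH_EDPD_ENABLE ((uint16_t)(1u << 13))

/* Hypothetical stand-in for a PHY; not a kernel structure. */
struct fake_phy {
	uint16_t ctrl_status;   /* simulated control/status register */
	bool link_up;           /* simulated link state */
	bool partner_present;   /* is a link partner attached? */
	unsigned int wakeups;   /* how often EDPD was switched off */
};

/* Simulated register write: leaving EDPD makes the PHY send link pulses,
 * and an attached partner answers them, which brings the link up. */
static void fake_phy_write_ctrl(struct fake_phy *phy, uint16_t val)
{
	bool was_edpd = (phy->ctrl_status & SKETCH_EDPD_ENABLE) != 0;
	bool now_edpd = (val & SKETCH_EDPD_ENABLE) != 0;

	if (was_edpd && !now_edpd) {
		phy->wakeups++;
		if (phy->partner_present)
			phy->link_up = true;
	}
	phy->ctrl_status = val;
}

/* Sketch of the read_status hook: if the link is down, briefly disable
 * EDPD so the PHY emits link test pulses, then re-enable it. */
static bool fake_phy_read_status(struct fake_phy *phy)
{
	if (!phy->link_up) {
		uint16_t reg = phy->ctrl_status;

		fake_phy_write_ctrl(phy, (uint16_t)(reg & (uint16_t)~SKETCH_EDPD_ENABLE));
		/* A real driver would wait here long enough for pulses to go out. */
		fake_phy_write_ctrl(phy, (uint16_t)(reg | SKETCH_EDPD_ENABLE));
	}
	return phy->link_up;
}
```

With a partner attached, the first poll while the link is down wakes the PHY once and reports link up; without a partner, every poll repeats the toggle and the PHY drops back into EDPD each time.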
^ permalink raw reply
* Re: [PATCH V2 00/14] Always build GSO/GRO functionality into the kernel
From: Vlad Yasevich @ 2012-11-15 22:48 UTC (permalink / raw)
To: David Miller; +Cc: netdev, eric.dumazet
In-Reply-To: <20121115.174205.2002769827912724231.davem@davemloft.net>
On 11/15/2012 05:42 PM, David Miller wrote:
>
> All applied to net-next, but there were some minor conflicts (an
> IS_ENABLED() conversion happened in ip6_output.c in net-next but the
> context in your patch didn't have it) and git warnings (trailing
> whitespace in the final patch).
>
> I fixed them all up, but this is something you can easily take care
> of yourself in the future.
>
> Thanks.
>
Sorry, the patches were based on net-2.6. That's probably why you saw
the conflicts. I wasn't sure where they should go since they actually
fix a long-standing issue, so I based them on net-2.6.
Sorry, and thanks for fixing things up.
-vlad
^ permalink raw reply
* Re: Optics (SFP) monitoring on ixgbe and igbe
From: Jeff Kirsher @ 2012-11-15 22:46 UTC (permalink / raw)
To: footplus; +Cc: Ben Hutchings, netdev
In-Reply-To: <CAPN4dA-moLxn_jQW4e906j9wLU+tZVLxvkWeLOxDEbZpGtJ28g@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1818 bytes --]
On 11/15/2012 01:36 PM, Aurélien wrote:
>> On Fri, Nov 9, 2012 at 4:08 PM, Ben Hutchings <bhutchings@solarflare.com> wrote:
>>> No, the driver also needs to implement ethtool_ops::get_module_info and
>>> ethtool_ops::get_module_eeprom. But those should be quite easy to do.
>>>
> Hi !
>
> I started to implement these operations in ixgbe.
>
> So far, the result is the attached patch, which applies on dave-m's
> net-next @ 1ff05fb7114a6b4118e0f7d89fed2659f7131b0a. It's not yet
> finished, and since it is my first peek at network drivers I need some
> advice on:
>
> - whether the implementation seems correct for ixgbe and all its
> supported MAC/PHY combinations ?
> - what would be the best way to manage SFF-8472 A0/A2 bank swapping
> mechanism for reading A2h ? (it seems I need to lock the whole -
> address change sequence - read A2h - address change again - operation)
> in case it's needed. I may not be able to test that, so I may add an
> unsupported return code for now.
> - Is the supported PHY selection correct, or should other PHYs be
> supported ? What should be the rule ?
>
> I have been able to get correct temperature readings with a patched-up
> ethtool, so it seems to work correctly on at least my card (Ethernet
> controller [0200]: Intel Corporation 82599EB 10-Gigabit Network
> Connection [8086:10fb] (rev 01)).
>
> About ethtool, I was thinking about making a -O option for optical
> diagnostics, which would have a readable output. I will make a
> function to parse the A2 register contents, so it can be reused in
> other daemons/libs (SNMP, etc).
>
> Thanks,
> Best regards,
Can you please add me <jeffrey.t.kirsher@intel.com> to the CC on future
patches for ixgbe or ixgb, as I will be the one applying the patch to my
queue?
Thanks
Jeff
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 897 bytes --]
^ permalink raw reply
* Re: [PATCH] tcp: fix retransmission in repair mode
From: David Miller @ 2012-11-15 22:45 UTC (permalink / raw)
To: xemul; +Cc: avagin, netdev, criu, linux-kernel, kuznet, jmorris, yoshfuji,
kaber
In-Reply-To: <50A4F7F3.5080500@parallels.com>
From: Pavel Emelyanov <xemul@parallels.com>
Date: Thu, 15 Nov 2012 18:10:59 +0400
> On 11/15/2012 06:03 PM, Andrey Vagin wrote:
>> From: Andrew Vagin <avagin@openvz.org>
>>
>> Currently if a socket was repaired with a few packets in a write queue,
>> a kernel bug may be triggered:
>>
>> kernel BUG at net/ipv4/tcp_output.c:2330!
>> RIP: 0010:[<ffffffff8155784f>] tcp_retransmit_skb+0x5ff/0x610
>>
>> According to the initial implementation, v3.4-rc2-963-gc0e88ff,
>> all skbs should look as if they were already posted. This patch fixes
>> the code to match that intent.
>>
>> Here are three points, which were not done in the initial patch:
>> 1. A tcp send head should not be changed
>> 2. Initialize TSO state of a skb
>> 3. Reset the retransmission time
>>
>> This patch moves logic from tcp_sendmsg to tcp_write_xmit. A packet
>> passes the usual way, but isn't sent to the network. This patch solves
>> all described problems and handles tcp_sendpages.
...
>> Signed-off-by: Andrey Vagin <avagin@openvz.org>
>
> Acked-by: Pavel Emelyanov <xemul@parallels.com>
Applied and queued up for -stable, thanks.
^ permalink raw reply
* Re: [PATCH V2 00/14] Always build GSO/GRO functionality into the kernel
From: David Miller @ 2012-11-15 22:42 UTC (permalink / raw)
To: vyasevic; +Cc: netdev, eric.dumazet
In-Reply-To: <1353005363-6974-1-git-send-email-vyasevic@redhat.com>
All applied to net-next, but there were some minor conflicts (an
IS_ENABLED() conversion happened in ip6_output.c in net-next but the
context in your patch didn't have it) and git warnings (trailing
whitespace in the final patch).
I fixed them all up, but this is something you can easily take care
of yourself in the future.
Thanks.
^ permalink raw reply
* GRO + splice panics in 3.7.0-rc5
From: Willy Tarreau @ 2012-11-15 22:28 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet
Hello,
I was just about to make a quick comparison between LRO and GRO in
3.7.0-rc5 to see if LRO still had the big advantage I've always observed,
but the test failed because as soon as I enable GRO + splice, the kernel
panics and reboots.
I have not yet managed to capture the panic output, but I can reliably
reproduce it; it crashes instantly.
All I can say at the moment is the following:
- the test consists of forwarding HTTP traffic between two NICs via haproxy
- driver used was myri10ge
- LRO + recv+send : OK
- LRO + splice : OK
- GRO + recv+send : OK
- GRO + splice : panic
- no such problem was observed in 3.6.6 so I think this is a recent
regression.
I'll go back to digging for more information, but as I often see
Eric suggest the right candidates for reverting, I wanted to report the
issue here in case there are easy ones to try first.
Thanks,
Willy
^ permalink raw reply
* Re: [PATCHv2] sctp: fix /proc/net/sctp/ memory leak
From: Eric W. Biederman @ 2012-11-15 21:48 UTC (permalink / raw)
To: Tommi Rantala
Cc: netdev, Neil Horman, Vlad Yasevich, Sridhar Samudrala,
David S. Miller, linux-sctp, Dave Jones
In-Reply-To: <1352987345-11263-1-git-send-email-tt.rantala@gmail.com>
Tommi Rantala <tt.rantala@gmail.com> writes:
> Commit 13d782f ("sctp: Make the proc files per network namespace.")
> changed the /proc/net/sctp/ struct file_operations opener functions to
> use single_open_net() and seq_open_net().
>
> Avoid leaking memory by using single_release_net() and seq_release_net()
> as the release functions.
>
> Discovered with Trinity (the syscall fuzzer).
Doh! Thanks for catching this.
Eric
> - .release = single_release,
> + .release = single_release_net,
> };
> - .release = seq_release,
> + .release = seq_release_net,
> };
>
^ permalink raw reply
* Re: Optics (SFP) monitoring on ixgbe and igbe
From: Aurélien @ 2012-11-15 21:36 UTC (permalink / raw)
To: Ben Hutchings; +Cc: netdev
In-Reply-To: <CAPN4dA_kb4Gy6JWEW1h4YsGf5QwsOz58gGJqS0OcPTvHoAL7Xw@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1575 bytes --]
> On Fri, Nov 9, 2012 at 4:08 PM, Ben Hutchings <bhutchings@solarflare.com> wrote:
>>
>> No, the driver also needs to implement ethtool_ops::get_module_info and
>> ethtool_ops::get_module_eeprom. But those should be quite easy to do.
>>
Hi !
I started to implement these operations in ixgbe.
So far, the result is the attached patch, which applies on dave-m's
net-next @ 1ff05fb7114a6b4118e0f7d89fed2659f7131b0a. It's not yet
finished, and since it is my first peek at network drivers I need some
advice on:
- whether the implementation seems correct for ixgbe and all its
supported MAC/PHY combinations ?
- what would be the best way to manage SFF-8472 A0/A2 bank swapping
mechanism for reading A2h ? (it seems I need to lock the whole -
address change sequence - read A2h - address change again - operation)
in case it's needed. I may not be able to test that, so I may add an
unsupported return code for now.
- Is the supported PHY selection correct, or should other PHYs be
supported ? What should be the rule ?
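On the A0h/A2h question, one way to picture the layout ethtool expects is a single linear 512-byte image: the first 256 bytes come from the A0h device and the next 256 from A2h. The sketch below maps a linear offset to an I2C device address and byte offset. The struct and function names are hypothetical, and the code leaves out the address-change locking asked about above. It also accepts an arbitrary starting offset, whereas the loops in the attached patch start from index 0.

```c
#include <stdint.h>

/* Lengths of the two views; they match the 256/512-byte sizes that
 * ETH_MODULE_SFF_8079_LEN and ETH_MODULE_SFF_8472_LEN stand for. */
#define SKETCH_SFF_8079_LEN 256u
#define SKETCH_SFF_8472_LEN 512u

#define SKETCH_I2C_ADDR_A0 0xA0
#define SKETCH_I2C_ADDR_A2 0xA2

/* Hypothetical helper type: where a given linear byte actually lives. */
struct sfp_i2c_target {
	uint8_t dev_addr;     /* 0xA0 (ID EEPROM) or 0xA2 (diagnostics) */
	uint8_t byte_offset;  /* offset within that 256-byte device */
};

/* Map a linear offset in the 512-byte image to (device, offset).
 * Returns 0 on success, -1 if the offset is outside the image. */
static int sfp_map_offset(uint32_t linear, struct sfp_i2c_target *out)
{
	if (linear >= SKETCH_SFF_8472_LEN)
		return -1;

	if (linear < SKETCH_SFF_8079_LEN) {
		out->dev_addr = SKETCH_I2C_ADDR_A0;
		out->byte_offset = (uint8_t)linear;
	} else {
		out->dev_addr = SKETCH_I2C_ADDR_A2;
		out->byte_offset = (uint8_t)(linear - SKETCH_SFF_8079_LEN);
	}
	return 0;
}
```

A read loop built on this would walk from the requested offset to offset + len and pick the device per byte, which keeps the A0h/A2h split in one place.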
I have been able to get correct temperature readings with a patched-up
ethtool, so it seems to work correctly on at least my card (Ethernet
controller [0200]: Intel Corporation 82599EB 10-Gigabit Network
Connection [8086:10fb] (rev 01)).
About ethtool, I was thinking about making a -O option for optical
diagnostics, which would have a readable output. I will make a
function to parse the A2 register contents, so it can be reused in
other daemons/libs (SNMP, etc).
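As a rough idea of what such a reusable parser could look like, here is a small sketch that decodes the module temperature from the A2h page. It assumes the SFF-8472 layout in which bytes 96 and 97 hold a signed 16-bit value in units of 1/256 degree Celsius, and it assumes an internally calibrated module (externally calibrated modules would need the slope and offset constants applied first). The function name is hypothetical.

```c
#include <stdint.h>

/* Assumed SFF-8472 A2h offsets for the temperature reading. */
#define SKETCH_A2_TEMP_MSB 96
#define SKETCH_A2_TEMP_LSB 97

/* Decode the temperature from a buffer holding the A2h page.
 * The raw value is a signed 16-bit number in 1/256 degC steps;
 * the result is returned in millidegrees Celsius. */
static int32_t sff8472_temp_millic(const uint8_t *a2)
{
	int32_t raw = ((int32_t)a2[SKETCH_A2_TEMP_MSB] << 8) |
		      (int32_t)a2[SKETCH_A2_TEMP_LSB];

	/* Sign-extend the 16-bit two's complement value. */
	if (raw >= 0x8000)
		raw -= 0x10000;

	return raw * 1000 / 256;
}
```

Returning an integer in millidegrees keeps the helper usable from contexts that avoid floating point; a caller that wants a readable value can divide by 1000.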
Thanks,
Best regards,
--
Aurélien Guillaume
[-- Attachment #2: ixgbe-sff8472.patch --]
[-- Type: application/octet-stream, Size: 11887 bytes --]
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c
index 4253733..385b3c1 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c
@@ -1006,14 +1006,15 @@ static s32 ixgbe_write_analog_reg8_82598(struct ixgbe_hw *hw, u32 reg, u8 val)
}
/**
- * ixgbe_read_i2c_eeprom_82598 - Reads 8 bit word over I2C interface.
+ * ixgbe_read_i2c_phy_82598 - Reads 8 bit word over I2C interface.
* @hw: pointer to hardware structure
- * @byte_offset: EEPROM byte offset to read
+ * @dev_addr: SFF-8079 or SFF-8472 bank address to read.
+ * @byte_offset: EEPROM or DOM byte offset to read
* @eeprom_data: value read
*
* Performs 8 byte read operation to SFP module's EEPROM over I2C interface.
**/
-static s32 ixgbe_read_i2c_eeprom_82598(struct ixgbe_hw *hw, u8 byte_offset,
+static s32 ixgbe_read_i2c_phy_82598(struct ixgbe_hw *hw, u8 dev_addr, u8 byte_offset,
u8 *eeprom_data)
{
s32 status = 0;
@@ -1028,7 +1029,7 @@ static s32 ixgbe_read_i2c_eeprom_82598(struct ixgbe_hw *hw, u8 byte_offset,
* 0xC30D. These registers are used to talk to the SFP+
* module's EEPROM through the SDA/SCL (I2C) interface.
*/
- sfp_addr = (IXGBE_I2C_EEPROM_DEV_ADDR << 8) + byte_offset;
+ sfp_addr = (dev_addr << 8) + byte_offset;
sfp_addr = (sfp_addr | IXGBE_I2C_EEPROM_READ_MASK);
hw->phy.ops.write_reg(hw,
IXGBE_MDIO_PMA_PMD_SDA_SCL_ADDR,
@@ -1068,6 +1069,37 @@ out:
}
/**
+ * ixgbe_read_i2c_eeprom_82598 - Reads 8 bit word over I2C interface.
+ * @hw: pointer to hardware structure
+ * @byte_offset: EEPROM byte offset to read
+ * @eeprom_data: value read
+ *
+ * Performs 8 byte read operation to SFP module's EEPROM over I2C interface.
+ **/
+static s32 ixgbe_read_i2c_eeprom_82598(struct ixgbe_hw *hw, u8 byte_offset,
+ u8 *eeprom_data)
+{
+ return ixgbe_read_i2c_phy_82598(hw, IXGBE_I2C_EEPROM_DEV_ADDR, byte_offset, eeprom_data);
+}
+
+/**
+ * ixgbe_read_i2c_dom_82598 - Reads 8 bit word over I2C interface.
+ * @hw: pointer to hardware structure
+ * @byte_offset: DOM byte offset to read
+ * @eeprom_data: value read
+ *
+ * Performs 8 byte read operation to SFP module's DOM registers over I2C interface.
+ **/
+static s32 ixgbe_read_i2c_dom_82598(struct ixgbe_hw *hw, u8 byte_offset,
+ u8 *dom_data)
+{
+ return ixgbe_read_i2c_phy_82598(hw, IXGBE_I2C_EEPROM_DEV_ADDR2, byte_offset, dom_data);
+}
+
+
+
+
+/**
* ixgbe_get_supported_physical_layer_82598 - Returns physical layer type
* @hw: pointer to hardware structure
*
@@ -1300,6 +1332,7 @@ static struct ixgbe_phy_operations phy_ops_82598 = {
.write_reg = &ixgbe_write_phy_reg_generic,
.setup_link = &ixgbe_setup_phy_link_generic,
.setup_link_speed = &ixgbe_setup_phy_link_speed_generic,
+ .read_i2c_dom = &ixgbe_read_i2c_dom_82598,
.read_i2c_eeprom = &ixgbe_read_i2c_eeprom_82598,
.check_overtemp = &ixgbe_tn_check_overtemp,
};
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c
index e75f5a4..4eb64a6 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c
@@ -2253,6 +2253,7 @@ static struct ixgbe_phy_operations phy_ops_82599 = {
.setup_link_speed = &ixgbe_setup_phy_link_speed_generic,
.read_i2c_byte = &ixgbe_read_i2c_byte_generic,
.write_i2c_byte = &ixgbe_write_i2c_byte_generic,
+ .read_i2c_dom = &ixgbe_read_i2c_dom_generic,
.read_i2c_eeprom = &ixgbe_read_i2c_eeprom_generic,
.write_i2c_eeprom = &ixgbe_write_i2c_eeprom_generic,
.check_overtemp = &ixgbe_tn_check_overtemp,
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index a545728..2bf16c3 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -39,6 +39,7 @@
#include <linux/uaccess.h>
#include "ixgbe.h"
+#include "ixgbe_phy.h"
#define IXGBE_ALL_RAR_ENTRIES 16
@@ -2699,6 +2700,100 @@ static int ixgbe_get_ts_info(struct net_device *dev,
return 0;
}
+static int ixgbe_get_module_info(struct net_device *dev,
+ struct ethtool_modinfo *modinfo)
+{
+ struct ixgbe_adapter *adapter = netdev_priv(dev);
+ struct ixgbe_hw *hw = &adapter->hw;
+ u32 status = IXGBE_ERR_PHY_ADDR_INVALID;
+ u8 sff8472_rev = IXGBE_SFF_SFF_8472_UNSUP;
+
+ /* We do not support the operation when SFP is absent/unsupported */
+ if (hw->phy.sfp_type == ixgbe_sfp_type_not_present
+ || hw->phy.sfp_type == ixgbe_sfp_type_unknown) {
+ return -EOPNOTSUPP;
+ }
+
+ /* Check whether we support SFF-8472 or not */
+ status = hw->phy.ops.read_i2c_eeprom(hw,
+ IXGBE_SFF_SFF_8472_COMP,
+ &sff8472_rev);
+
+ if (status == IXGBE_ERR_SWFW_SYNC ||
+ status == IXGBE_ERR_I2C ||
+ status == IXGBE_ERR_SFP_NOT_PRESENT)
+ /* Error occurred while reading module */
+ return -EIO;
+
+ if (sff8472_rev == IXGBE_SFF_SFF_8472_UNSUP)
+ {
+ /* We have a SFP, but it does not support DOM. */
+ modinfo->type = ETH_MODULE_SFF_8079;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8079_LEN;
+ return 0;
+ }
+
+ /* We have a SFP which supports a revision of SFF-8472. */
+ modinfo->type = ETH_MODULE_SFF_8472;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN;
+ return 0;
+}
+
+static int ixgbe_get_module_eeprom(struct net_device *dev,
+ struct ethtool_eeprom *ee,
+ u8 *data)
+{
+ struct ixgbe_adapter *adapter = netdev_priv(dev);
+ struct ixgbe_hw *hw = &adapter->hw;
+ u32 status = IXGBE_ERR_PHY_ADDR_INVALID;
+ u8 databyte = 0xFF;
+ int i = 0;
+
+ /* We do not support the operation when SFP is absent/unsupported */
+ if (hw->phy.sfp_type == ixgbe_sfp_type_not_present
+ || hw->phy.sfp_type == ixgbe_sfp_type_unknown) {
+ return -EOPNOTSUPP;
+ }
+
+ /* Read the first block, SFF-8079 */
+ for (i = 0; (i < ee->len && i < ETH_MODULE_SFF_8079_LEN); i++)
+ {
+ status = hw->phy.ops.read_i2c_eeprom(hw, i, &databyte);
+ if (status == IXGBE_ERR_SWFW_SYNC ||
+ status == IXGBE_ERR_I2C ||
+ status == IXGBE_ERR_SFP_NOT_PRESENT)
+ /* Error occurred while reading module */
+ return -EIO;
+ data[i] = databyte;
+ }
+
+ /* If the second block is requested, check if SFF-8472 is supported. */
+ if (ee->len > ETH_MODULE_SFF_8079_LEN)
+ {
+ if (data[IXGBE_SFF_SFF_8472_COMP] == IXGBE_SFF_SFF_8472_UNSUP
+ || !hw->phy.ops.read_i2c_dom)
+ return -EOPNOTSUPP;
+ /* Read the second block, SFF-8472 */
+ for (i = ETH_MODULE_SFF_8079_LEN;
+ (i < ee->len && i < ETH_MODULE_SFF_8472_LEN); i++)
+ {
+ status = hw->phy.ops.read_i2c_dom(hw, i - ETH_MODULE_SFF_8079_LEN, &databyte);
+ if (status == IXGBE_ERR_SWFW_SYNC ||
+ status == IXGBE_ERR_I2C ||
+ status == IXGBE_ERR_SFP_NOT_PRESENT)
+ /* Error occurred while reading module */
+ return -EIO;
+ data[i] = databyte;
+ }
+ }
+
+ return 0;
+}
+
+
+
+
+
static const struct ethtool_ops ixgbe_ethtool_ops = {
.get_settings = ixgbe_get_settings,
.set_settings = ixgbe_set_settings,
@@ -2728,6 +2823,8 @@ static const struct ethtool_ops ixgbe_ethtool_ops = {
.get_rxnfc = ixgbe_get_rxnfc,
.set_rxnfc = ixgbe_set_rxnfc,
.get_ts_info = ixgbe_get_ts_info,
+ .get_module_info = ixgbe_get_module_info,
+ .get_module_eeprom = ixgbe_get_module_eeprom,
};
void ixgbe_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
index 71659ed..e9599d9 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
@@ -1206,6 +1206,27 @@ s32 ixgbe_read_i2c_eeprom_generic(struct ixgbe_hw *hw, u8 byte_offset,
}
/**
+ * ixgbe_read_i2c_dom_generic - Reads 8 bit DOM word over I2C interface
+ * @hw: pointer to hardware structure
+ * @byte_offset: DOM byte offset to read
+ * @eeprom_data: value read
+ *
+ * Performs byte read operation to SFP module's DOM info over I2C interface,
+ * possibly managing the byte swap.
+ **/
+s32 ixgbe_read_i2c_dom_generic(struct ixgbe_hw *hw, u8 byte_offset,
+ u8 *dom_data)
+{
+
+ // FIXME: Got to manage the A0/A2 switch according to SFF-8472 (bank swap).
+
+ return hw->phy.ops.read_i2c_byte(hw, byte_offset,
+ IXGBE_I2C_EEPROM_DEV_ADDR2,
+ dom_data);
+}
+
+
+/**
* ixgbe_write_i2c_eeprom_generic - Writes 8 bit EEPROM word over I2C interface
* @hw: pointer to hardware structure
* @byte_offset: EEPROM byte offset to write
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.h
index cc18165..1ce0624 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.h
@@ -30,6 +30,8 @@
#include "ixgbe_type.h"
#define IXGBE_I2C_EEPROM_DEV_ADDR 0xA0
+#define IXGBE_I2C_EEPROM_DEV_ADDR2 0xA2
+#define IXGBE_I2C_EEPROM_BANK_LEN 0xFF
/* EEPROM byte offsets */
#define IXGBE_SFF_IDENTIFIER 0x0
@@ -41,6 +43,7 @@
#define IXGBE_SFF_10GBE_COMP_CODES 0x3
#define IXGBE_SFF_CABLE_TECHNOLOGY 0x8
#define IXGBE_SFF_CABLE_SPEC_COMP 0x3C
+#define IXGBE_SFF_SFF_8472_COMP 0x5E
/* Bitmasks */
#define IXGBE_SFF_DA_PASSIVE_CABLE 0x4
@@ -88,6 +91,14 @@
#define IXGBE_TN_LASI_STATUS_REG 0x9005
#define IXGBE_TN_LASI_STATUS_TEMP_ALARM 0x0008
+/* SFP+ SFF-8472 Compliance code */
+#define IXGBE_SFF_SFF_8472_UNSUP 0x00
+#define IXGBE_SFF_SFF_8472_REV_9_3 0x01
+#define IXGBE_SFF_SFF_8472_REV_9_5 0x02
+#define IXGBE_SFF_SFF_8472_REV_10_2 0x03
+#define IXGBE_SFF_SFF_8472_REV_10_4 0x04
+#define IXGBE_SFF_SFF_8472_REV_11_0 0x05
+
s32 ixgbe_init_phy_ops_generic(struct ixgbe_hw *hw);
s32 ixgbe_identify_phy_generic(struct ixgbe_hw *hw);
s32 ixgbe_reset_phy_generic(struct ixgbe_hw *hw);
@@ -126,6 +137,8 @@ s32 ixgbe_write_i2c_byte_generic(struct ixgbe_hw *hw, u8 byte_offset,
u8 dev_addr, u8 data);
s32 ixgbe_read_i2c_eeprom_generic(struct ixgbe_hw *hw, u8 byte_offset,
u8 *eeprom_data);
+s32 ixgbe_read_i2c_dom_generic(struct ixgbe_hw *hw, u8 byte_offset,
+ u8 *dom_data);
s32 ixgbe_write_i2c_eeprom_generic(struct ixgbe_hw *hw, u8 byte_offset,
u8 eeprom_data);
#endif /* _IXGBE_PHY_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index 21915e2..8fa1f15 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -2882,6 +2882,7 @@ struct ixgbe_phy_operations {
s32 (*get_firmware_version)(struct ixgbe_hw *, u16 *);
s32 (*read_i2c_byte)(struct ixgbe_hw *, u8, u8, u8 *);
s32 (*write_i2c_byte)(struct ixgbe_hw *, u8, u8, u8);
+ s32 (*read_i2c_dom)(struct ixgbe_hw *, u8 , u8 *);
s32 (*read_i2c_eeprom)(struct ixgbe_hw *, u8 , u8 *);
s32 (*write_i2c_eeprom)(struct ixgbe_hw *, u8, u8);
s32 (*check_overtemp)(struct ixgbe_hw *);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
index de4da52..a129aad 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
@@ -879,6 +879,7 @@ static struct ixgbe_phy_operations phy_ops_X540 = {
.setup_link_speed = &ixgbe_setup_phy_link_speed_generic,
.read_i2c_byte = &ixgbe_read_i2c_byte_generic,
.write_i2c_byte = &ixgbe_write_i2c_byte_generic,
+ .read_i2c_dom = &ixgbe_read_i2c_dom_generic,
.read_i2c_eeprom = &ixgbe_read_i2c_eeprom_generic,
.write_i2c_eeprom = &ixgbe_write_i2c_eeprom_generic,
.check_overtemp = &ixgbe_tn_check_overtemp,
^ permalink raw reply related
* Re: [Pv-drivers] [PATCH 0/6] VSOCK for Linux upstreaming
From: Anthony Liguori @ 2012-11-15 21:32 UTC (permalink / raw)
To: Gerd Hoffmann
Cc: Andy King, pv-drivers, netdev, linux-kernel, virtualization,
gregkh, David Miller, georgezhang, Benjamin Herrenschmidt
In-Reply-To: <509A06AB.2020700@redhat.com>
On 11/07/2012 12:58 AM, Gerd Hoffmann wrote:
> On 11/05/12 19:19, Andy King wrote:
>> Hi David,
>>
>>> The big and only question is whether anyone can actually use any of
>>> this stuff without your proprietary bits?
>>
>> Do you mean the VMCI calls? The VMCI driver is in the process of being
>> upstreamed into the drivers/misc tree. Greg (cc'd on these patches) is
>> actively reviewing that code and we are addressing feedback.
>>
>> Also, there was some interest from RedHat into using vSockets as a unified
>> interface, routed over a hypervisor-specific transport (virtio or
>> otherwise, although for now VMCI is the only one implemented).
>
> Can you outline how this can be done? From a quick look over the code
> it seems like vsock has a hard dependency on vmci, is that correct?
>
> When making vsock a generic, reusable kernel service it should be the
> other way around: vsock should provide the core implementation and an
> interface where hypervisor-specific transports (vmci, virtio, xenbus,
> ...) can register themself.
This was already done in a hypervisor neutral way using virtio:
http://lists.openwall.net/netdev/2008/12/14/8
The concept was Nacked and that led to the abomination of virtio-serial. If an
address family for virtualization is on the table, we should reconsider
AF_VMCHANNEL.
I'd be thrilled to get rid of virtio-serial...
Regards,
Anthony Liguori
>
> cheers,
> Gerd
^ permalink raw reply
* Re: gro vs vlan in myri10ge
From: Ben Hutchings @ 2012-11-15 20:39 UTC (permalink / raw)
To: Eric Dumazet, Andrew Gallatin; +Cc: netdev
In-Reply-To: <1352428870.19779.496.camel@edumazet-glaptop>
On Thu, 2012-11-08 at 18:41 -0800, Eric Dumazet wrote:
> On Thu, 2012-11-08 at 21:20 -0500, Andrew Gallatin wrote:
> > Hi,
> >
> > I've wanted to convert myri10ge from LRO to GRO for quite a while.
> > The problem I'm facing is that the NIC cannot perform hardware vlan
> > tag offload, so GRO performance is far below LRO performance when
> > receiving vlan tagged TCP traffic.
Thanks for the reminder; I need to deal with this in sfc as well.
> > If a vlan tagged frame is passed to lro_receive_frags(), inet_lro will
> > look at the encapsulated IPv4 frame and TCP aggregation will succeed.
> > However, it appears that GRO will not do this. When I patch the
> > driver to use GRO, and configure a vlan interface, I see high CPU
> > utilization and poor bandwidth when I'm receiving a netperf TCP stream
> > on the vlan interface. If I use LRO in an unpatched driver, then I
> > see good receive performance in the same scenario.
> >
> > What is the best way to "fix" this?
> >
> > Unless I'm just using GRO wrong, it seems that the simplest thing for
> > me to do is to claim NETIF_F_HW_VLAN_RX, but pop the tags in the
> > driver so as to allow myri10ge to pass up a non-encapsulated frame the
> > same way that (nearly?) every other 10GbE NIC does. I've got a quick
> > and dirty patch that confirms doing the vtag pop in the driver gives
> > me roughly the same performance with GRO as I used to have with LRO.
> >
> > Is this (popping vlan tags in the driver) acceptable, or is it
> > too much of a layering violation?
>
> Given GRO assumes NIC does hardware vlan offloading, I guess
> I would chose to do that.
Really, after all the changes in 2.6.37 to unify behaviour between the
offloaded and non-offloaded paths?
> It seems unfortunate to add vlan decap in GRO path, already very
> complex.
True but can't this be done at the top?
I'll post a patch for this shortly.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
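For anyone unfamiliar with what "popping the tag in the driver" means at the byte level, here is a minimal userspace sketch, not taken from myri10ge or sfc. It assumes a plain Ethernet frame in a flat buffer with a single 802.1Q tag; the function and macro names are hypothetical. The two MAC addresses are slid forward by four bytes so the payload is not copied.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAC_ADDRS_LEN 12      /* destination + source MAC */
#define SKETCH_VLAN_TAG_LEN  4       /* TPID (2 bytes) + TCI (2 bytes) */
#define SKETCH_TPID_8021Q    0x8100

/* If the frame carries an 802.1Q tag, strip it in place.
 * On success, stores the tag control information in *tci, shrinks *len
 * by four, and returns the new start of the frame. Returns NULL and
 * leaves everything untouched if the frame is too short or untagged. */
static uint8_t *vlan_pop(uint8_t *frame, size_t *len, uint16_t *tci)
{
	if (*len < SKETCH_MAC_ADDRS_LEN + SKETCH_VLAN_TAG_LEN + 2)
		return NULL;

	if (((frame[12] << 8) | frame[13]) != SKETCH_TPID_8021Q)
		return NULL;

	*tci = (uint16_t)((frame[14] << 8) | frame[15]);

	/* Slide the MAC addresses over the tag; the encapsulated
	 * EtherType now directly follows them. */
	memmove(frame + SKETCH_VLAN_TAG_LEN, frame, SKETCH_MAC_ADDRS_LEN);
	*len -= SKETCH_VLAN_TAG_LEN;
	return frame + SKETCH_VLAN_TAG_LEN;
}
```

The VLAN ID is the low 12 bits of the returned TCI; a driver doing this would hand the untagged frame plus the TCI to the stack, as hardware offload would.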
^ permalink raw reply
* pull-request: can-next 2012-11-15
From: Marc Kleine-Budde @ 2012-11-15 20:30 UTC (permalink / raw)
To: David Miller; +Cc: Linux Netdev List, linux-can@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 2466 bytes --]
Hello David
this pull request is for net-next, for the v3.8 release cycle. Muhammad
Ghias added support for another board to the plx_pci sja1000 driver.
Matthias Fuchs improved the esd_usb2 driver with listen-only mode and
CAN-USB/Micro support. Andreas Larsson contributed a driver for the
GRHCAN CAN IP-Core.
regards,
Marc
---
The following changes since commit 702ed3c1a9dfe4dfe112f13542d0c9d689f5008b:
Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge (2012-11-14 22:10:50 -0500)
are available in the git repository at:
git://gitorious.org/linux-can/linux-can-next.git for-davem
for you to fetch changes up to 6cec9b07fe6a0c4dfbcdcee7c6283529f087c521:
can: grcan: Add device driver for GRCAN and GRHCAN cores (2012-11-15 20:47:26 +0100)
----------------------------------------------------------------
Andreas Larsson (1):
can: grcan: Add device driver for GRCAN and GRHCAN cores
Chuansheng Liu (1):
can: janz-ican3: Fix the usage of wait_for_completion_timeout
Matthias Fuchs (2):
can: usb: esd_usb2: Add support for listen-only mode
can: usb: esd_usb2: Add support for CAN-USB/Micro
Muhammad Ghias (1):
can: sja1000: plx_pci: add support for Connect Tech Inc's Canpro/104-Plus Opto CAN board
Documentation/ABI/testing/sysfs-class-net-grcan | 35 +
.../devicetree/bindings/net/can/grcan.txt | 28 +
Documentation/kernel-parameters.txt | 18 +
drivers/net/can/Kconfig | 9 +
drivers/net/can/Makefile | 1 +
drivers/net/can/grcan.c | 1756 ++++++++++++++++++++
drivers/net/can/janz-ican3.c | 4 +-
drivers/net/can/sja1000/Kconfig | 1 +
drivers/net/can/sja1000/plx_pci.c | 19 +
drivers/net/can/usb/esd_usb2.c | 35 +-
10 files changed, 1898 insertions(+), 8 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-class-net-grcan
create mode 100644 Documentation/devicetree/bindings/net/can/grcan.txt
create mode 100644 drivers/net/can/grcan.c
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]
^ permalink raw reply
* RE: 82571EB: Detected Hardware Unit Hang
From: Dave, Tushar N @ 2012-11-15 20:26 UTC (permalink / raw)
To: Joe Jin
Cc: e1000-devel@lists.sf.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, Mary Mcgrath
In-Reply-To: <50A43828.6000702@oracle.com>
>-----Original Message-----
>From: Joe Jin [mailto:joe.jin@oracle.com]
>Sent: Wednesday, November 14, 2012 4:33 PM
>To: Dave, Tushar N
>Cc: e1000-devel@lists.sf.net; netdev@vger.kernel.org; linux-
>kernel@vger.kernel.org; Mary Mcgrath
>Subject: Re: 82571EB: Detected Hardware Unit Hang
>
>On 11/14/12 11:45, Dave, Tushar N wrote:
>>> -----Original Message-----
>>> From: Joe Jin [mailto:joe.jin@oracle.com]
>>> Sent: Tuesday, November 13, 2012 6:48 PM
>>> To: Dave, Tushar N
>>> Cc: e1000-devel@lists.sf.net; netdev@vger.kernel.org; linux-
>>> kernel@vger.kernel.org; Mary Mcgrath
>>> Subject: Re: 82571EB: Detected Hardware Unit Hang
>>>
>>> On 11/09/12 04:35, Dave, Tushar N wrote:
>>>> All devices in path from root complex to 82571, should have *same*
>>>> max
>>> payload size otherwise it can cause hang.
>>>> Can you double check this?
>>>
>>> Hi Tushar,
>>>
>>> Checked with hardware vendor and they said no way to modify the max
>>> payload size from BIOS, can I modify it from driver side?
>>
>> If you want to change value for 82571 device you can do it from eeprom
>but for other upstream devices I am not sure. I will check with my team.
>
>Hi Tushar,
>
>Would you please help to find the offset of max payload size in eeprom?
>I'd like to have a try to modify it by ethtool.
It is defined using bit 8 of word 0x1A.
Bit value 0 = 128B , bit value 1 = 256B
-Tushar
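The bit layout given above can be written down as a small sketch. The helper names are hypothetical and this is not e1000e code; it only encodes "bit 8 of word 0x1A, 0 = 128B, 1 = 256B". If the EEPROM is exposed to ethtool as little-endian 16-bit words (an assumption worth verifying on the target system), word 0x1A would sit at byte offsets 0x34 and 0x35, with bit 8 being the lowest bit of the byte at 0x35.

```c
#include <stdint.h>

#define SKETCH_NVM_MPS_WORD 0x1A               /* word index from the reply above */
#define SKETCH_NVM_MPS_BIT  ((uint16_t)(1u << 8)) /* bit 8 selects the payload size */

/* Decode the max payload size, in bytes, from the value of word 0x1A. */
static unsigned int nvm_max_payload_bytes(uint16_t word_1a)
{
	return (word_1a & SKETCH_NVM_MPS_BIT) ? 256u : 128u;
}

/* Return the word with the payload-size bit set (256B) or cleared (128B),
 * leaving every other bit as it was. */
static uint16_t nvm_with_max_payload(uint16_t word_1a, int want_256)
{
	if (want_256)
		return (uint16_t)(word_1a | SKETCH_NVM_MPS_BIT);
	return (uint16_t)(word_1a & (uint16_t)~SKETCH_NVM_MPS_BIT);
}
```

Reading the word first and writing back only the changed bit avoids disturbing the other settings stored in the same word.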
^ permalink raw reply
* Re: [net-next 0/9][pull request] Intel Wired LAN Driver Updates
From: David Miller @ 2012-11-15 20:18 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: netdev, gospo, sassmann
In-Reply-To: <1352990387-3872-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 15 Nov 2012 06:39:38 -0800
> This series contains updates to ioat (DCA) and ixgbevf.
>
> The following are changes since commit 702ed3c1a9dfe4dfe112f13542d0c9d689f5008b:
> Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge
> and are available in the git repository at:
> git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master
Pulled, thanks Jeff.
^ permalink raw reply
* Re: [PATCH v2 net-next 01/22] bnx2x: Support probing and removing of VF device
From: David Miller @ 2012-11-15 20:09 UTC (permalink / raw)
To: ariele; +Cc: netdev, eilong
In-Reply-To: <1352998067-9707-2-git-send-email-ariele@broadcom.com>
From: "Ariel Elior" <ariele@broadcom.com>
Date: Thu, 15 Nov 2012 18:47:26 +0200
I'm very angry, I told you guys to fix the coding style issues in this
patch set, and you didn't even fix the class of problems I
specifically asked to be fixed. Even the very first hunk in the very
first patch has the exact problem I said you MUST resolve.
> To support probing and removing of a bnx2x virtual function
> the following were added:
> 1. add bnx2x_vfpf.h: defines the VF to PF channel
> 2. add bnx2x_sriov.h: header for bnx2x SR-IOV functionality
> 3. enumerate VF hw types (identify VFs)
> 4. if driving a VF, map VF bar
> 5. if driving a VF, allocate Vf to PF channel
> 6. refactor interrupt flows to include VF
>
> Signed-off-by: Ariel Elior <ariele@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Please stop wasting reviewer resources, because that is what
you are doing when the very first hunk we see in a huge patch
submission is something like this:
> +enum bnx2x_int_mode {
> + BNX2X_INT_MODE_MSIX,
> + BNX2X_INT_MODE_INTX,
> + BNX2X_INT_MODE_MSI
> +};
> +
> +
> +
There is no reason to have 3 blank lines here, one is more than
sufficient.
Tell me, what exactly was NOT clear in the directives I gave you for
the previous submission 2 days ago:
http://marc.info/?l=linux-netdev&m=135283453929818&w=2
I said, remove gratuitous empty lines. That's what I asked for, and
the very first patch starts by adding gratuitous empty lines.
Re-audit this entire patch series and do not even think about
resubmitting this until such coding style problems are eliminated.
In fact, I'm going to ignore any patches you submit for the next week,
you're officially on my crap list. Don't even think about
resubmitting this patch series until next Thursday at the earliest.
* Re: stp issue and "bridge: send proper message_age in config BPDU"
From: Lennert Buytenhek @ 2012-11-15 20:09 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, Steven Kath, Chris Healy
In-Reply-To: <20121115115853.6b9a3ec3@s6510.linuxnetplumber.net>
On Thu, Nov 15, 2012 at 11:58:53AM -0800, Stephen Hemminger wrote:
> > FWIW, I've been debugging an STP issue on an old product kernel tree
> > that I couldn't find an upstream fix for, but after having debugged the
> > issue, there does actually appear to be an upstream commit that makes
> > the issue go away, but the commit message on that commit is somewhat
> > unclear about what the issue is that it's fixing and why the given fix
> > fixes it, and given that I spent considerable time debugging it I
> > figured I'd send this out for the sake of the next person googling for
> > this.
> >
> > The symptoms are pretty much what's described in this bug:
> >
> > https://bugzilla.vyatta.com/show_bug.cgi?id=7164
> >
> > And the upstream commit is:
> >
> > https://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=0c03150e7ea8f7fcd03cfef29385e0010b22ee92
> >
> > commit 0c03150e7ea8f7fcd03cfef29385e0010b22ee92
> > Author: stephen hemminger <shemminger@vyatta.com>
> > Date: Fri Jul 22 07:47:06 2011 +0000
> >
> > bridge: send proper message_age in config BPDU
> >
> > What I was seeing was that as a non-root bridge, Linux STP would often
> > fail to transmit BPDUs out of designated ports upon reception of a BPDU
> > from an upstream port.
> >
> > br_received_config_bpdu() handles the received BPDU, and calls into
> > br_record_config_information(), which resets the message age timer on
> > this port to jiffies + (p->br->max_age - bpdu->message_age);
> >
> > When br_received_config_bpdu() then calls br_config_bpdu_generation(),
> > the latter will call into br_transmit_config() for each enabled
> > designated port, which will send out BPDUs with age br->max_age
> > - (root->message_age_timer.expires - jiffies) + MESSAGE_AGE_INCR if
> > we're not the root bridge, which if you plug in the previously
> > computed timeout simplifies to bpdu->message_age + MESSAGE_AGE_INCR,
> > which is exactly what we want it to be and this computation isn't
> > wrong per se.
> >
> > The problem with the above logic, though, is that it fails to
> > consider that mod_timer() can round up the timeout you give it (i.e.
> > add timer slack), and that reading back root->message_age_timer.expires
> > in br_transmit_config() won't necessarily return the value that was
> > plugged into mod_timer() for this timer in br_record_config_information().
> >
> > E.g. if mod_timer() decides to add 5 jiffies to the timeout, the message
> > age value that br_transmit_config() will compute will be:
> >
> > br->max_age - (root->message_age_timer.expires - jiffies) +
> > MESSAGE_AGE_INCR
> >
> > = br->max_age - (jiffies + (p->br->max_age - bpdu->message_age) + 5
> > - jiffies) + MESSAGE_AGE_INCR
> >
> > = br->max_age - (p->br->max_age - bpdu->message_age + 5) +
> > MESSAGE_AGE_INCR
> >
> > = bpdu->message_age - 5 + MESSAGE_AGE_INCR
> >
> > Which will likely make the computed message age value negative.
> > This message age is stored in a signed int, but is then compared
> > against the bridge max age time:
> >
> > if (bpdu.message_age < br->max_age) {
> >
> > and br->max_age is an unsigned long, causing the comparison to be
> > unsigned and always fail if the computed message age was negative,
> > and no BPDU to be sent (causing our downstream neighbours to time
> > us out after some time and etc).
> >
> > Commit 0c03150e7ea fixes the issue because it avoids reading back the
> > expiration time (possibly with timer slack included) of a previously
> > set timer. Disabling timer slack on the message age timer achieves
> > the same thing (and is what I did initially):
> >
> > --- a/net/bridge/br_stp_timer.c
> > +++ b/net/bridge/br_stp_timer.c
> > @@ -158,6 +158,7 @@ void br_stp_port_timer_init(struct net_bridge_port *p)
> > {
> > setup_timer(&p->message_age_timer, br_message_age_timer_expired,
> > (unsigned long) p);
> > + set_timer_slack(&p->message_age_timer, 0);
> >
> > setup_timer(&p->forward_delay_timer, br_forward_delay_timer_expired,
> > (unsigned long) p);
>
> Disabling timer slack causes additional power consumption because
> the tick wakeup has to be immediate. I prefer to handle late timer
> in the code.
ACK, I wasn't advocating that we do this instead.
> P.s: not sure if timer slack existed back when I first saw the problem.
Timer slack was introduced in March 2010, and first appeared in
2.6.34, and this bug was reported against Vyatta 6.2, which has 2.6.35.
(I ran into it on 2.6.35.3.)
In fact, timer slack is the only reason why this issue triggers in
the first place. Without timer slack, BPDU generation works fine on
either version of the STP code.
* Re: stp issue and "bridge: send proper message_age in config BPDU"
From: Stephen Hemminger @ 2012-11-15 19:58 UTC (permalink / raw)
To: Lennert Buytenhek
Cc: netdev, Steven Kath, Anatoly Kaplan, Arthur Xiong, Chris Healy
In-Reply-To: <20121115195200.GD730@wantstofly.org>
On Thu, 15 Nov 2012 20:52:00 +0100
Lennert Buytenhek <buytenh@wantstofly.org> wrote:
> Hi!
>
> FWIW, I've been debugging an STP issue on an old product kernel tree
> that I couldn't find an upstream fix for, but after having debugged the
> issue, there does actually appear to be an upstream commit that makes
> the issue go away, but the commit message on that commit is somewhat
> unclear about what the issue is that it's fixing and why the given fix
> fixes it, and given that I spent considerable time debugging it I
> figured I'd send this out for the sake of the next person googling for
> this.
>
> The symptoms are pretty much what's described in this bug:
>
> https://bugzilla.vyatta.com/show_bug.cgi?id=7164
>
> And the upstream commit is:
>
> https://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=0c03150e7ea8f7fcd03cfef29385e0010b22ee92
>
> commit 0c03150e7ea8f7fcd03cfef29385e0010b22ee92
> Author: stephen hemminger <shemminger@vyatta.com>
> Date: Fri Jul 22 07:47:06 2011 +0000
>
> bridge: send proper message_age in config BPDU
>
> What I was seeing was that as a non-root bridge, Linux STP would often
> fail to transmit BPDUs out of designated ports upon reception of a BPDU
> from an upstream port.
>
> br_received_config_bpdu() handles the received BPDU, and calls into
> br_record_config_information(), which resets the message age timer on
> this port to jiffies + (p->br->max_age - bpdu->message_age);
>
> When br_received_config_bpdu() then calls br_config_bpdu_generation(),
> the latter will call into br_transmit_config() for each enabled
> designated port, which will send out BPDUs with age br->max_age
> - (root->message_age_timer.expires - jiffies) + MESSAGE_AGE_INCR if
> we're not the root bridge, which if you plug in the previously
> computed timeout simplifies to bpdu->message_age + MESSAGE_AGE_INCR,
> which is exactly what we want it to be and this computation isn't
> wrong per se.
>
> The problem with the above logic, though, is that it fails to
> consider that mod_timer() can round up the timeout you give it (i.e.
> add timer slack), and that reading back root->message_age_timer.expires
> in br_transmit_config() won't necessarily return the value that was
> plugged into mod_timer() for this timer in br_record_config_information().
>
> E.g. if mod_timer() decides to add 5 jiffies to the timeout, the message
> age value that br_transmit_config() will compute will be:
>
> br->max_age - (root->message_age_timer.expires - jiffies) +
> MESSAGE_AGE_INCR
>
> = br->max_age - (jiffies + (p->br->max_age - bpdu->message_age) + 5
> - jiffies) + MESSAGE_AGE_INCR
>
> = br->max_age - (p->br->max_age - bpdu->message_age + 5) +
> MESSAGE_AGE_INCR
>
> = bpdu->message_age - 5 + MESSAGE_AGE_INCR
>
> Which will likely make the computed message age value negative.
> This message age is stored in a signed int, but is then compared
> against the bridge max age time:
>
> if (bpdu.message_age < br->max_age) {
>
> and br->max_age is an unsigned long, causing the comparison to be
> unsigned and always fail if the computed message age was negative,
> and no BPDU to be sent (causing our downstream neighbours to time
> us out after some time and etc).
>
> Commit 0c03150e7ea fixes the issue because it avoids reading back the
> expiration time (possibly with timer slack included) of a previously
> set timer. Disabling timer slack on the message age timer achieves
> the same thing (and is what I did initially):
>
> --- a/net/bridge/br_stp_timer.c
> +++ b/net/bridge/br_stp_timer.c
> @@ -158,6 +158,7 @@ void br_stp_port_timer_init(struct net_bridge_port *p)
> {
> setup_timer(&p->message_age_timer, br_message_age_timer_expired,
> (unsigned long) p);
> + set_timer_slack(&p->message_age_timer, 0);
>
> setup_timer(&p->forward_delay_timer, br_forward_delay_timer_expired,
> (unsigned long) p);
>
>
> thanks,
> Lennert
Disabling timer slack causes additional power consumption because
the tick wakeup has to be immediate. I prefer to handle a late timer
in the code.
P.s: not sure if timer slack existed back when I first saw the problem.
* stp issue and "bridge: send proper message_age in config BPDU"
From: Lennert Buytenhek @ 2012-11-15 19:52 UTC (permalink / raw)
To: Stephen Hemminger, netdev
Cc: Steven Kath, Anatoly Kaplan, Arthur Xiong, Chris Healy
Hi!
FWIW, I've been debugging an STP issue on an old product kernel tree
that I couldn't find an upstream fix for. After having debugged the
issue, there does actually appear to be an upstream commit that makes
it go away, but the commit message is somewhat unclear about what
issue it fixes and why the fix works. Given that I spent considerable
time debugging it, I figured I'd send this out for the sake of the
next person googling for this.
The symptoms are pretty much what's described in this bug:
https://bugzilla.vyatta.com/show_bug.cgi?id=7164
And the upstream commit is:
https://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=0c03150e7ea8f7fcd03cfef29385e0010b22ee92
commit 0c03150e7ea8f7fcd03cfef29385e0010b22ee92
Author: stephen hemminger <shemminger@vyatta.com>
Date: Fri Jul 22 07:47:06 2011 +0000
bridge: send proper message_age in config BPDU
What I was seeing was that as a non-root bridge, Linux STP would often
fail to transmit BPDUs out of designated ports upon reception of a BPDU
from an upstream port.
br_received_config_bpdu() handles the received BPDU, and calls into
br_record_config_information(), which resets the message age timer on
this port to jiffies + (p->br->max_age - bpdu->message_age);
When br_received_config_bpdu() then calls br_config_bpdu_generation(),
the latter will call into br_transmit_config() for each enabled
designated port. If we're not the root bridge, that sends out BPDUs
with age br->max_age - (root->message_age_timer.expires - jiffies)
+ MESSAGE_AGE_INCR. If you plug in the previously computed timeout,
this simplifies to bpdu->message_age + MESSAGE_AGE_INCR, which is
exactly what we want it to be, so this computation isn't wrong per se.
The problem with the above logic, though, is that it fails to
consider that mod_timer() can round up the timeout you give it (i.e.
add timer slack), and that reading back root->message_age_timer.expires
in br_transmit_config() won't necessarily return the value that was
plugged into mod_timer() for this timer in br_record_config_information().
E.g. if mod_timer() decides to add 5 jiffies to the timeout, the message
age value that br_transmit_config() will compute will be:
br->max_age - (root->message_age_timer.expires - jiffies) +
MESSAGE_AGE_INCR
= br->max_age - (jiffies + (p->br->max_age - bpdu->message_age) + 5
- jiffies) + MESSAGE_AGE_INCR
= br->max_age - (p->br->max_age - bpdu->message_age + 5) +
MESSAGE_AGE_INCR
= bpdu->message_age - 5 + MESSAGE_AGE_INCR
Which will likely make the computed message age value negative.
This message age is stored in a signed int, but is then compared
against the bridge max age time:
if (bpdu.message_age < br->max_age) {
and br->max_age is an unsigned long, so the comparison is done
unsigned and always fails if the computed message age is negative.
No BPDU is then sent, and our downstream neighbours eventually time
us out.
Commit 0c03150e7ea fixes the issue because it avoids reading back the
expiration time (possibly with timer slack included) of a previously
set timer. Disabling timer slack on the message age timer achieves
the same thing (and is what I did initially):
--- a/net/bridge/br_stp_timer.c
+++ b/net/bridge/br_stp_timer.c
@@ -158,6 +158,7 @@ void br_stp_port_timer_init(struct net_bridge_port *p)
{
setup_timer(&p->message_age_timer, br_message_age_timer_expired,
(unsigned long) p);
+ set_timer_slack(&p->message_age_timer, 0);
setup_timer(&p->forward_delay_timer, br_forward_delay_timer_expired,
(unsigned long) p);
thanks,
Lennert
* [PATCH] net: phy: smsc: Re-enable EDPD mode for LAN87xx
From: Patrick Trantham @ 2012-11-15 19:00 UTC (permalink / raw)
To: netdev
Cc: steve.glendinning, davem, otavio, marex, chohnstaedt, jkosina,
Patrick Trantham
This patch re-enables Energy Detect Power Down (EDPD) mode for the
LAN8710/LAN8720. EDPD mode was disabled in a previous commit,
(b629820d18fa65cc598390e4b9712fd5f83ee693), because it was causing the
PHY to not be able to detect a link when cold started without a cable
connected.
The LAN8710/LAN8720 requires a minimum of 2 link pulses within 64ms of
each other in order to set the ENERGYON bit and exit EDPD mode. If a
link partner does not send the pulses within this interval, the PHY will
remain powered down.
This workaround will manually toggle the PHY on/off upon calls to
read_status in order to generate link test pulses if the link is down.
If a link partner is present, it will respond to the pulses, which will
cause the ENERGYON bit to be set and will cause the EDPD mode to be
exited.
Signed-off-by: Patrick Trantham <patrick.trantham@fuel7.com>
---
drivers/net/phy/smsc.c | 73 +++++++++++++++++++++++++++++-------------------
1 file changed, 45 insertions(+), 28 deletions(-)
diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c
index 88e3991..16dceed 100644
--- a/drivers/net/phy/smsc.c
+++ b/drivers/net/phy/smsc.c
@@ -56,37 +56,54 @@ static int smsc_phy_config_init(struct phy_device *phydev)
return smsc_phy_ack_interrupt (phydev);
}
-static int lan87xx_config_init(struct phy_device *phydev)
-{
- /*
- * Make sure the EDPWRDOWN bit is NOT set. Setting this bit on
- * LAN8710/LAN8720 PHY causes the PHY to misbehave, likely due
- * to a bug on the chip.
- *
- * When the system is powered on with the network cable being
- * disconnected all the way until after ifconfig ethX up is
- * issued for the LAN port with this PHY, connecting the cable
- * afterwards does not cause LINK change detection, while the
- * expected behavior is the Link UP being detected.
- */
- int rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
- if (rc < 0)
- return rc;
-
- rc &= ~MII_LAN83C185_EDPWRDOWN;
-
- rc = phy_write(phydev, MII_LAN83C185_CTRL_STATUS, rc);
- if (rc < 0)
- return rc;
-
- return smsc_phy_ack_interrupt(phydev);
-}
-
static int lan911x_config_init(struct phy_device *phydev)
{
return smsc_phy_ack_interrupt(phydev);
}
+/*
+ * The LAN8710/LAN8720 requires a minimum of 2 link pulses within 64ms of each
+ * other in order to set the ENERGYON bit and exit EDPD mode. If a link partner
+ * does not send the pulses within this interval, the PHY will remain powered
+ * down.
+ *
+ * This workaround will manually toggle the PHY on/off upon calls to read_status
+ * in order to generate link test pulses if the link is down. If a link partner
+ * is present, it will respond to the pulses, which will cause the ENERGYON bit
+ * to be set and will cause the EDPD mode to be exited.
+ */
+static int lan87xx_read_status(struct phy_device *phydev)
+{
+ int err = genphy_read_status(phydev);
+
+ if (!phydev->link) {
+ /* Disable EDPD to wake up PHY */
+ int rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
+ if (rc < 0)
+ return rc;
+
+ rc = phy_write(phydev, MII_LAN83C185_CTRL_STATUS,
+ rc & ~MII_LAN83C185_EDPWRDOWN);
+ if (rc < 0)
+ return rc;
+
+ /* Sleep 64 ms to allow ~5 link test pulses to be sent */
+ msleep(64);
+
+ /* Re-enable EDPD */
+ rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
+ if (rc < 0)
+ return rc;
+
+ rc = phy_write(phydev, MII_LAN83C185_CTRL_STATUS,
+ rc | MII_LAN83C185_EDPWRDOWN);
+ if (rc < 0)
+ return rc;
+ }
+
+ return err;
+}
+
static struct phy_driver smsc_phy_driver[] = {
{
.phy_id = 0x0007c0a0, /* OUI=0x00800f, Model#=0x0a */
@@ -187,8 +204,8 @@ static struct phy_driver smsc_phy_driver[] = {
/* basic functions */
.config_aneg = genphy_config_aneg,
- .read_status = genphy_read_status,
- .config_init = lan87xx_config_init,
+ .read_status = lan87xx_read_status,
+ .config_init = smsc_phy_config_init,
/* IRQ related */
.ack_interrupt = smsc_phy_ack_interrupt,
--
1.7.9.5
* Re: [Xen-devel] [PATCH 0/4] Implement persistent grant in xen-netfront/netback
From: Ian Campbell @ 2012-11-15 19:11 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk
Cc: Roger Pau Monne, ANNIE LI, Pasi Kärkkäinen,
netdev@vger.kernel.org, xen-devel@lists.xensource.com
In-Reply-To: <20121115182928.GB22320@phenom.dumpdata.com>
On Thu, 2012-11-15 at 18:29 +0000, Konrad Rzeszutek Wilk wrote:
> On Thu, Nov 15, 2012 at 11:15:06AM +0000, Ian Campbell wrote:
> > On Thu, 2012-11-15 at 10:56 +0000, Roger Pau Monne wrote:
> > > On 15/11/12 09:38, ANNIE LI wrote:
> > > >
> > > >
> > > > On 2012-11-15 15:40, Pasi Kärkkäinen wrote:
> > > >> Hello,
> > > >>
> > > >> On Thu, Nov 15, 2012 at 03:03:07PM +0800, Annie Li wrote:
> > > >>> This patch implements persistent grants for xen-netfront/netback. This
> > > >>> mechanism maintains page pools in netback/netfront; these page pools are used to
> > > >>> save grant pages which are mapped. This improves performance by avoiding the cost
> > > >>> of repeated grant operations.
> > > >>>
> > > >>> Current netback/netfront does map/unmap grant operations frequently when
> > > >>> transmitting/receiving packets, and grant operations cost many CPU cycles. In
> > > >>> this patch, netfront/netback maps grant pages when needed and then saves them
> > > >>> into a page pool for future use. All these pages will be unmapped when
> > > >>> removing/releasing the net device.
> > > >>>
> > > >> Do you have performance numbers available already? with/without persistent grants?
> > > > I have some simple netperf/netserver test result with/without persistent
> > > > grants,
> > > >
> > > > Following is result of with persistent grant patch,
> > > >
> > > > Guests, Sum, Avg, Min, Max
> > > > 1, 15106.4, 15106.4, 15106.36, 15106.36
> > > > 2, 13052.7, 6526.34, 6261.81, 6790.86
> > > > 3, 12675.1, 6337.53, 6220.24, 6454.83
> > > > 4, 13194, 6596.98, 6274.70, 6919.25
> > > >
> > > >
> > > > Following are result of without persistent patch
> > > >
> > > > Guests, Sum, Avg, Min, Max
> > > > 1, 10864.1, 10864.1, 10864.10, 10864.10
> > > > 2, 10898.5, 5449.24, 4862.08, 6036.40
> > > > 3, 10734.5, 5367.26, 5261.43, 5473.08
> > > > 4, 10924, 5461.99, 5314.84, 5609.14
> > >
> > > In the block case, performance improvement is seen when using a large
> > > number of guests, could you perform the same benchmark increasing the
> > > number of guests to 15?
> >
> > It would also be nice to see some analysis of the numbers which justify
> > why this change is a good one without every reviewer having to evaluate
> > the raw data themselves. In fact this should really be part of the
> > commit message.
>
> You mean like a nice graph, eh?
Together with an analysis of what it means and why it is a good thing,
yes.
Ian.
>
> I will run these patches on my 32GB box and see if I can give you
> a nice PDF/jpg.
>
> >
> > Ian.
> >
* Re: [PATCH net-next v2 3/3] ip6tnl: fix sparse warnings in ip6_tnl_netlink_parms()
From: David Miller @ 2012-11-15 18:57 UTC (permalink / raw)
To: nicolas.dichtel; +Cc: eric.dumazet, netdev, fengguang.wu
In-Reply-To: <1352988402-16950-3-git-send-email-nicolas.dichtel@6wind.com>
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Thu, 15 Nov 2012 15:06:42 +0100
> This change fixes a sparse warning triggered by casting the flowinfo from
> netlink messages to a u32 instead of a be32. This change corrects that in order
> to resolve the sparse warning.
>
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Applied.
* Re: [PATCH net-next v2 2/3] sit: fix sparse warnings
From: David Miller @ 2012-11-15 18:56 UTC (permalink / raw)
To: nicolas.dichtel; +Cc: eric.dumazet, netdev, fengguang.wu
In-Reply-To: <1352988402-16950-2-git-send-email-nicolas.dichtel@6wind.com>
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Thu, 15 Nov 2012 15:06:41 +0100
> This change fixes several sparse warnings about endianness problems. The wrong
> nla_*() functions were used.
> It also fixes a sparse warning about a flag test (field i_flags). This field is
> used in this file as a local flag only, so it is more a u16 (gre uses it as a
> be16). This sparse warning was already there before the patch that added netlink
> management; the code has just been moved.
>
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Applied.
* Re: [PATCH net-next v2 1/3] ipip: fix sparse warnings in ipip_netlink_parms()
From: David Miller @ 2012-11-15 18:56 UTC (permalink / raw)
To: nicolas.dichtel; +Cc: eric.dumazet, netdev, fengguang.wu
In-Reply-To: <1352988402-16950-1-git-send-email-nicolas.dichtel@6wind.com>
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Thu, 15 Nov 2012 15:06:40 +0100
> This change fixes two sparse warnings triggered by casting the IP addresses
> from netlink messages to a u32 instead of a be32. This change corrects that
> in order to resolve the sparse warnings.
>
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Applied.
* Re: [PATCHv2] sctp: fix /proc/net/sctp/ memory leak
From: David Miller @ 2012-11-15 18:56 UTC (permalink / raw)
To: tt.rantala; +Cc: netdev, nhorman, vyasevich, sri, linux-sctp, davej, ebiederm
In-Reply-To: <1352987345-11263-1-git-send-email-tt.rantala@gmail.com>
From: Tommi Rantala <tt.rantala@gmail.com>
Date: Thu, 15 Nov 2012 15:49:05 +0200
> Commit 13d782f ("sctp: Make the proc files per network namespace.")
> changed the /proc/net/sctp/ struct file_operations opener functions to
> use single_open_net() and seq_open_net().
>
> Avoid leaking memory by using single_release_net() and seq_release_net()
> as the release functions.
>
> Discovered with Trinity (the syscall fuzzer).
>
> Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
> Acked-by: Neil Horman <nhorman@tuxdriver.com>
Applied.
* [PATCH V2 14/14] net: Remove code duplication between offload structures
From: Vlad Yasevich @ 2012-11-15 18:49 UTC (permalink / raw)
To: netdev; +Cc: davem, eric.dumazet
In-Reply-To: <1353005363-6974-1-git-send-email-vyasevic@redhat.com>
Move the offload callbacks into their own structure, shared by
struct packet_offload and struct net_offload.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
---
include/linux/netdevice.h | 10 +++++++---
include/net/protocol.h | 10 +++-------
net/core/dev.c | 14 +++++++-------
net/ipv4/af_inet.c | 44 +++++++++++++++++++++++++-------------------
net/ipv6/ip6_offload.c | 28 +++++++++++++++-------------
net/ipv6/tcpv6_offload.c | 10 ++++++----
net/ipv6/udp_offload.c | 6 ++++--
7 files changed, 67 insertions(+), 55 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 12c217d..a91828a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1505,15 +1505,19 @@ struct packet_type {
struct list_head list;
};
-struct packet_offload {
- __be16 type; /* This is really htons(ether_type). */
+struct offload_callbacks {
struct sk_buff *(*gso_segment)(struct sk_buff *skb,
netdev_features_t features);
int (*gso_send_check)(struct sk_buff *skb);
struct sk_buff **(*gro_receive)(struct sk_buff **head,
struct sk_buff *skb);
int (*gro_complete)(struct sk_buff *skb);
- struct list_head list;
+};
+
+struct packet_offload {
+ __be16 type; /* This is really htons(ether_type). */
+ struct offload_callbacks callbacks;
+ struct list_head list;
};
#include <linux/notifier.h>
diff --git a/include/net/protocol.h b/include/net/protocol.h
index 2c90794..047c047 100644
--- a/include/net/protocol.h
+++ b/include/net/protocol.h
@@ -29,6 +29,7 @@
#if IS_ENABLED(CONFIG_IPV6)
#include <linux/ipv6.h>
#endif
+#include <linux/netdevice.h>
/* This is one larger than the largest protocol value that can be
* found in an ipv4 or ipv6 header. Since in both cases the protocol
@@ -63,13 +64,8 @@ struct inet6_protocol {
#endif
struct net_offload {
- int (*gso_send_check)(struct sk_buff *skb);
- struct sk_buff *(*gso_segment)(struct sk_buff *skb,
- netdev_features_t features);
- struct sk_buff **(*gro_receive)(struct sk_buff **head,
- struct sk_buff *skb);
- int (*gro_complete)(struct sk_buff *skb);
- unsigned int flags; /* Flags used by IPv6 for now */
+ struct offload_callbacks callbacks;
+ unsigned int flags; /* Flags used by IPv6 for now */
};
/* This should be set for any extension header which is compatible with GSO. */
#define INET6_PROTO_GSO_EXTHDR 0x1
diff --git a/net/core/dev.c b/net/core/dev.c
index 13f9b85..3ee2cf1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2102,16 +2102,16 @@ struct sk_buff *skb_gso_segment(struct sk_buff *skb,
rcu_read_lock();
list_for_each_entry_rcu(ptype, &offload_base, list) {
- if (ptype->type == type && ptype->gso_segment) {
+ if (ptype->type == type && ptype->callbacks.gso_segment) {
if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) {
- err = ptype->gso_send_check(skb);
+ err = ptype->callbacks.gso_send_check(skb);
segs = ERR_PTR(err);
if (err || skb_gso_ok(skb, features))
break;
__skb_push(skb, (skb->data -
skb_network_header(skb)));
}
- segs = ptype->gso_segment(skb, features);
+ segs = ptype->callbacks.gso_segment(skb, features);
break;
}
}
@@ -3533,10 +3533,10 @@ static int napi_gro_complete(struct sk_buff *skb)
rcu_read_lock();
list_for_each_entry_rcu(ptype, head, list) {
- if (ptype->type != type || !ptype->gro_complete)
+ if (ptype->type != type || !ptype->callbacks.gro_complete)
continue;
- err = ptype->gro_complete(skb);
+ err = ptype->callbacks.gro_complete(skb);
break;
}
rcu_read_unlock();
@@ -3598,7 +3598,7 @@ enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
rcu_read_lock();
list_for_each_entry_rcu(ptype, head, list) {
- if (ptype->type != type || !ptype->gro_receive)
+ if (ptype->type != type || !ptype->callbacks.gro_receive)
continue;
skb_set_network_header(skb, skb_gro_offset(skb));
@@ -3608,7 +3608,7 @@ enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
NAPI_GRO_CB(skb)->flush = 0;
NAPI_GRO_CB(skb)->free = 0;
- pp = ptype->gro_receive(&napi->gro_list, skb);
+ pp = ptype->callbacks.gro_receive(&napi->gro_list, skb);
break;
}
rcu_read_unlock();
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 9f2e7fd..3067e04 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1276,8 +1276,8 @@ static int inet_gso_send_check(struct sk_buff *skb)
rcu_read_lock();
ops = rcu_dereference(inet_offloads[proto]);
- if (likely(ops && ops->gso_send_check))
- err = ops->gso_send_check(skb);
+ if (likely(ops && ops->callbacks.gso_send_check))
+ err = ops->callbacks.gso_send_check(skb);
rcu_read_unlock();
out:
@@ -1326,8 +1326,8 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb,
rcu_read_lock();
ops = rcu_dereference(inet_offloads[proto]);
- if (likely(ops && ops->gso_segment))
- segs = ops->gso_segment(skb, features);
+ if (likely(ops && ops->callbacks.gso_segment))
+ segs = ops->callbacks.gso_segment(skb, features);
rcu_read_unlock();
if (!segs || IS_ERR(segs))
@@ -1379,7 +1379,7 @@ static struct sk_buff **inet_gro_receive(struct sk_buff **head,
rcu_read_lock();
ops = rcu_dereference(inet_offloads[proto]);
- if (!ops || !ops->gro_receive)
+ if (!ops || !ops->callbacks.gro_receive)
goto out_unlock;
if (*(u8 *)iph != 0x45)
@@ -1420,7 +1420,7 @@ static struct sk_buff **inet_gro_receive(struct sk_buff **head,
skb_gro_pull(skb, sizeof(*iph));
skb_set_transport_header(skb, skb_gro_offset(skb));
- pp = ops->gro_receive(head, skb);
+ pp = ops->callbacks.gro_receive(head, skb);
out_unlock:
rcu_read_unlock();
@@ -1444,10 +1444,10 @@ static int inet_gro_complete(struct sk_buff *skb)
rcu_read_lock();
ops = rcu_dereference(inet_offloads[proto]);
- if (WARN_ON(!ops || !ops->gro_complete))
+ if (WARN_ON(!ops || !ops->callbacks.gro_complete))
goto out_unlock;
- err = ops->gro_complete(skb);
+ err = ops->callbacks.gro_complete(skb);
out_unlock:
rcu_read_unlock();
@@ -1563,11 +1563,13 @@ static const struct net_protocol tcp_protocol = {
};
static const struct net_offload tcp_offload = {
- .gso_send_check = tcp_v4_gso_send_check,
- .gso_segment = tcp_tso_segment,
- .gro_receive = tcp4_gro_receive,
- .gro_complete = tcp4_gro_complete,
-};
+ .callbacks = {
+ .gso_send_check = tcp_v4_gso_send_check,
+ .gso_segment = tcp_tso_segment,
+ .gro_receive = tcp4_gro_receive,
+ .gro_complete = tcp4_gro_complete,
+ },
+};
static const struct net_protocol udp_protocol = {
.handler = udp_rcv,
@@ -1577,8 +1579,10 @@ static const struct net_protocol udp_protocol = {
};
static const struct net_offload udp_offload = {
- .gso_send_check = udp4_ufo_send_check,
- .gso_segment = udp4_ufo_fragment,
+ .callbacks = {
+ .gso_send_check = udp4_ufo_send_check,
+ .gso_segment = udp4_ufo_fragment,
+ },
};
static const struct net_protocol icmp_protocol = {
@@ -1667,10 +1671,12 @@ static int ipv4_proc_init(void);
static struct packet_offload ip_packet_offload __read_mostly = {
.type = cpu_to_be16(ETH_P_IP),
- .gso_send_check = inet_gso_send_check,
- .gso_segment = inet_gso_segment,
- .gro_receive = inet_gro_receive,
- .gro_complete = inet_gro_complete,
+ .callbacks = {
+ .gso_send_check = inet_gso_send_check,
+ .gso_segment = inet_gso_segment,
+ .gro_receive = inet_gro_receive,
+ .gro_complete = inet_gro_complete,
+ },
};
static int __init ipv4_offload_init(void)
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 63d79d9..f26f0da 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -70,9 +70,9 @@ static int ipv6_gso_send_check(struct sk_buff *skb)
ops = rcu_dereference(inet6_offloads[
ipv6_gso_pull_exthdrs(skb, ipv6h->nexthdr)]);
- if (likely(ops && ops->gso_send_check)) {
+ if (likely(ops && ops->callbacks.gso_send_check)) {
skb_reset_transport_header(skb);
- err = ops->gso_send_check(skb);
+ err = ops->callbacks.gso_send_check(skb);
}
rcu_read_unlock();
@@ -113,9 +113,9 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
proto = ipv6_gso_pull_exthdrs(skb, ipv6h->nexthdr);
rcu_read_lock();
ops = rcu_dereference(inet6_offloads[proto]);
- if (likely(ops && ops->gso_segment)) {
+ if (likely(ops && ops->callbacks.gso_segment)) {
skb_reset_transport_header(skb);
- segs = ops->gso_segment(skb, features);
+ segs = ops->callbacks.gso_segment(skb, features);
}
rcu_read_unlock();
@@ -173,7 +173,7 @@ static struct sk_buff **ipv6_gro_receive(struct sk_buff **head,
rcu_read_lock();
proto = iph->nexthdr;
ops = rcu_dereference(inet6_offloads[proto]);
- if (!ops || !ops->gro_receive) {
+ if (!ops || !ops->callbacks.gro_receive) {
__pskb_pull(skb, skb_gro_offset(skb));
proto = ipv6_gso_pull_exthdrs(skb, proto);
skb_gro_pull(skb, -skb_transport_offset(skb));
@@ -181,7 +181,7 @@ static struct sk_buff **ipv6_gro_receive(struct sk_buff **head,
__skb_push(skb, skb_gro_offset(skb));
ops = rcu_dereference(inet6_offloads[proto]);
- if (!ops || !ops->gro_receive)
+ if (!ops || !ops->callbacks.gro_receive)
goto out_unlock;
iph = ipv6_hdr(skb);
@@ -220,7 +220,7 @@ static struct sk_buff **ipv6_gro_receive(struct sk_buff **head,
csum = skb->csum;
skb_postpull_rcsum(skb, iph, skb_network_header_len(skb));
- pp = ops->gro_receive(head, skb);
+ pp = ops->callbacks.gro_receive(head, skb);
skb->csum = csum;
@@ -244,10 +244,10 @@ static int ipv6_gro_complete(struct sk_buff *skb)
rcu_read_lock();
ops = rcu_dereference(inet6_offloads[NAPI_GRO_CB(skb)->proto]);
- if (WARN_ON(!ops || !ops->gro_complete))
+ if (WARN_ON(!ops || !ops->callbacks.gro_complete))
goto out_unlock;
- err = ops->gro_complete(skb);
+ err = ops->callbacks.gro_complete(skb);
out_unlock:
rcu_read_unlock();
@@ -257,10 +257,12 @@ out_unlock:
static struct packet_offload ipv6_packet_offload __read_mostly = {
.type = cpu_to_be16(ETH_P_IPV6),
- .gso_send_check = ipv6_gso_send_check,
- .gso_segment = ipv6_gso_segment,
- .gro_receive = ipv6_gro_receive,
- .gro_complete = ipv6_gro_complete,
+ .callbacks = {
+ .gso_send_check = ipv6_gso_send_check,
+ .gso_segment = ipv6_gso_segment,
+ .gro_receive = ipv6_gro_receive,
+ .gro_complete = ipv6_gro_complete,
+ },
};
static int __init ipv6_offload_init(void)
diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
index 3a27fe6..2ec6bf6 100644
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -81,10 +81,12 @@ static int tcp6_gro_complete(struct sk_buff *skb)
}
static const struct net_offload tcpv6_offload = {
- .gso_send_check = tcp_v6_gso_send_check,
- .gso_segment = tcp_tso_segment,
- .gro_receive = tcp6_gro_receive,
- .gro_complete = tcp6_gro_complete,
+ .callbacks = {
+ .gso_send_check = tcp_v6_gso_send_check,
+ .gso_segment = tcp_tso_segment,
+ .gro_receive = tcp6_gro_receive,
+ .gro_complete = tcp6_gro_complete,
+ },
};
int __init tcpv6_offload_init(void)
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 979e4ab..8e01c44 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -107,8 +107,10 @@ out:
return segs;
}
static const struct net_offload udpv6_offload = {
- .gso_send_check = udp6_ufo_send_check,
- .gso_segment = udp6_ufo_fragment,
+ .callbacks = {
+ .gso_send_check = udp6_ufo_send_check,
+ .gso_segment = udp6_ufo_fragment,
+ },
};
int __init udp_offload_init(void)
--
1.7.7.6
^ permalink raw reply related
* [PATCH V2 13/14] ipv6: Pull IPv6 GSO registration out of the module
From: Vlad Yasevich @ 2012-11-15 18:49 UTC (permalink / raw)
To: netdev; +Cc: davem, eric.dumazet
In-Reply-To: <1353005363-6974-1-git-send-email-vyasevic@redhat.com>
Since GSO support is now separate, pull it out of the module
and make it its own init call.
Remove the cleanup functions as they are no longer called.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
---
include/net/protocol.h | 11 ++++++-----
net/ipv6/Makefile | 6 +++---
net/ipv6/af_inet6.c | 3 ---
net/ipv6/exthdrs.c | 10 +---------
net/ipv6/exthdrs_offload.c | 6 ------
net/ipv6/ip6_offload.c | 17 ++++++++++++-----
net/ipv6/ip6_offload.h | 8 --------
net/ipv6/protocol.c | 20 ++++++++++++--------
net/ipv6/tcp_ipv6.c | 10 +---------
net/ipv6/tcpv6_offload.c | 5 -----
net/ipv6/udp.c | 10 +---------
net/ipv6/udp_offload.c | 5 -----
12 files changed, 36 insertions(+), 75 deletions(-)
diff --git a/include/net/protocol.h b/include/net/protocol.h
index 7019c16..2c90794 100644
--- a/include/net/protocol.h
+++ b/include/net/protocol.h
@@ -25,6 +25,7 @@
#define _PROTOCOL_H
#include <linux/in6.h>
+#include <linux/skbuff.h>
#if IS_ENABLED(CONFIG_IPV6)
#include <linux/ipv6.h>
#endif
@@ -59,8 +60,6 @@ struct inet6_protocol {
#define INET6_PROTO_NOPOLICY 0x1
#define INET6_PROTO_FINAL 0x2
-/* This should be set for any extension header which is compatible with GSO. */
-#define INET6_PROTO_GSO_EXTHDR 0x4
#endif
struct net_offload {
@@ -72,6 +71,8 @@ struct net_offload {
int (*gro_complete)(struct sk_buff *skb);
unsigned int flags; /* Flags used by IPv6 for now */
};
+/* This should be set for any extension header which is compatible with GSO. */
+#define INET6_PROTO_GSO_EXTHDR 0x1
/* This is used to register socket interfaces for IP protocols. */
struct inet_protosw {
@@ -93,10 +94,10 @@ struct inet_protosw {
extern const struct net_protocol __rcu *inet_protos[MAX_INET_PROTOS];
extern const struct net_offload __rcu *inet_offloads[MAX_INET_PROTOS];
+extern const struct net_offload __rcu *inet6_offloads[MAX_INET_PROTOS];
#if IS_ENABLED(CONFIG_IPV6)
extern const struct inet6_protocol __rcu *inet6_protos[MAX_INET_PROTOS];
-extern const struct net_offload __rcu *inet6_offloads[MAX_INET_PROTOS];
#endif
extern int inet_add_protocol(const struct net_protocol *prot, unsigned char num);
@@ -109,10 +110,10 @@ extern void inet_unregister_protosw(struct inet_protosw *p);
#if IS_ENABLED(CONFIG_IPV6)
extern int inet6_add_protocol(const struct inet6_protocol *prot, unsigned char num);
extern int inet6_del_protocol(const struct inet6_protocol *prot, unsigned char num);
-extern int inet6_add_offload(const struct net_offload *prot, unsigned char num);
-extern int inet6_del_offload(const struct net_offload *prot, unsigned char num);
extern int inet6_register_protosw(struct inet_protosw *p);
extern void inet6_unregister_protosw(struct inet_protosw *p);
#endif
+extern int inet6_add_offload(const struct net_offload *prot, unsigned char num);
+extern int inet6_del_offload(const struct net_offload *prot, unsigned char num);
#endif /* _PROTOCOL_H */
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index cdca302..04a475d 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -7,7 +7,7 @@ obj-$(CONFIG_IPV6) += ipv6.o
ipv6-objs := af_inet6.o anycast.o ip6_output.o ip6_input.o addrconf.o \
addrlabel.o \
route.o ip6_fib.o ipv6_sockglue.o ndisc.o udp.o udplite.o \
- raw.o protocol.o icmp.o mcast.o reassembly.o tcp_ipv6.o \
+ raw.o icmp.o mcast.o reassembly.o tcp_ipv6.o \
exthdrs.o datagram.o ip6_flowlabel.o inet6_connection_sock.o
ipv6-offload := ip6_offload.o tcpv6_offload.o udp_offload.o exthdrs_offload.o
@@ -23,7 +23,6 @@ ipv6-$(CONFIG_PROC_FS) += proc.o
ipv6-$(CONFIG_SYN_COOKIES) += syncookies.o
ipv6-objs += $(ipv6-y)
-ipv6-objs += $(ipv6-offload)
obj-$(CONFIG_INET6_AH) += ah6.o
obj-$(CONFIG_INET6_ESP) += esp6.o
@@ -41,6 +40,7 @@ obj-$(CONFIG_IPV6_SIT) += sit.o
obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
-obj-y += addrconf_core.o exthdrs_core.o output_core.o
+obj-y += addrconf_core.o exthdrs_core.o output_core.o protocol.o
+obj-y += $(ipv6-offload)
obj-$(subst m,y,$(CONFIG_IPV6)) += inet6_hashtables.o
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index c84d5ba..7bafc51 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -62,7 +62,6 @@
#include <asm/uaccess.h>
#include <linux/mroute6.h>
-#include "ip6_offload.h"
MODULE_AUTHOR("Cast of dozens");
MODULE_DESCRIPTION("IPv6 protocol stack for Linux");
@@ -707,14 +706,12 @@ static struct packet_type ipv6_packet_type __read_mostly = {
static int __init ipv6_packet_init(void)
{
- ipv6_offload_init();
dev_add_pack(&ipv6_packet_type);
return 0;
}
static void ipv6_packet_cleanup(void)
{
- ipv6_offload_cleanup();
dev_remove_pack(&ipv6_packet_type);
}
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index e9b5b33..bb02d2a 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -48,7 +48,6 @@
#endif
#include <asm/uaccess.h>
-#include "ip6_offload.h"
/*
* Parsing tlv encoded headers.
@@ -502,13 +501,9 @@ int __init ipv6_exthdrs_init(void)
{
int ret;
- ret = ipv6_exthdrs_offload_init();
- if (ret)
- goto out;
-
ret = inet6_add_protocol(&rthdr_protocol, IPPROTO_ROUTING);
if (ret)
- goto out_offload;
+ goto out;
ret = inet6_add_protocol(&destopt_protocol, IPPROTO_DSTOPTS);
if (ret)
@@ -524,14 +519,11 @@ out_destopt:
inet6_del_protocol(&destopt_protocol, IPPROTO_DSTOPTS);
out_rthdr:
inet6_del_protocol(&rthdr_protocol, IPPROTO_ROUTING);
-out_offload:
- ipv6_exthdrs_offload_exit();
goto out;
};
void ipv6_exthdrs_exit(void)
{
- ipv6_exthdrs_offload_exit();
inet6_del_protocol(&nodata_protocol, IPPROTO_NONE);
inet6_del_protocol(&destopt_protocol, IPPROTO_DSTOPTS);
inet6_del_protocol(&rthdr_protocol, IPPROTO_ROUTING);
diff --git a/net/ipv6/exthdrs_offload.c b/net/ipv6/exthdrs_offload.c
index 271bf4a..cf77f3a 100644
--- a/net/ipv6/exthdrs_offload.c
+++ b/net/ipv6/exthdrs_offload.c
@@ -39,9 +39,3 @@ out_rt:
inet_del_offload(&rthdr_offload, IPPROTO_ROUTING);
goto out;
}
-
-void ipv6_exthdrs_offload_exit(void)
-{
- inet_del_offload(&rthdr_offload, IPPROTO_ROUTING);
- inet_del_offload(&rthdr_offload, IPPROTO_DSTOPTS);
-}
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 01cf983..63d79d9 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -12,6 +12,7 @@
#include <linux/socket.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>
+#include <linux/printk.h>
#include <net/protocol.h>
#include <net/ipv6.h>
@@ -262,12 +263,18 @@ static struct packet_offload ipv6_packet_offload __read_mostly = {
.gro_complete = ipv6_gro_complete,
};
-void __init ipv6_offload_init(void)
+static int __init ipv6_offload_init(void)
{
+
+ if (tcpv6_offload_init() < 0)
+ pr_crit("%s: Cannot add TCP protocol offload\n", __func__);
+ if (udp_offload_init() < 0)
+ pr_crit("%s: Cannot add UDP protocol offload\n", __func__);
+ if (ipv6_exthdrs_offload_init() < 0)
+ pr_crit("%s: Cannot add EXTHDRS protocol offload\n", __func__);
+
dev_add_offload(&ipv6_packet_offload);
+ return 0;
}
-void ipv6_offload_cleanup(void)
-{
- dev_remove_offload(&ipv6_packet_offload);
-}
+fs_initcall(ipv6_offload_init);
diff --git a/net/ipv6/ip6_offload.h b/net/ipv6/ip6_offload.h
index 4e88ddb..2e155c6 100644
--- a/net/ipv6/ip6_offload.h
+++ b/net/ipv6/ip6_offload.h
@@ -12,15 +12,7 @@
#define __ip6_offload_h
int ipv6_exthdrs_offload_init(void);
-void ipv6_exthdrs_offload_exit(void);
-
int udp_offload_init(void);
-void udp_offload_cleanup(void);
-
int tcpv6_offload_init(void);
-void tcpv6_offload_cleanup(void);
-
-extern void ipv6_offload_init(void);
-extern void ipv6_offload_cleanup(void);
#endif
diff --git a/net/ipv6/protocol.c b/net/ipv6/protocol.c
index f7c53a7..22d1bd4 100644
--- a/net/ipv6/protocol.c
+++ b/net/ipv6/protocol.c
@@ -25,8 +25,9 @@
#include <linux/spinlock.h>
#include <net/protocol.h>
+#if IS_ENABLED(CONFIG_IPV6)
const struct inet6_protocol __rcu *inet6_protos[MAX_INET_PROTOS] __read_mostly;
-const struct net_offload __rcu *inet6_offloads[MAX_INET_PROTOS] __read_mostly;
+EXPORT_SYMBOL(inet6_protos);
int inet6_add_protocol(const struct inet6_protocol *prot, unsigned char protocol)
{
@@ -35,13 +36,6 @@ int inet6_add_protocol(const struct inet6_protocol *prot, unsigned char protocol
}
EXPORT_SYMBOL(inet6_add_protocol);
-int inet6_add_offload(const struct net_offload *prot, unsigned char protocol)
-{
- return !cmpxchg((const struct net_offload **)&inet6_offloads[protocol],
- NULL, prot) ? 0 : -1;
-}
-EXPORT_SYMBOL(inet6_add_offload);
-
/*
* Remove a protocol from the hash tables.
*/
@@ -58,6 +52,16 @@ int inet6_del_protocol(const struct inet6_protocol *prot, unsigned char protocol
return ret;
}
EXPORT_SYMBOL(inet6_del_protocol);
+#endif
+
+const struct net_offload __rcu *inet6_offloads[MAX_INET_PROTOS] __read_mostly;
+
+int inet6_add_offload(const struct net_offload *prot, unsigned char protocol)
+{
+ return !cmpxchg((const struct net_offload **)&inet6_offloads[protocol],
+ NULL, prot) ? 0 : -1;
+}
+EXPORT_SYMBOL(inet6_add_offload);
int inet6_del_offload(const struct net_offload *prot, unsigned char protocol)
{
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 58fabc5..c5d2d61 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -71,7 +71,6 @@
#include <linux/crypto.h>
#include <linux/scatterlist.h>
-#include "ip6_offload.h"
static void tcp_v6_send_reset(struct sock *sk, struct sk_buff *skb);
static void tcp_v6_reqsk_send_ack(struct sock *sk, struct sk_buff *skb,
@@ -2004,13 +2003,9 @@ int __init tcpv6_init(void)
{
int ret;
- ret = tcpv6_offload_init();
- if (ret)
- goto out;
-
ret = inet6_add_protocol(&tcpv6_protocol, IPPROTO_TCP);
if (ret)
- goto out_offload;
+ goto out;
/* register inet6 protocol */
ret = inet6_register_protosw(&tcpv6_protosw);
@@ -2027,8 +2022,6 @@ out_tcpv6_protosw:
inet6_unregister_protosw(&tcpv6_protosw);
out_tcpv6_protocol:
inet6_del_protocol(&tcpv6_protocol, IPPROTO_TCP);
-out_offload:
- tcpv6_offload_cleanup();
goto out;
}
@@ -2037,5 +2030,4 @@ void tcpv6_exit(void)
unregister_pernet_subsys(&tcpv6_net_ops);
inet6_unregister_protosw(&tcpv6_protosw);
inet6_del_protocol(&tcpv6_protocol, IPPROTO_TCP);
- tcpv6_offload_cleanup();
}
diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
index edeafed..3a27fe6 100644
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -91,8 +91,3 @@ int __init tcpv6_offload_init(void)
{
return inet6_add_offload(&tcpv6_offload, IPPROTO_TCP);
}
-
-void tcpv6_offload_cleanup(void)
-{
- inet6_del_offload(&tcpv6_offload, IPPROTO_TCP);
-}
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 013fef7..dfaa29b 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -50,7 +50,6 @@
#include <linux/seq_file.h>
#include <trace/events/skb.h>
#include "udp_impl.h"
-#include "ip6_offload.h"
int ipv6_rcv_saddr_equal(const struct sock *sk, const struct sock *sk2)
{
@@ -1472,13 +1471,9 @@ int __init udpv6_init(void)
{
int ret;
- ret = udp_offload_init();
- if (ret)
- goto out;
-
ret = inet6_add_protocol(&udpv6_protocol, IPPROTO_UDP);
if (ret)
- goto out_offload;
+ goto out;
ret = inet6_register_protosw(&udpv6_protosw);
if (ret)
@@ -1488,8 +1483,6 @@ out:
out_udpv6_protocol:
inet6_del_protocol(&udpv6_protocol, IPPROTO_UDP);
-out_offload:
- udp_offload_cleanup();
goto out;
}
@@ -1497,5 +1490,4 @@ void udpv6_exit(void)
{
inet6_unregister_protosw(&udpv6_protosw);
inet6_del_protocol(&udpv6_protocol, IPPROTO_UDP);
- udp_offload_cleanup();
}
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index f964d2b..979e4ab 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -115,8 +115,3 @@ int __init udp_offload_init(void)
{
return inet6_add_offload(&udpv6_offload, IPPROTO_UDP);
}
-
-void udp_offload_cleanup(void)
-{
- inet6_del_offload(&udpv6_offload, IPPROTO_UDP);
-}
--
1.7.7.6
^ permalink raw reply related