Netdev List

* [net-next v2 7/9] bnx2x: Fix self test of BCM57800
From: Yaniv Rosner @ 2011-11-28 10:49 UTC
  To: David Miller; +Cc: netdev, Yaniv Rosner, Eilon Greenstein
In-Reply-To: <1322477393-22904-1-git-send-email-yanivr@broadcom.com>

Fix the MAC test of the 1G port of the BCM57800 to use the UMAC instead of the XMAC.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 .../net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c    |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
index ec31871..f1bf1f5 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
@@ -1749,8 +1749,18 @@ static int bnx2x_run_loopback(struct bnx2x *bp, int loopback_mode)
 			return -EINVAL;
 		break;
 	case BNX2X_MAC_LOOPBACK:
-		bp->link_params.loopback_mode = CHIP_IS_E3(bp) ?
-						LOOPBACK_XMAC : LOOPBACK_BMAC;
+		if (CHIP_IS_E3(bp)) {
+			int cfg_idx = bnx2x_get_link_cfg_idx(bp);
+			if (bp->port.supported[cfg_idx] &
+			    (SUPPORTED_10000baseT_Full |
+			     SUPPORTED_20000baseMLD2_Full |
+			     SUPPORTED_20000baseKR2_Full))
+				bp->link_params.loopback_mode = LOOPBACK_XMAC;
+			else
+				bp->link_params.loopback_mode = LOOPBACK_UMAC;
+		} else
+			bp->link_params.loopback_mode = LOOPBACK_BMAC;
+
 		bnx2x_phy_init(&bp->link_params, &bp->link_vars);
 		break;
 	default:
-- 
1.7.7.1
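
The selection logic in this patch can be sketched as a standalone function. This is only an illustration under stated assumptions: the `sk_` names, the `struct sk_port` stand-in, and the bit values are hypothetical, and the driver itself reads the real SUPPORTED_* flags from bp->port.supported[cfg_idx].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for this sketch only; the driver uses the real
 * SUPPORTED_* flags from <linux/ethtool.h>. */
#define SK_10000baseT_Full	(1u << 0)
#define SK_20000baseMLD2_Full	(1u << 1)
#define SK_20000baseKR2_Full	(1u << 2)
#define SK_1000baseT_Full	(1u << 3)

enum sk_loopback { SK_LOOPBACK_BMAC, SK_LOOPBACK_XMAC, SK_LOOPBACK_UMAC };

/* Hypothetical stand-in for the fields the patch reads from struct bnx2x. */
struct sk_port {
	bool chip_is_e3;
	uint32_t supported;
};

enum sk_loopback sk_pick_mac_loopback(const struct sk_port *port)
{
	const uint32_t xmac_speeds = SK_10000baseT_Full |
				     SK_20000baseMLD2_Full |
				     SK_20000baseKR2_Full;

	assert(port != NULL);
	if (!port->chip_is_e3)
		return SK_LOOPBACK_BMAC;	/* pre-E3 chips keep using the BMAC */
	/* E3: 10G/20G-capable ports use the XMAC, everything else the UMAC */
	return (port->supported & xmac_speeds) ? SK_LOOPBACK_XMAC
						: SK_LOOPBACK_UMAC;
}
```

With these placeholders, an E3 port that advertises only 1G ends up on the UMAC path, which is the case the commit message describes for the BCM57800.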

* [net-next v2 6/9] bnx2x: Add known PHY type check
From: Yaniv Rosner @ 2011-11-28 10:49 UTC
  To: David Miller; +Cc: netdev, Yaniv Rosner, Eilon Greenstein
In-Reply-To: <1322477393-22904-1-git-send-email-yanivr@broadcom.com>

Make the populate function fail when an unknown external PHY type is detected.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
index 4ffc75d..cc7dbbe 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
@@ -11489,6 +11489,10 @@ static int bnx2x_populate_ext_phy(struct bnx2x *bp,
 		return -EINVAL;
 	default:
 		*phy = phy_null;
+		/* In case external PHY wasn't found */
+		if ((phy_type != PORT_HW_CFG_XGXS_EXT_PHY_TYPE_DIRECT) &&
+		    (phy_type != PORT_HW_CFG_XGXS_EXT_PHY_TYPE_NOT_CONN))
+			return -EINVAL;
 		return 0;
 	}
 
-- 
1.7.7.1
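
A minimal sketch of the check added to the default branch, outside the driver. The enum values and the `sk_populate_ext_phy` name are hypothetical; the real PORT_HW_CFG_XGXS_EXT_PHY_TYPE_* codes are not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder type codes, assumed for illustration only. */
enum sk_phy_type {
	SK_PHY_TYPE_DIRECT = 0,
	SK_PHY_TYPE_NOT_CONN = 1,
	SK_PHY_TYPE_KNOWN = 2,	/* any PHY the driver has a table entry for */
};

int sk_populate_ext_phy(unsigned int phy_type, bool *is_null_phy)
{
	assert(is_null_phy != NULL);

	switch (phy_type) {
	case SK_PHY_TYPE_KNOWN:
		*is_null_phy = false;
		return 0;
	default:
		*is_null_phy = true;
		/* Only "direct" and "not connected" may legitimately
		 * have no external PHY; anything else is unknown. */
		if (phy_type != SK_PHY_TYPE_DIRECT &&
		    phy_type != SK_PHY_TYPE_NOT_CONN)
			return -EINVAL;
		return 0;
	}
}
```

The null PHY is still assigned in every default case, as in the patch; only the return value distinguishes "no external PHY expected" from "unrecognized PHY type".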

* [net-next v2 5/9] bnx2x: Change Warpcore MDIO work around mode
From: Yaniv Rosner @ 2011-11-28 10:49 UTC
  To: David Miller; +Cc: netdev, Yaniv Rosner, Eilon Greenstein
In-Reply-To: <1322477393-22904-1-git-send-email-yanivr@broadcom.com>

This patch enables the use of a simpler MDC/MDIO work-around when accessing Warpcore registers.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
index 6523723..4ffc75d 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
@@ -11308,7 +11308,9 @@ static int bnx2x_populate_int_phy(struct bnx2x *bp, u32 shmem_base, u8 port,
 				       offsetof(struct shmem_region,
 			dev_info.port_feature_config[port].link_config)) &
 			  PORT_FEATURE_CONNECTED_SWITCH_MASK);
-	chip_id = REG_RD(bp, MISC_REG_CHIP_NUM) << 16;
+	chip_id = (REG_RD(bp, MISC_REG_CHIP_NUM) << 16) |
+		((REG_RD(bp, MISC_REG_CHIP_REV) & 0xf) << 12);
+
 	DP(NETIF_MSG_LINK, ":chip_id = 0x%x\n", chip_id);
 	if (USES_WARPCORE(bp)) {
 		u32 serdes_net_if;
-- 
1.7.7.1
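
The new chip_id layout can be checked with a small worked example. The helper name is hypothetical and the input values in the note below are arbitrary; only the bit arithmetic is taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a driver function: packs the chip number into
 * bits 31..16 and the low nibble of the revision into bits 15..12. */
uint32_t sk_compose_chip_id(uint32_t chip_num, uint32_t chip_rev)
{
	assert(chip_num <= 0xffffu);	/* must fit in the upper 16 bits */
	return (chip_num << 16) | ((chip_rev & 0xfu) << 12);
}
```

For an arbitrary chip number 0x168e and revision 0x1 this gives 0x168e1000, whereas the old expression would have produced 0x168e0000 regardless of revision.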

* [net-next v2 0/9] bnx2x: Link changes
From: Yaniv Rosner @ 2011-11-28 10:49 UTC
  To: David Miller; +Cc: netdev, Yaniv Rosner

Hi Dave,
The following patch series describes some link changes.
Please consider applying it to net-next.

Thanks,
Yaniv

* [net-next v2 1/9] bnx2x: PFC changes
From: Yaniv Rosner @ 2011-11-28 10:49 UTC
  To: David Miller; +Cc: netdev, Yaniv Rosner, Eilon Greenstein
In-Reply-To: <1322477393-22904-1-git-send-email-yanivr@broadcom.com>

Change BRB to work in per class guaranteed mode and handle cases for BW 0%.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c |  248 +++++++++++++++-------
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h  |    7 +-
 2 files changed, 179 insertions(+), 76 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
index 882f48f..8e6909a 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
@@ -163,6 +163,11 @@
 #define EDC_MODE_LIMITING				0x0044
 #define EDC_MODE_PASSIVE_DAC			0x0055
 
+/* BRB default for class 0 E2 */
+#define DEFAULT0_E2_BRB_MAC_PAUSE_XOFF_THR	170
+#define DEFAULT0_E2_BRB_MAC_PAUSE_XON_THR		250
+#define DEFAULT0_E2_BRB_MAC_FULL_XOFF_THR		10
+#define DEFAULT0_E2_BRB_MAC_FULL_XON_THR		50
 
 /* BRB thresholds for E2*/
 #define PFC_E2_BRB_MAC_PAUSE_XOFF_THR_PAUSE		170
@@ -177,6 +182,12 @@
 #define PFC_E2_BRB_MAC_FULL_XON_THR_PAUSE			50
 #define PFC_E2_BRB_MAC_FULL_XON_THR_NON_PAUSE		250
 
+/* BRB default for class 0 E3A0 */
+#define DEFAULT0_E3A0_BRB_MAC_PAUSE_XOFF_THR	290
+#define DEFAULT0_E3A0_BRB_MAC_PAUSE_XON_THR	410
+#define DEFAULT0_E3A0_BRB_MAC_FULL_XOFF_THR	10
+#define DEFAULT0_E3A0_BRB_MAC_FULL_XON_THR	50
+
 /* BRB thresholds for E3A0 */
 #define PFC_E3A0_BRB_MAC_PAUSE_XOFF_THR_PAUSE		290
 #define PFC_E3A0_BRB_MAC_PAUSE_XOFF_THR_NON_PAUSE		0
@@ -190,6 +201,11 @@
 #define PFC_E3A0_BRB_MAC_FULL_XON_THR_PAUSE		50
 #define PFC_E3A0_BRB_MAC_FULL_XON_THR_NON_PAUSE		410
 
+/* BRB default for E3B0 */
+#define DEFAULT0_E3B0_BRB_MAC_PAUSE_XOFF_THR	330
+#define DEFAULT0_E3B0_BRB_MAC_PAUSE_XON_THR	490
+#define DEFAULT0_E3B0_BRB_MAC_FULL_XOFF_THR	15
+#define DEFAULT0_E3B0_BRB_MAC_FULL_XON_THR	55
 
 /* BRB thresholds for E3B0 2 port mode*/
 #define PFC_E3B0_2P_BRB_MAC_PAUSE_XOFF_THR_PAUSE		1025
@@ -251,6 +267,18 @@
 #define PFC_E3B0_4P_BRB_MAC_1_CLASS_T_GUART		80
 #define PFC_E3B0_4P_BRB_MAC_1_CLASS_T_GUART_HYST		120
 
+/* Pause defines*/
+#define DEFAULT_E3B0_BRB_FULL_LB_XOFF_THR			330
+#define DEFAULT_E3B0_BRB_FULL_LB_XON_THR			490
+#define DEFAULT_E3B0_LB_GUART		40
+
+#define DEFAULT_E3B0_BRB_MAC_0_CLASS_T_GUART		40
+#define DEFAULT_E3B0_BRB_MAC_0_CLASS_T_GUART_HYST	0
+
+#define DEFAULT_E3B0_BRB_MAC_1_CLASS_T_GUART		40
+#define DEFAULT_E3B0_BRB_MAC_1_CLASS_T_GUART_HYST	0
+
+/* ETS defines*/
 #define DCBX_INVALID_COS					(0xFF)
 
 #define ETS_BW_LIMIT_CREDIT_UPPER_BOUND		(0x5000)
@@ -2009,6 +2037,8 @@ struct bnx2x_pfc_brb_threshold_val {
 };
 
 struct bnx2x_pfc_brb_e3b0_val {
+	u32 per_class_guaranty_mode;
+	u32 lb_guarantied_hyst;
 	u32 full_lb_xoff_th;
 	u32 full_lb_xon_threshold;
 	u32 lb_guarantied;
@@ -2021,6 +2051,9 @@ struct bnx2x_pfc_brb_e3b0_val {
 struct bnx2x_pfc_brb_th_val {
 	struct bnx2x_pfc_brb_threshold_val pauseable_th;
 	struct bnx2x_pfc_brb_threshold_val non_pauseable_th;
+	struct bnx2x_pfc_brb_threshold_val default_class0;
+	struct bnx2x_pfc_brb_threshold_val default_class1;
+
 };
 static int bnx2x_pfc_brb_get_config_params(
 				struct link_params *params,
@@ -2028,7 +2061,23 @@ static int bnx2x_pfc_brb_get_config_params(
 {
 	struct bnx2x *bp = params->bp;
 	DP(NETIF_MSG_LINK, "Setting PFC BRB configuration\n");
+
+	config_val->default_class1.pause_xoff = 0;
+	config_val->default_class1.pause_xon = 0;
+	config_val->default_class1.full_xoff = 0;
+	config_val->default_class1.full_xon = 0;
+
 	if (CHIP_IS_E2(bp)) {
+		/*  class0 defaults */
+		config_val->default_class0.pause_xoff =
+			DEFAULT0_E2_BRB_MAC_PAUSE_XOFF_THR;
+		config_val->default_class0.pause_xon =
+		    DEFAULT0_E2_BRB_MAC_PAUSE_XON_THR;
+		config_val->default_class0.full_xoff =
+		    DEFAULT0_E2_BRB_MAC_FULL_XOFF_THR;
+		config_val->default_class0.full_xon =
+		    DEFAULT0_E2_BRB_MAC_FULL_XON_THR;
+		/*  pause able*/
 		config_val->pauseable_th.pause_xoff =
 		    PFC_E2_BRB_MAC_PAUSE_XOFF_THR_PAUSE;
 		config_val->pauseable_th.pause_xon =
@@ -2047,6 +2096,16 @@ static int bnx2x_pfc_brb_get_config_params(
 		config_val->non_pauseable_th.full_xon =
 		    PFC_E2_BRB_MAC_FULL_XON_THR_NON_PAUSE;
 	} else if (CHIP_IS_E3A0(bp)) {
+		/*  class0 defaults */
+		config_val->default_class0.pause_xoff =
+			DEFAULT0_E3A0_BRB_MAC_PAUSE_XOFF_THR;
+		config_val->default_class0.pause_xon =
+		    DEFAULT0_E3A0_BRB_MAC_PAUSE_XON_THR;
+		config_val->default_class0.full_xoff =
+		    DEFAULT0_E3A0_BRB_MAC_FULL_XOFF_THR;
+		config_val->default_class0.full_xon =
+		    DEFAULT0_E3A0_BRB_MAC_FULL_XON_THR;
+		/*  pause able */
 		config_val->pauseable_th.pause_xoff =
 		    PFC_E3A0_BRB_MAC_PAUSE_XOFF_THR_PAUSE;
 		config_val->pauseable_th.pause_xon =
@@ -2065,29 +2124,39 @@ static int bnx2x_pfc_brb_get_config_params(
 		config_val->non_pauseable_th.full_xon =
 		    PFC_E3A0_BRB_MAC_FULL_XON_THR_NON_PAUSE;
 	} else if (CHIP_IS_E3B0(bp)) {
+		/*  class0 defaults */
+		config_val->default_class0.pause_xoff =
+			DEFAULT0_E3B0_BRB_MAC_PAUSE_XOFF_THR;
+		config_val->default_class0.pause_xon =
+		    DEFAULT0_E3B0_BRB_MAC_PAUSE_XON_THR;
+		config_val->default_class0.full_xoff =
+		    DEFAULT0_E3B0_BRB_MAC_FULL_XOFF_THR;
+		config_val->default_class0.full_xon =
+		    DEFAULT0_E3B0_BRB_MAC_FULL_XON_THR;
+
 		if (params->phy[INT_PHY].flags &
-		    FLAGS_4_PORT_MODE) {
+			FLAGS_4_PORT_MODE) {
 			config_val->pauseable_th.pause_xoff =
-			    PFC_E3B0_4P_BRB_MAC_PAUSE_XOFF_THR_PAUSE;
+				PFC_E3B0_4P_BRB_MAC_PAUSE_XOFF_THR_PAUSE;
 			config_val->pauseable_th.pause_xon =
-			    PFC_E3B0_4P_BRB_MAC_PAUSE_XON_THR_PAUSE;
+				PFC_E3B0_4P_BRB_MAC_PAUSE_XON_THR_PAUSE;
 			config_val->pauseable_th.full_xoff =
-			    PFC_E3B0_4P_BRB_MAC_FULL_XOFF_THR_PAUSE;
+				PFC_E3B0_4P_BRB_MAC_FULL_XOFF_THR_PAUSE;
 			config_val->pauseable_th.full_xon =
-			    PFC_E3B0_4P_BRB_MAC_FULL_XON_THR_PAUSE;
+				PFC_E3B0_4P_BRB_MAC_FULL_XON_THR_PAUSE;
 			/* non pause able*/
 			config_val->non_pauseable_th.pause_xoff =
-			    PFC_E3B0_4P_BRB_MAC_PAUSE_XOFF_THR_NON_PAUSE;
+			PFC_E3B0_4P_BRB_MAC_PAUSE_XOFF_THR_NON_PAUSE;
 			config_val->non_pauseable_th.pause_xon =
-			    PFC_E3B0_4P_BRB_MAC_PAUSE_XON_THR_NON_PAUSE;
+			PFC_E3B0_4P_BRB_MAC_PAUSE_XON_THR_NON_PAUSE;
 			config_val->non_pauseable_th.full_xoff =
-			    PFC_E3B0_4P_BRB_MAC_FULL_XOFF_THR_NON_PAUSE;
+			PFC_E3B0_4P_BRB_MAC_FULL_XOFF_THR_NON_PAUSE;
 			config_val->non_pauseable_th.full_xon =
-			    PFC_E3B0_4P_BRB_MAC_FULL_XON_THR_NON_PAUSE;
-	    } else {
-		config_val->pauseable_th.pause_xoff =
-		    PFC_E3B0_2P_BRB_MAC_PAUSE_XOFF_THR_PAUSE;
-		config_val->pauseable_th.pause_xon =
+			PFC_E3B0_4P_BRB_MAC_FULL_XON_THR_NON_PAUSE;
+		} else {
+			config_val->pauseable_th.pause_xoff =
+				PFC_E3B0_2P_BRB_MAC_PAUSE_XOFF_THR_PAUSE;
+			config_val->pauseable_th.pause_xon =
 		    PFC_E3B0_2P_BRB_MAC_PAUSE_XON_THR_PAUSE;
 		config_val->pauseable_th.full_xoff =
 		    PFC_E3B0_2P_BRB_MAC_FULL_XOFF_THR_PAUSE;
@@ -2109,59 +2178,83 @@ static int bnx2x_pfc_brb_get_config_params(
 	return 0;
 }
 
-
-static void bnx2x_pfc_brb_get_e3b0_config_params(struct link_params *params,
-						 struct bnx2x_pfc_brb_e3b0_val
-						 *e3b0_val,
-						 u32 cos0_pauseable,
-						 u32 cos1_pauseable)
+static void bnx2x_pfc_brb_get_e3b0_config_params(
+		struct link_params *params,
+		struct bnx2x_pfc_brb_e3b0_val
+		*e3b0_val,
+		struct bnx2x_nig_brb_pfc_port_params *pfc_params,
+		const u8 pfc_enabled)
 {
-	if (params->phy[INT_PHY].flags & FLAGS_4_PORT_MODE) {
+	if (pfc_enabled && pfc_params) {
+		e3b0_val->per_class_guaranty_mode = 1;
+		e3b0_val->lb_guarantied_hyst = 80;
+
+		if (params->phy[INT_PHY].flags &
+		    FLAGS_4_PORT_MODE) {
+			e3b0_val->full_lb_xoff_th =
+				PFC_E3B0_4P_BRB_FULL_LB_XOFF_THR;
+			e3b0_val->full_lb_xon_threshold =
+				PFC_E3B0_4P_BRB_FULL_LB_XON_THR;
+			e3b0_val->lb_guarantied =
+				PFC_E3B0_4P_LB_GUART;
+			e3b0_val->mac_0_class_t_guarantied =
+				PFC_E3B0_4P_BRB_MAC_0_CLASS_T_GUART;
+			e3b0_val->mac_0_class_t_guarantied_hyst =
+				PFC_E3B0_4P_BRB_MAC_0_CLASS_T_GUART_HYST;
+			e3b0_val->mac_1_class_t_guarantied =
+				PFC_E3B0_4P_BRB_MAC_1_CLASS_T_GUART;
+			e3b0_val->mac_1_class_t_guarantied_hyst =
+				PFC_E3B0_4P_BRB_MAC_1_CLASS_T_GUART_HYST;
+		} else {
+			e3b0_val->full_lb_xoff_th =
+				PFC_E3B0_2P_BRB_FULL_LB_XOFF_THR;
+			e3b0_val->full_lb_xon_threshold =
+				PFC_E3B0_2P_BRB_FULL_LB_XON_THR;
+			e3b0_val->mac_0_class_t_guarantied_hyst =
+				PFC_E3B0_2P_BRB_MAC_0_CLASS_T_GUART_HYST;
+			e3b0_val->mac_1_class_t_guarantied =
+				PFC_E3B0_2P_BRB_MAC_1_CLASS_T_GUART;
+			e3b0_val->mac_1_class_t_guarantied_hyst =
+				PFC_E3B0_2P_BRB_MAC_1_CLASS_T_GUART_HYST;
+
+			if (pfc_params->cos0_pauseable !=
+				pfc_params->cos1_pauseable) {
+				/* nonpauseable= Lossy + pauseable = Lossless*/
+				e3b0_val->lb_guarantied =
+					PFC_E3B0_2P_MIX_PAUSE_LB_GUART;
+				e3b0_val->mac_0_class_t_guarantied =
+			       PFC_E3B0_2P_MIX_PAUSE_MAC_0_CLASS_T_GUART;
+			} else if (pfc_params->cos0_pauseable) {
+				/* Lossless +Lossless*/
+				e3b0_val->lb_guarantied =
+					PFC_E3B0_2P_PAUSE_LB_GUART;
+				e3b0_val->mac_0_class_t_guarantied =
+				   PFC_E3B0_2P_PAUSE_MAC_0_CLASS_T_GUART;
+			} else {
+				/* Lossy +Lossy*/
+				e3b0_val->lb_guarantied =
+					PFC_E3B0_2P_NON_PAUSE_LB_GUART;
+				e3b0_val->mac_0_class_t_guarantied =
+			       PFC_E3B0_2P_NON_PAUSE_MAC_0_CLASS_T_GUART;
+			}
+		}
+	} else {
+		e3b0_val->per_class_guaranty_mode = 0;
+		e3b0_val->lb_guarantied_hyst = 0;
 		e3b0_val->full_lb_xoff_th =
-		    PFC_E3B0_4P_BRB_FULL_LB_XOFF_THR;
+			DEFAULT_E3B0_BRB_FULL_LB_XOFF_THR;
 		e3b0_val->full_lb_xon_threshold =
-		    PFC_E3B0_4P_BRB_FULL_LB_XON_THR;
+			DEFAULT_E3B0_BRB_FULL_LB_XON_THR;
 		e3b0_val->lb_guarantied =
-		    PFC_E3B0_4P_LB_GUART;
+			DEFAULT_E3B0_LB_GUART;
 		e3b0_val->mac_0_class_t_guarantied =
-		    PFC_E3B0_4P_BRB_MAC_0_CLASS_T_GUART;
-		e3b0_val->mac_0_class_t_guarantied_hyst =
-		    PFC_E3B0_4P_BRB_MAC_0_CLASS_T_GUART_HYST;
-		e3b0_val->mac_1_class_t_guarantied =
-		    PFC_E3B0_4P_BRB_MAC_1_CLASS_T_GUART;
-		e3b0_val->mac_1_class_t_guarantied_hyst =
-		    PFC_E3B0_4P_BRB_MAC_1_CLASS_T_GUART_HYST;
-	} else {
-		e3b0_val->full_lb_xoff_th =
-		    PFC_E3B0_2P_BRB_FULL_LB_XOFF_THR;
-		e3b0_val->full_lb_xon_threshold =
-		    PFC_E3B0_2P_BRB_FULL_LB_XON_THR;
+			DEFAULT_E3B0_BRB_MAC_0_CLASS_T_GUART;
 		e3b0_val->mac_0_class_t_guarantied_hyst =
-		    PFC_E3B0_2P_BRB_MAC_0_CLASS_T_GUART_HYST;
+			DEFAULT_E3B0_BRB_MAC_0_CLASS_T_GUART_HYST;
 		e3b0_val->mac_1_class_t_guarantied =
-		    PFC_E3B0_2P_BRB_MAC_1_CLASS_T_GUART;
+			DEFAULT_E3B0_BRB_MAC_1_CLASS_T_GUART;
 		e3b0_val->mac_1_class_t_guarantied_hyst =
-		    PFC_E3B0_2P_BRB_MAC_1_CLASS_T_GUART_HYST;
-
-		if (cos0_pauseable != cos1_pauseable) {
-			/* nonpauseable= Lossy + pauseable = Lossless*/
-			e3b0_val->lb_guarantied =
-			    PFC_E3B0_2P_MIX_PAUSE_LB_GUART;
-			e3b0_val->mac_0_class_t_guarantied =
-			    PFC_E3B0_2P_MIX_PAUSE_MAC_0_CLASS_T_GUART;
-		} else if (cos0_pauseable) {
-			/* Lossless +Lossless*/
-			e3b0_val->lb_guarantied =
-			    PFC_E3B0_2P_PAUSE_LB_GUART;
-			e3b0_val->mac_0_class_t_guarantied =
-			    PFC_E3B0_2P_PAUSE_MAC_0_CLASS_T_GUART;
-		} else {
-			/* Lossy +Lossy*/
-			e3b0_val->lb_guarantied =
-			    PFC_E3B0_2P_NON_PAUSE_LB_GUART;
-			e3b0_val->mac_0_class_t_guarantied =
-			    PFC_E3B0_2P_NON_PAUSE_MAC_0_CLASS_T_GUART;
-		}
+			DEFAULT_E3B0_BRB_MAC_1_CLASS_T_GUART_HYST;
 	}
 }
 static int bnx2x_update_pfc_brb(struct link_params *params,
@@ -2174,8 +2267,9 @@ static int bnx2x_update_pfc_brb(struct link_params *params,
 	struct bnx2x_pfc_brb_threshold_val *reg_th_config =
 	    &config_val.pauseable_th;
 	struct bnx2x_pfc_brb_e3b0_val e3b0_val = {0};
-	int set_pfc = params->feature_config_flags &
+	const int set_pfc = params->feature_config_flags &
 		FEATURE_CONFIG_PFC_ENABLED;
+	const u8 pfc_enabled = (set_pfc && pfc_params);
 	int bnx2x_status = 0;
 	u8 port = params->port;
 
@@ -2185,10 +2279,14 @@ static int bnx2x_update_pfc_brb(struct link_params *params,
 	if (0 != bnx2x_status)
 		return bnx2x_status;
 
-	if (set_pfc && pfc_params)
+	if (pfc_enabled) {
 		/* First COS */
-		if (!pfc_params->cos0_pauseable)
+		if (pfc_params->cos0_pauseable)
+			reg_th_config = &config_val.pauseable_th;
+		else
 			reg_th_config = &config_val.non_pauseable_th;
+	} else
+		reg_th_config = &config_val.default_class0;
 	/*
 	 * The number of free blocks below which the pause signal to class 0
 	 * of MAC #n is asserted. n=0,1
@@ -2215,12 +2313,14 @@ static int bnx2x_update_pfc_brb(struct link_params *params,
 	REG_WR(bp, (port) ? BRB1_REG_FULL_0_XON_THRESHOLD_1 :
 	       BRB1_REG_FULL_0_XON_THRESHOLD_0 , reg_th_config->full_xon);
 
-	if (set_pfc && pfc_params) {
+	if (pfc_enabled) {
 		/* Second COS */
 		if (pfc_params->cos1_pauseable)
 			reg_th_config = &config_val.pauseable_th;
 		else
 			reg_th_config = &config_val.non_pauseable_th;
+	} else
+		reg_th_config = &config_val.default_class1;
 		/*
 		 * The number of free blocks below which the pause signal to
 		 * class 1 of MAC #n is asserted. n=0,1
@@ -2250,32 +2350,34 @@ static int bnx2x_update_pfc_brb(struct link_params *params,
 		       BRB1_REG_FULL_1_XON_THRESHOLD_0,
 		       reg_th_config->full_xon);
 
+	if (CHIP_IS_E3B0(bp)) {
+		bnx2x_pfc_brb_get_e3b0_config_params(
+			params,
+			&e3b0_val,
+			pfc_params,
+			pfc_enabled);
 
-		if (CHIP_IS_E3B0(bp)) {
 			/*Should be done by init tool */
 			/*
 			* BRB_empty_for_dup = BRB1_REG_BRB_EMPTY_THRESHOLD
 			* reset value
 			* 944
 			*/
+		REG_WR(bp, BRB1_REG_PER_CLASS_GUARANTY_MODE,
+			   e3b0_val.per_class_guaranty_mode);
 
 			/**
 			 * The hysteresis on the guarantied buffer space for the Lb port
 			 * before signaling XON.
 			 **/
-			REG_WR(bp, BRB1_REG_LB_GUARANTIED_HYST, 80);
-
-			bnx2x_pfc_brb_get_e3b0_config_params(
-			    params,
-			    &e3b0_val,
-			    pfc_params->cos0_pauseable,
-			    pfc_params->cos1_pauseable);
+		REG_WR(bp, BRB1_REG_LB_GUARANTIED_HYST,
+			   e3b0_val.lb_guarantied_hyst);
 			/**
 			 * The number of free blocks below which the full signal to the
 			 * LB port is asserted.
 			*/
-			REG_WR(bp, BRB1_REG_FULL_LB_XOFF_THRESHOLD,
-				   e3b0_val.full_lb_xoff_th);
+		REG_WR(bp, BRB1_REG_FULL_LB_XOFF_THRESHOLD,
+			e3b0_val.full_lb_xoff_th);
 			/**
 			 * The number of free blocks above which the full signal to the
 			 * LB port is de-asserted.
@@ -2331,8 +2433,6 @@ static int bnx2x_update_pfc_brb(struct link_params *params,
 
 	    }
 
-	}
-
 	return bnx2x_status;
 }
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
index e58073e..92584d3 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
@@ -160,8 +160,11 @@
 #define BRB1_REG_PAUSE_HIGH_THRESHOLD_1 			 0x6007c
 /* [RW 10] Write client 0: Assert pause threshold. */
 #define BRB1_REG_PAUSE_LOW_THRESHOLD_0				 0x60068
-#define BRB1_REG_PAUSE_LOW_THRESHOLD_1				 0x6006c
-/* [R 24] The number of full blocks occupied by port. */
+/* [RW 1] Indicates if to use per-class guaranty mode (new mode) or per-MAC
+ * guaranty mode (backwards-compatible mode). 0=per-MAC guaranty mode (BC
+ * mode). 1=per-class guaranty mode (new mode). */
+#define BRB1_REG_PER_CLASS_GUARANTY_MODE			 0x60268
+/* [R 24] The number of full blocks occpied by port. */
 #define BRB1_REG_PORT_NUM_OCC_BLOCKS_0				 0x60094
 /* [RW 1] Reset the design by software. */
 #define BRB1_REG_SOFT_RESET					 0x600dc
-- 
1.7.7.1
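
The per-class threshold choice in bnx2x_update_pfc_brb() after this patch can be summarized with a small standalone sketch. The struct and function names are hypothetical simplifications of the driver's bnx2x_pfc_brb_th_val; only the decision order is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_thresholds {
	unsigned int pause_xoff, pause_xon, full_xoff, full_xon;
};

/* Simplified stand-in for struct bnx2x_pfc_brb_th_val. */
struct sk_brb_config {
	struct sk_thresholds pauseable;
	struct sk_thresholds non_pauseable;
	struct sk_thresholds default_class0;
	struct sk_thresholds default_class1;
};

const struct sk_thresholds *
sk_pick_thresholds(const struct sk_brb_config *cfg, int cos,
		   bool pfc_enabled, bool cos_pauseable)
{
	assert(cfg != NULL && (cos == 0 || cos == 1));

	/* PFC off: fall back to the new per-class defaults */
	if (!pfc_enabled)
		return cos == 0 ? &cfg->default_class0 : &cfg->default_class1;
	/* PFC on: lossless classes get the pauseable set */
	return cos_pauseable ? &cfg->pauseable : &cfg->non_pauseable;
}
```

Before the patch, the PFC-disabled case silently reused the pauseable thresholds for class 0 and skipped the class 1 registers; the defaults make both paths explicit.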

* [net-next v2 2/9] bnx2x: ETS changes
From: Yaniv Rosner @ 2011-11-28 10:49 UTC
  To: David Miller; +Cc: netdev, Yaniv Rosner, Eilon Greenstein
In-Reply-To: <1322477393-22904-1-git-send-email-yanivr@broadcom.com>

Fix a problem where the ETS does not conform when a new traffic class is created with 0% BW.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c |   19 ++++++++++++++++---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.h |    2 +-
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
index 8e6909a..de03730 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
@@ -874,23 +874,36 @@ static int bnx2x_ets_e3b0_set_cos_bw(struct bnx2x *bp,
 ******************************************************************************/
 static int bnx2x_ets_e3b0_get_total_bw(
 	const struct link_params *params,
-	const struct bnx2x_ets_params *ets_params,
+	struct bnx2x_ets_params *ets_params,
 	u16 *total_bw)
 {
 	struct bnx2x *bp = params->bp;
 	u8 cos_idx = 0;
+	u8 is_bw_cos_exist = 0;
 
 	*total_bw = 0 ;
+
 	/* Calculate total BW requested */
 	for (cos_idx = 0; cos_idx < ets_params->num_of_cos; cos_idx++) {
 		if (bnx2x_cos_state_bw == ets_params->cos[cos_idx].state) {
+			is_bw_cos_exist = 1;
+			if (!ets_params->cos[cos_idx].params.bw_params.bw) {
+				DP(NETIF_MSG_LINK, "bnx2x_ets_E3B0_config BW"
+						   "was set to 0\n");
+				/*
+				 * This is to prevent a state when ramrods
+				 * can't be sent
+				*/
+				ets_params->cos[cos_idx].params.bw_params.bw
+					 = 1;
+			}
 			*total_bw +=
 				ets_params->cos[cos_idx].params.bw_params.bw;
 		}
 	}
 
 	/* Check total BW is valid */
-	if ((100 != *total_bw) || (0 == *total_bw)) {
+	if ((1 == is_bw_cos_exist) && (100 != *total_bw)) {
 		if (0 == *total_bw) {
 			DP(NETIF_MSG_LINK,
 			   "bnx2x_ets_E3B0_config toatl BW shouldn't be 0\n");
@@ -1100,7 +1113,7 @@ static int bnx2x_ets_e3b0_sp_set_pri_cli_reg(const struct link_params *params,
 ******************************************************************************/
 int bnx2x_ets_e3b0_config(const struct link_params *params,
 			 const struct link_vars *vars,
-			 const struct bnx2x_ets_params *ets_params)
+			 struct bnx2x_ets_params *ets_params)
 {
 	struct bnx2x *bp = params->bp;
 	int bnx2x_status = 0;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.h
index 2a46e63..e02a68a 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.h
@@ -479,7 +479,7 @@ int bnx2x_ets_strict(const struct link_params *params, const u8 strict_cos);
 /*  Configure the COS to ETS according to BW and SP settings.*/
 int bnx2x_ets_e3b0_config(const struct link_params *params,
 			 const struct link_vars *vars,
-			 const struct bnx2x_ets_params *ets_params);
+			 struct bnx2x_ets_params *ets_params);
 /* Read pfc statistic*/
 void bnx2x_pfc_statistic(struct link_params *params, struct link_vars *vars,
 						 u32 pfc_frames_sent[2],
-- 
1.7.7.1
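
A standalone sketch of the revised total-bandwidth check. The types and the `sk_` names are hypothetical, and returning -EINVAL on a bad total is an assumption, since the error path is cut off by the hunk boundary above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum sk_cos_state { SK_COS_STRICT, SK_COS_BW };

struct sk_cos {
	enum sk_cos_state state;
	uint8_t bw;		/* percent, only meaningful for SK_COS_BW */
};

int sk_ets_total_bw(struct sk_cos *cos, size_t num_cos, uint16_t *total_bw)
{
	int have_bw_cos = 0;
	size_t i;

	assert(cos != NULL && total_bw != NULL);
	*total_bw = 0;
	for (i = 0; i < num_cos; i++) {
		if (cos[i].state != SK_COS_BW)
			continue;
		have_bw_cos = 1;
		if (cos[i].bw == 0)
			cos[i].bw = 1;	/* bump 0% to 1%, as the patch does */
		*total_bw += cos[i].bw;
	}
	/* The 100% rule only applies when at least one BW class exists */
	if (have_bw_cos && *total_bw != 100)
		return -EINVAL;		/* assumed error code */
	return 0;
}
```

Because the function now rewrites a 0% entry, its parameter can no longer be const, which is why the patch drops the qualifier from bnx2x_ets_e3b0_config() as well.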

* [net-next v2 3/9] bnx2x: Warpcore HW reset following fan failure
From: Yaniv Rosner @ 2011-11-28 10:49 UTC
  To: David Miller; +Cc: netdev, Yaniv Rosner, Eilon Greenstein
In-Reply-To: <1322477393-22904-1-git-send-email-yanivr@broadcom.com>

Put Warpcore in low power mode in case of fan failure to reduce heat.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c |    8 ++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h  |    9 +++++++++
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
index de03730..7eabcee 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
@@ -8216,7 +8216,15 @@ static void bnx2x_warpcore_power_module(struct link_params *params,
 static void bnx2x_warpcore_hw_reset(struct bnx2x_phy *phy,
 				    struct link_params *params)
 {
+	struct bnx2x *bp = params->bp;
 	bnx2x_warpcore_power_module(params, phy, 0);
+	/* Put Warpcore in low power mode */
+	REG_WR(bp, MISC_REG_WC0_RESET, 0x0c0e);
+
+	/* Put LCPLL in low power mode */
+	REG_WR(bp, MISC_REG_LCPLL_E40_PWRDWN, 1);
+	REG_WR(bp, MISC_REG_LCPLL_E40_RESETB_ANA, 0);
+	REG_WR(bp, MISC_REG_LCPLL_E40_RESETB_DIG, 0);
 }
 
 static void bnx2x_power_sfp_module(struct link_params *params,
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
index 92584d3..d5a0dde 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
@@ -1622,6 +1622,14 @@
    register bits. */
 #define MISC_REG_LCPLL_CTRL_1					 0xa2a4
 #define MISC_REG_LCPLL_CTRL_REG_2				 0xa2a8
+/* [RW 1] LCPLL power down. Global register. Active High. Reset on POR
+ * reset. */
+#define MISC_REG_LCPLL_E40_PWRDWN				 0xaa74
+/* [RW 1] LCPLL VCO reset. Global register. Active Low Reset on POR reset. */
+#define MISC_REG_LCPLL_E40_RESETB_ANA				 0xaa78
+/* [RW 1] LCPLL post-divider reset. Global register. Active Low Reset on POR
+ * reset. */
+#define MISC_REG_LCPLL_E40_RESETB_DIG				 0xaa7c
 /* [RW 4] Interrupt mask register #0 read/write */
 #define MISC_REG_MISC_INT_MASK					 0xa388
 /* [RW 1] Parity mask register #0 read/write */
@@ -1757,6 +1765,7 @@
  * is compared to the value on ctrl_md_devad. Drives output
  * misc_xgxs0_phy_addr. Global register. */
 #define MISC_REG_WC0_CTRL_PHY_ADDR				 0xa9cc
+#define MISC_REG_WC0_RESET					 0xac30
 /* [RW 2] XMAC Core port mode. Indicates the number of ports on the system
    side. This should be less than or equal to phy_port_mode; if some of the
    ports are not used. This enables reduction of frequency on the core side.
-- 
1.7.7.1

* Re[2]:  [v4 PATCH 2/2] NETFILTER userspace part for target HMARK
From: Hans Schillstrom @ 2011-11-28  8:37 UTC
  To: Jan Engelhardt
  Cc: Hans Schillström, kaber@trash.net, pablo@netfilter.org,
	netfilter-devel@vger.kernel.org, netdev@vger.kernel.org

>>>>+Parameters:
>>>>+For all masks default is all "1:s", to disable a field use mask 0
>>>>+For IPv6 it's just the last 32 bits that is included in the hash
>>>
>>>Why limit IPv6 to 32?
>>
>>Performance, and the gain of adding another 192 bits to jhash ain't much.
>>However, there are some cases where it hurts, i.e. when you can't mask off a subnet.
>>I'm not sure if it's a problem or not...
>
>I was thinking about the case where two particular hosts have the same 
>trailing 32 bits in their source address. For example, assuming IPv6 
>starts to take a stronghold in the real world and home customers start 
>assigning <myprefix>::1 to the little home server (i.e. the PPP 
>endpoint) of theirs for remote login.

Yes, that's a good point. I will have a look at this and see how to speed up the IPv6 calc.

By the way, when parsing with xoption.c, is there a way to allow both hex format and mask length?
i.e. --smask 0xffff0000 or --smask /16

/Hans
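
Independent of what the option parser itself offers, accepting both spellings can be sketched in plain C. This is a hypothetical helper using only strtoul(); it is not an xtables API and makes no claim about how the HMARK extension ended up parsing its masks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Accepts either a numeric mask ("0xffff0000") or a prefix length ("/16")
 * and stores the resulting 32-bit mask. Returns 0 or -EINVAL. */
int sk_parse_mask(const char *arg, uint32_t *mask)
{
	const char *num;
	char *end;
	unsigned long v;
	int is_len;

	assert(arg != NULL && mask != NULL);
	is_len = (arg[0] == '/');
	num = is_len ? arg + 1 : arg;
	if (num[0] < '0' || num[0] > '9')
		return -EINVAL;	/* rejects "", "/", signs and whitespace */

	errno = 0;
	v = strtoul(num, &end, is_len ? 10 : 0);
	if (errno == ERANGE || *end != '\0' || v > 0xffffffffUL)
		return -EINVAL;

	if (!is_len) {
		*mask = (uint32_t)v;
		return 0;
	}
	if (v > 32)
		return -EINVAL;
	/* /0 is special-cased: shifting a 32-bit value by 32 is undefined */
	*mask = v ? 0xffffffffu << (32 - v) : 0;
	return 0;
}
```

With this helper, "--smask 0xffff0000" and "--smask /16" would yield the same mask value.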


* [PATCH] hugetlb: release pages in the error path of hugetlb_cow() (was: Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1)
From: Michal Hocko @ 2011-11-28  8:33 UTC
  To: stable-u79uwXL29TY76Z2rM5mHXA
  Cc: Rafael J. Wysocki, Andy Lutomirski, Linux Kernel Mailing List,
	Maciej Rutecki, Florian Mickler, Andrew Morton,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, Linux Wireless List, DRI,
	Linus Torvalds
In-Reply-To: <CA+55aFygSFt+O5KLoiE_0V+o45eKfsoDDV5ML8EF=J0n9z_D-Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Mon 21-11-11 14:18:29, Linus Torvalds wrote:
> On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org> wrote:
> >
> > Subject    : hugetlb oops on 3.1.0-rc8-devel
> > Submitter  : Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
> > Date       : 2011-11-01 22:20
> > Message-ID : CALCETrW1mpVCz2tO5roaz1r6vnno+srHR-dHA6_pkRi2qiCfdw@mail.gmail.com
> > References : http://marc.info/?l=linux-kernel&m=132018604426692&w=2
> 
> Despite the subject line, that's not an oops, it's a BUG_ON().
> 
> And it *should* be fixed by commit ea4039a34c4c ("hugetlb: release
> pages in the error path of hugetlb_cow()") although I don't think Andy
> ever confirmed that (since it was hard to trigger).

AFAICS the issue was introduced by 0fe6e20b (hugetlb, rmap:
add reverse mapping for hugepage) in 2.6.36-rc1, so this is stable
material. I do not see the patch in any stable branch, so here we go.
The patch is on top of the 3.0.y branch and it applies as is to 3.1.y
as well.
---
From fdaa4aaa008cce149a5fd60934112acd8988e0b6 Mon Sep 17 00:00:00 2001
From: Hillf Danton <dhillf-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Tue, 15 Nov 2011 14:36:12 -0800
Subject: [PATCH] hugetlb: release pages in the error path of hugetlb_cow()

commit ea4039a34c4c206d015d34a49d0b00868e37db1d upstream.

If we fail to prepare an anon_vma, the {new, old}_page should be released,
or they will leak.

Signed-off-by: Hillf Danton <dhillf-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Andrea Arcangeli <aarcange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: Johannes Weiner <jweiner-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Signed-off-by: Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
---
 mm/hugetlb.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bfcf153..2b57cd9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2415,6 +2415,8 @@ retry_avoidcopy:
 	 * anon_vma prepared.
 	 */
 	if (unlikely(anon_vma_prepare(vma))) {
+		page_cache_release(new_page);
+		page_cache_release(old_page);
 		/* Caller expects lock to be held */
 		spin_lock(&mm->page_table_lock);
 		return VM_FAULT_OOM;
-- 
1.7.7.3


-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic
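
The pattern behind the fix, dropping every reference taken before a fallible step when that step fails, can be illustrated with a userspace mock. Everything here is hypothetical: `struct sk_page` is a bare counter, `prepare_ok` stands in for the result of anon_vma_prepare(), and the success path is simplified to drop both references.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sk_page { int refcount; };

void sk_get_page(struct sk_page *p)
{
	assert(p != NULL);
	p->refcount++;
}

void sk_put_page(struct sk_page *p)
{
	assert(p != NULL && p->refcount > 0);
	p->refcount--;
}

int sk_cow_step(struct sk_page *old_page, struct sk_page *new_page,
		int prepare_ok)
{
	/* References are taken before the step that may fail */
	sk_get_page(old_page);
	sk_get_page(new_page);

	if (!prepare_ok) {
		/* Without these two puts the pages would leak */
		sk_put_page(new_page);
		sk_put_page(old_page);
		return -ENOMEM;
	}

	/* ... the copy itself is omitted in this sketch ... */
	sk_put_page(new_page);
	sk_put_page(old_page);
	return 0;
}
```

After a failed call, both mock reference counts are back at their starting values, which is the property the two added page_cache_release() calls restore in hugetlb_cow().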

* Re: [NET][TG3] Fwd: page allocation failure with linux 3.1.1
From: Eric Dumazet @ 2011-11-28  8:26 UTC
  To: Пламен Петров
  Cc: David Miller, netdev
In-Reply-To: <CALkh-Hi=Pb_cNUzHrJSe4o5-kmb+jLA8vpw8hiaGxROzNJ6oag@mail.gmail.com>

On Monday, 28 November 2011 at 09:16 +0200, Пламен Петров wrote:

> Well, Eric, thanks for the explanation! I will disable TSO and will
> see how is that working out for me - if I recall correctly, I had no
> problems when TSO was off.
> 
> Sorry to bother you, if the above explanation was somewhere readily
> available, but I didn't manage to find it.

You don't bother me at all :)

As a matter of fact, your mail reminded me of something I wanted to do in
the past and forgot about.

Since some devices might copy skb to a linear one in their
ndo_start_xmit(), we could set a netdev limit so that TSO can still be
used on these devices, but limiting the number of frags to
2^PAGE_ALLOC_COSTLY_ORDER

(Since PAGE_ALLOC_COSTLY_ORDER == 3, that's 8 frags, instead of 16 right
now on x86)

Of course, if memory is really tight, even an ATOMIC order-3 allocation
might fail. If this happens, device limit could be dynamically
decreased. In the end, only linear skb could be built by TCP.
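
The arithmetic behind this proposal can be sketched in userspace C. This is
only an illustration, not kernel code: it assumes 4 kB pages and page-sized
frags, ignores header and skb_shared_info overhead, and the helper names
order_for_bytes() and proposed_max_frags() are hypothetical.

```c
#include <stddef.h>

/* Value quoted in the mail above. */
#define PAGE_ALLOC_COSTLY_ORDER 3
/* Assumption: 4 kB pages, as on x86. */
#define MODEL_PAGE_SIZE 4096u

/* Hypothetical helper: smallest allocation order whose block
 * (2^order contiguous pages) can hold `bytes` of linear data. */
static unsigned int order_for_bytes(size_t bytes)
{
	unsigned int order = 0;
	size_t span = MODEL_PAGE_SIZE;

	while (span < bytes) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Hypothetical helper: the proposed per-device frag limit. */
static unsigned int proposed_max_frags(void)
{
	return 1u << PAGE_ALLOC_COSTLY_ORDER;
}
```

Under these assumptions, linearizing 16 page-sized frags needs an order-4
block, while capping at 8 frags keeps the copy at order 3.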

^ permalink raw reply

* [UPDATED PATCH 21/62] net: remove the second argument of k[un]map_atomic()
From: Cong Wang @ 2011-11-28  7:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, Jiri Pirko, e1000-devel, Dean Nelson, Bruce Allan,
	Jesse Brandeburg, David S. Miller, John Ronciak, netdev, akpm,
	Ian Campbell
In-Reply-To: <1322371662-26166-22-git-send-email-amwang@redhat.com>

Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
---
 drivers/net/ethernet/intel/e1000/e1000_main.c |    6 ++----
 drivers/net/ethernet/intel/e1000e/netdev.c    |   10 ++++------
 drivers/net/ethernet/sun/cassini.c            |    4 ++--
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index cf480b5..b194beb 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -3878,11 +3878,9 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 				if (length <= copybreak &&
 				    skb_tailroom(skb) >= length) {
 					u8 *vaddr;
-					vaddr = kmap_atomic(buffer_info->page,
-					                    KM_SKB_DATA_SOFTIRQ);
+					vaddr = kmap_atomic(buffer_info->page);
 					memcpy(skb_tail_pointer(skb), vaddr, length);
-					kunmap_atomic(vaddr,
-					              KM_SKB_DATA_SOFTIRQ);
+					kunmap_atomic(vaddr);
 					/* re-use the page, so don't erase
 					 * buffer_info->page */
 					skb_put(skb, length);
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index a855db1..8603c87 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1272,9 +1272,9 @@ static bool e1000_clean_rx_irq_ps(struct e1000_adapter *adapter,
 			 */
 			dma_sync_single_for_cpu(&pdev->dev, ps_page->dma,
 						PAGE_SIZE, DMA_FROM_DEVICE);
-			vaddr = kmap_atomic(ps_page->page, KM_SKB_DATA_SOFTIRQ);
+			vaddr = kmap_atomic(ps_page->page);
 			memcpy(skb_tail_pointer(skb), vaddr, l1);
-			kunmap_atomic(vaddr, KM_SKB_DATA_SOFTIRQ);
+			kunmap_atomic(vaddr);
 			dma_sync_single_for_device(&pdev->dev, ps_page->dma,
 						   PAGE_SIZE, DMA_FROM_DEVICE);
 
@@ -1465,12 +1465,10 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 				if (length <= copybreak &&
 				    skb_tailroom(skb) >= length) {
 					u8 *vaddr;
-					vaddr = kmap_atomic(buffer_info->page,
-					                   KM_SKB_DATA_SOFTIRQ);
+					vaddr = kmap_atomic(buffer_info->page);
 					memcpy(skb_tail_pointer(skb), vaddr,
 					       length);
-					kunmap_atomic(vaddr,
-					              KM_SKB_DATA_SOFTIRQ);
+					kunmap_atomic(vaddr);
 					/* re-use the page, so don't erase
 					 * buffer_info->page */
 					skb_put(skb, length);
diff --git a/drivers/net/ethernet/sun/cassini.c b/drivers/net/ethernet/sun/cassini.c
index fd40988..c22a195 100644
--- a/drivers/net/ethernet/sun/cassini.c
+++ b/drivers/net/ethernet/sun/cassini.c
@@ -104,8 +104,8 @@
 #include <asm/byteorder.h>
 #include <asm/uaccess.h>
 
-#define cas_page_map(x)      kmap_atomic((x), KM_SKB_DATA_SOFTIRQ)
-#define cas_page_unmap(x)    kunmap_atomic((x), KM_SKB_DATA_SOFTIRQ)
+#define cas_page_map(x)      kmap_atomic((x))
+#define cas_page_unmap(x)    kunmap_atomic((x))
 #define CAS_NCPUS            num_online_cpus()
 
 #define cas_skb_release(x)  netif_rx(x)
-- 
1.7.4.4


_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply related

* Re: [PATCH 21/62] net: remove the second argument of k[un]map_atomic()
From: Cong Wang @ 2011-11-28  7:39 UTC (permalink / raw)
  To: Cong Wang
  Cc: Jiri Pirko, Ian Campbell, David S. Miller, e1000-devel,
	Dean Nelson, Bruce Allan, Jesse Brandeburg, linux-kernel,
	John Ronciak, netdev, akpm
In-Reply-To: <1322371662-26166-22-git-send-email-amwang@redhat.com>

On 2011-11-27 13:27, Cong Wang wrote:
> Signed-off-by: Cong Wang<amwang@redhat.com>
...
>   create mode 100644 drivers/net/team/Module.symvers

This piece is not correct, I will send an updated patch.

^ permalink raw reply

* [PATCH] ipv6: Set mcast_hops to IPV6_DEFAULT_MCASTHOPS when -1 was given.
From: Li Wei @ 2011-11-28  7:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

We need to set np->mcast_hops to its default value at this point;
otherwise, when we later use it and find that its value is -1, the logic
that gets the default hop limit does not take multicast into account and
returns the wrong hop limit (IPV6_DEFAULT_HOPLIMIT), which is for unicast.

Signed-off-by: Li Wei <lw@cn.fujitsu.com>
---
 net/ipv6/ipv6_sockglue.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index c99e3ee..26cb08c 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -503,7 +503,7 @@ done:
 			goto e_inval;
 		if (val > 255 || val < -1)
 			goto e_inval;
-		np->mcast_hops = val;
+		np->mcast_hops = (val == -1 ? IPV6_DEFAULT_MCASTHOPS : val);
 		retv = 0;
 		break;

-- 
1.7.3.2
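
The effect of the one-line change above can be modelled in userspace C. This
is a sketch only: set_mcast_hops() is a hypothetical stand-in for the
setsockopt path, -1 stands in for -EINVAL, and the two constants are assumed
to have the values 64 and 1.

```c
/* Assumed values of the kernel constants. */
#define IPV6_DEFAULT_HOPLIMIT   64
#define IPV6_DEFAULT_MCASTHOPS  1

/* Hypothetical model of the patched setsockopt branch.
 * Returns 0 on success, -1 (standing in for -EINVAL) on a bad value. */
static int set_mcast_hops(int *mcast_hops, int val)
{
	if (val > 255 || val < -1)
		return -1;
	/* -1 means "use the default": store the multicast default
	 * right away instead of leaving -1 for later lookups. */
	*mcast_hops = (val == -1 ? IPV6_DEFAULT_MCASTHOPS : val);
	return 0;
}
```

Storing the resolved default at set time means later readers never see -1,
so they cannot fall back to the unicast default by mistake.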

^ permalink raw reply related

* Re: [NET][TG3] Fwd: page allocation failure with linux 3.1.1
From: Пламен Петров @ 2011-11-28  7:16 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1322462145.2826.6.camel@edumazet-laptop>

2011/11/28 Eric Dumazet <eric.dumazet@gmail.com>:
> On Sunday 27 November 2011 at 22:34 +0200, Пламен Петров wrote:
>> Hello, David, Jarek and Eric!
>>
>> In September 2010 I had trouble with the Broadcom Tigon 3 network driver:
>>
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=64289c8e6851bca0e589e064c9a5c9fbd6ae5dd4
>>
>> Please, see below my most recent findings, where again tg3 is present.
>>
>> I sent this earlier today to linux-kernel list. Sorry if this is not
>> related to the above, thus to the tg3 kernel driver.
>>
>> ---------- Forwarded message ----------
>> From: Пламен Петров <plamen.sisi@gmail.com>
>> Date: 2011/11/27
>> Subject: page allocation failure with linux 3.1.1
>> To: linux-kernel@vger.kernel.org
>>
>>
>> Hello, folks!
>>
>> Please, cc me, as I'm not subscribed to the list.
>>
>> With Linux 3.1.1 I'm hitting a page allocation failure on a machine
>> with 512 MB RAM and 2 GB swap, which looks like this:
>>
>> [388632.749423] swapper: page allocation failure: order:4, mode:0x4020
>> [388632.749431] Pid: 0, comm: swapper Not tainted 3.1.1-FS #1
>> [388632.749434] Call Trace:
>> [388632.749447]  [<c106cd03>] ? warn_alloc_failed+0xab/0xd5
>> [388632.749454]  [<c106f5f5>] ? __alloc_pages_nodemask+0x4ea/0x6aa
>> [388632.749460]  [<c106f822>] ? __get_free_pages+0x14/0x32
>> [388632.749467]  [<c12aecc7>] ? __alloc_skb+0x4c/0xe9
>> [388632.749472]  [<c12b02cc>] ? skb_copy+0x2e/0x7a
>> [388632.749478]  [<c122de36>] ? tg3_start_xmit+0x82f/0xa37
>> [388632.749484]  [<c12b778b>] ? dev_hard_start_xmit+0x27e/0x4f8
>> [388632.749495]  [<c130e851>] ? ip_fragment+0x7dd/0x7dd
>> [388632.749500]  [<c12e8b88>] ? nf_iterate+0x66/0x87
...
>
> You hit a known tg3 hardware bug, and the workaround needs to copy skb
> to a new one.
>
> Since TCP layer can build big skbs if TSO is on, and you can hit an
> allocation error because your memory gets fragmented after a while, a
> way to avoid this is to switch off TSO :
>
> ethtool -K eth0 tso off
> ip route flush cache
>
>

Well, Eric, thanks for the explanation! I will disable TSO and
see how that works out for me - if I recall correctly, I had no
problems when TSO was off.

Sorry to bother you, if the above explanation was somewhere readily
available, but I didn't manage to find it.

Thanks,
Plamen

^ permalink raw reply

* [PATCH] net: Fix corruption in /proc/*/net/dev_mcast
From: Anton Blanchard @ 2011-11-28  7:14 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Sasha Levin, David Miller, Matt Mackall, Christoph Lameter,
	Pekka Enberg, linux-mm, linux-kernel, netdev
In-Reply-To: <1321870529.2552.19.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>


Hi,

> I got the following output when running some tests (I'm not really sure
> what exactly happened when this bug was triggered):
> 
> [13850.947279] =============================================================================
> [13850.948024] BUG kmalloc-8: Redzone overwritten
> [13850.948024] -----------------------------------------------------------------------------
> [13850.948024] 
> [13850.948024] INFO: 0xffff8800104f6d28-0xffff8800104f6d2b. First byte 0x0 instead of 0xcc
> [13850.948024] INFO: Allocated in __seq_open_private+0x20/0x5e age=4436 cpu=0 pid=17295
> [13850.948024] 	__slab_alloc.clone.46+0x3e7/0x456
> [13850.948024] 	__kmalloc+0x8c/0x110
> [13850.948024] 	__seq_open_private+0x20/0x5e
> [13850.948024] 	seq_open_net+0x3b/0x5d
> [13850.948024] 	dev_mc_seq_open+0x15/0x17
> [13850.948024] 	proc_reg_open+0xad/0x127

I just hit this during my testing. Isn't there another bug lurking?

Anton
--


With slub debugging on I see red zone issues in /proc/*/net/dev_mcast:

=============================================================================
BUG kmalloc-8: Redzone overwritten
-----------------------------------------------------------------------------

INFO: 0xc0000000de9dec48-0xc0000000de9dec4b. First byte 0x0 instead of 0xcc
INFO: Allocated in .__seq_open_private+0x30/0xa0 age=0 cpu=5 pid=3896
	.__kmalloc+0x1e0/0x2d0
	.__seq_open_private+0x30/0xa0
	.seq_open_net+0x60/0xe0
	.dev_mc_seq_open+0x4c/0x70
	.proc_reg_open+0xd8/0x260
	.__dentry_open.clone.11+0x2b8/0x400
	.do_last+0xf4/0x950
	.path_openat+0xf8/0x480
	.do_filp_open+0x48/0xc0
	.do_sys_open+0x140/0x250
	syscall_exit+0x0/0x40

dev_mc_seq_ops uses dev_seq_start/next/stop but only allocates
sizeof(struct seq_net_private) of private data, whereas it expects
sizeof(struct dev_iter_state):

struct dev_iter_state {
	struct seq_net_private p;
	unsigned int pos; /* bucket << BUCKET_SPACE + offset */
};

Create dev_seq_open_ops and use it so we don't have to expose
struct dev_iter_state.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: linux-net/include/linux/netdevice.h
===================================================================
--- linux-net.orig/include/linux/netdevice.h	2011-11-28 17:55:51.469508056 +1100
+++ linux-net/include/linux/netdevice.h	2011-11-28 17:55:52.985535812 +1100
@@ -2536,6 +2536,8 @@ extern void		net_disable_timestamp(void)
 extern void *dev_seq_start(struct seq_file *seq, loff_t *pos);
 extern void *dev_seq_next(struct seq_file *seq, void *v, loff_t *pos);
 extern void dev_seq_stop(struct seq_file *seq, void *v);
+extern int dev_seq_open_ops(struct inode *inode, struct file *file,
+			    const struct seq_operations *ops);
 #endif
 
 extern int netdev_class_create_file(struct class_attribute *class_attr);
Index: linux-net/net/core/dev.c
===================================================================
--- linux-net.orig/net/core/dev.c	2011-11-28 17:55:51.481508276 +1100
+++ linux-net/net/core/dev.c	2011-11-28 17:55:52.989535885 +1100
@@ -4282,6 +4282,12 @@ static int dev_seq_open(struct inode *in
 			    sizeof(struct dev_iter_state));
 }
 
+int dev_seq_open_ops(struct inode *inode, struct file *file,
+		     const struct seq_operations *ops)
+{
+	return seq_open_net(inode, file, ops, sizeof(struct dev_iter_state));
+}
+
 static const struct file_operations dev_seq_fops = {
 	.owner	 = THIS_MODULE,
 	.open    = dev_seq_open,
Index: linux-net/net/core/dev_addr_lists.c
===================================================================
--- linux-net.orig/net/core/dev_addr_lists.c	2011-11-28 17:55:47.845441705 +1100
+++ linux-net/net/core/dev_addr_lists.c	2011-11-28 17:55:52.989535885 +1100
@@ -696,8 +696,7 @@ static const struct seq_operations dev_m
 
 static int dev_mc_seq_open(struct inode *inode, struct file *file)
 {
-	return seq_open_net(inode, file, &dev_mc_seq_ops,
-			    sizeof(struct seq_net_private));
+	return dev_seq_open_ops(inode, file, &dev_mc_seq_ops);
 }
 
 static const struct file_operations dev_mc_seq_fops = {
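
The size mismatch described in the changelog can be illustrated in userspace
C. This is a sketch under assumptions: seq_net_private_model is a stand-in
holding a single pointer (consistent with the kmalloc-8 report on a 64-bit
machine), and pos_write_overflows() is a hypothetical helper.

```c
#include <stddef.h>

/* Stand-in for struct seq_net_private: assumed to hold one pointer. */
struct seq_net_private_model {
	void *net;
};

/* Mirrors the layout of struct dev_iter_state quoted above. */
struct dev_iter_state_model {
	struct seq_net_private_model p;
	unsigned int pos;
};

/* Hypothetical check: would a write to ->pos land outside a private
 * area that was allocated with only alloc_size bytes? */
static int pos_write_overflows(size_t alloc_size)
{
	size_t end = offsetof(struct dev_iter_state_model, pos) +
		     sizeof(unsigned int);

	return end > alloc_size;
}
```

With the smaller allocation, the write to pos falls immediately after the
object, which is consistent with the 4-byte red zone overwrite reported.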

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [NET][TG3] Fwd: page allocation failure with linux 3.1.1
From: Eric Dumazet @ 2011-11-28  6:35 UTC (permalink / raw)
  To: Пламен Петров
  Cc: David Miller, Jarek Poplawski, netdev
In-Reply-To: <CALkh-HiaDSJb7sStH=DHxXCmecFWYAC2q7hx72gGvFSYz+2r=A@mail.gmail.com>

On Sunday 27 November 2011 at 22:34 +0200, Пламен Петров wrote:
> Hello, David, Jarek and Eric!
> 
> In September 2010 I had trouble with the Broadcom Tigon 3 network driver:
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=64289c8e6851bca0e589e064c9a5c9fbd6ae5dd4
> 
> Please, see below my most recent findings, where again tg3 is present.
> 
> I sent this earlier today to linux-kernel list. Sorry if this is not
> related to the above, thus to the tg3 kernel driver.
> 
> ---------- Forwarded message ----------
> From: Пламен Петров <plamen.sisi@gmail.com>
> Date: 2011/11/27
> Subject: page allocation failure with linux 3.1.1
> To: linux-kernel@vger.kernel.org
> 
> 
> Hello, folks!
> 
> Please, cc me, as I'm not subscribed to the list.
> 
> With Linux 3.1.1 I'm hitting a page allocation failure on a machine
> with 512 MB RAM and 2 GB swap, which looks like this:
> 
> [388632.749423] swapper: page allocation failure: order:4, mode:0x4020
> [388632.749431] Pid: 0, comm: swapper Not tainted 3.1.1-FS #1
> [388632.749434] Call Trace:
> [388632.749447]  [<c106cd03>] ? warn_alloc_failed+0xab/0xd5
> [388632.749454]  [<c106f5f5>] ? __alloc_pages_nodemask+0x4ea/0x6aa
> [388632.749460]  [<c106f822>] ? __get_free_pages+0x14/0x32
> [388632.749467]  [<c12aecc7>] ? __alloc_skb+0x4c/0xe9
> [388632.749472]  [<c12b02cc>] ? skb_copy+0x2e/0x7a
> [388632.749478]  [<c122de36>] ? tg3_start_xmit+0x82f/0xa37
> [388632.749484]  [<c12b778b>] ? dev_hard_start_xmit+0x27e/0x4f8
> [388632.749495]  [<c130e851>] ? ip_fragment+0x7dd/0x7dd
> [388632.749500]  [<c12e8b88>] ? nf_iterate+0x66/0x87
> [388632.749505]  [<c12ca904>] ? sch_direct_xmit+0xa8/0x170
> [388632.749510]  [<c12e8c0f>] ? nf_hook_slow+0x66/0x11f
> [388632.749514]  [<c12b7c05>] ? dev_queue_xmit+0x129/0x4ce
> [388632.749518]  [<c130db10>] ? ip_local_out+0x18/0x1a
> [388632.749523]  [<c1321630>] ? tcp_transmit_skb+0x380/0x843
> [388632.749527]  [<c1321d5b>] ? tcp_write_xmit+0x1ba/0x956
> [388632.749532]  [<c132254e>] ? __tcp_push_pending_frames+0x24/0x7e
> [388632.749536]  [<c131f622>] ? tcp_rcv_established+0x3b2/0x74d
> [388632.749541]  [<c134831c>] ? ipv4_confirm+0xe1/0x179
> [388632.749545]  [<c1325f59>] ? tcp_v4_do_rcv+0x133/0x361
> [388632.749550]  [<c12e8b88>] ? nf_iterate+0x66/0x87
> [388632.749555]  [<c1149923>] ? security_sock_rcv_skb+0xc/0xd
> [388632.749559]  [<c13283bf>] ? tcp_v4_rcv+0x536/0x789
> [388632.749564]  [<c130a54a>] ? ip_local_deliver_finish+0x91/0x1ff
> [388632.749568]  [<c130a4b9>] ? ip_rcv_finish+0x339/0x339
> [388632.749572]  [<c130a27b>] ? ip_rcv_finish+0xfb/0x339
> [388632.749576]  [<c12b5ac6>] ? __netif_receive_skb+0xf5/0x438
> [388632.749581]  [<c12b890a>] ? netif_receive_skb+0x60/0x65
> [388632.749585]  [<c12b8bff>] ? napi_skb_finish+0x28/0x36
> [388632.749589]  [<c1232909>] ? tg3_poll_work+0x4da/0xaf0
> [388632.749594]  [<c1006760>] ? text_poke_smp_batch+0x31/0x31
> [388632.749599]  [<c1233016>] ? tg3_poll+0x5b/0x2e8
> [388632.749603]  [<c12b90af>] ? net_rx_action+0x70/0xfd
> [388632.749608]  [<c10331b1>] ? __do_softirq+0x6e/0xe7
> [388632.749612]  [<c1033143>] ? local_bh_enable_ip+0x76/0x76
> [388632.749614]  <IRQ>  [<c10333ea>] ? irq_exit+0x65/0x86
> [388632.749621]  [<c1003ca0>] ? do_IRQ+0x3a/0x97
> [388632.749627]  [<c13cc429>] ? common_interrupt+0x29/0x30
> [388632.749631]  [<c1008044>] ? mwait_idle+0x42/0x4f
> [388632.749635]  [<c100140a>] ? cpu_idle+0x50/0x78
> [388632.749640]  [<c159a627>] ? start_kernel+0x271/0x276
> [388632.749644]  [<c159a15e>] ? loglevel+0x2b/0x2b
> [388632.749646] Mem-Info:
> [388632.749648] DMA per-cpu:
> [388632.749651] CPU    0: hi:    0, btch:   1 usd:   0
> [388632.749654] CPU    1: hi:    0, btch:   1 usd:   0
> [388632.749656] Normal per-cpu:
> [388632.749659] CPU    0: hi:  186, btch:  31 usd: 138
> [388632.749662] CPU    1: hi:  186, btch:  31 usd: 127
> [388632.749667] active_anon:3930 inactive_anon:8795 isolated_anon:0
> [388632.749669]  active_file:42299 inactive_file:49253 isolated_file:0
> [388632.749670]  unevictable:0 dirty:518 writeback:0 unstable:0
> [388632.749672]  free:2317 slab_reclaimable:5924 slab_unreclaimable:3513
> [388632.749673]  mapped:3122 shmem:50 pagetables:469 bounce:0
> [388632.749681] DMA free:3056kB min:84kB low:104kB high:124kB
> active_anon:0kB inactive_anon:8kB active_file:2004kB
> inactive_file:8168kB unevictable:0kB isolated(anon):0kB
> isolated(file):0kB present:15804kB mlocked:0kB dirty:0kB writeback:0kB
> mapped:12kB shmem:0kB slab_reclaimable:1168kB slab_unreclaimable:196kB
> kernel_stack:8kB pagetables:0kB unstable:0kB bounce:0kB
> writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [388632.749687] lowmem_reserve[]: 0 490 490
> [388632.749697] Normal free:6212kB min:2788kB low:3484kB high:4180kB
> active_anon:15720kB inactive_anon:35172kB active_file:167192kB
> inactive_file:188844kB unevictable:0kB isolated(anon):0kB
> isolated(file):0kB present:502396kB mlocked:0kB dirty:2072kB
> writeback:0kB mapped:12476kB shmem:200kB slab_reclaimable:22528kB
> slab_unreclaimable:13856kB kernel_stack:1280kB pagetables:1876kB
> unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
> all_unreclaimable? no
> [388632.749703] lowmem_reserve[]: 0 0 0
> [388632.749708] DMA: 360*4kB 94*8kB 48*16kB 3*32kB 0*64kB 0*128kB
> 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3056kB
> [388632.749721] Normal: 633*4kB 264*8kB 66*16kB 16*32kB 0*64kB 0*128kB
> 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6212kB
> [388632.749733] 92084 total pagecache pages
> [388632.749736] 482 pages in swap cache
> [388632.749738] Swap cache stats: add 6685, delete 6203, find 168669/168766
> [388632.749741] Free swap  = 1984732kB
> [388632.749743] Total swap = 2007996kB
> [388632.751066] 130668 pages RAM
> [388632.751069] 2789 pages reserved
> [388632.751071] 65065 pages shared
> [388632.751073] 84385 pages non-shared
> 
> The swap partition is on a software raid 1 device, like this:
> 
> root@master:~# swapon -s
> Filename                                Type            Size    Used    Priority
> /dev/md/2                               partition       2007996 46912   -1
> 
> Should I try running Linux 3.1.2 or 3.1.3 ?
> 
> I'm attaching the full dmesg, as well as the kernel .config file used.
> 
> Also, if anyone needs more details about my setup - just ask, and I'll
> help with what I can.
> 
> Thanks in advance,
> Plamen Petrov

You hit a known tg3 hardware bug, and the workaround needs to copy skb
to a new one.

Since TCP layer can build big skbs if TSO is on, and you can hit an
allocation error because your memory gets fragmented after a while, a
way to avoid this is to switch off TSO :

ethtool -K eth0 tso off
ip route flush cache
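
The fragmentation can be read off the "Normal:" free-list line in the report
above. The following userspace C sketch redoes that arithmetic; it assumes
4 kB pages and ignores watermarks and allocation flags, so it only models
"is there any free block of this order or larger". The helper names are
hypothetical.

```c
#define MODEL_NR_ORDERS 11	/* orders 0..10, i.e. 4 kB .. 4096 kB blocks */

/* Free block counts per order, copied from the "Normal:" line above. */
static const unsigned int normal_free[MODEL_NR_ORDERS] = {
	633, 264, 66, 16, 0, 0, 0, 0, 0, 0, 0
};

/* Total free memory in kB: a block of a given order is 4 kB << order. */
static unsigned long total_free_kb(const unsigned int *counts)
{
	unsigned long kb = 0;
	unsigned int order;

	for (order = 0; order < MODEL_NR_ORDERS; order++)
		kb += (unsigned long)counts[order] * (4ul << order);
	return kb;
}

/* Nonzero if some free block of at least `order` exists. */
static int has_free_block(const unsigned int *counts, unsigned int order)
{
	for (; order < MODEL_NR_ORDERS; order++)
		if (counts[order])
			return 1;
	return 0;
}
```

About 6 MB is free in total, but the largest free block is 32 kB (order 3),
so an order-4 (64 kB) request has nothing to be carved from.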

^ permalink raw reply

* RE: [PATCH net-next] netxen: write IP address to firmware when using bonding
From: Rajesh Borundia @ 2011-11-28  6:02 UTC (permalink / raw)
  To: David Miller, andy@greyhouse.net; +Cc: netdev, Sony Chacko
In-Reply-To: <20111124.003855.1498231373002177172.davem@davemloft.net>

Hi Andy,

We need to restore the IP address after the adapter reset.
netxen_restore_indev_addr is the function that restores normal IP addresses
and VLAN IP addresses. If we could find the bond device directly from the slave,
we could use netxen_config_indev_addr to add the IP address of the bond device.
Otherwise we may need to cache the bond IP address in the function netxen_list_config_vlan_ip
and change the condition from

if (!is_vlan_dev(dev))
                return;

to

if (!is_vlan_dev(dev) && !is_bond_dev(dev))
                return;


Some of the code in the if and else branches is repeated.
If possible, could we have small functions for that?
e.g.:
+               if (!is_netxen_netdev(dev))
+                       goto done;
+
+               adapter = netdev_priv(dev);
+               if (!adapter)
+                       goto done;
+
+               if (adapter->is_up != NETXEN_ADAPTER_UP_MAGIC)
+                       goto done;
+
+               netxen_config_indev_addr(adapter, orig_dev, event);


Rajesh
________________________________________
From: David Miller [davem@davemloft.net]
Sent: Thursday, November 24, 2011 11:08 AM
To: andy@greyhouse.net
Cc: netdev; Sony Chacko; Rajesh Borundia
Subject: Re: [PATCH net-next] netxen: write IP address to firmware when using bonding

From: Andy Gospodarek <andy@greyhouse.net>
Date: Wed, 23 Nov 2011 22:24:27 -0500

> Are you talking about adding a macro like this:
>
>       for_each_dev_in_bond(bond,slave) {
>               [...]
>       }
>
> to replace the statements I added that were like this:
>
>       for_each_netdev_rcu(&init_net, slave) {
>               if (slave->master == dev) {
>                       [...]
>               }
>       }
>
> If so, that totally seems reasonable.  If you were requesting something
> else, please let me know.

Yes, some helper that walks the device list and tries to find
a device whose ->master matches 'dev'.
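
A helper of the kind discussed here could look roughly like the following
userspace C model. Everything in it is hypothetical: netdev_model,
next_slave_of() and for_each_slave_of() are illustration-only names, the
device list is a plain singly linked list, and the locking that a real
kernel walk would need is left out.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct net_device. */
struct netdev_model {
	const char *name;
	struct netdev_model *master;	/* bond this device is enslaved to */
	struct netdev_model *next;	/* next device in the global list */
};

/* Return the first device after `prev` (or from `head` if prev is NULL)
 * whose ->master is `bond`, or NULL when there are no more. */
static struct netdev_model *next_slave_of(struct netdev_model *head,
					  struct netdev_model *prev,
					  const struct netdev_model *bond)
{
	struct netdev_model *d = prev ? prev->next : head;

	for (; d; d = d->next)
		if (d->master == bond)
			return d;
	return NULL;
}

/* Iterate over every device in the list whose master is `bond`. */
#define for_each_slave_of(head, bond, slave)				\
	for ((slave) = next_slave_of((head), NULL, (bond)); (slave);	\
	     (slave) = next_slave_of((head), (slave), (bond)))

/* Example user of the macro: count the slaves of `bond`. */
static unsigned int count_slaves(struct netdev_model *head,
				 const struct netdev_model *bond)
{
	struct netdev_model *slave;
	unsigned int n = 0;

	for_each_slave_of(head, bond, slave)
		n++;
	return n;
}
```

Wrapping the filter in one macro keeps the "walk all devices, skip those
whose master differs" test out of each caller.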

^ permalink raw reply

* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
From: Jason Wang @ 2011-11-28  4:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: krkumar2, arnd, netdev, virtualization, levinsasha928,
	David Miller
In-Reply-To: <20111127172333.GD31987@redhat.com>

On 11/28/2011 01:23 AM, Michael S. Tsirkin wrote:
> On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
>> From: Krishna Kumar2<krkumar2@in.ibm.com>
>> Date: Fri, 25 Nov 2011 09:39:11 +0530
>>
>>> Jason Wang<jasowang@redhat.com>  wrote on 11/25/2011 08:51:57 AM:
>>>> My description is not clear again :(
>>>> I mean the same vhost thread:
>>>>
>>>> vhost thread #0 transmits packets of flow A on processor M
>>>> ...
>>>> vhost thread #0 moves to another processor N and starts to transmit packets
>>>> of flow A
>>> Thanks for clarifying. Yes, binding vhosts to CPU's
>>> makes the incoming packet go to the same vhost each
>>> time. BTW, are you doing any binding and/or irqbalance
>>> when you run your tests? I am not running either at
>>> this time, but thought both might be useful.
>> So are we going with this patch or are we saying that vhost binding
>> is a requirement?
> I think it's a good idea to make sure we understand the problem
> root cause well before applying the patch. We still
> have a bit of time before 3.2. In particular, why does
> the vhost thread bounce between CPUs so much?

Other than this, since we cannot assume anything about the behavior of the
underlying NIC, using rxhash to identify a flow is a more generic approach.
>
> Long term it seems the best way is to expose the preferred mapping
> from the guest and forward it to the device.
>

I was working on this and hope to post it soon.

^ permalink raw reply

* Re: [PATCH v6 10/10] Disable task moving when using kernel memory accounting
From: KAMEZAWA Hiroyuki @ 2011-11-28  4:32 UTC (permalink / raw)
  To: Glauber Costa
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, lizf-BthXqXjhjHXQFUHtdCDX3A,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	paul-inf54ven1CmVyaH7bEyXVA, gthelen-hpIqsD4AKlfQT0dZR+AlfA,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	kirill-oKw7cIdHH8eLwutG50LtGA, avagin-bzQdu9zFT3WakBO8gow8eQ,
	devel-GEFAQzZX7r8dnm+yROfE0A, eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	cgroups-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1322242696-27682-11-git-send-email-glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>

On Fri, 25 Nov 2011 15:38:16 -0200
Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org> wrote:

> Since this code is still experimental, we are leaving the exact
> details of how to move tasks between cgroups when kernel memory
> accounting is used as future work.
> 
> For now, we simply disallow movement if there are any pending
> accounted memory.
> 
> Signed-off-by: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
> CC: Hiroyouki Kamezawa <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
> ---
>  mm/memcontrol.c |   23 ++++++++++++++++++++++-
>  1 files changed, 22 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2df5d3c..ab7e57b 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5451,10 +5451,19 @@ static int mem_cgroup_can_attach(struct cgroup_subsys *ss,
>  {
>  	int ret = 0;
>  	struct mem_cgroup *mem = mem_cgroup_from_cont(cgroup);
> +	struct mem_cgroup *from = mem_cgroup_from_task(p);
> +
> +#if defined(CONFIG_CGROUP_MEM_RES_CTLR_KMEM) && defined(CONFIG_INET)
> +	if (from != mem && !mem_cgroup_is_root(from) &&
> +	    res_counter_read_u64(&from->tcp_mem.tcp_memory_allocated, RES_USAGE)) {
> +		printk(KERN_WARNING "Can't move tasks between cgroups: "
> +			"Kernel memory held. task: %s\n", p->comm);
> +		return 1;
> +	}
> +#endif

Hmm, the kernel memory is not guaranteed to be held by the 'task', is it?

How about
"For now, moving tasks between cgroups is disallowed while the source cgroup
 contains a kmem reference." ?

Hmm.. we need to fix this task-move/rmdir issue before production use.


Thanks,
-Kame

--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
From: Jason Wang @ 2011-11-28  4:25 UTC (permalink / raw)
  To: Krishna Kumar2
  Cc: arnd, Michael S. Tsirkin, netdev, virtualization, levinsasha928,
	davem
In-Reply-To: <OF835A6FF5.7D752AD1-ON65257953.0016350C-65257953.00169AF4@in.ibm.com>

On 11/25/2011 12:09 PM, Krishna Kumar2 wrote:
> Jason Wang<jasowang@redhat.com>  wrote on 11/25/2011 08:51:57 AM:
>> My description is not clear again :(
>> I mean the same vhost thread:
>>
>> vhost thread #0 transmits packets of flow A on processor M
>> ...
>> vhost thread #0 moves to another processor N and starts to transmit packets
>> of flow A
> Thanks for clarifying. Yes, binding vhosts to CPU's
> makes the incoming packet go to the same vhost each
> time. BTW, are you doing any binding and/or irqbalance
> when you run your tests? I am not running either at
> this time, but thought both might be useful.

I'm also using ixgbe for testing. On the host, its driver seems to provide
an irq affinity hint, so no binding or irqbalance is needed. In the guest,
irqbalance is used. I've also tried binding irqs in the guest, and it can
improve the rx performance.

> - KK
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v6 09/10] Display maximum tcp memory allocation in kmem cgroup
From: KAMEZAWA Hiroyuki @ 2011-11-28  4:22 UTC (permalink / raw)
  To: Glauber Costa
  Cc: linux-kernel, lizf, ebiederm, davem, paul, gthelen, netdev,
	linux-mm, kirill, avagin, devel, eric.dumazet, cgroups
In-Reply-To: <1322242696-27682-10-git-send-email-glommer@parallels.com>

On Fri, 25 Nov 2011 15:38:15 -0200
Glauber Costa <glommer@parallels.com> wrote:

> This patch introduces kmem.tcp.max_usage_in_bytes file, living in the
> kmem_cgroup filesystem. The root cgroup will display a value equal
> to RESOURCE_MAX. This is to avoid introducing any locking schemes in
> the network paths when cgroups are not being actively used.
> 
> All others, will see the maximum memory ever used by this cgroup.
> 
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> CC: David S. Miller <davem@davemloft.net>
> CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
> CC: Eric W. Biederman <ebiederm@xmission.com>

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

^ permalink raw reply

* Re: [PATCH v6 08/10] Display current tcp failcnt in kmem cgroup
From: KAMEZAWA Hiroyuki @ 2011-11-28  4:21 UTC (permalink / raw)
  To: Glauber Costa
  Cc: linux-kernel, lizf, ebiederm, davem, paul, gthelen, netdev,
	linux-mm, kirill, avagin, devel, eric.dumazet, cgroups
In-Reply-To: <1322242696-27682-9-git-send-email-glommer@parallels.com>

On Fri, 25 Nov 2011 15:38:14 -0200
Glauber Costa <glommer@parallels.com> wrote:

> This patch introduces kmem.tcp.failcnt file, living in the
> kmem_cgroup filesystem. Following the pattern in the other
> memcg resources, this files keeps a counter of how many times
> allocation failed due to limits being hit in this cgroup.
> The root cgroup will always show a failcnt of 0.
> 
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> CC: David S. Miller <davem@davemloft.net>
> CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
> CC: Eric W. Biederman <ebiederm@xmission.com>

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

^ permalink raw reply

* Re: [PATCH v6 07/10] Display current tcp memory allocation in kmem cgroup
From: KAMEZAWA Hiroyuki @ 2011-11-28  4:20 UTC (permalink / raw)
  To: Glauber Costa
  Cc: linux-kernel, lizf, ebiederm, davem, paul, gthelen, netdev,
	linux-mm, kirill, avagin, devel, eric.dumazet, cgroups
In-Reply-To: <1322242696-27682-8-git-send-email-glommer@parallels.com>

On Fri, 25 Nov 2011 15:38:13 -0200
Glauber Costa <glommer@parallels.com> wrote:

> This patch introduces kmem.tcp.usage_in_bytes file, living in the
> kmem_cgroup filesystem. It is a simple read-only file that displays the
> amount of kernel memory currently consumed by the cgroup.
> 
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> CC: David S. Miller <davem@davemloft.net>
> CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
> CC: Eric W. Biederman <ebiederm@xmission.com>

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


^ permalink raw reply

* Re: [PATCH v6 06/10] tcp buffer limitation: per-cgroup limit
From: KAMEZAWA Hiroyuki @ 2011-11-28  3:24 UTC (permalink / raw)
  To: Glauber Costa
  Cc: linux-kernel, lizf, ebiederm, davem, paul, gthelen, netdev,
	linux-mm, kirill, avagin, devel, eric.dumazet, cgroups
In-Reply-To: <1322242696-27682-7-git-send-email-glommer@parallels.com>


some comments.

On Fri, 25 Nov 2011 15:38:12 -0200
Glauber Costa <glommer@parallels.com> wrote:

> This patch uses the "tcp.limit_in_bytes" field of the kmem_cgroup to
> effectively control the amount of kernel memory pinned by a cgroup.
> 
> This value is ignored in the root cgroup, and in all others,
> caps the value specified by the admin in the net namespaces'
> view of tcp_sysctl_mem.
> 
> If namespaces are being used, the admin is allowed to set a
> value bigger than cgroup's maximum, the same way it is allowed
> to set pretty much unlimited values in a real box.
> 
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> CC: David S. Miller <davem@davemloft.net>
> CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
> CC: Eric W. Biederman <ebiederm@xmission.com>

<snip>

>  EXPORT_SYMBOL(tcp_destroy_cgroup);
> +
> +int tcp_update_limit(struct mem_cgroup *memcg, u64 val)
> +{
> +	struct net *net = current->nsproxy->net_ns;
> +	struct tcp_memcontrol *tcp;
> +	struct cg_proto *cg_proto;
> +	int i;
> +	int ret;
> +
> +	cg_proto = tcp_prot.proto_cgroup(memcg);
> +	if (!cg_proto)
> +		return -EINVAL;
> +
> +	tcp = tcp_from_cgproto(cg_proto);
> +
> +	ret = res_counter_set_limit(&tcp->tcp_memory_allocated, val);

Here, you changed the limit.

> +	if (ret)
> +		return ret;
> +
> +	val >>= PAGE_SHIFT;

Here, you modify 'val' (it is now in pages, not bytes).

> +
> +	for (i = 0; i < 3; i++)
> +		tcp->tcp_prot_mem[i] = min_t(long, val,
> +					     net->ipv4.sysctl_tcp_mem[i]);
> +
> +	if (val == RESOURCE_MAX)
> +		jump_label_dec(&memcg_socket_limit_enabled);

After the shift above, 'val' can never be RESOURCE_MAX.


> +	else {
> +		u64 old_lim;
> +		old_lim = res_counter_read_u64(&tcp->tcp_memory_allocated,
> +					       RES_LIMIT);

Hasn't the old limit already been overwritten by res_counter_set_limit() above?

> +		if (old_lim == RESOURCE_MAX)
> +			jump_label_inc(&memcg_socket_limit_enabled);
> +	}
> +	return 0;
> +}
> +

Thanks,
-Kame
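
The three comments above can be sketched in a small user-space model. This is
only an illustration under stated assumptions, not the actual fix: struct
model_tcp, model_update_limit(), limit_enabled and the sysctl values are
hypothetical stand-ins for tcp_memcontrol, tcp_update_limit(), the
memcg_socket_limit_enabled jump label and net->ipv4.sysctl_tcp_mem. The sketch
reads the old limit before overwriting it, keeps 'val' in bytes for the
RESOURCE_MAX comparison, and only changes the enable count when the limit
actually moves between "unlimited" and "limited".

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define PAGE_SHIFT   12                     /* assumption: 4 KiB pages */
#define RESOURCE_MAX ((uint64_t)LLONG_MAX)  /* stand-in for the kernel constant */

/* Hypothetical stand-in for the per-cgroup TCP state in the patch. */
struct model_tcp {
	uint64_t limit;          /* models the res_counter limit, in bytes */
	long long prot_mem[3];   /* models tcp_prot_mem[], in pages */
};

/* Made-up values standing in for net->ipv4.sysctl_tcp_mem[]. */
static long long sysctl_tcp_mem[3] = { 1000, 2000, 3000 };

/* Models the reference count behind memcg_socket_limit_enabled. */
static int limit_enabled;

static long long min_ll(long long a, long long b)
{
	return a < b ? a : b;
}

static int model_update_limit(struct model_tcp *tcp, uint64_t val)
{
	/* Read the old limit BEFORE it is overwritten. */
	uint64_t old_lim = tcp->limit;
	/* Keep 'val' in bytes; use a separate variable for pages. */
	uint64_t pages = val >> PAGE_SHIFT;
	int i;

	tcp->limit = val;

	for (i = 0; i < 3; i++)
		tcp->prot_mem[i] = min_ll((long long)pages, sysctl_tcp_mem[i]);

	/* Compare the unshifted value, and only on a real transition. */
	if (val == RESOURCE_MAX && old_lim != RESOURCE_MAX)
		limit_enabled--;
	else if (val != RESOURCE_MAX && old_lim == RESOURCE_MAX)
		limit_enabled++;

	return 0;
}
```

With this ordering, setting a finite limit twice in a row increments the count
only once, and writing RESOURCE_MAX afterwards brings it back to zero.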

--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net-next 3/3] dsa: Move switch drivers to new directory drivers/net/dsa
From: Joe Perches @ 2011-11-28  3:16 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: David Miller, Lennert Buytenhek, netdev
In-Reply-To: <1322449713.7454.36.camel@deadeye>

On Mon, 2011-11-28 at 03:08 +0000, Ben Hutchings wrote:
> Support for specific hardware belongs under drivers/net/ not net/.

Not sure I agree but maybe
drivers/net/ethernet/marvell/dsa?

^ permalink raw reply
