Netdev List
* [PATCH] AF_RXRPC: Handle receiving ACKALL packets
From: David Howells @ 2011-02-28 13:27 UTC
  To: netdev; +Cc: linux-kernel, David Howells

The OpenAFS server is now sending ACKALL packets, so we need to handle them.
Otherwise we report a protocol error and abort.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 net/rxrpc/ar-input.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/rxrpc/ar-input.c b/net/rxrpc/ar-input.c
index a4fc974..996d3ef 100644
--- a/net/rxrpc/ar-input.c
+++ b/net/rxrpc/ar-input.c
@@ -423,6 +423,7 @@ void rxrpc_fast_process_packet(struct rxrpc_call *call, struct sk_buff *skb)
 			goto protocol_error;
 		}
 
+	case RXRPC_PACKET_TYPE_ACKALL:
 	case RXRPC_PACKET_TYPE_ACK:
 		/* ACK processing is done in process context */
 		read_lock_bh(&call->state_lock);

* [PATCH] RxRPC: Fix v1 keys
From: David Howells @ 2011-02-28 13:27 UTC
  To: netdev; +Cc: linux-kernel, Anton Blanchard, David Howells

From: Anton Blanchard <anton@au1.ibm.com>

commit 339412841d7 (RxRPC: Allow key payloads to be passed in XDR form)
broke klog for me. I notice the v1 key struct had a kif_version field
added:

-struct rxkad_key {
-       u16     security_index;         /* RxRPC header security index */
-       u16     ticket_len;             /* length of ticket[] */
-       u32     expiry;                 /* time at which expires */
-       u32     kvno;                   /* key version number */
-       u8      session_key[8];         /* DES session key */
-       u8      ticket[0];              /* the encrypted ticket */
-};

+struct rxrpc_key_data_v1 {
+       u32             kif_version;            /* 1 */
+       u16             security_index;
+       u16             ticket_length;
+       u32             expiry;                 /* time_t */
+       u32             kvno;
+       u8              session_key[8];
+       u8              ticket[0];
+};

However the code in rxrpc_instantiate strips it away:

	data += sizeof(kver);
	datalen -= sizeof(kver);

Removing kif_version fixes my problem.
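
The mismatch described above can be reproduced in userspace. The following
is a minimal sketch, not the kernel code: the struct names, build_payload()
and the two decode helpers are hypothetical, and the payload layout (a u32
version word followed by the v1 fields) is assumed from the description
above. It strips the version word the way rxrpc_instantiate is described as
doing, then decodes the remainder with both struct layouts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout the stripped payload actually has (no version word). */
struct key_v1_fixed {
	uint16_t security_index;
	uint16_t ticket_length;
	uint32_t expiry;
	uint32_t kvno;
	uint8_t  session_key[8];
};

/* Layout with the extra leading field, as in the broken header. */
struct key_v1_with_kver {
	uint32_t kif_version;
	uint16_t security_index;
	uint16_t ticket_length;
	uint32_t expiry;
	uint32_t kvno;
	uint8_t  session_key[8];
};

/* Hypothetical helper: build "u32 version word + v1 fields" in buf. */
static size_t build_payload(uint8_t *buf, size_t buflen,
			    const struct key_v1_fixed *k)
{
	uint32_t kver = 1;

	assert(buflen >= sizeof(kver) + sizeof(*k));
	memcpy(buf, &kver, sizeof(kver));
	memcpy(buf + sizeof(kver), k, sizeof(*k));
	return sizeof(kver) + sizeof(*k);
}

/* Strip the version word, then decode with the matching layout. */
static uint16_t decode_security_index_fixed(const uint8_t *data, size_t datalen)
{
	struct key_v1_fixed k;

	data += sizeof(uint32_t);
	datalen -= sizeof(uint32_t);
	assert(datalen >= sizeof(k));
	memcpy(&k, data, sizeof(k));
	return k.security_index;
}

/* Strip the version word, then decode with the layout that still
 * expects kif_version: every field is read 4 bytes too late. */
static uint16_t decode_security_index_with_kver(const uint8_t *data,
						size_t datalen)
{
	struct key_v1_with_kver k;

	memset(&k, 0, sizeof(k));
	data += sizeof(uint32_t);
	datalen -= sizeof(uint32_t);
	memcpy(&k, data, datalen < sizeof(k) ? datalen : sizeof(k));
	return k.security_index;
}
```

With a payload whose security_index is 2, the first decoder returns 2 while
the second returns half of the expiry field instead.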

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
---

 include/keys/rxrpc-type.h |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/include/keys/rxrpc-type.h b/include/keys/rxrpc-type.h
index 5cb86c3..fc48754 100644
--- a/include/keys/rxrpc-type.h
+++ b/include/keys/rxrpc-type.h
@@ -99,7 +99,6 @@ struct rxrpc_key_token {
  * structure of raw payloads passed to add_key() or instantiate key
  */
 struct rxrpc_key_data_v1 {
-	u32		kif_version;		/* 1 */
 	u16		security_index;
 	u16		ticket_length;
 	u32		expiry;			/* time_t */

* [PATCH] fcoe: correct checking for bonding
From: Jiri Pirko @ 2011-02-28 13:32 UTC
  To: linux-scsi; +Cc: devel, robert.w.love, James.Bottomley, netdev

Check for IFF_BONDING as this flag is set for all bonding devices.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
---
 drivers/scsi/fcoe/fcoe.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index 9f9600b..67714a4 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -285,9 +285,7 @@ static int fcoe_interface_setup(struct fcoe_interface *fcoe,
 	}
 
 	/* Do not support for bonding device */
-	if ((netdev->priv_flags & IFF_MASTER_ALB) ||
-	    (netdev->priv_flags & IFF_SLAVE_INACTIVE) ||
-	    (netdev->priv_flags & IFF_MASTER_8023AD)) {
+	if (netdev->priv_flags & IFF_BONDING) {
 		FCOE_NETDEV_DBG(netdev, "Bonded interfaces not supported\n");
 		return -EOPNOTSUPP;
 	}
-- 
1.7.3.4

* Re: SO_REUSEPORT - can it be done in kernel?
From: Eric Dumazet @ 2011-02-28 13:32 UTC
  To: Herbert Xu
  Cc: David Miller, rick.jones2, therbert, wsommerfeld, daniel.baluta,
	netdev
In-Reply-To: <20110228113659.GA20726@gondor.apana.org.au>

On Monday, 28 February 2011 at 19:36 +0800, Herbert Xu wrote:
> On Sun, Feb 27, 2011 at 07:06:14PM +0800, Herbert Xu wrote:
> > I'm working on this right now.
> 
> OK I think I was definitely on the right track.  With the send
> patch made lockless I now get numbers which are even better than
> those obtained with running named with multiple sockets.  That's
> right, a single socket is now faster than what multiple sockets
> were without the patch (of course, multiple sockets may still
> be faster with the patch vs. a single socket for obvious reasons,
> but I couldn't measure any significant difference).
> 
> Also worthy of note is that prior to the patch all CPUs showed
> idleness (lazy bastards!), with the patch they're all maxed out.
> 
> In retrospect, the idleness was simply the result of the socket
> lock scheduling away and was an indication of lock contention.
> 

Now the input path can run without finding the socket locked by the xmit
path, so skbs are queued into the receive queue, not the backlog.

> Here are the patches I used.  Please don't apply them yet as I intend
> to clean them up quite a bit.
> 
> But please do test them heavily, especially if you have an AMD
> NUMA machine as that's where scalability problems really show
> up.  Intel tends to be a lot more forgiving.  My last AMD machine
> blew up years ago :)

I am going to test them, thanks !



* [PATCH net-2.6 0/7] bnx2x fixes
From: Dmitry Kravkov @ 2011-02-28 13:37 UTC
  To: davem; +Cc: netdev, Eilon Greenstein, Vladislav Zolotarov

Hi Dave,

Please consider applying the series with bnx2x fixes to net-2.6.

Thanks
Dmitry


 



^ permalink raw reply

* [PATCH net-2.6 7/7] bnx2x: update driver version to 1.62.00-6
From: Dmitry Kravkov @ 2011-02-28 13:37 UTC
  To: davem, netdev; +Cc: Eilon Greenstein


Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x/bnx2x.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.h b/drivers/net/bnx2x/bnx2x.h
index 368cfcd..7897d11 100644
--- a/drivers/net/bnx2x/bnx2x.h
+++ b/drivers/net/bnx2x/bnx2x.h
@@ -22,7 +22,7 @@
  * (you will need to reboot afterwards) */
 /* #define BNX2X_STOP_ON_ERROR */
 
-#define DRV_MODULE_VERSION      "1.62.00-5"
+#define DRV_MODULE_VERSION      "1.62.00-6"
 #define DRV_MODULE_RELDATE      "2011/01/30"
 #define BNX2X_BC_VER            0x040200
 
-- 
1.7.2.2

* [PATCH net-2.6 6/7] bnx2x: properly calculate lro_mss
From: Dmitry Kravkov @ 2011-02-28 13:37 UTC
  To: davem, netdev; +Cc: Eilon Greenstein, Vladislav Zolotarov


From: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x/bnx2x_cmn.c |   48 +++++++++++++++++++++++++++++++++++-----
 1 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x_cmn.c b/drivers/net/bnx2x/bnx2x_cmn.c
index a58baf3..73a1f8e 100644
--- a/drivers/net/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/bnx2x/bnx2x_cmn.c
@@ -259,10 +259,44 @@ static void bnx2x_tpa_start(struct bnx2x_fastpath *fp, u16 queue,
 #endif
 }
 
+/* Timestamp option length allowed for TPA aggregation:
+ *
+ *		nop nop kind length echo val
+ */
+#define TPA_TSTAMP_OPT_LEN	12
+/**
+ * Calculate the approximate value of the MSS for this
+ * aggregation using the first packet of it.
+ *
+ * @param bp
+ * @param parsing_flags Parsing flags from the START CQE
+ * @param len_on_bd Total length of the first packet for the
+ *		     aggregation.
+ */
+static inline u16 bnx2x_set_lro_mss(struct bnx2x *bp, u16 parsing_flags,
+				    u16 len_on_bd)
+{
+	/* TPA aggregation won't have any IP options, or any TCP options
+	 * other than timestamp.
+	 */
+	u16 hdrs_len = ETH_HLEN + sizeof(struct iphdr) + sizeof(struct tcphdr);
+
+
+	/* Check if there was a TCP timestamp; if there is, it will
+	 * always be 12 bytes long: nop nop kind length echo val.
+	 *
+	 * Otherwise FW would close the aggregation.
+	 */
+	if (parsing_flags & PARSING_FLAGS_TIME_STAMP_EXIST_FLAG)
+		hdrs_len += TPA_TSTAMP_OPT_LEN;
+
+	return len_on_bd - hdrs_len;
+}
+
 static int bnx2x_fill_frag_skb(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 			       struct sk_buff *skb,
 			       struct eth_fast_path_rx_cqe *fp_cqe,
-			       u16 cqe_idx)
+			       u16 cqe_idx, u16 parsing_flags)
 {
 	struct sw_rx_page *rx_pg, old_rx_pg;
 	u16 len_on_bd = le16_to_cpu(fp_cqe->len_on_bd);
@@ -275,8 +309,8 @@ static int bnx2x_fill_frag_skb(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 
 	/* This is needed in order to enable forwarding support */
 	if (frag_size)
-		skb_shinfo(skb)->gso_size = min((u32)SGE_PAGE_SIZE,
-					       max(frag_size, (u32)len_on_bd));
+		skb_shinfo(skb)->gso_size = bnx2x_set_lro_mss(bp, parsing_flags,
+							      len_on_bd);
 
 #ifdef BNX2X_STOP_ON_ERROR
 	if (pages > min_t(u32, 8, MAX_SKB_FRAGS)*SGE_PAGE_SIZE*PAGES_PER_SGE) {
@@ -344,6 +378,8 @@ static void bnx2x_tpa_stop(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 	if (likely(new_skb)) {
 		/* fix ip xsum and give it to the stack */
 		/* (no need to map the new skb) */
+		u16 parsing_flags =
+			le16_to_cpu(cqe->fast_path_cqe.pars_flags.flags);
 
 		prefetch(skb);
 		prefetch(((char *)(skb)) + L1_CACHE_BYTES);
@@ -373,9 +409,9 @@ static void bnx2x_tpa_stop(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 		}
 
 		if (!bnx2x_fill_frag_skb(bp, fp, skb,
-					 &cqe->fast_path_cqe, cqe_idx)) {
-			if ((le16_to_cpu(cqe->fast_path_cqe.
-			    pars_flags.flags) & PARSING_FLAGS_VLAN))
+					 &cqe->fast_path_cqe, cqe_idx,
+					 parsing_flags)) {
+			if (parsing_flags & PARSING_FLAGS_VLAN)
 				__vlan_hwaccel_put_tag(skb,
 						 le16_to_cpu(cqe->fast_path_cqe.
 							     vlan_tag));
-- 
1.7.2.2
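
As a worked example of the calculation in bnx2x_set_lro_mss(), here is a
minimal userspace sketch. It assumes an untagged Ethernet frame carrying
IPv4 and TCP without options other than the timestamp. The function name
lro_mss() and the header-length macros are illustrative only; the driver
itself uses ETH_HLEN, sizeof(struct iphdr) and sizeof(struct tcphdr).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed header sizes: Ethernet, IPv4 without options, TCP without options. */
#define ETH_HDR_LEN		14
#define IPV4_HDR_LEN		20
#define TCP_HDR_LEN		20
/* nop nop kind length echo val */
#define TPA_TSTAMP_OPT_LEN	12

/* MSS estimate from the length of the first packet of an aggregation. */
static uint16_t lro_mss(uint16_t len_on_bd, int has_tcp_timestamp)
{
	uint16_t hdrs_len = ETH_HDR_LEN + IPV4_HDR_LEN + TCP_HDR_LEN;

	if (has_tcp_timestamp)
		hdrs_len += TPA_TSTAMP_OPT_LEN;

	/* Guard added for the sketch: the frame must at least hold the headers. */
	assert(len_on_bd >= hdrs_len);
	return len_on_bd - hdrs_len;
}
```

For a full-size 1514-byte frame this gives 1514 - 54 = 1460 without the
timestamp option and 1514 - 66 = 1448 with it.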





* [PATCH net-2.6 5/7] bnx2x: perform statistics "action" before state transition.
From: Dmitry Kravkov @ 2011-02-28 13:37 UTC
  To: davem, netdev; +Cc: Eilon Greenstein, Vladislav Zolotarov


From: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x/bnx2x_stats.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x_stats.c b/drivers/net/bnx2x/bnx2x_stats.c
index bda60d5..3445ded 100644
--- a/drivers/net/bnx2x/bnx2x_stats.c
+++ b/drivers/net/bnx2x/bnx2x_stats.c
@@ -1239,14 +1239,14 @@ void bnx2x_stats_handle(struct bnx2x *bp, enum bnx2x_stats_event event)
 	if (unlikely(bp->panic))
 		return;
 
+	bnx2x_stats_stm[bp->stats_state][event].action(bp);
+
 	/* Protect a state change flow */
 	spin_lock_bh(&bp->stats_lock);
 	state = bp->stats_state;
 	bp->stats_state = bnx2x_stats_stm[state][event].next_state;
 	spin_unlock_bh(&bp->stats_lock);
 
-	bnx2x_stats_stm[state][event].action(bp);
-
 	if ((event != STATS_EVENT_UPDATE) || netif_msg_timer(bp))
 		DP(BNX2X_MSG_STATS, "state %d -> event %d -> state %d\n",
 		   state, event, bp->stats_state);
-- 
1.7.2.2

* [PATCH net-2.6 3/7] bnx2x: Fix ethtool -t link test for MF (non-pmf) devices.
From: Dmitry Kravkov @ 2011-02-28 13:37 UTC
  To: davem, netdev; +Cc: Eilon Greenstein


Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x/bnx2x_ethtool.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x_ethtool.c b/drivers/net/bnx2x/bnx2x_ethtool.c
index 5b0fe7a..ef29199 100644
--- a/drivers/net/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/bnx2x/bnx2x_ethtool.c
@@ -1932,11 +1932,11 @@ static void bnx2x_self_test(struct net_device *dev,
 		buf[4] = 1;
 		etest->flags |= ETH_TEST_FL_FAILED;
 	}
-	if (bp->port.pmf)
-		if (bnx2x_link_test(bp, is_serdes) != 0) {
-			buf[5] = 1;
-			etest->flags |= ETH_TEST_FL_FAILED;
-		}
+
+	if (bnx2x_link_test(bp, is_serdes) != 0) {
+		buf[5] = 1;
+		etest->flags |= ETH_TEST_FL_FAILED;
+	}
 
 #ifdef BNX2X_EXTRA_DEBUG
 	bnx2x_panic_dump(bp);
-- 
1.7.2.2

* [PATCH net-2.6 4/7] bnx2x: properly configure coefficients for MinBW algorithm (NPAR mode).
From: Dmitry Kravkov @ 2011-02-28 13:37 UTC
  To: davem, netdev; +Cc: Eilon Greenstein



Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x/bnx2x.h      |   26 +++++++++++++++-----------
 drivers/net/bnx2x/bnx2x_main.c |    3 ++-
 2 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.h b/drivers/net/bnx2x/bnx2x.h
index 653c624..368cfcd 100644
--- a/drivers/net/bnx2x/bnx2x.h
+++ b/drivers/net/bnx2x/bnx2x.h
@@ -1613,19 +1613,23 @@ static inline u32 reg_poll(struct bnx2x *bp, u32 reg, u32 expected, int ms,
 #define BNX2X_BTR			4
 #define MAX_SPQ_PENDING			8
 
-
-/* CMNG constants
-   derived from lab experiments, and not from system spec calculations !!! */
-#define DEF_MIN_RATE			100
+/* CMNG constants, as derived from system spec calculations */
+/* default MIN rate in case VNIC min rate is configured to zero - 100Mbps */
+#define DEF_MIN_RATE					100
 /* resolution of the rate shaping timer - 100 usec */
-#define RS_PERIODIC_TIMEOUT_USEC	100
-/* resolution of fairness algorithm in usecs -
-   coefficient for calculating the actual t fair */
-#define T_FAIR_COEF			10000000
+#define RS_PERIODIC_TIMEOUT_USEC			100
 /* number of bytes in single QM arbitration cycle -
-   coefficient for calculating the fairness timer */
-#define QM_ARB_BYTES			40000
-#define FAIR_MEM			2
+ * coefficient for calculating the fairness timer */
+#define QM_ARB_BYTES					160000
+/* resolution of Min algorithm 1:100 */
+#define MIN_RES						100
+/* how many bytes above threshold for the minimal credit of Min algorithm*/
+#define MIN_ABOVE_THRESH				32768
+/* Fairness algorithm integration time coefficient -
+ * for calculating the actual Tfair */
+#define T_FAIR_COEF	((MIN_ABOVE_THRESH +  QM_ARB_BYTES) * 8 * MIN_RES)
+/* Memory of fairness algorithm . 2 cycles */
+#define FAIR_MEM					2
 
 
 #define ATTN_NIG_FOR_FUNC		(1L << 8)
diff --git a/drivers/net/bnx2x/bnx2x_main.c b/drivers/net/bnx2x/bnx2x_main.c
index 203e9bf..032ae18 100644
--- a/drivers/net/bnx2x/bnx2x_main.c
+++ b/drivers/net/bnx2x/bnx2x_main.c
@@ -2015,7 +2015,8 @@ static void bnx2x_init_vn_minmax(struct bnx2x *bp, int vn)
 		m_fair_vn.vn_credit_delta =
 			max_t(u32, (vn_min_rate * (T_FAIR_COEF /
 						   (8 * bp->vn_weight_sum))),
-			      (bp->cmng.fair_vars.fair_threshold * 2));
+			      (bp->cmng.fair_vars.fair_threshold +
+							MIN_ABOVE_THRESH));
 		DP(NETIF_MSG_IFUP, "m_fair_vn.vn_credit_delta %d\n",
 		   m_fair_vn.vn_credit_delta);
 	}
-- 
1.7.2.2
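
The new constants can be checked with a little arithmetic. The sketch below
is a userspace model only: the macros are copied from the patch, while
vn_credit_delta() is a hypothetical helper that mirrors the max_t()
expression in bnx2x_init_vn_minmax(). The fair_threshold value is passed in
as a parameter because this patch does not show how it is set.

```c
#include <assert.h>
#include <stdint.h>

#define QM_ARB_BYTES		160000u
#define MIN_RES			100u
#define MIN_ABOVE_THRESH	32768u
#define T_FAIR_COEF	((MIN_ABOVE_THRESH + QM_ARB_BYTES) * 8u * MIN_RES)

/* Larger of the rate-derived credit and the minimal credit
 * (fair_threshold + MIN_ABOVE_THRESH), as in the patched expression. */
static uint32_t vn_credit_delta(uint32_t vn_min_rate, uint32_t vn_weight_sum,
				uint32_t fair_threshold)
{
	uint32_t by_rate, min_credit;

	/* Guard added for the sketch: avoid dividing by zero. */
	assert(vn_weight_sum != 0);

	by_rate = vn_min_rate * (T_FAIR_COEF / (8u * vn_weight_sum));
	min_credit = fair_threshold + MIN_ABOVE_THRESH;
	return by_rate > min_credit ? by_rate : min_credit;
}
```

T_FAIR_COEF evaluates to (32768 + 160000) * 8 * 100 = 154214400, compared
with the old fixed value of 10000000. With vn_min_rate = 100, a weight sum
of 400 and an assumed fair_threshold of 160000, the rate term is
100 * (154214400 / 3200) = 4819200, which is above the minimal credit of
192768.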





* [PATCH net-2.6 2/7] bnx2x: Fix nvram test for single port devices.
From: Dmitry Kravkov @ 2011-02-28 13:37 UTC
  To: davem, netdev; +Cc: Eilon Greenstein



Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x/bnx2x_ethtool.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x_ethtool.c b/drivers/net/bnx2x/bnx2x_ethtool.c
index b3da295..5b0fe7a 100644
--- a/drivers/net/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/bnx2x/bnx2x_ethtool.c
@@ -1782,9 +1782,7 @@ static int bnx2x_test_nvram(struct bnx2x *bp)
 		{ 0x100, 0x350 }, /* manuf_info */
 		{ 0x450,  0xf0 }, /* feature_info */
 		{ 0x640,  0x64 }, /* upgrade_key_info */
-		{ 0x6a4,  0x64 },
 		{ 0x708,  0x70 }, /* manuf_key_info */
-		{ 0x778,  0x70 },
 		{     0,     0 }
 	};
 	__be32 buf[0x350 / 4];
-- 
1.7.2.2

* [PATCH net-2.6 1/7] bnx2x: (NPAR mode) Fix FW initialization
From: Dmitry Kravkov @ 2011-02-28 13:37 UTC
  To: davem, netdev; +Cc: Eilon Greenstein

Fix FW initialization in NPAR mode, where max BW is stored as a
percentage of the line speed. Protect HW from being configured to speed 0.


Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x/bnx2x_cmn.c     |   17 +++++++++--------
 drivers/net/bnx2x/bnx2x_cmn.h     |   20 ++++++++++++++++++++
 drivers/net/bnx2x/bnx2x_ethtool.c |   13 +++++++------
 drivers/net/bnx2x/bnx2x_main.c    |   15 ++++++++++++---
 4 files changed, 48 insertions(+), 17 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x_cmn.c b/drivers/net/bnx2x/bnx2x_cmn.c
index 710ce5d..a58baf3 100644
--- a/drivers/net/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/bnx2x/bnx2x_cmn.c
@@ -703,19 +703,20 @@ u16 bnx2x_get_mf_speed(struct bnx2x *bp)
 {
 	u16 line_speed = bp->link_vars.line_speed;
 	if (IS_MF(bp)) {
-		u16 maxCfg = (bp->mf_config[BP_VN(bp)] &
-						FUNC_MF_CFG_MAX_BW_MASK) >>
-						FUNC_MF_CFG_MAX_BW_SHIFT;
-		/* Calculate the current MAX line speed limit for the DCC
-		 * capable devices
+		u16 maxCfg = bnx2x_extract_max_cfg(bp,
+						   bp->mf_config[BP_VN(bp)]);
+
+		/* Calculate the current MAX line speed limit for the MF
+		 * devices
 		 */
-		if (IS_MF_SD(bp)) {
+		if (IS_MF_SI(bp))
+			line_speed = (line_speed * maxCfg) / 100;
+		else { /* SD mode */
 			u16 vn_max_rate = maxCfg * 100;
 
 			if (vn_max_rate < line_speed)
 				line_speed = vn_max_rate;
-		} else /* IS_MF_SI(bp)) */
-			line_speed = (line_speed * maxCfg) / 100;
+		}
 	}
 
 	return line_speed;
diff --git a/drivers/net/bnx2x/bnx2x_cmn.h b/drivers/net/bnx2x/bnx2x_cmn.h
index 03eb4d6..326ba44 100644
--- a/drivers/net/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/bnx2x/bnx2x_cmn.h
@@ -1044,4 +1044,24 @@ static inline void storm_memset_cmng(struct bnx2x *bp,
 void bnx2x_acquire_phy_lock(struct bnx2x *bp);
 void bnx2x_release_phy_lock(struct bnx2x *bp);
 
+/**
+ * Extracts MAX BW part from MF configuration.
+ *
+ * @param bp
+ * @param mf_cfg
+ *
+ * @return u16
+ */
+static inline u16 bnx2x_extract_max_cfg(struct bnx2x *bp, u32 mf_cfg)
+{
+	u16 max_cfg = (mf_cfg & FUNC_MF_CFG_MAX_BW_MASK) >>
+			      FUNC_MF_CFG_MAX_BW_SHIFT;
+	if (!max_cfg) {
+		BNX2X_ERR("Illegal configuration detected for Max BW - "
+			  "using 100 instead\n");
+		max_cfg = 100;
+	}
+	return max_cfg;
+}
+
 #endif /* BNX2X_CMN_H */
diff --git a/drivers/net/bnx2x/bnx2x_ethtool.c b/drivers/net/bnx2x/bnx2x_ethtool.c
index 5b44a8b..b3da295 100644
--- a/drivers/net/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/bnx2x/bnx2x_ethtool.c
@@ -238,7 +238,7 @@ static int bnx2x_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 	speed |= (cmd->speed_hi << 16);
 
 	if (IS_MF_SI(bp)) {
-		u32 param = 0;
+		u32 param = 0, part;
 		u32 line_speed = bp->link_vars.line_speed;
 
 		/* use 10G if no link detected */
@@ -251,9 +251,11 @@ static int bnx2x_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 				       REQ_BC_VER_4_SET_MF_BW);
 			return -EINVAL;
 		}
-		if (line_speed < speed) {
-			BNX2X_DEV_INFO("New speed should be less or equal "
-				       "to actual line speed\n");
+		part = (speed * 100) / line_speed;
+		if (line_speed < speed || !part) {
+			BNX2X_DEV_INFO("Speed setting should be in a range "
+				       "from 1%% to 100%% "
+				       "of actual line speed\n");
 			return -EINVAL;
 		}
 		/* load old values */
@@ -263,8 +265,7 @@ static int bnx2x_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 		param &= FUNC_MF_CFG_MIN_BW_MASK;
 
 		/* set new MAX value */
-		param |= (((speed * 100) / line_speed)
-				 << FUNC_MF_CFG_MAX_BW_SHIFT)
+		param |= (part << FUNC_MF_CFG_MAX_BW_SHIFT)
 				  & FUNC_MF_CFG_MAX_BW_MASK;
 
 		bnx2x_fw_command(bp, DRV_MSG_CODE_SET_MF_BW, param);
diff --git a/drivers/net/bnx2x/bnx2x_main.c b/drivers/net/bnx2x/bnx2x_main.c
index d584d32..203e9bf 100644
--- a/drivers/net/bnx2x/bnx2x_main.c
+++ b/drivers/net/bnx2x/bnx2x_main.c
@@ -1974,13 +1974,22 @@ static void bnx2x_init_vn_minmax(struct bnx2x *bp, int vn)
 		vn_max_rate = 0;
 
 	} else {
+		u32 maxCfg = bnx2x_extract_max_cfg(bp, vn_cfg);
+
 		vn_min_rate = ((vn_cfg & FUNC_MF_CFG_MIN_BW_MASK) >>
 				FUNC_MF_CFG_MIN_BW_SHIFT) * 100;
-		/* If min rate is zero - set it to 1 */
+		/* If fairness is enabled (not all min rates are zeroes) and
+		   if current min rate is zero - set it to 1.
+		   This is a requirement of the algorithm. */
 		if (bp->vn_weight_sum && (vn_min_rate == 0))
 			vn_min_rate = DEF_MIN_RATE;
-		vn_max_rate = ((vn_cfg & FUNC_MF_CFG_MAX_BW_MASK) >>
-				FUNC_MF_CFG_MAX_BW_SHIFT) * 100;
+
+		if (IS_MF_SI(bp))
+			/* maxCfg in percents of linkspeed */
+			vn_max_rate = (bp->link_vars.line_speed * maxCfg) / 100;
+		else
+			/* maxCfg is absolute in 100Mb units */
+			vn_max_rate = maxCfg * 100;
 	}
 
 	DP(NETIF_MSG_IFUP,
-- 
1.7.2.2
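
The two interpretations of the max BW field can be summarised in a small
userspace sketch. The enum and the vn_max_rate() helper are hypothetical
names used for illustration. The zero fallback mirrors what
bnx2x_extract_max_cfg() does in the patch, and rates are assumed to be in
Mbps as elsewhere in the driver.

```c
#include <assert.h>
#include <stdint.h>

enum mf_mode { MF_SD, MF_SI };

/* Max rate of a virtual NIC, given the raw max BW field. */
static uint32_t vn_max_rate(enum mf_mode mode, uint32_t line_speed,
			    uint32_t max_cfg)
{
	/* Guard added for the sketch: keep line_speed * max_cfg in range. */
	assert(line_speed <= UINT32_MAX / 100);

	/* A zero field is treated as 100, so the speed is never set to 0. */
	if (max_cfg == 0)
		max_cfg = 100;

	if (mode == MF_SI)
		/* max_cfg is a percentage of the line speed */
		return (line_speed * max_cfg) / 100;

	/* SD mode: max_cfg is absolute, in 100Mb units */
	return max_cfg * 100;
}
```

On a 10G link a field value of 25 gives 2500 in both modes. On a 1G link
the SI result drops to 250 while the SD result stays at 2500, which is why
the two modes need separate handling.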





* Re: [PATCH] iproute2: allow to specify truncation bits on auth algo
From: Nicolas Dichtel @ 2011-02-28 13:46 UTC
  To: Stephen Hemminger; +Cc: David Miller, herbert, netdev, christophe.gouault
In-Reply-To: <4D49879B.4060108@6wind.com>

Hi,

What is the status of this patch? It has been set to 'Superseded' in the 
patchwork tool (http://patchwork.ozlabs.org/patch/81486/).
Kernel headers have been updated in iproute2, should I resend the patch?


Regards,
Nicolas

On 02/02/2011 17:34, Nicolas Dichtel wrote:
> On 02/02/2011 17:30, Nicolas Dichtel wrote:
>> On 28/01/2011 20:46, David Miller wrote:
>>> From: Nicolas Dichtel<nicolas.dichtel@6wind.com>
>>> Date: Fri, 28 Jan 2011 09:51:40 +0100
>>>
>>>> On 28/01/2011 05:51, Herbert Xu wrote:
>>>>> So perhaps an SA configuration flag is needed?
>>>> I agree. If David is ok, I will update the patch.
>>>
>>> Sounds good to me.
>> And the patch for iproute2.
> Sorry, two patches were mixed :(
>
> Here is the right one.
>
>
> Regards,
> Nicolas

* [PATCH] macb: don't use platform_set_drvdata() on a net_device
From: Jamie Iles @ 2011-02-28 14:05 UTC
  To: netdev; +Cc: Jamie Iles, Nicolas Ferre

Commit 71d6429 (Driver core: convert platform_{get,set}_drvdata to
static inline functions) now triggers a warning in the macb network
driver:

  CC      drivers/net/macb.o
drivers/net/macb.c: In function ‘macb_mii_init’:
drivers/net/macb.c:263: warning: passing argument 1 of ‘platform_set_drvdata’ from incompatible pointer type
include/linux/platform_device.h:138: note: expected ‘struct platform_device *’ but argument is of type ‘struct net_device *’

Use dev_set_drvdata() on the device embedded in the net_device instead.

Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
---
 drivers/net/macb.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/macb.c b/drivers/net/macb.c
index f69e73e..79ccb54 100644
--- a/drivers/net/macb.c
+++ b/drivers/net/macb.c
@@ -260,7 +260,7 @@ static int macb_mii_init(struct macb *bp)
 	for (i = 0; i < PHY_MAX_ADDR; i++)
 		bp->mii_bus->irq[i] = PHY_POLL;
 
-	platform_set_drvdata(bp->dev, bp->mii_bus);
+	dev_set_drvdata(&bp->dev->dev, bp->mii_bus);
 
 	if (mdiobus_register(bp->mii_bus))
 		goto err_out_free_mdio_irq;
-- 
1.7.4
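
To illustrate why the compiler now complains, here is a simplified
userspace mock of the types involved. These are not the kernel definitions:
the real struct device keeps its driver data elsewhere, and
attach_mii_bus() is a hypothetical stand-in for the call site in
macb_mii_init(). The point is only that platform_set_drvdata(), once it is
a typed inline function, accepts a struct platform_device * and nothing
else, whereas dev_set_drvdata() works on the struct device embedded in the
net_device.

```c
#include <stddef.h>

/* Simplified mocks of the kernel types (assumption: one data pointer). */
struct device { void *driver_data; };
struct platform_device { struct device dev; };
struct net_device { struct device dev; };

static inline void dev_set_drvdata(struct device *dev, void *data)
{
	dev->driver_data = data;
}

static inline void *dev_get_drvdata(const struct device *dev)
{
	return dev->driver_data;
}

/* As an inline function the argument type is checked, so passing a
 * struct net_device * here is an incompatible pointer type. */
static inline void platform_set_drvdata(struct platform_device *pdev,
					void *data)
{
	dev_set_drvdata(&pdev->dev, data);
}

/* Mirrors the fixed call: store the data on the embedded device. */
static void attach_mii_bus(struct net_device *ndev, void *mii_bus)
{
	dev_set_drvdata(&ndev->dev, mii_bus);
}
```

Reading the pointer back with dev_get_drvdata(&ndev->dev) returns the value
stored by attach_mii_bus().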


* Re: SO_REUSEPORT - can it be done in kernel?
From: Thomas Graf @ 2011-02-28 14:13 UTC
  To: Herbert Xu
  Cc: David Miller, rick.jones2, therbert, wsommerfeld, daniel.baluta,
	netdev
In-Reply-To: <20110228113659.GA20726@gondor.apana.org.au>

On Mon, Feb 28, 2011 at 07:36:59PM +0800, Herbert Xu wrote:
> But please do test them heavily, especially if you have an AMD
> NUMA machine as that's where scalability problems really show
> up.  Intel tends to be a lot more forgiving.  My last AMD machine
> blew up years ago :)

This is just a preliminary test result and not 100% reliable
because halfway through the testing the machine reported memory
issues and disabled a DIMM before booting the tested kernels.

Nevertheless, bind 9.7.3:

2.6.38-rc5+: 62kqps
2.6.38-rc5+ w/ Herbert's patch: 442kqps

This is on a 2 NUMA Intel Xeon X5560 @ 2.80GHz with 16 cores

Again, this number is not 100% reliable but clearly shows that
the concept of the patch is working very well.

Will test Herbert's patch on the machine that did 650kqps with
SO_REUSEPORT and also on some AMD machines.

* Re: SO_REUSEPORT - can it be done in kernel?
From: Herbert Xu @ 2011-02-28 14:13 UTC
  To: Eric Dumazet
  Cc: David Miller, rick.jones2, therbert, wsommerfeld, daniel.baluta,
	netdev
In-Reply-To: <1298899971.2941.281.camel@edumazet-laptop>

On Mon, Feb 28, 2011 at 02:32:51PM +0100, Eric Dumazet wrote:
>
> Now the input path can run without finding the socket locked by the xmit
> path, so skbs are queued into the receive queue, not the backlog.

Indeed, I think this is what Dave alluded to earlier.  This will
eventually have to be dealt with but for now the data rate is low
enough that it isn't killing us.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: SO_REUSEPORT - can it be done in kernel?
From: Eric Dumazet @ 2011-02-28 14:22 UTC
  To: Herbert Xu
  Cc: David Miller, rick.jones2, therbert, wsommerfeld, daniel.baluta,
	netdev
In-Reply-To: <20110228141327.GA22851@gondor.apana.org.au>

On Monday, 28 February 2011 at 22:13 +0800, Herbert Xu wrote:
> On Mon, Feb 28, 2011 at 02:32:51PM +0100, Eric Dumazet wrote:
> >
> > Now the input path can run without finding the socket locked by the xmit
> > path, so skbs are queued into the receive queue, not the backlog.
> 
> Indeed, I think this is what Dave alluded to earlier.  This will
> eventually have to be dealt with but for now the data rate is low
> enough that it isn't killing us.

Not sure how you read this ;)

I said that before your patches, a sender was consuming a lot of time
transferring frames from the backlog to the receive queue right before
releasing the socket lock.

Now, the receive path doesn't slow down the senders, and vice versa.

:)



* Re: SO_REUSEPORT - can it be done in kernel?
From: Herbert Xu @ 2011-02-28 14:25 UTC
  To: Eric Dumazet
  Cc: David Miller, rick.jones2, therbert, wsommerfeld, daniel.baluta,
	netdev
In-Reply-To: <1298902926.2941.349.camel@edumazet-laptop>

On Mon, Feb 28, 2011 at 03:22:06PM +0100, Eric Dumazet wrote:
>
> Not sure how you read this ;)
> 
> I said that before your patches, a sender was consuming a lot of time
> transferring frames from the backlog to the receive queue right before
> releasing the socket lock.
> 
> Now, the receive path doesn't slow down the senders, and vice versa.
> 
> :)

I understood what you wrote :)

I was just referring to an earlier message where Dave talked about
the UDP accounting patch making us take the lock on every
packet.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* [PATCH] net: fix multithreaded signal handling in unix recv routines
From: Rainer Weikusat @ 2011-02-28 14:50 UTC
  To: netdev; +Cc: davem, linux-kernel

From: Rainer Weikusat <rweikusat@mobileactivedefense.com>

The unix_dgram_recvmsg and unix_stream_recvmsg routines in
net/unix/af_unix.c use mutex_lock(&u->readlock) calls to serialize
read operations of multiple threads on a single socket. This implies
that, if all n threads of a process block in an AF_UNIX recv call
trying to read data from the same socket, one of these threads will be
sleeping in state TASK_INTERRUPTIBLE and all others in state
TASK_UNINTERRUPTIBLE.

Suppose a particular signal is supposed to be handled by a signal
handler defined by the process and none of these threads is blocking
the signal. The complete_signal routine in kernel/signal.c will then
select the 'first' such thread it happens to encounter when deciding
which thread to notify that a signal is supposed to be handled. If
this is one of the TASK_UNINTERRUPTIBLE threads, the signal won't be
handled until the one thread not blocking on the u->readlock mutex is
woken up because some data to process has arrived (if this ever
happens).

The included patch fixes this by changing mutex_lock to
mutex_lock_interruptible and handling possible error returns in the
same way interruptions are handled by the actual receive code.
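
The error handling can be modelled in userspace as follows. This is only a
sketch under stated assumptions: lock_interruptible_model() and
recv_entry_model() are hypothetical stand-ins, not kernel functions, and
the signal is represented by a plain flag. sock_intr_errno_model() follows
the kernel's sock_intr_errno(), which returns -ERESTARTSYS when the timeout
is MAX_SCHEDULE_TIMEOUT and -EINTR otherwise.

```c
#include <errno.h>
#include <limits.h>

/* Kernel-internal values, redefined here for the userspace model. */
#define MAX_SCHEDULE_TIMEOUT	LONG_MAX
#ifndef ERESTARTSYS
#define ERESTARTSYS		512
#endif

/* Model of sock_intr_errno(): restartable only if waiting forever. */
static int sock_intr_errno_model(long timeo)
{
	return timeo == MAX_SCHEDULE_TIMEOUT ? -ERESTARTSYS : -EINTR;
}

/* Hypothetical stand-in for mutex_lock_interruptible(): fails with
 * -EINTR if a signal arrives while waiting, otherwise takes the lock. */
static int lock_interruptible_model(int signal_pending)
{
	return signal_pending ? -EINTR : 0;
}

/* Entry of a receive routine after the patch: a signal delivered while
 * waiting for the lock is reported like any other interruption. */
static int recv_entry_model(int signal_pending, long timeo)
{
	int err = lock_interruptible_model(signal_pending);

	if (err)
		return sock_intr_errno_model(timeo);

	/* Lock held: the real code would go on to receive data here. */
	return 0;
}
```

In this model a thread interrupted while waiting with an infinite timeout
gets -ERESTARTSYS, and one with a finite timeout gets -EINTR.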

Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>

---
diff -urp net-2.6/net/unix/af_unix.c net-2.6-patched//net/unix/af_unix.c
--- net-2.6/net/unix/af_unix.c	2011-02-16 22:19:43.338358559 +0000
+++ net-2.6-patched//net/unix/af_unix.c	2011-02-16 22:38:39.483543598 +0000
@@ -1724,7 +1724,11 @@ static int unix_dgram_recvmsg(struct kio
 
 	msg->msg_namelen = 0;
 
-	mutex_lock(&u->readlock);
+	err = mutex_lock_interruptible(&u->readlock);
+	if (err) {
+		err = sock_intr_errno(sock_rcvtimeo(sk, noblock));
+		goto out;
+	}
 
 	skb = skb_recv_datagram(sk, flags, noblock, &err);
 	if (!skb) {
@@ -1864,7 +1868,11 @@ static int unix_stream_recvmsg(struct ki
 		memset(&tmp_scm, 0, sizeof(tmp_scm));
 	}
 
-	mutex_lock(&u->readlock);
+	err = mutex_lock_interruptible(&u->readlock);
+	if (err) {
+		err = sock_intr_errno(timeo);
+		goto out;
+	}
 
 	do {
 		int chunk;
@@ -1895,11 +1903,12 @@ static int unix_stream_recvmsg(struct ki
 
 			timeo = unix_stream_data_wait(sk, timeo);
 
-			if (signal_pending(current)) {
+			if (signal_pending(current)
+			    ||  mutex_lock_interruptible(&u->readlock)) {
 				err = sock_intr_errno(timeo);
 				goto out;
 			}
-			mutex_lock(&u->readlock);
+
 			continue;
  unlock:
 			unix_state_unlock(sk);

* Re: SO_REUSEPORT - can it be done in kernel?
From: Eric Dumazet @ 2011-02-28 14:53 UTC
  To: Herbert Xu
  Cc: David Miller, rick.jones2, therbert, wsommerfeld, daniel.baluta,
	netdev
In-Reply-To: <1298899971.2941.281.camel@edumazet-laptop>

On Monday, 28 February 2011 at 14:32 +0100, Eric Dumazet wrote:
> On Monday, 28 February 2011 at 19:36 +0800, Herbert Xu wrote:
> > On Sun, Feb 27, 2011 at 07:06:14PM +0800, Herbert Xu wrote:
> > > I'm working on this right now.
> > 
> > OK I think I was definitely on the right track.  With the send
> > patch made lockless I now get numbers which are even better than
> > those obtained with running named with multiple sockets.  That's
> > right, a single socket is now faster than what multiple sockets
> > were without the patch (of course, multiple sockets may still
> > be faster with the patch vs. a single socket for obvious reasons,
> > but I couldn't measure any significant difference).
> > 
> > Also worthy of note is that prior to the patch all CPUs showed
> > idleness (lazy bastards!), with the patch they're all maxed out.
> > 
> > In retrospect, the idleness was simply the result of the socket
> > lock scheduling away and was an indication of lock contention.
> > 
> 
> Now the input path can run without finding the socket locked by the xmit
> path, so skbs are queued into the receive queue, not the backlog.
> 
> > Here are the patches I used.  Please don't apply them yet as I intend
> > to clean them up quite a bit.
> > 
> > But please do test them heavily, especially if you have an AMD
> > NUMA machine as that's where scalability problems really show
> > up.  Intel tends to be a lot more forgiving.  My last AMD machine
> > blew up years ago :)
> 
> I am going to test them, thanks !
> 

First "sending only" tests on my 2x4x2 machine (two E5540@2.53GHz, quad
core, hyper threaded, NUMA kernel)

16 threads, each one sending 100,000 UDP frames using a _shared_ socket

I use the same destination IP, so there is a bit of dst refcount
contention.

(Sending to the dummy0 device to avoid contention on the qdisc and device.)
# ip ro get 10.2.2.21
10.2.2.21 dev dummy0  src 10.2.2.2 
    cache 
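
For readers who want to reproduce the shape of this test without the
udpflood source, here is a minimal sketch of a multi-threaded sender in
which every thread calls sendto() on one shared UDP socket. It is not
the udpflood program used in this test: the names flood_shared_socket,
flood_thread and struct flood_arg, the 64-byte payload and the cap of
64 threads are all assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical per-thread state; not taken from udpflood. */
struct flood_arg {
	int fd;                   /* the one socket shared by all threads */
	struct sockaddr_in dst;   /* common destination address */
	long frames;              /* frames this thread should try to send */
	long sent;                /* frames accepted by sendto() */
};

static void *flood_thread(void *p)
{
	struct flood_arg *a = p;
	char payload[64];
	long i;

	memset(payload, 0, sizeof(payload));
	for (i = 0; i < a->frames; i++) {
		if (sendto(a->fd, payload, sizeof(payload), 0,
			   (struct sockaddr *)&a->dst, sizeof(a->dst)) >= 0)
			a->sent++;
	}
	return NULL;
}

/*
 * Start nthreads senders on a single UDP socket and wait for them.
 * Returns the total number of frames sent, or -1 on setup failure.
 */
static long flood_shared_socket(const char *ip, int port,
				int nthreads, long frames)
{
	enum { MAX_THREADS = 64 };   /* arbitrary cap for this sketch */
	pthread_t tid[MAX_THREADS];
	struct flood_arg arg[MAX_THREADS];
	struct sockaddr_in dst;
	long total = 0;
	int fd, i, started = 0;

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return -1;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons((unsigned short)port);
	if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	for (i = 0; i < nthreads; i++) {
		arg[i].fd = fd;
		arg[i].dst = dst;
		arg[i].frames = frames;
		arg[i].sent = 0;
		if (pthread_create(&tid[i], NULL, flood_thread, &arg[i]) != 0)
			break;
		started++;
	}
	/* Each thread only writes its own counter, so summing after join is safe. */
	for (i = 0; i < started; i++) {
		pthread_join(tid[i], NULL);
		total += arg[i].sent;
	}
	close(fd);
	return total;
}
```

A call such as flood_shared_socket("10.2.2.21", 9, 16, 100000) would
mirror the 16-thread run; port 9 is an arbitrary choice. Build with
-pthread.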

LOCKDEP enabled kernel

Before :

time ./udpflood -f -t 16 -l 100000 10.2.2.21

real	0m42.749s
user	0m1.010s
sys	1m38.039s

After :

time ./udpflood -f -t 16 -l 100000 10.2.2.21

real	0m1.167s
user	0m0.488s
sys	0m17.373s


With one thread only and 16*100000 frames :
# time ./udpflood -f -l 1600000 10.2.2.21

real	0m9.318s
user	0m0.238s
sys	0m9.052s

(We have some false sharing on atomic fields in struct file and socket,
but nothing to worry about.)

With LOCKDEP OFF :

16 threads :

# time ./udpflood -f -t 16 -l 100000 10.2.2.21

real	0m0.718s
user	0m0.376s
sys	0m10.963s

1 thread :

# time ./udpflood -f -l 1600000 10.2.2.21

real	0m1.514s
user	0m0.153s
sys	0m1.357s
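
The timings are easier to compare as ratios. A small sketch of the
arithmetic, using only the "real" figures quoted above (the helper
names are made up for this note):

```c
/* Frames sent per second of wall-clock ("real") time. */
static double frames_per_sec(double frames, double real_seconds)
{
	return frames / real_seconds;
}

/* How many times shorter the second wall-clock time is than the first. */
static double speedup(double before_seconds, double after_seconds)
{
	return before_seconds / after_seconds;
}
```

With these, speedup(42.749, 1.167) is about 36.6 for the 16-thread run
on the LOCKDEP kernel, and frames_per_sec(1600000, 0.718) is about 2.2
million frames per second for 16 threads without LOCKDEP, against
roughly 1.06 million for the single thread.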


"perf record/report" results for the 16 threads case (no lockdep)

# Events: 389K cpu-clock-msecs
#
# Overhead      Command        Shared Object                               Symbol
# ........  ...........  ...................  ...................................
#
     9.03%     udpflood  [kernel.kallsyms]    [k] sock_wfree
     8.58%     udpflood  [kernel.kallsyms]    [k] __ip_route_output_key
     8.52%     udpflood  [kernel.kallsyms]    [k] sock_alloc_send_pskb
     7.46%     udpflood  [kernel.kallsyms]    [k] sock_def_write_space
     6.76%     udpflood  [kernel.kallsyms]    [k] __xfrm_lookup
     6.18%      swapper  [kernel.kallsyms]    [k] acpi_idle_enter_bm
     5.66%     udpflood  [kernel.kallsyms]    [k] dst_release
     4.96%     udpflood  [kernel.kallsyms]    [k] udp_sendmsg
     3.48%     udpflood  [kernel.kallsyms]    [k] fget_light
     2.75%     udpflood  [kernel.kallsyms]    [k] sock_tx_timestamp
     2.40%     udpflood  [kernel.kallsyms]    [k] __ip_make_skb
     2.36%     udpflood  [kernel.kallsyms]    [k] fput
     1.87%      swapper  [kernel.kallsyms]    [k] _raw_spin_unlock_irqrestore
     1.81%     udpflood  [kernel.kallsyms]    [k] inet_sendmsg
     1.53%     udpflood  [kernel.kallsyms]    [k] sys_sendto
     1.50%     udpflood  [kernel.kallsyms]    [k] ip_finish_output
     1.31%     udpflood  [kernel.kallsyms]    [k] csum_partial_copy_generic
     1.30%     udpflood  udpflood             [.] do_thread
     1.28%     udpflood  [kernel.kallsyms]    [k] __ip_append_data
     1.08%     udpflood  [kernel.kallsyms]    [k] __memset
     1.05%     udpflood  [kernel.kallsyms]    [k] ip_route_output_flow
     0.91%     udpflood  [kernel.kallsyms]    [k] kfree
     0.88%     udpflood  [vdso]               [.] 0xffffe430
     0.83%     udpflood  [kernel.kallsyms]    [k] copy_user_generic_string
     0.78%     udpflood  libc-2.3.4.so        [.] __GI_memcpy
     0.77%     udpflood  [kernel.kallsyms]    [k] ia32_sysenter_target


What do you suggest for running a BIND-based test?




^ permalink raw reply

* Re: SO_REUSEPORT - can it be done in kernel?
From: Thomas Graf @ 2011-02-28 15:01 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Herbert Xu, David Miller, rick.jones2, therbert, wsommerfeld,
	daniel.baluta, netdev
In-Reply-To: <1298904783.2941.412.camel@edumazet-laptop>

On Mon, Feb 28, 2011 at 03:53:03PM +0100, Eric Dumazet wrote:
> What do you suggest for running a BIND-based test?

We use queryperf from BIND sources. I typically run 1 queryperf
instance per core on multiple machines.

^ permalink raw reply

* Re: fix multithreaded signal handling in unix recv routines/ background
From: Rainer Weikusat @ 2011-02-28 15:07 UTC (permalink / raw)
  To: netdev; +Cc: davem, linux-kernel
In-Reply-To: <877hck43hs.fsf@sapphire.mobileactivedefense.com>

Rainer Weikusat <rw@sapphire.mobileactivedefense.com> writes:
> The unix_dgram_recvmsg and unix_stream_recvmsg routines in
> net/af_unix.c utilize mutex_lock(&u->readlock) calls

This is IMHO a more sensible place for additional information.

I noticed this because the intended termination processing sequence of
some program which is used as part of a 'content-filtering solution'
for mobile devices (aka iPhones, iPads etc) stopped working the first
time I tested the program in its intended 'actual execution
context'. The program is supposed to listen for 'URL classification
requests' on an AF_UNIX SOCK_SEQPACKET socket, pass these to a
third-party library which does the actual classification job and then
send a reply containing a list of categories associated with the URL
in question. It uses multiple threads which basically work as follows:

	1. initialize the library
	2. unblock termination signals
	3. block in read awaiting requests
	4. block termination signals
	5. process request and send reply
	6. goto 2

Upon termination, each thread needs to execute the library
finalization routine before exiting. This is supposed to work with the
help of a signal handler for 'termination signals' calling siglongjmp
to get the particular thread executing it out of the processing
loop. Afterwards, this thread (with termination signals again blocked)
does the finalization call, executes a kill(getpid(), SIGTERM) and
exits via pthread_exit. The SIGTERM should then be picked up by
another thread of the process which will then perform the same
shutdown sequence and signal the next thread, until all threads of the
process have terminated properly. An example program whose structure
is basically identical to that of the actual application which
demonstrates the problem is available here:

	http://mss-uk.mssgmbh.com/~rw/signal/signal-problem-app.c

I've since given this some more thought and came to the conclusion
that this should also affect independent processes blocking on the same
AF_UNIX socket, even in the absence of any user-defined signal
handling. Another example program demonstrating this phenomenon can be
downloaded from

	http://mss-uk.mssgmbh.com/~rw/signal/signal-problem-fork-simple.c

This basically creates an 'unkillable' process, meaning, one which is
even immune to a SIGKILL.

I've also tested that the issue still occurs with 2.6.38-rc5 and that
it is fixed by the proposed patch. The program itself has meanwhile
been moved to the computers which are actually used by the customers
of my employer. This move included patching all the kernels running on
these machines in the way I suggested.

^ permalink raw reply

* Re: Bug in kvm_set_irq
From: Jean-Philippe Menil @ 2011-02-28 15:13 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, netdev, virtualization
In-Reply-To: <20110228113939.GH28006@redhat.com>

On 28/02/2011 12:39, Michael S. Tsirkin wrote:
> On Mon, Feb 28, 2011 at 11:40:43AM +0100, Jean-Philippe Menil wrote:
>> On 28/02/2011 11:11, Michael S. Tsirkin wrote:
>>> On Mon, Feb 28, 2011 at 09:56:46AM +0100, Jean-Philippe Menil wrote:
>>>> On 27/02/2011 18:00, Michael S. Tsirkin wrote:
>>>>> On Fri, Feb 25, 2011 at 10:07:22AM +0100, Jean-Philippe Menil wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Each time I try to use vhost_net, I'm facing a kernel bug.
>>>>>> I do a "modprobe vhost_net", and start a guest with vhost=on.
>>>>>>
>>>>>> Following is a trace with kernel 2.6.37, but I had the same
>>>>>> problem with 2.6.36 (cf https://lkml.org/lkml/2010/11/30/29).
>>>>> 2.6.36 had a theoretical race that could explain this,
>>>>> but it should be ok in 2.6.37.
>>>>>
>>>>>> The bug only occurs with vhost_net loaded, so I don't know if this
>>>>>> is a bug in the kvm module code or in the vhost_net code.
>>>>> It could be a bug in eventfd which is the interface
>>>>> used by both kvm and vhost_net.
>>>>> Just for fun, you can try 2.6.38 - eventfd code has been changed
>>>>> a lot in 2.6.38 and if it does not trigger there
>>>>> it's a hint that irqfd is the reason.
>>>>>
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.243100] BUG: unable to handle kernel paging request at
>>>>>> 0000000000002458
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.243250] IP: [<ffffffffa041aa8a>] kvm_set_irq+0x2a/0x130 [kvm]
>>>>> Could you run markup_oops/ ksymoops on this please?
>>>>> As far as I can see kvm_set_irq can only get a wrong
>>>>> kvm pointer. Unless there's some general memory corruption,
>>>>> I'd guess
>>>>>
>>>>> You can also try comparing the irqfd->kvm pointer in
>>>>> kvm_irqfd_assign irqfd_wakeup and kvm_set_irq in
>>>>> virt/kvm/eventfd.c.
>>>>>
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.243378] PGD 45d363067 PUD 45e77a067 PMD 0
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.243556] Oops: 0000 [#1] SMP
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.243692] last sysfs file:
>>>>>> /sys/devices/pci0000:00/0000:00:0d.0/0000:05:00.0/0000:06:00.0/irq
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [  685.243777] CPU 0
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.243820] Modules linked in: vhost_net macvtap macvlan tun
>>>>>> powernow_k8 mperf cpufreq_userspace cpufreq_stats cpufreq_powersave
>>>>>> cpufreq_ondemand fre
>>>>>> q_table cpufreq_conservative fuse xt_physdev ip6t_LOG
>>>>>> ip6table_filter ip6_tables ipt_LOG xt_multiport xt_limit xt_tcpudp
>>>>>> xt_state iptable_filter ip_tables x_tables nf_conntrack_tftp
>>>>>> nf_conntrack_ftp nf_connt
>>>>>> rack_ipv4 nf_defrag_ipv4 8021q bridge stp ext2 mbcache
>>>>>> dm_round_robin dm_multipath nf_conntrack_ipv6 nf_conntrack
>>>>>> nf_defrag_ipv6 kvm_amd kvm ipv6 snd_pcm snd_timer snd soundcore
>>>>>> snd_page_alloc tpm_tis tpm ps
>>>>>> mouse dcdbas tpm_bios processor i2c_nforce2 shpchp pcspkr ghes
>>>>>> serio_raw joydev evdev pci_hotplug i2c_core hed button thermal_sys
>>>>>> xfs exportfs dm_mod sg sr_mod cdrom usbhid hid usb_storage ses
>>>>>> sd_mod enclosu
>>>>>> re megaraid_sas ohci_hcd lpfc scsi_transport_fc scsi_tgt bnx2
>>>>>> scsi_mod ehci_hcd [last unloaded: scsi_wait_scan]
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [  685.246123]
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] Pid: 10, comm: kworker/0:1 Not tainted
>>>>>> 2.6.37-dsiun-110105 #17 0K543T/PowerEdge M605
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] RIP: 0010:[<ffffffffa041aa8a>]  [<ffffffffa041aa8a>]
>>>>>> kvm_set_irq+0x2a/0x130 [kvm]
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] RSP: 0018:ffff88045fc89d30  EFLAGS: 00010246
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] RAX: 0000000000000000 RBX: 000000000000001a RCX:
>>>>>> 0000000000000001
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
>>>>>> 0000000000000000
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] RBP: 0000000000000000 R08: 0000000000000001 R09:
>>>>>> ffff880856a91e48
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] R10: 0000000000000000 R11: 00000000ffffffff R12:
>>>>>> 0000000000000000
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] R13: 0000000000000001 R14: 0000000000000000 R15:
>>>>>> 0000000000000000
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] FS:  00007f617986c710(0000) GS:ffff88007f800000(0000)
>>>>>> knlGS:0000000000000000
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] CR2: 0000000000002458 CR3: 000000045d197000 CR4:
>>>>>> 00000000000006f0
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>>>> 0000000000000000
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>>>>>> 0000000000000400
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] Process kworker/0:1 (pid: 10, threadinfo
>>>>>> ffff88045fc88000, task ffff88085fc53c30)
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [  685.246123] Stack:
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  ffff88045fc89fd8 00000000000119c0 ffff88045fc88010
>>>>>> ffff88085fc53ee8
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  ffff88045fc89fd8 ffff88085fc53ee0 ffff88085fc53c30
>>>>>> 00000000000119c0
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  00000000000119c0 ffffffff8137f7ce ffff88007f80df40
>>>>>> 00000000ffffffff
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] Call Trace:
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffff8137f7ce>] ? common_interrupt+0xe/0x13
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffffa041bc30>] ? irqfd_inject+0x0/0x50 [kvm]
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffffa041bc57>] ? irqfd_inject+0x27/0x50 [kvm]
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffffa041bc30>] ? irqfd_inject+0x0/0x50 [kvm]
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffff8106b6f2>] ? process_one_work+0x112/0x460
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffff8106be25>] ? worker_thread+0x145/0x410
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffff8103a3d0>] ? __wake_up_common+0x50/0x80
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffff8106bce0>] ? worker_thread+0x0/0x410
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffff8106bce0>] ? worker_thread+0x0/0x410
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffff8106f786>] ? kthread+0x96/0xa0
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffff81003ce4>] ? kernel_thread_helper+0x4/0x10
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffff8106f6f0>] ? kthread+0x0/0xa0
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  [<ffffffff81003ce0>] ? kernel_thread_helper+0x0/0x10
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] Code: ff 41 57 41 89 f7 41 56 41 55 41 89 cd 41 54 49 89
>>>>>> fc 55 53 89 d3 48 81 ec 98 00 00 00 8b 15 c6 79 03 00 85 d2 0f 85 c4
>>>>>> 00 00 00<4
>>>>>> 9>    8b 84 24 58 24 00 00 3b 98 28 01 00 00 73 5e 89 db 48 8b 84
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] RIP  [<ffffffffa041aa8a>] kvm_set_irq+0x2a/0x130 [kvm]
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123]  RSP<ffff88045fc89d30>
>>>>>> Feb 23 13:56:19 ayrshire.u06.univ-nantes.prive kernel: [
>>>>>> 685.246123] CR2: 0000000000002458
>>>>>>
>>>>>>
>>>>>> If someone can help me solve this, please do.
>>>>>>
>>>>>> Regards.
>>>>>> _______________________________________________
>>>>>> Virtualization mailing list
>>>>>> Virtualization@lists.linux-foundation.org
>>>>>> https://lists.linux-foundation.org/mailman/listinfo/virtualization
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>> Hi,
>>>>
>>>> thanks for your response.
>>>>
>>>> This is what markup_oops.pl returns:
>>>> "No matching code found"
>>> Well, let's try to understand what's there.
>>>
>>> Do objdump -ldS kvm.ko
>>> look for <kvm_set_irq>
>>>
>>> and paste the content from start of that function
>>> to offset 0x2a and a bit beyond.
>>>
>>> You can also upload your kvm.ko somewhere, I'll try to take a look.
>>>
>>>
>>>> So either this is not a vhost_net bug, or my oops is incomplete and
>>>> markup_oops can't find the right vma offset.
>>>>
>>>> I will try to compare the pointers you pointed out, even if it could be
>>>> a little difficult for me.
>>> Hmm you know how to add printk to code and rebuild, right?
>>>
>>>> Maybe I will try 2.6.38; I will wait for a response from the kvm team.
>>>>
>>>> Regards.
>>>>
>>>> -- 
>>>> Jean-Philippe Menil - Pôle réseau Service IRTS
>>>> DSI Université de Nantes
>>>> jean-philippe.menil@univ-nantes.fr
>>>> Tel : 02.53.48.49.27 - Fax : 02.53.48.49.09
>> So, here is the result for the objdump against the kvm.ko (the
>> kvm_set_irq part) :
> Can you try building with -g and adding -l and -S to objdump
> please? I'd rather make the tool do the legwork than
> do it manually.
>
>> 0000000000006a60<kvm_set_irq>:
>> kvm_set_irq():
>>      6a60:       41 57                   push   %r15
>>      6a62:       41 89 f7                mov    %esi,%r15d
>>      6a65:       41 56                   push   %r14
>>      6a67:       41 55                   push   %r13
>>      6a69:       41 89 cd                mov    %ecx,%r13d
>>      6a6c:       41 54                   push   %r12
>>      6a6e:       49 89 fc                mov    %rdi,%r12
>>      6a71:       55                      push   %rbp
>>      6a72:       53                      push   %rbx
>>      6a73:       89 d3                   mov    %edx,%ebx
>>      6a75:       48 81 ec 98 00 00 00    sub    $0x98,%rsp
>>      6a7c:       8b 15 00 00 00 00       mov    0x0(%rip),%edx
>> # 6a82<kvm_set_irq+0x22>
>>      6a82:       85 d2                   test   %edx,%edx
>>      6a84:       0f 85 c4 00 00 00       jne    6b4e<kvm_set_irq+0xee>
>>      6a8a:       49 8b 84 24 58 24 00    mov    0x2458(%r12),%rax
> OK, 0x6a8a is the offset.
> After you build with -g, try
>
> addr2line -e kvm.ko 0x6a8a
>
> and see which line this points to.
>
>
>>      6a91:       00
>>      6a92:       3b 98 28 01 00 00       cmp    0x128(%rax),%ebx
>>      6a98:       73 5e                   jae    6af8<kvm_set_irq+0x98>
>>      6a9a:       89 db                   mov    %ebx,%ebx
>>      6a9c:       48 8b 84 d8 30 01 00    mov    0x130(%rax,%rbx,8),%rax
>>      6aa3:       00
>>      6aa4:       48 85 c0                test   %rax,%rax
>>      6aa7:       74 4f                   je     6af8<kvm_set_irq+0x98>
>>      6aa9:       48 89 e2                mov    %rsp,%rdx
>>      6aac:       31 db                   xor    %ebx,%ebx
>>      6aae:       48 8b 08                mov    (%rax),%rcx
>>      6ab1:       83 c3 01                add    $0x1,%ebx
>>      6ab4:       0f 18 09                prefetcht0 (%rcx)
>>      6ab7:       48 8b 48 e0             mov    -0x20(%rax),%rcx
>>      6abb:       48 89 0a                mov    %rcx,(%rdx)
>>      6abe:       48 8b 48 e8             mov    -0x18(%rax),%rcx
>>      6ac2:       48 89 4a 08             mov    %rcx,0x8(%rdx)
>>      6ac6:       48 8b 48 f0             mov    -0x10(%rax),%rcx
>>      6aca:       48 89 4a 10             mov    %rcx,0x10(%rdx)
>>      6ace:       48 8b 48 f8             mov    -0x8(%rax),%rcx
>>      6ad2:       48 89 4a 18             mov    %rcx,0x18(%rdx)
>>      6ad6:       48 8b 08                mov    (%rax),%rcx
>>      6ad9:       48 89 4a 20             mov    %rcx,0x20(%rdx)
>>      6add:       48 8b 48 08             mov    0x8(%rax),%rcx
>>      6ae1:       48 89 4a 28             mov    %rcx,0x28(%rdx)
>>      6ae5:       48 8b 00                mov    (%rax),%rax
>>      6ae8:       48 83 c2 30             add    $0x30,%rdx
>>      6aec:       48 85 c0                test   %rax,%rax
>>      6aef:       75 bd                   jne    6aae<kvm_set_irq+0x4e>
>>      6af1:       eb 07                   jmp    6afa<kvm_set_irq+0x9a>
>>      6af3:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>>      6af8:       31 db                   xor    %ebx,%ebx
>>      6afa:       bd ff ff ff ff          mov    $0xffffffff,%ebp
>>      6aff:       49 89 e6                mov    %rsp,%r14
>>      6b02:       85 db                   test   %ebx,%ebx
>>      6b04:       74 34                   je     6b3a<kvm_set_irq+0xda>
>>      6b06:       83 eb 01                sub    $0x1,%ebx
>>      6b09:       44 89 e9                mov    %r13d,%ecx
>>      6b0c:       44 89 fa                mov    %r15d,%edx
>>      6b0f:       48 63 c3                movslq %ebx,%rax
>>      6b12:       4c 89 e6                mov    %r12,%rsi
>>      6b15:       48 8d 04 40             lea    (%rax,%rax,2),%rax
>>      6b19:       48 c1 e0 04             shl    $0x4,%rax
>>      6b1d:       49 8d 3c 06             lea    (%r14,%rax,1),%rdi
>>      6b21:       ff 54 04 08             callq  *0x8(%rsp,%rax,1)
>>      6b25:       85 c0                   test   %eax,%eax
>>      6b27:       78 d9                   js     6b02<kvm_set_irq+0xa2>
>>      6b29:       85 ed                   test   %ebp,%ebp
>>      6b2b:       ba 00 00 00 00          mov    $0x0,%edx
>>      6b30:       0f 48 ea                cmovs  %edx,%ebp
>>      6b33:       85 db                   test   %ebx,%ebx
>>      6b35:       8d 2c 28                lea    (%rax,%rbp,1),%ebp
>>      6b38:       75 cc                   jne    6b06<kvm_set_irq+0xa6>
>>      6b3a:       48 81 c4 98 00 00 00    add    $0x98,%rsp
>>      6b41:       89 e8                   mov    %ebp,%eax
>>      6b43:       5b                      pop    %rbx
>>      6b44:       5d                      pop    %rbp
>>      6b45:       41 5c                   pop    %r12
>>      6b47:       41 5d                   pop    %r13
>>      6b49:       41 5e                   pop    %r14
>>      6b4b:       41 5f                   pop    %r15
>>      6b4d:       c3                      retq
>>      6b4e:       48 8b 2d 00 00 00 00    mov    0x0(%rip),%rbp
>> # 6b55<kvm_set_irq+0xf5>
>>      6b55:       48 85 ed                test   %rbp,%rbp
>>      6b58:       0f 84 2c ff ff ff       je     6a8a<kvm_set_irq+0x2a>
>>      6b5e:       48 8b 45 00             mov    0x0(%rbp),%rax
>>      6b62:       48 8b 7d 08             mov    0x8(%rbp),%rdi
>>      6b66:       48 83 c5 10             add    $0x10,%rbp
>>      6b6a:       44 89 f9                mov    %r15d,%ecx
>>      6b6d:       44 89 ea                mov    %r13d,%edx
>>      6b70:       89 de                   mov    %ebx,%esi
>>      6b72:       ff d0                   callq  *%rax
>>      6b74:       48 8b 45 00             mov    0x0(%rbp),%rax
>>      6b78:       48 85 c0                test   %rax,%rax
>>      6b7b:       75 e5                   jne    6b62<kvm_set_irq+0x102>
>>      6b7d:       e9 08 ff ff ff          jmpq   6a8a<kvm_set_irq+0x2a>
>>      6b82:       66 66 66 66 66 2e 0f    nopw   %cs:0x0(%rax,%rax,1)
>>      6b89:       1f 84 00 00 00 00 00
>>
>> I admit that this analysis is too complicated for me.
>> I can indeed rebuild a kernel with more printk calls and schedule a reboot.
>>
>> The kvm.ko is available through the following address:
>> http://filex.univ-nantes.fr/get?k=k1jKhQghdcHLz12Z50H
>>
>> Regards.
> This has no debug data. Can you rebuild with -g please?
>
> BTW if you want to rerun and get a more reliable backtrace,
> try enabling frame pointers (do you know how to?). But this will change the code,
> so the backtrace will no longer be valid; we will need
> a new one.
>
>> -- 
>> Jean-Philippe Menil - Pôle réseau Service IRTS
>> DSI Université de Nantes
>> jean-philippe.menil@univ-nantes.fr
>> Tel : 02.53.48.49.27 - Fax : 02.53.48.49.09
I rebooted the host with its new kernel (2.6.37.2), ran modprobe vhost_net,
and started three kvm guests.
The host hung within half an hour.

This time I get a general protection fault:

[ 2380.381225] general protection fault: 0000 [#1] SMP
[ 2380.381261] last sysfs file: 
/sys/devices/system/cpu/cpu11/cache/index2/shared_cpu_map
[ 2380.381309] CPU 0
[ 2380.381316] Modules linked in: vhost_net macvtap macvlan tun veth 
powernow_k8 mperf cpufreq_userspace cpufreq_stats cpufreq_powersave 
cpufreq_ondemand freq_table cpufreq_conservative fuse xt_physdev 
ip6t_LOG ip6table_filter ip6_tables ipt_LOG xt_multiport xt_limit 
xt_tcpudp xt_state iptable_filter ip_tables x_tables nf_conntrack_tftp 
nf_conntrack_ftp nf_conntrack_ipv4 nf_defrag_ipv4 8021q bridge stp ext2 
mbcache dm_round_robin dm_multipath nf_conntrack_ipv6 nf_conntrack 
nf_defrag_ipv6 kvm_amd kvm ipv6 snd_pcm snd_timer snd soundcore 
snd_page_alloc shpchp i2c_nforce2 pci_hotplug psmouse tpm_tis joydev 
pcspkr tpm evdev i2c_core dcdbas tpm_bios serio_raw processor ghes 
button hed thermal_sys xfs exportfs dm_mod sg sr_mod cdrom usbhid hid 
usb_storage sd_mod ses enclosure megaraid_sas lpfc ohci_hcd 
scsi_transport_fc scsi_tgt scsi_mod bnx2 ehci_hcd [last unloaded: 
scsi_wait_scan]
Feb 28 15:28:09 ayrshire.u06.univ-nantes.prive kernel: [ 2380.381839] 
Pid: 10, comm: kworker/0:1 Not tainted 2.6.37.2-dsiun-110105+ #2 
Dell Inc. PowerEdge M605/0K543T
[ 2380.381902] RIP: 0010:[<ffffffffa037e877>]  [<ffffffffa037e877>] 
kvm_set_irq+0x37/0x140 [kvm]
[ 2380.381973] RSP: 0018:ffff88045fc85d00  EFLAGS: 00010246
[ 2380.382002] RAX: 000200740000029c RBX: 000000000000001a RCX: 
0000000000000001
[ 2380.382035] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
ffff88045dbb7440
[ 2380.382068] RBP: ffff88045fc85dd0 R08: ffff88045fc84000 R09: 
000000000000000c
[ 2380.382101] R10: 0000000000000036 R11: 00000000ffffffff R12: 
ffff88045dbb7440
[ 2380.382134] R13: ffff88045dbb7440 R14: ffffffffa037faa0 R15: 
0000000000000001
[ 2380.382168] FS:  00007f0c97165720(0000) GS:ffff88007f800000(0000) 
knlGS:0000000000000000
[ 2380.382216] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2380.382246] CR2: 00007f13bcc80b40 CR3: 000000045e96c000 CR4: 
00000000000006f0
[ 2380.382279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[ 2380.382312] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
0000000000000400
[ 2380.382347] Process kworker/0:1 (pid: 10, threadinfo 
ffff88045fc84000, task ffff88085fc53c30)
[ 2380.382395] Stack:
[ 2380.382416]  00000000000119c0 00000000000119c0 00000000000119c0 
ffff88085fc53c30
[ 2380.382466]  ffff88085fc53ee0 ffff88045fc85fd8 ffff88085fc53ee8 
ffff88045fc84010
[ 2380.382516]  00000000000119c0 ffff88045fc85fd8 00000000000119c0 
00000000000119c0
[ 2380.382566] Call Trace:
[ 2380.382600]  [<ffffffff813818ce>] ? common_interrupt+0xe/0x13
[ 2380.382648]  [<ffffffffa037faa0>] ? irqfd_inject+0x0/0x50 [kvm]
[ 2380.382694]  [<ffffffffa037faca>] irqfd_inject+0x2a/0x50 [kvm]
[ 2380.382729]  [<ffffffff8106b7bb>] process_one_work+0x11b/0x450
[ 2380.382762]  [<ffffffff8106bf37>] worker_thread+0x157/0x410
[ 2380.382796]  [<ffffffff8103a569>] ? __wake_up_common+0x59/0x90
[ 2380.382828]  [<ffffffff8106bde0>] ? worker_thread+0x0/0x410
[ 2380.382861]  [<ffffffff8106f996>] kthread+0x96/0xa0
[ 2380.382894]  [<ffffffff81003c64>] kernel_thread_helper+0x4/0x10
[ 2380.382927]  [<ffffffff8106f900>] ? kthread+0x0/0xa0
[ 2380.382958]  [<ffffffff81003c60>] ? kernel_thread_helper+0x0/0x10
[ 2380.382987] Code: 55 49 89 fd 41 54 53 89 d3 48 81 ec a8 00 00 00 8b 
15 a6 75 03 00 89 b5 3c ff ff ff 85 d2 0f 85 d5 00 00 00 49 8b 85 58 24 
00 00 <3b> 98 28 01 00 00 73 61 89 db 48 8b 84 d8 30 01 00 00 48 85 c0
[ 2380.383185] RIP  [<ffffffffa037e877>] kvm_set_irq+0x37/0x140 [kvm]
[ 2380.383231]  RSP <ffff88045fc85d00>

Running markup_oops.pl gives me the following:

vmaoffset = 18446744072102576128
  ffffffffa037e841:    48 89 e5                 mov    %rsp,%rbp
  ffffffffa037e844:    41 57                    push   %r15
  ffffffffa037e846:    41 89 cf                 mov    %ecx,%r15d  |  
%r15 => 1  %ecx = 1
  ffffffffa037e849:    41 56                    push   %r14        |  
%r14 => ffffffffa037faa0
  ffffffffa037e84b:    41 55                    push   %r13
  ffffffffa037e84d:    49 89 fd                 mov    %rdi,%r13   |  
%edi = ffff88045dbb7440  %r13 => ffff88045dbb7440
  ffffffffa037e850:    41 54                    push   %r12        |  
%r12 => ffff88045dbb7440
  ffffffffa037e852:    53                       push   %rbx
  ffffffffa037e853:    89 d3                    mov    %edx,%ebx   |  
%ebx => 1a
  ffffffffa037e855:    48 81 ec a8 00 00 00     sub    $0xa8,%rsp
  ffffffffa037e85c:    8b 15 00 00 00 00        mov    
0x0(%rip),%edx        # ffffffffa037e862 <kvm_set_irq+0x22>
  ffffffffa037e862:    89 b5 3c ff ff ff        mov    %esi,-0xc4(%rbp) 
|  %esi = 0
  ffffffffa037e868:    85 d2                    test   %edx,%edx   |  
%edx => 0
  ffffffffa037e86a:    0f 85 d5 00 00 00        jne    ffffffffa037e945 
<kvm_set_irq+0x105>
  ffffffffa037e870:    49 8b 85 58 24 00 00     mov    0x2458(%r13),%rax 
|  %eax => 200740000029c  %r13 = ffff88045dbb7440
*ffffffffa037e877:    3b 98 28 01 00 00        cmp    0x128(%rax),%ebx 
|  %eax = 200740000029c  %ebx = 1a <--- faulting instruction
  ffffffffa037e87d:    73 61                    jae    ffffffffa037e8e0 
<kvm_set_irq+0xa0>
  ffffffffa037e87f:    89 db                    mov    %ebx,%ebx
  ffffffffa037e881:    48 8b 84 d8 30 01 00     mov    
0x130(%rax,%rbx,8),%rax
  ffffffffa037e888:    00
  ffffffffa037e889:    48 85 c0                 test   %rax,%rax
  ffffffffa037e88c:    74 52                    je     ffffffffa037e8e0 
<kvm_set_irq+0xa0>
  ffffffffa037e88e:    48 8d 95 40 ff ff ff     lea    -0xc0(%rbp),%rdx
  ffffffffa037e895:    31 db                    xor    %ebx,%ebx
  ffffffffa037e897:    48 8b 08                 mov    (%rax),%rcx
  ffffffffa037e89a:    83 c3 01                 add    $0x1,%ebx
  ffffffffa037e89d:    0f 18 09                 prefetcht0 (%rcx)
  ffffffffa037e8a0:    48 8b 48 e0              mov    -0x20(%rax),%rcx
  ffffffffa037e8a4:    48 89 0a                 mov    %rcx,(%rdx)
  ffffffffa037e8a7:    48 8b 48 e8              mov    -0x18(%rax),%rcx
  ffffffffa037e8ab:    48 89 4a 08              mov    %rcx,0x8(%rdx)
  ffffffffa037e8af:    48 8b 48 f0              mov    -0x10(%rax),%rcx
  ffffffffa037e8b3:    48 89 4a 10              mov    %rcx,0x10(%rdx)
  ffffffffa037e8b7:    48 8b 48 f8              mov    -0x8(%rax),%rcx
  ffffffffa037e8bb:    48 89 4a 18              mov    %rcx,0x18(%rdx)
  ffffffffa037e8bf:    48 8b 08                 mov    (%rax),%rcx
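
One way to read the annotated listing: the load at kvm_set_irq+0x30 put
0x000200740000029c into %rax, and the marked cmp then dereferences
0x128(%rax). On x86-64 with 48-bit virtual addresses that address is
not canonical, which would explain why this oops is reported as a
general protection fault rather than a page fault. A small sketch of
the check, with a helper name invented for this note and 48-bit
addressing assumed:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * With 48-bit virtual addresses, an address is canonical when bits
 * 63..47 are all copies of bit 47, i.e. all zeros or all ones.
 */
static bool is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47;   /* the 17 bits 63..47 */

	return top == 0 || top == 0x1ffff;
}
```

For comparison, the %r13 value ffff88045dbb7440 passes this check, so
the pointer handed to kvm_set_irq looks like a plausible kernel address
while the field loaded from offset 0x2458 of it does not.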

I've re-run markup_oops on the first oops (2.6.37.1) (on the right 
module this time, sorry for that); it gives me the following:

vmaoffset = 18446744072103215104
  ffffffffa041aa62:    41 89 f7                 mov    %esi,%r15d  |  %r15 => 0  %esi = 0
  ffffffffa041aa65:    41 56                    push   %r14        |  
%r14 => 0
  ffffffffa041aa67:    41 55                    push   %r13
  ffffffffa041aa69:    41 89 cd                 mov    %ecx,%r13d  |  
%ecx = 1  %r13 => 1
  ffffffffa041aa6c:    41 54                    push   %r12
  ffffffffa041aa6e:    49 89 fc                 mov    %rdi,%r12   |  
%edi = 0  %r12 => 0
  ffffffffa041aa71:    55                       push   %rbp
  ffffffffa041aa72:    53                       push   %rbx
  ffffffffa041aa73:    89 d3                    mov    %edx,%ebx   |  
%ebx => 1a
  ffffffffa041aa75:    48 81 ec 98 00 00 00     sub    $0x98,%rsp
  ffffffffa041aa7c:    8b 15 00 00 00 00        mov    
0x0(%rip),%edx        # ffffffffa041aa82 <kvm_set_irq+0x22>
  ffffffffa041aa82:    85 d2                    test   %edx,%edx   |  
%edx => 0
  ffffffffa041aa84:    0f 85 c4 00 00 00        jne    ffffffffa041ab4e 
<kvm_set_irq+0xee>
*ffffffffa041aa8a:    49 8b 84 24 58 24 00     mov    0x2458(%r12),%rax 
|  %eax = 0  %r12 = 0 <--- faulting instruction
  ffffffffa041aa91:    00
  ffffffffa041aa92:    3b 98 28 01 00 00        cmp    0x128(%rax),%ebx
  ffffffffa041aa98:    73 5e                    jae    ffffffffa041aaf8 
<kvm_set_irq+0x98>
  ffffffffa041aa9a:    89 db                    mov    %ebx,%ebx
  ffffffffa041aa9c:    48 8b 84 d8 30 01 00     mov    
0x130(%rax,%rbx,8),%rax
  ffffffffa041aaa3:    00
  ffffffffa041aaa4:    48 85 c0                 test   %rax,%rax
  ffffffffa041aaa7:    74 4f                    je     ffffffffa041aaf8 
<kvm_set_irq+0x98>
  ffffffffa041aaa9:    48 89 e2                 mov    %rsp,%rdx
  ffffffffa041aaac:    31 db                    xor    %ebx,%ebx
  ffffffffa041aaae:    48 8b 08                 mov    (%rax),%rcx
  ffffffffa041aab1:    83 c3 01                 add    $0x1,%ebx
  ffffffffa041aab4:    0f 18 09                 prefetcht0 (%rcx)
  ffffffffa041aab7:    48 8b 48 e0              mov    -0x20(%rax),%rcx
  ffffffffa041aabb:    48 89 0a                 mov    %rcx,(%rdx)
  ffffffffa041aabe:    48 8b 48 e8              mov    -0x18(%rax),%rcx
  ffffffffa041aac2:    48 89 4a 08              mov    %rcx,0x8(%rdx)
  ffffffffa041aac6:    48 8b 48 f0              mov    -0x10(%rax),%rcx
  ffffffffa041aaca:    48 89 4a 10              mov    %rcx,0x10(%rdx)
  ffffffffa041aace:    48 8b 48 f8              mov    -0x8(%rax),%rcx
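
Reading the annotated listing: `mov %rdi,%r12` copies the first integer
argument (per the System V AMD64 calling convention) into %r12, and
markup_oops annotates %r12 as 0 at that point. The faulting load
`mov 0x2458(%r12),%rax` therefore reads from address 0 + 0x2458. The
address arithmetic can be sketched in Python as below. The helper name
`effective_address` is made up for illustration, and the reading of
0x2458 as a field offset inside a structure reached through a NULL
pointer is an assumption, not something the oops states.

```python
MASK64 = (1 << 64) - 1  # x86-64 addresses wrap at 64 bits


def effective_address(base, displacement, index=0, scale=1):
    """Return base + index*scale + displacement, modulo 2**64.

    Hypothetical helper: mirrors the AT&T operand form
    disp(base,index,scale) used in the listing.
    """
    return (base + index * scale + displacement) & MASK64


# Register value taken from the markup_oops annotation: %r12 = 0.
R12 = 0x0

# mov 0x2458(%r12),%rax  -> the address the faulting load touches.
fault_addr = effective_address(R12, 0x2458)

if __name__ == "__main__":
    print(hex(fault_addr))
```

With a zero base the result is just the displacement, which is why a
small, non-zero fault address is typical of a NULL structure pointer.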

It appears that the kernel I recompiled (make-pkg) with the debug
options for the kvm module doesn't have the debug information!
addr2line gives me "??:0".

I will retry with the right options.
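
For reference, the numbers in the listing can be turned into offsets
with a few lines of Python. This is only a sketch under one assumption:
that the vmaoffset printed by markup_oops is the amount added to the
module file's addresses to obtain the runtime addresses shown. The
function names are hypothetical.

```python
# Values copied from the markup_oops output above.
VMAOFFSET = 18446744072103215104   # printed in decimal by markup_oops
FAULT_ADDR = 0xFFFFFFFFA041AA8A    # the line marked with '*'
REF_ADDR = 0xFFFFFFFFA041AA82      # annotated as <kvm_set_irq+0x22>
REF_OFFSET = 0x22


def module_relative(addr, vmaoffset):
    """Runtime address minus load offset (assumed meaning of vmaoffset)."""
    return addr - vmaoffset


def function_offset(addr, ref_addr, ref_offset):
    """Offset of addr from the start of the function that contains
    ref_addr, given that ref_addr is known to be <function+ref_offset>."""
    function_start = ref_addr - ref_offset
    return addr - function_start


if __name__ == "__main__":
    print(hex(VMAOFFSET))
    print(hex(module_relative(FAULT_ADDR, VMAOFFSET)))
    print(hex(function_offset(FAULT_ADDR, REF_ADDR, REF_OFFSET)))
```

The faulting instruction works out to kvm_set_irq+0x2a. The
module-relative value is what one would try passing to addr2line
against a module built with debug information; whether it resolves
depends on how the module's sections are laid out.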

Regards.

-- 
Jean-Philippe Menil - Pôle réseau Service IRTS
DSI Université de Nantes
jean-philippe.menil@univ-nantes.fr
Tel : 02.53.48.49.27 - Fax : 02.53.48.49.09


^ permalink raw reply

* Re: [PATCH 0/3] [RFC] Implement multiqueue (RX & TX) virtio-net
From: Krishna Kumar2 @ 2011-02-28 15:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: anthony, arnd, avi, davem, eric.dumazet, horms, kvm, netdev,
	rusty
In-Reply-To: <20110228073514.GA28006@redhat.com>

"Michael S. Tsirkin" <mst@redhat.com> wrote on 02/28/2011 01:05:15 PM:

> > This patch series is a continuation of an earlier one that
> > implemented guest MQ TX functionality.  This new patchset
> > implements both RX and TX MQ.  Qemu changes are not being
> > included at this time solely to aid in easier review.
> > Compatibility testing with old/new combinations of qemu/guest
> > and vhost was done without any issues.
> >
> > Some early TCP/UDP test results are at the bottom of this
> > post, I plan to submit more test results in the coming days.
> >
> > Please review and provide feedback on what can improve.
> >
> > Thanks!
> >
> > Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
>
>
> To help testing, could you post the qemu changes separately please?

Thanks Michael for your review and feedback. I will send the qemu
changes and respond to your comments tomorrow.

Thanks,

- KK


^ permalink raw reply

* Re: txqueuelen has wrong units; should be time
From: Hagen Paul Pfeifer @ 2011-02-28 15:38 UTC (permalink / raw)
  To: Albert Cahalan
  Cc: Jussi Kivilinna, Eric Dumazet, Mikael Abrahamsson, linux-kernel,
	netdev
In-Reply-To: <AANLkTimofhhH5omyk=HhkyaNG+MGqoac4rDf=dPuR7K-@mail.gmail.com>


On Sun, 27 Feb 2011 18:33:39 -0500, Albert Cahalan wrote:

> I suppose there is a need to allow at least 2 packets despite any
> time limits, so that it remains possible to use a traditional modem
> even if a huge packet takes several seconds to send.

That is a good point! We talk as if we knew every use case of Linux,
but that is not true at all. One of my customers, for example, operates
the Linux network stack on top of a proprietary MAC/driver where the
current packet-queue characteristic is just fine. The time-drop
approach is unsuitable there because the bandwidth can vary over a wide
range (0 to max. bandwidth) within a short time. Sufficient buffering
proves superior in this environment (only IPv{4,6}/UDP).
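
To make the rule under discussion concrete, here is a minimal Python
sketch of a queue whose limit is expressed as a drain time, with the
"at least 2 packets" floor from the quoted mail. It is not kernel code;
the class name, parameters and the fixed `rate_bps` are all
assumptions. The fixed rate is precisely what does not hold on a link
whose bandwidth swings between 0 and maximum.

```python
from collections import deque


class TimeLimitedQueue:
    """FIFO limited by estimated drain time, with a packet-count floor.

    Hypothetical illustration: a packet is dropped only if the queue
    already holds min_packets AND sending everything queued (plus the
    new packet) at rate_bps would take longer than max_delay_s.
    """

    def __init__(self, rate_bps, max_delay_s, min_packets=2):
        self.rate_bps = rate_bps
        self.max_delay_s = max_delay_s
        self.min_packets = min_packets
        self._sizes = deque()
        self._queued_bytes = 0

    def __len__(self):
        return len(self._sizes)

    def enqueue(self, size_bytes):
        """Return True if the packet was queued, False if dropped."""
        drain_time_s = (self._queued_bytes + size_bytes) * 8 / self.rate_bps
        over_time = drain_time_s > self.max_delay_s
        if over_time and len(self._sizes) >= self.min_packets:
            return False
        self._sizes.append(size_bytes)
        self._queued_bytes += size_bytes
        return True

    def dequeue(self):
        """Remove and return the size of the oldest queued packet."""
        size = self._sizes.popleft()
        self._queued_bytes -= size
        return size
```

On a 56 kbit/s modem with a 0.5 s limit, two 65535-byte packets are
still accepted although each takes several seconds to send; the third
is dropped.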



Hagen

^ permalink raw reply

