Netdev List
 help / color / mirror / Atom feed
* [PATCH net-next] bnx2x: Disable local BHes to prevent a dead-lock situation
From: Vladislav Zolotarov @ 2010-11-24 13:45 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list, Eric Dumazet

From: Eric Dumazet <eric.dumazet@gmail.com>

According to Eric's suggestion:
Disable local BHes to prevent a dead-lock situation between sch_direct_xmit()
(Soft_IRQ context) and bnx2x_tx_int (called by bnx2x_run_loopback() - syscall
context), as both are taking a netif_tx_lock().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
CCing Eric... :)

 drivers/net/bnx2x/bnx2x_ethtool.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x_ethtool.c b/drivers/net/bnx2x/bnx2x_ethtool.c
index d02ffbd..0301278 100644
--- a/drivers/net/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/bnx2x/bnx2x_ethtool.c
@@ -1499,8 +1499,15 @@ static int bnx2x_run_loopback(struct bnx2x *bp, int loopback_mode, u8 link_up)
 	 * updates that have been performed while interrupts were
 	 * disabled.
 	 */
-	if (bp->common.int_block == INT_BLOCK_IGU)
+	if (bp->common.int_block == INT_BLOCK_IGU) {
+		/* Disable local BHes to prevent a dead-lock situation between
+		 * sch_direct_xmit() and bnx2x_run_loopback() (calling
+		 * bnx2x_tx_int()), as both are taking netif_tx_lock().
+		 */
+		local_bh_disable();
 		bnx2x_tx_int(fp_tx);
+		local_bh_enable();
+	}
 
 	rx_idx = le16_to_cpu(*fp_rx->rx_cons_sb);
 	if (rx_idx != rx_start_idx + num_pkts)
-- 
1.7.0.4





^ permalink raw reply related

* [PATCH] infiniband: remove dev_base_lock use
From: Eric Dumazet @ 2010-11-24 13:47 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Roland Dreier, Sean Hefty, Hal Rosenstock,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1288350226.2560.11.camel@edumazet-laptop>



Le vendredi 29 octobre 2010 à 13:03 +0200, Eric Dumazet a écrit :
> dev_base_lock is the legacy way to lock the device list, and is planned
> to disappear. (writers hold RTNL, readers hold RCU lock)
> 
> Convert rdma_translate_ip() to RCU locking.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
>  drivers/infiniband/core/addr.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
> index a5ea1bc..309b477 100644
> --- a/drivers/infiniband/core/addr.c
> +++ b/drivers/infiniband/core/addr.c
> @@ -130,8 +130,8 @@ int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr)
>  
>  #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
>  	case AF_INET6:
> -		read_lock(&dev_base_lock);
> -		for_each_netdev(&init_net, dev) {
> +		rcu_read_lock();
> +		for_each_netdev_rcu(&init_net, dev) {
>  			if (ipv6_chk_addr(&init_net,
>  					  &((struct sockaddr_in6 *) addr)->sin6_addr,
>  					  dev, 1)) {
> @@ -139,7 +139,7 @@ int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr)
>  				break;
>  			}
>  		}
> -		read_unlock(&dev_base_lock);
> +		rcu_read_unlock();
>  		break;
>  #endif
>  	}
> 

Here is a new version of patch.

David, it seems no infiniband guy commented this patch, could you take
it in your tree ?

Thanks

[PATCH] infiniband: remove dev_base_lock use

dev_base_lock is the legacy way to lock the device list, and is planned
to disappear. (writers hold RTNL, readers hold RCU lock)

Convert rdma_translate_ip() and update_ipv6_gids() to RCU locking.

Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/core/addr.c    |    6 +++---
 drivers/infiniband/hw/mlx4/main.c |    6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index c15fd2e..8aba0ba 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -130,8 +130,8 @@ int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr)
 
 #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
 	case AF_INET6:
-		read_lock(&dev_base_lock);
-		for_each_netdev(&init_net, dev) {
+		rcu_read_lock();
+		for_each_netdev_rcu(&init_net, dev) {
 			if (ipv6_chk_addr(&init_net,
 					  &((struct sockaddr_in6 *) addr)->sin6_addr,
 					  dev, 1)) {
@@ -139,7 +139,7 @@ int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr)
 				break;
 			}
 		}
-		read_unlock(&dev_base_lock);
+		rcu_read_unlock();
 		break;
 #endif
 	}
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index bf3e20c..4e55a28 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -848,8 +848,8 @@ static int update_ipv6_gids(struct mlx4_ib_dev *dev, int port, int clear)
 		goto out;
 	}
 
-	read_lock(&dev_base_lock);
-	for_each_netdev(&init_net, tmp) {
+	rcu_read_lock();
+	for_each_netdev_rcu(&init_net, tmp) {
 		if (ndev && (tmp == ndev || rdma_vlan_dev_real_dev(tmp) == ndev)) {
 			gid.global.subnet_prefix = cpu_to_be64(0xfe80000000000000LL);
 			vid = rdma_vlan_dev_vlan_id(tmp);
@@ -884,7 +884,7 @@ static int update_ipv6_gids(struct mlx4_ib_dev *dev, int port, int clear)
 			}
 		}
 	}
-	read_unlock(&dev_base_lock);
+	rcu_read_unlock();
 
 	for (i = 0; i < 128; ++i)
 		if (!hits[i]) {


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* RE: [PATCH 1/2] usbnet: changes for upcoming cdc_ncm driver
From: Alexey ORISHKO @ 2010-11-24 13:53 UTC (permalink / raw)
  To: Viral Mehta, gregkh-l3A5Bk7waGM@public.gmane.org,
	linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	oliver-GvhC2dPhHPQdnm+yROfE0A@public.gmane.org,
	yauheni.kaliuta-xNZwKgViW5gAvxtiuMwx3w@public.gmane.org,
	balbi-l0cyMroinI0@public.gmane.org, Sjur BRENDELAND,
	Alexey ORISHKO
In-Reply-To: <D69C90565D53114396BF743585AF5A09122E61E8A3-fVvhrSDAkVjuuPCCJ6VnObSn4PsL5ZDKvpI+GvaZM4lBDgjK7y7TUQ@public.gmane.org>

> -----Original Message-----
> From: Viral Mehta [mailto:Viral.Mehta-oonXi/34qI7c+919tysfdA@public.gmane.org]
> Sent: 23 ноября 2010 г. 18:40
> To: Alexey Orishko; gregkh-l3A5Bk7waGM@public.gmane.org; linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; oliver-GvhC2dPhHPQdnm+yROfE0A@public.gmane.org;
> yauheni.kaliuta-xNZwKgViW5gAvxtiuMwx3w@public.gmane.org; felipe.balbi-xNZwKgViW5gAvxtiuMwx3w@public.gmane.org; Sjur BRENDELAND;
> Alexey ORISHKO
> Subject: RE: [PATCH 1/2] usbnet: changes for upcoming cdc_ncm driver
> 
> Hi,
> 
> >Signed-off-by: Alexey Orishko <alexey.orishko-0IS4wlFg1OjSUeElwK9/Pw@public.gmane.org>
> >---
> >Changes:
> >drivers/net/usb/usbnet.c:
> >- the procedure of counting packets in usbnet was updated due to the
>  >accumulating of IP packets in the CDC NCM driver
> >- no short packets sent for cdc_ncm driver
> >include/linux/usb/usbnet.h:
> >- a new flag to indicate driver's capability to accumulate IP packets
> in one
>  >skb before sending them (Tx direction)
> 
> Feels like good to have these comments in comment log itself (i..e,
> above Signed-off-by line)
> Same for another patch as well.

No problem, I will repost it with comments above signed-off line

> 
> Viral
> 
> This Email may contain confidential or privileged information for the
> intended recipient (s) If you are not the intended recipient, please do
> not use or disseminate the information, notify the sender and delete it
> from your system.
> 
> ______________________________________________________________________
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net-next-2.6 10/17 v3] can: EG20T PCH: Fix coding rule violation
From: Marc Kleine-Budde @ 2010-11-24 13:54 UTC (permalink / raw)
  To: Tomoya MORINAGA
  Cc: andrew.chih.howe.khor-ral2JQCrhuEAvxtiuMwx3w, Samuel Ortiz,
	margie.foster-ral2JQCrhuEAvxtiuMwx3w,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	yong.y.wang-ral2JQCrhuEAvxtiuMwx3w,
	kok.howg.ewe-ral2JQCrhuEAvxtiuMwx3w, Wolfgang Grandegger,
	joel.clark-ral2JQCrhuEAvxtiuMwx3w, David S. Miller,
	Christian Pellegrin, qi.wang-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <4CED031E.7030403-ECg8zkTtlr0C6LszWs/t0g@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 6441 bytes --]

On 11/24/2010 01:20 PM, Tomoya MORINAGA wrote:
> Fix coding rule violation.
> 
> Signed-off-by: Tomoya MORINAGA <tomoya-linux-ECg8zkTtlr0C6LszWs/t0g@public.gmane.org>

Fix my comment (see online) , check if the lines stay (mostly) <= 80
chars and add my Acked-by.

cheers, Marc

> ---
>  drivers/net/can/pch_can.c |   48
> ++++++++++++++++++++------------------------
>  1 files changed, 22 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/net/can/pch_can.c b/drivers/net/can/pch_can.c
> index e71817d..318eb1f 100644
> --- a/drivers/net/can/pch_can.c
> +++ b/drivers/net/can/pch_can.c
> @@ -89,9 +89,11 @@
> 
>  #define PCH_CAN_CLK		50000000	/* 50MHz */
> 
> -/* Define the number of message object.
> +/*
> + * Define the number of message object.
>   * PCH CAN communications are done via Message RAM.
> - * The Message RAM consists of 32 message objects. */
> + * The Message RAM consists of 32 message objects.
> + */
>  #define PCH_RX_OBJ_NUM		26
>  #define PCH_TX_OBJ_NUM		6
>  #define PCH_RX_OBJ_START	1
> @@ -126,7 +128,7 @@ enum pch_can_mode {
>  	PCH_CAN_ALL,
>  	PCH_CAN_NONE,
>  	PCH_CAN_STOP,
> -	PCH_CAN_RUN
> +	PCH_CAN_RUN,
>  };
> 
>  struct pch_can_if_regs {
> @@ -290,21 +292,20 @@ static void pch_can_set_rxtx(struct pch_can_priv
> *priv, u32 buff_num,
>  	else
>  		ie = PCH_IF_MCONT_RXIE;
> 
> -	/* Reading the receive buffer data from RAM to Interface1 registers */
> +	/* Reading the receive buffer data from RAM to Interface1/2 registers */
>  	iowrite32(PCH_CMASK_RX_TX_GET, &priv->regs->ifregs[dir].cmask);
>  	pch_can_rw_msg_obj(&priv->regs->ifregs[dir].creq, buff_num);
> 
> -	/* Setting the IF1MASK1 register to access MsgVal and RxIE bits */
> +	/* Setting the IF1/2MASK1 register to access MsgVal and RxIE bits */
>  	iowrite32(PCH_CMASK_RDWR | PCH_CMASK_ARB | PCH_CMASK_CTRL,
>  		  &priv->regs->ifregs[dir].cmask);
> 
>  	if (set) {
> -		/* Setting the MsgVal and RxIE bits */
> +		/* Setting the MsgVal and RxIE/TxIE bits */
>  		pch_can_bit_set(&priv->regs->ifregs[dir].mcont, ie);
>  		pch_can_bit_set(&priv->regs->ifregs[dir].id2, PCH_ID_MSGVAL);
> -
>  	} else {
> -		/* Resetting the MsgVal and RxIE bits */
> +		/* Clearing the MsgVal and RxIE/TxIE bits */
>  		pch_can_bit_clear(&priv->regs->ifregs[dir].mcont, ie);
>  		pch_can_bit_clear(&priv->regs->ifregs[dir].id2, PCH_ID_MSGVAL);
>  	}
> @@ -312,7 +313,6 @@ static void pch_can_set_rxtx(struct pch_can_priv
> *priv, u32 buff_num,
>  	pch_can_rw_msg_obj(&priv->regs->ifregs[dir].creq, buff_num);
>  }
> 
> -
>  static void pch_can_set_rx_all(struct pch_can_priv *priv, int set)
>  {
>  	int i;
> @@ -328,7 +328,7 @@ static void pch_can_set_tx_all(struct pch_can_priv
> *priv, int set)
> 
>  	/* Traversing to obtain the object configured as transmit object. */
>  	for (i = PCH_TX_OBJ_START; i <= PCH_TX_OBJ_END; i++)
> -		pch_can_set_rxtx(priv, i, set, 1);
> +		pch_can_set_rxtx(priv, i, set, PCH_TX_IFREG);

this should be folded into the patch 1 (along with the introduction of
the enums)

>  }
> 
>  static u32 pch_can_int_pending(struct pch_can_priv *priv)
> @@ -363,8 +363,7 @@ static void pch_can_config_rx_tx_buffers(struct
> pch_can_priv *priv)
>  	int i;
> 
>  	for (i = PCH_RX_OBJ_START; i <= PCH_RX_OBJ_END; i++) {
> -		iowrite32(PCH_CMASK_RX_TX_GET,
> -			&priv->regs->ifregs[0].cmask);
> +		iowrite32(PCH_CMASK_RX_TX_GET, &priv->regs->ifregs[0].cmask);
>  		pch_can_rw_msg_obj(&priv->regs->ifregs[0].creq, i);
> 
>  		iowrite32(0x0, &priv->regs->ifregs[0].id1);
> @@ -386,16 +385,14 @@ static void pch_can_config_rx_tx_buffers(struct
> pch_can_priv *priv)
>  				  0x1fff | PCH_MASK2_MDIR_MXTD);
> 
>  		/* Setting CMASK for writing */
> -		iowrite32(PCH_CMASK_RDWR | PCH_CMASK_MASK |
> -			  PCH_CMASK_ARB | PCH_CMASK_CTRL,
> -			  &priv->regs->ifregs[0].cmask);
> +		iowrite32(PCH_CMASK_RDWR | PCH_CMASK_MASK | PCH_CMASK_ARB |
> +			  PCH_CMASK_CTRL, &priv->regs->ifregs[0].cmask);
> 
>  		pch_can_rw_msg_obj(&priv->regs->ifregs[0].creq, i);
>  	}
> 
>  	for (i = PCH_TX_OBJ_START; i <= PCH_TX_OBJ_END; i++) {
> -		iowrite32(PCH_CMASK_RX_TX_GET,
> -			&priv->regs->ifregs[1].cmask);
> +		iowrite32(PCH_CMASK_RX_TX_GET, &priv->regs->ifregs[1].cmask);
>  		pch_can_rw_msg_obj(&priv->regs->ifregs[1].creq, i);
> 
>  		/* Resetting DIR bit for reception */
> @@ -410,9 +407,8 @@ static void pch_can_config_rx_tx_buffers(struct
> pch_can_priv *priv)
>  		pch_can_bit_clear(&priv->regs->ifregs[1].mask2, 0x1fff);
> 
>  		/* Setting CMASK for writing */
> -		iowrite32(PCH_CMASK_RDWR | PCH_CMASK_MASK |
> -			  PCH_CMASK_ARB | PCH_CMASK_CTRL,
> -			  &priv->regs->ifregs[1].cmask);
> +		iowrite32(PCH_CMASK_RDWR | PCH_CMASK_MASK | PCH_CMASK_ARB |
> +			  PCH_CMASK_CTRL, &priv->regs->ifregs[1].cmask);
> 
>  		pch_can_rw_msg_obj(&priv->regs->ifregs[1].creq, i);
>  	}
> @@ -471,8 +467,9 @@ static void pch_can_int_clr(struct pch_can_priv
> *priv, u32 mask)
> 
>  		pch_can_rw_msg_obj(&priv->regs->ifregs[0].creq, mask);
>  	} else if ((mask >= PCH_TX_OBJ_START) && (mask <= PCH_TX_OBJ_END)) {
> -		/* Setting CMASK for clearing interrupts for
> -					 frame transmission. */
> +		/*
> +		 * Setting CMASK for clearing interrupts for frame transmission.
> +		 */
>  		iowrite32(PCH_CMASK_RDWR | PCH_CMASK_CTRL | PCH_CMASK_ARB,
>  			  &priv->regs->ifregs[1].cmask);
> 
> @@ -600,7 +597,6 @@ static irqreturn_t pch_can_interrupt(int irq, void
> *dev_id)
>  	struct pch_can_priv *priv = netdev_priv(ndev);
> 
>  	pch_can_set_int_enables(priv, PCH_CAN_NONE);
> -
>  	napi_schedule(&priv->napi);
> 
>  	return IRQ_HANDLED;
> @@ -1048,11 +1044,11 @@ static u32 pch_can_get_rxtx_ir(struct
> pch_can_priv *priv, u32 buff_num, u32 dir)
>  	pch_can_rw_msg_obj(&priv->regs->ifregs[dir].creq, buff_num);
> 
>  	if (((ioread32(&priv->regs->ifregs[dir].id2)) & PCH_ID_MSGVAL) &&
> -			((ioread32(&priv->regs->ifregs[dir].mcont)) & ie)) {
> +			((ioread32(&priv->regs->ifregs[dir].mcont)) & ie))
>  		enable = 1;
> -	} else {
> +	else
>  		enable = 0;
> -	}
> +
>  	return enable;
>  }
> 


-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

[-- Attachment #2: Type: text/plain, Size: 188 bytes --]

_______________________________________________
Socketcan-core mailing list
Socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org
https://lists.berlios.de/mailman/listinfo/socketcan-core

^ permalink raw reply

* [PATCH 7/8] ethoc: rework mdio read/write
From: Jonas Bonn @ 2010-11-24 13:40 UTC (permalink / raw)
  To: netdev; +Cc: Jonas Bonn
In-Reply-To: <1290606058-26703-1-git-send-email-jonas@southpole.se>

MDIO read and write were checking whether a timeout had expired to determine
whether to recheck the result of the MDIO operation.  Under heavy CPU usage,
however, it was possible for the timeout to expire before the routine got
around to be able to check a second time even, thus erroneousy returning an
-EBUSY.

This patch changes the the MDIO IO routines to try up to five times to complete
the operation before giving up, thus lessening the dependency on CPU load.

This resolves a problem whereby a ping flood would keep the CPU so busy that
the above problem would manifest itself; the MDIO command to check link status
would fail and the interface would erroneously be shut down.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
---
 drivers/net/ethoc.c |   14 ++++++--------
 1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
index 70ca2f3..4f7c9ad 100644
--- a/drivers/net/ethoc.c
+++ b/drivers/net/ethoc.c
@@ -616,13 +616,13 @@ static int ethoc_poll(struct napi_struct *napi, int budget)
 
 static int ethoc_mdio_read(struct mii_bus *bus, int phy, int reg)
 {
-	unsigned long timeout = jiffies + ETHOC_MII_TIMEOUT;
 	struct ethoc *priv = bus->priv;
+	int i;
 
 	ethoc_write(priv, MIIADDRESS, MIIADDRESS_ADDR(phy, reg));
 	ethoc_write(priv, MIICOMMAND, MIICOMMAND_READ);
 
-	while (time_before(jiffies, timeout)) {
+	for (i=0; i < 5; i++) {
 		u32 status = ethoc_read(priv, MIISTATUS);
 		if (!(status & MIISTATUS_BUSY)) {
 			u32 data = ethoc_read(priv, MIIRX_DATA);
@@ -630,8 +630,7 @@ static int ethoc_mdio_read(struct mii_bus *bus, int phy, int reg)
 			ethoc_write(priv, MIICOMMAND, 0);
 			return data;
 		}
-
-		schedule();
+		usleep_range(100,200);
 	}
 
 	return -EBUSY;
@@ -639,22 +638,21 @@ static int ethoc_mdio_read(struct mii_bus *bus, int phy, int reg)
 
 static int ethoc_mdio_write(struct mii_bus *bus, int phy, int reg, u16 val)
 {
-	unsigned long timeout = jiffies + ETHOC_MII_TIMEOUT;
 	struct ethoc *priv = bus->priv;
+	int i;
 
 	ethoc_write(priv, MIIADDRESS, MIIADDRESS_ADDR(phy, reg));
 	ethoc_write(priv, MIITX_DATA, val);
 	ethoc_write(priv, MIICOMMAND, MIICOMMAND_WRITE);
 
-	while (time_before(jiffies, timeout)) {
+	for (i=0; i < 5; i++) {
 		u32 stat = ethoc_read(priv, MIISTATUS);
 		if (!(stat & MIISTATUS_BUSY)) {
 			/* reset MII command register */
 			ethoc_write(priv, MIICOMMAND, 0);
 			return 0;
 		}
-
-		schedule();
+		usleep_range(100,200);
 	}
 
 	return -EBUSY;
-- 
1.7.1


^ permalink raw reply related

* ethoc driver updates
From: Jonas Bonn @ 2010-11-24 13:40 UTC (permalink / raw)
  To: netdev

Here's a set of patches that fix up some aspects of the ethoc network
driver:

i) device tree configuration
ii) reworked interrupt handling
iii) cleanups

This has been tested on an OpenRISC 1200 board; the OpenRISC Linux port
is using device trees now, hence the device tree bits.

/Jonas


^ permalink raw reply

* [PATCH 4/8] ethoc: prevent overflow of rx counter
From: Jonas Bonn @ 2010-11-24 13:40 UTC (permalink / raw)
  To: netdev; +Cc: Jonas Bonn
In-Reply-To: <1290606058-26703-1-git-send-email-jonas@southpole.se>

Rewind cur_rx to prevent it from overflowing.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
---
 drivers/net/ethoc.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
index 53c03f2..7d1b5d8 100644
--- a/drivers/net/ethoc.c
+++ b/drivers/net/ethoc.c
@@ -408,6 +408,9 @@ static int ethoc_rx(struct net_device *dev, int limit)
 	struct ethoc *priv = netdev_priv(dev);
 	int count;
 
+	/* Prevent overflow of priv->cur_rx by rewinding it */	
+	priv->cur_rx = priv->cur_rx % priv->num_rx;
+
 	for (count = 0; count < limit; ++count) {
 		unsigned int entry;
 		struct ethoc_bd bd;
-- 
1.7.1


^ permalink raw reply related

* [PATCH 6/8] ethoc: rework interrupt handling
From: Jonas Bonn @ 2010-11-24 13:40 UTC (permalink / raw)
  To: netdev; +Cc: Jonas Bonn
In-Reply-To: <1290606058-26703-1-git-send-email-jonas@southpole.se>

The old interrupt handling was incorrect in that it did not account for the
fact that the interrupt source bits get set irregardless of whether or not
their corresponding mask is set.  This patch fixes that by masking off the
source bits for masked interrupts.

Furthermore, the handling of transmission events is moved to the NAPI polling
handler alongside the reception handler, thus preventing a whole bunch of
interrupts during heavy traffic.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
---
 drivers/net/ethoc.c |   79 +++++++++++++++++++++++++++++++++------------------
 1 files changed, 51 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
index 5afa934..70ca2f3 100644
--- a/drivers/net/ethoc.c
+++ b/drivers/net/ethoc.c
@@ -499,29 +499,43 @@ static int ethoc_update_tx_stats(struct ethoc *dev, struct ethoc_bd *bd)
 	return 0;
 }
 
-static void ethoc_tx(struct net_device *dev)
+static int ethoc_tx(struct net_device *dev, int limit)
 {
 	struct ethoc *priv = netdev_priv(dev);
+	int count;
+	struct ethoc_bd bd;
 
-	spin_lock(&priv->lock);
+	for (count = 0; count < limit; ++count) {
+		unsigned int entry;
 
-	while (priv->dty_tx != priv->cur_tx) {
-		unsigned int entry = priv->dty_tx % priv->num_tx;
-		struct ethoc_bd bd;
+		entry = priv->dty_tx % priv->num_tx;
 
 		ethoc_read_bd(priv, entry, &bd);
-		if (bd.stat & TX_BD_READY)
-			break;
 
-		entry = (++priv->dty_tx) % priv->num_tx;
+		if (bd.stat & TX_BD_READY || (priv->dty_tx == priv->cur_tx)) {
+			ethoc_ack_irq(priv, INT_MASK_TX);
+			/* If interrupt came in between reading in the BD
+			 * and clearing the interrupt source, then we risk 
+			 * missing the event as the TX interrupt won't trigger
+			 * right away when we reenable it; hence, check 
+			 * BD_EMPTY here again to make sure there isn't such an
+			 * event pending...
+			 */
+			ethoc_read_bd(priv, entry, &bd);
+			if (bd.stat & TX_BD_READY ||
+			    (priv->dty_tx == priv->cur_tx))
+				break;
+		}
+
 		(void)ethoc_update_tx_stats(priv, &bd);
+		priv->dty_tx++;
 	}
 
-	if ((priv->cur_tx - priv->dty_tx) <= (priv->num_tx / 2))
+	if ((priv->cur_tx - priv->dty_tx) <= (priv->num_tx / 2)) {
 		netif_wake_queue(dev);
+	}
 
-	ethoc_ack_irq(priv, INT_MASK_TX);
-	spin_unlock(&priv->lock);
+	return count;
 }
 
 static irqreturn_t ethoc_interrupt(int irq, void *dev_id)
@@ -529,32 +543,38 @@ static irqreturn_t ethoc_interrupt(int irq, void *dev_id)
 	struct net_device *dev = dev_id;
 	struct ethoc *priv = netdev_priv(dev);
 	u32 pending;
-
-	ethoc_disable_irq(priv, INT_MASK_ALL);
+	u32 mask;
+
+	/* Figure out what triggered the interrupt...
+	 * The tricky bit here is that the interrupt source bits get
+	 * set in INT_SOURCE for an event irregardless of whether that
+	 * event is masked or not.  Thus, in order to figure out what
+	 * triggered the interrupt, we need to remove the sources
+	 * for all events that are currently masked.  This behaviour
+	 * is not particularly well documented but reasonable...
+	 */
+	mask = ethoc_read(priv, INT_MASK);
 	pending = ethoc_read(priv, INT_SOURCE);
+	pending &= mask;
+
 	if (unlikely(pending == 0)) {
-		ethoc_enable_irq(priv, INT_MASK_ALL);
 		return IRQ_NONE;
 	}
 
 	ethoc_ack_irq(priv, pending);
 
+	/* We always handle the dropped packet interrupt */
 	if (pending & INT_MASK_BUSY) {
 		dev_err(&dev->dev, "packet dropped\n");
 		dev->stats.rx_dropped++;
 	}
 
-	if (pending & INT_MASK_RX) {
-		if (napi_schedule_prep(&priv->napi))
-			__napi_schedule(&priv->napi);
-	} else {
-		ethoc_enable_irq(priv, INT_MASK_RX);
+	/* Handle receive/transmit event by switching to polling */
+	if (pending & (INT_MASK_TX | INT_MASK_RX)) {
+		ethoc_disable_irq(priv, INT_MASK_TX | INT_MASK_RX);
+		napi_schedule(&priv->napi);
 	}
 
-	if (pending & INT_MASK_TX)
-		ethoc_tx(dev);
-
-	ethoc_enable_irq(priv, INT_MASK_ALL & ~INT_MASK_RX);
 	return IRQ_HANDLED;
 }
 
@@ -580,15 +600,18 @@ static int ethoc_get_mac_address(struct net_device *dev, void *addr)
 static int ethoc_poll(struct napi_struct *napi, int budget)
 {
 	struct ethoc *priv = container_of(napi, struct ethoc, napi);
-	int work_done = 0;
+	int rx_work_done = 0;
+	int tx_work_done = 0;
+
+	rx_work_done = ethoc_rx(priv->netdev, budget);
+	tx_work_done = ethoc_tx(priv->netdev, budget);
 
-	work_done = ethoc_rx(priv->netdev, budget);
-	if (work_done < budget) {
+	if (rx_work_done < budget && tx_work_done < budget) {
 		napi_complete(napi);
-		ethoc_enable_irq(priv, INT_MASK_RX);
+		ethoc_enable_irq(priv, INT_MASK_TX | INT_MASK_RX);
 	}
 
-	return work_done;
+	return rx_work_done;
 }
 
 static int ethoc_mdio_read(struct mii_bus *bus, int phy, int reg)
-- 
1.7.1


^ permalink raw reply related

* [PATCH 5/8] ethoc: Double check pending RX packet
From: Jonas Bonn @ 2010-11-24 13:40 UTC (permalink / raw)
  To: netdev; +Cc: Jonas Bonn
In-Reply-To: <1290606058-26703-1-git-send-email-jonas@southpole.se>

An interrupt may occur between checking bd.stat and clearing the
interrupt source register which would result in the packet going totally
unnoticed as the interrupt will be missed.  Double check bd.stat after
clearing the interrupt source register to guard against such an
occurrence.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
---
 drivers/net/ethoc.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
index 7d1b5d8..5afa934 100644
--- a/drivers/net/ethoc.c
+++ b/drivers/net/ethoc.c
@@ -417,8 +417,20 @@ static int ethoc_rx(struct net_device *dev, int limit)
 
 		entry = priv->num_tx + (priv->cur_rx % priv->num_rx);
 		ethoc_read_bd(priv, entry, &bd);
-		if (bd.stat & RX_BD_EMPTY)
-			break;
+		if (bd.stat & RX_BD_EMPTY) {
+			ethoc_ack_irq(priv, INT_MASK_RX);
+			/* If packet (interrupt) came in between checking
+			 * BD_EMTPY and clearing the interrupt source, then we
+			 * risk missing the packet as the RX interrupt won't
+			 * trigger right away when we reenable it; hence, check
+			 * BD_EMTPY here again to make sure there isn't such a
+			 * packet waiting for us...
+			 */
+			ethoc_read_bd(priv, entry, &bd);
+			if (bd.stat & RX_BD_EMPTY) 
+				break;
+
+		}
 
 		if (ethoc_update_rx_stats(priv, &bd) == 0) {
 			int size = bd.stat >> 16;
-- 
1.7.1


^ permalink raw reply related

* [PATCH 8/8] ethoc: fix function return type
From: Jonas Bonn @ 2010-11-24 13:40 UTC (permalink / raw)
  To: netdev; +Cc: Jonas Bonn
In-Reply-To: <1290606058-26703-1-git-send-email-jonas@southpole.se>

update_ethoc_tx_stats doesn't need to return anything so make its return
type void in order to avoid an unnecessary cast when the function is called.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
---
 drivers/net/ethoc.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
index 4f7c9ad..b12c588 100644
--- a/drivers/net/ethoc.c
+++ b/drivers/net/ethoc.c
@@ -466,7 +466,7 @@ static int ethoc_rx(struct net_device *dev, int limit)
 	return count;
 }
 
-static int ethoc_update_tx_stats(struct ethoc *dev, struct ethoc_bd *bd)
+static void ethoc_update_tx_stats(struct ethoc *dev, struct ethoc_bd *bd)
 {
 	struct net_device *netdev = dev->netdev;
 
@@ -496,7 +496,6 @@ static int ethoc_update_tx_stats(struct ethoc *dev, struct ethoc_bd *bd)
 	netdev->stats.collisions += (bd->stat >> 4) & 0xf;
 	netdev->stats.tx_bytes += bd->stat >> 16;
 	netdev->stats.tx_packets++;
-	return 0;
 }
 
 static int ethoc_tx(struct net_device *dev, int limit)
@@ -527,7 +526,7 @@ static int ethoc_tx(struct net_device *dev, int limit)
 				break;
 		}
 
-		(void)ethoc_update_tx_stats(priv, &bd);
+		ethoc_update_tx_stats(priv, &bd);
 		priv->dty_tx++;
 	}
 
-- 
1.7.1


^ permalink raw reply related

* [PATCH 2/8] ethoc: remove unused spinlock
From: Jonas Bonn @ 2010-11-24 13:40 UTC (permalink / raw)
  To: netdev; +Cc: Jonas Bonn
In-Reply-To: <1290606058-26703-1-git-send-email-jonas@southpole.se>

Signed-off-by: Jonas Bonn <jonas@southpole.se>
---
 drivers/net/ethoc.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
index 6b4b9e1..fc8c044 100644
--- a/drivers/net/ethoc.c
+++ b/drivers/net/ethoc.c
@@ -185,7 +185,6 @@ MODULE_PARM_DESC(buffer_size, "DMA buffer allocation size");
  * @netdev:	pointer to network device structure
  * @napi:	NAPI structure
  * @msg_enable:	device state flags
- * @rx_lock:	receive lock
  * @lock:	device lock
  * @phy:	attached PHY
  * @mdio:	MDIO bus for PHY access
@@ -210,7 +209,6 @@ struct ethoc {
 	struct napi_struct napi;
 	u32 msg_enable;
 
-	spinlock_t rx_lock;
 	spinlock_t lock;
 
 	struct phy_device *phy;
@@ -1061,7 +1059,6 @@ static int __devinit ethoc_probe(struct platform_device *pdev)
 	/* setup NAPI */
 	netif_napi_add(netdev, &priv->napi, ethoc_poll, 64);
 
-	spin_lock_init(&priv->rx_lock);
 	spin_lock_init(&priv->lock);
 
 	ret = register_netdev(netdev);
-- 
1.7.1


^ permalink raw reply related

* [PATCH 1/8] ethoc: Add device tree configuration
From: Jonas Bonn @ 2010-11-24 13:40 UTC (permalink / raw)
  To: netdev; +Cc: Jonas Bonn
In-Reply-To: <1290606058-26703-1-git-send-email-jonas@southpole.se>

This patch adds the ability to describe ethernet devices via a flattened
device tree.  As device tree remains an optional feature, these bits all
need to be guarded by CONFIG_OF ifdefs.

MAC address is settable via the device tree parameter "local-mac-address";
however, the selection of the phy id is limited to probing, for now.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
---
 drivers/net/ethoc.c |   31 ++++++++++++++++++++++++++++++-
 1 files changed, 30 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
index c5a2fe0..6b4b9e1 100644
--- a/drivers/net/ethoc.c
+++ b/drivers/net/ethoc.c
@@ -19,6 +19,7 @@
 #include <linux/platform_device.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/of.h>
 #include <net/ethoc.h>
 
 static int buffer_size = 0x8000; /* 32 KBytes */
@@ -986,7 +987,21 @@ static int __devinit ethoc_probe(struct platform_device *pdev)
 			(struct ethoc_platform_data *)pdev->dev.platform_data;
 		memcpy(netdev->dev_addr, pdata->hwaddr, IFHWADDRLEN);
 		priv->phy_id = pdata->phy_id;
-	}
+	} else {
+		priv->phy_id = -1;
+
+#ifdef CONFIG_OF
+		{
+		uint8_t* mac;
+
+		mac = (uint8_t*)of_get_property(pdev->dev.of_node,
+						"local-mac-address",
+						NULL);
+		if (mac)
+			memcpy(netdev->dev_addr, mac, IFHWADDRLEN);
+		}
+#endif
+	} 
 
 	/* Check that the given MAC address is valid. If it isn't, read the
 	 * current MAC from the controller. */
@@ -1113,6 +1128,16 @@ static int ethoc_resume(struct platform_device *pdev)
 # define ethoc_resume  NULL
 #endif
 
+#ifdef CONFIG_OF
+static struct of_device_id ethoc_match[] = {
+	{
+		.compatible = "opencores,ethoc",
+	},
+	{},
+};
+MODULE_DEVICE_TABLE(of, ethoc_match);
+#endif
+
 static struct platform_driver ethoc_driver = {
 	.probe   = ethoc_probe,
 	.remove  = __devexit_p(ethoc_remove),
@@ -1120,6 +1145,10 @@ static struct platform_driver ethoc_driver = {
 	.resume  = ethoc_resume,
 	.driver  = {
 		.name = "ethoc",
+		.owner = THIS_MODULE,
+#ifdef CONFIG_OF
+		.of_match_table = ethoc_match,
+#endif
 	},
 };
 
-- 
1.7.1


^ permalink raw reply related

* [PATCH 3/8] ethoc: enable interrupts after napi_complete
From: Jonas Bonn @ 2010-11-24 13:40 UTC (permalink / raw)
  To: netdev; +Cc: Adam Edvardsson, Jonas Bonn
In-Reply-To: <1290606058-26703-1-git-send-email-jonas@southpole.se>

From: Adam Edvardsson <adam.edvardsson@orsoc.se>

Occasionally, it seems that some race is causing the interrupts to not be
reenabled otherwise with the end result that networking just stops working.
Enabling interrupts after calling napi_complete is more in line with what
other drivers do.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
---
 drivers/net/ethoc.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
index fc8c044..53c03f2 100644
--- a/drivers/net/ethoc.c
+++ b/drivers/net/ethoc.c
@@ -569,8 +569,8 @@ static int ethoc_poll(struct napi_struct *napi, int budget)
 
 	work_done = ethoc_rx(priv->netdev, budget);
 	if (work_done < budget) {
-		ethoc_enable_irq(priv, INT_MASK_RX);
 		napi_complete(napi);
+		ethoc_enable_irq(priv, INT_MASK_RX);
 	}
 
 	return work_done;
-- 
1.7.1


^ permalink raw reply related

* Re: [PATCH net-next] bnx2x: Disable local BHes to prevent a dead-lock situation
From: Eric Dumazet @ 2010-11-24 14:04 UTC (permalink / raw)
  To: Vladislav Zolotarov; +Cc: Dave Miller, Eilon Greenstein, netdev list
In-Reply-To: <1290606310.25676.7.camel@lb-tlvb-vladz>

Le mercredi 24 novembre 2010 à 15:45 +0200, Vladislav Zolotarov a
écrit :
> From: Eric Dumazet <eric.dumazet@gmail.com>
> 
> According to Eric's suggestion:
> Disable local BHes to prevent a dead-lock situation between sch_direct_xmit()
> (Soft_IRQ context) and bnx2x_tx_int (called by bnx2x_run_loopback() - syscall
> context), as both are taking a netif_tx_lock().
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
> ---
> CCing Eric... :)

Dont worry, I am subscribed to netdev ;)

Thanks !



^ permalink raw reply

* Re: [PATCH] arch/tile: fix rwlock so would-be write lockers don't block new readers
From: Chris Metcalf @ 2010-11-24 14:09 UTC (permalink / raw)
  To: Cypher Wu; +Cc: linux-kernel, Américo Wang, Eric Dumazet, netdev
In-Reply-To: <AANLkTinx3pcJHGrQq_C-Rgk5g7SxjmpTcD0A50d8d48b@mail.gmail.com>

On 11/23/2010 9:53 PM, Cypher Wu wrote:
> 2010/11/24 Chris Metcalf <cmetcalf@tilera.com>:
>> On 11/22/2010 8:36 PM, Cypher Wu wrote:
>>> Say, if core A try to write_lock() rwlock and current_ticket_ is 0 and
>>> it write next_ticket_ to 1, when it processing the lock, core B try to
>>> write_lock() again and write next_ticket_ to 2, then when A
>>> write_unlock() it seen that (current_ticket_+1) is not equal to
>>> next_ticket_, so it increment current_ticket_, and core B get the
>>> lock. If core A try write_lock again before core B write_unlock, it
>>> will increment next_ticket_ to 3. And so on.
>>> This may rarely happened, I've tested it yesterday for several hours
>>> it goes very well under pressure.
>> This should be OK when it happens (other than starving out the readers, but
>> that was the decision made by doing a ticket lock in the first place).
>> Even if we wrap around 255 back to zero on the tickets, the ticket queue
>> will work correctly.  The key is not to need more than 256 concurrent write
>> lock waiters, which we don't.
> If we count on that, should we make 'my_ticket_ = (val >>
> WR_NEXT_SHIFT) & WR_MASK;'

No, it's OK.  As the comment for the declaration of "my_ticket_" says, the
trailing underscore reminds us that the high bits are garbage, and when we
use the value, we do the mask: "((my_ticket_ - curr_) & WR_MASK)".  It
turned out doing the mask here made the most sense from a code-generation
point of view, partly just because of the possibility of the counter wrapping.

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com


^ permalink raw reply

* RE: [PATCH net-next] bnx2x: Disable local BHes to prevent a dead-lock situation
From: Vladislav Zolotarov @ 2010-11-24 14:18 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Dave Miller, Eilon Greenstein, netdev list
In-Reply-To: <1290607444.3464.39.camel@edumazet-laptop>

> > CCing Eric... :)
> 
> Dont worry, I am subscribed to netdev ;)
> 
> Thanks !
> 

I'm sure u r. ;) But u see, Dave is very punctual about a formal
side and the rule says I had to add u. So I fixed myself before
Dave's remark on this... ;)

Thanks again for your help.

Regards,
vlad

^ permalink raw reply

* Re: [PATCH] af_unix: limit unix_tot_inflight
From: Andi Kleen @ 2010-11-24 14:44 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Vegard Nossum, David Miller, LKML, Andrew Morton, Eugene Teo,
	netdev
In-Reply-To: <1290590335.3464.24.camel@edumazet-laptop>

Eric Dumazet <eric.dumazet@gmail.com> writes:
>
> diff --git a/net/unix/garbage.c b/net/unix/garbage.c
> index c8df6fd..40df93d 100644
> --- a/net/unix/garbage.c
> +++ b/net/unix/garbage.c
> @@ -259,9 +259,16 @@ static void inc_inflight_move_tail(struct unix_sock *u)
>  }
>  
>  static bool gc_in_progress = false;
> +#define UNIX_INFLIGHT_TRIGGER_GC 16000

It would be better to define this as a percentage of
lowmem.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply

* RE: [PATCH 1/2] usbnet: changes for upcoming cdc_ncm driver
From: Alexey ORISHKO @ 2010-11-24 15:14 UTC (permalink / raw)
  To: David Brownell
  Cc: gregkh@suse.de, linux-usb@vger.kernel.org, netdev@vger.kernel.org,
	oliver@neukum.org, yauheni.kaliuta@nokia.com, balbi@ti.com,
	Sjur BRENDELAND, Alexey ORISHKO
In-Reply-To: <1290538720.2529.109.camel@helium>

Hi David,

> > -		skb_queue_tail (&dev->done, skb);
> > +	if (skb->len) {
> > +		/* all data was already cloned inside NCM driver */
> 
> Fix this comment.  NCM isn't the only framing policy which un-batches
> RX packets ... RNDIS has done so for a number of years already, and
> more recently EEM needs it too ... plus at least one hardware driver.

Will do.

> Also, check pending patches, since I seem to recall one that supports
> some hardware (SMSC?) that batches, and needed to update the calling
> convention you're using here (i.e. the original one).

Patch was based on linux-next git and I don't see any difference from
net-next. Any particular git you'd like me to look at pending patches?

> 
> 
> > +		if (dev->driver_info->flags & FLAG_MULTI_PACKET)
> 
> except ... you documented this flag as affecting TX paths not RX...

This flag carries two purposes:
- help to correct statistic (Tx/Rx counting)
- handle NCM feature to avoid sending short packets in certain
conditions: cdc_ncm driver has to send short packet (or ZLP) only
if data size is not equal to negotiated data size (dwNtbOutMaxSize).

The other option (instead of dedicated flag) would be to add
additional field in the usbnet context structure, which will keep
dwNtbOutMaxSize value (it still would be better to use
dedicated flag in addition to the value). Thus usbnet would possess
the knowledge when ZLP or short packet should be skipped.
By doing this approach we will eliminate a need to generate short
packet inside the driver and use generic code in usbnet.

Could you comment, what would be the preferable way to get it in the
next version of this patch?

Regards,
Alexey


^ permalink raw reply

* Re: [PATCH] af_unix: limit unix_tot_inflight
From: Eric Dumazet @ 2010-11-24 15:18 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Vegard Nossum, David Miller, LKML, Andrew Morton, Eugene Teo,
	netdev
In-Reply-To: <877hg2g4re.fsf@basil.nowhere.org>

Le mercredi 24 novembre 2010 à 15:44 +0100, Andi Kleen a écrit :
> Eric Dumazet <eric.dumazet@gmail.com> writes:
> >
> > diff --git a/net/unix/garbage.c b/net/unix/garbage.c
> > index c8df6fd..40df93d 100644
> > --- a/net/unix/garbage.c
> > +++ b/net/unix/garbage.c
> > @@ -259,9 +259,16 @@ static void inc_inflight_move_tail(struct unix_sock *u)
> >  }
> >  
> >  static bool gc_in_progress = false;
> > +#define UNIX_INFLIGHT_TRIGGER_GC 16000
> 
> It would be better to define this as a percentage of
> lowmem.
> 

I knew somebody would suggest this ;)

Hmm, why bother ?

Do you think 16000 is too big ? Too small ?

1) What would be the percentage of memory ? 1%, 0.001 % ?

  On a 16TB machine, a percentage will still give huge latencies to the
poor guy that hit the unix_gc().

With 16000, the max latency I had was 11.5 ms (on an Intel E5540
@2.53GHz), instead of more than 2000 ms

I guess it would make more sense to limit to the size of cpu cache
anyway.


2) We currently allocate 4096 bytes (on x86_64) to store one file
pointer, or 2048 bytes on x86_32.

But we can store in it up to 255 files.

 I posted a patch to shrink this to 32 or 16 bytes. Should we then
change the heuristic ?

3) Really who needs more than 16000 inflight unix files ?

  (inflight unix files means : af_unix file descriptors that were sent
(sendfd()) through af_unix, not yet garbage collected.).


4) If we autotune a limit at boot time as a lowmem percentage, some guys
then want a /proc/sys/net/core/max_unix_inflight sysctl , just for
completeness. One extra sysctl... 

I cant see valid uses but programs designed to stress our stack.

^ permalink raw reply

* [PATCH net-next] bnx2x: Do interrupt mode initialization and NAPIs adding before register_netdev()
From: Vladislav Zolotarov @ 2010-11-24 15:21 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list, mchan

Move the interrupt mode configuration and NAPIs adding before a
register_netdev() call to prevent netdev->open() from running
before these functions are done.

Advance a driver version number.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Reported-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/bnx2x/bnx2x.h      |    4 ++--
 drivers/net/bnx2x/bnx2x_main.c |   12 ++++++------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.h b/drivers/net/bnx2x/bnx2x.h
index 863e73a..342ab58 100644
--- a/drivers/net/bnx2x/bnx2x.h
+++ b/drivers/net/bnx2x/bnx2x.h
@@ -20,8 +20,8 @@
  * (you will need to reboot afterwards) */
 /* #define BNX2X_STOP_ON_ERROR */
 
-#define DRV_MODULE_VERSION      "1.60.00-4"
-#define DRV_MODULE_RELDATE      "2010/11/01"
+#define DRV_MODULE_VERSION      "1.60.00-5"
+#define DRV_MODULE_RELDATE      "2010/11/24"
 #define BNX2X_BC_VER            0x040200
 
 #define BNX2X_MULTI_QUEUE
diff --git a/drivers/net/bnx2x/bnx2x_main.c b/drivers/net/bnx2x/bnx2x_main.c
index 92057d7..f53edfd 100644
--- a/drivers/net/bnx2x/bnx2x_main.c
+++ b/drivers/net/bnx2x/bnx2x_main.c
@@ -9096,12 +9096,6 @@ static int __devinit bnx2x_init_one(struct pci_dev *pdev,
 	/* calc qm_cid_count */
 	bp->qm_cid_count = bnx2x_set_qm_cid_count(bp, cid_count);
 
-	rc = register_netdev(dev);
-	if (rc) {
-		dev_err(&pdev->dev, "Cannot register net device\n");
-		goto init_one_exit;
-	}
-
 	/* Configure interupt mode: try to enable MSI-X/MSI if
 	 * needed, set bp->num_queues appropriately.
 	 */
@@ -9110,6 +9104,12 @@ static int __devinit bnx2x_init_one(struct pci_dev *pdev,
 	/* Add all NAPI objects */
 	bnx2x_add_all_napi(bp);
 
+	rc = register_netdev(dev);
+	if (rc) {
+		dev_err(&pdev->dev, "Cannot register net device\n");
+		goto init_one_exit;
+	}
+
 	bnx2x_get_pcie_width_speed(bp, &pcie_width, &pcie_speed);
 
 	netdev_info(dev, "%s (%c%d) PCI-E x%d %s found at mem %lx,"
-- 
1.7.0.4





^ permalink raw reply related

* [PATCH] the macvlan device causes SLAB corruption on network namespace exit.
From: Anders Franzen @ 2010-11-24 15:35 UTC (permalink / raw)
  To: kaber; +Cc: netdev, eric.dumazet, davem, Anders Franzen

When doing exit from a network namespace, cleanup functions are executed from
the function default_device_exit_batch.
In the macvlan device, this will generate a call to macvlan_dellink.
If it is the last macvlan using a ''lowerdev'' the port is
destroyed.

Later the a call to macvlan_stop is executed. when the macvlan_hash_del(vlan) is
executed, the memory of the already freed port will be corrupted.

Signed-off-by: Anders Franzen <anders.franzen@ericsson.com>
---
 drivers/net/macvlan.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 6ed577b..c1cddb6 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -338,7 +338,9 @@ static int macvlan_stop(struct net_device *dev)
 	dev_uc_del(lowerdev, dev->dev_addr);
 
 hash_del:
-	macvlan_hash_del(vlan);
+	if (macvlan_port_exists(lowerdev))
+		macvlan_hash_del(vlan);
+
 	return 0;
 }
 
-- 
1.7.2.3


^ permalink raw reply related

* [PATCH] Make the ip6_tunnel reflect the true mtu.
From: Anders Franzen @ 2010-11-24 15:47 UTC (permalink / raw)
  To: vnuorval, kozakai; +Cc: netdev, eric.dumazet, davem, Anders Franzen

The ip6_tunnel always assumes it consumes 40 bytes (ip6 hdr) of the mtu of the
underlaying device. So for a normal ethernet bearer, the mtu of the ip6_tunnel is
1460.
However, when creating a tunnel the encap limit option is enabled by default, and it
consumes 8 bytes more, so the true mtu shall be 1452.

I dont really know if this breaks some statement in some RFC, so this is a request for
comments.

Signed-off-by: Anders Franzen <anders.franzen@ericsson.com>
---
 net/ipv6/ip6_tunnel.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 2a59610..df948e8 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1175,6 +1175,8 @@ static void ip6_tnl_link_config(struct ip6_tnl *t)
 				sizeof (struct ipv6hdr);
 
 			dev->mtu = rt->rt6i_dev->mtu - sizeof (struct ipv6hdr);
+			if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
+				dev->mtu-=8;
 
 			if (dev->mtu < IPV6_MIN_MTU)
 				dev->mtu = IPV6_MIN_MTU;
@@ -1363,12 +1365,17 @@ static const struct net_device_ops ip6_tnl_netdev_ops = {
 
 static void ip6_tnl_dev_setup(struct net_device *dev)
 {
+	struct ip6_tnl *t = NULL;
+
 	dev->netdev_ops = &ip6_tnl_netdev_ops;
 	dev->destructor = ip6_dev_free;
 
 	dev->type = ARPHRD_TUNNEL6;
 	dev->hard_header_len = LL_MAX_HEADER + sizeof (struct ipv6hdr);
 	dev->mtu = ETH_DATA_LEN - sizeof (struct ipv6hdr);
+	t = netdev_priv(dev);
+	if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
+		dev->mtu-=8;
 	dev->flags |= IFF_NOARP;
 	dev->addr_len = sizeof(struct in6_addr);
 	dev->features |= NETIF_F_NETNS_LOCAL;
-- 
1.7.2.3


^ permalink raw reply related

* Re: [PATCH net-2.6] net, ppp: Report correct error code if unit allocation failed
From: Cyrill Gorcunov @ 2010-11-24 16:25 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: LNML, Paul Mackerras, David Miller, linux-ppp, linux-kernel
In-Reply-To: <20101124124901.GA9268@ff.dom.local>

On Wed, Nov 24, 2010 at 12:49:01PM +0000, Jarek Poplawski wrote:
> On 2010-11-23 22:43, Cyrill Gorcunov wrote:
> > Allocating unit from ird might return several error codes
> > not only -EAGAIN, so it should not be changed and returned
> > precisely. Same time unit release procedure should be invoked
> > only if device is unregistering.
> 
> IMHO this unit release fix should be in a separate patch.
> 

 I thought about it, but still think it should be addressed at
same patch. Though if a separate would be preferred still --
I've no problem in making two patches instead.

> ...
> > @@ -2668,10 +2668,10 @@ static void ppp_shutdown_interface(struc
> >  		ppp->closing = 1;
> >  		ppp_unlock(ppp);
> >  		unregister_netdev(ppp->dev);
> > +		unit_put(&pn->units_idr, ppp->file.index);
> >  	} else
> >  		ppp_unlock(ppp);
> >  
> > -	unit_put(&pn->units_idr, ppp->file.index);
> >  	ppp->file.dead = 1;
> >  	ppp->owner = NULL;
> >  	wake_up_interruptible(&ppp->file.rwait);
> 
> Btw, it seems these last 3 lines could be moved similarly.

yup, at least ppp->file.dead and ppp->owner for sure, I wanted
make this patch 'unit' orientedc and do not touch anything aside,
it should be a separate change.

> 
> Jarek P.
>

Thanks for comments Jarek!
 
  Cyrill

^ permalink raw reply

* Re: [PATCH] af_unix: limit unix_tot_inflight
From: Andi Kleen @ 2010-11-24 16:25 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andi Kleen, Vegard Nossum, David Miller, LKML, Andrew Morton,
	Eugene Teo, netdev
In-Reply-To: <1290611906.3464.66.camel@edumazet-laptop>

> I knew somebody would suggest this ;)
> 
> Hmm, why bother ?
> 
> Do you think 16000 is too big ? Too small ?

I just don't like static limits. Traditionally even the ones
that seemed reasonable at some point were hit by someone
years later.

The latency issue you mention is a valid concern. I guess
an incremental GC would be overkill here ...

-Andi

^ permalink raw reply

* Re: [PATCH 4/8] ethoc: prevent overflow of rx counter
From: Ben Hutchings @ 2010-11-24 16:30 UTC (permalink / raw)
  To: Jonas Bonn; +Cc: netdev
In-Reply-To: <1290606058-26703-5-git-send-email-jonas@southpole.se>

On Wed, 2010-11-24 at 14:40 +0100, Jonas Bonn wrote:
> Rewind cur_rx to prevent it from overflowing.
> 
> Signed-off-by: Jonas Bonn <jonas@southpole.se>
> ---
>  drivers/net/ethoc.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
> index 53c03f2..7d1b5d8 100644
> --- a/drivers/net/ethoc.c
> +++ b/drivers/net/ethoc.c
> @@ -408,6 +408,9 @@ static int ethoc_rx(struct net_device *dev, int limit)
>  	struct ethoc *priv = netdev_priv(dev);
>  	int count;
>  
> +	/* Prevent overflow of priv->cur_rx by rewinding it */	
> +	priv->cur_rx = priv->cur_rx % priv->num_rx;
> +

Division is expensive; you should either use masking (if priv->num_rx is
guaranteed to be a power of 2) or check for overflow whenever you
increment priv->cur_rx:

	if (++priv->cur_rx == priv->num_rx)
		priv->cur_rx = 0;

Ben.

>  	for (count = 0; count < limit; ++count) {
>  		unsigned int entry;
>  		struct ethoc_bd bd;

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox