Netdev List


* RE: [PATCH] netxen: Fix Unlikely(x) > y
From: Dhananjay Phadke @ 2009-10-07  5:34 UTC (permalink / raw)
  To: Roel Kluin, netdev@vger.kernel.org, Andrew Morton
In-Reply-To: <4ACB44DA.1010300@gmail.com>

Oh sure, that was a typo. Thanks for catching it. 

Acked-by: Dhananjay Phadke <dhananjay@netxen.com>

-----Original Message-----
From: Roel Kluin [mailto:roel.kluin@gmail.com] 
Sent: Tuesday, October 06, 2009 06:24
To: Dhananjay Phadke; netdev@vger.kernel.org; Andrew Morton
Subject: [PATCH] netxen: Fix Unlikely(x) > y

The closing parenthesis was not in the right place.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
---
This was intended, I think?

diff --git a/drivers/net/netxen/netxen_nic_main.c b/drivers/net/netxen/netxen_nic_main.c
index b5aa974..9b9eab1 100644
--- a/drivers/net/netxen/netxen_nic_main.c
+++ b/drivers/net/netxen/netxen_nic_main.c
@@ -1714,7 +1714,7 @@ netxen_nic_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 	/* 4 fragments per cmd des */
 	no_of_desc = (frag_count + 3) >> 2;
 
-	if (unlikely(no_of_desc + 2) > netxen_tx_avail(tx_ring)) {
+	if (unlikely(no_of_desc + 2 > netxen_tx_avail(tx_ring))) {
 		netif_stop_queue(netdev);
 		return NETDEV_TX_BUSY;
 	}


* Re: [PATCH] udp: extend hash tables to 256 slots
From: Eric Dumazet @ 2009-10-07  5:33 UTC (permalink / raw)
  To: David Miller; +Cc: rick.jones2, netdev
In-Reply-To: <20091006.222935.231081303.davem@davemloft.net>

David Miller wrote:
> 
> That's incredible that it's been that low for so long :-)
> 
> But please, dynamically size this thing, maybe with a cap of say 64K
> to start with.  If you don't have time for it I'll take care of this.


Well, we can not exceed 65536 slots, given the nature of UDP protocol :)

Do you mean a static allocation at boot time with a size that can be
overridden on the cmdline (like tcp and ip route),

or something that can dynamically extend the hash table at runtime?



* Re: [PATCH] udp: extend hash tables to 256 slots
From: David Miller @ 2009-10-07  5:29 UTC (permalink / raw)
  To: eric.dumazet; +Cc: rick.jones2, netdev
In-Reply-To: <4ACC1C73.1010506@gmail.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 07 Oct 2009 06:43:31 +0200

> David, I believe UDP_HTABLE_SIZE never changed from its initial value of 128,
> defined 15 years ago. Could we bump it to 256 ?
> 
> (back in 1995, SOCK_ARRAY_SIZE was 256)
> 
> (I'll probably use 1024 value for my tests)

That's incredible that it's been that low for so long :-)

But please, dynamically size this thing, maybe with a cap of say 64K
to start with.  If you don't have time for it I'll take care of this.


* Re: [PATCH net-next] myri10ge: add adaptive coalescing
From: Brice Goglin @ 2009-10-07  5:17 UTC (permalink / raw)
  To: Rick Jones; +Cc: David Miller, brice, netdev, gallatin
In-Reply-To: <4ACBEEB2.7010703@hp.com>

Rick Jones wrote:
> David Miller wrote:
>> From: Brice Goglin <brice@myri.com>
>> Date: Tue, 06 Oct 2009 18:52:43 +0200
>>
>>
>>> This patch adds support for adaptive interrupt coalescing to the
>>> myri10ge driver. It is based on the host periodically looking at
>>> statistics and updating the NIC coalescing accordingly.
>>>
>>> The NIC only provides packet throughput and we feel that it is a
>>> better heuristic than the packet rate heuristic currently used
>>> in ethtool. Also, assuming that the packet rate heuristic
>>> uses what is actually sent on the wire when using TSO, it would be
>>> much more expensive to implement correctly, as the driver would
>>> need to calculate how many packets were sent.
>>>
>>> Signed-off-by: Andrew Gallatin <gallatin@myri.com>
>>> Signed-off-by: Brice Goglin <brice@myri.com>
>>
>>
>> Drivers tried to do this as far back as 6 years ago (tg3) and we don't
>> recommend doing this with NAPI drivers.
>
> Doesn't e1000(e) still try to do adaptive coalescing?

mlx_en, benet, sfc, ... do as well.

Brice



* Re: [PATCH net-next] myri10ge: add adaptive coalescing
From: David Miller @ 2009-10-07  5:25 UTC (permalink / raw)
  To: Brice.Goglin, bgoglin; +Cc: rick.jones2, brice, netdev, gallatin
In-Reply-To: <4ACC247A.9020504@free.fr>

From: Brice Goglin <bgoglin@free.fr>
Date: Wed, 07 Oct 2009 07:17:46 +0200

> Rick Jones wrote:
>> David Miller wrote:
>>> From: Brice Goglin <brice@myri.com>
>>> Date: Tue, 06 Oct 2009 18:52:43 +0200
>>>
>>>
>>>> This patch adds support for adaptive interrupt coalescing to the
>>>> myri10ge driver. It is based on the host periodically looking at
>>>> statistics and updating the NIC coalescing accordingly.
>>>>
>>>> The NIC only provides packet throughput and we feel that it is a
>>>> better heuristic than the packet rate heuristic currently used
>>>> in ethtool. Also, assuming that the packet rate heuristic
>>>> uses what is actually sent on the wire when using TSO, it would be
>>>> much more expensive to implement correctly, as the driver would
>>>> need to calculate how many packets were sent.
>>>>
>>>> Signed-off-by: Andrew Gallatin <gallatin@myri.com>
>>>> Signed-off-by: Brice Goglin <brice@myri.com>
>>>
>>>
>>> Drivers tried to do this as far back as 6 years ago (tg3) and we don't
>>> recommend doing this with NAPI drivers.
>>
>> Doesn't e1000(e) still try to do adaptive coalescing?
> 
> mlx_en, benet, sfc, ... do as well.

If the patches that added that code slipped by me, my bad.  But
if I had noticed I would have been against them as well.

It really isn't the right thing to do.  By the same arguments
it is even arguable to turn off TCP congestion control completely
on local subnets.

It is absolutely impossible to react to on-the-wire changes in
traffic patterns at the granularity in which we get to execute.
It simply is not possible to do it right.


* [net-next-2.6 PATCH V2] can: add TI CAN (HECC) driver
From: Anant Gole @ 2009-10-07  5:15 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	linux-arm-kernel-xIg/pKzrS19vn6HldHNs0ANdhmdF6hFW

TI HECC (High End CAN Controller) module is found on many TI devices. It
has 32 hardware mailboxes with full implementation of CAN protocol 2.0B
with bus speeds up to 1Mbps. Specifications of the module are available
on the TI website <http://www.ti.com>

Signed-off-by: Anant Gole <anantgole-l0cyMroinI0@public.gmane.org>
---
 drivers/net/can/Kconfig              |    7 +
 drivers/net/can/Makefile             |    1 +
 drivers/net/can/ti_hecc.c            | 1006 ++++++++++++++++++++++++++++++++++
 include/linux/can/platform/ti_hecc.h |   40 ++
 4 files changed, 1054 insertions(+), 0 deletions(-)
 create mode 100644 drivers/net/can/ti_hecc.c
 create mode 100644 include/linux/can/platform/ti_hecc.h

diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig
index df32c10..57a8733 100644
--- a/drivers/net/can/Kconfig
+++ b/drivers/net/can/Kconfig
@@ -95,6 +95,13 @@ config CAN_AT91
 	---help---
 	  This is a driver for the SoC CAN controller in Atmel's AT91SAM9263.
 
+config CAN_TI_HECC
+	depends on CAN_DEV
+	tristate "TI High End CAN Controller"
+	---help---
+	  Driver for TI HECC (High End CAN Controller) module found on many
+	  TI devices. The device specifications are available from www.ti.com
+
 config CAN_DEBUG_DEVICES
 	bool "CAN devices debugging messages"
 	depends on CAN
diff --git a/drivers/net/can/Makefile b/drivers/net/can/Makefile
index 0dea627..31f4ab5 100644
--- a/drivers/net/can/Makefile
+++ b/drivers/net/can/Makefile
@@ -11,5 +11,6 @@ obj-y				+= usb/
 
 obj-$(CONFIG_CAN_SJA1000)	+= sja1000/
 obj-$(CONFIG_CAN_AT91)		+= at91_can.o
+obj-$(CONFIG_CAN_TI_HECC)	+= ti_hecc.o
 
 ccflags-$(CONFIG_CAN_DEBUG_DEVICES) := -DDEBUG
diff --git a/drivers/net/can/ti_hecc.c b/drivers/net/can/ti_hecc.c
new file mode 100644
index 0000000..814e6c5
--- /dev/null
+++ b/drivers/net/can/ti_hecc.c
@@ -0,0 +1,1006 @@
+/*
+ * TI HECC (CAN) device driver
+ *
+ * This driver supports TI's HECC (High End CAN Controller module) and the
+ * specs for the same are available at <http://www.ti.com>
+ *
+ * Copyright (C) 2009 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed as is WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+/*
+ * Your platform definitions should specify module ram offsets and interrupt
+ * number to use as follows:
+ *
+ * static struct ti_hecc_platform_data am3517_evm_hecc_pdata = {
+ *         .scc_hecc_offset        = 0,
+ *         .scc_ram_offset         = 0x3000,
+ *         .hecc_ram_offset        = 0x3000,
+ *         .mbx_offset             = 0x2000,
+ *         .int_line               = 0,
+ *         .revision               = 1,
+ * };
+ *
+ * See include/linux/can/platform/ti_hecc.h for a description of these fields
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/interrupt.h>
+#include <linux/errno.h>
+#include <linux/netdevice.h>
+#include <linux/skbuff.h>
+#include <linux/platform_device.h>
+#include <linux/clk.h>
+
+#include <linux/can.h>
+#include <linux/can/dev.h>
+#include <linux/can/error.h>
+#include <linux/can/platform/ti_hecc.h>
+
+#define DRV_NAME "ti_hecc"
+#define HECC_MODULE_VERSION     "0.7"
+MODULE_VERSION(HECC_MODULE_VERSION);
+#define DRV_DESC "TI High End CAN Controller Driver " HECC_MODULE_VERSION
+
+/* TX / RX Mailbox Configuration */
+#define HECC_MAX_MAILBOXES	32	/* hardware mailboxes - do not change */
+#define MAX_TX_PRIO		0x3F	/* hardware value - do not change */
+
+/*
+ * Important Note: TX mailbox configuration
+ * TX mailboxes should be restricted to the number of SKB buffers to avoid
+ * maintaining SKB buffers separately. TX mailboxes should be a power of 2
+ * for the mailbox logic to work.  Top mailbox numbers are reserved for RX
+ * and lower mailboxes for TX.
+ *
+ * HECC_MAX_TX_MBOX	HECC_MB_TX_SHIFT
+ * 4 (default)		2
+ * 8			3
+ * 16			4
+ */
+#define HECC_MB_TX_SHIFT	2 /* as per table above */
+#define HECC_MAX_TX_MBOX	BIT(HECC_MB_TX_SHIFT)
+
+#if (HECC_MAX_TX_MBOX > CAN_ECHO_SKB_MAX)
+#error "HECC: MAX TX mailboxes should be equal or less than CAN_ECHO_SKB_MAX"
+#endif
+
+#define HECC_TX_PRIO_SHIFT	(HECC_MB_TX_SHIFT)
+#define HECC_TX_PRIO_MASK	(MAX_TX_PRIO << HECC_MB_TX_SHIFT)
+#define HECC_TX_MB_MASK		(HECC_MAX_TX_MBOX - 1)
+#define HECC_TX_MASK		((HECC_MAX_TX_MBOX - 1) | HECC_TX_PRIO_MASK)
+#define HECC_TX_MBOX_MASK	(~(BIT(HECC_MAX_TX_MBOX) - 1))
+#define HECC_DEF_NAPI_WEIGHT	HECC_MAX_RX_MBOX
+
+/*
+ * Important Note: RX mailbox configuration
+ * RX mailboxes are further logically split into two - main and buffer
+ * mailboxes. The goal is to get all packets into main mailboxes as
+ * driven by mailbox number and receive priority (higher to lower) and
+ * buffer mailboxes are used to receive pkts while main mailboxes are being
+ * processed. This ensures in-order packet reception.
+ *
+ * Here are the recommended values for buffer mailbox. Note that RX mailboxes
+ * start after TX mailboxes:
+ *
+ * HECC_MAX_RX_MBOX		HECC_RX_BUFFER_MBOX	No of buffer mailboxes
+ * 28				12			8
+ * 16				20			4
+ */
+
+#define HECC_MAX_RX_MBOX	(HECC_MAX_MAILBOXES - HECC_MAX_TX_MBOX)
+#define HECC_RX_BUFFER_MBOX	12 /* as per table above */
+#define HECC_RX_FIRST_MBOX	(HECC_MAX_MAILBOXES - 1)
+#define HECC_RX_HIGH_MBOX_MASK	(~(BIT(HECC_RX_BUFFER_MBOX) - 1))
+
+/* TI HECC module registers */
+#define HECC_CANME		0x0	/* Mailbox enable */
+#define HECC_CANMD		0x4	/* Mailbox direction */
+#define HECC_CANTRS		0x8	/* Transmit request set */
+#define HECC_CANTRR		0xC	/* Transmit request reset */
+#define HECC_CANTA		0x10	/* Transmission acknowledge */
+#define HECC_CANAA		0x14	/* Abort acknowledge */
+#define HECC_CANRMP		0x18	/* Receive message pending */
+#define HECC_CANRML		0x1C	/* Receive message lost */
+#define HECC_CANRFP		0x20	/* Remote frame pending */
+#define HECC_CANGAM		0x24	/* SCC only:Global acceptance mask */
+#define HECC_CANMC		0x28	/* Master control */
+#define HECC_CANBTC		0x2C	/* Bit timing configuration */
+#define HECC_CANES		0x30	/* Error and status */
+#define HECC_CANTEC		0x34	/* Transmit error counter */
+#define HECC_CANREC		0x38	/* Receive error counter */
+#define HECC_CANGIF0		0x3C	/* Global interrupt flag 0 */
+#define HECC_CANGIM		0x40	/* Global interrupt mask */
+#define HECC_CANGIF1		0x44	/* Global interrupt flag 1 */
+#define HECC_CANMIM		0x48	/* Mailbox interrupt mask */
+#define HECC_CANMIL		0x4C	/* Mailbox interrupt level */
+#define HECC_CANOPC		0x50	/* Overwrite protection control */
+#define HECC_CANTIOC		0x54	/* Transmit I/O control */
+#define HECC_CANRIOC		0x58	/* Receive I/O control */
+#define HECC_CANLNT		0x5C	/* HECC only: Local network time */
+#define HECC_CANTOC		0x60	/* HECC only: Time-out control */
+#define HECC_CANTOS		0x64	/* HECC only: Time-out status */
+#define HECC_CANTIOCE		0x68	/* SCC only:Enhanced TX I/O control */
+#define HECC_CANRIOCE		0x6C	/* SCC only:Enhanced RX I/O control */
+
+/* Mailbox registers */
+#define HECC_CANMID		0x0
+#define HECC_CANMCF		0x4
+#define HECC_CANMDL		0x8
+#define HECC_CANMDH		0xC
+
+#define HECC_SET_REG		0xFFFFFFFF
+#define HECC_CANID_MASK		0x3FF	/* 18 bits mask for extended id's */
+#define HECC_CCE_WAIT_COUNT     100	/* Wait for ~1 msec for CCE bit */
+
+#define HECC_CANMC_SCM		BIT(13)	/* SCC compat mode */
+#define HECC_CANMC_CCR		BIT(12)	/* Change config request */
+#define HECC_CANMC_PDR		BIT(11)	/* Local Power down - for sleep mode */
+#define HECC_CANMC_ABO		BIT(7)	/* Auto Bus On */
+#define HECC_CANMC_STM		BIT(6)	/* Self test mode - loopback */
+#define HECC_CANMC_SRES		BIT(5)	/* Software reset */
+
+#define HECC_CANTIOC_EN		BIT(3)	/* Enable CAN TX I/O pin */
+#define HECC_CANRIOC_EN		BIT(3)	/* Enable CAN RX I/O pin */
+
+#define HECC_CANMID_IDE		BIT(31)	/* Extended frame format */
+#define HECC_CANMID_AME		BIT(30)	/* Acceptance mask enable */
+#define HECC_CANMID_AAM		BIT(29)	/* Auto answer mode */
+
+#define HECC_CANES_FE		BIT(24)	/* form error */
+#define HECC_CANES_BE		BIT(23)	/* bit error */
+#define HECC_CANES_SA1		BIT(22)	/* stuck at dominant error */
+#define HECC_CANES_CRCE		BIT(21)	/* CRC error */
+#define HECC_CANES_SE		BIT(20)	/* stuff bit error */
+#define HECC_CANES_ACKE		BIT(19)	/* ack error */
+#define HECC_CANES_BO		BIT(18)	/* Bus off status */
+#define HECC_CANES_EP		BIT(17)	/* Error passive status */
+#define HECC_CANES_EW		BIT(16)	/* Error warning status */
+#define HECC_CANES_SMA		BIT(5)	/* suspend mode ack */
+#define HECC_CANES_CCE		BIT(4)	/* Change config enabled */
+#define HECC_CANES_PDA		BIT(3)	/* Power down mode ack */
+
+#define HECC_CANBTC_SAM		BIT(7)	/* sample points */
+
+#define HECC_BUS_ERROR		(HECC_CANES_FE | HECC_CANES_BE |\
+				HECC_CANES_CRCE | HECC_CANES_SE |\
+				HECC_CANES_ACKE)
+
+#define HECC_CANMCF_RTR		BIT(4)	/* Remote transmit request */
+
+#define HECC_CANGIF_MAIF	BIT(17)	/* Message alarm interrupt */
+#define HECC_CANGIF_TCOIF	BIT(16) /* Timer counter overflow int */
+#define HECC_CANGIF_GMIF	BIT(15)	/* Global mailbox interrupt */
+#define HECC_CANGIF_AAIF	BIT(14)	/* Abort ack interrupt */
+#define HECC_CANGIF_WDIF	BIT(13)	/* Write denied interrupt */
+#define HECC_CANGIF_WUIF	BIT(12)	/* Wake up interrupt */
+#define HECC_CANGIF_RMLIF	BIT(11)	/* Receive message lost interrupt */
+#define HECC_CANGIF_BOIF	BIT(10)	/* Bus off interrupt */
+#define HECC_CANGIF_EPIF	BIT(9)	/* Error passive interrupt */
+#define HECC_CANGIF_WLIF	BIT(8)	/* Warning level interrupt */
+#define HECC_CANGIF_MBOX_MASK	0x1F	/* Mailbox number mask */
+#define HECC_CANGIM_I1EN	BIT(1)	/* Int line 1 enable */
+#define HECC_CANGIM_I0EN	BIT(0)	/* Int line 0 enable */
+#define HECC_CANGIM_DEF_MASK	0x700	/* only busoff/warning/passive */
+#define HECC_CANGIM_SIL		BIT(2)	/* system interrupts to int line 1 */
+
+/* CAN Bittiming constants as per HECC specs */
+static struct can_bittiming_const ti_hecc_bittiming_const = {
+	.name = DRV_NAME,
+	.tseg1_min = 1,
+	.tseg1_max = 16,
+	.tseg2_min = 1,
+	.tseg2_max = 8,
+	.sjw_max = 4,
+	.brp_min = 1,
+	.brp_max = 256,
+	.brp_inc = 1,
+};
+
+struct ti_hecc_priv {
+	struct can_priv can;	/* MUST be first member/field */
+	struct napi_struct napi;
+	struct net_device *ndev;
+	struct clk *clk;
+	void __iomem *base;
+	u32 scc_ram_offset;
+	u32 hecc_ram_offset;
+	u32 mbx_offset;
+	u32 int_line;
+	spinlock_t mbx_lock; /* CANME register needs protection */
+	u32 tx_head;
+	u32 tx_tail;
+	u32 rx_next;
+};
+
+static inline int get_tx_head_mb(struct ti_hecc_priv *priv)
+{
+	return priv->tx_head & HECC_TX_MB_MASK;
+}
+
+static inline int get_tx_tail_mb(struct ti_hecc_priv *priv)
+{
+	return priv->tx_tail & HECC_TX_MB_MASK;
+}
+
+static inline int get_tx_head_prio(struct ti_hecc_priv *priv)
+{
+	return (priv->tx_head >> HECC_TX_PRIO_SHIFT) & MAX_TX_PRIO;
+}
+
+static inline void hecc_write_lam(struct ti_hecc_priv *priv, u32 mbxno, u32 val)
+{
+	__raw_writel(val, priv->base + priv->hecc_ram_offset + mbxno * 4);
+}
+
+static inline void hecc_write_mbx(struct ti_hecc_priv *priv, u32 mbxno,
+	u32 reg, u32 val)
+{
+	__raw_writel(val, priv->base + priv->mbx_offset + mbxno * 0x10 +
+			reg);
+}
+
+static inline u32 hecc_read_mbx(struct ti_hecc_priv *priv, u32 mbxno, u32 reg)
+{
+	return __raw_readl(priv->base + priv->mbx_offset + mbxno * 0x10 +
+			reg);
+}
+
+static inline void hecc_write(struct ti_hecc_priv *priv, u32 reg, u32 val)
+{
+	__raw_writel(val, priv->base + reg);
+}
+
+static inline u32 hecc_read(struct ti_hecc_priv *priv, int reg)
+{
+	return __raw_readl(priv->base + reg);
+}
+
+static inline void hecc_set_bit(struct ti_hecc_priv *priv, int reg,
+	u32 bit_mask)
+{
+	hecc_write(priv, reg, hecc_read(priv, reg) | bit_mask);
+}
+
+static inline void hecc_clear_bit(struct ti_hecc_priv *priv, int reg,
+	u32 bit_mask)
+{
+	hecc_write(priv, reg, hecc_read(priv, reg) & ~bit_mask);
+}
+
+static inline u32 hecc_get_bit(struct ti_hecc_priv *priv, int reg, u32 bit_mask)
+{
+	return (hecc_read(priv, reg) & bit_mask) ? 1 : 0;
+}
+
+static int ti_hecc_get_state(const struct net_device *ndev,
+	enum can_state *state)
+{
+	struct ti_hecc_priv *priv = netdev_priv(ndev);
+
+	*state = priv->can.state;
+	return 0;
+}
+
+static int ti_hecc_set_btc(struct ti_hecc_priv *priv)
+{
+	struct can_bittiming *bit_timing = &priv->can.bittiming;
+	u32 can_btc;
+
+	can_btc = (bit_timing->phase_seg2 - 1) & 0x7;
+	can_btc |= ((bit_timing->phase_seg1 + bit_timing->prop_seg - 1)
+			& 0xF) << 3;
+	if (priv->can.ctrlmode & CAN_CTRLMODE_3_SAMPLES) {
+		if (bit_timing->brp > 4)
+			can_btc |= HECC_CANBTC_SAM;
+		else
+			dev_warn(priv->ndev->dev.parent, "WARN: Triple " \
+				"sampling not set due to h/w limitations\n");
+	}
+	can_btc |= ((bit_timing->sjw - 1) & 0x3) << 8;
+	can_btc |= ((bit_timing->brp - 1) & 0xFF) << 16;
+
+	/* ERM being set to 0 by default meaning resync at falling edge */
+
+	hecc_write(priv, HECC_CANBTC, can_btc);
+	dev_info(priv->ndev->dev.parent, "setting CANBTC=%#x\n", can_btc);
+
+	return 0;
+}
+
+static void ti_hecc_reset(struct net_device *ndev)
+{
+	u32 cnt;
+	struct ti_hecc_priv *priv = netdev_priv(ndev);
+
+	dev_dbg(ndev->dev.parent, "resetting hecc ...\n");
+	hecc_set_bit(priv, HECC_CANMC, HECC_CANMC_SRES);
+
+	/* Set change config request and wait till enabled */
+	hecc_set_bit(priv, HECC_CANMC, HECC_CANMC_CCR);
+
+	/*
+	 * INFO: It has been observed that at times CCE bit may not be
+	 * set and hw seems to be ok even if this bit is not set so
+	 * timing out after 1ms to respect the specs
+	 */
+	cnt = HECC_CCE_WAIT_COUNT;
+	while (!hecc_get_bit(priv, HECC_CANES, HECC_CANES_CCE) && cnt != 0) {
+		--cnt;
+		udelay(10);
+	}
+
+	/*
+	 * Note: On HECC, BTC can be programmed only in initialization mode, so
+	 * it is expected that the can bittiming parameters are set via ip
+	 * utility before the device is opened
+	 */
+	ti_hecc_set_btc(priv);
+
+	/* Clear CCR (and CANMC register) and wait for CCE = 0 */
+	hecc_write(priv, HECC_CANMC, 0);
+
+	/*
+	 * INFO: CAN net stack handles bus off and hence disabling auto-bus-on
+	 * hecc_set_bit(priv, HECC_CANMC, HECC_CANMC_ABO);
+	 */
+
+	/*
+	 * INFO: As noted above, CCE may not behave as specified while the hw
+	 * seems to be ok, so this wait is bounded by the same 1ms timeout
+	 */
+	cnt = HECC_CCE_WAIT_COUNT;
+	while (hecc_get_bit(priv, HECC_CANES, HECC_CANES_CCE) && cnt != 0) {
+		--cnt;
+		udelay(10);
+	}
+
+	/* Enable TX and RX I/O Control pins */
+	hecc_write(priv, HECC_CANTIOC, HECC_CANTIOC_EN);
+	hecc_write(priv, HECC_CANRIOC, HECC_CANRIOC_EN);
+
+	/* Clear registers for clean operation */
+	hecc_write(priv, HECC_CANTA, HECC_SET_REG);
+	hecc_write(priv, HECC_CANRMP, HECC_SET_REG);
+	hecc_write(priv, HECC_CANGIF0, HECC_SET_REG);
+	hecc_write(priv, HECC_CANGIF1, HECC_SET_REG);
+	hecc_write(priv, HECC_CANME, 0);
+	hecc_write(priv, HECC_CANMD, 0);
+
+	/* SCC compat mode NOT supported (and not needed either) */
+	hecc_set_bit(priv, HECC_CANMC, HECC_CANMC_SCM);
+}
+
+static void ti_hecc_start(struct net_device *ndev)
+{
+	struct ti_hecc_priv *priv = netdev_priv(ndev);
+	u32 cnt, mbxno, mbx_mask;
+
+	/* put HECC in initialization mode and set btc */
+	ti_hecc_reset(ndev);
+
+	priv->tx_head = priv->tx_tail = HECC_TX_MASK;
+	priv->rx_next = HECC_RX_FIRST_MBOX;
+
+	/* Enable local and global acceptance mask registers */
+	hecc_write(priv, HECC_CANGAM, HECC_SET_REG);
+
+	/* Prepare configured mailboxes to receive messages */
+	for (cnt = 0; cnt < HECC_MAX_RX_MBOX; cnt++) {
+		mbxno = HECC_MAX_MAILBOXES - 1 - cnt;
+		mbx_mask = BIT(mbxno);
+		hecc_clear_bit(priv, HECC_CANME, mbx_mask);
+		hecc_write_mbx(priv, mbxno, HECC_CANMID, HECC_CANMID_AME);
+		hecc_write_lam(priv, mbxno, HECC_SET_REG);
+		hecc_set_bit(priv, HECC_CANMD, mbx_mask);
+		hecc_set_bit(priv, HECC_CANME, mbx_mask);
+		hecc_set_bit(priv, HECC_CANMIM, mbx_mask);
+	}
+
+	/* Prevent message over-write & Enable interrupts */
+	hecc_write(priv, HECC_CANOPC, HECC_SET_REG);
+	if (priv->int_line) {
+		hecc_write(priv, HECC_CANMIL, HECC_SET_REG);
+		hecc_write(priv, HECC_CANGIM, HECC_CANGIM_DEF_MASK |
+			HECC_CANGIM_I1EN | HECC_CANGIM_SIL);
+	} else {
+		hecc_write(priv, HECC_CANMIL, 0);
+		hecc_write(priv, HECC_CANGIM,
+			HECC_CANGIM_DEF_MASK | HECC_CANGIM_I0EN);
+	}
+	priv->can.state = CAN_STATE_ERROR_ACTIVE;
+}
+
+static void ti_hecc_stop(struct net_device *ndev)
+{
+	struct ti_hecc_priv *priv = netdev_priv(ndev);
+
+	/* Disable interrupts and disable mailboxes */
+	hecc_write(priv, HECC_CANGIM, 0);
+	hecc_write(priv, HECC_CANMIM, 0);
+	hecc_write(priv, HECC_CANME, 0);
+	priv->can.state = CAN_STATE_STOPPED;
+}
+
+static int ti_hecc_do_set_mode(struct net_device *ndev, enum can_mode mode)
+{
+	int ret = 0;
+
+	switch (mode) {
+	case CAN_MODE_START:
+		ti_hecc_start(ndev);
+		netif_wake_queue(ndev);
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+		break;
+	}
+
+	return ret;
+}
+
+/*
+ * ti_hecc_xmit: HECC Transmit
+ *
+ * The transmit mailboxes run from 0 to HECC_MAX_TX_MBOX - 1. In HECC the
+ * priority of the mailbox for transmission is dependent upon priority setting
+ * field in mailbox registers. The mailbox with highest value in priority field
+ * is transmitted first. Only when two mailboxes have the same value in
+ * priority field the highest numbered mailbox is transmitted first.
+ *
+ * To utilize the HECC priority feature as described above we start with the
+ * highest numbered mailbox with highest priority level and move on to the next
+ * mailbox with the same priority level and so on. Once we loop through all the
+ * transmit mailboxes we choose the next priority level (lower) and so on
+ * until we reach the lowest priority level on the lowest numbered mailbox
+ * when we stop transmission until all mailboxes are transmitted and then
+ * restart at highest numbered mailbox with highest priority.
+ *
+ * Two counters (head and tail) are used to track the next mailbox to transmit
+ * and to track the echo buffer for already transmitted mailbox. The queue
+ * is stopped when all the mailboxes are busy or when a priority value
+ * roll-over happens.
+ */
+static netdev_tx_t ti_hecc_xmit(struct sk_buff *skb, struct net_device *ndev)
+{
+	struct ti_hecc_priv *priv = netdev_priv(ndev);
+	struct can_frame *cf = (struct can_frame *)skb->data;
+	u32 mbxno, mbx_mask, data;
+	unsigned long flags;
+
+	mbxno = get_tx_head_mb(priv);
+	mbx_mask = BIT(mbxno);
+	spin_lock_irqsave(&priv->mbx_lock, flags);
+	if (unlikely(hecc_read(priv, HECC_CANME) & mbx_mask)) {
+		spin_unlock_irqrestore(&priv->mbx_lock, flags);
+		netif_stop_queue(ndev);
+		dev_err(priv->ndev->dev.parent,
+			"BUG: TX mbx not ready tx_head=%08X, tx_tail=%08X\n",
+			priv->tx_head, priv->tx_tail);
+		return NETDEV_TX_BUSY;
+	}
+	spin_unlock_irqrestore(&priv->mbx_lock, flags);
+
+	/* Prepare mailbox for transmission */
+	data = min_t(u8, cf->can_dlc, 8);
+	if (cf->can_id & CAN_RTR_FLAG) /* Remote transmission request */
+		data |= HECC_CANMCF_RTR;
+	data |= get_tx_head_prio(priv) << 8;
+	hecc_write_mbx(priv, mbxno, HECC_CANMCF, data);
+
+	if (cf->can_id & CAN_EFF_FLAG) /* Extended frame format */
+		data = (cf->can_id & CAN_EFF_MASK) | HECC_CANMID_IDE;
+	else /* Standard frame format */
+		data = (cf->can_id & CAN_SFF_MASK) << 18;
+	hecc_write_mbx(priv, mbxno, HECC_CANMID, data);
+	hecc_write_mbx(priv, mbxno, HECC_CANMDL,
+		be32_to_cpu(*(u32 *)(cf->data)));
+	if (cf->can_dlc > 4)
+		hecc_write_mbx(priv, mbxno, HECC_CANMDH,
+			be32_to_cpu(*(u32 *)(cf->data + 4)));
+	else
+		*(u32 *)(cf->data + 4) = 0;
+	can_put_echo_skb(skb, ndev, mbxno);
+
+	spin_lock_irqsave(&priv->mbx_lock, flags);
+	--priv->tx_head;
+	if ((hecc_read(priv, HECC_CANME) & BIT(get_tx_head_mb(priv))) ||
+		(priv->tx_head & HECC_TX_MASK) == HECC_TX_MASK) {
+		netif_stop_queue(ndev);
+	}
+	hecc_set_bit(priv, HECC_CANME, mbx_mask);
+	spin_unlock_irqrestore(&priv->mbx_lock, flags);
+
+	hecc_clear_bit(priv, HECC_CANMD, mbx_mask);
+	hecc_set_bit(priv, HECC_CANMIM, mbx_mask);
+	hecc_write(priv, HECC_CANTRS, mbx_mask);
+
+	return NETDEV_TX_OK;
+}
+
+static int ti_hecc_rx_pkt(struct ti_hecc_priv *priv, int mbxno)
+{
+	struct net_device_stats *stats = &priv->ndev->stats;
+	struct can_frame *cf;
+	struct sk_buff *skb;
+	u32 data, mbx_mask;
+	unsigned long flags;
+
+	skb = netdev_alloc_skb(priv->ndev, sizeof(struct can_frame));
+	if (!skb) {
+		if (printk_ratelimit())
+			dev_err(priv->ndev->dev.parent,
+				"ti_hecc_rx_pkt: netdev_alloc_skb() failed\n");
+		return -ENOMEM;
+	}
+	skb->protocol = __constant_htons(ETH_P_CAN);
+	skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+	mbx_mask = BIT(mbxno);
+	cf = (struct can_frame *)skb_put(skb, sizeof(struct can_frame));
+	data = hecc_read_mbx(priv, mbxno, HECC_CANMID);
+	if (data & HECC_CANMID_IDE)
+		cf->can_id = (data & CAN_EFF_MASK) | CAN_EFF_FLAG;
+	else
+		cf->can_id = (data >> 18) & CAN_SFF_MASK;
+	data = hecc_read_mbx(priv, mbxno, HECC_CANMCF);
+	if (data & HECC_CANMCF_RTR)
+		cf->can_id |= CAN_RTR_FLAG;
+	cf->can_dlc = data & 0xF;
+	data = hecc_read_mbx(priv, mbxno, HECC_CANMDL);
+	*(u32 *)(cf->data) = cpu_to_be32(data);
+	if (cf->can_dlc > 4) {
+		data = hecc_read_mbx(priv, mbxno, HECC_CANMDH);
+		*(u32 *)(cf->data + 4) = cpu_to_be32(data);
+	} else {
+		*(u32 *)(cf->data + 4) = 0;
+	}
+	spin_lock_irqsave(&priv->mbx_lock, flags);
+	hecc_clear_bit(priv, HECC_CANME, mbx_mask);
+	hecc_write(priv, HECC_CANRMP, mbx_mask);
+	/* enable mailbox only if it is part of rx buffer mailboxes */
+	if (priv->rx_next < HECC_RX_BUFFER_MBOX)
+		hecc_set_bit(priv, HECC_CANME, mbx_mask);
+	spin_unlock_irqrestore(&priv->mbx_lock, flags);
+
+	stats->rx_bytes += cf->can_dlc;
+	netif_receive_skb(skb);
+	stats->rx_packets++;
+
+	return 0;
+}
+
+/*
+ * ti_hecc_rx_poll - HECC receive pkts
+ *
+ * The receive mailboxes start from highest numbered mailbox till last xmit
+ * mailbox. On CAN frame reception the hardware places the data into highest
+ * numbered mailbox that matches the CAN ID filter. Since all receive mailboxes
+ * have same filtering (ALL CAN frames) packets will arrive in the highest
+ * available RX mailbox and we need to ensure in-order packet reception.
+ *
+ * To ensure the packets are received in the right order we logically divide
+ * the RX mailboxes into main and buffer mailboxes. Packets are received as per
+ * mailbox priority (higher to lower) in the main bank and once it is full we
+ * disable further reception into main mailboxes. While the main mailboxes are
+ * processed in NAPI, further packets are received in buffer mailboxes.
+ *
+ * We maintain an RX next mailbox counter to process packets and once all main
+ * mailbox packets are passed to the upper stack we enable all of them but
+ * continue to process packets received in buffer mailboxes. With each packet
+ * received from buffer mailbox we enable it immediately so as to handle the
+ * overflow from higher mailboxes.
+ */
+static int ti_hecc_rx_poll(struct napi_struct *napi, int quota)
+{
+	struct net_device *ndev = napi->dev;
+	struct ti_hecc_priv *priv = netdev_priv(ndev);
+	u32 num_pkts = 0;
+	u32 mbx_mask;
+	unsigned long pending_pkts, flags;
+
+	if (!netif_running(ndev))
+		return 0;
+
+	while ((pending_pkts = hecc_read(priv, HECC_CANRMP)) &&
+		num_pkts < quota) {
+		mbx_mask = BIT(priv->rx_next); /* next rx mailbox to process */
+		if (mbx_mask & pending_pkts) {
+			if (ti_hecc_rx_pkt(priv, priv->rx_next) < 0)
+				return num_pkts;
+			++num_pkts;
+		} else if (priv->rx_next > HECC_RX_BUFFER_MBOX) {
+			break; /* pkt not received yet */
+		}
+		--priv->rx_next;
+		if (priv->rx_next == HECC_RX_BUFFER_MBOX) {
+			/* enable high bank mailboxes */
+			spin_lock_irqsave(&priv->mbx_lock, flags);
+			mbx_mask = hecc_read(priv, HECC_CANME);
+			mbx_mask |= HECC_RX_HIGH_MBOX_MASK;
+			hecc_write(priv, HECC_CANME, mbx_mask);
+			spin_unlock_irqrestore(&priv->mbx_lock, flags);
+		} else if (priv->rx_next == HECC_MAX_TX_MBOX - 1) {
+			priv->rx_next = HECC_RX_FIRST_MBOX;
+			break;
+		}
+	}
+
+	/* Enable packet interrupt if all pkts are handled */
+	if (hecc_read(priv, HECC_CANRMP) == 0) {
+		napi_complete(napi);
+		/* Re-enable RX mailbox interrupts */
+		mbx_mask = hecc_read(priv, HECC_CANMIM);
+		mbx_mask |= HECC_TX_MBOX_MASK;
+		hecc_write(priv, HECC_CANMIM, mbx_mask);
+	}
+
+	return num_pkts;
+}
+
+static int ti_hecc_error(struct net_device *ndev, int int_status,
+	int err_status)
+{
+	struct ti_hecc_priv *priv = netdev_priv(ndev);
+	struct net_device_stats *stats = &ndev->stats;
+	struct can_frame *cf;
+	struct sk_buff *skb;
+
+	/* propagate the error condition to the can stack */
+	skb = netdev_alloc_skb(ndev, sizeof(struct can_frame));
+	if (!skb) {
+		if (printk_ratelimit())
+			dev_err(priv->ndev->dev.parent,
+				"ti_hecc_error: netdev_alloc_skb() failed\n");
+		return -ENOMEM;
+	}
+	skb->protocol = __constant_htons(ETH_P_CAN);
+	skb->ip_summed = CHECKSUM_UNNECESSARY;
+	cf = (struct can_frame *)skb_put(skb, sizeof(struct can_frame));
+	memset(cf, 0, sizeof(struct can_frame));
+	cf->can_id = CAN_ERR_FLAG;
+	cf->can_dlc = CAN_ERR_DLC;
+
+	if (int_status & HECC_CANGIF_WLIF) { /* warning level int */
+		if ((int_status & HECC_CANGIF_BOIF) == 0) {
+			priv->can.state = CAN_STATE_ERROR_WARNING;
+			++priv->can.can_stats.error_warning;
+			cf->can_id |= CAN_ERR_CRTL;
+			if (hecc_read(priv, HECC_CANTEC) > 96)
+				cf->data[1] |= CAN_ERR_CRTL_TX_WARNING;
+			if (hecc_read(priv, HECC_CANREC) > 96)
+				cf->data[1] |= CAN_ERR_CRTL_RX_WARNING;
+		}
+		hecc_set_bit(priv, HECC_CANES, HECC_CANES_EW);
+		dev_dbg(priv->ndev->dev.parent, "Error Warning interrupt\n");
+		hecc_clear_bit(priv, HECC_CANMC, HECC_CANMC_CCR);
+	}
+
+	if (int_status & HECC_CANGIF_EPIF) { /* error passive int */
+		if ((int_status & HECC_CANGIF_BOIF) == 0) {
+			priv->can.state = CAN_STATE_ERROR_PASSIVE;
+			++priv->can.can_stats.error_passive;
+			cf->can_id |= CAN_ERR_CRTL;
+			if (hecc_read(priv, HECC_CANTEC) > 127)
+				cf->data[1] |= CAN_ERR_CRTL_TX_PASSIVE;
+			if (hecc_read(priv, HECC_CANREC) > 127)
+				cf->data[1] |= CAN_ERR_CRTL_RX_PASSIVE;
+		}
+		hecc_set_bit(priv, HECC_CANES, HECC_CANES_EP);
+		dev_dbg(priv->ndev->dev.parent, "Error passive interrupt\n");
+		hecc_clear_bit(priv, HECC_CANMC, HECC_CANMC_CCR);
+	}
+
+	/*
+	 * Need to check busoff condition in error status register too to
+	 * ensure warning interrupts don't hog the system
+	 */
+	if ((int_status & HECC_CANGIF_BOIF) || (err_status & HECC_CANES_BO)) {
+		priv->can.state = CAN_STATE_BUS_OFF;
+		cf->can_id |= CAN_ERR_BUSOFF;
+		hecc_set_bit(priv, HECC_CANES, HECC_CANES_BO);
+		hecc_clear_bit(priv, HECC_CANMC, HECC_CANMC_CCR);
+		/* Disable all interrupts in bus-off to avoid int hog */
+		hecc_write(priv, HECC_CANGIM, 0);
+		can_bus_off(ndev);
+	}
+
+	if (err_status & HECC_BUS_ERROR) {
+		++priv->can.can_stats.bus_error;
+		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
+		cf->data[2] |= CAN_ERR_PROT_UNSPEC;
+		if (err_status & HECC_CANES_FE) {
+			hecc_set_bit(priv, HECC_CANES, HECC_CANES_FE);
+			cf->data[2] |= CAN_ERR_PROT_FORM;
+		}
+		if (err_status & HECC_CANES_BE) {
+			hecc_set_bit(priv, HECC_CANES, HECC_CANES_BE);
+			cf->data[2] |= CAN_ERR_PROT_BIT;
+		}
+		if (err_status & HECC_CANES_SE) {
+			hecc_set_bit(priv, HECC_CANES, HECC_CANES_SE);
+			cf->data[2] |= CAN_ERR_PROT_STUFF;
+		}
+		if (err_status & HECC_CANES_CRCE) {
+			hecc_set_bit(priv, HECC_CANES, HECC_CANES_CRCE);
+			cf->data[2] |= CAN_ERR_PROT_LOC_CRC_SEQ |
+					CAN_ERR_PROT_LOC_CRC_DEL;
+		}
+		if (err_status & HECC_CANES_ACKE) {
+			hecc_set_bit(priv, HECC_CANES, HECC_CANES_ACKE);
+			cf->data[2] |= CAN_ERR_PROT_LOC_ACK |
+					CAN_ERR_PROT_LOC_ACK_DEL;
+		}
+	}
+
+	stats->rx_packets++;
+	stats->rx_bytes += cf->can_dlc;
+	netif_receive_skb(skb);
+	return 0;
+}
+
+static irqreturn_t ti_hecc_interrupt(int irq, void *dev_id)
+{
+	struct net_device *ndev = (struct net_device *)dev_id;
+	struct ti_hecc_priv *priv = netdev_priv(ndev);
+	struct net_device_stats *stats = &ndev->stats;
+	u32 mbxno, mbx_mask, int_status, err_status;
+	unsigned long ack, flags;
+
+	int_status = hecc_read(priv,
+		(priv->int_line) ? HECC_CANGIF1 : HECC_CANGIF0);
+
+	if (!int_status)
+		return IRQ_NONE;
+
+	err_status = hecc_read(priv, HECC_CANES);
+	if (err_status & (HECC_BUS_ERROR | HECC_CANES_BO |
+			HECC_CANES_EP | HECC_CANES_EW))
+		ti_hecc_error(ndev, int_status, err_status);
+
+	if (int_status & HECC_CANGIF_GMIF) {
+		while (priv->tx_tail - priv->tx_head > 0) {
+			mbxno = get_tx_tail_mb(priv);
+			mbx_mask = BIT(mbxno);
+			if (!(mbx_mask & hecc_read(priv, HECC_CANTA)))
+				break;
+			hecc_clear_bit(priv, HECC_CANMIM, mbx_mask);
+			hecc_write(priv, HECC_CANTA, mbx_mask);
+			spin_lock_irqsave(&priv->mbx_lock, flags);
+			hecc_clear_bit(priv, HECC_CANME, mbx_mask);
+			spin_unlock_irqrestore(&priv->mbx_lock, flags);
+			stats->tx_bytes += hecc_read_mbx(priv, mbxno,
+						HECC_CANMCF) & 0xF;
+			stats->tx_packets++;
+			can_get_echo_skb(ndev, mbxno);
+			--priv->tx_tail;
+		}
+
+		/* restart queue if wrap-up or if queue stalled on last pkt */
+		if (((priv->tx_head == priv->tx_tail) &&
+		((priv->tx_head & HECC_TX_MASK) != HECC_TX_MASK)) ||
+		(((priv->tx_tail & HECC_TX_MASK) == HECC_TX_MASK) &&
+		((priv->tx_head & HECC_TX_MASK) == HECC_TX_MASK)))
+			netif_wake_queue(ndev);
+
+		/* Disable RX mailbox interrupts and let NAPI reenable them */
+		if (hecc_read(priv, HECC_CANRMP)) {
+			ack = hecc_read(priv, HECC_CANMIM);
+			ack &= BIT(HECC_MAX_TX_MBOX) - 1;
+			hecc_write(priv, HECC_CANMIM, ack);
+			napi_schedule(&priv->napi);
+		}
+	}
+
+	/* clear all interrupt conditions - read back to avoid spurious ints */
+	if (priv->int_line) {
+		hecc_write(priv, HECC_CANGIF1, HECC_SET_REG);
+		int_status = hecc_read(priv, HECC_CANGIF1);
+	} else {
+		hecc_write(priv, HECC_CANGIF0, HECC_SET_REG);
+		int_status = hecc_read(priv, HECC_CANGIF0);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static int ti_hecc_open(struct net_device *ndev)
+{
+	struct ti_hecc_priv *priv = netdev_priv(ndev);
+	int err;
+
+	err = request_irq(ndev->irq, ti_hecc_interrupt, IRQF_SHARED,
+			ndev->name, ndev);
+	if (err) {
+		dev_err(ndev->dev.parent, "error requesting interrupt\n");
+		return err;
+	}
+
+	/* Open common can device */
+	err = open_candev(ndev);
+	if (err) {
+		dev_err(ndev->dev.parent, "open_candev() failed %d\n", err);
+		free_irq(ndev->irq, ndev);
+		return err;
+	}
+
+	clk_enable(priv->clk);
+	ti_hecc_start(ndev);
+	napi_enable(&priv->napi);
+	netif_start_queue(ndev);
+
+	return 0;
+}
+
+static int ti_hecc_close(struct net_device *ndev)
+{
+	struct ti_hecc_priv *priv = netdev_priv(ndev);
+
+	netif_stop_queue(ndev);
+	napi_disable(&priv->napi);
+	ti_hecc_stop(ndev);
+	free_irq(ndev->irq, ndev);
+	clk_disable(priv->clk);
+	close_candev(ndev);
+
+	return 0;
+}
+
+static const struct net_device_ops ti_hecc_netdev_ops = {
+	.ndo_open		= ti_hecc_open,
+	.ndo_stop		= ti_hecc_close,
+	.ndo_start_xmit		= ti_hecc_xmit,
+};
+
+static int ti_hecc_probe(struct platform_device *pdev)
+{
+	struct net_device *ndev = NULL;
+	struct ti_hecc_priv *priv;
+	struct ti_hecc_platform_data *pdata;
+	struct resource *mem, *irq;
+	void __iomem *addr;
+	int err = -ENODEV;
+
+	pdata = pdev->dev.platform_data;
+	if (!pdata) {
+		dev_err(&pdev->dev, "No platform data\n");
+		goto probe_exit;
+	}
+
+	mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!mem) {
+		dev_err(&pdev->dev, "No mem resources\n");
+		goto probe_exit;
+	}
+	irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
+	if (!irq) {
+		dev_err(&pdev->dev, "No irq resource\n");
+		goto probe_exit;
+	}
+	if (!request_mem_region(mem->start, resource_size(mem), pdev->name)) {
+		dev_err(&pdev->dev, "HECC region already claimed\n");
+		err = -EBUSY;
+		goto probe_exit;
+	}
+	addr = ioremap(mem->start, resource_size(mem));
+	if (!addr) {
+		dev_err(&pdev->dev, "ioremap failed\n");
+		err = -ENOMEM;
+		goto probe_exit_free_region;
+	}
+
+	ndev = alloc_candev(sizeof(struct ti_hecc_priv));
+	if (!ndev) {
+		dev_err(&pdev->dev, "alloc_candev failed\n");
+		err = -ENOMEM;
+		goto probe_exit_iounmap;
+	}
+
+	priv = netdev_priv(ndev);
+	priv->ndev = ndev;
+	priv->base = addr;
+	priv->scc_ram_offset = pdata->scc_ram_offset;
+	priv->hecc_ram_offset = pdata->hecc_ram_offset;
+	priv->mbx_offset = pdata->mbx_offset;
+	priv->int_line = pdata->int_line;
+
+	priv->can.bittiming_const = &ti_hecc_bittiming_const;
+	priv->can.do_set_mode = ti_hecc_do_set_mode;
+	priv->can.do_get_state = ti_hecc_get_state;
+
+	ndev->irq = irq->start;
+	ndev->flags |= IFF_ECHO;
+	platform_set_drvdata(pdev, ndev);
+	SET_NETDEV_DEV(ndev, &pdev->dev);
+	ndev->netdev_ops = &ti_hecc_netdev_ops;
+
+	priv->clk = clk_get(&pdev->dev, "hecc_ck");
+	if (IS_ERR(priv->clk)) {
+		dev_err(&pdev->dev, "No clock available\n");
+		err = PTR_ERR(priv->clk);
+		priv->clk = NULL;
+		goto probe_exit_candev;
+	}
+	priv->can.clock.freq = clk_get_rate(priv->clk);
+	netif_napi_add(ndev, &priv->napi, ti_hecc_rx_poll,
+		HECC_DEF_NAPI_WEIGHT);
+
+	err = register_candev(ndev);
+	if (err) {
+		dev_err(&pdev->dev, "register_candev() failed\n");
+		goto probe_exit_clk;
+	}
+	dev_info(&pdev->dev, "device registered (reg_base=%p, irq=%u)\n",
+		priv->base, (u32) ndev->irq);
+
+	return 0;
+
+probe_exit_clk:
+	clk_put(priv->clk);
+probe_exit_candev:
+	free_candev(ndev);
+probe_exit_iounmap:
+	iounmap(addr);
+probe_exit_free_region:
+	release_mem_region(mem->start, resource_size(mem));
+probe_exit:
+	return err;
+}
+
+static int __devexit ti_hecc_remove(struct platform_device *pdev)
+{
+	struct resource *res;
+	struct net_device *ndev = platform_get_drvdata(pdev);
+	struct ti_hecc_priv *priv = netdev_priv(ndev);
+
+	unregister_candev(ndev);
+	clk_put(priv->clk);
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	iounmap(priv->base);
+	release_mem_region(res->start, resource_size(res));
+	free_candev(ndev);
+	platform_set_drvdata(pdev, NULL);
+
+	return 0;
+}
+
+/* TI HECC netdevice driver: platform driver structure */
+static struct platform_driver ti_hecc_driver = {
+	.driver = {
+		.name    = DRV_NAME,
+		.owner   = THIS_MODULE,
+	},
+	.probe = ti_hecc_probe,
+	.remove = __devexit_p(ti_hecc_remove),
+};
+
+static int __init ti_hecc_init_driver(void)
+{
+	printk(KERN_INFO DRV_DESC "\n");
+	return platform_driver_register(&ti_hecc_driver);
+}
+module_init(ti_hecc_init_driver);
+
+static void __exit ti_hecc_exit_driver(void)
+{
+	printk(KERN_INFO DRV_DESC " unloaded\n");
+	platform_driver_unregister(&ti_hecc_driver);
+}
+module_exit(ti_hecc_exit_driver);
+
+MODULE_AUTHOR("Anant Gole <anantgole-l0cyMroinI0@public.gmane.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION(DRV_DESC);
diff --git a/include/linux/can/platform/ti_hecc.h b/include/linux/can/platform/ti_hecc.h
new file mode 100644
index 0000000..4688c7b
--- /dev/null
+++ b/include/linux/can/platform/ti_hecc.h
@@ -0,0 +1,40 @@
+/*
+ * TI HECC (High End CAN Controller) driver platform header
+ *
+ * Copyright (C) 2009 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed as is WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+/**
+ * struct ti_hecc_platform_data - HECC Platform Data
+ *
+ * @scc_hecc_offset:	mostly 0 - should really never change
+ * @scc_ram_offset:	SCC RAM offset
+ * @hecc_ram_offset:	HECC RAM offset
+ * @mbx_offset:		Mailbox RAM offset
+ * @int_line:		Interrupt line to use - 0 or 1
+ * @version:		version for future use
+ *
+ * Platform data structure to get all platform specific settings.
+ * This structure also accounts for the fact that the IP may have
+ * different RAM and mailbox offsets for different SoCs.
+ */
+struct ti_hecc_platform_data {
+	u32 scc_hecc_offset;
+	u32 scc_ram_offset;
+	u32 hecc_ram_offset;
+	u32 mbx_offset;
+	u32 int_line;
+	u32 version;
+};
+
+
-- 
1.6.2.4


* [PATCH] udp: extend hash tables to 256 slots
From: Eric Dumazet @ 2009-10-07  4:43 UTC (permalink / raw)
  To: David S. Miller; +Cc: Rick Jones, Linux Netdev List
In-Reply-To: <4ACC0CDE.1020907@gmail.com>

Eric Dumazet a écrit :
> I was going to set up a bench lab, with a typical RTP mediaserver, with say
> 4000 UDP sockets, 2000 sockets exchanging 50 G.711 Alaw/ulaw
> messages per second tx and rx. (Total : 100.000 packets per second each way)
> 

Hmm, unfortunately it seems we'll have too many sockets per UDP hash chain for this
workload to show any improvement.

(~32 sockets per chain : average of 16 misses to look up the target socket.)

David, I believe UDP_HTABLE_SIZE never changed from its initial value of 128,
defined 15 years ago. Could we bump it to 256 ?

(back in 1995, SOCK_ARRAY_SIZE was 256)

(I'll probably use 1024 value for my tests)

[PATCH] udp: extend hash tables to 256 slots

UDP_HTABLE_SIZE was initially defined to 128, which is a bit small for several setups.
4000 active sockets -> 32 sockets per chain on average.

Doubling the hash table size has a memory cost of 128 (pointers + spinlocks) for UDP,
same for UDPLite, this should be OK.

It reduces the size of the bitmap used in udp_lib_get_port() and speeds up port allocation.
#define PORTS_PER_CHAIN (65536 / UDP_HTABLE_SIZE) -> 256 bits instead of 512 bits

Use CONFIG_BASE_SMALL to keep hash tables small for small machines.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/include/linux/udp.h b/include/linux/udp.h
index 0cf5c4c..8aaa151 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -45,7 +45,7 @@ static inline struct udphdr *udp_hdr(const struct sk_buff *skb)
 	return (struct udphdr *)skb_transport_header(skb);
 }

-#define UDP_HTABLE_SIZE		128
+#define UDP_HTABLE_SIZE		(CONFIG_BASE_SMALL ? 128 : 256)

 static inline int udp_hashfn(struct net *net, const unsigned num)
 {
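
As a quick check of the sizing arithmetic in the commit message above, here is
a minimal userspace sketch (not kernel code). The helper names
ports_per_chain() and avg_chain_len() are made up for illustration, and the
sketch assumes sockets spread evenly over the chains, which a real port
distribution need not do.

```c
#include <assert.h>

#define UDP_PORT_SPACE 65536u	/* UDP port numbers are 16 bits wide */

/*
 * Number of ports that map to one hash chain. Per the commit message this
 * is also the width, in bits, of the bitmap used in udp_lib_get_port().
 */
static unsigned int ports_per_chain(unsigned int table_size)
{
	/* both sizes in the patch (128 and 256) are powers of two */
	assert(table_size != 0 && (table_size & (table_size - 1)) == 0);
	return UDP_PORT_SPACE / table_size;
}

/* Average chain length for evenly spread sockets, rounded up. */
static unsigned int avg_chain_len(unsigned int nsocks, unsigned int table_size)
{
	assert(table_size != 0);
	return (nsocks + table_size - 1) / table_size;
}
```

With these helpers, ports_per_chain(128) is 512 and ports_per_chain(256) is
256, while 4000 sockets average 32 per chain at 128 slots and 16 per chain at
256 slots.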


* Re: [RFC net-next-2.6] net: speedup sk_wake_async()
From: Eric Dumazet @ 2009-10-07  3:37 UTC (permalink / raw)
  To: Rick Jones; +Cc: David S. Miller, Linux Netdev List
In-Reply-To: <4ACBE3E7.60404@hp.com>

Rick Jones a écrit :
> 
> How about 64-bit?

No data yet, but a larger footprint unfortunately :-(

> 
> Got any netperf service demand changes?

I was going to set up a bench lab, with a typical RTP mediaserver, with say
4000 UDP sockets, 2000 sockets exchanging 50 G.711 Alaw/ulaw
messages per second tx and rx. (Total : 100.000 packets per second each way)

Is netperf able to simulate this workload ?


* Re: skb_shinfo(skb)->nr_frags > 0 while skb_is_gso(skb) == 0?
From: Michael Chan @ 2009-10-07  1:37 UTC (permalink / raw)
  To: John Wright; +Cc: netdev@vger.kernel.org, Bob Montgomery
In-Reply-To: <20091007010315.GA26498@neptune.jswright>

On Tue, 2009-10-06 at 18:03 -0700, John Wright wrote:
> So, first, a question for someone who knows more about sk_buff's than
> I:
> is it reasonable/legal for an skb for which skb_is_gso(skb) == 0 to
> also
> have skb_shinfo(skb)->nr_frags > 0?
> 

As Stephen pointed out, yes.

> If yes, then for Michael, or someone familiar with bnx2 hardware: are
> "partial BD completions" (where the hw_tx_cons value is on a ring
> index
> that, on the tx_buf_ring, would have a NULL skb value, and on the
> tx_desc_ring, the tx_bd would not have the TX_BD_FLAGS_START flag set)
> possible only for gso skb's, or is it possible any time nr_frags > 0?
> 

Partial BD completions are only possible on TSO/GSO packets, unless
there is a hardware bug that we haven't found during all these years.

Several years ago, the same crash in bnx2 was found to be caused by HTB
corrupting nr_frags while the skb was queued by the driver.  That issue
has been fixed.  I wonder if skb->gso_size can change on us while the
skb is queued or we still have another case of changing nr_frags.

I think it will be good if you can run the same test on 2.6.31 where
is_gso and nr_frags are cached.  If the chip really does partial BD
completions on TX, you should still see the same issue whether we cache
these values or not.  If it doesn't crash on 2.6.31, then it may be
something else.

Thanks for the detailed debugging information.


* Re: [PATCH net-next] myri10ge: add adaptive coalescing
From: Rick Jones @ 2009-10-07  1:28 UTC (permalink / raw)
  To: David Miller; +Cc: brice, netdev, gallatin
In-Reply-To: <20091006.172537.148962576.davem@davemloft.net>

David Miller wrote:
> From: Brice Goglin <brice@myri.com>
> Date: Tue, 06 Oct 2009 18:52:43 +0200
> 
> 
>>This patch adds support for adaptive interrupt coalescing to the
>>myri10ge driver. It is based on the host periodically looking at
>>statistics and updating the NIC coalescing accordingly.
>>
>>The NIC only provides packet throughput and we feel that it is a
>>better heuristic than the packet rate heuristic currently used
>>in ethtool. Also, assuming that the packet rate heuristic
>>uses what is actually sent on the wire when using TSO, it would be
>>much more expensive to implement correctly, as the driver would
>>need to calculate how many packets were sent.
>>
>>Signed-off-by: Andrew Gallatin <gallatin@myri.com>
>>Signed-off-by: Brice Goglin <brice@myri.com>
> 
> 
> Drivers tried to do this as far back as 6 years ago (tg3) and we don't
> recommend doing this with NAPI drivers.

Doesn't e1000(e) still try to do adaptive coalescing?

rick jones


* Re: skb_shinfo(skb)->nr_frags > 0 while skb_is_gso(skb) == 0?
From: Stephen Hemminger @ 2009-10-07  1:21 UTC (permalink / raw)
  To: John Wright; +Cc: netdev, Michael Chan, Bob Montgomery
In-Reply-To: <20091007010315.GA26498@neptune.jswright>

On Tue, 6 Oct 2009 19:03:15 -0600
John Wright <john.wright@hp.com> wrote:

> Hello,
> 
> Bob Montgomery and I are debugging an OOPS in the bnx2 driver.  The
> driver OOPSes in bnx2_tx_int(), getting a NULL pointer dereference when
> checking if the skb is GSO.  (This is on 2.6.29, before is_gso was
> cached in the tx_buf (commit d62fda08), but bear with me - while kernels
> with that commit might not crash in the same place, I think we have
> discovered a bug that would manifest itself another way.)
> 
> So, first, a question for someone who knows more about sk_buff's than I:
> is it reasonable/legal for an skb for which skb_is_gso(skb) == 0 to also
> have skb_shinfo(skb)->nr_frags > 0?

Yes, if the driver supports Scatter/Gather and Checksum offload,
TCP (especially splice) will hand fragmented frames to the device.

I don't know what assumptions the driver is making that could cause your
issue.


* skb_shinfo(skb)->nr_frags > 0 while skb_is_gso(skb) == 0?
From: John Wright @ 2009-10-07  1:03 UTC (permalink / raw)
  To: netdev, Michael Chan; +Cc: Bob Montgomery

[-- Attachment #1: Type: text/plain, Size: 7447 bytes --]

Hello,

Bob Montgomery and I are debugging an OOPS in the bnx2 driver.  The
driver OOPSes in bnx2_tx_int(), getting a NULL pointer dereference when
checking if the skb is GSO.  (This is on 2.6.29, before is_gso was
cached in the tx_buf (commit d62fda08), but bear with me - while kernels
with that commit might not crash in the same place, I think we have
discovered a bug that would manifest itself another way.)

So, first, a question for someone who knows more about sk_buff's than I:
is it reasonable/legal for an skb for which skb_is_gso(skb) == 0 to also
have skb_shinfo(skb)->nr_frags > 0?

If yes, then for Michael, or someone familiar with bnx2 hardware: are
"partial BD completions" (where the hw_tx_cons value is on a ring index
that, on the tx_buf_ring, would have a NULL skb value, and on the
tx_desc_ring, the tx_bd would not have the TX_BD_FLAGS_START flag set)
possible only for gso skb's, or is it possible any time nr_frags > 0?

We have a crash where both of the above conditions seem to be met.  Our
crash actually occurs in skb_is_gso(skb) - when dereferencing skb, which
is NULL - called from this chunk of code in bnx2_tx_int():

...
                tx_buf = &txr->tx_buf_ring[sw_ring_cons];
                skb = tx_buf->skb;

                /* partial BD completions possible with TSO packets */
                if (skb_is_gso(skb)) {
                        u16 last_idx, last_ring_idx;

                        last_idx = sw_cons +
                                skb_shinfo(skb)->nr_frags + 1;
...

An analysis of the disassembly shows that, at the time of the OOPS, RCX
contains the bnapi passed to bnx2_tx_int, RBX contains in its lower two
bytes the value of sw_cons, and R12 contains the value of hw_cons, the
last read value of the hardware consumer pointer.  (See the disassembly
of bnx2_poll_work, which all this was inlined into, at [1].)  Now, from
the OOPS:

[120223.249838] RAX: 000000000000005b RBX: 000000005746575b RCX: ffff8801a43cd680
[120223.249838] R10: 0000000000000006 R11: ffff8800562f6bc0 R12: 000000000000575a

Already, we see hw_cons is 0x575a (corresponding to a ring index of
0x5a), and our sw_cons (which is never supposed to go beyond hw_cons) is
0x575b.  How did this happen?

Unfortunately, it looks like another CPU ran bnx2_start_xmit and put
stuff on this ring before the OOPS handler finished running and
crash_kexec() was called.  So in our actual dump, it looks like the skb
we tried to read was not really NULL:

crash> struct bnx2_napi.tx_ring ffff8801a43cd680
  tx_ring = {
    tx_prod_bseq = 0xf8c4f09b, 
    tx_prod = 0x5783, 
    tx_bidx_addr = 0x12588, 
    tx_bseq_addr = 0x12590, 
    tx_desc_ring = 0xffff8801a3cb4000, 
    tx_buf_ring = 0xffff880325e52000, 
    tx_cons = 0x5746, 
    hw_tx_cons = 0x5746, 
    tx_desc_mapping = 0x1a3cb4000
  }
crash> p *((struct sw_tx_bd *)(0xffff880325e52000) + 0x5b)
$20 = {
  skb = 0xffff8801574c2780
}

But we'll assume for now that the kernel didn't lie, and that it really
was NULL at the time of the OOPS. :)  What's interesting, though, is
the place where the card thinks it is:

crash> p *((struct sw_tx_bd *)(0xffff880325e52000) + 0x5a)
$21 = {
  skb = 0x0
}

Ok, we might expect this if we've just cleared that skb, but we couldn't
have, since that would mean our sw_cons value would equal hw_cons, and
we wouldn't have performed that iteration of the loop.  (We only try
to clean up a particular ring index after the card has *passed* it.)
Inside of the tx_desc_ring for this bnapi is something more interesting:

crash> p *((struct tx_bd *)(0xffff8801a3cb4000) + 0x5a)
$22 = {
  tx_bd_haddr_hi = 0x2, 
  tx_bd_haddr_lo = 0x5a8acd31, 
  tx_bd_mss_nbytes = 0x36, 
  tx_bd_vlan_tag_flags = 0x42
}

But 0x42 == 0100 0010 == TX_BD_FLAGS_TCP_UDP_CKSUM | TX_BD_FLAGS_END.
Note, TX_BD_FLAGS_START is not one of its flags, and yet it's also not
GSO - if it were, we'd expect TX_BD_FLAGS_SW_LSO (1 << 15) to be set.
See bnx2_start_xmit:

...
        if ((mss = skb_shinfo(skb)->gso_size)) {
                u32 tcp_opt_len;
                struct iphdr *iph;

                vlan_tag_flags |= TX_BD_FLAGS_SW_LSO;
...

(I have verified that this flag, among others, is indeed set on a tx_bd
whose skb we still have and has non-zero gso_size.)

If we go back one, we find a tx_bd with the TX_BD_FLAGS_START set, but
not TX_BD_FLAGS_END:

crash> p *((struct tx_bd *)(0xffff8801a3cb4000) + 0x59)
$24 = {
  tx_bd_haddr_hi = 0x1, 
  tx_bd_haddr_lo = 0xf915d3e, 
  tx_bd_mss_nbytes = 0x42, 
  tx_bd_vlan_tag_flags = 0x82
}

So, presumably, the corresponding skb would have nr_frags = 1.
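
The flag decoding above can be sketched in plain C. The macro names mirror the
driver's, but the bit positions for CKSUM, END and START are inferred from the
0x42 and 0x82 values quoted here rather than copied from bnx2.h, so treat them
as assumptions; only SW_LSO (1 << 15) is stated directly. bd_position() and
bd_is_lso() are hypothetical helpers.

```c
#include <stdint.h>

/* Assumed bit layout, inferred from 0x42 == CKSUM | END and 0x82 == CKSUM | START */
#define TX_BD_FLAGS_TCP_UDP_CKSUM	(1u << 1)
#define TX_BD_FLAGS_END			(1u << 6)
#define TX_BD_FLAGS_START		(1u << 7)
#define TX_BD_FLAGS_SW_LSO		(1u << 15)

enum bd_kind { BD_SINGLE, BD_FIRST, BD_MIDDLE, BD_LAST };

/* Where a buffer descriptor sits within the chain of BDs for one skb. */
static enum bd_kind bd_position(uint32_t flags)
{
	int start = (flags & TX_BD_FLAGS_START) != 0;
	int end = (flags & TX_BD_FLAGS_END) != 0;

	if (start && end)
		return BD_SINGLE;
	if (start)
		return BD_FIRST;
	if (end)
		return BD_LAST;
	return BD_MIDDLE;
}

/* Nonzero if the descriptor was queued for a GSO/LSO skb. */
static int bd_is_lso(uint32_t flags)
{
	return (flags & TX_BD_FLAGS_SW_LSO) != 0;
}
```

Under that layout, 0x82 (index 0x59) decodes as the first BD of a packet and
0x42 (index 0x5a) as the last one, and neither has the LSO bit set.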

That said, here's what we *think* is happening:

At the end of one of the iterations through the
while(sw_cons != hw_cons) loop, we have sw_cons == 0x5759, and

  hw_cons = bnx2_get_hw_tx_cons(bnapi);  /* hw_cons = 0x575a */

Then, next time around the loop, sw_cons != hw_cons, so we go back
inside the loop.  But since the skb is not gso, we don't detect that the
card is in the middle of the fragmented skb, and then we go ahead and
increment sw_cons (nr_frags + 1) times, which puts it at 0x575b.  Then,
we clean up an skb and a DMA mapping that the card might still be using,
and go around the loop again.  And we'll keep doing so, until the card
happens to catch up with us or we crash.  I'm not sure what happens to
the card when the skb's and DMA mappings get ripped out from under it,
but I suppose it's not good - in this case, we happened not even to have
any more skb's in the tx_buf_ring to go through, so we crashed before we
could find out. :)

I think this patch (against v2.6.29 - after d62fda08, is_gso and
nr_frags are actually stored in the tx_buf - see attached patch for one
that would apply against linux-2.6.git HEAD) would solve this issue:

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index a7e688a..db075c8 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -2615,8 +2615,8 @@ bnx2_tx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int 
                tx_buf = &txr->tx_buf_ring[sw_ring_cons];
                skb = tx_buf->skb;

-               /* partial BD completions possible with TSO packets */
-               if (skb_is_gso(skb)) {
+               /* partial BD completions possible with fragmented packets */
+               if (skb_shinfo(skb)->nr_frags) {
                        u16 last_idx, last_ring_idx;

                        last_idx = sw_cons +
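
To make the suspected sequence concrete, here is a small userspace model of
the reclaim decision. It is only a sketch under stated assumptions: struct
model_tx_buf and the helpers are hypothetical, and the driver's adjustment for
ring page boundaries is left out. check_frags = 0 models the existing
skb_is_gso() test, check_frags = 1 the nr_frags test from the patch above.

```c
#include <stdint.h>

/* Hypothetical stand-in for what the driver knows about one queued skb. */
struct model_tx_buf {
	uint16_t nr_frags;	/* fragment BDs following the head BD */
	int is_gso;
};

/* Index just past the last BD of the skb whose head BD is at sw_cons. */
static uint16_t last_idx_of(const struct model_tx_buf *buf, uint16_t sw_cons)
{
	return (uint16_t)(sw_cons + buf->nr_frags + 1);
}

/* True if a is ahead of b by 1..0x7fff; wrap-safe on 16-bit indices. */
static int idx_after(uint16_t a, uint16_t b)
{
	uint16_t d = (uint16_t)(a - b);

	return d != 0 && d < 0x8000;
}

/* 1 if the skb at sw_cons would be reclaimed now, 0 if the loop would break. */
static int may_reclaim(const struct model_tx_buf *buf, uint16_t sw_cons,
		       uint16_t hw_cons, int check_frags)
{
	int guarded = check_frags ? (buf->nr_frags != 0) : buf->is_gso;

	/* partial completion: hardware has not passed the last BD yet */
	if (guarded && idx_after(last_idx_of(buf, sw_cons), hw_cons))
		return 0;
	return 1;
}
```

For a non-GSO skb with nr_frags = 1 at sw_cons = 0x5759 and hw_cons = 0x575a,
the GSO-only test allows the reclaim and sw_cons advances to 0x575b, past
hw_cons; the nr_frags test holds off until hw_cons reaches 0x575b.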

Unfortunately, we haven't been able to reproduce this crash reliably.
(Under heavy network load, we've seen multiple crashes on the same
instruction; but this one seems to be the first that clearly exhibits
the "sw_cons overtakes hw_cons" behavior.  We're still analyzing some
older dumps to see if we can find evidence of the same behavior causing
those crashes.)

Can somebody who knows a bit more about sk_buff's and fragmenting
and/or the bnx2 chipsets comment on this?

I have attached the OOPS message and an equivalent patch that applies
against current linux-2.6.git, and posted a copy of the bnx2_poll_work
disassembly at [1].  I can also provide the debug vmlinux, modules, and
crash dump if necessary.

 [1]: http://free.linux.hp.com/~jswright/bnx2-crash-data/

Thanks!

-- 
+----------------------------------------------------------+
| John Wright <john.wright@hp.com>                         |
| HP Mission Critical OS Enablement & Solution Test (MOST) |
+----------------------------------------------------------+

[-- Attachment #2: oops-message.txt --]
[-- Type: text/plain, Size: 4984 bytes --]

[120222.643171] BUG: unable to handle kernel NULL pointer dereference at 00000000000000cc
[120222.644007] IP: [<ffffffffa0089c65>] bnx2_poll_work+0xd5/0x1133 [bnx2]
[120222.644007] PGD 178498067 PUD 11f002067 PMD 0 
[120222.644007] Oops: 0000 [#1] SMP 
[120222.644007] last sysfs file: /sys/devices/system/cpu/cpu15/crash_notes
[120222.644007] CPU 0 
[120222.644007] Modules linked in: sctp crc32c libcrc32c ipmi_devintf nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc deflate zlib_deflate ctr twofish twofish_common camellia serpent blowfish des_generic cbc aes_x86_64 aes_generic xcbc rmd160 sha256_generic sha1_generic crypto_null af_key loop psmouse ipmi_si ipmi_msghandler hpilo serio_raw button container evdev ext3 jbd mbcache usbhid hid sg sr_mod cdrom ide_pci_generic ide_core ata_generic ata_piix libata ehci_hcd uhci_hcd cciss bnx2 zlib_inflate e1000e scsi_mod thermal processor fan thermal_sys
[120223.249838] Pid: 15233, comm: mirrorclient Not tainted 2.6.29-clim-3-amd64 #1 ProLiant DL380 G6
[120223.249838] RIP: 0010:[<ffffffffa0089c65>]  [<ffffffffa0089c65>] bnx2_poll_work+0xd5/0x1133 [bnx2]
[120223.249838] RSP: 0000:ffffffff8072ccf0  EFLAGS: 00010286
[120223.249838] RAX: 000000000000005b RBX: 000000005746575b RCX: ffff8801a43cd680
[120223.249838] RDX: ffff8801a43cd680 RSI: 000000000000005b RDI: ffff88007cfb4a28
[120223.249838] RBP: ffffffff8072ce60 R08: 0000000000000000 R09: 000000000000ad77
[120223.249838] R10: 0000000000000006 R11: ffff8800562f6bc0 R12: 000000000000575a
[120223.249838] R13: ffff880325e522d8 R14: 0000000000000000 R15: ffff8801a43c6b00
[120223.249838] FS:  0000000042df8950(0063) GS:ffffffff80735000(0000) knlGS:0000000000000000
[120223.249838] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[120223.249838] CR2: 00000000000000cc CR3: 000000003215c000 CR4: 00000000000006e0
[120223.249838] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[120223.249838] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[120223.249838] Process mirrorclient (pid: 15233, threadinfo ffff880035842000, task ffff880185c4b320)
[120223.249838] Stack:
[120223.249838]  000000000000001e 000000004d8d1012 0000000000000040 ffff8801a43cd680
[120223.249838]  ffff8801a43cc700 ffffffff80227322 ffffffff8072cd90 ffffffffa006765f
[120223.249838]  ffffffff8072cd70 ffff8801a4284000 ffff880325eb8800 ffff8801a425c940
[120223.249838] Call Trace:
[120223.249838]  <IRQ> <0> [<ffffffff80227322>] ? swiotlb_map_single_phys+0x0/0x18
[120223.249838]  [<ffffffffa006765f>] ? e1000_alloc_rx_buffers+0x13b/0x203 [e1000e]
[120225.260273]  [<ffffffffa0067a17>] ? e1000_clean_rx_irq+0x290/0x2cb [e1000e]
[120225.260273]  [<ffffffff80248291>] ? irq_exit+0x4c/0x79
[120225.401539]  [<ffffffffa008acf6>] bnx2_poll_msix+0x33/0xad [bnx2]
[120225.401539]  [<ffffffff804162e8>] net_rx_action+0xae/0x1a1
[120225.401539]  [<ffffffff80248584>] __do_softirq+0x8a/0x132
[120225.401539]  [<ffffffff8021251c>] call_softirq+0x1c/0x30
[120225.401539]  [<ffffffff8021362c>] do_softirq+0x44/0x8f
[120225.401539]  [<ffffffff80248284>] irq_exit+0x3f/0x79
[120225.770044]  [<ffffffff802138a6>] do_IRQ+0xc3/0xe5
[120225.770044]  [<ffffffff80211c93>] ret_from_intr+0x0/0x29
[120225.894313]  <EOI> <0> [<ffffffff8035d4fd>] ? copy_user_generic_string+0x2d/0x40
[120225.967993]  [<ffffffff80412b57>] ? memcpy_toiovec+0x37/0x67
[120226.056077]  [<ffffffff8041321e>] ? skb_copy_datagram_iovec+0x4b/0x1d8
[120226.056077]  [<ffffffff8044f389>] ? tcp_rcv_established+0x344/0xabb
[120226.220504]  [<ffffffff80456fa8>] ? tcp_v4_do_rcv+0x1b1/0x35e
[120226.289418]  [<ffffffff8025680d>] ? prepare_to_wait+0x60/0x69
[120226.289418]  [<ffffffff8040d2a7>] ? sk_wait_data+0xc3/0xd1
[120226.289418]  [<ffffffff80445955>] ? tcp_prequeue_process+0x73/0x89
[120226.289418]  [<ffffffff80446c40>] ? tcp_recvmsg+0x4eb/0xb18
[120226.289418]  [<ffffffff8040c022>] ? sock_common_recvmsg+0x32/0x47
[120226.289418]  [<ffffffff8040a6c2>] ? sock_recvmsg+0x10e/0x133
[120226.680001]  [<ffffffff8040a838>] ? sock_sendmsg+0xfd/0x120
[120226.680001]  [<ffffffff802565f0>] ? autoremove_wake_function+0x0/0x38
[120226.680001]  [<ffffffff802d0b4e>] ? core_sys_select+0x1df/0x254
[120226.680001]  [<ffffffff8024ca7a>] ? mod_timer+0x3d/0x43
[120226.680001]  [<ffffffff8040d0fe>] ? sk_reset_timer+0x17/0x27
[120226.680001]  [<ffffffff8020f6fc>] ? __switch_to+0xb9/0x3a5
[120226.680001]  [<ffffffff8040b805>] ? sys_recvfrom+0xa3/0xf8
[120226.680001]  [<ffffffff80240157>] ? finish_task_switch+0x2b/0xc8
[120226.680001]  [<ffffffff804c4a1d>] ? thread_return+0x3d/0xd4
[120226.680001]  [<ffffffff802111aa>] ? system_call_fastpath+0x16/0x1b
[120226.680001] Code: 00 00 00 41 01 c4 e9 f1 00 00 00 48 8b 8d a8 fe ff ff 0f b6 f3 0f b7 c6 4c 8d 2c c5 00 00 00 00 4c 03 a9 00 02 00 00 4d 8b 75 00 <41> 8b 86 cc 00 00 00 49 03 86 d0 00 00 00 66 83 78 06 00 74 24 
[120226.680001] RIP  [<ffffffffa0089c65>] bnx2_poll_work+0xd5/0x1133 [bnx2]
[120226.680001]  RSP <ffffffff8072ccf0>
[120226.680001] CR2: 00000000000000cc

[-- Attachment #3: bnx2-crash.patch --]
[-- Type: text/x-diff, Size: 567 bytes --]

diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 08cddb6..7a2dc58 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -2797,8 +2797,8 @@ bnx2_tx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
 		/* prefetch skb_end_pointer() to speedup skb_shinfo(skb) */
 		prefetch(&skb->end);

-		/* partial BD completions possible with TSO packets */
-		if (tx_buf->is_gso) {
+		/* partial BD completions possible with fragmented packets */
+		if (tx_buf->nr_frags) {
 			u16 last_idx, last_ring_idx;

 			last_idx = sw_cons + tx_buf->nr_frags + 1;


* Re: [RFC net-next-2.6] net: speedup sk_wake_async()
From: Rick Jones @ 2009-10-07  0:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, Linux Netdev List
In-Reply-To: <4ACBCDD8.5000306@gmail.com>

Eric Dumazet wrote:
> Latency works, part 1
> 
> 
> An incoming datagram must bring into cpu cache *lot* of cache lines,
> in particular : (other parts omitted (hash chains, ip route cache...))
> 
> On 32bit arches :

How about 64-bit?

> offsetof(struct sock, sk_rcvbuf)       =0x30    (read)
> offsetof(struct sock, sk_lock)         =0x34   (rw)
> 
> offsetof(struct sock, sk_sleep)        =0x50 (read)
> offsetof(struct sock, sk_rmem_alloc)   =0x64   (rw)
> offsetof(struct sock, sk_receive_queue)=0x74   (rw)
> 
> offsetof(struct sock, sk_forward_alloc)=0x98   (rw)
> 
> offsetof(struct sock, sk_callback_lock)=0xcc    (rw)
> offsetof(struct sock, sk_drops)        =0xd8 (read if we add dropcount support, rw if frame dropped)
> offsetof(struct sock, sk_filter)       =0xf8    (read)
> 
> offsetof(struct sock, sk_socket)       =0x138 (read)
> 
> offsetof(struct sock, sk_data_ready)   =0x15c   (read)
> 
> 
> We can avoid sk->sk_socket and socket->fasync_list referencing on sockets
> with no fasync() structures. (socket->fasync_list ptr is probably already in cache
> because it shares a cache line with socket->wait, ie location pointed by sk->sk_sleep)
> 
> This avoids one cache line load per incoming packet for common cases (no fasync())
> 
> We can leave (or even move in a future patch) sk->sk_socket in a cold location
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Got any netperf service demand changes?

rick jones

^ permalink raw reply

* Re: [PATCH 0/4] ISDN patches for 2.6.32 (v3)
From: David Miller @ 2009-10-07  0:36 UTC (permalink / raw)
  To: tilman; +Cc: netdev, linux-kernel, isdn, keil, isdn4linux, i4ldeveloper
In-Reply-To: <20091006-patch-capi-0.tilman@imap.cc>

From: Tilman Schmidt <tilman@imap.cc>
Date: Wed,  7 Oct 2009 00:17:55 +0200 (CEST)

> as I still haven't heard anything from Karsten and none of the other
> ISDN developers has volunteered to step into the breach, I would like to
> get back to the suggestion of merging my ISDN patches through you.
> 
> Here is a respin of the four patches for the Kernel CAPI subsystem I
> first submitted on 2009-09-06. I have folded into patch 1 the amendment
> I posted as "patch 5/4" on 2009-09-25, and re-added to patches 2 and 3
> Karsten's original Acked-by that came out of the discussion on the
> isdn4linux mailing list. Apart from that, the series is unchanged from
> the second submission on 2009-09-19.
> 
> I'll be following up with the respun patches to the Gigaset driver.
> 
> It would be great if these could still be integrated into kernel release
> 2.6.32.

Ok, I'll look over this stuff, thanks for your patience Tilman.

^ permalink raw reply

* Re: [PATCH 3/3] be2net: Bug fix to properly update ethtool tx-checksumming after ethtool -K <ifname> tx off
From: David Miller @ 2009-10-07  0:35 UTC (permalink / raw)
  To: ajitk; +Cc: netdev
In-Reply-To: <20091005122207.GA22935@serverengines.com>

From: Ajit Khaparde <ajitk@serverengines.com>
Date: Mon, 5 Oct 2009 17:52:19 +0530

> This is a fix for a bug which was a result of wrong use of checksum offload flag.
> The status of tx-checksumming was not changed from on to off
> after a 'ethtool -K <ifname> tx off' operation.
> Use the proper checksum offload flag NETIF_F_HW_CSUM instead of 
> NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM.
> Patch is against net-2.6 tree.
> 
> Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>

Applied.

^ permalink raw reply

* Re: [PATCH 2/3] be2net: Fix a typo in be_cmds.h
From: David Miller @ 2009-10-07  0:35 UTC (permalink / raw)
  To: ajitk; +Cc: netdev
In-Reply-To: <20091005122154.GA22954@serverengines.com>

From: Ajit Khaparde <ajitk@serverengines.com>
Date: Mon, 5 Oct 2009 17:52:05 +0530

> MCC_STATUS_NOT_SUPPORTED should be decimal 66 not hex 66.
> This patch fixes this typo. Patch against net-2.6 tree.
> 
> Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>

Applied.

^ permalink raw reply

* Re: [PATCH 1/3] be2net: Bug Fix while accounting of multicast frames during netdev stats update
From: David Miller @ 2009-10-07  0:35 UTC (permalink / raw)
  To: ajitk; +Cc: netdev
In-Reply-To: <20091005122140.GA22946@serverengines.com>

From: Ajit Khaparde <ajitk@serverengines.com>
Date: Mon, 5 Oct 2009 17:51:51 +0530

> While updating the statistics to be passed via the get_stats,
> tx multicast frames were being accounted instead of rx multicast frames.
> This patch fixes the bug. This patch is against the net-2.6 tree.
> 
> Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>

Applied.

^ permalink raw reply

* Re: [net-2.6 PATCH 0/3] qlge: Fixes for qlge.
From: David Miller @ 2009-10-07  0:35 UTC (permalink / raw)
  To: ron.mercer; +Cc: netdev
In-Reply-To: <1254779209-15174-1-git-send-email-ron.mercer@qlogic.com>


All 3 patches applied, thanks Ron.

^ permalink raw reply

* Re: [RFC net-next-2.6] net: speedup sk_wake_async()
From: David Miller @ 2009-10-07  0:28 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <4ACBCDD8.5000306@gmail.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 07 Oct 2009 01:08:08 +0200

> An incoming datagram must bring a *lot* of cache lines into the cpu cache,
> in particular : (other parts omitted (hash chains, ip route cache...))
> 
> On 32bit arches :
 ...
> We can avoid sk->sk_socket and socket->fasync_list referencing on sockets
> with no fasync() structures. (socket->fasync_list ptr is probably already in cache
> because it shares a cache line with socket->wait, ie location pointed by sk->sk_sleep)
> 
> This avoids one cache line load per incoming packet for common cases (no fasync())
> 
> We can leave (or even move in a future patch) sk->sk_socket in a cold location
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

I like it, applied to net-next-2.6, thanks!

^ permalink raw reply

* Re: [PATCH net-next] myri10ge: add adaptive coalescing
From: David Miller @ 2009-10-07  0:25 UTC (permalink / raw)
  To: brice; +Cc: netdev, gallatin
In-Reply-To: <4ACB75DB.4060003@myri.com>

From: Brice Goglin <brice@myri.com>
Date: Tue, 06 Oct 2009 18:52:43 +0200

> This patch adds support for adaptive interrupt coalescing to the
> myri10ge driver. It is based on the host periodically looking at
> statistics and updating the NIC coalescing accordingly.
> 
> The NIC only provides packet throughput and we feel that it is a
> better heuristic than the packet rate heuristic currently used
> in ethtool. Also, assuming that the packet rate heuristic
> uses what is actually sent on the wire when using TSO, it would be
> much more expensive to implement correctly, as the driver would
> need to calculate how many packets were sent.
> 
> Signed-off-by: Andrew Gallatin <gallatin@myri.com>
> Signed-off-by: Brice Goglin <brice@myri.com>

Drivers tried to do this as far back as 6 years ago (tg3) and we don't
recommend doing this with NAPI drivers.

Your detection code can never respond quickly enough to
changes in traffic conditions.  By the time you program the new values
into the registers things on the wire can change a lot.

It's also very easy to flap and hit the settings a lot.

That's why we recommend using low hw mitigation settings and simply
leaving it like that.

^ permalink raw reply

* Re: [Bugme-new] [Bug 14336] New: [Pardus] Soft Lockup Problem with Attansic Ethernet Card
From: J. K. Cliburn @ 2009-10-06 23:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: badibere, bugzilla-daemon, bugme-daemon, netdev, linux-acpi,
	Jesse Barnes, Chris Snook, Jie Yang
In-Reply-To: <20091006155502.f3fc86d9.akpm@linux-foundation.org>


On Oct 6, 2009, at 5:55 PM, Andrew Morton wrote:

>
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Tue, 6 Oct 2009 22:37:13 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
>
>> http://bugzilla.kernel.org/show_bug.cgi?id=14336
>>
>>            Summary: [Pardus] Soft Lockup Problem with Attansic Ethernet
>>                     Card
>>            Product: Networking
>>            Version: 2.5
>>     Kernel Version: 2.6.30.8
>>           Platform: All
>>         OS/Version: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: high
>>           Priority: P1
>>          Component: IPV4
>>         AssignedTo: shemminger@linux-foundation.org
>>         ReportedBy: badibere@gmail.com
>>         Regression: No
>>
>>
>> I have a soft lockup problem with an Attansic ethernet card while activating
>> the eth0 interface.
>>
>> My ethernet card is 09:00.0 Ethernet controller: Attansic Technology Corp.
>> Device 1063 (rev c0)
>>
>> If I try to activate the eth0 interface while the network cable is plugged in,
>> then everything freezes.
>>
>> If I try to activate the eth0 interface while the network cable isn't plugged
>> in, then there is no problem. But if I plug it in, it freezes again.
>>
>> My /var/log/syslog is attached.
>>
>
> From the dmesg output it appears that this card is driven by the atl1c
> driver:
>
> Oct  7 00:55:57 baDibere kernel: [  169.943463] atl1c 0000:09:00.0: Unable to allocate MSI interrupt Error: -22
> Oct  7 00:55:57 baDibere kernel: [  169.943576] atl1c 0000:09:00.0: atl1c: eth0 NIC Link is Up<100 Mbps Full Duplex>
> Oct  7 00:56:22 baDibere kernel: [  194.849361] NVRM: Xid (0001:00): 16, Head 00000000 Count 00003651
> Oct  7 00:56:31 baDibere kernel: [  203.849307] NVRM: Xid (0001:00): 16, Head 00000000 Count 00003653
> Oct  7 00:56:36 baDibere kernel: [  208.849339] NVRM: Xid (0001:00): 8, Channel 00000003
> Oct  7 00:56:41 baDibere kernel: [  213.852320] NVRM: Xid (0001:00): 16, Head 00000000 Count 00003655
>
>
> The "Unable to allocate MSI interrupt" might be the immediate problem.
>
> Who belongs to MSI interrupt allocation?  PCI?  ACPI?
>
> However the driver does attempt to handle and recover from the MSI
> interrupt allocation error so perhaps that's not the cause at all.
>

I'd be curious to know if the lockup occurs when booted with pci=nomsi.

Jay

^ permalink raw reply

* Re: [Bugme-new] [Bug 14336] New: [Pardus] Soft Lockup Problem with  Attansic Ethernet Card
From: J. K. Cliburn @ 2009-10-06 23:33 UTC (permalink / raw)
  To: Erdem ARTAN
  Cc: Andrew Morton, bugzilla-daemon, bugme-daemon, netdev, linux-acpi,
	Jesse Barnes, Chris Snook, Jie Yang
In-Reply-To: <3d54862b0910061601iba25b87hfba151f71d6ce4ef@mail.gmail.com>

On Oct 6, 2009, at 6:01 PM, Erdem ARTAN wrote:

> Yesterday there wasn't any problem, but 3 days ago there was the same
> problem.
>
> While there wasn't any problem, if I'm not wrong, both the atl1c and
> atl1e modules were loaded on boot. But if I try to load atl1e now,
> nothing changes.

The pci_id says your chip is an L1c.  Jie Yang is the maintainer of  
the atl1c driver and is cc'd.

Jay

^ permalink raw reply

* RE: [net-next-2.6 PATCH 1/9] vxge: Modify __vxge_hw_device_is_privilaged() to not assume function-0 as the privileged function: Resubmit#1
From: Ramkrishna Vepa @ 2009-10-06 23:22 UTC (permalink / raw)
  To: David Miller, Sreenivasa Honnur; +Cc: netdev, support
In-Reply-To: <20091006.152241.159698718.davem@davemloft.net>

> So I'm going to apply this patch set to net-next-2.6, omitting patch
> #8 from the series.
[Ram] Thanks!

Ram

^ permalink raw reply

* atl1c: ethernet driver bug
From: Stephen Hemminger @ 2009-10-06 23:10 UTC (permalink / raw)
  To: Jie Yang; +Cc: netdev

> http://bugzilla.kernel.org/show_bug.cgi?id=14336
> 
>            Summary: [Pardus] Soft Lockup Problem with Attansic Ethernet
>                     Card
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 2.6.30.8
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: IPV4
>         AssignedTo: shemminger@linux-foundation.org
>         ReportedBy: badibere@gmail.com
>         Regression: No
> 
> 
> I have a soft lockup problem with an Attansic ethernet card while activating
> the eth0 interface.
> 
> My ethernet card is 09:00.0 Ethernet controller: Attansic Technology Corp.
> Device 1063 (rev c0)
> 
> If I try to activate the eth0 interface while the network cable is plugged in,
> then everything freezes.
> 
> If I try to activate the eth0 interface while the network cable isn't plugged
> in, then there is no problem. But if I plug it in, it freezes again.
> 
> My /var/log/syslog is attached.
> 

This is what was in the bug attachment:

Oct  7 00:53:33 baDibere kernel: [   16.162099] atl1c 0000:09:00.0: enabling device (0000 -> 0002)
Oct  7 00:53:33 baDibere kernel: [   16.162117] atl1c 0000:09:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
Oct  7 00:53:33 baDibere kernel: [   16.162141] atl1c 0000:09:00.0: setting latency timer to 64
Oct  7 00:53:33 baDibere kernel: [   16.162247] atl1c 0000:09:00.0: PME# disabled
Oct  7 00:53:33 baDibere kernel: [   16.162266] atl1c 0000:09:00.0: PME# disabled
Oct  7 00:53:33 baDibere kernel: [   16.230835] atl1c 0000:09:00.0: version 1.0.0.1-NAPI
Oct  7 00:55:57 baDibere kernel: [  169.943463] atl1c 0000:09:00.0: Unable to allocate MSI interrupt Error: -22
Oct  7 00:55:57 baDibere kernel: [  169.943576] atl1c 0000:09:00.0: atl1c: eth0 NIC Link is Up<100 Mbps Full Duplex>

^ permalink raw reply

* [RFC net-next-2.6] net: speedup sk_wake_async()
From: Eric Dumazet @ 2009-10-06 23:08 UTC (permalink / raw)
  To: David S. Miller; +Cc: Linux Netdev List

Latency works, part 1


An incoming datagram must bring a *lot* of cache lines into the cpu cache,
in particular : (other parts omitted (hash chains, ip route cache...))

On 32bit arches :

offsetof(struct sock, sk_rcvbuf)       =0x30    (read)
offsetof(struct sock, sk_lock)         =0x34   (rw)

offsetof(struct sock, sk_sleep)        =0x50 (read)
offsetof(struct sock, sk_rmem_alloc)   =0x64   (rw)
offsetof(struct sock, sk_receive_queue)=0x74   (rw)

offsetof(struct sock, sk_forward_alloc)=0x98   (rw)

offsetof(struct sock, sk_callback_lock)=0xcc    (rw)
offsetof(struct sock, sk_drops)        =0xd8 (read if we add dropcount support, rw if frame dropped)
offsetof(struct sock, sk_filter)       =0xf8    (read)

offsetof(struct sock, sk_socket)       =0x138 (read)

offsetof(struct sock, sk_data_ready)   =0x15c   (read)


We can avoid sk->sk_socket and socket->fasync_list referencing on sockets
with no fasync() structures. (socket->fasync_list ptr is probably already in cache
because it shares a cache line with socket->wait, ie location pointed by sk->sk_sleep)

This avoids one cache line load per incoming packet for common cases (no fasync())

We can leave (or even move in a future patch) sk->sk_socket in a cold location

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/sock.h |    3 ++-
 net/socket.c       |    3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 1621935..98398bd 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -504,6 +504,7 @@ enum sock_flags {
 	SOCK_TIMESTAMPING_SOFTWARE,     /* %SOF_TIMESTAMPING_SOFTWARE */
 	SOCK_TIMESTAMPING_RAW_HARDWARE, /* %SOF_TIMESTAMPING_RAW_HARDWARE */
 	SOCK_TIMESTAMPING_SYS_HARDWARE, /* %SOF_TIMESTAMPING_SYS_HARDWARE */
+	SOCK_FASYNC, /* fasync() active */
 };
 
 static inline void sock_copy_flags(struct sock *nsk, struct sock *osk)
@@ -1396,7 +1397,7 @@ static inline unsigned long sock_wspace(struct sock *sk)
 
 static inline void sk_wake_async(struct sock *sk, int how, int band)
 {
-	if (sk->sk_socket && sk->sk_socket->fasync_list)
+	if (sock_flag(sk, SOCK_FASYNC))
 		sock_wake_async(sk->sk_socket, how, band);
 }
 
diff --git a/net/socket.c b/net/socket.c
index 7565536..d53ad11 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1100,11 +1100,14 @@ static int sock_fasync(int fd, struct file *filp, int on)
 		fna->fa_next = sock->fasync_list;
 		write_lock_bh(&sk->sk_callback_lock);
 		sock->fasync_list = fna;
+		sock_set_flag(sk, SOCK_FASYNC);
 		write_unlock_bh(&sk->sk_callback_lock);
 	} else {
 		if (fa != NULL) {
 			write_lock_bh(&sk->sk_callback_lock);
 			*prev = fa->fa_next;
+			if (!sock->fasync_list)
+				sock_reset_flag(sk, SOCK_FASYNC);
 			write_unlock_bh(&sk->sk_callback_lock);
 			kfree(fa);
 		}

^ permalink raw reply related
