Netdev List
 help / color / mirror / Atom feed
* Re: [RFC] sched: CHOKe packet scheduler (v0.2)
From: Stephen Hemminger @ 2011-01-05 20:15 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1294257967.2723.9.camel@edumazet-laptop>

On Wed, 05 Jan 2011 21:06:07 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Le mercredi 05 janvier 2011 à 11:21 -0800, Stephen Hemminger a écrit :
> > This implements the CHOKe packet scheduler based on the existing
> > Linux RED scheduler based on the algorithm described in the paper.
> > 
> 
> > +
> > +static int choke_enqueue(struct sk_buff *skb, struct Qdisc *sch)
> > +{
> > +	struct choke_sched_data *q = qdisc_priv(sch);
> > +	struct red_parms *p = &q->parms;
> > +
> > +	p->qavg = red_calc_qavg(p, skb_queue_len(&sch->q));
> > +	if (red_is_idling(p))
> > +		red_end_of_idle_period(p);
> > +
> > +	if (p->qavg <= p->qth_min)
> > +		p->qcount = -1;
> > +	else {
> > +		struct sk_buff *oskb;
> > +
> > +		/* Draw a packet at random from queue */
> > +		oskb = skb_peek_random(&sch->q);
> > +
> > +		/* Both packets from same flow?
> > +		 * Assumes skb_get_rxhash already set hash on oskb->rxhash
> > +		 * prior to queuing
> 
> but this is not true... if at that time, p->qavg was <= p->qth_min.
> Packet was directly enqueued.
> 
> Just use :
> 
> if (skb_get_rxhash(oskb) == skb_get_rxhash(skb))
> 
> Since skb_get_rxhash(skb) doesnt recompute rxhash if already set.
> 
> Hmm... I am now wondering if this actually works on egress at all
> (can we use rxhash here I mean)
> 
> 
> 
> > +		 */
> > +		if (oskb->rxhash == skb_get_rxhash(skb)) {
> > +			/* Drop both packets */
> > +			__skb_unlink(oskb, &sch->q);
> > +			qdisc_drop(oskb, sch);
> > +			goto congestion_drop;
> > +		}
> 

The code computes a value, whether it is correct or not is another question.
Also a little concerned that different NIC's compute different flow hash
values which could cause false positives.

-- 

^ permalink raw reply

* [PATCH] net_sched: pfifo_head_drop problem
From: Eric Dumazet @ 2011-01-05 20:35 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Florian Westphal, Patrick McHardy, Hagen Paul Pfeifer,
	Stephen Hemminger, Jarek Poplawski
In-Reply-To: <1294246850.2775.244.camel@edumazet-laptop>

commit 57dbb2d83d100ea (sched: add head drop fifo queue)
introduced pfifo_head_drop, and broke the invariant that
sch->bstats.bytes and sch->bstats.packets are COUNTER (increasing
counters only)

This can break estimators because est_timer() handles unsigned deltas
only. A decreasing counter can then give a huge unsigned delta.

My mid term suggestion would be to change things so that
sch->bstats.bytes and sch->bstats.packets are incremented in dequeue()
only, not at enqueue() time. We also could add drop_bytes/drop_packets
and provide estimations of drop rates.

It would be more sensible anyway for very low speeds, and big bursts.
Right now, if we drop packets, they still are accounted in byte/packets
abolute counters and rate estimators.

Before this mid term change, this patch makes pfifo_head_drop behavior
similar to other qdiscs in case of drops :
Dont decrement sch->bstats.bytes and sch->bstats.packets

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/sched/sch_fifo.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/net/sched/sch_fifo.c b/net/sched/sch_fifo.c
index 4dfecb0..aa4d633 100644
--- a/net/sched/sch_fifo.c
+++ b/net/sched/sch_fifo.c
@@ -54,8 +54,6 @@ static int pfifo_tail_enqueue(struct sk_buff *skb, struct Qdisc* sch)
 
 	/* queue full, remove one skb to fulfill the limit */
 	skb_head = qdisc_dequeue_head(sch);
-	sch->bstats.bytes -= qdisc_pkt_len(skb_head);
-	sch->bstats.packets--;
 	sch->qstats.drops++;
 	kfree_skb(skb_head);
 



^ permalink raw reply related

* ASSISTANCE CONEIDETIAL!!!
From: Dr. Smith Lee. @ 2011-01-05 17:11 UTC (permalink / raw)


Hello,

I have a proposition for you, this however is not mandatory nor will I 
in any manner compel you to honor against your will.Let me start by 
introducing myself. I am Dr.Smith  Lee, Director of Operations 
of the Hang Seng Bank Ltd,Sai Wan Ho Branch. I have a mutual beneficial 
business suggestion for you.

1. Can you handle this project?
2. Can I give you this trust ?

Absolute confidentiality is required from you.Besides,I will use my connection 
to get some documents to back up the fund so that the fund can not be 
question by any authority.

More information await you in my next response to your email message.


Treat as very urgent.

Yours Faithfully,

Dr. Smith Lee.


smith@hangsengbankltd-hk.com

^ permalink raw reply

* Re: [PATCH] net_sched: pfifo_head_drop problem
From: Hagen Paul Pfeifer @ 2011-01-05 20:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, netdev, Florian Westphal, Patrick McHardy,
	Stephen Hemminger, Jarek Poplawski
In-Reply-To: <1294259702.2723.22.camel@edumazet-laptop>

* Eric Dumazet | 2011-01-05 21:35:02 [+0100]:

>My mid term suggestion would be to change things so that
>sch->bstats.bytes and sch->bstats.packets are incremented in dequeue()
>only, not at enqueue() time. We also could add drop_bytes/drop_packets
>and provide estimations of drop rates.
>
>It would be more sensible anyway for very low speeds, and big bursts.
>Right now, if we drop packets, they still are accounted in byte/packets
>abolute counters and rate estimators.
>
>Before this mid term change, this patch makes pfifo_head_drop behavior
>similar to other qdiscs in case of drops :
>Dont decrement sch->bstats.bytes and sch->bstats.packets

Thanks Stephen and Erik for spotting this bug!

>Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>

^ permalink raw reply

* Re: [PATCH net-next-2.6 v2 1/2] can: add driver for Softing card
From: Wolfgang Grandegger @ 2011-01-05 20:57 UTC (permalink / raw)
  To: Kurt Van Dijck
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20110104150759.GB321-MxZ6Iy/zr/UdbCeoMzGj59i2O/JbrIOy@public.gmane.org>

Hi Kurt,

here comes my review... First some general remarks. As Mark already
pointed out, there are still some coding style issues:

- Please use the following style for multi line comments:

  /*
   * Comment
   */

- Please separate the function header from the body by one empty line.

- Please avoid alignment of expressions and structure members.

- Please use a consistent style for function declaration and
  continuation lines.

More comments inline...

On 12/23/2010 10:43 On 01/04/2011 04:07 PM, Kurt Van Dijck wrote:
> This patch adds a driver for the platform:softing device.
> This will create (up to) 2 CAN network devices from 1
> platform:softing device
> 
> Signed-off-by: Kurt Van Dijck <kurt.van.dijck-/BeEPy95v10@public.gmane.org>
> 
> ---
>  drivers/net/can/Kconfig                    |    2 +
>  drivers/net/can/Makefile                   |    1 +
>  drivers/net/can/softing/Kconfig            |   16 +
>  drivers/net/can/softing/Makefile           |    5 +
>  drivers/net/can/softing/softing.h          |  193 ++++++
>  drivers/net/can/softing/softing_fw.c       |  648 ++++++++++++++++++++
>  drivers/net/can/softing/softing_main.c     |  903 ++++++++++++++++++++++++++++
>  drivers/net/can/softing/softing_platform.h |   38 ++
>  8 files changed, 1806 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig
> index d5a9db6..986195e 100644
> --- a/drivers/net/can/Kconfig
> +++ b/drivers/net/can/Kconfig
> @@ -117,6 +117,8 @@ source "drivers/net/can/sja1000/Kconfig"
>  
>  source "drivers/net/can/usb/Kconfig"
>  
> +source "drivers/net/can/softing/Kconfig"
> +
>  config CAN_DEBUG_DEVICES
>  	bool "CAN devices debugging messages"
>  	depends on CAN
> diff --git a/drivers/net/can/Makefile b/drivers/net/can/Makefile
> index 07ca159..53c82a7 100644
> --- a/drivers/net/can/Makefile
> +++ b/drivers/net/can/Makefile
> @@ -9,6 +9,7 @@ obj-$(CONFIG_CAN_DEV)		+= can-dev.o
>  can-dev-y			:= dev.o
>  
>  obj-y				+= usb/
> +obj-y				+= softing/

Please use "obj-$(CONFIG_CAN_SOFTING)" here.

>  obj-$(CONFIG_CAN_SJA1000)	+= sja1000/
>  obj-$(CONFIG_CAN_MSCAN)		+= mscan/
> diff --git a/drivers/net/can/softing/Kconfig b/drivers/net/can/softing/Kconfig
> new file mode 100644
> index 0000000..072f337
> --- /dev/null
> +++ b/drivers/net/can/softing/Kconfig
> @@ -0,0 +1,16 @@
> +config CAN_SOFTING
> +	tristate "Softing Gmbh CAN generic support"
> +	depends on CAN_DEV
> +	---help---
> +	  Support for CAN cards from Softing Gmbh & some cards
> +	  from Vector Gmbh.
> +	  Softing Gmbh CAN cards come with 1 or 2 physical busses.
> +	  Those cards typically use Dual Port RAM to communicate
> +	  with the host CPU. The interface is then identical for PCI
> +	  and PCMCIA cards. This driver operates on a platform device,
> +	  which has been created by softing_cs or softing_pci driver.
> +	  Warning:
> +	  The API of the card does not allow fine control per bus, but
> +	  controls the 2 busses on the card together.
> +	  As such, some actions (start/stop/busoff recovery) on 1 bus
> +	  must bring down the other bus too temporarily.
> diff --git a/drivers/net/can/softing/Makefile b/drivers/net/can/softing/Makefile
> new file mode 100644
> index 0000000..7878b7b
> --- /dev/null
> +++ b/drivers/net/can/softing/Makefile
> @@ -0,0 +1,5 @@
> +
> +softing-y := softing_main.o softing_fw.o
> +obj-$(CONFIG_CAN_SOFTING)        += softing.o
> +
> +ccflags-$(CONFIG_CAN_DEBUG_DEVICES) := -DDEBUG
> diff --git a/drivers/net/can/softing/softing.h b/drivers/net/can/softing/softing.h
> new file mode 100644
> index 0000000..72d3adb
> --- /dev/null
> +++ b/drivers/net/can/softing/softing.h
> @@ -0,0 +1,193 @@
> +/*
> + * softing common interfaces
> + *
> + * by Kurt Van Dijck, 06-2008
> + */
> +
> +#include <linux/atomic.h>
> +#include <linux/netdevice.h>
> +#include <linux/ktime.h>
> +#include <linux/mutex.h>
> +#include <linux/spinlock.h>
> +#include <linux/can.h>
> +#include <linux/can/dev.h>
> +
> +#include "softing_platform.h"
> +
> +#ifndef CAN_CTRLMODE_BERR_REPORTING
> +#define CAN_CTRLMODE_BERR_REPORTING 0
> +#endif

Hm, do you really need that definition?

> +struct softing;
> +
> +struct softing_priv {
> +	struct can_priv can;	/* must be the first member! */
> +	struct net_device *netdev;
> +	struct softing *card;
> +	struct {
> +		int pending;
> +		/* variables wich hold the circular buffer */
> +		int echo_put;
> +		int echo_get;
> +	} tx;
> +	struct can_bittiming_const btr_const;
> +	int index;
> +	u8 output;
> +	u16 chip;
> +};
> +#define netdev2softing(netdev)	((struct softing_priv *)netdev_priv(netdev))
> +
> +struct softing {
> +	const struct softing_platform_data *pdat;
> +	struct platform_device *pdev;
> +	struct net_device *net[2];
> +	spinlock_t spin; /* protect this structure & DPRAM access */
> +	ktime_t ts_ref;
> +	ktime_t ts_overflow; /* timestamp overflow value, in ktime */
> +
> +	struct {
> +		/* indication of firmware status */
> +		int up;
> +		/* protection of the 'up' variable */
> +		struct mutex lock;
> +	} fw;
> +	struct {
> +		int nr;
> +		int requested;
> +		int svc_count;
> +		unsigned int dpram_position;
> +	} irq;
> +	struct {
> +		int pending;
> +		int last_bus;
> +		/*
> +		 * keep the bus that last tx'd a message,
> +		 * in order to let every netdev queue resume
> +		 */
> +	} tx;
> +	__iomem uint8_t *dpram;
> +	unsigned long dpram_phys;
> +	unsigned long dpram_size;
> +	struct {
> +		u32  serial, fw, hw, lic;
> +		u16  chip[2];
> +		u32  freq;
> +	} id;
> +};
> +
> +extern int softing_default_output(struct net_device *netdev);
> +
> +extern ktime_t softing_raw2ktime(struct softing *card, u32 raw);
> +
> +extern int softing_fct_cmd(struct softing *card, int16_t cmd, uint16_t vector,
> +		const char *msg);
> +
> +extern int softing_bootloader_command(struct softing *card, int16_t cmd,
> +		const char *msg);
> +
> +/* Load firmware after reset */
> +extern int softing_load_fw(const char *file, struct softing *card,
> +			__iomem uint8_t *virt, unsigned int size, int offset);
> +
> +/* Load final application firmware after bootloader */
> +extern int softing_load_app_fw(const char *file, struct softing *card);
> +
> +/*
> + * enable or disable irq
> + * only called with fw.lock locked
> + */
> +extern int softing_enable_irq(struct softing *card, int enable);
> +
> +/* start/stop 1 bus on card */
> +extern int softing_startstop(struct net_device *netdev, int up);
> +
> +/* netif_rx() */
> +extern int softing_netdev_rx(struct net_device *netdev,
> +		const struct can_frame *msg, ktime_t ktime);
> +
> +/* SOFTING DPRAM mappings */
> +#define DPRAM_RX		0x0000
> +	#define DPRAM_RX_SIZE	32
> +	#define DPRAM_RX_CNT	16
> +#define DPRAM_RX_RD		0x0201	/* uint8_t */
> +#define DPRAM_RX_WR		0x0205	/* uint8_t */
> +#define DPRAM_RX_LOST		0x0207	/* uint8_t */
> +
> +#define DPRAM_FCT_PARAM		0x0300	/* int16_t [20] */
> +#define DPRAM_FCT_RESULT	0x0328	/* int16_t */
> +#define DPRAM_FCT_HOST		0x032b	/* uint16_t */
> +
> +#define DPRAM_INFO_BUSSTATE	0x0331	/* uint16_t */
> +#define DPRAM_INFO_BUSSTATE2	0x0335	/* uint16_t */
> +#define DPRAM_INFO_ERRSTATE	0x0339	/* uint16_t */
> +#define DPRAM_INFO_ERRSTATE2	0x033d	/* uint16_t */
> +#define DPRAM_RESET		0x0341	/* uint16_t */
> +#define DPRAM_CLR_RECV_FIFO	0x0345	/* uint16_t */
> +#define DPRAM_RESET_TIME	0x034d	/* uint16_t */
> +#define DPRAM_TIME		0x0350	/* uint64_t */
> +#define DPRAM_WR_START		0x0358	/* uint8_t */
> +#define DPRAM_WR_END		0x0359	/* uint8_t */
> +#define DPRAM_RESET_RX_FIFO	0x0361	/* uint16_t */
> +#define DPRAM_RESET_TX_FIFO	0x0364	/* uint8_t */
> +#define DPRAM_READ_FIFO_LEVEL	0x0365	/* uint8_t */
> +#define DPRAM_RX_FIFO_LEVEL	0x0366	/* uint16_t */
> +#define DPRAM_TX_FIFO_LEVEL	0x0366	/* uint16_t */
> +
> +#define DPRAM_TX		0x0400	/* uint16_t */
> +	#define DPRAM_TX_SIZE	16
> +	#define DPRAM_TX_CNT	32
> +#define DPRAM_TX_RD		0x0601	/* uint8_t */
> +#define DPRAM_TX_WR		0x0605	/* uint8_t */
> +
> +#define DPRAM_COMMAND		0x07e0	/* uint16_t */
> +#define DPRAM_RECEIPT		0x07f0	/* uint16_t */
> +#define DPRAM_IRQ_TOHOST	0x07fe	/* uint8_t */
> +#define DPRAM_IRQ_TOCARD	0x07ff	/* uint8_t */
> +
> +#define DPRAM_V2_RESET		0x0e00	/* uint8_t */
> +#define DPRAM_V2_IRQ_TOHOST	0x0e02	/* uint8_t */
> +
> +#define TXMAX	(DPRAM_TX_CNT - 1)
> +
> +/* DPRAM return codes */
> +#define RES_NONE	0
> +#define RES_OK		1
> +#define RES_NOK		2
> +#define RES_UNKNOWN	3
> +/* DPRAM flags */
> +#define CMD_TX		0x01
> +#define CMD_ACK		0x02
> +#define CMD_XTD		0x04
> +#define CMD_RTR		0x08
> +#define CMD_ERR		0x10
> +#define CMD_BUS2	0x80
> +
> +/*
> + * some inline DPRAM acces function
> + * to prevent extra dependency between softing & softingcs
> + */
> +/* reset DPRAM */
> +static inline void softing_set_reset_dpram(struct softing *card)
> +{
> +	if (card->pdat->generation >= 2) {
> +		uint8_t tmp;
> +		spin_lock_bh(&card->spin);
> +		tmp = ioread8(&card->dpram[DPRAM_V2_RESET]);
> +		tmp &= ~1;
> +		iowrite8(tmp, &card->dpram[DPRAM_V2_RESET]);
> +		spin_unlock_bh(&card->spin);
> +	}
> +}
> +
> +static inline void softing_clr_reset_dpram(struct softing *card)
> +{
> +	if (card->pdat->generation >= 2) {
> +		uint8_t tmp;

Empty line please.

> +		spin_lock_bh(&card->spin);
> +		tmp = ioread8(&card->dpram[DPRAM_V2_RESET]);
> +		tmp |= 1;

Could be done in one line or even without tmp.

> +		iowrite8(tmp, &card->dpram[DPRAM_V2_RESET]);
> +		spin_unlock_bh(&card->spin);
> +	}
> +}
> +
> diff --git a/drivers/net/can/softing/softing_fw.c b/drivers/net/can/softing/softing_fw.c
> new file mode 100644
> index 0000000..03ed853
> --- /dev/null
> +++ b/drivers/net/can/softing/softing_fw.c
> @@ -0,0 +1,648 @@
> +/*
> +* drivers/net/can/softing/softing_fw.c
> +*
> +* Copyright (C) 2008-2010
> +*
> +* - Kurt Van Dijck, EIA Electronics
> +*
> +* This program is free software; you can redistribute it and/or modify
> +* it under the terms of the version 2 of the GNU General Public License
> +* as published by the Free Software Foundation
> +*
> +* This program is distributed in the hope that it will be useful,
> +* but WITHOUT ANY WARRANTY; without even the implied warranty of
> +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +* GNU General Public License for more details.
> +*
> +* You should have received a copy of the GNU General Public License
> +* along with this program; if not, write to the Free Software
> +* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> +*/
> +
> +#include <linux/firmware.h>
> +#include <linux/sched.h>
> +#include <asm/div64.h>
> +
> +#include "softing.h"
> +
> +int softing_fct_cmd(struct softing *card, int16_t cmd, uint16_t vector,
> +		const char *msg)
> +{
> +	int ret;
> +	unsigned long stamp;
> +
> +	if (vector == RES_OK)
> +		vector = RES_NONE;
> +	iowrite16(cmd, &card->dpram[DPRAM_FCT_PARAM]);
> +	iowrite8(vector >> 8, &card->dpram[DPRAM_FCT_HOST + 1]);
> +	iowrite8(vector, &card->dpram[DPRAM_FCT_HOST]);
> +
> +	/* be sure to flush this to the card */
> +	wmb();
> +	stamp = jiffies + 1 * HZ;
> +	/* wait for card */
> +	do {
> +		/* DPRAM_FCT_HOST is _not_ aligned */
> +		ret = ioread8(&card->dpram[DPRAM_FCT_HOST]) +
> +			(ioread8(&card->dpram[DPRAM_FCT_HOST + 1]) << 8);
> +		/* don't have any cached variables */
> +		rmb();
> +		if (ret == RES_OK) {
> +			/* don't read return-value now */
> +			ret = ioread16(&card->dpram[DPRAM_FCT_RESULT]);
> +			if (ret)
> +				dev_alert(&card->pdev->dev,
> +					"%s returned %u\n", msg, ret);
> +			return 0;

Why do you not return an error here?

> +		}
> +		if (time_after(jiffies, stamp))
> +			break;
> +		/* process context => relax */
> +		usleep_range(500, 10000);
> +	} while (!signal_pending(current));
> +
> +	if (ret == RES_NONE) {
> +		dev_alert(&card->pdev->dev,
> +			"%s, no response from card on %u/0x%02x\n",
> +			msg, cmd, vector);
> +		return 1;
> +	} else {
> +		dev_alert(&card->pdev->dev,
> +			"%s, bad response from card on %u/0x%02x, 0x%04x\n",
> +			msg, cmd, vector, ret);
> +		/* make sure to return something not 0 */
> +		return ret ?: 1;

What does it return if ret > 0?

> +	}
> +}
> +
> +int softing_bootloader_command(struct softing *card, int16_t cmd,
> +		const char *msg)
> +{
> +	int ret;
> +	unsigned long stamp;

Empty line please.

> +	iowrite16(RES_NONE, &card->dpram[DPRAM_RECEIPT]);
> +	iowrite16(cmd, &card->dpram[DPRAM_COMMAND]);
> +	/* be sure to flush this to the card */
> +	wmb();
> +	stamp = jiffies + 3 * HZ;
> +	/* wait for card */
> +	do {
> +		ret = ioread16(&card->dpram[DPRAM_RECEIPT]);
> +		/* don't have any cached variables */
> +		rmb();
> +		if (ret == RES_OK)
> +			return 0;
> +		if (time_after(jiffies, stamp))
> +			break;
> +		/* process context => relax */
> +		usleep_range(500, 10000);
> +	} while (!signal_pending(current));
> +
> +	switch (ret) {
> +	case RES_NONE:
> +		dev_alert(&card->pdev->dev, "%s: no response from card\n", msg);
> +		break;
> +	case RES_NOK:
> +		dev_alert(&card->pdev->dev, "%s: response from card nok\n",
> +				msg);
> +		break;
> +	case RES_UNKNOWN:
> +		dev_alert(&card->pdev->dev, "%s: command 0x%04x unknown\n",
> +			msg, cmd);
> +		break;
> +	default:
> +		dev_alert(&card->pdev->dev, "%s: bad response from card: %i\n",
> +			msg, ret);
> +		break;
> +	}
> +	return ret ?: 1;

Ditto.

> +}
> +
> +static int fw_parse(const uint8_t **pmem, uint16_t *ptype, uint32_t *paddr,
> +		uint16_t *plen, const uint8_t **pdat)
> +{
> +	uint16_t checksum[2];
> +	const uint8_t *mem;
> +	const uint8_t *end;
> +
> +	mem = *pmem;
> +	*ptype = le16_to_cpup((void *)&mem[0]);
> +	*paddr = le32_to_cpup((void *)&mem[2]);
> +	*plen = le16_to_cpup((void *)&mem[6]);
> +	*pdat = &mem[8];

You often handle arrays of specific data. Couldn't those be described
better by structs also avoiding ugly casts??

> +	/* verify checksum */
> +	end = &mem[8 + *plen];
> +	checksum[0] = le16_to_cpup((void *)end);
> +	for (checksum[1] = 0; mem < end; ++mem)
> +		checksum[1] += *mem;
> +	if (checksum[0] != checksum[1])
> +		return -EINVAL;
> +	/* increment */
> +	*pmem += 10 + *plen;
> +	return 0;
> +}
> +
> +int softing_load_fw(const char *file, struct softing *card,
> +			__iomem uint8_t *dpram, unsigned int size, int offset)
> +{
> +	const struct firmware *fw;
> +	int ret, ok = 0;
> +	const uint8_t *mem, *end, *dat;
> +	uint16_t type, len;
> +	uint32_t addr;
> +	uint8_t buf[1024];

Please avoid allocating large arrays on the stack.

> +
> +	ret = request_firmware(&fw, file, &card->pdev->dev);
> +	if (ret) {
> +		dev_alert(&card->pdev->dev, "request_firmware(%s) got %i\n",
> +			file, ret);
> +		return ret;
> +	}
> +	dev_dbg(&card->pdev->dev, "%s, firmware(%s) got %u bytes"
> +		", offset %c0x%04x\n",
> +		card->pdat->name, file, (unsigned int)fw->size,
> +		(offset >= 0) ? '+' : '-', (unsigned int)abs(offset));
> +	/* parse the firmware */
> +	mem = fw->data;
> +	end = &mem[fw->size];
> +	/* look for header record */
> +	ret = fw_parse(&mem, &type, &addr, &len, &dat);
> +	if (ret < 0)
> +		goto fw_end;
> +	if (type != 0xffff) {
> +		dev_alert(&card->pdev->dev, "firware starts with type 0x%04x\n",
> +			type);
> +		goto fw_end;
> +	}
> +	if (strncmp("Structured Binary Format, Softing GmbH" , dat, len)) {
> +		dev_info(&card->pdev->dev, "firware string '%.*s'\n", len, dat);
> +		goto fw_end;
> +	}
> +	ok |= 1;
> +	/* ok, we had a header */
> +	while (mem < end) {
> +		ret = fw_parse(&mem, &type, &addr, &len, &dat);
> +		if (ret)
> +			break;
> +		if (type == 3) {
> +			/* start address */
> +			ok |= 2;
> +			continue;
> +		} else if (type == 1) {
> +			/* eof */
> +			ok |= 4;
> +			goto fw_end;
> +		} else if (type != 0) {
> +			dev_alert(&card->pdev->dev,
> +					"unknown record type 0x%04x\n", type);
> +			break;
> +		}

You still use a lot of magic constants. Giving them a name would make
the code more readable.

> +		if ((addr + len + offset) > size) {
> +			dev_alert(&card->pdev->dev,
> +				"firmware out of range (0x%08x / 0x%08x)\n",
> +				(addr + len + offset), size);
> +			goto fw_end;
> +		}
> +		memcpy_toio(&dpram[addr + offset], dat, len);
> +		/* be sure to flush caches from IO space */
> +		mb();
> +		if (len > sizeof(buf)) {
> +			dev_info(&card->pdev->dev,
> +				"record too big for verify (%u)\n", len);
> +			continue;
> +		}
> +		/* verify record data */
> +		memcpy_fromio(buf, &dpram[addr + offset], len);
> +		if (!memcmp(buf, dat, len))
> +			/* is ok */
> +			continue;
> +		dev_alert(&card->pdev->dev, "0x%08x:0x%03x at 0x%u failed\n",
> +				addr, len, addr + offset);
> +		goto fw_end;
> +	}
> +fw_end:
> +	release_firmware(fw);
> +	if (0x5 == (ok & 0x5))
> +		/* got eof & start */
> +		return 0;

Ditto.

> +	dev_info(&card->pdev->dev, "firmware %s failed\n", file);
> +	return ret ?: -EINVAL;
> +}
> +
> +int softing_load_app_fw(const char *file, struct softing *card)
> +{
> +	const struct firmware *fw;
> +	const uint8_t *mem, *end, *dat;
> +	int ret, ok = 0, j;
> +	uint16_t type, len;
> +	uint32_t addr, start_addr = 0;
> +	unsigned int sum, rx_sum;
> +
> +	ret = request_firmware(&fw, file, &card->pdev->dev);
> +	if (ret) {
> +		dev_alert(&card->pdev->dev, "request_firmware(%s) got %i\n",
> +			file, ret);
> +		return ret;
> +	}
> +	dev_dbg(&card->pdev->dev, "firmware(%s) got %lu bytes\n",
> +		file, (unsigned long)fw->size);
> +	/* parse the firmware */
> +	mem = fw->data;
> +	end = &mem[fw->size];
> +	/* look for header record */
> +	ret = fw_parse(&mem, &type, &addr, &len, &dat);
> +	if (ret)
> +		goto fw_end;
> +	if (type != 0xffff) {
> +		dev_alert(&card->pdev->dev, "firware starts with type 0x%04x\n",

Typo?

> +			type);
> +		goto fw_end;
> +	}
> +	if (strncmp("Structured Binary Format, Softing GmbH", dat, len)) {
> +		dev_alert(&card->pdev->dev, "firware string '%.*s' fault\n",
> +				len, dat);
> +		goto fw_end;
> +	}
> +	ok |= 1;
> +	/* ok, we had a header */
> +	while (mem < end) {
> +		ret = fw_parse(&mem, &type, &addr, &len, &dat);
> +		if (ret)
> +			break;
> +
> +		if (type == 3) {
> +			/* start address */
> +			start_addr = addr;
> +			ok |= 2;
> +			continue;
> +		} else if (type == 1) {
> +			/* eof */
> +			ok |= 4;
> +			goto fw_end;
> +		} else if (type != 0) {
> +			dev_alert(&card->pdev->dev,
> +					"unknown record type 0x%04x\n", type);
> +			break;
> +		}
> +
> +		/* regualar data */
> +		for (sum = 0, j = 0; j < len; ++j)
> +			sum += dat[j];
> +		/* work in 16bit (target) */
> +		sum &= 0xffff;
> +
> +		memcpy_toio(&card->dpram[card->pdat->app.offs], dat, len);
> +		iowrite32(card->pdat->app.offs + card->pdat->app.addr,
> +				&card->dpram[DPRAM_COMMAND + 2]);
> +		iowrite32(addr, &card->dpram[DPRAM_COMMAND + 6]);
> +		iowrite16(len, &card->dpram[DPRAM_COMMAND + 10]);
> +		iowrite8(1, &card->dpram[DPRAM_COMMAND + 12]);

See my comment about using arrays above.

> +		if (softing_bootloader_command(card, 1, "loading app."))
> +			goto fw_end;
> +		/* verify checksum */
> +		rx_sum = ioread16(&card->dpram[DPRAM_RECEIPT + 2]);
> +		if (rx_sum != sum) {
> +			dev_alert(&card->pdev->dev, "SRAM seems to be damaged"
> +				", wanted 0x%04x, got 0x%04x\n", sum, rx_sum);
> +			goto fw_end;
> +		}
> +	}
> +fw_end:
> +	release_firmware(fw);
> +	if (ok != 7)
> +		goto fw_failed;
> +	/* got start, start_addr, & eof */
> +	iowrite32(start_addr, &card->dpram[DPRAM_COMMAND + 2]);
> +	iowrite8(1, &card->dpram[DPRAM_COMMAND + 6]);
> +	if (softing_bootloader_command(card, 3, "start app."))
> +		goto fw_failed;
> +	dev_info(&card->pdev->dev, "firmware %s up\n", file);
> +	return 0;
> +fw_failed:
> +	dev_info(&card->pdev->dev, "firmware %s failed\n", file);
> +	return ret ?: -EINVAL;
> +}
> +
> +static int softing_reset_chip(struct softing *card)
> +{
> +	do {
> +		/* reset chip */
> +		iowrite8(0, &card->dpram[DPRAM_RESET_RX_FIFO]);
> +		iowrite8(0, &card->dpram[DPRAM_RESET_RX_FIFO+1]);
> +		iowrite8(1, &card->dpram[DPRAM_RESET]);
> +		iowrite8(0, &card->dpram[DPRAM_RESET+1]);
> +
> +		if (!softing_fct_cmd(card, 0, 0, "reset_chip"))
> +			break;
> +		if (signal_pending(current))
> +			goto failed;
> +		/* sync */
> +		if (softing_fct_cmd(card, 99, 0x55, "sync-a"))
> +			goto failed;
> +		if (softing_fct_cmd(card, 99, 0xaa, "sync-a"))
> +			goto failed;
> +	} while (1);
> +	card->tx.pending = 0;
> +	return 0;
> +failed:
> +	return -EIO;
> +}
> +
> +static void softing_initialize_timestamp(struct softing *card)
> +{
> +	uint64_t ovf;
> +
> +	card->ts_ref = ktime_get();
> +
> +	/* 16MHz is the reference */
> +	ovf = 0x100000000ULL * 16;
> +	do_div(ovf, card->pdat->freq ?: 16);
> +
> +	card->ts_overflow = ktime_add_us(ktime_set(0, 0), ovf);
> +}
> +
> +ktime_t softing_raw2ktime(struct softing *card, u32 raw)
> +{
> +	uint64_t rawl;
> +	ktime_t now, real_offset;
> +	ktime_t target;
> +	ktime_t tmp;
> +
> +	now = ktime_get();
> +	real_offset = ktime_sub(ktime_get_real(), now);
> +
> +	/* find nsec from card */
> +	rawl = raw * 16;
> +	do_div(rawl, card->pdat->freq ?: 16);
> +	target = ktime_add_us(card->ts_ref, rawl);
> +	/* test for overflows */
> +	tmp = ktime_add(target, card->ts_overflow);
> +	while (unlikely(ktime_to_ns(tmp) > ktime_to_ns(now))) {
> +		card->ts_ref = ktime_add(card->ts_ref, card->ts_overflow);
> +		target = tmp;
> +		tmp = ktime_add(target, card->ts_overflow);
> +	}
> +	return ktime_add(target, real_offset);
> +}
> +
> +static inline int softing_error_reporting(struct net_device *netdev)
> +{
> +	struct softing_priv *priv = netdev_priv(netdev);
> +
> +	return (priv->can.ctrlmode & CAN_CTRLMODE_BERR_REPORTING)
> +		? 1 : 0;
> +}
> +
> +int softing_startstop(struct net_device *dev, int up)
> +{
> +	int ret;
> +	struct softing *card;
> +	struct softing_priv *priv;
> +	struct net_device *netdev;
> +	int mask_start;
> +	int j, error_reporting;
> +	struct can_frame msg;
> +	const struct can_bittiming *bt;
> +
> +	priv = netdev_priv(dev);
> +	card = priv->card;
> +
> +	if (!card->fw.up)
> +		return -EIO;
> +
> +	ret = mutex_lock_interruptible(&card->fw.lock);
> +	if (ret)
> +		return ret;
> +
> +	mask_start = 0;
> +	if (dev && up)
> +		/* prepare to start this bus as well */
> +		mask_start |= (1 << priv->index);
> +	/* bring netdevs down */
> +	for (j = 0; j < ARRAY_SIZE(card->net); ++j) {
> +		netdev = card->net[j];
> +		if (!netdev)
> +			continue;
> +		priv = netdev_priv(netdev);
> +
> +		if (dev != netdev)
> +			netif_stop_queue(netdev);
> +
> +		if (netif_running(netdev)) {
> +			if (dev != netdev)
> +				mask_start |= (1 << j);
> +			priv->tx.pending = 0;
> +			priv->tx.echo_put = 0;
> +			priv->tx.echo_get = 0;
> +			/* this bus' may just have called open_candev()

Please use

  /*
   * Comment
   */

> +			 * which is rather stupid to call close_candev()
> +			 * already
> +			 * but we may come here from busoff recovery too
> +			 * in which case the echo_skb _needs_ flushing too.
> +			 * just be sure to call open_candev() again
> +			 */
> +			close_candev(netdev);
> +		}
> +		priv->can.state = CAN_STATE_STOPPED;
> +	}
> +	card->tx.pending = 0;

> +
> +	softing_enable_irq(card, 0);
> +	ret = softing_reset_chip(card);
> +	if (ret)
> +		goto failed;
> +	if (!mask_start)
> +		/* no busses to be brought up */
> +		goto card_done;
> +
> +	if ((mask_start & 1) && (mask_start & 2)
> +			&& (softing_error_reporting(card->net[0])
> +				!= softing_error_reporting(card->net[1]))) {
> +		dev_alert(&card->pdev->dev,
> +				"err_reporting flag differs for busses\n");
> +		goto invalid;
> +	}
> +	error_reporting = 0;
> +	if (mask_start & 1) {
> +		netdev = card->net[0];
> +		priv = netdev_priv(netdev);
> +		error_reporting += softing_error_reporting(netdev);
> +		/* init chip 1 */
> +		bt = &priv->can.bittiming;
> +		iowrite16(bt->brp, &card->dpram[DPRAM_FCT_PARAM + 2]);
> +		iowrite16(bt->sjw, &card->dpram[DPRAM_FCT_PARAM + 4]);
> +		iowrite16(bt->phase_seg1 + bt->prop_seg,
> +				&card->dpram[DPRAM_FCT_PARAM + 6]);
> +		iowrite16(bt->phase_seg2, &card->dpram[DPRAM_FCT_PARAM + 8]);
> +		iowrite16((priv->can.ctrlmode & CAN_CTRLMODE_3_SAMPLES) ? 1 : 0,
> +				&card->dpram[DPRAM_FCT_PARAM + 10]);
> +		if (softing_fct_cmd(card, 1, 0, "initialize_chip[0]"))
> +			goto failed;
> +		/* set mode */
> +		iowrite16(0, &card->dpram[DPRAM_FCT_PARAM + 2]);
> +		iowrite16(0, &card->dpram[DPRAM_FCT_PARAM + 4]);
> +		if (softing_fct_cmd(card, 3, 0, "set_mode[0]"))
> +			goto failed;
> +		/* set filter */
> +		/* 11bit id & mask */
> +		iowrite16(0x0000, &card->dpram[DPRAM_FCT_PARAM + 2]);
> +		iowrite16(0x07ff, &card->dpram[DPRAM_FCT_PARAM + 4]);
> +		/* 29bit id.lo & mask.lo & id.hi & mask.hi */
> +		iowrite16(0x0000, &card->dpram[DPRAM_FCT_PARAM + 6]);
> +		iowrite16(0xffff, &card->dpram[DPRAM_FCT_PARAM + 8]);
> +		iowrite16(0x0000, &card->dpram[DPRAM_FCT_PARAM + 10]);
> +		iowrite16(0x1fff, &card->dpram[DPRAM_FCT_PARAM + 12]);
> +		if (softing_fct_cmd(card, 7, 0, "set_filter[0]"))
> +			goto failed;
> +		/* set output control */
> +		iowrite16(priv->output, &card->dpram[DPRAM_FCT_PARAM + 2]);
> +		if (softing_fct_cmd(card, 5, 0, "set_output[0]"))
> +			goto failed;
> +	}
> +	if (mask_start & 2) {

Magic constants?

> +		netdev = card->net[1];
> +		priv = netdev_priv(netdev);
> +		error_reporting += softing_error_reporting(netdev);
> +		/* init chip2 */
> +		bt = &priv->can.bittiming;
> +		iowrite16(bt->brp, &card->dpram[DPRAM_FCT_PARAM + 2]);
> +		iowrite16(bt->sjw, &card->dpram[DPRAM_FCT_PARAM + 4]);
> +		iowrite16(bt->phase_seg1 + bt->prop_seg,
> +				&card->dpram[DPRAM_FCT_PARAM + 6]);
> +		iowrite16(bt->phase_seg2, &card->dpram[DPRAM_FCT_PARAM + 8]);
> +		iowrite16((priv->can.ctrlmode & CAN_CTRLMODE_3_SAMPLES) ? 1 : 0,
> +				&card->dpram[DPRAM_FCT_PARAM + 10]);
> +		if (softing_fct_cmd(card, 2, 0, "initialize_chip[1]"))
> +			goto failed;
> +		/* set mode2 */
> +		iowrite16(0, &card->dpram[DPRAM_FCT_PARAM + 2]);
> +		iowrite16(0, &card->dpram[DPRAM_FCT_PARAM + 4]);
> +		if (softing_fct_cmd(card, 4, 0, "set_mode[1]"))
> +			goto failed;
> +		/* set filter2 */
> +		/* 11bit id & mask */
> +		iowrite16(0x0000, &card->dpram[DPRAM_FCT_PARAM + 2]);
> +		iowrite16(0x07ff, &card->dpram[DPRAM_FCT_PARAM + 4]);
> +		/* 29bit id.lo & mask.lo & id.hi & mask.hi */
> +		iowrite16(0x0000, &card->dpram[DPRAM_FCT_PARAM + 6]);
> +		iowrite16(0xffff, &card->dpram[DPRAM_FCT_PARAM + 8]);
> +		iowrite16(0x0000, &card->dpram[DPRAM_FCT_PARAM + 10]);
> +		iowrite16(0x1fff, &card->dpram[DPRAM_FCT_PARAM + 12]);
> +		if (softing_fct_cmd(card, 8, 0, "set_filter[1]"))
> +			goto failed;
> +		/* set output control2 */
> +		iowrite16(priv->output, &card->dpram[DPRAM_FCT_PARAM + 2]);
> +		if (softing_fct_cmd(card, 6, 0, "set_output[1]"))
> +			goto failed;
> +	}
> +	/* enable_error_frame */
> +	/*
> +	if (error_reporting) {
> +		if (softing_fct_cmd(card, 51, 0, "enable_error_frame"))
> +			goto failed;
> +	}
> +	*/

Please remove dead code!

> +	/* initialize interface */
> +	iowrite16(1, &card->dpram[DPRAM_FCT_PARAM + 2]);
> +	iowrite16(1, &card->dpram[DPRAM_FCT_PARAM + 4]);
> +	iowrite16(1, &card->dpram[DPRAM_FCT_PARAM + 6]);
> +	iowrite16(1, &card->dpram[DPRAM_FCT_PARAM + 8]);
> +	iowrite16(1, &card->dpram[DPRAM_FCT_PARAM + 10]);
> +	iowrite16(1, &card->dpram[DPRAM_FCT_PARAM + 12]);
> +	iowrite16(1, &card->dpram[DPRAM_FCT_PARAM + 14]);
> +	iowrite16(1, &card->dpram[DPRAM_FCT_PARAM + 16]);
> +	iowrite16(1, &card->dpram[DPRAM_FCT_PARAM + 18]);
> +	iowrite16(1, &card->dpram[DPRAM_FCT_PARAM + 20]);

Could be coded more efficiently with a for loop.

> +	if (softing_fct_cmd(card, 17, 0, "initialize_interface"))
> +		goto failed;
> +	/* enable_fifo */
> +	if (softing_fct_cmd(card, 36, 0, "enable_fifo"))
> +		goto failed;
> +	/* enable fifo tx ack */
> +	if (softing_fct_cmd(card, 13, 0, "fifo_tx_ack[0]"))
> +		goto failed;
> +	/* enable fifo tx ack2 */
> +	if (softing_fct_cmd(card, 14, 0, "fifo_tx_ack[1]"))
> +		goto failed;
> +	/* enable timestamps */
> +	/* is default, no code found */
> +	/* start_chip */
> +	if (softing_fct_cmd(card, 11, 0, "start_chip"))
> +		goto failed;
> +	iowrite8(0, &card->dpram[DPRAM_INFO_BUSSTATE]);
> +	iowrite8(0, &card->dpram[DPRAM_INFO_BUSSTATE2]);
> +	dev_info(&card->pdev->dev, "%s up\n", __func__);
> +	if (card->pdat->generation < 2) {
> +		iowrite8(0, &card->dpram[DPRAM_V2_IRQ_TOHOST]);
> +		/* flush the DPRAM caches */
> +		wmb();
> +	}
> +
> +	softing_initialize_timestamp(card);
> +
> +	/*
> +	 * do socketcan notifications/status changes
> +	 * from here, no errors should occur, or the failed: part
> +	 * must be reviewed
> +	 */
> +	memset(&msg, 0, sizeof(msg));
> +	msg.can_id = CAN_ERR_FLAG | CAN_ERR_RESTARTED;
> +	msg.can_dlc = CAN_ERR_DLC;
> +	for (j = 0; j < ARRAY_SIZE(card->net); ++j) {
> +		if (!(mask_start & (1 << j)))
> +			continue;
> +		netdev = card->net[j];
> +		if (!netdev)
> +			continue;
> +		priv = netdev_priv(netdev);
> +		priv->can.state = CAN_STATE_ERROR_ACTIVE;
> +		open_candev(netdev);
> +		if (dev != netdev) {
> +			/* notify other busses on the restart */
> +			softing_netdev_rx(netdev, &msg, ktime_set(0, 0));
> +			++priv->can.can_stats.restarts;
> +		}
> +		netif_wake_queue(netdev);
> +	}
> +
> +	/* enable interrupts */
> +	ret = softing_enable_irq(card, 1);
> +	if (ret)
> +		goto failed;
> +card_done:
> +	mutex_unlock(&card->fw.lock);
> +	return 0;
> +failed:
> +	dev_alert(&card->pdev->dev, "firmware failed, going idle\n");
> +invalid:
> +	softing_enable_irq(card, 0);
> +	softing_reset_chip(card);
> +	mutex_unlock(&card->fw.lock);
> +	/* bring all other interfaces down */
> +	for (j = 0; j < ARRAY_SIZE(card->net); ++j) {
> +		netdev = card->net[j];
> +		if (!netdev)
> +			continue;
> +		dev_close(netdev);
> +	}
> +	return -EIO;
> +}
> +
> +int softing_default_output(struct net_device *netdev)
> +{
> +	struct softing_priv *priv = netdev_priv(netdev);
> +	struct softing *card = priv->card;
> +
> +	switch (priv->chip) {
> +	case 1000:
> +		if (card->pdat->generation < 2)
> +			return 0xfb;
> +		return 0xfa;
> +	case 5:
> +		return 0x60;
> +	default:
> +		return 0x40;
> +	}
> +}

Again, some magic constants.

> diff --git a/drivers/net/can/softing/softing_main.c b/drivers/net/can/softing/softing_main.c
> new file mode 100644
> index 0000000..4f74075
> --- /dev/null
> +++ b/drivers/net/can/softing/softing_main.c
> @@ -0,0 +1,903 @@
> +/*
> +* drivers/net/can/softing/softing_main.c
> +*
> +* Copyright (C) 2008-2010
> +*
> +* - Kurt Van Dijck, EIA Electronics
> +*
> +* This program is free software; you can redistribute it and/or modify
> +* it under the terms of the version 2 of the GNU General Public License
> +* as published by the Free Software Foundation
> +*
> +* This program is distributed in the hope that it will be useful,
> +* but WITHOUT ANY WARRANTY; without even the implied warranty of
> +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +* GNU General Public License for more details.
> +*
> +* You should have received a copy of the GNU General Public License
> +* along with this program; if not, write to the Free Software
> +* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> +*/
> +
> +#include <linux/version.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/interrupt.h>
> +
> +#include "softing.h"
> +
> +#define TX_ECHO_SKB_MAX (TXMAX/2)
> +
> +/*
> + * test is a specific CAN netdev
> + * is online (ie. up 'n running, not sleeping, not busoff
> + */
> +static inline int canif_is_active(struct net_device *netdev)
> +{
> +	struct can_priv *can = netdev_priv(netdev);

Empty line, please.

> +	if (!netif_running(netdev))
> +		return 0;
> +	return (can->state <= CAN_STATE_ERROR_PASSIVE);
> +}
> +
> +/* trigger the tx queue-ing */
> +static netdev_tx_t
> +softing_netdev_start_xmit(struct sk_buff *skb, struct net_device *dev)

See general comments.

> +{
> +	struct softing_priv *priv = netdev_priv(dev);
> +	struct softing *card = priv->card;
> +	int ret;
> +	uint8_t *ptr;
> +	uint8_t fifo_wr, fifo_rd;
> +	struct can_frame *cf = (struct can_frame *)skb->data;
> +	uint8_t buf[DPRAM_TX_SIZE];
> +
> +	if (can_dropped_invalid_skb(dev, skb))
> +		return NETDEV_TX_OK;
> +
> +	spin_lock(&card->spin);
> +
> +	ret = NETDEV_TX_BUSY;
> +	if (!card->fw.up)
> +		goto xmit_done;
> +	if (card->tx.pending >= TXMAX)
> +		goto xmit_done;
> +	if (priv->tx.pending >= TX_ECHO_SKB_MAX)
> +		goto xmit_done;

What about using "||"?

> +	fifo_wr = ioread8(&card->dpram[DPRAM_TX_WR]);
> +	fifo_rd = ioread8(&card->dpram[DPRAM_TX_RD]);
> +	if (fifo_wr == fifo_rd)
> +		/* fifo full */
> +		goto xmit_done;
> +	memset(buf, 0, sizeof(buf));
> +	ptr = buf;
> +	*ptr = CMD_TX;
> +	if (cf->can_id & CAN_RTR_FLAG)
> +		*ptr |= CMD_RTR;
> +	if (cf->can_id & CAN_EFF_FLAG)
> +		*ptr |= CMD_XTD;
> +	if (priv->index)
> +		*ptr |= CMD_BUS2;
> +	++ptr;
> +	*ptr++ = cf->can_dlc;
> +	*ptr++ = (cf->can_id >> 0);
> +	*ptr++ = (cf->can_id >> 8);
> +	if (cf->can_id & CAN_EFF_FLAG) {
> +		*ptr++ = (cf->can_id >> 16);
> +		*ptr++ = (cf->can_id >> 24);
> +	} else {
> +		/* increment 1, not 2 as you might think */
> +		ptr += 1;
> +	}
> +	if (!(cf->can_id & CAN_RTR_FLAG))
> +		memcpy(ptr, &cf->data[0], cf->can_dlc);
> +	memcpy_toio(&card->dpram[DPRAM_TX + DPRAM_TX_SIZE * fifo_wr],
> +			buf, DPRAM_TX_SIZE);
> +	if (++fifo_wr >= DPRAM_TX_CNT)
> +		fifo_wr = 0;
> +	iowrite8(fifo_wr, &card->dpram[DPRAM_TX_WR]);
> +	card->tx.last_bus = priv->index;
> +	++card->tx.pending;
> +	++priv->tx.pending;
> +	can_put_echo_skb(skb, dev, priv->tx.echo_put);
> +	++priv->tx.echo_put;
> +	if (priv->tx.echo_put >= TX_ECHO_SKB_MAX)
> +		priv->tx.echo_put = 0;
> +	/* can_put_echo_skb() saves the skb, safe to return TX_OK */
> +	ret = NETDEV_TX_OK;
> +xmit_done:
> +	spin_unlock(&card->spin);
> +	if (card->tx.pending >= TXMAX) {
> +		int j;

Empty line please.

> +		for (j = 0; j < ARRAY_SIZE(card->net); ++j) {
> +			if (card->net[j])
> +				netif_stop_queue(card->net[j]);
> +		}
> +	}
> +	if (ret != NETDEV_TX_OK)
> +		netif_stop_queue(dev);
> +
> +	return ret;
> +}
> +
> +/*
> + * shortcut for skb delivery
> + */
> +int softing_netdev_rx(struct net_device *netdev,
> +		const struct can_frame *msg, ktime_t ktime)
> +{
> +	struct sk_buff *skb;
> +	struct can_frame *cf;
> +	int ret;
> +
> +	skb = alloc_can_skb(netdev, &cf);
> +	if (!skb)
> +		return -ENOMEM;
> +	memcpy(cf, msg, sizeof(*msg));
> +	skb->tstamp = ktime;
> +	ret = netif_rx(skb);
> +	if (ret == NET_RX_DROP)
> +		++netdev->stats.rx_dropped;

Hm, I wonder if all Socket-CAN drivers should handle the return value of
netif_rx that way?

> +	return ret;
> +}
> +
> +/*
> + * softing_handle_1
> + * pop 1 entry from the DPRAM queue, and process
> + */
> +static int softing_handle_1(struct softing *card)
> +{
> +	int j;
> +	struct net_device *netdev;
> +	struct softing_priv *priv;
> +	ktime_t ktime;
> +	struct can_frame msg;
> +
> +	int lost_msg;
> +	uint8_t fifo_rd, fifo_wr;
> +	unsigned int cnt = 0;
> +	uint8_t *ptr;
> +	u32 tmp;
> +	u8 cmd;
> +	uint8_t buf[DPRAM_RX_SIZE];
> +
> +	memset(&msg, 0, sizeof(msg));
> +	/* test for lost msgs */
> +	lost_msg = ioread8(&card->dpram[DPRAM_RX_LOST]);
> +	if (lost_msg) {
> +		/* reset condition */
> +		iowrite8(0, &card->dpram[DPRAM_RX_LOST]);
> +		/* prepare msg */
> +		msg.can_id = CAN_ERR_FLAG | CAN_ERR_CRTL;
> +		msg.can_dlc = CAN_ERR_DLC;
> +		msg.data[1] = CAN_ERR_CRTL_RX_OVERFLOW;
> +		/*
> +		 * service to all busses, we don't know which it was applicable
> +		 * but only service busses that are online
> +		 */
> +		for (j = 0; j < ARRAY_SIZE(card->net); ++j) {
> +			netdev = card->net[j];
> +			if (!netdev)
> +				continue;
> +			if (!canif_is_active(netdev))
> +				/* a dead bus has no overflows */
> +				continue;
> +			++netdev->stats.rx_over_errors;
> +			softing_netdev_rx(netdev, &msg, ktime_set(0, 0));
> +		}
> +		/* prepare for other use */
> +		memset(&msg, 0, sizeof(msg));
> +		++cnt;
> +	}
> +
> +	fifo_rd = ioread8(&card->dpram[DPRAM_RX_RD]);
> +	fifo_wr = ioread8(&card->dpram[DPRAM_RX_WR]);
> +
> +	if (++fifo_rd >= DPRAM_RX_CNT)
> +		fifo_rd = 0;
> +	if (fifo_wr == fifo_rd)
> +		return cnt;
> +
> +	memcpy_fromio(buf, &card->dpram[DPRAM_RX + DPRAM_RX_SIZE*fifo_rd],
> +			DPRAM_RX_SIZE);
> +	mb();
> +	/* trigger dual port RAM */
> +	iowrite8(fifo_rd, &card->dpram[DPRAM_RX_RD]);
> +
> +	ptr = buf;
> +	cmd = *ptr++;
> +	if (cmd == 0xff) {
> +		/* not quite usefull, probably the card has got out */
> +		dev_alert(&card->pdev->dev, "got cmd 0x%02x,"
> +			" I suspect the card is lost\n", cmd);
> +		return 0;
> +	}
> +	netdev = card->net[0];
> +	if (cmd & CMD_BUS2)
> +		netdev = card->net[1];
> +	priv = netdev_priv(netdev);
> +
> +	if (cmd & CMD_ERR) {
> +		u8 can_state;
> +		u8 state;
> +		state = *ptr++;
> +
> +		msg.can_id = CAN_ERR_FLAG;
> +		msg.can_dlc = CAN_ERR_DLC;
> +
> +		if (state & 0x80) {

Again some magic constants!

> +			can_state = CAN_STATE_BUS_OFF;
> +			msg.can_id |= CAN_ERR_BUSOFF;
> +			state = 2;

Ditto.

> +		} else if (state & 0x60) {
> +			can_state = CAN_STATE_ERROR_PASSIVE;
> +			msg.can_id |= CAN_ERR_BUSERROR;

A state change is not a bus error! You should use:

  msg.can_id |= CAN_ERR_CRTL;

> +			msg.data[1] = CAN_ERR_CRTL_TX_PASSIVE;
> +			state = 1;
> +		} else {
> +			can_state = CAN_STATE_ERROR_ACTIVE;
> +			state = 0;
> +			msg.can_id |= CAN_ERR_BUSERROR;

Ditto.

> +		}
> +		/* update DPRAM */
> +		iowrite8(state, &card->dpram[priv->index ?
> +				DPRAM_INFO_BUSSTATE2 : DPRAM_INFO_BUSSTATE]);
> +		/* timestamp */
> +		tmp = le32_to_cpup((void *)ptr);
> +		ptr += 4;
> +		ktime = softing_raw2ktime(card, tmp);
> +
> +		++priv->can.can_stats.bus_error;

Ditto.

> +		++netdev->stats.rx_errors;
> +		/* update internal status */
> +		if (can_state != priv->can.state) {
> +			priv->can.state = can_state;
> +			if (can_state == CAN_STATE_ERROR_PASSIVE)
> +				++priv->can.can_stats.error_passive;
> +			if (can_state == CAN_STATE_BUS_OFF) {

else if?

> +				/* this calls can_close_cleanup() */
> +				can_bus_off(netdev);
> +				netif_stop_queue(netdev);
> +			}
> +			/* trigger socketcan */
> +			softing_netdev_rx(netdev, &msg, ktime);
> +		}
> +
> +	} else {
> +		if (cmd & CMD_RTR)
> +			msg.can_id |= CAN_RTR_FLAG;
> +		/* acknowledge, was tx msg
> +		 * no real tx flag to set
> +		if (cmd & CMD_ACK) {
> +		}
> +		 */
> +		msg.can_dlc = get_can_dlc(*ptr++);
> +		if (cmd & CMD_XTD) {
> +			msg.can_id |= CAN_EFF_FLAG;
> +			msg.can_id |= le32_to_cpup((void *)ptr);
> +			ptr += 4;
> +		} else {
> +			msg.can_id |= le16_to_cpup((void *)ptr);
> +			ptr += 2;
> +		}
> +		/* timestamp */
> +		tmp = le32_to_cpup((void *)ptr);
> +		ptr += 4;
> +		ktime = softing_raw2ktime(card, tmp);
> +		if (!(msg.can_id & CAN_RTR_FLAG))
> +			memcpy(&msg.data[0], ptr, 8);
> +		ptr += 8;
> +		/* update socket */
> +		if (cmd & CMD_ACK) {
> +			struct sk_buff *skb;
> +			skb = priv->can.echo_skb[priv->tx.echo_get];
> +			if (skb)
> +				skb->tstamp = ktime;
> +			can_get_echo_skb(netdev, priv->tx.echo_get);
> +			++priv->tx.echo_get;
> +			if (priv->tx.echo_get >= TX_ECHO_SKB_MAX)
> +				priv->tx.echo_get = 0;
> +			if (priv->tx.pending)
> +				--priv->tx.pending;
> +			if (card->tx.pending)
> +				--card->tx.pending;
> +			++netdev->stats.tx_packets;
> +			netdev->stats.tx_bytes += msg.can_dlc;
> +		} else {
> +			++netdev->stats.rx_packets;
> +			netdev->stats.rx_bytes += msg.can_dlc;
> +			softing_netdev_rx(netdev, &msg, ktime);
> +		}
> +	}
> +	++cnt;
> +	return cnt;
> +}
> +
> +/*
> + * real interrupt handler
> + */
> +static irqreturn_t softing_irq_thread(int irq, void *dev_id)
> +{
> +	struct softing *card = (struct softing *)dev_id;
> +	struct net_device *netdev;
> +	struct softing_priv *priv;
> +	int j, offset, work_done;
> +
> +	work_done = 0;
> +	spin_lock_bh(&card->spin);
> +	while (softing_handle_1(card) > 0) {
> +		++card->irq.svc_count;
> +		++work_done;
> +	}
> +	spin_unlock_bh(&card->spin);
> +	/* resume tx queue's */
> +	offset = card->tx.last_bus;
> +	for (j = 0; j < ARRAY_SIZE(card->net); ++j) {
> +		if (card->tx.pending >= TXMAX)
> +			break;
> +		netdev = card->net[(j + offset + 1) % card->pdat->nbus];
> +		if (!netdev)
> +			continue;
> +		priv = netdev_priv(netdev);
> +		if (!canif_is_active(netdev))
> +			/* it makes no sense to wake dead busses */
> +			continue;
> +		if (priv->tx.pending >= TX_ECHO_SKB_MAX)
> +			continue;
> +		++work_done;
> +		netif_wake_queue(netdev);
> +	}
> +	return work_done ? IRQ_HANDLED : IRQ_NONE;
> +}
> +
> +/*
> + * interrupt routines:
> + * schedule the 'real interrupt handler'
> + */
> +static
> +irqreturn_t softing_irq_v2(int irq, void *dev_id)
> +{
> +	struct softing *card = (struct softing *)dev_id;
> +	uint8_t ir;
> +
> +	ir = ioread8(&card->dpram[DPRAM_V2_IRQ_TOHOST]);
> +	iowrite8(0, &card->dpram[DPRAM_V2_IRQ_TOHOST]);
> +	return (1 == ir) ? IRQ_WAKE_THREAD : IRQ_NONE;
> +}
> +
> +static
> +irqreturn_t softing_irq_v1(int irq, void *dev_id)

See general comments.

> +{
> +	struct softing *card = (struct softing *)dev_id;
> +	uint8_t ir;
> +
> +	ir = ioread8(&card->dpram[DPRAM_IRQ_TOHOST]);
> +	iowrite8(0, &card->dpram[DPRAM_IRQ_TOHOST]);
> +	return ir ? IRQ_WAKE_THREAD : IRQ_NONE;
> +}
> +
> +/*
> + * netdev/candev inter-operability
> + */
> +static int softing_netdev_open(struct net_device *ndev)
> +{
> +	int ret;
> +
> +	/* check or determine and set bittime */
> +	ret = open_candev(ndev);
> +	if (ret)
> +		goto failed;
> +	ret = softing_startstop(ndev, 1);
> +	if (ret)
> +		goto failed;
> +	return 0;
> +failed:
> +	return ret;

Do you really need that label?

> +}
> +
> +static int softing_netdev_stop(struct net_device *ndev)
> +{
> +	int ret;
> +
> +	netif_stop_queue(ndev);
> +
> +	/* softing cycle does close_candev() */
> +	ret = softing_startstop(ndev, 0);
> +	return ret;
> +}
> +
> +static int softing_candev_set_mode(struct net_device *ndev, enum can_mode mode)
> +{
> +	int ret;
> +
> +	switch (mode) {
> +	case CAN_MODE_START:
> +		/* softing_startstop does close_candev() */
> +		ret = softing_startstop(ndev, 1);
> +		return ret;
> +	case CAN_MODE_STOP:
> +	case CAN_MODE_SLEEP:
> +		return -EOPNOTSUPP;
> +	}
> +	return 0;
> +}
> +
> +/*
> + * Softing device management helpers
> + */
> +int softing_enable_irq(struct softing *card, int enable)
> +{
> +	int ret;

Empty line, please.

> +	if (!enable) {
> +		if (card->irq.requested && card->irq.nr) {
> +			free_irq(card->irq.nr, card);
> +			card->irq.requested = 0;
> +		}
> +		return 0;
> +	}
> +	if (!card->irq.requested && (card->irq.nr)) {
> +		ret = request_threaded_irq(card->irq.nr,
> +				(card->pdat->generation >= 2)
> +					? softing_irq_v2 : softing_irq_v1,
> +				softing_irq_thread, IRQF_SHARED,
> +				dev_name(&card->pdev->dev), card);
> +		if (ret) {
> +			dev_alert(&card->pdev->dev,
> +					"request_threaded_irq(%u) failed\n",
> +					card->irq.nr);
> +			return ret;
> +		}
> +		card->irq.requested = 1;
> +	}
> +	return 0;
> +}
> +
> +static void softing_card_shutdown(struct softing *card)
> +{
> +	int fw_up = 0;

Empty line, please.

> +	dev_dbg(&card->pdev->dev, "%s()\n", __func__);

Please reduce the debugging output to a few useful messages (for the
final user).

> +	if (mutex_lock_interruptible(&card->fw.lock))
> +		/* return -ERESTARTSYS*/;

What is the "if" then good for? Do you want to handle the return code?

> +	fw_up = card->fw.up;
> +	card->fw.up = 0;
> +
> +	if (card->irq.requested && card->irq.nr) {
> +		free_irq(card->irq.nr, card);
> +		card->irq.requested = 0;
> +	}
> +	if (fw_up) {
> +		if (card->pdat->enable_irq)
> +			card->pdat->enable_irq(card->pdev, 0);
> +		softing_set_reset_dpram(card);
> +		if (card->pdat->reset)
> +			card->pdat->reset(card->pdev, 1);
> +	}
> +	mutex_unlock(&card->fw.lock);
> +}
> +
> +static int softing_card_boot(struct softing *card)
> +{
> +	int j;
> +	static const uint8_t stream[] = {
> +		0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, };
> +	unsigned char back[sizeof(stream)];

Empty line please.

> +	dev_dbg(&card->pdev->dev, "%s()\n", __func__);

See comment above.

> +	if (mutex_lock_interruptible(&card->fw.lock))
> +		return -ERESTARTSYS;
> +	if (card->fw.up) {
> +		mutex_unlock(&card->fw.lock);
> +		return 0;
> +	}
> +	/* reset board */
> +	if (card->pdat->enable_irq)
> +		card->pdat->enable_irq(card->pdev, 1);
> +	/* boot card */
> +	softing_set_reset_dpram(card);
> +	if (card->pdat->reset)
> +		card->pdat->reset(card->pdev, 1);
> +	for (j = 0; (j + sizeof(stream)) < card->dpram_size;
> +			j += sizeof(stream)) {
> +
> +		memcpy_toio(&card->dpram[j], stream, sizeof(stream));
> +		/* flush IO cache */
> +		mb();
> +		memcpy_fromio(back, &card->dpram[j], sizeof(stream));
> +
> +		if (!memcmp(back, stream, sizeof(stream)))
> +			continue;
> +		/* memory is not equal */
> +		dev_alert(&card->pdev->dev, "dpram failed at 0x%04x\n", j);
> +		goto open_failed;
> +	}
> +	wmb();
> +	/* load boot firmware */
> +	if (softing_load_fw(card->pdat->boot.fw, card, card->dpram,
> +				 card->dpram_size,
> +				 card->pdat->boot.offs -
> +				 card->pdat->boot.addr))
> +		goto open_failed;
> +	/* load loader firmware */
> +	if (softing_load_fw(card->pdat->load.fw, card, card->dpram,
> +				 card->dpram_size,
> +				 card->pdat->load.offs -
> +				 card->pdat->load.addr))
> +		goto open_failed;
> +
> +	if (card->pdat->reset)
> +		card->pdat->reset(card->pdev, 0);
> +	softing_clr_reset_dpram(card);
> +	if (softing_bootloader_command(card, 0, "card boot"))
> +		goto open_failed;
> +	if (softing_load_app_fw(card->pdat->app.fw, card))
> +		goto open_failed;
> +	/* reset chip */
> +	iowrite8(0, &card->dpram[DPRAM_RESET_RX_FIFO]);
> +	iowrite8(0, &card->dpram[DPRAM_RESET_RX_FIFO+1]);
> +	iowrite8(1, &card->dpram[DPRAM_RESET]);
> +	iowrite8(0, &card->dpram[DPRAM_RESET+1]);
> +	/* sync */
> +	if (softing_fct_cmd(card, 99, 0x55, "sync-a"))
> +		goto open_failed;
> +	if (softing_fct_cmd(card, 99, 0xaa, "sync-a"))
> +		goto open_failed;
> +	/* reset chip */
> +	if (softing_fct_cmd(card, 0, 0, "reset_chip"))
> +		goto open_failed;
> +	/* get_serial */
> +	if (softing_fct_cmd(card, 43, 0, "get_serial_number"))
> +		goto open_failed;
> +	card->id.serial = ioread32(&card->dpram[DPRAM_FCT_PARAM]);
> +	/* get_version */
> +	if (softing_fct_cmd(card, 12, 0, "get_version"))
> +		goto open_failed;
> +	card->id.fw = ioread16(&card->dpram[DPRAM_FCT_PARAM + 2]);
> +	card->id.hw = ioread16(&card->dpram[DPRAM_FCT_PARAM + 4]);
> +	card->id.lic = ioread16(&card->dpram[DPRAM_FCT_PARAM + 6]);
> +	card->id.chip[0] = ioread16(&card->dpram[DPRAM_FCT_PARAM + 8]);
> +	card->id.chip[1] = ioread16(&card->dpram[DPRAM_FCT_PARAM + 10]);
> +
> +	dev_info(&card->pdev->dev, "card booted, type %s, "
> +			"serial %u, fw %u, hw %u, lic %u, chip (%u,%u)\n",
> +		  card->pdat->name, card->id.serial, card->id.fw, card->id.hw,
> +		  card->id.lic, card->id.chip[0], card->id.chip[1]);
> +
> +	card->fw.up = 1;
> +	mutex_unlock(&card->fw.lock);
> +	return 0;
> +open_failed:
> +	card->fw.up = 0;
> +	if (card->pdat->enable_irq)
> +		card->pdat->enable_irq(card->pdev, 0);
> +	softing_set_reset_dpram(card);
> +	if (card->pdat->reset)
> +		card->pdat->reset(card->pdev, 1);
> +	mutex_unlock(&card->fw.lock);
> +	return -EIO;
> +}
> +
> +/*
> + * netdev sysfs
> + */
> +static ssize_t show_channel(struct device *dev
> +		, struct device_attribute *attr, char *buf)

See general comments. Here and for the following function declarations.

> +{
> +	struct net_device *ndev = to_net_dev(dev);
> +	struct softing_priv *priv = netdev2softing(ndev);
> +	return sprintf(buf, "%i\n", priv->index);
> +}
> +
> +static ssize_t show_chip(struct device *dev
> +		, struct device_attribute *attr, char *buf)
> +{
> +	struct net_device *ndev = to_net_dev(dev);
> +	struct softing_priv *priv = netdev2softing(ndev);
> +	return sprintf(buf, "%i\n", priv->chip);
> +}
> +
> +static ssize_t show_output(struct device *dev
> +		, struct device_attribute *attr, char *buf)
> +{
> +	struct net_device *ndev = to_net_dev(dev);
> +	struct softing_priv *priv = netdev2softing(ndev);
> +	return sprintf(buf, "0x%02x\n", priv->output);
> +}
> +
> +static ssize_t store_output(struct device *dev
> +		, struct device_attribute *attr
> +		, const char *buf, size_t count)
> +{
> +	struct net_device *ndev = to_net_dev(dev);
> +	struct softing_priv *priv = netdev2softing(ndev);
> +	struct softing *card = priv->card;
> +	unsigned long val;
> +	int ret;
> +
> +	ret = strict_strtoul(buf, 0, &val);
> +	if (ret < 0)
> +		return ret;
> +	val &= 0xFF;
> +
> +	ret = mutex_lock_interruptible(&card->fw.lock);
> +	if (ret)
> +		return -ERESTARTSYS;
> +	if (netif_running(ndev)) {
> +		mutex_unlock(&card->fw.lock);
> +		return -EBUSY;
> +	}
> +	priv->output = val;
> +	mutex_unlock(&card->fw.lock);
> +	return count;
> +}
> +
> +static const DEVICE_ATTR(channel, S_IRUGO, show_channel, NULL);
> +static const DEVICE_ATTR(chip, S_IRUGO, show_chip, NULL);
> +static const DEVICE_ATTR(output, S_IRUGO | S_IWUSR, show_output, store_output);
> +
> +static const struct attribute *const netdev_sysfs_attrs[] = {
> +	&dev_attr_channel.attr,
> +	&dev_attr_chip.attr,
> +	&dev_attr_output.attr,
> +	NULL,
> +};
> +static const struct attribute_group netdev_sysfs_group = {
> +	.name  = NULL,
> +	.attrs = (struct attribute **)netdev_sysfs_attrs,
> +};
> +
> +static const struct net_device_ops softing_netdev_ops = {
> +	.ndo_open = softing_netdev_open,
> +	.ndo_stop = softing_netdev_stop,
> +	.ndo_start_xmit	= softing_netdev_start_xmit,
> +};
> +
> +static const struct can_bittiming_const softing_btr_const = {
> +	.tseg1_min = 1,
> +	.tseg1_max = 16,
> +	.tseg2_min = 1,
> +	.tseg2_max = 8,
> +	.sjw_max = 4, /* overruled */
> +	.brp_min = 1,
> +	.brp_max = 32, /* overruled */
> +	.brp_inc = 1,
> +};
> +
> +
> +static struct net_device *softing_netdev_create(
> +		struct softing *card, u16 chip_id)
> +{
> +	struct net_device *netdev;
> +	struct softing_priv *priv;
> +
> +	netdev = alloc_candev(sizeof(*priv), TX_ECHO_SKB_MAX);
> +	if (!netdev) {
> +		dev_alert(&card->pdev->dev, "alloc_candev failed\n");
> +		return NULL;
> +	}
> +	priv = netdev_priv(netdev);
> +	priv->netdev = netdev;
> +	priv->card = card;
> +	memcpy(&priv->btr_const, &softing_btr_const, sizeof(priv->btr_const));
> +	priv->btr_const.brp_max = card->pdat->max_brp;
> +	priv->btr_const.sjw_max = card->pdat->max_sjw;
> +	priv->can.bittiming_const = &priv->btr_const;
> +	priv->can.clock.freq = 8000000;

Another magic constant.

> +	priv->chip = chip_id;
> +	priv->output = softing_default_output(netdev);
> +	SET_NETDEV_DEV(netdev, &card->pdev->dev);
> +
> +	netdev->flags |= IFF_ECHO;
> +	netdev->netdev_ops	= &softing_netdev_ops;
> +	priv->can.do_set_mode	= softing_candev_set_mode;

See general comments.

> +	priv->can.ctrlmode_supported =
> +		CAN_CTRLMODE_3_SAMPLES;/* | CAN_CTRLMODE_BERR_REPORTING */;

Hm, any chance to support CAN_CTRLMODE_BERR_REPORTING? If not, please
remove the comment.

> +	return netdev;
> +}
> +
> +static int softing_netdev_register(struct net_device *netdev)
> +{
> +	int ret;
> +
> +	/*
> +	 * provide bus-specific sysfs attributes _during_ the uevent
> +	 */
> +	netdev->sysfs_groups[0] = &netdev_sysfs_group;
> +	ret = register_candev(netdev);
> +	if (ret) {
> +		dev_alert(&netdev->dev, "register failed\n");
> +		return ret;
> +	}
> +	return 0;
> +}
> +
> +static void softing_netdev_cleanup(struct net_device *netdev)
> +{
> +	unregister_candev(netdev);
> +	free_candev(netdev);
> +}
> +
> +/*
> + * sysfs for Platform device
> + */
> +#define DEV_ATTR_RO(name, member) \
> +static ssize_t show_##name(struct device *dev, \
> +		struct device_attribute *attr, char *buf) \
> +{ \
> +	struct softing *card = platform_get_drvdata(to_platform_device(dev)); \
> +	return sprintf(buf, "%u\n", card->member); \
> +} \
> +static DEVICE_ATTR(name, 0444, show_##name, NULL)
> +
> +DEV_ATTR_RO(serial	, id.serial);
> +DEV_ATTR_RO(firmware	, id.fw);
> +DEV_ATTR_RO(hardware	, id.hw);
> +DEV_ATTR_RO(license	, id.lic);
> +DEV_ATTR_RO(freq	, id.freq);
> +DEV_ATTR_RO(txpending	, tx.pending);
> +
> +static struct attribute *softing_pdev_attrs[] = {
> +	&dev_attr_serial.attr,
> +	&dev_attr_firmware.attr,
> +	&dev_attr_hardware.attr,
> +	&dev_attr_license.attr,
> +	&dev_attr_freq.attr,
> +	&dev_attr_txpending.attr,
> +	NULL,
> +};
> +
> +static const struct attribute_group softing_pdev_group = {
> +	.attrs = softing_pdev_attrs,
> +};
> +
> +/*
> + * platform driver
> + */
> +static int softing_pdev_remove(struct platform_device *pdev)
> +{
> +	struct softing *card = platform_get_drvdata(pdev);
> +	int j;
> +
> +	/* first, disable card*/
> +	softing_card_shutdown(card);
> +
> +	for (j = 0; j < ARRAY_SIZE(card->net); ++j) {
> +		if (!card->net[j])
> +			continue;
> +		softing_netdev_cleanup(card->net[j]);
> +		card->net[j] = NULL;
> +	}
> +	sysfs_remove_group(&pdev->dev.kobj, &softing_pdev_group);
> +
> +	iounmap(card->dpram);
> +	kfree(card);
> +	return 0;
> +}
> +
> +static int softing_pdev_probe(struct platform_device *pdev)
> +{
> +	const struct softing_platform_data *pdat = pdev->dev.platform_data;
> +	struct softing *card;
> +	struct net_device *netdev;
> +	struct softing_priv *priv;
> +	struct resource *pres;
> +	int ret;
> +	int j;
> +
> +	if (!pdat) {
> +		dev_warn(&pdev->dev, "no platform data\n");
> +		return -EINVAL;
> +	}
> +	if (pdat->nbus > ARRAY_SIZE(card->net)) {
> +		dev_warn(&pdev->dev, "%u nets??\n", pdat->nbus);
> +		return -EINVAL;
> +	}
> +
> +	card = kzalloc(sizeof(*card), GFP_KERNEL);
> +	if (!card)
> +		return -ENOMEM;
> +	card->pdat = pdat;
> +	card->pdev = pdev;
> +	platform_set_drvdata(pdev, card);
> +	/* try_module_get(THIS_MODULE); */
> +	mutex_init(&card->fw.lock);
> +	spin_lock_init(&card->spin);
> +
> +	ret = -EINVAL;
> +	pres = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	if (!pres)
> +		goto platform_resource_failed;;
> +	card->dpram_phys = pres->start;
> +	card->dpram_size = pres->end - pres->start + 1;
> +	card->dpram = ioremap_nocache(card->dpram_phys, card->dpram_size);
> +	if (!card->dpram) {
> +		dev_alert(&card->pdev->dev, "dpram ioremap failed\n");
> +		goto ioremap_failed;
> +	}
> +
> +	pres = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
> +	if (pres)
> +		card->irq.nr = pres->start;
> +
> +	/* reset card */
> +	ret = -EIO;
> +	if (softing_card_boot(card)) {
> +		dev_alert(&pdev->dev, "failed to boot\n");
> +		goto boot_failed;
> +	}
> +
> +	/* only now, the chip's are known */
> +	card->id.freq = card->pdat->freq * 1000000UL;

It would be more flexible to specific the frequency in Hz!? Or use a
more logical member name, frey_mhz, at least.

> +
> +	ret = sysfs_create_group(&pdev->dev.kobj, &softing_pdev_group);
> +	if (ret < 0) {
> +		dev_alert(&card->pdev->dev, "sysfs failed\n");
> +		goto sysfs_failed;
> +	}
> +
> +	ret = -ENOMEM;
> +	for (j = 0; j < ARRAY_SIZE(card->net); ++j) {
> +		card->net[j] = netdev =
> +			softing_netdev_create(card, card->id.chip[j]);
> +		if (!netdev) {
> +			dev_alert(&pdev->dev, "failed to make can[%i]", j);
> +			goto netdev_failed;
> +		}
> +		priv = netdev_priv(card->net[j]);
> +		priv->index = j;
> +		ret = softing_netdev_register(netdev);
> +		if (ret) {
> +			free_candev(netdev);
> +			card->net[j] = NULL;
> +			dev_alert(&card->pdev->dev,
> +				"failed to register can[%i]\n", j);
> +			goto netdev_failed;
> +		}
> +	}
> +	dev_info(&card->pdev->dev, "card initialised\n");
> +	return 0;
> +
> +netdev_failed:
> +	for (j = 0; j < ARRAY_SIZE(card->net); ++j) {
> +		if (!card->net[j])
> +			continue;
> +		softing_netdev_cleanup(card->net[j]);
> +	}
> +	sysfs_remove_group(&pdev->dev.kobj, &softing_pdev_group);
> +sysfs_failed:
> +	softing_card_shutdown(card);
> +boot_failed:
> +	iounmap(card->dpram);
> +ioremap_failed:
> +platform_resource_failed:
> +	kfree(card);
> +	return ret;
> +}
> +
> +static struct platform_driver softing_driver = {
> +	.driver = {
> +		.name = "softing",
> +		.owner = THIS_MODULE,
> +	},
> +	.probe = softing_pdev_probe,
> +	.remove = softing_pdev_remove,
> +};

I'm missing the use of __devinit and friends.


> +MODULE_ALIAS("platform:softing");
> +
> +static int __init softing_start(void)
> +{
> +	return platform_driver_register(&softing_driver);
> +}
> +
> +static void __exit softing_stop(void)
> +{
> +	platform_driver_unregister(&softing_driver);
> +}
> +
> +module_init(softing_start);
> +module_exit(softing_stop);
> +
> +MODULE_DESCRIPTION("Softing DPRAM CAN driver");
> +MODULE_AUTHOR("Kurt Van Dijck <kurt.van.dijck-/BeEPy95v10@public.gmane.org>");
> +MODULE_LICENSE("GPL");

GPL v2 ?

> diff --git a/drivers/net/can/softing/softing_platform.h b/drivers/net/can/softing/softing_platform.h
> new file mode 100644
> index 0000000..678df36
> --- /dev/null
> +++ b/drivers/net/can/softing/softing_platform.h
> @@ -0,0 +1,38 @@
> +
> +#include <linux/platform_device.h>
> +
> +#ifndef _SOFTING_DEVICE_H_
> +#define _SOFTING_DEVICE_H_
> +
> +/* softing firmware directory prefix */
> +#define fw_dir "softing-4.6/"
> +
> +struct softing_platform_data {
> +	unsigned int manf;
> +	unsigned int prod;
> +	/* generation
> +	 * 1st with NEC or SJA1000
> +	 * 8bit, exclusive interrupt, ...
> +	 * 2nd only SJA1000
> +	 * 16bit, shared interrupt
> +	 */

Please the usual multiline comment style.

> +	int generation;
> +	int nbus; /* # busses on device */
> +	unsigned int freq; /* crystal in MHz */
> +	unsigned int max_brp;
> +	unsigned int max_sjw;
> +	unsigned long dpram_size;
> +	char name[32];
> +	struct {
> +		unsigned long offs;
> +		unsigned long addr;
> +		const char *fw;
> +	} boot, load, app;
> +	/* reset() function, bring pdev in or out of reset, depending on
> +	   value */
> +	int (*reset)(struct platform_device *pdev, int value);
> +	int (*enable_irq)(struct platform_device *pdev, int value);
> +};
> +
> +#endif
> +

Wolfgang.

^ permalink raw reply

* Re: [PATCH net-next-2.6 v2 2/2] can: add driver for Softing card
From: Wolfgang Grandegger @ 2011-01-05 21:02 UTC (permalink / raw)
  To: Kurt Van Dijck
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20110104150923.GC321-MxZ6Iy/zr/UdbCeoMzGj59i2O/JbrIOy@public.gmane.org>

On 01/04/2011 04:09 PM, Kurt Van Dijck wrote:
> This patch adds the driver that creates a platform:softing device
> from a pcmcia_device
> Note: the Kconfig indicates a dependency on the softing.ko driver,
> but this is purely to make configuration intuitive. This driver will
> work independent, but no CAN network devices appear until softing.ko is
> loaded too.
> 
> Signed-off-by: Kurt Van Dijck <kurt.van.dijck-/BeEPy95v10@public.gmane.org>
> 
> ---
>  drivers/net/can/softing/Kconfig      |   13 ++
>  drivers/net/can/softing/Makefile     |    1 +
>  drivers/net/can/softing/softing_cs.c |  361 ++++++++++++++++++++++++++++++++++
>  3 files changed, 375 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/can/softing/Kconfig b/drivers/net/can/softing/Kconfig
> index 072f337..14ebe14 100644
> --- a/drivers/net/can/softing/Kconfig
> +++ b/drivers/net/can/softing/Kconfig
> @@ -14,3 +14,16 @@ config CAN_SOFTING
>  	  controls the 2 busses on the card together.
>  	  As such, some actions (start/stop/busoff recovery) on 1 bus
>  	  must bring down the other bus too temporarily.
> +
> +config CAN_SOFTING_CS
> +	tristate "Softing CAN pcmcia cards"
> +	depends on PCMCIA

Does it not also depend on CAN_SOFTING?

> +	---help---
> +	  Support for PCMCIA cards from Softing Gmbh & some cards
> +	  from Vector Gmbh.
> +	  You need firmware for these, which you can get at
> +	  http://developer.berlios.de/projects/socketcan/
> +	  This version of the driver is written against
> +	  firmware version 4.6 (softing-fw-4.6-binaries.tar.gz)
> +	  In order to use the card as CAN device, you need the Softing generic
> +	  support too.
> diff --git a/drivers/net/can/softing/Makefile b/drivers/net/can/softing/Makefile
> index 7878b7b..5f0f527 100644
> --- a/drivers/net/can/softing/Makefile
> +++ b/drivers/net/can/softing/Makefile
> @@ -1,5 +1,6 @@
>  
>  softing-y := softing_main.o softing_fw.o
>  obj-$(CONFIG_CAN_SOFTING)        += softing.o
> +obj-$(CONFIG_CAN_SOFTING_CS)     += softing_cs.o
>  
>  ccflags-$(CONFIG_CAN_DEBUG_DEVICES) := -DDEBUG
> diff --git a/drivers/net/can/softing/softing_cs.c b/drivers/net/can/softing/softing_cs.c
> new file mode 100644
> index 0000000..cffd4d1
> --- /dev/null
> +++ b/drivers/net/can/softing/softing_cs.c
> @@ -0,0 +1,361 @@
> +/*
> + * drivers/net/can/softing/softing_cs.c

Please remove reduntant information, here and in other files.

> + *
> + * Copyright (C) 2008-2010
> + *
> + * - Kurt Van Dijck, EIA Electronics
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the version 2 of the GNU General Public License
> + * as published by the Free Software Foundation
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> + */
> +
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +
> +#include <pcmcia/cistpl.h>
> +#include <pcmcia/ds.h>
> +
> +#include "softing_platform.h"
> +
> +static int softingcs_index;
> +static spinlock_t softingcs_index_lock;
> +
> +static int softingcs_reset(struct platform_device *pdev, int v);
> +static int softingcs_enable_irq(struct platform_device *pdev, int v);
> +
> +/*
> + * platform_data descriptions
> + */
> +static const struct softing_platform_data softingcs_platform_data[] = {
> +{
> +	.name = "CANcard",
> +	.manf = 0x0168, .prod = 0x001,
> +	.generation = 1,
> +	.nbus = 2,
> +	.freq = 16, .max_brp = 32, .max_sjw = 4,
> +	.dpram_size = 0x0800,
> +	.boot = {0x0000, 0x000000, fw_dir "bcard.bin",},
> +	.load = {0x0120, 0x00f600, fw_dir "ldcard.bin",},
> +	.app = {0x0010, 0x0d0000, fw_dir "cancard.bin",},
> +	.reset = softingcs_reset,
> +	.enable_irq = softingcs_enable_irq,
> +}, {
> +	.name = "CANcard-NEC",
> +	.manf = 0x0168, .prod = 0x002,
> +	.generation = 1,
> +	.nbus = 2,
> +	.freq = 16, .max_brp = 32, .max_sjw = 4,
> +	.dpram_size = 0x0800,
> +	.boot = {0x0000, 0x000000, fw_dir "bcard.bin",},
> +	.load = {0x0120, 0x00f600, fw_dir "ldcard.bin",},
> +	.app = {0x0010, 0x0d0000, fw_dir "cancard.bin",},
> +	.reset = softingcs_reset,
> +	.enable_irq = softingcs_enable_irq,
> +}, {
> +	.name = "CANcard-SJA",
> +	.manf = 0x0168, .prod = 0x004,
> +	.generation = 1,
> +	.nbus = 2,
> +	.freq = 20, .max_brp = 32, .max_sjw = 4,
> +	.dpram_size = 0x0800,
> +	.boot = {0x0000, 0x000000, fw_dir "bcard.bin",},
> +	.load = {0x0120, 0x00f600, fw_dir "ldcard.bin",},
> +	.app = {0x0010, 0x0d0000, fw_dir "cansja.bin",},
> +	.reset = softingcs_reset,
> +	.enable_irq = softingcs_enable_irq,
> +}, {
> +	.name = "CANcard-2",
> +	.manf = 0x0168, .prod = 0x005,
> +	.generation = 2,
> +	.nbus = 2,
> +	.freq = 24, .max_brp = 64, .max_sjw = 4,
> +	.dpram_size = 0x1000,
> +	.boot = {0x0000, 0x000000, fw_dir "bcard2.bin",},
> +	.load = {0x0120, 0x00f600, fw_dir "ldcard2.bin",},
> +	.app = {0x0010, 0x0d0000, fw_dir "cancrd2.bin",},
> +	.reset = softingcs_reset,
> +	.enable_irq = 0,
> +}, {
> +	.name = "Vector-CANcard",
> +	.manf = 0x0168, .prod = 0x081,
> +	.generation = 1,
> +	.nbus = 2,
> +	.freq = 16, .max_brp = 64, .max_sjw = 4,
> +	.dpram_size = 0x0800,
> +	.boot = {0x0000, 0x000000, fw_dir "bcard.bin",},
> +	.load = {0x0120, 0x00f600, fw_dir "ldcard.bin",},
> +	.app = {0x0010, 0x0d0000, fw_dir "cancard.bin",},
> +	.reset = softingcs_reset,
> +	.enable_irq = softingcs_enable_irq,
> +}, {
> +	.name = "Vector-CANcard-SJA",
> +	.manf = 0x0168, .prod = 0x084,
> +	.generation = 1,
> +	.nbus = 2,
> +	.freq = 20, .max_brp = 32, .max_sjw = 4,
> +	.dpram_size = 0x0800,
> +	.boot = {0x0000, 0x000000, fw_dir "bcard.bin",},
> +	.load = {0x0120, 0x00f600, fw_dir "ldcard.bin",},
> +	.app = {0x0010, 0x0d0000, fw_dir "cansja.bin",},
> +	.reset = softingcs_reset,
> +	.enable_irq = softingcs_enable_irq,
> +}, {
> +	.name = "Vector-CANcard-2",
> +	.manf = 0x0168, .prod = 0x085,
> +	.generation = 2,
> +	.nbus = 2,
> +	.freq = 24, .max_brp = 64, .max_sjw = 4,
> +	.dpram_size = 0x1000,
> +	.boot = {0x0000, 0x000000, fw_dir "bcard2.bin",},
> +	.load = {0x0120, 0x00f600, fw_dir "ldcard2.bin",},
> +	.app = {0x0010, 0x0d0000, fw_dir "cancrd2.bin",},
> +	.reset = softingcs_reset,
> +	.enable_irq = 0,
> +}, {
> +	.name = "EDICcard-NEC",
> +	.manf = 0x0168, .prod = 0x102,
> +	.generation = 1,
> +	.nbus = 2,
> +	.freq = 16, .max_brp = 64, .max_sjw = 4,
> +	.dpram_size = 0x0800,
> +	.boot = {0x0000, 0x000000, fw_dir "bcard.bin",},
> +	.load = {0x0120, 0x00f600, fw_dir "ldcard.bin",},
> +	.app = {0x0010, 0x0d0000, fw_dir "cancard.bin",},
> +	.reset = softingcs_reset,
> +	.enable_irq = softingcs_enable_irq,
> +}, {
> +	.name = "EDICcard-2",
> +	.manf = 0x0168, .prod = 0x105,
> +	.generation = 2,
> +	.nbus = 2,
> +	.freq = 24, .max_brp = 64, .max_sjw = 4,
> +	.dpram_size = 0x1000,
> +	.boot = {0x0000, 0x000000, fw_dir "bcard2.bin",},
> +	.load = {0x0120, 0x00f600, fw_dir "ldcard2.bin",},
> +	.app = {0x0010, 0x0d0000, fw_dir "cancrd2.bin",},
> +	.reset = softingcs_reset,
> +	.enable_irq = 0,
> +}, {
> +	0, 0,
> +},
> +};
> +
> +MODULE_FIRMWARE(fw_dir "bcard.bin");
> +MODULE_FIRMWARE(fw_dir "ldcard.bin");
> +MODULE_FIRMWARE(fw_dir "cancard.bin");
> +MODULE_FIRMWARE(fw_dir "cansja.bin");
> +
> +MODULE_FIRMWARE(fw_dir "bcard2.bin");
> +MODULE_FIRMWARE(fw_dir "ldcard2.bin");
> +MODULE_FIRMWARE(fw_dir "cancrd2.bin");
> +
> +static const struct softing_platform_data *softingcs_find_platform_data(
> +		unsigned int manf, unsigned int prod)
> +{
> +	const struct softing_platform_data *lp;
> +
> +	for (lp = softingcs_platform_data; lp->manf; ++lp) {
> +		if ((lp->manf == manf) && (lp->prod == prod))
> +			return lp;
> +	}
> +	return 0;
> +}
> +
> +/*
> + * platformdata callbacks
> + */
> +static int softingcs_reset(struct platform_device *pdev, int v)
> +{
> +	struct pcmcia_device *pcmcia = to_pcmcia_dev(pdev->dev.parent);
> +
> +	dev_dbg(&pdev->dev, "pcmcia config [2] %02x\n", v ? 0 : 0x20);
> +	return pcmcia_write_config_byte(pcmcia, 2, v ? 0 : 0x20);
> +}
> +
> +static int softingcs_enable_irq(struct platform_device *pdev, int v)
> +{
> +	struct pcmcia_device *pcmcia = to_pcmcia_dev(pdev->dev.parent);
> +
> +	dev_dbg(&pdev->dev, "pcmcia config [0] %02x\n", v ? 0x60 : 0);
> +	return pcmcia_write_config_byte(pcmcia, 0, v ? 0x60 : 0);
> +}
> +
> +/*
> + * pcmcia check
> + */
> +static int softingcs_probe_config(struct pcmcia_device *pcmcia,
> +		void *priv_data)
> +{
> +	struct softing_platform_data *pdat = priv_data;
> +	struct resource *pres;
> +	int memspeed = 0;
> +
> +	WARN_ON(!pdat);
> +	pres = pcmcia->resource[PCMCIA_IOMEM_0];
> +	if (resource_size(pres) < 0x1000)
> +		return -ERANGE;
> +
> +	pres->flags |= WIN_MEMORY_TYPE_CM | WIN_ENABLE;
> +	if (pdat->generation < 2) {
> +		pres->flags |= WIN_USE_WAIT | WIN_DATA_WIDTH_8;
> +		memspeed = 3;
> +	} else {
> +		pres->flags |= WIN_DATA_WIDTH_16;
> +	}
> +	return pcmcia_request_window(pcmcia, pres, memspeed);
> +}
> +
> +static void softingcs_remove(struct pcmcia_device *pcmcia)
> +{
> +	struct platform_device *pdev = pcmcia->priv;
> +
> +	/* free bits */
> +	platform_device_unregister(pdev);
> +	/* release pcmcia stuff */
> +	pcmcia_disable_device(pcmcia);
> +}
> +
> +/*
> + * platform_device wrapper
> + * pdev->resource has 2 entries: io & irq
> + */
> +static void softingcs_pdev_release(struct device *dev)
> +{
> +	struct platform_device *pdev = to_platform_device(dev);
> +	kfree(pdev);
> +}
> +
> +static int softingcs_probe(struct pcmcia_device *pcmcia)
> +{
> +	int ret;
> +	struct platform_device *pdev;
> +	const struct softing_platform_data *pdat;
> +	struct resource *pres;
> +	struct dev {
> +		struct platform_device pdev;
> +		struct resource res[2];
> +	} *dev;
> +
> +	/* find matching platform_data */
> +	pdat = softingcs_find_platform_data(pcmcia->manf_id, pcmcia->card_id);
> +	if (!pdat)
> +		return -ENOTTY;
> +
> +	/* setup pcmcia device */
> +	pcmcia->config_flags |= CONF_ENABLE_IRQ | CONF_AUTO_SET_IOMEM |
> +		CONF_AUTO_SET_VPP | CONF_AUTO_CHECK_VCC;
> +	ret = pcmcia_loop_config(pcmcia, softingcs_probe_config, (void *)pdat);
> +	if (ret)
> +		goto pcmcia_failed;
> +
> +	ret = pcmcia_enable_device(pcmcia);
> +	if (ret < 0)
> +		goto pcmcia_failed;
> +
> +	pres = pcmcia->resource[PCMCIA_IOMEM_0];
> +	if (!pres) {
> +		ret = -EBADF;
> +		goto pcmcia_bad;
> +	}
> +
> +	/* create softing platform device */
> +	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
> +	if (!dev) {
> +		ret = -ENOMEM;
> +		goto mem_failed;
> +	}
> +	dev->pdev.resource = dev->res;
> +	dev->pdev.num_resources = ARRAY_SIZE(dev->res);
> +	dev->pdev.dev.release = softingcs_pdev_release;
> +
> +	pdev = &dev->pdev;
> +	pdev->dev.platform_data = (void *)pdat;
> +	pdev->dev.parent = &pcmcia->dev;
> +	pcmcia->priv = pdev;
> +
> +	/* platform device resources */
> +	pdev->resource[0].flags = IORESOURCE_MEM;
> +	pdev->resource[0].start = pres->start;
> +	pdev->resource[0].end = pres->end;
> +
> +	pdev->resource[1].flags = IORESOURCE_IRQ;
> +	pdev->resource[1].start = pcmcia->irq;
> +	pdev->resource[1].end = pdev->resource[1].start;
> +
> +	/* platform device setup */
> +	spin_lock(&softingcs_index_lock);
> +	pdev->id = softingcs_index++;
> +	spin_unlock(&softingcs_index_lock);
> +	pdev->name = "softing";
> +	dev_set_name(&pdev->dev, "softingcs.%i", pdev->id);
> +	ret = platform_device_register(pdev);
> +	if (ret < 0)
> +		goto platform_failed;
> +
> +	dev_info(&pcmcia->dev, "created %s\n", dev_name(&pdev->dev));
> +	return 0;
> +
> +platform_failed:
> +	kfree(dev);
> +mem_failed:
> +pcmcia_bad:
> +pcmcia_failed:
> +	pcmcia_disable_device(pcmcia);
> +	pcmcia->priv = 0;
> +	return ret ?: -ENODEV;
> +}
> +
> +static /*const*/ struct pcmcia_device_id softingcs_ids[] = {
> +	/* softing */
> +	PCMCIA_DEVICE_MANF_CARD(0x0168, 0x0001),
> +	PCMCIA_DEVICE_MANF_CARD(0x0168, 0x0002),
> +	PCMCIA_DEVICE_MANF_CARD(0x0168, 0x0004),
> +	PCMCIA_DEVICE_MANF_CARD(0x0168, 0x0005),
> +	/* vector, manufacturer? */
> +	PCMCIA_DEVICE_MANF_CARD(0x0168, 0x0081),
> +	PCMCIA_DEVICE_MANF_CARD(0x0168, 0x0084),
> +	PCMCIA_DEVICE_MANF_CARD(0x0168, 0x0085),
> +	/* EDIC */
> +	PCMCIA_DEVICE_MANF_CARD(0x0168, 0x0102),
> +	PCMCIA_DEVICE_MANF_CARD(0x0168, 0x0105),
> +	PCMCIA_DEVICE_NULL,
> +};
> +
> +MODULE_DEVICE_TABLE(pcmcia, softingcs_ids);
> +
> +static struct pcmcia_driver softingcs_driver = {
> +	.owner		= THIS_MODULE,
> +	.name		= "softingcs",
> +	.id_table	= softingcs_ids,
> +	.probe		= softingcs_probe,
> +	.remove		= softingcs_remove,
> +};

Also here, I'm missing the usage of __devinit and friends.

> +static int __init softingcs_start(void)
> +{
> +	spin_lock_init(&softingcs_index_lock);
> +	return pcmcia_register_driver(&softingcs_driver);
> +}
> +
> +static void __exit softingcs_stop(void)
> +{
> +	pcmcia_unregister_driver(&softingcs_driver);
> +}
> +
> +module_init(softingcs_start);
> +module_exit(softingcs_stop);
> +
> +MODULE_DESCRIPTION("softing CANcard driver"
> +		", links PCMCIA card to softing driver");
> +MODULE_LICENSE("GPL");

GPL v2 ?

Thanks for your contribution.

Wolfgang.

^ permalink raw reply

* Re: [PATCH] net_sched: pfifo_head_drop problem
From: David Miller @ 2011-01-05 21:39 UTC (permalink / raw)
  To: hagen; +Cc: eric.dumazet, netdev, fw, kaber, shemminger, jarkao2
In-Reply-To: <20110105205217.GC10322@nuttenaction>

From: Hagen Paul Pfeifer <hagen@jauu.net>
Date: Wed, 5 Jan 2011 21:52:17 +0100

> * Eric Dumazet | 2011-01-05 21:35:02 [+0100]:
> 
>>My mid term suggestion would be to change things so that
>>sch->bstats.bytes and sch->bstats.packets are incremented in dequeue()
>>only, not at enqueue() time. We also could add drop_bytes/drop_packets
>>and provide estimations of drop rates.
>>
>>It would be more sensible anyway for very low speeds, and big bursts.
>>Right now, if we drop packets, they still are accounted in byte/packets
>>abolute counters and rate estimators.
>>
>>Before this mid term change, this patch makes pfifo_head_drop behavior
>>similar to other qdiscs in case of drops :
>>Dont decrement sch->bstats.bytes and sch->bstats.packets
> 
> Thanks Stephen and Erik for spotting this bug!
> 
>>Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> 
> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* pull request: wireless-next-2.6 2011-01-05
From: John W. Linville @ 2011-01-05 21:51 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA

Dave,

Here is another big batch of updates intended for 2.6.38.  It should
be the last big one, but I still have a few patches in the queue that meet
the posting date requirements and that might merit inclusion -- we'll
see...

Again, this is the usual batch of driver updates from all the major
players.  Also, mac80211 gets a little action.  The bluetooth team makes
a showing as well.

Please let me know if there are problems!

Thanks,

John

---

The following changes since commit dbbe68bb12b34f3e450da7a73c20e6fa1f85d63a:

  Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 (2011-01-04 11:57:25 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6.git for-davem

Akinobu Mita (1):
      airo: use simple_write_to_buffer

Bob Copeland (2):
      ath5k: fix cycle counter inconsistent locking
      cfg80211: fix transposition of words in printk

Brian Prodoehl (1):
      ath9k: fix spur mitigation no-spur case for AR9002

Bruno Randolf (6):
      ath5k: Simplify powertable recalculation
      ath5k: Separate powertable setup and writing
      ath5k: Track current TX power separately from max TX power
      ath5k: Remove ATH5K_INI_RFGAIN defines, use band instead
      ath5k: Use helper function to get eeprom mode from channel
      ath5k: Move mac80211 functions into new file

Chaoming Li (1):
      rtlwifi: Fix large packet issue

Christian Lamparter (6):
      carl9170: add missing return-value check
      carl9170: reduce channel change delay
      carl9170: fix usb pm suspend->resume woes
      mac80211: ignore PSM bit of reordered frames
      mac80211: serialize rx path workers
      Revert "mac80211: temporarily disable reorder release timer"

Dan Carpenter (2):
      ath9k: unlock on error path in ath9k_change_interface()
      ath5k: ath5k_eeprom_mode_from_channel() returns signed

Felix Fietkau (1):
      ath9k_hw: fix dma descriptor rx error bit parsing

Gertjan van Wingerde (3):
      rt2x00: Remove intf->bssid field.
      rt2x00: remove intf->mac field.
      rt2x00: Fix pointer errors.

Gustavo F. Padovan (2):
      Bluetooth: Don't accept ConfigReq if we aren't in the BT_CONFIG state
      Bluetooth: Improve handling of HCI control channel in bind

Hauke Mehrtens (7):
      wl1251: remove unnecessary import
      wl12xx: remove unnecessary import
      ssb: Use pci_is_pcie()
      rt2x00: Use pci_is_pcie()
      ath5k: Use pci_is_pcie()
      ath9k: Use pci_is_pcie()
      rtlwifi: Use pci_pcie_cap()

Helmut Schaa (4):
      rt2x00: Remove superfluous assignment of mpdu_density
      rt2x00: Simplify intf->delayed_flags locking
      rt2x00: Remove unused interface spinlock
      rt2x00: Fix comment about removed spinlock

Ismael Luceno (1):
      rt2x00: Fix panic on frame padding for rt2800 usb devices

Joel A Fernandes (1):
      mac80211: Fix mesh portal communication with other mesh nodes.

Johan Hedberg (9):
      Bluetooth: Add Bluetooth Management interface definitions
      Bluetooth: Add initial Bluetooth Management interface callbacks
      Bluetooth: Make hci_send_to_sock usable for management control sockets
      Bluetooth: Add error handling for managment command handlers
      Bluetooth: Add read_version management command
      Bluetooth: Add read_index_list management command
      Bluetooth: Add read_info management command
      Bluetooth: Add management events for controller addition & removal
      Bluetooth: Fix __hci_request synchronization for hci_open_dev

Johannes Berg (7):
      mac80211: cleanup select_queue
      iwlagn: fix FH error
      mac80211: add missing synchronize_rcu
      mac80211: fix some key comments and code
      mac80211: implement hardware offload for remain-on-channel
      mac80211: implement off-channel TX using hw r-o-c offload
      mac80211: remove stray extern

Johannes Stezenbach (2):
      rt2x00: simplify txstatus_fifo handling
      rt2x00: allow txstatus_fifo w/o txstatus_tasklet

John W. Linville (8):
      rtl818x: move rtl8180 and rtl8187 to separate subdirectories
      Merge branch 'master' of master.kernel.org:/.../padovan/bluetooth-next-2.6
      Merge branch 'wireless-next-2.6' of git://git.kernel.org/.../iwlwifi/iwlwifi-2.6
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6
      ath9k: qualify global modparam_nohwcrypt variable
      ath5k: qualify global modparam_nohwcrypt variable
      ath9k: correct MODULE_PARM_DESC parameters for force_new_ani
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless-next-2.6 into for-davem

Jussi Kivilinna (7):
      rndis_wlan: scanning, workaround device returning incorrect bssid-list item count.
      rndis_wlan: do not set default_key if not WEP key
      rndis_wlan: turn radio off before interface is bring up
      rndis_wlan: constify rndis_config_ops
      rndis_wlan: remove unused variable from priv structure
      rndis_wlan: add support for set_cqm_rssi_config
      rndis_wlan: add support for set_power_mgmt

Larry Finger (1):
      rtlwifi: rtl8192ce: Fix driver problem when radio switch off at module load

Luis R. Rodriguez (1):
      ath9k: fix aphy / wiphy idle mismatch

Milton Miller (1):
      mac80211: fix mesh forwarding when ratelimited too

Mohammed Shafi Shajakhan (4):
      ath9k: Reset keycache on resume
      ath9k: Few clean ups in beacon config parameters
      Revert "ath9k: Parse DTIM period from mac80211"
      ath9k : few rate control clean ups

Rafał Miłecki (7):
      b43: use correct firmware for newer cores
      b43: N-PHY: implement radio 2056 init steps
      b43: N-PHY: add init tables for 2056 radio
      b43: N-PHY: avoid PHY hangs for rev 3 and 4
      b43: N-PHY: use correct channel tables for rev4+
      b43: N-PHY: update 2056 radio on channel switch on rev3+
      b43: N-PHY: enable support for PHYs rev 3 and higher

Rajkumar Manoharan (2):
      ath9k: Fix warnings on card removal
      ath9k: fix beacon restart on channel change

Senthil Balasubramanian (1):
      ath9k: spin_lock_bh is not required within tasklet context.

Stanislaw Gruszka (2):
      iwlagn: enable only rfkill interrupt when device is down
      iwlagn: fix scan tx antenna setting on 5Ghz band

Sujith Manoharan (7):
      ath9k_htc: Fix warning on device removal
      ath9k_htc: Handle pending URBs properly
      ath9k_htc: Move work cancellation outside of mutex
      ath9k_htc: Handle FATAL events
      ath9k_htc: Fix fast channel change
      ath9k_htc: Move LED/RFKILL code to htc_drv_gpio.c
      ath9k_htc: Fix packet injection

Tracey Dent (1):
      Net: bluetooth: Makefile: Remove deprecated kbuild goal definitions

Vasanthakumar Thiagarajan (1):
      ath9k_hw: Fix bug in eeprom data length validation for AR9485

Wey-Yi Guy (2):
      iwlwifi: remove extra string
      iwlwifi: remove reference to Gen2

roel kluin (1):
      libertas: down_interruptible() can return -EINTR, not EINTR

 MAINTAINERS                                        |    4 +-
 drivers/net/wireless/airo.c                        |   20 +-
 drivers/net/wireless/ath/ath5k/Makefile            |    1 +
 drivers/net/wireless/ath/ath5k/ath5k.h             |    8 +-
 drivers/net/wireless/ath/ath5k/attach.c            |    2 +-
 drivers/net/wireless/ath/ath5k/base.c              |  769 +-----
 drivers/net/wireless/ath/ath5k/eeprom.c            |   16 +
 drivers/net/wireless/ath/ath5k/eeprom.h            |    2 +
 drivers/net/wireless/ath/ath5k/mac80211-ops.c      |  774 +++++
 drivers/net/wireless/ath/ath5k/phy.c               |  152 +-
 drivers/net/wireless/ath/ath5k/reset.c             |   28 +-
 drivers/net/wireless/ath/ath9k/ar9002_hw.c         |    2 +-
 drivers/net/wireless/ath/ath9k/ar9002_phy.c        |    5 +-
 drivers/net/wireless/ath/ath9k/ar9003_eeprom.c     |    4 +-
 drivers/net/wireless/ath/ath9k/ar9003_mac.c        |   11 +-
 drivers/net/wireless/ath/ath9k/ath9k.h             |    4 +-
 drivers/net/wireless/ath/ath9k/beacon.c            |   25 +-
 drivers/net/wireless/ath/ath9k/eeprom.h            |    2 -
 drivers/net/wireless/ath/ath9k/hif_usb.c           |   46 +-
 drivers/net/wireless/ath/ath9k/hif_usb.h           |    1 +
 drivers/net/wireless/ath/ath9k/htc.h               |   31 +-
 drivers/net/wireless/ath/ath9k/htc_drv_gpio.c      |  327 +++
 drivers/net/wireless/ath/ath9k/htc_drv_init.c      |   16 +-
 drivers/net/wireless/ath/ath9k/htc_drv_main.c      |  460 +---
 drivers/net/wireless/ath/ath9k/hw.c                |    4 +-
 drivers/net/wireless/ath/ath9k/hw.h                |    4 +
 drivers/net/wireless/ath/ath9k/init.c              |   13 +-
 drivers/net/wireless/ath/ath9k/mac.c               |    9 +-
 drivers/net/wireless/ath/ath9k/main.c              |   21 +-
 drivers/net/wireless/ath/ath9k/pci.c               |   13 +-
 drivers/net/wireless/ath/ath9k/rc.c                |   16 +-
 drivers/net/wireless/ath/ath9k/rc.h                |    3 -
 drivers/net/wireless/ath/ath9k/recv.c              |    3 +-
 drivers/net/wireless/ath/ath9k/wmi.c               |   20 +-
 drivers/net/wireless/ath/ath9k/wmi.h               |    3 +-
 drivers/net/wireless/ath/carl9170/phy.c            |   10 +-
 drivers/net/wireless/ath/carl9170/usb.c            |   53 +-
 drivers/net/wireless/b43/main.c                    |   14 +-
 drivers/net/wireless/b43/phy_n.c                   |  130 +-
 drivers/net/wireless/b43/radio_2056.c              | 3045 +++++++++++++++++++-
 drivers/net/wireless/b43/radio_2056.h              |    3 +
 drivers/net/wireless/iwlwifi/iwl-6000.c            |    7 +-
 drivers/net/wireless/iwlwifi/iwl-agn-lib.c         |   14 +-
 drivers/net/wireless/iwlwifi/iwl-agn.c             |   18 +-
 drivers/net/wireless/iwlwifi/iwl-core.h            |    1 -
 drivers/net/wireless/iwlwifi/iwl-helpers.h         |    6 +
 drivers/net/wireless/iwlwifi/iwl-led.c             |    2 +-
 drivers/net/wireless/libertas/if_spi.c             |    2 +-
 drivers/net/wireless/rndis_wlan.c                  |  192 ++-
 drivers/net/wireless/rt2x00/rt2800pci.c            |   18 +-
 drivers/net/wireless/rt2x00/rt2800usb.c            |   16 +-
 drivers/net/wireless/rt2x00/rt2x00.h               |   25 +-
 drivers/net/wireless/rt2x00/rt2x00config.c         |    8 +-
 drivers/net/wireless/rt2x00/rt2x00dev.c            |   25 +-
 drivers/net/wireless/rt2x00/rt2x00ht.c             |    2 -
 drivers/net/wireless/rt2x00/rt2x00mac.c            |   35 +-
 drivers/net/wireless/rt2x00/rt2x00pci.c            |    2 +-
 drivers/net/wireless/rtl818x/Makefile              |    9 +-
 drivers/net/wireless/rtl818x/rtl8180/Makefile      |    5 +
 .../rtl818x/{rtl8180_dev.c => rtl8180/dev.c}       |    8 +-
 .../{rtl8180_grf5101.c => rtl8180/grf5101.c}       |    2 +-
 .../{rtl8180_grf5101.h => rtl8180/grf5101.h}       |    0
 .../{rtl8180_max2820.c => rtl8180/max2820.c}       |    2 +-
 .../{rtl8180_max2820.h => rtl8180/max2820.h}       |    0
 .../net/wireless/rtl818x/{ => rtl8180}/rtl8180.h   |    0
 .../{rtl8180_rtl8225.c => rtl8180/rtl8225.c}       |    2 +-
 .../{rtl8180_rtl8225.h => rtl8180/rtl8225.h}       |    0
 .../rtl818x/{rtl8180_sa2400.c => rtl8180/sa2400.c} |    2 +-
 .../rtl818x/{rtl8180_sa2400.h => rtl8180/sa2400.h} |    0
 drivers/net/wireless/rtl818x/rtl8187/Makefile      |    5 +
 .../rtl818x/{rtl8187_dev.c => rtl8187/dev.c}       |    6 +-
 .../rtl818x/{rtl8187_leds.c => rtl8187/leds.c}     |    2 +-
 .../rtl818x/{rtl8187_leds.h => rtl8187/leds.h}     |    0
 .../rtl818x/{rtl8187_rfkill.c => rtl8187/rfkill.c} |    2 +-
 .../rtl818x/{rtl8187_rfkill.h => rtl8187/rfkill.h} |    0
 .../net/wireless/rtl818x/{ => rtl8187}/rtl8187.h   |    2 +-
 .../{rtl8187_rtl8225.c => rtl8187/rtl8225.c}       |    2 +-
 .../{rtl8187_rtl8225.h => rtl8187/rtl8225.h}       |    0
 drivers/net/wireless/rtlwifi/base.c                |   12 +-
 drivers/net/wireless/rtlwifi/pci.c                 |   20 +-
 drivers/net/wireless/rtlwifi/rtl8192ce/hw.c        |   11 -
 drivers/net/wireless/wl1251/boot.c                 |    1 -
 drivers/net/wireless/wl12xx/boot.c                 |    1 -
 drivers/ssb/scan.c                                 |    4 +-
 include/net/bluetooth/bluetooth.h                  |    1 +
 include/net/bluetooth/hci.h                        |    4 +
 include/net/bluetooth/hci_core.h                   |    9 +-
 include/net/bluetooth/mgmt.h                       |   87 +
 include/net/mac80211.h                             |   19 +
 net/bluetooth/Makefile                             |    2 +-
 net/bluetooth/hci_core.c                           |   17 +-
 net/bluetooth/hci_event.c                          |   33 +-
 net/bluetooth/hci_sock.c                           |   52 +-
 net/bluetooth/l2cap.c                              |    8 +-
 net/bluetooth/mgmt.c                               |  308 ++
 net/mac80211/cfg.c                                 |  142 +
 net/mac80211/driver-ops.h                          |   30 +
 net/mac80211/driver-trace.h                        |   80 +
 net/mac80211/ieee80211_i.h                         |   21 +
 net/mac80211/iface.c                               |    9 +-
 net/mac80211/key.c                                 |   44 +-
 net/mac80211/led.c                                 |    4 +-
 net/mac80211/main.c                                |    8 +-
 net/mac80211/offchannel.c                          |   85 +
 net/mac80211/rx.c                                  |  112 +-
 net/mac80211/tx.c                                  |   14 +-
 net/mac80211/wme.c                                 |   20 +-
 net/wireless/reg.c                                 |    2 +-
 108 files changed, 6014 insertions(+), 1642 deletions(-)
 create mode 100644 drivers/net/wireless/ath/ath5k/mac80211-ops.c
 create mode 100644 drivers/net/wireless/rtl818x/rtl8180/Makefile
 rename drivers/net/wireless/rtl818x/{rtl8180_dev.c => rtl8180/dev.c} (99%)
 rename drivers/net/wireless/rtl818x/{rtl8180_grf5101.c => rtl8180/grf5101.c} (99%)
 rename drivers/net/wireless/rtl818x/{rtl8180_grf5101.h => rtl8180/grf5101.h} (100%)
 rename drivers/net/wireless/rtl818x/{rtl8180_max2820.c => rtl8180/max2820.c} (99%)
 rename drivers/net/wireless/rtl818x/{rtl8180_max2820.h => rtl8180/max2820.h} (100%)
 rename drivers/net/wireless/rtl818x/{ => rtl8180}/rtl8180.h (100%)
 rename drivers/net/wireless/rtl818x/{rtl8180_rtl8225.c => rtl8180/rtl8225.c} (99%)
 rename drivers/net/wireless/rtl818x/{rtl8180_rtl8225.h => rtl8180/rtl8225.h} (100%)
 rename drivers/net/wireless/rtl818x/{rtl8180_sa2400.c => rtl8180/sa2400.c} (99%)
 rename drivers/net/wireless/rtl818x/{rtl8180_sa2400.h => rtl8180/sa2400.h} (100%)
 create mode 100644 drivers/net/wireless/rtl818x/rtl8187/Makefile
 rename drivers/net/wireless/rtl818x/{rtl8187_dev.c => rtl8187/dev.c} (99%)
 rename drivers/net/wireless/rtl818x/{rtl8187_leds.c => rtl8187/leds.c} (99%)
 rename drivers/net/wireless/rtl818x/{rtl8187_leds.h => rtl8187/leds.h} (100%)
 rename drivers/net/wireless/rtl818x/{rtl8187_rfkill.c => rtl8187/rfkill.c} (98%)
 rename drivers/net/wireless/rtl818x/{rtl8187_rfkill.h => rtl8187/rfkill.h} (100%)
 rename drivers/net/wireless/rtl818x/{ => rtl8187}/rtl8187.h (99%)
 rename drivers/net/wireless/rtl818x/{rtl8187_rtl8225.c => rtl8187/rtl8225.c} (99%)
 rename drivers/net/wireless/rtl818x/{rtl8187_rtl8225.h => rtl8187/rtl8225.h} (100%)
 create mode 100644 include/net/bluetooth/mgmt.h
 create mode 100644 net/bluetooth/mgmt.c

Omnibus patch available here:

	http://www.kernel.org/pub/linux/kernel/people/linville/wireless-next-2.6-2011-01-05.patch.bz2

-- 
John W. Linville		Someday the world will need a hero, and you
linville-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org			might be all we have.  Be ready.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: Gaah: selinux_socket_unix_stream_connect oops
From: David Miller @ 2011-01-05 22:25 UTC (permalink / raw)
  To: torvalds; +Cc: netdev, jeremy, jmorris
In-Reply-To: <AANLkTikPvJLEZetB-CbVSO0VuYLnAQz5y8D_F0HGML=5@mail.gmail.com>

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 5 Jan 2011 08:27:01 -0800

> So I wonder if there is some subtle race that happens when one end of
> a unix domain socket attaches just as another end disconnects?
> Especially as "security_unix_stream_connect()" is called before the
> whole connect sequence is really final. It's generally
> "unix_release()" that sets 'sock->sk' to NULL.
> 
> Btw, why do we pass in "sock" and "other->sk_socket" ("struct
> socket"), when it appears that what the security code really wants to
> get "struct sock" (which would be "sk" and "other" in the caller)? The
> calling convention seems to result in (a) this NULL pointer thing and
> (b) all these extra dereferences.
> 
> Comments? Ideas?

There is nothing which blocks that unix_release() code which sets
socket->sk to NULL, it runs asynchronously to this connect code.

The reason it passes in the socket pointer instead of the pointer to
struct sock is because that's what SMACK wants to use, as it needs to
get at SOCK_INODE().

See, even you think the whole world is SELINUX :-)))

More seriously, we can get at the struct socket via sk->sk_socket in
the SMACK code.  sk->sk_socket, unlike socket->sk, has it's state
change to NULL (via sock_orphen()) protected by unix_state_lock(),
which we hold for "other" in this unix connect code path.

Therefore I propose we fix this like so:

--------------------
af_unix: Avoid socket->sk NULL OOPS in stream connect security hooks.

unix_release() can asynchornously set socket->sk to NULL, and
it does so without holding the unix_state_lock() on "other"
during stream connects.

However, the reverse mapping, sk->sk_socket, is only transitioned
to NULL under the unix_state_lock().

Therefore make the security hooks follow the reverse mapping instead
of the forward mapping.

Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/include/linux/security.h b/include/linux/security.h
index fd4d55f..d47a4c2 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -796,8 +796,9 @@ static inline void security_free_mnt_opts(struct security_mnt_opts *opts)
  * @unix_stream_connect:
  *	Check permissions before establishing a Unix domain stream connection
  *	between @sock and @other.
- *	@sock contains the socket structure.
- *	@other contains the peer socket structure.
+ *	@sock contains the sock structure.
+ *	@other contains the peer sock structure.
+ *	@newsk contains the new sock structure.
  *	Return 0 if permission is granted.
  * @unix_may_send:
  *	Check permissions before connecting or sending datagrams from @sock to
@@ -1568,8 +1569,7 @@ struct security_operations {
 	int (*inode_getsecctx)(struct inode *inode, void **ctx, u32 *ctxlen);
 
 #ifdef CONFIG_SECURITY_NETWORK
-	int (*unix_stream_connect) (struct socket *sock,
-				    struct socket *other, struct sock *newsk);
+	int (*unix_stream_connect) (struct sock *sock, struct sock *other, struct sock *newsk);
 	int (*unix_may_send) (struct socket *sock, struct socket *other);
 
 	int (*socket_create) (int family, int type, int protocol, int kern);
@@ -2525,8 +2525,7 @@ static inline int security_inode_getsecctx(struct inode *inode, void **ctx, u32
 
 #ifdef CONFIG_SECURITY_NETWORK
 
-int security_unix_stream_connect(struct socket *sock, struct socket *other,
-				 struct sock *newsk);
+int security_unix_stream_connect(struct sock *sock, struct sock *other, struct sock *newsk);
 int security_unix_may_send(struct socket *sock,  struct socket *other);
 int security_socket_create(int family, int type, int protocol, int kern);
 int security_socket_post_create(struct socket *sock, int family,
@@ -2567,8 +2566,8 @@ void security_tun_dev_post_create(struct sock *sk);
 int security_tun_dev_attach(struct sock *sk);
 
 #else	/* CONFIG_SECURITY_NETWORK */
-static inline int security_unix_stream_connect(struct socket *sock,
-					       struct socket *other,
+static inline int security_unix_stream_connect(struct sock *sock,
+					       struct sock *other,
 					       struct sock *newsk)
 {
 	return 0;
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 417d7a6..dd419d2 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1157,7 +1157,7 @@ restart:
 		goto restart;
 	}
 
-	err = security_unix_stream_connect(sock, other->sk_socket, newsk);
+	err = security_unix_stream_connect(sk, other, newsk);
 	if (err) {
 		unix_state_unlock(sk);
 		goto out_unlock;
diff --git a/security/capability.c b/security/capability.c
index c773635..2a5df2b 100644
--- a/security/capability.c
+++ b/security/capability.c
@@ -548,7 +548,7 @@ static int cap_sem_semop(struct sem_array *sma, struct sembuf *sops,
 }
 
 #ifdef CONFIG_SECURITY_NETWORK
-static int cap_unix_stream_connect(struct socket *sock, struct socket *other,
+static int cap_unix_stream_connect(struct sock *sock, struct sock *other,
 				   struct sock *newsk)
 {
 	return 0;
diff --git a/security/security.c b/security/security.c
index 1b798d3..e5fb07a 100644
--- a/security/security.c
+++ b/security/security.c
@@ -977,8 +977,7 @@ EXPORT_SYMBOL(security_inode_getsecctx);
 
 #ifdef CONFIG_SECURITY_NETWORK
 
-int security_unix_stream_connect(struct socket *sock, struct socket *other,
-				 struct sock *newsk)
+int security_unix_stream_connect(struct sock *sock, struct sock *other, struct sock *newsk)
 {
 	return security_ops->unix_stream_connect(sock, other, newsk);
 }
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index c82538a..6f637d2 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3921,18 +3921,18 @@ static int selinux_socket_shutdown(struct socket *sock, int how)
 	return sock_has_perm(current, sock->sk, SOCKET__SHUTDOWN);
 }
 
-static int selinux_socket_unix_stream_connect(struct socket *sock,
-					      struct socket *other,
+static int selinux_socket_unix_stream_connect(struct sock *sock,
+					      struct sock *other,
 					      struct sock *newsk)
 {
-	struct sk_security_struct *sksec_sock = sock->sk->sk_security;
-	struct sk_security_struct *sksec_other = other->sk->sk_security;
+	struct sk_security_struct *sksec_sock = sock->sk_security;
+	struct sk_security_struct *sksec_other = other->sk_security;
 	struct sk_security_struct *sksec_new = newsk->sk_security;
 	struct common_audit_data ad;
 	int err;
 
 	COMMON_AUDIT_DATA_INIT(&ad, NET);
-	ad.u.net.sk = other->sk;
+	ad.u.net.sk = other;
 
 	err = avc_has_perm(sksec_sock->sid, sksec_other->sid,
 			   sksec_other->sclass,
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 489a85a..ccb71a0 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -2408,22 +2408,22 @@ static int smack_setprocattr(struct task_struct *p, char *name,
 
 /**
  * smack_unix_stream_connect - Smack access on UDS
- * @sock: one socket
- * @other: the other socket
+ * @sock: one sock
+ * @other: the other sock
  * @newsk: unused
  *
  * Return 0 if a subject with the smack of sock could access
  * an object with the smack of other, otherwise an error code
  */
-static int smack_unix_stream_connect(struct socket *sock,
-				     struct socket *other, struct sock *newsk)
+static int smack_unix_stream_connect(struct sock *sock,
+				     struct sock *other, struct sock *newsk)
 {
-	struct inode *sp = SOCK_INODE(sock);
-	struct inode *op = SOCK_INODE(other);
+	struct inode *sp = SOCK_INODE(sock->sk_socket);
+	struct inode *op = SOCK_INODE(other->sk_socket);
 	struct smk_audit_info ad;
 
 	smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_NET);
-	smk_ad_setfield_u_net_sk(&ad, other->sk);
+	smk_ad_setfield_u_net_sk(&ad, other);
 	return smk_access(smk_of_inode(sp), smk_of_inode(op),
 				 MAY_READWRITE, &ad);
 }

^ permalink raw reply related

* Re: [PATCH] Export ACPI _DSM provided firmware instance number and string name to sysfs
From: Matt Domsch @ 2011-01-05 22:44 UTC (permalink / raw)
  To: Narendra_K, Jesse Barnes
  Cc: linux-pci, linux-hotplug, netdev, Jordan_Hargrave, Charles_Rose,
	Vijay_Nijhawan
In-Reply-To: <20101223194927.GA2717@fedora14-r610.oslab.blr.amer.dell.com>

On Thu, Dec 23, 2010 at 11:24:36AM -0800, Narendra_K@Dell.com wrote:
> Please find the version 2 of the patch.
> 
> V1 -> V2:
> 
> The attribute 'index' is changed to 'acpi_index' as the semantics of
> SMBIOS provided device type instance and ACPI _DSM provided instance
> number are different.
> 
> From: Narendra K <narendra_k@dell.com>
> Subject: [PATCH V2] Export ACPI _DSM provided firmware instance number and string to sysfs
> 
> This patch exports ACPI _DSM (Device Specific Method) provided firmware
> instance number and string name of PCI devices as defined by
> 'PCI Firmware Specification Revision 3.1' section 4.6.7.( DSM for Naming
> a PCI or PCI Express Device Under Operating Systems) to sysfs.

Thanks Narendra and Jordan.

Jesse, how does this look for -next please?

Thanks,
Matt

-- 
Matt Domsch
Technology Strategist
Dell | Office of the CTO

^ permalink raw reply

* Re: [PATCH] Export ACPI _DSM provided firmware instance number and string name to sysfs
From: Jesse Barnes @ 2011-01-05 22:46 UTC (permalink / raw)
  To: Matt Domsch
  Cc: Narendra_K, linux-pci, linux-hotplug, netdev, Jordan_Hargrave,
	Charles_Rose, Vijay_Nijhawan
In-Reply-To: <20110105224404.GB4647@auslistsprd01.us.dell.com>

On Wed, 5 Jan 2011 16:44:04 -0600
Matt Domsch <Matt_Domsch@Dell.com> wrote:

> On Thu, Dec 23, 2010 at 11:24:36AM -0800, Narendra_K@Dell.com wrote:
> > Please find the version 2 of the patch.
> > 
> > V1 -> V2:
> > 
> > The attribute 'index' is changed to 'acpi_index' as the semantics of
> > SMBIOS provided device type instance and ACPI _DSM provided instance
> > number are different.
> > 
> > From: Narendra K <narendra_k@dell.com>
> > Subject: [PATCH V2] Export ACPI _DSM provided firmware instance number and string to sysfs
> > 
> > This patch exports ACPI _DSM (Device Specific Method) provided firmware
> > instance number and string name of PCI devices as defined by
> > 'PCI Firmware Specification Revision 3.1' section 4.6.7.( DSM for Naming
> > a PCI or PCI Express Device Under Operating Systems) to sysfs.
> 
> Thanks Narendra and Jordan.
> 
> Jesse, how does this look for -next please?

I think it's fine; I'll check it out again and apply this week.

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply

* Re: Gaah: selinux_socket_unix_stream_connect oops
From: David Miller @ 2011-01-05 23:38 UTC (permalink / raw)
  To: torvalds; +Cc: netdev, jeremy, jmorris
In-Reply-To: <AANLkTimZnqQ1xmQC2Bp3vj60uU_vPEwWi2n+G9MZ4Uo9@mail.gmail.com>

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 5 Jan 2011 15:32:44 -0800

> On Wed, Jan 5, 2011 at 2:25 PM, David Miller <davem@davemloft.net> wrote:
>>
>> More seriously, we can get at the struct socket via sk->sk_socket in
>> the SMACK code.  sk->sk_socket, unlike socket->sk, has it's state
>> change to NULL (via sock_orphen()) protected by unix_state_lock(),
>> which we hold for "other" in this unix connect code path.
>>
>> Therefore I propose we fix this like so:
> 
> Looks fine to me.
> 
> And no, I don't think that selinux is all the world, but selinux is
> the _common_ case, and with the cross-pointers, the only difference
> can be whether you need to dereference the pointer or not - so
> choosing the "extra dereference" case for the common case seems silly.
> 
> The fact that this also fixes locking is obviously an even better
> reason to do it, though ;)

Right :)

I'll toss this your way during the merge window and queue it up for
later -stable submission as well.

Thanks.

^ permalink raw reply

* at91_can: errata "Contents of Mailbox 0 can be sent"
From: Marc Kleine-Budde @ 2011-01-05 23:41 UTC (permalink / raw)
  To: SocketCAN Core Mailing List; +Cc: Netdev-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 977 bytes --]

Moin,

the at91_can core has the following problem:
"50.3.5.3 CAN: Contents of Mailbox 0 can be sent Even if Mailbox is
Disabled"

This means under high bus load it can happen that the mailbox 0 is send
to the bus. And it does happen, even with the mainline driver where
Mailbox 0 is a receive mailbox. The errata proposes not to use mailbox 0
and load it with a unused can_id that will not disturb the bus.

I'm going to implement the workaround. As the unused identifier is
probably application specific my question is: What's the best way to
configure this identifier?

- sysfs
- netlink
- ioctl
- ethtool
- peek/poke :P

What will be a good default can_id?

regards, Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

[-- Attachment #2: Type: text/plain, Size: 188 bytes --]

_______________________________________________
Socketcan-core mailing list
Socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org
https://lists.berlios.de/mailman/listinfo/socketcan-core

^ permalink raw reply

* Re: Gaah: selinux_socket_unix_stream_connect oops
From: Linus Torvalds @ 2011-01-05 23:32 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, jeremy, jmorris
In-Reply-To: <20110105.142540.245404254.davem@davemloft.net>

On Wed, Jan 5, 2011 at 2:25 PM, David Miller <davem@davemloft.net> wrote:
>
> More seriously, we can get at the struct socket via sk->sk_socket in
> the SMACK code.  sk->sk_socket, unlike socket->sk, has it's state
> change to NULL (via sock_orphen()) protected by unix_state_lock(),
> which we hold for "other" in this unix connect code path.
>
> Therefore I propose we fix this like so:

Looks fine to me.

And no, I don't think that selinux is all the world, but selinux is
the _common_ case, and with the cross-pointers, the only difference
can be whether you need to dereference the pointer or not - so
choosing the "extra dereference" case for the common case seems silly.

The fact that this also fixes locking is obviously an even better
reason to do it, though ;)

                      Linus

^ permalink raw reply

* Re: [RFC v2 PATCH] m68knommu: added dm9000 support
From: Arnaud Lacombe @ 2011-01-05 23:51 UTC (permalink / raw)
  To: Angelo Dureghello; +Cc: linux-kernel, netdev, Randy Dunlap
In-Reply-To: <4D24B27F.3020104@gmail.com>

Hi,

[disclamer: I have no power whatsoever on the dm9k driver, and the
only dm9000 chip I've access to belongs to hardware whose parent
company is now defunct.]

Btw, Linux 68k folks is missing from the CC list.

On Wed, Jan 5, 2011 at 1:03 PM, Angelo Dureghello <angelo70@gmail.com> wrote:
> This patch allows to use the dm9000 network chip with a m68knommu
> big-endian cpu. From the HW point of view,
what hardware ?

> the cpu data bus connected to
> the dm9000 chip should be hardware-byte-swapped, crossing the bytes
> wires (D0:7 to D24:31, etc.). In anyway, has been also added an option
> to swap the bytes in the driver, if some cpu has been wired straight
> D0:D31 to dm9000.
>
> Signed-off-by: Angelo Dureghello <angelo70@gmail.com>
> ---
>
> --- linux/drivers/net/Kconfig.orig      2011-01-05 17:11:37.992376124 +0100
> +++ linux/drivers/net/Kconfig   2011-01-04 22:33:14.132301872 +0100
> @@ -960,7 +960,7 @@ config TI_DAVINCI_EMAC
>
>  config DM9000
>        tristate "DM9000 support"
> -       depends on ARM || BLACKFIN || MIPS
> +       depends on COLDFIRE || ARM || BLACKFIN || MIPS
>        select CRC32
>        select MII
>        ---help---
> @@ -986,6 +986,14 @@ config DM9000_FORCE_SIMPLE_PHY_POLL
>          costly MII PHY reads. Note, this will not work if the chip is
>          operating with an external PHY.
>
> +config DM9000_32BIT_SW_SWAP
> +       bool "Software byte swap for 32 bit data bus"
> +       depends on DM9000 && COLDFIRE
> +       ---help---
> +         This configuration allows to swap data bytes from the dm9000
> +         driver itself, when the big endian cpu is wired straight to
> +         the dm9000 32 bit data bus.
> +
I'm not sure COLDFIRE is the best dependency. AFAICS, it should
depends on endianness, and potentially board specific settings.

>  config ENC28J60
>        tristate "ENC28J60 support"
>        depends on EXPERIMENTAL && SPI && NET_ETHERNET
> @@ -3347,4 +3355,3 @@ config VMXNET3
>          module will be called vmxnet3.
>
>  endif # NETDEVICES
> -
>
useless whitespace change ...

> --- linux/drivers/net/dm9000.c.orig     2010-12-30 23:19:39.747836070 +0100
> +++ linux/drivers/net/dm9000.c  2011-01-05 16:30:48.636116500 +0100
> [...]
> @@ -981,7 +1032,11 @@ dm9000_rx(struct net_device *dev)
>                ior(db, DM9000_MRCMDX); /* Dummy read */
>
>                /* Get most updated data */
> -               rxbyte = readb(db->io_data);
> +#ifdef CONFIG_DM9000_32BIT_SW_SWAP
> +      rxbyte = (u8)readl(db->io_data);
> +#else
> +      rxbyte = readb(db->io_data);
> +#endif
>
>                /* Status check: this byte must be 0 or 1 */
>                if (rxbyte & DM9000_PKT_ERR) {
> @@ -996,8 +1051,13 @@ dm9000_rx(struct net_device *dev)
>
>                /* A packet ready now  & Get status/length */
>                GoodPacket = true;
> -               writeb(DM9000_MRCMD, db->io_addr);
>
> +#ifdef CONFIG_DM9000_32BIT_SW_SWAP
> +               writel(DM9000_MRCMD, db->io_addr);
> +#else
> +      writeb(DM9000_MRCMD, db->io_addr);
> +#endif
> +
All these #ifdef make the driver quite horrible to read.

>                (db->inblk)(db->io_data, &rxhdr, sizeof(rxhdr));
>
>                RxLen = le16_to_cpu(rxhdr.RxLen);
> @@ -1077,7 +1137,7 @@ static irqreturn_t dm9000_interrupt(int
>        unsigned long flags;
>        u8 reg_save;
>
> -       dm9000_dbg(db, 3, "entering %s\n", __func__);
> +       //dm9000_dbg(db, 3, "entering %s\n", __func__);
>
C++ comment ... Why do you comment this ?

>        /* Disable all interrupts */
>        iow(db, DM9000_IMR, IMR_PAR);
> @@ -1100,7 +1164,7 @@ static irqreturn_t dm9000_interrupt(int
>        /* Received the coming packet */
>        if (int_status & ISR_PRS)
>                dm9000_rx(dev);
> -
> +
whitespace ...

> @@ -1233,11 +1301,15 @@ dm9000_phy_read(struct net_device *dev,
>        int ret;
>
>        mutex_lock(&db->addr_lock);
> -
> +
whitespace ...

> @@ -1713,4 +1811,3 @@ MODULE_AUTHOR("Sascha Hauer, Ben Dooks")
>  MODULE_DESCRIPTION("Davicom DM9000 network driver");
>  MODULE_LICENSE("GPL");
>  MODULE_ALIAS("platform:dm9000");
> -
whitespace ...

> --- linux/arch/m68k/include/asm/io_no.h.orig    2011-01-05 16:53:55.904905038 +0100
> +++ linux/arch/m68k/include/asm/io_no.h 2011-01-04 23:45:08.893049554 +0100
> @@ -47,6 +47,91 @@ static inline unsigned int _swapl(volati
>  #define writew(b,addr) (void)((*(volatile unsigned short *) (addr)) = (b))
>  #define writel(b,addr) (void)((*(volatile unsigned int *) (addr)) = (b))
>
> +static inline void writesb (void __iomem *reg, void *data, int count)
> +{
> +       unsigned char *p = (unsigned char*) data;
> +
> +       while (count--) writeb(*p++, reg);
> +}
> +
> +static inline void writesbsw (void __iomem *reg, void *data, int count)
> +{
> +       unsigned char *p = (unsigned char *) data;
> +
> +       while (count--) writel((int)(*p++), reg);
> +}
> +
> +static inline void writesw (void __iomem *reg, void *data, int count)
> +{
> +   unsigned short *p = (unsigned short*) data;
> +
> +   while (count--) writew(*p++, reg);
> +}
> +
> +static inline void writeswsw (void __iomem *reg, void *data, int count)
> +{
> +   unsigned short *p = (unsigned short *) data;
> +
> +   while (count--) writel((int)(_swapw(*p++)), reg);
> +}
> +
> +static inline void writesl (void __iomem *reg, void *data, int count)
> +{
> +   unsigned long *p = (unsigned long*) data;
> +
> +   while (count--) writel(*p++, reg);
> +}
> +
> +static inline void writeslsw (void __iomem *reg, void *data, int count)
> +{
> +   unsigned long *p = (unsigned long *) data;
> +
> +   while (count--) writel((int)(_swapl(*p++)), reg);
> +}
> +
> +static inline void readsb (void __iomem *reg, void *data, int count)
> +{
> +   unsigned char *p = (unsigned char *) data;
> +
> +   while (count--) *p++ = readb(reg);
> +}
> +
> +static inline void readsbsw (void __iomem *reg, void *data, int count)
> +{
> +   unsigned char *p = (unsigned char *) data;
> +
> +   while (count--) *p++ = (unsigned char)readl(reg);
> +}
> +
> +static inline void readsw (void __iomem *reg, void *data, int count)
> +{
> +   unsigned short *p = (unsigned short *) data;
> +
> +   while (count--) *p++ = readb(reg);
> +}
> +
> +static inline void readswsw (void __iomem *reg, void *data, int count)
> +{
> +   unsigned short *p = (unsigned short *) data;
> +
> +   while (count--) *p++ = _swapw((unsigned short)readw(reg));
> +}
> +
> +static inline void readsl (void __iomem *reg, void *data, int count)
> +{
> +   unsigned long *p = (unsigned long *) data;
> +
> +   while (count--) *p++ = readb(reg);
> +}
> +
> +static inline void readslsw (void __iomem *reg, void *data, int count)
> +{
> +   unsigned long *p = (unsigned long *) data;
> +
> +   while (count--) *p++ = _swapl(readl(reg));
> +}
> +
> +
>  #define __raw_readb readb
>  #define __raw_readw readw
>  #define __raw_readl readl
> @@ -180,4 +265,3 @@ extern void iounmap(void *addr);
>  #endif /* __KERNEL__ */
>
>  #endif /* _M68KNOMMU_IO_H */
Could not this be moved to a separate patch ? That way arch specific
changes are separated from the network arch-indep changes.

 - Arnaud

> -
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

^ permalink raw reply

* Re: [net-next-2.6 PATCH v5 2/2] net_sched: implement a root container qdisc sch_mqprio
From: Jarek Poplawski @ 2011-01-06  0:32 UTC (permalink / raw)
  To: John Fastabend
  Cc: davem@davemloft.net, hadi@cyberus.ca, shemminger@vyatta.com,
	tgraf@infradead.org, eric.dumazet@gmail.com,
	bhutchings@solarflare.com, nhorman@tuxdriver.com,
	netdev@vger.kernel.org
In-Reply-To: <4D24ACA4.3000301@intel.com>

On Wed, Jan 05, 2011 at 09:38:44AM -0800, John Fastabend wrote:
> On 1/4/2011 2:59 PM, Jarek Poplawski wrote:
> > On Tue, Jan 04, 2011 at 10:56:46AM -0800, John Fastabend wrote:
...
> >> +
> >> +     /* Always use supplied priority mappings */
> >> +     for (i = 0; i < TC_BITMASK + 1; i++) {
> >> +             if (netdev_set_prio_tc_map(dev, i, qopt->prio_tc_map[i])) {
> >> +                     err = -EINVAL;
> >
> > This would probably trigger if we try qopt->num_tc == 0. Is it expected?
> 
> netdev_set_prio_tc_map() returns 0 on sucess. This if(..) is a bit strange though.
> 
>         err = netdev_set_prio_tc_map(dev, i, qopt->prio_tc_map[i])
>         if (err)
>                 ...
> 
> Is cleaner IMHO.

Maybe. But I still wonder if qopt->num_tc == 0 is valid (by design)?
(netdev_set_prio_tc_map() returns -EINVAL if dev->num_tc == 0)

> >> +static unsigned long mqprio_get(struct Qdisc *sch, u32 classid)
> >> +{
> >> +     unsigned int ntx = TC_H_MIN(classid);
> >
> > We need to 'get' tc classes too, eg for individual dumps. Then we
> > should omit them in .leaf, .graft etc.
> >
> 
> OK missed this. Looks like iproute2 always sets NLM_F_DUMP which works because it uses cl_ops->walk
> 
> # tc -s class show dev eth3 classid 800b:1
> class mqprio 800b:1 root
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0

OK, then it might be only of theoretical value (for some other tools).

> > Why dev->num_tc above, netdev_get_num_tc(dev) here, and dev->num_tc
> > below?
> 
> No reason just inconsistant I will use dev->num_tc.

Well, this would suggest netdev_get_num_tc() is redundant.

> >> +static void mqprio_walk(struct Qdisc *sch, struct qdisc_walker *arg)
> >> +{
> >> +     struct net_device *dev = qdisc_dev(sch);
> >> +     unsigned long ntx;
> >> +     u8 num_tc = netdev_get_num_tc(dev);
> >> +
> >> +     if (arg->stop)
> >> +             return;
> >> +
> >> +     /* Walk hierarchy with a virtual class per tc */
> >> +     arg->count = arg->skip;
> >> +     for (ntx = arg->skip; ntx < dev->num_tx_queues + num_tc; ntx++) {
> >
> > Should we report possibly unused/unconfigured tx_queues?
> 
> I think it may be OK select_queue() could push skbs onto these queues and we may still want to see the statistics in this case. Although (real_num_tx_queues + num_tc) may make sense I see no reason to show queues above real_num_tx_queues.

IMHO, pushing skbs to unconfigured/unwanted/disabled queues is a bug,
and select should handle this, but probably it's a matter of taste.
Btw, I wonder how dev->real_num_tx_queues is obeyed here.

Thanks,
Jarek P.

PS: No offense, but could you try cutting lines ~70c. It's strongly
recommended on kernel lists.

^ permalink raw reply

* Re: [PATCH v3 08/10] ARM: mxs: add ocotp read function
From: Jamie Lokier @ 2011-01-06  0:50 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Jamie Iles, gerg, B32542, netdev, s.hauer, bryan.wu, baruch,
	w.sang, r64343, Shawn Guo, eric, Uwe Kleine-König, davem,
	linux-arm-kernel, lw
In-Reply-To: <20110105201502.GK8638@n2100.arm.linux.org.uk>

Russell King - ARM Linux wrote:
> On Wed, Jan 05, 2011 at 07:44:18PM +0000, Jamie Lokier wrote:
> > 'git show 534be1d5' explains how it works: cpu_relax() flushes buffered
> > writes from _this_ CPU, so that other CPUs which are polling can make
> > progress, which avoids this CPU getting stuck if there is an indirect
> > dependency (no matter how convoluted) between what it's polling and which
> > it wrote just before.
> > 
> > So cpu_relax() is *essential* in some polling loops, not a hint.
> > 
> > In principle that could happen for I/O polling, if (a) buffered memory
> > writes are delayed by I/O read transactions, and (b) the device state we're
> > waiting on depends on I/O yet to be done on another CPU, which could be
> > polling memory first (e.g. a spinlock).
> > 
> > I doubt (a) in practice - but what about buses that block during I/O read?
> > (I have a chip like that here, but it's ARMv4T.)
> 
> Let's be clear - ARMv5 and below generally are well ordered architectures
> within the limits of caching.  There are cases where the write buffer
> allows two writes to pass each other.  However, for IO we generally map
> these - especially for ARMv4 and below - as 'uncacheable unbufferable'.
> So on these, if the program says "read this location" the pipeline will
> stall until the read has been issued - and if you use the result in the
> next instruction, it will stall until the data is available.  So really,
> it's not a problem here.
> 
> ARMv6 and above have a weakly ordered memory model with speculative
> prefetching, so memory reads/writes can be completely unordered.  Device
> accesses can pass memory accesses, but device accesses are always visible
> in program order with respect to each other.
> 
> So, if you're spinning in a loop reading an IO device, all previous IO
> accesses will be completed (in all ARM architectures) before the result
> of your read is evaluated.

No, that wasn't the scenario - it was:

You're spinning reading an IO device, whose state depends indirectly
on a *CPU memory* write that is forever buffered.

(Go and re-read 'git show 534be1d5' if you haven't already.)

The indirect dependence is that another CPU needs to see that write
before it can tell the device to change state in whatever way the
first CPU is polling for.

It's probably clearer in code:

CPU #1

    spin_lock(&mydev->lock);
    /* Look at state. */
    spin_unlock(&mydev->lock);       <-- THIS MEMORY WRITE BUFFERED FOREVER

    /* We expect this to be quick enough that polling is cool. */
    while (readl(mydev->reg_status) & MYDEV_STATUS_BUSY) {
        /* If only we had cpu_relax() */
    }

CPU #2

    spin_lock(&mydev->lock);         <-- STUCK HERE
    /* Look at state. */
    spin_unlock(&mydev->lock);

    writel(MYDEV_TRIGGER, mydev->reg_go);   /* Device is BUSY until this. */

The deadlock in this code (might) happen when CPU #2 is waiting for
the spinlock, and CPU #1's memory write remains in its write buffer
during CPU #1's polling loop.

If that can happen, it's fixed by adding cpu_relax() - to generic
driver code with polling loops.

It can only happen if any CPUs (i.e. ARMv6) that buffer writes due to
prioritising continuous memory reads also have that effect for
continuous IO reads.  This might even apply to non-ARM archs with
non-trivial cpu_relax() definitions; I don't know as they don't always
explain why.

The above driver style isn't particularly obvious, but there are a lot
of drivers with almost every conceivable access pattern.  If you use
your imagination, especially if the second code is an interrupt
handler, it's plausible.  Even though this example would be better
sleeping and waiting normally - there's nothing inherently forbidden
about the above pattern (except that cpu_relax() is needed).

> (But, let's make you squirm some more - mb() on ARMv6 and above may
> equate to a CPU memory barrier _plus_ a few IO accesses to the external
> L2 cache controller - which will be ordered wrt other IO accesses of
> course.)

I squirm at all modern ARM architectures.  Omit the slightest highly
version-specific thing, or run a kernel built with slightly wrong
config options, and it's fine except for random, very rare memory or
I/O corruption.  The workarounds and special bits seem to get more and
more convoluted with each version.

-- Jamie

^ permalink raw reply

* Re: [PATCH v3 08/10] ARM: mxs: add ocotp read function
From: Shawn Guo @ 2011-01-06  1:45 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: Jamie Iles, Uwe Kleine-König, gerg, B32542, netdev, s.hauer,
	baruch, w.sang, r64343, linux-arm-kernel, eric, bryan.wu, davem,
	lw
In-Reply-To: <20110105175617.GD12222@shareable.org>

On Wed, Jan 05, 2011 at 05:56:17PM +0000, Jamie Lokier wrote:
> Jamie Iles wrote:
> > On Wed, Jan 05, 2011 at 05:44:09PM +0100, Uwe Kleine-König wrote:
> > > Hello Jamie,
> > > On Wed, Jan 05, 2011 at 04:16:46PM +0000, Jamie Iles wrote:
> > > > On Wed, Jan 05, 2011 at 10:07:35PM +0800, Shawn Guo wrote:
> > > > > +	/* check both BUSY and ERROR cleared */
> > > > > +	while ((__raw_readl(ocotp_base) &
> > > > > +		(BM_OCOTP_CTRL_BUSY | BM_OCOTP_CTRL_ERROR)) && --timeout)
> > > > > +		/* nothing */;
> > > > 
> > > > Is it worth using cpu_relax() in these polling loops?
> > > I don't know what cpu_relax does for other platforms, but on ARM it's
> > > just a memory barrier which AFAICT doesn't help here at all (which
> > > doesn't need to be correct).  Why do you think it would be better?
> > 
> > Well I don't see that there's anything broken without cpu_relax() but 
> > using cpu_relax() seems to be the most common way of doing busy polling 
> > loops that I've seen. It's also a bit easier to read than a comment and 
> > semi-colon. Perhaps it's just personal preference.
> 
> cpu_relax() is a hint to the CPU to, for example, save power or be
> less aggressive on the memory bus (to save power or be fairer).
> 
> Currently these architectures do more than just a barrier in cpu_relax():
> x86, IA64, PowerPC, Tile and S390.
> 
> Although it's just a hint on ARM at the moment, it might change in
> future - especially with power mattering on so many ARM systems.
> (Even now, just changing it to a very short udelay might save power
> on existing ARMs without breaking drivers.)
> 
Sounds reasonable.  I would take the suggestion.  Thanks, both Jamie.

-- 
Regards,
Shawn


^ permalink raw reply

* [PATCH net-next] cnic: Fix the type field in SPQ messages
From: Michael Chan @ 2011-01-06  1:14 UTC (permalink / raw)
  To: davem; +Cc: netdev

The new firmware interface requires each Slow Path Queue (SPQ) message's
type field to include the function number.  The existing code does not
do this consistently.  We fix this by OR'ing in the function number
into the type field centrally in cnic_submit_kwqe_16().

Signed-off-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/cnic.c |   33 ++++++++++++---------------------
 1 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/drivers/net/cnic.c b/drivers/net/cnic.c
index aa5016a..263a294 100644
--- a/drivers/net/cnic.c
+++ b/drivers/net/cnic.c
@@ -1290,12 +1290,18 @@ static int cnic_submit_kwqe_16(struct cnic_dev *dev, u32 cmd, u32 cid,
 	struct cnic_local *cp = dev->cnic_priv;
 	struct l5cm_spe kwqe;
 	struct kwqe_16 *kwq[1];
+	u16 type_16;
 	int ret;
 
 	kwqe.hdr.conn_and_cmd_data =
 		cpu_to_le32(((cmd << SPE_HDR_CMD_ID_SHIFT) |
 			     BNX2X_HW_CID(cp, cid)));
-	kwqe.hdr.type = cpu_to_le16(type);
+
+	type_16 = (type << SPE_HDR_CONN_TYPE_SHIFT) & SPE_HDR_CONN_TYPE;
+	type_16 |= (cp->pfid << SPE_HDR_FUNCTION_ID_SHIFT) &
+		   SPE_HDR_FUNCTION_ID;
+
+	kwqe.hdr.type = cpu_to_le16(type_16);
 	kwqe.hdr.reserved1 = 0;
 	kwqe.data.phy_address.lo = cpu_to_le32(l5_data->phy_address.lo);
 	kwqe.data.phy_address.hi = cpu_to_le32(l5_data->phy_address.hi);
@@ -1819,19 +1825,15 @@ static int cnic_bnx2x_destroy_ramrod(struct cnic_dev *dev, u32 l5_cid)
 	struct cnic_context *ctx = &cp->ctx_tbl[l5_cid];
 	union l5cm_specific_data l5_data;
 	int ret;
-	u32 hw_cid, type;
+	u32 hw_cid;
 
 	init_waitqueue_head(&ctx->waitq);
 	ctx->wait_cond = 0;
 	memset(&l5_data, 0, sizeof(l5_data));
 	hw_cid = BNX2X_HW_CID(cp, ctx->cid);
-	type = (NONE_CONNECTION_TYPE << SPE_HDR_CONN_TYPE_SHIFT)
-		& SPE_HDR_CONN_TYPE;
-	type |= ((cp->pfid << SPE_HDR_FUNCTION_ID_SHIFT) &
-		 SPE_HDR_FUNCTION_ID);
 
 	ret = cnic_submit_kwqe_16(dev, RAMROD_CMD_ID_COMMON_CFC_DEL,
-				  hw_cid, type, &l5_data);
+				  hw_cid, NONE_CONNECTION_TYPE, &l5_data);
 
 	if (ret == 0)
 		wait_event(ctx->waitq, ctx->wait_cond);
@@ -2204,7 +2206,6 @@ static int cnic_bnx2x_fcoe_init1(struct cnic_dev *dev, struct kwqe *wqes[],
 	cp->kcq2.sw_prod_idx = 0;
 
 	cid = BNX2X_HW_CID(cp, cp->fcoe_init_cid);
-	printk(KERN_ERR "bdbg: submitting INIT RAMROD \n");
 	ret = cnic_submit_kwqe_16(dev, FCOE_RAMROD_CMD_ID_INIT, cid,
 				  FCOE_CONNECTION_TYPE, &l5_data);
 	*work = 3;
@@ -4977,7 +4978,7 @@ static void cnic_init_rings(struct cnic_dev *dev)
 	} else if (test_bit(CNIC_F_BNX2X_CLASS, &dev->flags)) {
 		u32 cli = cp->ethdev->iscsi_l2_client_id;
 		u32 cid = cp->ethdev->iscsi_l2_cid;
-		u32 cl_qzone_id, type;
+		u32 cl_qzone_id;
 		struct client_init_ramrod_data *data;
 		union l5cm_specific_data l5_data;
 		struct ustorm_eth_rx_producers rx_prods = {0};
@@ -5009,15 +5010,10 @@ static void cnic_init_rings(struct cnic_dev *dev)
 		l5_data.phy_address.lo = udev->l2_buf_map & 0xffffffff;
 		l5_data.phy_address.hi = (u64) udev->l2_buf_map >> 32;
 
-		type = (ETH_CONNECTION_TYPE << SPE_HDR_CONN_TYPE_SHIFT)
-			& SPE_HDR_CONN_TYPE;
-		type |= ((cp->pfid << SPE_HDR_FUNCTION_ID_SHIFT) &
-			SPE_HDR_FUNCTION_ID);
-
 		set_bit(CNIC_LCL_FL_RINGS_INITED, &cp->cnic_local_flags);
 
 		cnic_submit_kwqe_16(dev, RAMROD_CMD_ID_ETH_CLIENT_SETUP,
-			cid, type, &l5_data);
+			cid, ETH_CONNECTION_TYPE, &l5_data);
 
 		i = 0;
 		while (test_bit(CNIC_LCL_FL_L2_WAIT, &cp->cnic_local_flags) &&
@@ -5047,7 +5043,6 @@ static void cnic_shutdown_rings(struct cnic_dev *dev)
 		u32 cid = cp->ethdev->iscsi_l2_cid;
 		union l5cm_specific_data l5_data;
 		int i;
-		u32 type;
 
 		cnic_ring_ctl(dev, cid, cli, 0);
 
@@ -5068,12 +5063,8 @@ static void cnic_shutdown_rings(struct cnic_dev *dev)
 		cnic_spq_completion(dev, DRV_CTL_RET_L2_SPQ_CREDIT_CMD, 1);
 
 		memset(&l5_data, 0, sizeof(l5_data));
-		type = (NONE_CONNECTION_TYPE << SPE_HDR_CONN_TYPE_SHIFT)
-			& SPE_HDR_CONN_TYPE;
-		type |= ((cp->pfid << SPE_HDR_FUNCTION_ID_SHIFT) &
-			 SPE_HDR_FUNCTION_ID);
 		cnic_submit_kwqe_16(dev, RAMROD_CMD_ID_COMMON_CFC_DEL,
-			cid, type, &l5_data);
+			cid, NONE_CONNECTION_TYPE, &l5_data);
 		msleep(10);
 	}
 	clear_bit(CNIC_LCL_FL_RINGS_INITED, &cp->cnic_local_flags);
-- 
1.6.4.GIT



^ permalink raw reply related

* Re: [net-next-2.6 PATCH v5 2/2] net_sched: implement a root container qdisc sch_mqprio
From: John Fastabend @ 2011-01-06  2:31 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: davem@davemloft.net, hadi@cyberus.ca, shemminger@vyatta.com,
	tgraf@infradead.org, eric.dumazet@gmail.com,
	bhutchings@solarflare.com, nhorman@tuxdriver.com,
	netdev@vger.kernel.org
In-Reply-To: <20110106003208.GA2166@del.dom.local>

On 1/5/2011 4:32 PM, Jarek Poplawski wrote:
> On Wed, Jan 05, 2011 at 09:38:44AM -0800, John Fastabend wrote:
>> On 1/4/2011 2:59 PM, Jarek Poplawski wrote:
>>> On Tue, Jan 04, 2011 at 10:56:46AM -0800, John Fastabend wrote:
> ...
>>>> +
>>>> +     /* Always use supplied priority mappings */
>>>> +     for (i = 0; i < TC_BITMASK + 1; i++) {
>>>> +             if (netdev_set_prio_tc_map(dev, i, qopt->prio_tc_map[i])) {
>>>> +                     err = -EINVAL;
>>>
>>> This would probably trigger if we try qopt->num_tc == 0. Is it expected?
>>
>> netdev_set_prio_tc_map() returns 0 on sucess. This if(..) is a bit strange though.
>>
>>         err = netdev_set_prio_tc_map(dev, i, qopt->prio_tc_map[i])
>>         if (err)
>>                 ...
>>
>> Is cleaner IMHO.
> 
> Maybe. But I still wonder if qopt->num_tc == 0 is valid (by design)?
> (netdev_set_prio_tc_map() returns -EINVAL if dev->num_tc == 0)
> 

I guess I don't see how it returns -EINVAL. Although setting
qopt->num_tc = 0 is equivalent to disabling traffic classes so
it is a strange thing to do but I don't expect it should be a
problem.

static inline
int netdev_set_prio_tc_map(struct net_device *dev, u8 prio, u8 tc)
{
        if (tc >= dev->num_tc)
                return -EINVAL;

        dev->prio_tc_map[prio & TC_BITMASK] = tc & TC_BITMASK;
        return 0;
}

>>>> +static unsigned long mqprio_get(struct Qdisc *sch, u32 classid)
>>>> +{
>>>> +     unsigned int ntx = TC_H_MIN(classid);
>>>
>>> We need to 'get' tc classes too, eg for individual dumps. Then we
>>> should omit them in .leaf, .graft etc.
>>>
>>
>> OK missed this. Looks like iproute2 always sets NLM_F_DUMP which works because it uses cl_ops->walk
>>
>> # tc -s class show dev eth3 classid 800b:1
>> class mqprio 800b:1 root
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
> 
> OK, then it might be only of theoretical value (for some other tools).
>>> Why dev->num_tc above, netdev_get_num_tc(dev) here, and dev->num_tc
>>> below?
>>
>> No reason just inconsistant I will use dev->num_tc.
> 
> Well, this would suggest netdev_get_num_tc() is redundant.
> 

Guess it is I can remove it.

static inline
u8 netdev_get_num_tc(const struct net_device *dev)
{
        return dev->num_tc;
}

>>>> +static void mqprio_walk(struct Qdisc *sch, struct qdisc_walker *arg)
>>>> +{
>>>> +     struct net_device *dev = qdisc_dev(sch);
>>>> +     unsigned long ntx;
>>>> +     u8 num_tc = netdev_get_num_tc(dev);
>>>> +
>>>> +     if (arg->stop)
>>>> +             return;
>>>> +
>>>> +     /* Walk hierarchy with a virtual class per tc */
>>>> +     arg->count = arg->skip;
>>>> +     for (ntx = arg->skip; ntx < dev->num_tx_queues + num_tc; ntx++) {
>>>
>>> Should we report possibly unused/unconfigured tx_queues?
>>
>> I think it may be OK select_queue() could push skbs onto these queues and we may still want to see the statistics in this case. Although (real_num_tx_queues + num_tc) may make sense I see no reason to show queues above real_num_tx_queues.
> 
> IMHO, pushing skbs to unconfigured/unwanted/disabled queues is a bug,
> and select should handle this, but probably it's a matter of taste.

certainly a bug for queues >= real_num_tx_queues.

> Btw, I wonder how dev->real_num_tx_queues is obeyed here.
> 

Its not. real_num_tx_queues is not handled correctly here also
real_num_tx_queues can be changed so this need to be handled as well.
This means an additional check in the options parsing.

> Thanks,
> Jarek P.
> 
> PS: No offense, but could you try cutting lines ~70c. It's strongly
> recommended on kernel lists.

None taken ~70c from now on.

^ permalink raw reply

* typo in ping.c?
From: Julian C. Dunn @ 2011-01-06  3:36 UTC (permalink / raw)
  To: netdev

I sent this about a month ago to Yoshifuji Hideaki but didn't hear anything.
Perhaps it could use a wider distribution.

- Julian

-----Original Message-----
From: Julian C. Dunn [mailto:jdunn@aquezada.com] 
Sent: Sunday, December 12, 2010 8:49 PM
To: 'YOSHIFUJI Hideaki'
Subject: typo in ping.c?

Hi,

I'm not an expert in C code or in iputils but I think I found a
typographical error in this commit:

http://www.linux-ipv6.org/gitweb/gitweb.cgi?p=gitroot/iputils.git;a=commitdi
ff;h=d5e3dcb81ff3828fa40eb4e8c562ca7015d7ac6a

The previous version printed the icmp_seq inline within ping_common.c, but
in your diff you are now invoking a function void pr_echo_reply().
Unfortunately, it looks like "icmp_seq" is spelled wrong: you have it
spelled as "icmp_req".

I discovered this when trying to use a tool that depended on the spelling of
"icmp_seq" and it failed.

- Julian


^ permalink raw reply

* Re: [RFC] sched: CHOKe packet scheduler (v0.2)
From: Eric Dumazet @ 2011-01-06  4:07 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev
In-Reply-To: <20110105112104.64ad3c86@nehalam>

Le mercredi 05 janvier 2011 à 11:21 -0800, Stephen Hemminger a écrit :
> This implements the CHOKe packet scheduler based on the existing
> Linux RED scheduler based on the algorithm described in the paper.
> 
> The core idea is:
>   For every packet arrival:
>   	Calculate Qave
> 	if (Qave < minth) 
> 	     Queue the new packet
> 	else 
> 	     Select randomly a packet from the queue 
> 	     if (both packets from same flow)
> 	     then Drop both the packets
> 	     else if (Qave > maxth)
> 	          Drop packet
> 	     else
> 	       	  Admit packet with probability p (same as RED)
> 
> See also:
>   Rong Pan, Balaji Prabhakar, Konstantinos Psounis, "CHOKe: a stateless active
>    queue management scheme for approximating fair bandwidth allocation", 
>   Proceeding of INFOCOM'2000, March 2000. 
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 

To be really useful in a wide range of environments, I believe that :

- CHOKe should be able to use an external flow classifier (like say...
SFQ) to compute a token and compare two skbs by this token instead of
custom rxhash or whatever. (rxhash can be the default in absence of flow
classifier). Probably you need to store the token in skb->cb[] to avoid
calling tc_classify() several times for a given packet. 

http://lwn.net/Articles/236200/
http://kerneltrap.org/mailarchive/linux-netdev/2008/1/31/667679

- Must use a FIFO with O(1) access to Nth skb in queue.
 
 A linked list makes this implementation too expensive for big queues.

For small queues (less than 128 skbs at this moment for SFQ), existing
schedulers are good enough.

CHOKe authors dont mention this in their paper, but their experiments
were done in 1999 with 1Mbs links. minth=100 and maxth=200, limit=300

We want to try CHOKe with modern links, probably with minth=2000 and
maxth=4000 or more.

They said "It is arguably more difficult to drop a randomy chosen packet
since this means removing from a linked-list. Instead of doing this, we
propose to add one extra bit to the packet header. The bit is set to one
if the drop candidate is to be dropped. When a packet advance to the
head of the FIFO buffer, the status of the bit determines whether it is
to be immediately discarded or transmitted on the outgoind line"

If they thought removing a buffer from a linked list was expensive
(!!!), they certainly assumed the previous access to the randomly chosen
buffer was faster than the skb unlink !

Using a circular buffer should be enough, using a similar trick than
suggested : when droping an skb from the ring, stick a NULL pointer and
dont memmove() the window to shrink it.

struct skb_ring {
	unsigned int head;
	unsigned int tail;
	unsigned int size; /* a power of two */
	struct sk_buff **table;
};

Doing so avoids the cache misses to adjacent skbs prev/next when
queue/dequeue is done.





^ permalink raw reply

* Re: [PATCH v3 05/10] net/fec: add dual fec support for mx28
From: Shawn Guo @ 2011-01-06  4:14 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: davem, gerg, baruch, eric, bryan.wu, r64343, B32542, lw, w.sang,
	s.hauer, netdev, linux-arm-kernel
In-Reply-To: <20110105163449.GU25121@pengutronix.de>

Hi Uwe,

On Wed, Jan 05, 2011 at 05:34:49PM +0100, Uwe Kleine-König wrote:
> Hello,
> 
> On Wed, Jan 05, 2011 at 10:07:32PM +0800, Shawn Guo wrote:
> > This patch is to add mx28 dual fec support. Here are some key notes
> > for mx28 fec controller.
> >
> >  - The mx28 fec controller naming ENET-MAC is a different IP from FEC
> >    used on other i.mx variants.  But they are basically compatible
> >    on software interface, so it's possible to share the same driver.
> >  - ENET-MAC design made an improper assumption that it runs on a
> >    big-endian system. As the result, driver has to swap every frame
> >    going to and coming from the controller.
> >  - The external phys can only be configured by fec0, which means fec1
> >    can not work independently and both phys need to be configured by
> >    mii_bus attached on fec0.
> >  - ENET-MAC reset will get mac address registers reset too.
> >  - ENET-MAC MII/RMII mode and 10M/100M speed are configured
> >    differently FEC.
> >  - ETHER_EN bit must be set to get ENET-MAC interrupt work.
> >
> > Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
> > ---
> > Changes for v3:
> >  - Move v2 changes into patch #3
> >  - Use device name to check if it's running on ENET-MAC
> >
> >  drivers/net/Kconfig |    7 ++-
> >  drivers/net/fec.c   |  140 +++++++++++++++++++++++++++++++++++++++++++++------
> >  drivers/net/fec.h   |    5 +-
> >  3 files changed, 131 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> > index 4f1755b..f34629b 100644
> > --- a/drivers/net/Kconfig
> > +++ b/drivers/net/Kconfig
> > @@ -1944,18 +1944,19 @@ config 68360_ENET
> >  config FEC
> >       bool "FEC ethernet controller (of ColdFire and some i.MX CPUs)"
> >       depends on M523x || M527x || M5272 || M528x || M520x || M532x || \
> > -             MACH_MX27 || ARCH_MX35 || ARCH_MX25 || ARCH_MX5
> > +             MACH_MX27 || ARCH_MX35 || ARCH_MX25 || ARCH_MX5 || SOC_IMX28
> >       select PHYLIB
> >       help
> >         Say Y here if you want to use the built-in 10/100 Fast ethernet
> >         controller on some Motorola ColdFire and Freescale i.MX processors.
> >
> >  config FEC2
> > -     bool "Second FEC ethernet controller (on some ColdFire CPUs)"
> > +     bool "Second FEC ethernet controller"
> >       depends on FEC
> >       help
> >         Say Y here if you want to use the second built-in 10/100 Fast
> > -       ethernet controller on some Motorola ColdFire processors.
> > +       ethernet controller on some Motorola ColdFire and Freescale
> > +       i.MX processors.
> >
> >  config FEC_MPC52xx
> >       tristate "MPC52xx FEC driver"
> > diff --git a/drivers/net/fec.c b/drivers/net/fec.c
> > index 8a1c51f..67ba263 100644
> > --- a/drivers/net/fec.c
> > +++ b/drivers/net/fec.c
> > @@ -17,6 +17,8 @@
> >   *
> >   * Bug fixes and cleanup by Philippe De Muyter (phdm@macqel.be)
> >   * Copyright (c) 2004-2006 Macq Electronique SA.
> > + *
> > + * Copyright (C) 2010 Freescale Semiconductor, Inc.
> >   */
> >
> >  #include <linux/module.h>
> > @@ -45,20 +47,33 @@
> >
> >  #include <asm/cacheflush.h>
> >
> > -#ifndef CONFIG_ARCH_MXC
> > +#if !defined(CONFIG_ARCH_MXC) && !defined(CONFIG_SOC_IMX28)
> maybe !defined(CONFIG_ARM)?
> 
Sounds good.

> >  #include <asm/coldfire.h>
> >  #include <asm/mcfsim.h>
> >  #endif
> >
> >  #include "fec.h"
> >
> > -#ifdef CONFIG_ARCH_MXC
> > -#include <mach/hardware.h>
> > +#if defined(CONFIG_ARCH_MXC) || defined(CONFIG_SOC_IMX28)
> >  #define FEC_ALIGNMENT        0xf
> >  #else
> >  #define FEC_ALIGNMENT        0x3
> >  #endif
> >
> > +#define DRIVER_NAME  "fec"
> > +#define ENET_MAC_NAME        "enet-mac"
> > +
> > +static struct platform_device_id fec_devtype[] = {
> > +     {
> > +             .name = DRIVER_NAME,
> > +     }, {
> > +             .name = ENET_MAC_NAME,
> > +     }
> I'd done it differently:
> 
>         {
>                 .name = "fec",
>                 .driver_data = 0,
>         }, {
>                 .name = "imx28-fec",
>                 .driver_data = HAS_ENET_MAC | ...,
>         }
> 
> and then test the bits in driver_data (which you get using
> platform_get_device_id() when you need to distinguish.
> Comparing names doesn't scale, assume there are three further features
> to distinguish, then you need to use strtok or index and get device
> names like enet-mac-with-feature1-but-without-feature2-and-feature3.
> 
Makes sense.  The frame endian issue will be fixed in future revision,
so I would define bit SWAP_FRAME for testing.

> > +};
> > +
> > +static unsigned fec_is_enetmac;
> > +static struct mii_bus *fec_mii_bus;
> In practice this might work, but actually these are per-device
> properties, not driver-global.  So it should go into the private data
> struct.
> 
Since it's just a tmp variable for holding fec0 mii_bus in function
fec_enet_mii_init, I would move it into the function as a static
variable.

> > +
> >  static unsigned char macaddr[ETH_ALEN];
> >  module_param_array(macaddr, byte, NULL, 0);
> >  MODULE_PARM_DESC(macaddr, "FEC Ethernet MAC address");
> > @@ -129,7 +144,8 @@ MODULE_PARM_DESC(macaddr, "FEC Ethernet MAC address");
> >   * account when setting it.
> >   */
> >  #if defined(CONFIG_M523x) || defined(CONFIG_M527x) || defined(CONFIG_M528x) || \
> > -    defined(CONFIG_M520x) || defined(CONFIG_M532x) || defined(CONFIG_ARCH_MXC)
> > +    defined(CONFIG_M520x) || defined(CONFIG_M532x) || \
> > +    defined(CONFIG_ARCH_MXC) || defined(CONFIG_SOC_IMX28)
> >  #define      OPT_FRAME_SIZE  (PKT_MAXBUF_SIZE << 16)
> >  #else
> >  #define      OPT_FRAME_SIZE  0
> > @@ -208,6 +224,17 @@ static void fec_stop(struct net_device *dev);
> >  /* Transmitter timeout */
> >  #define TX_TIMEOUT (2 * HZ)
> >
> > +static void *swap_buffer(void *bufaddr, int len)
> > +{
> > +     int i;
> > +     unsigned int *buf = bufaddr;
> > +
> > +     for (i = 0; i < (len + 3) / 4; i++, buf++)
> > +             *buf = __swab32(*buf);
> Would it better to use cpu_to_be32 here?  Then the compiler might
> be smart enough to optimize it away on BE.  (Currently the code
> generated for a BE build would be wrong with your patch, wouldn't it?)
Yes.

> > +
> > +     return bufaddr;
> > +}
> > +
> >  static netdev_tx_t
> >  fec_enet_start_xmit(struct sk_buff *skb, struct net_device *dev)
> >  {
> > @@ -256,6 +283,14 @@ fec_enet_start_xmit(struct sk_buff *skb, struct net_device *dev)
> >               bufaddr = fep->tx_bounce[index];
> >       }
> >
> > +     /*
> > +      * enet-mac design made an improper assumption that it's running
> > +      * on a big endian system. As the result, driver has to swap
> if he was really aware that he limits the performant use of the fec to
> big endian systems, can you please make him stop designing hardware!?
> 
You over looked my power :)  BTW, he had left Freescale.

> > +      * every frame going to and coming from the controller.
> > +      */
> > +     if (fec_is_enetmac)
> > +             swap_buffer(bufaddr, skb->len);
> > +
> >       /* Save skb pointer */
> >       fep->tx_skbuff[fep->skb_cur] = skb;
> >
> > @@ -487,6 +522,9 @@ fec_enet_rx(struct net_device *dev)
> >               dma_unmap_single(NULL, bdp->cbd_bufaddr, bdp->cbd_datlen,
> >                               DMA_FROM_DEVICE);
> >
> > +             if (fec_is_enetmac)
> > +                     swap_buffer(data, pkt_len);
> > +
> >               /* This does 16 byte alignment, exactly what we need.
> >                * The packet length includes FCS, but we don't want to
> >                * include that when passing upstream as it messes up
> > @@ -689,6 +727,7 @@ static int fec_enet_mii_probe(struct net_device *dev)
> >       char mdio_bus_id[MII_BUS_ID_SIZE];
> >       char phy_name[MII_BUS_ID_SIZE + 3];
> >       int phy_id;
> > +     int dev_id = fep->pdev->id;
> >
> >       fep->phy_dev = NULL;
> >
> > @@ -700,6 +739,8 @@ static int fec_enet_mii_probe(struct net_device *dev)
> >                       continue;
> >               if (fep->mii_bus->phy_map[phy_id]->phy_id == 0)
> >                       continue;
> > +             if (fec_is_enetmac && dev_id--)
> > +                     continue;
> >               strncpy(mdio_bus_id, fep->mii_bus->id, MII_BUS_ID_SIZE);
> >               break;
> >       }
> > @@ -741,6 +782,28 @@ static int fec_enet_mii_init(struct platform_device *pdev)
> >       struct fec_enet_private *fep = netdev_priv(dev);
> >       int err = -ENXIO, i;
> >
> > +     /*
> > +      * The dual fec interfaces are not equivalent with enet-mac.
> > +      * Here are the differences:
> > +      *
> > +      *  - fec0 supports MII & RMII modes while fec1 only supports RMII
> > +      *  - fec0 acts as the 1588 time master while fec1 is slave
> > +      *  - external phys can only be configured by fec0
> > +      *
> > +      * That is to say fec1 can not work independently. It only works
> > +      * when fec0 is working. The reason behind this design is that the
> > +      * second interface is added primarily for Switch mode.
> > +      *
> > +      * Because of the last point above, both phys are attached on fec0
> > +      * mdio interface in board design, and need to be configured by
> > +      * fec0 mii_bus.
> > +      */
> > +     if (fec_is_enetmac && pdev->id) {
> > +             /* fec1 uses fec0 mii_bus */
> > +             fep->mii_bus = fec_mii_bus;
> > +             return 0;
> > +     }
> > +
> >       fep->mii_timeout = 0;
> >
> >       /*
> > @@ -777,6 +840,10 @@ static int fec_enet_mii_init(struct platform_device *pdev)
> >       if (mdiobus_register(fep->mii_bus))
> >               goto err_out_free_mdio_irq;
> >
> > +     /* save fec0 mii_bus */
> > +     if (fec_is_enetmac)
> > +             fec_mii_bus = fep->mii_bus;
> > +
> >       return 0;
> >
> >  err_out_free_mdio_irq:
> > @@ -1149,11 +1216,22 @@ fec_restart(struct net_device *dev, int duplex)
> >  {
> >       struct fec_enet_private *fep = netdev_priv(dev);
> >       int i;
> > +     u32 val, temp_mac[2];
> >
> >       /* Whack a reset.  We should wait for this. */
> >       writel(1, fep->hwp + FEC_ECNTRL);
> >       udelay(10);
> >
> > +     /*
> > +      * enet-mac reset will reset mac address registers too,
> > +      * so need to reconfigure it.
> > +      */
> > +     if (fec_is_enetmac) {
> > +             memcpy(&temp_mac, dev->dev_addr, ETH_ALEN);
		  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > +             writel(cpu_to_be32(temp_mac[0]), fep->hwp + FEC_ADDR_LOW);
> > +             writel(cpu_to_be32(temp_mac[1]), fep->hwp + FEC_ADDR_HIGH);
> where is the value saved to temp_mac[]?  For me it looks you write
> uninitialized data into the mac registers.

memcpy above.

-- 
Regards,
Shawn


^ permalink raw reply

* [GIT PULL nf-next-2.6] ipvs namespaces
From: Simon Horman @ 2011-01-06  6:15 UTC (permalink / raw)
  To: netfilter-devel, lvs-devel, netdev
  Cc: Patrick McHardy, Julian Anastasov, Hans Schillstrom

Hi Patrick,

please consider pulling
git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-test-2.6.git master
in order to get the netns changes to IPVS from Hans Schillstrom.

Thanks


^ permalink raw reply

* [PATCH 01/22] IPVS: netns, add basic init per netns.
From: Simon Horman @ 2011-01-06  6:15 UTC (permalink / raw)
  To: netfilter-devel, lvs-devel, netdev
  Cc: Patrick McHardy, Julian Anastasov, Hans Schillstrom, Simon Horman
In-Reply-To: <1294294578-8601-1-git-send-email-horms@verge.net.au>

From: Hans Schillstrom <hans.schillstrom@ericsson.com>

Preparation for network name-space init, in this stage
some empty functions exists.

In most files there is a check if it is root ns i.e. init_net
if (!net_eq(net, &init_net))
        return ...
this will be removed by the last patch, when enabling name-space.

*v3
 ip_vs_conn.c merge error corrected.
 net_ipvs #ifdef removed as sugested by Jan Engelhardt

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
 include/net/ip_vs.h              |   11 ++++++
 include/net/net_namespace.h      |    2 +
 include/net/netns/ip_vs.h        |   25 +++++++++++++++
 net/netfilter/ipvs/ip_vs_app.c   |   32 +++++++++++++++---
 net/netfilter/ipvs/ip_vs_conn.c  |   34 +++++++++++++++++---
 net/netfilter/ipvs/ip_vs_core.c  |   63 ++++++++++++++++++++++++++++++++++++-
 net/netfilter/ipvs/ip_vs_ctl.c   |   49 ++++++++++++++++++++++++-----
 net/netfilter/ipvs/ip_vs_est.c   |   20 +++++++++++-
 net/netfilter/ipvs/ip_vs_ftp.c   |   34 ++++++++++++++++++--
 net/netfilter/ipvs/ip_vs_lblc.c  |   37 ++++++++++++++++++++--
 net/netfilter/ipvs/ip_vs_lblcr.c |   38 ++++++++++++++++++++--
 net/netfilter/ipvs/ip_vs_proto.c |   19 +++++++++++
 net/netfilter/ipvs/ip_vs_sync.c  |   27 ++++++++++++++++
 13 files changed, 356 insertions(+), 35 deletions(-)
 create mode 100644 include/net/netns/ip_vs.h

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index d858264..c1c2ece 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -28,6 +28,15 @@
 #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
 #include <net/netfilter/nf_conntrack.h>
 #endif
+#include <net/net_namespace.h>		/* Netw namespace */
+
+/*
+ * Generic access of ipvs struct
+ */
+static inline struct netns_ipvs *net_ipvs(struct net* net)
+{
+	return net->ipvs;
+}
 
 /* Connections' size value needed by ip_vs_ctl.c */
 extern int ip_vs_conn_tab_size;
@@ -922,6 +931,8 @@ extern char ip_vs_backup_mcast_ifn[IP_VS_IFNAME_MAXLEN];
 extern int start_sync_thread(int state, char *mcast_ifn, __u8 syncid);
 extern int stop_sync_thread(int state);
 extern void ip_vs_sync_conn(struct ip_vs_conn *cp);
+extern int ip_vs_sync_init(void);
+extern void ip_vs_sync_cleanup(void);
 
 
 /*
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 1bf812b..b3b4a34 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -20,6 +20,7 @@
 #include <net/netns/conntrack.h>
 #endif
 #include <net/netns/xfrm.h>
+#include <net/netns/ip_vs.h>
 
 struct proc_dir_entry;
 struct net_device;
@@ -94,6 +95,7 @@ struct net {
 #ifdef CONFIG_XFRM
 	struct netns_xfrm	xfrm;
 #endif
+	struct netns_ipvs	*ipvs;
 };
 
 
diff --git a/include/net/netns/ip_vs.h b/include/net/netns/ip_vs.h
new file mode 100644
index 0000000..12fe840
--- /dev/null
+++ b/include/net/netns/ip_vs.h
@@ -0,0 +1,25 @@
+/*
+ *  IP Virtual Server
+ *  Data structure for network namspace
+ *
+ */
+
+#ifndef IP_VS_H_
+#define IP_VS_H_
+
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/list_nulls.h>
+#include <linux/ip_vs.h>
+#include <asm/atomic.h>
+#include <linux/in.h>
+
+struct ip_vs_stats;
+struct ip_vs_sync_buff;
+struct ctl_table_header;
+
+struct netns_ipvs {
+	int			gen;		/* Generation */
+};
+
+#endif /* IP_VS_H_ */
diff --git a/net/netfilter/ipvs/ip_vs_app.c b/net/netfilter/ipvs/ip_vs_app.c
index a475ede..6d10352 100644
--- a/net/netfilter/ipvs/ip_vs_app.c
+++ b/net/netfilter/ipvs/ip_vs_app.c
@@ -264,7 +264,7 @@ static inline void vs_fix_seq(const struct ip_vs_seq *vseq, struct tcphdr *th)
 	 *	for all packets	before most recent resized pkt seq.
 	 */
 	if (vseq->delta || vseq->previous_delta) {
-		if(after(seq, vseq->init_seq)) {
+		if (after(seq, vseq->init_seq)) {
 			th->seq = htonl(seq + vseq->delta);
 			IP_VS_DBG(9, "%s(): added delta (%d) to seq\n",
 				  __func__, vseq->delta);
@@ -293,7 +293,7 @@ vs_fix_ack_seq(const struct ip_vs_seq *vseq, struct tcphdr *th)
 	if (vseq->delta || vseq->previous_delta) {
 		/* since ack_seq is the number of octet that is expected
 		   to receive next, so compare it with init_seq+delta */
-		if(after(ack_seq, vseq->init_seq+vseq->delta)) {
+		if (after(ack_seq, vseq->init_seq+vseq->delta)) {
 			th->ack_seq = htonl(ack_seq - vseq->delta);
 			IP_VS_DBG(9, "%s(): subtracted delta "
 				  "(%d) from ack_seq\n", __func__, vseq->delta);
@@ -569,15 +569,35 @@ static const struct file_operations ip_vs_app_fops = {
 };
 #endif
 
-int __init ip_vs_app_init(void)
+static int __net_init __ip_vs_app_init(struct net *net)
 {
-	/* we will replace it with proc_net_ipvs_create() soon */
-	proc_net_fops_create(&init_net, "ip_vs_app", 0, &ip_vs_app_fops);
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return -EPERM;
+
+	proc_net_fops_create(net, "ip_vs_app", 0, &ip_vs_app_fops);
 	return 0;
 }
 
+static void __net_exit __ip_vs_app_cleanup(struct net *net)
+{
+	proc_net_remove(net, "ip_vs_app");
+}
+
+static struct pernet_operations ip_vs_app_ops = {
+	.init = __ip_vs_app_init,
+	.exit = __ip_vs_app_cleanup,
+};
+
+int __init ip_vs_app_init(void)
+{
+	int rv;
+
+	rv = register_pernet_subsys(&ip_vs_app_ops);
+	return rv;
+}
+
 
 void ip_vs_app_cleanup(void)
 {
-	proc_net_remove(&init_net, "ip_vs_app");
+	unregister_pernet_subsys(&ip_vs_app_ops);
 }
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 66e4662..7c1b502 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -1201,11 +1201,36 @@ static void ip_vs_conn_flush(void)
 		goto flush_again;
 	}
 }
+/*
+ * per netns init and exit
+ */
+int __net_init __ip_vs_conn_init(struct net *net)
+{
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return -EPERM;
 
+	proc_net_fops_create(net, "ip_vs_conn", 0, &ip_vs_conn_fops);
+	proc_net_fops_create(net, "ip_vs_conn_sync", 0, &ip_vs_conn_sync_fops);
+	return 0;
+}
+
+static void __net_exit __ip_vs_conn_cleanup(struct net *net)
+{
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return;
+
+	proc_net_remove(net, "ip_vs_conn");
+	proc_net_remove(net, "ip_vs_conn_sync");
+}
+static struct pernet_operations ipvs_conn_ops = {
+	.init = __ip_vs_conn_init,
+	.exit = __ip_vs_conn_cleanup,
+};
 
 int __init ip_vs_conn_init(void)
 {
 	int idx;
+	int retc;
 
 	/* Compute size and mask */
 	ip_vs_conn_tab_size = 1 << ip_vs_conn_tab_bits;
@@ -1243,24 +1268,21 @@ int __init ip_vs_conn_init(void)
 		rwlock_init(&__ip_vs_conntbl_lock_array[idx].l);
 	}
 
-	proc_net_fops_create(&init_net, "ip_vs_conn", 0, &ip_vs_conn_fops);
-	proc_net_fops_create(&init_net, "ip_vs_conn_sync", 0, &ip_vs_conn_sync_fops);
+	retc = register_pernet_subsys(&ipvs_conn_ops);
 
 	/* calculate the random value for connection hash */
 	get_random_bytes(&ip_vs_conn_rnd, sizeof(ip_vs_conn_rnd));
 
-	return 0;
+	return retc;
 }
 
-
 void ip_vs_conn_cleanup(void)
 {
+	unregister_pernet_subsys(&ipvs_conn_ops);
 	/* flush all the connection entries first */
 	ip_vs_conn_flush();
 
 	/* Release the empty cache */
 	kmem_cache_destroy(ip_vs_conn_cachep);
-	proc_net_remove(&init_net, "ip_vs_conn");
-	proc_net_remove(&init_net, "ip_vs_conn_sync");
 	vfree(ip_vs_conn_tab);
 }
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 5287771..206f40c 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -41,6 +41,7 @@
 #include <net/icmp.h>                   /* for icmp_send */
 #include <net/route.h>
 #include <net/ip6_checksum.h>
+#include <net/netns/generic.h>		/* net_generic() */
 
 #include <linux/netfilter.h>
 #include <linux/netfilter_ipv4.h>
@@ -68,6 +69,12 @@ EXPORT_SYMBOL(ip_vs_conn_put);
 EXPORT_SYMBOL(ip_vs_get_debug_level);
 #endif
 
+int ip_vs_net_id __read_mostly;
+#ifdef IP_VS_GENERIC_NETNS
+EXPORT_SYMBOL(ip_vs_net_id);
+#endif
+/* netns cnt used for uniqueness */
+static atomic_t ipvs_netns_cnt = ATOMIC_INIT(0);
 
 /* ID used in ICMP lookups */
 #define icmp_id(icmph)          (((icmph)->un).echo.id)
@@ -1813,6 +1820,44 @@ static struct nf_hook_ops ip_vs_ops[] __read_mostly = {
 #endif
 };
 
+/*
+ *	Initialize IP Virtual Server netns mem.
+ */
+static int __net_init __ip_vs_init(struct net *net)
+{
+	struct netns_ipvs *ipvs;
+
+	if (!net_eq(net, &init_net)) {
+		pr_err("The final patch for enabling netns is missing\n");
+		return -EPERM;
+	}
+	ipvs = net_generic(net, ip_vs_net_id);
+	if (ipvs == NULL) {
+		pr_err("%s(): no memory.\n", __func__);
+		return -ENOMEM;
+	}
+	/* Counters used for creating unique names */
+	ipvs->gen = atomic_read(&ipvs_netns_cnt);
+	atomic_inc(&ipvs_netns_cnt);
+	net->ipvs = ipvs;
+	printk(KERN_INFO "IPVS: Creating netns size=%lu id=%d\n",
+			 sizeof(struct netns_ipvs), ipvs->gen);
+	return 0;
+}
+
+static void __net_exit __ip_vs_cleanup(struct net *net)
+{
+	struct netns_ipvs *ipvs = net_ipvs(net);
+
+	IP_VS_DBG(10, "ipvs netns %d released\n", ipvs->gen);
+}
+
+static struct pernet_operations ipvs_core_ops = {
+	.init = __ip_vs_init,
+	.exit = __ip_vs_cleanup,
+	.id   = &ip_vs_net_id,
+	.size = sizeof(struct netns_ipvs),
+};
 
 /*
  *	Initialize IP Virtual Server
@@ -1821,8 +1866,11 @@ static int __init ip_vs_init(void)
 {
 	int ret;
 
-	ip_vs_estimator_init();
+	ret = register_pernet_subsys(&ipvs_core_ops);	/* Alloc ip_vs struct */
+	if (ret < 0)
+		return ret;
 
+	ip_vs_estimator_init();
 	ret = ip_vs_control_init();
 	if (ret < 0) {
 		pr_err("can't setup control.\n");
@@ -1843,15 +1891,23 @@ static int __init ip_vs_init(void)
 		goto cleanup_app;
 	}
 
+	ret = ip_vs_sync_init();
+	if (ret < 0) {
+		pr_err("can't setup sync data.\n");
+		goto cleanup_conn;
+	}
+
 	ret = nf_register_hooks(ip_vs_ops, ARRAY_SIZE(ip_vs_ops));
 	if (ret < 0) {
 		pr_err("can't register hooks.\n");
-		goto cleanup_conn;
+		goto cleanup_sync;
 	}
 
 	pr_info("ipvs loaded.\n");
 	return ret;
 
+cleanup_sync:
+	ip_vs_sync_cleanup();
   cleanup_conn:
 	ip_vs_conn_cleanup();
   cleanup_app:
@@ -1861,17 +1917,20 @@ static int __init ip_vs_init(void)
 	ip_vs_control_cleanup();
   cleanup_estimator:
 	ip_vs_estimator_cleanup();
+	unregister_pernet_subsys(&ipvs_core_ops);	/* free ip_vs struct */
 	return ret;
 }
 
 static void __exit ip_vs_cleanup(void)
 {
 	nf_unregister_hooks(ip_vs_ops, ARRAY_SIZE(ip_vs_ops));
+	ip_vs_sync_cleanup();
 	ip_vs_conn_cleanup();
 	ip_vs_app_cleanup();
 	ip_vs_protocol_cleanup();
 	ip_vs_control_cleanup();
 	ip_vs_estimator_cleanup();
+	unregister_pernet_subsys(&ipvs_core_ops);	/* free ip_vs struct */
 	pr_info("ipvs unloaded.\n");
 }
 
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index d12a13c..4e27024 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -3408,6 +3408,42 @@ static void ip_vs_genl_unregister(void)
 
 /* End of Generic Netlink interface definitions */
 
+/*
+ * per netns intit/exit func.
+ */
+int __net_init __ip_vs_control_init(struct net *net)
+{
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return -EPERM;
+
+	proc_net_fops_create(net, "ip_vs", 0, &ip_vs_info_fops);
+	proc_net_fops_create(net, "ip_vs_stats", 0, &ip_vs_stats_fops);
+	sysctl_header = register_net_sysctl_table(net, net_vs_ctl_path,
+						  vs_vars);
+	if (sysctl_header == NULL)
+		goto err_reg;
+	ip_vs_new_estimator(&ip_vs_stats);
+	return 0;
+
+err_reg:
+	return -ENOMEM;
+}
+
+static void __net_exit __ip_vs_control_cleanup(struct net *net)
+{
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return;
+
+	ip_vs_kill_estimator(&ip_vs_stats);
+	unregister_net_sysctl_table(sysctl_header);
+	proc_net_remove(net, "ip_vs_stats");
+	proc_net_remove(net, "ip_vs");
+}
+
+static struct pernet_operations ipvs_control_ops = {
+	.init = __ip_vs_control_init,
+	.exit = __ip_vs_control_cleanup,
+};
 
 int __init ip_vs_control_init(void)
 {
@@ -3439,12 +3475,9 @@ int __init ip_vs_control_init(void)
 		return ret;
 	}
 
-	proc_net_fops_create(&init_net, "ip_vs", 0, &ip_vs_info_fops);
-	proc_net_fops_create(&init_net, "ip_vs_stats",0, &ip_vs_stats_fops);
-
-	sysctl_header = register_sysctl_paths(net_vs_ctl_path, vs_vars);
-
-	ip_vs_new_estimator(&ip_vs_stats);
+	ret = register_pernet_subsys(&ipvs_control_ops);
+	if (ret)
+		return ret;
 
 	/* Hook the defense timer */
 	schedule_delayed_work(&defense_work, DEFENSE_TIMER_PERIOD);
@@ -3461,9 +3494,7 @@ void ip_vs_control_cleanup(void)
 	cancel_rearming_delayed_work(&defense_work);
 	cancel_work_sync(&defense_work.work);
 	ip_vs_kill_estimator(&ip_vs_stats);
-	unregister_sysctl_table(sysctl_header);
-	proc_net_remove(&init_net, "ip_vs_stats");
-	proc_net_remove(&init_net, "ip_vs");
+	unregister_pernet_subsys(&ipvs_control_ops);
 	ip_vs_genl_unregister();
 	nf_unregister_sockopt(&ip_vs_sockopts);
 	LeaveFunction(2);
diff --git a/net/netfilter/ipvs/ip_vs_est.c b/net/netfilter/ipvs/ip_vs_est.c
index ff28801..7417a0c 100644
--- a/net/netfilter/ipvs/ip_vs_est.c
+++ b/net/netfilter/ipvs/ip_vs_est.c
@@ -157,13 +157,31 @@ void ip_vs_zero_estimator(struct ip_vs_stats *stats)
 	est->outbps = 0;
 }
 
+static int __net_init __ip_vs_estimator_init(struct net *net)
+{
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return -EPERM;
+
+	return 0;
+}
+
+static struct pernet_operations ip_vs_app_ops = {
+	.init = __ip_vs_estimator_init,
+};
+
 int __init ip_vs_estimator_init(void)
 {
+	int rv;
+
+	rv = register_pernet_subsys(&ip_vs_app_ops);
+	if (rv < 0)
+		return rv;
 	mod_timer(&est_timer, jiffies + 2 * HZ);
-	return 0;
+	return rv;
 }
 
 void ip_vs_estimator_cleanup(void)
 {
 	del_timer_sync(&est_timer);
+	unregister_pernet_subsys(&ip_vs_app_ops);
 }
diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index 84aef65..0e762f3 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -399,15 +399,17 @@ static struct ip_vs_app ip_vs_ftp = {
 	.pkt_in =	ip_vs_ftp_in,
 };
 
-
 /*
- *	ip_vs_ftp initialization
+ *	per netns ip_vs_ftp initialization
  */
-static int __init ip_vs_ftp_init(void)
+static int __net_init __ip_vs_ftp_init(struct net *net)
 {
 	int i, ret;
 	struct ip_vs_app *app = &ip_vs_ftp;
 
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return -EPERM;
+
 	ret = register_ip_vs_app(app);
 	if (ret)
 		return ret;
@@ -427,14 +429,38 @@ static int __init ip_vs_ftp_init(void)
 
 	return ret;
 }
+/*
+ *	netns exit
+ */
+static void __ip_vs_ftp_exit(struct net *net)
+{
+	struct ip_vs_app *app = &ip_vs_ftp;
+
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return;
+
+	unregister_ip_vs_app(app);
+}
+
+static struct pernet_operations ip_vs_ftp_ops = {
+	.init = __ip_vs_ftp_init,
+	.exit = __ip_vs_ftp_exit,
+};
 
+int __init ip_vs_ftp_init(void)
+{
+	int rv;
+
+	rv = register_pernet_subsys(&ip_vs_ftp_ops);
+	return rv;
+}
 
 /*
  *	ip_vs_ftp finish.
  */
 static void __exit ip_vs_ftp_exit(void)
 {
-	unregister_ip_vs_app(&ip_vs_ftp);
+	unregister_pernet_subsys(&ip_vs_ftp_ops);
 }
 
 
diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index 9323f89..84278fb 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -543,23 +543,54 @@ static struct ip_vs_scheduler ip_vs_lblc_scheduler =
 	.schedule =		ip_vs_lblc_schedule,
 };
 
+/*
+ *  per netns init.
+ */
+static int __net_init __ip_vs_lblc_init(struct net *net)
+{
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return -EPERM;
+
+	sysctl_header = register_net_sysctl_table(net, net_vs_ctl_path,
+						  vs_vars_table);
+	if (!sysctl_header)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void __net_exit __ip_vs_lblc_exit(struct net *net)
+{
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return;
+
+	unregister_net_sysctl_table(sysctl_header);
+}
+
+static struct pernet_operations ip_vs_lblc_ops = {
+	.init = __ip_vs_lblc_init,
+	.exit = __ip_vs_lblc_exit,
+};
 
 static int __init ip_vs_lblc_init(void)
 {
 	int ret;
 
-	sysctl_header = register_sysctl_paths(net_vs_ctl_path, vs_vars_table);
+	ret = register_pernet_subsys(&ip_vs_lblc_ops);
+	if (ret)
+		return ret;
+
 	ret = register_ip_vs_scheduler(&ip_vs_lblc_scheduler);
 	if (ret)
-		unregister_sysctl_table(sysctl_header);
+		unregister_pernet_subsys(&ip_vs_lblc_ops);
 	return ret;
 }
 
 
 static void __exit ip_vs_lblc_cleanup(void)
 {
-	unregister_sysctl_table(sysctl_header);
 	unregister_ip_vs_scheduler(&ip_vs_lblc_scheduler);
+	unregister_pernet_subsys(&ip_vs_lblc_ops);
 }
 
 
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index dbeed8e..7c7396a 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -744,23 +744,53 @@ static struct ip_vs_scheduler ip_vs_lblcr_scheduler =
 	.schedule =		ip_vs_lblcr_schedule,
 };
 
+/*
+ *  per netns init.
+ */
+static int __net_init __ip_vs_lblcr_init(struct net *net)
+{
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return -EPERM;
+
+	sysctl_header = register_net_sysctl_table(net, net_vs_ctl_path,
+						  vs_vars_table);
+	if (!sysctl_header)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void __net_exit __ip_vs_lblcr_exit(struct net *net)
+{
+	if (!net_eq(net, &init_net))	/* netns not enabled yet */
+		return;
+
+	unregister_net_sysctl_table(sysctl_header);
+}
+
+static struct pernet_operations ip_vs_lblcr_ops = {
+	.init = __ip_vs_lblcr_init,
+	.exit = __ip_vs_lblcr_exit,
+};
 
 static int __init ip_vs_lblcr_init(void)
 {
 	int ret;
 
-	sysctl_header = register_sysctl_paths(net_vs_ctl_path, vs_vars_table);
+	ret = register_pernet_subsys(&ip_vs_lblcr_ops);
+	if (ret)
+		return ret;
+
 	ret = register_ip_vs_scheduler(&ip_vs_lblcr_scheduler);
 	if (ret)
-		unregister_sysctl_table(sysctl_header);
+		unregister_pernet_subsys(&ip_vs_lblcr_ops);
 	return ret;
 }
 
-
 static void __exit ip_vs_lblcr_cleanup(void)
 {
-	unregister_sysctl_table(sysctl_header);
 	unregister_ip_vs_scheduler(&ip_vs_lblcr_scheduler);
+	unregister_pernet_subsys(&ip_vs_lblcr_ops);
 }
 
 
diff --git a/net/netfilter/ipvs/ip_vs_proto.c b/net/netfilter/ipvs/ip_vs_proto.c
index c539983..4539294 100644
--- a/net/netfilter/ipvs/ip_vs_proto.c
+++ b/net/netfilter/ipvs/ip_vs_proto.c
@@ -236,6 +236,23 @@ ip_vs_tcpudp_debug_packet(int af, struct ip_vs_protocol *pp,
 		ip_vs_tcpudp_debug_packet_v4(pp, skb, offset, msg);
 }
 
+/*
+ * per network name-space init
+ */
+static int __net_init __ip_vs_protocol_init(struct net *net)
+{
+	return 0;
+}
+
+static void __net_exit __ip_vs_protocol_cleanup(struct net *net)
+{
+	/* empty */
+}
+
+static struct pernet_operations ipvs_proto_ops = {
+	.init = __ip_vs_protocol_init,
+	.exit = __ip_vs_protocol_cleanup,
+};
 
 int __init ip_vs_protocol_init(void)
 {
@@ -265,6 +282,7 @@ int __init ip_vs_protocol_init(void)
 	REGISTER_PROTOCOL(&ip_vs_protocol_esp);
 #endif
 	pr_info("Registered protocols (%s)\n", &protocols[2]);
+	return register_pernet_subsys(&ipvs_proto_ops);
 
 	return 0;
 }
@@ -275,6 +293,7 @@ void ip_vs_protocol_cleanup(void)
 	struct ip_vs_protocol *pp;
 	int i;
 
+	unregister_pernet_subsys(&ipvs_proto_ops);
 	/* unregister all the ipvs protocols */
 	for (i = 0; i < IP_VS_PROTO_TAB_SIZE; i++) {
 		while ((pp = ip_vs_proto_table[i]) != NULL)
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index c1c167a..3668739 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1639,3 +1639,30 @@ int stop_sync_thread(int state)
 
 	return 0;
 }
+
+/*
+ * Initialize data struct for each netns
+ */
+static int __net_init __ip_vs_sync_init(struct net *net)
+{
+	return 0;
+}
+
+static void __ip_vs_sync_cleanup(struct net *net)
+{
+}
+static struct pernet_operations ipvs_sync_ops = {
+	.init = __ip_vs_sync_init,
+	.exit = __ip_vs_sync_cleanup,
+};
+
+
+int __init ip_vs_sync_init(void)
+{
+	return register_pernet_subsys(&ipvs_sync_ops);
+}
+
+void __exit ip_vs_sync_cleanup(void)
+{
+	unregister_pernet_subsys(&ipvs_sync_ops);
+}
-- 
1.7.2.3


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox