Netdev List


* Re: [PATCH 2/3] cxgb4: function namespace cleanup
From: Dimitris Michailidis @ 2010-10-15 23:34 UTC
  To: Stephen Hemminger; +Cc: Divy Le Ray, David S. Miller, Casey Leedom, netdev
In-Reply-To: <20101015224523.633775810@vyatta.com>

Stephen Hemminger wrote:
> Make functions only used in one file local.
> Remove lots of dead code. Most surprising is the function
> cxgb4_iscsi_init which is defined but never called!
> Compile tested only

You are changing to static, or removing entirely, functions exported for use
by the iSCSI and RDMA drivers.  The function you mention, cxgb4_iscsi_init,
is used in the cxgb4i driver in the scsi tree and this change breaks it.

> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> 
> ---
>  drivers/net/cxgb4/cxgb4.h      |   17 -
>  drivers/net/cxgb4/cxgb4_main.c |   91 ---------
>  drivers/net/cxgb4/cxgb4_uld.h  |   12 -
>  drivers/net/cxgb4/l2t.c        |   34 ---
>  drivers/net/cxgb4/l2t.h        |    3 
>  drivers/net/cxgb4/sge.c        |    5 
>  drivers/net/cxgb4/t4_hw.c      |  394 -----------------------------------------
>  7 files changed, 5 insertions(+), 551 deletions(-)
> 
> --- a/drivers/net/cxgb4/cxgb4_main.c	2010-10-15 11:31:38.980766681 -0700
> +++ b/drivers/net/cxgb4/cxgb4_main.c	2010-10-15 11:40:13.976943486 -0700
> @@ -880,7 +880,7 @@ void *t4_alloc_mem(size_t size)
>  /*
>   * Free memory allocated through alloc_mem().
>   */
> -void t4_free_mem(void *addr)
> +static void t4_free_mem(void *addr)
>  {
>  	if (is_vmalloc_addr(addr))
>  		vfree(addr);
> @@ -2206,8 +2206,8 @@ static void mk_tid_release(struct sk_buf
>   * Queue a TID release request and if necessary schedule a work queue to
>   * process it.
>   */
> -void cxgb4_queue_tid_release(struct tid_info *t, unsigned int chan,
> -			     unsigned int tid)
> +static void cxgb4_queue_tid_release(struct tid_info *t, unsigned int chan,
> +				    unsigned int tid)
>  {
>  	void **p = &t->tid_tab[tid];
>  	struct adapter *adap = container_of(t, struct adapter, tids);
> @@ -2222,7 +2222,6 @@ void cxgb4_queue_tid_release(struct tid_
>  	}
>  	spin_unlock_bh(&adap->tid_release_lock);
>  }
> -EXPORT_SYMBOL(cxgb4_queue_tid_release);
>  
>  /*
>   * Process the list of pending TID release requests.
> @@ -2355,48 +2354,6 @@ int cxgb4_create_server(const struct net
>  EXPORT_SYMBOL(cxgb4_create_server);
>  
>  /**
> - *	cxgb4_create_server6 - create an IPv6 server
> - *	@dev: the device
> - *	@stid: the server TID
> - *	@sip: local IPv6 address to bind server to
> - *	@sport: the server's TCP port
> - *	@queue: queue to direct messages from this server to
> - *
> - *	Create an IPv6 server for the given port and address.
> - *	Returns <0 on error and one of the %NET_XMIT_* values on success.
> - */
> -int cxgb4_create_server6(const struct net_device *dev, unsigned int stid,
> -			 const struct in6_addr *sip, __be16 sport,
> -			 unsigned int queue)
> -{
> -	unsigned int chan;
> -	struct sk_buff *skb;
> -	struct adapter *adap;
> -	struct cpl_pass_open_req6 *req;
> -
> -	skb = alloc_skb(sizeof(*req), GFP_KERNEL);
> -	if (!skb)
> -		return -ENOMEM;
> -
> -	adap = netdev2adap(dev);
> -	req = (struct cpl_pass_open_req6 *)__skb_put(skb, sizeof(*req));
> -	INIT_TP_WR(req, 0);
> -	OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_PASS_OPEN_REQ6, stid));
> -	req->local_port = sport;
> -	req->peer_port = htons(0);
> -	req->local_ip_hi = *(__be64 *)(sip->s6_addr);
> -	req->local_ip_lo = *(__be64 *)(sip->s6_addr + 8);
> -	req->peer_ip_hi = cpu_to_be64(0);
> -	req->peer_ip_lo = cpu_to_be64(0);
> -	chan = rxq_to_chan(&adap->sge, queue);
> -	req->opt0 = cpu_to_be64(TX_CHAN(chan));
> -	req->opt1 = cpu_to_be64(CONN_POLICY_ASK |
> -				SYN_RSS_ENABLE | SYN_RSS_QUEUE(queue));
> -	return t4_mgmt_tx(adap, skb);
> -}
> -EXPORT_SYMBOL(cxgb4_create_server6);
> -
> -/**
>   *	cxgb4_best_mtu - find the entry in the MTU table closest to an MTU
>   *	@mtus: the HW MTU table
>   *	@mtu: the target MTU
> @@ -2455,48 +2412,6 @@ unsigned int cxgb4_port_idx(const struct
>  }
>  EXPORT_SYMBOL(cxgb4_port_idx);
>  
> -/**
> - *	cxgb4_netdev_by_hwid - return the net device of a HW port
> - *	@pdev: identifies the adapter
> - *	@id: the HW port id
> - *
> - *	Return the net device associated with the interface with the given HW
> - *	id.
> - */
> -struct net_device *cxgb4_netdev_by_hwid(struct pci_dev *pdev, unsigned int id)
> -{
> -	const struct adapter *adap = pci_get_drvdata(pdev);
> -
> -	if (!adap || id >= NCHAN)
> -		return NULL;
> -	id = adap->chan_map[id];
> -	return id < MAX_NPORTS ? adap->port[id] : NULL;
> -}
> -EXPORT_SYMBOL(cxgb4_netdev_by_hwid);
> -
> -void cxgb4_get_tcp_stats(struct pci_dev *pdev, struct tp_tcp_stats *v4,
> -			 struct tp_tcp_stats *v6)
> -{
> -	struct adapter *adap = pci_get_drvdata(pdev);
> -
> -	spin_lock(&adap->stats_lock);
> -	t4_tp_get_tcp_stats(adap, v4, v6);
> -	spin_unlock(&adap->stats_lock);
> -}
> -EXPORT_SYMBOL(cxgb4_get_tcp_stats);
> -
> -void cxgb4_iscsi_init(struct net_device *dev, unsigned int tag_mask,
> -		      const unsigned int *pgsz_order)
> -{
> -	struct adapter *adap = netdev2adap(dev);
> -
> -	t4_write_reg(adap, ULP_RX_ISCSI_TAGMASK, tag_mask);
> -	t4_write_reg(adap, ULP_RX_ISCSI_PSZ, HPZ0(pgsz_order[0]) |
> -		     HPZ1(pgsz_order[1]) | HPZ2(pgsz_order[2]) |
> -		     HPZ3(pgsz_order[3]));
> -}
> -EXPORT_SYMBOL(cxgb4_iscsi_init);
> -
>  static struct pci_driver cxgb4_driver;
>  
>  static void check_neigh_update(struct neighbour *neigh)
> --- a/drivers/net/cxgb4/cxgb4_uld.h	2010-10-15 11:31:38.964766089 -0700
> +++ b/drivers/net/cxgb4/cxgb4_uld.h	2010-10-15 11:39:06.777878706 -0700
> @@ -139,16 +139,11 @@ int cxgb4_alloc_stid(struct tid_info *t,
>  void cxgb4_free_atid(struct tid_info *t, unsigned int atid);
>  void cxgb4_free_stid(struct tid_info *t, unsigned int stid, int family);
>  void cxgb4_remove_tid(struct tid_info *t, unsigned int qid, unsigned int tid);
> -void cxgb4_queue_tid_release(struct tid_info *t, unsigned int chan,
> -			     unsigned int tid);
>  
>  struct in6_addr;
>  
>  int cxgb4_create_server(const struct net_device *dev, unsigned int stid,
>  			__be32 sip, __be16 sport, unsigned int queue);
> -int cxgb4_create_server6(const struct net_device *dev, unsigned int stid,
> -			 const struct in6_addr *sip, __be16 sport,
> -			 unsigned int queue);
>  
>  static inline void set_wr_txq(struct sk_buff *skb, int prio, int queue)
>  {
> @@ -233,13 +228,6 @@ int cxgb4_ofld_send(struct net_device *d
>  unsigned int cxgb4_port_chan(const struct net_device *dev);
>  unsigned int cxgb4_port_viid(const struct net_device *dev);
>  unsigned int cxgb4_port_idx(const struct net_device *dev);
> -struct net_device *cxgb4_netdev_by_hwid(struct pci_dev *pdev, unsigned int id);
>  unsigned int cxgb4_best_mtu(const unsigned short *mtus, unsigned short mtu,
>  			    unsigned int *idx);
> -void cxgb4_get_tcp_stats(struct pci_dev *pdev, struct tp_tcp_stats *v4,
> -			 struct tp_tcp_stats *v6);
> -void cxgb4_iscsi_init(struct net_device *dev, unsigned int tag_mask,
> -		      const unsigned int *pgsz_order);
> -struct sk_buff *cxgb4_pktgl_to_skb(const struct pkt_gl *gl,
> -				   unsigned int skb_len, unsigned int pull_len);
>  #endif  /* !__CXGB4_OFLD_H */
> --- a/drivers/net/cxgb4/cxgb4.h	2010-10-15 11:40:02.748431897 -0700
> +++ b/drivers/net/cxgb4/cxgb4.h	2010-10-15 11:46:18.685455254 -0700
> @@ -592,7 +592,6 @@ void t4_os_portmod_changed(const struct
>  void t4_os_link_changed(struct adapter *adap, int port_id, int link_stat);
>  
>  void *t4_alloc_mem(size_t size);
> -void t4_free_mem(void *addr);
>  
>  void t4_free_sge_resources(struct adapter *adap);
>  irq_handler_t t4_intr_handler(struct adapter *adap);
> @@ -651,7 +650,6 @@ static inline int t4_wr_mbox_ns(struct a
>  
>  void t4_intr_enable(struct adapter *adapter);
>  void t4_intr_disable(struct adapter *adapter);
> -void t4_intr_clear(struct adapter *adapter);
>  int t4_slow_intr_handler(struct adapter *adapter);
>  
>  int t4_wait_dev_ready(struct adapter *adap);
> @@ -664,26 +662,16 @@ int t4_check_fw_version(struct adapter *
>  int t4_prep_adapter(struct adapter *adapter);
>  int t4_port_init(struct adapter *adap, int mbox, int pf, int vf);
>  void t4_fatal_err(struct adapter *adapter);
> -int t4_set_trace_filter(struct adapter *adapter, const struct trace_params *tp,
> -			int filter_index, int enable);
> -void t4_get_trace_filter(struct adapter *adapter, struct trace_params *tp,
> -			 int filter_index, int *enabled);
>  int t4_config_rss_range(struct adapter *adapter, int mbox, unsigned int viid,
>  			int start, int n, const u16 *rspq, unsigned int nrspq);
>  int t4_config_glbl_rss(struct adapter *adapter, int mbox, unsigned int mode,
>  		       unsigned int flags);
> -int t4_read_rss(struct adapter *adapter, u16 *entries);
>  int t4_mc_read(struct adapter *adap, u32 addr, __be32 *data, u64 *parity);
>  int t4_edc_read(struct adapter *adap, int idx, u32 addr, __be32 *data,
>  		u64 *parity);
>  
>  void t4_get_port_stats(struct adapter *adap, int idx, struct port_stats *p);
> -void t4_get_lb_stats(struct adapter *adap, int idx, struct lb_port_stats *p);
> -
>  void t4_read_mtu_tbl(struct adapter *adap, u16 *mtus, u8 *mtu_log);
> -void t4_tp_get_err_stats(struct adapter *adap, struct tp_err_stats *st);
> -void t4_tp_get_tcp_stats(struct adapter *adap, struct tp_tcp_stats *v4,
> -			 struct tp_tcp_stats *v6);
>  void t4_load_mtus(struct adapter *adap, const unsigned short *mtus,
>  		  const unsigned short *alpha, const unsigned short *beta);
>  
> @@ -711,8 +699,6 @@ int t4_cfg_pfvf(struct adapter *adap, un
>  int t4_alloc_vi(struct adapter *adap, unsigned int mbox, unsigned int port,
>  		unsigned int pf, unsigned int vf, unsigned int nmac, u8 *mac,
>  		unsigned int *rss_size);
> -int t4_free_vi(struct adapter *adap, unsigned int mbox, unsigned int pf,
> -	       unsigned int vf, unsigned int viid);
>  int t4_set_rxmode(struct adapter *adap, unsigned int mbox, unsigned int viid,
>  		int mtu, int promisc, int all_multi, int bcast, int vlanex,
>  		bool sleep_ok);
> @@ -731,9 +717,6 @@ int t4_mdio_rd(struct adapter *adap, uns
>  	       unsigned int mmd, unsigned int reg, u16 *valp);
>  int t4_mdio_wr(struct adapter *adap, unsigned int mbox, unsigned int phy_addr,
>  	       unsigned int mmd, unsigned int reg, u16 val);
> -int t4_iq_start_stop(struct adapter *adap, unsigned int mbox, bool start,
> -		     unsigned int pf, unsigned int vf, unsigned int iqid,
> -		     unsigned int fl0id, unsigned int fl1id);
>  int t4_iq_free(struct adapter *adap, unsigned int mbox, unsigned int pf,
>  	       unsigned int vf, unsigned int iqtype, unsigned int iqid,
>  	       unsigned int fl0id, unsigned int fl1id);
> --- a/drivers/net/cxgb4/l2t.c	2010-10-15 11:33:43.549387836 -0700
> +++ b/drivers/net/cxgb4/l2t.c	2010-10-15 11:33:55.949849920 -0700
> @@ -481,40 +481,6 @@ void t4_l2t_update(struct adapter *adap,
>  		handle_failed_resolution(adap, arpq);
>  }
>  
> -/*
> - * Allocate an L2T entry for use by a switching rule.  Such entries need to be
> - * explicitly freed and while busy they are not on any hash chain, so normal
> - * address resolution updates do not see them.
> - */
> -struct l2t_entry *t4_l2t_alloc_switching(struct l2t_data *d)
> -{
> -	struct l2t_entry *e;
> -
> -	write_lock_bh(&d->lock);
> -	e = alloc_l2e(d);
> -	if (e) {
> -		spin_lock(&e->lock);          /* avoid race with t4_l2t_free */
> -		e->state = L2T_STATE_SWITCHING;
> -		atomic_set(&e->refcnt, 1);
> -		spin_unlock(&e->lock);
> -	}
> -	write_unlock_bh(&d->lock);
> -	return e;
> -}
> -
> -/*
> - * Sets/updates the contents of a switching L2T entry that has been allocated
> - * with an earlier call to @t4_l2t_alloc_switching.
> - */
> -int t4_l2t_set_switching(struct adapter *adap, struct l2t_entry *e, u16 vlan,
> -			 u8 port, u8 *eth_addr)
> -{
> -	e->vlan = vlan;
> -	e->lport = port;
> -	memcpy(e->dmac, eth_addr, ETH_ALEN);
> -	return write_l2e(adap, e, 0);
> -}
> -
>  struct l2t_data *t4_init_l2t(void)
>  {
>  	int i;
> --- a/drivers/net/cxgb4/l2t.h	2010-10-15 11:33:00.175774482 -0700
> +++ b/drivers/net/cxgb4/l2t.h	2010-10-15 11:33:13.728278102 -0700
> @@ -100,9 +100,6 @@ struct l2t_entry *cxgb4_l2t_get(struct l
>  				unsigned int priority);
>  
>  void t4_l2t_update(struct adapter *adap, struct neighbour *neigh);
> -struct l2t_entry *t4_l2t_alloc_switching(struct l2t_data *d);
> -int t4_l2t_set_switching(struct adapter *adap, struct l2t_entry *e, u16 vlan,
> -			 u8 port, u8 *eth_addr);
>  struct l2t_data *t4_init_l2t(void);
>  void do_l2t_write_rpl(struct adapter *p, const struct cpl_l2t_write_rpl *rpl);
>  
> --- a/drivers/net/cxgb4/sge.c	2010-10-15 11:34:06.486242828 -0700
> +++ b/drivers/net/cxgb4/sge.c	2010-10-15 11:34:41.967567863 -0700
> @@ -1434,8 +1434,8 @@ static inline void copy_frags(struct skb
>   *	Builds an sk_buff from the given packet gather list.  Returns the
>   *	sk_buff or %NULL if sk_buff allocation failed.
>   */
> -struct sk_buff *cxgb4_pktgl_to_skb(const struct pkt_gl *gl,
> -				   unsigned int skb_len, unsigned int pull_len)
> +static struct sk_buff *cxgb4_pktgl_to_skb(const struct pkt_gl *gl,
> +					  unsigned int skb_len, unsigned int pull_len)
>  {
>  	struct sk_buff *skb;
>  
> @@ -1464,7 +1464,6 @@ struct sk_buff *cxgb4_pktgl_to_skb(const
>  	}
>  out:	return skb;
>  }
> -EXPORT_SYMBOL(cxgb4_pktgl_to_skb);
>  
>  /**
>   *	t4_pktgl_free - free a packet gather list
> --- a/drivers/net/cxgb4/t4_hw.c	2010-10-15 11:35:08.640565879 -0700
> +++ b/drivers/net/cxgb4/t4_hw.c	2010-10-15 11:46:55.167096446 -0700
> @@ -97,53 +97,6 @@ void t4_set_reg_field(struct adapter *ad
>  	(void) t4_read_reg(adapter, addr);      /* flush */
>  }
>  
> -/**
> - *	t4_read_indirect - read indirectly addressed registers
> - *	@adap: the adapter
> - *	@addr_reg: register holding the indirect address
> - *	@data_reg: register holding the value of the indirect register
> - *	@vals: where the read register values are stored
> - *	@nregs: how many indirect registers to read
> - *	@start_idx: index of first indirect register to read
> - *
> - *	Reads registers that are accessed indirectly through an address/data
> - *	register pair.
> - */
> -static void t4_read_indirect(struct adapter *adap, unsigned int addr_reg,
> -			     unsigned int data_reg, u32 *vals,
> -			     unsigned int nregs, unsigned int start_idx)
> -{
> -	while (nregs--) {
> -		t4_write_reg(adap, addr_reg, start_idx);
> -		*vals++ = t4_read_reg(adap, data_reg);
> -		start_idx++;
> -	}
> -}
> -
> -#if 0
> -/**
> - *	t4_write_indirect - write indirectly addressed registers
> - *	@adap: the adapter
> - *	@addr_reg: register holding the indirect addresses
> - *	@data_reg: register holding the value for the indirect registers
> - *	@vals: values to write
> - *	@nregs: how many indirect registers to write
> - *	@start_idx: address of first indirect register to write
> - *
> - *	Writes a sequential block of registers that are accessed indirectly
> - *	through an address/data register pair.
> - */
> -static void t4_write_indirect(struct adapter *adap, unsigned int addr_reg,
> -			      unsigned int data_reg, const u32 *vals,
> -			      unsigned int nregs, unsigned int start_idx)
> -{
> -	while (nregs--) {
> -		t4_write_reg(adap, addr_reg, start_idx++);
> -		t4_write_reg(adap, data_reg, *vals++);
> -	}
> -}
> -#endif
> -
>  /*
>   * Get the reply to a mailbox command and store it in @rpl in big-endian order.
>   */
> @@ -1560,44 +1513,6 @@ void t4_intr_disable(struct adapter *ada
>  }
>  
>  /**
> - *	t4_intr_clear - clear all interrupts
> - *	@adapter: the adapter whose interrupts should be cleared
> - *
> - *	Clears all interrupts.  The caller must be a PCI function managing
> - *	global interrupts.
> - */
> -void t4_intr_clear(struct adapter *adapter)
> -{
> -	static const unsigned int cause_reg[] = {
> -		SGE_INT_CAUSE1, SGE_INT_CAUSE2, SGE_INT_CAUSE3,
> -		PCIE_CORE_UTL_SYSTEM_BUS_AGENT_STATUS,
> -		PCIE_CORE_UTL_PCI_EXPRESS_PORT_STATUS,
> -		PCIE_NONFAT_ERR, PCIE_INT_CAUSE,
> -		MC_INT_CAUSE,
> -		MA_INT_WRAP_STATUS, MA_PARITY_ERROR_STATUS, MA_INT_CAUSE,
> -		EDC_INT_CAUSE, EDC_REG(EDC_INT_CAUSE, 1),
> -		CIM_HOST_INT_CAUSE, CIM_HOST_UPACC_INT_CAUSE,
> -		MYPF_REG(CIM_PF_HOST_INT_CAUSE),
> -		TP_INT_CAUSE,
> -		ULP_RX_INT_CAUSE, ULP_TX_INT_CAUSE,
> -		PM_RX_INT_CAUSE, PM_TX_INT_CAUSE,
> -		MPS_RX_PERR_INT_CAUSE,
> -		CPL_INTR_CAUSE,
> -		MYPF_REG(PL_PF_INT_CAUSE),
> -		PL_PL_INT_CAUSE,
> -		LE_DB_INT_CAUSE,
> -	};
> -
> -	unsigned int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(cause_reg); ++i)
> -		t4_write_reg(adapter, cause_reg[i], 0xffffffff);
> -
> -	t4_write_reg(adapter, PL_INT_CAUSE, GLBL_INTR_MASK);
> -	(void) t4_read_reg(adapter, PL_INT_CAUSE);          /* flush */
> -}
> -
> -/**
>   *	hash_mac_addr - return the hash value of a MAC address
>   *	@addr: the 48-bit Ethernet MAC address
>   *
> @@ -1709,98 +1624,6 @@ int t4_config_glbl_rss(struct adapter *a
>  	return t4_wr_mbox(adapter, mbox, &c, sizeof(c), NULL);
>  }
>  
> -/* Read an RSS table row */
> -static int rd_rss_row(struct adapter *adap, int row, u32 *val)
> -{
> -	t4_write_reg(adap, TP_RSS_LKP_TABLE, 0xfff00000 | row);
> -	return t4_wait_op_done_val(adap, TP_RSS_LKP_TABLE, LKPTBLROWVLD, 1,
> -				   5, 0, val);
> -}
> -
> -/**
> - *	t4_read_rss - read the contents of the RSS mapping table
> - *	@adapter: the adapter
> - *	@map: holds the contents of the RSS mapping table
> - *
> - *	Reads the contents of the RSS hash->queue mapping table.
> - */
> -int t4_read_rss(struct adapter *adapter, u16 *map)
> -{
> -	u32 val;
> -	int i, ret;
> -
> -	for (i = 0; i < RSS_NENTRIES / 2; ++i) {
> -		ret = rd_rss_row(adapter, i, &val);
> -		if (ret)
> -			return ret;
> -		*map++ = LKPTBLQUEUE0_GET(val);
> -		*map++ = LKPTBLQUEUE1_GET(val);
> -	}
> -	return 0;
> -}
> -
> -/**
> - *	t4_tp_get_tcp_stats - read TP's TCP MIB counters
> - *	@adap: the adapter
> - *	@v4: holds the TCP/IP counter values
> - *	@v6: holds the TCP/IPv6 counter values
> - *
> - *	Returns the values of TP's TCP/IP and TCP/IPv6 MIB counters.
> - *	Either @v4 or @v6 may be %NULL to skip the corresponding stats.
> - */
> -void t4_tp_get_tcp_stats(struct adapter *adap, struct tp_tcp_stats *v4,
> -			 struct tp_tcp_stats *v6)
> -{
> -	u32 val[TP_MIB_TCP_RXT_SEG_LO - TP_MIB_TCP_OUT_RST + 1];
> -
> -#define STAT_IDX(x) ((TP_MIB_TCP_##x) - TP_MIB_TCP_OUT_RST)
> -#define STAT(x)     val[STAT_IDX(x)]
> -#define STAT64(x)   (((u64)STAT(x##_HI) << 32) | STAT(x##_LO))
> -
> -	if (v4) {
> -		t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, val,
> -				 ARRAY_SIZE(val), TP_MIB_TCP_OUT_RST);
> -		v4->tcpOutRsts = STAT(OUT_RST);
> -		v4->tcpInSegs  = STAT64(IN_SEG);
> -		v4->tcpOutSegs = STAT64(OUT_SEG);
> -		v4->tcpRetransSegs = STAT64(RXT_SEG);
> -	}
> -	if (v6) {
> -		t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, val,
> -				 ARRAY_SIZE(val), TP_MIB_TCP_V6OUT_RST);
> -		v6->tcpOutRsts = STAT(OUT_RST);
> -		v6->tcpInSegs  = STAT64(IN_SEG);
> -		v6->tcpOutSegs = STAT64(OUT_SEG);
> -		v6->tcpRetransSegs = STAT64(RXT_SEG);
> -	}
> -#undef STAT64
> -#undef STAT
> -#undef STAT_IDX
> -}
> -
> -/**
> - *	t4_tp_get_err_stats - read TP's error MIB counters
> - *	@adap: the adapter
> - *	@st: holds the counter values
> - *
> - *	Returns the values of TP's error counters.
> - */
> -void t4_tp_get_err_stats(struct adapter *adap, struct tp_err_stats *st)
> -{
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->macInErrs,
> -			 12, TP_MIB_MAC_IN_ERR_0);
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->tnlCongDrops,
> -			 8, TP_MIB_TNL_CNG_DROP_0);
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->tnlTxDrops,
> -			 4, TP_MIB_TNL_DROP_0);
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->ofldVlanDrops,
> -			 4, TP_MIB_OFD_VLN_DROP_0);
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->tcp6InErrs,
> -			 4, TP_MIB_TCP_V6IN_ERR_0);
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, &st->ofldNoNeigh,
> -			 2, TP_MIB_OFD_ARP_DROP);
> -}
> -
>  /**
>   *	t4_read_mtu_tbl - returns the values in the HW path MTU table
>   *	@adap: the adapter
> @@ -1916,122 +1739,6 @@ void t4_load_mtus(struct adapter *adap,
>  }
>  
>  /**
> - *	t4_set_trace_filter - configure one of the tracing filters
> - *	@adap: the adapter
> - *	@tp: the desired trace filter parameters
> - *	@idx: which filter to configure
> - *	@enable: whether to enable or disable the filter
> - *
> - *	Configures one of the tracing filters available in HW.  If @enable is
> - *	%0 @tp is not examined and may be %NULL.
> - */
> -int t4_set_trace_filter(struct adapter *adap, const struct trace_params *tp,
> -			int idx, int enable)
> -{
> -	int i, ofst = idx * 4;
> -	u32 data_reg, mask_reg, cfg;
> -	u32 multitrc = TRCMULTIFILTER;
> -
> -	if (!enable) {
> -		t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst, 0);
> -		goto out;
> -	}
> -
> -	if (tp->port > 11 || tp->invert > 1 || tp->skip_len > 0x1f ||
> -	    tp->skip_ofst > 0x1f || tp->min_len > 0x1ff ||
> -	    tp->snap_len > 9600 || (idx && tp->snap_len > 256))
> -		return -EINVAL;
> -
> -	if (tp->snap_len > 256) {            /* must be tracer 0 */
> -		if ((t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + 4) |
> -		     t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + 8) |
> -		     t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + 12)) & TFEN)
> -			return -EINVAL;  /* other tracers are enabled */
> -		multitrc = 0;
> -	} else if (idx) {
> -		i = t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_B);
> -		if (TFCAPTUREMAX_GET(i) > 256 &&
> -		    (t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A) & TFEN))
> -			return -EINVAL;
> -	}
> -
> -	/* stop the tracer we'll be changing */
> -	t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst, 0);
> -
> -	/* disable tracing globally if running in the wrong single/multi mode */
> -	cfg = t4_read_reg(adap, MPS_TRC_CFG);
> -	if ((cfg & TRCEN) && multitrc != (cfg & TRCMULTIFILTER)) {
> -		t4_write_reg(adap, MPS_TRC_CFG, cfg ^ TRCEN);
> -		t4_read_reg(adap, MPS_TRC_CFG);                  /* flush */
> -		msleep(1);
> -		if (!(t4_read_reg(adap, MPS_TRC_CFG) & TRCFIFOEMPTY))
> -			return -ETIMEDOUT;
> -	}
> -	/*
> -	 * At this point either the tracing is enabled and in the right mode or
> -	 * disabled.
> -	 */
> -
> -	idx *= (MPS_TRC_FILTER1_MATCH - MPS_TRC_FILTER0_MATCH);
> -	data_reg = MPS_TRC_FILTER0_MATCH + idx;
> -	mask_reg = MPS_TRC_FILTER0_DONT_CARE + idx;
> -
> -	for (i = 0; i < TRACE_LEN / 4; i++, data_reg += 4, mask_reg += 4) {
> -		t4_write_reg(adap, data_reg, tp->data[i]);
> -		t4_write_reg(adap, mask_reg, ~tp->mask[i]);
> -	}
> -	t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_B + ofst,
> -		     TFCAPTUREMAX(tp->snap_len) |
> -		     TFMINPKTSIZE(tp->min_len));
> -	t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst,
> -		     TFOFFSET(tp->skip_ofst) | TFLENGTH(tp->skip_len) |
> -		     TFPORT(tp->port) | TFEN |
> -		     (tp->invert ? TFINVERTMATCH : 0));
> -
> -	cfg &= ~TRCMULTIFILTER;
> -	t4_write_reg(adap, MPS_TRC_CFG, cfg | TRCEN | multitrc);
> -out:	t4_read_reg(adap, MPS_TRC_CFG);  /* flush */
> -	return 0;
> -}
> -
> -/**
> - *	t4_get_trace_filter - query one of the tracing filters
> - *	@adap: the adapter
> - *	@tp: the current trace filter parameters
> - *	@idx: which trace filter to query
> - *	@enabled: non-zero if the filter is enabled
> - *
> - *	Returns the current settings of one of the HW tracing filters.
> - */
> -void t4_get_trace_filter(struct adapter *adap, struct trace_params *tp, int idx,
> -			 int *enabled)
> -{
> -	u32 ctla, ctlb;
> -	int i, ofst = idx * 4;
> -	u32 data_reg, mask_reg;
> -
> -	ctla = t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst);
> -	ctlb = t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_B + ofst);
> -
> -	*enabled = !!(ctla & TFEN);
> -	tp->snap_len = TFCAPTUREMAX_GET(ctlb);
> -	tp->min_len = TFMINPKTSIZE_GET(ctlb);
> -	tp->skip_ofst = TFOFFSET_GET(ctla);
> -	tp->skip_len = TFLENGTH_GET(ctla);
> -	tp->invert = !!(ctla & TFINVERTMATCH);
> -	tp->port = TFPORT_GET(ctla);
> -
> -	ofst = (MPS_TRC_FILTER1_MATCH - MPS_TRC_FILTER0_MATCH) * idx;
> -	data_reg = MPS_TRC_FILTER0_MATCH + ofst;
> -	mask_reg = MPS_TRC_FILTER0_DONT_CARE + ofst;
> -
> -	for (i = 0; i < TRACE_LEN / 4; i++, data_reg += 4, mask_reg += 4) {
> -		tp->mask[i] = ~t4_read_reg(adap, mask_reg);
> -		tp->data[i] = t4_read_reg(adap, data_reg) & tp->mask[i];
> -	}
> -}
> -
> -/**
>   *	get_mps_bg_map - return the buffer groups associated with a port
>   *	@adap: the adapter
>   *	@idx: the port index
> @@ -2133,52 +1840,6 @@ void t4_get_port_stats(struct adapter *a
>  }
>  
>  /**
> - *	t4_get_lb_stats - collect loopback port statistics
> - *	@adap: the adapter
> - *	@idx: the loopback port index
> - *	@p: the stats structure to fill
> - *
> - *	Return HW statistics for the given loopback port.
> - */
> -void t4_get_lb_stats(struct adapter *adap, int idx, struct lb_port_stats *p)
> -{
> -	u32 bgmap = get_mps_bg_map(adap, idx);
> -
> -#define GET_STAT(name) \
> -	t4_read_reg64(adap, PORT_REG(idx, MPS_PORT_STAT_LB_PORT_##name##_L))
> -#define GET_STAT_COM(name) t4_read_reg64(adap, MPS_STAT_##name##_L)
> -
> -	p->octets           = GET_STAT(BYTES);
> -	p->frames           = GET_STAT(FRAMES);
> -	p->bcast_frames     = GET_STAT(BCAST);
> -	p->mcast_frames     = GET_STAT(MCAST);
> -	p->ucast_frames     = GET_STAT(UCAST);
> -	p->error_frames     = GET_STAT(ERROR);
> -
> -	p->frames_64        = GET_STAT(64B);
> -	p->frames_65_127    = GET_STAT(65B_127B);
> -	p->frames_128_255   = GET_STAT(128B_255B);
> -	p->frames_256_511   = GET_STAT(256B_511B);
> -	p->frames_512_1023  = GET_STAT(512B_1023B);
> -	p->frames_1024_1518 = GET_STAT(1024B_1518B);
> -	p->frames_1519_max  = GET_STAT(1519B_MAX);
> -	p->drop             = t4_read_reg(adap, PORT_REG(idx,
> -					  MPS_PORT_STAT_LB_PORT_DROP_FRAMES));
> -
> -	p->ovflow0 = (bgmap & 1) ? GET_STAT_COM(RX_BG_0_LB_DROP_FRAME) : 0;
> -	p->ovflow1 = (bgmap & 2) ? GET_STAT_COM(RX_BG_1_LB_DROP_FRAME) : 0;
> -	p->ovflow2 = (bgmap & 4) ? GET_STAT_COM(RX_BG_2_LB_DROP_FRAME) : 0;
> -	p->ovflow3 = (bgmap & 8) ? GET_STAT_COM(RX_BG_3_LB_DROP_FRAME) : 0;
> -	p->trunc0 = (bgmap & 1) ? GET_STAT_COM(RX_BG_0_LB_TRUNC_FRAME) : 0;
> -	p->trunc1 = (bgmap & 2) ? GET_STAT_COM(RX_BG_1_LB_TRUNC_FRAME) : 0;
> -	p->trunc2 = (bgmap & 4) ? GET_STAT_COM(RX_BG_2_LB_TRUNC_FRAME) : 0;
> -	p->trunc3 = (bgmap & 8) ? GET_STAT_COM(RX_BG_3_LB_TRUNC_FRAME) : 0;
> -
> -#undef GET_STAT
> -#undef GET_STAT_COM
> -}
> -
> -/**
>   *	t4_wol_magic_enable - enable/disable magic packet WoL
>   *	@adap: the adapter
>   *	@port: the physical port index
> @@ -2584,30 +2245,6 @@ int t4_alloc_vi(struct adapter *adap, un
>  }
>  
>  /**
> - *	t4_free_vi - free a virtual interface
> - *	@adap: the adapter
> - *	@mbox: mailbox to use for the FW command
> - *	@pf: the PF owning the VI
> - *	@vf: the VF owning the VI
> - *	@viid: virtual interface identifiler
> - *
> - *	Free a previously allocated virtual interface.
> - */
> -int t4_free_vi(struct adapter *adap, unsigned int mbox, unsigned int pf,
> -	       unsigned int vf, unsigned int viid)
> -{
> -	struct fw_vi_cmd c;
> -
> -	memset(&c, 0, sizeof(c));
> -	c.op_to_vfn = htonl(FW_CMD_OP(FW_VI_CMD) | FW_CMD_REQUEST |
> -			    FW_CMD_EXEC | FW_VI_CMD_PFN(pf) |
> -			    FW_VI_CMD_VFN(vf));
> -	c.alloc_to_len16 = htonl(FW_VI_CMD_FREE | FW_LEN16(c));
> -	c.type_viid = htons(FW_VI_CMD_VIID(viid));
> -	return t4_wr_mbox(adap, mbox, &c, sizeof(c), &c);
> -}
> -
> -/**
>   *	t4_set_rxmode - set Rx properties of a virtual interface
>   *	@adap: the adapter
>   *	@mbox: mailbox to use for the FW command
> @@ -2832,37 +2469,6 @@ int t4_identify_port(struct adapter *ada
>  	return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL);
>  }
>  
> -/**
> - *	t4_iq_start_stop - enable/disable an ingress queue and its FLs
> - *	@adap: the adapter
> - *	@mbox: mailbox to use for the FW command
> - *	@start: %true to enable the queues, %false to disable them
> - *	@pf: the PF owning the queues
> - *	@vf: the VF owning the queues
> - *	@iqid: ingress queue id
> - *	@fl0id: FL0 queue id or 0xffff if no attached FL0
> - *	@fl1id: FL1 queue id or 0xffff if no attached FL1
> - *
> - *	Starts or stops an ingress queue and its associated FLs, if any.
> - */
> -int t4_iq_start_stop(struct adapter *adap, unsigned int mbox, bool start,
> -		     unsigned int pf, unsigned int vf, unsigned int iqid,
> -		     unsigned int fl0id, unsigned int fl1id)
> -{
> -	struct fw_iq_cmd c;
> -
> -	memset(&c, 0, sizeof(c));
> -	c.op_to_vfn = htonl(FW_CMD_OP(FW_IQ_CMD) | FW_CMD_REQUEST |
> -			    FW_CMD_EXEC | FW_IQ_CMD_PFN(pf) |
> -			    FW_IQ_CMD_VFN(vf));
> -	c.alloc_to_len16 = htonl(FW_IQ_CMD_IQSTART(start) |
> -				 FW_IQ_CMD_IQSTOP(!start) | FW_LEN16(c));
> -	c.iqid = htons(iqid);
> -	c.fl0id = htons(fl0id);
> -	c.fl1id = htons(fl1id);
> -	return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL);
> -}
> -
>  /**
>   *	t4_iq_free - free an ingress queue and its FLs
>   *	@adap: the adapter
> 
> 



* Re: [PATCH] bonding: various fixes for bonding, netpoll & netconsole (v2)
From: Flavio Leitner @ 2010-10-15 23:41 UTC
  To: nhorman; +Cc: netdev, bonding-devel, fubar, davem, andy, amwang
In-Reply-To: <1286973334-4339-1-git-send-email-nhorman@tuxdriver.com>

On Wed, Oct 13, 2010 at 08:35:29AM -0400, nhorman@tuxdriver.com wrote:
> Version 2, taking the following changes into account:
> 
> 1) Moved tx blocking/checking macros to netpoll.h as suggested by amwang
> 
> 2) Added tx blocking macro calls to sysfs paths, as they can deadlock in the
> same way that the link monitoring paths can.
> 
> Summary: 
> A while ago we tried to enable netpoll on the bonding driver to enable
> netconsole.  That worked well in a steady state, but deadlocked frequently in
> failover conditions due to some recursive lock-taking (as well as a few other
> problems).  I've gone through the driver, netconsole and netpoll code, fixed up
> those deadlocks, and confirmed that, with this patch series, we can use
> netconsole on bonding without deadlock in all bonding modes with all slaves,
> even across failovers.  I've also fixed up some incidental bugs that I ran
> across while looking through this code, as described in individual patches.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

I've tested this patch series and found this:

netconsole: network logging started
bonding: bond0: making interface eth0 the new active one.
------------[ cut here ]------------
WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x4e/0xd7()
Hardware name: Precision WorkStation 490    
Modules linked in: netconsole configfs sunrpc bonding ip6t_REJECT
nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 p4_clockmod freq_table
speedstep_lib dm_multipath uinput snd_hda_codec_idt snd_hda_intel
snd_hda_codec snd_hwdep snd_seq snd_seq_device i5k_amb snd_pcm hwmon
i5000_edac snd_timer edac_core e1000 snd ppdev parport_pc iTCO_wdt
parport iTCO_vendor_support soundcore tg3 dcdbas pcspkr shpchp i2c_i801
serio_raw snd_page_alloc nouveau ttm drm_kms_helper drm i2c_algo_bit
video output i2c_core [last unloaded: netconsole]
Pid: 8, comm: kworker/1:0 Not tainted 2.6.36-rc7+ #26
Call Trace:
 [<ffffffff810510c5>] warn_slowpath_common+0x85/0x9d
 [<ffffffff813cfcf2>] ? rcu_read_unlock_bh+0x26/0x28
 [<ffffffff810510f7>] warn_slowpath_null+0x1a/0x1c
 [<ffffffff810574fa>] _local_bh_enable_ip+0x4e/0xd7
 [<ffffffff810575a5>] local_bh_enable+0x12/0x14 <-- enabling again
 [<ffffffff813cfcf2>] rcu_read_unlock_bh+0x26/0x28
 [<ffffffff813d08a1>] dev_queue_xmit+0x363/0x375
 [<ffffffff813d053e>] ? dev_queue_xmit+0x0/0x375
 [<ffffffffa028c1e0>] bond_dev_queue_xmit+0xbe/0xdb [bonding]
 [<ffffffffa028c46e>] bond_start_xmit+0x271/0x4df [bonding]
 [<ffffffff813e0a15>] queue_process+0xcd/0x18a <- interrupts disabled
 [<ffffffff813e0948>] ? queue_process+0x0/0x18a
 [<ffffffff810673cf>] process_one_work+0x216/0x37d
 [<ffffffff81067344>] ? process_one_work+0x18b/0x37d
 [<ffffffff8106920d>] ? manage_workers+0x10b/0x195
 [<ffffffff810693d8>] worker_thread+0x141/0x21e
 [<ffffffff81069297>] ? worker_thread+0x0/0x21e
 [<ffffffff8106c988>] kthread+0x9d/0xa5
 [<ffffffff8100aaa4>] kernel_thread_helper+0x4/0x10
 [<ffffffff8147f950>] ? restore_args+0x0/0x30
 [<ffffffff8106c8eb>] ? kthread+0x0/0xa5
 [<ffffffff8100aaa0>] ? kernel_thread_helper+0x0/0x10
---[ end trace 55688f5173e9b393 ]---
e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
bonding: bond0: link status definitely up for interface eth1.
0)

This happens because queue_process() disables local
interrupts before calling ->ndo_start_xmit(), and
dev_queue_xmit() then calls local_bh_enable() (via
rcu_read_unlock_bh()) while they are still disabled.

I have CONFIG_TRACE_IRQFLAGS=y in my .config.
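
The sequence in the trace can be modelled in userspace. The sketch below
is not kernel code: every model_* name and the three state variables are
hypothetical stand-ins, and it assumes the check at kernel/softirq.c:143
fires when bottom halves are re-enabled while interrupts are disabled,
which is what the trace annotations suggest.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-ins for per-CPU kernel state; not kernel code. */
static bool irqs_disabled_model;   /* true between irq save and restore */
static int softirq_disable_depth;  /* nesting depth of bh-disable calls */
static int warnings_seen;

static void model_local_bh_disable(void)
{
	softirq_disable_depth++;
}

/* Models the check assumed to sit behind the WARNING in the trace. */
static void model_local_bh_enable(void)
{
	if (irqs_disabled_model) {
		warnings_seen++;
		fprintf(stderr, "WARNING: bh enabled with irqs disabled\n");
	}
	softirq_disable_depth--;
}

/* dev_queue_xmit() brackets its work with rcu_read_lock_bh()/unlock_bh(). */
static void model_dev_queue_xmit(void)
{
	model_local_bh_disable();
	/* ... enqueue or transmit the packet ... */
	model_local_bh_enable();
}

/* queue_process() calls the driver's xmit hook with interrupts off. */
static void model_queue_process(void)
{
	irqs_disabled_model = true;   /* stands in for local_irq_save() */
	model_dev_queue_xmit();       /* reached via bond_start_xmit() */
	irqs_disabled_model = false;  /* stands in for local_irq_restore() */
}

/* Returns how many warnings one pass through the path produces. */
static int model_run_once(void)
{
	warnings_seen = 0;
	model_queue_process();
	return warnings_seen;
}
```

Calling model_run_once() walks the same order of calls as the trace:
interrupts off, bottom halves off, bottom halves back on, interrupts on.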


-- 
Flavio


* Re: [PATCH 2/3] cxgb4: function namespace cleanup
From: Stephen Hemminger @ 2010-10-15 23:50 UTC (permalink / raw)
  To: Joe Perches
  Cc: Divy Le Ray, David S. Miller, Casey Leedom, Dimitris Michailidis,
	netdev
In-Reply-To: <1287185361.1117.734.camel@Joe-Laptop>

On Fri, 15 Oct 2010 16:29:21 -0700
Joe Perches <joe@perches.com> wrote:

> On Fri, 2010-10-15 at 15:43 -0700, Stephen Hemminger wrote:
> > plain text document attachment (cxgb4-local.patch)
> > Make functions only used in one file local.
> > Remove lots of dead code. Most surprising is the function
> > cxgb4_iscsi_init which is defined but never called!
> > Compile tested only
> 
> drivers/scsi/cxgbi/cxgb4i/cxgb4i.c:1423:	cxgb4_iscsi_init(lldi->ports[0], tagmask, pgsz_factor);

OK. drivers/scsi/cxgbi/cxgb4i does not exist in net-next;
I will recheck against linux-next.


* Re: large divisor for flow classifier
From: Jonathan Thibault @ 2010-10-15 23:52 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Patrick McHardy, netdev
In-Reply-To: <1287172905.2799.8.camel@edumazet-laptop>

Thank you very much :),

It at least shows I wasn't just confused about the way it works.  In its planned final form, the rate will be set around 175Mbit.  I don't need perfect distribution so things should be fine as long as hosts cannot easily cheat their way into having more bandwidth merely by creating more flows.

Jonathan

On 15/10/10 04:01 PM, Eric Dumazet wrote:
> On Friday, 15 October 2010 at 14:14 -0400, Jonathan Thibault wrote:
> 
> SFQ is limited to a 1024 divisor
> 
> You might try following patch :
> 
> (8192 is the smallest power of two greater than 6144)
> 
> sizeof(struct sfq_sched_data) becomes 0x2ccc instead of 0x10cc
> 
> keep in mind hash distribution is not perfect.
> 
> What would be the real rate ?
> 
> 
> diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
> index 3cf478d..c4a53d6 100644
> --- a/net/sched/sch_sfq.c
> +++ b/net/sched/sch_sfq.c
> @@ -77,7 +77,7 @@
>  	It is easy to increase these values, but not in flight.  */
>  
>  #define SFQ_DEPTH		128
> -#define SFQ_HASH_DIVISOR	1024
> +#define SFQ_HASH_DIVISOR	8192
>  
>  /* This type should contain at least SFQ_DEPTH*2 values */
>  typedef unsigned char sfq_index;
> 
> 
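
For reference, the arithmetic in the quoted message can be checked with a
small sketch. The helper names are hypothetical, and it assumes the size
increase comes from a table holding one sfq_index (an unsigned char, per
the typedef in the diff context) for each hash bucket.

```c
#include <stddef.h>

/* Smallest power of two strictly greater than n (valid for n < 2^31). */
static unsigned int next_pow2_above(unsigned int n)
{
	unsigned int p = 1;

	while (p <= n)
		p <<= 1;
	return p;
}

/*
 * Extra bytes needed when a table with one entry per hash bucket grows
 * from old_div to new_div buckets.
 */
static size_t table_growth(unsigned int old_div, unsigned int new_div,
			   size_t entry_size)
{
	return (size_t)(new_div - old_div) * entry_size;
}
```

With these helpers, next_pow2_above(6144) gives 8192, and
table_growth(1024, 8192, sizeof(unsigned char)) gives 7168 (0x1c00),
which equals 0x2ccc - 0x10cc.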


* Re: large divisor for flow classifier
From: Jonathan Thibault @ 2010-10-15 23:58 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Eric Dumazet, Patrick McHardy, netdev
In-Reply-To: <4CB8C341.40000@gmail.com>

I will certainly look into that.

Jonathan

On 15/10/10 05:10 PM, Jarek Poplawski wrote:
> Eric Dumazet wrote:
> 
> Because of low SFQ_DEPTH, which limits its queue to 127 packets,
> SFQ isn't suitable for serving so many users. There is sch_drr as
> a replacement, alas more complex and undocumented, but google should
> help you enough.
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=13d2a1d2b032de08d7dcab6a1edcd47802681f96
> 
> Jarek P.



* Re: [PATCH] bonding: various fixes for bonding, netpoll & netconsole (v2)
From: Neil Horman @ 2010-10-16  0:06 UTC (permalink / raw)
  To: Flavio Leitner; +Cc: netdev, bonding-devel, fubar, davem, andy, amwang
In-Reply-To: <20101015234115.GB2747@redhat.com>

On Fri, Oct 15, 2010 at 08:41:15PM -0300, Flavio Leitner wrote:
> On Wed, Oct 13, 2010 at 08:35:29AM -0400, nhorman@tuxdriver.com wrote:
> > Version 2, taking the following changes into account:
> > 
> > 1) Moved tx blocking/checking macros to netpoll.h as suggested by amwang
> > 
> > 2) Added tx blocking macro calls to sysfs paths, as they can deadlock in the
> > same way that the link monitoring paths can.
> > 
> > Summary: 
> > A while ago we tried to enable netpoll on the bonding driver to enable
> > netconsole.  That worked well in a steady state, but deadlocked frequently in
> > failover conditions due to some recursive lock-taking (as well as a few other
> > problems).  I've gone through the driver, netconsole and netpoll code, fixed up
> > those deadlocks, and confirmed that, with this patch series, we can use
> > netconsole on bonding without deadlock in all bonding modes with all slaves,
> > even across failovers.  I've also fixed up some incidental bugs that I ran
> > across while looking through this code, as described in individual patches
> > 
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> 
> I've tested this patch series and found this:
> 
> netconsole: network logging started
> bonding: bond0: making interface eth0 the new active one.
> ------------[ cut here ]------------
> WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x4e/0xd7()
> Hardware name: Precision WorkStation 490    
> Modules linked in: netconsole configfs sunrpc bonding ip6t_REJECT
> nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 p4_clockmod freq_table
> speedstep_lib dm_multipath uinput snd_hda_codec_idt snd_hda_intel
> snd_hda_codec snd_hwdep snd_seq snd_seq_device i5k_amb snd_pcm hwmon
> i5000_edac snd_timer edac_core e1000 snd ppdev parport_pc iTCO_wdt
> parport iTCO_vendor_support soundcore tg3 dcdbas pcspkr shpchp i2c_i801
> serio_raw snd_page_alloc nouveau ttm drm_kms_helper drm i2c_algo_bit
> video output i2c_core [last unloaded: netconsole]
> Pid: 8, comm: kworker/1:0 Not tainted 2.6.36-rc7+ #26
> Call Trace:
>  [<ffffffff810510c5>] warn_slowpath_common+0x85/0x9d
>  [<ffffffff813cfcf2>] ? rcu_read_unlock_bh+0x26/0x28
>  [<ffffffff810510f7>] warn_slowpath_null+0x1a/0x1c
>  [<ffffffff810574fa>] _local_bh_enable_ip+0x4e/0xd7
>  [<ffffffff810575a5>] local_bh_enable+0x12/0x14 <-- enabling again
>  [<ffffffff813cfcf2>] rcu_read_unlock_bh+0x26/0x28
>  [<ffffffff813d08a1>] dev_queue_xmit+0x363/0x375
>  [<ffffffff813d053e>] ? dev_queue_xmit+0x0/0x375
>  [<ffffffffa028c1e0>] bond_dev_queue_xmit+0xbe/0xdb [bonding]
>  [<ffffffffa028c46e>] bond_start_xmit+0x271/0x4df [bonding]
>  [<ffffffff813e0a15>] queue_process+0xcd/0x18a <- interrupts disabled
>  [<ffffffff813e0948>] ? queue_process+0x0/0x18a
>  [<ffffffff810673cf>] process_one_work+0x216/0x37d
>  [<ffffffff81067344>] ? process_one_work+0x18b/0x37d
>  [<ffffffff8106920d>] ? manage_workers+0x10b/0x195
>  [<ffffffff810693d8>] worker_thread+0x141/0x21e
>  [<ffffffff81069297>] ? worker_thread+0x0/0x21e
>  [<ffffffff8106c988>] kthread+0x9d/0xa5
>  [<ffffffff8100aaa4>] kernel_thread_helper+0x4/0x10
>  [<ffffffff8147f950>] ? restore_args+0x0/0x30
>  [<ffffffff8106c8eb>] ? kthread+0x0/0xa5
>  [<ffffffff8100aaa0>] ? kernel_thread_helper+0x0/0x10
> ---[ end trace 55688f5173e9b393 ]---
> e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> bonding: bond0: link status definitely up for interface eth1.
> 0)
> 
> This happens because queue_process() disables local
> interrupts before calling ->ndo_start_xmit(), and
> dev_queue_xmit() then calls local_bh_enable() (via
> rcu_read_unlock_bh()) while they are still disabled.
> 
> I have CONFIG_TRACE_IRQFLAGS=y in my .config.
> 
Well, you look to be correct, although I'm not sure why you're replying to this
thread to note the condition.  This patch series doesn't change any of that
code (although it does make use of the existing function).  This problem could
just as easily happen to any driver that returns NETDEV_TX_BUSY in response to a
netpoll transmit, or anytime a netpoll gets blocked because the xmit_lock is
already held or the tx queue is stopped.  Can you please write a patch to fix
it?

Thanks!
Neil

> 
> -- 
> Flavio
> 


* [PATCH 2/3] cxgb4: function namespace cleanup (v2)
From: Stephen Hemminger @ 2010-10-16  0:10 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Divy Le Ray, David S. Miller, Casey Leedom, Dimitris Michailidis,
	netdev
In-Reply-To: <20101015224523.633775810@vyatta.com>

Make functions only used in one file local.
Remove lots of dead code, relating to unsupported functions
in mainline driver like RSS, IPv6, and TCP offload.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
Patch against version in linux-next. It keeps the hooks
necessary for iscsi.

 drivers/net/cxgb4/cxgb4.h      |   17 -
 drivers/net/cxgb4/cxgb4_main.c |   78 --------
 drivers/net/cxgb4/cxgb4_uld.h  |    8 
 drivers/net/cxgb4/l2t.c        |   34 ---
 drivers/net/cxgb4/l2t.h        |    3 
 drivers/net/cxgb4/t4_hw.c      |  394 -----------------------------------------
 6 files changed, 3 insertions(+), 531 deletions(-)

--- a/drivers/net/cxgb4/cxgb4_main.c	2010-10-15 16:58:22.899742447 -0700
+++ b/drivers/net/cxgb4/cxgb4_main.c	2010-10-15 17:02:22.336255220 -0700
@@ -880,7 +880,7 @@ void *t4_alloc_mem(size_t size)
 /*
  * Free memory allocated through alloc_mem().
  */
-void t4_free_mem(void *addr)
+static void t4_free_mem(void *addr)
 {
 	if (is_vmalloc_addr(addr))
 		vfree(addr);
@@ -2207,8 +2207,8 @@ static void mk_tid_release(struct sk_buf
  * Queue a TID release request and if necessary schedule a work queue to
  * process it.
  */
-void cxgb4_queue_tid_release(struct tid_info *t, unsigned int chan,
-			     unsigned int tid)
+static void cxgb4_queue_tid_release(struct tid_info *t, unsigned int chan,
+				    unsigned int tid)
 {
 	void **p = &t->tid_tab[tid];
 	struct adapter *adap = container_of(t, struct adapter, tids);
@@ -2223,7 +2223,6 @@ void cxgb4_queue_tid_release(struct tid_
 	}
 	spin_unlock_bh(&adap->tid_release_lock);
 }
-EXPORT_SYMBOL(cxgb4_queue_tid_release);
 
 /*
  * Process the list of pending TID release requests.
@@ -2356,48 +2355,6 @@ int cxgb4_create_server(const struct net
 EXPORT_SYMBOL(cxgb4_create_server);
 
 /**
- *	cxgb4_create_server6 - create an IPv6 server
- *	@dev: the device
- *	@stid: the server TID
- *	@sip: local IPv6 address to bind server to
- *	@sport: the server's TCP port
- *	@queue: queue to direct messages from this server to
- *
- *	Create an IPv6 server for the given port and address.
- *	Returns <0 on error and one of the %NET_XMIT_* values on success.
- */
-int cxgb4_create_server6(const struct net_device *dev, unsigned int stid,
-			 const struct in6_addr *sip, __be16 sport,
-			 unsigned int queue)
-{
-	unsigned int chan;
-	struct sk_buff *skb;
-	struct adapter *adap;
-	struct cpl_pass_open_req6 *req;
-
-	skb = alloc_skb(sizeof(*req), GFP_KERNEL);
-	if (!skb)
-		return -ENOMEM;
-
-	adap = netdev2adap(dev);
-	req = (struct cpl_pass_open_req6 *)__skb_put(skb, sizeof(*req));
-	INIT_TP_WR(req, 0);
-	OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_PASS_OPEN_REQ6, stid));
-	req->local_port = sport;
-	req->peer_port = htons(0);
-	req->local_ip_hi = *(__be64 *)(sip->s6_addr);
-	req->local_ip_lo = *(__be64 *)(sip->s6_addr + 8);
-	req->peer_ip_hi = cpu_to_be64(0);
-	req->peer_ip_lo = cpu_to_be64(0);
-	chan = rxq_to_chan(&adap->sge, queue);
-	req->opt0 = cpu_to_be64(TX_CHAN(chan));
-	req->opt1 = cpu_to_be64(CONN_POLICY_ASK |
-				SYN_RSS_ENABLE | SYN_RSS_QUEUE(queue));
-	return t4_mgmt_tx(adap, skb);
-}
-EXPORT_SYMBOL(cxgb4_create_server6);
-
-/**
  *	cxgb4_best_mtu - find the entry in the MTU table closest to an MTU
  *	@mtus: the HW MTU table
  *	@mtu: the target MTU
@@ -2456,35 +2413,6 @@ unsigned int cxgb4_port_idx(const struct
 }
 EXPORT_SYMBOL(cxgb4_port_idx);
 
-/**
- *	cxgb4_netdev_by_hwid - return the net device of a HW port
- *	@pdev: identifies the adapter
- *	@id: the HW port id
- *
- *	Return the net device associated with the interface with the given HW
- *	id.
- */
-struct net_device *cxgb4_netdev_by_hwid(struct pci_dev *pdev, unsigned int id)
-{
-	const struct adapter *adap = pci_get_drvdata(pdev);
-
-	if (!adap || id >= NCHAN)
-		return NULL;
-	id = adap->chan_map[id];
-	return id < MAX_NPORTS ? adap->port[id] : NULL;
-}
-EXPORT_SYMBOL(cxgb4_netdev_by_hwid);
-
-void cxgb4_get_tcp_stats(struct pci_dev *pdev, struct tp_tcp_stats *v4,
-			 struct tp_tcp_stats *v6)
-{
-	struct adapter *adap = pci_get_drvdata(pdev);
-
-	spin_lock(&adap->stats_lock);
-	t4_tp_get_tcp_stats(adap, v4, v6);
-	spin_unlock(&adap->stats_lock);
-}
-EXPORT_SYMBOL(cxgb4_get_tcp_stats);
 
 void cxgb4_iscsi_init(struct net_device *dev, unsigned int tag_mask,
 		      const unsigned int *pgsz_order)
--- a/drivers/net/cxgb4/cxgb4_uld.h	2010-10-15 16:58:22.875741593 -0700
+++ b/drivers/net/cxgb4/cxgb4_uld.h	2010-10-15 17:07:33.695516682 -0700
@@ -139,16 +139,11 @@ int cxgb4_alloc_stid(struct tid_info *t,
 void cxgb4_free_atid(struct tid_info *t, unsigned int atid);
 void cxgb4_free_stid(struct tid_info *t, unsigned int stid, int family);
 void cxgb4_remove_tid(struct tid_info *t, unsigned int qid, unsigned int tid);
-void cxgb4_queue_tid_release(struct tid_info *t, unsigned int chan,
-			     unsigned int tid);
 
 struct in6_addr;
 
 int cxgb4_create_server(const struct net_device *dev, unsigned int stid,
 			__be32 sip, __be16 sport, unsigned int queue);
-int cxgb4_create_server6(const struct net_device *dev, unsigned int stid,
-			 const struct in6_addr *sip, __be16 sport,
-			 unsigned int queue);
 
 static inline void set_wr_txq(struct sk_buff *skb, int prio, int queue)
 {
@@ -233,11 +228,8 @@ int cxgb4_ofld_send(struct net_device *d
 unsigned int cxgb4_port_chan(const struct net_device *dev);
 unsigned int cxgb4_port_viid(const struct net_device *dev);
 unsigned int cxgb4_port_idx(const struct net_device *dev);
-struct net_device *cxgb4_netdev_by_hwid(struct pci_dev *pdev, unsigned int id);
 unsigned int cxgb4_best_mtu(const unsigned short *mtus, unsigned short mtu,
 			    unsigned int *idx);
-void cxgb4_get_tcp_stats(struct pci_dev *pdev, struct tp_tcp_stats *v4,
-			 struct tp_tcp_stats *v6);
 void cxgb4_iscsi_init(struct net_device *dev, unsigned int tag_mask,
 		      const unsigned int *pgsz_order);
 struct sk_buff *cxgb4_pktgl_to_skb(const struct pkt_gl *gl,
--- a/drivers/net/cxgb4/cxgb4.h	2010-10-15 16:58:22.847740597 -0700
+++ b/drivers/net/cxgb4/cxgb4.h	2010-10-15 17:00:42.852719427 -0700
@@ -592,7 +592,6 @@ void t4_os_portmod_changed(const struct
 void t4_os_link_changed(struct adapter *adap, int port_id, int link_stat);
 
 void *t4_alloc_mem(size_t size);
-void t4_free_mem(void *addr);
 
 void t4_free_sge_resources(struct adapter *adap);
 irq_handler_t t4_intr_handler(struct adapter *adap);
@@ -651,7 +650,6 @@ static inline int t4_wr_mbox_ns(struct a
 
 void t4_intr_enable(struct adapter *adapter);
 void t4_intr_disable(struct adapter *adapter);
-void t4_intr_clear(struct adapter *adapter);
 int t4_slow_intr_handler(struct adapter *adapter);
 
 int t4_wait_dev_ready(struct adapter *adap);
@@ -664,26 +662,16 @@ int t4_check_fw_version(struct adapter *
 int t4_prep_adapter(struct adapter *adapter);
 int t4_port_init(struct adapter *adap, int mbox, int pf, int vf);
 void t4_fatal_err(struct adapter *adapter);
-int t4_set_trace_filter(struct adapter *adapter, const struct trace_params *tp,
-			int filter_index, int enable);
-void t4_get_trace_filter(struct adapter *adapter, struct trace_params *tp,
-			 int filter_index, int *enabled);
 int t4_config_rss_range(struct adapter *adapter, int mbox, unsigned int viid,
 			int start, int n, const u16 *rspq, unsigned int nrspq);
 int t4_config_glbl_rss(struct adapter *adapter, int mbox, unsigned int mode,
 		       unsigned int flags);
-int t4_read_rss(struct adapter *adapter, u16 *entries);
 int t4_mc_read(struct adapter *adap, u32 addr, __be32 *data, u64 *parity);
 int t4_edc_read(struct adapter *adap, int idx, u32 addr, __be32 *data,
 		u64 *parity);
 
 void t4_get_port_stats(struct adapter *adap, int idx, struct port_stats *p);
-void t4_get_lb_stats(struct adapter *adap, int idx, struct lb_port_stats *p);
-
 void t4_read_mtu_tbl(struct adapter *adap, u16 *mtus, u8 *mtu_log);
-void t4_tp_get_err_stats(struct adapter *adap, struct tp_err_stats *st);
-void t4_tp_get_tcp_stats(struct adapter *adap, struct tp_tcp_stats *v4,
-			 struct tp_tcp_stats *v6);
 void t4_load_mtus(struct adapter *adap, const unsigned short *mtus,
 		  const unsigned short *alpha, const unsigned short *beta);
 
@@ -711,8 +699,6 @@ int t4_cfg_pfvf(struct adapter *adap, un
 int t4_alloc_vi(struct adapter *adap, unsigned int mbox, unsigned int port,
 		unsigned int pf, unsigned int vf, unsigned int nmac, u8 *mac,
 		unsigned int *rss_size);
-int t4_free_vi(struct adapter *adap, unsigned int mbox, unsigned int pf,
-	       unsigned int vf, unsigned int viid);
 int t4_set_rxmode(struct adapter *adap, unsigned int mbox, unsigned int viid,
 		int mtu, int promisc, int all_multi, int bcast, int vlanex,
 		bool sleep_ok);
@@ -731,9 +717,6 @@ int t4_mdio_rd(struct adapter *adap, uns
 	       unsigned int mmd, unsigned int reg, u16 *valp);
 int t4_mdio_wr(struct adapter *adap, unsigned int mbox, unsigned int phy_addr,
 	       unsigned int mmd, unsigned int reg, u16 val);
-int t4_iq_start_stop(struct adapter *adap, unsigned int mbox, bool start,
-		     unsigned int pf, unsigned int vf, unsigned int iqid,
-		     unsigned int fl0id, unsigned int fl1id);
 int t4_iq_free(struct adapter *adap, unsigned int mbox, unsigned int pf,
 	       unsigned int vf, unsigned int iqtype, unsigned int iqid,
 	       unsigned int fl0id, unsigned int fl1id);
--- a/drivers/net/cxgb4/l2t.c	2010-10-15 16:58:22.883741877 -0700
+++ b/drivers/net/cxgb4/l2t.c	2010-10-15 17:00:42.856719569 -0700
@@ -481,40 +481,6 @@ void t4_l2t_update(struct adapter *adap,
 		handle_failed_resolution(adap, arpq);
 }
 
-/*
- * Allocate an L2T entry for use by a switching rule.  Such entries need to be
- * explicitly freed and while busy they are not on any hash chain, so normal
- * address resolution updates do not see them.
- */
-struct l2t_entry *t4_l2t_alloc_switching(struct l2t_data *d)
-{
-	struct l2t_entry *e;
-
-	write_lock_bh(&d->lock);
-	e = alloc_l2e(d);
-	if (e) {
-		spin_lock(&e->lock);          /* avoid race with t4_l2t_free */
-		e->state = L2T_STATE_SWITCHING;
-		atomic_set(&e->refcnt, 1);
-		spin_unlock(&e->lock);
-	}
-	write_unlock_bh(&d->lock);
-	return e;
-}
-
-/*
- * Sets/updates the contents of a switching L2T entry that has been allocated
- * with an earlier call to @t4_l2t_alloc_switching.
- */
-int t4_l2t_set_switching(struct adapter *adap, struct l2t_entry *e, u16 vlan,
-			 u8 port, u8 *eth_addr)
-{
-	e->vlan = vlan;
-	e->lport = port;
-	memcpy(e->dmac, eth_addr, ETH_ALEN);
-	return write_l2e(adap, e, 0);
-}
-
 struct l2t_data *t4_init_l2t(void)
 {
 	int i;
--- a/drivers/net/cxgb4/l2t.h	2010-10-15 16:58:22.855740881 -0700
+++ b/drivers/net/cxgb4/l2t.h	2010-10-15 17:00:42.856719569 -0700
@@ -100,9 +100,6 @@ struct l2t_entry *cxgb4_l2t_get(struct l
 				unsigned int priority);
 
 void t4_l2t_update(struct adapter *adap, struct neighbour *neigh);
-struct l2t_entry *t4_l2t_alloc_switching(struct l2t_data *d);
-int t4_l2t_set_switching(struct adapter *adap, struct l2t_entry *e, u16 vlan,
-			 u8 port, u8 *eth_addr);
 struct l2t_data *t4_init_l2t(void);
 void do_l2t_write_rpl(struct adapter *p, const struct cpl_l2t_write_rpl *rpl);
 
--- a/drivers/net/cxgb4/t4_hw.c	2010-10-15 16:58:22.839740313 -0700
+++ b/drivers/net/cxgb4/t4_hw.c	2010-10-15 17:00:42.860719711 -0700
@@ -97,53 +97,6 @@ void t4_set_reg_field(struct adapter *ad
 	(void) t4_read_reg(adapter, addr);      /* flush */
 }
 
-/**
- *	t4_read_indirect - read indirectly addressed registers
- *	@adap: the adapter
- *	@addr_reg: register holding the indirect address
- *	@data_reg: register holding the value of the indirect register
- *	@vals: where the read register values are stored
- *	@nregs: how many indirect registers to read
- *	@start_idx: index of first indirect register to read
- *
- *	Reads registers that are accessed indirectly through an address/data
- *	register pair.
- */
-static void t4_read_indirect(struct adapter *adap, unsigned int addr_reg,
-			     unsigned int data_reg, u32 *vals,
-			     unsigned int nregs, unsigned int start_idx)
-{
-	while (nregs--) {
-		t4_write_reg(adap, addr_reg, start_idx);
-		*vals++ = t4_read_reg(adap, data_reg);
-		start_idx++;
-	}
-}
-
-#if 0
-/**
- *	t4_write_indirect - write indirectly addressed registers
- *	@adap: the adapter
- *	@addr_reg: register holding the indirect addresses
- *	@data_reg: register holding the value for the indirect registers
- *	@vals: values to write
- *	@nregs: how many indirect registers to write
- *	@start_idx: address of first indirect register to write
- *
- *	Writes a sequential block of registers that are accessed indirectly
- *	through an address/data register pair.
- */
-static void t4_write_indirect(struct adapter *adap, unsigned int addr_reg,
-			      unsigned int data_reg, const u32 *vals,
-			      unsigned int nregs, unsigned int start_idx)
-{
-	while (nregs--) {
-		t4_write_reg(adap, addr_reg, start_idx++);
-		t4_write_reg(adap, data_reg, *vals++);
-	}
-}
-#endif
-
 /*
  * Get the reply to a mailbox command and store it in @rpl in big-endian order.
  */
@@ -1560,44 +1513,6 @@ void t4_intr_disable(struct adapter *ada
 }
 
 /**
- *	t4_intr_clear - clear all interrupts
- *	@adapter: the adapter whose interrupts should be cleared
- *
- *	Clears all interrupts.  The caller must be a PCI function managing
- *	global interrupts.
- */
-void t4_intr_clear(struct adapter *adapter)
-{
-	static const unsigned int cause_reg[] = {
-		SGE_INT_CAUSE1, SGE_INT_CAUSE2, SGE_INT_CAUSE3,
-		PCIE_CORE_UTL_SYSTEM_BUS_AGENT_STATUS,
-		PCIE_CORE_UTL_PCI_EXPRESS_PORT_STATUS,
-		PCIE_NONFAT_ERR, PCIE_INT_CAUSE,
-		MC_INT_CAUSE,
-		MA_INT_WRAP_STATUS, MA_PARITY_ERROR_STATUS, MA_INT_CAUSE,
-		EDC_INT_CAUSE, EDC_REG(EDC_INT_CAUSE, 1),
-		CIM_HOST_INT_CAUSE, CIM_HOST_UPACC_INT_CAUSE,
-		MYPF_REG(CIM_PF_HOST_INT_CAUSE),
-		TP_INT_CAUSE,
-		ULP_RX_INT_CAUSE, ULP_TX_INT_CAUSE,
-		PM_RX_INT_CAUSE, PM_TX_INT_CAUSE,
-		MPS_RX_PERR_INT_CAUSE,
-		CPL_INTR_CAUSE,
-		MYPF_REG(PL_PF_INT_CAUSE),
-		PL_PL_INT_CAUSE,
-		LE_DB_INT_CAUSE,
-	};
-
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(cause_reg); ++i)
-		t4_write_reg(adapter, cause_reg[i], 0xffffffff);
-
-	t4_write_reg(adapter, PL_INT_CAUSE, GLBL_INTR_MASK);
-	(void) t4_read_reg(adapter, PL_INT_CAUSE);          /* flush */
-}
-
-/**
  *	hash_mac_addr - return the hash value of a MAC address
  *	@addr: the 48-bit Ethernet MAC address
  *
@@ -1709,98 +1624,6 @@ int t4_config_glbl_rss(struct adapter *a
 	return t4_wr_mbox(adapter, mbox, &c, sizeof(c), NULL);
 }
 
-/* Read an RSS table row */
-static int rd_rss_row(struct adapter *adap, int row, u32 *val)
-{
-	t4_write_reg(adap, TP_RSS_LKP_TABLE, 0xfff00000 | row);
-	return t4_wait_op_done_val(adap, TP_RSS_LKP_TABLE, LKPTBLROWVLD, 1,
-				   5, 0, val);
-}
-
-/**
- *	t4_read_rss - read the contents of the RSS mapping table
- *	@adapter: the adapter
- *	@map: holds the contents of the RSS mapping table
- *
- *	Reads the contents of the RSS hash->queue mapping table.
- */
-int t4_read_rss(struct adapter *adapter, u16 *map)
-{
-	u32 val;
-	int i, ret;
-
-	for (i = 0; i < RSS_NENTRIES / 2; ++i) {
-		ret = rd_rss_row(adapter, i, &val);
-		if (ret)
-			return ret;
-		*map++ = LKPTBLQUEUE0_GET(val);
-		*map++ = LKPTBLQUEUE1_GET(val);
-	}
-	return 0;
-}
-
-/**
- *	t4_tp_get_tcp_stats - read TP's TCP MIB counters
- *	@adap: the adapter
- *	@v4: holds the TCP/IP counter values
- *	@v6: holds the TCP/IPv6 counter values
- *
- *	Returns the values of TP's TCP/IP and TCP/IPv6 MIB counters.
- *	Either @v4 or @v6 may be %NULL to skip the corresponding stats.
- */
-void t4_tp_get_tcp_stats(struct adapter *adap, struct tp_tcp_stats *v4,
-			 struct tp_tcp_stats *v6)
-{
-	u32 val[TP_MIB_TCP_RXT_SEG_LO - TP_MIB_TCP_OUT_RST + 1];
-
-#define STAT_IDX(x) ((TP_MIB_TCP_##x) - TP_MIB_TCP_OUT_RST)
-#define STAT(x)     val[STAT_IDX(x)]
-#define STAT64(x)   (((u64)STAT(x##_HI) << 32) | STAT(x##_LO))
-
-	if (v4) {
-		t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, val,
-				 ARRAY_SIZE(val), TP_MIB_TCP_OUT_RST);
-		v4->tcpOutRsts = STAT(OUT_RST);
-		v4->tcpInSegs  = STAT64(IN_SEG);
-		v4->tcpOutSegs = STAT64(OUT_SEG);
-		v4->tcpRetransSegs = STAT64(RXT_SEG);
-	}
-	if (v6) {
-		t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, val,
-				 ARRAY_SIZE(val), TP_MIB_TCP_V6OUT_RST);
-		v6->tcpOutRsts = STAT(OUT_RST);
-		v6->tcpInSegs  = STAT64(IN_SEG);
-		v6->tcpOutSegs = STAT64(OUT_SEG);
-		v6->tcpRetransSegs = STAT64(RXT_SEG);
-	}
-#undef STAT64
-#undef STAT
-#undef STAT_IDX
-}
-
-/**
- *	t4_tp_get_err_stats - read TP's error MIB counters
- *	@adap: the adapter
- *	@st: holds the counter values
- *
- *	Returns the values of TP's error counters.
- */
-void t4_tp_get_err_stats(struct adapter *adap, struct tp_err_stats *st)
-{
-	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->macInErrs,
-			 12, TP_MIB_MAC_IN_ERR_0);
-	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->tnlCongDrops,
-			 8, TP_MIB_TNL_CNG_DROP_0);
-	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->tnlTxDrops,
-			 4, TP_MIB_TNL_DROP_0);
-	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->ofldVlanDrops,
-			 4, TP_MIB_OFD_VLN_DROP_0);
-	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->tcp6InErrs,
-			 4, TP_MIB_TCP_V6IN_ERR_0);
-	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, &st->ofldNoNeigh,
-			 2, TP_MIB_OFD_ARP_DROP);
-}
-
 /**
  *	t4_read_mtu_tbl - returns the values in the HW path MTU table
  *	@adap: the adapter
@@ -1916,122 +1739,6 @@ void t4_load_mtus(struct adapter *adap,
 }
 
 /**
- *	t4_set_trace_filter - configure one of the tracing filters
- *	@adap: the adapter
- *	@tp: the desired trace filter parameters
- *	@idx: which filter to configure
- *	@enable: whether to enable or disable the filter
- *
- *	Configures one of the tracing filters available in HW.  If @enable is
- *	%0 @tp is not examined and may be %NULL.
- */
-int t4_set_trace_filter(struct adapter *adap, const struct trace_params *tp,
-			int idx, int enable)
-{
-	int i, ofst = idx * 4;
-	u32 data_reg, mask_reg, cfg;
-	u32 multitrc = TRCMULTIFILTER;
-
-	if (!enable) {
-		t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst, 0);
-		goto out;
-	}
-
-	if (tp->port > 11 || tp->invert > 1 || tp->skip_len > 0x1f ||
-	    tp->skip_ofst > 0x1f || tp->min_len > 0x1ff ||
-	    tp->snap_len > 9600 || (idx && tp->snap_len > 256))
-		return -EINVAL;
-
-	if (tp->snap_len > 256) {            /* must be tracer 0 */
-		if ((t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + 4) |
-		     t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + 8) |
-		     t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + 12)) & TFEN)
-			return -EINVAL;  /* other tracers are enabled */
-		multitrc = 0;
-	} else if (idx) {
-		i = t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_B);
-		if (TFCAPTUREMAX_GET(i) > 256 &&
-		    (t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A) & TFEN))
-			return -EINVAL;
-	}
-
-	/* stop the tracer we'll be changing */
-	t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst, 0);
-
-	/* disable tracing globally if running in the wrong single/multi mode */
-	cfg = t4_read_reg(adap, MPS_TRC_CFG);
-	if ((cfg & TRCEN) && multitrc != (cfg & TRCMULTIFILTER)) {
-		t4_write_reg(adap, MPS_TRC_CFG, cfg ^ TRCEN);
-		t4_read_reg(adap, MPS_TRC_CFG);                  /* flush */
-		msleep(1);
-		if (!(t4_read_reg(adap, MPS_TRC_CFG) & TRCFIFOEMPTY))
-			return -ETIMEDOUT;
-	}
-	/*
-	 * At this point either the tracing is enabled and in the right mode or
-	 * disabled.
-	 */
-
-	idx *= (MPS_TRC_FILTER1_MATCH - MPS_TRC_FILTER0_MATCH);
-	data_reg = MPS_TRC_FILTER0_MATCH + idx;
-	mask_reg = MPS_TRC_FILTER0_DONT_CARE + idx;
-
-	for (i = 0; i < TRACE_LEN / 4; i++, data_reg += 4, mask_reg += 4) {
-		t4_write_reg(adap, data_reg, tp->data[i]);
-		t4_write_reg(adap, mask_reg, ~tp->mask[i]);
-	}
-	t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_B + ofst,
-		     TFCAPTUREMAX(tp->snap_len) |
-		     TFMINPKTSIZE(tp->min_len));
-	t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst,
-		     TFOFFSET(tp->skip_ofst) | TFLENGTH(tp->skip_len) |
-		     TFPORT(tp->port) | TFEN |
-		     (tp->invert ? TFINVERTMATCH : 0));
-
-	cfg &= ~TRCMULTIFILTER;
-	t4_write_reg(adap, MPS_TRC_CFG, cfg | TRCEN | multitrc);
-out:	t4_read_reg(adap, MPS_TRC_CFG);  /* flush */
-	return 0;
-}
-
-/**
- *	t4_get_trace_filter - query one of the tracing filters
- *	@adap: the adapter
- *	@tp: the current trace filter parameters
- *	@idx: which trace filter to query
- *	@enabled: non-zero if the filter is enabled
- *
- *	Returns the current settings of one of the HW tracing filters.
- */
-void t4_get_trace_filter(struct adapter *adap, struct trace_params *tp, int idx,
-			 int *enabled)
-{
-	u32 ctla, ctlb;
-	int i, ofst = idx * 4;
-	u32 data_reg, mask_reg;
-
-	ctla = t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst);
-	ctlb = t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_B + ofst);
-
-	*enabled = !!(ctla & TFEN);
-	tp->snap_len = TFCAPTUREMAX_GET(ctlb);
-	tp->min_len = TFMINPKTSIZE_GET(ctlb);
-	tp->skip_ofst = TFOFFSET_GET(ctla);
-	tp->skip_len = TFLENGTH_GET(ctla);
-	tp->invert = !!(ctla & TFINVERTMATCH);
-	tp->port = TFPORT_GET(ctla);
-
-	ofst = (MPS_TRC_FILTER1_MATCH - MPS_TRC_FILTER0_MATCH) * idx;
-	data_reg = MPS_TRC_FILTER0_MATCH + ofst;
-	mask_reg = MPS_TRC_FILTER0_DONT_CARE + ofst;
-
-	for (i = 0; i < TRACE_LEN / 4; i++, data_reg += 4, mask_reg += 4) {
-		tp->mask[i] = ~t4_read_reg(adap, mask_reg);
-		tp->data[i] = t4_read_reg(adap, data_reg) & tp->mask[i];
-	}
-}
-
-/**
  *	get_mps_bg_map - return the buffer groups associated with a port
  *	@adap: the adapter
  *	@idx: the port index
@@ -2133,52 +1840,6 @@ void t4_get_port_stats(struct adapter *a
 }
 
 /**
- *	t4_get_lb_stats - collect loopback port statistics
- *	@adap: the adapter
- *	@idx: the loopback port index
- *	@p: the stats structure to fill
- *
- *	Return HW statistics for the given loopback port.
- */
-void t4_get_lb_stats(struct adapter *adap, int idx, struct lb_port_stats *p)
-{
-	u32 bgmap = get_mps_bg_map(adap, idx);
-
-#define GET_STAT(name) \
-	t4_read_reg64(adap, PORT_REG(idx, MPS_PORT_STAT_LB_PORT_##name##_L))
-#define GET_STAT_COM(name) t4_read_reg64(adap, MPS_STAT_##name##_L)
-
-	p->octets           = GET_STAT(BYTES);
-	p->frames           = GET_STAT(FRAMES);
-	p->bcast_frames     = GET_STAT(BCAST);
-	p->mcast_frames     = GET_STAT(MCAST);
-	p->ucast_frames     = GET_STAT(UCAST);
-	p->error_frames     = GET_STAT(ERROR);
-
-	p->frames_64        = GET_STAT(64B);
-	p->frames_65_127    = GET_STAT(65B_127B);
-	p->frames_128_255   = GET_STAT(128B_255B);
-	p->frames_256_511   = GET_STAT(256B_511B);
-	p->frames_512_1023  = GET_STAT(512B_1023B);
-	p->frames_1024_1518 = GET_STAT(1024B_1518B);
-	p->frames_1519_max  = GET_STAT(1519B_MAX);
-	p->drop             = t4_read_reg(adap, PORT_REG(idx,
-					  MPS_PORT_STAT_LB_PORT_DROP_FRAMES));
-
-	p->ovflow0 = (bgmap & 1) ? GET_STAT_COM(RX_BG_0_LB_DROP_FRAME) : 0;
-	p->ovflow1 = (bgmap & 2) ? GET_STAT_COM(RX_BG_1_LB_DROP_FRAME) : 0;
-	p->ovflow2 = (bgmap & 4) ? GET_STAT_COM(RX_BG_2_LB_DROP_FRAME) : 0;
-	p->ovflow3 = (bgmap & 8) ? GET_STAT_COM(RX_BG_3_LB_DROP_FRAME) : 0;
-	p->trunc0 = (bgmap & 1) ? GET_STAT_COM(RX_BG_0_LB_TRUNC_FRAME) : 0;
-	p->trunc1 = (bgmap & 2) ? GET_STAT_COM(RX_BG_1_LB_TRUNC_FRAME) : 0;
-	p->trunc2 = (bgmap & 4) ? GET_STAT_COM(RX_BG_2_LB_TRUNC_FRAME) : 0;
-	p->trunc3 = (bgmap & 8) ? GET_STAT_COM(RX_BG_3_LB_TRUNC_FRAME) : 0;
-
-#undef GET_STAT
-#undef GET_STAT_COM
-}
-
-/**
  *	t4_wol_magic_enable - enable/disable magic packet WoL
  *	@adap: the adapter
  *	@port: the physical port index
@@ -2584,30 +2245,6 @@ int t4_alloc_vi(struct adapter *adap, un
 }
 
 /**
- *	t4_free_vi - free a virtual interface
- *	@adap: the adapter
- *	@mbox: mailbox to use for the FW command
- *	@pf: the PF owning the VI
- *	@vf: the VF owning the VI
- *	@viid: virtual interface identifiler
- *
- *	Free a previously allocated virtual interface.
- */
-int t4_free_vi(struct adapter *adap, unsigned int mbox, unsigned int pf,
-	       unsigned int vf, unsigned int viid)
-{
-	struct fw_vi_cmd c;
-
-	memset(&c, 0, sizeof(c));
-	c.op_to_vfn = htonl(FW_CMD_OP(FW_VI_CMD) | FW_CMD_REQUEST |
-			    FW_CMD_EXEC | FW_VI_CMD_PFN(pf) |
-			    FW_VI_CMD_VFN(vf));
-	c.alloc_to_len16 = htonl(FW_VI_CMD_FREE | FW_LEN16(c));
-	c.type_viid = htons(FW_VI_CMD_VIID(viid));
-	return t4_wr_mbox(adap, mbox, &c, sizeof(c), &c);
-}
-
-/**
  *	t4_set_rxmode - set Rx properties of a virtual interface
  *	@adap: the adapter
  *	@mbox: mailbox to use for the FW command
@@ -2832,37 +2469,6 @@ int t4_identify_port(struct adapter *ada
 	return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL);
 }
 
-/**
- *	t4_iq_start_stop - enable/disable an ingress queue and its FLs
- *	@adap: the adapter
- *	@mbox: mailbox to use for the FW command
- *	@start: %true to enable the queues, %false to disable them
- *	@pf: the PF owning the queues
- *	@vf: the VF owning the queues
- *	@iqid: ingress queue id
- *	@fl0id: FL0 queue id or 0xffff if no attached FL0
- *	@fl1id: FL1 queue id or 0xffff if no attached FL1
- *
- *	Starts or stops an ingress queue and its associated FLs, if any.
- */
-int t4_iq_start_stop(struct adapter *adap, unsigned int mbox, bool start,
-		     unsigned int pf, unsigned int vf, unsigned int iqid,
-		     unsigned int fl0id, unsigned int fl1id)
-{
-	struct fw_iq_cmd c;
-
-	memset(&c, 0, sizeof(c));
-	c.op_to_vfn = htonl(FW_CMD_OP(FW_IQ_CMD) | FW_CMD_REQUEST |
-			    FW_CMD_EXEC | FW_IQ_CMD_PFN(pf) |
-			    FW_IQ_CMD_VFN(vf));
-	c.alloc_to_len16 = htonl(FW_IQ_CMD_IQSTART(start) |
-				 FW_IQ_CMD_IQSTOP(!start) | FW_LEN16(c));
-	c.iqid = htons(iqid);
-	c.fl0id = htons(fl0id);
-	c.fl1id = htons(fl1id);
-	return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL);
-}
-
 /**
  *	t4_iq_free - free an ingress queue and its FLs
  *	@adap: the adapter


* Re: [PATCH] bonding: various fixes for bonding, netpoll & netconsole (v2)
From: Flavio Leitner @ 2010-10-16  0:45 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, bonding-devel, fubar, davem, andy, amwang
In-Reply-To: <20101016000634.GA6986@localhost.localdomain>

On Fri, Oct 15, 2010 at 08:06:34PM -0400, Neil Horman wrote:
> On Fri, Oct 15, 2010 at 08:41:15PM -0300, Flavio Leitner wrote:
> > On Wed, Oct 13, 2010 at 08:35:29AM -0400, nhorman@tuxdriver.com wrote:
> > > Version 2, taking the following changes into account:
> > > 
> > > 1) Moved tx blocking/checking macros to netpoll.h as suggested by amwang
> > > 
> > > 2) Added tx blocking macro calls to sysfs paths, as they can deadlock in the
> > > same way that the link monitoring paths can.
> > > 
> > > Summary: 
> > > A while ago we tried to enable netpoll on the bonding driver to enable
> > > netconsole.  That worked well in a steady state, but deadlocked frequently in
> > > failover conditions due to some recursive lock-taking (as well as a few other
> > > problems).  I've gone through the driver, netconsole and netpoll code, fixed up
> > > those deadlocks, and confirmed that, with this patch series, we can use
> > > netconsole on bonding without deadlock in all bonding modes with all slaves,
> > > even across failovers.  I've also fixed up some incidental bugs that I ran
> > > across while looking through this code, as described in individual patches
> > > 
> > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > 
> > I've tested these patch series and found this:
> > 
> > netconsole: network logging started
> > bonding: bond0: making interface eth0 the new active one.
> > ------------[ cut here ]------------
> > WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x4e/0xd7()
> > Hardware name: Precision WorkStation 490    
> > Modules linked in: netconsole configfs sunrpc bonding ip6t_REJECT
> > nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 p4_clockmod freq_table
> > speedstep_lib dm_multipath uinput snd_hda_codec_idt snd_hda_intel
> > snd_hda_codec snd_hwdep snd_seq snd_seq_device i5k_amb snd_pcm hwmon
> > i5000_edac snd_timer edac_core e1000 snd ppdev parport_pc iTCO_wdt
> > parport iTCO_vendor_support soundcore tg3 dcdbas pcspkr shpchp i2c_i801
> > serio_raw snd_page_alloc nouveau ttm drm_kms_helper drm i2c_algo_bit
> > video output i2c_core [last unloaded: netconsole]
> > Pid: 8, comm: kworker/1:0 Not tainted 2.6.36-rc7+ #26
> > Call Trace:
> >  [<ffffffff810510c5>] warn_slowpath_common+0x85/0x9d
> >  [<ffffffff813cfcf2>] ? rcu_read_unlock_bh+0x26/0x28
> >  [<ffffffff810510f7>] warn_slowpath_null+0x1a/0x1c
> >  [<ffffffff810574fa>] _local_bh_enable_ip+0x4e/0xd7
> >  [<ffffffff810575a5>] local_bh_enable+0x12/0x14 <-- enabling again
> >  [<ffffffff813cfcf2>] rcu_read_unlock_bh+0x26/0x28
> >  [<ffffffff813d08a1>] dev_queue_xmit+0x363/0x375
> >  [<ffffffff813d053e>] ? dev_queue_xmit+0x0/0x375
> >  [<ffffffffa028c1e0>] bond_dev_queue_xmit+0xbe/0xdb [bonding]
> >  [<ffffffffa028c46e>] bond_start_xmit+0x271/0x4df [bonding]
> >  [<ffffffff813e0a15>] queue_process+0xcd/0x18a <- interrupts disabled
> >  [<ffffffff813e0948>] ? queue_process+0x0/0x18a
> >  [<ffffffff810673cf>] process_one_work+0x216/0x37d
> >  [<ffffffff81067344>] ? process_one_work+0x18b/0x37d
> >  [<ffffffff8106920d>] ? manage_workers+0x10b/0x195
> >  [<ffffffff810693d8>] worker_thread+0x141/0x21e
> >  [<ffffffff81069297>] ? worker_thread+0x0/0x21e
> >  [<ffffffff8106c988>] kthread+0x9d/0xa5
> >  [<ffffffff8100aaa4>] kernel_thread_helper+0x4/0x10
> >  [<ffffffff8147f950>] ? restore_args+0x0/0x30
> >  [<ffffffff8106c8eb>] ? kthread+0x0/0xa5
> >  [<ffffffff8100aaa0>] ? kernel_thread_helper+0x0/0x10
> > ---[ end trace 55688f5173e9b393 ]---
> > e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> > bonding: bond0: link status definitely up for interface eth1.
> > 0)
> > 
> > It happens because queue_process() disables local
> > interrupts before calling ->ndo_start_xmit(), and then
> > dev_queue_xmit() enables them back.
> > 
> > I have CONFIG_TRACE_IRQFLAGS=y on my .config.
> > 
> Well, you look to be correct, although I'm not sure why you're replying to this
> thread to note the condition.  This patch series doesn't change any of that
> code (although it does make use of the existing function).  This problem could
> just as easily happen to any driver that returns NETDEV_TX_BUSY in response to a
> netpoll transmit, or anytime a netpoll gets blocked because the xmit_lock is
> already held or the tx queue is stopped.  Can you please write a patch to fix
> it?

Hm, right. I had netconsole disabled before, so I didn't notice this until
now, while testing your patch series.

Other than that, the patches look okay and work in my tests.
Nice work, thanks.

Acked-by: Flavio Leitner <fleitner@redhat.com>

-- 
Flavio


* Re: [PATCH 2/3] cxgb4: function namespace cleanup (v2)
From: Dimitris Michailidis @ 2010-10-16  1:11 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Divy Le Ray, David S. Miller, Casey Leedom, netdev, Steve Wise
In-Reply-To: <20101015171057.409db0f2@nehalam>

Stephen Hemminger wrote:
> Make functions only used in one file local.
> Remove lots of dead code, relating to unsupported functions
> in mainline driver like RSS, IPv6, and TCP offload.

Thanks, this looks OK.  One exception: cxgb4_get_tcp_stats was intended to 
be used by the RDMA driver.  I see that the driver doesn't call it at 
present, but if you don't mind, can we give Steve a few hours to tell us 
whether he has any imminent plans to use it?  If he doesn't offer to make 
use of it for .37, it goes.

> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> ---
> Patch against version in linux-next. It keeps the hooks
> necessary for iscsi.
> 
>  drivers/net/cxgb4/cxgb4.h      |   17 -
>  drivers/net/cxgb4/cxgb4_main.c |   78 --------
>  drivers/net/cxgb4/cxgb4_uld.h  |    8 
>  drivers/net/cxgb4/l2t.c        |   34 ---
>  drivers/net/cxgb4/l2t.h        |    3 
>  drivers/net/cxgb4/t4_hw.c      |  394 -----------------------------------------
>  6 files changed, 3 insertions(+), 531 deletions(-)
> 
> --- a/drivers/net/cxgb4/cxgb4_main.c	2010-10-15 16:58:22.899742447 -0700
> +++ b/drivers/net/cxgb4/cxgb4_main.c	2010-10-15 17:02:22.336255220 -0700
> @@ -880,7 +880,7 @@ void *t4_alloc_mem(size_t size)
>  /*
>   * Free memory allocated through alloc_mem().
>   */
> -void t4_free_mem(void *addr)
> +static void t4_free_mem(void *addr)
>  {
>  	if (is_vmalloc_addr(addr))
>  		vfree(addr);
> @@ -2207,8 +2207,8 @@ static void mk_tid_release(struct sk_buf
>   * Queue a TID release request and if necessary schedule a work queue to
>   * process it.
>   */
> -void cxgb4_queue_tid_release(struct tid_info *t, unsigned int chan,
> -			     unsigned int tid)
> +static void cxgb4_queue_tid_release(struct tid_info *t, unsigned int chan,
> +				    unsigned int tid)
>  {
>  	void **p = &t->tid_tab[tid];
>  	struct adapter *adap = container_of(t, struct adapter, tids);
> @@ -2223,7 +2223,6 @@ void cxgb4_queue_tid_release(struct tid_
>  	}
>  	spin_unlock_bh(&adap->tid_release_lock);
>  }
> -EXPORT_SYMBOL(cxgb4_queue_tid_release);
>  
>  /*
>   * Process the list of pending TID release requests.
> @@ -2356,48 +2355,6 @@ int cxgb4_create_server(const struct net
>  EXPORT_SYMBOL(cxgb4_create_server);
>  
>  /**
> - *	cxgb4_create_server6 - create an IPv6 server
> - *	@dev: the device
> - *	@stid: the server TID
> - *	@sip: local IPv6 address to bind server to
> - *	@sport: the server's TCP port
> - *	@queue: queue to direct messages from this server to
> - *
> - *	Create an IPv6 server for the given port and address.
> - *	Returns <0 on error and one of the %NET_XMIT_* values on success.
> - */
> -int cxgb4_create_server6(const struct net_device *dev, unsigned int stid,
> -			 const struct in6_addr *sip, __be16 sport,
> -			 unsigned int queue)
> -{
> -	unsigned int chan;
> -	struct sk_buff *skb;
> -	struct adapter *adap;
> -	struct cpl_pass_open_req6 *req;
> -
> -	skb = alloc_skb(sizeof(*req), GFP_KERNEL);
> -	if (!skb)
> -		return -ENOMEM;
> -
> -	adap = netdev2adap(dev);
> -	req = (struct cpl_pass_open_req6 *)__skb_put(skb, sizeof(*req));
> -	INIT_TP_WR(req, 0);
> -	OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_PASS_OPEN_REQ6, stid));
> -	req->local_port = sport;
> -	req->peer_port = htons(0);
> -	req->local_ip_hi = *(__be64 *)(sip->s6_addr);
> -	req->local_ip_lo = *(__be64 *)(sip->s6_addr + 8);
> -	req->peer_ip_hi = cpu_to_be64(0);
> -	req->peer_ip_lo = cpu_to_be64(0);
> -	chan = rxq_to_chan(&adap->sge, queue);
> -	req->opt0 = cpu_to_be64(TX_CHAN(chan));
> -	req->opt1 = cpu_to_be64(CONN_POLICY_ASK |
> -				SYN_RSS_ENABLE | SYN_RSS_QUEUE(queue));
> -	return t4_mgmt_tx(adap, skb);
> -}
> -EXPORT_SYMBOL(cxgb4_create_server6);
> -
> -/**
>   *	cxgb4_best_mtu - find the entry in the MTU table closest to an MTU
>   *	@mtus: the HW MTU table
>   *	@mtu: the target MTU
> @@ -2456,35 +2413,6 @@ unsigned int cxgb4_port_idx(const struct
>  }
>  EXPORT_SYMBOL(cxgb4_port_idx);
>  
> -/**
> - *	cxgb4_netdev_by_hwid - return the net device of a HW port
> - *	@pdev: identifies the adapter
> - *	@id: the HW port id
> - *
> - *	Return the net device associated with the interface with the given HW
> - *	id.
> - */
> -struct net_device *cxgb4_netdev_by_hwid(struct pci_dev *pdev, unsigned int id)
> -{
> -	const struct adapter *adap = pci_get_drvdata(pdev);
> -
> -	if (!adap || id >= NCHAN)
> -		return NULL;
> -	id = adap->chan_map[id];
> -	return id < MAX_NPORTS ? adap->port[id] : NULL;
> -}
> -EXPORT_SYMBOL(cxgb4_netdev_by_hwid);
> -
> -void cxgb4_get_tcp_stats(struct pci_dev *pdev, struct tp_tcp_stats *v4,
> -			 struct tp_tcp_stats *v6)
> -{
> -	struct adapter *adap = pci_get_drvdata(pdev);
> -
> -	spin_lock(&adap->stats_lock);
> -	t4_tp_get_tcp_stats(adap, v4, v6);
> -	spin_unlock(&adap->stats_lock);
> -}
> -EXPORT_SYMBOL(cxgb4_get_tcp_stats);
>  
>  void cxgb4_iscsi_init(struct net_device *dev, unsigned int tag_mask,
>  		      const unsigned int *pgsz_order)
> --- a/drivers/net/cxgb4/cxgb4_uld.h	2010-10-15 16:58:22.875741593 -0700
> +++ b/drivers/net/cxgb4/cxgb4_uld.h	2010-10-15 17:07:33.695516682 -0700
> @@ -139,16 +139,11 @@ int cxgb4_alloc_stid(struct tid_info *t,
>  void cxgb4_free_atid(struct tid_info *t, unsigned int atid);
>  void cxgb4_free_stid(struct tid_info *t, unsigned int stid, int family);
>  void cxgb4_remove_tid(struct tid_info *t, unsigned int qid, unsigned int tid);
> -void cxgb4_queue_tid_release(struct tid_info *t, unsigned int chan,
> -			     unsigned int tid);
>  
>  struct in6_addr;
>  
>  int cxgb4_create_server(const struct net_device *dev, unsigned int stid,
>  			__be32 sip, __be16 sport, unsigned int queue);
> -int cxgb4_create_server6(const struct net_device *dev, unsigned int stid,
> -			 const struct in6_addr *sip, __be16 sport,
> -			 unsigned int queue);
>  
>  static inline void set_wr_txq(struct sk_buff *skb, int prio, int queue)
>  {
> @@ -233,11 +228,8 @@ int cxgb4_ofld_send(struct net_device *d
>  unsigned int cxgb4_port_chan(const struct net_device *dev);
>  unsigned int cxgb4_port_viid(const struct net_device *dev);
>  unsigned int cxgb4_port_idx(const struct net_device *dev);
> -struct net_device *cxgb4_netdev_by_hwid(struct pci_dev *pdev, unsigned int id);
>  unsigned int cxgb4_best_mtu(const unsigned short *mtus, unsigned short mtu,
>  			    unsigned int *idx);
> -void cxgb4_get_tcp_stats(struct pci_dev *pdev, struct tp_tcp_stats *v4,
> -			 struct tp_tcp_stats *v6);
>  void cxgb4_iscsi_init(struct net_device *dev, unsigned int tag_mask,
>  		      const unsigned int *pgsz_order);
>  struct sk_buff *cxgb4_pktgl_to_skb(const struct pkt_gl *gl,
> --- a/drivers/net/cxgb4/cxgb4.h	2010-10-15 16:58:22.847740597 -0700
> +++ b/drivers/net/cxgb4/cxgb4.h	2010-10-15 17:00:42.852719427 -0700
> @@ -592,7 +592,6 @@ void t4_os_portmod_changed(const struct
>  void t4_os_link_changed(struct adapter *adap, int port_id, int link_stat);
>  
>  void *t4_alloc_mem(size_t size);
> -void t4_free_mem(void *addr);
>  
>  void t4_free_sge_resources(struct adapter *adap);
>  irq_handler_t t4_intr_handler(struct adapter *adap);
> @@ -651,7 +650,6 @@ static inline int t4_wr_mbox_ns(struct a
>  
>  void t4_intr_enable(struct adapter *adapter);
>  void t4_intr_disable(struct adapter *adapter);
> -void t4_intr_clear(struct adapter *adapter);
>  int t4_slow_intr_handler(struct adapter *adapter);
>  
>  int t4_wait_dev_ready(struct adapter *adap);
> @@ -664,26 +662,16 @@ int t4_check_fw_version(struct adapter *
>  int t4_prep_adapter(struct adapter *adapter);
>  int t4_port_init(struct adapter *adap, int mbox, int pf, int vf);
>  void t4_fatal_err(struct adapter *adapter);
> -int t4_set_trace_filter(struct adapter *adapter, const struct trace_params *tp,
> -			int filter_index, int enable);
> -void t4_get_trace_filter(struct adapter *adapter, struct trace_params *tp,
> -			 int filter_index, int *enabled);
>  int t4_config_rss_range(struct adapter *adapter, int mbox, unsigned int viid,
>  			int start, int n, const u16 *rspq, unsigned int nrspq);
>  int t4_config_glbl_rss(struct adapter *adapter, int mbox, unsigned int mode,
>  		       unsigned int flags);
> -int t4_read_rss(struct adapter *adapter, u16 *entries);
>  int t4_mc_read(struct adapter *adap, u32 addr, __be32 *data, u64 *parity);
>  int t4_edc_read(struct adapter *adap, int idx, u32 addr, __be32 *data,
>  		u64 *parity);
>  
>  void t4_get_port_stats(struct adapter *adap, int idx, struct port_stats *p);
> -void t4_get_lb_stats(struct adapter *adap, int idx, struct lb_port_stats *p);
> -
>  void t4_read_mtu_tbl(struct adapter *adap, u16 *mtus, u8 *mtu_log);
> -void t4_tp_get_err_stats(struct adapter *adap, struct tp_err_stats *st);
> -void t4_tp_get_tcp_stats(struct adapter *adap, struct tp_tcp_stats *v4,
> -			 struct tp_tcp_stats *v6);
>  void t4_load_mtus(struct adapter *adap, const unsigned short *mtus,
>  		  const unsigned short *alpha, const unsigned short *beta);
>  
> @@ -711,8 +699,6 @@ int t4_cfg_pfvf(struct adapter *adap, un
>  int t4_alloc_vi(struct adapter *adap, unsigned int mbox, unsigned int port,
>  		unsigned int pf, unsigned int vf, unsigned int nmac, u8 *mac,
>  		unsigned int *rss_size);
> -int t4_free_vi(struct adapter *adap, unsigned int mbox, unsigned int pf,
> -	       unsigned int vf, unsigned int viid);
>  int t4_set_rxmode(struct adapter *adap, unsigned int mbox, unsigned int viid,
>  		int mtu, int promisc, int all_multi, int bcast, int vlanex,
>  		bool sleep_ok);
> @@ -731,9 +717,6 @@ int t4_mdio_rd(struct adapter *adap, uns
>  	       unsigned int mmd, unsigned int reg, u16 *valp);
>  int t4_mdio_wr(struct adapter *adap, unsigned int mbox, unsigned int phy_addr,
>  	       unsigned int mmd, unsigned int reg, u16 val);
> -int t4_iq_start_stop(struct adapter *adap, unsigned int mbox, bool start,
> -		     unsigned int pf, unsigned int vf, unsigned int iqid,
> -		     unsigned int fl0id, unsigned int fl1id);
>  int t4_iq_free(struct adapter *adap, unsigned int mbox, unsigned int pf,
>  	       unsigned int vf, unsigned int iqtype, unsigned int iqid,
>  	       unsigned int fl0id, unsigned int fl1id);
> --- a/drivers/net/cxgb4/l2t.c	2010-10-15 16:58:22.883741877 -0700
> +++ b/drivers/net/cxgb4/l2t.c	2010-10-15 17:00:42.856719569 -0700
> @@ -481,40 +481,6 @@ void t4_l2t_update(struct adapter *adap,
>  		handle_failed_resolution(adap, arpq);
>  }
>  
> -/*
> - * Allocate an L2T entry for use by a switching rule.  Such entries need to be
> - * explicitly freed and while busy they are not on any hash chain, so normal
> - * address resolution updates do not see them.
> - */
> -struct l2t_entry *t4_l2t_alloc_switching(struct l2t_data *d)
> -{
> -	struct l2t_entry *e;
> -
> -	write_lock_bh(&d->lock);
> -	e = alloc_l2e(d);
> -	if (e) {
> -		spin_lock(&e->lock);          /* avoid race with t4_l2t_free */
> -		e->state = L2T_STATE_SWITCHING;
> -		atomic_set(&e->refcnt, 1);
> -		spin_unlock(&e->lock);
> -	}
> -	write_unlock_bh(&d->lock);
> -	return e;
> -}
> -
> -/*
> - * Sets/updates the contents of a switching L2T entry that has been allocated
> - * with an earlier call to @t4_l2t_alloc_switching.
> - */
> -int t4_l2t_set_switching(struct adapter *adap, struct l2t_entry *e, u16 vlan,
> -			 u8 port, u8 *eth_addr)
> -{
> -	e->vlan = vlan;
> -	e->lport = port;
> -	memcpy(e->dmac, eth_addr, ETH_ALEN);
> -	return write_l2e(adap, e, 0);
> -}
> -
>  struct l2t_data *t4_init_l2t(void)
>  {
>  	int i;
> --- a/drivers/net/cxgb4/l2t.h	2010-10-15 16:58:22.855740881 -0700
> +++ b/drivers/net/cxgb4/l2t.h	2010-10-15 17:00:42.856719569 -0700
> @@ -100,9 +100,6 @@ struct l2t_entry *cxgb4_l2t_get(struct l
>  				unsigned int priority);
>  
>  void t4_l2t_update(struct adapter *adap, struct neighbour *neigh);
> -struct l2t_entry *t4_l2t_alloc_switching(struct l2t_data *d);
> -int t4_l2t_set_switching(struct adapter *adap, struct l2t_entry *e, u16 vlan,
> -			 u8 port, u8 *eth_addr);
>  struct l2t_data *t4_init_l2t(void);
>  void do_l2t_write_rpl(struct adapter *p, const struct cpl_l2t_write_rpl *rpl);
>  
> --- a/drivers/net/cxgb4/t4_hw.c	2010-10-15 16:58:22.839740313 -0700
> +++ b/drivers/net/cxgb4/t4_hw.c	2010-10-15 17:00:42.860719711 -0700
> @@ -97,53 +97,6 @@ void t4_set_reg_field(struct adapter *ad
>  	(void) t4_read_reg(adapter, addr);      /* flush */
>  }
>  
> -/**
> - *	t4_read_indirect - read indirectly addressed registers
> - *	@adap: the adapter
> - *	@addr_reg: register holding the indirect address
> - *	@data_reg: register holding the value of the indirect register
> - *	@vals: where the read register values are stored
> - *	@nregs: how many indirect registers to read
> - *	@start_idx: index of first indirect register to read
> - *
> - *	Reads registers that are accessed indirectly through an address/data
> - *	register pair.
> - */
> -static void t4_read_indirect(struct adapter *adap, unsigned int addr_reg,
> -			     unsigned int data_reg, u32 *vals,
> -			     unsigned int nregs, unsigned int start_idx)
> -{
> -	while (nregs--) {
> -		t4_write_reg(adap, addr_reg, start_idx);
> -		*vals++ = t4_read_reg(adap, data_reg);
> -		start_idx++;
> -	}
> -}
> -
> -#if 0
> -/**
> - *	t4_write_indirect - write indirectly addressed registers
> - *	@adap: the adapter
> - *	@addr_reg: register holding the indirect addresses
> - *	@data_reg: register holding the value for the indirect registers
> - *	@vals: values to write
> - *	@nregs: how many indirect registers to write
> - *	@start_idx: address of first indirect register to write
> - *
> - *	Writes a sequential block of registers that are accessed indirectly
> - *	through an address/data register pair.
> - */
> -static void t4_write_indirect(struct adapter *adap, unsigned int addr_reg,
> -			      unsigned int data_reg, const u32 *vals,
> -			      unsigned int nregs, unsigned int start_idx)
> -{
> -	while (nregs--) {
> -		t4_write_reg(adap, addr_reg, start_idx++);
> -		t4_write_reg(adap, data_reg, *vals++);
> -	}
> -}
> -#endif
> -
>  /*
>   * Get the reply to a mailbox command and store it in @rpl in big-endian order.
>   */
> @@ -1560,44 +1513,6 @@ void t4_intr_disable(struct adapter *ada
>  }
>  
>  /**
> - *	t4_intr_clear - clear all interrupts
> - *	@adapter: the adapter whose interrupts should be cleared
> - *
> - *	Clears all interrupts.  The caller must be a PCI function managing
> - *	global interrupts.
> - */
> -void t4_intr_clear(struct adapter *adapter)
> -{
> -	static const unsigned int cause_reg[] = {
> -		SGE_INT_CAUSE1, SGE_INT_CAUSE2, SGE_INT_CAUSE3,
> -		PCIE_CORE_UTL_SYSTEM_BUS_AGENT_STATUS,
> -		PCIE_CORE_UTL_PCI_EXPRESS_PORT_STATUS,
> -		PCIE_NONFAT_ERR, PCIE_INT_CAUSE,
> -		MC_INT_CAUSE,
> -		MA_INT_WRAP_STATUS, MA_PARITY_ERROR_STATUS, MA_INT_CAUSE,
> -		EDC_INT_CAUSE, EDC_REG(EDC_INT_CAUSE, 1),
> -		CIM_HOST_INT_CAUSE, CIM_HOST_UPACC_INT_CAUSE,
> -		MYPF_REG(CIM_PF_HOST_INT_CAUSE),
> -		TP_INT_CAUSE,
> -		ULP_RX_INT_CAUSE, ULP_TX_INT_CAUSE,
> -		PM_RX_INT_CAUSE, PM_TX_INT_CAUSE,
> -		MPS_RX_PERR_INT_CAUSE,
> -		CPL_INTR_CAUSE,
> -		MYPF_REG(PL_PF_INT_CAUSE),
> -		PL_PL_INT_CAUSE,
> -		LE_DB_INT_CAUSE,
> -	};
> -
> -	unsigned int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(cause_reg); ++i)
> -		t4_write_reg(adapter, cause_reg[i], 0xffffffff);
> -
> -	t4_write_reg(adapter, PL_INT_CAUSE, GLBL_INTR_MASK);
> -	(void) t4_read_reg(adapter, PL_INT_CAUSE);          /* flush */
> -}
> -
> -/**
>   *	hash_mac_addr - return the hash value of a MAC address
>   *	@addr: the 48-bit Ethernet MAC address
>   *
> @@ -1709,98 +1624,6 @@ int t4_config_glbl_rss(struct adapter *a
>  	return t4_wr_mbox(adapter, mbox, &c, sizeof(c), NULL);
>  }
>  
> -/* Read an RSS table row */
> -static int rd_rss_row(struct adapter *adap, int row, u32 *val)
> -{
> -	t4_write_reg(adap, TP_RSS_LKP_TABLE, 0xfff00000 | row);
> -	return t4_wait_op_done_val(adap, TP_RSS_LKP_TABLE, LKPTBLROWVLD, 1,
> -				   5, 0, val);
> -}
> -
> -/**
> - *	t4_read_rss - read the contents of the RSS mapping table
> - *	@adapter: the adapter
> - *	@map: holds the contents of the RSS mapping table
> - *
> - *	Reads the contents of the RSS hash->queue mapping table.
> - */
> -int t4_read_rss(struct adapter *adapter, u16 *map)
> -{
> -	u32 val;
> -	int i, ret;
> -
> -	for (i = 0; i < RSS_NENTRIES / 2; ++i) {
> -		ret = rd_rss_row(adapter, i, &val);
> -		if (ret)
> -			return ret;
> -		*map++ = LKPTBLQUEUE0_GET(val);
> -		*map++ = LKPTBLQUEUE1_GET(val);
> -	}
> -	return 0;
> -}
> -
> -/**
> - *	t4_tp_get_tcp_stats - read TP's TCP MIB counters
> - *	@adap: the adapter
> - *	@v4: holds the TCP/IP counter values
> - *	@v6: holds the TCP/IPv6 counter values
> - *
> - *	Returns the values of TP's TCP/IP and TCP/IPv6 MIB counters.
> - *	Either @v4 or @v6 may be %NULL to skip the corresponding stats.
> - */
> -void t4_tp_get_tcp_stats(struct adapter *adap, struct tp_tcp_stats *v4,
> -			 struct tp_tcp_stats *v6)
> -{
> -	u32 val[TP_MIB_TCP_RXT_SEG_LO - TP_MIB_TCP_OUT_RST + 1];
> -
> -#define STAT_IDX(x) ((TP_MIB_TCP_##x) - TP_MIB_TCP_OUT_RST)
> -#define STAT(x)     val[STAT_IDX(x)]
> -#define STAT64(x)   (((u64)STAT(x##_HI) << 32) | STAT(x##_LO))
> -
> -	if (v4) {
> -		t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, val,
> -				 ARRAY_SIZE(val), TP_MIB_TCP_OUT_RST);
> -		v4->tcpOutRsts = STAT(OUT_RST);
> -		v4->tcpInSegs  = STAT64(IN_SEG);
> -		v4->tcpOutSegs = STAT64(OUT_SEG);
> -		v4->tcpRetransSegs = STAT64(RXT_SEG);
> -	}
> -	if (v6) {
> -		t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, val,
> -				 ARRAY_SIZE(val), TP_MIB_TCP_V6OUT_RST);
> -		v6->tcpOutRsts = STAT(OUT_RST);
> -		v6->tcpInSegs  = STAT64(IN_SEG);
> -		v6->tcpOutSegs = STAT64(OUT_SEG);
> -		v6->tcpRetransSegs = STAT64(RXT_SEG);
> -	}
> -#undef STAT64
> -#undef STAT
> -#undef STAT_IDX
> -}
> -
> -/**
> - *	t4_tp_get_err_stats - read TP's error MIB counters
> - *	@adap: the adapter
> - *	@st: holds the counter values
> - *
> - *	Returns the values of TP's error counters.
> - */
> -void t4_tp_get_err_stats(struct adapter *adap, struct tp_err_stats *st)
> -{
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->macInErrs,
> -			 12, TP_MIB_MAC_IN_ERR_0);
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->tnlCongDrops,
> -			 8, TP_MIB_TNL_CNG_DROP_0);
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->tnlTxDrops,
> -			 4, TP_MIB_TNL_DROP_0);
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->ofldVlanDrops,
> -			 4, TP_MIB_OFD_VLN_DROP_0);
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, st->tcp6InErrs,
> -			 4, TP_MIB_TCP_V6IN_ERR_0);
> -	t4_read_indirect(adap, TP_MIB_INDEX, TP_MIB_DATA, &st->ofldNoNeigh,
> -			 2, TP_MIB_OFD_ARP_DROP);
> -}
> -
>  /**
>   *	t4_read_mtu_tbl - returns the values in the HW path MTU table
>   *	@adap: the adapter
> @@ -1916,122 +1739,6 @@ void t4_load_mtus(struct adapter *adap,
>  }
>  
>  /**
> - *	t4_set_trace_filter - configure one of the tracing filters
> - *	@adap: the adapter
> - *	@tp: the desired trace filter parameters
> - *	@idx: which filter to configure
> - *	@enable: whether to enable or disable the filter
> - *
> - *	Configures one of the tracing filters available in HW.  If @enable is
> - *	%0 @tp is not examined and may be %NULL.
> - */
> -int t4_set_trace_filter(struct adapter *adap, const struct trace_params *tp,
> -			int idx, int enable)
> -{
> -	int i, ofst = idx * 4;
> -	u32 data_reg, mask_reg, cfg;
> -	u32 multitrc = TRCMULTIFILTER;
> -
> -	if (!enable) {
> -		t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst, 0);
> -		goto out;
> -	}
> -
> -	if (tp->port > 11 || tp->invert > 1 || tp->skip_len > 0x1f ||
> -	    tp->skip_ofst > 0x1f || tp->min_len > 0x1ff ||
> -	    tp->snap_len > 9600 || (idx && tp->snap_len > 256))
> -		return -EINVAL;
> -
> -	if (tp->snap_len > 256) {            /* must be tracer 0 */
> -		if ((t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + 4) |
> -		     t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + 8) |
> -		     t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + 12)) & TFEN)
> -			return -EINVAL;  /* other tracers are enabled */
> -		multitrc = 0;
> -	} else if (idx) {
> -		i = t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_B);
> -		if (TFCAPTUREMAX_GET(i) > 256 &&
> -		    (t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A) & TFEN))
> -			return -EINVAL;
> -	}
> -
> -	/* stop the tracer we'll be changing */
> -	t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst, 0);
> -
> -	/* disable tracing globally if running in the wrong single/multi mode */
> -	cfg = t4_read_reg(adap, MPS_TRC_CFG);
> -	if ((cfg & TRCEN) && multitrc != (cfg & TRCMULTIFILTER)) {
> -		t4_write_reg(adap, MPS_TRC_CFG, cfg ^ TRCEN);
> -		t4_read_reg(adap, MPS_TRC_CFG);                  /* flush */
> -		msleep(1);
> -		if (!(t4_read_reg(adap, MPS_TRC_CFG) & TRCFIFOEMPTY))
> -			return -ETIMEDOUT;
> -	}
> -	/*
> -	 * At this point either the tracing is enabled and in the right mode or
> -	 * disabled.
> -	 */
> -
> -	idx *= (MPS_TRC_FILTER1_MATCH - MPS_TRC_FILTER0_MATCH);
> -	data_reg = MPS_TRC_FILTER0_MATCH + idx;
> -	mask_reg = MPS_TRC_FILTER0_DONT_CARE + idx;
> -
> -	for (i = 0; i < TRACE_LEN / 4; i++, data_reg += 4, mask_reg += 4) {
> -		t4_write_reg(adap, data_reg, tp->data[i]);
> -		t4_write_reg(adap, mask_reg, ~tp->mask[i]);
> -	}
> -	t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_B + ofst,
> -		     TFCAPTUREMAX(tp->snap_len) |
> -		     TFMINPKTSIZE(tp->min_len));
> -	t4_write_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst,
> -		     TFOFFSET(tp->skip_ofst) | TFLENGTH(tp->skip_len) |
> -		     TFPORT(tp->port) | TFEN |
> -		     (tp->invert ? TFINVERTMATCH : 0));
> -
> -	cfg &= ~TRCMULTIFILTER;
> -	t4_write_reg(adap, MPS_TRC_CFG, cfg | TRCEN | multitrc);
> -out:	t4_read_reg(adap, MPS_TRC_CFG);  /* flush */
> -	return 0;
> -}
> -
> -/**
> - *	t4_get_trace_filter - query one of the tracing filters
> - *	@adap: the adapter
> - *	@tp: the current trace filter parameters
> - *	@idx: which trace filter to query
> - *	@enabled: non-zero if the filter is enabled
> - *
> - *	Returns the current settings of one of the HW tracing filters.
> - */
> -void t4_get_trace_filter(struct adapter *adap, struct trace_params *tp, int idx,
> -			 int *enabled)
> -{
> -	u32 ctla, ctlb;
> -	int i, ofst = idx * 4;
> -	u32 data_reg, mask_reg;
> -
> -	ctla = t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_A + ofst);
> -	ctlb = t4_read_reg(adap, MPS_TRC_FILTER_MATCH_CTL_B + ofst);
> -
> -	*enabled = !!(ctla & TFEN);
> -	tp->snap_len = TFCAPTUREMAX_GET(ctlb);
> -	tp->min_len = TFMINPKTSIZE_GET(ctlb);
> -	tp->skip_ofst = TFOFFSET_GET(ctla);
> -	tp->skip_len = TFLENGTH_GET(ctla);
> -	tp->invert = !!(ctla & TFINVERTMATCH);
> -	tp->port = TFPORT_GET(ctla);
> -
> -	ofst = (MPS_TRC_FILTER1_MATCH - MPS_TRC_FILTER0_MATCH) * idx;
> -	data_reg = MPS_TRC_FILTER0_MATCH + ofst;
> -	mask_reg = MPS_TRC_FILTER0_DONT_CARE + ofst;
> -
> -	for (i = 0; i < TRACE_LEN / 4; i++, data_reg += 4, mask_reg += 4) {
> -		tp->mask[i] = ~t4_read_reg(adap, mask_reg);
> -		tp->data[i] = t4_read_reg(adap, data_reg) & tp->mask[i];
> -	}
> -}
> -
> -/**
>   *	get_mps_bg_map - return the buffer groups associated with a port
>   *	@adap: the adapter
>   *	@idx: the port index
> @@ -2133,52 +1840,6 @@ void t4_get_port_stats(struct adapter *a
>  }
>  
>  /**
> - *	t4_get_lb_stats - collect loopback port statistics
> - *	@adap: the adapter
> - *	@idx: the loopback port index
> - *	@p: the stats structure to fill
> - *
> - *	Return HW statistics for the given loopback port.
> - */
> -void t4_get_lb_stats(struct adapter *adap, int idx, struct lb_port_stats *p)
> -{
> -	u32 bgmap = get_mps_bg_map(adap, idx);
> -
> -#define GET_STAT(name) \
> -	t4_read_reg64(adap, PORT_REG(idx, MPS_PORT_STAT_LB_PORT_##name##_L))
> -#define GET_STAT_COM(name) t4_read_reg64(adap, MPS_STAT_##name##_L)
> -
> -	p->octets           = GET_STAT(BYTES);
> -	p->frames           = GET_STAT(FRAMES);
> -	p->bcast_frames     = GET_STAT(BCAST);
> -	p->mcast_frames     = GET_STAT(MCAST);
> -	p->ucast_frames     = GET_STAT(UCAST);
> -	p->error_frames     = GET_STAT(ERROR);
> -
> -	p->frames_64        = GET_STAT(64B);
> -	p->frames_65_127    = GET_STAT(65B_127B);
> -	p->frames_128_255   = GET_STAT(128B_255B);
> -	p->frames_256_511   = GET_STAT(256B_511B);
> -	p->frames_512_1023  = GET_STAT(512B_1023B);
> -	p->frames_1024_1518 = GET_STAT(1024B_1518B);
> -	p->frames_1519_max  = GET_STAT(1519B_MAX);
> -	p->drop             = t4_read_reg(adap, PORT_REG(idx,
> -					  MPS_PORT_STAT_LB_PORT_DROP_FRAMES));
> -
> -	p->ovflow0 = (bgmap & 1) ? GET_STAT_COM(RX_BG_0_LB_DROP_FRAME) : 0;
> -	p->ovflow1 = (bgmap & 2) ? GET_STAT_COM(RX_BG_1_LB_DROP_FRAME) : 0;
> -	p->ovflow2 = (bgmap & 4) ? GET_STAT_COM(RX_BG_2_LB_DROP_FRAME) : 0;
> -	p->ovflow3 = (bgmap & 8) ? GET_STAT_COM(RX_BG_3_LB_DROP_FRAME) : 0;
> -	p->trunc0 = (bgmap & 1) ? GET_STAT_COM(RX_BG_0_LB_TRUNC_FRAME) : 0;
> -	p->trunc1 = (bgmap & 2) ? GET_STAT_COM(RX_BG_1_LB_TRUNC_FRAME) : 0;
> -	p->trunc2 = (bgmap & 4) ? GET_STAT_COM(RX_BG_2_LB_TRUNC_FRAME) : 0;
> -	p->trunc3 = (bgmap & 8) ? GET_STAT_COM(RX_BG_3_LB_TRUNC_FRAME) : 0;
> -
> -#undef GET_STAT
> -#undef GET_STAT_COM
> -}
> -
> -/**
>   *	t4_wol_magic_enable - enable/disable magic packet WoL
>   *	@adap: the adapter
>   *	@port: the physical port index
> @@ -2584,30 +2245,6 @@ int t4_alloc_vi(struct adapter *adap, un
>  }
>  
>  /**
> - *	t4_free_vi - free a virtual interface
> - *	@adap: the adapter
> - *	@mbox: mailbox to use for the FW command
> - *	@pf: the PF owning the VI
> - *	@vf: the VF owning the VI
> - *	@viid: virtual interface identifiler
> - *
> - *	Free a previously allocated virtual interface.
> - */
> -int t4_free_vi(struct adapter *adap, unsigned int mbox, unsigned int pf,
> -	       unsigned int vf, unsigned int viid)
> -{
> -	struct fw_vi_cmd c;
> -
> -	memset(&c, 0, sizeof(c));
> -	c.op_to_vfn = htonl(FW_CMD_OP(FW_VI_CMD) | FW_CMD_REQUEST |
> -			    FW_CMD_EXEC | FW_VI_CMD_PFN(pf) |
> -			    FW_VI_CMD_VFN(vf));
> -	c.alloc_to_len16 = htonl(FW_VI_CMD_FREE | FW_LEN16(c));
> -	c.type_viid = htons(FW_VI_CMD_VIID(viid));
> -	return t4_wr_mbox(adap, mbox, &c, sizeof(c), &c);
> -}
> -
> -/**
>   *	t4_set_rxmode - set Rx properties of a virtual interface
>   *	@adap: the adapter
>   *	@mbox: mailbox to use for the FW command
> @@ -2832,37 +2469,6 @@ int t4_identify_port(struct adapter *ada
>  	return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL);
>  }
>  
> -/**
> - *	t4_iq_start_stop - enable/disable an ingress queue and its FLs
> - *	@adap: the adapter
> - *	@mbox: mailbox to use for the FW command
> - *	@start: %true to enable the queues, %false to disable them
> - *	@pf: the PF owning the queues
> - *	@vf: the VF owning the queues
> - *	@iqid: ingress queue id
> - *	@fl0id: FL0 queue id or 0xffff if no attached FL0
> - *	@fl1id: FL1 queue id or 0xffff if no attached FL1
> - *
> - *	Starts or stops an ingress queue and its associated FLs, if any.
> - */
> -int t4_iq_start_stop(struct adapter *adap, unsigned int mbox, bool start,
> -		     unsigned int pf, unsigned int vf, unsigned int iqid,
> -		     unsigned int fl0id, unsigned int fl1id)
> -{
> -	struct fw_iq_cmd c;
> -
> -	memset(&c, 0, sizeof(c));
> -	c.op_to_vfn = htonl(FW_CMD_OP(FW_IQ_CMD) | FW_CMD_REQUEST |
> -			    FW_CMD_EXEC | FW_IQ_CMD_PFN(pf) |
> -			    FW_IQ_CMD_VFN(vf));
> -	c.alloc_to_len16 = htonl(FW_IQ_CMD_IQSTART(start) |
> -				 FW_IQ_CMD_IQSTOP(!start) | FW_LEN16(c));
> -	c.iqid = htons(iqid);
> -	c.fl0id = htons(fl0id);
> -	c.fl1id = htons(fl1id);
> -	return t4_wr_mbox(adap, mbox, &c, sizeof(c), NULL);
> -}
> -
>  /**
>   *	t4_iq_free - free an ingress queue and its FLs
>   *	@adap: the adapter


^ permalink raw reply

* [net-next-2.6 PATCH] igbvf: Remove unneeded pm_qos* calls
From: Jeff Kirsher @ 2010-10-16  3:26 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, bphilips, Greg Rose, Jeff Kirsher

From: Greg Rose <gregory.v.rose@intel.com>

Power Management Quality of Service is not supported or used by the VF
driver.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/igbvf/netdev.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/drivers/net/igbvf/netdev.c b/drivers/net/igbvf/netdev.c
index 2655013..6693323 100644
--- a/drivers/net/igbvf/netdev.c
+++ b/drivers/net/igbvf/netdev.c
@@ -41,14 +41,12 @@
 #include <linux/mii.h>
 #include <linux/ethtool.h>
 #include <linux/if_vlan.h>
-#include <linux/pm_qos_params.h>
 
 #include "igbvf.h"
 
 #define DRV_VERSION "1.0.0-k0"
 char igbvf_driver_name[] = "igbvf";
 const char igbvf_driver_version[] = DRV_VERSION;
-static struct pm_qos_request_list igbvf_driver_pm_qos_req;
 static const char igbvf_driver_string[] =
 				"Intel(R) Virtual Function Network Driver";
 static const char igbvf_copyright[] = "Copyright (c) 2009 Intel Corporation.";
@@ -2904,8 +2902,6 @@ static int __init igbvf_init_module(void)
 	printk(KERN_INFO "%s\n", igbvf_copyright);
 
 	ret = pci_register_driver(&igbvf_driver);
-	pm_qos_add_request(&igbvf_driver_pm_qos_req, PM_QOS_CPU_DMA_LATENCY,
-			   PM_QOS_DEFAULT_VALUE);
 
 	return ret;
 }
@@ -2920,7 +2916,6 @@ module_init(igbvf_init_module);
 static void __exit igbvf_exit_module(void)
 {
 	pci_unregister_driver(&igbvf_driver);
-	pm_qos_remove_request(&igbvf_driver_pm_qos_req);
 }
 module_exit(igbvf_exit_module);
 


^ permalink raw reply related

* [net-next-2.6 PATCH] igb: fix stats handling
From: Jeff Kirsher @ 2010-10-16  3:27 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, bphilips, Eric Dumazet, Jeff Kirsher

From: Eric Dumazet <eric.dumazet@gmail.com>

There are currently some problems with igb.

- On 32bit arches, maintaining 64bit counters without proper
synchronization between writers and readers.

- Stats updated every two seconds, as reported by Jesper.
   (Jesper provided a patch for this)

- Potential problem between worker thread and ethtool -S

This patch uses u64_stats_sync and converts everything to be 64bit safe,
SMP safe, even on 32bit arches. It integrates Jesper's idea of providing
accurate stats at the time the user reads them.
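
For readers unfamiliar with the pattern, the begin/retry loops used in this
patch follow the general sequence-counter idea. What follows is only a
userspace sketch of that idea using C11 atomics, not the kernel's
u64_stats_sync implementation; the names ring_stats, ring_stats_add and
ring_stats_read are hypothetical, and a single writer per ring is assumed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-ring counters guarded by a sequence counter. */
struct ring_stats {
	atomic_uint seq;		/* odd while the writer is mid-update */
	_Atomic uint64_t packets;
	_Atomic uint64_t bytes;
};

/* Writer side: assumes exactly one writer per ring. */
void ring_stats_add(struct ring_stats *s, uint64_t packets, uint64_t bytes)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	/* Begin: make the sequence odd before touching the counters. */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&s->packets,
		atomic_load_explicit(&s->packets, memory_order_relaxed) + packets,
		memory_order_relaxed);
	atomic_store_explicit(&s->bytes,
		atomic_load_explicit(&s->bytes, memory_order_relaxed) + bytes,
		memory_order_relaxed);

	/* End: make the sequence even again and publish the update. */
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retry until a consistent snapshot has been observed. */
void ring_stats_read(struct ring_stats *s, uint64_t *packets, uint64_t *bytes)
{
	unsigned int start, end;

	do {
		start = atomic_load_explicit(&s->seq, memory_order_acquire);
		*packets = atomic_load_explicit(&s->packets, memory_order_relaxed);
		*bytes = atomic_load_explicit(&s->bytes, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while ((start & 1) || start != end);
}
```

The reader never blocks the writer; it simply re-reads if the sequence was
odd or changed underneath it, which is why the hot path stays cheap.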

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/igb/igb.h         |    9 +++
 drivers/net/igb/igb_ethtool.c |   52 ++++++++++++++-----
 drivers/net/igb/igb_main.c    |  113 ++++++++++++++++++++++++++++++-----------
 3 files changed, 129 insertions(+), 45 deletions(-)

diff --git a/drivers/net/igb/igb.h b/drivers/net/igb/igb.h
index 44e0ff1..edab9c4 100644
--- a/drivers/net/igb/igb.h
+++ b/drivers/net/igb/igb.h
@@ -159,6 +159,7 @@ struct igb_tx_queue_stats {
 	u64 packets;
 	u64 bytes;
 	u64 restart_queue;
+	u64 restart_queue2;
 };
 
 struct igb_rx_queue_stats {
@@ -210,11 +211,14 @@ struct igb_ring {
 		/* TX */
 		struct {
 			struct igb_tx_queue_stats tx_stats;
+			struct u64_stats_sync tx_syncp;
+			struct u64_stats_sync tx_syncp2;
 			bool detect_tx_hung;
 		};
 		/* RX */
 		struct {
 			struct igb_rx_queue_stats rx_stats;
+			struct u64_stats_sync rx_syncp;
 			u32 rx_buffer_len;
 		};
 	};
@@ -288,6 +292,9 @@ struct igb_adapter {
 	struct timecompare compare;
 	struct hwtstamp_config hwtstamp_config;
 
+	spinlock_t stats64_lock;
+	struct rtnl_link_stats64 stats64;
+
 	/* structs defined in e1000_hw.h */
 	struct e1000_hw hw;
 	struct e1000_hw_stats stats;
@@ -357,7 +364,7 @@ extern netdev_tx_t igb_xmit_frame_ring_adv(struct sk_buff *, struct igb_ring *);
 extern void igb_unmap_and_free_tx_resource(struct igb_ring *,
 					   struct igb_buffer *);
 extern void igb_alloc_rx_buffers_adv(struct igb_ring *, int);
-extern void igb_update_stats(struct igb_adapter *);
+extern void igb_update_stats(struct igb_adapter *, struct rtnl_link_stats64 *);
 extern bool igb_has_link(struct igb_adapter *adapter);
 extern void igb_set_ethtool_ops(struct net_device *);
 extern void igb_power_up_link(struct igb_adapter *);
diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
index 26bf6a1..a70e16b 100644
--- a/drivers/net/igb/igb_ethtool.c
+++ b/drivers/net/igb/igb_ethtool.c
@@ -90,8 +90,8 @@ static const struct igb_stats igb_gstrings_stats[] = {
 
 #define IGB_NETDEV_STAT(_net_stat) { \
 	.stat_string = __stringify(_net_stat), \
-	.sizeof_stat = FIELD_SIZEOF(struct net_device_stats, _net_stat), \
-	.stat_offset = offsetof(struct net_device_stats, _net_stat) \
+	.sizeof_stat = FIELD_SIZEOF(struct rtnl_link_stats64, _net_stat), \
+	.stat_offset = offsetof(struct rtnl_link_stats64, _net_stat) \
 }
 static const struct igb_stats igb_gstrings_net_stats[] = {
 	IGB_NETDEV_STAT(rx_errors),
@@ -111,8 +111,9 @@ static const struct igb_stats igb_gstrings_net_stats[] = {
 	(sizeof(igb_gstrings_net_stats) / sizeof(struct igb_stats))
 #define IGB_RX_QUEUE_STATS_LEN \
 	(sizeof(struct igb_rx_queue_stats) / sizeof(u64))
-#define IGB_TX_QUEUE_STATS_LEN \
-	(sizeof(struct igb_tx_queue_stats) / sizeof(u64))
+
+#define IGB_TX_QUEUE_STATS_LEN 3 /* packets, bytes, restart_queue */
+
 #define IGB_QUEUE_STATS_LEN \
 	((((struct igb_adapter *)netdev_priv(netdev))->num_rx_queues * \
 	  IGB_RX_QUEUE_STATS_LEN) + \
@@ -2070,12 +2071,14 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
 				  struct ethtool_stats *stats, u64 *data)
 {
 	struct igb_adapter *adapter = netdev_priv(netdev);
-	struct net_device_stats *net_stats = &netdev->stats;
-	u64 *queue_stat;
-	int i, j, k;
+	struct rtnl_link_stats64 *net_stats = &adapter->stats64;
+	unsigned int start;
+	struct igb_ring *ring;
+	int i, j;
 	char *p;
 
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, net_stats);
 
 	for (i = 0; i < IGB_GLOBAL_STATS_LEN; i++) {
 		p = (char *)adapter + igb_gstrings_stats[i].stat_offset;
@@ -2088,15 +2091,36 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
 			sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
 	}
 	for (j = 0; j < adapter->num_tx_queues; j++) {
-		queue_stat = (u64 *)&adapter->tx_ring[j]->tx_stats;
-		for (k = 0; k < IGB_TX_QUEUE_STATS_LEN; k++, i++)
-			data[i] = queue_stat[k];
+		u64	restart2;
+
+		ring = adapter->tx_ring[j];
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->tx_syncp);
+			data[i]   = ring->tx_stats.packets;
+			data[i+1] = ring->tx_stats.bytes;
+			data[i+2] = ring->tx_stats.restart_queue;
+		} while (u64_stats_fetch_retry_bh(&ring->tx_syncp, start));
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->tx_syncp2);
+			restart2  = ring->tx_stats.restart_queue2;
+		} while (u64_stats_fetch_retry_bh(&ring->tx_syncp2, start));
+		data[i+2] += restart2;
+
+		i += IGB_TX_QUEUE_STATS_LEN;
 	}
 	for (j = 0; j < adapter->num_rx_queues; j++) {
-		queue_stat = (u64 *)&adapter->rx_ring[j]->rx_stats;
-		for (k = 0; k < IGB_RX_QUEUE_STATS_LEN; k++, i++)
-			data[i] = queue_stat[k];
+		ring = adapter->rx_ring[j];
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->rx_syncp);
+			data[i]   = ring->rx_stats.packets;
+			data[i+1] = ring->rx_stats.bytes;
+			data[i+2] = ring->rx_stats.drops;
+			data[i+3] = ring->rx_stats.csum_err;
+			data[i+4] = ring->rx_stats.alloc_failed;
+		} while (u64_stats_fetch_retry_bh(&ring->rx_syncp, start));
+		i += IGB_RX_QUEUE_STATS_LEN;
 	}
+	spin_unlock(&adapter->stats64_lock);
 }
 
 static void igb_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 5b04eff..b8dccc0 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -96,7 +96,6 @@ static int igb_setup_all_rx_resources(struct igb_adapter *);
 static void igb_free_all_tx_resources(struct igb_adapter *);
 static void igb_free_all_rx_resources(struct igb_adapter *);
 static void igb_setup_mrqc(struct igb_adapter *);
-void igb_update_stats(struct igb_adapter *);
 static int igb_probe(struct pci_dev *, const struct pci_device_id *);
 static void __devexit igb_remove(struct pci_dev *pdev);
 static int igb_sw_init(struct igb_adapter *);
@@ -113,7 +112,8 @@ static void igb_update_phy_info(unsigned long);
 static void igb_watchdog(unsigned long);
 static void igb_watchdog_task(struct work_struct *);
 static netdev_tx_t igb_xmit_frame_adv(struct sk_buff *skb, struct net_device *);
-static struct net_device_stats *igb_get_stats(struct net_device *);
+static struct rtnl_link_stats64 *igb_get_stats64(struct net_device *dev,
+						 struct rtnl_link_stats64 *stats);
 static int igb_change_mtu(struct net_device *, int);
 static int igb_set_mac(struct net_device *, void *);
 static void igb_set_uta(struct igb_adapter *adapter);
@@ -1536,7 +1536,9 @@ void igb_down(struct igb_adapter *adapter)
 	netif_carrier_off(netdev);
 
 	/* record the stats before reset*/
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	spin_unlock(&adapter->stats64_lock);
 
 	adapter->link_speed = 0;
 	adapter->link_duplex = 0;
@@ -1689,7 +1691,7 @@ static const struct net_device_ops igb_netdev_ops = {
 	.ndo_open		= igb_open,
 	.ndo_stop		= igb_close,
 	.ndo_start_xmit		= igb_xmit_frame_adv,
-	.ndo_get_stats		= igb_get_stats,
+	.ndo_get_stats64	= igb_get_stats64,
 	.ndo_set_rx_mode	= igb_set_rx_mode,
 	.ndo_set_multicast_list	= igb_set_rx_mode,
 	.ndo_set_mac_address	= igb_set_mac,
@@ -2276,6 +2278,7 @@ static int __devinit igb_sw_init(struct igb_adapter *adapter)
 	adapter->max_frame_size = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 
+	spin_lock_init(&adapter->stats64_lock);
 #ifdef CONFIG_PCI_IOV
 	if (hw->mac.type == e1000_82576)
 		adapter->vfs_allocated_count = (max_vfs > 7) ? 7 : max_vfs;
@@ -3483,7 +3486,9 @@ static void igb_watchdog_task(struct work_struct *work)
 		}
 	}
 
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	spin_unlock(&adapter->stats64_lock);
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igb_ring *tx_ring = adapter->tx_ring[i];
@@ -3550,6 +3555,8 @@ static void igb_update_ring_itr(struct igb_q_vector *q_vector)
 	int new_val = q_vector->itr_val;
 	int avg_wire_size = 0;
 	struct igb_adapter *adapter = q_vector->adapter;
+	struct igb_ring *ring;
+	unsigned int packets;
 
 	/* For non-gigabit speeds, just fix the interrupt rate at 4000
 	 * ints/sec - ITR timer value of 120 ticks.
@@ -3559,16 +3566,21 @@ static void igb_update_ring_itr(struct igb_q_vector *q_vector)
 		goto set_itr_val;
 	}
 
-	if (q_vector->rx_ring && q_vector->rx_ring->total_packets) {
-		struct igb_ring *ring = q_vector->rx_ring;
-		avg_wire_size = ring->total_bytes / ring->total_packets;
+	ring = q_vector->rx_ring;
+	if (ring) {
+		packets = ACCESS_ONCE(ring->total_packets);
+
+		if (packets)
+			avg_wire_size = ring->total_bytes / packets;
 	}
 
-	if (q_vector->tx_ring && q_vector->tx_ring->total_packets) {
-		struct igb_ring *ring = q_vector->tx_ring;
-		avg_wire_size = max_t(u32, avg_wire_size,
-		                      (ring->total_bytes /
-		                       ring->total_packets));
+	ring = q_vector->tx_ring;
+	if (ring) {
+		packets = ACCESS_ONCE(ring->total_packets);
+
+		if (packets)
+			avg_wire_size = max_t(u32, avg_wire_size,
+			                      ring->total_bytes / packets);
 	}
 
 	/* if avg_wire_size isn't set no work was done */
@@ -4077,7 +4089,11 @@ static int __igb_maybe_stop_tx(struct igb_ring *tx_ring, int size)
 
 	/* A reprieve! */
 	netif_wake_subqueue(netdev, tx_ring->queue_index);
-	tx_ring->tx_stats.restart_queue++;
+
+	u64_stats_update_begin(&tx_ring->tx_syncp2);
+	tx_ring->tx_stats.restart_queue2++;
+	u64_stats_update_end(&tx_ring->tx_syncp2);
+
 	return 0;
 }
 
@@ -4214,16 +4230,22 @@ static void igb_reset_task(struct work_struct *work)
 }
 
 /**
- * igb_get_stats - Get System Network Statistics
+ * igb_get_stats64 - Get System Network Statistics
  * @netdev: network interface device structure
+ * @stats: rtnl_link_stats64 pointer
  *
- * Returns the address of the device statistics structure.
- * The statistics are actually updated from the timer callback.
  **/
-static struct net_device_stats *igb_get_stats(struct net_device *netdev)
+static struct rtnl_link_stats64 *igb_get_stats64(struct net_device *netdev,
+						 struct rtnl_link_stats64 *stats)
 {
-	/* only return the current stats */
-	return &netdev->stats;
+	struct igb_adapter *adapter = netdev_priv(netdev);
+
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	memcpy(stats, &adapter->stats64, sizeof(*stats));
+	spin_unlock(&adapter->stats64_lock);
+
+	return stats;
 }
 
 /**
@@ -4305,15 +4327,17 @@ static int igb_change_mtu(struct net_device *netdev, int new_mtu)
  * @adapter: board private structure
  **/
 
-void igb_update_stats(struct igb_adapter *adapter)
+void igb_update_stats(struct igb_adapter *adapter,
+		      struct rtnl_link_stats64 *net_stats)
 {
-	struct net_device_stats *net_stats = igb_get_stats(adapter->netdev);
 	struct e1000_hw *hw = &adapter->hw;
 	struct pci_dev *pdev = adapter->pdev;
 	u32 reg, mpc;
 	u16 phy_tmp;
 	int i;
 	u64 bytes, packets;
+	unsigned int start;
+	u64 _bytes, _packets;
 
 #define PHY_IDLE_ERROR_COUNT_MASK 0x00FF
 
@@ -4331,10 +4355,17 @@ void igb_update_stats(struct igb_adapter *adapter)
 	for (i = 0; i < adapter->num_rx_queues; i++) {
 		u32 rqdpc_tmp = rd32(E1000_RQDPC(i)) & 0x0FFF;
 		struct igb_ring *ring = adapter->rx_ring[i];
+
 		ring->rx_stats.drops += rqdpc_tmp;
 		net_stats->rx_fifo_errors += rqdpc_tmp;
-		bytes += ring->rx_stats.bytes;
-		packets += ring->rx_stats.packets;
+
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->rx_syncp);
+			_bytes = ring->rx_stats.bytes;
+			_packets = ring->rx_stats.packets;
+		} while (u64_stats_fetch_retry_bh(&ring->rx_syncp, start));
+		bytes += _bytes;
+		packets += _packets;
 	}
 
 	net_stats->rx_bytes = bytes;
@@ -4344,8 +4375,13 @@ void igb_update_stats(struct igb_adapter *adapter)
 	packets = 0;
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igb_ring *ring = adapter->tx_ring[i];
-		bytes += ring->tx_stats.bytes;
-		packets += ring->tx_stats.packets;
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->tx_syncp);
+			_bytes = ring->tx_stats.bytes;
+			_packets = ring->tx_stats.packets;
+		} while (u64_stats_fetch_retry_bh(&ring->tx_syncp, start));
+		bytes += _bytes;
+		packets += _packets;
 	}
 	net_stats->tx_bytes = bytes;
 	net_stats->tx_packets = packets;
@@ -5397,7 +5433,10 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
 		if (__netif_subqueue_stopped(netdev, tx_ring->queue_index) &&
 		    !(test_bit(__IGB_DOWN, &adapter->state))) {
 			netif_wake_subqueue(netdev, tx_ring->queue_index);
+
+			u64_stats_update_begin(&tx_ring->tx_syncp);
 			tx_ring->tx_stats.restart_queue++;
+			u64_stats_update_end(&tx_ring->tx_syncp);
 		}
 	}
 
@@ -5437,8 +5476,10 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
 	}
 	tx_ring->total_bytes += total_bytes;
 	tx_ring->total_packets += total_packets;
+	u64_stats_update_begin(&tx_ring->tx_syncp);
 	tx_ring->tx_stats.bytes += total_bytes;
 	tx_ring->tx_stats.packets += total_packets;
+	u64_stats_update_end(&tx_ring->tx_syncp);
 	return count < tx_ring->count;
 }
 
@@ -5480,9 +5521,11 @@ static inline void igb_rx_checksum_adv(struct igb_ring *ring,
 		 * packets, (aka let the stack check the crc32c)
 		 */
 		if ((skb->len == 60) &&
-		    (ring->flags & IGB_RING_FLAG_RX_SCTP_CSUM))
+		    (ring->flags & IGB_RING_FLAG_RX_SCTP_CSUM)) {
+			u64_stats_update_begin(&ring->rx_syncp);
 			ring->rx_stats.csum_err++;
-
+			u64_stats_update_end(&ring->rx_syncp);
+		}
 		/* let the stack verify checksum errors */
 		return;
 	}
@@ -5669,8 +5712,10 @@ next_desc:
 
 	rx_ring->total_packets += total_packets;
 	rx_ring->total_bytes += total_bytes;
+	u64_stats_update_begin(&rx_ring->rx_syncp);
 	rx_ring->rx_stats.packets += total_packets;
 	rx_ring->rx_stats.bytes += total_bytes;
+	u64_stats_update_end(&rx_ring->rx_syncp);
 	return cleaned;
 }
 
@@ -5698,8 +5743,10 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 		if ((bufsz < IGB_RXBUFFER_1024) && !buffer_info->page_dma) {
 			if (!buffer_info->page) {
 				buffer_info->page = netdev_alloc_page(netdev);
-				if (!buffer_info->page) {
+				if (unlikely(!buffer_info->page)) {
+					u64_stats_update_begin(&rx_ring->rx_syncp);
 					rx_ring->rx_stats.alloc_failed++;
+					u64_stats_update_end(&rx_ring->rx_syncp);
 					goto no_buffers;
 				}
 				buffer_info->page_offset = 0;
@@ -5714,7 +5761,9 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 			if (dma_mapping_error(rx_ring->dev,
 					      buffer_info->page_dma)) {
 				buffer_info->page_dma = 0;
+				u64_stats_update_begin(&rx_ring->rx_syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_syncp);
 				goto no_buffers;
 			}
 		}
@@ -5722,8 +5771,10 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 		skb = buffer_info->skb;
 		if (!skb) {
 			skb = netdev_alloc_skb_ip_align(netdev, bufsz);
-			if (!skb) {
+			if (unlikely(!skb)) {
+				u64_stats_update_begin(&rx_ring->rx_syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_syncp);
 				goto no_buffers;
 			}
 
@@ -5737,7 +5788,9 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 			if (dma_mapping_error(rx_ring->dev,
 					      buffer_info->dma)) {
 				buffer_info->dma = 0;
+				u64_stats_update_begin(&rx_ring->rx_syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_syncp);
 				goto no_buffers;
 			}
 		}


^ permalink raw reply related

* [net-next-2.6 PATCH] e1000e: Fix for offline diag test failure at first call
From: Jeff Kirsher @ 2010-10-16  3:35 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, bphilips, Carolyn Wyborny, Bruce Allan,
	Jeff Kirsher

From: Carolyn Wyborny <carolyn.wyborny@intel.com>

Move link test call to later in the offline sequence, move the
restore settings block to afterwards and add another reset to ensure
the hardware is in a known state afterwards.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/e1000e/ethtool.c |   19 ++++++++-----------
 1 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/e1000e/ethtool.c b/drivers/net/e1000e/ethtool.c
index b7f15b3..8984d16 100644
--- a/drivers/net/e1000e/ethtool.c
+++ b/drivers/net/e1000e/ethtool.c
@@ -1717,13 +1717,6 @@ static void e1000_diag_test(struct net_device *netdev,
 
 		e_info("offline testing starting\n");
 
-		/*
-		 * Link test performed before hardware reset so autoneg doesn't
-		 * interfere with test result
-		 */
-		if (e1000_link_test(adapter, &data[4]))
-			eth_test->flags |= ETH_TEST_FL_FAILED;
-
 		if (if_running)
 			/* indicate we're in test mode */
 			dev_close(netdev);
@@ -1747,15 +1740,19 @@ static void e1000_diag_test(struct net_device *netdev,
 		if (e1000_loopback_test(adapter, &data[3]))
 			eth_test->flags |= ETH_TEST_FL_FAILED;
 
+		/* force this routine to wait until autoneg complete/timeout */
+		adapter->hw.phy.autoneg_wait_to_complete = 1;
+		e1000e_reset(adapter);
+		adapter->hw.phy.autoneg_wait_to_complete = 0;
+
+		if (e1000_link_test(adapter, &data[4]))
+			eth_test->flags |= ETH_TEST_FL_FAILED;
+
 		/* restore speed, duplex, autoneg settings */
 		adapter->hw.phy.autoneg_advertised = autoneg_advertised;
 		adapter->hw.mac.forced_speed_duplex = forced_speed_duplex;
 		adapter->hw.mac.autoneg = autoneg;
-
-		/* force this routine to wait until autoneg complete/timeout */
-		adapter->hw.phy.autoneg_wait_to_complete = 1;
 		e1000e_reset(adapter);
-		adapter->hw.phy.autoneg_wait_to_complete = 0;
 
 		clear_bit(__E1000_TESTING, &adapter->state);
 		if (if_running)


^ permalink raw reply related

* Re: RFD: OF device tree vs. PHY flags.
From: Grant Likely @ 2010-10-16  3:39 UTC (permalink / raw)
  To: David Daney; +Cc: devicetree-discuss, Netdev
In-Reply-To: <4CB89AD8.7020200@caviumnetworks.com>

Hi David,

On Fri, Oct 15, 2010 at 11:18:00AM -0700, David Daney wrote:
> I am in the process of planning a conversion of Octeon SOC platform
> code to use the OF device tree in the Linux kernel.
> 
> One issue that I have encountered, is that for some boards, we need
> to pass a non-zero flags argument to the phy_attach_direct() method.
> The value of the flags is board dependent, so it would make some
> sense to encode its value in the device tree itself.  The flags I am
> interested in control the configuration of clocking modes and status
> LED connections.
> 
> I would suggest the following:
> 
> o Add a new property to "ethernet-phy" dts bindings called
> "linux,flags".  It would contain a comma separated string of flag
> names.  Something like "led-mode1,clock-mode2".  The semantics of
> the flag names would be interpreted by the PHY driver...

It does make sense to encode board-specific properties into the device
tree.  However, it is a very bad idea to do it in this form because it
encodes Linux-specific implementation details into the hardware
description.  Instead, figure out what needs to be described about the
*hardware* and write a binding that encodes that information.  Let the
driver take care of translating the hardware description into the set
of flags that it needs to use.

Otherwise you'll end up in the situation where the phylib
implementation changes and the data in the device tree will no longer
make sense.  That is the path of pain.
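
To illustrate the split suggested above, here is a rough sketch in which the
tree describes only the hardware and the driver derives its own flags from
that description. Everything in it is hypothetical: struct phy_hw_desc, the
"link-activity" string, the EXAMPLE_PHY_FLAG_* values and
example_phy_flags_from_hw() are made up for illustration and are not part of
phylib or of any existing binding.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hardware facts a binding might describe for one PHY. */
struct phy_hw_desc {
	const char *led_wiring;		/* how the status LED is wired, or NULL */
	bool ref_clock_external;	/* board supplies the reference clock */
};

/* Hypothetical driver-private flags; these never appear in the tree. */
#define EXAMPLE_PHY_FLAG_LED_MODE1	(1u << 0)
#define EXAMPLE_PHY_FLAG_CLK_MODE2	(1u << 1)

/* The driver, not the device tree, decides which flags the hardware needs. */
uint32_t example_phy_flags_from_hw(const struct phy_hw_desc *hw)
{
	uint32_t flags = 0;

	if (hw->led_wiring && strcmp(hw->led_wiring, "link-activity") == 0)
		flags |= EXAMPLE_PHY_FLAG_LED_MODE1;
	if (hw->ref_clock_external)
		flags |= EXAMPLE_PHY_FLAG_CLK_MODE2;

	return flags;
}
```

If the flag layout changes later, only the translation function changes; the
hardware description stays valid.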

> o Add a new function pointer to struct phy_driver: u32
> (*of_parse_flags)(struct phy_device *phydev).  This would parse and
> return the flags value for the "linux,flags" property from the
> device_node associated with the particular PHY device in question.

I'm not clear why a new hook is needed.  What prevents the phy driver
from parsing the device tree data at .probe() time?  The caller of
phy_attach_direct() would then never need to be exposed to the parsing
of the tree data.

> o Modify of_phy_connect() to do something like the following:
> 
> .
> .
> .
> 	if (phy->driver && phy->driver->of_parse_flags)
> 		flags |= phy->driver->of_parse_flags(phy);
> .
> .
> .
> 
> o Perhaps add some helper functions to of_mdio.c to assist in
> parsing the "linux,flags" properties string.
> 
> o Any extra code in the PHY drivers and struct phy_driver would be
> protected by #ifdef CONFIG_OF

Yes.

^ permalink raw reply

* [PATCH] Drivers: atm: Makefile: replace the use of <module>-objs with <module>-y
From: Tracey Dent @ 2010-10-16  3:53 UTC (permalink / raw)
  To: chas; +Cc: linux-kernel, netdev, linux-atm-general, Tracey Dent

Changed <module>-objs to <module>-y in Makefile.

Signed-off-by: Tracey Dent <tdent48227@gmail.com>
---
 drivers/atm/Makefile |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/atm/Makefile b/drivers/atm/Makefile
index 62c3cc1..c6c9ee9 100644
--- a/drivers/atm/Makefile
+++ b/drivers/atm/Makefile
@@ -2,7 +2,7 @@
 # Makefile for the Linux network (ATM) device drivers.
 #
 
-fore_200e-objs	:= fore200e.o
+fore_200e-y	:= fore200e.o
 
 obj-$(CONFIG_ATM_ZATM)		+= zatm.o uPD98402.o
 obj-$(CONFIG_ATM_NICSTAR)	+= nicstar.o
-- 
1.7.3.1.104.gc752e


^ permalink raw reply related

* Re: [PATCH 2/3] cxgb4: function namespace cleanup (v2)
From: Stephen Hemminger @ 2010-10-16  4:23 UTC (permalink / raw)
  To: Dimitris Michailidis
  Cc: Divy Le Ray, David S. Miller, Casey Leedom, netdev, Steve Wise
In-Reply-To: <4CB8FBCE.3090401@chelsio.com>

On Fri, 15 Oct 2010 18:11:42 -0700
Dimitris Michailidis <dm@chelsio.com> wrote:

> Stephen Hemminger wrote:
> > Make functions only used in one file local.
> > Remove lots of dead code, relating to unsupported functions
> > in mainline driver like RSS, IPv6, and TCP offload.
> 
> Thanks, this looks OK.  One exception, cxgb4_get_tcp_stats was intended to 
> be used by the rdma driver.  I see that driver doesn't call it presently but 
> if you don't mind can we give Steve a few hours to tell us if he has any 
> imminent plans to use it.  If he doesn't offer to do something to use it for 
> .37 it goes.

The kernel source tree is not your development place holder tree.
At least #ifdef the code out for now.

^ permalink raw reply

* Re: tbf/htb qdisc limitations
From: Bill Fink @ 2010-10-16  4:51 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Eric Dumazet, Rick Jones, Steven Brudenell, netdev
In-Reply-To: <20101015220535.GA1997@del.dom.local>

On Sat, 16 Oct 2010, Jarek Poplawski wrote:

> On Fri, Oct 15, 2010 at 05:37:46PM -0400, Bill Fink wrote:
> ...
> > i7test7% tc -s -d qdisc show dev eth2
> > qdisc prio 1: root refcnt 33 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
> >  Sent 11028687119 bytes 1223828 pkt (dropped 293, overlimits 0 requeues 0) 
> >  backlog 0b 0p requeues 0 
> > qdisc tbf 10: parent 1:1 rate 8900Mbit burst 1112500b/64 mpu 0b lat 4295.0s 
> >  Sent 11028687077 bytes 1223827 pkt (dropped 293, overlimits 593 requeues 0) 
> >  backlog 0b 0p requeues 0 
> > 
> > I'm not sure how you can have so many dropped but not have
> > any TCP retransmissions (or not show up as requeues).  But
> > there's probably something basic I just don't understand
> > about how all this stuff works.
> 
> Me either, but it seems higher "limit" might help with these drops.

You were of course correct about the higher limit helping.
I finally upgraded the field system to 2.6.35, and did some
testing on the real data path of interest, which has an RTT
of about 29 ms.  I set up a rate limit of 8 Gbps using the
following commands:

tc qdisc add dev eth2 root handle 1: prio
tc qdisc add dev eth2 parent 1:1 handle 10: tbf rate 8000mbit limit 35000000 burst 20000 mtu 9000
tc filter add dev eth2 protocol ip parent 1: prio 1 u32 match ip protocol 6 0xff match ip dst 192.168.1.23 flowid 10:1

hecn-i7sl1% nuttcp -T10 -i1 -w50m 192.168.1.23
  676.3750 MB /   1.00 sec = 5673.4646 Mbps     0 retrans
  948.5625 MB /   1.00 sec = 7957.1508 Mbps     0 retrans
  948.8125 MB /   1.00 sec = 7959.5902 Mbps     0 retrans
  948.3750 MB /   1.00 sec = 7955.5382 Mbps     0 retrans
  949.0000 MB /   1.00 sec = 7960.6696 Mbps     0 retrans
  948.7500 MB /   1.00 sec = 7958.7873 Mbps     0 retrans
  948.6875 MB /   1.00 sec = 7958.0959 Mbps     0 retrans
  948.6250 MB /   1.00 sec = 7957.4205 Mbps     0 retrans
  948.7500 MB /   1.00 sec = 7958.7237 Mbps     0 retrans
  948.4375 MB /   1.00 sec = 7956.3648 Mbps     0 retrans

 9270.5625 MB /  10.09 sec = 7707.7457 Mbps 24 %TX 36 %RX 0 retrans 29.38 msRTT

hecn-i7sl1% tc -s -d qdisc show dev eth2
qdisc prio 1: root refcnt 33 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 9779476756 bytes 1084943 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
qdisc tbf 10: parent 1:1 rate 8000Mbit burst 19000b/64 mpu 0b lat 35.0ms 
 Sent 9779476756 bytes 1084943 pkt (dropped 0, overlimits 1831360 requeues 0) 
 backlog 0b 0p requeues 0 

No drops!

BTW the effective rate limit seems to be a very coarse adjustment
at these speeds.  I was seeing some data path issues at 8.9 Gbps
so I tried setting slightly lower rates such as 8.8 Gbps, 8.7 Gbps,
etc, but they still gave me an effective rate limit of about 8.9 Gbps.
It wasn't until I got down to a setting of 8 Gbps that I actually
got an effective rate limit of 8 Gbps.

Also the man page for tbf seems to be wrong/misleading about
the burst parameter.  It states:

	"If your buffer is too small, packets may be dropped because more
	tokens arrive per timer tick than fit in your bucket.  The minimum
	buffer size can be calculated by dividing the rate by HZ."

According to that, with a rate of 8 Gbps and HZ=1000, the minimum
burst should be 1000000 bytes.  But my testing shows that a burst
of just 20000 works just fine.  That's only 2 9000-byte packets
or about 20 usec of traffic at the 8 Gbps rate.  Using too large
a value for burst can actually be harmful as it allows the traffic
to temporarily exceed the desired rate limit.
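
The arithmetic above can be checked with a small sketch. The helper names
tbf_min_burst_bytes and tbf_burst_duration_ns are hypothetical and are not
part of tc or the kernel; they only restate the "rate divided by HZ" rule
from the man page and the time a given burst represents at a given rate.

```c
#include <stdint.h>

/* Minimum burst in bytes according to the man page rule "rate / HZ". */
uint64_t tbf_min_burst_bytes(uint64_t rate_bps, unsigned int hz)
{
	return rate_bps / 8 / hz;
}

/* Time, in nanoseconds, that burst_bytes of traffic takes at rate_bps. */
uint64_t tbf_burst_duration_ns(uint64_t burst_bytes, uint64_t rate_bps)
{
	return burst_bytes * 8 * 1000000000ULL / rate_bps;
}
```

With rate_bps = 8000000000 and hz = 1000 the first helper gives 1000000
bytes, and a 20000-byte burst at the same rate lasts 20000 ns, i.e. 20 usec.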

						-Thanks

						-Bill

^ permalink raw reply

* [PATCH net-2.6] net/sched: fix missing spinlock init
From: Eric Dumazet @ 2010-10-16  5:22 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

Under network load, doing :

tc qdisc del dev eth0 root

triggers :

[  167.193087] BUG: spinlock bad magic on CPU#3, udpflood/4928
[  167.193139]  lock: c15bc324, .magic: 00000000, .owner:
<none>/-1, .owner_cpu: -1
[  167.193193] Pid: 4928, comm: udpflood Not tainted
2.6.36-rc7-11417-g215340c-dirty #323
[  167.193245] Call Trace:
[  167.193292]  [<c13abaa0>] ? printk+0x18/0x20
[  167.193342]  [<c11afb53>] spin_bug+0xa3/0xf0
[  167.193389]  [<c11afcdd>] do_raw_spin_lock+0x7d/0x160
[  167.193440]  [<c1313d4e>] ? __dev_xmit_skb+0x27e/0x2b0
[  167.193496]  [<c107382b>] ? trace_hardirqs_on+0xb/0x10
[  167.193545]  [<c13ae99a>] _raw_spin_lock+0x3a/0x40
[  167.193593]  [<c1313d4e>] ? __dev_xmit_skb+0x27e/0x2b0
[  167.193641]  [<c1313d4e>] __dev_xmit_skb+0x27e/0x2b0

commit 79640a4ca695 (add additional lock to qdisc to increase
throughput) forgot to initialize the busylock of noop_qdisc and
noqueue_qdisc.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/sched/sch_generic.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 3d57681..0abcc49 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -383,6 +383,7 @@ struct Qdisc noop_qdisc = {
 	.list		=	LIST_HEAD_INIT(noop_qdisc.list),
 	.q.lock		=	__SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock),
 	.dev_queue	=	&noop_netdev_queue,
+	.busylock	=	__SPIN_LOCK_UNLOCKED(noop_qdisc.busylock),
 };
 EXPORT_SYMBOL(noop_qdisc);
 
@@ -409,6 +410,7 @@ static struct Qdisc noqueue_qdisc = {
 	.list		=	LIST_HEAD_INIT(noqueue_qdisc.list),
 	.q.lock		=	__SPIN_LOCK_UNLOCKED(noqueue_qdisc.q.lock),
 	.dev_queue	=	&noqueue_netdev_queue,
+	.busylock	=	__SPIN_LOCK_UNLOCKED(noqueue_qdisc.busylock),
 };
 
 



^ permalink raw reply related

* Re: [PATCH 2/3] cxgb4: function namespace cleanup (v2)
From: Dimitris Michailidis @ 2010-10-16  6:16 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Divy Le Ray, David S. Miller, Casey Leedom, netdev, Steve Wise
In-Reply-To: <20101015212310.47a74d86@nehalam>

Stephen Hemminger wrote:
> On Fri, 15 Oct 2010 18:11:42 -0700
> Dimitris Michailidis <dm@chelsio.com> wrote:
> 
>> Stephen Hemminger wrote:
>>> Make functions only used in one file local.
>>> Remove lots of dead code, relating to unsupported functions
>>> in mainline driver like RSS, IPv6, and TCP offload.
>> Thanks, this looks OK.  One exception, cxgb4_get_tcp_stats was intended to 
>> be used by the rdma driver.  I see that driver doesn't call it presently but 
>> if you don't mind can we give Steve a few hours to tell us if he has any 
>> imminent plans to use it.  If he doesn't offer to do something to use it for 
>> .37 it goes.
> 
> The kernel source tree is not your development place holder tree.
> At least #ifdef the code out for now.

I am trying to protect Stephen Rothwell's time by checking that the IB folks 
don't plan to add a call to this in their tree while we remove the function 
in net-next.  There's supposed to be a call in the IB driver.  I don't know 
why there isn't one or whether they are planning to fix it for .37.  I see 
the potential for a linux-next conflict and I am trying to avoid it.  #ifdef 
doesn't help; if it's not needed we can remove it for good.

^ permalink raw reply

* [PATCH] vmxnet3: remove set_flag_le{16,64} functions
From: Harvey Harrison @ 2010-10-16  7:33 UTC (permalink / raw)
  To: sbhatewara; +Cc: netdev, shemminger

Opencode the flag setting in the few places it was being done.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
---
 drivers/net/vmxnet3/vmxnet3_drv.c     |   35 +++++++-------------------------
 drivers/net/vmxnet3/vmxnet3_ethtool.c |   10 +++-----
 drivers/net/vmxnet3/vmxnet3_int.h     |    4 ---
 3 files changed, 12 insertions(+), 37 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index 198ce92..c52d259 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1548,23 +1548,6 @@ vmxnet3_free_irqs(struct vmxnet3_adapter *adapter)
 	}
 }
 
-
-inline void set_flag_le16(__le16 *data, u16 flag)
-{
-	*data = cpu_to_le16(le16_to_cpu(*data) | flag);
-}
-
-inline void set_flag_le64(__le64 *data, u64 flag)
-{
-	*data = cpu_to_le64(le64_to_cpu(*data) | flag);
-}
-
-inline void reset_flag_le64(__le64 *data, u64 flag)
-{
-	*data = cpu_to_le64(le64_to_cpu(*data) & ~flag);
-}
-
-
 static void
 vmxnet3_vlan_rx_register(struct net_device *netdev, struct vlan_group *grp)
 {
@@ -1580,8 +1563,7 @@ vmxnet3_vlan_rx_register(struct net_device *netdev, struct vlan_group *grp)
 			adapter->vlan_grp = grp;
 
 			/* update FEATURES to device */
-			set_flag_le64(&devRead->misc.uptFeatures,
-				      UPT1_F_RXVLAN);
+			devRead->misc.uptFeatures |= cpu_to_le64(UPT1_F_RXVLAN);
 			VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD,
 					       VMXNET3_CMD_UPDATE_FEATURE);
 			/*
@@ -1617,8 +1599,7 @@ vmxnet3_vlan_rx_register(struct net_device *netdev, struct vlan_group *grp)
 					       VMXNET3_CMD_UPDATE_VLAN_FILTERS);
 
 			/* update FEATURES to device */
-			reset_flag_le64(&devRead->misc.uptFeatures,
-					UPT1_F_RXVLAN);
+			devRead->misc.uptFeatures &= ~cpu_to_le64(UPT1_F_RXVLAN);
 			VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD,
 					       VMXNET3_CMD_UPDATE_FEATURE);
 		}
@@ -1779,15 +1760,15 @@ vmxnet3_setup_driver_shared(struct vmxnet3_adapter *adapter)
 
 	/* set up feature flags */
 	if (adapter->rxcsum)
-		set_flag_le64(&devRead->misc.uptFeatures, UPT1_F_RXCSUM);
+		devRead->misc.uptFeatures |= cpu_to_le64(UPT1_F_RXCSUM);
 
 	if (adapter->lro) {
-		set_flag_le64(&devRead->misc.uptFeatures, UPT1_F_LRO);
+		devRead->misc.uptFeatures |= cpu_to_le64(UPT1_F_LRO);
 		devRead->misc.maxNumRxSG = cpu_to_le16(1 + MAX_SKB_FRAGS);
 	}
 	if ((adapter->netdev->features & NETIF_F_HW_VLAN_RX) &&
 	    adapter->vlan_grp) {
-		set_flag_le64(&devRead->misc.uptFeatures, UPT1_F_RXVLAN);
+		devRead->misc.uptFeatures |= cpu_to_le64(UPT1_F_RXVLAN);
 	}
 
 	devRead->misc.mtu = cpu_to_le32(adapter->netdev->mtu);
@@ -2594,7 +2575,7 @@ vmxnet3_suspend(struct device *device)
 		memcpy(pmConf->filters[i].pattern, netdev->dev_addr, ETH_ALEN);
 		pmConf->filters[i].mask[0] = 0x3F; /* LSB ETH_ALEN bits */
 
-		set_flag_le16(&pmConf->wakeUpEvents, VMXNET3_PM_WAKEUP_FILTER);
+		pmConf->wakeUpEvents |= cpu_to_le16(VMXNET3_PM_WAKEUP_FILTER);
 		i++;
 	}
 
@@ -2636,13 +2617,13 @@ vmxnet3_suspend(struct device *device)
 		pmConf->filters[i].mask[5] = 0x03; /* IPv4 TIP */
 		in_dev_put(in_dev);
 
-		set_flag_le16(&pmConf->wakeUpEvents, VMXNET3_PM_WAKEUP_FILTER);
+		pmConf->wakeUpEvents |= cpu_to_le16(VMXNET3_PM_WAKEUP_FILTER);
 		i++;
 	}
 
 skip_arp:
 	if (adapter->wol & WAKE_MAGIC)
-		set_flag_le16(&pmConf->wakeUpEvents, VMXNET3_PM_WAKEUP_MAGIC);
+		pmConf->wakeUpEvents |= cpu_to_le16(VMXNET3_PM_WAKEUP_MAGIC);
 
 	pmConf->numFilters = i;
 
diff --git a/drivers/net/vmxnet3/vmxnet3_ethtool.c b/drivers/net/vmxnet3/vmxnet3_ethtool.c
index 7e4b5a8..79a72e8 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethtool.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethtool.c
@@ -50,13 +50,11 @@ vmxnet3_set_rx_csum(struct net_device *netdev, u32 val)
 		adapter->rxcsum = val;
 		if (netif_running(netdev)) {
 			if (val)
-				set_flag_le64(
-				&adapter->shared->devRead.misc.uptFeatures,
-				UPT1_F_RXCSUM);
+				adapter->shared->devRead.misc.uptFeatures |=
+					cpu_to_le64(UPT1_F_RXCSUM);
 			else
-				reset_flag_le64(
-				&adapter->shared->devRead.misc.uptFeatures,
-				UPT1_F_RXCSUM);
+				adapter->shared->devRead.misc.uptFeatures &= 
+					~cpu_to_le64(UPT1_F_RXCSUM);
 
 			VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD,
 					       VMXNET3_CMD_UPDATE_FEATURE);
diff --git a/drivers/net/vmxnet3/vmxnet3_int.h b/drivers/net/vmxnet3/vmxnet3_int.h
index 2121c73..46aee6d 100644
--- a/drivers/net/vmxnet3/vmxnet3_int.h
+++ b/drivers/net/vmxnet3/vmxnet3_int.h
@@ -353,10 +353,6 @@ struct vmxnet3_adapter {
 #define VMXNET3_MAX_ETH_HDR_SIZE    22
 #define VMXNET3_MAX_SKB_BUF_SIZE    (3*1024)
 
-void set_flag_le16(__le16 *data, u16 flag);
-void set_flag_le64(__le64 *data, u64 flag);
-void reset_flag_le64(__le64 *data, u64 flag);
-
 int
 vmxnet3_quiesce_dev(struct vmxnet3_adapter *adapter);
 
-- 
1.7.1
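
Why the open-coded form is equivalent can be sketched in userspace C.
This assumes a big-endian CPU, where cpu_to_le64() and le64_to_cpu() are
both a plain byte swap (on little-endian they are no-ops and the
equivalence is trivial). swab64_model() and the four helper names are
hypothetical stand-ins, not kernel APIs. A byte swap only moves bits
around, so it commutes with bitwise OR, AND and NOT.

```c
#include <stdint.h>

/* Reverse the byte order of v; models cpu_to_le64()/le64_to_cpu()
 * as they would behave on a big-endian CPU. */
uint64_t swab64_model(uint64_t v)
{
	uint64_t r = 0;
	int i;

	for (i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);	/* lowest byte of v moves up */
		v >>= 8;
	}
	return r;
}

/* Removed helper style: convert to CPU order, modify, convert back. */
uint64_t set_flag_roundtrip(uint64_t le_data, uint64_t flag)
{
	return swab64_model(swab64_model(le_data) | flag);
}

uint64_t clear_flag_roundtrip(uint64_t le_data, uint64_t flag)
{
	return swab64_model(swab64_model(le_data) & ~flag);
}

/* Open-coded style: convert only the constant flag. */
uint64_t set_flag_opencoded(uint64_t le_data, uint64_t flag)
{
	return le_data | swab64_model(flag);
}

uint64_t clear_flag_opencoded(uint64_t le_data, uint64_t flag)
{
	return le_data & ~swab64_model(flag);
}
```

Both styles return the same value for any input. Note that clearing a
flag needs "&= ~", not "|=", in the open-coded form.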


^ permalink raw reply related

* Re: [PATCH V2 net-next] can-raw: add msg_flags to distinguish local traffic
From: Kurt Van Dijck @ 2010-10-16  7:37 UTC (permalink / raw)
  To: Oliver Hartkopp; +Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w, netdev
In-Reply-To: <4CB87E37.4070908-fJ+pQTUTwRTk1uMJSBkQmQ@public.gmane.org>

Oliver,

I see my mistake in documenting the use of MSG_DONTROUTE.

I'll take over your other remarks too.

Thanks for the review,
Kurt

^ permalink raw reply

* [patch v4.1] ipvs: IPv6 tunnel mode
From: Simon Horman @ 2010-10-16  7:58 UTC (permalink / raw)
  To: lvs-devel, netdev, netfilter-devel
  Cc: Julian Anastasov, Julius Volz, Wensong Zhang,
	Hans Schillström

From: Hans Schillstrom <hans.schillstrom@ericsson.com>

ipvs: IPv6 tunnel mode

IPv6 encapsulation uses a bad source address for the tunnel.
i.e. VIP will be used as local-addr and encap. dst addr.
Decapsulation will not accept this.

Example
LVS (eth1 2003::2:0:1/96, VIP 2003::2:0:100)
   (eth0 2003::1:0:1/96)
RS  (ethX 2003::1:0:5/96)

tcpdump
2003::2:0:100 > 2003::1:0:5:
IP6 (hlim 63, next-header TCP (6) payload length: 40)
 2003::3:0:10.50991 > 2003::2:0:100.http: Flags [S], cksum 0x7312
(correct), seq 3006460279, win 5760, options [mss 1440,sackOK,TS val
1904932 ecr 0,nop,wscale 3], length 0

In the Linux IPv6 implementation you can't have a tunnel with an anycast
address receiving packets (I have not tried to interpret RFC 2473).
To have receive capabilities the tunnel must have:
 - Local address set as a multicast addr or a unicast addr
 - Remote address set as a unicast addr.
 - Loopback address or link-local address are not allowed.

This causes us to set up a tunnel in the Real Server with the
LVS as the remote address; here you can't use the VIP address since it's
used inside the tunnel.

Solution
Use outgoing interface IPv6 address (match against the destination).
i.e. use ip6_route_output() to look up the route cache and
then use ipv6_dev_get_saddr(...) to set the source address of the
encapsulated packet.

Additionally, cache the results in new destination
fields: dst_cookie and dst_saddr and properly check the
returned dst from ip6_route_output. We now add xfrm_lookup
call only for the tunneling method where the source address
is a local one.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>

---

Original patch by Hans Schillstrom.
Check dst state and cache results for IPv6 by Julian Anastasov.
Subsequent revisions made by Hans Schillstrom:

* v1
    
  This is Julian's patch with a slightly edited version of the description
  from Hans's original patch.

* v2
    
  Updated changelog as per comments from Julian

* v3
    
  Flowi dest address used as destination instead of rt6_info in
  ip_vs_tunnel_xmit_v6()
  rt6_info sometimes contains a network address instead of a tunnel

* v4
    
  Update destination as recommended from Julian
  i.e. use &cp->daddr.in6

* v4.1 Simon Horman
  Fix patch corruption

Patrick, please consider this for nf-next-2.6

Index: nf-next-2.6/include/net/ip_vs.h
===================================================================
--- nf-next-2.6.orig/include/net/ip_vs.h	2010-10-08 17:44:01.000000000 +0900
+++ nf-next-2.6/include/net/ip_vs.h	2010-10-08 17:45:13.000000000 +0900
@@ -529,6 +529,10 @@ struct ip_vs_dest {
 	spinlock_t		dst_lock;	/* lock of dst_cache */
 	struct dst_entry	*dst_cache;	/* destination cache entry */
 	u32			dst_rtos;	/* RT_TOS(tos) for dst */
+	u32			dst_cookie;
+#ifdef CONFIG_IP_VS_IPV6
+	struct in6_addr		dst_saddr;
+#endif
 
 	/* for virtual service */
 	struct ip_vs_service	*svc;		/* service it belongs to */
Index: nf-next-2.6/net/netfilter/ipvs/ip_vs_xmit.c
===================================================================
--- nf-next-2.6.orig/net/netfilter/ipvs/ip_vs_xmit.c	2010-10-08 17:44:01.000000000 +0900
+++ nf-next-2.6/net/netfilter/ipvs/ip_vs_xmit.c	2010-10-08 17:45:13.000000000 +0900
@@ -26,6 +26,7 @@
 #include <net/route.h>                  /* for ip_route_output */
 #include <net/ipv6.h>
 #include <net/ip6_route.h>
+#include <net/addrconf.h>
 #include <linux/icmpv6.h>
 #include <linux/netfilter.h>
 #include <linux/netfilter_ipv4.h>
@@ -37,26 +38,27 @@
  *      Destination cache to speed up outgoing route lookup
  */
 static inline void
-__ip_vs_dst_set(struct ip_vs_dest *dest, u32 rtos, struct dst_entry *dst)
+__ip_vs_dst_set(struct ip_vs_dest *dest, u32 rtos, struct dst_entry *dst,
+		u32 dst_cookie)
 {
 	struct dst_entry *old_dst;
 
 	old_dst = dest->dst_cache;
 	dest->dst_cache = dst;
 	dest->dst_rtos = rtos;
+	dest->dst_cookie = dst_cookie;
 	dst_release(old_dst);
 }
 
 static inline struct dst_entry *
-__ip_vs_dst_check(struct ip_vs_dest *dest, u32 rtos, u32 cookie)
+__ip_vs_dst_check(struct ip_vs_dest *dest, u32 rtos)
 {
 	struct dst_entry *dst = dest->dst_cache;
 
 	if (!dst)
 		return NULL;
-	if ((dst->obsolete
-	     || (dest->af == AF_INET && rtos != dest->dst_rtos)) &&
-	    dst->ops->check(dst, cookie) == NULL) {
+	if ((dst->obsolete || rtos != dest->dst_rtos) &&
+	    dst->ops->check(dst, dest->dst_cookie) == NULL) {
 		dest->dst_cache = NULL;
 		dst_release(dst);
 		return NULL;
@@ -66,15 +68,16 @@ __ip_vs_dst_check(struct ip_vs_dest *des
 }
 
 static struct rtable *
-__ip_vs_get_out_rt(struct ip_vs_conn *cp, u32 rtos)
+__ip_vs_get_out_rt(struct sk_buff *skb, struct ip_vs_conn *cp, u32 rtos)
 {
+	struct net *net = dev_net(skb->dev);
 	struct rtable *rt;			/* Route to the other host */
 	struct ip_vs_dest *dest = cp->dest;
 
 	if (dest) {
 		spin_lock(&dest->dst_lock);
 		if (!(rt = (struct rtable *)
-		      __ip_vs_dst_check(dest, rtos, 0))) {
+		      __ip_vs_dst_check(dest, rtos))) {
 			struct flowi fl = {
 				.oif = 0,
 				.nl_u = {
@@ -84,13 +87,13 @@ __ip_vs_get_out_rt(struct ip_vs_conn *cp
 						.tos = rtos, } },
 			};
 
-			if (ip_route_output_key(&init_net, &rt, &fl)) {
+			if (ip_route_output_key(net, &rt, &fl)) {
 				spin_unlock(&dest->dst_lock);
 				IP_VS_DBG_RL("ip_route_output error, dest: %pI4\n",
 					     &dest->addr.ip);
 				return NULL;
 			}
-			__ip_vs_dst_set(dest, rtos, dst_clone(&rt->dst));
+			__ip_vs_dst_set(dest, rtos, dst_clone(&rt->dst), 0);
 			IP_VS_DBG(10, "new dst %pI4, refcnt=%d, rtos=%X\n",
 				  &dest->addr.ip,
 				  atomic_read(&rt->dst.__refcnt), rtos);
@@ -106,7 +109,7 @@ __ip_vs_get_out_rt(struct ip_vs_conn *cp
 					.tos = rtos, } },
 		};
 
-		if (ip_route_output_key(&init_net, &rt, &fl)) {
+		if (ip_route_output_key(net, &rt, &fl)) {
 			IP_VS_DBG_RL("ip_route_output error, dest: %pI4\n",
 				     &cp->daddr.ip);
 			return NULL;
@@ -117,62 +120,79 @@ __ip_vs_get_out_rt(struct ip_vs_conn *cp
 }
 
 #ifdef CONFIG_IP_VS_IPV6
+
+static struct dst_entry *
+__ip_vs_route_output_v6(struct net *net, struct in6_addr *daddr,
+			struct in6_addr *ret_saddr, int do_xfrm)
+{
+	struct dst_entry *dst;
+	struct flowi fl = {
+		.oif = 0,
+		.nl_u = {
+			.ip6_u = {
+				.daddr = *daddr,
+			},
+		},
+	};
+
+	dst = ip6_route_output(net, NULL, &fl);
+	if (dst->error)
+		goto out_err;
+	if (!ret_saddr)
+		return dst;
+	if (ipv6_addr_any(&fl.fl6_src) &&
+	    ipv6_dev_get_saddr(net, ip6_dst_idev(dst)->dev,
+			       &fl.fl6_dst, 0, &fl.fl6_src) < 0)
+		goto out_err;
+	if (do_xfrm && xfrm_lookup(net, &dst, &fl, NULL, 0) < 0)
+		goto out_err;
+	ipv6_addr_copy(ret_saddr, &fl.fl6_src);
+	return dst;
+
+out_err:
+	dst_release(dst);
+	IP_VS_DBG_RL("ip6_route_output error, dest: %pI6\n", daddr);
+	return NULL;
+}
+
 static struct rt6_info *
-__ip_vs_get_out_rt_v6(struct ip_vs_conn *cp)
+__ip_vs_get_out_rt_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
+		      struct in6_addr *ret_saddr, int do_xfrm)
 {
+	struct net *net = dev_net(skb->dev);
 	struct rt6_info *rt;			/* Route to the other host */
 	struct ip_vs_dest *dest = cp->dest;
+	struct dst_entry *dst;
 
 	if (dest) {
 		spin_lock(&dest->dst_lock);
-		rt = (struct rt6_info *)__ip_vs_dst_check(dest, 0, 0);
+		rt = (struct rt6_info *)__ip_vs_dst_check(dest, 0);
 		if (!rt) {
-			struct flowi fl = {
-				.oif = 0,
-				.nl_u = {
-					.ip6_u = {
-						.daddr = dest->addr.in6,
-						.saddr = {
-							.s6_addr32 =
-								{ 0, 0, 0, 0 },
-						},
-					},
-				},
-			};
+			u32 cookie;
 
-			rt = (struct rt6_info *)ip6_route_output(&init_net,
-								 NULL, &fl);
-			if (!rt) {
+			dst = __ip_vs_route_output_v6(net, &dest->addr.in6,
+						      &dest->dst_saddr,
+						      do_xfrm);
+			if (!dst) {
 				spin_unlock(&dest->dst_lock);
-				IP_VS_DBG_RL("ip6_route_output error, dest: %pI6\n",
-					     &dest->addr.in6);
 				return NULL;
 			}
-			__ip_vs_dst_set(dest, 0, dst_clone(&rt->dst));
-			IP_VS_DBG(10, "new dst %pI6, refcnt=%d\n",
-				  &dest->addr.in6,
+			rt = (struct rt6_info *) dst;
+			cookie = rt->rt6i_node ? rt->rt6i_node->fn_sernum : 0;
+			__ip_vs_dst_set(dest, 0, dst_clone(&rt->dst), cookie);
+			IP_VS_DBG(10, "new dst %pI6, src %pI6, refcnt=%d\n",
+				  &dest->addr.in6, &dest->dst_saddr,
 				  atomic_read(&rt->dst.__refcnt));
 		}
+		if (ret_saddr)
+			ipv6_addr_copy(ret_saddr, &dest->dst_saddr);
 		spin_unlock(&dest->dst_lock);
 	} else {
-		struct flowi fl = {
-			.oif = 0,
-			.nl_u = {
-				.ip6_u = {
-					.daddr = cp->daddr.in6,
-					.saddr = {
-						.s6_addr32 = { 0, 0, 0, 0 },
-					},
-				},
-			},
-		};
-
-		rt = (struct rt6_info *)ip6_route_output(&init_net, NULL, &fl);
-		if (!rt) {
-			IP_VS_DBG_RL("ip6_route_output error, dest: %pI6\n",
-				     &cp->daddr.in6);
+		dst = __ip_vs_route_output_v6(net, &cp->daddr.in6, ret_saddr,
+					      do_xfrm);
+		if (!dst)
 			return NULL;
-		}
+		rt = (struct rt6_info *) dst;
 	}
 
 	return rt;
@@ -248,6 +268,7 @@ int
 ip_vs_bypass_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
 		  struct ip_vs_protocol *pp)
 {
+	struct net *net = dev_net(skb->dev);
 	struct rtable *rt;			/* Route to the other host */
 	struct iphdr  *iph = ip_hdr(skb);
 	u8     tos = iph->tos;
@@ -263,7 +284,7 @@ ip_vs_bypass_xmit(struct sk_buff *skb, s
 
 	EnterFunction(10);
 
-	if (ip_route_output_key(&init_net, &rt, &fl)) {
+	if (ip_route_output_key(net, &rt, &fl)) {
 		IP_VS_DBG_RL("%s(): ip_route_output error, dest: %pI4\n",
 			     __func__, &iph->daddr);
 		goto tx_error_icmp;
@@ -313,25 +334,18 @@ int
 ip_vs_bypass_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
 		     struct ip_vs_protocol *pp)
 {
+	struct net *net = dev_net(skb->dev);
+	struct dst_entry *dst;
 	struct rt6_info *rt;			/* Route to the other host */
 	struct ipv6hdr  *iph = ipv6_hdr(skb);
 	int    mtu;
-	struct flowi fl = {
-		.oif = 0,
-		.nl_u = {
-			.ip6_u = {
-				.daddr = iph->daddr,
-				.saddr = { .s6_addr32 = {0, 0, 0, 0} }, } },
-	};
 
 	EnterFunction(10);
 
-	rt = (struct rt6_info *)ip6_route_output(&init_net, NULL, &fl);
-	if (!rt) {
-		IP_VS_DBG_RL("%s(): ip6_route_output error, dest: %pI6\n",
-			     __func__, &iph->daddr);
+	dst = __ip_vs_route_output_v6(net, &iph->daddr, NULL, 0);
+	if (!dst)
 		goto tx_error_icmp;
-	}
+	rt = (struct rt6_info *) dst;
 
 	/* MTU checking */
 	mtu = dst_mtu(&rt->dst);
@@ -397,7 +411,7 @@ ip_vs_nat_xmit(struct sk_buff *skb, stru
 		IP_VS_DBG(10, "filled cport=%d\n", ntohs(*p));
 	}
 
-	if (!(rt = __ip_vs_get_out_rt(cp, RT_TOS(iph->tos))))
+	if (!(rt = __ip_vs_get_out_rt(skb, cp, RT_TOS(iph->tos))))
 		goto tx_error_icmp;
 
 	/* MTU checking */
@@ -472,7 +486,7 @@ ip_vs_nat_xmit_v6(struct sk_buff *skb, s
 		IP_VS_DBG(10, "filled cport=%d\n", ntohs(*p));
 	}
 
-	rt = __ip_vs_get_out_rt_v6(cp);
+	rt = __ip_vs_get_out_rt_v6(skb, cp, NULL, 0);
 	if (!rt)
 		goto tx_error_icmp;
 
@@ -557,7 +571,6 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, s
 	struct iphdr  *old_iph = ip_hdr(skb);
 	u8     tos = old_iph->tos;
 	__be16 df = old_iph->frag_off;
-	sk_buff_data_t old_transport_header = skb->transport_header;
 	struct iphdr  *iph;			/* Our new IP header */
 	unsigned int max_headroom;		/* The extra header space needed */
 	int    mtu;
@@ -572,7 +585,7 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, s
 		goto tx_error;
 	}
 
-	if (!(rt = __ip_vs_get_out_rt(cp, RT_TOS(tos))))
+	if (!(rt = __ip_vs_get_out_rt(skb, cp, RT_TOS(tos))))
 		goto tx_error_icmp;
 
 	tdev = rt->dst.dev;
@@ -616,7 +629,7 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, s
 		old_iph = ip_hdr(skb);
 	}
 
-	skb->transport_header = old_transport_header;
+	skb->transport_header = skb->network_header;
 
 	/* fix old IP header checksum */
 	ip_send_check(old_iph);
@@ -670,9 +683,9 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb
 		     struct ip_vs_protocol *pp)
 {
 	struct rt6_info *rt;		/* Route to the other host */
+	struct in6_addr saddr;		/* Source for tunnel */
 	struct net_device *tdev;	/* Device to other host */
 	struct ipv6hdr  *old_iph = ipv6_hdr(skb);
-	sk_buff_data_t old_transport_header = skb->transport_header;
 	struct ipv6hdr  *iph;		/* Our new IP header */
 	unsigned int max_headroom;	/* The extra header space needed */
 	int    mtu;
@@ -687,17 +700,17 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb
 		goto tx_error;
 	}
 
-	rt = __ip_vs_get_out_rt_v6(cp);
+	rt = __ip_vs_get_out_rt_v6(skb, cp, &saddr, 1);
 	if (!rt)
 		goto tx_error_icmp;
 
 	tdev = rt->dst.dev;
 
 	mtu = dst_mtu(&rt->dst) - sizeof(struct ipv6hdr);
-	/* TODO IPv6: do we need this check in IPv6? */
-	if (mtu < 1280) {
+	if (mtu < IPV6_MIN_MTU) {
 		dst_release(&rt->dst);
-		IP_VS_DBG_RL("%s(): mtu less than 1280\n", __func__);
+		IP_VS_DBG_RL("%s(): mtu less than %d\n", __func__,
+			     IPV6_MIN_MTU);
 		goto tx_error;
 	}
 	if (skb_dst(skb))
@@ -730,7 +743,7 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb
 		old_iph = ipv6_hdr(skb);
 	}
 
-	skb->transport_header = old_transport_header;
+	skb->transport_header = skb->network_header;
 
 	skb_push(skb, sizeof(struct ipv6hdr));
 	skb_reset_network_header(skb);
@@ -750,8 +763,8 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb
 	be16_add_cpu(&iph->payload_len, sizeof(*old_iph));
 	iph->priority		=	old_iph->priority;
 	memset(&iph->flow_lbl, 0, sizeof(iph->flow_lbl));
-	iph->daddr		=	rt->rt6i_dst.addr;
-	iph->saddr		=	cp->vaddr.in6; /* rt->rt6i_src.addr; */
+	ipv6_addr_copy(&iph->daddr, &cp->daddr.in6);
+	ipv6_addr_copy(&iph->saddr, &saddr);
 	iph->hop_limit		=	old_iph->hop_limit;
 
 	/* Another hack: avoid icmp_send in ip_fragment */
@@ -791,7 +804,7 @@ ip_vs_dr_xmit(struct sk_buff *skb, struc
 
 	EnterFunction(10);
 
-	if (!(rt = __ip_vs_get_out_rt(cp, RT_TOS(iph->tos))))
+	if (!(rt = __ip_vs_get_out_rt(skb, cp, RT_TOS(iph->tos))))
 		goto tx_error_icmp;
 
 	/* MTU checking */
@@ -843,7 +856,7 @@ ip_vs_dr_xmit_v6(struct sk_buff *skb, st
 
 	EnterFunction(10);
 
-	rt = __ip_vs_get_out_rt_v6(cp);
+	rt = __ip_vs_get_out_rt_v6(skb, cp, NULL, 0);
 	if (!rt)
 		goto tx_error_icmp;
 
@@ -919,7 +932,7 @@ ip_vs_icmp_xmit(struct sk_buff *skb, str
 	 * mangle and send the packet here (only for VS/NAT)
 	 */
 
-	if (!(rt = __ip_vs_get_out_rt(cp, RT_TOS(ip_hdr(skb)->tos))))
+	if (!(rt = __ip_vs_get_out_rt(skb, cp, RT_TOS(ip_hdr(skb)->tos))))
 		goto tx_error_icmp;
 
 	/* MTU checking */
@@ -993,7 +1006,7 @@ ip_vs_icmp_xmit_v6(struct sk_buff *skb,
 	 * mangle and send the packet here (only for VS/NAT)
 	 */
 
-	rt = __ip_vs_get_out_rt_v6(cp);
+	rt = __ip_vs_get_out_rt_v6(skb, cp, NULL, 0);
 	if (!rt)
 		goto tx_error_icmp;
 
--
To unsubscribe from this list: send the line "unsubscribe lvs-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply

* Re: Linux Plumbers Conference: User-visible Network Issues Mini-Conf
From: David Miller @ 2010-10-16  8:18 UTC (permalink / raw)
  To: Matt_Domsch; +Cc: netdev
In-Reply-To: <20101015195044.GA4416@auslistsprd01.us.dell.com>

From: Matt Domsch <Matt_Domsch@dell.com>
Date: Fri, 15 Oct 2010 14:50:44 -0500

> While I'm sure the individual topics will generate great discussion,
> it will be vital that members of the netdev community be present to
> represent the kernel developers' perspectives on the problems and to
> help brainstorm solutions.  Most of these topics have significant
> kernel components to them, and without additional kernel developer
> participation, I fear we would just be talking to ourselves, but no
> real progress made.  I invite you to attend LPC, and this mini-conf
> in particular, and lend your expertise.

Half a month before the event is not the time to be making
this kind of formal plea with core networking people.

If someone isn't already attending LPC, at this point there is next to
zero chance they are going to be able to make plans to do so in time.

^ permalink raw reply

* [PATCH] CAPI: Silence lockdep warning on get_capi_appl_by_nr usage
From: Jan Kiszka @ 2010-10-16 11:14 UTC (permalink / raw)
  To: David S. Miller
  Cc: Linux Kernel Mailing List, i4ldeveloper, Linux Netdev List,
	Karsten Keil

As long as we hold capi_controller_lock, we can safely access
capi_applications without RCU protection as no one can modify the
application list underneath us. Introduce an RCU-free
__get_capi_appl_by_nr for this purpose. This silences lockdep warnings
on suspicious rcu_dereference usage.

Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
---
 drivers/isdn/capi/kcapi.c |   19 ++++++++++++++-----
 1 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/isdn/capi/kcapi.c b/drivers/isdn/capi/kcapi.c
index 2648d39..2c535d8 100644
--- a/drivers/isdn/capi/kcapi.c
+++ b/drivers/isdn/capi/kcapi.c
@@ -99,6 +99,16 @@ static inline struct capi_ctr *get_capi_ctr_by_nr(u16 contr)
 	return capi_controller[contr - 1];
 }
 
+static inline struct capi20_appl *__get_capi_appl_by_nr(u16 applid)
+{
+	WARN_ON_ONCE(!mutex_is_locked(&capi_controller_lock));
+
+	if (applid - 1 >= CAPI_MAXAPPL)
+		return NULL;
+
+	return capi_applications[applid - 1];
+}
+
 static inline struct capi20_appl *get_capi_appl_by_nr(u16 applid)
 {
 	if (applid - 1 >= CAPI_MAXAPPL)
@@ -186,10 +196,9 @@ static void notify_up(u32 contr)
 		ctr->state = CAPI_CTR_RUNNING;
 
 		for (applid = 1; applid <= CAPI_MAXAPPL; applid++) {
-			ap = get_capi_appl_by_nr(applid);
-			if (!ap)
-				continue;
-			register_appl(ctr, applid, &ap->rparam);
+			ap = __get_capi_appl_by_nr(applid);
+			if (ap)
+				register_appl(ctr, applid, &ap->rparam);
 		}
 
 		wake_up_interruptible_all(&ctr->state_wait_queue);
@@ -216,7 +225,7 @@ static void ctr_down(struct capi_ctr *ctr, int new_state)
 	memset(ctr->serial, 0, sizeof(ctr->serial));
 
 	for (applid = 1; applid <= CAPI_MAXAPPL; applid++) {
-		ap = get_capi_appl_by_nr(applid);
+		ap = __get_capi_appl_by_nr(applid);
 		if (ap)
 			capi_ctr_put(ctr);
 	}
-- 
1.7.1

^ permalink raw reply related

* Re: openvswitch/flow WAS ( Re: [rfc] Merging the Open vSwitch datapath
From: jamal @ 2010-10-16 11:35 UTC (permalink / raw)
  To: Jesse Gross; +Cc: Ben Pfaff, netdev, ovs-team
In-Reply-To: <AANLkTim2LsFUOVSVss6HqGTxuyGH6eBKQdYabjMoWiaB@mail.gmail.com>

Jesse,

I re-added the other address Ben put earlier on in case you 
missed it.
yes, I have heard of TL;DR but unlike Alan Cox i find it hard to 
make a point in one sentence of 3 words - so please bear with me 
and read on.

On Fri, 2010-10-15 at 14:35 -0700, Jesse Gross wrote:

> 
> You're right, at a high level, it appears that there is a bit of an
> overlap between bridging, tc, and Open vSwitch.  

It looks like openvswitch rides on top of openflow, correct?
earlier i was looking at  openflow/datapath but gleaning
openvswitch/datapath it still looks conceptually the same
at the lower level. 

> However, in reality each is targeting a pretty different use case.  

Sure, use cases differences typically map either to policy 
or extension/addition of a new mechanism.
To clarify - you have the following approach per VM:

-->ingress port --> filter match --> actions 

Did i get this right?

You have a classifier that has 10 or so tuples. I could
replicate it with the u32 classifier - but it could be argued
that a brand new "hard-coded" classifier would be needed.

You have a series of actions like: redirect/mirror to port, drop etc
I can do most of these with existing tc actions and maybe replicate
most (like the vlan, MAC address, checksum etc rewrites) with pedit
action - but it could be argued that maybe one or more new tc actions
are needed.

Note: in linux, the above ingress port could be replaced with an
egress port instead. Bridging and L3 come after the actions in
the ingress path; and post that we have exactly the same approach of
port->filter->action

> Given that the design
> goals are not aligned, keeping separate things separate actually helps
> with overall simplicity.  

In general i would agree with the simplicity sentiment - but i fail to
see it so far.
A lot of the complexity, such as your own proprietary headers for
flows+actions, doesn't need to sit in the kernel.
IOW, the semantics of openflow already exist, albeit with a different syntax.
You can map the syntax to semantics in user space. This adheres to the
principle of simple kernel and external policy. 
I am sure that's what you would need to do with openflow on top of an
ASIC chip for example, no? I can see from the website you already run on
top of Broadcom and Marvell...

> Where there is overlap, I am certainly happy
> to see common functionality reused: for example, Open vSwitch uses tc
> for its QoS capabilities.

Refer to above. 

> In the future, I expect there to be an even clearer delineation
> between the various components.  One of the primary use cases of Open
> vSwitch at the moment is for virtualized data center networking but a
> few of the other potential uses that have been brought up include
> security processing (involving sending traffic of interest to
> userspace) and configuring SR-IOV NICs (to appropriately program rules
> in hardware).  You can see how each of these makes sense in the
> context of a virtual switch datapath but less so as a set of tc
> actions.

Unless i am misunderstanding - these are clearly more control extensions
but I don't see any of it needing to be in the kernel. It is all control
path stuff.
i.e something in user space (maybe even in a hypervisor) that is aware
of the virtualization creates, destroys and manages the VMs (SR-IOV etc)
and then configures per-VM flows whether directly in the kernel or via
some ethtool or other interface to the NIC.

> So, in short, I don't see this as something lacking in Linux, just
> complementary functionality.

Like i said above, I don't see the complementary part.

cheers,
jamal

^ permalink raw reply
