Netdev List


* Re: 2.6.35-rc3: Reported regressions 2.6.33 -> 2.6.34
From: Nick Bowler @ 2010-06-16 18:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Maciej Rutecki, Andrew Morton,
	Linus Torvalds, Kernel Testers List, Network Development,
	Linux ACPI, Linux PM List, Linux SCSI List, Linux Wireless List,
	DRI
In-Reply-To: <g77CuMUl7QI.A.5wF.V5OFMB@chimera>

On 16:45 Sun 13 Jun     , Rafael J. Wysocki wrote:
>  * This report has been delayed, because I had to go through all of the
>    entries and filter out the fixed ones, invalid ones etc.  Of course, I might
>    have missed some, but hopefully not too many.

This regression from 2.6.33 still seems to be missing from the list, and
is still present in 2.6.35-rc3.

  r600 CS checker rejects narrow FBO renderbuffers.
  https://bugs.freedesktop.org/show_bug.cgi?id=27609

-- 
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)


* pull request: wireless-2.6 2010-06-16
From: John W. Linville @ 2010-06-16 18:28 UTC (permalink / raw)
  To: davem; +Cc: linux-wireless, netdev, linux-kernel

Dave,

Here is another passel of fixes intended for 2.6.35.  Included are
some build warning fixes, a PCI identifier, a fix for premature
IRQs during hostap initialization, a fix for a warning caused by
failing to cancel a scan watchdog in iwlwifi, a fix for a null
pointer dereference in iwlwifi, and a fix for a race condition in
the same driver.  Also included is the MAINTAINERS change for the
orphaning of the older Intel wireless drivers.  All but the last few
warning fixes have spent some time in linux-next already.

Please let me know if there are problems!

Thanks,

John

---

The following changes since commit fed396a585d8e1870b326f2e8e1888a72957abb8:
  Herbert Xu (1):
        bridge: Fix OOM crash in deliver_clone

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git master

Christoph Fritz (1):
      mac80211: fix warn, enum may be used uninitialized

Joerg Albert (1):
      p54pci: add Symbol AP-300 minipci adapters pciid

John W. Linville (1):
      iwlwifi: cancel scan watchdog in iwl_bg_abort_scan

Justin P. Mattock (2):
      wireless:hostap_main.c warning: variable 'iface' set but not used
      wireless:hostap_ap.c Fix warning: variable 'fc' set but not used

Prarit Bhargava (1):
      libertas_tf: Fix warning in lbtf_rx for stats struct

Reinette Chatre (1):
      iwlwifi: serialize station management actions

Shanyu Zhao (1):
      iwlagn: verify flow id in compressed BA packet

Tim Gardner (1):
      hostap: Protect against initialization interrupt

Zhu Yi (1):
      wireless: orphan ipw2x00 drivers

 MAINTAINERS                                 |   10 ++--------
 drivers/net/wireless/hostap/hostap_ap.c     |    3 +--
 drivers/net/wireless/hostap/hostap_cs.c     |   15 +++++++++++++--
 drivers/net/wireless/hostap/hostap_hw.c     |   13 +++++++++++++
 drivers/net/wireless/hostap/hostap_main.c   |    2 --
 drivers/net/wireless/hostap/hostap_wlan.h   |    2 +-
 drivers/net/wireless/iwlwifi/iwl-agn-tx.c   |    5 +++++
 drivers/net/wireless/iwlwifi/iwl-agn.c      |    8 ++++++--
 drivers/net/wireless/iwlwifi/iwl-scan.c     |    1 +
 drivers/net/wireless/iwlwifi/iwl-sta.c      |    4 ++++
 drivers/net/wireless/iwlwifi/iwl3945-base.c |    9 +++++++--
 drivers/net/wireless/libertas_tf/main.c     |    2 +-
 drivers/net/wireless/p54/p54pci.c           |    2 ++
 net/mac80211/work.c                         |    2 +-
 14 files changed, 57 insertions(+), 21 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 83be538..837a754 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2966,20 +2966,14 @@ F:	drivers/net/ixgb/
 F:	drivers/net/ixgbe/
 
 INTEL PRO/WIRELESS 2100 NETWORK CONNECTION SUPPORT
-M:	Reinette Chatre <reinette.chatre@intel.com>
-M:	Intel Linux Wireless <ilw@linux.intel.com>
 L:	linux-wireless@vger.kernel.org
-W:	http://ipw2100.sourceforge.net
-S:	Odd Fixes
+S:	Orphan
 F:	Documentation/networking/README.ipw2100
 F:	drivers/net/wireless/ipw2x00/ipw2100.*
 
 INTEL PRO/WIRELESS 2915ABG NETWORK CONNECTION SUPPORT
-M:	Reinette Chatre <reinette.chatre@intel.com>
-M:	Intel Linux Wireless <ilw@linux.intel.com>
 L:	linux-wireless@vger.kernel.org
-W:	http://ipw2200.sourceforge.net
-S:	Odd Fixes
+S:	Orphan
 F:	Documentation/networking/README.ipw2200
 F:	drivers/net/wireless/ipw2x00/ipw2200.*
 
diff --git a/drivers/net/wireless/hostap/hostap_ap.c b/drivers/net/wireless/hostap/hostap_ap.c
index 231dbd7..9cadaa2 100644
--- a/drivers/net/wireless/hostap/hostap_ap.c
+++ b/drivers/net/wireless/hostap/hostap_ap.c
@@ -688,7 +688,7 @@ static void hostap_ap_tx_cb_assoc(struct sk_buff *skb, int ok, void *data)
 	struct ap_data *ap = data;
 	struct net_device *dev = ap->local->dev;
 	struct ieee80211_hdr *hdr;
-	u16 fc, status;
+	u16 status;
 	__le16 *pos;
 	struct sta_info *sta = NULL;
 	char *txt = NULL;
@@ -699,7 +699,6 @@ static void hostap_ap_tx_cb_assoc(struct sk_buff *skb, int ok, void *data)
 	}
 
 	hdr = (struct ieee80211_hdr *) skb->data;
-	fc = le16_to_cpu(hdr->frame_control);
 	if ((!ieee80211_is_assoc_resp(hdr->frame_control) &&
 	     !ieee80211_is_reassoc_resp(hdr->frame_control)) ||
 	    skb->len < IEEE80211_MGMT_HDR_LEN + 4) {
diff --git a/drivers/net/wireless/hostap/hostap_cs.c b/drivers/net/wireless/hostap/hostap_cs.c
index db72461..29b31a6 100644
--- a/drivers/net/wireless/hostap/hostap_cs.c
+++ b/drivers/net/wireless/hostap/hostap_cs.c
@@ -594,6 +594,7 @@ static int prism2_config(struct pcmcia_device *link)
 	local_info_t *local;
 	int ret = 1;
 	struct hostap_cs_priv *hw_priv;
+	unsigned long flags;
 
 	PDEBUG(DEBUG_FLOW, "prism2_config()\n");
 
@@ -625,9 +626,15 @@ static int prism2_config(struct pcmcia_device *link)
 	local->hw_priv = hw_priv;
 	hw_priv->link = link;
 
+	/*
+	 * Make sure the IRQ handler cannot proceed until at least
+	 * dev->base_addr is initialized.
+	 */
+	spin_lock_irqsave(&local->irq_init_lock, flags);
+
 	ret = pcmcia_request_irq(link, prism2_interrupt);
 	if (ret)
-		goto failed;
+		goto failed_unlock;
 
 	/*
 	 * This actually configures the PCMCIA socket -- setting up
@@ -636,11 +643,13 @@ static int prism2_config(struct pcmcia_device *link)
 	 */
 	ret = pcmcia_request_configuration(link, &link->conf);
 	if (ret)
-		goto failed;
+		goto failed_unlock;
 
 	dev->irq = link->irq;
 	dev->base_addr = link->io.BasePort1;
 
+	spin_unlock_irqrestore(&local->irq_init_lock, flags);
+
 	/* Finally, report what we've done */
 	printk(KERN_INFO "%s: index 0x%02x: ",
 	       dev_info, link->conf.ConfigIndex);
@@ -667,6 +676,8 @@ static int prism2_config(struct pcmcia_device *link)
 
 	return ret;
 
+ failed_unlock:
+	 spin_unlock_irqrestore(&local->irq_init_lock, flags);
  failed:
 	kfree(hw_priv);
 	prism2_release((u_long)link);
diff --git a/drivers/net/wireless/hostap/hostap_hw.c b/drivers/net/wireless/hostap/hostap_hw.c
index ff9b5c8..2f999fc 100644
--- a/drivers/net/wireless/hostap/hostap_hw.c
+++ b/drivers/net/wireless/hostap/hostap_hw.c
@@ -2621,6 +2621,18 @@ static irqreturn_t prism2_interrupt(int irq, void *dev_id)
 	iface = netdev_priv(dev);
 	local = iface->local;
 
+	/* Detect early interrupt before driver is fully configured */
+	spin_lock(&local->irq_init_lock);
+	if (!dev->base_addr) {
+		if (net_ratelimit()) {
+			printk(KERN_DEBUG "%s: Interrupt, but dev not configured\n",
+			       dev->name);
+		}
+		spin_unlock(&local->irq_init_lock);
+		return IRQ_HANDLED;
+	}
+	spin_unlock(&local->irq_init_lock);
+
 	prism2_io_debug_add(dev, PRISM2_IO_DEBUG_CMD_INTERRUPT, 0, 0);
 
 	if (local->func->card_present && !local->func->card_present(local)) {
@@ -3138,6 +3150,7 @@ prism2_init_local_data(struct prism2_helper_functions *funcs, int card_idx,
 	spin_lock_init(&local->cmdlock);
 	spin_lock_init(&local->baplock);
 	spin_lock_init(&local->lock);
+	spin_lock_init(&local->irq_init_lock);
 	mutex_init(&local->rid_bap_mtx);
 
 	if (card_idx < 0 || card_idx >= MAX_PARM_DEVICES)
diff --git a/drivers/net/wireless/hostap/hostap_main.c b/drivers/net/wireless/hostap/hostap_main.c
index eb57d1e..eaee84b 100644
--- a/drivers/net/wireless/hostap/hostap_main.c
+++ b/drivers/net/wireless/hostap/hostap_main.c
@@ -741,9 +741,7 @@ void hostap_set_multicast_list_queue(struct work_struct *work)
 	local_info_t *local =
 		container_of(work, local_info_t, set_multicast_list_queue);
 	struct net_device *dev = local->dev;
-	struct hostap_interface *iface;
 
-	iface = netdev_priv(dev);
 	if (hostap_set_word(dev, HFA384X_RID_PROMISCUOUSMODE,
 			    local->is_promisc)) {
 		printk(KERN_INFO "%s: %sabling promiscuous mode failed\n",
diff --git a/drivers/net/wireless/hostap/hostap_wlan.h b/drivers/net/wireless/hostap/hostap_wlan.h
index 3d23891..1ba33be 100644
--- a/drivers/net/wireless/hostap/hostap_wlan.h
+++ b/drivers/net/wireless/hostap/hostap_wlan.h
@@ -654,7 +654,7 @@ struct local_info {
 	rwlock_t iface_lock; /* hostap_interfaces read lock; use write lock
 			      * when removing entries from the list.
 			      * TX and RX paths can use read lock. */
-	spinlock_t cmdlock, baplock, lock;
+	spinlock_t cmdlock, baplock, lock, irq_init_lock;
 	struct mutex rid_bap_mtx;
 	u16 infofid; /* MAC buffer id for info frame */
 	/* txfid, intransmitfid, next_txtid, and next_alloc are protected by
diff --git a/drivers/net/wireless/iwlwifi/iwl-agn-tx.c b/drivers/net/wireless/iwlwifi/iwl-agn-tx.c
index a732f10..7d614c4 100644
--- a/drivers/net/wireless/iwlwifi/iwl-agn-tx.c
+++ b/drivers/net/wireless/iwlwifi/iwl-agn-tx.c
@@ -1299,6 +1299,11 @@ void iwlagn_rx_reply_compressed_ba(struct iwl_priv *priv,
 	sta_id = ba_resp->sta_id;
 	tid = ba_resp->tid;
 	agg = &priv->stations[sta_id].tid[tid].agg;
+	if (unlikely(agg->txq_id != scd_flow)) {
+		IWL_ERR(priv, "BA scd_flow %d does not match txq_id %d\n",
+			scd_flow, agg->txq_id);
+		return;
+	}
 
 	/* Find index just before block-ack window */
 	index = iwl_queue_dec_wrap(ba_resp_scd_ssn & 0xff, txq->q.n_bd);
diff --git a/drivers/net/wireless/iwlwifi/iwl-agn.c b/drivers/net/wireless/iwlwifi/iwl-agn.c
index 7726e67..24aff65 100644
--- a/drivers/net/wireless/iwlwifi/iwl-agn.c
+++ b/drivers/net/wireless/iwlwifi/iwl-agn.c
@@ -3391,10 +3391,12 @@ static int iwlagn_mac_sta_add(struct ieee80211_hw *hw,
 	int ret;
 	u8 sta_id;
 
-	sta_priv->common.sta_id = IWL_INVALID_STATION;
-
 	IWL_DEBUG_INFO(priv, "received request to add station %pM\n",
 			sta->addr);
+	mutex_lock(&priv->mutex);
+	IWL_DEBUG_INFO(priv, "proceeding to add station %pM\n",
+			sta->addr);
+	sta_priv->common.sta_id = IWL_INVALID_STATION;
 
 	atomic_set(&sta_priv->pending_frames, 0);
 	if (vif->type == NL80211_IFTYPE_AP)
@@ -3406,6 +3408,7 @@ static int iwlagn_mac_sta_add(struct ieee80211_hw *hw,
 		IWL_ERR(priv, "Unable to add station %pM (%d)\n",
 			sta->addr, ret);
 		/* Should we return success if return code is EEXIST ? */
+		mutex_unlock(&priv->mutex);
 		return ret;
 	}
 
@@ -3415,6 +3418,7 @@ static int iwlagn_mac_sta_add(struct ieee80211_hw *hw,
 	IWL_DEBUG_INFO(priv, "Initializing rate scaling for station %pM\n",
 		       sta->addr);
 	iwl_rs_rate_init(priv, sta, sta_id);
+	mutex_unlock(&priv->mutex);
 
 	return 0;
 }
diff --git a/drivers/net/wireless/iwlwifi/iwl-scan.c b/drivers/net/wireless/iwlwifi/iwl-scan.c
index 5d3f51f..386c5f9 100644
--- a/drivers/net/wireless/iwlwifi/iwl-scan.c
+++ b/drivers/net/wireless/iwlwifi/iwl-scan.c
@@ -491,6 +491,7 @@ void iwl_bg_abort_scan(struct work_struct *work)
 
 	mutex_lock(&priv->mutex);
 
+	cancel_delayed_work_sync(&priv->scan_check);
 	set_bit(STATUS_SCAN_ABORTING, &priv->status);
 	iwl_send_scan_abort(priv);
 
diff --git a/drivers/net/wireless/iwlwifi/iwl-sta.c b/drivers/net/wireless/iwlwifi/iwl-sta.c
index 83a2636..c27c13f 100644
--- a/drivers/net/wireless/iwlwifi/iwl-sta.c
+++ b/drivers/net/wireless/iwlwifi/iwl-sta.c
@@ -1373,10 +1373,14 @@ int iwl_mac_sta_remove(struct ieee80211_hw *hw,
 
 	IWL_DEBUG_INFO(priv, "received request to remove station %pM\n",
 			sta->addr);
+	mutex_lock(&priv->mutex);
+	IWL_DEBUG_INFO(priv, "proceeding to remove station %pM\n",
+			sta->addr);
 	ret = iwl_remove_station(priv, sta_common->sta_id, sta->addr);
 	if (ret)
 		IWL_ERR(priv, "Error removing station %pM\n",
 			sta->addr);
+	mutex_unlock(&priv->mutex);
 	return ret;
 }
 EXPORT_SYMBOL(iwl_mac_sta_remove);
diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/net/wireless/iwlwifi/iwl3945-base.c
index 6c353ca..a27872d 100644
--- a/drivers/net/wireless/iwlwifi/iwl3945-base.c
+++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c
@@ -3437,10 +3437,13 @@ static int iwl3945_mac_sta_add(struct ieee80211_hw *hw,
 	bool is_ap = vif->type == NL80211_IFTYPE_STATION;
 	u8 sta_id;
 
-	sta_priv->common.sta_id = IWL_INVALID_STATION;
-
 	IWL_DEBUG_INFO(priv, "received request to add station %pM\n",
 			sta->addr);
+	mutex_lock(&priv->mutex);
+	IWL_DEBUG_INFO(priv, "proceeding to add station %pM\n",
+			sta->addr);
+	sta_priv->common.sta_id = IWL_INVALID_STATION;
+
 
 	ret = iwl_add_station_common(priv, sta->addr, is_ap, &sta->ht_cap,
 				     &sta_id);
@@ -3448,6 +3451,7 @@ static int iwl3945_mac_sta_add(struct ieee80211_hw *hw,
 		IWL_ERR(priv, "Unable to add station %pM (%d)\n",
 			sta->addr, ret);
 		/* Should we return success if return code is EEXIST ? */
+		mutex_unlock(&priv->mutex);
 		return ret;
 	}
 
@@ -3457,6 +3461,7 @@ static int iwl3945_mac_sta_add(struct ieee80211_hw *hw,
 	IWL_DEBUG_INFO(priv, "Initializing rate scaling for station %pM\n",
 		       sta->addr);
 	iwl3945_rs_rate_init(priv, sta, sta_id);
+	mutex_unlock(&priv->mutex);
 
 	return 0;
 }
diff --git a/drivers/net/wireless/libertas_tf/main.c b/drivers/net/wireless/libertas_tf/main.c
index 6a04c21..817fffc 100644
--- a/drivers/net/wireless/libertas_tf/main.c
+++ b/drivers/net/wireless/libertas_tf/main.c
@@ -549,7 +549,7 @@ int lbtf_rx(struct lbtf_private *priv, struct sk_buff *skb)
 
 	prxpd = (struct rxpd *) skb->data;
 
-	stats.flag = 0;
+	memset(&stats, 0, sizeof(stats));
 	if (!(prxpd->status & cpu_to_le16(MRVDRV_RXPD_STATUS_OK)))
 		stats.flag |= RX_FLAG_FAILED_FCS_CRC;
 	stats.freq = priv->cur_freq;
diff --git a/drivers/net/wireless/p54/p54pci.c b/drivers/net/wireless/p54/p54pci.c
index 07c4528..a5ea89c 100644
--- a/drivers/net/wireless/p54/p54pci.c
+++ b/drivers/net/wireless/p54/p54pci.c
@@ -41,6 +41,8 @@ static DEFINE_PCI_DEVICE_TABLE(p54p_table) = {
 	{ PCI_DEVICE(0x1260, 0x3877) },
 	/* Intersil PRISM Javelin/Xbow Wireless LAN adapter */
 	{ PCI_DEVICE(0x1260, 0x3886) },
+	/* Intersil PRISM Xbow Wireless LAN adapter (Symbol AP-300) */
+	{ PCI_DEVICE(0x1260, 0xffff) },
 	{ },
 };
 
diff --git a/net/mac80211/work.c b/net/mac80211/work.c
index be3d4a6..b025dc7 100644
--- a/net/mac80211/work.c
+++ b/net/mac80211/work.c
@@ -715,7 +715,7 @@ static void ieee80211_work_rx_queued_mgmt(struct ieee80211_local *local,
 	struct ieee80211_rx_status *rx_status;
 	struct ieee80211_mgmt *mgmt;
 	struct ieee80211_work *wk;
-	enum work_action rma;
+	enum work_action rma = WORK_ACT_NONE;
 	u16 fc;
 
 	rx_status = (struct ieee80211_rx_status *) skb->cb;
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.


* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
From: Arnd Bergmann @ 2010-06-16 18:26 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Pedro Garcia, netdev, Eric Dumazet, Ben Hutchings
In-Reply-To: <4C18ED97.3060702@trash.net>

On Wednesday 16 June 2010 17:28:23 Patrick McHardy wrote:

> Since we don't have any special VLAN handling in the bridging code, I
> guess it comes down to optionally using a different ethertype value
> (0x88a8) in the VLAN code. We probably also need some indication from
> device drivers whether they are able to add these headers to avoid
> trying to offload tagging in case they're not.

It's probably a little more than just supporting the new ethertype, but not
much. The outer tag can be handled the way our current VLAN module does it.
However, the standard does not allow a regular frame to be encapsulated
directly; it requires one of:

1. In 802.1ad: an 802.1Q VLAN tag (ethertype 0x8100) followed by the frame
2. In 802.1ah: a service tag (ethertype 0x88e7) followed by the 802.1Q VLAN tag
   and then the frame.

Maybe what we can do is extend the vlan code to understand all three frame
formats (q, ad and ah) or at least the first two so we configure both the
provider VID and the Customer VID for the interface in case of 802.1ad but
only the regular VID in 802.1Q.

Device drivers can then flag whether they support both formats or just
the regular Q tag.

	Arnd
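
The frame formats discussed in this message differ mainly in the sequence of
ethertypes that follows the MAC addresses. As a rough userspace sketch (not
kernel code; every struct, enum and function name below is made up for
illustration), telling plain 802.1Q apart from 802.1ad double tagging might
look like this. 802.1ah is only recognised by its ethertype here; its service
tag layout is not parsed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_ADDRS   12      /* destination + source MAC */
#define ETYPE_8021Q     0x8100  /* customer tag */
#define ETYPE_8021AD    0x88a8  /* provider tag */
#define ETYPE_8021AH    0x88e7  /* service tag, detected only */

enum tag_format { TAG_NONE, TAG_Q, TAG_AD, TAG_AH, TAG_INVALID };

/* Hypothetical result type: each VID is valid only for the formats noted. */
struct vlan_info {
	enum tag_format format;
	uint16_t provider_vid;   /* TAG_AD only */
	uint16_t customer_vid;   /* TAG_Q and TAG_AD */
};

/* Read a big-endian (network order) 16-bit value. */
static uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

struct vlan_info classify_frame(const uint8_t *frame, size_t len)
{
	struct vlan_info info = { TAG_INVALID, 0, 0 };
	const uint8_t *p;

	assert(frame != NULL);
	if (len < ETH_HDR_ADDRS + 2)
		return info;
	p = frame + ETH_HDR_ADDRS;

	switch (get_be16(p)) {
	case ETYPE_8021Q:
		if (len < ETH_HDR_ADDRS + 4)
			return info;
		info.format = TAG_Q;
		info.customer_vid = get_be16(p + 2) & 0x0fff;
		break;
	case ETYPE_8021AD:
		/* The outer tag must be followed by an inner 802.1Q tag. */
		if (len < ETH_HDR_ADDRS + 8 || get_be16(p + 4) != ETYPE_8021Q)
			return info;
		info.format = TAG_AD;
		info.provider_vid = get_be16(p + 2) & 0x0fff;
		info.customer_vid = get_be16(p + 6) & 0x0fff;
		break;
	case ETYPE_8021AH:
		info.format = TAG_AH;
		break;
	default:
		info.format = TAG_NONE;
		break;
	}
	return info;
}
```

The 802.1ad case rejects a frame whose outer tag is not followed by 0x8100,
which reflects the point that a regular frame may not be encapsulated
directly.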


* Re: [PATCH net-next-2.6] inetpeer: do not use zero refcnt for freed entries
From: Paul E. McKenney @ 2010-06-16 18:12 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1276656324.19249.39.camel@edumazet-laptop>

On Wed, Jun 16, 2010 at 04:45:24AM +0200, Eric Dumazet wrote:
> Le mardi 15 juin 2010 à 14:25 -0700, David Miller a écrit :
> > From: Eric Dumazet <eric.dumazet@gmail.com>
> > Date: Tue, 15 Jun 2010 20:23:14 +0200
> > 
> > > inetpeer currently uses an AVL tree protected by an rwlock.
> > > 
> > > It's possible to make most lookups use RCU
> >  ...
> > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > 
> > Applied, nice work Eric.
> 
> Thanks David !
> 
> Re-reading patch I realize refcnt is expected to be 0 for unused entries
> (obviously), so we should use a different marker for 'about to be freed'
> ones.
> 
> Thanks
> 
> [PATCH net-next-2.6] inetpeer: do not use zero refcnt for freed entries
> 
> Followup of commit aa1039e73cc2 (inetpeer: RCU conversion)
> 
> Unused inet_peer entries have a null refcnt.
> 
> Using atomic_inc_not_zero() in rcu lookups is not going to work for
> them, and slow path is taken.
> 
> Fix this using -1 marker instead of 0 for deleted entries.

Based on this patch, looks good to me!  (I don't see lookup_rcu_bh() and
friends in the trees I have at hand.)

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
>  net/ipv4/inetpeer.c |   10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
> index 58fbc7e..39a14ba 100644
> --- a/net/ipv4/inetpeer.c
> +++ b/net/ipv4/inetpeer.c
> @@ -187,7 +187,12 @@ static struct inet_peer *lookup_rcu_bh(__be32 daddr)
> 
>  	while (u != peer_avl_empty) {
>  		if (daddr == u->v4daddr) {
> -			if (unlikely(!atomic_inc_not_zero(&u->refcnt)))
> +			/* Before taking a reference, check if this entry was
> +			 * deleted, unlink_from_pool() sets refcnt=-1 to make
> +			 * distinction between an unused entry (refcnt=0) and
> +			 * a freed one.
> +			 */
> +			if (unlikely(!atomic_add_unless(&u->refcnt, 1, -1)))
>  				u = NULL;
>  			return u;
>  		}
> @@ -322,8 +327,9 @@ static void unlink_from_pool(struct inet_peer *p)
>  	 * in cleanup() function to prevent sudden disappearing.  If we can
>  	 * atomically (because of lockless readers) take this last reference,
>  	 * it's safe to remove the node and free it later.
> +	 * We use refcnt=-1 to alert lockless readers this entry is deleted.
>  	 */
> -	if (atomic_cmpxchg(&p->refcnt, 1, 0) == 1) {
> +	if (atomic_cmpxchg(&p->refcnt, 1, -1) == 1) {
>  		struct inet_peer **stack[PEER_MAXDEPTH];
>  		struct inet_peer ***stackptr, ***delp;
>  		if (lookup(p->v4daddr, stack) != p)
> 
> 
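
To illustrate why the marker value matters, here is a small userspace sketch
using C11 atomics. The helper is a stand-in that imitates the semantics of the
kernel's atomic_add_unless(); it is written for this example and is not the
kernel implementation. The get_ref_old()/get_ref_new() names are likewise
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Add 'a' to *v unless *v == 'u'; return true if the add happened. */
static bool add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	while (old != u) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}

/* Old lookup rule: refuses any entry whose refcnt is 0, so an unused
 * entry cannot be told apart from one that is being freed. */
static bool get_ref_old(atomic_int *refcnt)
{
	return add_unless(refcnt, 1, 0);
}

/* New lookup rule: refuses only entries marked as deleted (-1). */
static bool get_ref_new(atomic_int *refcnt)
{
	assert(atomic_load(refcnt) >= -1);
	return add_unless(refcnt, 1, -1);
}
```

With refcnt == 0 (unused entry), get_ref_old() fails and the caller has to
fall back to the slow path, while get_ref_new() succeeds; with refcnt == -1
(deleted entry), get_ref_new() fails as intended.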


* Re: [Bugme-new] [Bug 16216] New: wrong source addr of UDP packets when using policy routing
From: Patrick McHardy @ 2010-06-16 17:43 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Andrew Morton, netdev, bugzilla-daemon, bugme-daemon, borg
In-Reply-To: <1276709309.2632.126.camel@edumazet-laptop>

Eric Dumazet wrote:
> Le mercredi 16 juin 2010 à 18:46 +0200, Patrick McHardy a écrit :
>
>   
>> This is known behaviour, fwmarks don't work for source address selection
>> since before the source address is chosen, you don't even have a packet
>> which could be marked.
>>     
>
> We now have sk->sk_mark routing (socket based), so we might change
> sk->sk_mark with an appropriate iptables target when one packet is
> received... not very clean but worth mentioning...
>   
That would still be too late. The proper way would be to have the
application set the socket mark.


* Re: [Bugme-new] [Bug 16216] New: wrong source addr of UDP packets when using policy routing
From: Eric Dumazet @ 2010-06-16 17:28 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Andrew Morton, netdev, bugzilla-daemon, bugme-daemon, borg
In-Reply-To: <4C18FFDC.8060102@trash.net>

Le mercredi 16 juin 2010 à 18:46 +0200, Patrick McHardy a écrit :

> This is known behaviour, fwmarks don't work for source address selection
> since before the source address is chosen, you don't even have a packet
> which could be marked.

We now have sk->sk_mark routing (socket based), so we might change
sk->sk_mark with an appropriate iptables target when one packet is
received... not very clean but worth mentioning...

commit 914a9ab386a288d0f22252fc268ecbc048cdcbd5
Author: Atis Elsts <atis@mikrotik.com>
Date:   Thu Oct 1 15:16:49 2009 -0700

    net: Use sk_mark for routing lookup in more places
    
    This patch against v2.6.31 adds support for route lookup using sk_mark in some
    more places. The benefits from this patch are the following.
    First, SO_MARK option now has effect on UDP sockets too.
    Second, ip_queue_xmit() and inet_sk_rebuild_header() could fail to do routing
    lookup correctly if TCP sockets with SO_MARK were used.
    
    Signed-off-by: Atis Elsts <atis@mikrotik.com>
    Acked-by: Eric Dumazet <eric.dumazet@gmail.com>




* Re: [Bugme-new] [Bug 16216] New: wrong source addr of UDP packets when using policy routing
From: Patrick McHardy @ 2010-06-16 16:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: netdev, bugzilla-daemon, bugme-daemon, borg
In-Reply-To: <20100616093328.0671254b.akpm@linux-foundation.org>

Andrew Morton wrote:
> On Tue, 15 Jun 2010 15:14:43 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
>
>   
>> https://bugzilla.kernel.org/show_bug.cgi?id=16216
>>
>>            Summary: wrong source addr of UDP packets when using policy
>>                     routing
>>            Product: Networking
>>            Version: 2.5
>>     Kernel Version: 2.6.24.7
>>     
>
> The reporter has confirmed that this issue persists in 2.6.34.
>
>   
>>           Platform: All
>>         OS/Version: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: normal
>>           Priority: P1
>>          Component: IPV4
>>         AssignedTo: shemminger@linux-foundation.org
>>         ReportedBy: borg@uu3.net
>>         Regression: No
>>
>>
>> When policy routing is used, UDP packets have the wrong source address.
>> The source address is probably taken from a lookup of the main routing table
>> for the given destination instead of being set just after POSTROUTING,
>> looking up the cache.
>>
>> This is how it looks when doing a simple netcat test:
>> (tcpdump is run on aa.aa.47.90)
>> 16:38:02.053053 IP aa.aa.47.67.32826 > aa.aa.47.90.660: UDP, length 8
>> 16:38:05.660394 IP bb.bbb.241.62.660 > aa.aa.47.67.32826: UDP, length 8
>>
>> aa.aa.47.90 has a specific setup with 3 routing tables: main, 10, and 20,
>> and all of them have a default gateway. bb.bbb.241.62 is the address of the
>> outgoing interface of the default route from the main table.
>> If a packet comes in on a specific interface, its source is stored in an
>> ipset, and when a packet is about to be sent out of the box, it is marked
>> in mangle OUTPUT by matching that ipset:
>> ### mangle PREROUTING ###
>> fw="iptables -t mangle -A PREROUTING"
>> $fw -i vlan0.13 -j SET --add-set gw10 src
>> $fw -i lan2 -j SET --add-set gw20 src
>>
>> ### mangle OUTPUT ###
>> fw="iptables -t mangle -A OUTPUT"
>> $fw -m set --set gw10 dst -j MARK --set-mark 10
>> $fw -m set --set gw10 dst -j ACCEPT
>> $fw -m set --set gw20 dst -j MARK --set-mark 20
>> $fw -m set --set gw20 dst -j ACCEPT
>>
>> % ip rule show
>> 32764:  from all fwmark 0x14 lookup 20
>> 32765:  from all fwmark 0xa lookup 10

This is known behaviour, fwmarks don't work for source address selection
since before the source address is chosen, you don't even have a packet
which could be marked.


* Re: [Bugme-new] [Bug 16216] New: wrong source addr of UDP packets when using policy routing
From: Andrew Morton @ 2010-06-16 16:33 UTC (permalink / raw)
  To: netdev; +Cc: bugzilla-daemon, bugme-daemon, borg
In-Reply-To: <bug-16216-10286@https.bugzilla.kernel.org/>

On Tue, 15 Jun 2010 15:14:43 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=16216
> 
>            Summary: wrong source addr of UDP packets when using policy
>                     routing
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 2.6.24.7

The reporter has confirmed that this issue persists in 2.6.34.

>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: IPV4
>         AssignedTo: shemminger@linux-foundation.org
>         ReportedBy: borg@uu3.net
>         Regression: No
> 
> 
> When policy routing is used, UDP packets have the wrong source address.
> The source address is probably taken from a lookup of the main routing table
> for the given destination instead of being set just after POSTROUTING,
> looking up the cache.
> 
> This is how it looks when doing a simple netcat test:
> (tcpdump is run on aa.aa.47.90)
> 16:38:02.053053 IP aa.aa.47.67.32826 > aa.aa.47.90.660: UDP, length 8
> 16:38:05.660394 IP bb.bbb.241.62.660 > aa.aa.47.67.32826: UDP, length 8
> 
> aa.aa.47.90 has a specific setup with 3 routing tables: main, 10, and 20,
> and all of them have a default gateway. bb.bbb.241.62 is the address of the
> outgoing interface of the default route from the main table.
> If a packet comes in on a specific interface, its source is stored in an
> ipset, and when a packet is about to be sent out of the box, it is marked
> in mangle OUTPUT by matching that ipset:
> ### mangle PREROUTING ###
> fw="iptables -t mangle -A PREROUTING"
> $fw -i vlan0.13 -j SET --add-set gw10 src
> $fw -i lan2 -j SET --add-set gw20 src
> 
> ### mangle OUTPUT ###
> fw="iptables -t mangle -A OUTPUT"
> $fw -m set --set gw10 dst -j MARK --set-mark 10
> $fw -m set --set gw10 dst -j ACCEPT
> $fw -m set --set gw20 dst -j MARK --set-mark 20
> $fw -m set --set gw20 dst -j ACCEPT
> 
> % ip rule show
> 32764:  from all fwmark 0x14 lookup 20
> 32765:  from all fwmark 0xa lookup 10
> 
> The problem was noticed for UDP packets (openvpn connections are not working).
> Other non-connection-oriented protocols might be affected too.
> TCP (as a connection-oriented protocol) works just fine.
> 



* Re: [0/8] netpoll/bridge fixes
From: Paul E. McKenney @ 2010-06-16 16:01 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, herbert, shemminger, mst, frzhang, netdev, amwang,
	mpm
In-Reply-To: <1276669281.19249.62.camel@edumazet-laptop>

On Wed, Jun 16, 2010 at 08:21:21AM +0200, Eric Dumazet wrote:
> Le mardi 15 juin 2010 à 22:08 -0700, Paul E. McKenney a écrit :
> > On Wed, Jun 16, 2010 at 04:58:59AM +0200, Eric Dumazet wrote:
> > > 
> > > Paul, could you please explain if current lockdep rules are correct, or could be relaxed ?
> > > 
> > > I thought :
> > > 
> > > rcu_read_lock_bh();
> > > 
> > > was a shorthand to
> > > 
> > > local_bh_disable();
> > > rcu_read_lock();
> > 
> > In CONFIG_TREE_RCU and CONFIG_TINY_RCU, rcu_read_lock_bh() is actually
> > shorthand for only local_bh_disable().  Therefore, rcu_dereference()
> > will scream if only rcu_read_lock_bh() is held.
> > 
> > However, in CONFIG_PREEMPT_TREE_RCU, rcu_read_lock_bh() is its own
> > mechanism that does local_bh_disable() but has its own set of grace
> > periods, independent of those of rcu_read_lock().
> > 
> > > Why lockdep is not able to make a correct diagnostic ?
> > 
> > Here is the situation I am concerned about:
> > 
> > o	Task 0 does rcu_read_lock(), then p=rcu_dereference_bh().
> > 	If we make the change you are asking for, rcu_dereference_bh()
> > 	is OK with this.
> > 
> > o	Task 0 now is preempted before finishing its RCU read-side
> > 	critical section.
> > 
> > o	Task 1 removes the data element referenced by pointer p,
> > 	then invokes synchronize_rcu_bh().
> > 
> > o	Task 0 does not block synchronize_rcu_bh(), so the grace
> > 	period completes.
> > 
> > o	Task 1 frees up the data element referenced by pointer p,
> > 	which might be reallocated as some other type, unmapped,
> > 	or whatever else.
> > 
> > o	Task 0 resumes, and is sadly disappointed when the data
> > 	element referenced by pointer p has been swept out from
> > 	under it.
> > 
> > Or am I missing something here?
> > 
> 
> Nice thing with RCU is that I learn new things every day ;)
> 
> Thanks Paul, I'll try to remember all the details ! ;)

;-)

But just to be clear...  All but one use of RCU-bh is in networking,
so if you guys need something different from RCU-bh, let's talk!

And I learn something new about RCU every day as well.  One of today's
lessons is that networking is no longer the only user of RCU-bh.  ;-)

							Thanx, Paul


* RE: ipv6: netif_carrier_(on|off) with traces afterwards
From: Tantilov, Emil S @ 2010-06-16 15:55 UTC (permalink / raw)
  To: Einar EL Lueck; +Cc: NetDev
In-Reply-To: <OF3A036420.64516D4F-ONC1257744.00492A34-C1257744.004A240B@de.ibm.com>

>-----Original Message-----
>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>Behalf Of Einar EL Lueck
>Sent: Wednesday, June 16, 2010 6:30 AM
>To: netdev@vger.kernel.org
>Subject: ipv6: netif_carrier_(on|off) with traces afterwards
>
>
>Hi,
>With IPv6 addresses configured and a network card doing
>netif_carrier_off|on we see afterwards in 2.6.34 on S/390 some traces in
>fib.
>
>Example sequence of operations:
>ip -6 addr add fd00:10:30:49:4008:ffff:35:2/80 dev eth1
>ip link set eth1 up
>ip link set eth1 down
># the following lines have as effect netif_carrier_off and then on (among
>other stuff)
>echo 0 > /sys/bus/ccwgroup/drivers/qeth/devices/0.0.e300/online
>echo 1 > /sys/bus/ccwgroup/drivers/qeth/devices/0.0.e300/online
># end of plugging
>ip link set eth1  up
>ip link set eth1 down
>=> at this point we get the following trace:
>
>Badness
>at /home/autobuild/BUILD/linux-2.6.34-20100531/net/ipv6/ip6_fib.c:1160

<snip>

>
>Has anybody seen effects like this before on other platforms, or has
>anybody suggestions for the root cause?

I had similar issues. The following patches fixed it for me:

http://marc.info/?l=linux-netdev&m=127472600330413

http://marc.info/?l=linux-netdev&m=127472599530407

Thanks,
Emil



* Re: [PATCH] broadcom: Add 5241 support
From: Ben Hutchings @ 2010-06-16 15:35 UTC (permalink / raw)
  To: Dmitry Eremin-Solenikov
  Cc: netdev, David S. Miller, Matt Carlson, Michael Chan
In-Reply-To: <1276699126-8168-1-git-send-email-dbaryshkov@gmail.com>

On Wed, 2010-06-16 at 18:38 +0400, Dmitry Eremin-Solenikov wrote:
> This patch adds the 5241 PHY ID to the broadcom module.
[...]

You need to add the ID and mask to the module device ID table
(broadcom_tbl) as well.

Ben.
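
For context, entries in such an ID table are matched by comparing the PHY ID
under the entry's mask. A userspace sketch of that comparison follows. The
struct only imitates the shape of a kernel ID table entry, the table and
function names are invented for this example, and the 5241 ID value is an
assumption here rather than something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative table entry shape; not copied from a kernel header. */
struct phy_id_entry {
	uint32_t phy_id;
	uint32_t phy_id_mask;
};

/* Assumed ID for the BCM5241; verify against the driver before use. */
#define EXAMPLE_PHY_ID_BCM5241 0x0143bc30u

static const struct phy_id_entry example_tbl[] = {
	{ EXAMPLE_PHY_ID_BCM5241, 0xfffffff0u },
	{ 0, 0 },	/* terminator */
};

/* Return the matching entry, or NULL if the ID is not in the table. */
const struct phy_id_entry *match_phy_id(const struct phy_id_entry *tbl,
					uint32_t id)
{
	assert(tbl != NULL);
	for (; tbl->phy_id_mask != 0; tbl++) {
		if ((id & tbl->phy_id_mask) == (tbl->phy_id & tbl->phy_id_mask))
			return tbl;
	}
	return NULL;
}
```

The mask leaves the low revision bits out of the comparison, so one entry
covers every revision of the same PHY model.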

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
From: Patrick McHardy @ 2010-06-16 15:28 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Pedro Garcia, netdev, Eric Dumazet, Ben Hutchings
In-Reply-To: <201006161624.12101.arnd@arndb.de>

Arnd Bergmann wrote:
> On Wednesday 16 June 2010, Pedro Garcia wrote:
>   
>> Probably a definitive fix would be not to allow the definition of VLAN 0 
>> in 802.1q module and provide some other way to tag priority packets without
>> using a subinterface (maybe in the same module or a new 8021p one). I am 
>> having a look at the kernel to see what happens if we load two modules for 
>> the same protocol. 
>>     
>
> On a related note, we will also need to support 802.1Qad provider bridges
> at some point, which use yet another variation of the VLAN header (actually
> two nested VLAN tags) with a different ethertype.
> I need this for 802.1Qbg multi-channel VEPA (possibly also 802.1Qbh
> port extenders), but I have not yet investigated how to implement this
> in the VLAN module.
>   

Since we don't have any special VLAN handling in the bridging code, I
guess it comes down to optionally using a different ethertype value
(0x88a8) in the VLAN code. We probably also need some indication from
device drivers whether they are able to add these headers to avoid
trying to offload tagging in case they're not.
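
A rough userspace sketch of the "different ethertype value" idea, assuming
the only change is the TPID written in front of the tag control
information. `is_vlan_tpid()` and `write_vlan_tag()` are hypothetical
helpers for illustration, not functions from the VLAN code.

```c
#include <stdbool.h>
#include <stdint.h>

#define TPID_8021Q  0x8100 /* customer VLAN tag */
#define TPID_8021AD 0x88a8 /* provider (service) VLAN tag */

/* Hypothetical helper: true if the ethertype opens a VLAN header. */
static bool is_vlan_tpid(uint16_t ethertype)
{
	return ethertype == TPID_8021Q || ethertype == TPID_8021AD;
}

/* Write a 4-byte tag (TPID followed by TCI) in network byte order. */
static void write_vlan_tag(uint8_t *buf, uint16_t tpid, uint16_t tci)
{
	buf[0] = (uint8_t)(tpid >> 8);
	buf[1] = (uint8_t)(tpid & 0xff);
	buf[2] = (uint8_t)(tci >> 8);
	buf[3] = (uint8_t)(tci & 0xff);
}
```

A nested (provider bridge) header would then be an outer tag written with
TPID_8021AD followed by an inner tag written with TPID_8021Q.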



^ permalink raw reply

* Re: 2.6.35-rc2 kernel crashes under heavy network load
From: Lazy @ 2010-06-16 15:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, netdev
In-Reply-To: <1276698862.2632.93.camel@edumazet-laptop>

2010/6/16 Eric Dumazet <eric.dumazet@gmail.com>:
> On Wednesday 16 June 2010 at 15:39 +0200, Lazy wrote:
>> Our linux router crashes while rebooting under heavy network load
>> (800kpps generated by pktgen on other machine).
>> While running, the system seems stable.
>>
>> Any pointers what can I do to help resolve this issue ?
>>
>> the system is Dell R210 X3420, 64bit kernel, debian 5.0 Broadcom BCM5716
>> same thing happens on a Dell R410 running same software Broadcom BCM5716
>>
>> kernel version is Linux version 2.6.35-rc2 (root@cisco3-2) (gcc
>> version 4.3.2 (Debian 4.3.2-1.1) ) #2 SMP Fri Jun 11 10:22:51 CEST
>> 2010
>>
>> general protection fault: 0000 [#1] SMP
>> last sysfs file: /sys/devices/platform/dcdbas/smi_data_buf_phys_addr
>> CPU 1
>> Modules linked in: iTCO_wdt 8021q ipmi_poweroff ipmi_devintf ipmi_si
>> ipmi_msghandler mptctl loop ioatdma dca button evdev dcdbas raid10
>> raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq
>> async_tx raid1 raid0 linear md_mod sd_mod sg sr_mod cdrom
>> ide_pci_generic ide_core usbhid mptsas ata_piix mptscsih libata
>> mptbase scsi_transport_sas scsi_mod ehci_hcd uhci_hcd bnx2 fan [last
>> unloaded: scsi_wait_scan]
>>
>> Pid: 20, comm: events/1 Not tainted 2.6.35-rc2 #2 0N051F/PowerEdge R410
>> RIP: 0010:[<ffffffff81087ae9>]  [<ffffffff81087ae9>] drain_array+0x29/0xc7
>> RSP: 0000:ffff88012fb6ddc0  EFLAGS: 00010202
>> RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
>> RDX: 0720072007200720 RSI: ffff88012fc04ec0 RDI: ffff88012fc1be00
>> RBP: ffff88012fb6de00 R08: 0000000000000000 R09: ffff88012fa91618
>> R10: 000000102cdbdd88 R11: 0000000000000000 R12: ffff88012fc04ec0
>> R13: 0720072007200720 R14: ffff88012fc04ec0 R15: 0000000000000000
>
> 072007200720 pattern is the signature of a known bug.
>
>
> commit 386f40c86d6c8d5b717 (Revert "tty: fix a little bug in scrup,
> vt.c") will help you.
>
> This is probably solved in current git tree...


You are right, 2.6.35-rc3 (with this commit) works fine

thank You

-- 
Michal Grzedzicki

^ permalink raw reply

* Re: [PATCH 12/12] ptp: Added a clock driver for the National Semiconductor PHYTER.
From: Grant Likely @ 2010-06-16 15:10 UTC (permalink / raw)
  To: Richard Cochran
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Krzysztof Halasa
In-Reply-To: <20100616100539.GA3569-7KxsofuKt4IfAd9E5cN8NEzG7cXyKsk/@public.gmane.org>

On Wed, Jun 16, 2010 at 4:05 AM, Richard Cochran
<richardcochran-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Tue, Jun 15, 2010 at 12:49:13PM -0600, Grant Likely wrote:
>> Won't this break things for existing DP83640 users?
>
> Nope, the driver was only added five patches ago, and it only offers
> the timestamping stuff. The standard PHY functions just call the
> generic functions, so the PHY works fine even without this driver.
>
>> > +static struct ptp_clock *dp83640_clock;
>> > +DEFINE_SPINLOCK(clock_lock); /* protects the one and only dp83640_clock */
>>
>> Why only one?  Is it not possible to have 2 of these PHYs in a system?
>
> Yes, you can have multiple PHYs, but only one PTP clock.
>
> If you do use multiple PHYs, then you must wire their clocks together
> and adjust the PTP clock on only one of the PHYs.
>
>
> Thanks for your other comments,

You're welcome.  Make sure to cc: linux-kernel on your next posting.
I commented on what I could, but there is a lot of code outside my
areas of expertise.  In particular the time keeping code needs to be
looked at by the maintainers in that area.

Cheers,
g.

^ permalink raw reply

* DMFE Driver
From: Heiko Gerstung @ 2010-06-16 15:01 UTC (permalink / raw)
  To: netdev

Hi Everybody,

just a short heads up that I already found someone who is willing to
work on this and will receive hardware from me for testing it. And, I
already received a first patch from someone else, this is magnificent!
You guys really know what "customer service" is :-) !

If anybody has an idea regarding the ASIX driver and why it only seems
to work as a bonding group member when it is put into promisc, I would
really appreciate that!

Thanks again for all your time and assistance so far!

Regards,
  Heiko

-- 

Heiko Gerstung

*MEINBERG Funkuhren* GmbH & Co. KG
Lange Wand 9
D-31812 Bad Pyrmont, Germany
Phone: +49 (0)5281 9309-25
Fax: +49 (0)5281 9309-30
Amtsgericht Hannover 17HRA 100322
Geschäftsführer/Managing Directors: Günter Meinberg, Werner Meinberg,
Andre Hartmann, Heiko Gerstung
Email: heiko.gerstung@meinberg.de
Web: http://www.meinberg.de

------------------------------------------------------------------------
*MEINBERG - Accurate Time Worldwide*

^ permalink raw reply

* [PATCH net-next-2.6] inetpeer: restore small inet_peer structures
From: Eric Dumazet @ 2010-06-16 14:52 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

Addition of rcu_head to struct inet_peer added 16 bytes on 64-bit arches.

That's a bit unfortunate, since the old size was exactly 64 bytes.

This can be solved by using a union between this rcu_head and four
fields that are normally used only when a refcount is taken on
inet_peer. rcu_head is used only when refcnt=-1, right before the
structure is freed.

Add an inet_peer_refcheck() function to check this assertion for a while.

We can bring back the SLAB_HWCACHE_ALIGN qualifier in kmem cache creation.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/inetpeer.h |   31 ++++++++++++++++++++++++++-----
 net/ipv4/inetpeer.c    |    4 ++--
 net/ipv4/route.c       |    1 +
 net/ipv4/tcp_ipv4.c    |   11 +++++++----
 4 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/include/net/inetpeer.h b/include/net/inetpeer.h
index 6174047..51c06af 100644
--- a/include/net/inetpeer.h
+++ b/include/net/inetpeer.h
@@ -22,11 +22,21 @@ struct inet_peer {
 	__u32			dtime;		/* the time of last use of not
 						 * referenced entries */
 	atomic_t		refcnt;
-	atomic_t		rid;		/* Frag reception counter */
-	atomic_t		ip_id_count;	/* IP ID for the next packet */
-	__u32			tcp_ts;
-	__u32			tcp_ts_stamp;
-	struct rcu_head		rcu;
+	/*
+	 * Once inet_peer is queued for deletion (refcnt == -1), following fields
+	 * are not available: rid, ip_id_count, tcp_ts, tcp_ts_stamp
+	 * We can share memory with rcu_head to keep inet_peer small
+	 * (less then 64 bytes)
+	 */
+	union {
+		struct {
+			atomic_t	rid;		/* Frag reception counter */
+			atomic_t	ip_id_count;	/* IP ID for the next packet */
+			__u32		tcp_ts;
+			__u32		tcp_ts_stamp;
+		};
+		struct rcu_head         rcu;
+	};
 };
 
 void			inet_initpeers(void) __init;
@@ -37,10 +47,21 @@ struct inet_peer	*inet_getpeer(__be32 daddr, int create);
 /* can be called from BH context or outside */
 extern void inet_putpeer(struct inet_peer *p);
 
+/*
+ * temporary check to make sure we dont access rid, ip_id_count, tcp_ts,
+ * tcp_ts_stamp if no refcount is taken on inet_peer
+ */
+static inline void inet_peer_refcheck(const struct inet_peer *p)
+{
+	WARN_ON_ONCE(atomic_read(&p->refcnt) <= 0);
+}
+
+
 /* can be called with or without local BH being disabled */
 static inline __u16	inet_getid(struct inet_peer *p, int more)
 {
 	more++;
+	inet_peer_refcheck(p);
 	return atomic_add_return(more, &p->ip_id_count) - more;
 }
 
diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
index 349249f..bb58aed 100644
--- a/net/ipv4/inetpeer.c
+++ b/net/ipv4/inetpeer.c
@@ -64,7 +64,7 @@
  *		   usually under some other lock to prevent node disappearing
  *		dtime: unused node list lock
  *		v4daddr: unchangeable
- *		ip_id_count: idlock
+ *		ip_id_count: atomic value (no lock needed)
  */
 
 static struct kmem_cache *peer_cachep __read_mostly;
@@ -129,7 +129,7 @@ void __init inet_initpeers(void)
 
 	peer_cachep = kmem_cache_create("inet_peer_cache",
 			sizeof(struct inet_peer),
-			0, SLAB_PANIC,
+			0, SLAB_HWCACHE_ALIGN | SLAB_PANIC,
 			NULL);
 
 	/* All the timers, started at system startup tend
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index a291edb..03430de 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2881,6 +2881,7 @@ static int rt_fill_info(struct net *net,
 	error = rt->dst.error;
 	expires = rt->dst.expires ? rt->dst.expires - jiffies : 0;
 	if (rt->peer) {
+		inet_peer_refcheck(rt->peer);
 		id = atomic_read(&rt->peer->ip_id_count) & 0xffff;
 		if (rt->peer->tcp_ts_stamp) {
 			ts = rt->peer->tcp_ts;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 7f9515c..2e41e6f 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -204,10 +204,12 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 		 * TIME-WAIT * and initialize rx_opt.ts_recent from it,
 		 * when trying new connection.
 		 */
-		if (peer != NULL &&
-		    (u32)get_seconds() - peer->tcp_ts_stamp <= TCP_PAWS_MSL) {
-			tp->rx_opt.ts_recent_stamp = peer->tcp_ts_stamp;
-			tp->rx_opt.ts_recent = peer->tcp_ts;
+		if (peer) {
+			inet_peer_refcheck(peer);
+			if ((u32)get_seconds() - peer->tcp_ts_stamp <= TCP_PAWS_MSL) {
+				tp->rx_opt.ts_recent_stamp = peer->tcp_ts_stamp;
+				tp->rx_opt.ts_recent = peer->tcp_ts;
+			}
 		}
 	}
 
@@ -1351,6 +1353,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 		    (dst = inet_csk_route_req(sk, req)) != NULL &&
 		    (peer = rt_get_peer((struct rtable *)dst)) != NULL &&
 		    peer->v4daddr == saddr) {
+			inet_peer_refcheck(peer);
 			if ((u32)get_seconds() - peer->tcp_ts_stamp < TCP_PAWS_MSL &&
 			    (s32)(peer->tcp_ts - req->ts_recent) >
 							TCP_PAWS_WINDOW) {
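
The space saving described in the changelog can be modeled in userspace.
This is only a sketch under stated assumptions: `struct fake_rcu_head` is
a two-pointer stand-in for the kernel's `struct rcu_head`, plain integers
replace `atomic_t`, and the leading fields of `struct inet_peer` are
omitted. All struct and function names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rcu_head: a next pointer plus a callback pointer. */
struct fake_rcu_head {
	struct fake_rcu_head *next;
	void (*func)(struct fake_rcu_head *head);
};

/* Layout before the patch: rcu_head is a separate member. */
struct peer_separate {
	int32_t refcnt;
	int32_t rid;
	int32_t ip_id_count;
	uint32_t tcp_ts;
	uint32_t tcp_ts_stamp;
	struct fake_rcu_head rcu;
};

/* Layout after the patch: rcu_head overlays the four per-peer fields. */
struct peer_shared {
	int32_t refcnt;
	union {
		struct {
			int32_t rid;
			int32_t ip_id_count;
			uint32_t tcp_ts;
			uint32_t tcp_ts_stamp;
		};
		struct fake_rcu_head rcu;
	};
};

/* Mirrors the idea of inet_peer_refcheck(): fields are valid only
 * while a reference is held (refcnt > 0). */
static bool peer_fields_valid(const struct peer_shared *p)
{
	return p->refcnt > 0;
}
```

Because the overlaid fields and the rcu_head are never live at the same
time, the shared layout costs nothing beyond the larger of the two.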



^ permalink raw reply related

* [PATCH] broadcom: Add 5241 support
From: Dmitry Eremin-Solenikov @ 2010-06-16 14:38 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Matt Carlson, Michael Chan

This patch adds the 5241 PHY ID to the broadcom module.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
---
 drivers/net/phy/broadcom.c |   21 +++++++++++++++++++++
 1 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c
index f482fc4..1c12a57 100644
--- a/drivers/net/phy/broadcom.c
+++ b/drivers/net/phy/broadcom.c
@@ -834,6 +834,21 @@ static struct phy_driver bcmac131_driver = {
 	.driver		= { .owner = THIS_MODULE },
 };
 
+static struct phy_driver bcm5241_driver = {
+	.phy_id		= 0x0143bc30,
+	.phy_id_mask	= 0xfffffff0,
+	.name		= "Broadcom BCM5241",
+	.features	= PHY_BASIC_FEATURES |
+			  SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
+	.config_init	= brcm_fet_config_init,
+	.config_aneg	= genphy_config_aneg,
+	.read_status	= genphy_read_status,
+	.ack_interrupt	= brcm_fet_ack_interrupt,
+	.config_intr	= brcm_fet_config_intr,
+	.driver		= { .owner = THIS_MODULE },
+};
+
 static int __init broadcom_init(void)
 {
 	int ret;
@@ -868,8 +883,13 @@ static int __init broadcom_init(void)
 	ret = phy_driver_register(&bcmac131_driver);
 	if (ret)
 		goto out_ac131;
+	ret = phy_driver_register(&bcm5241_driver);
+	if (ret)
+		goto out_5241;
 	return ret;
 
+out_5241:
+	phy_driver_unregister(&bcmac131_driver);
 out_ac131:
 	phy_driver_unregister(&bcm57780_driver);
 out_57780:
@@ -894,6 +914,7 @@ out_5411:
 
 static void __exit broadcom_exit(void)
 {
+	phy_driver_unregister(&bcm5241_driver);
 	phy_driver_unregister(&bcmac131_driver);
 	phy_driver_unregister(&bcm57780_driver);
 	phy_driver_unregister(&bcm50610m_driver);
-- 
1.7.1


^ permalink raw reply related

* Re: 2.6.35-rc2 kernel crashes under heavy network load
From: Eric Dumazet @ 2010-06-16 14:34 UTC (permalink / raw)
  To: Lazy; +Cc: linux-kernel, netdev
In-Reply-To: <AANLkTinal4PZ3ESQMIdqd8_zmnSDTNgweczNsYdGobTS@mail.gmail.com>

On Wednesday 16 June 2010 at 15:39 +0200, Lazy wrote:
> Our linux router crashes while rebooting under heavy network load
> (800kpps generated by pktgen on other machine).
> While running, the system seems stable.
> 
> Any pointers what can I do to help resolve this issue ?
> 
> the system is Dell R210 X3420, 64bit kernel, debian 5.0 Broadcom BCM5716
> same thing happens on a Dell R410 running same software Broadcom BCM5716
> 
> kernel version is Linux version 2.6.35-rc2 (root@cisco3-2) (gcc
> version 4.3.2 (Debian 4.3.2-1.1) ) #2 SMP Fri Jun 11 10:22:51 CEST
> 2010
> 
> general protection fault: 0000 [#1] SMP
> last sysfs file: /sys/devices/platform/dcdbas/smi_data_buf_phys_addr
> CPU 1
> Modules linked in: iTCO_wdt 8021q ipmi_poweroff ipmi_devintf ipmi_si
> ipmi_msghandler mptctl loop ioatdma dca button evdev dcdbas raid10
> raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq
> async_tx raid1 raid0 linear md_mod sd_mod sg sr_mod cdrom
> ide_pci_generic ide_core usbhid mptsas ata_piix mptscsih libata
> mptbase scsi_transport_sas scsi_mod ehci_hcd uhci_hcd bnx2 fan [last
> unloaded: scsi_wait_scan]
> 
> Pid: 20, comm: events/1 Not tainted 2.6.35-rc2 #2 0N051F/PowerEdge R410
> RIP: 0010:[<ffffffff81087ae9>]  [<ffffffff81087ae9>] drain_array+0x29/0xc7
> RSP: 0000:ffff88012fb6ddc0  EFLAGS: 00010202
> RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0720072007200720 RSI: ffff88012fc04ec0 RDI: ffff88012fc1be00
> RBP: ffff88012fb6de00 R08: 0000000000000000 R09: ffff88012fa91618
> R10: 000000102cdbdd88 R11: 0000000000000000 R12: ffff88012fc04ec0
> R13: 0720072007200720 R14: ffff88012fc04ec0 R15: 0000000000000000

072007200720 pattern is the signature of a known bug.


commit 386f40c86d6c8d5b717 (Revert "tty: fix a little bug in scrup,
vt.c") will help you.

This is probably solved in current git tree...
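
For readers wondering how such a register value can be recognized: 0x0720
is what a blank VGA text-mode cell looks like as a 16-bit word (attribute
0x07, character 0x20, a space), so a register full of repeated 0x0720
suggests console memory was written over other data. A small sketch of
that check follows; `looks_like_vga_blanks()` is a hypothetical helper,
not something from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if every 16-bit lane of the value is 0x0720 (blank VGA text cell). */
static bool looks_like_vga_blanks(uint64_t v)
{
	int i;

	for (i = 0; i < 4; i++) {
		if (((v >> (16 * i)) & 0xffff) != 0x0720)
			return false;
	}
	return true;
}
```

In the trace above, RDX and R13 would pass this check while the other
registers would not.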




^ permalink raw reply

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
From: Arnd Bergmann @ 2010-06-16 14:24 UTC (permalink / raw)
  To: Pedro Garcia; +Cc: netdev, Eric Dumazet, Ben Hutchings, Patrick McHardy
In-Reply-To: <6b5ed8108cebb1865c85e03d3244b6ed@dondevamos.com>

On Wednesday 16 June 2010, Pedro Garcia wrote:
> Probably a definitive fix would be not to allow the definition of VLAN 0 
> in 802.1q module and provide some other way to tag priority packets without
> using a subinterface (maybe in the same module or a new 8021p one). I am 
> having a look at the kernel to see what happens if we load two modules for 
> the same protocol. 

On a related note, we will also need to support 802.1Qad provider bridges
at some point, which use yet another variation of the VLAN header (actually
two nested VLAN tags) with a different ethertype.
I need this for 802.1Qbg multi-channel VEPA (possibly also 802.1Qbh
port extenders), but I have not yet investigated how to implement this
in the VLAN module.

> By the way, the changelog I have to write is just the text before the 
> patch?

Yes.

	Arnd

^ permalink raw reply

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
From: Eric Dumazet @ 2010-06-16 14:24 UTC (permalink / raw)
  To: Pedro Garcia; +Cc: netdev, Ben Hutchings, Patrick McHardy
In-Reply-To: <6b5ed8108cebb1865c85e03d3244b6ed@dondevamos.com>

On Wednesday 16 June 2010 at 15:28 +0200, Pedro Garcia wrote:

> 
> In my understanding, 802.1p is a "subset" of 802.1q, and they share the 
> protocol number. We can do a 802.1p module, but in the end it will end
> up reusing most of the code in 802.1q.
> 

I was more thinking of a default ETH_P_8021Q rx handler (aka
vlan_skb_recv_minimal) with minimum handling (only accept vid=0 frames),
being overridden by real 8021q handler if module loaded/present.
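
A sketch of what the acceptance test in such a minimal handler could look
like, assuming the standard TCI layout (3 bits of priority, 1 bit CFI,
12 bits of VLAN ID). `minimal_handler_accepts()` is a hypothetical name
used for illustration; it is not vlan_skb_recv_minimal itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define VLAN_VID_MASK   0x0fff /* low 12 bits of the TCI: VLAN ID */
#define VLAN_PRIO_SHIFT 13     /* top 3 bits of the TCI: 802.1p priority */

static uint16_t tci_vid(uint16_t tci)
{
	return tci & VLAN_VID_MASK;
}

static uint8_t tci_prio(uint16_t tci)
{
	return (uint8_t)(tci >> VLAN_PRIO_SHIFT);
}

/* Hypothetical minimal policy: accept only priority-tagged frames (VID 0). */
static bool minimal_handler_accepts(uint16_t tci)
{
	return tci_vid(tci) == 0;
}
```

Frames with any non-zero VID would still be left to the real 8021q
handler, when it is loaded.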

> In any case, defining a VLAN 0 usually ends up causing problems with
> which table the ARP entries get stored in. This patch solves the
> problem for anyone who is not using VLAN 0 explicitly, but if somebody
> is using VLAN 0 tagging it will work (whatever "work" means) as before.
> 
> Probably a definitive fix would be not to allow the definition of VLAN 0 
> in 802.1q module and provide some other way to tag priority packets without
> using a subinterface (maybe in the same module or a new 8021p one). I am 
> having a look at the kernel to see what happens if we load two modules for 
> the same protocol. 
> 
> By the way, the changelog I have to write is just the text before the 
> patch?

Yes, you can take a look at any existing patch for an example, like...

git show 6e327c11a91d190650df9aabe7d3694d4838bfa1

Check Documentation/SubmittingPatches, section 2)




^ permalink raw reply

* Re: [PATCH 08/12] ptp: Added a brand new class driver for ptp clocks.
From: Richard Cochran @ 2010-06-16 14:22 UTC (permalink / raw)
  To: Grant Likely
  Cc: netdev, devicetree-discuss, linuxppc-dev, linux-arm-kernel,
	Krzysztof Halasa, Thomas Gleixner
In-Reply-To: <AANLkTin4lYghEWhjzEyARvYDgHaXdniwLbfyZ4jY0rwm@mail.gmail.com>

On Tue, Jun 15, 2010 at 11:00:10AM -0600, Grant Likely wrote:
> 
> Question from an ignorant reviewer:  Why a new interface instead of
> working with the existing high resolution timers infrastructure?

Short answer: Timers are only one part of the PTP API. If you offer
the PTP clock as a Linux clock source, then you could just use the
existing POSIX timers. However, we decided not to offer the PTP clock
in that way. The following excerpts from an upcoming paper explain
why:

\subsection{Basic Clock Operations}

   Based on our experience with a number of commercially available
   hardware clocks, we identified a set of four basic clock
   operations. Besides simply setting or getting the clock's time
   value, two additional operations are needed for clock control.
   Once a PTP slave has an initial estimate of the offset to its
   master, it typically would need to shift the clock by the offset
   atomically.  Also, the servo loop of a slave periodically needs to
   adjust the clock frequency.

\subsection{Ancillary Clock Features}

   Perhaps the most challenging design issue was deciding how to offer
   a PTP clock's capabilities to the GNU/Linux operating system.  As
   John Eidson pointed out~\cite{eidson2006measurement}, modern
   operating systems provide surprisingly little support for
   programming based on absolute time.  As the IEEE 1588 standard is
   being applied to a wide variety of test, measurement, and control
   applications, we can imagine many possible ways to use an embedded
   computer equipped with a PTP hardware clock.  We do not expect that
   any API will be able to cover every conceivable application of this
   technology.  However, the design presented here does cover common
   use cases based on the capabilities of currently available hardware
   clocks.

   The design allows user space programs to control all of a clock's
   ancillary features.  Programs may create one-shot or periodic
   alarms, with signal delivery on expiration.  \Timestamps on
   external events are provided via a First In, First Out (FIFO)
   interface.  If the clock has output signals, then their periods are
   configurable from user space.  Synchronization of the Linux system
   time via the PPS subsystem may be enabled or disabled as desired.

\subsection{Synchronizing the Linux System Clock}

   One important question that needed to be addressed was, now that we
   have a precise time source, how do we synchronize the Linux kernel
   to it?  The Linux kernel offers a modular clock infrastructure
   comprising ``clock sources'' and ``clock event devices.'' Clock
   sources provide a monotonically increasing time base, and clock
   event devices are used to schedule the next interrupt for various
   timer events.

   We considered but ultimately rejected the idea of offering the PTP
   clock to the Linux kernel as a combined clock source and clock
   event device.  The one great advantage of this approach would have
   been that it obviates the need for synchronization when the PTP
   clock is selected as the system timer.

   However, this approach is problematic when using certain kinds of
   clock hardware. For example, physical layer (PHY) chip based clocks
   can only be accessed by the relatively slow 16 bit wide MDIO
   bus. Such a clock would not be suitable for providing high
   resolution timers, which are now a standard Linux kernel feature.
   Furthermore, we cannot even be sure that a given hardware clock
   will offer any interrupt to the system at all.

   Instead, we elected to use the Pulse Per Second (PPS) subsystem as
   a method to optionally synchronize the Linux system time to the PTP
   clock.  This method is feasible even for clocks that do not offer
   fast register access, such as the PHY clocks.  Of course, the main
   disadvantage of this approach is that the Linux system time will
   not be exactly synchronized to the PTP clock time.  Since PTP
   clocks can be synchronized an order of magnitude better than the
   typical operating system scheduling latency, we expect that this
   method will still yield acceptable results for many applications.
   Applications with more demanding time requirements may use the new
   PTP interfaces directly when needed.

\subsection{System Calls or Character Device}

   When adding new functionality to an operating system, a basic design
   decision is how user space programs will call into the kernel.  For
   the Linux kernel, two different ways come into question, namely
   system calls or as a ``character device.''  In an attempt to make
   the PTP clock API easy to understand, we patterned it after the
   existing Network Time Protocol (NTP) and the POSIX timer APIs, as
   described in Section~\ref{UserAPI}.  Both of these services are
   exported to the user space as system calls.  However, we decided to
   offer the PTP clock as a character device because extending the NTP
   and POSIX interfaces seemed impractical.  In addition, the
   character device's \fn{read()} method provides a
   convenient way to deliver time stamped events to user space
   programs.
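
The four basic clock operations named in the excerpt (set, get, shift by
an offset, adjust frequency) can be pictured with a small software model.
This is only a sketch: `struct model_clock` and the `clk_*()` functions
are hypothetical names, not the interface proposed in the patch series,
and a real hardware clock would perform the shift atomically in the
device.

```c
#include <stdint.h>

/* Hypothetical software model of a PTP hardware clock. */
struct model_clock {
	int64_t ns;       /* current time in nanoseconds */
	int32_t freq_ppb; /* frequency adjustment in parts per billion */
};

/* Operation 1: set the clock's time value. */
static void clk_settime(struct model_clock *c, int64_t ns)
{
	c->ns = ns;
}

/* Operation 2: get the clock's time value. */
static int64_t clk_gettime(const struct model_clock *c)
{
	return c->ns;
}

/* Operation 3: shift the clock by an offset in a single step. */
static void clk_adjtime(struct model_clock *c, int64_t delta_ns)
{
	c->ns += delta_ns;
}

/* Operation 4: adjust the clock frequency (used by the servo loop). */
static void clk_adjfreq(struct model_clock *c, int32_t ppb)
{
	c->freq_ppb = ppb;
}

/* Advance the model by a nominal interval, applying the adjustment. */
static void clk_tick(struct model_clock *c, int64_t nominal_ns)
{
	c->ns += nominal_ns + nominal_ns * c->freq_ppb / 1000000000;
}
```

A slave would typically call the shift operation once after its first
offset estimate and the frequency operation periodically afterwards.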

^ permalink raw reply

* Re: [PATCH] Clear IFF_XMIT_DST_RELEASE for teql interfaces
From: Eric Dumazet @ 2010-06-16 14:14 UTC (permalink / raw)
  To: hadi
  Cc: Tom Hughes, netdev, akpm, David S. Miller, Stephen Hemminger,
	Patrick McHardy, Tejun Heo, linux-kernel
In-Reply-To: <1276694745.3862.1.camel@bigi>

On Wednesday 16 June 2010 at 09:25 -0400, jamal wrote:
> On Wed, 2010-06-16 at 09:24 +0100, Tom Hughes wrote:
> > The sch_teql module, which can be used to load balance over a set of
> > underlying interfaces, stopped working after 2.6.30 and has been
> > broken in all kernels since then for any underlying interface which
> > requires the addition of link level headers.
> > 
> > The problem is that the transmit routine relies on being able to
> > access the destination address in the skb in order to do address
> > resolution once it has decided which underlying interface it is going
> > to transmit through.
> > 
> > In 2.6.31 the IFF_XMIT_DST_RELEASE flag was introduced, and set by
> > default for all interfaces, which causes the destination address to be
> > released before the transmit routine for the interface is called.
> > 
> > The solution is to clear that flag for teql interfaces.
> > 
> > Signed-off-by: Tom Hughes <tom@compton.nu>
> 
> Sounds reasonable. Let's CC Eric and get his ACK.
> 
> cheers,
> jamal
> 

Sure, I already Acked it in a previous message 5 days ago (although it
was not a formal patch; Stephen forwarded a bugzilla entry)

http://permalink.gmane.org/gmane.linux.network/163688

David, could you please add the bugzilla entry to the commit?

https://bugzilla.kernel.org/show_bug.cgi?id=16183

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Thanks !

^ permalink raw reply

* ipv6: netif_carrier_(on|off) with traces afterwards
From: Einar EL Lueck @ 2010-06-16 13:29 UTC (permalink / raw)
  To: netdev


Hi,
With IPv6 addresses configured and a network card doing
netif_carrier_off|on we see afterwards in 2.6.34 on S/390 some traces in
fib.

Example sequence of operations:
ip -6 addr add fd00:10:30:49:4008:ffff:35:2/80 dev eth1
ip link set eth1 up
ip link set eth1 down
# the following lines have as effect netif_carrier_off and then on (among
other stuff)
echo 0 > /sys/bus/ccwgroup/drivers/qeth/devices/0.0.e300/online
echo 1 > /sys/bus/ccwgroup/drivers/qeth/devices/0.0.e300/online
# end of plugging
ip link set eth1  up
ip link set eth1 down
=> at this point we get the following trace:

Badness
at /home/autobuild/BUILD/linux-2.6.34-20100531/net/ipv6/ip6_fib.c:1160
Modules linked in: qeth_l2 sunrpc qeth_l3 binfmt_misc dm_multipath scsi_dh
dm_mod ipv6 qeth ccwgroup
CPU: 9 Not tainted 2.6.34-43.x.20100531-s390xperformance #1
Process ip (pid: 18144, task: 000000007c304238, ksp: 00000000033af428)
Krnl PSW : 0704200180000000 000003c0018c05a4 (fib6_del+0x60/0x3f8 [ipv6])
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000000 0000000000000002 000000007b58a700
0000000000830398
           0000000000830398 0000000000000001 0000000000000000
0000000000830398
           000003c0018bba5e 00000000033af480 0000000071ff2780
000000007b58a700
           000003c0018a7000 000003c0018e4d50 00000000033af398
00000000033af2e0
Krnl Code: 000003c0018c0598: eb6ff1000004       lmg     %r6,%r15,256(%r15)
           000003c0018c059e: 07f4               bcr     15,%r4
           000003c0018c05a0: a7f40001           brc     15,3c0018c05a2
          >000003c0018c05a4: a728fffe           lhi     %r2,-2
           000003c0018c05a8: a7f4fff3           brc     15,3c0018c058e
           000003c0018c05ac: b90200aa           ltgr    %r10,%r10
           000003c0018c05b0: a784ffed           brc     8,3c0018c058a
           000003c0018c05b4: e32033b00020       cg      %r2,944(%r3)
Call Trace:
([<0000000000000020>] 0x20)
 [<000003c0018bb9c6>] __ip6_del_rt+0x6e/0xc8 [ipv6]
 [<000003c0018bba5e>] ip6_del_rt+0x3e/0x4c [ipv6]
 [<000003c0018b4448>] __ipv6_ifa_notify+0x13c/0x1d4 [ipv6]
 [<000003c0018b7072>] addrconf_ifdown+0x3b2/0x5f0 [ipv6]
 [<000003c0018b8b2c>] addrconf_notify+0xb4/0x944 [ipv6]
 [<00000000004fb6aa>] notifier_call_chain+0x5a/0xa0
 [<000000000016675a>] raw_notifier_call_chain+0x2a/0x3c
 [<0000000000459194>] __dev_notify_flags+0x88/0xac
 [<0000000000459200>] dev_change_flags+0x48/0x70
 [<00000000004664d4>] do_setlink+0x35c/0x60c
 [<00000000004678fa>] rtnl_newlink+0x446/0x574
 [<000000000047c89c>] netlink_rcv_skb+0xdc/0xf0
 [<00000000004671a8>] rtnetlink_rcv+0x3c/0x48
 [<000000000047c35e>] netlink_unicast+0x352/0x3ac
 [<000000000047cebe>] netlink_sendmsg+0x22a/0x35c
 [<00000000004414c4>] sock_sendmsg+0xdc/0x100
 [<00000000004417ca>] SyS_sendmsg+0x182/0x2d4
 [<000000000043f17c>] SyS_socketcall+0x150/0x338
 [<0000000000119346>] sysc_noemu+0x10/0x16
 [<0000004e1271c71a>] 0x4e1271c71a
Last Breaking-Event-Address:
 [<000003c0018c05a0>] fib6_del+0x5c/0x3f8 [ipv6]

Has anybody seen effects like this before on other platforms, or has
anybody suggestions for the root cause?

Thanks,
Einar.


^ permalink raw reply

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
From: Pedro Garcia @ 2010-06-16 13:28 UTC (permalink / raw)
  To: netdev; +Cc: Eric Dumazet, Ben Hutchings, Patrick McHardy
In-Reply-To: <4C18B898.4000307@trash.net>

On Wed, 16 Jun 2010 13:42:16 +0200, Patrick McHardy <kaber@trash.net>
wrote:
> Eric Dumazet wrote:
>> On Wednesday 16 June 2010 at 10:49 +0200, Pedro Garcia wrote:
>>> Here it is again. I added the modifications in
>>> http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868 for HW
>>> accelerated incoming packets (it did not apply cleanly on the last
>>> version of
>>> the kernel, so I applied manually). Now, if the VLAN 0 is not
>>> explicitly created by the user, VLAN 0 packets will be treated as no
>>> VLAN (802.1p packets), instead of dropping them.
>>>
>>> The patch is now for two files: vlan_core (accel) and vlan_dev (non
>>> accel)
>>>
>>> I can not test on HW accelerated devices, so if someone can check it I
>>> will appreciate (even though in the thread above it looked like yes).
>>> For non accel I tested in 2.6.26. Now the patch is for
>>> net-next-2.6, and it compiles OK, but I have to set up a test
>>> environment to check it is still OK (should, but better to test).
>>>
>>> Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
>>>     
>>
>> OK, the patch itself is correct.
>>   
> 
> Yes, looks fine to me as well.
> 
>> Now, could you please send it again with a proper changelog ?
>>
>> In this changelog, please explain why patch is needed, and
>> keep lines short (< 72 chars), like the one you did in your first mail.
>>
>> I'll then add my Signed-off-by, since I wrote the accelerated part ;)
>>
>> Note : I wonder if another patch is needed, in case 8021q module is
>> _not_ loaded. We probably should accept vlan 0 frames in this case ?
>>   
> 
> I agree that this would be best for consistency, but that would mean
> adding more special cases to __netif_receive_skb().

In my understanding, 802.1p is a "subset" of 802.1q, and they share the 
protocol number. We can do a 802.1p module, but in the end it will end
up reusing most of the code in 802.1q.

In any case, defining a VLAN 0 usually ends up causing problems with
which table the ARP entries get stored in. This patch solves the
problem for anyone who is not using VLAN 0 explicitly, but if somebody
is using VLAN 0 tagging it will work (whatever "work" means) as before.

Probably a definitive fix would be not to allow the definition of VLAN 0 
in 802.1q module and provide some other way to tag priority packets without
using a subinterface (maybe in the same module or a new 8021p one). I am 
having a look at the kernel to see what happens if we load two modules for 
the same protocol. 

By the way, the changelog I have to write is just the text before the 
patch?


Pedro

^ permalink raw reply

* Re: [PATCH] Clear IFF_XMIT_DST_RELEASE for teql interfaces
From: jamal @ 2010-06-16 13:25 UTC (permalink / raw)
  To: Tom Hughes
  Cc: netdev, akpm, David S. Miller, Eric Dumazet, Stephen Hemminger,
	Patrick McHardy, Tejun Heo, linux-kernel
In-Reply-To: <1276676668-10256-1-git-send-email-tom@compton.nu>

On Wed, 2010-06-16 at 09:24 +0100, Tom Hughes wrote:
> The sch_teql module, which can be used to load balance over a set of
> underlying interfaces, stopped working after 2.6.30 and has been
> broken in all kernels since then for any underlying interface which
> requires the addition of link level headers.
> 
> The problem is that the transmit routine relies on being able to
> access the destination address in the skb in order to do address
> resolution once it has decided which underlying interface it is going
> to transmit through.
> 
> In 2.6.31 the IFF_XMIT_DST_RELEASE flag was introduced, and set by
> default for all interfaces, which causes the destination address to be
> released before the transmit routine for the interface is called.
> 
> The solution is to clear that flag for teql interfaces.
> 
> Signed-off-by: Tom Hughes <tom@compton.nu>

Sounds reasonable. Let's CC Eric and get his ACK.

cheers,
jamal
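
For reference, the fix Tom describes boils down to clearing one bit in the
device's private flags at setup time. The sketch below models that in
userspace. `struct model_dev`, the `model_*()` functions and the bit value
of `EX_XMIT_DST_RELEASE` are all assumptions made for illustration; the
kernel defines its own IFF_XMIT_DST_RELEASE and struct net_device.

```c
#include <stdbool.h>

/* Illustrative bit value only; not taken from the kernel headers. */
#define EX_XMIT_DST_RELEASE 0x400u

/* Hypothetical stand-in for the relevant part of a network device. */
struct model_dev {
	unsigned int priv_flags;
};

/* Model of the default: the flag is set on every new interface. */
static void model_default_setup(struct model_dev *dev)
{
	dev->priv_flags |= EX_XMIT_DST_RELEASE;
}

/* Model of a teql-like master: it needs the dst in its transmit
 * routine, so it clears the flag after the default setup. */
static void model_teql_setup(struct model_dev *dev)
{
	model_default_setup(dev);
	dev->priv_flags &= ~EX_XMIT_DST_RELEASE;
}

/* Would the dst be released before the device's transmit routine runs? */
static bool dst_released_before_xmit(const struct model_dev *dev)
{
	return (dev->priv_flags & EX_XMIT_DST_RELEASE) != 0;
}
```

With the flag cleared, the destination entry stays attached to the skb
until the transmit routine has done its address resolution.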


^ permalink raw reply
