Netdev List
* Re: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
From: Peter Zijlstra @ 2011-04-14 16:56 UTC
  To: Alexander Duyck
  Cc: Wei Gu, Eric Dumazet, netdev, Kirsher, Jeffrey T, Mike Galbraith
In-Reply-To: <4DA723F1.7000901@intel.com>

On Thu, 2011-04-14 at 09:42 -0700, Alexander Duyck wrote:

> I'm doing some more digging into this now.  One thought that occurred to 
> me is that if the patch you mention is having some sort of effect this 
> could be a sign of perhaps a kernel timer or scheduling problem.

Right, so the removal of the NO_HZ throttle will allow the CPU to go
into C states more often; this could result in longer wake-up times for
IRQs.

We reverted because:
  - it caused significant battery drain due to not going into C states
    often enough, and
  - it's a much better idea to implement these things in the idle
    governor since it already has the job of guesstimating the idle
    duration.

I really can't remember back far enough to even come up with a theory of
why kernels prior to merging the NO_HZ throttle would not exhibit this
problem.





* Re: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
From: Eric Dumazet @ 2011-04-14 16:57 UTC
  To: Peter Zijlstra
  Cc: Alexander Duyck, Wei Gu, netdev, Kirsher, Jeffrey T,
	Mike Galbraith
In-Reply-To: <1302800202.2035.32.camel@laptop>

On Thursday, 14 April 2011 at 18:56 +0200, Peter Zijlstra wrote:
> On Thu, 2011-04-14 at 09:42 -0700, Alexander Duyck wrote:
> 
> > I'm doing some more digging into this now.  One thought that occurred to 
> > me is that if the patch you mention is having some sort of effect this 
> > could be a sign of perhaps a kernel timer or scheduling problem.
> 
> Right, so the removal of the NO_HZ throttle will allow the CPU to go
> into C states more often; this could result in longer wake-up times for
> IRQs.
> 
> We reverted because:
>   - it caused significant battery drain due to not going into C states
>     often enough, and
>   - it's a much better idea to implement these things in the idle
>     governor since it already has the job of guesstimating the idle
>     duration.
> 
> I really can't remember back far enough to even come up with a theory of
> why kernels prior to merging the NO_HZ throttle would not exhibit this
> problem.
> 
> 
> 

Normally, Wei Gu was already asked not to use C states.

http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01804533/c01804533.pdf

How can we/he check this ?





* ipv6 multicasting: "interface-local" scope
From: Kristoff Bonne @ 2011-04-14 17:23 UTC
  To: netdev

Hi,


(This is a repost of a message I posted yesterday in
linux.network.general but to which I have not seen any replies).


If I understand RFC 4291 correctly, the IPv6 multicast addresses
ffx1:<scope> are "interface local" scope.

In draft-ietf-ipngwg-scoping-arch-02, I read this:
"The interface-local scope spans a single interface only; a multicast
address of interface-local scope is useful only for loopback delivery of
multicasts within a single node, for example, as a form of inter-process
communication within a computer".


This is exactly what I want to do for an application I am working on.

However, when I try it and make a small program that sends out UDP
packets to the IPv6 multicast address ff11::1234, I do see them show up
on remote machines on my LAN!
So it looks like these packets do leave my local computer, which (I think)
is the opposite of what RFC 4291 tells me should happen.

Am I missing something, or is this a bug?



The machine on which I have tested this runs Ubuntu 10.04.1 LTS with
kernel 2.6.32-30-generic.



Anybody any ideas?



links:
- RFC4291: http://tools.ietf.org/html/rfc4291#section-2.7
- draft-ietf-ipngwg-scoping-arch-02:
http://tools.ietf.org/html/draft-ietf-ipngwg-scoping-arch-02

Cheerio! Kr. Bonne.
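
As a quick check of the address reading above, here is a minimal sketch
that decodes the scope field of an IPv6 multicast address. In RFC 4291 the
second byte of a multicast address carries the flags in its high nibble
and the scope in its low nibble, and scope 1 is interface-local. The
helper names mcast_scope() and is_interface_local() are hypothetical, not
library functions; inet_pton() and IN6_IS_ADDR_MULTICAST() are the
standard POSIX interfaces. The sketch only confirms how ff11::1234 is
classified; it says nothing about why the packets were seen on the LAN.

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Return the 4-bit scope field of an IPv6 multicast address given in
 * text form, or -1 if the text is not a valid IPv6 multicast address.
 * Hypothetical helper for illustration only.
 */
int mcast_scope(const char *text)
{
	struct in6_addr addr;

	assert(text != NULL);

	if (inet_pton(AF_INET6, text, &addr) != 1)
		return -1;		/* not parseable as IPv6 */
	if (!IN6_IS_ADDR_MULTICAST(&addr))
		return -1;		/* first byte is not 0xff */

	/* Second byte: high nibble = flags, low nibble = scope. */
	return addr.s6_addr[1] & 0x0f;
}

/* Non-zero if the address is multicast with interface-local scope (1). */
int is_interface_local(const char *text)
{
	return mcast_scope(text) == 1;
}
```

For example, mcast_scope("ff11::1234") yields 1 (interface-local), while
the all-nodes address ff02::1 yields 2 (link-local).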




* [PATCH] net: export skb_clone_tx_timestamp
From: Richard Cochran @ 2011-04-14 17:35 UTC
  To: netdev; +Cc: David Miller

MAC drivers compiled as modules may well want to call this function via
the skb_tx_timestamp inline function. This patch exports the function in
order to let this happen.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
 net/core/timestamping.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/core/timestamping.c b/net/core/timestamping.c
index 7e7ca37..3b00a6b 100644
--- a/net/core/timestamping.c
+++ b/net/core/timestamping.c
@@ -68,6 +68,7 @@ void skb_clone_tx_timestamp(struct sk_buff *skb)
 		break;
 	}
 }
+EXPORT_SYMBOL_GPL(skb_clone_tx_timestamp);
 
 void skb_complete_tx_timestamp(struct sk_buff *skb,
 			       struct skb_shared_hwtstamps *hwtstamps)
-- 
1.7.0.4



* Re: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
From: Eric Dumazet @ 2011-04-14 17:49 UTC
  To: Peter Zijlstra
  Cc: Alexander Duyck, Wei Gu, netdev, Kirsher, Jeffrey T,
	Mike Galbraith
In-Reply-To: <1302800221.3248.39.camel@edumazet-laptop>

On Thursday, 14 April 2011 at 18:57 +0200, Eric Dumazet wrote:
> On Thursday, 14 April 2011 at 18:56 +0200, Peter Zijlstra wrote:
> > On Thu, 2011-04-14 at 09:42 -0700, Alexander Duyck wrote:
> > 
> > > I'm doing some more digging into this now.  One thought that occurred to 
> > > me is that if the patch you mention is having some sort of effect this 
> > > could be a sign of perhaps a kernel timer or scheduling problem.
> > 
> > Right, so the removal of the NO_HZ throttle will allow the CPU to go
> > into C states more often; this could result in longer wake-up times for
> > IRQs.
> > 
> > We reverted because:
> >   - it caused significant battery drain due to not going into C states
> >     often enough, and
> >   - it's a much better idea to implement these things in the idle
> >     governor since it already has the job of guesstimating the idle
> >     duration.
> > 
> > I really can't remember back far enough to even come up with a theory of
> > why kernels prior to merging the NO_HZ throttle would not exhibit this
> > problem.
> > 
> > 
> > 
> 
> Normally, Wei Gu already asked to not use C states.
> 
> http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01804533/c01804533.pdf
> 
> How can we/he check this ?
> 
> 

Anyway, this could explain a latency problem, not packet drops.

With NAPI, we should get few hardware irqs under load.

Once softirq started, scheduler is out of the equation.





* [PATCH 1/2] bna: fix for clean fw re-initialization
From: Rasesh Mody @ 2011-04-14 18:05 UTC
  To: davem, netdev; +Cc: Rasesh Mody, Debashis Dutt

During a kernel crash, the bna control path state machine and firmware do
not get a notification and hence are not cleanly shut down. The registers
holding driver/IOC state information are not reset back to valid
disabled/parking values. This causes subsequent driver initialization
to hang during kdump kernel boot. This patch, during the initialization
of the first PCI function, resets the corresponding registers when an
unclean shutdown is detected by reading chip registers. This makes sure
that the ioc/fw gets a clean re-initialization.

Signed-off-by: Debashis Dutt <ddutt@brocade.com>
Signed-off-by: Rasesh Mody <rmody@brocade.com>
---
 drivers/net/bna/bfa_ioc.c    |   31 ++++++++++++++++++-------------
 drivers/net/bna/bfa_ioc.h    |    1 +
 drivers/net/bna/bfa_ioc_ct.c |   28 ++++++++++++++++++++++++++++
 drivers/net/bna/bfi.h        |    6 ++++--
 4 files changed, 51 insertions(+), 15 deletions(-)

diff --git a/drivers/net/bna/bfa_ioc.c b/drivers/net/bna/bfa_ioc.c
index e3de0b8..7581518 100644
--- a/drivers/net/bna/bfa_ioc.c
+++ b/drivers/net/bna/bfa_ioc.c
@@ -38,6 +38,8 @@
 #define bfa_ioc_map_port(__ioc) ((__ioc)->ioc_hwif->ioc_map_port(__ioc))
 #define bfa_ioc_notify_fail(__ioc)			\
 			((__ioc)->ioc_hwif->ioc_notify_fail(__ioc))
+#define bfa_ioc_sync_start(__ioc)               \
+			((__ioc)->ioc_hwif->ioc_sync_start(__ioc))
 #define bfa_ioc_sync_join(__ioc)			\
 			((__ioc)->ioc_hwif->ioc_sync_join(__ioc))
 #define bfa_ioc_sync_leave(__ioc)			\
@@ -602,7 +604,7 @@ bfa_iocpf_sm_fwcheck(struct bfa_iocpf *iocpf, enum iocpf_event event)
 	switch (event) {
 	case IOCPF_E_SEMLOCKED:
 		if (bfa_ioc_firmware_lock(ioc)) {
-			if (bfa_ioc_sync_complete(ioc)) {
+			if (bfa_ioc_sync_start(ioc)) {
 				iocpf->retry_count = 0;
 				bfa_ioc_sync_join(ioc);
 				bfa_fsm_set_state(iocpf, bfa_iocpf_sm_hwinit);
@@ -1314,7 +1316,7 @@ bfa_nw_ioc_fwver_cmp(struct bfa_ioc *ioc, struct bfi_ioc_image_hdr *fwhdr)
  * execution context (driver/bios) must match.
  */
 static bool
-bfa_ioc_fwver_valid(struct bfa_ioc *ioc)
+bfa_ioc_fwver_valid(struct bfa_ioc *ioc, u32 boot_env)
 {
 	struct bfi_ioc_image_hdr fwhdr, *drv_fwhdr;
 
@@ -1325,7 +1327,7 @@ bfa_ioc_fwver_valid(struct bfa_ioc *ioc)
 	if (fwhdr.signature != drv_fwhdr->signature)
 		return false;
 
-	if (fwhdr.exec != drv_fwhdr->exec)
+	if (swab32(fwhdr.param) != boot_env)
 		return false;
 
 	return bfa_nw_ioc_fwver_cmp(ioc, &fwhdr);
@@ -1352,9 +1354,12 @@ bfa_ioc_hwinit(struct bfa_ioc *ioc, bool force)
 {
 	enum bfi_ioc_state ioc_fwstate;
 	bool fwvalid;
+	u32 boot_env;
 
 	ioc_fwstate = readl(ioc->ioc_regs.ioc_fwstate);
 
+	boot_env = BFI_BOOT_LOADER_OS;
+
 	if (force)
 		ioc_fwstate = BFI_IOC_UNINIT;
 
@@ -1362,10 +1367,10 @@ bfa_ioc_hwinit(struct bfa_ioc *ioc, bool force)
 	 * check if firmware is valid
 	 */
 	fwvalid = (ioc_fwstate == BFI_IOC_UNINIT) ?
-		false : bfa_ioc_fwver_valid(ioc);
+		false : bfa_ioc_fwver_valid(ioc, boot_env);
 
 	if (!fwvalid) {
-		bfa_ioc_boot(ioc, BFI_BOOT_TYPE_NORMAL, ioc->pcidev.device_id);
+		bfa_ioc_boot(ioc, BFI_BOOT_TYPE_NORMAL, boot_env);
 		return;
 	}
 
@@ -1396,7 +1401,7 @@ bfa_ioc_hwinit(struct bfa_ioc *ioc, bool force)
 	/**
 	 * Initialize the h/w for any other states.
 	 */
-	bfa_ioc_boot(ioc, BFI_BOOT_TYPE_NORMAL, ioc->pcidev.device_id);
+	bfa_ioc_boot(ioc, BFI_BOOT_TYPE_NORMAL, boot_env);
 }
 
 void
@@ -1506,7 +1511,7 @@ bfa_ioc_hb_stop(struct bfa_ioc *ioc)
  */
 static void
 bfa_ioc_download_fw(struct bfa_ioc *ioc, u32 boot_type,
-		    u32 boot_param)
+		    u32 boot_env)
 {
 	u32 *fwimg;
 	u32 pgnum, pgoff;
@@ -1558,10 +1563,10 @@ bfa_ioc_download_fw(struct bfa_ioc *ioc, u32 boot_type,
 	/*
 	 * Set boot type and boot param at the end.
 	*/
-	writel((swab32(swab32(boot_type))), ((ioc->ioc_regs.smem_page_start)
+	writel(boot_type, ((ioc->ioc_regs.smem_page_start)
 			+ (BFI_BOOT_TYPE_OFF)));
-	writel((swab32(swab32(boot_param))), ((ioc->ioc_regs.smem_page_start)
-			+ (BFI_BOOT_PARAM_OFF)));
+	writel(boot_env, ((ioc->ioc_regs.smem_page_start)
+			+ (BFI_BOOT_LOADER_OFF)));
 }
 
 static void
@@ -1721,7 +1726,7 @@ bfa_ioc_pll_init(struct bfa_ioc *ioc)
  * as the entry vector.
  */
 static void
-bfa_ioc_boot(struct bfa_ioc *ioc, u32 boot_type, u32 boot_param)
+bfa_ioc_boot(struct bfa_ioc *ioc, u32 boot_type, u32 boot_env)
 {
 	void __iomem *rb;
 
@@ -1734,7 +1739,7 @@ bfa_ioc_boot(struct bfa_ioc *ioc, u32 boot_type, u32 boot_param)
 	 * Initialize IOC state of all functions on a chip reset.
 	 */
 	rb = ioc->pcidev.pci_bar_kva;
-	if (boot_param == BFI_BOOT_TYPE_MEMTEST) {
+	if (boot_type == BFI_BOOT_TYPE_MEMTEST) {
 		writel(BFI_IOC_MEMTEST, (rb + BFA_IOC0_STATE_REG));
 		writel(BFI_IOC_MEMTEST, (rb + BFA_IOC1_STATE_REG));
 	} else {
@@ -1743,7 +1748,7 @@ bfa_ioc_boot(struct bfa_ioc *ioc, u32 boot_type, u32 boot_param)
 	}
 
 	bfa_ioc_msgflush(ioc);
-	bfa_ioc_download_fw(ioc, boot_type, boot_param);
+	bfa_ioc_download_fw(ioc, boot_type, boot_env);
 
 	/**
 	 * Enable interrupts just before starting LPU
diff --git a/drivers/net/bna/bfa_ioc.h b/drivers/net/bna/bfa_ioc.h
index e4974bc..bd48abe 100644
--- a/drivers/net/bna/bfa_ioc.h
+++ b/drivers/net/bna/bfa_ioc.h
@@ -194,6 +194,7 @@ struct bfa_ioc_hwif {
 					bool msix);
 	void		(*ioc_notify_fail)	(struct bfa_ioc *ioc);
 	void		(*ioc_ownership_reset)	(struct bfa_ioc *ioc);
+	bool		(*ioc_sync_start)       (struct bfa_ioc *ioc);
 	void		(*ioc_sync_join)	(struct bfa_ioc *ioc);
 	void		(*ioc_sync_leave)	(struct bfa_ioc *ioc);
 	void		(*ioc_sync_ack)		(struct bfa_ioc *ioc);
diff --git a/drivers/net/bna/bfa_ioc_ct.c b/drivers/net/bna/bfa_ioc_ct.c
index 469997c..87aecdf 100644
--- a/drivers/net/bna/bfa_ioc_ct.c
+++ b/drivers/net/bna/bfa_ioc_ct.c
@@ -41,6 +41,7 @@ static void bfa_ioc_ct_map_port(struct bfa_ioc *ioc);
 static void bfa_ioc_ct_isr_mode_set(struct bfa_ioc *ioc, bool msix);
 static void bfa_ioc_ct_notify_fail(struct bfa_ioc *ioc);
 static void bfa_ioc_ct_ownership_reset(struct bfa_ioc *ioc);
+static bool bfa_ioc_ct_sync_start(struct bfa_ioc *ioc);
 static void bfa_ioc_ct_sync_join(struct bfa_ioc *ioc);
 static void bfa_ioc_ct_sync_leave(struct bfa_ioc *ioc);
 static void bfa_ioc_ct_sync_ack(struct bfa_ioc *ioc);
@@ -63,6 +64,7 @@ bfa_nw_ioc_set_ct_hwif(struct bfa_ioc *ioc)
 	nw_hwif_ct.ioc_isr_mode_set = bfa_ioc_ct_isr_mode_set;
 	nw_hwif_ct.ioc_notify_fail = bfa_ioc_ct_notify_fail;
 	nw_hwif_ct.ioc_ownership_reset = bfa_ioc_ct_ownership_reset;
+	nw_hwif_ct.ioc_sync_start = bfa_ioc_ct_sync_start;
 	nw_hwif_ct.ioc_sync_join = bfa_ioc_ct_sync_join;
 	nw_hwif_ct.ioc_sync_leave = bfa_ioc_ct_sync_leave;
 	nw_hwif_ct.ioc_sync_ack = bfa_ioc_ct_sync_ack;
@@ -345,6 +347,32 @@ bfa_ioc_ct_ownership_reset(struct bfa_ioc *ioc)
 /**
  * Synchronized IOC failure processing routines
  */
+static bool
+bfa_ioc_ct_sync_start(struct bfa_ioc *ioc)
+{
+	u32 r32 = readl(ioc->ioc_regs.ioc_fail_sync);
+	u32 sync_reqd = bfa_ioc_ct_get_sync_reqd(r32);
+
+	/*
+	 * Driver load time.  If the sync required bit for this PCI fn
+	 * is set, it is due to an unclean exit by the driver for this
+	 * PCI fn in the previous incarnation. Whoever comes here first
+	 * should clean it up, no matter which PCI fn.
+	 */
+
+	if (sync_reqd & bfa_ioc_ct_sync_pos(ioc)) {
+		writel(0, ioc->ioc_regs.ioc_fail_sync);
+		writel(1, ioc->ioc_regs.ioc_usage_reg);
+		writel(BFI_IOC_UNINIT, ioc->ioc_regs.ioc_fwstate);
+		writel(BFI_IOC_UNINIT, ioc->ioc_regs.alt_ioc_fwstate);
+		return true;
+	}
+
+	return bfa_ioc_ct_sync_complete(ioc);
+}
+/**
+ * Synchronized IOC failure processing routines
+ */
 static void
 bfa_ioc_ct_sync_join(struct bfa_ioc *ioc)
 {
diff --git a/drivers/net/bna/bfi.h b/drivers/net/bna/bfi.h
index a973968..6050379 100644
--- a/drivers/net/bna/bfi.h
+++ b/drivers/net/bna/bfi.h
@@ -184,12 +184,14 @@ enum bfi_mclass {
 #define BFI_IOC_MSGLEN_MAX	32	/* 32 bytes */
 
 #define BFI_BOOT_TYPE_OFF		8
-#define BFI_BOOT_PARAM_OFF		12
+#define BFI_BOOT_LOADER_OFF		12
 
-#define BFI_BOOT_TYPE_NORMAL 		0	/* param is device id */
+#define BFI_BOOT_TYPE_NORMAL 		0
 #define	BFI_BOOT_TYPE_FLASH		1
 #define	BFI_BOOT_TYPE_MEMTEST		2
 
+#define BFI_BOOT_LOADER_OS		0
+
 #define BFI_BOOT_MEMTEST_RES_ADDR   0x900
 #define BFI_BOOT_MEMTEST_RES_SIG    0xA0A1A2A3
 
-- 
1.7.1



* [PATCH 2/2] bna: fix memory leak during RX path cleanup
From: Rasesh Mody @ 2011-04-14 18:05 UTC
  To: davem, netdev; +Cc: Rasesh Mody, Debashis Dutt
In-Reply-To: <1302804319-15677-1-git-send-email-rmody@brocade.com>

The memory leak was caused by unintentional assignment of the Rx path
destroy callback function pointer to NULL just after correct
initialization.

Signed-off-by: Debashis Dutt <ddutt@brocade.com>
Signed-off-by: Rasesh Mody <rmody@brocade.com>
---
 drivers/net/bna/bnad.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/net/bna/bnad.c b/drivers/net/bna/bnad.c
index b9f2534..e588511 100644
--- a/drivers/net/bna/bnad.c
+++ b/drivers/net/bna/bnad.c
@@ -1837,7 +1837,6 @@ bnad_setup_rx(struct bnad *bnad, uint rx_id)
 	/* Initialize the Rx event handlers */
 	rx_cbfn.rcb_setup_cbfn = bnad_cb_rcb_setup;
 	rx_cbfn.rcb_destroy_cbfn = bnad_cb_rcb_destroy;
-	rx_cbfn.rcb_destroy_cbfn = NULL;
 	rx_cbfn.ccb_setup_cbfn = bnad_cb_ccb_setup;
 	rx_cbfn.ccb_destroy_cbfn = bnad_cb_ccb_destroy;
 	rx_cbfn.rx_cleanup_cbfn = bnad_cb_rx_cleanup;
-- 
1.7.1
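
The failure mode this patch describes can be illustrated with a small
userspace sketch. Everything here is a hypothetical stand-in, not bna
driver code: struct rx_callbacks, rcb_setup(), rcb_destroy() and
rx_cycle_leaks() are invented names, and the sketch assumes the teardown
path simply skips a destroy callback that is NULL (the commit message
does not say how the driver handles that case). Under that assumption,
overwriting the pointer right after setting it means the resource from
setup is never released.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's callback table and Rx resource. */
struct rx_callbacks {
	void (*setup)(void **res);
	void (*destroy)(void **res);
};

static int live_allocations;	/* allocated but not yet freed */

static void rcb_setup(void **res)
{
	assert(res != NULL);
	*res = malloc(64);
	if (*res)
		live_allocations++;
}

static void rcb_destroy(void **res)
{
	if (*res) {
		free(*res);
		*res = NULL;
		live_allocations--;
	}
}

/*
 * Run one setup/teardown cycle and return the number of allocations
 * still live afterwards. With clobber_destroy set, the destroy pointer
 * is overwritten with NULL right after initialization, mirroring the
 * line the patch removes.
 */
int rx_cycle_leaks(int clobber_destroy)
{
	struct rx_callbacks cb;
	void *res = NULL;
	int leaked;

	live_allocations = 0;
	cb.setup = rcb_setup;
	cb.destroy = rcb_destroy;
	if (clobber_destroy)
		cb.destroy = NULL;

	cb.setup(&res);
	if (cb.destroy)		/* assumed: teardown skips a missing callback */
		cb.destroy(&res);

	leaked = live_allocations;
	free(res);		/* keep the demo leak-free; free(NULL) is a no-op */
	return leaked;
}
```

Calling rx_cycle_leaks(0) models the fixed code and rx_cycle_leaks(1)
models the code before the patch.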



* Re: [net-next-2.6 RFC PATCH v3] ethtool: allow custom interval for physical identification
From: Jon Mason @ 2011-04-14 18:55 UTC
  To: Bruce Allan
  Cc: netdev, Ben Hutchings, Sathya Perla, Subbu Seetharaman,
	Ajit Khaparde, Michael Chan, Eilon Greenstein, Divy Le Ray,
	Don Fry, Solarflare linux maintainers, Steve Hodgson,
	Stephen Hemminger, Matt Carlson
In-Reply-To: <20110413230910.16317.11372.stgit@gitlad.jf.intel.com>

On Wed, Apr 13, 2011 at 04:09:10PM -0700, Bruce Allan wrote:
> When physical identification of an adapter is done by toggling the
> mechanism on and off through software utilizing the set_phys_id operation,
> it is done with a fixed duration for both on and off states.  Some drivers
> may want to set a custom duration for the on/off intervals.  This patch
> changes the API so the return code from the driver's entry point when it
> is called with ETHTOOL_ID_ACTIVE can specify the frequency at which to
> cycle the on/off states, and updates the drivers that have already been
> converted to use the new set_phys_id and use the synchronous method for
> identifying an adapter.
> 
> The physical identification frequency set in the updated drivers is based
> on how it was done prior to the introduction of set_phys_id.
> 
> Compile tested only.  Also fixes a compiler warning in sfc.
> 
> v2: drivers do not return -EINVAL for ETHTOOL_ID_ACTIVE
> v3: fold patchset into single patch and cleanup per Ben's feedback
> 
> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Acked-by: Jon Mason <jdmason@kudzu.us>
> Cc: Ben Hutchings <bhutchings@solarflare.com>
> Cc: Sathya Perla <sathya.perla@emulex.com>
> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
> Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
> Cc: Michael Chan <mchan@broadcom.com>
> Cc: Eilon Greenstein <eilong@broadcom.com>
> Cc: Divy Le Ray <divy@chelsio.com>
> Cc: Don Fry <pcnet32@frontier.com>
> Cc: Jon Mason <jdmason@kudzu.us>
> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
> Cc: Steve Hodgson <shodgson@solarflare.com>
> Cc: Stephen Hemminger <shemminger@linux-foundation.org>
> Cc: Matt Carlson <mcarlson@broadcom.com>
> ---
> 
>  drivers/net/benet/be_ethtool.c    |    2 +-
>  drivers/net/bnx2.c                |    2 +-
>  drivers/net/bnx2x/bnx2x_ethtool.c |    2 +-
>  drivers/net/cxgb3/cxgb3_main.c    |    2 +-
>  drivers/net/ewrk3.c               |    2 +-
>  drivers/net/niu.c                 |    2 +-
>  drivers/net/pcnet32.c             |    2 +-
>  drivers/net/s2io.c                |    2 +-
>  drivers/net/sfc/ethtool.c         |    6 +++---
>  drivers/net/skge.c                |    2 +-
>  drivers/net/sky2.c                |    2 +-
>  drivers/net/tg3.c                 |    2 +-
>  include/linux/ethtool.h           |    6 ++++--
>  net/core/ethtool.c                |   31 ++++++++++++++++---------------
>  14 files changed, 34 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/net/benet/be_ethtool.c b/drivers/net/benet/be_ethtool.c
> index 96f5502..80226e4 100644
> --- a/drivers/net/benet/be_ethtool.c
> +++ b/drivers/net/benet/be_ethtool.c
> @@ -516,7 +516,7 @@ be_set_phys_id(struct net_device *netdev,
>  	case ETHTOOL_ID_ACTIVE:
>  		be_cmd_get_beacon_state(adapter, adapter->hba_port_num,
>  					&adapter->beacon_state);
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		be_cmd_set_beacon_state(adapter, adapter->hba_port_num, 0, 0,
> diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
> index 0a52079..bf729ee 100644
> --- a/drivers/net/bnx2.c
> +++ b/drivers/net/bnx2.c
> @@ -7473,7 +7473,7 @@ bnx2_set_phys_id(struct net_device *dev, enum ethtool_phys_id_state state)
>  
>  		bp->leds_save = REG_RD(bp, BNX2_MISC_CFG);
>  		REG_WR(bp, BNX2_MISC_CFG, BNX2_MISC_CFG_LEDMODE_MAC);
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		REG_WR(bp, BNX2_EMAC_LED, BNX2_EMAC_LED_OVERRIDE |
> diff --git a/drivers/net/bnx2x/bnx2x_ethtool.c b/drivers/net/bnx2x/bnx2x_ethtool.c
> index ad7d91e..0a5e88d 100644
> --- a/drivers/net/bnx2x/bnx2x_ethtool.c
> +++ b/drivers/net/bnx2x/bnx2x_ethtool.c
> @@ -2025,7 +2025,7 @@ static int bnx2x_set_phys_id(struct net_device *dev,
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		bnx2x_set_led(&bp->link_params, &bp->link_vars,
> diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
> index 802c7a7..a087e06 100644
> --- a/drivers/net/cxgb3/cxgb3_main.c
> +++ b/drivers/net/cxgb3/cxgb3_main.c
> @@ -1757,7 +1757,7 @@ static int set_phys_id(struct net_device *dev,
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_OFF:
>  		t3_set_reg_field(adapter, A_T3DBG_GPIO_EN, F_GPIO0_OUT_VAL, 0);
> diff --git a/drivers/net/ewrk3.c b/drivers/net/ewrk3.c
> index c7ce443..17b6027 100644
> --- a/drivers/net/ewrk3.c
> +++ b/drivers/net/ewrk3.c
> @@ -1618,7 +1618,7 @@ static int ewrk3_set_phys_id(struct net_device *dev,
>  		/* Prevent ISR from twiddling the LED */
>  		lp->led_mask = 0;
>  		spin_unlock_irq(&lp->hw_lock);
> -		return -EINVAL;
> +		return 2;	/* cycle on/off twice per second */
>  
>  	case ETHTOOL_ID_ON:
>  		cr = inb(EWRK3_CR);
> diff --git a/drivers/net/niu.c b/drivers/net/niu.c
> index 3fa1e9c..ea2272f 100644
> --- a/drivers/net/niu.c
> +++ b/drivers/net/niu.c
> @@ -7896,7 +7896,7 @@ static int niu_set_phys_id(struct net_device *dev,
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
>  		np->orig_led_state = niu_led_state_save(np);
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		niu_force_led(np, 1);
> diff --git a/drivers/net/pcnet32.c b/drivers/net/pcnet32.c
> index e89afb9..0a1efba 100644
> --- a/drivers/net/pcnet32.c
> +++ b/drivers/net/pcnet32.c
> @@ -1038,7 +1038,7 @@ static int pcnet32_set_phys_id(struct net_device *dev,
>  		for (i = 4; i < 8; i++)
>  			lp->save_regs[i - 4] = a->read_bcr(ioaddr, i);
>  		spin_unlock_irqrestore(&lp->lock, flags);
> -		return -EINVAL;
> +		return 2;	/* cycle on/off twice per second */
>  
>  	case ETHTOOL_ID_ON:
>  	case ETHTOOL_ID_OFF:
> diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
> index 2d5cc61..2302d97 100644
> --- a/drivers/net/s2io.c
> +++ b/drivers/net/s2io.c
> @@ -5541,7 +5541,7 @@ static int s2io_ethtool_set_led(struct net_device *dev,
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
>  		sp->adapt_ctrl_org = readq(&bar0->gpio_control);
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		s2io_set_led(sp, true);
> diff --git a/drivers/net/sfc/ethtool.c b/drivers/net/sfc/ethtool.c
> index 644f7c1..5d8468f 100644
> --- a/drivers/net/sfc/ethtool.c
> +++ b/drivers/net/sfc/ethtool.c
> @@ -182,7 +182,7 @@ static int efx_ethtool_phys_id(struct net_device *net_dev,
>  			       enum ethtool_phys_id_state state)
>  {
>  	struct efx_nic *efx = netdev_priv(net_dev);
> -	enum efx_led_mode mode;
> +	enum efx_led_mode mode = EFX_LED_DEFAULT;
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ON:
> @@ -194,8 +194,8 @@ static int efx_ethtool_phys_id(struct net_device *net_dev,
>  	case ETHTOOL_ID_INACTIVE:
>  		mode = EFX_LED_DEFAULT;
>  		break;
> -	default:
> -		return -EINVAL;
> +	case ETHTOOL_ID_ACTIVE:
> +		return 1;	/* cycle on/off once per second */
>  	}
>  
>  	efx->type->set_id_led(efx, mode);
> diff --git a/drivers/net/skge.c b/drivers/net/skge.c
> index 310dcbc..176d784 100644
> --- a/drivers/net/skge.c
> +++ b/drivers/net/skge.c
> @@ -753,7 +753,7 @@ static int skge_set_phys_id(struct net_device *dev,
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
> -		return -EINVAL;
> +		return 2;	/* cycle on/off twice per second */
>  
>  	case ETHTOOL_ID_ON:
>  		skge_led(skge, LED_MODE_TST);
> diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
> index a4b8fe5..c8d0451 100644
> --- a/drivers/net/sky2.c
> +++ b/drivers/net/sky2.c
> @@ -3813,7 +3813,7 @@ static int sky2_set_phys_id(struct net_device *dev,
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  	case ETHTOOL_ID_INACTIVE:
>  		sky2_led(sky2, MO_LED_NORM);
>  		break;
> diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
> index 9d7defc..7c1a9dd 100644
> --- a/drivers/net/tg3.c
> +++ b/drivers/net/tg3.c
> @@ -10292,7 +10292,7 @@ static int tg3_set_phys_id(struct net_device *dev,
>  
>  	switch (state) {
>  	case ETHTOOL_ID_ACTIVE:
> -		return -EINVAL;
> +		return 1;	/* cycle on/off once per second */
>  
>  	case ETHTOOL_ID_ON:
>  		tw32(MAC_LED_CTRL, LED_CTRL_LNKLED_OVERRIDE |
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index ad22a68..9de3127 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -798,8 +798,10 @@ bool ethtool_invalid_flags(struct net_device *dev, u32 data, u32 supported);
>   *	attached to it.  The implementation may update the indicator
>   *	asynchronously or synchronously, but in either case it must return
>   *	quickly.  It is initially called with the argument %ETHTOOL_ID_ACTIVE,
> - *	and must either activate asynchronous updates or return -%EINVAL.
> - *	If it returns -%EINVAL then it will be called again at intervals with
> + *	and must either activate asynchronous updates and return zero, return
> + *	a negative error or return a positive frequency for synchronous
> + *	indication (e.g. 1 for one on/off cycle per second).  If it returns
> + *	a frequency then it will be called again at intervals with the
>   *	argument %ETHTOOL_ID_ON or %ETHTOOL_ID_OFF and should set the state of
>   *	the indicator accordingly.  Finally, it is called with the argument
>   *	%ETHTOOL_ID_INACTIVE and must deactivate the indicator.  Returns a
> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> index 41dee2d..13d79f5 100644
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
> @@ -1669,7 +1669,7 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
>  		return dev->ethtool_ops->phys_id(dev, id.data);
>  
>  	rc = dev->ethtool_ops->set_phys_id(dev, ETHTOOL_ID_ACTIVE);
> -	if (rc && rc != -EINVAL)
> +	if (rc < 0)
>  		return rc;
>  
>  	/* Drop the RTNL lock while waiting, but prevent reentry or
> @@ -1684,21 +1684,22 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
>  		schedule_timeout_interruptible(
>  			id.data ? (id.data * HZ) : MAX_SCHEDULE_TIMEOUT);
>  	} else {
> -		/* Driver expects to be called periodically */
> +		/* Driver expects to be called at twice the frequency in rc */
> +		int n = rc * 2, i, interval = HZ / n;
> +
> +		/* Count down seconds */
>  		do {
> -			rtnl_lock();
> -			rc = dev->ethtool_ops->set_phys_id(dev, ETHTOOL_ID_ON);
> -			rtnl_unlock();
> -			if (rc)
> -				break;
> -			schedule_timeout_interruptible(HZ / 2);
> -
> -			rtnl_lock();
> -			rc = dev->ethtool_ops->set_phys_id(dev, ETHTOOL_ID_OFF);
> -			rtnl_unlock();
> -			if (rc)
> -				break;
> -			schedule_timeout_interruptible(HZ / 2);
> +			/* Count down iterations per second */
> +			i = n;
> +			do {
> +				rtnl_lock();
> +				rc = dev->ethtool_ops->set_phys_id(dev,
> +				    (i & 1) ? ETHTOOL_ID_OFF : ETHTOOL_ID_ON);
> +				rtnl_unlock();
> +				if (rc)
> +					break;
> +				schedule_timeout_interruptible(interval);
> +			} while (!signal_pending(current) && --i != 0);
>  		} while (!signal_pending(current) &&
>  			 (id.data == 0 || --id.data != 0));
>  	}
> 
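
To make the timing of the new loop in net/core/ethtool.c easier to see,
here is a userspace sketch that lists the ON/OFF calls it would issue.
It is a model under stated assumptions, not kernel code: blink_schedule()
and enum id_state are hypothetical names, the tick rate is passed in as
a parameter rather than taken from HZ, the duration is assumed to be
non-zero, and signals and driver errors are not modelled.

```c
#include <assert.h>

enum id_state { ID_OFF = 0, ID_ON = 1 };	/* stand-ins for ETHTOOL_ID_OFF/ON */

/*
 * Fill out[] with the ON/OFF calls the patched loop would make for a
 * driver that returned `freq` cycles per second, over `seconds` seconds.
 * *interval receives the delay between calls in ticks (hz / (2 * freq)).
 * Returns the number of calls written, or -1 if out[] is too small.
 */
int blink_schedule(int freq, int seconds, int hz,
		   enum id_state *out, int out_len, int *interval)
{
	int n = freq * 2;	/* calls per second: one ON and one OFF per cycle */
	int count = 0;

	assert(freq > 0 && seconds > 0 && hz >= n);
	assert(out != 0 && interval != 0);
	*interval = hz / n;

	for (int s = 0; s < seconds; s++) {
		/* Count down as the patch does: even i -> ON, odd i -> OFF. */
		for (int i = n; i != 0; i--) {
			if (count == out_len)
				return -1;
			out[count++] = (i & 1) ? ID_OFF : ID_ON;
		}
	}
	return count;
}
```

With a driver returning 2 and an assumed tick rate of 1000, one second
produces four calls (ON, OFF, ON, OFF) spaced 250 ticks apart; a driver
returning 1 produces two calls per second spaced 500 ticks apart, which
matches the previous fixed HZ / 2 behaviour.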


* Re: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel
From: Alexander Duyck @ 2011-04-14 19:08 UTC
  To: Eric Dumazet
  Cc: Peter Zijlstra, Wei Gu, netdev, Kirsher, Jeffrey T,
	Mike Galbraith
In-Reply-To: <1302803357.2744.1.camel@edumazet-laptop>

On 4/14/2011 10:49 AM, Eric Dumazet wrote:
> On Thursday, 14 April 2011 at 18:57 +0200, Eric Dumazet wrote:
>> On Thursday, 14 April 2011 at 18:56 +0200, Peter Zijlstra wrote:
>>> On Thu, 2011-04-14 at 09:42 -0700, Alexander Duyck wrote:
>>>
>>>> I'm doing some more digging into this now.  One thought that occurred to
>>>> me is that if the patch you mention is having some sort of effect this
>>>> could be a sign of perhaps a kernel timer or scheduling problem.
>>>
>>> Right, so the removal of the NO_HZ throttle will allow the CPU to go
>>> into C states more often; this could result in longer wake-up times for
>>> IRQs.
>>>
>>> We reverted because:
>>>    - it caused significant battery drain due to not going into C states
>>>      often enough, and
>>>    - it's a much better idea to implement these things in the idle
>>>      governor since it already has the job of guesstimating the idle
>>>      duration.
>>>
>>> I really can't remember back far enough to even come up with a theory of
>>> why kernels prior to merging the NO_HZ throttle would not exhibit this
>>> problem.
>>>
>>>
>>>
>>
>> Normally, Wei Gu already asked to not use C states.
>>
>> http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01804533/c01804533.pdf
>>
>> How can we/he check this ?
>>
>>
>
> Anyway, this could explain a latency problem, not packet drops.
>
> With NAPI, we should get few hardware irqs under load.
>
> Once softirq started, scheduler is out of the equation.

The problem is that on these newer systems it is becoming significantly
harder to get locked into the polling-only state.  In many cases we will
just complete all of the RX work in a single poll and go back to
interrupts.  This is especially true when traffic is spread out across
multiple queues and CPUs.

I'm thinking that maybe powertop results for before that patch and after 
that patch should be pretty telling.  It should tell us if C states are 
active, and if so it will also tell us if we are being woken by 
interrupts or if we are staying in the polling state.

Thanks,

Alex
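
The effect described above can be sketched with a toy model of one
NAPI-style receive queue. This is a simplified illustration, not driver
or kernel code: count_hw_irqs() is a hypothetical function, time is
divided into abstract ticks with one poll per tick, and the budget of 64
used in the example is an assumed value. The point is only that a queue
whose polls finish under budget keeps returning to interrupt mode, while
a saturated queue takes one interrupt and then stays in polling.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of one NAPI-style receive queue. Each tick, arrivals[t]
 * packets land in the queue. If interrupts are enabled and packets are
 * waiting, one hardware interrupt fires and polling starts. Each poll
 * handles at most `budget` packets; a poll that uses less than the full
 * budget re-enables interrupts, one that uses it all keeps polling.
 * Returns the number of hardware interrupts taken.
 */
int count_hw_irqs(const int *arrivals, size_t ticks, int budget)
{
	int queued = 0, irqs = 0, irq_enabled = 1;

	assert(arrivals != NULL && budget > 0);

	for (size_t t = 0; t < ticks; t++) {
		int work;

		queued += arrivals[t];

		if (irq_enabled) {
			if (queued == 0)
				continue;	/* idle: nothing to wake us */
			irqs++;			/* hardware interrupt */
			irq_enabled = 0;	/* switch to polling */
		}

		work = queued < budget ? queued : budget;
		queued -= work;
		if (work < budget)
			irq_enabled = 1;	/* queue drained: back to interrupts */
	}
	return irqs;
}
```

In this model, ten ticks of 100 packets each with a budget of 64 cost a
single interrupt, while ten ticks of 10 packets each cost ten, one per
tick. Spreading the same traffic over more queues pushes each queue
toward the second case.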


* pull request: wireless-2.6 2011-04-14
From: John W. Linville @ 2011-04-14 20:08 UTC
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Dave,

Another small round of fixes intended for 2.6.39...

Included are a fix for a WARNING from ath9k regarding a DMA failure, a
fix for iwlegacy not initializing its Tx power correctly, and a small
ath9k fix to report the driver name correctly to ethtool.

Please let me know if there are problems!

Thanks,

John

---

The following changes since commit 38a2f37258f9e2ae3f6e4241e01088be8dfaf4e9:

  usbnet: Fix up 'FLAG_POINTTOPOINT' and 'FLAG_MULTI_PACKET' overlaps. (2011-04-14 00:22:27 -0700)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git master

Felix Fietkau (1):
      ath9k_hw: fix stopping rx DMA during resets

Stanislaw Gruszka (1):
      iwlegacy: fix tx_power initialization

Sujith Manoharan (1):
      ath9k_htc: Fix ethtool reporting

 drivers/net/wireless/ath/ath9k/hif_usb.c     |    4 ++--
 drivers/net/wireless/ath/ath9k/hw.c          |    9 ---------
 drivers/net/wireless/ath/ath9k/mac.c         |   25 ++++++++++++++++++++++---
 drivers/net/wireless/ath/ath9k/mac.h         |    2 +-
 drivers/net/wireless/ath/ath9k/recv.c        |    6 +++---
 drivers/net/wireless/iwlegacy/iwl-3945-hw.h  |    2 --
 drivers/net/wireless/iwlegacy/iwl-4965-hw.h  |    3 ---
 drivers/net/wireless/iwlegacy/iwl-core.c     |   17 +++++++++++------
 drivers/net/wireless/iwlegacy/iwl-eeprom.c   |    7 -------
 drivers/net/wireless/iwlegacy/iwl3945-base.c |    4 ----
 drivers/net/wireless/iwlegacy/iwl4965-base.c |    6 ------
 11 files changed, 39 insertions(+), 46 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/hif_usb.c b/drivers/net/wireless/ath/ath9k/hif_usb.c
index f1b8af6..2d10239 100644
--- a/drivers/net/wireless/ath/ath9k/hif_usb.c
+++ b/drivers/net/wireless/ath/ath9k/hif_usb.c
@@ -1040,7 +1040,7 @@ static int ath9k_hif_usb_probe(struct usb_interface *interface,
 	}
 
 	ret = ath9k_htc_hw_init(hif_dev->htc_handle,
-				&hif_dev->udev->dev, hif_dev->device_id,
+				&interface->dev, hif_dev->device_id,
 				hif_dev->udev->product, id->driver_info);
 	if (ret) {
 		ret = -EINVAL;
@@ -1158,7 +1158,7 @@ fail_resume:
 #endif
 
 static struct usb_driver ath9k_hif_usb_driver = {
-	.name = "ath9k_hif_usb",
+	.name = KBUILD_MODNAME,
 	.probe = ath9k_hif_usb_probe,
 	.disconnect = ath9k_hif_usb_disconnect,
 #ifdef CONFIG_PM
diff --git a/drivers/net/wireless/ath/ath9k/hw.c b/drivers/net/wireless/ath/ath9k/hw.c
index 1ec9bcd..c95bc5c 100644
--- a/drivers/net/wireless/ath/ath9k/hw.c
+++ b/drivers/net/wireless/ath/ath9k/hw.c
@@ -1254,15 +1254,6 @@ int ath9k_hw_reset(struct ath_hw *ah, struct ath9k_channel *chan,
 	ah->txchainmask = common->tx_chainmask;
 	ah->rxchainmask = common->rx_chainmask;
 
-	if ((common->bus_ops->ath_bus_type != ATH_USB) && !ah->chip_fullsleep) {
-		ath9k_hw_abortpcurecv(ah);
-		if (!ath9k_hw_stopdmarecv(ah)) {
-			ath_dbg(common, ATH_DBG_XMIT,
-				"Failed to stop receive dma\n");
-			bChannelChange = false;
-		}
-	}
-
 	if (!ath9k_hw_setpower(ah, ATH9K_PM_AWAKE))
 		return -EIO;
 
diff --git a/drivers/net/wireless/ath/ath9k/mac.c b/drivers/net/wireless/ath/ath9k/mac.c
index 562257a..edc1cbb 100644
--- a/drivers/net/wireless/ath/ath9k/mac.c
+++ b/drivers/net/wireless/ath/ath9k/mac.c
@@ -751,28 +751,47 @@ void ath9k_hw_abortpcurecv(struct ath_hw *ah)
 }
 EXPORT_SYMBOL(ath9k_hw_abortpcurecv);
 
-bool ath9k_hw_stopdmarecv(struct ath_hw *ah)
+bool ath9k_hw_stopdmarecv(struct ath_hw *ah, bool *reset)
 {
 #define AH_RX_STOP_DMA_TIMEOUT 10000   /* usec */
 #define AH_RX_TIME_QUANTUM     100     /* usec */
 	struct ath_common *common = ath9k_hw_common(ah);
+	u32 mac_status, last_mac_status = 0;
 	int i;
 
+	/* Enable access to the DMA observation bus */
+	REG_WRITE(ah, AR_MACMISC,
+		  ((AR_MACMISC_DMA_OBS_LINE_8 << AR_MACMISC_DMA_OBS_S) |
+		   (AR_MACMISC_MISC_OBS_BUS_1 <<
+		    AR_MACMISC_MISC_OBS_BUS_MSB_S)));
+
 	REG_WRITE(ah, AR_CR, AR_CR_RXD);
 
 	/* Wait for rx enable bit to go low */
 	for (i = AH_RX_STOP_DMA_TIMEOUT / AH_TIME_QUANTUM; i != 0; i--) {
 		if ((REG_READ(ah, AR_CR) & AR_CR_RXE) == 0)
 			break;
+
+		if (!AR_SREV_9300_20_OR_LATER(ah)) {
+			mac_status = REG_READ(ah, AR_DMADBG_7) & 0x7f0;
+			if (mac_status == 0x1c0 && mac_status == last_mac_status) {
+				*reset = true;
+				break;
+			}
+
+			last_mac_status = mac_status;
+		}
+
 		udelay(AH_TIME_QUANTUM);
 	}
 
 	if (i == 0) {
 		ath_err(common,
-			"DMA failed to stop in %d ms AR_CR=0x%08x AR_DIAG_SW=0x%08x\n",
+			"DMA failed to stop in %d ms AR_CR=0x%08x AR_DIAG_SW=0x%08x DMADBG_7=0x%08x\n",
 			AH_RX_STOP_DMA_TIMEOUT / 1000,
 			REG_READ(ah, AR_CR),
-			REG_READ(ah, AR_DIAG_SW));
+			REG_READ(ah, AR_DIAG_SW),
+			REG_READ(ah, AR_DMADBG_7));
 		return false;
 	} else {
 		return true;
diff --git a/drivers/net/wireless/ath/ath9k/mac.h b/drivers/net/wireless/ath/ath9k/mac.h
index b2b2ff8..c2a5938 100644
--- a/drivers/net/wireless/ath/ath9k/mac.h
+++ b/drivers/net/wireless/ath/ath9k/mac.h
@@ -695,7 +695,7 @@ bool ath9k_hw_setrxabort(struct ath_hw *ah, bool set);
 void ath9k_hw_putrxbuf(struct ath_hw *ah, u32 rxdp);
 void ath9k_hw_startpcureceive(struct ath_hw *ah, bool is_scanning);
 void ath9k_hw_abortpcurecv(struct ath_hw *ah);
-bool ath9k_hw_stopdmarecv(struct ath_hw *ah);
+bool ath9k_hw_stopdmarecv(struct ath_hw *ah, bool *reset);
 int ath9k_hw_beaconq_setup(struct ath_hw *ah);
 
 /* Interrupt Handling */
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index a9c3f46..dcd19bc 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -486,12 +486,12 @@ start_recv:
 bool ath_stoprecv(struct ath_softc *sc)
 {
 	struct ath_hw *ah = sc->sc_ah;
-	bool stopped;
+	bool stopped, reset = false;
 
 	spin_lock_bh(&sc->rx.rxbuflock);
 	ath9k_hw_abortpcurecv(ah);
 	ath9k_hw_setrxfilter(ah, 0);
-	stopped = ath9k_hw_stopdmarecv(ah);
+	stopped = ath9k_hw_stopdmarecv(ah, &reset);
 
 	if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA)
 		ath_edma_stop_recv(sc);
@@ -506,7 +506,7 @@ bool ath_stoprecv(struct ath_softc *sc)
 			"confusing the DMA engine when we start RX up\n");
 		ATH_DBG_WARN_ON_ONCE(!stopped);
 	}
-	return stopped;
+	return stopped || reset;
 }
 
 void ath_flushrecv(struct ath_softc *sc)
diff --git a/drivers/net/wireless/iwlegacy/iwl-3945-hw.h b/drivers/net/wireless/iwlegacy/iwl-3945-hw.h
index 779d3cb..5c3a68d 100644
--- a/drivers/net/wireless/iwlegacy/iwl-3945-hw.h
+++ b/drivers/net/wireless/iwlegacy/iwl-3945-hw.h
@@ -74,8 +74,6 @@
 /* RSSI to dBm */
 #define IWL39_RSSI_OFFSET	95
 
-#define IWL_DEFAULT_TX_POWER	0x0F
-
 /*
  * EEPROM related constants, enums, and structures.
  */
diff --git a/drivers/net/wireless/iwlegacy/iwl-4965-hw.h b/drivers/net/wireless/iwlegacy/iwl-4965-hw.h
index 08b189c..fc6fa28 100644
--- a/drivers/net/wireless/iwlegacy/iwl-4965-hw.h
+++ b/drivers/net/wireless/iwlegacy/iwl-4965-hw.h
@@ -804,9 +804,6 @@ struct iwl4965_scd_bc_tbl {
 
 #define IWL4965_DEFAULT_TX_RETRY  15
 
-/* Limit range of txpower output target to be between these values */
-#define IWL4965_TX_POWER_TARGET_POWER_MIN	(0)	/* 0 dBm: 1 milliwatt */
-
 /* EEPROM */
 #define IWL4965_FIRST_AMPDU_QUEUE	10
 
diff --git a/drivers/net/wireless/iwlegacy/iwl-core.c b/drivers/net/wireless/iwlegacy/iwl-core.c
index a209a0e..2b08efb 100644
--- a/drivers/net/wireless/iwlegacy/iwl-core.c
+++ b/drivers/net/wireless/iwlegacy/iwl-core.c
@@ -160,6 +160,7 @@ int iwl_legacy_init_geos(struct iwl_priv *priv)
 	struct ieee80211_channel *geo_ch;
 	struct ieee80211_rate *rates;
 	int i = 0;
+	s8 max_tx_power = 0;
 
 	if (priv->bands[IEEE80211_BAND_2GHZ].n_bitrates ||
 	    priv->bands[IEEE80211_BAND_5GHZ].n_bitrates) {
@@ -235,8 +236,8 @@ int iwl_legacy_init_geos(struct iwl_priv *priv)
 
 			geo_ch->flags |= ch->ht40_extension_channel;
 
-			if (ch->max_power_avg > priv->tx_power_device_lmt)
-				priv->tx_power_device_lmt = ch->max_power_avg;
+			if (ch->max_power_avg > max_tx_power)
+				max_tx_power = ch->max_power_avg;
 		} else {
 			geo_ch->flags |= IEEE80211_CHAN_DISABLED;
 		}
@@ -249,6 +250,10 @@ int iwl_legacy_init_geos(struct iwl_priv *priv)
 				 geo_ch->flags);
 	}
 
+	priv->tx_power_device_lmt = max_tx_power;
+	priv->tx_power_user_lmt = max_tx_power;
+	priv->tx_power_next = max_tx_power;
+
 	if ((priv->bands[IEEE80211_BAND_5GHZ].n_channels == 0) &&
 	     priv->cfg->sku & IWL_SKU_A) {
 		IWL_INFO(priv, "Incorrectly detected BG card as ABG. "
@@ -1124,11 +1129,11 @@ int iwl_legacy_set_tx_power(struct iwl_priv *priv, s8 tx_power, bool force)
 	if (!priv->cfg->ops->lib->send_tx_power)
 		return -EOPNOTSUPP;
 
-	if (tx_power < IWL4965_TX_POWER_TARGET_POWER_MIN) {
+	/* 0 dBm mean 1 milliwatt */
+	if (tx_power < 0) {
 		IWL_WARN(priv,
-			 "Requested user TXPOWER %d below lower limit %d.\n",
-			 tx_power,
-			 IWL4965_TX_POWER_TARGET_POWER_MIN);
+			 "Requested user TXPOWER %d below 1 mW.\n",
+			 tx_power);
 		return -EINVAL;
 	}
 
diff --git a/drivers/net/wireless/iwlegacy/iwl-eeprom.c b/drivers/net/wireless/iwlegacy/iwl-eeprom.c
index 04c5648..cb346d1 100644
--- a/drivers/net/wireless/iwlegacy/iwl-eeprom.c
+++ b/drivers/net/wireless/iwlegacy/iwl-eeprom.c
@@ -471,13 +471,6 @@ int iwl_legacy_init_channel_map(struct iwl_priv *priv)
 					     flags & EEPROM_CHANNEL_RADAR))
 				       ? "" : "not ");
 
-			/* Set the tx_power_user_lmt to the highest power
-			 * supported by any channel */
-			if (eeprom_ch_info[ch].max_power_avg >
-						priv->tx_power_user_lmt)
-				priv->tx_power_user_lmt =
-				    eeprom_ch_info[ch].max_power_avg;
-
 			ch_info++;
 		}
 	}
diff --git a/drivers/net/wireless/iwlegacy/iwl3945-base.c b/drivers/net/wireless/iwlegacy/iwl3945-base.c
index 28eb3d8..cc7ebce 100644
--- a/drivers/net/wireless/iwlegacy/iwl3945-base.c
+++ b/drivers/net/wireless/iwlegacy/iwl3945-base.c
@@ -3825,10 +3825,6 @@ static int iwl3945_init_drv(struct iwl_priv *priv)
 	priv->force_reset[IWL_FW_RESET].reset_duration =
 		IWL_DELAY_NEXT_FORCE_FW_RELOAD;
 
-
-	priv->tx_power_user_lmt = IWL_DEFAULT_TX_POWER;
-	priv->tx_power_next = IWL_DEFAULT_TX_POWER;
-
 	if (eeprom->version < EEPROM_3945_EEPROM_VERSION) {
 		IWL_WARN(priv, "Unsupported EEPROM version: 0x%04X\n",
 			 eeprom->version);
diff --git a/drivers/net/wireless/iwlegacy/iwl4965-base.c b/drivers/net/wireless/iwlegacy/iwl4965-base.c
index 91b3d8b..d484c36 100644
--- a/drivers/net/wireless/iwlegacy/iwl4965-base.c
+++ b/drivers/net/wireless/iwlegacy/iwl4965-base.c
@@ -3140,12 +3140,6 @@ static int iwl4965_init_drv(struct iwl_priv *priv)
 
 	iwl_legacy_init_scan_params(priv);
 
-	/* Set the tx_power_user_lmt to the lowest power level
-	 * this value will get overwritten by channel max power avg
-	 * from eeprom */
-	priv->tx_power_user_lmt = IWL4965_TX_POWER_TARGET_POWER_MIN;
-	priv->tx_power_next = IWL4965_TX_POWER_TARGET_POWER_MIN;

^ permalink raw reply related

* Re: [PATCH NET-2.6 1/1] qlcnic: limit skb frags for non tso packet
From: Greg KH @ 2011-04-14 20:09 UTC (permalink / raw)
  To: Amit Salecha
  Cc: netdev, Anirban Chakraborty, David Miller, Ameen Rahman, stable
In-Reply-To: <99737F4847ED0A48AECC9F4A1974A4B80FD13840FE@MNEXMB2.qlogic.org>

On Thu, Apr 14, 2011 at 12:22:35AM -0500, Amit Salecha wrote:
> > > Footer will present in my reply to this email. But footer should not
> > be there in patches sent by me.
> > > Can you verify patch version 2 again ? Here
> > http://patchwork.ozlabs.org/patch/90938/ I don't see any footer.
> > > If you see footer with patch version 2, please send me that.
> >
> > Your footer was not in your patch, correct.  But it was in this email.
> >
> > And that's the issue, you can't have that footer on emails you send to
> > a
> > public list where you are going to be collaborating on a public
> > project,
> > otherwise no one can use anything you say.
> >
> > Now if you only think that people will just accept your patches,
> > without
> > being able to have you participate in development and maintance of
> > those
> > patches (which is required to be done through email), you are mistaken.
> >
> > So please fix your email issue, otherwise it is not going to work.
> >
> > Note, other people at qualcomm have fixed this, so you are not alone.
> >
> Ok.
> 
> Our IT has fixed the footer issue. Sending this email to verify.
> 
> -Amit
> 
> This message and any attached documents contain information from QLogic Corporation or its wholly-owned subsidiaries that may be confidential. If you are not the intended recipient, you may not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message.

Nope, still shows up :(


_______________________________________________
stable mailing list
stable@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/stable

^ permalink raw reply

* Re: pull request: wireless-2.6 2011-04-14
From: David Miller @ 2011-04-14 20:18 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20110414200858.GC2652@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Thu, 14 Apr 2011 16:08:58 -0400

> Another small round of fixes intended for 2.6.39...
> 
> Included are a fix for a WARNING from ath9k regarding a DMA failure, a
> fix for iwlegacy not initializing its Tx power correctly, and a small
> ath9k fix to report the driver name correctly to ethtool.

Pulled, thanks a lot John.

^ permalink raw reply

* Re: [PATCH] NFS: Fix infinite loop in gss_create_upcall()
From: Jiri Slaby @ 2011-04-14 20:37 UTC (permalink / raw)
  To: Bryan Schumaker
  Cc: Trond Myklebust, Jiri Slaby, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mm-commits-u79uwXL29TY76Z2rM5mHXA, ML netdev,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4DA60AB9.1050104-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>

On 04/13/2011 10:42 PM, Bryan Schumaker wrote:
> On 04/12/2011 02:52 PM, Jiri Slaby wrote:
>> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
>>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
>>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>>>>> inacceptable for automounted NFS dirs.
>>>>>
>>>>> I'm still confused as to why you are hitting it at all. In the normal
>>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
>>>>> and then trying rpcsec_gss if and only if that fails.
>>>>>
>>>>> Are you really exporting a filesystem using AUTH_NULL as the only
>>>>> supported flavour?
>>>>
>>>> I don't know, I connect to a nfs server which is not maintained by me.
>>>> It looks like that. How can I find out?
>>>
>>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
>>
>> I don't have NFS in modules. It's all built-in. And this one is
>> unconditionally selected because of CONFIG_NFS_V4.
> 
> Does this patch help?

Nope, it makes things even worse:
# mount -oro,intr XXX:/yyy /mnt/c/
<15s delay here>
mount.nfs: access denied by server while mounting XXX:/yyy

So in nfs4_proc_get_root I do:
  printk("%s: %d %u\n", __func__, i, flav_array[i]);
  status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
  printk("%s: res=%d\n", __func__, status);
and get:
[   18.159818] nfs4_proc_get_root: 0 1
[   18.214872] nfs4_proc_get_root: res=-1
[   18.214875] nfs4_proc_get_root: 1 0
[   18.254636] nfs4_proc_get_root: res=-1
[   18.254639] nfs4_proc_get_root: 2 390003
[   33.252174] RPC: AUTH_GSS upcall timed out.
[   33.252177] Please check user daemon is running.
[   33.252192] nfs4_proc_get_root: res=-13

If I revert that back and do the same:
[   28.275569] nfs4_proc_get_root: 0 1
[   28.296545] nfs4_proc_get_root: res=-1
[   28.296548] nfs4_proc_get_root: 1 390003
[   43.296107] RPC: AUTH_GSS upcall timed out.
[   43.296108] Please check user daemon is running.
[   43.296121] nfs4_proc_get_root: res=-13
[   43.296122] nfs4_proc_get_root: 2 0
[   43.318201] nfs4_proc_get_root: res=-1

I.e. all methods fail. And what matters is the last retval. From NULL it
is EPERM, from GSS it is EACCES. For EPERM, mount(8) falls back to
nfs3, for EACCES it dies a terrible death.
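
A minimal user-space sketch of that observation, not the kernel code itself.
It assumes the flavor loop stops at the first success and otherwise hands
back whatever the last attempt returned. lookup_root_sec_stub() and
get_root_sketch() are hypothetical stand-ins for nfs4_lookup_root_sec() and
nfs4_proc_get_root(); the flavor orders and error codes are copied from the
two printk traces above.

```c
#include <errno.h>
#include <stddef.h>

/* RPC auth flavor numbers as they appear in the printk output. */
enum {
	FLAVOR_AUTH_NULL = 0,
	FLAVOR_AUTH_SYS  = 1,
	FLAVOR_GSS_KRB5  = 390003
};

/* Flavor order seen with the patch applied, and with it reverted. */
static const unsigned int order_patched[]  = { 1, 0, 390003 };
static const unsigned int order_reverted[] = { 1, 390003, 0 };

/*
 * Hypothetical stand-in for nfs4_lookup_root_sec(): models a server that
 * rejects every flavor, using the error codes from the logs
 * (-1 == -EPERM, -13 == -EACCES).
 */
static int lookup_root_sec_stub(unsigned int flavor)
{
	return flavor == FLAVOR_GSS_KRB5 ? -EACCES : -EPERM;
}

/*
 * Try each flavor in order and stop at the first success. If every
 * flavor fails, the status of the last attempt is what gets returned.
 */
static int get_root_sketch(const unsigned int *flavors, size_t n)
{
	int status = -EPERM;
	size_t i;

	for (i = 0; i < n; i++) {
		status = lookup_root_sec_stub(flavors[i]);
		if (status == 0)
			break;
	}
	return status;
}
```

With order_patched the GSS flavor is tried last, so the sketch returns
-EACCES; with order_reverted AUTH_NULL is tried last and it returns -EPERM,
which is the case where mount(8) retries with NFSv3.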

linux-b984:~ # strace -fe mount -s 1000 mount -oro,intr XXX:/yyy /mnt/c/
Process 2396 attached
Process 2395 suspended
[pid  2396] mount("XXX:/yyy", "/mnt/c", "nfs", MS_RDONLY,
"intr,vers=4,addr=10.20.3.2,clientaddr=10.0.2.15") = -1 EPERM (Operation
not permitted)
[pid  2396] mount("XXX:/yyy", "/mnt/c", "nfs", MS_RDONLY,
"intr,addr=10.20.3.2,vers=3,proto=tcp,mountvers=3,mountproto=udp,mountport=709")
= 0
Process 2395 resumed
Process 2396 detached
--- SIGCHLD (Child exited) @ 0 (0) ---

thanks,
-- 
js
suse labs
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 1/2] bna: fix for clean fw re-initialization
From: David Miller @ 2011-04-14 20:39 UTC (permalink / raw)
  To: rmody; +Cc: netdev, ddutt
In-Reply-To: <1302804319-15677-1-git-send-email-rmody@brocade.com>

From: Rasesh Mody <rmody@brocade.com>
Date: Thu, 14 Apr 2011 11:05:18 -0700

> During a kernel crash, bna control path state machine and firmware do not
> get a notification and hence are not cleanly shutdown. The registers
> holding driver/IOC state information are not reset back to valid
> disabled/parking values. This causes subsequent driver initialization
> to hang during kdump kernel boot. This patch, during the initialization
> of first PCI function, resets corresponding register when unclean shutown
> is detect by reading chip registers. This will make sure that ioc/fw
> gets clean re-initialization.
> 
> Signed-off-by: Debashis Dutt <ddutt@brocade.com>
> Signed-off-by: Rasesh Mody <rmody@brocade.com>

Applied.

^ permalink raw reply

* Re: [PATCH 2/2] bna: fix memory leak during RX path cleanup
From: David Miller @ 2011-04-14 20:40 UTC (permalink / raw)
  To: rmody; +Cc: netdev, ddutt
In-Reply-To: <1302804319-15677-2-git-send-email-rmody@brocade.com>

From: Rasesh Mody <rmody@brocade.com>
Date: Thu, 14 Apr 2011 11:05:19 -0700

> The memory leak was caused by unintentional assignment of the Rx path
> destroy callback function pointer to NULL just after correct
> initialization.
> 
> Signed-off-by: Debashis Dutt <ddutt@brocade.com>
> Signed-off-by: Rasesh Mody <rmody@brocade.com>

Applied.

^ permalink raw reply

* Re: Race condition when creating multiple namespaces?
From: Hans Schillstrom @ 2011-04-14 20:46 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, Daniel Lezcano
In-Reply-To: <m1ei58co08.fsf@fess.ebiederm.org>

Hello
I thought this might have been a kvm bug, but now I've got it in net-next 2.6.39-rc2 too

On Tuesday, April 12, 2011 02:27:35 Eric W. Biederman wrote:
> Hans Schillstrom <hans@schillstrom.com> writes:
> 
> > Hello
> > I'v been strugling with this for some time now
> >
> > When creating multiple namespaces using lxc-start,  un-initialized network namespace parts will be called by the new process in the namespace.
> > ex. when using conntrack or ipvsadm to quickly,  (a sleep 2 "solves" the problem).
> > (From what I can see syscall clone() is used in lx-start  i.e. do_fork will be called later on.)
> > Actually I was debugging ip_vs when closing multiple ns  when I fell into this one.
> >
> > I have a loop that create 33 containers whith lxc-start ... -- test.sh
> > the first thing the new conatiner does in test.sh is
> > #!/bin/bash
> > iptables -t mangle -A PREROUTING -m conntrack --ctstate RELATED,ESTABLISHED -j CONNMARK --restore-mark
> > nc -l -p1234
> >
> > This results in NULL ptr in ip_conntrack_net_init(struct *net)
> 
> Ouch!
> 
> > and in anoither test test.sh looks like this
> > #!/bin/bash
> > ipvsadm --start-daemon=master --mcast-interface=lo
> > nc -l -p1234
> >
> > And this results in an uniitialized spinlock in ip_vs_sync
> >
> > I put a printk in nsproxy: copy_namespaces() and could see a dozens of them
> > before anything appears from ipvs or conntrack.
> >
> > My feeling is that when you start up user processes in a new name space, 
> > all kernel related init should have been done (you should not need to add a sleep to get it working)
> >
> > All test  made by using todays net-next-2.6 (2.6.39-rc1)
> >

Same problem in rc2 from today

> > Note:
> > That neither conntrack or ip_vs modules where loaded,
> > if modules where loaded before creating new namespaces it all works...
> >
> > Finally the question,
> > Should it really work to load modules within a namespace , 
> > that is a part of netns ?
> 
> From an implementation point of view kernel modules are not in a
> namespace, so there should be no difference between being in a namespace
> and loading a kernel networking module and not being in a namespace and
> loading in a kernel module.
> 
> It does sound like you have hit a module loading race, and perhaps
> a race that is confined to network namespaces.
> 

When the namespace was created I had a bunch of IPv4 & IPv6 tunnels and eth0 & eth1


[ 1114.323402] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 1114.330293] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[ 1114.331002] IP: [<ffffffff8104de50>] __sysctl_head_next+0x70/0x130
[ 1114.331002] PGD 169693067 PUD 16bfce067 PMD 0 
[ 1114.331002] Oops: 0000 [#1] PREEMPT SMP 
[ 1114.331002] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/scsi_generic/sg0/dev
[ 1114.331002] CPU 1 
[ 1114.331002] Modules linked in: nf_conntrack(+) macvlan arptable_filter arp_tables 3c59x nouveau ttm drm_kms_helper
[ 1114.331002] 
[ 1114.331002] Pid: 936, comm: modprobe Not tainted 2.6.39-rc2+ #21 System manufacturer System Product Name/P5B
[ 1114.331002] RIP: 0010:[<ffffffff8104de50>]  [<ffffffff8104de50>] __sysctl_head_next+0x70/0x130
[ 1114.331002] RSP: 0018:ffff880169c1bb98  EFLAGS: 00010286
[ 1114.331002] RAX: ffff88016bdb1530 RBX: fffffffffffffff8 RCX: 0000000000000000
[ 1114.331002] RDX: 000000000000e901 RSI: ffff880169c1bda8 RDI: ffffffff816b94a0
[ 1114.331002] RBP: ffff880169c1bbb8 R08: 0000000000000000 R09: ffff880169eee2b0
[ 1114.331002] R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
[ 1114.331002] R13: ffff880169c1bda8 R14: ffffffffa0103300 R15: 0000000000000001
[ 1114.331002] FS:  00007f6039af3700(0000) GS:ffff88017fc80000(0000) knlGS:0000000000000000
[ 1114.331002] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1114.331002] CR2: 0000000000000018 CR3: 000000016968d000 CR4: 00000000000006e0
[ 1114.331002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1114.331002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1114.331002] Process modprobe (pid: 936, threadinfo ffff880169c1a000, task ffff88016bcc16c0)
[ 1114.331002] Stack:
[ 1114.331002]  ffff88017fffcc00 ffff880169eec9c8 ffff880169c1bbf0 ffff880169eee388
[ 1114.331002]  ffff880169c1bc28 ffffffff8106fba5 000000007fffde48 ffff880169c1bda8
[ 1114.331002]  0000000201c94f80 ffff88016958f818 ffff880169c1bc38 0000000000000000
[ 1114.331002] Call Trace:
[ 1114.331002]  [<ffffffff8106fba5>] sysctl_check_table+0x2b5/0x3f0
[ 1114.331002]  [<ffffffff8106f955>] sysctl_check_table+0x65/0x3f0
[ 1114.331002]  [<ffffffff8106f955>] sysctl_check_table+0x65/0x3f0
[ 1114.331002]  [<ffffffff8104dadc>] __register_sysctl_paths+0xfc/0x320
[ 1114.331002]  [<ffffffff810fd85a>] ? cache_alloc_debugcheck_after+0xea/0x220
[ 1114.331002]  [<ffffffffa01006ce>] ? nf_conntrack_acct_init+0x3e/0xe0 [nf_conntrack]
[ 1114.331002]  [<ffffffff811007ef>] ? __kmalloc_track_caller+0x11f/0x2a0
[ 1114.331002]  [<ffffffff814534f1>] register_net_sysctl_table+0x61/0x70
[ 1114.331002]  [<ffffffffa01006f4>] nf_conntrack_acct_init+0x64/0xe0 [nf_conntrack]
[ 1114.331002]  [<ffffffffa00f8604>] nf_conntrack_init+0xf4/0x350 [nf_conntrack]
[ 1114.331002]  [<ffffffffa00fb614>] nf_conntrack_net_init+0x14/0x1a0 [nf_conntrack]
[ 1114.331002]  [<ffffffff813718d7>] ops_init+0x47/0x130
[ 1114.331002]  [<ffffffff81371de3>] register_pernet_operations+0xa3/0x180
[ 1114.331002]  [<ffffffffa010c000>] ? 0xffffffffa010bfff
[ 1114.331002]  [<ffffffffa010c000>] ? 0xffffffffa010bfff
[ 1114.331002]  [<ffffffff81371fec>] register_pernet_subsys+0x2c/0x50
[ 1114.331002]  [<ffffffffa010c010>] nf_conntrack_standalone_init+0x10/0x12 [nf_conntrack]
[ 1114.331002]  [<ffffffff810001d3>] do_one_initcall+0x43/0x170
[ 1114.331002]  [<ffffffff8108393b>] sys_init_module+0xbb/0x200
[ 1114.331002]  [<ffffffff81469beb>] system_call_fastpath+0x16/0x1b
[ 1114.331002] Code: 87 00 00 00 48 8b 5b 30 4d 8b 24 24 48 8b 43 30 48 85 c0 0f 84 92 00 00 00 4c 89 ee 48 89 df ff d0 49 39 c4 74 45 49 8d 5c 24 f8 
[ 1114.331002]  83 7b 20 00 75 d2 83 43 18 01 48 c7 c7 60 9a 67 81 e8 a9 b2 
[ 1114.331002] RIP  [<ffffffff8104de50>] __sysctl_head_next+0x70/0x130
[ 1114.331002]  RSP <ffff880169c1bb98>
[ 1114.331002] CR2: 0000000000000018
[ 1114.691196] ---[ end trace b3f24866c78b4f05 ]---
[ 1114.696485] note: modprobe[936] exited with preempt_count 1
[ 1114.702440] BUG: sleeping function called from invalid context at /opt/src/ericsson/kvm/net-next-2.6/kernel/rwsem.c:21


> My head is in another problem so I won't be able to look at this for
> a bit.  But if you are getting into ip_conntrack_net_init with
> a NULL network namespace something spectacularly bad is happening.
> 
> In particular it looks like you must be hitting a bug in for_each_net.
> Which would pretty much have to be a race in adding or removing from
> net_namespace_list.
> 
> I took a quick skim through the code and whenever we modify the
> net_namespace we hold but the net_mutex and inside it the rtnl_lock so I
> don't immediate see how you could be getting a NULL net into
> ip_conntrack_net_init.
> 
> Is there a codepath besides register_pernet_subsys that is calling
> ip_conntrack_net_init?
> 
In this case it's ip_vs that tries to load nf_conntrack

> Do you have any local modifications that could be messing up register_pernet_subsys?

nop
> 
> Eric
> 

^ permalink raw reply

* Re: [PATCH] NFS: Fix infinite loop in gss_create_upcall()
From: Trond Myklebust @ 2011-04-14 21:21 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Bryan Schumaker, Jiri Slaby, linux-kernel, akpm, mm-commits,
	ML netdev, linux-nfs
In-Reply-To: <4DA75AFC.3040000@suse.cz>

On Thu, 2011-04-14 at 22:37 +0200, Jiri Slaby wrote:
> On 04/13/2011 10:42 PM, Bryan Schumaker wrote:
> > On 04/12/2011 02:52 PM, Jiri Slaby wrote:
> >> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
> >>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
> >>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
> >>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
> >>>>>> inacceptable for automounted NFS dirs.
> >>>>>
> >>>>> I'm still confused as to why you are hitting it at all. In the normal
> >>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
> >>>>> and then trying rpcsec_gss if and only if that fails.
> >>>>>
> >>>>> Are you really exporting a filesystem using AUTH_NULL as the only
> >>>>> supported flavour?
> >>>>
> >>>> I don't know, I connect to a nfs server which is not maintained by me.
> >>>> It looks like that. How can I find out?
> >>>
> >>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
> >>
> >> I don't have NFS in modules. It's all built-in. And this one is
> >> unconditionally selected because of CONFIG_NFS_V4.
> > 
> > Does this patch help?
> 
> Nope, it makes things even worse:
> # mount -oro,intr XXX:/yyy /mnt/c/
> <15s delay here>
> mount.nfs: access denied by server while mounting XXX:/yyy
> 
> So in nfs4_proc_get_root I do:
>   printk("%s: %d %u\n", __func__, i, flav_array[i]);
>   status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
>   printk("%s: res=%d\n", __func__, status);
> and get:
> [   18.159818] nfs4_proc_get_root: 0 1
> [   18.214872] nfs4_proc_get_root: res=-1
> [   18.214875] nfs4_proc_get_root: 1 0
> [   18.254636] nfs4_proc_get_root: res=-1
> [   18.254639] nfs4_proc_get_root: 2 390003
> [   33.252174] RPC: AUTH_GSS upcall timed out.
> [   33.252177] Please check user daemon is running.
> [   33.252192] nfs4_proc_get_root: res=-13
> 
> If I revert that back and do the same:
> [   28.275569] nfs4_proc_get_root: 0 1
> [   28.296545] nfs4_proc_get_root: res=-1
> [   28.296548] nfs4_proc_get_root: 1 390003
> [   43.296107] RPC: AUTH_GSS upcall timed out.
> [   43.296108] Please check user daemon is running.
> [   43.296121] nfs4_proc_get_root: res=-13
> [   43.296122] nfs4_proc_get_root: 2 0
> [   43.318201] nfs4_proc_get_root: res=-1
> 
> I.e. all methods fail. And what matters is the last retval. From NULL it
> is EPERM, from GSS it is EACCESS. For EPERM, mount(8) falls back to
> nfs3, for EACCESS it dies terrible death.

OK. That's good information. Thanks for testing!

I'm still curious as to why that NFS server is refusing all NFSv4 mounts
with NFS4ERR_WRONGSEC. Unless NFSv4 really is configured only to export
the root filesystem with RPCSEC_GSS, then that definitely sounds like a
bug...

Cheers
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply

* Re: [PATCH] NFS: Fix infinite loop in gss_create_upcall()
From: Jiri Slaby @ 2011-04-14 21:30 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Bryan Schumaker, Jiri Slaby, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mm-commits-u79uwXL29TY76Z2rM5mHXA, ML netdev,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1302816095.24028.87.camel-SyLVLa/KEI9HwK5hSS5vWB2eb7JE58TQ@public.gmane.org>

On 04/14/2011 11:21 PM, Trond Myklebust wrote:
> On Thu, 2011-04-14 at 22:37 +0200, Jiri Slaby wrote:
>> On 04/13/2011 10:42 PM, Bryan Schumaker wrote:
>>> On 04/12/2011 02:52 PM, Jiri Slaby wrote:
>>>> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
>>>>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
>>>>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>>>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>>>>>>> inacceptable for automounted NFS dirs.
>>>>>>>
>>>>>>> I'm still confused as to why you are hitting it at all. In the normal
>>>>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
>>>>>>> and then trying rpcsec_gss if and only if that fails.
>>>>>>>
>>>>>>> Are you really exporting a filesystem using AUTH_NULL as the only
>>>>>>> supported flavour?
>>>>>>
>>>>>> I don't know, I connect to a nfs server which is not maintained by me.
>>>>>> It looks like that. How can I find out?
>>>>>
>>>>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
>>>>
>>>> I don't have NFS in modules. It's all built-in. And this one is
>>>> unconditionally selected because of CONFIG_NFS_V4.
>>>
>>> Does this patch help?
>>
>> Nope, it makes things even worse:
>> # mount -oro,intr XXX:/yyy /mnt/c/
>> <15s delay here>
>> mount.nfs: access denied by server while mounting XXX:/yyy
>>
>> So in nfs4_proc_get_root I do:
>>   printk("%s: %d %u\n", __func__, i, flav_array[i]);
>>   status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
>>   printk("%s: res=%d\n", __func__, status);
>> and get:
>> [   18.159818] nfs4_proc_get_root: 0 1
>> [   18.214872] nfs4_proc_get_root: res=-1
>> [   18.214875] nfs4_proc_get_root: 1 0
>> [   18.254636] nfs4_proc_get_root: res=-1
>> [   18.254639] nfs4_proc_get_root: 2 390003
>> [   33.252174] RPC: AUTH_GSS upcall timed out.
>> [   33.252177] Please check user daemon is running.
>> [   33.252192] nfs4_proc_get_root: res=-13
>>
>> If I revert that back and do the same:
>> [   28.275569] nfs4_proc_get_root: 0 1
>> [   28.296545] nfs4_proc_get_root: res=-1
>> [   28.296548] nfs4_proc_get_root: 1 390003
>> [   43.296107] RPC: AUTH_GSS upcall timed out.
>> [   43.296108] Please check user daemon is running.
>> [   43.296121] nfs4_proc_get_root: res=-13
>> [   43.296122] nfs4_proc_get_root: 2 0
>> [   43.318201] nfs4_proc_get_root: res=-1
>>
>> I.e. all methods fail. And what matters is the last retval. From NULL it
>> is EPERM, from GSS it is EACCESS. For EPERM, mount(8) falls back to
>> nfs3, for EACCESS it dies terrible death.
> 
> OK. That's good information. Thanks for testing!
> 
> I'm still curious as to why that NFS server is refusing all NFSv4 mounts
> with NFS4ERR_WRONGSEC. Unless NFSv4 really is configured only to export
> the root filesystem with RPCSEC_GSS, then that definitely sounds like a
> bug...

With gssd running if that helps:
[  229.806528] nfs4_proc_get_root: 0 1
[  229.828491] nfs4_proc_get_root: res=-1
[  229.828494] nfs4_proc_get_root: 1 390003
[  229.896994] nfs4_proc_get_root: res=-13
[  229.896997] nfs4_proc_get_root: 2 0
[  229.920344] nfs4_proc_get_root: res=-1

thanks,
-- 
js
suse labs

^ permalink raw reply

* Re: [PATCH 08/12] netvm: Allow skb allocation to use PFMEMALLOC reserves
From: David Miller @ 2011-04-14 21:33 UTC (permalink / raw)
  To: mgorman; +Cc: linux-mm, netdev, linux-kernel, a.p.zijlstra
In-Reply-To: <1302777698-28237-9-git-send-email-mgorman@suse.de>

From: Mel Gorman <mgorman@suse.de>
Date: Thu, 14 Apr 2011 11:41:34 +0100

> +extern int memalloc_socks;
> +static inline int sk_memalloc_socks(void)
> +{
> +	return memalloc_socks;
> +}
> +
 ...
> +static DEFINE_MUTEX(memalloc_socks_lock);
> +int memalloc_socks __read_mostly;

Please use an atomic_t, it has to be more efficient than this mutex
business.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply

* RE: SMSC 8720a/MDIO/PHY help.
From: Michael Riesch @ 2011-04-14 21:39 UTC (permalink / raw)
  To: netdev; +Cc: ANDY KENNEDY
In-Reply-To: <9AC3F0E75060224C8BBC5BA2DDC8853A1FA8E8FD@EXV1.corp.adtran.com>


> 
> Along this line of thought:  phy_connect requires struct net_device, which has a struct net_device_ops within it.  When I do a phy_connect am I supposed to provide the minimal functions for netdev_ops (correct this list if I am mistaken):
> ndo_open
> ndo_stop
> ndo_start_xmit
> ndo_get_stats
> ndo_set_multicast_list
> As well as populate the dev->dev_addr within the struct net_device.
> 
> The part that confuses me is that the smsc.c ??driver?? under drivers/net/phy/smsc.c doesn’t do any of this.  This is a phy supported by this file, so should I have to do all this to get the device up?

The smsc.c is a PHY driver, so it is probed when the specified PHY
appears on the MDIO bus. It is responsible for the proper PHY settings
like auto-negotiation etc.

If I understood you correctly, you are writing an MDIO bus driver, the
opposite part. It provides access to the MDIO bus and the net_device
structure with its ops (and, of course, implements them as well).

With a call to phy_connect you can tie the two together. And a nice
trick: maybe your PHY is supported by the generic PHY driver, in which
case you just need your MDIO bus driver, which provides the net_device
ops, registers the MDIO bus and calls phy_connect (and maybe
phy_start(); I am not sure about that one).

HTH


^ permalink raw reply

* [PATCH] ipv4: Call fib_select_default() only when actually necessary.
From: David Miller @ 2011-04-14 21:50 UTC (permalink / raw)
  To: netdev


fib_select_default() is a complete NOP, and completely pointless
to invoke, when we have no more than 1 default route installed.

And this is far and away the common case.

So remember how many prefixlen==0 routes we have in the routing
table, and elide the call when we have no more than one of those.

This cuts output route creation time by 157 cycles on Niagara2+.

In order to add the new int to fib_table, we have to correct the type
of ->tb_data[] to unsigned long, otherwise the private area will be
unaligned on 64-bit systems.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/net/ip_fib.h |    3 ++-
 net/ipv4/fib_trie.c  |    7 +++++++
 net/ipv4/route.c     |    4 +++-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 514627f..10422ef 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -160,7 +160,8 @@ struct fib_table {
 	struct hlist_node tb_hlist;
 	u32		tb_id;
 	int		tb_default;
-	unsigned char	tb_data[0];
+	int		tb_num_default;
+	unsigned long	tb_data[0];
 };
 
 extern int fib_table_lookup(struct fib_table *tb, const struct flowi4 *flp,
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index bde80c4..9ac481a 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1332,6 +1332,9 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
 		}
 	}
 
+	if (!plen)
+		tb->tb_num_default++;
+
 	list_add_tail_rcu(&new_fa->fa_list,
 			  (fa ? &fa->fa_list : fa_head));
 
@@ -1697,6 +1700,9 @@ int fib_table_delete(struct fib_table *tb, struct fib_config *cfg)
 
 	list_del_rcu(&fa->fa_list);
 
+	if (!plen)
+		tb->tb_num_default--;
+
 	if (list_empty(fa_head)) {
 		hlist_del_rcu(&li->hlist);
 		free_leaf_info(li);
@@ -1987,6 +1993,7 @@ struct fib_table *fib_trie_table(u32 id)
 
 	tb->tb_id = id;
 	tb->tb_default = -1;
+	tb->tb_num_default = 0;
 
 	t = (struct trie *) tb->tb_data;
 	memset(t, 0, sizeof(*t));
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 0e7430c..e9aee81 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2615,7 +2615,9 @@ static struct rtable *ip_route_output_slow(struct net *net,
 		fib_select_multipath(&res);
 	else
 #endif
-	if (!res.prefixlen && res.type == RTN_UNICAST && !fl4.flowi4_oif)
+	if (!res.prefixlen &&
+	    res.table->tb_num_default > 1 &&
+	    res.type == RTN_UNICAST && !fl4.flowi4_oif)
 		fib_select_default(&res);
 
 	if (!fl4.saddr)
-- 
1.7.4.3


^ permalink raw reply related

* Re: [PATCH] ipv4: Call fib_select_default() only when actually necessary.
From: Eric Dumazet @ 2011-04-14 21:59 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20110414.145035.189701333.davem@davemloft.net>

On Thursday, 14 April 2011 at 14:50 -0700, David Miller wrote:
> fib_select_default() is a complete NOP, and completely pointless
> to invoke, when we have no more than 1 default route installed.
> 
> And this is far and away the common case.
> 
> So remember how many prefixlen==0 routes we have in the routing
> table, and elide the call when we have no more than one of those.
> 
> This cuts output route creation time by 157 cycles on Niagara2+.
> 
> In order to add the new int to fib_table, we have to correct the type
> of ->tb_data[] to unsigned long, otherwise the private area will be
> unaligned on 64-bit systems.
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---

Excellent :)

Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>



^ permalink raw reply

* Re: [PATCH v2] ip: ip_options_compile() resilient to NULL skb route
From: Scot Doyle @ 2011-04-14 22:02 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Hiroaki SHIMODA, Stephen Hemminger, David Miller, netdev
In-Reply-To: <1302796537.3248.22.camel@edumazet-laptop>

On 04/14/2011 10:55 AM, Eric Dumazet wrote:
> Scot Doyle demonstrated ip_options_compile() could be called with an skb
> without an attached route, using a setup involving a bridge, netfilter,
> and forged IP packets.
>
> Let's make ip_options_compile() and ip_options_rcv_srr() a bit more
> robust, instead of changing bridge/netfilter code.
>
> With help from Hiroaki SHIMODA.
>
> Reported-by: Scot Doyle<lkml@scotdoyle.com>
> Signed-off-by: Eric Dumazet<eric.dumazet@gmail.com>
> Cc: Stephen Hemminger<shemminger@vyatta.com>
> Cc: Hiroaki SHIMODA<shimoda.hiroaki@gmail.com>
> ---
> v2: ip_options_rcv_srr() fix as well, from Hiroaki
>
>   net/ipv4/ip_options.c |    6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
> index 28a736f..2391b24 100644
> --- a/net/ipv4/ip_options.c
> +++ b/net/ipv4/ip_options.c
> @@ -329,7 +329,7 @@ int ip_options_compile(struct net *net,
>   					pp_ptr = optptr + 2;
>   					goto error;
>   				}
> -				if (skb) {
> +				if (rt) {
>   					memcpy(&optptr[optptr[2]-1],&rt->rt_spec_dst, 4);
>   					opt->is_changed = 1;
>   				}
> @@ -371,7 +371,7 @@ int ip_options_compile(struct net *net,
>   						goto error;
>   					}
>   					opt->ts = optptr - iph;
> -					if (skb) {
> +					if (rt)  {
>   						memcpy(&optptr[optptr[2]-1],&rt->rt_spec_dst, 4);
>   						timeptr = (__be32*)&optptr[optptr[2]+3];
>   					}
> @@ -603,7 +603,7 @@ int ip_options_rcv_srr(struct sk_buff *skb)
>   	unsigned long orefdst;
>   	int err;
>
> -	if (!opt->srr)
> +	if (!opt->srr || !rt)
>   		return 0;
>
>   	if (skb->pkt_type != PACKET_HOST)

The 2.6.39-rc3 kernel, plus this patch and the two patches previously 
accepted by David in this thread, didn't panic when tested with the IP 
Stack Checker tool hitting either the assigned bridge IP address or a 
guest virtual machine IP address sharing that bridge.

^ permalink raw reply

* Re: [PATCH v2] ip: ip_options_compile() resilient to NULL skb route
From: David Miller @ 2011-04-14 22:04 UTC (permalink / raw)
  To: lkml; +Cc: eric.dumazet, shimoda.hiroaki, shemminger, netdev
In-Reply-To: <4DA76F08.5000302@scotdoyle.com>

From: Scot Doyle <lkml@scotdoyle.com>
Date: Thu, 14 Apr 2011 17:02:48 -0500

> The 2.6.39-rc3 kernel, plus this patch and the two patches previously
> accepted by David in this thread, didn't panic when tested with the IP
> Stack Checker tool hitting either the assigned bridge IP address or a
> guest virtual machine IP address sharing that bridge.

Thank you for testing.

^ permalink raw reply

* Re: [PATCH] ipv4: Call fib_select_default() only when actually necessary.
From: David Miller @ 2011-04-14 22:05 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1302818357.2744.47.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 14 Apr 2011 23:59:17 +0200

> Excellent :)

:)

> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>

Thanks for reviewing Eric.

^ permalink raw reply

