Netdev List

* Re: [PATCH] ethernet: add sanity check before memory dereferencing
From: David Miller @ 2010-05-04  6:11 UTC (permalink / raw)
  To: xiaosuo; +Cc: eric.dumazet, netdev
In-Reply-To: <1272944028-23410-1-git-send-email-xiaosuo@gmail.com>

From: Changli Gao <xiaosuo@gmail.com>
Date: Tue,  4 May 2010 11:33:48 +0800

> add sanity check before memory dereferencing
> 
> Some callers of eth_type_trans() only can assure the length of the packets
> passed to it is not less than ETH_HLEN. We'd better check the packets length
> before dereferencing skb->data.
> 
> Signed-off-by: Changli Gao <xiaosuo@gmail.com>

We can dereference skb->data for at least 16 bytes or so past the end
of the valid packet data area however we want.

It might give garbage values, but it will not cause a fault of any
kind.

We want to remove checks here, not add new ones, Changli.

^ permalink raw reply

* Re: [PATCH v2] ethernet: call __skb_pull() in eth_type_trans()
From: David Miller @ 2010-05-04  6:16 UTC (permalink / raw)
  To: xiaosuo; +Cc: eric.dumazet, netdev
In-Reply-To: <h2h412e6f7f1005031934m971342dw965e046d40485ec7@mail.gmail.com>

From: Changli Gao <xiaosuo@gmail.com>
Date: Tue, 4 May 2010 10:34:07 +0800

> It seems no callers pass eth_type_trans() a packet, whose length is
> less than ETH_HLEN. It means that skb_pull() always returns non-NULL.
> And if skb_pull() returns NULL, the later memory dereferences must be
> invalid.

In your opinion.  As I explained in the reply to your latest
eth_type_trans() patch, we can dereference several bytes past
the end of skb->data's valid packet data area without faulting.

We'll just read in garbage, because things like skb_shared_info()
and friends might be there.

But it's completely safe, and frankly I'm fine with the kernel doing
this when runts make it into this code if that allows us to avoid
stupid checks.

> As Eric mentioned above, GRE only assures the length of the packets
> passed to eth_type_trans() isn't less than ETH_HLEN, we should check
> skb->len before we dereference skb->data.

The code needs only ETH_HLEN, but valid ethernet packets must be at
least ETH_ZLEN.

> For performance, how about inlining eth_type_trans()? Because its main
> users are NIC drivers, and there aren't likely many kinds of NICs at
> the same time, inlining it won't increase the size of the kernel
> image much.

No, that's unnecessary bloat, plus I want to make ->ndo_type_trans()
a netdev operation so it can possibly be deferred.

Changli, please go hack elsewhere and on some other piece of code,
all your ideas here in eth_type_trans() are not well founded.

I see from your struct dst union removal patch that you don't even
build test your changes, so why don't you spend your excess energy on
making sure your code at least compiles successfully?

Thanks.
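
On the ETH_HLEN versus ETH_ZLEN point above, here is a minimal userspace
sketch, not kernel code. The SKETCH_* names and the helper functions are
hypothetical and exist only for illustration; the values 14 and 60 match
ETH_HLEN and ETH_ZLEN in linux/if_ether.h. It shows that reading the
EtherType needs only the 14-byte header, while a frame shorter than 60
bytes is still a runt.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants; same values as ETH_HLEN and ETH_ZLEN. */
enum { SKETCH_ETH_HLEN = 14, SKETCH_ETH_ZLEN = 60 };

/*
 * Return the EtherType in host byte order. The caller must guarantee
 * that at least SKETCH_ETH_HLEN bytes are readable at 'frame'.
 */
static uint16_t sketch_ethertype(const uint8_t *frame)
{
	return (uint16_t)((frame[12] << 8) | frame[13]);
}

/* The header can be parsed once the full 14 bytes are present. */
static int sketch_header_complete(size_t len)
{
	return len >= SKETCH_ETH_HLEN;
}

/* A frame below the minimum length (without FCS) is a runt. */
static int sketch_is_runt(size_t len)
{
	return len < SKETCH_ETH_ZLEN;
}
```

A 14-byte buffer therefore passes sketch_header_complete() and is still
reported as a runt by sketch_is_runt(), which is the case discussed in
this thread.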

^ permalink raw reply

* Re: OOP in ip_cmsg_recv (net-next)
From: David Miller @ 2010-05-04  6:17 UTC (permalink / raw)
  To: eric.dumazet; +Cc: shemminger, netdev
In-Reply-To: <1272948225.2407.170.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 04 May 2010 06:43:45 +0200

> David, if I am not mistaken (no tea yet for me this early morning) the
> tracer you mention is included in kfree_skb(), not in __kfree_skb() :
 ...
> I only copied part of consume_skb() which doesnt call
> trace_kfree_skb() :
 ...
> So I believe my second patch is a bit better : We dont even lock the
> socket in the (rare) case we should not orphan the skb ;)

Right you are.

> [PATCH net-next-2.6] net: skb_free_datagram_locked() fix

I'll apply this, thanks!

^ permalink raw reply

* Re: linux kernel's IPV6_MULTICAST_HOPS default is 64; should be 1?
From: enh @ 2010-05-04  6:19 UTC (permalink / raw)
  To: David Miller; +Cc: brian.haley, netdev
In-Reply-To: <20100503.230553.200626786.davem@davemloft.net>

On Mon, May 3, 2010 at 23:05, David Miller <davem@davemloft.net> wrote:
> From: Brian Haley <brian.haley@hp.com>
> Date: Mon, 03 May 2010 22:16:39 -0400
>
>> It looks like a bug to me, feel free to send along a patch :)
>
> Is it?  The quoted text is only about setting the value and what
> effect setting -1 or whatever has.
>
> For getting the value, the behavior described sounds just fine.
>
> The default for a socket is whatever the kernel-wide default is.

for the *unicast* hops, a part of the RFC i didn't quote says:

   If the [IPV6_UNICAST_HOPS] option is not set, the
   system selects a default value.

but for the *multicast* hops, which is what i'm talking about, this
part of the quoted text seems pretty definitive:

           If IPV6_MULTICAST_HOPS is not set, the default is 1
           (same as IPv4 today)

this is what my test shows isn't true of linux; linux reuses its
unicast default instead.

-- 
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
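
A test of the kind described above might look like the following sketch.
It is not the reporter's actual test program; the function name
query_default_mcast_hops() is hypothetical. It opens an IPv6 UDP socket,
sets nothing, and asks the kernel what it reports for
IPV6_MULTICAST_HOPS.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Return the multicast hop limit reported for a fresh IPv6 socket,
 * or -1 if the socket cannot be created or the option cannot be read.
 */
static int query_default_mcast_hops(void)
{
	int hops = -1;
	socklen_t len = sizeof(hops);
	int fd = socket(AF_INET6, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;
	if (getsockopt(fd, IPPROTO_IPV6, IPV6_MULTICAST_HOPS, &hops, &len) < 0)
		hops = -1;
	close(fd);
	return hops;
}
```

Per the RFC text quoted above, the expected answer is 1; the report in
this thread is that the kernel returned its unicast default instead.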

^ permalink raw reply

* Re: linux kernel's IPV6_MULTICAST_HOPS default is 64; should be 1?
From: David Miller @ 2010-05-04  6:22 UTC (permalink / raw)
  To: enh; +Cc: brian.haley, netdev
In-Reply-To: <AANLkTingTdmhyJhENb9AYDjUT7m_-yQflw5-gJqIA9iw@mail.gmail.com>

From: enh <enh@google.com>
Date: Mon, 3 May 2010 23:19:22 -0700

> for the *unicast* hops, a part of the RFC i didn't quote says:
> 
>    If the [IPV6_UNICAST_HOPS] option is not set, the
>    system selects a default value.
> 
> but for the *multicast* hops, which is what i'm talking about, this
> part of the quoted text seems pretty definitive:
> 
>            If IPV6_MULTICAST_HOPS is not set, the default is 1
>            (same as IPv4 today)
> 
> this is what my test shows isn't true of linux; linux reuses its
> unicast default instead.

Ok, I see, so yeah this needs to be fixed to use "1" instead of
"-1" in the np->xxx ipv6 socket initialization.

^ permalink raw reply

* Re: [PATCH] ep93xx_eth stopps receiving packets
From: David Miller @ 2010-05-04  6:23 UTC (permalink / raw)
  To: buytenh; +Cc: stefan, netdev
In-Reply-To: <20100504014606.GI4586@mail.wantstofly.org>

From: Lennert Buytenhek <buytenh@wantstofly.org>
Date: Tue, 4 May 2010 03:46:07 +0200

> On Mon, May 03, 2010 at 01:42:44PM +0200, Stefan Agner wrote:
> 
>> Receiving small packet(s) in a fast pace leads to not receiving any
>> packets at all after some time.
>> 
>> After ethernet packet(s) arrived the receive descriptor is incremented
>> by the number of frames processed. If another packet arrives while
>> processing, this is processed in another call of ep93xx_rx. This
>> second call leads to too many receive descriptors being released.
>> 
>> This fix increments, even in this case, the right number of processed
>> receive descriptors.
>> 
>> Signed-off-by: Stefan Agner <stefan@agner.ch>
> 
> I haven't opened my ep93xx docs for a while, but if this works for you:
> 
> Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>

I had to apply this by hand because Stefan's email client corrupted the
spacing et al. in the patch.  Stefan please post a usable patch next
time.

^ permalink raw reply

* Re: [PATCH] forcedeth: GRO support
From: David Miller @ 2010-05-04  6:27 UTC (permalink / raw)
  To: therbert; +Cc: netdev, aabdulla
In-Reply-To: <alpine.DEB.1.00.1005032203520.11991@pokey.mtv.corp.google.com>

From: Tom Herbert <therbert@google.com>
Date: Mon, 3 May 2010 22:08:45 -0700 (PDT)

> Add GRO support to forcedeth.
> 
> Signed-off-by: Tom Herbert <therbert@google.com>

Applied, nice work Tom.

I think it's really time to kill the NAPI ifdef crap from this driver.
Something marked "experimental" that people have been actively using
and distributions have been turning on for years isn't experimental
any more.

Actually this goes back to 2006 even.

In fact, I'm just going to kill it right now in net-next-2.6

Thanks again Tom!

^ permalink raw reply

* Re: linux kernel's IPV6_MULTICAST_HOPS default is 64; should be 1?
From: enh @ 2010-05-04  6:27 UTC (permalink / raw)
  To: David Miller; +Cc: brian.haley, netdev
In-Reply-To: <20100503.232249.200767273.davem@davemloft.net>

On Mon, May 3, 2010 at 23:22, David Miller <davem@davemloft.net> wrote:
> From: enh <enh@google.com>
> Date: Mon, 3 May 2010 23:19:22 -0700
>
>> for the *unicast* hops, a part of the RFC i didn't quote says:
>>
>>    If the [IPV6_UNICAST_HOPS] option is not set, the
>>    system selects a default value.
>>
>> but for the *multicast* hops, which is what i'm talking about, this
>> part of the quoted text seems pretty definitive:
>>
>>            If IPV6_MULTICAST_HOPS is not set, the default is 1
>>            (same as IPv4 today)
>>
>> this is what my test shows isn't true of linux; linux reuses its
>> unicast default instead.
>
> Ok, I see, so yeah this needs to be fixed to use "1" instead of
> "-1" in the np->xxx ipv6 socket initialization.

i think so. there's already the IPV6_DEFAULT_MCASTHOPS constant
defined to 1 but unused according to gitweb, so you might want to
either use it or remove it.

thanks!

-- 
Elliott Hughes - http://who/enh - http://jessies.org/~enh/

^ permalink raw reply

* Re: [PATCH] forcedeth: GRO support
From: David Miller @ 2010-05-04  6:33 UTC (permalink / raw)
  To: therbert; +Cc: netdev, aabdulla
In-Reply-To: <20100503.232714.43422983.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Mon, 03 May 2010 23:27:14 -0700 (PDT)

> In fact, I'm just going to kill it right now in net-next-2.6

--------------------
forcedeth: Kill NAPI config options.

All distributions enable it, therefore no significant body of users
are even testing the driver with it disabled.  And making NAPI
configurable is heavily discouraged anyways.

I left the MSI-X interrupt enabling thing in an "#if 0" block
so hopefully someone can debug that and it can get re-enabled.
Probably it was just one of the NVIDIA chipset MSI erratas that
we handle these days in the PCI quirks (see drivers/pci/quirks.c
and stuff like nvenet_msi_disable()).

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/Kconfig     |   14 ----
 drivers/net/forcedeth.c |  194 +----------------------------------------------
 2 files changed, 1 insertions(+), 207 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index dbd26f9..b9e7618 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -1453,20 +1453,6 @@ config FORCEDETH
 	  To compile this driver as a module, choose M here. The module
 	  will be called forcedeth.
 
-config FORCEDETH_NAPI
-	bool "Use Rx Polling (NAPI) (EXPERIMENTAL)"
-	depends on FORCEDETH && EXPERIMENTAL
-	help
-	  NAPI is a new driver API designed to reduce CPU and interrupt load
-	  when the driver is receiving lots of packets from the card. It is
-	  still somewhat experimental and thus not yet enabled by default.
-
-	  If your estimated Rx load is 10kpps or more, or if the card will be
-	  deployed on potentially unfriendly networks (e.g. in a firewall),
-	  then say Y here.
-
-	  If in doubt, say N.
-
 config CS89x0
 	tristate "CS89x0 support"
 	depends on NET_ETHERNET && (ISA || EISA || MACH_IXDP2351 \
diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c
index 4a24cc7..f9e1dd4 100644
--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -1104,20 +1104,16 @@ static void nv_disable_hw_interrupts(struct net_device *dev, u32 mask)
 
 static void nv_napi_enable(struct net_device *dev)
 {
-#ifdef CONFIG_FORCEDETH_NAPI
 	struct fe_priv *np = get_nvpriv(dev);
 
 	napi_enable(&np->napi);
-#endif
 }
 
 static void nv_napi_disable(struct net_device *dev)
 {
-#ifdef CONFIG_FORCEDETH_NAPI
 	struct fe_priv *np = get_nvpriv(dev);
 
 	napi_disable(&np->napi);
-#endif
 }
 
 #define MII_READ	(-1)
@@ -1810,7 +1806,6 @@ static int nv_alloc_rx_optimized(struct net_device *dev)
 }
 
 /* If rx bufs are exhausted called after 50ms to attempt to refresh */
-#ifdef CONFIG_FORCEDETH_NAPI
 static void nv_do_rx_refill(unsigned long data)
 {
 	struct net_device *dev = (struct net_device *) data;
@@ -1819,41 +1814,6 @@ static void nv_do_rx_refill(unsigned long data)
 	/* Just reschedule NAPI rx processing */
 	napi_schedule(&np->napi);
 }
-#else
-static void nv_do_rx_refill(unsigned long data)
-{
-	struct net_device *dev = (struct net_device *) data;
-	struct fe_priv *np = netdev_priv(dev);
-	int retcode;
-
-	if (!using_multi_irqs(dev)) {
-		if (np->msi_flags & NV_MSI_X_ENABLED)
-			disable_irq(np->msi_x_entry[NV_MSI_X_VECTOR_ALL].vector);
-		else
-			disable_irq(np->pci_dev->irq);
-	} else {
-		disable_irq(np->msi_x_entry[NV_MSI_X_VECTOR_RX].vector);
-	}
-	if (!nv_optimized(np))
-		retcode = nv_alloc_rx(dev);
-	else
-		retcode = nv_alloc_rx_optimized(dev);
-	if (retcode) {
-		spin_lock_irq(&np->lock);
-		if (!np->in_shutdown)
-			mod_timer(&np->oom_kick, jiffies + OOM_REFILL);
-		spin_unlock_irq(&np->lock);
-	}
-	if (!using_multi_irqs(dev)) {
-		if (np->msi_flags & NV_MSI_X_ENABLED)
-			enable_irq(np->msi_x_entry[NV_MSI_X_VECTOR_ALL].vector);
-		else
-			enable_irq(np->pci_dev->irq);
-	} else {
-		enable_irq(np->msi_x_entry[NV_MSI_X_VECTOR_RX].vector);
-	}
-}
-#endif
 
 static void nv_init_rx(struct net_device *dev)
 {
@@ -2816,11 +2776,7 @@ static int nv_rx_process(struct net_device *dev, int limit)
 		skb->protocol = eth_type_trans(skb, dev);
 		dprintk(KERN_DEBUG "%s: nv_rx_process: %d bytes, proto %d accepted.\n",
 					dev->name, len, skb->protocol);
-#ifdef CONFIG_FORCEDETH_NAPI
 		napi_gro_receive(&np->napi, skb);
-#else
-		netif_rx(skb);
-#endif
 		dev->stats.rx_packets++;
 		dev->stats.rx_bytes += len;
 next_pkt:
@@ -2909,27 +2865,14 @@ static int nv_rx_process_optimized(struct net_device *dev, int limit)
 				dev->name, len, skb->protocol);
 
 			if (likely(!np->vlangrp)) {
-#ifdef CONFIG_FORCEDETH_NAPI
 				napi_gro_receive(&np->napi, skb);
-#else
-				netif_rx(skb);
-#endif
 			} else {
 				vlanflags = le32_to_cpu(np->get_rx.ex->buflow);
 				if (vlanflags & NV_RX3_VLAN_TAG_PRESENT) {
-#ifdef CONFIG_FORCEDETH_NAPI
 					vlan_gro_receive(&np->napi, np->vlangrp,
 							 vlanflags & NV_RX3_VLAN_TAG_MASK, skb);
-#else
-					vlan_hwaccel_rx(skb, np->vlangrp,
-							vlanflags & NV_RX3_VLAN_TAG_MASK);
-#endif
 				} else {
-#ifdef CONFIG_FORCEDETH_NAPI
 					napi_gro_receive(&np->napi, skb);
-#else
-					netif_rx(skb);
-#endif
 				}
 			}
 
@@ -3496,10 +3439,6 @@ static irqreturn_t nv_nic_irq(int foo, void *data)
 	struct net_device *dev = (struct net_device *) data;
 	struct fe_priv *np = netdev_priv(dev);
 	u8 __iomem *base = get_hwbase(dev);
-#ifndef CONFIG_FORCEDETH_NAPI
-	int total_work = 0;
-	int loop_count = 0;
-#endif
 
 	dprintk(KERN_DEBUG "%s: nv_nic_irq\n", dev->name);
 
@@ -3516,7 +3455,6 @@ static irqreturn_t nv_nic_irq(int foo, void *data)
 
 	nv_msi_workaround(np);
 
-#ifdef CONFIG_FORCEDETH_NAPI
 	if (napi_schedule_prep(&np->napi)) {
 		/*
 		 * Disable further irq's (msix not enabled with napi)
@@ -3525,65 +3463,6 @@ static irqreturn_t nv_nic_irq(int foo, void *data)
 		__napi_schedule(&np->napi);
 	}
 
-#else
-	do
-	{
-		int work = 0;
-		if ((work = nv_rx_process(dev, RX_WORK_PER_LOOP))) {
-			if (unlikely(nv_alloc_rx(dev))) {
-				spin_lock(&np->lock);
-				if (!np->in_shutdown)
-					mod_timer(&np->oom_kick, jiffies + OOM_REFILL);
-				spin_unlock(&np->lock);
-			}
-		}
-
-		spin_lock(&np->lock);
-		work += nv_tx_done(dev, TX_WORK_PER_LOOP);
-		spin_unlock(&np->lock);
-
-		if (!work)
-			break;
-
-		total_work += work;
-
-		loop_count++;
-	}
-	while (loop_count < max_interrupt_work);
-
-	if (nv_change_interrupt_mode(dev, total_work)) {
-		/* setup new irq mask */
-		writel(np->irqmask, base + NvRegIrqMask);
-	}
-
-	if (unlikely(np->events & NVREG_IRQ_LINK)) {
-		spin_lock(&np->lock);
-		nv_link_irq(dev);
-		spin_unlock(&np->lock);
-	}
-	if (unlikely(np->need_linktimer && time_after(jiffies, np->link_timeout))) {
-		spin_lock(&np->lock);
-		nv_linkchange(dev);
-		spin_unlock(&np->lock);
-		np->link_timeout = jiffies + LINK_TIMEOUT;
-	}
-	if (unlikely(np->events & NVREG_IRQ_RECOVER_ERROR)) {
-		spin_lock(&np->lock);
-		/* disable interrupts on the nic */
-		if (!(np->msi_flags & NV_MSI_X_ENABLED))
-			writel(0, base + NvRegIrqMask);
-		else
-			writel(np->irqmask, base + NvRegIrqMask);
-		pci_push(base);
-
-		if (!np->in_shutdown) {
-			np->nic_poll_irq = np->irqmask;
-			np->recover_error = 1;
-			mod_timer(&np->nic_poll, jiffies + POLL_WAIT);
-		}
-		spin_unlock(&np->lock);
-	}
-#endif
 	dprintk(KERN_DEBUG "%s: nv_nic_irq completed\n", dev->name);
 
 	return IRQ_HANDLED;
@@ -3599,10 +3478,6 @@ static irqreturn_t nv_nic_irq_optimized(int foo, void *data)
 	struct net_device *dev = (struct net_device *) data;
 	struct fe_priv *np = netdev_priv(dev);
 	u8 __iomem *base = get_hwbase(dev);
-#ifndef CONFIG_FORCEDETH_NAPI
-	int total_work = 0;
-	int loop_count = 0;
-#endif
 
 	dprintk(KERN_DEBUG "%s: nv_nic_irq_optimized\n", dev->name);
 
@@ -3619,7 +3494,6 @@ static irqreturn_t nv_nic_irq_optimized(int foo, void *data)
 
 	nv_msi_workaround(np);
 
-#ifdef CONFIG_FORCEDETH_NAPI
 	if (napi_schedule_prep(&np->napi)) {
 		/*
 		 * Disable further irq's (msix not enabled with napi)
@@ -3627,66 +3501,6 @@ static irqreturn_t nv_nic_irq_optimized(int foo, void *data)
 		writel(0, base + NvRegIrqMask);
 		__napi_schedule(&np->napi);
 	}
-#else
-	do
-	{
-		int work = 0;
-		if ((work = nv_rx_process_optimized(dev, RX_WORK_PER_LOOP))) {
-			if (unlikely(nv_alloc_rx_optimized(dev))) {
-				spin_lock(&np->lock);
-				if (!np->in_shutdown)
-					mod_timer(&np->oom_kick, jiffies + OOM_REFILL);
-				spin_unlock(&np->lock);
-			}
-		}
-
-		spin_lock(&np->lock);
-		work += nv_tx_done_optimized(dev, TX_WORK_PER_LOOP);
-		spin_unlock(&np->lock);
-
-		if (!work)
-			break;
-
-		total_work += work;
-
-		loop_count++;
-	}
-	while (loop_count < max_interrupt_work);
-
-	if (nv_change_interrupt_mode(dev, total_work)) {
-		/* setup new irq mask */
-		writel(np->irqmask, base + NvRegIrqMask);
-	}
-
-	if (unlikely(np->events & NVREG_IRQ_LINK)) {
-		spin_lock(&np->lock);
-		nv_link_irq(dev);
-		spin_unlock(&np->lock);
-	}
-	if (unlikely(np->need_linktimer && time_after(jiffies, np->link_timeout))) {
-		spin_lock(&np->lock);
-		nv_linkchange(dev);
-		spin_unlock(&np->lock);
-		np->link_timeout = jiffies + LINK_TIMEOUT;
-	}
-	if (unlikely(np->events & NVREG_IRQ_RECOVER_ERROR)) {
-		spin_lock(&np->lock);
-		/* disable interrupts on the nic */
-		if (!(np->msi_flags & NV_MSI_X_ENABLED))
-			writel(0, base + NvRegIrqMask);
-		else
-			writel(np->irqmask, base + NvRegIrqMask);
-		pci_push(base);
-
-		if (!np->in_shutdown) {
-			np->nic_poll_irq = np->irqmask;
-			np->recover_error = 1;
-			mod_timer(&np->nic_poll, jiffies + POLL_WAIT);
-		}
-		spin_unlock(&np->lock);
-	}
-
-#endif
 	dprintk(KERN_DEBUG "%s: nv_nic_irq_optimized completed\n", dev->name);
 
 	return IRQ_HANDLED;
@@ -3735,7 +3549,6 @@ static irqreturn_t nv_nic_irq_tx(int foo, void *data)
 	return IRQ_RETVAL(i);
 }
 
-#ifdef CONFIG_FORCEDETH_NAPI
 static int nv_napi_poll(struct napi_struct *napi, int budget)
 {
 	struct fe_priv *np = container_of(napi, struct fe_priv, napi);
@@ -3805,7 +3618,6 @@ static int nv_napi_poll(struct napi_struct *napi, int budget)
 	}
 	return rx_work;
 }
-#endif
 
 static irqreturn_t nv_nic_irq_rx(int foo, void *data)
 {
@@ -5711,9 +5523,7 @@ static int __devinit nv_probe(struct pci_dev *pci_dev, const struct pci_device_i
 		np->txrxctl_bits |= NVREG_TXRXCTL_RXCHECK;
 		dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG;
 		dev->features |= NETIF_F_TSO;
-#ifdef CONFIG_FORCEDETH_NAPI
 		dev->features |= NETIF_F_GRO;
-#endif
 	}
 
 	np->vlanctl_bits = 0;
@@ -5766,9 +5576,7 @@ static int __devinit nv_probe(struct pci_dev *pci_dev, const struct pci_device_i
 	else
 		dev->netdev_ops = &nv_netdev_ops_optimized;
 
-#ifdef CONFIG_FORCEDETH_NAPI
 	netif_napi_add(dev, &np->napi, nv_napi_poll, RX_WORK_PER_LOOP);
-#endif
 	SET_ETHTOOL_OPS(dev, &ops);
 	dev->watchdog_timeo = NV_WATCHDOG_TIMEO;
 
@@ -5871,7 +5679,7 @@ static int __devinit nv_probe(struct pci_dev *pci_dev, const struct pci_device_i
 		/* msix has had reported issues when modifying irqmask
 		   as in the case of napi, therefore, disable for now
 		*/
-#ifndef CONFIG_FORCEDETH_NAPI
+#if 0
 		np->msi_flags |= NV_MSI_X_CAPABLE;
 #endif
 	}
-- 
1.7.0.4


^ permalink raw reply related

* Re: linux kernel's IPV6_MULTICAST_HOPS default is 64; should be 1?
From: David Miller @ 2010-05-04  6:42 UTC (permalink / raw)
  To: enh; +Cc: brian.haley, netdev
In-Reply-To: <AANLkTimf-FWCsiLqYDGg4nl-relRMuNvDowfTKZbe9kx@mail.gmail.com>

From: enh <enh@google.com>
Date: Mon, 3 May 2010 23:27:23 -0700

> On Mon, May 3, 2010 at 23:22, David Miller <davem@davemloft.net> wrote:
>> Ok, I see, so yeah this needs to be fixed to use "1" instead of
>> "-1" in the np->xxx ipv6 socket initialization.
> 
> i think so. there's already the IPV6_DEFAULT_MCASTHOPS constant
> defined to 1 but unused according to gitweb, so you might want to
> either use it or remove it.

I've applied the following, thanks Elliott.

--------------------
ipv6: Fix default multicast hops setting.

As per RFC 3493 the default multicast hops setting
for a socket should be "1" just like ipv4.

Ironically we have an IPV6_DEFAULT_MCASTHOPS macro;
it just wasn't being used.

Reported-by: Elliott Hughes <enh@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ipv6/af_inet6.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 3192aa0..3f9e86b 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -200,7 +200,7 @@ lookup_protocol:
 
 	inet_sk(sk)->pinet6 = np = inet6_sk_generic(sk);
 	np->hop_limit	= -1;
-	np->mcast_hops	= -1;
+	np->mcast_hops	= IPV6_DEFAULT_MCASTHOPS;
 	np->mc_loop	= 1;
 	np->pmtudisc	= IPV6_PMTUDISC_WANT;
 	np->ipv6only	= net->ipv6.sysctl.bindv6only;
-- 
1.7.0.4


^ permalink raw reply related

* Re: linux kernel's IPV6_MULTICAST_HOPS default is 64; should be 1?
From: David Stevens @ 2010-05-04  7:48 UTC (permalink / raw)
  To: enh; +Cc: brian.haley, David Miller, netdev, netdev-owner
In-Reply-To: <AANLkTimf-FWCsiLqYDGg4nl-relRMuNvDowfTKZbe9kx@mail.gmail.com>

It's set to -1 by default, but the common code for unicast and
multicast in getsockopt is falling through to use the dst_entry.

I believe (though I haven't actually tried it recently) it actually
uses "1" for the default value for multicast; it just doesn't
report it correctly. It should distinguish multicast from unicast
in the <0 check in getsockopt.
                                        +-DLS

^ permalink raw reply

* Re: linux kernel's IPV6_MULTICAST_HOPS default is 64; should be 1?
From: David Miller @ 2010-05-04  7:57 UTC (permalink / raw)
  To: dlstevens; +Cc: enh, brian.haley, netdev, netdev-owner
In-Reply-To: <OF8BB88147.4F3756DA-ON88257719.002A895A-88257719.002AEB08@us.ibm.com>

From: David Stevens <dlstevens@us.ibm.com>
Date: Tue, 4 May 2010 00:48:46 -0700

> It's set to -1 by default, but the common code for unicast and
> multicast in getsockopt is falling through to use the dst_entry.
> 
> I believe (though I haven't actually tried it recently) it actually
> uses "1" for the default value for multicast;

It doesn't, all of the uses in the ipv6 stack say something like:

	if (multicast)
		hlimit = np->mcast_hops;
	else
		hlimit = np->hop_limit;
	if (hlimit < 0)
		hlimit = ip6_dst_hoplimit(dst);

Therefore, the change suggested by Elliott and which I committed is the
way to get the correct behavior and fix this.
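
The effect of the one-line change can be sketched in userspace as below.
This is an illustration of the selection logic quoted above, not the
kernel's code: struct sketch_np, sketch_select_hlimit() and the
dst_hoplimit parameter (standing in for ip6_dst_hoplimit(dst)) are
hypothetical names, and 64 is the unicast default mentioned in the
subject line.

```c
/* Stand-in for the two per-socket fields discussed in this thread. */
struct sketch_np {
	int hop_limit;   /* unicast hop limit, -1 means "not set" */
	int mcast_hops;  /* multicast hop limit, -1 means "not set" */
};

/* Same value the thread gives for IPV6_DEFAULT_MCASTHOPS. */
#define SKETCH_IPV6_DEFAULT_MCASTHOPS 1

/*
 * Pick the hop limit the way the quoted snippet does: use the
 * per-socket value, and fall back to the route's hop limit only
 * when the per-socket value is negative.
 */
static int sketch_select_hlimit(const struct sketch_np *np, int multicast,
				int dst_hoplimit)
{
	int hlimit = multicast ? np->mcast_hops : np->hop_limit;

	if (hlimit < 0)
		hlimit = dst_hoplimit;
	return hlimit;
}
```

With mcast_hops initialized to -1, a multicast send falls through to the
route's value (64 here); with mcast_hops initialized to
SKETCH_IPV6_DEFAULT_MCASTHOPS it stays at 1, and the unicast path is
unaffected.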

^ permalink raw reply

* Re: [PATCH 1/3] ptp: Added a brand new class driver for ptp clocks.
From: Wolfgang Grandegger @ 2010-05-04  7:57 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev
In-Reply-To: <20100503100754.GA30417@riccoc20.at.omicron.at>

On 05/03/2010 12:07 PM, Richard Cochran wrote:
> On Sun, May 02, 2010 at 12:50:56PM +0200, Wolfgang Grandegger wrote:
>>
>> As long as the device is in use by an application, no other can access
>> it, because the mutex is locked. Other application may want to read the
>> PTP clock time while ptpd is running, though.
> 
> Yes, of course. I implemented it that way just to get started. I first
> want to concentrate on getting the basic drivers in place (still have
> IXP46x and Phyter to do), and then on the ancillary features, like
> timers, time stamping external inputs, and so on.
> 
> I understand that some fine grained access control to the PTP clock
> would be nice to have, but I am not sure exactly what would work best,
> and I would like to save that decision for later...

OK, fair enough.

> However, if you have some ideas, please take a look at the list of
> features in the docu, and explain how you would like the access
> control to work.

The fine grain locking for set/get properties is tricky. I personally
would just require "CAP_SYS_TIME" for setting properties and a
spin_lock_irq* to protect against concurrent access, just like for
setting/getting system clock properties.

> Or better yet, post a patch ;)

Would do if we could agree on the solution mentioned above.

Wolfgang.

^ permalink raw reply

* Re: Performance problem in network namespaces
From: Benny Amorsen @ 2010-05-04  9:48 UTC (permalink / raw)
  To: Martín Ferrari; +Cc: netdev, Mathieu Lacage
In-Reply-To: <q2tb9800b71005030225q21365e3ch8dd5b35383a32e8a@mail.gmail.com>

Martín Ferrari <martin.ferrari@gmail.com> writes:

> When running some benchmarks to test the feasibility of using
> namespaces for emulating networks, I have found a big drop in
> performance when one of the namespaces is performing routing of
> packets.

Is this problem specific to vnet, or do the other types of interfaces
suffer from it as well? (phys, vlan, macvlan...)


/Benny


^ permalink raw reply

* [GIT PULL] amended: first round of vhost-net enhancements for net-next
From: Michael S. Tsirkin @ 2010-05-04 11:21 UTC (permalink / raw)
  To: David Miller; +Cc: kvm, virtualization, netdev, linux-kernel

David,
This is an amended pull request: I have rebased the tree to the
correct patches. This has been through basic tests and seems
to work fine here.

The following tree includes a couple of enhancements that help vhost-net.
Please pull them for net-next. Another set of patches is under
debugging/testing and I hope to get them ready in time for 2.6.35,
so there may be another pull request later.

Thanks!

The following changes since commit 7ef527377b88ff05fb122a47619ea506c631c914:

  Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 (2010-05-02 22:02:06 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost

Michael S. Tsirkin (2):
      tun: add ioctl to modify vnet header size
      macvtap: add ioctl to modify vnet header size

 drivers/net/macvtap.c  |   27 +++++++++++++++++++++++----
 drivers/net/tun.c      |   32 ++++++++++++++++++++++++++++----
 include/linux/if_tun.h |    2 ++
 3 files changed, 53 insertions(+), 8 deletions(-)

-- 
MST

^ permalink raw reply

* Re: TCP-MD5 checksum failure on x86_64 SMP
From: Ben Hutchings @ 2010-05-04 11:32 UTC (permalink / raw)
  To: Bhaskar Dutta; +Cc: netdev
In-Reply-To: <o2h571fb4001005032030tdf02a4fag520ec4e56ebdb8df@mail.gmail.com>

On Tue, 2010-05-04 at 09:00 +0530, Bhaskar Dutta wrote:
> Hi,
> 
> I am observing intermittent TCP-MD5 checksum failures
> (CONFIG_TCP_MD5SIG)  on kernel 2.6.31 while talking to a BGP router.
> 
> The problem is only seen in multi-core 64 bit machines.
> Is there any known bug in the per_cpu_ptr implementation (I am aware
> that the percpu allocator has been re-implemented in 2.6.33) that
> might cause a corruption in 64 bit SMP machines?
> 
> Any pointers would be appreciated.

There was another recent report of incorrect MD5 signatures in
<http://thread.gmane.org/gmane.linux.network/159556>, but without any
response.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* [patch] sunrpc: add missing return statement
From: Johannes Weiner @ 2010-05-04 11:59 UTC (permalink / raw)
  To: David S. Miller
  Cc: Alexandros Batsakis, linux-nfs, netdev, linux-kernel

f300bab "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"
introduced an error case branch that lacks an actual `return' keyword
before the return value.  Add it.

Signed-off-by: Johannes Weiner <jw@emlix.com>
Cc: Alexandros Batsakis <batsakis@netapp.com>
---
 net/sunrpc/xprtsock.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2444,7 +2444,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(
 	struct svc_sock *bc_sock;
 
 	if (!args->bc_xprt)
-		ERR_PTR(-EINVAL);
+		return ERR_PTR(-EINVAL);
 
 	xprt = xs_setup_xprt(args, xprt_tcp_slot_table_entries);
 	if (IS_ERR(xprt))
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [patch] sunrpc: add missing return statement
From: Trond Myklebust @ 2010-05-04 12:27 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: David S. Miller, Alexandros Batsakis, linux-nfs, netdev,
	linux-kernel
In-Reply-To: <20100504115759.266396633@emlix.com>

On Tue, 2010-05-04 at 13:59 +0200, Johannes Weiner wrote: 
> f300bab "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"
> introduced an error case branch that lacks an actual `return' keyword
> before the return value.  Add it.
> 
> Signed-off-by: Johannes Weiner <jw@emlix.com>
> Cc: Alexandros Batsakis <batsakis@netapp.com>
> ---
>  net/sunrpc/xprtsock.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2444,7 +2444,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(
>  	struct svc_sock *bc_sock;
>  
>  	if (!args->bc_xprt)
> -		ERR_PTR(-EINVAL);
> +		return ERR_PTR(-EINVAL);
>  
>  	xprt = xs_setup_xprt(args, xprt_tcp_slot_table_entries);
>  	if (IS_ERR(xprt))

No. It should either be a BUG_ON(), or else be removed entirely.
Returning an error value for something that is clearly a programming bug
is not a particularly useful exercise...

Cheers
  Trond

^ permalink raw reply

* Re: [patch] sunrpc: add missing return statement
From: Tetsuo Handa @ 2010-05-04 13:03 UTC (permalink / raw)
  To: trond.myklebust, jw
  Cc: davem, batsakis, linux-nfs, netdev, linux-kernel
In-Reply-To: <1272976042.7559.24.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>

Trond Myklebust wrote:
> On Tue, 2010-05-04 at 13:59 +0200, Johannes Weiner wrote: 
> > f300bab "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"
> > introduced an error case branch that lacks an actual `return' keyword
> > before the return value.  Add it.
> > 
> > Signed-off-by: Johannes Weiner <jw@emlix.com>
> > Cc: Alexandros Batsakis <batsakis@netapp.com>
> > ---
> >  net/sunrpc/xprtsock.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -2444,7 +2444,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(
> >  	struct svc_sock *bc_sock;
> >  
> >  	if (!args->bc_xprt)
> > -		ERR_PTR(-EINVAL);
> > +		return ERR_PTR(-EINVAL);
> >  
> >  	xprt = xs_setup_xprt(args, xprt_tcp_slot_table_entries);
> >  	if (IS_ERR(xprt))
> 
> No. It should either be a BUG_ON(), or else be removed entirely.
> Returning an error value for something that is clearly a programming bug
> is not a particularly useful exercise...
> 
Removing the NULL check is wrong because it will cause a NULL pointer dereference later.

Tetsuo Handa wrote:
> Jani Nikula wrote:
> > Signed-off-by: Jani Nikula <ext-jani.1.nikula-xNZwKgViW5gAvxtiuMwx3w@public.gmane.org>
> > 
> > ---
> > 
> > NOTE: I'm afraid I'm unable to test this; please consider this more a
> > bug report than a complete patch.
> > ---
> Indeed, it has to be "return ERR_PTR(-EINVAL);".
> Otherwise, it will trigger NULL pointer dereference some lines later.
> 
>     bc_sock = container_of(args->bc_xprt, struct svc_sock, sk_xprt);
>     bc_sock->sk_bc_xprt = xprt;
> 
> This bug was introduced by f300baba5a1536070d6d77bf0c8c4ca999bb4f0f
> "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel" and
> exists in 2.6.32 and later.
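
A minimal userspace sketch of the point above, under stated assumptions:
ERR_PTR() and IS_ERR() here are simplified stand-ins for the kernel helpers,
and setup_without_return(), setup_with_return(), reached_body and demo() are
hypothetical names used only for illustration, not code from xprtsock.c.
It shows that "ERR_PTR(-EINVAL);" without "return" evaluates the error
pointer, throws it away, and lets execution continue into the code that
would dereference the NULL argument.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR()/IS_ERR(). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_xprt;		/* stands in for a successfully built xprt */
static int reached_body;	/* counts calls that got past the guard */

/* Buggy shape: the error pointer is computed and then discarded. */
void *setup_without_return(const void *bc_xprt)
{
	if (!bc_xprt)
		ERR_PTR(-EINVAL);	/* no effect, execution continues */

	reached_body++;		/* the real code dereferences bc_xprt here */
	return &dummy_xprt;
}

/* Fixed shape: the caller actually receives the error. */
void *setup_with_return(const void *bc_xprt)
{
	if (!bc_xprt)
		return ERR_PTR(-EINVAL);

	reached_body++;
	return &dummy_xprt;
}

/* Usage: call both variants with a NULL back channel and compare. */
void demo(void)
{
	reached_body = 0;
	assert(!IS_ERR(setup_without_return(NULL)));	/* guard was a no-op */
	assert(reached_body == 1);
	assert(IS_ERR(setup_with_return(NULL)));	/* guard stopped us */
	assert(reached_body == 1);
}
```

Whether the guard should return an error or become a BUG_ON() is a separate
question; the sketch only shows that the statement as committed does neither.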
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [patch] sunrpc: add missing return statement
From: Trond Myklebust @ 2010-05-04 13:13 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: jw, davem, batsakis, linux-nfs, netdev, linux-kernel
In-Reply-To: <201005042203.DAJ09881.tFQOJFHSOLVOMF@I-love.SAKURA.ne.jp>

On Tue, 2010-05-04 at 22:03 +0900, Tetsuo Handa wrote: 
> Trond Myklebust wrote:
> > On Tue, 2010-05-04 at 13:59 +0200, Johannes Weiner wrote: 
> > > f300bab "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"
> > > introduced an error case branch that lacks an actual `return' keyword
> > > before the return value.  Add it.
> > > 
> > > Signed-off-by: Johannes Weiner <jw@emlix.com>
> > > Cc: Alexandros Batsakis <batsakis@netapp.com>
> > > ---
> > >  net/sunrpc/xprtsock.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > --- a/net/sunrpc/xprtsock.c
> > > +++ b/net/sunrpc/xprtsock.c
> > > @@ -2444,7 +2444,7 @@ static struct rpc_xprt *xs_setup_bc_tcp(
> > >  	struct svc_sock *bc_sock;
> > >  
> > >  	if (!args->bc_xprt)
> > > -		ERR_PTR(-EINVAL);
> > > +		return ERR_PTR(-EINVAL);
> > >  
> > >  	xprt = xs_setup_xprt(args, xprt_tcp_slot_table_entries);
> > >  	if (IS_ERR(xprt))
> > 
> > No. It should either be a BUG_ON(), or else be removed entirely.
> > Returning an error value for something that is clearly a programming bug
> > is not a particularly useful exercise...
> > 
> Removing the NULL check is wrong because it will cause a NULL pointer dereference later.

Wrong. Removing NULL check is _right_ because calling this function
without setting up a back channel first is a major BUG. Returning an
error value to the user is pointless, since the user has no control over
this. It is entirely under control of the sunrpc developers...

Trond


^ permalink raw reply

* Re: [patch] sunrpc: add missing return statement
From: Tetsuo Handa @ 2010-05-04 14:02 UTC (permalink / raw)
  To: trond.myklebust-41N18TsMXrtuMpJDpNschA
  Cc: jw-QdrG9jWwCLEAvxtiuMwx3w, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	batsakis-HgOvQuBEEgTQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1272978815.7559.27.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>

Trond Myklebust wrote:
> > > No. It should either be a BUG_ON(), or else be removed entirely.
> > > Returning an error value for something that is clearly a programming bug
> > > is not a particularly useful exercise...
> > > 
> > Removing the NULL check is wrong because it will cause a NULL pointer dereference later.
> 
> Wrong. Removing NULL check is _right_ because calling this function
> without setting up a back channel first is a major BUG. Returning an
> error value to the user is pointless, since the user has no control over
> this. It is entirely under control of the sunrpc developers...
> 
For security people, removing

	if (!args->bc_xprt)
		ERR_PTR(-EINVAL);

is worse and changing to

	BUG_ON(!args->bc_xprt);

is better.

^ permalink raw reply

* Re: [PATCH] Comment sysfs directory tagging
From: Tejun Heo @ 2010-05-04 14:03 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Greg KH, Eric W. Biederman, bcrl, benjamin.thery, cornelia.huck,
	eric.dumazet, kay.sievers, netdev
In-Reply-To: <20100503212315.GA4141@us.ibm.com>

Hello,

On 05/03/2010 11:23 PM, Serge E. Hallyn wrote:
> (against gregkh-2.6)
> 
> Add some in-line comments to explain the new infrastructure, which
> was introduced to support sysfs directory tagging with namespaces.
> I think an overall description someplace might be good too, but it
> didn't really seem to fit into Documentation/filesystems/sysfs.txt,
> which appears more geared toward users, rather than maintainers, of
> sysfs.
> 
> (Tejun, please let me know if I can make anything clearer or failed
> altogether to comment something that should be commented.)

This sure is an improvement but yeah I agree that having an overall
documentation would be great.  If there isn't a matching one, just
create one - sysfs-ns.txt, sysfs-tagged.txt or whatever.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: sctp pull request for net-next-2.6
From: Vlad Yasevich @ 2010-05-04 14:27 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20100503.162401.102667647.davem@davemloft.net>



David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Mon, 03 May 2010 16:21:48 -0700 (PDT)
> 
>> From: Vlad Yasevich <vladislav.yasevich@hp.com>
>> Date: Fri, 30 Apr 2010 22:52:39 -0400
>>
>>> The following changes since commit 83d7eb2979cd3390c375470225dd2d8f2009bc70:
>>>   Dan Carpenter (1):
>>>         ipv6: cleanup: remove unneeded null check
>>>
>>> are available in the git repository at:
>>>
>>>   git://git.kernel.org/pub/scm/linux/kernel/git/vxy/lksctp-dev.git net-next
>> Pulled, thanks Vlad.
> 
> I had to fix the build when I merged this by adding a missing
> linux/vmalloc.h include to net/sctp/probe.c
> 
> net/sctp/probe.c: In function ‘sctpprobe_read’:
> net/sctp/probe.c:97: error: implicit declaration of function ‘vmalloc’
> net/sctp/probe.c:97: warning: assignment makes pointer from integer without a cast
> net/sctp/probe.c:110: error: implicit declaration of function ‘vfree’
> 

I feel sheepish.  I just checked my config and realized this thing wasn't
turned on (my guess is it got turned off by the stable builds...).  Sorry about that.

-vlad

> diff --git a/net/sctp/probe.c b/net/sctp/probe.c
> index 8f025d5..db3a42b 100644
> --- a/net/sctp/probe.c
> +++ b/net/sctp/probe.c
> @@ -27,6 +27,7 @@
>  #include <linux/socket.h>
>  #include <linux/sctp.h>
>  #include <linux/proc_fs.h>
> +#include <linux/vmalloc.h>
>  #include <linux/module.h>
>  #include <linux/kfifo.h>
>  #include <linux/time.h>

^ permalink raw reply

* Re: TCP-MD5 checksum failure on x86_64 SMP
From: Bhaskar Dutta @ 2010-05-04 14:28 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev
In-Reply-To: <1272972722.2097.1.camel@achroite.uk.solarflarecom.com>

On Tue, May 4, 2010 at 5:02 PM, Ben Hutchings <bhutchings@solarflare.com> wrote:
> On Tue, 2010-05-04 at 09:00 +0530, Bhaskar Dutta wrote:
>> Hi,
>>
>> I am observing intermittent TCP-MD5 checksum failures
>> (CONFIG_TCP_MD5SIG)  on kernel 2.6.31 while talking to a BGP router.
>>
>> The problem is only seen in multi-core 64 bit machines.
>> Is there any known bug in the per_cpu_ptr implementation (I am aware
>> that the percpu allocator has been re-implemented in 2.6.33) that
>> might cause a corruption in 64 bit SMP machines?
>>
>> Any pointers would be appreciated.
>
> There was another recent report of incorrect MD5 signatures in
> <http://thread.gmane.org/gmane.linux.network/159556>, but without any
> response.
>
> Ben.
>

I found another thread posted back in Jan 2007 with a similar bug
(x86_64 on 2.6.20), but it got no replies either.
http://lkml.org/lkml/2007/1/20/56

Bhaskar

^ permalink raw reply

* Re: [PATCH]PM QOS refresh against next-20100430
From: mark gross @ 2010-05-04 14:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Kevin Hilman, aili, dwalker, tiwai, bruce.w.allan, davidb, mcgrof,
	pavel, linux-pm, lkml, NetDev, Johannes Berg,
	ACPI Devel Maling List, Len Brown, John W. Linville
In-Reply-To: <201005010108.28593.rjw@sisk.pl>

On Sat, May 01, 2010 at 01:08:28AM +0200, Rafael J. Wysocki wrote:
> On Saturday 01 May 2010, mark gross wrote:
> > On Sat, May 01, 2010 at 12:13:16AM +0200, Rafael J. Wysocki wrote:
> > > On Friday 30 April 2010, mark gross wrote:
> > > > The following is a refresh of the PM_QOS implementation, this patch
> > > > updates some documentation input I got from Randy.
> > > > 
> > > > This patch changes the string-based list management to a handle-based
> > > > implementation to help with the hot path use of pm-qos, it also renames
> > > > much of the API to use "request" as opposed to "requirement" that was
> > > > used in the initial implementation.  I did this because request more
> > > > accurately represents what it actually does.
> > > > 
> > > > Also, I added a string based ABI for users wanting to use a string
> > > > interface.  So if the user writes 0xDDDDDDDD formatted hex it will be
> > > > accepted by the interface.  (someone asked me for it and I don't think
> > > > it hurts anything.)
> > > > 
> > > > I really would like to get this refresh taken care of.  It's been taking
> > > > me too long to close this.  Please review or include it in next.
> > > > 
> > > > Thanks!
> > > 
> > > Well, I'd take it to suspend-2.6/linux-next, but first, it touches
> > > subsystems whose maintainers were not in the Cc list, like the network
> > > drivers, wireless and ACPI.  The changes are trivial, so I hope they don't
> > > mind.
> > > 
> > > Second, my tree is based on the Linus' tree rather than linux-next and
> > > the change in net/mac80211/scan.c doesn't seem to match that.  Please tell me
> > > what I'm supposed to do about that.
> > 
> > > You can wait for Monday and I'll send a version rebased to Linus' tree.
> > > 
> > > I thought linux-next was where folks wanted me to put it.
> > > 
> > > I'll email out a new one Monday.
> 
> Great, thanks!
> 
> Rafael
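
The move from name-based lookup to handles described above can be sketched
in userspace roughly as follows. This is a simplified model under stated
assumptions, not the code in kernel/pm_qos_params.c: struct qos_class,
struct qos_request and the qos_*() functions are hypothetical names, there
is no locking and no notifier chain, and the aggregate is always a minimum
(as for a latency class). The point it illustrates is that an update
through a handle needs no string compare and no search for the entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct qos_class;

/* One request; a pointer to it is the handle the caller keeps. */
struct qos_request {
	struct qos_request *next;
	struct qos_class *owner;	/* lets update/remove find the class */
	int32_t value;
};

struct qos_class {
	struct qos_request *head;	/* singly linked list of live requests */
	int32_t default_value;		/* target when no request is stricter */
	int32_t target;			/* cached aggregate (minimum) */
};

/* Recompute the aggregate: the smallest requested value wins. */
static void qos_update_target(struct qos_class *c)
{
	int32_t extreme = c->default_value;
	struct qos_request *r;

	for (r = c->head; r; r = r->next)
		if (r->value < extreme)
			extreme = r->value;
	c->target = extreme;
}

/* Returns a handle to keep, or NULL if the allocation failed. */
struct qos_request *qos_add_request(struct qos_class *c, int32_t value)
{
	struct qos_request *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	r->owner = c;
	r->value = value;
	r->next = c->head;
	c->head = r;
	qos_update_target(c);
	return r;
}

/* The handle is the list entry, so no lookup by name is needed. */
void qos_update_request(struct qos_request *r, int32_t new_value)
{
	assert(r != NULL);
	if (r->value == new_value)
		return;			/* unchanged, skip the recompute */
	r->value = new_value;
	qos_update_target(r->owner);
}

/* Unlink the entry, free it, and recompute the aggregate. */
void qos_remove_request(struct qos_request *r)
{
	struct qos_class *c;
	struct qos_request **pp;

	assert(r != NULL);
	c = r->owner;
	for (pp = &c->head; *pp; pp = &(*pp)->next) {
		if (*pp == r) {
			*pp = r->next;
			break;
		}
	}
	free(r);
	qos_update_target(c);
}
```

A caller would keep the pointer returned by qos_add_request() and pass it
to qos_update_request() and qos_remove_request(), much as the converted
drivers in the patch keep the value returned by pm_qos_add_request().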

Sorry I'm late, 


Signed-off-by: markgross <mgross@linux.intel.com>



>From e6300f50eff52d585dd9d5e14364dba5aa542477 Mon Sep 17 00:00:00 2001
From: markgross <mark.gross@intel.com>
Date: Mon, 3 May 2010 12:46:42 -0700
Subject: [PATCH] PM_QOS to use handle based list implementation and exported function name changes to be more descriptive of what is actually happening.

---
 Documentation/power/pm_qos_interface.txt |   48 ++++---
 drivers/acpi/processor_idle.c            |    2 +-
 drivers/cpuidle/governors/ladder.c       |    2 +-
 drivers/cpuidle/governors/menu.c         |    2 +-
 drivers/net/e1000e/netdev.c              |   22 ++--
 drivers/net/igbvf/netdev.c               |    6 +-
 drivers/net/wireless/ipw2x00/ipw2100.c   |   11 +-
 include/linux/netdevice.h                |    4 +
 include/linux/pm_qos_params.h            |   14 +-
 include/sound/pcm.h                      |    3 +-
 kernel/pm_qos_params.c                   |  214 ++++++++++++++---------------
 net/mac80211/mlme.c                      |    2 +-
 sound/core/pcm.c                         |    3 -
 sound/core/pcm_native.c                  |   14 +-
 14 files changed, 176 insertions(+), 171 deletions(-)

diff --git a/Documentation/power/pm_qos_interface.txt b/Documentation/power/pm_qos_interface.txt
index c40866e..bfed898 100644
--- a/Documentation/power/pm_qos_interface.txt
+++ b/Documentation/power/pm_qos_interface.txt
@@ -18,44 +18,46 @@ and pm_qos_params.h.  This is done because having the available parameters
 being runtime configurable or changeable from a driver was seen as too easy to
 abuse.
 
-For each parameter a list of performance requirements is maintained along with
+For each parameter a list of performance requests is maintained along with
 an aggregated target value.  The aggregated target value is updated with
-changes to the requirement list or elements of the list.  Typically the
-aggregated target value is simply the max or min of the requirement values held
+changes to the request list or elements of the list.  Typically the
+aggregated target value is simply the max or min of the request values held
 in the parameter list elements.
 
 From kernel mode the use of this interface is simple:
-pm_qos_add_requirement(param_id, name, target_value):
-Will insert a named element in the list for that identified PM_QOS parameter
-with the target value.  Upon change to this list the new target is recomputed
-and any registered notifiers are called only if the target value is now
-different.
 
-pm_qos_update_requirement(param_id, name, new_target_value):
-Will search the list identified by the param_id for the named list element and
-then update its target value, calling the notification tree if the aggregated
-target is changed.  with that name is already registered.
+handle = pm_qos_add_request(param_class, target_value):
+Will insert an element into the list for that identified PM_QOS class with the
+target value.  Upon change to this list the new target is recomputed and any
+registered notifiers are called only if the target value is now different.
+Clients of pm_qos need to save the returned handle.
 
-pm_qos_remove_requirement(param_id, name):
-Will search the identified list for the named element and remove it, after
-removal it will update the aggregate target and call the notification tree if
-the target was changed as a result of removing the named requirement.
+void pm_qos_update_request(handle, new_target_value):
+Will update the list element pointed to by the handle with the new target value
+and recompute the new aggregated target, calling the notification tree if the
+target is changed.
+
+void pm_qos_remove_request(handle):
+Will remove the element.  After removal it will update the aggregate target and
+call the notification tree if the target was changed as a result of removing
+the request.
 
 
 From user mode:
-Only processes can register a pm_qos requirement.  To provide for automatic
-cleanup for process the interface requires the process to register its
-parameter requirements in the following way:
+Only processes can register a pm_qos request.  To provide for automatic
+cleanup of a process, the interface requires the process to register its
+parameter requests in the following way:
 
 To register the default pm_qos target for the specific parameter, the process
 must open one of /dev/[cpu_dma_latency, network_latency, network_throughput]
 
 As long as the device node is held open that process has a registered
-requirement on the parameter.  The name of the requirement is "process_<PID>"
-derived from the current->pid from within the open system call.
+request on the parameter.
 
-To change the requested target value the process needs to write a s32 value to
-the open device node.  This translates to a pm_qos_update_requirement call.
+To change the requested target value the process needs to write an s32 value to
+the open device node.  Alternatively the user mode program could write a hex
+string for the value using 10 char long format e.g. "0x12345678".  This
+translates to a pm_qos_update_request call.
 
 To remove the user mode request for a target value simply close the device
 node.
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 5939e7f..c3817e1 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -698,7 +698,7 @@ static int acpi_processor_power_seq_show(struct seq_file *seq, void *offset)
 		   "max_cstate:              C%d\n"
 		   "maximum allowed latency: %d usec\n",
 		   pr->power.state ? pr->power.state - pr->power.states : 0,
-		   max_cstate, pm_qos_requirement(PM_QOS_CPU_DMA_LATENCY));
+		   max_cstate, pm_qos_request(PM_QOS_CPU_DMA_LATENCY));
 
 	seq_puts(seq, "states:\n");
 
diff --git a/drivers/cpuidle/governors/ladder.c b/drivers/cpuidle/governors/ladder.c
index 1c1ceb4..12c9890 100644
--- a/drivers/cpuidle/governors/ladder.c
+++ b/drivers/cpuidle/governors/ladder.c
@@ -67,7 +67,7 @@ static int ladder_select_state(struct cpuidle_device *dev)
 	struct ladder_device *ldev = &__get_cpu_var(ladder_devices);
 	struct ladder_device_state *last_state;
 	int last_residency, last_idx = ldev->last_state_idx;
-	int latency_req = pm_qos_requirement(PM_QOS_CPU_DMA_LATENCY);
+	int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
 
 	/* Special case when user has set very strict latency requirement */
 	if (unlikely(latency_req == 0)) {
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 1aea715..61ca939 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -183,7 +183,7 @@ static u64 div_round64(u64 dividend, u32 divisor)
 static int menu_select(struct cpuidle_device *dev)
 {
 	struct menu_device *data = &__get_cpu_var(menu_devices);
-	int latency_req = pm_qos_requirement(PM_QOS_CPU_DMA_LATENCY);
+	int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
 	int i;
 	int multiplier;
 
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index fb8fc7d..021e1dd 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -2524,12 +2524,12 @@ static void e1000_configure_rx(struct e1000_adapter *adapter)
 			 * excessive C-state transition latencies result in
 			 * dropped transactions.
 			 */
-			pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY,
-						  adapter->netdev->name, 55);
+			pm_qos_update_request(
+				adapter->netdev->pm_qos_req, 55);
 		} else {
-			pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY,
-						  adapter->netdev->name,
-						  PM_QOS_DEFAULT_VALUE);
+			pm_qos_update_request(
+				adapter->netdev->pm_qos_req,
+				PM_QOS_DEFAULT_VALUE);
 		}
 	}
 
@@ -2824,8 +2824,8 @@ int e1000e_up(struct e1000_adapter *adapter)
 
 	/* DMA latency requirement to workaround early-receive/jumbo issue */
 	if (adapter->flags & FLAG_HAS_ERT)
-		pm_qos_add_requirement(PM_QOS_CPU_DMA_LATENCY,
-		                       adapter->netdev->name,
+		adapter->netdev->pm_qos_req =
+			pm_qos_add_request(PM_QOS_CPU_DMA_LATENCY,
 				       PM_QOS_DEFAULT_VALUE);
 
 	/* hardware has been reset, we need to reload some things */
@@ -2887,9 +2887,11 @@ void e1000e_down(struct e1000_adapter *adapter)
 	e1000_clean_tx_ring(adapter);
 	e1000_clean_rx_ring(adapter);
 
-	if (adapter->flags & FLAG_HAS_ERT)
-		pm_qos_remove_requirement(PM_QOS_CPU_DMA_LATENCY,
-		                          adapter->netdev->name);
+	if (adapter->flags & FLAG_HAS_ERT) {
+		pm_qos_remove_request(
+			      adapter->netdev->pm_qos_req);
+		adapter->netdev->pm_qos_req = NULL;
+	}
 
 	/*
 	 * TODO: for power management, we could drop the link and
diff --git a/drivers/net/igbvf/netdev.c b/drivers/net/igbvf/netdev.c
index 1b1edad..f16e981 100644
--- a/drivers/net/igbvf/netdev.c
+++ b/drivers/net/igbvf/netdev.c
@@ -48,6 +48,7 @@
 #define DRV_VERSION "1.0.0-k0"
 char igbvf_driver_name[] = "igbvf";
 const char igbvf_driver_version[] = DRV_VERSION;
+struct pm_qos_request_list *igbvf_driver_pm_qos_req;
 static const char igbvf_driver_string[] =
 				"Intel(R) Virtual Function Network Driver";
 static const char igbvf_copyright[] = "Copyright (c) 2009 Intel Corporation.";
@@ -2899,7 +2900,7 @@ static int __init igbvf_init_module(void)
 	printk(KERN_INFO "%s\n", igbvf_copyright);
 
 	ret = pci_register_driver(&igbvf_driver);
-	pm_qos_add_requirement(PM_QOS_CPU_DMA_LATENCY, igbvf_driver_name,
+	igbvf_driver_pm_qos_req = pm_qos_add_request(PM_QOS_CPU_DMA_LATENCY,
 	                       PM_QOS_DEFAULT_VALUE);
 
 	return ret;
@@ -2915,7 +2916,8 @@ module_init(igbvf_init_module);
 static void __exit igbvf_exit_module(void)
 {
 	pci_unregister_driver(&igbvf_driver);
-	pm_qos_remove_requirement(PM_QOS_CPU_DMA_LATENCY, igbvf_driver_name);
+	pm_qos_remove_request(igbvf_driver_pm_qos_req);
+	igbvf_driver_pm_qos_req = NULL;
 }
 module_exit(igbvf_exit_module);
 
diff --git a/drivers/net/wireless/ipw2x00/ipw2100.c b/drivers/net/wireless/ipw2x00/ipw2100.c
index 9b72c45..2b05fe5 100644
--- a/drivers/net/wireless/ipw2x00/ipw2100.c
+++ b/drivers/net/wireless/ipw2x00/ipw2100.c
@@ -174,6 +174,8 @@ that only one external action is invoked at a time.
 #define DRV_DESCRIPTION	"Intel(R) PRO/Wireless 2100 Network Driver"
 #define DRV_COPYRIGHT	"Copyright(c) 2003-2006 Intel Corporation"
 
+struct pm_qos_request_list *ipw2100_pm_qos_req;
+
 /* Debugging stuff */
 #ifdef CONFIG_IPW2100_DEBUG
 #define IPW2100_RX_DEBUG	/* Reception debugging */
@@ -1739,7 +1741,7 @@ static int ipw2100_up(struct ipw2100_priv *priv, int deferred)
 	/* the ipw2100 hardware really doesn't want power management delays
 	 * longer than 175usec
 	 */
-	pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "ipw2100", 175);
+	pm_qos_update_request(ipw2100_pm_qos_req, 175);
 
 	/* If the interrupt is enabled, turn it off... */
 	spin_lock_irqsave(&priv->low_lock, flags);
@@ -1887,8 +1889,7 @@ static void ipw2100_down(struct ipw2100_priv *priv)
 	ipw2100_disable_interrupts(priv);
 	spin_unlock_irqrestore(&priv->low_lock, flags);
 
-	pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "ipw2100",
-			PM_QOS_DEFAULT_VALUE);
+	pm_qos_update_request(ipw2100_pm_qos_req, PM_QOS_DEFAULT_VALUE);
 
 	/* We have to signal any supplicant if we are disassociating */
 	if (associated)
@@ -6669,7 +6670,7 @@ static int __init ipw2100_init(void)
 	if (ret)
 		goto out;
 
-	pm_qos_add_requirement(PM_QOS_CPU_DMA_LATENCY, "ipw2100",
+	ipw2100_pm_qos_req = pm_qos_add_request(PM_QOS_CPU_DMA_LATENCY,
 			PM_QOS_DEFAULT_VALUE);
 #ifdef CONFIG_IPW2100_DEBUG
 	ipw2100_debug_level = debug;
@@ -6692,7 +6693,7 @@ static void __exit ipw2100_exit(void)
 			   &driver_attr_debug_level);
 #endif
 	pci_unregister_driver(&ipw2100_pci_driver);
-	pm_qos_remove_requirement(PM_QOS_CPU_DMA_LATENCY, "ipw2100");
+	pm_qos_remove_request(ipw2100_pm_qos_req);
 }
 
 module_init(ipw2100_init);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index fa8b476..3857517 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -31,6 +31,7 @@
 #include <linux/if_link.h>
 
 #ifdef __KERNEL__
+#include <linux/pm_qos_params.h>
 #include <linux/timer.h>
 #include <linux/delay.h>
 #include <linux/mm.h>
@@ -711,6 +712,9 @@ struct net_device {
 	 * the interface.
 	 */
 	char			name[IFNAMSIZ];
+
+	struct pm_qos_request_list *pm_qos_req;
+
 	/* device name hash chain */
 	struct hlist_node	name_hlist;
 	/* snmp alias */
diff --git a/include/linux/pm_qos_params.h b/include/linux/pm_qos_params.h
index d74f75e..8ba440e 100644
--- a/include/linux/pm_qos_params.h
+++ b/include/linux/pm_qos_params.h
@@ -14,12 +14,14 @@
 #define PM_QOS_NUM_CLASSES 4
 #define PM_QOS_DEFAULT_VALUE -1
 
-int pm_qos_add_requirement(int qos, char *name, s32 value);
-int pm_qos_update_requirement(int qos, char *name, s32 new_value);
-void pm_qos_remove_requirement(int qos, char *name);
+struct pm_qos_request_list;
 
-int pm_qos_requirement(int qos);
+struct pm_qos_request_list *pm_qos_add_request(int pm_qos_class, s32 value);
+void pm_qos_update_request(struct pm_qos_request_list *pm_qos_req,
+		s32 new_value);
+void pm_qos_remove_request(struct pm_qos_request_list *pm_qos_req);
 
-int pm_qos_add_notifier(int qos, struct notifier_block *notifier);
-int pm_qos_remove_notifier(int qos, struct notifier_block *notifier);
+int pm_qos_request(int pm_qos_class);
+int pm_qos_add_notifier(int pm_qos_class, struct notifier_block *notifier);
+int pm_qos_remove_notifier(int pm_qos_class, struct notifier_block *notifier);
 
diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index 8b611a5..dd76cde 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -29,6 +29,7 @@
 #include <linux/poll.h>
 #include <linux/mm.h>
 #include <linux/bitops.h>
+#include <linux/pm_qos_params.h>
 
 #define snd_pcm_substream_chip(substream) ((substream)->private_data)
 #define snd_pcm_chip(pcm) ((pcm)->private_data)
@@ -365,7 +366,7 @@ struct snd_pcm_substream {
 	int number;
 	char name[32];			/* substream name */
 	int stream;			/* stream (direction) */
-	char latency_id[20];		/* latency identifier */
+	struct pm_qos_request_list *latency_pm_qos_req; /* pm_qos request */
 	size_t buffer_bytes_max;	/* limit ring buffer size */
 	struct snd_dma_buffer dma_buffer;
 	unsigned int dma_buf_id;
diff --git a/kernel/pm_qos_params.c b/kernel/pm_qos_params.c
index 3db49b9..a1aea04 100644
--- a/kernel/pm_qos_params.c
+++ b/kernel/pm_qos_params.c
@@ -2,7 +2,7 @@
  * This module exposes the interface to kernel space for specifying
  * QoS dependencies.  It provides infrastructure for registration of:
  *
- * Dependents on a QoS value : register requirements
+ * Dependents on a QoS value : register requests
  * Watchers of QoS value : get notified when target QoS value changes
  *
  * This QoS design is best effort based.  Dependents register their QoS needs.
@@ -14,19 +14,21 @@
  * timeout: usec <-- currently not used.
  * throughput: kbs (kilo byte / sec)
  *
- * There are lists of pm_qos_objects each one wrapping requirements, notifiers
+ * There are lists of pm_qos_objects each one wrapping requests, notifiers
  *
- * User mode requirements on a QOS parameter register themselves to the
+ * User mode requests on a QOS parameter register themselves to the
  * subsystem by opening the device node /dev/... and writing there request to
  * the node.  As long as the process holds a file handle open to the node the
  * client continues to be accounted for.  Upon file release the usermode
- * requirement is removed and a new qos target is computed.  This way when the
- * requirement that the application has is cleaned up when closes the file
+ * request is removed and a new qos target is computed.  This way when the
+ * request that the application has is cleaned up when closes the file
  * pointer or exits the pm_qos_object will get an opportunity to clean up.
  *
  * Mark Gross <mgross@linux.intel.com>
  */
 
+/*#define DEBUG*/
+
 #include <linux/pm_qos_params.h>
 #include <linux/sched.h>
 #include <linux/spinlock.h>
@@ -42,25 +44,25 @@
 #include <linux/uaccess.h>
 
 /*
- * locking rule: all changes to requirements or notifiers lists
+ * locking rule: all changes to requests or notifiers lists
  * or pm_qos_object list and pm_qos_objects need to happen with pm_qos_lock
  * held, taken with _irqsave.  One lock to rule them all
  */
-struct requirement_list {
+struct pm_qos_request_list {
 	struct list_head list;
 	union {
 		s32 value;
 		s32 usec;
 		s32 kbps;
 	};
-	char *name;
+	int pm_qos_class;
 };
 
 static s32 max_compare(s32 v1, s32 v2);
 static s32 min_compare(s32 v1, s32 v2);
 
 struct pm_qos_object {
-	struct requirement_list requirements;
+	struct pm_qos_request_list requests;
 	struct blocking_notifier_head *notifiers;
 	struct miscdevice pm_qos_power_miscdev;
 	char *name;
@@ -72,7 +74,7 @@ struct pm_qos_object {
 static struct pm_qos_object null_pm_qos;
 static BLOCKING_NOTIFIER_HEAD(cpu_dma_lat_notifier);
 static struct pm_qos_object cpu_dma_pm_qos = {
-	.requirements = {LIST_HEAD_INIT(cpu_dma_pm_qos.requirements.list)},
+	.requests = {LIST_HEAD_INIT(cpu_dma_pm_qos.requests.list)},
 	.notifiers = &cpu_dma_lat_notifier,
 	.name = "cpu_dma_latency",
 	.default_value = 2000 * USEC_PER_SEC,
@@ -82,7 +84,7 @@ static struct pm_qos_object cpu_dma_pm_qos = {
 
 static BLOCKING_NOTIFIER_HEAD(network_lat_notifier);
 static struct pm_qos_object network_lat_pm_qos = {
-	.requirements = {LIST_HEAD_INIT(network_lat_pm_qos.requirements.list)},
+	.requests = {LIST_HEAD_INIT(network_lat_pm_qos.requests.list)},
 	.notifiers = &network_lat_notifier,
 	.name = "network_latency",
 	.default_value = 2000 * USEC_PER_SEC,
@@ -93,8 +95,7 @@ static struct pm_qos_object network_lat_pm_qos = {
 
 static BLOCKING_NOTIFIER_HEAD(network_throughput_notifier);
 static struct pm_qos_object network_throughput_pm_qos = {
-	.requirements =
-		{LIST_HEAD_INIT(network_throughput_pm_qos.requirements.list)},
+	.requests = {LIST_HEAD_INIT(network_throughput_pm_qos.requests.list)},
 	.notifiers = &network_throughput_notifier,
 	.name = "network_throughput",
 	.default_value = 0,
@@ -135,31 +136,34 @@ static s32 min_compare(s32 v1, s32 v2)
 }
 
 
-static void update_target(int target)
+static void update_target(int pm_qos_class)
 {
 	s32 extreme_value;
-	struct requirement_list *node;
+	struct pm_qos_request_list *node;
 	unsigned long flags;
 	int call_notifier = 0;
 
 	spin_lock_irqsave(&pm_qos_lock, flags);
-	extreme_value = pm_qos_array[target]->default_value;
+	extreme_value = pm_qos_array[pm_qos_class]->default_value;
 	list_for_each_entry(node,
-			&pm_qos_array[target]->requirements.list, list) {
-		extreme_value = pm_qos_array[target]->comparitor(
+			&pm_qos_array[pm_qos_class]->requests.list, list) {
+		extreme_value = pm_qos_array[pm_qos_class]->comparitor(
 				extreme_value, node->value);
 	}
-	if (atomic_read(&pm_qos_array[target]->target_value) != extreme_value) {
+	if (atomic_read(&pm_qos_array[pm_qos_class]->target_value) !=
+			extreme_value) {
 		call_notifier = 1;
-		atomic_set(&pm_qos_array[target]->target_value, extreme_value);
-		pr_debug(KERN_ERR "new target for qos %d is %d\n", target,
-			atomic_read(&pm_qos_array[target]->target_value));
+		atomic_set(&pm_qos_array[pm_qos_class]->target_value,
+				extreme_value);
+		pr_debug(KERN_ERR "new target for qos %d is %d\n", pm_qos_class,
+			atomic_read(&pm_qos_array[pm_qos_class]->target_value));
 	}
 	spin_unlock_irqrestore(&pm_qos_lock, flags);
 
 	if (call_notifier)
-		blocking_notifier_call_chain(pm_qos_array[target]->notifiers,
-			(unsigned long) extreme_value, NULL);
+		blocking_notifier_call_chain(
+				pm_qos_array[pm_qos_class]->notifiers,
+					(unsigned long) extreme_value, NULL);
 }
 
 static int register_pm_qos_misc(struct pm_qos_object *qos)
@@ -185,125 +189,110 @@ static int find_pm_qos_object_by_minor(int minor)
 }
 
 /**
- * pm_qos_requirement - returns current system wide qos expectation
+ * pm_qos_request - returns current system wide qos expectation
  * @pm_qos_class: identification of which qos value is requested
  *
  * This function returns the current target value in an atomic manner.
  */
-int pm_qos_requirement(int pm_qos_class)
+int pm_qos_request(int pm_qos_class)
 {
 	return atomic_read(&pm_qos_array[pm_qos_class]->target_value);
 }
-EXPORT_SYMBOL_GPL(pm_qos_requirement);
+EXPORT_SYMBOL_GPL(pm_qos_request);
 
 /**
- * pm_qos_add_requirement - inserts new qos request into the list
+ * pm_qos_add_request - inserts new qos request into the list
  * @pm_qos_class: identifies which list of qos request to us
- * @name: identifies the request
  * @value: defines the qos request
  *
  * This function inserts a new entry in the pm_qos_class list of requested qos
  * performance characteristics.  It recomputes the aggregate QoS expectations
- * for the pm_qos_class of parameters.
+ * for the pm_qos_class of parameters, and returns the pm_qos_request list
+ * element as a handle for use in updating and removal.  Call needs to save
+ * this handle for later use.
  */
-int pm_qos_add_requirement(int pm_qos_class, char *name, s32 value)
+struct pm_qos_request_list *pm_qos_add_request(int pm_qos_class, s32 value)
 {
-	struct requirement_list *dep;
+	struct pm_qos_request_list *dep;
 	unsigned long flags;
 
-	dep = kzalloc(sizeof(struct requirement_list), GFP_KERNEL);
+	dep = kzalloc(sizeof(struct pm_qos_request_list), GFP_KERNEL);
 	if (dep) {
 		if (value == PM_QOS_DEFAULT_VALUE)
 			dep->value = pm_qos_array[pm_qos_class]->default_value;
 		else
 			dep->value = value;
-		dep->name = kstrdup(name, GFP_KERNEL);
-		if (!dep->name)
-			goto cleanup;
+		dep->pm_qos_class = pm_qos_class;
 
 		spin_lock_irqsave(&pm_qos_lock, flags);
 		list_add(&dep->list,
-			&pm_qos_array[pm_qos_class]->requirements.list);
+			&pm_qos_array[pm_qos_class]->requests.list);
 		spin_unlock_irqrestore(&pm_qos_lock, flags);
 		update_target(pm_qos_class);
-
-		return 0;
 	}
 
-cleanup:
-	kfree(dep);
-	return -ENOMEM;
+	return dep;
 }
-EXPORT_SYMBOL_GPL(pm_qos_add_requirement);
+EXPORT_SYMBOL_GPL(pm_qos_add_request);
 
 /**
- * pm_qos_update_requirement - modifies an existing qos request
- * @pm_qos_class: identifies which list of qos request to us
- * @name: identifies the request
+ * pm_qos_update_request - modifies an existing qos request
+ * @pm_qos_req: handle to list element holding a pm_qos request to use
  * @value: defines the qos request
  *
- * Updates an existing qos requirement for the pm_qos_class of parameters along
+ * Updates an existing qos request for the pm_qos_class of parameters along
  * with updating the target pm_qos_class value.
  *
- * If the named request isn't in the list then no change is made.
+ * Attempts are made to make this code callable on hot code paths.
  */
-int pm_qos_update_requirement(int pm_qos_class, char *name, s32 new_value)
+void pm_qos_update_request(struct pm_qos_request_list *pm_qos_req,
+		s32 new_value)
 {
 	unsigned long flags;
-	struct requirement_list *node;
 	int pending_update = 0;
+	s32 temp;
 
 	spin_lock_irqsave(&pm_qos_lock, flags);
-	list_for_each_entry(node,
-		&pm_qos_array[pm_qos_class]->requirements.list, list) {
-		if (strcmp(node->name, name) == 0) {
-			if (new_value == PM_QOS_DEFAULT_VALUE)
-				node->value =
-				pm_qos_array[pm_qos_class]->default_value;
-			else
-				node->value = new_value;
-			pending_update = 1;
-			break;
-		}
+	if (new_value == PM_QOS_DEFAULT_VALUE)
+		temp = pm_qos_array[pm_qos_req->pm_qos_class]->default_value;
+	else
+		temp = new_value;
+
+	if (temp != pm_qos_req->value) {
+		pending_update = 1;
+		pm_qos_req->value = temp;
 	}
 	spin_unlock_irqrestore(&pm_qos_lock, flags);
 	if (pending_update)
-		update_target(pm_qos_class);
-
-	return 0;
+		update_target(pm_qos_req->pm_qos_class);
 }
-EXPORT_SYMBOL_GPL(pm_qos_update_requirement);
+EXPORT_SYMBOL_GPL(pm_qos_update_request);
 
 /**
- * pm_qos_remove_requirement - modifies an existing qos request
- * @pm_qos_class: identifies which list of qos request to us
- * @name: identifies the request
+ * pm_qos_remove_request - removes an existing qos request
+ * @pm_qos_req: handle to request list element
  *
- * Will remove named qos request from pm_qos_class list of parameters and
- * recompute the current target value for the pm_qos_class.
+ * Will remove pm qos request from the list of requests and
+ * recompute the current target value for the pm_qos_class.  Call this
+ * on slow code paths.
  */
-void pm_qos_remove_requirement(int pm_qos_class, char *name)
+void pm_qos_remove_request(struct pm_qos_request_list *pm_qos_req)
 {
 	unsigned long flags;
-	struct requirement_list *node;
-	int pending_update = 0;
+	int qos_class;
+
+	/* silent return to keep pcm code cleaner */
+	if (pm_qos_req == NULL)
+		return;
 
+	qos_class = pm_qos_req->pm_qos_class;
 	spin_lock_irqsave(&pm_qos_lock, flags);
-	list_for_each_entry(node,
-		&pm_qos_array[pm_qos_class]->requirements.list, list) {
-		if (strcmp(node->name, name) == 0) {
-			kfree(node->name);
-			list_del(&node->list);
-			kfree(node);
-			pending_update = 1;
-			break;
-		}
-	}
+	list_del(&pm_qos_req->list);
+	kfree(pm_qos_req);
 	spin_unlock_irqrestore(&pm_qos_lock, flags);
-	if (pending_update)
-		update_target(pm_qos_class);
+	update_target(qos_class);
 }
-EXPORT_SYMBOL_GPL(pm_qos_remove_requirement);
+EXPORT_SYMBOL_GPL(pm_qos_remove_request);
 
 /**
  * pm_qos_add_notifier - sets notification entry for changes to target value
@@ -313,7 +302,7 @@ EXPORT_SYMBOL_GPL(pm_qos_remove_requirement);
  * will register the notifier into a notification chain that gets called
  * upon changes to the pm_qos_class target value.
  */
- int pm_qos_add_notifier(int pm_qos_class, struct notifier_block *notifier)
+int pm_qos_add_notifier(int pm_qos_class, struct notifier_block *notifier)
 {
 	int retval;
 
@@ -343,21 +332,16 @@ int pm_qos_remove_notifier(int pm_qos_class, struct notifier_block *notifier)
 }
 EXPORT_SYMBOL_GPL(pm_qos_remove_notifier);
 
-#define PID_NAME_LEN 32
-
 static int pm_qos_power_open(struct inode *inode, struct file *filp)
 {
-	int ret;
 	long pm_qos_class;
-	char name[PID_NAME_LEN];
 
 	pm_qos_class = find_pm_qos_object_by_minor(iminor(inode));
 	if (pm_qos_class >= 0) {
-		filp->private_data = (void *)pm_qos_class;
-		snprintf(name, PID_NAME_LEN, "process_%d", current->pid);
-		ret = pm_qos_add_requirement(pm_qos_class, name,
-					PM_QOS_DEFAULT_VALUE);
-		if (ret >= 0)
+		filp->private_data = (void *) pm_qos_add_request(pm_qos_class,
+				PM_QOS_DEFAULT_VALUE);
+
+		if (filp->private_data)
 			return 0;
 	}
 	return -EPERM;
@@ -365,32 +349,40 @@ static int pm_qos_power_open(struct inode *inode, struct file *filp)
 
 static int pm_qos_power_release(struct inode *inode, struct file *filp)
 {
-	int pm_qos_class;
-	char name[PID_NAME_LEN];
+	struct pm_qos_request_list *req;
 
-	pm_qos_class = (long)filp->private_data;
-	snprintf(name, PID_NAME_LEN, "process_%d", current->pid);
-	pm_qos_remove_requirement(pm_qos_class, name);
+	req = (struct pm_qos_request_list *)filp->private_data;
+	pm_qos_remove_request(req);
 
 	return 0;
 }
 
+
 static ssize_t pm_qos_power_write(struct file *filp, const char __user *buf,
 		size_t count, loff_t *f_pos)
 {
 	s32 value;
-	int pm_qos_class;
-	char name[PID_NAME_LEN];
-
-	pm_qos_class = (long)filp->private_data;
-	if (count != sizeof(s32))
+	int x;
+	char ascii_value[11];
+	struct pm_qos_request_list *pm_qos_req;
+
+	if (count == sizeof(s32)) {
+		if (copy_from_user(&value, buf, sizeof(s32)))
+			return -EFAULT;
+	} else if (count == 11) { /* len('0x12345678\0') */
+		if (copy_from_user(ascii_value, buf, 11))
+			return -EFAULT;
+		x = sscanf(ascii_value, "%x", &value);
+		if (x != 1)
+			return -EINVAL;
+		pr_debug("%s, %d, 0x%x\n", ascii_value, x, value);
+	} else
 		return -EINVAL;
-	if (copy_from_user(&value, buf, sizeof(s32)))
-		return -EFAULT;
-	snprintf(name, PID_NAME_LEN, "process_%d", current->pid);
-	pm_qos_update_requirement(pm_qos_class, name, value);
 
-	return  sizeof(s32);
+	pm_qos_req = (struct pm_qos_request_list *)filp->private_data;
+	pm_qos_update_request(pm_qos_req, value);
+
+	return count;
 }
 
 
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 4aefa6d..29de196 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -495,7 +495,7 @@ void ieee80211_recalc_ps(struct ieee80211_local *local, s32 latency)
 		s32 beaconint_us;
 
 		if (latency < 0)
-			latency = pm_qos_requirement(PM_QOS_NETWORK_LATENCY);
+			latency = pm_qos_request(PM_QOS_NETWORK_LATENCY);
 
 		beaconint_us = ieee80211_tu_to_usec(
 					found->vif.bss_conf.beacon_int);
diff --git a/sound/core/pcm.c b/sound/core/pcm.c
index 0d428d0..cbe815d 100644
--- a/sound/core/pcm.c
+++ b/sound/core/pcm.c
@@ -648,9 +648,6 @@ int snd_pcm_new_stream(struct snd_pcm *pcm, int stream, int substream_count)
 		substream->number = idx;
 		substream->stream = stream;
 		sprintf(substream->name, "subdevice #%i", idx);
-		snprintf(substream->latency_id, sizeof(substream->latency_id),
-			 "ALSA-PCM%d-%d%c%d", pcm->card->number, pcm->device,
-			 (stream ? 'c' : 'p'), idx);
 		substream->buffer_bytes_max = UINT_MAX;
 		if (prev == NULL)
 			pstr->substream = substream;
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 8728876..605c86d 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -481,11 +481,13 @@ static int snd_pcm_hw_params(struct snd_pcm_substream *substream,
 	snd_pcm_timer_resolution_change(substream);
 	runtime->status->state = SNDRV_PCM_STATE_SETUP;
 
-	pm_qos_remove_requirement(PM_QOS_CPU_DMA_LATENCY,
-				substream->latency_id);
+	if (substream->latency_pm_qos_req) {
+		pm_qos_remove_request(substream->latency_pm_qos_req);
+		substream->latency_pm_qos_req = NULL;
+	}
 	if ((usecs = period_to_usecs(runtime)) >= 0)
-		pm_qos_add_requirement(PM_QOS_CPU_DMA_LATENCY,
-					substream->latency_id, usecs);
+		substream->latency_pm_qos_req = pm_qos_add_request(
+					PM_QOS_CPU_DMA_LATENCY, usecs);
 	return 0;
  _error:
 	/* hardware might be unuseable from this time,
@@ -540,8 +542,8 @@ static int snd_pcm_hw_free(struct snd_pcm_substream *substream)
 	if (substream->ops->hw_free)
 		result = substream->ops->hw_free(substream);
 	runtime->status->state = SNDRV_PCM_STATE_OPEN;
-	pm_qos_remove_requirement(PM_QOS_CPU_DMA_LATENCY,
-		substream->latency_id);
+	pm_qos_remove_request(substream->latency_pm_qos_req);
+	substream->latency_pm_qos_req = NULL;
 	return result;
 }
 
-- 
1.6.3.3
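
For readers following the pm_qos_power_write() change above, here is a minimal
userspace sketch of the two accepted write formats: a raw 4-byte s32, or an
11-byte ASCII hex string such as "0x12345678" plus its trailing NUL. The helper
name parse_qos_write() and the QOS_ASCII_LEN macro are hypothetical and not
part of the patch; memcpy() stands in for copy_from_user(). Unlike the patch,
the sketch forces NUL termination before calling sscanf() and parses into an
unsigned int, which is what "%x" expects.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed length: strlen("0x12345678") + 1 for the trailing NUL. */
#define QOS_ASCII_LEN 11

/*
 * Hypothetical userspace model of the write path, not kernel code.
 * Returns 0 and stores the parsed value on success, -EINVAL otherwise.
 */
static int parse_qos_write(const char *buf, size_t count, int32_t *value)
{
	char ascii_value[QOS_ASCII_LEN];
	unsigned int parsed;

	/* Binary form: exactly one 32-bit value. */
	if (count == sizeof(int32_t)) {
		memcpy(value, buf, sizeof(int32_t));
		return 0;
	}

	/* Any other length than the ASCII hex form is rejected. */
	if (count != QOS_ASCII_LEN)
		return -EINVAL;

	memcpy(ascii_value, buf, QOS_ASCII_LEN);
	/* Do not trust the writer to have supplied the terminator. */
	ascii_value[QOS_ASCII_LEN - 1] = '\0';

	if (sscanf(ascii_value, "%x", &parsed) != 1)
		return -EINVAL;

	*value = (int32_t)parsed;
	return 0;
}
```

Passing the string literal "0x00000064" with a count of 11 stores 100 in
*value, and a 3-byte write is rejected with -EINVAL.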

^ permalink raw reply related
