Netdev List


* Re: SPLICE_F_NONBLOCK semantics...
From: Jens Axboe @ 2009-10-02  7:47 UTC (permalink / raw)
  To: David Miller
  Cc: torvalds, eric.dumazet, jgunthorpe, vl, opurdila, netdev,
	linux-kernel
In-Reply-To: <20091001.152717.187318570.davem@davemloft.net>

On Thu, Oct 01 2009, David Miller wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Thu, 1 Oct 2009 15:21:44 -0700 (PDT)
> 
> > On Thu, 1 Oct 2009, David Miller wrote:
> >> 
> >> It depends upon our interpretation of how you intended the
> >> SPLICE_F_NONBLOCK flag to work when you added it way back
> >> when.
> >> 
> >> Linus introduced SPLICE_F_NONBLOCK in commit 29e350944fdc2dfca102500790d8ad6d6ff4f69d
> >> (splice: add SPLICE_F_NONBLOCK flag)
> >> 
> >>   It doesn't make the splice itself necessarily nonblocking (because the
> >>   actual file descriptors that are spliced from/to may block unless they
> >>   have the O_NONBLOCK flag set), but it makes the splice pipe operations
> >>   nonblocking.
> >> 
> >> Linus' intention was clear: let SPLICE_F_NONBLOCK control the splice pipe mode only
> > 
> > Ack. The original intent was for the flag to affect the buffering, not the 
> > end points.
> 
> Great, thanks for reviewing.
> 
> > Although the more I think about it, the more I suspect that the
> > whole NONBLOCK thing should probably have been two bits, and simply
> > been about "nonblocking input" vs "nonblocking output" (so that you
> > could control both sides on a call-by-call basis).
> 
> I think we could still extend things in this way if we wanted to.
> So if you specify the explicit input and/or output nonblock flag,
> it takes precedence over the SPLICE_F_NONBLOCK thing.

Yes I agree, thank god for having a 'flags' parameter for the syscalls
:-). I'll make a note to add and test bidirectional nonblock hints.

The net patch looks fine and correct to me, feel free to add my acked-by
if you want.

-- 
Jens Axboe
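
For readers following the thread, here is a minimal userspace sketch of the semantics discussed above: SPLICE_F_NONBLOCK governs the pipe side of the transfer, while a non-pipe end point may still block unless it has O_NONBLOCK set. It assumes Linux with the glibc splice() wrapper; the helper name splice_nonblock() is made up for illustration and is not part of any patch in this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Move up to len bytes from in_fd to out_fd; at least one of the two
 * must be a pipe.  SPLICE_F_NONBLOCK asks the kernel not to block on
 * the pipe operations.  A non-pipe end point can still block unless
 * it was opened with (or later switched to) O_NONBLOCK.
 *
 * Returns the number of bytes moved, 0 at end of input, or -1 with
 * errno set (EAGAIN when the pipe operation would have blocked).
 */
static ssize_t splice_nonblock(int in_fd, int out_fd, size_t len)
{
	assert(len > 0);
	return splice(in_fd, NULL, out_fd, NULL, len, SPLICE_F_NONBLOCK);
}
```

A caller would typically treat -1 with errno == EAGAIN as "try again after poll()" rather than as a hard failure.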



* Re: [PATCH 03/31] mm: expose gfp_to_alloc_flags()
From: Suresh Jayaraman @ 2009-10-02  8:11 UTC (permalink / raw)
  To: David Rientjes
  Cc: Linus Torvalds, Andrew Morton, linux-kernel, linux-mm, netdev,
	Neil Brown, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust
In-Reply-To: <alpine.DEB.1.00.0910011355230.32006@chino.kir.corp.google.com>

David Rientjes wrote:
> On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
> 
>> From: Peter Zijlstra <a.p.zijlstra@chello.nl> 
>>
>> Expose the gfp to alloc_flags mapping, so we can use it in other parts
>> of the vm.
>>
>> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
> 
> Nack, these flags are internal to the page allocator and exporting them to 
> generic VM code is unnecessary.

Yes, you're right.

> The only bit you actually use in your patchset is ALLOC_NO_WATERMARKS to 
> determine whether a particular allocation can use memory reserves.  I'd 
> suggest adding a bool function that returns whether the current context is 
> given access to reserves including your new __GFP_MEMALLOC flag and 
> exporting that instead.

Makes sense, and Neil has already posted a patch with the suggested
changes; I will incorporate it.

Thanks,

-- 
Suresh Jayaraman

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: Network hangs with 2.6.30.5
From: Ilpo Järvinen @ 2009-10-02  8:11 UTC (permalink / raw)
  To: David Miller
  Cc: jarkao2, holger.hoffstaette, Netdev, eric.dumazet,
	Evgeniy Polyakov
In-Reply-To: <20091001.154913.88345178.davem@davemloft.net>

On Thu, 1 Oct 2009, David Miller wrote:

> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Mon, 7 Sep 2009 07:21:43 +0000
> 
> > While Eric is analyzing your data, I guess you could try reverting
> > some stuff around this tcp_tw_recycle, and my tcp ignorance would
> > point these commits for the beginning:
> > 
> > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.30.y.git;a=commitdiff;h=fc1ad92dfc4e363a055053746552cdb445ba5c57
> > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.30.y.git;a=commitdiff;h=c887e6d2d9aee56ee7c9f2af4cec3a5efdcc4c72
> 
> Ilpo's cleanup (the second commit listed) looks most likely to
> be a possibility.
> 
> But I surely cannot find any bugs in it, even after studying it
> a few times.
> 
> Ilpo could you audit it one more time for us just in case?

Argh, not that one ...the jungle of negations. I'll try to go through it
once more, but I tell you I did go through those negations multiple
times already before submitting it :-).

> I also looked through all the TCP commits in 2.6.29 to 2.6.30
> and I could not find anything else that might cause stalls with
> time-wait recycled connections.

What about the "more than 64k connections" change a9d8f9110d7e953c2f2 (or
its fixes)? It might be another possibility... It certainly does
something related to reuse and happens to be in the correct time frame...
(I've added Evgeniy).

-- 
 i.


* Re: [PATCH 00/31] Swap over NFS -v20
From: Suresh Jayaraman @ 2009-10-02  8:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Andrew Morton, linux-kernel, linux-mm, netdev,
	Neil Brown, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust
In-Reply-To: <20091001174201.GA30068@infradead.org>

Christoph Hellwig wrote:
> On Thu, Oct 01, 2009 at 07:34:18PM +0530, Suresh Jayaraman wrote:
> 
> The other really big one is adding a proper method for safe, page-backed
> kernelspace I/O on files.  That is not something like the grotty
> swap-tied address_space operations in this patch, but more something in

I'm not sure I understand what problems you see with the proposed
address_space operations. Could you please elaborate a bit more?

> the direction of the kernel direct I/O patches that Jens Axboe did
> for use in the loop driver.  But even those aren't complete as they
> don't touch the locking issue yet.
> 

Thanks,

-- 
Suresh Jayaraman


* Re: [PATCH] ipvs: Add boundary check on ioctl arguments
From: Julian Anastasov @ 2009-10-02  8:35 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Hannes Eder, Wensong Zhang, netdev, linux-kernel, Simon Horman
In-Reply-To: <20090930171833.5ce0011d@infradead.org>


	Hello,

On Wed, 30 Sep 2009, Arjan van de Ven wrote:

> fair enough; updated patch below

	OK, you can add my signed-off line after changing
'cmd > ...MAX + 1' to 'cmd > ...MAX' in both
places; nf_sockopt_ops ranges are [optmin ... optmax).

Maybe the comments should be changed, because:

- I'm not the author, but after inspection we do not see any holes,
and we do not want users to upgrade just for this change
- the cmd checks are just to help code checking tools
- the len checks should help programmers (maybe BUG_ON is
better; the user does not deserve EINVAL for a wrong set_arglen/get_arglen).
Checks for *len and len are not needed.

	For example, for len checks this should be enough, before
copy_from_user():

in do_ip_vs_get_ctl check can be
	BUG_ON(get_arglen[GET_CMDID(cmd)] > sizeof(arg));

in do_ip_vs_set_ctl check can be
	BUG_ON(set_arglen[SET_CMDID(cmd)] > sizeof(arg));

Acked-by: Julian Anastasov <ja@ssi.bg>

> From 28ae217858e683c0c94c02219d46a9a9c87f61c6 Mon Sep 17 00:00:00 2001
> From: Arjan van de Ven <arjan@linux.intel.com>
> Date: Wed, 30 Sep 2009 13:05:51 +0200
> Subject: [PATCH] ipvs: Add boundary check on ioctl arguments
> 
> The ipvs code has a nifty system for doing the size of ioctl command copies;
> it defines an array with values into which it indexes the cmd to find the
> right length.
> 
> Unfortunately, the ipvs code forgot to check if the cmd was in the range
> that the array provides, allowing for an index outside of the array,
> which then gives a "garbage" result into the length, which then gets
> used for copying into a stack buffer.
> 
> Fix this by adding sanity checks on these as well as the copy size.
> 
> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
> ---
>  net/netfilter/ipvs/ip_vs_ctl.c |   14 +++++++++++++-
>  1 files changed, 13 insertions(+), 1 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
> index ac624e5..7adc876 100644
> --- a/net/netfilter/ipvs/ip_vs_ctl.c
> +++ b/net/netfilter/ipvs/ip_vs_ctl.c
> @@ -2077,6 +2077,10 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
>  	if (!capable(CAP_NET_ADMIN))
>  		return -EPERM;
>  
> +	if (cmd < IP_VS_BASE_CTL || cmd > IP_VS_SO_SET_MAX + 1)
> +		return -EINVAL;
> +	if (len < 0 || len >  sizeof(arg))
> +		return -EINVAL;
>  	if (len != set_arglen[SET_CMDID(cmd)]) {
>  		pr_err("set_ctl: len %u != %u\n",
>  		       len, set_arglen[SET_CMDID(cmd)]);
> @@ -2353,17 +2357,25 @@ do_ip_vs_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
>  {
>  	unsigned char arg[128];
>  	int ret = 0;
> +	unsigned int copylen;
>  
>  	if (!capable(CAP_NET_ADMIN))
>  		return -EPERM;
>  
> +	if (cmd < IP_VS_BASE_CTL || cmd > IP_VS_SO_GET_MAX + 1)
> +		return -EINVAL;
> +
>  	if (*len < get_arglen[GET_CMDID(cmd)]) {
>  		pr_err("get_ctl: len %u < %u\n",
>  		       *len, get_arglen[GET_CMDID(cmd)]);
>  		return -EINVAL;
>  	}
>  
> -	if (copy_from_user(arg, user, get_arglen[GET_CMDID(cmd)]) != 0)
> +	copylen = get_arglen[GET_CMDID(cmd)];
> +	if (copylen > sizeof(arg))
> +		return -EINVAL;
> +
> +	if (copy_from_user(arg, user, copylen) != 0)
>  		return -EFAULT;
>  
>  	if (mutex_lock_interruptible(&__ip_vs_mutex))

Regards

--
Julian Anastasov <ja@ssi.bg>
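
The off-by-one point in the review above can be sketched in a few lines of userspace C. Everything here is hypothetical (the DEMO_* constants, the table contents, the helper name); it only illustrates that when MAX is the last valid command, the range test must reject 'cmd > MAX', because the half-open sockopt range [optmin, optmax) already ends at MAX + 1. The assert() stands in for the suggested BUG_ON(): a table entry larger than the buffer is a programmer error, not a user error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command numbering, for illustration only. */
enum {
	DEMO_BASE_CTL   = 64,                 /* first valid command (optmin) */
	DEMO_SO_SET_MAX = DEMO_BASE_CTL + 3,  /* last valid command */
};

#define DEMO_CMDID(cmd)	((cmd) - DEMO_BASE_CTL)
#define DEMO_ARG_BUF	128	/* size of the on-stack argument buffer */

/* One expected argument length per valid command. */
static const unsigned char demo_arglen[DEMO_CMDID(DEMO_SO_SET_MAX) + 1] = {
	0, 16, 16, 32,
};

/*
 * Return the expected argument length for cmd, or -1 if cmd lies
 * outside [DEMO_BASE_CTL, DEMO_SO_SET_MAX].  Testing against
 * DEMO_SO_SET_MAX + 1 instead would let one index past the end of
 * the table through.
 */
static int demo_arglen_for(int cmd)
{
	if (cmd < DEMO_BASE_CTL || cmd > DEMO_SO_SET_MAX)
		return -1;

	/* Programmer error if a table entry exceeds the buffer. */
	assert(demo_arglen[DEMO_CMDID(cmd)] <= DEMO_ARG_BUF);

	return demo_arglen[DEMO_CMDID(cmd)];
}
```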


* [RFC] netlink: add socket destruction notification
From: Johannes Berg @ 2009-10-02  8:44 UTC (permalink / raw)
  To: netdev; +Cc: Jouni Malinen, Thomas Graf

When we want to keep track of resources associated with applications, we
need to know when an app is going away. Add a notification function to
netlink that tells us that, and also hook it up to generic netlink so
generic netlink can notify the families. Due to the way generic netlink
works though, we need to notify all families and they have to sort out
whatever resources some commands associated with the socket themselves.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
---
 drivers/connector/connector.c       |    2 +-
 drivers/scsi/scsi_netlink.c         |    2 +-
 drivers/scsi/scsi_transport_iscsi.c |    2 +-
 include/linux/netlink.h             |    1 +
 include/net/genetlink.h             |    3 +++
 kernel/audit.c                      |    3 ++-
 lib/kobject_uevent.c                |    2 +-
 net/bridge/netfilter/ebt_ulog.c     |    2 +-
 net/core/rtnetlink.c                |    3 ++-
 net/decnet/netfilter/dn_rtmsg.c     |    2 +-
 net/ipv4/fib_frontend.c             |    2 +-
 net/ipv4/inet_diag.c                |    2 +-
 net/ipv4/netfilter/ip_queue.c       |    2 +-
 net/ipv4/netfilter/ipt_ULOG.c       |    6 +++---
 net/ipv6/netfilter/ip6_queue.c      |    2 +-
 net/netfilter/nfnetlink.c           |    2 +-
 net/netlink/af_netlink.c            |    6 ++++++
 net/netlink/genetlink.c             |   18 ++++++++++++++++--
 net/xfrm/xfrm_user.c                |    2 +-
 security/selinux/netlink.c          |    3 ++-
 20 files changed, 47 insertions(+), 20 deletions(-)

--- wireless-testing.orig/net/xfrm/xfrm_user.c	2009-09-23 10:10:41.000000000 +0200
+++ wireless-testing/net/xfrm/xfrm_user.c	2009-09-29 14:45:33.000000000 +0200
@@ -2605,7 +2605,7 @@ static int __net_init xfrm_user_net_init
 	struct sock *nlsk;
 
 	nlsk = netlink_kernel_create(net, NETLINK_XFRM, XFRMNLGRP_MAX,
-				     xfrm_netlink_rcv, NULL, THIS_MODULE);
+				     xfrm_netlink_rcv, NULL, NULL, THIS_MODULE);
 	if (nlsk == NULL)
 		return -ENOMEM;
 	rcu_assign_pointer(net->xfrm.nlsk, nlsk);
--- wireless-testing.orig/drivers/connector/connector.c	2009-09-29 12:26:17.000000000 +0200
+++ wireless-testing/drivers/connector/connector.c	2009-09-29 14:45:33.000000000 +0200
@@ -451,7 +451,7 @@ static int __devinit cn_init(void)
 
 	dev->nls = netlink_kernel_create(&init_net, NETLINK_CONNECTOR,
 					 CN_NETLINK_USERS + 0xf,
-					 dev->input, NULL, THIS_MODULE);
+					 dev->input, NULL, NULL, THIS_MODULE);
 	if (!dev->nls)
 		return -EIO;
 
--- wireless-testing.orig/drivers/scsi/scsi_netlink.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/drivers/scsi/scsi_netlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -496,7 +496,7 @@ scsi_netlink_init(void)
 
 	scsi_nl_sock = netlink_kernel_create(&init_net, NETLINK_SCSITRANSPORT,
 				SCSI_NL_GRP_CNT, scsi_nl_rcv_msg, NULL,
-				THIS_MODULE);
+				NULL, THIS_MODULE);
 	if (!scsi_nl_sock) {
 		printk(KERN_ERR "%s: register of recieve handler failed\n",
 				__func__);
--- wireless-testing.orig/drivers/scsi/scsi_transport_iscsi.c	2009-09-29 12:26:46.000000000 +0200
+++ wireless-testing/drivers/scsi/scsi_transport_iscsi.c	2009-09-29 14:45:33.000000000 +0200
@@ -2082,7 +2082,7 @@ static __init int iscsi_transport_init(v
 		goto unregister_conn_class;
 
 	nls = netlink_kernel_create(&init_net, NETLINK_ISCSI, 1, iscsi_if_rx,
-				    NULL, THIS_MODULE);
+				    NULL, NULL, THIS_MODULE);
 	if (!nls) {
 		err = -ENOBUFS;
 		goto unregister_session_class;
--- wireless-testing.orig/kernel/audit.c	2009-09-29 12:27:01.000000000 +0200
+++ wireless-testing/kernel/audit.c	2009-09-29 14:45:33.000000000 +0200
@@ -970,7 +970,8 @@ static int __init audit_init(void)
 	printk(KERN_INFO "audit: initializing netlink socket (%s)\n",
 	       audit_default ? "enabled" : "disabled");
 	audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, 0,
-					   audit_receive, NULL, THIS_MODULE);
+					   audit_receive, NULL, NULL,
+					   THIS_MODULE);
 	if (!audit_sock)
 		audit_panic("cannot initialize netlink socket");
 	else
--- wireless-testing.orig/lib/kobject_uevent.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/lib/kobject_uevent.c	2009-09-29 14:45:33.000000000 +0200
@@ -322,7 +322,7 @@ EXPORT_SYMBOL_GPL(add_uevent_var);
 static int __init kobject_uevent_init(void)
 {
 	uevent_sock = netlink_kernel_create(&init_net, NETLINK_KOBJECT_UEVENT,
-					    1, NULL, NULL, THIS_MODULE);
+					    1, NULL, NULL, NULL, THIS_MODULE);
 	if (!uevent_sock) {
 		printk(KERN_ERR
 		       "kobject_uevent: unable to create netlink socket!\n");
--- wireless-testing.orig/net/bridge/netfilter/ebt_ulog.c	2009-09-29 12:27:03.000000000 +0200
+++ wireless-testing/net/bridge/netfilter/ebt_ulog.c	2009-09-29 14:45:33.000000000 +0200
@@ -304,7 +304,7 @@ static int __init ebt_ulog_init(void)
 
 	ebtulognl = netlink_kernel_create(&init_net, NETLINK_NFLOG,
 					  EBT_ULOG_MAXNLGROUPS, NULL, NULL,
-					  THIS_MODULE);
+					  NULL, THIS_MODULE);
 	if (!ebtulognl) {
 		printk(KERN_WARNING KBUILD_MODNAME ": out of memory trying to "
 		       "call netlink_kernel_create\n");
--- wireless-testing.orig/net/core/rtnetlink.c	2009-09-29 12:27:04.000000000 +0200
+++ wireless-testing/net/core/rtnetlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -1360,7 +1360,8 @@ static int rtnetlink_net_init(struct net
 {
 	struct sock *sk;
 	sk = netlink_kernel_create(net, NETLINK_ROUTE, RTNLGRP_MAX,
-				   rtnetlink_rcv, &rtnl_mutex, THIS_MODULE);
+				   rtnetlink_rcv, NULL,
+				   &rtnl_mutex, THIS_MODULE);
 	if (!sk)
 		return -ENOMEM;
 	net->rtnl = sk;
--- wireless-testing.orig/net/decnet/netfilter/dn_rtmsg.c	2009-09-23 10:10:41.000000000 +0200
+++ wireless-testing/net/decnet/netfilter/dn_rtmsg.c	2009-09-29 14:45:33.000000000 +0200
@@ -128,7 +128,7 @@ static int __init dn_rtmsg_init(void)
 
 	dnrmg = netlink_kernel_create(&init_net,
 				      NETLINK_DNRTMSG, DNRNG_NLGRP_MAX,
-				      dnrmg_receive_user_skb,
+				      dnrmg_receive_user_skb, NULL,
 				      NULL, THIS_MODULE);
 	if (dnrmg == NULL) {
 		printk(KERN_ERR "dn_rtmsg: Cannot create netlink socket");
--- wireless-testing.orig/net/ipv4/fib_frontend.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/net/ipv4/fib_frontend.c	2009-09-29 14:45:33.000000000 +0200
@@ -879,7 +879,7 @@ static int nl_fib_lookup_init(struct net
 {
 	struct sock *sk;
 	sk = netlink_kernel_create(net, NETLINK_FIB_LOOKUP, 0,
-				   nl_fib_input, NULL, THIS_MODULE);
+				   nl_fib_input, NULL, NULL, THIS_MODULE);
 	if (sk == NULL)
 		return -EAFNOSUPPORT;
 	net->ipv4.fibnl = sk;
--- wireless-testing.orig/net/ipv4/inet_diag.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/net/ipv4/inet_diag.c	2009-09-29 14:45:33.000000000 +0200
@@ -924,7 +924,7 @@ static int __init inet_diag_init(void)
 		goto out;
 
 	idiagnl = netlink_kernel_create(&init_net, NETLINK_INET_DIAG, 0,
-					inet_diag_rcv, NULL, THIS_MODULE);
+					inet_diag_rcv, NULL, NULL, THIS_MODULE);
 	if (idiagnl == NULL)
 		goto out_free_table;
 	err = 0;
--- wireless-testing.orig/net/ipv4/netfilter/ip_queue.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/net/ipv4/netfilter/ip_queue.c	2009-09-29 14:45:33.000000000 +0200
@@ -578,7 +578,7 @@ static int __init ip_queue_init(void)
 
 	netlink_register_notifier(&ipq_nl_notifier);
 	ipqnl = netlink_kernel_create(&init_net, NETLINK_FIREWALL, 0,
-				      ipq_rcv_skb, NULL, THIS_MODULE);
+				      ipq_rcv_skb, NULL, NULL, THIS_MODULE);
 	if (ipqnl == NULL) {
 		printk(KERN_ERR "ip_queue: failed to create netlink socket\n");
 		goto cleanup_netlink_notifier;
--- wireless-testing.orig/net/ipv4/netfilter/ipt_ULOG.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/net/ipv4/netfilter/ipt_ULOG.c	2009-09-29 14:45:33.000000000 +0200
@@ -400,9 +400,9 @@ static int __init ulog_tg_init(void)
 	for (i = 0; i < ULOG_MAXNLGROUPS; i++)
 		setup_timer(&ulog_buffers[i].timer, ulog_timer, i);
 
-	nflognl = netlink_kernel_create(&init_net,
-					NETLINK_NFLOG, ULOG_MAXNLGROUPS, NULL,
-					NULL, THIS_MODULE);
+	nflognl = netlink_kernel_create(&init_net, NETLINK_NFLOG,
+					ULOG_MAXNLGROUPS, NULL,
+					NULL, NULL, THIS_MODULE);
 	if (!nflognl)
 		return -ENOMEM;
 
--- wireless-testing.orig/net/ipv6/netfilter/ip6_queue.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/net/ipv6/netfilter/ip6_queue.c	2009-09-29 14:45:33.000000000 +0200
@@ -580,7 +580,7 @@ static int __init ip6_queue_init(void)
 
 	netlink_register_notifier(&ipq_nl_notifier);
 	ipqnl = netlink_kernel_create(&init_net, NETLINK_IP6_FW, 0,
-			              ipq_rcv_skb, NULL, THIS_MODULE);
+				      ipq_rcv_skb, NULL, NULL, THIS_MODULE);
 	if (ipqnl == NULL) {
 		printk(KERN_ERR "ip6_queue: failed to create netlink socket\n");
 		goto cleanup_netlink_notifier;
--- wireless-testing.orig/net/netfilter/nfnetlink.c	2009-09-29 12:27:12.000000000 +0200
+++ wireless-testing/net/netfilter/nfnetlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -196,7 +196,7 @@ static int __init nfnetlink_init(void)
 	printk("Netfilter messages via NETLINK v%s.\n", nfversion);
 
 	nfnl = netlink_kernel_create(&init_net, NETLINK_NETFILTER, NFNLGRP_MAX,
-				     nfnetlink_rcv, NULL, THIS_MODULE);
+				     nfnetlink_rcv, NULL, NULL, THIS_MODULE);
 	if (!nfnl) {
 		printk(KERN_ERR "cannot initialize nfnetlink!\n");
 		return -ENOMEM;
--- wireless-testing.orig/net/netlink/genetlink.c	2009-09-29 12:27:12.000000000 +0200
+++ wireless-testing/net/netlink/genetlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -561,6 +561,20 @@ static void genl_rcv(struct sk_buff *skb
 	genl_unlock();
 }
 
+static void genl_destruct(struct sock *sk)
+{
+	struct genl_family *f;
+	int idx;
+
+	genl_lock();
+
+	for (idx = 0; idx < GENL_FAM_TAB_SIZE; idx++)
+		list_for_each_entry(f, &family_ht[idx], family_list)
+			if (f->destruct_sk)
+				f->destruct_sk(sk);
+	genl_unlock();
+}
+
 /**************************************************************************
  * Controller
  **************************************************************************/
@@ -852,8 +866,8 @@ static int __net_init genl_pernet_init(s
 {
 	/* we'll bump the group number right afterwards */
 	net->genl_sock = netlink_kernel_create(net, NETLINK_GENERIC, 0,
-					       genl_rcv, &genl_mutex,
-					       THIS_MODULE);
+					       genl_rcv, genl_destruct,
+					       &genl_mutex, THIS_MODULE);
 
 	if (!net->genl_sock && net_eq(net, &init_net))
 		panic("GENL: Cannot initialize generic netlink\n");
--- wireless-testing.orig/security/selinux/netlink.c	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/security/selinux/netlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -106,7 +106,8 @@ void selnl_notify_policyload(u32 seqno)
 static int __init selnl_init(void)
 {
 	selnl = netlink_kernel_create(&init_net, NETLINK_SELINUX,
-				      SELNLGRP_MAX, NULL, NULL, THIS_MODULE);
+				      SELNLGRP_MAX, NULL, NULL, NULL,
+				      THIS_MODULE);
 	if (selnl == NULL)
 		panic("SELinux:  Cannot create netlink socket.");
 	netlink_set_nonroot(NETLINK_SELINUX, NL_NONROOT_RECV);
--- wireless-testing.orig/include/linux/netlink.h	2009-09-29 12:26:58.000000000 +0200
+++ wireless-testing/include/linux/netlink.h	2009-09-29 14:45:33.000000000 +0200
@@ -182,6 +182,7 @@ extern void netlink_table_ungrab(void);
 extern struct sock *netlink_kernel_create(struct net *net,
 					  int unit,unsigned int groups,
 					  void (*input)(struct sk_buff *skb),
+					  void (*destruct)(struct sock *sk),
 					  struct mutex *cb_mutex,
 					  struct module *module);
 extern void netlink_kernel_release(struct sock *sk);
--- wireless-testing.orig/include/net/genetlink.h	2009-09-23 10:10:42.000000000 +0200
+++ wireless-testing/include/net/genetlink.h	2009-09-29 14:45:33.000000000 +0200
@@ -30,6 +30,8 @@ struct genl_multicast_group
  * @maxattr: maximum number of attributes supported
  * @netnsok: set to true if the family can handle network
  *	namespaces and should be presented in all of them
+ * @destruct_sk: called when any generic netlink socket
+ *	is destroyed (e.g. by the application closing it)
  * @attrbuf: buffer to store parsed attributes
  * @ops_list: list of all assigned operations
  * @family_list: family list
@@ -43,6 +45,7 @@ struct genl_family
 	unsigned int		version;
 	unsigned int		maxattr;
 	bool			netnsok;
+	void			(*destruct_sk)(struct sock *sk);
 	struct nlattr **	attrbuf;	/* private */
 	struct list_head	ops_list;	/* private */
 	struct list_head	family_list;	/* private */
--- wireless-testing.orig/net/netlink/af_netlink.c	2009-09-29 12:27:12.000000000 +0200
+++ wireless-testing/net/netlink/af_netlink.c	2009-09-29 14:45:33.000000000 +0200
@@ -80,6 +80,7 @@ struct netlink_sock {
 	struct mutex		*cb_mutex;
 	struct mutex		cb_def_mutex;
 	void			(*netlink_rcv)(struct sk_buff *skb);
+	void			(*destruct)(struct sock *sk);
 	struct module		*module;
 };
 
@@ -166,6 +167,9 @@ static void netlink_sock_destruct(struct
 		return;
 	}
 
+	if (nlk->destruct)
+		nlk->destruct(sk);
+
 	WARN_ON(atomic_read(&sk->sk_rmem_alloc));
 	WARN_ON(atomic_read(&sk->sk_wmem_alloc));
 	WARN_ON(nlk_sk(sk)->groups);
@@ -1464,6 +1468,7 @@ static void netlink_data_ready(struct so
 struct sock *
 netlink_kernel_create(struct net *net, int unit, unsigned int groups,
 		      void (*input)(struct sk_buff *skb),
+		      void (*destruct)(struct sock *sk),
 		      struct mutex *cb_mutex, struct module *module)
 {
 	struct socket *sock;
@@ -1502,6 +1507,7 @@ netlink_kernel_create(struct net *net, i
 	sk->sk_data_ready = netlink_data_ready;
 	if (input)
 		nlk_sk(sk)->netlink_rcv = input;
+	nlk_sk(sk)->destruct = destruct;
 
 	if (netlink_insert(sk, net, 0))
 		goto out_sock_release;
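
The notification scheme in this RFC can be modelled in plain userspace C. This is only a sketch under stated assumptions: the demo_* types and functions are stand-ins invented for illustration, there is no locking, and a simple linked list replaces the family hash table. It shows the point made in the description: every family with a destruct callback is told about every socket that goes away, and each family has to work out for itself whether that socket owned any of its resources.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel objects; all names are hypothetical. */
struct demo_sock {
	int id;
};

struct demo_family {
	const char *name;
	void (*destruct_sk)(struct demo_sock *sk);	/* optional callback */
	struct demo_family *next;
};

static struct demo_family *demo_families;

static void demo_register_family(struct demo_family *f)
{
	assert(f != NULL);
	f->next = demo_families;
	demo_families = f;
}

/*
 * Called when a socket is destroyed: notify every family that has a
 * callback.  Returns the number of families notified.
 */
static int demo_sock_destruct(struct demo_sock *sk)
{
	struct demo_family *f;
	int notified = 0;

	for (f = demo_families; f; f = f->next) {
		if (f->destruct_sk) {
			f->destruct_sk(sk);
			notified++;
		}
	}
	return notified;
}

/* Example family state: id of the socket owning a resource, or -1. */
static int demo_owner_id = -1;

/* The family itself decides whether the dying socket matters to it. */
static void demo_release_owner(struct demo_sock *sk)
{
	if (sk->id == demo_owner_id)
		demo_owner_id = -1;
}
```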




* Re: [PATCH 03/31] mm: expose gfp_to_alloc_flags()
From: David Rientjes @ 2009-10-02  9:30 UTC (permalink / raw)
  To: Neil Brown
  Cc: Suresh Jayaraman, Linus Torvalds, Andrew Morton, linux-kernel,
	linux-mm, netdev, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust
In-Reply-To: <19141.35274.513790.845711@notabene.brown>

On Fri, 2 Oct 2009, Neil Brown wrote:

> So something like this?
> Then change every occurrence of
> +		if (!(gfp_to_alloc_flags(gfpflags) & ALLOC_NO_WATERMARKS))
> to
> +		if (!(gfp_has_no_watermarks(gfpflags)))
> 
> ??
> 

No, it's not even necessary to call gfp_to_alloc_flags() at all, just 
create a globally exported function such as can_alloc_use_reserves() and 
use it in gfp_to_alloc_flags().

 [ Using 'p' in gfp_to_alloc_flags() is actually wrong since
   test_thread_flag() only works on current anyway, so it would be
   inconsistent if p were set to anything other than current; we can
   get rid of that auto variable. ]

Something like the following, which you can fold into this patch proposal 
and modify later for GFP_MEMALLOC.

Signed-off-by: David Rientjes <rientjes@google.com>
---
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 557bdad..7dd62a0 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -265,6 +265,8 @@ static inline void arch_free_page(struct page *page, int order) { }
 static inline void arch_alloc_page(struct page *page, int order) { }
 #endif
 
+int can_alloc_use_reserves(void);
+
 struct page *
 __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
 		       struct zonelist *zonelist, nodemask_t *nodemask);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bf72055..cf1d765 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1744,10 +1744,19 @@ void wake_all_kswapd(unsigned int order, struct zonelist *zonelist,
 		wakeup_kswapd(zone, order);
 }
 
+/*
+ * Does the current context allow the allocation to utilize memory reserves
+ * by ignoring watermarks for all zones?
+ */
+int can_alloc_use_reserves(void)
+{
+	return !in_interrupt() && ((current->flags & PF_MEMALLOC) ||
+				   unlikely(test_thread_flag(TIF_MEMDIE)));
+}
+
 static inline int
 gfp_to_alloc_flags(gfp_t gfp_mask)
 {
-	struct task_struct *p = current;
 	int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
 	const gfp_t wait = gfp_mask & __GFP_WAIT;
 
@@ -1769,15 +1778,12 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 		 * See also cpuset_zone_allowed() comment in kernel/cpuset.c.
 		 */
 		alloc_flags &= ~ALLOC_CPUSET;
-	} else if (unlikely(rt_task(p)))
+	} else if (unlikely(rt_task(current)))
 		alloc_flags |= ALLOC_HARDER;
 
-	if (likely(!(gfp_mask & __GFP_NOMEMALLOC))) {
-		if (!in_interrupt() &&
-		    ((p->flags & PF_MEMALLOC) ||
-		     unlikely(test_thread_flag(TIF_MEMDIE))))
+	if (likely(!(gfp_mask & __GFP_NOMEMALLOC)))
+		if (can_alloc_use_reserves())
 			alloc_flags |= ALLOC_NO_WATERMARKS;
-	}
 
 	return alloc_flags;
 }


* [PATCH] net: Fix wrong sizeof
From: Jean Delvare @ 2009-10-02  9:30 UTC (permalink / raw)
  To: LKML, netdev; +Cc: linux-doc, Randy Dunlap, stable

Which is why I have always preferred sizeof(struct foo) over
sizeof(var).

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
---
Stable team, the non-documentation part of this fix applies to 2.6.31,
2.6.30 and 2.6.27.

 Documentation/networking/timestamping/timestamping.c |    2 +-
 drivers/net/iseries_veth.c                           |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.32-rc1.orig/Documentation/networking/timestamping/timestamping.c	2009-06-10 05:05:27.000000000 +0200
+++ linux-2.6.32-rc1/Documentation/networking/timestamping/timestamping.c	2009-10-02 11:07:19.000000000 +0200
@@ -381,7 +381,7 @@ int main(int argc, char **argv)
 	memset(&hwtstamp, 0, sizeof(hwtstamp));
 	strncpy(hwtstamp.ifr_name, interface, sizeof(hwtstamp.ifr_name));
 	hwtstamp.ifr_data = (void *)&hwconfig;
-	memset(&hwconfig, 0, sizeof(&hwconfig));
+	memset(&hwconfig, 0, sizeof(hwconfig));
 	hwconfig.tx_type =
 		(so_timestamping_flags & SOF_TIMESTAMPING_TX_HARDWARE) ?
 		HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF;
--- linux-2.6.32-rc1.orig/drivers/net/iseries_veth.c	2009-09-28 10:28:42.000000000 +0200
+++ linux-2.6.32-rc1/drivers/net/iseries_veth.c	2009-10-02 11:07:15.000000000 +0200
@@ -495,7 +495,7 @@ static void veth_take_cap_ack(struct vet
 			   cnx->remote_lp);
 	} else {
 		memcpy(&cnx->cap_ack_event, event,
-		       sizeof(&cnx->cap_ack_event));
+		       sizeof(cnx->cap_ack_event));
 		cnx->state |= VETH_STATE_GOTCAPACK;
 		veth_kick_statemachine(cnx);
 	}


-- 
Jean Delvare
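
The bug class fixed by this patch is easy to reproduce in userspace. The sketch below uses a made-up struct demo_config and helper; it is not taken from either of the patched files. Passing sizeof(&cfg) as the length clears only as many bytes as a pointer occupies, while sizeof(cfg) covers the whole object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical configuration struct, clearly larger than a pointer. */
struct demo_config {
	int tx_type;
	int rx_filter;
	char reserved[56];
};

/*
 * Fill *cfg with 0xff, clear its first n bytes, and return how many
 * bytes of the object ended up zero.
 */
static size_t demo_clear_and_count(struct demo_config *cfg, size_t n)
{
	const unsigned char *p = (const unsigned char *)cfg;
	size_t i, zeroed = 0;

	assert(n <= sizeof(*cfg));

	memset(cfg, 0xff, sizeof(*cfg));
	memset(cfg, 0, n);

	for (i = 0; i < sizeof(*cfg); i++)
		if (p[i] == 0)
			zeroed++;

	return zeroed;
}
```

With a local 'struct demo_config cfg', calling demo_clear_and_count(&cfg, sizeof(&cfg)) zeroes only pointer-size bytes, whereas demo_clear_and_count(&cfg, sizeof(cfg)) zeroes the whole structure.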


* Re: [PATCH 04/31] mm: tag reseve pages
From: David Rientjes @ 2009-10-02  9:50 UTC (permalink / raw)
  To: Neil Brown
  Cc: Suresh Jayaraman, Linus Torvalds, Andrew Morton, linux-kernel,
	linux-mm, netdev, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust
In-Reply-To: <19141.34038.274185.392663@notabene.brown>

On Fri, 2 Oct 2009, Neil Brown wrote:

> Normally if zones are above their watermarks, page->reserve will not
> be set.
> This is because __alloc_page_nodemask (which seems to be the main
> non-inline entrypoint) first calls get_page_from_freelist with
> alloc_flags set to ALLOC_WMARK_LOW|ALLOC_CPUSET.
> Only if this fails does __alloc_page_nodemask call
> __alloc_pages_slowpath which potentially sets ALLOC_NO_WATERMARKS in
> alloc_flags.
> 
> So page->reserved being set actually tells us:
>   PF_MEMALLOC or GFP_MEMALLOC were used, and
>   a WMARK_LOW allocation attempt failed very recently
> 
> which is close enough to "the emergency reserves were used" I think.
> 

There are a couple of corner cases for GFP_ATOMIC, though:

 - it isn't restricted by cpuset, so ALLOC_CPUSET will never get set for 
   the slowpath allocs and may very well allow the allocation to succeed 
   in zones far above their min watermark.

 - it allows for allocating beyond the min watermark in allowed zones
   simply by setting ALLOC_HARDER; these types of "reserve" allocations
   wouldn't be marked as page->reserve with your patches if
   ALLOC_NO_WATERMARKS wasn't set because of the allocation context.

The second one is debatable whether it fits your definition of reserve or 
not, but there's an inconsistency if it doesn't because the allocation may 
succeed in "no watermark" context (for example, in hard irq context) even 
though that privilege wasn't necessary to successfully allocate: perhaps 
it only needed ALLOC_HARDER.
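
The second corner case can be illustrated with a much simplified userspace model. All names and flag values below are hypothetical, and the model ignores allocation order and lowmem reserves. The one-quarter reduction for the "harder" flag is an assumption based on how the page allocator's watermark check of this era is understood to behave. The point is only that an allocation can dip below the min watermark without the no-watermarks flag, and a tagging rule keyed on that flag alone would not mark it.

```c
#include <assert.h>

/* Hypothetical flag values, for illustration only. */
enum {
	DEMO_ALLOC_HARDER        = 0x1,	/* lower the effective watermark */
	DEMO_ALLOC_NO_WATERMARKS = 0x2,	/* skip the watermark check */
};

/* Simplified check: may an allocation proceed with free_pages left? */
static int demo_watermark_ok(long free_pages, long min_mark, int alloc_flags)
{
	long min = min_mark;

	assert(min_mark >= 0);

	if (alloc_flags & DEMO_ALLOC_NO_WATERMARKS)
		return 1;
	if (alloc_flags & DEMO_ALLOC_HARDER)
		min -= min / 4;		/* assumed reduction, see above */

	return free_pages > min;
}

/* Tagging rule under discussion: tag only no-watermark allocations. */
static int demo_tagged_reserve(int alloc_flags)
{
	return !!(alloc_flags & DEMO_ALLOC_NO_WATERMARKS);
}
```

With a min watermark of 100 and 90 free pages, a plain request fails the check, a DEMO_ALLOC_HARDER request passes it, and demo_tagged_reserve() still reports the resulting page as untagged.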


* Re: [BUG net-2.6] bluetooth/rfcomm : sleeping function called from invalid context at mm/slub.c:1719
From: Oliver Hartkopp @ 2009-10-02  9:52 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: Linux Netdev List, linux-bluetooth
In-Reply-To: <4AC59D8A.6000102@hartkopp.net>

It's a reproducible bug.

When creating a ppp dialup connection a second time there is a lockdep annotation:

[ 1477.716936] PPP generic driver version 2.4.2
[ 1477.738035] BUG: sleeping function called from invalid context at mm/slub.c:1719
[ 1477.738046] in_atomic(): 1, irqs_disabled(): 0, pid: 5057, name: pppd
[ 1477.738053] 3 locks held by pppd/5057:
[ 1477.738058]  #0:  (rfcomm_mutex){+.+.+.}, at: [<fa5dd2a1>] rfcomm_dlc_open+0x28/0x2d6 [rfcomm]
[ 1477.738083]  #1:  (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.+.}, at: [<fa53f4f8>] l2cap_sock_connect+0x62/0x2c6 [l2cap]
[ 1477.738105]  #2:  (&hdev->lock){+...+.}, at: [<fa53f5b4>] l2cap_sock_connect+0x11e/0x2c6 [l2cap]
[ 1477.738129] Pid: 5057, comm: pppd Not tainted 2.6.31-08939-gdb8abec-dirty #21
[ 1477.738135] Call Trace:
[ 1477.738148]  [<c1042a2b>] ? __debug_show_held_locks+0x1e/0x20
[ 1477.738160]  [<c10212a1>] __might_sleep+0xc9/0xce
[ 1477.738171]  [<c1078b62>] __kmalloc+0x6d/0xfb
[ 1477.738181]  [<c119e739>] ? kzalloc+0xb/0xd
[ 1477.738190]  [<c119e739>] kzalloc+0xb/0xd
[ 1477.738199]  [<c119ef1a>] device_private_init+0x15/0x3d
[ 1477.738209]  [<c11a0e1b>] dev_set_drvdata+0x18/0x26
[ 1477.738233]  [<f88f9a1b>] hci_conn_init_sysfs+0x3d/0xc7 [bluetooth]
[ 1477.738253]  [<f88f61b3>] hci_conn_add+0x1c0/0x1d5 [bluetooth]
[ 1477.738271]  [<f88f6360>] hci_connect+0x71/0x17d [bluetooth]
[ 1477.738285]  [<fa53f62c>] l2cap_sock_connect+0x196/0x2c6 [l2cap]
[ 1477.738298]  [<c1246e3d>] kernel_connect+0xd/0x12
[ 1477.738311]  [<fa5dd3c3>] rfcomm_dlc_open+0x14a/0x2d6 [rfcomm]
[ 1477.738326]  [<fa5df0fa>] ? rfcomm_tty_open+0x73/0x227 [rfcomm]
[ 1477.738341]  [<fa5df130>] rfcomm_tty_open+0xa9/0x227 [rfcomm]
[ 1477.738352]  [<c1022e3f>] ? default_wake_function+0x0/0xd
[ 1477.738363]  [<c1180c79>] tty_open+0x29e/0x399
[ 1477.738374]  [<c107e9bd>] chrdev_open+0x13f/0x156
[ 1477.738384]  [<c107b0d3>] __dentry_open+0x11b/0x20f
[ 1477.738394]  [<c107b261>] nameidata_to_filp+0x2c/0x43
[ 1477.738403]  [<c107e87e>] ? chrdev_open+0x0/0x156
[ 1477.738414]  [<c1084e9e>] do_filp_open+0x3c6/0x70a
[ 1477.738426]  [<c108d3e4>] ? alloc_fd+0xc8/0xd2
[ 1477.738436]  [<c108d3e4>] ? alloc_fd+0xc8/0xd2
[ 1477.738446]  [<c107aebc>] do_sys_open+0x4a/0xe7
[ 1477.738456]  [<c1002acc>] ? restore_all_notrace+0x0/0x18
[ 1477.738466]  [<c107af9b>] sys_open+0x1e/0x26
[ 1477.738475]  [<c1002a18>] sysenter_do_call+0x12/0x36
[ 1484.844933] PPP BSD Compression module registered
[ 1484.870946] PPP Deflate Compression module registered
[ 4335.008503] CE: hpet increasing min_delta_ns to 15000 nsec
[ 7605.540870] INFO: trying to register non-static key.
[ 7605.540879] the code is fine but needs lockdep annotation.
[ 7605.540884] turning off the locking correctness validator.
[ 7605.540894] Pid: 0, comm: swapper Not tainted 2.6.31-08939-gdb8abec-dirty #21
[ 7605.540900] Call Trace:
[ 7605.540915]  [<c12e4fb2>] ? printk+0xf/0x11
[ 7605.540928]  [<c1042214>] register_lock_class+0x5a/0x295
[ 7605.540939]  [<c1043af2>] __lock_acquire+0x9b/0xc03
[ 7605.540949]  [<c104464b>] ? __lock_acquire+0xbf4/0xc03
[ 7605.540967]  [<fa53b168>] ? l2cap_get_chan_by_scid+0x35/0x43 [l2cap]
[ 7605.540977]  [<c104491f>] ? lock_release_non_nested+0x17b/0x1db
[ 7605.540990]  [<fa53b168>] ? l2cap_get_chan_by_scid+0x35/0x43 [l2cap]
[ 7605.541001]  [<c10426fd>] ? trace_hardirqs_off+0xb/0xd
[ 7605.541010]  [<c10446b6>] lock_acquire+0x5c/0x73
[ 7605.541021]  [<c124cd14>] ? skb_dequeue+0x12/0x4c
[ 7605.541031]  [<c12e6e23>] _spin_lock_irqsave+0x24/0x34
[ 7605.541039]  [<c124cd14>] ? skb_dequeue+0x12/0x4c
[ 7605.541048]  [<c124cd14>] skb_dequeue+0x12/0x4c
[ 7605.541057]  [<c124d579>] skb_queue_purge+0x14/0x1b
[ 7605.541070]  [<fa53de3f>] l2cap_recv_frame+0xe9e/0x129a [l2cap]
[ 7605.541080]  [<c10421d1>] ? register_lock_class+0x17/0x295
[ 7605.541091]  [<c104464b>] ? __lock_acquire+0xbf4/0xc03
[ 7605.541114]  [<c104464b>] ? __lock_acquire+0xbf4/0xc03
[ 7605.541125]  [<c120de74>] ? uhci_giveback_urb+0xf2/0x162
[ 7605.541148]  [<f88f4c45>] ? hci_rx_task+0xfe/0x1f8 [bluetooth]
[ 7605.541162]  [<fa53e2e4>] l2cap_recv_acldata+0xa9/0x1be [l2cap]
[ 7605.541174]  [<fa53e23b>] ? l2cap_recv_acldata+0x0/0x1be [l2cap]
[ 7605.541193]  [<f88f4c77>] hci_rx_task+0x130/0x1f8 [bluetooth]
[ 7605.541204]  [<c102a098>] tasklet_action+0x6b/0xb2
[ 7605.541213]  [<c102a46b>] __do_softirq+0x82/0x101
[ 7605.541222]  [<c102a515>] do_softirq+0x2b/0x43
[ 7605.541231]  [<c102a619>] irq_exit+0x35/0x68
[ 7605.541241]  [<c1004513>] do_IRQ+0x80/0x96
[ 7605.541250]  [<c10030ae>] common_interrupt+0x2e/0x34
[ 7605.541260]  [<c104007b>] ? tick_device_uses_broadcast+0x71/0x7c
[ 7605.541271]  [<c11747a8>] ? acpi_idle_enter_simple+0x103/0x12e
[ 7605.541281]  [<c1174515>] acpi_idle_enter_bm+0xc3/0x253
[ 7605.541291]  [<c1238b6f>] cpuidle_idle_call+0x60/0x91
[ 7605.541300]  [<c1001d44>] cpu_idle+0x49/0x65
[ 7605.541310]  [<c12e2f0e>] start_secondary+0x190/0x195


Oliver Hartkopp wrote:
> Hello Marcel,
> 
> with current net-2.6 tree ...
> 
> While starting my PPP Bluetooth dialup networking, i got this:
> 
> [  722.461549] PPP generic driver version 2.4.2
> [  722.477519] BUG: sleeping function called from invalid context at mm/slub.c:1719
> [  722.477530] in_atomic(): 1, irqs_disabled(): 0, pid: 4677, name: pppd
> [  722.477537] 3 locks held by pppd/4677:
> [  722.477542]  #0:  (rfcomm_mutex){+.+.+.}, at: [<fa5df2a1>] rfcomm_dlc_open+0x28/0x2d6 [rfcomm]
> [  722.477568]  #1:  (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.+.}, at: [<fa5414f8>] l2cap_sock_connect+0x62/0x2c6 [l2cap]
> [  722.477589]  #2:  (&hdev->lock){+...+.}, at: [<fa5415b4>] l2cap_sock_connect+0x11e/0x2c6 [l2cap]
> [  722.477613] Pid: 4677, comm: pppd Not tainted 2.6.31-08939-gdb8abec-dirty #21
> [  722.477619] Call Trace:
> [  722.477633]  [<c1042a2b>] ? __debug_show_held_locks+0x1e/0x20
> [  722.477644]  [<c10212a1>] __might_sleep+0xc9/0xce
> [  722.477655]  [<c1078b62>] __kmalloc+0x6d/0xfb
> [  722.477666]  [<c119e739>] ? kzalloc+0xb/0xd
> [  722.477674]  [<c119e739>] kzalloc+0xb/0xd
> [  722.477683]  [<c119ef1a>] device_private_init+0x15/0x3d
> [  722.477693]  [<c11a0e1b>] dev_set_drvdata+0x18/0x26
> [  722.477718]  [<f8b7ca1b>] hci_conn_init_sysfs+0x3d/0xc7 [bluetooth]
> [  722.477737]  [<f8b791b3>] hci_conn_add+0x1c0/0x1d5 [bluetooth]
> [  722.477756]  [<f8b79360>] hci_connect+0x71/0x17d [bluetooth]
> [  722.477769]  [<fa54162c>] l2cap_sock_connect+0x196/0x2c6 [l2cap]
> [  722.477782]  [<c1246e3d>] kernel_connect+0xd/0x12
> [  722.477795]  [<fa5df3c3>] rfcomm_dlc_open+0x14a/0x2d6 [rfcomm]
> [  722.477810]  [<fa5e10fa>] ? rfcomm_tty_open+0x73/0x227 [rfcomm]
> [  722.477825]  [<fa5e1130>] rfcomm_tty_open+0xa9/0x227 [rfcomm]
> [  722.477836]  [<c1022e3f>] ? default_wake_function+0x0/0xd
> [  722.477847]  [<c1180c79>] tty_open+0x29e/0x399
> [  722.477858]  [<c107e9bd>] chrdev_open+0x13f/0x156
> [  722.477868]  [<c107b0d3>] __dentry_open+0x11b/0x20f
> [  722.477878]  [<c107b261>] nameidata_to_filp+0x2c/0x43
> [  722.477888]  [<c107e87e>] ? chrdev_open+0x0/0x156
> [  722.477898]  [<c1084e9e>] do_filp_open+0x3c6/0x70a
> [  722.477910]  [<c108d3e4>] ? alloc_fd+0xc8/0xd2
> [  722.477920]  [<c108d3e4>] ? alloc_fd+0xc8/0xd2
> [  722.477930]  [<c107aebc>] do_sys_open+0x4a/0xe7
> [  722.477940]  [<c1002acc>] ? restore_all_notrace+0x0/0x18
> [  722.477950]  [<c107af9b>] sys_open+0x1e/0x26
> [  722.477959]  [<c1002a18>] sysenter_do_call+0x12/0x36
> [  729.658613] PPP BSD Compression module registered
> [  729.684789] PPP Deflate Compression module registered
> 
> Any idea?
> 
> Regards,
> Oliver
> 


^ permalink raw reply

* Messages are printed on screen
From: Markus Feldmann @ 2009-10-02  9:52 UTC (permalink / raw)
  To: netdev

Hi all,

I am setting up my server with Debian lenny, using kernel 2.6.31.1. I
configured this kernel with <make defconfig> and <make menuconfig>. My
motherboard has 5 PCI slots and 1 AGP slot, and all slots are in use.
The devices are mapped to the following IRQ lines:

Mass Storage Device (PCI Slot1)	IRQ-Line 11
Ethernet (PCI Slot 2) 		IRQ-Line 4
Ethernet (PCI Slot 3) 		IRQ-Line 5
Ethernet (PCI Slot 4) 		IRQ-Line 7
Ethernet (PCI Slot 5) 		IRQ-Line 11
Onboard USB-Controller		IRQ-Line 5
Onboard USB-Controller		IRQ-Line 4
Onboard USB-Controller		IRQ-Line 11
Onboard IDE			IRQ-Line 14
AGP VGA				IRQ-Line 11

As you can see, some of my IRQ lines are shared by several devices, so my
server is working hard at his limit. The result is that my server sometimes
freezes, especially if there is much processing on these devices. I
remember that with kernel 2.6.18 my system didn't freeze.

So I am trying to reduce the amount of this processing. I still get
messages about dropped network packets on my terminal, although I set up
<rsyslog> to save them only to </var/log>. Here is my
</etc/rsyslog.conf>

How can I disable the output of messages (about packets dropped by my
firewall) to my terminal?

How can I stabilize my IRQ system with kernel 2.6.31.1?

Which debug features should I disable?

regards Markus

^ permalink raw reply

* Re: [PATCH 30/31] Fix use of uninitialized variable in cache_grow()
From: David Rientjes @ 2009-10-02 10:05 UTC (permalink / raw)
  To: Neil Brown
  Cc: Suresh Jayaraman, Linus Torvalds, Andrew Morton, linux-kernel,
	linux-mm, netdev, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust
In-Reply-To: <19141.34685.863491.329836@notabene.brown>

On Fri, 2 Oct 2009, Neil Brown wrote:

> > > Index: mmotm/mm/slab.c
> > > ===================================================================
> > > --- mmotm.orig/mm/slab.c
> > > +++ mmotm/mm/slab.c
> > > @@ -2760,7 +2760,7 @@ static int cache_grow(struct kmem_cache
> > >  	size_t offset;
> > >  	gfp_t local_flags;
> > >  	struct kmem_list3 *l3;
> > > -	int reserve;
> > > +	int reserve = -1;
> > >  
> > >  	/*
> > >  	 * Be lazy and only check for valid flags here,  keeping it out of the
> > > @@ -2816,7 +2816,8 @@ static int cache_grow(struct kmem_cache
> > >  	if (local_flags & __GFP_WAIT)
> > >  		local_irq_disable();
> > >  	check_irq_off();
> > > -	slab_set_reserve(cachep, reserve);
> > > +	if (reserve != -1)
> > > +		slab_set_reserve(cachep, reserve);
> > >  	spin_lock(&l3->list_lock);
> > >  
> > >  	/* Make slab active. */
> > 
> > Given the patch description, shouldn't this be a test for objp != NULL 
> > instead, then?
> 
> In between those two patch hunks, cache_grow() contains the code:
> 	if (!objp)
> 		objp = kmem_getpages(cachep, local_flags, nodeid, &reserve);
> 	if (!objp)
> 		goto failed;
> 
> We can no longer test if objp was NULL on entry to the function.
> We could take a copy of objp on entry to the function, and test it
> here.  But initialising 'reserve' to an invalid value is easier.
> 

Seems like you could do all this in kmem_getpages(), then, by calling 
slab_set_reserve(cachep, page->reserve) before returning the new page?

 [ I'd also drop the branch in slab_set_reserve(), it's faster to just 
   assign it unconditionally. ]
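
A rough user-space sketch of that suggestion, for illustration only: the
mock_* types and the 'backing' parameter are made-up stand-ins rather than
the real slab structures, and locking and IRQ state are ignored entirely.

```c
#include <stddef.h>

/* Hypothetical user-space stand-ins; NOT the real kernel definitions. */
struct mock_page  { int reserve; };
struct mock_cache { int reserve; };

/* Unconditional store, as proposed in the bracketed note. */
static void slab_set_reserve(struct mock_cache *cachep, int reserve)
{
	cachep->reserve = reserve;
}

/*
 * Mock of kmem_getpages(): the reserve state is recorded right where the
 * page is obtained, so the caller needs no 'reserve' out-parameter.
 * 'backing' stands in for the page allocator; NULL simulates failure.
 */
static struct mock_page *kmem_getpages(struct mock_cache *cachep,
				       struct mock_page *backing)
{
	if (!backing)
		return NULL;
	slab_set_reserve(cachep, backing->reserve);
	return backing;
}

/*
 * Mock of the relevant part of cache_grow(): when the caller already
 * supplies objp, kmem_getpages() is skipped and cachep->reserve is left
 * untouched, without needing a sentinel such as -1.
 */
static int cache_grow(struct mock_cache *cachep, struct mock_page *objp,
		      struct mock_page *backing)
{
	if (!objp)
		objp = kmem_getpages(cachep, backing);
	if (!objp)
		return 0;	/* corresponds to "goto failed" */
	return 1;
}
```

Whether the real kmem_getpages() may touch the cache state at that point
depends on the locking rules, which this sketch does not model.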

^ permalink raw reply

* [RFC take2] pkt_sched: gen_estimator: Dont report fake rate estimators
From: Eric Dumazet @ 2009-10-02 10:35 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, kaber, netdev
In-Reply-To: <20091002070819.GA9694@ff.dom.local>

Here is a second attempt at this change, thanks Jarek!

This is indeed less intrusive!

[RFC] pkt_sched: gen_estimator: Dont report fake rate estimators

We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
is running.

# tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

The user has no way to tell whether the "rate 0bit 0pps" is a real estimate or a
fake one (because no estimator is active).

After this patch, the tc command output is:
$ tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

We add a parameter to gnet_stats_copy_rate_est() function so that
it can use gen_estimator_active(bstats, r), as suggested by Jarek.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/gen_stats.h |    1 +
 net/core/gen_stats.c    |    7 ++++++-
 net/sched/act_api.c     |    2 +-
 net/sched/sch_api.c     |    2 +-
 net/sched/sch_cbq.c     |    2 +-
 net/sched/sch_drr.c     |    2 +-
 net/sched/sch_hfsc.c    |    2 +-
 net/sched/sch_htb.c     |    2 +-
 8 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/net/gen_stats.h b/include/net/gen_stats.h
index c148855..a0800e6 100644
--- a/include/net/gen_stats.h
+++ b/include/net/gen_stats.h
@@ -30,6 +30,7 @@ extern int gnet_stats_start_copy_compat(struct sk_buff *skb, int type,
 extern int gnet_stats_copy_basic(struct gnet_dump *d,
 				 struct gnet_stats_basic_packed *b);
 extern int gnet_stats_copy_rate_est(struct gnet_dump *d,
+				    const struct gnet_stats_basic_packed *bstats,
 				    struct gnet_stats_rate_est *r);
 extern int gnet_stats_copy_queue(struct gnet_dump *d,
 				 struct gnet_stats_queue *q);
diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
index 8569310..6f9513e 100644
--- a/net/core/gen_stats.c
+++ b/net/core/gen_stats.c
@@ -136,8 +136,13 @@ gnet_stats_copy_basic(struct gnet_dump *d, struct gnet_stats_basic_packed *b)
  * if the room in the socket buffer was not sufficient.
  */
 int
-gnet_stats_copy_rate_est(struct gnet_dump *d, struct gnet_stats_rate_est *r)
+gnet_stats_copy_rate_est(struct gnet_dump *d,
+			 const struct gnet_stats_basic_packed *bstats,
+			 struct gnet_stats_rate_est *r)
 {
+	if (!gen_estimator_active(bstats, r))
+		return 0;
+
 	if (d->compat_tc_stats) {
 		d->tc_stats.bps = r->bps;
 		d->tc_stats.pps = r->pps;
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 2dfb3e7..2b0d5ee 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -618,7 +618,7 @@ int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *a,
 			goto errout;
 
 	if (gnet_stats_copy_basic(&d, &h->tcf_bstats) < 0 ||
-	    gnet_stats_copy_rate_est(&d, &h->tcf_rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(&d, &h->tcf_bstats, &h->tcf_rate_est) < 0 ||
 	    gnet_stats_copy_queue(&d, &h->tcf_qstats) < 0)
 		goto errout;
 
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 903e418..1acfd29 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1179,7 +1179,7 @@ static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid,
 		goto nla_put_failure;
 
 	if (gnet_stats_copy_basic(&d, &q->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(&d, &q->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(&d, &q->bstats, &q->rate_est) < 0 ||
 	    gnet_stats_copy_queue(&d, &q->qstats) < 0)
 		goto nla_put_failure;
 
diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 5b132c4..3846d65 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -1609,7 +1609,7 @@ cbq_dump_class_stats(struct Qdisc *sch, unsigned long arg,
 		cl->xstats.undertime = cl->undertime - q->now;
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, &cl->bstats, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qstats) < 0)
 		return -1;
 
diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c
index 5a888af..a65604f 100644
--- a/net/sched/sch_drr.c
+++ b/net/sched/sch_drr.c
@@ -280,7 +280,7 @@ static int drr_dump_class_stats(struct Qdisc *sch, unsigned long arg,
 	}
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, &cl->bstats, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qdisc->qstats) < 0)
 		return -1;
 
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 2c5c76b..b38b39c 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -1375,7 +1375,7 @@ hfsc_dump_class_stats(struct Qdisc *sch, unsigned long arg,
 	xstats.rtwork  = cl->cl_cumul;
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, &cl->bstats, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qstats) < 0)
 		return -1;
 
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 85acab9..8352fa3 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1105,7 +1105,7 @@ htb_dump_class_stats(struct Qdisc *sch, unsigned long arg, struct gnet_dump *d)
 	cl->xstats.ctokens = cl->ctokens;
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, &cl->bstats, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qstats) < 0)
 		return -1;
 

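The effect of the new early return can be sketched in user space roughly as
follows. This is only an illustration: struct mock_stats, its
estimator_running flag (standing in for gen_estimator_active()) and
copy_rate_est() are hypothetical names, not kernel API.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; not the kernel's gnet_* structures. */
struct mock_rate_est { unsigned int bps; unsigned int pps; };
struct mock_stats {
	bool estimator_running;	/* stands in for gen_estimator_active() */
	struct mock_rate_est rate;
};

/*
 * Mock of the patched gnet_stats_copy_rate_est(): when no estimator is
 * running it succeeds without emitting anything, so the reader of the
 * dump never sees a fake "rate 0bit 0pps". Otherwise the rate is
 * formatted into buf. Returns -1 if buf is too small.
 */
static int copy_rate_est(char *buf, size_t len, const struct mock_stats *s)
{
	int n;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	if (!s->estimator_running)
		return 0;

	n = snprintf(buf, len, "rate %ubit %upps ", s->rate.bps, s->rate.pps);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```
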
^ permalink raw reply related

* Re: [RFC][PATCH] ethtool: Add reset operation
From: Ajit Khaparde @ 2009-10-02 10:40 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: David Miller, netdev, linux-net-drivers
In-Reply-To: <1254426195.2735.16.camel@achroite>

On 01/10/09 20:43 +0100, Ben Hutchings wrote:
> After updating firmware stored in flash, users may wish to reset the
> relevant hardware and start the new firmware immediately.  This should
> not be completely automatic as it may be disruptive.
> 
> A selective reset may also be useful for debugging or diagnostics.
> 
> This adds a separate reset operation which takes flags indicating the
> components to be reset.  Drivers are allowed to reset only a subset of
> those requested, and must report the actual subset.  This allows the
> use of generic component masks and some future expansion.
> ---
Looks good. But one question.

> +static int ethtool_reset(struct net_device *dev, char __user *useraddr)
> +{
> +	struct ethtool_value reset;
> +	int ret;
> +
> +	if (!dev->ethtool_ops->reset)
> +		return -EOPNOTSUPP;
> +
> +	if (copy_from_user(&reset, useraddr, sizeof(reset)))
> +		return -EFAULT;
> +
> +	ret = dev->ethtool_ops->reset(dev, &reset.data);
> +	if (ret)
> +		return ret;
> +
> +	if (copy_to_user(useraddr, &reset, sizeof(reset)))
> +		return -EFAULT;
Can you explain the intention behind this copy_to_user()?
Do you envision drivers sending some data back to userland, maybe
sometime in the future?

Thanks
-Ajit

^ permalink raw reply

* Re: Messages are printed on screen
From: Ben Hutchings @ 2009-10-02 10:56 UTC (permalink / raw)
  To: Markus Feldmann; +Cc: netdev
In-Reply-To: <ha4igd$ghh$1@ger.gmane.org>

On Fri, 2009-10-02 at 11:52 +0200, Markus Feldmann wrote:
> Hi All,
> 
> i am setting up my Server, with Linux Debian lenny. Therefore i am using 
>   Kernel 2.6.31.1.

The current kernel version for 'lenny' is 2.6.26 (with stable updates
and other fixes).

2.6.31 is known to have a large number of regressions outstanding, which
is why it is not in Debian yet.

> I configured this Kernel with <make defconfig> and 
> <make menuconfig>. My motherboard has 5 PCI slots and 1 AGP slot. All 
> slots are in use. All devices are mapped to the following IRQ-Line:
> 
> Mass Storage Device (PCI Slot1)	IRQ-Line 11
> Ethernet (PCI Slot 2) 		IRQ-Line 4
> Ethernet (PCI Slot 3) 		IRQ-Line 5
> Ethernet (PCI Slot 4) 		IRQ-Line 7
> Ethernet (PCI Slot 5) 		IRQ-Line 11
> Onboard USB-Controller		IRQ-Line 5
> Onboard USB-Controller		IRQ-Line 4
> Onboard USB-Controller		IRQ-Line 11
> Onboard IDE			IRQ-Line 14
> AGP VGA				IRQ-Line 11
> 
> As you see some of my IRQ-Lines are multiply in use, so my Server is 
> working hard at his limit.

IRQ sharing is normal on PCs without MSI support, but to see where
that's happening you need to look at /proc/interrupts and not the BIOS
setup program or wherever you got the above information from.

This does not result in 'working hard at his limit'.

> The result is that my server sometimes freezes, especially if there is
> much processing on these devices. I remember that with kernel 2.6.18
> my system didn't freeze.

This is simply a bug, not a result of IRQ sharing or 'working hard'.

> So i am trying to reduce the amount of this processing. I still get 
> messages about dropped network packets on my Terminal, although i set up 
> my <rsyslog> to save this only to </var/log>. Here is my 
> </etc/rsyslog.conf>

You forgot to paste it.

> How can i disable the output of messages (about dropped packets from my 
> firewall) to my terminal ?

Edit the value of kernel.printk in /etc/sysctl.conf.

> How can i stabilize my IRQ-System with the kernel 2.6.31.1 ?

I would expect the standard kernel version for 'lenny' or the 2.6.30
kernel from 'sid' to be more stable.

> What debug features should i disable ?

No idea, you didn't even specify what you enabled...

Ben.

-- 
Ben Hutchings
Who are all these weirdos? - David Bowie, about L-Space IRC channel #afp


^ permalink raw reply

* Re: Messages are printed on screen
From: Markus Feldmann @ 2009-10-02 10:56 UTC (permalink / raw)
  To: netdev
In-Reply-To: <ha4igd$ghh$1@ger.gmane.org>

Markus Feldmann wrote:
> ....
> my <rsyslog> to save this only to </var/log>. Here is my 
> </etc/rsyslog.conf>
http://pastebin.com/m4400fb9e



^ permalink raw reply

* Re: [RFC][PATCH] ethtool: Add reset operation
From: Ben Hutchings @ 2009-10-02 11:00 UTC (permalink / raw)
  To: Ajit Khaparde; +Cc: David Miller, netdev, linux-net-drivers
In-Reply-To: <20091002104010.GA19862@serverengines.com>

On Fri, 2009-10-02 at 16:10 +0530, Ajit Khaparde wrote:
[...]
> Can you tell the intention behind this copy_to_user?
> Do you envision drivers sending back some data to the userland - may be
> sometime in future?

This allows userland to see which components were actually reset.
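
As a rough user-space illustration of that in/out use of the flags word
(the MOCK_RESET_* values and all mock_* names are made up for this sketch
and are not the flags proposed in the patch):

```c
#include <stdint.h>

/* Hypothetical component flags, for illustration only. */
#define MOCK_RESET_MAC	(1u << 0)
#define MOCK_RESET_PHY	(1u << 1)
#define MOCK_RESET_DMA	(1u << 2)

/* Stand-in for struct ethtool_value: a command word plus one data word. */
struct mock_ethtool_value {
	uint32_t cmd;
	uint32_t data;
};

/*
 * Mock driver ->reset() hook for an imaginary device that can only reset
 * its MAC and PHY. On entry *flags holds the requested components; on
 * return it holds the subset that was actually reset.
 */
static int mock_driver_reset(uint32_t *flags)
{
	const uint32_t supported = MOCK_RESET_MAC | MOCK_RESET_PHY;

	*flags &= supported;
	return 0;
}

/* Mock of ethtool_reset(); 'user' plays the role of the user buffer. */
static int mock_ethtool_reset(struct mock_ethtool_value *user)
{
	struct mock_ethtool_value reset = *user;	/* copy_from_user() */
	int ret = mock_driver_reset(&reset.data);

	if (ret)
		return ret;

	*user = reset;					/* copy_to_user() */
	return 0;
}
```

A caller that asked for MAC and DMA would read back MAC only, and could
tell from that that the DMA reset did not happen.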

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [BUG net-2.6] bluetooth/rfcomm : sleeping function called from invalid context at mm/slub.c:1719
From: Dave Young @ 2009-10-02 11:01 UTC (permalink / raw)
  To: Oliver Hartkopp; +Cc: Marcel Holtmann, Linux Netdev List, linux-bluetooth
In-Reply-To: <4AC59D8A.6000102@hartkopp.net>

On Fri, Oct 2, 2009 at 2:28 PM, Oliver Hartkopp <oliver@hartkopp.net> wrote:
> Hello Marcel,
>
> with current net-2.6 tree ...
>
> While starting my PPP Bluetooth dialup networking, i got this:

Hi Oliver,

please try the following patch:
http://patchwork.kernel.org/patch/51326/

>
> [  722.461549] PPP generic driver version 2.4.2
> [  722.477519] BUG: sleeping function called from invalid context at mm/slub.c:1719
> [  722.477530] in_atomic(): 1, irqs_disabled(): 0, pid: 4677, name: pppd
> [  722.477537] 3 locks held by pppd/4677:
> [  722.477542]  #0:  (rfcomm_mutex){+.+.+.}, at: [<fa5df2a1>] rfcomm_dlc_open+0x28/0x2d6 [rfcomm]
> [  722.477568]  #1:  (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.+.}, at: [<fa5414f8>] l2cap_sock_connect+0x62/0x2c6 [l2cap]
> [  722.477589]  #2:  (&hdev->lock){+...+.}, at: [<fa5415b4>] l2cap_sock_connect+0x11e/0x2c6 [l2cap]
> [  722.477613] Pid: 4677, comm: pppd Not tainted 2.6.31-08939-gdb8abec-dirty #21
> [  722.477619] Call Trace:
> [  722.477633]  [<c1042a2b>] ? __debug_show_held_locks+0x1e/0x20
> [  722.477644]  [<c10212a1>] __might_sleep+0xc9/0xce
> [  722.477655]  [<c1078b62>] __kmalloc+0x6d/0xfb
> [  722.477666]  [<c119e739>] ? kzalloc+0xb/0xd
> [  722.477674]  [<c119e739>] kzalloc+0xb/0xd
> [  722.477683]  [<c119ef1a>] device_private_init+0x15/0x3d
> [  722.477693]  [<c11a0e1b>] dev_set_drvdata+0x18/0x26
> [  722.477718]  [<f8b7ca1b>] hci_conn_init_sysfs+0x3d/0xc7 [bluetooth]
> [  722.477737]  [<f8b791b3>] hci_conn_add+0x1c0/0x1d5 [bluetooth]
> [  722.477756]  [<f8b79360>] hci_connect+0x71/0x17d [bluetooth]
> [  722.477769]  [<fa54162c>] l2cap_sock_connect+0x196/0x2c6 [l2cap]
> [  722.477782]  [<c1246e3d>] kernel_connect+0xd/0x12
> [  722.477795]  [<fa5df3c3>] rfcomm_dlc_open+0x14a/0x2d6 [rfcomm]
> [  722.477810]  [<fa5e10fa>] ? rfcomm_tty_open+0x73/0x227 [rfcomm]
> [  722.477825]  [<fa5e1130>] rfcomm_tty_open+0xa9/0x227 [rfcomm]
> [  722.477836]  [<c1022e3f>] ? default_wake_function+0x0/0xd
> [  722.477847]  [<c1180c79>] tty_open+0x29e/0x399
> [  722.477858]  [<c107e9bd>] chrdev_open+0x13f/0x156
> [  722.477868]  [<c107b0d3>] __dentry_open+0x11b/0x20f
> [  722.477878]  [<c107b261>] nameidata_to_filp+0x2c/0x43
> [  722.477888]  [<c107e87e>] ? chrdev_open+0x0/0x156
> [  722.477898]  [<c1084e9e>] do_filp_open+0x3c6/0x70a
> [  722.477910]  [<c108d3e4>] ? alloc_fd+0xc8/0xd2
> [  722.477920]  [<c108d3e4>] ? alloc_fd+0xc8/0xd2
> [  722.477930]  [<c107aebc>] do_sys_open+0x4a/0xe7
> [  722.477940]  [<c1002acc>] ? restore_all_notrace+0x0/0x18
> [  722.477950]  [<c107af9b>] sys_open+0x1e/0x26
> [  722.477959]  [<c1002a18>] sysenter_do_call+0x12/0x36
> [  729.658613] PPP BSD Compression module registered
> [  729.684789] PPP Deflate Compression module registered
>
> Any idea?
>
> Regards,
> Oliver
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bluetooth" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
Regards
dave

^ permalink raw reply

* Re: [RFC take2] pkt_sched: gen_estimator: Dont report fake rate estimators
From: Jarek Poplawski @ 2009-10-02 11:25 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, kaber, netdev
In-Reply-To: <4AC5D78D.3030400@gmail.com>

On Fri, Oct 02, 2009 at 12:35:57PM +0200, Eric Dumazet wrote:
> Here is second attempt to make this change, thanks Jarek !
> 
> This is indeed less intrusive !
> 
> [RFC] pkt_sched: gen_estimator: Dont report fake rate estimators
> 
> We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
> is running.
> 
> # tc -s -d qdisc
> qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
>  rate 0bit 0pps backlog 0b 0p requeues 0
> 
> User has no way to tell if the "rate 0bit 0pps" is a real estimation, or a fake
> one (because no estimator is active)
> 
> After this patch, tc command output is :
> $ tc -s -d qdisc
> qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
> 
> We add a parameter to gnet_stats_copy_rate_est() function so that
> it can use gen_estimator_active(bstats, r), as suggested by Jarek.

So you prefer the version with the additional parameter. But since these
_active tests are not needed e.g. for HTB classes, which have the estimator
active by default, maybe bstats == NULL could be used to skip the test?

...
> --- a/include/net/gen_stats.h
> +++ b/include/net/gen_stats.h
> @@ -30,6 +30,7 @@ extern int gnet_stats_start_copy_compat(struct sk_buff *skb, int type,
>  extern int gnet_stats_copy_basic(struct gnet_dump *d,
>  				 struct gnet_stats_basic_packed *b);
>  extern int gnet_stats_copy_rate_est(struct gnet_dump *d,
> +				    const struct gnet_stats_basic_packed *bstats,

It seems the naming of these *b/*bstats parameters could be more
consistent. Otherwise it looks OK to me.

Thanks,
Jarek P.

^ permalink raw reply

* Re: Messages are printed on screen
From: Markus Feldmann @ 2009-10-02 12:01 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1254480996.23350.73.camel@localhost>

Ben Hutchings wrote:
> On Fri, 2009-10-02 at 11:52 +0200, Markus Feldmann wrote:
>
>>
>> As you see some of my IRQ-Lines are multiply in use, so my Server is 
>> working hard at his limit.
> 
> IRQ sharing is normal on PCs without MSI support, but to see where
> that's happening you need to look at /proc/interrupts and not the BIOS
> setup program or wherever you got the above information from.
OK, I ran <cat /proc/interrupts> and got:
            CPU0
   0:     259603    XT-PIC-XT        timer
   1:       1421    XT-PIC-XT        i8042
   2:          0    XT-PIC-XT        cascade
   4:     200000    XT-PIC-XT        ohci_hcd:usb3, pppp0
   5:          0    XT-PIC-XT        ehci_hcd:usb1, lan0
   7:       6959    XT-PIC-XT        lan1
   8:          2    XT-PIC-XT        rtc0
   9:          0    XT-PIC-XT        acpi
  11:      37697    XT-PIC-XT        ide2, ide3, ohci_hcd:usb2, lan2
  14:          0    XT-PIC-XT        ide0
NMI:          0   Non-maskable interrupts
TRM:          0   Thermal event interrupts
MCE:          0   Machine check exceptions
MCP:         13   Machine check polls
ERR:          2
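
To compare the numbered IRQ lines in output like the above, the per-CPU
counts on each line can be summed. The helper below is only a sketch:
irq_line_total() is a made-up name, and it assumes the usual layout of an
IRQ number, a colon, one count column per CPU, then the controller and
device names.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Parse one line of /proc/interrupts. For a numbered IRQ line, store the
 * IRQ number in *irq and return the sum of the per-CPU count columns.
 * Return -1 for lines that do not start with "<number>:" (the CPU header,
 * NMI, ERR and similar).
 */
static long irq_line_total(const char *line, int *irq)
{
	char *end;
	long num, total = 0;

	num = strtol(line, &end, 10);
	if (end == line || *end != ':')
		return -1;
	*irq = (int)num;
	line = end + 1;

	for (;;) {
		const char *p = line;
		long count;

		while (*p == ' ' || *p == '\t')
			p++;
		if (!isdigit((unsigned char)*p))
			break;	/* reached the controller name column */

		count = strtol(p, &end, 10);
		/* A count column ends in whitespace or at end of line. */
		if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
			break;

		total += count;
		line = end;
	}
	return total;
}
```

Calling this twice a few seconds apart and comparing the totals shows which
line is receiving the most interrupts in that interval.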

How can I assign IRQs during boot?

How can i watch which IRQ Line has most traffic or problems ?

>> The result is that my server sometimes freezes, especially if there is
>> much processing on these devices. I remember that with kernel 2.6.18
>> my system didn't freeze.
> 
> This is simply a bug, not a result of IRQ sharing or 'working hard'.
But something must have triggered the freeze. Although I do not know the
bug, I should be able to avoid the problem by taking some precautions?!

> 
>> How can i disable the output of messages (about dropped packets from my 
>> firewall) to my terminal ?
> 
> Edit the value of kernel.printk in /etc/sysctl.conf.
OK, I added:
kernel.printk= 4 4 1 7
to </etc/sysctl.conf>
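
For reference, the four numbers are the console log level, the default
message log level, the minimum console log level and the default console
log level; a message is printed on the console only if its level is
numerically lower than the first value. A small sketch of that rule
(struct printk_levels, parse_printk() and reaches_console() are names
invented here, not an existing API):

```c
#include <stdio.h>

/* The four fields of kernel.printk, in the order they appear. */
struct printk_levels {
	int console;		/* messages with a level below this reach the console */
	int default_message;	/* level assigned to messages that carry none */
	int minimum_console;	/* lowest value 'console' may be set to */
	int default_console;	/* boot-time default for 'console' */
};

/* Parse a "4 4 1 7"-style value; returns 0 on success, -1 if malformed. */
static int parse_printk(const char *s, struct printk_levels *out)
{
	int n = sscanf(s, "%d %d %d %d", &out->console, &out->default_message,
		       &out->minimum_console, &out->default_console);

	return n == 4 ? 0 : -1;
}

/* Nonzero if a message logged at msg_level would be shown on the console. */
static int reaches_console(const struct printk_levels *l, int msg_level)
{
	return msg_level < l->console;
}
```

With "4 4 1 7", messages at level 4 (warning) and above no longer reach the
console, which would cover firewall log messages if they are logged at
warning level (an assumption about this setup).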
> 
>> How can i stabilize my IRQ-System with the kernel 2.6.31.1 ?
> 
> I would expect the standard kernel version for 'lenny' or the 2.6.30
> kernel from 'sid' to be more stable.
Ok i will try the kernel from Debian Sid. :-)
> 
>> What debug features should i disable ?
> 
> No idea, you didn't even specify what you enabled...
I will list the enabled features next week.

regards Markus


^ permalink raw reply

* Re: [PATCH 1/4] qeth: Convert ethtool get_stats_count() ops to get_sset_count()
From: Frank Blaschka @ 2009-10-02 12:13 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, Ursula Braun, Frank Blaschka, linux-s390, netdev
In-Reply-To: <1254432272.2735.20.camel@achroite>

Works fine, thanks a lot. Here is my ACK.

Ben Hutchings wrote:
> This string query operation was supposed to be replaced by the
> generic get_sset_count() starting in 2007.  Convert qeth's
> implementation.
> 
> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
> ---
> This is not even compile-tested because I don't have an s390 compiler.
> But it's simple enough that I think I got it right...
> 
> Ben.
> 
>  drivers/s390/net/qeth_core.h      |    2 +-
>  drivers/s390/net/qeth_core_main.c |   11 ++++++++---
>  drivers/s390/net/qeth_l2_main.c   |    4 ++--
>  drivers/s390/net/qeth_l3_main.c   |    2 +-
>  4 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h
> index 31a2b4e..e8f72d7 100644
> --- a/drivers/s390/net/qeth_core.h
> +++ b/drivers/s390/net/qeth_core.h
> @@ -849,7 +849,7 @@ int qeth_do_send_packet_fast(struct qeth_card *, struct qeth_qdio_out_q *,
>  			struct sk_buff *, struct qeth_hdr *, int, int, int);
>  int qeth_do_send_packet(struct qeth_card *, struct qeth_qdio_out_q *,
>  		    struct sk_buff *, struct qeth_hdr *, int);
> -int qeth_core_get_stats_count(struct net_device *);
> +int qeth_core_get_sset_count(struct net_device *, int);
>  void qeth_core_get_ethtool_stats(struct net_device *,
>  				struct ethtool_stats *, u64 *);
>  void qeth_core_get_strings(struct net_device *, u32, u8 *);
> diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
> index c4a42d9..edee4dc 100644
> --- a/drivers/s390/net/qeth_core_main.c
> +++ b/drivers/s390/net/qeth_core_main.c
> @@ -4305,11 +4305,16 @@ static struct {
>  	{"tx csum"},
>  };
> 
> -int qeth_core_get_stats_count(struct net_device *dev)
> +int qeth_core_get_sset_count(struct net_device *dev, int stringset)
>  {
> -	return (sizeof(qeth_ethtool_stats_keys) / ETH_GSTRING_LEN);
> +	switch (stringset) {
> +	case ETH_SS_STATS:
> +		return (sizeof(qeth_ethtool_stats_keys) / ETH_GSTRING_LEN);
> +	default:
> +		return -EINVAL;
> +	}
>  }
> -EXPORT_SYMBOL_GPL(qeth_core_get_stats_count);
> +EXPORT_SYMBOL_GPL(qeth_core_get_sset_count);
> 
>  void qeth_core_get_ethtool_stats(struct net_device *dev,
>  		struct ethtool_stats *stats, u64 *data)
> diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
> index f4f3ca1..b61d5c7 100644
> --- a/drivers/s390/net/qeth_l2_main.c
> +++ b/drivers/s390/net/qeth_l2_main.c
> @@ -866,7 +866,7 @@ static const struct ethtool_ops qeth_l2_ethtool_ops = {
>  	.get_link = ethtool_op_get_link,
>  	.get_strings = qeth_core_get_strings,
>  	.get_ethtool_stats = qeth_core_get_ethtool_stats,
> -	.get_stats_count = qeth_core_get_stats_count,
> +	.get_sset_count = qeth_core_get_sset_count,
>  	.get_drvinfo = qeth_core_get_drvinfo,
>  	.get_settings = qeth_core_ethtool_get_settings,
>  };
> @@ -874,7 +874,7 @@ static const struct ethtool_ops qeth_l2_ethtool_ops = {
>  static const struct ethtool_ops qeth_l2_osn_ops = {
>  	.get_strings = qeth_core_get_strings,
>  	.get_ethtool_stats = qeth_core_get_ethtool_stats,
> -	.get_stats_count = qeth_core_get_stats_count,
> +	.get_sset_count = qeth_core_get_sset_count,
>  	.get_drvinfo = qeth_core_get_drvinfo,
>  };
> 
> diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c
> index 073b6d3..4ca28c1 100644
> --- a/drivers/s390/net/qeth_l3_main.c
> +++ b/drivers/s390/net/qeth_l3_main.c
> @@ -2957,7 +2957,7 @@ static const struct ethtool_ops qeth_l3_ethtool_ops = {
>  	.set_tso     = qeth_l3_ethtool_set_tso,
>  	.get_strings = qeth_core_get_strings,
>  	.get_ethtool_stats = qeth_core_get_ethtool_stats,
> -	.get_stats_count = qeth_core_get_stats_count,
> +	.get_sset_count = qeth_core_get_sset_count,
>  	.get_drvinfo = qeth_core_get_drvinfo,
>  	.get_settings = qeth_core_ethtool_get_settings,
>  };
> 



^ permalink raw reply

* Re: Network hangs with 2.6.30.5
From: Ilpo Järvinen @ 2009-10-02 12:29 UTC (permalink / raw)
  To: David Miller
  Cc: jarkao2, holger.hoffstaette, Netdev, eric.dumazet,
	Evgeniy Polyakov
In-Reply-To: <alpine.DEB.2.00.0910021104130.13543@wel-95.cs.helsinki.fi>


On Fri, 2 Oct 2009, Ilpo Järvinen wrote:

> On Thu, 1 Oct 2009, David Miller wrote:
> 
> > From: Jarek Poplawski <jarkao2@gmail.com>
> > Date: Mon, 7 Sep 2009 07:21:43 +0000
> > 
> > > While Eric is analyzing your data, I guess you could try reverting
> > > some stuff around this tcp_tw_recycle, and my tcp ignorance would
> > > point these commits for the beginning:
> > > 
> > > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.30.y.git;a=commitdiff;h=fc1ad92dfc4e363a055053746552cdb445ba5c57
> > > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.30.y.git;a=commitdiff;h=c887e6d2d9aee56ee7c9f2af4cec3a5efdcc4c72
> > 
> > Ilpo's cleanup (the second commit listed) looks most likely to
> > be a possibility.
> > 
> > But I surely cannot find any bugs in it, even after studying it
> > a few times.
> > 
> > Ilpo could you audit it one more time for us just in case?
> 
> Argh, not that one ...the jungle of negations. But I'll try to go it 
> through once more but I tell you I did go through those negations multiple 
> times already before submitting it :-).
> 
> > I also looked through all the TCP commits in 2.6.29 to 2.6.30
> > and I could not find anything else that might cause stalls with
> > time-wait recycled connections.
> 
> What about the more than 64k connections change a9d8f9110d7e953c2f2 (or 
> its fixes), it might be another possibility? ...It certainly does 
> something related to reuse and happens to be in the correct time frame... 
> (I've added Evgeniy).

Here's my full analysis:

> c887e6d2d9aee56ee7c9f2af4cec3a5efdcc4c72
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index d74ac30..255ca35 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -997,11 +997,21 @@ static inline int tcp_fin_time(const struct sock *sk)
>  	return fin_timeout;
>  }
>  
> -static inline int tcp_paws_check(const struct tcp_options_received *rx_opt, int rst)
> +static inline int tcp_paws_check(const struct tcp_options_received *rx_opt,
> +				 int paws_win)
>  {
> -	if ((s32)(rx_opt->rcv_tsval - rx_opt->ts_recent) >= 0)
> -		return 0;
> -	if (get_seconds() >= rx_opt->ts_recent_stamp + TCP_PAWS_24DAYS)
> +	if ((s32)(rx_opt->ts_recent - rx_opt->rcv_tsval) <= paws_win)
> +		return 1;
> +	if (unlikely(get_seconds() >= rx_opt->ts_recent_stamp + TCP_PAWS_24DAYS))
> +		return 1;
> +
> +	return 0;
> +}
> +
> +static inline int tcp_paws_reject(const struct tcp_options_received *rx_opt,
> +				  int rst)
> +{
> +	if (tcp_paws_check(rx_opt, 0))
>  		return 0;

The first condition is multiplied by -1, which swaps the subtraction terms
and reverses the inequality. The second condition is essentially unchanged.
In addition, tcp_paws_reject() adds an extra round of negation, but the
result is still OK.
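
For anyone who wants to double-check the sign flip mechanically, here is a
small user-space sketch (not kernel code). The s32/u32 typedefs and the names
old_form(), new_form() and count_mismatches() are stand-ins made up for this
illustration, and the cast to s32 assumes the usual two's complement
behaviour. The two forms agree for every difference except one: when the
32-bit difference is exactly 0x80000000, negation maps INT32_MIN to itself, so
the old test is false while the new one (with paws_win == 0) is true. Whether
that single value can matter in practice is not assessed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t s32;	/* user-space stand-ins for the kernel typedefs */
typedef uint32_t u32;

/* Old first condition: true when rcv_tsval is not older than ts_recent. */
static inline int old_form(u32 rcv_tsval, u32 ts_recent)
{
	return (s32)(rcv_tsval - ts_recent) >= 0;
}

/* New first condition with paws_win == 0: terms swapped, inequality reversed. */
static inline int new_form(u32 rcv_tsval, u32 ts_recent)
{
	return (s32)(ts_recent - rcv_tsval) <= 0;
}

/* Count disagreements for a handful of differences around the wrap points. */
static inline int count_mismatches(u32 ts_recent)
{
	static const u32 deltas[] = {
		0u, 1u, 2u, 0x7ffffffeu, 0x7fffffffu,
		0x80000000u, 0x80000001u, 0xfffffffeu, 0xffffffffu
	};
	size_t i;
	int mismatches = 0;

	for (i = 0; i < sizeof(deltas) / sizeof(deltas[0]); i++) {
		u32 rcv_tsval = ts_recent + deltas[i];

		if (old_form(rcv_tsval, ts_recent) != new_form(rcv_tsval, ts_recent))
			mismatches++;
	}
	return mismatches;
}

/* Hypothetical self-check: only the 0x80000000 delta should disagree. */
static inline void check_sign_flip(void)
{
	assert(old_form(5, 3) && new_form(5, 3));
	assert(!old_form(3, 5) && !new_form(3, 5));
	assert(count_mismatches(0) == 1);
}
```

Calling check_sign_flip() from a test main() is enough to exercise it.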

>  
>  	/* RST segments are not recommended to carry timestamp,
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index f527a16..b7d02c5 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -3883,8 +3883,7 @@ static inline void tcp_replace_ts_recent(struct tcp_sock *tp, u32 seq)
>  		 * Not only, also it occurs for expired timestamps.
>  		 */
>  
> -		if ((s32)(tp->rx_opt.rcv_tsval - tp->rx_opt.ts_recent) >= 0 ||

* -1 here too.

> -		   get_seconds() >= tp->rx_opt.ts_recent_stamp + TCP_PAWS_24DAYS)

The very same condition.

> +		if (tcp_paws_check(&tp->rx_opt, 0))
>  			tcp_store_ts_recent(tp);
>  	}
>  }
> @@ -3936,9 +3935,9 @@ static inline int tcp_paws_discard(const struct sock *sk,
>  				   const struct sk_buff *skb)
>  {
>  	const struct tcp_sock *tp = tcp_sk(sk);
> -	return ((s32)(tp->rx_opt.ts_recent - tp->rx_opt.rcv_tsval) > TCP_PAWS_WINDOW &&
> -		get_seconds() < tp->rx_opt.ts_recent_stamp + TCP_PAWS_24DAYS &&
> -		!tcp_disordered_ack(sk, skb));
> +
> +	return !tcp_paws_check(&tp->rx_opt, TCP_PAWS_WINDOW) &&

DeMorgan: 

   (a > b) &&  (c < d) 
          <==>
!(!(a > b) || !(c < d))
          <==>
!((a <= b) || (c >= d))
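
The same identity can be checked by brute force. This is only a user-space
sketch with made-up names (direct_form(), negated_form(),
demorgan_mismatches()). Here a stands for (s32)(ts_recent - rcv_tsval), b for
TCP_PAWS_WINDOW, c for get_seconds() and d for ts_recent_stamp +
TCP_PAWS_24DAYS, so the inner (a <= b) || (c >= d) is what
tcp_paws_check(&tp->rx_opt, TCP_PAWS_WINDOW) computes.

```c
#include <assert.h>

/* Original form: both comparisons must hold. */
static inline int direct_form(int a, int b, int c, int d)
{
	return (a > b) && (c < d);
}

/* Rewritten form: negate the OR of the opposite comparisons. */
static inline int negated_form(int a, int b, int c, int d)
{
	return !((a <= b) || (c >= d));
}

/* Compare both forms for every (a, b, c, d) with values in [lo, hi]. */
static inline int demorgan_mismatches(int lo, int hi)
{
	int a, b, c, d;
	int mismatches = 0;

	for (a = lo; a <= hi; a++)
		for (b = lo; b <= hi; b++)
			for (c = lo; c <= hi; c++)
				for (d = lo; d <= hi; d++)
					if (direct_form(a, b, c, d) !=
					    negated_form(a, b, c, d))
						mismatches++;
	return mismatches;
}

/* Hypothetical self-check over a small range of values. */
static inline void check_demorgan(void)
{
	assert(demorgan_mismatches(-3, 3) == 0);
}
```

Only the ordering of the values matters for the comparisons, so a small range
already covers every case.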

> +	       !tcp_disordered_ack(sk, skb);
>  }
>  
>  /* Check segment sequence number for validity.
> @@ -5513,7 +5512,7 @@ discard:
>  
>  	/* PAWS check. */
>  	if (tp->rx_opt.ts_recent_stamp && tp->rx_opt.saw_tstamp &&
> -	    tcp_paws_check(&tp->rx_opt, 0))
> +	    tcp_paws_reject(&tp->rx_opt, 0))

A plain rename, the rest likewise.

>  		goto discard_and_undo;
>  
>  	if (th->syn) {
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index 4b0df3e..43bbba7 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -107,7 +107,7 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
>  		if (tmp_opt.saw_tstamp) {
>  			tmp_opt.ts_recent	= tcptw->tw_ts_recent;
>  			tmp_opt.ts_recent_stamp	= tcptw->tw_ts_recent_stamp;
> -			paws_reject = tcp_paws_check(&tmp_opt, th->rst);
> +			paws_reject = tcp_paws_reject(&tmp_opt, th->rst);
>  		}
>  	}
>  
> @@ -511,7 +511,7 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
>  			 * from another data.
>  			 */
>  			tmp_opt.ts_recent_stamp = get_seconds() - ((TCP_TIMEOUT_INIT/HZ)<<req->retrans);
> -			paws_reject = tcp_paws_check(&tmp_opt, th->rst);
> +			paws_reject = tcp_paws_reject(&tmp_opt, th->rst);
>  		}
>  	}
>  

...Which leads me to conclude that the patch is innocent. ...I certainly won't
regret this cleanup after having to figure that mess out once again - that is
to say, hopefully for the last time :-). ...Sadly the problem remains.

-- 
 i.

^ permalink raw reply

* Re: Network hangs with 2.6.30.5
From: Eric Dumazet @ 2009-10-02 12:38 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, jarkao2, holger.hoffstaette, Netdev,
	Evgeniy Polyakov
In-Reply-To: <alpine.DEB.2.00.0910021520280.13543@wel-95.cs.helsinki.fi>

Ilpo Järvinen wrote:
> On Fri, 2 Oct 2009, Ilpo Järvinen wrote:
> 
>> On Thu, 1 Oct 2009, David Miller wrote:
>>
>>> From: Jarek Poplawski <jarkao2@gmail.com>
>>> Date: Mon, 7 Sep 2009 07:21:43 +0000
>>>
>>>> While Eric is analyzing your data, I guess you could try reverting
>>>> some stuff around this tcp_tw_recycle, and my tcp ignorance would
>>>> point these commits for the beginning:
>>>>
>>>> http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.30.y.git;a=commitdiff;h=fc1ad92dfc4e363a055053746552cdb445ba5c57
>>>> http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.30.y.git;a=commitdiff;h=c887e6d2d9aee56ee7c9f2af4cec3a5efdcc4c72
>>> Ilpo's cleanup (the second commit listed) looks most likely to
>>> be a possibility.
>>>
>>> But I surely cannot find any bugs in it, even after studying it
>>> a few times.
>>>
>>> Ilpo could you audit it one more time for us just in case?
>> Argh, not that one ...the jungle of negations. But I'll try to go it 
>> through once more but I tell you I did go through those negations multiple 
>> times already before submitting it :-).
>>
>>> I also looked through all the TCP commits in 2.6.29 to 2.6.30
>>> and I could not find anything else that might cause stalls with
>>> time-wait recycled connections.
>> What about the more than 64k connections change a9d8f9110d7e953c2f2 (or 
>> its fixes), it might be another possibility? ...It certainly does 
>> something related to reuse and happens to be in the correct time frame... 
>> (I've added Evgeniy).

I scratched my head trying to reproduce the conditions of the hang, but failed.

I am pretty sure both commits are OK (yours and mine); maybe a brute-force
git bisection is needed.


^ permalink raw reply

* Re: [RFC take2] pkt_sched: gen_estimator: Dont report fake rate estimators
From: Eric Dumazet @ 2009-10-02 12:39 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, kaber, netdev
In-Reply-To: <20091002112514.GA14100@ff.dom.local>

Jarek Poplawski wrote:

> So you prefer the additional parameter version, but since these
> _active tests are not needed e.g. for HTB classes, which got it
> active by default, so maybe bstats == NULL would let skip such a test?
> 
> ...
>> --- a/include/net/gen_stats.h
>> +++ b/include/net/gen_stats.h
>> @@ -30,6 +30,7 @@ extern int gnet_stats_start_copy_compat(struct sk_buff *skb, int type,
>>  extern int gnet_stats_copy_basic(struct gnet_dump *d,
>>  				 struct gnet_stats_basic_packed *b);
>>  extern int gnet_stats_copy_rate_est(struct gnet_dump *d,
>> +				    const struct gnet_stats_basic_packed *bstats,
> 
> It seems these *b/*bstats defs could look more consistent. Otherwise
> it looks OK to me.

Agreed, here is the updated version. I added your Signed-off-by, if you don't mind :)

[RFC] pkt_sched: gen_estimator: Dont report fake rate estimators

We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
is running.

# tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

The user has no way to tell whether "rate 0bit 0pps" is a real estimate or a
fake one (because no estimator is active).

After this patch, the tc command output is:
$ tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

We add a parameter to the gnet_stats_copy_rate_est() function so that
it can use gen_estimator_active(b, r), as suggested by Jarek.

This parameter can be NULL if the check is not necessary (htb, for
example, has a mandatory rate estimator).
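
To illustrate the calling convention only, here is a user-space sketch of the
optional check. Everything in it is hypothetical: struct basic_stats,
struct rate_est, struct dump, estimator_active() and copy_rate_est() are
simplified stand-ins, and the has_estimator flag replaces whatever
gen_estimator_active() really looks up in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumptions). */
struct basic_stats { unsigned long long bytes; unsigned int packets; };
struct rate_est { unsigned int bps; unsigned int pps; int has_estimator; };
struct dump { int rate_reported; unsigned int bps; unsigned int pps; };

/* Stand-in for gen_estimator_active(): here it just reads a flag. */
static inline int estimator_active(const struct basic_stats *b,
				   const struct rate_est *r)
{
	(void)b;
	return r->has_estimator;
}

/*
 * Report the rate only if an estimator is active. A NULL b skips the
 * check, for callers that always have an estimator.
 */
static inline int copy_rate_est(struct dump *d, const struct basic_stats *b,
				const struct rate_est *r)
{
	if (b && !estimator_active(b, r))
		return 0;	/* nothing reported, but not an error */

	d->rate_reported = 1;
	d->bps = r->bps;
	d->pps = r->pps;
	return 0;
}

/* Hypothetical self-check of the two calling conventions. */
static inline void check_copy_rate_est(void)
{
	struct basic_stats bs = { 0, 0 };
	struct rate_est idle = { 0, 0, 0 };
	struct dump d = { 0, 0, 0 };

	copy_rate_est(&d, &bs, &idle);	/* checked: no estimator, skipped */
	assert(d.rate_reported == 0);
	copy_rate_est(&d, NULL, &idle);	/* unchecked: always reported */
	assert(d.rate_reported == 1);
}
```

Returning 0 in the skipped case keeps the callers' "< 0" error tests working
unchanged.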


Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---
 include/net/gen_stats.h |    1 +
 net/core/gen_stats.c    |    7 ++++++-
 net/sched/act_api.c     |    2 +-
 net/sched/sch_api.c     |    2 +-
 net/sched/sch_cbq.c     |    2 +-
 net/sched/sch_drr.c     |    2 +-
 net/sched/sch_hfsc.c    |    2 +-
 net/sched/sch_htb.c     |    2 +-
 8 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/net/gen_stats.h b/include/net/gen_stats.h
index c148855..eb87a14 100644
--- a/include/net/gen_stats.h
+++ b/include/net/gen_stats.h
@@ -30,6 +30,7 @@ extern int gnet_stats_start_copy_compat(struct sk_buff *skb, int type,
 extern int gnet_stats_copy_basic(struct gnet_dump *d,
 				 struct gnet_stats_basic_packed *b);
 extern int gnet_stats_copy_rate_est(struct gnet_dump *d,
+				    const struct gnet_stats_basic_packed *b,
 				    struct gnet_stats_rate_est *r);
 extern int gnet_stats_copy_queue(struct gnet_dump *d,
 				 struct gnet_stats_queue *q);
diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
index 8569310..054a49c 100644
--- a/net/core/gen_stats.c
+++ b/net/core/gen_stats.c
@@ -136,8 +136,13 @@ gnet_stats_copy_basic(struct gnet_dump *d, struct gnet_stats_basic_packed *b)
  * if the room in the socket buffer was not sufficient.
  */
 int
-gnet_stats_copy_rate_est(struct gnet_dump *d, struct gnet_stats_rate_est *r)
+gnet_stats_copy_rate_est(struct gnet_dump *d,
+			 const struct gnet_stats_basic_packed *b,
+			 struct gnet_stats_rate_est *r)
 {
+	if (b && !gen_estimator_active(b, r))
+		return 0;
+
 	if (d->compat_tc_stats) {
 		d->tc_stats.bps = r->bps;
 		d->tc_stats.pps = r->pps;
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 2dfb3e7..2b0d5ee 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -618,7 +618,7 @@ int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *a,
 			goto errout;
 
 	if (gnet_stats_copy_basic(&d, &h->tcf_bstats) < 0 ||
-	    gnet_stats_copy_rate_est(&d, &h->tcf_rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(&d, &h->tcf_bstats, &h->tcf_rate_est) < 0 ||
 	    gnet_stats_copy_queue(&d, &h->tcf_qstats) < 0)
 		goto errout;
 
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 903e418..1acfd29 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1179,7 +1179,7 @@ static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid,
 		goto nla_put_failure;
 
 	if (gnet_stats_copy_basic(&d, &q->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(&d, &q->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(&d, &q->bstats, &q->rate_est) < 0 ||
 	    gnet_stats_copy_queue(&d, &q->qstats) < 0)
 		goto nla_put_failure;
 
diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 5b132c4..3846d65 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -1609,7 +1609,7 @@ cbq_dump_class_stats(struct Qdisc *sch, unsigned long arg,
 		cl->xstats.undertime = cl->undertime - q->now;
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, &cl->bstats, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qstats) < 0)
 		return -1;
 
diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c
index 5a888af..a65604f 100644
--- a/net/sched/sch_drr.c
+++ b/net/sched/sch_drr.c
@@ -280,7 +280,7 @@ static int drr_dump_class_stats(struct Qdisc *sch, unsigned long arg,
 	}
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, &cl->bstats, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qdisc->qstats) < 0)
 		return -1;
 
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 2c5c76b..b38b39c 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -1375,7 +1375,7 @@ hfsc_dump_class_stats(struct Qdisc *sch, unsigned long arg,
 	xstats.rtwork  = cl->cl_cumul;
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, &cl->bstats, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qstats) < 0)
 		return -1;
 
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 85acab9..2e38d1a 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1105,7 +1105,7 @@ htb_dump_class_stats(struct Qdisc *sch, unsigned long arg, struct gnet_dump *d)
 	cl->xstats.ctokens = cl->ctokens;
 
 	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
-	    gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
+	    gnet_stats_copy_rate_est(d, NULL, &cl->rate_est) < 0 ||
 	    gnet_stats_copy_queue(d, &cl->qstats) < 0)
 		return -1;
 

^ permalink raw reply related

* [PATCH 1/8] connector: Keep the skb in cn_callback_data
From: Philipp Reisner @ 2009-10-02 12:40 UTC (permalink / raw)
  To: linux-kernel, netdev, Andrew Morton, David S. Miller, Greg KH
  Cc: dm-devel, Evgeniy Polyakov, linux-fbdev-devel, Philipp Reisner
In-Reply-To: <1254487211-11810-1-git-send-email-philipp.reisner@linbit.com>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
---
 drivers/connector/cn_queue.c  |    3 ++-
 drivers/connector/connector.c |   11 +++++------
 include/linux/connector.h     |    4 ++--
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/connector/cn_queue.c b/drivers/connector/cn_queue.c
index 4a1dfe1..b4cfac9 100644
--- a/drivers/connector/cn_queue.c
+++ b/drivers/connector/cn_queue.c
@@ -78,8 +78,9 @@ void cn_queue_wrapper(struct work_struct *work)
 	struct cn_callback_entry *cbq =
 		container_of(work, struct cn_callback_entry, work);
 	struct cn_callback_data *d = &cbq->data;
+	struct cn_msg *msg = NLMSG_DATA(nlmsg_hdr(d->skb));
 
-	d->callback(d->callback_priv);
+	d->callback(msg);
 
 	d->destruct_data(d->ddata);
 	d->ddata = NULL;
diff --git a/drivers/connector/connector.c b/drivers/connector/connector.c
index 74f52af..fc9887f 100644
--- a/drivers/connector/connector.c
+++ b/drivers/connector/connector.c
@@ -129,10 +129,11 @@ EXPORT_SYMBOL_GPL(cn_netlink_send);
 /*
  * Callback helper - queues work and setup destructor for given data.
  */
-static int cn_call_callback(struct cn_msg *msg, void (*destruct_data)(void *), void *data)
+static int cn_call_callback(struct sk_buff *skb, void (*destruct_data)(void *), void *data)
 {
 	struct cn_callback_entry *__cbq, *__new_cbq;
 	struct cn_dev *dev = &cdev;
+	struct cn_msg *msg = NLMSG_DATA(nlmsg_hdr(skb));
 	int err = -ENODEV;
 
 	spin_lock_bh(&dev->cbdev->queue_lock);
@@ -140,7 +141,7 @@ static int cn_call_callback(struct cn_msg *msg, void (*destruct_data)(void *), v
 		if (cn_cb_equal(&__cbq->id.id, &msg->id)) {
 			if (likely(!work_pending(&__cbq->work) &&
 					__cbq->data.ddata == NULL)) {
-				__cbq->data.callback_priv = msg;
+				__cbq->data.skb = skb;
 
 				__cbq->data.ddata = data;
 				__cbq->data.destruct_data = destruct_data;
@@ -156,7 +157,7 @@ static int cn_call_callback(struct cn_msg *msg, void (*destruct_data)(void *), v
 				__new_cbq = kzalloc(sizeof(struct cn_callback_entry), GFP_ATOMIC);
 				if (__new_cbq) {
 					d = &__new_cbq->data;
-					d->callback_priv = msg;
+					d->skb = skb;
 					d->callback = __cbq->data.callback;
 					d->ddata = data;
 					d->destruct_data = destruct_data;
@@ -191,7 +192,6 @@ static int cn_call_callback(struct cn_msg *msg, void (*destruct_data)(void *), v
  */
 static void cn_rx_skb(struct sk_buff *__skb)
 {
-	struct cn_msg *msg;
 	struct nlmsghdr *nlh;
 	int err;
 	struct sk_buff *skb;
@@ -208,8 +208,7 @@ static void cn_rx_skb(struct sk_buff *__skb)
 			return;
 		}
 
-		msg = NLMSG_DATA(nlh);
-		err = cn_call_callback(msg, (void (*)(void *))kfree_skb, skb);
+		err = cn_call_callback(skb, (void (*)(void *))kfree_skb, skb);
 		if (err < 0)
 			kfree_skb(skb);
 	}
diff --git a/include/linux/connector.h b/include/linux/connector.h
index 47ebf41..05a7a14 100644
--- a/include/linux/connector.h
+++ b/include/linux/connector.h
@@ -134,8 +134,8 @@ struct cn_callback_id {
 struct cn_callback_data {
 	void (*destruct_data) (void *);
 	void *ddata;
-	
-	void *callback_priv;
+
+	struct sk_buff *skb;
 	void (*callback) (struct cn_msg *);
 
 	void *free;
-- 
1.6.0.4

^ permalink raw reply related
