From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Morton Subject: Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats Date: Fri, 30 Jun 2006 15:56:12 -0700 Message-ID: <20060630155612.36189ced.akpm@osdl.org> References: <44892610.6040001@watson.ibm.com> <449999D1.7000403@engr.sgi.com> <44999A98.8030406@engr.sgi.com> <44999F5A.2080809@watson.ibm.com> <4499D7CD.1020303@engr.sgi.com> <449C2181.6000007@watson.ibm.com> <20060623141926.b28a5fc0.akpm@osdl.org> <449C6620.1020203@engr.sgi.com> <20060623164743.c894c314.akpm@osdl.org> <449CAA78.4080902@watson.ibm.com> <20060623213912.96056b02.akpm@osdl.org> <449CD4B3.8020300@watson.ibm.com> <44A01A50.1050403@sgi.com> <20060626105548.edef4c64.akpm@osdl.org> <44A020CD.30903@watson.ibm.com> <20060626111249.7aece36e.akpm@osdl.org> <44A026ED.8080903@sgi.com> <20060626113959.839d72bc.akpm@osdl.org> <44A2F50D.8030306@engr.sgi.com> <20060628145341.529a61ab.akpm@osdl.org> <44A2FC72.9090407@engr.sgi.com> <20060629014050.d3bf0be4.pj@sgi.com> <200606291230.k5TCUg45030710@turing-police.cc.vt.edu> <20060629094408.360ac157.pj@sgi.com> <20060629110107.2e56310b.akpm@osdl.org> <44A57310.3010208@watson.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: pj@sgi.com, Valdis.Kletnieks@vt.edu, jlan@engr.sgi.com, balbir@in.ibm.com, csturtiv@sgi.com, linux-kernel@vger.kernel.org, hadi@cyberus.ca, netdev@vger.kernel.org Return-path: Received: from smtp.osdl.org ([65.172.181.4]:22684 "EHLO smtp.osdl.org") by vger.kernel.org with ESMTP id S932106AbWF3Wxc (ORCPT ); Fri, 30 Jun 2006 18:53:32 -0400 To: Shailabh Nagar In-Reply-To: <44A57310.3010208@watson.ibm.com> Sender: netdev-owner@vger.kernel.org List-Id: netdev.vger.kernel.org Shailabh Nagar wrote: > > Based on previous discussions, the above solutions can be expanded/modified to: > > a) allow userspace to listen to a group of cpus instead of all. Multiple > collection daemons can distribute the load as you pointed out. Doing collection > by cpu groups rather than individual cpus reduces the aggregation burden on > userspace (and scales better with NR_CPUS) > > b) do flow control on the kernel send side. This can involve buffering and sending > later (to handle bursty case) or dropping (to handle sustained load) as pointed out > by you, Jamal in other threads. > > c) increase receiver's socket buffer. This can and should always be done but no > involvement needed. > > > With regards to taskstats changes to handle the problem and its impact on userspace > visible changes, > > a) will change userspace > b) will be transparent. > c) is immaterial going forward (except perhaps as a change in Documentation) > > > I'm sending a patch that demonstrates how a) can be done quite simply > and a patch for b) is in progress. > > If the approach suggested in patch a) is acceptable (and I'll provide the testing, stability > results once comments on it are largely over), could taskstats acceptance in 2.6.18 go ahead > and patch b) be added later (solution outline has already been provided and a prelim patch should > be out by eod) Throwing more CPUs at the problem makes heaps of sense. It's not necessarily a userspace-incompatible change. As long as userspace sets nl_pid to 0x00000000, future kernel revisions can treat that as "all CPUs". Or userspace can be forward-compatible by setting nl_pid to 0xffff0000, or whatever.