From mboxrd@z Thu Jan 1 00:00:00 1970 From: jamal Subject: Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats Date: Thu, 29 Jun 2006 21:30:47 -0400 Message-ID: <1151631048.8922.139.camel@jzny2> References: <44892610.6040001@watson.ibm.com> <449C2181.6000007@watson.ibm.com> <20060623141926.b28a5fc0.akpm@osdl.org> <449C6620.1020203@engr.sgi.com> <20060623164743.c894c314.akpm@osdl.org> <449CAA78.4080902@watson.ibm.com> <20060623213912.96056b02.akpm@osdl.org> <449CD4B3.8020300@watson.ibm.com> <44A01A50.1050403@sgi.com> <20060626105548.edef4c64.akpm@osdl.org> <44A020CD.30903@watson.ibm.com> <20060626111249.7aece36e.akpm@osdl.org> <44A026ED.8080903@sgi.com> <20060626113959.839d72bc.akpm@osdl.org> <44A2F50D.8030306@engr.sgi.com> <20060628145341.529a61ab.akpm@osdl.org> <44A2FC72.9090407@engr.sgi.com> <20060629014050.d3bf0be4.pj@sgi.com> <200606291230.k5TCUg45030710@turing-police.cc.vt.edu> <20060629094408.360ac157.pj@sgi.com> <20060629110107.2e56310b.akpm@osdl.org> <44A425A7.2060900@watson.ibm.com> <20060629123338.0d355297.akpm@osdl.org> <44A43187.3090307@watson.ibm.com> <1151621692.8922.4.camel@jzny2> <44A47285.6060307@watson.ibm.com> <20060629180502.3987a98e.akpm@osdl.or! g> <44A47A3E.5070809@watson.ibm.com> Reply-To: hadi@cyberus.ca Mime-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: 7bit Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, csturtiv@sgi.com, balbir@in.ibm.com, jlan@engr.sgi.com, Valdis.Kletnieks@vt.edu, pj@sgi.com, Andrew Morton Return-path: Received: from mx03.cybersurf.com ([209.197.145.106]:36737 "EHLO mx03.cybersurf.com") by vger.kernel.org with ESMTP id S1751081AbWF3Baz (ORCPT ); Thu, 29 Jun 2006 21:30:55 -0400 To: Shailabh Nagar In-Reply-To: <44A47A3E.5070809@watson.ibm.com> Sender: netdev-owner@vger.kernel.org List-Id: netdev.vger.kernel.org On Thu, 2006-29-06 at 21:11 -0400, Shailabh Nagar wrote: > Andrew Morton wrote: > > >Shailabh Nagar wrote: [..] > >So if we can detect the silly sustained-high-exit-rate scenario then it > >seems to me quite legitimate to do some aggressive data reduction on that. > >Like, a single message which says "20,000 sub-millisecond-runtime tasks > >exited in the past second" or something. > > > > > The "buffering within taskstats" might be a way out then. Thats what it looks like. > As long as the user is willing to pay the price in terms of memory, You may wanna draw a line to the upper limit - maybe even allocate slab space. > we can collect the exiting task's taskstats data but not send it > immediately (taskstats_cache would grow) > unless a high water mark had been crossed. Otherwise a timer event would do the > sends of accumalated taskstats (not all at once but > iteratively if necessary). > Sounds reasonable. Thats what xfrm events do. Try to have those parameters settable because different machines or users may have different view as to what is proper - maybe even as simple as sysctl. > At task exit, despite doing a few rounds of sending of pending data, if > netlink were still reporting errors > then it would be a sign of unsustainable rate and the pending queue > could be dropped and a message like you suggest could be sent. > When you send inside the kernel - you will get an error if there's problems sending to the socket queue. So you may wanna use that info to release the kernel allocated entries or keep them for a little longer. Hopefully that helps. cheers, jamal