From: Shailabh Nagar
Subject: Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats
Date: Thu, 29 Jun 2006 20:38:29 -0400
Message-ID: <44A47285.6060307@watson.ibm.com>
In-Reply-To: <1151621692.8922.4.camel@jzny2>
References: <44892610.6040001@watson.ibm.com> <4499D7CD.1020303@engr.sgi.com>
 <449C2181.6000007@watson.ibm.com> <20060623141926.b28a5fc0.akpm@osdl.org>
 <449C6620.1020203@engr.sgi.com> <20060623164743.c894c314.akpm@osdl.org>
 <449CAA78.4080902@watson.ibm.com> <20060623213912.96056b02.akpm@osdl.org>
 <449CD4B3.8020300@watson.ibm.com> <44A01A50.1050403@sgi.com>
 <20060626105548.edef4c64.akpm@osdl.org> <44A020CD.30903@watson.ibm.com>
 <20060626111249.7aece36e.akpm@osdl.org> <44A026ED.8080903@sgi.com>
 <20060626113959.839d72bc.akpm@osdl.org> <44A2F50D.8030306@engr.sgi.com>
 <20060628145341.529a61ab.akpm@osdl.org> <44A2FC72.9090407@engr.sgi.com>
 <20060629014050.d3bf0be4.pj@sgi.com>
 <200606291230.k5TCUg45030710@turing-police.cc.vt.edu>
 <20060629094408.360ac157.pj@sgi.com> <20060629110107.2e56310b.akpm@osdl.org>
 <44A425A7.2060900@watson.ibm.com> <20060629123338.0d355297.akpm@osdl.org>
 <44A43187.3090307@watson.ibm.com> <1151621692.8922.4.camel@jzny2>
To: hadi@cyberus.ca
Cc: Andrew Morton, pj@sgi.com, Valdis.Kletnieks@vt.edu, jlan@engr.sgi.com,
 balbir@in.ibm.com, csturtiv@sgi.com, linux-kernel@vger.kernel.org,
 netdev@vger.kernel.org
List-Id: netdev.vger.kernel.org

jamal wrote:

>On Thu, 2006-29-06 at 16:01 -0400, Shailabh Nagar wrote:
>
>>Jamal,
>>any thoughts on the flow control capabilities of netlink that apply here?
>>Usage of the connection is to supply statistics data to userspace.
>
>if you want reliable delivery, then you cant just depend on async events
>from the kernel -> user - which i am assuming is the way stats get
>delivered as processes exit?

Yes.

>Sorry, i dont remember the details. You
>need some synchronous scheme to ask the kernel to do a "get" or "dump".

Oh, yes. Dump is synchronous. So it won't be useful unless we buffer task
exit records within taskstats.

>Lets be clear about one thing:
>The problem really has nothing to do with gen/netlink or any other
>scheme you use;->
>It has everything to do with reliability implications and the fact
>that you need to assume memory is a finite resource - at one point
>or another you will run out of memory ;-> And of course then messages
>will be lost. So for gen/netlink, just make sure you have large socket
>buffer and you would most likely be fine.
>I havent seen how the numbers were reached: But if you say you receive
>14K exits/sec each of which is a 50B message, I would think a 1M socket
>buffer would be plenty.

The rates (or upper bounds) that are being discussed here, as of now, are
1000 exits/sec/CPU for 1024 CPU systems. That would be roughly
1M exits/sec/system * 248 bytes/message = 248 MB/sec.

>You can find out about lack of memory in netlink when you get a ENOBUFS.
>As an example, you should then do a kernel query. Clearly if you do a
>query of that sort, you may not want to find obsolete info. Therefore,
>as a suggestion, you may want to keep sequence numbers of sorts as
>markers. Perhaps keep a 32-bit field which monotonically increases per
>process exit or use the pid as the sequence number etc..
>
>As for throttling - Shailabh, I think we talked about this:
>- You could maintain info using some thresholds and timer. Then
>when a timer expires or threshold is exceeded send to user space.

Hmm.
So we could buffer the per-task exit data within taskstats (the memory
consumption would grow, but that's probably not a problem) and then send
it out later.

Jay - would not getting exit data soon after exit be a problem for CSA?
I'm guessing not, if the timeout is kept small enough.

Internally, taskstats could always pace its sends so that "too much" isn't
sent out at one shot.

--Shailabh
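
P.S. To make the threshold-and-timer idea concrete, here is a minimal
userspace sketch of it, combined with the per-exit sequence numbers jamal
suggested as loss markers. It is in Python purely for illustration
(taskstats itself is kernel C), and every name in it (ExitRecordBuffer,
max_records, max_delay, tick, missing_sequences) is hypothetical and not
part of any existing taskstats or netlink interface.

```python
import time

class ExitRecordBuffer:
    """Hypothetical buffer: batch exit records, flush on threshold or timeout."""

    def __init__(self, send, max_records=64, max_delay=0.5, clock=time.monotonic):
        self._send = send                # callback that receives one batch (a list)
        self._max_records = max_records  # threshold: flush when this many are pending
        self._max_delay = max_delay      # timer: max seconds the oldest record may wait
        self._clock = clock              # injectable clock (assumption, eases testing)
        self._pending = []
        self._oldest = None              # arrival time of the oldest pending record
        self._next_seq = 0               # increases by one per recorded exit

    def add(self, payload):
        """Buffer one exit record; return the sequence number assigned to it."""
        seq = self._next_seq
        self._next_seq += 1
        if not self._pending:
            self._oldest = self._clock()
        self._pending.append((seq, payload))
        if len(self._pending) >= self._max_records:
            self.flush()
        return seq

    def tick(self):
        """Call periodically; flushes if the oldest record has waited too long."""
        if self._pending and self._clock() - self._oldest >= self._max_delay:
            self.flush()

    def flush(self):
        """Hand all pending records to the send callback as one batch."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._oldest = None
            self._send(batch)


def missing_sequences(last_seen, batch):
    """Receiver side: list sequence numbers skipped since last_seen."""
    missing = []
    expected = last_seen + 1
    for seq, _payload in batch:
        missing.extend(range(expected, seq))
        expected = seq + 1
    return missing
```

A receiver that sees a non-empty result from missing_sequences() knows
records were dropped (for instance after an ENOBUFS) and can fall back to
a synchronous query. Keeping max_delay small bounds how late exit data
reaches a consumer, which is the concern raised above for CSA.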