From: Michael Kluge <Michael.Kluge@tu-dresden.de>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] Lustre RPC visualization
Date: Mon, 17 May 2010 07:53:22 +0200	[thread overview]
Message-ID: <1274075602.9095.86.camel@radar> (raw)
In-Reply-To: <AANLkTim7d1u1gJXSwCWxAc7eE8N7iKnKgrGWfasK0nyL@mail.gmail.com>

Hi Andrew,

unfortunately no. We don't own a Cray :( 


Regards, Michael


On Sunday, 16.05.2010, 20:24 -0700, Andrew Uselton wrote:
> I think this work is very interesting.  Will anyone be at CUG 2010
> next week to discuss? 
> Cheers,
> Andrew
> 
> 
> 2010/5/16 Michael Kluge <Michael.Kluge@tu-dresden.de>
>         Hi WangDi,
>         
>         the first version works. Screenshot is attached. I have a
>         couple of counters implemented: RPC's in flight and RPC's
>         completed in total on the client, RPC's enqueued, RPC's in
>         processing and RPC's completed in total on the server. All
>         these counters can be broken down by the type of RPC (op code).
>         The picture does not yet have the lines that show each single
>         RPC, I still have to add counters like "avg. time to complete
>         an RPC over the last second" and there are some more TODO's,
>         like the timer synchronization. (In the screenshot the first
>         and the last counter show total values while the one in the
>         middle shows a rate.)
>         
>         What I would like to have is a complete set of traces from a
>         small cluster (<100 nodes) including the servers. Would that
>         be possible?
>         
>         Will one of you be in Hamburg May 31 - June 3 for ISC'2010?
>         I'll be there and would like to talk about what would be
>         useful for the next steps.
>         
>         
>         
>         Regards, Michael
>         
>         On 03.05.2010 21:52, di.wang wrote:
>         
>                 Michael Kluge wrote: 
>                 
>                 
>                                         One more question: RPC 1334380768266400
>                                         (in the log WangDi sent me) has on the
>                                         client side only a "Sending RPC" message,
>                                         thus missing the "Completed RPC". The
>                                         server has all three (received, start
>                                         work, done work). Has this RPC vanished
>                                         on the way back to the client? There is
>                                         no further indication of what happened.
>                                         The last timestamp in the client log is:
>                                         1272565368.228628
>                                         and the server says it finished the
>                                         processing of the request at:
>                                         1272565281.379471
>                                         So the client log has been recorded long
>                                         enough to contain the "Completed RPC"
>                                         message for this RPC if it ever arrived ...
>                                 Logically, yes. But in some cases,
>                                 some debug logs might be dropped for
>                                 some reason (actually, this happens
>                                 fairly often), so you probably need to
>                                 maintain an average time from server
>                                 "Handled RPC" to client "Completed
>                                 RPC" and then just estimate the client
>                                 "Completed RPC" time in this case.
>                         
>                         Oh my gosh ;) I don't want to start
>                         speculations about the helpfulness
>                         of incomplete debug logs. Anyway, what can get
>                         lost? Any kind of
>                         message on the servers and clients? I think
>                         I'd like to know what
>                         cases have to be handled while I try to track
>                         individual RPC's on
>                         their way.
>                 Any records can get lost here. Unfortunately, there
>                 are no messages indicating that records went
>                 missing. :(
>                 (Usually, I would check the timestamps in the log,
>                 i.e. look for no records for a "long" time, for
>                 example several seconds, but this is not an accurate
>                 way.)
>                 
>                 I guess you can just ignore these incomplete records
>                 in your first step? Let's see how these incomplete
>                 logs impact the profiling result, then we can decide
>                 how to deal with this.
>                 
>                 Thanks
>                 Wangdi
>                         
>                         Regards, Michael
>                         _______________________________________________
>                         Lustre-devel mailing list
>                         Lustre-devel at lists.lustre.org
>                         http://lists.lustre.org/mailman/listinfo/lustre-devel
>                 
>                 
>                 
>         
>         
>         -- 
>         Michael Kluge, M.Sc.
>         
>         Technische Universität Dresden
>         Center for Information Services and
>         High Performance Computing (ZIH)
>         D-01062 Dresden
>         Germany
>         
>         Contact:
>         Willersbau, Room WIL A 208
>         Phone:  (+49) 351 463-34217
>         Fax:    (+49) 351 463-37773
>         e-mail: michael.kluge at tu-dresden.de
>         
>         
>         WWW:    http://www.tu-dresden.de/zih
>         
>         
>         
> 
> 
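
For reference, the two workarounds WangDi suggests in the quoted part (estimating a lost client "Completed RPC" time from the average server-to-client delay, and looking for "long" silent stretches in a log) could be sketched roughly as below. This is only an illustration under assumptions: the function names, the dict-based inputs keyed by RPC id, and the 2-second gap threshold are hypothetical and not taken from the actual tool or from the Lustre debug log format.

```python
from statistics import mean


def estimate_missing_completions(server_done, client_completed):
    """Fill in client "Completed RPC" times that are missing from the log.

    server_done:      {rpc_id: server timestamp when processing finished}
    client_completed: {rpc_id: client timestamp of "Completed RPC"}
    Returns a new dict that additionally holds an estimated time for
    every RPC the server finished but the client never logged.
    """
    # Delay from server "done" to client "completed", measured only on
    # RPCs that were recorded on both sides.
    delays = [client_completed[rpc_id] - done
              for rpc_id, done in server_done.items()
              if rpc_id in client_completed]
    filled = dict(client_completed)
    if not delays:
        return filled  # nothing to base an estimate on
    avg_delay = mean(delays)
    for rpc_id, done in server_done.items():
        filled.setdefault(rpc_id, done + avg_delay)
    return filled


def find_gaps(timestamps, threshold=2.0):
    """Return (start, end) pairs where no record was logged for longer
    than `threshold` seconds (assumed value, a hint that records may
    have been lost, not proof)."""
    ts = sorted(timestamps)
    return [(a, b) for a, b in zip(ts, ts[1:]) if b - a > threshold]


# Made-up example data: RPC 3 has no "Completed RPC" on the client.
server_done = {1: 10.0, 2: 20.0, 3: 30.0}
client_completed = {1: 10.5, 2: 20.25}
filled = estimate_missing_completions(server_done, client_completed)
gaps = find_gaps([0.0, 0.5, 5.0, 5.25])
```

As long as the clock offset between client and server stays roughly constant, the unsynchronized timers do not hurt this estimate: the offset is contained in the measured average delay and is added back to the server timestamp.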

-- 

Michael Kluge, M.Sc.

Technische Universität Dresden
Center for Information Services and
High Performance Computing (ZIH)
D-01062 Dresden
Germany

Contact:
Willersbau, Room A 208
Phone:  (+49) 351 463-34217
Fax:    (+49) 351 463-37773
e-mail: michael.kluge at tu-dresden.de
WWW:    http://www.tu-dresden.de/zih
