From: "Mousa, Anas via lttng-dev"
Reply-To: "Mousa, Anas"
To: "lttng-dev@lists.lttng.org"
Date: Tue, 20 Jun 2023 10:21:10 +0000
Message-ID: <1255477015ae4da8a6523b5d204334da@amazon.com>
Subject: [lttng-dev] (no subject)
List-Id: LTTng development list

Hello,
I've recently profiled the latency of LTTng tracepoints on Arm platforms,
using the following sample program:
----------------------------------------------------------------------------
#include <inttypes.h>       /* PRIu64 */
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <urcu/compiler.h>  /* caa_unlikely() */
/* The tracepoint provider header for hello_world:my_first_tracepoint is also needed. */

/* Current CLOCK_MONOTONIC time in nanoseconds (0 if the clock read fails). */
static inline uint64_t get_time_nsec(void)
{
    struct timespec ts;

    if (caa_unlikely(clock_gettime(CLOCK_MONOTONIC, &ts))) {
        ts.tv_sec = 0;
        ts.tv_nsec = 0;
    }
    return ((uint64_t) ts.tv_sec * 1000000000ULL) + ts.tv_nsec;
}

int main(int argc, char *argv[])
{
    unsigned int i;
    int tp_num = 0;             /* number of tracepoints to emit, from argv[1] */
    uint64_t total_time = 0;
    uint64_t now, nowz;

    if (argc > 1) {
        sscanf(argv[1], "%d", &tp_num);
    }
    for (i = 0; i < tp_num; i++) {
        /* Time a single tracepoint call and accumulate the result. */
        now = get_time_nsec();
        lttng_ust_tracepoint(hello_world, my_first_tracepoint,
                             i, "some_str");
        nowz = get_time_nsec();
        total_time += (nowz - now);
    }
    if (tp_num) {
        printf("---------------------------Average TP time is %"PRIu64"---------------------------\n", total_time / tp_num);
    }
}
----------------------------------------------------------------------------
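One thing worth keeping in mind when reading these numbers: each interval measured by the loop above also includes the cost of roughly one clock_gettime() call, and that cost may itself differ between platforms. Below is a minimal sketch for estimating that baseline. It assumes the same CLOCK_MONOTONIC helper (written without caa_unlikely() so it builds without liburcu); timer_overhead_nsec() is a hypothetical helper name, not part of LTTng.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdint.h>
#include <time.h>

/* Same clock helper as in the sample program, minus caa_unlikely(). */
static inline uint64_t get_time_nsec(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts)) {
        ts.tv_sec = 0;
        ts.tv_nsec = 0;
    }
    return ((uint64_t) ts.tv_sec * 1000000000ULL) + ts.tv_nsec;
}

/*
 * Hypothetical helper: average length, in nanoseconds, of an empty
 * now/nowz interval over `iterations` rounds (no tracepoint in between).
 * Returns 0 when iterations is 0.
 */
static uint64_t timer_overhead_nsec(unsigned int iterations)
{
    uint64_t total = 0;
    unsigned int i;

    for (i = 0; i < iterations; i++) {
        uint64_t now = get_time_nsec();
        uint64_t nowz = get_time_nsec();

        total += nowz - now;
    }
    return iterations ? total / iterations : 0;
}
```

Subtracting the value returned by timer_overhead_nsec() from the reported average would give a rough estimate of the tracepoint cost alone on each platform.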

I observed a large difference in average latency between platforms when tracing a high number (many thousands to millions) of tracepoints:
  • [platform 1] with the following CPU info, running a Linux kernel based on Buildroot (4.19.273 aarch64 GNU/Linux):
BogoMIPS	: 187.50
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 3
      • Saw an average latency of 2-3 usec
  • [platform 2] with the following CPU info, running a Linux kernel based on Amazon Linux (4.14.294-220.533.amzn2.aarch64 aarch64 GNU/Linux):
BogoMIPS	: 243.75
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x3
CPU part	: 0xd0c
CPU revision	: 1
      • Saw an average latency of ~0.5 usec
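For reference, the CPU implementer and CPU part fields above identify the cores. The following is a small lookup sketch, assuming the part numbers published by Arm (0xd08 for Cortex-A72, 0xd0c for Neoverse N1, implementer 0x41 for Arm). The table and the arm_part_name() helper are illustrative only and cover just the two parts in this report.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per (implementer, part) pair as printed in /proc/cpuinfo. */
struct arm_part {
    uint8_t implementer;
    uint16_t part;
    const char *name;
};

/* Illustrative table: only the two cores seen in this report. */
static const struct arm_part known_parts[] = {
    { 0x41, 0xd08, "Cortex-A72" },   /* platform 1 */
    { 0x41, 0xd0c, "Neoverse N1" },  /* platform 2 */
};

/* Hypothetical helper: map the cpuinfo fields to a core name, or "unknown". */
static const char *arm_part_name(unsigned int implementer, unsigned int part)
{
    size_t i;

    for (i = 0; i < sizeof(known_parts) / sizeof(known_parts[0]); i++) {
        if (known_parts[i].implementer == implementer &&
            known_parts[i].part == part)
            return known_parts[i].name;
    }
    return "unknown";
}
```

Under these assumptions, platform 1 would be a Cortex-A72 and platform 2 a Neoverse N1, so the two measurements come from different core generations as well as different kernels.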

Are there any suggestions for root-causing the high latency on platform 1 and potentially improving it?

Thanks and best regards,
Anas.

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev