From: David Laight
To: 'Subhashini Rao Beerisetty', "linux-pci@vger.kernel.org", LKML, kernelnewbies
Subject: RE: latency
Date: Sat, 4 Dec 2021 14:34:04 +0000
Message-ID: <3b914a515b1d4e749e58d3b46cf12b26@AcuMS.aculab.com>

From: Subhashini Rao Beerisetty
> Sent: 03 December 2021 17:01
>
> [ Please keep me in CC as I'm not subscribed to the list]
>
> Hi all,
>
> We are using the Linux OS on an x86_64 machine. I need to measure the
> PCIe latency on my system, does kernel have any latency measurement
> module for the PCIe bus?

Slower than you expect :-)

Writes are asynchronous, so they are really only limited by the actual
speed of the PCIe link and the rate at which the slave can process them.
So the actual latency of writes doesn't matter and the throughput is
reasonable.

Reads are much more problematic.
While the PCIe bus allows multiple outstanding read requests, the Intel
x86 CPUs I've tested will only generate one outstanding request for each
cpu core.
So buffer reads are particularly slow.

The delays between one read completing and the next read TLP being sent
are (probably) negligible compared to the other delays.
So the latency of a read is just the time the two TLPs take to be
transmitted over the wire (including delays for PCIe bridges) plus the
time the slave takes to generate the response TLP.
On the fpga slaves we are using that is (from memory) about 128 cycles
of the 62.5MHz clock - ie absolutely ages.

For reads you definitely need to use the largest register size possible -
each read instruction (even misaligned ones) generates exactly one read TLP.
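
To put rough numbers on the above: a minimal sketch, assuming the quoted 128 cycles at 62.5MHz is the whole read latency (wire and bridge time are ignored) and that a core really has only one read in flight at a time. The function names are made up for illustration.

```c
#include <assert.h>

/* Time in nanoseconds taken by `cycles` ticks of a clock at `clock_hz`. */
static double cycles_to_ns(double cycles, double clock_hz)
{
    assert(clock_hz > 0.0);
    return cycles * 1e9 / clock_hz;
}

/*
 * Upper bound on read throughput (bytes/second) for one cpu core when
 * reads are strictly serialised: each read moves `bytes_per_read` bytes
 * and takes `latency_ns` before the next one can be issued.
 */
static double serial_read_bytes_per_sec(double bytes_per_read,
                                        double latency_ns)
{
    assert(latency_ns > 0.0);
    return bytes_per_read * 1e9 / latency_ns;
}
```

With those assumptions cycles_to_ns(128, 62.5e6) is 2048ns per read, so 4-byte reads top out at about 1.95MB/s per core and 8-byte reads at about 3.9MB/s, which is why the register width matters so much.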

If you are designing an interface for an fpga then consider using writes
from both sides for everything except bulk data.

You can (probably) measure the latency of your actual system using:
	x = rdtsc();
	v = readl();
	lfence;
	elapsed = rdtsc() - x;
However the TSC values depend on the current cpu frequency (which will
change 'randomly').
Or put the readl() into a loop and do enough reads that the high-res
system time deltas make sense.

	David
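
The loop variant suggested above could look something like this. It is only a sketch: it is written as user-space C rather than kernel code, so a volatile pointer read stands in for readl(), and it assumes `reg` already points at a mapped device register (for example a BAR mapped through the sysfs resource file). mean_read_ns() is a made-up name, not an existing kernel or library function.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * Average time in nanoseconds of one 32-bit read from `reg`, measured
 * over `iterations` back-to-back reads with the monotonic system clock.
 */
static double mean_read_ns(const volatile uint32_t *reg,
                           unsigned long iterations)
{
    struct timespec start, end;
    uint32_t sink = 0;

    assert(reg != NULL && iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (unsigned long i = 0; i < iterations; i++)
        sink ^= *reg;            /* volatile: one load per iteration */
    clock_gettime(CLOCK_MONOTONIC, &end);
    (void)sink;                  /* the values read are not needed */

    double ns = (double)(end.tv_sec - start.tv_sec) * 1e9 +
                (double)(end.tv_nsec - start.tv_nsec);
    return ns / (double)iterations;
}
```

Pick the iteration count so the whole loop runs for well over the clock resolution; at a couple of microseconds per read a few thousand iterations should be plenty. Pointing it at ordinary memory only measures cache hits, so the result is meaningful only for a real device mapping.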