From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bruce Richardson
Subject: Re: librte_pmd_ixgbe implementation of ixgbe_dev_rx_queue_count
Date: Tue, 29 Mar 2016 10:31:19 +0100
Message-ID: <20160329093119.GC17800@bricha3-MOBL3>
To: Mohammad El-Shabani
Cc: dev@dpdk.org

On Mon, Mar 28, 2016 at 06:45:26PM -0700, Mohammad El-Shabani wrote:
> Hi,
> Looking into why it hurts performance, I see that ixgbe_dev_rx_queue_count
> is implemented a scan of elements of rx descriptors, which is very
> expensive. I am wondering why its implemented the way it is. Could it not
> just read the head location from the driver?
>
> Thanks!
> Mohammad El-Shabani

It's likely that reading the head location from the driver will be even
slower than scanning the descriptor rings in memory. Accessing a device over
PCI is much slower than accessing memory, especially since on platforms with
DDIO, many memory accesses will actually be cache reads.

That being said, I haven't actually written a test to prove this out, so feel
free to try out the head pointer read method instead and see if it improves
things. The results may vary depending on how far ahead the scan has to go,
but certainly for the empty ring case, the descriptor scan method will be far
faster than a head read.

Regards,
/Bruce
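
To make the two approaches discussed above concrete, here is a minimal sketch in C. It is not the ixgbe driver's code: the descriptor layout, the DESC_STAT_DD bit value, and both function names are assumptions chosen for illustration. The scan version reads only host memory, so an empty ring costs a single read of the descriptor at the tail. The head version stands in for a device register read; in a real driver the `head` value would have to be fetched across PCI, which is the cost the reply above is concerned about.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and layout throughout; not taken from the ixgbe PMD. */
#define DESC_STAT_DD 0x1u /* "descriptor done" bit, set by the NIC on write-back */

struct rx_desc {
    uint32_t status; /* written back by the NIC, lives in host memory */
};

/*
 * Count completed descriptors by scanning host memory, starting at the
 * software tail and stopping at the first descriptor without the DD bit.
 * On an empty ring the loop exits after one memory read.
 */
static uint16_t
rx_count_by_scan(const volatile struct rx_desc *ring, uint16_t nb_desc,
                 uint16_t tail)
{
    uint16_t count = 0;
    uint16_t idx = tail;

    assert(nb_desc > 0 && tail < nb_desc);
    while (count < nb_desc && (ring[idx].status & DESC_STAT_DD)) {
        count++;
        idx = (uint16_t)(idx + 1 == nb_desc ? 0 : idx + 1); /* wrap around */
    }
    return count;
}

/*
 * Derive the same count from a hardware head index. Here `head` is a plain
 * parameter; in a real driver it would come from a register read over PCI.
 * This sketch treats head == tail as an empty ring.
 */
static uint16_t
rx_count_by_head(uint16_t head, uint16_t nb_desc, uint16_t tail)
{
    assert(nb_desc > 0 && head < nb_desc && tail < nb_desc);
    return (uint16_t)(head >= tail ? head - tail : nb_desc - tail + head);
}
```

Timing these two functions against a real device, rather than against a simulated ring, is what would settle the question; the sketch only shows that both methods can produce the same count and where each one gets its data from.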