From mboxrd@z Thu Jan  1 00:00:00 1970
From: Olivier Matz <olivier.matz@6wind.com>
Subject: Re: [RFC 0/8] mbuf: structure reorganization
Date: Fri, 17 Feb 2017 15:17:08 +0100
Message-ID: <20170217151708.20bf4a49@platinum>
References: <1485271173-13408-1-git-send-email-olivier.matz@6wind.com>
 <2601191342CEEE43887BDE71AB9772583F111A29@irsmsx105.ger.corp.intel.com>
 <20170216144807.7add2c71@platinum>
 <CALe+Z00Y_=4rsAjTeyaEzKFqtuMb6HMRUEUDL2LveJx6bWL-dA@mail.gmail.com>
 <20170217115153.0afeb061@platinum>
 <CALe+Z03QtPVmZ39bCaqGvmdsSipKftA_sZqc5K2F1WDjkEUrsg@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "dev@dpdk.org"
 <dev@dpdk.org>
To: Jan Blunck <jblunck@infradead.org>
Return-path: <dev-bounces@dpdk.org>
Received: from mail-wm0-f45.google.com (mail-wm0-f45.google.com [74.125.82.45])
 by dpdk.org (Postfix) with ESMTP id 272363B5
 for <dev@dpdk.org>; Fri, 17 Feb 2017 15:17:11 +0100 (CET)
Received: by mail-wm0-f45.google.com with SMTP id r141so11146900wmg.1
 for <dev@dpdk.org>; Fri, 17 Feb 2017 06:17:11 -0800 (PST)
In-Reply-To: <CALe+Z03QtPVmZ39bCaqGvmdsSipKftA_sZqc5K2F1WDjkEUrsg@mail.gmail.com>

Hi Jan,

On Fri, 17 Feb 2017 14:38:32 +0100, Jan Blunck <jblunck@infradead.org>
wrote:
> On Fri, Feb 17, 2017 at 11:51 AM, Olivier Matz
> <olivier.matz@6wind.com> wrote:
> > Hi Jan,
> >
> > On Thu, 16 Feb 2017 18:26:39 +0100, Jan Blunck
> > <jblunck@infradead.org> wrote:  
> >> On Thu, Feb 16, 2017 at 2:48 PM, Olivier Matz
> >> <olivier.matz@6wind.com> wrote:  
> >> > On Mon, 6 Feb 2017 18:41:27 +0000, "Ananyev, Konstantin"
> >> > <konstantin.ananyev@intel.com> wrote:  
> >> >> >
> >> >> > The main changes are:
> >> >> > - reorder structure to increase vector performance on some
> >> >> > non-ia platforms.
> >> >> > - add a 64bits timestamp field in the 1st cache line  
> >> >>
> >> >> Wonder why it deserves to be in first cache line?
> >> >> How it differs from seqn below (pure SW stuff right now).  
> >> >
> >> > In case the timestamp is set from a NIC value, it is set in the
> >> > Rx path. That's why I think it deserves to be located in the
> >> > 1st cache line.
> >> >
> >> > As you said, the seqn is pure sw stuff: it is set in a lib,
> >> > not in a PMD rx path.
> >> >  
> >>
> >> If we talk about setting the timestamp value in the RX path this
> >> implicitly means software timestamps. Hardware timestamping usually
> >> works by letting the hardware inject sync events for coarse time
> >> tracking and additionally injecting fine granular per-packet ticks
> >> at a specific offset in the packet. For performance reasons I
> >> don't think it makes sense to extract this during the burst and
> >> write it into the mbuf again.  
> >
> > From what I've understood, at least it does not work like this for
> > Mellanox NICs: the timestamp is metadata attached to an rx packet.
> > But maybe they (and other NIC vendors interested in the feature) can
> > confirm or not.
> >  
> 
> Mellanox NICs use a 48bit cycle counter split into a high and low
> part. To convert the cycle values into a timestamp you need to
> initialize and maintain a timecounter that shifts the cycle count to
> e.g. nanosecs. IIRC Mellanox doesn't generate explicit clock events
> but the cycle counter is large enough so that the user can easily
> maintain the timecounter by manually updating it.
> 
> >>
> >> The problem with timestamps is to get the abstraction right wrt the
> >> correction factors and the size of the tick vs. the timestamp in
> >> the events injected. From my perspective it would be better to
> >> extract the handling of timestamp data into a library with PMD
> >> specific implementation of the conversions. That way the
> >> normalized timestamp values can get extracted if they are present.
> >> The mbuf itself would only indicate the presence of timestamp
> >> metadata in that case.  
> >
> > I agree however that we need to properly define the meaning of this
> > field. My idea is:
> >
> > - the timestamp is in nanoseconds
> > - the reference is always the same for a given path: if the
> > timestamp is set in a PMD, all the packets for this PMD will have
> > the same reference, but for 2 different PMDs (or a sw lib), the
> > reference would not be the same.
> >
> > I think it's enough for many use cases.
> > We can later add helpers to compare timestamps with different
> > references.  
> 
> My point is that I still doubt that it belongs in the first
> cacheline. It requires accessing other structures for converting into
> nanoseconds anyway. Ideally I would like to see this happen on
> access instead, or, if that isn't achievable, at least in a second step.

Sorry, I don't really get your point. My understanding of how a PMD
would use the timestamp is as follows:

/* pseudocode */
rx_burst(struct rxq *rxq, ...)
{
	/* per-queue values, read once before the loop */
	unsigned long factor = rxq->timestamp_factor;
	unsigned port = rxq->port;

	for each hw_desc {
		m = rte_pktmbuf_alloc(rxq->pool);
		m->len = hw_desc->len;
		m->port = port;
		m->ol_flags =
		...
		/* scale the raw hw value to nanoseconds */
		m->timestamp = hw_desc->timestamp * factor;
	}
	...
}

In that case the timestamp is written in the Rx path, together with the
other fields the PMD fills in, so I think it deserves to be in the 1st
cache line.
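
To make this concrete, here is a minimal compilable sketch of the same
idea. All names in it (sketch_mbuf, sketch_hw_desc, sketch_rxq,
SKETCH_RX_TIMESTAMP, the 64-byte cache line) are hypothetical and only
for illustration: it is not the real rte_mbuf layout nor any existing
PMD, and it assumes the conversion is a plain multiplication by a
per-queue factor. The static assertion checks that the timestamp ends
within the first cache line, like the other fields written in the loop.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* assumed cache line size */

/* Hypothetical, simplified stand-in for struct rte_mbuf. */
struct sketch_mbuf {
	void *buf_addr;
	uint16_t data_len;
	uint16_t port;
	uint64_t ol_flags;
	uint64_t timestamp;   /* nanoseconds, reference is PMD-specific */
};

/* Everything the Rx loop writes should sit in the first cache line. */
static_assert(offsetof(struct sketch_mbuf, timestamp) + sizeof(uint64_t)
	      <= CACHE_LINE_SIZE, "timestamp not in first cache line");

/* Hypothetical hardware Rx descriptor. */
struct sketch_hw_desc {
	uint16_t len;
	uint64_t timestamp;   /* raw device ticks */
};

/* Hypothetical Rx queue state. */
struct sketch_rxq {
	uint64_t timestamp_factor;  /* assumed: nanoseconds per tick */
	uint16_t port;
};

#define SKETCH_RX_TIMESTAMP (1ULL << 0)  /* hypothetical ol_flags bit */

/* Fill n mbufs from n descriptors; returns the number of packets. */
static unsigned int
sketch_rx_burst(const struct sketch_rxq *rxq,
		const struct sketch_hw_desc *descs,
		struct sketch_mbuf *mbufs, unsigned int n)
{
	/* per-queue values, read once before the loop */
	const uint64_t factor = rxq->timestamp_factor;
	const uint16_t port = rxq->port;
	unsigned int i;

	for (i = 0; i < n; i++) {
		struct sketch_mbuf *m = &mbufs[i];

		m->data_len = descs[i].len;
		m->port = port;
		m->ol_flags = SKETCH_RX_TIMESTAMP;
		/* scale raw ticks to nanoseconds */
		m->timestamp = descs[i].timestamp * factor;
	}
	return i;
}
```

With a factor of 4 ns per tick, a descriptor carrying 10 ticks gives
m->timestamp == 40. A device needing more state than one factor (for
example to handle counter wrap-around) would need more than this sketch
shows.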


Olivier