From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jonathan David
Subject: Re: [RFC] e1000e: Add delays after writing to registers
Date: Tue, 3 Nov 2015 11:43:21 -0600
Message-ID: <5638F239.1030804@ni.com>
References: <1445465268-10347-1-git-send-email-jonathan.david@ni.com> <20151022055909.GA7263@icarus.home.austad.us>
Mime-Version: 1.0
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
To: Henrik Austad
Received: from mail-bl2on0140.outbound.protection.outlook.com ([65.55.169.140]:29472 "EHLO na01-bl2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1755213AbbKCR56 (ORCPT ); Tue, 3 Nov 2015 12:57:58 -0500
In-Reply-To: <20151022055909.GA7263@icarus.home.austad.us>
Sender: linux-rt-users-owner@vger.kernel.org
List-ID:

On 10/22/2015 12:59 AM, Henrik Austad wrote:
> On Wed, Oct 21, 2015 at 05:07:48PM -0500, Jonathan David wrote:
>> There is a noticeable impact on determinism when a large number of
>> writes are flushed. Writes to the hardware registers are sent across
>> the PCI bus and take a significant amount of time to complete after
>> a flush, which causes high priority tasks (including interrupts) to
>> be delayed.
>
> Do you see this in the entire system, or on the core where the write was
> triggered?

Only on the core where the writes are issued.

>> Adding a delay after long series of writes gives them time to
>> complete, and for higher priority tasks to run unimpeded.
>
> Aren't we running with threaded interrupts?
>
> What happens to the thread(s) pushing data to the network?
> What about xmit-buffer once it is full? Which thread will block on send or
> have its sk_buff dropped?

All of this is totally irrelevant to the problem we are seeing. The e1000x
driver itself is not responsible for the delay here. The issue is with
PCI where issuing a large number of MMIO writes followed by a read (to
force said writes to execute) will stall the CPU.
When the CPU is stalled, no interrupts are serviced, including the local
apic timer interrupt, which was responsible for waking up cyclictest.
This behavior was observed within traces gathered from cyclictest with
ftrace enabled.

> I'm not sure if adding random delay and giving an unpredictable impact on
> completely random threads is the best way to solve this..

Agreed, we know that this is a hack. Do you have any better solutions?

- JD