From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52779)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1byhlP-0001pK-2l
	for qemu-devel@nongnu.org; Mon, 24 Oct 2016 12:06:15 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1byhlL-0000Ns-UO
	for qemu-devel@nongnu.org; Mon, 24 Oct 2016 12:06:15 -0400
Sender: Richard Henderson
References: <1477288523-10819-1-git-send-email-vijay.kilari@gmail.com>
	<1477288523-10819-4-git-send-email-vijay.kilari@gmail.com>
	<538a5505-fef3-fc2b-4411-bbaa8537c9c7@redhat.com>
From: Richard Henderson
Date: Mon, 24 Oct 2016 08:51:52 -0700
MIME-Version: 1.0
In-Reply-To: <538a5505-fef3-fc2b-4411-bbaa8537c9c7@redhat.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v3 3/3] utils: Add prefetch for Thunderx platform
To: Paolo Bonzini, vijay.kilari@gmail.com, qemu-arm@nongnu.org,
	peter.maydell@linaro.org
Cc: qemu-devel@nongnu.org, Vijaya Kumar K

On 10/24/2016 04:25 AM, Paolo Bonzini wrote:
>> >      for (; p + 8 <= e; p += 8) {
>> > -        __builtin_prefetch(p + 8, 0, 0);
>> > +        __builtin_prefetch(p +
>> > +               (8 * cache_line_factor * prefetch_line_dist), 0, 0);
> You should precompute cache_line_bytes * prefetch_line_dist /
> sizeof(uint64_t) in a single variable, prefetch_distance.  This saves
> the effort of loading global variables repeatedly.  Then you can do
>
>     __builtin_prefetch(p + prefetch_distance, 0, 0);
>

Let's not complicate things by dividing by sizeof(uint64_t).
It's less complicated to avoid both that and the implied multiply.

    __builtin_prefetch((char *)p + prefetch_distance, 0, 0)

r~
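
For illustration, the byte-offset form suggested above could look roughly like the sketch below. This is not the patch itself: the function names, the tunable values (128-byte lines, a distance of 2 lines) and the simplified zero-check loop are assumptions made up for the example. Only cache_line_bytes, prefetch_line_dist, prefetch_distance and the __builtin_prefetch call shape come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tunables named after the variables in the thread;
 * the values are illustrative only. */
static size_t cache_line_bytes = 128;
static size_t prefetch_line_dist = 2;

/* Prefetch distance in BYTES, computed once rather than per iteration. */
static size_t prefetch_distance;

/* Hypothetical init hook: call once before using the loop below. */
static void init_prefetch_distance(void)
{
    prefetch_distance = cache_line_bytes * prefetch_line_dist;
}

/*
 * Simplified zero-check loop (an assumption, not the real QEMU routine).
 * Returns 1 if all len bytes are zero, 0 otherwise.
 * Requires buf to be 8-byte aligned and len to be a multiple of 64.
 */
static int buffer_is_zero_sketch(const void *buf, size_t len)
{
    const uint64_t *p = buf;
    const uint64_t *e = p + len / sizeof(uint64_t);

    assert(len % (8 * sizeof(uint64_t)) == 0);

    for (; p + 8 <= e; p += 8) {
        /*
         * Casting to char * makes the offset a byte count, so there is
         * no divide by sizeof(uint64_t) when precomputing and no implied
         * multiply by 8 here. Near the end of the buffer the address may
         * lie past it, as in the loop quoted in the thread.
         */
        __builtin_prefetch((const char *)p + prefetch_distance, 0, 0);

        if (p[0] | p[1] | p[2] | p[3] | p[4] | p[5] | p[6] | p[7]) {
            return 0;
        }
    }
    return 1;
}
```

With these example values, init_prefetch_distance() sets prefetch_distance to 256 bytes, i.e. two 128-byte lines ahead of p. Keeping the distance in bytes means the same variable works regardless of the element type the loop walks over.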