From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91CACC2BB1D for ; Tue, 14 Apr 2020 05:26:51 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6D9AB206E9 for ; Tue, 14 Apr 2020 05:26:51 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6D9AB206E9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id F1C7C89DE6; Tue, 14 Apr 2020 05:26:50 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id 47C1A89D7B for ; Tue, 14 Apr 2020 05:26:49 +0000 (UTC) IronPort-SDR: 7n33Xny44AlMzy2hV+xewMRbjI1MUGHZOq3eAOzyLDgmW0bxbTkYi41z87Y1xs1TEuSXLZgxhL 1rgl9qoh6VKQ== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2020 22:26:47 -0700 IronPort-SDR: quRvy4XlfcLwVJp1TAvL1tS3rsGTkQeE6Fk0n3Iiruyd9+RzBa5C3gJpaEra8/Jc7cI7Gk71RC GlM4gqgHW3ZA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.72,381,1580803200"; d="scan'208";a="241877983" Received: from adixit-mobl.amr.corp.intel.com (HELO adixit-arch.intel.com) ([10.212.8.3]) by orsmga007.jf.intel.com with ESMTP; 13 Apr 2020 22:26:45 -0700 Date: Mon, 13 Apr 2020 22:08:47 -0700 Message-ID: <87y2qy4psw.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: Umesh Nerlige Ramappa In-Reply-To: <20200413154822.11620-2-umesh.nerlige.ramappa@intel.com> References: <20200413154822.11620-1-umesh.nerlige.ramappa@intel.com> <20200413154822.11620-2-umesh.nerlige.ramappa@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL/10.8 EasyPG/1.0.0 Emacs/26 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Subject: Re: [Intel-gfx] [PATCH 1/3] drm/i915/perf: Reduce cpu overhead for blocking perf OA reads X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: intel-gfx@lists.freedesktop.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" On Mon, 13 Apr 2020 08:48:20 -0700, Umesh Nerlige Ramappa wrote: > > A condition in wait_event_interruptible seems to be checked twice before > waiting on the event to occur. These checks are redundant when hrtimer > events will call oa_buffer_check_unlocked to update the oa_buffer tail > pointers. The redundant checks add cpu overhead. Simplify the check > to reduce cpu overhead when using blocking io to read oa buffer reports. > > Signed-off-by: Umesh Nerlige Ramappa > --- > drivers/gpu/drm/i915/i915_perf.c | 28 +++++++++++++++++++++++++++- > 1 file changed, 27 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c > index 5cde3e4e7be6..e28a3ab83fde 100644 > --- a/drivers/gpu/drm/i915/i915_perf.c > +++ b/drivers/gpu/drm/i915/i915_perf.c > @@ -541,6 +541,32 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream) > return pollin; > } > > +/** > + * oa_buffer_check_reports - quick check if reports are available > + * @stream: i915 stream instance > + * > + * The return from this function is used as a condition for > + * wait_event_interruptible in blocking read. This is used to detect > + * available reports. > + * > + * A condition in wait_event_interruptible seems to be checked twice before > + * waiting on an event to occur. These checks are redundant when hrtimer events > + * will call oa_buffer_check_unlocked to update the oa_buffer tail pointers. The > + * redundant checks add cpu overhead. We simplify the check to reduce cpu > + * overhead. > + */ > +static bool oa_buffer_check_reports(struct i915_perf_stream *stream) > +{ > + unsigned long flags; > + bool available; > + > + spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags); > + available = stream->oa_buffer.tail != stream->oa_buffer.head; > + spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags); > + > + return available; > +} > + > /** > * append_oa_status - Appends a status record to a userspace read() buffer. > * @stream: An i915-perf stream opened for OA metrics > @@ -1150,7 +1176,7 @@ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream) > return -EIO; > > return wait_event_interruptible(stream->poll_wq, > - oa_buffer_check_unlocked(stream)); > + oa_buffer_check_reports(stream)); > } I think the problem with this patch is that the original code had the property that the condition for data availability is checked (and the OA tail advanced) /before/ entering the wait. So the tail was advanced and if there was data there was no need to wait at all. This change has lost that property. With this change we will first always enter the wait and then get unblocked after the timer interrupt which will advance the tail and wake us up. I think if we want to do this, it is possible to do without losing the original property. Just call oa_buffer_check_unlocked() first (outside wait_event) and if there is data just return. If not, put yourself on stream->poll_wq from which the timer callback will wake us up. I think this is going to be something like prepare_to_wait() / schedule() / finish_wait() pattern of which there are numerous examples in the kernel. So in this case we won't call wait_event_interruptible() which checks the condition upon waking up. No need to define oa_buffer_check_reports() either. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx