Date: Thu, 05 Oct 2023 08:22:30 -0700
Message-ID: <87jzs114cp.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Umesh Nerlige Ramappa
Cc: intel-xe@lists.freedesktop.org
In-Reply-To: <87o7hd1vwt.wl-ashutosh.dixit@intel.com>
References: <20230919161049.2307855-1-ashutosh.dixit@intel.com>
	<20230919161049.2307855-14-ashutosh.dixit@intel.com>
	<87o7hd1vwt.wl-ashutosh.dixit@intel.com>
Subject: Re: [Intel-xe] [PATCH 13/21] drm/xe/uapi: Multiplex PERF ops through a single PERF ioctl
List-Id: Intel Xe graphics driver

On Wed, 04 Oct 2023 22:27:14 -0700, Dixit, Ashutosh wrote:
>
> On Tue, 03 Oct 2023 19:23:24 -0700, Umesh Nerlige Ramappa wrote:
> >
>
> Hi Umesh,
>
> > On Tue, Sep 19, 2023 at 09:10:41AM -0700, Ashutosh Dixit wrote:
> > > Since we are already multiplexing multiple perf counter stream types
> > > through the PERF layer, it seems odd to retain separate ioctls for perf
> > > op's (add/remove config). In fact it seems logical to also multiplex these
> > > ops through a single PERF ioctl. This also affords greater flexibility to
> > > add stream specific ops if needed for different perf stream types.
> > >
> > > Signed-off-by: Ashutosh Dixit
> > > ---
> > >  drivers/gpu/drm/xe/xe_device.c |  5 +----
> > >  drivers/gpu/drm/xe/xe_perf.c   | 32 ++++++++------------------------
> > >  drivers/gpu/drm/xe/xe_perf.h   |  4 +---
> > >  include/uapi/drm/xe_drm.h      | 16 ++++++++++------
> > >  4 files changed, 20 insertions(+), 37 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > > index 770b9fe6e65df..24018a0801788 100644
> > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > @@ -115,10 +115,7 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
> > >  			  DRM_RENDER_ALLOW),
> > >  	DRM_IOCTL_DEF_DRV(XE_VM_MADVISE, xe_vm_madvise_ioctl, DRM_RENDER_ALLOW),
> > >
> > > -	DRM_IOCTL_DEF_DRV(XE_PERF_OPEN, xe_perf_open_ioctl, DRM_RENDER_ALLOW),
> > > -	DRM_IOCTL_DEF_DRV(XE_PERF_ADD_CONFIG, xe_perf_add_config_ioctl, DRM_RENDER_ALLOW),
> > > -	DRM_IOCTL_DEF_DRV(XE_PERF_REMOVE_CONFIG, xe_perf_remove_config_ioctl, DRM_RENDER_ALLOW),
> > > -
> > > +	DRM_IOCTL_DEF_DRV(XE_PERF, xe_perf_ioctl, DRM_RENDER_ALLOW),
> > >  };
> > >
> > >  static const struct file_operations xe_driver_fops = {
> > > diff --git a/drivers/gpu/drm/xe/xe_perf.c b/drivers/gpu/drm/xe/xe_perf.c
> > > index 0f747af59f245..f8d7eae8fffe0 100644
> > > --- a/drivers/gpu/drm/xe/xe_perf.c
> > > +++ b/drivers/gpu/drm/xe/xe_perf.c
> > > @@ -6,37 +6,21 @@
> > >  #include "xe_oa.h"
> > >  #include "xe_perf.h"
> > >
> > > -int xe_perf_open_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> > > +int xe_oa_ioctl(struct drm_device *dev, struct drm_xe_perf_param *arg, struct drm_file *file)
> > >  {
> > > -	struct drm_xe_perf_param *arg = data;
> > > -
> > > -	if (arg->extensions)
> > > -		return -EINVAL;
> > > -
> > > -	switch (arg->perf_type) {
> > > -	case XE_PERF_TYPE_OA:
> > > +	switch (arg->perf_op) {
> > > +	case XE_PERF_STREAM_OPEN:
> > >  		return xe_oa_stream_open_ioctl(dev, (void *)arg->param, file);
> >
> > It's a nice idea to reduce the
> > ioctls, but if your struct drm_xe_perf_param
> > *arg is overloaded based on the PERF_OP passed, then I would recommend
> > validating that the right arg is passed for the corresponding OP.
>
> I am not following what you mean here: which right arg for which OP?
>
> The PERF layer only demultiplexes based on perf_type (say OA/XYZ etc.). The
> perf_op belongs to the perf_type layer (say OA), not the PERF layer. It is
> the job of the perf_type layer (OA) to validate the perf_op, not the job of
> the PERF layer. It is just convenient to include the perf_op as part of
> 'struct drm_xe_perf_param' (rather than inventing yet another layer there).
> See the function xe_perf_ioctl() in the patch.
>
> The xe_oa_ioctl function above could possibly be moved into xe_oa.c. I just
> left it in xe_perf.c since it didn't seem to matter much. But I am open to
> doing that.

OK, I think I figured out the right way to visualize this. It's as
follows. Let's say we have an OA stream inside the PERF layer. So what we
have is:

	struct drm_xe_perf_param {
		perf_type;
		struct oa {
			oa_op;
			struct oa_op_params {
				...
			}
		}
	}

So basically I have eliminated 'struct oa' and merged it into 'struct
drm_xe_perf_param'. But oa_op still belongs to the OA layer, not the PERF
layer. So the OA layer handles the oa_op, not the PERF layer.

> > Ideally I wouldn't go that route since that would require some sort of
> > signature in the arg which would identify it as the correct
> > param. Instead I would be okay with retaining separate ioctls for the 3
> > operations.
>
> If we were not doing this multiplexing based on perf_type (as in i915) we
> could have separate ioctl's for each operation. But since here we have
> anyway introduced a multiplexing layer, to me it makes no sense to have
> separate operation ioctl's (only disadvantages and no advantages).
> (Note that the multiplexing layer implies a (non-obvious) additional
> copy_from_user per operation visible in the previous "drm/xe/uapi: "Perf"
> layer to support multiple perf counter stream types" patch).

The drm layer does a copy_from_user for the first layer but any second
layer structs need to be copy_from_user'd by the driver.

>
> Also we cannot assume that a future stream type will only have 3 operations
> as i915 OA did. The OPEN/ADD_CONFIG/CLOSE are really OA specific
> operations. But it appears other potential perf_type's will also be able to
> use them, at least initially that is why they are left defined as PERF_OP's
> (rather than OA_OP's) in xe_drm.h. New stream types are free to introduce
> new ops in this design.
>
> So retaining the ops inside a single PERF ioctl eliminates the need for
> introducing a new ioctl each time a stream type introduces a new OP.

Thanks.
--
Ashutosh
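
For illustration only, the layering discussed in this thread can be
sketched in plain userspace C. This is not the driver code: every name
below (perf_param, perf_ioctl, oa_ioctl, copy_in, oa_open_param, the
PERF_OP_* values and the fake fd returned on open) is a hypothetical
stand-in, and copy_in() merely imitates the second-layer copy_from_user
that the driver has to do itself. The outer layer dispatches on perf_type
only; the op is validated by the stream-type layer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; these do not mirror the real xe_drm.h layout. */
enum perf_type { PERF_TYPE_OA = 0 };
enum perf_op { PERF_OP_STREAM_OPEN = 0, PERF_OP_ADD_CONFIG, PERF_OP_REMOVE_CONFIG };

/* First-layer struct: in a driver, the DRM core copies this one in. */
struct perf_param {
	uint64_t extensions;
	uint64_t perf_type;	/* selects the stream type (outer layer) */
	uint64_t perf_op;	/* owned and validated by the stream type */
	uint64_t param;		/* "user pointer" to an op-specific struct */
};
static_assert(sizeof(struct perf_param) == 32, "uapi-style struct has fixed size");

/* Second-layer struct (hypothetical): must be copied in by the op handler. */
struct oa_open_param { uint32_t oa_unit; };

/* Userspace imitation of copy_from_user(): 0 on success, -EFAULT on NULL. */
static int copy_in(void *dst, uint64_t user_ptr, size_t len)
{
	if (!user_ptr)
		return -EFAULT;
	memcpy(dst, (const void *)(uintptr_t)user_ptr, len);
	return 0;
}

static int oa_stream_open(uint64_t user_param)
{
	struct oa_open_param p;
	int ret = copy_in(&p, user_param, sizeof(p));	/* the extra copy */

	if (ret)
		return ret;
	return 100 + (int)p.oa_unit;	/* pretend this is a new stream fd */
}

/* OA layer: the only place that knows which ops OA supports. */
static int oa_ioctl(const struct perf_param *arg)
{
	switch (arg->perf_op) {
	case PERF_OP_STREAM_OPEN:
		return oa_stream_open(arg->param);
	case PERF_OP_ADD_CONFIG:
	case PERF_OP_REMOVE_CONFIG:
		return 0;	/* config handling elided in this sketch */
	default:
		return -EINVAL;
	}
}

/* PERF layer: demultiplexes on perf_type and never inspects perf_op. */
static int perf_ioctl(const struct perf_param *arg)
{
	if (arg->extensions)
		return -EINVAL;

	switch (arg->perf_type) {
	case PERF_TYPE_OA:
		return oa_ioctl(arg);
	default:
		return -EINVAL;
	}
}
```

In this sketch a new stream type would add one case to perf_ioctl() and
bring its own op switch, so no new top-level entry point is needed when a
stream type grows a new op.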