From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 22 Apr 2024 10:17:57 -0700
From: Umesh Nerlige Ramappa
To: Tvrtko Ursulin
Cc: Lucas De Marchi, Michal Wajdeczko
Subject: Re: [PATCH 0/7] drm/xe: Per client usage
List-Id: Intel Xe graphics driver
On Mon, Apr 22, 2024 at 11:40:33AM +0100, Tvrtko Ursulin wrote:
>
>On 20/04/2024 00:51, Umesh Nerlige Ramappa wrote:
>>On Fri, Apr 19, 2024 at 11:44:46AM +0100, Tvrtko Ursulin wrote:
>>>
>>>On 18/04/2024 00:19, Lucas De Marchi wrote:
>>>>On Wed, Apr 17, 2024 at 01:35:29PM GMT, Umesh Nerlige Ramappa wrote:
>>>>>On Wed, Apr 17, 2024 at 02:05:40PM -0500, Lucas De Marchi wrote:
>>>>>>On Wed, Apr 17, 2024 at 09:51:42AM GMT, Tvrtko Ursulin wrote:
>>>>>>>
>>>>>>>On 16/04/2024 19:29, Lucas De Marchi wrote:
>>>>>>>>On Tue, Apr 16, 2024 at 03:22:21PM +0100, Tvrtko Ursulin wrote:
>>>>>>>>>
>>>>>>>>>On 16/04/2024 14:51, Lucas De Marchi wrote:
>>>>>>>>>>Forgot to Cc Michal, doing now.
>>>>>>>>>>
>>>>>>>>>>Lucas De Marchi
>>>>>>>>>>
>>>>>>>>>>On Tue, Apr 16, 2024 at 08:30:33AM -0500, Lucas De Marchi wrote:
>>>>>>>>>>>On Tue, Apr 16, 2024 at 09:37:44AM +0100, Tvrtko Ursulin wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>On 16/04/2024 04:04, Lucas De Marchi wrote:
>>>>>>>>>>>>>Add per-client usage statistics to xe. This ports xe to use
>>>>>>>>>>>>>the common method in drm to export the usage to userspace per
>>>>>>>>>>>>>client (where 1 client == 1 drm fd open).
>>>>>>>>>>>>>
>>>>>>>>>>>>>However, instead of using the current format, this creates a
>>>>>>>>>>>>>new one with the unit "ticks". The intention here is not to
>>>>>>>>>>>>>mix the GPU clock domain with the CPU clock. It allows
>>>>>>>>>>>>>covering a few more use cases without extra complications.
>>>>>>>>>>>>>
>>>>>>>>>>>>>The last patch was a quick implementation of a gputop-like
>>>>>>>>>>>>>tool in Python. I ended up doing it to cross-check the gputop
>>>>>>>>>>>>>implementation.
>>>>>>>>>>>>>It's not really meant to be applied here.
>>>>>>>>>>>>>
>>>>>>>>>>>>>I tested this on DG2 and TGL with kmscube (console-only) and
>>>>>>>>>>>>>vkcube (in a gnome session), but it would be good to soak this
>>>>>>>>>>>>>under more tests. The biggest goal for this patch series right
>>>>>>>>>>>>>now is to get consensus on the new UAPI.
>>>>>>>>>>>>>
>>>>>>>>>>>>>TODO: Add documentation on top with the new interface.
>>>>>>>>>>>>
>>>>>>>>>>>>Yeah a drm-usage-stats.rst patch would be nice to have in the
>>>>>>>>>>>>RFC so one does not have to look into the driver implementation
>>>>>>>>>>>>to discuss the proposed uapi.
>>>>>>>>>>>>
>>>>>>>>>>>>Nevertheless I understand the proposal is to add this:
>>>>>>>>>>>>
>>>>>>>>>>>>drm-engine-<class>: <uint> <uint> ticks
>>>>>>>>>>>
>>>>>>>>>>>yes, the gputop patch was more explicit about this. Should have
>>>>>>>>>>>added it in the kernel patch series too.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>That's two values per key. I guess "one key value pair per one
>>>>>>>>>>>>line of text" does not get strictly broken and that you propose
>>>>>>>>>>>>a heuristic in parsing to detect that the <uint> cannot be
>>>>>>>>>>>>mis-interpreted as the unit?
>>>>>>>>>>>
>>>>>>>>>>>the current format is
>>>>>>>>>>>
>>>>>>>>>>>    drm-engine-<class>: <uint> ns
>>>>>>>>>>>
>>>>>>>>>>>the "ns" in the end should be parsed by userspace to know what
>>>>>>>>>>>it is about.
>>>>>>>>>
>>>>>>>>>Right.
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>Not sure it is a good idea though. If you instead added a new
>>>>>>>>>>>>key for the gpu time what would be the downside in your view?
>>>>>>>>>>>>Like:
>>>>>>>>>>>>
>>>>>>>>>>>>drm-engine-<class>: <uint> ticks
>>>>>>>>>>>>drm-ticks-<class>: <uint>
>>>>>>>>>>>>
>>>>>>>>>>>>Or maybe even obfuscate/generalise as:
>>>>>>>>>>>>
>>>>>>>>>>>>drm-engine-<class>: <uint> gpu-time
>>>>>>>>>>>>drm-gpu-time-<class>: <uint>
>>>>>>>>>>>
>>>>>>>>>>>I think both work, but I fail to see the advantage. This
>>>>>>>>>>>alternative is slightly heavier on the parsing side since you
>>>>>>>>>>>have to correlate the values from 2 keys, possibly dealing with
>>>>>>>>>>>them appearing in different order. The only possible breakage
>>>>>>>>>>>remains with this alternative: if userspace didn't parse the
>>>>>>>>>>>unit before. I checked nvtop and htop and they were doing the
>>>>>>>>>>>right thing. I sent a fix to igt a few weeks back for it to
>>>>>>>>>>>consider the unit:
>>>>>>>>>>>https://lore.kernel.org/igt-dev/20240405060056.59379-8-lucas.demarchi@intel.com/
>>>>>>>>>
>>>>>>>>>Advantages are that "drm-engine-something: 1234 5678 ticks" isn't
>>>>>>>>>self-explanatory (intuitively humanly readable) and that it doesn't
>>>>>>>>
>>>>>>>>maybe I have a different expectation from procfs. When I do e.g.
>>>>>>>>
>>>>>>>># cat /proc/self/stat
>>>>>>>>3861283 (cat) R 3861233 3861283 3861231 34816 3861283 4194304 90 0
>>>>>>>>0 0 0 0 0 0 20 0 1 0 1321348797 8560640 384 18446744073709551615
>>>>>>>>93979016876032 93979016892449 140720658378704 0 0 0 0 0 0 0 0 0 17
>>>>>>>>51 0 0 0 0 0 93979016907440 93979016908904 93979037196288
>>>>>>>>140720658380605 140720658380625 140720658380625 140720658382827 0
>>>>>>>>
>>>>>>>>it doesn't seem to me "intuitively humanly readable" was the first
>>>>>>>>concern for people adding files in procfs :)... I'd rather think
>>>>>>>>"machine readable" was more important.
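
As an aside, this is roughly what unit-aware parsing looks like on the
consumer side. A minimal Python sketch (a hypothetical parse_fdinfo()
helper, not code from this series or from igt), which keeps the trailing
unit so "ns" vs "ticks" is decided by the file rather than hard-coded,
and keeps all values so a two-value "ticks" key would also parse:

    def parse_fdinfo(path):
        # Parse "drm-<key>: <value> [<value>...] [<unit>]" lines from a
        # /proc/<pid>/fdinfo/<fd> file. Error handling deliberately omitted.
        stats = {}
        with open(path) as f:
            for line in f:
                key, _, rest = line.partition(":")
                fields = rest.split()
                if not key.startswith("drm-") or not fields:
                    continue
                if fields[-1].isdigit():
                    values, unit = [int(v) for v in fields], None
                else:
                    values, unit = [int(v) for v in fields[:-1]], fields[-1]
                stats[key.strip()] = (values, unit)
        return stats

    # e.g. stats["drm-engine-rcs"] -> ([12345678], "ns")
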
>>>>>>>
>>>>>>>I think you are pushing the argument a bit now :) since IMO we
>>>>>>>should evaluate the drm-usage-stats.rst proposal more in the
>>>>>>>context of drm-usage-stats and other fdinfo files, rather than the
>>>>>>>whole of procfs. In other words if there isn't a strong reason to
>>>>>>>regress this particular file lets not do it.
>>>>>>
>>>>>>:) I like pushing arguments if it helps revisit decisions (human vs
>>>>>>machine readable for things in procfs). I'm not trying to push the
>>>>>>2-counter approach though. I think other reasons like discussed
>>>>>>below are enough to consider the other keys.
>>>>>>
>>>>>>TBH I was reluctant at first to add a separate uapi rather than
>>>>>>re-using drm-engine-<class> without realizing there was already a
>>>>>>second one (not implemented in gputop).
>>>>>>
>>>>>>So AFAICS i915 and amdgpu use drm-engine-<class>. msm and panfrost
>>>>>>use drm-cycles + drm-maxfreq. And none of them seem suitable to xe.
>>>>>>
>>>>>>>
>>>>>>>>>diverge from the one value per key plus unit format. The latter
>>>>>>>>>we would then document clearly.
>>>>>>>>>
>>>>>>>>>Different keys potentially appearing in different order does not
>>>>>>>>>matter since userspace already has to handle that.
>>>>>>>>>
>>>>>>>>>>>>Potentially could also add a key saying how much wall time is
>>>>>>>>>>>>one unit of GPU time.
>>>>>>>>>>>
>>>>>>>>>>>I wouldn't add it really as it may not make sense depending on
>>>>>>>>>>>the vendor and/or usage. Examples: the gpu time may be
>>>>>>>>>>>different for different engines depending on where they are
>>>>>>>>>>>located (tile/gt). The correlation with CPU time is different
>>>>>>>>>>>when running in VF mode, and may change at runtime depending on
>>>>>>>>>>>the number of VFs. +Michal.
>>>>>>>>>
>>>>>>>>>Yes, that's why I said "potentially", which was supposed to mean
>>>>>>>>>if and where it makes sense and perhaps adds value.
>>>>>>>>>
>>>>>>>>>>>Also, if the userspace side really wants to know (why would
>>>>>>>>>>>it?) it could just be calculated from 2 samples (possibly
>>>>>>>>>>>repeated a few times as it updates the output).
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>Or.. would even the existing drm-cycles, plus abuse of
>>>>>>>>>>>>drm-maxfreq, work? Ticks == cycles, maxfreq == ticks per wall
>>>>>>>>>>>>second.
>>>>>>>>>>>
>>>>>>>>>>>I think it'd be up to the gpu vendor what clock backs this
>>>>>>>>>>>time. For the current Intel cards, it's the refclock and it
>>>>>>>>>>>doesn't vary the frequency.
>>>>>>>>>
>>>>>>>>>Right, but that doesn't matter. What I was saying is that if you
>>>>>>>>>exposed ticks in drm-cycles and tick frequency in drm-maxfreq it
>>>>>>>>>would even work, no? (Assuming support for those two was actually
>>>>>>>>>in igt_drm_fdinfo/clients/gputop and could be used as a fallback
>>>>>>>>>to time based stats.)
>>>>>>>>
>>>>>>>>oh... I was looking at the output for i915 and missed that we had
>>>>>>>>drm-cycles as currently i915 doesn't use it. It seems to be a
>>>>>>>>similar thing. I agree the drm-maxfreq-<class> is unfortunate and
>>>>>>>>that we don't actually have support for that in gputop.
>>>>>>>>
>>>>>>>>So, instead of the 2 numbers + different unit, I can adapt this to
>>>>>>>>rather use drm-cycles. However for maxfreq, it doesn't seem to be
>>>>>>>>what we need since it doesn't scale for VF. It brings back the cpu
>>>>>>>>clock domain this series is trying to avoid.
>>>>>>>>The difference is that using drm-cycles-<class> and
>>>>>>>>drm-maxfreq-<class> you are expecting the userspace to do (let me
>>>>>>>>know if I interpreted the docs wrong):
>>>>>>>>
>>>>>>>>    s1 = sample()
>>>>>>>>    sleep(period)
>>>>>>>>    s2 = sample()
>>>>>>>>    engine_utilization = ((s2.drm_cycles * s2.drm_max_freq) -
>>>>>>>>                          (s1.drm_cycles * s1.drm_max_freq)) / period
>>>>>>>>
>>>>>>>>... considering the drm_max_freq may change from one call to the
>>>>>>>>other. If we simplify it and assume it doesn't change:
>>>>>>>>
>>>>>>>>    engine_utilization = ((s2.drm_cycles - s1.drm_cycles) *
>>>>>>>>                          drm_max_freq) / period
>>>>>>>>
>>>>>>>>we'd need a different drm_max_freq reported on the VF driver that
>>>>>>>>would need to know the number of VFs enabled to scale it
>>>>>>>>correctly. Maybe this is abusing the "drm-maxfreq" a little bit?
>>>>>>>
>>>>>>>Yes it would be bad if the observed VF GPU clock were variable
>>>>>>>since maxfreq is supposed to be static.
>>>>>>>
>>>>>>>So on VFs the reported GPU clock would move by the VF "used"
>>>>>>>quanta?
>>>>>>
>>>>>>s/used/available/. That's my understanding, yes. Each VF has a
>>>>>>quanta and the gpu clock moves according to that quanta. Note that
>>>>>>as I said, this is not the case right now (we are just reading
>>>>>>RING_TIMESTAMP), but the intention is to have the UAPI side ready so
>>>>>>it's already prepared for that.
>>>>>>
>>>>>>>Where "used" is defined as time given by the GuC, not necessarily
>>>>>>>used
>>>>>>
>>>>>>s/used/available/ as above
>>>>>>
>>>>>>>GPU time. For instance with a 16ms quanta, the VF GPU clock would
>>>>>>>move by 16ms if the GuC decides not to switch out the idle VF? Or
>>>>>>>it could move by less than 16ms if it switched it out earlier.
>>>>>>
>>>>>>no, afaiu it's 16ms, not less. But the quanta depends on the number
>>>>>>of VFs enabled, which may change at runtime.
>>>>>>
>>>>>>I'm not 100% certain and people in Cc may correct me.
>>>>>>
>>>>>>>
>>>>>>>>What if we had
>>>>>>>>
>>>>>>>>drm-cycles-<class>: <uint>
>>>>>>>>drm-total-cycles-<class>: <uint>
>>>>>>>>
>>>>>>>>Then the utilization can be done:
>>>>>>>>
>>>>>>>>    s1 = sample()
>>>>>>>>    sleep(period)
>>>>>>>>    s2 = sample()
>>>>>>>>    engine_utilization = (s2.cycles - s1.cycles) / \
>>>>>>>>                 (s2.total_cycles - s1.total_cycles + 1);
>>>>>>>>
>>>>>>>>Capacity still to be added above, but we'd need to clarify if
>>>>>>>>drm-total-cycles-<class> already accounts for it.
>>>>>>>>
>>>>>>>>Here instead of the conversion to cpu clock, I'm expecting to read
>>>>>>>>"total_cycles" from HW and that being different (slower) for a VF.
>>>>>>>>AFAICS this is not the case with the current polling
>>>>>>>>implementation since we are simply reading the RING_TIMESTAMP, but
>>>>>>>>there are planned changes to get it from GuC. Umesh/Michal Cc'ed
>>>>>>>>may know better.
>>>>>>>
>>>>>>>I think this works and is clean.
>>>>>>>
>>>>>>>Although I have some doubts about the usefulness on VFs, if the
>>>>>>>clock movements are at the mercy of the GuC scheduler. Like what
>>>>>>>does 100% mean for a VF? Maybe it was a full quanta, or maybe it
>>>>>>>was half a quanta if the GuC decided to switch it out early, either
>>>>>>>due to going idle or due to some other scheduling decision.
>>>>>>
>>>>>>in the scenario you described above the quanta could change
>>>>>>according to the scheduler and 100% wouldn't mean much. That's not
>>>>>>my understanding. 100% always means the VF used all the allocated
>>>>>>time. I see this line potentially getting blurred a little bit if
>>>>>>the scheduler tries to maximize the HW usage and distribute quanta
>>>>>>unevenly, but I think the interface already contemplates that.
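
To make the two-sample calculation quoted above concrete, a small
Python sketch along those lines; parse_fdinfo() is the hypothetical
helper from the earlier sketch, and drm-cycles-<class> /
drm-total-cycles-<class> are the keys being proposed in this thread,
not an accepted uAPI:

    import time

    def engine_busy_percent(fdinfo_path, klass, period=1.0):
        # Utilisation over `period` seconds from two fdinfo samples.
        def cycles(stats, key):
            values, _unit = stats.get(key, ([0], None))
            return values[0]

        s1 = parse_fdinfo(fdinfo_path)
        time.sleep(period)
        s2 = parse_fdinfo(fdinfo_path)

        used = (cycles(s2, f"drm-cycles-{klass}") -
                cycles(s1, f"drm-cycles-{klass}"))
        total = (cycles(s2, f"drm-total-cycles-{klass}") -
                 cycles(s1, f"drm-total-cycles-{klass}"))
        return 100.0 * used / (total + 1)   # +1 guards against a zero delta

    # e.g. engine_busy_percent("/proc/1234/fdinfo/5", "rcs")

Since both counters come from the same (GPU) clock domain, no CPU-side
frequency or wall-clock conversion is needed, which is the point of the
proposal.
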
>>>>>>Another case is the VF not being able to reach 100% because the PF
>>>>>>is submitting high prio work. But I still think the current
>>>>>>interface is sufficient and it's the implementation by GuC/HW that
>>>>>>could be improved (e.g. adapting the gpu time reported).
>>>>>>
>>>>>>Michal / Umesh, please chime in if that is not accurate.
>>>>>>
>>>>>
>>>>>Irrespective of how much quanta a VF used, all calculations will be
>>>>>based on the quanta that it was allocated. That way the VF would know
>>>>>that it could have better utilized the allotted time if busyness is
>>>>>less than 100%. This does result in more than 100% usage for a VF
>>>>>that was resource hungry and scheduling policies allowed it to run
>>>>>more than the allotted quanta, but this is a known limitation of the
>>>>>solution provided by GuC. When looking at the overall system (say
>>>>>from a PF), the usage should still add up to 100%.
>>>>>
>>>>>>
>>>>>>Thinking out loud: IFF the execution quanta is available for the VF
>>>>>>to query
>>>>>
>>>>>For the VF, GuC intends to provide a factor that can be used to
>>>>>scale the wall time and deduce the VF quanta. This scaled value is
>>>>>used as the second counter in a VF.
>>>>>
>>>>>>and we are ok with just scaling drm-maxfreq, then maybe we could
>>>>>>even just use the current interface instead of adding a third one.
>>>>>>Although it could be confusing to have that freq changing.
>>>>>
>>>>>Assuming you are talking about using drm-cycles and drm-maxfreq: one
>>>>>of the concerns when supporting VFs was that we cannot actually
>>>>>export busyness in absolute time units to the user because the GPU
>>>>>is shared across VFs. If we scale the busyness such that it is
>>>>>stretched across CPU time, then it helps get the right busyness %
>>>>>relative to CPU time, but the value of busyness in time units itself
>>>>>is false. This was the primary reason to use 2 "unitless" counters.
>>>>>
>>>>>fwiu, I think by using the drm-maxfreq you are going to bring the
>>>>>same concept back into the discussion - exporting busyness in time
>>>>>units. Not sure if that's a good idea. Let me know if I got that
>>>>>wrong.
>>>>
>>>>no, but I think it would still work if we can scale the freq
>>>>according to the quanta. But that's probably abusing the interface.
>>>>
>>>>Anyway I think we are settling on
>>>>
>>>>    drm-cycles-<class>
>>>>    drm-total-cycles-<class>
>>>>
>>>>so I will start changing the patches and igt while checking this for
>>>>more feedback if any.
>>>
>>>Another option came to mind - expose a quanta ratio as a new key,
>>>given it has no security implications and it is to be used only for
>>>calculating real VF GPU utilisation. Like:
>>>
>>>drm-engine-*: <uint> ns
>>>drm-engine-time-ratio-*: <float>
>>>
>>>The unit would be a ratio of time over quanta, for instance 1000ms /
>>>100ms = 10. Which would mean scale down the reported time by 10 when
>>>comparing against wall time.
>>>
>>>The new key would only appear on VFs. Otherwise it is assumed 1.
>>>
>>>Or it could be avoided per engine and just have a single global:
>>>
>>>drm-gpu-time-ratio: <float>
>>>
>>>Am I missing something or could that work? It would have the same
>>>problems as mentioned above - "could go over 100%" is one. I mean this
>>>comment:
>>
>>I am hesitant to expose the quanta ratio at this level. We get that
>>from GuC and that interface could potentially change. If that happens,
>>I'd prefer that the uAPI is unaffected.
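
For concreteness, reading that proposal purely as a time scale factor,
the consumer-side maths stays trivial. A sketch (hypothetical helper;
the ratio key is only an idea floated in this thread):

    def vf_busy_percent(engine_ns_delta, elapsed_ns, time_ratio=1.0):
        # time_ratio would come from drm-engine-time-ratio-* (or a global
        # drm-gpu-time-ratio), assumed 1 when the key is absent (PF/native).
        # e.g. a 1000ms/100ms = 10 ratio means the VF only owned a tenth of
        # the wall clock, so divide the reported GPU time accordingly.
        return 100.0 * (engine_ns_delta / time_ratio) / elapsed_ns
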
>
>FWIW this idea was simply a time scale factor and I don't think it
>should have a connection to any GuC implementation details. In other
>words the observable result of time*scale/elapsed-time vs
>ticks/total-ticks should be the same. At least with the semantics that
>were discussed in this thread.
>

For the drm client busyness that we are discussing here, the VF
specific interface is still wip. The quanta ratio is more of an engine
busyness concept (see below) that I presume will be used here as well.

>>>
>>>"""
>>>Irrespective of how much quanta a VF used, all calculations will be
>>>based on the quanta that it was allocated.
>>>...
>>>This does result in more than 100% usage for a VF that was resource
>>>hungry and scheduling policies allowed it to run more than the
>>>allotted quanta,
>>>"""
>>>
>>>I read that as total-ticks would never be reported as above the
>>>configured quanta - always equal.
>>
>>Correct
>>
>>>Maybe I misunderstood?
>>>
>>>Second topic - are there any plans to allow the PF to monitor VF GPU
>>>utilisation? That wouldn't work via fdinfo aggregation since VF
>>>clients will not be visible in a PF. But it sounds like a basic and
>>>important use case.
>>
>>wip. Engine utilization will be available per-VF from the PF.
>
>How will it be exposed, out of curiosity?

This would be supported only at the engine level granularity. Client
level, like you mention, is not possible.

Like here, the % engine busyness is exposed as 2 counters -
engine_runtime (how long the engine ran) and total_ticks (how long the
PF/VF ran). Internally GuC provides the engine_runtime in ticks and the
KMD derives total_ticks from the quanta_ratio and elapsed_cpu_time.

That information is available to the PF as an array of functions that
can be indexed using the function index. A global busyness is also
available (aggregate of all functions).

Regarding the KMD<->UMD interface:

1) The 2 counters are exported via the PMU interface.

2) The function index will be part of the config bitmap used with
perf_event_open. I don't know yet if we will end up using the kernel
perf interface for this on Xe (separate topic though).

Regards,
Umesh

>
>Regards,
>
>Tvrtko