From mboxrd@z Thu Jan 1 00:00:00 1970 From: Suravee Suthikulpanit Subject: Re: [PATCH V4 5/6] perf/amd/iommu: Enable support for multiple IOMMUs Date: Mon, 22 Feb 2016 15:00:31 +0700 Message-ID: <56CAC01F.8090800@amd.com> References: <1455182127-17551-1-git-send-email-Suravee.Suthikulpanit@amd.com> <1455182127-17551-6-git-send-email-Suravee.Suthikulpanit@amd.com> <20160218131853.GU6357@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20160218131853.GU6357-ndre7Fmf5hadTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org Errors-To: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org To: Peter Zijlstra Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, acme-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org, andihartmann-KuiJ5kEpwI6ELgA04lAiVw@public.gmane.org, mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org, bp-Gina5bIWoIWzQB+pC5nmwQ@public.gmane.org List-Id: iommu@lists.linux-foundation.org Hi Peter, On 02/18/2016 08:18 PM, Peter Zijlstra wrote: > On Thu, Feb 11, 2016 at 04:15:26PM +0700, Suravee Suthikulpanit wrote: >> static void perf_iommu_read(struct perf_event *event) >> { >> + int i; >> u64 delta = 0ULL; >> struct hw_perf_event *hwc = &event->hw; >> + struct perf_amd_iommu *perf_iommu = container_of(event->pmu, >> + struct perf_amd_iommu, >> + pmu); >> >> + if (amd_iommu_pc_get_counters(_GET_BANK(event), _GET_CNTR(event), >> + amd_iommu_get_num_iommus(), >> + perf_iommu_cnts)) >> return; >> >> + /* >> + * Now we re-aggregating event counts and prev-counts >> + * from all IOMMUs >> + */ >> + local64_set(&hwc->prev_count, 0); >> + >> + for (i = 0; i < amd_iommu_get_num_iommus(); i++) { >> + int indx = get_iommu_bnk_cnt_evt_idx(perf_iommu, i, >> + _GET_BANK(event), >> + _GET_CNTR(event)); >> + u64 prev_raw_count = local64_read(&perf_iommu->prev_cnts[indx]); >> + >> + /* IOMMU pc counter register is only 48 bits */ >> + perf_iommu_cnts[i] &= GENMASK_ULL(48, 0); >> + >> + /* >> + * Since we do not enable counter overflow interrupts, >> + * we do not have to worry about prev_count changing on us >> + */ >> + local64_set(&perf_iommu->prev_cnts[indx], perf_iommu_cnts[i]); >> + local64_add(prev_raw_count, &hwc->prev_count); >> + >> + /* Handle 48-bit counter overflow */ >> + delta = (perf_iommu_cnts[i] << COUNTER_SHIFT) - (prev_raw_count << COUNTER_SHIFT); >> + delta >>= COUNTER_SHIFT; >> + local64_add(delta, &event->count); >> + } >> } > > So I really don't have time to review new muck while I'm hunting perf > core fail, but Boris made me look at this. > > This is crazy, if you have multiple IOMMUs then create an event per > IOMMU, do _NOT_ fold them all into a single event. These are system-wide events, which are programmed on every IOMMU the same way. I am not sure what you meant by creating an event per IOMMU. Do you mean I should create internal per-IOMMU struct perf_event for each event? Could you please give me some pointers? > In any case, the reason Boris asked me to look at this is that your > overflow handling is broken, you want delta to be s64. Otherwise: > > delta >>= COUNTER_SHIFT; > > ends up as a SHR and you loose the MSB sign bits. Ah. Sorry, I missed that. Thanks, Suravee From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753837AbcBVIAw (ORCPT ); Mon, 22 Feb 2016 03:00:52 -0500 Received: from mail-by2on0099.outbound.protection.outlook.com ([207.46.100.99]:30647 "EHLO na01-by2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1753471AbcBVIAt (ORCPT ); Mon, 22 Feb 2016 03:00:49 -0500 Authentication-Results: lists.linux-foundation.org; dkim=none (message not signed) header.d=none;lists.linux-foundation.org; dmarc=none action=none header.from=amd.com; Subject: Re: [PATCH V4 5/6] perf/amd/iommu: Enable support for multiple IOMMUs To: Peter Zijlstra References: <1455182127-17551-1-git-send-email-Suravee.Suthikulpanit@amd.com> <1455182127-17551-6-git-send-email-Suravee.Suthikulpanit@amd.com> <20160218131853.GU6357@twins.programming.kicks-ass.net> CC: , , , , , , From: Suravee Suthikulpanit Message-ID: <56CAC01F.8090800@amd.com> Date: Mon, 22 Feb 2016 15:00:31 +0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: <20160218131853.GU6357@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [124.121.8.20] X-ClientProxiedBy: HKXPR03CA0066.apcprd03.prod.outlook.com (25.163.104.24) To SN1PR12MB0447.namprd12.prod.outlook.com (25.162.105.140) X-MS-Office365-Filtering-Correlation-Id: 54d3d658-231d-4037-49ad-08d33b5e45d4 X-Microsoft-Exchange-Diagnostics: 1;SN1PR12MB0447;2:vJaVU8zGZELLmtOh0CnS8D2mmVvQa5OBLwRtUMVGieC1QHU6ozLRrx9sVhoblRqEDQ1917qzMYNLZSDvg2PLSvbPqH6goUVvEvqlbXn+WiQlayfFm6FqAMRCJgzOzFY5eP7SlGMltNnxIGnhtbRtymb18FJI+2aVX5eObixn3VmR4tm1Yr93P7VYLW6PLXGY;3:WzVCLOQfmtMkrmTkOd9CM6vswF/pvsc+a2OaQ1oF8QzEtUObnptG2O6V/VJ6b5sqLYdOYa7+XS6di/W8f/XLCq0y++LAJoLL91nLQmA0H4bNRFo8i7KmtNrvQxSoitX2;25:nSZKDgRmEzN3XGTkDaJLWfm/BaCQxuIzyMv9++5xIesM7iiw/TaPRvVGM/zeP2KAhEjX6mk4myheR5K/L8uQxE4mHpdLrnIVr9GqJbaYffyc3siivLSZ/Q28ZSbAhmGrAyZflcqpE+lWq2z8NF4DowxTW6HCp3X1DU61EM20QRF1/p3GeIpxOm/XF2MRtc1KTkGAFpsHJpR4bv8wFcmsfkzqRYv8zStIZEzvbbfXTJFQeMT+yPJcpSyhLAFYzi0XUPp+vojSaL3eppu/n8vsAfGNUfyu5olKDfBEXYiFmgjifA1O7uuMDNTFRNtIDoIZ2xldM9s2QXGijc/gY021qiwlt+2ntCgnw37TJIm/1fE= X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:SN1PR12MB0447; X-Microsoft-Exchange-Diagnostics: 1;SN1PR12MB0447;20:2aBhHyX26Lof0OtVcocAj/d4n2NYG30aVBfJ59KyUkK9i5zzr26pQpR2R3UCPsfIsoFQc7kb7WV2F38MdDVyuuiQNtoPV7JVblj+MJZ1g7qMf5xerBNp38X7oxu5CryDCgL24ZRj8GPFsnc7JDHYqZj4zvD9pb3wgg68FhrkZPy3YiftPo+8SgirgPBBneU8JTup9K00kgytd39doP5GVzfe4HJvD7XRgZsganfM1kmqdrCFkdNoyk5gmI7LwEaN4oLMT99oOaFbEfZXAOed75TYKQwU6Z8npHM4HFB47uI+Jl/5+/C1nqFV5ScGZlt6WFc3rR4yP7kVpTo9ERlRG1hr4PeQAngoMUcxQHrrqLZ8jrJrrPj/FhBgyq6BA2adFD1RcFlNfl7dn5m89uDJ8F9s4EbVzscfVJpcoHOrVE2JeNcKmTJRUIcM1YzDMCeJi6dCGZg/CzIZj3HoV3EHQIsqFZAaVJ641ZMzh4DvWW9fMmoVJV5AjcyDdDwG/z28;4:ALT8Sz90+hUhWMtlSdK7x0nfqVJ73lMSelIOZozVZZ9JSValpCEzkTD2MkgLx/KNspeUgkz1i+slsusUkqUdwFULQf9GUHUXJbF765C6blh4svnMCZNFAjCc6BjO1U9otnLb70nQbpj86/0Tr+Sy5BPnmR0zSlVaZCAy7YItEblw0rTh8TWDzIFmspfocKmcxHH/JWx5ybHV5HynC3J0TAo3hKCO6mm+LvuCvmM+conFlVNojnnPETV/ZkhlXHqqIaedGWlj9DV7pwQN8k+G+XVQfcboZ7jxjmno7EeBJf0RDNm+XmUVY+GO4D5lx1VO5n9TievH2e5ytiRytbyLyTCsLy9PHzKUxCMnGBIGzD4XODeKudC5QmPG8QuirDf9 X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:; X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(601004)(2401047)(8121501046)(5005006)(10201501046)(3002001);SRVR:SN1PR12MB0447;BCL:0;PCL:0;RULEID:;SRVR:SN1PR12MB0447; X-Forefront-PRVS: 0860FE717F X-Forefront-Antispam-Report: SFV:NSPM;SFS:(10009020)(4630300001)(979002)(6049001)(6009001)(377454003)(479174004)(164054003)(24454002)(33656002)(4001350100001)(23746002)(110136002)(65816999)(76176999)(5001960100002)(54356999)(87266999)(83506001)(50986999)(2950100001)(77096005)(86362001)(36756003)(50466002)(59896002)(117156001)(87976001)(5004730100002)(2906002)(4326007)(92566002)(5008740100001)(230700001)(586003)(64126003)(6116002)(3846002)(1096002)(65806001)(42186005)(65956001)(66066001)(47776003)(40100003)(189998001)(80316001)(122386002)(969003)(989001)(999001)(1009001)(1019001);DIR:OUT;SFP:1101;SCL:1;SRVR:SN1PR12MB0447;H:[192.168.0.19];FPR:;SPF:None;MLV:ovrnspm;PTR:InfoNoRecords;LANG:en; X-Microsoft-Exchange-Diagnostics: =?Windows-1252?Q?1;SN1PR12MB0447;23:OlgTvlMixjg8dOfmEUA2PQ6YZLWgwsPXZJPp7?= =?Windows-1252?Q?jT8rSy5YqJ5IO6fS2ihpPt72rb3CUhu0PoDWEGBFrX+dtURTTmr+YFQJ?= =?Windows-1252?Q?FwETbgHEgf3gLlMV6NYuYuBUZnlRaxHEwZOG7ePKGluFvYDZ+UvzozU+?= =?Windows-1252?Q?NQomj4H94NxwEXrGrdk5jHGMZeKYuGx6Obt6axDr08+IxMqTf4WdoJ59?= =?Windows-1252?Q?1YPKQxBDEfaHy8PUTvQXESfK9FnRLAWF6plqDZ1jcsmHUY3aEPRb3/QJ?= =?Windows-1252?Q?8oE4i9vouUZ98TeRfst/tuc6Mkr0MS/9hZy+hvj3O8r9PjcLh8cSe9r5?= =?Windows-1252?Q?tGMDF2LXbqZxsqKdiqdZxaF2fJrYC3I2kjYBgK3UDHqaykffdJL2C9EI?= =?Windows-1252?Q?3WwweQtARhZsV3CcSm6k7A3ZBRFEa2tTk+xlSqAA+M4E6jBa8j6j7rsp?= =?Windows-1252?Q?3hGz6RDhPtVacYYNQ49OmGmI8gG8OVpljfzDcEiW9mdNN6D7RfG4sdvA?= =?Windows-1252?Q?yeiqS3EWdkA+AhjO7K2kWoX/Xr4hNiFhvN4o8mQGCr89YMH29QWqbSfd?= =?Windows-1252?Q?mELhUbEc9DhOxw3Rv5cdERBiq6CiMN5SjRXSTmanv/LO2JvcODCHIFrw?= =?Windows-1252?Q?Gy/LjJ1KK+gFH62ObQBsA3QP+fjMNzPLDM2e5SyOMEd1G0pVnuZiwAnA?= =?Windows-1252?Q?TWrEb8/XLa/hFzoHkTShxKrEu1n7fpNQmkS++MAunbOSWRE3marKnSR4?= =?Windows-1252?Q?0ZnJH/14oXXbGfupciyJstSJnXqTqkY0wmVowJGVEtmxImKu5iWlKtsy?= =?Windows-1252?Q?to46BLFyhpz/da62imoyl9cT5nKjYtZgHFr7E+B2IzDgDEy7N4SwGj6N?= =?Windows-1252?Q?vIKuUfTgGFfG8yBVsV7W7f36nHDoBcIj+c+/yNak2x0BXxXxMCM1YOhY?= =?Windows-1252?Q?rOoY6GK31sSPd8h83mJShx1N0IETy+2Cx0hUbPVxhQBgMwTR18v//JZ1?= =?Windows-1252?Q?0rPS6SV4ywkf0wk9GkkZn850prayTJ1C3Jjm0RaN3hED2yxJXSYMNVKM?= =?Windows-1252?Q?LjGIhh9RbduMcx2Mzcq9RwyZaCAQ0vgeLUkQ1Ju1vYFIkpLhCE7QsZZF?= =?Windows-1252?Q?vGKXl+Wc0BZu5Uh+ZUM3Q0fpL1dUxyyLKxmU5TGILpREJe+lqL2/jZKW?= =?Windows-1252?Q?VaPb/2aMe8t0WY6NXOCWZ8IB7qoeV5MeH9qosjFM8Lfwxd9SIZ6F/wTD?= =?Windows-1252?Q?YwYUJw+W6wq6uTQrUZuBUPbHgw21Ww9iLT5cekXVdFIdASDet4c1RAO9?= =?Windows-1252?Q?dXUNwHYnmMhHBxppZeCxq134iFj/j7GNtDqcuWygRolRgWuSN38Gt9FH?= =?Windows-1252?Q?IUSNb4Xct3QEnF5YDyYVrRzzNAn09QUMQ=3D=3D?= X-Microsoft-Exchange-Diagnostics: 1;SN1PR12MB0447;5:81poqdq3NvEV5MtKSUktNDEctoz5/F2V7DQGy26b8McJrIJVHOzpzRJSSb9nN93a898HpcDKry7gpjy8jG5KqOB90ui646Dt4QXgKici0y1+3OgFD1HIwne7BEndhQepXrFn1rjizN7BKhQpDZ105g==;24:DPnhExVkML4wPlegv9bCt4w89PKurk75dLh60rlk7u9kansnh3Ye8EoyHAcZfTNEm2aocHta3egKCJ/s9We64F/r7MwJfT6CVwhmZGW4zRw=;20:f6LkJ9rdFPj57wImS4vZV2kg8RISReghMTfNbUibdco6wzbninF5cygsq0nJiAJ/mLrbm/6zz5Ed/HOfY0CF5/sVrRVlHB3llHDtCmgIRUOnwjMrMhe7thH3kKWunhwpBLCAgPczOP92bdKZX5fyFZk6JPMAvjz9Agq7hkePTBHWEFAgtaYPL1WaLSQOt20bR8xAxiilcHWkmSG1uVDSWHDD7xLj87Gd37WJEZfJTjncLX7E3Aj9VFXuglDUzEoh SpamDiagnosticOutput: 1:23 SpamDiagnosticMetadata: NSPM X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Feb 2016 08:00:44.8563 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN1PR12MB0447 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Peter, On 02/18/2016 08:18 PM, Peter Zijlstra wrote: > On Thu, Feb 11, 2016 at 04:15:26PM +0700, Suravee Suthikulpanit wrote: >> static void perf_iommu_read(struct perf_event *event) >> { >> + int i; >> u64 delta = 0ULL; >> struct hw_perf_event *hwc = &event->hw; >> + struct perf_amd_iommu *perf_iommu = container_of(event->pmu, >> + struct perf_amd_iommu, >> + pmu); >> >> + if (amd_iommu_pc_get_counters(_GET_BANK(event), _GET_CNTR(event), >> + amd_iommu_get_num_iommus(), >> + perf_iommu_cnts)) >> return; >> >> + /* >> + * Now we re-aggregating event counts and prev-counts >> + * from all IOMMUs >> + */ >> + local64_set(&hwc->prev_count, 0); >> + >> + for (i = 0; i < amd_iommu_get_num_iommus(); i++) { >> + int indx = get_iommu_bnk_cnt_evt_idx(perf_iommu, i, >> + _GET_BANK(event), >> + _GET_CNTR(event)); >> + u64 prev_raw_count = local64_read(&perf_iommu->prev_cnts[indx]); >> + >> + /* IOMMU pc counter register is only 48 bits */ >> + perf_iommu_cnts[i] &= GENMASK_ULL(48, 0); >> + >> + /* >> + * Since we do not enable counter overflow interrupts, >> + * we do not have to worry about prev_count changing on us >> + */ >> + local64_set(&perf_iommu->prev_cnts[indx], perf_iommu_cnts[i]); >> + local64_add(prev_raw_count, &hwc->prev_count); >> + >> + /* Handle 48-bit counter overflow */ >> + delta = (perf_iommu_cnts[i] << COUNTER_SHIFT) - (prev_raw_count << COUNTER_SHIFT); >> + delta >>= COUNTER_SHIFT; >> + local64_add(delta, &event->count); >> + } >> } > > So I really don't have time to review new muck while I'm hunting perf > core fail, but Boris made me look at this. > > This is crazy, if you have multiple IOMMUs then create an event per > IOMMU, do _NOT_ fold them all into a single event. These are system-wide events, which are programmed on every IOMMU the same way. I am not sure what you meant by creating an event per IOMMU. Do you mean I should create internal per-IOMMU struct perf_event for each event? Could you please give me some pointers? > In any case, the reason Boris asked me to look at this is that your > overflow handling is broken, you want delta to be s64. Otherwise: > > delta >>= COUNTER_SHIFT; > > ends up as a SHR and you loose the MSB sign bits. Ah. Sorry, I missed that. Thanks, Suravee