Date: Mon, 28 Apr 2025 09:56:42 +0100
X-Mailing-List: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 1/2] perf: Allow non-contiguous AUX buffer pages via PMU capability
To: Yabin Cui, Leo Yan, Ingo Molnar
Cc: coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 Mike Leach, Alexander Shishkin, Peter Zijlstra, Ingo Molnar,
 Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Jiri Olsa,
 Ian Rogers, Adrian Hunter, Liang Kan
References: <20250421215818.3800081-1-yabinc@google.com>
 <20250421215818.3800081-2-yabinc@google.com>
 <48640298-effa-42d4-9137-a18a51637f03@linaro.org>
 <20250422141026.GH28953@e132581.arm.com>
From: James Clark

On 23/04/2025 8:52 pm, Yabin Cui wrote:
> On Tue, Apr 22, 2025 at 7:10 AM Leo Yan wrote:
>>
>> On Tue, Apr 22, 2025 at 02:49:54PM +0200, Ingo Molnar wrote:
>>
>> [...]
>>
>>>> Hi Yabin,
>>>>
>>>> I was wondering if this is just the opposite of
>>>> PERF_PMU_CAP_AUX_NO_SG, and that order 0 should be used by default
>>>> for all devices to solve the issue you describe. Because we already
>>>> have PERF_PMU_CAP_AUX_NO_SG for devices that need contiguous pages.
>>>> Then I found commit 5768402fd9c6 ("perf/ring_buffer: Use high order
>>>> allocations for AUX buffers optimistically") that explains that the
>>>> current allocation strategy is an optimization.
>>>>
>>>> Your change seems to decide that for certain devices we want to
>>>> optimize for fragmentation rather than performance. If these are
>>>> rarely used features specifically when looking at performance should
>>>> we not continue to optimize for performance? Or at least make it user
>>>> configurable?
>>>
>>> So there seems to be 3 categories:
>>>
>>>  - 1) Must have physically contiguous AUX buffers, it's a hardware ABI.
>>>       (PERF_PMU_CAP_AUX_NO_SG for Intel BTS and PT.)
>>>
>>>  - 2) Would be nice to have contiguous AUX buffers, for a bit more
>>>       performance.
>>>
>>>  - 3) Doesn't really care.
>>>
>>> So we do have #1, and it appears Yabin's usecase is #3?
>
> Yes, in my usecase, I care much more about MM-friendly than a little
> potential performance when using PMU. It's not a rarely used feature.
> On Android, we collect ETM data periodically on internal user devices
> for AutoFDO optimization (for both userspace libraries and the kernel).
> Allocating a large chunk of contiguous AUX pages (4M for each CPU)
> periodically is almost unbearable. The kernel may need to kill many
> processes to fulfill the request. It affects user experience even
> after using PMU.
>
> I am totally fine to reuse PERF_PMU_CAP_AUX_NO_SG. If PMUs don't want
> to sacrifice performance for MM-friendly, why support scatter gather
> mode? If there are strong performance reasons to allocate contiguous
> AUX pages in scatter gather mode, I hope max_order is configurable in
> userspace.
>
> Currently, max_order is affected by aux_watermark. But aux_watermark
> also affects how frequently the PMU overflows AUX buffer and notifies
> userspace. It's not ideal to set aux_watermark to 1 page size.
> So if we want to make max_order user configurable, maybe we can add a
> one bit field in perf_event_attr?
>
>>
>> In Yabin's case, the AUX buffer work as a bounce buffer. The hardware
>> trace data is copied by a driver from low level's contiguous buffer to
>> the AUX buffer.
>>
>> In this case we cannot benefit much from contiguous AUX buffers.
>>
>> Thanks,
>> Leo

Hi Yabin,

So after doing some testing, it looks like there is no difference in
overhead between max_order=0 and ensuring the buffer is one contiguous
allocation for Arm SPE, and TRBE would be exactly the same. This makes
sense because we're vmapping pages individually anyway, regardless of
the base allocation.

It seems the optimistic large allocations only help devices that need
extra buffer management beyond normal virtual memory.

Can we add a new capability, PERF_PMU_CAP_AUX_PREFER_LARGE, and apply it
to Intel PT and BTS? Then the old max_order=0 behavior (from before the
optimistic large allocs change) becomes the default again, and
PREFER_LARGE is just for those two devices. Other and new devices would
get the more memory-friendly allocations by default, as it's unlikely
they'll benefit from anything different.

Thanks
James
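
To make the PREFER_LARGE idea concrete, here is a rough userspace model
of the order selection, not the kernel implementation. The names
MODEL_CAP_AUX_PREFER_LARGE, order_for_pages() and model_aux_max_order()
are made up for illustration, and deriving the large order from the
watermark (in pages) is an assumption based on the note above that
max_order is currently affected by aux_watermark.

```c
#include <assert.h>

/* Hypothetical flag value for this userspace model; not a kernel value. */
#define MODEL_CAP_AUX_PREFER_LARGE 0x1u

/* Smallest order such that (1UL << order) >= pages, for pages >= 1. */
int order_for_pages(unsigned long pages)
{
    int order = 0;

    assert(pages >= 1);
    while ((1UL << order) < pages)
        order++;
    return order;
}

/*
 * Largest allocation order to attempt for one AUX chunk in this model.
 * Default is order 0 (single pages). Only PMUs that opt in with the
 * hypothetical PREFER_LARGE flag try chunks sized from the watermark.
 */
int model_aux_max_order(unsigned int caps, unsigned long watermark_pages)
{
    if (caps & MODEL_CAP_AUX_PREFER_LARGE)
        return order_for_pages(watermark_pages);
    return 0;
}
```

With this shape, a PMU that does not set the flag always gets order-0
pages regardless of the watermark, so aux_watermark would no longer need
to be tuned just to avoid large contiguous allocations.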