From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5EE9AEC4 for ; Wed, 6 Mar 2024 00:19:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709684344; cv=none; b=L5NCXnmwAX+Yb6qpxCilItIWEx6lb8ossUuTKpgqd1Uc6AeYHhhJOIg+OedepIy7t8KvX61jheuVplO1Hk5elNhA/peIRwiC5BSiNWvSPvyoX9cxjuTUn5SlEkNz7nC9EZl9GumkRQ8kpbxcCjiF4fJt3II1W49Bh6yL07BHNWA= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709684344; c=relaxed/simple; bh=ELE4avQo+d8QsZ45q+yVs7D3nMNlIdGR5uEQwUa3lHA=; h=Message-ID:Date:MIME-Version:Subject:From:To:Cc:References: In-Reply-To:Content-Type; b=rJsZnUTz1o+eObRDZl1Tj8InQ4fn+oBVk10SaPwmWNPyMad2Y798c0uuGmZA7RijUxq9sBbQlD76AA+CZSOjnGKgVDjMINXNDh9h++CYfby2k17Ptj0P6kczvYIOmhgoft4blinFt26+X3xOarNLcIOOgLvPkf5/KFjRDrfJca0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=eixvOyPd; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="eixvOyPd" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709684342; x=1741220342; h=message-id:date:mime-version:subject:from:to:cc: references:in-reply-to:content-transfer-encoding; bh=ELE4avQo+d8QsZ45q+yVs7D3nMNlIdGR5uEQwUa3lHA=; b=eixvOyPdz+iFGgnSk34/64ZVSIrVBKQEng35thSCqqY3KOxudnnth40k cO5EAdK4n4nNYvJIrSeZw8XgeXbYUAKLXaZkvkHVhbg2ySiBLKnqNueC5 8Kajrpwf9JFwEWPdPNV6NlSqzxNi13s9pFJMoXt69d0wt3OCTs0mwXais 3SKeNcCoWxBeq9Lnph8MfsjLVJZLGAR0QWe5th9cN4j+VVt9mH8Gy9ssv RKoeloT45EWX1CADA4SBo46/ID2TgCpZ2xsHEPU1TOgAL2pu5CAYzcNRk rCzNJ2SrOP0XthRd5o9ib0C27bcctmFM4d8Pqam7LnOLfE3D4W9ILA3Rk g==; X-IronPort-AV: E=McAfee;i="6600,9927,11004"; a="29697132" X-IronPort-AV: E=Sophos;i="6.06,206,1705392000"; d="scan'208";a="29697132" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Mar 2024 16:19:01 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,206,1705392000"; d="scan'208";a="9454331" Received: from djiang5-mobl3.amr.corp.intel.com (HELO [10.247.118.77]) ([10.247.118.77]) by fmviesa009-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Mar 2024 16:18:57 -0800 Message-ID: Date: Tue, 5 Mar 2024 17:18:52 -0700 Precedence: bulk X-Mailing-List: linux-cxl@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH 2/2] cxl: Add checks to access_coordinate calculation to fail missing data Content-Language: en-US From: Dave Jiang To: Jonathan Cameron Cc: linux-cxl@vger.kernel.org, dan.j.williams@intel.com, ira.weiny@intel.com, vishal.l.verma@intel.com, alison.schofield@intel.com, dave@stgolabs.net References: <20240229002542.634982-1-dave.jiang@intel.com> <20240229002542.634982-2-dave.jiang@intel.com> <20240229174427.00000e9e@Huawei.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit On 3/5/24 3:36 PM, Dave Jiang wrote: > > > On 2/29/24 10:44 AM, Jonathan Cameron wrote: >> On Wed, 28 Feb 2024 17:25:42 -0700 >> Dave Jiang wrote: >> >>> Jonathan noted that when the coordinates for host bridge and switches >>> can be 0s if no actual data are retrieved and the calculation continues. >>> The resulting number would be inaccurate. Add checks to ensure that the >>> calculation would complete only if the numbers are valid. >>> >>> Reported-by: Jonathan Cameron >>> Signed-off-by: Dave Jiang >> >> Hi Dave, >> >> Whilst I think the fix is right, it is getting hard to read. Maybe >> a rethink is needed on how that iteration works? >> >>> --- >>> drivers/cxl/core/port.c | 31 +++++++++++++++++++++++++++---- >>> 1 file changed, 27 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c >>> index e1d30a885700..2c82fe24b789 100644 >>> --- a/drivers/cxl/core/port.c >>> +++ b/drivers/cxl/core/port.c >>> @@ -2110,6 +2110,20 @@ static void combine_coordinates(struct access_coordinate *c1, >>> c1->read_latency += c2->read_latency; >>> } >>> >>> +static bool coordinates_invalid(struct access_coordinate *c) >>> +{ >>> + if (!c->read_bandwidth && !c->write_bandwidth && >>> + !c->read_latency && !c->write_latency) >>> + return true; >>> + >>> + return false; >>> +} >>> + >>> +static bool parent_port_is_cxl_root(struct cxl_port *port) >>> +{ >>> + return is_cxl_root(to_cxl_port(port->dev.parent)); >>> +} >>> + >>> /** >>> * cxl_endpoint_get_perf_coordinates - Retrieve performance numbers stored in dports >>> * of CXL path >>> @@ -2142,16 +2156,25 @@ int cxl_endpoint_get_perf_coordinates(struct cxl_port *port, >>> * port each iteration. If the parent is cxl root then there is >>> * nothing to gather. >>> */ >>> - while (!is_cxl_root(to_cxl_port(iter->dev.parent))) { >>> - combine_coordinates(&c, &dport->sw_coord); >>> + while (!parent_port_is_cxl_root(iter)) { >>> + iter = to_cxl_port(iter->dev.parent); >>> + >>> + /* There's no CDAT for the host bridge, so skip if so. */ >> >> Comment refers to skipping whereas code is 'doing more' for the other case >> so this is confusing to me. >> >> The inverse of this only occurs on the last iteration I think. >> >> Possibly a do / while instead of a while will do it. >> I'm far from confident though as all the levels of look up have >> me too confused. > > So this is somewhat tricky. For example: > devices/pci0000:35/0000:35:01.0/0000:37:00.0/mem5 > > In this case the endpoint is attached to the host bridge without any switches. The endpoint is 0000:37:00.0 and the host bridge down stream port is 0000:35:01.0. In this instance there is no switch and therefore switch CDAT, but there is a valid dport. So we would skip the dport->sw_coord. However, we do need to pick up the link_latency between endpoint and downstream port. So we spend 1 iteration in the loop and skips the dport->sw_coord. > > devices/pci0000:bf/0000:bf:00.0/0000:c0:00.0/0000:c1:00.0/0000:c2:00.0/mem8 > > Now in this case there's a CXL switch in between. So in first iteration, we pick up the dport->sw_coord. And in second iteration, we skip the dport->sw_coord. However, we would be adding two link_latency. For the link between 0000:c2:00.0 (endpoint) and 0000:c1:00.0 (switch downstream port), and the link between 0000:c0:00.0 (switch upstream port) and 0000:bf:00.0 (host bridge down stream port). So therefore we can't put the sum of link_latency outside of the loop. > > Not sure how much better this is: > > do { > struct cxl_port *parent_port = to_cxl_port(iter->dev.parent); > > dport = iter->parent_dport; > if (!parent_port_is_cxl_root(parent_port)) { > if (coordinates_invalid(&dport->sw_coord)) > return -EINVAL; > > combine_coordinates(&c, &dport->sw_coord); > } > > c.write_latency += dport->link_latency; > c.read_latency += dport->link_latency; > iter = to_cxl_port(iter->dev.parent); > } while (!parent_port_is_cxl_root(iter)); > > or: > > do { > struct cxl_port *parent_port = to_cxl_port(iter->dev.parent); > > dport = iter->parent_dport; > c.write_latency += dport->link_latency; > c.read_latency += dport->link_latency; > > if (parent_port_is_cxl_root(parent_port)) > break; > > if (coordinates_invalid(&dport->sw_coord)) > return -EINVAL; > > combine_coordinates(&c, &dport->sw_coord); > iter = to_cxl_port(iter->dev.parent); > } while (!parent_port_is_cxl_root(iter)); Actually, I think if we just make it dport->coord instead of dport->sw_coord and dport->hb_coord, we can remove the check and everything should work out properly. > > >> >> >> do { >> if (coordinates_invalid(&dport->sw_coord)) >> return -EINVAL; >> >> combine_coordinates(&c, &dport->sw_coord); >> iter = to_cxl_port(iter->dev.parent); >> dport = iter->parent_dport; >> } while (!parent_port_is_cxl_root(iter)); >> /* Do final link updates */ >> c.write_latency += dport->link_latency; >> c.read_latency += dport->link_latency; >> >>> + if (!parent_port_is_cxl_root(iter)) { >>> + if (coordinates_invalid(&dport->sw_coord)) >>> + return -EINVAL; >>> + >>> + combine_coordinates(&c, &dport->sw_coord); >>> + } >>> + >>> c.write_latency += dport->link_latency; >>> c.read_latency += dport->link_latency; >>> - >>> - iter = to_cxl_port(iter->dev.parent); >>> dport = iter->parent_dport; >>> } >>> >>> /* Augment with the generic port (host bridge) perf data */ >>> + if (coordinates_invalid(&dport->hb_coord)) >>> + return -EINVAL; >>> combine_coordinates(&c, &dport->hb_coord); >>> >>> /* Get the calculated PCI paths bandwidth */ >> >