From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_SANE_2 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 237CEC433ED for ; Wed, 19 May 2021 15:13:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F3DB7610A8 for ; Wed, 19 May 2021 15:13:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239797AbhESPPH (ORCPT ); Wed, 19 May 2021 11:15:07 -0400 Received: from szxga04-in.huawei.com ([45.249.212.190]:4685 "EHLO szxga04-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233577AbhESPPH (ORCPT ); Wed, 19 May 2021 11:15:07 -0400 Received: from dggems706-chm.china.huawei.com (unknown [172.30.72.59]) by szxga04-in.huawei.com (SkyGuard) with ESMTP id 4Flbst4yqpz1BPHc; Wed, 19 May 2021 23:10:58 +0800 (CST) Received: from lhreml710-chm.china.huawei.com (10.201.108.61) by dggems706-chm.china.huawei.com (10.3.19.183) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 19 May 2021 23:13:44 +0800 Received: from localhost (10.52.121.81) by lhreml710-chm.china.huawei.com (10.201.108.61) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Wed, 19 May 2021 16:13:41 +0100 Date: Wed, 19 May 2021 16:11:56 +0100 From: Jonathan Cameron To: Dan Williams CC: Ira Weiny , , Linux PCI , Bjorn Helgaas , "Lorenzo Pieralisi" , Ben Widawsky , Chris Browy , Linux ACPI , "Schofield, Alison" , Vishal L Verma , Linuxarm , Fangjian , Greg KH Subject: Re: [RFC PATCH v3 2/4] PCI/doe: Add Data Object Exchange support Message-ID: <20210519161156.00003bf9@Huawei.com> In-Reply-To: References: <20210419165451.2176200-1-Jonathan.Cameron@huawei.com> <20210419165451.2176200-3-Jonathan.Cameron@huawei.com> <20210506215934.GJ1904484@iweiny-DESK2.sc.intel.com> <20210511175006.00007861@Huawei.com> <20210514094755.00002081@Huawei.com> <20210517094045.00004d58@Huawei.com> <20210518110403.000013e6@Huawei.com> Organization: Huawei Technologies Research and Development (UK) Ltd. X-Mailer: Claws Mail 3.17.8 (GTK+ 2.24.33; i686-w64-mingw32) MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.52.121.81] X-ClientProxiedBy: lhreml730-chm.china.huawei.com (10.201.108.81) To lhreml710-chm.china.huawei.com (10.201.108.61) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org On Wed, 19 May 2021 07:18:28 -0700 Dan Williams wrote: > On Tue, May 18, 2021 at 3:06 AM Jonathan Cameron > wrote: > > > > On Mon, 17 May 2021 10:21:14 -0700 > > Dan Williams wrote: > > > > > On Mon, May 17, 2021 at 1:42 AM Jonathan Cameron > > > wrote: > > > > > > > > On Fri, 14 May 2021 11:37:12 -0700 > > > > Dan Williams wrote: > > > > > > > > > On Fri, May 14, 2021 at 1:50 AM Jonathan Cameron > > > > > wrote: > > > > > [..] > > > > > > > If it simplifies the kernel implementation to assume single > > > > > > > kernel-initiator then I think that's more than enough reason to block > > > > > > > out userspace, and/or provide userspace a method to get into the > > > > > > > kernel's queue for service. > > > > > > > > > > > > This last suggestion makes sense to me. Let's provide a 'right' way > > > > > > to access the DOE from user space. I like the idea if it being possible > > > > > > to run CXL compliance tests from userspace whilst the driver is loaded. > > > > > > > > > > Ah, and I like your observation that once the kernel provides a > > > > > "right" way to access DOE then userspace direct-access of DOE is > > > > > indeed a "you get to keep the pieces" event like any other unwanted > > > > > userspace config-write. > > > > > > > > > > > Bjorn, given this would be a generic PCI thing, any preference for what > > > > > > this interface might look like? /dev/pcidoe[xxxxxx].i with ioctls similar > > > > > > to those for the BAR based CXL mailboxes? > > > > > > > > > > (warning, anti-ioctl bias incoming...) > > > > > > > > I feel very similar about ioctls - my immediate thought was to shove this in > > > > debugfs, but that feels the wrong choice if we are trying to persuade people > > > > to use it instead of writing code that directly accesses the config space. > > > > > > > > > > > > > > Hmm, DOE has an enumeration capability, could the DOE driver use a > > > > > scheme to have a sysfs bin_attr per discovered object type? This would > > > > > make it simliar to the pci-vpd sysfs interface. > > > > > > > > We can discover the protocols, but anything beyond that is protocol > > > > specific. I don't think there is a enough info available by any standards > > > > defined method. Also part of the reason to allow a safe userspace interface > > > > would be to provide a generic interface for vendor protocols and things like > > > > CXL compliance tests where we will almost certainly never provide a more > > > > specific kernel interface. > > > > > > > > Whilst sysfs would work for CDAT, some protocols are challenge response rather > > > > than simple read back and that really doesn't fit well for sysfs model. > > > > If we get other protocols that are simple data read back, then I would > > > > advocate giving them a simple sysfs interface much like proposed for CDAT > > > > as it will always be simpler to use + self describing. > > > > > > > > On a lesser note it might be helpful to provide sysfs attrs for > > > > what protocols are supported. The alternative is to let userspace run > > > > the discovery protocol. Perhaps we can do this as a later phase. > > > > > > > > > > > > > > Then the kernel could cache objects like CDAT that don't change > > > > > outside of some invalidation event. > > > > > > > > It's been a while since I last saw any conversation on sysfs bin_attrs > > > > but mostly I thought the feeling was pretty strongly against them for anything > > > > but a few niche usecases. > > > > > > > > Feels to me like it would break most of the usual rules in a way vpd does > > > > not (IIRC VPD is supposed to be a simple in the sense that if you write a value > > > > to a writable part, you will read back the same value). > > > > > > > > +CC Greg who is a fount of knowledge in this area (and regularly + correctly > > > > screams at the ways I try to abuse sysfs :) Note I don't think Dan was > > > > suggesting implementing response / request directly, but I think that is > > > > all we could do given DOE protocols can be vendor specific and the standard > > > > discovery protocol doesn't let us know the fine grained support (what commands > > > > within a given protocol). > > > > > > I'm not all that interested in supporting vendor defined DOE > > > shenanigans. There's more than enough published DOE protocols that the > > > kernel could limit its support to the known set. This is similar to > > > how ACPI DSMs are not generically supported, but when they appear in a > > > published specification the kernel may then grow the support. The > > > supported protocols could be limited to: CDAT, PCIe IDE, CXL > > > Compliance, etc... > > > > > > Vendor specific DOE is in the same class as unfettered /dev/mem > > > access, first you need to disable the kernel's integrity and > > > confidentiality protections, and then you can do whatever you want. If > > > a vendor wants a DOE protocol supported in the "trusted" set they can > > > simply publish the specification and send the proper support patches. > > > > Fair enough, though the interface should be root only, so a vendor shooting > > themselves in the foot this way would be no different to using pcitools > > to access the device directly (we are just providing safety from concurrency > > point of view). > > > > Anyway, I can see two options for how to do this. > > > > 1) Per protocol interface. Would not be generic, as these work in entirely > > different ways (some are simple read back of tables, some require complex > > cycles of operations in the right order with data flowing in both directions) > > 2) White list those protocols we are going to let through a generic interface > > Not including CXL compliance for instance as that has nasty side effects! > > > > If we want to enable userspace DOE access, I prefer option 2. > > > > Note that I wasn't that keen on a userspace interface in the first place as > > in my view these should all be handled in kernel. > > Ultimately we should have case 1 if userspace access make sense. > > However, if we do this we shouldn't pretend we are providing userspace > > access to the DOE at all. We are providing interfaces to things that just > > happen to be implemented using DOE under the hood. > > > > I have a prototype of a trivial ioctl based interface. I'll send it out > > as an RFC later this week. Might add a white list, depending on where > > this discussion goes. > > > > I'd say let's do this in typical Linux fashion and not solve future > problems before they need to be solved. I.e. start small and build > incrementally. To me that looks like a sysfs interface to convey a > cached copy of a CDAT with an internal interface for a driver to > trigger invalidations and re-reads on the next access. This would > assume that userspace may have left the DOE in an indeterminate state > and an abort cycle may be needed. A 1 second delay for the rare case > where a collision is detected seems reasonable for just CDAT > retrieval. The problem is you can not detect a collision. Hence it's a reset every time you use the DOE from in the kernel. Personally I think that this is fixing a problem that doesn't exist. Userspace should not access the DOE when a driver is loaded in exactly the same way it shouldn't be writing to anywhere else in config space under normal circumstances. I really don't see this as special. If we think it is special then we should provide a safe interface. Given it's nearly done, I might send out the ioctl proposal and we can can just decide to leave it unmerged for now, pending real usecases being established. Jonathan