* [PATCH v5 07/11] PCI: Replace pci_dev::driver usage that gets the driver name
From: Uwe Kleine-König @ 2021-09-29 8:53 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Alexander Duyck, oss-drivers, Paul Mackerras,
Christoph Hellwig, Herbert Xu, Rafał Miłecki,
Jesse Brandeburg, Ido Schimmel, Jakub Kicinski, Yisen Zhuang,
Vadym Kochan, Michael Buesch, Jiri Pirko, Salil Mehta, netdev,
linux-wireless, linux-kernel, Taras Chornyi, Zhou Wang,
linux-crypto, kernel, Simon Horman, Oliver O'Halloran,
linuxppc-dev, David S. Miller
In-Reply-To: <20210929085306.2203850-1-u.kleine-koenig@pengutronix.de>
struct pci_dev::driver holds (apart from a constant offset) the same
data as struct pci_dev::dev->driver. With the goal to remove struct
pci_dev::driver to get rid of data duplication replace getting the
driver name by dev_driver_string() which implicitly makes use of struct
pci_dev::dev->driver.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
drivers/crypto/hisilicon/qm.c | 2 +-
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 2 +-
drivers/net/ethernet/marvell/prestera/prestera_pci.c | 2 +-
drivers/net/ethernet/mellanox/mlxsw/pci.c | 2 +-
drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c | 3 ++-
5 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
index 369562d34d66..8f361e54e524 100644
--- a/drivers/crypto/hisilicon/qm.c
+++ b/drivers/crypto/hisilicon/qm.c
@@ -3085,7 +3085,7 @@ static int qm_alloc_uacce(struct hisi_qm *qm)
};
int ret;
- ret = strscpy(interface.name, pdev->driver->name,
+ ret = strscpy(interface.name, dev_driver_string(&pdev->dev),
sizeof(interface.name));
if (ret < 0)
return -ENAMETOOLONG;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index 7ea511d59e91..f279edfce3f1 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -606,7 +606,7 @@ static void hns3_get_drvinfo(struct net_device *netdev,
return;
}
- strncpy(drvinfo->driver, h->pdev->driver->name,
+ strncpy(drvinfo->driver, dev_driver_string(&h->pdev->dev),
sizeof(drvinfo->driver));
drvinfo->driver[sizeof(drvinfo->driver) - 1] = '\0';
diff --git a/drivers/net/ethernet/marvell/prestera/prestera_pci.c b/drivers/net/ethernet/marvell/prestera/prestera_pci.c
index a250d394da38..a8f007f6dad2 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_pci.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_pci.c
@@ -720,7 +720,7 @@ static int prestera_fw_load(struct prestera_fw *fw)
static int prestera_pci_probe(struct pci_dev *pdev,
const struct pci_device_id *id)
{
- const char *driver_name = pdev->driver->name;
+ const char *driver_name = dev_driver_string(&pdev->dev);
struct prestera_fw *fw;
int err;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
index 13b0259f7ea6..8f306364f7bf 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -1876,7 +1876,7 @@ static void mlxsw_pci_cmd_fini(struct mlxsw_pci *mlxsw_pci)
static int mlxsw_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
- const char *driver_name = pdev->driver->name;
+ const char *driver_name = dev_driver_string(&pdev->dev);
struct mlxsw_pci *mlxsw_pci;
int err;
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
index 0685ece1f155..1de076f55740 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
@@ -202,7 +202,8 @@ nfp_get_drvinfo(struct nfp_app *app, struct pci_dev *pdev,
{
char nsp_version[ETHTOOL_FWVERS_LEN] = {};
- strlcpy(drvinfo->driver, pdev->driver->name, sizeof(drvinfo->driver));
+ strlcpy(drvinfo->driver, dev_driver_string(&pdev->dev),
+ sizeof(drvinfo->driver));
nfp_net_get_nspinfo(app, nsp_version);
snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
"%s %s %s %s", vnic_version, nsp_version,
--
2.30.2
^ permalink raw reply related
* [PATCH v5 05/11] powerpc/eeh: Don't use driver member of struct pci_dev and further cleanups
From: Uwe Kleine-König @ 2021-09-29 8:53 UTC (permalink / raw)
To: Bjorn Helgaas, Michael Ellerman
Cc: linux-pci, Paul Mackerras, kernel, Oliver O'Halloran,
linuxppc-dev, Christoph Hellwig
In-Reply-To: <20210929085306.2203850-1-u.kleine-koenig@pengutronix.de>
The driver member of struct pci_dev is to be removed as it tracks
information already present by tracking of the driver core. So replace
pdev->driver->name by dev_driver_string() for the corresponding struct
device.
Also move the function nearer to its only user and instead of the ?:
operator use a normal if which is more readable.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
arch/powerpc/include/asm/ppc-pci.h | 5 -----
arch/powerpc/kernel/eeh.c | 8 ++++++++
2 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 2b9edbf6e929..f6cf0159024e 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -55,11 +55,6 @@ void eeh_pe_dev_mode_mark(struct eeh_pe *pe, int mode);
void eeh_sysfs_add_device(struct pci_dev *pdev);
void eeh_sysfs_remove_device(struct pci_dev *pdev);
-static inline const char *eeh_driver_name(struct pci_dev *pdev)
-{
- return (pdev && pdev->driver) ? pdev->driver->name : "<null>";
-}
-
#endif /* CONFIG_EEH */
#define PCI_BUSNO(bdfn) ((bdfn >> 8) & 0xff)
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index e9b597ed423c..4b08881c4a1e 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -399,6 +399,14 @@ static int eeh_phb_check_failure(struct eeh_pe *pe)
return ret;
}
+static inline const char *eeh_driver_name(struct pci_dev *pdev)
+{
+ if (pdev)
+ return dev_driver_string(&pdev->dev);
+
+ return "<null>";
+}
+
/**
* eeh_dev_check_failure - Check if all 1's data is due to EEH slot freeze
* @edev: eeh device
--
2.30.2
^ permalink raw reply related
* Re: [PATCH kernel] powerpc/dma: Fix dma_map_ops::get_required_mask
From: Alexey Kardashevskiy @ 2021-09-29 8:48 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Carol Soto, linuxppc-dev, Cédric Le Goater, Frederic Barrat
In-Reply-To: <20200909075849.GA12282@lst.de>
On 09/09/2020 17:58, Christoph Hellwig wrote:
> On Tue, Sep 08, 2020 at 11:10:03PM +1000, Alexey Kardashevskiy wrote:
>>>> a-ha, this makes more sense, thanks. Then I guess we need to revert that
>>>> one bit from yours f1565c24b596, do not we?
>>>
>>> Why? The was the original intent of the API, but now we also use
>>> internally to check the addressing capabilities.
>>
>> The bigger mask the better, no? As it is now, it's limited by the window
>> size which happens to be bigger than 4GB but smaller then full 64bit (48bit
>> on my system)
>
> Yes, the bigger mask is better. But I don't see why you'd want to
> revert the dma bypass code for that entirely.
Ok we have another victim of this change:
https://github.com/torvalds/linux/blob/master/drivers/scsi/mpt3sas/mpt3sas_base.c#L3007
It calls dma_get_required_mask() to know "the mask that the platform
requires to operate efficiently" (from dma-api.rst). The current
upstream returns 31 for pseries which in no way is efficient, we can do
better so we need this hunk back (but just this one):
https://github.com/torvalds/linux/commit/f1565c24b5965dfd2352f209c417ff160be04db9#diff-18e87e1863bf902c6388d72ad99467b7fcec0dd37084636d96ad5a35a3e59904L156
(well, almost, move it above the !tbl check).
This does not hit us on powernv/upstream as that returns 44 (or so) and
the mpt3sas driver (which does the right thing afaict) just assumes that
">32" == ">=63". What do I miss here? Thanks,
ps:
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200908015106.79661-1-aik@ozlabs.ru/#2528801
is the rest of the thread I am replying to.
--
Alexey
^ permalink raw reply
* Re: [PATCH v4 4/8] PCI: replace pci_dev::driver usage that gets the driver name
From: Simon Horman @ 2021-09-29 8:05 UTC (permalink / raw)
To: Uwe Kleine-König
Cc: Alexander Duyck, oss-drivers, Paul Mackerras, Herbert Xu,
Ido Schimmel, Rafał Miłecki, Jesse Brandeburg,
Bjorn Helgaas, linux-pci, Jakub Kicinski, Yisen Zhuang,
Uwe Kleine-König, Vadym Kochan, Michael Buesch, Jiri Pirko,
Salil Mehta, netdev, linux-wireless, linux-kernel, Taras Chornyi,
Zhou Wang, linux-crypto, kernel, Oliver O'Halloran,
linuxppc-dev, David S. Miller
In-Reply-To: <20210928103129.c3gcbnfbarezr3mm@pengutronix.de>
On Tue, Sep 28, 2021 at 12:31:29PM +0200, Uwe Kleine-König wrote:
> On Tue, Sep 28, 2021 at 12:01:28PM +0200, Simon Horman wrote:
> > On Mon, Sep 27, 2021 at 10:43:22PM +0200, Uwe Kleine-König wrote:
> > > From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > >
> > > struct pci_dev::driver holds (apart from a constant offset) the same
> > > data as struct pci_dev::dev->driver. With the goal to remove struct
> > > pci_dev::driver to get rid of data duplication replace getting the
> > > driver name by dev_driver_string() which implicitly makes use of struct
> > > pci_dev::dev->driver.
> > >
> > > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> >
> > ...
> >
> > > diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
> > > index 0685ece1f155..23dfb599c828 100644
> > > --- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
> > > +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
> > > @@ -202,7 +202,7 @@ nfp_get_drvinfo(struct nfp_app *app, struct pci_dev *pdev,
> > > {
> > > char nsp_version[ETHTOOL_FWVERS_LEN] = {};
> > >
> > > - strlcpy(drvinfo->driver, pdev->driver->name, sizeof(drvinfo->driver));
> > > + strlcpy(drvinfo->driver, dev_driver_string(&pdev->dev), sizeof(drvinfo->driver));
> >
> > I'd slightly prefer to maintain lines under 80 columns wide.
> > But not nearly strongly enough to engage in a long debate about it.
>
> :-)
>
> Looking at the output of
>
> git grep strlcpy.\*sizeof
>
> I wonder if it would be sensible to introduce something like
>
> #define strlcpy_array(arr, src) (strlcpy(arr, src, sizeof(arr)) + __must_be_array(arr))
>
> but not sure this is possible without a long debate either (and this
> line is over 80 chars wide, too :-).
My main motivation for the 80 char limit in nfp_net_ethtool.c is
not that I think 80 char is universally a good limit (although that is true),
but rather that I expect that is the prevailing style in nfp_net_ethtool.c.
So a macro more than 80 car wide somewhere else is fine by me.
However, when running checkpatch --strict over the patch it told me:
WARNING: Prefer strscpy over strlcpy - see: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
#276: FILE: drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c:205:
+ strlcpy(drvinfo->driver, dev_driver_string(&pdev->dev), sizeof(drvinfo->driver));
total: 0 errors, 1 warnings, 0 checks, 80 lines checked
(Amusingly, more text wider than 80 column, perhaps suggesting the folly of
my original comment, but lets move on from that.)
As your patch doesn't introduce the usage of strlcpy() I was considering a
follow-up patch to change it to strscpy(). And in general the email at the
link above suggests all usages of strlcpy() should do so. So perhaps
creating strscpy_array is a better idea?
I have not thought about this much, and probably this just leads us to a
deeper part of the rabbit hole.
^ permalink raw reply
* Re: [PATCH v2 2/2] powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()
From: Srikar Dronamraju @ 2021-09-29 7:30 UTC (permalink / raw)
To: Nathan Lynch; +Cc: linuxppc-dev, npiggin
In-Reply-To: <20210928214147.312412-3-nathanl@linux.ibm.com>
* Nathan Lynch <nathanl@linux.ibm.com> [2021-09-28 16:41:47]:
> vcpu_is_preempted() can be used outside of preempt-disabled critical
> sections, yielding warnings such as:
>
> BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/185
> caller is rwsem_spin_on_owner+0x1cc/0x2d0
> CPU: 1 PID: 185 Comm: systemd-udevd Not tainted 5.15.0-rc2+ #33
> Call Trace:
> [c000000012907ac0] [c000000000aa30a8] dump_stack_lvl+0xac/0x108 (unreliable)
> [c000000012907b00] [c000000001371f70] check_preemption_disabled+0x150/0x160
> [c000000012907b90] [c0000000001e0e8c] rwsem_spin_on_owner+0x1cc/0x2d0
> [c000000012907be0] [c0000000001e1408] rwsem_down_write_slowpath+0x478/0x9a0
> [c000000012907ca0] [c000000000576cf4] filename_create+0x94/0x1e0
> [c000000012907d10] [c00000000057ac08] do_symlinkat+0x68/0x1a0
> [c000000012907d70] [c00000000057ae18] sys_symlink+0x58/0x70
> [c000000012907da0] [c00000000002e448] system_call_exception+0x198/0x3c0
> [c000000012907e10] [c00000000000c54c] system_call_common+0xec/0x250
>
> The result of vcpu_is_preempted() is always used speculatively, and the
> function does not access per-cpu resources in a (Linux) preempt-unsafe way.
> Use raw_smp_processor_id() to avoid such warnings, adding explanatory
> comments.
>
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> Fixes: ca3f969dcb11 ("powerpc/paravirt: Use is_kvm_guest() in vcpu_is_preempted()")
Looks good to me.
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/paravirt.h | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
> index 39f173961f6a..eb7df559ae74 100644
> --- a/arch/powerpc/include/asm/paravirt.h
> +++ b/arch/powerpc/include/asm/paravirt.h
> @@ -110,7 +110,23 @@ static inline bool vcpu_is_preempted(int cpu)
>
> #ifdef CONFIG_PPC_SPLPAR
> if (!is_kvm_guest()) {
> - int first_cpu = cpu_first_thread_sibling(smp_processor_id());
> + int first_cpu;
> +
> + /*
> + * The result of vcpu_is_preempted() is used in a
> + * speculative way, and is always subject to invalidation
> + * by events internal and external to Linux. While we can
> + * be called in preemptable context (in the Linux sense),
> + * we're not accessing per-cpu resources in a way that can
> + * race destructively with Linux scheduler preemption and
> + * migration, and callers can tolerate the potential for
> + * error introduced by sampling the CPU index without
> + * pinning the task to it. So it is permissible to use
> + * raw_smp_processor_id() here to defeat the preempt debug
> + * warnings that can arise from using smp_processor_id()
> + * in arbitrary contexts.
> + */
> + first_cpu = cpu_first_thread_sibling(raw_smp_processor_id());
>
> /*
> * The PowerVM hypervisor dispatches VMs on a whole core
> --
> 2.31.1
>
--
Thanks and Regards
Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH v2 1/2] powerpc/paravirt: vcpu_is_preempted() commentary
From: Srikar Dronamraju @ 2021-09-29 7:30 UTC (permalink / raw)
To: Nathan Lynch; +Cc: linuxppc-dev, npiggin
In-Reply-To: <20210928214147.312412-2-nathanl@linux.ibm.com>
* Nathan Lynch <nathanl@linux.ibm.com> [2021-09-28 16:41:46]:
> Add comments more clearly documenting that this function determines whether
> hypervisor-level preemption of the VM has occurred.
>
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Looks good to me.
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/paravirt.h | 22 ++++++++++++++++++----
> 1 file changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
> index bcb7b5f917be..39f173961f6a 100644
> --- a/arch/powerpc/include/asm/paravirt.h
> +++ b/arch/powerpc/include/asm/paravirt.h
> @@ -21,7 +21,7 @@ static inline bool is_shared_processor(void)
> return static_branch_unlikely(&shared_processor);
> }
>
> -/* If bit 0 is set, the cpu has been preempted */
> +/* If bit 0 is set, the cpu has been ceded, conferred, or preempted */
> static inline u32 yield_count_of(int cpu)
> {
> __be32 yield_count = READ_ONCE(lppaca_of(cpu).yield_count);
> @@ -92,6 +92,19 @@ static inline void prod_cpu(int cpu)
> #define vcpu_is_preempted vcpu_is_preempted
> static inline bool vcpu_is_preempted(int cpu)
> {
> + /*
> + * The dispatch/yield bit alone is an imperfect indicator of
> + * whether the hypervisor has dispatched @cpu to run on a physical
> + * processor. When it is clear, @cpu is definitely not preempted.
> + * But when it is set, it means only that it *might* be, subject to
> + * other conditions. So we check other properties of the VM and
> + * @cpu first, resorting to the yield count last.
> + */
> +
> + /*
> + * Hypervisor preemption isn't possible in dedicated processor
> + * mode by definition.
> + */
> if (!is_shared_processor())
> return false;
>
> @@ -100,9 +113,10 @@ static inline bool vcpu_is_preempted(int cpu)
> int first_cpu = cpu_first_thread_sibling(smp_processor_id());
>
> /*
> - * Preemption can only happen at core granularity. This CPU
> - * is not preempted if one of the CPU of this core is not
> - * preempted.
> + * The PowerVM hypervisor dispatches VMs on a whole core
> + * basis. So we know that a thread sibling of the local CPU
> + * cannot have been preempted by the hypervisor, even if it
> + * has called H_CONFER, which will set the yield bit.
> */
> if (cpu_first_thread_sibling(cpu) == first_cpu)
> return false;
> --
> 2.31.1
>
--
Thanks and Regards
Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH] powerpc: fix unbalanced node refcount in check_kvm_guest()
From: Srikar Dronamraju @ 2021-09-29 7:29 UTC (permalink / raw)
To: Nathan Lynch; +Cc: linuxppc-dev, npiggin
In-Reply-To: <20210928124550.132020-1-nathanl@linux.ibm.com>
* Nathan Lynch <nathanl@linux.ibm.com> [2021-09-28 07:45:50]:
> When check_kvm_guest() succeeds in looking up a /hypervisor OF node, it
> returns without performing a matching put for the lookup, leaving the
> node's reference count elevated.
>
> Add the necessary call to of_node_put(), rearranging the code slightly to
> avoid repetition or goto.
>
Looks good to me.
I do see few other cases where we call of_find_node calls without
of_node_put().
Some of them that I saw were in
find_legacy_serial_ports() in arch/powerpc/kernel/legacy_serial.c
proc_ppc64_create in arch/powerpc/proc/powerpc.c
update_events_in_group in arch/powerpc/perf/imc-pmu.c
cell_iommu_init_disabled in arch/powerpc/platforms/cell/iommu.c
cell_publish_devices in arch/powerpc/platforms/cell/setup.c
spufs_init_isolated_loader in arch/powerpc/platforms/cell/spufs/inode.c
holly_init_pci / holly_restart and holly_init_IRQ in arch/powerpc/platforms/embedded6xx/holly.c
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> Fixes: 107c55005fbd ("powerpc/pseries: Add KVM guest doorbell restrictions")
> ---
> arch/powerpc/kernel/firmware.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kernel/firmware.c b/arch/powerpc/kernel/firmware.c
> index c7022c41cc31..20328f72f9f2 100644
> --- a/arch/powerpc/kernel/firmware.c
> +++ b/arch/powerpc/kernel/firmware.c
> @@ -31,11 +31,10 @@ int __init check_kvm_guest(void)
> if (!hyper_node)
> return 0;
>
> - if (!of_device_is_compatible(hyper_node, "linux,kvm"))
> - return 0;
> -
> - static_branch_enable(&kvm_guest);
> + if (of_device_is_compatible(hyper_node, "linux,kvm"))
> + static_branch_enable(&kvm_guest);
>
> + of_node_put(hyper_node);
> return 0;
> }
> core_initcall(check_kvm_guest); // before kvm_guest_init()
> --
> 2.31.1
>
--
Thanks and Regards
Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH kernel] powerps/pseries/dma: Add support for 2M IOMMU page size
From: Frederic Barrat @ 2021-09-29 7:00 UTC (permalink / raw)
To: Alexey Kardashevskiy, linuxppc-dev
Cc: Leonardo Augusto Guimaraes Garcia, Leonardo Bras, Travis Pizel,
Brian J King, Murilo Vicentini
In-Reply-To: <20210928101521.3956331-1-aik@ozlabs.ru>
On 28/09/2021 12:15, Alexey Kardashevskiy wrote:
> The upcoming PAPR spec adds a 2M page size, bit 23 right after the 16G page
> size in the "ibm,query-pe-dma-window" call.
>
> This adds support for the new page size. Since the new page size is out
> of sorted order, this changes the loop to not assume that shift[] is
> sorted.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
Looks ok to me
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
>
> This might not work if PHYP keeps rejecting new window requests for less
> than 32768 TCEs. This is needed:
> https://github.com/aik/linux/commit/8cc8fa5ba5b3b4a18efbc9d81d9e5b85ca7c8a95
>
>
> ---
> arch/powerpc/platforms/pseries/iommu.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> index c741689a5165..237bf405b0cb 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -1159,14 +1159,15 @@ static void reset_dma_window(struct pci_dev *dev, struct device_node *par_dn)
> /* Return largest page shift based on "IO Page Sizes" output of ibm,query-pe-dma-window. */
> static int iommu_get_page_shift(u32 query_page_size)
> {
> - /* Supported IO page-sizes according to LoPAR */
> + /* Supported IO page-sizes according to LoPAR, note that 2M is out of order */
> const int shift[] = {
> __builtin_ctzll(SZ_4K), __builtin_ctzll(SZ_64K), __builtin_ctzll(SZ_16M),
> __builtin_ctzll(SZ_32M), __builtin_ctzll(SZ_64M), __builtin_ctzll(SZ_128M),
> - __builtin_ctzll(SZ_256M), __builtin_ctzll(SZ_16G)
> + __builtin_ctzll(SZ_256M), __builtin_ctzll(SZ_16G), __builtin_ctzll(SZ_2M)
> };
>
> int i = ARRAY_SIZE(shift) - 1;
> + int ret = 0;
>
> /*
> * On LoPAR, ibm,query-pe-dma-window outputs "IO Page Sizes" using a bit field:
> @@ -1176,11 +1177,10 @@ static int iommu_get_page_shift(u32 query_page_size)
> */
> for (; i >= 0 ; i--) {
> if (query_page_size & (1 << i))
> - return shift[i];
> + ret = max(ret, shift[i]);
> }
>
> - /* No valid page size found. */
> - return 0;
> + return ret;
> }
>
> static struct property *ddw_property_create(const char *propname, u32 liobn, u64 dma_addr,
>
^ permalink raw reply
* Re: [next-20210827][ppc][multipathd] INFO: task hung in dm_table_add_target
From: Christoph Hellwig @ 2021-09-29 6:52 UTC (permalink / raw)
To: Abdul Haleem
Cc: axboe, sachinp, jack, linux-scsi, linux-kernel, dm-devel,
linux-next, dougmill, Brian King, linuxppc-dev, Christoph Hellwig
In-Reply-To: <86f8e47b-84ca-4cc3-5d5b-f5f11c1d4d1a@linux.vnet.ibm.com>
On Tue, Sep 28, 2021 at 03:53:47PM +0530, Abdul Haleem wrote:
>
> On 9/1/21 7:06 PM, Christoph Hellwig wrote:
>> On Wed, Sep 01, 2021 at 04:47:26PM +0530, Abdul Haleem wrote:
>>> Greeting's
>>>
>>> multiple task hung while adding the vfc disk back to the multipath on my
>>> powerpc box running linux-next kernel
>> Can you retry to reproduce this with lockdep enabled to see if there
>> is anything interesting holding this lock?
>
> LOCKDEP was earlier enabled by default
>
> # cat .config | grep LOCKDEP
> CONFIG_LOCKDEP_SUPPORT=y
>
> BTW, Recreated again on 5.15.0-rc2 mainline kernel and attaching the logs
It seems the reinstate is blocking on the close which is blocking on
flushing dirty data. In other words it looks like the link blocking
looks like the symptom and not the cause.
^ permalink raw reply
* Re: [PATCH v2 4/4] powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die
From: Michael Ellerman @ 2021-09-29 5:45 UTC (permalink / raw)
To: Daniel Henrique Barboza, Nathan Lynch, linuxppc-dev
Cc: tyreld, ldufour, aneesh.kumar
In-Reply-To: <4479e869-1c98-4473-262c-3aeb37b8fca2@gmail.com>
Daniel Henrique Barboza <danielhb413@gmail.com> writes:
> On 9/27/21 17:19, Nathan Lynch wrote:
>> This comment likely refers to the obsolete DLPAR workflow where some
>> resource state transitions were driven more directly from user space
>> utilities, but it also seems to contradict itself: "Change isolate state to
>> Isolate [...]" is at odds with the preceding sentences, and it does not
>> relate at all to the code that follows.
>
> This comment was added by commit 413f7c405a34, a 2006 commit where Mike
> Ellerman moved code from platform/pseries/smp.c into hotplug-cpu.c.
>
> I checked the original code back in smp.c and this comment was added there
> by commit 1da177e4c3f41, which is Linus's initial git commit, where he mentions
> that he didn't bothered with full history (although it is available somewhere,
> allegedly).
>
> This is enough to say that we can't easily see the history behind this comment.
> I also believe that we're better of without it since it doesn't make sense
> with the current codebase.
It was added by the original CPU hotplug commit for ppc64::
https://github.com/mpe/linux-fullhistory/commit/0e9fd9441cd2113b67b14e739267c9e69761489b
The code was fairly similar:
void __cpu_die(unsigned int cpu)
{
int tries;
int cpu_status;
unsigned int pcpu = get_hard_smp_processor_id(cpu);
for (tries = 0; tries < 5; tries++) {
cpu_status = query_cpu_stopped(pcpu);
if (cpu_status == 0)
break;
set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout(HZ);
}
if (cpu_status != 0) {
printk("Querying DEAD? cpu %i (%i) shows %i\n",
cpu, pcpu, cpu_status);
}
/* Isolation and deallocation are definatly done by
* drslot_chrp_cpu. If they were not they would be
* done here. Change isolate state to Isolate and
* change allocation-state to Unusable.
*/
paca[cpu].xProcStart = 0;
/* So we can recognize if it fails to come up next time. */
cpu_callin_map[cpu] = 0;
}
drslot_chrp_cpu() still exists in drmgr:
https://github.com/ibm-power-utilities/powerpc-utils/blob/e798c4a09fbf0fa0f421e624cfa366a6c405c9fe/src/drmgr/drslot_chrp_cpu.c#L406
I agree the comment is no longer meaningful and can be removed.
It might be good to then add a comment explaining why we need to set
cpu_start = 0.
It's not immediately clear why we need to. When we bring a CPU back
online in smp_pSeries_kick_cpu() we ask RTAS to start it and then
immediately set cpu_start = 1, ie. there isn't a separate step that sets
cpu_start = 1 for hotplugged CPUs.
cheers
^ permalink raw reply
* [PATCH v2 2/2] powerpc:85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
From: Xiaoming Ni @ 2021-09-29 3:36 UTC (permalink / raw)
To: oss, mpe, benh, paulus, paul.gortmaker, chenhui.zhao,
Yuantian.Tang, linuxppc-dev, linux-kernel, stable, gregkh
Cc: liuwenliang, wangle6, nixiaoming, chenjianguo3
In-Reply-To: <20210929033646.39630-1-nixiaoming@huawei.com>
When CONFIG_SMP=y, timebase synchronization is required when the second
kernel is started.
arch/powerpc/kernel/smp.c:
int __cpu_up(unsigned int cpu, struct task_struct *tidle)
{
...
if (smp_ops->give_timebase)
smp_ops->give_timebase();
...
}
void start_secondary(void *unused)
{
...
if (smp_ops->take_timebase)
smp_ops->take_timebase();
...
}
When CONFIG_HOTPLUG_CPU=n and CONFIG_KEXEC_CORE=n,
smp_85xx_ops.give_timebase is NULL,
smp_85xx_ops.take_timebase is NULL,
As a result, the timebase is not synchronized.
Timebase synchronization does not depend on CONFIG_HOTPLUG_CPU.
Fixes: 56f1ba280719 ("powerpc/mpc85xx: refactor the PM operations")
Cc: stable@vger.kernel.org #v4.6
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
---
arch/powerpc/platforms/85xx/Makefile | 4 +++-
arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c | 4 ++++
arch/powerpc/platforms/85xx/smp.c | 12 ++++++------
3 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/platforms/85xx/Makefile b/arch/powerpc/platforms/85xx/Makefile
index 60e4e97a929d..260fbad7967b 100644
--- a/arch/powerpc/platforms/85xx/Makefile
+++ b/arch/powerpc/platforms/85xx/Makefile
@@ -3,7 +3,9 @@
# Makefile for the PowerPC 85xx linux kernel.
#
obj-$(CONFIG_SMP) += smp.o
-obj-$(CONFIG_FSL_PMC) += mpc85xx_pm_ops.o
+ifneq ($(CONFIG_FSL_CORENET_RCPM),y)
+obj-$(CONFIG_SMP) += mpc85xx_pm_ops.o
+endif
obj-y += common.o
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c b/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
index ffa8a7a6a2db..4a8af80011a6 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
@@ -17,6 +17,7 @@
static struct ccsr_guts __iomem *guts;
+#ifdef CONFIG_FSL_PMC
static void mpc85xx_irq_mask(int cpu)
{
@@ -49,6 +50,7 @@ static void mpc85xx_cpu_up_prepare(int cpu)
{
}
+#endif
static void mpc85xx_freeze_time_base(bool freeze)
{
@@ -76,10 +78,12 @@ static const struct of_device_id mpc85xx_smp_guts_ids[] = {
static const struct fsl_pm_ops mpc85xx_pm_ops = {
.freeze_time_base = mpc85xx_freeze_time_base,
+#ifdef CONFIG_FSL_PMC
.irq_mask = mpc85xx_irq_mask,
.irq_unmask = mpc85xx_irq_unmask,
.cpu_die = mpc85xx_cpu_die,
.cpu_up_prepare = mpc85xx_cpu_up_prepare,
+#endif
};
int __init mpc85xx_setup_pmc(void)
diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index c6df294054fe..83f4a6389a28 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -40,7 +40,6 @@ struct epapr_spin_table {
u32 pir;
};
-#ifdef CONFIG_HOTPLUG_CPU
static u64 timebase;
static int tb_req;
static int tb_valid;
@@ -112,6 +111,7 @@ static void mpc85xx_take_timebase(void)
local_irq_restore(flags);
}
+#ifdef CONFIG_HOTPLUG_CPU
static void smp_85xx_cpu_offline_self(void)
{
unsigned int cpu = smp_processor_id();
@@ -495,21 +495,21 @@ void __init mpc85xx_smp_init(void)
smp_85xx_ops.probe = NULL;
}
-#ifdef CONFIG_HOTPLUG_CPU
#ifdef CONFIG_FSL_CORENET_RCPM
+ /* Assign a value to qoriq_pm_ops on PPC_E500MC */
fsl_rcpm_init();
-#endif
-
-#ifdef CONFIG_FSL_PMC
+#else
+ /* Assign a value to qoriq_pm_ops on !PPC_E500MC */
mpc85xx_setup_pmc();
#endif
if (qoriq_pm_ops) {
smp_85xx_ops.give_timebase = mpc85xx_give_timebase;
smp_85xx_ops.take_timebase = mpc85xx_take_timebase;
+#ifdef CONFIG_HOTPLUG_CPU
smp_85xx_ops.cpu_offline_self = smp_85xx_cpu_offline_self;
smp_85xx_ops.cpu_die = qoriq_cpu_kill;
- }
#endif
+ }
smp_ops = &smp_85xx_ops;
#ifdef CONFIG_KEXEC_CORE
--
2.27.0
^ permalink raw reply related
* [PATCH v2 1/2] powerpc:85xx:Fix oops when mpc85xx_smp_guts_ids node cannot be found
From: Xiaoming Ni @ 2021-09-29 3:36 UTC (permalink / raw)
To: oss, mpe, benh, paulus, paul.gortmaker, chenhui.zhao,
Yuantian.Tang, linuxppc-dev, linux-kernel, stable, gregkh
Cc: liuwenliang, wangle6, nixiaoming, chenjianguo3
In-Reply-To: <20210929033646.39630-1-nixiaoming@huawei.com>
When the field described in mpc85xx_smp_guts_ids[] is not configured in
dtb, the mpc85xx_setup_pmc() does not assign a value to the "guts"
variable. As a result, the oops is triggered when
mpc85xx_freeze_time_base() is executed.
Fixes:56f1ba280719 ("powerpc/mpc85xx: refactor the PM operations")
Cc: stable@vger.kernel.org #v4.6
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
---
arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c b/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
index 7c0133f558d0..ffa8a7a6a2db 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
@@ -94,9 +94,8 @@ int __init mpc85xx_setup_pmc(void)
pr_err("Could not map guts node address\n");
return -ENOMEM;
}
+ qoriq_pm_ops = &mpc85xx_pm_ops;
}
- qoriq_pm_ops = &mpc85xx_pm_ops;
-
return 0;
}
--
2.27.0
^ permalink raw reply related
* [PATCH v2 0/2] powerpc:85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
From: Xiaoming Ni @ 2021-09-29 3:36 UTC (permalink / raw)
To: oss, mpe, benh, paulus, paul.gortmaker, chenhui.zhao,
Yuantian.Tang, linuxppc-dev, linux-kernel, stable, gregkh
Cc: liuwenliang, wangle6, nixiaoming, chenjianguo3
In-Reply-To: <021a5ee3-25ef-1de4-0111-d4c3281e0f45@huawei.com>
When CONFIG_SMP=y, timebase synchronization is required for mpc8572 when
the second kernel is started
arch/powerpc/kernel/smp.c:
int __cpu_up(unsigned int cpu, struct task_struct *tidle)
{
...
if (smp_ops->give_timebase)
smp_ops->give_timebase();
...
}
void start_secondary(void *unused)
{
...
if (smp_ops->take_timebase)
smp_ops->take_timebase();
...
}
When CONFIG_HOTPLUG_CPU=n and CONFIG_KEXEC_CORE=n,
smp_85xx_ops.give_timebase is NULL,
smp_85xx_ops.take_timebase is NULL,
As a result, the timebase is not synchronized.
test code:
for i in $(seq 1 3); do taskset 1 date; taskset 2 date; sleep 1; echo;done
log:
Sat Sep 25 18:50:00 CST 2021
Sat Sep 25 19:07:47 CST 2021
Sat Sep 25 18:50:01 CST 2021
Sat Sep 25 19:07:48 CST 2021
Sat Sep 25 18:50:02 CST 2021
Sat Sep 25 19:07:49 CST 2021
Code snippet about give_timebase and take_timebase assignments:
arch/powerpc/platforms/85xx/smp.c:
#ifdef CONFIG_HOTPLUG_CPU
#ifdef CONFIG_FSL_CORENET_RCPM
fsl_rcpm_init();
#endif
#ifdef CONFIG_FSL_PMC
mpc85xx_setup_pmc();
#endif
if (qoriq_pm_ops) {
smp_85xx_ops.give_timebase = mpc85xx_give_timebase;
smp_85xx_ops.take_timebase = mpc85xx_take_timebase;
config dependency:
FSL_CORENET_RCPM depends on the PPC_E500MC.
FSL_PMC depends on SUSPEND.
SUSPEND depends on ARCH_SUSPEND_POSSIBLE.
ARCH_SUSPEND_POSSIBLE depends on !PPC_E500MC.
CONFIG_HOTPLUG_CPU and CONFIG_FSL_PMC require the timebase function, but
the timebase should not depend on CONFIG_HOTPLUG_CPU and CONFIG_FSL_PMC.
Therefore, adjust the macro control range. Ensure that the corresponding
timebase hook function is not empty when the dtsi node is configured.
-----
changes in v2:
1. add new patch: "powerpc:85xx:Fix oops when mpc85xx_smp_guts_ids node
cannot be found"
2. Using !CONFIG_FSL_CORENET_RCPM to manage the timebase code of !PPC_E500MC
v1:
https://lore.kernel.org/lkml/20210926025144.55674-1-nixiaoming@huawei.com
------
Xiaoming Ni (2):
powerpc:85xx:Fix oops when mpc85xx_smp_guts_ids node cannot be found
powerpc:85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
arch/powerpc/platforms/85xx/Makefile | 4 +++-
arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c | 7 +++++--
arch/powerpc/platforms/85xx/smp.c | 12 ++++++------
3 files changed, 14 insertions(+), 9 deletions(-)
--
2.27.0
^ permalink raw reply
* Re: [PATCH v2 2/4] mm: Make generic arch_is_kernel_initmem_freed() do what it says
From: Andrew Morton @ 2021-09-29 3:30 UTC (permalink / raw)
To: Christophe Leroy
Cc: linux-arch, linux-s390, arnd, Stephen Rothwell, linux-kernel,
linux-mm, Gerald Schaefer, linuxppc-dev
In-Reply-To: <ec69cc95a98548c862c22b742936244fdb0c7984.1632813331.git.christophe.leroy@csgroup.eu>
On Tue, 28 Sep 2021 09:15:35 +0200 Christophe Leroy <christophe.leroy@csgroup.eu> wrote:
> Commit 7a5da02de8d6 ("locking/lockdep: check for freed initmem in
> static_obj()") added arch_is_kernel_initmem_freed() which is supposed
> to report whether an object is part of already freed init memory.
>
> For the time being, the generic version of arch_is_kernel_initmem_freed()
> always reports 'false', allthough free_initmem() is generically called
> on all architectures.
>
> Therefore, change the generic version of arch_is_kernel_initmem_freed()
> to check whether free_initmem() has been called. If so, then check
> if a given address falls into init memory.
>
> In order to use function init_section_contains(), the fonction is
> moved at the end of asm-generic/section.h
i386 allmodconfig:
In file included from arch/x86/platform/intel-quark/imr.c:28:
./include/asm-generic/sections.h: In function 'arch_is_kernel_initmem_freed':
./include/asm-generic/sections.h:171:6: error: 'system_state' undeclared (first use in this function)
171 | if (system_state < SYSTEM_FREEING_INITMEM)
| ^~~~~~~~~~~~
./include/asm-generic/sections.h:171:6: note: each undeclared identifier is reported only once for each function it appears in
./include/asm-generic/sections.h:171:21: error: 'SYSTEM_FREEING_INITMEM' undeclared (first use in this function)
171 | if (system_state < SYSTEM_FREEING_INITMEM)
| ^~~~~~~~~~~~~~~~~~~~~~
I don't think it would be a good idea to include kernel.h from
sections.h - it's unclear to me which is the "innermost" of those two.
It would be better to uninline arch_is_kernel_initmem_freed(). Surely
there's no real reason for inlining it?
Anyway, I'll drop the series for now.
^ permalink raw reply
* Re: [PATCH v2 9/9] iommu/vt-d: Use pci core's DVSEC functionality
From: Lu Baolu @ 2021-09-29 1:36 UTC (permalink / raw)
To: Dan Williams, Ben Widawsky
Cc: Andrew Donnellan, Linux PCI, linuxppc-dev, linux-cxl,
open list:DMA MAPPING HELPERS, Bjorn Helgaas, David E. Box,
Frederic Barrat, Kan Liang, David Woodhouse, baolu.lu
In-Reply-To: <CAPcyv4i4T4XLW-P=CzdO47mZ8+_Mih7GMeDEXAtgEE+gO9JQHw@mail.gmail.com>
Hi Dan,
On 9/29/21 1:54 AM, Dan Williams wrote:
> On Thu, Sep 23, 2021 at 10:27 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>>
>> Reduce maintenance burden of DVSEC query implementation by using the
>> centralized PCI core implementation.
>>
>> Cc: iommu@lists.linux-foundation.org
>> Cc: David Woodhouse <dwmw2@infradead.org>
>> Cc: Lu Baolu <baolu.lu@linux.intel.com>
>> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>> ---
>> drivers/iommu/intel/iommu.c | 15 +--------------
>> 1 file changed, 1 insertion(+), 14 deletions(-)
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index d75f59ae28e6..30c97181f0ae 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -5398,20 +5398,7 @@ static int intel_iommu_disable_sva(struct device *dev)
>> */
>> static int siov_find_pci_dvsec(struct pci_dev *pdev)
>> {
>> - int pos;
>> - u16 vendor, id;
>> -
>> - pos = pci_find_next_ext_capability(pdev, 0, 0x23);
>> - while (pos) {
>> - pci_read_config_word(pdev, pos + 4, &vendor);
>> - pci_read_config_word(pdev, pos + 8, &id);
>> - if (vendor == PCI_VENDOR_ID_INTEL && id == 5)
>> - return pos;
>> -
>> - pos = pci_find_next_ext_capability(pdev, pos, 0x23);
>> - }
>> -
>> - return 0;
>> + return pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_INTEL, 5);
>> }
>
> Same comments as the CXL patch, siov_find_pci_dvsec() doesn't seem to
> have a reason to exist anymore. What is 5?
"5" is DVSEC ID for Scalable IOV.
Anyway, the siov_find_pci_dvsec() has been dead code since commit
262948f8ba57 ("iommu: Delete iommu_dev_has_feature()"). I have a patch
to clean it up. No need to care about it in this series.
Best regards,
baolu
^ permalink raw reply
* Re: [PATCH v3 9/9] powerpc/mm: Use is_kernel_text() and is_kernel_inittext() helper
From: Kefeng Wang @ 2021-09-29 1:14 UTC (permalink / raw)
To: Christophe Leroy, arnd, linux-arch, linux-kernel, linuxppc-dev,
rostedt, mingo, davem, ast, ryabinin.a.a, akpm
Cc: bpf, paulus
In-Reply-To: <c5895fa8-ed3d-74c7-1d71-4d90dee9ea4b@csgroup.eu>
On 2021/9/29 1:51, Christophe Leroy wrote:
>
>
> Le 26/09/2021 à 09:20, Kefeng Wang a écrit :
>> Use is_kernel_text() and is_kernel_inittext() helper to simplify code,
>> also drop etext, _stext, _sinittext, _einittext declaration which
>> already declared in section.h.
>>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>> arch/powerpc/mm/pgtable_32.c | 7 ++-----
>> 1 file changed, 2 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
>> index dcf5ecca19d9..13c798308c2e 100644
>> --- a/arch/powerpc/mm/pgtable_32.c
>> +++ b/arch/powerpc/mm/pgtable_32.c
>> @@ -33,8 +33,6 @@
>> #include <mm/mmu_decl.h>
>> -extern char etext[], _stext[], _sinittext[], _einittext[];
>> -
>> static u8 early_fixmap_pagetable[FIXMAP_PTE_SIZE] __page_aligned_data;
>> notrace void __init early_ioremap_init(void)
>> @@ -104,14 +102,13 @@ static void __init __mapin_ram_chunk(unsigned
>> long offset, unsigned long top)
>> {
>> unsigned long v, s;
>> phys_addr_t p;
>> - int ktext;
>> + bool ktext;
>> s = offset;
>> v = PAGE_OFFSET + s;
>> p = memstart_addr + s;
>> for (; s < top; s += PAGE_SIZE) {
>> - ktext = ((char *)v >= _stext && (char *)v < etext) ||
>> - ((char *)v >= _sinittext && (char *)v < _einittext);
>> + ktext = (is_kernel_text(v) || is_kernel_inittext(v));
>
> I think we could use core_kernel_next() instead.
Indead. oops, sorry for the build error, will update, thanks.
>
> Build failure on mpc885_ads_defconfig
>
> arch/powerpc/mm/pgtable_32.c: In function '__mapin_ram_chunk':
> arch/powerpc/mm/pgtable_32.c:111:26: error: implicit declaration of
> function 'is_kernel_text'; did you mean 'is_kernel_inittext'?
> [-Werror=implicit-function-declaration]
> 111 | ktext = (is_kernel_text(v) ||
> is_kernel_inittext(v));
> | ^~~~~~~~~~~~~~
> | is_kernel_inittext
> cc1: all warnings being treated as errors
> make[2]: *** [scripts/Makefile.build:277: arch/powerpc/mm/pgtable_32.o]
> Error 1
> make[1]: *** [scripts/Makefile.build:540: arch/powerpc/mm] Error 2
> make: *** [Makefile:1868: arch/powerpc] Error 2
>
>
> .
^ permalink raw reply
* Re: [PATCH v2 4/4] powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die
From: Daniel Henrique Barboza @ 2021-09-29 0:14 UTC (permalink / raw)
To: Nathan Lynch, linuxppc-dev; +Cc: tyreld, ldufour, aneesh.kumar
In-Reply-To: <20210927201933.76786-5-nathanl@linux.ibm.com>
On 9/27/21 17:19, Nathan Lynch wrote:
> This comment likely refers to the obsolete DLPAR workflow where some
> resource state transitions were driven more directly from user space
> utilities, but it also seems to contradict itself: "Change isolate state to
> Isolate [...]" is at odds with the preceding sentences, and it does not
> relate at all to the code that follows.
This comment was added by commit 413f7c405a34, a 2006 commit where Mike
Ellerman moved code from platform/pseries/smp.c into hotplug-cpu.c.
I checked the original code back in smp.c and this comment was added there
by commit 1da177e4c3f41, which is Linus's initial git commit, where he mentions
that he didn't bothered with full history (although it is available somewhere,
allegedly).
This is enough to say that we can't easily see the history behind this comment.
I also believe that we're better of without it since it doesn't make sense
with the current codebase.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
>
> Remove it to prevent confusion.
>
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> ---
> arch/powerpc/platforms/pseries/hotplug-cpu.c | 5 -----
> 1 file changed, 5 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index b50f3e9aa259..5ab44600c8d3 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -137,11 +137,6 @@ static void pseries_cpu_die(unsigned int cpu)
> cpu, pcpu);
> }
>
> - /* Isolation and deallocation are definitely done by
> - * drslot_chrp_cpu. If they were not they would be
> - * done here. Change isolate state to Isolate and
> - * change allocation-state to Unusable.
> - */
> paca_ptrs[cpu]->cpu_start = 0;
> }
>
>
^ permalink raw reply
* Re: [RFC PATCH 7/8] riscv: rely on core code to keep thread_info::cpu updated
From: Palmer Dabbelt @ 2021-09-28 22:07 UTC (permalink / raw)
To: ardb
Cc: peterz, paulus, linux-riscv, will, ardb, linux-s390,
Arnd Bergmann, linux, borntraeger, mingo, catalin.marinas, aou,
keescook, gor, hca, keithpac, bp, luto, Paul Walmsley, tglx,
linux-arm-kernel, linuxppc-dev, linux-kernel, Linus Torvalds
In-Reply-To: <20210914121036.3975026-8-ardb@kernel.org>
On Tue, 14 Sep 2021 05:10:35 PDT (-0700), ardb@kernel.org wrote:
> Now that the core code switched back to using thread_info::cpu to keep
> a task's CPU number, we no longer need to keep it in sync explicitly. So
> just drop the code that does this.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> arch/riscv/kernel/asm-offsets.c | 1 -
> arch/riscv/kernel/entry.S | 5 -----
> arch/riscv/kernel/head.S | 1 -
> 3 files changed, 7 deletions(-)
>
> diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
> index 90f8ce64fa6f..478d9f02dab5 100644
> --- a/arch/riscv/kernel/asm-offsets.c
> +++ b/arch/riscv/kernel/asm-offsets.c
> @@ -33,7 +33,6 @@ void asm_offsets(void)
> OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count);
> OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
> OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp);
> - OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu);
>
> OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]);
> OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]);
> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> index 98f502654edd..459eb1714353 100644
> --- a/arch/riscv/kernel/entry.S
> +++ b/arch/riscv/kernel/entry.S
> @@ -544,11 +544,6 @@ ENTRY(__switch_to)
> REG_L s9, TASK_THREAD_S9_RA(a4)
> REG_L s10, TASK_THREAD_S10_RA(a4)
> REG_L s11, TASK_THREAD_S11_RA(a4)
> - /* Swap the CPU entry around. */
> - lw a3, TASK_TI_CPU(a0)
> - lw a4, TASK_TI_CPU(a1)
> - sw a3, TASK_TI_CPU(a1)
> - sw a4, TASK_TI_CPU(a0)
> /* The offset of thread_info in task_struct is zero. */
> move tp, a1
> ret
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index fce5184b22c3..d5ec30ef6f5d 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -317,7 +317,6 @@ clear_bss_done:
> call setup_trap_vector
> /* Restore C environment */
> la tp, init_task
> - sw zero, TASK_TI_CPU(tp)
> la sp, init_thread_union + THREAD_SIZE
>
> #ifdef CONFIG_KASAN
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Thanks!
^ permalink raw reply
* [PATCH v2 0/2] powerpc/paravirt: vcpu_is_preempted() tweaks
From: Nathan Lynch @ 2021-09-28 21:41 UTC (permalink / raw)
To: linuxppc-dev; +Cc: srikar, npiggin
Minor changes arising from discovering that this code throws warnings with
DEBUG_PREEMPT kernels.
Changes since v1:
* Additional commentary to (1) distinguish hypervisor dispatch and preempt
behavior from kernel scheduler preemption; and (2) more clearly justify
the use of raw_smp_processor_id().
* Additional patch to update existing comments before making the functional
change.
v1: https://lore.kernel.org/linuxppc-dev/20210921031213.2029824-1-nathanl@linux.ibm.com/
Nathan Lynch (2):
powerpc/paravirt: vcpu_is_preempted() commentary
powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()
arch/powerpc/include/asm/paravirt.h | 40 +++++++++++++++++++++++++----
1 file changed, 35 insertions(+), 5 deletions(-)
--
2.31.1
^ permalink raw reply
* [PATCH v2 2/2] powerpc/paravirt: correct preempt debug splat in vcpu_is_preempted()
From: Nathan Lynch @ 2021-09-28 21:41 UTC (permalink / raw)
To: linuxppc-dev; +Cc: srikar, npiggin
In-Reply-To: <20210928214147.312412-1-nathanl@linux.ibm.com>
vcpu_is_preempted() can be used outside of preempt-disabled critical
sections, yielding warnings such as:
BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/185
caller is rwsem_spin_on_owner+0x1cc/0x2d0
CPU: 1 PID: 185 Comm: systemd-udevd Not tainted 5.15.0-rc2+ #33
Call Trace:
[c000000012907ac0] [c000000000aa30a8] dump_stack_lvl+0xac/0x108 (unreliable)
[c000000012907b00] [c000000001371f70] check_preemption_disabled+0x150/0x160
[c000000012907b90] [c0000000001e0e8c] rwsem_spin_on_owner+0x1cc/0x2d0
[c000000012907be0] [c0000000001e1408] rwsem_down_write_slowpath+0x478/0x9a0
[c000000012907ca0] [c000000000576cf4] filename_create+0x94/0x1e0
[c000000012907d10] [c00000000057ac08] do_symlinkat+0x68/0x1a0
[c000000012907d70] [c00000000057ae18] sys_symlink+0x58/0x70
[c000000012907da0] [c00000000002e448] system_call_exception+0x198/0x3c0
[c000000012907e10] [c00000000000c54c] system_call_common+0xec/0x250
The result of vcpu_is_preempted() is always used speculatively, and the
function does not access per-cpu resources in a (Linux) preempt-unsafe way.
Use raw_smp_processor_id() to avoid such warnings, adding explanatory
comments.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: ca3f969dcb11 ("powerpc/paravirt: Use is_kvm_guest() in vcpu_is_preempted()")
---
arch/powerpc/include/asm/paravirt.h | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
index 39f173961f6a..eb7df559ae74 100644
--- a/arch/powerpc/include/asm/paravirt.h
+++ b/arch/powerpc/include/asm/paravirt.h
@@ -110,7 +110,23 @@ static inline bool vcpu_is_preempted(int cpu)
#ifdef CONFIG_PPC_SPLPAR
if (!is_kvm_guest()) {
- int first_cpu = cpu_first_thread_sibling(smp_processor_id());
+ int first_cpu;
+
+ /*
+ * The result of vcpu_is_preempted() is used in a
+ * speculative way, and is always subject to invalidation
+ * by events internal and external to Linux. While we can
+ * be called in preemptable context (in the Linux sense),
+ * we're not accessing per-cpu resources in a way that can
+ * race destructively with Linux scheduler preemption and
+ * migration, and callers can tolerate the potential for
+ * error introduced by sampling the CPU index without
+ * pinning the task to it. So it is permissible to use
+ * raw_smp_processor_id() here to defeat the preempt debug
+ * warnings that can arise from using smp_processor_id()
+ * in arbitrary contexts.
+ */
+ first_cpu = cpu_first_thread_sibling(raw_smp_processor_id());
/*
* The PowerVM hypervisor dispatches VMs on a whole core
--
2.31.1
^ permalink raw reply related
* [PATCH v2 1/2] powerpc/paravirt: vcpu_is_preempted() commentary
From: Nathan Lynch @ 2021-09-28 21:41 UTC (permalink / raw)
To: linuxppc-dev; +Cc: srikar, npiggin
In-Reply-To: <20210928214147.312412-1-nathanl@linux.ibm.com>
Add comments more clearly documenting that this function determines whether
hypervisor-level preemption of the VM has occurred.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
arch/powerpc/include/asm/paravirt.h | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
index bcb7b5f917be..39f173961f6a 100644
--- a/arch/powerpc/include/asm/paravirt.h
+++ b/arch/powerpc/include/asm/paravirt.h
@@ -21,7 +21,7 @@ static inline bool is_shared_processor(void)
return static_branch_unlikely(&shared_processor);
}
-/* If bit 0 is set, the cpu has been preempted */
+/* If bit 0 is set, the cpu has been ceded, conferred, or preempted */
static inline u32 yield_count_of(int cpu)
{
__be32 yield_count = READ_ONCE(lppaca_of(cpu).yield_count);
@@ -92,6 +92,19 @@ static inline void prod_cpu(int cpu)
#define vcpu_is_preempted vcpu_is_preempted
static inline bool vcpu_is_preempted(int cpu)
{
+ /*
+ * The dispatch/yield bit alone is an imperfect indicator of
+ * whether the hypervisor has dispatched @cpu to run on a physical
+ * processor. When it is clear, @cpu is definitely not preempted.
+ * But when it is set, it means only that it *might* be, subject to
+ * other conditions. So we check other properties of the VM and
+ * @cpu first, resorting to the yield count last.
+ */
+
+ /*
+ * Hypervisor preemption isn't possible in dedicated processor
+ * mode by definition.
+ */
if (!is_shared_processor())
return false;
@@ -100,9 +113,10 @@ static inline bool vcpu_is_preempted(int cpu)
int first_cpu = cpu_first_thread_sibling(smp_processor_id());
/*
- * Preemption can only happen at core granularity. This CPU
- * is not preempted if one of the CPU of this core is not
- * preempted.
+ * The PowerVM hypervisor dispatches VMs on a whole core
+ * basis. So we know that a thread sibling of the local CPU
+ * cannot have been preempted by the hypervisor, even if it
+ * has called H_CONFER, which will set the yield bit.
*/
if (cpu_first_thread_sibling(cpu) == first_cpu)
return false;
--
2.31.1
^ permalink raw reply related
* Re: [PATCH v4 0/8] Implement generic cc_platform_has() helper function
From: Borislav Petkov @ 2021-09-28 21:40 UTC (permalink / raw)
To: Kuppuswamy, Sathyanarayanan
Cc: linux-efi, kvm, David Airlie, dri-devel, platform-driver-x86,
Paul Mackerras, Will Deacon, Ard Biesheuvel, linux-s390,
Andi Kleen, Baoquan He, Joerg Roedel, x86, amd-gfx,
Christoph Hellwig, Christian Borntraeger, VMware Graphics,
Dave Young, Tom Lendacky, Thomas Zimmermann, Vasily Gorbik,
Heiko Carstens, Maarten Lankhorst, Maxime Ripard, Andy Lutomirski,
Kirill A. Shutemov, kexec, LKML, iommu, Daniel Vetter,
linuxppc-dev
In-Reply-To: <695a3bf6-5382-68df-3ab5-8841b777fca2@linux.intel.com>
On Tue, Sep 28, 2021 at 02:01:57PM -0700, Kuppuswamy, Sathyanarayanan wrote:
> Yes. But, since the check is related to TDX, I just want to confirm whether
> you are fine with naming the function as intel_*().
Why is this such a big of a deal?!
There's amd_cc_platform_has() and intel_cc_platform_has() will be the
corresponding Intel version.
> Since this patch is going to have dependency on TDX code, I will include
> this patch in TDX patch set.
Ok.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply
* Re: [PATCH v4 4/8] PCI: replace pci_dev::driver usage that gets the driver name
From: Simon Horman @ 2021-09-28 10:01 UTC (permalink / raw)
To: Uwe Kleine-König
Cc: linux-pci, Alexander Duyck, oss-drivers, Paul Mackerras,
Herbert Xu, Rafał Miłecki, Jesse Brandeburg,
Bjorn Helgaas, Ido Schimmel, Jakub Kicinski, Yisen Zhuang,
Vadym Kochan, Uwe Kleine-König, Michael Buesch, Jiri Pirko,
Salil Mehta, netdev, linux-wireless, linux-kernel, Taras Chornyi,
Zhou Wang, linux-crypto, kernel, Oliver O'Halloran,
linuxppc-dev, David S. Miller
In-Reply-To: <20210927204326.612555-5-uwe@kleine-koenig.org>
On Mon, Sep 27, 2021 at 10:43:22PM +0200, Uwe Kleine-König wrote:
> From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>
> struct pci_dev::driver holds (apart from a constant offset) the same
> data as struct pci_dev::dev->driver. With the goal to remove struct
> pci_dev::driver to get rid of data duplication replace getting the
> driver name by dev_driver_string() which implicitly makes use of struct
> pci_dev::dev->driver.
>
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
...
> diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
> index 0685ece1f155..23dfb599c828 100644
> --- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
> +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
> @@ -202,7 +202,7 @@ nfp_get_drvinfo(struct nfp_app *app, struct pci_dev *pdev,
> {
> char nsp_version[ETHTOOL_FWVERS_LEN] = {};
>
> - strlcpy(drvinfo->driver, pdev->driver->name, sizeof(drvinfo->driver));
> + strlcpy(drvinfo->driver, dev_driver_string(&pdev->dev), sizeof(drvinfo->driver));
I'd slightly prefer to maintain lines under 80 columns wide.
But not nearly strongly enough to engage in a long debate about it.
In any case, for the NFP portion of this patch.
Acked-by: Simon Horman <simon.horman@corigine.com>
> nfp_net_get_nspinfo(app, nsp_version);
> snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
> "%s %s %s %s", vnic_version, nsp_version,
...
^ permalink raw reply
* Re: [PATCH v4 0/8] Implement generic cc_platform_has() helper function
From: Kuppuswamy, Sathyanarayanan @ 2021-09-28 21:01 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-efi, kvm, David Airlie, dri-devel, platform-driver-x86,
Paul Mackerras, Will Deacon, Ard Biesheuvel, linux-s390,
Andi Kleen, Baoquan He, Joerg Roedel, x86, amd-gfx,
Christoph Hellwig, Christian Borntraeger, VMware Graphics,
Dave Young, Tom Lendacky, Thomas Zimmermann, Vasily Gorbik,
Heiko Carstens, Maarten Lankhorst, Maxime Ripard, Andy Lutomirski,
Kirill A. Shutemov, kexec, LKML, iommu, Daniel Vetter,
linuxppc-dev
In-Reply-To: <YVOB3mFV1Kj3MXAs@zn.tnic>
On 9/28/21 1:58 PM, Borislav Petkov wrote:
> On Tue, Sep 28, 2021 at 01:48:46PM -0700, Kuppuswamy, Sathyanarayanan wrote:
>> Just read it. If you want to use cpuid_has_tdx_guest() directly in
>> cc_platform_has(), then you want to rename intel_cc_platform_has() to
>> tdx_cc_platform_has()?
>
> Why?
>
> You simply do:
>
> if (cpuid_has_tdx_guest())
> intel_cc_platform_has(...);
>
> and lemme paste from that mail: " ...you should use
> cpuid_has_tdx_guest() instead but cache its result so that you don't
> call CPUID each time the kernel executes cc_platform_has()."
>
> Makes sense?
Yes. But, since the check is related to TDX, I just want to confirm whether
you are fine with naming the function as intel_*().
Since this patch is going to have dependency on TDX code, I will include
this patch in TDX patch set.
>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
^ permalink raw reply
* Re: [PATCH v4 0/8] Implement generic cc_platform_has() helper function
From: Borislav Petkov @ 2021-09-28 20:58 UTC (permalink / raw)
To: Kuppuswamy, Sathyanarayanan
Cc: linux-efi, kvm, David Airlie, dri-devel, platform-driver-x86,
Paul Mackerras, Will Deacon, Ard Biesheuvel, linux-s390,
Andi Kleen, Baoquan He, Joerg Roedel, x86, amd-gfx,
Christoph Hellwig, Christian Borntraeger, VMware Graphics,
Dave Young, Tom Lendacky, Thomas Zimmermann, Vasily Gorbik,
Heiko Carstens, Maarten Lankhorst, Maxime Ripard, Andy Lutomirski,
Kirill A. Shutemov, kexec, LKML, iommu, Daniel Vetter,
linuxppc-dev
In-Reply-To: <7319b756-55dc-c4d1-baf6-4686f0156ac4@linux.intel.com>
On Tue, Sep 28, 2021 at 01:48:46PM -0700, Kuppuswamy, Sathyanarayanan wrote:
> Just read it. If you want to use cpuid_has_tdx_guest() directly in
> cc_platform_has(), then you want to rename intel_cc_platform_has() to
> tdx_cc_platform_has()?
Why?
You simply do:
if (cpuid_has_tdx_guest())
intel_cc_platform_has(...);
and lemme paste from that mail: " ...you should use
cpuid_has_tdx_guest() instead but cache its result so that you don't
call CPUID each time the kernel executes cc_platform_has()."
Makes sense?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox