* Re: [PATCH 2/2] arm64: dts: rockchip: enable gpu on rk3588-tiger
From: Quentin Schulz @ 2024-03-27 12:31 UTC (permalink / raw)
To: Heiko Stuebner; +Cc: linux-arm-kernel, linux-rockchip, Heiko Stuebner
In-Reply-To: <20240327112120.1181570-2-heiko@sntech.de>
Hi Heiko,
On 3/27/24 12:21, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@cherry.de>
>
> Enable the mali gpu node and add the som-specific supply-regulator.
>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@cherry.de>
Reviewed-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Thanks,
Quentin
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [PATCH v2 1/2] dmaengine: xilinx: dpdma: Fix race condition in vsync IRQ
From: Tomi Valkeinen @ 2024-03-27 12:32 UTC (permalink / raw)
To: Vishal Sagar
Cc: michal.simek, dmaengine, linux-arm-kernel, linux-kernel,
varunkumar.allagadapa, laurent.pinchart, vkoul, Sean Anderson
In-Reply-To: <20240228042124.3074044-2-vishal.sagar@amd.com>
On 28/02/2024 06:21, Vishal Sagar wrote:
> From: Neel Gandhi <neel.gandhi@xilinx.com>
>
> The vchan_next_desc() function, called from
> xilinx_dpdma_chan_queue_transfer(), must be called with
> virt_dma_chan.lock held. This isn't correctly handled in all code paths,
> resulting in a race condition between the .device_issue_pending()
> handler and the IRQ handler which causes DMA to randomly stop. Fix it by
> taking the lock around xilinx_dpdma_chan_queue_transfer() calls that are
> missing it.
>
> Signed-off-by: Neel Gandhi <neel.gandhi@amd.com>
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
> Signed-off-by: Vishal Sagar <vishal.sagar@amd.com>
Sean posted an almost identical, but very slightly better, patch for this,
so I think we can pick that one instead.
Tomi
>
> Link: https://lore.kernel.org/all/20220122121407.11467-1-neel.gandhi@xilinx.com
> ---
> drivers/dma/xilinx/xilinx_dpdma.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma/xilinx/xilinx_dpdma.c b/drivers/dma/xilinx/xilinx_dpdma.c
> index b82815e64d24..28d9af8f00f0 100644
> --- a/drivers/dma/xilinx/xilinx_dpdma.c
> +++ b/drivers/dma/xilinx/xilinx_dpdma.c
> @@ -1097,12 +1097,14 @@ static void xilinx_dpdma_chan_vsync_irq(struct xilinx_dpdma_chan *chan)
> * Complete the active descriptor, if any, promote the pending
> * descriptor to active, and queue the next transfer, if any.
> */
> + spin_lock(&chan->vchan.lock);
> if (chan->desc.active)
> vchan_cookie_complete(&chan->desc.active->vdesc);
> chan->desc.active = pending;
> chan->desc.pending = NULL;
>
> xilinx_dpdma_chan_queue_transfer(chan);
> + spin_unlock(&chan->vchan.lock);
>
> out:
> spin_unlock_irqrestore(&chan->lock, flags);
> @@ -1264,10 +1266,12 @@ static void xilinx_dpdma_issue_pending(struct dma_chan *dchan)
> struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
> unsigned long flags;
>
> - spin_lock_irqsave(&chan->vchan.lock, flags);
> + spin_lock_irqsave(&chan->lock, flags);
> + spin_lock(&chan->vchan.lock);
> if (vchan_issue_pending(&chan->vchan))
> xilinx_dpdma_chan_queue_transfer(chan);
> - spin_unlock_irqrestore(&chan->vchan.lock, flags);
> + spin_unlock(&chan->vchan.lock);
> + spin_unlock_irqrestore(&chan->lock, flags);
> }
>
> static int xilinx_dpdma_config(struct dma_chan *dchan,
> @@ -1495,7 +1499,9 @@ static void xilinx_dpdma_chan_err_task(struct tasklet_struct *t)
> XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id);
>
> spin_lock_irqsave(&chan->lock, flags);
> + spin_lock(&chan->vchan.lock);
> xilinx_dpdma_chan_queue_transfer(chan);
> + spin_unlock(&chan->vchan.lock);
> spin_unlock_irqrestore(&chan->lock, flags);
> }
>
* Re: [PATCH net-next v3 3/3] net: ti: icssg-prueth: Add support for ICSSG switch firmware
From: Andrew Lunn @ 2024-03-27 12:35 UTC (permalink / raw)
To: MD Danish Anwar
Cc: Diogo Ivo, Rob Herring, Dan Carpenter, Jan Kiszka, Simon Horman,
Wolfram Sang, Arnd Bergmann, Vignesh Raghavendra, Vladimir Oltean,
Roger Quadros, Paolo Abeni, Jakub Kicinski, Eric Dumazet,
David S. Miller, linux-arm-kernel, netdev, linux-kernel, srk,
r-gunasekaran
In-Reply-To: <20240327114054.1907278-4-danishanwar@ti.com>
On Wed, Mar 27, 2024 at 05:10:54PM +0530, MD Danish Anwar wrote:
> Add support for ICSSG switch firmware using existing Dual EMAC driver
> with switchdev.
>
> Limitations:
> VLAN offloading is limited to 0-256 IDs.
> MDB/FDB static entries are limited to 511 entries and different FDBs can
> hash to same bucket and thus may not be completely offloaded
>
> Switch mode requires loading of new firmware into ICSSG cores. This
> means interfaces have to be taken down and then reconfigured to switch
> mode.
Patch 0/3 does not say this. It just shows the interfaces being added
to the bridge. There should not be any need to down the interfaces.
> Example assuming ETH1 and ETH2 as ICSSG2 interfaces:
>
> Switch to ICSSG Switch mode:
> ip link set dev eth1 down
> ip link set dev eth2 down
> ip link add name br0 type bridge
> ip link set dev eth1 master br0
> ip link set dev eth2 master br0
> ip link set dev br0 up
> ip link set dev eth1 up
> ip link set dev eth2 up
> bridge vlan add dev br0 vid 1 pvid untagged self
>
> Going back to Dual EMAC mode:
>
> ip link set dev br0 down
> ip link set dev eth1 nomaster
> ip link set dev eth2 nomaster
> ip link set dev eth1 down
> ip link set dev eth2 down
> ip link del name br0 type bridge
> ip link set dev eth1 up
> ip link set dev eth2 up
>
> By default, Dual EMAC firmware is loaded, and can be changed to switch
> mode by above steps
I keep asking this, so it would be good to explain it in the commit
message. What configuration is preserved over a firmware reload, and
what is lost?
Can I add VLANs in dual MAC mode and then swap into the switch firmware
and all the VLANs are preserved? Can I add FDB entries to a port in
dual MAC mode, and then swap into the switch firmware and the FDB
table is preserved? What about STP port state? What about ... ?
> +bool prueth_dev_check(const struct net_device *ndev)
> +{
> + if (ndev->netdev_ops == &emac_netdev_ops && netif_running(ndev)) {
> + struct prueth_emac *emac = netdev_priv(ndev);
> +
> + return emac->prueth->is_switch_mode;
> + }
> +
> + return false;
> +}
This does not appear to be used anywhere?
Andrew
* Re: [PATCH 3/4] arm64: dts: rockchip: Add VEPU121 to rk3588
From: Link Mauve @ 2024-03-27 12:40 UTC (permalink / raw)
To: Krzysztof Kozlowski
Cc: Emmanuel Gil Peyrot, linux-kernel, Ezequiel Garcia, Philipp Zabel,
Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Heiko Stuebner, Joerg Roedel, Will Deacon,
Robin Murphy, Sebastian Reichel, Cristian Ciocaltea, Dragan Simic,
Shreeya Patel, Chris Morgan, Andy Yan, Nicolas Frattaroli,
linux-media, linux-rockchip, devicetree, linux-arm-kernel, iommu
In-Reply-To: <a60f6017-bd19-431e-8cff-7d73f6f114fe@linaro.org>
[-- Attachment #1: Type: text/plain, Size: 1945 bytes --]
On Thu, Mar 21, 2024 at 09:15:38AM +0100, Krzysztof Kozlowski wrote:
> On 20/03/2024 18:37, Emmanuel Gil Peyrot wrote:
> > The TRM (version 1.0 page 385) lists five VEPU121 cores, but only four
> > interrupts are listed (on page 24), so I’ve only enabled four of them
> > for now.
> >
> > Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
> > ---
> > arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 80 +++++++++++++++++++++++
> > 1 file changed, 80 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > index 2a23b4dc36e4..fe77b56ac9a0 100644
> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > @@ -2488,6 +2488,86 @@ gpio4: gpio@fec50000 {
> > };
> > };
> >
> > + jpeg_enc0: video-codec@fdba0000 {
> > + compatible = "rockchip,rk3588-vepu121";
> > + reg = <0x0 0xfdba0000 0x0 0x800>;
> > + interrupts = <GIC_SPI 122 IRQ_TYPE_LEVEL_HIGH 0>;
> > + clocks = <&cru ACLK_JPEG_ENCODER0>, <&cru HCLK_JPEG_ENCODER0>;
> > + clock-names = "aclk", "hclk";
> > + iommus = <&jpeg_enc0_mmu>;
> > + power-domains = <&power RK3588_PD_VDPU>;
> > + };
> > +
> > + jpeg_enc0_mmu: iommu@fdba0800 {
> > + compatible = "rockchip,rk3588-iommu";
>
> It does not look like you tested the DTS against bindings. Please run
> `make dtbs_check W=1` (see
> Documentation/devicetree/bindings/writing-schema.rst or
> https://www.linaro.org/blog/tips-and-tricks-for-validating-devicetree-sources-with-the-devicetree-schema/
> for instructions).
Even on master I get an exception about this unresolvable file:
referencing.exceptions.Unresolvable: cache-controller.yaml#
Yet it seems to be present in only three files, all of them unrelated to
the rockchip board I’m interested in (it seems), so I’m not sure what to
do about that.
The full stack trace is attached.
>
> Best regards,
> Krzysztof
>
--
Link Mauve
[-- Attachment #2: CHECK_DTBS=y W=1 --]
[-- Type: text/plain, Size: 4155 bytes --]
% make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- CHECK_DTBS=y W=1 rockchip/rk3588-rock-5b.dtb
make[1]: Entering directory '/home/linkmauve/dev/linux'
DTC_CHK arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dtb
Traceback (most recent call last):
File "/usr/lib/python3.11/site-packages/referencing/_core.py", line 417, in get_or_retrieve
resource = registry._retrieve(uri)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/jsonschema/validators.py", line 111, in _warn_for_remote_retrieve
request = Request(uri, headers=headers) # noqa: S310
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 322, in __init__
self.full_url = url
^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 348, in full_url
self._parse()
File "/usr/lib/python3.11/urllib/request.py", line 377, in _parse
raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: 'cache-controller.yaml'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.11/site-packages/referencing/_core.py", line 667, in lookup
retrieved = self._registry.get_or_retrieve(uri)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/referencing/_core.py", line 424, in get_or_retrieve
raise exceptions.Unretrievable(ref=uri) from error
referencing.exceptions.Unretrievable: 'cache-controller.yaml'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/bin/dt-validate", line 8, in <module>
sys.exit(main())
^^^^^^
File "/usr/lib/python3.11/site-packages/dtschema/dtb_validate.py", line 144, in main
sg.check_dtb(filename)
File "/usr/lib/python3.11/site-packages/dtschema/dtb_validate.py", line 89, in check_dtb
self.check_subtree(dt, subtree, False, "/", "/", filename)
File "/usr/lib/python3.11/site-packages/dtschema/dtb_validate.py", line 82, in check_subtree
self.check_subtree(tree, value, disabled, name, fullname + name, filename)
File "/usr/lib/python3.11/site-packages/dtschema/dtb_validate.py", line 82, in check_subtree
self.check_subtree(tree, value, disabled, name, fullname + name, filename)
File "/usr/lib/python3.11/site-packages/dtschema/dtb_validate.py", line 77, in check_subtree
self.check_node(tree, subtree, disabled, nodename, fullname, filename)
File "/usr/lib/python3.11/site-packages/dtschema/dtb_validate.py", line 33, in check_node
for error in self.validator.iter_errors(node, filter=match_schema_file):
File "/usr/lib/python3.11/site-packages/dtschema/validator.py", line 413, in iter_errors
for error in self.DtValidator(sch,
File "/usr/lib/python3.11/site-packages/jsonschema/validators.py", line 371, in iter_errors
for error in errors:
File "/usr/lib/python3.11/site-packages/jsonschema/_keywords.py", line 386, in if_
yield from validator.descend(instance, then, schema_path="then")
File "/usr/lib/python3.11/site-packages/jsonschema/validators.py", line 419, in descend
for error in errors:
File "/usr/lib/python3.11/site-packages/jsonschema/_legacy_keywords.py", line 423, in unevaluatedProperties_draft2019
evaluated_keys = find_evaluated_property_keys_by_schema(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/jsonschema/_legacy_keywords.py", line 399, in find_evaluated_property_keys_by_schema
evaluated_keys += find_evaluated_property_keys_by_schema(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/jsonschema/_legacy_keywords.py", line 342, in find_evaluated_property_keys_by_schema
resolved = validator._resolver.lookup(ref)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/referencing/_core.py", line 671, in lookup
raise exceptions.Unresolvable(ref=ref) from error
referencing.exceptions.Unresolvable: cache-controller.yaml#
make[1]: Leaving directory '/home/linkmauve/dev/linux'
* Re: [PATCH 1/2] remoteproc: mediatek: Make sure IPI buffer fits in L2TCM
From: AngeloGioacchino Del Regno @ 2024-03-27 12:40 UTC (permalink / raw)
To: Mathieu Poirier
Cc: andersson, matthias.bgg, tzungbi, tinghan.shen, linux-remoteproc,
linux-kernel, linux-arm-kernel, linux-mediatek, wenst, kernel
In-Reply-To: <ZfxRY475SKaRYVTj@p14s>
Il 21/03/24 16:25, Mathieu Poirier ha scritto:
> Good day,
>
> On Thu, Mar 21, 2024 at 09:46:13AM +0100, AngeloGioacchino Del Regno wrote:
>> The IPI buffer location is read from the firmware that we load to the
>> System Companion Processor, and it's not granted that both the SRAM
>> (L2TCM) size that is defined in the devicetree node is large enough
>> for that, and while this is especially true for multi-core SCP, it's
>> still useful to check on single-core variants as well.
>>
>> Failing to perform this check may make this driver perform R/W
>> oeprations out of the L2TCM boundary, resulting (at best) in a
>
> s/oeprations/operations
>
> I will fix that when I apply the patch.
>
Thanks for that.
Cheers,
Angelo
* Re: [PATCH 3/3] KVM: arm64: Use TLBI_TTL_UNKNOWN in __kvm_tlb_flush_vmid_range()
From: Will Deacon @ 2024-03-27 12:45 UTC (permalink / raw)
To: Ryan Roberts
Cc: kvmarm, linux-arm-kernel, Catalin Marinas, Gavin Shan,
Marc Zyngier, Mostafa Saleh, Oliver Upton, Quentin Perret,
Raghavendra Rao Ananta, Shaoqin Huang, Suzuki K Poulose,
Zenghui Yu
In-Reply-To: <b3177895-5ed0-4f1a-bede-4d0c87935bbc@arm.com>
On Tue, Mar 26, 2024 at 01:48:46PM +0000, Ryan Roberts wrote:
> On 25/03/2024 18:51, Will Deacon wrote:
> > Commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
> > FEAT_LPA2") updated the __tlbi_level() macro to take the target level
> > as an argument, with TLBI_TTL_UNKNOWN (rather than 0) indicating that
> > the caller cannot provide level information. Unfortunately, the two
> > implementations of __kvm_tlb_flush_vmid_range() were not updated and so
> > now ask for a level 0 invalidation if FEAT_LPA2 is implemented.
>
> Ouch, sorry about this! I remember rebasing my change onto the KVM tlbi range
> changes and having a few conflicts. Obviously I didn't do a good enough job of
> reviewing the result and missed this new user.
No problem, it's easily done. It's also not your fault, as we shouldn't have
been using the reserved encoding of 0 to mean "no hint" in the first place!
> > Cc: Ryan Roberts <ryan.roberts@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Oliver Upton <oliver.upton@linux.dev>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
> > Signed-off-by: Will Deacon <will@kernel.org>
>
> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Thanks.
Will
* [PATCH v2 0/4] KVM: arm64: TLBI fixes for the pgtable code
From: Will Deacon @ 2024-03-27 12:48 UTC (permalink / raw)
To: kvmarm
Cc: linux-arm-kernel, Will Deacon, Catalin Marinas, Gavin Shan,
Marc Zyngier, Mostafa Saleh, Oliver Upton, Quentin Perret,
Raghavendra Rao Ananta, Ryan Roberts, Shaoqin Huang
Hi again,
This is version two of the series I previously posted on Monday:
https://lore.kernel.org/r/20240325185158.8565-1-will@kernel.org
We've got a long weekend coming up in the UK, so I wanted to get this
out before I chuck the laptop in the river.
Changes since v1 include:
* Add Ryan's Reviewed-by on the third patch
* Add an extra patch to ensure correct alignment of range TLBI address
argument
* Tweak commit messages
Cheers,
Will
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mostafa Saleh <smostafa@google.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Quentin Perret <qperret@google.com>
Cc: Raghavendra Rao Ananta <rananta@google.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shaoqin Huang <shahuang@redhat.com>
--->8
Will Deacon (4):
KVM: arm64: Don't defer TLB invalidation when zapping table entries
KVM: arm64: Don't pass a TLBI level hint when zapping table entries
KVM: arm64: Use TLBI_TTL_UNKNOWN in __kvm_tlb_flush_vmid_range()
KVM: arm64: Ensure target address is granule-aligned for range TLBI
arch/arm64/kvm/hyp/nvhe/tlb.c | 3 ++-
arch/arm64/kvm/hyp/pgtable.c | 23 +++++++++++++++--------
arch/arm64/kvm/hyp/vhe/tlb.c | 3 ++-
3 files changed, 19 insertions(+), 10 deletions(-)
--
2.44.0.396.g6e790dbe36-goog
* [PATCH v2 1/4] KVM: arm64: Don't defer TLB invalidation when zapping table entries
From: Will Deacon @ 2024-03-27 12:48 UTC (permalink / raw)
To: kvmarm
Cc: linux-arm-kernel, Will Deacon, Catalin Marinas, Gavin Shan,
Marc Zyngier, Mostafa Saleh, Oliver Upton, Quentin Perret,
Raghavendra Rao Ananta, Ryan Roberts, Shaoqin Huang
In-Reply-To: <20240327124853.11206-1-will@kernel.org>
Commit 7657ea920c54 ("KVM: arm64: Use TLBI range-based instructions for
unmap") introduced deferred TLB invalidation for the stage-2 page-table
so that range-based invalidation can be used for the accumulated
addresses. This works fine if the structure of the page-tables remains
unchanged, but if entire tables are zapped and subsequently freed then
we transiently leave the hardware page-table walker with a reference
to freed memory thanks to the translation walk caches. For example,
stage2_unmap_walker() will free page-table pages:
if (childp)
mm_ops->put_page(childp);
and issue the TLB invalidation later in kvm_pgtable_stage2_unmap():
if (stage2_unmap_defer_tlb_flush(pgt))
/* Perform the deferred TLB invalidations */
kvm_tlb_flush_vmid_range(pgt->mmu, addr, size);
For now, take the conservative approach and invalidate the TLB eagerly
when we clear a table entry. Note, however, that the existing level
hint passed to __kvm_tlb_flush_vmid_ipa() is incorrect and will be
fixed in a subsequent patch.
Cc: Raghavendra Rao Ananta <rananta@google.com>
Cc: Shaoqin Huang <shahuang@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Will Deacon <will@kernel.org>
---
arch/arm64/kvm/hyp/pgtable.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 3fae5830f8d2..de0b667ba296 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -896,9 +896,11 @@ static void stage2_unmap_put_pte(const struct kvm_pgtable_visit_ctx *ctx,
if (kvm_pte_valid(ctx->old)) {
kvm_clear_pte(ctx->ptep);
- if (!stage2_unmap_defer_tlb_flush(pgt))
+ if (!stage2_unmap_defer_tlb_flush(pgt) ||
+ kvm_pte_table(ctx->old, ctx->level)) {
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
ctx->addr, ctx->level);
+ }
}
mm_ops->put_page(ctx->ptep);
--
2.44.0.396.g6e790dbe36-goog
* [PATCH v2 2/4] KVM: arm64: Don't pass a TLBI level hint when zapping table entries
From: Will Deacon @ 2024-03-27 12:48 UTC (permalink / raw)
To: kvmarm
Cc: linux-arm-kernel, Will Deacon, Catalin Marinas, Gavin Shan,
Marc Zyngier, Mostafa Saleh, Oliver Upton, Quentin Perret,
Raghavendra Rao Ananta, Ryan Roberts, Shaoqin Huang
In-Reply-To: <20240327124853.11206-1-will@kernel.org>
The TLBI level hints are for leaf entries only, so take care not to pass
them incorrectly after clearing a table entry.
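
(Illustration only, not part of the patch.) The resulting decision in
stage2_unmap_put_pte() can be modelled in user space roughly as below.
This is a sketch under stated assumptions: pick_flush() and
struct flush_decision are hypothetical names invented for the example,
and the value given to TLBI_TTL_UNKNOWN here is only a stand-in.

```c
#include <limits.h>
#include <stdbool.h>

/* Stand-in for the kernel's "no level hint" marker; value assumed for this sketch. */
#define TLBI_TTL_UNKNOWN INT_MAX

/* Hypothetical result type: whether to invalidate now, and with which hint. */
struct flush_decision {
	bool flush_now;
	int level_hint;
};

/*
 * Hypothetical user-space model of the branch after this patch:
 *  - a cleared table entry is invalidated eagerly, with no level hint;
 *  - a cleared leaf entry is invalidated with its level, unless the
 *    invalidation is being deferred for a later range operation.
 */
static struct flush_decision pick_flush(bool was_table, bool defer, int level)
{
	struct flush_decision d = {
		.flush_now = false,
		.level_hint = TLBI_TTL_UNKNOWN,
	};

	if (was_table) {
		d.flush_now = true;
	} else if (!defer) {
		d.flush_now = true;
		d.level_hint = level;
	}

	return d;
}
```

In this model a table entry never carries a level hint, whatever the
deferral setting is, which is the property the patch is after.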
Cc: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Fixes: 82bb02445de5 ("KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2")
Fixes: 6d9d2115c480 ("KVM: arm64: Add support for stage-2 map()/unmap() in generic page-table")
Signed-off-by: Will Deacon <will@kernel.org>
---
arch/arm64/kvm/hyp/pgtable.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index de0b667ba296..a40dafc43bb6 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -528,7 +528,7 @@ static int hyp_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
kvm_clear_pte(ctx->ptep);
dsb(ishst);
- __tlbi_level(vae2is, __TLBI_VADDR(ctx->addr, 0), ctx->level);
+ __tlbi_level(vae2is, __TLBI_VADDR(ctx->addr, 0), TLBI_TTL_UNKNOWN);
} else {
if (ctx->end - ctx->addr < granule)
return -EINVAL;
@@ -896,10 +896,12 @@ static void stage2_unmap_put_pte(const struct kvm_pgtable_visit_ctx *ctx,
if (kvm_pte_valid(ctx->old)) {
kvm_clear_pte(ctx->ptep);
- if (!stage2_unmap_defer_tlb_flush(pgt) ||
- kvm_pte_table(ctx->old, ctx->level)) {
- kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
- ctx->addr, ctx->level);
+ if (kvm_pte_table(ctx->old, ctx->level)) {
+ kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr,
+ TLBI_TTL_UNKNOWN);
+ } else if (!stage2_unmap_defer_tlb_flush(pgt)) {
+ kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr,
+ ctx->level);
}
}
--
2.44.0.396.g6e790dbe36-goog
* [PATCH v2 3/4] KVM: arm64: Use TLBI_TTL_UNKNOWN in __kvm_tlb_flush_vmid_range()
From: Will Deacon @ 2024-03-27 12:48 UTC (permalink / raw)
To: kvmarm
Cc: linux-arm-kernel, Will Deacon, Catalin Marinas, Gavin Shan,
Marc Zyngier, Mostafa Saleh, Oliver Upton, Quentin Perret,
Raghavendra Rao Ananta, Ryan Roberts, Shaoqin Huang
In-Reply-To: <20240327124853.11206-1-will@kernel.org>
Commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
FEAT_LPA2") updated the __tlbi_level() macro to take the target level
as an argument, with TLBI_TTL_UNKNOWN (rather than 0) indicating that
the caller cannot provide level information. Unfortunately, the two
implementations of __kvm_tlb_flush_vmid_range() were not updated and so
now ask for a level 0 invalidation if FEAT_LPA2 is implemented.
Fix the problem by passing TLBI_TTL_UNKNOWN instead of 0 as the level
argument to __flush_s2_tlb_range_op() in __kvm_tlb_flush_vmid_range().
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Marc Zyngier <maz@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
Signed-off-by: Will Deacon <will@kernel.org>
---
arch/arm64/kvm/hyp/nvhe/tlb.c | 3 ++-
arch/arm64/kvm/hyp/vhe/tlb.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
index a60fb13e2192..2fc68da4036d 100644
--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
+++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
@@ -154,7 +154,8 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
/* Switch to requested VMID */
__tlb_switch_to_guest(mmu, &cxt, false);
- __flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0);
+ __flush_s2_tlb_range_op(ipas2e1is, start, pages, stride,
+ TLBI_TTL_UNKNOWN);
dsb(ish);
__tlbi(vmalle1is);
diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
index b32e2940df7d..1a60b95381e8 100644
--- a/arch/arm64/kvm/hyp/vhe/tlb.c
+++ b/arch/arm64/kvm/hyp/vhe/tlb.c
@@ -171,7 +171,8 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
/* Switch to requested VMID */
__tlb_switch_to_guest(mmu, &cxt);
- __flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0);
+ __flush_s2_tlb_range_op(ipas2e1is, start, pages, stride,
+ TLBI_TTL_UNKNOWN);
dsb(ish);
__tlbi(vmalle1is);
--
2.44.0.396.g6e790dbe36-goog
* [PATCH v2 4/4] KVM: arm64: Ensure target address is granule-aligned for range TLBI
From: Will Deacon @ 2024-03-27 12:48 UTC (permalink / raw)
To: kvmarm
Cc: linux-arm-kernel, Will Deacon, Catalin Marinas, Gavin Shan,
Marc Zyngier, Mostafa Saleh, Oliver Upton, Quentin Perret,
Raghavendra Rao Ananta, Ryan Roberts, Shaoqin Huang
In-Reply-To: <20240327124853.11206-1-will@kernel.org>
When zapping a table entry in stage2_try_break_pte(), we issue range
TLB invalidation for the region that was mapped by the table. However,
we neglect to align the base address down to the granule size and so
if we ended up reaching the table entry via a misaligned address then
we will accidentally skip invalidation for some prefix of the affected
address range.
Align 'ctx->addr' down to the granule size when performing TLB
invalidation for an unmapped table in stage2_try_break_pte().
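
(Illustration only, not part of the patch.) A small user-space sketch of
the arithmetic, assuming a 2MiB region mapped by the table entry and an
example address that is 4KiB into it. ALIGN_DOWN() is redefined locally
as a stand-in for the kernel macro, and skipped_prefix() is a
hypothetical helper invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's ALIGN_DOWN(); 'a' must be a power of two. */
#define ALIGN_DOWN(x, a) ((uint64_t)(x) & ~((uint64_t)(a) - 1))

/*
 * Hypothetical helper: number of bytes at the start of the region that a
 * range invalidation beginning at the unaligned 'addr' would not cover.
 */
static uint64_t skipped_prefix(uint64_t addr, uint64_t size)
{
	/* Region sizes are powers of two, so the mask trick above is valid. */
	assert(size != 0 && (size & (size - 1)) == 0);

	return addr - ALIGN_DOWN(addr, size);
}
```

With addr = 0x40201000 and size = 0x200000, ALIGN_DOWN() gives
0x40200000, so starting the invalidation at the unaligned address would
leave the first 0x1000 bytes of the region uncovered.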
Cc: Raghavendra Rao Ananta <rananta@google.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Shaoqin Huang <shahuang@redhat.com>
Cc: Quentin Perret <qperret@google.com>
Fixes: defc8cc7abf0 ("KVM: arm64: Invalidate the table entries upon a range")
Signed-off-by: Will Deacon <will@kernel.org>
---
arch/arm64/kvm/hyp/pgtable.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index a40dafc43bb6..5a59ef88b646 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -843,12 +843,15 @@ static bool stage2_try_break_pte(const struct kvm_pgtable_visit_ctx *ctx,
* Perform the appropriate TLB invalidation based on the
* evicted pte value (if any).
*/
- if (kvm_pte_table(ctx->old, ctx->level))
- kvm_tlb_flush_vmid_range(mmu, ctx->addr,
- kvm_granule_size(ctx->level));
- else if (kvm_pte_valid(ctx->old))
+ if (kvm_pte_table(ctx->old, ctx->level)) {
+ u64 size = kvm_granule_size(ctx->level);
+ u64 addr = ALIGN_DOWN(ctx->addr, size);
+
+ kvm_tlb_flush_vmid_range(mmu, addr, size);
+ } else if (kvm_pte_valid(ctx->old)) {
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
ctx->addr, ctx->level);
+ }
}
if (stage2_pte_is_counted(ctx->old))
--
2.44.0.396.g6e790dbe36-goog
* Re: [PATCH 2/2] remoteproc: mediatek: Don't parse extraneous subnodes for multi-core
From: AngeloGioacchino Del Regno @ 2024-03-27 12:49 UTC (permalink / raw)
To: Mathieu Poirier
Cc: andersson, matthias.bgg, tzungbi, tinghan.shen, linux-remoteproc,
linux-kernel, linux-arm-kernel, linux-mediatek, wenst, kernel
In-Reply-To: <ZfxRyMyUqyqtXy8n@p14s>
Il 21/03/24 16:27, Mathieu Poirier ha scritto:
> On Thu, Mar 21, 2024 at 09:46:14AM +0100, AngeloGioacchino Del Regno wrote:
>> When probing multi-core SCP, this driver is parsing all sub-nodes of
>> the scp-cluster node, but one of those could be not an actual SCP core
>> and that would make the entire SCP cluster to fail probing for no good
>> reason.
>>
>> To fix that, in scp_add_multi_core() treat a subnode as a SCP Core by
>> parsing only available subnodes having compatible "mediatek,scp-core".
>>
>> Fixes: 1fdbf0cdde98 ("remoteproc: mediatek: Probe SCP cluster on multi-core SCP")
>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>> ---
>> drivers/remoteproc/mtk_scp.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
>> index 67518291a8ad..fbe1c232dae7 100644
>> --- a/drivers/remoteproc/mtk_scp.c
>> +++ b/drivers/remoteproc/mtk_scp.c
>> @@ -1096,6 +1096,9 @@ static int scp_add_multi_core(struct platform_device *pdev,
>> cluster_of_data = (const struct mtk_scp_of_data **)of_device_get_match_data(dev);
>>
>> for_each_available_child_of_node(np, child) {
>> + if (!of_device_is_compatible(child, "mediatek,scp-core"))
>> + continue;
>> +
>
> Interesting - what else gets stashed under the remote processor node? I don't
> see anything specified in the bindings.
>
Sorry for the late reply - well, in this precise moment in time, upstream,
nothing yet.
I have noticed this while debugging some lockups and wanted to move the scp_adsp
clock controller node as child of the SCP node (as some of those clocks are located
*into the SCP's CFG register space*, and it's correct for that to be a child as one
of those do depend on the SCP being up - and I'll spare you the rest) and noticed
the unexpected behavior, as the SCP driver was treating those as an SCP core.
There was no kernel panic, but the SCP would fail probing.
This is anyway a missed requirement ... for platforms that want *both* two SCP
cores *and* the AudioDSP, as that'd at least be two nodes with the same iostart
(scp@1072000, clock-controller@1072000), other than the reasons I explained some
lines back.
...and that's why this commit was sent :-)
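
(Illustration only.) The effect of the added check can be sketched in
user space as below, assuming a simplified stand-in for devicetree
nodes. struct fake_node, count_scp_cores(), the node names and the
"vendor,example-clock" compatible string are all made up for the
example; only "mediatek,scp-core" comes from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a devicetree child node. */
struct fake_node {
	const char *name;
	const char *compatible;
};

/* Example layout: two SCP cores plus one unrelated child under the cluster. */
static const struct fake_node example_children[] = {
	{ "core-a",           "mediatek,scp-core" },
	{ "clock-controller", "vendor,example-clock" },	/* made-up compatible */
	{ "core-b",           "mediatek,scp-core" },
};

/* Count only the children that are SCP cores, skipping everything else. */
static size_t count_scp_cores(const struct fake_node *children, size_t n)
{
	size_t cores = 0;

	for (size_t i = 0; i < n; i++) {
		/* Mirrors the idea of the new check: not a core, so ignore it. */
		if (strcmp(children[i].compatible, "mediatek,scp-core") != 0)
			continue;
		cores++;
	}

	return cores;
}
```

Without the compatible check, the loop would treat all three children
as cores, which corresponds to the probe failure described above.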
P.S.: The reason why platforms don't crash without the scp_adsp clock controller
as a child of SCP is that the bootloader is actually doing basic init for
the SCP, hence the block is powered on when booting ;-)
Cheers,
Angelo
> Thanks,
> Mathieu
>
>> if (!cluster_of_data[core_id]) {
>> ret = -EINVAL;
>> dev_err(dev, "Not support core %d\n", core_id);
>> --
>> 2.44.0
>>
* Re: [PATCH v2 2/2] dmaengine: xilinx: dpdma: Add support for cyclic dma mode
From: Tomi Valkeinen @ 2024-03-27 12:53 UTC (permalink / raw)
To: Vishal Sagar
Cc: michal.simek, dmaengine, linux-arm-kernel, linux-kernel,
varunkumar.allagadapa, laurent.pinchart, vkoul
In-Reply-To: <20240228042124.3074044-3-vishal.sagar@amd.com>
On 28/02/2024 06:21, Vishal Sagar wrote:
> From: Rohit Visavalia <rohit.visavalia@xilinx.com>
>
> This patch adds support for DPDMA cyclic dma mode,
> DMA cyclic transfers are required by audio streaming.
>
> Signed-off-by: Rohit Visavalia <rohit.visavalia@amd.com>
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
> Signed-off-by: Vishal Sagar <vishal.sagar@amd.com>
>
> ---
> drivers/dma/xilinx/xilinx_dpdma.c | 97 +++++++++++++++++++++++++++++++
> 1 file changed, 97 insertions(+)
>
> diff --git a/drivers/dma/xilinx/xilinx_dpdma.c b/drivers/dma/xilinx/xilinx_dpdma.c
> index 28d9af8f00f0..88ad2f35538a 100644
> --- a/drivers/dma/xilinx/xilinx_dpdma.c
> +++ b/drivers/dma/xilinx/xilinx_dpdma.c
> @@ -669,6 +669,84 @@ static void xilinx_dpdma_chan_free_tx_desc(struct virt_dma_desc *vdesc)
> kfree(desc);
> }
>
> +/**
> + * xilinx_dpdma_chan_prep_cyclic - Prepare a cyclic dma descriptor
> + * @chan: DPDMA channel
> + * @buf_addr: buffer address
> + * @buf_len: buffer length
> + * @period_len: number of periods
> + * @flags: tx flags argument passed in to prepare function
> + *
> + * Prepare a tx descriptor including internal software/hardware descriptors
> + * for the given cyclic transaction.
> + *
> + * Return: A dma async tx descriptor on success, or NULL.
> + */
> +static struct dma_async_tx_descriptor *
> +xilinx_dpdma_chan_prep_cyclic(struct xilinx_dpdma_chan *chan,
> + dma_addr_t buf_addr, size_t buf_len,
> + size_t period_len, unsigned long flags)
> +{
> + struct xilinx_dpdma_tx_desc *tx_desc;
> + struct xilinx_dpdma_sw_desc *sw_desc, *last = NULL;
> + unsigned int periods = buf_len / period_len;
> + unsigned int i;
> +
> + tx_desc = xilinx_dpdma_chan_alloc_tx_desc(chan);
> + if (!tx_desc)
> + return (void *)tx_desc;
Just return NULL here?
> +
> + for (i = 0; i < periods; i++) {
> + struct xilinx_dpdma_hw_desc *hw_desc;
> +
> + if (!IS_ALIGNED(buf_addr, XILINX_DPDMA_ALIGN_BYTES)) {
> + dev_err(chan->xdev->dev,
> + "buffer should be aligned at %d B\n",
> + XILINX_DPDMA_ALIGN_BYTES);
> + goto error;
> + }
> +
> + sw_desc = xilinx_dpdma_chan_alloc_sw_desc(chan);
> + if (!sw_desc)
> + goto error;
> +
> + xilinx_dpdma_sw_desc_set_dma_addrs(chan->xdev, sw_desc, last,
> + &buf_addr, 1);
> + hw_desc = &sw_desc->hw;
> + hw_desc->xfer_size = period_len;
> + hw_desc->hsize_stride =
> + FIELD_PREP(XILINX_DPDMA_DESC_HSIZE_STRIDE_HSIZE_MASK,
> + period_len) |
> + FIELD_PREP(XILINX_DPDMA_DESC_HSIZE_STRIDE_STRIDE_MASK,
> + period_len);
> + hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_PREEMBLE;
> + hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_IGNORE_DONE;
> + hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_COMPLETE_INTR;
You could:
hw_desc->control |= XILINX_DPDMA_DESC_CONTROL_PREEMBLE |
XILINX_DPDMA_DESC_CONTROL_IGNORE_DONE |
XILINX_DPDMA_DESC_CONTROL_COMPLETE_INTR;
Although... Shouldn't control always be 0 here, so you can just
hw_desc->control = ...;
> +
> + list_add_tail(&sw_desc->node, &tx_desc->descriptors);
> +
> + buf_addr += period_len;
> + last = sw_desc;
> + }
> +
> + sw_desc = list_first_entry(&tx_desc->descriptors,
> + struct xilinx_dpdma_sw_desc, node);
> + last->hw.next_desc = lower_32_bits(sw_desc->dma_addr);
> + if (chan->xdev->ext_addr)
> + last->hw.addr_ext |=
> + FIELD_PREP(XILINX_DPDMA_DESC_ADDR_EXT_NEXT_ADDR_MASK,
> + upper_32_bits(sw_desc->dma_addr));
> +
> + last->hw.control |= XILINX_DPDMA_DESC_CONTROL_LAST_OF_FRAME;
> +
> + return vchan_tx_prep(&chan->vchan, &tx_desc->vdesc, flags);
> +
> +error:
> + xilinx_dpdma_chan_free_tx_desc(&tx_desc->vdesc);
> +
> + return NULL;
> +}
> +
> /**
> * xilinx_dpdma_chan_prep_interleaved_dma - Prepare an interleaved dma
> * descriptor
> @@ -1190,6 +1268,23 @@ static void xilinx_dpdma_chan_handle_err(struct xilinx_dpdma_chan *chan)
> /* -----------------------------------------------------------------------------
> * DMA Engine Operations
> */
> +static struct dma_async_tx_descriptor *
> +xilinx_dpdma_prep_dma_cyclic(struct dma_chan *dchan, dma_addr_t buf_addr,
> + size_t buf_len, size_t period_len,
> + enum dma_transfer_direction direction,
> + unsigned long flags)
> +{
> + struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
> +
> + if (direction != DMA_MEM_TO_DEV)
> + return NULL;
> +
> + if (buf_len % period_len)
> + return NULL;
> +
> + return xilinx_dpdma_chan_prep_cyclic(chan, buf_addr, buf_len,
> + period_len, flags);
The parameters should be aligned above.
> +}
>
> static struct dma_async_tx_descriptor *
> xilinx_dpdma_prep_interleaved_dma(struct dma_chan *dchan,
> @@ -1673,6 +1768,7 @@ static int xilinx_dpdma_probe(struct platform_device *pdev)
>
> dma_cap_set(DMA_SLAVE, ddev->cap_mask);
> dma_cap_set(DMA_PRIVATE, ddev->cap_mask);
> + dma_cap_set(DMA_CYCLIC, ddev->cap_mask);
> dma_cap_set(DMA_INTERLEAVE, ddev->cap_mask);
> dma_cap_set(DMA_REPEAT, ddev->cap_mask);
> dma_cap_set(DMA_LOAD_EOT, ddev->cap_mask);
> @@ -1680,6 +1776,7 @@ static int xilinx_dpdma_probe(struct platform_device *pdev)
>
> ddev->device_alloc_chan_resources = xilinx_dpdma_alloc_chan_resources;
> ddev->device_free_chan_resources = xilinx_dpdma_free_chan_resources;
> + ddev->device_prep_dma_cyclic = xilinx_dpdma_prep_dma_cyclic;
> ddev->device_prep_interleaved_dma = xilinx_dpdma_prep_interleaved_dma;
> /* TODO: Can we achieve better granularity ? */
> ddev->device_tx_status = dma_cookie_status;
While I'm not too familiar with dma engines, this looks fine to me. So,
other than the few cosmetic comments:
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Tomi
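
For readers following the review, here is a minimal user-space sketch of
what the loop in the patch above does: split the buffer into equal periods,
give every descriptor the same control bits, and point the last descriptor
back at the first so the transfer repeats. Everything here (struct
ring_desc, ring_build(), the DESC_CTRL_* values) is hypothetical and only
mimics the shape of the driver code; it is not the DPDMA API. Because
calloc() zeroes each descriptor in this sketch, the control word is set
with a single plain assignment, in the spirit of the comment above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical control bits; the values are illustrative only. */
#define DESC_CTRL_PREAMBLE      (1u << 0)
#define DESC_CTRL_IGNORE_DONE   (1u << 1)
#define DESC_CTRL_COMPLETE_INTR (1u << 2)
#define DESC_CTRL_LAST_OF_FRAME (1u << 3)

/* Hypothetical descriptor: one entry per period of the cyclic buffer. */
struct ring_desc {
	uint64_t src_addr;
	uint32_t xfer_size;
	uint32_t control;
	struct ring_desc *next;
};

/*
 * Build a circular chain of descriptors covering buf_len bytes in
 * period_len-sized chunks. Returns the first descriptor (an array of
 * buf_len / period_len entries), or NULL if the length is not a whole
 * number of periods or the allocation fails.
 */
struct ring_desc *ring_build(uint64_t buf_addr, size_t buf_len,
			     size_t period_len)
{
	struct ring_desc *descs;
	size_t periods, i;

	if (period_len == 0 || buf_len == 0 || buf_len % period_len)
		return NULL;

	periods = buf_len / period_len;
	assert(periods > 0);

	descs = calloc(periods, sizeof(*descs));
	if (!descs)
		return NULL;

	for (i = 0; i < periods; i++) {
		descs[i].src_addr = buf_addr + i * period_len;
		descs[i].xfer_size = (uint32_t)period_len;
		/* control is zero after calloc(), so assign instead of |= */
		descs[i].control = DESC_CTRL_PREAMBLE |
				   DESC_CTRL_IGNORE_DONE |
				   DESC_CTRL_COMPLETE_INTR;
		/* the last entry wraps around to the first one */
		descs[i].next = &descs[(i + 1) % periods];
	}
	descs[periods - 1].control |= DESC_CTRL_LAST_OF_FRAME;

	return descs;
}

/* Release a chain returned by ring_build(). */
void ring_free(struct ring_desc *first)
{
	free(first);
}
```

Calling ring_build(0x1000, 4096, 1024) yields four descriptors whose next
pointers form a loop; a length that is not a multiple of the period is
rejected, mirroring the buf_len % period_len check in the patch.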
* Re: [RESEND][PATCH v2 3/4] PM: EM: Add em_dev_update_chip_binning()
From: Dietmar Eggemann @ 2024-03-27 12:55 UTC (permalink / raw)
To: Lukasz Luba
Cc: linux-arm-kernel, sboyd, nm, linux-samsung-soc, daniel.lezcano,
rafael, viresh.kumar, krzysztof.kozlowski, alim.akhtar,
m.szyprowski, mhiramat, linux-kernel, linux-pm
In-Reply-To: <30ee98e9-3d9a-4be8-8127-043f68a7dcb1@arm.com>
On 26/03/2024 21:32, Lukasz Luba wrote:
>
>
> On 3/26/24 10:09, Dietmar Eggemann wrote:
>> On 22/03/2024 12:08, Lukasz Luba wrote:
[...]
>>> + return em_recalc_and_update(dev, pd, em_table);
>>> +}
>>> +EXPORT_SYMBOL_GPL(em_dev_update_chip_binning);
>>
>> In the previous version of 'chip-binning' you were using the new EM
>> interface em_dev_compute_costs() (1) which is now replaced by
>> em_recalc_and_update() -> em_compute_costs().
>>
>> https://lkml.kernel.org/r/20231220110339.1065505-2-lukasz.luba@arm.com
>>
>> Which leaves (1) still unused.
>>
>> That was my concern back then: that we shouldn't introduce EM
>> interfaces without a user:
>>
>> https://lkml.kernel.org/r/8fc499cf-fca1-4465-bff7-a93dfd36f3c8@arm.com
>>
>> What happens now with em_dev_compute_costs()?
>>
>
> For now it's not used, but modules which will create new EMs
> with custom power values will use it. When such a module has
> e.g. 5 EMs for one PD and only switches on one of them, then
> this em_dev_compute_costs() will be used at setup for those
> 5 EMs. Later it won't be used.
> I didn't want to combine the registration of new EM with
> the compute cost, because that will create overhead in the
> switching to new EM code path. Now we have only ~3us, which
> was the main goal.
>
> When our scmi-cpufreq gets support for EM update this
> compute cost will be used there.
OK, I see. I checked the reloadable EM test module and
em_dev_compute_costs() is used there like you described it.
* Re: [RESEND][PATCH v2 4/4] soc: samsung: exynos-asv: Update Energy Model after adjusting voltage
From: Dietmar Eggemann @ 2024-03-27 12:56 UTC (permalink / raw)
To: Lukasz Luba
Cc: linux-arm-kernel, sboyd, nm, linux-samsung-soc, daniel.lezcano,
rafael, viresh.kumar, krzysztof.kozlowski, alim.akhtar,
m.szyprowski, mhiramat, linux-kernel, linux-pm
In-Reply-To: <d5d6ae17-3ba1-4cb8-909f-865e47bfa45b@arm.com>
On 26/03/2024 21:12, Lukasz Luba wrote:
> Hi Dietmar,
>
> On 3/26/24 11:20, Dietmar Eggemann wrote:
>> On 22/03/2024 12:08, Lukasz Luba wrote:
>>
>> [...]
>>
>>> @@ -97,9 +98,17 @@ static int exynos_asv_update_opps(struct
>>> exynos_asv *asv)
>>> last_opp_table = opp_table;
>>> ret = exynos_asv_update_cpu_opps(asv, cpu);
>>> - if (ret < 0)
>>> + if (!ret) {
>>> + /*
>>> + * When the voltage for OPPs successfully
>>> + * changed, update the EM power values to
>>> + * reflect the reality and not use stale data
>>
>> At this point, can we really say that the voltage has changed?
>>
>> exynos_asv_update_cpu_opps()
>>
>> ...
>> ret = dev_pm_opp_adjust_voltage()
>> if (!ret)
>> em_dev_update_chip_binning()
>> ...
>>
>> dev_pm_opp_adjust_voltage() also returns 0 when the voltage value stays
>> the same?
>>
>> [...]
>
> The comment for the dev_pm_opp_adjust_voltage() says that it
> returns 0 if no modification was done or modification was
> successful. So I cannot distinguish in that driver code, but
> also there is no additional need to do it IMO (even framework
> doesn't do this).
Precisely. That's why the added comment in exynos_asv_update_opps():
"When the voltage for OPPs successfully __changed__, ..." is somewhat
misleading IMHO.
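
To make the point concrete, here is a small mock, assuming only the return
convention quoted above (0 both when nothing was modified and when the
modification succeeded). mock_adjust_voltage() and mock_adjust_and_report()
are hypothetical stand-ins, not the OPP library API, and the -EINVAL on a
NULL argument is an assumption of this sketch. A caller that really needed
to know whether the voltage changed would have to compare the value before
and after the call:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical OPP holding just a voltage in microvolts. */
struct mock_opp {
	unsigned long u_volt;
};

/*
 * Mock of the return convention only: 0 means "no change needed" as
 * well as "changed successfully", so the caller cannot tell them apart.
 */
int mock_adjust_voltage(struct mock_opp *opp, unsigned long u_volt)
{
	if (!opp)
		return -EINVAL;

	if (opp->u_volt == u_volt)
		return 0;	/* nothing was modified */

	opp->u_volt = u_volt;
	return 0;		/* modification was successful */
}

/*
 * Wrapper for a caller that wants the distinction: remember the old
 * value and report whether the call actually altered it.
 */
int mock_adjust_and_report(struct mock_opp *opp, unsigned long u_volt,
			   bool *changed)
{
	unsigned long old;
	int ret;

	if (!opp || !changed)
		return -EINVAL;

	old = opp->u_volt;
	ret = mock_adjust_voltage(opp, u_volt);
	*changed = !ret && opp->u_volt != old;

	return ret;
}
```

With this shape, a comment saying "when the voltage successfully changed"
is accurate only behind the changed flag; behind a bare !ret check it
would more precisely read "when the adjustment did not fail".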
* [PATCH RFC 0/3] mm/gup: consistently call it GUP-fast
From: David Hildenbrand @ 2024-03-27 13:05 UTC (permalink / raw)
To: linux-kernel
Cc: David Hildenbrand, Andrew Morton, Mike Rapoport, Jason Gunthorpe,
John Hubbard, Peter Xu, linux-arm-kernel, loongarch, linux-mips,
linuxppc-dev, linux-s390, linux-sh, linux-mm, linux-perf-users,
linux-fsdevel, x86
Some cleanups around function names, comments and the config option of
"GUP-fast" -- GUP without "lock" safety belts on.
With this cleanup it's easy to judge which functions are GUP-fast specific.
We now consistently call it "GUP-fast", avoiding mixing it with "fast GUP",
"lockless", or simply "gup" (which I always considered confusing in the
code).
So the magic now happens in functions that contain "gup_fast", whereby
gup_fast() is the entry point into that magic. Comments consistently
reference either "GUP-fast" or "gup_fast()".
Based on mm-unstable from today. I won't CC arch maintainers, but only
arch mailing lists, to reduce noise.
Tested on x86_64, cross compiled on a bunch of archs, whereby some of them
don't properly even compile on mm-unstable anymore in my usual setup
(alpha, arc, parisc64, sh) ... maybe the cross compilers are outdated,
but there are no new ones around. Hm.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: loongarch@lists.linux.dev
Cc: linux-mips@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: x86@kernel.org
David Hildenbrand (3):
mm/gup: consistently name GUP-fast functions
mm/treewide: rename CONFIG_HAVE_FAST_GUP to CONFIG_HAVE_GUP_FAST
mm: use "GUP-fast" instead "fast GUP" in remaining comments
arch/arm/Kconfig | 2 +-
arch/arm64/Kconfig | 2 +-
arch/loongarch/Kconfig | 2 +-
arch/mips/Kconfig | 2 +-
arch/powerpc/Kconfig | 2 +-
arch/s390/Kconfig | 2 +-
arch/sh/Kconfig | 2 +-
arch/x86/Kconfig | 2 +-
include/linux/rmap.h | 8 +-
kernel/events/core.c | 4 +-
mm/Kconfig | 2 +-
mm/filemap.c | 2 +-
mm/gup.c | 170 +++++++++++++++++++++--------------------
mm/internal.h | 2 +-
mm/khugepaged.c | 2 +-
15 files changed, 105 insertions(+), 101 deletions(-)
--
2.43.2
* [PATCH RFC 1/3] mm/gup: consistently name GUP-fast functions
From: David Hildenbrand @ 2024-03-27 13:05 UTC (permalink / raw)
To: linux-kernel
Cc: David Hildenbrand, Andrew Morton, Mike Rapoport, Jason Gunthorpe,
John Hubbard, Peter Xu, linux-arm-kernel, loongarch, linux-mips,
linuxppc-dev, linux-s390, linux-sh, linux-mm, linux-perf-users,
linux-fsdevel, x86
In-Reply-To: <20240327130538.680256-1-david@redhat.com>
Let's consistently call the "fast-only" part of GUP "GUP-fast" and rename
all relevant internal functions to start with "gup_fast", to make it
clearer that this is not ordinary GUP. The current mixture of
"lockless", "gup" and "gup_fast" is confusing.
Further, avoid the term "huge" when talking about a "leaf" -- for
example, we nowadays check pmd_leaf() because pmd_huge() is gone. For the
"hugepd"/"hugepte" stuff, it's part of the name ("is_hugepd"), so that
stays.
What remains is the "external" interface:
* get_user_pages_fast_only()
* get_user_pages_fast()
* pin_user_pages_fast()
And the "internal" interface that handles GUP-fast + fallback:
* internal_get_user_pages_fast()
The high-level internal function for GUP-fast is now:
* gup_fast()
The basic GUP-fast walker functions:
* gup_pgd_range() -> gup_fast_pgd_range()
* gup_p4d_range() -> gup_fast_p4d_range()
* gup_pud_range() -> gup_fast_pud_range()
* gup_pmd_range() -> gup_fast_pmd_range()
* gup_pte_range() -> gup_fast_pte_range()
* gup_huge_pgd() -> gup_fast_pgd_leaf()
* gup_huge_pud() -> gup_fast_pud_leaf()
* gup_huge_pmd() -> gup_fast_pmd_leaf()
The weird hugepd stuff:
* gup_huge_pd() -> gup_fast_hugepd()
* gup_hugepte() -> gup_fast_hugepte()
The weird devmap stuff:
* __gup_device_huge_pud() -> gup_fast_devmap_pud_leaf()
* __gup_device_huge_pmd() -> gup_fast_devmap_pmd_leaf()
* __gup_device_huge() -> gup_fast_devmap_leaf()
Helper functions:
* unpin_user_pages_lockless() -> gup_fast_unpin_user_pages()
* gup_fast_folio_allowed() is already properly named
* gup_fast_permitted() is already properly named
With "gup_fast()", we now even have a function that is referred to in
a comment in mm/mmu_gather.c.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/gup.c | 164 ++++++++++++++++++++++++++++---------------------------
1 file changed, 84 insertions(+), 80 deletions(-)
diff --git a/mm/gup.c b/mm/gup.c
index 03b74b148e30..c293aff30c5d 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -440,7 +440,7 @@ void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages,
}
EXPORT_SYMBOL(unpin_user_page_range_dirty_lock);
-static void unpin_user_pages_lockless(struct page **pages, unsigned long npages)
+static void gup_fast_unpin_user_pages(struct page **pages, unsigned long npages)
{
unsigned long i;
struct folio *folio;
@@ -2431,7 +2431,7 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
EXPORT_SYMBOL(get_user_pages_unlocked);
/*
- * Fast GUP
+ * GUP-fast
*
* get_user_pages_fast attempts to pin user pages by walking the page
* tables directly and avoids taking locks. Thus the walker needs to be
@@ -2445,7 +2445,7 @@ EXPORT_SYMBOL(get_user_pages_unlocked);
*
* Another way to achieve this is to batch up page table containing pages
* belonging to more than one mm_user, then rcu_sched a callback to free those
- * pages. Disabling interrupts will allow the fast_gup walker to both block
+ * pages. Disabling interrupts will allow the gup_fast() walker to both block
* the rcu_sched callback, and an IPI that we broadcast for splitting THPs
* (which is a relatively rare event). The code below adopts this strategy.
*
@@ -2589,9 +2589,9 @@ static void __maybe_unused undo_dev_pagemap(int *nr, int nr_start,
* also check pmd here to make sure pmd doesn't change (corresponds to
* pmdp_collapse_flush() in the THP collapse code path).
*/
-static int gup_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
struct dev_pagemap *pgmap = NULL;
int nr_start = *nr, ret = 0;
@@ -2688,20 +2688,19 @@ static int gup_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
*
* For a futex to be placed on a THP tail page, get_futex_key requires a
* get_user_pages_fast_only implementation that can pin pages. Thus it's still
- * useful to have gup_huge_pmd even if we can't operate on ptes.
+ * useful to have gup_fast_pmd_leaf even if we can't operate on ptes.
*/
-static int gup_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
return 0;
}
#endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
#if defined(CONFIG_ARCH_HAS_PTE_DEVMAP) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
-static int __gup_device_huge(unsigned long pfn, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+static int gup_fast_devmap_leaf(unsigned long pfn, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages, int *nr)
{
int nr_start = *nr;
struct dev_pagemap *pgmap = NULL;
@@ -2734,15 +2733,15 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr,
return addr == end;
}
-static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+static int gup_fast_devmap_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
unsigned long fault_pfn;
int nr_start = *nr;
fault_pfn = pmd_pfn(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
- if (!__gup_device_huge(fault_pfn, addr, end, flags, pages, nr))
+ if (!gup_fast_devmap_leaf(fault_pfn, addr, end, flags, pages, nr))
return 0;
if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) {
@@ -2752,15 +2751,15 @@ static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
return 1;
}
-static int __gup_device_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+static int gup_fast_devmap_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
unsigned long fault_pfn;
int nr_start = *nr;
fault_pfn = pud_pfn(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
- if (!__gup_device_huge(fault_pfn, addr, end, flags, pages, nr))
+ if (!gup_fast_devmap_leaf(fault_pfn, addr, end, flags, pages, nr))
return 0;
if (unlikely(pud_val(orig) != pud_val(*pudp))) {
@@ -2770,17 +2769,17 @@ static int __gup_device_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
return 1;
}
#else
-static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+static int gup_fast_devmap_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
BUILD_BUG();
return 0;
}
-static int __gup_device_huge_pud(pud_t pud, pud_t *pudp, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+static int gup_fast_devmap_pud_leaf(pud_t pud, pud_t *pudp, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
BUILD_BUG();
return 0;
@@ -2806,9 +2805,9 @@ static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end,
return (__boundary - 1 < end - 1) ? __boundary : end;
}
-static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+static int gup_fast_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
unsigned long pte_end;
struct page *page;
@@ -2855,7 +2854,7 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
return 1;
}
-static int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
+static int gup_fast_hugepd(hugepd_t hugepd, unsigned long addr,
unsigned int pdshift, unsigned long end, unsigned int flags,
struct page **pages, int *nr)
{
@@ -2866,14 +2865,14 @@ static int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
ptep = hugepte_offset(hugepd, addr, pdshift);
do {
next = hugepte_addr_end(addr, end, sz);
- if (!gup_hugepte(ptep, sz, addr, end, flags, pages, nr))
+ if (!gup_fast_hugepte(ptep, sz, addr, end, flags, pages, nr))
return 0;
} while (ptep++, addr = next, addr != end);
return 1;
}
#else
-static inline int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
+static inline int gup_fast_hugepd(hugepd_t hugepd, unsigned long addr,
unsigned int pdshift, unsigned long end, unsigned int flags,
struct page **pages, int *nr)
{
@@ -2881,9 +2880,9 @@ static inline int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
}
#endif /* CONFIG_ARCH_HAS_HUGEPD */
-static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
struct page *page;
struct folio *folio;
@@ -2895,8 +2894,8 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
if (pmd_devmap(orig)) {
if (unlikely(flags & FOLL_LONGTERM))
return 0;
- return __gup_device_huge_pmd(orig, pmdp, addr, end, flags,
- pages, nr);
+ return gup_fast_devmap_pmd_leaf(orig, pmdp, addr, end, flags,
+ pages, nr);
}
page = nth_page(pmd_page(orig), (addr & ~PMD_MASK) >> PAGE_SHIFT);
@@ -2925,9 +2924,9 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
return 1;
}
-static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
struct page *page;
struct folio *folio;
@@ -2939,8 +2938,8 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
if (pud_devmap(orig)) {
if (unlikely(flags & FOLL_LONGTERM))
return 0;
- return __gup_device_huge_pud(orig, pudp, addr, end, flags,
- pages, nr);
+ return gup_fast_devmap_pud_leaf(orig, pudp, addr, end, flags,
+ pages, nr);
}
page = nth_page(pud_page(orig), (addr & ~PUD_MASK) >> PAGE_SHIFT);
@@ -2970,9 +2969,9 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
return 1;
}
-static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
- unsigned long end, unsigned int flags,
- struct page **pages, int *nr)
+static int gup_fast_pgd_leaf(pgd_t orig, pgd_t *pgdp, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
int refs;
struct page *page;
@@ -3010,8 +3009,9 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
return 1;
}
-static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, unsigned long end,
- unsigned int flags, struct page **pages, int *nr)
+static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
unsigned long next;
pmd_t *pmdp;
@@ -3025,11 +3025,11 @@ static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, unsigned lo
return 0;
if (unlikely(pmd_leaf(pmd))) {
- /* See gup_pte_range() */
+ /* See gup_fast_pte_range() */
if (pmd_protnone(pmd))
return 0;
- if (!gup_huge_pmd(pmd, pmdp, addr, next, flags,
+ if (!gup_fast_pmd_leaf(pmd, pmdp, addr, next, flags,
pages, nr))
return 0;
@@ -3038,18 +3038,20 @@ static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, unsigned lo
* architecture have different format for hugetlbfs
* pmd format and THP pmd format
*/
- if (!gup_huge_pd(__hugepd(pmd_val(pmd)), addr,
- PMD_SHIFT, next, flags, pages, nr))
+ if (!gup_fast_hugepd(__hugepd(pmd_val(pmd)), addr,
+ PMD_SHIFT, next, flags, pages, nr))
return 0;
- } else if (!gup_pte_range(pmd, pmdp, addr, next, flags, pages, nr))
+ } else if (!gup_fast_pte_range(pmd, pmdp, addr, next, flags,
+ pages, nr))
return 0;
} while (pmdp++, addr = next, addr != end);
return 1;
}
-static int gup_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr, unsigned long end,
- unsigned int flags, struct page **pages, int *nr)
+static int gup_fast_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
unsigned long next;
pud_t *pudp;
@@ -3062,22 +3064,24 @@ static int gup_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr, unsigned lo
if (unlikely(!pud_present(pud)))
return 0;
if (unlikely(pud_leaf(pud))) {
- if (!gup_huge_pud(pud, pudp, addr, next, flags,
- pages, nr))
+ if (!gup_fast_pud_leaf(pud, pudp, addr, next, flags,
+ pages, nr))
return 0;
} else if (unlikely(is_hugepd(__hugepd(pud_val(pud))))) {
- if (!gup_huge_pd(__hugepd(pud_val(pud)), addr,
- PUD_SHIFT, next, flags, pages, nr))
+ if (!gup_fast_hugepd(__hugepd(pud_val(pud)), addr,
+ PUD_SHIFT, next, flags, pages, nr))
return 0;
- } else if (!gup_pmd_range(pudp, pud, addr, next, flags, pages, nr))
+ } else if (!gup_fast_pmd_range(pudp, pud, addr, next, flags,
+ pages, nr))
return 0;
} while (pudp++, addr = next, addr != end);
return 1;
}
-static int gup_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr, unsigned long end,
- unsigned int flags, struct page **pages, int *nr)
+static int gup_fast_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr,
+ unsigned long end, unsigned int flags, struct page **pages,
+ int *nr)
{
unsigned long next;
p4d_t *p4dp;
@@ -3091,17 +3095,18 @@ static int gup_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr, unsigned lo
return 0;
BUILD_BUG_ON(p4d_leaf(p4d));
if (unlikely(is_hugepd(__hugepd(p4d_val(p4d))))) {
- if (!gup_huge_pd(__hugepd(p4d_val(p4d)), addr,
- P4D_SHIFT, next, flags, pages, nr))
+ if (!gup_fast_hugepd(__hugepd(p4d_val(p4d)), addr,
+ P4D_SHIFT, next, flags, pages, nr))
return 0;
- } else if (!gup_pud_range(p4dp, p4d, addr, next, flags, pages, nr))
+ } else if (!gup_fast_pud_range(p4dp, p4d, addr, next, flags,
+ pages, nr))
return 0;
} while (p4dp++, addr = next, addr != end);
return 1;
}
-static void gup_pgd_range(unsigned long addr, unsigned long end,
+static void gup_fast_pgd_range(unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
@@ -3115,19 +3120,20 @@ static void gup_pgd_range(unsigned long addr, unsigned long end,
if (pgd_none(pgd))
return;
if (unlikely(pgd_leaf(pgd))) {
- if (!gup_huge_pgd(pgd, pgdp, addr, next, flags,
- pages, nr))
+ if (!gup_fast_pgd_leaf(pgd, pgdp, addr, next, flags,
+ pages, nr))
return;
} else if (unlikely(is_hugepd(__hugepd(pgd_val(pgd))))) {
- if (!gup_huge_pd(__hugepd(pgd_val(pgd)), addr,
- PGDIR_SHIFT, next, flags, pages, nr))
+ if (!gup_fast_hugepd(__hugepd(pgd_val(pgd)), addr,
+ PGDIR_SHIFT, next, flags, pages, nr))
return;
- } else if (!gup_p4d_range(pgdp, pgd, addr, next, flags, pages, nr))
+ } else if (!gup_fast_p4d_range(pgdp, pgd, addr, next, flags,
+ pages, nr))
return;
} while (pgdp++, addr = next, addr != end);
}
#else
-static inline void gup_pgd_range(unsigned long addr, unsigned long end,
+static inline void gup_fast_pgd_range(unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
}
@@ -3144,10 +3150,8 @@ static bool gup_fast_permitted(unsigned long start, unsigned long end)
}
#endif
-static unsigned long lockless_pages_from_mm(unsigned long start,
- unsigned long end,
- unsigned int gup_flags,
- struct page **pages)
+static unsigned long gup_fast(unsigned long start, unsigned long end,
+ unsigned int gup_flags, struct page **pages)
{
unsigned long flags;
int nr_pinned = 0;
@@ -3175,16 +3179,16 @@ static unsigned long lockless_pages_from_mm(unsigned long start,
* that come from THPs splitting.
*/
local_irq_save(flags);
- gup_pgd_range(start, end, gup_flags, pages, &nr_pinned);
+ gup_fast_pgd_range(start, end, gup_flags, pages, &nr_pinned);
local_irq_restore(flags);
/*
* When pinning pages for DMA there could be a concurrent write protect
- * from fork() via copy_page_range(), in this case always fail fast GUP.
+ * from fork() via copy_page_range(), in this case always fail GUP-fast.
*/
if (gup_flags & FOLL_PIN) {
if (read_seqcount_retry(&current->mm->write_protect_seq, seq)) {
- unpin_user_pages_lockless(pages, nr_pinned);
+ gup_fast_unpin_user_pages(pages, nr_pinned);
return 0;
} else {
sanity_check_pinned_pages(pages, nr_pinned);
@@ -3224,7 +3228,7 @@ static int internal_get_user_pages_fast(unsigned long start,
if (unlikely(!access_ok((void __user *)start, len)))
return -EFAULT;
- nr_pinned = lockless_pages_from_mm(start, end, gup_flags, pages);
+ nr_pinned = gup_fast(start, end, gup_flags, pages);
if (nr_pinned == nr_pages || gup_flags & FOLL_FAST_ONLY)
return nr_pinned;
--
2.43.2
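
As an aside on the last hunks of the patch above: the FOLL_PIN path in
gup_fast() samples a sequence counter, pins pages without locks, and
undoes everything if the counter moved in the meantime. The user-space
sketch below imitates only that sample/work/recheck/undo shape with C11
atomics. All names (toy_mm, toy_gup_fast_pin(), toy_fork_write_protect())
are hypothetical, pin counts are plain ints, and the "concurrent" writer
is simulated by a callback, so this is an illustration under those
assumptions rather than the kernel's seqcount implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical mm with a sequence counter; odd means a writer is active. */
struct toy_mm {
	atomic_uint write_protect_seq;
};

/* Callback used to simulate something racing with the lockless walk. */
typedef void (*toy_hook)(struct toy_mm *mm);

static unsigned int toy_read_begin(struct toy_mm *mm)
{
	return atomic_load_explicit(&mm->write_protect_seq,
				    memory_order_acquire);
}

static int toy_read_retry(struct toy_mm *mm, unsigned int start)
{
	return atomic_load_explicit(&mm->write_protect_seq,
				    memory_order_acquire) != start;
}

/* Simulated writer: one increment on entry, one on exit. */
void toy_fork_write_protect(struct toy_mm *mm)
{
	atomic_fetch_add(&mm->write_protect_seq, 1);
	/* ... write-protect page tables here ... */
	atomic_fetch_add(&mm->write_protect_seq, 1);
}

/*
 * Pin n "pages" by bumping their counts. If the counter was odd at the
 * start, or changed while we were pinning, drop every pin taken and
 * report 0 so the caller can fall back to the slow path.
 */
size_t toy_gup_fast_pin(struct toy_mm *mm, int *pincount, size_t n,
			toy_hook concurrent)
{
	unsigned int seq = toy_read_begin(mm);
	size_t i;

	if (seq & 1)
		return 0;

	for (i = 0; i < n; i++)
		pincount[i]++;

	if (concurrent)
		concurrent(mm);	/* stand-in for a racing fork() */

	if (toy_read_retry(mm, seq)) {
		for (i = 0; i < n; i++)
			pincount[i]--;
		return 0;
	}

	return n;
}
```

Passing NULL as the hook models the uncontended case and returns n;
passing toy_fork_write_protect models the race, in which case all pins
taken during the attempt are released again before returning 0.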
* [PATCH RFC 2/3] mm/treewide: rename CONFIG_HAVE_FAST_GUP to CONFIG_HAVE_GUP_FAST
From: David Hildenbrand @ 2024-03-27 13:05 UTC (permalink / raw)
To: linux-kernel
Cc: David Hildenbrand, Andrew Morton, Mike Rapoport, Jason Gunthorpe,
John Hubbard, Peter Xu, linux-arm-kernel, loongarch, linux-mips,
linuxppc-dev, linux-s390, linux-sh, linux-mm, linux-perf-users,
linux-fsdevel, x86
In-Reply-To: <20240327130538.680256-1-david@redhat.com>
Nowadays, we call it "GUP-fast", the external interface includes
functions like "get_user_pages_fast()", and we renamed all internal
functions to reflect that as well.
Let's make the config option reflect that.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
arch/arm/Kconfig | 2 +-
arch/arm64/Kconfig | 2 +-
arch/loongarch/Kconfig | 2 +-
arch/mips/Kconfig | 2 +-
arch/powerpc/Kconfig | 2 +-
arch/s390/Kconfig | 2 +-
arch/sh/Kconfig | 2 +-
arch/x86/Kconfig | 2 +-
include/linux/rmap.h | 8 ++++----
kernel/events/core.c | 4 ++--
mm/Kconfig | 2 +-
mm/gup.c | 6 +++---
mm/internal.h | 2 +-
13 files changed, 19 insertions(+), 19 deletions(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index b14aed3a17ab..817918f6635a 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -99,7 +99,7 @@ config ARM
select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
select HAVE_EXIT_THREAD
- select HAVE_FAST_GUP if ARM_LPAE
+ select HAVE_GUP_FAST if ARM_LPAE
select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_FUNCTION_GRAPH_TRACER
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7b11c98b3e84..de076a191e9f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -205,7 +205,7 @@ config ARM64
select HAVE_SAMPLE_FTRACE_DIRECT
select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
select HAVE_EFFICIENT_UNALIGNED_ACCESS
- select HAVE_FAST_GUP
+ select HAVE_GUP_FAST
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_ERROR_INJECTION
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index a5f300ec6f28..cd642eefd9e5 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -119,7 +119,7 @@ config LOONGARCH
select HAVE_EBPF_JIT
select HAVE_EFFICIENT_UNALIGNED_ACCESS if !ARCH_STRICT_ALIGN
select HAVE_EXIT_THREAD
- select HAVE_FAST_GUP
+ select HAVE_GUP_FAST
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_ARG_ACCESS_API
select HAVE_FUNCTION_ERROR_INJECTION
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 06ef440d16ce..10f7c6d88163 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -68,7 +68,7 @@ config MIPS
select HAVE_DYNAMIC_FTRACE
select HAVE_EBPF_JIT if !CPU_MICROMIPS
select HAVE_EXIT_THREAD
- select HAVE_FAST_GUP
+ select HAVE_GUP_FAST
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1c4be3373686..e42cc8cd415f 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -236,7 +236,7 @@ config PPC
select HAVE_DYNAMIC_FTRACE_WITH_REGS if ARCH_USING_PATCHABLE_FUNCTION_ENTRY || MPROFILE_KERNEL || PPC32
select HAVE_EBPF_JIT
select HAVE_EFFICIENT_UNALIGNED_ACCESS
- select HAVE_FAST_GUP
+ select HAVE_GUP_FAST
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_ARG_ACCESS_API
select HAVE_FUNCTION_DESCRIPTORS if PPC64_ELF_ABI_V1
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 8f01ada6845e..d9aed4c93ee6 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -174,7 +174,7 @@ config S390
select HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_EBPF_JIT if HAVE_MARCH_Z196_FEATURES
select HAVE_EFFICIENT_UNALIGNED_ACCESS
- select HAVE_FAST_GUP
+ select HAVE_GUP_FAST
select HAVE_FENTRY
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_ARG_ACCESS_API
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 2ad3e29f0ebe..7292542f75e8 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -38,7 +38,7 @@ config SUPERH
select HAVE_DEBUG_BUGVERBOSE
select HAVE_DEBUG_KMEMLEAK
select HAVE_DYNAMIC_FTRACE
- select HAVE_FAST_GUP if MMU
+ select HAVE_GUP_FAST if MMU
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
select HAVE_FTRACE_MCOUNT_RECORD
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 39886bab943a..f82171292cf3 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -221,7 +221,7 @@ config X86
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_EISA
select HAVE_EXIT_THREAD
- select HAVE_FAST_GUP
+ select HAVE_GUP_FAST
select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_GRAPH_RETVAL if HAVE_FUNCTION_GRAPH_TRACER
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b7944a833668..9bf9324214fc 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -284,7 +284,7 @@ static inline int hugetlb_try_share_anon_rmap(struct folio *folio)
VM_WARN_ON_FOLIO(!PageAnonExclusive(&folio->page), folio);
/* Paired with the memory barrier in try_grab_folio(). */
- if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
+ if (IS_ENABLED(CONFIG_HAVE_GUP_FAST))
smp_mb();
if (unlikely(folio_maybe_dma_pinned(folio)))
@@ -295,7 +295,7 @@ static inline int hugetlb_try_share_anon_rmap(struct folio *folio)
* This is conceptually a smp_wmb() paired with the smp_rmb() in
* gup_must_unshare().
*/
- if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
+ if (IS_ENABLED(CONFIG_HAVE_GUP_FAST))
smp_mb__after_atomic();
return 0;
}
@@ -541,7 +541,7 @@ static __always_inline int __folio_try_share_anon_rmap(struct folio *folio,
*/
/* Paired with the memory barrier in try_grab_folio(). */
- if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
+ if (IS_ENABLED(CONFIG_HAVE_GUP_FAST))
smp_mb();
if (unlikely(folio_maybe_dma_pinned(folio)))
@@ -552,7 +552,7 @@ static __always_inline int __folio_try_share_anon_rmap(struct folio *folio,
* This is conceptually a smp_wmb() paired with the smp_rmb() in
* gup_must_unshare().
*/
- if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
+ if (IS_ENABLED(CONFIG_HAVE_GUP_FAST))
smp_mb__after_atomic();
return 0;
}
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 724e6d7e128f..c5a0dc1f135f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7539,7 +7539,7 @@ static u64 perf_get_pgtable_size(struct mm_struct *mm, unsigned long addr)
{
u64 size = 0;
-#ifdef CONFIG_HAVE_FAST_GUP
+#ifdef CONFIG_HAVE_GUP_FAST
pgd_t *pgdp, pgd;
p4d_t *p4dp, p4d;
pud_t *pudp, pud;
@@ -7587,7 +7587,7 @@ static u64 perf_get_pgtable_size(struct mm_struct *mm, unsigned long addr)
if (pte_present(pte))
size = pte_leaf_size(pte);
pte_unmap(ptep);
-#endif /* CONFIG_HAVE_FAST_GUP */
+#endif /* CONFIG_HAVE_GUP_FAST */
return size;
}
diff --git a/mm/Kconfig b/mm/Kconfig
index b1448aa81e15..5fd14a536bcc 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -473,7 +473,7 @@ config ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
config HAVE_MEMBLOCK_PHYS_MAP
bool
-config HAVE_FAST_GUP
+config HAVE_GUP_FAST
depends on MMU
bool
diff --git a/mm/gup.c b/mm/gup.c
index c293aff30c5d..e35c6a6a0301 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2463,7 +2463,7 @@ EXPORT_SYMBOL(get_user_pages_unlocked);
*
* This code is based heavily on the PowerPC implementation by Nick Piggin.
*/
-#ifdef CONFIG_HAVE_FAST_GUP
+#ifdef CONFIG_HAVE_GUP_FAST
/*
* Used in the GUP-fast path to determine whether GUP is permitted to work on
@@ -3137,7 +3137,7 @@ static inline void gup_fast_pgd_range(unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
}
-#endif /* CONFIG_HAVE_FAST_GUP */
+#endif /* CONFIG_HAVE_GUP_FAST */
#ifndef gup_fast_permitted
/*
@@ -3157,7 +3157,7 @@ static unsigned long gup_fast(unsigned long start, unsigned long end,
int nr_pinned = 0;
unsigned seq;
- if (!IS_ENABLED(CONFIG_HAVE_FAST_GUP) ||
+ if (!IS_ENABLED(CONFIG_HAVE_GUP_FAST) ||
!gup_fast_permitted(start, end))
return 0;
diff --git a/mm/internal.h b/mm/internal.h
index 8e11f7b2da21..c4587c20c881 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1199,7 +1199,7 @@ static inline bool gup_must_unshare(struct vm_area_struct *vma,
}
/* Paired with a memory barrier in folio_try_share_anon_rmap_*(). */
- if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
+ if (IS_ENABLED(CONFIG_HAVE_GUP_FAST))
smp_rmb();
/*
--
2.43.2
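
[Editor's sketch, not part of the original patch.] A rename like the one
above has to catch every user, because IS_ENABLED() on a symbol that no
longer exists does not fail to build; it quietly evaluates to 0. The
userspace sketch below imitates that behaviour with a simplified macro.
All SKETCH_* names and the two helper functions are hypothetical, and the
macro is only an approximation of the kernel's definition in
include/linux/kconfig.h (it ignores the =m case).

```c
#include <assert.h>

/*
 * Simplified stand-in for IS_ENABLED(). If the option is defined to 1,
 * SKETCH_PLACEHOLDER_1 expands to "0," and shifts a 1 into the second
 * argument slot; otherwise the junk token stays in slot one and the
 * second slot holds 0.
 */
#define SKETCH_PLACEHOLDER_1 0,
#define SKETCH_SECOND(ignored, val, ...) val
#define SKETCH_DEFINED3(arg1_or_junk) SKETCH_SECOND(arg1_or_junk 1, 0, 0)
#define SKETCH_DEFINED2(val) SKETCH_DEFINED3(SKETCH_PLACEHOLDER_##val)
#define SKETCH_IS_ENABLED(option) SKETCH_DEFINED2(option)

/* Assumption: the build system now emits only the new symbol name. */
#define CONFIG_HAVE_GUP_FAST 1

/* A user that was converted to the new name sees the feature as enabled. */
static int new_name_enabled(void)
{
	return SKETCH_IS_ENABLED(CONFIG_HAVE_GUP_FAST);
}

/* A user that was missed still compiles, but sees the feature as disabled. */
static int old_name_enabled(void)
{
	return SKETCH_IS_ENABLED(CONFIG_HAVE_FAST_GUP);
}
```

In this sketch a forgotten call site would silently drop its memory
barrier rather than break the build, which is why grepping for the old
name after such a rename is worthwhile.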
* [PATCH RFC 3/3] mm: use "GUP-fast" instead "fast GUP" in remaining comments
From: David Hildenbrand @ 2024-03-27 13:05 UTC (permalink / raw)
To: linux-kernel
Cc: David Hildenbrand, Andrew Morton, Mike Rapoport, Jason Gunthorpe,
John Hubbard, Peter Xu, linux-arm-kernel, loongarch, linux-mips,
linuxppc-dev, linux-s390, linux-sh, linux-mm, linux-perf-users,
linux-fsdevel, x86
In-Reply-To: <20240327130538.680256-1-david@redhat.com>
Let's fix up the remaining comments so that they consistently say
"GUP-fast" instead of "fast GUP".
Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/filemap.c | 2 +-
mm/khugepaged.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 387b394754fa..c668e11cd6ef 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1810,7 +1810,7 @@ EXPORT_SYMBOL(page_cache_prev_miss);
* C. Return the page to the page allocator
*
* This means that any page may have its reference count temporarily
- * increased by a speculative page cache (or fast GUP) lookup as it can
+ * increased by a speculative page cache (or GUP-fast) lookup as it can
* be allocated by another user before the RCU grace period expires.
* Because the refcount temporarily acquired here may end up being the
* last refcount on the page, any page allocation must be freeable by
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 38830174608f..6972fa05132e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1169,7 +1169,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
* huge and small TLB entries for the same virtual address to
* avoid the risk of CPU bugs in that area.
*
- * Parallel fast GUP is fine since fast GUP will back off when
+ * Parallel GUP-fast is fine since GUP-fast will back off when
* it detects PMD is changed.
*/
_pmd = pmdp_collapse_flush(vma, address, pmd);
--
2.43.2
* Re: [PATCH 3/4] arm64: dts: rockchip: Add VEPU121 to rk3588
From: Sebastian Reichel @ 2024-03-27 13:09 UTC (permalink / raw)
To: Link Mauve
Cc: Krzysztof Kozlowski, linux-kernel, Ezequiel Garcia, Philipp Zabel,
Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Heiko Stuebner, Joerg Roedel, Will Deacon,
Robin Murphy, Cristian Ciocaltea, Dragan Simic, Shreeya Patel,
Chris Morgan, Andy Yan, Nicolas Frattaroli, linux-media,
linux-rockchip, devicetree, linux-arm-kernel, iommu
In-Reply-To: <ZgQTrwOUtdZ1nRs0@desktop>
Hi,
On Wed, Mar 27, 2024 at 01:40:15PM +0100, Link Mauve wrote:
> On Thu, Mar 21, 2024 at 09:15:38AM +0100, Krzysztof Kozlowski wrote:
> > On 20/03/2024 18:37, Emmanuel Gil Peyrot wrote:
> > > The TRM (version 1.0 page 385) lists five VEPU121 cores, but only four
> > > interrupts are listed (on page 24), so I’ve only enabled four of them
> > > for now.
> > >
> > > Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
> > > ---
> > > arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 80 +++++++++++++++++++++++
> > > 1 file changed, 80 insertions(+)
> > >
> > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > > index 2a23b4dc36e4..fe77b56ac9a0 100644
> > > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > > @@ -2488,6 +2488,86 @@ gpio4: gpio@fec50000 {
> > > };
> > > };
> > >
> > > + jpeg_enc0: video-codec@fdba0000 {
> > > + compatible = "rockchip,rk3588-vepu121";
> > > + reg = <0x0 0xfdba0000 0x0 0x800>;
> > > + interrupts = <GIC_SPI 122 IRQ_TYPE_LEVEL_HIGH 0>;
> > > + clocks = <&cru ACLK_JPEG_ENCODER0>, <&cru HCLK_JPEG_ENCODER0>;
> > > + clock-names = "aclk", "hclk";
> > > + iommus = <&jpeg_enc0_mmu>;
> > > + power-domains = <&power RK3588_PD_VDPU>;
> > > + };
> > > +
> > > + jpeg_enc0_mmu: iommu@fdba0800 {
> > > + compatible = "rockchip,rk3588-iommu";
> >
> > It does not look like you tested the DTS against bindings. Please run
> > `make dtbs_check W=1` (see
> > Documentation/devicetree/bindings/writing-schema.rst or
> > https://www.linaro.org/blog/tips-and-tricks-for-validating-devicetree-sources-with-the-devicetree-schema/
> > for instructions).
>
> Even on master I get an exception about this unresolvable file:
> referencing.exceptions.Unresolvable: cache-controller.yaml#
>
> Yet it seems to be present in only three files, all of them unrelated to
> the rockchip board I’m interested in (it seems), so I’m not sure what to
> do about that.
The trace looks like you tried using dt-schema with jsonschema
version 4.18+, which is known to be broken:
https://github.com/devicetree-org/dt-schema/issues/109
Greetings,
-- Sebastian
* Re: [RESEND PATCH v3] perf/marvell: Marvell PEM performance monitor support
From: Andrew Lunn @ 2024-03-27 13:11 UTC (permalink / raw)
To: Gowthami Thiagarajan
Cc: will, mark.rutland, linux-arm-kernel, linux-kernel, sgoutham,
gcherian, lcherian
In-Reply-To: <20240327072117.1556653-1-gthiagarajan@marvell.com>
On Wed, Mar 27, 2024 at 12:51:17PM +0530, Gowthami Thiagarajan wrote:
> PCI Express Interface PMU includes various performance counters
> to monitor the data that is transmitted over the PCIe link. The
> counters track various inbound and outbound transactions which
> includes separate counters for posted/non-posted/completion TLPs.
> Also, inbound and outbound memory read requests along with their
> latencies can also be monitored. Address Translation Services(ATS)events
> such as ATS Translation, ATS Page Request, ATS Invalidation along with
> their corresponding latencies are also supported.
>
> The performance counters are 64 bits wide.
>
> For instance,
> perf stat -e ib_tlp_pr <workload>
> tracks the inbound posted TLPs for the workload.
>
> Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com>
> ---
> v2->v3 changes:
> - Dropped device tree support as the acpi table based probing is used.
So people using DT cannot use this driver? Can they use the PCIe
interface?
There does not appear to be any ACPI binding, it is not reading any
properties from ACPI tables etc. So the DT binding should be
trivial...
> index 000000000000..d4175483b982
> --- /dev/null
> +++ b/drivers/perf/marvell_pem_pmu.c
> @@ -0,0 +1,428 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Marvell PEM(PCIe RC) Performance Monitor Driver
> + *
> + * Copyright (C) 2024 Marvell.
> + */
> +
> +#include <linux/acpi.h>
> +#include <linux/init.h>
> +#include <linux/io.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_device.h>
Why do you need these header files? I don't see any calls to of_
functions.
> +static int pem_perf_probe(struct platform_device *pdev)
> +{
> + struct pem_pmu *pem_pmu;
> + struct resource *res;
> + void __iomem *base;
> + char *name;
> + int ret;
> +
> + pem_pmu = devm_kzalloc(&pdev->dev, sizeof(*pem_pmu), GFP_KERNEL);
> + if (!pem_pmu)
> + return -ENOMEM;
> +
> + pem_pmu->dev = &pdev->dev;
> + platform_set_drvdata(pdev, pem_pmu);
> +
> + base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
> + if (IS_ERR(base))
> + return PTR_ERR(base);
> +
> + pem_pmu->base = base;
> +
> + pem_pmu->pmu = (struct pmu) {
> + .module = THIS_MODULE,
> + .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
> + .task_ctx_nr = perf_invalid_context,
> + .attr_groups = pem_perf_attr_groups,
> + .event_init = pem_perf_event_init,
> + .add = pem_perf_event_add,
> + .del = pem_perf_event_del,
> + .start = pem_perf_event_start,
> + .stop = pem_perf_event_stop,
> + .read = pem_perf_event_update,
> + };
> +
> + /* Choose this cpu to collect perf data */
> + pem_pmu->cpu = raw_smp_processor_id();
> +
> + name = devm_kasprintf(pem_pmu->dev, GFP_KERNEL, "mrvl_pcie_rc_pmu_%llx",
> + res->start);
> + if (!name)
> + return -ENOMEM;
> +
> + cpuhp_state_add_instance_nocalls
> + (CPUHP_AP_PERF_ARM_MARVELL_PEM_ONLINE,
> + &pem_pmu->node);
> +
> + ret = perf_pmu_register(&pem_pmu->pmu, name, -1);
> + if (ret)
> + goto error;
> +
> + pr_info("Marvell PEM(PCIe RC) PMU Driver for pem@%llx\n", res->start);
Please don't spam the kernel log like this.
Andrew
* Re: [PATCH 4/4] kprobes: Remove core dependency on modules
From: Jarkko Sakkinen @ 2024-03-27 13:23 UTC (permalink / raw)
To: Masami Hiramatsu, Mark Rutland
Cc: linux-kernel, agordeev, anil.s.keshavamurthy, aou, bp,
catalin.marinas, dave.hansen, davem, gor, hca, jcalvinowens,
linux-arm-kernel, mingo, mpe, naveen.n.rao, palmer, paul.walmsley,
tglx, will
In-Reply-To: <20240327090155.873f1ed32700dbdb75f8eada@kernel.org>
On Wed, 2024-03-27 at 09:01 +0900, Masami Hiramatsu wrote:
> On Tue, 26 Mar 2024 17:38:18 +0000
> Mark Rutland <mark.rutland@arm.com> wrote:
>
> > On Tue, Mar 26, 2024 at 07:13:51PM +0200, Jarkko Sakkinen wrote:
> > > On Tue Mar 26, 2024 at 6:36 PM EET, Mark Rutland wrote:
> >
> > > > +#ifdef CONFIG_MODULES
> > > > /* Check if 'p' is probing a module. */
> > > > *probed_mod = __module_text_address((unsigned long) p->addr);
> > > > if (*probed_mod) {
> > > > @@ -1605,6 +1606,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
> > > > ret = -ENOENT;
> > > > }
> > > > }
> > > > +#endif
> > >
> > > This can be scoped a bit more (see v7 of my patch set).
> >
> > > > +#ifdef CONFIG_MODULES
> > > > static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
> > > > {
> > > > char *p;
> > > > @@ -129,6 +130,9 @@ static nokprobe_inline bool trace_kprobe_module_exist(struct trace_kprobe *tk)
> > > >
> > > > return ret;
> > > > }
> > > > +#else
> > > > +#define trace_kprobe_module_exist(tk) false /* aka a module never exists */
> > > > +#endif /* CONFIG_MODULES */
> > > >
> > > > static bool trace_kprobe_is_busy(struct dyn_event *ev)
> > > > {
> > > > @@ -670,6 +674,7 @@ static int register_trace_kprobe(struct trace_kprobe *tk)
> > > > return ret;
> > > > }
> > > >
> > > > +#ifdef CONFIG_MODULES
> > > > /* Module notifier call back, checking event on the module */
> > > > static int trace_kprobe_module_callback(struct notifier_block *nb,
> > > > unsigned long val, void *data)
> > > > @@ -699,6 +704,9 @@ static int trace_kprobe_module_callback(struct notifier_block *nb,
> > > >
> > > > return NOTIFY_DONE;
> > > > }
> > > > +#else
> > > > +#define trace_kprobe_module_callback (NULL)
> > > > +#endif /* CONFIG_MODULES */
> > >
> > > The last two CONFIG_MODULES sections could be combined. This was
> > > also in v7.
> >
> > > Other than that, lgtm.
> >
> > Great! I've folded your v7 changes in, and pushed that out to:
> >
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=kprobes/without-modules
> >
> > I'll hold off sending that out to the list until other folk have
> > had a chance to comment.
>
> Yeah, the updated one looks good to me too.
>
> Thanks!
Yeah, I'm also planning to test this on x86 by instrumenting sgx_* calls,
as I need to test the cgroups support for it anyway, so I can help with
coverage on both RISC-V and x86 (once I find a good time slot).
BR, Jarkko
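
[Editor's sketch, not part of the original message.] The suggestion to
combine the last two CONFIG_MODULES sections could look roughly like the
userspace sketch below: one #ifdef block holds both real implementations,
and one #else block holds both stubs. Every sketch_* name is hypothetical,
the "module:symbol" check merely stands in for a real module lookup, and
the actual v7 changes are not shown in this thread, so they may well
differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct trace_kprobe and the notifier type. */
struct sketch_probe {
	const char *symbol;
};
typedef int (*sketch_notifier_fn)(unsigned long val, void *data);

#ifdef CONFIG_MODULES
/* Both module-dependent helpers live in a single conditional section. */
static bool sketch_module_exist(const struct sketch_probe *p)
{
	/* Placeholder logic: treat "module:symbol" as a module probe. */
	return p->symbol && strchr(p->symbol, ':') != NULL;
}

static int sketch_module_callback(unsigned long val, void *data)
{
	(void)val;
	(void)data;
	return 0;
}
#define SKETCH_MODULE_CALLBACK sketch_module_callback
#else
/* Without module support: a module never exists, and there is no callback. */
#define sketch_module_exist(p) false
#define SKETCH_MODULE_CALLBACK ((sketch_notifier_fn)NULL)
#endif /* CONFIG_MODULES */

/* Callers stay free of #ifdefs and compile the same way in both configs. */
static bool sketch_probe_needs_module(const struct sketch_probe *p)
{
	(void)p;
	return sketch_module_exist(p);
}

static sketch_notifier_fn sketch_get_callback(void)
{
	return SKETCH_MODULE_CALLBACK;
}
```

Built without -DCONFIG_MODULES, the stubs are selected and the callers
need no conditional code of their own.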
* Re: [PATCH 1/2] arm64: dts: rockchip: enable gpu on rk3588-jaguar
From: Heiko Stuebner @ 2024-03-27 13:33 UTC (permalink / raw)
To: Heiko Stuebner
Cc: quentin.schulz, Heiko Stuebner, linux-rockchip, linux-arm-kernel
In-Reply-To: <20240327112120.1181570-1-heiko@sntech.de>
On Wed, 27 Mar 2024 12:21:19 +0100, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@cherry.de>
>
> Enable the mali gpu node and add the board-specific supply-regulator.
>
>
Applied, thanks!
[1/2] arm64: dts: rockchip: enable gpu on rk3588-jaguar
commit: 51ca6a22c52b10a787fa3fad2343279f11da27bb
[2/2] arm64: dts: rockchip: enable gpu on rk3588-tiger
commit: f5256f8ed4b729c3ab9d9cd7d406313773484b59
Best regards,
--
Heiko Stuebner <heiko@sntech.de>
* Re: [PATCH v1 0/3] Speed up boot with faster linear map creation
From: Ard Biesheuvel @ 2024-03-27 13:36 UTC (permalink / raw)
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, Mark Rutland, David Hildenbrand,
Donald Dutile, Eric Chanudet, linux-arm-kernel, linux-kernel
In-Reply-To: <e1078fe2-0b0b-48c6-8d24-3b2e835201b1@arm.com>
On Wed, 27 Mar 2024 at 12:43, Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 27/03/2024 10:09, Ard Biesheuvel wrote:
> > Hi Ryan,
> >
> > On Tue, 26 Mar 2024 at 12:15, Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>
> >> Hi All,
> >>
> >> It turns out that creating the linear map can take a significant proportion of
> >> the total boot time, especially when rodata=full. And a large portion of the
> >> time it takes to create the linear map is issuing TLBIs. This series reworks the
> >> kernel pgtable generation code to significantly reduce the number of TLBIs. See
> >> each patch for details.
> >>
> >> The below shows the execution time of map_mem() across a couple of different
> >> systems with different RAM configurations. We measure after applying each patch
> >> and show the improvement relative to base (v6.9-rc1):
> >>
> >> | Apple M2 VM | Ampere Altra| Ampere Altra| Ampere Altra
> >> | VM, 16G | VM, 64G | VM, 256G | Metal, 512G
> >> ---------------|-------------|-------------|-------------|-------------
> >> | ms (%) | ms (%) | ms (%) | ms (%)
> >> ---------------|-------------|-------------|-------------|-------------
> >> base | 151 (0%) | 2191 (0%) | 8990 (0%) | 17443 (0%)
> >> no-cont-remap | 77 (-49%) | 429 (-80%) | 1753 (-80%) | 3796 (-78%)
> >> no-alloc-remap | 77 (-49%) | 375 (-83%) | 1532 (-83%) | 3366 (-81%)
> >> lazy-unmap | 63 (-58%) | 330 (-85%) | 1312 (-85%) | 2929 (-83%)
> >>
> >> This series applies on top of v6.9-rc1. All mm selftests pass. I haven't yet
> >> tested all VA size configs (although I don't anticipate any issues); I'll do
> >> this as part of followup.
> >>
> >
> > These are very nice results!
> >
> > Before digging into the details: do we still have a strong case for
> > supporting contiguous PTEs and PMDs in these routines?
>
> We are currently using contptes and pmds for the linear map when rodata=[on|off]
> IIRC?
In principle, yes. In practice?
> I don't see a need to remove the capability personally.
>
Since we are making changes here, it is a relevant question to ask imho.
> Also I was talking with Mark R yesterday and he suggested that an even better
> solution might be to create a temp pgtable that maps the linear map with pmds,
> switch to it, then create the real pgtable that maps the linear map with ptes,
> then switch to that. The benefit being that we can avoid the fixmap entirely
> when creating the second pgtable - we think this would likely be significantly
> faster still.
>
If this is going to be a temporary mapping for the duration of the
initial population of the linear map page tables, we might just as
well use a 1:1 TTBR0 mapping here, which would be completely disjoint
from swapper. And we'd only need to map memory that is being used for
page tables, so on those large systems we'd need to map only a small
slice. Maybe it's time to bring back the memblock alloc limit so we
can manage this more easily?
> My second patch adds the infrastructure to make this possible. But your changes
> for LPA2 make it significantly more effort; since that change we are now using
> the swapper pgtable when we populate the linear map into it - the kernel is
> already mapped and that isn't done in paging_init() anymore. So I'm not quite
> sure how we can easily make that work at the moment.
>
I think a mix of the fixmap approach with a 1:1 map could work here:
- use TTBR0 to create a temp 1:1 map of DRAM
- map page tables lazily as they are allocated but using a coarse mapping
- avoid all TLB maintenance except at the end when tearing down the 1:1 mapping.
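
[Editor's sketch, not part of the original message.] The map_mem()
timings quoted earlier in this message are easier to compare as a percent
change and as a speedup factor. The two helpers below are hypothetical
and only restate the arithmetic; the percentages printed in the table
were presumably derived from unrounded measurements, so recomputing them
from the rounded millisecond values can differ by a point.

```c
#include <assert.h>

/* Relative change of new_ms against base_ms, in percent (negative = faster). */
static double percent_change(double base_ms, double new_ms)
{
	assert(base_ms > 0.0);
	return (new_ms - base_ms) / base_ms * 100.0;
}

/* How many times faster the new timing is compared with the base timing. */
static double speedup(double base_ms, double new_ms)
{
	assert(new_ms > 0.0);
	return base_ms / new_ms;
}
```

For the 512G bare-metal column, going from 17443 ms to 2929 ms is a
change of about -83% by this arithmetic, or a little under a 6x speedup.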
* Re: [RESEND][PATCH v2 4/4] soc: samsung: exynos-asv: Update Energy Model after adjusting voltage
From: Lukasz Luba @ 2024-03-27 13:39 UTC (permalink / raw)
To: Dietmar Eggemann
Cc: linux-arm-kernel, sboyd, nm, linux-samsung-soc, daniel.lezcano,
rafael, viresh.kumar, krzysztof.kozlowski, alim.akhtar,
m.szyprowski, mhiramat, linux-kernel, linux-pm
In-Reply-To: <e02ca745-52df-4210-b175-f4ef278d81d8@arm.com>
On 3/27/24 12:56, Dietmar Eggemann wrote:
> On 26/03/2024 21:12, Lukasz Luba wrote:
>> Hi Dietmar,
>>
>> On 3/26/24 11:20, Dietmar Eggemann wrote:
>>> On 22/03/2024 12:08, Lukasz Luba wrote:
>>>
>>> [...]
>>>
>>>> @@ -97,9 +98,17 @@ static int exynos_asv_update_opps(struct
>>>> exynos_asv *asv)
>>>> last_opp_table = opp_table;
>>>> ret = exynos_asv_update_cpu_opps(asv, cpu);
>>>> - if (ret < 0)
>>>> + if (!ret) {
>>>> + /*
>>>> + * When the voltage for OPPs successfully
>>>> + * changed, update the EM power values to
>>>> + * reflect the reality and not use stale data
>>>
>>> At this point, can we really say that the voltage has changed?
>>>
>>> exynos_asv_update_cpu_opps()
>>>
>>> ...
>>> ret = dev_pm_opp_adjust_voltage()
>>> if (!ret)
>>> em_dev_update_chip_binning()
>>> ...
>>>
>>> dev_pm_opp_adjust_voltage() also returns 0 when the voltage value stays
>>> the same?
>>>
>>> [...]
>>
>> The comment for the dev_pm_opp_adjust_voltage() says that it
>> returns 0 if no modification was done or modification was
>> successful. So I cannot distinguish in that driver code, but
>> also there is no additional need to do it IMO (even framework
>> doesn't do this).
>
> Precisely. That's why the added comment in exynos_asv_update_opps():
> "When the voltage for OPPs successfully __changed__, ..." is somehow
> misleading IMHO.
>
I get your point. Let me rephrase that comment so that it properly
reflects what this OPP function returns.
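
[Editor's sketch, not part of the original message.] The point under
discussion is that a return value of 0 covers both "voltage changed" and
"nothing to change", so the caller cannot tell them apart. The userspace
mock below illustrates this. All mock_* names, the voltage limits, and
the -EINVAL range check are assumptions made for illustration; this is
not the real dev_pm_opp_adjust_voltage() or Energy Model API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for an OPP entry. */
struct mock_opp {
	unsigned long u_volt;
};

/* Counts how often the (mock) Energy Model refresh was triggered. */
static int mock_em_updates;

/* Returns 0 both when the voltage was modified and when it already matched. */
static int mock_adjust_voltage(struct mock_opp *opp, unsigned long u_volt,
			       unsigned long u_volt_min,
			       unsigned long u_volt_max)
{
	if (u_volt < u_volt_min || u_volt > u_volt_max)
		return -EINVAL;
	if (opp->u_volt == u_volt)
		return 0;	/* no modification was needed */
	opp->u_volt = u_volt;
	return 0;		/* modification was successful */
}

/* The caller only sees "no error", so it refreshes the EM in both cases. */
static int mock_update_opps(struct mock_opp *opp, unsigned long u_volt)
{
	int ret = mock_adjust_voltage(opp, u_volt, 800000, 1300000);

	if (!ret)
		mock_em_updates++;
	return ret;
}
```

Under these assumptions the EM refresh also runs when nothing changed,
so a comment along the lines of "when the voltage adjustment did not
fail" would describe the condition more accurately than "when the
voltage changed".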