Linux-ARM-Kernel Archive on lore.kernel.org
* Re: [PATCH 1/4] exec: inherit HWCAPs from the parent process
From: Andrei Vagin @ 2026-04-15 19:27 UTC
  To: Mark Rutland
  Cc: Andrei Vagin, Will Deacon, Kees Cook, Andrew Morton,
	Marek Szyprowski, Cyrill Gorcunov, Mike Rapoport,
	Alexander Mikhalitsyn, linux-kernel, linux-fsdevel, linux-mm,
	criu, Catalin Marinas, linux-arm-kernel, Chen Ridong,
	Christian Brauner, David Hildenbrand, Eric Biederman,
	Lorenzo Stoakes, Michal Koutny, Alexander Mikhalitsyn, Linux API
In-Reply-To: <adUhbk0sKT0ucWhJ@J2N7QTR9R3>

Hi Mark,

Thanks for the feedback, and sorry for the delay; I was on vacation.
Please see my comments inline.

On Tue, Apr 7, 2026 at 8:29 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Fri, Mar 27, 2026 at 05:21:26PM -0700, Andrei Vagin wrote:
> > Hi Mark,
> >
> > I understand all these points and they are valid. However, as I
> > mentioned, we are not trying to introduce a mechanism that will strictly
> > enforce feature sets for every container. While we would like to have
> > that functionality, as you and Will mentioned, it would require
> > substantially more complexity to address, and maintainers would be
> > unlikely to pick up that complexity.
>
> The crux of my complaint here is that unless you do that (to some
> degree), this is not going to work reliably, even with the constraints
> you outline.
>
> Further, I disagree with your proposed solution of pushing more
> constraints onto userspace (to also consider HWCAPs as overriding other
> mechanisms, etc).
>
> I think that as-is, the approach is flawed.

I would really appreciate it if we could move this conversation toward
how we can make it work.

>
> > Even masking ID registers on a per-container basis would introduce
> > extra complexity that could make architecture maintainers unhappy.
> > There were a few attempts to introduce container CPUID masking on
> > x86_64 in the past.
>
> > In CRIU, we are not aiming to handle every possible workload. Our goal
> > is to target workloads where developers are ready to cooperate and
> > willing to make adjustments to be C/R compatible. The goal here is to
> > provide developers with clear instructions on what they can do to ensure
> > their applications are C/R compatible. When I say "workloads", I mean
> > this in a broad sense. A container might pack a set of tools with
> > different runtimes (Go, Java, libc-based). All these runtimes should
> > detect only allowed features.
>
> I do not think that arbitrary applications (and libraries!) should have
> to pick up additional constraints that are unnecessary without CRIU,
> especially where that goes against deliberate design decisions (e.g.
> features in arm64's HINT instruction space, which are designed to be
> usable in fast paths WITHOUT needing explicit checks of things like
> HWCAPs). Note that those typically *do* have kernel controls.
>
> I think there's a much larger problem space than you anticipate, and
> adding an incomplete solution now is just going to introduce a
> maintenance burden.

I am not adding arbitrary constraints for standard non-CRIU use cases.
Previously, I suggested that standard libraries would need to call prctl
to determine if hwcaps should be used for feature detection.  However,
we can avoid this extra syscall by adding the new HWCAP2_CR bit. Then
libraries will simply check this bit in auxv[AT_HWCAP2], meaning the
overhead for "non-criu" cases is just a single bit check.
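
As a rough sketch of that check: HWCAP2_CR is only proposed in this
thread, so the bit position below is a placeholder, and the helper names
use_hwcaps_only() and runtime_use_hwcaps_only() are hypothetical. Only
getauxval() and AT_HWCAP2 are existing interfaces. The sketch assumes an
LP64 target such as arm64.

```c
#include <stdbool.h>
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP2 */

/* Placeholder: HWCAP2_CR is only proposed; no bit has been allocated. */
#define HWCAP2_CR_PLACEHOLDER	(1UL << 63)

/* Hypothetical helper: true if hwcaps must be the only feature source. */
bool use_hwcaps_only(unsigned long hwcap2)
{
	return (hwcap2 & HWCAP2_CR_PLACEHOLDER) != 0;
}

/* What a runtime's start-up code might do: one auxv read, one bit test. */
bool runtime_use_hwcaps_only(void)
{
	return use_hwcaps_only(getauxval(AT_HWCAP2));
}
```

When the bit is clear, a runtime would keep its existing detection paths
unchanged, so the non-CRIU case only pays for the bit test.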

As for HINT instructions, there are two classes of instructions.

The first class doesn't change process state, so these instructions do not
require any special handling in terms of checkpoint/restore. If a process
is checkpointed on a newer CPU and restored on an older CPU, the older
hardware will simply skip over those instructions.  The architectural
state (registers, memory) should remain consistent.

The second class, such as PAC, consists of instructions that actually
change process state. These instructions require kernel/userspace
coordination. For example, usage of PAC keys can be controlled from
userspace via prctl. My point is that when support for new instructions
is implemented in the kernel, we will need to consider that userspace
should be able to control them.

>
> > Returning to the subject of this patchset: this series extends the role
> > of hwcaps. With this change, we would establish that hwcaps is the
> > "source of truth" for which features an application can safely use. Any
> > other features available on the current CPU would not be guaranteed to
> > remain available after migration to another machine.
> >
> > After this discussion, I found that the current version missed one major
> > thing: there should be a signal indicating that hwcaps must be used for
> > feature detection. Since we will need to integrate this interface into
> > libc, Go, and other runtimes, they definitely should not rely just on
> > hwcaps by default, especially in the early stages. This can be solved
> > via the prctl command.  Libraries like libc would call
> > prctl(PR_USER_HWCAP_ENABLED). If this returns true, the runtime knows
> > that only the features explicitly listed in hwcaps should be used.
>
> I do not think we should be pushing that shape of constraint onto
> userspace.

See my previous comment.

>
> > You are right, the controlled feature set will be limited to features
> > the kernel knows about. And yes, we would need to report CPU features in
> > hwcaps even if the kernel isn't directly involved in handling them.
>
> To be clear, that is not what I am arguing.
>
> As I mentioned before, the way this works on arm64 is that the kernel
> only exposes what it is aware of, even in the ID regs accessible to
> userspace. We usually *can* hide features, and do that for cases of
> mismatched big.LITTLE, virtual machines, etc.

I understand that. My point was that the kernel would need to report
features in hwcaps even if they don't require specific kernel-side
handling.

>
> > Honestly, I am not certain if this is the "right" interface for that,
> > and I would be happy to consider other ideas. I understand that these
> > hwcaps will not work right out of the box, but we need a way to solve
> > this problem. Having a centralized API for CPU/kernel feature detection
> > seems like the right direction.
>
> I think that for better or worse the approach you are taking here simply
> does not solve enough of the problem to actually be worthwhile.

This approach mimics solutions that some CRIU users are already
implementing in userspace, but those only work when the user controls/
recompiles all their libraries. I am open to other ideas, but we need a
path forward.

>
> > As for signal frame size and extended states like SVE/SME, we are aware
> > of this problem.  However, it is partly mitigated by the fact that if
> > an application does not use some features, those states are not placed
> > in the signal frame.
>
> That is not true. The kernel can and will create signal frames for
> architectural state that a task might never have touched.
>
> Generally arm64 creates signal frames for features when the feature
> *exists*, regardless of whether the task has actively manipulated the
> relevant state. For example, on systems with SVE a trivial SVE signal
> frame gets created even if a task only uses the FPSIMD registers, and on
> systems with SME a TPIDR2 signal frame gets created even if the task has
> never read/written TPIDR2.
>
> When restoring, an unrecognised signal frame is treated as invalid, and
> we can require that certain signal frames are present.

You are right; that was my mistake. My only explanation for why we don't
see this failure often is that C/R is rarely triggered while a process
is actually inside a signal handler. This is definitely a problem that
still needs to be solved.

>
> > In the future, when we construct/reload a signal frame, we could look
> > at a process feature set for a process and generate a frame according
> > to those features...
>
> When you say 'we' here, are you talking about within the kernel, or
> within the userspace C/R mechanism?

... within the kernel.

Thanks,
Andrei



* Re: [PATCH net-next] net: stmmac: enable RPS and RBU interrupts
From: Russell King (Oracle) @ 2026-04-15 19:37 UTC
  To: Sam Edwards
  Cc: Jakub Kicinski, Andrew Lunn, Alexandre Torgue, Andrew Lunn,
	David S. Miller, Eric Dumazet,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	linux-stm32, Linux Network Development Mailing List, Paolo Abeni
In-Reply-To: <CAH5Ym4jKdzDeYwCfkMLmUz0FsiD2vFwfuAvqFE=uvMtPmakeMQ@mail.gmail.com>

On Wed, Apr 15, 2026 at 10:38:29AM -0700, Sam Edwards wrote:
> On Wed, Apr 15, 2026 at 5:44 AM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Tue, Apr 14, 2026 at 07:12:34PM -0700, Sam Edwards wrote:
> > > On Tue, Apr 14, 2026 at 6:19 PM Russell King (Oracle)
> > > <linux@armlinux.org.uk> wrote:
> > > > Okay, just a quick note to say that nvidia's 5.10.216-tegra kernel
> > > > survives iperf3 -c -R to the imx6.
> > >
> > > Hi Russell,
> > >
> > > Aw, you beat me to it! I was about to report that 5.10.104-tegra is
> > > unaffected. And my iperf3 server is a multi-GbE amd64 machine.
> > >
> > > > Dumping the registers and comparing, and then forcing the RQS and TQS
> > > > values to 0x23 (+1 = 36, *256 = 9216 bytes) and 0x8f (+1 = 144,
> > > > *256 = 36864 bytes) respectively seems to solve the problem. Under
> > > > net-next, these both end up being 0xff (+1 = 256, *256 = 65536 bytes.)
> > > > Suspiciously, 36 * 4 = 144, and I also see that this kernel programs
> > > > all four of the MTL receive operation mode registers, but only the
> > > > first MTL transmit operation mode register. However, DMA channels 1-3
> > > > aren't initialised.
> > >
> > > Wow, great! I wonder if the problem is that the MTL FIFOs are smaller
> > > than that, so when the DMA suffers a momentary hiccup, the FIFOs are
> > > allowed to overflow, putting the hardware in a bad state.
> > >
> > > Though I suspect this is only half of the problem: do you still see
> > > RBUs? Everything you've shared so far suggests the DMA failures are
> > > _not_ because the rx ring is drying up.
> >
> > Yes. Note that RBUs will happen not because of DMA failures, but if
> > the kernel fails to keep up with the packet rate. RBU means "we read
> > the next descriptor, and it wasn't owned by hardware".
> 
> Are you speaking from observation, documentation, or understanding?

Observation.

> I'd define RBU the same way, but you reported:

It's not a question about how I define RBU - this is defined by Synopsys
and I'm using it *exactly* that way as stated in the documentation.

"This bit indicates that the host owns the Next Descriptor in the
Receive List and the DMA cannot acquire it. The Receive Process is
suspended. ... This bit is set only when the previous Receive
Descriptor is owned by the DMA."

In other words, DMA has processed the previous receive descriptor which
_was_ owned by the hardware, written back to clear the OWN bit, and
then fetches the next descriptor and finds that the OWN bit is also
clear.

> 
> ```
> [   55.766199] dwc-eth-dwmac 2490000.ethernet eth0: q0: receive buffer
> unavailable: cur_rx=309 dirty_rx=309 last_cur_rx=245
> last_cur_rx_post=309 last_dirty_rx=245 count=64 budget=64
> 
> cur_rx == dirty_rx _should_ mean that we fully refilled the ring. [...]
> [...]
> Every ring entry contains the same RDES3 value, so it really is
> completely full at the point RBU fires (bit 31 clear means software
> owns the descriptor, and it's basically saying first/last segment,
> RDES1 valid, buffer 1 length of 1518.
> ```

Right, because the _last_ time stmmac_rx() was called, the ring was
completely refilled (as it always is for me).

There are two scenarios in which what I'm seeing may happen.

1) The ring was fully refilled, but before stmmac_rx() is next
   executed, all descriptors end up being consumed due to the rate
   at which packets are being received. Thus, the hardware encounters
   a descriptor that has OWN=0.

2) The kernel has been slow to respond to packets that have been
   received, and because of the NAPI throttling stmmac_rx() to only
   process 64 descriptors at a time, we are falling way behind the
   hardware position. Eventually, the hardware catches up with
   the point at which stmmac_rx_refill() is repopulating the receive
   descriptors, and encounters a descriptor that has OWN=0.

For (2), let's take the example which you've quoted from me.

stmmac_rx() gets called, and cur_rx = dirty_rx = 245. We're limited to
a count of 64 meaning we're not going to process more than 64 entries
no matter how far ahead the hardware is. Let's say the hardware is at
e.g. descriptor 400 at this point.

stmmac_rx() runs, processing descriptors. It works its way up to entry
309, at which point count == limit, so it stops, and we now have
cur_rx = 309, dirty_rx = 245.

The next thing stmmac_rx() does is call stmmac_rx_refill(). This looks
at the difference, and calculates how many entries need to be
repopulated. stmmac_rx_dirty() returns 64, as that's the number of
entries between dirty_rx and the updated cur_rx. It populates those
entries.

At this point, dirty_rx = 309. All well and good. However, during that
process, packet reception hasn't stopped, and let's say it's now at
descriptor 500.

In that scenario, we're consuming 100 descriptors, but only repopulating
64 descriptors. As this continues, the hardware is slowly catching up
with the point in the ring at which stmmac_rx_refill() is repopulating
the descriptors.

When it does catch up, it will encounter a descriptor with OWN=0, which
will fire the RBU interrupt.
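
To make the catch-up concrete, here is a small user-space model of that
scenario. It is only a sketch, not driver code: the ring size of 512,
the struct layout and all function names are assumptions made for
illustration, and it assumes that every poll refills everything it
processed. Only the budget of 64 and the rate of 100 descriptors per
poll come from the example above.

```c
#include <stdbool.h>

#define RING_SIZE	512u	/* assumed ring size, not from the thread */
#define NAPI_BUDGET	64u	/* per-poll limit from the example */

struct rx_ring {
	bool own[RING_SIZE];	/* true: hardware owns the descriptor */
	unsigned int hw_pos;	/* descriptor the hardware uses next */
	unsigned int cur_rx;	/* next descriptor software processes */
	unsigned int dirty_rx;	/* next descriptor software refills */
};

void ring_init(struct rx_ring *r)
{
	for (unsigned int i = 0; i < RING_SIZE; i++)
		r->own[i] = true;
	r->hw_pos = r->cur_rx = r->dirty_rx = 0;
}

/* Hardware receives n packets; returns false if it meets OWN=0 (RBU). */
bool hw_receive(struct rx_ring *r, unsigned int n)
{
	while (n--) {
		if (!r->own[r->hw_pos])
			return false;
		r->own[r->hw_pos] = false;
		r->hw_pos = (r->hw_pos + 1) % RING_SIZE;
	}
	return true;
}

/* One poll: process at most NAPI_BUDGET descriptors, then refill them. */
void sw_poll(struct rx_ring *r)
{
	unsigned int count = 0;

	while (count < NAPI_BUDGET && !r->own[r->cur_rx]) {
		r->cur_rx = (r->cur_rx + 1) % RING_SIZE;
		count++;
	}
	while (r->dirty_rx != r->cur_rx) {
		r->own[r->dirty_rx] = true;
		r->dirty_rx = (r->dirty_rx + 1) % RING_SIZE;
	}
}

/* Number of descriptors currently owned by the hardware. */
unsigned int hw_owned_count(const struct rx_ring *r)
{
	unsigned int owned = 0;

	for (unsigned int i = 0; i < RING_SIZE; i++)
		owned += r->own[i];
	return owned;
}

/* Returns the round in which RBU fires, or 0 if it never does. */
unsigned int rounds_until_rbu(struct rx_ring *r, unsigned int pkts_per_round,
			      unsigned int max_rounds)
{
	ring_init(r);
	for (unsigned int round = 1; round <= max_rounds; round++) {
		if (!hw_receive(r, pkts_per_round))
			return round;
		sw_poll(r);
	}
	return 0;
}
```

With 100 packets per round this model hits RBU in round 13, with
cur_rx == dirty_rx and no descriptor left owned by the hardware. With 64
packets per round it never fires.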

At this point, my debug dumps the state of the ring. If the RBU was
raised when stmmac_rx()/stmmac_rx_refill() was not running, _and_ we
are always successfully refilling all the entries that stmmac_rx()
processed, then cur_rx will equal dirty_rx, even when the hardware
could be way ahead of cur_rx. Neither of these indexes has any
relevance to where the hardware actually is in the ring.

The dump of the ring state *clearly* shows that all descriptors have
a RDES3 value which indicates that every single descriptor is not
hardware owned at this point (since RBU has been raised, the receive
process is suspended, so hardware is no longer changing the ring.)

> It would seem* that the kernel isn't really failing to keep up with
> the packet rate. If RBU is firing with a ring that's not even close to
> empty, that tells me there's another way for it to fire. So I suspect
> the hardware designers implemented it to mean:
> "We couldn't read the next descriptor, _or_ it wasn't owned by hardware."
> 
> (* However, if bit 31 is clear everywhere, wouldn't that mean the ring
> is actually completely depleted, not full? If count==budget, wouldn't
> that mean the whole ring hasn't been visited, so we only refilled 64
> entries and not necessarily the entire ring? Maybe the kernel isn't
> keeping up after all.)

Ah, I think that's where our terminology differs.

You seem to define full as "populated with empty buffers". I define
full to mean "the hardware has filled every buffer with a packet that
it has received and handed it over to software to process." Note even
the terminology there - filling buffers with data. That ultimately
ends up filling the ring, and when completely filled, it is full.

I think of buffers like buckets. If a buffer contains no data, it
is empty. If a buffer contains data, it has been filled or is full.
Apply that to a list of buffers and you get the same thing. Much
ethernet driver documentation uses this same terminology, so I
thought it would be widely understood.

> > That has:
> >
> >         const nveu32_t rx_fifo_sz[2U][OSI_EQOS_MAX_NUM_QUEUES] = {
> >                 { FIFO_SZ(9U), FIFO_SZ(9U), FIFO_SZ(9U), FIFO_SZ(9U),
> >                   FIFO_SZ(1U), FIFO_SZ(1U), FIFO_SZ(1U), FIFO_SZ(1U) },
> >                 { FIFO_SZ(36U), FIFO_SZ(2U), FIFO_SZ(2U), FIFO_SZ(2U),
> >                   FIFO_SZ(2U), FIFO_SZ(2U), FIFO_SZ(2U), FIFO_SZ(16U) },
> >         };
> >         const nveu32_t tx_fifo_sz[2U][OSI_EQOS_MAX_NUM_QUEUES] = {
> >                 { FIFO_SZ(9U), FIFO_SZ(9U), FIFO_SZ(9U), FIFO_SZ(9U),
> >                   FIFO_SZ(1U), FIFO_SZ(1U), FIFO_SZ(1U), FIFO_SZ(1U) },
> >                 { FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U),
> >                   FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U) },
> >         };
> >
> > where each of those values is the RQS/TQS value to use in KiB:
> >
> > #define FIFO_SZ(x)              ((((x) * 1024U) / 256U) - 1U)
> >
> > This doesn't correspond with the values I'm seeing programmed into
> > the hardware under the 5.10.216-tegra kernel. I'm seeing TQS = 143
> > (36KiB), and RQS = 35 (9KiB). Yes, these values exist in the tables
> > above from a quick look, but they're not in the right place!
> 
> True, but:
> a) I doubt 5.10.216-tegra includes exactly the same version of the
> driver found in this random GitHub mirror. (My intent was only to
> point out that they don't use 5.10's stmmac; I should have been more
> clear that I wasn't trying to link the same version, sorry!)
> b) This is vendor code; I don't know how good their testing/review
> process is. It might not run the way it looks. The intent seems to be
> for RQS > TQS (which makes intuitive sense), but as you're seeing the
> registers programmed the other way 'round, they might have gotten them
> subtly mixed up.
> 
> > Now, as for FIFO sizes, if we sum up all the entries, then we
> > get:
> >
> > SUM(rx_fifo_size[0][]) = 60KiB
> > SUM(rx_fifo_size[1][]) = 64KiB
> > SUM(tx_fifo_size[0][]) = 60KiB
> > SUM(tx_fifo_size[1][]) = 64KiB
> 
> I follow the math with 64KiB, but surely the 60KiB should be
> 9+9+9+9+1+1+1+1=40KiB? It seems to me that the "legacy EQOS" simply
> ships with smaller FIFOs. Since dwmac is licensed as a soft IP core,
> perhaps the FIFO size is an elaboration parameter? That would mean
> this isn't an issue with dwmac 5.0 broadly, but with Nvidia's specific
> instantiation of it.

Right, 40KiB. Sorry, I'm getting interrupted almost constantly while
trying to do anything.
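
For reference, the arithmetic can be checked mechanically. The sketch
below reuses the FIFO_SZ() encoding and the table rows quoted above; the
helper names fifo_field_to_bytes() and fifo_row_kib() and the array
names are made up for illustration and are not from the vendor driver.

```c
#include <stddef.h>

/* Encoding quoted above: size in KiB -> RQS/TQS field value. */
#define FIFO_SZ(x)	((((x) * 1024U) / 256U) - 1U)

/* Inverse: RQS/TQS field value -> FIFO size in bytes. */
unsigned int fifo_field_to_bytes(unsigned int field)
{
	return (field + 1U) * 256U;
}

/* Sum one table row and report the total in KiB. */
unsigned int fifo_row_kib(const unsigned int *row, size_t n)
{
	unsigned int bytes = 0;

	for (size_t i = 0; i < n; i++)
		bytes += fifo_field_to_bytes(row[i]);
	return bytes / 1024U;
}

/* Rows copied from the quoted tables (tx row 0 is identical to rx row 0). */
const unsigned int rx_row0[8] = {
	FIFO_SZ(9U), FIFO_SZ(9U), FIFO_SZ(9U), FIFO_SZ(9U),
	FIFO_SZ(1U), FIFO_SZ(1U), FIFO_SZ(1U), FIFO_SZ(1U),
};
const unsigned int rx_row1[8] = {
	FIFO_SZ(36U), FIFO_SZ(2U), FIFO_SZ(2U), FIFO_SZ(2U),
	FIFO_SZ(2U), FIFO_SZ(2U), FIFO_SZ(2U), FIFO_SZ(16U),
};
const unsigned int tx_row1[8] = {
	FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U),
	FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U),
};
```

Row 0 sums to 40KiB and both row 1 tables sum to 64KiB. The same helper
also gives 9216 bytes for a field value of 0x23, 36864 for 0x8f and
65536 for 0xff.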

However, I've tested with 0x7f in both fields, and it still falls flat
on its face. I've also tried other values, but because I had to unplug
the laptop from the nvidia board to use the laptop portably due to the
medical emergency situation, that caused screen to quit, so I've lost
all that. Chaos reigns supreme here :/

So, I'm not sure we understand what's going on - I don't think it's that
the FIFOs are smaller than specified. I suspect that the 9KiB vs 36KiB
results in some kind of throttling that prevents the condition which
hangs the hardware.

I'm not getting as much time as I'd like to really test out scenarios
due to everything that is going on, and honestly I feel like just
writing this week off now and giving up.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!



* Re: [PATCH net v5] net: stmmac: Prevent NULL deref when RX memory exhausted
From: Russell King (Oracle) @ 2026-04-15 19:58 UTC
  To: Sam Edwards
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Maxime Coquelin, Alexandre Torgue, Maxime Chevallier,
	Ovidiu Panait, Vladimir Oltean, Baruch Siach, Serge Semin,
	Giuseppe Cavallaro, netdev, linux-stm32, linux-arm-kernel,
	linux-kernel, stable
In-Reply-To: <CAH5Ym4gy6g8d88-vGhe1zxoV7jNH_fXHsDSdDWC4x00H7s-3=w@mail.gmail.com>

On Wed, Apr 15, 2026 at 10:53:15AM -0700, Sam Edwards wrote:
> On Wed, Apr 15, 2026 at 9:28 AM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Wed, Apr 15, 2026 at 01:56:32PM +0100, Russell King (Oracle) wrote:
> > > Locally, while debugging my issues, I used this to prevent cur_rx
> > > catching up with dirty_rx:
> > >
> > >                 status = stmmac_rx_status(priv, &priv->xstats, p);
> > >                 /* check if managed by the DMA otherwise go ahead */
> > >                 if (unlikely(status & dma_own))
> > >                         break;
> > >
> > >                 next_entry = STMMAC_NEXT_ENTRY(rx_q->cur_rx,
> > >                                                priv->dma_conf.dma_rx_size);
> > >                 if (unlikely(next_entry == rx_q->dirty_rx))
> > >                         break;
> > >
> > >                 rx_q->cur_rx = next_entry;
> > >
> > > If we care about the cost of reloading rx_q->dirty_rx on every
> > > iteration, then I'd suggest that the cost we already incur reading and
> > > writing rx_q->cur_rx is something that should be addressed, and
> > > eliminating that would counter the cost of reading rx_q->dirty_rx. I
> > > suspect, however, that the cost is minimal, as cur_rx and dirty_rx are
> > > likely in the same cache line.
> 
> No, no, I like your approach better. :) It also removes the need for
> the `limit` clamp at the top of the function, so later code can assume
> limit==budget.
> 
> > > It looks like any fix to stmmac_rx() will also need a corresponding
> > > fix for stmmac_rx_zc().
> 
> I agree that stmmac_rx_zc() is likely also broken (in a similar way,
> but not similar enough to permit a "corresponding" fix), but I don't
> agree that there's a dependency relationship here. This patch is
> addressing #221010, which affects the generic/non-ZC codepath; I'm
> afraid the ZC codepath warrants its own investigation.

The code structure is identical. The only difference is what happens
to the packets.

Both paths take the NAPI limit. Both paths process up to that limit of
descriptors. The state saving / restoring is similar. The read_again
label is the same, the condition after is the same.

The ZC path differs at this point in that it will attempt to refill
every 16 descriptors that have been processed.

Both paths then read the descriptor and check the ownership.
Both paths then increment cur_rx to point to the next entry around
the ring.
Both paths then get the following descriptor pointer and prefetch
it.
Both paths then get the extended status if we're using extended
descriptors.
Both paths then handle frame discard.
Both paths then jump back to read_again if this isn't the last
segment and we have an error.
Both paths then check for error.
... and so it goes on.

The ZC path to me looks like a copy-paste-and-tweak approach to
adding support. The difference seems to be centered only around
the handling of the data buffers in the descriptors. The overall
mechanism of processing the descriptors follows the same layout
in both functions.

> > I have some further information, but a new curveball has just been
> > chucked... and I've no idea what this will mean at this stage. Just
> > take it that I won't be responding for a while.
> 
> I think I follow your meaning. Good luck getting it straightened out!

It looks like further curveballs have been thrown as a result,
destroying all "plans" for the next days/week. I have absolutely
no idea how much time or when I'll be able to look at anything
at the moment, so don't assume that because I find an opportunity
to send an email, everything is back to normal.

I'll also note that over the last two days I've written several
emails on this, spent many hours on them, only to discard them
as other ideas/research and maybe even the passage of time means
they're no longer appropriate to send.

Jakub: sorry, I just *can't* review stuff on netdev with everything
that is going on, not when .... can't complete this.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!



* [PATCH] arm64: cpufeature: Fix GCIE field ordering in ftr_id_aa64pfr2
From: Mukesh Ojha @ 2026-04-15 20:00 UTC
  To: Catalin Marinas, Will Deacon, Marc Zyngier
  Cc: linux-arm-kernel, linux-kernel, Mukesh Ojha

The ftr_id_aa64pfr2[] array must be sorted in descending order of
shift value so that the overlap validation in init_cpu_features()
works correctly. The GCIE field (bits 15:12, shift=12) was placed
last in the array, after MTEFAR (bits 11:8, shift=8) and
MTESTOREONLY (bits 7:4, shift=4), causing a spurious warning at
boot:

[    0.000000] SYS_ID_AA64PFR2_EL1 has feature overlap at shift 12
[    0.000000] WARNING: arch/arm64/kernel/cpufeature.c:989 at init_cpu_features+0x144/0x3d0, CPU#0: swapper/0
..

[    0.000000] pc : init_cpu_features+0x144/0x3d0
[    0.000000] lr : init_cpu_features+0x144/0x3d0
[    0.000000] sp : ffffc08678f03dc0

...
[    0.000000] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffc08678f14000
[    0.000000] Call trace:
[    0.000000]  init_cpu_features+0x144/0x3d0 (P)
[    0.000000]  cpuinfo_store_boot_cpu+0x4c/0x5c
[    0.000000]  smp_prepare_boot_cpu+0x28/0x38
[    0.000000]  start_kernel+0x1d4/0x848
[    0.000000]  __primary_switched+0x88/0x90

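This is because the overlap check computes (shift + width) > prev_shift,
where prev_shift is the shift of the preceding array entry. For GCIE,
listed last after MTESTOREONLY (shift=4), that is (12 + 4) > 4, which
triggers since GCIE occupies bits above MTEFAR and MTESTOREONLY but was
listed after them.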
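
The check can be modelled outside the kernel. The following is a
simplified stand-alone sketch, not the code in cpufeature.c: struct
ftr_field, first_overlap() and the two arrays are illustrative names,
with shifts and widths taken from the field positions given above. Each
entry is compared against the one listed immediately before it.

```c
#include <stddef.h>

/* Illustrative stand-in for one feature field; not the kernel's type. */
struct ftr_field {
	const char *name;
	unsigned int shift;
	unsigned int width;
};

/*
 * Return the index of the first entry whose top bit position exceeds
 * the previous entry's shift, or -1 if the array is in descending,
 * non-overlapping order.
 */
int first_overlap(const struct ftr_field *f, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (f[i].shift + f[i].width > f[i - 1].shift)
			return (int)i;
	}
	return -1;
}

/* Ordering before the fix: GCIE listed last. */
const struct ftr_field pfr2_before[] = {
	{ "FPMR", 32, 4 }, { "MTEFAR", 8, 4 },
	{ "MTESTOREONLY", 4, 4 }, { "GCIE", 12, 4 },
};

/* Ordering after the fix: descending shift values. */
const struct ftr_field pfr2_after[] = {
	{ "FPMR", 32, 4 }, { "GCIE", 12, 4 },
	{ "MTEFAR", 8, 4 }, { "MTESTOREONLY", 4, 4 },
};
```

In this model the old ordering is flagged at index 3 (GCIE), and the new
ordering passes.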

Fix the ordering to match the register layout: FPMR(35:32), GCIE(15:12),
MTEFAR(11:8), MTESTOREONLY(7:4).

Fixes: 899ff451fcee ("KVM: arm64: Advertise ID_AA64PFR2_EL1.GCIE")
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
---
 arch/arm64/kernel/cpufeature.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 48f2d894101d..6d53bb15cf7b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -328,9 +328,9 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr2[] = {
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR2_EL1_FPMR_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR2_EL1_GCIE_SHIFT, 4, ID_AA64PFR2_EL1_GCIE_NI),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR2_EL1_MTEFAR_SHIFT, 4, ID_AA64PFR2_EL1_MTEFAR_NI),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR2_EL1_MTESTOREONLY_SHIFT, 4, ID_AA64PFR2_EL1_MTESTOREONLY_NI),
-	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR2_EL1_GCIE_SHIFT, 4, ID_AA64PFR2_EL1_GCIE_NI),
 	ARM64_FTR_END,
 };
 
-- 
2.53.0




* Re: [RFC PATCH] mmc: host: sdhci-iproc: implement the .hw_reset callback
From: Meagan Lloyd @ 2026-04-15 20:43 UTC
  To: Scott Branden
  Cc: Florian Fainelli, rjui, sbranden, linux-arm-kernel, tgopinath,
	adrian.hunter, linux-mmc, kernel-list
In-Reply-To: <CA+Jzhd+_CKn1X0YDsoh9-OFeCG9sc2Q0OFfphkRaXpEA647xyQ@mail.gmail.com>


On 4/15/2026 1:23 PM, Scott Branden wrote:
> Hi Meagan,
>
> On Wed, Apr 15, 2026 at 11:08 AM Meagan Lloyd <
> meaganlloyd@linux.microsoft.com> wrote:
>
>> On 4/13/2026 10:43 AM, Florian Fainelli wrote:
>>> On 4/13/26 10:38, Meagan Lloyd wrote:
>>>> On 3/27/2026 3:21 PM, Meagan Lloyd wrote:
>>>>> Implement the .hw_reset callback so that the eMMC can be reset as
>>>>> needed
>>>>> given cap-mmc-hw-reset is set in the devicetree and the
>>>>> functionality is
>>>>> enabled on the eMMC.
>>>>>
>>>>> Signed-off-by: Meagan Lloyd <meaganlloyd@linux.microsoft.com>
>>>>> ---
>>>>>
>>>>> SDHCI_POWER_CONTROL[4] (SD Host Controller Standard) has been
>>>>> repurposed
>>>>> on my Broadcom processor to be eMMC hardware reset
>>>>> (SDIO*_eMMCSDXC_CTRL[12], HRESET).
>>>>>
>>>>> Can you confirm this repurposed bit is consistent across the Broadcom
>>>>> iProc processors and thus the .hw_reset callback can be uniformly
>>>>> applied in this driver?
>>>> Hi Ray & Scott,
>>>>
>>>> I hope you're doing well. This bit looks to have been repurposed from
>>>> the SD Host Controller Standard's VDD2 Power Control to being used for
>>>> toggling the hardware reset signal to eMMCs. Can you verify that it
>>>> applies across the iProc processors so that I may finalize this patch?
>>> Which iProc processor are you using? If you are not sure this applies
>>> broadly, can you at least make it specific to the SoC you are using?
>> Yes, if it comes to that I can. I think it's overkill to roll a new
>> compat string/associated structures over this small change, hence
>> checking with Broadcom iProc maintainers on this thread.
>>
> Which iProc processor are you using?  You will have to check with
> Raspberry Pi as I think they use this driver as well.
> If that family also supports it then you probably don't need a
> compatibility string.

The processor I am using is the BCM58732. Can you help direct me to
someone who could comment from the Raspberry Pi side?




* Re: [RFC PATCH] mmc: host: sdhci-iproc: implement the .hw_reset callback
From: Florian Fainelli @ 2026-04-15 20:44 UTC
  To: Meagan Lloyd, Scott Branden
  Cc: rjui, sbranden, linux-arm-kernel, tgopinath, adrian.hunter,
	linux-mmc, kernel-list
In-Reply-To: <58294850-4ed4-4fb4-8f46-186063b76a2f@linux.microsoft.com>

On 4/15/26 13:43, Meagan Lloyd wrote:
> 
> On 4/15/2026 1:23 PM, Scott Branden wrote:
>> Hi Meagan,
>>
>> On Wed, Apr 15, 2026 at 11:08 AM Meagan Lloyd <
>> meaganlloyd@linux.microsoft.com> wrote:
>>
>>> On 4/13/2026 10:43 AM, Florian Fainelli wrote:
>>>> On 4/13/26 10:38, Meagan Lloyd wrote:
>>>>> On 3/27/2026 3:21 PM, Meagan Lloyd wrote:
>>>>>> Implement the .hw_reset callback so that the eMMC can be reset as
>>>>>> needed
>>>>>> given cap-mmc-hw-reset is set in the devicetree and the
>>>>>> functionality is
>>>>>> enabled on the eMMC.
>>>>>>
>>>>>> Signed-off-by: Meagan Lloyd <meaganlloyd@linux.microsoft.com>
>>>>>> ---
>>>>>>
>>>>>> SDHCI_POWER_CONTROL[4] (SD Host Controller Standard) has been
>>>>>> repurposed
>>>>>> on my Broadcom processor to be eMMC hardware reset
>>>>>> (SDIO*_eMMCSDXC_CTRL[12], HRESET).
>>>>>>
>>>>>> Can you confirm this repurposed bit is consistent across the Broadcom
>>>>>> iProc processors and thus the .hw_reset callback can be uniformly
>>>>>> applied in this driver?
>>>>> Hi Ray & Scott,
>>>>>
>>>>> I hope you're doing well. This bit looks to have been repurposed from
>>>>> the SD Host Controller Standard's VDD2 Power Control to being used for
>>>>> toggling the hardware reset signal to eMMCs. Can you verify that it
>>>>> applies across the iProc processors so that I may finalize this patch?
>>>> Which iProc processor are you using? If you are not sure this applies
>>>> broadly, can you at least make it specific to the SoC you are using?
>>> Yes, if it comes to that I can. I think it's overkill to roll a new
>>> compat string/associated structures over this small change, hence
>>> checking with Broadcom iProc maintainers on this thread.
>>>
>> Which iProc processor are you using?  You will have to check with
>> Raspberry Pi as I think they use this driver as well.
>> If that family also supports it then you probably don't need a
>> compatibility string.
> 
> The processor I am using is the BCM58732. Can you help direct me to
> someone who could comment from the Raspberry Pi side?
> 

I will take care of that.
-- 
Florian



* Re: [PATCH net-next] net: stmmac: enable RPS and RBU interrupts
From: Sam Edwards @ 2026-04-15 20:50 UTC
  To: Russell King (Oracle)
  Cc: Jakub Kicinski, Andrew Lunn, Alexandre Torgue, Andrew Lunn,
	David S. Miller, Eric Dumazet,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	linux-stm32, Linux Network Development Mailing List, Paolo Abeni
In-Reply-To: <ad_o4aDP0UBY_8i4@shell.armlinux.org.uk>

On Wed, Apr 15, 2026 at 12:37 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> It's not a question about how I define RBU - this is defined by Synopsys
> and I'm using it *exactly* that way as stated in the documentation.
>
> "This bit indicates that the host owns the Next Descriptor in the
> Receive List and the DMA cannot acquire it. The Receive Process is
> suspended. ... This bit is set only when the previous Receive
> Descriptor is owned by the DMA."
>
> In other words, DMA has processed the previous receive descriptor which
> _was_ owned by the hardware, written back to clear the OWN bit, and
> then fetches the next descriptor and finds that the OWN bit is also
> clear.

I'm only trying to leave open the possibility that the Synopsys
technical writer and the hardware implementation team weren't
communicating clearly. We already have a situation where RPS isn't
behaving as documented (even if that's likely just hardware
misconfiguration), so while I'm currently pretty sure RBU carries no
other (actual) meaning than "DMA caught up to OWN=0," I'm only about
75% confident.
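
To make the documented reading of RBU concrete, here is a minimal
user-space model of the ownership check, written as a sketch only. It is
not stmmac code: struct rx_desc, RING_SIZE and the ring_*() helpers are
hypothetical names, and placing OWN at bit 31 (1 = owned by the DMA) is
an assumption taken from the "bit 31" remark quoted further down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 512u                  /* assumption: ring length used for illustration */
#define DESC_OWN  (UINT32_C(1) << 31)   /* assumption: OWN bit, 1 = owned by the DMA */

/* Hypothetical descriptor: only the word carrying OWN is modelled. */
struct rx_desc {
	uint32_t status;
};

/* Software supplies a fresh buffer for every entry and hands it to the DMA. */
static void ring_hand_all_to_dma(struct rx_desc *ring)
{
	for (unsigned int i = 0; i < RING_SIZE; i++)
		ring[i].status |= DESC_OWN;
}

/* The DMA receives into every entry and clears OWN on write-back. */
static void ring_dma_complete_all(struct rx_desc *ring)
{
	for (unsigned int i = 0; i < RING_SIZE; i++)
		ring[i].status &= ~DESC_OWN;
}

/* Number of entries currently owned by the host (OWN=0). */
static unsigned int ring_count_host_owned(const struct rx_desc *ring)
{
	unsigned int n = 0;

	for (unsigned int i = 0; i < RING_SIZE; i++)
		if (!(ring[i].status & DESC_OWN))
			n++;
	return n;
}

/* Documented RBU condition: the next descriptor the DMA fetches has OWN=0. */
static bool ring_rbu_would_fire(const struct rx_desc *ring, unsigned int next)
{
	assert(next < RING_SIZE);
	return !(ring[next].status & DESC_OWN);
}
```

Under this model RBU can only fire once the DMA has consumed every
descriptor between its current position and the first host-owned one;
any other trigger would mean the hardware deviates from the text quoted
above.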

> > It would seem* that the kernel isn't really failing to keep up with
> > the packet rate. If RBU is firing with a ring that's not even close to
> > empty, that tells me there's another way for it to fire. So I suspect
> > the hardware designers implemented it to mean:
> > "We couldn't read the next descriptor, _or_ it wasn't owned by hardware."
> >
> > (* However, if bit 31 is clear everywhere, wouldn't that mean the ring
> > is actually completely depleted, not full? If count==budget, wouldn't
> > that mean the whole ring hasn't been visited, so we only refilled 64
> > entries and not necessarily the entire ring? Maybe the kernel isn't
> > keeping up after all.)
>
> Ah, I think that's where our terminology differs.
>
> You seem to define full as "populated with empty buffers". I define
> full to mean "the hardware has filled every buffer with a packet that
> it has received and handed it over to software to process." Note even
> the terminology there - filling buffers with data. That ultimately
> ends up filling the ring, and when completely filled, it is full.
>
> I think of buffers like buckets. If a buffer contains no data, it
> is empty. If a buffer contains data, it has been filled or is full.
> Apply that to a list of buffers and you get the same thing. Much
> Ethernet driver documentation uses this same terminology, so I
> thought it would be widely understood.

Ah okay, I was beginning to suspect the same. In my defense: though I
also think of buffers in the same way, this driver calls the process
of supplying empty buffers "refilling," which is also the terminology
we've both been using throughout this exchange, and when something is
"completely refilled" I generally call it "full." But I'm realizing
now that the bidirectional (submissions+completions) nature of this
ring means that "full" and "empty" aren't really well-defined
concepts. I'll try to read more carefully (and switch to saying
"completely dirty" and "completely clean") going forward.

So the kernel is able to supply clean buffers without issue, but it
somehow falls behind the incoming packet rate and the DMA is left with
a completely dirty ring. I agree that stmmac_rx() is therefore just
not running fast enough: either it's got really bad scheduler jitter
for the ~6.3ms minimum it takes for 512x full-sized Ethernet frames to
arrive from the PHY (your scenario 1), or -- more likely -- the NAPI
budgets gradually fall behind the hardware (your scenario 2).
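
For reference, the ~6.3ms figure can be reproduced with the arithmetic
below. This is a sketch under stated assumptions: a 1 Gbit/s link (the
speed is not stated in this thread), untagged 1518-byte frames, and the
usual 8 bytes of preamble/SFD plus a 12-byte inter-frame gap per frame.
ring_fill_time_ns() is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_MAX_FRAME_BYTES    1518u  /* untagged frame including FCS */
#define ETH_PREAMBLE_SFD_BYTES 8u     /* preamble + start-of-frame delimiter */
#define ETH_IFG_BYTES          12u    /* minimum inter-frame gap */

/*
 * Time in nanoseconds for nframes back-to-back maximum-size frames to
 * arrive on a link running at link_mbps (Mbit/s).
 */
static uint64_t ring_fill_time_ns(uint32_t nframes, uint32_t link_mbps)
{
	uint64_t wire_bytes = ETH_MAX_FRAME_BYTES + ETH_PREAMBLE_SFD_BYTES +
			      ETH_IFG_BYTES;
	uint64_t bits = (uint64_t)nframes * wire_bytes * 8u;

	assert(link_mbps != 0);
	/* bits / (Mbit/s) gives microseconds; multiply first to get ns. */
	return bits * 1000u / link_mbps;
}
```

With those assumptions, ring_fill_time_ns(512, 1000) evaluates to
6299648 ns, i.e. roughly 6.3ms for the whole 512-entry ring.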

> Right, 40KiB. Sorry, I'm getting interrupted almost constantly while
> trying to do anything.
>
> However, I've tested with 0x7f in both fields, and it still falls flat
> on its face. I've also tried other values, but because I had to unplug
> the laptop from the nvidia board to use the laptop portably due to the
> medical emergency situation, that caused screen to quit, so I've lost
> all that. Chaos reigns supreme here :/

I'm sorry to hear about that, please prioritize you/yours and don't
feel like you owe me speedy replies.

> So, I'm not sure we understand what's going on - I don't think it's that
> the FIFOs are smaller than specified. I suspect that the 9KiB vs 36KiB
> results in some kind of throttling that prevents the condition which
> hangs the hardware.

I'll try playing with the FIFO configuration on my end to learn:
a) If a suitably-configured FIFO size makes the RPS status arrive as documented
b) If I can safely fill the FIFO slowly (by manually stalling the
driver and adding frames one at a time) and have it drain on resume
c) Whether the TQS value can be adjusted independently of this
problem's prevalence
d) The maximum RQS value that allows the problem to happen

> I'm not getting as much time as I'd like to really test out scenarios
> due to everything that is going on, and honestly I feel like just
> writing this week off now and giving up.

I have the same hardware, observe the same issue, and find this
interesting enough to keep plugging away at it. I would have no hard
feelings if you left me alone with this problem for a bit. :)

Be well,
Sam


^ permalink raw reply

* Re: [PATCH rc v1 1/4] iommu/arm-smmu-v3: Add arm_smmu_adopt_strtab() for kdump
From: Nicolin Chen @ 2026-04-15 20:57 UTC (permalink / raw)
  To: jgg, will, robin.murphy
  Cc: jamien, joro, praan, baolu.lu, kevin.tian, smostafa,
	miko.lenczewski, linux-arm-kernel, iommu, linux-kernel, stable
In-Reply-To: <30c7c51c8771722813a9cf54dae7a1b5d0aeb65d.1775763475.git.nicolinc@nvidia.com>

On Thu, Apr 09, 2026 at 12:46:50PM -0700, Nicolin Chen wrote:
> +	if (fmt == STRTAB_BASE_CFG_FMT_2LVL) {
> +		/* Enforce 2-level feature flag to match the adopted table */
> +		smmu->features |= ARM_SMMU_FEAT_2_LVL_STRTAB;
> +		ret = arm_smmu_adopt_strtab_2lvl(smmu, cfg_reg, dma);
> +	} else if (fmt == STRTAB_BASE_CFG_FMT_LINEAR) {
> +		/* Force linear feature flag to match the adopted table */
> +		smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
> +		ret = arm_smmu_adopt_strtab_linear(smmu, cfg_reg, dma);

Made a small fix here. Including it in v2.

@@ -4662,11 +4662,18 @@ static int arm_smmu_adopt_strtab(struct arm_smmu_device *smmu)
        dev_info(smmu->dev, "kdump: adopting crashed kernel's stream table\n");

        if (fmt == STRTAB_BASE_CFG_FMT_2LVL) {
-               /* Enforce 2-level feature flag to match the adopted table */
-               smmu->features |= ARM_SMMU_FEAT_2_LVL_STRTAB;
+               /*
+                * Both kernels run on the same hardware, so it's impossible for
+                * kdump kernel to see the support for linear stream table only.
+                */
+               if (WARN_ON(!(smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)))
+                       return -EINVAL;
                ret = arm_smmu_adopt_strtab_2lvl(smmu, cfg_reg, dma);
        } else if (fmt == STRTAB_BASE_CFG_FMT_LINEAR) {
-               /* Force linear feature flag to match the adopted table */
+               /*
+                * In case that the old kernel for some reason used the linear
+                * format, enforce the same format to match the adopted table.
+                */
                smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
                ret = arm_smmu_adopt_strtab_linear(smmu, cfg_reg, dma);
        } else {

Nicolin


^ permalink raw reply

* [PATCH rc v2 0/5] iommu/arm-smmu-v3: Fix device crash on kdump kernel
From: Nicolin Chen @ 2026-04-15 21:17 UTC (permalink / raw)
  To: will, robin.murphy, jgg, kevin.tian
  Cc: joro, praan, baolu.lu, miko.lenczewski, smostafa,
	linux-arm-kernel, iommu, linux-kernel, stable, jamien

When transitioning to a kdump kernel, the primary kernel might have crashed
while endpoint devices were actively bus-mastering DMA. Currently, the SMMU
driver aggressively resets the hardware during probe by clearing CR0_SMMUEN
and setting the Global Bypass Attribute (GBPA) to ABORT.

In a kdump scenario, this aggressive reset is highly destructive:
a) If GBPA is set to ABORT, in-flight DMA will be aborted, generating fatal
   PCIe AER or SErrors that may panic the kdump kernel
b) If GBPA is set to BYPASS, in-flight DMA targeting some IOVAs will bypass
   the SMMU and corrupt the physical memory at those 1:1 mapped IOVAs.

To safely absorb in-flight DMA, the kdump kernel must leave SMMUEN=1 intact
and avoid modifying STRTAB_BASE. This allows HW to continue translating in-
flight DMA using the crashed kernel's page tables until the endpoint device
drivers probe and quiesce their respective hardware.

However, the ARM SMMUv3 architecture specification states that updating the
SMMU_STRTAB_BASE register while SMMUEN == 1 is UNPREDICTABLE or ignored.

This leaves a kdump kernel no choice but to adopt the stream table from the
crashed kernel.

In this series:
 - Introduce an ARM_SMMU_OPT_KDUMP
 - Skip SMMUEN and STRTAB_BASE resets in arm_smmu_device_reset()
 - Map the crashed kernel's stream tables into the kdump kernel [*]
 - Defer any default domain attachment to retain STEs until device drivers
   explicitly request it.

[*] This is implemented via memremap, which only works on a coherent SMMU.
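
As a rough illustration of what adopting the table involves, the sketch
below decodes a STRTAB_BASE_CFG value and derives how many top-level
entries have to be mapped, following the same arithmetic that patch 1
uses (1 << (log2size - split) for the 2-level format, 1 << log2size for
linear). It is a standalone user-space sketch, not driver code: the
CFG_* macros and strtab_top_level_entries() are hypothetical names, and
the bit positions (LOG2SIZE [5:0], SPLIT [10:6], FMT [17:16]) are
assumptions that should be checked against arm-smmu-v3.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed STRTAB_BASE_CFG field layout; verify against arm-smmu-v3.h. */
#define CFG_LOG2SIZE(r) ((uint32_t)(r) & 0x3fu)          /* bits [5:0]   */
#define CFG_SPLIT(r)    (((uint32_t)(r) >> 6) & 0x1fu)   /* bits [10:6]  */
#define CFG_FMT(r)      (((uint32_t)(r) >> 16) & 0x3u)   /* bits [17:16] */
#define CFG_FMT_LINEAR  0u
#define CFG_FMT_2LVL    1u

/*
 * Hypothetical helper: number of top-level entries described by cfg_reg
 * (L1 descriptors for the 2-level format, STEs for the linear format).
 * Returns 0 for an unknown format or an inconsistent log2size/split pair.
 */
static uint64_t strtab_top_level_entries(uint32_t cfg_reg)
{
	uint32_t log2size = CFG_LOG2SIZE(cfg_reg);
	uint32_t split = CFG_SPLIT(cfg_reg);

	switch (CFG_FMT(cfg_reg)) {
	case CFG_FMT_2LVL:
		if (log2size < split)
			return 0;
		return UINT64_C(1) << (log2size - split);
	case CFG_FMT_LINEAR:
		return UINT64_C(1) << log2size;
	default:
		return 0;
	}
}
```

For example, a 2-level configuration with log2size=16 and split=8 yields
256 L1 descriptors, each of which may point at an L2 table that needs
its own mapping.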

Note that the entire series requires Jason's work that was merged in v6.12:
85196f54743d ("iommu/arm-smmu-v3: Reorganize struct arm_smmu_strtab_cfg").
I have a backported version that is verified with a v6.8 kernel. I can send
it if we see a strong need after this version is accepted.

This is on GitHub:
https://github.com/nicolinc/iommufd/commits/smmuv3_kdump-v2

Changelog
v2
 * Add warning in non-coherent SMMU cases
 * Keep eventq/priq disabled instead of enabling and disabling them later
 * Check KDUMP option at the beginning of arm_smmu_device_reset()
 * Validate STRTAB format matches HW capability instead of forcing flags
v1:
 https://lore.kernel.org/all/cover.1775763475.git.nicolinc@nvidia.com/

Nicolin Chen (5):
  iommu/arm-smmu-v3: Add arm_smmu_adopt_strtab() for kdump
  iommu/arm-smmu-v3: Implement is_attach_deferred() for kdump
  iommu/arm-smmu-v3: Retain CR0_SMMUEN during kdump device reset
  iommu/arm-smmu-v3: Skip EVTQ/PRIQ setup in kdump kernel
  iommu/arm-smmu-v3: Detect ARM_SMMU_OPT_KDUMP in
    arm_smmu_device_hw_probe()

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |   1 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 225 ++++++++++++++++++--
 2 files changed, 207 insertions(+), 19 deletions(-)

-- 
2.43.0



^ permalink raw reply

* [PATCH rc v2 3/5] iommu/arm-smmu-v3: Retain CR0_SMMUEN during kdump device reset
From: Nicolin Chen @ 2026-04-15 21:17 UTC (permalink / raw)
  To: will, robin.murphy, jgg, kevin.tian
  Cc: joro, praan, baolu.lu, miko.lenczewski, smostafa,
	linux-arm-kernel, iommu, linux-kernel, stable, jamien
In-Reply-To: <cover.1776286352.git.nicolinc@nvidia.com>

When ARM_SMMU_OPT_KDUMP is set, skip the GBPA/disable/CR1/CR2/STRTAB_BASE
update sequence in arm_smmu_device_reset(). Those register writes are all
CONSTRAINED UNPREDICTABLE while CR0_SMMUEN==1, so leaving them untouched
lets in-flight DMA continue to be translated by the adopted stream table.

Initialize 'enables' to 0 so it can carry CR0_SMMUEN in kdump case. Then,
preserve that when enabling the command queue.

Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 29 +++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index d9d543eb8cecf..b2c34713bf9f2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4938,9 +4938,23 @@ static void arm_smmu_write_strtab(struct arm_smmu_device *smmu)
 static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
 {
 	int ret;
-	u32 reg, enables;
+	u32 reg, enables = 0;
 	struct arm_smmu_cmdq_ent cmd;
 
+	/*
+	 * In a kdump case, retain CR0_SMMUEN to avoid transiently aborting in-
+	 * flight DMA. According to spec, updating STRTAB_BASE, CR1, or CR2 when
+	 * CR0_SMMUEN=1 is CONSTRAINED UNPREDICTABLE. Thus, skip those register
+	 * updates and rely on the adopted stream table from the crashed kernel.
+	 */
+	if (smmu->options & ARM_SMMU_OPT_KDUMP) {
+		dev_info(smmu->dev,
+			 "kdump: retaining SMMUEN for in-flight DMA\n");
+		/* ARM_SMMU_OPT_KDUMP is only set when CR0_SMMUEN=1 */
+		enables = CR0_SMMUEN;
+		goto reset_queues;
+	}
+
 	/* Clear CR0 and sync (disables SMMU and queue processing) */
 	reg = readl_relaxed(smmu->base + ARM_SMMU_CR0);
 	if (reg & CR0_SMMUEN) {
@@ -4972,12 +4986,23 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
 	/* Stream table */
 	arm_smmu_write_strtab(smmu);
 
+reset_queues:
+	if (smmu->options & ARM_SMMU_OPT_KDUMP) {
+		/* Disable queues since arm_smmu_device_disable() was skipped */
+		ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
+					      ARM_SMMU_CR0ACK);
+		if (ret) {
+			dev_err(smmu->dev, "failed to disable queues\n");
+			return ret;
+		}
+	}
+
 	/* Command queue */
 	writeq_relaxed(smmu->cmdq.q.q_base, smmu->base + ARM_SMMU_CMDQ_BASE);
 	writel_relaxed(smmu->cmdq.q.llq.prod, smmu->base + ARM_SMMU_CMDQ_PROD);
 	writel_relaxed(smmu->cmdq.q.llq.cons, smmu->base + ARM_SMMU_CMDQ_CONS);
 
-	enables = CR0_CMDQEN;
+	enables |= CR0_CMDQEN;
 	ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
 				      ARM_SMMU_CR0ACK);
 	if (ret) {
-- 
2.43.0



^ permalink raw reply related

* [PATCH rc v2 4/5] iommu/arm-smmu-v3: Skip EVTQ/PRIQ setup in kdump kernel
From: Nicolin Chen @ 2026-04-15 21:17 UTC (permalink / raw)
  To: will, robin.murphy, jgg, kevin.tian
  Cc: joro, praan, baolu.lu, miko.lenczewski, smostafa,
	linux-arm-kernel, iommu, linux-kernel, stable, jamien
In-Reply-To: <cover.1776286352.git.nicolinc@nvidia.com>

In kdump cases, the crashed kernel's CDs and page tables can be corrupted,
which could trigger event spamming. Also, we cannot serve page requests.

Skip the EVTQ/PRIQ setup entirely rather than enabling then disabling them.

Also add some inline comments explaining that.

Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 43 +++++++++++++--------
 1 file changed, 27 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b2c34713bf9f2..12cd148a99dc6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -5023,21 +5023,35 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
 	cmd.opcode = CMDQ_OP_TLBI_NSNH_ALL;
 	arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
 
-	/* Event queue */
-	writeq_relaxed(smmu->evtq.q.q_base, smmu->base + ARM_SMMU_EVTQ_BASE);
-	writel_relaxed(smmu->evtq.q.llq.prod, smmu->page1 + ARM_SMMU_EVTQ_PROD);
-	writel_relaxed(smmu->evtq.q.llq.cons, smmu->page1 + ARM_SMMU_EVTQ_CONS);
-
-	enables |= CR0_EVTQEN;
-	ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
-				      ARM_SMMU_CR0ACK);
-	if (ret) {
-		dev_err(smmu->dev, "failed to enable event queue\n");
-		return ret;
+	/*
+	 * Event queue
+	 *
+	 * Do not enable in a kdump case, as the crashed kernel's CDs and page
+	 * tables might be corrupted, triggering event spamming.
+	 */
+	if (!is_kdump_kernel()) {
+		writeq_relaxed(smmu->evtq.q.q_base,
+			       smmu->base + ARM_SMMU_EVTQ_BASE);
+		writel_relaxed(smmu->evtq.q.llq.prod,
+			       smmu->page1 + ARM_SMMU_EVTQ_PROD);
+		writel_relaxed(smmu->evtq.q.llq.cons,
+			       smmu->page1 + ARM_SMMU_EVTQ_CONS);
+
+		enables |= CR0_EVTQEN;
+		ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
+					      ARM_SMMU_CR0ACK);
+		if (ret) {
+			dev_err(smmu->dev, "failed to enable event queue\n");
+			return ret;
+		}
 	}
 
-	/* PRI queue */
-	if (smmu->features & ARM_SMMU_FEAT_PRI) {
+	/*
+	 * PRI queue
+	 *
+	 * Do not enable in a kdump case, as we cannot serve page requests.
+	 */
+	if (!is_kdump_kernel() && (smmu->features & ARM_SMMU_FEAT_PRI)) {
 		writeq_relaxed(smmu->priq.q.q_base,
 			       smmu->base + ARM_SMMU_PRIQ_BASE);
 		writel_relaxed(smmu->priq.q.llq.prod,
@@ -5070,9 +5084,6 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
 		return ret;
 	}
 
-	if (is_kdump_kernel())
-		enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
-
 	/* Enable the SMMU interface */
 	enables |= CR0_SMMUEN;
 	ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
-- 
2.43.0



^ permalink raw reply related

* [PATCH rc v2 1/5] iommu/arm-smmu-v3: Add arm_smmu_adopt_strtab() for kdump
From: Nicolin Chen @ 2026-04-15 21:17 UTC (permalink / raw)
  To: will, robin.murphy, jgg, kevin.tian
  Cc: joro, praan, baolu.lu, miko.lenczewski, smostafa,
	linux-arm-kernel, iommu, linux-kernel, stable, jamien
In-Reply-To: <cover.1776286352.git.nicolinc@nvidia.com>

When transitioning to a kdump kernel, the primary kernel might have crashed
while endpoint devices were actively bus-mastering DMA. Currently, the SMMU
driver aggressively resets the hardware during probe by clearing CR0_SMMUEN
and setting the Global Bypass Attribute (GBPA) to ABORT.

In a kdump scenario, this aggressive reset is highly destructive:
a) If GBPA is set to ABORT, in-flight DMA will be aborted, generating fatal
   PCIe AER or SErrors that may panic the kdump kernel
b) If GBPA is set to BYPASS, in-flight DMA targeting some IOVAs will bypass
   the SMMU and corrupt the physical memory at those 1:1 mapped IOVAs.

To safely absorb in-flight DMA, the kdump kernel must leave SMMUEN=1 intact
and avoid modifying STRTAB_BASE. This allows HW to continue translating in-
flight DMA using the crashed kernel's page tables until the endpoint device
drivers probe and quiesce their respective hardware.

However, the ARM SMMUv3 architecture specification states that updating the
SMMU_STRTAB_BASE register while SMMUEN == 1 is UNPREDICTABLE or ignored.

This leaves a kdump kernel no choice but to adopt the stream table from the
crashed kernel.

Introduce ARM_SMMU_OPT_KDUMP and arm_smmu_adopt_strtab() that does memremap
on all the stream tables extracted from STRTAB_BASE and STRTAB_BASE_CFG.

The option will be set in arm_smmu_device_hw_probe().

Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |   1 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 106 +++++++++++++++++++-
 2 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index ef42df4753ec4..74950d98ba09f 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -861,6 +861,7 @@ struct arm_smmu_device {
 #define ARM_SMMU_OPT_MSIPOLL		(1 << 2)
 #define ARM_SMMU_OPT_CMDQ_FORCE_SYNC	(1 << 3)
 #define ARM_SMMU_OPT_TEGRA241_CMDQV	(1 << 4)
+#define ARM_SMMU_OPT_KDUMP		(1 << 5)
 	u32				options;
 
 	struct arm_smmu_cmdq		cmdq;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f6901c5437edc..9a45f17200a21 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4553,11 +4553,115 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
 	return 0;
 }
 
+static int arm_smmu_adopt_strtab_2lvl(struct arm_smmu_device *smmu, u32 cfg_reg,
+				      dma_addr_t dma)
+{
+	u32 log2size = FIELD_GET(STRTAB_BASE_CFG_LOG2SIZE, cfg_reg);
+	u32 split = FIELD_GET(STRTAB_BASE_CFG_SPLIT, cfg_reg);
+	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+	u32 num_l1_ents;
+	int i;
+
+	if (log2size < split) {
+		dev_err(smmu->dev, "kdump: invalid log2size %u < split %u\n",
+			log2size, split);
+		return -EINVAL;
+	}
+
+	if (split != STRTAB_SPLIT) {
+		dev_err(smmu->dev,
+			"kdump: unsupported STRTAB_SPLIT %u (expected %u)\n",
+			split, STRTAB_SPLIT);
+		return -EINVAL;
+	}
+
+	num_l1_ents = 1 << (log2size - split);
+	cfg->l2.l1_dma = dma;
+	cfg->l2.num_l1_ents = num_l1_ents;
+	cfg->l2.l1tab = devm_memremap(
+		smmu->dev, dma, num_l1_ents * sizeof(struct arm_smmu_strtab_l1),
+		MEMREMAP_WB);
+	if (!cfg->l2.l1tab)
+		return -ENOMEM;
+
+	cfg->l2.l2ptrs = devm_kcalloc(smmu->dev, num_l1_ents,
+				      sizeof(*cfg->l2.l2ptrs), GFP_KERNEL);
+	if (!cfg->l2.l2ptrs)
+		return -ENOMEM;
+
+	for (i = 0; i < num_l1_ents; i++) {
+		u64 l2ptr = le64_to_cpu(cfg->l2.l1tab[i].l2ptr);
+		u32 span = FIELD_GET(STRTAB_L1_DESC_SPAN, l2ptr);
+		dma_addr_t l2_dma = l2ptr & STRTAB_L1_DESC_L2PTR_MASK;
+
+		if (span && l2_dma) {
+			cfg->l2.l2ptrs[i] = devm_memremap(
+				smmu->dev, l2_dma,
+				sizeof(struct arm_smmu_strtab_l2), MEMREMAP_WB);
+			if (!cfg->l2.l2ptrs[i])
+				return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
+static int arm_smmu_adopt_strtab_linear(struct arm_smmu_device *smmu,
+					u32 cfg_reg, dma_addr_t dma)
+{
+	u32 log2size = FIELD_GET(STRTAB_BASE_CFG_LOG2SIZE, cfg_reg);
+	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+
+	cfg->linear.ste_dma = dma;
+	cfg->linear.num_ents = 1 << log2size;
+	cfg->linear.table = devm_memremap(smmu->dev, dma,
+					  cfg->linear.num_ents *
+						  sizeof(struct arm_smmu_ste),
+					  MEMREMAP_WB);
+	if (!cfg->linear.table)
+		return -ENOMEM;
+	return 0;
+}
+
+static int arm_smmu_adopt_strtab(struct arm_smmu_device *smmu)
+{
+	u32 cfg_reg = readl_relaxed(smmu->base + ARM_SMMU_STRTAB_BASE_CFG);
+	u64 base_reg = readq_relaxed(smmu->base + ARM_SMMU_STRTAB_BASE);
+	u32 fmt = FIELD_GET(STRTAB_BASE_CFG_FMT, cfg_reg);
+	dma_addr_t dma = base_reg & STRTAB_BASE_ADDR_MASK;
+	int ret;
+
+	dev_info(smmu->dev, "kdump: adopting crashed kernel's stream table\n");
+
+	if (fmt == STRTAB_BASE_CFG_FMT_2LVL) {
+		/*
+		 * Both kernels run on the same hardware, so it's impossible for
+		 * kdump kernel to see the support for linear stream table only.
+		 */
+		if (WARN_ON(!(smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)))
+			return -EINVAL;
+		ret = arm_smmu_adopt_strtab_2lvl(smmu, cfg_reg, dma);
+	} else if (fmt == STRTAB_BASE_CFG_FMT_LINEAR) {
+		/*
+		 * In case that the old kernel for some reason used the linear
+		 * format, enforce the same format to match the adopted table.
+		 */
+		smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
+		ret = arm_smmu_adopt_strtab_linear(smmu, cfg_reg, dma);
+	} else {
+		dev_err(smmu->dev, "kdump: invalid STRTAB format %u\n", fmt);
+		ret = -EINVAL;
+	}
+	return ret;
+}
+
 static int arm_smmu_init_strtab(struct arm_smmu_device *smmu)
 {
 	int ret;
 
-	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
+	if (smmu->options & ARM_SMMU_OPT_KDUMP)
+		ret = arm_smmu_adopt_strtab(smmu);
+	else if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
 		ret = arm_smmu_init_strtab_2lvl(smmu);
 	else
 		ret = arm_smmu_init_strtab_linear(smmu);
-- 
2.43.0



^ permalink raw reply related

* [PATCH rc v2 5/5] iommu/arm-smmu-v3: Detect ARM_SMMU_OPT_KDUMP in arm_smmu_device_hw_probe()
From: Nicolin Chen @ 2026-04-15 21:17 UTC (permalink / raw)
  To: will, robin.murphy, jgg, kevin.tian
  Cc: joro, praan, baolu.lu, miko.lenczewski, smostafa,
	linux-arm-kernel, iommu, linux-kernel, stable, jamien
In-Reply-To: <cover.1776286352.git.nicolinc@nvidia.com>

arm_smmu_device_hw_probe() runs before arm_smmu_init_structures(), so it is
the natural place to decide whether the kdump kernel must adopt the crashed
kernel's stream table.

Given that memremap is used to adopt the old stream table, set this option
only on a coherent SMMU.

Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 12cd148a99dc6..5a5e0f80bbfb3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -5388,6 +5388,25 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
 
 	dev_info(smmu->dev, "oas %lu-bit (features 0x%08x)\n",
 		 smmu->oas, smmu->features);
+
+	/*
+	 * If SMMU is already active in kdump case, there could be in-flight DMA
+	 * from devices initiated by the crashed kernel. Mark ARM_SMMU_OPT_KDUMP
+	 * to let the init functions adopt the crashed kernel's stream table.
+	 *
+	 * Note that arm_smmu_adopt_strtab() uses memremap that can only work on
+	 * a coherent SMMU. A non-coherent SMMU has no choice but to continue to
+	 * abort any in-flight DMA.
+	 */
+	if (is_kdump_kernel() &&
+	    (readl_relaxed(smmu->base + ARM_SMMU_CR0) & CR0_SMMUEN)) {
+		if (coherent)
+			smmu->options |= ARM_SMMU_OPT_KDUMP;
+		else
+			dev_warn(smmu->dev,
+				 "kdump: in-flight DMA would be rejected\n");
+	}
+
 	return 0;
 }
 
-- 
2.43.0



^ permalink raw reply related

* [PATCH rc v2 2/5] iommu/arm-smmu-v3: Implement is_attach_deferred() for kdump
From: Nicolin Chen @ 2026-04-15 21:17 UTC (permalink / raw)
  To: will, robin.murphy, jgg, kevin.tian
  Cc: joro, praan, baolu.lu, miko.lenczewski, smostafa,
	linux-arm-kernel, iommu, linux-kernel, stable, jamien
In-Reply-To: <cover.1776286352.git.nicolinc@nvidia.com>

Though the kdump kernel adopts the crashed kernel's stream table, the iommu
core will still try to attach each probed device to a default domain, which
overwrites the adopted STE and breaks in-flight DMA from that device.

Implement an is_attach_deferred() callback to prevent this. For each device
that has STE.V=1 in the adopted table, defer the default domain attachment,
until the device driver explicitly requests it.

Fixes: b63b3439b856 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 28 +++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9a45f17200a21..d9d543eb8cecf 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4212,6 +4212,33 @@ static void arm_smmu_remove_master(struct arm_smmu_master *master)
 	kfree(master->build_invs);
 }
 
+static bool arm_smmu_is_attach_deferred(struct device *dev)
+{
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+	struct arm_smmu_device *smmu = master->smmu;
+	int i;
+
+	if (!(smmu->options & ARM_SMMU_OPT_KDUMP))
+		return false;
+
+	for (i = 0; i < master->num_streams; i++) {
+		u32 sid = master->streams[i].id;
+		struct arm_smmu_ste *step;
+
+		/* Guard against unpopulated L2 entries in the adopted table */
+		if ((smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) &&
+		    !smmu->strtab_cfg.l2.l2ptrs[arm_smmu_strtab_l1_idx(sid)])
+			continue;
+
+		step = arm_smmu_get_step_for_sid(smmu, sid);
+		/* If the STE has the Valid bit set, defer the attach */
+		if (le64_to_cpu(step->data[0]) & STRTAB_STE_0_V)
+			return true;
+	}
+
+	return false;
+}
+
 static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 {
 	int ret;
@@ -4374,6 +4401,7 @@ static const struct iommu_ops arm_smmu_ops = {
 	.hw_info		= arm_smmu_hw_info,
 	.domain_alloc_sva       = arm_smmu_sva_domain_alloc,
 	.domain_alloc_paging_flags = arm_smmu_domain_alloc_paging_flags,
+	.is_attach_deferred	= arm_smmu_is_attach_deferred,
 	.probe_device		= arm_smmu_probe_device,
 	.release_device		= arm_smmu_release_device,
 	.device_group		= arm_smmu_device_group,
-- 
2.43.0



^ permalink raw reply related

* Re: [PATCH v1 1/1] i2c: mediatek: add bus regulator control for power saving
From: Andi Shyti @ 2026-04-15 21:45 UTC (permalink / raw)
  To: adlavinitha reddy
  Cc: Qii Wang, Matthias Brugger, AngeloGioacchino Del Regno, linux-i2c,
	linux-kernel, linux-arm-kernel, linux-mediatek,
	Project_Global_Chrome_Upstream_Group
In-Reply-To: <20260415125833.2579133-1-adlavinitha.reddy@mediatek.com>

> Thank you for your guidance on formatting and submission rules. I will study
> the documentation carefully and ensure I properly send patches in the future.
> 
> Regarding the multiple submissions, I sincerely apologize for the confusion.
> As an intern here, I encountered some unexpected internal email permission
> issues when trying to send the patch. In an attempt to double-check whether
> the email successfully went through the server, I mistakenly re-sent it a
> couple of times. I will be much more careful next time.

No worries, Adlavinitha, your patch got in anyway.

Have fun in the kernel community :-)

Andi


^ permalink raw reply

* Re: [PATCH v2 1/8] dt-bindings: mfd: khadas: Add new compatible for Khadas VIM4 MCU
From: Rob Herring @ 2026-04-15 21:48 UTC (permalink / raw)
  To: Ronald Claveau
  Cc: Neil Armstrong, Lee Jones, Krzysztof Kozlowski, Conor Dooley,
	Andi Shyti, Kevin Hilman, Jerome Brunet, Martin Blumenstingl,
	Beniamino Galvani, Rafael J. Wysocki, Daniel Lezcano, Zhang Rui,
	Lukasz Luba, Liam Girdwood, Mark Brown, linux-amlogic, devicetree,
	linux-kernel, linux-i2c, linux-arm-kernel, linux-pm
In-Reply-To: <20260403-add-mcu-fan-khadas-vim4-v2-1-70536b22439a@aliel.fr>

On Fri, Apr 03, 2026 at 06:08:34PM +0200, Ronald Claveau wrote:
> The Khadas VIM4 MCU register is slightly different
> from previous boards' MCU.
> This board also features a switchable power source for its fan.
> 
> Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
> ---
>  Documentation/devicetree/bindings/mfd/khadas,mcu.yaml | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/mfd/khadas,mcu.yaml b/Documentation/devicetree/bindings/mfd/khadas,mcu.yaml
> index 084960fd5a1fd..67769ef5d58b1 100644
> --- a/Documentation/devicetree/bindings/mfd/khadas,mcu.yaml
> +++ b/Documentation/devicetree/bindings/mfd/khadas,mcu.yaml
> @@ -18,6 +18,7 @@ properties:
>    compatible:
>      enum:
>        - khadas,mcu # MCU revision is discoverable

The revision is no longer discoverable as was claimed?

> +      - khadas,vim4-mcu
>  
>    "#cooling-cells": # Only needed for boards having FAN control feature
>      const: 2
> @@ -25,6 +26,10 @@ properties:
>    reg:
>      maxItems: 1
>  
> +  fan-supply:
> +    description: Phandle to the regulator that powers the fan.
> +    $ref: /schemas/types.yaml#/definitions/phandle
> +
>  required:
>    - compatible
>    - reg
> 
> -- 
> 2.49.0
> 


^ permalink raw reply

* Re: [PATCH v2 2/8] dt-bindings: i2c: amlogic: Add compatible for T7 SOC
From: Rob Herring (Arm) @ 2026-04-15 21:48 UTC (permalink / raw)
  To: Ronald Claveau
  Cc: Jerome Brunet, Neil Armstrong, linux-i2c, Kevin Hilman, linux-pm,
	Lukasz Luba, linux-amlogic, linux-kernel, Zhang Rui, Lee Jones,
	devicetree, Conor Dooley, Andi Shyti, Daniel Lezcano,
	Martin Blumenstingl, Beniamino Galvani, Krzysztof Kozlowski,
	Liam Girdwood, Mark Brown, linux-arm-kernel, Rafael J. Wysocki
In-Reply-To: <20260403-add-mcu-fan-khadas-vim4-v2-2-70536b22439a@aliel.fr>


On Fri, 03 Apr 2026 18:08:35 +0200, Ronald Claveau wrote:
> Add the T7 SoC compatible, which falls back to the AXG compatible.
> 
> Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
> ---
>  .../devicetree/bindings/i2c/amlogic,meson6-i2c.yaml         | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 

Acked-by: Rob Herring (Arm) <robh@kernel.org>



^ permalink raw reply

* Re: [GIT PULL] pmdomain updates for v7.1
From: pr-tracker-bot @ 2026-04-15 22:00 UTC (permalink / raw)
  To: Ulf Hansson; +Cc: Linus, linux-pm, linux-kernel, Ulf Hansson, linux-arm-kernel
In-Reply-To: <20260414103826.161076-1-ulf.hansson@linaro.org>

The pull request you sent on Tue, 14 Apr 2026 12:38:21 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm.git tags/pmdomain-v7.1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/e41a25c53f96abe40edc5db1626d37a518852d84

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


^ permalink raw reply

* Re: [PATCH v5 8/9] driver core: Replace dev->of_node_reused with dev_of_node_reused()
From: Rob Herring (Arm) @ 2026-04-15 22:10 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: astewart, linux-arm-kernel, Mark Brown, bhelgaas, maz, linux,
	kees, Alan Stern, Saravana Kannan, netdev, linux-serial, davem,
	andrew, Greg Kroah-Hartman, brgl, jirislaby, mani, Johan Hovold,
	linux-aspeed, linux-pci, kuba, Alexander Lobakin, Leon Romanovsky,
	andriy.shevchenko, Rafael J . Wysocki, Alexey Kardashevskiy,
	lgirdwood, andrew, hkallweit1, linux-kernel, Danilo Krummrich,
	Eric Dumazet, linux-usb, alexander.stein, Robin Murphy, pabeni,
	devicetree, driver-core, joel, Christoph Hellwig
In-Reply-To: <20260406162231.v5.8.I806b8636cd3724f6cd1f5e199318ab8694472d90@changeid>


On Mon, 06 Apr 2026 16:23:01 -0700, Douglas Anderson wrote:
> In C, bitfields are not necessarily safe to modify from multiple
> threads without locking. Switch "of_node_reused" over to the "flags"
> field so modifications are safe.
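
As background to the paragraph above: adjacent bitfields normally share one
storage unit, so writing one of them is a read-modify-write of the whole unit
and can lose a concurrent write to a neighbouring field. The following is only
a minimal userspace sketch in C11, not the code from this patch. Every demo_*
name is hypothetical, and it uses <stdatomic.h> rather than the kernel's own
accessors.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag numbers for illustration; not the kernel's definitions. */
enum demo_dev_flag {
	DEMO_FLAG_OF_NODE_REUSED = 0,
	DEMO_FLAG_OTHER = 1,
};

struct demo_device {
	/* One word holding all flags, only touched with atomic operations. */
	atomic_ulong flags;
};

static inline unsigned long demo_flag_mask(unsigned int bit)
{
	/* Reject bit numbers that do not fit into the flags word. */
	assert(bit < sizeof(unsigned long) * CHAR_BIT);
	return 1UL << bit;
}

/* Atomically set one flag without disturbing its neighbours. */
static inline void demo_set_flag(struct demo_device *dev, unsigned int bit)
{
	atomic_fetch_or(&dev->flags, demo_flag_mask(bit));
}

/* Atomically clear one flag without disturbing its neighbours. */
static inline void demo_clear_flag(struct demo_device *dev, unsigned int bit)
{
	atomic_fetch_and(&dev->flags, ~demo_flag_mask(bit));
}

/* Read one flag from an atomic snapshot of the flags word. */
static inline bool demo_test_flag(struct demo_device *dev, unsigned int bit)
{
	return (atomic_load(&dev->flags) & demo_flag_mask(bit)) != 0;
}
```

Because each update is a single atomic OR or AND on the shared word, two
threads changing different flags cannot overwrite each other's bits, which is
the property a plain bitfield does not guarantee.
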
> 
> Cc: Johan Hovold <johan@kernel.org>
> Acked-by: Mark Brown <broonie@kernel.org>
> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
> Reviewed-by: Danilo Krummrich <dakr@kernel.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> Not fixing any known bugs; problem is theoretical and found by code
> inspection. Change is done somewhat manually and only lightly tested
> (mostly compile-time tested).
> 
> (no changes since v4)
> 
> Changes in v4:
> - Use accessor functions for flags
> 
> Changes in v3:
> - New
> 
>  drivers/base/core.c                      | 2 +-
>  drivers/base/pinctrl.c                   | 2 +-
>  drivers/base/platform.c                  | 2 +-
>  drivers/net/pcs/pcs-xpcs-plat.c          | 2 +-
>  drivers/of/device.c                      | 6 +++---
>  drivers/pci/of.c                         | 2 +-
>  drivers/pci/pwrctrl/core.c               | 2 +-
>  drivers/regulator/bq257xx-regulator.c    | 2 +-
>  drivers/regulator/rk808-regulator.c      | 2 +-
>  drivers/tty/serial/serial_base_bus.c     | 2 +-
>  drivers/usb/gadget/udc/aspeed-vhub/dev.c | 2 +-
>  include/linux/device.h                   | 7 ++++---
>  12 files changed, 17 insertions(+), 16 deletions(-)
> 

Reviewed-by: Rob Herring (Arm) <robh@kernel.org>



^ permalink raw reply

* Re: [PATCH v5 1/4] dt-bindings: interrupt-controller: Describe AST2700-A2 hardware instead of A0
From: Rob Herring (Arm) @ 2026-04-15 22:13 UTC (permalink / raw)
  To: Ryan Chen
  Cc: linux-riscv, Joel Stanley, Albert Ou, Palmer Dabbelt,
	linux-kernel, Paul Walmsley, devicetree, Krzysztof Kozlowski,
	Conor Dooley, linux-aspeed, Thomas Gleixner, Andrew Jeffery,
	Alexandre Ghiti, linux-arm-kernel
In-Reply-To: <20260407-irqchip-v5-1-c0b0a300a057@aspeedtech.com>


On Tue, 07 Apr 2026 11:08:04 +0800, Ryan Chen wrote:
> Introduce a new binding describing the AST2700 interrupt controller
> architecture implemented in the A2 production silicon.
> 
> The AST2700 SoC has undergone multiple silicon revisions (A0, A1, A2)
> prior to mass production. The interrupt architecture was substantially
> reworked after the A0 revision for A1, and the A1 design is retained
> unchanged in the A2 production silicon.
> 
> The existing AST2700 interrupt controller binding
> ("aspeed,ast2700-intc-ic") was written against the pre-production A0
> design. That binding does not accurately describe the interrupt
> hierarchy and routing model present in A1/A2, where interrupts can be
> routed to multiple processor-local interrupt controllers (Primary
> Service Processor (PSP) GIC, Secondary Service Processor (SSP)/Tertiary
> Service Processor (TSP) NVICs, and BootMCU APLIC) depending on the
> execution context.
> 
> Remove the binding for the pre-production A0 design in favour of the
> binding for the A2 production design. There is no significant user
> impact from the removal as there are no existing devicetrees in any
> of Linux, u-boot or Zephyr that make use of the A0 binding.
> 
> Hardware connectivity between interrupt controllers is expressed using
> the aspeed,interrupt-ranges property.
> 
> Signed-off-by: Ryan Chen <ryan_chen@aspeedtech.com>
> 
> ---
> Changes in v3:
> - squash patch 5/5.
> - modify wrap lines at 80 char.
> - modify maintainers name and email.
> - modify typo Sevice-> Service
> Changes in v2:
> - Describe AST2700 A0/A1/A2 design evolution.
> - Drop the redundant '-ic' suffix from compatible strings.
> - Expand commit message to match the series cover letter context.
> - fix ascii diagram
> - remove intc0 label
> - remove spaces before >
> - drop intc1 example
> ---
>  .../interrupt-controller/aspeed,ast2700-intc.yaml  |  90 ----------
>  .../aspeed,ast2700-interrupt.yaml                  | 188 +++++++++++++++++++++
>  2 files changed, 188 insertions(+), 90 deletions(-)
> 

Reviewed-by: Rob Herring (Arm) <robh@kernel.org>



^ permalink raw reply

* Re: [PATCH RFC 7/8] clk: sunxi-ng: a733: Add bus clock gates
From: Andre Przywara @ 2026-04-15 22:14 UTC (permalink / raw)
  To: Junhui Liu, Michael Turquette, Stephen Boyd, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Chen-Yu Tsai, Jernej Skrabec,
	Samuel Holland, Philipp Zabel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Alexandre Ghiti, Richard Cochran
  Cc: linux-clk, devicetree, linux-arm-kernel, linux-sunxi,
	linux-kernel, linux-riscv, netdev
In-Reply-To: <20260310-a733-clk-v1-7-36b4e9b24457@pigmoral.tech>

Hi,

cheekily jumping in here, for the parts that are easy to verify ;-)

In general this series looks very good, and many thanks for splitting
this up into reviewable chunks; that's much appreciated!

On 3/10/26 09:34, Junhui Liu wrote:
> Add the bus clock gates that control access to the devices' register
> interface on the Allwinner A733 SoC. These clocks are typically
> single-bit controls in the BGR registers, covering UARTs, SPI, I2C, and
> various multimedia engines. It also includes bus gates for system
> components like the IOMMU and MSI-lite interfaces.
> 
> Signed-off-by: Junhui Liu <junhui.liu@pigmoral.tech>
> 
> ---
> The parents of some bus clocks are difficult to determine, as the user
> manual only describes the clock source for a few instances. The current
> configurations are based on references to previous Allwinner SoCs and
> information gathered from the manual. Where documentation is lacking,
> vendor practices are followed by setting the parent to "hosc" for now.
> ---
>   drivers/clk/sunxi-ng/ccu-sun60i-a733.c | 475 ++++++++++++++++++++++++++++++++-
>   1 file changed, 474 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/clk/sunxi-ng/ccu-sun60i-a733.c b/drivers/clk/sunxi-ng/ccu-sun60i-a733.c
> index 36b44568a56f..c0b09f9197d1 100644
> --- a/drivers/clk/sunxi-ng/ccu-sun60i-a733.c
> +++ b/drivers/clk/sunxi-ng/ccu-sun60i-a733.c
> @@ -408,16 +408,19 @@ static SUNXI_CCU_M_DATA_WITH_MUX(ahb_clk, "ahb", ahb_apb_parents, 0x500,
>   				 0, 5,		/* M */
>   				 24, 2,		/* mux */
>   				 0);
> +static const struct clk_hw *ahb_hws[] = { &ahb_clk.common.hw };
>   
>   static SUNXI_CCU_M_DATA_WITH_MUX(apb0_clk, "apb0", ahb_apb_parents, 0x510,
>   				 0, 5,		/* M */
>   				 24, 2,		/* mux */
>   				 0);
> +static const struct clk_hw *apb0_hws[] = { &apb0_clk.common.hw };
>   
>   static SUNXI_CCU_M_DATA_WITH_MUX(apb1_clk, "apb1", ahb_apb_parents, 0x518,
>   				 0, 5,		/* M */
>   				 24, 2,		/* mux */
>   				 0);
> +static const struct clk_hw *apb1_hws[] = { &apb1_clk.common.hw };
>   
>   static const struct clk_parent_data apb_uart_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
> @@ -430,6 +433,9 @@ static SUNXI_CCU_M_DATA_WITH_MUX(apb_uart_clk, "apb-uart", apb_uart_parents, 0x5
>   				 0, 5,		/* M */
>   				 24, 3,		/* mux */
>   				 0);
> +static const struct clk_hw *apb_uart_hws[] = {
> +	&apb_uart_clk.common.hw
> +};
>   
>   static const struct clk_parent_data trace_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
> @@ -463,6 +469,8 @@ static SUNXI_CCU_M_DATA_WITH_MUX_GATE(cpu_peri_clk, "cpu-peri", gic_cpu_peri_par
>   				      BIT(31),	/* gate */
>   				      0);
>   
> +static SUNXI_CCU_GATE_DATA(bus_its_pcie_clk, "bus-its-pcie", hosc, 0x574, BIT(1), 0);
> +
>   static const struct clk_parent_data nsi_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
>   	{ .hw = &pll_ddr_clk.common.hw },
> @@ -477,6 +485,7 @@ static SUNXI_CCU_MP_DATA_WITH_MUX_GATE_FEAT(nsi_clk, "nsi", nsi_parents, 0x580,
>   					    24, 3,	/* mux */
>   					    BIT(31),	/* gate */
>   					    0, CCU_FEATURE_UPDATE_BIT);
> +static SUNXI_CCU_GATE_DATA(bus_nsi_clk, "bus-nsi", hosc, 0x584, BIT(0), 0);
>   
>   static const struct clk_parent_data mbus_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
> @@ -493,9 +502,117 @@ static SUNXI_CCU_MP_DATA_WITH_MUX_GATE_FEAT(mbus_clk, "mbus", mbus_parents, 0x58
>   					    BIT(31),	/* gate */
>   					    CLK_IS_CRITICAL,
>   					    CCU_FEATURE_UPDATE_BIT);
> +static const struct clk_hw *mbus_hws[] = { &mbus_clk.common.hw };
> +
> +static SUNXI_CCU_GATE_HWS(mbus_iommu0_sys_clk, "mbus-iommu0-sys", mbus_hws, 0x58c, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(apb_iommu0_sys_clk, "apb-iommu0-sys", apb0_hws, 0x58c, BIT(1), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_iommu0_sys_clk, "ahb-iommu0-sys", ahb_hws, 0x58c, BIT(2), 0);
> +
> +static SUNXI_CCU_GATE_DATA(bus_msi_lite0_clk, "bus-msi-lite0", hosc, 0x594, BIT(0), 0);
> +static SUNXI_CCU_GATE_DATA(bus_msi_lite1_clk, "bus-msi-lite1", hosc, 0x59c, BIT(0), 0);
> +static SUNXI_CCU_GATE_DATA(bus_msi_lite2_clk, "bus-msi-lite2", hosc, 0x5a4, BIT(0), 0);
> +
> +static SUNXI_CCU_GATE_HWS(mbus_iommu1_sys_clk, "mbus-iommu1-sys", mbus_hws, 0x5b4, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(apb_iommu1_sys_clk, "apb_iommu1-sys", apb0_hws, 0x5b4, BIT(1), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_iommu1_sys_clk, "ahb_iommu1-sys", ahb_hws, 0x5b4, BIT(2), 0);
> +
> +static SUNXI_CCU_GATE_HWS(ahb_ve_dec_clk, "ahb-ve-dec", ahb_hws,
> +			  0x5c0, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_ve_enc_clk, "ahb-ve-enc", ahb_hws,
> +			  0x5c0, BIT(1), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_vid_in_clk, "ahb-vid-in", ahb_hws,
> +			  0x5c0, BIT(2), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_vid_cout0_clk, "ahb-vid-cout0", ahb_hws,
> +			  0x5c0, BIT(3), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_vid_cout1_clk, "ahb-vid-cout1", ahb_hws,
> +			  0x5c0, BIT(4), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_de_clk, "ahb-de", ahb_hws,
> +			  0x5c0, BIT(5), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_npu_clk, "ahb-npu", ahb_hws,
> +			  0x5c0, BIT(6), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_gpu0_clk, "ahb-gpu0", ahb_hws,
> +			  0x5c0, BIT(7), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_serdes_clk, "ahb-serdes", ahb_hws,
> +			  0x5c0, BIT(8), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_usb_sys_clk, "ahb-usb-sys", ahb_hws,
> +			  0x5c0, BIT(9), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_msi_lite0_clk, "ahb-msi-lite0", ahb_hws,
> +			  0x5c0, BIT(16), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_store_clk, "ahb-store", ahb_hws,
> +			  0x5c0, BIT(24), 0);
> +static SUNXI_CCU_GATE_HWS(ahb_cpus_clk, "ahb-cpus", ahb_hws,
> +			  0x5c0, BIT(28), 0);
> +
> +static SUNXI_CCU_GATE_HWS(mbus_iommu0_clk, "mbus-iommu0", mbus_hws,
> +			  0x5e0, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_iommu1_clk, "mbus-iommu1", mbus_hws,
> +			  0x5e0, BIT(1), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_desys_clk, "mbus-desys", mbus_hws,
> +			  0x5e0, BIT(11), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_ve_enc_gate_clk, "mbus-ve-enc-gate", mbus_hws,
> +			  0x5e0, BIT(12), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_ve_dec_gate_clk, "mbus-ve-dec-gate", mbus_hws,
> +			  0x5e0, BIT(14), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_gpu0_clk, "mbus-gpu0", mbus_hws,
> +			  0x5e0, BIT(16), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_npu_clk, "mbus-npu", mbus_hws,
> +			  0x5e0, BIT(18), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_vid_in_clk, "mbus-vid-in", mbus_hws,
> +			  0x5e0, BIT(24), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_serdes_clk, "mbus-serdes", mbus_hws,
> +			  0x5e0, BIT(28), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_msi_lite0_clk, "mbus-msi-lite0", mbus_hws,
> +			  0x5e0, BIT(29), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_store_clk, "mbus-store", mbus_hws,
> +			  0x5e0, BIT(30), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_msi_lite2_clk, "mbus-msi-lite2", mbus_hws,
> +			  0x5e0, BIT(31), 0);
> +
> +static SUNXI_CCU_GATE_HWS(mbus_dma0_clk, "mbus-dma0", mbus_hws,
> +			  0x5e4, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_ve_enc_clk, "mbus-ve-enc", mbus_hws,
> +			  0x5e4, BIT(1), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_ce_clk, "mbus-ce", mbus_hws,
> +			  0x5e4, BIT(2), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_dma1_clk, "mbus-dma1", mbus_hws,
> +			  0x5e4, BIT(3), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_nand_clk, "mbus-nand", mbus_hws,
> +			  0x5e4, BIT(5), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_csi_clk, "mbus-csi", mbus_hws,
> +			  0x5e4, BIT(8), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_isp_clk, "mbus-isp", mbus_hws,
> +			  0x5e4, BIT(9), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_gmac0_clk, "mbus-gmac0", mbus_hws,
> +			  0x5e4, BIT(11), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_gmac1_clk, "mbus-gmac1", mbus_hws,
> +			  0x5e4, BIT(12), 0);
> +static SUNXI_CCU_GATE_HWS(mbus_ve_dec_clk, "mbus-ve-dec", mbus_hws,
> +			  0x5e4, BIT(18), 0);
> +
> +static SUNXI_CCU_GATE_HWS(bus_dma0_clk, "bus-dma0", ahb_hws,
> +			  0x704, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_dma1_clk, "bus-dma1", ahb_hws,
> +			  0x70c, BIT(0), 0);
> +
> +static SUNXI_CCU_GATE_HWS(bus_spinlock_clk, "bus-spinlock", ahb_hws,
> +			  0x724, BIT(0), 0);
> +
> +static SUNXI_CCU_GATE_HWS(bus_msgbox_clk, "bus-msgbox", ahb_hws,
> +			  0x744, BIT(0), 0);
> +
> +static SUNXI_CCU_GATE_HWS(bus_pwm0_clk, "bus-pwm0", apb0_hws,
> +			  0x784, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_pwm1_clk, "bus-pwm1", apb0_hws,
> +			  0x78c, BIT(0), 0);
> +
> +static SUNXI_CCU_GATE_HWS(bus_dbg_clk, "bus-dbg", sys_24M_hws,
> +			  0x7a4, BIT(0), 0);
> +
> +static SUNXI_CCU_GATE_HWS(bus_sysdap_clk, "bus-sysdap", apb1_hws,
> +			  0x88c, BIT(0), 0);
>   
>   /**************************************************************************
> - *                          mod clocks                                    *
> + *                          mod clocks with gates                         *
>    **************************************************************************/
>   
>   static const struct clk_parent_data timer_parents[] = {
> @@ -565,6 +682,7 @@ static SUNXI_CCU_MP_DATA_WITH_MUX_GATE(timer9_clk, "timer9", timer_parents, 0x82
>   				       24, 3,		/* mux */
>   				       BIT(31),		/* gate */
>   				       0);
> +static SUNXI_CCU_GATE_HWS(bus_timer_clk, "bus-timer", ahb_hws, 0x850, BIT(0), 0);
>   
>   static const struct clk_parent_data avs_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
> @@ -589,6 +707,7 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(de_clk, "de", de_parents, 0xa00,
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    CLK_SET_RATE_PARENT);
> +static SUNXI_CCU_GATE_HWS(bus_de_clk, "bus-de", ahb_hws, 0xa04, BIT(0), 0);
>   
>   static const struct clk_hw *di_parents[] = {
>   	&pll_periph0_600M_clk.hw,
> @@ -602,6 +721,7 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(di_clk, "di", di_parents, 0xa20,
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    CLK_SET_RATE_PARENT);
> +static SUNXI_CCU_GATE_HWS(bus_di_clk, "bus-di", ahb_hws, 0xa24, BIT(0), 0);
>   
>   static const struct clk_hw *g2d_parents[] = {
>   	&pll_periph0_400M_clk.hw,
> @@ -614,6 +734,7 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(g2d_clk, "g2d", g2d_parents, 0xa40,
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    CLK_SET_RATE_PARENT);
> +static SUNXI_CCU_GATE_HWS(bus_g2d_clk, "bus-g2d", ahb_hws, 0xa44, BIT(0), 0);
>   
>   static const struct clk_hw *eink_parents[] = {
>   	&pll_periph0_480M_clk.common.hw,
> @@ -637,6 +758,7 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(eink_panel_clk, "eink-panel", eink_panel_par
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    CLK_SET_RATE_PARENT);
> +static SUNXI_CCU_GATE_HWS(bus_eink_clk, "bus-eink", ahb_hws, 0xa6c, BIT(0), 0);
>   
>   static const struct clk_hw *ve_enc_parents[] = {
>   	&pll_ve0_clk.common.hw,
> @@ -668,6 +790,9 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(ve_dec_clk, "ve-dec", ve_dec_parents, 0xa88,
>   				    BIT(31),	/* gate */
>   				    CLK_SET_RATE_PARENT);
>   
> +static SUNXI_CCU_GATE_HWS(bus_ve_enc_clk, "bus-ve-enc", ahb_hws, 0xa8c, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_ve_dec_clk, "bus-ve-dec", ahb_hws, 0xa8c, BIT(2), 0);
> +
>   static const struct clk_hw *ce_parents[] = {
>   	&sys_24M_clk.hw,
>   	&pll_periph0_400M_clk.hw,
> @@ -678,6 +803,8 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(ce_clk, "ce", ce_parents, 0xac0,
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    0);
> +static SUNXI_CCU_GATE_HWS(bus_ce_clk, "bus-ce", ahb_hws, 0xac4, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_ce_sys_clk, "bus-ce-sys", ahb_hws, 0xac4, BIT(1), 0);
>   
>   static const struct clk_hw *npu_parents[] = {
>   	&pll_npu_clk.common.hw,
> @@ -693,6 +820,7 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(npu_clk, "npu", npu_parents, 0xb00,
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    0);
> +static SUNXI_CCU_GATE_DATA(bus_npu_clk, "bus-npu", hosc, 0xb04, BIT(0), 0);
>   
>   /*
>    * GPU_CLK = ClockSource * ((16 - M) / 16)
> @@ -725,6 +853,7 @@ static struct ccu_div gpu_clk = {
>   							   &ccu_div_ops, 0),
>   	}
>   };
> +static SUNXI_CCU_GATE_HWS(bus_gpu_clk, "bus-gpu", ahb_hws, 0xb24, BIT(0), 0);
>   
>   static const struct clk_parent_data dram_parents[] = {
>   	{ .hw = &pll_ddr_clk.common.hw, },
> @@ -740,6 +869,7 @@ static SUNXI_CCU_MP_DATA_WITH_MUX_GATE_FEAT(dram_clk, "dram", dram_parents, 0xc0
>   					    BIT(31),	/* gate */
>   					    CLK_IS_CRITICAL,
>   					    CCU_FEATURE_UPDATE_BIT);
> +static SUNXI_CCU_GATE_HWS(bus_dram_clk, "bus-dram", ahb_hws, 0xc0c, BIT(0), 0);
>   
>   static const struct clk_parent_data nand_mmc_parents[] = {
>   	{ .hw = &sys_24M_clk.hw, },
> @@ -758,6 +888,7 @@ static SUNXI_CCU_M_DATA_WITH_MUX_GATE(nand1_clk, "nand1", nand_mmc_parents, 0xc8
>   				      24, 3,	/* mux */
>   				      BIT(31),	/* gate */
>   				      0);
> +static SUNXI_CCU_GATE_HWS(bus_nand_clk, "bus-nand", ahb_hws, 0xc8c, BIT(0), 0);
>   
>   static SUNXI_CCU_MP_MUX_GATE_POSTDIV_DUALDIV(mmc0_clk, "mmc0", nand_mmc_parents, 0xd00,
>   					     0, 5,	/* M */
> @@ -796,6 +927,11 @@ static SUNXI_CCU_MP_MUX_GATE_POSTDIV_DUALDIV(mmc3_clk, "mmc3", mmc2_mmc3_parents
>   					     2,		/* post div */
>   					     0);
>   
> +static SUNXI_CCU_GATE_HWS(bus_mmc0_clk, "bus-mmc0", ahb_hws, 0xd0c, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_mmc1_clk, "bus-mmc1", ahb_hws, 0xd1c, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_mmc2_clk, "bus-mmc2", ahb_hws, 0xd2c, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_mmc3_clk, "bus-mmc3", ahb_hws, 0xd3c, BIT(0), 0);
> +
>   static const struct clk_hw *ufs_axi_parents[] = {
>   	&pll_periph0_300M_clk.hw,
>   	&pll_periph0_200M_clk.hw,
> @@ -815,6 +951,29 @@ static SUNXI_CCU_M_DATA_WITH_MUX_GATE(ufs_cfg_clk, "ufs-cfg", ufs_cfg_parents, 0
>   				      24, 3,	/* mux */
>   				      BIT(31),	/* gate */
>   				      0);
> +static SUNXI_CCU_GATE_DATA(bus_ufs_clk, "bus-ufs", hosc, 0xd8c, BIT(0), 0);
> +
> +static SUNXI_CCU_GATE_HWS(bus_uart0_clk, "bus-uart0", apb_uart_hws, 0xe00, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_uart1_clk, "bus-uart1", apb_uart_hws, 0xe04, BIT(1), 0);
> +static SUNXI_CCU_GATE_HWS(bus_uart2_clk, "bus-uart2", apb_uart_hws, 0xe08, BIT(2), 0);
> +static SUNXI_CCU_GATE_HWS(bus_uart3_clk, "bus-uart3", apb_uart_hws, 0xe0c, BIT(3), 0);
> +static SUNXI_CCU_GATE_HWS(bus_uart4_clk, "bus-uart4", apb_uart_hws, 0xe10, BIT(4), 0);
> +static SUNXI_CCU_GATE_HWS(bus_uart5_clk, "bus-uart5", apb_uart_hws, 0xe14, BIT(5), 0);
> +static SUNXI_CCU_GATE_HWS(bus_uart6_clk, "bus-uart6", apb_uart_hws, 0xe18, BIT(6), 0);

According to the manual the gate bits are always BIT(0), since each
UART has its own bus gate register.
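
To illustrate the layout described above, here is a small standalone sketch,
assuming the manual's description holds (one bus gate register per UART, gate
always in bit 0) and taking the register offsets 0xe00..0xe18 from the quoted
patch. The demo_* names are hypothetical and are not part of the sunxi-ng
driver.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BIT(n)		(UINT32_C(1) << (n))
#define DEMO_UART_BGR_BASE	UINT32_C(0xe00)	/* first UART gate offset in the patch */
#define DEMO_UART_BGR_STRIDE	UINT32_C(0x4)	/* one 32-bit register per UART */
#define DEMO_NUM_UARTS		7u		/* uart0..uart6 in the patch */

struct demo_gate {
	uint32_t reg;		/* register offset inside the CCU */
	uint32_t enable;	/* mask of the gate bit in that register */
};

/* Compute the bus gate location for UART number 'index'. */
static struct demo_gate demo_uart_bus_gate(unsigned int index)
{
	assert(index < DEMO_NUM_UARTS);

	struct demo_gate gate = {
		.reg = DEMO_UART_BGR_BASE + index * DEMO_UART_BGR_STRIDE,
		/* Only the offset depends on the index; the bit stays at 0. */
		.enable = DEMO_BIT(0),
	};
	return gate;
}
```

With this layout only the register offset changes from one UART to the next,
so uart6 would be gated by bit 0 of the register at 0xe18 rather than by
bit 6.
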

> +static SUNXI_CCU_GATE_HWS(bus_i2c0_clk, "bus-i2c0", apb1_hws, 0xe80, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c1_clk, "bus-i2c1", apb1_hws, 0xe84, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c2_clk, "bus-i2c2", apb1_hws, 0xe88, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c3_clk, "bus-i2c3", apb1_hws, 0xe8c, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c4_clk, "bus-i2c4", apb1_hws, 0xe90, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c5_clk, "bus-i2c5", apb1_hws, 0xe94, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c6_clk, "bus-i2c6", apb1_hws, 0xe98, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c7_clk, "bus-i2c7", apb1_hws, 0xe9c, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c8_clk, "bus-i2c8", apb1_hws, 0xea0, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c9_clk, "bus-i2c9", apb1_hws, 0xea4, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c10_clk, "bus-i2c10", apb1_hws, 0xea8, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c11_clk, "bus-i2c11", apb1_hws, 0xeac, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_i2c12_clk, "bus-i2c12", apb1_hws, 0xeb0, BIT(0), 0);
>   
>   static const struct clk_parent_data spi_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
> @@ -856,6 +1015,11 @@ static SUNXI_CCU_DUALDIV_MUX_GATE(spi4_clk, "spi4", spi_parents, 0xf28,
>   				  24, 3,	/* mux */
>   				  BIT(31),	/* gate */
>   				  0);
> +static SUNXI_CCU_GATE_HWS(bus_spi0_clk, "bus-spi0", ahb_hws, 0xf04, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_spi1_clk, "bus-spi1", ahb_hws, 0xf0c, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_spi2_clk, "bus-spi2", ahb_hws, 0xf14, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_spi3_clk, "bus-spi3", ahb_hws, 0xf24, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_spi4_clk, "bus-spi4", ahb_hws, 0xf2c, BIT(0), 0);
>   
>   static const struct clk_parent_data spif_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
> @@ -873,6 +1037,7 @@ static SUNXI_CCU_DUALDIV_MUX_GATE(spif_clk, "spif", spif_parents, 0xf18,
>   				  24, 3,	/* mux */
>   				  BIT(31),	/* gate */
>   				  0);
> +static SUNXI_CCU_GATE_HWS(bus_spif_clk, "bus-spif", ahb_hws, 0xf1c, BIT(0), 0);

Can you please move that line up to the other SPI gates above, so that
they are ordered by address?

>   
>   static const struct clk_parent_data gpadc_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
> @@ -883,6 +1048,9 @@ static SUNXI_CCU_M_DATA_WITH_MUX_GATE(gpadc_clk, "gpadc", gpadc_parents, 0xfc0,
>   				      24, 3,	/* mux */
>   				      BIT(31),	/* gate */
>   				      0);
> +static SUNXI_CCU_GATE_HWS(bus_gpadc_clk, "bus-gpadc", ahb_hws, 0xfc4, BIT(0), 0);
> +
> +static SUNXI_CCU_GATE_HWS(bus_ths_clk, "bus-ths", apb0_hws, 0xfe4, BIT(0), 0);
>   
>   static const struct clk_parent_data irrx_parents[] = {
>   	{ .fw_name = "losc"},
> @@ -894,6 +1062,7 @@ static SUNXI_CCU_M_DATA_WITH_MUX_GATE(irrx_clk, "irrx", irrx_parents, 0x1000,
>   				      24, 3,	/* mux */
>   				      BIT(31),	/* gate */
>   				      0);
> +static SUNXI_CCU_GATE_HWS(bus_irrx_clk, "bus-irrx", apb0_hws, 0x1004, BIT(0), 0);
>   
>   static const struct clk_parent_data irtx_parents[] = {
>   	{ .fw_name = "losc"},
> @@ -905,6 +1074,9 @@ static SUNXI_CCU_M_DATA_WITH_MUX_GATE(irtx_clk, "irtx", irtx_parents, 0x1008,
>   				      24, 3,	/* mux */
>   				      BIT(31),	/* gate */
>   				      0);
> +static SUNXI_CCU_GATE_HWS(bus_irtx_clk, "bus-irtx", apb0_hws, 0x100c, BIT(0), 0);
> +
> +static SUNXI_CCU_GATE_HWS(bus_lradc_clk, "bus-lradc", apb0_hws, 0x1024, BIT(0), 0);
>   
>   static const struct clk_parent_data sgpio_parents[] = {
>   	{ .fw_name = "losc"},
> @@ -915,6 +1087,7 @@ static SUNXI_CCU_M_DATA_WITH_MUX_GATE(sgpio_clk, "sgpio", sgpio_parents, 0x1060,
>   				      24, 3,	/* mux */
>   				      BIT(31),	/* gate */
>   				      0);
> +static SUNXI_CCU_GATE_DATA(bus_sgpio_clk, "bus-sgpio", hosc, 0x1064, BIT(0), 0);
>   
>   static const struct clk_hw *lpc_parents[] = {
>   	&pll_video0_3x_clk.common.hw,
> @@ -927,6 +1100,7 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(lpc_clk, "lpc", lpc_parents, 0x1080,
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    0);
> +static SUNXI_CCU_GATE_DATA(bus_lpc_clk, "bus-lpc", hosc, 0x1084, BIT(0), 0);

Where do these two clocks come from? They are not mentioned in the
version of the manual I am looking at. If they come from BSP sources,
please add a comment about that.

>   
>   static const struct clk_hw *i2spcm_parents[] = {
>   	&pll_audio0_4x_clk.common.hw,
> @@ -959,6 +1133,11 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(i2spcm4_clk, "i2spcm4", i2spcm_parents, 0x12
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    0);
> +static SUNXI_CCU_GATE_DATA(bus_i2spcm0_clk, "bus-i2spcm0", hosc, 0x120c, BIT(0), 0);
> +static SUNXI_CCU_GATE_DATA(bus_i2spcm1_clk, "bus-i2spcm1", hosc, 0x121c, BIT(0), 0);
> +static SUNXI_CCU_GATE_DATA(bus_i2spcm2_clk, "bus-i2spcm2", hosc, 0x122c, BIT(0), 0);
> +static SUNXI_CCU_GATE_DATA(bus_i2spcm3_clk, "bus-i2spcm3", hosc, 0x123c, BIT(0), 0);
> +static SUNXI_CCU_GATE_DATA(bus_i2spcm4_clk, "bus-i2spcm4", hosc, 0x124c, BIT(0), 0);
>   
>   static const struct clk_hw *i2spcm2_asrc_parents[] = {
>   	&pll_audio0_4x_clk.common.hw,
> @@ -995,6 +1174,8 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(owa_rx_clk, "owa_rx", owa_rx_parents, 0x1284
>   				    BIT(31),	/* gate */
>   				    0);
>   
> +static SUNXI_CCU_GATE_HWS(bus_owa_clk, "bus-owa", apb1_hws, 0x128c, BIT(0), 0);

In mainline we use "spdif" instead of "owa", compare the other drivers.

> +
>   static const struct clk_hw *dmic_parents[] = {
>   	&pll_audio0_4x_clk.common.hw,
>   	&pll_audio1_div2_clk.common.hw,
> @@ -1006,6 +1187,8 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(dmic_clk, "dmic", dmic_parents, 0x12c0,
>   				    BIT(31),	/* gate */
>   				    0);
>   
> +static SUNXI_CCU_GATE_HWS(bus_dmic_clk, "bus-dmic", apb1_hws, 0x12cc, BIT(0), 0);
> +
>   /*
>    * The first parent is a 48 MHz input clock divided by 4. That 48 MHz clock is
>    * a 2x multiplier from pll-ref synchronized by pll-periph0, and is also used by
> @@ -1037,6 +1220,9 @@ static struct ccu_mux usb_ohci0_clk = {
>   							   &ccu_mux_ops, 0),
>   	},
>   };
> +static SUNXI_CCU_GATE_HWS(bus_ohci0_clk, "bus-ohci0", ahb_hws, 0x1304, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_ehci0_clk, "bus-ehci0", ahb_hws, 0x1304, BIT(4), 0);
> +static SUNXI_CCU_GATE_HWS(bus_otg_clk, "bus-otg", ahb_hws, 0x1304, BIT(8), 0);
>   
>   static struct ccu_mux usb_ohci1_clk = {
>   	.enable		= BIT(31),
> @@ -1053,6 +1239,8 @@ static struct ccu_mux usb_ohci1_clk = {
>   							   &ccu_mux_ops, 0),
>   	},
>   };
> +static SUNXI_CCU_GATE_HWS(bus_ohci1_clk, "bus-ohci1", ahb_hws, 0x130c, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_ehci1_clk, "bus-ehci1", ahb_hws, 0x130c, BIT(4), 0);
>   
>   static const struct clk_parent_data usb_ref_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
> @@ -1159,6 +1347,8 @@ static SUNXI_CCU_M_HWS_WITH_GATE(gmac1_phy_clk, "gmac1-phy", pll_periph0_150M_hw
>   				 0, 5,		/* M */
>   				 BIT(31),	/* gate */
>   				 0);
> +static SUNXI_CCU_GATE_HWS(bus_gmac0_clk, "bus-gmac0", ahb_hws, 0x141c, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_gmac1_clk, "bus-gmac1", ahb_hws, 0x142c, BIT(0), 0);

That GMAC1 clock is not in the manual, where does it come from?

>   
>   static const struct clk_hw *tcon_lcd_parents[] = {
>   	&pll_video0_4x_clk.common.hw,
> @@ -1181,6 +1371,9 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(tcon_lcd2_clk, "tcon-lcd2", tcon_lcd_parents
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    0);
> +static SUNXI_CCU_GATE_HWS(bus_tcon_lcd0_clk, "bus-tcon-lcd0", ahb_hws, 0x1504, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_tcon_lcd1_clk, "bus-tcon-lcd1", ahb_hws, 0x150c, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_tcon_lcd2_clk, "bus-tcon-lcd2", ahb_hws, 0x1514, BIT(0), 0);

Same here, LCD2 is not listed.

The rest looks alright when compared to the manual, as does the whole
boilerplate with the SUNXI_CCU_GATE_HWS macro, the list of hw clocks
below and the assignment of the clock IDs to the clocks.

Cheers,
Andre

>   
>   static const struct clk_hw *dsi_parents[] = {
>   	&sys_24M_clk.hw,
> @@ -1197,6 +1390,8 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(dsi1_clk, "dsi1", dsi_parents, 0x1588,
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    0);
> +static SUNXI_CCU_GATE_HWS(bus_dsi0_clk, "bus-dsi0", ahb_hws, 0x1584, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_dsi1_clk, "bus-dsi1", ahb_hws, 0x158c, BIT(0), 0);
>   
>   static const struct clk_hw *combphy_parents[] = {
>   	&pll_video0_4x_clk.common.hw,
> @@ -1216,6 +1411,9 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(combphy1_clk, "combphy1", combphy_parents, 0
>   				    BIT(31),	/* gate */
>   				    0);
>   
> +static SUNXI_CCU_GATE_HWS(bus_tcon_tv0_clk, "bus-tcon-tv0", ahb_hws, 0x1604, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_tcon_tv1_clk, "bus-tcon-tv1", ahb_hws, 0x160c, BIT(0), 0);
> +
>   static const struct clk_hw *edp_tv_parents[] = {
>   	&pll_video0_4x_clk.common.hw,
>   	&pll_video1_4x_clk.common.hw,
> @@ -1227,6 +1425,7 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(edp_tv_clk, "edp-tv", edp_tv_parents, 0x1640
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    0);
> +static SUNXI_CCU_GATE_HWS(bus_edp_tv_clk, "bus-edp-tv", ahb_hws, 0x164c, BIT(0), 0);
>   
>   static SUNXI_CCU_GATE_HWS_WITH_PREDIV(hdmi_cec_32k_clk, "hdmi-cec-32k", pll_periph0_2x_hws, 0x1680,
>   				      BIT(30),	/* gate */
> @@ -1254,6 +1453,7 @@ static SUNXI_CCU_DUALDIV_MUX_GATE(hdmi_tv_clk, "hdmi-tv", hdmi_tv_parents, 0x168
>   				  24, 3,	/* mux */
>   				  BIT(31),	/* gate */
>   				  0);
> +static SUNXI_CCU_GATE_HWS(bus_hdmi_tv_clk, "bus-hdmi-tv", ahb_hws, 0x168c, BIT(0), 0);
>   
>   static const struct clk_parent_data hdmi_sfr_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
> @@ -1266,6 +1466,9 @@ static SUNXI_CCU_MUX_DATA_WITH_GATE(hdmi_sfr_clk, "hdmi-sfr", hdmi_sfr_parents,
>   
>   static SUNXI_CCU_GATE_HWS(hdmi_esm_clk, "hdmi-esm", pll_periph0_300M_hws, 0x1694, BIT(31), 0);
>   
> +static SUNXI_CCU_GATE_HWS(bus_dpss_top0_clk, "bus-dpss-top0", ahb_hws, 0x16c4, BIT(0), 0);
> +static SUNXI_CCU_GATE_HWS(bus_dpss_top1_clk, "bus-dpss-top1", ahb_hws, 0x16cc, BIT(0), 0);
> +
>   static const struct clk_parent_data ledc_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
>   	{ .hw = &pll_periph0_600M_clk.hw },
> @@ -1276,6 +1479,9 @@ static SUNXI_CCU_M_DATA_WITH_MUX_GATE(ledc_clk, "ledc", ledc_parents, 0x1700,
>   				      24, 3,	/* mux */
>   				      BIT(31),	/* gate */
>   				      0);
> +static SUNXI_CCU_GATE_HWS(bus_ledc_clk, "bus-ledc", apb0_hws, 0x1704, BIT(0), 0);
> +
> +static SUNXI_CCU_GATE_HWS(bus_dsc_clk, "bus-dsc", ahb_hws, 0x1744, BIT(0), 0);
>   
>   static const struct clk_parent_data csi_master_parents[] = {
>   	{ .hw = &sys_24M_clk.hw },
> @@ -1317,6 +1523,7 @@ static SUNXI_CCU_M_HW_WITH_MUX_GATE(csi_clk, "csi", csi_parents, 0x1840,
>   				    24, 3,	/* mux */
>   				    BIT(31),	/* gate */
>   				    0);
> +static SUNXI_CCU_GATE_HWS(bus_csi_clk, "bus-csi", ahb_hws, 0x1844, BIT(0), 0);
>   
>   static const struct clk_hw *isp_parents[] = {
>   	&pll_video2_4x_clk.common.hw,
> @@ -1446,8 +1653,62 @@ static struct ccu_common *sun60i_a733_ccu_clks[] = {
>   	&trace_clk.common,
>   	&gic_clk.common,
>   	&cpu_peri_clk.common,
> +	&bus_its_pcie_clk.common,
>   	&nsi_clk.common,
> +	&bus_nsi_clk.common,
>   	&mbus_clk.common,
> +	&mbus_iommu0_sys_clk.common,
> +	&apb_iommu0_sys_clk.common,
> +	&ahb_iommu0_sys_clk.common,
> +	&bus_msi_lite0_clk.common,
> +	&bus_msi_lite1_clk.common,
> +	&bus_msi_lite2_clk.common,
> +	&mbus_iommu1_sys_clk.common,
> +	&apb_iommu1_sys_clk.common,
> +	&ahb_iommu1_sys_clk.common,
> +	&ahb_ve_dec_clk.common,
> +	&ahb_ve_enc_clk.common,
> +	&ahb_vid_in_clk.common,
> +	&ahb_vid_cout0_clk.common,
> +	&ahb_vid_cout1_clk.common,
> +	&ahb_de_clk.common,
> +	&ahb_npu_clk.common,
> +	&ahb_gpu0_clk.common,
> +	&ahb_serdes_clk.common,
> +	&ahb_usb_sys_clk.common,
> +	&ahb_msi_lite0_clk.common,
> +	&ahb_store_clk.common,
> +	&ahb_cpus_clk.common,
> +	&mbus_iommu0_clk.common,
> +	&mbus_iommu1_clk.common,
> +	&mbus_desys_clk.common,
> +	&mbus_ve_enc_gate_clk.common,
> +	&mbus_ve_dec_gate_clk.common,
> +	&mbus_gpu0_clk.common,
> +	&mbus_npu_clk.common,
> +	&mbus_vid_in_clk.common,
> +	&mbus_serdes_clk.common,
> +	&mbus_msi_lite0_clk.common,
> +	&mbus_store_clk.common,
> +	&mbus_msi_lite2_clk.common,
> +	&mbus_dma0_clk.common,
> +	&mbus_ve_enc_clk.common,
> +	&mbus_ce_clk.common,
> +	&mbus_dma1_clk.common,
> +	&mbus_nand_clk.common,
> +	&mbus_csi_clk.common,
> +	&mbus_isp_clk.common,
> +	&mbus_gmac0_clk.common,
> +	&mbus_gmac1_clk.common,
> +	&mbus_ve_dec_clk.common,
> +	&bus_dma0_clk.common,
> +	&bus_dma1_clk.common,
> +	&bus_spinlock_clk.common,
> +	&bus_msgbox_clk.common,
> +	&bus_pwm0_clk.common,
> +	&bus_pwm1_clk.common,
> +	&bus_dbg_clk.common,
> +	&bus_sysdap_clk.common,
>   	&timer0_clk.common,
>   	&timer1_clk.common,
>   	&timer2_clk.common,
> @@ -1458,48 +1719,111 @@ static struct ccu_common *sun60i_a733_ccu_clks[] = {
>   	&timer7_clk.common,
>   	&timer8_clk.common,
>   	&timer9_clk.common,
> +	&bus_timer_clk.common,
>   	&avs_clk.common,
>   	&de_clk.common,
> +	&bus_de_clk.common,
>   	&di_clk.common,
> +	&bus_di_clk.common,
>   	&g2d_clk.common,
> +	&bus_g2d_clk.common,
>   	&eink_clk.common,
>   	&eink_panel_clk.common,
> +	&bus_eink_clk.common,
>   	&ve_enc_clk.common,
>   	&ve_dec_clk.common,
> +	&bus_ve_enc_clk.common,
> +	&bus_ve_dec_clk.common,
>   	&ce_clk.common,
> +	&bus_ce_clk.common,
> +	&bus_ce_sys_clk.common,
>   	&npu_clk.common,
> +	&bus_npu_clk.common,
>   	&gpu_clk.common,
> +	&bus_gpu_clk.common,
>   	&dram_clk.common,
> +	&bus_dram_clk.common,
>   	&nand0_clk.common,
>   	&nand1_clk.common,
> +	&bus_nand_clk.common,
>   	&mmc0_clk.common,
>   	&mmc1_clk.common,
>   	&mmc2_clk.common,
>   	&mmc3_clk.common,
> +	&bus_mmc0_clk.common,
> +	&bus_mmc1_clk.common,
> +	&bus_mmc2_clk.common,
> +	&bus_mmc3_clk.common,
>   	&ufs_axi_clk.common,
>   	&ufs_cfg_clk.common,
> +	&bus_ufs_clk.common,
> +	&bus_uart0_clk.common,
> +	&bus_uart1_clk.common,
> +	&bus_uart2_clk.common,
> +	&bus_uart3_clk.common,
> +	&bus_uart4_clk.common,
> +	&bus_uart5_clk.common,
> +	&bus_uart6_clk.common,
> +	&bus_i2c0_clk.common,
> +	&bus_i2c1_clk.common,
> +	&bus_i2c2_clk.common,
> +	&bus_i2c3_clk.common,
> +	&bus_i2c4_clk.common,
> +	&bus_i2c5_clk.common,
> +	&bus_i2c6_clk.common,
> +	&bus_i2c7_clk.common,
> +	&bus_i2c8_clk.common,
> +	&bus_i2c9_clk.common,
> +	&bus_i2c10_clk.common,
> +	&bus_i2c11_clk.common,
> +	&bus_i2c12_clk.common,
>   	&spi0_clk.common,
>   	&spi1_clk.common,
>   	&spi2_clk.common,
>   	&spi3_clk.common,
>   	&spi4_clk.common,
> +	&bus_spi0_clk.common,
> +	&bus_spi1_clk.common,
> +	&bus_spi2_clk.common,
> +	&bus_spi3_clk.common,
> +	&bus_spi4_clk.common,
>   	&spif_clk.common,
> +	&bus_spif_clk.common,
>   	&gpadc_clk.common,
> +	&bus_gpadc_clk.common,
> +	&bus_ths_clk.common,
>   	&irrx_clk.common,
> +	&bus_irrx_clk.common,
>   	&irtx_clk.common,
> +	&bus_irtx_clk.common,
> +	&bus_lradc_clk.common,
>   	&sgpio_clk.common,
> +	&bus_sgpio_clk.common,
>   	&lpc_clk.common,
> +	&bus_lpc_clk.common,
>   	&i2spcm0_clk.common,
>   	&i2spcm1_clk.common,
>   	&i2spcm2_clk.common,
>   	&i2spcm3_clk.common,
>   	&i2spcm4_clk.common,
> +	&bus_i2spcm0_clk.common,
> +	&bus_i2spcm1_clk.common,
> +	&bus_i2spcm2_clk.common,
> +	&bus_i2spcm3_clk.common,
> +	&bus_i2spcm4_clk.common,
>   	&i2spcm2_asrc_clk.common,
>   	&owa_tx_clk.common,
>   	&owa_rx_clk.common,
> +	&bus_owa_clk.common,
>   	&dmic_clk.common,
> +	&bus_dmic_clk.common,
>   	&usb_ohci0_clk.common,
> +	&bus_otg_clk.common,
> +	&bus_ehci0_clk.common,
> +	&bus_ohci0_clk.common,
>   	&usb_ohci1_clk.common,
> +	&bus_ehci1_clk.common,
> +	&bus_ohci1_clk.common,
>   	&usb_ref_clk.common,
>   	&usb2_u2_ref_clk.common,
>   	&usb2_suspend_clk.common,
> @@ -1512,24 +1836,40 @@ static struct ccu_common *sun60i_a733_ccu_clks[] = {
>   	&gmac_ptp_clk.common,
>   	&gmac0_phy_clk.common,
>   	&gmac1_phy_clk.common,
> +	&bus_gmac0_clk.common,
> +	&bus_gmac1_clk.common,
>   	&tcon_lcd0_clk.common,
>   	&tcon_lcd1_clk.common,
>   	&tcon_lcd2_clk.common,
> +	&bus_tcon_lcd0_clk.common,
> +	&bus_tcon_lcd1_clk.common,
> +	&bus_tcon_lcd2_clk.common,
>   	&dsi0_clk.common,
>   	&dsi1_clk.common,
> +	&bus_dsi0_clk.common,
> +	&bus_dsi1_clk.common,
>   	&combphy0_clk.common,
>   	&combphy1_clk.common,
> +	&bus_tcon_tv0_clk.common,
> +	&bus_tcon_tv1_clk.common,
>   	&edp_tv_clk.common,
> +	&bus_edp_tv_clk.common,
>   	&hdmi_cec_32k_clk.common,
>   	&hdmi_cec_clk.common,
>   	&hdmi_tv_clk.common,
> +	&bus_hdmi_tv_clk.common,
>   	&hdmi_sfr_clk.common,
>   	&hdmi_esm_clk.common,
> +	&bus_dpss_top0_clk.common,
> +	&bus_dpss_top1_clk.common,
>   	&ledc_clk.common,
> +	&bus_ledc_clk.common,
> +	&bus_dsc_clk.common,
>   	&csi_master0_clk.common,
>   	&csi_master1_clk.common,
>   	&csi_master2_clk.common,
>   	&csi_clk.common,
> +	&bus_csi_clk.common,
>   	&isp_clk.common,
>   	&apb2jtag_clk.common,
>   	&fanout_24M_clk.common,
> @@ -1596,8 +1936,62 @@ static struct clk_hw_onecell_data sun60i_a733_hw_clks = {
>   		[CLK_TRACE]		= &trace_clk.common.hw,
>   		[CLK_GIC]		= &gic_clk.common.hw,
>   		[CLK_CPU_PERI]		= &cpu_peri_clk.common.hw,
> +		[CLK_BUS_ITS_PCIE]	= &bus_its_pcie_clk.common.hw,
>   		[CLK_NSI]		= &nsi_clk.common.hw,
> +		[CLK_BUS_NSI]		= &bus_nsi_clk.common.hw,
>   		[CLK_MBUS]		= &mbus_clk.common.hw,
> +		[CLK_MBUS_IOMMU0_SYS]	= &mbus_iommu0_sys_clk.common.hw,
> +		[CLK_APB_IOMMU0_SYS]	= &apb_iommu0_sys_clk.common.hw,
> +		[CLK_AHB_IOMMU0_SYS]	= &ahb_iommu0_sys_clk.common.hw,
> +		[CLK_BUS_MSI_LITE0]	= &bus_msi_lite0_clk.common.hw,
> +		[CLK_BUS_MSI_LITE1]	= &bus_msi_lite1_clk.common.hw,
> +		[CLK_BUS_MSI_LITE2]	= &bus_msi_lite2_clk.common.hw,
> +		[CLK_MBUS_IOMMU1_SYS]	= &mbus_iommu1_sys_clk.common.hw,
> +		[CLK_APB_IOMMU1_SYS]	= &apb_iommu1_sys_clk.common.hw,
> +		[CLK_AHB_IOMMU1_SYS]	= &ahb_iommu1_sys_clk.common.hw,
> +		[CLK_AHB_VE_DEC]	= &ahb_ve_dec_clk.common.hw,
> +		[CLK_AHB_VE_ENC]	= &ahb_ve_enc_clk.common.hw,
> +		[CLK_AHB_VID_IN]	= &ahb_vid_in_clk.common.hw,
> +		[CLK_AHB_VID_COUT0]	= &ahb_vid_cout0_clk.common.hw,
> +		[CLK_AHB_VID_COUT1]	= &ahb_vid_cout1_clk.common.hw,
> +		[CLK_AHB_DE]		= &ahb_de_clk.common.hw,
> +		[CLK_AHB_NPU]		= &ahb_npu_clk.common.hw,
> +		[CLK_AHB_GPU0]		= &ahb_gpu0_clk.common.hw,
> +		[CLK_AHB_SERDES]	= &ahb_serdes_clk.common.hw,
> +		[CLK_AHB_USB_SYS]	= &ahb_usb_sys_clk.common.hw,
> +		[CLK_AHB_MSI_LITE0]	= &ahb_msi_lite0_clk.common.hw,
> +		[CLK_AHB_STORE]		= &ahb_store_clk.common.hw,
> +		[CLK_AHB_CPUS]		= &ahb_cpus_clk.common.hw,
> +		[CLK_MBUS_IOMMU0]	= &mbus_iommu0_clk.common.hw,
> +		[CLK_MBUS_IOMMU1]	= &mbus_iommu1_clk.common.hw,
> +		[CLK_MBUS_DESYS]	= &mbus_desys_clk.common.hw,
> +		[CLK_MBUS_VE_ENC_GATE]	= &mbus_ve_enc_gate_clk.common.hw,
> +		[CLK_MBUS_VE_DEC_GATE]	= &mbus_ve_dec_gate_clk.common.hw,
> +		[CLK_MBUS_GPU0]		= &mbus_gpu0_clk.common.hw,
> +		[CLK_MBUS_NPU]		= &mbus_npu_clk.common.hw,
> +		[CLK_MBUS_VID_IN]	= &mbus_vid_in_clk.common.hw,
> +		[CLK_MBUS_SERDES]	= &mbus_serdes_clk.common.hw,
> +		[CLK_MBUS_MSI_LITE0]	= &mbus_msi_lite0_clk.common.hw,
> +		[CLK_MBUS_STORE]	= &mbus_store_clk.common.hw,
> +		[CLK_MBUS_MSI_LITE2]	= &mbus_msi_lite2_clk.common.hw,
> +		[CLK_MBUS_DMA0]		= &mbus_dma0_clk.common.hw,
> +		[CLK_MBUS_VE_ENC]	= &mbus_ve_enc_clk.common.hw,
> +		[CLK_MBUS_CE]		= &mbus_ce_clk.common.hw,
> +		[CLK_MBUS_DMA1]		= &mbus_dma1_clk.common.hw,
> +		[CLK_MBUS_NAND]		= &mbus_nand_clk.common.hw,
> +		[CLK_MBUS_CSI]		= &mbus_csi_clk.common.hw,
> +		[CLK_MBUS_ISP]		= &mbus_isp_clk.common.hw,
> +		[CLK_MBUS_GMAC0]	= &mbus_gmac0_clk.common.hw,
> +		[CLK_MBUS_GMAC1]	= &mbus_gmac1_clk.common.hw,
> +		[CLK_MBUS_VE_DEC]	= &mbus_ve_dec_clk.common.hw,
> +		[CLK_BUS_DMA0]		= &bus_dma0_clk.common.hw,
> +		[CLK_BUS_DMA1]		= &bus_dma1_clk.common.hw,
> +		[CLK_BUS_SPINLOCK]	= &bus_spinlock_clk.common.hw,
> +		[CLK_BUS_MSGBOX]	= &bus_msgbox_clk.common.hw,
> +		[CLK_BUS_PWM0]		= &bus_pwm0_clk.common.hw,
> +		[CLK_BUS_PWM1]		= &bus_pwm1_clk.common.hw,
> +		[CLK_BUS_DBG]		= &bus_dbg_clk.common.hw,
> +		[CLK_BUS_SYSDAP]	= &bus_sysdap_clk.common.hw,
>   		[CLK_TIMER0]		= &timer0_clk.common.hw,
>   		[CLK_TIMER1]		= &timer1_clk.common.hw,
>   		[CLK_TIMER2]		= &timer2_clk.common.hw,
> @@ -1608,48 +2002,111 @@ static struct clk_hw_onecell_data sun60i_a733_hw_clks = {
>   		[CLK_TIMER7]		= &timer7_clk.common.hw,
>   		[CLK_TIMER8]		= &timer8_clk.common.hw,
>   		[CLK_TIMER9]		= &timer9_clk.common.hw,
> +		[CLK_BUS_TIMER]		= &bus_timer_clk.common.hw,
>   		[CLK_AVS]		= &avs_clk.common.hw,
>   		[CLK_DE]		= &de_clk.common.hw,
> +		[CLK_BUS_DE]		= &bus_de_clk.common.hw,
>   		[CLK_DI]		= &di_clk.common.hw,
> +		[CLK_BUS_DI]		= &bus_di_clk.common.hw,
>   		[CLK_G2D]		= &g2d_clk.common.hw,
> +		[CLK_BUS_G2D]		= &bus_g2d_clk.common.hw,
>   		[CLK_EINK]		= &eink_clk.common.hw,
>   		[CLK_EINK_PANEL]	= &eink_panel_clk.common.hw,
> +		[CLK_BUS_EINK]		= &bus_eink_clk.common.hw,
>   		[CLK_VE_ENC]		= &ve_enc_clk.common.hw,
>   		[CLK_VE_DEC]		= &ve_dec_clk.common.hw,
> +		[CLK_BUS_VE_ENC]	= &bus_ve_enc_clk.common.hw,
> +		[CLK_BUS_VE_DEC]	= &bus_ve_dec_clk.common.hw,
>   		[CLK_CE]		= &ce_clk.common.hw,
> +		[CLK_BUS_CE]		= &bus_ce_clk.common.hw,
> +		[CLK_BUS_CE_SYS]	= &bus_ce_sys_clk.common.hw,
>   		[CLK_NPU]		= &npu_clk.common.hw,
> +		[CLK_BUS_NPU]		= &bus_npu_clk.common.hw,
>   		[CLK_GPU]		= &gpu_clk.common.hw,
> +		[CLK_BUS_GPU]		= &bus_gpu_clk.common.hw,
>   		[CLK_DRAM]		= &dram_clk.common.hw,
> +		[CLK_BUS_DRAM]		= &bus_dram_clk.common.hw,
>   		[CLK_NAND0]		= &nand0_clk.common.hw,
>   		[CLK_NAND1]		= &nand1_clk.common.hw,
> +		[CLK_BUS_NAND]		= &bus_nand_clk.common.hw,
>   		[CLK_MMC0]		= &mmc0_clk.common.hw,
>   		[CLK_MMC1]		= &mmc1_clk.common.hw,
>   		[CLK_MMC2]		= &mmc2_clk.common.hw,
>   		[CLK_MMC3]		= &mmc3_clk.common.hw,
> +		[CLK_BUS_MMC0]		= &bus_mmc0_clk.common.hw,
> +		[CLK_BUS_MMC1]		= &bus_mmc1_clk.common.hw,
> +		[CLK_BUS_MMC2]		= &bus_mmc2_clk.common.hw,
> +		[CLK_BUS_MMC3]		= &bus_mmc3_clk.common.hw,
>   		[CLK_UFS_AXI]		= &ufs_axi_clk.common.hw,
>   		[CLK_UFS_CFG]		= &ufs_cfg_clk.common.hw,
> +		[CLK_BUS_UFS]		= &bus_ufs_clk.common.hw,
> +		[CLK_BUS_UART0]		= &bus_uart0_clk.common.hw,
> +		[CLK_BUS_UART1]		= &bus_uart1_clk.common.hw,
> +		[CLK_BUS_UART2]		= &bus_uart2_clk.common.hw,
> +		[CLK_BUS_UART3]		= &bus_uart3_clk.common.hw,
> +		[CLK_BUS_UART4]		= &bus_uart4_clk.common.hw,
> +		[CLK_BUS_UART5]		= &bus_uart5_clk.common.hw,
> +		[CLK_BUS_UART6]		= &bus_uart6_clk.common.hw,
> +		[CLK_BUS_I2C0]		= &bus_i2c0_clk.common.hw,
> +		[CLK_BUS_I2C1]		= &bus_i2c1_clk.common.hw,
> +		[CLK_BUS_I2C2]		= &bus_i2c2_clk.common.hw,
> +		[CLK_BUS_I2C3]		= &bus_i2c3_clk.common.hw,
> +		[CLK_BUS_I2C4]		= &bus_i2c4_clk.common.hw,
> +		[CLK_BUS_I2C5]		= &bus_i2c5_clk.common.hw,
> +		[CLK_BUS_I2C6]		= &bus_i2c6_clk.common.hw,
> +		[CLK_BUS_I2C7]		= &bus_i2c7_clk.common.hw,
> +		[CLK_BUS_I2C8]		= &bus_i2c8_clk.common.hw,
> +		[CLK_BUS_I2C9]		= &bus_i2c9_clk.common.hw,
> +		[CLK_BUS_I2C10]		= &bus_i2c10_clk.common.hw,
> +		[CLK_BUS_I2C11]		= &bus_i2c11_clk.common.hw,
> +		[CLK_BUS_I2C12]		= &bus_i2c12_clk.common.hw,
>   		[CLK_SPI0]		= &spi0_clk.common.hw,
>   		[CLK_SPI1]		= &spi1_clk.common.hw,
>   		[CLK_SPI2]		= &spi2_clk.common.hw,
>   		[CLK_SPI3]		= &spi3_clk.common.hw,
>   		[CLK_SPI4]		= &spi4_clk.common.hw,
> +		[CLK_BUS_SPI0]		= &bus_spi0_clk.common.hw,
> +		[CLK_BUS_SPI1]		= &bus_spi1_clk.common.hw,
> +		[CLK_BUS_SPI2]		= &bus_spi2_clk.common.hw,
> +		[CLK_BUS_SPI3]		= &bus_spi3_clk.common.hw,
> +		[CLK_BUS_SPI4]		= &bus_spi4_clk.common.hw,
>   		[CLK_SPIF]		= &spif_clk.common.hw,
> +		[CLK_BUS_SPIF]		= &bus_spif_clk.common.hw,
>   		[CLK_GPADC]		= &gpadc_clk.common.hw,
> +		[CLK_BUS_GPADC]		= &bus_gpadc_clk.common.hw,
> +		[CLK_BUS_THS]		= &bus_ths_clk.common.hw,
>   		[CLK_IRRX]		= &irrx_clk.common.hw,
> +		[CLK_BUS_IRRX]		= &bus_irrx_clk.common.hw,
>   		[CLK_IRTX]		= &irtx_clk.common.hw,
> +		[CLK_BUS_IRTX]		= &bus_irtx_clk.common.hw,
> +		[CLK_BUS_LRADC]		= &bus_lradc_clk.common.hw,
>   		[CLK_SGPIO]		= &sgpio_clk.common.hw,
> +		[CLK_BUS_SGPIO]		= &bus_sgpio_clk.common.hw,
>   		[CLK_LPC]		= &lpc_clk.common.hw,
> +		[CLK_BUS_LPC]		= &bus_lpc_clk.common.hw,
>   		[CLK_I2SPCM0]		= &i2spcm0_clk.common.hw,
>   		[CLK_I2SPCM1]		= &i2spcm1_clk.common.hw,
>   		[CLK_I2SPCM2]		= &i2spcm2_clk.common.hw,
>   		[CLK_I2SPCM3]		= &i2spcm3_clk.common.hw,
>   		[CLK_I2SPCM4]		= &i2spcm4_clk.common.hw,
> +		[CLK_BUS_I2SPCM0]	= &bus_i2spcm0_clk.common.hw,
> +		[CLK_BUS_I2SPCM1]	= &bus_i2spcm1_clk.common.hw,
> +		[CLK_BUS_I2SPCM2]	= &bus_i2spcm2_clk.common.hw,
> +		[CLK_BUS_I2SPCM3]	= &bus_i2spcm3_clk.common.hw,
> +		[CLK_BUS_I2SPCM4]	= &bus_i2spcm4_clk.common.hw,
>   		[CLK_I2SPCM2_ASRC]	= &i2spcm2_asrc_clk.common.hw,
>   		[CLK_OWA_TX]		= &owa_tx_clk.common.hw,
>   		[CLK_OWA_RX]		= &owa_rx_clk.common.hw,
> +		[CLK_BUS_OWA]		= &bus_owa_clk.common.hw,
>   		[CLK_DMIC]		= &dmic_clk.common.hw,
> +		[CLK_BUS_DMIC]		= &bus_dmic_clk.common.hw,
>   		[CLK_USB_OHCI0]		= &usb_ohci0_clk.common.hw,
> +		[CLK_BUS_OTG]		= &bus_otg_clk.common.hw,
> +		[CLK_BUS_EHCI0]		= &bus_ehci0_clk.common.hw,
> +		[CLK_BUS_OHCI0]		= &bus_ohci0_clk.common.hw,
>   		[CLK_USB_OHCI1]		= &usb_ohci1_clk.common.hw,
> +		[CLK_BUS_EHCI1]		= &bus_ehci1_clk.common.hw,
> +		[CLK_BUS_OHCI1]		= &bus_ohci1_clk.common.hw,
>   		[CLK_USB_REF]		= &usb_ref_clk.common.hw,
>   		[CLK_USB2_U2_REF]	= &usb2_u2_ref_clk.common.hw,
>   		[CLK_USB2_SUSPEND]	= &usb2_suspend_clk.common.hw,
> @@ -1662,24 +2119,40 @@ static struct clk_hw_onecell_data sun60i_a733_hw_clks = {
>   		[CLK_GMAC_PTP]		= &gmac_ptp_clk.common.hw,
>   		[CLK_GMAC0_PHY]		= &gmac0_phy_clk.common.hw,
>   		[CLK_GMAC1_PHY]		= &gmac1_phy_clk.common.hw,
> +		[CLK_BUS_GMAC0]		= &bus_gmac0_clk.common.hw,
> +		[CLK_BUS_GMAC1]		= &bus_gmac1_clk.common.hw,
>   		[CLK_TCON_LCD0]		= &tcon_lcd0_clk.common.hw,
>   		[CLK_TCON_LCD1]		= &tcon_lcd1_clk.common.hw,
>   		[CLK_TCON_LCD2]		= &tcon_lcd2_clk.common.hw,
> +		[CLK_BUS_TCON_LCD0]	= &bus_tcon_lcd0_clk.common.hw,
> +		[CLK_BUS_TCON_LCD1]	= &bus_tcon_lcd1_clk.common.hw,
> +		[CLK_BUS_TCON_LCD2]	= &bus_tcon_lcd2_clk.common.hw,
>   		[CLK_DSI0]		= &dsi0_clk.common.hw,
>   		[CLK_DSI1]		= &dsi1_clk.common.hw,
> +		[CLK_BUS_DSI0]		= &bus_dsi0_clk.common.hw,
> +		[CLK_BUS_DSI1]		= &bus_dsi1_clk.common.hw,
>   		[CLK_COMBPHY0]		= &combphy0_clk.common.hw,
>   		[CLK_COMBPHY1]		= &combphy1_clk.common.hw,
> +		[CLK_BUS_TCON_TV0]	= &bus_tcon_tv0_clk.common.hw,
> +		[CLK_BUS_TCON_TV1]	= &bus_tcon_tv1_clk.common.hw,
>   		[CLK_EDP_TV]		= &edp_tv_clk.common.hw,
> +		[CLK_BUS_EDP_TV]	= &bus_edp_tv_clk.common.hw,
>   		[CLK_HDMI_CEC_32K]	= &hdmi_cec_32k_clk.common.hw,
>   		[CLK_HDMI_CEC]		= &hdmi_cec_clk.common.hw,
>   		[CLK_HDMI_TV]		= &hdmi_tv_clk.common.hw,
> +		[CLK_BUS_HDMI_TV]	= &bus_hdmi_tv_clk.common.hw,
>   		[CLK_HDMI_SFR]		= &hdmi_sfr_clk.common.hw,
>   		[CLK_HDMI_ESM]		= &hdmi_esm_clk.common.hw,
> +		[CLK_BUS_DPSS_TOP0]	= &bus_dpss_top0_clk.common.hw,
> +		[CLK_BUS_DPSS_TOP1]	= &bus_dpss_top1_clk.common.hw,
>   		[CLK_LEDC]		= &ledc_clk.common.hw,
> +		[CLK_BUS_LEDC]		= &bus_ledc_clk.common.hw,
> +		[CLK_BUS_DSC]		= &bus_dsc_clk.common.hw,
>   		[CLK_CSI_MASTER0]	= &csi_master0_clk.common.hw,
>   		[CLK_CSI_MASTER1]	= &csi_master1_clk.common.hw,
>   		[CLK_CSI_MASTER2]	= &csi_master2_clk.common.hw,
>   		[CLK_CSI]		= &csi_clk.common.hw,
> +		[CLK_BUS_CSI]		= &bus_csi_clk.common.hw,
>   		[CLK_ISP]		= &isp_clk.common.hw,
>   		[CLK_APB2JTAG]		= &apb2jtag_clk.common.hw,
>   		[CLK_FANOUT_24M]	= &fanout_24M_clk.common.hw,
> 




* Re: [PATCH v3 1/4] dt-bindings: input: adc-keys: allow all input properties
From: Rob Herring (Arm) @ 2026-04-15 22:19 UTC (permalink / raw)
  To: Nicolas Frattaroli
  Cc: Heiko Stuebner, kernel, Krzysztof Kozlowski, linux-kernel,
	linux-rockchip, linux-input, devicetree, Dmitry Torokhov,
	Conor Dooley, Alexandre Belloni, linux-arm-kernel,
	Krzysztof Kozlowski
In-Reply-To: <20260408-rock4d-audio-v3-1-49e43c3c2a68@collabora.com>


On Wed, 08 Apr 2026 19:49:39 +0200, Nicolas Frattaroli wrote:
> adc-keys, unlike gpio-keys, does not allow linux,input-type as a valid
> property. This makes it impossible to model devices that have ADC inputs
> that should generate switch events.
> 
> Replace "additionalProperties" with "unevaluatedProperties", so that any
> of the properties in the referenced input.yaml schema can be used.
> Consequently, throw out the explicit mention of "linux,code" and extend
> the example to verify.
> 
> Suggested-by: Krzysztof Kozlowski <krzk@kernel.org>
> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> ---
>  Documentation/devicetree/bindings/input/adc-keys.yaml | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 

Reviewed-by: Rob Herring (Arm) <robh@kernel.org>




* Re: [PATCH v6 1/3] dt-bindings: pinctrl: Add aspeed,ast2700-soc0-pinctrl
From: Rob Herring (Arm) @ 2026-04-15 22:31 UTC (permalink / raw)
  To: Billy Tsai
  Cc: Linus Walleij, Ryan Chen, Joel Stanley, Conor Dooley, openbmc,
	linux-aspeed, linux-gpio, linux-clk, Krzysztof Kozlowski,
	devicetree, linux-arm-kernel, Andrew Jeffery, Lee Jones,
	linux-kernel, Bartosz Golaszewski, Andrew Jeffery
In-Reply-To: <20260414-upstream_pinctrl-v6-1-709f2127da33@aspeedtech.com>


On Tue, 14 Apr 2026 17:38:59 +0800, Billy Tsai wrote:
> Add a device tree binding for the pin controller found in the
> ASPEED AST2700 SoC0.
> 
> The controller manages various peripheral functions such as eMMC, USB,
> VGA DDC, JTAG, and PCIe root complex signals.
> 
> Describe the AST2700 SoC0 pin controller using standard pin multiplexing
> and configuration properties.
> 
> Signed-off-by: Billy Tsai <billy_tsai@aspeedtech.com>
> ---
>  .../pinctrl/aspeed,ast2700-soc0-pinctrl.yaml       | 170 +++++++++++++++++++++
>  1 file changed, 170 insertions(+)
> 

My bot found errors running 'make dt_binding_check' on your patch:

yamllint warnings/errors:

dtschema/dtc warnings/errors:
/builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/pinctrl/aspeed,ast2700-soc0-pinctrl.yaml: patternProperties:-state$:allOf:2: 'then' is a dependency of 'if'
	hint: Keywords must be a subset of known json-schema keywords
	from schema $id: http://devicetree.org/meta-schemas/keywords.yaml
/builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/pinctrl/aspeed,ast2700-soc0-pinctrl.yaml: patternProperties:-state$:allOf:2: 'then' is a dependency of 'else'
	hint: Keywords must be a subset of known json-schema keywords
	from schema $id: http://devicetree.org/meta-schemas/keywords.yaml

doc reference errors (make refcheckdocs):

See https://patchwork.kernel.org/project/devicetree/patch/20260414-upstream_pinctrl-v6-1-709f2127da33@aspeedtech.com

The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.




* Re: [PATCH v6 2/3] dt-bindings: mfd: aspeed,ast2x00-scu: Describe AST2700 SCU0
From: Rob Herring (Arm) @ 2026-04-15 22:31 UTC (permalink / raw)
  To: Billy Tsai
  Cc: linux-arm-kernel, Andrew Jeffery, Bartosz Golaszewski, Ryan Chen,
	Lee Jones, Andrew Jeffery, Linus Walleij, linux-kernel,
	Conor Dooley, devicetree, linux-aspeed, Krzysztof Kozlowski,
	linux-gpio, Joel Stanley, linux-clk, openbmc
In-Reply-To: <20260414-upstream_pinctrl-v6-2-709f2127da33@aspeedtech.com>


On Tue, 14 Apr 2026 17:39:00 +0800, Billy Tsai wrote:
> AST2700 consists of two interconnected SoC instances, each with its own
> System Control Unit (SCU). The SCU0 provides pin control, interrupt
> controllers, clocks, resets, and address-space mappings for the
> Secondary and Tertiary Service Processors (SSP and TSP).
> 
> Describe the SSP/TSP address mappings using the standard
> memory-region and memory-region-names properties.
> 
> Disallow legacy child nodes that are not present on AST2700, including
> p2a-control and smp-memram. The latter is unnecessary as software can
> access the scratch registers via the SCU syscon.
> 
> Also allow the AST2700 SoC0 pin controller to be described as a child
> node of the SCU0, and add an example illustrating the SCU0 layout,
> including reserved-memory, interrupt controllers, and pinctrl.
> 
> Signed-off-by: Billy Tsai <billy_tsai@aspeedtech.com>
> ---
>  .../bindings/mfd/aspeed,ast2x00-scu.yaml           | 112 +++++++++++++++++++++
>  1 file changed, 112 insertions(+)
> 

My bot found errors running 'make dt_binding_check' on your patch:

yamllint warnings/errors:

dtschema/dtc warnings/errors:
/builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/mfd/aspeed,ast2x00-scu.yaml: allOf:1: 'then' is a dependency of 'if'
	hint: Keywords must be a subset of known json-schema keywords
	from schema $id: http://devicetree.org/meta-schemas/keywords.yaml
/builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/mfd/aspeed,ast2x00-scu.yaml: allOf:1: 'then' is a dependency of 'else'
	hint: Keywords must be a subset of known json-schema keywords
	from schema $id: http://devicetree.org/meta-schemas/keywords.yaml

doc reference errors (make refcheckdocs):

See https://patchwork.kernel.org/project/devicetree/patch/20260414-upstream_pinctrl-v6-2-709f2127da33@aspeedtech.com

The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.




* Re: [PATCH 1/1] KVM: arm64: nv: Avoid full shadow s2 unmap
From: Wei-Lin Chang @ 2026-04-15 23:05 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvmarm, linux-kernel, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon
In-Reply-To: <86eckg39eo.wl-maz@kernel.org>

On Wed, Apr 15, 2026 at 09:38:55AM +0100, Marc Zyngier wrote:
> On Sat, 11 Apr 2026 13:50:24 +0100,
> Wei-Lin Chang <weilin.chang@arm.com> wrote:
> > 
> > Currently we are forced to fully unmap all shadow stage-2 for a VM when
> > unmapping a page from the canonical stage-2, for example during an MMU
> > notifier call. This is because we are not tracking what canonical IPA
> > are mapped in the shadow stage-2 page tables hence there is no way to
> > know what to unmap.
> > 
> > Create a per kvm_s2_mmu maple tree to track canonical IPA range ->
> > nested IPA range, so that it is possible to partially unmap shadow
> > stage-2 when a canonical IPA range is unmapped. The algorithm is simple
> > and conservative:
> > 
> > At each shadow stage-2 map, insert the nested IPA range into the maple
> > tree, with the canonical IPA range as the key. If the canonical IPA
> > range doesn't overlap with existing ranges in the tree, insert as is,
> > and a reverse mapping for this range is established. But if the
> > canonical IPA range overlaps with any existing ranges in the tree,
> > create a new range that spans all the overlapping ranges including the
> > input range and replace those existing ranges. At the same time, mark
> > this new spanning canonical IPA range as "polluted" indicating we lost
> > track of the nested IPA ranges that map to this canonical IPA range.
> > 
> > The maple tree's 64 bit entry is enough to store the nested IPA and
> > polluted status (stored as a bit called UNKNOWN_IPA), therefore besides
> > maple tree's internal operation, memory allocation is avoided.
> > 
> > Example:
> > |||| means existing range, ---- means empty range
> > 
> > input:            $$$$$$$$$$$$$$$$$$$$$$$$$$
> > tree:  --||||-----|||||||---------||||||||||-----------
> > 
> > insert spanning range and replace overlapping ones:
> >        --||||-----||||||||||||||||||||||||||-----------
> >                   ^^^^^^^^polluted!^^^^^^^^^
> 
> I think you should stick to a single terminology. It is either
> "polluted", or "unknown IPA". My preference goes to the latter, as the
> former is not very descriptive in this context.

Sure, I agree.
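
For readers following the thread, the insertion rule quoted above can be
sketched in plain userspace C. This is only an illustration under loud
assumptions: a small fixed array stands in for the maple tree, and every
name here (struct revmap, revmap_record(), REVMAP_MAX) is hypothetical
and not taken from the patch. It shows the "span and mark unknown"
behaviour, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REVMAP_MAX 16	/* hypothetical capacity of the toy table */

struct revmap_range {
	uint64_t start;		/* canonical IPA, inclusive */
	uint64_t end;		/* canonical IPA, inclusive */
	uint64_t nested_ipa;	/* meaningful only if !unknown_ipa */
	bool unknown_ipa;	/* set once overlapping ranges were merged */
};

struct revmap {
	struct revmap_range r[REVMAP_MAX];	/* disjoint, unsorted */
	size_t nr;
};

/* Record canonical [start, end] -> nested_ipa, merging on overlap. */
static bool revmap_record(struct revmap *m, uint64_t start, uint64_t end,
			  uint64_t nested_ipa)
{
	struct revmap_range span = { start, end, nested_ipa, false };
	size_t i, out = 0;

	assert(start <= end);

	for (i = 0; i < m->nr; i++) {
		struct revmap_range *cur = &m->r[i];

		if (cur->end < span.start || cur->start > span.end) {
			m->r[out++] = *cur;	/* no overlap, keep as is */
			continue;
		}
		/* Overlap: grow the spanning range, forget the nested IPA. */
		if (cur->start < span.start)
			span.start = cur->start;
		if (cur->end > span.end)
			span.end = cur->end;
		span.unknown_ipa = true;
	}

	if (out == REVMAP_MAX)
		return false;	/* toy table full, caller must fall back */

	m->r[out++] = span;
	m->nr = out;
	return true;
}
```

With three disjoint ranges recorded, a fourth input that touches two of
them leaves two entries: the untouched one, and one spanning entry with
unknown_ipa set.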

> 
> > 
> > With the reverse map created, when a canonical IPA range gets unmapped,
> > walk each s2 mmu's maple tree, find the affected canonical IPA
> > ranges, and act based on their polluted status:
> > 
> > polluted -> fall back and fully invalidate the current shadow stage-2,
> >             also clear the tree
> > not polluted -> unmap the nested IPA range, and remove the reverse map
> >                 entry
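
The unmap-time decision described just above can be sketched the same
way. Again a hedged userspace illustration: struct revmap_entry,
enum revmap_action and revmap_unmap_action() are made-up names, and the
sketch assumes a non-polluted entry is unmapped as a whole, as the
commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum revmap_action {
	REVMAP_SKIP,		/* entry not affected by the unmap */
	REVMAP_UNMAP_RANGE,	/* unmap only the recorded nested IPA range */
	REVMAP_UNMAP_ALL,	/* nested IPA unknown: drop the whole shadow S2 */
};

struct revmap_entry {
	uint64_t start;		/* canonical IPA, inclusive */
	uint64_t end;		/* canonical IPA, inclusive */
	uint64_t nested_ipa;	/* valid only if !unknown_ipa */
	bool unknown_ipa;
};

/*
 * Decide what to do with one reverse-map entry when the canonical range
 * [start, end] is unmapped. For REVMAP_UNMAP_RANGE the nested range to
 * unmap is returned through nested_start/nested_size.
 */
static enum revmap_action
revmap_unmap_action(const struct revmap_entry *e, uint64_t start,
		    uint64_t end, uint64_t *nested_start,
		    uint64_t *nested_size)
{
	if (e->end < start || e->start > end)
		return REVMAP_SKIP;

	if (e->unknown_ipa)
		return REVMAP_UNMAP_ALL;

	*nested_start = e->nested_ipa;
	*nested_size = e->end - e->start + 1;
	return REVMAP_UNMAP_RANGE;
}
```
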
> > 
> > Suggested-by: Marc Zyngier <maz@kernel.org>
> > Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h   |   4 +
> >  arch/arm64/include/asm/kvm_nested.h |   4 +
> >  arch/arm64/kvm/mmu.c                |  30 ++++--
> >  arch/arm64/kvm/nested.c             | 147 +++++++++++++++++++++++++++-
> >  4 files changed, 177 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 851f6171751c..a97bd461c1e1 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -217,6 +217,10 @@ struct kvm_s2_mmu {
> >  	 */
> >  	bool	nested_stage2_enabled;
> >  
> > +	/* canonical IPA to nested IPA range lookup */
> > +	struct maple_tree nested_revmap_mt;
> > +	bool	nested_revmap_broken;
> > +
> 
> Consider moving this boolean next to the other ones so that you don't
> create too many holes in the kvm_s2_mmu structure (use pahole to find out).
> 
> But I have some misgivings about the way things are structured
> here. Only NV needs a revmap, yet this is present irrespective of the
> nature of the VM and bloats the data structure a bit.
> 
> My naive approach would have been to only keep a pointer to the
> revmap, and make that pointer NULL when the tree is "broken", and
> freed under RCU if the context isn't the correct one.

Can you explain what you mean by "if the context isn't the correct one"?
If this refers to when selecting a specific kvm_s2_mmu instance for
another context, then IIUC refcnt would already be 0 and there would be
no other user of the tree.

However, I do see that RCU can be used for concurrent accesses to the
tree from parallel s2 faults: when one fault's revmap store fails, RCU
defers freeing the tree until the other fault finishes using it.

> 
> This would have multiple benefits: no large-ish structure embedded in
> the s2_mmu structure, no extra boolean to indicate an error condition,
> memory reclaimed earlier.

Yes I see.

> 
> >  #ifdef CONFIG_PTDUMP_STAGE2_DEBUGFS
> >  	struct dentry *shadow_pt_debugfs_dentry;
> >  #endif
> > diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h
> > index 091544e6af44..f039220e87a6 100644
> > --- a/arch/arm64/include/asm/kvm_nested.h
> > +++ b/arch/arm64/include/asm/kvm_nested.h
> > @@ -76,6 +76,8 @@ extern void kvm_s2_mmu_iterate_by_vmid(struct kvm *kvm, u16 vmid,
> >  				       const union tlbi_info *info,
> >  				       void (*)(struct kvm_s2_mmu *,
> >  						const union tlbi_info *));
> > +extern void kvm_record_nested_revmap(gpa_t gpa, struct kvm_s2_mmu *mmu,
> > +				    gpa_t fault_gpa, size_t map_size);
> >  extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu);
> >  extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu);
> >  
> > @@ -164,6 +166,8 @@ extern int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu,
> >  				    struct kvm_s2_trans *trans);
> >  extern int kvm_inject_s2_fault(struct kvm_vcpu *vcpu, u64 esr_el2);
> >  extern void kvm_nested_s2_wp(struct kvm *kvm);
> > +extern void kvm_unmap_gfn_range_nested(struct kvm *kvm, gpa_t gpa, size_t size,
> > +				       bool may_block);
> >  extern void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block);
> >  extern void kvm_nested_s2_flush(struct kvm *kvm);
> >  
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index d089c107d9b7..4c9b9cf6dc43 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -5,6 +5,7 @@
> >   */
> >  
> >  #include <linux/acpi.h>
> > +#include <linux/maple_tree.h>
> >  #include <linux/mman.h>
> >  #include <linux/kvm_host.h>
> >  #include <linux/io.h>
> > @@ -1099,6 +1100,7 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
> >  {
> >  	struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
> >  	struct kvm_pgtable *pgt = NULL;
> > +	struct maple_tree *mt = &mmu->nested_revmap_mt;
> >  
> >  	write_lock(&kvm->mmu_lock);
> >  	pgt = mmu->pgt;
> > @@ -1108,8 +1110,11 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
> >  		free_percpu(mmu->last_vcpu_ran);
> >  	}
> >  
> > -	if (kvm_is_nested_s2_mmu(kvm, mmu))
> > +	if (kvm_is_nested_s2_mmu(kvm, mmu)) {
> > +		if (!mtree_empty(mt))
> > +			mtree_destroy(mt);
> >  		kvm_init_nested_s2_mmu(mmu);
> > +	}
> >  
> >  	write_unlock(&kvm->mmu_lock);
> >  
> > @@ -1631,6 +1636,10 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
> >  		goto out_unlock;
> >  	}
> >  
> > +	if (s2fd->nested)
> > +		kvm_record_nested_revmap(gfn << PAGE_SHIFT, pgt->mmu,
> > +					 s2fd->fault_ipa, PAGE_SIZE);
> > +
> >  	ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, s2fd->fault_ipa, PAGE_SIZE,
> >  						 __pfn_to_phys(pfn), prot,
> >  						 memcache, flags);
> > @@ -2031,6 +2040,13 @@ static int kvm_s2_fault_map(const struct kvm_s2_fault_desc *s2fd,
> >  		ret = KVM_PGT_FN(kvm_pgtable_stage2_relax_perms)(pgt, gfn_to_gpa(gfn),
> >  								 prot, flags);
> >  	} else {
> > +		if (s2fd->nested) {
> > +			phys_addr_t ipa = gfn_to_gpa(get_canonical_gfn(s2fd, s2vi));
> > +
> > +			ipa &= ~(mapping_size - 1);
> 
> I guess it'd be worth adding a helper for this instead of duplicating
> the existing code.

Ack.
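
One possible shape for such a helper, as a userspace sketch only. The
name ipa_align_down() is hypothetical, and the sketch assumes the
mapping size is a power of two, which is what the mask in the quoted
hunk relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Round an IPA down to the start of its mapping_size-aligned block. */
static inline uint64_t ipa_align_down(uint64_t ipa, uint64_t mapping_size)
{
	/* The mask trick is only valid for non-zero powers of two. */
	assert(mapping_size && !(mapping_size & (mapping_size - 1)));

	return ipa & ~(mapping_size - 1);
}
```

For example, aligning 0x40201234 to a 2MiB block yields 0x40200000, and
to a 4KiB page yields 0x40201000.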

> 
> > +			kvm_record_nested_revmap(ipa, pgt->mmu, gfn_to_gpa(gfn),
> > +						 mapping_size);
> 
> This worries me a bit, see below.
> 
> > +		}
> >  		ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, gfn_to_gpa(gfn), mapping_size,
> >  							 __pfn_to_phys(pfn), prot,
> >  							 memcache, flags);
> > @@ -2388,14 +2404,16 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
> >  
> >  bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
> >  {
> > +	gpa_t gpa = range->start << PAGE_SHIFT;
> > +	size_t size = (range->end - range->start) << PAGE_SHIFT;
> > +	bool may_block = range->may_block;
> > +
> >  	if (!kvm->arch.mmu.pgt || kvm_vm_is_protected(kvm))
> >  		return false;
> >  
> > -	__unmap_stage2_range(&kvm->arch.mmu, range->start << PAGE_SHIFT,
> > -			     (range->end - range->start) << PAGE_SHIFT,
> > -			     range->may_block);
> > +	__unmap_stage2_range(&kvm->arch.mmu, gpa, size, may_block);
> > +	kvm_unmap_gfn_range_nested(kvm, gpa, size, may_block);
> >  
> > -	kvm_nested_s2_unmap(kvm, range->may_block);
> >  	return false;
> >  }
> >  
> > @@ -2673,7 +2691,7 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
> >  
> >  	write_lock(&kvm->mmu_lock);
> >  	kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, size, true);
> > -	kvm_nested_s2_unmap(kvm, true);
> > +	kvm_unmap_gfn_range_nested(kvm, gpa, size, true);
> >  	write_unlock(&kvm->mmu_lock);
> >  }
> >  
> > diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> > index 883b6c1008fb..c9ebe969b453 100644
> > --- a/arch/arm64/kvm/nested.c
> > +++ b/arch/arm64/kvm/nested.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/bitfield.h>
> >  #include <linux/kvm.h>
> >  #include <linux/kvm_host.h>
> > +#include <linux/maple_tree.h>
> >  
> >  #include <asm/fixmap.h>
> >  #include <asm/kvm_arm.h>
> > @@ -43,6 +44,19 @@ struct vncr_tlb {
> >   */
> >  #define S2_MMU_PER_VCPU		2
> >  
> > +/*
> > + * Per shadow S2 reverse map (IPA -> nested IPA range) maple tree payload
> > + * layout:
> > + *
> > + * bit 63: valid, 1 for non-polluted entries, prevents the case where the
> > + *         nested IPA is 0 and turns the whole value to 0
> > + * bits 55-12: nested IPA bits 55-12
> > + * bit 0: polluted, 1 for polluted, 0 for not
> > + */
> > +#define VALID_ENTRY		BIT(63)
> > +#define NESTED_IPA_MASK		GENMASK_ULL(55, 12)
> > +#define UNKNOWN_IPA		BIT(0)
> > +
> 
> This only works because you are using the "advanced" API, right?
> Otherwise, you'd be losing the high bit. It'd be good to add a comment
> so that people keep that in mind.

Sorry, I can't find any relationship between the advanced API and the
topmost bit of the maple tree value. What am I missing?
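
For reference, the payload layout described in the comment above can be
modelled in userspace as follows. This is only a sketch: the revmap_*()
helper names are hypothetical, and unlike the patch (which ORs fault_ipa
in directly) it asserts that the nested IPA fits in bits 55-12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VALID_ENTRY	(1ULL << 63)
/* bits 55..12, equivalent to GENMASK_ULL(55, 12) */
#define NESTED_IPA_MASK	(((1ULL << 56) - 1) & ~((1ULL << 12) - 1))
#define UNKNOWN_IPA	(1ULL << 0)

/* Encode a non-polluted entry; bit 63 keeps the value non-zero. */
static uint64_t revmap_encode(uint64_t nested_ipa)
{
	assert((nested_ipa & ~NESTED_IPA_MASK) == 0);

	return VALID_ENTRY | nested_ipa;
}

/* A polluted entry carries no usable nested IPA. */
static uint64_t revmap_encode_polluted(void)
{
	return UNKNOWN_IPA;
}

static bool revmap_is_polluted(uint64_t entry)
{
	return (entry & UNKNOWN_IPA) != 0;
}

static uint64_t revmap_nested_ipa(uint64_t entry)
{
	return entry & NESTED_IPA_MASK;
}
```

With this encoding a nested IPA of 0 still produces a non-NULL entry,
which is the case the valid bit is there to cover.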

> 
> >  void kvm_init_nested(struct kvm *kvm)
> >  {
> >  	kvm->arch.nested_mmus = NULL;
> > @@ -769,12 +783,57 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu)
> >  	return s2_mmu;
> >  }
> >  
> > +void kvm_record_nested_revmap(gpa_t ipa, struct kvm_s2_mmu *mmu,
> > +			      gpa_t fault_ipa, size_t map_size)
> > +{
> > +	struct maple_tree *mt = &mmu->nested_revmap_mt;
> > +	gpa_t start = ipa;
> > +	gpa_t end = ipa + map_size - 1;
> > +	u64 entry, new_entry = 0;
> > +	MA_STATE(mas, mt, start, end);
> > +
> > +	if (mmu->nested_revmap_broken)
> > +		return;
> > +
> > +	mtree_lock(mt);
> > +	entry = (u64)mas_find_range(&mas, end);
> > +
> > +	if (entry) {
> > +		/* maybe just a perm update... */
> > +		if (!(entry & UNKNOWN_IPA) && mas.index == start &&
> > +		    mas.last == end &&
> > +		    fault_ipa == (entry & NESTED_IPA_MASK))
> > +			goto unlock;
> > +		/*
> > +		 * Create a "polluted" range that spans all the overlapping
> > +		 * ranges and store it.
> > +		 */
> > +		while (entry && mas.index <= end) {
> > +			start = min(mas.index, start);
> > +			end = max(mas.last, end);
> > +			entry = (u64)mas_find_range(&mas, end);
> > +		}
> > +		new_entry |= UNKNOWN_IPA;
> > +	} else {
> > +		new_entry |= fault_ipa;
> > +		new_entry |= VALID_ENTRY;
> > +	}
> > +
> > +	mas_set_range(&mas, start, end);
> > +	if (mas_store_gfp(&mas, (void *)new_entry, GFP_NOWAIT | __GFP_ACCOUNT))
> > +		mmu->nested_revmap_broken = true;
> 
> Can we try and minimise the risk of allocation failure here?
> 
> user_mem_abort() tries very hard to pre-allocate pages for page
> tables by maintaining a memcache. Can we have a similar approach for
> the revmap?

Unfortunately, as I understand it, the maple tree can only pre-allocate
for a store when both the range and the entry to be stored are known. In
this case we must inspect the tree to get that information, and we can
only do so once we hold the MMU lock and the maple tree lock. A two-pass
approach is possible:

pre-allocate -> take MMU lock -> take maple tree lock -> revalidate that
what we pre-allocated is still usable (nobody changed the tree before we
took the maple tree lock)

But I am not fond of this extra complexity.
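
To make the shape of that two-pass scheme concrete, here is a small
userspace model. Everything in it is an assumption made for illustration:
the model_* names are hypothetical, and the generation counter is just
one possible way to detect that the tree changed between the two passes.
It is not something the maple tree provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the tree: generation bumps on every modification. */
struct model_tree {
	uint64_t generation;
	unsigned int nodes_needed;	/* nodes the next store would need */
};

/* Memory reserved in pass 1, tagged with the generation it was sized for. */
struct model_prealloc {
	uint64_t generation;
	unsigned int nodes;
};

/* Pass 1: runs before the locks are taken, so it may sleep. */
static struct model_prealloc model_preallocate(const struct model_tree *t)
{
	struct model_prealloc p = {
		.generation = t->generation,
		.nodes = t->nodes_needed,
	};

	return p;
}

/* Any concurrent writer invalidates outstanding preallocations. */
static void model_tree_modify(struct model_tree *t, unsigned int nodes_needed)
{
	t->generation++;
	t->nodes_needed = nodes_needed;
}

/* Pass 2: runs with the locks held; revalidate before consuming. */
static bool model_store_prealloc(struct model_tree *t, struct model_prealloc *p)
{
	assert(t != NULL && p != NULL);

	if (p->generation != t->generation || p->nodes < t->nodes_needed)
		return false;	/* stale: caller must retry or fall back */

	p->nodes = 0;
	t->generation++;
	return true;
}
```

The retry or fallback path on a stale preallocation is where the extra
complexity mentioned above would come from.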

> 
> > +unlock:
> > +	mtree_unlock(mt);
> > +}
> > +
> >  void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu)
> >  {
> >  	/* CnP being set denotes an invalid entry */
> >  	mmu->tlb_vttbr = VTTBR_CNP_BIT;
> >  	mmu->nested_stage2_enabled = false;
> >  	atomic_set(&mmu->refcnt, 0);
> > +	mt_init(&mmu->nested_revmap_mt);
> > +	mmu->nested_revmap_broken = false;
> >  }
> >  
> >  void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu)
> > @@ -1150,6 +1209,90 @@ void kvm_nested_s2_wp(struct kvm *kvm)
> >  	kvm_invalidate_vncr_ipa(kvm, 0, BIT(kvm->arch.mmu.pgt->ia_bits));
> >  }
> >  
> > +static void reset_revmap_and_unmap(struct kvm_s2_mmu *mmu, bool may_block)
> > +{
> > +	mtree_destroy(&mmu->nested_revmap_mt);
> > +	kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), may_block);
> > +	mmu->nested_revmap_broken = false;
> > +}
> > +
> > +static void unmap_mmu_ipa_range(struct kvm_s2_mmu *mmu, gpa_t gpa,
> > +				  size_t unmap_size, bool may_block)
> > +{
> > +	struct maple_tree *mt = &mmu->nested_revmap_mt;
> > +	gpa_t start = gpa;
> > +	gpa_t end = gpa + unmap_size - 1;
> > +	u64 entry;
> > +	size_t entry_size;
> > +	bool unlock, fallback;
> > +	MA_STATE(mas, mt, gpa, end);
> > +
> > +	if (mmu->nested_revmap_broken) {
> > +		unlock = false;
> > +		fallback = true;
> > +		goto fin;
> > +	}
> 
> Using booleans to affect the control flow reads really badly. I'd
> expect this to simply be:
> 
> 	if (...) {
> 		reset_revmap_and_unmap(mmu, may_block);
> 		return;
> 	}
> 
> > +
> > +	mtree_lock(mt);
> > +	entry = (u64)mas_find_range(&mas, end);
> > +
> > +	while (entry && mas.index <= end) {
> > +		start = mas.last + 1;
> > +		entry_size = mas.last - mas.index + 1;
> > +		/*
> > +		 * Give up and invalidate this s2 mmu if the unmap range
> > +		 * touches any polluted range.
> > +		 */
> > +		if (entry & UNKNOWN_IPA) {
> > +			unlock = true;
> > +			fallback = true;
> > +			goto fin;
> > +		}
> 
> and this to be:
> 
> 		if (entry & UNKNOWN_IPA) {
> 			mtree_unlock(mt);
> 			reset_revmap_and_unmap(mmu, may_block);
> 			return;
> 		}
> 
> > +
> > +		/*
> > +		 * Ignore result, it is okay if a reverse mapping erase
> > +		 * fails.
> > +		 */
> > +		mas_store_gfp(&mas, NULL, GFP_NOWAIT | __GFP_ACCOUNT);
> > +
> > +		mtree_unlock(mt);
> > +		kvm_stage2_unmap_range(mmu, entry & NESTED_IPA_MASK, entry_size,
> > +				       may_block);
> > +		mtree_lock(mt);
> > +		/*
> > +		 * Other maple tree operations during preemption could render
> > +		 * this ma_state invalid, so reset it.
> > +		 */
> > +		mas_set_range(&mas, start, end);
> > +		entry = (u64)mas_find_range(&mas, end);
> > +	}
> > +	unlock = true;
> > +	fallback = false;
> > +
> > +fin:
> > +	if (unlock)
> > +		mtree_unlock(mt);
> > +	if (fallback)
> > +		reset_revmap_and_unmap(mmu, may_block);
> 
> and this can eventually be greatly simplified.

Sure, I agree.
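
A userspace model of the early-return structure suggested above might
look like the following. It is only a sketch: the model_* types and
counters are hypothetical, a fixed array stands in for the maple tree,
and locking is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UNKNOWN_IPA		(1ULL << 0)
#define MODEL_MAX_RANGES	8

struct model_range {
	uint64_t first, last;	/* canonical IPA range, inclusive */
	uint64_t entry;		/* payload, same layout as the patch */
	bool live;
};

struct model_mmu {
	struct model_range ranges[MODEL_MAX_RANGES];
	bool revmap_broken;
	unsigned int full_unmaps;	/* whole shadow S2 thrown away */
	unsigned int range_unmaps;	/* targeted unmaps */
};

static void model_reset_revmap_and_unmap(struct model_mmu *mmu)
{
	for (size_t i = 0; i < MODEL_MAX_RANGES; i++)
		mmu->ranges[i].live = false;
	mmu->revmap_broken = false;
	mmu->full_unmaps++;
}

static void model_unmap_mmu_ipa_range(struct model_mmu *mmu, uint64_t gpa,
				      uint64_t size)
{
	uint64_t end;

	assert(size != 0);
	end = gpa + size - 1;

	if (mmu->revmap_broken) {
		model_reset_revmap_and_unmap(mmu);
		return;
	}

	for (size_t i = 0; i < MODEL_MAX_RANGES; i++) {
		struct model_range *r = &mmu->ranges[i];

		if (!r->live || r->last < gpa || r->first > end)
			continue;

		/* Polluted range: give up on this shadow S2 entirely. */
		if (r->entry & UNKNOWN_IPA) {
			/* the kernel version would drop the tree lock here */
			model_reset_revmap_and_unmap(mmu);
			return;
		}

		r->live = false;
		mmu->range_unmaps++;
	}
}
```

Neither the unlock nor the fallback flag is needed once each exit path
does its own cleanup.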

> 
> > +}
> > +
> > +void kvm_unmap_gfn_range_nested(struct kvm *kvm, gpa_t gpa, size_t size,
> > +				bool may_block)
> > +{
> > +	int i;
> > +
> > +	if (!kvm->arch.nested_mmus_size)
> > +		return;
> > +
> > +	/* TODO: accelerate this using mt of canonical s2 mmu */
> > +	for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
> > +		struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];
> > +
> > +		if (kvm_s2_mmu_valid(mmu))
> > +			unmap_mmu_ipa_range(mmu, gpa, size, may_block);
> > +	}
> > +}
> > +
> >  void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block)
> >  {
> >  	int i;
> > @@ -1163,7 +1306,7 @@ void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block)
> >  		struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];
> >  
> >  		if (kvm_s2_mmu_valid(mmu))
> > -			kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), may_block);
> > +			reset_revmap_and_unmap(mmu, may_block);
> >  	}
> >  
> >  	kvm_invalidate_vncr_ipa(kvm, 0, BIT(kvm->arch.mmu.pgt->ia_bits));
> > @@ -1848,7 +1991,7 @@ void check_nested_vcpu_requests(struct kvm_vcpu *vcpu)
> >  
> >  		write_lock(&vcpu->kvm->mmu_lock);
> >  		if (mmu->pending_unmap) {
> > -			kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), true);
> > +			reset_revmap_and_unmap(mmu, true);
> >  			mmu->pending_unmap = false;
> >  		}
> >  		write_unlock(&vcpu->kvm->mmu_lock);
> 
> My other concern here is related to TLB invalidation. As the guest
> performs TLB invalidations that remove entries from the shadow S2,
> there is no way to update the revmap to account for this.
> 
> This obviously means that the revmap becomes more and more inaccurate
> over time, and that is likely to accumulate conflicting entries.
> 
> What is the plan to improve the situation on this front?

Right now I think a direct map from nested IPA to canonical IPA could
work without adding too much complexity, provided we keep the reverse map
and the direct map in lockstep (the direct map holds the same mappings as
the reverse map, just in the opposite direction).

I'll try to do that and include it in the next iteration.
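
A very rough userspace model of the lockstep idea, purely as a sketch:
the range_map and lockstep_* names are hypothetical, fixed arrays stand
in for the two maple trees, and overlapping ranges (the polluted case)
are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SLOTS 16

struct range_map_entry {
	uint64_t key, val, size;
	bool live;
};

struct range_map {
	struct range_map_entry e[MAP_SLOTS];
};

struct lockstep {
	struct range_map direct;	/* nested IPA -> canonical IPA */
	struct range_map reverse;	/* canonical IPA -> nested IPA */
};

/* Return the live entry whose range contains key, or NULL. */
static struct range_map_entry *range_map_find(struct range_map *m, uint64_t key)
{
	for (size_t i = 0; i < MAP_SLOTS; i++) {
		struct range_map_entry *e = &m->e[i];

		if (e->live && key >= e->key && key - e->key < e->size)
			return e;
	}
	return NULL;
}

static bool range_map_insert(struct range_map *m, uint64_t key, uint64_t val,
			     uint64_t size)
{
	for (size_t i = 0; i < MAP_SLOTS; i++) {
		if (!m->e[i].live) {
			m->e[i] = (struct range_map_entry){ key, val, size, true };
			return true;
		}
	}
	return false;
}

static void range_map_erase(struct range_map *m, uint64_t key)
{
	struct range_map_entry *e = range_map_find(m, key);

	if (e)
		e->live = false;
}

/* Record a shadow S2 mapping in both maps, or in neither. */
static bool lockstep_record(struct lockstep *ls, uint64_t canonical,
			    uint64_t nested, uint64_t size)
{
	assert(size != 0);

	if (!range_map_insert(&ls->direct, nested, canonical, size))
		return false;
	if (!range_map_insert(&ls->reverse, canonical, nested, size)) {
		range_map_erase(&ls->direct, nested);
		return false;
	}
	return true;
}

/* Guest TLB invalidation by nested IPA: drop the mapping from both maps. */
static bool lockstep_invalidate_nested(struct lockstep *ls, uint64_t nested)
{
	struct range_map_entry *d = range_map_find(&ls->direct, nested);

	if (!d)
		return false;
	range_map_erase(&ls->reverse, d->val);
	d->live = false;
	return true;
}
```

The point of the model is that a guest TLB invalidation, which is keyed
by nested IPA, can find and remove the matching reverse map entry, so
the reverse map no longer accumulates stale ranges.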

Thanks,
Wei-Lin Chang

> 
> Thanks,
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.

