* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Thomas Gleixner @ 2020-09-23 14:33 UTC (permalink / raw)
To: Linus Torvalds
Cc: Juri Lelli, Peter Zijlstra, Sebastian Andrzej Siewior,
Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
Guo Ren, linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
Russell King, linux-csky, David Airlie, Mel Gorman,
open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
intel-gfx, linuxppc-dev, Steven Rostedt, Jani Nikula,
Rodrigo Vivi, Dietmar Eggemann, Linux ARM, Chris Zankel,
Michal Simek, Thomas Bogendoerfer, Nick Hu, Linux-MM,
Vineet Gupta, LKML, Arnd Bergmann, Daniel Vetter, Paul Mackerras,
Andrew Morton, Daniel Bristot de Oliveira, David S. Miller,
Greentime Hu
In-Reply-To: <87sgbbaq0y.fsf@nanos.tec.linutronix.de>
On Mon, Sep 21 2020 at 21:27, Thomas Gleixner wrote:
> On Mon, Sep 21 2020 at 09:24, Linus Torvalds wrote:
>> On Mon, Sep 21, 2020 at 12:39 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> Maybe we really *could* call this new kmap functionality something
>> like "kmap_percpu()" (or maybe "local" is good enough), and make it
>> act like your RT code does for spinlocks - not disable preemption, but
>> only disabling CPU migration.
>
> I"m all for it, but the scheduler people have opinions :)
I just took the latest version of migrate disable patches
https://lore.kernel.org/r/20200921163557.234036895@infradead.org
removed the RT dependency on top of them and adopted the kmap stuff
(addressing the various comments while at it and unbreaking ARM).
I'm not going to post that until there is consensus about the general
availability of migrate disable, but for those who want to play with it
I've pushed the resulting combo out to:
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git highmem
For testing I've tweaked a few places to use the new _local() variants
and it survived testing so far and I've verified that there is actual
preemption which means zap/restore of the thread local kmaps.
Thanks,
tglx
^ permalink raw reply
* Re: [PATCH 5/9] fs: remove various compat readv/writev helpers
From: Christoph Hellwig @ 2020-09-23 14:32 UTC (permalink / raw)
To: Al Viro
Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
sparclinux, Christoph Hellwig, linux-arch, linux-s390, linux-scsi,
Arnd Bergmann, linux-block, io-uring, linux-arm-kernel,
Jens Axboe, linux-parisc, netdev, linux-kernel,
linux-security-module, David Laight, linux-fsdevel, Andrew Morton,
linuxppc-dev
In-Reply-To: <20200923142549.GK3421308@ZenIV.linux.org.uk>
On Wed, Sep 23, 2020 at 03:25:49PM +0100, Al Viro wrote:
> On Wed, Sep 23, 2020 at 08:05:43AM +0200, Christoph Hellwig wrote:
> > COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
> > - const struct compat_iovec __user *,vec,
> > + const struct iovec __user *, vec,
>
> Um... Will it even compile?
>
> > #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
> > COMPAT_SYSCALL_DEFINE4(preadv64, unsigned long, fd,
> > - const struct compat_iovec __user *,vec,
> > + const struct iovec __user *, vec,
>
> Ditto. Look into include/linux/compat.h and you'll see
>
> asmlinkage long compat_sys_preadv64(unsigned long fd,
> const struct compat_iovec __user *vec,
> unsigned long vlen, loff_t pos);
>
> How does that manage to avoid the compiler screaming bloody
> murder?
That's a very good question. But it does not just compile but actually
works. Probably because all the syscall wrappers mean that we don't
actually generate the normal names. I just tried this:
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t offset,
asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t count);
asmlinkage long sys_write(unsigned int fd, const char __user *buf,
size_t count);
-asmlinkage long sys_readv(unsigned long fd,
+asmlinkage long sys_readv(void *fd,
for fun, and the compiler doesn't care either..
^ permalink raw reply
* Re: [PATCH 5/9] fs: remove various compat readv/writev helpers
From: Al Viro @ 2020-09-23 14:25 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
sparclinux, linux-arch, linux-s390, linux-scsi, Arnd Bergmann,
linux-block, io-uring, linux-arm-kernel, Jens Axboe, linux-parisc,
netdev, linux-kernel, linux-security-module, David Laight,
linux-fsdevel, Andrew Morton, linuxppc-dev
In-Reply-To: <20200923060547.16903-6-hch@lst.de>
On Wed, Sep 23, 2020 at 08:05:43AM +0200, Christoph Hellwig wrote:
> COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
> - const struct compat_iovec __user *,vec,
> + const struct iovec __user *, vec,
Um... Will it even compile?
> #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
> COMPAT_SYSCALL_DEFINE4(preadv64, unsigned long, fd,
> - const struct compat_iovec __user *,vec,
> + const struct iovec __user *, vec,
Ditto. Look into include/linux/compat.h and you'll see
asmlinkage long compat_sys_preadv64(unsigned long fd,
const struct compat_iovec __user *vec,
unsigned long vlen, loff_t pos);
How does that manage to avoid the compiler screaming bloody
murder?
^ permalink raw reply
* Re: [PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec
From: Al Viro @ 2020-09-23 14:16 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
sparclinux, linux-arch, linux-s390, linux-scsi, Linus Torvalds,
Arnd Bergmann, linux-block, io-uring, linux-arm-kernel,
Jens Axboe, linux-parisc, netdev, linux-kernel,
linux-security-module, David Laight, linux-fsdevel, Andrew Morton,
linuxppc-dev
In-Reply-To: <20200923060547.16903-4-hch@lst.de>
On Wed, Sep 23, 2020 at 08:05:41AM +0200, Christoph Hellwig wrote:
> +struct iovec *iovec_from_user(const struct iovec __user *uvec,
> + unsigned long nr_segs, unsigned long fast_segs,
Hmm... For fast_segs unsigned long had always been ridiculous
(4G struct iovec on caller stack frame?), but that got me wondering about
nr_segs and I wish I'd thought of that when introducing import_iovec().
The thing is, import_iovec() takes unsigned int there. Which is fine
(hell, the maximal value that can be accepted in 1024), except that
we do pass unsigned long syscall argument to it in some places.
E.g. vfs_readv() quietly truncates vlen to 32 bits, and vlen can
come unchanged through sys_readv() -> do_readv() -> vfs_readv().
With unsigned long passed by syscall glue.
AFAICS, passing 4G+1 as the third argument to readv(2) on 64bit box
will be quietly treated as 1 these days. Which would be fine, except
that before "switch {compat_,}do_readv_writev() to {compat_,}import_iovec()"
it used to fail with -EINVAL.
Userland, BTW, describes readv(2) iovcnt as int; process_vm_readv(),
OTOH, has these counts unsigned long from the userland POV...
I suppose we ought to switch import_iovec() to unsigned long for nr_segs ;-/
Strictly speaking that had been a userland ABI change, even though nothing
except regression tests checking for expected errors would've been likely
to notice. And it looks like no regression tests covered that one...
Linus, does that qualify for your "if no userland has noticed the change,
it's not a breakage"?
^ permalink raw reply
* Re: [PATCH v2] i2c: cpm: Fix i2c_ram structure
From: Jochen Friedrich @ 2020-09-23 14:12 UTC (permalink / raw)
To: nicolas.vincent; +Cc: linuxppc-dev, linux-i2c
In-Reply-To: <20200923140840.8700-1-nicolas.vincent@vossloh.com>
Acked-by: Jochen Friedrich <jochen@scram.de>
Am 23.09.2020 um 16:08 schrieb nico.vince@gmail.com:
> From: Nicolas VINCENT <nicolas.vincent@vossloh.com>
>
> the i2c_ram structure is missing the sdmatmp field mentionned in
> datasheet for MPC8272 at paragraph 36.5. With this field missing, the
> hardware would write past the allocated memory done through
> cpm_muram_alloc for the i2c_ram structure and land in memory allocated
> for the buffers descriptors corrupting the cbd_bufaddr field. Since this
> field is only set during setup(), the first i2c transaction would work
> and the following would send data read from an arbitrary memory
> location.
>
> Signed-off-by: Nicolas VINCENT <nicolas.vincent@vossloh.com>
> ---
> drivers/i2c/busses/i2c-cpm.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
> index 1213e1932ccb..24d584a1c9a7 100644
> --- a/drivers/i2c/busses/i2c-cpm.c
> +++ b/drivers/i2c/busses/i2c-cpm.c
> @@ -65,6 +65,9 @@ struct i2c_ram {
> char res1[4]; /* Reserved */
> ushort rpbase; /* Relocation pointer */
> char res2[2]; /* Reserved */
> + /* The following elements are only for CPM2 */
> + char res3[4]; /* Reserved */
> + uint sdmatmp; /* Internal */
> };
>
> #define I2COM_START 0x80
>
^ permalink raw reply
* Re: [PATCH kernel] powerpc/dma: Fix dma_map_ops::get_required_mask
From: Christoph Hellwig @ 2020-09-23 14:10 UTC (permalink / raw)
To: Alexey Kardashevskiy
Cc: Oliver OHalloran, linuxppc-dev@lists.ozlabs.org,
Christoph Hellwig, Cédric Le Goater
In-Reply-To: <93424419-3476-fc07-8a83-8d9d39062810@ozlabs.ru>
On Tue, Sep 22, 2020 at 12:26:18PM +1000, Alexey Kardashevskiy wrote:
> > Well, the original intent of dma_get_required_mask is to return the
> > mask that the driver then uses to figure out what to set, so what aacraid
> > does fits that use case.
>
> What was the original intent exactly? The driver asks for the minimum or
> maximum DMA mask the platform supports?
>
> As for now, we (ppc64/powernv) can do:
> 1. bypass (==64bit)
> 2. a DMA window which used to be limited by 2GB but not anymore.
>
> I can understand if the driver asked for required mask in expectation to
> receive "less or equal than 32bit" and "more than 32 bit" and choose.
> And this probably was the intent as at the time when the bug was
> introduced, the window was always smaller than 4GB.
>
> But today the window is bigger than than (44 bits now, or a similar
> value, depends on max page order) so the returned mask is >32. Which
> still enables that DAC in aacraid but I suspect this is accidental.
I think for powernv returning 64-bit always would make a lot of sense.
AFAIK all of powernv is PCIe and not legacy PCI, so returning anything
less isn't going to help to optimize anything.
> > Of course that idea is pretty bogus for
> > PCIe devices.
>
> Why? From the PHB side, there are windows. From the device side, there
> are many crippled devices, like, no GPU I saw in last years supported
> more than 48bit.
Yes, but dma_get_required_mask is misnamed - the mask is not required,
it is the optimal mask. Even if the window is smaller we handle it
some way, usually by using swiotlb, or by iommu tricks in your case.
> > I suspect the right fix is to just not query dma_get_required_mask for
> > PCIe devices in aacraid (and other drivers that do something similar).
>
> May be, if you write nice and big comment next to
> dma_get_required_mask() explaining exactly what it does, then I will
> realize I am getting this all wrong and we will move to fixing the
> drivers :)
Yes, it needs a comment or two, and probaby be renamed to
dma_get_optimal_dma_mask, and a cleanup of most users. I've added it
to my ever growing TODO list, but I would not be unhappy if someone
else gives it a spin.
^ permalink raw reply
* [PATCH v2] i2c: cpm: Fix i2c_ram structure
From: nico.vince @ 2020-09-23 14:08 UTC (permalink / raw)
To: jochen; +Cc: Nicolas VINCENT, linuxppc-dev, linux-i2c
From: Nicolas VINCENT <nicolas.vincent@vossloh.com>
the i2c_ram structure is missing the sdmatmp field mentionned in
datasheet for MPC8272 at paragraph 36.5. With this field missing, the
hardware would write past the allocated memory done through
cpm_muram_alloc for the i2c_ram structure and land in memory allocated
for the buffers descriptors corrupting the cbd_bufaddr field. Since this
field is only set during setup(), the first i2c transaction would work
and the following would send data read from an arbitrary memory
location.
Signed-off-by: Nicolas VINCENT <nicolas.vincent@vossloh.com>
---
drivers/i2c/busses/i2c-cpm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
index 1213e1932ccb..24d584a1c9a7 100644
--- a/drivers/i2c/busses/i2c-cpm.c
+++ b/drivers/i2c/busses/i2c-cpm.c
@@ -65,6 +65,9 @@ struct i2c_ram {
char res1[4]; /* Reserved */
ushort rpbase; /* Relocation pointer */
char res2[2]; /* Reserved */
+ /* The following elements are only for CPM2 */
+ char res3[4]; /* Reserved */
+ uint sdmatmp; /* Internal */
};
#define I2COM_START 0x80
--
2.17.1
^ permalink raw reply related
* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Thomas Gleixner @ 2020-09-23 13:35 UTC (permalink / raw)
To: peterz
Cc: Juri Lelli, David Airlie, Sebastian Andrzej Siewior,
Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
Guo Ren, linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
Russell King, linux-csky, Mel Gorman,
open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
intel-gfx, linuxppc-dev, Steven Rostedt, Jani Nikula,
Rodrigo Vivi, Dietmar Eggemann, Linux ARM, Chris Zankel,
Michal Simek, Thomas Bogendoerfer, Nick Hu, Linux-MM,
Linus Torvalds, LKML, Arnd Bergmann, Daniel Vetter, Vineet Gupta,
Paul Mackerras, Andrew Morton, Daniel Bristot de Oliveira,
David S. Miller, Greentime Hu
In-Reply-To: <20200923084032.GU1362448@hirez.programming.kicks-ass.net>
On Wed, Sep 23 2020 at 10:40, peterz wrote:
> Right, so I'm concerned. migrate_disable() wrecks pretty much all
> Real-Time scheduler theory we have, and PREEMPRT_RT bringing it in is
> somewhat ironic.
It's even more ironic that the approach of PREEMPT_RT has been
'pragmatic ignorance of theory' from the very beginning and despite
violating all theories it still works. :)
> Yes, it allows breaking up non-preemptible regions of non-deterministic
> duration, and thereby both reduce and bound the scheduling latency, the
> cost for doing that is that the theory on CPU utilization/bandwidth go
> out the window.
I agree, that the theory goes out of the window, but does it matter in
practice? I've yet to see a report of migrate disable stacking being the
culprit of a missed deadline and I surely have stared at lots of reports
in the past 10+ years.
> To easily see this consider an SMP system with a number of tasks equal
> to the number of CPUs. On a regular (preempt_disable) kernel we can
> always run each task, by virtue of always having an idle CPU to take the
> task.
>
> However, with migrate_disable() we can have each task preempted in a
> migrate_disable() region, worse we can stack them all on the _same_ CPU
> (super ridiculous odds, sure). And then we end up only able to run one
> task, with the rest of the CPUs picking their nose.
>
> The end result is that, like with unbounded latency, we will still miss
> our deadline, simply because we got starved for CPU.
I'm well aware of that.
> Now, while we could (with a _lot_ of work) rework the kernel to not rely
> on the implicit per-cpu ness of things like spinlock_t, the moment we
> bring in basic primitives that rely on migrate_disable() we're stuck
> with it.
Right, but we are stuck with per CPUness and distangling that is just
infeasible IMO.
> The problem is; afaict there's been no research into this problem.
There is no research on a lot of things the kernel does :)
> There might be scheduling (read: balancing) schemes that can
> mitigate/solve this problem, or it might prove to be a 'hard' problem,
> I just don't know.
In practive balancing surely can take the number of preempted tasks
which are in a migrate disable section into account which would be just
another measure to work around the fact that the kernel is not adhering
to the theories. It never did that even w/o migrate disable.
> But once we start down this road, it's going to be hell to get rid of
> it.
Like most of the other things the kernel came up with to deal with the
oddities of modern hardware :)
> That's why I've been arguing (for many years) to strictly limit this to
> PREEMPT_RT and only as a gap-stop, not a fundamental primitive to build
> on.
I know, but short of rewriting the world, I'm not seing the faintest
plan to remove the stop gap. :)
As we discussed not long ago we have too many inconsistent preemption
models already. RT is adding yet another one. And that's worse than
introducing migrate disable as a general available facility.
IMO, reaching a point of consistency where our different preemption
models (NONE, VOLUNTARY, PREEMPT. RT) build on each other is far more
important.
For !RT migrate disable is far less of an danger than for RT kernels
because the amount of code which will use it is rather limited compared
to code which still will disable preemption implicit through spin and rw
locks.
On RT converting these locks to 'sleepable spinlocks' is just possible
because RT forces almost everything into task context and migrate
disable is just the obvious decomposition of preempt disable which
implicitely disables migration.
But that means that RT is by orders of magnitude more prone to run into
the scheduling trainwreck you are worried about. It just refuses to do
so at least with real world work loads.
I'm surely in favour of having solid theories behind implementation, but
at some point you just have to bite the bullet and chose pragmatism in
order to make progress.
Proliferating inconsistency is not real progress, as it is violating the
most fundamental engineering principles. That's by far more dangerous
than violating scheduling theories which are built on perfect models and
therefore enforce violation by practical implementations anyway.
Thanks,
tglx
^ permalink raw reply
* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Al Viro @ 2020-09-23 13:22 UTC (permalink / raw)
To: Pavel Begunkov
Cc: linux-aio, open list:MIPS, David Howells, Linux-MM, keyrings,
sparclinux, Christoph Hellwig, linux-arch, linux-s390,
Linux SCSI List, X86 ML, Arnd Bergmann, linux-block,
Andy Lutomirski, io-uring, linux-arm-kernel, Jens Axboe,
Parisc List, Network Development, LKML, LSM List, Linux FS Devel,
Andrew Morton, linuxppc-dev
In-Reply-To: <91209170-dcb4-d9ee-afa0-a819f8877b86@gmail.com>
On Wed, Sep 23, 2020 at 11:01:34AM +0300, Pavel Begunkov wrote:
> > I'm not following why that would be considered a valid option,
> > as that clearly breaks existing users that update from a 32-bit
> > kernel to a 64-bit one.
>
> Do you mean users who move 32-bit binaries (without recompiling) to a
> new x64 kernel? Does the kernel guarantees that to work?
Yes.
No further (printable) comments for now...
^ permalink raw reply
* Re: [PATCH] i2c: cpm: Fix i2c_ram structure
From: Christophe Leroy @ 2020-09-23 13:04 UTC (permalink / raw)
To: Vincent Nicolas, jochen@scram.de
Cc: linuxppc-dev@lists.ozlabs.org, linux-i2c@vger.kernel.org
In-Reply-To: <PR3P193MB0731A954E01511C22259ECF0F1380@PR3P193MB0731.EURP193.PROD.OUTLOOK.COM>
Le 23/09/2020 à 14:55, Vincent Nicolas a écrit :
>
>
>> ________________________________________
>> From: Christophe Leroy <christophe.leroy@csgroup.eu>
>> Sent: Wednesday, 23 September 2020 10:01
>> To: Vincent Nicolas; jochen@scram.de
>> Cc: linuxppc-dev@lists.ozlabs.org; linux-i2c@vger.kernel.org
>> Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure
>>
>>
>>
>> Le 23/09/2020 à 09:18, Vincent Nicolas a écrit :
>>>
>>>
>>>
>>> From: Christophe Leroy <christophe.leroy@csgroup.eu>
>>> Sent: Tuesday, 22 September 2020 14:38
>>> To: Vincent Nicolas <Nicolas.Vincent@vossloh.com>; jochen@scram.de <jochen@scram.de>
>>> Cc: linuxppc-dev@lists.ozlabs.org <linuxppc-dev@lists.ozlabs.org>; linux-i2c@vger.kernel.org <linux-i2c@vger.kernel.org>
>>> Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure
>>>
>>>
>>>
>>> Le 22/09/2020 à 11:04, nico.vince@gmail.com a écrit :
>>>> From: Nicolas VINCENT <nicolas.vincent@vossloh.com>
>>>>
>>>> the i2c_ram structure is missing the sdmatmp field mentionned in
>>>> datasheet for MPC8272 at paragraph 36.5. With this field missing, the
>>>> hardware would write past the allocated memory done through
>>>> cpm_muram_alloc for the i2c_ram structure and land in memory allocated
>>>> for the buffers descriptors corrupting the cbd_bufaddr field. Since this
>>>> field is only set during setup(), the first i2c transaction would work
>>>> and the following would send data read from an arbitrary memory
>>>> location.
>>>>
>>>> Signed-off-by: Nicolas VINCENT <nicolas.vincent@vossloh.com>
>>>> ---
>>>> drivers/i2c/busses/i2c-cpm.c | 3 ++-
>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
>>>> index 1213e1932ccb..c5700addbf65 100644
>>>> --- a/drivers/i2c/busses/i2c-cpm.c
>>>> +++ b/drivers/i2c/busses/i2c-cpm.c
>>>> @@ -64,7 +64,8 @@ struct i2c_ram {
>>>> uint txtmp; /* Internal */
>>>> char res1[4]; /* Reserved */
>>>> ushort rpbase; /* Relocation pointer */
>>>> - char res2[2]; /* Reserved */
>>>> + char res2[6]; /* Reserved */
>>>> + uint sdmatmp; /* Internal */
>>>
>>> On CPM1, I2C param RAM has size 0x30 (offset 0x1c80-0x1caf)
>>>
>>> Your change overlaps the miscellaneous area that contains CP Microcode
>>> Revision Number, ref MPC885 Reference Manual §18.7.3
>>>
>>> As far as I understand the mpc885 contains in the dts the compatible=fsl,cpm1-i2c which is used in cpm-i2c.c to either determine the address of the i2c_ram structure (cpm1), or dynamically allocate it with cpm_muram_alloc (cpm2).
>>> In the first case the structure will indeed overlaps with the miscellaneous section but since the sdmatmp is only used by cpm2 hardware it shall not be an issue.
>>>
>>> Please, let me know if I am mistaken. If the patch cannot be accepted as is, I would gladly accept pointers on how to address this kind of issue.
>>
>>
>> Please use a mail client that properly sets the > in front of
>> original/answered text. Here your mailer has mixed you text and mine,
>> that's unusable on the long term.
>
> I changed my configuration, please let me know if there are still issues
Yes it is OK now.
Christophe
^ permalink raw reply
* Re: [PATCH] i2c: cpm: Fix i2c_ram structure
From: Vincent Nicolas @ 2020-09-23 12:55 UTC (permalink / raw)
To: Christophe Leroy, jochen@scram.de
Cc: linuxppc-dev@lists.ozlabs.org, linux-i2c@vger.kernel.org
In-Reply-To: <2ecfe18a-61f6-bb0e-22c5-b7ab79a77d03@csgroup.eu>
> ________________________________________
> From: Christophe Leroy <christophe.leroy@csgroup.eu>
> Sent: Wednesday, 23 September 2020 10:01
> To: Vincent Nicolas; jochen@scram.de
> Cc: linuxppc-dev@lists.ozlabs.org; linux-i2c@vger.kernel.org
> Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure
>
>
>
> Le 23/09/2020 à 09:18, Vincent Nicolas a écrit :
>>
>>
>>
>> From: Christophe Leroy <christophe.leroy@csgroup.eu>
>> Sent: Tuesday, 22 September 2020 14:38
>> To: Vincent Nicolas <Nicolas.Vincent@vossloh.com>; jochen@scram.de <jochen@scram.de>
>> Cc: linuxppc-dev@lists.ozlabs.org <linuxppc-dev@lists.ozlabs.org>; linux-i2c@vger.kernel.org <linux-i2c@vger.kernel.org>
>> Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure
>>
>>
>>
>> Le 22/09/2020 à 11:04, nico.vince@gmail.com a écrit :
>>> From: Nicolas VINCENT <nicolas.vincent@vossloh.com>
>>>
>>> the i2c_ram structure is missing the sdmatmp field mentionned in
>>> datasheet for MPC8272 at paragraph 36.5. With this field missing, the
>>> hardware would write past the allocated memory done through
>>> cpm_muram_alloc for the i2c_ram structure and land in memory allocated
>>> for the buffers descriptors corrupting the cbd_bufaddr field. Since this
>>> field is only set during setup(), the first i2c transaction would work
>>> and the following would send data read from an arbitrary memory
>>> location.
>>>
>>> Signed-off-by: Nicolas VINCENT <nicolas.vincent@vossloh.com>
>>> ---
>>> drivers/i2c/busses/i2c-cpm.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
>>> index 1213e1932ccb..c5700addbf65 100644
>>> --- a/drivers/i2c/busses/i2c-cpm.c
>>> +++ b/drivers/i2c/busses/i2c-cpm.c
>>> @@ -64,7 +64,8 @@ struct i2c_ram {
>>> uint txtmp; /* Internal */
>>> char res1[4]; /* Reserved */
>>> ushort rpbase; /* Relocation pointer */
>>> - char res2[2]; /* Reserved */
>>> + char res2[6]; /* Reserved */
>>> + uint sdmatmp; /* Internal */
>>
>> On CPM1, I2C param RAM has size 0x30 (offset 0x1c80-0x1caf)
>>
>> Your change overlaps the miscellaneous area that contains CP Microcode
>> Revision Number, ref MPC885 Reference Manual §18.7.3
>>
>> As far as I understand the mpc885 contains in the dts the compatible=fsl,cpm1-i2c which is used in cpm-i2c.c to either determine the address of the i2c_ram structure (cpm1), or dynamically allocate it with cpm_muram_alloc (cpm2).
>> In the first case the structure will indeed overlaps with the miscellaneous section but since the sdmatmp is only used by cpm2 hardware it shall not be an issue.
>>
>> Please, let me know if I am mistaken. If the patch cannot be accepted as is, I would gladly accept pointers on how to address this kind of issue.
>
>
> Please use a mail client that properly sets the > in front of
> original/answered text. Here your mailer has mixed you text and mine,
> that's unusable on the long term.
I changed my configuration, please let me know if there are still issues
>
>
> I think you are right on the fact that it doesn't seem to be an issue.
> Nevertheless, that's confusing.
>
> What I would suggest is to leave res2[2] as is, and add something like:
>
> /* The following elements are only for CPM2 */
> char res3[4]; /* Reserved */
> uint sdmatmp; /* Internal */
I'll repost the patch like this then.
>
>
> Other solution (not sure that's the best solution thought) would be to
> do as in spi-fsl-cpm: use iic_t structure from asm/cpm1.h when
> CONFIG_CPM1 is selected and use iic_t from asm/cpm2.h when CONFIG_CPM2
> is selected, taking into account that CONFIG_CPM1 and CONFIG_CPM2 are
> mutually exclusive at the time being.
Unless someone argues for this solution I will go with the first one.
Thank again for your time and quick responses.
Nicolas.
^ permalink raw reply
* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Thomas Gleixner @ 2020-09-23 12:33 UTC (permalink / raw)
To: peterz
Cc: Juri Lelli, David Airlie, Sebastian Andrzej Siewior,
Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
Guo Ren, linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
Russell King, linux-csky, Mel Gorman,
open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
intel-gfx, linuxppc-dev, Steven Rostedt, Jani Nikula,
Rodrigo Vivi, Dietmar Eggemann, Linux ARM, Chris Zankel,
Michal Simek, Thomas Bogendoerfer, Nick Hu, Linux-MM,
Linus Torvalds, LKML, Arnd Bergmann, Daniel Vetter, Vineet Gupta,
Paul Mackerras, Andrew Morton, Daniel Bristot de Oliveira,
David S. Miller, Greentime Hu
In-Reply-To: <20200923101953.GT2674@hirez.programming.kicks-ass.net>
On Wed, Sep 23 2020 at 12:19, peterz wrote:
> On Mon, Sep 21, 2020 at 09:27:57PM +0200, Thomas Gleixner wrote:
>> Alternatively this could of course be solved with per CPU page tables
>> which will come around some day anyway I fear.
>
> Previously (with PTI) we looked at making the entire kernel map per-CPU,
> and that takes a 2K copy on switch_mm() (or more general, the user part
> of whatever the top level directory is for architectures that have a
> shared kernel/user page-table setup in the first place).
>
> The idea was having a fixed per-cpu kernel page-table, share a bunch of
> (kernel) page-tables between all CPUs and then copy in the user part on
> switch.
>
> I've forgotten what the plan was for ASID/PCID in that scheme.
>
> For x86_64 we've been fearing the performance of that 2k copy, but I
> don't think we've ever actually bit the bullet and implemented it to see
> how bad it really is.
I actually did at some point and depending on the workload the overhead
was clearly measurable. And yes, it fell apart with PCID and I could not
come up with a scheme for it which did not suck horribly. So I burried
the patches in the poison cabinet.
Aside of that, we'd need to implement that for a eight other
architectures as well...
Thanks,
tglx
^ permalink raw reply
* Re: [PATCH] i2c: cpm: Fix i2c_ram structure
From: Vincent Nicolas @ 2020-09-23 7:18 UTC (permalink / raw)
To: Christophe Leroy, jochen@scram.de
Cc: linuxppc-dev@lists.ozlabs.org, linux-i2c@vger.kernel.org
In-Reply-To: <956c4b63-f859-df0c-2836-80a988ee6aa9@csgroup.eu>
From: Christophe Leroy <christophe.leroy@csgroup.eu>
Sent: Tuesday, 22 September 2020 14:38
To: Vincent Nicolas <Nicolas.Vincent@vossloh.com>; jochen@scram.de <jochen@scram.de>
Cc: linuxppc-dev@lists.ozlabs.org <linuxppc-dev@lists.ozlabs.org>; linux-i2c@vger.kernel.org <linux-i2c@vger.kernel.org>
Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure
Le 22/09/2020 à 11:04, nico.vince@gmail.com a écrit :
> From: Nicolas VINCENT <nicolas.vincent@vossloh.com>
>
> the i2c_ram structure is missing the sdmatmp field mentionned in
> datasheet for MPC8272 at paragraph 36.5. With this field missing, the
> hardware would write past the allocated memory done through
> cpm_muram_alloc for the i2c_ram structure and land in memory allocated
> for the buffers descriptors corrupting the cbd_bufaddr field. Since this
> field is only set during setup(), the first i2c transaction would work
> and the following would send data read from an arbitrary memory
> location.
>
> Signed-off-by: Nicolas VINCENT <nicolas.vincent@vossloh.com>
> ---
> drivers/i2c/busses/i2c-cpm.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
> index 1213e1932ccb..c5700addbf65 100644
> --- a/drivers/i2c/busses/i2c-cpm.c
> +++ b/drivers/i2c/busses/i2c-cpm.c
> @@ -64,7 +64,8 @@ struct i2c_ram {
> uint txtmp; /* Internal */
> char res1[4]; /* Reserved */
> ushort rpbase; /* Relocation pointer */
> - char res2[2]; /* Reserved */
> + char res2[6]; /* Reserved */
> + uint sdmatmp; /* Internal */
On CPM1, I2C param RAM has size 0x30 (offset 0x1c80-0x1caf)
Your change overlaps the miscellaneous area that contains CP Microcode
Revision Number, ref MPC885 Reference Manual §18.7.3
As far as I understand the mpc885 contains in the dts the compatible=fsl,cpm1-i2c which is used in cpm-i2c.c to either determine the address of the i2c_ram structure (cpm1), or dynamically allocate it with cpm_muram_alloc (cpm2).
In the first case the structure will indeed overlaps with the miscellaneous section but since the sdmatmp is only used by cpm2 hardware it shall not be an issue.
Please, let me know if I am mistaken. If the patch cannot be accepted as is, I would gladly accept pointers on how to address this kind of issue.
Thanks,
Nicolas.
> };
>
> #define I2COM_START 0x80
>
Christophe
^ permalink raw reply
* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: peterz @ 2020-09-23 10:19 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Juri Lelli, David Airlie, Sebastian Andrzej Siewior,
Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
Guo Ren, linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
Russell King, linux-csky, Mel Gorman,
open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
intel-gfx, linuxppc-dev, Steven Rostedt, Jani Nikula,
Rodrigo Vivi, Dietmar Eggemann, Linux ARM, Chris Zankel,
Michal Simek, Thomas Bogendoerfer, Nick Hu, Linux-MM,
Linus Torvalds, LKML, Arnd Bergmann, Daniel Vetter, Vineet Gupta,
Paul Mackerras, Andrew Morton, Daniel Bristot de Oliveira,
David S. Miller, Greentime Hu
In-Reply-To: <87sgbbaq0y.fsf@nanos.tec.linutronix.de>
On Mon, Sep 21, 2020 at 09:27:57PM +0200, Thomas Gleixner wrote:
> Alternatively this could of course be solved with per CPU page tables
> which will come around some day anyway I fear.
Previously (with PTI) we looked at making the entire kernel map per-CPU,
and that takes a 2K copy on switch_mm() (or more general, the user part
of whatever the top level directory is for architectures that have a
shared kernel/user page-table setup in the first place).
The idea was having a fixed per-cpu kernel page-table, share a bunch of
(kernel) page-tables between all CPUs and then copy in the user part on
switch.
I've forgotten what the plan was for ASID/PCID in that scheme.
For x86_64 we've been fearing the performance of that 2k copy, but I
don't think we've ever actually bit the bullet and implemented it to see
how bad it really is.
^ permalink raw reply
* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: peterz @ 2020-09-23 8:40 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Juri Lelli, David Airlie, Sebastian Andrzej Siewior,
Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
Guo Ren, linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
Russell King, linux-csky, Mel Gorman,
open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
intel-gfx, linuxppc-dev, Steven Rostedt, Jani Nikula,
Rodrigo Vivi, Dietmar Eggemann, Linux ARM, Chris Zankel,
Michal Simek, Thomas Bogendoerfer, Nick Hu, Linux-MM,
Linus Torvalds, LKML, Arnd Bergmann, Daniel Vetter, Vineet Gupta,
Paul Mackerras, Andrew Morton, Daniel Bristot de Oliveira,
David S. Miller, Greentime Hu
In-Reply-To: <87sgbbaq0y.fsf@nanos.tec.linutronix.de>
On Mon, Sep 21, 2020 at 09:27:57PM +0200, Thomas Gleixner wrote:
> On Mon, Sep 21 2020 at 09:24, Linus Torvalds wrote:
> > On Mon, Sep 21, 2020 at 12:39 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >>
> >> If a task is migrated to a different CPU then the mapping address will
> >> change which will explode in colourful ways.
> >
> > Right you are.
> >
> > Maybe we really *could* call this new kmap functionality something
> > like "kmap_percpu()" (or maybe "local" is good enough), and make it
> > act like your RT code does for spinlocks - not disable preemption, but
> > only disabling CPU migration.
>
> I"m all for it, but the scheduler people have opinions :)
Right, so I'm concerned. migrate_disable() wrecks pretty much all
Real-Time scheduler theory we have, and PREEMPRT_RT bringing it in is
somewhat ironic.
Yes, it allows breaking up non-preemptible regions of non-deterministic
duration, and thereby both reduce and bound the scheduling latency, the
cost for doing that is that the theory on CPU utilization/bandwidth go
out the window.
To easily see this consider an SMP system with a number of tasks equal
to the number of CPUs. On a regular (preempt_disable) kernel we can
always run each task, by virtue of always having an idle CPU to take the
task.
However, with migrate_disable() we can have each task preempted in a
migrate_disable() region, worse we can stack them all on the _same_ CPU
(super ridiculous odds, sure). And then we end up only able to run one
task, with the rest of the CPUs picking their nose.
The end result is that, like with unbounded latency, we will still miss
our deadline, simply because we got starved for CPU.
Now, while we could (with a _lot_ of work) rework the kernel to not rely
on the implicit per-cpu ness of things like spinlock_t, the moment we
bring in basic primitives that rely on migrate_disable() we're stuck
with it.
The problem is; afaict there's been no research into this problem. There
might be scheduling (read: balancing) schemes that can mitigate/solve
this problem, or it might prove to be a 'hard' problem, I just don't
know. But once we start down this road, it's going to be hell to get rid
of it.
That's why I've been arguing (for many years) to strictly limit this to
PREEMPT_RT and only as a gap-stop, not a fundamental primitive to build
on.
^ permalink raw reply
* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Pavel Begunkov @ 2020-09-23 8:01 UTC (permalink / raw)
To: Arnd Bergmann
Cc: linux-aio, open list:MIPS, David Howells, Linux-MM, keyrings,
sparclinux, Christoph Hellwig, linux-arch, linux-s390,
Linux SCSI List, X86 ML, linux-block, Al Viro, Andy Lutomirski,
io-uring, linux-arm-kernel, Jens Axboe, Parisc List,
Network Development, LKML, LSM List, Linux FS Devel,
Andrew Morton, linuxppc-dev
In-Reply-To: <CAK8P3a39jN+t2hhLg0oKZnbYATQXmYE2-Z1JkmFyc1EPdg1HXw@mail.gmail.com>
On 22/09/2020 12:01, Arnd Bergmann wrote:
> On Tue, Sep 22, 2020 at 9:59 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>> On 22/09/2020 10:23, Arnd Bergmann wrote:
>>> On Tue, Sep 22, 2020 at 8:32 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>>> On 22/09/2020 03:58, Andy Lutomirski wrote:
>>>>> On Mon, Sep 21, 2020 at 5:24 PM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>>>> I may be looking at a different kernel than you, but aren't you
>>>>> preventing creating an io_uring regardless of whether SQPOLL is
>>>>> requested?
>>>>
>>>> I diffed a not-saved file on a sleepy head, thanks for noticing.
>>>> As you said, there should be an SQPOLL check.
>>>>
>>>> ...
>>>> if (ctx->compat && (p->flags & IORING_SETUP_SQPOLL))
>>>> goto err;
>>>
>>> Wouldn't that mean that now 32-bit containers behave differently
>>> between compat and native execution?
>>>
>>> I think if you want to prevent 32-bit applications from using SQPOLL,
>>> it needs to be done the same way on both to be consistent:
>>
>> The intention was to disable only compat not native 32-bit.
>
> I'm not following why that would be considered a valid option,
> as that clearly breaks existing users that update from a 32-bit
> kernel to a 64-bit one.
Do you mean users who move 32-bit binaries (without recompiling) to a
new x64 kernel? Does the kernel guarantees that to work? I'd personally
care more native-bitness apps.
>
> Taking away the features from users that are still on 32-bit kernels
> already seems questionable to me, but being inconsistent
> about it seems much worse, in particular when the regression
> is on the upgrade path.
TBH, this won't fix that entirely (e.g. passing non-compat io_uring
to a compat process should yield the same problem). So, let's put
it aside for now until this bikeshedding would be relevant.
>
>>> Can we expect all existing and future user space to have a sane
>>> fallback when IORING_SETUP_SQPOLL fails?
>>
>> SQPOLL has a few differences with non-SQPOLL modes, but it's easy
>> to convert between them. Anyway, SQPOLL is a privileged special
>> case that's here for performance/latency reasons, I don't think
>> there will be any non-accidental users of it.
>
> Ok, so the behavior of 32-bit tasks would be the same as running
> the same application as unprivileged 64-bit tasks, with applications
Yes, something like that, but that's not automatic and in some
(hopefully rare) cases there may be pitfalls. That's in short,
I can expand the idea a bit if anyone would be interested.
> already having to implement that fallback, right?
Well, not everyone _have_ to implement such a fallback, e.g.
applications working only whilst privileged may use SQPOLL only.
--
Pavel Begunkov
^ permalink raw reply
* Re: [PATCH] i2c: cpm: Fix i2c_ram structure
From: Christophe Leroy @ 2020-09-23 8:01 UTC (permalink / raw)
To: Vincent Nicolas, jochen@scram.de
Cc: linuxppc-dev@lists.ozlabs.org, linux-i2c@vger.kernel.org
In-Reply-To: <PR3P193MB0731945473A9F251C7F37608F1380@PR3P193MB0731.EURP193.PROD.OUTLOOK.COM>
Le 23/09/2020 à 09:18, Vincent Nicolas a écrit :
>
>
>
> From: Christophe Leroy <christophe.leroy@csgroup.eu>
> Sent: Tuesday, 22 September 2020 14:38
> To: Vincent Nicolas <Nicolas.Vincent@vossloh.com>; jochen@scram.de <jochen@scram.de>
> Cc: linuxppc-dev@lists.ozlabs.org <linuxppc-dev@lists.ozlabs.org>; linux-i2c@vger.kernel.org <linux-i2c@vger.kernel.org>
> Subject: Re: [PATCH] i2c: cpm: Fix i2c_ram structure
>
>
>
> Le 22/09/2020 à 11:04, nico.vince@gmail.com a écrit :
>> From: Nicolas VINCENT <nicolas.vincent@vossloh.com>
>>
>> the i2c_ram structure is missing the sdmatmp field mentionned in
>> datasheet for MPC8272 at paragraph 36.5. With this field missing, the
>> hardware would write past the allocated memory done through
>> cpm_muram_alloc for the i2c_ram structure and land in memory allocated
>> for the buffers descriptors corrupting the cbd_bufaddr field. Since this
>> field is only set during setup(), the first i2c transaction would work
>> and the following would send data read from an arbitrary memory
>> location.
>>
>> Signed-off-by: Nicolas VINCENT <nicolas.vincent@vossloh.com>
>> ---
>> drivers/i2c/busses/i2c-cpm.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
>> index 1213e1932ccb..c5700addbf65 100644
>> --- a/drivers/i2c/busses/i2c-cpm.c
>> +++ b/drivers/i2c/busses/i2c-cpm.c
>> @@ -64,7 +64,8 @@ struct i2c_ram {
>> uint txtmp; /* Internal */
>> char res1[4]; /* Reserved */
>> ushort rpbase; /* Relocation pointer */
>> - char res2[2]; /* Reserved */
>> + char res2[6]; /* Reserved */
>> + uint sdmatmp; /* Internal */
>
> On CPM1, I2C param RAM has size 0x30 (offset 0x1c80-0x1caf)
>
> Your change overlaps the miscellaneous area that contains CP Microcode
> Revision Number, ref MPC885 Reference Manual §18.7.3
>
> As far as I understand the mpc885 contains in the dts the compatible=fsl,cpm1-i2c which is used in cpm-i2c.c to either determine the address of the i2c_ram structure (cpm1), or dynamically allocate it with cpm_muram_alloc (cpm2).
> In the first case the structure will indeed overlaps with the miscellaneous section but since the sdmatmp is only used by cpm2 hardware it shall not be an issue.
>
> Please, let me know if I am mistaken. If the patch cannot be accepted as is, I would gladly accept pointers on how to address this kind of issue.
Please use a mail client that properly sets the > in front of
original/answered text. Here your mailer has mixed you text and mine,
that's unusable on the long term.
I think you are right on the fact that it doesn't seem to be an issue.
Nevertheless, that's confusing.
What I would suggest is to leave res2[2] as is, and add something like:
/* The following elements are only for CPM2 */
char res3[4]; /* Reserved */
uint sdmatmp; /* Internal */
Other solution (not sure that's the best solution thought) would be to
do as in spi-fsl-cpm: use iic_t structure from asm/cpm1.h when
CONFIG_CPM1 is selected and use iic_t from asm/cpm2.h when CONFIG_CPM2
is selected, taking into account that CONFIG_CPM1 and CONFIG_CPM2 are
mutually exclusive at the time being.
Christophe
^ permalink raw reply
* [PATCH v3] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed
From: Cédric Le Goater @ 2020-09-23 7:40 UTC (permalink / raw)
To: Michael Ellerman
Cc: Alexey Kardashevskiy, Oliver O'Halloran, linuxppc-dev,
Cédric Le Goater
When a passthrough IO adapter is removed from a pseries machine using
hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the
guest OS to clear all page table entries related to the adapter. If
some are still present, the RTAS call which isolates the PCI slot
returns error 9001 "valid outstanding translations" and the removal of
the IO adapter fails. This is because when the PHBs are scanned, Linux
maps automatically the INTx interrupts in the Linux interrupt number
space but these are never removed.
To solve this problem, we introduce a PPC platform specific
pcibios_remove_bus() routine which clears all interrupt mappings when
the bus is removed. This also clears the associated page table entries
of the ESB pages when using XIVE.
For this purpose, we record the logical interrupt numbers of the
mapped interrupt under the PHB structure and let pcibios_remove_bus()
do the clean up.
Since some PCI adapters, like GPUs, use the "interrupt-map" property
to describe interrupt mappings other than the legacy INTx interrupts,
we can not restrict the size of the mapping array to PCI_NUM_INTX. The
number of interrupt mappings is computed from the "interrupt-map"
property and the mapping array is allocated accordingly.
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
Changes in v3 :
- NULLified 'irq_map' in pci_irq_map_dispose()
arch/powerpc/include/asm/pci-bridge.h | 6 ++
arch/powerpc/kernel/pci-common.c | 115 ++++++++++++++++++++++++++
2 files changed, 121 insertions(+)
diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index d2a2a14e56f9..d21e070352dc 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -48,6 +48,9 @@ struct pci_controller_ops {
/*
* Structure of a PCI controller (host bridge)
+ *
+ * @irq_count: number of interrupt mappings
+ * @irq_map: interrupt mappings
*/
struct pci_controller {
struct pci_bus *bus;
@@ -127,6 +130,9 @@ struct pci_controller {
void *private_data;
struct npu *npu;
+
+ unsigned int irq_count;
+ unsigned int *irq_map;
};
/* These are used for config access before all the PCI probing
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index be108616a721..fb492de6902e 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -353,6 +353,116 @@ struct pci_controller *pci_find_controller_for_domain(int domain_nr)
return NULL;
}
+/*
+ * Assumption is made on the interrupt parent. All interrupt-map
+ * entries are considered to have the same parent.
+ */
+static int pcibios_irq_map_count(struct pci_controller *phb)
+{
+ const __be32 *imap;
+ int imaplen;
+ struct device_node *parent;
+ u32 intsize, addrsize, parintsize, paraddrsize;
+
+ if (of_property_read_u32(phb->dn, "#interrupt-cells", &intsize))
+ return 0;
+ if (of_property_read_u32(phb->dn, "#address-cells", &addrsize))
+ return 0;
+
+ imap = of_get_property(phb->dn, "interrupt-map", &imaplen);
+ if (!imap) {
+ pr_debug("%pOF : no interrupt-map\n", phb->dn);
+ return 0;
+ }
+ imaplen /= sizeof(u32);
+ pr_debug("%pOF : imaplen=%d\n", phb->dn, imaplen);
+
+ if (imaplen < (addrsize + intsize + 1))
+ return 0;
+
+ imap += intsize + addrsize;
+ parent = of_find_node_by_phandle(be32_to_cpup(imap));
+ if (!parent) {
+ pr_debug("%pOF : no imap parent found !\n", phb->dn);
+ return 0;
+ }
+
+ if (of_property_read_u32(parent, "#interrupt-cells", &parintsize)) {
+ pr_debug("%pOF : parent lacks #interrupt-cells!\n", phb->dn);
+ return 0;
+ }
+
+ if (of_property_read_u32(parent, "#address-cells", ¶ddrsize))
+ paraddrsize = 0;
+
+ return imaplen / (addrsize + intsize + 1 + paraddrsize + parintsize);
+}
+
+static void pcibios_irq_map_init(struct pci_controller *phb)
+{
+ phb->irq_count = pcibios_irq_map_count(phb);
+ if (phb->irq_count < PCI_NUM_INTX)
+ phb->irq_count = PCI_NUM_INTX;
+
+ pr_debug("%pOF : interrupt map #%d\n", phb->dn, phb->irq_count);
+
+ phb->irq_map = kcalloc(phb->irq_count, sizeof(unsigned int),
+ GFP_KERNEL);
+}
+
+static void pci_irq_map_register(struct pci_dev *pdev, unsigned int virq)
+{
+ struct pci_controller *phb = pci_bus_to_host(pdev->bus);
+ int i;
+
+ if (!phb->irq_map)
+ return;
+
+ for (i = 0; i < phb->irq_count; i++) {
+ /*
+ * Look for an empty or an equivalent slot, as INTx
+ * interrupts can be shared between adapters.
+ */
+ if (phb->irq_map[i] == virq || !phb->irq_map[i]) {
+ phb->irq_map[i] = virq;
+ break;
+ }
+ }
+
+ if (i == phb->irq_count)
+ pr_err("PCI:%s all platform interrupts mapped\n",
+ pci_name(pdev));
+}
+
+/*
+ * Clearing the mapped interrupts will also clear the underlying
+ * mappings of the ESB pages of the interrupts when under XIVE. It is
+ * a requirement of PowerVM to clear all memory mappings before
+ * removing a PHB.
+ */
+static void pci_irq_map_dispose(struct pci_bus *bus)
+{
+ struct pci_controller *phb = pci_bus_to_host(bus);
+ int i;
+
+ if (!phb->irq_map)
+ return;
+
+ pr_debug("PCI: Clearing interrupt mappings for PHB %04x:%02x...\n",
+ pci_domain_nr(bus), bus->number);
+ for (i = 0; i < phb->irq_count; i++)
+ irq_dispose_mapping(phb->irq_map[i]);
+
+ kfree(phb->irq_map);
+ phb->irq_map = NULL;
+}
+
+void pcibios_remove_bus(struct pci_bus *bus)
+{
+ pci_irq_map_dispose(bus);
+}
+EXPORT_SYMBOL_GPL(pcibios_remove_bus);
+
/*
* Reads the interrupt pin to determine if interrupt is use by card.
* If the interrupt is used, then gets the interrupt line from the
@@ -401,6 +511,8 @@ static int pci_read_irq_line(struct pci_dev *pci_dev)
pci_dev->irq = virq;
+ /* Record all interrut mappings for later removal of a PHB */
+ pci_irq_map_register(pci_dev, virq);
return 0;
}
@@ -1554,6 +1666,9 @@ void pcibios_scan_phb(struct pci_controller *hose)
pr_debug("PCI: Scanning PHB %pOF\n", node);
+ /* Allocate interrupt mappings array */
+ pcibios_irq_map_init(hose);
+
/* Get some IO space for the new PHB */
pcibios_setup_phb_io_space(hose);
--
2.25.4
^ permalink raw reply related
* [PATCH -next v2] powerpc/perf: Fix symbol undeclared warning
From: Wang Wensheng @ 2020-09-23 7:14 UTC (permalink / raw)
To: mpe, benh, paulus, atrajeev, maddy, wangwensheng4, anju,
linuxppc-dev, linux-kernel
Cc: rui.xiang
Build kernel with `C=2`:
arch/powerpc/perf/isa207-common.c:24:18: warning: symbol
'isa207_pmu_format_attr' was not declared. Should it be static?
arch/powerpc/perf/power9-pmu.c:101:5: warning: symbol 'p9_dd21_bl_ev'
was not declared. Should it be static?
arch/powerpc/perf/power9-pmu.c:115:5: warning: symbol 'p9_dd22_bl_ev'
was not declared. Should it be static?
Those symbols are used only in the files that define them so we declare
them as static to fix the warnings.
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
---
arch/powerpc/perf/isa207-common.c | 2 +-
arch/powerpc/perf/power9-pmu.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index 964437adec18..85dc860b265b 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -21,7 +21,7 @@ PMU_FORMAT_ATTR(thresh_stop, "config:32-35");
PMU_FORMAT_ATTR(thresh_start, "config:36-39");
PMU_FORMAT_ATTR(thresh_cmp, "config:40-49");
-struct attribute *isa207_pmu_format_attr[] = {
+static struct attribute *isa207_pmu_format_attr[] = {
&format_attr_event.attr,
&format_attr_pmcxsel.attr,
&format_attr_mark.attr,
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 2a57e93a79dc..4a315fad1f99 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -98,7 +98,7 @@ extern u64 PERF_REG_EXTENDED_MASK;
/* PowerISA v2.07 format attribute structure*/
extern struct attribute_group isa207_pmu_format_group;
-int p9_dd21_bl_ev[] = {
+static int p9_dd21_bl_ev[] = {
PM_MRK_ST_DONE_L2,
PM_RADIX_PWC_L1_HIT,
PM_FLOP_CMPL,
@@ -112,7 +112,7 @@ int p9_dd21_bl_ev[] = {
PM_DISP_HELD_SYNC_HOLD,
};
-int p9_dd22_bl_ev[] = {
+static int p9_dd22_bl_ev[] = {
PM_DTLB_MISS_16G,
PM_DERAT_MISS_2M,
PM_DTLB_MISS_2M,
--
2.25.0
^ permalink raw reply related
* Re: [PATCH v2] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed
From: Cédric Le Goater @ 2020-09-23 7:06 UTC (permalink / raw)
To: Qian Cai, Michael Ellerman
Cc: Stephen Rothwell, Alexey Kardashevskiy, linux-kernel, linux-next,
Oliver O'Halloran, linuxppc-dev
In-Reply-To: <9c5eca863c63e360662fae7597213e8927c2a885.camel@redhat.com>
On 9/23/20 2:33 AM, Qian Cai wrote:
> On Fri, 2020-08-07 at 12:18 +0200, Cédric Le Goater wrote:
>> When a passthrough IO adapter is removed from a pseries machine using
>> hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the
>> guest OS to clear all page table entries related to the adapter. If
>> some are still present, the RTAS call which isolates the PCI slot
>> returns error 9001 "valid outstanding translations" and the removal of
>> the IO adapter fails. This is because when the PHBs are scanned, Linux
>> maps automatically the INTx interrupts in the Linux interrupt number
>> space but these are never removed.
>>
>> To solve this problem, we introduce a PPC platform specific
>> pcibios_remove_bus() routine which clears all interrupt mappings when
>> the bus is removed. This also clears the associated page table entries
>> of the ESB pages when using XIVE.
>>
>> For this purpose, we record the logical interrupt numbers of the
>> mapped interrupt under the PHB structure and let pcibios_remove_bus()
>> do the clean up.
>>
>> Since some PCI adapters, like GPUs, use the "interrupt-map" property
>> to describe interrupt mappings other than the legacy INTx interrupts,
>> we can not restrict the size of the mapping array to PCI_NUM_INTX. The
>> number of interrupt mappings is computed from the "interrupt-map"
>> property and the mapping array is allocated accordingly.
>>
>> Cc: "Oliver O'Halloran" <oohall@gmail.com>
>> Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>
> Some syscall fuzzing will trigger this on POWER9 NV where the traces pointed to
> this patch.
>
> .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config
OK. The patch is missing a NULL assignement after kfree() and that
might be the issue.
I did try PHB removal under PowerNV, so I would like to understand
how we managed to remove twice the PCI bus and possibly reproduce.
Any chance we could grab what the syscall fuzzer (syzkaller) did ?
Thanks,
C.
>
> [ 3574.564109][ T965] ata1.00: disabled
> [ 3574.580373][T151472] sd 0:0:0:0: [sdb] Synchronizing SCSI cache
> [ 3574.581180][T151472] sd 0:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=0x04 driverbyte=0x00
> [ 3574.581226][T151472] sd 0:0:0:0: [sdb] Stopping disk
> [ 3574.581289][T151472] sd 0:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=0x04 driverbyte=0x00
> [ 3574.611424][ T3019] Read-error on swap-device (254:1:849792)
> [ 3574.611685][ T3019] Read-error on swap-device (254:1:914944)
> [ 3574.611769][ T3019] Read-error on swap-device (254:1:915072)
> [ 3574.611838][ T3019] Read-error on swap-device (254:1:915200)
> [ 3574.611926][ T3019] Read-error on swap-device (254:1:915328)
> [ 3574.612268][ T3084] Read-error on swap-device (254:1:792576)
> [ 3574.612342][ T3084] Read-error on swap-device (254:1:792704)
> [ 3574.612757][ T2362] Read-error on swap-device (254:1:957440)
> [ 3574.612773][ T2905] Read-error on swap-device (254:1:784128)
> [ 3574.613015][ T2362] Read-error on swap-device (254:1:957568)
> [ 3574.613160][ T2905] Read-error on swap-device (254:1:784256)
> [ 3574.613241][ T2362] Read-error on swap-device (254:1:957696)
> [ 3574.613342][ T2362] Read-error on swap-device (254:1:957824)
> [ 3574.614448][ T3019] Core dump to |/usr/lib/systemd/systemd-coredump pipe failed
> [ 3574.614663][ T3019] Read-error on swap-device (254:1:961536)
> [ 3574.675330][T151844] Read-error on swap-device (254:1:128)
> [ 3574.675515][T151844] Read-error on swap-device (254:1:256)
> [ 3574.675700][T151844] Read-error on swap-device (254:1:384)
> [ 3574.703570][ T971] ata2.00: disabled
> [ 3574.710393][T151472] sd 1:0:0:0: [sda] Synchronizing SCSI cache
> [ 3574.710864][T151472] sd 1:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=0x04 driverbyte=0x00
> [ 3574.710922][T151472] sd 1:0:0:0: [sda] Stopping disk
> [ 3574.711010][T151472] sd 1:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=0x04 driverbyte=0x00
> [ 3574.826569][ T674] dm-0: writeback error on inode 68507862, offset 65536, sector 54281504
> [ 3575.117547][ T3366] dm-0: writeback error on inode 68507851, offset 0, sector 54378880
> [ 3575.140104][T151472] pci 0004:03:00.0: Removing from iommu group 3
> [ 3575.141778][T151472] pci 0004:03 : [PE# fb] Releasing PE
> [ 3575.141965][T151472] pci 0004:03 : [PE# fb] Removing DMA window #0
> [ 3575.142452][T151472] pci 0004:03 : [PE# fb] Disabling 64-bit DMA bypass
> [ 3575.149369][T151472] pci_bus 0004:03: busn_res: [bus 03] is released
> [ 3575.150574][T152037] Read-error on swap-device (254:1:35584)
> [ 3575.150713][T152037] Read-error on swap-device (254:1:35712)
> [ 3575.152632][T152037] Read-error on swap-device (254:1:915584)
> [ 3575.152706][T151472] pci_bus 0004:04: busn_res: [bus 04-08] is released
> [ 3575.152983][T151472] =============================================================================
> [ 3575.153937][T151472] BUG kmalloc-16 (Not tainted): Object already free
> [ 3575.153962][T151472] -----------------------------------------------------------------------------
> [ 3575.153962][T151472]
> [ 3575.154020][T151472] Disabling lock debugging due to kernel taint
> [ 3575.154047][T151472] INFO: Allocated in pcibios_scan_phb+0x104/0x3e0 age=356904 cpu=4 pid=1
> [ 3575.154084][T151472] __slab_alloc+0xa4/0xf0
> [ 3575.154105][T151472] __kmalloc+0x294/0x330
> [ 3575.154127][T151472] pcibios_scan_phb+0x104/0x3e0
> [ 3575.154165][T151472] pcibios_init+0x84/0x124
> [ 3575.154209][T151472] do_one_initcall+0xac/0x528
> [ 3575.154241][T151472] kernel_init_freeable+0x35c/0x3fc
> [ 3575.154272][T151472] kernel_init+0x24/0x148
> [ 3575.154306][T151472] ret_from_kernel_thread+0x5c/0x80
> [ 3575.154352][T151472] INFO: Freed in pcibios_remove_bus+0x70/0x90 age=0 cpu=7 pid=151472
> [ 3575.154387][T151472] kfree+0x49c/0x510
> [ 3575.154406][T151472] pcibios_remove_bus+0x70/0x90
> [ 3575.154443][T151472] pci_remove_bus+0xe4/0x110
> [ 3575.154467][T151472] pci_remove_bus_device+0x74/0x170
> [ 3575.154503][T151472] pci_remove_bus_device+0x4c/0x170
> [ 3575.154524][T151472] pci_stop_and_remove_bus_device_locked+0x34/0x50
> [ 3575.154539][T151472] remove_store+0xc0/0xe0
> [ 3575.154551][T151472] dev_attr_store+0x30/0x50
> [ 3575.154573][T151472] sysfs_kf_write+0x68/0xb0
> [ 3575.154595][T151472] kernfs_fop_write+0x114/0x260
> [ 3575.154643][T151472] vfs_write+0xe4/0x260
> [ 3575.154667][T151472] ksys_write+0x74/0x130
> [ 3575.154692][T151472] system_call_exception+0xf8/0x1d0
> [ 3575.154728][T151472] system_call_common+0xe8/0x218
> [ 3575.154788][T151472] INFO: Slab 0x00000000cafdf25c objects=178 used=174 fp=0x0000000020a64b99 flags=0x7fff8000000201
> [ 3575.154848][T151472] INFO: Object 0x00000000af116201 @offset=5168 fp=0x0000000000000000
> [ 3575.154848][T151472]
> [ 3575.154901][T151472] Redzone 00000000781d3b96: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> [ 3575.154968][T151472] Object 00000000af116201: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
> [ 3575.155012][T151472] Redzone 000000007b8ec00f: bb bb bb bb bb bb bb bb ........
> [ 3575.155052][T151472] Padding 00000000df7d5e89: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
> [ 3575.155107][T151472] CPU: 7 PID: 151472 Comm: trinity-c61 Tainted: G B 5.9.0-rc6-next-20200922+ #2
> [ 3575.155147][T151472] Call Trace:
> [ 3575.155171][T151472] [c000001fefdef800] [c000000000644278] dump_stack+0xec/0x144 (unreliable)
> [ 3575.155204][T151472] [c000001fefdef840] [c00000000035e9a8] print_trailer+0x278/0x2a0
> [ 3575.155220][T151472] [c000001fefdef8d0] [c000000000355d9c] free_debug_processing+0x57c/0x600
> [ 3575.155236][T151472] [c000001fefdef9b0] [c000000000356234] __slab_free+0x414/0x5b0
> [ 3575.155251][T151472] [c000001fefdefa80] [c00000000035686c] kfree+0x49c/0x510
> [ 3575.155267][T151472] [c000001fefdefb10] [c000000000043260] pcibios_remove_bus+0x70/0x90
> pci_irq_map_dispose at arch/powerpc/kernel/pci-common.c:456
> (inlined by) pcibios_remove_bus at arch/powerpc/kernel/pci-common.c:461
> [ 3575.155282][T151472] [c000001fefdefb40] [c00000000066fee4] pci_remove_bus+0xe4/0x110
> [ 3575.155309][T151472] [c000001fefdefb70] [c000000000670084] pci_remove_bus_device+0x74/0x170
> [ 3575.155348][T151472] [c000001fefdefbb0] [c000000000670070] pci_remove_bus_device+0x60/0x170
> [ 3575.155377][T151472] [c000001fefdefbf0] [c0000000006701f4] pci_stop_and_remove_bus_device_locked+0x34/0x50
> [ 3575.155395][T151472] [c000001fefdefc20] [c00000000067f5e0] remove_store+0xc0/0xe0
> [ 3575.155431][T151472] [c000001fefdefc70] [c0000000006dccd0] dev_attr_store+0x30/0x50
> [ 3575.155457][T151472] [c000001fefdefc90] [c00000000049fae8] sysfs_kf_write+0x68/0xb0
> [ 3575.155507][T151472] [c000001fefdefcd0] [c00000000049ed14] kernfs_fop_write+0x114/0x260
> [ 3575.155553][T151472] [c000001fefdefd20] [c0000000003ab2d4] vfs_write+0xe4/0x260
> [ 3575.155592][T151472] [c000001fefdefd70] [c0000000003ab604] ksys_write+0x74/0x130
> [ 3575.155630][T151472] [c000001fefdefdc0] [c00000000002a458] system_call_exception+0xf8/0x1d0
> [ 3575.155671][T151472] [c000001fefdefe20] [c00000000000d0a8] system_call_common+0xe8/0x218
> [ 3575.155715][T151472] FIX kmalloc-16: Object at 0x00000000af116201 not freed
> [ 3575.156125][T151472] =============================================================================
> [ 3575.156170][T151472] BUG kmalloc-16 (Tainted: G B ): Wrong object count. Counter is 174 but counted were 176
> [ 3575.156204][T151472] -----------------------------------------------------------------------------
> [ 3575.156204][T151472]
> [ 3575.156240][T151472] INFO: Slab 0x00000000cafdf25c objects=178 used=174 fp=0x0000000020a64b99 flags=0x7fff8000000201
> [ 3575.156296][T151472] CPU: 7 PID: 151472 Comm: trinity-c61 Tainted: G B 5.9.0-rc6-next-20200922+ #2
> [ 3575.156348][T151472] Call Trace:
> [ 3575.156376][T151472] [c000001fefdef6f0] [c000000000644278] dump_stack+0xec/0x144 (unreliable)
> [ 3575.156439][T151472] [c000001fefdef730] [c00000000035e688] slab_err+0x78/0xb0
> [ 3575.156502][T151472] [c000001fefdef810] [c0000000003552a4] on_freelist+0x364/0x390
> [ 3575.156541][T151472] [c000001fefdef8b0] [c000000000355aa8] free_debug_processing+0x288/0x600
> [ 3575.156598][T151472] [c000001fefdef990] [c000000000356234] __slab_free+0x414/0x5b0
> [ 3575.156644][T151472] [c000001fefdefa60] [c00000000035686c] kfree+0x49c/0x510
> [ 3575.156701][T151472] [c000001fefdefaf0] [c0000000002b8a10] kfree_const+0x60/0x80
> [ 3575.156738][T151472] [c000001fefdefb10] [c00000000064d3bc] kobject_release+0x7c/0xd0
> [ 3575.156784][T151472] [c000001fefdefb50] [c0000000006de0c0] put_device+0x20/0x40
> [ 3575.156831][T151472] [c000001fefdefb70] [c00000000067015c] pci_remove_bus_device+0x14c/0x170
> [ 3575.156867][T151472] [c000001fefdefbb0] [c000000000670070] pci_remove_bus_device+0x60/0x170
> [ 3575.156923][T151472] [c000001fefdefbf0] [c0000000006701f4] pci_stop_and_remove_bus_device_locked+0x34/0x50
> [ 3575.156998][T151472] [c000001fefdefc20] [c00000000067f5e0] remove_store+0xc0/0xe0
> [ 3575.157058][T151472] [c000001fefdefc70] [c0000000006dccd0] dev_attr_store+0x30/0x50
> [ 3575.157114][T151472] [c000001fefdefc90] [c00000000049fae8] sysfs_kf_write+0x68/0xb0
> [ 3575.157169][T151472] [c000001fefdefcd0] [c00000000049ed14] kernfs_fop_write+0x114/0x260
> [ 3575.157215][T151472] [c000001fefdefd20] [c0000000003ab2d4] vfs_write+0xe4/0x260
> [ 3575.157256][T151472] [c000001fefdefd70] [c0000000003ab604] ksys_write+0x74/0x130
> [ 3575.157301][T151472] [c000001fefdefdc0] [c00000000002a458] system_call_exception+0xf8/0x1d0
> [ 3575.157352][T151472] [c000001fefdefe20] [c00000000000d0a8] system_call_common+0xe8/0x218
> [ 3575.157396][T151472] FIX kmalloc-16: Object count adjusted.
> [ 3575.157457][T151472] pci_bus 0004:09: busn_res: [bus 09-0d] is released
>
>
>> ---
>>
>> Changes since v2:
>>
>> - merged 2 patches.
>>
>> arch/powerpc/include/asm/pci-bridge.h | 6 ++
>> arch/powerpc/kernel/pci-common.c | 114 ++++++++++++++++++++++++++
>> 2 files changed, 120 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/pci-bridge.h
>> b/arch/powerpc/include/asm/pci-bridge.h
>> index b92e81b256e5..ca75cf264ddf 100644
>> --- a/arch/powerpc/include/asm/pci-bridge.h
>> +++ b/arch/powerpc/include/asm/pci-bridge.h
>> @@ -48,6 +48,9 @@ struct pci_controller_ops {
>>
>> /*
>> * Structure of a PCI controller (host bridge)
>> + *
>> + * @irq_count: number of interrupt mappings
>> + * @irq_map: interrupt mappings
>> */
>> struct pci_controller {
>> struct pci_bus *bus;
>> @@ -127,6 +130,9 @@ struct pci_controller {
>>
>> void *private_data;
>> struct npu *npu;
>> +
>> + unsigned int irq_count;
>> + unsigned int *irq_map;
>> };
>>
>> /* These are used for config access before all the PCI probing
>> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-
>> common.c
>> index be108616a721..deb831f0ae13 100644
>> --- a/arch/powerpc/kernel/pci-common.c
>> +++ b/arch/powerpc/kernel/pci-common.c
>> @@ -353,6 +353,115 @@ struct pci_controller
>> *pci_find_controller_for_domain(int domain_nr)
>> return NULL;
>> }
>>
>> +/*
>> + * Assumption is made on the interrupt parent. All interrupt-map
>> + * entries are considered to have the same parent.
>> + */
>> +static int pcibios_irq_map_count(struct pci_controller *phb)
>> +{
>> + const __be32 *imap;
>> + int imaplen;
>> + struct device_node *parent;
>> + u32 intsize, addrsize, parintsize, paraddrsize;
>> +
>> + if (of_property_read_u32(phb->dn, "#interrupt-cells", &intsize))
>> + return 0;
>> + if (of_property_read_u32(phb->dn, "#address-cells", &addrsize))
>> + return 0;
>> +
>> + imap = of_get_property(phb->dn, "interrupt-map", &imaplen);
>> + if (!imap) {
>> + pr_debug("%pOF : no interrupt-map\n", phb->dn);
>> + return 0;
>> + }
>> + imaplen /= sizeof(u32);
>> + pr_debug("%pOF : imaplen=%d\n", phb->dn, imaplen);
>> +
>> + if (imaplen < (addrsize + intsize + 1))
>> + return 0;
>> +
>> + imap += intsize + addrsize;
>> + parent = of_find_node_by_phandle(be32_to_cpup(imap));
>> + if (!parent) {
>> + pr_debug("%pOF : no imap parent found !\n", phb->dn);
>> + return 0;
>> + }
>> +
>> + if (of_property_read_u32(parent, "#interrupt-cells", &parintsize)) {
>> + pr_debug("%pOF : parent lacks #interrupt-cells!\n", phb->dn);
>> + return 0;
>> + }
>> +
>> + if (of_property_read_u32(parent, "#address-cells", ¶ddrsize))
>> + paraddrsize = 0;
>> +
>> + return imaplen / (addrsize + intsize + 1 + paraddrsize + parintsize);
>> +}
>> +
>> +static void pcibios_irq_map_init(struct pci_controller *phb)
>> +{
>> + phb->irq_count = pcibios_irq_map_count(phb);
>> + if (phb->irq_count < PCI_NUM_INTX)
>> + phb->irq_count = PCI_NUM_INTX;
>> +
>> + pr_debug("%pOF : interrupt map #%d\n", phb->dn, phb->irq_count);
>> +
>> + phb->irq_map = kcalloc(phb->irq_count, sizeof(unsigned int),
>> + GFP_KERNEL);
>> +}
>> +
>> +static void pci_irq_map_register(struct pci_dev *pdev, unsigned int virq)
>> +{
>> + struct pci_controller *phb = pci_bus_to_host(pdev->bus);
>> + int i;
>> +
>> + if (!phb->irq_map)
>> + return;
>> +
>> + for (i = 0; i < phb->irq_count; i++) {
>> + /*
>> + * Look for an empty or an equivalent slot, as INTx
>> + * interrupts can be shared between adapters.
>> + */
>> + if (phb->irq_map[i] == virq || !phb->irq_map[i]) {
>> + phb->irq_map[i] = virq;
>> + break;
>> + }
>> + }
>> +
>> + if (i == phb->irq_count)
>> + pr_err("PCI:%s all platform interrupts mapped\n",
>> + pci_name(pdev));
>> +}
>> +
>> +/*
>> + * Clearing the mapped interrupts will also clear the underlying
>> + * mappings of the ESB pages of the interrupts when under XIVE. It is
>> + * a requirement of PowerVM to clear all memory mappings before
>> + * removing a PHB.
>> + */
>> +static void pci_irq_map_dispose(struct pci_bus *bus)
>> +{
>> + struct pci_controller *phb = pci_bus_to_host(bus);
>> + int i;
>> +
>> + if (!phb->irq_map)
>> + return;
>> +
>> + pr_debug("PCI: Clearing interrupt mappings for PHB %04x:%02x...\n",
>> + pci_domain_nr(bus), bus->number);
>> + for (i = 0; i < phb->irq_count; i++)
>> + irq_dispose_mapping(phb->irq_map[i]);
>> +
>> + kfree(phb->irq_map);
>> +}
>> +
>> +void pcibios_remove_bus(struct pci_bus *bus)
>> +{
>> + pci_irq_map_dispose(bus);
>> +}
>> +EXPORT_SYMBOL_GPL(pcibios_remove_bus);
>> +
>> /*
>> * Reads the interrupt pin to determine if interrupt is use by card.
>> * If the interrupt is used, then gets the interrupt line from the
>> @@ -401,6 +510,8 @@ static int pci_read_irq_line(struct pci_dev *pci_dev)
>>
>> pci_dev->irq = virq;
>>
>> + /* Record all interrut mappings for later removal of a PHB */
>> + pci_irq_map_register(pci_dev, virq);
>> return 0;
>> }
>>
>> @@ -1554,6 +1665,9 @@ void pcibios_scan_phb(struct pci_controller *hose)
>>
>> pr_debug("PCI: Scanning PHB %pOF\n", node);
>>
>> + /* Allocate interrupt mappings array */
>> + pcibios_irq_map_init(hose);
>> +
>> /* Get some IO space for the new PHB */
>> pcibios_setup_phb_io_space(hose);
>>
>
^ permalink raw reply
* RE: [PATCH 1/5] Documentation: dt: binding: fsl: Add 'fsl,ippdexpcr1-alt-addr' property
From: Ran Wang @ 2020-09-23 6:44 UTC (permalink / raw)
To: Rob Herring
Cc: devicetree@vger.kernel.org, Biwen Li, Shawn Guo,
linux-kernel@vger.kernel.org, Leo Li,
linuxppc-dev@lists.ozlabs.org,
linux-arm-kernel@lists.infradead.org
In-Reply-To: <20200923023234.GA3751572@bogus>
Hi Rob,
On Wednesday, September 23, 2020 10:33 AM, Rob Herring wrote:
>
> On Wed, Sep 16, 2020 at 04:18:27PM +0800, Ran Wang wrote:
> > From: Biwen Li <biwen.li@nxp.com>
> >
> > The 'fsl,ippdexpcr1-alt-addr' property is used to handle an errata
> > A-008646 on LS1021A
> >
> > Signed-off-by: Biwen Li <biwen.li@nxp.com>
> > Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
> > ---
> > Documentation/devicetree/bindings/soc/fsl/rcpm.txt | 19
> > +++++++++++++++++++
> > 1 file changed, 19 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
> > b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
> > index 5a33619..1be58a3 100644
> > --- a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
> > +++ b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
> > @@ -34,6 +34,11 @@ Chassis Version Example Chips
> > Optional properties:
> > - little-endian : RCPM register block is Little Endian. Without it RCPM
> > will be Big Endian (default case).
> > + - fsl,ippdexpcr1-alt-addr : The property is related to a hardware issue
> > + on SoC LS1021A and only needed on SoC LS1021A.
> > + Must include 2 entries:
> > + The first entry must be a link to the SCFG device node.
> > + The 2nd entry must be offset of register IPPDEXPCR1 in SCFG.
>
> You don't need a DT change for this. You can find SCFG node by its compatible
> string and then the offset should be known given this issue is only on 1 SoC.
Did you mean that RCPM driver just to access IPPDEXPCR1 shadowed register in SCFG
directly without fetching it's offset info. from DT?
Regards,
Ran
^ permalink raw reply
* [PATCH v2 5/5] arm: dts: ls1021a: fix rcpm failed to claim resource
From: Ran Wang @ 2020-09-23 6:25 UTC (permalink / raw)
To: Li Yang, Rob Herring, Shawn Guo
Cc: devicetree, Ran Wang, linuxppc-dev, linux-kernel,
linux-arm-kernel
In-Reply-To: <20200923062510.38253-1-ran.wang_1@nxp.com>
The range of dcfg reg is wrong, which overlap with other device,
such as rcpm. This issue causing rcpm driver failed to claim
reg resource when calling devm_ioremap_resource().
Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
Acked-by: Li Yang <leoyang.li@nxp.com>
---
Change in v2:
- None
arch/arm/boot/dts/ls1021a.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi
index e372630f..286c547 100644
--- a/arch/arm/boot/dts/ls1021a.dtsi
+++ b/arch/arm/boot/dts/ls1021a.dtsi
@@ -173,7 +173,7 @@
dcfg: dcfg@1ee0000 {
compatible = "fsl,ls1021a-dcfg", "syscon";
- reg = <0x0 0x1ee0000 0x0 0x10000>;
+ reg = <0x0 0x1ee0000 0x0 0x1000>;
big-endian;
};
--
2.7.4
^ permalink raw reply related
* [PATCH v2 4/5] arm: dts: ls1021a: fix flextimer failed to wake system
From: Ran Wang @ 2020-09-23 6:25 UTC (permalink / raw)
To: Li Yang, Rob Herring, Shawn Guo
Cc: devicetree, Ran Wang, linuxppc-dev, linux-kernel,
linux-arm-kernel
In-Reply-To: <20200923062510.38253-1-ran.wang_1@nxp.com>
The data of property 'fsl,rcpm-wakeup' is not corrcet, which causing
RCPM driver incorrectly program register IPPDEXPCR1, then flextimer is
wrongly clock gated during system suspend, can't send interrupt to
wake.
Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
Acked-by: Li Yang <leoyang.li@nxp.com>
---
Change in v2:
- None
arch/arm/boot/dts/ls1021a.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi
index 089fe86..e372630f 100644
--- a/arch/arm/boot/dts/ls1021a.dtsi
+++ b/arch/arm/boot/dts/ls1021a.dtsi
@@ -1014,7 +1014,7 @@
compatible = "fsl,ls1021a-ftm-alarm";
reg = <0x0 0x29d0000 0x0 0x10000>;
reg-names = "ftm";
- fsl,rcpm-wakeup = <&rcpm 0x20000 0x0>;
+ fsl,rcpm-wakeup = <&rcpm 0x0 0x20000000>;
interrupts = <GIC_SPI 118 IRQ_TYPE_LEVEL_HIGH>;
big-endian;
};
--
2.7.4
^ permalink raw reply related
* [PATCH v2 3/5] arm: dts: ls1021a: enable RCPM workaround for erratum A-008646
From: Ran Wang @ 2020-09-23 6:25 UTC (permalink / raw)
To: Li Yang, Rob Herring, Shawn Guo
Cc: devicetree, Biwen Li, linux-kernel, Ran Wang, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20200923062510.38253-1-ran.wang_1@nxp.com>
From: Biwen Li <biwen.li@nxp.com>
The patch fixes a bug that FlexTimer cannot
wakeup system in deep sleep.
Signed-off-by: Biwen Li <biwen.li@nxp.com>
Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
---
Change in v2:
- Change subject of commit message to be consistent with other related patches.
arch/arm/boot/dts/ls1021a.dtsi | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi
index 827373e..089fe86 100644
--- a/arch/arm/boot/dts/ls1021a.dtsi
+++ b/arch/arm/boot/dts/ls1021a.dtsi
@@ -1007,6 +1007,7 @@
compatible = "fsl,ls1021a-rcpm", "fsl,qoriq-rcpm-2.1+";
reg = <0x0 0x1ee2140 0x0 0x8>;
#fsl,rcpm-wakeup-cells = <2>;
+ fsl,ippdexpcr1-alt-addr = <&scfg 0x51c>;
};
ftm_alarm0: timer0@29d0000 {
--
2.7.4
^ permalink raw reply related
* [PATCH v2 1/5] Documentation: dt: binding: fsl: Add 'fsl, ippdexpcr1-alt-addr' property
From: Ran Wang @ 2020-09-23 6:25 UTC (permalink / raw)
To: Li Yang, Rob Herring, Shawn Guo
Cc: devicetree, Biwen Li, linux-kernel, Ran Wang, linuxppc-dev,
linux-arm-kernel
From: Biwen Li <biwen.li@nxp.com>
The 'fsl,ippdexpcr1-alt-addr' property is used to handle an errata A-008646
on LS1021A
Signed-off-by: Biwen Li <biwen.li@nxp.com>
Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
---
Change in v2:
- None
Documentation/devicetree/bindings/soc/fsl/rcpm.txt | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
index 5a33619..1be58a3 100644
--- a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
+++ b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
@@ -34,6 +34,11 @@ Chassis Version Example Chips
Optional properties:
- little-endian : RCPM register block is Little Endian. Without it RCPM
will be Big Endian (default case).
+ - fsl,ippdexpcr1-alt-addr : The property is related to a hardware issue
+ on SoC LS1021A and only needed on SoC LS1021A.
+ Must include 2 entries:
+ The first entry must be a link to the SCFG device node.
+ The 2nd entry must be offset of register IPPDEXPCR1 in SCFG.
Example:
The RCPM node for T4240:
@@ -43,6 +48,20 @@ The RCPM node for T4240:
#fsl,rcpm-wakeup-cells = <2>;
};
+The RCPM node for LS1021A:
+ rcpm: rcpm@1ee2140 {
+ compatible = "fsl,ls1021a-rcpm", "fsl,qoriq-rcpm-2.1+";
+ reg = <0x0 0x1ee2140 0x0 0x8>;
+ #fsl,rcpm-wakeup-cells = <2>;
+
+ /*
+ * The second and third entry compose an alt offset
+ * address for IPPDEXPCR1(SCFG_SPARECR8)
+ */
+ fsl,ippdexpcr1-alt-addr = <&scfg 0x51c>;
+ };
+
+
* Freescale RCPM Wakeup Source Device Tree Bindings
-------------------------------------------
Required fsl,rcpm-wakeup property should be added to a device node if the device
--
2.7.4
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox