LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Thomas Gleixner @ 2020-09-23 14:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Juri Lelli, Peter Zijlstra, Sebastian Andrzej Siewior,
	Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
	Guo Ren, linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
	linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
	Russell King, linux-csky, David Airlie, Mel Gorman,
	open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
	intel-gfx, linuxppc-dev, Steven Rostedt, Jani Nikula,
	Rodrigo Vivi, Dietmar Eggemann, Linux ARM, Chris Zankel,
	Michal Simek, Thomas Bogendoerfer, Nick Hu, Linux-MM,
	Vineet Gupta, LKML, Arnd Bergmann, Daniel Vetter, Paul Mackerras,
	Andrew Morton, Daniel Bristot de Oliveira, David S. Miller,
	Greentime Hu
In-Reply-To: <87sgbbaq0y.fsf@nanos.tec.linutronix.de>

On Mon, Sep 21 2020 at 21:27, Thomas Gleixner wrote:
> On Mon, Sep 21 2020 at 09:24, Linus Torvalds wrote:
>> On Mon, Sep 21, 2020 at 12:39 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> Maybe we really *could* call this new kmap functionality something
>> like "kmap_percpu()" (or maybe "local" is good enough), and make it
>> act like your RT code does for spinlocks - not disable preemption, but
>> only disabling CPU migration.
>
> I"m all for it, but the scheduler people have opinions :)

I just took the latest version of migrate disable patches

  https://lore.kernel.org/r/20200921163557.234036895@infradead.org

removed the RT dependency on top of them and adopted the kmap stuff
(addressing the various comments while at it and unbreaking ARM).

I'm not going to post that until there is consensus about the general
availability of migrate disable, but for those who want to play with it
I've pushed the resulting combo out to:

   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git highmem 

For testing I've tweaked a few places to use the new _local() variants
and it survived testing so far and I've verified that there is actual
preemption which means zap/restore of the thread local kmaps.

Thanks,

        tglx

^ permalink raw reply

* RE: [PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec
From: David Laight @ 2020-09-23 14:38 UTC (permalink / raw)
  To: 'Al Viro', Christoph Hellwig
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
	Arnd Bergmann, linuxppc-dev@lists.ozlabs.org,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton, Linus Torvalds
In-Reply-To: <20200923141654.GJ3421308@ZenIV.linux.org.uk>

From: Al Viro
> Sent: 23 September 2020 15:17
> 
> On Wed, Sep 23, 2020 at 08:05:41AM +0200, Christoph Hellwig wrote:
> 
> > +struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > +		unsigned long nr_segs, unsigned long fast_segs,
> 
> Hmm...  For fast_segs unsigned long had always been ridiculous
> (4G struct iovec on caller stack frame?), but that got me wondering about
> nr_segs and I wish I'd thought of that when introducing import_iovec().
> 
> The thing is, import_iovec() takes unsigned int there.  Which is fine
> (hell, the maximal value that can be accepted in 1024), except that
> we do pass unsigned long syscall argument to it in some places.

It will make diddly-squit difference.
The parameters end up in registers on most calling conventions.
Plausibly you get an extra 'REX' byte on x86 for the 64bit value.
What you want to avoid is explicit sign/zero extension and value
masking after arithmetic.

On x86-64 the 'horrid' type is actually 'signed int'.
It often needs sign extending to 64bits (eg when being
used as an array subscript).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply

* Re: [PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec
From: Al Viro @ 2020-09-23 14:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
	sparclinux, linux-arch, linux-s390, linux-scsi, Linus Torvalds,
	Arnd Bergmann, linux-block, io-uring, linux-arm-kernel,
	Jens Axboe, linux-parisc, netdev, linux-kernel,
	linux-security-module, David Laight, linux-fsdevel, Andrew Morton,
	linuxppc-dev
In-Reply-To: <20200923141654.GJ3421308@ZenIV.linux.org.uk>

On Wed, Sep 23, 2020 at 03:16:54PM +0100, Al Viro wrote:
> On Wed, Sep 23, 2020 at 08:05:41AM +0200, Christoph Hellwig wrote:
> 
> > +struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > +		unsigned long nr_segs, unsigned long fast_segs,
> 
> Hmm...  For fast_segs unsigned long had always been ridiculous
> (4G struct iovec on caller stack frame?), but that got me wondering about
> nr_segs and I wish I'd thought of that when introducing import_iovec().
> 
> The thing is, import_iovec() takes unsigned int there.  Which is fine
> (hell, the maximal value that can be accepted in 1024), except that
> we do pass unsigned long syscall argument to it in some places.
> 
> E.g. vfs_readv() quietly truncates vlen to 32 bits, and vlen can
> come unchanged through sys_readv() -> do_readv() -> vfs_readv().
> With unsigned long passed by syscall glue.
> 
> AFAICS, passing 4G+1 as the third argument to readv(2) on 64bit box
> will be quietly treated as 1 these days.  Which would be fine, except
> that before "switch {compat_,}do_readv_writev() to {compat_,}import_iovec()"
> it used to fail with -EINVAL.
> 
> Userland, BTW, describes readv(2) iovcnt as int; process_vm_readv(),
> OTOH, has these counts unsigned long from the userland POV...
> 
> I suppose we ought to switch import_iovec() to unsigned long for nr_segs ;-/
> Strictly speaking that had been a userland ABI change, even though nothing
> except regression tests checking for expected errors would've been likely
> to notice.  And it looks like no regression tests covered that one...
> 
> Linus, does that qualify for your "if no userland has noticed the change,
> it's not a breakage"?

Egads...  We have sys_readv() with unsigned long for file descriptor, since
1.3.31 when it had been introduced.  And originally it did comparison with
NR_OPEN right in sys_readv().  Then in 2.1.60 it had been switched to
fget(), which used to take unsigned long at that point.  And in 2.1.90pre1
it went unsigned int, so non-zero upper 32 bits in readv(2) first argument
ceased to cause EBADF...

Of course, libc had it as int fd all along.

^ permalink raw reply

* Re: [PATCH 3/9] iov_iter: refactor rw_copy_check_uvector and import_iovec
From: Al Viro @ 2020-09-23 14:49 UTC (permalink / raw)
  To: David Laight
  Cc: linux-aio@kvack.org, linux-mips@vger.kernel.org, David Howells,
	linux-mm@kvack.org, keyrings@vger.kernel.org,
	sparclinux@vger.kernel.org, Christoph Hellwig,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, Linus Torvalds, Arnd Bergmann,
	linux-block@vger.kernel.org, io-uring@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Jens Axboe,
	linux-parisc@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Andrew Morton,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <200cf2b9ce5e408f8838948fda7ce9a0@AcuMS.aculab.com>

On Wed, Sep 23, 2020 at 02:38:24PM +0000, David Laight wrote:
> From: Al Viro
> > Sent: 23 September 2020 15:17
> > 
> > On Wed, Sep 23, 2020 at 08:05:41AM +0200, Christoph Hellwig wrote:
> > 
> > > +struct iovec *iovec_from_user(const struct iovec __user *uvec,
> > > +		unsigned long nr_segs, unsigned long fast_segs,
> > 
> > Hmm...  For fast_segs unsigned long had always been ridiculous
> > (4G struct iovec on caller stack frame?), but that got me wondering about
> > nr_segs and I wish I'd thought of that when introducing import_iovec().
> > 
> > The thing is, import_iovec() takes unsigned int there.  Which is fine
> > (hell, the maximal value that can be accepted in 1024), except that
> > we do pass unsigned long syscall argument to it in some places.
> 
> It will make diddly-squit difference.
> The parameters end up in registers on most calling conventions.
> Plausibly you get an extra 'REX' byte on x86 for the 64bit value.
> What you want to avoid is explicit sign/zero extension and value
> masking after arithmetic.

Don't tell me what I want; your telepathic abilities are consistently sucky.

I am *NOT* talking about microoptimization here.  I have described
the behaviour change of syscall caused by commit 5 years ago.  Which is
generally considered a problem.  Then I asked whether that behaviour
change would fall under the "if nobody noticed, it's not a userland ABI
breakage" exception.

Could you show me the point where I have expressed concerns about
the quality of amd64 code generated for that thing, before or after
the change in question?

^ permalink raw reply

* Re: [PATCH 5/9] fs: remove various compat readv/writev helpers
From: Al Viro @ 2020-09-23 14:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
	sparclinux, linux-arch, linux-s390, linux-scsi, Arnd Bergmann,
	linux-block, io-uring, linux-arm-kernel, Jens Axboe, linux-parisc,
	netdev, linux-kernel, linux-security-module, David Laight,
	linux-fsdevel, Andrew Morton, linuxppc-dev
In-Reply-To: <20200923143251.GA14062@lst.de>

On Wed, Sep 23, 2020 at 04:32:51PM +0200, Christoph Hellwig wrote:
> On Wed, Sep 23, 2020 at 03:25:49PM +0100, Al Viro wrote:
> > On Wed, Sep 23, 2020 at 08:05:43AM +0200, Christoph Hellwig wrote:
> > >  COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
> > > -		const struct compat_iovec __user *,vec,
> > > +		const struct iovec __user *, vec,
> > 
> > Um...  Will it even compile?
> > 
> > >  #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
> > >  COMPAT_SYSCALL_DEFINE4(preadv64, unsigned long, fd,
> > > -		const struct compat_iovec __user *,vec,
> > > +		const struct iovec __user *, vec,
> > 
> > Ditto.  Look into include/linux/compat.h and you'll see
> > 
> > asmlinkage long compat_sys_preadv64(unsigned long fd,
> >                 const struct compat_iovec __user *vec,
> >                 unsigned long vlen, loff_t pos);
> > 
> > How does that manage to avoid the compiler screaming bloody
> > murder?
> 
> That's a very good question.  But it does not just compile but actually
> works.  Probably because all the syscall wrappers mean that we don't
> actually generate the normal names.  I just tried this:
> 
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t offset,
>  asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t count);
>  asmlinkage long sys_write(unsigned int fd, const char __user *buf,
>                             size_t count);
> -asmlinkage long sys_readv(unsigned long fd,
> +asmlinkage long sys_readv(void *fd,
> 
> for fun, and the compiler doesn't care either..

Try to build it for sparc or ppc...

^ permalink raw reply

* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Steven Rostedt @ 2020-09-23 15:52 UTC (permalink / raw)
  To: peterz
  Cc: Juri Lelli, David Airlie, Sebastian Andrzej Siewior,
	Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
	Guo Ren, linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
	linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
	Russell King, linux-csky, Mel Gorman,
	open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
	intel-gfx, linuxppc-dev, Jani Nikula, Greentime Hu, Rodrigo Vivi,
	Thomas Gleixner, Dietmar Eggemann, Linux ARM, Chris Zankel,
	Michal Simek, Thomas Bogendoerfer, Nick Hu, Linux-MM,
	Linus Torvalds, LKML, Arnd Bergmann, Daniel Vetter, Vineet Gupta,
	Paul Mackerras, Andrew Morton, Daniel Bristot de Oliveira,
	David S. Miller
In-Reply-To: <20200923084032.GU1362448@hirez.programming.kicks-ass.net>

On Wed, 23 Sep 2020 10:40:32 +0200
peterz@infradead.org wrote:

> However, with migrate_disable() we can have each task preempted in a
> migrate_disable() region, worse we can stack them all on the _same_ CPU
> (super ridiculous odds, sure). And then we end up only able to run one
> task, with the rest of the CPUs picking their nose.

What if we just made migrate_disable() a local_lock() available for !RT?

I mean make it a priority inheritance PER CPU lock.

That is, no two tasks could do a migrate_disable() on the same CPU? If
one task does a migrate_disable() and then gets preempted and the
preempting task does a migrate_disable() on the same CPU, it will block
and wait for the first task to do a migrate_enable().

No two tasks on the same CPU could enter the migrate_disable() section
simultaneously, just like no two tasks could enter a preempt_disable()
section.

In essence, we just allow local_lock() to be used for both RT and !RT.

Perhaps make migrate_disable() an anonymous local_lock()?

This should lower the SHC in theory, if you can't have stacked migrate
disables on the same CPU.

-- Steve

^ permalink raw reply

* Re: [PATCH v2] i2c: cpm: Fix i2c_ram structure
From: Wolfram Sang @ 2020-09-23 16:08 UTC (permalink / raw)
  To: nicolas.vincent; +Cc: linuxppc-dev, linux-i2c
In-Reply-To: <20200923140840.8700-1-nicolas.vincent@vossloh.com>

[-- Attachment #1: Type: text/plain, Size: 1406 bytes --]

On Wed, Sep 23, 2020 at 04:08:40PM +0200, nico.vince@gmail.com wrote:
> From: Nicolas VINCENT <nicolas.vincent@vossloh.com>
> 
> the i2c_ram structure is missing the sdmatmp field mentionned in
> datasheet for MPC8272 at paragraph 36.5. With this field missing, the
> hardware would write past the allocated memory done through
> cpm_muram_alloc for the i2c_ram structure and land in memory allocated
> for the buffers descriptors corrupting the cbd_bufaddr field. Since this
> field is only set during setup(), the first i2c transaction would work
> and the following would send data read from an arbitrary memory
> location.
> 
> Signed-off-by: Nicolas VINCENT <nicolas.vincent@vossloh.com>

Thanks!

Is someone able to identify a Fixes: tag I could add?

> ---
>  drivers/i2c/busses/i2c-cpm.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
> index 1213e1932ccb..24d584a1c9a7 100644
> --- a/drivers/i2c/busses/i2c-cpm.c
> +++ b/drivers/i2c/busses/i2c-cpm.c
> @@ -65,6 +65,9 @@ struct i2c_ram {
>  	char    res1[4];	/* Reserved */
>  	ushort  rpbase;		/* Relocation pointer */
>  	char    res2[2];	/* Reserved */
> +	/* The following elements are only for CPM2 */
> +	char    res3[4];	/* Reserved */
> +	uint    sdmatmp;	/* Internal */
>  };
>  
>  #define I2COM_START	0x80
> -- 
> 2.17.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 5/9] fs: remove various compat readv/writev helpers
From: Al Viro @ 2020-09-23 16:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
	sparclinux, linux-arch, linux-s390, linux-scsi, Arnd Bergmann,
	linux-block, io-uring, linux-arm-kernel, Jens Axboe, linux-parisc,
	netdev, linux-kernel, linux-security-module, David Laight,
	linux-fsdevel, Andrew Morton, linuxppc-dev
In-Reply-To: <20200923145901.GN3421308@ZenIV.linux.org.uk>

On Wed, Sep 23, 2020 at 03:59:01PM +0100, Al Viro wrote:

> > That's a very good question.  But it does not just compile but actually
> > works.  Probably because all the syscall wrappers mean that we don't
> > actually generate the normal names.  I just tried this:
> > 
> > --- a/include/linux/syscalls.h
> > +++ b/include/linux/syscalls.h
> > @@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t offset,
> >  asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t count);
> >  asmlinkage long sys_write(unsigned int fd, const char __user *buf,
> >                             size_t count);
> > -asmlinkage long sys_readv(unsigned long fd,
> > +asmlinkage long sys_readv(void *fd,
> > 
> > for fun, and the compiler doesn't care either..
> 
> Try to build it for sparc or ppc...

FWIW, declarations in syscalls.h used to serve 4 purposes:
	1) syscall table initializers needed symbols declared
	2) direct calls needed the same
	3) catching mismatches between the declarations and definitions
	4) centralized list of all syscalls

(2) has been (thankfully) reduced for some time; in any case, ksys_... is
used for the remaining ones.

(1) and (3) are served by syscalls.h in architectures other than x86, arm64
and s390.  On those 3 (1) is done otherwise (near the syscall table initializer)
and (3) is not done at all.

I wonder if we should do something like

SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
		 unsigned long, vlen);
in syscalls.h instead, and not under that ifdef.

Let it expand to declaration of sys_...() in generic case and, on x86, into
__do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
what SYSCALL_DEFINE ends up using.

Similar macro would cover compat_sys_...() declarations.  That would
restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
be terribly high - cpp would have more to chew through in syscalls.h,
but it shouldn't be all that costly.  Famous last words, of course...

Does anybody see fundamental problems with that?

^ permalink raw reply

* Re: [PATCH 5/9] fs: remove various compat readv/writev helpers
From: Al Viro @ 2020-09-23 17:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
	sparclinux, linux-arch, linux-s390, linux-scsi, Arnd Bergmann,
	linux-block, io-uring, linux-arm-kernel, Jens Axboe, linux-parisc,
	netdev, linux-kernel, linux-security-module, David Laight,
	linux-fsdevel, Andrew Morton, linuxppc-dev
In-Reply-To: <20200923163831.GO3421308@ZenIV.linux.org.uk>

On Wed, Sep 23, 2020 at 05:38:31PM +0100, Al Viro wrote:
> On Wed, Sep 23, 2020 at 03:59:01PM +0100, Al Viro wrote:
> 
> > > That's a very good question.  But it does not just compile but actually
> > > works.  Probably because all the syscall wrappers mean that we don't
> > > actually generate the normal names.  I just tried this:
> > > 
> > > --- a/include/linux/syscalls.h
> > > +++ b/include/linux/syscalls.h
> > > @@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t offset,
> > >  asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t count);
> > >  asmlinkage long sys_write(unsigned int fd, const char __user *buf,
> > >                             size_t count);
> > > -asmlinkage long sys_readv(unsigned long fd,
> > > +asmlinkage long sys_readv(void *fd,
> > > 
> > > for fun, and the compiler doesn't care either..
> > 
> > Try to build it for sparc or ppc...
> 
> FWIW, declarations in syscalls.h used to serve 4 purposes:
> 	1) syscall table initializers needed symbols declared
> 	2) direct calls needed the same
> 	3) catching mismatches between the declarations and definitions
> 	4) centralized list of all syscalls
> 
> (2) has been (thankfully) reduced for some time; in any case, ksys_... is
> used for the remaining ones.
> 
> (1) and (3) are served by syscalls.h in architectures other than x86, arm64
> and s390.  On those 3 (1) is done otherwise (near the syscall table initializer)
> and (3) is not done at all.
> 
> I wonder if we should do something like
> 
> SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> 		 unsigned long, vlen);
> in syscalls.h instead, and not under that ifdef.
> 
> Let it expand to declaration of sys_...() in generic case and, on x86, into
> __do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
> what SYSCALL_DEFINE ends up using.
> 
> Similar macro would cover compat_sys_...() declarations.  That would
> restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
> be terribly high - cpp would have more to chew through in syscalls.h,
> but it shouldn't be all that costly.  Famous last words, of course...
> 
> Does anybody see fundamental problems with that?

Just to make it clear - I do not propose to fold that into this series;
there we just need to keep those declarations in sync with fs/read_write.c

^ permalink raw reply

* Re: [PATCH 5/9] fs: remove various compat readv/writev helpers
From: Brian Gerst @ 2020-09-23 17:08 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-aio, open list:BROADCOM NVRAM DRIVER, David Howells,
	Linux-MM, keyrings, sparclinux, Christoph Hellwig, linux-arch,
	linux-s390, linux-scsi, Arnd Bergmann, linux-block, io-uring,
	Linux ARM, Jens Axboe, Parisc List, netdev,
	Linux Kernel Mailing List, linux-security-module, David Laight,
	Linux FS-devel Mailing List, Andrew Morton, linuxppc-dev
In-Reply-To: <20200923163831.GO3421308@ZenIV.linux.org.uk>

On Wed, Sep 23, 2020 at 12:39 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Wed, Sep 23, 2020 at 03:59:01PM +0100, Al Viro wrote:
>
> > > That's a very good question.  But it does not just compile but actually
> > > works.  Probably because all the syscall wrappers mean that we don't
> > > actually generate the normal names.  I just tried this:
> > >
> > > --- a/include/linux/syscalls.h
> > > +++ b/include/linux/syscalls.h
> > > @@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t offset,
> > >  asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t count);
> > >  asmlinkage long sys_write(unsigned int fd, const char __user *buf,
> > >                             size_t count);
> > > -asmlinkage long sys_readv(unsigned long fd,
> > > +asmlinkage long sys_readv(void *fd,
> > >
> > > for fun, and the compiler doesn't care either..
> >
> > Try to build it for sparc or ppc...
>
> FWIW, declarations in syscalls.h used to serve 4 purposes:
>         1) syscall table initializers needed symbols declared
>         2) direct calls needed the same
>         3) catching mismatches between the declarations and definitions
>         4) centralized list of all syscalls
>
> (2) has been (thankfully) reduced for some time; in any case, ksys_... is
> used for the remaining ones.
>
> (1) and (3) are served by syscalls.h in architectures other than x86, arm64
> and s390.  On those 3 (1) is done otherwise (near the syscall table initializer)
> and (3) is not done at all.
>
> I wonder if we should do something like
>
> SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
>                  unsigned long, vlen);
> in syscalls.h instead, and not under that ifdef.
>
> Let it expand to declaration of sys_...() in generic case and, on x86, into
> __do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
> what SYSCALL_DEFINE ends up using.
>
> Similar macro would cover compat_sys_...() declarations.  That would
> restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
> be terribly high - cpp would have more to chew through in syscalls.h,
> but it shouldn't be all that costly.  Famous last words, of course...
>
> Does anybody see fundamental problems with that?

I think this would be a good idea.  I have been working on a patchset
to clean up the conditional syscall handling (sys_ni.c), and conflicts
with the prototypes in syscalls.h have been getting in the way.
Having the prototypes use SYSCALL_DECLAREx(...) would solve that
issue.

--
Brian Gerst

^ permalink raw reply

* Re: [PATCH 5/9] fs: remove various compat readv/writev helpers
From: Christoph Hellwig @ 2020-09-23 17:46 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
	sparclinux, Christoph Hellwig, linux-arch, linux-s390, linux-scsi,
	Arnd Bergmann, linux-block, io-uring, linux-arm-kernel,
	Jens Axboe, linux-parisc, netdev, linux-kernel,
	linux-security-module, David Laight, linux-fsdevel, Andrew Morton,
	linuxppc-dev
In-Reply-To: <20200923170527.GQ3421308@ZenIV.linux.org.uk>

On Wed, Sep 23, 2020 at 06:05:27PM +0100, Al Viro wrote:
> On Wed, Sep 23, 2020 at 05:38:31PM +0100, Al Viro wrote:
> > On Wed, Sep 23, 2020 at 03:59:01PM +0100, Al Viro wrote:
> > 
> > > > That's a very good question.  But it does not just compile but actually
> > > > works.  Probably because all the syscall wrappers mean that we don't
> > > > actually generate the normal names.  I just tried this:
> > > > 
> > > > --- a/include/linux/syscalls.h
> > > > +++ b/include/linux/syscalls.h
> > > > @@ -468,7 +468,7 @@ asmlinkage long sys_lseek(unsigned int fd, off_t offset,
> > > >  asmlinkage long sys_read(unsigned int fd, char __user *buf, size_t count);
> > > >  asmlinkage long sys_write(unsigned int fd, const char __user *buf,
> > > >                             size_t count);
> > > > -asmlinkage long sys_readv(unsigned long fd,
> > > > +asmlinkage long sys_readv(void *fd,
> > > > 
> > > > for fun, and the compiler doesn't care either..
> > > 
> > > Try to build it for sparc or ppc...
> > 
> > FWIW, declarations in syscalls.h used to serve 4 purposes:
> > 	1) syscall table initializers needed symbols declared
> > 	2) direct calls needed the same
> > 	3) catching mismatches between the declarations and definitions
> > 	4) centralized list of all syscalls
> > 
> > (2) has been (thankfully) reduced for some time; in any case, ksys_... is
> > used for the remaining ones.
> > 
> > (1) and (3) are served by syscalls.h in architectures other than x86, arm64
> > and s390.  On those 3 (1) is done otherwise (near the syscall table initializer)
> > and (3) is not done at all.
> > 
> > I wonder if we should do something like
> > 
> > SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> > 		 unsigned long, vlen);
> > in syscalls.h instead, and not under that ifdef.
> > 
> > Let it expand to declaration of sys_...() in generic case and, on x86, into
> > __do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
> > what SYSCALL_DEFINE ends up using.
> > 
> > Similar macro would cover compat_sys_...() declarations.  That would
> > restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
> > be terribly high - cpp would have more to chew through in syscalls.h,
> > but it shouldn't be all that costly.  Famous last words, of course...
> > 
> > Does anybody see fundamental problems with that?
> 
> Just to make it clear - I do not propose to fold that into this series;
> there we just need to keep those declarations in sync with fs/read_write.c

Agreed.  The above idea generally sounds sane to me.

^ permalink raw reply

* Re: [PATCH v2] i2c: cpm: Fix i2c_ram structure
From: Christophe Leroy @ 2020-09-23 18:27 UTC (permalink / raw)
  To: Wolfram Sang, nicolas.vincent; +Cc: linuxppc-dev, linux-i2c
In-Reply-To: <20200923160829.GB6697@ninjato>



Le 23/09/2020 à 18:08, Wolfram Sang a écrit :
> On Wed, Sep 23, 2020 at 04:08:40PM +0200, nico.vince@gmail.com wrote:
>> From: Nicolas VINCENT <nicolas.vincent@vossloh.com>
>>
>> the i2c_ram structure is missing the sdmatmp field mentionned in
>> datasheet for MPC8272 at paragraph 36.5. With this field missing, the
>> hardware would write past the allocated memory done through
>> cpm_muram_alloc for the i2c_ram structure and land in memory allocated
>> for the buffers descriptors corrupting the cbd_bufaddr field. Since this
>> field is only set during setup(), the first i2c transaction would work
>> and the following would send data read from an arbitrary memory
>> location.
>>
>> Signed-off-by: Nicolas VINCENT <nicolas.vincent@vossloh.com>
> 
> Thanks!
> 
> Is someone able to identify a Fixes: tag I could add?

I'd suggest

Fixes: 61045dbe9d8d ("i2c: Add support for I2C bus on Freescale 
CPM1/CPM2 controllers")

Christophe

> 
>> ---
>>   drivers/i2c/busses/i2c-cpm.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
>> index 1213e1932ccb..24d584a1c9a7 100644
>> --- a/drivers/i2c/busses/i2c-cpm.c
>> +++ b/drivers/i2c/busses/i2c-cpm.c
>> @@ -65,6 +65,9 @@ struct i2c_ram {
>>   	char    res1[4];	/* Reserved */
>>   	ushort  rpbase;		/* Relocation pointer */
>>   	char    res2[2];	/* Reserved */
>> +	/* The following elements are only for CPM2 */
>> +	char    res3[4];	/* Reserved */
>> +	uint    sdmatmp;	/* Internal */
>>   };
>>   
>>   #define I2COM_START	0x80
>> -- 
>> 2.17.1
>>

^ permalink raw reply

* Re: [PATCH 5/9] fs: remove various compat readv/writev helpers
From: Arnd Bergmann @ 2020-09-23 18:45 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-aio, open list:BROADCOM NVRAM DRIVER, David Howells,
	Linux-MM, keyrings, sparclinux, Christoph Hellwig, linux-arch,
	linux-s390, linux-scsi, linux-block, io-uring, Linux ARM,
	Jens Axboe, Parisc List, Networking, linux-kernel@vger.kernel.org,
	LSM List, David Laight, Linux FS-devel Mailing List,
	Andrew Morton, linuxppc-dev
In-Reply-To: <20200923163831.GO3421308@ZenIV.linux.org.uk>

On Wed, Sep 23, 2020 at 6:38 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> I wonder if we should do something like
>
> SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
>                  unsigned long, vlen);
> in syscalls.h instead, and not under that ifdef.
>
> Let it expand to declaration of sys_...() in generic case and, on x86, into
> __do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
> what SYSCALL_DEFINE ends up using.
>
> Similar macro would cover compat_sys_...() declarations.  That would
> restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
> be terribly high - cpp would have more to chew through in syscalls.h,
> but it shouldn't be all that costly.  Famous last words, of course...
>
> Does anybody see fundamental problems with that?

I've had some ideas along those lines in the past and I think it should work.

As a variation of this, the SYSCALL_DEFINEx() macros could go away
entirely, leaving only the macro instantiations from the header to
require that syntax. It would require first changing the remaining
architectures to build the syscall table from C code instead of
assembler though.

Regardless of that, another advantage of having the SYSCALL_DECLAREx()
would be the ability to include that header file from elsewhere with a different
macro definition to create a machine-readable version of the interface when
combined with the syscall.tbl files. This could be used to create a user
space stub for calling into the low-level syscall regardless of the
libc interfaces,
or for synchronizing the interfaces with strace, qemu-user, or anything that
needs to deal with the low-level interface.

      Arnd

^ permalink raw reply

* Re: [PATCH 5/9] fs: remove various compat readv/writev helpers
From: Al Viro @ 2020-09-23 19:47 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-aio, open list:BROADCOM NVRAM DRIVER, David Howells,
	Linux-MM, keyrings, sparclinux, Christoph Hellwig, linux-arch,
	linux-s390, linux-scsi, linux-block, io-uring, Linux ARM,
	Jens Axboe, Parisc List, Networking, linux-kernel@vger.kernel.org,
	LSM List, David Laight, Linux FS-devel Mailing List,
	Andrew Morton, linuxppc-dev
In-Reply-To: <CAK8P3a3nkLUOkR+jwz2_2LcYTUTqdVf8JOtZqKWbtEDotNhFZA@mail.gmail.com>

On Wed, Sep 23, 2020 at 08:45:51PM +0200, Arnd Bergmann wrote:
> On Wed, Sep 23, 2020 at 6:38 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > I wonder if we should do something like
> >
> > SYSCALL_DECLARE3(readv, unsigned long, fd, const struct iovec __user *, vec,
> >                  unsigned long, vlen);
> > in syscalls.h instead, and not under that ifdef.
> >
> > Let it expand to declaration of sys_...() in generic case and, on x86, into
> > __do_sys_...() and __ia32_sys_...()/__x64_sys_...(), with types matching
> > what SYSCALL_DEFINE ends up using.
> >
> > Similar macro would cover compat_sys_...() declarations.  That would
> > restore mismatch checking for x86 and friends.  AFAICS, the cost wouldn't
> > be terribly high - cpp would have more to chew through in syscalls.h,
> > but it shouldn't be all that costly.  Famous last words, of course...
> >
> > Does anybody see fundamental problems with that?
> 
> I've had some ideas along those lines in the past and I think it should work.
> 
> As a variation of this, the SYSCALL_DEFINEx() macros could go away
> entirely, leaving only the macro instantiations from the header to
> require that syntax. It would require first changing the remaining
> architectures to build the syscall table from C code instead of
> assembler though.
> 
> Regardless of that, another advantage of having the SYSCALL_DECLAREx()
> would be the ability to include that header file from elsewhere with a different
> macro definition to create a machine-readable version of the interface when
> combined with the syscall.tbl files. This could be used to create a user
> space stub for calling into the low-level syscall regardless of the
> libc interfaces,
> or for synchronizing the interfaces with strace, qemu-user, or anything that
> needs to deal with the low-level interface.

FWIW, after playing with that for a while...  Do we really want the
compat_sys_...() declarations to live in linux/compat.h?  Most of
the users of that file don't want those; why not move them to
linux/syscalls.h?

Reason: there's a lot more users of linux/compat.h than those of
linux/syscalls.h - it's pulled by everything in the networking stack,
for starters...

^ permalink raw reply

* Re: [PATCH 5/9] fs: remove various compat readv/writev helpers
From: Arnd Bergmann @ 2020-09-23 19:52 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-aio, open list:BROADCOM NVRAM DRIVER, David Howells,
	Linux-MM, keyrings, sparclinux, Christoph Hellwig, linux-arch,
	linux-s390, linux-scsi, linux-block, io-uring, Linux ARM,
	Jens Axboe, Parisc List, Networking, linux-kernel@vger.kernel.org,
	LSM List, David Laight, Linux FS-devel Mailing List,
	Andrew Morton, linuxppc-dev
In-Reply-To: <20200923194755.GR3421308@ZenIV.linux.org.uk>

On Wed, Sep 23, 2020 at 9:48 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> FWIW, after playing with that for a while...  Do we really want the
> compat_sys_...() declarations to live in linux/compat.h?  Most of
> the users of that file don't want those; why not move them to
> linux/syscalls.h?

Sure, let's do that. The trend overall is to integrate the compat stuff
more closely into where the native implementation lives, so this
would just follow that trend.

I think with Christoph's latest patches, about half of them are
going away as well.

> Reason: there's a lot more users of linux/compat.h than those of
> linux/syscalls.h - it's pulled by everything in the networking stack,
> for starters...

Right, the network headers pull in almost everything else through
multiple indirect inclusions, anything we can do to reduce that
helps.

     Arnd

^ permalink raw reply

* Re: [PATCH v6 0/3] Carry forward IMA measurement log on kexec on ARM64
From: Mimi Zohar @ 2020-09-23 20:03 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian, bauerman, robh, gregkh, james.morse,
	catalin.marinas, sashal, will, mpe, benh, paulus, robh+dt,
	frowand.list, vincenzo.frascino, mark.rutland, dmitry.kasatkin,
	jmorris, serge, pasha.tatashin, allison, kstewart,
	takahiro.akashi, tglx, masahiroy, bhsharma, mbrugger, hsinyi,
	tao.li, christophe.leroy
  Cc: devicetree, balajib, Nayna Jain, linux-kernel, prsriva,
	linux-integrity, linuxppc-dev
In-Reply-To: <20200908230856.9799-1-nramas@linux.microsoft.com>

[Cc'ing Nayna Jain, linuxppc-dev@lists.ozlabs.org]

Hi Lakshmi,

On Tue, 2020-09-08 at 16:08 -0700, Lakshmi Ramasubramanian wrote:
> On kexec file load Integrity Measurement Architecture(IMA) subsystem
> may verify the IMA signature of the kernel and initramfs, and measure
> it. The command line parameters passed to the kernel in the kexec call
> may also be measured by IMA. A remote attestation service can verify
> the measurement through the IMA log and the TPM PCR data. This can be
> achieved only if the IMA measurement log is carried over from
> the current kernel to the next kernel across the kexec call.
> However in the current implementation the IMA measurement logs are not
> carried over on ARM64 platforms. Therefore a remote attestation service
> cannot verify the authenticity of the running kernel on ARM64 platforms
> when the kernel is updated through the kexec system call.
> 
> This patch series adds support for carrying forward the IMA measurement
> log on kexec on ARM64. powerpc already supports carrying forward
> the IMA measurement log on kexec.
> 
> This series refactors the platform independent code defined for powerpc
> such that it can be reused for ARM64 as well. A chosen node namely
> "linux,ima-kexec-buffer" is added to the DTB for ARM64 to hold
> the address and the size of the memory reserved to carry
> the IMA measurement log.
> 
> This patch series has been tested for ARM64 platform using QEMU.
> I would like help from the community for testing this change on powerpc.
> Thanks.

Getting access to and upgrading our group's Power system firmware took
time due to limited employee site access.  Confirmed, with these
patches applied, ima-evm-utils still properly verifies carrying the IMA
measurement list across kexec.

I plan on reviewing this patch set shortly.

thanks,

Mimi

> 
> This series is based on commit f4d51dffc6c0 ("Linux 5.9-rc4") in
> https://github.com/torvalds/linux "master" branch.
> 
> Changelog:
> 
> v6:
>   - Remove any existing FDT_PROP_IMA_KEXEC_BUFFER property in the device
>     tree and also its corresponding memory reservation in the currently
>     running kernel.
>   - Moved the function remove_ima_buffer() defined for powerpc to IMA
>     and renamed the function to ima_remove_kexec_buffer(). Also, moved
>     delete_fdt_mem_rsv() from powerpc to IMA.
> 
> v5:
>   - Merged get_addr_size_cells() and do_get_kexec_buffer() into a single
>     function when moving the arch independent code from powerpc to IMA
>   - Reverted the change to use FDT functions in powerpc code and added
>     back the original code in get_addr_size_cells() and
>     do_get_kexec_buffer() for powerpc.
>   - Added fdt_add_mem_rsv() for ARM64 to reserve the memory for
>     the IMA log buffer during kexec.
>   - Fixed the warning reported by kernel test bot for ARM64
>     arch_ima_add_kexec_buffer() - moved this function to a new file
>     namely arch/arm64/kernel/ima_kexec.c
> 
> v4:
>   - Submitting the patch series on behalf of the original author
>     Prakhar Srivastava <prsriva@linux.microsoft.com>
>   - Moved FDT_PROP_IMA_KEXEC_BUFFER ("linux,ima-kexec-buffer") to
>     libfdt.h so that it can be shared by multiple platforms.
> 
> v3:
> Breakup patches further into separate patches.
>   - Refactoring non architecture specific code out of powerpc
>   - Update powerpc related code to use fdt functions
>   - Update IMA buffer read related code to use of functions
>   - Add support to store the memory information of the IMA
>     measurement logs to be carried forward.
>   - Update the property strings to align with documented nodes
>     https://github.com/devicetree-org/dt-schema/pull/46
> 
> v2:
>   Break patches into separate patches.
>   - Powerpc related Refactoring
>   - Updating the docuemntation for chosen node
>   - Updating arm64 to support IMA buffer pass
> 
> v1:
>   Refactoring carrying over IMA measuremnet logs over Kexec. This patch
>     moves the non-architecture specific code out of powerpc and adds to
>     security/ima.(Suggested by Thiago)
>   Add Documentation regarding the ima-kexec-buffer node in the chosen
>     node documentation
> 
> v0:
>   Add a layer of abstraction to use the memory reserved by device tree
>     for ima buffer pass.
>   Add support for ima buffer pass using reserved memory for arm64 kexec.
>     Update the arch sepcific code path in kexec file load to store the
>     ima buffer in the reserved memory. The same reserved memory is read
>     on kexec or cold boot.
> 
> Lakshmi Ramasubramanian (3):
>   powerpc: Refactor kexec functions to move arch independent code to IMA
>   arm64: Store IMA log information in kimage used for kexec
>   arm64: Add IMA kexec buffer to DTB
> 
>  arch/arm64/Kconfig                      |   1 +
>  arch/arm64/include/asm/ima.h            |  18 ++++
>  arch/arm64/include/asm/kexec.h          |   3 +
>  arch/arm64/kernel/Makefile              |   1 +
>  arch/arm64/kernel/ima_kexec.c           |  34 ++++++++
>  arch/arm64/kernel/machine_kexec_file.c  |  18 ++++
>  arch/powerpc/include/asm/ima.h          |  11 +--
>  arch/powerpc/include/asm/kexec.h        |   1 -
>  arch/powerpc/kexec/file_load.c          |  33 +-------
>  arch/powerpc/kexec/ima.c                | 104 +-----------------------
>  include/linux/ima.h                     |   2 +
>  include/linux/kexec.h                   |  11 +++
>  include/linux/libfdt.h                  |   3 +
>  security/integrity/ima/Makefile         |   3 +-
>  security/integrity/ima/ima.h            |   2 +
>  security/integrity/ima/ima_fdt.c        |  80 ++++++++++++++++++
>  security/integrity/ima/ima_kexec.c      |  58 +++++++++++++
>  security/integrity/ima/ima_kexec_file.c |  51 ++++++++++++
>  18 files changed, 289 insertions(+), 145 deletions(-)
>  create mode 100644 arch/arm64/include/asm/ima.h
>  create mode 100644 arch/arm64/kernel/ima_kexec.c
>  create mode 100644 security/integrity/ima/ima_fdt.c
>  create mode 100644 security/integrity/ima/ima_kexec_file.c
> 



^ permalink raw reply

* Re: [PATCH v6 04/11] PCI: designware-ep: Modify MSI and MSIX CAP way of finding
From: Bjorn Helgaas @ 2020-09-23 20:16 UTC (permalink / raw)
  To: Xiaowei Bao
  Cc: roy.zang, lorenzo.pieralisi, devicetree, jingoohan1, Zhiqiang.Hou,
	linuxppc-dev, linux-pci, linux-kernel, leoyang.li, Minghuan.Lian,
	robh+dt, linux-arm-kernel, gustavo.pimentel, bhelgaas,
	andrew.murray, kishon, shawnguo, mingkai.hu, amurray
In-Reply-To: <20200314033038.24844-5-xiaowei.bao@nxp.com>

s/MSIX/MSI-X/ (subject and below)

On Sat, Mar 14, 2020 at 11:30:31AM +0800, Xiaowei Bao wrote:
> Each PF of EP device should have it's own MSI or MSIX capabitily
> struct, so create a dw_pcie_ep_func struct and remove the msi_cap
> and msix_cap to this struct from dw_pcie_ep, and manage the PFs
> with a list.

s/capabitily/capability/

I know Lorenzo has already applied this, but for the future, or
in case there are other reasons to update this patch.

There are a bunch of unnecessary initializations below for future
cleanup.

> Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com>
> ---
> v3:
>  - This is a new patch, to fix the issue of MSI and MSIX CAP way of
>    finding.
> v4:
>  - Correct some word of commit message.
> v5:
>  - No change.
> v6:
>  - Fix up the compile error.
> 
>  drivers/pci/controller/dwc/pcie-designware-ep.c | 135 +++++++++++++++++++++---
>  drivers/pci/controller/dwc/pcie-designware.h    |  18 +++-
>  2 files changed, 134 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index 933bb89..fb915f2 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -19,6 +19,19 @@ void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
>  	pci_epc_linkup(epc);
>  }
>  
> +struct dw_pcie_ep_func *
> +dw_pcie_ep_get_func_from_ep(struct dw_pcie_ep *ep, u8 func_no)
> +{
> +	struct dw_pcie_ep_func *ep_func;
> +
> +	list_for_each_entry(ep_func, &ep->func_list, list) {
> +		if (ep_func->func_no == func_no)
> +			return ep_func;
> +	}
> +
> +	return NULL;
> +}
> +
>  static unsigned int dw_pcie_ep_func_select(struct dw_pcie_ep *ep, u8 func_no)
>  {
>  	unsigned int func_offset = 0;
> @@ -59,6 +72,47 @@ void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar)
>  		__dw_pcie_ep_reset_bar(pci, func_no, bar, 0);
>  }
>  
> +static u8 __dw_pcie_ep_find_next_cap(struct dw_pcie_ep *ep, u8 func_no,
> +		u8 cap_ptr, u8 cap)
> +{
> +	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> +	unsigned int func_offset = 0;

Unnecessary initialization.

> +	u8 cap_id, next_cap_ptr;
> +	u16 reg;
> +
> +	if (!cap_ptr)
> +		return 0;
> +
> +	func_offset = dw_pcie_ep_func_select(ep, func_no);
> +
> +	reg = dw_pcie_readw_dbi(pci, func_offset + cap_ptr);
> +	cap_id = (reg & 0x00ff);
> +
> +	if (cap_id > PCI_CAP_ID_MAX)
> +		return 0;
> +
> +	if (cap_id == cap)
> +		return cap_ptr;
> +
> +	next_cap_ptr = (reg & 0xff00) >> 8;
> +	return __dw_pcie_ep_find_next_cap(ep, func_no, next_cap_ptr, cap);
> +}
> +
> +static u8 dw_pcie_ep_find_capability(struct dw_pcie_ep *ep, u8 func_no, u8 cap)
> +{
> +	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> +	unsigned int func_offset = 0;

Unnecessary initialization.

> +	u8 next_cap_ptr;
> +	u16 reg;
> +
> +	func_offset = dw_pcie_ep_func_select(ep, func_no);
> +
> +	reg = dw_pcie_readw_dbi(pci, func_offset + PCI_CAPABILITY_LIST);
> +	next_cap_ptr = (reg & 0x00ff);
> +
> +	return __dw_pcie_ep_find_next_cap(ep, func_no, next_cap_ptr, cap);
> +}
> +
>  static int dw_pcie_ep_write_header(struct pci_epc *epc, u8 func_no,
>  				   struct pci_epf_header *hdr)
>  {
> @@ -246,13 +300,18 @@ static int dw_pcie_ep_get_msi(struct pci_epc *epc, u8 func_no)
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  	u32 val, reg;
>  	unsigned int func_offset = 0;

Unnecessary initialization (not from your patch).

> +	struct dw_pcie_ep_func *ep_func;
>  
> -	if (!ep->msi_cap)
> +	ep_func = dw_pcie_ep_get_func_from_ep(ep, func_no);
> +	if (!ep_func)
> +		return -EINVAL;
> +
> +	if (!ep_func->msi_cap)
>  		return -EINVAL;
>  
>  	func_offset = dw_pcie_ep_func_select(ep, func_no);
>  
> -	reg = ep->msi_cap + func_offset + PCI_MSI_FLAGS;
> +	reg = ep_func->msi_cap + func_offset + PCI_MSI_FLAGS;
>  	val = dw_pcie_readw_dbi(pci, reg);
>  	if (!(val & PCI_MSI_FLAGS_ENABLE))
>  		return -EINVAL;
> @@ -268,13 +327,18 @@ static int dw_pcie_ep_set_msi(struct pci_epc *epc, u8 func_no, u8 interrupts)
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  	u32 val, reg;
>  	unsigned int func_offset = 0;

Unnecessary initialization (not from your patch).

> +	struct dw_pcie_ep_func *ep_func;
> +
> +	ep_func = dw_pcie_ep_get_func_from_ep(ep, func_no);
> +	if (!ep_func)
> +		return -EINVAL;
>  
> -	if (!ep->msi_cap)
> +	if (!ep_func->msi_cap)
>  		return -EINVAL;
>  
>  	func_offset = dw_pcie_ep_func_select(ep, func_no);
>  
> -	reg = ep->msi_cap + func_offset + PCI_MSI_FLAGS;
> +	reg = ep_func->msi_cap + func_offset + PCI_MSI_FLAGS;
>  	val = dw_pcie_readw_dbi(pci, reg);
>  	val &= ~PCI_MSI_FLAGS_QMASK;
>  	val |= (interrupts << 1) & PCI_MSI_FLAGS_QMASK;
> @@ -291,13 +355,18 @@ static int dw_pcie_ep_get_msix(struct pci_epc *epc, u8 func_no)
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  	u32 val, reg;
>  	unsigned int func_offset = 0;

Unnecessary initialization (not from your patch).

> +	struct dw_pcie_ep_func *ep_func;
> +
> +	ep_func = dw_pcie_ep_get_func_from_ep(ep, func_no);
> +	if (!ep_func)
> +		return -EINVAL;
>  
> -	if (!ep->msix_cap)
> +	if (!ep_func->msix_cap)
>  		return -EINVAL;
>  
>  	func_offset = dw_pcie_ep_func_select(ep, func_no);
>  
> -	reg = ep->msix_cap + func_offset + PCI_MSIX_FLAGS;
> +	reg = ep_func->msix_cap + func_offset + PCI_MSIX_FLAGS;
>  	val = dw_pcie_readw_dbi(pci, reg);
>  	if (!(val & PCI_MSIX_FLAGS_ENABLE))
>  		return -EINVAL;
> @@ -313,13 +382,18 @@ static int dw_pcie_ep_set_msix(struct pci_epc *epc, u8 func_no, u16 interrupts)
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  	u32 val, reg;
>  	unsigned int func_offset = 0;

Unnecessary initialization (not from your patch).

> +	struct dw_pcie_ep_func *ep_func;
>  
> -	if (!ep->msix_cap)
> +	ep_func = dw_pcie_ep_get_func_from_ep(ep, func_no);
> +	if (!ep_func)
> +		return -EINVAL;
> +
> +	if (!ep_func->msix_cap)
>  		return -EINVAL;
>  
>  	func_offset = dw_pcie_ep_func_select(ep, func_no);
>  
> -	reg = ep->msix_cap + func_offset + PCI_MSIX_FLAGS;
> +	reg = ep_func->msix_cap + func_offset + PCI_MSIX_FLAGS;
>  	val = dw_pcie_readw_dbi(pci, reg);
>  	val &= ~PCI_MSIX_FLAGS_QSIZE;
>  	val |= interrupts;
> @@ -404,6 +478,7 @@ int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
>  			     u8 interrupt_num)
>  {
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> +	struct dw_pcie_ep_func *ep_func;
>  	struct pci_epc *epc = ep->epc;
>  	unsigned int aligned_offset;
>  	unsigned int func_offset = 0;

Unnecessary initialization (not from your patch).

> @@ -413,25 +488,29 @@ int dw_pcie_ep_raise_msi_irq(struct dw_pcie_ep *ep, u8 func_no,
>  	bool has_upper;
>  	int ret;
>  
> -	if (!ep->msi_cap)
> +	ep_func = dw_pcie_ep_get_func_from_ep(ep, func_no);
> +	if (!ep_func)
> +		return -EINVAL;
> +
> +	if (!ep_func->msi_cap)
>  		return -EINVAL;
>  
>  	func_offset = dw_pcie_ep_func_select(ep, func_no);
>  
>  	/* Raise MSI per the PCI Local Bus Specification Revision 3.0, 6.8.1. */
> -	reg = ep->msi_cap + func_offset + PCI_MSI_FLAGS;
> +	reg = ep_func->msi_cap + func_offset + PCI_MSI_FLAGS;
>  	msg_ctrl = dw_pcie_readw_dbi(pci, reg);
>  	has_upper = !!(msg_ctrl & PCI_MSI_FLAGS_64BIT);
> -	reg = ep->msi_cap + func_offset + PCI_MSI_ADDRESS_LO;
> +	reg = ep_func->msi_cap + func_offset + PCI_MSI_ADDRESS_LO;
>  	msg_addr_lower = dw_pcie_readl_dbi(pci, reg);
>  	if (has_upper) {
> -		reg = ep->msi_cap + func_offset + PCI_MSI_ADDRESS_HI;
> +		reg = ep_func->msi_cap + func_offset + PCI_MSI_ADDRESS_HI;
>  		msg_addr_upper = dw_pcie_readl_dbi(pci, reg);
> -		reg = ep->msi_cap + func_offset + PCI_MSI_DATA_64;
> +		reg = ep_func->msi_cap + func_offset + PCI_MSI_DATA_64;
>  		msg_data = dw_pcie_readw_dbi(pci, reg);
>  	} else {
>  		msg_addr_upper = 0;
> -		reg = ep->msi_cap + func_offset + PCI_MSI_DATA_32;
> +		reg = ep_func->msi_cap + func_offset + PCI_MSI_DATA_32;
>  		msg_data = dw_pcie_readw_dbi(pci, reg);
>  	}
>  	aligned_offset = msg_addr_lower & (epc->mem->page_size - 1);
> @@ -467,6 +546,7 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
>  			      u16 interrupt_num)
>  {
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> +	struct dw_pcie_ep_func *ep_func;
>  	struct pci_epc *epc = ep->epc;
>  	u16 tbl_offset, bir;
>  	unsigned int func_offset = 0;

Unnecessary initialization (not from your patch).

> @@ -477,9 +557,16 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
>  	void __iomem *msix_tbl;
>  	int ret;
>  
> +	ep_func = dw_pcie_ep_get_func_from_ep(ep, func_no);
> +	if (!ep_func)
> +		return -EINVAL;
> +
> +	if (!ep_func->msix_cap)
> +		return -EINVAL;
> +
>  	func_offset = dw_pcie_ep_func_select(ep, func_no);
>  
> -	reg = ep->msix_cap + func_offset + PCI_MSIX_TABLE;
> +	reg = ep_func->msix_cap + func_offset + PCI_MSIX_TABLE;
>  	tbl_offset = dw_pcie_readl_dbi(pci, reg);
>  	bir = (tbl_offset & PCI_MSIX_TABLE_BIR);
>  	tbl_offset &= PCI_MSIX_TABLE_OFFSET;
> @@ -558,6 +645,7 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
>  	int i;
>  	int ret;
>  	u32 reg;
> +	u8 func_no;
>  	void *addr;
>  	u8 hdr_type;
>  	unsigned int nbars;
> @@ -566,6 +654,9 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
>  	struct device *dev = pci->dev;
>  	struct device_node *np = dev->of_node;
> +	struct dw_pcie_ep_func *ep_func;
> +
> +	INIT_LIST_HEAD(&ep->func_list);
>  
>  	if (!pci->dbi_base || !pci->dbi_base2) {
>  		dev_err(dev, "dbi_base/dbi_base2 is not populated\n");
> @@ -632,9 +723,19 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
>  	if (ret < 0)
>  		epc->max_functions = 1;
>  
> -	ep->msi_cap = dw_pcie_find_capability(pci, PCI_CAP_ID_MSI);
> +	for (func_no = 0; func_no < epc->max_functions; func_no++) {
> +		ep_func = devm_kzalloc(dev, sizeof(*ep_func), GFP_KERNEL);
> +		if (!ep_func)
> +			return -ENOMEM;
>  
> -	ep->msix_cap = dw_pcie_find_capability(pci, PCI_CAP_ID_MSIX);
> +		ep_func->func_no = func_no;
> +		ep_func->msi_cap = dw_pcie_ep_find_capability(ep, func_no,
> +							      PCI_CAP_ID_MSI);
> +		ep_func->msix_cap = dw_pcie_ep_find_capability(ep, func_no,
> +							       PCI_CAP_ID_MSIX);
> +
> +		list_add_tail(&ep_func->list, &ep->func_list);
> +	}
>  
>  	if (ep->ops->ep_init)
>  		ep->ops->ep_init(ep);
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> index cb32afa..dd9b7b4 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -230,8 +230,16 @@ struct dw_pcie_ep_ops {
>  	unsigned int (*func_conf_select)(struct dw_pcie_ep *ep, u8 func_no);
>  };
>  
> +struct dw_pcie_ep_func {
> +	struct list_head	list;
> +	u8			func_no;
> +	u8			msi_cap;	/* MSI capability offset */
> +	u8			msix_cap;	/* MSI-X capability offset */
> +};
> +
>  struct dw_pcie_ep {
>  	struct pci_epc		*epc;
> +	struct list_head	func_list;
>  	const struct dw_pcie_ep_ops *ops;
>  	phys_addr_t		phys_base;
>  	size_t			addr_size;
> @@ -244,8 +252,6 @@ struct dw_pcie_ep {
>  	u32			num_ob_windows;
>  	void __iomem		*msi_mem;
>  	phys_addr_t		msi_mem_phys;
> -	u8			msi_cap;	/* MSI capability offset */
> -	u8			msix_cap;	/* MSI-X capability offset */
>  };
>  
>  struct dw_pcie_ops {
> @@ -437,6 +443,8 @@ int dw_pcie_ep_raise_msix_irq(struct dw_pcie_ep *ep, u8 func_no,
>  int dw_pcie_ep_raise_msix_irq_doorbell(struct dw_pcie_ep *ep, u8 func_no,
>  				       u16 interrupt_num);
>  void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar);
> +struct dw_pcie_ep_func *
> +dw_pcie_ep_get_func_from_ep(struct dw_pcie_ep *ep, u8 func_no);
>  #else
>  static inline void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
>  {
> @@ -478,5 +486,11 @@ static inline int dw_pcie_ep_raise_msix_irq_doorbell(struct dw_pcie_ep *ep,
>  static inline void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar)
>  {
>  }
> +
> +static inline struct dw_pcie_ep_func *
> +dw_pcie_ep_get_func_from_ep(struct dw_pcie_ep *ep, u8 func_no)
> +{
> +	return NULL;
> +}
>  #endif
>  #endif /* _PCIE_DESIGNWARE_H */
> -- 
> 2.9.5
> 

^ permalink raw reply

* Re: [PATCH] spi: fsl-espi: Only process interrupts for expected events
From: Heiner Kallweit @ 2020-09-23 20:27 UTC (permalink / raw)
  To: Chris Packham, broonie, npiggin
  Cc: linuxppc-dev, linux-kernel, stable, linux-spi
In-Reply-To: <20200904002812.7300-1-chris.packham@alliedtelesis.co.nz>

On 04.09.2020 02:28, Chris Packham wrote:
> The SPIE register contains counts for the TX FIFO so any time the irq
> handler was invoked we would attempt to process the RX/TX fifos. Use the
> SPIM value to mask the events so that we only process interrupts that
> were expected.
> 
> This was a latent issue exposed by commit 3282a3da25bd ("powerpc/64:
> Implement soft interrupt replay in C").
> 
> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
> Cc: stable@vger.kernel.org
> ---
> 
> Notes:
>     I've tested this on a T2080RDB and a custom board using the T2081 SoC. With
>     this change I don't see any spurious instances of the "Transfer done but
>     SPIE_DON isn't set!" or "Transfer done but rx/tx fifo's aren't empty!" messages
>     and the updates to spi flash are successful.
>     
>     I think this should go into the stable trees that contain 3282a3da25bd but I
>     haven't added a Fixes: tag because I think 3282a3da25bd exposed the issue as
>     opposed to causing it.
> 
>  drivers/spi/spi-fsl-espi.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
> index 7e7c92cafdbb..cb120b68c0e2 100644
> --- a/drivers/spi/spi-fsl-espi.c
> +++ b/drivers/spi/spi-fsl-espi.c
> @@ -574,13 +574,14 @@ static void fsl_espi_cpu_irq(struct fsl_espi *espi, u32 events)
>  static irqreturn_t fsl_espi_irq(s32 irq, void *context_data)
>  {
>  	struct fsl_espi *espi = context_data;
> -	u32 events;
> +	u32 events, mask;
>  
>  	spin_lock(&espi->lock);
>  
>  	/* Get interrupt events(tx/rx) */
>  	events = fsl_espi_read_reg(espi, ESPI_SPIE);
> -	if (!events) {
> +	mask = fsl_espi_read_reg(espi, ESPI_SPIM);
> +	if (!(events & mask)) {
>  		spin_unlock(&espi->lock);
>  		return IRQ_NONE;

Sorry, I was on vacation and therefore couldn't comment earlier.
I'm fine with the change, just one thing could be improved IMO.
If we skip an unneeded interrupt now, then returning IRQ_NONE
causes reporting this interrupt as spurious. This isn't too nice
as spurious interrupts typically are seen as a problem indicator.
Therefore returning IRQ_HANDLED should be more appropriate.
This would just require a comment in the code explaining why we
do this, and why it can happen that we receive interrupts
we're not interested in.

>  	}
> 


^ permalink raw reply

* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Thomas Gleixner @ 2020-09-23 20:55 UTC (permalink / raw)
  To: Steven Rostedt, peterz
  Cc: Juri Lelli, David Airlie, Sebastian Andrzej Siewior,
	Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
	Guo Ren, linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
	linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
	Russell King, linux-csky, Mel Gorman,
	open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
	intel-gfx, linuxppc-dev, Jani Nikula, Greentime Hu, Rodrigo Vivi,
	Dietmar Eggemann, Linux ARM, Chris Zankel, Michal Simek,
	Thomas Bogendoerfer, Nick Hu, Linux-MM, Linus Torvalds, LKML,
	Arnd Bergmann, Daniel Vetter, Vineet Gupta, Paul Mackerras,
	Andrew Morton, Daniel Bristot de Oliveira, David S. Miller
In-Reply-To: <20200923115251.7cc63a7e@oasis.local.home>

On Wed, Sep 23 2020 at 11:52, Steven Rostedt wrote:
> On Wed, 23 Sep 2020 10:40:32 +0200
> peterz@infradead.org wrote:
>
>> However, with migrate_disable() we can have each task preempted in a
>> migrate_disable() region, worse we can stack them all on the _same_ CPU
>> (super ridiculous odds, sure). And then we end up only able to run one
>> task, with the rest of the CPUs picking their nose.
>
> What if we just made migrate_disable() a local_lock() available for !RT?
>
> I mean make it a priority inheritance PER CPU lock.
>
> That is, no two tasks could do a migrate_disable() on the same CPU? If
> one task does a migrate_disable() and then gets preempted and the
> preempting task does a migrate_disable() on the same CPU, it will block
> and wait for the first task to do a migrate_enable().
>
> No two tasks on the same CPU could enter the migrate_disable() section
> simultaneously, just like no two tasks could enter a preempt_disable()
> section.
>
> In essence, we just allow local_lock() to be used for both RT and !RT.
>
> Perhaps make migrate_disable() an anonymous local_lock()?
>
> This should lower the SHC in theory, if you can't have stacked migrate
> disables on the same CPU.

I'm pretty sure this ends up in locking hell pretty fast and aside of
that it's not working for scenarios like:

     kmap_local();
       migrate_disable();
       ...

     copy_from_user()
        -> #PF
           -> schedule()

which brought us into that discussion in the first place. You would stop
any other migrate disable user from running until the page fault is
resolved...

Thanks,

        tglx

^ permalink raw reply

* Re: [PATCH] spi: fsl-espi: Only process interrupts for expected events
From: Chris Packham @ 2020-09-23 21:01 UTC (permalink / raw)
  To: Heiner Kallweit, broonie@kernel.org, npiggin@gmail.com
  Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org, linux-spi@vger.kernel.org
In-Reply-To: <ec77cf82-5ef1-c650-3e8a-80be749c2214@gmail.com>


On 24/09/20 8:27 am, Heiner Kallweit wrote:
> On 04.09.2020 02:28, Chris Packham wrote:
>> The SPIE register contains counts for the TX FIFO so any time the irq
>> handler was invoked we would attempt to process the RX/TX fifos. Use the
>> SPIM value to mask the events so that we only process interrupts that
>> were expected.
>>
>> This was a latent issue exposed by commit 3282a3da25bd ("powerpc/64:
>> Implement soft interrupt replay in C").
>>
>> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
>> Cc: stable@vger.kernel.org
>> ---
>>
>> Notes:
>>      I've tested this on a T2080RDB and a custom board using the T2081 SoC. With
>>      this change I don't see any spurious instances of the "Transfer done but
>>      SPIE_DON isn't set!" or "Transfer done but rx/tx fifo's aren't empty!" messages
>>      and the updates to spi flash are successful.
>>      
>>      I think this should go into the stable trees that contain 3282a3da25bd but I
>>      haven't added a Fixes: tag because I think 3282a3da25bd exposed the issue as
>>      opposed to causing it.
>>
>>   drivers/spi/spi-fsl-espi.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
>> index 7e7c92cafdbb..cb120b68c0e2 100644
>> --- a/drivers/spi/spi-fsl-espi.c
>> +++ b/drivers/spi/spi-fsl-espi.c
>> @@ -574,13 +574,14 @@ static void fsl_espi_cpu_irq(struct fsl_espi *espi, u32 events)
>>   static irqreturn_t fsl_espi_irq(s32 irq, void *context_data)
>>   {
>>   	struct fsl_espi *espi = context_data;
>> -	u32 events;
>> +	u32 events, mask;
>>   
>>   	spin_lock(&espi->lock);
>>   
>>   	/* Get interrupt events(tx/rx) */
>>   	events = fsl_espi_read_reg(espi, ESPI_SPIE);
>> -	if (!events) {
>> +	mask = fsl_espi_read_reg(espi, ESPI_SPIM);
>> +	if (!(events & mask)) {
>>   		spin_unlock(&espi->lock);
>>   		return IRQ_NONE;
> Sorry, I was on vacation and therefore couldn't comment earlier.
> I'm fine with the change, just one thing could be improved IMO.
> If we skip an unneeded interrupt now, then returning IRQ_NONE
> causes reporting this interrupt as spurious. This isn't too nice
> as spurious interrupts typically are seen as a problem indicator.
> Therefore returning IRQ_HANDLED should be more appropriate.
> This would just require a comment in the code explaining why we
> do this, and why it can happen that we receive interrupts
> we're not interested in.
I'd be happy to send a follow-up to change IRQ_NONE to IRQ_HANDLED. I 
don't think the old code could have ever hit the IRQ_NONE (because event 
will always be non-zero) so it won't really be a change in behaviour. 
With the patch (that is now in spi/for-next) so far I do see a low 
number of spurious interrupts on the test setup where previously I would 
have seen failure to talk to the spi-flash.

^ permalink raw reply

* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Steven Rostedt @ 2020-09-23 21:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Juri Lelli, peterz, Sebastian Andrzej Siewior, Joonas Lahtinen,
	dri-devel, linux-mips, Ben Segall, Max Filippov, Guo Ren,
	linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
	linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
	Russell King, linux-csky, David Airlie, Mel Gorman,
	open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
	intel-gfx, linuxppc-dev, Jani Nikula, Greentime Hu, Rodrigo Vivi,
	Dietmar Eggemann, Linux ARM, Chris Zankel, Michal Simek,
	Thomas Bogendoerfer, Nick Hu, Linux-MM, Linus Torvalds, LKML,
	Arnd Bergmann, Daniel Vetter, Vineet Gupta, Paul Mackerras,
	Andrew Morton, Daniel Bristot de Oliveira, David S. Miller
In-Reply-To: <874kno9pr9.fsf@nanos.tec.linutronix.de>

On Wed, 23 Sep 2020 22:55:54 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> > Perhaps make migrate_disable() an anonymous local_lock()?
> >
> > This should lower the SHC in theory, if you can't have stacked migrate
> > disables on the same CPU.  
> 
> I'm pretty sure this ends up in locking hell pretty fast and aside of
> that it's not working for scenarios like:
> 
>      kmap_local();
>        migrate_disable();
>        ...
> 
>      copy_from_user()
>         -> #PF
>            -> schedule()  
> 
> which brought us into that discussion in the first place. You would stop
> any other migrate disable user from running until the page fault is
> resolved...

Then scratch the idea of having anonymous local_lock() and just bring
local_lock in directly? Then have a kmap local lock, which would only
block those that need to do a kmap.

Now as for migration disabled nesting, at least now we would have
groupings of this, and perhaps the theorists can handle that. I mean,
how is this much different that having a bunch of tasks blocked on a
mutex with the owner is pinned on a CPU?

migrate_disable() is a BKL of pinning affinity. If we only have
local_lock() available (even on !RT), then it makes the blocking in
groups. At least this way you could grep for all the different
local_locks in the system and plug that into the algorithm for WCS,
just like one would with a bunch of mutexes.

-- Steve

^ permalink raw reply

* RE: [PATCH 5/9] fs: remove various compat readv/writev helpers
From: David Laight @ 2020-09-23 21:30 UTC (permalink / raw)
  To: 'Arnd Bergmann', Al Viro
  Cc: Jens Axboe, linux-s390, linux-scsi, Parisc List, linux-aio,
	Networking, io-uring@vger.kernel.org,
	linux-kernel@vger.kernel.org, open list:BROADCOM NVRAM DRIVER,
	David Howells, Linux FS-devel Mailing List, LSM List,
	keyrings@vger.kernel.org, linux-block, Linux-MM, linux-arch,
	sparclinux, Andrew Morton, linuxppc-dev, Christoph Hellwig,
	Linux ARM
In-Reply-To: <CAK8P3a3nkLUOkR+jwz2_2LcYTUTqdVf8JOtZqKWbtEDotNhFZA@mail.gmail.com>

From: Arnd Bergmann
> Sent: 23 September 2020 19:46
...
> Regardless of that, another advantage of having the SYSCALL_DECLAREx()
> would be the ability to include that header file from elsewhere with a different
> macro definition to create a machine-readable version of the interface when
> combined with the syscall.tbl files. This could be used to create a user
> space stub for calling into the low-level syscall regardless of the
> libc interfaces,
> or for synchronizing the interfaces with strace, qemu-user, or anything that
> needs to deal with the low-level interface.

A similar 'trick' (that probably won't work here) is to pass
the name of a #define function as a parameter to another define.
Useful for defining constants and error strings together. eg:
#define TRAFFIC_LIGHTS(x) \
	x(RED, 0, "Red") \
	x(YELLOW, 1, "Yellow) \
	x(GREEN, 2, "GREEN)

You can then do thing like:
#define x(token, value, string) token = value,
enum {TRAFFIC_LIGHTS(x) NUM_LIGHTS};
#undef x
#define x(token, value, string) [value] = string,
const char *colours[] = {TRAFFIC_LIGHTS(x)};
#undef x
to initialise constants and a name table that are always in sync.

It is also a good way to generate source lines that are over 1MB.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply

* [PATCH] powerpc: PPC_SECURE_BOOT should not require PowerNV
From: Daniel Axtens @ 2020-09-24  1:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nayna Jain, Daniel Axtens

In commit 61f879d97ce4 ("powerpc/pseries: Detect secure and trusted
boot state of the system.") we taught the kernel how to understand the
secure-boot parameters used by a pseries guest.

However, CONFIG_PPC_SECURE_BOOT still requires PowerNV. I didn't
catch this because pseries_le_defconfig includes support for
PowerNV and so everything still worked. Indeed, most configs will.
Nonetheless, technically PPC_SECURE_BOOT doesn't require PowerNV
any more.

The secure variables support (PPC_SECVAR_SYSFS) doesn't do anything
on pSeries yet, but I don't think it's worth adding a new condition -
at some stage we'll want to add a backend for pSeries anyway.

Fixes: 61f879d97ce4 ("powerpc/pseries: Detect secure and trusted boot state of the system.")
Cc: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 4b33477dafb8..f645fa934853 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -983,7 +983,7 @@ config PPC_MEM_KEYS
 config PPC_SECURE_BOOT
 	prompt "Enable secure boot support"
 	bool
-	depends on PPC_POWERNV
+	depends on PPC_POWERNV || PPC_PSERIES
 	depends on IMA_ARCH_POLICY
 	imply IMA_SECURE_AND_OR_TRUSTED_BOOT
 	help
-- 
2.25.1

^ permalink raw reply related

* Re: [PATCH] cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier
From: Daniel Axtens @ 2020-09-24  2:35 UTC (permalink / raw)
  To: Srikar Dronamraju, Michael Ellerman
  Cc: linuxppc-dev, Srikar Dronamraju, Pratik Rajesh Sampat
In-Reply-To: <20200922080254.41497-1-srikar@linux.vnet.ibm.com>

Hi Srikar,

> The patch avoids allocating cpufreq_policy on stack hence fixing frame
> size overflow in 'powernv_cpufreq_reboot_notifier'
>
> ./drivers/cpufreq/powernv-cpufreq.c: In function _powernv_cpufreq_reboot_notifier_:
> ./drivers/cpufreq/powernv-cpufreq.c:906:1: error: the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
>  }
>  ^
> cc1: all warnings being treated as errors
> make[3]: *** [./scripts/Makefile.build:316: drivers/cpufreq/powernv-cpufreq.o] Error 1
> make[2]: *** [./scripts/Makefile.build:556: drivers/cpufreq] Error 2
> make[1]: *** [./Makefile:1072: drivers] Error 2
> make[1]: *** Waiting for unfinished jobs....
> make: *** [Makefile:157: sub-make] Error 2
>

This looks a lot like commit d95fe371ecd2 ("cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_work_fn").

As with that patch, I have checked for matching puts/gets and that all
uses of '&' check out.

I tried to look at the snowpatch tests: they apparently reported a
checkpatch warning but the file has since disappeared so I can't see
what it was. Running checkpatch locally:

$ scripts/checkpatch.pl -g HEAD -strict
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#15: 
make[3]: *** [./scripts/Makefile.build:316: drivers/cpufreq/powernv-cpufreq.o] Error 1

This is benign and you shouldn't wrap that line anyway.

On that basis:

Reviewed-by: Daniel Axtens <dja@axtens.net>

Kind regards,
Daniel

> Fixes: cf30af76 ("cpufreq: powernv: Set the cpus to nominal frequency during reboot/kexec")
> Cc: Pratik Rajesh Sampat <psampat@linux.ibm.com>
> Cc: Daniel Axtens <dja@axtens.net>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index a9af15e..e439b43 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -885,12 +885,15 @@ static int powernv_cpufreq_reboot_notifier(struct notifier_block *nb,
>  				unsigned long action, void *unused)
>  {
>  	int cpu;
> -	struct cpufreq_policy cpu_policy;
> +	struct cpufreq_policy *cpu_policy;
>  
>  	rebooting = true;
>  	for_each_online_cpu(cpu) {
> -		cpufreq_get_policy(&cpu_policy, cpu);
> -		powernv_cpufreq_target_index(&cpu_policy, get_nominal_index());
> +		cpu_policy = cpufreq_cpu_get(cpu);
> +		if (!cpu_policy)
> +			continue;
> +		powernv_cpufreq_target_index(cpu_policy, get_nominal_index());
> +		cpufreq_cpu_put(cpu_policy);
>  	}
>  
>  	return NOTIFY_DONE;
> -- 
> 1.8.3.1

^ permalink raw reply

* Re: [PATCH V2] powerpc/perf: Exclude pmc5/6 from the irrelevant PMU group constraints
From: Athira Rajeev @ 2020-09-24  2:40 UTC (permalink / raw)
  To: Paul A. Clarke; +Cc: maddy, linuxppc-dev
In-Reply-To: <20200922104656.GA664163@li-24c3614c-2adc-11b2-a85c-85f334518bdb.ibm.com>



> On 22-Sep-2020, at 4:16 PM, Paul A. Clarke <pc@us.ibm.com> wrote:
> 
> Just one nit in a comment below...
> (and this is not worthy of tags like "reviewed-by" ;-)
> 
> On Mon, Sep 21, 2020 at 03:10:04AM -0400, Athira Rajeev wrote:
>> PMU counter support functions enforces event constraints for group of
>> events to check if all events in a group can be monitored. Incase of
>> event codes using PMC5 and PMC6 ( 500fa and 600f4 respectively ),
>> not all constraints are applicable, say the threshold or sample bits.
>> But current code includes pmc5 and pmc6 in some group constraints (like
>> IC_DC Qualifier bits) which is actually not applicable and hence results
>> in those events not getting counted when scheduled along with group of
>> other events. Patch fixes this by excluding PMC5/6 from constraints
>> which are not relevant for it.
>> 
>> Fixes: 7ffd948 ("powerpc/perf: factor out power8 pmu functions")
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> ---
> 
>> diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
>> index 964437a..12153da 100644
>> --- a/arch/powerpc/perf/isa207-common.c
>> +++ b/arch/powerpc/perf/isa207-common.c
>> @@ -288,6 +288,15 @@ int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp)
>> 
>> 		mask  |= CNST_PMC_MASK(pmc);
>> 		value |= CNST_PMC_VAL(pmc);
>> +
>> +		/*
>> +		 * PMC5 and PMC6 are used to count cycles and instructions
>> +		 * and these doesnot support most of the constraint bits.
> 
> s/doesnot/do not/

Hi Paul,

Thanks for checking the patch and sharing the comment. 

Athira
> 
>> +		 * Add a check to exclude PMC5/6 from most of the constraints
>> +		 * except for ebb/bhrb.
>> +		 */
>> +		if (pmc >= 5)
>> +			goto ebb_bhrb;
> 
> PC


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox